Netfilter-Devel Archive on lore.kernel.org
From: Phil Sutter <phil@nwl.cc>
To: Arturo Borrero Gonzalez <arturo@netfilter.org>
Cc: "Serguei Bezverkhi (sbezverk)" <sbezverk@cisco.com>,
	"netfilter-devel@vger.kernel.org"
	<netfilter-devel@vger.kernel.org>
Subject: Re: Numen with reference to vmap
Date: Wed, 4 Dec 2019 23:32:15 +0100
Message-ID: <20191204223215.GX14469@orbyte.nwl.cc>
In-Reply-To: <624cc1ac-126e-8ad3-3faa-f7869f7d2d5b@netfilter.org>

Hi Arturo,

On Wed, Dec 04, 2019 at 06:31:02PM +0100, Arturo Borrero Gonzalez wrote:
> On 12/4/19 4:56 PM, Phil Sutter wrote:
> > OK, static load-balancing between two services - no big deal. :)
> > 
> > What happens if config changes? I.e., if one of the endpoints goes down
> > or a third one is added? (That's the thing we're discussing right now,
> > aren't we?)
> 
> if the non-anon map for random numgen was allowed, then only elements would need
> to be adjusted:
> 
> dnat numgen random mod 100 map { 0-49 : 1.1.1.1, 50-99 : 2.2.2.2 }
> 
> You could always use mod 100 (or 10000 if you want) and just play with the map
> probabilities by updating map elements. This is a valid use case I think.
> The mod number can just be the max number of allowed endpoints per service in
> kubernetes.
> 
> @Phil,
> 
> I'm not sure if the typeof() thingy will work in this case, since the integer
> length would depend on the mod value used.
> What about introducing something like an explicit u128 integer datatype. Perhaps
> it's useful for other use cases too...
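
To make the weighted-map idea quoted above concrete, here is a small Python
model of it. This is only a sketch, not how nftables implements anything:
pick_endpoint, simulate and ENDPOINTS are hypothetical names, and Python's
random.randrange stands in for 'numgen random mod 100'.

```python
import random
from collections import Counter

# Interval map keyed on the numgen result, mirroring
# { 0-49 : 1.1.1.1, 50-99 : 2.2.2.2 }. Bounds are inclusive.
ENDPOINTS = [
    (0, 49, "1.1.1.1"),
    (50, 99, "2.2.2.2"),
]

MOD = 100  # fixed modulus; only the map elements change


def pick_endpoint(value, intervals):
    """Return the endpoint whose inclusive interval contains value."""
    for low, high, endpoint in intervals:
        if low <= value <= high:
            return endpoint
    return None  # no element matched, so nothing would be selected


def simulate(intervals, packets=10000, seed=0):
    """Count how many simulated packets each endpoint receives."""
    rng = random.Random(seed)
    return Counter(
        pick_endpoint(rng.randrange(MOD), intervals) for _ in range(packets)
    )
```

Adding a third endpoint then only means swapping the elements, e.g. for
(0, 33), (34, 66) and (67, 99), while MOD stays at 100.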

Out of curiosity I implemented the bits to support typeof keyword in
parser and scanner. It's a bit clumsy, but it works. I can do:

| nft add map t m2 '{ type typeof numgen random mod 2 : verdict; }'

(The 'random mod 2' part is ignored, but needed as otherwise it's not a
primary_expr. :D)

The output is:

| table ip t {
| 	map m2 {
| 		type integer : verdict
| 	}
| }

So the integer size information is lost, and this output won't work when
fed back into nft.
There are two options to solve this:

A) Push expression info into kernel so we can correctly deserialize the
   original input.

B) As you suggested, have something like 'int32' or maybe better 'int(32)'.
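
To illustrate why (B) survives a round trip, here is a Python sketch of a
type spec that carries its width in the text. Both spellings, 'int32' and
'int(32)', are just the ones floated above and not existing nft syntax;
parse_int_type and format_int_type are hypothetical helpers.

```python
import re

# Accepts 'int32' as well as 'int(32)'; the width is in bits.
_INT_TYPE = re.compile(r"^int(?:\((\d+)\)|(\d+))$")


def parse_int_type(spec):
    """Return the bit width encoded in spec, or raise ValueError."""
    match = _INT_TYPE.match(spec.strip())
    if match is None:
        # A bare 'integer' lands here: it names no width to restore.
        raise ValueError(f"not a sized integer type: {spec!r}")
    width = int(match.group(1) or match.group(2))
    if width == 0:
        raise ValueError("width must be positive")
    return width


def format_int_type(width):
    """Serialize a bit width back into the 'int(N)' spelling."""
    return f"int({width})"
```

Whatever format_int_type prints can be handed back to parse_int_type
unchanged, which is exactly what the bare 'integer' in the listing above
does not allow.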

I consider (B) to be way less ugly. And if we went that route, we could
actually use the 'int32'/'int(32)' thing in the first place. All users
would have to know is how large the 'numgen' data type is. Or we could
even be smart here: since such a map may be used with different inputs,
we could mask the input to fit the map key size. IIRC, we may even have
had this discussion in an inconveniently cold room in Malaga once. :)
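
The masking idea can be sketched in Python as follows. This is purely an
illustration under my own assumptions: bits_needed and mask_to_key are
hypothetical helpers, and keys are treated as plain unsigned integers
rather than as anything nftables-specific.

```python
def bits_needed(mod):
    """Smallest number of bits that can hold every value in 0..mod-1."""
    if mod < 1:
        raise ValueError("mod must be at least 1")
    return max(1, (mod - 1).bit_length())


def mask_to_key(value, key_bits):
    """Truncate value to its low key_bits bits so it fits the map key."""
    return value & ((1 << key_bits) - 1)
```

Values that already fit pass through unchanged; anything wider is silently
folded onto a smaller key, which is the price of letting one map serve
inputs of different sizes.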

> @Serguei,
> 
> kubernetes implements a complex chain of mechanisms to deal with traffic. What
> happens if endpoints for a given svc have different ports? I don't know if
> that's supported or not, but then this approach wouldn't work either: you can't
> use dnat numgen random { 0-49 : <ip>:<port> }.
> 
> Also, we have the masquerade/drop thing going on too, which needs to be dealt
> with and that currently is done by yet another chain jump + packet mark.
> 
> I'm not sure in which state of the development you are, but this is my
> suggestion: Try not to over-optimize in the first iteration. Just get a
> working nft ruleset with the few optimizations that make sense and are easy to
> use (and understand). For iteration #2 we can do better optimizations, including
> patching missing features we may have in nftables.
> I really want a ruleset with very few rules, but we are still comparing with
> the iptables ruleset. I suggest we leave the hard optimization for a later point
> when we are comparing nft vs nft rulesets.

+1 for not optimizing (yet). There's a certain chance that we'd be
putting much effort into optimizing a path which later turns out not to
be the bottleneck.

Cheers, Phil

Thread overview: 12+ messages
2019-12-04  0:54 Serguei Bezverkhi (sbezverk)
2019-12-04 10:18 ` Phil Sutter
2019-12-04 13:47   ` Serguei Bezverkhi (sbezverk)
2019-12-04 15:17     ` Phil Sutter
2019-12-04 15:42       ` Serguei Bezverkhi (sbezverk)
2019-12-04 15:56         ` Phil Sutter
2019-12-04 16:13           ` Serguei Bezverkhi (sbezverk)
2019-12-04 17:00             ` Phil Sutter
2019-12-04 17:31           ` Arturo Borrero Gonzalez
2019-12-04 17:49             ` Serguei Bezverkhi (sbezverk)
2019-12-04 21:05               ` Serguei Bezverkhi (sbezverk)
2019-12-04 22:32             ` Phil Sutter [this message]
