Re: bpf_fib_lookup support for firewall mark

From: Rumen Telbizov <rumen.telbizov@menlosecurity.com>
To: David Ahern <dsahern@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>,
	bpf@vger.kernel.org, Jesper Dangaard Brouer <brouer@redhat.com>
Subject: Re: bpf_fib_lookup support for firewall mark
Date: Thu, 10 Jun 2021 10:41:41 -0700	[thread overview]
Message-ID: <CA+FoirA28PANkzHE-4uHb7M0vf-V3UZ6NfjKbc_RBJ2=sKSrOQ@mail.gmail.com> (raw)
In-Reply-To: <68345713-e679-fe9f-fedd-62f76911b55a@gmail.com>

> that's the key point on performance - crossing a cacheline. I do not
> have performance data at hand, but it is a substantial hit. That is why
> the struct is so overloaded (and complicated for a uapi) with the input
> vs output setting.

Makes perfect sense now. Thanks for clarifying David and Daniel.

> Presumably you are parsing the packet to id a flow to find the mark that
> should be used with the FIB lookup. correct?

Let me briefly present my high-level use case here to give more colour.
What I am building is an overlay network based on geneve. I have multiple
sites, each of which is going to be represented by a separate routing table.
The selection of the destination site (routing table) is based on firewall marks
and the original packet is preserved intact, encapsulated in geneve. I have a
TC/eBPF program running on the geneve interface which has to query the
appropriate routing table based on the firewall mark and use the
returned next hop
as the tunnel key in the skb. Also worth mentioning is that those routing tables
contain multiple (default) routes as I use ECMP to balance traffic/provide HA
between sites,

> > That said, given h_vlan_proto/h_vlan_TCI are both output parameters,
> > maybe we could just union the two fields with a __u32 mark extension
> > that we then transfer into the flowi{4,6}?
>
> That is one option.
>
> I would go for a union on sport and/or dport. It is a fair tradeoff to
> request users to pick one - policy routing based on L4 ports or fwmark.
> A bit harder to do with a straight up union at this point, but we could
> also limit the supported fwmark to 16-bits. Hard choices have to be made.

A couple of comments on those two options: if the union is between the ports
and the mark then a user of the function would have to choose between
src+dst port or the mark in lookup, correct? If so wouldn't that
result in a loss
of the ability to use multipathing - since the hashing would be static? In my
case that would certainly be another significant drawback.

Having said that, what Daniel suggests looks very interesting to me.
If I understand
it correctly there are 32 bits in h_vlan_proto+h_vlan_TCI that are used only for
output today so if they are merged in a union with a 32 bit mark then we'd stay
at 64B structure and we can pass a full 32 bit mark.

So something like this?
union {
    /* input */
    __u32 mark;

    /* output */
    __be16 h_vlan_proto;
    __be16 h_vlan_TCI;
}

Moreover, there are 12 extra bytes used only as output for the smac/dmac.
If the above works then maybe this opens up the opportunity to incorporate
even more input parameters that way?

Thank you once again for your time and suggestions.

Regards,
Rumen Telbizov