From: Oz Shlomo <ozsh@nvidia.com>
To: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Roi Dayan <roid@nvidia.com>, Saeed Mahameed <saeed@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, <netdev@vger.kernel.org>,
	Paul Blakey <paulb@nvidia.com>,
	Saeed Mahameed <saeedm@nvidia.com>
Subject: Re: [net-next 08/15] net/mlx5e: CT: Preparation for offloading +trk+new ct rules
Date: Thu, 14 Jan 2021 16:03:43 +0200
Message-ID: <d1b5b862-8c30-efb6-1a2f-4f9f0d49ef15@nvidia.com>
In-Reply-To: <20210114130238.GA2676@horizon.localdomain>



On 1/14/2021 3:02 PM, Marcelo Ricardo Leitner wrote:
> On Tue, Jan 12, 2021 at 11:27:04AM +0200, Oz Shlomo wrote:
>>
>>
>> On 1/12/2021 1:51 AM, Marcelo Ricardo Leitner wrote:
>>> On Sun, Jan 10, 2021 at 09:52:55AM +0200, Roi Dayan wrote:
>>>>
>>>>
>>>> On 2021-01-10 9:45 AM, Roi Dayan wrote:
>>>>>
>>>>>
>>>>> On 2021-01-08 11:48 PM, Marcelo Ricardo Leitner wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Thu, Jan 07, 2021 at 09:30:47PM -0800, Saeed Mahameed wrote:
>>>>>>> From: Roi Dayan <roid@nvidia.com>
>>>>>>>
>>>>>>> Connection tracking associates the connection state per packet. The
>>>>>>> first packet of a connection is assigned with the +trk+new state. The
>>>>>>> connection enters the established state once a packet is seen on the
>>>>>>> other direction.
>>>>>>>
>>>>>>> Currently we offload only the established flows. However, UDP traffic
>>>>>>> using source port entropy (e.g. vxlan, RoCE) will never enter the
>>>>>>> established state. Such protocols do not require stateful processing,
>>>>>>> and therefore could be offloaded.
>>>>>>
>>>>>> If it doesn't require stateful processing, please enlighten me on why
>>>>>> conntrack is being used in the first place. What's the use case here?
>>>>>>
>>>>>
>>>>> The use case for example is when we have vxlan traffic but we do
>>>>> conntrack on the inner packet (rules on the physical port) so
>>>>> we never get established but on miss we can still offload as normal
>>>>> vxlan traffic.
>>>>>
>>>>
>>>> my mistake about "inner packet". we do CT on the underlay network, i.e.
>>>> the outer header.
>>>
>>> I don't see why the CT match is being used there then. Isn't it a config
>>> issue/waste of resources? What is CT adding to the matches/actions
>>> being done on these flows?
>>>
>>
>> Consider a use case where the network port receives both east-west
>> encapsulated traffic and north-south non-encapsulated traffic that requires
>> NAT.
>>
>> One possible configuration is to first apply the CT-NAT action.
>> Established north-south connections will successfully execute the nat action
>> and will set the +est ct state.
>> However, the +new state may apply either for valid east-west traffic (e.g.
>> vxlan) due to source port entropy, or to insecure north-south traffic that
>> the fw should block. The user may distinguish between the two cases, for
>> example, by matching on the dest udp port.
> 
> Sorry but I still don't see the big picture. :-]
> 
> What do you consider as east-west and north-south traffic? My initial
> understanding of east-west is traffic between VFs and north-south
> would be in and out to the wire. You mentioned that north-south is
> insecure, it would match, but then, non-encapsulated?
> 
> So it seems you referred to the datacenter. East-west is traffic
> between hosts on the same datacenter, and north-south is traffic that
> goes out of it. This seems to match.

Right.

> 
> Assuming it's the latter, then it seems that the idea is to work
> around a config simplification that was done by the user.  As
> mentioned on the changelog, such protocols do not require stateful
> processing, and AFAICU this patch twists conntrack so that the user
> can have simplified rules. Why can't the user have specific rules for
> the tunnels, and other for dealing with north-south traffic? The fw
> would still be able to block unwanted traffic.

We cannot control what the user is doing.
This is a valid tc configuration and it works with the tc software
datapath. However, in such configurations vxlan packets are not
processed in hardware, because they are marked as new connections.
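
To illustrate why such packets stay in the new state, here is a minimal
sketch of a toy connection tracker. It is an illustration only, not the
kernel's conntrack code: the class and function names, the addresses and
the source port values are all made up for the example. It assumes that
a connection becomes established only when a packet arrives whose
5-tuple is the exact mirror of an earlier packet.

```python
from typing import NamedTuple, Set


class FiveTuple(NamedTuple):
    src: str
    dst: str
    proto: str
    sport: int
    dport: int

    def reply(self) -> "FiveTuple":
        """Tuple that a genuine reply to this packet would carry."""
        return FiveTuple(self.dst, self.src, self.proto, self.dport, self.sport)


class ToyConntrack:
    """Hypothetical, heavily simplified model of per-packet ct state."""

    def __init__(self) -> None:
        self.unreplied: Set[FiveTuple] = set()    # seen in one direction only
        self.established: Set[FiveTuple] = set()  # seen in both directions

    def state(self, pkt: FiveTuple) -> str:
        # Either direction of an already established connection.
        if pkt in self.established or pkt.reply() in self.established:
            return "+trk+est"
        # First packet in the other direction: promote the connection.
        original = pkt.reply()
        if original in self.unreplied:
            self.unreplied.discard(original)
            self.established.add(original)
            return "+trk+est"
        # No mirrored tuple seen yet, so the packet is still "new".
        self.unreplied.add(pkt)
        return "+trk+new"


if __name__ == "__main__":
    ct = ToyConntrack()
    # vxlan: both peers send to dport 4789 and pick sport from a hash of
    # the inner flow, so neither packet mirrors the other.
    a_to_b = FiveTuple("10.0.0.1", "10.0.0.2", "udp", 50001, 4789)
    b_to_a = FiveTuple("10.0.0.2", "10.0.0.1", "udp", 61002, 4789)
    print(ct.state(a_to_b), ct.state(b_to_a), ct.state(a_to_b))
```

In this model a plain request/response exchange (for example sport
40000 to dport 53 and back) is promoted on the first reply, while the
two vxlan directions are tracked as two separate, never-replied
connections.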

> 
> My main problem with this is that it is making conntrack do
> stuff that the user may not be expecting it to do, and that packets
> may get matched (maybe even unintentionally) and the system won't have
> visibility on them. Maybe I'm just missing something?
> 

This is why we restricted this feature to UDP protocols that never
enter the established state due to source port entropy.
Do you see a problematic use case that can arise?
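
The lookup order described in the changelog quoted below (CT table,
then the +trk+new table, then the slow path) can be sketched as follows.
This is a rough model under stated assumptions, not the mlx5 driver
code: the function and variable names are hypothetical, the +trk+new
rules are modeled as (protocol, destination port) pairs with the source
port wildcarded, and ports 4789 (vxlan) and 4791 (RoCEv2) are only
example entries.

```python
from typing import NamedTuple, Set, Tuple


class Packet(NamedTuple):
    src: str
    dst: str
    proto: str
    sport: int
    dport: int


# Hypothetical +trk+new rules: the source port is wildcarded, so only
# the protocol and the destination port take part in the match.
TRK_NEW_RULES: Set[Tuple[str, int]] = {("udp", 4789), ("udp", 4791)}


def lookup(pkt: Packet, ct_table: Set[Packet],
           trk_new_rules: Set[Tuple[str, int]] = TRK_NEW_RULES) -> str:
    """Return the name of the table that ends up handling the packet."""
    # 1. CT table: exact 5-tuple match on offloaded established flows.
    if pkt in ct_table:
        return "ct_table"
    # 2. Miss: try the +trk+new table (sport wildcard, specific dport).
    if (pkt.proto, pkt.dport) in trk_new_rules:
        return "trk_new_table"
    # 3. Miss again: hand the packet to the slow path.
    return "slow_path"


if __name__ == "__main__":
    established = {Packet("192.0.2.1", "198.51.100.7", "tcp", 43210, 443)}
    vxlan = Packet("10.0.0.1", "10.0.0.2", "udp", 50001, 4789)
    other = Packet("203.0.113.9", "10.0.0.2", "udp", 50001, 12345)
    for p in (*established, vxlan, other):
        print(p.proto, p.dport, "->", lookup(p, established))
```

Restricting the second table to a fixed set of UDP destination ports is
what keeps other new-state traffic, such as unsolicited north-south
packets, on the slow path in this model.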

>>
>>
>>>>
>>>>>>>
>>>>>>> The change in the model is that a miss on the CT table will be forwarded
>>>>>>> to a new +trk+new ct table and a miss there will be forwarded to
>>>>>>> the slow
>>>>>>> path table.
>>>>>>
>>>>>> AFAICU this new +trk+new ct table is a wildcard match on sport with
>>>>>> specific dports. Also AFAICU, such entries will not be visible to the
>>>>>> userspace then. Is this right?
>>>>>>
>>>>>>      Marcelo
>>>>>>
>>>>>
>>>>> right.
>>>
>>> Thanks,
>>> Marcelo
>>>
>>
