From: Jesse Gross <jesse@nicira.com>
To: Tom Herbert <therbert@google.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>,
	Alexander Duyck <alexander.h.duyck@intel.com>,
	Andy Zhou <azhou@nicira.com>, David Miller <davem@davemloft.net>,
	Linux Netdev List <netdev@vger.kernel.org>
Subject: Re: [net-next 02/10] udp: Expand UDP tunnel common APIs
Date: Wed, 23 Jul 2014 23:24:40 -0400
Message-ID: <CAEP_g=_whjHzVrR+EuCQsYC1owwRk==RnOUYuetHXB9s22RHQQ@mail.gmail.com>
In-Reply-To: <CA+mtBx8iiZp7hG9uckUPkpxT1t7fNDgzZWOf0TywG42DYS+PCQ@mail.gmail.com>

On Wed, Jul 23, 2014 at 11:45 AM, Tom Herbert <therbert@google.com> wrote:
> On Tue, Jul 22, 2014 at 9:35 PM, Jesse Gross <jesse@nicira.com> wrote:
>> On Tue, Jul 22, 2014 at 11:53 PM, Tom Herbert <therbert@google.com> wrote:
>>>>> Which feature flags control the receive side parsing in the device?
>>>>
>>>> The only real features that need the port info are Rx hash and Rx
>>>> checksum.  If those are disabled then there shouldn't be any need for
>>>> the port numbers.  I don't recall if you can disable them separately
>>>> from the non-tunnel case though.  I believe they are linked to the
>>>> standard offloads.
>>>>
>>> Rx hash is an unnecessary consideration because we can derive that from
>>> UDP header. The fact that we can deduce a reasonable hash is a major
>>> rationale of UDP encapsulation. We will need drivers to start
>>> enabling/supporting UDP RSS and providing RX hash to realize full
>>> benefits of this.
>>
>> That's true for basic hashing but for more sophisticated things like
>> flow steering or sending OAM packets to control queues the hardware
>> still needs to be able to look into the header.
>>
> Flow steering (aRFS, FlowDirector, ECMP in network) will work just
> fine based on UDP header-- again this is a fundamental property in UDP
> encapsulation. If you need to implement mechanisms that require
> parsing of the encapsulated headers, then it's better to make this
> part of RX filtering.

Sure, it can operate on the UDP hash but I would argue that it works
better if you actually look into the packet. Using the hash is either
going to just randomly spread traffic or require you to track hashes
and direct them to particular places for established connections.
However, depending on the situation this may not really be optimal
compared to, say, steering based on inner MAC address.
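
To make the distinction concrete, here is a minimal sketch of the two
approaches. It is illustrative only: the function names and the MAC table are
hypothetical, and zlib.crc32 merely stands in for whatever hash a NIC actually
computes.

```python
import zlib

def queue_by_outer_hash(src_ip, dst_ip, src_port, dst_port, n_queues):
    """Pick a queue from the outer IP/UDP fields only (hash-style spreading)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    # crc32 is a placeholder for the device's real hash function.
    return zlib.crc32(key) % n_queues

def queue_by_inner_mac(inner_dst_mac, mac_to_queue, default_queue=0):
    """Pick a queue by looking inside the tunnel at the inner destination MAC."""
    return mac_to_queue.get(inner_dst_mac.lower(), default_queue)

# Two tunneled flows between the same endpoints, destined for the same
# inner MAC but carrying different outer UDP source ports.
flow_a = ("10.0.0.1", "10.0.0.2", 49152, 4789)
flow_b = ("10.0.0.1", "10.0.0.2", 50001, 4789)
inner_mac = "52:54:00:12:34:56"
mac_table = {inner_mac: 3}  # hypothetical policy: this MAC is served by queue 3

hash_queues = [queue_by_outer_hash(*f, n_queues=8) for f in (flow_a, flow_b)]
mac_queues = [queue_by_inner_mac(inner_mac, mac_table) for _ in (flow_a, flow_b)]
```

With the hash, the two flows land wherever the hash happens to put them; with
the inner MAC lookup, both deterministically reach the queue chosen for that
destination.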

But in reality, whether it is for steering or filtering these
operations are pretty similar to me, just different goals.

> btw, Geneve draft allows for non-zero UDP checksums to be ignored like
> in VXLAN-- this is a violation of UDP standard :-(. We will not do
> this in the stack, but it opens the possibility that HW may tell us
> checksum is okay when it actually isn't. Accepting
> CHECKSUM_UNNECESSARY from all these devices is quite the leap of faith
> we're taking!

This is actually not the intention but I see that the wording of the
draft is poor. I'll see if I can improve it to avoid this situation.
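
For reference, the receive rule at issue can be sketched as below for UDP over
IPv4: a zero checksum means the sender did not compute one, while any non-zero
value has to be verified. This is only a sketch with hypothetical helper names,
not the stack's implementation.

```python
import struct

def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum, padding odd-length input with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def pseudo_header(src_ip: bytes, dst_ip: bytes, udp_len: int) -> bytes:
    """IPv4 pseudo-header: addresses, zero byte, protocol 17, UDP length."""
    return src_ip + dst_ip + struct.pack("!BBH", 0, 17, udp_len)

def build_udp(src_ip, dst_ip, sport, dport, payload, with_csum=True):
    """Build a UDP segment, optionally leaving the checksum field at zero."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    csum = 0
    if with_csum:
        covered = pseudo_header(src_ip, dst_ip, length) + header + payload
        csum = 0xFFFF ^ ones_complement_sum(covered)
        if csum == 0:
            csum = 0xFFFF  # a computed zero is transmitted as all ones
    return struct.pack("!HHHH", sport, dport, length, csum) + payload

def rx_checksum_ok(src_ip, dst_ip, segment) -> bool:
    """Zero checksum: nothing to verify (IPv4). Non-zero: must verify."""
    _sport, _dport, length, csum = struct.unpack("!HHHH", segment[:8])
    if csum == 0:
        return True
    covered = pseudo_header(src_ip, dst_ip, length) + segment[:length]
    return ones_complement_sum(covered) == 0xFFFF

src, dst = bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])
good = build_udp(src, dst, 49152, 6081, b"hello")
bad = good[:8] + bytes([good[8] ^ 0x01]) + good[9:]  # corrupt one payload byte
no_csum = build_udp(src, dst, 49152, 6081, b"hello", with_csum=False)
```

A receiver that skipped the check on `bad` would be accepting a corrupted
packet whose sender had asked for verification.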

>> Some of these are obviously future looking but I think that means that
>> even if you got your desired changes, the use of the UDP port on
>> receive would only shift, not go away.
>
> I think you're hitting the major point that we have to be future
> looking. When hardware hardwires specific protocols instead of using
> generic mechanisms, we become pigeonholed-- this is *not* future
> looking and in the long run it's a disservice to customers if we
> advocate this in the stack. Consider that geneve is likely superior to
> VXLAN because it is extensible, but that VXLAN may still win since it
> is already "supported" in so much HW.

I understand your goal but I'm not really sure what solution you are
proposing. There are obviously ways that the stack can be made more
generic from where it is today but I think we agree that at least some
things will require protocol knowledge.

Geneve (and GUE) are trying to solve this by having a protocol that is
generic - hardware will still need to have support for a specific
protocol but at least that can support many uses. However, this
doesn't seem to be what you are getting at since it's not true of
VXLAN, particularly if you are concerned with deployed hardware.
