All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexander Duyck <alexander.duyck@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	Alex Duyck <aduyck@mirantis.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Tom Herbert <tom@herbertland.com>, Jesse Gross <jesse@kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	Netdev <netdev@vger.kernel.org>
Subject: Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864
Date: Fri, 1 Apr 2016 12:58:41 -0700	[thread overview]
Message-ID: <CAKgT0UcLvAF9gTzcaR=GZyTkM2UQ6ymDmnisjgZiG8hTE+PdLA@mail.gmail.com> (raw)
In-Reply-To: <20160401.152405.915323132719949585.davem@davemloft.net>

On Fri, Apr 1, 2016 at 12:24 PM, David Miller <davem@davemloft.net> wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 01 Apr 2016 11:49:03 -0700
>
>> For example, TCP stack tracks per socket ID generation, even if it
>> sends DF=1 frames. Damn useful for tcpdump analysis and drop
>> inference.
>
> Thanks for mentioning this, I never considered this use case.

RFC 6864 is pretty explicit about this, IPv4 ID used only for
fragmentation.  https://tools.ietf.org/html/rfc6864#section-4.1

The goal with this change is to try and keep most of the existing
behavior in tact without violating this rule?  I would think the
sequence number should give you the ability to infer a drop in the
case of TCP.  In the case of UDP tunnels we are now getting a bit more
data since we were ignoring the outer IP header ID before.

>> With your change, the resulting GRO packet would propagate the ID of
>> first frag. Most GSO/GSO engines will then provide a ID sequence,
>> which might not match original packets.

Right.  But that is only in the case where the IP IDs did not already
increment or were left uninitialized meaning the transmitter was
probably already following RFC 6864 and chose a fixed value.  Odds are
in such a case we end up improving the performance if anything as
there are plenty of legacy systems out there that still require the
IPv4 ID increment in order to get LRO/GRO.

>> I do not particularly care, but it is worth mentioning that GRO+TSO
>> would not be idempotent anymore.

In the patch I mentioned we had already broken that.  I'm basically
just going through and fixing the cases for tunnels where we were
doing the outer header wrong while at the same time relaxing the
requirements for the inner header if DF is set.  I'll probably add
some documentation do the Documentation folder about it as well.  I'm
currently in the process of writing up documentation for GSO and GSO
partial for the upcoming patchset.  I can pretty easily throw in a few
comments about GRO as well.

> Our eventual plan was to start emitting zero in the ID field for
> outgoing TCP datagrams with DF set, since the issue that caused us to
> generate incrementing IDs in the first place (buggy Microsoft SLHC
> compression) we decided is not relevant and important enough to
> accommodate any more.

For the GSO partial stuff I was probably just going to have the IP ID
on the inner headers lurch forward in chunks equal to gso_segs when we
are doing the segmentation.  I didn't want to use a fixed value just
because that would likely make it easy to identify Linux devices being
a bump in the wire.  I figure if there are already sources that
weren't updating IP ID for their segmentation offloads then if we just
take that approach odds are we will blend in with the other devices
and be more difficult to single out.

Another reason for doing it this way is that different devices are
going to have different behaviors with GSO partial.  In the case of
the i40e driver it recognizes both inner and outer network headers so
it can increment both correctly.  In the case of igb and ixgbe they
only can support the outer header so the inner IP ID value would be
lurching by gso_size every time we move from one GSO frame to the
next.

> So outside of your TCP behavior analysis case, there isn't a
> compelling argument to keeping that code around any more, rather than
> just put zero in the ID field.
>
> I suppose we could keep the counter code around and allow it to be
> enabled using a sysctl or socket option, but how strongly do you
> really feel about this?

I'm not suggesting we drop the counter code for transmit.  What RFC
6864 says is "Originating sources MAY set the IPv4 ID field of atomic
datagrams to any value."

For transmit we can leave the IP ID code as is.  For receive we should
not be snooping into the IP ID for any frames that have the DF bit set
as devices that have adopted RFC 6864 on their transmit path will end
up causing issues.

- Alex

  reply	other threads:[~2016-04-01 19:58 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-01 18:05 [net PATCH 0/2] Fixes for GRO and GRE tunnels Alexander Duyck
2016-04-01 18:05 ` [net PATCH 1/2] GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU Alexander Duyck
2016-04-01 18:05 ` [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864 Alexander Duyck
2016-04-01 18:49   ` Eric Dumazet
2016-04-01 19:24     ` David Miller
2016-04-01 19:58       ` Alexander Duyck [this message]
2016-04-01 21:13         ` Subash Abhinov Kasiviswanathan
2016-04-02  2:16         ` David Miller
2016-04-02  2:21           ` Eric Dumazet
2016-04-02 20:26             ` Rick Jones
2016-04-02  6:02           ` Alexander Duyck
2016-04-02  1:57     ` Herbert Xu
2016-04-02  2:15       ` Eric Dumazet
2016-04-02  2:19         ` Herbert Xu
2016-04-02  2:26           ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAKgT0UcLvAF9gTzcaR=GZyTkM2UQ6ymDmnisjgZiG8hTE+PdLA@mail.gmail.com' \
    --to=alexander.duyck@gmail.com \
    --cc=aduyck@mirantis.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eric.dumazet@gmail.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=jesse@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=tom@herbertland.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.