netdev.vger.kernel.org archive mirror
From: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
To: Cong Wang <amwang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: netdev <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org"
	<dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org>
Subject: Re: A question on the design of OVS GRE tunnel
Date: Mon, 8 Jul 2013 23:26:30 -0700
Message-ID: <CAEP_g=93ohj0u4uqiG4k_8kRFQYC71tNyri4xnRQcZCYNQhz=A@mail.gmail.com>
In-Reply-To: <1373337666.4557.13.camel@cr0>

On Mon, Jul 8, 2013 at 7:41 PM, Cong Wang <amwang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Mon, 2013-07-08 at 09:28 -0700, Pravin Shelar wrote:
>> On Mon, Jul 8, 2013 at 2:51 AM, Cong Wang <amwang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> > However, I noticed there is some problem with such design:
>> >
>> > I saw very bad performance with the _default_ setup with OVS GRE. After
>> > digging it a little bit, clearly the cause is that OVS GRE tunnel adds
>> > an outer IP header and a GRE header for every packet that passed to it,
>> > which could result in a packet whose length is larger than the MTU of
>> > the uplink, therefore after the packet goes through OVS, it has to be
>> > fragmented by IP before going to the wire.
>> >
>> I do not understand what do you mean, gre packets greater than MTU
>> must be fragmented before sent on wire and it is done by GRE-GSO code.
>>
>
> Well, I said fragment, not segment. This is exactly why performance is
> so bad.
>
> In my _default_ setup, every net device on the path has MTU=1500,
> therefore, the packets coming out of a KVM guest can have length=1500,
> after they go through OVS GRE tunnel, their length becomes 1538 because
> of the added GRE header and IP header.
>
> After that, since the packets are not GSO (unless you pass vnet_hdr=on
> to KVM guest), the packets with length=1538 will be _fragmented_ by IP
> layer, since the dest uplink has MTU=1500 too. This is why I proposed to
> reuse GRO cell to merge the packets, which requires a netdev...

Large packets coming from a modern KVM guest will use TSO because this
is a huge performance win regardless of whether any tunneling is used.
It doesn't make any sense for the guest IP stack to take a stream of
packets, split them apart, merge them in the hypervisor stack, and
split them again before transmission. Any packets potentially worth
merging will almost certainly have originated as a single buffer in
the guest, so we should keep them together all the way from the guest
to the GSO/TSO layer.

The real problem is that the requested MSS is not correct, since it
does not leave room for the tunnel headers. In the "best" situation we
would first segment the packet to the requested size, add the tunnel
headers, and then fragment. However, it looks to me like the original
size is being carried all the way to the GSO code, which will then
generate packets that are greater than the MTU. Both cases can likely
be improved upon by either convincing the guest to automatically use a
lower MSS or adjusting it ourselves.
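
To put rough numbers on this, here is a minimal sketch of the
arithmetic. The only figure taken from the thread is the 38-byte growth
(1500 -> 1538); the split of that overhead into an inner Ethernet
header, a base GRE header, and an outer IPv4 header is an assumption,
as are the option-free 20-byte inner IP and TCP headers. The function
names are hypothetical and do not correspond to any OVS or kernel code.

```python
# Assumed header sizes in bytes. Only the 38-byte total is stated in the
# thread; the per-header breakdown below is an assumption.
INNER_ETH = 14   # inner Ethernet header carried inside the tunnel
GRE_HDR = 4      # base GRE header, no key/checksum/sequence fields
OUTER_IP = 20    # outer IPv4 header, no options
INNER_IP = 20    # inner IPv4 header, no options
INNER_TCP = 20   # inner TCP header, no options

TUNNEL_OVERHEAD = INNER_ETH + GRE_HDR + OUTER_IP  # 38 bytes


def encapsulated_len(inner_ip_len):
    """Length of the outer IP packet after GRE encapsulation."""
    return inner_ip_len + TUNNEL_OVERHEAD


def max_inner_mss(uplink_mtu):
    """Largest inner TCP MSS whose encapsulated packet fits the uplink MTU."""
    return uplink_mtu - TUNNEL_OVERHEAD - INNER_IP - INNER_TCP


def ipv4_fragment_sizes(total_len, mtu, ip_hdr=OUTER_IP):
    """On-wire lengths of the IPv4 fragments for a packet of total_len."""
    if total_len <= mtu:
        return [total_len]
    payload = total_len - ip_hdr
    # Every fragment except the last carries a multiple of 8 payload bytes.
    per_frag = (mtu - ip_hdr) // 8 * 8
    sizes = []
    while payload > 0:
        chunk = min(per_frag, payload)
        sizes.append(chunk + ip_hdr)
        payload -= chunk
    return sizes


if __name__ == "__main__":
    mtu = 1500
    outer = encapsulated_len(mtu)
    print(outer, ipv4_fragment_sizes(outer, mtu))
    mss = max_inner_mss(mtu)
    fitted = encapsulated_len(mss + INNER_IP + INNER_TCP)
    print(mss, ipv4_fragment_sizes(fitted, mtu))
```

Under these assumptions a full-size 1500-byte guest packet turns into
one full fragment plus one tiny one, while lowering the MSS by the
tunnel overhead keeps each segment in a single packet.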

Thread overview: 5+ messages
2013-07-08  9:51 A question on the design of OVS GRE tunnel Cong Wang
2013-07-08 16:28 ` Pravin Shelar
2013-07-09  2:41   ` Cong Wang
2013-07-09  6:26     ` Jesse Gross [this message]
2013-07-10  3:34       ` Cong Wang
