From: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
To: Andi Kleen <andi@firstfloor.org>
Cc: "Templin, Fred L" <Fred.L.Templin@boeing.com>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Rick Jones <rick.jones2@hp.com>, Glen Turner <gdt@gdt.id.au>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: UDP path MTU discovery
Date: Tue, 30 Mar 2010 08:16:27 +0200
Message-ID: <20100330061627.GA22436@laped.iglesias.mooo.com>
In-Reply-To: <20100330052044.GJ20695@one.firstfloor.org>

On Tue, Mar 30, 2010 at 07:20:44AM +0200, Andi Kleen wrote:
> On Mon, Mar 29, 2010 at 04:38:49PM -0700, Templin, Fred L wrote:
> > > 1) 4096 bytes UDP messages... well...
> > > 2) Using regular TCP for DNS servers... well...
> > > 
> > > I believe some guys were pushing TCPCT (Cookie Transactions) for this
> > > case ( http://tools.ietf.org/html/draft-simpson-tcpct-00.html )
> > > 
> > > (That is, using an enhanced TCP for long DNS queries... but not only for
> > > DNS...)
> > 
> > IPv4 gets by this by setting DF=0 in the IP header, and
> > lets the network fragment the packet if necessary. IPv6 can
> > similarly get by this by having the sending host fragment
> > the large UDP packet into IPv6 fragments no longer than
> > 1280 bytes each.
> 
> That's true -- in theory the UDP app unwilling/unable to do proper pmtudisc 
> could set the path mtu to 1280 + header and still keep path mtu discovery off 
> and then just fragment. 
> 
> Drawback would be of course suboptimal network use with too small MTUs
> in the common case.
>
> Right now there is no right socket option to set the path mtu. We
> have an IP_MTU option, but it only works for getting the MTU.
> That's because the PMTU is in the routing cache entry and shared
> by multiple sockets. Presumably one could add a special case
> with an MTU in the socket overriding the one in the destination entry.

Sorry, I'm not following you here. Why do you need to set the MTU?

IIUC:
UDP is supposed to preserve datagram boundaries, so when the sender
sees an EMSGSIZE it should read the PMTU and avoid sending further UDP
datagrams larger than that. Userspace has control over the UDP datagram
size. If it can, the app should also retransmit at this point any recent
datagrams that went out larger than the fresh PMTU.
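
The flow described above could be sketched roughly as follows. This is
only an illustration, not code from the thread: it assumes a connected
IPv4 UDP socket on Linux, and the helper names (max_udp_payload,
enable_pmtu_errors, send_tracking_pmtu) are made up for the example.
IP_MTU_DISCOVER, IP_PMTUDISC_DO and IP_MTU are the socket options
documented in ip(7). Retransmitting earlier oversized datagrams is left
to the application.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Largest UDP payload that fits in one unfragmented packet of size pmtu.
 * Assumes an IPv4 header without options (20 bytes) or a bare IPv6
 * header (40 bytes), plus the 8-byte UDP header. */
int max_udp_payload(int pmtu, int family)
{
    int ip_hdr = (family == AF_INET6) ? 40 : 20;
    return pmtu - ip_hdr - 8;
}

/* Ask the kernel to set DF and fail with EMSGSIZE instead of
 * fragmenting locally. Returns 0 on success, -1 on error. */
int enable_pmtu_errors(int fd)
{
    int val = IP_PMTUDISC_DO;
    return setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &val, sizeof(val));
}

/* Send one datagram on a connected UDP socket. If the kernel rejects it
 * with EMSGSIZE, read the current path MTU and store the new payload
 * limit in *limit so the caller can keep later datagrams below it.
 * Returns whatever send() returned. */
ssize_t send_tracking_pmtu(int fd, const void *buf, size_t len, int *limit)
{
    assert(limit != NULL);

    ssize_t n = send(fd, buf, len, 0);
    if (n < 0 && errno == EMSGSIZE) {
        int mtu = 0;
        socklen_t optlen = sizeof(mtu);

        /* IP_MTU is read-only and only valid on a connected socket. */
        if (getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &optlen) == 0)
            *limit = max_udp_payload(mtu, AF_INET);
        errno = EMSGSIZE; /* getsockopt() may have changed errno */
    }
    return n;
}
```

For a 1500-byte path, max_udp_payload(1500, AF_INET) gives 1472, which
is the size the app would stay under after the first EMSGSIZE.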

If you don't want the hassle of all that, the app can stick to
1280 (or, I guess, for the extreme/lazy cases, turn on fragmentation).
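
The two fallbacks mentioned here might look something like this. Again
a sketch under assumptions rather than anything posted in the thread:
the function names are hypothetical, header sizes assume no IPv4
options or IPv6 extension headers, and "turn on fragmentation" is
interpreted as setting IP_MTU_DISCOVER to IP_PMTUDISC_DONT on an IPv4
socket, as described in ip(7).

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* UDP payload that fits in a 1280-byte packet: 1280 minus the IP header
 * (20 bytes for IPv4, 40 for IPv6) minus the 8-byte UDP header. */
size_t payload_at_1280(int family)
{
    size_t ip_hdr = (family == AF_INET6) ? 40 : 20;
    return 1280 - ip_hdr - 8;
}

/* Number of datagrams needed to carry total bytes when every datagram
 * is capped at the 1280-byte packet size. */
size_t datagrams_needed(size_t total, int family)
{
    size_t cap = payload_at_1280(family);
    return (total + cap - 1) / cap;
}

/* The lazy option: stop setting DF so the stack and routers may
 * fragment large datagrams. Returns 0 on success, -1 on error. */
int allow_fragmentation(int fd)
{
    assert(fd >= 0);
    int val = IP_PMTUDISC_DONT;
    return setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &val, sizeof(val));
}
```

With this cap, a 4096-byte message like the one discussed earlier in
the thread would need datagrams_needed(4096, AF_INET6) = 4 datagrams of
at most 1232 payload bytes each.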

Cheers
