All of lore.kernel.org
 help / color / mirror / Atom feed
From: wenxu <wenxu@ucloud.cn>
To: Jakub Kicinski <kuba@kernel.org>, David Ahern <dsahern@gmail.com>
Cc: netdev@vger.kernel.org, Stefano Brivio <sbrivio@redhat.com>,
	David Ahern <dsahern@kernel.org>
Subject: Re: [PATCH net] ip_tunnel: fix over-mtu packet send fail without TUNNEL_DONT_FRAGMENT flags
Date: Thu, 29 Oct 2020 10:30:50 +0800	[thread overview]
Message-ID: <057f100e-2b80-f831-0a22-8d2dfe5529bd@ucloud.cn> (raw)
In-Reply-To: <20201027085548.05b39e0d@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>


On 10/27/2020 11:55 PM, Jakub Kicinski wrote:
> On Tue, 27 Oct 2020 08:51:07 -0600 David Ahern wrote:
>>> Is this another incarnation of 4cb47a8644cc ("tunnels: PMTU discovery
>>> support for directly bridged IP packets")? Sounds like non-UDP tunnels
>>> need the same treatment to make PMTUD work.
>>>
>>> RFC2003 seems to clearly forbid ignoring the inner DF:  
>> I was looking at this patch Sunday night. To me it seems odd that
>> packets flowing through the overlay affect decisions in the underlay
>> which meant I agree with the proposed change.
> The RFC was probably written before we invented terms like underlay 
> and overlay, and still considered tunneling to be an inefficient hack ;)
>
>> ip_md_tunnel_xmit is inconsistent right now. tnl_update_pmtu is called
>> based on the TUNNEL_DONT_FRAGMENT flag, so why let it be changed later
>> based on the inner header? Or, if you agree with RFC 2003 and the DF
>> should be propagated outer to inner, then it seems like the df reset
>> needs to be moved up before the call to tnl_update_pmtu
> Looks like TUNNEL_DONT_FRAGMENT is intended to switch between using
> PMTU inside the tunnel or just the tunnel dev MTU. ICMP PTB is still
> generated based on the inner headers.
>
> We should be okay to add something like IFLA_GRE_IGNORE_DF to lwt, 
> but IMHO the default should not be violating the RFC.

If we add  TUNNEL_IGNORE_DF to lwt,  the two IGNORE_DF and DONT_FRAGMENT

flags should not coexist ?   Or DONT_FRAGMENT is prior to the IGNORE_DF?


Also there is inconsistent in the kernel for the tunnel device. For geneve and

vxlan tunnel (don't send tunnel with ip_md_tunnel_xmit) in the lwt mode set

the outer df only based  TUNNEL_DONT_FRAGMENT .

And this is also the some behavior for gre device before switching to use 
ip_md_tunnel_xmit as the following patch.

962924f ip_gre: Refactor collect metatdata mode tunnel xmit to ip_md_tunnel_xmit


  reply	other threads:[~2020-10-29  2:32 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-21  9:21 [PATCH net] ip_tunnel: fix over-mtu packet send fail without TUNNEL_DONT_FRAGMENT flags wenxu
2020-10-23 21:12 ` Jakub Kicinski
2020-10-26  8:23   ` wenxu
2020-10-26 20:56     ` Jakub Kicinski
2020-10-27 14:51       ` David Ahern
2020-10-27 15:55         ` Jakub Kicinski
2020-10-29  2:30           ` wenxu [this message]
2020-10-30  1:14             ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=057f100e-2b80-f831-0a22-8d2dfe5529bd@ucloud.cn \
    --to=wenxu@ucloud.cn \
    --cc=dsahern@gmail.com \
    --cc=dsahern@kernel.org \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=sbrivio@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.