netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Steffen Klassert <steffen.klassert@secunet.com>
To: Lorenzo Colitti <lorenzo@google.com>
Cc: mtk81216 <lina.wang@mediatek.com>,
	"David S . Miller" <davem@davemloft.net>,
	"Alexey Kuznetsov" <kuznet@ms2.inr.ac.ru>,
	"Hideaki YOSHIFUJI" <yoshfuji@linux-ipv6.org>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Herbert Xu" <herbert@gondor.apana.org.au>,
	"Matthias Brugger" <matthias.bgg@gmail.com>,
	"Linux NetDev" <netdev@vger.kernel.org>,
	lkml <linux-kernel@vger.kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org,
	"Maciej Żenczykowski" <maze@google.com>,
	"Greg Kroah-Hartman" <gregkh@google.com>
Subject: Re: [PATCH] xfrm:fragmented ipv4 tunnel packets in inner interface
Date: Mon, 9 Nov 2020 10:58:13 +0100	[thread overview]
Message-ID: <20201109095813.GV26422@gauss3.secunet.de> (raw)
In-Reply-To: <CAKD1Yr1VsueZWUtCL4iMWLhnADypUOtDK=dgqM2Y2HDvXc77iw@mail.gmail.com>

On Thu, Nov 05, 2020 at 01:52:01PM +0900, Lorenzo Colitti wrote:
> On Tue, Sep 15, 2020 at 4:30 PM Steffen Klassert
> <steffen.klassert@secunet.com> wrote:
> > > In esp's tunnel mode,if inner interface is ipv4,outer is ipv4,one big
> > > packet which travels through tunnel will be fragmented with outer
> > > interface's mtu,peer server will remove tunnelled esp header and assemble
> > > them in big packet.After forwarding such packet to next endpoint,it will
> > > be dropped because of exceeding mtu or be returned ICMP(packet-too-big).
> >
> > What is the exact case where packets are dropped? Given that the packet
> > was fragmented (and reassembled), I'd assume the DF bit was not set. So
> > every router along the path is allowed to fragment again if needed.
> 
> In general, isn't it just suboptimal to rely on fragmentation if the
> sender already knows the packet is too big? That's why we have things
> like path MTU discovery (RFC 1191).

When we setup packets that are sent from a local socket, we take
MTU/PMTU info we have into account. So we don't create fragments in
that case.

When forwarding packets it is different. The router that can not
TX the packet because it exceeds the MTU of the sending interface
is responsible to either fragment (if DF is not set), or send a
PMTU notification (if DF is set). So if we are able to transmit
the packet, we do it.

> Fragmentation is generally
> expensive, increases the chance of packet loss, and has historically
> caused lots of security vulnerabilities. Also, in real world networks,
> fragments sometimes just don't work, either because intermediate
> routers don't fragment, or because firewalls drop the fragments due to
> security reasons.
> 
> While it's possible in theory to ask these operators to configure
> their routers to fragment packets, that may not result in the network
> being fixed, due to hardware constraints, security policy or other
> reasons.

We can not really do anything here. If a flow has no DF bit set
on the packets, we can not rely on PMTU information. If we have PMTU
info on the route, then we have it because some other flow (that has
DF bit set on the packets) triggered PMTU discovery. That means that
the PMTU information is reset when this flow (with DF set) stops
sending packets. So the other flow (with DF not set) will send
big packets again.

> Those operators may also be in a position to place
> requirements on devices that have to use their network. If the Linux
> stack does not work as is on these networks, then those devices will
> have to meet those requirements by making out-of-tree changes. It
> would be good to avoid that if there's a better solution (e.g., make
> this configurable via sysctl).

We should not try to workaround broken configurations, there are just
too many possibilities to configure a broken network.

  reply	other threads:[~2020-11-09  9:58 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20200909062613.18604-1-lina.wang@mediatek.com>
2020-09-15  7:30 ` [PATCH] xfrm:fragmented ipv4 tunnel packets in inner interface Steffen Klassert
     [not found]   ` <1600160722.5295.15.camel@mbjsdccf07>
     [not found]     ` <20200915093230.GS20687@gauss3.secunet.de>
     [not found]       ` <1600172260.2494.2.camel@mbjsdccf07>
     [not found]         ` <20200917074637.GV20687@gauss3.secunet.de>
     [not found]           ` <1600341549.32639.5.camel@mbjsdccf07>
     [not found]             ` <1604547381.23648.14.camel@mbjsdccf07>
2020-11-05  4:41               ` Maciej Żenczykowski
2020-11-05  4:52   ` Lorenzo Colitti
2020-11-09  9:58     ` Steffen Klassert [this message]
2020-11-09 19:38       ` Maciej Żenczykowski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201109095813.GV26422@gauss3.secunet.de \
    --to=steffen.klassert@secunet.com \
    --cc=davem@davemloft.net \
    --cc=gregkh@google.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=kuba@kernel.org \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=lina.wang@mediatek.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=lorenzo@google.com \
    --cc=matthias.bgg@gmail.com \
    --cc=maze@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=yoshfuji@linux-ipv6.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).