netdev.vger.kernel.org archive mirror
From: Willy Tarreau <w@1wt.eu>
To: David Laight <David.Laight@aculab.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	Amit Klein <aksecurity@gmail.com>,
	Eric Dumazet <edumazet@google.com>
Subject: Re: [PATCH net-next] ipv6: use prandom_u32() for ID generation
Date: Mon, 31 May 2021 13:19:40 +0200
Message-ID: <20210531111940.GA9609@1wt.eu>
In-Reply-To: <e4cc31c1fead46b3aa1132937a720da2@AcuMS.aculab.com>

On Mon, May 31, 2021 at 10:41:18AM +0000, David Laight wrote:
> The problem is that, on average, 1 in 2^32 packets will use
> the same id as the previous one.
> If a fragment of such a pair gets lost horrid things are
> likely to happen.
> Note that this is different from an ID being reused after a
> count of packets or after a time delay.

I'm well aware of this, as this is something we discussed already
for IPv4 and which I objected to for the same reason (except that
it's 1/2^16 there).

With that said, the differences with IPv4 are significant here,
because you won't fragment below 1280 bytes per packet, which
means the issue could happen every 5 terabytes of fragmented
losses (or reorders). I'd say that in the worst case you're
using load-balanced links with some funny LB algorithm that
ensures that every second fragment is sent on the same link
as the previous packet's first fragment. This is the case where
you could provoke a failure every 5 TB. But then you're still
subject to UDP's 16-bit checksum, so in practice you're seeing
a failure every 320 PB. Finally it's the same probability as
getting both TCP csum + Ethernet CRC correct on a failure,
except that here it applies only to large fragments while with
TCP/eth it applies to any packet.
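
The figures above can be checked with a few lines of arithmetic. This is
only a back-of-the-envelope sketch: it assumes every fragmented packet is
exactly 1280 bytes, that IDs are uniformly random over 32 bits, and that
the UDP checksum catches all but 1 in 2^16 bad reassemblies. The constant
names are made up for the illustration, and the units are binary (TiB, PiB).

```python
# Rough check of the "5 TB" and "320 PB" figures, under the assumptions
# stated above (uniform 32-bit IDs, 1280-byte packets, 16-bit checksum).
ID_SPACE = 2**32            # possible IPv6 fragment IDs
PACKET_BYTES = 1280         # assumed size of each fragmented packet
UDP_CSUM_SPACE = 2**16      # possible UDP checksum values

TIB = 2**40
PIB = 2**50

# On average one packet in ID_SPACE reuses the previous packet's ID.
bytes_per_id_repeat = ID_SPACE * PACKET_BYTES

# A mis-reassembled datagram must also pass the UDP checksum by chance.
bytes_per_undetected_failure = bytes_per_id_repeat * UDP_CSUM_SPACE

print(bytes_per_id_repeat // TIB, "TiB between immediate ID repeats")
print(bytes_per_undetected_failure // PIB, "PiB between undetected failures")
```

Since 2^32 * 1280 = 5 * 2^40 and 5 * 2^40 * 2^16 = 320 * 2^50, the two
values come out as exactly 5 TiB and 320 PiB.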

> So you still need something to ensure IDs aren't reused immediately.

That's what I initially did for IPv4 but Amit could exploit this
specific property. For example it makes it easier to count flows
behind NAT when there is a guaranteed distance :-/  We even tried
with a smooth, non-linear distribution, but that made no difference,
it remained observable.
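
To illustrate why a guaranteed distance is observable, here is a toy
simulation. It is not the technique Amit used and not the kernel's code:
the generator, the 8-bit ID space, the distance of 16, and all function
names are assumptions chosen to keep the example small. The idea is that
a single sender with a guaranteed distance never repeats an ID inside its
window, so any repeat seen inside that window suggests more than one
sender behind the same address.

```python
import random

def ids_with_min_distance(rng, n, space, distance):
    """Hypothetical generator: never reuses any of its last `distance` IDs."""
    recent, out = [], []
    while len(out) < n:
        candidate = rng.randrange(space)
        if candidate in recent:
            continue  # reject IDs that would violate the guaranteed distance
        out.append(candidate)
        recent.append(candidate)
        if len(recent) > distance:
            recent.pop(0)
    return out

def repeats_within(ids, distance):
    """Count IDs equal to one of the `distance` IDs observed just before."""
    return sum(1 for i, v in enumerate(ids)
               if v in ids[max(0, i - distance):i])

rng = random.Random(1)            # fixed seed so the run is repeatable
SPACE, DISTANCE, N = 256, 16, 2000

# One sender: the guarantee holds across the whole observed stream.
one_host = ids_with_min_distance(rng, N, SPACE, DISTANCE)

# Two senders behind one address, packets interleaved one-for-one.
host_a = ids_with_min_distance(rng, N // 2, SPACE, DISTANCE)
host_b = ids_with_min_distance(rng, N // 2, SPACE, DISTANCE)
two_hosts = [v for pair in zip(host_a, host_b) for v in pair]

print("one sender, repeats in window: ", repeats_within(one_host, DISTANCE))
print("two senders, repeats in window:", repeats_within(two_hosts, DISTANCE))
```

The single-sender stream shows zero repeats by construction, while the
interleaved stream shows some, which is the kind of signal an observer
can exploit.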

Another idea we had in mind was to keep small increments for local
networks and use full randoms only over routers (since fragments
are rare and terribly unreliable on the net), but that would involve
quite significant changes for very little benefit compared to the
current option in the end.

Regards,
Willy

Thread overview: 5+ messages
2021-05-29 11:07 [PATCH net-next] ipv6: use prandom_u32() for ID generation Willy Tarreau
2021-05-31 10:41 ` David Laight
2021-05-31 11:19   ` Willy Tarreau [this message]
2021-05-31 19:27 ` Eric Dumazet
2021-06-01  5:20 ` patchwork-bot+netdevbpf
