linux-kernel.vger.kernel.org archive mirror
From: Ivan Babrou <ivan@cloudflare.com>
To: Eric Dumazet <edumazet@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>,
	netdev <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	kernel-team <kernel-team@cloudflare.com>,
	"David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Jonathan Corbet <corbet@lwn.net>
Subject: Re: [PATCH net] tcp: note that tcp_rmem[1] has a limited range
Date: Thu, 6 Jan 2022 14:40:26 -0800
Message-ID: <CABWYdi26jxzzYPJoN3mrPaXny7FKAmZ61oitSYwuvauGCo+4NA@mail.gmail.com>
In-Reply-To: <CANn89i+9cOC4Ftnh2q7SZ+iP7-qe2jkFW3NtvFGzXLxoGUOsiA@mail.gmail.com>

On Thu, Jan 6, 2022 at 12:25 AM Eric Dumazet <edumazet@google.com> wrote:

> Just to clarify, normal TCP 3WHS has a final ACK packet, where window
> scaling is enabled.

Correct, yet this final ACK packet won't advertise an initial scaled
window above 64k. That's what I'm trying to document, as it seems like
a useful thing to keep in mind. If this statement is incorrect, then
I'm definitely missing something very basic. Let me know if that's the
case.
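
For readers following along, here is a rough sketch of the wire-format
arithmetic behind this point. It only models the RFC 7323 rule that the
16-bit window field is unscaled on SYN segments and shifted by the
negotiated scale afterwards. The helper names and the shift count of 7
are hypothetical, chosen for illustration; it does not model how the
stack picks its initial receive window, which is where the 64k limit
discussed here comes from.

```python
MAX_WINDOW_FIELD = 0xFFFF  # the TCP header window field is 16 bits wide


def window_field(rcv_wnd_bytes, wscale, is_syn):
    """Value placed in the 16-bit window field (hypothetical helper).

    On SYN and SYN-ACK segments the field is never scaled, so the
    shift is ignored there.
    """
    shift = 0 if is_syn else wscale
    return min(rcv_wnd_bytes >> shift, MAX_WINDOW_FIELD)


def advertised_bytes(field, wscale, is_syn):
    """Window size the peer derives from a received field value."""
    return field << (0 if is_syn else wscale)


# Assumed numbers: the client wants 250K, window scale shift of 7.
wanted = 250_000
wscale = 7

syn_field = window_field(wanted, wscale, is_syn=True)
ack_field = window_field(wanted, wscale, is_syn=False)

print("SYN advertises:", advertised_bytes(syn_field, wscale, True))
print("scaled ACK could encode:", advertised_bytes(ack_field, wscale, False))
```

The encoding itself has room for a window close to 250K on the first
non-SYN ACK; the limitation this patch documents is that the initial
window is not set that high in the first place.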

> You describe a possible issue of passive connections.
> Most of the time, servers want some kind of control before allowing a
> remote peer to send MB of payload in the first round trip.

Let's focus purely on the client side of it. The client is willing to
receive the large payload (let's say 250K), yet it cannot signal this
fact to the server.

> However, a typical connection starts with IW10 (rfc 6928), and
> standard TCP congestion
> control would implement Slow Start, doubling the payload at every round trip,
> so this is not an issue.

It's not an issue on a low-latency link, but when a latency-sensitive
client is trying to retrieve something across a 300ms RTT link, the
extra round trips needed to stretch the window add a lot of latency.
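
To put rough numbers on that, here is a toy model, not a simulation of
any real stack. It assumes that both the sender's congestion window and
the receiver's advertised window double every round trip, that nothing
is lost, and that MSS is 1460 bytes. The function name and the
"server with a large initial congestion window" case are hypothetical.

```python
def round_trips(payload, init_cwnd, init_rwnd=None):
    """Round trips needed to deliver `payload` bytes in a toy model.

    Each round trip the sender transmits min(cwnd, rwnd) bytes, then
    both windows double. init_rwnd=None means the receive window never
    limits the sender.
    """
    cwnd, rwnd = init_cwnd, init_rwnd
    sent = 0
    rtts = 0
    while sent < payload:
        flight = cwnd if rwnd is None else min(cwnd, rwnd)
        sent += flight
        rtts += 1
        cwnd *= 2
        if rwnd is not None:
            rwnd *= 2
    return rtts


PAYLOAD = 250_000   # bytes the client is willing to receive
RTT = 0.3           # seconds, the 300ms link from the text
IW10 = 10 * 1460    # RFC 6928 initial window, assuming MSS of 1460

# With IW10 the congestion window is the bottleneck either way.
print(round_trips(PAYLOAD, IW10), round_trips(PAYLOAD, IW10, 65535))

# Hypothetical sender whose initial cwnd already covers the payload.
unlimited = round_trips(PAYLOAD, PAYLOAD)
capped = round_trips(PAYLOAD, PAYLOAD, 65535)
print(unlimited, capped, "extra seconds:", (capped - unlimited) * RTT)
```

Under these assumptions the 64k starting window makes no difference to
an IW10 sender, but costs two extra round trips when the sender could
otherwise have delivered everything in the first flight.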

> If you want to enable bigger than 65535 RWIN for passive connections,
> this would violate standards and should be discussed first at IETF.

I understand this and I don't intend to do this.

> If you want to enable bigger than 65535 RWIN for passive connections
> in a controlled environment, I suggest using an eBPF program to do so.

Right, eBPF was your suggestion: https://lkml.org/lkml/2021/12/22/668

The intention of this patch is to say that you can't achieve this even
for active connections where the client is willing to advertise a
larger window in the first non-SYN ACK. Currently you cannot do this
even with eBPF, but I'm happy to add the support.

Thread overview: 6+ messages
2022-01-04  0:37 [PATCH net] tcp: note that tcp_rmem[1] has a limited range Ivan Babrou
2022-01-04  0:44 ` Stephen Hemminger
2022-01-04  8:33   ` Eric Dumazet
2022-01-06  4:20     ` Ivan Babrou
2022-01-06  8:25       ` Eric Dumazet
2022-01-06 22:40         ` Ivan Babrou [this message]
