bpf.vger.kernel.org archive mirror
From: Dave Taht <dave.taht@gmail.com>
To: Ivan Babrou <ivan@cloudflare.com>
Cc: bpf <bpf@vger.kernel.org>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	kernel-team <kernel-team@cloudflare.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>,
	Eric Dumazet <edumazet@google.com>
Subject: Re: [PATCH bpf-next] tcp: bpf: Add TCP_BPF_RCV_SSTHRESH for bpf_setsockopt
Date: Sat, 15 Jan 2022 08:46:52 -0800	[thread overview]
Message-ID: <CAA93jw435mThYcBA_7Sf1Z6W_bZrLuK8FLHw8AgAwg0+3y6PBw@mail.gmail.com> (raw)
In-Reply-To: <CABWYdi1p=rRQM3oySw2+N+mcrUq3bXA5MXm8cHmC3=qfCU5SDA@mail.gmail.com>

On Fri, Jan 14, 2022 at 2:21 PM Ivan Babrou <ivan@cloudflare.com> wrote:
> On Thu, Jan 13, 2022 at 9:44 PM Dave Taht <dave.taht@gmail.com> wrote:
> > Yes, but with the caveats below. I'm fine with you just saying round trips,
> > and making this api possible.
> >
> > It would comfort me further if you could provide an actual scenario.
> The actual scenario is getting a response as quickly as possible on a
> fresh connection across long distances (200ms+ RTT). If an RPC
> response doesn't fit into the initial 64k of rcv_ssthresh, we end up
> requiring more roundtrips to receive the response. Some customers are
> very picky about the latency they measure and cutting the extra
> roundtrips made a very visible difference in the tests.
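(A back-of-the-envelope sketch of the round-trip cost being described
above: assume the receive window starts at rcv_ssthresh and can
roughly double each RTT as the receiver opens it up. The doubling
model and the numbers are illustrative simplifications, not the
actual Linux rcv_ssthresh growth algorithm.)

```python
# Rough model: the receiver advertises init_window bytes and
# (optimistically) doubles the window each RTT. Illustrative only;
# not the real Linux receive-window autotuning.
def rtts_to_deliver(response_bytes, init_window=64 * 1024):
    delivered, window, rtts = 0, init_window, 0
    while delivered < response_bytes:
        delivered += window
        window *= 2
        rtts += 1
    return rtts

# A 256 KB RPC response vs. a 64 KB initial rcv_ssthresh:
extra = rtts_to_deliver(256 * 1024) - rtts_to_deliver(64 * 1024)
# extra == 2 here; on a 200 ms path that is 400 ms of added latency.
```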
> > See also:
> >
> > https://datatracker.ietf.org/doc/html/rfc6928
> >
> > which predates packet pacing (are you using sch_fq?)
> We are using fq and bbr.
> > > Congestion window is a learned property, not a static number. You
> > > won't get a large initcwnd towards a poor connection.
> >
> > initcwnd is set globally or on a per route basis.

Like I said, retaining the window state learned from an existing
connection is OK. I think arbitrarily declaring a window like this
for a new connection is not.

> With TCP_BPF_IW the world is your oyster.

The oyster has to co-habit in this ocean with all the other life
there, and I would be comforted if your customer also tracked
various other TCP_INFO statistics, like RTT growth, loss, ECN marks,
and retransmits, and was aware not just of the self-harm inflicted
but of the collateral damage. In fact, I really wish more people
were instrumenting everything with that; of late we've seen a lot of
need for TCP_NOTSENT_LOWAT in things like Apache Traffic Server in
containers. A simple one-line patch for a widely used app I can't
talk about did wonders for actual perceived throughput and
responsiveness for the end user. Measuring from the receiver is far,
far more important than measuring from the sender, as is collecting
long-term statistics over many connections from the real world. I
hope y'all have been instrumenting your work as well as Google has
on these fronts.
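(A sketch of the kind of instrumentation being suggested: set
TCP_NOTSENT_LOWAT and pull a few fields out of TCP_INFO over a
loopback connection. Linux-only; the struct offsets assume the
layout of the leading fields of Linux's struct tcp_info, eight
one-byte fields followed by __u32 fields, and the 128 KB value is
just an example.)

```python
import socket
import struct

TCP_NOTSENT_LOWAT = getattr(socket, "TCP_NOTSENT_LOWAT", 25)  # 25 on Linux

def tcp_stats(sock):
    # Fetch the first 104 bytes of struct tcp_info: 8 one-byte
    # fields (state, ca_state, retransmits, probes, backoff,
    # options, two bitfield bytes), then 24 __u32 fields.
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 104)
    u8s = struct.unpack_from("8B", raw, 0)
    u32s = struct.unpack_from("24I", raw, 8)
    return {
        "retransmits": u8s[2],     # tcpi_retransmits
        "lost": u32s[6],           # tcpi_lost
        "rcv_ssthresh": u32s[14],  # tcpi_rcv_ssthresh
        "rtt_us": u32s[15],        # tcpi_rtt (smoothed RTT, usec)
        "rttvar_us": u32s[16],     # tcpi_rttvar
    }

# Loopback demo: server + client pair.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
cli = socket.socket()
# Keep no more than 128 KB of unsent data buffered in the kernel,
# so backpressure reaches the application sooner.
cli.setsockopt(socket.IPPROTO_TCP, TCP_NOTSENT_LOWAT, 128 * 1024)
cli.connect(srv.getsockname())
conn, _ = srv.accept()
stats = tcp_stats(cli)
```

Logging these per-connection, receiver-side, over time is what turns
a tuning knob like this from a guess into a measurement.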

I know that I'm getting old and crunchy and scarred by seeing so many
(corporate wifi mostly) networks over the last decade essentially in
congestion collapse!


I'm very happy with how well sch_fq + packet pacing works to
mitigate impulses like this, as with so many other things like BBR
and BQL, but pacing out != pacing in, and despite my fervent wish
for more FQ+AQM techniques on more bottleneck links, we're not there
yet.

I like very much that BPF is allowing rapid innovation, but with great
power comes great responsibility.
I tried to build a better future, a few times:

Dave Täht CEO, TekLibre, LLC


Thread overview: 8+ messages
2022-01-11 19:29 [PATCH bpf-next] tcp: bpf: Add TCP_BPF_RCV_SSTHRESH for bpf_setsockopt Ivan Babrou
2022-01-11 21:47 ` Song Liu
2022-01-13 22:56   ` Ivan Babrou
2022-01-12 21:01 ` Dave Taht
2022-01-13 22:56   ` Ivan Babrou
2022-01-14  5:43     ` Dave Taht
2022-01-14 22:20       ` Ivan Babrou
2022-01-15 16:46         ` Dave Taht [this message]
