From: Dave Taht <dave.taht@gmail.com>
To: Ivan Babrou <ivan@cloudflare.com>
Cc: bpf@vger.kernel.org,
Linux Kernel Network Developers <netdev@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
kernel-team@cloudflare.com, Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <kafai@fb.com>,
Eric Dumazet <edumazet@google.com>
Subject: Re: [PATCH bpf-next] tcp: bpf: Add TCP_BPF_RCV_SSTHRESH for bpf_setsockopt
Date: Wed, 12 Jan 2022 13:01:58 -0800
Message-ID: <CAA93jw6HKLh857nuh2eX2N=siYz5wwQknMaOtpkqLzpfWTGhuA@mail.gmail.com>
In-Reply-To: <20220111192952.49040-1-ivan@cloudflare.com>
On Wed, Jan 12, 2022 at 11:59 AM Ivan Babrou <ivan@cloudflare.com> wrote:
>
> This patch adds bpf_setsockopt(TCP_BPF_RCV_SSTHRESH) to allow setting
> rcv_ssthresh value for TCP connections. Combined with increased
> window_clamp via tcp_rmem[1], it allows to advertise initial scaled
> TCP window larger than 64k. This is useful for high BDP connections,
> where it allows to push data with fewer roundtrips, reducing latency.
I would not use the word "latency" in this way. I would just say
"potentially reducing roundtrips"...
and potentially massively increasing packet loss, oversaturating links,
and otherwise hurting latency for other applications sharing the link,
including the application that advertised an extreme window like this.
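
To put rough numbers on the "fewer roundtrips" part, here is a minimal
sketch that counts how many round trips a transfer needs when the
advertised receive window is the only limit. The function name and the
example sizes are illustrative assumptions, not anything from the patch,
and the model ignores slow start, loss, and pacing, so treat the result
as a lower bound only.

```c
#include <stdint.h>

/*
 * Hypothetical helper: round trips needed to deliver `bytes` when at most
 * `window` bytes can be in flight per round trip (ceiling division).
 * Ignores congestion control, loss, and pacing.
 * Returns 0 when there is nothing to send, UINT64_MAX for a zero window.
 */
uint64_t window_limited_rtts(uint64_t bytes, uint64_t window)
{
    if (bytes == 0)
        return 0;
    if (window == 0)
        return UINT64_MAX; /* a zero window never completes */
    return (bytes + window - 1) / window;
}
```

Under these assumptions a 1 MiB transfer needs at least 17 round trips
with an unscaled 65535-byte window and 1 with a 1 MiB window. The sketch
says nothing about what the larger burst does to queues along the path.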
This overall focus tends to freak me out somewhat, especially when
faced with further statements that Cloudflare is using an initcwnd of 250!???
The kind of damage that just IW10 can do to much lower-bandwidth
connections has to be experienced to be believed.
https://tools.ietf.org/id/draft-gettys-iw10-considered-harmful-00.html
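
As a back-of-the-envelope illustration of that damage, the sketch below
estimates how long an initial burst takes to drain through a slow
bottleneck. The helper name, the 1460-byte MSS, and the 1 Mbit/s link are
assumptions chosen for the example; header overhead and competing traffic
are ignored.

```c
#include <stdint.h>

/*
 * Hypothetical helper: milliseconds needed to serialize an initial burst
 * of `segments` segments of `mss_bytes` payload each onto a link running
 * at `link_bps` bits per second. Truncates to whole milliseconds.
 * Returns 0 for a zero link rate to avoid dividing by zero.
 */
uint64_t burst_drain_ms(uint32_t segments, uint32_t mss_bytes, uint64_t link_bps)
{
    uint64_t bits;

    if (link_bps == 0)
        return 0;
    bits = (uint64_t)segments * mss_bytes * 8;
    return bits * 1000 / link_bps;
}
```

With those assumed numbers, IW10 occupies a 1 Mbit/s link for about 116 ms,
while an initial window of 250 segments occupies it for 2920 ms, which is
delay that every other flow queued behind the burst also sees.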
>
> For active connections the larger window is advertised in the first
> non-SYN ACK packet as the part of the 3 way handshake.
>
> For passive connections the larger window is advertised whenever
> there's any packet to send after the 3 way handshake.
>
> See: https://lkml.org/lkml/2021/12/22/652
>
> Signed-off-by: Ivan Babrou <ivan@cloudflare.com>
> ---
> include/uapi/linux/bpf.h | 1 +
> net/core/filter.c | 6 ++++++
> tools/include/uapi/linux/bpf.h | 1 +
> 3 files changed, 8 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 791f31dd0abe..36ebf87278bd 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -5978,6 +5978,7 @@ enum {
> TCP_BPF_SYN = 1005, /* Copy the TCP header */
> TCP_BPF_SYN_IP = 1006, /* Copy the IP[46] and TCP header */
> TCP_BPF_SYN_MAC = 1007, /* Copy the MAC, IP[46], and TCP header */
> + TCP_BPF_RCV_SSTHRESH = 1008, /* Set rcv_ssthresh */
> };
>
> enum {
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 2e32cee2c469..aafb6066b1a6 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -4904,6 +4904,12 @@ static int _bpf_setsockopt(struct sock *sk, int level, int optname,
>  					return -EINVAL;
>  				inet_csk(sk)->icsk_rto_min = timeout;
>  				break;
> +			case TCP_BPF_RCV_SSTHRESH:
> +				if (val <= 0)
> +					ret = -EINVAL;
> +				else
> +					tp->rcv_ssthresh = min_t(u32, val, tp->window_clamp);
> +				break;
>  			case TCP_SAVE_SYN:
>  				if (val < 0 || val > 1)
>  					ret = -EINVAL;
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 791f31dd0abe..36ebf87278bd 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -5978,6 +5978,7 @@ enum {
> TCP_BPF_SYN = 1005, /* Copy the TCP header */
> TCP_BPF_SYN_IP = 1006, /* Copy the IP[46] and TCP header */
> TCP_BPF_SYN_MAC = 1007, /* Copy the MAC, IP[46], and TCP header */
> + TCP_BPF_RCV_SSTHRESH = 1008, /* Set rcv_ssthresh */
> };
>
> enum {
> --
> 2.34.1
>
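
For readers who want to poke at the semantics of the new case outside the
kernel, here is a small userspace model of it. The struct and function
names are made up for illustration and only mirror the two fields the
quoted hunk touches; this is a sketch of the clamping behaviour, not
kernel code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the two tcp_sock fields used by the hunk. */
struct tcp_model {
    uint32_t rcv_ssthresh;
    uint32_t window_clamp;
};

/*
 * Models the TCP_BPF_RCV_SSTHRESH case: reject non-positive values,
 * otherwise store the value capped at window_clamp.
 * Returns 0 on success or -EINVAL, leaving the state untouched on error.
 */
int set_rcv_ssthresh(struct tcp_model *tp, int val)
{
    uint32_t requested;

    if (val <= 0)
        return -EINVAL;
    requested = (uint32_t)val;
    tp->rcv_ssthresh = requested < tp->window_clamp ? requested
                                                    : tp->window_clamp;
    return 0;
}
```

In this model, asking for 4 MiB with a 1 MiB window_clamp leaves
rcv_ssthresh at 1 MiB, so the clamp (raised via tcp_rmem in the proposal)
is the real ceiling on what gets advertised.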
--
I tried to build a better future, a few times:
https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org
Dave Täht CEO, TekLibre, LLC