netdev.vger.kernel.org archive mirror
From: Martin KaFai Lau <kafai@fb.com>
To: <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Eric Dumazet <edumazet@google.com>, <kernel-team@fb.com>,
	Lawrence Brakmo <brakmo@fb.com>,
	Neal Cardwell <ncardwell@google.com>, <netdev@vger.kernel.org>,
	Yuchung Cheng <ycheng@google.com>
Subject: [PATCH v3 bpf-next 0/9] BPF TCP header options
Date: Thu, 30 Jul 2020 13:56:57 -0700
Message-ID: <20200730205657.3351905-1-kafai@fb.com>

The earlier BPF-TCP-CC work allows a TCP congestion control
algorithm to be written in BPF.  It opens up opportunities for
a faster turnaround time in testing and releasing new congestion
control ideas to production environments.

The same flexibility can be extended to writing TCP header options.
It is not uncommon that people want to test a new TCP header option
to improve TCP performance.  Another use case is a data center,
which is a more controlled environment and has more flexibility in
using header options for internal traffic only.

This patch set introduces the necessary BPF logic and API to
allow a bpf program to write and parse header options.

There are also some changes to TCP, mostly to provide the sk and
skb info that the bpf program needs to make decisions.

Patch 6 is the main patch and has more details on the API and design.

The set includes an example that sends the max delayed ACK value in
a BPF-written TCP header option, so that the receiving side can
then adjust its RTO accordingly.
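
As an illustration of what such an option could look like on the wire,
here is a minimal userspace sketch.  It is not code from this series.
It assumes the RFC 6994 experimental option layout (kind 254, length,
16-bit magic) with the 0xeB9F magic mentioned in the v2 notes below,
and it assumes a hypothetical 16-bit payload holding the max delayed
ACK in milliseconds.  The names delack_opt_write(), delack_opt_read()
and the EXAMPLE_* macros are made up for this sketch; the selftest in
patch 8 may lay out its option differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCPOPT_EXP      254    /* experimental option kind (RFC 6994) */
#define EXAMPLE_MAGIC   0xeB9F /* 16-bit magic from the v2 notes */
#define EXAMPLE_OPT_LEN 6      /* kind + len + magic(2) + delack_ms(2); assumed layout */

/* Write the option into buf.  Returns bytes written, or 0 if it does not fit. */
static size_t delack_opt_write(uint8_t *buf, size_t space, uint16_t max_delack_ms)
{
	assert(buf != NULL);
	if (space < EXAMPLE_OPT_LEN)
		return 0;

	buf[0] = TCPOPT_EXP;
	buf[1] = EXAMPLE_OPT_LEN;
	buf[2] = EXAMPLE_MAGIC >> 8;      /* magic, network byte order */
	buf[3] = EXAMPLE_MAGIC & 0xff;
	buf[4] = max_delack_ms >> 8;      /* payload, network byte order */
	buf[5] = max_delack_ms & 0xff;
	return EXAMPLE_OPT_LEN;
}

/* Parse one option.  Returns 0 and stores the value on success, -1 otherwise. */
static int delack_opt_read(const uint8_t *opt, size_t len, uint16_t *max_delack_ms)
{
	assert(opt != NULL && max_delack_ms != NULL);
	if (len < EXAMPLE_OPT_LEN)
		return -1;
	if (opt[0] != TCPOPT_EXP || opt[1] != EXAMPLE_OPT_LEN)
		return -1;
	if ((((uint16_t)opt[2] << 8) | opt[3]) != EXAMPLE_MAGIC)
		return -1;

	*max_delack_ms = (uint16_t)(((uint16_t)opt[4] << 8) | opt[5]);
	return 0;
}
```

A receiver that decodes the peer's value this way could then feed it
into whatever RTO policy it uses.  In the real series the writing and
parsing are done by a bpf prog through the helpers described in patch 6.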

v3:
- Add kdoc for tcp_make_synack (Jakub Kicinski)
- Add BPF_WRITE_HDR_TCP_CURRENT_MSS and BPF_WRITE_HDR_TCP_SYNACK_COOKIE
  in bpf.h to give a clearer meaning to sock_ops->args[0] when
  writing a header option.
- Rename BPF_SOCK_OPS_PARSE_UNKWN_HDR_OPT_CB_FLAG
  to     BPF_SOCK_OPS_PARSE_UNKNOWN_HDR_OPT_CB_FLAG

v2:
- Instead of limiting the bpf prog to writing an experimental
  option (kind:254, magic:0xeB9F), this revision allows the bpf prog to
  write any TCP header option through the bpf_store_hdr_opt() helper.
  That allows different bpf-progs to write their own
  options, and the helper guarantees there is no duplication.

- Add bpf_load_hdr_opt() helper to search a particular option by kind.
  Some of the get_syn logic is refactored to bpf_sock_ops_get_syn().

- Since the bpf prog is no longer limited to option (254, 0xeB9F),
  the TCP_SKB_CB(skb)->bpf_hdr_opt_off is no longer needed.
  Instead, when there is any option the kernel cannot recognize,
  the bpf prog will be called if the
  BPF_SOCK_OPS_PARSE_UNKWN_HDR_OPT_CB_FLAG is set.
  [ The "unknown_opt" is learned in tcp_parse_options() in patch 4. ]

- Add BPF_SOCK_OPS_PARSE_ALL_HDR_OPT_CB_FLAG.
  If this flag is set, the bpf-prog will be called
  on all tcp packets received at an established sk.
  It is useful for ensuring that a previously written header option
  has been received by the peer.
  e.g. the latter test uses this on the active side during syncookie.

- The test_tcp_hdr_options.c is adjusted accordingly
  to test writing both experimental and regular TCP header options.

- The test_misc_tcp_hdr_options.c is added mainly to
  test different cases of the new helpers.

- Break up the TCP_BPF_RTO_MIN and TCP_BPF_DELACK_MAX into
  two patches.

- Directly store the tcp_hdrlen in "struct saved_syn" instead of
  going back to the tcp header to obtain it by "th->doff * 4".

- Add a new optval(==2) for setsockopt(TCP_SAVE_SYN) such
  that it will also store the mac header (patch 9).
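
Two of the notes above lend themselves to a small illustration:
obtaining the TCP header length from the data offset ("th->doff * 4")
and searching the options for a particular kind.  The following is a
userspace sketch over a raw header buffer, not the kernel or helper
implementation from this series.  The function names
tcp_hdrlen_from_doff() and tcp_find_opt() are hypothetical, and the
only assumptions are the standard TCP header layout (data offset in
the high nibble of byte 12, options starting at byte 20) and the usual
kind/length encoding with single-byte EOL (0) and NOP (1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_MIN_HDRLEN 20 /* fixed part of the TCP header */
#define TCPOPT_EOL     0  /* end of option list, single byte */
#define TCPOPT_NOP     1  /* padding, single byte */

/* Header length in bytes: doff counts 32-bit words, i.e. "doff * 4". */
static size_t tcp_hdrlen_from_doff(const uint8_t *th)
{
	assert(th != NULL);
	return (size_t)(th[12] >> 4) * 4;
}

/*
 * Return a pointer to the first option of the given kind, or NULL if it
 * is absent or the option area is malformed.  caplen is the number of
 * valid bytes at th.  EOL and NOP carry no length byte and are never
 * returned as matches.
 */
static const uint8_t *tcp_find_opt(const uint8_t *th, size_t caplen, uint8_t kind)
{
	size_t hdrlen, off = TCP_MIN_HDRLEN;

	assert(th != NULL);
	if (caplen < TCP_MIN_HDRLEN)
		return NULL;
	hdrlen = tcp_hdrlen_from_doff(th);
	if (hdrlen < TCP_MIN_HDRLEN || hdrlen > caplen)
		return NULL;

	while (off < hdrlen) {
		uint8_t k = th[off];
		uint8_t len;

		if (k == TCPOPT_EOL)
			break;
		if (k == TCPOPT_NOP) {
			off++;
			continue;
		}
		if (off + 2 > hdrlen)
			break;          /* no room for the length byte */
		len = th[off + 1];
		if (len < 2 || off + len > hdrlen)
			break;          /* malformed length */
		if (k == kind)
			return th + off;
		off += len;
	}
	return NULL;
}
```

Storing the header length once, as the saved_syn change does, avoids
repeating the first function's work each time the saved SYN is used.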

Martin KaFai Lau (9):
  tcp: Use a struct to represent a saved_syn
  tcp: bpf: Add TCP_BPF_DELACK_MAX setsockopt
  tcp: bpf: Add TCP_BPF_RTO_MIN for bpf_setsockopt
  tcp: Add unknown_opt arg to tcp_parse_options
  bpf: sock_ops: Change some members of sock_ops_kern from u32 to u8
  bpf: tcp: Allow bpf prog to write and parse TCP header option
  bpf: selftests: Add fastopen_connect to network_helpers
  bpf: selftests: tcp header options
  tcp: bpf: Optionally store mac header in TCP_SAVE_SYN

 drivers/infiniband/hw/cxgb4/cm.c              |   2 +-
 include/linux/bpf-cgroup.h                    |  25 +
 include/linux/filter.h                        |   8 +-
 include/linux/tcp.h                           |  18 +-
 include/net/inet_connection_sock.h            |   2 +
 include/net/request_sock.h                    |   9 +-
 include/net/tcp.h                             |  58 +-
 include/uapi/linux/bpf.h                      | 234 ++++++-
 net/core/filter.c                             | 416 ++++++++++-
 net/ipv4/syncookies.c                         |   2 +-
 net/ipv4/tcp.c                                |  16 +-
 net/ipv4/tcp_fastopen.c                       |   2 +-
 net/ipv4/tcp_input.c                          | 151 +++-
 net/ipv4/tcp_ipv4.c                           |   3 +-
 net/ipv4/tcp_minisocks.c                      |   5 +-
 net/ipv4/tcp_output.c                         | 196 +++++-
 net/ipv6/syncookies.c                         |   2 +-
 net/ipv6/tcp_ipv6.c                           |   3 +-
 tools/include/uapi/linux/bpf.h                | 234 ++++++-
 tools/testing/selftests/bpf/network_helpers.c |  37 +
 tools/testing/selftests/bpf/network_helpers.h |   2 +
 .../bpf/prog_tests/tcp_hdr_options.c          | 629 +++++++++++++++++
 .../bpf/progs/test_misc_tcp_hdr_options.c     | 338 +++++++++
 .../bpf/progs/test_tcp_hdr_options.c          | 657 ++++++++++++++++++
 .../selftests/bpf/test_tcp_hdr_options.h      | 150 ++++
 25 files changed, 3126 insertions(+), 73 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_misc_tcp_hdr_options.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_tcp_hdr_options.c
 create mode 100644 tools/testing/selftests/bpf/test_tcp_hdr_options.h

-- 
2.24.1


Thread overview: 17+ messages
2020-07-30 20:56 Martin KaFai Lau [this message]
2020-07-30 20:57 ` [PATCH v3 bpf-next 1/9] tcp: Use a struct to represent a saved_syn Martin KaFai Lau
2020-07-31 15:57   ` Eric Dumazet
2020-07-31 17:31     ` Eric Dumazet
2020-07-30 20:57 ` [PATCH v3 bpf-next 2/9] tcp: bpf: Add TCP_BPF_DELACK_MAX setsockopt Martin KaFai Lau
2020-07-30 20:57 ` [PATCH v3 bpf-next 3/9] tcp: bpf: Add TCP_BPF_RTO_MIN for bpf_setsockopt Martin KaFai Lau
2020-07-30 20:57 ` [PATCH v3 bpf-next 4/9] tcp: Add unknown_opt arg to tcp_parse_options Martin KaFai Lau
2020-07-31 16:12   ` Eric Dumazet
2020-07-31 17:37     ` Martin KaFai Lau
2020-07-30 20:57 ` [PATCH v3 bpf-next 5/9] bpf: sock_ops: Change some members of sock_ops_kern from u32 to u8 Martin KaFai Lau
2020-07-30 20:57 ` [PATCH v3 bpf-next 6/9] bpf: tcp: Allow bpf prog to write and parse TCP header option Martin KaFai Lau
2020-07-31 16:06   ` Eric Dumazet
2020-07-31 17:59     ` Martin KaFai Lau
2020-07-30 20:57 ` [PATCH v3 bpf-next 7/9] bpf: selftests: Add fastopen_connect to network_helpers Martin KaFai Lau
2020-07-30 20:57 ` [PATCH v3 bpf-next 8/9] bpf: selftests: tcp header options Martin KaFai Lau
2020-07-30 20:57 ` [PATCH v3 bpf-next 9/9] tcp: bpf: Optionally store mac header in TCP_SAVE_SYN Martin KaFai Lau
2020-07-31 15:51   ` Eric Dumazet
