bpf.vger.kernel.org archive mirror
From: Martin KaFai Lau <kafai@fb.com>
To: <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	David Miller <davem@davemloft.net>, <kernel-team@fb.com>,
	<netdev@vger.kernel.org>
Subject: [PATCH bpf-next v2 00/11] Introduce BPF STRUCT_OPS
Date: Fri, 20 Dec 2019 22:25:56 -0800
Message-ID: <20191221062556.1182261-1-kafai@fb.com>

This series introduces BPF STRUCT_OPS.  It is infrastructure that
allows specific kernel function pointers to be implemented in BPF.
The first use case included in this series is implementing a
TCP congestion control algorithm in BPF (i.e. implementing
struct tcp_congestion_ops in BPF).

There have been attempts to move TCP CC to user space
(e.g. CCP in TCP).  The common arguments are faster turnaround
and getting away from long-tail kernel versions in production,
which are legitimate points.

BPF has been a continuous effort to combine the upsides of both
kernel and userspace (e.g. XDP gains the performance
advantage without bypassing the kernel).  The recent BPF
advancements (in particular the BTF-aware verifier, BPF trampoline,
BPF CO-RE...) have made implementing kernel struct ops (e.g. tcp cc)
possible in BPF.

The idea is to allow implementing tcp_congestion_ops in BPF.
It allows a faster turnaround for testing algorithms in
production while leveraging the existing (and continuously growing)
BPF features/framework instead of building one specifically for
userspace TCP CC.
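
To make "implementing a struct of function pointers" concrete, here
is a minimal user-space sketch of the pattern.  It is not the kernel
code from this series: toy_sock, toy_cc_ops, toy_register(),
toy_find() and toy_reno are hypothetical names, and the window
arithmetic is deliberately simplified.  The mandatory-pointer check
only mirrors the spirit of what TCP requires at registration time
(ssthresh and undo_cwnd must be provided).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a TCP socket. */
struct toy_sock {
	uint32_t cwnd;     /* congestion window, in packets */
	uint32_t ssthresh; /* slow-start threshold, in packets */
};

/* Hypothetical stand-in for struct tcp_congestion_ops. */
struct toy_cc_ops {
	const char *name;
	void     (*init)(struct toy_sock *sk);                       /* optional */
	uint32_t (*ssthresh)(struct toy_sock *sk);                   /* mandatory */
	uint32_t (*undo_cwnd)(struct toy_sock *sk);                  /* mandatory */
	void     (*cong_avoid)(struct toy_sock *sk, uint32_t acked); /* mandatory here */
};

#define TOY_MAX_ALGOS 4
static const struct toy_cc_ops *toy_registry[TOY_MAX_ALGOS];

/* Register an ops table.  Returns 0 on success, -1 if a mandatory
 * pointer is missing, -2 if the name is taken, -3 if the table is full.
 */
int toy_register(const struct toy_cc_ops *ops)
{
	size_t i;

	if (!ops || !ops->name || !ops->ssthresh ||
	    !ops->undo_cwnd || !ops->cong_avoid)
		return -1;

	for (i = 0; i < TOY_MAX_ALGOS; i++)
		if (toy_registry[i] && !strcmp(toy_registry[i]->name, ops->name))
			return -2;

	for (i = 0; i < TOY_MAX_ALGOS; i++) {
		if (!toy_registry[i]) {
			toy_registry[i] = ops;
			return 0;
		}
	}
	return -3;
}

/* Look up a registered ops table by name; NULL if not found. */
const struct toy_cc_ops *toy_find(const char *name)
{
	size_t i;

	for (i = 0; i < TOY_MAX_ALGOS; i++)
		if (toy_registry[i] && !strcmp(toy_registry[i]->name, name))
			return toy_registry[i];
	return NULL;
}

/* A toy Reno-like algorithm filling in the ops table. */
static void toy_init(struct toy_sock *sk)
{
	sk->cwnd = 10;
	sk->ssthresh = UINT32_MAX;
}

static uint32_t toy_ssthresh(struct toy_sock *sk)
{
	uint32_t half = sk->cwnd / 2;

	return half > 2 ? half : 2; /* halve the window, floor of 2 */
}

static uint32_t toy_undo_cwnd(struct toy_sock *sk)
{
	return sk->cwnd;
}

static void toy_cong_avoid(struct toy_sock *sk, uint32_t acked)
{
	if (sk->cwnd < sk->ssthresh)
		sk->cwnd += acked; /* slow start: grow per acked packet */
	else
		sk->cwnd += 1;     /* simplified additive increase */
}

const struct toy_cc_ops toy_reno = {
	.name       = "toy_reno",
	.init       = toy_init,
	.ssthresh   = toy_ssthresh,
	.undo_cwnd  = toy_undo_cwnd,
	.cong_avoid = toy_cong_avoid,
};
```

A caller would do toy_register(&toy_reno), fetch the table with
toy_find("toy_reno"), and then invoke the callbacks through the
pointers.  With STRUCT_OPS the callbacks behind such pointers are
BPF programs rather than compiled-in kernel functions.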

Please see the individual patches for details.

The bpftool support will be posted in follow-up patches.

v2:
- Dropped cubic for now.  It will be reposted
  once there is more clarity on "jiffies" on both
  the bpf side (about the helper) and
  the tcp_cubic side (some jiffies usages are being replaced
  by tp->tcp_mstamp)
- Remove unnecessary check on bitfield support from btf_struct_access()
  (Yonghong)
- BTF_TYPE_EMIT macro (Yonghong, Andrii)
- Check value_name's length to avoid an unlikely
  type match in the truncation case (Yonghong)
- BUILD_BUG_ON to ensure no trampoline-image overrun
  in the future (Yonghong)
- Simplify get_next_key() (Yonghong)
- Added comment to explain how to check mandatory
  func ptr in net/ipv4/bpf_tcp_ca.c (Yonghong)
- Rename "__bpf_" to "bpf_struct_ops_" for value prefix (Andrii)
- Add comment to highlight that bpf_dctcp.c is not necessarily
  the same as tcp_dctcp.c. (Alexei, Eric)
- libbpf: Rename "struct_ops" to ".struct_ops" for elf sec (Andrii)
- libbpf: Expose struct_ops as a bpf_map (Andrii)
- libbpf: Support multiple struct_ops in SEC(".struct_ops") (Andrii)
- libbpf: Add bpf_map__attach_struct_ops()  (Andrii)

Martin KaFai Lau (11):
  bpf: Save PTR_TO_BTF_ID register state when spilling to stack
  bpf: Avoid storing modifier to info->btf_id
  bpf: Add enum support to btf_ctx_access()
  bpf: Support bitfield read access in btf_struct_access
  bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS
  bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS
  bpf: tcp: Support tcp_congestion_ops in bpf
  bpf: Add BPF_FUNC_tcp_send_ack helper
  bpf: Synch uapi bpf.h to tools/
  bpf: libbpf: Add STRUCT_OPS support
  bpf: Add bpf_dctcp example

 arch/x86/net/bpf_jit_comp.c                   |  11 +-
 include/linux/bpf.h                           |  79 ++-
 include/linux/bpf_types.h                     |   7 +
 include/linux/btf.h                           |  47 ++
 include/linux/filter.h                        |   2 +
 include/net/tcp.h                             |   1 +
 include/uapi/linux/bpf.h                      |  19 +-
 kernel/bpf/Makefile                           |   2 +-
 kernel/bpf/bpf_struct_ops.c                   | 586 ++++++++++++++++
 kernel/bpf/bpf_struct_ops_types.h             |   9 +
 kernel/bpf/btf.c                              | 129 ++--
 kernel/bpf/map_in_map.c                       |   3 +-
 kernel/bpf/syscall.c                          |  66 +-
 kernel/bpf/trampoline.c                       |   5 +-
 kernel/bpf/verifier.c                         | 140 +++-
 net/core/filter.c                             |   2 +-
 net/ipv4/Makefile                             |   4 +
 net/ipv4/bpf_tcp_ca.c                         | 248 +++++++
 net/ipv4/tcp_cong.c                           |  14 +-
 net/ipv4/tcp_ipv4.c                           |   6 +-
 net/ipv4/tcp_minisocks.c                      |   4 +-
 net/ipv4/tcp_output.c                         |   4 +-
 tools/include/uapi/linux/bpf.h                |  19 +-
 tools/lib/bpf/bpf.c                           |  10 +-
 tools/lib/bpf/bpf.h                           |   5 +-
 tools/lib/bpf/libbpf.c                        | 639 +++++++++++++++++-
 tools/lib/bpf/libbpf.h                        |   1 +
 tools/lib/bpf/libbpf.map                      |   1 +
 tools/lib/bpf/libbpf_probes.c                 |   2 +
 tools/testing/selftests/bpf/bpf_tcp_helpers.h | 228 +++++++
 .../selftests/bpf/prog_tests/bpf_tcp_ca.c     | 218 ++++++
 tools/testing/selftests/bpf/progs/bpf_dctcp.c | 210 ++++++
 32 files changed, 2582 insertions(+), 139 deletions(-)
 create mode 100644 kernel/bpf/bpf_struct_ops.c
 create mode 100644 kernel/bpf/bpf_struct_ops_types.h
 create mode 100644 net/ipv4/bpf_tcp_ca.c
 create mode 100644 tools/testing/selftests/bpf/bpf_tcp_helpers.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_dctcp.c

-- 
2.17.1



Thread overview: 45+ messages
2019-12-21  6:25 Martin KaFai Lau [this message]
2019-12-21  6:25 ` [PATCH bpf-next v2 01/11] bpf: Save PTR_TO_BTF_ID register state when spilling to stack Martin KaFai Lau
2019-12-21  6:25 ` [PATCH bpf-next v2 02/11] bpf: Avoid storing modifier to info->btf_id Martin KaFai Lau
2019-12-21  6:26 ` [PATCH bpf-next v2 03/11] bpf: Add enum support to btf_ctx_access() Martin KaFai Lau
2019-12-21  6:26 ` [PATCH bpf-next v2 04/11] bpf: Support bitfield read access in btf_struct_access Martin KaFai Lau
2019-12-23  7:49   ` Yonghong Song
2019-12-23 20:05   ` Andrii Nakryiko
2019-12-23 21:21     ` Yonghong Song
2019-12-21  6:26 ` [PATCH bpf-next v2 05/11] bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS Martin KaFai Lau
2019-12-23 19:33   ` Yonghong Song
2019-12-23 20:29   ` Andrii Nakryiko
2019-12-23 22:29     ` Martin Lau
2019-12-23 22:55       ` Andrii Nakryiko
2019-12-24 11:46   ` kbuild test robot
2019-12-21  6:26 ` [PATCH bpf-next v2 06/11] bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS Martin KaFai Lau
2019-12-23 19:57   ` Yonghong Song
2019-12-23 21:44     ` Andrii Nakryiko
2019-12-23 22:15       ` Martin Lau
2019-12-27  6:16     ` Martin Lau
2019-12-23 23:05   ` Andrii Nakryiko
2019-12-28  1:47     ` Martin Lau
2019-12-28  2:24       ` Andrii Nakryiko
2019-12-28  5:16         ` Martin Lau
2019-12-24 12:28   ` kbuild test robot
2019-12-21  6:26 ` [PATCH bpf-next v2 07/11] bpf: tcp: Support tcp_congestion_ops in bpf Martin KaFai Lau
2019-12-23 20:18   ` Yonghong Song
2019-12-23 23:20   ` Andrii Nakryiko
2019-12-24  7:16   ` kbuild test robot
2019-12-24 13:06   ` kbuild test robot
2019-12-21  6:26 ` [PATCH bpf-next v2 08/11] bpf: Add BPF_FUNC_tcp_send_ack helper Martin KaFai Lau
2019-12-21  6:26 ` [PATCH bpf-next v2 09/11] bpf: Synch uapi bpf.h to tools/ Martin KaFai Lau
2019-12-21  6:26 ` [PATCH bpf-next v2 10/11] bpf: libbpf: Add STRUCT_OPS support Martin KaFai Lau
2019-12-23 19:54   ` Andrii Nakryiko
2019-12-26 22:47     ` Martin Lau
2019-12-21  6:26 ` [PATCH bpf-next v2 11/11] bpf: Add bpf_dctcp example Martin KaFai Lau
2019-12-23 23:26   ` Andrii Nakryiko
2019-12-24  1:31     ` Martin Lau
2019-12-24  7:01       ` Andrii Nakryiko
2019-12-24  7:32         ` Martin Lau
2019-12-24 16:50         ` Martin Lau
2019-12-26 19:02           ` Andrii Nakryiko
2019-12-26 20:25             ` Martin Lau
2019-12-26 20:48               ` Andrii Nakryiko
2019-12-26 22:20                 ` Martin Lau
2019-12-26 22:25                   ` Andrii Nakryiko
