From: Kui-Feng Lee <sinquersw@gmail.com>
To: Kui-Feng Lee <kuifeng@meta.com>,
	bpf@vger.kernel.org, ast@kernel.org, martin.lau@linux.dev,
	song@kernel.org, kernel-team@meta.com, andrii@kernel.org,
	sdf@google.com
Subject: Re: [PATCH bpf-next v6 0/8] Transit between BPF TCP congestion controls.
Date: Fri, 10 Mar 2023 08:28:59 -0800
Message-ID: <7e0b5974-0518-fe8d-0485-a8b2b73059cb@gmail.com>
In-Reply-To: <20230310043812.3087672-1-kuifeng@meta.com>



On 3/9/23 20:38, Kui-Feng Lee wrote:
> Major changes:
> 
>   - Create bpf_links in the kernel for BPF struct_ops maps to register
>     and unregister them.
> 
>   - Enable instant switching between implementations of bpf-tcp-cc
>     under one name by replacing the backing struct_ops map of a
>     bpf_link.
> 
> Previously, a BPF struct_ops did not go away when the user program
> that created it terminated, even though it was never pinned.
> For instance, the TCP congestion control subsystem indirectly
> maintains a reference count on the struct_ops of any registered
> BPF-implemented algorithm. Thus, the algorithm won't be deactivated
> until someone deliberately unregisters it.  For consistency with other
> BPF programs, bpf_links now work together with struct_ops maps: a
> map is registered when its bpf_link is created and unregistered when
> the bpf_link ends.
> 
> We also faced complications when attempting to replace an existing TCP
> congestion control algorithm with a new implementation on the fly. A
> struct_ops map was used to register a TCP congestion control algorithm
> with a unique name.  We had to either register the alternative
> implementation under a new name and move over to it, or unregister the
> current one before we could reregister under the same name.  To fix
> this problem, we add an option to migrate the registration of the
> algorithm from struct_ops maps to bpf_links. By modifying the backing
> map of a bpf_link, it becomes possible to replace an existing TCP
> congestion control algorithm with ease.
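
To make the idea above concrete, here is a toy user-space model of a
link that owns a named registration and can swap its backing
implementation in place. It is only a sketch: it is not kernel or
libbpf code, and every identifier (toy_cc_ops, toy_link,
toy_link_attach(), toy_link_update(), toy_link_detach()) is
hypothetical. Requiring the replacement to carry the same name is an
assumption of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a struct_ops map: one implementation of an
 * algorithm, registered under `name`. */
struct toy_cc_ops {
	const char *name;
	int (*ssthresh)(int cwnd);
};

/* Hypothetical stand-in for a bpf_link: it owns the registration, and
 * `ops` plays the role of its backing map. */
struct toy_link {
	const struct toy_cc_ops *ops;	/* NULL while detached */
};

#define TOY_MAX_LINKS 8
static struct toy_link *toy_registry[TOY_MAX_LINKS];

/* Return the attached link registered under `name`, or NULL. */
static struct toy_link *toy_find(const char *name)
{
	for (size_t i = 0; i < TOY_MAX_LINKS; i++)
		if (toy_registry[i] && !strcmp(toy_registry[i]->ops->name, name))
			return toy_registry[i];
	return NULL;
}

/* Register `ops` for the lifetime of `link`; names must be unique. */
static int toy_link_attach(struct toy_link *link, const struct toy_cc_ops *ops)
{
	assert(link && ops && ops->name);
	if (link->ops || toy_find(ops->name))
		return -1;	/* link busy, or name already taken */
	for (size_t i = 0; i < TOY_MAX_LINKS; i++) {
		if (!toy_registry[i]) {
			link->ops = ops;
			toy_registry[i] = link;
			return 0;
		}
	}
	return -1;	/* registry full */
}

/* Swap the backing ops in place; the name stays registered throughout.
 * Requiring an identical name is an assumption of this toy model. */
static int toy_link_update(struct toy_link *link, const struct toy_cc_ops *new_ops)
{
	if (!link->ops || strcmp(link->ops->name, new_ops->name))
		return -1;
	link->ops = new_ops;
	return 0;
}

/* End of the link: the registration goes away with it. */
static void toy_link_detach(struct toy_link *link)
{
	for (size_t i = 0; i < TOY_MAX_LINKS; i++)
		if (toy_registry[i] == link)
			toy_registry[i] = NULL;
	link->ops = NULL;
}

/* Two illustrative implementations sharing one name. */
static int toy_halve(int cwnd) { return cwnd / 2; }
static int toy_three_quarters(int cwnd) { return cwnd * 3 / 4; }

static const struct toy_cc_ops toy_impl_a = { "toy_cc", toy_halve };
static const struct toy_cc_ops toy_impl_b = { "toy_cc", toy_three_quarters };
```

In this model, attaching toy_impl_b through a second link while
toy_impl_a is registered fails because the name is taken, whereas
toy_link_update() on the existing link replaces the implementation
without the name ever being unregistered.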

The major differences from v5:

  - Add a new step to bpf_object__load() to prepare vdata.

  - Accept BPF_F_REPLACE.

  - Check section IDs in find_struct_ops_map_by_offset().

  - Add a test case to check mixing w/ & w/o link struct_ops.

  - Add a test case of using struct_ops w/o link to update a link.

  - Improve bpf_link__detach_struct_ops() to handle the w/ link case.


> 
> The major differences from v4:
> 
>   - Rebase.
> 
>   - Reorder patches and merge part 4 to part 2 of the v4.
> 
> The major differences from v3:
> 
>   - Remove bpf_struct_ops_map_free_rcu(), and use synchronize_rcu().
> 
>   - Improve the commit log of the part 1.
> 
>   - Before transitioning to the READY state, we conduct a value check
>     to ensure that struct_ops can be successfully utilized and links
>     created later.
> 
> The major differences from v2:
> 
>   - Simplify states
> 
>     - Remove TOBEUNREG.
> 
>     - Rename UNREG to READY.
> 
>   - Stop using the refcnt of the kvalue of a struct_ops. Explicitly
>     increase and decrease the refcount of struct_ops.
> 
>   - Prepare kernel vdata during the load phase of libbpf.
> 
> The major differences from v1:
> 
>   - Added bpf_struct_ops_link to replace the previous union-based
>     approach.
> 
>   - Added UNREG and TOBEUNREG to the state of bpf_struct_ops_map.
> 
>     - bpf_struct_ops_transit_state() maintains state transitions.
> 
>   - Fixed synchronization issue.
> 
>   - Prepare kernel vdata of struct_ops during the loading phase of
>     bpf_object.
> 
>   - Merged previous patch 3 to patch 1.
> 
v5: https://lore.kernel.org/all/20230308005050.255859-1-kuifeng@meta.com/
> v4: https://lore.kernel.org/all/20230307232913.576893-1-andrii@kernel.org/
> v3: https://lore.kernel.org/all/20230303012122.852654-1-kuifeng@meta.com/
> v2: https://lore.kernel.org/bpf/20230223011238.12313-1-kuifeng@meta.com/
> v1: https://lore.kernel.org/bpf/20230214221718.503964-1-kuifeng@meta.com/
> 
> Kui-Feng Lee (8):
>    bpf: Retire the struct_ops map kvalue->refcnt.
>    net: Update an existing TCP congestion control algorithm.
>    bpf: Create links for BPF struct_ops maps.
>    libbpf: Create a bpf_link in bpf_map__attach_struct_ops().
>    bpf: Update the struct_ops of a bpf_link.
>    libbpf: Update a bpf_link with another struct_ops.
>    libbpf: Use .struct_ops.link section to indicate a struct_ops with a
>      link.
>    selftests/bpf: Test switching TCP Congestion Control algorithms.
> 
>   include/linux/bpf.h                           |  10 +
>   include/net/tcp.h                             |   3 +
>   include/uapi/linux/bpf.h                      |  20 +-
>   kernel/bpf/bpf_struct_ops.c                   | 229 +++++++++++++++---
>   kernel/bpf/syscall.c                          |  49 +++-
>   net/bpf/bpf_dummy_struct_ops.c                |   6 +
>   net/ipv4/bpf_tcp_ca.c                         |  14 +-
>   net/ipv4/tcp_cong.c                           |  60 ++++-
>   tools/include/uapi/linux/bpf.h                |  20 +-
>   tools/lib/bpf/libbpf.c                        | 180 +++++++++++---
>   tools/lib/bpf/libbpf.h                        |   1 +
>   tools/lib/bpf/libbpf.map                      |   1 +
>   .../selftests/bpf/prog_tests/bpf_tcp_ca.c     |  91 +++++++
>   .../selftests/bpf/progs/tcp_ca_update.c       |  80 ++++++
>   14 files changed, 671 insertions(+), 93 deletions(-)
>   create mode 100644 tools/testing/selftests/bpf/progs/tcp_ca_update.c
> 
