From: Martin KaFai Lau <kafai@fb.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Eric Dumazet <edumazet@google.com>, <kernel-team@fb.com>,
	Neal Cardwell <ncardwell@google.com>, <netdev@vger.kernel.org>,
	Yonghong Song <yhs@fb.com>, Yuchung Cheng <ycheng@google.com>
Subject: Re: [PATCH v2 bpf-next 0/8] bpf: Allow bpf tcp iter to do bpf_(get|set)sockopt
Date: Thu, 22 Jul 2021 14:01:44 -0700
Message-ID: <20210722210144.qrnpycup4rmejvnx@kafai-mbp.dhcp.thefacebook.com>
In-Reply-To: <d5ffdaf5-08e5-2b28-d891-73d507bae5fa@gmail.com>

On Thu, Jul 22, 2021 at 03:25:39PM +0200, Eric Dumazet wrote:
> 
> 
> On 7/1/21 10:05 PM, Martin KaFai Lau wrote:
> > This set is to allow bpf tcp iter to call bpf_(get|set)sockopt.
> > 
> > With bpf-tcp-cc, new algo rollout happens more often.  Instead of
> > restarting the applications to pick up the new tcp-cc, this set
> > allows the bpf tcp iter to call bpf_(get|set)sockopt(TCP_CONGESTION).
> > It is not limited to TCP_CONGESTION, the bpf tcp iter can call
> > bpf_(get|set)sockopt() with other options.  The bpf tcp iter can read
> > into all the fields of a tcp_sock, so there is a lot of flexibility
> > to select the desired sk to do setsockopt(), e.g. it can test for
> > TCP_LISTEN only and leave the established connections untouched,
> > or check the addr/port, or check the current tcp-cc name, ...etc.
> > 
> > Patch 1-4 are some cleanup and prep work in the tcp and bpf seq_file.
> > 
> > Patch 5 is to have the tcp seq_file iterate on the
> > port+addr lhash2 instead of the port only listening_hash.
> > 
> > Patch 6 is to have the bpf tcp iter doing batching which
> > then allows lock_sock.  lock_sock is needed for setsockopt.
> > 
> > Patch 7 allows the bpf tcp iter to call bpf_(get|set)sockopt.
> > 
> > v2:
> > - Use __GFP_NOWARN in patch 6
> > - Add bpf_getsockopt() in patch 7 to give a symmetrical user experience.
> >   selftest in patch 8 is changed to also cover bpf_getsockopt().
> > - Remove CAP_NET_ADMIN check in patch 7. Tracing bpf prog has already
> >   required CAP_SYS_ADMIN or CAP_PERFMON.
> > - Move some def macros to bpf_tracing_net.h in patch 8
> > 
> > Martin KaFai Lau (8):
> >   tcp: seq_file: Avoid skipping sk during tcp_seek_last_pos
> >   tcp: seq_file: Refactor net and family matching
> >   bpf: tcp: seq_file: Remove bpf_seq_afinfo from tcp_iter_state
> >   tcp: seq_file: Add listening_get_first()
> >   tcp: seq_file: Replace listening_hash with lhash2
> >   bpf: tcp: bpf iter batching and lock_sock
> >   bpf: tcp: Support bpf_(get|set)sockopt in bpf tcp iter
> >   bpf: selftest: Test batching and bpf_(get|set)sockopt in bpf tcp iter
> 
> For the whole series :
> 
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> 
> Sorry for the delay.
> 
> BTW, it seems weird for new BPF features to use /proc/net "legacy"
> infrastructure and update it.

bpf iter uses seq_file, so the initial bpf_iter_tcp reuses most
of the pieces from /proc/net/tcp.

This set refactored a few things so that bpf_iter_tcp shares only
the legacy tcp_seek_last_pos(), so the dependency on
/proc/net/tcp should shrink going forward.

A similar modification could also be done to bpf_iter_udp in the future.

Thanks for the review!
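
For readers less familiar with the option discussed in the cover letter
quoted above, here is a rough userspace analogue of what the bpf tcp iter
does per socket: check that the sk is a listener, check its current tcp-cc
name, and only then switch TCP_CONGESTION.  This is a sketch under stated
assumptions, not the code from this series: the helper names is_listener(),
get_cc() and switch_listener_cc() are made up for illustration, and it uses
the plain getsockopt()/setsockopt() syscalls on a single fd, whereas the
bpf tcp iter calls bpf_(get|set)sockopt() on each sk it visits.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Assumption: fall back to the kernel's 16-byte name limit if the libc
 * header does not provide TCP_CA_NAME_MAX. */
#ifndef TCP_CA_NAME_MAX
#define TCP_CA_NAME_MAX 16
#endif

/* Hypothetical helper: 1 if fd is a listening socket, 0 if not, -1 on error. */
static int is_listener(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, SOL_SOCKET, SO_ACCEPTCONN, &val, &len) < 0)
		return -1;
	return val != 0;
}

/* Hypothetical helper: read the current tcp-cc name into a zeroed buffer. */
static int get_cc(int fd, char *name, socklen_t size)
{
	socklen_t len = size - 1;	/* keep the last byte as a NUL terminator */

	memset(name, 0, size);
	return getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name, &len);
}

/*
 * Hypothetical helper mirroring the selection described in the cover
 * letter: only touch listeners whose current tcp-cc is old_cc.
 * Returns 1 if the tcp-cc was switched, 0 if the socket was skipped,
 * -1 on error.
 */
static int switch_listener_cc(int fd, const char *old_cc, const char *new_cc)
{
	char cur[TCP_CA_NAME_MAX];
	int listening = is_listener(fd);

	if (listening < 0)
		return -1;
	if (!listening)
		return 0;	/* leave non-listening sockets untouched */
	if (get_cc(fd, cur, sizeof(cur)) < 0)
		return -1;
	if (strcmp(cur, old_cc) != 0)
		return 0;	/* some other tcp-cc is in use, skip */
	if (setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
		       new_cc, strlen(new_cc)) < 0)
		return -1;
	return 1;
}
```

An application would have to call something like
switch_listener_cc(fd, "cubic", "reno") on each of its own listening fds,
which is the per-application work the set avoids by doing the same
selection from the iter.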

