bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	Lorenz Bauer <lmb@cloudflare.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>, kernel-team <kernel-team@fb.com>
Subject: Re: [RFC PATCH bpf-next] bpf: Introduce bpf_timer
Date: Tue, 1 Jun 2021 19:00:30 -0700	[thread overview]
Message-ID: <20210602020030.igrx5jp45tocekvy@ast-mbp.dhcp.thefacebook.com> (raw)
In-Reply-To: <CAM_iQpXUBuOirztj3kifdFpvygKb-aoqwuXKkLdG9VFte5nynA@mail.gmail.com>

On Sat, May 29, 2021 at 11:36:08PM -0700, Cong Wang wrote:
> On Tue, May 25, 2021 at 11:21 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, May 24, 2021 at 9:59 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > >
> > > On Mon, May 24, 2021 at 8:16 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > > >
> > > > On Sun, May 23, 2021 at 9:01 AM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Fri, May 21, 2021 at 2:37 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > > > > >
> > > > > > Hi, Alexei
> > > > > >
> > > > > > On Thu, May 20, 2021 at 11:52 PM Alexei Starovoitov
> > > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > > >
> > > > > > > Introduce 'struct bpf_timer' that can be embedded in most BPF map types
> > > > > > > and helpers to operate on it:
> > > > > > > long bpf_timer_init(struct bpf_timer *timer, void *callback, int flags)
> > > > > > > long bpf_timer_mod(struct bpf_timer *timer, u64 msecs)
> > > > > > > long bpf_timer_del(struct bpf_timer *timer)
> > > > > >
> > > > > > Like we discussed, this approach would make the timer harder
> > > > > > to be independent of other eBPF programs, which is a must-have
> > > > > > for both of our use cases (mine and Jamal's). Like you explained,
> > > > > > this requires at least another program array, a tail call, a mandatory
> > > > > > prog pinning to work.
> > > > >
> > > > > That is simply not true.
> > > >
> > > > Which part is not true? The above is what I got from your explanation.
> > >
> > > I tried to write some code sketches to use your timer to implement
> > > our conntrack logic, below shows how difficult it is to use,
> >
> > Was it difficult because you've used tail_call and over complicated
> > the progs for no good reason?
> 
> Using tail call is what I got from you, here is the quote:
> 
> "Sure. That's trivially achieved with pinning.
> One can have an ingress prog that tailcalls into another prog
> that arms the timer with one of its subprogs.
> Egress prog can tailcall into the same prog as well.
> The ingress and egress progs can be replaced one by one
> or removed both together and middle prog can stay alive
> if it's pinned in bpffs or held alive by FD."

That was in the context of doing auto-cancel of timers.
There is only one choice to make. Either auto-cancel or not.
That quote was during the time when auto-cancel felt as it would fit
the FD model better.
We auto-detach on close(link_fd) and auto-unload on close(prog_fd).
The armed timer would prevent that and that promise felt
necessary to keep. But disappearing timer is a bigger surprise
to users than not auto-unloading progs.
Hence this patch is doing prog_refcnt++ in bpf_timer_start.
Please see other emails threads in v1 patch set.

> >
> > tail_calls are unnecessary. Just call the funcs directly.
> > All lookups and maps are unnecessary as well.
> > Looks like a single global timer will be enough for this use case.
> 
> Hmm? With your design, a timer has to be embedded into a map
> value, you said this is to mimic bpf spinlock.

The global data is a map.
When spin_lock was introduced there was no global data concept.

> >
> > In general the garbage collection in any form doesn't scale.
> > The conntrack logic doesn't need it. The cillium conntrack is a great
> > example of how to implement a conntrack without GC.
> 
> That is simply not a conntrack. We expire connections based on
> its time, not based on the size of the map where it residents.

Sounds like your goal is to replicate existing kernel conntrack
as bpf program by doing exactly the same algorithm and repeating
the same mistakes. Then add kernel conntrack functions to allow list
of kfuncs (unstable helpers) and call them from your bpf progs.

  reply	other threads:[~2021-06-02  2:00 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-20 18:55 [RFC PATCH bpf-next] bpf: Introduce bpf_timer Alexei Starovoitov
2021-05-21 14:38 ` Alexei Starovoitov
2021-05-21 21:37 ` Cong Wang
2021-05-23 16:01   ` Alexei Starovoitov
2021-05-24  8:45     ` Lorenz Bauer
2021-05-25  3:16     ` Cong Wang
2021-05-25  4:59       ` Cong Wang
2021-05-25 18:21         ` Alexei Starovoitov
2021-05-25 19:35           ` Jamal Hadi Salim
2021-05-25 19:57             ` Alexei Starovoitov
2021-05-25 21:09               ` Jamal Hadi Salim
2021-05-25 22:08                 ` Alexei Starovoitov
2021-05-26 15:34                   ` Jamal Hadi Salim
2021-05-26 16:58                     ` Alexei Starovoitov
2021-05-26 18:25                       ` Jamal Hadi Salim
2021-05-30  6:36           ` Cong Wang
2021-06-02  2:00             ` Alexei Starovoitov [this message]
2021-06-02  8:48               ` Toke Høiland-Jørgensen
2021-06-02 17:54                 ` Martin KaFai Lau
2021-06-02 18:13                   ` Kumar Kartikeya Dwivedi
2021-06-02 18:26                     ` Alexei Starovoitov
2021-06-02 18:30                       ` Kumar Kartikeya Dwivedi
2021-06-02 18:46                     ` John Fastabend
2021-05-23 11:48 ` Toke Høiland-Jørgensen
2021-05-23 15:58   ` Alexei Starovoitov
2021-05-24  8:42     ` Lorenz Bauer
2021-05-24 14:48       ` Alexei Starovoitov
2021-05-24 17:33     ` Alexei Starovoitov
2021-05-24 18:39       ` Toke Høiland-Jørgensen
2021-05-24 18:38     ` Toke Høiland-Jørgensen
2021-05-24 11:49 ` Lorenz Bauer
2021-05-24 14:56   ` Alexei Starovoitov
2021-05-24 19:13     ` Andrii Nakryiko
2021-05-25  5:22       ` Cong Wang
2021-05-25 19:47         ` Andrii Nakryiko
  -- strict thread matches above, loose matches on Subject: below --
2021-04-01  4:26 [RFC Patch bpf-next] bpf: introduce bpf timer Cong Wang
2021-04-01  6:38 ` Song Liu
2021-04-01 17:28   ` Cong Wang
2021-04-01 20:17     ` Song Liu
2021-04-02 17:34       ` Cong Wang
2021-04-02 17:57         ` Song Liu
2021-04-02 19:08           ` Cong Wang
2021-04-02 19:43             ` Song Liu
2021-04-02 20:57               ` Cong Wang
2021-04-02 23:31                 ` Song Liu
2021-04-05 23:49                   ` Cong Wang
2021-04-06  1:07                     ` Song Liu
2021-04-06  1:24                       ` Cong Wang
2021-04-06  6:17                         ` Song Liu
2021-04-06 16:48                           ` Cong Wang
2021-04-06 23:36                             ` Song Liu
2021-04-08 22:45                               ` Cong Wang
2021-04-02 19:28 ` Alexei Starovoitov
2021-04-02 21:24   ` Cong Wang
2021-04-02 23:45     ` Alexei Starovoitov
2021-04-06  0:36       ` Cong Wang
2021-04-12 23:01         ` Alexei Starovoitov
2021-04-15  4:02           ` Cong Wang
2021-04-15  4:25             ` Alexei Starovoitov
2021-04-15 15:51               ` Cong Wang
2021-04-26 23:00               ` Cong Wang
2021-04-26 23:05                 ` Alexei Starovoitov
2021-04-26 23:37                   ` Cong Wang
2021-04-27  2:01                     ` Alexei Starovoitov
2021-04-27 11:52                       ` Jamal Hadi Salim
2021-04-27 16:36                       ` Cong Wang
2021-04-27 18:33                         ` Alexei Starovoitov
2021-05-09  5:37                           ` Cong Wang
2021-05-10 20:55                             ` Jamal Hadi Salim
2021-05-11 21:29                               ` Cong Wang
2021-05-12 22:56                                 ` Jamal Hadi Salim
2021-05-11  5:05                             ` Joe Stringer
2021-05-11 21:08                               ` Cong Wang
2021-05-12 22:43                               ` Jamal Hadi Salim
2021-05-13 18:45                                 ` Jamal Hadi Salim
2021-05-14  2:53                                   ` Cong Wang
2021-08-11 21:03                                     ` Joe Stringer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210602020030.igrx5jp45tocekvy@ast-mbp.dhcp.thefacebook.com \
    --to=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=john.fastabend@gmail.com \
    --cc=kernel-team@fb.com \
    --cc=lmb@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).