All of lore.kernel.org
 help / color / mirror / Atom feed
From: Cong Wang <xiyou.wangcong@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>, kernel-team <kernel-team@fb.com>
Subject: Re: [PATCH v2 bpf-next 1/3] bpf: Introduce bpf_timer
Date: Mon, 14 Jun 2021 23:10:46 -0700	[thread overview]
Message-ID: <CAM_iQpV2fv=MMhf3w+YpGDXCYaMKVO_hoACL0=oXmn_pDUVexg@mail.gmail.com> (raw)
In-Reply-To: <20210611184516.tpjvlaxjc4zdeqe6@ast-mbp.dhcp.thefacebook.com>

On Fri, Jun 11, 2021 at 11:45 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Jun 10, 2021 at 11:42:24PM -0700, Cong Wang wrote:
> > On Thu, Jun 10, 2021 at 9:27 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > From: Alexei Starovoitov <ast@kernel.org>
>
> Please stick to one email thread in the future, ok?
>
> I'll consolidate them here:
>
> > What is your use case to justify your own code? Asking because
> > you deny mine, so clearly my use case is not yours.
>
> I mentioned several use cases in the prior threads.
> tldr: any periodic event in tracing, networking, security.
> Garbage collection falls into this category as well, but please internalize
> that implementing conntrack as it is today in the kernel is an explicit non-goal.

You need to read my use case again, it is for the conntrack
in Cilium, not the kernel one.

>
> > And more importantly, why not just use BPF_TEST_RUN with
> > a user-space timer? Asking because you offer no API to read or
> > modify timer expiration, so literally the same with BPF_TEST_RUN
> > approach.
>
> a wrapper on top of hrtimer_get_remaining() like bpf_timer_get_remaining()
> is trivial to add, but what is the use case?

If you do not have any use case, then stick to BPF_TEST_RUN
with user-space timers? And of course your patches are not needed
at all.

>
> > >
> > > Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded
> > > in hash/array/lru maps as regular field and helpers to operate on it:
> >
> > Can be or has to be? Huge difference here.
>
> map elements don't have to use timers.

You interpret this in a wrong way, what I asked is whether a bpf timer has
to be embedded in a map. IOW, can a bpf timer be a standalone global
data?

>
> > In the other thread, you said it is global data, which implies that it does
> > not have to be in a map.
>
> global data is a map. That was explained in the prior thread as well.
>

I think you implied bpf timer can exist without a map, hence I am asking.


> >
> > In your test case or your example, all timers are still in a map. So what has
> > changed since then? Looks nothing to me.
>
> look again?

Yes, I just looked at it again, only more confusing, not less.

>
> > Hmm, finally you begin refcounting it, which you were strongly against. ;)
>
> That was already answered in the prior thread.
> tldr: there were two options. This is one of them. Another can be added
> in the future as well.
>
> > Three questions:
> >
> > 1. Can t->prog be freed between bpf_timer_init() and bpf_timer_start()?
>
> yes.

Good. So if a program which only initializes the timer and then exits,
the other program which shares this timer will crash when it calls
bpf_timer_start(), right?

>
> > If the timer subprog is always in the same prog which installs it, then
>
> installs it? I'm not following the quesiton.
>
> > this is fine. But then how do multiple programs share a timer?
>
> there is only one callback function.

That's exactly my question. How is one callback function shared
by multiple eBPF programs which want to share the timer?


>
> > In the
> > case of conntrack, either ingress or egress could install the timer,
> > it only depends which one gets traffic first. Do they have to copy
> > the same subprog for the same timer?
>
> conntrack is an explicit non-goal.

I interpret this as you do not want timers to be shared by multiple
eBPF programs, correct? Weirdly, maps are shared, timers are
placed in a map, so timers should be shared naturally too.

>
> >
> > 2. Can t->prog be freed between a timer expiration and bpf_timer_start()
> > again?
>
> If it's already armed with the first bpf_timer_start() it won't be freed.

Why? I see t->prog is released in your timer callback:

+       bpf_prog_put(prog);
+       this_cpu_write(hrtimer_running, NULL);
+       return HRTIMER_NORESTART;

>
> > It gets a refcnt when starting a timer and puts it when cancelling
> > or expired, so t->prog can be freed right after cancelling or expired. What
> > if another program which shares this timer wants to restart this timer?
>
> There is only one callback_fn per timer. Another program can share
> the struct bpf_timer and the map. It might have subprog callback_fn code
> that looks exactly the same as callback_fn in the first prog.
> For example when libbpf loads progs/timer.c (after it was compiled into .o)
> it might share a subprog in the future (when kernel has support for
> dynamic linking). From bpf user pov it's a single .c file.
> The split into programs and subprograms is an implemenation detail
> that C programmer doesn't need to worry about.

Not exactly, they share a same C file but still can be loaded/unloaded
separately. And logically, whether a timer has been initialized once or
twice makes a huge difference for programers.

>
> > 3. Since you offer no API to read the expiration time, why not just use
> > BPF_TEST_RUN with a user-space timer? This is preferred by Andrii.
>
> Andrii point was that there should be no syscall cmds that replicate
> bpf_timer_init/start/cancel helpers. I agree with this.

Actually there is no strong reason to bother a bpf timer unless you
want to access the timer itself, which mostly contains expiration.
User-space timers work just fine for your cases, even if not, extending
BPF_TEST_RUN should too.

>
>
> > Thanks.
> >
> > Another unpopular point of view:
> >
> > This init() is not suitable for bpf programs, because unlike kernel modules,
> > there is no init or exit functions for a bpf program. And timer init
> > is typically
> > called during module init.
>
> Already answerd this in the prior thread. There will be __init and __fini like
> subprograms in bpf progs.

I interpret this as init does not make sense until we have __init and __fini?

>
> Please apply the patches to your local tree and do few experiments based
> on selftests/bpf/progs/timer.c. I think experimenting with the code
> will answer all of your questions.

Sounds like you find a great excuse for a failure of documentation.
What I asked are just fundamental design questions you should have
covered in your cover letter, which is literally empty.

Thanks.

  reply	other threads:[~2021-06-15  6:11 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-11  4:24 [PATCH v2 bpf-next 0/3] bpf: Introduce BPF timers Alexei Starovoitov
2021-06-11  4:24 ` [PATCH v2 bpf-next 1/3] bpf: Introduce bpf_timer Alexei Starovoitov
2021-06-11  6:42   ` Cong Wang
2021-06-11 18:45     ` Alexei Starovoitov
2021-06-15  6:10       ` Cong Wang [this message]
2021-06-16  4:53         ` Alexei Starovoitov
2021-06-11  7:05   ` Cong Wang
2021-06-11 22:12   ` Yonghong Song
2021-06-15  3:33     ` Alexei Starovoitov
2021-06-15  4:21       ` Yonghong Song
2021-06-14 16:51   ` Yonghong Song
2021-06-15  3:29     ` Alexei Starovoitov
2021-06-15  5:31       ` Andrii Nakryiko
2021-06-15  5:40         ` Alexei Starovoitov
2021-06-15 15:24           ` Andrii Nakryiko
2021-06-16  4:26             ` Alexei Starovoitov
2021-06-16  5:54               ` Andrii Nakryiko
2021-06-16 16:52                 ` Alexei Starovoitov
2021-06-15  4:48   ` Andrii Nakryiko
2021-06-11  4:24 ` [PATCH v2 bpf-next 2/3] bpf: Add verifier checks for bpf_timer Alexei Starovoitov
2021-06-11  4:24 ` [PATCH v2 bpf-next 3/3] selftests/bpf: Add bpf_timer test Alexei Starovoitov
2021-06-11  6:47 ` [PATCH v2 bpf-next 0/3] bpf: Introduce BPF timers Cong Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAM_iQpV2fv=MMhf3w+YpGDXCYaMKVO_hoACL0=oXmn_pDUVexg@mail.gmail.com' \
    --to=xiyou.wangcong@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.