From: Cong Wang <xiyou.wangcong@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Miller <davem@davemloft.net>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Linux Kernel Network Developers <netdev@vger.kernel.org>,
bpf <bpf@vger.kernel.org>, kernel-team <kernel-team@fb.com>
Subject: Re: [PATCH v2 bpf-next 1/3] bpf: Introduce bpf_timer
Date: Mon, 14 Jun 2021 23:10:46 -0700 [thread overview]
Message-ID: <CAM_iQpV2fv=MMhf3w+YpGDXCYaMKVO_hoACL0=oXmn_pDUVexg@mail.gmail.com> (raw)
In-Reply-To: <20210611184516.tpjvlaxjc4zdeqe6@ast-mbp.dhcp.thefacebook.com>
On Fri, Jun 11, 2021 at 11:45 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Jun 10, 2021 at 11:42:24PM -0700, Cong Wang wrote:
> > On Thu, Jun 10, 2021 at 9:27 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > From: Alexei Starovoitov <ast@kernel.org>
>
> Please stick to one email thread in the future, ok?
>
> I'll consolidate them here:
>
> > What is your use case to justify your own code? Asking because
> > you deny mine, so clearly my use case is not yours.
>
> I mentioned several use cases in the prior threads.
> tldr: any periodic event in tracing, networking, security.
> Garbage collection falls into this category as well, but please internalize
> that implementing conntrack as it is today in the kernel is an explicit non-goal.
You need to read my use case again, it is for the conntrack
in Cilium, not the kernel one.
>
> > And more importantly, why not just use BPF_TEST_RUN with
> > a user-space timer? Asking because you offer no API to read or
> > modify timer expiration, so literally the same with BPF_TEST_RUN
> > approach.
>
> a wrapper on top of hrtimer_get_remaining() like bpf_timer_get_remaining()
> is trivial to add, but what is the use case?
If you do not have any use case, then stick to BPF_TEST_RUN
with user-space timers? And of course your patches are not needed
at all.
>
> > >
> > > Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded
> > > in hash/array/lru maps as regular field and helpers to operate on it:
> >
> > Can be or has to be? Huge difference here.
>
> map elements don't have to use timers.
You interpret this in a wrong way, what I asked is whether a bpf timer has
to be embedded in a map. IOW, can a bpf timer be a standalone global
data?
>
> > In the other thread, you said it is global data, which implies that it does
> > not have to be in a map.
>
> global data is a map. That was explained in the prior thread as well.
>
I think you implied bpf timer can exist without a map, hence I am asking.
> >
> > In your test case or your example, all timers are still in a map. So what has
> > changed since then? Looks nothing to me.
>
> look again?
Yes, I just looked at it again, only more confusing, not less.
>
> > Hmm, finally you begin refcounting it, which you were strongly against. ;)
>
> That was already answered in the prior thread.
> tldr: there were two options. This is one of them. Another can be added
> in the future as well.
>
> > Three questions:
> >
> > 1. Can t->prog be freed between bpf_timer_init() and bpf_timer_start()?
>
> yes.
Good. So if a program which only initializes the timer and then exits,
the other program which shares this timer will crash when it calls
bpf_timer_start(), right?
>
> > If the timer subprog is always in the same prog which installs it, then
>
> installs it? I'm not following the quesiton.
>
> > this is fine. But then how do multiple programs share a timer?
>
> there is only one callback function.
That's exactly my question. How is one callback function shared
by multiple eBPF programs which want to share the timer?
>
> > In the
> > case of conntrack, either ingress or egress could install the timer,
> > it only depends which one gets traffic first. Do they have to copy
> > the same subprog for the same timer?
>
> conntrack is an explicit non-goal.
I interpret this as you do not want timers to be shared by multiple
eBPF programs, correct? Weirdly, maps are shared, timers are
placed in a map, so timers should be shared naturally too.
>
> >
> > 2. Can t->prog be freed between a timer expiration and bpf_timer_start()
> > again?
>
> If it's already armed with the first bpf_timer_start() it won't be freed.
Why? I see t->prog is released in your timer callback:
+ bpf_prog_put(prog);
+ this_cpu_write(hrtimer_running, NULL);
+ return HRTIMER_NORESTART;
>
> > It gets a refcnt when starting a timer and puts it when cancelling
> > or expired, so t->prog can be freed right after cancelling or expired. What
> > if another program which shares this timer wants to restart this timer?
>
> There is only one callback_fn per timer. Another program can share
> the struct bpf_timer and the map. It might have subprog callback_fn code
> that looks exactly the same as callback_fn in the first prog.
> For example when libbpf loads progs/timer.c (after it was compiled into .o)
> it might share a subprog in the future (when kernel has support for
> dynamic linking). From bpf user pov it's a single .c file.
> The split into programs and subprograms is an implemenation detail
> that C programmer doesn't need to worry about.
Not exactly, they share a same C file but still can be loaded/unloaded
separately. And logically, whether a timer has been initialized once or
twice makes a huge difference for programers.
>
> > 3. Since you offer no API to read the expiration time, why not just use
> > BPF_TEST_RUN with a user-space timer? This is preferred by Andrii.
>
> Andrii point was that there should be no syscall cmds that replicate
> bpf_timer_init/start/cancel helpers. I agree with this.
Actually there is no strong reason to bother a bpf timer unless you
want to access the timer itself, which mostly contains expiration.
User-space timers work just fine for your cases, even if not, extending
BPF_TEST_RUN should too.
>
>
> > Thanks.
> >
> > Another unpopular point of view:
> >
> > This init() is not suitable for bpf programs, because unlike kernel modules,
> > there is no init or exit functions for a bpf program. And timer init
> > is typically
> > called during module init.
>
> Already answerd this in the prior thread. There will be __init and __fini like
> subprograms in bpf progs.
I interpret this as init does not make sense until we have __init and __fini?
>
> Please apply the patches to your local tree and do few experiments based
> on selftests/bpf/progs/timer.c. I think experimenting with the code
> will answer all of your questions.
Sounds like you find a great excuse for a failure of documentation.
What I asked are just fundamental design questions you should have
covered in your cover letter, which is literally empty.
Thanks.
next prev parent reply other threads:[~2021-06-15 6:11 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-11 4:24 [PATCH v2 bpf-next 0/3] bpf: Introduce BPF timers Alexei Starovoitov
2021-06-11 4:24 ` [PATCH v2 bpf-next 1/3] bpf: Introduce bpf_timer Alexei Starovoitov
2021-06-11 6:42 ` Cong Wang
2021-06-11 18:45 ` Alexei Starovoitov
2021-06-15 6:10 ` Cong Wang [this message]
2021-06-16 4:53 ` Alexei Starovoitov
2021-06-11 7:05 ` Cong Wang
2021-06-11 22:12 ` Yonghong Song
2021-06-15 3:33 ` Alexei Starovoitov
2021-06-15 4:21 ` Yonghong Song
2021-06-14 16:51 ` Yonghong Song
2021-06-15 3:29 ` Alexei Starovoitov
2021-06-15 5:31 ` Andrii Nakryiko
2021-06-15 5:40 ` Alexei Starovoitov
2021-06-15 15:24 ` Andrii Nakryiko
2021-06-16 4:26 ` Alexei Starovoitov
2021-06-16 5:54 ` Andrii Nakryiko
2021-06-16 16:52 ` Alexei Starovoitov
2021-06-15 4:48 ` Andrii Nakryiko
2021-06-11 4:24 ` [PATCH v2 bpf-next 2/3] bpf: Add verifier checks for bpf_timer Alexei Starovoitov
2021-06-11 4:24 ` [PATCH v2 bpf-next 3/3] selftests/bpf: Add bpf_timer test Alexei Starovoitov
2021-06-11 6:47 ` [PATCH v2 bpf-next 0/3] bpf: Introduce BPF timers Cong Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAM_iQpV2fv=MMhf3w+YpGDXCYaMKVO_hoACL0=oXmn_pDUVexg@mail.gmail.com' \
--to=xiyou.wangcong@gmail.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=kernel-team@fb.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).