From: Cong Wang <xiyou.wangcong@gmail.com>
To: Song Liu <songliubraving@fb.com>
Cc: "open list:BPF (Safe dynamic programs and tools)"
	<netdev@vger.kernel.org>,
	"open list:BPF (Safe dynamic programs and tools)"
	<bpf@vger.kernel.org>,
	"duanxiongchun@bytedance.com" <duanxiongchun@bytedance.com>,
	"wangdongdong.6@bytedance.com" <wangdongdong.6@bytedance.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Cong Wang <cong.wang@bytedance.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>, Martin Lau <kafai@fb.com>,
	Yonghong Song <yhs@fb.com>
Subject: Re: [RFC Patch bpf-next] bpf: introduce bpf timer
Date: Fri, 2 Apr 2021 12:08:23 -0700
Message-ID: <CAM_iQpXEuxwQvT9FNqDa7y5kNpknA4xMNo_973ncy3iYaF-NTA@mail.gmail.com>
In-Reply-To: <93BBD473-7E1C-4A6E-8BB7-12E63D4799E8@fb.com>

On Fri, Apr 2, 2021 at 10:57 AM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Apr 2, 2021, at 10:34 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >
> > On Thu, Apr 1, 2021 at 1:17 PM Song Liu <songliubraving@fb.com> wrote:
> >>
> >>
> >>
> >>> On Apr 1, 2021, at 10:28 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >>>
> >>> On Wed, Mar 31, 2021 at 11:38 PM Song Liu <songliubraving@fb.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> On Mar 31, 2021, at 9:26 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >>>>>
> >>>>> From: Cong Wang <cong.wang@bytedance.com>
> >>>>>
> >>>>> (This patch is still in early stage and obviously incomplete. I am sending
> >>>>> it out to get some high-level feedbacks. Please kindly ignore any coding
> >>>>> details for now and focus on the design.)
> >>>>
> >>>> Could you please explain the use case of the timer? Is it the same as
> >>>> earlier proposal of BPF_MAP_TYPE_TIMEOUT_HASH?
> >>>>
> >>>> Assuming that is the case, I guess the use case is to assign an expire
> >>>> time for each element in a hash map; and periodically remove expired
> >>>> element from the map.
> >>>>
> >>>> If this is still correct, my next question is: how does this compare
> >>>> against a user space timer? Will the user space timer be too slow?
> >>>
> >>> Yes, as I explained in timeout hashmap patchset, doing it in user-space
> >>> would require a lot of syscalls (without batching) or copying (with batching).
> >>> I will add the explanation here, in case people miss why we need a timer.
> >>
> >> How about we use a user space timer to trigger a BPF program (e.g. use
> >> BPF_PROG_TEST_RUN on a raw_tp program); then, in the BPF program, we can
> >> use bpf_for_each_map_elem and bpf_map_delete_elem to scan and update the
> >> map? With this approach, we only need one syscall per period.
> >
> > Interesting, I didn't know we can explicitly trigger a BPF program running
> > from user-space. Is it for testing purposes only?
>
> This is not only for testing. We will use this in perf (starting in 5.13).
>
> /* currently in Arnaldo's tree, tools/perf/util/bpf_counter.c: */
>
> /* trigger the leader program on a cpu */
> static int bperf_trigger_reading(int prog_fd, int cpu)
> {
>         DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
>                             .ctx_in = NULL,
>                             .ctx_size_in = 0,
>                             .flags = BPF_F_TEST_RUN_ON_CPU,
>                             .cpu = cpu,
>                             .retval = 0,
>                 );
>
>         return bpf_prog_test_run_opts(prog_fd, &opts);
> }
>
> test_run also passes return value (retval) back to user space, so we can
> adjust the timer interval based on retval.

This is really odd: every name here contains "test", but it is not for
testing purposes. You probably need to rename or alias it. ;)

So, with this approach we have to keep a user-space daemon running just to
keep this "timer" alive. If I want to run it every 1ms, I have to issue a
BPF_PROG_TEST_RUN syscall every 1ms. Even with a timerfd, we still need
poll() and timerfd_settime(). This is considerable overhead for just a
single timer.
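
To make the per-tick cost concrete, here is a minimal sketch of what such a
daemon loop could look like, assuming a one-shot timerfd that is re-armed on
every tick so the interval can follow the program's return value. The names
run_periodic(), trigger_fn and count_trigger() are hypothetical and only for
illustration; in a real daemon the trigger would be a call such as
bpf_prog_test_run_opts(), which is one more syscall per tick.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for bpf_prog_test_run_opts(): runs the "GC" once
 * and returns the next period in nanoseconds (modelling retval). */
typedef long (*trigger_fn)(void *arg);

/* Example trigger: counts invocations and keeps a 1 ms period. */
static long count_trigger(void *arg)
{
	++*(int *)arg;
	return 1000000L;
}

/* Run `ticks` periods and return the number of syscalls the steady-state
 * loop needed (counting the trigger as one), or -1 on error. */
static long run_periodic(long period_ns, int ticks, trigger_fn trigger,
			 void *arg)
{
	long syscalls = 0;
	int fd = timerfd_create(CLOCK_MONOTONIC, 0);

	if (fd < 0)
		return -1;

	for (int i = 0; i < ticks; i++) {
		/* A zero it_value would disarm the timer; keep it sub-second. */
		assert(period_ns > 0 && period_ns < 1000000000L);

		/* it_interval stays zero, so this is a one-shot timer. */
		struct itimerspec its = {
			.it_value = { .tv_sec = 0, .tv_nsec = period_ns },
		};
		struct pollfd pfd = { .fd = fd, .events = POLLIN };
		uint64_t expirations;

		if (timerfd_settime(fd, 0, &its, NULL) < 0 ||
		    poll(&pfd, 1, -1) < 0 ||
		    read(fd, &expirations, sizeof(expirations)) !=
			    (ssize_t)sizeof(expirations)) {
			close(fd);
			return -1;
		}

		/* The trigger decides the next interval. */
		period_ns = trigger(arg);
		syscalls += 4; /* timerfd_settime + poll + read + trigger */
	}

	close(fd);
	return syscalls;
}
```

Under these assumptions each tick costs four syscalls, so calling
run_periodic(1000000L, 3, count_trigger, &calls) would report 12 for three
1ms ticks. A periodic timerfd would save the re-arm, but then the interval
could no longer follow retval without another timerfd_settime().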

With the current design, user-space can just exit after installing the
timer. Either the timer can adjust itself or other eBPF code can adjust it,
so the per-timer overhead is the same as that of a kernel timer.

The visibility to other BPF code is important for the conntrack case,
because each time we get an expired item during a lookup, we may
want to schedule the GC timer to run sooner. At least this would give
users more freedom to decide when to reschedule the timer.
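
The conntrack case can be modelled in plain C as follows. This is only a
user-space sketch of the idea, not the API proposed in the patch: struct
gc_timer, struct ct_entry, gc_timer_run_sooner() and ct_lookup() are all
hypothetical names, and timestamps are plain nanosecond counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a GC timer that other code can see and adjust. */
struct gc_timer {
	uint64_t next_run_ns;
};

/* Hypothetical conntrack entry with an absolute expiry time. */
struct ct_entry {
	uint64_t expires_ns;
};

/* Pull the GC deadline in; never push it further out. */
static void gc_timer_run_sooner(struct gc_timer *t, uint64_t when_ns)
{
	if (when_ns < t->next_run_ns)
		t->next_run_ns = when_ns;
}

/* Return true if the entry is still valid. On an expired hit, ask the GC
 * to run right away instead of waiting for its regular deadline. */
static bool ct_lookup(const struct ct_entry *e, struct gc_timer *gc,
		      uint64_t now_ns)
{
	assert(e && gc);

	if (e->expires_ns > now_ns)
		return true;

	gc_timer_run_sooner(gc, now_ns);
	return false;
}
```

With a GC deadline of 1000 and the clock at 200, looking up an entry that
expired at 100 moves the deadline to 200, while looking up a live entry
leaves it untouched. The point is that the lookup path itself can make this
decision, which a timer driven purely from user space cannot see.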

Thanks.