bpf.vger.kernel.org archive mirror
From: Cong Wang <xiyou.wangcong@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>,
	Xiongchun Duan <duanxiongchun@bytedance.com>,
	Dongdong Wang <wangdongdong.6@bytedance.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Cong Wang <cong.wang@bytedance.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	Pedro Tammela <pctammela@mojatatu.com>,
	Jamal Hadi Salim <jhs@mojatatu.com>
Subject: Re: [RFC Patch bpf-next] bpf: introduce bpf timer
Date: Tue, 27 Apr 2021 09:36:01 -0700
Message-ID: <CAM_iQpVE4XG7SPAVBmV2UtqUANg3X-1ngY7COYC03NrT6JkZ+g@mail.gmail.com>
In-Reply-To: <20210427020159.hhgyfkjhzjk3lxgs@ast-mbp.dhcp.thefacebook.com>

On Mon, Apr 26, 2021 at 7:02 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Apr 26, 2021 at 04:37:19PM -0700, Cong Wang wrote:
> > On Mon, Apr 26, 2021 at 4:05 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Mon, Apr 26, 2021 at 4:00 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > > >
> > > > Hi, Alexei
> > > >
> > > > On Wed, Apr 14, 2021 at 9:25 PM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Wed, Apr 14, 2021 at 9:02 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > > > > >
> > > > > > Then how do you prevent prog being unloaded when the timer callback
> > > > > > is still active?
> > > > >
> > > > > As I said earlier:
> > > > > "
> > > > > If prog refers such hmap as above during prog free the kernel does
> > > > > for_each_map_elem {if (elem->opaque) del_timer().}
> > > > > "
> > > >
> > > > I have discussed this with my colleagues, sharing timers among different
> > > > eBPF programs is a must-have feature for conntrack.
> > > >
> > > > For conntrack, we need to attach two eBPF programs, one on egress and
> > > > one on ingress. They share a conntrack table (an eBPF map), and no matter
> > > > we use a per-map or per-entry timer, updating the timer(s) could happen
> > > > on both sides, hence timers must be shared for both.
> > > >
> > > > So, your proposal we discussed does not work well for this scenario.
> > >
> > > why? The timer inside the map element will be shared just fine.
> > > Just like different progs can see the same map value.
> >
> > Hmm? In the above quotes from you, you suggested removing all the
> > timers installed by one eBPF program when it is freed, but they could be
> > still running independent of which program installs them.
>
> Right. That was before the office hours chat where we discussed an approach
> to remove timers installed by this particular prog only.
> The timers armed by other progs in the same map would be preserved.
>
> > In other words, timers are independent of other eBPF programs, so
> > they should not have an owner. With your proposal, the owner of a timer
> > is the program which contains the subprog (or callback) of the timer.
>
> right. so?
> How is this anything to do with "sharing timers among different eBPF programs"?

It matters a lot which program installs, and hence removes, these timers,
because conceptually a connection inside a conntrack table does not
belong to any program, and neither do the timers associated with these
connections.

If we enforce this ownership, then in the conntrack case the owner would
be the program which sees the connection first, which is pretty much
unpredictable. For example, if the ingress program sees a connection
first, it installs a timer for this connection. But the traffic is
bidirectional, so the egress program needs this connection and its timer
too; we should not remove this timer when the ingress program is freed.
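
To make the scenario concrete, here is a minimal user-space sketch in C.
It is not kernel code and does not reflect any proposed BPF API: every
name in it (ct_map, ct_entry, timer_owner, prog_free_owner_model,
prog_free_map_model) is hypothetical and exists only for illustration.
It models a shared conntrack map where the first program to see a
connection arms the entry's timer, and contrasts what happens on program
free under an owner-based model versus a model where the timer belongs
to the map entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CONNS 4

/* Hypothetical program identities sharing one conntrack map. */
enum prog_id { PROG_NONE = 0, PROG_INGRESS, PROG_EGRESS };

/* One conntrack entry; the timer state is embedded in the map value. */
struct ct_entry {
	bool timer_armed;
	enum prog_id timer_owner; /* program that armed the timer (assumed field) */
};

struct ct_map {
	struct ct_entry entries[MAX_CONNS];
};

/* Whichever program sees the connection first arms its timer. */
static void ct_arm_timer(struct ct_map *map, size_t slot, enum prog_id prog)
{
	assert(slot < MAX_CONNS);
	if (!map->entries[slot].timer_armed) {
		map->entries[slot].timer_armed = true;
		map->entries[slot].timer_owner = prog;
	}
}

/* Owner-based model: freeing a program cancels every timer it armed. */
static void prog_free_owner_model(struct ct_map *map, enum prog_id prog)
{
	for (size_t i = 0; i < MAX_CONNS; i++) {
		struct ct_entry *e = &map->entries[i];

		if (e->timer_armed && e->timer_owner == prog) {
			e->timer_armed = false;
			e->timer_owner = PROG_NONE;
		}
	}
}

/* Entry-lifetime model: timers belong to the map entry, so freeing a
 * program leaves them untouched. */
static void prog_free_map_model(struct ct_map *map, enum prog_id prog)
{
	(void)map;
	(void)prog;
}

/* Count the entries whose timer is still armed. */
static size_t ct_count_armed(const struct ct_map *map)
{
	size_t n = 0;

	for (size_t i = 0; i < MAX_CONNS; i++)
		if (map->entries[i].timer_armed)
			n++;
	return n;
}
```

If ingress arms the timer for a connection and is then freed, the
owner-based variant leaves that connection without a timer even though
egress still uses the entry; the entry-lifetime variant keeps it armed.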

From another point of view: maps and programs are both first-class
resources in eBPF; a timer is stored in a map and associated with a
program, so it is naturally a first-class resource too.

>
> > >
> > > Also if your colleagues have something to share they should be
> > > posting to the mailing list. Right now you're acting as a broken phone
> > > passing info back and forth and the knowledge gets lost.
> > > Please ask your colleagues to participate online.
> >
> > They are already in CC from the very beginning. And our use case is
> > public, it is Cilium conntrack:
> > https://github.com/cilium/cilium/blob/master/bpf/lib/conntrack.h
> >
> > The entries of the code are:
> > https://github.com/cilium/cilium/blob/master/bpf/bpf_lxc.c
> >
> > The maps for conntrack are:
> > https://github.com/cilium/cilium/blob/master/bpf/lib/conntrack_map.h
>
> If that's the only goal then kernel timers are not needed.
> cilium conntrack works well as-is.

We don't need to go back over why user-space cleanup is inefficient
again, do we? ;)

More importantly, although conntrack is our use case, we are obviously
not designing timers just for our case. Timers must be as flexible as
possible to allow for other future use cases.

Thanks.
