bpf.vger.kernel.org archive mirror
From: Joe Stringer <joe@cilium.io>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>, Joe Stringer <joe@cilium.io>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>,
	Xiongchun Duan <duanxiongchun@bytedance.com>,
	Dongdong Wang <wangdongdong.6@bytedance.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Cong Wang <cong.wang@bytedance.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	Pedro Tammela <pctammela@mojatatu.com>
Subject: Re: [RFC Patch bpf-next] bpf: introduce bpf timer
Date: Wed, 11 Aug 2021 14:03:07 -0700
Message-ID: <CADa=RyyuVD5r9_95HTj_-hPq4AjN1RgrGcZsJssRjYfajY=6hQ@mail.gmail.com>
In-Reply-To: <CAM_iQpX=Qk6GjxB=saTpbo4Oc1KBxK2tU5N==HO_LimiOEtoDA@mail.gmail.com>

Hi folks, apparently I never clicked 'send' on this email, but if you
wanted to continue the discussion I had some questions and thoughts.

This is also an interesting enough topic that it may be worth
submitting to the upcoming LPC Networking & BPF track (the submission
deadline is this Friday, August 13; the conference runs September
20-24).

On Thu, May 13, 2021 at 7:53 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Thu, May 13, 2021 at 11:46 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> >
> > On 2021-05-12 6:43 p.m., Jamal Hadi Salim wrote:
> >
> > >
> > > Will run some tests tomorrow to see the effect of batching vs nobatch
> > > and capture cost of syscalls and cpu.
> > >
> >
> > So here are some numbers:
> > Processor: Intel(R) Xeon(R) Gold 6230R CPU @ 2.10GHz
> > This machine is very similar to where a real deployment
> > would happen.
> >
> > Hyperthreading turned off so we can dedicate the core to the
> > dumping process and Performance mode on, so no frequency scaling
> > meddling.
> > Tests were run about 3 times each. Results eye-balled to make
> > sure deviation was reasonable.
> > 100% of the one core was used just for dumping during each run.
>
> I checked with Cilium users here at Bytedance, they actually observed
> 100% CPU usage too.

Thanks for the feedback. Can you provide further details? For instance,

* Which version of Cilium?
* How long do you observe this 100% CPU usage?
* What size CT map is in use?
* How frequently do you intend for CT GC to run? (Do you use the
default settings, or are they mismatched with your requirements for
some reason? If so, can we learn more about the requirements and why?)
* Do you have a threshold in mind that would be sufficient?

If the details are sensitive we can take this off-list, but I'd prefer
to continue here so that we have public examples to discuss and to
motivate future work. Alternatively, we can move to a Cilium GitHub
issue if the tradeoffs are more about the userspace implementation
than the kernel specifics, though I suspect some of the folks here
would also like to follow along, so I don't want to exclude the list
from the discussion.

FWIW I'm not inherently against a timer; in fact I've wondered for a
while what kind of interesting things we could build with such
support. At the same time, connection tracking entry management is a
nuanced topic and it's easy to fix an issue in one area only to
introduce a problem in another area.

> >
> > bpftool does linear retrieval whereas our tool does batch dumping.
> > bpftool does print the dumped results, for our tool we just count
> > the number of entries retrieved (cost would have been higher if
> > we actually printed). In any case in the real setup there is
> > a processing cost which is much higher.
> >
> > Summary is: the dumping is problematic costwise as the number of
> > entries increases. While batching does improve things it doesn't
> > solve our problem (Like I said, we have up to 16M entries and most
> > of the time we are dumping useless things)
>
> Thank you for sharing these numbers! Hopefully they could convince
> people here to accept the bpf timer. I will include your use case and
> performance number in my next update.

Yes, thanks Jamal for the numbers. They're very interesting: batch
dumping is clearly far more efficient, and we should enhance bpftool
to take advantage of it where applicable.
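
A back-of-the-envelope sketch of why batching cuts the syscall count.
It assumes a linear dump costs roughly one get-next-key plus one
lookup syscall per entry, and a batch dump costs one syscall per batch;
the function names and the batch size of 4096 are hypothetical and not
taken from bpftool or from Jamal's tool. It ignores the final call that
signals the end of the map.

```python
import math


def linear_dump_syscalls(entries: int) -> int:
    """Assumed cost of a linear dump: get-next-key + lookup per entry."""
    return 2 * entries


def batch_dump_syscalls(entries: int, batch_size: int) -> int:
    """Assumed cost of a batch dump: one syscall per batch, at least one."""
    return max(1, math.ceil(entries / batch_size))


if __name__ == "__main__":
    entries = 16_000_000  # map size mentioned in the thread
    batch_size = 4096     # hypothetical batch size
    print("linear:", linear_dump_syscalls(entries))
    print("batch: ", batch_dump_syscalls(entries, batch_size))
```

Either way every entry is still visited, so the per-entry processing
cost remains linear in the map size; only the syscall overhead shrinks.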

> Like I said, we have up to 16M entries and most
> of the time we are dumping useless things)

I'm curious whether there's a more intelligent way to address this
'dumping useless things' aspect. I can see how timers would eliminate
the cycles spent on the syscall side of this entirely (in favor of
the timer handling logic, which I'd guess is cheaper), but at some
point if you're running certain logic on every entry in a map then of
course it will scale linearly.

The use case is different for the CT problem we discussed above, but
if I look at the same question for the CT case, this is why I find LRU
useful - rather than firing off a number of timers linear in the size
of the map, the eviction logic is limited to the map insert rate,
which itself can be governed and ratelimited by logic running in eBPF.
The scan of the map then becomes less critical, so it can be run less
frequently and alleviate the CPU usage question that way.
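
A minimal userspace sketch of that LRU point, assuming a strict LRU
policy built on an ordered dictionary. The LruMap class and its
counters are hypothetical, and the kernel's LRU hash map is more
elaborate than this. The property it illustrates is that eviction work
only happens on insert, so it is bounded by the insert rate rather
than by the number of entries in the map.

```python
from collections import OrderedDict


class LruMap:
    """Toy strict-LRU map: evicts the least recently used key on insert."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.evictions = 0  # eviction work performed so far

    def insert(self, key, value):
        if key in self.entries:
            # Updating an existing key refreshes it, no eviction needed.
            self.entries.move_to_end(key)
        elif len(self.entries) >= self.capacity:
            # Map is full: drop the least recently used entry.
            self.entries.popitem(last=False)
            self.evictions += 1
        self.entries[key] = value

    def lookup(self, key):
        if key not in self.entries:
            return None
        # A hit marks the entry as most recently used.
        self.entries.move_to_end(key)
        return self.entries[key]


if __name__ == "__main__":
    ct = LruMap(capacity=3)
    for conn in range(5):
        ct.insert(conn, "state")
    # At most one eviction per insert, regardless of map size.
    print(ct.evictions, list(ct.entries))
```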
