From: Song Liu <songliubraving@fb.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: open list <linux-kernel@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>, Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <Kernel-team@fb.com>,
	john fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@chromium.org>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Daniel Xu <dlxu@fb.com>
Subject: Re: [PATCH bpf-next 5/5] selftests/bpf: add benchmark for uprobe vs. user_prog
Date: Wed, 5 Aug 2020 04:47:30 +0000
Message-ID: <AF9D0E8C-0AA5-4BE4-90F4-946FABAB63FD@fb.com>
In-Reply-To: <CAEf4BzaiJnCu14AWougmxH80msGdOp4S8ZNmAiexMmtwUM_2Xg@mail.gmail.com>



> On Aug 4, 2020, at 6:52 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Tue, Aug 4, 2020 at 2:01 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Aug 2, 2020, at 10:10 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>>> 
>>> On Sun, Aug 2, 2020 at 9:47 PM Song Liu <songliubraving@fb.com> wrote:
>>>> 
>>>> 
>>>>> On Aug 2, 2020, at 6:51 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>>>>> 
>>>>> On Sat, Aug 1, 2020 at 1:50 AM Song Liu <songliubraving@fb.com> wrote:
>>>>>> 
>>>>>> Add a benchmark to compare performance of
>>>>>> 1) uprobe;
>>>>>> 2) user program w/o args;
>>>>>> 3) user program w/ args;
>>>>>> 4) user program w/ args on random cpu.
>>>>>> 
>>>>> 
>>>>> Can you please add it to the existing benchmark runner instead, e.g.,
>>>>> along the other bench_trigger benchmarks? No need to re-implement
>>>>> benchmark setup. And also that would also allow to compare existing
>>>>> ways of cheaply triggering a program vs this new _USER program?
>>>> 
>>>> Will try.
>>>> 
>>>>> 
>>>>> If the performance is not significantly better than other ways, do you
>>>>> think it still makes sense to add a new BPF program type? I think
>>>>> triggering KPROBE/TRACEPOINT from bpf_prog_test_run() would be very
>>>>> nice, maybe it's possible to add that instead of a new program type?
>>>>> Either way, let's see comparison with other program triggering
>>>>> mechanisms first.
>>>> 
>>>> Triggering KPROBE and TRACEPOINT from bpf_prog_test_run() will be useful.
>>>> But I don't think they can be used instead of user program, for a couple
>>>> reasons. First, KPROBE/TRACEPOINT may be triggered by other programs
>>>> running in the system, so user will have to filter those noise out in
>>>> each program. Second, it is not easy to specify CPU for KPROBE/TRACEPOINT,
>>>> while this feature could be useful in many cases, e.g. get stack trace
>>>> on a given CPU.
>>>> 
>>> 
>>> Right, it's not as convenient with KPROBE/TRACEPOINT as with the USER
>>> program you've added specifically with that feature in mind. But if
>>> you pin user-space thread on the needed CPU and trigger kprobe/tp,
>>> then you'll get what you want. As for the "noise", see how
>>> bench_trigger() deals with that: it records thread ID and filters
>>> everything not matching. You can do the same with CPU ID. It's not as
>>> automatic as with a special BPF program type, but still pretty simple,
>>> which is why I'm still deciding (for myself) whether USER program type
>>> is necessary :)
>> 
>> Here are some bench_trigger numbers:
>> 
>> base      :    1.698 ± 0.001M/s
>> tp        :    1.477 ± 0.001M/s
>> rawtp     :    1.567 ± 0.001M/s
>> kprobe    :    1.431 ± 0.000M/s
>> fentry    :    1.691 ± 0.000M/s
>> fmodret   :    1.654 ± 0.000M/s
>> user      :    1.253 ± 0.000M/s
>> fentry-on-cpu:    0.022 ± 0.011M/s
>> user-on-cpu:    0.315 ± 0.001M/s
>> 
> 
> Ok, so basically all of raw_tp,tp,kprobe,fentry/fexit are
> significantly faster than USER programs. Sure, when compared to
> uprobe, they are faster, but not when doing on-specific-CPU run, it
> seems (judging from this patch's description, if I'm reading it
> right). Anyways, speed argument shouldn't be a reason for doing this,
> IMO.
> 
>> The two "on-cpu" tests run the program on a different CPU (see the patch
>> at the end).
>> 
>> "user" is about 25% slower than "fentry". I think this is mostly because
>> getpgid() is a faster syscall than bpf(BPF_TEST_RUN).
> 
> Yes, probably.
> 
>> 
>> "user-on-cpu" is more than 10x faster than "fentry-on-cpu", because IPI
>> is way faster than moving the process (via sched_setaffinity).
> 
> I don't think that's a good comparison, because you are actually
> testing sched_setaffinity performance on each iteration vs IPI in the
> kernel, not a BPF overhead.
> 
> I think the fair comparison for this would be to create a thread and
> pin it on necessary CPU, and only then BPF program calls in a loop.
> But I bet any of existing program types would beat USER program.
> 
>> 
>> For use cases that we would like to call BPF program on specific CPU,
>> triggering it via IPI is a lot faster.
> 
> So these use cases would be nice to expand on in the motivational part
> of the patch set. It's not really emphasized and it's not at all clear
> what you are trying to achieve. It also seems, depending on latency
> requirements, it's totally possible to achieve comparable results by
> pre-creating a thread for each CPU, pinning each one to its designated
> CPU and then using any suitable user-space signaling mechanism (a
> queue, condvar, etc) to ask a thread to trigger BPF program (fentry on
> getpgid(), for instance).

I don't see why a user-space signal plus fentry would be faster than an IPI.
If the target CPU is running something, this is going to add two context
switches.
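
To make the comparison concrete, here is a rough user-space sketch of the
pre-created, pinned thread scheme described above. It is only an
illustration: struct cpu_worker, worker_start(), worker_trigger() and
worker_stop() are made-up names, not code from this patch set, and the
sketch assumes a single requester thread. The getpgid() call stands in for
the syscall an fentry program would be attached to.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper for illustration; not part of the patch set. */
struct cpu_worker {
	pthread_t thread;
	pthread_mutex_t lock;
	pthread_cond_t wake;	/* requester -> worker */
	pthread_cond_t done;	/* worker -> requester */
	int pending;		/* triggers requested but not yet served */
	int last_cpu;		/* CPU observed when the last trigger ran */
	bool stop;
};

static void *worker_fn(void *arg)
{
	struct cpu_worker *w = arg;
	int cpu;

	pthread_mutex_lock(&w->lock);
	for (;;) {
		while (!w->pending && !w->stop)
			pthread_cond_wait(&w->wake, &w->lock);
		if (!w->pending)
			break;	/* stop requested and nothing left to serve */
		pthread_mutex_unlock(&w->lock);

		/* Cheap syscall: a BPF program attached to it (fentry,
		 * kprobe, ...) would run here, on the pinned CPU. */
		getpgid(0);
		cpu = sched_getcpu();

		pthread_mutex_lock(&w->lock);
		w->last_cpu = cpu;
		w->pending--;
		pthread_cond_signal(&w->done);
	}
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

/* Start a worker pinned to @cpu. Returns 0 or an errno-style error code. */
static int worker_start(struct cpu_worker *w, int cpu)
{
	pthread_attr_t attr;
	cpu_set_t set;
	int err;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	w->pending = 0;
	w->last_cpu = -1;
	w->stop = false;
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->wake, NULL);
	pthread_cond_init(&w->done, NULL);

	/* Pin via the thread attributes, so the worker never runs elsewhere. */
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	pthread_attr_init(&attr);
	err = pthread_attr_setaffinity_np(&attr, sizeof(set), &set);
	if (!err)
		err = pthread_create(&w->thread, &attr, worker_fn, w);
	pthread_attr_destroy(&attr);
	return err;
}

/* Ask the worker to trigger once and wait for it; returns the CPU it ran on. */
static int worker_trigger(struct cpu_worker *w)
{
	int cpu;

	pthread_mutex_lock(&w->lock);
	w->pending++;
	pthread_cond_signal(&w->wake);
	while (w->pending)
		pthread_cond_wait(&w->done, &w->lock);
	cpu = w->last_cpu;
	pthread_mutex_unlock(&w->lock);
	return cpu;
}

static void worker_stop(struct cpu_worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->stop = true;
	pthread_cond_signal(&w->wake);
	pthread_mutex_unlock(&w->lock);
	pthread_join(w->thread, NULL);
}
```

Each worker_trigger() call has to wake the pinned thread and then wait for
it to finish. When the target CPU is busy, that means switching to the
worker and back, which is the extra cost I am referring to.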

> I bet in this case the  performance would be
> really nice for a lot of practical use cases. But then again, I don't
> know details of the intended use case, so please provide some more
> details.

Being able to trigger a BPF program on a different CPU could enable many
use cases and optimizations. The use case I am looking at is to access
perf_event and percpu maps on the target CPU. For example:
	0. trigger the program;
	1. read perf_event on cpu x;
	2. (optional) check which process is running on cpu x;
	3. add perf_event value to percpu map(s) on cpu x.

If we do these steps in a BPF program on cpu x, the cost is:
	A.0) trigger BPF via IPI;
	A.1) read perf_event locally;
	A.2) access current locally;
	A.3) access percpu map(s) locally.

If we can only do these on a different CPU, the cost will be:
	B.0) trigger BPF locally;
	B.1) read perf_event via IPI;
	B.2) remote access current on cpu x;
	B.3) remote access percpu map(s), or use non-percpu map(s).

Cost of (A.0 + A.1) is about the same as (B.0 + B.1), maybe a little higher
(sys_bpf() vs. sys_getpgid()). But A.2 and A.3 will be significantly
cheaper than B.2 and B.3.

Does this make sense? 


OTOH, I do agree we can trigger bpftrace BEGIN/END with sys_getpgid() 
or something similar. 

Thanks,
Song
