Re: [PATCH bpf-next 5/5] selftests/bpf: add benchmark for uprobe vs. user_prog

From: Song Liu <songliubraving@fb.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	open list <linux-kernel@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>, Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	Kernel Team <Kernel-team@fb.com>,
	"john fastabend" <john.fastabend@gmail.com>,
	KP Singh <kpsingh@chromium.org>,
	"Jesper Dangaard Brouer" <brouer@redhat.com>,
	Daniel Xu <dlxu@fb.com>
Subject: Re: [PATCH bpf-next 5/5] selftests/bpf: add benchmark for uprobe vs. user_prog
Date: Wed, 5 Aug 2020 23:50:08 +0000
Message-ID: <5D24F0EF-6592-402C-BFF8-34119FFF7A2C@fb.com>
In-Reply-To: <20200805225015.kd4tx6w3wh67oara@ast-mbp.dhcp.thefacebook.com>

> On Aug 5, 2020, at 3:50 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> On Wed, Aug 05, 2020 at 06:56:26PM +0000, Song Liu wrote:
>> 
>> 
>>> On Aug 5, 2020, at 10:16 AM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>>> 
>>> On Wed, Aug 05, 2020 at 04:47:30AM +0000, Song Liu wrote:
>>>> 
>>>> Being able to trigger BPF program on a different CPU could enable many
>>>> use cases and optimizations. The use case I am looking at is to access
>>>> perf_event and percpu maps on the target CPU. For example:
>>>> 	0. trigger the program
>>>> 	1. read perf_event on cpu x;
>>>> 	2. (optional) check which process is running on cpu x;
>>>> 	3. add perf_event value to percpu map(s) on cpu x. 
>>> 
>>> If the whole thing is about doing the above then I don't understand why new
>>> prog type is needed.
>> 
>> I was under the (probably wrong) impression that adding prog type is not
>> that big a deal. 
> 
> Not a big deal when it's necessary.
> 
>>> Can prog_test_run support existing BPF_PROG_TYPE_KPROBE?
>> 
>> I haven't looked into all the details, but I bet this is possible.
>> 
>>> "enable many use cases" sounds vague. I don't think folks reading
>>> the patches can guess those "use cases".
>>> "Testing existing kprobe bpf progs" would sound more convincing to me.
>>> If the test_run framework can be extended to trigger kprobe with correct pt_regs.
>>> As part of it test_run would trigger on a given cpu with $ip pointing
>>> to some test function in test_run.c. For local test_run the stack trace
>>> would include bpf syscall chain. For IPI the stack trace would include
>>> the corresponding kernel pieces where top is our special test function.
>>> Sort of like pseudo kprobe where there is no actual kprobe logic,
>>> since kprobe prog doesn't care about mechanism. It needs correct
>>> pt_regs only as input context.
>>> The kprobe prog output (return value) has special meaning though,
>>> so maybe kprobe prog type is not a good fit.
>>> Something like fentry/fexit may be better, since verifier check_return_code()
>>> enforces 'return 0'. So their return value is effectively "void".
>>> Then prog_test_run would need to gain an ability to trigger
>>> fentry/fexit prog on a given cpu.
>> 
>> Maybe we add a new attach type for BPF_PROG_TYPE_TRACING, which is in 
>> parallel with BPF_TRACE_FENTRY and BPF_TRACE_FEXIT? Say BPF_TRACE_USER? 
>> (Just realized I like this name :-D, it matches USDT...). Then we can 
>> enable test_run for most (if not all) tracing programs, including
>> fentry/fexit. 
> 
> Why new hook? Why prog_test_run cmd cannot be made to work with
> BPF_PROG_TYPE_TRACING when it's loaded as BPF_TRACE_FENTRY and attach_btf_id
> points to special test function?
> The test_run cmd will trigger execution of that special function.

I am not sure I am following 100%. IIUC, the special test function is a 
kernel function, and we attach an fentry program to it. When multiple
fentry programs attach to the function, every trigger runs all of them,
so each program will need proper filter logic. 
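
To make the filter concern concrete, here is a rough userspace model. It
is not kernel code: special_test_function(), attach_prog() and the
requester_id field are names made up for illustration, and the function
pointers only stand in for attached fentry programs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MAX_ATTACHED 4

struct test_ctx {
	int requester_id;	/* assumed field: who asked for this trigger */
	long value;		/* payload, e.g. a counter reading */
};

typedef void (*prog_fn)(const struct test_ctx *ctx);

prog_fn attached[MAX_ATTACHED];
size_t nr_attached;

void attach_prog(prog_fn fn)
{
	assert(nr_attached < MAX_ATTACHED);
	attached[nr_attached++] = fn;
}

/* Stand-in for the shared special function: runs every attached program. */
void special_test_function(const struct test_ctx *ctx)
{
	for (size_t i = 0; i < nr_attached; i++)
		attached[i](ctx);
}

long sum_a, sum_b;

/* Each program must drop triggers that were meant for someone else. */
void prog_a(const struct test_ctx *ctx)
{
	if (ctx->requester_id != 1)
		return;
	sum_a += ctx->value;
}

void prog_b(const struct test_ctx *ctx)
{
	if (ctx->requester_id != 2)
		return;
	sum_b += ctx->value;
}

/* Two users share the hook; both programs run on both triggers. */
void run_shared_hook_demo(void)
{
	struct test_ctx for_a = { .requester_id = 1, .value = 10 };
	struct test_ctx for_b = { .requester_id = 2, .value = 32 };

	attach_prog(prog_a);
	attach_prog(prog_b);
	special_test_function(&for_a);
	special_test_function(&for_b);
}
```

Without the requester_id checks, both sums would pick up both values,
which is the cross-talk I am worried about with a shared function.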

Alternatively, if test_run just prepares the ctx and calls BPF_PROG_RUN(), 
like in bpf_test_run(), we don't need the special test function. 

So I do think the new attach type requires a new hook. It is just like
BPF_TRACE_FENTRY without a valid attach_btf_id. Of course, we can reserve
a test function and use it for attach_btf_id. If test_run just calls
BPF_PROG_RUN(), we will probably never touch the test function. 
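
For comparison, a similarly rough userspace model of the direct path,
where the caller builds the ctx and invokes exactly one program. Again
this is only a sketch: model_test_run(), struct model_ctx and
model_sum_prog() are hypothetical names, and the plain function call
merely stands in for BPF_PROG_RUN().

```c
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's bpf_test_run(). */
#define MODEL_MAX_ARGS 2

struct model_ctx {
	long args[MODEL_MAX_ARGS];	/* assumed ctx layout for this sketch */
};

typedef int (*model_prog_fn)(struct model_ctx *ctx);

/*
 * Build the ctx from caller-supplied arguments and run one program
 * directly. No shared function is involved, so no other program sees
 * this trigger. Returns 0 on success, -1 on bad input.
 */
int model_test_run(model_prog_fn prog, const long *args, size_t nr_args,
		   int *retval)
{
	struct model_ctx ctx = { { 0 } };

	if (!prog || !retval || nr_args > MODEL_MAX_ARGS)
		return -1;
	if (nr_args && !args)
		return -1;

	for (size_t i = 0; i < nr_args; i++)
		ctx.args[i] = args[i];

	*retval = prog(&ctx);	/* stands in for BPF_PROG_RUN() */
	return 0;
}

long model_seen_sum;

/* Example program: records the sum of its two arguments, returns 0. */
int model_sum_prog(struct model_ctx *ctx)
{
	model_seen_sum = ctx->args[0] + ctx->args[1];
	return 0;
}
```

Here the program needs no filter, because only the program handed to
model_test_run() ever runs.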

IMO, we are choosing between two options. 
1. FENTRY on special function. User will specify attach_btf_id on the
   special function. 
2. new attach type (BPF_TRACE_USER), that does not require attach_btf_id;
   and there is no need for a special function. 

I personally think #2 is the cleaner API. But I have no objection if #1 is 
better in other ways. 

Thanks,
Song