Re: [RFC bpf-next 3/4] bpftool: introduce "prog profile" command

From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Song Liu <songliubraving@fb.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com,
	ast@kernel.org, daniel@iogearbox.net
Subject: Re: [RFC bpf-next 3/4] bpftool: introduce "prog profile" command
Date: Fri, 14 Feb 2020 16:45:02 -0800	[thread overview]
Message-ID: <20200215004500.gs3ylstfo3aksfbp@ast-mbp> (raw)
In-Reply-To: <20200213210115.1455809-4-songliubraving@fb.com>

On Thu, Feb 13, 2020 at 01:01:14PM -0800, Song Liu wrote:
> With fentry/fexit programs, it is possible to profile BPF program with
> hardware counters. Introduce bpftool "prog profile", which measures key
> metrics of a BPF program.
> 
> bpftool prog profile command creates per-cpu perf events. Then it attaches
> fentry/fexit programs to the target BPF program. The fentry program saves
> perf event value to a map. The fexit program reads the perf event again,
> and calculates the difference, which is the instructions/cycles used by
> the target program.
> 
> Example input and output:
> 
>   ./bpftool prog profile 20 id 810 cycles instructions
>   cycles: duration 20 run_cnt 1368 miss_cnt 665
>           counter 503377 enabled 668202 running 351857
>   instructions: duration 20 run_cnt 1368 miss_cnt 707
>           counter 398625 enabled 502330 running 272014
> 
> This command measures cycles and instructions for BPF program with id
> 810 for 20 seconds. The program has triggered 1368 times. cycles was not
> measured in 665 out of these runs, because of perf event multiplexing
> (some perf commands are running in the background). In these runs, the BPF
> program consumed 503377 cycles. The perf_event enabled and running time
> are 668202 and 351857 respectively.

if (diff.enabled > diff.running) increment miss_cnt.
Why show this to users?
I think 'miss_cnt' the users will interpret as data is bogus,
but it only means that the counter was multiplexed.
The data is still accurate, no?
This condition will probably be hit fairly often, no?

>  tools/bpf/bpftool/profiler.skel.h         | 820 ++++++++++++++++++++++

I think bpftool needs to be build twice to avoid this.

Could you change the output format to be 'perf stat' like:
         55,766.51 msec task-clock                #    0.996 CPUs utilized
             4,891      context-switches          #    0.088 K/sec
                31      cpu-migrations            #    0.001 K/sec
         1,806,065      page-faults               #    0.032 M/sec
   166,819,295,451      cycles                    #    2.991 GHz                      (50.12%)
   251,115,795,764      instructions              #    1.51  insn per cycle           (50.10%)

Also printing 'duration' is unnecessary. The user specified it at the command
line and it doesn't need to be reported back to the user.
Can you also make it optional? Until users Ctrl-C's bpftool ?
So it may look like:
$ ./bpftool prog profile id 810 cycles instructions
             1,368      run_cnt
           503,377      cycles
           398,625      instructions         # 0.79 insn per cycle

Computing additional things like 'insn per cycle' do help humans to
pay attention the issue. Like <1 ipc is not great and the next step
would be to profile this program for cache misses.