From: Song Liu <liu.song.a23@gmail.com>
To: Andrii Nakryiko <andriin@fb.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>, bpf <bpf@vger.kernel.org>,
	Networking <netdev@vger.kernel.org>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next 1/2] libbpf: add perf buffer reading API
Date: Tue, 25 Jun 2019 19:18:57 -0700	[thread overview]
Message-ID: <CAPhsuW6FeBHHNgT3OA6x6i9kVsKutnVR46DFdkeG0cggaKbTnQ@mail.gmail.com> (raw)
In-Reply-To: <20190625232601.3227055-2-andriin@fb.com>

On Tue, Jun 25, 2019 at 4:28 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> A BPF_MAP_TYPE_PERF_EVENT_ARRAY map is often used to send data from a BPF program
> to user space for additional processing. libbpf already has a very low-level API
> for reading a single CPU's perf buffer, bpf_perf_event_read_simple(), but it's hard
> to use and requires a lot of code to set everything up. This patch adds a
> perf_buffer abstraction on top of it, abstracting the per-CPU setup and polling
> logic into a simple and convenient API, similar to what BCC provides.
>
> perf_buffer__new() sets up per-CPU ring buffers and updates the corresponding BPF
> map entries. It accepts two user-provided callbacks: one for handling raw
> samples and one for getting notifications of lost samples due to buffer overflow.
>
> perf_buffer__poll() is used to fetch ring buffer data across all CPUs,
> utilizing an epoll instance.
>
> perf_buffer__free() does the corresponding cleanup and removes the FDs from the BPF map.
>
> None of these APIs are thread-safe. Users should ensure proper locking/coordination if
> used in a multi-threaded setup.
>
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Overall looks good. Some nits below.
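
For anyone skimming the thread, here is a minimal sketch of how the new API
would be used, going by the commit message and the calls quoted further down.
handle_sample/handle_lost and the "exiting" flag are made-up names, and I'm
assuming perf_buffer__poll() takes a timeout in milliseconds, so treat this as
a sketch rather than the exact API:

    /* callback signatures implied by the calls in perf_buffer__process_record() */
    static void handle_sample(void *ctx, void *data, __u32 size)
    {
            /* one raw sample copied out of a per-CPU ring buffer */
    }

    static void handle_lost(void *ctx, __u64 lost_cnt)
    {
            /* consumer fell behind; lost_cnt samples were dropped */
    }

    ...
    struct perf_buffer *pb;

    /* "map" is the BPF_MAP_TYPE_PERF_EVENT_ARRAY map, 8 pages per CPU */
    pb = perf_buffer__new(map, 8, handle_sample, handle_lost, NULL);
    if (IS_ERR(pb))
            /* bail out, e.g. return PTR_ERR(pb) */;

    while (!exiting)        /* exiting: caller's own termination flag */
            perf_buffer__poll(pb, 100 /* timeout in ms, assumed */);

    perf_buffer__free(pb);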

> ---
>  tools/lib/bpf/libbpf.c   | 282 +++++++++++++++++++++++++++++++++++++++
>  tools/lib/bpf/libbpf.h   |  12 ++
>  tools/lib/bpf/libbpf.map |   5 +-
>  3 files changed, 298 insertions(+), 1 deletion(-)

[...]

> +struct perf_buffer *perf_buffer__new(struct bpf_map *map, size_t page_cnt,
> +                                    perf_buffer_sample_fn sample_cb,
> +                                    perf_buffer_lost_fn lost_cb, void *ctx)
> +{
> +       char msg[STRERR_BUFSIZE];
> +       struct perf_buffer *pb;
> +       int err, cpu;
> +
> +       if (bpf_map__def(map)->type != BPF_MAP_TYPE_PERF_EVENT_ARRAY) {
> +               pr_warning("map '%s' should be BPF_MAP_TYPE_PERF_EVENT_ARRAY\n",
> +                          bpf_map__name(map));
> +               return ERR_PTR(-EINVAL);
> +       }
> +       if (bpf_map__fd(map) < 0) {
> +               pr_warning("map '%s' doesn't have associated FD\n",
> +                          bpf_map__name(map));
> +               return ERR_PTR(-EINVAL);
> +       }
> +       if (page_cnt & (page_cnt - 1)) {
> +               pr_warning("page count should be power of two, but is %zu\n",
> +                          page_cnt);
> +               return ERR_PTR(-EINVAL);
> +       }
> +
> +       pb = calloc(1, sizeof(*pb));
> +       if (!pb)
> +               return ERR_PTR(-ENOMEM);
> +
> +       pb->sample_cb = sample_cb;
> +       pb->lost_cb = lost_cb;

I think we need to check sample_cb != NULL && lost_cb != NULL.
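Something along these lines, mirroring the argument checks earlier in the
function (just a sketch, shown for sample_cb; the same pattern would apply
to lost_cb):

       if (!sample_cb) {
               pr_warning("sample callback must be provided\n");
               return ERR_PTR(-EINVAL);
       }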

> +       pb->ctx = ctx;
> +       pb->page_size = getpagesize();
> +       pb->mmap_size = pb->page_size * page_cnt;
> +       pb->mapfd = bpf_map__fd(map);
> +
> +       pb->epfd = epoll_create1(EPOLL_CLOEXEC);
[...]
> +perf_buffer__process_record(struct perf_event_header *e, void *ctx)
> +{
> +       struct perf_buffer *pb = ctx;
> +       void *data = e;
> +
> +       switch (e->type) {
> +       case PERF_RECORD_SAMPLE: {
> +               struct perf_sample_raw *s = data;
> +
> +               pb->sample_cb(pb->ctx, s->data, s->size);
> +               break;
> +       }
> +       case PERF_RECORD_LOST: {
> +               struct perf_sample_lost *s = data;
> +
> +               if (pb->lost_cb)
> +                       pb->lost_cb(pb->ctx, s->lost);

OK, we check lost_cb here, so that check isn't necessary at init time.
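
sample_cb, on the other hand, is still called unconditionally in the
PERF_RECORD_SAMPLE branch above, so if it were ever meant to be optional
that branch would need a similar guard, roughly:

       if (pb->sample_cb)
               pb->sample_cb(pb->ctx, s->data, s->size);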

[...]
>                 bpf_program__attach_perf_event;
>                 bpf_program__attach_raw_tracepoint;
>                 bpf_program__attach_tracepoint;
>                 bpf_program__attach_uprobe;
> +               btf__parse_elf;

Why move btf__parse_elf?
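
For context, libbpf.map is a GNU ld version script: each block lists the
symbols exported by a given libbpf release and inherits from the previous
block. A rough sketch of what such a block looks like; the version names
and symbol placement here are my guess, not taken from this diff:

    LIBBPF_0.0.4 {
            global:
                    perf_buffer__free;
                    perf_buffer__new;
                    perf_buffer__poll;
    } LIBBPF_0.0.3;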

Thanks,
Song

Thread overview: 7+ messages
2019-06-25 23:25 [PATCH bpf-next 0/2] libbpf: add perf buffer API Andrii Nakryiko
2019-06-25 23:26 ` [PATCH bpf-next 1/2] libbpf: add perf buffer reading API Andrii Nakryiko
2019-06-26  2:18   ` Song Liu [this message]
2019-06-26  4:44     ` Andrii Nakryiko
2019-06-25 23:26 ` [PATCH bpf-next 2/2] selftests/bpf: test perf buffer API Andrii Nakryiko
2019-06-26  2:21   ` Song Liu
2019-06-26  5:11   ` Andrii Nakryiko
