From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.6 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0BCADC2D0F8 for ; Wed, 13 May 2020 06:51:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D5C6720708 for ; Wed, 13 May 2020 06:51:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="HvSuMsTw" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729547AbgEMGv1 (ORCPT ); Wed, 13 May 2020 02:51:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49688 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729367AbgEMGv1 (ORCPT ); Wed, 13 May 2020 02:51:27 -0400 Received: from mail-yb1-xb42.google.com (mail-yb1-xb42.google.com [IPv6:2607:f8b0:4864:20::b42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 00395C061A0C; Tue, 12 May 2020 23:51:26 -0700 (PDT) Received: by mail-yb1-xb42.google.com with SMTP id a10so4050251ybc.3; Tue, 12 May 2020 23:51:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=0Hq0+NcHUZkPr45/gSvsislYzDjnCtP4+4enTWioQS0=; b=HvSuMsTw+H6ECYAn3bYAHkH80FyGJz5hQVbr8w2zbDIKgzT+mPZUjcz9rgaIIn7hYP 5B8HK/ayRmCLTTOfX5vn/5OorG1J9/m6Bz8wjgQjF/ynR0gn552SBN+Jsf9QlwLmxGjv RMVzXdogYOg1P/IMCQORiUu/Qegbe9/5pGvg2EqZc2iXuihIBijTMHx+DIs6IPjwgoCl f3AcPlcW9n0zSoKIQyfncFH6BvOcW8CVW2DtWpvBVPbVqV4Hd+txsAgqnOQsIx9FLVdg niWn6y4wIlGszRPT9jL7i1Ai7AC7NQpuLv/cYZiWteO9DzHEJWtb+R0lA9N3iBDWri/J lg+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=0Hq0+NcHUZkPr45/gSvsislYzDjnCtP4+4enTWioQS0=; b=T2FAQ9px+WDwUXtMIqOKpIyidRL+KjXO3wTbNPK9QxQB1mri6+JD1cG/nh1rx5hSwi CYKGilqkFqFrFC+8+qOZUAzrnTK7BFla2XkGe0gNNRgj+nMnnGLiLuIIhzVf0VBLUrrY rkgHPquyFCZMKLDI4d+iepPvR71wjLz2sBIMxb+nSNbIwj1ByRT0Zzq+KbnYFmwVmM5R ctaq2xzF8eWjAPre3e6gvqlnwWNIhZPW0qHVQbS4544XKL2NkJf3FmexhjRtCCGLEApQ jWJbwszjfWrN+5FG7H8sAL54Lh8ahUc6SyFTOlEgoKscIi454b1sT/bNQgaTj640s5jE usDA== X-Gm-Message-State: AOAM530IxzhtMTHRfhqRWm52rrlLD5q/ijyD2Tl79R2RL33GcupphfNm r+q10igaTJeKIjgn5QM0lDHbKZ9JxmHyqV7j+zqMrt/GxVHmfhA= X-Google-Smtp-Source: ABdhPJy53mfROHub+rZ0huUz9CN07I0rQaNYlC1G/rDqav2f790At8JYNBObnUshQ9zDcKHyHxC+LyQYKnUXay5o4IA= X-Received: by 2002:a25:ab89:: with SMTP id v9mr19565502ybi.306.1589352686108; Tue, 12 May 2020 23:51:26 -0700 (PDT) MIME-Version: 1.0 References: <20200512144339.1617069-1-danieltimlee@gmail.com> <20200512144339.1617069-2-danieltimlee@gmail.com> In-Reply-To: From: "Daniel T. Lee" Date: Wed, 13 May 2020 15:51:10 +0900 Message-ID: Subject: Re: [PATCH bpf-next 1/3] samples: bpf: refactor kprobe tracing user progs with libbpf To: Yonghong Song Cc: Daniel Borkmann , Alexei Starovoitov , netdev , bpf , Andrii Nakryiko , John Fastabend Content-Type: text/plain; charset="UTF-8" Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org On Wed, May 13, 2020 at 10:40 AM Yonghong Song wrote: > > > > On 5/12/20 7:43 AM, Daniel T. Lee wrote: > > Currently, the kprobe BPF program attachment method for bpf_load is > > quite old. The implementation of bpf_load "directly" controls and > > manages(create, delete) the kprobe events of DEBUGFS. On the other hand, > > using using the libbpf automatically manages the kprobe event. > > (under bpf_link interface) > > > > By calling bpf_program__attach(_kprobe) in libbpf, the corresponding > > kprobe is created and the BPF program will be attached to this kprobe. > > To remove this, by simply invoking bpf_link__destroy will clean up the > > event. > > > > This commit refactors kprobe tracing programs (tracex{1~7}_user.c) with > > libbpf using bpf_link interface and bpf_program__attach. > > > > tracex2_kern.c, which tracks system calls (sys_*), has been modified to > > append prefix depending on architecture. > > > > Signed-off-by: Daniel T. Lee > > --- > > samples/bpf/Makefile | 12 +++---- > > samples/bpf/tracex1_user.c | 41 ++++++++++++++++++++---- > > samples/bpf/tracex2_kern.c | 8 ++++- > > samples/bpf/tracex2_user.c | 55 ++++++++++++++++++++++++++------ > > samples/bpf/tracex3_user.c | 65 ++++++++++++++++++++++++++++---------- > > samples/bpf/tracex4_user.c | 55 +++++++++++++++++++++++++------- > > samples/bpf/tracex6_user.c | 53 +++++++++++++++++++++++++++---- > > samples/bpf/tracex7_user.c | 43 ++++++++++++++++++++----- > > 8 files changed, 268 insertions(+), 64 deletions(-) > > > > diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile > > index 424f6fe7ce38..4c91e5914329 100644 > > --- a/samples/bpf/Makefile > > +++ b/samples/bpf/Makefile > > @@ -64,13 +64,13 @@ fds_example-objs := fds_example.o > > sockex1-objs := sockex1_user.o > > sockex2-objs := sockex2_user.o > > sockex3-objs := bpf_load.o sockex3_user.o > > -tracex1-objs := bpf_load.o tracex1_user.o $(TRACE_HELPERS) > > -tracex2-objs := bpf_load.o tracex2_user.o > > -tracex3-objs := bpf_load.o tracex3_user.o > > -tracex4-objs := bpf_load.o tracex4_user.o > > +tracex1-objs := tracex1_user.o $(TRACE_HELPERS) > > +tracex2-objs := tracex2_user.o > > +tracex3-objs := tracex3_user.o > > +tracex4-objs := tracex4_user.o > > tracex5-objs := bpf_load.o tracex5_user.o $(TRACE_HELPERS) > > -tracex6-objs := bpf_load.o tracex6_user.o > > -tracex7-objs := bpf_load.o tracex7_user.o > > +tracex6-objs := tracex6_user.o > > +tracex7-objs := tracex7_user.o > > test_probe_write_user-objs := bpf_load.o test_probe_write_user_user.o > > trace_output-objs := bpf_load.o trace_output_user.o $(TRACE_HELPERS) > > lathist-objs := bpf_load.o lathist_user.o > > diff --git a/samples/bpf/tracex1_user.c b/samples/bpf/tracex1_user.c > > index 55fddbd08702..1b15ab98f7d3 100644 > > --- a/samples/bpf/tracex1_user.c > > +++ b/samples/bpf/tracex1_user.c > > @@ -1,21 +1,45 @@ > > // SPDX-License-Identifier: GPL-2.0 > > #include > > -#include > > #include > > -#include > > -#include "bpf_load.h" > > +#include > > #include "trace_helpers.h" > > > > +#define __must_check > > This is not very user friendly. > Maybe not including linux/err.h and > use libbpf API libbpf_get_error() instead? > This approach looks more apparent and can stick with the libbpf API. I'll update code using this way. > > +#include > > + > > int main(int ac, char **argv) > > { > > - FILE *f; > > + struct bpf_link *link = NULL; > > + struct bpf_program *prog; > > + struct bpf_object *obj; > > char filename[256]; > > + FILE *f; > > > > snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]); > > + obj = bpf_object__open_file(filename, NULL); > > + if (IS_ERR(obj)) { > > + fprintf(stderr, "ERROR: opening BPF object file failed\n"); > > + obj = NULL; > > + goto cleanup; > > You do not need to goto cleanup, directly return 0 is okay here. > The same for other files in this patch. > As you said, it would be better to return right away than to proceed any further. I'll apply the code at next patch. > > + } > > + > > + prog = bpf_object__find_program_by_name(obj, "bpf_prog1"); > > + if (!prog) { > > + fprintf(stderr, "ERROR: finding a prog in obj file failed\n"); > > + goto cleanup; > > + } > > + > > + /* load BPF program */ > > + if (bpf_object__load(obj)) { > > + fprintf(stderr, "ERROR: loading BPF object file failed\n"); > > + goto cleanup; > > + } > > > > - if (load_bpf_file(filename)) { > > - printf("%s", bpf_log_buf); > > - return 1; > > + link = bpf_program__attach(prog); > > + if (IS_ERR(link)) { > > + fprintf(stderr, "ERROR: bpf_program__attach failed\n"); > > + link = NULL; > > + goto cleanup; > > } > > > > f = popen("taskset 1 ping -c5 localhost", "r"); > > @@ -23,5 +47,8 @@ int main(int ac, char **argv) > > > > read_trace_pipe(); > > > > +cleanup: > > + bpf_link__destroy(link); > > + bpf_object__close(obj); > > Typically in kernel, we do multiple labels for such cases > like > destroy_link: > bpf_link__destroy(link); > close_object: > bpf_object__close(obj); > > The error path in the main() function jumps to proper label. > This is more clean and less confusion. > > The same for other cases in this file. > I totally agree that multiple labels are much more intuitive. But It's not very common to jump to the destroy_link label. Either when on the routine is completed successfully and jumps to the destroy_link branch, or an error occurred while bpf_program__attach was called "several" times and jumps to the destroy_link branch. Single bpf_program__attach like this tracex1 sample doesn't really have to destroy link, since the link has been set to NULL on attach error and bpf_link__destroy() is designed to do nothing if passed NULL to it. So I think current approach will keep consistent between samples since most of the sample won't need to jump to destroy_link. > > return 0; > > } > > diff --git a/samples/bpf/tracex2_kern.c b/samples/bpf/tracex2_kern.c > > index d865bb309bcb..ff5d00916733 100644 > > --- a/samples/bpf/tracex2_kern.c > > +++ b/samples/bpf/tracex2_kern.c > > @@ -11,6 +11,12 @@ > > #include > > #include > > > > +#ifdef __x86_64__ > > +#define SYSCALL "__x64_" > > +#else > > +#define SYSCALL > > +#endif > > See test_progs.h, one more case to handle: > #ifdef __x86_64__ > #define SYS_NANOSLEEP_KPROBE_NAME "__x64_sys_nanosleep" > #elif defined(__s390x__) > #define SYS_NANOSLEEP_KPROBE_NAME "__s390x_sys_nanosleep" > #else > #define SYS_NANOSLEEP_KPROBE_NAME "sys_nanosleep" > #endif > That was also one of the considerations when writing patches. I'm planning to refactor most of the programs in the sample using libbpf, and found out that there are bunch of samples that tracks syscall with kprobe. Replacing all of them will take lots of macros and I thought using prefix will be better idea. Actually, my initial plan was to create macro of SYSCALL() #ifdef __x86_64__ #define PREFIX "__x64_" #elif defined(__s390x__) #define PREFIX "__s390x_" #else #define PREFIX "" #endif #define SYSCALL(SYS) PREFIX ## SYS And to use this macro universally without creating additional headers, I was trying to add this to samples/bpf/syscall_nrs.c which later compiles to samples/bpf/syscall_nrs.h. But it was pretty hacky way and it won't work properly. So I ended up with just adding prefix to syscall. Is it necessary to define all of the macro for each architecture? > > + > > struct bpf_map_def SEC("maps") my_map = { > > .type = BPF_MAP_TYPE_HASH, > > .key_size = sizeof(long), > > @@ -77,7 +83,7 @@ struct bpf_map_def SEC("maps") my_hist_map = { > > .max_entries = 1024, > > }; > > > > -SEC("kprobe/sys_write") > > +SEC("kprobe/" SYSCALL "sys_write") > > int bpf_prog3(struct pt_regs *ctx) > > { > > long write_size = PT_REGS_PARM3(ctx); > [...] Thank you for your time and effort for the review :) Best, Daniel