From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 89C6CC07E95 for ; Tue, 20 Jul 2021 00:41:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 61E3A61107 for ; Tue, 20 Jul 2021 00:41:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232259AbhGSX6Q (ORCPT ); Mon, 19 Jul 2021 19:58:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47912 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1378444AbhGSXhQ (ORCPT ); Mon, 19 Jul 2021 19:37:16 -0400 Received: from mail-yb1-xb2d.google.com (mail-yb1-xb2d.google.com [IPv6:2607:f8b0:4864:20::b2d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6CB5BC08EBB4 for ; Mon, 19 Jul 2021 17:10:59 -0700 (PDT) Received: by mail-yb1-xb2d.google.com with SMTP id p22so30434184yba.7 for ; Mon, 19 Jul 2021 17:10:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=f1LYlveBFicZVaUr3iwPelkh8/Tvyo0xDy3WLV4VFjo=; b=hUTYDAFLIToK0AwXh12zl66SyppQ46A0nJ6YnE5agpBZw2aTeBPInt6TwfiyvTBEQ3 aqXjMDVgBo86u/IUdtATjp4VUVDDkEhJ7N3xSjSFaOu4vHUQALUXbk2BqcfyX0CJNKah aRzGVxv5MKBlucsxMHH9iqDLwG1YQ/sutBIa9NAWObQ8mCR+5ldaepmMlD7HZiBRfKoc n12/98FmzIdhNtC+p7/d97UNcyJIN+zpymTWLx8/9eNxAQdLMZTc4OM4AXyiEScnD2/u dxHnakKrxSd3JngV9+wmXbSiMVEm+ENiuzqKJV1NkKaXr8O7Ox9wLg1Ao89wwutm0eop fLZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=f1LYlveBFicZVaUr3iwPelkh8/Tvyo0xDy3WLV4VFjo=; b=Co3CW8BU7bPxrSlZtS1Wn+44YakMxgTac/GT/0YG0/8m0T1NyS2IiVAUwFMhgU9T44 TOL1c4KGFdPcUJ1sgctZHqzYxCJwMMj7sdd172V7bd+O9vSDCOmqzECFXZE1Z9A81J6t fW6YSUt00UrOcFdgK98eHADaqB2HtMTfWCd8Le0DQ0zOq03XU09b+Do8vhx33W8HuKvK THA65k2iypT/MAL2dm+SbmRYLhP8Ct9yuKTq5xxd4vRNV3wiXMcQHBiXBhCpKyLxn3Ve kmKq8NvyNk4S7h5C3lFH7P954ZYOxBWGuh/1UgrY/6L8T4S4onskAy44ORS1q9Ci9WBt gyZw== X-Gm-Message-State: AOAM533ZUxawBEMoETItJkV1GX9qsI4yGDx6fAEnEVmVvGVz88bdkF+Q VROK8OK/rGm6fwBrNiSciIshQtoLxOj2fNTf54k= X-Google-Smtp-Source: ABdhPJxoP7BRlWrlzYtic0NVj5MHg3JLqabgAklvBUU9yg2c9KcRIs7BzYh1Ntd2gzdaBgMKWkw5Q3yX3vzkVtpnbt0= X-Received: by 2002:a25:d349:: with SMTP id e70mr35013467ybf.510.1626739858699; Mon, 19 Jul 2021 17:10:58 -0700 (PDT) MIME-Version: 1.0 References: <20210625044459.1249282-1-rafaeldtinoco@gmail.com> <701c5dea-2db9-4df8-888b-9e10c854afc3@www.fastmail.com> In-Reply-To: <701c5dea-2db9-4df8-888b-9e10c854afc3@www.fastmail.com> From: Andrii Nakryiko Date: Mon, 19 Jul 2021 17:10:47 -0700 Message-ID: Subject: Re: [PATCH bpf-next v3] libbpf: introduce legacy kprobe events support To: Rafael David Tinoco Cc: bpf Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org On Sun, Jul 18, 2021 at 6:59 PM Rafael David Tinoco wrote: > > > > --- a/tools/lib/bpf/libbpf.c > > > +++ b/tools/lib/bpf/libbpf.c > > > @@ -10007,6 +10007,10 @@ struct bpf_link { > > > char *pin_path; /* NULL, if not pinned */ > > > int fd; /* hook FD, -1 if not applicable */ > > > bool disconnected; > > > + struct { > > > + char *name; > > > + bool retprobe; > > > + } legacy; > > > > we shouldn't extend common bpf_link with kprobe-specific parts. We > > used to have something like this (for other use cases): > > > > struct bpf_link_kprobe { > > struct bpf_link link; > > char *legacy_name; > > bool is_retprobe; > > }; > > would this: > > struct bpf_link { > int (*detach)(struct bpf_link *link); > int (*destroy)(struct bpf_link *link); > char *pin_path; > int fd; > bool disconnected; > }; > > struct bpf_link_kprobe { > char *legacy_name; > bool is_retprobe; > struct bpf_link *link; > }; > > be ok ? No. > > > And then internally do container_of() to "cast" struct bpf_link to > > struct bpf_link_kprobe. External API should still operate on struct > > bpf_link everywhere. > > and what about this: > > static struct bpf_link* > bpf_program__attach_kprobe_opts(struct bpf_program *prog, > const char *func_name, > struct bpf_program_attach_kprobe_opts *opts) > { > char errmsg[STRERR_BUFSIZE]; > struct bpf_link_kprobe *kplink; > int pfd, err; > bool legacy; > > legacy = determine_kprobe_legacy(); > if (!legacy) { > pfd = perf_event_open_probe(false /* uprobe */, > opts->retprobe, > func_name, > 0 /* offset */, > -1 /* pid */); > } else { > pfd = perf_event_open_kprobe_legacy(opts->retprobe, > func_name, > 0 /* offset */, > -1 /* pid */); > } > if (pfd < 0) { > pr_warn("prog '%s': failed to create %s '%s' perf event: %s\n", > prog->name, opts->retprobe ? "kretprobe" : "kprobe", func_name, > libbpf_strerror_r(pfd, errmsg, sizeof(errmsg))); > return libbpf_err_ptr(pfd); > } > kplink = calloc(1, sizeof(struct bpf_link_kprobe)); > if (!kplink) > return libbpf_err_ptr(-ENOMEM); > kplink->link = bpf_program__attach_perf_event(prog, pfd); > err = libbpf_get_error(link); > if (err) { > free(kplink); > close(pfd); > pr_warn("prog '%s': failed to attach to %s '%s': %s\n", > prog->name, opts->retprobe ? "kretprobe" : "kprobe", func_name, > libbpf_strerror_r(err, errmsg, sizeof(errmsg))); > return libbpf_err_ptr(err); > } > if (legacy) { > kplink->legacy_name = strdup(func_name); > kplink->is_retprobe = opts->retprobe; > } > > return kplink->link; > } > > And use this 'kplink->link' pointer as the bpf_link pointer for all kprobe > functions. For the detachment we would have something like: > > static int bpf_link__detach_perf_event(struct bpf_link *link) { > struct bpf_link *const *plink; > struct bpf_link_kprobe *kplink; > int err; > > plink = (struct bpf_link *const *) link; > kplink = container_of(plink, struct bpf_link_kprobe, link); Did you check if this works? Also how that could ever even work for non-kprobe but perf_event-based cases, like tracepoint? And why are we even discussing these "alternatives"? What's wrong with the way I proposed earlier? For container_of() to work you have to do it the way I described in my previous email with one struct being embedded in another one. > err = ioctl(link->fd, PERF_EVENT_IOC_DISABLE, 0); > if (err) > err = -errno; > close(link->fd); > if (kplink) { > remove_kprobe_event_legacy(kplink->legacy_name, kplink->is_retprobe); > free(kplink->legacy_name); > free(kplink); > } > > return libbpf_err(err); > } > > for the bpf_link__detach_perf_event(): This would also clean the container at > the detachment. (Next comment talks about having this here versus having a > legacy kprobe detachment callback). > > [snip] > > > > static int bpf_link__detach_perf_event(struct bpf_link *link) > > > { > > > int err; > > > @@ -10152,6 +10197,12 @@ static int bpf_link__detach_perf_event(struct bpf_link *link) > > > err = -errno; > > > > > > close(link->fd); > > > + > > > + if (link->legacy.name) { > > > + remove_kprobe_event_legacy(link->legacy.name, link->legacy.retprobe); > > > + free(link->legacy.name); > > > + } > > > > instead of this check in bpf_link__detach_perf_event, attach_kprobe > > should install its own bpf_link__detach_kprobe_legacy callback > > attach_kprobe_opts() could pass a pointer to link->detach->callback through the > opts I suppose (or now, the kplink->link->detach->callback). This way the > default would still be bpf_link__detach_perf_event() but we could create a > function bpf_link__detach_perf_event_legacy_kprobe() with what was previously > showed (about kplink freeing). This is not needed with the version showed > before the [snip] though. > > > > +static int perf_event_open_probe_legacy(bool uprobe, bool retprobe, const char *name, > > > + uint64_t offset, int pid) > > > > you are not using offset here, let's pass it into > > add_kprobe_event_legacy and use it when attaching as "p:kprobes/%s > > %s+123" in poke_kprobe_events? There are separate patches that are > > adding ability to attach kprobe at offset, so let's support that > > (internally) from the get go for legacy case as well. > > > > also, it's not generic perf_event_open, it's specifically kprobe, so > > let's call it with kprobe in the name (e.g., kprobe_open_legacy or > > something) > > I'm calling it now perf_event_open_kprobe_legacy() and it calls: > > static inline int add_kprobe_event_legacy(const char *name, bool retprobe, uint64_t offset) > { > return poke_kprobe_events(true, name, retprobe, offset); > } > > and then we set {kprobes/kretprobes}/funcname_pid, also supporting offset: > > static int poke_kprobe_events(bool add, const char *name, bool retprobe, uint64_t offset) { > int fd, ret = 0; > char cmd[192] = {}, probename[128] = {}, probefunc[128] = {}; > const char *file = "/sys/kernel/debug/tracing/kprobe_events"; > > if (retprobe) > ret = snprintf(probename, sizeof(probename), "kretprobes/%s_libbpf_%u", name, getpid()); > else > ret = snprintf(probename, sizeof(probename), "kprobes/%s_libbpf_%u", name, getpid()); > if (offset) > ret = snprintf(probefunc, sizeof(probefunc), "%s+%lu", name, offset); > if (ret) > return -EINVAL; > if (add) { > snprintf(cmd, sizeof(cmd), "%c:%s %s", > retprobe ? 'r' : 'p', > probename, > offset ? probefunc : name); > } else { > snprintf(cmd, sizeof(cmd), "-:%s", probename); > } > fd = open(file, O_WRONLY | O_APPEND, 0); > if (!fd) > return -errno; > ret = write(fd, cmd, strlen(cmd)); > if (ret) > ret = -errno; > close(fd); > > return ret; > } > > [snip] > > > > + pfd = perf_event_open_probe(false /* uprobe */, > > > + retprobe, func_name, > > > + 0 /* offset */, > > > + -1 /* pid */); > > > + else > > > + pfd = perf_event_open_probe_legacy(false /* uprobe */, > > > + retprobe, func_name, > > > + 0 /* offset */, > > > + -1 /* pid */); > > > if (pfd < 0) { > > > pr_warn("prog '%s': failed to create %s '%s' perf event: %s\n", > > > prog->name, retprobe ? "kretprobe" : "kprobe", func_name, > > > > we can't use bpf_program__attach_perf_event as is now, because we need > > to allocate a different struct bpf_link_kprobe. > > We could do the container encapsulation using heap in > bpf_program_attach_kprobe_opts() or attach_kprobe() like I'm showing here, no ? > > ... > kplink = calloc(1, sizeof(struct bpf_link_kprobe)); > if (!kplink) > return libbpf_err_ptr(-ENOMEM); > kplink->link = bpf_program__attach_perf_event(prog, pfd); > ... > > and then free all this structure (bpf_link and its encapsulation at the > detachment, like said previously also). This way we don't have to change > bpf_program__attach_perf_event() which would continue to serve > bpf_program__attach_tracepoint() and bpf_program__attach_uprobe() unmodified. > This way, kprobe would have a container for all cases and uprobe and tracepoint > could have a container in the future if needed. > > > Let's extract the > > PERF_EVENT_IOC_SET_BPF and PERF_EVENT_IOC_ENABLE logic into a helper > > and use it from both bpf_program__attach_perf_event and > > bpf_program__attach_kprobe. It's actually good because we can check > > silly errors (like prog_fd < 0) before we create perf_event FD now. > > Okay, but I'm considering this orthogonal to what you said previously (on > changing bpf_program__attach_perf_event). UNLESS you really prefer me to do the > container allocation in bpf_program__attach_perf_event() but then we would have > to free the container in all detachments (kprobe, tracepoint and uprobe) as it > couldn't be placed in stack (or it would eventually be lost, no ?). container will be different for kprobe/uprobe and tracepoint/perf_event. So allocation has to happen separately from PERF_EVENT_IOC_SET_BPF and PERF_EVENT_IOC_ENABLE.