From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.7 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A5A28C3A5A6 for ; Wed, 28 Aug 2019 20:24:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6F02022CED for ; Wed, 28 Aug 2019 20:24:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="HDD/csk5" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726875AbfH1UYD (ORCPT ); Wed, 28 Aug 2019 16:24:03 -0400 Received: from mail-qk1-f193.google.com ([209.85.222.193]:38588 "EHLO mail-qk1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726687AbfH1UYD (ORCPT ); Wed, 28 Aug 2019 16:24:03 -0400 Received: by mail-qk1-f193.google.com with SMTP id u190so967896qkh.5; Wed, 28 Aug 2019 13:24:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=0PthWGi5zyhlTnok+h1QQSOQeRjewqREOQF5mh2omPY=; b=HDD/csk5KFc0GxByqsRCYaKyfndjqV2NalmWPdkCVEDKomHYTWg3daowCLSS5piOqt VpYLfU557o5zwvwXBuZyZ8QCqo3c30RclvvZ6kogHseEAsYew7rijrs07uEN31+PM4Ob TfIai3nf+rSx5SXngGRoNZ4SwUrWQd8JZaujvmn+jSXPue4GFmWTZqlAhRwpjX18LNuC Ri20JDMPkdrUgFTIoBqbduB49scrAvjOsdW0LBjuj7BImfbO7zSzoBF18/tugKGZcKAJ xeWx+wQtQ1imhbhCp/b1BBzG4nJ1iaW4AEKgsR5A4LHaWdWjm8WB3gzWdxyJYauKisQA qPYw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=0PthWGi5zyhlTnok+h1QQSOQeRjewqREOQF5mh2omPY=; b=P/YtiS1sc6ajbDyd6Pu9E6I3RrBGp+4L6nd6TUUOwCYyenvHsOOHNHqDgoJxFiJwaM 0paX1YtQy7PBEAYNR/PScKyVuKP3mozQIkvg+4YeFai7OA1ngVGjb8IcnmkpiUeguPvZ NV0CsxoEWiJr6rY0RGhMijwhny4d1+H/dqrQ1rPuJ0L8V/Ia33oLH+/o3JtUx4ebPgdT hiTbs62Je/b5KvnfDOK3r/R8I+pbYleePbOMAS2VD5RGtcsWvLtOSLfczQm18u+4Lk/Z vmRi6O6Yvr+yoST8zy0oebJTwZet6ncPBHSBmCz3EcTmSg+oupXtedk3TufDoK4T/YzV aPXA== X-Gm-Message-State: APjAAAW8NwE8bqfntpiapMsHELaHUGCTPZ16AeNJAAwm9yuatuAvjuac VIDzcq4o8s9PHsELwldKHIJtJvsFvY1OOrBQEZ0= X-Google-Smtp-Source: APXvYqzmQgaD9udTS9piXFwvCnf0kMM9MokjH4+r6GG2uaRhGR1Faz2m0KtcknN2iQlMSDPkhWIPuKGoUvQHQK2Afgk= X-Received: by 2002:a37:6007:: with SMTP id u7mr5926088qkb.92.1567023842170; Wed, 28 Aug 2019 13:24:02 -0700 (PDT) MIME-Version: 1.0 References: <20190820114706.18546-1-toke@redhat.com> <20190823122713.73450a4b@carbon> In-Reply-To: <20190823122713.73450a4b@carbon> From: Andrii Nakryiko Date: Wed, 28 Aug 2019 13:23:51 -0700 Message-ID: Subject: Re: [RFC bpf-next 0/5] Convert iproute2 to use libbpf (WIP) To: Jesper Dangaard Brouer Cc: =?UTF-8?B?VG9rZSBIw7hpbGFuZC1Kw7hyZ2Vuc2Vu?= , Stephen Hemminger , Daniel Borkmann , Alexei Starovoitov , Martin KaFai Lau , Song Liu , Yonghong Song , David Miller , Networking , bpf , Anton Protopopov , Stanislav Fomichev , Yoel Caspersen Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Fri, Aug 23, 2019 at 3:27 AM Jesper Dangaard Brouer wrote: > > On Wed, 21 Aug 2019 13:30:09 -0700 > Andrii Nakryiko wrote: > > > On Tue, Aug 20, 2019 at 4:47 AM Toke H=C3=B8iland-J=C3=B8rgensen wrote: > > > > > > iproute2 uses its own bpf loader to load eBPF programs, which has > > > evolved separately from libbpf. Since we are now standardising on > > > libbpf, this becomes a problem as iproute2 is slowly accumulating > > > feature incompatibilities with libbpf-based loaders. In particular, > > > iproute2 has its own (expanded) version of the map definition struct, > > > which makes it difficult to write programs that can be loaded with bo= th > > > custom loaders and iproute2. > > > > > > This series seeks to address this by converting iproute2 to using lib= bpf > > > for all its bpf needs. This version is an early proof-of-concept RFC,= to > > > get some feedback on whether people think this is the right direction= . > > > > > > What this series does is the following: > > > > > > - Updates the libbpf map definition struct to match that of iproute2 > > > (patch 1). > > > > > > Hi Toke, > > > > Thanks for taking a stab at unifying libbpf and iproute2 loaders. I'm > > totally in support of making iproute2 use libbpf to load/initialize > > BPF programs. But I'm against adding iproute2-specific fields to > > libbpf's bpf_map_def definitions to support this. > > > > I've proposed the plan of extending libbpf's supported features so > > that it can be used to load iproute2-style BPF programs earlier, > > please see discussions in [0] and [1]. I think instead of emulating > > iproute2 way of matching everything based on user-specified internal > > IDs, which doesn't provide good user experience and is quite easy to > > get wrong, we should support same scenarios with better declarative > > syntax and in a less error-prone way. I believe we can do that by > > relying on BTF more heavily (again, please check some of my proposals > > in [0], [1], and discussion with Daniel in those threads). It will > > feel more natural and be more straightforward to follow. It would be > > great if you can lend a hand in implementing pieces of that plan! > > > > I'm currently on vacation, so my availability is very sparse, but I'd > > be happy to discuss this further, if need be. > > > > [0] https://lore.kernel.org/bpf/CAEf4BzbfdG2ub7gCi0OYqBrUoChVHWsmOntW= AkJt47=3DFE+km+A@mail.gmail.com/ > > [1] https://www.spinics.net/lists/bpf/msg03976.html > > > > > - Adds functionality to libbpf to support automatic pinning of maps w= hen > > > loading an eBPF program, while re-using pinned maps if they already > > > exist (patches 2-3). > > For production use-cases, libbpf really need an easier higher-level API > for re-using pinned maps, for establishing shared maps between > programs. The existing libbpf API bpf_object__pin_maps() and > bpf_object__unpin_maps(), which don't re-use pinned maps, are not > really usable, because they pin/unpin ALL maps in the ELF file. > > What users really need is an easy way to specify, on a per map basis, > what kind of pinning and reuse/sharing they want. E.g. like iproute2 > have, "global", "object-scope", and "no-pinning". ("ifindex-scope" would > be nice for XDP). I totally agree and I think this is easy to add both for BTF-defined and "classic" bpf_map_def maps. Daniel mentioned in one of the previous threads that in practice object-scope doesn't seem to be used, so I'd say we should start with no-pinning + global pinning as two initial supported values for pinning attribute. ifindex-scope is interesting, but I'd love to hear a bit more about the use cases. > Today users have to split/reimplement bpf_prog_load_xattr(), and > use/add bpf_map__reuse_fd(). Which is that I ended doing for Honestly, bpf_prog_load_xattr() existence seems redundant to me. It's basically just bpf_object__open + bpf_object__load. There is a piece in the middle with "guessing" program types, but it should just be moved into bpf_object__open and happen by default. Using open + load gives more control and isn't really harder than bpf_prog_load_xattr. bpf_prog_load_xattr which might be slightly more convenient for simple use case, but falls apart immediately if you need to tune anything before load. > xdp-cpumap-tc[2] (used in production at ISP) resulting in 142 lines of > extra code[3] that should have been hidden inside libbpf. And worse, > in this solution[4] the maps for reuse-pinning is specified in the code > by name. Thus, they cannot use a generic loader. That I why, I want > to mark the maps via a pinning member, like iproute2. > > I really hope this moves in a practical direction, as I have the next > production request lined up (also from an ISP), and I hate to have to > advice them to choose the same route as [3]. It seems to me that map pinning doesn't need much discussion at this point, let's start with no-pinning + global pinning. To accommodate pinning at custom root, bpf_object__open_xattr should accept extra argument with non-default pinning root path. That should solve your case completely, shouldn't it? Ultimately, with BTF-defined maps it should be possible to specify custom pinning path on per-map basis for cases where user needs ultimate non-uniform manual control. > > > [2] https://github.com/xdp-project/xdp-cpumap-tc/ > [3] https://github.com/xdp-project/xdp-cpumap-tc/blob/master/src/xdp_ipha= sh_to_cpu_user.c#L262-L403 > [4] https://github.com/xdp-project/xdp-cpumap-tc/blob/master/src/xdp_ipha= sh_to_cpu_user.c#L431-L441 > -- > Best regards, > Jesper Dangaard Brouer > MSc.CS, Principal Kernel Engineer at Red Hat > LinkedIn: http://www.linkedin.com/in/brouer