From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.6 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 01972CA9EAE for ; Wed, 23 Oct 2019 04:58:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id BF76021872 for ; Wed, 23 Oct 2019 04:58:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="n2ek7N7L" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727885AbfJWE6x (ORCPT ); Wed, 23 Oct 2019 00:58:53 -0400 Received: from mail-qt1-f194.google.com ([209.85.160.194]:46965 "EHLO mail-qt1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727850AbfJWE6x (ORCPT ); Wed, 23 Oct 2019 00:58:53 -0400 Received: by mail-qt1-f194.google.com with SMTP id u22so30413774qtq.13; Tue, 22 Oct 2019 21:58:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=KD9wk3rtlGk25cugQBW8PeMmDWyY5bD+3+xHNKYM4+M=; b=n2ek7N7LIUnGxMfxT6ProPSOnRma9QOTpPmqgCKjvVzxSUfcMjO1JR2PtStO7R6mdm TrZIwO/uVfhrpB+GxI1CtlOvXbDcNCVyS6wC/BIZ4dk14BzTti0WjrEYEPcNzZXDcGU1 eTK7TgIFvKtlatJXfISkNAvG21F+FLTe6FTJHnU736zk9nXbo/rO0+ibvYTNSUcPWpXW 6seaAtj9dASL6kKGqccgZjNqqiZjh25iecrWf+SLcPHPcn5VpOZgX7EWIrXVeI2lCcfd HNijg1+lMD2G70nisP6lyIniFB9oMUjH/1PaRK217jFdXNznzkqtDo89nkbzIoHf6PEW UkIA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=KD9wk3rtlGk25cugQBW8PeMmDWyY5bD+3+xHNKYM4+M=; b=WMQv7ry3Z8asNy8XRpI8udQ51LwwWGgT06gRD7bSB3C0lDyavXXTSUiD8WVtCwbknS 7Zo76XLH6o8IVCWF8NKMTEEPi+Ibv515gGkwCfyS43O3J2D5zAioPc3lzbp0ng6G0u7q NS5hpz6iHSXmkKpUBLQtYgoaYeQifkK67WGzOkhI3N2/mYOH3jWWnEMdnc+ATIwejznx vuWS42+Ya3Za/4m/PfF4JjUGSNoUPpyxSeR9g0eN7myH+edEdqG1vZrowKvPBocXIib5 KimdUMdRVdX07XDAOhjIWBo/OaUios3emz+aEsgACwQ/t+KSiwSKKW6RuwcPaRpqZMTt 6xgw== X-Gm-Message-State: APjAAAWhJrVXhbI4YAFCx0T1VOUJ4k9Kefa10uQfMAIBvsMTpH6akv43 i9HECAeHV5eWMvovoh9GFq31qy8B4j4xv73fmPU= X-Google-Smtp-Source: APXvYqxxryhdMIXux0RZDHmS7x1f/k2bBMHC6QglZx1yrb9ug/N0OA1HRtGRXWhXKUzZlH6Wx0axp63g3uQj84Hur5w= X-Received: by 2002:a05:6214:16c5:: with SMTP id d5mr1480870qvz.247.1571806732279; Tue, 22 Oct 2019 21:58:52 -0700 (PDT) MIME-Version: 1.0 References: <157175668770.112621.17344362302386223623.stgit@toke.dk> <157175669103.112621.7847833678119315310.stgit@toke.dk> <8736fkob4g.fsf@toke.dk> In-Reply-To: <8736fkob4g.fsf@toke.dk> From: Andrii Nakryiko Date: Tue, 22 Oct 2019 21:58:41 -0700 Message-ID: Subject: Re: [PATCH bpf-next 3/3] libbpf: Add pin option to automount BPF filesystem before pinning To: =?UTF-8?B?VG9rZSBIw7hpbGFuZC1Kw7hyZ2Vuc2Vu?= Cc: Daniel Borkmann , Alexei Starovoitov , Martin KaFai Lau , Song Liu , Yonghong Song , Jesper Dangaard Brouer , David Miller , Networking , bpf Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org On Tue, Oct 22, 2019 at 12:04 PM Toke H=C3=B8iland-J=C3=B8rgensen wrote: > > Andrii Nakryiko writes: > > > On Tue, Oct 22, 2019 at 9:08 AM Toke H=C3=B8iland-J=C3=B8rgensen wrote: > >> > >> From: Toke H=C3=B8iland-J=C3=B8rgensen > >> > >> While the current map pinning functions will check whether the pin pat= h is > >> contained on a BPF filesystem, it does not offer any options to mount = the > >> file system if it doesn't exist. Since we now have pinning options, ad= d a > >> new one to automount a BPF filesystem at the pinning path if that is n= ot > > > > Next thing we'll be adding extra options to mount BPF FS... Can we > > leave the task of auto-mounting BPF FS to tools/applications? > > Well, there was a reason I put this into a separate patch: I wasn't sure > it really fit here. My reasoning is the following: If we end up with a > default auto-pinning that works really well, people are going to just > use that. And end up really confused when bpffs is not mounted. And it > seems kinda silly to make every application re-implement the same mount > check and logic. > > Or to put it another way: If we agree that the reasonable default thing > is to just pin things in /sys/fs/bpf, let's make it as easy as possible > for applications to do that right. > This reminds me the setrlimit() issue, though. And we decided that library shouldn't be manipulating global resources on behalf of users. I think this is a similar one. > >> already pointing at a bpffs. > >> > >> The mounting logic itself is copied from the iproute2 BPF helper funct= ions. > >> > >> Signed-off-by: Toke H=C3=B8iland-J=C3=B8rgensen > >> --- > >> tools/lib/bpf/libbpf.c | 47 +++++++++++++++++++++++++++++++++++++++= ++++++++ > >> tools/lib/bpf/libbpf.h | 5 ++++- > >> 2 files changed, 51 insertions(+), 1 deletion(-) > >> > >> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c > >> index aea3916de341..f527224bb211 100644 > >> --- a/tools/lib/bpf/libbpf.c > >> +++ b/tools/lib/bpf/libbpf.c > >> @@ -37,6 +37,7 @@ > >> #include > >> #include > >> #include > >> +#include > >> #include > >> #include > >> #include > >> @@ -4072,6 +4073,35 @@ int bpf_map__unpin(struct bpf_map *map, const c= har *path) > >> return 0; > >> } > >> > >> +static int mount_bpf_fs(const char *target) > >> +{ > >> + bool bind_done =3D false; > >> + > >> + while (mount("", target, "none", MS_PRIVATE | MS_REC, NULL)) { > > > > what does this loop do? we need some comments explaining what's going > > on here > > Well, as it says in the commit message I stole this from iproute2. I > think the "--make-private, --bind" dance is there to make sure we don't > mess up some other mount points at this path. Which seems like a good > idea, and one of those things that most people probably won't think > about when just writing an application that wants to mount the fs; which > is another reason to put this into libbpf :) I think this is exactly a reason to not do this and rely on applications to know and set up their environment properly. All these races, accidentally stomping on someone else's FS, etc, that sounds like a really good excuse to not do this in libbpf. Definitely not until we get a real experience, driven by production use cases, on how to go about that correctly. It might be added as a helper, but I think application has to call it explicitly. > > >> + if (errno !=3D EINVAL || bind_done) { > >> + pr_warning("mount --make-private %s failed: %s= \n", > >> + target, strerror(errno)); > >> + return -1; > >> + } > >> + > >> + if (mount(target, target, "none", MS_BIND, NULL)) { > >> + pr_warning("mount --bind %s %s failed: %s\n", > >> + target, target, strerror(errno)); > >> + return -1; > >> + } > >> + > >> + bind_done =3D true; > >> + } > >> + > >> + if (mount("bpf", target, "bpf", 0, "mode=3D0700")) { > >> + fprintf(stderr, "mount -t bpf bpf %s failed: %s\n", > >> + target, strerror(errno)); > >> + return -1; > >> + } > >> + > >> + return 0; > >> +} > >> + > >> static int get_pin_path(char *buf, size_t buf_len, > >> struct bpf_map *map, struct bpf_object_pin_opt= s *opts, > >> bool mkdir) > >> @@ -4102,6 +4132,23 @@ static int get_pin_path(char *buf, size_t buf_l= en, > > > > Nothing in `get_pin_path` indicates that it's going to do an entire FS > > mount, please split this out of get_pin_path. > > Regardless of the arguments above, that is certainly a fair point ;) > > -Toke