bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	David Miller <davem@davemloft.net>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next v3 3/4] libbpf: Add auto-pinning of maps when loading BPF objects
Date: Tue, 29 Oct 2019 11:56:34 -0700	[thread overview]
Message-ID: <CAEf4BzYhaCpiP3QZJ9zoKq6CujF-49HsNXfMSYmnWqBTq2z2Nw@mail.gmail.com> (raw)
In-Reply-To: <878sp3qtln.fsf@toke.dk>

On Tue, Oct 29, 2019 at 11:44 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> > On Tue, Oct 29, 2019 at 2:30 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
> >>
> >> > On Sun, Oct 27, 2019 at 1:53 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >>
> >> >> From: Toke Høiland-Jørgensen <toke@redhat.com>
> >> >>
> >> >> This adds support to libbpf for setting map pinning information as part of
> >> >> the BTF map declaration, to get automatic map pinning (and reuse) on load.
> >> >> The pinning type currently only supports a single PIN_BY_NAME mode, where
> >> >> each map will be pinned by its name in a path that can be overridden, but
> >> >> defaults to /sys/fs/bpf.
> >> >>
> >> >> Since auto-pinning only does something if any maps actually have a
> >> >> 'pinning' BTF attribute set, we default the new option to enabled, on the
> >> >> assumption that seamless pinning is what most callers want.
> >> >>
> >> >> When a map has a pin_path set at load time, libbpf will compare the map
> >> >> pinned at that location (if any), and if the attributes match, will re-use
> >> >> that map instead of creating a new one. If no existing map is found, the
> >> >> newly created map will instead be pinned at the location.
> >> >>
> >> >> Programs wanting to customise the pinning can override the pinning paths
> >> >> using bpf_map__set_pin_path() before calling bpf_object__load() (including
> >> >> setting it to NULL to disable pinning of a particular map).
> >> >>
> >> >> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >> >> ---
> >> >>  tools/lib/bpf/bpf_helpers.h |    6 ++
> >> >>  tools/lib/bpf/libbpf.c      |  142 ++++++++++++++++++++++++++++++++++++++++++-
> >> >>  tools/lib/bpf/libbpf.h      |   11 +++
> >> >>  3 files changed, 154 insertions(+), 5 deletions(-)
> >> >>
> >> >
> >> > [...]
> >> >
> >> >>
> >> >> -static int bpf_object__init_maps(struct bpf_object *obj, bool relaxed_maps)
> >> >> +static int bpf_object__build_map_pin_paths(struct bpf_object *obj,
> >> >> +                                          const char *path)
> >> >> +{
> >> >> +       struct bpf_map *map;
> >> >> +
> >> >> +       if (!path)
> >> >> +               path = "/sys/fs/bpf";
> >> >> +
> >> >> +       bpf_object__for_each_map(map, obj) {
> >> >> +               char buf[PATH_MAX];
> >> >> +               int err, len;
> >> >> +
> >> >> +               if (map->pinning != LIBBPF_PIN_BY_NAME)
> >> >> +                       continue;
> >> >
> >> > still think it's better be done from map definition parsing code
> >> > instead of a separate path, which will ignore most of maps anyways (of
> >> > course by extracting this whole buffer creation logic into a
> >> > function).
> >>
> >> Hmm, okay, can do that. I think we should still store the actual value
> >> of the 'pinning' attribute, though; and even have a getter for it. The
> >> app may want to do something with that information instead of having to
> >> infer it from map->pin_path. Certainly when we add other values of the
> >> pinning attribute, but we may as well add the API to get the value
> >> now...
> >
> > Let's now expose more stuff than what we need to expose. If we really
> > will have a need for that, it's really easy to add. Right now you
> > won't even need to store pinning attribute in bpf_map, because you'll
> > be just setting proper pin_path in init_user_maps(), as suggested
> > above.
>
> While I do think it's a bit weird that there's an attribute you can set
> but can't get at, I will grudgingly admit that it's not strictly needed
> right now... So OK, I'll leave it out :)
>
> >> >> +
> >> >> +               len = snprintf(buf, PATH_MAX, "%s/%s", path, bpf_map__name(map));
> >> >> +               if (len < 0)
> >> >> +                       return -EINVAL;
> >> >> +               else if (len >= PATH_MAX)
> >> >
> >> > [...]
> >> >
> >> >>         return 0;
> >> >>  }
> >> >>
> >> >> +static bool map_is_reuse_compat(const struct bpf_map *map,
> >> >> +                               int map_fd)
> >> >
> >> > nit: this should fit on single line?
> >> >
> >> >> +{
> >> >> +       struct bpf_map_info map_info = {};
> >> >> +       char msg[STRERR_BUFSIZE];
> >> >> +       __u32 map_info_len;
> >> >> +
> >> >> +       map_info_len = sizeof(map_info);
> >> >> +
> >> >> +       if (bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len)) {
> >> >> +               pr_warn("failed to get map info for map FD %d: %s\n",
> >> >> +                       map_fd, libbpf_strerror_r(errno, msg, sizeof(msg)));
> >> >> +               return false;
> >> >> +       }
> >> >> +
> >> >> +       return (map_info.type == map->def.type &&
> >> >> +               map_info.key_size == map->def.key_size &&
> >> >> +               map_info.value_size == map->def.value_size &&
> >> >> +               map_info.max_entries == map->def.max_entries &&
> >> >> +               map_info.map_flags == map->def.map_flags &&
> >> >> +               map_info.btf_key_type_id == map->btf_key_type_id &&
> >> >> +               map_info.btf_value_type_id == map->btf_value_type_id);
> >> >
> >> > If map was pinned by older version of the same app, key and value type
> >> > id are probably gonna be different, even if the type definition itself
> >> > it correct. We probably shouldn't check that?
> >>
> >> Oh, I thought the type IDs would stay relatively stable. If not then I
> >> agree that we shouldn't be checking them here. Will fix.
> >
> > type IDs are just an ordered index of a type, as generated by Clang.
> > No stability guarantees. Just adding extra typedef somewhere in
> > unrelated type might shift all the type IDs around.
>
> Ah, so it's just numbering types within the same translation unit? I
> thought it was somehow globally (or system-wide) unique (though not sure
> how I imagined that would be achieved, TBH).
>
> >> >> +}
> >> >> +
> >> >> +static int
> >> >> +bpf_object__reuse_map(struct bpf_map *map)
> >> >> +{
> >> >> +       char *cp, errmsg[STRERR_BUFSIZE];
> >> >> +       int err, pin_fd;
> >> >> +
> >> >> +       pin_fd = bpf_obj_get(map->pin_path);
> >> >> +       if (pin_fd < 0) {
> >> >> +               if (errno == ENOENT) {
> >> >> +                       pr_debug("found no pinned map to reuse at '%s'\n",
> >> >> +                                map->pin_path);
> >> >> +                       return 0;
> >> >> +               }
> >> >> +
> >> >> +               cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg));
> >> >> +               pr_warn("couldn't retrieve pinned map '%s': %s\n",
> >> >> +                       map->pin_path, cp);
> >> >> +               return -errno;
> >> >
> >> > store errno locally
> >>
> >> *shrugs* okay, if you insist...
> >
> > I guess I do insist on correct handling of errno, instead of
> > potentially returning garbage value from some unrelated syscall from
> > inside of pr_warn's user-provided callback.
> >
> > Even libbpf_strerror_r can garble errno (e.g., through its strerror_r
> > call), so make sure you store it before passing into
> > libbpf_strerror_r().
>
> Ohh, right, didn't think about those having side effects; then your
> worry makes more sense. I thought you were just being pedantic, which is
> why I was being grumpy (did change it, though) :)
>
> >>
> >> >> +       }
> >> >> +
> >> >> +       if (!map_is_reuse_compat(map, pin_fd)) {
> >> >> +               pr_warn("couldn't reuse pinned map at '%s': "
> >> >> +                       "parameter mismatch\n", map->pin_path);
> >> >> +               close(pin_fd);
> >> >> +               return -EINVAL;
> >> >> +       }
> >> >> +
> >> >> +       err = bpf_map__reuse_fd(map, pin_fd);
> >> >> +       if (err) {
> >> >> +               close(pin_fd);
> >> >> +               return err;
> >> >> +       }
> >> >> +       map->pinned = true;
> >> >> +       pr_debug("reused pinned map at '%s'\n", map->pin_path);
> >> >> +
> >> >> +       return 0;
> >> >> +}
> >> >> +
> >> >
> >> > [...]
> >> >
> >> >> +enum libbpf_pin_type {
> >> >> +       LIBBPF_PIN_NONE,
> >> >> +       /* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
> >> >> +       LIBBPF_PIN_BY_NAME,
> >> >> +};
> >> >> +
> >> >>  LIBBPF_API int bpf_object__pin_maps(struct bpf_object *obj, const char *path);
> >> >
> >> > pin_maps should take into account opts->auto_pin_path, shouldn't it?
> >> >
> >> > Which is why I also think that auto_pin_path is bad name, because it's
> >> > not only for auto-pinning, it's a pinning root path, so something like
> >> > pin_root_path or just pin_root is better and less misleading name.
> >>
> >> I view auto_pin_path as something that is used specifically for the
> >> automatic pinning based on the 'pinning' attribute. Any other use of
> >> pinning is for custom use and the user can pass a custom pin path to
> >> those functions.
> >
> > What's the benefit of restricting it to just this use case? If app
> > wants to use something other than /sys/fs/bpf as a default root path,
> > why would that be restricted only to auto-pinned maps? It seems to me
> > that having set this on bpf_object__open() and then calling
> > bpf_object__pin_maps(NULL) should just take this overridden root path
> > into account. Isn't that a logical behavior?
>
> No, I think the logical behaviour is for pin_maps(NULL) to just pin all
> maps at map->pin_path if set (and same for unpin). Already changed it to
> this behaviour, actually.

Sure, I guess that makes sense as well. Can you please add comment to
pin_maps describing this convention?

I still think that auto_pin_path is both too specific and also
imprecise at the same time :) It's not really a pin path, it's a part
of a pin path for any particular map, it's a root of those paths. And
let's not corner us to just auto-pinning use case yet by saying it's
auto_pin_path? So something like pin_root_path or root_pin_path is
generic enough to allow extensions if we need them, but without being
misleading.

>
> -Toke

  reply	other threads:[~2019-10-29 18:56 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-27 20:53 [PATCH bpf-next v3 0/4] libbpf: Support automatic pinning of maps using 'pinning' BTF attribute Toke Høiland-Jørgensen
2019-10-27 20:53 ` [PATCH bpf-next v3 1/4] libbpf: Fix error handling in bpf_map__reuse_fd() Toke Høiland-Jørgensen
2019-10-27 20:53 ` [PATCH bpf-next v3 2/4] libbpf: Store map pin path and status in struct bpf_map Toke Høiland-Jørgensen
2019-10-28 18:24   ` Andrii Nakryiko
2019-10-29  9:01     ` Toke Høiland-Jørgensen
2019-10-29 18:02       ` Andrii Nakryiko
2019-10-29 18:36         ` Toke Høiland-Jørgensen
2019-10-27 20:53 ` [PATCH bpf-next v3 3/4] libbpf: Add auto-pinning of maps when loading BPF objects Toke Høiland-Jørgensen
2019-10-28 18:24   ` Andrii Nakryiko
2019-10-29  9:30     ` Toke Høiland-Jørgensen
2019-10-29 18:13       ` Andrii Nakryiko
2019-10-29 18:44         ` Toke Høiland-Jørgensen
2019-10-29 18:56           ` Andrii Nakryiko [this message]
2019-10-29 19:07             ` Toke Høiland-Jørgensen
2019-10-27 20:53 ` [PATCH bpf-next v3 4/4] selftests: Add tests for automatic map pinning Toke Høiland-Jørgensen
2019-10-28 13:06   ` Jesper Dangaard Brouer
2019-10-28 13:15     ` Toke Høiland-Jørgensen
2019-10-28 15:32       ` Yonghong Song
2019-10-28 16:13         ` Jesper Dangaard Brouer
2019-10-28 17:32           ` Alexei Starovoitov
2019-10-28 18:23           ` Andrii Nakryiko
2019-10-28 18:43   ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEf4BzYhaCpiP3QZJ9zoKq6CujF-49HsNXfMSYmnWqBTq2z2Nw@mail.gmail.com \
    --to=andrii.nakryiko@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=kafai@fb.com \
    --cc=netdev@vger.kernel.org \
    --cc=songliubraving@fb.com \
    --cc=toke@redhat.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).