netdev.vger.kernel.org archive mirror
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>,
	Stephen Hemminger <stephen@networkplumber.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>, David Miller <davem@davemloft.net>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>
Subject: Re: [RFC bpf-next 0/5] Convert iproute2 to use libbpf (WIP)
Date: Mon, 3 Feb 2020 16:56:14 -0800
Message-ID: <CAEf4Bza4bSAzjFp2WDiPAM7hbKcKgAX4A8_TUN8V38gXV9GbTg@mail.gmail.com>
In-Reply-To: <87blqfcvnf.fsf@toke.dk>

On Mon, Feb 3, 2020 at 11:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> > On Wed, Aug 28, 2019 at 1:40 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> >>
> >> On Fri, Aug 23, 2019 at 4:29 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >
> >> > [ ... snip ...]
> >> >
> >> > > E.g., today's API is essentially three steps:
> >> > >
> >> > > 1. open and parse ELF: collect relocs, programs, map definitions
> >> > > 2. load: create maps from collected defs, do program/global data/CO-RE
> >> > > relocs, load and verify BPF programs
> >> > > 3. attach programs one by one.
> >> > >
> >> > > Between steps 1 and 2 the user has the flexibility to create more
> >> > > maps, set up map-in-map, etc. Between 2 and 3 you can fill in global
> >> > > data, fill in tail call maps, etc. That's already pretty flexible.
> >> > > But we can tune and break apart those steps even further, if
> >> > > necessary.
> >> >
> >> > Today, steps 1 and 2 can be collapsed into a single call to
> >> > bpf_prog_load_xattr(). As Jesper's mail explains, for XDP we don't
> >> > generally want to do all the fancy rewriting stuff, we just want a
> >> > simple way to load a program and get reusable pinning of maps.
> >>
> >> I agree. See my response to Jesper's message. Note also my view on
> >> the existence of bpf_prog_load_xattr().
> >>
> >> > Preferably in a way that is compatible with the iproute2 loader.
> >> >
> >
> > Hi Toke,
> >
> > I was wondering what's the state of converting iproute2 to use libbpf?
> > Is this still something you (or someone else) are interested in doing?
>
> Yeah, it's still on my list; planning to circle back to it once I have
> finished an RFC implementation for XDP multiprog loading based on the
> new function-replacing in the kernel.
>
> (Not that this should keep anyone else from giving the conversion a go
> and beating me to it :)).
>
> > Briefly re-reading the thread, I think libbpf already has almost
> > everything to be used by iproute2. You've added map pinning, so with
> > bpf_map__set_pin_path() iproute2 should be able to specify pinning
> > path, according to its own logic. The only thing missing that I can
> > see is the ability to specify numa_node, which we should add to
> > BTF-defined map definitions (trivial change) and probably also
> > expose through a method like bpf_map__set_numa_node(struct bpf_map
> > *map, int numa_node) for non-declarative and non-BTF legacy cases.
>
> Yes, adding this to libbpf would be good.
>
> > There was concern about supporting "extended" bpf_map_def format of
> > iproute2 (bpf_elf_map, actually) with extra fields. I think it's
> > actually easy to handle as is without any extra new APIs.
> > bpf_object__open() w/ .relaxed_maps = true option will process
> > compatible 5 fields of bpf_map_def (type, key/value sizes,
> > max_entries, and map_flags) and will set up corresponding struct
> > bpf_map entries (but won't create BPF maps in kernel yet). Then
> > iproute2 can iterate over the "maps" ELF section on its own and see
> > which maps need further adjustments before the load phase:
> > map-in-map setup, numa node, pinning, etc. All those adjustments
> > (except for numa, for now) can be done through existing libbpf APIs,
> > as far as I can tell. Once that is taken care of, proceed to
> > bpf_object__load() and other standard steps. No callbacks, no extra
> > cruft.
> >
> > Is there anything else that can block iproute2 conversion to libbpf?
>
> I haven't looked into the details since my last RFC conversion series,
> but from what I recall from that, and what we've been changing in libbpf
> since, I was basically planning to do what you explained. So while there
> are some details to work out, I believe it's basically straightforward,
> and I can't think of anything that should block it.
>

Great! Just to disambiguate and make sure we are in agreement: my hope
here is that iproute2 can completely delegate all the ELF parsing, map
creation, program loading, etc. to libbpf (including all the new stuff
like global variables). Only for legacy maps in SEC("maps") would it
have to parse that *single* ELF section (again, on its own) and check
whether any extra features of struct bpf_elf_map are requested (i.e.,
numa, map-in-map, pinning); if so, it would use programmatic libbpf
APIs to set them up. It might also need to do additional
BPF_PROG_ARRAY setup after BPF programs are loaded (because iproute2
has its own naming-based convention). But hopefully we'll encourage
people to gradually migrate to BTF-defined maps with declarative ways
of doing all that.
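
A minimal sketch of what that single-section scan could look like, in
plain C with no libbpf dependency. Everything named here is an
assumption made for illustration: struct legacy_map is a hypothetical
layout (the five compatible fields followed by two invented extension
fields), not the real struct bpf_elf_map, and scan_legacy_maps() and the
EXTRA_* flags are made-up names. It only decides which maps need extra
handling; it does not perform the setup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical legacy map entry, for illustration only: the five
 * fields that relaxed parsing understands, followed by two assumed
 * extension fields. This is NOT the real struct bpf_elf_map layout. */
struct legacy_map {
	uint32_t type;
	uint32_t key_size;
	uint32_t value_size;
	uint32_t max_entries;
	uint32_t map_flags;
	uint32_t pinning;	/* assumed: 0 means no pinning requested */
	uint32_t inner_id;	/* assumed: 0 means not a map-in-map */
};

/* Hypothetical flags describing which extras an entry asks for. */
enum legacy_extra {
	EXTRA_PINNING    = 1u << 0,
	EXTRA_MAP_IN_MAP = 1u << 1,
};

/* Walk the raw bytes of a legacy "maps" section and record, per entry,
 * which extra features are requested. Returns the number of entries
 * examined (capped at max_maps), or -1 if the section size is not a
 * multiple of the entry size. */
static int scan_legacy_maps(const void *sec, size_t sec_size,
			    unsigned int *extras, size_t max_maps)
{
	size_t i, n;

	assert(sec != NULL && extras != NULL);

	if (sec_size % sizeof(struct legacy_map) != 0)
		return -1;

	n = sec_size / sizeof(struct legacy_map);
	if (n > max_maps)
		n = max_maps;

	for (i = 0; i < n; i++) {
		struct legacy_map m;

		/* Copy out each entry: raw section data may be unaligned. */
		memcpy(&m, (const char *)sec + i * sizeof(m), sizeof(m));

		extras[i] = 0;
		if (m.pinning)
			extras[i] |= EXTRA_PINNING;
		if (m.inner_id)
			extras[i] |= EXTRA_MAP_IN_MAP;
	}
	return (int)n;
}
```

For every entry flagged here, the loader would then make the
corresponding programmatic libbpf call (bpf_map__set_pin_path() for
pinning, for example) before calling bpf_object__load(); entries with no
flags need nothing beyond what libbpf already parsed.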

> > BTW, I have draft patches for declarative (BTF-based) map-in-map setup
> > and initialization the way I described it at Plumbers last year. So
> > while I'm finalizing that, I thought I'd resurrect the iproute2 thread
> > and see if we can get the iproute2 migration to libbpf started.
>
> Great! FWIW, as long as we have the legacy compatibility code in
> iproute2, I don't think it'll be a problem (or a blocker for the
> conversion) if BTF-defined maps can't express all the same things as the
> legacy maps. The missing bits will come automatically as libbpf is
> updated. But great to hear that you're working on this :)
>
> -Toke
>


Thread overview: 46+ messages
2019-08-20 11:47 [RFC bpf-next 0/5] Convert iproute2 to use libbpf (WIP) Toke Høiland-Jørgensen
2019-08-20 11:47 ` [RFC bpf-next 1/5] libbpf: Add map definition struct fields from iproute2 Toke Høiland-Jørgensen
2019-08-20 11:47 ` [RFC bpf-next 2/5] libbpf: Add support for auto-pinning of maps with reuse on program load Toke Høiland-Jørgensen
2019-08-20 11:47 ` [RFC bpf-next 3/5] libbpf: Add support for specifying map pinning path via callback Toke Høiland-Jørgensen
2019-08-20 11:47 ` [RFC bpf-next 4/5] iproute2: Allow compiling against libbpf Toke Høiland-Jørgensen
2019-08-22  8:58   ` Daniel Borkmann
2019-08-22 10:43     ` Toke Høiland-Jørgensen
2019-08-22 11:45       ` Daniel Borkmann
2019-08-22 12:04         ` Toke Høiland-Jørgensen
2019-08-22 12:33           ` Daniel Borkmann
2019-08-22 13:38             ` Toke Høiland-Jørgensen
2019-08-22 13:45               ` Daniel Borkmann
2019-08-22 15:28                 ` Toke Høiland-Jørgensen
2019-08-20 11:47 ` [RFC bpf-next 5/5] iproute2: Support loading XDP programs with libbpf Toke Høiland-Jørgensen
2019-08-21 19:26 ` [RFC bpf-next 0/5] Convert iproute2 to use libbpf (WIP) Alexei Starovoitov
2019-08-21 21:00   ` Toke Høiland-Jørgensen
2019-08-22  7:52     ` Andrii Nakryiko
2019-08-22 10:38       ` Toke Høiland-Jørgensen
2019-08-21 20:30 ` Andrii Nakryiko
2019-08-21 21:07   ` Toke Høiland-Jørgensen
2019-08-22  7:49     ` Andrii Nakryiko
2019-08-22  8:33       ` Daniel Borkmann
2019-08-22 11:48         ` Toke Høiland-Jørgensen
2019-08-22 11:49           ` Toke Høiland-Jørgensen
2019-08-23  6:31         ` Andrii Nakryiko
2019-08-23 11:29           ` Toke Høiland-Jørgensen
2019-08-28 20:40             ` Andrii Nakryiko
2020-02-03  7:29               ` Andrii Nakryiko
2020-02-03 19:34                 ` Toke Høiland-Jørgensen
2020-02-04  0:56                   ` Andrii Nakryiko [this message]
2020-02-04  1:46                     ` David Ahern
2020-02-04  3:41                       ` Andrii Nakryiko
2020-02-04  4:52                         ` David Ahern
2020-02-04  5:00                           ` Andrii Nakryiko
2020-02-04  8:25                             ` Toke Høiland-Jørgensen
2020-02-04 18:47                               ` Andrii Nakryiko
2020-02-04 19:19                                 ` Toke Høiland-Jørgensen
2020-02-04 19:29                                   ` Andrii Nakryiko
2020-02-04 21:56                                     ` Toke Høiland-Jørgensen
2020-02-04 22:12                                       ` David Ahern
2020-02-04 22:35                                         ` Toke Høiland-Jørgensen
2020-02-04 23:13                                           ` David Ahern
2020-02-05 10:37                                             ` Toke Høiland-Jørgensen
2020-02-04  8:27                     ` Toke Høiland-Jørgensen
2019-08-23 10:27   ` Jesper Dangaard Brouer
2019-08-28 20:23     ` Andrii Nakryiko
