Re: static and dynamic linking. Was: [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF - Toke Høiland-Jørgensen

From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Edward Cree <ecree@solarflare.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	Marek Majkowski <marek@cloudflare.com>,
	Lorenz Bauer <lmb@cloudflare.com>,
	Alan Maguire <alan.maguire@oracle.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: static and dynamic linking. Was: [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF
Date: Thu, 14 Nov 2019 16:41:01 +0100	[thread overview]
Message-ID: <87y2wimpo2.fsf@toke.dk> (raw)
In-Reply-To: <20191112195223.cp5kcmkko54dsfbg@ast-mbp.dhcp.thefacebook.com>

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

[...]

> Back to your question of how fw2 will get loaded.. I'm thinking the following:
> 1. Static linking:
>   obj = bpf_object__open("rootlet.o", "fw1.o", "fw2.o");
>   // libbpf adjusts call offsets and links into single loadable bpf_object
>   bpf_object__load(obj);
>   bpf_set_link_xdp_fd()
> No kernel changes are necessary to support program chaining via static linking.
>
> 2. Dynamic linking:
>   // assuming libxdp.so manages eth0
>   rootlet_fd = get_xdp_fd(eth0);
>   subprog_btf_id = libbpf_find_prog_btf_id("name_of_placeholder", roolet_fd);
>   //                  ^ this function is in patch 16/18 of trampoline
>   attr.attach_prog_fd = roolet_fd;
>   attr.attach_btf_id = subprog_btf_id;
>   // pair (prog_fd, btf_id) needs to be specified at load time
>   obj = bpf_object__open("fw2.o", attr);
>   bpf_object__load(obj);
>   prog = bpf_object__find_program_by_title(obj);
>   link = bpf_program__replace(prog); // similar to bpf_program__attach_trace()
>   // no extra arguments during 'replace'.
>   // Target (prog_fd, btf_id) already known to the kernel and verified

OK, this makes sense.

>> So the two component programs would still exist as kernel objects,
>> right? 
>
> yes. Both fw1.o and fw2.o will be loaded and running instead of placeholders.
>
>> And the trampolines would keep individual stats for each one (if
>> BPF stats are enabled)? 
>
> In case of dynamic linking both fw1.o and fw2.o will be seen as individual
> programs from 'bpftool p s' point of view. And both will have
> individual stats.

Right, this is important, and I think it's where my skepticism about
static linking comes from. With static linking, each XDP program will be
"reduced" to a subprog instead of a full stand-alone program. Which
means that its execution will be different depending on whether it is
just attached directly to an interface, or if it's been linked with a
rootlet before loading.

I'll admit I don't know enough about how subprograms actually work to
know if it's a *meaningful* difference, so I guess I'll go play around
with it. If nothing else, experimenting with static linking can be a way
to hash out the semantics until dynamic linking lands.

>> Could userspace also extract the prog IDs being
>> referenced by the "glue" proglet? 
>
> Not sure I follow. Both fw1.o and fw2.o will have their own prog ids.
> fw1_prog->aux->linked_prog == rootlet_prog
> fw2_prog->aux->linked_prog == rootlet_prog
> Unloading and detaching fw1.o will make kernel to switch back to placeholder
> subprog in roolet_prog. I believe roolet_prog should not keep a list of progs
> that attached to it (or replaced its subprogs) to avoid circular
> dependency.

Well I did mean the link in the other direction. But thinking about it
some more, I don't think it really matters. The important bit is that
userspace can answer the question "given that rootlet ID X is currently
attached on eth0, which two program IDs Y and Z will actually run on
that interface?". And if there's a link in the other direction, it could
just iterate over all loaded programs in the kernel to find them, so
that is OK; as long as we can also tell in which "slot" in the rootlet a
given program is currently attached.

> Due to that detaching roolet_prog from netdev will stop the flow of
> packets into fw1.o, but refcnt of rootlet_prog will not go to zero, so
> it will stay in memory until both fw1.o and fw2.o detach from
> rootlet.o.

OK, that is probably fine. I think we should teach most utilities to
deal with this anyway; in particular, iproute2 should know about
multi-progs (i.e., link against libxdp).

>> What about attaching a third program? Would that work by recursion (as
>> above, but with the old proglet as old_fd), or should the library build
>> a whole new sequence from the component programs?
>
> This choice is up to libxdp.so. It can have a number of placeholders
> ready to be replaced by new progs. Or it can re-generate rootlet.o
> every time new fwX.o comes along. Short term I would start development
> with auto-generated roolet.o and static linking done by libbpf
> while the policy and roolet are done by libxdp.so, since this work
> doesn't depend on any kernel changes. Long term auto-generation
> can stay in libxdp.so if it turns out to be sufficient.

Yes, as I said above this sounds like at least it's a start.

>> Finally, what happens if someone where to try to attach a retprobe to
>> one of the component programs? Could it be possible to do that even
>> while program is being run from proglet dispatch? That way we can still
>> debug an individual XDP program even though it's run as part of a chain.
>
> Right. The fentry/fexit tracing is orthogonal to static/dynamic linking.
> It will be available for all prog types after trampoline patches land.
> See fexit_bpf2bpf.c example in the last 18/18 patch.
> We will be able to debug XDP program regardless whether it's a rootlet
> or a subprogram. Doesn't matter whether linking was static or dynamic.

OK, that's great, and certainly resolved one point of skepticism :)

> With fentry/fexit we will be able to do different stats too.
> Right now bpf program stats are limited to cycles and I resisted a lot
> of pressure to add more hard coded stats. With fentry/fexit we can
> collect arbitrary counters per program. Like number of L1-cache misses
> or number of TLB misses in a given XDP prog.

Yeah, that makes a lot of sense, of course. Great!

>> Sounds reasonable. Any reason libxdp.so couldn't be part of libbpf?
>
> libxdp.so is a policy specifier while libbpf is a tool. It makes more
> sense for them to be separate. libbpf has strong api compatibility
> guarantees. While I don't think anyone knows at this point how libxdp
> api should look and it will take some time for it to mature.

Well, we'd want libxdp to have the same strong API guarantees,
eventually. Which would be a reason to just include it in libbpf. But
sure, I wasn't suggesting to do this from the get-go; we can start out
with something separate and decide later when/if it makes sense to
integrate. As long as libbpf can do the heavy lifting on the actual
linking that is fine with me.

-Toke