Re: [PATCH RFC bpf-next 0/3] libbpf: Add support for extern function calls - Toke Høiland-Jørgensen

From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH RFC bpf-next 0/3] libbpf: Add support for extern function calls
Date: Sat, 21 Dec 2019 17:24:24 +0100
Message-ID: <878sn53avb.fsf@toke.dk>
In-Reply-To: <20191220203045.hmeoum5l4uw7gy5g@ast-mbp.dhcp.thefacebook.com>

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Thu, Dec 19, 2019 at 03:29:30PM +0100, Toke Høiland-Jørgensen wrote:
>> This series adds support for resolving function calls to functions marked as
>> 'extern' in eBPF source files, by resolving the function call targets at load
>> time. For now, this only works by static linking (i.e., copying over the
>> instructions from the function target). Once the kernel support for dynamic
>> linking lands, support can be added for having a function target be an already
>> loaded program fd instead of a bpf object.
>> 
>> The API I'm proposing for this is that the caller specifies an explicit mapping
>> between extern function names and function names in the target object file.
>> This is to support the XDP multi-prog case, where the dispatcher program may not
>> necessarily have control over function names in the target programs, so simple
>> function name resolution can't be used.
>
> I think simple name resolution should be the default behavior for both static
> and dynamic linking. That's the part where I think we must not reinvent the wheel.
> When one .c has
> extern int prog1(struct xdp_md *ctx);
> another .c should have:
> int prog1(struct xdp_md *ctx) {...}
> Both static and dynamic linking should link these two .c together without any
> extra steps from the user. It's expected behavior that any C user assumes and
> it should 'just work'.

Sure, absolutely, when we can, we should just auto-resolve function
signatures and names...

> Where we need to be creative is how to plug two xdp firewalls with arbitrary
> program names (including the same names) into a common rootlet.

...however, the "same name" issue is why I started down the path of
specifying links explicitly. I figure it will be somewhat common to have
to link in two independent XDP programs that both picked the same
function name (such as "xdp_main").
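
To make the two modes concrete, here is a rough sketch in plain C of how
explicit links could sit on top of default name resolution: an explicit
mapping entry wins, and anything without one falls back to its own name.
The struct and function names are made up for illustration; none of this
is existing libbpf API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type for illustration only; not part of libbpf. */
struct extern_map {
	const char *extern_name;	/* name declared 'extern' in the caller */
	const char *target_name;	/* function name in the target object */
};

/*
 * Return the name to look up in the target object for a given extern.
 * An explicit mapping entry takes precedence; without one, fall back to
 * the extern's own name (plain name-based resolution).
 */
const char *resolve_target_name(const char *extern_name,
				const struct extern_map *maps,
				size_t n_maps)
{
	size_t i;

	assert(extern_name != NULL);

	for (i = 0; i < n_maps; i++) {
		if (strcmp(maps[i].extern_name, extern_name) == 0)
			return maps[i].target_name;
	}
	return extern_name;
}
```

With a single entry mapping "prog1" to "xdp_main", a lookup of "prog1"
yields "xdp_main", while an unmapped "prog2" resolves to itself.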

> One firewall can be:
> noinline int foo(struct xdp_md *ctx)
> { // some logic
> }
> SEC("xdp")
> int xdp_prog1(struct xdp_md *ctx)
> {
>        return foo(ctx);
> }
>
> And another firewall:
> noinline int foo(struct xdp_md *ctx)
> { // some other logic
> }
> SEC("xdp")
> int xdp_prog2(struct xdp_md *ctx)
> {
>        return foo(ctx);
> }
>
> Both xdp programs (with multiple functions) need to be connected into:
>
> __weak noinline int dummy1(struct xdp_md *ctx) { return XDP_PASS; }
> __weak noinline int dummy2(struct xdp_md *ctx) { return XDP_PASS; }
>
> SEC("xdp")
> int rootlet(struct xdp_md *ctx)
> {
>         int ret;
>
>         ret = dummy1(ctx);
>         if (ret != XDP_PASS)
>                 goto out;
>
>         ret = dummy2(ctx);
>         if (ret != XDP_DROP)
>                 goto out;
> out:
>         return ret;
> }
>
> where xdp_prog1() from 1st firewall needs to replace dummy1()
> and xdp_prog2() from 2nd firewall needs to replace dummy2().
> Or the other way around depending on the order of installation.
>
> At the kernel level the API is actually simple. It's the pair of
> target_prog_fd + btf_id I described earlier in "static/dynamic linking" thread.
> Where target_prog_fd is the FD of the rootlet loaded into the kernel and
> btf_id is BTF id of dummy1 or dummy2.

Ah, right; I was thinking it would need a name, but I agree that btf_id
is better.

> When 1st firewall is being loaded libbpf needs to pass target_prog_fd+btf_id
> along with xdp_prog1() into the kernel, so that the verifier can do
> appropriate checking and refcnting.
>
> Note that the kernel and every program have their own BTF id space.
> Their own BTF ids == their own names.
> Loading two programs with exactly the same name is ok today and in the future.
> Linking into other program name space is where we need to agree on naming first.
>
> The static linking of two .o should follow familiar user space linker logic.
> Proposed bpf_linker__..("first.o") and bpf_linker__..("second.o") should work.
> Meaning that "extern int foo()" in "second.o" will get resolved with "int foo()"
> from "first.o".
> Dynamic linking is when "first.o" with "int foo()" was already loaded into
> the kernel and "second.o" is loaded after. In such case its "extern int foo()"
> will be resolved dynamically from previously loaded program.
> The user space analogy of this behavior is glibc.
> "first.o" is glibc.so that supplies memcpy() and friends.
> "second.o" is some a.out that used "extern int memcpy()".

Right, this makes sense. Are you proposing that the kernel does this
without any intervention from libbpf when the BTF indicates it has an
extern KIND_FUNC_PROTO? What about overriding the names (dynamically
linking against two programs with identical function names)?

> For XDP rootlet case already loaded weak function dummy[12]() need to
> be replaced later by xdp_prog[12](). It's like replacing memcpy() in glibc.so.
> I think the user space doesn't have such concepts. I was simply calling it
> dynamic linking too, but it's not quite accurate. It's dynamically replacing
> already loaded functions. Let's call it "dynamic re-linking" ?

I guess it's kinda akin to LD_PRELOAD? But I'm fine with calling it by a
separate name.
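
As a purely userspace analogy of the "re-linking" idea (not how the
kernel would do it), one can picture the rootlet's dummy functions as
replaceable slots. The sketch below uses function pointers; the slot
table, relink_slot() and drop_all() are invented for illustration, and
struct xdp_md is only a stand-in for the real context.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the real definitions; values match the XDP action codes. */
enum { XDP_DROP = 1, XDP_PASS = 2 };
struct xdp_md { int unused; };

typedef int (*xdp_fn)(struct xdp_md *ctx);

/* Default slot content: behaves like the weak dummy functions. */
int dummy_pass(struct xdp_md *ctx) { (void)ctx; return XDP_PASS; }

/* Example replacement program that drops everything. */
int drop_all(struct xdp_md *ctx) { (void)ctx; return XDP_DROP; }

#define N_SLOTS 2
xdp_fn slots[N_SLOTS] = { dummy_pass, dummy_pass };

/* Replace an already "loaded" slot; returns 0 on success, -1 on bad input. */
int relink_slot(size_t idx, xdp_fn fn)
{
	if (idx >= N_SLOTS || fn == NULL)
		return -1;
	slots[idx] = fn;
	return 0;
}

/* Run the slots in order and stop at the first verdict other than PASS. */
int rootlet(struct xdp_md *ctx)
{
	int ret = XDP_PASS;
	size_t i;

	assert(ctx != NULL);

	for (i = 0; i < N_SLOTS; i++) {
		ret = slots[i](ctx);
		if (ret != XDP_PASS)
			break;
	}
	return ret;
}
```

Before any replacement rootlet() returns XDP_PASS; after
relink_slot(1, drop_all) it returns XDP_DROP, without the dispatcher
itself being touched.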

> As far as libbpf api for dynamic linking, so far I didn't need to add new stuff.
> I'm trying to piggy back on fexit/fentry approach.

Cool :)

> I think to prototype re-linking without kernel support, we can do static re-linking.
> I think the best approach is to stick with name based resolution. libxdp can do:
> - add alias("dummy1") to xdp_prog1() in first_firewall.o
> - rename foo() in first_firewall.o into unique_foo().
> - add alias("dummy2") to xdp_prog2() in second_firewall.o
> - rename foo() in second_firewall.o into very_unique_foo().
> - use standard static linking of first_firewall.o + second_firewall.o + rootlet.o

The alias() would be a BTF annotation? Or something else?
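
Whatever form the alias takes, the rename step needs some way of
producing names that cannot collide in the final .o. A minimal sketch,
assuming a simple "<object tag>__<function>" scheme; both the scheme
and the helper name are assumptions for illustration, not something
libbpf or libxdp provides:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<obj_tag>__<func>" into buf so that e.g. foo() from two
 * different firewall objects ends up with distinct names.
 * Returns 0 on success, -1 if the result would not fit in buf.
 */
int make_unique_name(char *buf, size_t size,
		     const char *obj_tag, const char *func)
{
	assert(buf != NULL && obj_tag != NULL && func != NULL);

	/* Two separator characters plus the terminating NUL must fit. */
	if (strlen(obj_tag) + 2 + strlen(func) >= size)
		return -1;

	snprintf(buf, size, "%s__%s", obj_tag, func);
	return 0;
}
```

Calling it with "first_firewall" and "foo" gives "first_firewall__foo",
and with "second_firewall" and "foo" gives "second_firewall__foo", so
both copies of foo() can coexist in a single name space.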

> The static re-linking is more work than dynamic re-linking because it needs to
> operate in a single name space of final .o. Whereas dynamic re-linking has
> an individual name space for every program loaded into the kernel.
> I'm hoping to share a prototype of dynamic re-linking soon.

Excellent! At the rate you're going, you'll have the dynamic re-linking
working before I get static linking done :)

-Toke