* bpf_helpers and you... some more...
@ 2019-10-30 19:03 Farid Zakaria
  2019-10-31  9:58 ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 2+ messages in thread
From: Farid Zakaria @ 2019-10-30 19:03 UTC (permalink / raw)
  To: Xdp

This is my attempt at a continuation of David's earlier e-mail:
https://www.spinics.net/lists/xdp-newbies/msg00179.html

I was curious how eBPF filters are wired up and how they work. The
heavy use of C macros makes the source code difficult for me to follow
(maybe there's a pre-processed version online?).
I'm hoping others find this exploratory dive insightful (and that it's
accurate enough).

Let's write a trivial eBPF filter (hello_world_kern.c) that prints
"hello world":

    #include <linux/bpf.h>

    #define __section(NAME) __attribute__((section(NAME), used))

    static char _license[] __section("license") = "GPL";

    /* helper functions called from eBPF programs written in C */
    static int (*bpf_trace_printk)(const char *fmt, int fmt_size,
                                ...) = (void *)BPF_FUNC_trace_printk;

    __section("hello_world") int hello_world_filter(struct __sk_buff *skb) {
        /* trailing newline makes this 13 bytes, matching the IR below */
        char msg[] = "hello world\n";
        bpf_trace_printk(msg, sizeof(msg));
        return 0;
    }

Compiling it with the command below emits LLVM IR that we can inspect:

    clang -S -emit-llvm -o hello_world_kern.ll hello_world_kern.c

The few lines that stand out are:

    @bpf_trace_printk = internal global i32 (i8*, i32, ...)* inttoptr (i64 6 to i32 (i8*, i32, ...)*), align 8
    ....
    %6 = load i32 (i8*, i32, ...)*, i32 (i8*, i32, ...)** @bpf_trace_printk, align 8
    %7 = getelementptr inbounds [13 x i8], [13 x i8]* %3, i32 0, i32 0
    %8 = call i32 (i8*, i32, ...) %6(i8* %7, i32 13)

The above demonstrates that the value of BPF_FUNC_trace_printk is
simply the integer 6, which is cast to a function pointer.
Sure enough, `trace_printk` has the value 6 in the enumeration of
known BPF helpers (counting from BPF_FUNC_unspec, which is 0).
(https://elixir.bootlin.com/linux/v5.3.7/source/include/uapi/linux/bpf.h#L2724)
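
To make the numbering concrete, here is a small user-space sketch. It
mirrors only the first few entries of the kernel's helper enumeration
and reuses the function-pointer idiom from the filter. The DEMO_/demo_
names are mine and are not kernel identifiers; the pointer is never
called, only converted back to an integer.

```c
#include <stdint.h>

/* Demo-only mirror of the first entries of enum bpf_func_id.
 * The DEMO_ prefix avoids clashing with <linux/bpf.h>. */
enum demo_bpf_func_id {
    DEMO_FUNC_unspec,          /* 0 */
    DEMO_FUNC_map_lookup_elem, /* 1 */
    DEMO_FUNC_map_update_elem, /* 2 */
    DEMO_FUNC_map_delete_elem, /* 3 */
    DEMO_FUNC_probe_read,      /* 4 */
    DEMO_FUNC_ktime_get_ns,    /* 5 */
    DEMO_FUNC_trace_printk,    /* 6 */
};

/* Same idiom as in the filter: a function pointer whose "address" is
 * just the helper ID. It must never be called from user space. */
static int (*demo_trace_printk)(const char *fmt, int fmt_size, ...) =
    (int (*)(const char *, int, ...))(uintptr_t)DEMO_FUNC_trace_printk;

/* Recover the helper ID hidden in the pointer value. */
static long demo_helper_id(void)
{
    return (long)(uintptr_t)demo_trace_printk;
}
```

Calling demo_helper_id() gives back the enumerator's value, which is
the same number that shows up in the `inttoptr (i64 6 ...)` line of the
IR.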

We can go even further and turn this LLVM IR into human-readable eBPF
assembly using `llc`:

    llc hello_world_kern.ll -march=bpf

Depending on the optimization level of the earlier `clang` call you
may see different results; with `-O3`, the output contains:

    call 6

Great! So we know that the call to `bpf_trace_printk` gets translated
into a call instruction with an immediate value of 6.
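
As a rough sketch of what `call 6` looks like as bytecode: every eBPF
instruction is 8 bytes, and a helper call uses the jump class with the
call operation (opcode 0x85) and carries the helper ID in the immediate
field. The struct below is a local copy of the layout of `struct
bpf_insn`, and the demo_ names are hypothetical, so that it builds
without kernel headers.

```c
#include <stdint.h>

/* Local mirror of the 8-byte eBPF instruction layout. */
struct demo_bpf_insn {
    uint8_t code;        /* opcode */
    uint8_t dst_reg : 4; /* destination register */
    uint8_t src_reg : 4; /* source register */
    int16_t off;         /* signed offset */
    int32_t imm;         /* signed immediate */
};

#define DEMO_BPF_JMP  0x05 /* instruction class: jump */
#define DEMO_BPF_CALL 0x80 /* jump-class operation: call */

/* Build the instruction that `call <helper_id>` assembles to. */
static struct demo_bpf_insn demo_emit_call(int32_t helper_id)
{
    struct demo_bpf_insn insn = {
        .code = DEMO_BPF_JMP | DEMO_BPF_CALL,
        .imm = helper_id,
    };
    return insn;
}

/* A helper call has src_reg == 0; a non-zero src_reg marks other
 * kinds of call (for example BPF-to-BPF calls). */
static int demo_is_helper_call(const struct demo_bpf_insn *insn)
{
    return insn->code == (DEMO_BPF_JMP | DEMO_BPF_CALL) &&
           insn->src_reg == 0;
}
```

demo_emit_call(6) yields an instruction whose opcode is 0x85 and whose
imm is 6, which is the field the verifier later rewrites.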

How does it end up calling code within the kernel though?
Once the verifier has checked the bytecode, it calls `fixup_bpf_calls`
(https://elixir.bootlin.com/linux/v5.3.8/source/kernel/bpf/verifier.c#L8869)
which walks all the instructions and rewrites the immediate value of
each helper call:

    fixup_bpf_calls(...) {
        ...
        patch_call_imm:
            fn = env->ops->get_func_proto(insn->imm, env->prog);
            /* all functions that have prototype and verifier allowed
            * programs to call them, must be real in-kernel functions
            */
            if (!fn->func) {
                verbose(env,
                    "kernel subsystem misconfigured func %s#%d\n",
                    func_id_name(insn->imm), insn->imm);
                return -EFAULT;
            }
            insn->imm = fn->func - __bpf_call_base;

N.B. I haven't deciphered how __bpf_call_base is used / works
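
One possible reading, offered as a sketch rather than a confirmed
answer: `__bpf_call_base` looks like a reference address, so that the
32-bit imm can hold "helper address minus base", and the interpreter in
kernel/bpf/core.c appears to undo this by calling
`(__bpf_call_base + insn->imm)(...)`. The user-space code below imitates
that round trip with ordinary functions. All demo_ names are
hypothetical, and it assumes both functions live close enough in memory
for the difference to fit in 32 bits.

```c
#include <stdint.h>

typedef uint64_t (*demo_helper_fn)(uint64_t r1, uint64_t r2, uint64_t r3,
                                   uint64_t r4, uint64_t r5);

/* Stand-in for __bpf_call_base: only its address matters. */
static uint64_t demo_call_base(uint64_t r1, uint64_t r2, uint64_t r3,
                               uint64_t r4, uint64_t r5)
{
    (void)r1; (void)r2; (void)r3; (void)r4; (void)r5;
    return 0;
}

/* Stand-in for a real in-kernel helper. */
static uint64_t demo_helper(uint64_t r1, uint64_t r2, uint64_t r3,
                            uint64_t r4, uint64_t r5)
{
    (void)r3; (void)r4; (void)r5;
    return r1 + r2;
}

/* Fixup step: replace the helper ID with a base-relative offset. */
static int32_t demo_fixup_imm(demo_helper_fn fn)
{
    return (int32_t)((intptr_t)fn - (intptr_t)demo_call_base);
}

/* Interpreter step: add the base back and call the result. */
static uint64_t demo_interp_call(int32_t imm, uint64_t r1, uint64_t r2,
                                 uint64_t r3, uint64_t r4, uint64_t r5)
{
    demo_helper_fn fn = (demo_helper_fn)((intptr_t)demo_call_base + imm);
    return fn(r1, r2, r3, r4, r5);
}
```

Feeding the result of demo_fixup_imm(demo_helper) into
demo_interp_call() ends up in demo_helper, which is the effect the imm
rewrite seems to be after: the instruction no longer names a helper ID
but points (relatively) at real code.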

`get_func_proto` is a callback provided by each subsystem (net, for
example) that returns the function prototype registered for a helper.
(https://elixir.bootlin.com/linux/v5.3.8/source/net/core/filter.c#L5991)
Inside the callback it is a simple switch statement that maps the
numeric helper ID to the matching function prototype.
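
The shape of that lookup can be sketched as follows. This is a
cut-down illustration, not the kernel code: the struct keeps only a
function pointer and a name, and every demo_ identifier is made up for
the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*demo_helper_fn)(uint64_t r1, uint64_t r2, uint64_t r3,
                                   uint64_t r4, uint64_t r5);

/* Cut-down stand-in for a helper prototype record. */
struct demo_func_proto {
    demo_helper_fn func; /* the real function behind the helper ID */
    const char *name;    /* for diagnostics only */
};

static uint64_t demo_trace_printk_impl(uint64_t r1, uint64_t r2, uint64_t r3,
                                       uint64_t r4, uint64_t r5)
{
    (void)r1; (void)r2; (void)r3; (void)r4; (void)r5;
    return 0;
}

static const struct demo_func_proto demo_trace_printk_proto = {
    .func = demo_trace_printk_impl,
    .name = "trace_printk",
};

/* Map a numeric helper ID to its prototype, or NULL if this
 * (imaginary) program type does not offer that helper. */
static const struct demo_func_proto *demo_get_func_proto(int func_id)
{
    switch (func_id) {
    case 6: /* trace_printk */
        return &demo_trace_printk_proto;
    default:
        return NULL;
    }
}
```

A NULL result here plays the role of "this program type may not call
that helper"; a non-NULL result gives the `func` pointer that the fixup
step then turns into the new immediate.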

I'd love to see more on the code path that handles interpreted vs
JIT-compiled instructions.
For the net subsystem, I can see where the eBPF program is invoked
(https://elixir.bootlin.com/linux/v5.3.8/source/net/core/filter.c#L119),
but it's difficult to work out how the choice between executing the
function directly (in the JIT case) and running it through the
interpreter is made.
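
My tentative understanding, which I have not verified end to end, is
that the program object carries a single function pointer
(`prog->bpf_func`) that is set once at load time to either the
interpreter or the JIT-generated code, so the call site never has to
choose. The sketch below models that idea in user space; the struct and
all demo_ names are hypothetical, and the "JIT image" is just an
ordinary C function standing in for generated machine code.

```c
#include <stdint.h>

struct demo_insn {
    int32_t imm;
};

struct demo_prog {
    /* Single entry point, however the program ends up being run. */
    unsigned int (*bpf_func)(const void *ctx, const struct demo_insn *insn);
    int jited;
    const struct demo_insn *insnsi;
};

/* Pretend interpreter: "executes" by returning the first immediate. */
static unsigned int demo_interpreter(const void *ctx,
                                     const struct demo_insn *insn)
{
    (void)ctx;
    return (unsigned int)insn[0].imm;
}

/* Pretend JIT output for a program whose first immediate is 7. */
static unsigned int demo_jit_image(const void *ctx,
                                   const struct demo_insn *insn)
{
    (void)ctx; (void)insn;
    return 7;
}

/* Load-time decision: default to the interpreter, switch if a JIT
 * is available. */
static void demo_select_runtime(struct demo_prog *prog, int jit_available)
{
    prog->bpf_func = demo_interpreter;
    prog->jited = 0;
    if (jit_available) {
        prog->bpf_func = demo_jit_image;
        prog->jited = 1;
    }
}

/* Run-time call site: identical in both cases. */
static unsigned int demo_prog_run(const struct demo_prog *prog,
                                  const void *ctx)
{
    return prog->bpf_func(ctx, prog->insnsi);
}
```

With this layout the invocation in the net path would look the same
whether or not the JIT is enabled; only the target of the pointer
differs.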
