* [BUG] verifier escape with iteration helpers (bpf_loop, ...)
@ 2023-07-07 14:04 Andrew Werner
  2023-07-07 16:44 ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Andrew Werner @ 2023-07-07 14:04 UTC (permalink / raw)
  To: bpf
  Cc: Andrei Matei, Tamir Duberstein, joannelkoong, kernel-team,
	andrii.nakryiko, song

My coworkers and I have been exploring tracing use cases with bpf, and
we uncovered some programs we did not expect to pass verification.
We believe the verifier's logic is incorrect for the
`bpf_loop` and `bpf_for_each_map_elem` helpers (I suspect also
`bpf_user_ringbuf_drain`, but I haven't verified it). These helpers are
interesting in that they take a callback function and a context pointer
that is passed into the callback; they are the helpers that perform some
form of iteration. The observation is that the verifier doesn't take into
account possible changes the callback makes to the context. I'm interested
in understanding better how the verifier is supposed to work in these cases,
and in collaborating on fixing the underlying problem. Consider the following
probes (note: this will crash the machine if triggered):

```C
#include <linux/bpf.h>
#include <linux/ptrace.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

typedef struct context {
    char* buf;
} context_t;

static long loop_callback(__u32 idx, context_t* ctx)
{
    if (idx == 0) {
        ctx->buf = (char*)(0xDEAD);
        return 0;
    }
    if (bpf_probe_read_user(ctx->buf, 8, (void*)(0xBADC0FFEE))) {
        return 1;
    }
    return 0;
}

SEC("uprobe/bpf_loop_bug")
int BPF_KPROBE(bpf_loop_bug, void* foo_ptr)
{
    __u64 buf = 0;
    context_t context = (context_t) { .buf = (char*)(&buf) };
    bpf_loop(100, loop_callback, &context, 0);
    return 0;
}
```

As far as I can tell, the verifier only verifies the callback against the
state of the context value as it is initially passed in. This can allow
pointers to be erroneously treated as valid pointers into map data when in
reality they can point anywhere. The example probes use this to write
arbitrary data to arbitrary memory addresses.

From my reading of the verifier code, it seems that exactly one
iteration of the loop is checked. The important call, as I understand it, is
`__check_func_call`[1]. That logic adjusts the verifier state
to move the arguments into the appropriate registers, then continues
verification into the body of the function; but when the verifier hits
a return, it simply goes back to the caller rather than verifying the next
iteration of the loop. Am I missing something?

When it comes to fixing the problem, I don't quite know where to start.
Perhaps these iteration callbacks ought to be treated more like global functions
-- you can't always make assumptions about the state of the data behind the
context pointer. Treating the context pointer as totally opaque seems bad from
a usability perspective. Maybe there's a way to attempt to verify the callback
body while treating all or part of the context as read-only, and then, if that
fails, go back and assume nothing about that part of the context structure.
What is the right way to think about plugging this hole?

Somewhat tangentially, it seems that if I later interact with data in the
context that may have been mutated, I get a rather opaque verifier error about
the context type itself: `reg type unsupported for arg#1 function
loop_callback#33`. I'd be interested to understand more deeply why this combined
version fails verification, and how to improve the error. See the
following probe:

```C
SEC("uprobe/both")
int BPF_KPROBE(bpf_for_each_map_elem_bug, void *foo_ptr) {
    __u64 buf = 0;
    context_t context = (context_t) {.buf = (char *)(&buf) };
    bpf_for_each_map_elem(&ARRAY, for_each_map_elem_callback, &context, 0);
    bpf_loop(100, loop_callback, &context, 0);
    return 0;
}
```

Just for completeness, here's one using `bpf_for_each_map_elem`:

```C
struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __type(key, __u32);
    __type(value, __u64);
    __uint(max_entries, 100);
} ARRAY SEC(".maps");

static long for_each_map_elem_callback(
    struct bpf_map* map,
    const __u32* key,
    __u64* value,
    context_t* ctx
) {
    if (*(__u64*)(ctx->buf) == 0) {
        ctx->buf = (char*)(0xDEAD);
        return 0;
    }
    if (bpf_probe_read_user(ctx->buf, 8, (void*)(0xBADC0FFEE))) {
        return 1;
    }
    return 0;
}

SEC("uprobe/for_each_map_elem_bug")
int BPF_KPROBE(for_each_map_elem_bug, void* foo_ptr)
{
    __u64 buf = 0;
    context_t context = (context_t) { .buf = (char*)(&buf) };
    bpf_for_each_map_elem(&ARRAY, for_each_map_elem_callback, &context, 0);
    bpf_loop(100, loop_callback, &context, 0);
    return 0;
}
```
[1]: https://github.com/torvalds/linux/blob/5133c9e51de41bfa902153888e11add3342ede18/kernel/bpf/verifier.c#L8668

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-07-07 14:04 [BUG] verifier escape with iteration helpers (bpf_loop, ...) Andrew Werner
@ 2023-07-07 16:44 ` Alexei Starovoitov
  2023-07-07 18:08   ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2023-07-07 16:44 UTC (permalink / raw)
  To: Andrew Werner
  Cc: bpf, Andrei Matei, Tamir Duberstein, Joanne Koong, kernel-team,
	Andrii Nakryiko, Song Liu

On Fri, Jul 7, 2023 at 7:04 AM Andrew Werner <awerner32@gmail.com> wrote:
>
> When it comes to fixing the problem, I don't quite know where to start.
> Perhaps these iteration callbacks ought to be treated more like global functions
> -- you can't always make assumptions about the state of the data in the context
> pointer. Treating the context pointer as totally opaque seems bad from
> a usability
> perspective. Maybe there's a way to attempt to verify the function
> body of the function
> by treating all or part of the context as read-only, and then if that
> fails, go back and
> assume nothing about that part of the context structure. What is the
> right way to
> think about plugging this hole?

'treat as global' might be a way to fix it, but it will likely break
some setups, since people pass pointers in a context and the current
global func verification doesn't support that.
I think the simplest fix would be to disallow writes into any pointers
within a ctx. Writes to scalars should still be allowed.
Much more complex fix would be to verify callbacks as
process_iter_next_call(). See giant comment next to it.
But since we need a fix for stable I'd try the simple approach first.
Could you try to implement that?


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-07-07 16:44 ` Alexei Starovoitov
@ 2023-07-07 18:08   ` Andrii Nakryiko
  2023-07-07 18:21     ` Andrew Werner
  2023-09-17 21:37     ` Eduard Zingerman
  0 siblings, 2 replies; 52+ messages in thread
From: Andrii Nakryiko @ 2023-07-07 18:08 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrew Werner, bpf, Andrei Matei, Tamir Duberstein, Joanne Koong,
	kernel-team, Song Liu

On Fri, Jul 7, 2023 at 9:44 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Jul 7, 2023 at 7:04 AM Andrew Werner <awerner32@gmail.com> wrote:
> >
> > When it comes to fixing the problem, I don't quite know where to start.
> > Perhaps these iteration callbacks ought to be treated more like global functions
> > -- you can't always make assumptions about the state of the data in the context
> > pointer. Treating the context pointer as totally opaque seems bad from
> > a usability
> > perspective. Maybe there's a way to attempt to verify the function
> > body of the function
> > by treating all or part of the context as read-only, and then if that
> > fails, go back and
> > assume nothing about that part of the context structure. What is the
> > right way to
> > think about plugging this hole?
>
> 'treat as global' might be a way to fix it, but it will likely break
> some setups, since people passing pointers in a context and current
> global func verification doesn't support that.

yeah, the intended use case is to return something from callbacks
through context pointer. So it will definitely break real uses.

> I think the simplest fix would be to disallow writes into any pointers
> within a ctx. Writes to scalars should still be allowed.

It might be a partial mitigation, but even with SCALARs there could be
problems due to assumed SCALAR range, which will be invalid if
callback is never called or called many times.

> Much more complex fix would be to verify callbacks as
> process_iter_next_call(). See giant comment next to it.

yep, that seems like the right way forward. We need to simulate 0, 1,
2, .... executions of callbacks until we validate and exhaust all
possible context modification permutations, just like open-coded
iterators do it

> But since we need a fix for stable I'd try the simple approach first.
> Could you try to implement that?


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-07-07 18:08   ` Andrii Nakryiko
@ 2023-07-07 18:21     ` Andrew Werner
  2023-09-17 21:37     ` Eduard Zingerman
  1 sibling, 0 replies; 52+ messages in thread
From: Andrew Werner @ 2023-07-07 18:21 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, bpf, Andrei Matei, Tamir Duberstein,
	Joanne Koong, kernel-team, Song Liu

On Fri, Jul 7, 2023 at 2:08 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Jul 7, 2023 at 9:44 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Fri, Jul 7, 2023 at 7:04 AM Andrew Werner <awerner32@gmail.com> wrote:
> > >
> > > When it comes to fixing the problem, I don't quite know where to start.
> > > Perhaps these iteration callbacks ought to be treated more like global functions
> > > -- you can't always make assumptions about the state of the data in the context
> > > pointer. Treating the context pointer as totally opaque seems bad from
> > > a usability
> > > perspective. Maybe there's a way to attempt to verify the function
> > > body of the function
> > > by treating all or part of the context as read-only, and then if that
> > > fails, go back and
> > > assume nothing about that part of the context structure. What is the
> > > right way to
> > > think about plugging this hole?
> >
> > 'treat as global' might be a way to fix it, but it will likely break
> > some setups, since people passing pointers in a context and current
> > global func verification doesn't support that.
>
> yeah, the intended use case is to return something from callbacks
> through context pointer. So it will definitely break real uses.

Definitely, this would break the probes we're using.

>
> > I think the simplest fix would be to disallow writes into any pointers
> > within a ctx. Writes to scalars should still be allowed.
>
> It might be a partial mitigation, but even with SCALARs there could be
> problems due to assumed SCALAR range, which will be invalid if
> callback is never called or called many times.

Indeed, when the bug was first found, it was because an offset scalar
in the context was modified and then added to an unmodified pointer.

> > Much more complex fix would be to verify callbacks as
> > process_iter_next_call(). See giant comment next to it.
>
> yep, that seems like the right way forward. We need to simulate 0, 1,
> 2, .... executions of callbacks until we validate and exhaust all
> possible context modification permutations, just like open-coded
> iterators do it

I'll give process_iter_next_call() a shot, and see if I run into trouble.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-07-07 18:08   ` Andrii Nakryiko
  2023-07-07 18:21     ` Andrew Werner
@ 2023-09-17 21:37     ` Eduard Zingerman
  2023-09-17 22:09       ` Kumar Kartikeya Dwivedi
  2023-09-19 23:14       ` Andrii Nakryiko
  1 sibling, 2 replies; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-17 21:37 UTC (permalink / raw)
  To: Andrii Nakryiko, Alexei Starovoitov
  Cc: Andrew Werner, bpf, Andrei Matei, Tamir Duberstein, Joanne Koong,
	kernel-team, Song Liu

On Fri, 2023-07-07 at 11:08 -0700, Andrii Nakryiko wrote:
> On Fri, Jul 7, 2023 at 9:44 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> > 
> > On Fri, Jul 7, 2023 at 7:04 AM Andrew Werner <awerner32@gmail.com> wrote:
> > > 
> > > When it comes to fixing the problem, I don't quite know where to start.
> > > Perhaps these iteration callbacks ought to be treated more like global functions
> > > -- you can't always make assumptions about the state of the data in the context
> > > pointer. Treating the context pointer as totally opaque seems bad from
> > > a usability
> > > perspective. Maybe there's a way to attempt to verify the function
> > > body of the function
> > > by treating all or part of the context as read-only, and then if that
> > > fails, go back and
> > > assume nothing about that part of the context structure. What is the
> > > right way to
> > > think about plugging this hole?
> > 
> > 'treat as global' might be a way to fix it, but it will likely break
> > some setups, since people passing pointers in a context and current
> > global func verification doesn't support that.
> 
> yeah, the intended use case is to return something from callbacks
> through context pointer. So it will definitely break real uses.
> 
> > I think the simplest fix would be to disallow writes into any pointers
> > within a ctx. Writes to scalars should still be allowed.
> 
> It might be a partial mitigation, but even with SCALARs there could be
> problems due to assumed SCALAR range, which will be invalid if
> callback is never called or called many times.
> 
> > Much more complex fix would be to verify callbacks as
> > process_iter_next_call(). See giant comment next to it.
> 
> yep, that seems like the right way forward. We need to simulate 0, 1,
> 2, .... executions of callbacks until we validate and exhaust all
> possible context modification permutations, just like open-coded
> iterators do it
> 
> > But since we need a fix for stable I'd try the simple approach first.
> > Could you try to implement that?

Hi All,

This issue seems stalled, so I took a look over the weekend.
I have a work-in-progress fix; please let me know if you don't agree
with the direction I'm taking or if I'm stepping on anyone's toes.

After some analysis I decided to go with Alexei's suggestion and
implement something similar to iterators for a selected set of helper
functions that accept "looping" callbacks, such as:
- bpf_loop
- bpf_for_each_map_elem
- bpf_user_ringbuf_drain
- bpf_timer_set_callback (?)

The sketch of the fix looks as follows:
- extend struct bpf_func_state with callback iterator state information:
  - active/non-active flag
  - depth
  - r1-r5 state at the entry to callback;
- extend __check_func_call() to setup callback iterator state when
  call to suitable helper function is processed;
- extend BPF_EXIT processing (prepare_func_exit()) to queue new
  callback visit upon return from iterating callback
  (similar to process_iter_next_call());
- extend is_state_visited() to account for callback iterator hits in a
  way similar to regular iterators;
- extend backtrack_insn() to correctly react to jumps from callback
  exit to callback entry.
  
I have a patch (at the end of this email) that correctly recognizes
the bpf_loop example in this thread as unsafe. However, this patch has
a few deficiencies:

- verif_scale_strobemeta_bpf_loop selftest is not passing, because
  the read_map_var function invoked from the callback continuously
  increments the 'payload' pointer used in subsequent iterations.

  In order to handle such code I need to add upper-bound tracking
  for the iteration depth (when such a bound can be deduced).

- loop detection is broken for a simple callback like the one below:

  static int loop_callback_infinite(__u32 idx, __u64 *ctx)
  {
      for (;;)
          (*ctx)++;
      return 0;
  }
  
  To handle such code I need to change is_state_visited() to do
  callback iterator loop/hit detection on exit from the callback
  (returns are not prune points at the moment); currently it is done
  on entry.

- the callback below currently causes state explosion:

  static int precise1_callback(__u32 idx, struct precise1_ctx *ctx)
  {
      if (idx == 0)
          ctx->a = 1;
      if (idx == 1 && ctx->a == 1)
          ctx->b = 1;
      return 0;
  }
  
  I'm not sure yet what to do about this; there are several possibilities:
  - tweak the order in which states are visited (need to think about it);
  - check states in bpf_verifier_env::head (not explored yet) for
    equivalence and avoid enqueuing duplicate states.
    
I'll proceed with addressing the issues above on Monday.

Thanks,
Eduard

---

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index a3236651ec64..5589f55e42ba 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -70,6 +70,17 @@ enum bpf_iter_state {
 	BPF_ITER_STATE_DRAINED,
 };
 
+struct bpf_iter {
+	/* BTF container and BTF type ID describing
+	 * struct bpf_iter_<type> of an iterator state
+	 */
+	struct btf *btf;
+	u32 btf_id;
+	/* packing following two fields to fit iter state into 16 bytes */
+	enum bpf_iter_state state:2;
+	int depth:30;
+};
+
 struct bpf_reg_state {
 	/* Ordering of fields matters.  See states_equal() */
 	enum bpf_reg_type type;
@@ -115,16 +126,7 @@ struct bpf_reg_state {
 		} dynptr;
 
 		/* For bpf_iter stack slots */
-		struct {
-			/* BTF container and BTF type ID describing
-			 * struct bpf_iter_<type> of an iterator state
-			 */
-			struct btf *btf;
-			u32 btf_id;
-			/* packing following two fields to fit iter state into 16 bytes */
-			enum bpf_iter_state state:2;
-			int depth:30;
-		} iter;
+		struct bpf_iter iter;
 
 		/* Max size from any of the above. */
 		struct {
@@ -300,6 +302,8 @@ struct bpf_func_state {
 	bool in_callback_fn;
 	struct tnum callback_ret_range;
 	bool in_async_callback_fn;
+	struct bpf_iter callback_iter;
+	struct bpf_reg_state callback_entry_state[MAX_BPF_REG];
 
 	/* The following fields should be last. See copy_func_state() */
 	int acquired_refs;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index dbba2b806017..e79a4bec4461 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2357,6 +2357,7 @@ static void init_func_state(struct bpf_verifier_env *env,
 	state->callback_ret_range = tnum_range(0, 0);
 	init_reg_state(env, state);
 	mark_verifier_state_scratched(env);
+	state->callback_iter.state = BPF_ITER_STATE_INVALID;
 }
 
 /* Similar to push_stack(), but for async callbacks */
@@ -3599,6 +3600,39 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
 			}
 		} else if (opcode == BPF_EXIT) {
 			bool r0_precise;
+			int subprog;
+
+			/* Jump from 'exit' to start of the same subprogram,
+			 * this happens for callback iteration, e.g.:
+			 *
+			 *   int cb_func(...) {
+			 *   start:
+			 *     ...
+			 *     return 0; // <-- BPF_EXIT processing in do_check()
+			 *               //     adds a state with IP set to 'start'.
+			 *   }
+			 *   bpf_loop(..., cb_func, ...);
+			 *
+			 * Clear r1-r5 as in the callback case above, but do
+			 * not change frame number.
+			 */
+			if (subseq_idx >= 0 &&
+			    ((subprog = find_subprog(env, subseq_idx)) >= 0) &&
+			    idx >= subseq_idx &&
+			    (subprog >= env->subprog_cnt ||
+			     idx < env->subprog_info[subprog + 1].start)) {
+				if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) {
+					verbose(env, "BUG regs %x\n", bt_reg_mask(bt));
+					WARN_ONCE(1, "verifier backtracking bug");
+					return -EFAULT;
+				}
+				if (bt_stack_mask(bt) != 0)
+					return -ENOTSUPP;
+				/* clear r1-r5 in callback subprog's mask */
+				for (i = BPF_REG_1; i <= BPF_REG_5; i++)
+					bt_clear_reg(bt, i);
+				return 0;
+			}
 
 			if (bt_reg_mask(bt) & BPF_REGMASK_ARGS) {
 				/* if backtracing was looking for registers R1-R5
@@ -8869,13 +8903,17 @@ static int set_callee_state(struct bpf_verifier_env *env,
 			    struct bpf_func_state *caller,
 			    struct bpf_func_state *callee, int insn_idx);
 
+static int set_loop_callback_state(struct bpf_verifier_env *env,
+				   struct bpf_func_state *caller,
+				   struct bpf_func_state *callee, int insn_idx);
+
 static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			     int *insn_idx, int subprog,
 			     set_callee_state_fn set_callee_state_cb)
 {
 	struct bpf_verifier_state *state = env->cur_state;
 	struct bpf_func_state *caller, *callee;
-	int err;
+	int err, i;
 
 	if (state->curframe + 1 >= MAX_CALL_FRAMES) {
 		verbose(env, "the call stack of %d frames is too deep\n",
@@ -8972,7 +9010,6 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 			*insn_idx /* callsite */,
 			state->curframe + 1 /* frameno within this callchain */,
 			subprog /* subprog number within this prog */);
-
 	/* Transfer references to the callee */
 	err = copy_reference_state(callee, caller);
 	if (err)
@@ -8982,6 +9019,14 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 	if (err)
 		goto err_out;
 
+	if (set_callee_state_cb == set_loop_callback_state) {
+		callee->callback_iter.state = BPF_ITER_STATE_ACTIVE;
+		callee->callback_iter.depth = 0;
+		for (i = BPF_REG_0; i < MAX_BPF_REG; ++i)
+			copy_register_state(&callee->callback_entry_state[i],
+					    &callee->regs[i]);
+	}
+
 	clear_caller_saved_regs(env, caller->regs);
 
 	/* only increment it after check_reg_arg() finished */
@@ -9256,7 +9301,7 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
 	struct bpf_verifier_state *state = env->cur_state;
 	struct bpf_func_state *caller, *callee;
 	struct bpf_reg_state *r0;
-	int err;
+	int err, i;
 
 	callee = state->frame[state->curframe];
 	r0 = &callee->regs[BPF_REG_0];
@@ -9301,6 +9346,25 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
 			return err;
 	}
 
+	if (callee->in_callback_fn && callee->callback_iter.state == BPF_ITER_STATE_ACTIVE) {
+		struct bpf_verifier_state *queued_st;
+		struct bpf_func_state *queued_callee;
+		struct bpf_iter *queued_iter;
+
+		queued_st = push_stack(env, env->subprog_info[callee->subprogno].start,
+				       *insn_idx, false);
+		if (!queued_st)
+			return -ENOMEM;
+
+		queued_callee = queued_st->frame[callee->frameno];
+		queued_iter = &queued_callee->callback_iter;
+		queued_iter->depth++;
+
+		for (i = BPF_REG_0; i < MAX_BPF_REG; ++i)
+			copy_register_state(&queued_callee->regs[i],
+					    &queued_callee->callback_entry_state[i]);
+	}
+
 	*insn_idx = callee->callsite + 1;
 	if (env->log.level & BPF_LOG_LEVEL) {
 		verbose(env, "returning from callee:\n");
@@ -16112,6 +16176,15 @@ static bool is_iter_next_insn(struct bpf_verifier_env *env, int insn_idx)
 	return env->insn_aux_data[insn_idx].is_iter_next;
 }
 
+static bool is_callback_iter_entry(struct bpf_verifier_env *env, int insn_idx)
+{
+	struct bpf_func_state *cur_frame = cur_func(env);
+
+	return cur_frame->in_callback_fn &&
+	       env->subprog_info[cur_frame->subprogno].start == insn_idx &&
+	       cur_frame->callback_iter.state != BPF_ITER_STATE_INVALID;
+}
+
 /* is_state_visited() handles iter_next() (see process_iter_next_call() for
  * terminology) calls specially: as opposed to bounded BPF loops, it *expects*
  * states to match, which otherwise would look like an infinite loop. So while
@@ -16190,6 +16263,9 @@ static bool iter_active_depths_differ(struct bpf_verifier_state *old, struct bpf
 			if (cur_slot->iter.depth != slot->iter.depth)
 				return true;
 		}
+		if (state->callback_iter.state == BPF_ITER_STATE_ACTIVE &&
+		    state->callback_iter.depth != cur->frame[fr]->callback_iter.depth)
+			return true;
 	}
 	return false;
 }
@@ -16277,6 +16353,12 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 				}
 				goto skip_inf_loop_check;
 			}
+			if (is_callback_iter_entry(env, insn_idx)) {
+				if (states_equal(env, &sl->state, cur) &&
+				    cur_func(env)->callback_iter.state == BPF_ITER_STATE_ACTIVE)
+					goto hit;
+				goto skip_inf_loop_check;
+			}
 			/* attempt to detect infinite loop to avoid unnecessary doomed work */
 			if (states_maybe_looping(&sl->state, cur) &&
 			    states_equal(env, &sl->state, cur) &&



* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-17 21:37     ` Eduard Zingerman
@ 2023-09-17 22:09       ` Kumar Kartikeya Dwivedi
  2023-09-18 13:06         ` Eduard Zingerman
  2023-09-19 23:14       ` Andrii Nakryiko
  1 sibling, 1 reply; 52+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2023-09-17 22:09 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Alexei Starovoitov, Andrew Werner, bpf,
	Andrei Matei, Tamir Duberstein, Joanne Koong, kernel-team,
	Song Liu

On Sun, 17 Sept 2023 at 23:37, Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2023-07-07 at 11:08 -0700, Andrii Nakryiko wrote:
> > On Fri, Jul 7, 2023 at 9:44 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Fri, Jul 7, 2023 at 7:04 AM Andrew Werner <awerner32@gmail.com> wrote:
> > > >
> > > > When it comes to fixing the problem, I don't quite know where to start.
> > > > Perhaps these iteration callbacks ought to be treated more like global functions
> > > > -- you can't always make assumptions about the state of the data in the context
> > > > pointer. Treating the context pointer as totally opaque seems bad from
> > > > a usability
> > > > perspective. Maybe there's a way to attempt to verify the function
> > > > body of the function
> > > > by treating all or part of the context as read-only, and then if that
> > > > fails, go back and
> > > > assume nothing about that part of the context structure. What is the
> > > > right way to
> > > > think about plugging this hole?
> > >
> > > 'treat as global' might be a way to fix it, but it will likely break
> > > some setups, since people passing pointers in a context and current
> > > global func verification doesn't support that.
> >
> > yeah, the intended use case is to return something from callbacks
> > through context pointer. So it will definitely break real uses.
> >
> > > I think the simplest fix would be to disallow writes into any pointers
> > > within a ctx. Writes to scalars should still be allowed.
> >
> > It might be a partial mitigation, but even with SCALARs there could be
> > problems due to assumed SCALAR range, which will be invalid if
> > callback is never called or called many times.
> >
> > > Much more complex fix would be to verify callbacks as
> > > process_iter_next_call(). See giant comment next to it.
> >
> > yep, that seems like the right way forward. We need to simulate 0, 1,
> > 2, .... executions of callbacks until we validate and exhaust all
> > possible context modification permutations, just like open-coded
> > iterators do it
> >
> > > But since we need a fix for stable I'd try the simple approach first.
> > > Could you try to implement that?
>
> Hi All,
>
> This issue seems stalled, so I took a look over the weekend.
> I have a work in progress fix, please let me know if you don't agree
> with direction I'm taking or if I'm stepping on anyone's toes.
>

I was planning to get to this eventually, but it's great if you are
looking to do it.

> After some analysis I decided to go with Alexei's suggestion and
> implement something similar to iterators for selected set of helper
> functions that accept "looping" callbacks, such as:
> - bpf_loop
> - bpf_for_each_map_elem
> - bpf_user_ringbuf_drain
> - bpf_timer_set_callback (?)
>

The last one won't require this, I think. The callback only runs once,
and asynchronously.
Others you are missing are bpf_find_vma and the bpf_rbtree_add kfunc.
While the latter is not an iterator, it can invoke the same callback
an unknown number of times.

The other major thing that needs to be checked is the case where the
callback executes zero times. There is some discussion on this in [0],
where this bug was originally reported. Basically, we need to explore
a path where the callback execution does not happen and ensure the
program is still safe.

[0]: https://lore.kernel.org/bpf/CAP01T74cOJzo3xQcW6weURH+mYRQ7kAWMqQOgtd_ymSbhrOoMQ@mail.gmail.com

You could also consider taking the selftests from this link; some of
them allow completely breaking the safety properties of the verifier.

> The sketch of the fix looks as follows:
> - extend struct bpf_func_state with callback iterator state information:
>   - active/non-active flag
>   - depth
>   - r1-r5 state at the entry to callback;
> - extend __check_func_call() to setup callback iterator state when
>   call to suitable helper function is processed;
> - extend BPF_EXIT processing (prepare_func_exit()) to queue new
>   callback visit upon return from iterating callback
>   (similar to process_iter_next_call());
> - extend is_state_visited() to account for callback iterator hits in a
>   way similar to regular iterators;
> - extend backtrack_insn() to correctly react to jumps from callback
>   exit to callback entry.
>
> I have a patch (at the end of this email) that correctly recognizes
> the bpf_loop example in this thread as unsafe. However this patch has
> a few deficiencies:
>
> - verif_scale_strobemeta_bpf_loop selftest is not passing, because
>   read_map_var function invoked from the callback continuously
>   increments 'payload' pointer used in subsequent iterations.
>
>   In order to handle such code I need to add an upper bound tracking
>   for iteration depth (when such bound could be deduced).
>

Yes, either the loop converges before the upper limit is hit, or we
keep unrolling it until we exhaust the deduced loop count passed to
the bpf_loop helper. But we will need to mark the loop count argument
register as precise when we make use of its value in such a case.

> - loop detection is broken for simple callback as below:
>
>   static int loop_callback_infinite(__u32 idx, __u64 *data)
>   {
>       for (;;)
>           (*ctx)++;
>       return 0;
>   }
>
>   To handle such code I need to change is_state_visited() to do
>   callback iterator loop/hit detection on exit from callback
>   (returns are not prune points at the moment), currently it is done
>   on entry.
>
> - the callback as below currently causes state explosion:
>
>   static int precise1_callback(__u32 idx, struct precise1_ctx *ctx)
>   {
>       if (idx == 0)
>           ctx->a = 1;
>       if (idx == 1 && ctx->a == 1)
>           ctx->b = 1;
>       return 0;
>   }
>
>   I'm not sure yet what to do about this, there are several possibilities:
>   - tweak the order in which states are visited (need to think about it);
>   - check states in bpf_verifier_env::head (not explored yet) for
>     equivalence and avoid enqueuing duplicate states.
>
>
> I'll proceed addressing the issues above on Monday.
>

I think there is a class of programs that are nevertheless going to be
broken now, as bpf_loop and others basically permitted incorrect code
to pass through. And the iteration until we arrive at a fixpoint won't
be able to cover all classes of loop bodies, so I think we will need
to make a tradeoff in terms of expressibility vs verifier complexity.
In general, cases like this with branches/conditional exit from the
loop body which do not converge will lead to state explosion anyway.

> Thanks,
> Eduard
>
> ---
>
> [...]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-17 22:09       ` Kumar Kartikeya Dwivedi
@ 2023-09-18 13:06         ` Eduard Zingerman
  2023-09-19 16:28           ` Eduard Zingerman
  0 siblings, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-18 13:06 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: Andrii Nakryiko, Alexei Starovoitov, Andrew Werner, bpf,
	Andrei Matei, Tamir Duberstein, Joanne Koong, kernel-team,
	Song Liu

On Mon, 2023-09-18 at 00:09 +0200, Kumar Kartikeya Dwivedi wrote:
[...]
> 
> I was planning to get to this eventually, but it's great if you are
> looking to do it.
> 
> > After some analysis I decided to go with Alexei's suggestion and
> > implement something similar to iterators for selected set of helper
> > functions that accept "looping" callbacks, such as:
> > - bpf_loop
> > - bpf_for_each_map_elem
> > - bpf_user_ringbuf_drain
> > - bpf_timer_set_callback (?)
> > 
> 
> The last one won't require this, I think. The callback only runs once,
> and asynchronously.
> Others you are missing are bpf_find_vma and the bpf_rbtree_add kfunc.
> While the latter is not an iterator, it can invoke the same callback
> an unknown number of times.

Noted, thank you.

> The other major thing that needs to be checked is cases where callback
> will be executed zero times. There is some discussion on this in [0],
> where this bug was reported originally. Basically, we need to explore
> a path where the callback execution does not happen and ensure the
> program is still safe.
> 
> [0]: https://lore.kernel.org/bpf/CAP01T74cOJzo3xQcW6weURH+mYRQ7kAWMqQOgtd_ymSbhrOoMQ@mail.gmail.com
> 
> You could also consider taking the selftests from this link, some of
> them allow completely breaking safety properties of the verifier.

Thank you, I missed this thread.

[...]
> > I have a patch (at the end of this email) that correctly recognizes
> > the bpf_loop example in this thread as unsafe. However this patch has
> > a few deficiencies:
> > 
> > - verif_scale_strobemeta_bpf_loop selftest is not passing, because
> >   read_map_var function invoked from the callback continuously
> >   increments 'payload' pointer used in subsequent iterations.
> > 
> >   In order to handle such code I need to add an upper bound tracking
> >   for iteration depth (when such bound could be deduced).
> > 
> 
> Yes, either the loop converges before the upper limit is hit, or we
> keep unrolling it until we exhaust the deduced loop count passed to
> the bpf_loop helper. But we will need to mark the loop count argument
> register as precise when we make use of its value in such a case.

Yes, marking the counter as precise makes sense, but I think it would
happen automatically. My current plan is to mark the index register as
bounded and use depth tracking only internally, to decide whether to
schedule another "active" iteration state. If that is not sufficient,
I will try assigning specific values to the loop counter (I hesitate
to do that, to avoid conflicts with convergence detection).

[...]
> > - the callback as below currently causes state explosion:
> > 
> >   static int precise1_callback(__u32 idx, struct precise1_ctx *ctx)
> >   {
> >       if (idx == 0)
> >           ctx->a = 1;
> >       if (idx == 1 && ctx->a == 1)
> >           ctx->b = 1;
> >       return 0;
> >   }
> > 
> >   I'm not sure yet what to do about this, there are several possibilities:
> >   - tweak the order in which states are visited (need to think about it);
> >   - check states in bpf_verifier_env::head (not explored yet) for
> >     equivalence and avoid enqueuing duplicate states.
> > 
> > 
> > I'll proceed addressing the issues above on Monday.
> > 
> 
> I think there is a class of programs that are nevertheless going to be
> broken now, as bpf_loop and others basically permitted incorrect code
> to pass through. And the iteration until we arrive at a fixpoint won't
> be able to cover all classes of loop bodies, so I think we will need
> to make a tradeoff in terms of expressibility vs verifier complexity.
> In general, cases like this with branches/conditional exit from the
> loop body which do not converge will lead to state explosion anyway.

This might be the case, however I found that specific issue with
"precise1_callback" could be resolved by using mark_force_checkpoint()
to mark iteration back-edges (same way it is done for regular iterators).
So maybe impact won't be that bad.

Thanks,
Eduard


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-18 13:06         ` Eduard Zingerman
@ 2023-09-19 16:28           ` Eduard Zingerman
  2023-09-19 23:02             ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-19 16:28 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

It looks like I hit a related but slightly different bug with bpf_iter_next().
Consider the following example:

    SEC("fentry/" SYS_PREFIX "sys_nanosleep")
    int num_iter_bug(const void *ctx) {
        struct bpf_iter_num it;                 // fp[-8] below
        __u64 val = 0;                          // fp[-16] in the below
        __u64 *ptr = &val;                      // r7 below
        __u64 rnd = bpf_get_current_pid_tgid(); // r6 below
        void *v;
    
        bpf_iter_num_new(&it, 0, 10);
        while ((v = bpf_iter_num_next(&it))) {
            rnd++;
            if (rnd == 42) {
                ptr = (void*)(0xdead);
                continue;
            }
            bpf_probe_read_user(ptr, 8, (void*)(0xdeadbeef));
        }
        bpf_iter_num_destroy(&it);
        return 0;
    }

(Unfortunately, it had to be converted to assembly to avoid compiler
 clobbering loop structure, complete test case is at the end of the email).
 
The example is not safe because of 0xdead being a possible `ptr` value.
However, currently it is marked as safe.

This happens because of states_equal() usage for iterator convergence
detection:

    static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
        ...
    	while (sl)
    		states_cnt++;
    		if (sl->state.insn_idx != insn_idx)
    			goto next;
    
    		if (sl->state.branches)
                ...
    			if (is_iter_next_insn(env, insn_idx)) {
    				if (states_equal(env, &sl->state, cur)) {
                        ...
    					if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE)
    						goto hit;
    				}
    				goto skip_inf_loop_check;
    			}
        ...

With some additional logging I see that the following states are
considered equal:

    13: (85) call bpf_iter_num_next#59908
    ...
    at is_iter_next_insn(insn 13):
      old state:
         R0=scalar() R1_rw=fp-8 R6_r=scalar(id=1) R7=fp-16 R10=fp0
         fp-8_r=iter_num(ref_id=2,state=active,depth=0) fp-16=00000000 refs=2
      cur state:
         R0=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) R1_w=fp-8 R6=42 R7_w=57005
         R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) fp-16=00000000 refs=2
    states_equal()?: true

Note that R7=fp-16 in old state vs R7_w=57005 in cur state.
The registers are considered equal because R7 does not have a read mark.
However, read marks are not yet finalized for the old state because
sl->state.branches != 0. (Note: precision marks are not finalized
either, which should also be a problem, but that requires another example.)

A possible fix is to add a special flag to states_equal() and to
conditionally ignore the logic related to liveness and precision when
this flag is set; the flag would be set for the is_iter_next_insn()
branch above.

---

/* BTF FUNC records are not generated for kfuncs referenced
 * from inline assembly. These records are necessary for
 * libbpf to link the program. The function below is a hack
 * to ensure that BTF FUNC records are generated.
 */
void __kfunc_btf_root(void)
{
	bpf_iter_num_new(0, 0, 0);
	bpf_iter_num_next(0);
	bpf_iter_num_destroy(0);
}

SEC("fentry/" SYS_PREFIX "sys_nanosleep")
__naked int num_iter_bug(const void *ctx)
{
	asm volatile (
		// r7 = &fp[-16]
		// fp[-16] = 0
		"r7 = r10;"
		"r7 += -16;"
		"r0 = 0;"
		"*(u64 *)(r7 + 0) = r0;"
		// r6 = bpf_get_current_pid_tgid()
		"call %[bpf_get_current_pid_tgid];"
		"r6 = r0;"
		// bpf_iter_num_new(&fp[-8], 0, 10)
		"r1 = r10;"
		"r1 += -8;"
		"r2 = 0;"
		"r3 = 10;"
		"call %[bpf_iter_num_new];"
		// while (bpf_iter_num_next(&fp[-8])) {
		//   r6++
		//   if (r6 == 42) {
		//     r7 = 0xdead
		//     continue;
		//   }
		//   bpf_probe_read_user(r7, 8, 0xdeadbeef)
		// }
	"1:"
		"r1 = r10;"
		"r1 += -8;"
		"call %[bpf_iter_num_next];"
		"if r0 == 0 goto 2f;"
		"r6 += 1;"
		"if r6 != 42 goto 3f;"
		"r7 = 0xdead;"
		"goto 1b;"
	"3:"
		"r1 = r7;"
		"r2 = 8;"
		"r3 = 0xdeadbeef;"
		"call %[bpf_probe_read_user];"
		"goto 1b;"
	"2:"
		// bpf_iter_num_destroy(&fp[-8])
		"r1 = r10;"
		"r1 += -8;"
		"call %[bpf_iter_num_destroy];"
		// return 0
		"r0 = 0;"
		"exit;"
		:
		: __imm(bpf_get_current_pid_tgid),
		  __imm(bpf_iter_num_new),
		  __imm(bpf_iter_num_next),
		  __imm(bpf_iter_num_destroy),
		  __imm(bpf_probe_read_user)
		: __clobber_all
	);
}


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-19 16:28           ` Eduard Zingerman
@ 2023-09-19 23:02             ` Andrii Nakryiko
  2023-09-20  0:19               ` Eduard Zingerman
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2023-09-19 23:02 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Tue, Sep 19, 2023 at 9:28 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> It looks like I hit a related but slightly different bug with bpf_iter_next().
> Consider the following example:
>
>     SEC("fentry/" SYS_PREFIX "sys_nanosleep")
>     int num_iter_bug(const void *ctx) {
>         struct bpf_iter_num it;                 // fp[-8] below
>         __u64 val = 0;                          // fp[-16] in the below
>         __u64 *ptr = &val;                      // r7 below
>         __u64 rnd = bpf_get_current_pid_tgid(); // r6 below
>         void *v;
>
>         bpf_iter_num_new(&it, 0, 10);
>         while ((v = bpf_iter_num_next(&it))) {
>             rnd++;
>             if (rnd == 42) {
>                 ptr = (void*)(0xdead);
>                 continue;
>             }
>             bpf_probe_read_user(ptr, 8, (void*)(0xdeadbeef));
>         }
>         bpf_iter_num_destroy(&it);
>         return 0;
>     }
>
> (Unfortunately, it had to be converted to assembly to avoid compiler
>  clobbering loop structure, complete test case is at the end of the email).
>
> The example is not safe because of 0xdead being a possible `ptr` value.
> However, currently it is marked as safe.
>
> This happens because of states_equal() usage for iterator convergence
> detection:
>
>     static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>         ...
>         while (sl)
>                 states_cnt++;
>                 if (sl->state.insn_idx != insn_idx)
>                         goto next;
>
>                 if (sl->state.branches)
>                 ...
>                         if (is_iter_next_insn(env, insn_idx)) {
>                                 if (states_equal(env, &sl->state, cur)) {
>                         ...
>                                         if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE)
>                                                 goto hit;
>                                 }
>                                 goto skip_inf_loop_check;
>                         }
>         ...
>
> With some additional logging I see that the following states are
> considered equal:
>
>     13: (85) call bpf_iter_num_next#59908
>     ...
>     at is_iter_next_insn(insn 13):
>       old state:
>          R0=scalar() R1_rw=fp-8 R6_r=scalar(id=1) R7=fp-16 R10=fp0
>          fp-8_r=iter_num(ref_id=2,state=active,depth=0) fp-16=00000000 refs=2
>       cur state:
>          R0=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) R1_w=fp-8 R6=42 R7_w=57005
>          R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) fp-16=00000000 refs=2
>     states_equal()?: true
>
> Note that R7=fp-16 in old state vs R7_w=57005 in cur state.
> The registers are considered equal because R7 does not have a read mark.
> However read marks are not yet finalized for old state because
> sl->state.branches != 0. (Note: precision marks are not finalized as
> well, which should be a problem, but this requires another example).

I originally convinced myself that non-finalized precision marking is
fine; see the comment next to process_iter_next_call() for the reasoning.
If you can put together a dedicated example for precision, that would be great.

As for read marks... Yep, that's a bummer. That r7 is not used after
the loop, so it is not marked as read, and it's basically ignored for
loop-invariant checking, which is definitely not what was intended.

>
> A possible fix is to add a special flag to states_equal() and
> conditionally ignore logic related to liveness and precision when this
> flag is set. Set this flag for is_iter_next_insn() branch above.

Probably not completely ignore liveness (and maybe precision, but
let's do it one step at a time), but instead treat a register or stack
slot as relevant if it is marked as written or read in either of the
old/new states? Basically, if some register/slot at the end of the
iteration is read or modified, it must be important for the loop
invariant, so even if the parent state says it's not read, we still
treat it as read. Can you please try that with just read/write marks
and see if anything fails?


>
> ---
>
> /* BTF FUNC records are not generated for kfuncs referenced
>  * from inline assembly. These records are necessary for
>  * libbpf to link the program. The function below is a hack
>  * to ensure that BTF FUNC records are generated.
>  */
> void __kfunc_btf_root(void)
> {
>         bpf_iter_num_new(0, 0, 0);
>         bpf_iter_num_next(0);
>         bpf_iter_num_destroy(0);
> }
>
> SEC("fentry/" SYS_PREFIX "sys_nanosleep")
> __naked int num_iter_bug(const void *ctx)
> {
>         asm volatile (
>                 // r7 = &fp[-16]
>                 // fp[-16] = 0
>                 "r7 = r10;"
>                 "r7 += -16;"
>                 "r0 = 0;"
>                 "*(u64 *)(r7 + 0) = r0;"
>                 // r6 = bpf_get_current_pid_tgid()
>                 "call %[bpf_get_current_pid_tgid];"
>                 "r6 = r0;"
>                 // bpf_iter_num_new(&fp[-8], 0, 10)
>                 "r1 = r10;"
>                 "r1 += -8;"
>                 "r2 = 0;"
>                 "r3 = 10;"
>                 "call %[bpf_iter_num_new];"
>                 // while (bpf_iter_num_next(&fp[-8])) {
>                 //   r6++
>                 //   if (r6 == 42) {
>                 //     r7 = 0xdead
>                 //     continue;
>                 //   }
>                 //   bpf_probe_read_user(r7, 8, 0xdeadbeef)
>                 // }
>         "1:"
>                 "r1 = r10;"
>                 "r1 += -8;"
>                 "call %[bpf_iter_num_next];"
>                 "if r0 == 0 goto 2f;"
>                 "r6 += 1;"
>                 "if r6 != 42 goto 3f;"
>                 "r7 = 0xdead;"
>                 "goto 1b;"
>         "3:"
>                 "r1 = r7;"
>                 "r2 = 8;"
>                 "r3 = 0xdeadbeef;"
>                 "call %[bpf_probe_read_user];"
>                 "goto 1b;"
>         "2:"
>                 // bpf_iter_num_destroy(&fp[-8])
>                 "r1 = r10;"
>                 "r1 += -8;"
>                 "call %[bpf_iter_num_destroy];"
>                 // return 0
>                 "r0 = 0;"
>                 "exit;"
>                 :
>                 : __imm(bpf_get_current_pid_tgid),
>                   __imm(bpf_iter_num_new),
>                   __imm(bpf_iter_num_next),
>                   __imm(bpf_iter_num_destroy),
>                   __imm(bpf_probe_read_user)
>                 : __clobber_all
>         );
> }


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-17 21:37     ` Eduard Zingerman
  2023-09-17 22:09       ` Kumar Kartikeya Dwivedi
@ 2023-09-19 23:14       ` Andrii Nakryiko
  2023-09-20  0:06         ` Eduard Zingerman
  1 sibling, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2023-09-19 23:14 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu

On Sun, Sep 17, 2023 at 2:37 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2023-07-07 at 11:08 -0700, Andrii Nakryiko wrote:
> > On Fri, Jul 7, 2023 at 9:44 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Fri, Jul 7, 2023 at 7:04 AM Andrew Werner <awerner32@gmail.com> wrote:
> > > >
> > > > When it comes to fixing the problem, I don't quite know where to start.
> > > > Perhaps these iteration callbacks ought to be treated more like global functions
> > > > -- you can't always make assumptions about the state of the data in the context
> > > > pointer. Treating the context pointer as totally opaque seems bad from
> > > > a usability
> > > > perspective. Maybe there's a way to attempt to verify the function
> > > > body of the function
> > > > by treating all or part of the context as read-only, and then if that
> > > > fails, go back and
> > > > assume nothing about that part of the context structure. What is the
> > > > right way to
> > > > think about plugging this hole?
> > >
> > > 'treat as global' might be a way to fix it, but it will likely break
> > > some setups, since people passing pointers in a context and current
> > > global func verification doesn't support that.
> >
> > yeah, the intended use case is to return something from callbacks
> > through context pointer. So it will definitely break real uses.
> >
> > > I think the simplest fix would be to disallow writes into any pointers
> > > within a ctx. Writes to scalars should still be allowed.
> >
> > It might be a partial mitigation, but even with SCALARs there could be
> > problems due to assumed SCALAR range, which will be invalid if
> > callback is never called or called many times.
> >
> > > Much more complex fix would be to verify callbacks as
> > > process_iter_next_call(). See giant comment next to it.
> >
> > yep, that seems like the right way forward. We need to simulate 0, 1,
> > 2, .... executions of callbacks until we validate and exhaust all
> > possible context modification permutations, just like open-coded
> > iterators do it
> >
> > > But since we need a fix for stable I'd try the simple approach first.
> > > Could you try to implement that?
>
> Hi All,
>
> This issue seems stalled, so I took a look over the weekend.
> I have a work in progress fix, please let me know if you don't agree
> with direction I'm taking or if I'm stepping on anyone's toes.
>
> After some analysis I decided to go with Alexei's suggestion and
> implement something similar to iterators for selected set of helper
> functions that accept "looping" callbacks, such as:
> - bpf_loop
> - bpf_for_each_map_elem
> - bpf_user_ringbuf_drain
> - bpf_timer_set_callback (?)

As Kumar mentioned, pretty much all helpers with callbacks are either
0-1, or 0-1-many cases, so all of them (except for the async callback
ones that shouldn't be accepting parent stack pointers) should be
validated.

>
> The sketch of the fix looks as follows:
> - extend struct bpf_func_state with callback iterator state information:
>   - active/non-active flag

not sure why you need this flag

>   - depth

yep, this seems necessary

>   - r1-r5 state at the entry to callback;

not sure why you need this, this should be part of func_state already?


> - extend __check_func_call() to setup callback iterator state when
>   call to suitable helper function is processed;

this logic is "like iterator", but it's not exactly the same, so I
don't think we should reuse bpf_iter state (as you can see above with
active/non-active flag, for example)

> - extend BPF_EXIT processing (prepare_func_exit()) to queue new
>   callback visit upon return from iterating callback
>   (similar to process_iter_next_call());

as mentioned above, we should simulate "no callback called" situation
as well, don't forget about that

> - extend is_state_visited() to account for callback iterator hits in a
>   way similar to regular iterators;
> - extend backtrack_insn() to correctly react to jumps from callback
>   exit to callback entry.
>
> I have a patch (at the end of this email) that correctly recognizes
> the bpf_loop example in this thread as unsafe. However this patch has
> a few deficiencies:
>
> - verif_scale_strobemeta_bpf_loop selftest is not passing, because
>   read_map_var function invoked from the callback continuously
>   increments 'payload' pointer used in subsequent iterations.
>
>   In order to handle such code I need to add an upper bound tracking
>   for iteration depth (when such bound could be deduced).

If the example is broken and really can get out of bounds, we should
fix the example to be provably within bounds no matter how many
iterations it is called with.

>
> - loop detection is broken for simple callback as below:
>
>   static int loop_callback_infinite(__u32 idx, __u64 *data)
>   {
>       for (;;)
>           (*ctx)++;
>       return 0;
>   }
>
>   To handle such code I need to change is_state_visited() to do
>   callback iterator loop/hit detection on exit from callback
>   (returns are not prune points at the moment), currently it is done
>   on entry.

I'm a bit confused. What's ctx in the above example? And why doesn't
loop detection catch the for(;;) loop right now?

>
> - the callback as below currently causes state explosion:
>
>   static int precise1_callback(__u32 idx, struct precise1_ctx *ctx)
>   {
>       if (idx == 0)
>           ctx->a = 1;
>       if (idx == 1 && ctx->a == 1)
>           ctx->b = 1;
>       return 0;
>   }

Why state explosion? There should be a bunch of different branches
(idx 0, 1, or something else, times ctx->a being 1 or not, times
ctx->b being 1 or not), but that is still a fixed number of states.
Do you know what causes the explosion?

>
>   I'm not sure yet what to do about this, there are several possibilities:
>   - tweak the order in which states are visited (need to think about it);
>   - check states in bpf_verifier_env::head (not explored yet) for
>     equivalence and avoid enqueuing duplicate states.
>
> I'll proceed addressing the issues above on Monday.
>
> Thanks,
> Eduard
>
> ---
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index a3236651ec64..5589f55e42ba 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -70,6 +70,17 @@ enum bpf_iter_state {
>         BPF_ITER_STATE_DRAINED,
>  };
>
> +struct bpf_iter {
> +       /* BTF container and BTF type ID describing
> +        * struct bpf_iter_<type> of an iterator state
> +        */
> +       struct btf *btf;
> +       u32 btf_id;
> +       /* packing following two fields to fit iter state into 16 bytes */
> +       enum bpf_iter_state state:2;
> +       int depth:30;
> +};
> +
>  struct bpf_reg_state {
>         /* Ordering of fields matters.  See states_equal() */
>         enum bpf_reg_type type;
> @@ -115,16 +126,7 @@ struct bpf_reg_state {
>                 } dynptr;
>
>                 /* For bpf_iter stack slots */
> -               struct {
> -                       /* BTF container and BTF type ID describing
> -                        * struct bpf_iter_<type> of an iterator state
> -                        */
> -                       struct btf *btf;
> -                       u32 btf_id;
> -                       /* packing following two fields to fit iter state into 16 bytes */
> -                       enum bpf_iter_state state:2;
> -                       int depth:30;
> -               } iter;
> +               struct bpf_iter iter;

Let's not do this; conceptually the processes are similar, but bpf_iter
is one thing, and this callback validation is another. Let's not
conflate the two.

>
>                 /* Max size from any of the above. */
>                 struct {

[...]


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-19 23:14       ` Andrii Nakryiko
@ 2023-09-20  0:06         ` Eduard Zingerman
  2023-09-20 16:37           ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-20  0:06 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu

On Tue, 2023-09-19 at 16:14 -0700, Andrii Nakryiko wrote:
[...]
> > Hi All,
> > 
> > This issue seems stalled, so I took a look over the weekend.
> > I have a work in progress fix, please let me know if you don't agree
> > with direction I'm taking or if I'm stepping on anyone's toes.
> > 
> > After some analysis I decided to go with Alexei's suggestion and
> > implement something similar to iterators for selected set of helper
> > functions that accept "looping" callbacks, such as:
> > - bpf_loop
> > - bpf_for_each_map_elem
> > - bpf_user_ringbuf_drain
> > - bpf_timer_set_callback (?)

Hi Andrii, thank you for taking a look.

> As Kumar mentioned, pretty much all helpers with callbacks are either
> 0-1, or 0-1-many cases, so all of them (except for the async callback
> ones that shouldn't be accepting parent stack pointers) should be
> validated.

Yes, makes sense. I need to finish the fix for bpf_loop first, as it
seems like a good representative of many possible issues.

> > The sketch of the fix looks as follows:
> > - extend struct bpf_func_state with callback iterator state information:
> >   - active/non-active flag
> 
> not sure why you need this flag

Note:
  I have a better version of the fix locally but will share it a bit later.
  It actually depends on the states_equal() discussion in the sibling thread.

In my updated version I use non-zero depth as the active-iteration
indicator, so there is no separate flag in the end.

> >   - depth
> 
> yep, this seems necessary
> 
> >   - r1-r5 state at the entry to callback;
> 
> not sure why you need this, this should be part of func_state already?

This was a bit tricky, but I think I figured out an acceptable solution
w/o extra copies of r1-r5. The tricky part is the structure of
check_helper_call():
- collect arguments 'meta' info & check arguments
- call __check_func_call():
  - setup frame for callback;
  - schedule next instruction index to be callback entry;
- reset r1-r5 in caller's frame;
- set r0 in caller's frame.

The problem is that check_helper_call() resets the caller's r1-r5
immediately. I figured that this reset could instead be done when
processing the callback's BPF_EXIT => no extra copy needed.

> > - extend __check_func_call() to setup callback iterator state when
> >   call to suitable helper function is processed;
> 
> this logic is "like iterator", but it's not exactly the same, so I
> don't think we should reuse bpf_iter state (as you can see above with
> active/non-active flag, for example)

Yes, I agree, already removed this locally.

> > - extend BPF_EXIT processing (prepare_func_exit()) to queue new
> >   callback visit upon return from iterating callback
> >   (similar to process_iter_next_call());
> 
> as mentioned above, we should simulate "no callback called" situation
> as well, don't forget about that

Yeap

> > - extend is_state_visited() to account for callback iterator hits in a
> >   way similar to regular iterators;
> > - extend backtrack_insn() to correctly react to jumps from callback
> >   exit to callback entry.
> > 
> > I have a patch (at the end of this email) that correctly recognizes
> > the bpf_loop example in this thread as unsafe. However this patch has
> > a few deficiencies:
> > 
> > - verif_scale_strobemeta_bpf_loop selftest is not passing, because
> >   read_map_var function invoked from the callback continuously
> >   increments 'payload' pointer used in subsequent iterations.
> > 
> >   In order to handle such code I need to add an upper bound tracking
> >   for iteration depth (when such bound could be deduced).
> 
> if the example is broken and really can get out of bounds, we should
> fix an example to be provable within bounds no matter how many
> iterations it was called with

For that particular case the number of iterations guarantees that the
payload pointer will not go out of bounds. It is bumped up 4 bytes on
each iteration, but the iteration count and the buffer size are
correlated to avoid out-of-bounds access.

> > - loop detection is broken for simple callback as below:
> > 
> >   static int loop_callback_infinite(__u32 idx, __u64 *data)
> >   {
> >       for (;;)
> >           (*ctx)++;
> >       return 0;
> >   }
> > 
> >   To handle such code I need to change is_state_visited() to do
> >   callback iterator loop/hit detection on exit from callback
> >   (returns are not prune points at the moment), currently it is done
> >   on entry.
> 
> I'm a bit confused. What's ctx in the above example? And why loop
> detection doesn't detect for(;;) loop right now?

It's an implementation detail for the fix sketch shared in the parent
email. It can catch cases like this:

    ... any insn ...;
    for (;;)
        (*ctx)++;
    return 0;

But cannot catch case like this:

    for (;;)
        (*ctx)++;
    return 0;

In that sketch I jump to the callback start from the callback return,
and the callback entry needs two checks:
- iteration convergence;
- simple looping.
Because of the code structure only the iteration convergence check was
done. Locally, I fixed this issue by jumping to the callback call
instruction instead.

> > - the callback as below currently causes state explosion:
> > 
> >   static int precise1_callback(__u32 idx, struct precise1_ctx *ctx)
> >   {
> >       if (idx == 0)
> >           ctx->a = 1;
> >       if (idx == 1 && ctx->a == 1)
> >           ctx->b = 1;
> >       return 0;
> >   }
> 
> why state explosion? there should be a bunch of different branches
> (idx 0, 1, something else x ctx->a = 1 or not 1 and ctx->b being 1 or
> not), but it should be a fixed number of states? Do you know what
> causes the explosion?

I forgot to do mark_force_checkpoint() at callback entry. Fixed locally.

> 
> > 
> >   I'm not sure yet what to do about this, there are several possibilities:
> >   - tweak the order in which states are visited (need to think about it);
> >   - check states in bpf_verifier_env::head (not explored yet) for
> >     equivalence and avoid enqueuing duplicate states.
> > 
> > I'll proceed addressing the issues above on Monday.
> > 
> > Thanks,
> > Eduard
> > 
> > ---
> > 
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index a3236651ec64..5589f55e42ba 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -70,6 +70,17 @@ enum bpf_iter_state {
> >         BPF_ITER_STATE_DRAINED,
> >  };
> > 
> > +struct bpf_iter {
> > +       /* BTF container and BTF type ID describing
> > +        * struct bpf_iter_<type> of an iterator state
> > +        */
> > +       struct btf *btf;
> > +       u32 btf_id;
> > +       /* packing following two fields to fit iter state into 16 bytes */
> > +       enum bpf_iter_state state:2;
> > +       int depth:30;
> > +};
> > +
> >  struct bpf_reg_state {
> >         /* Ordering of fields matters.  See states_equal() */
> >         enum bpf_reg_type type;
> > @@ -115,16 +126,7 @@ struct bpf_reg_state {
> >                 } dynptr;
> > 
> >                 /* For bpf_iter stack slots */
> > -               struct {
> > -                       /* BTF container and BTF type ID describing
> > -                        * struct bpf_iter_<type> of an iterator state
> > -                        */
> > -                       struct btf *btf;
> > -                       u32 btf_id;
> > -                       /* packing following two fields to fit iter state into 16 bytes */
> > -                       enum bpf_iter_state state:2;
> > -                       int depth:30;
> > -               } iter;
> > +               struct bpf_iter iter;
> 
> Let's not do this, conceptually processes are similar, but bpf_iter is
> one thing, and this callback validation is another thing. Let's not
> conflate things.
> 
> > 
> >                 /* Max size from any of the above. */
> >                 struct {
> 
> [...]


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-19 23:02             ` Andrii Nakryiko
@ 2023-09-20  0:19               ` Eduard Zingerman
  2023-09-20 16:20                 ` Eduard Zingerman
  2023-09-21  9:14                 ` Alexei Starovoitov
  0 siblings, 2 replies; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-20  0:19 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Tue, 2023-09-19 at 16:02 -0700, Andrii Nakryiko wrote:
> On Tue, Sep 19, 2023 at 9:28 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > It looks like I hit a related but slightly different bug with bpf_iter_next().
> > Consider the following example:
> > 
> >     SEC("fentry/" SYS_PREFIX "sys_nanosleep")
> >     int num_iter_bug(const void *ctx) {
> >         struct bpf_iter_num it;                 // fp[-8] below
> >         __u64 val = 0;                          // fp[-16] in the below
> >         __u64 *ptr = &val;                      // r7 below
> >         __u64 rnd = bpf_get_current_pid_tgid(); // r6 below
> >         void *v;
> > 
> >         bpf_iter_num_new(&it, 0, 10);
> >         while ((v = bpf_iter_num_next(&it))) {
> >             rnd++;
> >             if (rnd == 42) {
> >                 ptr = (void*)(0xdead);
> >                 continue;
> >             }
> >             bpf_probe_read_user(ptr, 8, (void*)(0xdeadbeef));
> >         }
> >         bpf_iter_num_destroy(&it);
> >         return 0;
> >     }
> > 
> > (Unfortunately, it had to be converted to assembly to avoid compiler
> >  clobbering loop structure, complete test case is at the end of the email).
> > 
> > The example is not safe because of 0xdead being a possible `ptr` value.
> > However, currently it is marked as safe.
> > 
> > This happens because of states_equal() usage for iterator convergence
> > detection:
> > 
> >     static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> >         ...
> >         while (sl)
> >                 states_cnt++;
> >                 if (sl->state.insn_idx != insn_idx)
> >                         goto next;
> > 
> >                 if (sl->state.branches)
> >                 ...
> >                         if (is_iter_next_insn(env, insn_idx)) {
> >                                 if (states_equal(env, &sl->state, cur)) {
> >                         ...
> >                                         if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE)
> >                                                 goto hit;
> >                                 }
> >                                 goto skip_inf_loop_check;
> >                         }
> >         ...
> > 
> > With some additional logging I see that the following states are
> > considered equal:
> > 
> >     13: (85) call bpf_iter_num_next#59908
> >     ...
> >     at is_iter_next_insn(insn 13):
> >       old state:
> >          R0=scalar() R1_rw=fp-8 R6_r=scalar(id=1) R7=fp-16 R10=fp0
> >          fp-8_r=iter_num(ref_id=2,state=active,depth=0) fp-16=00000000 refs=2
> >       cur state:
> >          R0=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) R1_w=fp-8 R6=42 R7_w=57005
> >          R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) fp-16=00000000 refs=2
> >     states_equal()?: true
> > 
> > Note that R7=fp-16 in old state vs R7_w=57005 in cur state.
> > The registers are considered equal because R7 does not have a read mark.
> > However, read marks are not yet finalized for the old state because
> > sl->state.branches != 0. (Note: precision marks are not finalized
> > either, which should also be a problem, but that requires another example.)
> 
> I originally convinced myself that non-finalized precision marking is
> fine, see the comment next to process_iter_next_call() for reasoning.
> If you can have a dedicated example for precision that would be great.
> 
> As for read marks... Yep, that's a bummer. That r7 is not used after
> the loop, so is not marked as read, and it's basically ignored for
> loop invariant checking, which is definitely not what was intended.

Liveness is a precondition for all subsequent checks, so an example
involving precision would also involve liveness. Here is a combined
example:

    r8 = 0
    fp[-16] = 0
    r7 = -16
    r6 = bpf_get_current_pid_tgid()
    bpf_iter_num_new(&fp[-8], 0, 10)
    while (bpf_iter_num_next(&fp[-8])) {
      r6++
      if (r6 != 42) {
        r7 = -32
        continue;
      }
      r0 = r10
      r0 += r7
      r8 = *(u64 *)(r0 + 0)
    }
    bpf_iter_num_destroy(&fp[-8])
    return r8

(Complete source code is at the end of the email).

The call to bpf_iter_num_next() is reachable with r7 values -16 and -32.
State with r7=-16 is visited first, at which point r7 has no read mark
and is not marked precise.
State with r7=-32 is visited second:
- states_equal() for is_iter_next_insn() should ignore absence of
  REG_LIVE_READ mark on r7, otherwise both states would be considered
  identical;
- states_equal() for is_iter_next_insn() should ignore absence of
  precision mark on r7, otherwise both states would be considered
  identical.

> > A possible fix is to add a special flag to states_equal() and
> > conditionally ignore logic related to liveness and precision when this
> > flag is set. Set this flag for is_iter_next_insn() branch above.
> 
> Probably not completely ignore liveness (and maybe precision, but
> let's do it one step at a time), but only at least one of the
> registers or stack slots are marked as written or read in one of
> old/new states? Basically, if some register/slot at the end of
> iteration is read or modified, it must be important for loop
> invariant, so even if the parent state says it's not read, we still
> treat it as read. Can you please try that with just read/write marks
> and see if anything fails?

Unfortunately for the example above neither liveness nor precision
marks are present when states are compared, e.g. here is the
(extended) log:

12: (85) call bpf_iter_num_next#52356
...
is_iter_next (12):
  old:
     R0=scalar() R1_rw=fp-8 R6_r=scalar(id=1) R7=-16 R8_r=0 R10=fp0
     fp-8_r=iter_num(ref_id=2,state=active,depth=0) fp-16=00000000 refs=2
  new:
     R0=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) R1_w=fp-8 R6=42 R7_w=-32 R8=0 R10=fp0
     fp-8=iter_num(ref_id=2,state=active,depth=1) fp-16=00000000 refs=2
> hit

Note that for the old state the r7 value is not really important for
reaching the exit; the verification trace looks as follows:
1. ... start ...
2. bpf_iter_num_next(&fp[-8]) ;; r7=-16, checkpoint saved
3. r6++
4. r7 = -32
5. bpf_iter_num_next(&fp[-8]) ;; push new state with r7=-32
6. bpf_iter_num_destroy(&fp[-8])
7. return r8

The r7 liveness and precision are important for the state scheduled at (5),
but they are not known yet, so when both states are compared the marks
are absent both in cur and in old.

---

/* BTF FUNC records are not generated for kfuncs referenced
 * from inline assembly. These records are necessary for
 * libbpf to link the program. The function below is a hack
 * to ensure that BTF FUNC records are generated.
 */
void __kfunc_btf_root(void)
{
	bpf_iter_num_new(0, 0, 0);
	bpf_iter_num_next(0);
	bpf_iter_num_destroy(0);
}

SEC("fentry/" SYS_PREFIX "sys_nanosleep")
__failure
__msg("20: (79) r8 = *(u64 *)(r0 +0)")
__msg("invalid read from stack R0 off=-32 size=8")
__naked int iter_next_exact(const void *ctx)
{
	asm volatile (
		"r8 = 0;"
		"*(u64 *)(r10 - 16) = r8;"
		"r7 = -16;"
		"call %[bpf_get_current_pid_tgid];"
		"r6 = r0;"
		"r1 = r10;"
		"r1 += -8;"
		"r2 = 0;"
		"r3 = 10;"
		"call %[bpf_iter_num_new];"
	"1:"
		"r1 = r10;"
		"r1 += -8;\n"
		"call %[bpf_iter_num_next];"
		"if r0 == 0 goto 2f;"
		"r6 += 1;"
		"if r6 != 42 goto 3f;"
		"r7 = -32;"
		"goto 1b;\n"
	"3:"
		"r0 = r10;"
		"r0 += r7;"
		"r8 = *(u64 *)(r0 + 0);"
		"goto 1b;\n"
	"2:"
		"r1 = r10;"
		"r1 += -8;"
		"call %[bpf_iter_num_destroy];"
		"r0 = r8;"
		"exit;"
		:
		: __imm(bpf_get_current_pid_tgid),
		  __imm(bpf_iter_num_new),
		  __imm(bpf_iter_num_next),
		  __imm(bpf_iter_num_destroy),
		  __imm(bpf_probe_read_user)
		: __clobber_all
	);
}



* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-20  0:19               ` Eduard Zingerman
@ 2023-09-20 16:20                 ` Eduard Zingerman
  2023-09-20 16:57                   ` Andrii Nakryiko
  2023-09-21  9:14                 ` Alexei Starovoitov
  1 sibling, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-20 16:20 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Wed, 2023-09-20 at 03:19 +0300, Eduard Zingerman wrote:
[...]
> Liveness is a precondition for all subsequent checks, so example
> involving precision would also involve liveness. Here is a combined
> example:
> 
>     r8 = 0
>     fp[-16] = 0
>     r7 = -16
>     r6 = bpf_get_current_pid_tgid()
>     bpf_iter_num_new(&fp[-8], 0, 10)
>     while (bpf_iter_num_next(&fp[-8])) {
>       r6++
>       if (r6 != 42) {
>         r7 = -32
>         continue;
>       }
>       r0 = r10
>       r0 += r7
>       r8 = *(u64 *)(r0 + 0)
>     }
>     bpf_iter_num_destroy(&fp[-8])
>     return r8
> 
> (Complete source code is at the end of the email).
> 
> The call to bpf_iter_num_next() is reachable with r7 values -16 and -32.
> State with r7=-16 is visited first, at which point r7 has no read mark
> and is not marked precise.
> State with r7=-32 is visited second:
> - states_equal() for is_iter_next_insn() should ignore absence of
>   REG_LIVE_READ mark on r7, otherwise both states would be considered
>   identical;
> - states_equal() for is_iter_next_insn() should ignore absence of
>   precision mark on r7, otherwise both states would be considered
>   identical.
> 
> > > A possible fix is to add a special flag to states_equal() and
> > > conditionally ignore logic related to liveness and precision when this
> > > flag is set. Set this flag for is_iter_next_insn() branch above.
> > 
> > Probably not completely ignore liveness (and maybe precision, but
> > let's do it one step at a time), but only at least one of the
> > registers or stack slots are marked as written or read in one of
> > old/new states? Basically, if some register/slot at the end of
> > iteration is read or modified, it must be important for loop
> > invariant, so even if the parent state says it's not read, we still
> > treat it as read. Can you please try that with just read/write marks
> > and see if anything fails?
> 
> Unfortunately for the example above neither liveness nor precision
> marks are present when states are compared, e.g. here is the
> (extended) log:
> 
> 12: (85) call bpf_iter_num_next#52356
> ...
> is_iter_next (12):
>   old:
>      R0=scalar() R1_rw=fp-8 R6_r=scalar(id=1) R7=-16 R8_r=0 R10=fp0
>      fp-8_r=iter_num(ref_id=2,state=active,depth=0) fp-16=00000000 refs=2
>   new:
>      R0=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) R1_w=fp-8 R6=42 R7_w=-32 R8=0 R10=fp0
>      fp-8=iter_num(ref_id=2,state=active,depth=1) fp-16=00000000 refs=2
> > hit

It might be the case that I misread the situation a bit, because there
is also R6, which is actually used to pick the branch. It has a read
mark but no precision mark, because the jump was not predicted.

In general, if we hit an old state with .branches > 0:
- either there is an execution path "old -> safe exit";
- or there is a loop (not necessarily because of iterator) and old is
  a [grand]parent of the "cur" state.

In either case, the "cur" state is safe to prune if:
- it is possible to show that it would take the same jump branches as
  the "old" state;
- and the iteration depth differs.

All registers that took part in conditional jumps should already have
read marks in old state. However, not all jumps could be predicted,
thus not all such registers are marked as precise.

Unfortunately, treating read marks as precision marks is not sufficient.
For the particular example above:

    old: ... R6_r=scalar(id=1) ...
    cur: ... R6=42 ...

Here range_within() check in regsafe() would return true and old and
cur would still be considered identical:

    static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
                struct bpf_reg_state *rcur, struct bpf_idmap *idmap)
    {
        ...
        switch (base_type(rold->type)) {
        case SCALAR_VALUE:
            ...
            if (!rold->precise)
                return true;
            ...
            return range_within(rold, rcur) &&
                   tnum_in(rold->var_off, rcur->var_off) &&
                   check_scalar_ids(rold->id, rcur->id, idmap);
        case ...
        }
        ...
    }
    
So it appears that there is no need to ignore liveness checks after all,
but I don't see a way to make non-exact scalar comparisons at the moment.

> Note that for old state r7 value is not really important to get to exit,
> the verification trace looks as follows:
> 1. ... start ...
> 2. bpf_iter_num_next(&fp[-8]) ;; r7=-16, checkpoint saved
> 3. r6++
> 4. r7 = -32
> 5. bpf_iter_num_next(&fp[-8]) ;; push new state with r7=-32
> 6. bpf_iter_num_destroy(&fp[-8])
> 7. return 8
> 
> The r7 liveness and precision is important for state scheduled at (5)
> but it is not known yet, so when both states are compared marks are
> absent in cur and in old.
> 
> ---
> 
> /* BTF FUNC records are not generated for kfuncs referenced
>  * from inline assembly. These records are necessary for
>  * libbpf to link the program. The function below is a hack
>  * to ensure that BTF FUNC records are generated.
>  */
> void __kfunc_btf_root(void)
> {
> 	bpf_iter_num_new(0, 0, 0);
> 	bpf_iter_num_next(0);
> 	bpf_iter_num_destroy(0);
> }
> 
> SEC("fentry/" SYS_PREFIX "sys_nanosleep")
> __failure
> __msg("20: (79) r8 = *(u64 *)(r0 +0)")
> __msg("invalid read from stack R0 off=-32 size=8")
> __naked int iter_next_exact(const void *ctx)
> {
> 	asm volatile (
> 		"r8 = 0;"
> 		"*(u64 *)(r10 - 16) = r8;"
> 		"r7 = -16;"
> 		"call %[bpf_get_current_pid_tgid];"
> 		"r6 = r0;"
> 		"r1 = r10;"
> 		"r1 += -8;"
> 		"r2 = 0;"
> 		"r3 = 10;"
> 		"call %[bpf_iter_num_new];"
> 	"1:"
> 		"r1 = r10;"
> 		"r1 += -8;\n"
> 		"call %[bpf_iter_num_next];"
> 		"if r0 == 0 goto 2f;"
> 		"r6 += 1;"
> 		"if r6 != 42 goto 3f;"
> 		"r7 = -32;"
> 		"goto 1b;\n"
> 	"3:"
> 		"r0 = r10;"
> 		"r0 += r7;"
> 		"r8 = *(u64 *)(r0 + 0);"
> 		"goto 1b;\n"
> 	"2:"
> 		"r1 = r10;"
> 		"r1 += -8;"
> 		"call %[bpf_iter_num_destroy];"
> 		"r0 = r8;"
> 		"exit;"
> 		:
> 		: __imm(bpf_get_current_pid_tgid),
> 		  __imm(bpf_iter_num_new),
> 		  __imm(bpf_iter_num_next),
> 		  __imm(bpf_iter_num_destroy),
> 		  __imm(bpf_probe_read_user)
> 		: __clobber_all
> 	);
> }
> 



* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-20  0:06         ` Eduard Zingerman
@ 2023-09-20 16:37           ` Andrii Nakryiko
  2023-09-20 17:13             ` Eduard Zingerman
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2023-09-20 16:37 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu

On Tue, Sep 19, 2023 at 5:06 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2023-09-19 at 16:14 -0700, Andrii Nakryiko wrote:
> [...]
> > > Hi All,
> > >
> > > This issue seems stalled, so I took a look over the weekend.
> > > I have a work in progress fix, please let me know if you don't agree
> > > with direction I'm taking or if I'm stepping on anyone's toes.
> > >
> > > After some analysis I decided to go with Alexei's suggestion and
> > > implement something similar to iterators for selected set of helper
> > > functions that accept "looping" callbacks, such as:
> > > - bpf_loop
> > > - bpf_for_each_map_elem
> > > - bpf_user_ringbuf_drain
> > > - bpf_timer_set_callback (?)
>
> Hi Andrii, thank you for taking a look.
>
> > As Kumar mentioned, pretty much all helpers with callbacks are either
> > 0-1, or 0-1-many cases, so all of them (except for the async callback
> > ones that shouldn't be accepting parent stack pointers) should be
> > validated.
>
> Yes, makes sense. I need to finish the fix for bpf_loop first, as it
> seems like a good representative for many possible issues.

yes, it's the most generic 0-1-many case

>
> > > The sketch of the fix looks as follows:
> > > - extend struct bpf_func_state with callback iterator state information:
> > >   - active/non-active flag
> >
> > not sure why you need this flag
>
> Note:
>   I have a better version of the fix locally but will share it a bit later.
>   It actually depends on states_equal() discussion in the sibling thread.
>
> In my upgraded version I use non-zero depth as an indicator,
> so no separate flag is needed in the end.

great

>
> > >   - depth
> >
> > yep, this seems necessary
> >
> > >   - r1-r5 state at the entry to callback;
> >
> > not sure why you need this, this should be part of func_state already?
>
> This was a bit tricky but I think I figured an acceptable solution w/o
> extra copies for r1-r5. The tricky part is the structure of
> check_helper_call():
> - collect arguments 'meta' info & check arguments
> - call __check_func_call():
>   - setup frame for callback;
>   - schedule next instruction index to be callback entry;
> - reset r1-r5 in caller's frame;
> - set r0 in caller's frame.
>
> The problem is that check_helper_call() resets caller's r1-r5
> immediately. I figured that this reset could be done at BPF_EXIT
> processing for callback instead => no extra copy needed.
>

I guess then r0 setting would have to happen at BPF_EXIT as well,
right? Is that a problem?

> > > - extend __check_func_call() to setup callback iterator state when
> > >   call to suitable helper function is processed;
> >
> > this logic is "like iterator", but it's not exactly the same, so I
> > don't think we should reuse bpf_iter state (as you can see above with
> > active/non-active flag, for example)
>
> Yes, I agree, already removed this locally.

cool

>
> > > - extend BPF_EXIT processing (prepare_func_exit()) to queue new
> > >   callback visit upon return from iterating callback
> > >   (similar to process_iter_next_call());
> >
> > as mentioned above, we should simulate "no callback called" situation
> > as well, don't forget about that
>
> Yeap
>
> > > - extend is_state_visited() to account for callback iterator hits in a
> > >   way similar to regular iterators;
> > > - extend backtrack_insn() to correctly react to jumps from callback
> > >   exit to callback entry.
> > >
> > > I have a patch (at the end of this email) that correctly recognizes
> > > the bpf_loop example in this thread as unsafe. However this patch has
> > > a few deficiencies:
> > >
> > > - verif_scale_strobemeta_bpf_loop selftest is not passing, because
> > >   read_map_var function invoked from the callback continuously
> > >   increments 'payload' pointer used in subsequent iterations.
> > >
> > >   In order to handle such code I need to add an upper bound tracking
> > >   for iteration depth (when such bound could be deduced).
> >
> > if the example is broken and really can get out of bounds, we should
> > fix an example to be provable within bounds no matter how many
> > iterations it was called with
>
> For that particular case number of iterations guarantees that payload
> pointer will not get out of bounds. It is bumped up 4 bytes on each
> iteration, but the number of iterations and buffer size correlate to
> avoid out of bound access.

ok, good, I was trying to avoid deducing bounds on the number of
iterations or anything like that.

>
> > > - loop detection is broken for simple callback as below:
> > >
> > >   static int loop_callback_infinite(__u32 idx, __u64 *data)
> > >   {
> > >       for (;;)
> > >           (*ctx)++;
> > >       return 0;
> > >   }
> > >
> > >   To handle such code I need to change is_state_visited() to do
> > >   callback iterator loop/hit detection on exit from callback
> > >   (returns are not prune points at the moment), currently it is done
> > >   on entry.
> >
> > I'm a bit confused. What's ctx in the above example? And why loop
> > detection doesn't detect for(;;) loop right now?
>
> It's an implementation detail for the fix sketch shared in the parent
> email. It can catch cases like this:
>
>     ... any insn ...;
>     for (;;)
>         (*ctx)++;
>     return 0;
>
> But cannot catch case like this:
>
>     for (;;)
>         (*ctx)++;
>     return 0;
>
> In that sketch I jump to the callback start from callback return and
> callback entry needs two checks:
> - iteration convergence
> - simple looping
> Because of the code structure only iteration convergence check was done.
> Locally, I fixed this issue by jumping to the callback call instruction
> instead.

wouldn't this be a problem for just any subprog if we don't check the
looping condition on the entry instruction? Perhaps that's a separate
issue that needs a generic fix?

>
> > > - the callback as below currently causes state explosion:
> > >
> > >   static int precise1_callback(__u32 idx, struct precise1_ctx *ctx)
> > >   {
> > >       if (idx == 0)
> > >           ctx->a = 1;
> > >       if (idx == 1 && ctx->a == 1)
> > >           ctx->b = 1;
> > >       return 0;
> > >   }
> >
> > why state explosion? there should be a bunch of different branches
> > (idx 0, 1, something else x ctx->a = 1 or not 1 and ctx->b being 1 or
> > not), but it should be a fixed number of states? Do you know what
> > causes the explosion?
>
> I forgot to do mark_force_checkpoint() at callback entry. Fixed locally.

ok, makes sense

>
> >
> > >
> > >   I'm not sure yet what to do about this, there are several possibilities:
> > >   - tweak the order in which states are visited (need to think about it);
> > >   - check states in bpf_verifier_env::head (not explored yet) for
> > >     equivalence and avoid enqueuing duplicate states.
> > >
> > > I'll proceed addressing the issues above on Monday.
> > >
> > > Thanks,
> > > Eduard
> > >
> > > ---
> > >
> > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > index a3236651ec64..5589f55e42ba 100644
> > > --- a/include/linux/bpf_verifier.h
> > > +++ b/include/linux/bpf_verifier.h
> > > @@ -70,6 +70,17 @@ enum bpf_iter_state {
> > >         BPF_ITER_STATE_DRAINED,
> > >  };
> > >
> > > +struct bpf_iter {
> > > +       /* BTF container and BTF type ID describing
> > > +        * struct bpf_iter_<type> of an iterator state
> > > +        */
> > > +       struct btf *btf;
> > > +       u32 btf_id;
> > > +       /* packing following two fields to fit iter state into 16 bytes */
> > > +       enum bpf_iter_state state:2;
> > > +       int depth:30;
> > > +};
> > > +
> > >  struct bpf_reg_state {
> > >         /* Ordering of fields matters.  See states_equal() */
> > >         enum bpf_reg_type type;
> > > @@ -115,16 +126,7 @@ struct bpf_reg_state {
> > >                 } dynptr;
> > >
> > >                 /* For bpf_iter stack slots */
> > > -               struct {
> > > -                       /* BTF container and BTF type ID describing
> > > -                        * struct bpf_iter_<type> of an iterator state
> > > -                        */
> > > -                       struct btf *btf;
> > > -                       u32 btf_id;
> > > -                       /* packing following two fields to fit iter state into 16 bytes */
> > > -                       enum bpf_iter_state state:2;
> > > -                       int depth:30;
> > > -               } iter;
> > > +               struct bpf_iter iter;
> >
> > Let's not do this, conceptually processes are similar, but bpf_iter is
> > one thing, and this callback validation is another thing. Let's not
> > conflate things.
> >
> > >
> > >                 /* Max size from any of the above. */
> > >                 struct {
> >
> > [...]
>


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-20 16:20                 ` Eduard Zingerman
@ 2023-09-20 16:57                   ` Andrii Nakryiko
  0 siblings, 0 replies; 52+ messages in thread
From: Andrii Nakryiko @ 2023-09-20 16:57 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Wed, Sep 20, 2023 at 9:20 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2023-09-20 at 03:19 +0300, Eduard Zingerman wrote:
> [...]
> > Liveness is a precondition for all subsequent checks, so example
> > involving precision would also involve liveness. Here is a combined
> > example:
> >
> >     r8 = 0
> >     fp[-16] = 0
> >     r7 = -16
> >     r6 = bpf_get_current_pid_tgid()
> >     bpf_iter_num_new(&fp[-8], 0, 10)
> >     while (bpf_iter_num_next(&fp[-8])) {
> >       r6++
> >       if (r6 != 42) {
> >         r7 = -32
> >         continue;
> >       }
> >       r0 = r10
> >       r0 += r7
> >       r8 = *(u64 *)(r0 + 0)
> >     }
> >     bpf_iter_num_destroy(&fp[-8])
> >     return r8
> >
> > (Complete source code is at the end of the email).
> >
> > The call to bpf_iter_num_next() is reachable with r7 values -16 and -32.
> > State with r7=-16 is visited first, at which point r7 has no read mark
> > and is not marked precise.
> > State with r7=-32 is visited second:
> > - states_equal() for is_iter_next_insn() should ignore absence of
> >   REG_LIVE_READ mark on r7, otherwise both states would be considered
> >   identical;
> > - states_equal() for is_iter_next_insn() should ignore absence of
> >   precision mark on r7, otherwise both states would be considered
> >   identical.
> >
> > > > A possible fix is to add a special flag to states_equal() and
> > > > conditionally ignore logic related to liveness and precision when this
> > > > flag is set. Set this flag for is_iter_next_insn() branch above.
> > >
> > > Probably not completely ignore liveness (and maybe precision, but
> > > let's do it one step at a time), but only at least one of the
> > > registers or stack slots are marked as written or read in one of
> > > old/new states? Basically, if some register/slot at the end of
> > > iteration is read or modified, it must be important for loop
> > > invariant, so even if the parent state says it's not read, we still
> > > treat it as read. Can you please try that with just read/write marks
> > > and see if anything fails?
> >
> > Unfortunately for the example above neither liveness nor precision
> > marks are present when states are compared, e.g. here is the
> > (extended) log:
> >
> > 12: (85) call bpf_iter_num_next#52356
> > ...
> > is_iter_next (12):
> >   old:
> >      R0=scalar() R1_rw=fp-8 R6_r=scalar(id=1) R7=-16 R8_r=0 R10=fp0
> >      fp-8_r=iter_num(ref_id=2,state=active,depth=0) fp-16=00000000 refs=2
> >   new:
> >      R0=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) R1_w=fp-8 R6=42 R7_w=-32 R8=0 R10=fp0
> >      fp-8=iter_num(ref_id=2,state=active,depth=1) fp-16=00000000 refs=2
> > > hit
>
> It might be the case that I misread the situation a bit, because there
> is also R6, which is actually used to pick the branch. It has a read
> mark but no precision mark, because the jump was not predicted.
>
> In general, if we hit an old state with .branches > 0:
> - either there is an execution path "old -> safe exit";
> - or there is a loop (not necessarily because of iterator) and old is
>   a [grand]parent of the "cur" state.
>
> In either case, "cur" state is safe to prune if:
> - it is possible to show that it would take same jump branches as
>   "old" state;
> - and iteration depth differs.
>
> All registers that took part in conditional jumps should already have
> read marks in old state. However, not all jumps could be predicted,
> thus not all such registers are marked as precise.
>
> Unfortunately, treating read marks as precision marks is not sufficient.
> For the particular example above:
>
>     old: ... R6_r=scalar(id=1) ...
>     cur: ... R6=42 ...
>
> Here range_within() check in regsafe() would return true and old and
> cur would still be considered identical:
>
>     static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
>                 struct bpf_reg_state *rcur, struct bpf_idmap *idmap)
>     {
>         ...
>         switch (base_type(rold->type)) {
>         case SCALAR_VALUE:
>             ...
>             if (!rold->precise)
>                 return true;
>             ...
>             return range_within(rold, rcur) &&
>                    tnum_in(rold->var_off, rcur->var_off) &&
>                    check_scalar_ids(rold->id, rcur->id, idmap);
>         case ...
>         }
>         ...
>     }
>
> So it appears that there is no need to ignore liveness checks after all,
> but I don't see a way to make non-exact scalar comparisons at the moment.

Yeah, me neither. We can try something drastic like marking all
registers in the state as precise at the beginning of
iteration/callback execution. But I suspect that will pessimize
verification a lot and might start rejecting valid cases.

Don't know, I'm still trying to think of something that would work.

>
> > Note that for old state r7 value is not really important to get to exit,
> > the verification trace looks as follows:
> > 1. ... start ...
> > 2. bpf_iter_num_next(&fp[-8]) ;; r7=-16, checkpoint saved
> > 3. r6++
> > 4. r7 = -32
> > 5. bpf_iter_num_next(&fp[-8]) ;; push new state with r7=-32
> > 6. bpf_iter_num_destroy(&fp[-8])
> > 7. return 8
> >
> > The r7 liveness and precision is important for state scheduled at (5)
> > but it is not known yet, so when both states are compared marks are
> > absent in cur and in old.
> >
> > ---
> >
> > /* BTF FUNC records are not generated for kfuncs referenced
> >  * from inline assembly. These records are necessary for
> >  * libbpf to link the program. The function below is a hack
> >  * to ensure that BTF FUNC records are generated.
> >  */
> > void __kfunc_btf_root(void)
> > {
> >       bpf_iter_num_new(0, 0, 0);
> >       bpf_iter_num_next(0);
> >       bpf_iter_num_destroy(0);
> > }
> >
> > SEC("fentry/" SYS_PREFIX "sys_nanosleep")
> > __failure
> > __msg("20: (79) r8 = *(u64 *)(r0 +0)")
> > __msg("invalid read from stack R0 off=-32 size=8")
> > __naked int iter_next_exact(const void *ctx)
> > {
> >       asm volatile (
> >               "r8 = 0;"
> >               "*(u64 *)(r10 - 16) = r8;"
> >               "r7 = -16;"
> >               "call %[bpf_get_current_pid_tgid];"
> >               "r6 = r0;"
> >               "r1 = r10;"
> >               "r1 += -8;"
> >               "r2 = 0;"
> >               "r3 = 10;"
> >               "call %[bpf_iter_num_new];"
> >       "1:"
> >               "r1 = r10;"
> >               "r1 += -8;\n"
> >               "call %[bpf_iter_num_next];"
> >               "if r0 == 0 goto 2f;"
> >               "r6 += 1;"
> >               "if r6 != 42 goto 3f;"
> >               "r7 = -32;"
> >               "goto 1b;\n"
> >       "3:"
> >               "r0 = r10;"
> >               "r0 += r7;"
> >               "r8 = *(u64 *)(r0 + 0);"
> >               "goto 1b;\n"
> >       "2:"
> >               "r1 = r10;"
> >               "r1 += -8;"
> >               "call %[bpf_iter_num_destroy];"
> >               "r0 = r8;"
> >               "exit;"
> >               :
> >               : __imm(bpf_get_current_pid_tgid),
> >                 __imm(bpf_iter_num_new),
> >                 __imm(bpf_iter_num_next),
> >                 __imm(bpf_iter_num_destroy),
> >                 __imm(bpf_probe_read_user)
> >               : __clobber_all
> >       );
> > }
> >
>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-20 16:37           ` Andrii Nakryiko
@ 2023-09-20 17:13             ` Eduard Zingerman
  0 siblings, 0 replies; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-20 17:13 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu

On Wed, 2023-09-20 at 09:37 -0700, Andrii Nakryiko wrote:
> On Tue, Sep 19, 2023 at 5:06 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
[...]
> > This was a bit tricky but I think I figured an acceptable solution w/o
> > extra copies for r1-r5. The tricky part is the structure of
> > check_helper_call():
> > - collect arguments 'meta' info & check arguments
> > - call __check_func_call():
> >   - setup frame for callback;
> >   - schedule next instruction index to be callback entry;
> > - reset r1-r5 in caller's frame;
> > - set r0 in caller's frame.
> > 
> > The problem is that check_helper_call() resets caller's r1-r5
> > immediately. I figured that this reset could be done at BPF_EXIT
> > processing for callback instead => no extra copy needed.
> > 
> 
> I guess then r0 setting would have to happen at BPF_EXIT as well,
> right? Is that a problem?

Ideally yes, r0 should be set at BPF_EXIT, but that would require:
- splitting check_helper_call() in two parts;
- separate handling for helpers that don't call callbacks.

For now I decided against it, and r0 in the caller's frame is modified
immediately. This is safe because the check_helper_call() logic does not
rely on the r0 value (and check_helper_call() would be called again and
again for each new iteration). But it is a hack, and maybe a change in
the check_helper_call() structure is indeed necessary. I leave it out
for now as a secondary concern.

[...]
> > > > - loop detection is broken for simple callback as below:
> > > > 
> > > >   static int loop_callback_infinite(__u32 idx, __u64 *data)
> > > >   {
> > > >       for (;;)
> > > >           (*ctx)++;
> > > >       return 0;
> > > >   }
> > > > 
> > > >   To handle such code I need to change is_state_visited() to do
> > > >   callback iterator loop/hit detection on exit from callback
> > > >   (returns are not prune points at the moment), currently it is done
> > > >   on entry.
> > > 
> > > I'm a bit confused. What's ctx in the above example? And why loop
> > > detection doesn't detect for(;;) loop right now?
> > 
> > It's an implementation detail for the fix sketch shared in the parent
> > email. It can catch cases like this:
> > 
> >     ... any insn ...;
> >     for (;;)
> >         (*ctx)++;
> >     return 0;
> > 
> > But cannot catch case like this:
> > 
> >     for (;;)
> >         (*ctx)++;
> >     return 0;
> > 
> > In that sketch I jump to the callback start from callback return and
> > callback entry needs two checks:
> > - iteration convergence
> > - simple looping
> > Because of the code structure only iteration convergence check was done.
> > Locally, I fixed this issue by jumping to the callback call instruction
> > instead.
> 
> wouldn't this be a problem for just any subprog if we don't check the
> looping condition on the entry instruction? Perhaps that's a separate
> issue that needs generic fix?

This didn't occur to me. In the following example loop detection
indeed does not work; however, the verifier still bails out correctly
upon hitting the instruction processing limit:

    SEC("fentry/" SYS_PREFIX "sys_nanosleep")
    __failure
    int iter_next_infinite_loop(const void *ctx)
    {
    	struct bpf_iter_num it;

    	bpf_iter_num_new(&it, 0, 10);
    	for (;;)
    		bpf_iter_num_next(&it);
    	bpf_iter_num_destroy(&it);
    	return 0;
    }

[...]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-20  0:19               ` Eduard Zingerman
  2023-09-20 16:20                 ` Eduard Zingerman
@ 2023-09-21  9:14                 ` Alexei Starovoitov
  2023-09-21 11:03                   ` Eduard Zingerman
  1 sibling, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2023-09-21  9:14 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Tue, Sep 19, 2023 at 5:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> > >
> > > Note that R7=fp-16 in old state vs R7_w=57005 in cur state.
> > > The registers are considered equal because R7 does not have a read mark.
> > > However read marks are not yet finalized for old state because
> > > sl->state.branches != 0.

By "liveness marks are not finalized" you mean that
regs don't have LIVE_DONE?
That shouldn't matter to state comparison.
It only needs to see LIVE_READ.
LIVE_DONE is a sanity check for liveness algorithm only.
It doesn't affect correctness.

> > > (Note: precision marks are not finalized as
> > > well, which should be a problem, but this requires another example).
> >
> > I originally convinced myself that non-finalized precision marking is
> > fine, see the comment next to process_iter_next_call() for reasoning.
> > If you can have a dedicated example for precision that would be great.
> >
> > As for read marks... Yep, that's a bummer. That r7 is not used after
> > the loop, so is not marked as read, and it's basically ignored for
> > loop invariant checking, which is definitely not what was intended.
>
> Liveness is a precondition for all subsequent checks, so example
> involving precision would also involve liveness. Here is a combined
> example:
>
>     r8 = 0
>     fp[-16] = 0
>     r7 = -16
>     r6 = bpf_get_current_pid_tgid()
>     bpf_iter_num_new(&fp[-8], 0, 10)
>     while (bpf_iter_num_next(&fp[-8])) {
>       r6++
>       if (r6 != 42) {
>         r7 = -32
>         continue;
>       }
>       r0 = r10
>       r0 += r7
>       r8 = *(u64 *)(r0 + 0)
>     }
>     bpf_iter_num_destroy(&fp[-8])
>     return r8
>
> (Complete source code is at the end of the email).
>
> The call to bpf_iter_num_next() is reachable with r7 values -16 and -32.
> State with r7=-16 is visited first, at which point r7 has no read mark
> and is not marked precise.
> State with r7=-32 is visited second:
> - states_equal() for is_iter_next_insn() should ignore absence of
>   REG_LIVE_READ mark on r7, otherwise both states would be considered
>   identical;

I think assuming live_read on all regs in regsafe() when
a cb or iter body is being processed will have a big impact
on processed insns.
I suspect the underlying issue is different.

In the first pass of the body with r7=-16 the 'r0 += r7'
insn should have added the read mark to r7,
so is_iter_next_insn after 2nd pass with r7=-32 should reach
case SCALAR_VALUE in regsafe().
There we need to do something with lack of precision in r7=-16,
but assume that is fixed, the 3rd iter of the loop should continue
and 'r0 += r7' in the _3rd_ iter of the loop should propagate
read mark all the way to r7=-32 reg.
I mean the parentage chain between regs should link
regs of iterations.
I believe the process_iter_next_call() logic maintains
correct parentage chain, since it's just a push_stack()
and is_state_visited() should be connecting parents.
So in the case of the iter body the issue with the above loop is only
the missing precision,
but for your cb verification code I suspect the issue is due
to a lack of parentage chain, so liveness is not propagated?

I could be completely off the mark.
The discussion is hard to follow.
It's probably time to post a raw patch and continue with code.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-21  9:14                 ` Alexei Starovoitov
@ 2023-09-21 11:03                   ` Eduard Zingerman
  2023-09-21 12:56                     ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-21 11:03 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Thu, 2023-09-21 at 02:14 -0700, Alexei Starovoitov wrote:
> On Tue, Sep 19, 2023 at 5:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > > > 
> > > > Note that R7=fp-16 in old state vs R7_w=57005 in cur state.
> > > > The registers are considered equal because R7 does not have a read mark.
> > > > However read marks are not yet finalized for old state because
> > > > sl->state.branches != 0.
> 
> By "liveness marks are not finalized" you mean that
> regs don't have LIVE_DONE?
> That shouldn't matter to state comparison.
> It only needs to see LIVE_READ.
> LIVE_DONE is a sanity check for liveness algorithm only.
> It doesn't affect correctness.

No, actually I mean that the state where R7 is read had not been seen
yet, and thus the read mark had not been propagated to the parent state.
See below.

> > > > (Note: precision marks are not finalized as
> > > > well, which should be a problem, but this requires another example).
> > > 
> > > I originally convinced myself that non-finalized precision marking is
> > > fine, see the comment next to process_iter_next_call() for reasoning.
> > > If you can have a dedicated example for precision that would be great.
> > > 
> > > As for read marks... Yep, that's a bummer. That r7 is not used after
> > > the loop, so is not marked as read, and it's basically ignored for
> > > loop invariant checking, which is definitely not what was intended.
> > 
> > Liveness is a precondition for all subsequent checks, so example
> > involving precision would also involve liveness. Here is a combined
> > example:
> > 
> >     r8 = 0
> >     fp[-16] = 0
> >     r7 = -16
> >     r6 = bpf_get_current_pid_tgid()
> >     bpf_iter_num_new(&fp[-8], 0, 10)
> >     while (bpf_iter_num_next(&fp[-8])) {
> >       r6++
> >       if (r6 != 42) {
> >         r7 = -32
> >         continue;
> >       }
> >       r0 = r10
> >       r0 += r7
> >       r8 = *(u64 *)(r0 + 0)
> >     }
> >     bpf_iter_num_destroy(&fp[-8])
> >     return r8
> > 
> > (Complete source code is at the end of the email).
> > 
> > The call to bpf_iter_num_next() is reachable with r7 values -16 and -32.
> > State with r7=-16 is visited first, at which point r7 has no read mark
> > and is not marked precise.
> > State with r7=-32 is visited second:
> > - states_equal() for is_iter_next_insn() should ignore absence of
> >   REG_LIVE_READ mark on r7, otherwise both states would be considered
> >   identical;
> 
> I think assuming live_read on all regs in regsafe() when
> cb or iter body is being processed will have big impact
> on processed insns.
> I suspect the underlying issue is different.
> 
> In the first pass of the body with r7=-16 the 'r0 += r7'
> insn should have added the read mark to r7,
> so is_iter_next_insn after 2nd pass with r7=-32 should reach
> case SCALAR_VALUE in regsafe().
> There we need to do something with lack of precision in r7=-16,
> but assume that is fixed, the 3rd iter of the loop should continue
> and 'r0 += r7' in the _3rd_ iter of the loop should propagate
> read mark all the way to r7=-32 reg.

I repeat the complete example with added instruction numbers in the
end of the email. As of now verifier takes the following paths:

  First:
    // First path is for "drained" iterator and is not very interesting.
    0: (b7) r8 = 0                        ; R8_w=0
       ...
    2: (b7) r7 = -16                      ; R7_w=-16
       ...
    4: (bf) r6 = r0                       ; R0_w=scalar(id=1) R6_w=scalar(id=1)
       ...
    12: (85) call bpf_iter_num_next#63182
       ; R0_w=0 fp-8=iter_num(ref_id=2,state=drained,depth=0) refs=2
    13: (15) if r0 == 0x0 goto pc+8
    22: (bf) r1 = r10
       ...
    26: (95) exit
  
  Second:
    // Note: at 13 a first checkpoint with "active" iterator state is reached
    //       this checkpoint is created for R7=-16 w/o read mark.
    from 12 to 13: R0_w=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) R6=scalar(id=1)
                   R7=-16 R8=0 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1)
                   fp-16=00000000 refs=2
    13: (15) if r0 == 0x0 goto pc+8       ; R0_w=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) refs=2
    14: (07) r6 += 1                      ; R6=scalar() refs=2
    15: (55) if r6 != 0x2a goto pc+2      ; R6=42 refs=2
    16: (b7) r7 = -32                     ; R7_w=-32 refs=2
    17: (05) goto pc-8
    10: (bf) r1 = r10                     ; R1_w=fp0 R10=fp0 refs=2
    11: (07) r1 += -8
      is_iter_next (12):
        old:
           R0=scalar() R1_rw=fp-8 R6_r=scalar(id=1) R7=-16 R8_r=0 R10=fp0
           fp-8_r=iter_num(ref_id=2,state=active,depth=0) fp-16=00000000 refs=2
        cur:
           R0=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) R1_w=fp-8 R6=42 R7_w=-32
           R8=0 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) fp-16=00000000 refs=2
      > hit
    12: safe
    // At this point R7=-32 but it still lacks read or precision marks,
    // thus states_equal() called from is_state_visited() for is_iter_next_insn()
    // branch returns true. (I added some logging to make it clear above)
    
  Third:
    // This is r6 != 42 branch, it hits same checkpoint as "second" path.
    // Note that this branch has "r0 += r7" and will propagate read mark
    // for R7 to the "old" checkpoint but that's too late, R7=-32 state
    // has already been pruned.
    from 15 to 18: R0=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) R6=scalar()
                   R7=-16 R8=0 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1)
                   fp-16=00000000 refs=2
    18: (bf) r0 = r10                     ; R0_w=fp0 R10=fp0 refs=2
    19: (0f) r0 += r7
    20: (79) r8 = *(u64 *)(r0 +0)         ; R0_w=fp-16 R8_w=P0 fp-16=00000000 refs=2
    21: (05) goto pc-12
    10: (bf) r1 = r10                     ; R1_w=fp0 R10=fp0 refs=2
    11: (07) r1 += -8
      is_iter_next (12):
        old:
           R0=scalar() R1_rw=fp-8 R6_r=scalar(id=1) R7_r=P-16 R8_r=0 R10=fp0
           fp-8_r=iter_num(ref_id=2,state=active,depth=0) fp-16_r=00000000 refs=2
        cur:
           R0_w=fp-16 R1_w=fp-8 R6=scalar() R7=-16 R8_w=P0 R10=fp0
           fp-8=iter_num(ref_id=2,state=active,depth=1) fp-16=00000000 refs=2
      > hit
    12: safe

Here on the second path the state with R7=-32 is pruned and this is
not safe. Registers that differ in this state are R6 and R7. I can
update the example so that there would be no difference in R6, so
let's ignore that for now. As for R7, in the "old" state it is not
marked read because children states of "old" that read R7 had not been
visited yet.

> I mean the parentage chain between regs should link
> regs of iterations.
> I believe the process_iter_next_call() logic maintains
> correct parentage chain, since it's just a push_stack()
> and is_state_visited() should be connecting parents.
> So in case of iter body the issue with above loop is only
> in missing precision,
> but for your cb verification code I suspect the issue is due
> to lack of parentage chain and liveness is not propagated ?

Right, the parentage chain is preserved by both process_iter_next_call()
and the cb processing code, as both use push_stack().

> I could be completely off the mark.
> The discussion is hard to follow.
> It's probably time to post raw patch and continue with code.

The "fix" that I have locally is not really a fix. It forces "exact"
state comparisons for the is_iter_next_insn() case:
1. liveness marks are ignored;
2. precision marks are ignored;
3. scalars are compared using regs_exact().

It breaks a number of tests, because iterator "entry" states
are no longer equal and the upper iteration bound is not tracked:
- iters/simplest_loop
- iters/iter_obfuscate_counter
- iters/testmod_seq_truncated
- iters/num
- iters/testmod_seq
- linked_list/pop_front_off
- linked_list/pop_back_off

The log above shows that (1) and (2) are inevitable.
(But they might be relaxed if some data flow analysis that marks
 registers which *might* be precise or read is done before the
 main verification pass.)
For (3), I'm not sure and need to think a bit more, but the "range_within"
logic seems fishy for states with .branches > 0.

--- complete example with insn numbers ---

SEC("fentry/" SYS_PREFIX "sys_nanosleep")
__failure
__msg("20: (79) r8 = *(u64 *)(r0 +0)")
__msg("invalid read from stack R0 off=-32 size=8")
__naked int iter_next_exact(void)
{
	/* r8 = 0
	 * fp[-16] = 0
	 * r7 = -16
	 * r6 = bpf_get_current_pid_tgid()
	 * bpf_iter_num_new(&fp[-8], 0, 10)
	 * while (bpf_iter_num_next(&fp[-8])) {
	 *   r6++
	 *   if (r6 == 42) {
	 *     r7 = -32
	 *     continue;
	 *   }
	 *   r0 = r10
	 *   r0 += r7
	 *   r8 = *(u64 *)(r0 + 0)
	 * }
	 * bpf_iter_num_destroy(&fp[-8])
	 * return r8
	 */
	asm volatile (
		"r8 = 0;"                            // 0
		"*(u64 *)(r10 - 16) = r8;"           // 1
		"r7 = -16;"                          // 2
		"call %[bpf_get_current_pid_tgid];"  // 3
		"r6 = r0;"                           // 4
		"r1 = r10;"                          // 5
		"r1 += -8;"                          // 6
		"r2 = 0;"                            // 7
		"r3 = 10;"                           // 8
		"call %[bpf_iter_num_new];"          // 9
	"1:"
		"r1 = r10;"                          // 10
		"r1 += -8;\n"                        // 11
		"call %[bpf_iter_num_next];"         // 12
		"if r0 == 0 goto 2f;"                // 13
		"r6 += 1;"                           // 14
		"if r6 != 42 goto 3f;"               // 15
		"r7 = -32;"                          // 16
		"goto 1b;\n"                         // 17
	"3:"
		"r0 = r10;"                          // 18
		"r0 += r7;"                          // 19
		"r8 = *(u64 *)(r0 + 0);"             // 20
		"goto 1b;\n"                         // 21
	"2:"
		"r1 = r10;"                          // 22
		"r1 += -8;"                          // 23
		"call %[bpf_iter_num_destroy];"      // 24
		"r0 = r8;"                           // 25
		"exit;"                              // 26
		:
		: __imm(bpf_get_current_pid_tgid),
		  __imm(bpf_iter_num_new),
		  __imm(bpf_iter_num_next),
		  __imm(bpf_iter_num_destroy),
		  __imm(bpf_probe_read_user)
		: __clobber_all
	);
}

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-21 11:03                   ` Eduard Zingerman
@ 2023-09-21 12:56                     ` Alexei Starovoitov
  2023-09-21 16:23                       ` Eduard Zingerman
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2023-09-21 12:56 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Thu, Sep 21, 2023 at 4:03 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> I repeat the complete example with added instruction numbers in the
> end of the email. As of now verifier takes the following paths:
>
>   First:
>     // First path is for "drained" iterator and is not very interesting.
>     0: (b7) r8 = 0                        ; R8_w=0
>        ...
>     2: (b7) r7 = -16                      ; R7_w=-16
>        ...
>     4: (bf) r6 = r0                       ; R0_w=scalar(id=1) R6_w=scalar(id=1)
>        ...
>     12: (85) call bpf_iter_num_next#63182
>        ; R0_w=0 fp-8=iter_num(ref_id=2,state=drained,depth=0) refs=2
>     13: (15) if r0 == 0x0 goto pc+8
>     22: (bf) r1 = r10
>        ...
>     26: (95) exit
>
>   Second:
>     // Note: at 13 a first checkpoint with "active" iterator state is reached
>     //       this checkpoint is created for R7=-16 w/o read mark.

Thanks for the detailed example.
The analysis in my previous email was based on the C code,
where I assumed the asm generated by the compiler for
if (r6 != 42)
would be
if (unlikely(r6 != 42))
with the fallthrough insn after the 'if' insn being 'r0 = r10'.
Now I see that the asm matches if (likely(r6 != 42)).
I suspect if you use that in C code you wouldn't need to
write the test in asm.
Just a thought.

>     from 12 to 13: R0_w=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) R6=scalar(id=1)
>                    R7=-16 R8=0 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1)
>                    fp-16=00000000 refs=2
>     13: (15) if r0 == 0x0 goto pc+8       ; R0_w=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) refs=2
>     14: (07) r6 += 1                      ; R6=scalar() refs=2
>     15: (55) if r6 != 0x2a goto pc+2      ; R6=42 refs=2
>     16: (b7) r7 = -32                     ; R7_w=-32 refs=2
>     17: (05) goto pc-8
>     10: (bf) r1 = r10                     ; R1_w=fp0 R10=fp0 refs=2
>     11: (07) r1 += -8
>       is_iter_next (12):
>         old:
>            R0=scalar() R1_rw=fp-8 R6_r=scalar(id=1) R7=-16 R8_r=0 R10=fp0
>            fp-8_r=iter_num(ref_id=2,state=active,depth=0) fp-16=00000000 refs=2
>         cur:
>            R0=rdonly_mem(id=3,ref_obj_id=2,off=0,imm=0) R1_w=fp-8 R6=42 R7_w=-32
>            R8=0 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) fp-16=00000000 refs=2
>       > hit
>     12: safe
>     // At this point R7=-32 but it still lacks read or precision marks,
>     // thus states_equal() called from is_state_visited() for is_iter_next_insn()
>     // branch returns true. (I added some logging to make it clear above)

Maybe instead of brute-forcing all regs to live and precise
we can add an iter.depth check to stacksafe() such
that depth=0 != depth=1, but
depth=1 == depthN ?
(and a tweak to iter_active_depths_differ() as well)

Then in the above r7 will be 'equivalent', but fp-8 will differ,
then the state with r7=-32 won't be pruned
and it will address this particular example ? or not ?

Even if it does the gut feel that it won't be enough though,
but I wanted to mention this to brainstorm, since below:

> The "fix" that I have locally is not really a fix. It forces "exact"
> states comparisons for is_iter_next_insn() case:
> 1. liveness marks are ignored;
> 2. precision marks are ignored;
> 3. scalars are compared using regs_exact().

is too drastic.

> It breaks a number of tests, because iterator "entry" states
> are no longer equal and the upper iteration bound is not tracked:
> - iters/simplest_loop

that's a clear sign that forcing 1,2,3 is a dead end.

Another idea is to add another state.branches specifically for loop body
and keep iterating the body until branches==0.
Same concept as the verifier uses for the whole prog, but localized
to a loop to make sure we don't declare 'equivalent' state
until all paths in the loop body are explored.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-21 12:56                     ` Alexei Starovoitov
@ 2023-09-21 16:23                       ` Eduard Zingerman
  2023-09-21 16:35                         ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-21 16:23 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Thu, 2023-09-21 at 05:56 -0700, Alexei Starovoitov wrote:
[...]
> Now I see that asm matches if (likely(r6 != 42)).
> I suspect if you use that in C code you wouldn't need to
> write the test in asm.
> Just a thought.

Thank you, this does change test behavior; however, the compiler still
decides to partially unroll the loop for whatever reason. Will stick to
asm snippets for now.

> Maybe instead of brute forcing all regs to live and precise
> we can add iter.depth check to stacksafe() such
> that depth=0 != depth=1, but
> depth=1 == depthN ?
> (and a tweak to iter_active_depths_differ() as well)
> 
> Then in the above r7 will be 'equivalent', but fp-8 will differ,
> then the state with r7=-32 won't be pruned
> and it will address this particular example ? or not ?

This does help for the particular example, however a simple
modification can still trick the verifier:

     ...
     r6 = bpf_get_current_pid_tgid()
     bpf_iter_num_new(&fp[-8], 0, 10)
+    bpf_iter_num_next(&fp[-8])
     while (bpf_iter_num_next(&fp[-8])) {
       ...
     }
     ...

> Another idea is to add another state.branches specifically for loop body
> and keep iterating the body until branches==0.
> Same concept as the verifier uses for the whole prog, but localized
> to a loop to make sure we don't declare 'equivalent' state
> until all paths in the loop body are explored.

I'm not sure I understand the idea. If we count branches, there will
always be back-edges leading to new branches. Or do you suggest not
pruning "equivalent" loop states until all basic blocks in the loop are
visited? (So that all read marks are propagated and probably all
precision marks).

I still doubt the range_within() check for the is_iter_next_insn() case
but can't come up with a counterexample.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-21 16:23                       ` Eduard Zingerman
@ 2023-09-21 16:35                         ` Andrii Nakryiko
  2023-09-21 18:16                           ` Eduard Zingerman
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2023-09-21 16:35 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Thu, Sep 21, 2023 at 9:23 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2023-09-21 at 05:56 -0700, Alexei Starovoitov wrote:
> [...]
> > Now I see that asm matches if (likely(r6 != 42)).
> > I suspect if you use that in C code you wouldn't need to
> > write the test in asm.
> > Just a thought.
>
> Thank you this does change test behavior, however compiler still decides
> to partially unroll the loop for whatever reason. Will stick to asm
> snippets for now.
>
> > Maybe instead of brute forcing all regs to live and precise
> > we can add iter.depth check to stacksafe() such
> > that depth=0 != depth=1, but
> > depth=1 == depthN ?
> > (and a tweak to iter_active_depths_differ() as well)
> >
> > Then in the above r7 will be 'equivalent', but fp-8 will differ,
> > then the state with r7=-32 won't be pruned
> > and it will address this particular example ? or not ?
>
> This does help for the particular example, however a simple
> modification can still trick the verifier:
>
>      ...
>      r6 = bpf_get_current_pid_tgid()
>      bpf_iter_num_new(&fp[-8], 0, 10)
> +    bpf_iter_num_next(&fp[-8])
>      while (bpf_iter_num_next(&fp[-8])) {
>        ...
>      }
>      ...
>
> > Another idea is to add another state.branches specifically for loop body
> > and keep iterating the body until branches==0.
> > Same concept as the verifier uses for the whole prog, but localized
> > to a loop to make sure we don't declare 'equivalent' state
> > until all paths in the loop body are explored.
>
> I'm not sure I understand the idea. If we count branches there always
> would be back-edges leading to new branches. Or do you suggest to not
> prune "equivalent" loop states until all basic blocks in the loop are
> visited? (So that all read marks are propagated and probably all
> precision marks).
>

I've been thinking in a similar direction as Alexei, overnight. Here
are some raw thoughts.

I think the overall approach with iter verification is sound. If we
loop and see an equivalent state at the entry to next loop iteration,
then it's safe to assume doing many iterations is safe. The problem is
that delayed precision and read marks make this state equivalence
wrong in some case. So we need to find a solution that will ensure
that all precision and read marks are propagated to parent state
before making a decision about convergence.

The idea is to let verifier explore all code paths starting from
iteration #N, except the code paths that lead to looping into
iteration #N+1. I tried to do that by validating the NULL case first and
exiting from the loop on each iteration, but clearly that doesn't
capture all the cases, as Eduard has shown.

So what if we delay convergence state checks (and then further
exploration at iteration #N+1) using BFS instead of DFS? That is, when
we are about to start a new iteration and check state convergence, we
enqueue the current state to be processed later, after all the states
that "belong" to iteration #N are processed.

We should work out the exact details of how to do this hybrid BFS+DFS,
but let's think carefully about whether it will solve the problems.

I think this is conceptually similar to what Alexei proposed above.
Basically, we "unroll" loop verification iteration by iteration, but
make sure that we explore all the branching within iteration #N before
going one iteration deeper.

Let's think if there are any cases which wouldn't be handled. And then
think how to implement this elegantly (e.g., some sort of queue within
a parent state, which sounds similar to this separate "branches"
counter that Alexei is proposing above).


> I'm still doubt range_within() check for is_iter_next_insn() case but
> can't come up with counter example.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-21 16:35                         ` Andrii Nakryiko
@ 2023-09-21 18:16                           ` Eduard Zingerman
  2023-09-22  1:01                             ` Eduard Zingerman
  0 siblings, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-21 18:16 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Thu, 2023-09-21 at 09:35 -0700, Andrii Nakryiko wrote:
> I've been thinking in a similar direction as Alexei, overnight. Here
> are some raw thoughts.
> 
> I think the overall approach with iter verification is sound. If we
> loop and see an equivalent state at the entry to next loop iteration,
> then it's safe to assume doing many iterations is safe. The problem is
> that delayed precision and read marks make this state equivalence
> wrong in some case. So we need to find a solution that will ensure
> that all precision and read marks are propagated to parent state
> before making a decision about convergence.
> 
> The idea is to let verifier explore all code paths starting from
> iteration #N, except the code paths that lead to looping into
> iteration #N+1. I tried to do that with validating NULL case first and
> exiting from loop on each iteration (first), but clearly that doesn't
> capture all the cases, as Eduard have shown.
>
> So what if we delay convergence state checks (and then further
> exploration at iteration #N+1) using BFS instead of DFS? That is, when
> we are about to start new iteration and check state convergence, we
> enqueue current state to be processed later after all the states that
> "belong" to iteration #N are processed.

This sounds correct if one iterator is involved.

> We should work out exact details on how to do this hybrid BFS+DFS, but
> let's think carefully if this will solve the problems?
> 
> I think this is conceptually similar to what Alexei proposed above.
> Basically, we "unroll" loop verification iteration by iteration, but
> make sure that we explore all the branching within iteration #N before
> going one iteration deeper.
> 
> Let's think if there are any cases which wouldn't be handled. And then
> think how to implement this elegantly (e.g., some sort of queue within
> a parent state, which sounds similar to this separate "branches"
> counter that Alexei is proposing above).

To better understand the suggestion, suppose there is some iterator
'I' and two states:
- state S1 where depth(I) == N and pending instruction is "next(I)"
- state S2 where depth(I) == N and pending instruction is *not* "next(I)"
In such situation state S2 should be verified first, right?
And in general, any state that is not at "next(I)" should be explored
before S1, right?

Such interpretation seems to be prone to deadlocks, e.g. suppose there
are two iterators: I1 and I2, and two states:
- state S1 with depth(I1) == N, depth(I2) == M, at instruction "next(I1)";
- state S2 with depth(I1) == N, depth(I2) == M, at instruction "next(I2)".

E.g. like in the following loop:

    for (;;) {
      if (<random condition>) {
        if (!next(it1)) break; // a
      } else {
        if (!next(it2)) break; // b
      }
      ...
    }

I think it is possible to get to two states here:
- (a) it1(active, depth 0), it2(active, depth 0) at insn "next(it1)"
- (b) it1(active, depth 0), it2(active, depth 0) at insn "next(it2)"

And it is unclear which one should be processed next.
Am I missing something?


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-21 18:16                           ` Eduard Zingerman
@ 2023-09-22  1:01                             ` Eduard Zingerman
  2023-09-22  2:48                               ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-22  1:01 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Thu, 2023-09-21 at 21:16 +0300, Eduard Zingerman wrote:
> On Thu, 2023-09-21 at 09:35 -0700, Andrii Nakryiko wrote:
> > I've been thinking in a similar direction as Alexei, overnight. Here
> > are some raw thoughts.
> > 
> > I think the overall approach with iter verification is sound. If we
> > loop and see an equivalent state at the entry to next loop iteration,
> > then it's safe to assume doing many iterations is safe. The problem is
> > that delayed precision and read marks make this state equivalence
> > wrong in some case. So we need to find a solution that will ensure
> > that all precision and read marks are propagated to parent state
> > before making a decision about convergence.
> > 
> > The idea is to let verifier explore all code paths starting from
> > iteration #N, except the code paths that lead to looping into
> > iteration #N+1. I tried to do that with validating NULL case first and
> > exiting from loop on each iteration (first), but clearly that doesn't
> > capture all the cases, as Eduard have shown.
> > 
> > So what if we delay convergence state checks (and then further
> > exploration at iteration #N+1) using BFS instead of DFS? That is, when
> > we are about to start new iteration and check state convergence, we
> > enqueue current state to be processed later after all the states that
> > "belong" to iteration #N are processed.
> 
> This sounds correct if one iterator is involved.
> 
> > We should work out exact details on how to do this hybrid BFS+DFS, but
> > let's think carefully if this will solve the problems?
> > 
> > I think this is conceptually similar to what Alexei proposed above.
> > Basically, we "unroll" loop verification iteration by iteration, but
> > make sure that we explore all the branching within iteration #N before
> > going one iteration deeper.
> > 
> > Let's think if there are any cases which wouldn't be handled. And then
> > think how to implement this elegantly (e.g., some sort of queue within
> > a parent state, which sounds similar to this separate "branches"
> > counter that Alexei is proposing above).
> 
> To better understand the suggestion, suppose there is some iterator
> 'I' and two states:
> - state S1 where depth(I) == N and pending instruction is "next(I)"
> - state S2 where depth(I) == N and pending instruction is *not* "next(I)"
> In such situation state S2 should be verified first, right?
> And in general, any state that is not at "next(I)" should be explored
> before S1, right?
> 
> Such interpretation seems to be prone to deadlocks, e.g. suppose there
> are two iterators: I1 and I2, and two states:
> - state S1 with depth(I1) == N, depth(I2) == M, at instruction "next(I1)";
> - state S2 with depth(I1) == N, depth(I2) == M, at instruction "next(I2)".
> 
> E.g. like in the following loop:
> 
>     for (;;) {
>       if (<random condition>)
>         if (!next(it1)) break; // a
>       else
>         if (!next(it2)) break; // b
>       ...
>     }
> 
> I think it is possible to get to two states here:
> - (a) it1(active, depth 0), it2(active, depth 0) at insn "next(it1)"
> - (b) it1(active, depth 0), it2(active, depth 0) at insn "next(it2)"
> 
> And it is unclear which one should be processed next.
> Am I missing something?

Not sure if such ordering issues are real issues, or whether any of
the candidates could simply be picked.

How about a dumber solution which, imho, is easier to convince
oneself is correct (at least from a logical point of view, if not
necessarily implementation-wise): substitute the goal of finding exact
precision marks with the goal of finding conservative precision marks.
These marks could be used instead of the exact ones in states_equal()
when .branches > 0.

A straightforward way to compute such marks is to run a flow-sensitive
context/path-insensitive backwards DFA before do_check(). The result
of this DFA is a bitmask encoding which registers / stack slots may be
precise at each instruction. Information for instructions other than
is_iter_next_insn() could be discarded.

Looking at the tests that are currently failing with my local fixes
(which force exact state comparisons for .branches > 0) I think that
the DFA-based version would cover all of them.

This sure adds some code to the verifier; however, changing the
current state traversal logic from DFS to a combination of DFS+BFS is
a significant change as well.

Wdyt?


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-22  1:01                             ` Eduard Zingerman
@ 2023-09-22  2:48                               ` Andrii Nakryiko
  2023-09-22 18:36                                 ` Eduard Zingerman
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2023-09-22  2:48 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Thu, Sep 21, 2023 at 6:01 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2023-09-21 at 21:16 +0300, Eduard Zingerman wrote:
> > On Thu, 2023-09-21 at 09:35 -0700, Andrii Nakryiko wrote:
> > > I've been thinking in a similar direction as Alexei, overnight. Here
> > > are some raw thoughts.
> > >
> > > I think the overall approach with iter verification is sound. If we
> > > loop and see an equivalent state at the entry to next loop iteration,
> > > then it's safe to assume doing many iterations is safe. The problem is
> > > that delayed precision and read marks make this state equivalence
> > > wrong in some case. So we need to find a solution that will ensure
> > > that all precision and read marks are propagated to parent state
> > > before making a decision about convergence.
> > >
> > > The idea is to let verifier explore all code paths starting from
> > > iteration #N, except the code paths that lead to looping into
> > > iteration #N+1. I tried to do that with validating NULL case first and
> > > exiting from loop on each iteration (first), but clearly that doesn't
> > > capture all the cases, as Eduard have shown.
> > >
> > > So what if we delay convergence state checks (and then further
> > > exploration at iteration #N+1) using BFS instead of DFS? That is, when
> > > we are about to start new iteration and check state convergence, we
> > > enqueue current state to be processed later after all the states that
> > > "belong" to iteration #N are processed.
> >
> > This sounds correct if one iterator is involved.
> >
> > > We should work out exact details on how to do this hybrid BFS+DFS, but
> > > let's think carefully if this will solve the problems?
> > >
> > > I think this is conceptually similar to what Alexei proposed above.
> > > Basically, we "unroll" loop verification iteration by iteration, but
> > > make sure that we explore all the branching within iteration #N before
> > > going one iteration deeper.
> > >
> > > Let's think if there are any cases which wouldn't be handled. And then
> > > think how to implement this elegantly (e.g., some sort of queue within
> > > a parent state, which sounds similar to this separate "branches"
> > > counter that Alexei is proposing above).
> >
> > To better understand the suggestion, suppose there is some iterator
> > 'I' and two states:
> > - state S1 where depth(I) == N and pending instruction is "next(I)"
> > - state S2 where depth(I) == N and pending instruction is *not* "next(I)"
> > In such situation state S2 should be verified first, right?
> > And in general, any state that is not at "next(I)" should be explored
> > before S1, right?
> >
> > Such interpretation seems to be prone to deadlocks, e.g. suppose there
> > are two iterators: I1 and I2, and two states:
> > - state S1 with depth(I1) == N, depth(I2) == M, at instruction "next(I1)";
> > - state S2 with depth(I1) == N, depth(I2) == M, at instruction "next(I2)".
> >
> > E.g. like in the following loop:
> >
> >     for (;;) {
> >       if (<random condition>)
> >         if (!next(it1)) break; // a
> >       else
> >         if (!next(it2)) break; // b
> >       ...
> >     }
> >
> > I think it is possible to get to two states here:
> > - (a) it1(active, depth 0), it2(active, depth 0) at insn "next(it1)"
> > - (b) it1(active, depth 0), it2(active, depth 0) at insn "next(it2)"
> >
> > And it is unclear which one should be processed next.
> > Am I missing something?
>
> Not sure if such ordering issues a real issues or any of the
> candidates could be picked.
>

Yes, my gut feeling was that if this idea works at all, then ordering
for this won't matter. The question is if the idea itself is sound.
Basically, I need to convince myself that subsequent iterations won't
add any new read/precise marks. You are good at counter examples, so
maybe you can come up with an example where input state into iteration
#1 will get more precision marks only at iteration #2 or deeper. If
that's the case, then this whole idea of postponing loop convergence
checks probably doesn't work.

> How about a more dumb solution which, imho, is easier to convince
> oneself is correct (at-least from logical pov, not necessarily
> implementation): substitute the goal of finding exact precision marks
> with the goal of finding conservative precision marks. These marks
> could be used instead of exact for states_equal() when .branches > 0.
>
> A straightforward way to compute such marks is to run a flow-sensitive
> context/path-insensitive backwards DFA before do_check(). The result
> of this DFA is a bitmask encoding which registers / stack slots may be
> precise at each instruction. Information for instructions other than
> is_iter_next_insn() could be discarded.
>
> Looking at the tests that are currently failing with my local fixes
> (which force exact states comparions for .branches > 0) I think that
> DFA based version would cover all of them.
>
> This sure adds some code to the verifier, however changing current
> states traversal logic from DFS to combination of DFS+BFS is a
> significant change as well.

I don't think adding BFS sequencing is a lot of code. It's going to be
a small amount of simple code with potentially intricate behavior, but
code-wise it should be pretty minimal.

>
> Wdyt?

I can't say I understand what exactly you are proposing and how you
are going to determine these conservative precision marks. But I'd
like to learn some new ideas, so please elaborate :)


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-22  2:48                               ` Andrii Nakryiko
@ 2023-09-22 18:36                                 ` Eduard Zingerman
  2023-09-22 20:52                                   ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-22 18:36 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Thu, 2023-09-21 at 19:48 -0700, Andrii Nakryiko wrote:
> Yes, my gut feeling was that if this idea works at all, then ordering
> for this won't matter. The question is if the idea itself is sound.
> Basically, I need to convince myself that subsequent iterations won't
> add any new read/precise marks. You are good at counter examples, so
> maybe you can come up with an example where input state into iteration
> #1 will get more precision marks only at iteration #2 or deeper. If
> that's the case, then this whole idea of postponing loop convergence
> checks probably doesn't work.

Consider the following code:

    1.  SEC("fentry/" SYS_PREFIX "sys_nanosleep")
    2.  int num_iter_bug(const void *ctx) {
    3.      struct bpf_iter_num it;
    4.      __u64 val = 0;
    5.      __u64 *ptr = &val;
    6.      __u64 rnd = bpf_get_current_pid_tgid();
    7.
    8.      bpf_iter_num_new(&it, 0, 10);
    9.      while (bpf_iter_num_next(&it)) {
    10.          rnd++;
    11.          if (rnd == 42) {
    12.              ptr = (void*)(0xdead);
    13.              continue;
    14.          }
    15.          if (!bpf_iter_num_next(&it))
    16.              break;
    17.          bpf_probe_read_user(ptr, 8, (void*)(0xdeadbeef));
    18.      }
    19.      bpf_iter_num_destroy(&it);
    20.      return 0;
    21. }

As far as I understand the following verification paths would be
explored (ignoring traces with drained iterators):
- entry:
  - 8,9,10,11,12,13:
    - at 9 checkpoint (C1) would be created with it{.depth=0,.state=active},ptr=&val;
    - at 11 branch (B1) would be created with it{.depth=0,.state=active},ptr=&val;
    - jump from 13 to 9 would be postponed with state
      it{.depth=0,.state=active},ptr=0xdead as proceeding would increase 'it' depth;
- jump from 11 to 15 (branch B1):
  - 15:
    - checkpoint would be created with it{.depth=0,.state=active};
    - jump from 15 to 17 would be postponed with state
      it{.depth=0,.state=active} as proceeding would increase 'it' depth.
- at this point verifier would run out of paths that don't increase
  iterators depth and there are two choices:
  - (a) jump from 13 to 9 with state it{.depth=0,.state=active},ptr=0xdead;
  - (b) jump from 15 to 17 with state it{.depth=0,.state=active},ptr=&val.
  If (a) were processed first, there would be no read mark for
  'ptr' in C1 yet, and the verifier would run into the same issue as
  with the original example.

Am I missing something, or is this a legitimate counter-example?

> I can't say I understand what exactly you are proposing and how you
> are going to determine these conservative precision marks. But I'd
> like to learn some new ideas, so please elaborate :)

I'll send a follow-up email, trying to figure out what to do with pointers.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-22 18:36                                 ` Eduard Zingerman
@ 2023-09-22 20:52                                   ` Andrii Nakryiko
  2023-09-25  1:01                                     ` Eduard Zingerman
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2023-09-22 20:52 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Fri, Sep 22, 2023 at 11:36 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2023-09-21 at 19:48 -0700, Andrii Nakryiko wrote:
> > Yes, my gut feeling was that if this idea works at all, then ordering
> > for this won't matter. The question is if the idea itself is sound.
> > Basically, I need to convince myself that subsequent iterations won't
> > add any new read/precise marks. You are good at counter examples, so
> > maybe you can come up with an example where input state into iteration
> > #1 will get more precision marks only at iteration #2 or deeper. If
> > that's the case, then this whole idea of postponing loop convergence
> > checks probably doesn't work.
>
> Consider the following code:
>
>     1.  SEC("fentry/" SYS_PREFIX "sys_nanosleep")
>     2.  int num_iter_bug(const void *ctx) {
>     3.      struct bpf_iter_num it;
>     4.      __u64 val = 0;
>     5.      __u64 *ptr = &val;
>     6.      __u64 rnd = bpf_get_current_pid_tgid();
>     7.
>     8.      bpf_iter_num_new(&it, 0, 10);
>     9.      while (bpf_iter_num_next(&it)) {
>     10.          rnd++;
>     11.          if (rnd == 42) {
>     12.              ptr = (void*)(0xdead);
>     13.              continue;
>     14.          }
>     15.          if (!bpf_iter_num_next(&it))
>     16.              break;
>     17.          bpf_probe_read_user(ptr, 8, (void*)(0xdeadbeef));
>     18.      }
>     19.      bpf_iter_num_destroy(&it);
>     20.      return 0;
>     21. }
>
> As far as I understand the following verification paths would be
> explored (ignoring traces with drained iterators):
> - entry:
>   - 8,9,10,11,12,13:
>     - at 9 checkpoint (C1) would be created with it{.depth=0,.state=active},ptr=&val;
>     - at 11 branch (B1) would be created with it{.depth=0,.state=active},ptr=&val;
>     - jump from 13 to 9 would be postponed with state
>       it{.depth=0,.state=active},ptr=0xdead as proceeding would increase 'it' depth;
> - jump from 11 to 15 (branch B1):
>   - 15:
>     - checkpoint would be created with it{.depth=0,.state=active};
>     - jump from 15 to 17 would be postponed with state
>       it{.depth=0,.state=active} as proceeding would increase 'it' depth.
> - at this point verifier would run out of paths that don't increase
>   iterators depth and there are two choices:
>   - (a) jump from 13 to 9 with state it{.depth=0,.state=active},ptr=0xdead;
>   - (b) jump from 15 to 17 with state it{.depth=0,.state=active},ptr=&val.
>   If (a) would be processed first there would be no read mark for
>   'ptr' in C1 yet and verifier runs into to the same issue as with
>   original example.
>
> Do I miss something or is it a legit counter-example?
>

I think it is, yes. With the use of

if (!bpf_iter_num_next(&it))
    break;

you can postpone any important markings as deep into iteration
verification as necessary. So yeah, the idea to explore code paths
level by level won't cover all the cases, unfortunately.

> > I can't say I understand what exactly you are proposing and how you
> > are going to determine these conservative precision marks. But I'd
> > like to learn some new ideas, so please elaborate :)
>
> I'll send a follow-up email, trying to figure out what to do with pointers.

Ok, sounds good.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-22 20:52                                   ` Andrii Nakryiko
@ 2023-09-25  1:01                                     ` Eduard Zingerman
  2023-09-26  0:33                                       ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-25  1:01 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Fri, 2023-09-22 at 13:52 -0700, Andrii Nakryiko wrote:
> > > I can't say I understand what exactly you are proposing and how you
> > > are going to determine these conservative precision marks. But I'd
> > > like to learn some new ideas, so please elaborate :)
> > 
> > I'll send a follow-up email, trying to figure out what to do with pointers.
> 
> Ok, sounds good.

(below explains why DFA was a bad idea, please proceed to the next
 section if that's not interesting).

My initial thought was that it would be possible to use a simple
live-variable DFA in order to compute conservative read and precise
marks. Like a textbook version:
- split code in basic blocks;
- for each instruction propagate liveness information backwards:
  - define USE(insn) to be a set of registers / stack slots read by
    this instruction;
  - define KILL(insn) to be a set of registers / stack slots written
    to by this instruction.
  - compute the set of live variables before insn as:
    LIVE_in(insn) = USE(insn) U (LIVE_out(insn) \ KILL(insn))
- compute basic block BB_out as a union of BB_in of each successor;
- etc... iterate in postorder until fixed point is reached.

However, this is a moot point because of the presence of pointers.
For a series of instructions like:
  "r1 = r10; r1 += -8; r1 = *(u64 *)(r1 + 0);"
Such an algorithm would need to know the possible range of values for r1,
meaning that before executing the liveness DFA some sort of value range
analysis would have to be run.
And these things might get complicated [1]
(I never implemented one, maybe it's not that bad).
So, basically that would be a mini-copy of the verifier but
implemented as a thing that computes fixed point instead of "tracing".
Probably a no-go.

[1] https://boxbase.org/entries/2017/oct/23/range_analysis_papers/

---

However, I think there are two more relatively simple options that
should give correct results.

Option A. "Exit or Loop"
------------------------

- Similar to Andrii's suggestion postpone exploration of
  bpf_iter_*_next() #N->#N+1 states for as long as there are other
  states to explore.
- When evaluating is_state_visited() for bpf_iter_*_next() #N->#N+1
  allow to use visited state with .branches > 0 to prune current
  state if:
  - states_equal() for visited and current states returns true;
  - all verification paths starting from current instruction index with
    current iterator ref_obj_id either run to safe exit or run back to
    current instruction index with current iterator ref_obj_id.

The last rule is needed to:
- Ensure presence of read and precision marks in a state used for pruning.
- Give a meaning to the precise scalars comparisons in regsafe() which
  I think is currently missing for bpf_iter_*_next() .branches > 0 case.
  Precise scalars comparison for visited states with .branches == 0
  could be read as: "using any value from a specific range it is
  possible to reach safe exit".
  Precise scalars comparison for looping states from above
  could be read as: "using any value from a specific range it is
  possible to reach either safe exit or get back to this instruction".

This algorithm should be able to handle simple iteration patterns like:

    bpf_iter_*_new(&it1, ...)
    while (bpf_iter_*_next(&it1)) { ... }
    bpf_iter_*_destroy(&it1)

And also handle nested iteration:

    bpf_iter_*_new(&it1, ...)
    while (bpf_iter_*_next(&it1)) {
      bpf_iter_*_new(&it2, ...)
      while (bpf_iter_*_next(&it2)) { ... }
      bpf_iter_*_destroy(&it2)
    }
    bpf_iter_*_destroy(&it1)

But it would fail to handle more complex patterns, e.g. interleaved
iteration:

    for (;;) {
      if (!bpf_iter_*_next(&it1))  // a
        break;
      if (!bpf_iter_*_next(&it2))  // b
        break;
      ...
    }

For any state originating in (a) there would always be a state in (b)
pending verification and vice versa. It would actually bail out even
if it1 == it2.

Option B. "Widening"
--------------------

The trivial fix for current .branches > 0 states comparison is to
force "exact" states comparisons for is_iter_next_insn() case:
1. Ignore liveness marks, as those are not finalized yet;
2. Ignore precision marks, as those are not finalized yet;
3. Use regs_exact() for scalars comparison.

This, however, is very restrictive as it fails to verify trivial
programs like (iters_looping/simplest_loop):

    sum = 0;
    bpf_iter_num_new(&it, 0, 10);
    while ((r0 = bpf_iter_num_next(&it)))
      sum += *(u32 *)r0;
    bpf_iter_num_destroy(&it);

Here, the ever-increasing bounds for "sum" prevent regs_exact() from
ever returning true.

One way to lift this limitation is to borrow something like the
"widening" trick from the abstract interpretation papers mentioned
earlier:
- suppose there is current state C and a visited is_iter_next_insn()
  state V with .branches > 0;
- suppose states_equal(V, C) returns true but exact states comparison
  returns false because there are scalar values not marked as precise
  that differ between V and C.
- copy C as C' and mark these scalars as __mark_reg_unbounded() in C';
- run verification for C':
  - if verification succeeds -- problem solved;
  - if verification fails -- discard C' and proceed from C.

Such an algorithm should succeed for programs that don't use widened
values in a "precise" context.

Looking at the test cases failing with the trivial fix, I think such
limitations would be similar to those already present (e.g. the big
comment in progs/iters.c:iter_obfuscate_counter would still apply).

---

Option B appears to be simpler to implement so I'm proceeding with it.
Will share progress on Monday. Please let me know if there are any
obvious flaws with what I shared above.

Thanks,
Eduard


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-25  1:01                                     ` Eduard Zingerman
@ 2023-09-26  0:33                                       ` Andrii Nakryiko
  2023-09-26 15:55                                         ` Eduard Zingerman
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2023-09-26  0:33 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Sun, Sep 24, 2023 at 6:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2023-09-22 at 13:52 -0700, Andrii Nakryiko wrote:
> > > > I can't say I understand what exactly you are proposing and how you
> > > > are going to determine these conservative precision marks. But I'd
> > > > like to learn some new ideas, so please elaborate :)
> > >
> > > I'll send a follow-up email, trying to figure out what to do with pointers.
> >
> > Ok, sounds good.
>
> (below explains why DFA was a bad idea, please proceed to the next
>  section if that's not interesting).

ok, that's what I'm doing :)

>
> My initial thought was that it would be possible to use a simple
> live-variable DFA in order to compute conservative read and precise
> marks. Like a textbook version:
> - split code in basic blocks;
> - for each instruction propagate liveness information backwards:
>   - define USE(insn) to be a set of registers / stack slots read by
>     this instruction;
>   - define KILL(insn) to be a set of registers / stack slots written
>     to by this instruction.
>   - compute set of live variables before insn as:
>     LIVE_in(insn) = USE(insn) U (LIVE_out(insn) / KILL(insn))
> - compute basic block BB_out as a union of BB_in of each successor;
> - etc... iterate in postorder until fixed point is reached.
>
> However, this is a moot point because of presence of pointers.
> For series of instructions like:
>   "r1 = r10; r1 += -8; r1 = *(u64 *)(r1 + 0);"
> Such algorithm would need to know possible range of values for r1.
> Meaning that before executing liveness DFA some sort of value range
> analysis would have to be run.
> And these things might get complicated [1]
> (I never implemented one, maybe it's not that bad).
> So, basically that would be a mini-copy of the verifier but
> implemented as a thing that computes fixed point instead of "tracing".
> Probably a no-go.
>
> [1] https://boxbase.org/entries/2017/oct/23/range_analysis_papers/
>
> ---
>
> However, I think there are two more relatively simple options that
> should give correct results.
>
> Option A. "Exit or Loop"
> ------------------------
>
> - Similar to Andrii's suggestion postpone exploration of
>   bpf_iter_*_next() #N->#N+1 states for as long as there are other
>   states to explore.
> - When evaluating is_state_visited() for bpf_iter_*_next() #N->#N+1
>   allow to use visited state with .branches > 0 to prune current
>   state if:
>   - states_equal() for visited and current states returns true;
>   - all verfication paths starting from current instruction index with
>     current iterator ref_obj_id either run to safe exit or run back to
>     current instruction index with current iterator ref_obj_id.
>
> The last rule is needed to:
> - Ensure presence of read and precision marks in a state used for pruning.
> - Give a meaning to the precise scalars comparisons in regsafe() which
>   I think is currently missing for bpf_iter_*_next() .branches > 0 case.
>   Precise scalars comparison for visited states with .branches == 0
>   could be read as: "using any value from a specific range it is
>   possible to reach safe exit".
>   Precise scalars comparison for looping states from above
>   could be read as: "using any value from a specific range it is
>   possible to reach either safe exit or get back to this instruction".
>
> This algorithm should be able to handle simple iteration patterns like:
>
>     bpf_iter_*_new(&it1, ...)
>     while (bpf_iter_*_next(&it1)) { ... }
>     bpf_iter_*_destroy(&it1)
>
> And also handle nested iteration:
>
>     bpf_iter_*_new(&it1, ...)
>     while (bpf_iter_*_next(&it1)) {
>       bpf_iter_*_new(&it2, ...)
>       while (bpf_iter_*_next(&it2)) { ... }
>       bpf_iter_*_destroy(&it2)
>     }
>     bpf_iter_*_destroy(&it1)
>
> But it would fail to handle more complex patterns, e.g. interleaved
> iteration:
>
>     for (;;) {
>       if (!bpf_iter_*_next(&it1))  // a
>         break;
>       if (!bpf_iter_*_next(&it2))  // b
>         break;
>       ...
>     }
>
> For any state originating in (a) there would always be a state in (b)
> pending verification and vice versa. It would actually bail out even
> if it1 == it2.

Not working for this intermixed iterator case would be ok, I think;
most practical use cases would be properly nested loops.

But the compiler can actually do something like this even for a proper
loop. E.g., something like the loop below

for (; bpf_iter_num_next(&it); ) {
   ....
}

in assembly could be laid out as


bpf_iter_num_next(&it);
...
again:
r0 = bpf_iter_num_next(&it)
...
if (r0) goto again


Or something along those lines. So the assumption that the
iterator's next() call always happens at the same instruction is
problematic.



>
> Option B. "Widening"
> --------------------
>
> The trivial fix for current .branches > 0 states comparison is to
> force "exact" states comparisons for is_iter_next_insn() case:
> 1. Ignore liveness marks, as those are not finalized yet;
> 2. Ignore precision marks, as those are not finalized yet;
> 3. Use regs_exact() for scalars comparison.
>
> This, however, is very restrictive as it fails to verify trivial
> programs like (iters_looping/simplest_loop):
>
>     sum = 0;
>     bpf_iter_num_new(&it, 0, 10);
>     while ((r0 = bpf_iter_num_next(&it)))
>       sum += *(u32 *)r0;
>     bpf_iter_num_destroy(&it);
>
> Here the ever-increasing bounds for "sum" prevent regs_exact() from ever
> returning true.
>
> One way to lift this limitation is to borrow something like the
> "widening" trick from the abstract interpretation papers mentioned
> earlier:
> - suppose there is current state C and a visited is_iter_next_insn()
>   state V with .branches > 0;
> - suppose states_equal(V, C) returns true but exact states comparison
>   returns false because there are scalar values not marked as precise
>   that differ between V and C.
> - copy C as C' and mark these scalars as __mark_reg_unbounded() in C';
> - run verification for C':
>   - if verification succeeds -- problem solved;
>   - if verification fails -- discard C' and proceed from C.
>
> Such algorithm should succeed for programs that don't use widened
> values in "precise" context.
>
> Looking at testcases failing with trivial fix I think that such
> limitations would be similar to those already present (e.g. the big
> comment in progs/iters.c:iter_obfuscate_counter would still apply).
>

This makes sense. I was originally thinking along those lines (and
rejected it for myself), but in a more eager (and thus restrictive)
way: for any scalar register where old and new register states are
equivalent thanks to read mark or precision bit (or rather lack of
precision bit), force that precision/mark in old state. But that is
too pessimistic for cases where we truly needed to simulate N+1 due to
state differences, while keeping the old state intact.

What you propose with temporary C -> C' seems to be along similar
lines, but will give "a second chance" if something doesn't work out:
we'll go on N+1 instead of just failing.


But let's think about this case. Let's say in V R1 is set to 7, but
imprecise. In C we have R1 as 14, also imprecise. According to your
algorithm, we'll create C' with R1 set to <any num>. Now I have two
questions:

1. How do we terminate C' validation? What happens when we get back to
the next() call again and do states_equal() between V and C'? We just
assume it's correct because C' has R1 as unbounded?

2. Assuming yes. What if later on, V's R1 is actually forced to be
precise, so only if V's R1=P7, then it is safe. But we already
concluded that C' is safe based on R1 being unbounded, right? Isn't
that a violation?


> ---
>
> Option B appears to be simpler to implement so I'm proceeding with it.
> Will share progress on Monday. Please let me know if there are any
> obvious flaws with what I shared above.
>
> Thanks,
> Eduard

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-26  0:33                                       ` Andrii Nakryiko
@ 2023-09-26 15:55                                         ` Eduard Zingerman
  2023-09-26 16:25                                           ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-26 15:55 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Mon, 2023-09-25 at 17:33 -0700, Andrii Nakryiko wrote:
[...]
> not working for this intermixed iterator case would be ok, I think
> most practical use cases would be a properly nested loops.
> 
> But, compiler can actually do something like even for a proper loop.
> E.g., something like below
> 
> for (; bpf_iter_num_next(&it); ) {
>    ....
> }
> 
> in assembly could be laid out as
> 
> 
> bpf_iter_num_next(&it);
> ...
> again:
> r0 = bpf_iter_num_next(&it)
> ...
> if (r0) goto again
> 
> 
> Or something along those lines. So these assumptions that the
> iterator's next() call is happening at the same instruction is
> problematic.

For the specific example above I think it would not be an issue
because there is still only one next() call in the loop =>
there would be no stalled next() transition states.

I checked the disassembly for progs/iters.c and it appears that for
each program the compiler preserves the nested loop structure such that:
- each loop is driven by a single next() call of a dedicated iterator;
- each nested loop exit goes through destroy() for the corresponding
  iterator, meaning that the outer loop's next() will never see the
  inner loop's iterator as active => no stalled states due to inner
  loop processing.

Of course, I'm not a robot and might have missed something that would
break a real implementation.
 
> > Option B. "Widening"
> > --------------------
> > 
> > The trivial fix for current .branches > 0 states comparison is to
> > force "exact" states comparisons for is_iter_next_insn() case:
> > 1. Ignore liveness marks, as those are not finalized yet;
> > 2. Ignore precision marks, as those are not finalized yet;
> > 3. Use regs_exact() for scalars comparison.
> > 
> > This, however, is very restrictive as it fails to verify trivial
> > programs like (iters_looping/simplest_loop):
> > 
> >     sum = 0;
> >     bpf_iter_num_new(&it, 0, 10);
> >     while ((r0 = bpf_iter_num_next(&it)))
> >       sum += *(u32 *)r0;
> >     bpf_iter_num_destroy(&it);
> > 
> > Here ever increasing bounds for "sum" prevent regs_exact() from ever
> > returning true.
> > 
> > One way to lift this limitation is to borrow something like the
> > "widening" trick from the abstract interpretation papers mentioned
> > earlier:
> > - suppose there is current state C and a visited is_iter_next_insn()
> >   state V with .branches > 0;
> > - suppose states_equal(V, C) returns true but exact states comparison
> >   returns false because there are scalar values not marked as precise
> >   that differ between V and C.
> > - copy C as C' and mark these scalars as __mark_reg_unbounded() in C';
> > - run verification for C':
> >   - if verification succeeds -- problem solved;
> >   - if verification fails -- discard C' and proceed from C.
> > 
> > Such algorithm should succeed for programs that don't use widened
> > values in "precise" context.
> > 
> > Looking at testcases failing with trivial fix I think that such
> > limitations would be similar to those already present (e.g. the big
> > comment in progs/iters.c:iter_obfuscate_counter would still apply).
> > 
> 
> This makes sense. I was originally thinking along those lines (and
> rejected it for myself), but in a more eager (and thus restrictive)
> way: for any scalar register where old and new register states are
> equivalent thanks to read mark or precision bit (or rather lack of
> precision bit), force that precision/mark in old state. But that is
> too pessimistic for cases where we truly needed to simulate N+1 due to
> state differences, while keeping the old state intact.

In other words there is a function states_equal' for comparison of
states when old{.branches > 0}, which differs from states_equal in
the following way:
- considers all registers read;
- considers all scalars precise.

> What you propose with temporary C -> C' seems to be along similar
> lines, but will give "a second chance" if something doesn't work out:
> we'll go on N+1 instead of just failing.

Right.

> But let's think about this case. Let's say in V R1 is set to 7, but
> imprecise. In C we have R1 as 14, also imprecise. According to your
> algorithm, we'll create C1 with R1 set to <any num>. Now I have two
> questions:
> 
> 1. How do we terminate C' validation? What happens when we get back to
> the next() call again and do states_equal() between V and C'? We just
> assume it's correct because C' has R1 as unbounded?

By the definition above, states_equal'(V, C') is false, but because we
are at a checkpoint on the next iteration we would get to
states_equal'(C', C''), where C'' is derived from C', and the same
rules would apply. If we are lucky nothing would change and there would
be no point in scheduling another traversal.

> 2. Assuming yes. What if later on, V's R1 is actually forced to be
> precise, so only if V's R1=P7, then it is safe. But we already
> concluded that C' is safe based on R1 being unbounded, right? Isn't
> that a violation?

My base assumption is that:
1. any instruction reachable from V should also be reachable from C';
2. and that it is also guaranteed to be visited from C'.
I cannot give a formal reasoning for (2) at the moment.

If this assumption holds, then when verification proceeds from C'
it would eventually get to instruction for which R1 value is only safe
when R1=7, if so:
- verification of C' conjecture would fail;
- all states derived from C' would need to be removed from
  states stack / explored states;
- verification should proceed from C.
(And there is also a question of delaying read/precision propagation
 from unproven conjecture states to regular states).

(Also I'm still not sure whether to use regs_exact() for scalars
 comparison instead of precision logic in states_equal').


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-26 15:55                                         ` Eduard Zingerman
@ 2023-09-26 16:25                                           ` Andrii Nakryiko
  2023-09-28  1:09                                             ` Eduard Zingerman
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2023-09-26 16:25 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Tue, Sep 26, 2023 at 8:55 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2023-09-25 at 17:33 -0700, Andrii Nakryiko wrote:
> [...]
> > not working for this intermixed iterator case would be ok, I think
> > most practical use cases would be a properly nested loops.
> >
> > But, compiler can actually do something like even for a proper loop.
> > E.g., something like below
> >
> > for (; bpf_iter_num_next(&it); ) {
> >    ....
> > }
> >
> > in assembly could be laid out as
> >
> >
> > bpf_iter_num_next(&it);
> > ...
> > again:
> > r0 = bpf_iter_num_next(&it)
> > ...
> > if (r0) goto again
> >
> >
> > Or something along those lines. So these assumptions that the
> > iterator's next() call is happening at the same instruction is
> > problematic.
>
> For the specific example above I think it would not be an issue
> because there is still only one next() call in the loop =>
> there would be no stalled next() transition states.
>
> I checked disassembly for progs/iters.c and it appears that for each
> program there compiler preserves nested loops structure such that:
> - each loop is driven by a single next() call of a dedicated iterator;
> - each nested loop exit goes through destroy() for corresponding
>   iterator, meaning that outer loop's next will never see inner's loop
>   iterator as active => no stalled states due to inner loop
>   processing.
>
> Of-course, I'm not a robot and might have missed something that would
> break real implementation.
>
> > > Option B. "Widening"
> > > --------------------
> > >
> > > The trivial fix for current .branches > 0 states comparison is to
> > > force "exact" states comparisons for is_iter_next_insn() case:
> > > 1. Ignore liveness marks, as those are not finalized yet;
> > > 2. Ignore precision marks, as those are not finalized yet;
> > > 3. Use regs_exact() for scalars comparison.
> > >
> > > This, however, is very restrictive as it fails to verify trivial
> > > programs like (iters_looping/simplest_loop):
> > >
> > >     sum = 0;
> > >     bpf_iter_num_new(&it, 0, 10);
> > >     while ((r0 = bpf_iter_num_next(&it)))
> > >       sum += *(u32 *)r0;
> > >     bpf_iter_num_destroy(&it);
> > >
> > > Here ever increasing bounds for "sum" prevent regs_exact() from ever
> > > returning true.
> > >
> > > One way to lift this limitation is to borrow something like the
> > > "widening" trick from the abstract interpretation papers mentioned
> > > earlier:
> > > - suppose there is current state C and a visited is_iter_next_insn()
> > >   state V with .branches > 0;
> > > - suppose states_equal(V, C) returns true but exact states comparison
> > >   returns false because there are scalar values not marked as precise
> > >   that differ between V and C.
> > > - copy C as C' and mark these scalars as __mark_reg_unbounded() in C';
> > > - run verification for C':
> > >   - if verification succeeds -- problem solved;
> > >   - if verification fails -- discard C' and proceed from C.
> > >
> > > Such algorithm should succeed for programs that don't use widened
> > > values in "precise" context.
> > >
> > > Looking at testcases failing with trivial fix I think that such
> > > limitations would be similar to those already present (e.g. the big
> > > comment in progs/iters.c:iter_obfuscate_counter would still apply).
> > >
> >
> > This makes sense. I was originally thinking along those lines (and
> > rejected it for myself), but in a more eager (and thus restrictive)
> > way: for any scalar register where old and new register states are
> > equivalent thanks to read mark or precision bit (or rather lack of
> > precision bit), force that precision/mark in old state. But that is
> > too pessimistic for cases where we truly needed to simulate N+1 due to
> > state differences, while keeping the old state intact.
>
> In other words there is a function states_equal' for comparison of
> states when old{.branches > 0}, which differs from states_equal in
> the following way:
> - considers all registers read;
> - considers all scalars precise.
>

Not really. The important aspect is to mark registers that were
required to be imprecise in old state as "required to be imprecise",
and if later we decide that this register has to be precise, too bad,
too late, game over (which is why I didn't propose it, this seems too
restrictive).

> > What you propose with temporary C -> C' seems to be along similar
> > lines, but will give "a second chance" if something doesn't work out:
> > we'll go on N+1 instead of just failing.
>
> Right.
>
> > But let's think about this case. Let's say in V R1 is set to 7, but
> > imprecise. In C we have R1 as 14, also imprecise. According to your
> > algorithm, we'll create C1 with R1 set to <any num>. Now I have two
> > questions:
> >
> > 1. How do we terminate C' validation? What happens when we get back to
> > the next() call again and do states_equal() between V and C'? We just
> > assume it's correct because C' has R1 as unbounded?
>
> For the definition above states_equal'(V, C') is false, but because we
> are at checkpoint on the next iteration we would get to
> states_equal'(C', C'') where C'' is derived from C' and same rules
> would apply. If we are lucky nothing would change and there would no
> point in scheduling another traversal.

Ah, I see, "fixed point" state convergence is the hope, ok.

>
> > 2. Assuming yes. What if later on, V's R1 is actually forced to be
> > precise, so only if V's R1=P7, then it is safe. But we already
> > concluded that C' is safe based on R1 being unbounded, right? Isn't
> > that a violation?
>
> My base assumption is that:
> 1. any instruction reachable from V should also be reachable from C';
> 2. and that it is also guaranteed to be visited from C'.
> I cannot give a formal reasoning for (2) at the moment.
>
> In case if this assumption holds, when verification proceeds from C'
> it would eventually get to instruction for which R1 value is only safe
> when R1=7, if so:
> - verification of C' conjecture would fail;
> - all states derived from C' would need to be removed from
>   states stack / explored states;
> - verification should proceed from C.
> (And there is also a question of delaying read/precision propagation
>  from unproven conjecture states to regular states).
>
> (Also I'm still not sure whether to use regs_exact() for scalars
>  comparison instead of precision logic in states_equal').

Yeah, it's getting complicated, eh? :)

But aside from this approach, I was thinking back to my proposal to
combine BFS and DFS approaches. Let's give it another look. Quoting
from my earlier email:


The idea is to let verifier explore all code paths starting from
iteration #N, except the code paths that lead to looping into
iteration #N+1. I tried to do that by validating the NULL case first and
exiting from the loop on each iteration, but clearly that doesn't
capture all the cases, as Eduard has shown.

So what if we delay convergence state checks (and then further
exploration at iteration #N+1) using BFS instead of DFS? That is, when
we are about to start new iteration and check state convergence, we
enqueue current state to be processed later after all the states that
"belong" to iteration #N are processed.


I still think that the whole "let's explore all states except the
looping back ones" idea is the right direction. But the above is a bit
imprecise about the "iteration #N and iteration #N+1" parts. We
interpreted it as: when we get back to *any* next() call for a given
iterator, we postpone the checks.

And you've shown with your counter examples how we actually don't
explore all possible code paths. Let's see if we can actually fix that
problem and make sure that we do explore everything that is
terminatable. How about the following clarifications/refinements to
the above:

1. If V and C (using your terminology from earlier, where V is the old
parent state at some next() call instruction, and C is the current one
on the same instruction) are different -- we just keep going. So
always try to explore different input states for the loop.

2. But if V and C are equivalent, it's too early to conclude anything.
So enqueue C for later in a separate BFS queue (and perhaps that queue
is per-instruction, actually; or maybe even per-state, not sure), and
keep exploring all the other pending queues from the (global) DFS
queue, until we get back to state V again. At that point we need to
start looking at postponed states for that V state. But this time we
should be sure that precision and read marks are propagated from all
those terminatable code paths.

Basically, this tries to make sure that we do mark every register that
is important for all the branching decision making, memory
dereferences, etc. And just avoids going into endless loops with the
same input conditions.

Give it some fresh thought and let's see if we are missing something
again. Thanks!


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-26 16:25                                           ` Andrii Nakryiko
@ 2023-09-28  1:09                                             ` Eduard Zingerman
  2023-09-28 18:30                                               ` Andrii Nakryiko
  2023-09-30  0:41                                               ` Alexei Starovoitov
  0 siblings, 2 replies; 52+ messages in thread
From: Eduard Zingerman @ 2023-09-28  1:09 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Tue, 2023-09-26 at 09:25 -0700, Andrii Nakryiko wrote:
[...]
> > In other words there is a function states_equal' for comparison of
> > states when old{.branches > 0}, which differs from states_equal in
> > the following way:
> > - considers all registers read;
> > - considers all scalars precise.
> >
> 
> Not really. The important aspect is to mark registers that were
> required to be imprecise in old state as "required to be imprecise",
> and if later we decide that this register has to be precise, too bad,
> too late, game over (which is why I didn't propose it, this seems too
> restrictive).

Could you please elaborate a bit? What's wrong with the following:
Suppose I see a register R that differs between V and C and is not
precise in both. I fork C as C', mark R unbound in C' and proceed with
C' verification. At some point during that verification I see that
some precise R's value is necessary, thus C' verification fails.
If that happens verification resumes from C, otherwise C is discarded.
I also postpone read and precision marks propagation from C' to its
parent until C' verification succeeds (if it succeeds).

[...]
> 1. If V and C (using your terminology from earlier, where V is the old
> parent state at some next() call instruction, and C is the current one
> on the same instruction) are different -- we just keep going. So
> always try to explore different input states for the loop.
> 
> 2. But if V and C are equivalent, it's too early to conclude anything.
> So enqueue C for later in a separate BFS queue (and perhaps that queue
> is per-instruction, actually; or maybe even per-state, not sure), and
> keep exploring all the other pending queues from the (global) DFS
> queue, until we get back to state V again. At that point we need to
> start looking at postponed states for that V state. But this time we
> should be sure that precision and read marks are propagated from all
> those terminatable code paths.
> 
> Basically, this tries to make sure that we do mark every register that
> is important for all the branching decision making, memory
> dereferences, etc. And just avoids going into endless loops with the
> same input conditions.
> 
> Give it some fresh thought and let's see if we are missing something
> again. Thanks!

This should work for the examples we've seen so far.
Why do you think a separate per-instruction queue is necessary?
The way I read it the following algorithm should suffice:
- add a field bpf_verifier_env::iter_head similar to 'head' but for
  postponed looping states;
- add functions push_iter_stack(), pop_iter_stack() similar to
  push_stack() and pop_stack();
- modify is_state_visited() as follows:

 static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 {
     ...
     while (sl) {
         ...
         if (sl->state.branches) {
             ...
             if (is_iter_next_insn(env, insn_idx)) {
                 if (states_equal(env, &sl->state, cur)) {
                     ...
                     iter_state = &func(env, iter_reg)->stack[spi].spilled_ptr;
                     if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE) {
+                        // Don't want to proceed with 'cur' verification,
+                        // push it to iters queue to check again if states
+                        // are still equal after env->head is exhausted.
+                        if (env->stack_size != 0)
+                            push_iter_stack(env, cur, ...);
                         goto hit;
                     }
                 }
                 goto skip_inf_loop_check;
             }
     ...
 }
 
- modify do_check() to do pop_iter_stack() if pop_stack() is
  exhausted; the popped state would get into is_state_visited() and be
  checked against the old state, which at that moment should have all
  the read/precision marks that env->head could have provided.

After working on "widening conjectures" implementation a bit this
approach seems to be much simpler. Need to think harder if I can break it.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-28  1:09                                             ` Eduard Zingerman
@ 2023-09-28 18:30                                               ` Andrii Nakryiko
  2023-10-02  3:26                                                 ` Eduard Zingerman
  2023-09-30  0:41                                               ` Alexei Starovoitov
  1 sibling, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2023-09-28 18:30 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Wed, Sep 27, 2023 at 6:09 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2023-09-26 at 09:25 -0700, Andrii Nakryiko wrote:
> [...]
> > > In other words there is a function states_equal' for comparison of
> > > states when old{.branches > 0}, which differs from states_equal in
> > > the following way:
> > > - considers all registers read;
> > > - considers all scalars precise.
> > >
> >
> > Not really. The important aspect is to mark registers that were
> > required to be imprecise in old state as "required to be imprecise",
> > and if later we decide that this register has to be precise, too bad,
> > too late, game over (which is why I didn't propose it, this seems too
> > restrictive).
>
> Could you please elaborate a bit? What's wrong with the following:
> Suppose I see a register R that differs between V and C and is not
> precise in both. I fork C as C', mark R unbound in C' and proceed with
> C' verification. At some point during that verification I see that
> some precise R's value is necessary, thus C' verification fails.
> If that happens verification resumes from C, otherwise C is discarded.
> I also postpone read and precision marks propagation from C' to its
> parent until C' verification succeeds (if it succeeds).

Nothing wrong, I'm just saying your C' derivation gives a chance to
undo things, while my eager idea wouldn't allow that. And that's
bad, so I didn't propose it.

Aside from that, I'd be curious to see how to prevent read/precision
mark propagation to V; that seems like yet another hack and special
case, i.e. yet another complication.

>
> [...]
> > 1. If V and C (using your terminology from earlier, where V is the old
> > parent state at some next() call instruction, and C is the current one
> > on the same instruction) are different -- we just keep going. So
> > always try to explore different input states for the loop.
> >
> > 2. But if V and C are equivalent, it's too early to conclude anything.
> > So enqueue C for later in a separate BFS queue (and perhaps that queue
> > is per-instruction, actually; or maybe even per-state, not sure), and
> > keep exploring all the other pending queues from the (global) DFS
> > queue, until we get back to state V again. At that point we need to
> > start looking at postponed states for that V state. But this time we
> > should be sure that precision and read marks are propagated from all
> > those terminatable code paths.
> >
> > Basically, this tries to make sure that we do mark every register that
> > is important for all the branching decision making, memory
> > dereferences, etc. And just avoids going into endless loops with the
> > same input conditions.
> >
> > Give it some fresh thought and let's see if we are missing something
> > again. Thanks!
>
> This should work for examples we've seen so far.
> Why do you think a separate per-instruction queue is necessary?

I'm not positive, it was just a possibility that it might matter. I'd
try with a global queue first and try to break the approach.

> The way I read it the following algorithm should suffice:
> - add a field bpf_verifier_env::iter_head similar to 'head' but for
>   postponed looping states;
> - add functions push_iter_stack(), pop_iter_stack() similar to
>   push_stack() and pop_stack();

I don't like the suggested naming, it's too iter-centric, and it's
actually a queue, not a stack, etc. But that's something to figure out
later.

> - modify is_state_visited() as follows:
>
>  static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>  {
>      ...
>      while (sl) {
>          ...
>          if (sl->state.branches) {
>              ...
>              if (is_iter_next_insn(env, insn_idx)) {
>                  if (states_equal(env, &sl->state, cur)) {
>                      ...
>                      iter_state = &func(env, iter_reg)->stack[spi].spilled_ptr;
>                      if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE) {
> +                        // Don't want to proceed with 'cur' verification,
> +                        // push it to iters queue to check again if states
> +                        // are still equal after env->head is exhausted.
> +                        if (env->stack_size != 0)
> +                            push_iter_stack(env, cur, ...);
>                          goto hit;
>                      }
>                  }
>                  goto skip_inf_loop_check;
>              }
>      ...
>  }
>
> - modify do_check() to do pop_iter_stack() if pop_stack() is
>   exhausted, the popped state would get into is_state_visited() and
>   checked against old state, which at that moment should have all
>   read/precision masks that env->head could have provided.

Yeah, something like that.

>
> After working on "widening conjectures" implementation a bit this
> approach seems to be much simpler. Need to think harder if I can break it.

yeah, I think this C' derivation, while conceptually straightforward,
probably has tons of non-obvious complications.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-28  1:09                                             ` Eduard Zingerman
  2023-09-28 18:30                                               ` Andrii Nakryiko
@ 2023-09-30  0:41                                               ` Alexei Starovoitov
  2023-10-02  1:40                                                 ` Eduard Zingerman
  1 sibling, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2023-09-30  0:41 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Wed, Sep 27, 2023 at 6:09 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
>  static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>  {
>      ...
>      while (sl) {
>          ...
>          if (sl->state.branches) {
>              ...
>              if (is_iter_next_insn(env, insn_idx)) {
>                  if (states_equal(env, &sl->state, cur)) {
>                      ...
>                      iter_state = &func(env, iter_reg)->stack[spi].spilled_ptr;
>                      if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE) {
> +                        // Don't want to proceed with 'cur' verification,
> +                        // push it to iters queue to check again if states
> +                        // are still equal after env->head is exhausted.
> +                        if (env->stack_size != 0)
> +                            push_iter_stack(env, cur, ...);
>                          goto hit;

I suspect that neither option A "Exit or Loop" nor option B "widening"
is correct.
In both cases we will do states_equal() with states that have
not reached exit and don't have live/precision marks.

The key aspect of the verifier state pruning is that safe states
in sl list explored all possible paths.
Say, there is an 'if' block after iter_destroy.
The push_stack/pop_stack logic is done as a stack to make sure
that the last 'if' has to be explored before earlier 'if' blocks are checked.
The existing bpf iter logic violated that.
To fix that we need to logically do:

if (is_iter_next_insn(env, insn_idx))
  if (states_equal(env, &sl->state, cur)) {
   if (iter_state->iter.state == BPF_ITER_STATE_DRAINED)
     goto hit;

instead of BPF_ITER_STATE_ACTIVE.
In other words we need to process loop iter == 0 case first
all the way till bpf_exit (just like we do right now),
then process loop iter == 1 and let it exit the loop and
go till bpf_exit (hopefully state prune will find equal states
right after exit from the loop).
then process loop iter == 2 and so on.
If there was an 'if' pushed to stack during loop iter == 1
it has to be popped and explored all the way till bpf_exit.

We cannot just replace BPF_ITER_STATE_ACTIVE with DRAINED.
It would still be ACTIVE with sl->state.branches==0 after that
state explored all paths to bpf_exit.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-30  0:41                                               ` Alexei Starovoitov
@ 2023-10-02  1:40                                                 ` Eduard Zingerman
  2023-10-02 16:29                                                   ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-10-02  1:40 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Fri, 2023-09-29 at 17:41 -0700, Alexei Starovoitov wrote:
[...]
> I suspect that neither option A "Exit or Loop" nor option B "widening"
> is correct.
> In both cases we will do states_equal() with states that have
> not reached exit and don't have live/precision marks.

I'd like to argue about B "widening" for a bit, as I think it might be
interesting in general, and put A aside for now. The algorithm for
widening looks as follows:
- In is_states_equal() for (sl->state.branches > 0 && is_iter_next_insn()) case:
  - Check if states are equal exactly:
    - ignore liveness marks on old state;
    - demand same type for registers and stack slots;
    - ignore precision marks, instead compare scalars using
      regs_exact() [this differs from my previous emails, I'm now sure
      that for this scheme to be correct regs_exact() is needed].
  - If there is an exact match then follow "hit" branch. The idea
    being that visiting exactly the same state can't produce new
    execution paths (like with graph traversal).
  - If there is no exact match, but there is some state V for which
    states_equal(V, C) holds and V and C differ only in inexact
    scalars:
    - copy C to state C';
    - mark range for above mentioned inexact scalars as unbound;
    - continue verification from C';
    - if C' verification fails discard it and resume verification from C.

The hope here is that the guess for "wide" scalars would be correct and
the range for those would not matter. In that case C' would be added to
the explored states right after its creation (as it is a checkpoint),
and later verification would get to the explored C' copy again, so
that the exact comparison would prune further traversals.

Unfortunately, this is riddled with a number of technical
complications on the implementation side.
What Andrii suggests might be simpler.

> The key aspect of the verifier state pruning is that safe states
> in sl list explored all possible paths.
> Say, there is an 'if' block after iter_destroy.
> The push_stack/pop_stack logic is done as a stack to make sure
> that the last 'if' has to be explored before earlier 'if' blocks are checked.
> The existing bpf iter logic violated that.
> To fix that we need to logically do:
> 
> if (is_iter_next_insn(env, insn_idx))
>   if (states_equal(env, &sl->state, cur)) {
>    if (iter_state->iter.state == BPF_ITER_STATE_DRAINED)
>      goto hit;
> 
> instead of BPF_ITER_STATE_ACTIVE.
> In other words we need to process loop iter == 0 case first
> all the way till bpf_exit (just like we do right now),
> then process loop iter == 1 and let it exit the loop and
> go till bpf_exit (hopefully state prune will find equal states
> right after exit from the loop).
> then process loop iter == 2 and so on.
> If there was an 'if' pushed to stack during loop iter == 1
> it has to be popped and explored all the way till bpf_exit.
> 
> We cannot just replace BPF_ITER_STATE_ACTIVE with DRAINED.
> It would still be ACTIVE with sl->state.branches==0 after that
> state explored all paths to bpf_exit.

This is similar to what Andrii suggests, please correct me if I'm wrong:

> 1. If V and C (using your terminology from earlier, where V is the old
> parent state at some next() call instruction, and C is the current one
> on the same instruction) are different -- we just keep going. So
> always try to explore different input states for the loop.

> 2. But if V and C are equivalent, it's too early to conclude anything.
> So enqueue C for later in a separate BFS queue (and perhaps that queue
> is per-instruction, actually; or maybe even per-state, not sure), and
> keep exploring all the other pending queues from the (global) DFS
> queue, until we get back to state V again. At that point we need to
> start looking at postponed states for that V state. But this time we
> should be sure that precision and read marks are propagated from all
> those terminatable code paths.

More formally, before pruning potential looping states we need to
make sure that all precision and read marks are in place.
To achieve this:
- Process states from env->head while those are available; whenever a
  potentially looping state (is_states_equal()) is reached, put it in a
  separate queue.
- Once all env->head states are processed, the only source of new read
  and precision marks is the postponed looping states, some of which
  might not be is_states_equal() anymore. Submit each such state for
  verification until a fixed point is reached (repeating the steps for
  env->head processing).


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-09-28 18:30                                               ` Andrii Nakryiko
@ 2023-10-02  3:26                                                 ` Eduard Zingerman
  0 siblings, 0 replies; 52+ messages in thread
From: Eduard Zingerman @ 2023-10-02  3:26 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Thu, 2023-09-28 at 11:30 -0700, Andrii Nakryiko wrote:
[...]
> > The way I read it the following algorithm should suffice:
> > - add a field bpf_verifier_env::iter_head similar to 'head' but for
> >   postponed looping states;
> > - add functions push_iter_stack(), pop_iter_stack() similar to
> >   push_stack() and pop_stack();
> 
> I don't like the suggested naming, it's too iter-centric, and it's
> actually a queue, not a stack, etc. But that's something to figure out
> later.

After spending some time I think I figured out an example that shows
that postponed looping states do not form a queue. Please bear with
me, as it is quite large:

    1.  j = iter_new();             // fp[-16]
    2.  a = 0;                      // r6
    3.  b = 0;                      // r7
    4.  c = -24;                    // r8
    5.  while (iter_next(j)) {
    6.    i = iter_new();
    7.    a = 0;
    8.    b = 0;
    9.    while (iter_next(i)) {
    10.      if (a == 1) {
    11.        a = 0;
    12.        b = 1;
    13.      } else if (a == 0) {
    14.        a = 1;
    15.        if (random() == 42)
    16.          continue;
    17.        if (b == 1) {
    18.          *(r10 + c) = 7;
    19.          iter_destroy(i);
    20.          iter_destroy(j);
    21.          return;
    22.        }
    23.      }
    24.    }
    25.    iter_destroy(i);
    26.    a = 0;
    27.    b = 0;
    28.    c = -25;
    29.  }
    30.  iter_destroy(j);
    31.  return;

This example is unsafe, because on the second iteration of loop 5-29 the
value of 'c' is -25 (assigned at 28) and 'c' is used for a stack
access at 18; an offset of -25 is misaligned.
The basic idea is to:
(a) hide the fact that 'c' is precise behind a postponed state;
(b) hide the unsafe 'c' value behind a postponed state.

The (b) is achieved on the first iteration of loop 5-29: we enter 5 in a
state with c=-24 not marked precise; 'i' is initially considered drained,
thus the verification path is 5-9,25-29,5. Upon the second visit to 5 the
state is postponed because c=-24 is considered equal to c=-25
(neither is precise).

State at first visit of 5:
  R0=scalar() R1_rw=fp-16 R6=0 R7=0 R8=-24 R10=fp0
  fp-16_r=iter_num(ref_id=1,state=active,depth=0) refs=1

State at second visit of 5 (postponed, let's call it P1):
  R1_w=fp-16 R6=0 R7=0 R8=-25 R10=fp0
  fp-16=iter_num(ref_id=1,state=active,depth=1) refs=1

The (a) is achieved by a series of jumps in the inner loop 9-24:
- first visit of 9 is in state {a=0,b=0,c=XX}
  (here XX is the value of 'c', either -24 or -25, depending on the
   outer loop iteration number);
- second visit of 9 is in state {a=1P,b=0,c=XX}
  (path 9,14-16,9; at this point 'a' is marked precise because of prediction at 13);
- third visit of 9 is in state {a=0P,b=1,c=XX}
  (path 9-12,9; 'b' is not yet marked as precise and thus this state
   is equal to the first one and is postponed).
- after this the verifier visits 17, predicts the condition as false and
  marks 'b' as precise, but {a=1P,b=1,c=XX} is already postponed.
  
E.g. the following state is postponed at first iteration, let's call it P2:
  R0=... R1_w=fp-8 R6=0 R7=1 R8=-24 R10=fp0
  fp-8=iter_num(ref_id=3,state=active,depth=2)
  fp-16=iter_num(ref_id=1,state=active,depth=1) refs=1,3

If verification were to proceed from state P2 {a=0P,b=1P,c=XX} at 9
the path would be 9,13-14,17-21 and 'c' would be marked as precise.

But if we process postponed states as a queue P1 would be processed
first, 'c' would not have a precision mark yet and P1 would be
discarded => P2 with R8=-25 would never be investigated.

I converted this test case to assembly [2] and tried with a
prototype [1] to make sure that I don't miss anything.

---

In general, states form a tree during verification, and we want to stop
the tree growth when we see a would-be back-edge to a similar state.
Consider the following state graph:

    .---> S1 <---.     Here there is a chain of derived states:
    |     |      |       S1 -> S2 -> S3
    |     v      |
    |     S2 -> S2'    And there are potentially looping states:
    |     |            - S2' (derived from S2, states_equal to S1)
    |     v            - S3' (derived from S3, states_equal to S1)
   S3' <- S3

If we find that S2' is no longer states_equal to S1, it might add a
derivation that would eventually propagate a read or precision mark to
some register in S1. The same is true for S3'. Thus there is no clear
order in which S2' and S3' should be processed.

It is possible to imagine more complex state graphs. Unfortunately,
preparing real BPF examples for these graphs is very time consuming
(at least for me).

Hence, I think that a fixed point computation is necessary as I write
in a sibling email to Alexei:
- Process states from env->head while those are available; whenever a
  potentially looping state (is_states_equal()) is reached, put it in a
  separate queue.
- Once all env->head states are processed, the only source of new read
  and precision marks is the postponed looping states, some of which
  might not be is_states_equal() anymore. Submit each such state for
  verification until a fixed point is reached (repeating the steps for
  env->head processing).

The work-in-progress implementation for this algorithm is in [1].

---

While working on this I found an unexpected interaction with infinite
loop detection for the following test case:

    // any branch
    if (random() > 42)
      a = 0;
    else
      a = 1;
    // iterator loop
    i = iter_new()
    while (iter_next(i))
      // do something

When the first branch of the 'if' is processed, a looping state for the
'while' is added to the postponed states queue. Then the second branch of
the 'if' is processed and reaches 'iter_new()', at which point the
infinite loop detection logic sounds an alarm:

    if (states_maybe_looping(&sl->state, cur) &&
        states_equal(env, &sl->state, cur) &&
        !iter_active_depths_differ(&sl->state, cur) && false) {
        verbose_linfo(env, insn_idx, "; ");
        verbose(env, "infinite loop detected at insn %d\n", insn_idx);
        return -EINVAL;
    }

The state with iter_new() is a parent of a looping state and thus has
a non-zero .branches counter, and the iterator depth is 0 for both
arms of the 'if'.

Adding such states to the looping states queue is not safe, as there
is no guarantee that such an arbitrary state would terminate (as opposed
to iter_next(), for which we are guaranteed to have a 'drained' branch).
Thus, the only fix that I see now is to disable infinite loop
detection while there are active iterators.

(We might do some trick with remembering this state and printing it if
 the instruction limit counter is reached, but I'm not sure the added
 complexity is worth it, e.g. what to do if there are several such
 states?)

---

[1] https://github.com/eddyz87/bpf/tree/iters-bug-delayed-traversal
[2] https://github.com/eddyz87/bpf/blob/iters-bug-delayed-traversal/tools/testing/selftests/bpf/progs/iters.c#L860

Thanks,
Eduard


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-02  1:40                                                 ` Eduard Zingerman
@ 2023-10-02 16:29                                                   ` Alexei Starovoitov
  2023-10-02 17:18                                                     ` Eduard Zingerman
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2023-10-02 16:29 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Sun, Oct 1, 2023 at 6:40 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2023-09-29 at 17:41 -0700, Alexei Starovoitov wrote:
> [...]
> > I suspect that neither option A "Exit or Loop" nor option B "widening"
> > is correct.
> > In both cases we will do states_equal() with states that have
> > not reached exit and don't have live/precision marks.
>
> I'd like to argue about B "widening" for a bit, as I think it might be
> interesting in general, and put A aside for now. The algorithm for
> widening looks as follows:
> - In is_states_equal() for (sl->state.branches > 0 && is_iter_next_insn()) case:
>   - Check if states are equal exactly:
>     - ignore liveness marks on old state;
>     - demand same type for registers and stack slots;
>     - ignore precision marks, instead compare scalars using
>       regs_exact() [this differs from my previous emails, I'm now sure
>       that for this scheme to be correct regs_exact() is needed].
>   - If there is an exact match then follow "hit" branch. The idea
>     being that visiting exactly the same state can't produce new
>     execution paths (like with graph traversal).

Right. Exactly the same C state won't produce new paths
as seen in visited state V, but
if C==V at the same insn idx it means we're in an infinite loop.

>   - If there is no exact match, but there is some state V for which
>     states_equal(V, C) holds and V and C differ only in inexact
>     scalars:
>     - copy C to state C';
>     - mark range for above mentioned inexact scalars as unbound;
>     - continue verification from C';
>     - if C' verification fails discard it and resume verification from C.
>
> The hope here is that the guess for "wide" scalars would be correct and
> the range for those would not matter. In that case C' would be added to
> the explored states right after its creation (as it is a checkpoint),
> and later verification would get to the explored C' copy again, so
> that the exact comparison would prune further traversals.
>
> Unfortunately, this is riddled with a number of technical
> complications on implementation side.
> What Andrii suggests might be simpler.
>
> > The key aspect of the verifier state pruning is that safe states
> > in sl list explored all possible paths.
> > Say, there is an 'if' block after iter_destroy.
> > The push_stack/pop_stack logic is done as a stack to make sure
> > that the last 'if' has to be explored before earlier 'if' blocks are checked.
> > The existing bpf iter logic violated that.
> > To fix that we need to logically do:
> >
> > if (is_iter_next_insn(env, insn_idx))
> >   if (states_equal(env, &sl->state, cur)) {
> >    if (iter_state->iter.state == BPF_ITER_STATE_DRAINED)
> >      goto hit;
> >
> > instead of BPF_ITER_STATE_ACTIVE.
> > In other words we need to process loop iter == 0 case first
> > all the way till bpf_exit (just like we do right now),
> > then process loop iter == 1 and let it exit the loop and
> > go till bpf_exit (hopefully state prune will find equal states
> > right after exit from the loop).
> > then process loop iter == 2 and so on.
> > If there was an 'if' pushed to stack during loop iter == 1
> > it has to be popped and explored all the way till bpf_exit.
> >
> > We cannot just replace BPF_ITER_STATE_ACTIVE with DRAINED.
> > It would still be ACTIVE with sl->state.branches==0 after that
> > state explored all paths to bpf_exit.
>
> This is similar to what Andrii suggests, please correct me if I'm wrong:
>
> > 1. If V and C (using your terminology from earlier, where V is the old
> > parent state at some next() call instruction, and C is the current one
> > on the same instruction) are different -- we just keep going. So
> > always try to explore different input states for the loop.
>
> > 2. But if V and C are equivalent, it's too early to conclude anything.
> > So enqueue C for later in a separate BFS queue (and perhaps that queue
> > is per-instruction, actually; or maybe even per-state, not sure), and
> > keep exploring all the other pending queues from the (global) DFS
> > queue, until we get back to state V again. At that point we need to
> > start looking at postponed states for that V state. But this time we
> > should be sure that precision and read marks are propagated from all
> > those terminatable code paths.
>
> More formally, before pruning potential looping states we need to
> make sure that all precision and read marks are in place.
> To achieve this:
> - Process states from env->head while those are available, in case if
>   potential looping state (is_states_equal()) is reached put it to a
>   separate queue.
> - Once all env->head states are processed the only source for new read
>   and precision marks is in postponed looping states, some of which
>   might not be is_states_equal() anymore. Submit each such state for
>   verification until fixed point is reached (repeating steps for
>   env->head processing).

Comparing if (sl->state.branches) makes sense to find an infinite loop.
It's a waste for the verifier to consider visited state V with branches > 0
for pruning.
The safety of V is unknown. The lack of liveness and precision
is just one part. The verifier didn't conclude that V is safe yet.
The current state C being equivalent to V doesn't tell us anything.

If infinite loop detection logic trips us, let's disable it.
I feel the fix should be in process_iter_next_call() to somehow
make it stop doing push_stack() when states_equal(N-1, N-2).


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-02 16:29                                                   ` Alexei Starovoitov
@ 2023-10-02 17:18                                                     ` Eduard Zingerman
  2023-10-03  0:05                                                       ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-10-02 17:18 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Mon, 2023-10-02 at 09:29 -0700, Alexei Starovoitov wrote:
[...]
> > I'd like to argue about B "widening" for a bit, as I think it might be
> > interesting in general, and put A aside for now. The algorithm for
> > widening looks as follows:
> > - In is_states_equal() for (sl->state.branches > 0 && is_iter_next_insn()) case:
> >   - Check if states are equal exactly:
> >     - ignore liveness marks on old state;
> >     - demand same type for registers and stack slots;
> >     - ignore precision marks, instead compare scalars using
> >       regs_exact() [this differs from my previous emails, I'm now sure
> >       that for this scheme to be correct regs_exact() is needed].
> >   - If there is an exact match then follow "hit" branch. The idea
> >     being that visiting exactly the same state can't produce new
> >     execution paths (like with graph traversal).
> 
> Right. Exactly the same C state won't produce new paths
> as seen in visited state V, but
> if C==V at the same insn idx it means we're in an infinite loop.

This is true in general, but for bpf_iter_next() we have a guarantee
that the iteration will end eventually.

> > More formally, before pruning potential looping states we need to
> > make sure that all precision and read marks are in place.
> > To achieve this:
> > - Process states from env->head while those are available, in case if
> >   potential looping state (is_states_equal()) is reached put it to a
> >   separate queue.
> > - Once all env->head states are processed the only source for new read
> >   and precision marks is in postponed looping states, some of which
> >   might not be is_states_equal() anymore. Submit each such state for
> >   verification until fixed point is reached (repeating steps for
> >   env->head processing).
> 
> Comparing if (sl->state.branches) makes sense to find infinite loop.
> It's waste for the verifier to consider visited state V with branches > 0
> for pruning.
> The safety of V is unknown. The lack of liveness and precision
> is just one part. The verifier didn't conclude that V is safe yet.
> The current state C being equivalent to V doesn't tell us anything.
> 
> If infinite loop detection logic trips us, let's disable it.
> I feel the fix should be in process_iter_next_call() to somehow
> make it stop doing push_stack() when states_equal(N-1, N-2).

Consider that we get to an environment state where:
- all env->head states are exhausted;
- all potentially looping states (stored as a separate set of
  states instead of on env->head) are states_equal() to some already
  explored state.

I argue that if such an environment state is reached, the program should
be safe, because:
- Each looping state L is a sub-state of some explored state V, and
  every path from V leads either to a safe exit or to another loop.
- Iterator loops are guaranteed to exit eventually.

Achieving this steady state is the mechanism that tells the verifier
that there is no need to schedule exploration of the N+1 iteration level
for any iterator in the program.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-02 17:18                                                     ` Eduard Zingerman
@ 2023-10-03  0:05                                                       ` Alexei Starovoitov
  2023-10-03  2:00                                                         ` Alexei Starovoitov
  2023-10-03 15:33                                                         ` Eduard Zingerman
  0 siblings, 2 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2023-10-03  0:05 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Mon, Oct 2, 2023 at 10:18 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2023-10-02 at 09:29 -0700, Alexei Starovoitov wrote:
> [...]
> > > I'd like to argue about B "widening" for a bit, as I think it might be
> > > interesting in general, and put A aside for now. The algorithm for
> > > widening looks as follows:
> > > - In is_states_equal() for (sl->state.branches > 0 && is_iter_next_insn()) case:
> > >   - Check if states are equal exactly:
> > >     - ignore liveness marks on old state;
> > >     - demand same type for registers and stack slots;
> > >     - ignore precision marks, instead compare scalars using
> > >       regs_exact() [this differs from my previous emails, I'm now sure
> > >       that for this scheme to be correct regs_exact() is needed].
> > >   - If there is an exact match then follow "hit" branch. The idea
> > >     being that visiting exactly the same state can't produce new
> > >     execution paths (like with graph traversal).
> >
> > Right. Exactly the same C state won't produce new paths
> > as seen in visited state V, but
> > if C==V at the same insn idx it means we're in an infinite loop.
>
> This is true in general, but for bpf_iter_next() we have a guarantee
> that iteration would end eventually.
>
> > > More formally, before pruning potential looping states we need to
> > > make sure that all precision and read marks are in place.
> > > To achieve this:
> > > - Process states from env->head while those are available, in case if
> > >   potential looping state (is_states_equal()) is reached put it to a
> > >   separate queue.
> > > - Once all env->head states are processed the only source for new read
> > >   and precision marks is in postponed looping states, some of which
> > >   might not be is_states_equal() anymore. Submit each such state for
> > >   verification until fixed point is reached (repeating steps for
> > >   env->head processing).
> >
> > Comparing if (sl->state.branches) makes sense to find infinite loop.
> > It's waste for the verifier to consider visited state V with branches > 0
> > for pruning.
> > The safety of V is unknown. The lack of liveness and precision
> > is just one part. The verifier didn't conclude that V is safe yet.
> > The current state C being equivalent to V doesn't tell us anything.
> >
> > If infinite loop detection logic trips us, let's disable it.
> > I feel the fix should be in process_iter_next_call() to somehow
> > make it stop doing push_stack() when states_equal(N-1, N-2).
>
> Consider that we get to an environment state where:
> - all env->head states are exhausted;
> - all potentially looping states (stored as a separate set of
>   states instead of on env->head) are states_equal() to some already
>   explored state.
>
> I argue that if such environment state is reached the program should
> be safe, because:
> - Each looping state L is a sub-state of some explored state V and
>   every path from V leads to either safe exit or another loop.
> - Iterator loops are guaranteed to exit eventually.

It sounds correct, but I don't like that the new mechanism
with two stacks of states completely changes the way the verifier works.
The check you proposed:
if (env->stack_size != 0)
      push_iter_stack()
rings alarm bells.

env->stack_size == 0 (same as env->head exhausted) means we're done
with verification (ignoring bpf_exit in callbacks and subprogs).
So the above check looks like a hack for something that I don't understand yet.
Also there could be branches in the code before and after iter loop.
With "opportunistic" states_equal() for looping states and a delayed
reschedule_loop_states() to throw states back at the verifier,
the whole verification model is incomprehensible (at least to me).
The stack + iter_stack + reschedule_loop_states means that in the following:
foo()
{
  br1 // some if() {...} block
  loop {
    br2
  }
  br3
}

the normal verifier pop_stack logic will check br3 and br1,
but br2 may or may not be checked, depending on the "luck" of states_equal
and which looping states end up in iter_stack.
Then the verifier will restart from checking the looping states.
If they somehow go past the end of the loop, all kinds of things go crazy.
update_branch_counts() might warn, and propagate_liveness and
propagate_precision will do nonsensical things.
This out-of-order state processing distorts the existing model so
much that I don't see how we can reason about this two-stack verification.


I think the cleaner solution is to keep the current single-stack model.
In the above example the verifier would reach the end, then check br3,
then check br2,
then we need to split the branches counter somehow, so that we can
compare loop iter states with previously visited states that are known
to be safe.
In visited states we explored everything in br3 and in br2,
so there is no concern that some path inside the loop or after the loop
missed precision or liveness marks.
Maybe we can split the branches counter into branches due to 'if' blocks
and branches due to process_iter_next_call().
If there are pending process_iter_next_call-caused states, it's still
ok to call states_equal on such visited states.

I could be missing something, of course.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-03  0:05                                                       ` Alexei Starovoitov
@ 2023-10-03  2:00                                                         ` Alexei Starovoitov
  2023-10-03 15:33                                                         ` Eduard Zingerman
  1 sibling, 0 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2023-10-03  2:00 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi


On Mon, Oct 2, 2023 at 5:05 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Oct 2, 2023 at 10:18 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Mon, 2023-10-02 at 09:29 -0700, Alexei Starovoitov wrote:
> > [...]
> > > > I'd like to argue about B "widening" for a bit, as I think it might be
> > > > interesting in general, and put A aside for now. The algorithm for
> > > > widening looks as follows:
> > > > - In is_states_equal() for (sl->state.branches > 0 && is_iter_next_insn()) case:
> > > >   - Check if states are equal exactly:
> > > >     - ignore liveness marks on old state;
> > > >     - demand same type for registers and stack slots;
> > > >     - ignore precision marks, instead compare scalars using
> > > >       regs_exact() [this differs from my previous emails, I'm now sure
> > > >       that for this scheme to be correct regs_exact() is needed].
> > > >   - If there is an exact match then follow "hit" branch. The idea
> > > >     being that visiting exactly the same state can't produce new
> > > >     execution paths (like with graph traversal).
> > >
> > > Right. Exactly the same C state won't produce new paths
> > > as seen in visited state V, but
> > > if C==V at the same insn indx it means we're in the infinite loop.
> >
> > This is true in general, but for bpf_iter_next() we have a guarantee
> > that iteration would end eventually.
> >
> > > > More formally, before pruning potential looping states we need to
> > > > make sure that all precision and read marks are in place.
> > > > To achieve this:
> > > > - Process states from env->head while those are available, in case if
> > > >   potential looping state (is_states_equal()) is reached put it to a
> > > >   separate queue.
> > > > - Once all env->head states are processed the only source for new read
> > > >   and precision marks is in postponed looping states, some of which
> > > >   might not be is_states_equal() anymore. Submit each such state for
> > > >   verification until fixed point is reached (repeating steps for
> > > >   env->head processing).
> > >
> > > Comparing if (sl->state.branches) makes sense to find infinite loop.
> > > It's waste for the verifier to consider visited state V with branches > 0
> > > for pruning.
> > > The safety of V is unknown. The lack of liveness and precision
> > > is just one part. The verifier didn't conclude that V is safe yet.
> > > The current state C being equivalent to V doesn't tell us anything.
> > >
> > > If infinite loop detection logic trips us, let's disable it.
> > > I feel the fix should be in process_iter_next_call() to somehow
> > > make it stop doing push_stack() when states_equal(N-1, N-2).
> >
> > Consider that we get to the environment state where:
> > - all env->head states are exhausted;
> > - all potentially looping states (stored in a separate set of
> >   states instead of env->head) are states_equal() to some already
> >   explored state.
> >
> > I argue that if such environment state is reached the program should
> > be safe, because:
> > - Each looping state L is a sub-state of some explored state V and
> >   every path from V leads to either safe exit or another loop.
> > - Iterator loops are guaranteed to exit eventually.
>
> It sounds correct, but I don't like that the new mechanism
> with two stacks of states completely changes the way the verifier works.
> The check you proposed:
> if (env->stack_size != 0)
>       push_iter_stack()
> rings alarm bells.
>
> env->stack_size == 0 (same as env->head exhausted) means we're done
> with verification (ignoring bpf_exit in callbacks and subprogs).
> So above check looks like a hack for something that I don't understand yet.
> Also there could be branches in the code before and after iter loop.
> With "opportunistic" states_equal() for looping states and delayed
> reschedule_loop_states() to throw states back at the verifier
> the whole verification model is non comprehensible (at least to me).
> The stack + iter_stack + reschedule_loop_states means that in the following:
> foo()
> {
>   br1 // some if() {...} block
>   loop {
>     br2
>   }
>   br3
> }
>
> the normal verifier pop_stack logic will check br3 and br1,
> but br2 may or may not be checked depending on "luck" of states_equal
> and looping states that will be in iter_stack.
> Then the verifier will restart from checking the looping states.
> If they somehow go past the end of the loop, all kinds of things go crazy:
> update_branch_counts() might warn; propagate_liveness() and propagate_precision()
> will do nonsensical things.
> This out-of-order state processing distorts the existing model so
> much that I don't see how we can reason about verification with these two stacks.
>
>
> I think the cleaner solution is to keep current single stack model.
> In the above example the verifier would reach the end, then check br3,
> then check br2,
> then we need to split branches counter somehow, so that we can
> compare loop iter states with previous visited states that are known
> to be safe.
> In visited states we explored everything in br3 and in br2,
> so no concerns that some path inside the loop or after the loop
> missed precision or liveness.
> Maybe we can split branches counter into branches due to 'if' blocks
> and branches due to process_iter_next_call().
> If there are pending process_iter_next_call-caused states it's still
> ok to call states_equal on such visited states.
>
> I could be missing something, of course.

Attached patch is what I meant.
It needs more work, but gives an idea.
simple iterators load fine and it correctly
detects unsafe code in num_iter_bug() example where you have
0xdeadbeef in R1.

[-- Attachment #2: 0001-iter-hack.patch --]
[-- Type: application/octet-stream, Size: 3532 bytes --]

From f194cb540cd3a87e3d745a43c3c82dbcd629d769 Mon Sep 17 00:00:00 2001
From: Alexei Starovoitov <ast@kernel.org>
Date: Mon, 2 Oct 2023 18:30:23 -0700
Subject: [PATCH] iter hack

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf_verifier.h |  2 ++
 kernel/bpf/verifier.c        | 13 ++++++++++---
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 94ec766432f5..3b2131c2ddcd 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -367,12 +367,14 @@ struct bpf_verifier_state {
 	 * In such cases BPF_COMPLEXITY_LIMIT_INSNS limit kicks in.
 	 */
 	u32 branches;
+	u32 looping_states;
 	u32 insn_idx;
 	u32 curframe;
 
 	struct bpf_active_lock active_lock;
 	bool speculative;
 	bool active_rcu_lock;
+	bool looping_state;
 
 	/* first and last insn idx of this verifier state */
 	u32 first_insn_idx;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index eed7350e15f4..7ef41c357575 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1762,6 +1762,8 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 	dst_state->active_lock.ptr = src->active_lock.ptr;
 	dst_state->active_lock.id = src->active_lock.id;
 	dst_state->branches = src->branches;
+	dst_state->looping_states = src->looping_states;
+	dst_state->looping_state = src->looping_state;
 	dst_state->parent = src->parent;
 	dst_state->first_insn_idx = src->first_insn_idx;
 	dst_state->last_insn_idx = src->last_insn_idx;
@@ -1785,6 +1787,8 @@ static void update_branch_counts(struct bpf_verifier_env *env, struct bpf_verifi
 	while (st) {
 		u32 br = --st->branches;
 
+		if (st->looping_state && st->parent)
+			st->parent->looping_states--;
 		/* WARN_ON(br > 1) technically makes sense here,
 		 * but see comment in push_stack(), hence:
 		 */
@@ -1847,6 +1851,7 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
 	err = copy_verifier_state(&elem->st, cur);
 	if (err)
 		goto err;
+	elem->st.looping_state = false;
 	elem->st.speculative |= speculative;
 	if (env->stack_size > BPF_COMPLEXITY_LIMIT_JMP_SEQ) {
 		verbose(env, "The sequence of %d jumps is too complex.\n",
@@ -7733,7 +7738,9 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
 		queued_st = push_stack(env, insn_idx + 1, insn_idx, false);
 		if (!queued_st)
 			return -ENOMEM;
-
+	        if (queued_st->parent)
+	                queued_st->parent->looping_states++;
+		queued_st->looping_state = true;
 		queued_iter = &queued_st->frame[iter_frameno]->stack[iter_spi].spilled_ptr;
 		queued_iter->iter.state = BPF_ITER_STATE_ACTIVE;
 		queued_iter->iter.depth++;
@@ -16451,7 +16458,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 		if (sl->state.insn_idx != insn_idx)
 			goto next;
 
-		if (sl->state.branches) {
+		if (sl->state.branches && sl->state.branches != sl->state.looping_states) {
 			struct bpf_func_state *frame = sl->state.frame[sl->state.curframe];
 
 			if (frame->in_async_callback_fn &&
@@ -16481,7 +16488,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 			 * account iter_next() contract of eventually returning
 			 * sticky NULL result.
 			 */
-			if (is_iter_next_insn(env, insn_idx)) {
+			if (0 && is_iter_next_insn(env, insn_idx)) {
 				if (states_equal(env, &sl->state, cur)) {
 					struct bpf_func_state *cur_frame;
 					struct bpf_reg_state *iter_state, *iter_reg;
-- 
2.34.1



* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-03  0:05                                                       ` Alexei Starovoitov
  2023-10-03  2:00                                                         ` Alexei Starovoitov
@ 2023-10-03 15:33                                                         ` Eduard Zingerman
  2023-10-03 16:07                                                           ` Alexei Starovoitov
  2023-10-03 18:50                                                           ` Alexei Starovoitov
  1 sibling, 2 replies; 52+ messages in thread
From: Eduard Zingerman @ 2023-10-03 15:33 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Mon, 2023-10-02 at 17:05 -0700, Alexei Starovoitov wrote:
[...]
> > Consider that we get to the environment state where:
> > - all env->head states are exhausted;
> > - all potentially looping states (stored in a separate set of
> >   states instead of env->head) are states_equal() to some already
> >   explored state.
> > 
> > I argue that if such environment state is reached the program should
> > be safe, because:
> > - Each looping state L is a sub-state of some explored state V and
> >   every path from V leads to either safe exit or another loop.
> > - Iterator loops are guaranteed to exit eventually.
> 
> It sounds correct, but I don't like that the new mechanism
> with two stacks of states completely changes the way the verifier works.
> The check you proposed:
> if (env->stack_size != 0)
>       push_iter_stack()
> rings alarm bells.
> 
> env->stack_size == 0 (same as env->head exhausted) means we're done
> with verification (ignoring bpf_exit in callbacks and subprogs).
> So above check looks like a hack for something that I don't understand yet.
> Also there could be branches in the code before and after iter loop.
> With "opportunistic" states_equal() for looping states and delayed
> reschedule_loop_states() to throw states back at the verifier
> the whole verification model is non comprehensible (at least to me).
> The stack + iter_stack + reschedule_loop_states means that in the following:
> foo()
> {
>   br1 // some if() {...} block
>   loop {
>     br2
>   }
>   br3
> }
> 
> the normal verifier pop_stack logic will check br3 and br1,
> but br2 may or may not be checked depending on "luck" of states_equal
> and looping states that will be in iter_stack.
> Then the verifier will restart from checking the looping states.
>
> If they somehow go past the end of the loop, all kinds of things go crazy:
> update_branch_counts() might warn; propagate_liveness() and propagate_precision()
> will do nonsensical things.

When I put states to the loop stack I do copy_verifier_state() and
increase .branches counter for each state parent, so this should not
trigger warnings with update_branch_counts(). Logically, any state
that has a loop state as its grandchild is not verified to be safe
until the loop's steady state is achieved, thus such states could not be
used for states pruning and it is correct to keep their branch
counters > 0.
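
To make the counter discipline concrete, here is a tiny userspace model of
it (the toy_* names are mine, not kernel fields; it only mirrors the shape
of update_branch_counts() and the "pin every ancestor until all forked
paths finish" invariant):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of bpf_verifier_state.branches: a per-state count of paths
 * still being explored below it. A state is safe to use for pruning only
 * once its counter drops to 0. */
struct toy_state {
	struct toy_state *parent;
	unsigned int branches;
};

/* Forking a new child path (push_stack() in the kernel) pins the parent. */
static void toy_fork(struct toy_state *child, struct toy_state *parent)
{
	child->parent = parent;
	child->branches = 1;	/* the child's own pending path */
	parent->branches++;
}

/* A path terminated: mirror update_branch_counts() and walk up the
 * parent chain, stopping at the first state that still has pending work. */
static void toy_path_done(struct toy_state *st)
{
	while (st) {
		if (--st->branches)
			break;
		st = st->parent;
	}
}
```

The counter never goes back up once it reaches 0, which is exactly the
refcount-like property discussed below.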

propagate_liveness() and propagate_precision() should not be called
for the delayed states (work-in-progress I'll update the patch).
Behavior for non-delayed states should not be altered by these changes.

> This out-of-order state processing distorts the existing model so
> much that I don't see how we can reason about verification with these two stacks.

Iterators (and callbacks) try to diverge from regular verification
logic of visiting all possible paths by making some general
assumptions about loop bodies. There is no way to verify safety of
such bodies w/o computing steady states.

> I think the cleaner solution is to keep current single stack model.
> In the above example the verifier would reach the end, then check br3,
> then check br2,
> then we need to split branches counter somehow, so that we can
> compare loop iter states with previous visited states that are known
> to be safe.
> In visited states we explored everything in br3 and in br2,
> so no concerns that some path inside the loop or after the loop
> missed precision or liveness.
>
> Maybe we can split branches counter into branches due to 'if' blocks
> and branches due to process_iter_next_call().
> If there are pending process_iter_next_call-caused states it's still
> ok to call states_equal on such visited states.

Using a stack forces DFS state traversal; if there is a loop state
with branching, this is prone to infinite nesting, e.g.:

  0. // Complete assembly code is at the end of the email.
  1. while (next(i)) {
  2.   if (random())
  3.     continue;
  4.   r0 += 0;
  5. }

This would lead to an infinite loop when using the patch shared in [1].
At (2) the verifier would always create a state with {.looping_state=false},
thus the checkpoint state at (1) would always have
`sl->state.branches != sl->state.looping_states`:
- the looping state is the current one,
- the non-looping state is the "else" branch scheduled in (2).
Therefore the checkpoint at (1) is not eligible for pruning and the verifier
would eagerly descend via path 1,2,3,1,2,3,...

I don't think this is a quirk of the patch: with only a single stack
there is no way to postpone exploration.
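
As a rough userspace illustration of that scheduling point (purely
illustrative, none of this is verifier code, and the 1000-step budget is
an arbitrary stand-in for "never converges"): with a single LIFO stack a
looping state must be re-pushed immediately and is re-explored depth-first
forever, while a second queue lets it wait until all other work drains:

```c
#include <assert.h>
#include <stdbool.h>

enum kind { WORK, LOOP };

struct item { enum kind kind; };

/* Drain a work list seeded with one plain branch and one looping state.
 * Returns the number of pops until the stack empties, or -1 if the step
 * budget is exhausted (exploration never converges). */
static int drain(bool delay_loops)
{
	struct item stack[64], delayed[64];
	int top = 0, ndelayed = 0, steps;

	stack[top++] = (struct item){ LOOP };
	stack[top++] = (struct item){ WORK };
	for (steps = 0; steps < 1000 && top > 0; steps++) {
		struct item it = stack[--top];

		if (it.kind == WORK)
			continue;	/* plain branch: fully explored */
		if (delay_loops)
			delayed[ndelayed++] = it;	/* park the loop state */
		else
			stack[top++] = (struct item){ LOOP };	/* DFS: descend now */
	}
	return top ? -1 : steps;
}
```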

Note that removal of `elem->st.looping_state = false;` from
push_stack() is not safe either: precision marks and unsafe values
could be concealed in nested iteration states, as shown in example [2],
so we risk pruning some states too early.

[1] https://lore.kernel.org/bpf/CAADnVQJ3=x8hfv7d29FQ-ckzh9=MXo54cnFShFp=eG0fJjdDow@mail.gmail.com/T/#m8fc0fc3e338f57845f9fb65e0c3798a2ef5fb2e7
[2] https://lore.kernel.org/bpf/CAADnVQJ3=x8hfv7d29FQ-ckzh9=MXo54cnFShFp=eG0fJjdDow@mail.gmail.com/T/#m6014e44a00ab7732890c13b83b5497f8d856fc81

---

In theory, the loops' steady state does not have to be tracked globally;
it can be tracked per key state, e.g. entry states for top-level loops:
- at the entry state E for a top-level loop:
  - establish separate regular and loop state stacks: E.stack, E.loop_stack;
  - add E to explored states as a checkpoint and continue verification
    from E', a copy of E (at this point E's branches counter is > 0);
- in is_state_visited() / .branches > 0 / is_bpf_next_insn():
  - only consider current state C for states_equal(V, C) if V and C
    have a common grandparent E;
  - use E.loop_stack to delay C;
- use same logic for steady state as before:
  - E.stack should be exhausted;
  - states in E.loop_stack should all be states_equal() to some
    explored state V that has E as its grandparent.
- if such a steady state is achieved, E's loop states can be dropped
  and the branches counter for E can be set to 0, thus opening it up for
  regular states pruning.
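
A compile-time sketch of the steady-state predicate above (the struct and
all names here are hypothetical bookkeeping, just to pin the condition
down, not a proposed kernel layout):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for a top-level loop entry E. */
struct toy_loop_entry {
	int stack_size;		/* states still queued under E (E.stack) */
	int loop_states;	/* delayed looping states (E.loop_stack) */
	int subsumed;		/* delayed states that are states_equal() to
				 * some explored state with E as grandparent */
	unsigned int branches;
};

/* Steady state: E's work stack is exhausted and every delayed looping
 * state is subsumed by an already-explored state under E. */
static bool steady_state(const struct toy_loop_entry *e)
{
	return e->stack_size == 0 && e->subsumed == e->loop_states;
}

/* Once steady state holds, E's delayed states can be dropped and E's
 * branches counter zeroed, opening E up for regular state pruning. */
static bool try_declare_safe(struct toy_loop_entry *e)
{
	if (!steady_state(e))
		return false;
	e->loop_states = e->subsumed = 0;
	e->branches = 0;
	return true;
}
```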
  
Such logic would make verification of loops more similar to
verification of regular conditionals, e.g.:

  foo() {
    if (random()) {
      r0 = 0;  // branch B1
    } else {
      r0 = 42; // branch B2
    }
    i = iter_new();
    // loop entry E
    while (iter_next()) {
      // loop body LB
      do_something();
    }
  }

Here verifier would first visit path B1,E,LB, second visit to LB would
be delayed and steady state for E would be declared, E.branches would
be set to 0. Then path B1,E would be visited and declared safe as E is
eligible for states pruning.

This gives more opportunities for states pruning and makes user's
reasoning about iterators verification similar to user's reasoning
about regular loops verification.
But it adds some more complexity to the implementation
(albeit, I think it's not much, env->head becomes a stack of stacks
 and a few other tweaks).

---

SEC("?raw_tp")
__success
__naked int loop1(void)
{
 	asm volatile (
		"r1 = r10;"
		"r1 += -8;"
		"r2 = 0;"
		"r3 = 10;"
		"call %[bpf_iter_num_new];"
	"loop_%=:"
		"r1 = r10;"
		"r1 += -8;"
		"call %[bpf_iter_num_next];"
		"if r0 == 0 goto loop_end_%=;"
		"call %[bpf_get_prandom_u32];"
		"if r0 != 42 goto loop_%=;"
		"r0 += 0;"
		"goto loop_%=;"
	"loop_end_%=:"
		"r1 = r10;"
		"r1 += -8;"
		"call %[bpf_iter_num_destroy];"
		"r0 = 0;"
		"exit;"
		:
		: __imm(bpf_get_prandom_u32),
		  __imm(bpf_iter_num_new),
		  __imm(bpf_iter_num_next),
		  __imm(bpf_iter_num_destroy)
		: __clobber_all
	);
}


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-03 15:33                                                         ` Eduard Zingerman
@ 2023-10-03 16:07                                                           ` Alexei Starovoitov
  2023-10-03 18:50                                                           ` Alexei Starovoitov
  1 sibling, 0 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2023-10-03 16:07 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

[-- Attachment #1: Type: text/plain, Size: 2728 bytes --]

On Tue, Oct 3, 2023 at 8:33 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
>
> This would lead to an infinite loop when using the patch shared in [1].

yeah. the first hacky patch was too buggy.
Attached is a better version, but the same idea.
It passes all existing tests and accurately detects num_iter_bug.
Then I took 5 tests from:
https://github.com/kernel-patches/bpf/commit/379c865e6b99302d69392d4f224b5c540247bcb2

It does the right thing for all of them,
including iter_precision_fixed_point[12],
except loop_state_deps1.

>
> Note that removal of `elem->st.looping_state = false;` from
> push_stack() is not safe either: precision marks and unsafe values
> could be concealed in nested iteration states, as shown in example [2],
> so we risk pruning some states too early.

it's loop_state_deps1, right?
fp-25 is a tricky one.

> SEC("?raw_tp")
> __success
> __naked int loop1(void)
> {
>         asm volatile (
>                 "r1 = r10;"
>                 "r1 += -8;"
>                 "r2 = 0;"
>                 "r3 = 10;"
>                 "call %[bpf_iter_num_new];"
>         "loop_%=:"
>                 "r1 = r10;"
>                 "r1 += -8;"
>                 "call %[bpf_iter_num_next];"
>                 "if r0 == 0 goto loop_end_%=;"
>                 "call %[bpf_get_prandom_u32];"
>                 "if r0 != 42 goto loop_%=;"
>                 "r0 += 0;"
>                 "goto loop_%=;"
>         "loop_end_%=:"
>                 "r1 = r10;"
>                 "r1 += -8;"
>                 "call %[bpf_iter_num_destroy];"
>                 "r0 = 0;"
>                 "exit;"
>                 :
>                 : __imm(bpf_get_prandom_u32),
>                   __imm(bpf_iter_num_new),
>                   __imm(bpf_iter_num_next),
>                   __imm(bpf_iter_num_destroy)
>                 : __clobber_all
>         );
> }

this one loads fine with the attached patch.

I still prefer to avoid a two-stack hack if one stack is enough.

The key idea of the patch is:
@@ -7733,7 +7743,8 @@ static int process_iter_next_call(struct
bpf_verifier_env *env, int insn_idx,
                queued_st = push_stack(env, insn_idx + 1, insn_idx, false);
                if (!queued_st)
                        return -ENOMEM;
-
+               queued_st->looping_states++;
+               queued_st->branches--;

so deferred looping states from process_iter_next_call() are not
considered branches in _that_ state.
push_stack() still did branches++ in the parent.
So the verifier will process everything after process_iter_next_call()
before reducing its branches to zero and making it eligible
for state equality comparison.
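
In toy form, the eligibility rule this accounting buys (my reading of the
patch, with simplified names; nothing here is actual verifier code):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified counters from the patch idea: states pushed by 'if' blocks
 * count as branches; states pushed by process_iter_next_call() count as
 * looping_states instead. */
struct toy_state {
	unsigned int branches;
	unsigned int looping_states;
};

/* A visited state may be compared with states_equal() once all of its
 * ordinary branches are explored, even while loop iterations are pending. */
static bool eligible_for_states_equal(const struct toy_state *st)
{
	return st->branches == 0;
}

/* Pruning additionally requires the pending iterations to be resolved. */
static bool fully_explored(const struct toy_state *st)
{
	return st->branches == 0 && st->looping_states == 0;
}
```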

[-- Attachment #2: 0001-iter-hack-2.patch --]
[-- Type: application/octet-stream, Size: 4904 bytes --]

From 0d5fd67af67bc158394fe96742ab4ad94085ea08 Mon Sep 17 00:00:00 2001
From: Alexei Starovoitov <ast@kernel.org>
Date: Mon, 2 Oct 2023 18:30:23 -0700
Subject: [PATCH] iter hack 2

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf_verifier.h |  1 +
 kernel/bpf/verifier.c        | 34 +++++++++++++++++++++++++++-------
 2 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 94ec766432f5..34f7de583aae 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -367,6 +367,7 @@ struct bpf_verifier_state {
 	 * In such cases BPF_COMPLEXITY_LIMIT_INSNS limit kicks in.
 	 */
 	u32 branches;
+	u32 looping_states;
 	u32 insn_idx;
 	u32 curframe;
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index eed7350e15f4..cf0a5b779149 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1762,6 +1762,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 	dst_state->active_lock.ptr = src->active_lock.ptr;
 	dst_state->active_lock.id = src->active_lock.id;
 	dst_state->branches = src->branches;
+	dst_state->looping_states = src->looping_states;
 	dst_state->parent = src->parent;
 	dst_state->first_insn_idx = src->first_insn_idx;
 	dst_state->last_insn_idx = src->last_insn_idx;
@@ -1783,7 +1784,12 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 static void update_branch_counts(struct bpf_verifier_env *env, struct bpf_verifier_state *st)
 {
 	while (st) {
-		u32 br = --st->branches;
+		u32 br = 0;
+
+		if (st->branches)
+			br = --st->branches;
+		else if (st->looping_states)
+			st->looping_states--;
 
 		/* WARN_ON(br > 1) technically makes sense here,
 		 * but see comment in push_stack(), hence:
@@ -1791,7 +1797,7 @@ static void update_branch_counts(struct bpf_verifier_env *env, struct bpf_verifi
 		WARN_ONCE((int)br < 0,
 			  "BUG update_branch_counts:branches_to_explore=%d\n",
 			  br);
-		if (br)
+		if (br || st->looping_states)
 			break;
 		st = st->parent;
 	}
@@ -1847,6 +1853,10 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
 	err = copy_verifier_state(&elem->st, cur);
 	if (err)
 		goto err;
+	if (elem->st.looping_states) {
+		elem->st.looping_states--;
+		elem->st.branches++;
+	}
 	elem->st.speculative |= speculative;
 	if (env->stack_size > BPF_COMPLEXITY_LIMIT_JMP_SEQ) {
 		verbose(env, "The sequence of %d jumps is too complex.\n",
@@ -7733,7 +7743,8 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
 		queued_st = push_stack(env, insn_idx + 1, insn_idx, false);
 		if (!queued_st)
 			return -ENOMEM;
-
+		queued_st->looping_states++;
+		queued_st->branches--;
 		queued_iter = &queued_st->frame[iter_frameno]->stack[iter_spi].spilled_ptr;
 		queued_iter->iter.state = BPF_ITER_STATE_ACTIVE;
 		queued_iter->iter.depth++;
@@ -15824,7 +15835,7 @@ static void clean_live_states(struct bpf_verifier_env *env, int insn,
 
 	sl = *explored_state(env, insn);
 	while (sl) {
-		if (sl->state.branches)
+		if (sl->state.branches || sl->state.looping_states)
 			goto next;
 		if (sl->state.insn_idx != insn ||
 		    sl->state.curframe != cur->curframe)
@@ -16451,6 +16462,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 		if (sl->state.insn_idx != insn_idx)
 			goto next;
 
+		verbose(env, "branches=%d/%d ", sl->state.branches, sl->state.looping_states);
 		if (sl->state.branches) {
 			struct bpf_func_state *frame = sl->state.frame[sl->state.curframe];
 
@@ -16481,7 +16493,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 			 * account iter_next() contract of eventually returning
 			 * sticky NULL result.
 			 */
-			if (is_iter_next_insn(env, insn_idx)) {
+			if (0 && is_iter_next_insn(env, insn_idx)) {
 				if (states_equal(env, &sl->state, cur)) {
 					struct bpf_func_state *cur_frame;
 					struct bpf_reg_state *iter_state, *iter_reg;
@@ -16638,8 +16650,16 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 		kfree(new_sl);
 		return err;
 	}
+	if (new->looping_states && 0) {
+		new->looping_states--;
+		new->branches++;
+	}
+	if (cur->looping_states && 0) {
+		cur->looping_states--;
+		cur->branches++;
+	}
 	new->insn_idx = insn_idx;
-	WARN_ONCE(new->branches != 1,
+	WARN_ONCE(new->branches + new->looping_states != 1,
 		  "BUG is_state_visited:branches_to_explore=%d insn %d\n", new->branches, insn_idx);
 
 	cur->parent = new;
@@ -16779,7 +16799,7 @@ static int do_check(struct bpf_verifier_env *env)
 		insn = &insns[env->insn_idx];
 		class = BPF_CLASS(insn->code);
 
-		if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
+		if (++env->insn_processed > 1000) {//BPF_COMPLEXITY_LIMIT_INSNS) {
 			verbose(env,
 				"BPF program is too large. Processed %d insn\n",
 				env->insn_processed);
-- 
2.34.1



* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-03 15:33                                                         ` Eduard Zingerman
  2023-10-03 16:07                                                           ` Alexei Starovoitov
@ 2023-10-03 18:50                                                           ` Alexei Starovoitov
  2023-10-03 21:52                                                             ` Eduard Zingerman
  1 sibling, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2023-10-03 18:50 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Tue, Oct 3, 2023 at 8:33 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
>
> When I put states to the loop stack I do copy_verifier_state() and
> increase .branches counter for each state parent, so this should not
> trigger warnings with update_branch_counts().

No warning doesn't mean it's correct.
I suspect that your reschedule_loop_states() will increase
branches from 0 to 1,
which is equivalent to increasing a refcnt from zero. Not good.

> Logically, any state
> that has a loop state as its grandchild is not verified to be safe
> until the loop's steady state is achieved, thus such states could not be
> used for states pruning and it is correct to keep their branch
> counters > 0.

Correct.

>
> propagate_liveness() and propagate_precision() should not be called
> for the delayed states (work-in-progress I'll update the patch).
> Behavior for non-delayed states should not be altered by these changes.

I disagree. Delayed states should not have different propagation rules.

> Iterators (and callbacks) try to diverge from regular verification
> logic of visiting all possible paths by making some general
> assumptions about loop bodies. There is no way to verify safety of
> such bodies w/o computing steady states.

Iterators and callbacks should not be special.
They are no different than bounded loops
and their verification shouldn't be much different.
Bounded loops have constant upper bound while
iterators do not, so they have to converge based on state
equivalence, but the verifier still has to do DFS and discover
all possible paths, propagate precision and liveness before
considering two states equivalent.
It does this today for bounded loops and should do the same
for iterators.
The question is how to do it.
Delaying states and breaking DFS assumptions is a non-starter.

I see now that my 2nd hack is still buggy.
Differentiating states with branches vs looping_states counters
doesn't seem to work in all cases.
Your loop_state_deps1 demonstrates that. It's a great test.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-03 18:50                                                           ` Alexei Starovoitov
@ 2023-10-03 21:52                                                             ` Eduard Zingerman
  2023-10-03 22:03                                                               ` Eduard Zingerman
  2023-10-03 23:08                                                               ` Alexei Starovoitov
  0 siblings, 2 replies; 52+ messages in thread
From: Eduard Zingerman @ 2023-10-03 21:52 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Tue, 2023-10-03 at 11:50 -0700, Alexei Starovoitov wrote:
> On Tue, Oct 3, 2023 at 8:33 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > When I put states to the loop stack I do copy_verifier_state() and
> > increase .branches counter for each state parent, so this should not
> > trigger warnings with update_branch_counts().
> 
> No warn doesn't mean it's correct.
> I suspect what your reschedule_loop_states() will increase
> branches from 0 to 1.
> Which is equivalent to increasing refcnt from zero. Not good.

Not really, here is how I modified bpf_verifier_env:

    struct bpf_verifier_state_stack {
        struct bpf_verifier_stack_elem *head; /* stack of verifier states to be processed */
        int size;                             /* number of states to be processed */
    };
    
    struct bpf_verifier_env {
        ...
        struct bpf_verifier_state_stack stack;
        struct bpf_verifier_state_stack loop_stack;
        ...
    }

Here env->stack is used for DFS traversal and env->loop_stack is used
to delay verification of the loop states.

When bpf_iter_next() is reached in state C and states_equal() shows
that there is a potentially equivalent state V:
- a copy C' of C is created using copy_verifier_state(); this updates all
  branch counters up the ownership chain, as with any other state;
- C' is put on env->loop_stack.

The reschedule_loop_states() [1] loops over states in loop_stack to
see if there are states that no longer have an equivalent state; such
states are removed from env->loop_stack and put on env->stack.
This is done without updating branch counters, so at any point in time
parent states have the loop state accounted for in their branch counters.
A branch counter *never* transitions from 1 to 0 and then to 1 again
during this process.

[1] https://github.com/kernel-patches/bpf/compare/bpf-next_base...eddyz87:bpf:iters-bug-delayed-traversal#diff-edbb57adf10d1ce1fbb830a34fa92712fd01db1fbd9b6f2504001eb7bcc7b9d0R16823
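
A rough userspace shape of that rescheduling step (the stack layout and
the states_equal() stand-in are simplified; toy_* names are mine):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_state {
	unsigned int branches;
	bool has_equal_explored;	/* stand-in for states_equal() against
					 * the explored set */
	struct toy_state *next;
};

struct toy_stack {
	struct toy_state *head;
	int size;
};

static void toy_push(struct toy_stack *s, struct toy_state *st)
{
	st->next = s->head;
	s->head = st;
	s->size++;
}

static struct toy_state *toy_pop(struct toy_stack *s)
{
	struct toy_state *st = s->head;

	if (st) {
		s->head = st->next;
		s->size--;
	}
	return st;
}

/* Move delayed loop states that lost their equivalent explored state back
 * to the main work stack. Branch counters are never touched here: delayed
 * states stay accounted for the whole time, so no counter goes 0 -> 1. */
static int toy_reschedule_loop_states(struct toy_stack *stack,
				      struct toy_stack *loop_stack)
{
	struct toy_stack keep = { NULL, 0 };
	struct toy_state *st;
	int moved = 0;

	while ((st = toy_pop(loop_stack))) {
		if (st->has_equal_explored) {
			toy_push(&keep, st);	/* still subsumed: stay delayed */
		} else {
			toy_push(stack, st);	/* re-activate for verification */
			moved++;
		}
	}
	*loop_stack = keep;
	return moved;
}
```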

> > Logically, any state
> > that has a loop state as its grandchild is not verified to be safe
> > until the loop's steady state is achieved, thus such states could not be
> > used for states pruning and it is correct to keep their branch
> > counters > 0.
> 
> Correct.
>
> > propagate_liveness() and propagate_precision() should not be called
> > for the delayed states (work-in-progress I'll update the patch).
> > Behavior for non-delayed states should not be altered by these changes.
> 
> I disagree. Delayed states should not have different propagation rules.

Upon first discovery the loop state C is states_equal to some state V,
precision and read marks are not yet finalized and thus states_equal(V, C)
might become false at some point. Also, C should not be used for states pruning.
Thus, there is no need to propagate liveness and precision from V to C
at this point.

It is possible and correct to propagate liveness and precision from V
to C when loop steady state is achieved, as at that point we know for
sure that C is a sub-state of V. However, currently loop states are
tracked globally and no states are processed after loops steady state
is reached, hence I don't do liveness and precision propagation.

On the other hand, if loop steady state would be tracked non-globally
e.g. for top level loop headers as I wrote in the other email today,
propagation of liveness and precision information would make sense,
as these states could be used for state pruning later.

> Iterators and callbacks should not be special.
> They are no different than bounded loops
> and their verification shouldn't be much different.
> Bounded loops have constant upper bound while
> iterators do not, so they have to converge based on state
> equivalence, but the verifier still has to do DFS and discover
> all possible paths, propagate precision and liveness before
> considering two states equivalent.
> It does this today for bounded loops and should do the same
> for iterators.
> The question is how to do it.
> Delaying states and breaking DFS assumptions is a non starter.

I think that the absence of a bound is a fundamental difference.

On each verification path of a bounded loop there is always a precise
value that at some point triggers loop termination. There is no need
for states equivalence checks (except for verification performance).

Each verification path of iterator or callback based loop is logically
an infinite series of states. In order to terminate verification of
such path it is necessary to:
- either find some prior state V identical to current state C;
- or find some prior state V such that current state C is a sub-state of V.

Terminating the infinite series by an identical state turns out to be very
limiting in practice.

On the other hand, sub-state check requires finalization of read and
precision marks in V. We've seen that such marks could be concealed at
an unknown depth. In addition, state V might be a parent state for
multiple grandchildren contributing to precision and read marks.
These grandchildren states are created when conditional jumps are
processed on possibly different iteration depths.

The necessity to visit all grandchildren states of V leads to the
necessity of a mixed DFS and BFS traversal of the loop states.

(In fact, this is not a special case but a generalization:
 bounded loop processing could be done within the same framework).

> I see now that my 2nd hack is still buggy.
> Differentiating state with branches vs looping_states counters
> doesn't seem to work in all cases.
> Your loop_state_deps1 demonstrates that. It's a great test.

I'm glad this test is useful.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-03 21:52                                                             ` Eduard Zingerman
@ 2023-10-03 22:03                                                               ` Eduard Zingerman
  2023-10-03 23:08                                                               ` Alexei Starovoitov
  1 sibling, 0 replies; 52+ messages in thread
From: Eduard Zingerman @ 2023-10-03 22:03 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

I'm sorry, here are a few typos corrected:

  When bpf_iter_next() is reached in state C and states_equal() shows
  that there is potentially equivalent state V:
- - copy C' of C is created sing copy_verifier_state(), it updates all
+ - copy C' of C is created using copy_verifier_state(), it updates all
    branch counters up the ownership chain as with any other state;

  Upon first discovery the loop state C is states_equal to some state V,
  precision and read marks are not yet finalized and thus states_equal(V, C)
- become false at some point. Also, C should not be used for states pruning.
+ might become false at some point. Also, C should not be used for states pruning.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-03 21:52                                                             ` Eduard Zingerman
  2023-10-03 22:03                                                               ` Eduard Zingerman
@ 2023-10-03 23:08                                                               ` Alexei Starovoitov
  2023-10-03 23:14                                                                 ` Eduard Zingerman
  2023-10-04  1:05                                                                 ` Eduard Zingerman
  1 sibling, 2 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2023-10-03 23:08 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Wed, Oct 04, 2023 at 12:52:07AM +0300, Eduard Zingerman wrote:
> On Tue, 2023-10-03 at 11:50 -0700, Alexei Starovoitov wrote:
> > On Tue, Oct 3, 2023 at 8:33 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > 
> > > When I put states to the loop stack I do copy_verifier_state() and
> > > increase .branches counter for each state parent, so this should not
> > > trigger warnings with update_branch_counts().
> > 
> > No warn doesn't mean it's correct.
> > I suspect what your reschedule_loop_states() will increase
> > branches from 0 to 1.
> > Which is equivalent to increasing refcnt from zero. Not good.
> 
> Not really, here is how I modified bpf_verifier_env:
> 
>     struct bpf_verifier_state_stack {
>         struct bpf_verifier_stack_elem *head; /* stack of verifier states to be processed */
>         int size;                             /* number of states to be processed */
>     };
>     
>     struct bpf_verifier_env {
>         ...
>         struct bpf_verifier_state_stack stack;
>         struct bpf_verifier_state_stack loop_stack;
>         ...
>     }
> 
> Here env->stack is used for DFS traversal and env->loop_stack is used
> to delay verification of the loop states.
> 
> When bpf_iter_next() is reached in state C and states_equal() shows
> that there is potentially equivalent state V:
> - copy C' of C is created using copy_verifier_state(), it updates all
>   branch counters up the ownership chain as with any other state;
> - C' is put to env->loop_stack.

and C' points to normal parent and increments its branches as part of __push_stack().

By delaying all looping states that look to be equal to V states
due to broken precision marks, the verifier will walk almost everything
after the loop, converge all branches to 0, then do clean_live_states() and REG_LIVE_DONE
in the bottom part;
then it will proceed to verify all branches before the loop, but
since their branch counters are stuck due to states in loop_stack,
the upper part of the program won't get the same treatment.
update_branch_counts() will never go past the point where C' was put on loop_stack,
so every state before the loop will have branches > 0 and
verification of branches before the loop will not do any state pruning at all.

void foo()
{
  if (...) { .. } // no pruning
  if (...) { .. } // no pruning

  bpf_for(...)
  if (...) { .. } // all ok
}

Essentially, any complex code before the loop has a high chance of
verifier state explosion.

> It is possible and correct to propagate liveness and precision from V
> to C when loop steady state is achieved, as at that point we know for
> sure that C is a sub-state of V. However, currently loop states are
> tracked globally and no states are processed after loops steady state
> is reached, hence I don't do liveness and precision propagation.

Hmm. I think the code is doing it:
if (is_iter_next_insn(env, insn_idx)) {
  if (states_equal(env, &sl->state, cur)) {
    push_loop_stack(env, insn_idx, env->prev_insn_idx);
    goto hit;
hit:
  propagate_liveness()

> The necessity to visit all grandchildren states of V leads to
> necessity of mixed DFS and BFS traversal of the loop states.

DFS + BFS traversal doesn't guarantee safety.
We delayed looping states in loop_stack, but it doesn't mean
that the loop body reached a steady state.
Your own num_iter_bug() is an example.
The verifier will miss exploring path with r7 = 0xdead.
When C' is popped from loop_stack there is chance it will explore them,
but there is no guarantee.
For this particular num_iter_bug() case the loop processing
will eventually propagate precision marks, and when retrying with C' later,
C' will be correctly rejected, since the precision marks are there.
But I think that is more due to luck and the shape of the test.

if (is_iter_next_insn(env, insn_idx)) {
  if (states_equal(env, &sl->state, cur)) {

is a foot gun. It finds broken equivalence and potentially skips
whole blocks of code.
What guarantees that C' from loop_stack will explore them?
I think avoiding states_equal(V, cur) when V state didn't converge
is a mandatory part of the fix.

I can see that a little bit of out-of-order state processing is probably
still correct, but delaying loop_stack all the way until
env->stack.size == 0 is just broken.
For example we can push looping state in process_iter_next_call()
not the top of the stack, but place it after branches with
insn_idx >= iter_next's insn_idx.
This way the delayed looping states will still be processed
around loop processing and every code block before the loop
will converge to branches = 0.
To do that we don't need another loop_stack.

I think we need a call to converge. Office hours on Thu at 9am?
Or tomorrow at 9am?


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-03 23:08                                                               ` Alexei Starovoitov
@ 2023-10-03 23:14                                                                 ` Eduard Zingerman
  2023-10-04  0:22                                                                   ` Andrii Nakryiko
  2023-10-04  1:05                                                                 ` Eduard Zingerman
  1 sibling, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-10-03 23:14 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Tue, 2023-10-03 at 16:08 -0700, Alexei Starovoitov wrote:
> ...

I'll reply a bit later.
Keeping all states before the loop in limbo might indeed be an issue
performance-wise.

> I think we need a call to converge. Office hours on Thu at 9am?
> Or tomorrow at 9am?

Both are fine by me.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-03 23:14                                                                 ` Eduard Zingerman
@ 2023-10-04  0:22                                                                   ` Andrii Nakryiko
  0 siblings, 0 replies; 52+ messages in thread
From: Andrii Nakryiko @ 2023-10-04  0:22 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Tue, Oct 3, 2023 at 4:14 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2023-10-03 at 16:08 -0700, Alexei Starovoitov wrote:
> > ...
>
> I'll reply a bit later.
> Keeping all states before loop in a limbo might indeed be an issue
> performance wise.
>
> > I think we need a call to converge. Office hours on Thu at 9am?
> > Or tomorrow at 9am?
>
> Both are fine by me.

Ditto.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-03 23:08                                                               ` Alexei Starovoitov
  2023-10-03 23:14                                                                 ` Eduard Zingerman
@ 2023-10-04  1:05                                                                 ` Eduard Zingerman
  2023-10-04  2:57                                                                   ` Alexei Starovoitov
  1 sibling, 1 reply; 52+ messages in thread
From: Eduard Zingerman @ 2023-10-04  1:05 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Tue, 2023-10-03 at 16:08 -0700, Alexei Starovoitov wrote:
[...]
> > When bpf_iter_next() is reached in state C and states_equal() shows
> > that there is potentially equivalent state V:
> > - copy C' of C is created using copy_verifier_state(), it updates all
> >   branch counters up the ownership chain as with any other state;
> > - C' is put to env->loop_stack.
> 
> and C' points to normal parent and increments its branches as part of __push_stack().

Yes.

> void foo()
> {
>   if (...) { .. } // no pruning
>   if (...) { .. } // no pruning
> 
>   bpf_for(...)
>   if (...) { .. } // all ok
> }
> 
> Essentially any complex code before the loop has a high chance of the verifier
> state explosion.

Yes. This may be an issue. I'll try to hack up the layered variant I
talked about before to see what the underlying issues are. The idea is to verify
every child state of the loop entry before moving to any branches of
the loop's parent states. This would allow computing a local "steady loop"
and marking everything after the loop entry as safe (.branches == 0).

> > It is possible and correct to propagate liveness and precision from V
> > to C when loop steady state is achieved, as at that point we know for
> > sure that C is a sub-state of V. However, currently loop states are
> > tracked globally and no states are processed after loops steady state
> > is reached, hence I don't do liveness and precision propagation.
> 
> Hmm. I think the code is doing it:
> if (is_iter_next_insn(env, insn_idx)) {
>   if (states_equal(env, &sl->state, cur)) {
>     push_loop_stack(env, insn_idx, env->prev_insn_idx);
>     goto hit;
> hit:
>   propagate_liveness()

Yes and this should be changed. It should not produce incorrect
results as additional precision and read marks are conservative,
but will hinder state pruning in some scenarios.

> DFS + BFS traversal doesn't guarantee safety.
> We delayed looping states in loop_stack, but it doesn't mean
> that the loop body reached a steady state.
> Your own num_iter_bug() is an example.
> The verifier will miss exploring path with r7 = 0xdead.
> When C' is popped from loop_stack there is chance it will explore them,
> but there is no guarantee.
> For this particular num_iter_bug() case the loop processing
> will eventually propagate precision marks and retrying with C' later,
> the C' will be correctly rejected, since precision marks are there.
> But I think that is more due to luck and the shape of the test.
> 
> if (is_iter_next_insn(env, insn_idx)) {
>   if (states_equal(env, &sl->state, cur)) {
> 
> is a foot gun. It finds broken equivalence and potentially skipping
> whole blocks of code.
> What guarantees that C' from loop_stack will explore them?

The idea is that when `env->stack.size == 0` all V's non-delayed
children states are explored and relevant precision marks are
propagated to V. If at this point some states_equal(V, C') is false it
is necessary to schedule C' for one more exploration round as it might
visit some code blocks that were not visited on the path from V to C'
(as different prediction decisions could be made).
If a point is reached when for all loop states C' there are some
states V such that states_equal(V, C'), there are no more
configurations that can visit code blocks not visited on the path from
V to C', as prediction decisions would be the same.

> I think avoiding states_equal(V, cur) when V state didn't converge
> is a mandatory part of the fix.

But there are no other heuristics that suggest that exploration of the
infinite series of loop iterations could be stopped for the current path.

> I can see that little bit of out-of-order state processing probably
> is still correct, but delaying loop_stack all the way until 
> env->stack.size == 0 is just broken.
> For example we can push looping state in process_iter_next_call()
> not the top of the stack, but place it after branches with
> insn_idx >= iter_next's insn_idx.

This should be based on state parentage chain as code blocks could be
non-linearly structured.

> This way the delayed looping states will still be processed
> around loop processing and every code block before the loop
> will converge to branches = 0.
> To do that we don't need another loop_stack.

Will comment in the morning.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-04  1:05                                                                 ` Eduard Zingerman
@ 2023-10-04  2:57                                                                   ` Alexei Starovoitov
  2023-10-04  5:50                                                                     ` Alexei Starovoitov
  2023-10-04 11:52                                                                     ` Eduard Zingerman
  0 siblings, 2 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2023-10-04  2:57 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Wed, Oct 04, 2023 at 04:05:30AM +0300, Eduard Zingerman wrote:
> On Tue, 2023-10-03 at 16:08 -0700, Alexei Starovoitov wrote:
> [...]
> > > When bpf_iter_next() is reached in state C and states_equal() shows
> > > that there is potentially equivalent state V:
> > > - copy C' of C is created using copy_verifier_state(), it updates all
> > >   branch counters up the ownership chain as with any other state;
> > > - C' is put to env->loop_stack.
> > 
> > and C' points to normal parent and increments its branches as part of __push_stack().
> 
> Yes.
> 
> > void foo()
> > {
> >   if (...) { .. } // no pruning
> >   if (...) { .. } // no pruning
> > 
> >   bpf_for(...)
> >   if (...) { .. } // all ok
> > }
> > 
> > Essentially any complex code before the loop has a high chance of the verifier
> > state explosion.
> 
> Yes. This may be an issue. I'll try to hack a layered variant I talked
> before to see what are the underlying issues. The idea is to verify
> every child state of the loop entry before moving to any branches of
> loop's parent states. 

Isn't that what current DFS logic does?

> This would allow to compute local "steady loop"
> and mark everything after loop entry as safe (.branches == 0).
> 
> > > It is possible and correct to propagate liveness and precision from V
> > > to C when loop steady state is achieved, as at that point we know for
> > > sure that C is a sub-state of V. However, currently loop states are
> > > tracked globally and no states are processed after loops steady state
> > > is reached, hence I don't do liveness and precision propagation.
> > 
> > Hmm. I think the code is doing it:
> > if (is_iter_next_insn(env, insn_idx)) {
> >   if (states_equal(env, &sl->state, cur)) {
> >     push_loop_stack(env, insn_idx, env->prev_insn_idx);
> >     goto hit;
> > hit:
> >   propagate_liveness()
> 
> Yes and this should be changed. It should not produce incorrect
> results as additional precision and read marks are conservative,
> but will hinder state pruning in some scenarios.
> 
> > DFS + BFS traversal doesn't guarantee safety.
> > We delayed looping states in loop_stack, but it doesn't mean
> > that the loop body reached a steady state.
> > Your own num_iter_bug() is an example.
> > The verifier will miss exploring path with r7 = 0xdead.
> > When C' is popped from loop_stack there is chance it will explore them,
> > but there is no guarantee.
> > For this particular num_iter_bug() case the loop processing
> > will eventually propagate precision marks and retrying with C' later,
> > the C' will be correctly rejected, since precision marks are there.
> > But I think that is more due to luck and the shape of the test.
> > 
> > if (is_iter_next_insn(env, insn_idx)) {
> >   if (states_equal(env, &sl->state, cur)) {
> > 
> > is a foot gun. It finds broken equivalence and potentially skipping
> > whole blocks of code.
> > What guarantees that C' from loop_stack will explore them?
> 
> The idea is that when `env->stack.size == 0` all V's non-delayed
> children states are explored and relevant precision marks are
> propagated to V. 

But that is not true.
if (is_iter_next_insn(env, insn_idx)) {
  if (states_equal(env, &sl->state, cur)) {
with sl->state.branches > 0
will prevent exploration of all children.
Hence I still believe that fine-tuning this equivalence check is
the first step in any fix.

> If at this point some states_equal(V, C') is false it
> is necessary to schedule C' for one more exploration round as it might
> visit some code blocks that were not visited on the path from V to C'
> (as different prediction decisions could be made)

Exactly my point above,
but because of the broken double 'if' above the 2nd pass might hit
the same issues as the first: because some path wasn't explored,
the precision marks might still be wrong.

More loop states can be created and delayed again into loop_stack?

> If a point is reached when for all loop states C' there are some
> states V such that states_equal(V, C'), there are no more
> configurations that can visit code blocks not visited on the path from
> V to C', as prediction decisions would be the same.

and if not, we can repeat loop_stack->stack->loop_stack potentially forever?

> > I think avoiding states_equal(V, cur) when V state didn't converge
> > is a mandatory part of the fix.
> 
> But there are no other heuristics that suggest that exploration of the
> infinite series of loop iterations could be stopped for current path.

Did you look at my broken patch? Yes, it's buggy, but it demonstrates
the idea: it removes the problematic double 'if' above and uses
states_equal() on states where st->branches==0 && st->looping_states==1,
meaning that in those states all paths were explored except the one
forked by the iter_next kfunc.
So precision and liveness are correct there,
while doing states_equal on a V with branches=2+ is broken: it's a guess.
It could just be random true/false, and the algorithm of delaying looping
states would probably converge the same way on the tests we have so far.

> > I can see that little bit of out-of-order state processing probably
> > is still correct, but delaying loop_stack all the way until 
> > env->stack.size == 0 is just broken.
> > For example we can push looping state in process_iter_next_call()
> > not the top of the stack, but place it after branches with
> > insn_idx >= iter_next's insn_idx.
> 
> This should be based on state parentage chain as code blocks could be
> non-linearly structured.

ok. discard that idea.

> > This way the delayed looping states will still be processed
> > around loop processing and every code block before the loop
> > will converge to branches = 0.
> > To do that we don't need another loop_stack.
> 
> Will comment in the morning.


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-04  2:57                                                                   ` Alexei Starovoitov
@ 2023-10-04  5:50                                                                     ` Alexei Starovoitov
  2023-10-04  9:49                                                                       ` Eduard Zingerman
  2023-10-04 11:52                                                                     ` Eduard Zingerman
  1 sibling, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2023-10-04  5:50 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

[-- Attachment #1: Type: text/plain, Size: 1146 bytes --]

On Tue, Oct 3, 2023 at 7:57 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> ok. discard that idea.

Attached is a 3rd version of the same idea I argued earlier.
Let normal DFS go as normal,
do states_equal() on V which has 1 looping branch remaining
and all others explored.
To achieve that when iter_next() is seen do parent->looping_states += 2;

then when processing any children do parent->looping_states++;
in the correct parent.
Since there could be many intermediate states, we have to walk back the
parentage chain to increment the correct parent.
When the state reaches bpf_exit or safety, walk back
the parentage chain and do looping_states--.
The state is ok to use in states_equal() if looping_states==1.

With this patch all existing iter tests still pass,
and all Ed's special tests pass or fail as needed.
Ex: loop_state_deps1 is rejected with misaligned stack,
loop1 loads with success, num_iter_bug fails with bad pointer.

Please review.
I could just be lucky with the way the tests are constructed,
but I feel this is a better path to fix this issue than the
DFS/BFS combo that I have doubts about.

[-- Attachment #2: 0001-iter-hack-3.patch --]
[-- Type: application/octet-stream, Size: 6161 bytes --]

From 4e01fce2666188b2f203c8b8047a40a9116900c6 Mon Sep 17 00:00:00 2001
From: Alexei Starovoitov <ast@kernel.org>
Date: Mon, 2 Oct 2023 18:30:23 -0700
Subject: [PATCH] iter hack 3

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf_verifier.h |  1 +
 kernel/bpf/verifier.c        | 48 +++++++++++++++++++++++++++++++++---
 2 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 94ec766432f5..34f7de583aae 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -367,6 +367,7 @@ struct bpf_verifier_state {
 	 * In such cases BPF_COMPLEXITY_LIMIT_INSNS limit kicks in.
 	 */
 	u32 branches;
+	u32 looping_states;
 	u32 insn_idx;
 	u32 curframe;
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index eed7350e15f4..fb52d41b70f3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1762,6 +1762,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 	dst_state->active_lock.ptr = src->active_lock.ptr;
 	dst_state->active_lock.id = src->active_lock.id;
 	dst_state->branches = src->branches;
+	dst_state->looping_states = src->looping_states;
 	dst_state->parent = src->parent;
 	dst_state->first_insn_idx = src->first_insn_idx;
 	dst_state->last_insn_idx = src->last_insn_idx;
@@ -1782,9 +1783,23 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 
 static void update_branch_counts(struct bpf_verifier_env *env, struct bpf_verifier_state *st)
 {
+	struct bpf_verifier_state *save = st;
+
+	while (st) {
+		if (st->looping_states) {
+			st->looping_states--;
+			verbose(env, "update_br1 %lx branches=%d/%d\n",
+				((long)st) >> 3 & 0xFFFF, st->branches, st->looping_states);
+		}
+		st = st->parent;
+	}
+	st = save;
 	while (st) {
 		u32 br = --st->branches;
 
+		verbose(env, "update_br2 %lx branches=%d/%d\n",
+			((long)st) >> 3 & 0xFFFF, st->branches, st->looping_states);
+
 		/* WARN_ON(br > 1) technically makes sense here,
 		 * but see comment in push_stack(), hence:
 		 */
@@ -1811,6 +1826,8 @@ static int pop_stack(struct bpf_verifier_env *env, int *prev_insn_idx,
 		err = copy_verifier_state(cur, &head->st);
 		if (err)
 			return err;
+		verbose(env, "%d: pop_stack %lx branches=%d/%d\n", head->insn_idx,
+			((long)&head->st) >> 3 & 0xFFFF, cur->branches, cur->looping_states);
 	}
 	if (pop_log)
 		bpf_vlog_reset(&env->log, head->log_pos);
@@ -1826,6 +1843,16 @@ static int pop_stack(struct bpf_verifier_env *env, int *prev_insn_idx,
 	return 0;
 }
 
+static void increase_looping(struct bpf_verifier_state *st)
+{
+	while (st) {
+		if (st->looping_states) {
+			st->looping_states++;
+			break;
+		}
+		st = st->parent;
+	}
+}
 static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
 					     int insn_idx, int prev_insn_idx,
 					     bool speculative)
@@ -1847,6 +1874,9 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
 	err = copy_verifier_state(&elem->st, cur);
 	if (err)
 		goto err;
+	verbose(env, "%d: push_stack %lx branches=%d/%d parent %lx\n", insn_idx,
+		((long)&elem->st) >> 3 & 0xFFFF, cur->branches, cur->looping_states,
+		((long)elem->st.parent) >> 3 & 0xFFFF);
 	elem->st.speculative |= speculative;
 	if (env->stack_size > BPF_COMPLEXITY_LIMIT_JMP_SEQ) {
 		verbose(env, "The sequence of %d jumps is too complex.\n",
@@ -1855,6 +1885,7 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
 	}
 	if (elem->st.parent) {
 		++elem->st.parent->branches;
+		increase_looping(elem->st.parent);
 		/* WARN_ON(branches > 2) technically makes sense here,
 		 * but
 		 * 1. speculative states will bump 'branches' for non-branch
@@ -7734,6 +7765,11 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
 		if (!queued_st)
 			return -ENOMEM;
 
+		queued_st->parent->looping_states += 2;
+		verbose(env, "process_iter_next_call %lx branches=%d/%d\n",
+			((long)queued_st) >> 3 & 0xFFFF, queued_st->branches, queued_st->looping_states);
+		verbose(env, "process_iter_next_call parent %lx branches=%d/%d\n",
+			((long)queued_st->parent) >> 3 & 0xFFFF, queued_st->parent->branches, queued_st->parent->looping_states);
 		queued_iter = &queued_st->frame[iter_frameno]->stack[iter_spi].spilled_ptr;
 		queued_iter->iter.state = BPF_ITER_STATE_ACTIVE;
 		queued_iter->iter.depth++;
@@ -16451,7 +16487,11 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 		if (sl->state.insn_idx != insn_idx)
 			goto next;
 
-		if (sl->state.branches) {
+		verbose(env, "search %lx branches=%d/%d ", ((long)&sl->state) >> 3 & 0xFFFF,
+			sl->state.branches, sl->state.looping_states);
+		if (sl->state.branches &&
+		    !(sl->state.looping_states == 1 &&
+		      is_iter_next_insn(env, insn_idx))) {
 			struct bpf_func_state *frame = sl->state.frame[sl->state.curframe];
 
 			if (frame->in_async_callback_fn &&
@@ -16481,7 +16521,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 			 * account iter_next() contract of eventually returning
 			 * sticky NULL result.
 			 */
-			if (is_iter_next_insn(env, insn_idx)) {
+			if (0 && is_iter_next_insn(env, insn_idx)) {
 				if (states_equal(env, &sl->state, cur)) {
 					struct bpf_func_state *cur_frame;
 					struct bpf_reg_state *iter_state, *iter_reg;
@@ -16638,6 +16678,8 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 		kfree(new_sl);
 		return err;
 	}
+	verbose(env, "%d: add_state %lx branches=%d/%d\n", insn_idx, ((long)new) >> 3 & 0xFFFF, new->branches, new->looping_states);
+
 	new->insn_idx = insn_idx;
 	WARN_ONCE(new->branches != 1,
 		  "BUG is_state_visited:branches_to_explore=%d insn %d\n", new->branches, insn_idx);
@@ -16779,7 +16821,7 @@ static int do_check(struct bpf_verifier_env *env)
 		insn = &insns[env->insn_idx];
 		class = BPF_CLASS(insn->code);
 
-		if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
+		if (++env->insn_processed > 1000) {//BPF_COMPLEXITY_LIMIT_INSNS) {
 			verbose(env,
 				"BPF program is too large. Processed %d insn\n",
 				env->insn_processed);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-04  5:50                                                                     ` Alexei Starovoitov
@ 2023-10-04  9:49                                                                       ` Eduard Zingerman
  0 siblings, 0 replies; 52+ messages in thread
From: Eduard Zingerman @ 2023-10-04  9:49 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

[-- Attachment #1: Type: text/plain, Size: 2256 bytes --]

On Tue, 2023-10-03 at 22:50 -0700, Alexei Starovoitov wrote:
> On Tue, Oct 3, 2023 at 7:57 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> > 
> > ok. discard that idea.
> 
> Attached is a 3rd version of the same idea I argued earlier.
> Let the normal DFS proceed as usual, but only do states_equal() on a
> state V once it has exactly one looping branch remaining and all
> others explored.
> To achieve that, when iter_next() is seen do parent->looping_states += 2;
> 
> then when processing any children do parent->looping_states++
> in the correct parent.
> Since there can be many intermediate states, we have to walk back the
> parentage chain to increment the correct parent.
> When a state reaches bpf_exit or safety, walk back
> the parentage chain and do looping_states--.
> The state is ok to use in states_equal() if looping_states == 1.

But what if each next iteration spawns more than two looping states?
E.g. for the following example:

  0. // full test attached as a patch
  1. while (next(i)) {
  2.   if (random())
  3.     continue;
  4.   if (random())
  5.     continue;
  6.   if (random())
  7.     continue;
  8.   r0 += 0;
  9. }

For me it bails out with the following message:

    run_subtest:FAIL:unexpected_load_failure unexpected error: -28
    ....
    The sequence of 8193 jumps is too complex.
    processed 49161 insns (limit 1000000) max_states_per_insn 4 total_states 2735 peak_states 2735 mark_read 2

(I bumped insn complexity limit back to 1,000,000).

> With this patch all existing iter tests still pass,
> and all Ed's special tests pass or fail as needed.
> Ex: loop_state_deps1 is rejected with misaligned stack,
> loop1 loads with success, num_iter_bug fails with bad pointer.

Are you sure that correct version of the patch was shared?
I get the following log for loop_state_deps1:

    run_subtest:FAIL:unexpected_load_success unexpected success: 0
    from 21 to 22: safe
    ...
    update_br2 80c0 branches=0/0
    processed 75 insns (limit 1000000) max_states_per_insn 2 total_states 29 peak_states 29 mark_read 9
    =============
    #104/26  iters/loop_state_deps1:FAIL

The test case is marked as safe.
iter_precision_fixed_point{1,2} and num_iter_bug work as expected.

[-- Attachment #2: hydra1.patch --]
[-- Type: text/x-patch, Size: 1646 bytes --]

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 019c8dde5fa2..cf9041fac272 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -16821,7 +16821,7 @@ static int do_check(struct bpf_verifier_env *env)
 		insn = &insns[env->insn_idx];
 		class = BPF_CLASS(insn->code);
 
-		if (++env->insn_processed > 1000) {//BPF_COMPLEXITY_LIMIT_INSNS) {
+		if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
 			verbose(env,
 				"BPF program is too large. Processed %d insn\n",
 				env->insn_processed);
diff --git a/tools/testing/selftests/bpf/progs/iters.c b/tools/testing/selftests/bpf/progs/iters.c
index 9add71d79a3a..f08288c0b74a 100644
--- a/tools/testing/selftests/bpf/progs/iters.c
+++ b/tools/testing/selftests/bpf/progs/iters.c
@@ -991,4 +991,42 @@ int num_iter_bug(const void *ctx) {
      return 0;
 }
 
+SEC("?raw_tp")
+__success
+__naked int hydra1(void)
+{
+	asm volatile (
+		"r1 = r10;"
+		"r1 += -8;"
+		"r2 = 0;"
+		"r3 = 10;"
+		"call %[bpf_iter_num_new];"
+	"loop_%=:"
+		"r1 = r10;"
+		"r1 += -8;"
+		"call %[bpf_iter_num_next];"
+		"if r0 == 0 goto loop_end_%=;"
+		"call %[bpf_get_prandom_u32];"
+		"if r0 != 42 goto loop_%=;"
+		"call %[bpf_get_prandom_u32];"
+		"if r0 != 42 goto loop_%=;"
+		"call %[bpf_get_prandom_u32];"
+		"if r0 != 42 goto loop_%=;"
+		"r0 += 0;"
+		"goto loop_%=;"
+	"loop_end_%=:"
+		"r1 = r10;"
+		"r1 += -8;"
+		"call %[bpf_iter_num_destroy];"
+		"r0 = 0;"
+		"exit;"
+		:
+		: __imm(bpf_get_prandom_u32),
+		  __imm(bpf_iter_num_new),
+		  __imm(bpf_iter_num_next),
+		  __imm(bpf_iter_num_destroy)
+		: __clobber_all
+	);
+}
+
 char _license[] SEC("license") = "GPL";


* Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
  2023-10-04  2:57                                                                   ` Alexei Starovoitov
  2023-10-04  5:50                                                                     ` Alexei Starovoitov
@ 2023-10-04 11:52                                                                     ` Eduard Zingerman
  1 sibling, 0 replies; 52+ messages in thread
From: Eduard Zingerman @ 2023-10-04 11:52 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Andrew Werner, bpf, Andrei Matei,
	Tamir Duberstein, Joanne Koong, kernel-team, Song Liu,
	Kumar Kartikeya Dwivedi

On Tue, 2023-10-03 at 19:57 -0700, Alexei Starovoitov wrote:
> > Yes. This may be an issue. I'll try to hack up the layered variant I
> > talked about before to see what the underlying issues are. The idea
> > is to verify every child state of the loop entry before moving to
> > any branches of the loop's parent states.
> 
> Isn't that what current DFS logic does?

With a twist: it should wait for local loop convergence before
declaring the loop entry state safe. I described it in [1].
Upon entry to a top level loop state E:
1. do DFS for E, disallow switching to any branches that are not
   children of E, collect all loop states generated by E's DFS;
2. reschedule loop states;
3. do DFS for rescheduled states, collect all loop states generated in
   the process;
4. repeat from 2 until no loop state can be rescheduled;
5. mark E safe, proceed from the regular states stack.

[1] https://lore.kernel.org/bpf/8b75e01d27696cd6661890e49bdc06b1e96092c7.camel@gmail.com/

> > The idea is that when `env->stack.size == 0` all V's non-delayed
> > children states are explored and relevant precision marks are
> > propagated to V. 
> 
> But that is not true.
> if (is_iter_next_insn(env, insn_idx)) {
>   if (states_equal(env, &sl->state, cur)) {
> with sl->state.branches > 0
> will prevent exploration of all children.

Yes, it would postpone exploration of some children, but they would be
given a chance to be rescheduled.

> Hence I still believe that fine-tuning this equivalence check is
> the first step in any fix.

Maybe, but so far states_equal(V, C) seems ok:
1. values read by V are defined in C;
2. non-precise scalars are equivalent unconditionally;
3. precise scalars in C are sub-ranges of precise scalars in V;
4. non-scalar values are compared exactly (modulo idmap).

Which of these points are overly optimistic in your opinion?

> > If at this point some states_equal(V, C') is false it
> > is necessary to schedule C' for one more exploration round as it might
> > visit some code blocks that were not visited on the path from V to C'
> > (as different prediction decisions could be made).
> 
> exactly my point above,
> but because of the broken double 'if' above the 2nd pass might be
> hitting the same issues as the first. Because some path wasn't
> explored, the precision marks might still be wrong.
> 
> More loop states can be created and delayed again into loop_stack?

Yes, and a new reschedule round would be done, and so on.

> > If a point is reached when for all loop states C' there are some
> > states V such that states_equal(V, C'), there are no more
> > configurations that can visit code blocks not visited on the path from
> > V to C', as prediction decisions would be the same.
> 
> and if not, we can repeat loop_stack->stack->loop_stack potentially forever?

It might be the case that such logic would not converge, e.g. as in
the following simple example:

  it = iter_new();
  i = 0;
  while (iter_next(it)) {
    if (i & 0x1 == 0)
      printk("'i' is precise and different in each state");
    ++i;
  }
  iter_destroy(it);

However:
- we have such a limitation already;
- examples like this don't play well with the current states pruning
  logic in general (I think that doing some opportunistic widening of
  precise values is an interesting idea (losing or worsening precision
  with the ability to backtrack on failure), but this is a different
  topic).

If the loop logic does not converge, the verifier reports an error:
either the instruction or the jump limit is reached.

For the cases where this logic does converge, we end up in a state
where no new read or precision marks can be gained; thus it is safe to
accept states_equal() == true verdicts.

> Did you look at my broken patch? yes. it's buggy, but it demonstrates
> the idea where it removes above problematic double 'if' and uses
> states_equal() on states where st->branches==0 && st->looping_states==1

<replied in a different email>

> meaning that in those states all paths were explored except the one
> forked by iter_next kfunc.
> So precision and liveness are correct,

That's probably true.

> while doing states_equal on V with branches=2+ is broken.

Not that it is broken, but its meaning is "maybe a sub-state" instead
of "definitely a sub-state".


end of thread, other threads:[~2023-10-04 11:52 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-07 14:04 [BUG] verifier escape with iteration helpers (bpf_loop, ...) Andrew Werner
2023-07-07 16:44 ` Alexei Starovoitov
2023-07-07 18:08   ` Andrii Nakryiko
2023-07-07 18:21     ` Andrew Werner
2023-09-17 21:37     ` Eduard Zingerman
2023-09-17 22:09       ` Kumar Kartikeya Dwivedi
2023-09-18 13:06         ` Eduard Zingerman
2023-09-19 16:28           ` Eduard Zingerman
2023-09-19 23:02             ` Andrii Nakryiko
2023-09-20  0:19               ` Eduard Zingerman
2023-09-20 16:20                 ` Eduard Zingerman
2023-09-20 16:57                   ` Andrii Nakryiko
2023-09-21  9:14                 ` Alexei Starovoitov
2023-09-21 11:03                   ` Eduard Zingerman
2023-09-21 12:56                     ` Alexei Starovoitov
2023-09-21 16:23                       ` Eduard Zingerman
2023-09-21 16:35                         ` Andrii Nakryiko
2023-09-21 18:16                           ` Eduard Zingerman
2023-09-22  1:01                             ` Eduard Zingerman
2023-09-22  2:48                               ` Andrii Nakryiko
2023-09-22 18:36                                 ` Eduard Zingerman
2023-09-22 20:52                                   ` Andrii Nakryiko
2023-09-25  1:01                                     ` Eduard Zingerman
2023-09-26  0:33                                       ` Andrii Nakryiko
2023-09-26 15:55                                         ` Eduard Zingerman
2023-09-26 16:25                                           ` Andrii Nakryiko
2023-09-28  1:09                                             ` Eduard Zingerman
2023-09-28 18:30                                               ` Andrii Nakryiko
2023-10-02  3:26                                                 ` Eduard Zingerman
2023-09-30  0:41                                               ` Alexei Starovoitov
2023-10-02  1:40                                                 ` Eduard Zingerman
2023-10-02 16:29                                                   ` Alexei Starovoitov
2023-10-02 17:18                                                     ` Eduard Zingerman
2023-10-03  0:05                                                       ` Alexei Starovoitov
2023-10-03  2:00                                                         ` Alexei Starovoitov
2023-10-03 15:33                                                         ` Eduard Zingerman
2023-10-03 16:07                                                           ` Alexei Starovoitov
2023-10-03 18:50                                                           ` Alexei Starovoitov
2023-10-03 21:52                                                             ` Eduard Zingerman
2023-10-03 22:03                                                               ` Eduard Zingerman
2023-10-03 23:08                                                               ` Alexei Starovoitov
2023-10-03 23:14                                                                 ` Eduard Zingerman
2023-10-04  0:22                                                                   ` Andrii Nakryiko
2023-10-04  1:05                                                                 ` Eduard Zingerman
2023-10-04  2:57                                                                   ` Alexei Starovoitov
2023-10-04  5:50                                                                     ` Alexei Starovoitov
2023-10-04  9:49                                                                       ` Eduard Zingerman
2023-10-04 11:52                                                                     ` Eduard Zingerman
2023-09-19 23:14       ` Andrii Nakryiko
2023-09-20  0:06         ` Eduard Zingerman
2023-09-20 16:37           ` Andrii Nakryiko
2023-09-20 17:13             ` Eduard Zingerman
