All of lore.kernel.org
 help / color / mirror / Atom feed
From: YiFei Zhu <zhuyifei@google.com>
To: bpf <bpf@vger.kernel.org>, Yonghong Song <yhs@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>, Fangrui Song <maskray@google.com>
Subject: Re: BPF CO-RE and array fields in context struct
Date: Mon, 22 Nov 2021 12:44:58 -0800	[thread overview]
Message-ID: <CAA-VZPmxh8o8EBcJ=m-DH4ytcxDFmo0JKsm1p1gf40kS0CE3NQ@mail.gmail.com> (raw)
In-Reply-To: <CAA-VZPniKnO4ZkYztkt0uL0s5TdKuwTRvoz5KORJg+MY-bVcHw@mail.gmail.com>

On Mon, Nov 22, 2021 at 8:19 AM YiFei Zhu <zhuyifei@google.com> wrote:
>
> Hi
>
> I've been investigating the use of BPF CO-RE. I discovered that if I
> include vmlinux.h and have all structures annotated with
> __attribute__((preserve_access_index)), including the context struct,
> then a prog that accesses an array field in the context struct, in
> some particular way, cannot pass the verifier.
>
> A bunch of manual reduction plus creduce gives me this output:
>
>   struct bpf_sock_ops {
>     int family;
>     int remote_ip6[];
>   } __attribute__((preserve_access_index));
>   __attribute__((section("sockops"))) int b(struct bpf_sock_ops *d) {
>     int a = d->family;
>     int *c = d->remote_ip6;
>     c[2] = a;
>     return 0;
>   }
>
> With Debian clang version 11.1.0-4+build1, this compiles to
>
>   0000000000000000 <b>:
>          0: b7 02 00 00 04 00 00 00 r2 = 4
>          1: bf 13 00 00 00 00 00 00 r3 = r1
>          2: 0f 23 00 00 00 00 00 00 r3 += r2
>          3: 61 11 00 00 00 00 00 00 r1 = *(u32 *)(r1 + 0)
>          4: 63 13 08 00 00 00 00 00 *(u32 *)(r3 + 8) = r1
>          5: b7 00 00 00 00 00 00 00 r0 = 0
>          6: 95 00 00 00 00 00 00 00 exit
>
> And the prog will be rejected with this verifier log:
>
>   ; __attribute__((section("sockops"))) int b(struct bpf_sock_ops *d) {
>   0: (b7) r2 = 32
>   1: (bf) r3 = r1
>   2: (0f) r3 += r2
>   last_idx 2 first_idx 0
>   regs=4 stack=0 before 1: (bf) r3 = r1
>   regs=4 stack=0 before 0: (b7) r2 = 32
>   ; int a = d->family;
>   3: (61) r1 = *(u32 *)(r1 +20)
>   ; c[2] = a;
>   4: (63) *(u32 *)(r3 +8) = r1
>   dereference of modified ctx ptr R3 off=32 disallowed
>   processed 5 insns (limit 1000000) max_states_per_insn 0 total_states
> 0 peak_states 0 mark_read 0
>
> Looking at check_ctx_reg() and its callsite at check_mem_access() in
> verifier.c, it seems that the verifier really does not like when the
> context pointer has an offset, in this case the field offset of
> d->remote_ip6.
>
> I thought this is just an issue with array fields, that field offset
> relocations may have trouble expressing two field accesses (one struct
> member, one array memory). However, further testing reveals that this
> is not the case, because if I simplify out the local variables, the
> error is gone:
>
>   struct bpf_sock_ops {
>     int family;
>     int remote_ip6[];
>   } __attribute__((preserve_access_index));
>   __attribute__((section("sockops"))) int b(struct bpf_sock_ops *d) {
>     d->remote_ip6[2] = d->family;
>     return 0;
>   }
>
> is compiled to:
>
>   0000000000000000 <b>:
>          0: 61 12 00 00 00 00 00 00 r2 = *(u32 *)(r1 + 0)
>          1: 63 21 0c 00 00 00 00 00 *(u32 *)(r1 + 12) = r2
>          2: b7 00 00 00 00 00 00 00 r0 = 0
>          3: 95 00 00 00 00 00 00 00 exit
>
> and is loaded as:
>
>   ; d->remote_ip6[2] = d->family;
>   0: (61) r2 = *(u32 *)(r1 +20)
>   ; d->remote_ip6[2] = d->family;
>   1: (63) *(u32 *)(r1 +40) = r2
>   invalid bpf_context access off=40 size=4
>
> I believe this error is because d->remote_ip6 is read-only, that this
> modification might be more of a product of creduce, but we can see
> that the CO-RE adjected offset of the array element from the context
> pointer is correct: 32 to remote_ip6, 8 array index, so total offset
> is 40.
>
> Also note that removal of __attribute__((preserve_access_index)) from
> the first (miscompiled) program produces exactly the same bytecode as
> this new program (with no locals).
>
> What is going on here? Why does the access of an array in context in
> this particular way cause it to generate code that would not pass the
> verifier? Is it a bug in Clang/LLVM, or is it the verifier being too
> strict?

Additionally, testing the latest LLVM main branch, this test case,
which does not touch array fields at all, fails but succeeded with
clang version 11.1.0:

  struct bpf_sock_ops {
    int op;
    int bpf_sock_ops_cb_flags;
  } __attribute__((preserve_access_index));
  enum { a, b } static (*c)() = (void *)9;
  enum d { e } f;
  enum d g;
  __attribute__((section("sockops"))) int h(struct bpf_sock_ops *i) {
    switch (i->op) {
    case a:
      f = g = c(i, i->bpf_sock_ops_cb_flags);
      break;
    case b:
      f = g = c(i, i->bpf_sock_ops_cb_flags);
    }
    return 0;
  }

The bad code generation of latest LLVM:

  0000000000000000 <h>:
         0: 61 12 00 00 00 00 00 00 r2 = *(u32 *)(r1 + 0)
         1: 15 02 01 00 01 00 00 00 if r2 == 1 goto +1 <LBB0_2>
         2: 55 02 0b 00 00 00 00 00 if r2 != 0 goto +11 <LBB0_3>

  0000000000000018 <LBB0_2>:
         3: b7 03 00 00 04 00 00 00 r3 = 4
         4: bf 12 00 00 00 00 00 00 r2 = r1
         5: 0f 32 00 00 00 00 00 00 r2 += r3
         6: 61 22 00 00 00 00 00 00 r2 = *(u32 *)(r2 + 0)
         7: 85 00 00 00 09 00 00 00 call 9
         8: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
        10: 63 01 00 00 00 00 00 00 *(u32 *)(r1 + 0) = r0
        11: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
        13: 63 01 00 00 00 00 00 00 *(u32 *)(r1 + 0) = r0

  0000000000000070 <LBB0_3>:
        14: b7 00 00 00 00 00 00 00 r0 = 0
        15: 95 00 00 00 00 00 00 00 exit

The good code generation of LLVM 11.1.0:

  0000000000000000 <h>:
         0: 61 12 00 00 00 00 00 00 r2 = *(u32 *)(r1 + 0)
         1: 25 02 08 00 01 00 00 00 if r2 > 1 goto +8 <LBB0_2>
         2: 61 12 04 00 00 00 00 00 r2 = *(u32 *)(r1 + 4)
         3: 85 00 00 00 09 00 00 00 call 9
         4: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
         6: 63 01 00 00 00 00 00 00 *(u32 *)(r1 + 0) = r0
         7: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
         9: 63 01 00 00 00 00 00 00 *(u32 *)(r1 + 0) = r0

  0000000000000050 <LBB0_2>:
        10: b7 00 00 00 00 00 00 00 r0 = 0
        11: 95 00 00 00 00 00 00 00 exit

A bisect points me to this commit in LLVM as the first bad commit:

  commit 54d9f743c8b0f501288119123cf1828bf7ade69c
  Author: Yonghong Song <yhs@fb.com>
  Date:   Wed Sep 2 22:56:41 2020 -0700

      BPF: move AbstractMemberAccess and PreserveDIType passes to
EP_EarlyAsPossible

      Move abstractMemberAccess and PreserveDIType passes as early as
      possible, right after clang code generation.

  [...]

      Differential Revision: https://reviews.llvm.org/D87153

YiFei Zhu

  reply	other threads:[~2021-11-22 20:45 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-22 16:19 BPF CO-RE and array fields in context struct YiFei Zhu
2021-11-22 20:44 ` YiFei Zhu [this message]
2021-11-23  0:24   ` Yonghong Song
2021-11-23 16:15     ` YiFei Zhu
2021-11-23 20:08       ` Yonghong Song
2021-11-23 20:14         ` YiFei Zhu
2021-11-22 23:56 ` Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAA-VZPmxh8o8EBcJ=m-DH4ytcxDFmo0JKsm1p1gf40kS0CE3NQ@mail.gmail.com' \
    --to=zhuyifei@google.com \
    --cc=bpf@vger.kernel.org \
    --cc=maskray@google.com \
    --cc=sdf@google.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.