All of lore.kernel.org
 help / color / mirror / Atom feed
* R11 is invalid with LLVM 12 and later
@ 2021-08-09 15:12 Paul Chaignon
  2021-08-09 22:53 ` Yonghong Song
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Chaignon @ 2021-08-09 15:12 UTC (permalink / raw)
  To: bpf; +Cc: Yonghong Song, Martynas Pumputis

Hello,

While trying to use LLVM 12.0.0 in Cilium, we've noticed that it can
generate invalid BPF bytecode:

    $ clang --version
    Ubuntu clang version 12.0.0-++20210409092622+fa0971b87fb2-1~exp1~20210409193326.73
    Target: x86_64-pc-linux-gnu
    Thread model: posix
    InstalledDir: /usr/bin
    $ make -C bpf -j6 KERNEL=419
    $ llvm-objdump -D -section=2/20 bpf/bpf_lxc.o | grep -i r11
         171:   7b ba 18 ff 00 00 00 00 *(u64 *)(r10 - 232) = r11
         436:   79 ab 18 ff 00 00 00 00 r11 = *(u64 *)(r10 - 232)
         484:   bf 8b 00 00 00 00 00 00 r11 = r8

That bytecode is of course rejected by the verifier:

    171: (7b) *(u64 *)(r10 -232) = r11
    R11 is invalid

LLVM 12.0.1 and latest LLVM sources (e.g., commit 2b4a1d4b from today)
have the same issue. We've bisected it to LLVM commit 552c6c23
("PR44406: Follow behavior of array bound constant folding in more
recent versions of GCC."), but that could just be the commit where
the regression was exposed in Cilium's case.

--
Paul

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: R11 is invalid with LLVM 12 and later
  2021-08-09 15:12 R11 is invalid with LLVM 12 and later Paul Chaignon
@ 2021-08-09 22:53 ` Yonghong Song
  2021-08-10  6:31   ` Yonghong Song
  0 siblings, 1 reply; 5+ messages in thread
From: Yonghong Song @ 2021-08-09 22:53 UTC (permalink / raw)
  To: Paul Chaignon, bpf; +Cc: Martynas Pumputis



On 8/9/21 8:12 AM, Paul Chaignon wrote:
> Hello,
> 
> While trying to use LLVM 12.0.0 in Cilium, we've noticed that it can
> generate invalid BPF bytecode:
> 
>      $ clang --version
>      Ubuntu clang version 12.0.0-++20210409092622+fa0971b87fb2-1~exp1~20210409193326.73
>      Target: x86_64-pc-linux-gnu
>      Thread model: posix
>      InstalledDir: /usr/bin
>      $ make -C bpf -j6 KERNEL=419
>      $ llvm-objdump -D -section=2/20 bpf/bpf_lxc.o | grep -i r11
>           171:   7b ba 18 ff 00 00 00 00 *(u64 *)(r10 - 232) = r11
>           436:   79 ab 18 ff 00 00 00 00 r11 = *(u64 *)(r10 - 232)
>           484:   bf 8b 00 00 00 00 00 00 r11 = r8
> 
> That bytecode is of course rejected by the verifier:
> 
>      171: (7b) *(u64 *)(r10 -232) = r11
>      R11 is invalid

Thanks for reporting. I can reproduce the problem and will take a look soon.

> 
> LLVM 12.0.1 and latest LLVM sources (e.g., commit 2b4a1d4b from today)
> have the same issue. We've bisected it to LLVM commit 552c6c23
> ("PR44406: Follow behavior of array bound constant folding in more
> recent versions of GCC."), but that could just be the commit where
> the regression was exposed in Cilium's case.
> 
> --
> Paul
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: R11 is invalid with LLVM 12 and later
  2021-08-09 22:53 ` Yonghong Song
@ 2021-08-10  6:31   ` Yonghong Song
  2021-08-11 16:54     ` Paul Chaignon
  0 siblings, 1 reply; 5+ messages in thread
From: Yonghong Song @ 2021-08-10  6:31 UTC (permalink / raw)
  To: Paul Chaignon, bpf; +Cc: Martynas Pumputis



On 8/9/21 3:53 PM, Yonghong Song wrote:
> 
> 
> On 8/9/21 8:12 AM, Paul Chaignon wrote:
>> Hello,
>>
>> While trying to use LLVM 12.0.0 in Cilium, we've noticed that it can
>> generate invalid BPF bytecode:
>>
>>      $ clang --version
>>      Ubuntu clang version 
>> 12.0.0-++20210409092622+fa0971b87fb2-1~exp1~20210409193326.73
>>      Target: x86_64-pc-linux-gnu
>>      Thread model: posix
>>      InstalledDir: /usr/bin
>>      $ make -C bpf -j6 KERNEL=419
>>      $ llvm-objdump -D -section=2/20 bpf/bpf_lxc.o | grep -i r11
>>           171:   7b ba 18 ff 00 00 00 00 *(u64 *)(r10 - 232) = r11
>>           436:   79 ab 18 ff 00 00 00 00 r11 = *(u64 *)(r10 - 232)
>>           484:   bf 8b 00 00 00 00 00 00 r11 = r8
>>
>> That bytecode is of course rejected by the verifier:
>>
>>      171: (7b) *(u64 *)(r10 -232) = r11
>>      R11 is invalid
> 
> Thanks for reporting. I can reproduce the problem and will take a look 
> soon.
> 
>>
>> LLVM 12.0.1 and latest LLVM sources (e.g., commit 2b4a1d4b from today)
>> have the same issue. We've bisected it to LLVM commit 552c6c23
>> ("PR44406: Follow behavior of array bound constant folding in more
>> recent versions of GCC."), but that could just be the commit where
>> the regression was exposed in Cilium's case.

The above commit is indeed responsible. With the above commit,
the variable length array compile time evaluation becomes conservative.
For cilium function dsr_reply_icmp4 in nodeport.h
   const __u32 l3_max = MAX_IPOPTLEN + sizeof(*ip4) + orig_dgram;
   __u8 tmp[l3_max];

I didn't try to compile with/without the above commit, the following
is the thesis.

Before the above commit, llvm evaluates the expression
   MAX_IPOPTLEN + sizeof(*ip4) + orig_dgram
and concludes that l3_max is a constant so tmp can be allocated
on stack with fixed size.

With the above commit, llvm becomes conservative to evaluate
the expression at compile time. So above l3_max becomes a
non-constant variable and tmp becomes a variable length
array. Since it becomes a variable length array, the
hidden stack pointer "r11" is used and this caused a problem
in verifier.

To workaround the issue, simply define "tmp" with
    __u8 tmp[68];
But that is not for from user experience. I guess we can do:
   1. for BPF target, let us still do aggressive constant folding
      in compile time in clang, basically restores to pre-commit-552c6c23
      level for BPF programs.
   2. provide an error message if r11 is generated in final code.

Will come back to this thread once I got concrete patches. Thanks!

>>
>> -- 
>> Paul
>>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: R11 is invalid with LLVM 12 and later
  2021-08-10  6:31   ` Yonghong Song
@ 2021-08-11 16:54     ` Paul Chaignon
  2021-08-12  1:23       ` Yonghong Song
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Chaignon @ 2021-08-11 16:54 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, Martynas Pumputis

On Mon, Aug 09, 2021 at 11:31:48PM -0700, Yonghong Song wrote:
> On 8/9/21 3:53 PM, Yonghong Song wrote:
> > On 8/9/21 8:12 AM, Paul Chaignon wrote:

[...]

> > > LLVM 12.0.1 and latest LLVM sources (e.g., commit 2b4a1d4b from today)
> > > have the same issue. We've bisected it to LLVM commit 552c6c23
> > > ("PR44406: Follow behavior of array bound constant folding in more
> > > recent versions of GCC."), but that could just be the commit where
> > > the regression was exposed in Cilium's case.
> 
> The above commit is indeed responsible. With the above commit,
> the variable length array compile time evaluation becomes conservative.
> For cilium function dsr_reply_icmp4 in nodeport.h
>   const __u32 l3_max = MAX_IPOPTLEN + sizeof(*ip4) + orig_dgram;
>   __u8 tmp[l3_max];
> 
> I didn't try to compile with/without the above commit, the following
> is the thesis.
> 
> Before the above commit, llvm evaluates the expression
>   MAX_IPOPTLEN + sizeof(*ip4) + orig_dgram
> and concludes that l3_max is a constant so tmp can be allocated
> on stack with fixed size.
> 
> With the above commit, llvm becomes conservative to evaluate
> the expression at compile time. So above l3_max becomes a
> non-constant variable and tmp becomes a variable length
> array. Since it becomes a variable length array, the
> hidden stack pointer "r11" is used and this caused a problem
> in verifier.
> 
> To workaround the issue, simply define "tmp" with
>    __u8 tmp[68];

Thanks Yonghong!  I've tested this workaround on the Cilium codebase
with all of our tested configurations.  I'm not seeing this R11 issue in
this BPF program or anywhere else anymore.

> But that is not for from user experience. I guess we can do:
>   1. for BPF target, let us still do aggressive constant folding
>      in compile time in clang, basically restores to pre-commit-552c6c23
>      level for BPF programs.
>   2. provide an error message if r11 is generated in final code.
> 
> Will come back to this thread once I got concrete patches. Thanks!

I'm happy to run any patch you have through the Cilium CI if that helps.

> 
> > > 
> > > -- 
> > > Paul
> > > 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: R11 is invalid with LLVM 12 and later
  2021-08-11 16:54     ` Paul Chaignon
@ 2021-08-12  1:23       ` Yonghong Song
  0 siblings, 0 replies; 5+ messages in thread
From: Yonghong Song @ 2021-08-12  1:23 UTC (permalink / raw)
  To: Paul Chaignon; +Cc: bpf, Martynas Pumputis



On 8/11/21 9:54 AM, Paul Chaignon wrote:
> On Mon, Aug 09, 2021 at 11:31:48PM -0700, Yonghong Song wrote:
>> On 8/9/21 3:53 PM, Yonghong Song wrote:
>>> On 8/9/21 8:12 AM, Paul Chaignon wrote:
> 
> [...]
> 
>>>> LLVM 12.0.1 and latest LLVM sources (e.g., commit 2b4a1d4b from today)
>>>> have the same issue. We've bisected it to LLVM commit 552c6c23
>>>> ("PR44406: Follow behavior of array bound constant folding in more
>>>> recent versions of GCC."), but that could just be the commit where
>>>> the regression was exposed in Cilium's case.
>>
>> The above commit is indeed responsible. With the above commit,
>> the variable length array compile time evaluation becomes conservative.
>> For cilium function dsr_reply_icmp4 in nodeport.h
>>    const __u32 l3_max = MAX_IPOPTLEN + sizeof(*ip4) + orig_dgram;
>>    __u8 tmp[l3_max];
>>
>> I didn't try to compile with/without the above commit, the following
>> is the thesis.
>>
>> Before the above commit, llvm evaluates the expression
>>    MAX_IPOPTLEN + sizeof(*ip4) + orig_dgram
>> and concludes that l3_max is a constant so tmp can be allocated
>> on stack with fixed size.
>>
>> With the above commit, llvm becomes conservative to evaluate
>> the expression at compile time. So above l3_max becomes a
>> non-constant variable and tmp becomes a variable length
>> array. Since it becomes a variable length array, the
>> hidden stack pointer "r11" is used and this caused a problem
>> in verifier.
>>
>> To workaround the issue, simply define "tmp" with
>>     __u8 tmp[68];
> 
> Thanks Yonghong!  I've tested this workaround on the Cilium codebase
> with all of our tested configurations.  I'm not seeing this R11 issue in
> this BPF program or anywhere else anymore.

That is great!

> 
>> But that is not for from user experience. I guess we can do:
>>    1. for BPF target, let us still do aggressive constant folding
>>       in compile time in clang, basically restores to pre-commit-552c6c23
>>       level for BPF programs.
>>    2. provide an error message if r11 is generated in final code.
>>
>> Will come back to this thread once I got concrete patches. Thanks!
> 
> I'm happy to run any patch you have through the Cilium CI if that helps.

Thanks for testing. Could you help try whether the patch 
https://reviews.llvm.org/D107882 can fix the problem or not?
You might need to add -Wno-gnu-folding-constant to silence
array size constant folding warnings.
If this indeed fix the cilium, could you add a comment
to the above llvm patch?

> 
>>
>>>>
>>>> -- 
>>>> Paul
>>>>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-08-12  1:23 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-09 15:12 R11 is invalid with LLVM 12 and later Paul Chaignon
2021-08-09 22:53 ` Yonghong Song
2021-08-10  6:31   ` Yonghong Song
2021-08-11 16:54     ` Paul Chaignon
2021-08-12  1:23       ` Yonghong Song

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.