netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Self-XORing BPF registers is undefined behavior
@ 2018-12-13 11:00 Alexander Potapenko
  2018-12-13 11:06 ` Eric Dumazet
  2018-12-13 11:59 ` Michal Kubecek
  0 siblings, 2 replies; 26+ messages in thread
From: Alexander Potapenko @ 2018-12-13 11:00 UTC (permalink / raw)
  To: ast, daniel; +Cc: Dmitriy Vyukov, Networking

Hi BPF maintainers,

some time ago KMSAN found an issue in BPF code which we decided to
suppress at that point, but now I'd like to bring it to your
attention.
Namely, some BPF programs may contain instructions that XOR a register
with itself.
This effectively results in the following C code:
  regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
or
  regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X];
being executed.

According to the C11 standard this is undefined behavior, so KMSAN
reports an error in this case.

Do you think it's feasible to explicitly initialize the register
values like it's done here:
https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d
?

Thanks,
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2018-12-13 11:00 Self-XORing BPF registers is undefined behavior Alexander Potapenko
@ 2018-12-13 11:06 ` Eric Dumazet
  2018-12-13 11:23   ` Alexander Potapenko
  2018-12-13 11:59 ` Michal Kubecek
  1 sibling, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2018-12-13 11:06 UTC (permalink / raw)
  To: Alexander Potapenko, ast, daniel; +Cc: Dmitriy Vyukov, Networking



On 12/13/2018 03:00 AM, Alexander Potapenko wrote:
> Hi BPF maintainers,
> 
> some time ago KMSAN found an issue in BPF code which we decided to
> suppress at that point, but now I'd like to bring it to your
> attention.
> Namely, some BPF programs may contain instructions that XOR a register
> with itself.
> This effectively results in the following C code:
>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> or
>   regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X];
> being executed.
> 
> According to the C11 standard this is undefined behavior, so KMSAN
> reports an error in this case.

eBPF is not C11 ;)

XOR boolean operation on a cpu is following boolean logic, which is much stronger than
any C standard.


> 
> Do you think it's feasible to explicitly initialize the register
> values like it's done here:
> https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d
> ?
> 
> Thanks,
> Alexander Potapenko
> Software Engineer
> 
> Google Germany GmbH
> Erika-Mann-Straße, 33
> 80636 München
> 
> Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2018-12-13 11:06 ` Eric Dumazet
@ 2018-12-13 11:23   ` Alexander Potapenko
  0 siblings, 0 replies; 26+ messages in thread
From: Alexander Potapenko @ 2018-12-13 11:23 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: ast, daniel, Dmitriy Vyukov, Networking

On Thu, Dec 13, 2018 at 12:06 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 12/13/2018 03:00 AM, Alexander Potapenko wrote:
> > Hi BPF maintainers,
> >
> > some time ago KMSAN found an issue in BPF code which we decided to
> > suppress at that point, but now I'd like to bring it to your
> > attention.
> > Namely, some BPF programs may contain instructions that XOR a register
> > with itself.
> > This effectively results in the following C code:
> >   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> > or
> >   regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X];
> > being executed.
> >
> > According to the C11 standard this is undefined behavior, so KMSAN
> > reports an error in this case.
>
> eBPF is not C11 ;)
And is planning to stay such forever? :)
I'm not a language lawyer, so I can't tell for sure if this is valid
in C99 either. I think the term "trap representation" had already been
there.

> XOR boolean operation on a cpu is following boolean logic, which is much stronger than
> any C standard.
Yes, this is true if we compile a eBPF program into assembly code.
But if the JIT is off, it ends up being interpreted by
___bpf_prog_run(), which just executes C code from a big switch:
...
  ALU_XOR_X:
    regs[insn->dst_reg] = regs[insn->dst_reg] ^ regs[insn->src_reg];
...

Note that it's even unknown at compile time that dst_reg and src_reg
are the same.
>
> >
> > Do you think it's feasible to explicitly initialize the register
> > values like it's done here:
> > https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d
> > ?
> >
> > Thanks,
> > Alexander Potapenko
> > Software Engineer
> >
> > Google Germany GmbH
> > Erika-Mann-Straße, 33
> > 80636 München
> >
> > Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
> > Registergericht und -nummer: Hamburg, HRB 86891
> > Sitz der Gesellschaft: Hamburg
> >



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2018-12-13 11:00 Self-XORing BPF registers is undefined behavior Alexander Potapenko
  2018-12-13 11:06 ` Eric Dumazet
@ 2018-12-13 11:59 ` Michal Kubecek
  2018-12-13 12:20   ` Michal Kubecek
  1 sibling, 1 reply; 26+ messages in thread
From: Michal Kubecek @ 2018-12-13 11:59 UTC (permalink / raw)
  To: Alexander Potapenko; +Cc: ast, daniel, Dmitriy Vyukov, Networking

On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote:
> Hi BPF maintainers,
> 
> some time ago KMSAN found an issue in BPF code which we decided to
> suppress at that point, but now I'd like to bring it to your
> attention.
> Namely, some BPF programs may contain instructions that XOR a register
> with itself.
> This effectively results in the following C code:
>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> or
>   regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X];
> being executed.
> 
> According to the C11 standard this is undefined behavior, so KMSAN
> reports an error in this case.

Can you quote the part of the standard saying this is undefined
behavior? I couldn't find anything else than

  If the value being stored in an object is read from another object
  that overlaps in any way the storage of the first object, then the
  overlap shall be exact and the two objects shall have qualified or
  unqualified versions of a compatible type; otherwise, the behavior
  is undefined.

(but I only have a draft for obvious reasons). I'm not sure what exactly
they mean by "exact overlap" and the standard doesn't seem to define
the term but if the two objects are actually the same, they certainly
have compatible types.

> 
> Do you think it's feasible to explicitly initialize the register
> values like it's done here:
> https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d
> ?

Wouldn't that mean we still end up with undefined behavior whenever
a cBPF program explicitly uses the xor with itself to zero a register?

Michal Kubecek

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2018-12-13 11:59 ` Michal Kubecek
@ 2018-12-13 12:20   ` Michal Kubecek
  2018-12-13 12:24     ` Alexander Potapenko
  0 siblings, 1 reply; 26+ messages in thread
From: Michal Kubecek @ 2018-12-13 12:20 UTC (permalink / raw)
  To: Alexander Potapenko; +Cc: ast, daniel, Dmitriy Vyukov, Networking

On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote:
> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote:
> > Hi BPF maintainers,
> > 
> > some time ago KMSAN found an issue in BPF code which we decided to
> > suppress at that point, but now I'd like to bring it to your
> > attention.
> > Namely, some BPF programs may contain instructions that XOR a register
> > with itself.
> > This effectively results in the following C code:
> >   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> > or
> >   regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X];
> > being executed.
> > 
> > According to the C11 standard this is undefined behavior, so KMSAN
> > reports an error in this case.
> 
> Can you quote the part of the standard saying this is undefined
> behavior? I couldn't find anything else than
> 
>   If the value being stored in an object is read from another object
>   that overlaps in any way the storage of the first object, then the
>   overlap shall be exact and the two objects shall have qualified or
>   unqualified versions of a compatible type; otherwise, the behavior
>   is undefined.
> 
> (but I only have a draft for obvious reasons). I'm not sure what exactly
> they mean by "exact overlap" and the standard doesn't seem to define
> the term but if the two objects are actually the same, they certainly
> have compatible types.


I think I understand now. You didn't want to say that the statement

  regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];

as such is undefined behavior but that it's UB when regs[BPF_REG_A] is
uninitialized. Right?

Michal Kubecek

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2018-12-13 12:20   ` Michal Kubecek
@ 2018-12-13 12:24     ` Alexander Potapenko
  2018-12-13 13:18       ` Daniel Borkmann
  0 siblings, 1 reply; 26+ messages in thread
From: Alexander Potapenko @ 2018-12-13 12:24 UTC (permalink / raw)
  To: mkubecek; +Cc: ast, daniel, Dmitriy Vyukov, Networking

On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@suse.cz> wrote:
>
> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote:
> > On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote:
> > > Hi BPF maintainers,
> > >
> > > some time ago KMSAN found an issue in BPF code which we decided to
> > > suppress at that point, but now I'd like to bring it to your
> > > attention.
> > > Namely, some BPF programs may contain instructions that XOR a register
> > > with itself.
> > > This effectively results in the following C code:
> > >   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> > > or
> > >   regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X];
> > > being executed.
> > >
> > > According to the C11 standard this is undefined behavior, so KMSAN
> > > reports an error in this case.
> >
> > Can you quote the part of the standard saying this is undefined
> > behavior? I couldn't find anything else than
> >
> >   If the value being stored in an object is read from another object
> >   that overlaps in any way the storage of the first object, then the
> >   overlap shall be exact and the two objects shall have qualified or
> >   unqualified versions of a compatible type; otherwise, the behavior
> >   is undefined.
> >
> > (but I only have a draft for obvious reasons). I'm not sure what exactly
> > they mean by "exact overlap" and the standard doesn't seem to define
> > the term but if the two objects are actually the same, they certainly
> > have compatible types.
>
>
> I think I understand now. You didn't want to say that the statement
>
>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
>
> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is
> uninitialized. Right?
Yes. Sorry for being unclear.
By default regs[] is uninitialized, so we need to initialize it before
using the register values.
I am also wondering if it's possible to simply copy the uninitialized
register values from regs[] to the userspace via maps.
> Michal Kubecek



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2018-12-13 12:24     ` Alexander Potapenko
@ 2018-12-13 13:18       ` Daniel Borkmann
  2018-12-13 14:54         ` Daniel Borkmann
  2018-12-18 14:09         ` Alexander Potapenko
  0 siblings, 2 replies; 26+ messages in thread
From: Daniel Borkmann @ 2018-12-13 13:18 UTC (permalink / raw)
  To: Alexander Potapenko, mkubecek; +Cc: ast, Dmitriy Vyukov, Networking

On 12/13/2018 01:24 PM, Alexander Potapenko wrote:
> On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@suse.cz> wrote:
>> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote:
>>> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote:
>>>> Hi BPF maintainers,
>>>>
>>>> some time ago KMSAN found an issue in BPF code which we decided to
>>>> suppress at that point, but now I'd like to bring it to your
>>>> attention.
>>>> Namely, some BPF programs may contain instructions that XOR a register
>>>> with itself.
>>>> This effectively results in the following C code:
>>>>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
>>>> or
>>>>   regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X];
>>>> being executed.
>>>>
>>>> According to the C11 standard this is undefined behavior, so KMSAN
>>>> reports an error in this case.

Can you elaborate / quote the exact bit from C11 that KMSAN is referring
to? (The below that Michal was quoting or something else?)

Does that only refer to C11 standard? Note that kernel's Makefile +430
explicitly states 'std=gnu89' and not 'std=c11' [0].

  [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=51b97e354ba9fce1890cf38ecc754aa49677fc89

>>> Can you quote the part of the standard saying this is undefined
>>> behavior? I couldn't find anything else than
>>>
>>>   If the value being stored in an object is read from another object
>>>   that overlaps in any way the storage of the first object, then the
>>>   overlap shall be exact and the two objects shall have qualified or
>>>   unqualified versions of a compatible type; otherwise, the behavior
>>>   is undefined.
>>>
>>> (but I only have a draft for obvious reasons). I'm not sure what exactly
>>> they mean by "exact overlap" and the standard doesn't seem to define
>>> the term but if the two objects are actually the same, they certainly
>>> have compatible types.

Here is an example for the overlap quoted above; I don't think this
applies to our case since it would be "exact". Quote [1]:

  struct S { int x; int y; };
  struct T { int z; struct S s; };
  union U { struct S f ; struct T g; } u;

  main(){
    u.f = u.g.s;
    return 0;
  }

  [1] https://bts.frama-c.com/print_bug_page.php?bug_id=945

>> I think I understand now. You didn't want to say that the statement
>>
>>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
>>
>> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is
>> uninitialized. Right?
> Yes. Sorry for being unclear.
> By default regs[] is uninitialized, so we need to initialize it before
> using the register values.
> I am also wondering if it's possible to simply copy the uninitialized
> register values from regs[] to the userspace via maps.
>> Michal Kubecek
> 
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2018-12-13 13:18       ` Daniel Borkmann
@ 2018-12-13 14:54         ` Daniel Borkmann
  2018-12-18 14:36           ` Alexander Potapenko
  2018-12-18 14:09         ` Alexander Potapenko
  1 sibling, 1 reply; 26+ messages in thread
From: Daniel Borkmann @ 2018-12-13 14:54 UTC (permalink / raw)
  To: Alexander Potapenko, mkubecek; +Cc: ast, Dmitriy Vyukov, Networking

On 12/13/2018 02:18 PM, Daniel Borkmann wrote:
> On 12/13/2018 01:24 PM, Alexander Potapenko wrote:
>> On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@suse.cz> wrote:
>>> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote:
>>>> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote:
>>>>> Hi BPF maintainers,
>>>>>
>>>>> some time ago KMSAN found an issue in BPF code which we decided to
>>>>> suppress at that point, but now I'd like to bring it to your
>>>>> attention.
>>>>> Namely, some BPF programs may contain instructions that XOR a register
>>>>> with itself.
>>>>> This effectively results in the following C code:
>>>>>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
>>>>> or
>>>>>   regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X];
>>>>> being executed.
>>>>>
>>>>> According to the C11 standard this is undefined behavior, so KMSAN
>>>>> reports an error in this case.
> 
> Can you elaborate / quote the exact bit from C11 that KMSAN is referring
> to? (The below that Michal was quoting or something else?)
> 
> Does that only refer to C11 standard? Note that kernel's Makefile +430
> explicitly states 'std=gnu89' and not 'std=c11' [0].
> 
>   [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=51b97e354ba9fce1890cf38ecc754aa49677fc89
> 
>>>> Can you quote the part of the standard saying this is undefined
>>>> behavior? I couldn't find anything else than
>>>>
>>>>   If the value being stored in an object is read from another object
>>>>   that overlaps in any way the storage of the first object, then the
>>>>   overlap shall be exact and the two objects shall have qualified or
>>>>   unqualified versions of a compatible type; otherwise, the behavior
>>>>   is undefined.
>>>>
>>>> (but I only have a draft for obvious reasons). I'm not sure what exactly
>>>> they mean by "exact overlap" and the standard doesn't seem to define
>>>> the term but if the two objects are actually the same, they certainly
>>>> have compatible types.
> 
> Here is an example for the overlap quoted above; I don't think this
> applies to our case since it would be "exact". Quote [1]:
> 
>   struct S { int x; int y; };
>   struct T { int z; struct S s; };
>   union U { struct S f ; struct T g; } u;
> 
>   main(){
>     u.f = u.g.s;
>     return 0;
>   }
> 
>   [1] https://bts.frama-c.com/print_bug_page.php?bug_id=945
> 
>>> I think I understand now. You didn't want to say that the statement
>>>
>>>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
>>>
>>> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is
>>> uninitialized. Right?
>> Yes. Sorry for being unclear.
>> By default regs[] is uninitialized, so we need to initialize it before
>> using the register values.
>> I am also wondering if it's possible to simply copy the uninitialized
>> register values from regs[] to the userspace via maps.

Nope, not possible. And to elaborate on cBPF / eBPF cases:

# cat /proc/sys/net/core/bpf_jit_enable
0

For the eBPF case, lets take (test_verifier.c):

{
        "xor test",
        .insns = {
                BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2),
                BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
                BPF_EXIT_INSN(),
        },
        .result = ACCEPT,
        .retval = 0,
},

# ./test_verifier 0
#0/u xor test FAIL
Failed to load prog 'Permission denied'!
0: (af) r2 ^= r2
R2 !read_ok
#0/p xor test FAIL
Failed to load prog 'Permission denied'!
0: (af) r2 ^= r2
R2 !read_ok
Summary: 0 PASSED, 0 SKIPPED, 2 FAILED

And for the initialized case:

{
        "xor test",
        .insns = {
                BPF_MOV64_IMM(BPF_REG_2, 42),
                BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2),
                BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
                BPF_EXIT_INSN(),
        },
        .result = ACCEPT,
        .retval = 0,
},

# ./test_verifier 0
#0/u xor test OK
#0/p xor test OK
Summary: 2 PASSED, 0 SKIPPED, 0 FAILED

And cBPF (test_bpf.c):

{
        "xor test",
        .u.insns = {
                BPF_STMT(BPF_RET | BPF_A, 0)
        },
        CLASSIC,
        { },
        { { 0, 0 }, },
},

[...]
[88819.276142] test_bpf: #0 xor test jited:0 PASS
[...]

Given latter two use the same underlying interpreter implementation for
the 'good' and 'bad' case, I don't see how compiler would have any degree
of freedom to judge about the runtime dependent BPF insn input stream
here. If it would assume UB, then also 'good' xor case would be broken
and miscompiled which it is clearly not (even for some very obvious case
below). While KMSAN is analyzing runtime memory access, compiler cannot..

# cat foo.c
#include <stdio.h>
int main(void)
{
	int foo = foo ^ foo;
	printf("%u\n", foo);
	return 0;
}

# clang -Wall -O2 foo.c
foo.c:4:12: warning: variable 'foo' is uninitialized when used within its own initialization [-Wuninitialized]
        int foo = foo ^ foo;
            ~~~   ^~~
# ./a.out
0

disasm:

00000000004004d0 <main>:
  4004d0:	50                   	push   %rax
  4004d1:	bf 74 05 40 00       	mov    $0x400574,%edi
  4004d6:	31 f6                	xor    %esi,%esi
  4004d8:	31 c0                	xor    %eax,%eax
  4004da:	e8 f1 fe ff ff       	callq  4003d0 <printf@plt>
  4004df:	31 c0                	xor    %eax,%eax
  4004e1:	59                   	pop    %rcx
  4004e2:	c3                   	retq
  4004e3:	66 2e 0f 1f 84 00 00 	nopw   %cs:0x0(%rax,%rax,1)
  4004ea:	00 00 00
  4004ed:	0f 1f 00             	nopl   (%rax)

# gcc -Wall -O2 foo.c   (neither complains with using -Wextra)
# ./a.out
0

disasm:

0000000000000560 <main>:
 560:	48 8d 35 ad 01 00 00 	lea    0x1ad(%rip),%rsi        # 714 <_IO_stdin_used+0x4>
 567:	48 83 ec 08          	sub    $0x8,%rsp
 56b:	31 d2                	xor    %edx,%edx
 56d:	bf 01 00 00 00       	mov    $0x1,%edi
 572:	31 c0                	xor    %eax,%eax
 574:	e8 c7 ff ff ff       	callq  540 <__printf_chk@plt>
 579:	31 c0                	xor    %eax,%eax
 57b:	48 83 c4 08          	add    $0x8,%rsp
 57f:	c3                   	retq

Cheers,
Daniel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2018-12-13 13:18       ` Daniel Borkmann
  2018-12-13 14:54         ` Daniel Borkmann
@ 2018-12-18 14:09         ` Alexander Potapenko
  1 sibling, 0 replies; 26+ messages in thread
From: Alexander Potapenko @ 2018-12-18 14:09 UTC (permalink / raw)
  To: daniel; +Cc: mkubecek, ast, Dmitriy Vyukov, Networking

On Thu, Dec 13, 2018 at 2:18 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 12/13/2018 01:24 PM, Alexander Potapenko wrote:
> > On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@suse.cz> wrote:
> >> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote:
> >>> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote:
> >>>> Hi BPF maintainers,
> >>>>
> >>>> some time ago KMSAN found an issue in BPF code which we decided to
> >>>> suppress at that point, but now I'd like to bring it to your
> >>>> attention.
> >>>> Namely, some BPF programs may contain instructions that XOR a register
> >>>> with itself.
> >>>> This effectively results in the following C code:
> >>>>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> >>>> or
> >>>>   regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X];
> >>>> being executed.
> >>>>
> >>>> According to the C11 standard this is undefined behavior, so KMSAN
> >>>> reports an error in this case.
>
> Can you elaborate / quote the exact bit from C11 that KMSAN is referring
> to? (The below that Michal was quoting or something else?)

"6.7.9 Initialization
10 If an object that has automatic storage duration is not initialized
explicitly, its value is indeterminate."

and

"5 Certain object representations need not represent a value of the
object type. If the stored
value of an object has such a representation and is read by an lvalue
expression that does
not have character type, the behavior is undefined. If such a
representation is produced
by a side effect that modifies all or any part of the object by an
lvalue expression that
does not have character type, the behavior is undefined.[50] Such a
representation is called a trap representation."

> Does that only refer to C11 standard? Note that kernel's Makefile +430
> explicitly states 'std=gnu89' and not 'std=c11' [0].
It's better defined in newer standards, but C89 also forbids uses of
uninitialized values:

"3.16 undefined behavior: Behavior, upon use of a nonponable or
erroneous program construct,
of erroneous data, or of indeterminately valued objects, for which
this International Standard
imposes no requirements. Permissible undefined behavior ranges from
ignoring the situation
completely with unpredictable results, to behaving during translation
or program execution in a
documented manner characteristic of the environment (with or without
the issuance of a
diagnostic message), to terminating a translation or execution (with
the issuance of a diagnostic message)."

and

"6.5.7 Initialization
...
If an object that has automatic storage duration is not initialized
explicitly, its value is
indeterminate[74]. If an object that has static storage duration is
not initialized explicitly, it is
initialized implicitly as if every member that has arithmetic type
were assigned 0 and every
member that has pointer type were assigned a null pointer constant."


>   [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=51b97e354ba9fce1890cf38ecc754aa49677fc89
>
> >>> Can you quote the part of the standard saying this is undefined
> >>> behavior? I couldn't find anything else than
> >>>
> >>>   If the value being stored in an object is read from another object
> >>>   that overlaps in any way the storage of the first object, then the
> >>>   overlap shall be exact and the two objects shall have qualified or
> >>>   unqualified versions of a compatible type; otherwise, the behavior
> >>>   is undefined.
> >>>
> >>> (but I only have a draft for obvious reasons). I'm not sure what exactly
> >>> they mean by "exact overlap" and the standard doesn't seem to define
> >>> the term but if the two objects are actually the same, they certainly
> >>> have compatible types.
>
> Here is an example for the overlap quoted above; I don't think this
> applies to our case since it would be "exact". Quote [1]:
>
>   struct S { int x; int y; };
>   struct T { int z; struct S s; };
>   union U { struct S f ; struct T g; } u;
>
>   main(){
>     u.f = u.g.s;
>     return 0;
>   }
>
>   [1] https://bts.frama-c.com/print_bug_page.php?bug_id=945
>
> >> I think I understand now. You didn't want to say that the statement
> >>
> >>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> >>
> >> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is
> >> uninitialized. Right?
> > Yes. Sorry for being unclear.
> > By default regs[] is uninitialized, so we need to initialize it before
> > using the register values.
> > I am also wondering if it's possible to simply copy the uninitialized
> > register values from regs[] to the userspace via maps.
> >> Michal Kubecek
> >
> >
> >
>


-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2018-12-13 14:54         ` Daniel Borkmann
@ 2018-12-18 14:36           ` Alexander Potapenko
  2020-05-27 15:52             ` Alexander Potapenko
  0 siblings, 1 reply; 26+ messages in thread
From: Alexander Potapenko @ 2018-12-18 14:36 UTC (permalink / raw)
  To: daniel; +Cc: mkubecek, ast, Dmitriy Vyukov, Networking

On Thu, Dec 13, 2018 at 3:54 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 12/13/2018 02:18 PM, Daniel Borkmann wrote:
> > On 12/13/2018 01:24 PM, Alexander Potapenko wrote:
> >> On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@suse.cz> wrote:
> >>> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote:
> >>>> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote:
> >>>>> Hi BPF maintainers,
> >>>>>
> >>>>> some time ago KMSAN found an issue in BPF code which we decided to
> >>>>> suppress at that point, but now I'd like to bring it to your
> >>>>> attention.
> >>>>> Namely, some BPF programs may contain instructions that XOR a register
> >>>>> with itself.
> >>>>> This effectively results in the following C code:
> >>>>>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> >>>>> or
> >>>>>   regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X];
> >>>>> being executed.
> >>>>>
> >>>>> According to the C11 standard this is undefined behavior, so KMSAN
> >>>>> reports an error in this case.
> >
> > Can you elaborate / quote the exact bit from C11 that KMSAN is referring
> > to? (The below that Michal was quoting or something else?)
> >
> > Does that only refer to C11 standard? Note that kernel's Makefile +430
> > explicitly states 'std=gnu89' and not 'std=c11' [0].
> >
> >   [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=51b97e354ba9fce1890cf38ecc754aa49677fc89
> >
> >>>> Can you quote the part of the standard saying this is undefined
> >>>> behavior? I couldn't find anything else than
> >>>>
> >>>>   If the value being stored in an object is read from another object
> >>>>   that overlaps in any way the storage of the first object, then the
> >>>>   overlap shall be exact and the two objects shall have qualified or
> >>>>   unqualified versions of a compatible type; otherwise, the behavior
> >>>>   is undefined.
> >>>>
> >>>> (but I only have a draft for obvious reasons). I'm not sure what exactly
> >>>> they mean by "exact overlap" and the standard doesn't seem to define
> >>>> the term but if the two objects are actually the same, they certainly
> >>>> have compatible types.
> >
> > Here is an example for the overlap quoted above; I don't think this
> > applies to our case since it would be "exact". Quote [1]:
> >
> >   struct S { int x; int y; };
> >   struct T { int z; struct S s; };
> >   union U { struct S f ; struct T g; } u;
> >
> >   main(){
> >     u.f = u.g.s;
> >     return 0;
> >   }
> >
> >   [1] https://bts.frama-c.com/print_bug_page.php?bug_id=945
> >
> >>> I think I understand now. You didn't want to say that the statement
> >>>
> >>>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> >>>
> >>> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is
> >>> uninitialized. Right?
> >> Yes. Sorry for being unclear.
> >> By default regs[] is uninitialized, so we need to initialize it before
> >> using the register values.
> >> I am also wondering if it's possible to simply copy the uninitialized
> >> register values from regs[] to the userspace via maps.
>
> Nope, not possible. And to elaborate on cBPF / eBPF cases:
If you mean that it's not possible to generate a eBPF program that
XORs an uninitialized register with itself, you may be actually right.
I've reverted https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d,
and syzkaller couldn't find a program triggering this behavior so far.
Perhaps something had changed in eBPF code since I encountered this error.
I'll be watching the dashboard and will let you know if I have a
reliable reproducer for the aforementioned problem.
Thanks for checking!

> # cat /proc/sys/net/core/bpf_jit_enable
> 0
>
> For the eBPF case, lets take (test_verifier.c):
>
> {
>         "xor test",
>         .insns = {
>                 BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2),
>                 BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
>                 BPF_EXIT_INSN(),
>         },
>         .result = ACCEPT,
>         .retval = 0,
> },
>
> # ./test_verifier 0
> #0/u xor test FAIL
> Failed to load prog 'Permission denied'!
> 0: (af) r2 ^= r2
> R2 !read_ok
> #0/p xor test FAIL
> Failed to load prog 'Permission denied'!
> 0: (af) r2 ^= r2
> R2 !read_ok
> Summary: 0 PASSED, 0 SKIPPED, 2 FAILED
>
> And for the initialized case:
>
> {
>         "xor test",
>         .insns = {
>                 BPF_MOV64_IMM(BPF_REG_2, 42),
>                 BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2),
>                 BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
>                 BPF_EXIT_INSN(),
>         },
>         .result = ACCEPT,
>         .retval = 0,
> },
>
> # ./test_verifier 0
> #0/u xor test OK
> #0/p xor test OK
> Summary: 2 PASSED, 0 SKIPPED, 0 FAILED
>
> And cBPF (test_bpf.c):
>
> {
>         "xor test",
>         .u.insns = {
>                 BPF_STMT(BPF_RET | BPF_A, 0)
>         },
>         CLASSIC,
>         { },
>         { { 0, 0 }, },
> },
>
> [...]
> [88819.276142] test_bpf: #0 xor test jited:0 PASS
> [...]
>
> Given latter two use the same underlying interpreter implementation for
> the 'good' and 'bad' case, I don't see how compiler would have any degree
> of freedom to judge about the runtime dependent BPF insn input stream
> here. If it would assume UB, then also 'good' xor case would be broken
> and miscompiled which it is clearly not (even for some very obvious case
> below). While KMSAN is analyzing runtime memory access, compiler cannot..

IIUC some hardware, like IA64, may do these checks at runtime (see
https://blogs.msdn.microsoft.com/oldnewthing/20150727-00/?p=90821 for
an example)
Please note that referring to results of compiling a finite number of
programs with UB with a finite number of compilers doesn't prove
anything.
It won't help much in the case tomorrow e.g. GCC changes its behavior
to make use of some new hardware feature.
> # cat foo.c
> #include <stdio.h>
> int main(void)
> {
>         int foo = foo ^ foo;
>         printf("%u\n", foo);
>         return 0;
> }
>
> # clang -Wall -O2 foo.c
> foo.c:4:12: warning: variable 'foo' is uninitialized when used within its own initialization [-Wuninitialized]
>         int foo = foo ^ foo;
>             ~~~   ^~~
> # ./a.out
> 0
>
> disasm:
>
> 00000000004004d0 <main>:
>   4004d0:       50                      push   %rax
>   4004d1:       bf 74 05 40 00          mov    $0x400574,%edi
>   4004d6:       31 f6                   xor    %esi,%esi
>   4004d8:       31 c0                   xor    %eax,%eax
>   4004da:       e8 f1 fe ff ff          callq  4003d0 <printf@plt>
>   4004df:       31 c0                   xor    %eax,%eax
>   4004e1:       59                      pop    %rcx
>   4004e2:       c3                      retq
>   4004e3:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
>   4004ea:       00 00 00
>   4004ed:       0f 1f 00                nopl   (%rax)
>
> # gcc -Wall -O2 foo.c   (neither complains with using -Wextra)
> # ./a.out
> 0
>
> disasm:
>
> 0000000000000560 <main>:
>  560:   48 8d 35 ad 01 00 00    lea    0x1ad(%rip),%rsi        # 714 <_IO_stdin_used+0x4>
>  567:   48 83 ec 08             sub    $0x8,%rsp
>  56b:   31 d2                   xor    %edx,%edx
>  56d:   bf 01 00 00 00          mov    $0x1,%edi
>  572:   31 c0                   xor    %eax,%eax
>  574:   e8 c7 ff ff ff          callq  540 <__printf_chk@plt>
>  579:   31 c0                   xor    %eax,%eax
>  57b:   48 83 c4 08             add    $0x8,%rsp
>  57f:   c3                      retq
>
> Cheers,
> Daniel



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2018-12-18 14:36           ` Alexander Potapenko
@ 2020-05-27 15:52             ` Alexander Potapenko
  2020-05-27 16:58               ` Alexei Starovoitov
  0 siblings, 1 reply; 26+ messages in thread
From: Alexander Potapenko @ 2020-05-27 15:52 UTC (permalink / raw)
  To: daniel; +Cc: mkubecek, ast, Dmitriy Vyukov, Networking

[-- Attachment #1: Type: text/plain, Size: 6771 bytes --]

On Tue, Dec 18, 2018 at 3:36 PM Alexander Potapenko <glider@google.com> wrote:
>
> On Thu, Dec 13, 2018 at 3:54 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >
> > On 12/13/2018 02:18 PM, Daniel Borkmann wrote:
> > > On 12/13/2018 01:24 PM, Alexander Potapenko wrote:
> > >> On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@suse.cz> wrote:
> > >>> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote:
> > >>>> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote:
> > >>>>> Hi BPF maintainers,
> > >>>>>
> > >>>>> some time ago KMSAN found an issue in BPF code which we decided to
> > >>>>> suppress at that point, but now I'd like to bring it to your
> > >>>>> attention.
> > >>>>> Namely, some BPF programs may contain instructions that XOR a register
> > >>>>> with itself.
> > >>>>> This effectively results in the following C code:
> > >>>>>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> > >>>>> or
> > >>>>>   regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X];
> > >>>>> being executed.
> > >>>>>
> > >>>>> According to the C11 standard this is undefined behavior, so KMSAN
> > >>>>> reports an error in this case.
> > >
> > > Can you elaborate / quote the exact bit from C11 that KMSAN is referring
> > > to? (The below that Michal was quoting or something else?)
> > >
> > > Does that only refer to C11 standard? Note that kernel's Makefile +430
> > > explicitly states 'std=gnu89' and not 'std=c11' [0].
> > >
> > >   [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=51b97e354ba9fce1890cf38ecc754aa49677fc89
> > >
> > >>>> Can you quote the part of the standard saying this is undefined
> > >>>> behavior? I couldn't find anything else than
> > >>>>
> > >>>>   If the value being stored in an object is read from another object
> > >>>>   that overlaps in any way the storage of the first object, then the
> > >>>>   overlap shall be exact and the two objects shall have qualified or
> > >>>>   unqualified versions of a compatible type; otherwise, the behavior
> > >>>>   is undefined.
> > >>>>
> > >>>> (but I only have a draft for obvious reasons). I'm not sure what exactly
> > >>>> they mean by "exact overlap" and the standard doesn't seem to define
> > >>>> the term but if the two objects are actually the same, they certainly
> > >>>> have compatible types.
> > >
> > > Here is an example for the overlap quoted above; I don't think this
> > > applies to our case since it would be "exact". Quote [1]:
> > >
> > >   struct S { int x; int y; };
> > >   struct T { int z; struct S s; };
> > >   union U { struct S f ; struct T g; } u;
> > >
> > >   main(){
> > >     u.f = u.g.s;
> > >     return 0;
> > >   }
> > >
> > >   [1] https://bts.frama-c.com/print_bug_page.php?bug_id=945
> > >
> > >>> I think I understand now. You didn't want to say that the statement
> > >>>
> > >>>   regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A];
> > >>>
> > >>> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is
> > >>> uninitialized. Right?
> > >> Yes. Sorry for being unclear.
> > >> By default regs[] is uninitialized, so we need to initialize it before
> > >> using the register values.
> > >> I am also wondering if it's possible to simply copy the uninitialized
> > >> register values from regs[] to the userspace via maps.
> >
> > Nope, not possible. And to elaborate on cBPF / eBPF cases:
> If you mean that it's not possible to generate a eBPF program that
> XORs an uninitialized register with itself, you may be actually right.
> I've reverted https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d,
> and syzkaller couldn't find a program triggering this behavior so far.
> Perhaps something had changed in eBPF code since I encountered this error.
> I'll be watching the dashboard and will let you know if I have a
> reliable reproducer for the aforementioned problem.
> Thanks for checking!

Hi again,

similar bugs have started showing up recently.
When I run the attached program (note it uses
SO_SECURITY_AUTHENTICATION, which as far as I understand is a no-op)
on the KMSAN-enabled kernel (currently using v5.7-rc4) I see the
following errors:

=====================================================
BUG: KMSAN: uninit-value in packet_rcv_fanout+0x242b/0x25a0
net/packet/af_packet.c:1463
 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
 packet_rcv_fanout+0x242b/0x25a0 net/packet/af_packet.c:1463
 deliver_skb net/core/dev.c:2168
 __netif_receive_skb_core+0x1434/0x5860 net/core/dev.c:5052
 __netif_receive_skb_list_core+0x315/0x1380 net/core/dev.c:5264
 __netif_receive_skb_list net/core/dev.c:5331
 netif_receive_skb_list_internal+0xf54/0x1600 net/core/dev.c:5426
 gro_normal_list net/core/dev.c:5537
 napi_complete_done+0x2ef/0xb40 net/core/dev.c:6258
 e1000_clean+0x1bc8/0x5d80 drivers/net/ethernet/intel/e1000/e1000_main.c:3802
 napi_poll net/core/dev.c:6572
...
Uninit was stored to memory at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144
 kmsan_internal_chain_origin+0xad/0x130 mm/kmsan/kmsan.c:310
 __msan_chain_origin+0x50/0x90 mm/kmsan/kmsan_instr.c:165
 ___bpf_prog_run+0x68fa/0x9300 kernel/bpf/core.c:1408
 __bpf_prog_run32+0x101/0x170 kernel/bpf/core.c:1681
 bpf_dispatcher_nop_func ./include/linux/bpf.h:545
 bpf_prog_run_pin_on_cpu ./include/linux/filter.h:599
 bpf_prog_run_clear_cb ./include/linux/filter.h:721
 fanout_demux_bpf net/packet/af_packet.c:1404
 packet_rcv_fanout+0x517/0x25a0 net/packet/af_packet.c:1456
 deliver_skb net/core/dev.c:2168
...
Local variable ----regs@__bpf_prog_run32 created at:
 __bpf_prog_run32+0x87/0x170 kernel/bpf/core.c:1681
 __bpf_prog_run32+0x87/0x170 kernel/bpf/core.c:1681
=====================================================

This basically means that BPF's output register was uninitialized when
___bpf_prog_run() returned.

When I replace the lines initializing registers A and X in net/core/filter.c:

-               *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
-               *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);

with

+               *new_insn++ = BPF_MOV32_IMM(BPF_REG_A, 0);
+               *new_insn++ = BPF_MOV32_IMM(BPF_REG_X, 0);

, the bug goes away, therefore I think it's being caused by XORing the
initially uninitialized registers with themselves.

kernel/bpf/core.c:1408, where the uninitialized value was stored to
memory, points to the "ALU(ADD,  +)" macro in ___bpf_prog_run().
But the debug info seems to be incorrect here: if I comment this line
out and unroll the macro manually, KMSAN points to "ALU(SUB,  -)".
Most certainly it's actually one of the XOR instruction declarations.

Do you think it makes sense to use the UB-proof BPF_MOV32_IMM
instructions to initialize the registers?

[-- Attachment #2: bpf2.c --]
[-- Type: text/x-csrc, Size: 596 bytes --]

// autogenerated by syzkaller (https://github.com/google/syzkaller)

#include <linux/filter.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
  int sock = socket(PF_PACKET, SOCK_RAW, 0x300);
  if (sock == -1)
    return 1;

  uint16_t rcv_arg[2] = {0, 6};
  setsockopt(sock, SOL_PACKET, /*SO_RCVLOWAT*/0x12, &rcv_arg, sizeof(rcv_arg));
  struct sock_filter code[] = { {0x16, 0, 0, 0} };
  struct sock_fprog bpf = {1, code};
  setsockopt(sock, SOL_PACKET, /*SO_SECURITY_AUTHENTICATION*/0x16, &bpf, sizeof(bpf));

  sleep(10);
  return 0;
}

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-05-27 15:52             ` Alexander Potapenko
@ 2020-05-27 16:58               ` Alexei Starovoitov
  2020-05-27 17:12                 ` Alexander Potapenko
  0 siblings, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2020-05-27 16:58 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov,
	Dmitriy Vyukov, Networking

On Wed, May 27, 2020 at 8:52 AM Alexander Potapenko <glider@google.com> wrote:
>
> This basically means that BPF's output register was uninitialized when
> ___bpf_prog_run() returned.
>
> When I replace the lines initializing registers A and X in net/core/filter.c:
>
> -               *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
> -               *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);
>
> with
>
> +               *new_insn++ = BPF_MOV32_IMM(BPF_REG_A, 0);
> +               *new_insn++ = BPF_MOV32_IMM(BPF_REG_X, 0);
>
> , the bug goes away, therefore I think it's being caused by XORing the
> initially uninitialized registers with themselves.
>
> kernel/bpf/core.c:1408, where the uninitialized value was stored to
> memory, points to the "ALU(ADD,  +)" macro in ___bpf_prog_run().
> But the debug info seems to be incorrect here: if I comment this line
> out and unroll the macro manually, KMSAN points to "ALU(SUB,  -)".
> Most certainly it's actually one of the XOR instruction declarations.
>
> Do you think it makes sense to use the UB-proof BPF_MOV32_IMM
> instructions to initialize the registers?

I think it's better for UBsan to get smarter about xor-ing.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-05-27 16:58               ` Alexei Starovoitov
@ 2020-05-27 17:12                 ` Alexander Potapenko
  2020-05-27 17:14                   ` Alexei Starovoitov
  0 siblings, 1 reply; 26+ messages in thread
From: Alexander Potapenko @ 2020-05-27 17:12 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov,
	Dmitriy Vyukov, Networking

On Wed, May 27, 2020 at 6:58 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, May 27, 2020 at 8:52 AM Alexander Potapenko <glider@google.com> wrote:
> >
> > This basically means that BPF's output register was uninitialized when
> > ___bpf_prog_run() returned.
> >
> > When I replace the lines initializing registers A and X in net/core/filter.c:
> >
> > -               *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
> > -               *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);
> >
> > with
> >
> > +               *new_insn++ = BPF_MOV32_IMM(BPF_REG_A, 0);
> > +               *new_insn++ = BPF_MOV32_IMM(BPF_REG_X, 0);
> >
> > , the bug goes away, therefore I think it's being caused by XORing the
> > initially uninitialized registers with themselves.
> >
> > kernel/bpf/core.c:1408, where the uninitialized value was stored to
> > memory, points to the "ALU(ADD,  +)" macro in ___bpf_prog_run().
> > But the debug info seems to be incorrect here: if I comment this line
> > out and unroll the macro manually, KMSAN points to "ALU(SUB,  -)".
> > Most certainly it's actually one of the XOR instruction declarations.
> >
> > Do you think it makes sense to use the UB-proof BPF_MOV32_IMM
> > instructions to initialize the registers?
>
> I think it's better for UBsan to get smarter about xor-ing.

Could you please elaborate on this? How exactly should KMSAN handle
this situation?
Note that despite the source says "BPF_ALU32_REG(BPF_XOR, BPF_REG_A,
BPF_REG_A);", it doesn't necessarily boil down to an expression like A
= A ^ A. It's more likely that temporary values will be involved,
making it quite hard to figure out whether the two operands are really
the same.
For an expression like A = B ^ C, KMSAN just calculates A's shadow
bits based on the values and the shadow bits of B and C. If either B
or C is uninitialized, the result will be uninitialized as well, even
if both B and C contain the same values. It's therefore strange to
expect the tool to behave differently if B and C are the same
variable.



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-05-27 17:12                 ` Alexander Potapenko
@ 2020-05-27 17:14                   ` Alexei Starovoitov
  2020-05-28  9:54                     ` Alexander Potapenko
  0 siblings, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2020-05-27 17:14 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov,
	Dmitriy Vyukov, Networking

On Wed, May 27, 2020 at 10:12 AM Alexander Potapenko <glider@google.com> wrote:
>
> On Wed, May 27, 2020 at 6:58 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, May 27, 2020 at 8:52 AM Alexander Potapenko <glider@google.com> wrote:
> > >
> > > This basically means that BPF's output register was uninitialized when
> > > ___bpf_prog_run() returned.
> > >
> > > When I replace the lines initializing registers A and X in net/core/filter.c:
> > >
> > > -               *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
> > > -               *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);
> > >
> > > with
> > >
> > > +               *new_insn++ = BPF_MOV32_IMM(BPF_REG_A, 0);
> > > +               *new_insn++ = BPF_MOV32_IMM(BPF_REG_X, 0);
> > >
> > > , the bug goes away, therefore I think it's being caused by XORing the
> > > initially uninitialized registers with themselves.
> > >
> > > kernel/bpf/core.c:1408, where the uninitialized value was stored to
> > > memory, points to the "ALU(ADD,  +)" macro in ___bpf_prog_run().
> > > But the debug info seems to be incorrect here: if I comment this line
> > > out and unroll the macro manually, KMSAN points to "ALU(SUB,  -)".
> > > Most certainly it's actually one of the XOR instruction declarations.
> > >
> > > Do you think it makes sense to use the UB-proof BPF_MOV32_IMM
> > > instructions to initialize the registers?
> >
> > I think it's better for UBsan to get smarter about xor-ing.
>
> Could you please elaborate on this? How exactly should KMSAN handle
> this situation?
> Note that despite the source says "BPF_ALU32_REG(BPF_XOR, BPF_REG_A,
> BPF_REG_A);", it doesn't necessarily boil down to an expression like A
> = A ^ A. It's more likely that temporary values will be involved,
> making it quite hard to figure out whether the two operands are really
> the same.

I really don't know who to make it smarter. It's your area of expertise.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-05-27 17:14                   ` Alexei Starovoitov
@ 2020-05-28  9:54                     ` Alexander Potapenko
  2020-05-28 16:00                       ` Alexei Starovoitov
  0 siblings, 1 reply; 26+ messages in thread
From: Alexander Potapenko @ 2020-05-28  9:54 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov,
	Dmitriy Vyukov, Networking

On Wed, May 27, 2020 at 7:15 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, May 27, 2020 at 10:12 AM Alexander Potapenko <glider@google.com> wrote:
> >
> > On Wed, May 27, 2020 at 6:58 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Wed, May 27, 2020 at 8:52 AM Alexander Potapenko <glider@google.com> wrote:
> > > >
> > > > This basically means that BPF's output register was uninitialized when
> > > > ___bpf_prog_run() returned.
> > > >
> > > > When I replace the lines initializing registers A and X in net/core/filter.c:
> > > >
> > > > -               *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
> > > > -               *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);
> > > >
> > > > with
> > > >
> > > > +               *new_insn++ = BPF_MOV32_IMM(BPF_REG_A, 0);
> > > > +               *new_insn++ = BPF_MOV32_IMM(BPF_REG_X, 0);
> > > >
> > > > , the bug goes away, therefore I think it's being caused by XORing the
> > > > initially uninitialized registers with themselves.
> > > >
> > > > kernel/bpf/core.c:1408, where the uninitialized value was stored to
> > > > memory, points to the "ALU(ADD,  +)" macro in ___bpf_prog_run().
> > > > But the debug info seems to be incorrect here: if I comment this line
> > > > out and unroll the macro manually, KMSAN points to "ALU(SUB,  -)".
> > > > Most certainly it's actually one of the XOR instruction declarations.
> > > >
> > > > Do you think it makes sense to use the UB-proof BPF_MOV32_IMM
> > > > instructions to initialize the registers?
> > >
> > > I think it's better for UBsan to get smarter about xor-ing.
> >
> > Could you please elaborate on this? How exactly should KMSAN handle
> > this situation?
> > Note that despite the source says "BPF_ALU32_REG(BPF_XOR, BPF_REG_A,
> > BPF_REG_A);", it doesn't necessarily boil down to an expression like A
> > = A ^ A. It's more likely that temporary values will be involved,
> > making it quite hard to figure out whether the two operands are really
> > the same.
>
> I really don't know who to make it smarter. It's your area of expertise.

The point I am trying to make is that BPF is relying on undefined
behavior (see the quotations from C89 standard) here for no apparent
reason.
This code might be working with the current set of
compilers/optimizations, but will it pay off to debug it when
something breaks?

The C implementation for BPF_XOR in kernel/bpf/core.c is translated to:
  regs[insn->dst_reg] = regs[insn->dst_reg] ^ regs[insn->src_reg];
, at which point the compiler already can't tell whether dst_reg and
src_reg are the same.

In fact Clang may already generate unexpected code for such cases: see
the assembly for foo3() in https://godbolt.org/z/VoMHaC.

I therefore think it's better to fix the buggy code unless there are
strong reasons not to do so, rather than teach the tool how to work
around this particular bug.

-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-05-28  9:54                     ` Alexander Potapenko
@ 2020-05-28 16:00                       ` Alexei Starovoitov
  2020-05-29  0:17                         ` Edward Cree
  0 siblings, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2020-05-28 16:00 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov,
	Dmitriy Vyukov, Networking

On Thu, May 28, 2020 at 2:54 AM Alexander Potapenko <glider@google.com> wrote:
>
> On Wed, May 27, 2020 at 7:15 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, May 27, 2020 at 10:12 AM Alexander Potapenko <glider@google.com> wrote:
> > >
> > > On Wed, May 27, 2020 at 6:58 PM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Wed, May 27, 2020 at 8:52 AM Alexander Potapenko <glider@google.com> wrote:
> > > > >
> > > > > This basically means that BPF's output register was uninitialized when
> > > > > ___bpf_prog_run() returned.
> > > > >
> > > > > When I replace the lines initializing registers A and X in net/core/filter.c:
> > > > >
> > > > > -               *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
> > > > > -               *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);
> > > > >
> > > > > with
> > > > >
> > > > > +               *new_insn++ = BPF_MOV32_IMM(BPF_REG_A, 0);
> > > > > +               *new_insn++ = BPF_MOV32_IMM(BPF_REG_X, 0);
> > > > >
> > > > > , the bug goes away, therefore I think it's being caused by XORing the
> > > > > initially uninitialized registers with themselves.
> > > > >
> > > > > kernel/bpf/core.c:1408, where the uninitialized value was stored to
> > > > > memory, points to the "ALU(ADD,  +)" macro in ___bpf_prog_run().
> > > > > But the debug info seems to be incorrect here: if I comment this line
> > > > > out and unroll the macro manually, KMSAN points to "ALU(SUB,  -)".
> > > > > Most certainly it's actually one of the XOR instruction declarations.
> > > > >
> > > > > Do you think it makes sense to use the UB-proof BPF_MOV32_IMM
> > > > > instructions to initialize the registers?
> > > >
> > > > I think it's better for UBsan to get smarter about xor-ing.
> > >
> > > Could you please elaborate on this? How exactly should KMSAN handle
> > > this situation?
> > > Note that despite the source says "BPF_ALU32_REG(BPF_XOR, BPF_REG_A,
> > > BPF_REG_A);", it doesn't necessarily boil down to an expression like A
> > > = A ^ A. It's more likely that temporary values will be involved,
> > > making it quite hard to figure out whether the two operands are really
> > > the same.
> >
> > I really don't know who to make it smarter. It's your area of expertise.
>
> The point I am trying to make is that BPF is relying on undefined
> behavior (see the quotations from C89 standard) here for no apparent
> reason.
> This code might be working with the current set of
> compilers/optimizations, but will it pay off to debug it when
> something breaks?

xoring of two identical values is undefined in standard?
If that's really true such standard worth nothing.

> The C implementation for BPF_XOR in kernel/bpf/core.c is translated to:
>   regs[insn->dst_reg] = regs[insn->dst_reg] ^ regs[insn->src_reg];
> , at which point the compiler already can't tell whether dst_reg and
> src_reg are the same.

of course. compile can use different cpu regs.

> In fact Clang may already generate unexpected code for such cases: see
> the assembly for foo3() in https://godbolt.org/z/VoMHaC.

"unexpected" meaning that xor has different regs?

> I therefore think it's better to fix the buggy code unless there are
> strong reasons not to do so, rather than teach the tool how to work
> around this particular bug.

The tool needs to do data flow analysis to realize that the source
data in different regs is the same, hence xoring them will produce zero.
That could be super hard to achieve in practice and will take years
of work to implement KMSAN, but that's what you have to do.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-05-28 16:00                       ` Alexei Starovoitov
@ 2020-05-29  0:17                         ` Edward Cree
  2020-05-29  6:14                           ` Dmitry Vyukov
  2020-05-29 12:28                           ` Alexander Potapenko
  0 siblings, 2 replies; 26+ messages in thread
From: Edward Cree @ 2020-05-29  0:17 UTC (permalink / raw)
  To: Alexei Starovoitov, Alexander Potapenko
  Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov,
	Dmitriy Vyukov, Networking

On 28/05/2020 17:00, Alexei Starovoitov wrote:
> xoring of two identical values is undefined in standard?
I believe it is in this case, yes; even without the complication
 of array references that happen to alias, Alexander's foo1() is
 undefined behaviour under C89 (and also C99 which handles the
 case differently).

From the definitions section (1.6) of the C89 draft [1]:
> * Undefined behavior --- behavior, upon use of a nonportable or
> erroneous program construct, of erroneous data, or of
> indeterminately-valued objects, for which the Standard imposes
> no requirements.
And from 3.5.7 'Initialization':
> If an object that has automatic storage duration is not
> initialized explicitly, its value is indeterminate.
Since the standard doesn't say anything about self-XORing that
 could make it 'special' in this regard, the compiler isn't
 required to notice that it's a self-XOR, and (in the tradition
 of compiler-writers the world over) is entitled to optimise the
 program based on the assumption that the programmer has not
 committed UB, so in the foo1() example would be strictly within
 its rights to generate a binary that contained no XOR
 instruction at all.  UB, as you surely know, isn't guaranteed to
 do something 'sensible'.
And in the BPF example, if the compiler at some point manages to
 statically figure out that regs[insn->dst_reg] is uninitialised,
 it might say "hey, I can just grab any old free register and
 declare that that's now regs[insn->dst_reg] without filling it.
 And then it can do the same for regs[insn->src_reg], or heck,
 even choose to fill that one (this is now legal even though the
 pointers alias, because you already committed UB), and do a xor
 with different regs and produce garbage results.

(In C99 it gets subtler because an 'indeterminate value' is
 defined to be 'either a valid value or a trap representation',
 so arguably the compiler can only do this stuff if it _has_
 trap representations for the type in question.)

> If that's really true such standard worth nothing.
You may be right, but plenty of compiler writers will take that
 as a reason to ignore you, and if (say) a gcc upgrade breaks
 filter.c, they will merrily close any bugs you file as NOTABUG
 or INVALID or GOAWAYWEDONTCARE.
Is this annoying?  Extremely; the XOR-clearing _would_ be fine
 if the standard had chosen to define things differently (e.g.
 it's fine under a hypothetical 'C99 but uninitialised auto
 variables have unspecified rather than indeterminate values').
I can't see a way to work around it that doesn't have a possible
 performance cost (alternatives to Alexander's MOV_IMM 0 include
 initialising regs[BPF_REG_A] and regs[BPF_REG_X] in PROG_NAME
 and PROG_NAME_ARGS), although there is the question of whether
 anyone who cares about performance (or security) will be using
 BPF without the JIT anyway.
But I don't think "Alexandar has to do the data-flow analysis in
 KMSAN" is the right answer; KMSAN's diagnostic here is _correct_
 in that ___bpf_prog_run() invokes UB on this XOR.
Now, since it would be rather difficult and pointless for the
 compiler to statically prove that the reg is uninitialised (it
 would need to generate a special code-path just for this one
 case), maybe the best thing to do is to get GCC folks to bless
 this usage (perhaps defining uninitialised variables to have
 what C99 would call an unspecified value), at which point it
 becomes defined under the "gnu89" pseudo-standard which is what
 we compile the kernel with.  At which point KMSAN can be taught
 that this is OK, and can figure out "hey, you're self-XORing an
 unspecified value, the result is determinate" and clear the
 shadow bits appropriately.

-ed

[1]: https://port70.net/~nsz/c/c89/c89-draft.html

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-05-29  0:17                         ` Edward Cree
@ 2020-05-29  6:14                           ` Dmitry Vyukov
  2020-05-29  8:46                             ` Edward Cree
  2020-05-29 12:28                           ` Alexander Potapenko
  1 sibling, 1 reply; 26+ messages in thread
From: Dmitry Vyukov @ 2020-05-29  6:14 UTC (permalink / raw)
  To: Edward Cree
  Cc: Alexei Starovoitov, Alexander Potapenko, Daniel Borkmann,
	Michal Kubecek, Alexei Starovoitov, Networking

On Fri, May 29, 2020 at 2:17 AM Edward Cree <ecree@solarflare.com> wrote:
>
> On 28/05/2020 17:00, Alexei Starovoitov wrote:
> > xoring of two identical values is undefined in standard?
> I believe it is in this case, yes; even without the complication
>  of array references that happen to alias, Alexander's foo1() is
>  undefined behaviour under C89 (and also C99 which handles the
>  case differently).
>
> From the definitions section (1.6) of the C89 draft [1]:
> > * Undefined behavior --- behavior, upon use of a nonportable or
> > erroneous program construct, of erroneous data, or of
> > indeterminately-valued objects, for which the Standard imposes
> > no requirements.
> And from 3.5.7 'Initialization':
> > If an object that has automatic storage duration is not
> > initialized explicitly, its value is indeterminate.
> Since the standard doesn't say anything about self-XORing that
>  could make it 'special' in this regard, the compiler isn't
>  required to notice that it's a self-XOR, and (in the tradition
>  of compiler-writers the world over) is entitled to optimise the
>  program based on the assumption that the programmer has not
>  committed UB, so in the foo1() example would be strictly within
>  its rights to generate a binary that contained no XOR
>  instruction at all.  UB, as you surely know, isn't guaranteed to
>  do something 'sensible'.
> And in the BPF example, if the compiler at some point manages to
>  statically figure out that regs[insn->dst_reg] is uninitialised,
>  it might say "hey, I can just grab any old free register and
>  declare that that's now regs[insn->dst_reg] without filling it.
>  And then it can do the same for regs[insn->src_reg], or heck,
>  even choose to fill that one (this is now legal even though the
>  pointers alias, because you already committed UB), and do a xor
>  with different regs and produce garbage results.
>
> (In C99 it gets subtler because an 'indeterminate value' is
>  defined to be 'either a valid value or a trap representation',
>  so arguably the compiler can only do this stuff if it _has_
>  trap representations for the type in question.)

Interesting. Are you sure that's the meaning of 'indeterminate value'?
My latest copy of the standard says:

3.19.2
1 indeterminate value
either an unspecified value or a trap representation

My reading of this would be that this only prevents things from
exploding in all possible random ways (like formatting drive). The
effects are only reduced to either getting a random value, or a trap
on access to the value. Both of these do not seem to be acceptable for
a bpf program.


> > If that's really true such standard worth nothing.
> You may be right, but plenty of compiler writers will take that
>  as a reason to ignore you, and if (say) a gcc upgrade breaks
>  filter.c, they will merrily close any bugs you file as NOTABUG
>  or INVALID or GOAWAYWEDONTCARE.
> Is this annoying?  Extremely; the XOR-clearing _would_ be fine
>  if the standard had chosen to define things differently (e.g.
>  it's fine under a hypothetical 'C99 but uninitialised auto
>  variables have unspecified rather than indeterminate values').
> I can't see a way to work around it that doesn't have a possible
>  performance cost (alternatives to Alexander's MOV_IMM 0 include
>  initialising regs[BPF_REG_A] and regs[BPF_REG_X] in PROG_NAME
>  and PROG_NAME_ARGS), although there is the question of whether
>  anyone who cares about performance (or security) will be using
>  BPF without the JIT anyway.
> But I don't think "Alexandar has to do the data-flow analysis in
>  KMSAN" is the right answer; KMSAN's diagnostic here is _correct_
>  in that ___bpf_prog_run() invokes UB on this XOR.
> Now, since it would be rather difficult and pointless for the
>  compiler to statically prove that the reg is uninitialised (it
>  would need to generate a special code-path just for this one
>  case), maybe the best thing to do is to get GCC folks to bless
>  this usage (perhaps defining uninitialised variables to have
>  what C99 would call an unspecified value), at which point it
>  becomes defined under the "gnu89" pseudo-standard which is what
>  we compile the kernel with.  At which point KMSAN can be taught
>  that this is OK, and can figure out "hey, you're self-XORing an
>  unspecified value, the result is determinate" and clear the
>  shadow bits appropriately.
>
> -ed
>
> [1]: https://port70.net/~nsz/c/c89/c89-draft.html

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-05-29  6:14                           ` Dmitry Vyukov
@ 2020-05-29  8:46                             ` Edward Cree
  2020-05-29  8:53                               ` Dmitry Vyukov
  0 siblings, 1 reply; 26+ messages in thread
From: Edward Cree @ 2020-05-29  8:46 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Alexei Starovoitov, Alexander Potapenko, Daniel Borkmann,
	Michal Kubecek, Alexei Starovoitov, Networking

On 29/05/2020 07:14, Dmitry Vyukov wrote:
>> (In C99 it gets subtler because an 'indeterminate value' is
>>  defined to be 'either a valid value or a trap representation',
>>  so arguably the compiler can only do this stuff if it _has_
>>  trap representations for the type in question.)
> Interesting. Are you sure that's the meaning of 'indeterminate value'?
> My latest copy of the standard says:
>
> 3.19.2
> 1 indeterminate value
> either an unspecified value or a trap representation
Yes, but (from N1256):
| 3.17.3
| unspecified value
| valid value of the relevant type where this International Standard
| imposes no requirements on which value is chosen in any instance
| NOTE An unspecified value cannot be a trap representation

> My reading of this would be that this only prevents things from
> exploding in all possible random ways (like formatting drive). The
> effects are only reduced to either getting a random value, or a trap
> on access to the value. Both of these do not seem to be acceptable for
> a bpf program.
A random value, XORed with itself, produces 0, which is what we want.
(XORing a trap representation with itself, of course, produces a trap.)

So it'd be fine *unless* the 'in any instance' language can be read as
 allowing the uninitialised object to have *different* random values on
 separate accesses.

-ed

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-05-29  8:46                             ` Edward Cree
@ 2020-05-29  8:53                               ` Dmitry Vyukov
  0 siblings, 0 replies; 26+ messages in thread
From: Dmitry Vyukov @ 2020-05-29  8:53 UTC (permalink / raw)
  To: Edward Cree
  Cc: Alexei Starovoitov, Alexander Potapenko, Daniel Borkmann,
	Michal Kubecek, Alexei Starovoitov, Networking

On Fri, May 29, 2020 at 10:46 AM Edward Cree <ecree@solarflare.com> wrote:
>
> On 29/05/2020 07:14, Dmitry Vyukov wrote:
> >> (In C99 it gets subtler because an 'indeterminate value' is
> >>  defined to be 'either a valid value or a trap representation',
> >>  so arguably the compiler can only do this stuff if it _has_
> >>  trap representations for the type in question.)
> > Interesting. Are you sure that's the meaning of 'indeterminate value'?
> > My latest copy of the standard says:
> >
> > 3.19.2
> > 1 indeterminate value
> > either an unspecified value or a trap representation
> Yes, but (from N1256):
> | 3.17.3
> | unspecified value
> | valid value of the relevant type where this International Standard
> | imposes no requirements on which value is chosen in any instance
> | NOTE An unspecified value cannot be a trap representation
>
> > My reading of this would be that this only prevents things from
> > exploding in all possible random ways (like formatting drive). The
> > effects are only reduced to either getting a random value, or a trap
> > on access to the value. Both of these do not seem to be acceptable for
> > a bpf program.
> A random value, XORed with itself, produces 0, which is what we want.
> (XORing a trap representation with itself, of course, produces a trap.)
>
> So it'd be fine *unless* the 'in any instance' language can be read as
>  allowing the uninitialised object to have *different* random values on
>  separate accesses.

Ah, I see. I thought the result of such XOR is redefined to be an
indeterminate value rather than UB.
Thanks

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-05-29  0:17                         ` Edward Cree
  2020-05-29  6:14                           ` Dmitry Vyukov
@ 2020-05-29 12:28                           ` Alexander Potapenko
  2020-06-01  9:55                             ` Edward Cree
  1 sibling, 1 reply; 26+ messages in thread
From: Alexander Potapenko @ 2020-05-29 12:28 UTC (permalink / raw)
  To: Edward Cree, Alexei Starovoitov
  Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov,
	Dmitriy Vyukov, Networking

On Fri, May 29, 2020 at 2:17 AM Edward Cree <ecree@solarflare.com> wrote:
>
> On 28/05/2020 17:00, Alexei Starovoitov wrote:
> > xoring of two identical values is undefined in standard?
> I believe it is in this case, yes; even without the complication
>  of array references that happen to alias, Alexander's foo1() is
>  undefined behaviour under C89 (and also C99 which handles the
>  case differently).
>
> From the definitions section (1.6) of the C89 draft [1]:
> > * Undefined behavior --- behavior, upon use of a nonportable or
> > erroneous program construct, of erroneous data, or of
> > indeterminately-valued objects, for which the Standard imposes
> > no requirements.
> And from 3.5.7 'Initialization':
> > If an object that has automatic storage duration is not
> > initialized explicitly, its value is indeterminate.
> Since the standard doesn't say anything about self-XORing that
>  could make it 'special' in this regard, the compiler isn't
>  required to notice that it's a self-XOR, and (in the tradition
>  of compiler-writers the world over) is entitled to optimise the
>  program based on the assumption that the programmer has not
>  committed UB, so in the foo1() example would be strictly within
>  its rights to generate a binary that contained no XOR
>  instruction at all.  UB, as you surely know, isn't guaranteed to
>  do something 'sensible'.
> And in the BPF example, if the compiler at some point manages to
>  statically figure out that regs[insn->dst_reg] is uninitialised,
>  it might say "hey, I can just grab any old free register and
>  declare that that's now regs[insn->dst_reg] without filling it.
>  And then it can do the same for regs[insn->src_reg], or heck,
>  even choose to fill that one (this is now legal even though the
>  pointers alias, because you already committed UB), and do a xor
>  with different regs and produce garbage results.

Thanks for this writeup!

> Is this annoying?  Extremely; the XOR-clearing _would_ be fine
>  if the standard had chosen to define things differently (e.g.
>  it's fine under a hypothetical 'C99 but uninitialised auto
>  variables have unspecified rather than indeterminate values').

I wouldn't call this particular use case "extremely annoying". I think
so far this is the only case of initializing something by XOR we've
seen with both MSan and KMSAN.

> I can't see a way to work around it that doesn't have a possible
>  performance cost (alternatives to Alexander's MOV_IMM 0 include
>  initialising regs[BPF_REG_A] and regs[BPF_REG_X] in PROG_NAME
>  and PROG_NAME_ARGS), although there is the question of whether
>  anyone who cares about performance (or security) will be using
>  BPF without the JIT anyway.

If I understand correctly, these two instructions are only executed
once per program.
Are they really expected to impact performance that much?

It's also an interesting question whether the JIT compiler emits
consistently better code for BPF_XOR than for MOV_IMM 0 on every
architecture - while "xorl %rax, %rax" is probably shorter and faster
on X86, on ARM a better alternative would be "mov w0, wzr".
If the performance is really critical here, perhaps a better
alternative is to introduce a BPF instruction (which could be an alias
of BPF_XOR REG, REG) for zeroing out a register? Then different
architectures may choose more efficient implementations for it, and
the interpreter will be just assigning zero to the register without
violating the C standard.

> But I don't think "Alexandar has to do the data-flow analysis in
>  KMSAN" is the right answer; KMSAN's diagnostic here is _correct_
>  in that ___bpf_prog_run() invokes UB on this XOR.
> Now, since it would be rather difficult and pointless for the
>  compiler to statically prove that the reg is uninitialised (it
>  would need to generate a special code-path just for this one
>  case)

The godbolt link above actually shows a case (foo3()) in which Clang
knows that a local is uninitialized and transforms it into a special
`undef` value that can be then e.g. passed around and take part in
various optimizations, making them more efficient.
I don't have evidence that such a transformation is currently possible
for the BPF code in question, but all the building blocks are there,
so it's probably just a matter of time.

>, maybe the best thing to do is to get GCC folks to bless
>  this usage (perhaps defining uninitialised variables to have
>  what C99 would call an unspecified value), at which point it
>  becomes defined under the "gnu89" pseudo-standard which is what
>  we compile the kernel with.

Given the increased popularity of Clang in the kernel these days, I
don't think it's a good idea for a single compiler to further diverge
from the standard. Again, this code pattern doesn't really seem to be
popular enough to justify such a change.



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-05-29 12:28                           ` Alexander Potapenko
@ 2020-06-01  9:55                             ` Edward Cree
  2020-06-02 13:31                               ` Alexander Potapenko
  0 siblings, 1 reply; 26+ messages in thread
From: Edward Cree @ 2020-06-01  9:55 UTC (permalink / raw)
  To: Alexander Potapenko, Alexei Starovoitov
  Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov,
	Dmitriy Vyukov, Networking

On 29/05/2020 13:28, Alexander Potapenko wrote:> If the performance is really critical here, perhaps a better
> alternative is to introduce a BPF instruction (which could be an alias
> of BPF_XOR REG, REG) for zeroing out a register? Then different
> architectures may choose more efficient implementations for it, and
> the interpreter will be just assigning zero to the register without
> violating the C standard.If it's an alias of BPF_XOR r,r, then the interpreter will surely still
 interpret it with the XOR code.  Unless you make the interpreter
 special-case this, in which case you've added an extra branch to every
 XOR the interpreter handles :(

> Given the increased popularity of Clang in the kernel these days, I
> don't think it's a good idea for a single compiler to further diverge
> from the standard.The standard in question isn't C89, but "--std=gnu89", which is
 whatever GCC says it is :grin:
So if GCC declares that some class of optimisations are not legal under
 --std=gnu89, then they're not legal and Clang has to adapt to that.

> I wouldn't call this particular use case "extremely annoying".
To be clear, what's "annoying" is the double-bind we're in as a result
 of trying to optimise the prologue for both JITs (whose semantics are
 whatever we define eBPF to be) and the interpreter (which has to be
 implemented with reasonable efficiency as C code).

> If I understand correctly, these two instructions are only executed
> once per program.
> Are they really expected to impact performance that much?
If you have a very short program that's bound to a very frequent event,
 then they might.  But I don't have, and haven't seen, any numbers...

> I don't have evidence that such a transformation is currently possible
> for the BPF code in question, but all the building blocks are there,
> so it's probably just a matter of time.
I'm not so sure.  Consider the following sequence of BPF instructions:
    xor r0, r0
    ld r0, 42
    xor r0, r0
    exit
I hope you'll agree that at entry to the second XOR, the value of r0 is
 not indeterminate.  So any optimisation the compiler does for the first
 XOR ("oh, we don't need to fill the existing regs[] values, we can just
 use whatever's already in whichever register we allocate") will need to
 be predicated on something that makes it only happen for the first XOR.
But the place it's doing this optimisation is in the interpreter, which
 is a big fetch-execute loop.  I don't think even Clang is smart enough
 to figure out "BPF programs always start with a prologue, I'll stick in
 something that knows which prologue the prog uses and branches to a
 statically compiled, optimised version thereof".
(If Clang *is* that smart, then it's too clever by half...)

-ed

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-06-01  9:55                             ` Edward Cree
@ 2020-06-02 13:31                               ` Alexander Potapenko
  2020-06-02 17:32                                 ` Alexei Starovoitov
  0 siblings, 1 reply; 26+ messages in thread
From: Alexander Potapenko @ 2020-06-02 13:31 UTC (permalink / raw)
  To: Edward Cree
  Cc: Alexei Starovoitov, Daniel Borkmann, Michal Kubecek,
	Alexei Starovoitov, Dmitriy Vyukov, Networking

On Mon, Jun 1, 2020 at 11:55 AM Edward Cree <ecree@solarflare.com> wrote:
>
> If it's an alias of BPF_XOR r,r, then the interpreter will surely still
>  interpret it with the XOR code.  Unless you make the interpreter
>  special-case this, in which case you've added an extra branch to every
>  XOR the interpreter handles :(

You are right, the interpreter can be probably fixed in a different
way that doesn't introduce extra branches.

> > Given the increased popularity of Clang in the kernel these days, I
> > don't think it's a good idea for a single compiler to further diverge
> > from the standard.The standard in question isn't C89, but "--std=gnu89", which is
>  whatever GCC says it is :grin:
> So if GCC declares that some class of optimisations are not legal under
>  --std=gnu89, then they're not legal and Clang has to adapt to that.

Hm, I never thought of such an approach to standards. This sounds like
a legitimate solution, which will however result in:
 - GCC developers saying "no, we won't do this", or
 - Clang developers saying "no, we won't do this"
, unless we present them with good reasons to do so.

Because the use-case is quite narrow, and prohibiting this
optimization may break other optimizations (at least in Clang), I
won't expect anyone spending time on making Clang compliant with
--std=gnu89 in this particular case.

> > I wouldn't call this particular use case "extremely annoying".
> To be clear, what's "annoying" is the double-bind we're in as a result
>  of trying to optimise the prologue for both JITs (whose semantics are
>  whatever we define eBPF to be) and the interpreter (which has to be
>  implemented with reasonable efficiency as C code).

> > If I understand correctly, these two instructions are only executed
> > once per program.
> > Are they really expected to impact performance that much?
> If you have a very short program that's bound to a very frequent event,
>  then they might.  But I don't have, and haven't seen, any numbers...

So we're back to the question how much people care about the
interpreter performance.

> > I don't have evidence that such a transformation is currently possible
> > for the BPF code in question, but all the building blocks are there,
> > so it's probably just a matter of time.
> I'm not so sure.  Consider the following sequence of BPF instructions:
>     xor r0, r0
>     ld r0, 42
>     xor r0, r0
>     exit
> I hope you'll agree that at entry to the second XOR, the value of r0 is
>  not indeterminate.  So any optimisation the compiler does for the first
>  XOR ("oh, we don't need to fill the existing regs[] values, we can just
>  use whatever's already in whichever register we allocate") will need to
>  be predicated on something that makes it only happen for the first XOR.
> But the place it's doing this optimisation is in the interpreter, which
>  is a big fetch-execute loop.  I don't think even Clang is smart enough
>  to figure out "BPF programs always start with a prologue, I'll stick in
>  something that knows which prologue the prog uses and branches to a
>  statically compiled, optimised version thereof".
> (If Clang *is* that smart, then it's too clever by half...)

I don't think Clang does this at the moment, but I can certainly
imagine it unrolling the first two iterations of the interpretation
loop (since the prologue instructions are known at compile time) and
replacing them with explicit XOR instructions.

I however suggest we get to the practical question of how to deal with
this issue in the long run.
Currently these error reports prevent us from testing BPF with syzbot,
so we're using https://github.com/google/kmsan/commit/69b987d53462a7f3b5a41c62eb731340b53165f8
to work around them.
This seems to be the easiest fix that doesn't affect JIT performance
and removes the UB.

Once KMSAN makes it to the mainline, we'll need to either come up with
a similar fix, or disable fuzzing BPF, which isn't what we want.

--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-06-02 13:31                               ` Alexander Potapenko
@ 2020-06-02 17:32                                 ` Alexei Starovoitov
  2020-06-03 15:37                                   ` Edward Cree
  0 siblings, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2020-06-02 17:32 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Edward Cree, Daniel Borkmann, Michal Kubecek, Alexei Starovoitov,
	Dmitriy Vyukov, Networking

On Tue, Jun 02, 2020 at 03:31:40PM +0200, Alexander Potapenko wrote:
> 
> So we're back to the question how much people care about the
> interpreter performance.

No. The question is whether people care about UB fuzzing.

> 
> > > I don't have evidence that such a transformation is currently possible
> > > for the BPF code in question, but all the building blocks are there,
> > > so it's probably just a matter of time.
> > I'm not so sure.  Consider the following sequence of BPF instructions:
> >     xor r0, r0
> >     ld r0, 42
> >     xor r0, r0
> >     exit
> > I hope you'll agree that at entry to the second XOR, the value of r0 is
> >  not indeterminate.  So any optimisation the compiler does for the first
> >  XOR ("oh, we don't need to fill the existing regs[] values, we can just
> >  use whatever's already in whichever register we allocate") will need to
> >  be predicated on something that makes it only happen for the first XOR.
> > But the place it's doing this optimisation is in the interpreter, which
> >  is a big fetch-execute loop.  I don't think even Clang is smart enough
> >  to figure out "BPF programs always start with a prologue, I'll stick in
> >  something that knows which prologue the prog uses and branches to a
> >  statically compiled, optimised version thereof".
> > (If Clang *is* that smart, then it's too clever by half...)
> 
> I don't think Clang does this at the moment, but I can certainly
> imagine it unrolling the first two iterations of the interpretation
> loop (since the prologue instructions are known at compile time) and
> replacing them with explicit XOR instructions.
> 
> I however suggest we get to the practical question of how to deal with
> this issue in the long run.
> Currently these error reports prevent us from testing BPF with syzbot,
> so we're using https://github.com/google/kmsan/commit/69b987d53462a7f3b5a41c62eb731340b53165f8
> to work around them.
> This seems to be the easiest fix that doesn't affect JIT performance
> and removes the UB.
> 
> Once KMSAN makes it to the mainline, we'll need to either come up with
> a similar fix, or disable fuzzing BPF, which isn't what we want.

Disabling fuzzing altogether would be unfortunate,
but disabling UB fuzzing is perfectly fine.
I don't think UB fuzzer scratched the surface yet.
See fixup_bpf_calls() for example. It's the same UB due to uninit regs.
And I have to add the same Nack on messing with it.
There could be other places in bpf codegen. We're not going to
chase them one by one because standard says it's UB and interpreter
is written in C. The target for bpf codegen is JITs. JITs on some arch
optimize mov r,0 into xor. Some dont. bpf interpreter is simulating hw.
If gcc/clang will ever mess it with it to the point that this UB
will become an issue we will rewrite certain pieces of interpreter in asm.
For now if you want UB fuzzer running in your environment please add
_out_of_tree_ patch that inits all interpreter registers to zero.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-06-02 17:32                                 ` Alexei Starovoitov
@ 2020-06-03 15:37                                   ` Edward Cree
  2020-06-03 16:33                                     ` Alexander Potapenko
  0 siblings, 1 reply; 26+ messages in thread
From: Edward Cree @ 2020-06-03 15:37 UTC (permalink / raw)
  To: Alexei Starovoitov, Alexander Potapenko
  Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov,
	Dmitriy Vyukov, Networking

On 02/06/2020 18:32, Alexei Starovoitov wrote:
> The target for bpf codegen is JITs.
> bpf interpreter is simulating hw.
> For now if you want UB fuzzer running in your environment please add
> _out_of_tree_ patch that inits all interpreter registers to zero.
+1 to all the above.

Also, note that you can still fuzz BPF JITs by building the kernel
 without the interpreter: CONFIG_BPF_JIT_ALWAYS_ON.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Self-XORing BPF registers is undefined behavior
  2020-06-03 15:37                                   ` Edward Cree
@ 2020-06-03 16:33                                     ` Alexander Potapenko
  0 siblings, 0 replies; 26+ messages in thread
From: Alexander Potapenko @ 2020-06-03 16:33 UTC (permalink / raw)
  To: Edward Cree
  Cc: Alexei Starovoitov, Daniel Borkmann, Michal Kubecek,
	Alexei Starovoitov, Dmitriy Vyukov, Networking

On Wed, Jun 3, 2020 at 5:37 PM Edward Cree <ecree@solarflare.com> wrote:
>
> On 02/06/2020 18:32, Alexei Starovoitov wrote:
> > The target for bpf codegen is JITs.
> > bpf interpreter is simulating hw.
> > For now if you want UB fuzzer running in your environment please add
> > _out_of_tree_ patch that inits all interpreter registers to zero.
> +1 to all the above.

Noted, thank you.

> Also, note that you can still fuzz BPF JITs by building the kernel
>  without the interpreter: CONFIG_BPF_JIT_ALWAYS_ON.

Unfortunately KMSAN doesn't play well with JITed code. To be able to
detect uninit bugs in JIT, we'll need to instrument the generated code
as well.


-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2020-06-03 16:33 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-13 11:00 Self-XORing BPF registers is undefined behavior Alexander Potapenko
2018-12-13 11:06 ` Eric Dumazet
2018-12-13 11:23   ` Alexander Potapenko
2018-12-13 11:59 ` Michal Kubecek
2018-12-13 12:20   ` Michal Kubecek
2018-12-13 12:24     ` Alexander Potapenko
2018-12-13 13:18       ` Daniel Borkmann
2018-12-13 14:54         ` Daniel Borkmann
2018-12-18 14:36           ` Alexander Potapenko
2020-05-27 15:52             ` Alexander Potapenko
2020-05-27 16:58               ` Alexei Starovoitov
2020-05-27 17:12                 ` Alexander Potapenko
2020-05-27 17:14                   ` Alexei Starovoitov
2020-05-28  9:54                     ` Alexander Potapenko
2020-05-28 16:00                       ` Alexei Starovoitov
2020-05-29  0:17                         ` Edward Cree
2020-05-29  6:14                           ` Dmitry Vyukov
2020-05-29  8:46                             ` Edward Cree
2020-05-29  8:53                               ` Dmitry Vyukov
2020-05-29 12:28                           ` Alexander Potapenko
2020-06-01  9:55                             ` Edward Cree
2020-06-02 13:31                               ` Alexander Potapenko
2020-06-02 17:32                                 ` Alexei Starovoitov
2020-06-03 15:37                                   ` Edward Cree
2020-06-03 16:33                                     ` Alexander Potapenko
2018-12-18 14:09         ` Alexander Potapenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).