* Self-XORing BPF registers is undefined behavior @ 2018-12-13 11:00 Alexander Potapenko 2018-12-13 11:06 ` Eric Dumazet 2018-12-13 11:59 ` Michal Kubecek 0 siblings, 2 replies; 26+ messages in thread From: Alexander Potapenko @ 2018-12-13 11:00 UTC (permalink / raw) To: ast, daniel; +Cc: Dmitriy Vyukov, Networking Hi BPF maintainers, some time ago KMSAN found an issue in BPF code which we decided to suppress at that point, but now I'd like to bring it to your attention. Namely, some BPF programs may contain instructions that XOR a register with itself. This effectively results in the following C code: regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; or regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X]; being executed. According to the C11 standard this is undefined behavior, so KMSAN reports an error in this case. Do you think it's feasible to explicitly initialize the register values like it's done here: https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d ? Thanks, Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2018-12-13 11:00 Self-XORing BPF registers is undefined behavior Alexander Potapenko @ 2018-12-13 11:06 ` Eric Dumazet 2018-12-13 11:23 ` Alexander Potapenko 2018-12-13 11:59 ` Michal Kubecek 1 sibling, 1 reply; 26+ messages in thread From: Eric Dumazet @ 2018-12-13 11:06 UTC (permalink / raw) To: Alexander Potapenko, ast, daniel; +Cc: Dmitriy Vyukov, Networking On 12/13/2018 03:00 AM, Alexander Potapenko wrote: > Hi BPF maintainers, > > some time ago KMSAN found an issue in BPF code which we decided to > suppress at that point, but now I'd like to bring it to your > attention. > Namely, some BPF programs may contain instructions that XOR a register > with itself. > This effectively results in the following C code: > regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; > or > regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X]; > being executed. > > According to the C11 standard this is undefined behavior, so KMSAN > reports an error in this case. eBPF is not C11 ;) XOR boolean operation on a cpu is following boolean logic, which is much stronger than any C standard. > > Do you think it's feasible to explicitly initialize the register > values like it's done here: > https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d > ? > > Thanks, > Alexander Potapenko > Software Engineer > > Google Germany GmbH > Erika-Mann-Straße, 33 > 80636 München > > Geschäftsführer: Paul Manicle, Halimah DeLaine Prado > Registergericht und -nummer: Hamburg, HRB 86891 > Sitz der Gesellschaft: Hamburg > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2018-12-13 11:06 ` Eric Dumazet @ 2018-12-13 11:23 ` Alexander Potapenko 0 siblings, 0 replies; 26+ messages in thread From: Alexander Potapenko @ 2018-12-13 11:23 UTC (permalink / raw) To: Eric Dumazet; +Cc: ast, daniel, Dmitriy Vyukov, Networking On Thu, Dec 13, 2018 at 12:06 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > On 12/13/2018 03:00 AM, Alexander Potapenko wrote: > > Hi BPF maintainers, > > > > some time ago KMSAN found an issue in BPF code which we decided to > > suppress at that point, but now I'd like to bring it to your > > attention. > > Namely, some BPF programs may contain instructions that XOR a register > > with itself. > > This effectively results in the following C code: > > regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; > > or > > regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X]; > > being executed. > > > > According to the C11 standard this is undefined behavior, so KMSAN > > reports an error in this case. > > eBPF is not C11 ;) And is planning to stay such forever? :) I'm not a language lawyer, so I can't tell for sure if this is valid in C99 either. I think the term "trap representation" had already been there. > XOR boolean operation on a cpu is following boolean logic, which is much stronger than > any C standard. Yes, this is true if we compile a eBPF program into assembly code. But if the JIT is off, it ends up being interpreted by ___bpf_prog_run(), which just executes C code from a big switch: ... ALU_XOR_X: regs[insn->dst_reg] = regs[insn->dst_reg] ^ regs[insn->src_reg]; ... Note that it's even unknown at compile time that dst_reg and src_reg are the same. > > > > > Do you think it's feasible to explicitly initialize the register > > values like it's done here: > > https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d > > ? > > > > Thanks, > > Alexander Potapenko > > Software Engineer > > > > Google Germany GmbH > > Erika-Mann-Straße, 33 > > 80636 München > > > > Geschäftsführer: Paul Manicle, Halimah DeLaine Prado > > Registergericht und -nummer: Hamburg, HRB 86891 > > Sitz der Gesellschaft: Hamburg > > -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2018-12-13 11:00 Self-XORing BPF registers is undefined behavior Alexander Potapenko 2018-12-13 11:06 ` Eric Dumazet @ 2018-12-13 11:59 ` Michal Kubecek 2018-12-13 12:20 ` Michal Kubecek 1 sibling, 1 reply; 26+ messages in thread From: Michal Kubecek @ 2018-12-13 11:59 UTC (permalink / raw) To: Alexander Potapenko; +Cc: ast, daniel, Dmitriy Vyukov, Networking On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote: > Hi BPF maintainers, > > some time ago KMSAN found an issue in BPF code which we decided to > suppress at that point, but now I'd like to bring it to your > attention. > Namely, some BPF programs may contain instructions that XOR a register > with itself. > This effectively results in the following C code: > regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; > or > regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X]; > being executed. > > According to the C11 standard this is undefined behavior, so KMSAN > reports an error in this case. Can you quote the part of the standard saying this is undefined behavior? I couldn't find anything else than If the value being stored in an object is read from another object that overlaps in any way the storage of the first object, then the overlap shall be exact and the two objects shall have qualified or unqualified versions of a compatible type; otherwise, the behavior is undefined. (but I only have a draft for obvious reasons). I'm not sure what exactly they mean by "exact overlap" and the standard doesn't seem to define the term but if the two objects are actually the same, they certainly have compatible types. > > Do you think it's feasible to explicitly initialize the register > values like it's done here: > https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d > ? Wouldn't that mean we still end up with undefined behavior whenever a cBPF program explicitly uses the xor with itself to zero a register? Michal Kubecek ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2018-12-13 11:59 ` Michal Kubecek @ 2018-12-13 12:20 ` Michal Kubecek 2018-12-13 12:24 ` Alexander Potapenko 0 siblings, 1 reply; 26+ messages in thread From: Michal Kubecek @ 2018-12-13 12:20 UTC (permalink / raw) To: Alexander Potapenko; +Cc: ast, daniel, Dmitriy Vyukov, Networking On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote: > On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote: > > Hi BPF maintainers, > > > > some time ago KMSAN found an issue in BPF code which we decided to > > suppress at that point, but now I'd like to bring it to your > > attention. > > Namely, some BPF programs may contain instructions that XOR a register > > with itself. > > This effectively results in the following C code: > > regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; > > or > > regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X]; > > being executed. > > > > According to the C11 standard this is undefined behavior, so KMSAN > > reports an error in this case. > > Can you quote the part of the standard saying this is undefined > behavior? I couldn't find anything else than > > If the value being stored in an object is read from another object > that overlaps in any way the storage of the first object, then the > overlap shall be exact and the two objects shall have qualified or > unqualified versions of a compatible type; otherwise, the behavior > is undefined. > > (but I only have a draft for obvious reasons). I'm not sure what exactly > they mean by "exact overlap" and the standard doesn't seem to define > the term but if the two objects are actually the same, they certainly > have compatible types. I think I understand now. You didn't want to say that the statement regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; as such is undefined behavior but that it's UB when regs[BPF_REG_A] is uninitialized. Right? Michal Kubecek ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2018-12-13 12:20 ` Michal Kubecek @ 2018-12-13 12:24 ` Alexander Potapenko 2018-12-13 13:18 ` Daniel Borkmann 0 siblings, 1 reply; 26+ messages in thread From: Alexander Potapenko @ 2018-12-13 12:24 UTC (permalink / raw) To: mkubecek; +Cc: ast, daniel, Dmitriy Vyukov, Networking On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@suse.cz> wrote: > > On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote: > > On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote: > > > Hi BPF maintainers, > > > > > > some time ago KMSAN found an issue in BPF code which we decided to > > > suppress at that point, but now I'd like to bring it to your > > > attention. > > > Namely, some BPF programs may contain instructions that XOR a register > > > with itself. > > > This effectively results in the following C code: > > > regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; > > > or > > > regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X]; > > > being executed. > > > > > > According to the C11 standard this is undefined behavior, so KMSAN > > > reports an error in this case. > > > > Can you quote the part of the standard saying this is undefined > > behavior? I couldn't find anything else than > > > > If the value being stored in an object is read from another object > > that overlaps in any way the storage of the first object, then the > > overlap shall be exact and the two objects shall have qualified or > > unqualified versions of a compatible type; otherwise, the behavior > > is undefined. > > > > (but I only have a draft for obvious reasons). I'm not sure what exactly > > they mean by "exact overlap" and the standard doesn't seem to define > > the term but if the two objects are actually the same, they certainly > > have compatible types. > > > I think I understand now. You didn't want to say that the statement > > regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; > > as such is undefined behavior but that it's UB when regs[BPF_REG_A] is > uninitialized. Right? Yes. Sorry for being unclear. By default regs[] is uninitialized, so we need to initialize it before using the register values. I am also wondering if it's possible to simply copy the uninitialized register values from regs[] to the userspace via maps. > Michal Kubecek -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2018-12-13 12:24 ` Alexander Potapenko @ 2018-12-13 13:18 ` Daniel Borkmann 2018-12-13 14:54 ` Daniel Borkmann 2018-12-18 14:09 ` Alexander Potapenko 0 siblings, 2 replies; 26+ messages in thread From: Daniel Borkmann @ 2018-12-13 13:18 UTC (permalink / raw) To: Alexander Potapenko, mkubecek; +Cc: ast, Dmitriy Vyukov, Networking On 12/13/2018 01:24 PM, Alexander Potapenko wrote: > On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@suse.cz> wrote: >> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote: >>> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote: >>>> Hi BPF maintainers, >>>> >>>> some time ago KMSAN found an issue in BPF code which we decided to >>>> suppress at that point, but now I'd like to bring it to your >>>> attention. >>>> Namely, some BPF programs may contain instructions that XOR a register >>>> with itself. >>>> This effectively results in the following C code: >>>> regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; >>>> or >>>> regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X]; >>>> being executed. >>>> >>>> According to the C11 standard this is undefined behavior, so KMSAN >>>> reports an error in this case. Can you elaborate / quote the exact bit from C11 that KMSAN is referring to? (The below that Michal was quoting or something else?) Does that only refer to C11 standard? Note that kernel's Makefile +430 explicitly states 'std=gnu89' and not 'std=c11' [0]. [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=51b97e354ba9fce1890cf38ecc754aa49677fc89 >>> Can you quote the part of the standard saying this is undefined >>> behavior? I couldn't find anything else than >>> >>> If the value being stored in an object is read from another object >>> that overlaps in any way the storage of the first object, then the >>> overlap shall be exact and the two objects shall have qualified or >>> unqualified versions of a compatible type; otherwise, the behavior >>> is undefined. >>> >>> (but I only have a draft for obvious reasons). I'm not sure what exactly >>> they mean by "exact overlap" and the standard doesn't seem to define >>> the term but if the two objects are actually the same, they certainly >>> have compatible types. Here is an example for the overlap quoted above; I don't think this applies to our case since it would be "exact". Quote [1]: struct S { int x; int y; }; struct T { int z; struct S s; }; union U { struct S f ; struct T g; } u; main(){ u.f = u.g.s; return 0; } [1] https://bts.frama-c.com/print_bug_page.php?bug_id=945 >> I think I understand now. You didn't want to say that the statement >> >> regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; >> >> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is >> uninitialized. Right? > Yes. Sorry for being unclear. > By default regs[] is uninitialized, so we need to initialize it before > using the register values. > I am also wondering if it's possible to simply copy the uninitialized > register values from regs[] to the userspace via maps. >> Michal Kubecek > > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2018-12-13 13:18 ` Daniel Borkmann @ 2018-12-13 14:54 ` Daniel Borkmann 2018-12-18 14:36 ` Alexander Potapenko 2018-12-18 14:09 ` Alexander Potapenko 1 sibling, 1 reply; 26+ messages in thread From: Daniel Borkmann @ 2018-12-13 14:54 UTC (permalink / raw) To: Alexander Potapenko, mkubecek; +Cc: ast, Dmitriy Vyukov, Networking On 12/13/2018 02:18 PM, Daniel Borkmann wrote: > On 12/13/2018 01:24 PM, Alexander Potapenko wrote: >> On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@suse.cz> wrote: >>> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote: >>>> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote: >>>>> Hi BPF maintainers, >>>>> >>>>> some time ago KMSAN found an issue in BPF code which we decided to >>>>> suppress at that point, but now I'd like to bring it to your >>>>> attention. >>>>> Namely, some BPF programs may contain instructions that XOR a register >>>>> with itself. >>>>> This effectively results in the following C code: >>>>> regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; >>>>> or >>>>> regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X]; >>>>> being executed. >>>>> >>>>> According to the C11 standard this is undefined behavior, so KMSAN >>>>> reports an error in this case. > > Can you elaborate / quote the exact bit from C11 that KMSAN is referring > to? (The below that Michal was quoting or something else?) > > Does that only refer to C11 standard? Note that kernel's Makefile +430 > explicitly states 'std=gnu89' and not 'std=c11' [0]. > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=51b97e354ba9fce1890cf38ecc754aa49677fc89 > >>>> Can you quote the part of the standard saying this is undefined >>>> behavior? I couldn't find anything else than >>>> >>>> If the value being stored in an object is read from another object >>>> that overlaps in any way the storage of the first object, then the >>>> overlap shall be exact and the two objects shall have qualified or >>>> unqualified versions of a compatible type; otherwise, the behavior >>>> is undefined. >>>> >>>> (but I only have a draft for obvious reasons). I'm not sure what exactly >>>> they mean by "exact overlap" and the standard doesn't seem to define >>>> the term but if the two objects are actually the same, they certainly >>>> have compatible types. > > Here is an example for the overlap quoted above; I don't think this > applies to our case since it would be "exact". Quote [1]: > > struct S { int x; int y; }; > struct T { int z; struct S s; }; > union U { struct S f ; struct T g; } u; > > main(){ > u.f = u.g.s; > return 0; > } > > [1] https://bts.frama-c.com/print_bug_page.php?bug_id=945 > >>> I think I understand now. You didn't want to say that the statement >>> >>> regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; >>> >>> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is >>> uninitialized. Right? >> Yes. Sorry for being unclear. >> By default regs[] is uninitialized, so we need to initialize it before >> using the register values. >> I am also wondering if it's possible to simply copy the uninitialized >> register values from regs[] to the userspace via maps. Nope, not possible. And to elaborate on cBPF / eBPF cases: # cat /proc/sys/net/core/bpf_jit_enable 0 For the eBPF case, lets take (test_verifier.c): { "xor test", .insns = { BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2), BPF_MOV64_REG(BPF_REG_0, BPF_REG_2), BPF_EXIT_INSN(), }, .result = ACCEPT, .retval = 0, }, # ./test_verifier 0 #0/u xor test FAIL Failed to load prog 'Permission denied'! 0: (af) r2 ^= r2 R2 !read_ok #0/p xor test FAIL Failed to load prog 'Permission denied'! 0: (af) r2 ^= r2 R2 !read_ok Summary: 0 PASSED, 0 SKIPPED, 2 FAILED And for the initialized case: { "xor test", .insns = { BPF_MOV64_IMM(BPF_REG_2, 42), BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2), BPF_MOV64_REG(BPF_REG_0, BPF_REG_2), BPF_EXIT_INSN(), }, .result = ACCEPT, .retval = 0, }, # ./test_verifier 0 #0/u xor test OK #0/p xor test OK Summary: 2 PASSED, 0 SKIPPED, 0 FAILED And cBPF (test_bpf.c): { "xor test", .u.insns = { BPF_STMT(BPF_RET | BPF_A, 0) }, CLASSIC, { }, { { 0, 0 }, }, }, [...] [88819.276142] test_bpf: #0 xor test jited:0 PASS [...] Given latter two use the same underlying interpreter implementation for the 'good' and 'bad' case, I don't see how compiler would have any degree of freedom to judge about the runtime dependent BPF insn input stream here. If it would assume UB, then also 'good' xor case would be broken and miscompiled which it is clearly not (even for some very obvious case below). While KMSAN is analyzing runtime memory access, compiler cannot.. # cat foo.c #include <stdio.h> int main(void) { int foo = foo ^ foo; printf("%u\n", foo); return 0; } # clang -Wall -O2 foo.c foo.c:4:12: warning: variable 'foo' is uninitialized when used within its own initialization [-Wuninitialized] int foo = foo ^ foo; ~~~ ^~~ # ./a.out 0 disasm: 00000000004004d0 <main>: 4004d0: 50 push %rax 4004d1: bf 74 05 40 00 mov $0x400574,%edi 4004d6: 31 f6 xor %esi,%esi 4004d8: 31 c0 xor %eax,%eax 4004da: e8 f1 fe ff ff callq 4003d0 <printf@plt> 4004df: 31 c0 xor %eax,%eax 4004e1: 59 pop %rcx 4004e2: c3 retq 4004e3: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 4004ea: 00 00 00 4004ed: 0f 1f 00 nopl (%rax) # gcc -Wall -O2 foo.c (neither complains with using -Wextra) # ./a.out 0 disasm: 0000000000000560 <main>: 560: 48 8d 35 ad 01 00 00 lea 0x1ad(%rip),%rsi # 714 <_IO_stdin_used+0x4> 567: 48 83 ec 08 sub $0x8,%rsp 56b: 31 d2 xor %edx,%edx 56d: bf 01 00 00 00 mov $0x1,%edi 572: 31 c0 xor %eax,%eax 574: e8 c7 ff ff ff callq 540 <__printf_chk@plt> 579: 31 c0 xor %eax,%eax 57b: 48 83 c4 08 add $0x8,%rsp 57f: c3 retq Cheers, Daniel ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2018-12-13 14:54 ` Daniel Borkmann @ 2018-12-18 14:36 ` Alexander Potapenko 2020-05-27 15:52 ` Alexander Potapenko 0 siblings, 1 reply; 26+ messages in thread From: Alexander Potapenko @ 2018-12-18 14:36 UTC (permalink / raw) To: daniel; +Cc: mkubecek, ast, Dmitriy Vyukov, Networking On Thu, Dec 13, 2018 at 3:54 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > On 12/13/2018 02:18 PM, Daniel Borkmann wrote: > > On 12/13/2018 01:24 PM, Alexander Potapenko wrote: > >> On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@suse.cz> wrote: > >>> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote: > >>>> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote: > >>>>> Hi BPF maintainers, > >>>>> > >>>>> some time ago KMSAN found an issue in BPF code which we decided to > >>>>> suppress at that point, but now I'd like to bring it to your > >>>>> attention. > >>>>> Namely, some BPF programs may contain instructions that XOR a register > >>>>> with itself. > >>>>> This effectively results in the following C code: > >>>>> regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; > >>>>> or > >>>>> regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X]; > >>>>> being executed. > >>>>> > >>>>> According to the C11 standard this is undefined behavior, so KMSAN > >>>>> reports an error in this case. > > > > Can you elaborate / quote the exact bit from C11 that KMSAN is referring > > to? (The below that Michal was quoting or something else?) > > > > Does that only refer to C11 standard? Note that kernel's Makefile +430 > > explicitly states 'std=gnu89' and not 'std=c11' [0]. > > > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=51b97e354ba9fce1890cf38ecc754aa49677fc89 > > > >>>> Can you quote the part of the standard saying this is undefined > >>>> behavior? I couldn't find anything else than > >>>> > >>>> If the value being stored in an object is read from another object > >>>> that overlaps in any way the storage of the first object, then the > >>>> overlap shall be exact and the two objects shall have qualified or > >>>> unqualified versions of a compatible type; otherwise, the behavior > >>>> is undefined. > >>>> > >>>> (but I only have a draft for obvious reasons). I'm not sure what exactly > >>>> they mean by "exact overlap" and the standard doesn't seem to define > >>>> the term but if the two objects are actually the same, they certainly > >>>> have compatible types. > > > > Here is an example for the overlap quoted above; I don't think this > > applies to our case since it would be "exact". Quote [1]: > > > > struct S { int x; int y; }; > > struct T { int z; struct S s; }; > > union U { struct S f ; struct T g; } u; > > > > main(){ > > u.f = u.g.s; > > return 0; > > } > > > > [1] https://bts.frama-c.com/print_bug_page.php?bug_id=945 > > > >>> I think I understand now. You didn't want to say that the statement > >>> > >>> regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; > >>> > >>> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is > >>> uninitialized. Right? > >> Yes. Sorry for being unclear. > >> By default regs[] is uninitialized, so we need to initialize it before > >> using the register values. > >> I am also wondering if it's possible to simply copy the uninitialized > >> register values from regs[] to the userspace via maps. > > Nope, not possible. And to elaborate on cBPF / eBPF cases: If you mean that it's not possible to generate a eBPF program that XORs an uninitialized register with itself, you may be actually right. I've reverted https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d, and syzkaller couldn't find a program triggering this behavior so far. Perhaps something had changed in eBPF code since I encountered this error. I'll be watching the dashboard and will let you know if I have a reliable reproducer for the aforementioned problem. Thanks for checking! > # cat /proc/sys/net/core/bpf_jit_enable > 0 > > For the eBPF case, lets take (test_verifier.c): > > { > "xor test", > .insns = { > BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2), > BPF_MOV64_REG(BPF_REG_0, BPF_REG_2), > BPF_EXIT_INSN(), > }, > .result = ACCEPT, > .retval = 0, > }, > > # ./test_verifier 0 > #0/u xor test FAIL > Failed to load prog 'Permission denied'! > 0: (af) r2 ^= r2 > R2 !read_ok > #0/p xor test FAIL > Failed to load prog 'Permission denied'! > 0: (af) r2 ^= r2 > R2 !read_ok > Summary: 0 PASSED, 0 SKIPPED, 2 FAILED > > And for the initialized case: > > { > "xor test", > .insns = { > BPF_MOV64_IMM(BPF_REG_2, 42), > BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2), > BPF_MOV64_REG(BPF_REG_0, BPF_REG_2), > BPF_EXIT_INSN(), > }, > .result = ACCEPT, > .retval = 0, > }, > > # ./test_verifier 0 > #0/u xor test OK > #0/p xor test OK > Summary: 2 PASSED, 0 SKIPPED, 0 FAILED > > And cBPF (test_bpf.c): > > { > "xor test", > .u.insns = { > BPF_STMT(BPF_RET | BPF_A, 0) > }, > CLASSIC, > { }, > { { 0, 0 }, }, > }, > > [...] > [88819.276142] test_bpf: #0 xor test jited:0 PASS > [...] > > Given latter two use the same underlying interpreter implementation for > the 'good' and 'bad' case, I don't see how compiler would have any degree > of freedom to judge about the runtime dependent BPF insn input stream > here. If it would assume UB, then also 'good' xor case would be broken > and miscompiled which it is clearly not (even for some very obvious case > below). While KMSAN is analyzing runtime memory access, compiler cannot.. IIUC some hardware, like IA64, may do these checks at runtime (see https://blogs.msdn.microsoft.com/oldnewthing/20150727-00/?p=90821 for an example) Please note that referring to results of compiling a finite number of programs with UB with a finite number of compilers doesn't prove anything. It won't help much in the case tomorrow e.g. GCC changes its behavior to make use of some new hardware feature. > # cat foo.c > #include <stdio.h> > int main(void) > { > int foo = foo ^ foo; > printf("%u\n", foo); > return 0; > } > > # clang -Wall -O2 foo.c > foo.c:4:12: warning: variable 'foo' is uninitialized when used within its own initialization [-Wuninitialized] > int foo = foo ^ foo; > ~~~ ^~~ > # ./a.out > 0 > > disasm: > > 00000000004004d0 <main>: > 4004d0: 50 push %rax > 4004d1: bf 74 05 40 00 mov $0x400574,%edi > 4004d6: 31 f6 xor %esi,%esi > 4004d8: 31 c0 xor %eax,%eax > 4004da: e8 f1 fe ff ff callq 4003d0 <printf@plt> > 4004df: 31 c0 xor %eax,%eax > 4004e1: 59 pop %rcx > 4004e2: c3 retq > 4004e3: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) > 4004ea: 00 00 00 > 4004ed: 0f 1f 00 nopl (%rax) > > # gcc -Wall -O2 foo.c (neither complains with using -Wextra) > # ./a.out > 0 > > disasm: > > 0000000000000560 <main>: > 560: 48 8d 35 ad 01 00 00 lea 0x1ad(%rip),%rsi # 714 <_IO_stdin_used+0x4> > 567: 48 83 ec 08 sub $0x8,%rsp > 56b: 31 d2 xor %edx,%edx > 56d: bf 01 00 00 00 mov $0x1,%edi > 572: 31 c0 xor %eax,%eax > 574: e8 c7 ff ff ff callq 540 <__printf_chk@plt> > 579: 31 c0 xor %eax,%eax > 57b: 48 83 c4 08 add $0x8,%rsp > 57f: c3 retq > > Cheers, > Daniel -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2018-12-18 14:36 ` Alexander Potapenko @ 2020-05-27 15:52 ` Alexander Potapenko 2020-05-27 16:58 ` Alexei Starovoitov 0 siblings, 1 reply; 26+ messages in thread From: Alexander Potapenko @ 2020-05-27 15:52 UTC (permalink / raw) To: daniel; +Cc: mkubecek, ast, Dmitriy Vyukov, Networking [-- Attachment #1: Type: text/plain, Size: 6771 bytes --] On Tue, Dec 18, 2018 at 3:36 PM Alexander Potapenko <glider@google.com> wrote: > > On Thu, Dec 13, 2018 at 3:54 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > > > On 12/13/2018 02:18 PM, Daniel Borkmann wrote: > > > On 12/13/2018 01:24 PM, Alexander Potapenko wrote: > > >> On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@suse.cz> wrote: > > >>> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote: > > >>>> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote: > > >>>>> Hi BPF maintainers, > > >>>>> > > >>>>> some time ago KMSAN found an issue in BPF code which we decided to > > >>>>> suppress at that point, but now I'd like to bring it to your > > >>>>> attention. > > >>>>> Namely, some BPF programs may contain instructions that XOR a register > > >>>>> with itself. > > >>>>> This effectively results in the following C code: > > >>>>> regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; > > >>>>> or > > >>>>> regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X]; > > >>>>> being executed. > > >>>>> > > >>>>> According to the C11 standard this is undefined behavior, so KMSAN > > >>>>> reports an error in this case. > > > > > > Can you elaborate / quote the exact bit from C11 that KMSAN is referring > > > to? (The below that Michal was quoting or something else?) > > > > > > Does that only refer to C11 standard? Note that kernel's Makefile +430 > > > explicitly states 'std=gnu89' and not 'std=c11' [0]. > > > > > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=51b97e354ba9fce1890cf38ecc754aa49677fc89 > > > > > >>>> Can you quote the part of the standard saying this is undefined > > >>>> behavior? I couldn't find anything else than > > >>>> > > >>>> If the value being stored in an object is read from another object > > >>>> that overlaps in any way the storage of the first object, then the > > >>>> overlap shall be exact and the two objects shall have qualified or > > >>>> unqualified versions of a compatible type; otherwise, the behavior > > >>>> is undefined. > > >>>> > > >>>> (but I only have a draft for obvious reasons). I'm not sure what exactly > > >>>> they mean by "exact overlap" and the standard doesn't seem to define > > >>>> the term but if the two objects are actually the same, they certainly > > >>>> have compatible types. > > > > > > Here is an example for the overlap quoted above; I don't think this > > > applies to our case since it would be "exact". Quote [1]: > > > > > > struct S { int x; int y; }; > > > struct T { int z; struct S s; }; > > > union U { struct S f ; struct T g; } u; > > > > > > main(){ > > > u.f = u.g.s; > > > return 0; > > > } > > > > > > [1] https://bts.frama-c.com/print_bug_page.php?bug_id=945 > > > > > >>> I think I understand now. You didn't want to say that the statement > > >>> > > >>> regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; > > >>> > > >>> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is > > >>> uninitialized. Right? > > >> Yes. Sorry for being unclear. > > >> By default regs[] is uninitialized, so we need to initialize it before > > >> using the register values. > > >> I am also wondering if it's possible to simply copy the uninitialized > > >> register values from regs[] to the userspace via maps. > > > > Nope, not possible. And to elaborate on cBPF / eBPF cases: > If you mean that it's not possible to generate a eBPF program that > XORs an uninitialized register with itself, you may be actually right. > I've reverted https://github.com/google/kmsan/commit/813c0f3d45ebfa321d70b4b06cc054518dd1d90d, > and syzkaller couldn't find a program triggering this behavior so far. > Perhaps something had changed in eBPF code since I encountered this error. > I'll be watching the dashboard and will let you know if I have a > reliable reproducer for the aforementioned problem. > Thanks for checking! Hi again, similar bugs have started showing up recently. When I run the attached program (note it uses SO_SECURITY_AUTHENTICATION, which as far as I understand is a no-op) on the KMSAN-enabled kernel (currently using v5.7-rc4) I see the following errors: ===================================================== BUG: KMSAN: uninit-value in packet_rcv_fanout+0x242b/0x25a0 net/packet/af_packet.c:1463 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 packet_rcv_fanout+0x242b/0x25a0 net/packet/af_packet.c:1463 deliver_skb net/core/dev.c:2168 __netif_receive_skb_core+0x1434/0x5860 net/core/dev.c:5052 __netif_receive_skb_list_core+0x315/0x1380 net/core/dev.c:5264 __netif_receive_skb_list net/core/dev.c:5331 netif_receive_skb_list_internal+0xf54/0x1600 net/core/dev.c:5426 gro_normal_list net/core/dev.c:5537 napi_complete_done+0x2ef/0xb40 net/core/dev.c:6258 e1000_clean+0x1bc8/0x5d80 drivers/net/ethernet/intel/e1000/e1000_main.c:3802 napi_poll net/core/dev.c:6572 ... Uninit was stored to memory at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 kmsan_internal_chain_origin+0xad/0x130 mm/kmsan/kmsan.c:310 __msan_chain_origin+0x50/0x90 mm/kmsan/kmsan_instr.c:165 ___bpf_prog_run+0x68fa/0x9300 kernel/bpf/core.c:1408 __bpf_prog_run32+0x101/0x170 kernel/bpf/core.c:1681 bpf_dispatcher_nop_func ./include/linux/bpf.h:545 bpf_prog_run_pin_on_cpu ./include/linux/filter.h:599 bpf_prog_run_clear_cb ./include/linux/filter.h:721 fanout_demux_bpf net/packet/af_packet.c:1404 packet_rcv_fanout+0x517/0x25a0 net/packet/af_packet.c:1456 deliver_skb net/core/dev.c:2168 ... Local variable ----regs@__bpf_prog_run32 created at: __bpf_prog_run32+0x87/0x170 kernel/bpf/core.c:1681 __bpf_prog_run32+0x87/0x170 kernel/bpf/core.c:1681 ===================================================== This basically means that BPF's output register was uninitialized when ___bpf_prog_run() returned. When I replace the lines initializing registers A and X in net/core/filter.c: - *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A); - *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X); with + *new_insn++ = BPF_MOV32_IMM(BPF_REG_A, 0); + *new_insn++ = BPF_MOV32_IMM(BPF_REG_X, 0); , the bug goes away, therefore I think it's being caused by XORing the initially uninitialized registers with themselves. kernel/bpf/core.c:1408, where the uninitialized value was stored to memory, points to the "ALU(ADD, +)" macro in ___bpf_prog_run(). But the debug info seems to be incorrect here: if I comment this line out and unroll the macro manually, KMSAN points to "ALU(SUB, -)". Most certainly it's actually one of the XOR instruction declarations. Do you think it makes sense to use the UB-proof BPF_MOV32_IMM instructions to initialize the registers? [-- Attachment #2: bpf2.c --] [-- Type: text/x-csrc, Size: 596 bytes --] // autogenerated by syzkaller (https://github.com/google/syzkaller) #include <linux/filter.h> #include <stdint.h> #include <sys/mman.h> #include <sys/socket.h> #include <unistd.h> int main(void) { int sock = socket(PF_PACKET, SOCK_RAW, 0x300); if (sock == -1) return 1; uint16_t rcv_arg[2] = {0, 6}; setsockopt(sock, SOL_PACKET, /*SO_RCVLOWAT*/0x12, &rcv_arg, sizeof(rcv_arg)); struct sock_filter code[] = { {0x16, 0, 0, 0} }; struct sock_fprog bpf = {1, code}; setsockopt(sock, SOL_PACKET, /*SO_SECURITY_AUTHENTICATION*/0x16, &bpf, sizeof(bpf)); sleep(10); return 0; } ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-05-27 15:52 ` Alexander Potapenko @ 2020-05-27 16:58 ` Alexei Starovoitov 2020-05-27 17:12 ` Alexander Potapenko 0 siblings, 1 reply; 26+ messages in thread From: Alexei Starovoitov @ 2020-05-27 16:58 UTC (permalink / raw) To: Alexander Potapenko Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Dmitriy Vyukov, Networking On Wed, May 27, 2020 at 8:52 AM Alexander Potapenko <glider@google.com> wrote: > > This basically means that BPF's output register was uninitialized when > ___bpf_prog_run() returned. > > When I replace the lines initializing registers A and X in net/core/filter.c: > > - *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A); > - *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X); > > with > > + *new_insn++ = BPF_MOV32_IMM(BPF_REG_A, 0); > + *new_insn++ = BPF_MOV32_IMM(BPF_REG_X, 0); > > , the bug goes away, therefore I think it's being caused by XORing the > initially uninitialized registers with themselves. > > kernel/bpf/core.c:1408, where the uninitialized value was stored to > memory, points to the "ALU(ADD, +)" macro in ___bpf_prog_run(). > But the debug info seems to be incorrect here: if I comment this line > out and unroll the macro manually, KMSAN points to "ALU(SUB, -)". > Most certainly it's actually one of the XOR instruction declarations. > > Do you think it makes sense to use the UB-proof BPF_MOV32_IMM > instructions to initialize the registers? I think it's better for UBsan to get smarter about xor-ing. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-05-27 16:58 ` Alexei Starovoitov @ 2020-05-27 17:12 ` Alexander Potapenko 2020-05-27 17:14 ` Alexei Starovoitov 0 siblings, 1 reply; 26+ messages in thread From: Alexander Potapenko @ 2020-05-27 17:12 UTC (permalink / raw) To: Alexei Starovoitov Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Dmitriy Vyukov, Networking On Wed, May 27, 2020 at 6:58 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Wed, May 27, 2020 at 8:52 AM Alexander Potapenko <glider@google.com> wrote: > > > > This basically means that BPF's output register was uninitialized when > > ___bpf_prog_run() returned. > > > > When I replace the lines initializing registers A and X in net/core/filter.c: > > > > - *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A); > > - *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X); > > > > with > > > > + *new_insn++ = BPF_MOV32_IMM(BPF_REG_A, 0); > > + *new_insn++ = BPF_MOV32_IMM(BPF_REG_X, 0); > > > > , the bug goes away, therefore I think it's being caused by XORing the > > initially uninitialized registers with themselves. > > > > kernel/bpf/core.c:1408, where the uninitialized value was stored to > > memory, points to the "ALU(ADD, +)" macro in ___bpf_prog_run(). > > But the debug info seems to be incorrect here: if I comment this line > > out and unroll the macro manually, KMSAN points to "ALU(SUB, -)". > > Most certainly it's actually one of the XOR instruction declarations. > > > > Do you think it makes sense to use the UB-proof BPF_MOV32_IMM > > instructions to initialize the registers? > > I think it's better for UBsan to get smarter about xor-ing. Could you please elaborate on this? How exactly should KMSAN handle this situation? Note that despite the source says "BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);", it doesn't necessarily boil down to an expression like A = A ^ A. It's more likely that temporary values will be involved, making it quite hard to figure out whether the two operands are really the same. For an expression like A = B ^ C, KMSAN just calculates A's shadow bits based on the values and the shadow bits of B and C. If either B or C is uninitialized, the result will be uninitialized as well, even if both B and C contain the same values. It's therefore strange to expect the tool to behave differently if B and C are the same variable. -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-05-27 17:12 ` Alexander Potapenko @ 2020-05-27 17:14 ` Alexei Starovoitov 2020-05-28 9:54 ` Alexander Potapenko 0 siblings, 1 reply; 26+ messages in thread From: Alexei Starovoitov @ 2020-05-27 17:14 UTC (permalink / raw) To: Alexander Potapenko Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Dmitriy Vyukov, Networking On Wed, May 27, 2020 at 10:12 AM Alexander Potapenko <glider@google.com> wrote: > > On Wed, May 27, 2020 at 6:58 PM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Wed, May 27, 2020 at 8:52 AM Alexander Potapenko <glider@google.com> wrote: > > > > > > This basically means that BPF's output register was uninitialized when > > > ___bpf_prog_run() returned. > > > > > > When I replace the lines initializing registers A and X in net/core/filter.c: > > > > > > - *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A); > > > - *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X); > > > > > > with > > > > > > + *new_insn++ = BPF_MOV32_IMM(BPF_REG_A, 0); > > > + *new_insn++ = BPF_MOV32_IMM(BPF_REG_X, 0); > > > > > > , the bug goes away, therefore I think it's being caused by XORing the > > > initially uninitialized registers with themselves. > > > > > > kernel/bpf/core.c:1408, where the uninitialized value was stored to > > > memory, points to the "ALU(ADD, +)" macro in ___bpf_prog_run(). > > > But the debug info seems to be incorrect here: if I comment this line > > > out and unroll the macro manually, KMSAN points to "ALU(SUB, -)". > > > Most certainly it's actually one of the XOR instruction declarations. > > > > > > Do you think it makes sense to use the UB-proof BPF_MOV32_IMM > > > instructions to initialize the registers? > > > > I think it's better for UBsan to get smarter about xor-ing. > > Could you please elaborate on this? How exactly should KMSAN handle > this situation? > Note that despite the source says "BPF_ALU32_REG(BPF_XOR, BPF_REG_A, > BPF_REG_A);", it doesn't necessarily boil down to an expression like A > = A ^ A. It's more likely that temporary values will be involved, > making it quite hard to figure out whether the two operands are really > the same. I really don't know who to make it smarter. It's your area of expertise. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-05-27 17:14 ` Alexei Starovoitov @ 2020-05-28 9:54 ` Alexander Potapenko 2020-05-28 16:00 ` Alexei Starovoitov 0 siblings, 1 reply; 26+ messages in thread From: Alexander Potapenko @ 2020-05-28 9:54 UTC (permalink / raw) To: Alexei Starovoitov Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Dmitriy Vyukov, Networking On Wed, May 27, 2020 at 7:15 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Wed, May 27, 2020 at 10:12 AM Alexander Potapenko <glider@google.com> wrote: > > > > On Wed, May 27, 2020 at 6:58 PM Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > > > > > > On Wed, May 27, 2020 at 8:52 AM Alexander Potapenko <glider@google.com> wrote: > > > > > > > > This basically means that BPF's output register was uninitialized when > > > > ___bpf_prog_run() returned. > > > > > > > > When I replace the lines initializing registers A and X in net/core/filter.c: > > > > > > > > - *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A); > > > > - *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X); > > > > > > > > with > > > > > > > > + *new_insn++ = BPF_MOV32_IMM(BPF_REG_A, 0); > > > > + *new_insn++ = BPF_MOV32_IMM(BPF_REG_X, 0); > > > > > > > > , the bug goes away, therefore I think it's being caused by XORing the > > > > initially uninitialized registers with themselves. > > > > > > > > kernel/bpf/core.c:1408, where the uninitialized value was stored to > > > > memory, points to the "ALU(ADD, +)" macro in ___bpf_prog_run(). > > > > But the debug info seems to be incorrect here: if I comment this line > > > > out and unroll the macro manually, KMSAN points to "ALU(SUB, -)". > > > > Most certainly it's actually one of the XOR instruction declarations. > > > > > > > > Do you think it makes sense to use the UB-proof BPF_MOV32_IMM > > > > instructions to initialize the registers? > > > > > > I think it's better for UBsan to get smarter about xor-ing. > > > > Could you please elaborate on this? How exactly should KMSAN handle > > this situation? > > Note that despite the source says "BPF_ALU32_REG(BPF_XOR, BPF_REG_A, > > BPF_REG_A);", it doesn't necessarily boil down to an expression like A > > = A ^ A. It's more likely that temporary values will be involved, > > making it quite hard to figure out whether the two operands are really > > the same. > > I really don't know who to make it smarter. It's your area of expertise. The point I am trying to make is that BPF is relying on undefined behavior (see the quotations from C89 standard) here for no apparent reason. This code might be working with the current set of compilers/optimizations, but will it pay off to debug it when something breaks? The C implementation for BPF_XOR in kernel/bpf/core.c is translated to: regs[insn->dst_reg] = regs[insn->dst_reg] ^ regs[insn->src_reg]; , at which point the compiler already can't tell whether dst_reg and src_reg are the same. In fact Clang may already generate unexpected code for such cases: see the assembly for foo3() in https://godbolt.org/z/VoMHaC. I therefore think it's better to fix the buggy code unless there are strong reasons not to do so, rather than teach the tool how to work around this particular bug. -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-05-28 9:54 ` Alexander Potapenko @ 2020-05-28 16:00 ` Alexei Starovoitov 2020-05-29 0:17 ` Edward Cree 0 siblings, 1 reply; 26+ messages in thread From: Alexei Starovoitov @ 2020-05-28 16:00 UTC (permalink / raw) To: Alexander Potapenko Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Dmitriy Vyukov, Networking On Thu, May 28, 2020 at 2:54 AM Alexander Potapenko <glider@google.com> wrote: > > On Wed, May 27, 2020 at 7:15 PM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Wed, May 27, 2020 at 10:12 AM Alexander Potapenko <glider@google.com> wrote: > > > > > > On Wed, May 27, 2020 at 6:58 PM Alexei Starovoitov > > > <alexei.starovoitov@gmail.com> wrote: > > > > > > > > On Wed, May 27, 2020 at 8:52 AM Alexander Potapenko <glider@google.com> wrote: > > > > > > > > > > This basically means that BPF's output register was uninitialized when > > > > > ___bpf_prog_run() returned. > > > > > > > > > > When I replace the lines initializing registers A and X in net/core/filter.c: > > > > > > > > > > - *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A); > > > > > - *new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X); > > > > > > > > > > with > > > > > > > > > > + *new_insn++ = BPF_MOV32_IMM(BPF_REG_A, 0); > > > > > + *new_insn++ = BPF_MOV32_IMM(BPF_REG_X, 0); > > > > > > > > > > , the bug goes away, therefore I think it's being caused by XORing the > > > > > initially uninitialized registers with themselves. > > > > > > > > > > kernel/bpf/core.c:1408, where the uninitialized value was stored to > > > > > memory, points to the "ALU(ADD, +)" macro in ___bpf_prog_run(). > > > > > But the debug info seems to be incorrect here: if I comment this line > > > > > out and unroll the macro manually, KMSAN points to "ALU(SUB, -)". > > > > > Most certainly it's actually one of the XOR instruction declarations. > > > > > > > > > > Do you think it makes sense to use the UB-proof BPF_MOV32_IMM > > > > > instructions to initialize the registers? > > > > > > > > I think it's better for UBsan to get smarter about xor-ing. > > > > > > Could you please elaborate on this? How exactly should KMSAN handle > > > this situation? > > > Note that despite the source says "BPF_ALU32_REG(BPF_XOR, BPF_REG_A, > > > BPF_REG_A);", it doesn't necessarily boil down to an expression like A > > > = A ^ A. It's more likely that temporary values will be involved, > > > making it quite hard to figure out whether the two operands are really > > > the same. > > > > I really don't know who to make it smarter. It's your area of expertise. > > The point I am trying to make is that BPF is relying on undefined > behavior (see the quotations from C89 standard) here for no apparent > reason. > This code might be working with the current set of > compilers/optimizations, but will it pay off to debug it when > something breaks? xoring of two identical values is undefined in standard? If that's really true such standard worth nothing. > The C implementation for BPF_XOR in kernel/bpf/core.c is translated to: > regs[insn->dst_reg] = regs[insn->dst_reg] ^ regs[insn->src_reg]; > , at which point the compiler already can't tell whether dst_reg and > src_reg are the same. of course. compile can use different cpu regs. > In fact Clang may already generate unexpected code for such cases: see > the assembly for foo3() in https://godbolt.org/z/VoMHaC. "unexpected" meaning that xor has different regs? > I therefore think it's better to fix the buggy code unless there are > strong reasons not to do so, rather than teach the tool how to work > around this particular bug. The tool needs to do data flow analysis to realize that the source data in different regs is the same, hence xoring them will produce zero. That could be super hard to achieve in practice and will take years of work to implement KMSAN, but that's what you have to do. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-05-28 16:00 ` Alexei Starovoitov @ 2020-05-29 0:17 ` Edward Cree 2020-05-29 6:14 ` Dmitry Vyukov 2020-05-29 12:28 ` Alexander Potapenko 0 siblings, 2 replies; 26+ messages in thread From: Edward Cree @ 2020-05-29 0:17 UTC (permalink / raw) To: Alexei Starovoitov, Alexander Potapenko Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Dmitriy Vyukov, Networking On 28/05/2020 17:00, Alexei Starovoitov wrote: > xoring of two identical values is undefined in standard? I believe it is in this case, yes; even without the complication of array references that happen to alias, Alexander's foo1() is undefined behaviour under C89 (and also C99 which handles the case differently). From the definitions section (1.6) of the C89 draft [1]: > * Undefined behavior --- behavior, upon use of a nonportable or > erroneous program construct, of erroneous data, or of > indeterminately-valued objects, for which the Standard imposes > no requirements. And from 3.5.7 'Initialization': > If an object that has automatic storage duration is not > initialized explicitly, its value is indeterminate. Since the standard doesn't say anything about self-XORing that could make it 'special' in this regard, the compiler isn't required to notice that it's a self-XOR, and (in the tradition of compiler-writers the world over) is entitled to optimise the program based on the assumption that the programmer has not committed UB, so in the foo1() example would be strictly within its rights to generate a binary that contained no XOR instruction at all. UB, as you surely know, isn't guaranteed to do something 'sensible'. And in the BPF example, if the compiler at some point manages to statically figure out that regs[insn->dst_reg] is uninitialised, it might say "hey, I can just grab any old free register and declare that that's now regs[insn->dst_reg] without filling it. And then it can do the same for regs[insn->src_reg], or heck, even choose to fill that one (this is now legal even though the pointers alias, because you already committed UB), and do a xor with different regs and produce garbage results. (In C99 it gets subtler because an 'indeterminate value' is defined to be 'either a valid value or a trap representation', so arguably the compiler can only do this stuff if it _has_ trap representations for the type in question.) > If that's really true such standard worth nothing. You may be right, but plenty of compiler writers will take that as a reason to ignore you, and if (say) a gcc upgrade breaks filter.c, they will merrily close any bugs you file as NOTABUG or INVALID or GOAWAYWEDONTCARE. Is this annoying? Extremely; the XOR-clearing _would_ be fine if the standard had chosen to define things differently (e.g. it's fine under a hypothetical 'C99 but uninitialised auto variables have unspecified rather than indeterminate values'). I can't see a way to work around it that doesn't have a possible performance cost (alternatives to Alexander's MOV_IMM 0 include initialising regs[BPF_REG_A] and regs[BPF_REG_X] in PROG_NAME and PROG_NAME_ARGS), although there is the question of whether anyone who cares about performance (or security) will be using BPF without the JIT anyway. But I don't think "Alexandar has to do the data-flow analysis in KMSAN" is the right answer; KMSAN's diagnostic here is _correct_ in that ___bpf_prog_run() invokes UB on this XOR. Now, since it would be rather difficult and pointless for the compiler to statically prove that the reg is uninitialised (it would need to generate a special code-path just for this one case), maybe the best thing to do is to get GCC folks to bless this usage (perhaps defining uninitialised variables to have what C99 would call an unspecified value), at which point it becomes defined under the "gnu89" pseudo-standard which is what we compile the kernel with. At which point KMSAN can be taught that this is OK, and can figure out "hey, you're self-XORing an unspecified value, the result is determinate" and clear the shadow bits appropriately. -ed [1]: https://port70.net/~nsz/c/c89/c89-draft.html ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-05-29 0:17 ` Edward Cree @ 2020-05-29 6:14 ` Dmitry Vyukov 2020-05-29 8:46 ` Edward Cree 2020-05-29 12:28 ` Alexander Potapenko 1 sibling, 1 reply; 26+ messages in thread From: Dmitry Vyukov @ 2020-05-29 6:14 UTC (permalink / raw) To: Edward Cree Cc: Alexei Starovoitov, Alexander Potapenko, Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Networking On Fri, May 29, 2020 at 2:17 AM Edward Cree <ecree@solarflare.com> wrote: > > On 28/05/2020 17:00, Alexei Starovoitov wrote: > > xoring of two identical values is undefined in standard? > I believe it is in this case, yes; even without the complication > of array references that happen to alias, Alexander's foo1() is > undefined behaviour under C89 (and also C99 which handles the > case differently). > > From the definitions section (1.6) of the C89 draft [1]: > > * Undefined behavior --- behavior, upon use of a nonportable or > > erroneous program construct, of erroneous data, or of > > indeterminately-valued objects, for which the Standard imposes > > no requirements. > And from 3.5.7 'Initialization': > > If an object that has automatic storage duration is not > > initialized explicitly, its value is indeterminate. > Since the standard doesn't say anything about self-XORing that > could make it 'special' in this regard, the compiler isn't > required to notice that it's a self-XOR, and (in the tradition > of compiler-writers the world over) is entitled to optimise the > program based on the assumption that the programmer has not > committed UB, so in the foo1() example would be strictly within > its rights to generate a binary that contained no XOR > instruction at all. UB, as you surely know, isn't guaranteed to > do something 'sensible'. > And in the BPF example, if the compiler at some point manages to > statically figure out that regs[insn->dst_reg] is uninitialised, > it might say "hey, I can just grab any old free register and > declare that that's now regs[insn->dst_reg] without filling it. > And then it can do the same for regs[insn->src_reg], or heck, > even choose to fill that one (this is now legal even though the > pointers alias, because you already committed UB), and do a xor > with different regs and produce garbage results. > > (In C99 it gets subtler because an 'indeterminate value' is > defined to be 'either a valid value or a trap representation', > so arguably the compiler can only do this stuff if it _has_ > trap representations for the type in question.) Interesting. Are you sure that's the meaning of 'indeterminate value'? My latest copy of the standard says: 3.19.2 1 indeterminate value either an unspecified value or a trap representation My reading of this would be that this only prevents things from exploding in all possible random ways (like formatting drive). The effects are only reduced to either getting a random value, or a trap on access to the value. Both of these do not seem to be acceptable for a bpf program. > > If that's really true such standard worth nothing. > You may be right, but plenty of compiler writers will take that > as a reason to ignore you, and if (say) a gcc upgrade breaks > filter.c, they will merrily close any bugs you file as NOTABUG > or INVALID or GOAWAYWEDONTCARE. > Is this annoying? Extremely; the XOR-clearing _would_ be fine > if the standard had chosen to define things differently (e.g. > it's fine under a hypothetical 'C99 but uninitialised auto > variables have unspecified rather than indeterminate values'). > I can't see a way to work around it that doesn't have a possible > performance cost (alternatives to Alexander's MOV_IMM 0 include > initialising regs[BPF_REG_A] and regs[BPF_REG_X] in PROG_NAME > and PROG_NAME_ARGS), although there is the question of whether > anyone who cares about performance (or security) will be using > BPF without the JIT anyway. > But I don't think "Alexandar has to do the data-flow analysis in > KMSAN" is the right answer; KMSAN's diagnostic here is _correct_ > in that ___bpf_prog_run() invokes UB on this XOR. > Now, since it would be rather difficult and pointless for the > compiler to statically prove that the reg is uninitialised (it > would need to generate a special code-path just for this one > case), maybe the best thing to do is to get GCC folks to bless > this usage (perhaps defining uninitialised variables to have > what C99 would call an unspecified value), at which point it > becomes defined under the "gnu89" pseudo-standard which is what > we compile the kernel with. At which point KMSAN can be taught > that this is OK, and can figure out "hey, you're self-XORing an > unspecified value, the result is determinate" and clear the > shadow bits appropriately. > > -ed > > [1]: https://port70.net/~nsz/c/c89/c89-draft.html ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-05-29 6:14 ` Dmitry Vyukov @ 2020-05-29 8:46 ` Edward Cree 2020-05-29 8:53 ` Dmitry Vyukov 0 siblings, 1 reply; 26+ messages in thread From: Edward Cree @ 2020-05-29 8:46 UTC (permalink / raw) To: Dmitry Vyukov Cc: Alexei Starovoitov, Alexander Potapenko, Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Networking On 29/05/2020 07:14, Dmitry Vyukov wrote: >> (In C99 it gets subtler because an 'indeterminate value' is >> defined to be 'either a valid value or a trap representation', >> so arguably the compiler can only do this stuff if it _has_ >> trap representations for the type in question.) > Interesting. Are you sure that's the meaning of 'indeterminate value'? > My latest copy of the standard says: > > 3.19.2 > 1 indeterminate value > either an unspecified value or a trap representation Yes, but (from N1256): | 3.17.3 | unspecified value | valid value of the relevant type where this International Standard | imposes no requirements on which value is chosen in any instance | NOTE An unspecified value cannot be a trap representation > My reading of this would be that this only prevents things from > exploding in all possible random ways (like formatting drive). The > effects are only reduced to either getting a random value, or a trap > on access to the value. Both of these do not seem to be acceptable for > a bpf program. A random value, XORed with itself, produces 0, which is what we want. (XORing a trap representation with itself, of course, produces a trap.) So it'd be fine *unless* the 'in any instance' language can be read as allowing the uninitialised object to have *different* random values on separate accesses. -ed ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-05-29 8:46 ` Edward Cree @ 2020-05-29 8:53 ` Dmitry Vyukov 0 siblings, 0 replies; 26+ messages in thread From: Dmitry Vyukov @ 2020-05-29 8:53 UTC (permalink / raw) To: Edward Cree Cc: Alexei Starovoitov, Alexander Potapenko, Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Networking On Fri, May 29, 2020 at 10:46 AM Edward Cree <ecree@solarflare.com> wrote: > > On 29/05/2020 07:14, Dmitry Vyukov wrote: > >> (In C99 it gets subtler because an 'indeterminate value' is > >> defined to be 'either a valid value or a trap representation', > >> so arguably the compiler can only do this stuff if it _has_ > >> trap representations for the type in question.) > > Interesting. Are you sure that's the meaning of 'indeterminate value'? > > My latest copy of the standard says: > > > > 3.19.2 > > 1 indeterminate value > > either an unspecified value or a trap representation > Yes, but (from N1256): > | 3.17.3 > | unspecified value > | valid value of the relevant type where this International Standard > | imposes no requirements on which value is chosen in any instance > | NOTE An unspecified value cannot be a trap representation > > > My reading of this would be that this only prevents things from > > exploding in all possible random ways (like formatting drive). The > > effects are only reduced to either getting a random value, or a trap > > on access to the value. Both of these do not seem to be acceptable for > > a bpf program. > A random value, XORed with itself, produces 0, which is what we want. > (XORing a trap representation with itself, of course, produces a trap.) > > So it'd be fine *unless* the 'in any instance' language can be read as > allowing the uninitialised object to have *different* random values on > separate accesses. Ah, I see. I thought the result of such XOR is redefined to be an indeterminate value rather than UB. Thanks ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-05-29 0:17 ` Edward Cree 2020-05-29 6:14 ` Dmitry Vyukov @ 2020-05-29 12:28 ` Alexander Potapenko 2020-06-01 9:55 ` Edward Cree 1 sibling, 1 reply; 26+ messages in thread From: Alexander Potapenko @ 2020-05-29 12:28 UTC (permalink / raw) To: Edward Cree, Alexei Starovoitov Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Dmitriy Vyukov, Networking On Fri, May 29, 2020 at 2:17 AM Edward Cree <ecree@solarflare.com> wrote: > > On 28/05/2020 17:00, Alexei Starovoitov wrote: > > xoring of two identical values is undefined in standard? > I believe it is in this case, yes; even without the complication > of array references that happen to alias, Alexander's foo1() is > undefined behaviour under C89 (and also C99 which handles the > case differently). > > From the definitions section (1.6) of the C89 draft [1]: > > * Undefined behavior --- behavior, upon use of a nonportable or > > erroneous program construct, of erroneous data, or of > > indeterminately-valued objects, for which the Standard imposes > > no requirements. > And from 3.5.7 'Initialization': > > If an object that has automatic storage duration is not > > initialized explicitly, its value is indeterminate. > Since the standard doesn't say anything about self-XORing that > could make it 'special' in this regard, the compiler isn't > required to notice that it's a self-XOR, and (in the tradition > of compiler-writers the world over) is entitled to optimise the > program based on the assumption that the programmer has not > committed UB, so in the foo1() example would be strictly within > its rights to generate a binary that contained no XOR > instruction at all. UB, as you surely know, isn't guaranteed to > do something 'sensible'. > And in the BPF example, if the compiler at some point manages to > statically figure out that regs[insn->dst_reg] is uninitialised, > it might say "hey, I can just grab any old free register and > declare that that's now regs[insn->dst_reg] without filling it. > And then it can do the same for regs[insn->src_reg], or heck, > even choose to fill that one (this is now legal even though the > pointers alias, because you already committed UB), and do a xor > with different regs and produce garbage results. Thanks for this writeup! > Is this annoying? Extremely; the XOR-clearing _would_ be fine > if the standard had chosen to define things differently (e.g. > it's fine under a hypothetical 'C99 but uninitialised auto > variables have unspecified rather than indeterminate values'). I wouldn't call this particular use case "extremely annoying". I think so far this is the only case of initializing something by XOR we've seen with both MSan and KMSAN. > I can't see a way to work around it that doesn't have a possible > performance cost (alternatives to Alexander's MOV_IMM 0 include > initialising regs[BPF_REG_A] and regs[BPF_REG_X] in PROG_NAME > and PROG_NAME_ARGS), although there is the question of whether > anyone who cares about performance (or security) will be using > BPF without the JIT anyway. If I understand correctly, these two instructions are only executed once per program. Are they really expected to impact performance that much? It's also an interesting question whether the JIT compiler emits consistently better code for BPF_XOR than for MOV_IMM 0 on every architecture - while "xorl %rax, %rax" is probably shorter and faster on X86, on ARM a better alternative would be "mov w0, wzr". If the performance is really critical here, perhaps a better alternative is to introduce a BPF instruction (which could be an alias of BPF_XOR REG, REG) for zeroing out a register? Then different architectures may choose more efficient implementations for it, and the interpreter will be just assigning zero to the register without violating the C standard. > But I don't think "Alexandar has to do the data-flow analysis in > KMSAN" is the right answer; KMSAN's diagnostic here is _correct_ > in that ___bpf_prog_run() invokes UB on this XOR. > Now, since it would be rather difficult and pointless for the > compiler to statically prove that the reg is uninitialised (it > would need to generate a special code-path just for this one > case) The godbolt link above actually shows a case (foo3()) in which Clang knows that a local is uninitialized and transforms it into a special `undef` value that can be then e.g. passed around and take part in various optimizations, making them more efficient. I don't have evidence that such a transformation is currently possible for the BPF code in question, but all the building blocks are there, so it's probably just a matter of time. >, maybe the best thing to do is to get GCC folks to bless > this usage (perhaps defining uninitialised variables to have > what C99 would call an unspecified value), at which point it > becomes defined under the "gnu89" pseudo-standard which is what > we compile the kernel with. Given the increased popularity of Clang in the kernel these days, I don't think it's a good idea for a single compiler to further diverge from the standard. Again, this code pattern doesn't really seem to be popular enough to justify such a change. -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-05-29 12:28 ` Alexander Potapenko @ 2020-06-01 9:55 ` Edward Cree 2020-06-02 13:31 ` Alexander Potapenko 0 siblings, 1 reply; 26+ messages in thread From: Edward Cree @ 2020-06-01 9:55 UTC (permalink / raw) To: Alexander Potapenko, Alexei Starovoitov Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Dmitriy Vyukov, Networking On 29/05/2020 13:28, Alexander Potapenko wrote:> If the performance is really critical here, perhaps a better > alternative is to introduce a BPF instruction (which could be an alias > of BPF_XOR REG, REG) for zeroing out a register? Then different > architectures may choose more efficient implementations for it, and > the interpreter will be just assigning zero to the register without > violating the C standard.If it's an alias of BPF_XOR r,r, then the interpreter will surely still interpret it with the XOR code. Unless you make the interpreter special-case this, in which case you've added an extra branch to every XOR the interpreter handles :( > Given the increased popularity of Clang in the kernel these days, I > don't think it's a good idea for a single compiler to further diverge > from the standard.The standard in question isn't C89, but "--std=gnu89", which is whatever GCC says it is :grin: So if GCC declares that some class of optimisations are not legal under --std=gnu89, then they're not legal and Clang has to adapt to that. > I wouldn't call this particular use case "extremely annoying". To be clear, what's "annoying" is the double-bind we're in as a result of trying to optimise the prologue for both JITs (whose semantics are whatever we define eBPF to be) and the interpreter (which has to be implemented with reasonable efficiency as C code). > If I understand correctly, these two instructions are only executed > once per program. > Are they really expected to impact performance that much? If you have a very short program that's bound to a very frequent event, then they might. But I don't have, and haven't seen, any numbers... > I don't have evidence that such a transformation is currently possible > for the BPF code in question, but all the building blocks are there, > so it's probably just a matter of time. I'm not so sure. Consider the following sequence of BPF instructions: xor r0, r0 ld r0, 42 xor r0, r0 exit I hope you'll agree that at entry to the second XOR, the value of r0 is not indeterminate. So any optimisation the compiler does for the first XOR ("oh, we don't need to fill the existing regs[] values, we can just use whatever's already in whichever register we allocate") will need to be predicated on something that makes it only happen for the first XOR. But the place it's doing this optimisation is in the interpreter, which is a big fetch-execute loop. I don't think even Clang is smart enough to figure out "BPF programs always start with a prologue, I'll stick in something that knows which prologue the prog uses and branches to a statically compiled, optimised version thereof". (If Clang *is* that smart, then it's too clever by half...) -ed ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-06-01 9:55 ` Edward Cree @ 2020-06-02 13:31 ` Alexander Potapenko 2020-06-02 17:32 ` Alexei Starovoitov 0 siblings, 1 reply; 26+ messages in thread From: Alexander Potapenko @ 2020-06-02 13:31 UTC (permalink / raw) To: Edward Cree Cc: Alexei Starovoitov, Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Dmitriy Vyukov, Networking On Mon, Jun 1, 2020 at 11:55 AM Edward Cree <ecree@solarflare.com> wrote: > > If it's an alias of BPF_XOR r,r, then the interpreter will surely still > interpret it with the XOR code. Unless you make the interpreter > special-case this, in which case you've added an extra branch to every > XOR the interpreter handles :( You are right, the interpreter can be probably fixed in a different way that doesn't introduce extra branches. > > Given the increased popularity of Clang in the kernel these days, I > > don't think it's a good idea for a single compiler to further diverge > > from the standard.The standard in question isn't C89, but "--std=gnu89", which is > whatever GCC says it is :grin: > So if GCC declares that some class of optimisations are not legal under > --std=gnu89, then they're not legal and Clang has to adapt to that. Hm, I never thought of such an approach to standards. This sounds like a legitimate solution, which will however result in: - GCC developers saying "no, we won't do this", or - Clang developers saying "no, we won't do this" , unless we present them with good reasons to do so. Because the use-case is quite narrow, and prohibiting this optimization may break other optimizations (at least in Clang), I won't expect anyone spending time on making Clang compliant with --std=gnu89 in this particular case. > > I wouldn't call this particular use case "extremely annoying". > To be clear, what's "annoying" is the double-bind we're in as a result > of trying to optimise the prologue for both JITs (whose semantics are > whatever we define eBPF to be) and the interpreter (which has to be > implemented with reasonable efficiency as C code). > > If I understand correctly, these two instructions are only executed > > once per program. > > Are they really expected to impact performance that much? > If you have a very short program that's bound to a very frequent event, > then they might. But I don't have, and haven't seen, any numbers... So we're back to the question how much people care about the interpreter performance. > > I don't have evidence that such a transformation is currently possible > > for the BPF code in question, but all the building blocks are there, > > so it's probably just a matter of time. > I'm not so sure. Consider the following sequence of BPF instructions: > xor r0, r0 > ld r0, 42 > xor r0, r0 > exit > I hope you'll agree that at entry to the second XOR, the value of r0 is > not indeterminate. So any optimisation the compiler does for the first > XOR ("oh, we don't need to fill the existing regs[] values, we can just > use whatever's already in whichever register we allocate") will need to > be predicated on something that makes it only happen for the first XOR. > But the place it's doing this optimisation is in the interpreter, which > is a big fetch-execute loop. I don't think even Clang is smart enough > to figure out "BPF programs always start with a prologue, I'll stick in > something that knows which prologue the prog uses and branches to a > statically compiled, optimised version thereof". > (If Clang *is* that smart, then it's too clever by half...) I don't think Clang does this at the moment, but I can certainly imagine it unrolling the first two iterations of the interpretation loop (since the prologue instructions are known at compile time) and replacing them with explicit XOR instructions. I however suggest we get to the practical question of how to deal with this issue in the long run. Currently these error reports prevent us from testing BPF with syzbot, so we're using https://github.com/google/kmsan/commit/69b987d53462a7f3b5a41c62eb731340b53165f8 to work around them. This seems to be the easiest fix that doesn't affect JIT performance and removes the UB. Once KMSAN makes it to the mainline, we'll need to either come up with a similar fix, or disable fuzzing BPF, which isn't what we want. -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-06-02 13:31 ` Alexander Potapenko @ 2020-06-02 17:32 ` Alexei Starovoitov 2020-06-03 15:37 ` Edward Cree 0 siblings, 1 reply; 26+ messages in thread From: Alexei Starovoitov @ 2020-06-02 17:32 UTC (permalink / raw) To: Alexander Potapenko Cc: Edward Cree, Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Dmitriy Vyukov, Networking On Tue, Jun 02, 2020 at 03:31:40PM +0200, Alexander Potapenko wrote: > > So we're back to the question how much people care about the > interpreter performance. No. The question is whether people care about UB fuzzing. > > > > I don't have evidence that such a transformation is currently possible > > > for the BPF code in question, but all the building blocks are there, > > > so it's probably just a matter of time. > > I'm not so sure. Consider the following sequence of BPF instructions: > > xor r0, r0 > > ld r0, 42 > > xor r0, r0 > > exit > > I hope you'll agree that at entry to the second XOR, the value of r0 is > > not indeterminate. So any optimisation the compiler does for the first > > XOR ("oh, we don't need to fill the existing regs[] values, we can just > > use whatever's already in whichever register we allocate") will need to > > be predicated on something that makes it only happen for the first XOR. > > But the place it's doing this optimisation is in the interpreter, which > > is a big fetch-execute loop. I don't think even Clang is smart enough > > to figure out "BPF programs always start with a prologue, I'll stick in > > something that knows which prologue the prog uses and branches to a > > statically compiled, optimised version thereof". > > (If Clang *is* that smart, then it's too clever by half...) > > I don't think Clang does this at the moment, but I can certainly > imagine it unrolling the first two iterations of the interpretation > loop (since the prologue instructions are known at compile time) and > replacing them with explicit XOR instructions. > > I however suggest we get to the practical question of how to deal with > this issue in the long run. > Currently these error reports prevent us from testing BPF with syzbot, > so we're using https://github.com/google/kmsan/commit/69b987d53462a7f3b5a41c62eb731340b53165f8 > to work around them. > This seems to be the easiest fix that doesn't affect JIT performance > and removes the UB. > > Once KMSAN makes it to the mainline, we'll need to either come up with > a similar fix, or disable fuzzing BPF, which isn't what we want. Disabling fuzzing altogether would be unfortunate, but disabling UB fuzzing is perfectly fine. I don't think UB fuzzer scratched the surface yet. See fixup_bpf_calls() for example. It's the same UB due to uninit regs. And I have to add the same Nack on messing with it. There could be other places in bpf codegen. We're not going to chase them one by one because standard says it's UB and interpreter is written in C. The target for bpf codegen is JITs. JITs on some arch optimize mov r,0 into xor. Some dont. bpf interpreter is simulating hw. If gcc/clang will ever mess it with it to the point that this UB will become an issue we will rewrite certain pieces of interpreter in asm. For now if you want UB fuzzer running in your environment please add _out_of_tree_ patch that inits all interpreter registers to zero. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-06-02 17:32 ` Alexei Starovoitov @ 2020-06-03 15:37 ` Edward Cree 2020-06-03 16:33 ` Alexander Potapenko 0 siblings, 1 reply; 26+ messages in thread From: Edward Cree @ 2020-06-03 15:37 UTC (permalink / raw) To: Alexei Starovoitov, Alexander Potapenko Cc: Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Dmitriy Vyukov, Networking On 02/06/2020 18:32, Alexei Starovoitov wrote: > The target for bpf codegen is JITs. > bpf interpreter is simulating hw. > For now if you want UB fuzzer running in your environment please add > _out_of_tree_ patch that inits all interpreter registers to zero. +1 to all the above. Also, note that you can still fuzz BPF JITs by building the kernel without the interpreter: CONFIG_BPF_JIT_ALWAYS_ON. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2020-06-03 15:37 ` Edward Cree @ 2020-06-03 16:33 ` Alexander Potapenko 0 siblings, 0 replies; 26+ messages in thread From: Alexander Potapenko @ 2020-06-03 16:33 UTC (permalink / raw) To: Edward Cree Cc: Alexei Starovoitov, Daniel Borkmann, Michal Kubecek, Alexei Starovoitov, Dmitriy Vyukov, Networking On Wed, Jun 3, 2020 at 5:37 PM Edward Cree <ecree@solarflare.com> wrote: > > On 02/06/2020 18:32, Alexei Starovoitov wrote: > > The target for bpf codegen is JITs. > > bpf interpreter is simulating hw. > > For now if you want UB fuzzer running in your environment please add > > _out_of_tree_ patch that inits all interpreter registers to zero. > +1 to all the above. Noted, thank you. > Also, note that you can still fuzz BPF JITs by building the kernel > without the interpreter: CONFIG_BPF_JIT_ALWAYS_ON. Unfortunately KMSAN doesn't play well with JITed code. To be able to detect uninit bugs in JIT, we'll need to instrument the generated code as well. -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Self-XORing BPF registers is undefined behavior 2018-12-13 13:18 ` Daniel Borkmann 2018-12-13 14:54 ` Daniel Borkmann @ 2018-12-18 14:09 ` Alexander Potapenko 1 sibling, 0 replies; 26+ messages in thread From: Alexander Potapenko @ 2018-12-18 14:09 UTC (permalink / raw) To: daniel; +Cc: mkubecek, ast, Dmitriy Vyukov, Networking On Thu, Dec 13, 2018 at 2:18 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > On 12/13/2018 01:24 PM, Alexander Potapenko wrote: > > On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek <mkubecek@suse.cz> wrote: > >> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote: > >>> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote: > >>>> Hi BPF maintainers, > >>>> > >>>> some time ago KMSAN found an issue in BPF code which we decided to > >>>> suppress at that point, but now I'd like to bring it to your > >>>> attention. > >>>> Namely, some BPF programs may contain instructions that XOR a register > >>>> with itself. > >>>> This effectively results in the following C code: > >>>> regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; > >>>> or > >>>> regs[BPF_REG_X] = regs[BPF_REG_X] ^ regs[BPF_REG_X]; > >>>> being executed. > >>>> > >>>> According to the C11 standard this is undefined behavior, so KMSAN > >>>> reports an error in this case. > > Can you elaborate / quote the exact bit from C11 that KMSAN is referring > to? (The below that Michal was quoting or something else?) "6.7.9 Initialization 10 If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate." and "5 Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined.[50] Such a representation is called a trap representation." > Does that only refer to C11 standard? Note that kernel's Makefile +430 > explicitly states 'std=gnu89' and not 'std=c11' [0]. It's better defined in newer standards, but C89 also forbids uses of uninitialized values: "3.16 undefined behavior: Behavior, upon use of a nonponable or erroneous program construct, of erroneous data, or of indeterminately valued objects, for which this International Standard imposes no requirements. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message)." and "6.5.7 Initialization ... If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate[74]. If an object that has static storage duration is not initialized explicitly, it is initialized implicitly as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant." > [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=51b97e354ba9fce1890cf38ecc754aa49677fc89 > > >>> Can you quote the part of the standard saying this is undefined > >>> behavior? I couldn't find anything else than > >>> > >>> If the value being stored in an object is read from another object > >>> that overlaps in any way the storage of the first object, then the > >>> overlap shall be exact and the two objects shall have qualified or > >>> unqualified versions of a compatible type; otherwise, the behavior > >>> is undefined. > >>> > >>> (but I only have a draft for obvious reasons). I'm not sure what exactly > >>> they mean by "exact overlap" and the standard doesn't seem to define > >>> the term but if the two objects are actually the same, they certainly > >>> have compatible types. > > Here is an example for the overlap quoted above; I don't think this > applies to our case since it would be "exact". Quote [1]: > > struct S { int x; int y; }; > struct T { int z; struct S s; }; > union U { struct S f ; struct T g; } u; > > main(){ > u.f = u.g.s; > return 0; > } > > [1] https://bts.frama-c.com/print_bug_page.php?bug_id=945 > > >> I think I understand now. You didn't want to say that the statement > >> > >> regs[BPF_REG_A] = regs[BPF_REG_A] ^ regs[BPF_REG_A]; > >> > >> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is > >> uninitialized. Right? > > Yes. Sorry for being unclear. > > By default regs[] is uninitialized, so we need to initialize it before > > using the register values. > > I am also wondering if it's possible to simply copy the uninitialized > > register values from regs[] to the userspace via maps. > >> Michal Kubecek > > > > > > > -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2020-06-03 16:33 UTC | newest] Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-12-13 11:00 Self-XORing BPF registers is undefined behavior Alexander Potapenko 2018-12-13 11:06 ` Eric Dumazet 2018-12-13 11:23 ` Alexander Potapenko 2018-12-13 11:59 ` Michal Kubecek 2018-12-13 12:20 ` Michal Kubecek 2018-12-13 12:24 ` Alexander Potapenko 2018-12-13 13:18 ` Daniel Borkmann 2018-12-13 14:54 ` Daniel Borkmann 2018-12-18 14:36 ` Alexander Potapenko 2020-05-27 15:52 ` Alexander Potapenko 2020-05-27 16:58 ` Alexei Starovoitov 2020-05-27 17:12 ` Alexander Potapenko 2020-05-27 17:14 ` Alexei Starovoitov 2020-05-28 9:54 ` Alexander Potapenko 2020-05-28 16:00 ` Alexei Starovoitov 2020-05-29 0:17 ` Edward Cree 2020-05-29 6:14 ` Dmitry Vyukov 2020-05-29 8:46 ` Edward Cree 2020-05-29 8:53 ` Dmitry Vyukov 2020-05-29 12:28 ` Alexander Potapenko 2020-06-01 9:55 ` Edward Cree 2020-06-02 13:31 ` Alexander Potapenko 2020-06-02 17:32 ` Alexei Starovoitov 2020-06-03 15:37 ` Edward Cree 2020-06-03 16:33 ` Alexander Potapenko 2018-12-18 14:09 ` Alexander Potapenko
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).