From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Potapenko Subject: Re: Self-XORing BPF registers is undefined behavior Date: Tue, 18 Dec 2018 15:09:24 +0100 Message-ID: References: <20181213115936.GG21324@unicorn.suse.cz> <20181213122004.GH21324@unicorn.suse.cz> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Cc: mkubecek@suse.cz, ast@kernel.org, Dmitriy Vyukov , Networking To: daniel@iogearbox.net Return-path: Received: from mail-vs1-f66.google.com ([209.85.217.66]:35129 "EHLO mail-vs1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726469AbeLROJk (ORCPT ); Tue, 18 Dec 2018 09:09:40 -0500 Received: by mail-vs1-f66.google.com with SMTP id e7so10068228vsc.2 for ; Tue, 18 Dec 2018 06:09:38 -0800 (PST) In-Reply-To: Sender: netdev-owner@vger.kernel.org List-ID: On Thu, Dec 13, 2018 at 2:18 PM Daniel Borkmann wrot= e: > > On 12/13/2018 01:24 PM, Alexander Potapenko wrote: > > On Thu, Dec 13, 2018 at 1:20 PM Michal Kubecek wrote= : > >> On Thu, Dec 13, 2018 at 12:59:36PM +0100, Michal Kubecek wrote: > >>> On Thu, Dec 13, 2018 at 12:00:59PM +0100, Alexander Potapenko wrote: > >>>> Hi BPF maintainers, > >>>> > >>>> some time ago KMSAN found an issue in BPF code which we decided to > >>>> suppress at that point, but now I'd like to bring it to your > >>>> attention. > >>>> Namely, some BPF programs may contain instructions that XOR a regist= er > >>>> with itself. > >>>> This effectively results in the following C code: > >>>> regs[BPF_REG_A] =3D regs[BPF_REG_A] ^ regs[BPF_REG_A]; > >>>> or > >>>> regs[BPF_REG_X] =3D regs[BPF_REG_X] ^ regs[BPF_REG_X]; > >>>> being executed. > >>>> > >>>> According to the C11 standard this is undefined behavior, so KMSAN > >>>> reports an error in this case. > > Can you elaborate / quote the exact bit from C11 that KMSAN is referring > to? (The below that Michal was quoting or something else?) "6.7.9 Initialization 10 If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate." and "5 Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined.[50] Such a representation is called a trap representation." > Does that only refer to C11 standard? Note that kernel's Makefile +430 > explicitly states 'std=3Dgnu89' and not 'std=3Dc11' [0]. It's better defined in newer standards, but C89 also forbids uses of uninitialized values: "3.16 undefined behavior: Behavior, upon use of a nonponable or erroneous program construct, of erroneous data, or of indeterminately valued objects, for which this International Standard imposes no requirements. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message)." and "6.5.7 Initialization ... If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate[74]. If an object that has static storage duration is not initialized explicitly, it is initialized implicitly as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant." > [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/= commit/?id=3D51b97e354ba9fce1890cf38ecc754aa49677fc89 > > >>> Can you quote the part of the standard saying this is undefined > >>> behavior? I couldn't find anything else than > >>> > >>> If the value being stored in an object is read from another object > >>> that overlaps in any way the storage of the first object, then the > >>> overlap shall be exact and the two objects shall have qualified or > >>> unqualified versions of a compatible type; otherwise, the behavior > >>> is undefined. > >>> > >>> (but I only have a draft for obvious reasons). I'm not sure what exac= tly > >>> they mean by "exact overlap" and the standard doesn't seem to define > >>> the term but if the two objects are actually the same, they certainly > >>> have compatible types. > > Here is an example for the overlap quoted above; I don't think this > applies to our case since it would be "exact". Quote [1]: > > struct S { int x; int y; }; > struct T { int z; struct S s; }; > union U { struct S f ; struct T g; } u; > > main(){ > u.f =3D u.g.s; > return 0; > } > > [1] https://bts.frama-c.com/print_bug_page.php?bug_id=3D945 > > >> I think I understand now. You didn't want to say that the statement > >> > >> regs[BPF_REG_A] =3D regs[BPF_REG_A] ^ regs[BPF_REG_A]; > >> > >> as such is undefined behavior but that it's UB when regs[BPF_REG_A] is > >> uninitialized. Right? > > Yes. Sorry for being unclear. > > By default regs[] is uninitialized, so we need to initialize it before > > using the register values. > > I am also wondering if it's possible to simply copy the uninitialized > > register values from regs[] to the userspace via maps. > >> Michal Kubecek > > > > > > > --=20 Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Stra=C3=9Fe, 33 80636 M=C3=BCnchen Gesch=C3=A4ftsf=C3=BChrer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg