From: Nick Desaulniers <ndesaulniers@google.com> To: Arvind Sankar <nivedita@alum.mit.edu> Cc: Adrian Ratiu <adrian.ratiu@collabora.com>, Nathan Chancellor <natechancellor@gmail.com>, Arnd Bergmann <arnd@arndb.de>, Linux ARM <linux-arm-kernel@lists.infradead.org>, clang-built-linux <clang-built-linux@googlegroups.com>, Russell King <linux@armlinux.org.uk>, LKML <linux-kernel@vger.kernel.org>, Collabora Kernel ML <kernel@collabora.com>, Ard Biesheuvel <ardb@kernel.org> Subject: Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization Date: Tue, 10 Nov 2020 14:39:59 -0800 [thread overview] Message-ID: <CAKwvOd=6nPUvUY6XJULmkZywHJG2kfu46h7=Zm18j9O30ovdpw@mail.gmail.com> (raw) In-Reply-To: <CAKwvOdmsCY8Hb6t8t9jW5DKCZKbja27bGufO=tw3+-UuTTFBOQ@mail.gmail.com> On Tue, Nov 10, 2020 at 2:39 PM Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Tue, Nov 10, 2020 at 2:36 PM Nick Desaulniers > <ndesaulniers@google.com> wrote: > > > > On Tue, Nov 10, 2020 at 2:15 PM Arvind Sankar <nivedita@alum.mit.edu> wrote: > > > > > > On Tue, Nov 10, 2020 at 01:41:17PM -0800, Nick Desaulniers wrote: > > > > On Mon, Nov 9, 2020 at 11:51 AM Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > > > > > > > > > On Fri, 06 Nov 2020, Nick Desaulniers <ndesaulniers@google.com> > > > > > wrote: > > > > > > +#pragma clang loop vectorize(enable) > > > > > > do { > > > > > > p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0]; p1[1] ^= > > > > > > p2[1] ^ p3[1] ^ p4[1] ^ p5[1]; > > > > > > ``` seems to generate the vectorized code. > > > > > > > > > > > > Why don't we find a way to make those pragma's more toolchain > > > > > > portable, rather than open coding them like I have above rather > > > > > > than this series? > > > > > > > > > > Hi again Nick, > > > > > > > > > > How did you verify the above pragmas generate correct vectorized > > > > > code? Have you tested this specific use case? > > > > > > > > I read the disassembly before and after my suggested use of pragmas; > > > > look for vld/vstr. You can also add -Rpass-missed=loop-vectorize to > > > > CFLAGS_xor-neon.o in arch/arm/lib/Makefile and rebuild > > > > arch/arm/lib/xor-neon.o with CONFIG_BTRFS enabled. > > > > > > > > > > https://godbolt.org/z/1oo9M6 > > > > > > With the __restrict__ keywords added, clang seems to vectorize the loop, > > > but still reports that vectorization wasn't beneficial -- any idea > > > what's going on? > > Anyways, it's not safe to make that change in the kernel unless you > can guarantee that callers of these routines do not alias or overlap. s/callers/parameters passed by callers/ > > > > > I suspect that loop-vectorize is a higher level pass that relies on > > slp-vectorizer for the transform. > > > > $ clang -O2 --target=arm-linux-gnueabi -S -o - foo.c -mfpu=neon -mllvm > > -print-after-all > > ... > > *** IR Dump After SLP Vectorizer *** > > (bunch of <4 x i32> types) > > > > If you add -Rpass-missed=slp-vectorizer, observe that the existing > > warnings from -Rpass-missed=loop-vectorize disappear; I suspect > > loop-vectorize will print a "remark" if passes it calls did not, but > > returned some for of error code. > > > > -Rpass=slp-vectorizer shows that it vectorizes two sequences of the > > loop, and warns that some third portion (that's > > non-immediately-obvious to me) was non beneficial. > > > > -- > > Thanks, > > ~Nick Desaulniers > > > > -- > Thanks, > ~Nick Desaulniers -- Thanks, ~Nick Desaulniers
WARNING: multiple messages have this Message-ID (diff)
From: Nick Desaulniers <ndesaulniers@google.com> To: Arvind Sankar <nivedita@alum.mit.edu> Cc: Arnd Bergmann <arnd@arndb.de>, Adrian Ratiu <adrian.ratiu@collabora.com>, Russell King <linux@armlinux.org.uk>, LKML <linux-kernel@vger.kernel.org>, clang-built-linux <clang-built-linux@googlegroups.com>, Nathan Chancellor <natechancellor@gmail.com>, Collabora Kernel ML <kernel@collabora.com>, Ard Biesheuvel <ardb@kernel.org>, Linux ARM <linux-arm-kernel@lists.infradead.org> Subject: Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization Date: Tue, 10 Nov 2020 14:39:59 -0800 [thread overview] Message-ID: <CAKwvOd=6nPUvUY6XJULmkZywHJG2kfu46h7=Zm18j9O30ovdpw@mail.gmail.com> (raw) In-Reply-To: <CAKwvOdmsCY8Hb6t8t9jW5DKCZKbja27bGufO=tw3+-UuTTFBOQ@mail.gmail.com> On Tue, Nov 10, 2020 at 2:39 PM Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Tue, Nov 10, 2020 at 2:36 PM Nick Desaulniers > <ndesaulniers@google.com> wrote: > > > > On Tue, Nov 10, 2020 at 2:15 PM Arvind Sankar <nivedita@alum.mit.edu> wrote: > > > > > > On Tue, Nov 10, 2020 at 01:41:17PM -0800, Nick Desaulniers wrote: > > > > On Mon, Nov 9, 2020 at 11:51 AM Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > > > > > > > > > On Fri, 06 Nov 2020, Nick Desaulniers <ndesaulniers@google.com> > > > > > wrote: > > > > > > +#pragma clang loop vectorize(enable) > > > > > > do { > > > > > > p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0]; p1[1] ^= > > > > > > p2[1] ^ p3[1] ^ p4[1] ^ p5[1]; > > > > > > ``` seems to generate the vectorized code. > > > > > > > > > > > > Why don't we find a way to make those pragma's more toolchain > > > > > > portable, rather than open coding them like I have above rather > > > > > > than this series? > > > > > > > > > > Hi again Nick, > > > > > > > > > > How did you verify the above pragmas generate correct vectorized > > > > > code? Have you tested this specific use case? > > > > > > > > I read the disassembly before and after my suggested use of pragmas; > > > > look for vld/vstr. You can also add -Rpass-missed=loop-vectorize to > > > > CFLAGS_xor-neon.o in arch/arm/lib/Makefile and rebuild > > > > arch/arm/lib/xor-neon.o with CONFIG_BTRFS enabled. > > > > > > > > > > https://godbolt.org/z/1oo9M6 > > > > > > With the __restrict__ keywords added, clang seems to vectorize the loop, > > > but still reports that vectorization wasn't beneficial -- any idea > > > what's going on? > > Anyways, it's not safe to make that change in the kernel unless you > can guarantee that callers of these routines do not alias or overlap. s/callers/parameters passed by callers/ > > > > > I suspect that loop-vectorize is a higher level pass that relies on > > slp-vectorizer for the transform. > > > > $ clang -O2 --target=arm-linux-gnueabi -S -o - foo.c -mfpu=neon -mllvm > > -print-after-all > > ... > > *** IR Dump After SLP Vectorizer *** > > (bunch of <4 x i32> types) > > > > If you add -Rpass-missed=slp-vectorizer, observe that the existing > > warnings from -Rpass-missed=loop-vectorize disappear; I suspect > > loop-vectorize will print a "remark" if passes it calls did not, but > > returned some for of error code. > > > > -Rpass=slp-vectorizer shows that it vectorizes two sequences of the > > loop, and warns that some third portion (that's > > non-immediately-obvious to me) was non beneficial. > > > > -- > > Thanks, > > ~Nick Desaulniers > > > > -- > Thanks, > ~Nick Desaulniers -- Thanks, ~Nick Desaulniers _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2020-11-10 22:40 UTC|newest] Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-11-06 5:14 [PATCH 0/2] arm: lib: xor-neon: Remove warn & disble neon vect Adrian Ratiu 2020-11-06 5:14 ` Adrian Ratiu 2020-11-06 5:14 ` [PATCH 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning Adrian Ratiu 2020-11-06 5:14 ` Adrian Ratiu 2020-11-06 14:46 ` Arnd Bergmann 2020-11-06 14:46 ` Arnd Bergmann 2020-11-06 18:03 ` Nathan Chancellor 2020-11-06 18:03 ` Nathan Chancellor 2020-11-06 21:46 ` Arnd Bergmann 2020-11-06 21:46 ` Arnd Bergmann 2020-11-06 5:14 ` [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization Adrian Ratiu 2020-11-06 5:14 ` Adrian Ratiu 2020-11-06 10:14 ` Nathan Chancellor 2020-11-06 10:14 ` Nathan Chancellor 2020-11-06 11:50 ` Adrian Ratiu 2020-11-06 11:50 ` Adrian Ratiu 2020-11-06 18:01 ` Nathan Chancellor 2020-11-06 18:01 ` Nathan Chancellor 2020-11-06 19:52 ` Nick Desaulniers 2020-11-06 19:52 ` Nick Desaulniers 2020-11-07 18:07 ` Adrian Ratiu 2020-11-07 18:07 ` Adrian Ratiu 2020-11-09 19:53 ` Adrian Ratiu 2020-11-09 19:53 ` Adrian Ratiu 2020-11-10 21:41 ` Nick Desaulniers 2020-11-10 21:41 ` Nick Desaulniers 2020-11-10 22:15 ` Arvind Sankar 2020-11-10 22:15 ` Arvind Sankar 2020-11-10 22:36 ` Nick Desaulniers 2020-11-10 22:36 ` Nick Desaulniers 2020-11-10 22:39 ` Nick Desaulniers 2020-11-10 22:39 ` Nick Desaulniers 2020-11-10 22:39 ` Nick Desaulniers [this message] 2020-11-10 22:39 ` Nick Desaulniers 2020-11-10 22:54 ` Arvind Sankar 2020-11-10 22:54 ` Arvind Sankar 2020-11-10 23:56 ` Adrian Ratiu 2020-11-10 23:56 ` Adrian Ratiu 2020-11-11 0:18 ` Nick Desaulniers 2020-11-11 0:18 ` Nick Desaulniers 2020-11-11 14:15 ` Adrian Ratiu 2020-11-11 14:15 ` Adrian Ratiu 2020-11-12 21:50 ` Arvind Sankar 2020-11-12 21:50 ` Arvind Sankar 2020-11-12 21:55 ` Nick Desaulniers 2020-11-12 21:55 ` Nick Desaulniers 2020-11-07 10:22 ` Russell King - ARM Linux admin 2020-11-07 10:22 ` Russell King - ARM Linux admin 2020-11-07 18:12 ` Adrian Ratiu 2020-11-07 18:12 ` Adrian Ratiu 2020-11-08 17:40 ` Arvind Sankar 2020-11-08 17:40 ` Arvind Sankar 2020-11-08 18:09 ` Arvind Sankar 2020-11-08 18:09 ` Arvind Sankar 2020-11-08 20:14 ` Ard Biesheuvel 2020-11-08 20:14 ` Ard Biesheuvel
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='CAKwvOd=6nPUvUY6XJULmkZywHJG2kfu46h7=Zm18j9O30ovdpw@mail.gmail.com' \ --to=ndesaulniers@google.com \ --cc=adrian.ratiu@collabora.com \ --cc=ardb@kernel.org \ --cc=arnd@arndb.de \ --cc=clang-built-linux@googlegroups.com \ --cc=kernel@collabora.com \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux@armlinux.org.uk \ --cc=natechancellor@gmail.com \ --cc=nivedita@alum.mit.edu \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.