From: Nick Desaulniers <ndesaulniers@google.com>
To: Arvind Sankar <nivedita@alum.mit.edu>
Cc: Adrian Ratiu <adrian.ratiu@collabora.com>,
Nathan Chancellor <natechancellor@gmail.com>,
Arnd Bergmann <arnd@arndb.de>,
Linux ARM <linux-arm-kernel@lists.infradead.org>,
clang-built-linux <clang-built-linux@googlegroups.com>,
Russell King <linux@armlinux.org.uk>,
LKML <linux-kernel@vger.kernel.org>,
Collabora Kernel ML <kernel@collabora.com>,
Ard Biesheuvel <ardb@kernel.org>
Subject: Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization
Date: Tue, 10 Nov 2020 14:36:27 -0800
Message-ID: <CAKwvOd=NQ==umC+N_Sgji5HCCFTRARh4jWiB3DaBfV6jDd5cRg@mail.gmail.com>
In-Reply-To: <20201110221511.GA1373528@rani.riverdale.lan>
On Tue, Nov 10, 2020 at 2:15 PM Arvind Sankar <nivedita@alum.mit.edu> wrote:
>
> On Tue, Nov 10, 2020 at 01:41:17PM -0800, Nick Desaulniers wrote:
> > On Mon, Nov 9, 2020 at 11:51 AM Adrian Ratiu <adrian.ratiu@collabora.com> wrote:
> > >
> > > On Fri, 06 Nov 2020, Nick Desaulniers <ndesaulniers@google.com>
> > > wrote:
> > > > +#pragma clang loop vectorize(enable)
> > > > do {
> > > > p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0]; p1[1] ^=
> > > > p2[1] ^ p3[1] ^ p4[1] ^ p5[1];
> > > > ``` seems to generate the vectorized code.
> > > >
> > > > Why don't we find a way to make those pragma's more toolchain
> > > > portable, rather than open coding them like I have above rather
> > > > than this series?
> > >
> > > Hi again Nick,
> > >
> > > How did you verify the above pragmas generate correct vectorized
> > > code? Have you tested this specific use case?
> >
> > I read the disassembly before and after my suggested use of pragmas;
> > look for vld/vstr. You can also add -Rpass-missed=loop-vectorize to
> > CFLAGS_xor-neon.o in arch/arm/lib/Makefile and rebuild
> > arch/arm/lib/xor-neon.o with CONFIG_BTRFS enabled.
> >
>
> https://godbolt.org/z/1oo9M6
>
> With the __restrict__ keywords added, clang seems to vectorize the loop,
> but still reports that vectorization wasn't beneficial -- any idea
> what's going on?

I suspect that loop-vectorize is a higher-level pass that relies on
slp-vectorizer for the transform.

$ clang -O2 --target=arm-linux-gnueabi -S -o - foo.c -mfpu=neon \
    -mllvm -print-after-all
...
*** IR Dump After SLP Vectorizer ***
(bunch of <4 x i32> types)

If you add -Rpass-missed=slp-vectorizer, observe that the existing
warnings from -Rpass-missed=loop-vectorize disappear; I suspect
loop-vectorize will print a "remark" if the passes it calls did not
print one themselves but returned some form of error code.

-Rpass=slp-vectorizer shows that it vectorizes two sequences of the
loop, and warns that some third portion (which one is not immediately
obvious to me) was not beneficial.

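For anyone who wants to reproduce the remarks outside the kernel tree,
a minimal standalone sketch follows. It is not the code from
arch/arm/lib/xor-neon.c: the function name xor_5, its signature, and
the plain for loop (instead of the unrolled do/while quoted above) are
assumptions made for illustration. It combines the __restrict__
qualifiers and the loop pragma discussed in this thread, so it can be
saved as foo.c and compiled with the clang command above plus
-Rpass=loop-vectorize, -Rpass-missed=loop-vectorize, or
-Rpass=slp-vectorizer.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the five-source XOR loop discussed in this
 * thread. The name and signature are illustrative only.
 *
 * __restrict__ promises the compiler that the five buffers do not
 * overlap, which removes the need for runtime alias checks.
 */
void xor_5(size_t words,
	   unsigned long *__restrict__ p1,
	   const unsigned long *__restrict__ p2,
	   const unsigned long *__restrict__ p3,
	   const unsigned long *__restrict__ p4,
	   const unsigned long *__restrict__ p5)
{
	size_t i;

	/* The pragma is clang-specific, so guard it for other compilers. */
#if defined(__clang__)
#pragma clang loop vectorize(enable)
#endif
	for (i = 0; i < words; i++)
		p1[i] ^= p2[i] ^ p3[i] ^ p4[i] ^ p5[i];
}
```

Whether a given clang version reports this loop as vectorized by
loop-vectorize or by slp-vectorizer is exactly what the remark flags
are there to show, so treat the output as the source of truth rather
than this sketch.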
--
Thanks,
~Nick Desaulniers