linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Arvind Sankar <nivedita@alum.mit.edu>
To: Arvind Sankar <nivedita@alum.mit.edu>
Cc: Adrian Ratiu <adrian.ratiu@collabora.com>,
	linux-arm-kernel@lists.infradead.org,
	Nathan Chancellor <natechancellor@gmail.com>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Arnd Bergmann <arnd@arndb.de>,
	clang-built-linux@googlegroups.com,
	Russell King <linux@armlinux.org.uk>,
	linux-kernel@vger.kernel.org, kernel@collabora.com
Subject: Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization
Date: Sun, 8 Nov 2020 13:09:42 -0500	[thread overview]
Message-ID: <20201108180942.GA226037@rani.riverdale.lan> (raw)
In-Reply-To: <20201108174014.GA219672@rani.riverdale.lan>

On Sun, Nov 08, 2020 at 12:40:14PM -0500, Arvind Sankar wrote:
> On Fri, Nov 06, 2020 at 07:14:36AM +0200, Adrian Ratiu wrote:
> > Due to a Clang bug [1] neon autoloop vectorization does not happen or
> > happens badly with no gains and considering previous GCC experiences
> > which generated unoptimized code which was worse than the default asm
> > implementation, it is safer to default clang builds to the known good
> > generic implementation.
> > 
> > The kernel currently supports a minimum Clang version of v10.0.1, see
> > commit 1f7a44f63e6c ("compiler-clang: add build check for clang 10.0.1").
> > 
> > When the bug gets eventually fixed, this commit could be reverted or,
> > if the minimum clang version bump takes a long time, a warning could
> > be added for users to upgrade their compilers like was done for GCC.
> > 
> > [1] https://bugs.llvm.org/show_bug.cgi?id=40976
> > 
> > Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
> > ---
> >  arch/arm/include/asm/xor.h | 3 ++-
> >  arch/arm/lib/Makefile      | 3 +++
> >  arch/arm/lib/xor-neon.c    | 4 ++++
> >  3 files changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm/include/asm/xor.h b/arch/arm/include/asm/xor.h
> > index aefddec79286..49937dafaa71 100644
> > --- a/arch/arm/include/asm/xor.h
> > +++ b/arch/arm/include/asm/xor.h
> > @@ -141,7 +141,8 @@ static struct xor_block_template xor_block_arm4regs = {
> >  		NEON_TEMPLATES;			\
> >  	} while (0)
> >  
> > -#ifdef CONFIG_KERNEL_MODE_NEON
> > +/* disabled on clang/arm due to https://bugs.llvm.org/show_bug.cgi?id=40976 */
> > +#if defined(CONFIG_KERNEL_MODE_NEON) && !defined(CONFIG_CC_IS_CLANG)
> >  
> >  extern struct xor_block_template const xor_block_neon_inner;
> >  
> > diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> > index 6d2ba454f25b..53f9e7dd9714 100644
> > --- a/arch/arm/lib/Makefile
> > +++ b/arch/arm/lib/Makefile
> > @@ -43,8 +43,11 @@ endif
> >  $(obj)/csumpartialcopy.o:	$(obj)/csumpartialcopygeneric.S
> >  $(obj)/csumpartialcopyuser.o:	$(obj)/csumpartialcopygeneric.S
> >  
> > +# disabled on clang/arm due to https://bugs.llvm.org/show_bug.cgi?id=40976
> > +ifndef CONFIG_CC_IS_CLANG
> >  ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> >    NEON_FLAGS			:= -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> >    CFLAGS_xor-neon.o		+= $(NEON_FLAGS)
> >    obj-$(CONFIG_XOR_BLOCKS)	+= xor-neon.o
> >  endif
> > +endif
> > diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> > index e1e76186ec23..84c91c48dfa2 100644
> > --- a/arch/arm/lib/xor-neon.c
> > +++ b/arch/arm/lib/xor-neon.c
> > @@ -18,6 +18,10 @@ MODULE_LICENSE("GPL");
> >   * Pull in the reference implementations while instructing GCC (through
> >   * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
> >   * NEON instructions.
> > +
> > + * On Clang the loop vectorizer is enabled by default, but due to a bug
> > + * (https://bugs.llvm.org/show_bug.cgi?id=40976) vectorization is broke
> > + * so xor-neon is disabled in favor of the default reg implementations.
> >   */
> >  #ifdef CONFIG_CC_IS_GCC
> >  #pragma GCC optimize "tree-vectorize"
> > -- 
> > 2.29.0
> > 
> 
> It's actually a bad idea to use #pragma GCC optimize. This is basically
> the same as tagging all the functions with __attribute__((optimize)),
> which GCC does not recommend for production use, as it _replaces_
> optimization options rather than appending to them, and has been
> observed to result in dropping important compiler flags.
> 
> There've been a few discussions recently around other such cases:
> https://lore.kernel.org/lkml/20201028171506.15682-1-ardb@kernel.org/
> https://lore.kernel.org/lkml/20201028081123.GT2628@hirez.programming.kicks-ass.net/
> 
> For this file, given that it is supposed to use -ftree-vectorize for the
> whole file anyway, is there any reason it's not just added to CFLAGS via
> the Makefile? This seems to be the only use of pragma optimize in the
> kernel.

Eg, this shows that the pragma results in dropping -fno-strict-aliasing.
https://godbolt.org/z/1nfrKT

The first function does not use vectorization because s and s->a might
alias.

  reply	other threads:[~2020-11-08 18:09 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-06  5:14 [PATCH 0/2] arm: lib: xor-neon: Remove warn & disble neon vect Adrian Ratiu
2020-11-06  5:14 ` [PATCH 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning Adrian Ratiu
2020-11-06 14:46   ` Arnd Bergmann
2020-11-06 18:03     ` Nathan Chancellor
2020-11-06 21:46       ` Arnd Bergmann
2020-11-06  5:14 ` [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization Adrian Ratiu
2020-11-06 10:14   ` Nathan Chancellor
2020-11-06 11:50     ` Adrian Ratiu
2020-11-06 18:01       ` Nathan Chancellor
2020-11-06 19:52       ` Nick Desaulniers
2020-11-07 18:07         ` Adrian Ratiu
2020-11-09 19:53         ` Adrian Ratiu
2020-11-10 21:41           ` Nick Desaulniers
2020-11-10 22:15             ` Arvind Sankar
2020-11-10 22:36               ` Nick Desaulniers
2020-11-10 22:39                 ` Nick Desaulniers
2020-11-10 22:39                   ` Nick Desaulniers
2020-11-10 22:54                     ` Arvind Sankar
2020-11-10 23:56             ` Adrian Ratiu
2020-11-11  0:18               ` Nick Desaulniers
2020-11-11 14:15                 ` Adrian Ratiu
2020-11-12 21:50                   ` Arvind Sankar
2020-11-12 21:55                     ` Nick Desaulniers
2020-11-07 10:22   ` Russell King - ARM Linux admin
2020-11-07 18:12     ` Adrian Ratiu
2020-11-08 17:40   ` Arvind Sankar
2020-11-08 18:09     ` Arvind Sankar [this message]
2020-11-08 20:14       ` Ard Biesheuvel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201108180942.GA226037@rani.riverdale.lan \
    --to=nivedita@alum.mit.edu \
    --cc=adrian.ratiu@collabora.com \
    --cc=arnd@arndb.de \
    --cc=clang-built-linux@googlegroups.com \
    --cc=kernel@collabora.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=natechancellor@gmail.com \
    --cc=ndesaulniers@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).