* [PATCH v2 0/2] xor-neon: Remove GCC warn & pragmas @ 2020-11-12 21:24 Adrian Ratiu 2020-11-12 21:24 ` [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning Adrian Ratiu 2020-11-12 21:24 ` [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile Adrian Ratiu 0 siblings, 2 replies; 12+ messages in thread From: Adrian Ratiu @ 2020-11-12 21:24 UTC (permalink / raw) To: linux-arm-kernel Cc: Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, Russell King, Ard Biesheuvel, Arvind Sankar, kernel, clang-built-linux, Linux Kernel Mailing List Dear all, This is v2 of the patch series at id:20201106051436.2384842-1-adrian.ratiu@collabora.com Tested on next-20201112 using GCC 10.2.0 and Clang 10.0.1. Kind regards, Adrian Changes in v2: - Dropped the patch which disabled Clang vectorization (Nick) - Added new patch to move pragmas to makefile cmdline options (Arvind and Ard) Adrian Ratiu (1): arm: lib: xor-neon: move pragma options to makefile Nathan Chancellor (1): arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning arch/arm/lib/Makefile | 2 +- arch/arm/lib/xor-neon.c | 17 ----------------- 2 files changed, 1 insertion(+), 18 deletions(-) -- 2.29.2 ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning 2020-11-12 21:24 [PATCH v2 0/2] xor-neon: Remove GCC warn & pragmas Adrian Ratiu @ 2020-11-12 21:24 ` Adrian Ratiu 2020-11-12 21:38 ` Nick Desaulniers 2020-11-13 7:49 ` Ard Biesheuvel 2020-11-12 21:24 ` [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile Adrian Ratiu 1 sibling, 2 replies; 12+ messages in thread From: Adrian Ratiu @ 2020-11-12 21:24 UTC (permalink / raw) To: linux-arm-kernel Cc: Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, Russell King, Ard Biesheuvel, Arvind Sankar, kernel, clang-built-linux, Linux Kernel Mailing List From: Nathan Chancellor <natechancellor@gmail.com> Drop warning because kernel now requires GCC >= v4.9 after commit 6ec4476ac825 ("Raise gcc version requirement to 4.9"). Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> --- arch/arm/lib/xor-neon.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c index b99dd8e1c93f..e1e76186ec23 100644 --- a/arch/arm/lib/xor-neon.c +++ b/arch/arm/lib/xor-neon.c @@ -19,15 +19,8 @@ MODULE_LICENSE("GPL"); * -ftree-vectorize) to attempt to exploit implicit parallelism and emit * NEON instructions. */ -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6) +#ifdef CONFIG_CC_IS_GCC #pragma GCC optimize "tree-vectorize" -#else -/* - * While older versions of GCC do not generate incorrect code, they fail to - * recognize the parallel nature of these functions, and emit plain ARM code, - * which is known to be slower than the optimized ARM code in asm-arm/xor.h. - */ -#warning This code requires at least version 4.6 of GCC #endif #pragma GCC diagnostic ignored "-Wunused-variable" -- 2.29.2 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning 2020-11-12 21:24 ` [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning Adrian Ratiu @ 2020-11-12 21:38 ` Nick Desaulniers 2020-11-13 7:49 ` Ard Biesheuvel 1 sibling, 0 replies; 12+ messages in thread From: Nick Desaulniers @ 2020-11-12 21:38 UTC (permalink / raw) To: Adrian Ratiu Cc: Linux ARM, Nathan Chancellor, Arnd Bergmann, Russell King, Ard Biesheuvel, Arvind Sankar, Collabora Kernel ML, clang-built-linux, Linux Kernel Mailing List On Thu, Nov 12, 2020 at 1:23 PM Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > From: Nathan Chancellor <natechancellor@gmail.com> > > Drop warning because kernel now requires GCC >= v4.9 after > commit 6ec4476ac825 ("Raise gcc version requirement to 4.9"). > > Reported-by: Nick Desaulniers <ndesaulniers@google.com> > Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> > Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Link: https://github.com/ClangBuiltLinux/linux/issues/496 Link: https://github.com/ClangBuiltLinux/linux/issues/503 Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> > --- > arch/arm/lib/xor-neon.c | 9 +-------- > 1 file changed, 1 insertion(+), 8 deletions(-) > > diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c > index b99dd8e1c93f..e1e76186ec23 100644 > --- a/arch/arm/lib/xor-neon.c > +++ b/arch/arm/lib/xor-neon.c > @@ -19,15 +19,8 @@ MODULE_LICENSE("GPL"); > * -ftree-vectorize) to attempt to exploit implicit parallelism and emit > * NEON instructions. > */ > -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6) > +#ifdef CONFIG_CC_IS_GCC > #pragma GCC optimize "tree-vectorize" > -#else > -/* > - * While older versions of GCC do not generate incorrect code, they fail to > - * recognize the parallel nature of these functions, and emit plain ARM code, > - * which is known to be slower than the optimized ARM code in asm-arm/xor.h. 
> - */ > -#warning This code requires at least version 4.6 of GCC > #endif > > #pragma GCC diagnostic ignored "-Wunused-variable" > -- > 2.29.2 > -- Thanks, ~Nick Desaulniers ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning 2020-11-12 21:24 ` [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning Adrian Ratiu 2020-11-12 21:38 ` Nick Desaulniers @ 2020-11-13 7:49 ` Ard Biesheuvel 2020-11-13 11:07 ` Adrian Ratiu 1 sibling, 1 reply; 12+ messages in thread From: Ard Biesheuvel @ 2020-11-13 7:49 UTC (permalink / raw) To: Adrian Ratiu Cc: Linux ARM, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, Russell King, Arvind Sankar, kernel, clang-built-linux, Linux Kernel Mailing List On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > From: Nathan Chancellor <natechancellor@gmail.com> > > Drop warning because kernel now requires GCC >= v4.9 after > commit 6ec4476ac825 ("Raise gcc version requirement to 4.9"). > > Reported-by: Nick Desaulniers <ndesaulniers@google.com> > Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> > Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Again, this does not do what it says on the tin. If you want to disable the pragma for Clang, call that out in the commit log, and don't hide it under a GCC version change. Without the pragma, the generated code is the same as the generic code, so it makes no sense to build xor-neon.ko at all, right? > --- > arch/arm/lib/xor-neon.c | 9 +-------- > 1 file changed, 1 insertion(+), 8 deletions(-) > > diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c > index b99dd8e1c93f..e1e76186ec23 100644 > --- a/arch/arm/lib/xor-neon.c > +++ b/arch/arm/lib/xor-neon.c > @@ -19,15 +19,8 @@ MODULE_LICENSE("GPL"); > * -ftree-vectorize) to attempt to exploit implicit parallelism and emit > * NEON instructions. 
> */ > -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6) > +#ifdef CONFIG_CC_IS_GCC > #pragma GCC optimize "tree-vectorize" > -#else > -/* > - * While older versions of GCC do not generate incorrect code, they fail to > - * recognize the parallel nature of these functions, and emit plain ARM code, > - * which is known to be slower than the optimized ARM code in asm-arm/xor.h. > - */ > -#warning This code requires at least version 4.6 of GCC > #endif > > #pragma GCC diagnostic ignored "-Wunused-variable" > -- > 2.29.2 > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning 2020-11-13 7:49 ` Ard Biesheuvel @ 2020-11-13 11:07 ` Adrian Ratiu 2020-11-13 11:41 ` Ard Biesheuvel 0 siblings, 1 reply; 12+ messages in thread From: Adrian Ratiu @ 2020-11-13 11:07 UTC (permalink / raw) To: Ard Biesheuvel Cc: Linux ARM, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, Russell King, Arvind Sankar, kernel, clang-built-linux, Linux Kernel Mailing List Hi Ard, On Fri, 13 Nov 2020, Ard Biesheuvel <ardb@kernel.org> wrote: > On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu > <adrian.ratiu@collabora.com> wrote: >> >> From: Nathan Chancellor <natechancellor@gmail.com> >> >> Drop warning because kernel now requires GCC >= v4.9 after >> commit 6ec4476ac825 ("Raise gcc version requirement to 4.9"). >> >> Reported-by: Nick Desaulniers <ndesaulniers@google.com> >> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> >> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> > > Again, this does not do what it says on the tin. > > If you want to disable the pragma for Clang, call that out in > the commit log, and don't hide it under a GCC version change. I am not doing anything for Clang in this series. The option to auto-vectorize in Clang is enabled by default but doesn't work for some reason (likely to do with how it computes the cost model, so maybe not even a bug at all) and if we enable it explicitly (eg via a Clang specific pragma) we get some warnings we currently do not understand, so I am not changing the Clang behaviour at the recommendation of Nick. So this is only for GCC as the "tin" says :) We can fix clang separately as the Clang bug has always been present and is unrelated. > > Without the pragma, the generated code is the same as the > generic code, so it makes no sense to build xor-neon.ko at all, > right? 
> Yes that is correct and that is the reason why in v1 I opted to not build xor-neon.ko for Clang anymore, but that got NACKed, so here I'm fixing the low hanging fruit: the very obvious & clear GCC problems. >> --- >> arch/arm/lib/xor-neon.c | 9 +-------- >> 1 file changed, 1 insertion(+), 8 deletions(-) >> >> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c >> index b99dd8e1c93f..e1e76186ec23 100644 >> --- a/arch/arm/lib/xor-neon.c >> +++ b/arch/arm/lib/xor-neon.c >> @@ -19,15 +19,8 @@ MODULE_LICENSE("GPL"); >> * -ftree-vectorize) to attempt to exploit implicit parallelism and emit >> * NEON instructions. >> */ >> -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6) >> +#ifdef CONFIG_CC_IS_GCC >> #pragma GCC optimize "tree-vectorize" >> -#else >> -/* >> - * While older versions of GCC do not generate incorrect code, they fail to >> - * recognize the parallel nature of these functions, and emit plain ARM code, >> - * which is known to be slower than the optimized ARM code in asm-arm/xor.h. >> - */ >> -#warning This code requires at least version 4.6 of GCC >> #endif >> >> #pragma GCC diagnostic ignored "-Wunused-variable" >> -- >> 2.29.2 >> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning 2020-11-13 11:07 ` Adrian Ratiu @ 2020-11-13 11:41 ` Ard Biesheuvel 2020-11-13 11:59 ` Adrian Ratiu 0 siblings, 1 reply; 12+ messages in thread From: Ard Biesheuvel @ 2020-11-13 11:41 UTC (permalink / raw) To: Adrian Ratiu Cc: Linux ARM, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, Russell King, Arvind Sankar, Collabora Kernel ML, clang-built-linux, Linux Kernel Mailing List On Fri, 13 Nov 2020 at 12:05, Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > Hi Ard, > > On Fri, 13 Nov 2020, Ard Biesheuvel <ardb@kernel.org> wrote: > > On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu > > <adrian.ratiu@collabora.com> wrote: > >> > >> From: Nathan Chancellor <natechancellor@gmail.com> > >> > >> Drop warning because kernel now requires GCC >= v4.9 after > >> commit 6ec4476ac825 ("Raise gcc version requirement to 4.9"). > >> > >> Reported-by: Nick Desaulniers <ndesaulniers@google.com> > >> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> > >> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> > > > > Again, this does not do what it says on the tin. > > > > If you want to disable the pragma for Clang, call that out in > > the commit log, and don't hide it under a GCC version change. > > I am not doing anything for Clang in this series. > > The option to auto-vectorize in Clang is enabled by default but > doesn't work for some reason (likely to do with how it computes > the cost model, so maybe not even a bug at all) and if we enable > it explicitly (eg via a Clang specific pragma) we get some > warnings we currently do not understand, so I am not changing the > Clang behaviour at the recommendation of Nick. > > So this is only for GCC as the "tin" says :) We can fix clang > separately as the Clang bug has always been present and is > unrelated. > But you are adding the IS_GCC check here, no? Is that equivalent? IOW, does Clang today identify as GCC <= 4.6? 
> > > > Without the pragma, the generated code is the same as the > > generic code, so it makes no sense to build xor-neon.ko at all, > > right? > > > > Yes that is correct and that is the reason why in v1 I opted to > not build xor-neon.ko for Clang anymore, but that got NACKed, so > here I'm fixing the low hanging fruit: the very obvious & clear > GCC problems. > > Fair enough. > >> --- > >> arch/arm/lib/xor-neon.c | 9 +-------- > >> 1 file changed, 1 insertion(+), 8 deletions(-) > >> > >> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c > >> index b99dd8e1c93f..e1e76186ec23 100644 > >> --- a/arch/arm/lib/xor-neon.c > >> +++ b/arch/arm/lib/xor-neon.c > >> @@ -19,15 +19,8 @@ MODULE_LICENSE("GPL"); > >> * -ftree-vectorize) to attempt to exploit implicit parallelism and emit > >> * NEON instructions. > >> */ > >> -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6) > >> +#ifdef CONFIG_CC_IS_GCC > >> #pragma GCC optimize "tree-vectorize" > >> -#else > >> -/* > >> - * While older versions of GCC do not generate incorrect code, they fail to > >> - * recognize the parallel nature of these functions, and emit plain ARM code, > >> - * which is known to be slower than the optimized ARM code in asm-arm/xor.h. > >> - */ > >> -#warning This code requires at least version 4.6 of GCC > >> #endif > >> > >> #pragma GCC diagnostic ignored "-Wunused-variable" > >> -- > >> 2.29.2 > >> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning 2020-11-13 11:41 ` Ard Biesheuvel @ 2020-11-13 11:59 ` Adrian Ratiu 0 siblings, 0 replies; 12+ messages in thread From: Adrian Ratiu @ 2020-11-13 11:59 UTC (permalink / raw) To: Ard Biesheuvel Cc: Linux ARM, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, Russell King, Arvind Sankar, Collabora Kernel ML, clang-built-linux, Linux Kernel Mailing List On Fri, 13 Nov 2020, Ard Biesheuvel <ardb@kernel.org> wrote: > On Fri, 13 Nov 2020 at 12:05, Adrian Ratiu > <adrian.ratiu@collabora.com> wrote: >> >> Hi Ard, >> >> On Fri, 13 Nov 2020, Ard Biesheuvel <ardb@kernel.org> wrote: >> > On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu >> > <adrian.ratiu@collabora.com> wrote: >> >> >> >> From: Nathan Chancellor <natechancellor@gmail.com> >> >> >> >> Drop warning because kernel now requires GCC >= v4.9 after >> >> commit 6ec4476ac825 ("Raise gcc version requirement to >> >> 4.9"). >> >> >> >> Reported-by: Nick Desaulniers <ndesaulniers@google.com> >> >> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> >> >> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> >> > >> > Again, this does not do what it says on the tin. >> > >> > If you want to disable the pragma for Clang, call that out in >> > the commit log, and don't hide it under a GCC version change. >> >> I am not doing anything for Clang in this series. >> >> The option to auto-vectorize in Clang is enabled by default but >> doesn't work for some reason (likely to do with how it computes >> the cost model, so maybe not even a bug at all) and if we >> enable it explicitly (eg via a Clang specific pragma) we get >> some warnings we currently do not understand, so I am not >> changing the Clang behaviour at the recommendation of Nick. >> >> So this is only for GCC as the "tin" says :) We can fix clang >> separately as the Clang bug has always been present and is >> unrelated. >> > > But you are adding the IS_GCC check here, no? 
Is that > equivalent? IOW, does Clang today identify as GCC <= 4.6? > I see what you mean now. Thanks. Clang identifies as GCC <= 4.6 yes, so the code is not strictly speaking equivalent. The warning to upgrade GCC doesn't make sense for Clang but I should mention removing it in the commit message as well. >> > >> > Without the pragma, the generated code is the same as the >> > generic code, so it makes no sense to build xor-neon.ko at all, >> > right? >> > >> >> Yes that is correct and that is the reason why in v1 I opted to >> not build xor-neon.ko for Clang anymore, but that got NACKed, so >> here I'm fixing the low hanging fruit: the very obvious & clear >> GCC problems. >> >> > > Fair enough. > >> >> --- >> >> arch/arm/lib/xor-neon.c | 9 +-------- >> >> 1 file changed, 1 insertion(+), 8 deletions(-) >> >> >> >> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c >> >> index b99dd8e1c93f..e1e76186ec23 100644 >> >> --- a/arch/arm/lib/xor-neon.c >> >> +++ b/arch/arm/lib/xor-neon.c >> >> @@ -19,15 +19,8 @@ MODULE_LICENSE("GPL"); >> >> * -ftree-vectorize) to attempt to exploit implicit parallelism and emit >> >> * NEON instructions. >> >> */ >> >> -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6) >> >> +#ifdef CONFIG_CC_IS_GCC >> >> #pragma GCC optimize "tree-vectorize" >> >> -#else >> >> -/* >> >> - * While older versions of GCC do not generate incorrect code, they fail to >> >> - * recognize the parallel nature of these functions, and emit plain ARM code, >> >> - * which is known to be slower than the optimized ARM code in asm-arm/xor.h. >> >> - */ >> >> -#warning This code requires at least version 4.6 of GCC >> >> #endif >> >> >> >> #pragma GCC diagnostic ignored "-Wunused-variable" >> >> -- >> >> 2.29.2 >> >> ^ permalink raw reply [flat|nested] 12+ messages in thread
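[Editorial note on the exchange above] The point Ard raises can be illustrated with a small sketch. Clang defines `__GNUC__` as 4 and `__GNUC_MINOR__` as 2 for compatibility, so the old preprocessor guard took the `#else` branch (the `#warning`) under Clang, whereas `#ifdef CONFIG_CC_IS_GCC` skips both the pragma and the warning. The helpers below are hypothetical names that only model the arithmetic of the removed `#if`; they are not kernel code.

```c
/*
 * Sketch only: models the version comparison done by the guard that
 * patch 1/2 removes. Both helpers are hypothetical, not kernel functions.
 */

/* Return 1 if version (major, minor) is at least (want_major, want_minor). */
static int gcc_at_least(int major, int minor, int want_major, int want_minor)
{
	return major > want_major ||
	       (major == want_major && minor >= want_minor);
}

/*
 * Mirror of the old guard:
 *   #if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
 * Returns 1 when the "tree-vectorize" pragma branch would be taken and
 * 0 when the #warning branch would be taken.
 */
static int old_guard_takes_pragma_branch(int major, int minor)
{
	return gcc_at_least(major, minor, 4, 6);
}
```

Passing (10, 2), as reported by GCC 10.2.0, selects the pragma branch; passing (4, 2), the values Clang reports, selects the warning branch. That is why the two guards are not strictly equivalent for Clang builds.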
* [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile 2020-11-12 21:24 [PATCH v2 0/2] xor-neon: Remove GCC warn & pragmas Adrian Ratiu 2020-11-12 21:24 ` [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning Adrian Ratiu @ 2020-11-12 21:24 ` Adrian Ratiu 2020-11-12 21:40 ` Nick Desaulniers ` (2 more replies) 1 sibling, 3 replies; 12+ messages in thread From: Adrian Ratiu @ 2020-11-12 21:24 UTC (permalink / raw) To: linux-arm-kernel Cc: Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, Russell King, Ard Biesheuvel, Arvind Sankar, kernel, clang-built-linux, Linux Kernel Mailing List Using a pragma like GCC optimize is a bad idea because it tags all functions with an __attribute__((optimize)) which replaces optimization options rather than appending so could result in dropping important flags. Not recommended for production use. Because these options should always be enabled for this file, it's better to set them via command line. tree-vectorize is on by default in Clang, but it doesn't hurt to make it explicit. 
Suggested-by: Arvind Sankar <nivedita@alum.mit.edu> Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> --- arch/arm/lib/Makefile | 2 +- arch/arm/lib/xor-neon.c | 10 ---------- 2 files changed, 1 insertion(+), 11 deletions(-) diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile index 6d2ba454f25b..12d31d1a7630 100644 --- a/arch/arm/lib/Makefile +++ b/arch/arm/lib/Makefile @@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S ifeq ($(CONFIG_KERNEL_MODE_NEON),y) NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon - CFLAGS_xor-neon.o += $(NEON_FLAGS) + CFLAGS_xor-neon.o += $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o endif diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c index e1e76186ec23..62b493e386c4 100644 --- a/arch/arm/lib/xor-neon.c +++ b/arch/arm/lib/xor-neon.c @@ -14,16 +14,6 @@ MODULE_LICENSE("GPL"); #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon' #endif -/* - * Pull in the reference implementations while instructing GCC (through - * -ftree-vectorize) to attempt to exploit implicit parallelism and emit - * NEON instructions. - */ -#ifdef CONFIG_CC_IS_GCC -#pragma GCC optimize "tree-vectorize" -#endif - -#pragma GCC diagnostic ignored "-Wunused-variable" #include <asm-generic/xor.h> struct xor_block_template const xor_block_neon_inner = { -- 2.29.2 ^ permalink raw reply related [flat|nested] 12+ messages in thread
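[Editorial note on the patch above] As a rough illustration of what the file-level flags act on, here is a minimal sketch of the kind of plain C loop that the compiler is being asked to auto-vectorize. `xor_words_2` is a hypothetical, simplified stand-in and not the actual routine from `<asm-generic/xor.h>`; the `restrict` qualifiers are an assumption added here to make the absence of aliasing explicit.

```c
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for the reference XOR routines:
 * XOR the words of p2 into p1. With -ftree-vectorize and the NEON flags
 * from the Makefile hunk, a compiler may turn this loop into NEON
 * instructions; whether it does depends on the compiler and its cost model.
 */
static void xor_words_2(size_t bytes, unsigned long *restrict p1,
			const unsigned long *restrict p2)
{
	size_t words = bytes / sizeof(unsigned long);
	size_t i;

	for (i = 0; i < words; i++)
		p1[i] ^= p2[i];
}
```

Because the flags are now passed through `CFLAGS_xor-neon.o`, they are added to the options the rest of the kernel is built with, instead of being applied per function by a pragma that, as the commit message notes, replaces optimization options rather than appending to them.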
* Re: [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile 2020-11-12 21:24 ` [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile Adrian Ratiu @ 2020-11-12 21:40 ` Nick Desaulniers 2020-11-12 21:50 ` Nathan Chancellor 2020-11-13 7:50 ` Ard Biesheuvel 2 siblings, 0 replies; 12+ messages in thread From: Nick Desaulniers @ 2020-11-12 21:40 UTC (permalink / raw) To: Adrian Ratiu Cc: Linux ARM, Nathan Chancellor, Arnd Bergmann, Russell King, Ard Biesheuvel, Arvind Sankar, Collabora Kernel ML, clang-built-linux, Linux Kernel Mailing List On Thu, Nov 12, 2020 at 1:23 PM Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > Using a pragma like GCC optimize is a bad idea because it tags > all functions with an __attribute__((optimize)) which replaces > optimization options rather than appending so could result in > dropping important flags. Not recommended for production use. > > Because these options should always be enabled for this file, > it's better to set them via command line. tree-vectorize is on > by default in Clang, but it doesn't hurt to make it explicit. 
> > Suggested-by: Arvind Sankar <nivedita@alum.mit.edu> > Suggested-by: Ard Biesheuvel <ardb@kernel.org> > Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> > --- > arch/arm/lib/Makefile | 2 +- > arch/arm/lib/xor-neon.c | 10 ---------- > 2 files changed, 1 insertion(+), 11 deletions(-) > > diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile > index 6d2ba454f25b..12d31d1a7630 100644 > --- a/arch/arm/lib/Makefile > +++ b/arch/arm/lib/Makefile > @@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S > > ifeq ($(CONFIG_KERNEL_MODE_NEON),y) > NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon > - CFLAGS_xor-neon.o += $(NEON_FLAGS) > + CFLAGS_xor-neon.o += $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable > obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o > endif > diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c > index e1e76186ec23..62b493e386c4 100644 > --- a/arch/arm/lib/xor-neon.c > +++ b/arch/arm/lib/xor-neon.c > @@ -14,16 +14,6 @@ MODULE_LICENSE("GPL"); > #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon' > #endif > > -/* > - * Pull in the reference implementations while instructing GCC (through > - * -ftree-vectorize) to attempt to exploit implicit parallelism and emit > - * NEON instructions. > - */ > -#ifdef CONFIG_CC_IS_GCC > -#pragma GCC optimize "tree-vectorize" > -#endif > - > -#pragma GCC diagnostic ignored "-Wunused-variable" > #include <asm-generic/xor.h> > > struct xor_block_template const xor_block_neon_inner = { > -- > 2.29.2 > -- Thanks, ~Nick Desaulniers ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile 2020-11-12 21:24 ` [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile Adrian Ratiu 2020-11-12 21:40 ` Nick Desaulniers @ 2020-11-12 21:50 ` Nathan Chancellor 2020-11-13 7:50 ` Ard Biesheuvel 2 siblings, 0 replies; 12+ messages in thread From: Nathan Chancellor @ 2020-11-12 21:50 UTC (permalink / raw) To: Adrian Ratiu Cc: linux-arm-kernel, Nick Desaulniers, Arnd Bergmann, Russell King, Ard Biesheuvel, Arvind Sankar, kernel, clang-built-linux, Linux Kernel Mailing List On Thu, Nov 12, 2020 at 11:24:57PM +0200, Adrian Ratiu wrote: > Using a pragma like GCC optimize is a bad idea because it tags > all functions with an __attribute__((optimize)) which replaces > optimization options rather than appending so could result in > dropping important flags. Not recommended for production use. > > Because these options should always be enabled for this file, > it's better to set them via command line. tree-vectorize is on > by default in Clang, but it doesn't hurt to make it explicit. 
> > Suggested-by: Arvind Sankar <nivedita@alum.mit.edu> > Suggested-by: Ard Biesheuvel <ardb@kernel.org> > Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> > --- > arch/arm/lib/Makefile | 2 +- > arch/arm/lib/xor-neon.c | 10 ---------- > 2 files changed, 1 insertion(+), 11 deletions(-) > > diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile > index 6d2ba454f25b..12d31d1a7630 100644 > --- a/arch/arm/lib/Makefile > +++ b/arch/arm/lib/Makefile > @@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S > > ifeq ($(CONFIG_KERNEL_MODE_NEON),y) > NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon > - CFLAGS_xor-neon.o += $(NEON_FLAGS) > + CFLAGS_xor-neon.o += $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable > obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o > endif > diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c > index e1e76186ec23..62b493e386c4 100644 > --- a/arch/arm/lib/xor-neon.c > +++ b/arch/arm/lib/xor-neon.c > @@ -14,16 +14,6 @@ MODULE_LICENSE("GPL"); > #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon' > #endif > > -/* > - * Pull in the reference implementations while instructing GCC (through > - * -ftree-vectorize) to attempt to exploit implicit parallelism and emit > - * NEON instructions. > - */ > -#ifdef CONFIG_CC_IS_GCC > -#pragma GCC optimize "tree-vectorize" > -#endif > - > -#pragma GCC diagnostic ignored "-Wunused-variable" > #include <asm-generic/xor.h> > > struct xor_block_template const xor_block_neon_inner = { > -- > 2.29.2 > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile 2020-11-12 21:24 ` [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile Adrian Ratiu 2020-11-12 21:40 ` Nick Desaulniers 2020-11-12 21:50 ` Nathan Chancellor @ 2020-11-13 7:50 ` Ard Biesheuvel 2020-11-13 11:17 ` Adrian Ratiu 2 siblings, 1 reply; 12+ messages in thread From: Ard Biesheuvel @ 2020-11-13 7:50 UTC (permalink / raw) To: Adrian Ratiu Cc: Linux ARM, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, Russell King, Arvind Sankar, kernel, clang-built-linux, Linux Kernel Mailing List On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > Using a pragma like GCC optimize is a bad idea because it tags > all functions with an __attribute__((optimize)) which replaces > optimization options rather than appending so could result in > dropping important flags. Not recommended for production use. > > Because these options should always be enabled for this file, > it's better to set them via command line. tree-vectorize is on > by default in Clang, but it doesn't hurt to make it explicit. 
> > Suggested-by: Arvind Sankar <nivedita@alum.mit.edu> > Suggested-by: Ard Biesheuvel <ardb@kernel.org> > Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> > --- > arch/arm/lib/Makefile | 2 +- > arch/arm/lib/xor-neon.c | 10 ---------- > 2 files changed, 1 insertion(+), 11 deletions(-) > > diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile > index 6d2ba454f25b..12d31d1a7630 100644 > --- a/arch/arm/lib/Makefile > +++ b/arch/arm/lib/Makefile > @@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S > > ifeq ($(CONFIG_KERNEL_MODE_NEON),y) > NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon > - CFLAGS_xor-neon.o += $(NEON_FLAGS) > + CFLAGS_xor-neon.o += $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable > obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o > endif > diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c > index e1e76186ec23..62b493e386c4 100644 > --- a/arch/arm/lib/xor-neon.c > +++ b/arch/arm/lib/xor-neon.c > @@ -14,16 +14,6 @@ MODULE_LICENSE("GPL"); > #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon' > #endif > > -/* > - * Pull in the reference implementations while instructing GCC (through > - * -ftree-vectorize) to attempt to exploit implicit parallelism and emit > - * NEON instructions. > - */ > -#ifdef CONFIG_CC_IS_GCC > -#pragma GCC optimize "tree-vectorize" > -#endif > - > -#pragma GCC diagnostic ignored "-Wunused-variable" > #include <asm-generic/xor.h> > > struct xor_block_template const xor_block_neon_inner = { > -- > 2.29.2 > So what is the status now here? How does putting -ftree-vectorize on the command line interact with Clang? ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile 2020-11-13 7:50 ` Ard Biesheuvel @ 2020-11-13 11:17 ` Adrian Ratiu 0 siblings, 0 replies; 12+ messages in thread From: Adrian Ratiu @ 2020-11-13 11:17 UTC (permalink / raw) To: Ard Biesheuvel Cc: Linux ARM, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, Russell King, Arvind Sankar, kernel, clang-built-linux, Linux Kernel Mailing List On Fri, 13 Nov 2020, Ard Biesheuvel <ardb@kernel.org> wrote: > On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu > <adrian.ratiu@collabora.com> wrote: >> >> Using a pragma like GCC optimize is a bad idea because it tags >> all functions with an __attribute__((optimize)) which replaces >> optimization options rather than appending so could result in >> dropping important flags. Not recommended for production use. >> >> Because these options should always be enabled for this file, >> it's better to set them via command line. tree-vectorize is on >> by default in Clang, but it doesn't hurt to make it explicit. 
>> >> Suggested-by: Arvind Sankar <nivedita@alum.mit.edu> >> Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: >> Adrian Ratiu <adrian.ratiu@collabora.com> --- >> arch/arm/lib/Makefile | 2 +- arch/arm/lib/xor-neon.c | 10 >> ---------- 2 files changed, 1 insertion(+), 11 deletions(-) >> >> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile >> index 6d2ba454f25b..12d31d1a7630 100644 --- >> a/arch/arm/lib/Makefile +++ b/arch/arm/lib/Makefile @@ -45,6 >> +45,6 @@ $(obj)/csumpartialcopyuser.o: >> $(obj)/csumpartialcopygeneric.S >> >> ifeq ($(CONFIG_KERNEL_MODE_NEON),y) >> NEON_FLAGS := -march=armv7-a >> -mfloat-abi=softfp -mfpu=neon >> - CFLAGS_xor-neon.o += $(NEON_FLAGS) + >> CFLAGS_xor-neon.o += $(NEON_FLAGS) -ftree-vectorize >> -Wno-unused-variable >> obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o >> endif >> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c >> index e1e76186ec23..62b493e386c4 100644 --- >> a/arch/arm/lib/xor-neon.c +++ b/arch/arm/lib/xor-neon.c @@ >> -14,16 +14,6 @@ MODULE_LICENSE("GPL"); >> #error You should compile this file with '-march=armv7-a >> -mfloat-abi=softfp -mfpu=neon' #endif >> >> -/* - * Pull in the reference implementations while instructing >> GCC (through - * -ftree-vectorize) to attempt to exploit >> implicit parallelism and emit - * NEON instructions. - */ >> -#ifdef CONFIG_CC_IS_GCC -#pragma GCC optimize "tree-vectorize" >> -#endif - -#pragma GCC diagnostic ignored "-Wunused-variable" >> #include <asm-generic/xor.h> >> >> struct xor_block_template const xor_block_neon_inner = { >> -- 2.29.2 >> > > So what is the status now here? How does putting > -ftree-vectorize on the command line interact with Clang? Clang needs to be fixed separately as -ftree-vectorize does not change anything, the option is enabled by default. 
I know it sucks to have such a silent failure, but it's always been there (the "upgrade your GCC" warning during Clang builds was bogus) and I do not want to rush a Clang fix without fully understanding it. Warning Clang users that the optimization doesn't work was discussed but dropped because users can't do anything about it. If we are positively certain this is a kernel bug and not a Clang bug (i.e. the xor-neon use case is not enabling/triggering the optimization properly) I could add a TODO comment in the code FWIW. Adrian ^ permalink raw reply [flat|nested] 12+ messages in thread