linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/2] xor-neon: Remove GCC warn & pragmas
@ 2020-11-12 21:24 Adrian Ratiu
  2020-11-12 21:24 ` [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning Adrian Ratiu
  2020-11-12 21:24 ` [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile Adrian Ratiu
  0 siblings, 2 replies; 12+ messages in thread
From: Adrian Ratiu @ 2020-11-12 21:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, Russell King,
	Ard Biesheuvel, Arvind Sankar, kernel, clang-built-linux,
	Linux Kernel Mailing List

Dear all,

This is v2 of the patch series at
id:20201106051436.2384842-1-adrian.ratiu@collabora.com

Tested on next-20201112 using GCC 10.2.0 and Clang 10.0.1.

Kind regards,
Adrian

Changes in v2:
  - Dropped the patch which disabled Clang vectorization (Nick)
  - Added new patch to move pragmas to makefile cmdline options
  (Arvind and Ard)
  
Adrian Ratiu (1):
  arm: lib: xor-neon: move pragma options to makefile

Nathan Chancellor (1):
  arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

 arch/arm/lib/Makefile   |  2 +-
 arch/arm/lib/xor-neon.c | 17 -----------------
 2 files changed, 1 insertion(+), 18 deletions(-)

-- 
2.29.2



* [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning
  2020-11-12 21:24 [PATCH v2 0/2] xor-neon: Remove GCC warn & pragmas Adrian Ratiu
@ 2020-11-12 21:24 ` Adrian Ratiu
  2020-11-12 21:38   ` Nick Desaulniers
  2020-11-13  7:49   ` Ard Biesheuvel
  2020-11-12 21:24 ` [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile Adrian Ratiu
  1 sibling, 2 replies; 12+ messages in thread
From: Adrian Ratiu @ 2020-11-12 21:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, Russell King,
	Ard Biesheuvel, Arvind Sankar, kernel, clang-built-linux,
	Linux Kernel Mailing List

From: Nathan Chancellor <natechancellor@gmail.com>

Drop warning because kernel now requires GCC >= v4.9 after
commit 6ec4476ac825 ("Raise gcc version requirement to 4.9").

Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
---
 arch/arm/lib/xor-neon.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index b99dd8e1c93f..e1e76186ec23 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -19,15 +19,8 @@ MODULE_LICENSE("GPL");
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
  */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
 #endif
 
 #pragma GCC diagnostic ignored "-Wunused-variable"
-- 
2.29.2



* [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile
  2020-11-12 21:24 [PATCH v2 0/2] xor-neon: Remove GCC warn & pragmas Adrian Ratiu
  2020-11-12 21:24 ` [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning Adrian Ratiu
@ 2020-11-12 21:24 ` Adrian Ratiu
  2020-11-12 21:40   ` Nick Desaulniers
                     ` (2 more replies)
  1 sibling, 3 replies; 12+ messages in thread
From: Adrian Ratiu @ 2020-11-12 21:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, Russell King,
	Ard Biesheuvel, Arvind Sankar, kernel, clang-built-linux,
	Linux Kernel Mailing List

Using a pragma like GCC optimize is a bad idea because it tags
all functions with an __attribute__((optimize)) which replaces
optimization options rather than appending so could result in
dropping important flags. Not recommended for production use.

Because these options should always be enabled for this file,
it's better to set them via command line. tree-vectorize is on
by default in Clang, but it doesn't hurt to make it explicit.

Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
---
 arch/arm/lib/Makefile   |  2 +-
 arch/arm/lib/xor-neon.c | 10 ----------
 2 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 6d2ba454f25b..12d31d1a7630 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o:	$(obj)/csumpartialcopygeneric.S
 
 ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
   NEON_FLAGS			:= -march=armv7-a -mfloat-abi=softfp -mfpu=neon
-  CFLAGS_xor-neon.o		+= $(NEON_FLAGS)
+  CFLAGS_xor-neon.o		+= $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable
   obj-$(CONFIG_XOR_BLOCKS)	+= xor-neon.o
 endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index e1e76186ec23..62b493e386c4 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,16 +14,6 @@ MODULE_LICENSE("GPL");
 #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
 #endif
 
-/*
- * Pull in the reference implementations while instructing GCC (through
- * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
- * NEON instructions.
- */
-#ifdef CONFIG_CC_IS_GCC
-#pragma GCC optimize "tree-vectorize"
-#endif
-
-#pragma GCC diagnostic ignored "-Wunused-variable"
 #include <asm-generic/xor.h>
 
 struct xor_block_template const xor_block_neon_inner = {
-- 
2.29.2
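
The commit message above argues that "#pragma GCC optimize" tags every
following function with an optimize attribute. As a rough illustration only,
the per-function form looks like the sketch below. The macro VECTORIZE_HINT
and the function xor_words are hypothetical names, not kernel code, and the
attribute is compiled only for GCC because this sketch assumes other
compilers may not accept it. The GCC manual itself describes the optimize
attribute as intended for debugging rather than production code, which is
consistent with moving the option to the command line.

```c
#include <assert.h>

/*
 * Hypothetical helper macro: expands to GCC's per-function optimize
 * attribute, and to nothing on other compilers (including Clang).
 */
#if defined(__GNUC__) && !defined(__clang__)
#define VECTORIZE_HINT __attribute__((optimize("tree-vectorize")))
#else
#define VECTORIZE_HINT
#endif

/* XOR n words of src into dst; a candidate loop for auto-vectorization. */
VECTORIZE_HINT
void xor_words(unsigned long *dst, const unsigned long *src, int n)
{
	assert(n >= 0);

	for (int i = 0; i < n; i++)
		dst[i] ^= src[i];
}
```

With the Makefile approach from this patch, no attribute or pragma is needed
in the source; the whole object file is simply built with -ftree-vectorize.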



* Re: [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning
  2020-11-12 21:24 ` [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning Adrian Ratiu
@ 2020-11-12 21:38   ` Nick Desaulniers
  2020-11-13  7:49   ` Ard Biesheuvel
  1 sibling, 0 replies; 12+ messages in thread
From: Nick Desaulniers @ 2020-11-12 21:38 UTC (permalink / raw)
  To: Adrian Ratiu
  Cc: Linux ARM, Nathan Chancellor, Arnd Bergmann, Russell King,
	Ard Biesheuvel, Arvind Sankar, Collabora Kernel ML,
	clang-built-linux, Linux Kernel Mailing List

On Thu, Nov 12, 2020 at 1:23 PM Adrian Ratiu <adrian.ratiu@collabora.com> wrote:
>
> From: Nathan Chancellor <natechancellor@gmail.com>
>
> Drop warning because kernel now requires GCC >= v4.9 after
> commit 6ec4476ac825 ("Raise gcc version requirement to 4.9").
>
> Reported-by: Nick Desaulniers <ndesaulniers@google.com>
> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>

Link: https://github.com/ClangBuiltLinux/linux/issues/496
Link: https://github.com/ClangBuiltLinux/linux/issues/503
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

> ---
>  arch/arm/lib/xor-neon.c | 9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index b99dd8e1c93f..e1e76186ec23 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -19,15 +19,8 @@ MODULE_LICENSE("GPL");
>   * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
>   * NEON instructions.
>   */
> -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
> +#ifdef CONFIG_CC_IS_GCC
>  #pragma GCC optimize "tree-vectorize"
> -#else
> -/*
> - * While older versions of GCC do not generate incorrect code, they fail to
> - * recognize the parallel nature of these functions, and emit plain ARM code,
> - * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
> - */
> -#warning This code requires at least version 4.6 of GCC
>  #endif
>
>  #pragma GCC diagnostic ignored "-Wunused-variable"
> --
> 2.29.2
>


-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile
  2020-11-12 21:24 ` [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile Adrian Ratiu
@ 2020-11-12 21:40   ` Nick Desaulniers
  2020-11-12 21:50   ` Nathan Chancellor
  2020-11-13  7:50   ` Ard Biesheuvel
  2 siblings, 0 replies; 12+ messages in thread
From: Nick Desaulniers @ 2020-11-12 21:40 UTC (permalink / raw)
  To: Adrian Ratiu
  Cc: Linux ARM, Nathan Chancellor, Arnd Bergmann, Russell King,
	Ard Biesheuvel, Arvind Sankar, Collabora Kernel ML,
	clang-built-linux, Linux Kernel Mailing List

On Thu, Nov 12, 2020 at 1:23 PM Adrian Ratiu <adrian.ratiu@collabora.com> wrote:
>
> Using a pragma like GCC optimize is a bad idea because it tags
> all functions with an __attribute__((optimize)) which replaces
> optimization options rather than appending so could result in
> dropping important flags. Not recommended for production use.
>
> Because these options should always be enabled for this file,
> it's better to set them via command line. tree-vectorize is on
> by default in Clang, but it doesn't hurt to make it explicit.
>
> Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
> Suggested-by: Ard Biesheuvel <ardb@kernel.org>
> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

> ---
>  arch/arm/lib/Makefile   |  2 +-
>  arch/arm/lib/xor-neon.c | 10 ----------
>  2 files changed, 1 insertion(+), 11 deletions(-)
>
> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> index 6d2ba454f25b..12d31d1a7630 100644
> --- a/arch/arm/lib/Makefile
> +++ b/arch/arm/lib/Makefile
> @@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
>
>  ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
>    NEON_FLAGS                   := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> -  CFLAGS_xor-neon.o            += $(NEON_FLAGS)
> +  CFLAGS_xor-neon.o            += $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable
>    obj-$(CONFIG_XOR_BLOCKS)     += xor-neon.o
>  endif
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index e1e76186ec23..62b493e386c4 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -14,16 +14,6 @@ MODULE_LICENSE("GPL");
>  #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
>  #endif
>
> -/*
> - * Pull in the reference implementations while instructing GCC (through
> - * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
> - * NEON instructions.
> - */
> -#ifdef CONFIG_CC_IS_GCC
> -#pragma GCC optimize "tree-vectorize"
> -#endif
> -
> -#pragma GCC diagnostic ignored "-Wunused-variable"
>  #include <asm-generic/xor.h>
>
>  struct xor_block_template const xor_block_neon_inner = {
> --
> 2.29.2
>


-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile
  2020-11-12 21:24 ` [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile Adrian Ratiu
  2020-11-12 21:40   ` Nick Desaulniers
@ 2020-11-12 21:50   ` Nathan Chancellor
  2020-11-13  7:50   ` Ard Biesheuvel
  2 siblings, 0 replies; 12+ messages in thread
From: Nathan Chancellor @ 2020-11-12 21:50 UTC (permalink / raw)
  To: Adrian Ratiu
  Cc: linux-arm-kernel, Nick Desaulniers, Arnd Bergmann, Russell King,
	Ard Biesheuvel, Arvind Sankar, kernel, clang-built-linux,
	Linux Kernel Mailing List

On Thu, Nov 12, 2020 at 11:24:57PM +0200, Adrian Ratiu wrote:
> Using a pragma like GCC optimize is a bad idea because it tags
> all functions with an __attribute__((optimize)) which replaces
> optimization options rather than appending so could result in
> dropping important flags. Not recommended for production use.
> 
> Because these options should always be enabled for this file,
> it's better to set them via command line. tree-vectorize is on
> by default in Clang, but it doesn't hurt to make it explicit.
> 
> Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
> Suggested-by: Ard Biesheuvel <ardb@kernel.org>
> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>

Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>

> ---
>  arch/arm/lib/Makefile   |  2 +-
>  arch/arm/lib/xor-neon.c | 10 ----------
>  2 files changed, 1 insertion(+), 11 deletions(-)
> 
> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> index 6d2ba454f25b..12d31d1a7630 100644
> --- a/arch/arm/lib/Makefile
> +++ b/arch/arm/lib/Makefile
> @@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o:	$(obj)/csumpartialcopygeneric.S
>  
>  ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
>    NEON_FLAGS			:= -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> -  CFLAGS_xor-neon.o		+= $(NEON_FLAGS)
> +  CFLAGS_xor-neon.o		+= $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable
>    obj-$(CONFIG_XOR_BLOCKS)	+= xor-neon.o
>  endif
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index e1e76186ec23..62b493e386c4 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -14,16 +14,6 @@ MODULE_LICENSE("GPL");
>  #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
>  #endif
>  
> -/*
> - * Pull in the reference implementations while instructing GCC (through
> - * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
> - * NEON instructions.
> - */
> -#ifdef CONFIG_CC_IS_GCC
> -#pragma GCC optimize "tree-vectorize"
> -#endif
> -
> -#pragma GCC diagnostic ignored "-Wunused-variable"
>  #include <asm-generic/xor.h>
>  
>  struct xor_block_template const xor_block_neon_inner = {
> -- 
> 2.29.2
> 


* Re: [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning
  2020-11-12 21:24 ` [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning Adrian Ratiu
  2020-11-12 21:38   ` Nick Desaulniers
@ 2020-11-13  7:49   ` Ard Biesheuvel
  2020-11-13 11:07     ` Adrian Ratiu
  1 sibling, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2020-11-13  7:49 UTC (permalink / raw)
  To: Adrian Ratiu
  Cc: Linux ARM, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
	Russell King, Arvind Sankar, kernel, clang-built-linux,
	Linux Kernel Mailing List

On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu <adrian.ratiu@collabora.com> wrote:
>
> From: Nathan Chancellor <natechancellor@gmail.com>
>
> Drop warning because kernel now requires GCC >= v4.9 after
> commit 6ec4476ac825 ("Raise gcc version requirement to 4.9").
>
> Reported-by: Nick Desaulniers <ndesaulniers@google.com>
> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>

Again, this does not do what it says on the tin.

If you want to disable the pragma for Clang, call that out in the
commit log, and don't hide it under a GCC version change.

Without the pragma, the generated code is the same as the generic
code, so it makes no sense to build xor-neon.ko at all, right?

> ---
>  arch/arm/lib/xor-neon.c | 9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index b99dd8e1c93f..e1e76186ec23 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -19,15 +19,8 @@ MODULE_LICENSE("GPL");
>   * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
>   * NEON instructions.
>   */
> -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
> +#ifdef CONFIG_CC_IS_GCC
>  #pragma GCC optimize "tree-vectorize"
> -#else
> -/*
> - * While older versions of GCC do not generate incorrect code, they fail to
> - * recognize the parallel nature of these functions, and emit plain ARM code,
> - * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
> - */
> -#warning This code requires at least version 4.6 of GCC
>  #endif
>
>  #pragma GCC diagnostic ignored "-Wunused-variable"
> --
> 2.29.2
>


* Re: [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile
  2020-11-12 21:24 ` [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile Adrian Ratiu
  2020-11-12 21:40   ` Nick Desaulniers
  2020-11-12 21:50   ` Nathan Chancellor
@ 2020-11-13  7:50   ` Ard Biesheuvel
  2020-11-13 11:17     ` Adrian Ratiu
  2 siblings, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2020-11-13  7:50 UTC (permalink / raw)
  To: Adrian Ratiu
  Cc: Linux ARM, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
	Russell King, Arvind Sankar, kernel, clang-built-linux,
	Linux Kernel Mailing List

On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu <adrian.ratiu@collabora.com> wrote:
>
> Using a pragma like GCC optimize is a bad idea because it tags
> all functions with an __attribute__((optimize)) which replaces
> optimization options rather than appending so could result in
> dropping important flags. Not recommended for production use.
>
> Because these options should always be enabled for this file,
> it's better to set them via command line. tree-vectorize is on
> by default in Clang, but it doesn't hurt to make it explicit.
>
> Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
> Suggested-by: Ard Biesheuvel <ardb@kernel.org>
> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
> ---
>  arch/arm/lib/Makefile   |  2 +-
>  arch/arm/lib/xor-neon.c | 10 ----------
>  2 files changed, 1 insertion(+), 11 deletions(-)
>
> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> index 6d2ba454f25b..12d31d1a7630 100644
> --- a/arch/arm/lib/Makefile
> +++ b/arch/arm/lib/Makefile
> @@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
>
>  ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
>    NEON_FLAGS                   := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> -  CFLAGS_xor-neon.o            += $(NEON_FLAGS)
> +  CFLAGS_xor-neon.o            += $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable
>    obj-$(CONFIG_XOR_BLOCKS)     += xor-neon.o
>  endif
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index e1e76186ec23..62b493e386c4 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -14,16 +14,6 @@ MODULE_LICENSE("GPL");
>  #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
>  #endif
>
> -/*
> - * Pull in the reference implementations while instructing GCC (through
> - * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
> - * NEON instructions.
> - */
> -#ifdef CONFIG_CC_IS_GCC
> -#pragma GCC optimize "tree-vectorize"
> -#endif
> -
> -#pragma GCC diagnostic ignored "-Wunused-variable"
>  #include <asm-generic/xor.h>
>
>  struct xor_block_template const xor_block_neon_inner = {
> --
> 2.29.2
>

So what is the status now here? How does putting -ftree-vectorize on
the command line interact with Clang?


* Re: [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning
  2020-11-13  7:49   ` Ard Biesheuvel
@ 2020-11-13 11:07     ` Adrian Ratiu
  2020-11-13 11:41       ` Ard Biesheuvel
  0 siblings, 1 reply; 12+ messages in thread
From: Adrian Ratiu @ 2020-11-13 11:07 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux ARM, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
	Russell King, Arvind Sankar, kernel, clang-built-linux,
	Linux Kernel Mailing List

Hi Ard,

On Fri, 13 Nov 2020, Ard Biesheuvel <ardb@kernel.org> wrote:
> On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu 
> <adrian.ratiu@collabora.com> wrote: 
>> 
>> From: Nathan Chancellor <natechancellor@gmail.com> 
>> 
>> Drop warning because kernel now requires GCC >= v4.9 after 
>> commit 6ec4476ac825 ("Raise gcc version requirement to 4.9"). 
>> 
>> Reported-by: Nick Desaulniers <ndesaulniers@google.com> 
>> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> 
>> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> 
> 
> Again, this does not do what it says on the tin. 
> 
> If you want to disable the pragma for Clang, call that out in 
> the commit log, and don't hide it under a GCC version change.

I am not doing anything for Clang in this series.

The option to auto-vectorize in Clang is enabled by default but 
doesn't work for some reason (likely to do with how it computes 
the cost model, so maybe not even a bug at all) and if we enable 
it explicitly (e.g. via a Clang-specific pragma) we get some 
warnings we currently do not understand, so I am not changing the 
Clang behaviour at the recommendation of Nick.

So this is only for GCC as the "tin" says :) We can fix clang 
separately as the Clang bug has always been present and is 
unrelated.
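
For readers unfamiliar with the code in question, the loops that the
compiler is expected to vectorize look roughly like the sketch below. This
is a simplified stand-in for the two-source routine in asm-generic/xor.h,
not a copy of it; the name xor_2_sketch is hypothetical and the assert is
added only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * XOR the buffer p2 into p1, eight words per outer iteration.
 * 'bytes' must be a multiple of eight words.
 */
void xor_2_sketch(size_t bytes, unsigned long *p1, const unsigned long *p2)
{
	size_t lines = bytes / (sizeof(unsigned long) * 8);

	assert(bytes % (sizeof(unsigned long) * 8) == 0);

	while (lines--) {
		/* Independent word-wise XORs: the implicit parallelism. */
		for (int i = 0; i < 8; i++)
			p1[i] ^= p2[i];
		p1 += 8;
		p2 += 8;
	}
}
```

One way to see what each compiler does with such a loop is to request
vectorizer remarks: -fopt-info-vec-optimized and -fopt-info-vec-missed on
GCC, or -Rpass=loop-vectorize and -Rpass-missed=loop-vectorize on Clang.
The Clang-specific pragma mentioned above would presumably be of the form
"#pragma clang loop vectorize(enable)", though the thread does not say
which one was tried.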

> 
> Without the pragma, the generated code is the same as the 
> generic code, so it makes no sense to build xor-neon.ko at all, 
> right? 
>

Yes that is correct and that is the reason why in v1 I opted to 
not build xor-neon.ko for Clang anymore, but that got NACKed, so 
here I'm fixing the low hanging fruit: the very obvious & clear 
GCC problems.


>> ---
>>  arch/arm/lib/xor-neon.c | 9 +--------
>>  1 file changed, 1 insertion(+), 8 deletions(-)
>>
>> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
>> index b99dd8e1c93f..e1e76186ec23 100644
>> --- a/arch/arm/lib/xor-neon.c
>> +++ b/arch/arm/lib/xor-neon.c
>> @@ -19,15 +19,8 @@ MODULE_LICENSE("GPL");
>>   * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
>>   * NEON instructions.
>>   */
>> -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
>> +#ifdef CONFIG_CC_IS_GCC
>>  #pragma GCC optimize "tree-vectorize"
>> -#else
>> -/*
>> - * While older versions of GCC do not generate incorrect code, they fail to
>> - * recognize the parallel nature of these functions, and emit plain ARM code,
>> - * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
>> - */
>> -#warning This code requires at least version 4.6 of GCC
>>  #endif
>>
>>  #pragma GCC diagnostic ignored "-Wunused-variable"
>> --
>> 2.29.2
>>


* Re: [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile
  2020-11-13  7:50   ` Ard Biesheuvel
@ 2020-11-13 11:17     ` Adrian Ratiu
  0 siblings, 0 replies; 12+ messages in thread
From: Adrian Ratiu @ 2020-11-13 11:17 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux ARM, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
	Russell King, Arvind Sankar, kernel, clang-built-linux,
	Linux Kernel Mailing List

On Fri, 13 Nov 2020, Ard Biesheuvel <ardb@kernel.org> wrote:
> On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu 
> <adrian.ratiu@collabora.com> wrote: 
>> 
>> Using a pragma like GCC optimize is a bad idea because it tags 
>> all functions with an __attribute__((optimize)) which replaces 
>> optimization options rather than appending so could result in 
>> dropping important flags. Not recommended for production use. 
>> 
>> Because these options should always be enabled for this file, 
>> it's better to set them via command line. tree-vectorize is on 
>> by default in Clang, but it doesn't hurt to make it explicit. 
>> 
>> Suggested-by: Arvind Sankar <nivedita@alum.mit.edu> 
>> Suggested-by: Ard Biesheuvel <ardb@kernel.org>
>> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
>> ---
>>  arch/arm/lib/Makefile   |  2 +-
>>  arch/arm/lib/xor-neon.c | 10 ----------
>>  2 files changed, 1 insertion(+), 11 deletions(-)
>>
>> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
>> index 6d2ba454f25b..12d31d1a7630 100644
>> --- a/arch/arm/lib/Makefile
>> +++ b/arch/arm/lib/Makefile
>> @@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
>>
>>  ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
>>    NEON_FLAGS                   := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
>> -  CFLAGS_xor-neon.o            += $(NEON_FLAGS)
>> +  CFLAGS_xor-neon.o            += $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable
>>    obj-$(CONFIG_XOR_BLOCKS)     += xor-neon.o
>>  endif
>> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
>> index e1e76186ec23..62b493e386c4 100644
>> --- a/arch/arm/lib/xor-neon.c
>> +++ b/arch/arm/lib/xor-neon.c
>> @@ -14,16 +14,6 @@ MODULE_LICENSE("GPL");
>>  #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
>>  #endif
>>
>> -/*
>> - * Pull in the reference implementations while instructing GCC (through
>> - * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
>> - * NEON instructions.
>> - */
>> -#ifdef CONFIG_CC_IS_GCC
>> -#pragma GCC optimize "tree-vectorize"
>> -#endif
>> -
>> -#pragma GCC diagnostic ignored "-Wunused-variable"
>>  #include <asm-generic/xor.h>
>>
>>  struct xor_block_template const xor_block_neon_inner = {
>> --
>> 2.29.2
>>
> 
> So what is the status now here? How does putting 
> -ftree-vectorize on the command line interact with Clang? 

Clang needs to be fixed separately: -ftree-vectorize does not 
change anything there, because the option is already enabled by default.

I know it sucks to have such a silent failure, but it's always 
been there (the "upgrade your GCC" warning during Clang builds was 
bogus) and I do not want to rush a Clang fix without fully 
understanding it.

Warning Clang users that the optimization doesn't work was 
discussed but dropped because users can't do anything about it.

If we are positively certain this is a kernel bug and not a Clang 
bug (i.e. the xor-neon use case is not enabling/triggering the 
optimization properly) I could add a TODO comment in the code 
FWIW.

Adrian


* Re: [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning
  2020-11-13 11:07     ` Adrian Ratiu
@ 2020-11-13 11:41       ` Ard Biesheuvel
  2020-11-13 11:59         ` Adrian Ratiu
  0 siblings, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2020-11-13 11:41 UTC (permalink / raw)
  To: Adrian Ratiu
  Cc: Linux ARM, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
	Russell King, Arvind Sankar, Collabora Kernel ML,
	clang-built-linux, Linux Kernel Mailing List

On Fri, 13 Nov 2020 at 12:05, Adrian Ratiu <adrian.ratiu@collabora.com> wrote:
>
> Hi Ard,
>
> On Fri, 13 Nov 2020, Ard Biesheuvel <ardb@kernel.org> wrote:
> > On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu
> > <adrian.ratiu@collabora.com> wrote:
> >>
> >> From: Nathan Chancellor <natechancellor@gmail.com>
> >>
> >> Drop warning because kernel now requires GCC >= v4.9 after
> >> commit 6ec4476ac825 ("Raise gcc version requirement to 4.9").
> >>
> >> Reported-by: Nick Desaulniers <ndesaulniers@google.com>
> >> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
> >> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
> >
> > Again, this does not do what it says on the tin.
> >
> > If you want to disable the pragma for Clang, call that out in
> > the commit log, and don't hide it under a GCC version change.
>
> I am not doing anything for Clang in this series.
>
> The option to auto-vectorize in Clang is enabled by default but
> doesn't work for some reason (likely to do with how it computes
> the cost model, so maybe not even a bug at all) and if we enable
> it explicitly (e.g. via a Clang-specific pragma) we get some
> warnings we currently do not understand, so I am not changing the
> Clang behaviour at the recommendation of Nick.
>
> So this is only for GCC as the "tin" says :) We can fix clang
> separately as the Clang bug has always been present and is
> unrelated.
>

But you are adding the IS_GCC check here, no? Is that equivalent? IOW,
does Clang today identify as GCC <= 4.6?

> >
> > Without the pragma, the generated code is the same as the
> > generic code, so it makes no sense to build xor-neon.ko at all,
> > right?
> >
>
> Yes that is correct and that is the reason why in v1 I opted to
> not build xor-neon.ko for Clang anymore, but that got NACKed, so
> here I'm fixing the low hanging fruit: the very obvious & clear
> GCC problems.
>
>

Fair enough.

> >> ---
> >>  arch/arm/lib/xor-neon.c | 9 +--------
> >>  1 file changed, 1 insertion(+), 8 deletions(-)
> >>
> >> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> >> index b99dd8e1c93f..e1e76186ec23 100644
> >> --- a/arch/arm/lib/xor-neon.c
> >> +++ b/arch/arm/lib/xor-neon.c
> >> @@ -19,15 +19,8 @@ MODULE_LICENSE("GPL");
> >>   * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
> >>   * NEON instructions.
> >>   */
> >> -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
> >> +#ifdef CONFIG_CC_IS_GCC
> >>  #pragma GCC optimize "tree-vectorize"
> >> -#else
> >> -/*
> >> - * While older versions of GCC do not generate incorrect code, they fail to
> >> - * recognize the parallel nature of these functions, and emit plain ARM code,
> >> - * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
> >> - */
> >> -#warning This code requires at least version 4.6 of GCC
> >>  #endif
> >>
> >>  #pragma GCC diagnostic ignored "-Wunused-variable"
> >> --
> >> 2.29.2
> >>


* Re: [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning
  2020-11-13 11:41       ` Ard Biesheuvel
@ 2020-11-13 11:59         ` Adrian Ratiu
  0 siblings, 0 replies; 12+ messages in thread
From: Adrian Ratiu @ 2020-11-13 11:59 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux ARM, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann,
	Russell King, Arvind Sankar, Collabora Kernel ML,
	clang-built-linux, Linux Kernel Mailing List

On Fri, 13 Nov 2020, Ard Biesheuvel <ardb@kernel.org> wrote:
> On Fri, 13 Nov 2020 at 12:05, Adrian Ratiu 
> <adrian.ratiu@collabora.com> wrote: 
>> 
>> Hi Ard, 
>> 
>> On Fri, 13 Nov 2020, Ard Biesheuvel <ardb@kernel.org> wrote: 
>> > On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu 
>> > <adrian.ratiu@collabora.com> wrote: 
>> >> 
>> >> From: Nathan Chancellor <natechancellor@gmail.com> 
>> >> 
>> >> Drop warning because kernel now requires GCC >= v4.9 after 
>> >> commit 6ec4476ac825 ("Raise gcc version requirement to 
>> >> 4.9"). 
>> >> 
>> >> Reported-by: Nick Desaulniers <ndesaulniers@google.com> 
>> >> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> 
>> >> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> 
>> > 
>> > Again, this does not do what it says on the tin. 
>> > 
>> > If you want to disable the pragma for Clang, call that out in 
>> > the commit log, and don't hide it under a GCC version change. 
>> 
>> I am not doing anything for Clang in this series. 
>> 
>> The option to auto-vectorize in Clang is enabled by default but 
>> doesn't work for some reason (likely to do with how it computes 
>> the cost model, so maybe not even a bug at all) and if we 
>> enable it explicitly (e.g. via a Clang-specific pragma) we get 
>> some warnings we currently do not understand, so I am not 
>> changing the Clang behaviour at the recommendation of Nick. 
>> 
>> So this is only for GCC as the "tin" says :) We can fix clang 
>> separately as the Clang bug has always been present and is 
>> unrelated. 
>> 
> 
> But you are adding the IS_GCC check here, no? Is that 
> equivalent? IOW, does Clang today identify as GCC <= 4.6? 
>

I see what you mean now. Thanks.

Yes, Clang identifies as GCC < 4.6, so strictly speaking the code 
is not equivalent: with this change, Clang builds no longer emit 
the warning. The warning to upgrade GCC doesn't make sense for 
Clang, but I should mention its removal in the commit message as well.
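
To make the difference concrete, the version test that patch 1/2 removes can
be written as a small function, as in the sketch below. The function names
are hypothetical. The comment about Clang relies on the assumption that Clang
keeps its default of reporting itself as GCC 4.2 through __GNUC__ and
__GNUC_MINOR__; a different -fgnuc-version setting would change that.

```c
#include <assert.h>
#include <stdbool.h>

/* Same condition as the removed "#if __GNUC__ > 4 || ..." check. */
bool passes_old_gcc_46_check(int major, int minor)
{
	assert(major >= 0 && minor >= 0);

	return major > 4 || (major == 4 && minor >= 6);
}

/*
 * Outcome of the old check for whichever compiler builds this file.
 * Under Clang's default GNUC version (4.2, an assumption here) this
 * returns false, which is why Clang builds used to take the #else
 * branch and print the "upgrade GCC" warning.
 */
bool current_compiler_passes_old_check(void)
{
#if defined(__GNUC__) && defined(__GNUC_MINOR__)
	return passes_old_gcc_46_check(__GNUC__, __GNUC_MINOR__);
#else
	return false;
#endif
}
```

After the patch, the test is CONFIG_CC_IS_GCC instead, so a Clang build gets
neither the pragma nor the warning.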

>> >
>> > Without the pragma, the generated code is the same as the
>> > generic code, so it makes no sense to build xor-neon.ko at all,
>> > right?
>> >
>>
>> Yes that is correct and that is the reason why in v1 I opted to
>> not build xor-neon.ko for Clang anymore, but that got NACKed, so
>> here I'm fixing the low hanging fruit: the very obvious & clear
>> GCC problems.
>>
>>
>
> Fair enough.
>
>> >> ---
>> >>  arch/arm/lib/xor-neon.c | 9 +--------
>> >>  1 file changed, 1 insertion(+), 8 deletions(-)
>> >>
>> >> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
>> >> index b99dd8e1c93f..e1e76186ec23 100644
>> >> --- a/arch/arm/lib/xor-neon.c
>> >> +++ b/arch/arm/lib/xor-neon.c
>> >> @@ -19,15 +19,8 @@ MODULE_LICENSE("GPL");
>> >>   * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
>> >>   * NEON instructions.
>> >>   */
>> >> -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
>> >> +#ifdef CONFIG_CC_IS_GCC
>> >>  #pragma GCC optimize "tree-vectorize"
>> >> -#else
>> >> -/*
>> >> - * While older versions of GCC do not generate incorrect code, they fail to
>> >> - * recognize the parallel nature of these functions, and emit plain ARM code,
>> >> - * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
>> >> - */
>> >> -#warning This code requires at least version 4.6 of GCC
>> >>  #endif
>> >>
>> >>  #pragma GCC diagnostic ignored "-Wunused-variable"
>> >> --
>> >> 2.29.2
>> >>

