All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] compiler.h: Fix barrier_data() on clang
@ 2020-10-14 21:26 Arvind Sankar
  2020-10-14 22:51 ` Nick Desaulniers
                   ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Arvind Sankar @ 2020-10-14 21:26 UTC (permalink / raw)
  To: Nathan Chancellor, Nick Desaulniers, clang-built-linux; +Cc: linux-kernel

Commit
  815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")

neglected to copy barrier_data() from compiler-gcc.h into
compiler-clang.h. The definition in compiler-gcc.h was really to work
around clang's more aggressive optimization, so this broke
barrier_data() on clang, and consequently memzero_explicit() as well.

For example, this results in at least the memzero_explicit() call in
lib/crypto/sha256.c:sha256_transform() being optimized away by clang.

Fix this by moving the definition of barrier_data() into compiler.h.

Also move the gcc/clang definition of barrier() into compiler.h,
__memory_barrier() is icc-specific (and barrier() is already defined
using it in compiler-intel.h) and doesn't belong in compiler.h.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
---
 include/linux/compiler-clang.h |  6 ------
 include/linux/compiler-gcc.h   | 19 -------------------
 include/linux/compiler.h       | 18 ++++++++++++++++--
 3 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index cee0c728d39a..04c0a5a717f7 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -52,12 +52,6 @@
 #define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
 #endif
 
-/* The following are for compatibility with GCC, from compiler-gcc.h,
- * and may be redefined here because they should not be shared with other
- * compilers, like ICC.
- */
-#define barrier() __asm__ __volatile__("" : : : "memory")
-
 #if __has_feature(shadow_call_stack)
 # define __noscs	__attribute__((__no_sanitize__("shadow-call-stack")))
 #endif
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 7a3769040d7d..fda30ffb037b 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -15,25 +15,6 @@
 # error Sorry, your compiler is too old - please upgrade it.
 #endif
 
-/* Optimization barrier */
-
-/* The "volatile" is due to gcc bugs */
-#define barrier() __asm__ __volatile__("": : :"memory")
-/*
- * This version is i.e. to prevent dead stores elimination on @ptr
- * where gcc and llvm may behave differently when otherwise using
- * normal barrier(): while gcc behavior gets along with a normal
- * barrier(), llvm needs an explicit input variable to be assumed
- * clobbered. The issue is as follows: while the inline asm might
- * access any memory it wants, the compiler could have fit all of
- * @ptr into memory registers instead, and since @ptr never escaped
- * from that, it proved that the inline asm wasn't touching any of
- * it. This version works well with both compilers, i.e. we're telling
- * the compiler that the inline asm absolutely may see the contents
- * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
- */
-#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
-
 /*
  * This macro obfuscates arithmetic on a variable address so that gcc
  * shouldn't recognize the original var, and make assumptions about it.
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 92ef163a7479..dfba70b2644f 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -80,11 +80,25 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 
 /* Optimization barrier */
 #ifndef barrier
-# define barrier() __memory_barrier()
+/* The "volatile" is due to gcc bugs */
+# define barrier() __asm__ __volatile__("": : :"memory")
 #endif
 
 #ifndef barrier_data
-# define barrier_data(ptr) barrier()
+/*
+ * This version is i.e. to prevent dead stores elimination on @ptr
+ * where gcc and llvm may behave differently when otherwise using
+ * normal barrier(): while gcc behavior gets along with a normal
+ * barrier(), llvm needs an explicit input variable to be assumed
+ * clobbered. The issue is as follows: while the inline asm might
+ * access any memory it wants, the compiler could have fit all of
+ * @ptr into memory registers instead, and since @ptr never escaped
+ * from that, it proved that the inline asm wasn't touching any of
+ * it. This version works well with both compilers, i.e. we're telling
+ * the compiler that the inline asm absolutely may see the contents
+ * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
+ */
+# define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
 #endif
 
 /* workaround for GCC PR82365 if needed */
-- 
2.26.2


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-10-14 21:26 [PATCH] compiler.h: Fix barrier_data() on clang Arvind Sankar
@ 2020-10-14 22:51 ` Nick Desaulniers
  2020-10-15  8:50 ` David Laight
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 28+ messages in thread
From: Nick Desaulniers @ 2020-10-14 22:51 UTC (permalink / raw)
  To: Arvind Sankar, Andrew Morton
  Cc: Nathan Chancellor, clang-built-linux, LKML, Kees Cook

On Wed, Oct 14, 2020 at 2:26 PM Arvind Sankar <nivedita@alum.mit.edu> wrote:
>
> Commit
>   815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
>
> neglected to copy barrier_data() from compiler-gcc.h into
> compiler-clang.h. The definition in compiler-gcc.h was really to work
> around clang's more aggressive optimization, so this broke
> barrier_data() on clang, and consequently memzero_explicit() as well.
>
> For example, this results in at least the memzero_explicit() call in
> lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
>
> Fix this by moving the definition of barrier_data() into compiler.h.
>
> Also move the gcc/clang definition of barrier() into compiler.h,
> __memory_barrier() is icc-specific (and barrier() is already defined
> using it in compiler-intel.h) and doesn't belong in compiler.h.
>
> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
> Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")

Thanks for the patch! Curious how you spotted this? My mistake for
missing it.  Definite difference in the disassembly before/after.

Cc: stable@vger.kernel.org
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>

akpm@ would you mind picking this up when you have a chance?

See also:
commit 7829fb09a2b4 ("lib: make memzero_explicit more robust against
dead store elimination")

I'm pretty sure `man 3 explicit_bzero` was created in libc for this
exact problem, though the manual page is an interesting read.

> ---
>  include/linux/compiler-clang.h |  6 ------
>  include/linux/compiler-gcc.h   | 19 -------------------
>  include/linux/compiler.h       | 18 ++++++++++++++++--
>  3 files changed, 16 insertions(+), 27 deletions(-)
>
> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> index cee0c728d39a..04c0a5a717f7 100644
> --- a/include/linux/compiler-clang.h
> +++ b/include/linux/compiler-clang.h
> @@ -52,12 +52,6 @@
>  #define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
>  #endif
>
> -/* The following are for compatibility with GCC, from compiler-gcc.h,
> - * and may be redefined here because they should not be shared with other
> - * compilers, like ICC.
> - */
> -#define barrier() __asm__ __volatile__("" : : : "memory")
> -
>  #if __has_feature(shadow_call_stack)
>  # define __noscs       __attribute__((__no_sanitize__("shadow-call-stack")))
>  #endif
> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> index 7a3769040d7d..fda30ffb037b 100644
> --- a/include/linux/compiler-gcc.h
> +++ b/include/linux/compiler-gcc.h
> @@ -15,25 +15,6 @@
>  # error Sorry, your compiler is too old - please upgrade it.
>  #endif
>
> -/* Optimization barrier */
> -
> -/* The "volatile" is due to gcc bugs */
> -#define barrier() __asm__ __volatile__("": : :"memory")
> -/*
> - * This version is i.e. to prevent dead stores elimination on @ptr
> - * where gcc and llvm may behave differently when otherwise using
> - * normal barrier(): while gcc behavior gets along with a normal
> - * barrier(), llvm needs an explicit input variable to be assumed
> - * clobbered. The issue is as follows: while the inline asm might
> - * access any memory it wants, the compiler could have fit all of
> - * @ptr into memory registers instead, and since @ptr never escaped
> - * from that, it proved that the inline asm wasn't touching any of
> - * it. This version works well with both compilers, i.e. we're telling
> - * the compiler that the inline asm absolutely may see the contents
> - * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
> - */
> -#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
> -
>  /*
>   * This macro obfuscates arithmetic on a variable address so that gcc
>   * shouldn't recognize the original var, and make assumptions about it.
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 92ef163a7479..dfba70b2644f 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -80,11 +80,25 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>
>  /* Optimization barrier */
>  #ifndef barrier
> -# define barrier() __memory_barrier()
> +/* The "volatile" is due to gcc bugs */
> +# define barrier() __asm__ __volatile__("": : :"memory")
>  #endif
>
>  #ifndef barrier_data
> -# define barrier_data(ptr) barrier()
> +/*
> + * This version is i.e. to prevent dead stores elimination on @ptr
> + * where gcc and llvm may behave differently when otherwise using
> + * normal barrier(): while gcc behavior gets along with a normal
> + * barrier(), llvm needs an explicit input variable to be assumed
> + * clobbered. The issue is as follows: while the inline asm might
> + * access any memory it wants, the compiler could have fit all of
> + * @ptr into memory registers instead, and since @ptr never escaped
> + * from that, it proved that the inline asm wasn't touching any of
> + * it. This version works well with both compilers, i.e. we're telling
> + * the compiler that the inline asm absolutely may see the contents
> + * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
> + */
> +# define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
>  #endif
>
>  /* workaround for GCC PR82365 if needed */
> --
> 2.26.2
>


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-10-14 21:26 [PATCH] compiler.h: Fix barrier_data() on clang Arvind Sankar
  2020-10-14 22:51 ` Nick Desaulniers
@ 2020-10-15  8:50 ` David Laight
  2020-10-15 14:45   ` Arvind Sankar
  2020-10-21 19:46 ` [PATCH] compiler.h: Fix barrier_data() on clang Kees Cook
  2020-11-16 17:47   ` Andreas Schwab
  3 siblings, 1 reply; 28+ messages in thread
From: David Laight @ 2020-10-15  8:50 UTC (permalink / raw)
  To: 'Arvind Sankar',
	Nathan Chancellor, Nick Desaulniers, clang-built-linux
  Cc: linux-kernel

From: Arvind Sankar
> Sent: 14 October 2020 22:27
...
> +/*
> + * This version is i.e. to prevent dead stores elimination on @ptr
> + * where gcc and llvm may behave differently when otherwise using
> + * normal barrier(): while gcc behavior gets along with a normal
> + * barrier(), llvm needs an explicit input variable to be assumed
> + * clobbered. The issue is as follows: while the inline asm might
> + * access any memory it wants, the compiler could have fit all of
> + * @ptr into memory registers instead, and since @ptr never escaped
> + * from that, it proved that the inline asm wasn't touching any of
> + * it. This version works well with both compilers, i.e. we're telling
> + * the compiler that the inline asm absolutely may see the contents
> + * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
> + */
> +# define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")

That comment doesn't actually match the asm statement.
Although the asm statement probably has the desired effect.

The "r"(ptr) constraint only passes the address of the buffer
into the asm - it doesn't say anything at all about the associated
memory.

What the "r"(ptr) actually does is to force the address of the
associated data to be taken.
This means that on-stack space must actually be allocated.
The "memory" clobber will then force the registers caching
the variable be written out to stack.

If you only want to force stores on a single data structure
you actually want:
#define barrier_data(ptr) asm volatile("" :: "m"(*ptr))
although it would be best then to add an explicit size
and associated cast.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-10-15  8:50 ` David Laight
@ 2020-10-15 14:45   ` Arvind Sankar
  2020-10-15 15:24     ` David Laight
  0 siblings, 1 reply; 28+ messages in thread
From: Arvind Sankar @ 2020-10-15 14:45 UTC (permalink / raw)
  To: David Laight
  Cc: 'Arvind Sankar',
	Nathan Chancellor, Nick Desaulniers, clang-built-linux,
	linux-kernel

On Thu, Oct 15, 2020 at 08:50:05AM +0000, David Laight wrote:
> From: Arvind Sankar
> > Sent: 14 October 2020 22:27
> ...
> > +/*
> > + * This version is i.e. to prevent dead stores elimination on @ptr
> > + * where gcc and llvm may behave differently when otherwise using
> > + * normal barrier(): while gcc behavior gets along with a normal
> > + * barrier(), llvm needs an explicit input variable to be assumed
> > + * clobbered. The issue is as follows: while the inline asm might
> > + * access any memory it wants, the compiler could have fit all of
> > + * @ptr into memory registers instead, and since @ptr never escaped
> > + * from that, it proved that the inline asm wasn't touching any of
> > + * it. This version works well with both compilers, i.e. we're telling
> > + * the compiler that the inline asm absolutely may see the contents
> > + * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
> > + */
> > +# define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
> 
> That comment doesn't actually match the asm statement.
> Although the asm statement probably has the desired effect.
> 
> The "r"(ptr) constraint only passes the address of the buffer
> into the asm - it doesn't say anything at all about the associated
> memory.
> 
> What the "r"(ptr) actually does is to force the address of the
> associated data to be taken.
> This means that on-stack space must actually be allocated.
> The "memory" clobber will then force the registers caching
> the variable be written out to stack.
> 

I think the comment is unclear now that you bring it up, but the problem
it actually addresses is not that the data is held in registers: in the
sha256_transform() case mentioned in the commit message, for example,
the data is in fact in memory even before this change (it's a 256-byte
array), and that together with the memory clobber is enough for gcc to
assume that the asm might use it. But with clang, if the address of that
data has never escaped -- in this case the data is a local variable
whose address was never passed out of the function -- the compiler
assumes that the asm cannot possibly depend on that memory, because it
has no way of getting its address.

Passing ptr to the inline asm tells clang that the asm knows the
address, and since it also has a memory clobber, that it may use the
data. This is somewhat suboptimal, since if the data was some small
structure that the compiler was just holding in registers originally,
forcing it out to memory is a bad thing to do.

> If you only want to force stores on a single data structure
> you actually want:
> #define barrier_data(ptr) asm volatile("" :: "m"(*ptr))
> although it would be best then to add an explicit size
> and associated cast.
> 

i.e. something like:
	static inline void barrier_data(void *ptr, size_t size)
	{
		asm volatile("" : "+m"(*(char (*)[size])ptr));
	}
plus some magic to disable the VLA warning, otherwise it causes a build
error.

But I think that might lead to even more subtle issues by dropping the
memory clobber. For example (and this is actually done in
sha256_transform() as well, though the zero'ing simply doesn't work with
any compiler, as it's missing the barrier_data()'s):

	unsigned long x, y;
	... do something secret with x/y ...
	x = y = 0;
	barrier_data(&x, sizeof(x));
	barrier_data(&y, sizeof(y));
	return;

Since x is not used after its barrier_data(), I think the compiler would
be within its rights to turn that into:

	xorl	%eax, %eax
	leaq	-16(%rbp), %rdx	// &x == -16(%rbp)
	movq	%eax, (%rdx)	// x = 0;
	// inline asm for barrier_data(&x, sizeof(x));
	movq	%eax, (%rdx)	// y = 0; (!)
	// inline asm for barrier_data(&y, sizeof(y));

which saves one instruction by putting y at the same location as x, once
x is dead.

With a memory clobber, the compiler has to keep x and y at different
addresses, since the first barrier_data() might have saved the address
of x.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-10-15 14:45   ` Arvind Sankar
@ 2020-10-15 15:24     ` David Laight
  2020-10-15 15:39       ` Arvind Sankar
  0 siblings, 1 reply; 28+ messages in thread
From: David Laight @ 2020-10-15 15:24 UTC (permalink / raw)
  To: 'Arvind Sankar'
  Cc: Nathan Chancellor, Nick Desaulniers, clang-built-linux, linux-kernel

From: Arvind Sankar
> Sent: 15 October 2020 15:45
> 
> On Thu, Oct 15, 2020 at 08:50:05AM +0000, David Laight wrote:
> > From: Arvind Sankar
> > > Sent: 14 October 2020 22:27
> > ...
> > > +/*
> > > + * This version is i.e. to prevent dead stores elimination on @ptr
> > > + * where gcc and llvm may behave differently when otherwise using
> > > + * normal barrier(): while gcc behavior gets along with a normal
> > > + * barrier(), llvm needs an explicit input variable to be assumed
> > > + * clobbered. The issue is as follows: while the inline asm might
> > > + * access any memory it wants, the compiler could have fit all of
> > > + * @ptr into memory registers instead, and since @ptr never escaped
> > > + * from that, it proved that the inline asm wasn't touching any of
> > > + * it. This version works well with both compilers, i.e. we're telling
> > > + * the compiler that the inline asm absolutely may see the contents
> > > + * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
> > > + */
> > > +# define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
> >
> > That comment doesn't actually match the asm statement.
> > Although the asm statement probably has the desired effect.
> >
> > The "r"(ptr) constraint only passes the address of the buffer
> > into the asm - it doesn't say anything at all about the associated
> > memory.
> >
> > What the "r"(ptr) actually does is to force the address of the
> > associated data to be taken.
> > This means that on-stack space must actually be allocated.
> > The "memory" clobber will then force the registers caching
> > the variable be written out to stack.
> >
> 
> I think the comment is unclear now that you bring it up, but the problem
> it actually addresses is not that the data is held in registers: in the
> sha256_transform() case mentioned in the commit message, for example,
> the data is in fact in memory even before this change (it's a 256-byte
> array), and that together with the memory clobber is enough for gcc to
> assume that the asm might use it. But with clang, if the address of that
> data has never escaped -- in this case the data is a local variable
> whose address was never passed out of the function -- the compiler
> assumes that the asm cannot possibly depend on that memory, because it
> has no way of getting its address.

Ok, slightly different from what i thought.
But the current comment is just wrong.

> Passing ptr to the inline asm tells clang that the asm knows the
> address, and since it also has a memory clobber, that it may use the
> data. This is somewhat suboptimal, since if the data was some small
> structure that the compiler was just holding in registers originally,
> forcing it out to memory is a bad thing to do.
> 
> > If you only want to force stores on a single data structure
> > you actually want:
> > #define barrier_data(ptr) asm volatile("" :: "m"(*ptr))
> > although it would be best then to add an explicit size
> > and associated cast.
> >
> 
> i.e. something like:
> 	static inline void barrier_data(void *ptr, size_t size)
> 	{
> 		asm volatile("" : "+m"(*(char (*)[size])ptr));

I think it has to be a struct with an array member?

> 	}
> plus some magic to disable the VLA warning, otherwise it causes a build
> error.

It shouldn't if the size is a compile time constant.
And given this is an instruction to the compiler it better be.

> But I think that might lead to even more subtle issues by dropping the
> memory clobber. For example (and this is actually done in
> sha256_transform() as well, though the zero'ing simply doesn't work with
> any compiler, as it's missing the barrier_data()'s):
> 
> 	unsigned long x, y;
> 	... do something secret with x/y ...
> 	x = y = 0;
> 	barrier_data(&x, sizeof(x));
> 	barrier_data(&y, sizeof(y));
> 	return;
> 
> Since x is not used after its barrier_data(), I think the compiler would
> be within its rights to turn that into:
> 
> 	xorl	%eax, %eax
> 	leaq	-16(%rbp), %rdx	// &x == -16(%rbp)
> 	movq	%eax, (%rdx)	// x = 0;
> 	// inline asm for barrier_data(&x, sizeof(x));
> 	movq	%eax, (%rdx)	// y = 0; (!)
> 	// inline asm for barrier_data(&y, sizeof(y));
> 
> which saves one instruction by putting y at the same location as x, once
> x is dead.
> 
> With a memory clobber, the compiler has to keep x and y at different
> addresses, since the first barrier_data() might have saved the address
> of x.

Maybe "+m"(*ptr) : "r"(ptr) would work.
OTOH a "memory" clobber at the bottom of a function isn't going to
cause bloat.

The explicit ranged memory access (without "memory") probably has its
uses - but only if the full "memory" clobber causes grief.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-10-15 15:24     ` David Laight
@ 2020-10-15 15:39       ` Arvind Sankar
  2020-10-15 17:39         ` Nick Desaulniers
  0 siblings, 1 reply; 28+ messages in thread
From: Arvind Sankar @ 2020-10-15 15:39 UTC (permalink / raw)
  To: David Laight
  Cc: 'Arvind Sankar',
	Nathan Chancellor, Nick Desaulniers, clang-built-linux,
	linux-kernel

On Thu, Oct 15, 2020 at 03:24:09PM +0000, David Laight wrote:
> > I think the comment is unclear now that you bring it up, but the problem
> > it actually addresses is not that the data is held in registers: in the
> > sha256_transform() case mentioned in the commit message, for example,
> > the data is in fact in memory even before this change (it's a 256-byte
> > array), and that together with the memory clobber is enough for gcc to
> > assume that the asm might use it. But with clang, if the address of that
> > data has never escaped -- in this case the data is a local variable
> > whose address was never passed out of the function -- the compiler
> > assumes that the asm cannot possibly depend on that memory, because it
> > has no way of getting its address.
> 
> Ok, slightly different from what i thought.
> But the current comment is just wrong.

Should I fix up the comment in the same commit, or do a second one after
moving the macro?

> > i.e. something like:
> > 	static inline void barrier_data(void *ptr, size_t size)
> > 	{
> > 		asm volatile("" : "+m"(*(char (*)[size])ptr));
> 
> I think it has to be a struct with an array member?

I don't think so, this is actually an example in gcc's documentation:

  An x86 example where the string memory argument is of unknown length.

	asm("repne scasb"
	: "=c" (count), "+D" (p)
        : "m" (*(const char (*)[]) p), "0" (-1), "a" (0));

  If you know the above will only be reading a ten byte array then you
  could instead use a memory input like: "m" (*(const char (*)[10]) p).

> 
> > 	}
> > plus some magic to disable the VLA warning, otherwise it causes a build
> > error.
> 
> It shouldn't if the size is a compile time constant.
> And given this is an instruction to the compiler it better be.

Ah right. I saw the warning when playing with something else where size
could be constant or variable depending on the call.

> > 
> > With a memory clobber, the compiler has to keep x and y at different
> > addresses, since the first barrier_data() might have saved the address
> > of x.
> 
> Maybe "+m"(*ptr) : "r"(ptr) would work.

Nothing that can only modify what ptr points to could avoid this, since
that storage is dead after the barrier.

> OTOH a "memory" clobber at the bottom of a function isn't going to
> cause bloat.
> 
> The explicit ranged memory access (without "memory") probably has its
> uses - but only if the full "memory" clobber causes grief.
> 
> 	David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-10-15 15:39       ` Arvind Sankar
@ 2020-10-15 17:39         ` Nick Desaulniers
  2020-10-15 18:13           ` [PATCH] compiler.h: Clarify comment about the need for barrier_data() Arvind Sankar
  0 siblings, 1 reply; 28+ messages in thread
From: Nick Desaulniers @ 2020-10-15 17:39 UTC (permalink / raw)
  To: Arvind Sankar
  Cc: David Laight, Nathan Chancellor, clang-built-linux, linux-kernel

On Thu, Oct 15, 2020 at 8:39 AM Arvind Sankar <nivedita@alum.mit.edu> wrote:
>
> On Thu, Oct 15, 2020 at 03:24:09PM +0000, David Laight wrote:
> > > I think the comment is unclear now that you bring it up, but the problem
> > > it actually addresses is not that the data is held in registers: in the
> > > sha256_transform() case mentioned in the commit message, for example,
> > > the data is in fact in memory even before this change (it's a 256-byte
> > > array), and that together with the memory clobber is enough for gcc to
> > > assume that the asm might use it. But with clang, if the address of that
> > > data has never escaped -- in this case the data is a local variable
> > > whose address was never passed out of the function -- the compiler
> > > assumes that the asm cannot possibly depend on that memory, because it
> > > has no way of getting its address.
> >
> > Ok, slightly different from what i thought.
> > But the current comment is just wrong.
>
> Should I fix up the comment in the same commit, or do a second one after
> moving the macro?

I would prefer undoing the mistake from 815f0ddb346c and getting that
backported to stable, then rewriting comments or the trick to retain
the memset separately.

>
> > > i.e. something like:
> > >     static inline void barrier_data(void *ptr, size_t size)
> > >     {
> > >             asm volatile("" : "+m"(*(char (*)[size])ptr));
> >
> > I think it has to be a struct with an array member?
>
> I don't think so, this is actually an example in gcc's documentation:
>
>   An x86 example where the string memory argument is of unknown length.
>
>         asm("repne scasb"
>         : "=c" (count), "+D" (p)
>         : "m" (*(const char (*)[]) p), "0" (-1), "a" (0));
>
>   If you know the above will only be reading a ten byte array then you
>   could instead use a memory input like: "m" (*(const char (*)[10]) p).
>
> >
> > >     }
> > > plus some magic to disable the VLA warning, otherwise it causes a build
> > > error.
> >
> > It shouldn't if the size is a compile time constant.
> > And given this is an instruction to the compiler it better be.
>
> Ah right. I saw the warning when playing with something else where size
> could be constant or variable depending on the call.
>
> > >
> > > With a memory clobber, the compiler has to keep x and y at different
> > > addresses, since the first barrier_data() might have saved the address
> > > of x.
> >
> > Maybe "+m"(*ptr) : "r"(ptr) would work.
>
> Nothing that can only modify what ptr points to could avoid this, since
> that storage is dead after the barrier.
>
> > OTOH a "memory" clobber at the bottom of a function isn't going to
> > cause bloat.
> >
> > The explicit ranged memory access (without "memory") probably has its
> > uses - but only if the full "memory" clobber causes grief.
> >
> >       David
> >
> > -
> > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> > Registration No: 1397386 (Wales)



-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] compiler.h: Clarify comment about the need for barrier_data()
  2020-10-15 17:39         ` Nick Desaulniers
@ 2020-10-15 18:13           ` Arvind Sankar
  2020-10-15 18:25             ` Nick Desaulniers
  2020-10-15 21:09             ` David Laight
  0 siblings, 2 replies; 28+ messages in thread
From: Arvind Sankar @ 2020-10-15 18:13 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Andrew Morton, Kees Cook, David Laight, Nathan Chancellor,
	clang-built-linux, linux-kernel

Be clear about @ptr vs the variable that @ptr points to, and add some
more details as to why the special barrier_data() macro is required.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
---
 include/linux/compiler.h | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 93035d7fee0d..d8cee7c8968d 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -86,17 +86,28 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 
 #ifndef barrier_data
 /*
- * This version is i.e. to prevent dead stores elimination on @ptr
- * where gcc and llvm may behave differently when otherwise using
- * normal barrier(): while gcc behavior gets along with a normal
- * barrier(), llvm needs an explicit input variable to be assumed
- * clobbered. The issue is as follows: while the inline asm might
- * access any memory it wants, the compiler could have fit all of
- * @ptr into memory registers instead, and since @ptr never escaped
- * from that, it proved that the inline asm wasn't touching any of
- * it. This version works well with both compilers, i.e. we're telling
- * the compiler that the inline asm absolutely may see the contents
- * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
+ * This version is to prevent dead stores elimination on @ptr where gcc and
+ * llvm may behave differently when otherwise using normal barrier(): while gcc
+ * behavior gets along with a normal barrier(), llvm needs an explicit input
+ * variable to be assumed clobbered.
+ *
+ * Its primary use is in implementing memzero_explicit(), which is used for
+ * clearing temporary data that may contain secrets.
+ *
+ * The issue is as follows: while the inline asm might access any memory it
+ * wants, the compiler could have fit all of the variable that @ptr points to
+ * into registers instead, and if @ptr never escaped from the function, it
+ * proved that the inline asm wasn't touching any of it. gcc only eliminates
+ * dead stores if the variable was actually allocated in registers, but llvm
+ * reasons that the variable _could_ have been in registers, so the inline asm
+ * can't reliably access it anyway, and eliminates dead stores even if the
+ * variable is actually in memory.
+ *
+ * This version works well with both compilers, i.e. we're telling the compiler
+ * that the inline asm absolutely may see the contents of the variable pointed
+ * to by @ptr.
+ *
+ * See also: https://llvm.org/bugs/show_bug.cgi?id=15495#c5
  */
 # define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
 #endif
-- 
2.26.2


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Clarify comment about the need for barrier_data()
  2020-10-15 18:13           ` [PATCH] compiler.h: Clarify comment about the need for barrier_data() Arvind Sankar
@ 2020-10-15 18:25             ` Nick Desaulniers
  2020-10-15 21:09             ` David Laight
  1 sibling, 0 replies; 28+ messages in thread
From: Nick Desaulniers @ 2020-10-15 18:25 UTC (permalink / raw)
  To: Arvind Sankar
  Cc: Andrew Morton, Kees Cook, David Laight, Nathan Chancellor,
	clang-built-linux, LKML

On Thu, Oct 15, 2020 at 11:13 AM Arvind Sankar <nivedita@alum.mit.edu> wrote:
>
> Be clear about @ptr vs the variable that @ptr points to, and add some
> more details as to why the special barrier_data() macro is required.
>
> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>

Thanks for this distinct cleanup.
Acked-by: Nick Desaulniers <ndesaulniers@google.com>

> ---
>  include/linux/compiler.h | 33 ++++++++++++++++++++++-----------
>  1 file changed, 22 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 93035d7fee0d..d8cee7c8968d 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -86,17 +86,28 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>
>  #ifndef barrier_data
>  /*
> - * This version is i.e. to prevent dead stores elimination on @ptr
> - * where gcc and llvm may behave differently when otherwise using
> - * normal barrier(): while gcc behavior gets along with a normal
> - * barrier(), llvm needs an explicit input variable to be assumed
> - * clobbered. The issue is as follows: while the inline asm might
> - * access any memory it wants, the compiler could have fit all of
> - * @ptr into memory registers instead, and since @ptr never escaped
> - * from that, it proved that the inline asm wasn't touching any of
> - * it. This version works well with both compilers, i.e. we're telling
> - * the compiler that the inline asm absolutely may see the contents
> - * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
> + * This version is to prevent dead stores elimination on @ptr where gcc and
> + * llvm may behave differently when otherwise using normal barrier(): while gcc
> + * behavior gets along with a normal barrier(), llvm needs an explicit input
> + * variable to be assumed clobbered.
> + *
> + * Its primary use is in implementing memzero_explicit(), which is used for
> + * clearing temporary data that may contain secrets.
> + *
> + * The issue is as follows: while the inline asm might access any memory it
> + * wants, the compiler could have fit all of the variable that @ptr points to
> + * into registers instead, and if @ptr never escaped from the function, it
> + * proved that the inline asm wasn't touching any of it. gcc only eliminates
> + * dead stores if the variable was actually allocated in registers, but llvm
> + * reasons that the variable _could_ have been in registers, so the inline asm
> + * can't reliably access it anyway, and eliminates dead stores even if the
> + * variable is actually in memory.
> + *
> + * This version works well with both compilers, i.e. we're telling the compiler
> + * that the inline asm absolutely may see the contents of the variable pointed
> + * to by @ptr.
> + *
> + * See also: https://llvm.org/bugs/show_bug.cgi?id=15495#c5
>   */
>  # define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
>  #endif
> --
> 2.26.2
>


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH] compiler.h: Clarify comment about the need for barrier_data()
  2020-10-15 18:13           ` [PATCH] compiler.h: Clarify comment about the need for barrier_data() Arvind Sankar
  2020-10-15 18:25             ` Nick Desaulniers
@ 2020-10-15 21:09             ` David Laight
  2020-10-15 22:01               ` Arvind Sankar
  1 sibling, 1 reply; 28+ messages in thread
From: David Laight @ 2020-10-15 21:09 UTC (permalink / raw)
  To: 'Arvind Sankar', Nick Desaulniers
  Cc: Andrew Morton, Kees Cook, Nathan Chancellor, clang-built-linux,
	linux-kernel

From: Arvind Sankar
> Sent: 15 October 2020 19:14
> 
> Be clear about @ptr vs the variable that @ptr points to, and add some
> more details as to why the special barrier_data() macro is required.
> 
> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
> ---
>  include/linux/compiler.h | 33 ++++++++++++++++++++++-----------
>  1 file changed, 22 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 93035d7fee0d..d8cee7c8968d 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -86,17 +86,28 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> 
>  #ifndef barrier_data
>  /*
> - * This version is i.e. to prevent dead stores elimination on @ptr
> - * where gcc and llvm may behave differently when otherwise using
> - * normal barrier(): while gcc behavior gets along with a normal
> - * barrier(), llvm needs an explicit input variable to be assumed
> - * clobbered. The issue is as follows: while the inline asm might
> - * access any memory it wants, the compiler could have fit all of
> - * @ptr into memory registers instead, and since @ptr never escaped
> - * from that, it proved that the inline asm wasn't touching any of
> - * it. This version works well with both compilers, i.e. we're telling
> - * the compiler that the inline asm absolutely may see the contents
> - * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
> + * This version is to prevent dead stores elimination on @ptr where gcc and
> + * llvm may behave differently when otherwise using normal barrier(): while gcc
> + * behavior gets along with a normal barrier(), llvm needs an explicit input
> + * variable to be assumed clobbered.
> + *
> + * Its primary use is in implementing memzero_explicit(), which is used for
> + * clearing temporary data that may contain secrets.
> + *
> + * The issue is as follows: while the inline asm might access any memory it
> + * wants, the compiler could have fit all of the variable that @ptr points to
> + * into registers instead, and if @ptr never escaped from the function, it
> + * proved that the inline asm wasn't touching any of it. gcc only eliminates
> + * dead stores if the variable was actually allocated in registers, but llvm
> + * reasons that the variable _could_ have been in registers, so the inline asm
> + * can't reliably access it anyway, and eliminates dead stores even if the
> + * variable is actually in memory.

I think I'd just say something like:

Although the compiler must assume a "memory" clobber may affect all
memory, local variables (on stack) cannot actually be visible to the
asm unless their address has been passed to an external function.
So the compiler may assume such variables cannot be affected by
a normal asm volatile(::"memory") barrier().
Passing the address of the local variables to the asm barrier
is enough to tell the compiler that the asm can 'see' the variables
(and spill anything held in registers to the stack) so that
the "memory" clobber has the expected effect.

This is necessary to get llvm to do a memset() of on-stack data
at the end of a function to clear memory that contains secrets.

	David

> + *
> + * This version works well with both compilers, i.e. we're telling the compiler
> + * that the inline asm absolutely may see the contents of the variable pointed
> + * to by @ptr.
> + *
> + * See also: https://llvm.org/bugs/show_bug.cgi?id=15495#c5
>   */
>  # define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
>  #endif
> --
> 2.26.2

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Clarify comment about the need for barrier_data()
  2020-10-15 21:09             ` David Laight
@ 2020-10-15 22:01               ` Arvind Sankar
  2020-10-16  8:13                 ` David Laight
  0 siblings, 1 reply; 28+ messages in thread
From: Arvind Sankar @ 2020-10-15 22:01 UTC (permalink / raw)
  To: David Laight
  Cc: 'Arvind Sankar',
	Nick Desaulniers, Andrew Morton, Kees Cook, Nathan Chancellor,
	clang-built-linux, linux-kernel

On Thu, Oct 15, 2020 at 09:09:11PM +0000, David Laight wrote:
> From: Arvind Sankar
> > Sent: 15 October 2020 19:14
> > 
> > Be clear about @ptr vs the variable that @ptr points to, and add some
> > more details as to why the special barrier_data() macro is required.
> > 
> > Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
> > ---
> >  include/linux/compiler.h | 33 ++++++++++++++++++++++-----------
> >  1 file changed, 22 insertions(+), 11 deletions(-)
> > 
> > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > index 93035d7fee0d..d8cee7c8968d 100644
> > --- a/include/linux/compiler.h
> > +++ b/include/linux/compiler.h
> > @@ -86,17 +86,28 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> > 
> >  #ifndef barrier_data
> >  /*
> > - * This version is i.e. to prevent dead stores elimination on @ptr
> > - * where gcc and llvm may behave differently when otherwise using
> > - * normal barrier(): while gcc behavior gets along with a normal
> > - * barrier(), llvm needs an explicit input variable to be assumed
> > - * clobbered. The issue is as follows: while the inline asm might
> > - * access any memory it wants, the compiler could have fit all of
> > - * @ptr into memory registers instead, and since @ptr never escaped
> > - * from that, it proved that the inline asm wasn't touching any of
> > - * it. This version works well with both compilers, i.e. we're telling
> > - * the compiler that the inline asm absolutely may see the contents
> > - * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
> > + * This version is to prevent dead stores elimination on @ptr where gcc and
> > + * llvm may behave differently when otherwise using normal barrier(): while gcc
> > + * behavior gets along with a normal barrier(), llvm needs an explicit input
> > + * variable to be assumed clobbered.
> > + *
> > + * Its primary use is in implementing memzero_explicit(), which is used for
> > + * clearing temporary data that may contain secrets.
> > + *
> > + * The issue is as follows: while the inline asm might access any memory it
> > + * wants, the compiler could have fit all of the variable that @ptr points to
> > + * into registers instead, and if @ptr never escaped from the function, it
> > + * proved that the inline asm wasn't touching any of it. gcc only eliminates
> > + * dead stores if the variable was actually allocated in registers, but llvm
> > + * reasons that the variable _could_ have been in registers, so the inline asm
> > + * can't reliably access it anyway, and eliminates dead stores even if the
> > + * variable is actually in memory.
> 
> I think I'd just say something like:
> 
> Although the compiler must assume a "memory" clobber may affect all
> memory, local variables (on stack) cannot actually be visible to the
> asm unless their address has been passed to an external function.
> So the compiler may assume such variables cannot be affected by
> a normal asm volatile(::"memory") barrier().
> Passing the address of the local variables to the asm barrier
> is enough to tell the compiler that the asm can 'see' the variables
> (and spill anything held in registers to the stack) so that
> the "memory" clobber has the expected effect.
> 
> This is necessary to get llvm to do a memset() of on-stack data
> at the end of a function to clear memory that contains secrets.
> 
> 	David

I think it's helpful to have the more detailed explanation about
register variables -- at first glance, it's a bit mystifying as to why
the compiler would think that the asm can't access the stack. Spilling
registers to the stack is actually an undesirable side-effect of the
workaround.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH] compiler.h: Clarify comment about the need for barrier_data()
  2020-10-15 22:01               ` Arvind Sankar
@ 2020-10-16  8:13                 ` David Laight
  2020-10-16 13:09                   ` Arvind Sankar
  0 siblings, 1 reply; 28+ messages in thread
From: David Laight @ 2020-10-16  8:13 UTC (permalink / raw)
  To: 'Arvind Sankar'
  Cc: Nick Desaulniers, Andrew Morton, Kees Cook, Nathan Chancellor,
	clang-built-linux, linux-kernel

From: Arvind Sankar
> Sent: 15 October 2020 23:01
,,,
> I think it's helpful to have the more detailed explanation about
> register variables -- at first glance, it's a bit mystifying as to why
> the compiler would think that the asm can't access the stack. Spilling
> registers to the stack is actually an undesirable side-effect of the
> workaround.

That is the very bit that just confuses things.
The data the memzero_explictit() is trying to clear is (probably)
on-stack already - it won't be in registers.

If it were in registers you wouldn't need the memset().

Actually I suspect that the memset() is inlined so that is
just assigns zeros to all the variables.
This will be done using 'virtual registers' that cache the
on-stack value.
You then need to do something to force the instructions to flush
the 'virtual registers' back to stack to be generated.

The fundamental thing is that the address of a local (auto!)
variable must be visible to the asm statement for the compiler
to make the contents of those variables visible.

I even suspect you may need to pass the address of the structure
(to be zeroed) to an asm block at the top of the function as well.
Otherwise the compiler could change the stack offsets where the
structure is stored.
But I don't think compilers do that.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Clarify comment about the need for barrier_data()
  2020-10-16  8:13                 ` David Laight
@ 2020-10-16 13:09                   ` Arvind Sankar
  0 siblings, 0 replies; 28+ messages in thread
From: Arvind Sankar @ 2020-10-16 13:09 UTC (permalink / raw)
  To: David Laight
  Cc: 'Arvind Sankar',
	Nick Desaulniers, Andrew Morton, Kees Cook, Nathan Chancellor,
	clang-built-linux, linux-kernel

On Fri, Oct 16, 2020 at 08:13:44AM +0000, David Laight wrote:
> From: Arvind Sankar
> > Sent: 15 October 2020 23:01
> ,,,
> > I think it's helpful to have the more detailed explanation about
> > register variables -- at first glance, it's a bit mystifying as to why
> > the compiler would think that the asm can't access the stack. Spilling
> > registers to the stack is actually an undesirable side-effect of the
> > workaround.
> 
> That is the very bit that just confuses things.
> The data the memzero_explictit() is trying to clear is (probably)
> on-stack already - it won't be in registers.
> 

Are you saying the explanation is confusing things?

What I think is confusing is the fact that the compiler believes that an
asm with a memory clobber cannot access a variable that happens to be in
memory, and the comment is explaining how the compiler came to that
conclusion. The comment is already saying that this applies to LLVM
(unlike GCC) even if the variable isn't actually in registers.

> If it were in registers you wouldn't need the memset().

There's obviously no guarantee of where the compiler decided to keep the
variables. This isn't so clear-cut: for SHA, there is a 256-byte array
that you can be pretty sure will be in memory, but there are also 10 u32
variables which may or may not be in registers depending on how many
registers the arch has and how clever the compiler was in allocating
them.

> 
> Actually I suspect that the memset() is inlined so that is
> just assigns zeros to all the variables.
> This will be done using 'virtual registers' that cache the
> on-stack value.
> You then need to do something to force the instructions to flush
> the 'virtual registers' back to stack to be generated.

This is definitely getting too much into the weeds. What the compiler
knows is that memset does nothing other than writing to the variable,
and if the variable is never used after that, then the memset can be
eliminated.  Whether the memset ends up getting inlined or not is not
relevant here: clang doesn't inline it, for eg.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-10-14 21:26 [PATCH] compiler.h: Fix barrier_data() on clang Arvind Sankar
  2020-10-14 22:51 ` Nick Desaulniers
  2020-10-15  8:50 ` David Laight
@ 2020-10-21 19:46 ` Kees Cook
  2020-11-16 17:47   ` Andreas Schwab
  3 siblings, 0 replies; 28+ messages in thread
From: Kees Cook @ 2020-10-21 19:46 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds, Arvind Sankar
  Cc: Nathan Chancellor, Nick Desaulniers, clang-built-linux, linux-kernel

On Wed, Oct 14, 2020 at 05:26:31PM -0400, Arvind Sankar wrote:
> Commit
>   815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
> 
> neglected to copy barrier_data() from compiler-gcc.h into
> compiler-clang.h. The definition in compiler-gcc.h was really to work
> around clang's more aggressive optimization, so this broke
> barrier_data() on clang, and consequently memzero_explicit() as well.
> 
> For example, this results in at least the memzero_explicit() call in
> lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
> 
> Fix this by moving the definition of barrier_data() into compiler.h.
> 
> Also move the gcc/clang definition of barrier() into compiler.h,
> __memory_barrier() is icc-specific (and barrier() is already defined
> using it in compiler-intel.h) and doesn't belong in compiler.h.
> 
> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
> Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")

Yeowch.

Cc: stable@vger.kernel.org
Reviewed-by: Kees Cook <keescook@chromium.org>

Nick just mentioned this to me; I hadn't had a chance to read it yet. This
needs to go to Linus ASAP; memzero_explicit() under Clang in v4.19 and
later isn't so explicit. :(

Andrew, Linus, can one of you pick this up please?

As Nick mentioned, sorting out the specifics of the comments[1] can
come later.

[1] https://lore.kernel.org/lkml/CAKwvOdkLvxeYeBh7Kx0gw7JPktPH8A4DomJTidUqA0jRQTR0FA@mail.gmail.com/

> ---
>  include/linux/compiler-clang.h |  6 ------
>  include/linux/compiler-gcc.h   | 19 -------------------
>  include/linux/compiler.h       | 18 ++++++++++++++++--
>  3 files changed, 16 insertions(+), 27 deletions(-)
> 
> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> index cee0c728d39a..04c0a5a717f7 100644
> --- a/include/linux/compiler-clang.h
> +++ b/include/linux/compiler-clang.h
> @@ -52,12 +52,6 @@
>  #define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
>  #endif
>  
> -/* The following are for compatibility with GCC, from compiler-gcc.h,
> - * and may be redefined here because they should not be shared with other
> - * compilers, like ICC.
> - */
> -#define barrier() __asm__ __volatile__("" : : : "memory")
> -
>  #if __has_feature(shadow_call_stack)
>  # define __noscs	__attribute__((__no_sanitize__("shadow-call-stack")))
>  #endif
> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> index 7a3769040d7d..fda30ffb037b 100644
> --- a/include/linux/compiler-gcc.h
> +++ b/include/linux/compiler-gcc.h
> @@ -15,25 +15,6 @@
>  # error Sorry, your compiler is too old - please upgrade it.
>  #endif
>  
> -/* Optimization barrier */
> -
> -/* The "volatile" is due to gcc bugs */
> -#define barrier() __asm__ __volatile__("": : :"memory")
> -/*
> - * This version is i.e. to prevent dead stores elimination on @ptr
> - * where gcc and llvm may behave differently when otherwise using
> - * normal barrier(): while gcc behavior gets along with a normal
> - * barrier(), llvm needs an explicit input variable to be assumed
> - * clobbered. The issue is as follows: while the inline asm might
> - * access any memory it wants, the compiler could have fit all of
> - * @ptr into memory registers instead, and since @ptr never escaped
> - * from that, it proved that the inline asm wasn't touching any of
> - * it. This version works well with both compilers, i.e. we're telling
> - * the compiler that the inline asm absolutely may see the contents
> - * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
> - */
> -#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
> -
>  /*
>   * This macro obfuscates arithmetic on a variable address so that gcc
>   * shouldn't recognize the original var, and make assumptions about it.
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 92ef163a7479..dfba70b2644f 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -80,11 +80,25 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>  
>  /* Optimization barrier */
>  #ifndef barrier
> -# define barrier() __memory_barrier()
> +/* The "volatile" is due to gcc bugs */
> +# define barrier() __asm__ __volatile__("": : :"memory")
>  #endif
>  
>  #ifndef barrier_data
> -# define barrier_data(ptr) barrier()
> +/*
> + * This version is i.e. to prevent dead stores elimination on @ptr
> + * where gcc and llvm may behave differently when otherwise using
> + * normal barrier(): while gcc behavior gets along with a normal
> + * barrier(), llvm needs an explicit input variable to be assumed
> + * clobbered. The issue is as follows: while the inline asm might
> + * access any memory it wants, the compiler could have fit all of
> + * @ptr into memory registers instead, and since @ptr never escaped
> + * from that, it proved that the inline asm wasn't touching any of
> + * it. This version works well with both compilers, i.e. we're telling
> + * the compiler that the inline asm absolutely may see the contents
> + * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
> + */
> +# define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
>  #endif
>  
>  /* workaround for GCC PR82365 if needed */
> -- 
> 2.26.2
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-10-14 21:26 [PATCH] compiler.h: Fix barrier_data() on clang Arvind Sankar
@ 2020-11-16 17:47   ` Andreas Schwab
  2020-10-15  8:50 ` David Laight
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 28+ messages in thread
From: Andreas Schwab @ 2020-11-16 17:47 UTC (permalink / raw)
  To: Arvind Sankar
  Cc: Nathan Chancellor, Nick Desaulniers, clang-built-linux,
	linux-kernel, linux-riscv

On Okt 14 2020, Arvind Sankar wrote:

> Commit
>   815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
>
> neglected to copy barrier_data() from compiler-gcc.h into
> compiler-clang.h. The definition in compiler-gcc.h was really to work
> around clang's more aggressive optimization, so this broke
> barrier_data() on clang, and consequently memzero_explicit() as well.
>
> For example, this results in at least the memzero_explicit() call in
> lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
>
> Fix this by moving the definition of barrier_data() into compiler.h.
>
> Also move the gcc/clang definition of barrier() into compiler.h,
> __memory_barrier() is icc-specific (and barrier() is already defined
> using it in compiler-intel.h) and doesn't belong in compiler.h.
>
> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
> Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")

This breaks build on riscv:

  CC [M]  drivers/net/ethernet/emulex/benet/be_main.o
In file included from ./include/vdso/processor.h:10,
                 from ./arch/riscv/include/asm/processor.h:11,
                 from ./include/linux/prefetch.h:15,
                 from drivers/net/ethernet/emulex/benet/be_main.c:14:
./arch/riscv/include/asm/vdso/processor.h: In function 'cpu_relax':
./arch/riscv/include/asm/vdso/processor.h:14:2: error: implicit declaration of function 'barrier' [-Werror=implicit-function-declaration]
   14 |  barrier();
      |  ^~~~~~~
cc1: some warnings being treated as errors
make[5]: *** [scripts/Makefile.build:283: drivers/net/ethernet/emulex/benet/be_main.o] Error 1
make[4]: *** [scripts/Makefile.build:500: drivers/net/ethernet/emulex/benet] Error 2
make[3]: *** [scripts/Makefile.build:500: drivers/net/ethernet/emulex] Error 2
make[2]: *** [scripts/Makefile.build:500: drivers/net/ethernet] Error 2
make[1]: *** [scripts/Makefile.build:500: drivers/net] Error 2
make: *** [Makefile:1799: drivers] Error 2

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
@ 2020-11-16 17:47   ` Andreas Schwab
  0 siblings, 0 replies; 28+ messages in thread
From: Andreas Schwab @ 2020-11-16 17:47 UTC (permalink / raw)
  To: Arvind Sankar
  Cc: clang-built-linux, Nathan Chancellor, Nick Desaulniers,
	linux-kernel, linux-riscv

On Okt 14 2020, Arvind Sankar wrote:

> Commit
>   815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
>
> neglected to copy barrier_data() from compiler-gcc.h into
> compiler-clang.h. The definition in compiler-gcc.h was really to work
> around clang's more aggressive optimization, so this broke
> barrier_data() on clang, and consequently memzero_explicit() as well.
>
> For example, this results in at least the memzero_explicit() call in
> lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
>
> Fix this by moving the definition of barrier_data() into compiler.h.
>
> Also move the gcc/clang definition of barrier() into compiler.h,
> __memory_barrier() is icc-specific (and barrier() is already defined
> using it in compiler-intel.h) and doesn't belong in compiler.h.
>
> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
> Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")

This breaks build on riscv:

  CC [M]  drivers/net/ethernet/emulex/benet/be_main.o
In file included from ./include/vdso/processor.h:10,
                 from ./arch/riscv/include/asm/processor.h:11,
                 from ./include/linux/prefetch.h:15,
                 from drivers/net/ethernet/emulex/benet/be_main.c:14:
./arch/riscv/include/asm/vdso/processor.h: In function 'cpu_relax':
./arch/riscv/include/asm/vdso/processor.h:14:2: error: implicit declaration of function 'barrier' [-Werror=implicit-function-declaration]
   14 |  barrier();
      |  ^~~~~~~
cc1: some warnings being treated as errors
make[5]: *** [scripts/Makefile.build:283: drivers/net/ethernet/emulex/benet/be_main.o] Error 1
make[4]: *** [scripts/Makefile.build:500: drivers/net/ethernet/emulex/benet] Error 2
make[3]: *** [scripts/Makefile.build:500: drivers/net/ethernet/emulex] Error 2
make[2]: *** [scripts/Makefile.build:500: drivers/net/ethernet] Error 2
make[1]: *** [scripts/Makefile.build:500: drivers/net] Error 2
make: *** [Makefile:1799: drivers] Error 2

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-11-16 17:47   ` Andreas Schwab
@ 2020-11-16 17:53     ` Randy Dunlap
  -1 siblings, 0 replies; 28+ messages in thread
From: Randy Dunlap @ 2020-11-16 17:53 UTC (permalink / raw)
  To: Andreas Schwab, Arvind Sankar
  Cc: Nathan Chancellor, Nick Desaulniers, clang-built-linux,
	linux-kernel, linux-riscv

On 11/16/20 9:47 AM, Andreas Schwab wrote:
> On Okt 14 2020, Arvind Sankar wrote:
> 
>> Commit
>>   815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
>>
>> neglected to copy barrier_data() from compiler-gcc.h into
>> compiler-clang.h. The definition in compiler-gcc.h was really to work
>> around clang's more aggressive optimization, so this broke
>> barrier_data() on clang, and consequently memzero_explicit() as well.
>>
>> For example, this results in at least the memzero_explicit() call in
>> lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
>>
>> Fix this by moving the definition of barrier_data() into compiler.h.
>>
>> Also move the gcc/clang definition of barrier() into compiler.h,
>> __memory_barrier() is icc-specific (and barrier() is already defined
>> using it in compiler-intel.h) and doesn't belong in compiler.h.
>>
>> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
>> Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
> 
> This breaks build on riscv:
> 
>   CC [M]  drivers/net/ethernet/emulex/benet/be_main.o
> In file included from ./include/vdso/processor.h:10,
>                  from ./arch/riscv/include/asm/processor.h:11,
>                  from ./include/linux/prefetch.h:15,
>                  from drivers/net/ethernet/emulex/benet/be_main.c:14:
> ./arch/riscv/include/asm/vdso/processor.h: In function 'cpu_relax':
> ./arch/riscv/include/asm/vdso/processor.h:14:2: error: implicit declaration of function 'barrier' [-Werror=implicit-function-declaration]
>    14 |  barrier();
>       |  ^~~~~~~
> cc1: some warnings being treated as errors
> make[5]: *** [scripts/Makefile.build:283: drivers/net/ethernet/emulex/benet/be_main.o] Error 1
> make[4]: *** [scripts/Makefile.build:500: drivers/net/ethernet/emulex/benet] Error 2
> make[3]: *** [scripts/Makefile.build:500: drivers/net/ethernet/emulex] Error 2
> make[2]: *** [scripts/Makefile.build:500: drivers/net/ethernet] Error 2
> make[1]: *** [scripts/Makefile.build:500: drivers/net] Error 2
> make: *** [Makefile:1799: drivers] Error 2

Hi,
What kernel version are you building?
This should be fixed in 5.10-rc4 by
commit 3347acc6fcd4ee71ad18a9ff9d9dac176b517329;
specifically this portion of it:

diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 798027bb89be..640f09479bdf 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -13,6 +13,7 @@
 
 #ifndef __ASSEMBLY__
 
+#include <linux/compiler.h>
 #include <asm/rwonce.h>
 
 #ifndef nop



-- 
~Randy


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
@ 2020-11-16 17:53     ` Randy Dunlap
  0 siblings, 0 replies; 28+ messages in thread
From: Randy Dunlap @ 2020-11-16 17:53 UTC (permalink / raw)
  To: Andreas Schwab, Arvind Sankar
  Cc: clang-built-linux, Nathan Chancellor, Nick Desaulniers,
	linux-kernel, linux-riscv

On 11/16/20 9:47 AM, Andreas Schwab wrote:
> On Okt 14 2020, Arvind Sankar wrote:
> 
>> Commit
>>   815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
>>
>> neglected to copy barrier_data() from compiler-gcc.h into
>> compiler-clang.h. The definition in compiler-gcc.h was really to work
>> around clang's more aggressive optimization, so this broke
>> barrier_data() on clang, and consequently memzero_explicit() as well.
>>
>> For example, this results in at least the memzero_explicit() call in
>> lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
>>
>> Fix this by moving the definition of barrier_data() into compiler.h.
>>
>> Also move the gcc/clang definition of barrier() into compiler.h,
>> __memory_barrier() is icc-specific (and barrier() is already defined
>> using it in compiler-intel.h) and doesn't belong in compiler.h.
>>
>> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
>> Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
> 
> This breaks build on riscv:
> 
>   CC [M]  drivers/net/ethernet/emulex/benet/be_main.o
> In file included from ./include/vdso/processor.h:10,
>                  from ./arch/riscv/include/asm/processor.h:11,
>                  from ./include/linux/prefetch.h:15,
>                  from drivers/net/ethernet/emulex/benet/be_main.c:14:
> ./arch/riscv/include/asm/vdso/processor.h: In function 'cpu_relax':
> ./arch/riscv/include/asm/vdso/processor.h:14:2: error: implicit declaration of function 'barrier' [-Werror=implicit-function-declaration]
>    14 |  barrier();
>       |  ^~~~~~~
> cc1: some warnings being treated as errors
> make[5]: *** [scripts/Makefile.build:283: drivers/net/ethernet/emulex/benet/be_main.o] Error 1
> make[4]: *** [scripts/Makefile.build:500: drivers/net/ethernet/emulex/benet] Error 2
> make[3]: *** [scripts/Makefile.build:500: drivers/net/ethernet/emulex] Error 2
> make[2]: *** [scripts/Makefile.build:500: drivers/net/ethernet] Error 2
> make[1]: *** [scripts/Makefile.build:500: drivers/net] Error 2
> make: *** [Makefile:1799: drivers] Error 2

Hi,
What kernel version are you building?
This should be fixed in 5.10-rc4 by
commit 3347acc6fcd4ee71ad18a9ff9d9dac176b517329;
specifically this portion of it:

diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 798027bb89be..640f09479bdf 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -13,6 +13,7 @@
 
 #ifndef __ASSEMBLY__
 
+#include <linux/compiler.h>
 #include <asm/rwonce.h>
 
 #ifndef nop



-- 
~Randy


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-11-16 17:53     ` Randy Dunlap
@ 2020-11-16 18:30       ` Andreas Schwab
  -1 siblings, 0 replies; 28+ messages in thread
From: Andreas Schwab @ 2020-11-16 18:30 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Arvind Sankar, Nathan Chancellor, Nick Desaulniers,
	clang-built-linux, linux-kernel, linux-riscv

On Nov 16 2020, Randy Dunlap wrote:

> What kernel version are you building?

5.10-rc4

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
@ 2020-11-16 18:30       ` Andreas Schwab
  0 siblings, 0 replies; 28+ messages in thread
From: Andreas Schwab @ 2020-11-16 18:30 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Nick Desaulniers, linux-kernel, clang-built-linux, Arvind Sankar,
	Nathan Chancellor, linux-riscv

On Nov 16 2020, Randy Dunlap wrote:

> What kernel version are you building?

5.10-rc4

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-11-16 18:30       ` Andreas Schwab
@ 2020-11-16 19:28         ` Randy Dunlap
  -1 siblings, 0 replies; 28+ messages in thread
From: Randy Dunlap @ 2020-11-16 19:28 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Arvind Sankar, Nathan Chancellor, Nick Desaulniers,
	clang-built-linux, linux-kernel, linux-riscv

On 11/16/20 10:30 AM, Andreas Schwab wrote:
> On Nov 16 2020, Randy Dunlap wrote:
> 
>> What kernel version are you building?
> 
> 5.10-rc4
> 
> Andreas.

OK, thanks.

My build machine is slow, but I have a patch that I am testing:

---
From: Randy Dunlap <rdunlap@infradead.org>

riscv's <vdso/processor.h> uses barrier() so it should
#include <asm/barrier.h> to prevent build errors.

Reported-by: Andreas Schwab <schwab@linux-m68k.org>
---
  arch/riscv/include/asm/vdso/processor.h |    2 ++
  1 file changed, 2 insertions(+)

--- lnx-510-rc4.orig/arch/riscv/include/asm/vdso/processor.h
+++ lnx-510-rc4/arch/riscv/include/asm/vdso/processor.h
@@ -4,6 +4,8 @@

  #ifndef __ASSEMBLY__

+#include <asm/barrier.h>
+
  static inline void cpu_relax(void)
  {
  #ifdef __riscv_muldiv






^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
@ 2020-11-16 19:28         ` Randy Dunlap
  0 siblings, 0 replies; 28+ messages in thread
From: Randy Dunlap @ 2020-11-16 19:28 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Nick Desaulniers, linux-kernel, clang-built-linux, Arvind Sankar,
	Nathan Chancellor, linux-riscv

On 11/16/20 10:30 AM, Andreas Schwab wrote:
> On Nov 16 2020, Randy Dunlap wrote:
> 
>> What kernel version are you building?
> 
> 5.10-rc4
> 
> Andreas.

OK, thanks.

My build machine is slow, but I have a patch that I am testing:

---
From: Randy Dunlap <rdunlap@infradead.org>

riscv's <vdso/processor.h> uses barrier() so it should
#include <asm/barrier.h> to prevent build errors.

Reported-by: Andreas Schwab <schwab@linux-m68k.org>
---
  arch/riscv/include/asm/vdso/processor.h |    2 ++
  1 file changed, 2 insertions(+)

--- lnx-510-rc4.orig/arch/riscv/include/asm/vdso/processor.h
+++ lnx-510-rc4/arch/riscv/include/asm/vdso/processor.h
@@ -4,6 +4,8 @@

  #ifndef __ASSEMBLY__

+#include <asm/barrier.h>
+
  static inline void cpu_relax(void)
  {
  #ifdef __riscv_muldiv






_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-11-16 17:47   ` Andreas Schwab
@ 2020-11-16 19:31     ` Nick Desaulniers
  -1 siblings, 0 replies; 28+ messages in thread
From: Nick Desaulniers @ 2020-11-16 19:31 UTC (permalink / raw)
  To: Andreas Schwab, Palmer Dabbelt
  Cc: Arvind Sankar, Nathan Chancellor, clang-built-linux, LKML, linux-riscv

On Mon, Nov 16, 2020 at 9:47 AM Andreas Schwab <schwab@linux-m68k.org> wrote:
>
> On Okt 14 2020, Arvind Sankar wrote:
>
> > Commit
> >   815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
> >
> > neglected to copy barrier_data() from compiler-gcc.h into
> > compiler-clang.h. The definition in compiler-gcc.h was really to work
> > around clang's more aggressive optimization, so this broke
> > barrier_data() on clang, and consequently memzero_explicit() as well.
> >
> > For example, this results in at least the memzero_explicit() call in
> > lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
> >
> > Fix this by moving the definition of barrier_data() into compiler.h.
> >
> > Also move the gcc/clang definition of barrier() into compiler.h,
> > __memory_barrier() is icc-specific (and barrier() is already defined
> > using it in compiler-intel.h) and doesn't belong in compiler.h.
> >
> > Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
> > Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
>
> This breaks build on riscv:
>
>   CC [M]  drivers/net/ethernet/emulex/benet/be_main.o
> In file included from ./include/vdso/processor.h:10,
>                  from ./arch/riscv/include/asm/processor.h:11,
>                  from ./include/linux/prefetch.h:15,
>                  from drivers/net/ethernet/emulex/benet/be_main.c:14:
> ./arch/riscv/include/asm/vdso/processor.h: In function 'cpu_relax':
> ./arch/riscv/include/asm/vdso/processor.h:14:2: error: implicit declaration of function 'barrier' [-Werror=implicit-function-declaration]
>    14 |  barrier();
>       |  ^~~~~~~
> cc1: some warnings being treated as errors
> make[5]: *** [scripts/Makefile.build:283: drivers/net/ethernet/emulex/benet/be_main.o] Error 1
> make[4]: *** [scripts/Makefile.build:500: drivers/net/ethernet/emulex/benet] Error 2
> make[3]: *** [scripts/Makefile.build:500: drivers/net/ethernet/emulex] Error 2
> make[2]: *** [scripts/Makefile.build:500: drivers/net/ethernet] Error 2
> make[1]: *** [scripts/Makefile.build:500: drivers/net] Error 2
> make: *** [Makefile:1799: drivers] Error 2
>
> Andreas.

A lot of VDSO's reset KBUILD_CFLAGS or use a new variable for their
compiler flags.  As such, they're missing `-include` command line flag
that injects include/linux/compiler_types.h, which `#includes`
numerous other headers if `__KERNEL__` is defined (`-D__KERNEL__`).
So the RISCV VDSO Makefile might need to `-include
$(srctree)/include/linux/compiler_types.h -D__KERNEL__`, or `#include
<linux/compiler.h>"` directly in
arch/riscv/include/asm/vdso/processor.h.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
@ 2020-11-16 19:31     ` Nick Desaulniers
  0 siblings, 0 replies; 28+ messages in thread
From: Nick Desaulniers @ 2020-11-16 19:31 UTC (permalink / raw)
  To: Andreas Schwab, Palmer Dabbelt
  Cc: clang-built-linux, Arvind Sankar, Nathan Chancellor, linux-riscv, LKML

On Mon, Nov 16, 2020 at 9:47 AM Andreas Schwab <schwab@linux-m68k.org> wrote:
>
> On Okt 14 2020, Arvind Sankar wrote:
>
> > Commit
> >   815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
> >
> > neglected to copy barrier_data() from compiler-gcc.h into
> > compiler-clang.h. The definition in compiler-gcc.h was really to work
> > around clang's more aggressive optimization, so this broke
> > barrier_data() on clang, and consequently memzero_explicit() as well.
> >
> > For example, this results in at least the memzero_explicit() call in
> > lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
> >
> > Fix this by moving the definition of barrier_data() into compiler.h.
> >
> > Also move the gcc/clang definition of barrier() into compiler.h,
> > __memory_barrier() is icc-specific (and barrier() is already defined
> > using it in compiler-intel.h) and doesn't belong in compiler.h.
> >
> > Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
> > Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
>
> This breaks build on riscv:
>
>   CC [M]  drivers/net/ethernet/emulex/benet/be_main.o
> In file included from ./include/vdso/processor.h:10,
>                  from ./arch/riscv/include/asm/processor.h:11,
>                  from ./include/linux/prefetch.h:15,
>                  from drivers/net/ethernet/emulex/benet/be_main.c:14:
> ./arch/riscv/include/asm/vdso/processor.h: In function 'cpu_relax':
> ./arch/riscv/include/asm/vdso/processor.h:14:2: error: implicit declaration of function 'barrier' [-Werror=implicit-function-declaration]
>    14 |  barrier();
>       |  ^~~~~~~
> cc1: some warnings being treated as errors
> make[5]: *** [scripts/Makefile.build:283: drivers/net/ethernet/emulex/benet/be_main.o] Error 1
> make[4]: *** [scripts/Makefile.build:500: drivers/net/ethernet/emulex/benet] Error 2
> make[3]: *** [scripts/Makefile.build:500: drivers/net/ethernet/emulex] Error 2
> make[2]: *** [scripts/Makefile.build:500: drivers/net/ethernet] Error 2
> make[1]: *** [scripts/Makefile.build:500: drivers/net] Error 2
> make: *** [Makefile:1799: drivers] Error 2
>
> Andreas.

A lot of VDSO's reset KBUILD_CFLAGS or use a new variable for their
compiler flags.  As such, they're missing `-include` command line flag
that injects include/linux/compiler_types.h, which `#includes`
numerous other headers if `__KERNEL__` is defined (`-D__KERNEL__`).
So the RISCV VDSO Makefile might need to `-include
$(srctree)/include/linux/compiler_types.h -D__KERNEL__`, or `#include
<linux/compiler.h>"` directly in
arch/riscv/include/asm/vdso/processor.h.
-- 
Thanks,
~Nick Desaulniers

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-11-16 19:31     ` Nick Desaulniers
@ 2020-11-16 21:07       ` Andreas Schwab
  -1 siblings, 0 replies; 28+ messages in thread
From: Andreas Schwab @ 2020-11-16 21:07 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Palmer Dabbelt, Arvind Sankar, Nathan Chancellor,
	clang-built-linux, LKML, linux-riscv

On Nov 16 2020, Nick Desaulniers wrote:

> A lot of VDSO's reset KBUILD_CFLAGS or use a new variable for their
> compiler flags.  As such, they're missing `-include` command line flag
> that injects include/linux/compiler_types.h,

It's not missing here.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
@ 2020-11-16 21:07       ` Andreas Schwab
  0 siblings, 0 replies; 28+ messages in thread
From: Andreas Schwab @ 2020-11-16 21:07 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Palmer Dabbelt, LKML, clang-built-linux, Arvind Sankar,
	Nathan Chancellor, linux-riscv

On Nov 16 2020, Nick Desaulniers wrote:

> A lot of VDSO's reset KBUILD_CFLAGS or use a new variable for their
> compiler flags.  As such, they're missing `-include` command line flag
> that injects include/linux/compiler_types.h,

It's not missing here.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
  2020-11-16 19:28         ` Randy Dunlap
@ 2020-11-16 22:19           ` Randy Dunlap
  -1 siblings, 0 replies; 28+ messages in thread
From: Randy Dunlap @ 2020-11-16 22:19 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Arvind Sankar, Nathan Chancellor, Nick Desaulniers,
	clang-built-linux, linux-kernel, linux-riscv

On 11/16/20 11:28 AM, Randy Dunlap wrote:
> On 11/16/20 10:30 AM, Andreas Schwab wrote:
>> On Nov 16 2020, Randy Dunlap wrote:
>>
>>> What kernel version are you building?
>>
>> 5.10-rc4
>>
>> Andreas.
> 
> OK, thanks.
> 
> My build machine is slow, but I have a patch that I am testing:
> 
> ---
> From: Randy Dunlap <rdunlap@infradead.org>
> 
> riscv's <vdso/processor.h> uses barrier() so it should
> #include <asm/barrier.h> to prevent build errors.
> 
> Reported-by: Andreas Schwab <schwab@linux-m68k.org>
> ---
>  arch/riscv/include/asm/vdso/processor.h |    2 ++
>  1 file changed, 2 insertions(+)
> 
> --- lnx-510-rc4.orig/arch/riscv/include/asm/vdso/processor.h
> +++ lnx-510-rc4/arch/riscv/include/asm/vdso/processor.h
> @@ -4,6 +4,8 @@
> 
>  #ifndef __ASSEMBLY__
> 
> +#include <asm/barrier.h>
> +
>  static inline void cpu_relax(void)
>  {
>  #ifdef __riscv_muldiv


This fixes the emulex/benet/ driver build.
I'm still building allmodconfig to see if there are any
other issues.

-- 
~Randy


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] compiler.h: Fix barrier_data() on clang
@ 2020-11-16 22:19           ` Randy Dunlap
  0 siblings, 0 replies; 28+ messages in thread
From: Randy Dunlap @ 2020-11-16 22:19 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Nick Desaulniers, linux-kernel, clang-built-linux, Arvind Sankar,
	Nathan Chancellor, linux-riscv

On 11/16/20 11:28 AM, Randy Dunlap wrote:
> On 11/16/20 10:30 AM, Andreas Schwab wrote:
>> On Nov 16 2020, Randy Dunlap wrote:
>>
>>> What kernel version are you building?
>>
>> 5.10-rc4
>>
>> Andreas.
> 
> OK, thanks.
> 
> My build machine is slow, but I have a patch that I am testing:
> 
> ---
> From: Randy Dunlap <rdunlap@infradead.org>
> 
> riscv's <vdso/processor.h> uses barrier() so it should
> #include <asm/barrier.h> to prevent build errors.
> 
> Reported-by: Andreas Schwab <schwab@linux-m68k.org>
> ---
>  arch/riscv/include/asm/vdso/processor.h |    2 ++
>  1 file changed, 2 insertions(+)
> 
> --- lnx-510-rc4.orig/arch/riscv/include/asm/vdso/processor.h
> +++ lnx-510-rc4/arch/riscv/include/asm/vdso/processor.h
> @@ -4,6 +4,8 @@
> 
>  #ifndef __ASSEMBLY__
> 
> +#include <asm/barrier.h>
> +
>  static inline void cpu_relax(void)
>  {
>  #ifdef __riscv_muldiv


This fixes the emulex/benet/ driver build.
I'm still building allmodconfig to see if there are any
other issues.

-- 
~Randy


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-11-16 22:20 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-14 21:26 [PATCH] compiler.h: Fix barrier_data() on clang Arvind Sankar
2020-10-14 22:51 ` Nick Desaulniers
2020-10-15  8:50 ` David Laight
2020-10-15 14:45   ` Arvind Sankar
2020-10-15 15:24     ` David Laight
2020-10-15 15:39       ` Arvind Sankar
2020-10-15 17:39         ` Nick Desaulniers
2020-10-15 18:13           ` [PATCH] compiler.h: Clarify comment about the need for barrier_data() Arvind Sankar
2020-10-15 18:25             ` Nick Desaulniers
2020-10-15 21:09             ` David Laight
2020-10-15 22:01               ` Arvind Sankar
2020-10-16  8:13                 ` David Laight
2020-10-16 13:09                   ` Arvind Sankar
2020-10-21 19:46 ` [PATCH] compiler.h: Fix barrier_data() on clang Kees Cook
2020-11-16 17:47 ` Andreas Schwab
2020-11-16 17:47   ` Andreas Schwab
2020-11-16 17:53   ` Randy Dunlap
2020-11-16 17:53     ` Randy Dunlap
2020-11-16 18:30     ` Andreas Schwab
2020-11-16 18:30       ` Andreas Schwab
2020-11-16 19:28       ` Randy Dunlap
2020-11-16 19:28         ` Randy Dunlap
2020-11-16 22:19         ` Randy Dunlap
2020-11-16 22:19           ` Randy Dunlap
2020-11-16 19:31   ` Nick Desaulniers
2020-11-16 19:31     ` Nick Desaulniers
2020-11-16 21:07     ` Andreas Schwab
2020-11-16 21:07       ` Andreas Schwab

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.