linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86: bitops: fix build regression
@ 2020-05-05 17:44 Nick Desaulniers
  2020-05-05 18:07 ` hpa
                   ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-05 17:44 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Sedat Dilek, stable, Jesse Brandeburg, kernelci . org bot,
	Andy Shevchenko, Ilie Halip, Nick Desaulniers, x86,
	H. Peter Anvin, Marco Elver, Paul E. McKenney,
	Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck, linux-kernel,
	clang-built-linux

From: Sedat Dilek <sedat.dilek@gmail.com>

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, this now produces invalid
assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb\t%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

The "q" constraint only has meaning on -m32; otherwise it is treated as
"r".

This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
or Clang+allyesconfig.

Keep the masking operation to appease sparse (`make C=1`), and add back
the cast in order to select the proper 8-bit register alias.

 [Nick: reworded]

Cc: stable@vger.kernel.org
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "andb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
-- 
2.26.2.526.g744177e7f7-goog


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH] x86: bitops: fix build regression
  2020-05-05 17:44 [PATCH] x86: bitops: fix build regression Nick Desaulniers
@ 2020-05-05 18:07 ` hpa
  2020-05-05 18:22   ` Nick Desaulniers
  2020-05-07 11:34   ` Peter Zijlstra
  2020-05-06  4:30 ` Nathan Chancellor
  2020-05-07  6:18 ` [PATCH] " Brian Gerst
  2 siblings, 2 replies; 45+ messages in thread
From: hpa @ 2020-05-05 18:07 UTC (permalink / raw)
  To: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Sedat Dilek, stable, Jesse Brandeburg, kernelci . org bot,
	Andy Shevchenko, Ilie Halip, x86, Marco Elver, Paul E. McKenney,
	Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck, linux-kernel,
	clang-built-linux

On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers <ndesaulniers@google.com> wrote:
>From: Sedat Dilek <sedat.dilek@gmail.com>
>
>It turns out that if your config tickles __builtin_constant_p via
>differences in choices to inline or not, this now produces invalid
>assembly:
>
>$ cat foo.c
>long a(long b, long c) {
>  asm("orb\t%1, %0" : "+q"(c): "r"(b));
>  return c;
>}
>$ gcc foo.c
>foo.c: Assembler messages:
>foo.c:2: Error: `%rax' not allowed with `orb'
>
>The "q" constraint only has meaning on -m32; otherwise it is treated as
>"r".
>
>This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
>or Clang+allyesconfig.
>
>Keep the masking operation to appease sparse (`make C=1`), and add back
>the cast in order to select the proper 8-bit register alias.
>
> [Nick: reworded]
>
>Cc: stable@vger.kernel.org
>Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
>Link: https://github.com/ClangBuiltLinux/linux/issues/961
>Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
>Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
>Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
>Reported-by: kernelci.org bot <bot@kernelci.org>
>Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
>Suggested-by: Ilie Halip <ilie.halip@gmail.com>
>Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
>Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
>Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
>---
> arch/x86/include/asm/bitops.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/arch/x86/include/asm/bitops.h
>b/arch/x86/include/asm/bitops.h
>index b392571c1f1d..139122e5b25b 100644
>--- a/arch/x86/include/asm/bitops.h
>+++ b/arch/x86/include/asm/bitops.h
>@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> 	if (__builtin_constant_p(nr)) {
> 		asm volatile(LOCK_PREFIX "orb %1,%0"
> 			: CONST_MASK_ADDR(nr, addr)
>-			: "iq" (CONST_MASK(nr) & 0xff)
>+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
> 			: "memory");
> 	} else {
> 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
>@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> 	if (__builtin_constant_p(nr)) {
> 		asm volatile(LOCK_PREFIX "andb %1,%0"
> 			: CONST_MASK_ADDR(nr, addr)
>-			: "iq" (CONST_MASK(nr) ^ 0xff));
>+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> 	} else {
> 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");

Drop & 0xff and change ^ 0xff to ~.

The redundancy is confusing.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


* Re: [PATCH] x86: bitops: fix build regression
  2020-05-05 18:07 ` hpa
@ 2020-05-05 18:22   ` Nick Desaulniers
  2020-05-07 11:34   ` Peter Zijlstra
  1 sibling, 0 replies; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-05 18:22 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sedat Dilek,
	# 3.4.x, Jesse Brandeburg, kernelci . org bot, Andy Shevchenko,
	Ilie Halip, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Marco Elver, Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck, LKML,
	clang-built-linux

On Tue, May 5, 2020 at 11:07 AM <hpa@zytor.com> wrote:
>
> On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers <ndesaulniers@google.com> wrote:
> >From: Sedat Dilek <sedat.dilek@gmail.com>
> >
> >It turns out that if your config tickles __builtin_constant_p via
> >differences in choices to inline or not, this now produces invalid
> >assembly:
> >
> >$ cat foo.c
> >long a(long b, long c) {
> >  asm("orb\t%1, %0" : "+q"(c): "r"(b));
> >  return c;
> >}
> >$ gcc foo.c
> >foo.c: Assembler messages:
> >foo.c:2: Error: `%rax' not allowed with `orb'
> >
> >The "q" constraint only has meaning on -m32; otherwise it is treated
> >as "r".
> >
> >This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> >or Clang+allyesconfig.
> >
> >Keep the masking operation to appease sparse (`make C=1`), and add
> >back the cast in order to select the proper 8-bit register alias.
> >
> > [Nick: reworded]
> >
> >Cc: stable@vger.kernel.org
> >Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> >Link: https://github.com/ClangBuiltLinux/linux/issues/961
> >Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> >Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> >Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> >Reported-by: kernelci.org bot <bot@kernelci.org>
> >Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> >Suggested-by: Ilie Halip <ilie.halip@gmail.com>
> >Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
> >Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
> >Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> >---
> > arch/x86/include/asm/bitops.h | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> >diff --git a/arch/x86/include/asm/bitops.h
> >b/arch/x86/include/asm/bitops.h
> >index b392571c1f1d..139122e5b25b 100644
> >--- a/arch/x86/include/asm/bitops.h
> >+++ b/arch/x86/include/asm/bitops.h
> >@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> >       if (__builtin_constant_p(nr)) {
> >               asm volatile(LOCK_PREFIX "orb %1,%0"
> >                       : CONST_MASK_ADDR(nr, addr)
> >-                      : "iq" (CONST_MASK(nr) & 0xff)
> >+                      : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> >                       : "memory");
> >       } else {
> >               asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> >@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> >       if (__builtin_constant_p(nr)) {
> >               asm volatile(LOCK_PREFIX "andb %1,%0"
> >                       : CONST_MASK_ADDR(nr, addr)
> >-                      : "iq" (CONST_MASK(nr) ^ 0xff));
> >+                      : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> >       } else {
> >               asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> >                       : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
>
> Drop & 0xff and change ^ 0xff to ~.
>
> The redundancy is confusing.

Thanks for the review.  While I would also like to have less
redundancy, we have ourselves a catch-22 here that that change won't resolve.

Without the cast to u8, gcc and clang will not select low-8-bit
registers required for the `b` suffix on `orb` and `andb`, which
results in an assembler error.
Without the mask, sparse will warn about the upper bytes of the value
being truncated.
(I guess that would have been a more concise commit message).
-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH] x86: bitops: fix build regression
  2020-05-05 17:44 [PATCH] x86: bitops: fix build regression Nick Desaulniers
  2020-05-05 18:07 ` hpa
@ 2020-05-06  4:30 ` Nathan Chancellor
  2020-05-06  9:22   ` Sedat Dilek
                     ` (2 more replies)
  2020-05-07  6:18 ` [PATCH] " Brian Gerst
  2 siblings, 3 replies; 45+ messages in thread
From: Nathan Chancellor @ 2020-05-06  4:30 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sedat Dilek,
	stable, Jesse Brandeburg, kernelci . org bot, Andy Shevchenko,
	Ilie Halip, x86, H. Peter Anvin, Marco Elver, Paul E. McKenney,
	Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck, linux-kernel,
	clang-built-linux

On Tue, May 05, 2020 at 10:44:22AM -0700, Nick Desaulniers wrote:
> From: Sedat Dilek <sedat.dilek@gmail.com>
> 
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, this now produces invalid
> assembly:
> 
> $ cat foo.c
> long a(long b, long c) {
>   asm("orb\t%1, %0" : "+q"(c): "r"(b));
>   return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
> 
> The "q" constraint only has meaning on -m32; otherwise it is treated as
> "r".
> 
> This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> or Clang+allyesconfig.

For what it's worth, I don't see this with allyesconfig.

> Keep the masking operation to appease sparse (`make C=1`), and add back
> the cast in order to select the proper 8-bit register alias.
> 
>  [Nick: reworded]
> 
> Cc: stable@vger.kernel.org

The offending commit was added in 5.7-rc1; we shouldn't need to
Cc stable since this should be picked up as an -rc fix.

> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> Reported-by: kernelci.org bot <bot@kernelci.org>
> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> Suggested-by: Ilie Halip <ilie.halip@gmail.com>

Not to split hairs, but this is Ilie's diff; he should probably be the
author, with Sedat's Reported-by/Tested-by.

https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-608239458

But eh, it's all a team effort plus that can only happen with Ilie's
explicit consent for a Signed-off-by.

I am currently doing a set of builds with clang-11 with this patch on
top of 5.7-rc4 to make sure that all of the cases I have found work.
Once that is done, I'll comment back with a tag.

> Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> ---
>  arch/x86/include/asm/bitops.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..139122e5b25b 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
>  	if (__builtin_constant_p(nr)) {
>  		asm volatile(LOCK_PREFIX "orb %1,%0"
>  			: CONST_MASK_ADDR(nr, addr)
> -			: "iq" (CONST_MASK(nr) & 0xff)
> +			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
>  			: "memory");
>  	} else {
>  		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> @@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
>  	if (__builtin_constant_p(nr)) {
>  		asm volatile(LOCK_PREFIX "andb %1,%0"
>  			: CONST_MASK_ADDR(nr, addr)
> -			: "iq" (CONST_MASK(nr) ^ 0xff));
> +			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
>  	} else {
>  		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
>  			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> -- 
> 2.26.2.526.g744177e7f7-goog
> 

Cheers,
Nathan


* Re: [PATCH] x86: bitops: fix build regression
  2020-05-06  4:30 ` Nathan Chancellor
@ 2020-05-06  9:22   ` Sedat Dilek
  2020-05-06 15:41   ` Nathan Chancellor
  2020-05-06 16:37   ` Nick Desaulniers
  2 siblings, 0 replies; 45+ messages in thread
From: Sedat Dilek @ 2020-05-06  9:22 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	stable, Jesse Brandeburg, kernelci . org bot, Andy Shevchenko,
	Ilie Halip, x86, H. Peter Anvin, Marco Elver, Paul E. McKenney,
	Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck, linux-kernel,
	Clang-Built-Linux ML

On Wed, May 6, 2020 at 6:30 AM Nathan Chancellor
<natechancellor@gmail.com> wrote:
>
> On Tue, May 05, 2020 at 10:44:22AM -0700, Nick Desaulniers wrote:
> > From: Sedat Dilek <sedat.dilek@gmail.com>
> >
> > It turns out that if your config tickles __builtin_constant_p via
> > differences in choices to inline or not, this now produces invalid
> > assembly:
> >
> > $ cat foo.c
> > long a(long b, long c) {
> >   asm("orb\t%1, %0" : "+q"(c): "r"(b));
> >   return c;
> > }
> > $ gcc foo.c
> > foo.c: Assembler messages:
> > foo.c:2: Error: `%rax' not allowed with `orb'
> >
> > The "q" constraint only has meaning on -m32; otherwise it is treated
> > as "r".
> >
> > This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> > or Clang+allyesconfig.
>
> For what it's worth, I don't see this with allyesconfig.
>
> > Keep the masking operation to appease sparse (`make C=1`), and add
> > back the cast in order to select the proper 8-bit register alias.
> >
> >  [Nick: reworded]
> >
> > Cc: stable@vger.kernel.org
>
> The offending commit was added in 5.7-rc1; we shouldn't need to
> Cc stable since this should be picked up as an -rc fix.
>
> > Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> > Link: https://github.com/ClangBuiltLinux/linux/issues/961
> > Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> > Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> > Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> > Reported-by: kernelci.org bot <bot@kernelci.org>
> > Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> > Suggested-by: Ilie Halip <ilie.halip@gmail.com>
>
> Not to split hairs but this is Ilie's diff, he should probably be the
> author with Sedat's Reported-by/Tested-by.
>
> https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-608239458
>
> But eh, it's all a team effort plus that can only happen with Ilie's
> explicit consent for a Signed-off-by.
>

Digital dementia... Looking back 3 weeks, I had put all the relevant
information into the patches in [1], mentioning that the diff is from
Ilie.
For whatever reason, Ilie did not react to any response in the CBL
issue-tracker for 3 weeks.
I think Nick wants to quickly fix the Kernel-CI-Bot issue seen with Clang.
Personally, I hope this patch will be upstreamed in (one of) the next
RC releases.

I agree that Cc: stable can be dropped.
Checking the commit-id that caused this:

$ git describe --contains 1651e700664b4
v5.7-rc1~122^2

Thanks.

Regards,
- Sedat -

[1] https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-613207374

> I am currently doing a set of builds with clang-11 with this patch on
> top of 5.7-rc4 to make sure that all of the cases I have found work.
> Once that is done, I'll comment back with a tag.
>
> > Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
> > Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
> > Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> > ---
> >  arch/x86/include/asm/bitops.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> > index b392571c1f1d..139122e5b25b 100644
> > --- a/arch/x86/include/asm/bitops.h
> > +++ b/arch/x86/include/asm/bitops.h
> > @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> >       if (__builtin_constant_p(nr)) {
> >               asm volatile(LOCK_PREFIX "orb %1,%0"
> >                       : CONST_MASK_ADDR(nr, addr)
> > -                     : "iq" (CONST_MASK(nr) & 0xff)
> > +                     : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> >                       : "memory");
> >       } else {
> >               asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> > @@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> >       if (__builtin_constant_p(nr)) {
> >               asm volatile(LOCK_PREFIX "andb %1,%0"
> >                       : CONST_MASK_ADDR(nr, addr)
> > -                     : "iq" (CONST_MASK(nr) ^ 0xff));
> > +                     : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> >       } else {
> >               asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> >                       : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> > --
> > 2.26.2.526.g744177e7f7-goog
> >
>
> Cheers,
> Nathan


* Re: [PATCH] x86: bitops: fix build regression
  2020-05-06  4:30 ` Nathan Chancellor
  2020-05-06  9:22   ` Sedat Dilek
@ 2020-05-06 15:41   ` Nathan Chancellor
  2020-05-06 16:37   ` Nick Desaulniers
  2 siblings, 0 replies; 45+ messages in thread
From: Nathan Chancellor @ 2020-05-06 15:41 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sedat Dilek,
	stable, Jesse Brandeburg, kernelci . org bot, Andy Shevchenko,
	Ilie Halip, x86, H. Peter Anvin, Marco Elver, Paul E. McKenney,
	Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck, linux-kernel,
	clang-built-linux

On Tue, May 05, 2020 at 09:30:28PM -0700, Nathan Chancellor wrote:
> On Tue, May 05, 2020 at 10:44:22AM -0700, Nick Desaulniers wrote:
> > From: Sedat Dilek <sedat.dilek@gmail.com>
> > 
> > It turns out that if your config tickles __builtin_constant_p via
> > differences in choices to inline or not, this now produces invalid
> > assembly:
> > 
> > $ cat foo.c
> > long a(long b, long c) {
> >   asm("orb\t%1, %0" : "+q"(c): "r"(b));
> >   return c;
> > }
> > $ gcc foo.c
> > foo.c: Assembler messages:
> > foo.c:2: Error: `%rax' not allowed with `orb'
> > 
> > The "q" constraint only has meaning on -m32; otherwise it is treated
> > as "r".
> > 
> > This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> > or Clang+allyesconfig.
> 
> For what it's worth, I don't see this with allyesconfig.
> 
> > Keep the masking operation to appease sparse (`make C=1`), and add
> > back the cast in order to select the proper 8-bit register alias.
> > 
> >  [Nick: reworded]
> > 
> > Cc: stable@vger.kernel.org
> 
> The offending commit was added in 5.7-rc1; we shouldn't need to
> Cc stable since this should be picked up as an -rc fix.
> 
> > Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> > Link: https://github.com/ClangBuiltLinux/linux/issues/961
> > Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> > Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> > Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> > Reported-by: kernelci.org bot <bot@kernelci.org>
> > Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> > Suggested-by: Ilie Halip <ilie.halip@gmail.com>
> 
> Not to split hairs but this is Ilie's diff, he should probably be the
> author with Sedat's Reported-by/Tested-by.
> 
> https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-608239458
> 
> But eh, it's all a team effort plus that can only happen with Ilie's
> explicit consent for a Signed-off-by.
> 
> I am currently doing a set of builds with clang-11 with this patch on
> top of 5.7-rc4 to make sure that all of the cases I have found work.
> Once that is done, I'll comment back with a tag.

Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com> # build

> > Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
> > Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
> > Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> > ---
> >  arch/x86/include/asm/bitops.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> > index b392571c1f1d..139122e5b25b 100644
> > --- a/arch/x86/include/asm/bitops.h
> > +++ b/arch/x86/include/asm/bitops.h
> > @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> >  	if (__builtin_constant_p(nr)) {
> >  		asm volatile(LOCK_PREFIX "orb %1,%0"
> >  			: CONST_MASK_ADDR(nr, addr)
> > -			: "iq" (CONST_MASK(nr) & 0xff)
> > +			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
> >  			: "memory");
> >  	} else {
> >  		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> > @@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> >  	if (__builtin_constant_p(nr)) {
> >  		asm volatile(LOCK_PREFIX "andb %1,%0"
> >  			: CONST_MASK_ADDR(nr, addr)
> > -			: "iq" (CONST_MASK(nr) ^ 0xff));
> > +			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> >  	} else {
> >  		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> >  			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> > -- 
> > 2.26.2.526.g744177e7f7-goog
> > 
> 
> Cheers,
> Nathan


* Re: [PATCH] x86: bitops: fix build regression
  2020-05-06  4:30 ` Nathan Chancellor
  2020-05-06  9:22   ` Sedat Dilek
  2020-05-06 15:41   ` Nathan Chancellor
@ 2020-05-06 16:37   ` Nick Desaulniers
  2020-05-06 16:55     ` Ilie Halip
  2 siblings, 1 reply; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-06 16:37 UTC (permalink / raw)
  To: Nathan Chancellor, Ilie Halip
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sedat Dilek,
	# 3.4.x, Jesse Brandeburg, kernelci . org bot, Andy Shevchenko,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Marco Elver, Paul E. McKenney,
	Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck, LKML,
	clang-built-linux

On Tue, May 5, 2020 at 9:30 PM Nathan Chancellor
<natechancellor@gmail.com> wrote:
>
> On Tue, May 05, 2020 at 10:44:22AM -0700, Nick Desaulniers wrote:
> > This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> > or Clang+allyesconfig.
>
> For what it's worth, I don't see this with allyesconfig.

Oops, ok, I'll drop that from the commit message in v2.  I was testing
with the former.

>
> > Keep the masking operation to appease sparse (`make C=1`), add back the
> > cast in order to properly select the proper 8b register alias.
> >
> >  [Nick: reworded]
> >
> > Cc: stable@vger.kernel.org
>
> The offending commit was added in 5.7-rc1; we shouldn't need to
> Cc stable since this should be picked up as an -rc fix.

Got it, will drop in v2.

>
> > Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> > Link: https://github.com/ClangBuiltLinux/linux/issues/961
> > Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> > Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> > Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> > Reported-by: kernelci.org bot <bot@kernelci.org>
> > Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> > Suggested-by: Ilie Halip <ilie.halip@gmail.com>
>
> Not to split hairs but this is Ilie's diff, he should probably be the
> author with Sedat's Reported-by/Tested-by.
>
> https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-608239458

Ooh, you're right. Sorry about that Ilie.  I'm usually pretty pedantic
about getting that right; my mistake.  I'll fix that in v2.  As Sedat
noted, the issue tracker has been a little quiet on this issue, but
I'll note that there are extraordinary circumstances going on in the
world these days (COVID) so delays should be anticipated.

Ilie, may I put your authorship and signed off by tag on the V2?

>
> But eh, it's all a team effort plus that can only happen with Ilie's
> explicit consent for a Signed-off-by.
-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH] x86: bitops: fix build regression
  2020-05-06 16:37   ` Nick Desaulniers
@ 2020-05-06 16:55     ` Ilie Halip
  2020-05-06 17:05       ` [PATCH v2] " Nick Desaulniers
  0 siblings, 1 reply; 45+ messages in thread
From: Ilie Halip @ 2020-05-06 16:55 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Nathan Chancellor, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sedat Dilek, # 3.4.x, Jesse Brandeburg, kernelci . org bot,
	Andy Shevchenko, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Marco Elver, Paul E. McKenney,
	Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck, LKML,
	clang-built-linux

Hi Nick,

> Ilie, may I put your authorship and signed off by tag on the V2?

Yes, of course. With the current global situation I took some time off
and didn't follow the latest discussion.

Feel free to credit/rework the code as you see fit.

Thanks,
I.H.


* [PATCH v2] x86: bitops: fix build regression
  2020-05-06 16:55     ` Ilie Halip
@ 2020-05-06 17:05       ` Nick Desaulniers
  0 siblings, 0 replies; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-06 17:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Ilie Halip, Jesse Brandeburg, Sedat Dilek, kernelci . org bot,
	Andy Shevchenko, Nathan Chancellor, Nick Desaulniers, x86,
	H. Peter Anvin, Marco Elver, Paul E. McKenney, Michael Ellerman,
	Masahiro Yamada, Daniel Axtens, Luc Van Oostenryck,
	Peter Zijlstra (Intel),
	linux-kernel, clang-built-linux

From: Ilie Halip <ilie.halip@gmail.com>

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, this now produces invalid
assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb\t%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

The "q" constraint only has meaning on -m32; otherwise it is treated as
"r".

This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

Without the cast to u8, gcc and clang will not select low-8-bit
registers required for the `b` suffix on `orb` and `andb`, which results
in an assembler error.  Without the mask, sparse will warn about the
upper bytes of the value being truncated.

 [Nick: reworded]

Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Ilie Halip <ilie.halip@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
Changes V1 -> V2:
* change authorship/signed-off-by to Ilie
* add Nathan's Tested by/reviewed by
* update commit message slightly with info sent to HPA.

 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "andb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
-- 
2.26.2.526.g744177e7f7-goog



* Re: [PATCH] x86: bitops: fix build regression
  2020-05-05 17:44 [PATCH] x86: bitops: fix build regression Nick Desaulniers
  2020-05-05 18:07 ` hpa
  2020-05-06  4:30 ` Nathan Chancellor
@ 2020-05-07  6:18 ` Brian Gerst
  2020-05-07  7:02   ` hpa
                     ` (2 more replies)
  2 siblings, 3 replies; 45+ messages in thread
From: Brian Gerst @ 2020-05-07  6:18 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sedat Dilek,
	stable, Jesse Brandeburg, kernelci . org bot, Andy Shevchenko,
	Ilie Halip, the arch/x86 maintainers, H. Peter Anvin,
	Marco Elver, Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> From: Sedat Dilek <sedat.dilek@gmail.com>
>
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, this now produces invalid
> assembly:
>
> $ cat foo.c
> long a(long b, long c) {
>   asm("orb\t%1, %0" : "+q"(c): "r"(b));
>   return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
>
> The "q" constraint only has meaning on -m32; otherwise it is treated as
> "r".
>
> This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> or Clang+allyesconfig.
>
> Keep the masking operation to appease sparse (`make C=1`), and add the
> cast back in order to select the proper 8-bit register alias.
>
>  [Nick: reworded]
>
> Cc: stable@vger.kernel.org
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> Reported-by: kernelci.org bot <bot@kernelci.org>
> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> Suggested-by: Ilie Halip <ilie.halip@gmail.com>
> Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> ---
>  arch/x86/include/asm/bitops.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..139122e5b25b 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
>         if (__builtin_constant_p(nr)) {
>                 asm volatile(LOCK_PREFIX "orb %1,%0"
>                         : CONST_MASK_ADDR(nr, addr)
> -                       : "iq" (CONST_MASK(nr) & 0xff)
> +                       : "iq" ((u8)(CONST_MASK(nr) & 0xff))

I think a better fix would be to make CONST_MASK() return a u8 value
rather than have to cast on every use.

Also I question the need for the "q" constraint.  It was added in
commit 437a0a54 as a workaround for GCC 3.4.4.  Now that the minimum
GCC version is 4.6, is this still necessary?

--
Brian Gerst

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] x86: bitops: fix build regression
  2020-05-07  6:18 ` [PATCH] " Brian Gerst
@ 2020-05-07  7:02   ` hpa
  2020-05-07 13:32     ` Brian Gerst
  2020-05-07  7:44   ` David Laight
  2020-05-07 19:22   ` Nick Desaulniers
  2 siblings, 1 reply; 45+ messages in thread
From: hpa @ 2020-05-07  7:02 UTC (permalink / raw)
  To: Brian Gerst, Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sedat Dilek,
	stable, Jesse Brandeburg, kernelci . org bot, Andy Shevchenko,
	Ilie Halip, the arch/x86 maintainers, Marco Elver,
	Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On May 6, 2020 11:18:09 PM PDT, Brian Gerst <brgerst@gmail.com> wrote:
>On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers
><ndesaulniers@google.com> wrote:
>>
>> From: Sedat Dilek <sedat.dilek@gmail.com>
>>
>> It turns out that if your config tickles __builtin_constant_p via
>> differences in choices to inline or not, this now produces invalid
>> assembly:
>>
>> $ cat foo.c
>> long a(long b, long c) {
>>   asm("orb\t%1, %0" : "+q"(c): "r"(b));
>>   return c;
>> }
>> $ gcc foo.c
>> foo.c: Assembler messages:
>> foo.c:2: Error: `%rax' not allowed with `orb'
>>
>> The "q" constraint only has meaning on -m32; otherwise it is treated
>> as "r".
>>
>> This is easily reproducible via
>Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
>> or Clang+allyesconfig.
>>
>> Keep the masking operation to appease sparse (`make C=1`), and add
>> the cast back in order to select the proper 8-bit register alias.
>>
>>  [Nick: reworded]
>>
>> Cc: stable@vger.kernel.org
>> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
>> Link: https://github.com/ClangBuiltLinux/linux/issues/961
>> Link:
>https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
>> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
>> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
>> Reported-by: kernelci.org bot <bot@kernelci.org>
>> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
>> Suggested-by: Ilie Halip <ilie.halip@gmail.com>
>> Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
>> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
>> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
>> ---
>>  arch/x86/include/asm/bitops.h | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/bitops.h
>b/arch/x86/include/asm/bitops.h
>> index b392571c1f1d..139122e5b25b 100644
>> --- a/arch/x86/include/asm/bitops.h
>> +++ b/arch/x86/include/asm/bitops.h
>> @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
>>         if (__builtin_constant_p(nr)) {
>>                 asm volatile(LOCK_PREFIX "orb %1,%0"
>>                         : CONST_MASK_ADDR(nr, addr)
>> -                       : "iq" (CONST_MASK(nr) & 0xff)
>> +                       : "iq" ((u8)(CONST_MASK(nr) & 0xff))
>
>I think a better fix would be to make CONST_MASK() return a u8 value
>rather than have to cast on every use.
>
>Also I question the need for the "q" constraint.  It was added in
>commit 437a0a54 as a workaround for GCC 3.4.4.  Now that the minimum
>GCC version is 4.6, is this still necessary?
>
>--
>Brian Gerst

Yes, "q" is needed on i386.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH] x86: bitops: fix build regression
  2020-05-07  6:18 ` [PATCH] " Brian Gerst
  2020-05-07  7:02   ` hpa
@ 2020-05-07  7:44   ` David Laight
  2020-05-07  7:59     ` hpa
  2020-05-07  9:17     ` Andy Shevchenko
  2020-05-07 19:22   ` Nick Desaulniers
  2 siblings, 2 replies; 45+ messages in thread
From: David Laight @ 2020-05-07  7:44 UTC (permalink / raw)
  To: 'Brian Gerst', Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sedat Dilek,
	stable, Jesse Brandeburg, kernelci . org bot, Andy Shevchenko,
	Ilie Halip, the arch/x86 maintainers, H. Peter Anvin,
	Marco Elver, Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

From: Brian Gerst
> Sent: 07 May 2020 07:18
...
> > --- a/arch/x86/include/asm/bitops.h
> > +++ b/arch/x86/include/asm/bitops.h
> > @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> >         if (__builtin_constant_p(nr)) {
> >                 asm volatile(LOCK_PREFIX "orb %1,%0"
> >                         : CONST_MASK_ADDR(nr, addr)
> > -                       : "iq" (CONST_MASK(nr) & 0xff)
> > +                       : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> 
> I think a better fix would be to make CONST_MASK() return a u8 value
> rather than have to cast on every use.

Or assign to a local variable - then it doesn't matter how
the value is actually calculated. So:
			u8 mask = CONST_MASK(nr);
			asm volatile(LOCK_PREFIX "orb %1,%0"
				: CONST_MASK_ADDR(nr, addr)
				: "iq" (mask)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH] x86: bitops: fix build regression
  2020-05-07  7:44   ` David Laight
@ 2020-05-07  7:59     ` hpa
  2020-05-07  8:35       ` David Laight
  2020-05-07  9:17     ` Andy Shevchenko
  1 sibling, 1 reply; 45+ messages in thread
From: hpa @ 2020-05-07  7:59 UTC (permalink / raw)
  To: David Laight, 'Brian Gerst', Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sedat Dilek,
	stable, Jesse Brandeburg, kernelci . org bot, Andy Shevchenko,
	Ilie Halip, the arch/x86 maintainers, Marco Elver,
	Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On May 7, 2020 12:44:44 AM PDT, David Laight <David.Laight@ACULAB.COM> wrote:
>From: Brian Gerst
>> Sent: 07 May 2020 07:18
>...
>> > --- a/arch/x86/include/asm/bitops.h
>> > +++ b/arch/x86/include/asm/bitops.h
>> > @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long
>*addr)
>> >         if (__builtin_constant_p(nr)) {
>> >                 asm volatile(LOCK_PREFIX "orb %1,%0"
>> >                         : CONST_MASK_ADDR(nr, addr)
>> > -                       : "iq" (CONST_MASK(nr) & 0xff)
>> > +                       : "iq" ((u8)(CONST_MASK(nr) & 0xff))
>> 
>> I think a better fix would be to make CONST_MASK() return a u8 value
>> rather than have to cast on every use.
>
>Or assign to a local variable - then it doesn't matter how
>the value is actually calculated. So:
>			u8 mask = CONST_MASK(nr);
>			asm volatile(LOCK_PREFIX "orb %1,%0"
>				: CONST_MASK_ADDR(nr, addr)
>				: "iq" (mask)
>
>	David
>

"const u8" please...
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH] x86: bitops: fix build regression
  2020-05-07  7:59     ` hpa
@ 2020-05-07  8:35       ` David Laight
  2020-05-07  8:38         ` hpa
  0 siblings, 1 reply; 45+ messages in thread
From: David Laight @ 2020-05-07  8:35 UTC (permalink / raw)
  To: 'hpa@zytor.com', 'Brian Gerst', Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sedat Dilek,
	stable, Jesse Brandeburg, kernelci . org bot, Andy Shevchenko,
	Ilie Halip, the arch/x86 maintainers, Marco Elver,
	Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

From: hpa@zytor.com
> Sent: 07 May 2020 08:59
> On May 7, 2020 12:44:44 AM PDT, David Laight <David.Laight@ACULAB.COM> wrote:
> >From: Brian Gerst
> >> Sent: 07 May 2020 07:18
> >...
> >> > --- a/arch/x86/include/asm/bitops.h
> >> > +++ b/arch/x86/include/asm/bitops.h
> >> > @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long
> >*addr)
> >> >         if (__builtin_constant_p(nr)) {
> >> >                 asm volatile(LOCK_PREFIX "orb %1,%0"
> >> >                         : CONST_MASK_ADDR(nr, addr)
> >> > -                       : "iq" (CONST_MASK(nr) & 0xff)
> >> > +                       : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> >>
> >> I think a better fix would be to make CONST_MASK() return a u8 value
> >> rather than have to cast on every use.
> >
> >Or assign to a local variable - then it doesn't matter how
> >the value is actually calculated. So:
> >			u8 mask = CONST_MASK(nr);
> >			asm volatile(LOCK_PREFIX "orb %1,%0"
> >				: CONST_MASK_ADDR(nr, addr)
> >				: "iq" (mask)
> >
> >	David
> >
> 
> "const u8" please...

Why, just a waste of disk space.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH] x86: bitops: fix build regression
  2020-05-07  8:35       ` David Laight
@ 2020-05-07  8:38         ` hpa
  0 siblings, 0 replies; 45+ messages in thread
From: hpa @ 2020-05-07  8:38 UTC (permalink / raw)
  To: David Laight, 'Brian Gerst', Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sedat Dilek,
	stable, Jesse Brandeburg, kernelci . org bot, Andy Shevchenko,
	Ilie Halip, the arch/x86 maintainers, Marco Elver,
	Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On May 7, 2020 1:35:01 AM PDT, David Laight <David.Laight@ACULAB.COM> wrote:
>From: hpa@zytor.com
>> Sent: 07 May 2020 08:59
>> On May 7, 2020 12:44:44 AM PDT, David Laight
><David.Laight@ACULAB.COM> wrote:
>> >From: Brian Gerst
>> >> Sent: 07 May 2020 07:18
>> >...
>> >> > --- a/arch/x86/include/asm/bitops.h
>> >> > +++ b/arch/x86/include/asm/bitops.h
>> >> > @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long
>> >*addr)
>> >> >         if (__builtin_constant_p(nr)) {
>> >> >                 asm volatile(LOCK_PREFIX "orb %1,%0"
>> >> >                         : CONST_MASK_ADDR(nr, addr)
>> >> > -                       : "iq" (CONST_MASK(nr) & 0xff)
>> >> > +                       : "iq" ((u8)(CONST_MASK(nr) & 0xff))
>> >>
>> >> I think a better fix would be to make CONST_MASK() return a u8
>value
>> >> rather than have to cast on every use.
>> >
>> >Or assign to a local variable - then it doesn't matter how
>> >the value is actually calculated. So:
>> >			u8 mask = CONST_MASK(nr);
>> >			asm volatile(LOCK_PREFIX "orb %1,%0"
>> >				: CONST_MASK_ADDR(nr, addr)
>> >				: "iq" (mask)
>> >
>> >	David
>> >
>> 
>> "const u8" please...
>
>Why, just a waste of disk space.
>
>	David
>

Descriptive.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] x86: bitops: fix build regression
  2020-05-07  7:44   ` David Laight
  2020-05-07  7:59     ` hpa
@ 2020-05-07  9:17     ` Andy Shevchenko
  1 sibling, 0 replies; 45+ messages in thread
From: Andy Shevchenko @ 2020-05-07  9:17 UTC (permalink / raw)
  To: David Laight
  Cc: Brian Gerst, Nick Desaulniers, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sedat Dilek, stable, Jesse Brandeburg,
	kernelci . org bot, Andy Shevchenko, Ilie Halip,
	the arch/x86 maintainers, H. Peter Anvin, Marco Elver,
	Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On Thu, May 7, 2020 at 10:50 AM David Laight <David.Laight@aculab.com> wrote:
> From: Brian Gerst
> > Sent: 07 May 2020 07:18

> > I think a better fix would be to make CONST_MASK() return a u8 value
> > rather than have to cast on every use.
>
> Or assign to a local variable - then it doesn't matter how
> the value is actually calculated. So:
>                         u8 mask = CONST_MASK(nr);

Another case, with negation, won't work like this, I believe.
So I think the patch we have is good enough; no need to look for trouble.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] x86: bitops: fix build regression
  2020-05-05 18:07 ` hpa
  2020-05-05 18:22   ` Nick Desaulniers
@ 2020-05-07 11:34   ` Peter Zijlstra
  2020-05-07 14:00     ` Brian Gerst
  2020-05-07 19:29     ` hpa
  1 sibling, 2 replies; 45+ messages in thread
From: Peter Zijlstra @ 2020-05-07 11:34 UTC (permalink / raw)
  To: hpa
  Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sedat Dilek, stable, Jesse Brandeburg, kernelci . org bot,
	Andy Shevchenko, Ilie Halip, x86, Marco Elver, Paul E. McKenney,
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck, linux-kernel,
	clang-built-linux

On Tue, May 05, 2020 at 11:07:24AM -0700, hpa@zytor.com wrote:
> On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers <ndesaulniers@google.com> wrote:

> >@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> > 	if (__builtin_constant_p(nr)) {
> > 		asm volatile(LOCK_PREFIX "orb %1,%0"
> > 			: CONST_MASK_ADDR(nr, addr)
> >-			: "iq" (CONST_MASK(nr) & 0xff)
> >+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
> > 			: "memory");
> > 	} else {
> > 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> >@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> > 	if (__builtin_constant_p(nr)) {
> > 		asm volatile(LOCK_PREFIX "andb %1,%0"
> > 			: CONST_MASK_ADDR(nr, addr)
> >-			: "iq" (CONST_MASK(nr) ^ 0xff));
> >+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> > 	} else {
> > 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> > 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> 
> Drop & 0xff and change ^ 0xff to ~.

But then we're back to sparse being unhappy, no? The thing with ~ is
that it will set high bits which will be truncated, which makes sparse
sad.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] x86: bitops: fix build regression
  2020-05-07  7:02   ` hpa
@ 2020-05-07 13:32     ` Brian Gerst
  2020-05-07 15:09       ` David Laight
  2020-05-07 19:29       ` hpa
  0 siblings, 2 replies; 45+ messages in thread
From: Brian Gerst @ 2020-05-07 13:32 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sedat Dilek, stable, Jesse Brandeburg, kernelci . org bot,
	Andy Shevchenko, Ilie Halip, the arch/x86 maintainers,
	Marco Elver, Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On Thu, May 7, 2020 at 3:02 AM <hpa@zytor.com> wrote:
>
> On May 6, 2020 11:18:09 PM PDT, Brian Gerst <brgerst@gmail.com> wrote:
> >On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers
> ><ndesaulniers@google.com> wrote:
> >>
> >> From: Sedat Dilek <sedat.dilek@gmail.com>
> >>
> >> It turns out that if your config tickles __builtin_constant_p via
> >> differences in choices to inline or not, this now produces invalid
> >> assembly:
> >>
> >> $ cat foo.c
> >> long a(long b, long c) {
> >>   asm("orb\t%1, %0" : "+q"(c): "r"(b));
> >>   return c;
> >> }
> >> $ gcc foo.c
> >> foo.c: Assembler messages:
> >> foo.c:2: Error: `%rax' not allowed with `orb'
> >>
> >> The "q" constraint only has meaning on -m32; otherwise it is
> >> treated as "r".
> >>
> >> This is easily reproducible via
> >Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> >> or Clang+allyesconfig.
> >>
> >> Keep the masking operation to appease sparse (`make C=1`), and add
> >> the cast back in order to select the proper 8-bit register alias.
> >>
> >>  [Nick: reworded]
> >>
> >> Cc: stable@vger.kernel.org
> >> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> >> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> >> Link:
> >https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> >> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> >> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> >> Reported-by: kernelci.org bot <bot@kernelci.org>
> >> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> >> Suggested-by: Ilie Halip <ilie.halip@gmail.com>
> >> Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
> >> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
> >> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> >> ---
> >>  arch/x86/include/asm/bitops.h | 4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/x86/include/asm/bitops.h
> >b/arch/x86/include/asm/bitops.h
> >> index b392571c1f1d..139122e5b25b 100644
> >> --- a/arch/x86/include/asm/bitops.h
> >> +++ b/arch/x86/include/asm/bitops.h
> >> @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> >>         if (__builtin_constant_p(nr)) {
> >>                 asm volatile(LOCK_PREFIX "orb %1,%0"
> >>                         : CONST_MASK_ADDR(nr, addr)
> >> -                       : "iq" (CONST_MASK(nr) & 0xff)
> >> +                       : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> >
> >I think a better fix would be to make CONST_MASK() return a u8 value
> >rather than have to cast on every use.
> >
> >Also I question the need for the "q" constraint.  It was added in
> >commit 437a0a54 as a workaround for GCC 3.4.4.  Now that the minimum
> >GCC version is 4.6, is this still necessary?
> >
> >--
> >Brian Gerst
>
> Yes, "q" is needed on i386.

I think the bug this worked around was that the compiler didn't detect
that CONST_MASK(nr) was also constant and doesn't need to be put into
a register.  The question is does that bug still exist on compiler
versions we care about?

--
Brian Gerst

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] x86: bitops: fix build regression
  2020-05-07 11:34   ` Peter Zijlstra
@ 2020-05-07 14:00     ` Brian Gerst
  2020-05-07 19:19       ` Nick Desaulniers
  2020-05-07 19:29     ` hpa
  1 sibling, 1 reply; 45+ messages in thread
From: Brian Gerst @ 2020-05-07 14:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: H. Peter Anvin, Nick Desaulniers, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sedat Dilek, stable, Jesse Brandeburg,
	kernelci . org bot, Andy Shevchenko, Ilie Halip,
	the arch/x86 maintainers, Marco Elver, Paul E. McKenney,
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On Thu, May 7, 2020 at 7:38 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, May 05, 2020 at 11:07:24AM -0700, hpa@zytor.com wrote:
> > On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> > >@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> > >     if (__builtin_constant_p(nr)) {
> > >             asm volatile(LOCK_PREFIX "orb %1,%0"
> > >                     : CONST_MASK_ADDR(nr, addr)
> > >-                    : "iq" (CONST_MASK(nr) & 0xff)
> > >+                    : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> > >                     : "memory");
> > >     } else {
> > >             asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> > >@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> > >     if (__builtin_constant_p(nr)) {
> > >             asm volatile(LOCK_PREFIX "andb %1,%0"
> > >                     : CONST_MASK_ADDR(nr, addr)
> > >-                    : "iq" (CONST_MASK(nr) ^ 0xff));
> > >+                    : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> > >     } else {
> > >             asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> > >                     : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> >
> > Drop & 0xff and change ^ 0xff to ~.
>
> But then we're back to sparse being unhappy, no? The thing with ~ is
> that it will set high bits which will be truncated, which makes sparse
> sad.

This change will make sparse happy and allow these cleanups:
#define CONST_MASK(nr)                 ((u8)1 << ((nr) & 7))

Tested with GCC 9.3.1.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH] x86: bitops: fix build regression
  2020-05-07 13:32     ` Brian Gerst
@ 2020-05-07 15:09       ` David Laight
  2020-05-07 19:31         ` hpa
  2020-05-07 19:29       ` hpa
  1 sibling, 1 reply; 45+ messages in thread
From: David Laight @ 2020-05-07 15:09 UTC (permalink / raw)
  To: 'Brian Gerst', H. Peter Anvin
  Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sedat Dilek, stable, Jesse Brandeburg, kernelci . org bot,
	Andy Shevchenko, Ilie Halip, the arch/x86 maintainers,
	Marco Elver, Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

From: Brian Gerst
> Sent: 07 May 2020 14:32
...
> I think the bug this worked around was that the compiler didn't detect
> that CONST_MASK(nr) was also constant and doesn't need to be put into
> a register.  The question is does that bug still exist on compiler
> versions we care about?

Hmmm...
That ought to have been fixed instead of worrying about the fact
that an invalid register was used.

Alternatively, is there any reason not to use the bts/btc instructions?
Yes, I know they'll do wider accesses, but variable bit numbers do.
It is also possible that the assembler will support constant bit
numbers >= 32 by adding to the address offset.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] x86: bitops: fix build regression
  2020-05-07 14:00     ` Brian Gerst
@ 2020-05-07 19:19       ` Nick Desaulniers
  2020-05-07 22:29         ` Nick Desaulniers
  0 siblings, 1 reply; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-07 19:19 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Peter Zijlstra, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sedat Dilek, stable, Jesse Brandeburg,
	kernelci . org bot, Andy Shevchenko, Ilie Halip,
	the arch/x86 maintainers, Marco Elver, Paul E. McKenney,
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On Thu, May 7, 2020 at 7:00 AM Brian Gerst <brgerst@gmail.com> wrote:
>
> This change will make sparse happy and allow these cleanups:
> #define CONST_MASK(nr)                 ((u8)1 << ((nr) & 7))

yep, this is more elegant, IMO.  Will send a v3 later with this
change.  Looking at the uses of CONST_MASK, I noticed
arch_change_bit() currently has the (u8) cast from commit
838e8bb71dc0c ("x86: Implement change_bit with immediate operand as
"lock xorb""), so that instance can get cleaned up with the above
suggestion.

-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] x86: bitops: fix build regression
  2020-05-07  6:18 ` [PATCH] " Brian Gerst
  2020-05-07  7:02   ` hpa
  2020-05-07  7:44   ` David Laight
@ 2020-05-07 19:22   ` Nick Desaulniers
  2 siblings, 0 replies; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-07 19:22 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sedat Dilek,
	stable, Jesse Brandeburg, kernelci . org bot, Andy Shevchenko,
	Ilie Halip, the arch/x86 maintainers, H. Peter Anvin,
	Marco Elver, Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On Wed, May 6, 2020 at 11:18 PM Brian Gerst <brgerst@gmail.com> wrote:
>
> I think a better fix would be to make CONST_MASK() return a u8 value
> rather than have to cast on every use.
>
> Also I question the need for the "q" constraint.  It was added in
> commit 437a0a54 as a workaround for GCC 3.4.4.  Now that the minimum
> GCC version is 4.6, is this still necessary?

TL;DR yes

ah, thanks for the git archeology, it's useful.  I don't think this is
a compiler bug, however, just the compiler being stricter in that the
`b` suffix on `orb` requires an 8-bit register operand.  For 32-bit
x86, the `q` asm constraint is required, because not all GPRs have
lower 8-bit register aliases, as Arnd found and HPA noted as well.

I like your suggested change to CONST_MASK, and will send that in a bit.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] x86: bitops: fix build regression
  2020-05-07 11:34   ` Peter Zijlstra
  2020-05-07 14:00     ` Brian Gerst
@ 2020-05-07 19:29     ` hpa
  1 sibling, 0 replies; 45+ messages in thread
From: hpa @ 2020-05-07 19:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sedat Dilek, stable, Jesse Brandeburg, kernelci . org bot,
	Andy Shevchenko, Ilie Halip, x86, Marco Elver, Paul E. McKenney,
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck, linux-kernel,
	clang-built-linux

On May 7, 2020 4:34:22 AM PDT, Peter Zijlstra <peterz@infradead.org> wrote:
>On Tue, May 05, 2020 at 11:07:24AM -0700, hpa@zytor.com wrote:
>> On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers
><ndesaulniers@google.com> wrote:
>
>> >@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long
>*addr)
>> > 	if (__builtin_constant_p(nr)) {
>> > 		asm volatile(LOCK_PREFIX "orb %1,%0"
>> > 			: CONST_MASK_ADDR(nr, addr)
>> >-			: "iq" (CONST_MASK(nr) & 0xff)
>> >+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
>> > 			: "memory");
>> > 	} else {
>> > 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
>> >@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long
>*addr)
>> > 	if (__builtin_constant_p(nr)) {
>> > 		asm volatile(LOCK_PREFIX "andb %1,%0"
>> > 			: CONST_MASK_ADDR(nr, addr)
>> >-			: "iq" (CONST_MASK(nr) ^ 0xff));
>> >+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
>> > 	} else {
>> > 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
>> > 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
>> 
>> Drop & 0xff and change ^ 0xff to ~.
>
>But then we're back to sparse being unhappy, no? The thing with ~ is
>that it will set high bits which will be truncated, which makes sparse
>sad.

In that case, sparse is just broken.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] x86: bitops: fix build regression
  2020-05-07 13:32     ` Brian Gerst
  2020-05-07 15:09       ` David Laight
@ 2020-05-07 19:29       ` hpa
  1 sibling, 0 replies; 45+ messages in thread
From: hpa @ 2020-05-07 19:29 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sedat Dilek, stable, Jesse Brandeburg, kernelci . org bot,
	Andy Shevchenko, Ilie Halip, the arch/x86 maintainers,
	Marco Elver, Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On May 7, 2020 6:32:24 AM PDT, Brian Gerst <brgerst@gmail.com> wrote:
>On Thu, May 7, 2020 at 3:02 AM <hpa@zytor.com> wrote:
>>
>> On May 6, 2020 11:18:09 PM PDT, Brian Gerst <brgerst@gmail.com>
>wrote:
>> >On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers
>> ><ndesaulniers@google.com> wrote:
>> >>
>> >> From: Sedat Dilek <sedat.dilek@gmail.com>
>> >>
>> >> It turns out that if your config tickles __builtin_constant_p via
>> >> differences in choices to inline or not, this now produces invalid
>> >> assembly:
>> >>
>> >> $ cat foo.c
>> >> long a(long b, long c) {
>> >>   asm("orb\t%1, %0" : "+q"(c): "r"(b));
>> >>   return c;
>> >> }
>> >> $ gcc foo.c
>> >> foo.c: Assembler messages:
>> >> foo.c:2: Error: `%rax' not allowed with `orb'
>> >>
>> >> The "q" constraint only has meaning on -m32; otherwise it is
>> >> treated as "r".
>> >>
>> >> This is easily reproducible via
>> >Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
>> >> or Clang+allyesconfig.
>> >>
>> >> Keep the masking operation to appease sparse (`make C=1`), and add
>> >> the cast back in order to select the proper 8-bit register alias.
>> >>
>> >>  [Nick: reworded]
>> >>
>> >> Cc: stable@vger.kernel.org
>> >> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
>> >> Link: https://github.com/ClangBuiltLinux/linux/issues/961
>> >> Link:
>> >https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
>> >> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved
>cast")
>> >> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
>> >> Reported-by: kernelci.org bot <bot@kernelci.org>
>> >> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
>> >> Suggested-by: Ilie Halip <ilie.halip@gmail.com>
>> >> Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
>> >> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
>> >> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
>> >> ---
>> >>  arch/x86/include/asm/bitops.h | 4 ++--
>> >>  1 file changed, 2 insertions(+), 2 deletions(-)
>> >>
>> >> diff --git a/arch/x86/include/asm/bitops.h
>> >b/arch/x86/include/asm/bitops.h
>> >> index b392571c1f1d..139122e5b25b 100644
>> >> --- a/arch/x86/include/asm/bitops.h
>> >> +++ b/arch/x86/include/asm/bitops.h
>> >> @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long
>*addr)
>> >>         if (__builtin_constant_p(nr)) {
>> >>                 asm volatile(LOCK_PREFIX "orb %1,%0"
>> >>                         : CONST_MASK_ADDR(nr, addr)
>> >> -                       : "iq" (CONST_MASK(nr) & 0xff)
>> >> +                       : "iq" ((u8)(CONST_MASK(nr) & 0xff))
>> >
>> >I think a better fix would be to make CONST_MASK() return a u8 value
>> >rather than have to cast on every use.
>> >
>> >Also I question the need for the "q" constraint.  It was added in
>> >commit 437a0a54 as a workaround for GCC 3.4.4.  Now that the minimum
>> >GCC version is 4.6, is this still necessary?
>> >
>> >--
>> >Brian Gerst
>>
>> Yes, "q" is needed on i386.
>
>I think the bug this worked around was that the compiler didn't detect
>that CONST_MASK(nr) was also constant and doesn't need to be put into
>a register.  The question is does that bug still exist on compiler
>versions we care about?
>
>--
>Brian Gerst

The compiler is free to do that, including for legitimate reasons (common subexpression elimination, especially). So yes.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH] x86: bitops: fix build regression
  2020-05-07 15:09       ` David Laight
@ 2020-05-07 19:31         ` hpa
  0 siblings, 0 replies; 45+ messages in thread
From: hpa @ 2020-05-07 19:31 UTC (permalink / raw)
  To: David Laight, 'Brian Gerst'
  Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sedat Dilek, stable, Jesse Brandeburg, kernelci . org bot,
	Andy Shevchenko, Ilie Halip, the arch/x86 maintainers,
	Marco Elver, Paul E. McKenney, Peter Zijlstra (Intel),
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On May 7, 2020 8:09:35 AM PDT, David Laight <David.Laight@ACULAB.COM> wrote:
>From: Brian Gerst
>> Sent: 07 May 2020 14:32
>...
>> I think the bug this worked around was that the compiler didn't detect
>> that CONST_MASK(nr) was also constant and doesn't need to be put into
>> a register.  The question is does that bug still exist on compiler
>> versions we care about?
>
>Hmmm...
>That ought to have been fixed instead of worrying about the fact
>that an invalid register was used.
>
>Alternatively, is there any reason not to use the bts/btc instructions?
>Yes, I know they'll do wider accesses, but variable bit numbers do.
>It is also possible that the assembler will support constant bit
>numbers >= 32 by adding to the address offset.
>
>	David
>
>-
>Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes,
>MK1 1PT, UK
>Registration No: 1397386 (Wales)

They're slower, and for unaligned locked fields can be severely so.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


* Re: [PATCH] x86: bitops: fix build regression
  2020-05-07 19:19       ` Nick Desaulniers
@ 2020-05-07 22:29         ` Nick Desaulniers
  2020-05-08  1:57           ` Brian Gerst
  0 siblings, 1 reply; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-07 22:29 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Peter Zijlstra, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sedat Dilek, stable, Jesse Brandeburg,
	kernelci . org bot, Andy Shevchenko, Ilie Halip,
	the arch/x86 maintainers, Marco Elver, Paul E. McKenney,
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On Thu, May 7, 2020 at 12:19 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Thu, May 7, 2020 at 7:00 AM Brian Gerst <brgerst@gmail.com> wrote:
> >
> > This change will make sparse happy and allow these cleanups:
> > #define CONST_MASK(nr)                 ((u8)1 << ((nr) & 7))
>
> yep, this is more elegant, IMO.  Will send a v3 later with this
> change.  Looking at the uses of CONST_MASK, I noticed
> arch_change_bit() currently has the (u8) cast from commit
> 838e8bb71dc0c ("x86: Implement change_bit with immediate operand as
> "lock xorb""), so that instance can get cleaned up with the above
> suggestion.

Oh, we need the cast to be the final operation.  The binary AND and
XOR in 2 of the 3 uses of CONST_MASK implicitly promote the operands
of the binary operator to int, so the type of the evaluated
subexpression is int.
https://wiki.sei.cmu.edu/confluence/display/c/EXP14-C.+Beware+of+integer+promotion+when+performing+bitwise+operations+on+integer+types+smaller+than+int
So I think this version (v2) is the most precise fix, and would be better
than defining more macros or (worse) using metaprogramming.
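The promotion is easy to observe with sizeof; a minimal sketch (CONST_MASK
here is a local stand-in for illustration, not the kernel's definition):

```c
#include <stddef.h>

/* Stand-in for the kernel's CONST_MASK(), for illustration only. */
#define CONST_MASK(nr) ((unsigned char)(1 << ((nr) & 7)))

/* The cast alone yields an 8-bit-wide expression... */
size_t mask_width(void)
{
	return sizeof(CONST_MASK(3));
}

/* ...but a binary & or ^ promotes both operands to int, so the
 * subexpression is int-typed again and the u8 cast is lost. */
size_t masked_width(void)
{
	return sizeof(CONST_MASK(3) & 0xff);
}
```

That is why the cast has to come after the masking operation, not inside
the macro alone.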
-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH] x86: bitops: fix build regression
  2020-05-07 22:29         ` Nick Desaulniers
@ 2020-05-08  1:57           ` Brian Gerst
  2020-05-08 17:21             ` Nick Desaulniers
  0 siblings, 1 reply; 45+ messages in thread
From: Brian Gerst @ 2020-05-08  1:57 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Peter Zijlstra, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sedat Dilek, stable, Jesse Brandeburg,
	kernelci . org bot, Andy Shevchenko, Ilie Halip,
	the arch/x86 maintainers, Marco Elver, Paul E. McKenney,
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On Thu, May 7, 2020 at 6:29 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Thu, May 7, 2020 at 12:19 PM Nick Desaulniers
> <ndesaulniers@google.com> wrote:
> >
> > On Thu, May 7, 2020 at 7:00 AM Brian Gerst <brgerst@gmail.com> wrote:
> > >
> > > This change will make sparse happy and allow these cleanups:
> > > #define CONST_MASK(nr)                 ((u8)1 << ((nr) & 7))
> >
> > yep, this is more elegant, IMO.  Will send a v3 later with this
> > change.  Looking at the uses of CONST_MASK, I noticed
> > arch_change_bit() currently has the (u8) cast from commit
> > 838e8bb71dc0c ("x86: Implement change_bit with immediate operand as
> > "lock xorb""), so that instance can get cleaned up with the above
> > suggestion.
>
> Oh, we need the cast to be the final operation.  The binary AND and
> XOR in 2 of the 3 uses of CONST_MASK implicitly promote the operands
> of the binary operator to int, so the type of the evaluated
> subexpression is int.
> https://wiki.sei.cmu.edu/confluence/display/c/EXP14-C.+Beware+of+integer+promotion+when+performing+bitwise+operations+on+integer+types+smaller+than+int
> So I think this version (v2) is the most precise fix, and would be better
> than defining more macros or (worse) using metaprogramming.

One last suggestion.  Add the "b" modifier to the mask operand: "orb
%b1, %0".  That forces the compiler to use the 8-bit register name
instead of trying to deduce the width from the input.

--
Brian Gerst


* Re: [PATCH] x86: bitops: fix build regression
  2020-05-08  1:57           ` Brian Gerst
@ 2020-05-08 17:21             ` Nick Desaulniers
  2020-05-08 17:31               ` H. Peter Anvin
  0 siblings, 1 reply; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-08 17:21 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Peter Zijlstra, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Sedat Dilek, stable, Jesse Brandeburg,
	kernelci . org bot, Andy Shevchenko, Ilie Halip,
	the arch/x86 maintainers, Marco Elver, Paul E. McKenney,
	Daniel Axtens, Masahiro Yamada, Luc Van Oostenryck,
	Linux Kernel Mailing List, clang-built-linux

On Thu, May 7, 2020 at 6:57 PM Brian Gerst <brgerst@gmail.com> wrote:
>
> On Thu, May 7, 2020 at 6:29 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > On Thu, May 7, 2020 at 12:19 PM Nick Desaulniers
> > <ndesaulniers@google.com> wrote:
> > >
> > > On Thu, May 7, 2020 at 7:00 AM Brian Gerst <brgerst@gmail.com> wrote:
> > > >
> > > > This change will make sparse happy and allow these cleanups:
> > > > #define CONST_MASK(nr)                 ((u8)1 << ((nr) & 7))
> > >
> > > yep, this is more elegant, IMO.  Will send a v3 later with this
> > > change.  Looking at the uses of CONST_MASK, I noticed
> > > arch_change_bit() currently has the (u8) cast from commit
> > > 838e8bb71dc0c ("x86: Implement change_bit with immediate operand as
> > > "lock xorb""), so that instance can get cleaned up with the above
> > > suggestion.
> >
> > Oh, we need the cast to be the final operation.  The binary AND and
> > XOR in 2 of the 3 uses of CONST_MASK implicitly promote the operands
> > of the binary operator to int, so the type of the evaluated
> > subexpression is int.
> > https://wiki.sei.cmu.edu/confluence/display/c/EXP14-C.+Beware+of+integer+promotion+when+performing+bitwise+operations+on+integer+types+smaller+than+int
> > So I think this version (v2) is the most precise fix, and would be better
> > than defining more macros or (worse) using metaprogramming.
>
> One last suggestion.  Add the "b" modifier to the mask operand: "orb
> %b1, %0".  That forces the compiler to use the 8-bit register name
> instead of trying to deduce the width from the input.

Ah right: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers

Looks like that works for both compilers.  In that case, we can likely
drop the `& 0xff`, too.  Let me play with that, then I'll hopefully
send a v3 today.
-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH] x86: bitops: fix build regression
  2020-05-08 17:21             ` Nick Desaulniers
@ 2020-05-08 17:31               ` H. Peter Anvin
  2020-05-08 18:05                 ` [PATCH v3] " Nick Desaulniers
  2020-05-10 11:59                 ` [PATCH] " David Laight
  0 siblings, 2 replies; 45+ messages in thread
From: H. Peter Anvin @ 2020-05-08 17:31 UTC (permalink / raw)
  To: Nick Desaulniers, Brian Gerst
  Cc: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sedat Dilek, stable, Jesse Brandeburg, kernelci . org bot,
	Andy Shevchenko, Ilie Halip, the arch/x86 maintainers,
	Marco Elver, Paul E. McKenney, Daniel Axtens, Masahiro Yamada,
	Luc Van Oostenryck, Linux Kernel Mailing List, clang-built-linux

On 2020-05-08 10:21, Nick Desaulniers wrote:
>>
>> One last suggestion.  Add the "b" modifier to the mask operand: "orb
>> %b1, %0".  That forces the compiler to use the 8-bit register name
>> instead of trying to deduce the width from the input.
> 
> Ah right: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> 
> Looks like that works for both compilers.  In that case, we can likely
> drop the `& 0xff`, too.  Let me play with that, then I'll hopefully
> send a v3 today.
> 

Good idea. I requested a while ago that they document these modifiers; they
chose not to document them all, which in some ways is good: it shows what
they are willing to commit to indefinitely.

	-hpa


* [PATCH v3] x86: bitops: fix build regression
  2020-05-08 17:31               ` H. Peter Anvin
@ 2020-05-08 18:05                 ` Nick Desaulniers
  2020-05-08 18:08                   ` Nick Desaulniers
  2020-05-08 18:22                   ` Brian Gerst
  2020-05-10 11:59                 ` [PATCH] " David Laight
  1 sibling, 2 replies; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-08 18:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Nick Desaulniers, Jesse Brandeburg, Sedat Dilek,
	kernelci . org bot, Andy Shevchenko, Brian Gerst,
	H . Peter Anvin, Ilie Halip, x86, Marco Elver, Paul E. McKenney,
	Andrey Ryabinin, Andrew Morton, Masahiro Yamada, Daniel Axtens,
	Luc Van Oostenryck, Peter Zijlstra (Intel),
	linux-kernel, clang-built-linux

This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce
invalid assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb\t%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32; otherwise it is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
Changes V2 -> V3:
* use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
* reword commit message.
* add Brian and HPA suggested by tags
* drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
  enough).

Changes V1 -> V2:
* change authorship/signed-off-by to Ilie
* add Nathan's Tested by/reviewed by
* update commit message slightly with info sent to HPA.

 arch/x86/include/asm/bitops.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..03e24286e4eb 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -52,9 +52,9 @@ static __always_inline void
 arch_set_bit(long nr, volatile unsigned long *addr)
 {
 	if (__builtin_constant_p(nr)) {
-		asm volatile(LOCK_PREFIX "orb %1,%0"
+		asm volatile(LOCK_PREFIX "orb %b1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" (CONST_MASK(nr))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -72,9 +72,9 @@ static __always_inline void
 arch_clear_bit(long nr, volatile unsigned long *addr)
 {
 	if (__builtin_constant_p(nr)) {
-		asm volatile(LOCK_PREFIX "andb %1,%0"
+		asm volatile(LOCK_PREFIX "andb %b1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" (~CONST_MASK(nr)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
-- 
2.26.2.645.ge9eca65c58-goog



* Re: [PATCH v3] x86: bitops: fix build regression
  2020-05-08 18:05                 ` [PATCH v3] " Nick Desaulniers
@ 2020-05-08 18:08                   ` Nick Desaulniers
  2020-05-08 18:22                   ` Brian Gerst
  1 sibling, 0 replies; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-08 18:08 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jesse Brandeburg, Sedat Dilek, kernelci . org bot,
	Andy Shevchenko, Brian Gerst, H . Peter Anvin, Ilie Halip,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Marco Elver, Paul E. McKenney, Andrey Ryabinin, Andrew Morton,
	Masahiro Yamada, Daniel Axtens, Luc Van Oostenryck,
	Peter Zijlstra (Intel),
	LKML, clang-built-linux

On Fri, May 8, 2020 at 11:06 AM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
>
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, these statements produce
> invalid assembly:
>
> $ cat foo.c
> long a(long b, long c) {
>   asm("orb\t%1, %0" : "+q"(c): "r"(b));
>   return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
>
> Use the `%b` "x86 Operand Modifier" to instead force register allocation
> to select a lower-8-bit GPR operand.
>
> The "q" constraint only has meaning on -m32; otherwise it is treated as
> "r". Not all GPRs have low-8-bit aliases for -m32.
>
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> Reported-by: kernelci.org bot <bot@kernelci.org>
> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> Suggested-by: Brian Gerst <brgerst@gmail.com>
> Suggested-by: H. Peter Anvin <hpa@zytor.com>
> Suggested-by: Ilie Halip <ilie.halip@gmail.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> ---
> Changes V2 -> V3:
> * use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
> * reword commit message.
> * add Brian and HPA suggested by tags
> * drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
>   enough).

Oh, and I took over authorship for this version.

>
> Changes V1 -> V2:
> * change authorship/signed-off-by to Ilie
> * add Nathan's Tested by/reviewed by
> * update commit message slightly with info sent to HPA.
>
>  arch/x86/include/asm/bitops.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..03e24286e4eb 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -52,9 +52,9 @@ static __always_inline void
>  arch_set_bit(long nr, volatile unsigned long *addr)
>  {
>         if (__builtin_constant_p(nr)) {
> -               asm volatile(LOCK_PREFIX "orb %1,%0"
> +               asm volatile(LOCK_PREFIX "orb %b1,%0"
>                         : CONST_MASK_ADDR(nr, addr)
> -                       : "iq" (CONST_MASK(nr) & 0xff)
> +                       : "iq" (CONST_MASK(nr))
>                         : "memory");
>         } else {
>                 asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> @@ -72,9 +72,9 @@ static __always_inline void
>  arch_clear_bit(long nr, volatile unsigned long *addr)
>  {
>         if (__builtin_constant_p(nr)) {
> -               asm volatile(LOCK_PREFIX "andb %1,%0"
> +               asm volatile(LOCK_PREFIX "andb %b1,%0"
>                         : CONST_MASK_ADDR(nr, addr)
> -                       : "iq" (CONST_MASK(nr) ^ 0xff));
> +                       : "iq" (~CONST_MASK(nr)));
>         } else {
>                 asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
>                         : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> --
> 2.26.2.645.ge9eca65c58-goog
>


-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH v3] x86: bitops: fix build regression
  2020-05-08 18:05                 ` [PATCH v3] " Nick Desaulniers
  2020-05-08 18:08                   ` Nick Desaulniers
@ 2020-05-08 18:22                   ` Brian Gerst
  2020-05-08 18:28                     ` [PATCH v4] " Nick Desaulniers
  1 sibling, 1 reply; 45+ messages in thread
From: Brian Gerst @ 2020-05-08 18:22 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Jesse Brandeburg,
	Sedat Dilek, kernelci . org bot, Andy Shevchenko,
	H . Peter Anvin, Ilie Halip, the arch/x86 maintainers,
	Marco Elver, Paul E. McKenney, Andrey Ryabinin, Andrew Morton,
	Masahiro Yamada, Daniel Axtens, Luc Van Oostenryck,
	Peter Zijlstra (Intel),
	Linux Kernel Mailing List, clang-built-linux

On Fri, May 8, 2020 at 2:06 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
>
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, these statements produce
> invalid assembly:
>
> $ cat foo.c
> long a(long b, long c) {
>   asm("orb\t%1, %0" : "+q"(c): "r"(b));
>   return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
>
> Use the `%b` "x86 Operand Modifier" to instead force register allocation
> to select a lower-8-bit GPR operand.
>
> The "q" constraint only has meaning on -m32; otherwise it is treated as
> "r". Not all GPRs have low-8-bit aliases for -m32.
>
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> Reported-by: kernelci.org bot <bot@kernelci.org>
> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> Suggested-by: Brian Gerst <brgerst@gmail.com>
> Suggested-by: H. Peter Anvin <hpa@zytor.com>
> Suggested-by: Ilie Halip <ilie.halip@gmail.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> ---
> Changes V2 -> V3:
> * use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
> * reword commit message.
> * add Brian and HPA suggested by tags
> * drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
>   enough).
>
> Changes V1 -> V2:
> * change authorship/signed-off-by to Ilie
> * add Nathan's Tested by/reviewed by
> * update commit message slightly with info sent to HPA.
>
>  arch/x86/include/asm/bitops.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..03e24286e4eb 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -52,9 +52,9 @@ static __always_inline void
>  arch_set_bit(long nr, volatile unsigned long *addr)
>  {
>         if (__builtin_constant_p(nr)) {
> -               asm volatile(LOCK_PREFIX "orb %1,%0"
> +               asm volatile(LOCK_PREFIX "orb %b1,%0"
>                         : CONST_MASK_ADDR(nr, addr)
> -                       : "iq" (CONST_MASK(nr) & 0xff)
> +                       : "iq" (CONST_MASK(nr))
>                         : "memory");
>         } else {
>                 asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> @@ -72,9 +72,9 @@ static __always_inline void
>  arch_clear_bit(long nr, volatile unsigned long *addr)
>  {
>         if (__builtin_constant_p(nr)) {
> -               asm volatile(LOCK_PREFIX "andb %1,%0"
> +               asm volatile(LOCK_PREFIX "andb %b1,%0"
>                         : CONST_MASK_ADDR(nr, addr)
> -                       : "iq" (CONST_MASK(nr) ^ 0xff));
> +                       : "iq" (~CONST_MASK(nr)));
>         } else {
>                 asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
>                         : : RLONG_ADDR(addr), "Ir" (nr) : "memory");

arch_change_bit() should also be changed, but otherwise looks good.

--
Brian Gerst


* [PATCH v4] x86: bitops: fix build regression
  2020-05-08 18:22                   ` Brian Gerst
@ 2020-05-08 18:28                     ` Nick Desaulniers
  2020-05-08 18:32                       ` [PATCH v5] " Nick Desaulniers
  0 siblings, 1 reply; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-08 18:28 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Nick Desaulniers, Jesse Brandeburg, Sedat Dilek,
	kernelci . org bot, Andy Shevchenko, Brian Gerst,
	H . Peter Anvin, Ilie Halip, x86, Marco Elver, Paul E. McKenney,
	Andrew Morton, Daniel Axtens, Masahiro Yamada,
	Luc Van Oostenryck, Peter Zijlstra (Intel),
	linux-kernel, clang-built-linux

This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce
invalid assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb\t%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32; otherwise it is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
Changes V3 -> V4:
* drop (u8) cast from arch_change_bit() as well.

Changes V2 -> V3:
* use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
* reword commit message.
* add Brian and HPA suggested by tags
* drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
  enough).
* Take over authorship.

Changes V1 -> V2:
* change authorship/signed-off-by to Ilie
* add Nathan's Tested by/reviewed by
* update commit message slightly with info sent to HPA.

 arch/x86/include/asm/bitops.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..8a8b7bb9677b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -52,9 +52,9 @@ static __always_inline void
 arch_set_bit(long nr, volatile unsigned long *addr)
 {
 	if (__builtin_constant_p(nr)) {
-		asm volatile(LOCK_PREFIX "orb %1,%0"
+		asm volatile(LOCK_PREFIX "orb %b1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" (CONST_MASK(nr))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -72,9 +72,9 @@ static __always_inline void
 arch_clear_bit(long nr, volatile unsigned long *addr)
 {
 	if (__builtin_constant_p(nr)) {
-		asm volatile(LOCK_PREFIX "andb %1,%0"
+		asm volatile(LOCK_PREFIX "andb %b1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" (~CONST_MASK(nr)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
@@ -125,7 +125,7 @@ arch_change_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "xorb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" ((u8)CONST_MASK(nr)));
+			: "iq" (CONST_MASK(nr)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btc) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5] x86: bitops: fix build regression
  2020-05-08 18:28                     ` [PATCH v4] " Nick Desaulniers
@ 2020-05-08 18:32                       ` Nick Desaulniers
  2020-05-08 20:28                         ` Nathan Chancellor
                                           ` (3 more replies)
  0 siblings, 4 replies; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-08 18:32 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Nick Desaulniers, Jesse Brandeburg, Sedat Dilek,
	kernelci . org bot, Andy Shevchenko, Brian Gerst,
	H . Peter Anvin, Ilie Halip, x86, Marco Elver, Paul E. McKenney,
	Andrey Ryabinin, Luc Van Oostenryck, Andrew Morton,
	Masahiro Yamada, Daniel Axtens, Peter Zijlstra (Intel),
	linux-kernel, clang-built-linux

This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce
invalid assembly:

$ cat foo.c
long a(long b, long c) {
  asm("orb\t%1, %0" : "+q"(c): "r"(b));
  return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32; otherwise it is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Ilie Halip <ilie.halip@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
Changes V4 -> V5:
* actually use `%b` in arch_change_bit().

Changes V3 -> V4:
* drop (u8) cast from arch_change_bit() as well.

Changes V2 -> V3:
* use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
* reword commit message.
* add Brian and HPA suggested by tags
* drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
  enough).
* Take over authorship.

Changes V1 -> V2:
* change authorship/signed-off-by to Ilie
* add Nathan's Tested by/reviewed by
* update commit message slightly with info sent to HPA.

 arch/x86/include/asm/bitops.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..35460fef39b8 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -52,9 +52,9 @@ static __always_inline void
 arch_set_bit(long nr, volatile unsigned long *addr)
 {
 	if (__builtin_constant_p(nr)) {
-		asm volatile(LOCK_PREFIX "orb %1,%0"
+		asm volatile(LOCK_PREFIX "orb %b1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" (CONST_MASK(nr))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -72,9 +72,9 @@ static __always_inline void
 arch_clear_bit(long nr, volatile unsigned long *addr)
 {
 	if (__builtin_constant_p(nr)) {
-		asm volatile(LOCK_PREFIX "andb %1,%0"
+		asm volatile(LOCK_PREFIX "andb %b1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" (~CONST_MASK(nr)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
@@ -123,9 +123,9 @@ static __always_inline void
 arch_change_bit(long nr, volatile unsigned long *addr)
 {
 	if (__builtin_constant_p(nr)) {
-		asm volatile(LOCK_PREFIX "xorb %1,%0"
+		asm volatile(LOCK_PREFIX "xorb %b1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" ((u8)CONST_MASK(nr)));
+			: "iq" (CONST_MASK(nr)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btc) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
-- 
2.26.2.645.ge9eca65c58-goog



* Re: [PATCH v5] x86: bitops: fix build regression
  2020-05-08 18:32                       ` [PATCH v5] " Nick Desaulniers
@ 2020-05-08 20:28                         ` Nathan Chancellor
  2020-05-08 23:47                           ` Jesse Brandeburg
  2020-05-09 12:20                         ` Andy Shevchenko
                                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 45+ messages in thread
From: Nathan Chancellor @ 2020-05-08 20:28 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Jesse Brandeburg,
	Sedat Dilek, kernelci . org bot, Andy Shevchenko, Brian Gerst,
	H . Peter Anvin, Ilie Halip, x86, Marco Elver, Paul E. McKenney,
	Andrey Ryabinin, Luc Van Oostenryck, Andrew Morton,
	Masahiro Yamada, Daniel Axtens, Peter Zijlstra (Intel),
	linux-kernel, clang-built-linux

On Fri, May 08, 2020 at 11:32:29AM -0700, Nick Desaulniers wrote:
> This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
> 
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, these statements produce
> invalid assembly:
> 
> $ cat foo.c
> long a(long b, long c) {
>   asm("orb\t%1, %0" : "+q"(c): "r"(b));
>   return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
> 
> Use the `%b` "x86 Operand Modifier" to instead force register allocation
> to select a lower-8-bit GPR operand.
> 
> The "q" constraint only has meaning on -m32; otherwise it is treated as
> "r". Not all GPRs have low-8-bit aliases on -m32.
> 
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> Reported-by: kernelci.org bot <bot@kernelci.org>
> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> Suggested-by: Brian Gerst <brgerst@gmail.com>
> Suggested-by: H. Peter Anvin <hpa@zytor.com>
> Suggested-by: Ilie Halip <ilie.halip@gmail.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com> # build, clang-11

> ---
> Changes V4 -> V5:
> * actually use `%b` in arch_change_bit().
> 
> Changes V3 -> V4:
> * drop (u8) cast from arch_change_bit() as well.
> 
> Changes V2 -> V3:
> * use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
> * reword commit message.
> * add Brian and HPA suggested by tags
> * drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
>   enough).
> * Take over authorship.
> 
> Changes V1 -> V2:
> * change authorship/signed-off-by to Ilie
> * add Nathan's Tested by/reviewed by
> * update commit message slightly with info sent to HPA.
>  arch/x86/include/asm/bitops.h | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..35460fef39b8 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -52,9 +52,9 @@ static __always_inline void
>  arch_set_bit(long nr, volatile unsigned long *addr)
>  {
>  	if (__builtin_constant_p(nr)) {
> -		asm volatile(LOCK_PREFIX "orb %1,%0"
> +		asm volatile(LOCK_PREFIX "orb %b1,%0"
>  			: CONST_MASK_ADDR(nr, addr)
> -			: "iq" (CONST_MASK(nr) & 0xff)
> +			: "iq" (CONST_MASK(nr))
>  			: "memory");
>  	} else {
>  		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> @@ -72,9 +72,9 @@ static __always_inline void
>  arch_clear_bit(long nr, volatile unsigned long *addr)
>  {
>  	if (__builtin_constant_p(nr)) {
> -		asm volatile(LOCK_PREFIX "andb %1,%0"
> +		asm volatile(LOCK_PREFIX "andb %b1,%0"
>  			: CONST_MASK_ADDR(nr, addr)
> -			: "iq" (CONST_MASK(nr) ^ 0xff));
> +			: "iq" (~CONST_MASK(nr)));
>  	} else {
>  		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
>  			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> @@ -123,9 +123,9 @@ static __always_inline void
>  arch_change_bit(long nr, volatile unsigned long *addr)
>  {
>  	if (__builtin_constant_p(nr)) {
> -		asm volatile(LOCK_PREFIX "xorb %1,%0"
> +		asm volatile(LOCK_PREFIX "xorb %b1,%0"
>  			: CONST_MASK_ADDR(nr, addr)
> -			: "iq" ((u8)CONST_MASK(nr)));
> +			: "iq" (CONST_MASK(nr)));
>  	} else {
>  		asm volatile(LOCK_PREFIX __ASM_SIZE(btc) " %1,%0"
>  			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> -- 
> 2.26.2.645.ge9eca65c58-goog
> 


* Re: [PATCH v5] x86: bitops: fix build regression
  2020-05-08 20:28                         ` Nathan Chancellor
@ 2020-05-08 23:47                           ` Jesse Brandeburg
  2020-05-09  4:44                             ` Sedat Dilek
  0 siblings, 1 reply; 45+ messages in thread
From: Jesse Brandeburg @ 2020-05-08 23:47 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sedat Dilek, kernelci . org bot, Andy Shevchenko, Brian Gerst,
	H . Peter Anvin, Ilie Halip, x86, Marco Elver, Paul E. McKenney,
	Andrey Ryabinin, Luc Van Oostenryck, Andrew Morton,
	Masahiro Yamada, Daniel Axtens, Peter Zijlstra (Intel),
	linux-kernel, clang-built-linux

On Fri, 8 May 2020 13:28:35 -0700
Nathan Chancellor <natechancellor@gmail.com> wrote:

> On Fri, May 08, 2020 at 11:32:29AM -0700, Nick Desaulniers wrote:
> > Use the `%b` "x86 Operand Modifier" to instead force register allocation
> > to select a lower-8-bit GPR operand.

This looks OK to me, I appreciate the work done to find the right
fix and clean up the code while not breaking sparse! I had a look at
the assembly from gcc 9.3.1 and it looks good. Thanks!

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>


* Re: [PATCH v5] x86: bitops: fix build regression
  2020-05-08 23:47                           ` Jesse Brandeburg
@ 2020-05-09  4:44                             ` Sedat Dilek
  0 siblings, 0 replies; 45+ messages in thread
From: Sedat Dilek @ 2020-05-09  4:44 UTC (permalink / raw)
  To: Jesse Brandeburg
  Cc: Nathan Chancellor, Nick Desaulniers, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, kernelci . org bot,
	Andy Shevchenko, Brian Gerst, H . Peter Anvin, Ilie Halip, x86,
	Marco Elver, Paul E. McKenney, Andrey Ryabinin,
	Luc Van Oostenryck, Andrew Morton, Masahiro Yamada,
	Daniel Axtens, Peter Zijlstra (Intel),
	linux-kernel, Clang-Built-Linux ML

On Sat, May 9, 2020 at 1:47 AM Jesse Brandeburg
<jesse.brandeburg@intel.com> wrote:
>
> On Fri, 8 May 2020 13:28:35 -0700
> Nathan Chancellor <natechancellor@gmail.com> wrote:
>
> > On Fri, May 08, 2020 at 11:32:29AM -0700, Nick Desaulniers wrote:
> > > Use the `%b` "x86 Operand Modifier" to instead force register allocation
> > > to select a lower-8-bit GPR operand.
>
> This looks OK to me, I appreciate the work done to find the right
> fix and clean up the code while not breaking sparse! I had a look at
> the assembly from gcc 9.3.1 and it looks good. Thanks!
>
> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
>

Tested v5 on Debian/testing AMD64 with a selfmade llvm-toolchain
(LLVM/Clang/LLD) v10.0.1+git92d5c1be9ee9

Please add:

     Tested-by: Sedat Dilek <sedat.dilek@gmail.com>

For details see
<https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-626104287>

Thanks to all involved people.

- Sedat -


* Re: [PATCH v5] x86: bitops: fix build regression
  2020-05-08 18:32                       ` [PATCH v5] " Nick Desaulniers
  2020-05-08 20:28                         ` Nathan Chancellor
@ 2020-05-09 12:20                         ` Andy Shevchenko
  2020-05-09 15:43                           ` Brian Gerst
  2020-05-11 17:22                           ` Nick Desaulniers
  2020-05-10 13:54                         ` David Laight
  2020-05-11 18:52                         ` Brian Gerst
  3 siblings, 2 replies; 45+ messages in thread
From: Andy Shevchenko @ 2020-05-09 12:20 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Jesse Brandeburg,
	Sedat Dilek, kernelci . org bot, Andy Shevchenko, Brian Gerst,
	H . Peter Anvin, Ilie Halip,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Marco Elver, Paul E. McKenney, Andrey Ryabinin,
	Luc Van Oostenryck, Andrew Morton, Masahiro Yamada,
	Daniel Axtens, Peter Zijlstra (Intel),
	Linux Kernel Mailing List, clang-built-linux

On Fri, May 8, 2020 at 9:35 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
>
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, these statements produce
> invalid assembly:
>
> $ cat foo.c
> long a(long b, long c) {
>   asm("orb\t%1, %0" : "+q"(c): "r"(b));
>   return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
>
> Use the `%b` "x86 Operand Modifier" to instead force register allocation
> to select a lower-8-bit GPR operand.
>
> The "q" constraint only has meaning on -m32; otherwise it is treated as
> "r". Not all GPRs have low-8-bit aliases on -m32.
>

Looks very good!
One question though: does it work with the minimum supported version of gcc?

> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> Reported-by: kernelci.org bot <bot@kernelci.org>
> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> Suggested-by: Brian Gerst <brgerst@gmail.com>
> Suggested-by: H. Peter Anvin <hpa@zytor.com>
> Suggested-by: Ilie Halip <ilie.halip@gmail.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> ---
> Changes V4 -> V5:
> * actually use `%b` in arch_change_bit().
>
> Changes V3 -> V4:
> * drop (u8) cast from arch_change_bit() as well.
>
> Changes V2 -> V3:
> * use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
> * reword commit message.
> * add Brian and HPA suggested by tags
> * drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
>   enough).
> * Take over authorship.
>
> Changes V1 -> V2:
> * change authorship/signed-off-by to Ilie
> * add Nathan's Tested by/reviewed by
> * update commit message slightly with info sent to HPA.
>  arch/x86/include/asm/bitops.h | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..35460fef39b8 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -52,9 +52,9 @@ static __always_inline void
>  arch_set_bit(long nr, volatile unsigned long *addr)
>  {
>         if (__builtin_constant_p(nr)) {
> -               asm volatile(LOCK_PREFIX "orb %1,%0"
> +               asm volatile(LOCK_PREFIX "orb %b1,%0"
>                         : CONST_MASK_ADDR(nr, addr)
> -                       : "iq" (CONST_MASK(nr) & 0xff)
> +                       : "iq" (CONST_MASK(nr))
>                         : "memory");
>         } else {
>                 asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> @@ -72,9 +72,9 @@ static __always_inline void
>  arch_clear_bit(long nr, volatile unsigned long *addr)
>  {
>         if (__builtin_constant_p(nr)) {
> -               asm volatile(LOCK_PREFIX "andb %1,%0"
> +               asm volatile(LOCK_PREFIX "andb %b1,%0"
>                         : CONST_MASK_ADDR(nr, addr)
> -                       : "iq" (CONST_MASK(nr) ^ 0xff));
> +                       : "iq" (~CONST_MASK(nr)));
>         } else {
>                 asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
>                         : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> @@ -123,9 +123,9 @@ static __always_inline void
>  arch_change_bit(long nr, volatile unsigned long *addr)
>  {
>         if (__builtin_constant_p(nr)) {
> -               asm volatile(LOCK_PREFIX "xorb %1,%0"
> +               asm volatile(LOCK_PREFIX "xorb %b1,%0"
>                         : CONST_MASK_ADDR(nr, addr)
> -                       : "iq" ((u8)CONST_MASK(nr)));
> +                       : "iq" (CONST_MASK(nr)));
>         } else {
>                 asm volatile(LOCK_PREFIX __ASM_SIZE(btc) " %1,%0"
>                         : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> --
> 2.26.2.645.ge9eca65c58-goog
>


-- 
With Best Regards,
Andy Shevchenko


* Re: [PATCH v5] x86: bitops: fix build regression
  2020-05-09 12:20                         ` Andy Shevchenko
@ 2020-05-09 15:43                           ` Brian Gerst
  2020-05-11 17:22                           ` Nick Desaulniers
  1 sibling, 0 replies; 45+ messages in thread
From: Brian Gerst @ 2020-05-09 15:43 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Jesse Brandeburg, Sedat Dilek, kernelci . org bot,
	Andy Shevchenko, H . Peter Anvin, Ilie Halip,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Marco Elver, Paul E. McKenney, Andrey Ryabinin,
	Luc Van Oostenryck, Andrew Morton, Masahiro Yamada,
	Daniel Axtens, Peter Zijlstra (Intel),
	Linux Kernel Mailing List, clang-built-linux

On Sat, May 9, 2020 at 8:20 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> On Fri, May 8, 2020 at 9:35 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
> >
> > It turns out that if your config tickles __builtin_constant_p via
> > differences in choices to inline or not, these statements produce
> > invalid assembly:
> >
> > $ cat foo.c
> > long a(long b, long c) {
> >   asm("orb\t%1, %0" : "+q"(c): "r"(b));
> >   return c;
> > }
> > $ gcc foo.c
> > foo.c: Assembler messages:
> > foo.c:2: Error: `%rax' not allowed with `orb'
> >
> > Use the `%b` "x86 Operand Modifier" to instead force register allocation
> > to select a lower-8-bit GPR operand.
> >
> > The "q" constraint only has meaning on -m32; otherwise it is treated as
> > "r". Not all GPRs have low-8-bit aliases on -m32.
> >
>
> Looks very good!
> One question though: does it work with the minimum supported version of gcc?

Yes.  The operand width modifiers have been around a long time but were
not well documented until recently.

--
Brian Gerst


* RE: [PATCH] x86: bitops: fix build regression
  2020-05-08 17:31               ` H. Peter Anvin
  2020-05-08 18:05                 ` [PATCH v3] " Nick Desaulniers
@ 2020-05-10 11:59                 ` David Laight
  2020-05-10 12:33                   ` hpa
  1 sibling, 1 reply; 45+ messages in thread
From: David Laight @ 2020-05-10 11:59 UTC (permalink / raw)
  To: 'H. Peter Anvin', Nick Desaulniers, Brian Gerst
  Cc: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sedat Dilek, stable, Jesse Brandeburg, kernelci . org bot,
	Andy Shevchenko, Ilie Halip, the arch/x86 maintainers,
	Marco Elver, Paul E. McKenney, Daniel Axtens, Masahiro Yamada,
	Luc Van Oostenryck, Linux Kernel Mailing List, clang-built-linux

From: Peter Anvin
> Sent: 08 May 2020 18:32
> On 2020-05-08 10:21, Nick Desaulniers wrote:
> >>
> >> One last suggestion.  Add the "b" modifier to the mask operand: "orb
> >> %b1, %0".  That forces the compiler to use the 8-bit register name
> >> instead of trying to deduce the width from the input.
> >
> > Ah right: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> >
> > Looks like that works for both compilers.  In that case, we can likely
> > drop the `& 0xff`, too.  Let me play with that, then I'll hopefully
> > send a v3 today.
> >
> 
> Good idea. I requested a while ago that they document these modifiers; they
> chose not to document them all which in some ways is good; it shows what they
> are willing to commit to indefinitely.

I thought the intention here was to explicitly do a byte access.
If the constant bit number has had a div/mod by 8 done on it then
the address can be misaligned - so you mustn't do a non-byte sized
locked access.

OTOH the original base address must be aligned.

Looking at some instruction timing, BTS/BTR aren't too bad if the
bit number is a constant. But are 6 or 7 clocks slower if it is in %cl.
Given these are locked RMW bus cycles they'll always be slow!

How about an asm multi-part alternative that uses a byte offset
and byte constant if the compiler thinks the mask is constant,
or a 4-byte offset and 32-bit mask if it doesn't?

The other alternative is to just use BTS/BTR and (maybe) rely on the
assembler to add in the word offset to the base address.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH] x86: bitops: fix build regression
  2020-05-10 11:59                 ` [PATCH] " David Laight
@ 2020-05-10 12:33                   ` hpa
  0 siblings, 0 replies; 45+ messages in thread
From: hpa @ 2020-05-10 12:33 UTC (permalink / raw)
  To: David Laight, Nick Desaulniers, Brian Gerst
  Cc: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Sedat Dilek, stable, Jesse Brandeburg, kernelci . org bot,
	Andy Shevchenko, Ilie Halip, the arch/x86 maintainers,
	Marco Elver, Paul E. McKenney, Daniel Axtens, Masahiro Yamada,
	Luc Van Oostenryck, Linux Kernel Mailing List, clang-built-linux

On May 10, 2020 4:59:17 AM PDT, David Laight <David.Laight@ACULAB.COM> wrote:
>From: Peter Anvin
>> Sent: 08 May 2020 18:32
>> On 2020-05-08 10:21, Nick Desaulniers wrote:
>> >>
>> >> One last suggestion.  Add the "b" modifier to the mask operand:
>"orb
>> >> %b1, %0".  That forces the compiler to use the 8-bit register name
>> >> instead of trying to deduce the width from the input.
>> >
>> > Ah right:
>https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
>> >
>> > Looks like that works for both compilers.  In that case, we can
>likely
>> > drop the `& 0xff`, too.  Let me play with that, then I'll hopefully
>> > send a v3 today.
>> >
>> 
>> Good idea. I requested a while ago that they document these
>modifiers; they
>> chose not to document them all which in some ways is good; it shows
>what they
>> are willing to commit to indefinitely.
>
>I thought the intention here was to explicitly do a byte access.
>If the constant bit number has had a div/mod by 8 done on it then
>the address can be misaligned - so you mustn't do a non-byte sized
>locked access.
>
>OTOH the original base address must be aligned.
>
>Looking at some instruction timing, BTS/BTR aren't too bad if the
>bit number is a constant. But are 6 or 7 clocks slower if it is in %cl.
>Given these are locked RMW bus cycles they'll always be slow!
>
>How about an asm multi-part alternative that uses a byte offset
>and byte constant if the compiler thinks the mask is constant
>or a 4-byte offset and 32bit mask if it doesn't.
>
>The other alternative is to just use BTS/BTR and (maybe) rely on the
>assembler to add in the word offset to the base address.
>
>	David
>
>-
>Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes,
>MK1 1PT, UK
>Registration No: 1397386 (Wales)

I don't understand what you are getting at here.

The intent is to do a byte access. The "multi-part asm" you are talking about is also already there...
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


* RE: [PATCH v5] x86: bitops: fix build regression
  2020-05-08 18:32                       ` [PATCH v5] " Nick Desaulniers
  2020-05-08 20:28                         ` Nathan Chancellor
  2020-05-09 12:20                         ` Andy Shevchenko
@ 2020-05-10 13:54                         ` David Laight
  2020-05-11 18:52                         ` Brian Gerst
  3 siblings, 0 replies; 45+ messages in thread
From: David Laight @ 2020-05-10 13:54 UTC (permalink / raw)
  To: 'Nick Desaulniers',
	Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jesse Brandeburg, Sedat Dilek, kernelci . org bot,
	Andy Shevchenko, Brian Gerst, H . Peter Anvin, Ilie Halip, x86,
	Marco Elver, Paul E. McKenney, Andrey Ryabinin,
	Luc Van Oostenryck, Andrew Morton, Masahiro Yamada,
	Daniel Axtens, Peter Zijlstra (Intel),
	linux-kernel, clang-built-linux

From: Nick Desaulniers
> Sent: 08 May 2020 19:32
..
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, these statements produce
> invalid assembly:
...
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..35460fef39b8 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -52,9 +52,9 @@ static __always_inline void
>  arch_set_bit(long nr, volatile unsigned long *addr)
>  {
>  	if (__builtin_constant_p(nr)) {
> -		asm volatile(LOCK_PREFIX "orb %1,%0"
> +		asm volatile(LOCK_PREFIX "orb %b1,%0"
>  			: CONST_MASK_ADDR(nr, addr)
> -			: "iq" (CONST_MASK(nr) & 0xff)
> +			: "iq" (CONST_MASK(nr))
>  			: "memory");

What happens if CONST_MASK() is changed to:
#define CONST_MASK_(n) (n == 0 ? 1 : n == 1 ? 2 : n ....)
#define CONST_MASK(n) CONST_MASK_(((n) & 7))

and a separate definition for the inverse mask.

The lack of arithmetic promotion may mean that only the "i"
constraint is needed.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
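
Spelled out, David's ternary-chain idea might look like the sketch below
(hypothetical, not the kernel's actual definition): each constant bit number
folds at compile time to a single value already in the 0x01..0x80 range, so
there is nothing for integer promotion to widen and no cast or masking to
silence sparse:

```c
/* Illustrative expansion of the suggested CONST_MASK_()/CONST_MASK();
 * a separate inverse macro covers the andb (clear_bit) case.
 */
#define CONST_MASK_(n) ((n) == 0 ? 0x01 : (n) == 1 ? 0x02 : \
			(n) == 2 ? 0x04 : (n) == 3 ? 0x08 : \
			(n) == 4 ? 0x10 : (n) == 5 ? 0x20 : \
			(n) == 6 ? 0x40 : 0x80)
#define CONST_MASK(n)	CONST_MASK_((n) & 7)
#define CONST_MASK_NOT(n) (CONST_MASK(n) ^ 0xff)
```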


* Re: [PATCH v5] x86: bitops: fix build regression
  2020-05-09 12:20                         ` Andy Shevchenko
  2020-05-09 15:43                           ` Brian Gerst
@ 2020-05-11 17:22                           ` Nick Desaulniers
  1 sibling, 0 replies; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-11 17:22 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Jesse Brandeburg,
	Sedat Dilek, kernelci . org bot, Andy Shevchenko, Brian Gerst,
	H . Peter Anvin, Ilie Halip,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Marco Elver, Paul E. McKenney, Andrey Ryabinin,
	Luc Van Oostenryck, Andrew Morton, Masahiro Yamada,
	Daniel Axtens, Peter Zijlstra (Intel),
	Linux Kernel Mailing List, clang-built-linux

On Sat, May 9, 2020 at 5:20 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> On Fri, May 8, 2020 at 9:35 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
> >
> > It turns out that if your config tickles __builtin_constant_p via
> > differences in choices to inline or not, these statements produce
> > invalid assembly:
> >
> > $ cat foo.c
> > long a(long b, long c) {
> >   asm("orb\t%1, %0" : "+q"(c): "r"(b));
> >   return c;
> > }
> > $ gcc foo.c
> > foo.c: Assembler messages:
> > foo.c:2: Error: `%rax' not allowed with `orb'
> >
> > Use the `%b` "x86 Operand Modifier" to instead force register allocation
> > to select a lower-8-bit GPR operand.
> >
> > The "q" constraint only has meaning on -m32; otherwise it is treated as
> > "r". Not all GPRs have low-8-bit aliases on -m32.
> >
>
> Looks very good!
> One question though: does it work with the minimum supported version of gcc?

Yes; the oldest version of GCC on godbolt.org is GCC 4.1.2, which
supports the syntax and generates the expected lower-8-bit register
operands.
-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH v5] x86: bitops: fix build regression
  2020-05-08 18:32                       ` [PATCH v5] " Nick Desaulniers
                                           ` (2 preceding siblings ...)
  2020-05-10 13:54                         ` David Laight
@ 2020-05-11 18:52                         ` Brian Gerst
  2020-05-14 23:47                           ` Nick Desaulniers
  3 siblings, 1 reply; 45+ messages in thread
From: Brian Gerst @ 2020-05-11 18:52 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Jesse Brandeburg,
	Sedat Dilek, kernelci . org bot, Andy Shevchenko,
	H . Peter Anvin, Ilie Halip, the arch/x86 maintainers,
	Marco Elver, Paul E. McKenney, Andrey Ryabinin,
	Luc Van Oostenryck, Andrew Morton, Masahiro Yamada,
	Daniel Axtens, Peter Zijlstra (Intel),
	Linux Kernel Mailing List, clang-built-linux

On Fri, May 8, 2020 at 2:32 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
>
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, these statements produce
> invalid assembly:
>
> $ cat foo.c
> long a(long b, long c) {
>   asm("orb\t%1, %0" : "+q"(c): "r"(b));
>   return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
>
> Use the `%b` "x86 Operand Modifier" to instead force register allocation
> to select a lower-8-bit GPR operand.
>
> The "q" constraint only has meaning on -m32; otherwise it is treated as
> "r". Not all GPRs have low-8-bit aliases on -m32.
>
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> Reported-by: kernelci.org bot <bot@kernelci.org>
> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> Suggested-by: Brian Gerst <brgerst@gmail.com>
> Suggested-by: H. Peter Anvin <hpa@zytor.com>
> Suggested-by: Ilie Halip <ilie.halip@gmail.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Reviewed-By: Brian Gerst <brgerst@gmail.com>


* Re: [PATCH v5] x86: bitops: fix build regression
  2020-05-11 18:52                         ` Brian Gerst
@ 2020-05-14 23:47                           ` Nick Desaulniers
  0 siblings, 0 replies; 45+ messages in thread
From: Nick Desaulniers @ 2020-05-14 23:47 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Jesse Brandeburg, Sedat Dilek, kernelci . org bot,
	Andy Shevchenko, H . Peter Anvin, Ilie Halip,
	the arch/x86 maintainers, Marco Elver, Paul E. McKenney,
	Andrey Ryabinin, Luc Van Oostenryck, Andrew Morton,
	Masahiro Yamada, Daniel Axtens, Peter Zijlstra (Intel),
	Linux Kernel Mailing List, clang-built-linux, Brian Gerst

Bumping for this to get reviewed+picked up.  Fixes a regression I
would prefer didn't ship in 5.7.

On Mon, May 11, 2020 at 11:52 AM Brian Gerst <brgerst@gmail.com> wrote:
>
> On Fri, May 8, 2020 at 2:32 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
> >
> > It turns out that if your config tickles __builtin_constant_p via
> > differences in choices to inline or not, these statements produce
> > invalid assembly:
> >
> > $ cat foo.c
> > long a(long b, long c) {
> >   asm("orb\t%1, %0" : "+q"(c): "r"(b));
> >   return c;
> > }
> > $ gcc foo.c
> > foo.c: Assembler messages:
> > foo.c:2: Error: `%rax' not allowed with `orb'
> >
> > Use the `%b` "x86 Operand Modifier" to instead force register allocation
> > to select a lower-8-bit GPR operand.
> >
> > The "q" constraint only has meaning on -m32; otherwise it is treated as
> > "r". Not all GPRs have low-8-bit aliases on -m32.
> >
> > Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> > Link: https://github.com/ClangBuiltLinux/linux/issues/961
> > Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
> > Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> > Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> > Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> > Reported-by: kernelci.org bot <bot@kernelci.org>
> > Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> > Suggested-by: Brian Gerst <brgerst@gmail.com>
> > Suggested-by: H. Peter Anvin <hpa@zytor.com>
> > Suggested-by: Ilie Halip <ilie.halip@gmail.com>
> > Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
>
> Reviewed-By: Brian Gerst <brgerst@gmail.com>



-- 
Thanks,
~Nick Desaulniers

