linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86: support i386 with Clang
@ 2020-05-04 23:03 Nick Desaulniers
  2020-05-11 17:23 ` Nick Desaulniers
  0 siblings, 1 reply; 14+ messages in thread
From: Nick Desaulniers @ 2020-05-04 23:03 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Nick Desaulniers, Arnd Bergmann, David Woodhouse, Dmitry Golovin,
	Linus Torvalds, Dennis Zhou, Tejun Heo, Christoph Lameter, x86,
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, linux-kernel, clang-built-linux

GCC and Clang are architecturally different, which leads to subtle
issues for code that's invalid but clearly dead. This can happen with
code that emulates polymorphism with the preprocessor and sizeof.

GCC will perform semantic analysis after early inlining and dead code
elimination, so it will not warn on invalid code that's dead. Clang
strictly performs optimizations after semantic analysis, so it will warn
for dead code.

Neither Clang nor GCC like this very much with -m32:

long long ret;
asm ("movb $5, %0" : "=q" (ret));

However, GCC can tolerate this variant:

long long ret;
switch (sizeof(ret)) {
case 1:
        asm ("movb $5, %0" : "=q" (ret));
        break;
case 8:;
}

Clang, on the other hand, won't accept that because it validates the
inline asm for the '1' case *before* the optimisation phase where it
realises that it wouldn't have to emit it anyway.

If LLVM (Clang's "back end") fails such as during instruction selection
or register allocation, it cannot provide accurate diagnostics
(warnings/errors) that contain line information, as the AST has been
discarded from memory at that point.

While there have been early discussions about having C/C++ specific
language optimizations in Clang via the use of MLIR, which would enable
such earlier optimizations, such work is not scoped and likely a
multi-year endeavor.

We also don't want to swap the use of "=q" with "=r". For 64b, it
doesn't matter. For 32b, it's possible that a 32b register without a 8b
lower alias (i.e. ESI, EDI, EBP) is selected which the assembler will
then reject.

With this, Clang can finally build an i386 defconfig.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: David Woodhouse <dwmw2@infradead.org>
Reported-by: Dmitry Golovin <dima@golovin.in>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://bugs.llvm.org/show_bug.cgi?id=33587
Link: https://github.com/ClangBuiltLinux/linux/issues/3
Link: https://github.com/ClangBuiltLinux/linux/issues/194
Link: https://github.com/ClangBuiltLinux/linux/issues/781
Link: https://lore.kernel.org/lkml/20180209161833.4605-1-dwmw2@infradead.org/
Link: https://lore.kernel.org/lkml/CAK8P3a1EBaWdbAEzirFDSgHVJMtWjuNt2HGG8z+vpXeNHwETFQ@mail.gmail.com/
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
Note: this is a resend of:
https://lore.kernel.org/lkml/20180209161833.4605-1-dwmw2@infradead.org/
rebased on today's Linux next, and with the additional change to
uaccess.h.

I'm happy to resend with authorship and reported by tags changed to
suggested by's or whatever, just let me know.

Part of the commit message is stolen from David, and part from Linus.
Shall I resend with David's authorship and
[Nick: reworded]
???

I don't really care, I just don't really want to carry this out of tree
for our CI, which is green for i386 otherwise.


 arch/x86/include/asm/percpu.h  | 12 ++++++++----
 arch/x86/include/asm/uaccess.h |  4 +++-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 2278797c769d..826d086f71c9 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -99,7 +99,7 @@ do {							\
 	case 1:						\
 		asm qual (op "b %1,"__percpu_arg(0)	\
 		    : "+m" (var)			\
-		    : "qi" ((pto_T__)(val)));		\
+		    : "qi" ((unsigned char)(unsigned long)(val))); \
 		break;					\
 	case 2:						\
 		asm qual (op "w %1,"__percpu_arg(0)	\
@@ -144,7 +144,7 @@ do {									\
 		else							\
 			asm qual ("addb %1, "__percpu_arg(0)		\
 			    : "+m" (var)				\
-			    : "qi" ((pao_T__)(val)));			\
+			    : "qi" ((unsigned char)(unsigned long)(val))); \
 		break;							\
 	case 2:								\
 		if (pao_ID__ == 1)					\
@@ -182,12 +182,14 @@ do {									\
 
 #define percpu_from_op(qual, op, var)			\
 ({							\
+	unsigned char pfo_u8__;				\
 	typeof(var) pfo_ret__;				\
 	switch (sizeof(var)) {				\
 	case 1:						\
 		asm qual (op "b "__percpu_arg(1)",%0"	\
-		    : "=q" (pfo_ret__)			\
+		    : "=q" (pfo_u8__)			\
 		    : "m" (var));			\
+		pfo_ret__ = (typeof(var))(unsigned long)pfo_u8__;	\
 		break;					\
 	case 2:						\
 		asm qual (op "w "__percpu_arg(1)",%0"	\
@@ -211,12 +213,14 @@ do {									\
 
 #define percpu_stable_op(op, var)			\
 ({							\
+	unsigned char pfo_u8__;				\
 	typeof(var) pfo_ret__;				\
 	switch (sizeof(var)) {				\
 	case 1:						\
 		asm(op "b "__percpu_arg(P1)",%0"	\
-		    : "=q" (pfo_ret__)			\
+		    : "=q" (pfo_u8__)			\
 		    : "p" (&(var)));			\
+		pfo_ret__ = (typeof(var))(unsigned long)pfo_u8__;	\
 		break;					\
 	case 2:						\
 		asm(op "w "__percpu_arg(P1)",%0"	\
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index d8f283b9a569..cf8483cd80e1 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -314,11 +314,13 @@ do {									\
 
 #define __get_user_size(x, ptr, size, retval)				\
 do {									\
+	unsigned char x_u8__;						\
 	retval = 0;							\
 	__chk_user_ptr(ptr);						\
 	switch (size) {							\
 	case 1:								\
-		__get_user_asm(x, ptr, retval, "b", "=q");		\
+		__get_user_asm(x_u8__, ptr, retval, "b", "=q");		\
+		(x) = x_u8__;						\
 		break;							\
 	case 2:								\
 		__get_user_asm(x, ptr, retval, "w", "=r");		\
-- 
2.26.2.526.g744177e7f7-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-04 23:03 [PATCH] x86: support i386 with Clang Nick Desaulniers
@ 2020-05-11 17:23 ` Nick Desaulniers
  2020-05-11 18:09   ` Brian Gerst
  2020-05-11 18:12   ` Linus Torvalds
  0 siblings, 2 replies; 14+ messages in thread
From: Nick Desaulniers @ 2020-05-11 17:23 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Woodhouse,
	Arnd Bergmann, Linus Torvalds
  Cc: Dmitry Golovin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

Bumping for comment+review.

On Mon, May 4, 2020 at 4:03 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> GCC and Clang are architecturally different, which leads to subtle
> issues for code that's invalid but clearly dead. This can happen with
> code that emulates polymorphism with the preprocessor and sizeof.
>
> GCC will perform semantic analysis after early inlining and dead code
> elimination, so it will not warn on invalid code that's dead. Clang
> strictly performs optimizations after semantic analysis, so it will warn
> for dead code.
>
> Neither Clang nor GCC like this very much with -m32:
>
> long long ret;
> asm ("movb $5, %0" : "=q" (ret));
>
> However, GCC can tolerate this variant:
>
> long long ret;
> switch (sizeof(ret)) {
> case 1:
>         asm ("movb $5, %0" : "=q" (ret));
>         break;
> case 8:;
> }
>
> Clang, on the other hand, won't accept that because it validates the
> inline asm for the '1' case *before* the optimisation phase where it
> realises that it wouldn't have to emit it anyway.
>
> If LLVM (Clang's "back end") fails such as during instruction selection
> or register allocation, it cannot provide accurate diagnostics
> (warnings/errors) that contain line information, as the AST has been
> discarded from memory at that point.
>
> While there have been early discussions about having C/C++ specific
> language optimizations in Clang via the use of MLIR, which would enable
> such earlier optimizations, such work is not scoped and likely a
> multi-year endeavor.
>
> We also don't want to swap the use of "=q" with "=r". For 64b, it
> doesn't matter. For 32b, it's possible that a 32b register without a 8b
> lower alias (i.e. ESI, EDI, EBP) is selected which the assembler will
> then reject.
>
> With this, Clang can finally build an i386 defconfig.
>
> Reported-by: Arnd Bergmann <arnd@arndb.de>
> Reported-by: David Woodhouse <dwmw2@infradead.org>
> Reported-by: Dmitry Golovin <dima@golovin.in>
> Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> Link: https://bugs.llvm.org/show_bug.cgi?id=33587
> Link: https://github.com/ClangBuiltLinux/linux/issues/3
> Link: https://github.com/ClangBuiltLinux/linux/issues/194
> Link: https://github.com/ClangBuiltLinux/linux/issues/781
> Link: https://lore.kernel.org/lkml/20180209161833.4605-1-dwmw2@infradead.org/
> Link: https://lore.kernel.org/lkml/CAK8P3a1EBaWdbAEzirFDSgHVJMtWjuNt2HGG8z+vpXeNHwETFQ@mail.gmail.com/
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> ---
> Note: this is a resend of:
> https://lore.kernel.org/lkml/20180209161833.4605-1-dwmw2@infradead.org/
> rebased on today's Linux next, and with the additional change to
> uaccess.h.
>
> I'm happy to resend with authorship and reported by tags changed to
> suggested by's or whatever, just let me know.
>
> Part of the commit message is stolen from David, and part from Linus.
> Shall I resend with David's authorship and
> [Nick: reworded]
> ???
>
> I don't really care, I just don't really want to carry this out of tree
> for our CI, which is green for i386 otherwise.
>
>
>  arch/x86/include/asm/percpu.h  | 12 ++++++++----
>  arch/x86/include/asm/uaccess.h |  4 +++-
>  2 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
> index 2278797c769d..826d086f71c9 100644
> --- a/arch/x86/include/asm/percpu.h
> +++ b/arch/x86/include/asm/percpu.h
> @@ -99,7 +99,7 @@ do {                                                  \
>         case 1:                                         \
>                 asm qual (op "b %1,"__percpu_arg(0)     \
>                     : "+m" (var)                        \
> -                   : "qi" ((pto_T__)(val)));           \
> +                   : "qi" ((unsigned char)(unsigned long)(val))); \
>                 break;                                  \
>         case 2:                                         \
>                 asm qual (op "w %1,"__percpu_arg(0)     \
> @@ -144,7 +144,7 @@ do {                                                                        \
>                 else                                                    \
>                         asm qual ("addb %1, "__percpu_arg(0)            \
>                             : "+m" (var)                                \
> -                           : "qi" ((pao_T__)(val)));                   \
> +                           : "qi" ((unsigned char)(unsigned long)(val))); \
>                 break;                                                  \
>         case 2:                                                         \
>                 if (pao_ID__ == 1)                                      \
> @@ -182,12 +182,14 @@ do {                                                                      \
>
>  #define percpu_from_op(qual, op, var)                  \
>  ({                                                     \
> +       unsigned char pfo_u8__;                         \
>         typeof(var) pfo_ret__;                          \
>         switch (sizeof(var)) {                          \
>         case 1:                                         \
>                 asm qual (op "b "__percpu_arg(1)",%0"   \
> -                   : "=q" (pfo_ret__)                  \
> +                   : "=q" (pfo_u8__)                   \
>                     : "m" (var));                       \
> +               pfo_ret__ = (typeof(var))(unsigned long)pfo_u8__;       \
>                 break;                                  \
>         case 2:                                         \
>                 asm qual (op "w "__percpu_arg(1)",%0"   \
> @@ -211,12 +213,14 @@ do {                                                                      \
>
>  #define percpu_stable_op(op, var)                      \
>  ({                                                     \
> +       unsigned char pfo_u8__;                         \
>         typeof(var) pfo_ret__;                          \
>         switch (sizeof(var)) {                          \
>         case 1:                                         \
>                 asm(op "b "__percpu_arg(P1)",%0"        \
> -                   : "=q" (pfo_ret__)                  \
> +                   : "=q" (pfo_u8__)                   \
>                     : "p" (&(var)));                    \
> +               pfo_ret__ = (typeof(var))(unsigned long)pfo_u8__;       \
>                 break;                                  \
>         case 2:                                         \
>                 asm(op "w "__percpu_arg(P1)",%0"        \
> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
> index d8f283b9a569..cf8483cd80e1 100644
> --- a/arch/x86/include/asm/uaccess.h
> +++ b/arch/x86/include/asm/uaccess.h
> @@ -314,11 +314,13 @@ do {                                                                      \
>
>  #define __get_user_size(x, ptr, size, retval)                          \
>  do {                                                                   \
> +       unsigned char x_u8__;                                           \
>         retval = 0;                                                     \
>         __chk_user_ptr(ptr);                                            \
>         switch (size) {                                                 \
>         case 1:                                                         \
> -               __get_user_asm(x, ptr, retval, "b", "=q");              \
> +               __get_user_asm(x_u8__, ptr, retval, "b", "=q");         \
> +               (x) = x_u8__;                                           \
>                 break;                                                  \
>         case 2:                                                         \
>                 __get_user_asm(x, ptr, retval, "w", "=r");              \
> --
> 2.26.2.526.g744177e7f7-goog
>


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-11 17:23 ` Nick Desaulniers
@ 2020-05-11 18:09   ` Brian Gerst
  2020-05-11 18:46     ` Nick Desaulniers
  2020-05-11 18:12   ` Linus Torvalds
  1 sibling, 1 reply; 14+ messages in thread
From: Brian Gerst @ 2020-05-11 18:09 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Woodhouse,
	Arnd Bergmann, Linus Torvalds, Dmitry Golovin, Dennis Zhou,
	Tejun Heo, Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

On Mon, May 11, 2020 at 1:26 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> Bumping for comment+review.
>
> On Mon, May 4, 2020 at 4:03 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > GCC and Clang are architecturally different, which leads to subtle
> > issues for code that's invalid but clearly dead. This can happen with
> > code that emulates polymorphism with the preprocessor and sizeof.
> >
> > GCC will perform semantic analysis after early inlining and dead code
> > elimination, so it will not warn on invalid code that's dead. Clang
> > strictly performs optimizations after semantic analysis, so it will warn
> > for dead code.
> >
> > Neither Clang nor GCC like this very much with -m32:
> >
> > long long ret;
> > asm ("movb $5, %0" : "=q" (ret));
> >
> > However, GCC can tolerate this variant:
> >
> > long long ret;
> > switch (sizeof(ret)) {
> > case 1:
> >         asm ("movb $5, %0" : "=q" (ret));
> >         break;
> > case 8:;
> > }
> >
> > Clang, on the other hand, won't accept that because it validates the
> > inline asm for the '1' case *before* the optimisation phase where it
> > realises that it wouldn't have to emit it anyway.
> >
> > If LLVM (Clang's "back end") fails such as during instruction selection
> > or register allocation, it cannot provide accurate diagnostics
> > (warnings/errors) that contain line information, as the AST has been
> > discarded from memory at that point.
> >
> > While there have been early discussions about having C/C++ specific
> > language optimizations in Clang via the use of MLIR, which would enable
> > such earlier optimizations, such work is not scoped and likely a
> > multi-year endeavor.
> >
> > We also don't want to swap the use of "=q" with "=r". For 64b, it
> > doesn't matter. For 32b, it's possible that a 32b register without a 8b
> > lower alias (i.e. ESI, EDI, EBP) is selected which the assembler will
> > then reject.
> >
> > With this, Clang can finally build an i386 defconfig.
> >
> > Reported-by: Arnd Bergmann <arnd@arndb.de>
> > Reported-by: David Woodhouse <dwmw2@infradead.org>
> > Reported-by: Dmitry Golovin <dima@golovin.in>
> > Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> > Link: https://bugs.llvm.org/show_bug.cgi?id=33587
> > Link: https://github.com/ClangBuiltLinux/linux/issues/3
> > Link: https://github.com/ClangBuiltLinux/linux/issues/194
> > Link: https://github.com/ClangBuiltLinux/linux/issues/781
> > Link: https://lore.kernel.org/lkml/20180209161833.4605-1-dwmw2@infradead.org/
> > Link: https://lore.kernel.org/lkml/CAK8P3a1EBaWdbAEzirFDSgHVJMtWjuNt2HGG8z+vpXeNHwETFQ@mail.gmail.com/
> > Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> > ---
> > Note: this is a resend of:
> > https://lore.kernel.org/lkml/20180209161833.4605-1-dwmw2@infradead.org/
> > rebased on today's Linux next, and with the additional change to
> > uaccess.h.
> >
> > I'm happy to resend with authorship and reported by tags changed to
> > suggested by's or whatever, just let me know.
> >
> > Part of the commit message is stolen from David, and part from Linus.
> > Shall I resend with David's authorship and
> > [Nick: reworded]
> > ???
> >
> > I don't really care, I just don't really want to carry this out of tree
> > for our CI, which is green for i386 otherwise.
> >
> >
> >  arch/x86/include/asm/percpu.h  | 12 ++++++++----
> >  arch/x86/include/asm/uaccess.h |  4 +++-
> >  2 files changed, 11 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
> > index 2278797c769d..826d086f71c9 100644
> > --- a/arch/x86/include/asm/percpu.h
> > +++ b/arch/x86/include/asm/percpu.h
> > @@ -99,7 +99,7 @@ do {                                                  \
> >         case 1:                                         \
> >                 asm qual (op "b %1,"__percpu_arg(0)     \
> >                     : "+m" (var)                        \
> > -                   : "qi" ((pto_T__)(val)));           \
> > +                   : "qi" ((unsigned char)(unsigned long)(val))); \
> >                 break;                                  \
> >         case 2:                                         \
> >                 asm qual (op "w %1,"__percpu_arg(0)     \
> > @@ -144,7 +144,7 @@ do {                                                                        \
> >                 else                                                    \
> >                         asm qual ("addb %1, "__percpu_arg(0)            \
> >                             : "+m" (var)                                \
> > -                           : "qi" ((pao_T__)(val)));                   \
> > +                           : "qi" ((unsigned char)(unsigned long)(val))); \
> >                 break;                                                  \
> >         case 2:                                                         \
> >                 if (pao_ID__ == 1)                                      \
> > @@ -182,12 +182,14 @@ do {                                                                      \
> >
> >  #define percpu_from_op(qual, op, var)                  \
> >  ({                                                     \
> > +       unsigned char pfo_u8__;                         \
> >         typeof(var) pfo_ret__;                          \
> >         switch (sizeof(var)) {                          \
> >         case 1:                                         \
> >                 asm qual (op "b "__percpu_arg(1)",%0"   \
> > -                   : "=q" (pfo_ret__)                  \
> > +                   : "=q" (pfo_u8__)                   \
> >                     : "m" (var));                       \
> > +               pfo_ret__ = (typeof(var))(unsigned long)pfo_u8__;       \
> >                 break;                                  \
> >         case 2:                                         \
> >                 asm qual (op "w "__percpu_arg(1)",%0"   \
> > @@ -211,12 +213,14 @@ do {                                                                      \
> >
> >  #define percpu_stable_op(op, var)                      \
> >  ({                                                     \
> > +       unsigned char pfo_u8__;                         \
> >         typeof(var) pfo_ret__;                          \
> >         switch (sizeof(var)) {                          \
> >         case 1:                                         \
> >                 asm(op "b "__percpu_arg(P1)",%0"        \
> > -                   : "=q" (pfo_ret__)                  \
> > +                   : "=q" (pfo_u8__)                   \
> >                     : "p" (&(var)));                    \
> > +               pfo_ret__ = (typeof(var))(unsigned long)pfo_u8__;       \
> >                 break;                                  \
> >         case 2:                                         \
> >                 asm(op "w "__percpu_arg(P1)",%0"        \
> > diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
> > index d8f283b9a569..cf8483cd80e1 100644
> > --- a/arch/x86/include/asm/uaccess.h
> > +++ b/arch/x86/include/asm/uaccess.h
> > @@ -314,11 +314,13 @@ do {                                                                      \
> >
> >  #define __get_user_size(x, ptr, size, retval)                          \
> >  do {                                                                   \
> > +       unsigned char x_u8__;                                           \
> >         retval = 0;                                                     \
> >         __chk_user_ptr(ptr);                                            \
> >         switch (size) {                                                 \
> >         case 1:                                                         \
> > -               __get_user_asm(x, ptr, retval, "b", "=q");              \
> > +               __get_user_asm(x_u8__, ptr, retval, "b", "=q");         \
> > +               (x) = x_u8__;                                           \
> >                 break;                                                  \
> >         case 2:                                                         \
> >                 __get_user_asm(x, ptr, retval, "w", "=r");              \
> > --
> > 2.26.2.526.g744177e7f7-goog
> >
>
>
> --
> Thanks,
> ~Nick Desaulniers

This looks like the same issue that we just discussed for bitops.h.
Add the "b" operand size modifier to force it to use the 8-bit
register names (and probably also needs the "w" modifier in the 16-bit
case).

--
Brian Gerst

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-11 17:23 ` Nick Desaulniers
  2020-05-11 18:09   ` Brian Gerst
@ 2020-05-11 18:12   ` Linus Torvalds
  2020-05-11 18:24     ` Linus Torvalds
  1 sibling, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2020-05-11 18:12 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Woodhouse,
	Arnd Bergmann, Dmitry Golovin, Dennis Zhou, Tejun Heo,
	Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

On Mon, May 11, 2020 at 10:24 AM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> Bumping for comment+review.
>
> On Mon, May 4, 2020 at 4:03 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > -                   : "qi" ((pto_T__)(val)));           \
> > +                   : "qi" ((unsigned char)(unsigned long)(val))); \

I find the extra casts really annoying and illogical.

I kind of see why they happen: just casting to 'unsigned char' can
cause warnings, when this case isn't even taken, and the sizeof(var)
is something else (and you cast a pointer to a char without going
through the 'unsigned long' first.

But that doesn't make this change any more sensible or readable.

Would using "__builtin_choose_expr()" be able to avoid this whole issue?

Because I'd rather re-write the whole thing with a different model
that doesn't have this issue, than make the code look insane.

And it does look insane, because that whole thing makes for a very
natural question: "why do we cast to unsigned long and then to char,
when we just checked that the size of the type is 1"?

                 Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-11 18:12   ` Linus Torvalds
@ 2020-05-11 18:24     ` Linus Torvalds
  2020-05-11 18:36       ` Linus Torvalds
  2020-05-11 19:52       ` Nick Desaulniers
  0 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2020-05-11 18:24 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Woodhouse,
	Arnd Bergmann, Dmitry Golovin, Dennis Zhou, Tejun Heo,
	Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

[-- Attachment #1: Type: text/plain, Size: 612 bytes --]

On Mon, May 11, 2020 at 11:12 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Would using "__builtin_choose_expr()" be able to avoid this whole issue?

We actually have a fair amount of "pick expression based on size", so
with a few helper macros we could make the code look better than the
case statements too.

Something (ENTIRELY UNTESTED!) like the attached patch, perhaps?

NOTE! I only converted one single use to that "pick_size_xyz()" model.
If this actually works for clang too, we could do the others.

I guess I should just test it, since I have that clang tree.

                  Linus

[-- Attachment #2: patch --]
[-- Type: application/octet-stream, Size: 1855 bytes --]

 arch/x86/include/asm/percpu.h | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 2278797c769d..b479a0e650e5 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -86,6 +86,17 @@
 /* For arch-specific code, we can use direct single-insn ops (they
  * don't give an lvalue though). */
 extern void __bad_percpu_size(void);
+extern void __bad_expr_size(void);
+
+#define pick_type_expression(x, e8, e16, e32, e64)	\
+	__builtin_choose_expr(sizeof(x)==1, e8,		\
+	__builtin_choose_expr(sizeof(x)==2, e16,	\
+	__builtin_choose_expr(sizeof(x)==4, e32,	\
+	__builtin_choose_expr(sizeof(x)==8, e64,	\
+	__bad_expr_size()))))
+
+#define pick_type_statement(x, s8, s16, s32, s64) \
+	pick_type_expression(x, ({s8;0;}), ({s16;0;}),({s32;0;}),({s64;0;}))
 
 #define percpu_to_op(qual, op, var, val)		\
 do {							\
@@ -95,29 +106,19 @@ do {							\
 		pto_tmp__ = (val);			\
 		(void)pto_tmp__;			\
 	}						\
-	switch (sizeof(var)) {				\
-	case 1:						\
+	pick_type_statement(var,			\
 		asm qual (op "b %1,"__percpu_arg(0)	\
 		    : "+m" (var)			\
-		    : "qi" ((pto_T__)(val)));		\
-		break;					\
-	case 2:						\
+		    : "qi" ((pto_T__)(val))),		\
 		asm qual (op "w %1,"__percpu_arg(0)	\
 		    : "+m" (var)			\
-		    : "ri" ((pto_T__)(val)));		\
-		break;					\
-	case 4:						\
+		    : "ri" ((pto_T__)(val))),		\
 		asm qual (op "l %1,"__percpu_arg(0)	\
 		    : "+m" (var)			\
-		    : "ri" ((pto_T__)(val)));		\
-		break;					\
-	case 8:						\
+		    : "ri" ((pto_T__)(val))),		\
 		asm qual (op "q %1,"__percpu_arg(0)	\
 		    : "+m" (var)			\
-		    : "re" ((pto_T__)(val)));		\
-		break;					\
-	default: __bad_percpu_size();			\
-	}						\
+		    : "re" ((pto_T__)(val))));		\
 } while (0)
 
 /*

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-11 18:24     ` Linus Torvalds
@ 2020-05-11 18:36       ` Linus Torvalds
  2020-05-11 19:52       ` Nick Desaulniers
  1 sibling, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2020-05-11 18:36 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Woodhouse,
	Arnd Bergmann, Dmitry Golovin, Dennis Zhou, Tejun Heo,
	Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

On Mon, May 11, 2020 at 11:24 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I guess I should just test it, since I have that clang tree.

No, clang doesn't seem to handle it even with __builtin_choose_expr(),
and has that

     invalid input size for constraint 'qi'

even when it's in a side that is never chosen.

Very annoying. A lot of our magic macros are literally about "pick one
case when the others are not valid for this type".

             Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-11 18:09   ` Brian Gerst
@ 2020-05-11 18:46     ` Nick Desaulniers
  2020-05-11 19:34       ` Brian Gerst
  0 siblings, 1 reply; 14+ messages in thread
From: Nick Desaulniers @ 2020-05-11 18:46 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Woodhouse,
	Arnd Bergmann, Linus Torvalds, Dmitry Golovin, Dennis Zhou,
	Tejun Heo, Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

On Mon, May 11, 2020 at 11:09 AM Brian Gerst <brgerst@gmail.com> wrote:
> This looks like the same issue that we just discussed for bitops.h.
> Add the "b" operand size modifier to force it to use the 8-bit
> register names (and probably also needs the "w" modifier in the 16-bit
> case).

While it does feel familiar, it is slightly different.
https://godbolt.org/z/Rme4Zg
That case was both compilers validating the inline asm, yet generating
assembly that the assembler would choke on.  This case is validation
in the front end failing.

Side note: would you mind sending a review by tag for v5 of that patch
if you think it's good to go?  It does fix a regression I'd prefer
didn't ship in 5.7.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-11 18:46     ` Nick Desaulniers
@ 2020-05-11 19:34       ` Brian Gerst
  2020-05-11 20:18         ` Nick Desaulniers
  2020-05-11 22:54         ` Brian Gerst
  0 siblings, 2 replies; 14+ messages in thread
From: Brian Gerst @ 2020-05-11 19:34 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Woodhouse,
	Arnd Bergmann, Linus Torvalds, Dmitry Golovin, Dennis Zhou,
	Tejun Heo, Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

On Mon, May 11, 2020 at 2:46 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Mon, May 11, 2020 at 11:09 AM Brian Gerst <brgerst@gmail.com> wrote:
> > This looks like the same issue that we just discussed for bitops.h.
> > Add the "b" operand size modifier to force it to use the 8-bit
> > register names (and probably also needs the "w" modifier in the 16-bit
> > case).
>
> While it does feel familiar, it is slightly different.
> https://godbolt.org/z/Rme4Zg
> That case was both compilers validating the inline asm, yet generating
> assembly that the assembler would choke on.  This case is validation
> in the front end failing.

> long long ret;
> switch (sizeof(ret)) {
> case 1:
>         asm ("movb $5, %0" : "=q" (ret));
>         break;
> case 8:;
> }

So if the issue here is that the output variable type is long long,
what code is using a 64-bit percpu variable on a 32-bit kernel?  Can
you give a specific file that fails to build with Clang?  If Clang is
choking on it it may be silently miscompiling on GCC.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-11 18:24     ` Linus Torvalds
  2020-05-11 18:36       ` Linus Torvalds
@ 2020-05-11 19:52       ` Nick Desaulniers
  2020-05-11 20:01         ` Linus Torvalds
  1 sibling, 1 reply; 14+ messages in thread
From: Nick Desaulniers @ 2020-05-11 19:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Woodhouse,
	Arnd Bergmann, Dmitry Golovin, Dennis Zhou, Tejun Heo,
	Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

On Mon, May 11, 2020 at 11:24 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, May 11, 2020 at 11:12 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Would using "__builtin_choose_expr()" be able to avoid this whole issue?
>
> We actually have a fair amount of "pick expression based on size", so
> with a few helper macros we could make the code look better than the
> case statements too.
>
> Something (ENTIRELY UNTESTED!) like the attached patch, perhaps?
>
> NOTE! I only converted one single use to that "pick_size_xyz()" model.
> If this actually works for clang too, we could do the others.
>
> I guess I should just test it, since I have that clang tree.

Interesting approach.  Researching __builtin_choose_expr, it looks
like it was cited as prior art for C11's _Generic keyword.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1404.htm

I'm kind of tempted now to play with _Generic and see if that makes a
difference, though I'm not hopeful.  Without checking, I don't know if
that will produce warnings with `-std=gnu89`.

...

I'm playing around with _Generic now to see if that's a possible
solution, but it looks like it can't be used for selecting inline asm
constraint strings.  But maybe there's another way to use _Generic
here.  TODO: more testing.

I don't quite understand the use of GNU C statement expressions in
pick_type_statement, though I'm guessing the return value is important
somewhere.  Maybe just looking at the preprocessed source would make
it clearer.

> "why do we cast to unsigned long and then to char,
> when we just checked that the size of the type is 1"?

Deja vu.  I don't remember who I discussed that with, maybe Arnd, but
that was something else I had asked at some point.  I must have
forgotten to look into it more before sending the patch.  Can likely
drop that at least.

--
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-11 19:52       ` Nick Desaulniers
@ 2020-05-11 20:01         ` Linus Torvalds
  2020-05-11 20:03           ` David Woodhouse
  0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2020-05-11 20:01 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Woodhouse,
	Arnd Bergmann, Dmitry Golovin, Dennis Zhou, Tejun Heo,
	Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

On Mon, May 11, 2020 at 12:52 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> Interesting approach.  Researching __builtin_choose_expr, it looks
> like it was cited as prior art for C11's _Generic keyword.

Well, the thing that made me think that __builtin_choose_expr() would
work is that unlike the switch statement, you absolutely _have_ to do
the choice in the front end. You can't leave it as some kind of
optimization for later phases, because the choice od expression ends
up also determining the type of the result, so it isn't just a local
choice - it affects everything around that expression.

But clang still doesn't like that "qi" constraint with a (non-chosen)
expression that has a "u64" type.

I guess we can take the stupid extra cast, but I think it would at
least need a comment (maybe through a helper function) about why "qi"
needs it, but "ri" does not, and why the cast to "unsigned long" is
needed, even though "clearly" the type is already just 8 bits.

Otherwise somebody will just remove that "obviously pointless" cast,
and gcc will eat the result happily, and clang will fail.

And nobody will notice for a while anyway, because this issue only
happens on 32-bit targets, and developers don't use those any more.

                 Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-11 20:01         ` Linus Torvalds
@ 2020-05-11 20:03           ` David Woodhouse
  2020-05-12 20:35             ` Nick Desaulniers
  0 siblings, 1 reply; 14+ messages in thread
From: David Woodhouse @ 2020-05-11 20:03 UTC (permalink / raw)
  To: Linus Torvalds, Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Arnd Bergmann,
	Dmitry Golovin, Dennis Zhou, Tejun Heo, Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

[-- Attachment #1: Type: text/plain, Size: 1376 bytes --]

On Mon, 2020-05-11 at 13:01 -0700, Linus Torvalds wrote:
> On Mon, May 11, 2020 at 12:52 PM Nick Desaulniers
> <ndesaulniers@google.com> wrote:
> > 
> > Interesting approach.  Researching __builtin_choose_expr, it looks
> > like it was cited as prior art for C11's _Generic keyword.
> 
> Well, the thing that made me think that __builtin_choose_expr() would
> work is that unlike the switch statement, you absolutely _have_ to do
> the choice in the front end. You can't leave it as some kind of
> optimization for later phases, because the choice od expression ends
> up also determining the type of the result, so it isn't just a local
> choice - it affects everything around that expression.
> 
> But clang still doesn't like that "qi" constraint with a (non-chosen)
> expression that has a "u64" type.
> 
> I guess we can take the stupid extra cast, but I think it would at
> least need a comment (maybe through a helper function) about why "qi"
> needs it, but "ri" does not, and why the cast to "unsigned long" is
> needed, even though "clearly" the type is already just 8 bits.
> 
> Otherwise somebody will just remove that "obviously pointless" cast,
> and gcc will eat the result happily, and clang will fail.

I'm also mildly concerned that LLVM will start to whine about the 'ri'
case too. It's odd that it doesn't, even when GCC does.


[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5174 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-11 19:34       ` Brian Gerst
@ 2020-05-11 20:18         ` Nick Desaulniers
  2020-05-11 22:54         ` Brian Gerst
  1 sibling, 0 replies; 14+ messages in thread
From: Nick Desaulniers @ 2020-05-11 20:18 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Woodhouse,
	Arnd Bergmann, Linus Torvalds, Dmitry Golovin, Dennis Zhou,
	Tejun Heo, Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

On Mon, May 11, 2020 at 12:34 PM Brian Gerst <brgerst@gmail.com> wrote:
>
> On Mon, May 11, 2020 at 2:46 PM Nick Desaulniers
> <ndesaulniers@google.com> wrote:
> >
> > On Mon, May 11, 2020 at 11:09 AM Brian Gerst <brgerst@gmail.com> wrote:
> > > This looks like the same issue that we just discussed for bitops.h.
> > > Add the "b" operand size modifier to force it to use the 8-bit
> > > register names (and probably also needs the "w" modifier in the 16-bit
> > > case).
> >
> > While it does feel familiar, it is slightly different.
> > https://godbolt.org/z/Rme4Zg
> > That case was both compilers validating the inline asm, yet generating
> > assembly that the assembler would choke on.  This case is validation
> > in the front end failing.
>
> > long long ret;
> > switch (sizeof(ret)) {
> > case 1:
> >         asm ("movb $5, %0" : "=q" (ret));
> >         break;
> > case 8:;
> > }
>
> So if the issue here is that the output variable type is long long,
> what code is using a 64-bit percpu variable on a 32-bit kernel?  Can
> you give a specific file that fails to build with Clang?  If Clang is
> choking on it it may be silently miscompiling on GCC.

I'm not sure that's the case.  Applying this patch, undoing the hunk
in percpu_from_op() we get tons of errors.  Looking at one:

kernel/events/core.c:8679:8: error: invalid output size for constraint '=q'
./include/linux/percpu-defs.h:446:2: note: expanded from macro '__this_cpu_read'
        raw_cpu_read(pcp);                                              \
        ^
...

There's nothing wrong with this line, it's reading a percpu u64 into a
local u64.  The error comes from validating the inline asm in the dead
branch.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-11 19:34       ` Brian Gerst
  2020-05-11 20:18         ` Nick Desaulniers
@ 2020-05-11 22:54         ` Brian Gerst
  1 sibling, 0 replies; 14+ messages in thread
From: Brian Gerst @ 2020-05-11 22:54 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Woodhouse,
	Arnd Bergmann, Linus Torvalds, Dmitry Golovin, Dennis Zhou,
	Tejun Heo, Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

On Mon, May 11, 2020 at 3:34 PM Brian Gerst <brgerst@gmail.com> wrote:
>
> On Mon, May 11, 2020 at 2:46 PM Nick Desaulniers
> <ndesaulniers@google.com> wrote:
> >
> > On Mon, May 11, 2020 at 11:09 AM Brian Gerst <brgerst@gmail.com> wrote:
> > > This looks like the same issue that we just discussed for bitops.h.
> > > Add the "b" operand size modifier to force it to use the 8-bit
> > > register names (and probably also needs the "w" modifier in the 16-bit
> > > case).
> >
> > While it does feel familiar, it is slightly different.
> > https://godbolt.org/z/Rme4Zg
> > That case was both compilers validating the inline asm, yet generating
> > assembly that the assembler would choke on.  This case is validation
> > in the front end failing.
>
> > long long ret;
> > switch (sizeof(ret)) {
> > case 1:
> >         asm ("movb $5, %0" : "=q" (ret));
> >         break;
> > case 8:;
> > }
>
> So if the issue here is that the output variable type is long long,
> what code is using a 64-bit percpu variable on a 32-bit kernel?  Can
> you give a specific file that fails to build with Clang?  If Clang is
> choking on it it may be silently miscompiling on GCC.

On further investigation, 64-bit percpu operations fall back to the
generic code on x86-32, so there is no problem with miscompiling here.

On a side note from looking at the preprocessed output of the percpu
macros: they generate a ton of extra dead code because the core macros
also have a switch on data size.  I will take a stab at cleaning that
up.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] x86: support i386 with Clang
  2020-05-11 20:03           ` David Woodhouse
@ 2020-05-12 20:35             ` Nick Desaulniers
  0 siblings, 0 replies; 14+ messages in thread
From: Nick Desaulniers @ 2020-05-12 20:35 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Linus Torvalds, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Arnd Bergmann, Dmitry Golovin, Dennis Zhou, Tejun Heo,
	Christoph Lameter,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Al Viro, Josh Poimboeuf, Masami Hiramatsu,
	Peter Zijlstra, LKML, clang-built-linux

On Mon, May 11, 2020 at 1:03 PM David Woodhouse <dwmw2@infradead.org> wrote:
>
> On Mon, 2020-05-11 at 13:01 -0700, Linus Torvalds wrote:
> > On Mon, May 11, 2020 at 12:52 PM Nick Desaulniers
> > <ndesaulniers@google.com> wrote:
> > >
> > > Interesting approach.  Researching __builtin_choose_expr, it looks
> > > like it was cited as prior art for C11's _Generic keyword.
> >
> > Well, the thing that made me think that __builtin_choose_expr() would
> > work is that unlike the switch statement, you absolutely _have_ to do
> > the choice in the front end. You can't leave it as some kind of
> > optimization for later phases, because the choice od expression ends
> > up also determining the type of the result, so it isn't just a local
> > choice - it affects everything around that expression.
> >
> > But clang still doesn't like that "qi" constraint with a (non-chosen)
> > expression that has a "u64" type.
> >
> > I guess we can take the stupid extra cast, but I think it would at
> > least need a comment (maybe through a helper function) about why "qi"
> > needs it, but "ri" does not, and why the cast to "unsigned long" is
> > needed, even though "clearly" the type is already just 8 bits.
> >
> > Otherwise somebody will just remove that "obviously pointless" cast,
> > and gcc will eat the result happily, and clang will fail.
>
> I'm also mildly concerned that LLVM will start to whine about the 'ri'
> case too. It's odd that it doesn't, even when GCC does.

Ah, sorry, it took me a while to understand what case you meant by
this.  I recall you pointing this out previously in
https://bugs.llvm.org/show_bug.cgi?id=33587#c16, but at the time I
simply wasn't well versed enough in inline asm to understand.  The
case you're referring to is 64b operands, "r" constraint, -m32 target.
Yes, if I fix that: https://reviews.llvm.org/D79804, then this whole
patch needs to be reworked.  Back to the drawing board.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-05-12 20:35 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-04 23:03 [PATCH] x86: support i386 with Clang Nick Desaulniers
2020-05-11 17:23 ` Nick Desaulniers
2020-05-11 18:09   ` Brian Gerst
2020-05-11 18:46     ` Nick Desaulniers
2020-05-11 19:34       ` Brian Gerst
2020-05-11 20:18         ` Nick Desaulniers
2020-05-11 22:54         ` Brian Gerst
2020-05-11 18:12   ` Linus Torvalds
2020-05-11 18:24     ` Linus Torvalds
2020-05-11 18:36       ` Linus Torvalds
2020-05-11 19:52       ` Nick Desaulniers
2020-05-11 20:01         ` Linus Torvalds
2020-05-11 20:03           ` David Woodhouse
2020-05-12 20:35             ` Nick Desaulniers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).