All of lore.kernel.org
 help / color / mirror / Atom feed
* mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-14  8:55 ` Sudip Mukherjee (Codethink)
  0 siblings, 0 replies; 47+ messages in thread
From: Sudip Mukherjee (Codethink) @ 2022-07-14  8:55 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras, Kees Cook
  Cc: linuxppc-dev, linux-kernel, linux-hardening, Linus Torvalds

Hi All,

Not sure if it has been reported before but the latest mainline kernel
branch fails to build for powerpc allmodconfig with gcc-12 and the error is:

Error: External symbol 'memset' referenced from prom_init.c
make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1

The commit ca5dabcff1df ("powerpc/prom_init: Fix build failure with
GCC_PLUGIN_STRUCTLEAK_BYREF_ALL and KASAN") looks similar but the error
is still there with gcc-12.

Note: I don't see this error with gcc-11.


I will be happy to test any patch or provide any extra log if needed.

--
Regards
Sudip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-14  8:55 ` Sudip Mukherjee (Codethink)
  0 siblings, 0 replies; 47+ messages in thread
From: Sudip Mukherjee (Codethink) @ 2022-07-14  8:55 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras, Kees Cook
  Cc: Linus Torvalds, linuxppc-dev, linux-kernel, linux-hardening

Hi All,

Not sure if it has been reported before but the latest mainline kernel
branch fails to build for powerpc allmodconfig with gcc-12 and the error is:

Error: External symbol 'memset' referenced from prom_init.c
make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1

The commit ca5dabcff1df ("powerpc/prom_init: Fix build failure with
GCC_PLUGIN_STRUCTLEAK_BYREF_ALL and KASAN") looks similar but the error
is still there with gcc-12.

Note: I don't see this error with gcc-11.


I will be happy to test any patch or provide any extra log if needed.

--
Regards
Sudip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-14  8:55 ` Sudip Mukherjee (Codethink)
@ 2022-07-17  9:12   ` Sudip Mukherjee
  -1 siblings, 0 replies; 47+ messages in thread
From: Sudip Mukherjee @ 2022-07-17  9:12 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras, Kees Cook
  Cc: linuxppc-dev, linux-kernel, linux-hardening, Linus Torvalds

On Thu, Jul 14, 2022 at 9:55 AM Sudip Mukherjee (Codethink)
<sudipm.mukherjee@gmail.com> wrote:
>
> Hi All,
>
> Not sure if it has been reported before but the latest mainline kernel
> branch fails to build for powerpc allmodconfig with gcc-12 and the error is:
>
> Error: External symbol 'memset' referenced from prom_init.c
> make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1

I was trying to check it. With gcc-11 the assembly code generated is
not using memset, but using __memset.
But with gcc-12, I can see the assembly code is using memset. One
example from the assembly:

call_prom:
        .quad   .call_prom,.TOC.@tocbase,0
        .previous
        .size   call_prom,24
        .type   .call_prom,@function
.call_prom:
        mflr 0           #,
        std 29,-24(1)    #,
        std 30,-16(1)    #,
        std 31,-8(1)     #,
        mr 29,3          # tmp166, service
        mr 31,4          # nargs, tmp167
        mr 30,5          # tmp168, nret
 # arch/powerpc/kernel/prom_init.c:396:         struct prom_args args;
        li 4,254                 #,
        li 5,52          #,
 # arch/powerpc/kernel/prom_init.c:394: {
        std 0,16(1)      #,
        stdu 1,-208(1)   #,,
 # arch/powerpc/kernel/prom_init.c:396:         struct prom_args args;
        addi 3,1,112     # tmp174,,
 # arch/powerpc/kernel/prom_init.c:394: {
        std 9,304(1)     #,
        std 10,312(1)    #,
        std 6,280(1)     #,
        std 7,288(1)     #,
        std 8,296(1)     #,
 # arch/powerpc/kernel/prom_init.c:396:         struct prom_args args;
        bl .memset       #
        nop
        rldicl 9,31,0,32         # nargs, nargs
        addi 9,9,1       # tmp163, nargs,
        mtctr 9          # tmp163, tmp163



-- 
Regards
Sudip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-17  9:12   ` Sudip Mukherjee
  0 siblings, 0 replies; 47+ messages in thread
From: Sudip Mukherjee @ 2022-07-17  9:12 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras, Kees Cook
  Cc: Linus Torvalds, linuxppc-dev, linux-kernel, linux-hardening

On Thu, Jul 14, 2022 at 9:55 AM Sudip Mukherjee (Codethink)
<sudipm.mukherjee@gmail.com> wrote:
>
> Hi All,
>
> Not sure if it has been reported before but the latest mainline kernel
> branch fails to build for powerpc allmodconfig with gcc-12 and the error is:
>
> Error: External symbol 'memset' referenced from prom_init.c
> make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1

I was trying to check it. With gcc-11 the assembly code generated is
not using memset, but using __memset.
But with gcc-12, I can see the assembly code is using memset. One
example from the assembly:

call_prom:
        .quad   .call_prom,.TOC.@tocbase,0
        .previous
        .size   call_prom,24
        .type   .call_prom,@function
.call_prom:
        mflr 0           #,
        std 29,-24(1)    #,
        std 30,-16(1)    #,
        std 31,-8(1)     #,
        mr 29,3          # tmp166, service
        mr 31,4          # nargs, tmp167
        mr 30,5          # tmp168, nret
 # arch/powerpc/kernel/prom_init.c:396:         struct prom_args args;
        li 4,254                 #,
        li 5,52          #,
 # arch/powerpc/kernel/prom_init.c:394: {
        std 0,16(1)      #,
        stdu 1,-208(1)   #,,
 # arch/powerpc/kernel/prom_init.c:396:         struct prom_args args;
        addi 3,1,112     # tmp174,,
 # arch/powerpc/kernel/prom_init.c:394: {
        std 9,304(1)     #,
        std 10,312(1)    #,
        std 6,280(1)     #,
        std 7,288(1)     #,
        std 8,296(1)     #,
 # arch/powerpc/kernel/prom_init.c:396:         struct prom_args args;
        bl .memset       #
        nop
        rldicl 9,31,0,32         # nargs, nargs
        addi 9,9,1       # tmp163, nargs,
        mtctr 9          # tmp163, tmp163



-- 
Regards
Sudip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-17  9:12   ` Sudip Mukherjee
@ 2022-07-17 14:44     ` Linus Torvalds
  -1 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-17 14:44 UTC (permalink / raw)
  To: Sudip Mukherjee
  Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Kees Cook, linuxppc-dev, linux-kernel, linux-hardening

On Sun, Jul 17, 2022 at 2:13 AM Sudip Mukherjee
<sudipm.mukherjee@gmail.com> wrote:
>
> I was trying to check it. With gcc-11 the assembly code generated is
> not using memset, but using __memset.
> But with gcc-12, I can see the assembly code is using memset. One
> example from the assembly:

You could try making the 'args' array in 'struct prom_args' be marked
'volatile'.

Ie something like this:

  --- a/arch/powerpc/kernel/prom_init.c
  +++ b/arch/powerpc/kernel/prom_init.c
  @@ -115,6 +115,6 @@ struct prom_args {
           __be32 service;
           __be32 nargs;
           __be32 nret;
  -          __be32 args[10];
  +        volatile __be32 args[10];
   };

because I think it's just the compilers turning the small loop over
those fields into a "memset()".

              Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-17 14:44     ` Linus Torvalds
  0 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-17 14:44 UTC (permalink / raw)
  To: Sudip Mukherjee
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening, linuxppc-dev

On Sun, Jul 17, 2022 at 2:13 AM Sudip Mukherjee
<sudipm.mukherjee@gmail.com> wrote:
>
> I was trying to check it. With gcc-11 the assembly code generated is
> not using memset, but using __memset.
> But with gcc-12, I can see the assembly code is using memset. One
> example from the assembly:

You could try making the 'args' array in 'struct prom_args' be marked
'volatile'.

Ie something like this:

  --- a/arch/powerpc/kernel/prom_init.c
  +++ b/arch/powerpc/kernel/prom_init.c
  @@ -115,6 +115,6 @@ struct prom_args {
           __be32 service;
           __be32 nargs;
           __be32 nret;
  -          __be32 args[10];
  +        volatile __be32 args[10];
   };

because I think it's just the compilers turning the small loop over
those fields into a "memset()".

              Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-17 14:44     ` Linus Torvalds
@ 2022-07-17 19:54       ` Segher Boessenkool
  -1 siblings, 0 replies; 47+ messages in thread
From: Segher Boessenkool @ 2022-07-17 19:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening,
	linuxppc-dev, Sudip Mukherjee

On Sun, Jul 17, 2022 at 07:44:22AM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 2:13 AM Sudip Mukherjee
> <sudipm.mukherjee@gmail.com> wrote:
> > I was trying to check it. With gcc-11 the assembly code generated is
> > not using memset, but using __memset.
> > But with gcc-12, I can see the assembly code is using memset. One
> > example from the assembly:
> 
> You could try making the 'args' array in 'struct prom_args' be marked
> 'volatile'.
> 
> Ie something like this:
> 
>   --- a/arch/powerpc/kernel/prom_init.c
>   +++ b/arch/powerpc/kernel/prom_init.c
>   @@ -115,6 +115,6 @@ struct prom_args {
>            __be32 service;
>            __be32 nargs;
>            __be32 nret;
>   -          __be32 args[10];
>   +        volatile __be32 args[10];
>    };
> 
> because I think it's just the compilers turning the small loop over
> those fields into a "memset()".

Yes.  See <https://gcc.gnu.org/onlinedocs/gcc/Standards.html#C-Language>
near the end:
  Most of the compiler support routines used by GCC are present in
  libgcc, but there are a few exceptions. GCC requires the freestanding
  environment provide memcpy, memmove, memset and memcmp. Finally, if
  __builtin_trap is used, and the target does not implement the trap
  pattern, then GCC emits a call to abort.

Can't we simply have a small simple implementation of these functions in
arch/powerpc/boot/?  This stuff is not performance-critical, and this is
not the first time we hit these problems.


Segher

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-17 19:54       ` Segher Boessenkool
  0 siblings, 0 replies; 47+ messages in thread
From: Segher Boessenkool @ 2022-07-17 19:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sudip Mukherjee, Kees Cook, linux-kernel, Paul Mackerras,
	linux-hardening, linuxppc-dev

On Sun, Jul 17, 2022 at 07:44:22AM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 2:13 AM Sudip Mukherjee
> <sudipm.mukherjee@gmail.com> wrote:
> > I was trying to check it. With gcc-11 the assembly code generated is
> > not using memset, but using __memset.
> > But with gcc-12, I can see the assembly code is using memset. One
> > example from the assembly:
> 
> You could try making the 'args' array in 'struct prom_args' be marked
> 'volatile'.
> 
> Ie something like this:
> 
>   --- a/arch/powerpc/kernel/prom_init.c
>   +++ b/arch/powerpc/kernel/prom_init.c
>   @@ -115,6 +115,6 @@ struct prom_args {
>            __be32 service;
>            __be32 nargs;
>            __be32 nret;
>   -          __be32 args[10];
>   +        volatile __be32 args[10];
>    };
> 
> because I think it's just the compilers turning the small loop over
> those fields into a "memset()".

Yes.  See <https://gcc.gnu.org/onlinedocs/gcc/Standards.html#C-Language>
near the end:
  Most of the compiler support routines used by GCC are present in
  libgcc, but there are a few exceptions. GCC requires the freestanding
  environment provide memcpy, memmove, memset and memcmp. Finally, if
  __builtin_trap is used, and the target does not implement the trap
  pattern, then GCC emits a call to abort.

Can't we simply have a small simple implementation of these functions in
arch/powerpc/boot/?  This stuff is not performance-critical, and this is
not the first time we hit these problems.


Segher

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-17 14:44     ` Linus Torvalds
@ 2022-07-17 20:25       ` Sudip Mukherjee
  -1 siblings, 0 replies; 47+ messages in thread
From: Sudip Mukherjee @ 2022-07-17 20:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Kees Cook, linuxppc-dev, linux-kernel, linux-hardening,
	Segher Boessenkool

On Sun, Jul 17, 2022 at 3:44 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, Jul 17, 2022 at 2:13 AM Sudip Mukherjee
> <sudipm.mukherjee@gmail.com> wrote:
> >
> > I was trying to check it. With gcc-11 the assembly code generated is
> > not using memset, but using __memset.
> > But with gcc-12, I can see the assembly code is using memset. One
> > example from the assembly:
>
> You could try making the 'args' array in 'struct prom_args' be marked
> 'volatile'.
>
> Ie something like this:
>
>   --- a/arch/powerpc/kernel/prom_init.c
>   +++ b/arch/powerpc/kernel/prom_init.c
>   @@ -115,6 +115,6 @@ struct prom_args {
>            __be32 service;
>            __be32 nargs;
>            __be32 nret;
>   -          __be32 args[10];
>   +        volatile __be32 args[10];
>    };
>
> because I think it's just the compilers turning the small loop over
> those fields into a "memset()".

That didn't work.
"Error: External symbol 'memset' referenced from prom_init.c" is still
there with this change.
And the generated assembly still has the memset for "struct prom_args".


-- 
Regards
Sudip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-17 20:25       ` Sudip Mukherjee
  0 siblings, 0 replies; 47+ messages in thread
From: Sudip Mukherjee @ 2022-07-17 20:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening, linuxppc-dev

On Sun, Jul 17, 2022 at 3:44 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, Jul 17, 2022 at 2:13 AM Sudip Mukherjee
> <sudipm.mukherjee@gmail.com> wrote:
> >
> > I was trying to check it. With gcc-11 the assembly code generated is
> > not using memset, but using __memset.
> > But with gcc-12, I can see the assembly code is using memset. One
> > example from the assembly:
>
> You could try making the 'args' array in 'struct prom_args' be marked
> 'volatile'.
>
> Ie something like this:
>
>   --- a/arch/powerpc/kernel/prom_init.c
>   +++ b/arch/powerpc/kernel/prom_init.c
>   @@ -115,6 +115,6 @@ struct prom_args {
>            __be32 service;
>            __be32 nargs;
>            __be32 nret;
>   -          __be32 args[10];
>   +        volatile __be32 args[10];
>    };
>
> because I think it's just the compilers turning the small loop over
> those fields into a "memset()".

That didn't work.
"Error: External symbol 'memset' referenced from prom_init.c" is still
there with this change.
And the generated assembly still has the memset for "struct prom_args".


-- 
Regards
Sudip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-17 20:25       ` Sudip Mukherjee
@ 2022-07-17 20:29         ` Linus Torvalds
  -1 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-17 20:29 UTC (permalink / raw)
  To: Sudip Mukherjee
  Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Kees Cook, linuxppc-dev, linux-kernel, linux-hardening,
	Segher Boessenkool

On Sun, Jul 17, 2022 at 1:25 PM Sudip Mukherjee
<sudipm.mukherjee@gmail.com> wrote:
>
> And the generated assembly still has the memset for "struct prom_args".

Strange. That smells like a compiler bug to me.

But I can't read powerpc assembly code - it's been too many years, and
even back when I did read it I hated how the register "names" worked.

Maybe it was never the args array, and it was about the other fields.
Not that that makes any sense either, but it makes more sense than the
compiler turning a series of volatile accesses into a memset.

              Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-17 20:29         ` Linus Torvalds
  0 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-17 20:29 UTC (permalink / raw)
  To: Sudip Mukherjee
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening, linuxppc-dev

On Sun, Jul 17, 2022 at 1:25 PM Sudip Mukherjee
<sudipm.mukherjee@gmail.com> wrote:
>
> And the generated assembly still has the memset for "struct prom_args".

Strange. That smells like a compiler bug to me.

But I can't read powerpc assembly code - it's been too many years, and
even back when I did read it I hated how the register "names" worked.

Maybe it was never the args array, and it was about the other fields.
Not that that makes any sense either, but it makes more sense than the
compiler turning a series of volatile accesses into a memset.

              Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-17 20:29         ` Linus Torvalds
@ 2022-07-17 20:38           ` Sudip Mukherjee
  -1 siblings, 0 replies; 47+ messages in thread
From: Sudip Mukherjee @ 2022-07-17 20:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Kees Cook, linuxppc-dev, linux-kernel, linux-hardening,
	Segher Boessenkool

On Sun, Jul 17, 2022 at 9:29 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, Jul 17, 2022 at 1:25 PM Sudip Mukherjee
> <sudipm.mukherjee@gmail.com> wrote:
> >
> > And the generated assembly still has the memset for "struct prom_args".
>
> Strange. That smells like a compiler bug to me.

Both gcc-12 and clang gives this error.

>
> But I can't read powerpc assembly code - it's been too many years, and
> even back when I did read it I hated how the register "names" worked.
>
> Maybe it was never the args array, and it was about the other fields.
> Not that that makes any sense either, but it makes more sense than the
> compiler turning a series of volatile accesses into a memset.

I have also tried adding volatile to all the members of that struct.  :(


-- 
Regards
Sudip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-17 20:38           ` Sudip Mukherjee
  0 siblings, 0 replies; 47+ messages in thread
From: Sudip Mukherjee @ 2022-07-17 20:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening, linuxppc-dev

On Sun, Jul 17, 2022 at 9:29 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, Jul 17, 2022 at 1:25 PM Sudip Mukherjee
> <sudipm.mukherjee@gmail.com> wrote:
> >
> > And the generated assembly still has the memset for "struct prom_args".
>
> Strange. That smells like a compiler bug to me.

Both gcc-12 and clang gives this error.

>
> But I can't read powerpc assembly code - it's been too many years, and
> even back when I did read it I hated how the register "names" worked.
>
> Maybe it was never the args array, and it was about the other fields.
> Not that that makes any sense either, but it makes more sense than the
> compiler turning a series of volatile accesses into a memset.

I have also tried adding volatile to all the members of that struct.  :(


-- 
Regards
Sudip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-17 20:29         ` Linus Torvalds
@ 2022-07-17 20:56           ` Segher Boessenkool
  -1 siblings, 0 replies; 47+ messages in thread
From: Segher Boessenkool @ 2022-07-17 20:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening,
	linuxppc-dev, Sudip Mukherjee

On Sun, Jul 17, 2022 at 01:29:07PM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 1:25 PM Sudip Mukherjee
> <sudipm.mukherjee@gmail.com> wrote:
> >
> > And the generated assembly still has the memset for "struct prom_args".
> 
> Strange. That smells like a compiler bug to me.
> 
> But I can't read powerpc assembly code - it's been too many years, and
> even back when I did read it I hated how the register "names" worked.
> 
> Maybe it was never the args array, and it was about the other fields.
> Not that that makes any sense either, but it makes more sense than the
> compiler turning a series of volatile accesses into a memset.

Calling mem* on a volatile object (or a struct containing one) is not
valid.  I opened gcc.gnu.org/PR106335.  Thanks for bringing this up!


Segher

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-17 20:56           ` Segher Boessenkool
  0 siblings, 0 replies; 47+ messages in thread
From: Segher Boessenkool @ 2022-07-17 20:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sudip Mukherjee, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Kees Cook, linuxppc-dev, linux-kernel,
	linux-hardening

On Sun, Jul 17, 2022 at 01:29:07PM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 1:25 PM Sudip Mukherjee
> <sudipm.mukherjee@gmail.com> wrote:
> >
> > And the generated assembly still has the memset for "struct prom_args".
> 
> Strange. That smells like a compiler bug to me.
> 
> But I can't read powerpc assembly code - it's been too many years, and
> even back when I did read it I hated how the register "names" worked.
> 
> Maybe it was never the args array, and it was about the other fields.
> Not that that makes any sense either, but it makes more sense than the
> compiler turning a series of volatile accesses into a memset.

Calling mem* on a volatile object (or a struct containing one) is not
valid.  I opened gcc.gnu.org/PR106335.  Thanks for bringing this up!


Segher

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-17 20:38           ` Sudip Mukherjee
@ 2022-07-17 20:56             ` Linus Torvalds
  -1 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-17 20:56 UTC (permalink / raw)
  To: Sudip Mukherjee
  Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Kees Cook, linuxppc-dev, linux-kernel, linux-hardening,
	Segher Boessenkool

On Sun, Jul 17, 2022 at 1:38 PM Sudip Mukherjee
<sudipm.mukherjee@gmail.com> wrote:
>
> I have also tried adding volatile to all the members of that struct.  :(

Can you read the code to figure otu what the memcpy is all about?

Or maybe there is something that disables 'volatile' with pre-processor hackery.

Because a compiler that turns a loop over volatile members into
'memset()' isn't a C compiler, it's just a random noise generator.
'volatile' is fundamental enough that I really doubt both gcc and
clang can be that broken.

I just tested this

        struct hello {
                volatile int array[100];
        };

        void test(void)
        {
                int i;
                struct hello hello;
                for (i = 0; i < 100; i++)
                        hello.array[i] = 0;
        }

on x86-64, and sure enough, gcc-12 turns turns it into a memset
without the volatile (in fact, the above will just be optimized away
entirely since it has no user), but with the volatile it's a proper
regular loop that does 32-byte accesses one by one (and in the proper
ascending oder). Something that memset() most definitely does not
guarantee:

.L2:
        movslq  %eax, %rdx
        addl    $1, %eax
        movl    $0, -120(%rsp,%rdx,4)
        cmpl    $100, %eax
        jne     .L2

and honestly, anything else sounds completely unacceptable.

So I suspect there is something wrong with your testing, because gcc
simply isn't that incredibly broken. Clang is interesting in that it
seems to unroll the loop five times, but it still does the proper
"write individual 32-bit entities in ascending order".

The other alternative is that it's something else than that 'struct
prom_args'. Again, I don't read powerpc asm good.

                  Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-17 20:56             ` Linus Torvalds
  0 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-17 20:56 UTC (permalink / raw)
  To: Sudip Mukherjee
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening, linuxppc-dev

On Sun, Jul 17, 2022 at 1:38 PM Sudip Mukherjee
<sudipm.mukherjee@gmail.com> wrote:
>
> I have also tried adding volatile to all the members of that struct.  :(

Can you read the code to figure otu what the memcpy is all about?

Or maybe there is something that disables 'volatile' with pre-processor hackery.

Because a compiler that turns a loop over volatile members into
'memset()' isn't a C compiler, it's just a random noise generator.
'volatile' is fundamental enough that I really doubt both gcc and
clang can be that broken.

I just tested this

        struct hello {
                volatile int array[100];
        };

        void test(void)
        {
                int i;
                struct hello hello;
                for (i = 0; i < 100; i++)
                        hello.array[i] = 0;
        }

on x86-64, and sure enough, gcc-12 turns turns it into a memset
without the volatile (in fact, the above will just be optimized away
entirely since it has no user), but with the volatile it's a proper
regular loop that does 32-byte accesses one by one (and in the proper
ascending oder). Something that memset() most definitely does not
guarantee:

.L2:
        movslq  %eax, %rdx
        addl    $1, %eax
        movl    $0, -120(%rsp,%rdx,4)
        cmpl    $100, %eax
        jne     .L2

and honestly, anything else sounds completely unacceptable.

So I suspect there is something wrong with your testing, because gcc
simply isn't that incredibly broken. Clang is interesting in that it
seems to unroll the loop five times, but it still does the proper
"write individual 32-bit entities in ascending order".

The other alternative is that it's something else than that 'struct
prom_args'. Again, I don't read powerpc asm good.

                  Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-17 20:56           ` Segher Boessenkool
@ 2022-07-17 21:11             ` Linus Torvalds
  -1 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-17 21:11 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Sudip Mukherjee, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Kees Cook, linuxppc-dev, linux-kernel,
	linux-hardening

On Sun, Jul 17, 2022 at 2:00 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> Calling mem* on a volatile object (or a struct containing one) is not
> valid.  I opened gcc.gnu.org/PR106335.

Well, that very quickly got marked as a duplicate of a decade-old bug.

So I guess we shouldn't expect this to be fixed any time soon.

That said, your test-case of copying the whole structure is very
different from the one in the kernel that works on them one structure
member at a time.

I can *kind of* see the logic that when you do a whole struct
assignment, it turns into a "memcpy" without regard for volatile
members. You're not actually accessing the volatile members in some
particular order, so the struct assignment arguably does not really
have an access ordering that needs to be preserved.

But the kernel code in question very much does access the members
individually, and so I think that the compiler quite unequivocally did
something horribly horribly bad by turning them into a memset.

So I don't think your test-case is really particularly good, and maybe
that's why that old bug has languished for over a decade - people
didn't realize just *how* incredibly broken it was.

             Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-17 21:11             ` Linus Torvalds
  0 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-17 21:11 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening,
	linuxppc-dev, Sudip Mukherjee

On Sun, Jul 17, 2022 at 2:00 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> Calling mem* on a volatile object (or a struct containing one) is not
> valid.  I opened gcc.gnu.org/PR106335.

Well, that very quickly got marked as a duplicate of a decade-old bug.

So I guess we shouldn't expect this to be fixed any time soon.

That said, your test-case of copying the whole structure is very
different from the one in the kernel that works on them one structure
member at a time.

I can *kind of* see the logic that when you do a whole struct
assignment, it turns into a "memcpy" without regard for volatile
members. You're not actually accessing the volatile members in some
particular order, so the struct assignment arguably does not really
have an access ordering that needs to be preserved.

But the kernel code in question very much does access the members
individually, and so I think that the compiler quite unequivocally did
something horribly horribly bad by turning them into a memset.

So I don't think your test-case is really particularly good, and maybe
that's why that old bug has languished for over a decade - people
didn't realize just *how* incredibly broken it was.

             Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-17 21:11             ` Linus Torvalds
@ 2022-07-17 21:45               ` Segher Boessenkool
  -1 siblings, 0 replies; 47+ messages in thread
From: Segher Boessenkool @ 2022-07-17 21:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening,
	linuxppc-dev, Sudip Mukherjee

On Sun, Jul 17, 2022 at 02:11:52PM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 2:00 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> > Calling mem* on a volatile object (or a struct containing one) is not
> > valid.  I opened gcc.gnu.org/PR106335.
> 
> Well, that very quickly got marked as a duplicate of a decade-old bug.
> 
> So I guess we shouldn't expect this to be fixed any time soon.

It shouldn't be all that hard to implement.  GCC wants all ports to
define their own mem* because these functions are so critical for
performance, but it isn't hard to do a straightforward by-field copy
for assignments if using memcpy would not be valid at all.  Also, if
we would have this we could make a compiler flag saying to always
open-code this, getting rid of this annoyance (namely, that extetnal
mem* are required) for -ffreestanding.

> That said, your test-case of copying the whole structure is very
> different from the one in the kernel that works on them one structure
> member at a time.
> 
> I can *kind of* see the logic that when you do a whole struct
> assignment, it turns into a "memcpy" without regard for volatile
> members. You're not actually accessing the volatile members in some
> particular order, so the struct assignment arguably does not really
> have an access ordering that needs to be preserved.

The order is not defined, correct.  But a "volatile int" can only be
accessed as an int, and an external memcpy will typically use different
size accesses, and can even access some fields more than once (or
partially); all not okay for a volatile object.

> But the kernel code in question very much does access the members
> individually, and so I think that the compiler quite unequivocally did
> something horribly horribly bad by turning them into a memset.
> 
> So I don't think your test-case is really particularly good, and maybe
> that's why that old bug has languished for over a decade - people
> didn't realize just *how* incredibly broken it was.

People haven't looked at my test case for all that time, it sprouted
from my demented mind just minutes ago ;-)  The purpose of writing it
this way was to make sure that memcpy will be called for this (on any
target etc.), not some shorter and/or smarter thing.

I don't know what the real reason is that this bugs hasn't been fixed
yet.  It should be quite easy to make this more correct.  In
<https://patchwork.ozlabs.org/project/gcc/patch/1408617247-21558-1-git-send-email-james.greenhalgh@arm.com/#843066>
Richard suggested doing it in the frontend, which seems reasonable (but
more work than the patch there).

There have been no follow-up patches as far as I can see :-(


Segher

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-17 21:45               ` Segher Boessenkool
  0 siblings, 0 replies; 47+ messages in thread
From: Segher Boessenkool @ 2022-07-17 21:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sudip Mukherjee, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Kees Cook, linuxppc-dev, linux-kernel,
	linux-hardening

On Sun, Jul 17, 2022 at 02:11:52PM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 2:00 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> > Calling mem* on a volatile object (or a struct containing one) is not
> > valid.  I opened gcc.gnu.org/PR106335.
> 
> Well, that very quickly got marked as a duplicate of a decade-old bug.
> 
> So I guess we shouldn't expect this to be fixed any time soon.

It shouldn't be all that hard to implement.  GCC wants all ports to
define their own mem* because these functions are so critical for
performance, but it isn't hard to do a straightforward by-field copy
for assignments if using memcpy would not be valid at all.  Also, if
we would have this we could make a compiler flag saying to always
open-code this, getting rid of this annoyance (namely, that extetnal
mem* are required) for -ffreestanding.

> That said, your test-case of copying the whole structure is very
> different from the one in the kernel that works on them one structure
> member at a time.
> 
> I can *kind of* see the logic that when you do a whole struct
> assignment, it turns into a "memcpy" without regard for volatile
> members. You're not actually accessing the volatile members in some
> particular order, so the struct assignment arguably does not really
> have an access ordering that needs to be preserved.

The order is not defined, correct.  But a "volatile int" can only be
accessed as an int, and an external memcpy will typically use different
size accesses, and can even access some fields more than once (or
partially); all not okay for a volatile object.

> But the kernel code in question very much does access the members
> individually, and so I think that the compiler quite unequivocally did
> something horribly horribly bad by turning them into a memset.
> 
> So I don't think your test-case is really particularly good, and maybe
> that's why that old bug has languished for over a decade - people
> didn't realize just *how* incredibly broken it was.

People haven't looked at my test case for all that time, it sprouted
from my demented mind just minutes ago ;-)  The purpose of writing it
this way was to make sure that memcpy will be called for this (on any
target etc.), not some shorter and/or smarter thing.

I don't know what the real reason is that this bugs hasn't been fixed
yet.  It should be quite easy to make this more correct.  In
<https://patchwork.ozlabs.org/project/gcc/patch/1408617247-21558-1-git-send-email-james.greenhalgh@arm.com/#843066>
Richard suggested doing it in the frontend, which seems reasonable (but
more work than the patch there).

There have been no follow-up patches as far as I can see :-(


Segher

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-17 21:45               ` Segher Boessenkool
@ 2022-07-18  1:38                 ` Linus Torvalds
  -1 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-18  1:38 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Sudip Mukherjee, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Kees Cook, linuxppc-dev, linux-kernel,
	linux-hardening

On Sun, Jul 17, 2022 at 2:49 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> > I can *kind of* see the logic that when you do a whole struct
> > assignment, it turns into a "memcpy" without regard for volatile
> > members. You're not actually accessing the volatile members in some
> > particular order, so the struct assignment arguably does not really
> > have an access ordering that needs to be preserved.
>
> The order is not defined, correct.  But a "volatile int" can only be
> accessed as an int, and an external memcpy will typically use different
> size accesses, and can even access some fields more than once (or
> partially); all not okay for a volatile object.

That is not actually a valid or realistic argument in the general case.

The thing is, an operation on an aggregate type in C is fundamentally
different from the "do the same operation on the individual parts of
the struct".

Just to make a very concrete example of that, it's not at all
unreasonable to have a struct like this:

    struct io_accessor {
        union {
            volatile unsigned char byte[8];
            volatile unsigned short word[4];
        ...

and while that wasn't the example here, it's not completely insane as
a concept to use as a helper type so that you can do volatile accesses
of different sizes.

And while you'd be right to say that "assigning that kind of struct is
probably insane", and I wouldn't argue against it, I also think that
basically *any* union member is basically an argument that a structure
assignment is *NOT* about "assign all the individual members", and
never really can be.

In the above union, make one union member be a non-volatile type, and
suddenly it actually *can* be ok to copy the struct. Even though it
also has volatile bytes.

So once you start doing structure assignments but argue about
individual fields being volatile, I think you're on very shaky ground.

And I think "memcpy" is a reasonable way to say "we don't care - and
in the general case we CANNOT know - what the individual members are,
so we'll just copy it as one thing".

So the compiler emitting a "memcpy()" to assign a structure sounds
fine. Even in theory. Because the "but individual fields.." argument
just cannot work in general.

In contrast, when you access the members individually (like the kernel
does in this powerpc case), there is no such ambiguity.

There is no way in hell that it is ever ok to do a "memcpy()" when the
user has done the assignments one volatile member at a time.

So that's why I don't think your test-case with the struct assignment
is very good. I think it's very reasonable for a compiler person to
say "you assigned the whole struct, you get what you asked for, you
get a memcpy".

But when you do a loop that assigns individual volatile fields? No
such problem. Completely unambiguous that you need to do them one at a
time as individual accesses.

                Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-18  1:38                 ` Linus Torvalds
  0 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-18  1:38 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening,
	linuxppc-dev, Sudip Mukherjee

On Sun, Jul 17, 2022 at 2:49 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> > I can *kind of* see the logic that when you do a whole struct
> > assignment, it turns into a "memcpy" without regard for volatile
> > members. You're not actually accessing the volatile members in some
> > particular order, so the struct assignment arguably does not really
> > have an access ordering that needs to be preserved.
>
> The order is not defined, correct.  But a "volatile int" can only be
> accessed as an int, and an external memcpy will typically use different
> size accesses, and can even access some fields more than once (or
> partially); all not okay for a volatile object.

That is not actually a valid or realistic argument in the general case.

The thing is, an operation on an aggregate type in C is fundamentally
different from the "do the same operation on the individual parts of
the struct".

Just to make a very concrete example of that, it's not at all
unreasonable to have a struct like this:

    struct io_accessor {
        union {
            volatile unsigned char byte[8];
            volatile unsigned short word[4];
        ...

and while that wasn't the example here, it's not completely insane as
a concept to use as a helper type so that you can do volatile accesses
of different sizes.

And while you'd be right to say that "assigning that kind of struct is
probably insane", and I wouldn't argue against it, I also think that
basically *any* union member is basically an argument that a structure
assignment is *NOT* about "assign all the individual members", and
never really can be.

In the above union, make one union member be a non-volatile type, and
suddenly it actually *can* be ok to copy the struct. Even though it
also has volatile bytes.

So once you start doing structure assignments but argue about
individual fields being volatile, I think you're on very shaky ground.

And I think "memcpy" is a reasonable way to say "we don't care - and
in the general case we CANNOT know - what the individual members are,
so we'll just copy it as one thing".

So the compiler emitting a "memcpy()" to assign a structure sounds
fine. Even in theory. Because the "but individual fields.." argument
just cannot work in general.

In contrast, when you access the members individually (like the kernel
does in this powerpc case), there is no such ambiguity.

There is no way in hell that it is ever ok to do a "memcpy()" when the
user has done the assignments one volatile member at a time.

So that's why I don't think your test-case with the struct assignment
is very good. I think it's very reasonable for a compiler person to
say "you assigned the whole struct, you get what you asked for, you
get a memcpy".

But when you do a loop that assigns individual volatile fields? No
such problem. Completely unambiguous that you need to do them one at a
time as individual accesses.

                Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-17 19:54       ` Segher Boessenkool
@ 2022-07-18  3:52         ` Michael Ellerman
  -1 siblings, 0 replies; 47+ messages in thread
From: Michael Ellerman @ 2022-07-18  3:52 UTC (permalink / raw)
  To: Segher Boessenkool, Linus Torvalds
  Cc: Sudip Mukherjee, Kees Cook, linux-kernel, Paul Mackerras,
	linux-hardening, linuxppc-dev

Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Sun, Jul 17, 2022 at 07:44:22AM -0700, Linus Torvalds wrote:
>> On Sun, Jul 17, 2022 at 2:13 AM Sudip Mukherjee
>> <sudipm.mukherjee@gmail.com> wrote:
>> > I was trying to check it. With gcc-11 the assembly code generated is
>> > not using memset, but using __memset.
>> > But with gcc-12, I can see the assembly code is using memset. One
>> > example from the assembly:
>> 
>> You could try making the 'args' array in 'struct prom_args' be marked
>> 'volatile'.
>> 
>> Ie something like this:
>> 
>>   --- a/arch/powerpc/kernel/prom_init.c
>>   +++ b/arch/powerpc/kernel/prom_init.c
>>   @@ -115,6 +115,6 @@ struct prom_args {
>>            __be32 service;
>>            __be32 nargs;
>>            __be32 nret;
>>   -          __be32 args[10];
>>   +        volatile __be32 args[10];
>>    };
>> 
>> because I think it's just the compilers turning the small loop over
>> those fields into a "memset()".
>
> Yes.  See <https://gcc.gnu.org/onlinedocs/gcc/Standards.html#C-Language>
> near the end:
>   Most of the compiler support routines used by GCC are present in
>   libgcc, but there are a few exceptions. GCC requires the freestanding
>   environment provide memcpy, memmove, memset and memcmp. Finally, if
>   __builtin_trap is used, and the target does not implement the trap
>   pattern, then GCC emits a call to abort.
>
> Can't we simply have a small simple implementation of these functions in
> arch/powerpc/boot/?  This stuff is not performance-critical, and this is
> not the first time we hit these problems.

prom_init.c isn't in arch/powerpc/boot :)

It's linked into the kernel proper, but we want it to behave like a
pre-boot environment (because not all boot paths run it) which is why we
restrict what symbols it can call.

We could have a prom_memset() etc. but we'd need to do some tricks to
rewrite references to memset() to prom_memset() before linking.

cheers

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-18  3:52         ` Michael Ellerman
  0 siblings, 0 replies; 47+ messages in thread
From: Michael Ellerman @ 2022-07-18  3:52 UTC (permalink / raw)
  To: Segher Boessenkool, Linus Torvalds
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening,
	linuxppc-dev, Sudip Mukherjee

Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Sun, Jul 17, 2022 at 07:44:22AM -0700, Linus Torvalds wrote:
>> On Sun, Jul 17, 2022 at 2:13 AM Sudip Mukherjee
>> <sudipm.mukherjee@gmail.com> wrote:
>> > I was trying to check it. With gcc-11 the assembly code generated is
>> > not using memset, but using __memset.
>> > But with gcc-12, I can see the assembly code is using memset. One
>> > example from the assembly:
>> 
>> You could try making the 'args' array in 'struct prom_args' be marked
>> 'volatile'.
>> 
>> Ie something like this:
>> 
>>   --- a/arch/powerpc/kernel/prom_init.c
>>   +++ b/arch/powerpc/kernel/prom_init.c
>>   @@ -115,6 +115,6 @@ struct prom_args {
>>            __be32 service;
>>            __be32 nargs;
>>            __be32 nret;
>>   -          __be32 args[10];
>>   +        volatile __be32 args[10];
>>    };
>> 
>> because I think it's just the compilers turning the small loop over
>> those fields into a "memset()".
>
> Yes.  See <https://gcc.gnu.org/onlinedocs/gcc/Standards.html#C-Language>
> near the end:
>   Most of the compiler support routines used by GCC are present in
>   libgcc, but there are a few exceptions. GCC requires the freestanding
>   environment provide memcpy, memmove, memset and memcmp. Finally, if
>   __builtin_trap is used, and the target does not implement the trap
>   pattern, then GCC emits a call to abort.
>
> Can't we simply have a small simple implementation of these functions in
> arch/powerpc/boot/?  This stuff is not performance-critical, and this is
> not the first time we hit these problems.

prom_init.c isn't in arch/powerpc/boot :)

It's linked into the kernel proper, but we want it to behave like a
pre-boot environment (because not all boot paths run it) which is why we
restrict what symbols it can call.

We could have a prom_memset() etc. but we'd need to do some tricks to
rewrite references to memset() to prom_memset() before linking.

cheers

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-17  9:12   ` Sudip Mukherjee
@ 2022-07-18  4:41     ` Michael Ellerman
  -1 siblings, 0 replies; 47+ messages in thread
From: Michael Ellerman @ 2022-07-18  4:41 UTC (permalink / raw)
  To: Sudip Mukherjee, Benjamin Herrenschmidt, Paul Mackerras, Kees Cook
  Cc: linuxppc-dev, linux-kernel, linux-hardening, Linus Torvalds

Sudip Mukherjee <sudipm.mukherjee@gmail.com> writes:
> On Thu, Jul 14, 2022 at 9:55 AM Sudip Mukherjee (Codethink)
> <sudipm.mukherjee@gmail.com> wrote:
>>
>> Hi All,
>>
>> Not sure if it has been reported before but the latest mainline kernel
>> branch fails to build for powerpc allmodconfig with gcc-12 and the error is:
>>
>> Error: External symbol 'memset' referenced from prom_init.c
>> make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1
>
> I was trying to check it. With gcc-11 the assembly code generated is
> not using memset, but using __memset.
> But with gcc-12, I can see the assembly code is using memset. One
> example from the assembly:
>
> call_prom:
>         .quad   .call_prom,.TOC.@tocbase,0
>         .previous
>         .size   call_prom,24
>         .type   .call_prom,@function
> .call_prom:
>         mflr 0           #,
>         std 29,-24(1)    #,
>         std 30,-16(1)    #,
>         std 31,-8(1)     #,
>         mr 29,3          # tmp166, service
>         mr 31,4          # nargs, tmp167
>         mr 30,5          # tmp168, nret
>  # arch/powerpc/kernel/prom_init.c:396:         struct prom_args args;
>         li 4,254                 #,

Here we load 254 into r4, which is the 2nd parameter to memset (c).

>         li 5,52          #,

This is r5, the 3rd parameter (n), ie. the size of the structure.

That tells us we're memsetting the entire structure, ie. the 10 x 4
bytes of args.args plus 3 x 4 bytes for the other members.

>  # arch/powerpc/kernel/prom_init.c:394: {
>         std 0,16(1)      #,
>         stdu 1,-208(1)   #,,
>  # arch/powerpc/kernel/prom_init.c:396:         struct prom_args args;
>         addi 3,1,112     # tmp174,,

Here we load (calculate) the address of "args" into r3, the first
parameter to memset.

>  # arch/powerpc/kernel/prom_init.c:394: {
>         std 9,304(1)     #,
>         std 10,312(1)    #,
>         std 6,280(1)     #,
>         std 7,288(1)     #,
>         std 8,296(1)     #,
>  # arch/powerpc/kernel/prom_init.c:396:         struct prom_args args;
>         bl .memset       #

So we're memsetting all of args to 254, not zero.

That's happening because allmodconfig with gcc 12 enables
CONFIG_INIT_STACK_ALL_PATTERN, whereas gcc 11 doesn't.

I think the simplest fix in the short term is to just disable stack
initialisation for prom_init.c. It only runs at boot so there's no real
security impact to disabling it.

cheers

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-18  4:41     ` Michael Ellerman
  0 siblings, 0 replies; 47+ messages in thread
From: Michael Ellerman @ 2022-07-18  4:41 UTC (permalink / raw)
  To: Sudip Mukherjee, Benjamin Herrenschmidt, Paul Mackerras, Kees Cook
  Cc: Linus Torvalds, linuxppc-dev, linux-kernel, linux-hardening

Sudip Mukherjee <sudipm.mukherjee@gmail.com> writes:
> On Thu, Jul 14, 2022 at 9:55 AM Sudip Mukherjee (Codethink)
> <sudipm.mukherjee@gmail.com> wrote:
>>
>> Hi All,
>>
>> Not sure if it has been reported before but the latest mainline kernel
>> branch fails to build for powerpc allmodconfig with gcc-12 and the error is:
>>
>> Error: External symbol 'memset' referenced from prom_init.c
>> make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1
>
> I was trying to check it. With gcc-11 the assembly code generated is
> not using memset, but using __memset.
> But with gcc-12, I can see the assembly code is using memset. One
> example from the assembly:
>
> call_prom:
>         .quad   .call_prom,.TOC.@tocbase,0
>         .previous
>         .size   call_prom,24
>         .type   .call_prom,@function
> .call_prom:
>         mflr 0           #,
>         std 29,-24(1)    #,
>         std 30,-16(1)    #,
>         std 31,-8(1)     #,
>         mr 29,3          # tmp166, service
>         mr 31,4          # nargs, tmp167
>         mr 30,5          # tmp168, nret
>  # arch/powerpc/kernel/prom_init.c:396:         struct prom_args args;
>         li 4,254                 #,

Here we load 254 into r4, which is the 2nd parameter to memset (c).

>         li 5,52          #,

This is r5, the 3rd parameter (n), ie. the size of the structure.

That tells us we're memsetting the entire structure, ie. the 10 x 4
bytes of args.args plus 3 x 4 bytes for the other members.

>  # arch/powerpc/kernel/prom_init.c:394: {
>         std 0,16(1)      #,
>         stdu 1,-208(1)   #,,
>  # arch/powerpc/kernel/prom_init.c:396:         struct prom_args args;
>         addi 3,1,112     # tmp174,,

Here we load (calculate) the address of "args" into r3, the first
parameter to memset.

>  # arch/powerpc/kernel/prom_init.c:394: {
>         std 9,304(1)     #,
>         std 10,312(1)    #,
>         std 6,280(1)     #,
>         std 7,288(1)     #,
>         std 8,296(1)     #,
>  # arch/powerpc/kernel/prom_init.c:396:         struct prom_args args;
>         bl .memset       #

So we're memsetting all of args to 254, not zero.

That's happening because allmodconfig with gcc 12 enables
CONFIG_INIT_STACK_ALL_PATTERN, whereas gcc 11 doesn't.

I think the simplest fix in the short term is to just disable stack
initialisation for prom_init.c. It only runs at boot so there's no real
security impact to disabling it.

cheers

^ permalink raw reply	[flat|nested] 47+ messages in thread

* RE: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-18  4:41     ` Michael Ellerman
@ 2022-07-18  7:51       ` David Laight
  -1 siblings, 0 replies; 47+ messages in thread
From: David Laight @ 2022-07-18  7:51 UTC (permalink / raw)
  To: 'Michael Ellerman',
	Sudip Mukherjee, Benjamin Herrenschmidt, Paul Mackerras,
	Kees Cook
  Cc: linuxppc-dev, linux-kernel, linux-hardening, Linus Torvalds

From: Michael Ellerman
> Sent: 18 July 2022 05:41
...
> So we're memsetting all of args to 254, not zero.
> 
> That's happening because allmodconfig with gcc 12 enables
> CONFIG_INIT_STACK_ALL_PATTERN, whereas gcc 11 doesn't.

I can't help feeling it would be better if that generated
a call to a memset64() function.
Saving loads of tests at the top of the function,
and (most of?) the constant expansion to 64bit.

Although and explicit 'stack clear' function would be better
for the kernel - since it would give the option of patching
it away at startup.

I really can't help feeling that initialising on-stack
arrays will kill performance.
While kernel stack frames have to be relatively small,
in userspace very large on-stack arrays can be allocated
(and correctly bound checked) knowing that the cost is
minimal (maybe a TLB miss).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 47+ messages in thread

* RE: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-18  7:51       ` David Laight
  0 siblings, 0 replies; 47+ messages in thread
From: David Laight @ 2022-07-18  7:51 UTC (permalink / raw)
  To: 'Michael Ellerman',
	Sudip Mukherjee, Benjamin Herrenschmidt, Paul Mackerras,
	Kees Cook
  Cc: Linus Torvalds, linuxppc-dev, linux-kernel, linux-hardening

From: Michael Ellerman
> Sent: 18 July 2022 05:41
...
> So we're memsetting all of args to 254, not zero.
> 
> That's happening because allmodconfig with gcc 12 enables
> CONFIG_INIT_STACK_ALL_PATTERN, whereas gcc 11 doesn't.

I can't help feeling it would be better if that generated
a call to a memset64() function.
Saving loads of tests at the top of the function,
and (most of?) the constant expansion to 64bit.

Although and explicit 'stack clear' function would be better
for the kernel - since it would give the option of patching
it away at startup.

I really can't help feeling that initialising on-stack
arrays will kill performance.
While kernel stack frames have to be relatively small,
in userspace very large on-stack arrays can be allocated
(and correctly bound checked) knowing that the cost is
minimal (maybe a TLB miss).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init
  2022-07-18  4:41     ` Michael Ellerman
@ 2022-07-18 13:44       ` Michael Ellerman
  -1 siblings, 0 replies; 47+ messages in thread
From: Michael Ellerman @ 2022-07-18 13:44 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: sudipm.mukherjee, keescook, linux-kernel, linux-hardening, torvalds

With GCC 12 allmodconfig prom_init fails to build:

  Error: External symbol 'memset' referenced from prom_init.c
  make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1

The allmodconfig build enables KASAN, so all calls to memset in
prom_init should be converted to __memset by the #ifdefs in
asm/string.h, because prom_init must use the non-KASAN instrumented
versions.

The build failure happens because there's a call to memset that hasn't
been caught by the pre-processor and converted to __memset. Typically
that's because it's a memset generated by the compiler itself, and that
is the case here.

With GCC 12, allmodconfig enables CONFIG_INIT_STACK_ALL_PATTERN, which
causes the compiler to emit memset calls to initialise on-stack
variables with a pattern.

Because prom_init is non-user-facing boot-time only code, as a
workaround just disable stack variable initialisation to unbreak the
build.

Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index f91f0f29a566..c8cf924bf9c0 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -20,6 +20,7 @@ CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 CFLAGS_prom_init.o += -fno-stack-protector
 CFLAGS_prom_init.o += -DDISABLE_BRANCH_PROFILING
 CFLAGS_prom_init.o += -ffreestanding
+CFLAGS_prom_init.o += $(call cc-option, -ftrivial-auto-var-init=uninitialized)
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not trace early boot code
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init
@ 2022-07-18 13:44       ` Michael Ellerman
  0 siblings, 0 replies; 47+ messages in thread
From: Michael Ellerman @ 2022-07-18 13:44 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, torvalds, linux-hardening, sudipm.mukherjee, keescook

With GCC 12 allmodconfig prom_init fails to build:

  Error: External symbol 'memset' referenced from prom_init.c
  make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1

The allmodconfig build enables KASAN, so all calls to memset in
prom_init should be converted to __memset by the #ifdefs in
asm/string.h, because prom_init must use the non-KASAN instrumented
versions.

The build failure happens because there's a call to memset that hasn't
been caught by the pre-processor and converted to __memset. Typically
that's because it's a memset generated by the compiler itself, and that
is the case here.

With GCC 12, allmodconfig enables CONFIG_INIT_STACK_ALL_PATTERN, which
causes the compiler to emit memset calls to initialise on-stack
variables with a pattern.

Because prom_init is non-user-facing boot-time only code, as a
workaround just disable stack variable initialisation to unbreak the
build.

Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index f91f0f29a566..c8cf924bf9c0 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -20,6 +20,7 @@ CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 CFLAGS_prom_init.o += -fno-stack-protector
 CFLAGS_prom_init.o += -DDISABLE_BRANCH_PROFILING
 CFLAGS_prom_init.o += -ffreestanding
+CFLAGS_prom_init.o += $(call cc-option, -ftrivial-auto-var-init=uninitialized)
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not trace early boot code
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-18  3:52         ` Michael Ellerman
@ 2022-07-18 14:56           ` Segher Boessenkool
  -1 siblings, 0 replies; 47+ messages in thread
From: Segher Boessenkool @ 2022-07-18 14:56 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Kees Cook, linuxppc-dev, linux-kernel, Paul Mackerras,
	linux-hardening, Linus Torvalds, Sudip Mukherjee

On Mon, Jul 18, 2022 at 01:52:38PM +1000, Michael Ellerman wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > Can't we simply have a small simple implementation of these functions in
> > arch/powerpc/boot/?  This stuff is not performance-critical, and this is
> > not the first time we hit these problems.
> 
> prom_init.c isn't in arch/powerpc/boot :)

Ah duh :-)

> It's linked into the kernel proper, but we want it to behave like a
> pre-boot environment (because not all boot paths run it) which is why we
> restrict what symbols it can call.
> 
> We could have a prom_memset() etc. but we'd need to do some tricks to
> rewrite references to memset() to prom_memset() before linking.

You can do it in its linker script?


Segher

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-18 14:56           ` Segher Boessenkool
  0 siblings, 0 replies; 47+ messages in thread
From: Segher Boessenkool @ 2022-07-18 14:56 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Linus Torvalds, Sudip Mukherjee, Kees Cook, linux-kernel,
	Paul Mackerras, linux-hardening, linuxppc-dev

On Mon, Jul 18, 2022 at 01:52:38PM +1000, Michael Ellerman wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > Can't we simply have a small simple implementation of these functions in
> > arch/powerpc/boot/?  This stuff is not performance-critical, and this is
> > not the first time we hit these problems.
> 
> prom_init.c isn't in arch/powerpc/boot :)

Ah duh :-)

> It's linked into the kernel proper, but we want it to behave like a
> pre-boot environment (because not all boot paths run it) which is why we
> restrict what symbols it can call.
> 
> We could have a prom_memset() etc. but we'd need to do some tricks to
> rewrite references to memset() to prom_memset() before linking.

You can do it in its linker script?


Segher

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init
  2022-07-18 13:44       ` Michael Ellerman
@ 2022-07-18 15:03         ` Sudip Mukherjee
  -1 siblings, 0 replies; 47+ messages in thread
From: Sudip Mukherjee @ 2022-07-18 15:03 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, Kees Cook, linux-kernel, linux-hardening, Linus Torvalds

On Mon, Jul 18, 2022 at 2:44 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> With GCC 12 allmodconfig prom_init fails to build:
>
>   Error: External symbol 'memset' referenced from prom_init.c
>   make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1
>

<snip>

>
> Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

And, this has fixed the build failure.
Thanks Michael.


-- 
Regards
Sudip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init
@ 2022-07-18 15:03         ` Sudip Mukherjee
  0 siblings, 0 replies; 47+ messages in thread
From: Sudip Mukherjee @ 2022-07-18 15:03 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linux-hardening, Linus Torvalds, linuxppc-dev, Kees Cook, linux-kernel

On Mon, Jul 18, 2022 at 2:44 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> With GCC 12 allmodconfig prom_init fails to build:
>
>   Error: External symbol 'memset' referenced from prom_init.c
>   make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1
>

<snip>

>
> Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

And, this has fixed the build failure.
Thanks Michael.


-- 
Regards
Sudip

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init
  2022-07-18 13:44       ` Michael Ellerman
@ 2022-07-18 18:34         ` Linus Torvalds
  -1 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-18 18:34 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, Sudip Mukherjee, Kees Cook,
	Linux Kernel Mailing List, linux-hardening

On Mon, Jul 18, 2022 at 6:44 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> With GCC 12, allmodconfig enables CONFIG_INIT_STACK_ALL_PATTERN, which
> causes the compiler to emit memset calls to initialise on-stack
> variables with a pattern.

Ahh, and that explains why "volatile" made no difference. That did
seem very odd.

Thanks for figuring it out,

                   Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init
@ 2022-07-18 18:34         ` Linus Torvalds
  0 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-18 18:34 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linux-hardening, Linux Kernel Mailing List, linuxppc-dev,
	Sudip Mukherjee, Kees Cook

On Mon, Jul 18, 2022 at 6:44 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> With GCC 12, allmodconfig enables CONFIG_INIT_STACK_ALL_PATTERN, which
> causes the compiler to emit memset calls to initialise on-stack
> variables with a pattern.

Ahh, and that explains why "volatile" made no difference. That did
seem very odd.

Thanks for figuring it out,

                   Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-18  4:41     ` Michael Ellerman
@ 2022-07-18 19:06       ` Linus Torvalds
  -1 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-18 19:06 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Sudip Mukherjee, Benjamin Herrenschmidt, Paul Mackerras,
	Kees Cook, linuxppc-dev, linux-kernel, linux-hardening

On Sun, Jul 17, 2022 at 9:41 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> >         li 4,254                 #,
>
> Here we load 254 into r4, which is the 2nd parameter to memset (c).

I love how even powerpc people know that "4" is bogus, and have to
make it clear that it means "r4".

I don't understand why the powerpc assembler is so messed up, and uses
random integer constants for register "names".

And it gets even worse, when you start mixing FP, vector and integer "names".

I've seen many bad assemblers (in fact, I have *written* a couple of
bad assemblers myself), but I have never seen anything quite that
broken on any other architecture.

Oddities, yes ("$" as a prefix for register? Alpha asm is also very
odd), but nothing *quite* as broken as "simple constants have entirely
different meanings depending on the exact instruction and argument
position".

It's not even an IBM thing. S390 uses perfectly sane register syntax,
and calls things '%r4" etc.

The human-written asm files have those #define's in headers just to
make things slightly more legible, because apparently the assembler
doesn't even *accept* the sane names. So it's not even a "the compiler
generates this abbreviated illegible mess". It's literally that the
assembler is so horrid.

Why do people put up with that?

               Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-18 19:06       ` Linus Torvalds
  0 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-18 19:06 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening,
	linuxppc-dev, Sudip Mukherjee

On Sun, Jul 17, 2022 at 9:41 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> >         li 4,254                 #,
>
> Here we load 254 into r4, which is the 2nd parameter to memset (c).

I love how even powerpc people know that "4" is bogus, and have to
make it clear that it means "r4".

I don't understand why the powerpc assembler is so messed up, and uses
random integer constants for register "names".

And it gets even worse, when you start mixing FP, vector and integer "names".

I've seen many bad assemblers (in fact, I have *written* a couple of
bad assemblers myself), but I have never seen anything quite that
broken on any other architecture.

Oddities, yes ("$" as a prefix for register? Alpha asm is also very
odd), but nothing *quite* as broken as "simple constants have entirely
different meanings depending on the exact instruction and argument
position".

It's not even an IBM thing. S390 uses perfectly sane register syntax,
and calls things '%r4" etc.

The human-written asm files have those #define's in headers just to
make things slightly more legible, because apparently the assembler
doesn't even *accept* the sane names. So it's not even a "the compiler
generates this abbreviated illegible mess". It's literally that the
assembler is so horrid.

Why do people put up with that?

               Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-18 19:06       ` Linus Torvalds
@ 2022-07-18 22:08         ` Segher Boessenkool
  -1 siblings, 0 replies; 47+ messages in thread
From: Segher Boessenkool @ 2022-07-18 22:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening,
	linuxppc-dev, Sudip Mukherjee

On Mon, Jul 18, 2022 at 12:06:52PM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 9:41 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
> > >         li 4,254                 #,
> >
> > Here we load 254 into r4, which is the 2nd parameter to memset (c).
> 
> I love how even powerpc people know that "4" is bogus, and have to
> make it clear that it means "r4".

This is compiler output.  Compiler output is mainly meant for the
assembler to produce object code from.  It isn't meant to be readable
(and e.g. -fverbose-asm didn't help much here, that's the "#," ;-) ).

The mnemonic determines what the operands mean.  It is much easier to
read and write "li 4,254" than "li r4,254" or "li %r4,254", all of which
are valid.  You can also write "li 3+1,2*127", but not with the other
forms (this is useful if you use assembler macros, which are way more
powerful and appropriate than abusing the C preprocessor, when writing
assembler code).

It matters more if you have three or four or five or six operands to an
assembler instruction, all the extra line noise makes things illegible.

The "%r4" variant hails from winnt.  It is a bit problematic in inline
assembler, because you need to escape the % in extended inline asm, but
not in basic inline asm.  It also is pure line noise to read.

The "r4" variant is problematic if you have symbols named the same.
When you use the -mregnames assembler option it is taken to mean the
register; you can write "(r6)" to mean the symbol.  (There also are "sp"
and "rtos" and "xer" and whatnot, not just "r4").

> I don't understand why the powerpc assembler is so messed up, and uses
> random integer constants for register "names".

360 was the same.  370 was the same.  390 is the same.  801 was the
same.  RIOS (aka POWER) was the same.  So yes, PowerPC inherited it, I
don't know how much thought was put into this, don't change a winning
team etc.

> And it gets even worse, when you start mixing FP, vector and integer "names".

It is clear from the mnemonic what the operands are: some register, an
immediate, a constant, etc.  An expression (which can include object
symbols) can be any of those.

Assembler language is unforgiving.  It isn't easy to write, and most
mistakes will not be diagnosed.  If the assmbler language makes it
easier to read the code, that makes it more likely correct code will be
written, and that correct code will be written in less time.

> I've seen many bad assemblers (in fact, I have *written* a couple of
> bad assemblers myself), but I have never seen anything quite that
> broken on any other architecture.
> 
> Oddities, yes ("$" as a prefix for register? Alpha asm is also very
> odd), but nothing *quite* as broken as "simple constants have entirely
> different meanings depending on the exact instruction and argument
> position".

What is broken about that?  It makes everything very consistent, and
very readable.  Sigils are just nasty, and having the register names the
same as valid symbol names is also problematic.

> It's not even an IBM thing. S390 uses perfectly sane register syntax,
> and calls things '%r4" etc.

s390 has the same syntax, and even inherited the GAS code for this from
the ppc port.

> The human-written asm files have those #define's in headers just to
> make things slightly more legible, because apparently the assembler
> doesn't even *accept* the sane names.

That was true a long time ago.  And the "#define r0 0" thing caused
quite a few bugs itself btw.

> So it's not even a "the compiler
> generates this abbreviated illegible mess". It's literally that the
> assembler is so horrid.

The disassembler has shown "r4" etc. by default since ages.  The
assembler needs -mregnames to accept it; enabling this by default would
be a compatibility break, not acceptable.

> Why do people put up with that?

Why are people misinformed?

Is there anything in particular in the documentation we could improve?


Hope this helps,


Segher

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-18 22:08         ` Segher Boessenkool
  0 siblings, 0 replies; 47+ messages in thread
From: Segher Boessenkool @ 2022-07-18 22:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Michael Ellerman, Kees Cook, linux-kernel, Paul Mackerras,
	linux-hardening, linuxppc-dev, Sudip Mukherjee

On Mon, Jul 18, 2022 at 12:06:52PM -0700, Linus Torvalds wrote:
> On Sun, Jul 17, 2022 at 9:41 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
> > >         li 4,254                 #,
> >
> > Here we load 254 into r4, which is the 2nd parameter to memset (c).
> 
> I love how even powerpc people know that "4" is bogus, and have to
> make it clear that it means "r4".

This is compiler output.  Compiler output is mainly meant for the
assembler to produce object code from.  It isn't meant to be readable
(and e.g. -fverbose-asm didn't help much here, that's the "#," ;-) ).

The mnemonic determines what the operands mean.  It is much easier to
read and write "li 4,254" than "li r4,254" or "li %r4,254", all of which
are valid.  You can also write "li 3+1,2*127", but not with the other
forms (this is useful if you use assembler macros, which are way more
powerful and appropriate than abusing the C preprocessor, when writing
assembler code).

It matters more if you have three or four or five or six operands to an
assembler instruction, all the extra line noise makes things illegible.

The "%r4" variant hails from winnt.  It is a bit problematic in inline
assembler, because you need to escape the % in extended inline asm, but
not in basic inline asm.  It also is pure line noise to read.

The "r4" variant is problematic if you have symbols named the same.
When you use the -mregnames assembler option it is taken to mean the
register; you can write "(r6)" to mean the symbol.  (There also are "sp"
and "rtos" and "xer" and whatnot, not just "r4").

> I don't understand why the powerpc assembler is so messed up, and uses
> random integer constants for register "names".

360 was the same.  370 was the same.  390 is the same.  801 was the
same.  RIOS (aka POWER) was the same.  So yes, PowerPC inherited it, I
don't know how much thought was put into this, don't change a winning
team etc.

> And it gets even worse, when you start mixing FP, vector and integer "names".

It is clear from the mnemonic what the operands are: some register, an
immediate, a constant, etc.  An expression (which can include object
symbols) can be any of those.

Assembler language is unforgiving.  It isn't easy to write, and most
mistakes will not be diagnosed.  If the assmbler language makes it
easier to read the code, that makes it more likely correct code will be
written, and that correct code will be written in less time.

> I've seen many bad assemblers (in fact, I have *written* a couple of
> bad assemblers myself), but I have never seen anything quite that
> broken on any other architecture.
> 
> Oddities, yes ("$" as a prefix for register? Alpha asm is also very
> odd), but nothing *quite* as broken as "simple constants have entirely
> different meanings depending on the exact instruction and argument
> position".

What is broken about that?  It makes everything very consistent, and
very readable.  Sigils are just nasty, and having the register names the
same as valid symbol names is also problematic.

> It's not even an IBM thing. S390 uses perfectly sane register syntax,
> and calls things '%r4" etc.

s390 has the same syntax, and even inherited the GAS code for this from
the ppc port.

> The human-written asm files have those #define's in headers just to
> make things slightly more legible, because apparently the assembler
> doesn't even *accept* the sane names.

That was true a long time ago.  And the "#define r0 0" thing caused
quite a few bugs itself btw.

> So it's not even a "the compiler
> generates this abbreviated illegible mess". It's literally that the
> assembler is so horrid.

The disassembler has shown "r4" etc. by default since ages.  The
assembler needs -mregnames to accept it; enabling this by default would
be a compatibility break, not acceptable.

> Why do people put up with that?

Why are people misinformed?

Is there anything in particular in the documentation we could improve?


Hope this helps,


Segher

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-18 22:08         ` Segher Boessenkool
@ 2022-07-18 22:55           ` Linus Torvalds
  -1 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-18 22:55 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Ellerman, Kees Cook, linux-kernel, Paul Mackerras,
	linux-hardening, linuxppc-dev, Sudip Mukherjee

On Mon, Jul 18, 2022 at 3:12 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> Assembler language is unforgiving.  It isn't easy to write, and most
> mistakes will not be diagnosed.  If the assmbler language makes it
> easier to read the code, that makes it more likely correct code will be
> written, and that correct code will be written in less time.

What's your argument? That it's unforgiving, so it has to be
unreadable and easy to make mistakes in too?

You can get the order of operands wrong, and it will still build -
just to completely the wrong thing.

> > Oddities, yes ("$" as a prefix for register? Alpha asm is also very
> > odd), but nothing *quite* as broken as "simple constants have entirely
> > different meanings depending on the exact instruction and argument
> > position".
>
> What is broken about that?  It makes everything very consistent, and
> very readable.  Sigils are just nasty, and having the register names the
> same as valid symbol names is also problematic.

Oh, I agree that sigils are good to make the type clear. So '%r4' is
better than 'r4' because the latter could be ambiguous and you could
have a symbol 'r4'.

But just '4' is plain garbage.  There's no "could be ambiguous" about it.

Type checking matters. Not just in C. In asm too.

The reason '$0' is odd because it's *just* a sigil and a number.

Which certainly is not unusual in itself, but it reads like it's a
number to me. Not just because of x86 gas ('$' means 'immediate'), but
Z80 ('$' means HEX), or at least 'Nth argument' (shell).

And yeah, alpha got it from MIPS, afaik.

And presumably MIPS got it from "we're hacking up the simplest
assembler we can".

> > The human-written asm files have those #define's in headers just to
> > make things slightly more legible, because apparently the assembler
> > doesn't even *accept* the sane names.
>
> That was true a long time ago.  And the "#define r0 0" thing caused
> quite a few bugs itself btw.

Those #define's still exist. Look it up.

And yes, they are horrible, and they are wrong, and they shouldn't exist.

               Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-18 22:55           ` Linus Torvalds
  0 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2022-07-18 22:55 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening,
	linuxppc-dev, Sudip Mukherjee

On Mon, Jul 18, 2022 at 3:12 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> Assembler language is unforgiving.  It isn't easy to write, and most
> mistakes will not be diagnosed.  If the assmbler language makes it
> easier to read the code, that makes it more likely correct code will be
> written, and that correct code will be written in less time.

What's your argument? That it's unforgiving, so it has to be
unreadable and easy to make mistakes in too?

You can get the order of operands wrong, and it will still build -
just to completely the wrong thing.

> > Oddities, yes ("$" as a prefix for register? Alpha asm is also very
> > odd), but nothing *quite* as broken as "simple constants have entirely
> > different meanings depending on the exact instruction and argument
> > position".
>
> What is broken about that?  It makes everything very consistent, and
> very readable.  Sigils are just nasty, and having the register names the
> same as valid symbol names is also problematic.

Oh, I agree that sigils are good to make the type clear. So '%r4' is
better than 'r4' because the latter could be ambiguous and you could
have a symbol 'r4'.

But just '4' is plain garbage.  There's no "could be ambiguous" about it.

Type checking matters. Not just in C. In asm too.

The reason '$0' is odd because it's *just* a sigil and a number.

Which certainly is not unusual in itself, but it reads like it's a
number to me. Not just because of x86 gas ('$' means 'immediate'), but
Z80 ('$' means HEX), or at least 'Nth argument' (shell).

And yeah, alpha got it from MIPS, afaik.

And presumably MIPS got it from "we're hacking up the simplest
assembler we can".

> > The human-written asm files have those #define's in headers just to
> > make things slightly more legible, because apparently the assembler
> > doesn't even *accept* the sane names.
>
> That was true a long time ago.  And the "#define r0 0" thing caused
> quite a few bugs itself btw.

Those #define's still exist. Look it up.

And yes, they are horrible, and they are wrong, and they shouldn't exist.

               Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
  2022-07-18 19:06       ` Linus Torvalds
@ 2022-07-19 13:35         ` Michael Ellerman
  -1 siblings, 0 replies; 47+ messages in thread
From: Michael Ellerman @ 2022-07-19 13:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, linux-kernel, Paul Mackerras, linux-hardening,
	linuxppc-dev, Sudip Mukherjee

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Sun, Jul 17, 2022 at 9:41 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> >         li 4,254                 #,
>>
>> Here we load 254 into r4, which is the 2nd parameter to memset (c).
>
> I love how even powerpc people know that "4" is bogus, and have to
> make it clear that it means "r4".

I wouldn't say it's bogus, I was just translating from asm to English :)

But I agree it's preferable to use a proper register name rather than a
bare integer. I never write asm using bare integers, I always use r4 or
%r4, because as you say it's too easy to get mixed up otherwise.

When looking at generated code I usually use objdump -d output, which
uses the "r4" syntax.

> It's not even an IBM thing. S390 uses perfectly sane register syntax,
> and calls things '%r4" etc.

as accepts that syntax if you tell it to.

We use that syntax in some of our newer inline asm blocks.

> The human-written asm files have those #define's in headers just to
> make things slightly more legible, because apparently the assembler
> doesn't even *accept* the sane names.

I would like to switch to using %rX everywhere and get rid of those
defines, but it's never seemed like it's worth the churn. We have ~48K
lines of asm in arch/powerpc.

cheers

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: mainline build failure of powerpc allmodconfig for prom_init_check
@ 2022-07-19 13:35         ` Michael Ellerman
  0 siblings, 0 replies; 47+ messages in thread
From: Michael Ellerman @ 2022-07-19 13:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sudip Mukherjee, Benjamin Herrenschmidt, Paul Mackerras,
	Kees Cook, linuxppc-dev, linux-kernel, linux-hardening

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Sun, Jul 17, 2022 at 9:41 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> >         li 4,254                 #,
>>
>> Here we load 254 into r4, which is the 2nd parameter to memset (c).
>
> I love how even powerpc people know that "4" is bogus, and have to
> make it clear that it means "r4".

I wouldn't say it's bogus, I was just translating from asm to English :)

But I agree it's preferable to use a proper register name rather than a
bare integer. I never write asm using bare integers, I always use r4 or
%r4, because as you say it's too easy to get mixed up otherwise.

When looking at generated code I usually use objdump -d output, which
uses the "r4" syntax.

> It's not even an IBM thing. S390 uses perfectly sane register syntax,
> and calls things '%r4" etc.

as accepts that syntax if you tell it to.

We use that syntax in some of our newer inline asm blocks.

> The human-written asm files have those #define's in headers just to
> make things slightly more legible, because apparently the assembler
> doesn't even *accept* the sane names.

I would like to switch to using %rX everywhere and get rid of those
defines, but it's never seemed like it's worth the churn. We have ~48K
lines of asm in arch/powerpc.

cheers

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init
  2022-07-18 13:44       ` Michael Ellerman
                         ` (2 preceding siblings ...)
  (?)
@ 2022-07-27 12:02       ` Michael Ellerman
  -1 siblings, 0 replies; 47+ messages in thread
From: Michael Ellerman @ 2022-07-27 12:02 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev

On Mon, 18 Jul 2022 23:44:18 +1000, Michael Ellerman wrote:
> With GCC 12 allmodconfig prom_init fails to build:
> 
>   Error: External symbol 'memset' referenced from prom_init.c
>   make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1
> 
> The allmodconfig build enables KASAN, so all calls to memset in
> prom_init should be converted to __memset by the #ifdefs in
> asm/string.h, because prom_init must use the non-KASAN instrumented
> versions.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/64s: Disable stack variable initialisation for prom_init
      https://git.kernel.org/powerpc/c/be640317a1d0b9cf42fedb2debc2887a7cfa38de

cheers

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2022-07-27 12:03 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-14  8:55 mainline build failure of powerpc allmodconfig for prom_init_check Sudip Mukherjee (Codethink)
2022-07-14  8:55 ` Sudip Mukherjee (Codethink)
2022-07-17  9:12 ` Sudip Mukherjee
2022-07-17  9:12   ` Sudip Mukherjee
2022-07-17 14:44   ` Linus Torvalds
2022-07-17 14:44     ` Linus Torvalds
2022-07-17 19:54     ` Segher Boessenkool
2022-07-17 19:54       ` Segher Boessenkool
2022-07-18  3:52       ` Michael Ellerman
2022-07-18  3:52         ` Michael Ellerman
2022-07-18 14:56         ` Segher Boessenkool
2022-07-18 14:56           ` Segher Boessenkool
2022-07-17 20:25     ` Sudip Mukherjee
2022-07-17 20:25       ` Sudip Mukherjee
2022-07-17 20:29       ` Linus Torvalds
2022-07-17 20:29         ` Linus Torvalds
2022-07-17 20:38         ` Sudip Mukherjee
2022-07-17 20:38           ` Sudip Mukherjee
2022-07-17 20:56           ` Linus Torvalds
2022-07-17 20:56             ` Linus Torvalds
2022-07-17 20:56         ` Segher Boessenkool
2022-07-17 20:56           ` Segher Boessenkool
2022-07-17 21:11           ` Linus Torvalds
2022-07-17 21:11             ` Linus Torvalds
2022-07-17 21:45             ` Segher Boessenkool
2022-07-17 21:45               ` Segher Boessenkool
2022-07-18  1:38               ` Linus Torvalds
2022-07-18  1:38                 ` Linus Torvalds
2022-07-18  4:41   ` Michael Ellerman
2022-07-18  4:41     ` Michael Ellerman
2022-07-18  7:51     ` David Laight
2022-07-18  7:51       ` David Laight
2022-07-18 13:44     ` [PATCH] powerpc/64s: Disable stack variable initialisation for prom_init Michael Ellerman
2022-07-18 13:44       ` Michael Ellerman
2022-07-18 15:03       ` Sudip Mukherjee
2022-07-18 15:03         ` Sudip Mukherjee
2022-07-18 18:34       ` Linus Torvalds
2022-07-18 18:34         ` Linus Torvalds
2022-07-27 12:02       ` Michael Ellerman
2022-07-18 19:06     ` mainline build failure of powerpc allmodconfig for prom_init_check Linus Torvalds
2022-07-18 19:06       ` Linus Torvalds
2022-07-18 22:08       ` Segher Boessenkool
2022-07-18 22:08         ` Segher Boessenkool
2022-07-18 22:55         ` Linus Torvalds
2022-07-18 22:55           ` Linus Torvalds
2022-07-19 13:35       ` Michael Ellerman
2022-07-19 13:35         ` Michael Ellerman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.