* [PATCH] arm64/io: Remind compiler that there is a memory side effect
@ 2022-04-01 16:44 Jeremy Linton
  2022-04-01 17:22 ` Mark Rutland
  0 siblings, 1 reply; 14+ messages in thread
From: Jeremy Linton @ 2022-04-01 16:44 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: gcc, catalin.marinas, will, marcan, maz, linux-kernel,
	mark.rutland, szabolcs.nagy, f.fainelli, opendmb, Jeremy Linton

The relaxed variants of the read/write macros are only declared
as `asm volatile()`, which forces the compiler to emit the
instruction in the code path as intended. The problem is that
this doesn't also tell the compiler that there may be memory
side effects, so if a function is composed entirely of relaxed
I/O operations, the compiler may conclude that it only has
register side effects and doesn't need to be called.

For an example, look at bcmgenet_enable_dma() before the relaxed
variants were removed from it. When built with GCC 12, the
generated code contains the asm blocks as expected, but the
function itself is never called.
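
A minimal sketch of the affected shape (the function name and
register offset below are made up for illustration; the real
bcmgenet_enable_dma() differs):

	/* Built only from relaxed MMIO accesses: with no "memory"
	 * clobber in the accessors, the optimizer may conclude this
	 * function has no observable side effects and elide calls
	 * to it entirely.
	 */
	static void example_enable_dma(void __iomem *base)
	{
		u32 reg = readl_relaxed(base + 0x10); /* hypothetical offset */

		writel_relaxed(reg | 0x1, base + 0x10);
	}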

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/include/asm/io.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 7fd836bea7eb..3cceda7948a0 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -24,25 +24,25 @@
 #define __raw_writeb __raw_writeb
 static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
 {
-	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
+	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
 }
 
 #define __raw_writew __raw_writew
 static inline void __raw_writew(u16 val, volatile void __iomem *addr)
 {
-	asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
+	asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
 }
 
 #define __raw_writel __raw_writel
 static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
 {
-	asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
+	asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
 }
 
 #define __raw_writeq __raw_writeq
 static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
 {
-	asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
+	asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
 }
 
 #define __raw_readb __raw_readb
-- 
2.35.1



* Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect
  2022-04-01 16:44 [PATCH] arm64/io: Remind compiler that there is a memory side effect Jeremy Linton
@ 2022-04-01 17:22 ` Mark Rutland
  2022-04-03  7:36   ` Andrew Pinski
  2022-04-05 12:51   ` GCC 12 miscompilation of volatile asm (was: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect) Mark Rutland
  0 siblings, 2 replies; 14+ messages in thread
From: Mark Rutland @ 2022-04-01 17:22 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-arm-kernel, gcc, catalin.marinas, will, marcan, maz,
	linux-kernel, szabolcs.nagy, f.fainelli, opendmb

Hi Jeremy,

Thanks for raising this.

On Fri, Apr 01, 2022 at 11:44:06AM -0500, Jeremy Linton wrote:
> The relaxed variants of the read/write macros are only declared
> as `asm volatile()`, which forces the compiler to emit the
> instruction in the code path as intended. The problem is that
> this doesn't also tell the compiler that there may be memory
> side effects, so if a function is composed entirely of relaxed
> I/O operations, the compiler may conclude that it only has
> register side effects and doesn't need to be called.

As I mentioned on a private mail, I don't think that reasoning above is
correct, and I think this is a miscompilation (i.e. a compiler bug).

The important thing is that any `asm volatile` may have side effects
generally outside of memory or GPRs, and whether the assembly contains a memory
load/store is immaterial. We should not need to add a memory clobber in order
to retain the volatile semantic.

See:
	
  https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile

... and consider the x86 example that reads rdtsc, or an arm64 sequence like:

| void do_sysreg_thing(void)
| {
| 	unsigned long tmp;
| 	
| 	tmp = read_sysreg(some_reg);
| 	tmp |= SOME_BIT;
| 	write_sysreg(tmp, some_reg);
| }

... where there's no memory that we should need to hazard against.

This patch might work around the issue, but I don't believe it is a correct fix.

> For an example, look at bcmgenet_enable_dma() before the relaxed
> variants were removed from it. When built with GCC 12, the
> generated code contains the asm blocks as expected, but the
> function itself is never called.

So it sounds like this is a regression in GCC 12, which IIUC isn't released yet
per:

  https://gcc.gnu.org/gcc-12/changes.html

... which says:

| Note: GCC 12 has not been released yet

Surely we can fix it prior to release?

Thanks,
Mark.

> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  arch/arm64/include/asm/io.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
> index 7fd836bea7eb..3cceda7948a0 100644
> --- a/arch/arm64/include/asm/io.h
> +++ b/arch/arm64/include/asm/io.h
> @@ -24,25 +24,25 @@
>  #define __raw_writeb __raw_writeb
>  static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
>  {
> -	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
> +	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
>  }
>  
>  #define __raw_writew __raw_writew
>  static inline void __raw_writew(u16 val, volatile void __iomem *addr)
>  {
> -	asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
> +	asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
>  }
>  
>  #define __raw_writel __raw_writel
>  static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
>  {
> -	asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
> +	asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
>  }
>  
>  #define __raw_writeq __raw_writeq
>  static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
>  {
> -	asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
> +	asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
>  }
>  
>  #define __raw_readb __raw_readb
> -- 
> 2.35.1
> 


* Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect
  2022-04-01 17:22 ` Mark Rutland
@ 2022-04-03  7:36   ` Andrew Pinski
  2022-04-03  7:47     ` Ard Biesheuvel
  2022-04-03 17:40     ` Doug Berger
  2022-04-05 12:51   ` GCC 12 miscompilation of volatile asm (was: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect) Mark Rutland
  1 sibling, 2 replies; 14+ messages in thread
From: Andrew Pinski @ 2022-04-03  7:36 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Jeremy Linton, GCC Mailing List, f.fainelli, maz, marcan, LKML,
	opendmb, Catalin Marinas, will, linux-arm-kernel

On Fri, Apr 1, 2022 at 10:24 AM Mark Rutland via Gcc <gcc@gcc.gnu.org> wrote:
>
> Hi Jeremy,
>
> Thanks for raising this.
>
> On Fri, Apr 01, 2022 at 11:44:06AM -0500, Jeremy Linton wrote:
> > The relaxed variants of the read/write macros are only declared
> > as `asm volatile()`, which forces the compiler to emit the
> > instruction in the code path as intended. The problem is that
> > this doesn't also tell the compiler that there may be memory
> > side effects, so if a function is composed entirely of relaxed
> > I/O operations, the compiler may conclude that it only has
> > register side effects and doesn't need to be called.
>
> As I mentioned on a private mail, I don't think that reasoning above is
> correct, and I think this is a miscompilation (i.e. a compiler bug).
>
> The important thing is that any `asm volatile` may have side effects
> generally outside of memory or GPRs, and whether the assembly contains a memory
> load/store is immaterial. We should not need to add a memory clobber in order
> to retain the volatile semantic.
>
> See:
>
>   https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
>
> ... and consider the x86 example that reads rdtsc, or an arm64 sequence like:
>
> | void do_sysreg_thing(void)
> | {
> |       unsigned long tmp;
> |
> |       tmp = read_sysreg(some_reg);
> |       tmp |= SOME_BIT;
> |       write_sysreg(tmp, some_reg);
> | }
>
> ... where there's no memory that we should need to hazard against.
>
> This patch might work around the issue, but I don't believe it is a correct fix.

It might not be the most targeted fix, but it is a fix.
The best fix is to tell the compiler that you are writing to that
location of memory.
volatile asm does not do what you think it does.
You didn't read further down about memory clobbers:
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers
Specifically this part:
The "memory" clobber tells the compiler that the assembly code
performs memory reads or writes to items other than those listed in
the input and output operands

>
> > For an example, look at bcmgenet_enable_dma() before the relaxed
> > variants were removed from it. When built with GCC 12, the
> > generated code contains the asm blocks as expected, but the
> > function itself is never called.
>
> So it sounds like this is a regression in GCC 12, which IIUC isn't released yet
> per:

It is NOT a bug in GCC 12. You just depended on behavior which
accidentally worked in the cases you were looking at. GCC 12 did not
even change in this area.

Thanks,
Andrew Pinski

>
>   https://gcc.gnu.org/gcc-12/changes.html
>
> ... which says:
>
> | Note: GCC 12 has not been released yet
>
> Surely we can fix it prior to release?
>
> Thanks,
> Mark.
>
> >
> > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > ---
> >  arch/arm64/include/asm/io.h | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
> > index 7fd836bea7eb..3cceda7948a0 100644
> > --- a/arch/arm64/include/asm/io.h
> > +++ b/arch/arm64/include/asm/io.h
> > @@ -24,25 +24,25 @@
> >  #define __raw_writeb __raw_writeb
> >  static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
> >  {
> > -     asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
> > +     asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
> >  }
> >
> >  #define __raw_writew __raw_writew
> >  static inline void __raw_writew(u16 val, volatile void __iomem *addr)
> >  {
> > -     asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
> > +     asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
> >  }
> >
> >  #define __raw_writel __raw_writel
> >  static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
> >  {
> > -     asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
> > +     asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
> >  }
> >
> >  #define __raw_writeq __raw_writeq
> >  static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
> >  {
> > -     asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
> > +     asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
> >  }
> >
> >  #define __raw_readb __raw_readb
> > --
> > 2.35.1
> >


* Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect
  2022-04-03  7:36   ` Andrew Pinski
@ 2022-04-03  7:47     ` Ard Biesheuvel
  2022-04-03  7:47       ` Ard Biesheuvel
  2022-04-03 17:40     ` Doug Berger
  1 sibling, 1 reply; 14+ messages in thread
From: Ard Biesheuvel @ 2022-04-03  7:47 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Mark Rutland, Jeremy Linton, GCC Mailing List, f.fainelli, maz,
	marcan, LKML, opendmb, Catalin Marinas, will, linux-arm-kernel

On Sun, 3 Apr 2022 at 09:38, Andrew Pinski <pinskia@gmail.com> wrote:
>
> On Fri, Apr 1, 2022 at 10:24 AM Mark Rutland via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > Hi Jeremy,
> >
> > Thanks for raising this.
> >
> > On Fri, Apr 01, 2022 at 11:44:06AM -0500, Jeremy Linton wrote:
> > > The relaxed variants of the read/write macros are only declared
> > > as `asm volatile()`, which forces the compiler to emit the
> > > instruction in the code path as intended. The problem is that
> > > this doesn't also tell the compiler that there may be memory
> > > side effects, so if a function is composed entirely of relaxed
> > > I/O operations, the compiler may conclude that it only has
> > > register side effects and doesn't need to be called.
> >
> > As I mentioned on a private mail, I don't think that reasoning above is
> > correct, and I think this is a miscompilation (i.e. a compiler bug).
> >
> > The important thing is that any `asm volatile` may have side effects
> > generally outside of memory or GPRs, and whether the assembly contains a memory
> > load/store is immaterial. We should not need to add a memory clobber in order
> > to retain the volatile semantic.
> >
> > See:
> >
> >   https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
> >
> > ... and consider the x86 example that reads rdtsc, or an arm64 sequence like:
> >
> > | void do_sysreg_thing(void)
> > | {
> > |       unsigned long tmp;
> > |
> > |       tmp = read_sysreg(some_reg);
> > |       tmp |= SOME_BIT;
> > |       write_sysreg(tmp, some_reg);
> > | }
> >
> > ... where there's no memory that we should need to hazard against.
> >
> > This patch might work around the issue, but I don't believe it is a correct fix.
>
> It might not be the most targeted fix, but it is a fix.
> The best fix is to tell the compiler that you are writing to that
> location of memory.
> volatile asm does not do what you think it does.
> You didn't read further down about memory clobbers:
> https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers
> Specifically this part:
> The "memory" clobber tells the compiler that the assembly code
> performs memory reads or writes to items other than those listed in
> the input and output operands
>

So should we be using "m"(*addr) instead of "r"(addr) here?

(along with the appropriately sized casts)


* Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect
  2022-04-03  7:47     ` Ard Biesheuvel
@ 2022-04-03  7:47       ` Ard Biesheuvel
  2022-04-04  9:14         ` Will Deacon
  0 siblings, 1 reply; 14+ messages in thread
From: Ard Biesheuvel @ 2022-04-03  7:47 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Mark Rutland, Jeremy Linton, GCC Mailing List, f.fainelli, maz,
	marcan, LKML, opendmb, Catalin Marinas, will, linux-arm-kernel

On Sun, 3 Apr 2022 at 09:47, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Sun, 3 Apr 2022 at 09:38, Andrew Pinski <pinskia@gmail.com> wrote:
> >
> > On Fri, Apr 1, 2022 at 10:24 AM Mark Rutland via Gcc <gcc@gcc.gnu.org> wrote:
> > >
> > > Hi Jeremy,
> > >
> > > Thanks for raising this.
> > >
> > > On Fri, Apr 01, 2022 at 11:44:06AM -0500, Jeremy Linton wrote:
> > > > The relaxed variants of the read/write macros are only declared
> > > > as `asm volatile()`, which forces the compiler to emit the
> > > > instruction in the code path as intended. The problem is that
> > > > this doesn't also tell the compiler that there may be memory
> > > > side effects, so if a function is composed entirely of relaxed
> > > > I/O operations, the compiler may conclude that it only has
> > > > register side effects and doesn't need to be called.
> > >
> > > As I mentioned on a private mail, I don't think that reasoning above is
> > > correct, and I think this is a miscompilation (i.e. a compiler bug).
> > >
> > > The important thing is that any `asm volatile` may have side effects
> > > generally outside of memory or GPRs, and whether the assembly contains a memory
> > > load/store is immaterial. We should not need to add a memory clobber in order
> > > to retain the volatile semantic.
> > >
> > > See:
> > >
> > >   https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
> > >
> > > ... and consider the x86 example that reads rdtsc, or an arm64 sequence like:
> > >
> > > | void do_sysreg_thing(void)
> > > | {
> > > |       unsigned long tmp;
> > > |
> > > |       tmp = read_sysreg(some_reg);
> > > |       tmp |= SOME_BIT;
> > > |       write_sysreg(tmp, some_reg);
> > > | }
> > >
> > > ... where there's no memory that we should need to hazard against.
> > >
> > > This patch might work around the issue, but I don't believe it is a correct fix.
> >
> > It might not be the most targeted fix, but it is a fix.
> > The best fix is to tell the compiler that you are writing to that
> > location of memory.
> > volatile asm does not do what you think it does.
> > You didn't read further down about memory clobbers:
> > https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers
> > Specifically this part:
> > The "memory" clobber tells the compiler that the assembly code
> > performs memory reads or writes to items other than those listed in
> > the input and output operands
> >
>
> So should we be using "m"(*addr) instead of "r"(addr) here?
>
> (along with the appropriately sized casts)

I mean "=m" not "m"
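
A hedged sketch of what that variant could look like (untested; the
cast and the operand template are assumptions rather than a
compile-checked proposal):

| #define __raw_writel __raw_writel
| static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
| {
| 	/* "=m" tells the compiler exactly which location is written,
| 	 * instead of clobbering all of memory. */
| 	asm volatile("str %w1, %0"
| 		     : "=m" (*(volatile u32 __force *)addr)
| 		     : "rZ" (val));
| }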


* Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect
  2022-04-03  7:36   ` Andrew Pinski
  2022-04-03  7:47     ` Ard Biesheuvel
@ 2022-04-03 17:40     ` Doug Berger
  1 sibling, 0 replies; 14+ messages in thread
From: Doug Berger @ 2022-04-03 17:40 UTC (permalink / raw)
  To: Andrew Pinski, Mark Rutland
  Cc: Jeremy Linton, GCC Mailing List, f.fainelli, maz, marcan, LKML,
	Catalin Marinas, will, linux-arm-kernel

On 4/3/2022 12:36 AM, Andrew Pinski wrote:
> On Fri, Apr 1, 2022 at 10:24 AM Mark Rutland via Gcc <gcc@gcc.gnu.org> wrote:
>>
>> Hi Jeremy,
>>
>> Thanks for raising this.
>>
>> On Fri, Apr 01, 2022 at 11:44:06AM -0500, Jeremy Linton wrote:
>>> The relaxed variants of the read/write macros are only declared
>>> as `asm volatile()`, which forces the compiler to emit the
>>> instruction in the code path as intended. The problem is that
>>> this doesn't also tell the compiler that there may be memory
>>> side effects, so if a function is composed entirely of relaxed
>>> I/O operations, the compiler may conclude that it only has
>>> register side effects and doesn't need to be called.
>>
>> As I mentioned on a private mail, I don't think that reasoning above is
>> correct, and I think this is a miscompilation (i.e. a compiler bug).
>>
>> The important thing is that any `asm volatile` may have side effects
>> generally outside of memory or GPRs, and whether the assembly contains a memory
>> load/store is immaterial. We should not need to add a memory clobber in order
>> to retain the volatile semantic.
>>
>> See:
>>
>>    https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
>>
>> ... and consider the x86 example that reads rdtsc, or an arm64 sequence like:
>>
>> | void do_sysreg_thing(void)
>> | {
>> |       unsigned long tmp;
>> |
>> |       tmp = read_sysreg(some_reg);
>> |       tmp |= SOME_BIT;
>> |       write_sysreg(tmp, some_reg);
>> | }
>>
>> ... where there's no memory that we should need to hazard against.
>>
>> This patch might work around the issue, but I don't believe it is a correct fix.

I agree with Mark that this patch is an attempt to work around a bug in 
the GCC 12 compiler.
> 
> It might not be the most targeted fix, but it is a fix.
> The best fix is to tell the compiler that you are writing to that
> location of memory.
> volatile asm does not do what you think it does.
> You didn't read further down about memory clobbers:
> https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers
> Specifically this part:
> The "memory" clobber tells the compiler that the assembly code
> performs memory reads or writes to items other than those listed in
> the input and output operands

I agree that volatile asm does not do what I think it should do in this
case, but it also does not do what the documentation states that it
should do, and that is the bug in GCC 12.

My interpretation of the referenced documentation is that the volatile 
qualifier of the asm keyword should prevent the GCC optimizer from 
performing certain optimizations. Of specific relevance in this scenario 
are the optimizers that "sometimes discard asm statements if they
determine there is no need for the output variables."

The clobbers tell the compiler about side effects or dependencies that
the asm block may have that could be relevant to code outside the asm
block, so that proper functionality can be preserved and the optimizer
can still do a good job. The functions in this patch do access memory
(well, technically device registers...) and therefore adding the
"memory" clobber is not "wrong", but the read variants of these
functions also access memory, so adding the "memory" clobber to them
would be equally appropriate (or inappropriate). This would not affect
the functionality, but it is "heavy-handed" and can have an unnecessary
effect on performance.

The "memory" clobber indicates that memory is somehow affected by the 
asm block and therefore requires the compiler to flush data in working 
registers to memory before the block and reload values from memory after 
the block. A better solution is to communicate the side effects more 
precisely to avoid operations that can be determined to be unnecessary.
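
As a minimal illustration of what the clobber forces (a sketch; the
empty asm body plus "memory" clobber is how the kernel's barrier() is
defined):

	int shared;

	void f(void)
	{
		shared = 1;
		asm volatile("" ::: "memory");	/* barrier() */
		shared = 2;	/* without the clobber, the first store
				 * could be eliminated as dead */
	}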

In the case of these functions, the only address accessed is in register 
space. Accesses to register space can have all kinds of side effects, 
but these side effects are communicated to the compiler by declaring the 
addr formal parameter with the volatile and __iomem attributes. In this 
way it is clear to the compiler that any writes to addr by code before 
the start of the asm block must occur before entering the block and any 
accesses to addr by code after the block must occur after executing the 
block, so the use of the "memory" clobber is unnecessary.

> 
>>
>>> For an example, look at bcmgenet_enable_dma() before the relaxed
>>> variants were removed from it. When built with GCC 12, the
>>> generated code contains the asm blocks as expected, but the
>>> function itself is never called.
>>
>> So it sounds like this is a regression in GCC 12, which IIUC isn't released yet
>> per:
> 
> It is NOT a bug in GCC 12. You just depended on behavior which
> accidentally worked in the cases you were looking at. GCC 12 did not
> even change in this area.

GCC 12 should not have changed in this area, but the evidence suggests 
that in fact the behavior has changed such that an asm volatile block 
can be discarded by an optimizer. This appears unintentional and is 
therefore a bug that should be corrected before release of the toolchain 
since it could potentially affect any asm volatile block in the Linux 
source.

Regards,
     Doug

> 
> Thanks,
> Andrew Pinski
> 
>>
>>    https://gcc.gnu.org/gcc-12/changes.html
>>
>> ... which says:
>>
>> | Note: GCC 12 has not been released yet
>>
>> Surely we can fix it prior to release?
>>
>> Thanks,
>> Mark.
>>
>>>
>>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>>> ---
>>>   arch/arm64/include/asm/io.h | 8 ++++----
>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
>>> index 7fd836bea7eb..3cceda7948a0 100644
>>> --- a/arch/arm64/include/asm/io.h
>>> +++ b/arch/arm64/include/asm/io.h
>>> @@ -24,25 +24,25 @@
>>>   #define __raw_writeb __raw_writeb
>>>   static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
>>>   {
>>> -     asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
>>> +     asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
>>>   }
>>>
>>>   #define __raw_writew __raw_writew
>>>   static inline void __raw_writew(u16 val, volatile void __iomem *addr)
>>>   {
>>> -     asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
>>> +     asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
>>>   }
>>>
>>>   #define __raw_writel __raw_writel
>>>   static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
>>>   {
>>> -     asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
>>> +     asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
>>>   }
>>>
>>>   #define __raw_writeq __raw_writeq
>>>   static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
>>>   {
>>> -     asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
>>> +     asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
>>>   }
>>>
>>>   #define __raw_readb __raw_readb
>>> --
>>> 2.35.1
>>>



* Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect
  2022-04-03  7:47       ` Ard Biesheuvel
@ 2022-04-04  9:14         ` Will Deacon
  0 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2022-04-04  9:14 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Andrew Pinski, Mark Rutland, Jeremy Linton, GCC Mailing List,
	f.fainelli, maz, marcan, LKML, opendmb, Catalin Marinas,
	linux-arm-kernel

On Sun, Apr 03, 2022 at 09:47:47AM +0200, Ard Biesheuvel wrote:
> On Sun, 3 Apr 2022 at 09:47, Ard Biesheuvel <ardb@kernel.org> wrote:
> > On Sun, 3 Apr 2022 at 09:38, Andrew Pinski <pinskia@gmail.com> wrote:
> > > It might not be the most targeted fix, but it is a fix.
> > > The best fix is to tell the compiler that you are writing to that
> > > location of memory.
> > > volatile asm does not do what you think it does.
> > > You didn't read further down about memory clobbers:
> > > https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers
> > > Specifically this part:
> > > The "memory" clobber tells the compiler that the assembly code
> > > performs memory reads or writes to items other than those listed in
> > > the input and output operands
> > >
> >
> > So should we be using "m"(*addr) instead of "r"(addr) here?
> >
> > (along with the appropriately sized casts)
> 
> I mean "=m" not "m"

That can generate writeback addressing modes, which I think breaks
MMIO virtualisation. We usually end up using "Q" instead but the codegen
tends to be worse iirc.
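
For reference, a hedged sketch of the "Q" form (untested; the same
caveats as the "=m" sketch earlier in the thread apply):

| 	/* "Q" only accepts a single base register with no offset,
| 	 * so no writeback addressing modes can be generated, at the
| 	 * cost of extra address arithmetic outside the asm. */
| 	asm volatile("str %w1, %0"
| 		     : "=Q" (*(volatile u32 __force *)addr)
| 		     : "rZ" (val));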

In any case, for this specific problem I think we either need a fixed
compiler or some kbuild magic to avoid using it / disable the new behaviour.

We rely on 'asm volatile' not being elided in other places too.

Will


* GCC 12 miscompilation of volatile asm (was: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect)
  2022-04-01 17:22 ` Mark Rutland
  2022-04-03  7:36   ` Andrew Pinski
@ 2022-04-05 12:51   ` Mark Rutland
  2022-04-05 13:04     ` Mark Rutland
                       ` (2 more replies)
  1 sibling, 3 replies; 14+ messages in thread
From: Mark Rutland @ 2022-04-05 12:51 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: linux-arch, gcc, catalin.marinas, will, marcan, maz,
	szabolcs.nagy, f.fainelli, opendmb, Andrew Pinski,
	Ard Biesheuvel, Peter Zijlstra, x86, andrew.cooper3,
	Jeremy Linton

Hi all,

[adding kernel folk who work on asm stuff]

As a heads-up, GCC 12 (not yet released) appears to erroneously optimize away
calls to functions with volatile asm. Szabolcs has raised an issue on the GCC
bugzilla:  

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105160

... which is a P1 release blocker, and is currently being investigated.

Jeremy originally reported this as an issue with {readl,writel}_relaxed(), but
the underlying problem doesn't have anything to do with those specifically.

I'm dumping a bunch of info here largely for posterity / archival, and to find
out who (from the kernel side) is willing and able to test proposed compiler
fixes, once those are available.

I'm happy to do so for aarch64; Peter, I assume you'd be happy to look at the
x86 side?

This is a generic issue, not specific to arm64.

I wrote test cases for aarch64 and x86_64. Those are inline later in this mail,
and currently you can see them on compiler explorer:

  aarch64: https://godbolt.org/z/vMczqjYvs

  x86_64: https://godbolt.org/z/cveff9hq5



My aarch64 test case is:

| #define sysreg_read(regname)		\
| ({					\
| 	unsigned long __sr_val;		\
| 	asm volatile(			\
| 	"mrs %0, " #regname "\n"	\
| 	: "=r" (__sr_val));		\
| 					\
| 	__sr_val;			\
| })
| 
| #define sysreg_write(regname, __sw_val)	\
| do {					\
| 	asm volatile(			\
| 	"msr " #regname ", %0\n"	\
| 	:				\
| 	: "r" (__sw_val));		\
| } while (0)
| 
| #define isb()				\
| do {					\
| 	asm volatile(			\
| 	"isb"				\
| 	:				\
| 	:				\
| 	: "memory");			\
| } while (0)
| 
| static unsigned long sctlr_read(void)
| {
| 	return sysreg_read(sctlr_el1);
| }
| 
| static void sctlr_write(unsigned long val)
| {
| 	sysreg_write(sctlr_el1, val);
| }
| 
| static void sctlr_rmw(void)
| {
| 	unsigned long val;
| 
| 	val = sctlr_read();
| 	val |= 1UL << 7;
| 	sctlr_write(val);
| }
| 
| void sctlr_read_multiple(void)
| {
| 	sctlr_read();
| 	sctlr_read();
| 	sctlr_read();
| 	sctlr_read();
| }
| 
| void sctlr_write_multiple(void)
| {
| 	sctlr_write(0);
| 	sctlr_write(0);
| 	sctlr_write(0);
| 	sctlr_write(0);
| 	sctlr_write(0);
| }
| 
| void sctlr_rmw_multiple(void)
| {
| 	sctlr_rmw();
| 	sctlr_rmw();
| 	sctlr_rmw();
| 	sctlr_rmw();
| }
| 
| void function(void)
| {
| 	sctlr_read_multiple();
| 	sctlr_write_multiple();
| 	sctlr_rmw_multiple();
| 
| 	isb();
| }

Per compiler explorer (https://godbolt.org/z/vMczqjYvs) GCC trunk currently
compiles this as:

| sctlr_rmw:
|         mrs x0, sctlr_el1
|         orr     x0, x0, 128
|         msr sctlr_el1, x0
|         ret
| sctlr_read_multiple:
|         mrs x0, sctlr_el1
|         mrs x0, sctlr_el1
|         mrs x0, sctlr_el1
|         mrs x0, sctlr_el1
|         ret
| sctlr_write_multiple:
|         mov     x0, 0
|         msr sctlr_el1, x0
|         msr sctlr_el1, x0
|         msr sctlr_el1, x0
|         msr sctlr_el1, x0
|         msr sctlr_el1, x0
|         ret
| sctlr_rmw_multiple:
|         ret
| function:
|         isb
|         ret

Whereas GCC 11.2 compiles this as:

| sctlr_rmw:
|         mrs x0, sctlr_el1
|         orr     x0, x0, 128
|         msr sctlr_el1, x0
|         ret
| sctlr_read_multiple:
|         mrs x0, sctlr_el1
|         mrs x0, sctlr_el1
|         mrs x0, sctlr_el1
|         mrs x0, sctlr_el1
|         ret
| sctlr_write_multiple:
|         mov     x0, 0
|         msr sctlr_el1, x0
|         msr sctlr_el1, x0
|         msr sctlr_el1, x0
|         msr sctlr_el1, x0
|         msr sctlr_el1, x0
|         ret
| sctlr_rmw_multiple:
|         stp     x29, x30, [sp, -16]!
|         mov     x29, sp
|         bl      sctlr_rmw
|         bl      sctlr_rmw
|         bl      sctlr_rmw
|         bl      sctlr_rmw
|         ldp     x29, x30, [sp], 16
|         ret
| function:
|         stp     x29, x30, [sp, -16]!
|         mov     x29, sp
|         bl      sctlr_read_multiple
|         bl      sctlr_write_multiple
|         bl      sctlr_rmw_multiple
|         isb
|         ldp     x29, x30, [sp], 16
|         ret



My x86_64 test case is:

| unsigned long rdmsr(unsigned long reg)
| {
|     unsigned int lo, hi;
| 
|     asm volatile(
|     "rdmsr"
|     : "=d" (hi), "=a" (lo)
|     : "c" (reg)
|     );
| 
|     return ((unsigned long)hi << 32) | lo;
| }
| 
| void wrmsr(unsigned long reg, unsigned long val)
| {
|     unsigned int lo = val;
|     unsigned int hi = val >> 32;
| 
|     asm volatile(
|     "wrmsr"
|     :
|     : "d" (hi), "a" (lo), "c" (reg)
|     );
| }
| 
| void msr_rmw_set_bits(unsigned long reg, unsigned long bits)
| {
|     unsigned long val;
| 
|     val = rdmsr(reg);
|     val |= bits;
|     wrmsr(reg, val);
| }
| 
| void func_with_msr_side_effects(unsigned long reg)
| {
|     msr_rmw_set_bits(reg, 1UL << 0);
|     msr_rmw_set_bits(reg, 1UL << 1);
|     msr_rmw_set_bits(reg, 1UL << 2);
|     msr_rmw_set_bits(reg, 1UL << 3);
| }

Per compiler explorer (https://godbolt.org/z/cveff9hq5) GCC trunk currently
compiles this as:

| msr_rmw_set_bits:
|         mov     rcx, rdi
|         rdmsr
|         sal     rdx, 32
|         mov     eax, eax
|         or      rax, rsi
|         or      rax, rdx
|         mov     rdx, rax
|         shr     rdx, 32
|         wrmsr
|         ret
| func_with_msr_side_effects:
|         ret

While GCC 11.2 compiles that as:

| msr_rmw_set_bits:
|         mov     rcx, rdi
|         rdmsr
|         sal     rdx, 32
|         mov     eax, eax
|         or      rax, rsi
|         or      rax, rdx
|         mov     rdx, rax
|         shr     rdx, 32
|         wrmsr
|         ret
| func_with_msr_side_effects:
|         push    rbp
|         push    rbx
|         mov     rbx, rdi
|         mov     rbp, rsi
|         call    msr_rmw_set_bits
|         mov     rsi, rbp
|         mov     rdi, rbx
|         call    msr_rmw_set_bits
|         mov     rsi, rbp
|         mov     rdi, rbx
|         call    msr_rmw_set_bits
|         mov     rsi, rbp
|         mov     rdi, rbx
|         call    msr_rmw_set_bits
|         pop     rbx
|         pop     rbp
|         ret

Thanks,
Mark.


* Re: GCC 12 miscompilation of volatile asm (was: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect)
  2022-04-05 12:51   ` GCC 12 miscompilation of volatile asm (was: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect) Mark Rutland
@ 2022-04-05 13:04     ` Mark Rutland
  2022-04-05 13:20       ` Andrew Cooper
  2022-04-05 14:05     ` Peter Zijlstra
  2022-04-11 10:31     ` Mark Rutland
  2 siblings, 1 reply; 14+ messages in thread
From: Mark Rutland @ 2022-04-05 13:04 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: linux-arch, gcc, catalin.marinas, will, marcan, maz,
	szabolcs.nagy, f.fainelli, opendmb, Andrew Pinski,
	Ard Biesheuvel, Peter Zijlstra, x86, andrew.cooper3,
	Jeremy Linton

Sorry, I copied the wrong version of the x86_64 assembly (as generated
by GCC 11.2.0). Updated below.

On Tue, Apr 05, 2022 at 01:51:30PM +0100, Mark Rutland wrote:
> My x86_64 test case is:
> 
> | unsigned long rdmsr(unsigned long reg)
> | {
> |     unsigned int lo, hi;
> | 
> |     asm volatile(
> |     "rdmsr"
> |     : "=d" (hi), "=a" (lo)
> |     : "c" (reg)
> |     );
> | 
> |     return ((unsigned long)hi << 32) | lo;
> | }
> | 
> | void wrmsr(unsigned long reg, unsigned long val)
> | {
> |     unsigned int lo = val;
> |     unsigned int hi = val >> 32;
> | 
> |     asm volatile(
> |     "wrmsr"
> |     :
> |     : "d" (hi), "a" (lo), "c" (reg)
> |     );
> | }
> | 
> | void msr_rmw_set_bits(unsigned long reg, unsigned long bits)
> | {
> |     unsigned long val;
> | 
> |     val = rdmsr(reg);
> |     val |= bits;
> |     wrmsr(reg, val);
> | }
> | 
> | void func_with_msr_side_effects(unsigned long reg)
> | {
> |     msr_rmw_set_bits(reg, 1UL << 0);
> |     msr_rmw_set_bits(reg, 1UL << 1);
> |     msr_rmw_set_bits(reg, 1UL << 2);
> |     msr_rmw_set_bits(reg, 1UL << 3);
> | }
> 
> Per compiler explorer (https://godbolt.org/z/cveff9hq5) GCC trunk currently
> compiles this as:
> 
> | msr_rmw_set_bits:
> |         mov     rcx, rdi
> |         rdmsr
> |         sal     rdx, 32
> |         mov     eax, eax
> |         or      rax, rsi
> |         or      rax, rdx
> |         mov     rdx, rax
> |         shr     rdx, 32
> |         wrmsr
> |         ret
> | func_with_msr_side_effects:
> |         ret
> 

GCC 11.2 compiles that as:

| rdmsr:
|         mov     rcx, rdi
|         rdmsr
|         sal     rdx, 32
|         mov     eax, eax
|         or      rax, rdx
|         ret
| wrmsr:
|         mov     rax, rsi
|         mov     rdx, rsi
|         shr     rdx, 32
|         mov     rcx, rdi
|         wrmsr
|         ret
| msr_rmw_set_bits:
|         mov     rcx, rdi
|         rdmsr
|         sal     rdx, 32
|         mov     eax, eax
|         or      rax, rsi
|         or      rax, rdx
|         mov     rdx, rax
|         shr     rdx, 32
|         wrmsr
|         ret
| func_with_msr_side_effects:
|         push    rbx
|         mov     rbx, rdi
|         mov     esi, 1
|         call    msr_rmw_set_bits
|         mov     esi, 2
|         mov     rdi, rbx
|         call    msr_rmw_set_bits
|         mov     esi, 4
|         mov     rdi, rbx
|         call    msr_rmw_set_bits
|         mov     esi, 8
|         mov     rdi, rbx
|         call    msr_rmw_set_bits
|         pop     rbx
|         ret

Thanks,
Mark.


* Re: GCC 12 miscompilation of volatile asm (was: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect)
  2022-04-05 13:04     ` Mark Rutland
@ 2022-04-05 13:20       ` Andrew Cooper
  0 siblings, 0 replies; 14+ messages in thread
From: Andrew Cooper @ 2022-04-05 13:20 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel, linux-kernel
  Cc: linux-arch, gcc, catalin.marinas, will, marcan, maz,
	szabolcs.nagy, f.fainelli, opendmb, Andrew Pinski,
	Ard Biesheuvel, Peter Zijlstra, x86, Jeremy Linton

On 05/04/2022 14:04, Mark Rutland wrote:
> On Tue, Apr 05, 2022 at 01:51:30PM +0100, Mark Rutland wrote:
> My x86_64 test case is:
>
> Per compiler explorer (https://godbolt.org/z/cveff9hq5) GCC trunk currently
> compiles this as:
>
> | msr_rmw_set_bits:
> |         mov     rcx, rdi
> |         rdmsr
> |         sal     rdx, 32
> |         mov     eax, eax
> |         or      rax, rsi
> |         or      rax, rdx
> |         mov     rdx, rax
> |         shr     rdx, 32
> |         wrmsr
> |         ret
> | func_with_msr_side_effects:
> |         ret
>

Yeah... that code gen is very broken for func_with_msr_side_effects().

~Andrew


* Re: GCC 12 miscompilation of volatile asm (was: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect)
  2022-04-05 12:51   ` GCC 12 miscompilation of volatile asm (was: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect) Mark Rutland
  2022-04-05 13:04     ` Mark Rutland
@ 2022-04-05 14:05     ` Peter Zijlstra
  2022-04-11 10:22       ` Mark Rutland
  2022-04-11 10:31     ` Mark Rutland
  2 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2022-04-05 14:05 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-arch, gcc, catalin.marinas,
	will, marcan, maz, szabolcs.nagy, f.fainelli, opendmb,
	Andrew Pinski, Ard Biesheuvel, x86, andrew.cooper3,
	Jeremy Linton

On Tue, Apr 05, 2022 at 01:51:30PM +0100, Mark Rutland wrote:
> Hi all,
> 
> [adding kernel folk who work on asm stuff]
> 
> As a heads-up, GCC 12 (not yet released) appears to erroneously optimize away
> calls to functions with volatile asm. Szabolcs has raised an issue on the GCC
> bugzilla:  
> 
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105160
> 
> ... which is a P1 release blocker, and is currently being investigated.
> 
> Jeremy originally reported this as an issue with {readl,writel}_relaxed(), but
> the underlying problem doesn't have anything to do with those specifically.
> 
> I'm dumping a bunch of info here largely for posterity / archival, and to find
> out who (from the kernel side) is willing and able to test proposed compiler
> fixes, once those are available.
> 
> I'm happy to do so for aarch64; Peter, I assume you'd be happy to look at the
> x86 side?

Sure..


* Re: GCC 12 miscompilation of volatile asm (was: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect)
  2022-04-05 14:05     ` Peter Zijlstra
@ 2022-04-11 10:22       ` Mark Rutland
  0 siblings, 0 replies; 14+ messages in thread
From: Mark Rutland @ 2022-04-11 10:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arm-kernel, linux-kernel, linux-arch, gcc, catalin.marinas,
	will, marcan, maz, szabolcs.nagy, f.fainelli, opendmb,
	Andrew Pinski, Ard Biesheuvel, x86, andrew.cooper3,
	Jeremy Linton

On Tue, Apr 05, 2022 at 04:05:22PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 05, 2022 at 01:51:30PM +0100, Mark Rutland wrote:
> > Hi all,
> > 
> > [adding kernel folk who work on asm stuff]
> > 
> > As a heads-up, GCC 12 (not yet released) appears to erroneously optimize away
> > calls to functions with volatile asm. Szabolcs has raised an issue on the GCC
> > bugzilla:  
> > 
> >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105160
> > 
> > ... which is a P1 release blocker, and is currently being investigated.
> > 
> > Jeremy originally reported this as an issue with {readl,writel}_relaxed(), but
> > the underlying problem doesn't have anything to do with those specifically.
> > 
> > I'm dumping a bunch of info here largely for posterity / archival, and to find
> > out who (from the kernel side) is willing and able to test proposed compiler
> > fixes, once those are available.
> > 
> > I'm happy to do so for aarch64; Peter, I assume you'd be happy to look at the
> > x86 side?
> 
> Sure..

FWIW, compiler explorer now has a trunk build with the fix, and my x86_64
example now gets compiled to something which looks correct:

  https://godbolt.org/z/cveff9hq5

Thanks,
Mark.


* Re: GCC 12 miscompilation of volatile asm (was: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect)
  2022-04-05 12:51   ` GCC 12 miscompilation of volatile asm (was: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect) Mark Rutland
  2022-04-05 13:04     ` Mark Rutland
  2022-04-05 14:05     ` Peter Zijlstra
@ 2022-04-11 10:31     ` Mark Rutland
  2022-04-11 19:02       ` Jeremy Linton
  2 siblings, 1 reply; 14+ messages in thread
From: Mark Rutland @ 2022-04-11 10:31 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, Jeremy Linton
  Cc: linux-arch, gcc, catalin.marinas, will, marcan, maz,
	szabolcs.nagy, f.fainelli, opendmb, Andrew Pinski,
	Ard Biesheuvel, Peter Zijlstra, x86, andrew.cooper3

On Tue, Apr 05, 2022 at 01:51:30PM +0100, Mark Rutland wrote:
> Hi all,
> 
> [adding kernel folk who work on asm stuff]
> 
> As a heads-up, GCC 12 (not yet released) appears to erroneously optimize away
> calls to functions with volatile asm. Szabolcs has raised an issue on the GCC
> bugzilla:  
> 
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105160
> 
> ... which is a P1 release blocker, and is currently being investigated.

Jan Hubicka fixed this in GCC commit:

  aabb9a261ef060cf ("Propagate nondeterministic and side_effects flags in modref summary after inlining")

... and all my local tests look good with that applied.

Compiler explorer's trunk build now has that fix, so the examples from before
now look good:

  aarch64: https://godbolt.org/z/vMczqjYvs

  x86_64: https://godbolt.org/z/cveff9hq5

Jeremy, now that the real issue has been identified and fixed, I assume you'll
send a revert for commit:

  8d3ea3d402db94b6 ("net: bcmgenet: Use stronger register read/writes to assure ordering")

... ?

Thanks,
Mark.


* Re: GCC 12 miscompilation of volatile asm (was: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect)
  2022-04-11 10:31     ` Mark Rutland
@ 2022-04-11 19:02       ` Jeremy Linton
  0 siblings, 0 replies; 14+ messages in thread
From: Jeremy Linton @ 2022-04-11 19:02 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel, linux-kernel
  Cc: linux-arch, gcc, catalin.marinas, will, marcan, maz,
	szabolcs.nagy, f.fainelli, opendmb, Andrew Pinski,
	Ard Biesheuvel, Peter Zijlstra, x86, andrew.cooper3

Hi,


On 4/11/22 05:31, Mark Rutland wrote:
> On Tue, Apr 05, 2022 at 01:51:30PM +0100, Mark Rutland wrote:
>> Hi all,
>>
>> [adding kernel folk who work on asm stuff]
>>
>> As a heads-up, GCC 12 (not yet released) appears to erroneously optimize away
>> calls to functions with volatile asm. Szabolcs has raised an issue on the GCC
>> bugzilla:
>>
>>    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105160
>>
>> ... which is a P1 release blocker, and is currently being investigated.
> 
> Jan Hubicka fixed this in GCC commit:
> 
>    aabb9a261ef060cf ("Propagate nondeterministic and side_effects flags in modref summary after inlining")
> 
> ... and all my local tests look good with that applied.
> 
> Compiler explorer's trunk build now has that fix, so the examples from before
> now look good:
> 
>    aarch64: https://godbolt.org/z/vMczqjYvs
> 
>    x86_64: https://godbolt.org/z/cveff9hq5
> 
> Jeremy, now that the real issue has been identified and fixed, I assume you'll
> send a revert for commit:
> 
>    8d3ea3d402db94b6 ("net: bcmgenet: Use stronger register read/writes to assure ordering")
> 
> ... ?

Yes, that's the plan.

Thanks,


