linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
@ 2019-01-25 21:03 Russell King
  2019-01-25 23:20 ` Tony Lindgren
  2019-02-01 10:19 ` Will Deacon
  0 siblings, 2 replies; 16+ messages in thread
From: Russell King @ 2019-01-25 21:03 UTC (permalink / raw)
  To: Will Deacon, linux-arm-kernel, linux-omap
  Cc: Tony Lindgren, Paul Walmsley, Rajendra Nayak

Executing loops such as:

	while (1)
		cpu_relax();

with interrupts disabled results in a livelock of the entire system,
as other CPUs are prevented from making progress.  This is most noticeable
as a failure of crashdump kexec, which stops just after issuing:

	Loading crashdump kernel...

to the system console.  A workaround for this is to use 10 nops in
cpu_relax().

We also use wfe() in while (1) loops to avoid burning cycles in a
tight loop, giving the CPU a hint that we're not doing anything
useful.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
It's been a while since this was posted, Will's suggestion was to use
10 nops in cpu_relax() last time around.  I still prefer wfe() in
these infinite-not-doing-anything-ever loops.

 arch/arm/include/asm/barrier.h   | 2 ++
 arch/arm/include/asm/processor.h | 6 +++++-
 arch/arm/kernel/machine_kexec.c  | 5 ++++-
 arch/arm/kernel/smp.c            | 4 +++-
 arch/arm/mach-omap2/prm_common.c | 4 +++-
 5 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 69772e742a0a..83ae97c049d9 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -11,6 +11,8 @@
 #define sev()	__asm__ __volatile__ ("sev" : : : "memory")
 #define wfe()	__asm__ __volatile__ ("wfe" : : : "memory")
 #define wfi()	__asm__ __volatile__ ("wfi" : : : "memory")
+#else
+#define wfe()	do { } while (0)
 #endif
 
 #if __LINUX_ARM_ARCH__ >= 7
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index 120f4c9bbfde..57fe73ea0f72 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -89,7 +89,11 @@ extern void release_thread(struct task_struct *);
 unsigned long get_wchan(struct task_struct *p);
 
 #if __LINUX_ARM_ARCH__ == 6 || defined(CONFIG_ARM_ERRATA_754327)
-#define cpu_relax()			smp_mb()
+#define cpu_relax()						\
+	do {							\
+		smp_mb();					\
+		__asm__ __volatile__("nop; nop; nop; nop; nop; nop; nop; nop; nop; nop;");	\
+	} while (0)
 #else
 #define cpu_relax()			barrier()
 #endif
diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index dd2eb5f76b9f..76300f3813e8 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -91,8 +91,11 @@ void machine_crash_nonpanic_core(void *unused)
 
 	set_cpu_online(smp_processor_id(), false);
 	atomic_dec(&waiting_for_crash_ipi);
-	while (1)
+
+	while (1) {
 		cpu_relax();
+		wfe();
+	}
 }
 
 void crash_smp_send_stop(void)
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index ebac63fe458b..4e785d025771 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -595,8 +595,10 @@ static void ipi_cpu_stop(unsigned int cpu)
 	local_fiq_disable();
 	local_irq_disable();
 
-	while (1)
+	while (1) {
 		cpu_relax();
+		wfe();
+	}
 }
 
 static DEFINE_PER_CPU(struct completion *, cpu_completion);
diff --git a/arch/arm/mach-omap2/prm_common.c b/arch/arm/mach-omap2/prm_common.c
index 058a37e6d11c..fd6e0671f957 100644
--- a/arch/arm/mach-omap2/prm_common.c
+++ b/arch/arm/mach-omap2/prm_common.c
@@ -523,8 +523,10 @@ void omap_prm_reset_system(void)
 
 	prm_ll_data->reset_system();
 
-	while (1)
+	while (1) {
 		cpu_relax();
+		wfe();
+	}
 }
 
 /**
-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2019-01-25 21:03 [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops Russell King
@ 2019-01-25 23:20 ` Tony Lindgren
  2019-01-26 21:00   ` Paul Walmsley
  2019-02-01 10:19 ` Will Deacon
  1 sibling, 1 reply; 16+ messages in thread
From: Tony Lindgren @ 2019-01-25 23:20 UTC (permalink / raw)
  To: Russell King
  Cc: Rajendra Nayak, Paul Walmsley, linux-omap, Will Deacon, linux-arm-kernel

* Russell King <rmk+kernel@armlinux.org.uk> [190125 21:04]:
> Executing loops such as:
> 
> 	while (1)
> 		cpu_relax();
> 
> with interrupts disabled results in a livelock of the entire system,
> as other CPUs are prevented from making progress.  This is most noticeable
> as a failure of crashdump kexec, which stops just after issuing:
> 
> 	Loading crashdump kernel...
> 
> to the system console.  A workaround for this is to use 10 nops in
> cpu_relax().
> 
> We also use wfe() in while (1) loops to avoid burning cycles in a
> tight loop, giving the CPU a hint that we're not doing anything
> useful.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> ---
> It's been a while since this was posted, Will's suggestion was to use
> 10 nops in cpu_relax() last time around.  I still prefer wfe() in
> these infinite-not-doing-anything-ever loops.

Works for me:

Tested-by: Tony Lindgren <tony@atomide.com>


* Re: [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2019-01-25 23:20 ` Tony Lindgren
@ 2019-01-26 21:00   ` Paul Walmsley
  2019-01-26 23:51     ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Walmsley @ 2019-01-26 21:00 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Rajendra Nayak, Russell King, linux-omap, Will Deacon, linux-arm-kernel

On Fri, 25 Jan 2019, Tony Lindgren wrote:

> * Russell King <rmk+kernel@armlinux.org.uk> [190125 21:04]:
> > Executing loops such as:
> > 
> > 	while (1)
> > 		cpu_relax();
> > 
> > with interrupts disabled results in a livelock of the entire system,
> > as other CPUs are prevented from making progress.  This is most noticeable
> > as a failure of crashdump kexec, which stops just after issuing:
> > 
> > 	Loading crashdump kernel...
> > 
> > to the system console.  A workaround for this is to use 10 nops in
> > cpu_relax().
> > 
> > We also use wfe() in while (1) loops to avoid burning cycles in a
> > tight loop, giving the CPU a hint that we're not doing anything
> > useful.
> > 
> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > ---
> > It's been a while since this was posted, Will's suggestion was to use
> > 10 nops in cpu_relax() last time around.  I still prefer wfe() in
> > these infinite-not-doing-anything-ever loops.
> 
> Works for me:
> 
> Tested-by: Tony Lindgren <tony@atomide.com>

There was some concern in the past that WFE, like WFI, might cause the 
core to assert an external signal that might cause the SoC integration to 
place the core into a low-power mode from which it might not be able to 
wake up.  This could happen on OMAP, for example, with WFI.

I don't recall the outcome of those discussions.  Was a conclusion ever 
reached?


- Paul


* Re: [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2019-01-26 21:00   ` Paul Walmsley
@ 2019-01-26 23:51     ` Russell King - ARM Linux admin
  2019-01-27  1:15       ` Paul Walmsley
  0 siblings, 1 reply; 16+ messages in thread
From: Russell King - ARM Linux admin @ 2019-01-26 23:51 UTC (permalink / raw)
  To: Paul Walmsley
  Cc: Tony Lindgren, Rajendra Nayak, linux-omap, Will Deacon, linux-arm-kernel

On Sat, Jan 26, 2019 at 09:00:03PM +0000, Paul Walmsley wrote:
> On Fri, 25 Jan 2019, Tony Lindgren wrote:
> 
> > * Russell King <rmk+kernel@armlinux.org.uk> [190125 21:04]:
> > > Executing loops such as:
> > > 
> > > 	while (1)
> > > 		cpu_relax();
> > > 
> > > with interrupts disabled results in a livelock of the entire system,
> > > as other CPUs are prevented from making progress.  This is most noticeable
> > > as a failure of crashdump kexec, which stops just after issuing:
> > > 
> > > 	Loading crashdump kernel...
> > > 
> > > to the system console.  A workaround for this is to use 10 nops in
> > > cpu_relax().
> > > 
> > > We also use wfe() in while (1) loops to avoid burning cycles in a
> > > tight loop, giving the CPU a hint that we're not doing anything
> > > useful.
> > > 
> > > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > > ---
> > > It's been a while since this was posted, Will's suggestion was to use
> > > 10 nops in cpu_relax() last time around.  I still prefer wfe() in
> > > these infinite-not-doing-anything-ever loops.
> > 
> > Works for me:
> > 
> > Tested-by: Tony Lindgren <tony@atomide.com>
> 
> There was some concern in the past that WFE, like WFI, might cause the 
> core to assert an external signal that might cause the SoC integration to 
> place the core into a low-power mode from which it might not be able to 
> wake up.  This could happen on OMAP, for example, with WFI.
> 
> I don't recall the outcome of those discussions.  Was a conclusion ever 
> reached?

First, we use WFE in spinlocks.  If WFE were to place the CPU in a
low power state that it may not be able to wake up from, all our
spinlocks would be unsafe.

Next, in all of the situations in this patch, we're executing an
infinite loop.  If it were to cause the core to go into a low power
mode, surely that's a good thing, rather than the core endlessly
executing NOPs?  The only way out of that is for the core to receive
a reset _anyway_.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


* Re: [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2019-01-26 23:51     ` Russell King - ARM Linux admin
@ 2019-01-27  1:15       ` Paul Walmsley
  2019-01-27 15:28         ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Walmsley @ 2019-01-27  1:15 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Tony Lindgren, Rajendra Nayak, linux-omap, Will Deacon, linux-arm-kernel

On Sat, 26 Jan 2019, Russell King - ARM Linux admin wrote:

> On Sat, Jan 26, 2019 at 09:00:03PM +0000, Paul Walmsley wrote:
> > On Fri, 25 Jan 2019, Tony Lindgren wrote:
> > 
> > > * Russell King <rmk+kernel@armlinux.org.uk> [190125 21:04]:
> > > > Executing loops such as:
> > > > 
> > > > 	while (1)
> > > > 		cpu_relax();
> > > > 
> > > > with interrupts disabled results in a livelock of the entire system,
> > > > as other CPUs are prevented from making progress.  This is most noticeable
> > > > as a failure of crashdump kexec, which stops just after issuing:
> > > > 
> > > > 	Loading crashdump kernel...
> > > > 
> > > > to the system console.  A workaround for this is to use 10 nops in
> > > > cpu_relax().
> > > > 
> > > > We also use wfe() in while (1) loops to avoid burning cycles in a
> > > > tight loop, giving the CPU a hint that we're not doing anything
> > > > useful.
> > > > 
> > > > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > > > ---
> > > > It's been a while since this was posted, Will's suggestion was to use
> > > > 10 nops in cpu_relax() last time around.  I still prefer wfe() in
> > > > these infinite-not-doing-anything-ever loops.
> > > 
> > > Works for me:
> > > 
> > > Tested-by: Tony Lindgren <tony@atomide.com>
> > 
> > There was some concern in the past that WFE, like WFI, might cause the 
> > core to assert an external signal that might cause the SoC integration to 
> > place the core into a low-power mode from which it might not be able to 
> > wake up.  This could happen on OMAP, for example, with WFI.
> > 
> > I don't recall the outcome of those discussions.  Was a conclusion ever 
> > reached?
> 
> First, we use WFE in spinlocks.  If WFE were to place the CPU in a
> low power state that it may not be able to wake up from, all our
> spinlocks would be unsafe.

Good point.  WFE must not assert the external signal that indicates 
that the core is inactive.

> Next, in all of the situations in this patch, we're executing an
> infinite loop.  If it were to cause the core to go into a low power
> mode, surely that's a good thing, rather than the core endlessly
> executing NOPs?  The only way out of that is for the core to receive
> a reset _anyway_.

Makes sense.  

Do you recall what Will's reasoning was for preferring 10 NOPs to a WFE?


- Paul


* Re: [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2019-01-27  1:15       ` Paul Walmsley
@ 2019-01-27 15:28         ` Russell King - ARM Linux admin
  2019-01-31 13:58           ` Will Deacon
  0 siblings, 1 reply; 16+ messages in thread
From: Russell King - ARM Linux admin @ 2019-01-27 15:28 UTC (permalink / raw)
  To: Paul Walmsley
  Cc: Tony Lindgren, Rajendra Nayak, linux-omap, Will Deacon, linux-arm-kernel

On Sun, Jan 27, 2019 at 01:15:31AM +0000, Paul Walmsley wrote:
> On Sat, 26 Jan 2019, Russell King - ARM Linux admin wrote:
> 
> > On Sat, Jan 26, 2019 at 09:00:03PM +0000, Paul Walmsley wrote:
> > > On Fri, 25 Jan 2019, Tony Lindgren wrote:
> > > 
> > > > * Russell King <rmk+kernel@armlinux.org.uk> [190125 21:04]:
> > > > > Executing loops such as:
> > > > > 
> > > > > 	while (1)
> > > > > 		cpu_relax();
> > > > > 
> > > > > with interrupts disabled results in a livelock of the entire system,
> > > > > as other CPUs are prevented from making progress.  This is most noticeable
> > > > > as a failure of crashdump kexec, which stops just after issuing:
> > > > > 
> > > > > 	Loading crashdump kernel...
> > > > > 
> > > > > to the system console.  A workaround for this is to use 10 nops in
> > > > > cpu_relax().
> > > > > 
> > > > > We also use wfe() in while (1) loops to avoid burning cycles in a
> > > > > tight loop, giving the CPU a hint that we're not doing anything
> > > > > useful.
> > > > > 
> > > > > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > > > > ---
> > > > > It's been a while since this was posted, Will's suggestion was to use
> > > > > 10 nops in cpu_relax() last time around.  I still prefer wfe() in
> > > > > these infinite-not-doing-anything-ever loops.
> > > > 
> > > > Works for me:
> > > > 
> > > > Tested-by: Tony Lindgren <tony@atomide.com>
> > > 
> > > There was some concern in the past that WFE, like WFI, might cause the 
> > > core to assert an external signal that might cause the SoC integration to 
> > > place the core into a low-power mode from which it might not be able to 
> > > wake up.  This could happen on OMAP, for example, with WFI.
> > > 
> > > I don't recall the outcome of those discussions.  Was a conclusion ever 
> > > reached?
> > 
> > First, we use WFE in spinlocks.  If WFE were to place the CPU in a
> > low power state that it may not be able to wake up from, all our
> > spinlocks would be unsafe.
> 
> Good point.  WFE must not assert the external signal that indicates 
> that the core is inactive.
> 
> > Next, in all of the situations in this patch, we're executing an
> > infinite loop.  If it were to cause the core to go into a low power
> > mode, surely that's a good thing, rather than the core endlessly
> > executing NOPs?  The only way out of that is for the core to receive
> > a reset _anyway_.
> 
> Makes sense.  
> 
> Do you recall what Will's reasoning was for preferring 10 NOPs to a WFE?

I think there may be an erratum for this which specifies 10 NOPs as
its workaround, but I don't have its number.


* Re: [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2019-01-27 15:28         ` Russell King - ARM Linux admin
@ 2019-01-31 13:58           ` Will Deacon
  2019-01-31 22:58             ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2019-01-31 13:58 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Tony Lindgren, Paul Walmsley, linux-omap, Rajendra Nayak,
	linux-arm-kernel

Hi Russell, Paul,

On Sun, Jan 27, 2019 at 03:28:50PM +0000, Russell King - ARM Linux admin wrote:
> On Sun, Jan 27, 2019 at 01:15:31AM +0000, Paul Walmsley wrote:
> > On Sat, 26 Jan 2019, Russell King - ARM Linux admin wrote:
> > > On Sat, Jan 26, 2019 at 09:00:03PM +0000, Paul Walmsley wrote:
> > > > There was some concern in the past that WFE, like WFI, might cause the 
> > > > core to assert an external signal that might cause the SoC integration to 
> > > > place the core into a low-power mode from which it might not be able to 
> > > > wake up.  This could happen on OMAP, for example, with WFI.
> > > > 
> > > > I don't recall the outcome of those discussions.  Was a conclusion ever 
> > > > reached?
> > > 
> > > First, we use WFE in spinlocks.  If WFE were to place the CPU in a
> > > low power state that it may not be able to wake up from, all our
> > > spinlocks would be unsafe.
> > 
> > Good point.  WFE must not assert the external signal that indicates 
> > that the core is inactive.
> > 
> > > Next, in all of the situations in this patch, we're executing an
> > > infinite loop.  If it were to cause the core to go into a low power
> > > mode, surely that's a good thing, rather than the core endlessly
> > > executing NOPs?  The only way out of that is for the core to receive
> > > a reset _anyway_.
> > 
> > Makes sense.  
> > 
> > Do you recall what Will's reasoning was for preferring 10 NOPs to a WFE?
> 
> I think there may be an erratum for this which specifies 10 NOPs as
> its workaround, but I don't have its number.

The erratum hits because cpu_relax() is a DMB instruction due to erratum
754327. That then triggers erratum 794072 because a tight loop of DMB
instructions can cause a denial of service. One of the conditions for that
to occur is:

  | * No more than 10 instructions other than the DMB are executed between
  |   each DMB

Digging up the workaround:

  |  This erratum can be worked round by setting bit[4] of the undocumented
  |  Diagnostic Control Register to 1. This register is encoded as
  |  CP15 c15 0 c0 1. This bit can be written in Secure state only, with the
  |  following Read/Modify/Write code sequence:
  |
  |	MRC p15,0,rt,c15,c0,1
  |	ORR rt,rt,#0x10
  |	MCR p15,0,rt,c15,c0,1
  |
  |  When it is set, this bit causes the DMB instruction to be decoded and
  |  executed like a DSB. Using this software workaround is not expected to
  |  have any impact on the overall performance of the processor on a typical
  |  code base.
  |
  |  Other workarounds are also available for this erratum, to either prevent
  |  or interrupt the continuous stream of DMB instructions that causes the
  |  deadlock.
  |
  |  For example:
  |	* Inserting a non-conditional Load or Store instruction in the loop
  |	  between each DMB
  |	* Inserting additional instructions in the loop, such as NOPs, to
  |	  avoid the processor seeing back to back DMB instructions.
  |	* Making the processor executing the short loop take regular
  |	  interrupts.

So the reason I prefer the NOPs is because that's guaranteed by the h/w folks
to do the trick, whereas they say nothing about WFE. It should be dead easy to
use NOPs instead, so I'm not sure why we're not just following the workaround
here. We could even use NOPs + WFE if you like!
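
To make the instruction-count condition concrete, here is a toy C model.
It is not kernel code, and it sketches only the single condition quoted
above; the erratum lists other conditions that are not modelled.  The
opcode enum and the name meets_dmb_gap_condition() are made up for
illustration.  The model treats a loop body as a cyclic instruction
sequence and reports whether every gap between DMBs is 10 instructions or
fewer.  WFE is counted as an ordinary instruction, which is the pessimistic
case where it behaves like a NOP:

```c
#include <assert.h>
#include <stddef.h>

/* Toy opcode set for modelling a loop body; these are not ARM encodings. */
enum op { OP_DMB, OP_NOP, OP_WFE, OP_BRANCH };

/* From the quoted erratum text: "No more than 10 instructions other than
 * the DMB are executed between each DMB". */
#define ERRATUM_MAX_GAP 10

/*
 * Return 1 if the loop body, executed over and over, has at most
 * ERRATUM_MAX_GAP non-DMB instructions between every pair of consecutive
 * DMBs, i.e. it still meets the one quoted trigger condition.
 * Return 0 if there is no DMB at all or if any gap is larger.
 */
int meets_dmb_gap_condition(const enum op *body, size_t len)
{
	size_t first, i, gap = 0, max_gap = 0;

	assert(body != NULL && len > 0);

	/* Locate the first DMB; without one there is no DMB stream. */
	for (first = 0; first < len; first++)
		if (body[first] == OP_DMB)
			break;
	if (first == len)
		return 0;

	/* Walk once around the loop, starting just after that DMB. */
	for (i = 1; i <= len; i++) {
		if (body[(first + i) % len] == OP_DMB) {
			if (gap > max_gap)
				max_gap = gap;
			gap = 0;
		} else {
			gap++;
		}
	}

	return max_gap <= ERRATUM_MAX_GAP;
}
```

Under these assumptions, "dmb; b" and "dmb; wfe; b" both still meet the
condition, while "dmb; 10 x nop; wfe; b" does not, because 12 non-DMB
instructions separate each DMB.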

Will "archaeologist" Deacon


* Re: [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2019-01-31 13:58           ` Will Deacon
@ 2019-01-31 22:58             ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 16+ messages in thread
From: Russell King - ARM Linux admin @ 2019-01-31 22:58 UTC (permalink / raw)
  To: Will Deacon
  Cc: Tony Lindgren, Paul Walmsley, linux-omap, Rajendra Nayak,
	linux-arm-kernel

On Thu, Jan 31, 2019 at 01:58:05PM +0000, Will Deacon wrote:
> Hi Russell, Paul,
> 
> On Sun, Jan 27, 2019 at 03:28:50PM +0000, Russell King - ARM Linux admin wrote:
> > On Sun, Jan 27, 2019 at 01:15:31AM +0000, Paul Walmsley wrote:
> > > On Sat, 26 Jan 2019, Russell King - ARM Linux admin wrote:
> > > > On Sat, Jan 26, 2019 at 09:00:03PM +0000, Paul Walmsley wrote:
> > > > > There was some concern in the past that WFE, like WFI, might cause the 
> > > > > core to assert an external signal that might cause the SoC integration to 
> > > > > place the core into a low-power mode from which it might not be able to 
> > > > > wake up.  This could happen on OMAP, for example, with WFI.
> > > > > 
> > > > > I don't recall the outcome of those discussions.  Was a conclusion ever 
> > > > > reached?
> > > > 
> > > > First, we use WFE in spinlocks.  If WFE were to place the CPU in a
> > > > low power state that it may not be able to wake up from, all our
> > > > spinlocks would be unsafe.
> > > 
> > > Good point.  WFE must not assert the external signal that indicates 
> > > that the core is inactive.
> > > 
> > > > Next, in all of the situations in this patch, we're executing an
> > > > infinite loop.  If it were to cause the core to go into a low power
> > > > mode, surely that's a good thing, rather than the core endlessly
> > > > executing NOPs?  The only way out of that is for the core to receive
> > > > a reset _anyway_.
> > > 
> > > Makes sense.  
> > > 
> > > Do you recall what Will's reasoning was for preferring 10 NOPs to a WFE?
> > 
> > I think there may be an erratum for this which specifies 10 NOPs as
> > its workaround, but I don't have its number.
> 
> The erratum hits because cpu_relax() is a DMB instruction due to erratum
> 754327. That then triggers erratum 794072 because a tight loop of DMB
> instructions can cause a denial of service. One of the conditions for that
> to occur is:
> 
>   | * No more than 10 instructions other than the DMB are executed between
>   |   each DMB
> 
> Digging up the workaround:
> 
>   |  This erratum can be worked round by setting bit[4] of the undocumented
>   |  Diagnostic Control Register to 1. This register is encoded as
>   |  CP15 c15 0 c0 1. This bit can be written in Secure state only, with the
>   |  following Read/Modify/Write code sequence:
>   |
>   |	MRC p15,0,rt,c15,c0,1
>   |	ORR rt,rt,#0x10
>   |	MCR p15,0,rt,c15,c0,1
>   |
>   |  When it is set, this bit causes the DMB instruction to be decoded and
>   |  executed like a DSB. Using this software workaround is not expected to
>   |  have any impact on the overall performance of the processor on a typical
>   |  code base.
>   |
>   |  Other workarounds are also available for this erratum, to either prevent
>   |  or interrupt the continuous stream of DMB instructions that causes the
>   |  deadlock.
>   |
>   |  For example:
>   |	* Inserting a non-conditional Load or Store instruction in the loop
>   |	  between each DMB
>   |	* Inserting additional instructions in the loop, such as NOPs, to
>   |	  avoid the processor seeing back to back DMB instructions.
>   |	* Making the processor executing the short loop take regular
>   |	  interrupts.
> 
> So the reason I prefer the NOPs is because that's guaranteed by the h/w folks
> to do the trick, whereas they say nothing about WFE. It should be dead easy to
> use NOPs instead, so I'm not sure why we're not just following the workaround
> here. We could even use NOPs + WFE if you like!

Okay, let's start off at the beginning:

machine_crash_nonpanic_core() does this:

	while (1)
		cpu_relax();

because the kernel has crashed, and we have no known safe way to deal
with the CPU.  So, we place the CPU into an infinite loop which we
expect it to _never_ exit - at least not until the system as a whole is
reset by some method.

In the absence of 754327, this code assembles to:

	b	.

In other words, an infinite loop.  When 754327 is enabled, this
becomes:

1:	dmb
	b	1b

Now, it's been so long ago (the commit says April 2018), that I don't
remember _which_ of these triggered the problem in OMAP4 where, if a
crash is triggered, the system tries to kexec into the panic kernel,
but fails after taking the secondary CPU down - placing it into one
of these loops.

The tested-as-working solution I came up with was to add wfe() to
these infinite loops thusly:

	while (1) {
		cpu_relax();
		wfe();
	}

which, without 754327 builds to:

1:	wfe
	b	1b

or, with 754327 enabled:

1:	dmb
	wfe
	b	1b

Adding "wfe" does two things - where we're running on bare metal, and
the processor implements "wfe", it stops us spinning endlessly in a
loop where we're never going to do any useful work.

If we hit one of these loops in a VM, it allows the CPU to be given
back to the hypervisor and rescheduled for other purposes (maybe a
different VM) rather than wasting CPU cycles inside a crashed VM.

However, in light of 794072, you decided you wanted to see 10 nops
as well - which is reasonable to cover the case where we have 754327
enabled _and_ we have a processor that doesn't implement the wfe
hint.

So, we now end up with:

1:	wfe
	b	1b

when 754327 is disabled, or:

1:	dmb
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	wfe
	b	1b

when 754327 is enabled.  We also get the dmb + 10 nop sequence
elsewhere in the kernel, in terminating loops.
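
The loop shape described above can be sketched outside the kernel as
follows.  This is an illustration under assumptions, not the patch itself:
the sketch_*() macros and park_cpu_bounded() are hypothetical names, "dmb
ish" is assumed as the barrier, and the loop is bounded so that it can
actually return.  On anything other than a 32-bit ARMv7 (or later) build
the barrier falls back to a compiler barrier, and the NOPs and WFE compile
to nothing:

```c
#include <assert.h>

#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
/* Assumed ARMv7 forms of the barrier, the 10-NOP padding and the hint. */
#define sketch_dmb()	__asm__ __volatile__("dmb ish" : : : "memory")
#define sketch_nops()	\
	__asm__ __volatile__("nop; nop; nop; nop; nop; nop; nop; nop; nop; nop;")
#define sketch_wfe()	__asm__ __volatile__("wfe" : : : "memory")
#else
/* Fallback: compiler barrier only; NOPs and WFE become empty statements. */
#define sketch_dmb()	__asm__ __volatile__("" : : : "memory")
#define sketch_nops()	do { } while (0)
#define sketch_wfe()	do { } while (0)
#endif

/* Barrier followed by padding, mirroring the errata-enabled cpu_relax(). */
#define sketch_cpu_relax()		\
	do {				\
		sketch_dmb();		\
		sketch_nops();		\
	} while (0)

/*
 * Bounded stand-in for the "while (1)" parking loops: each pass issues the
 * barrier, the padding and then the WFE hint.  Returns the number of
 * passes completed.
 */
unsigned int park_cpu_bounded(unsigned int iterations)
{
	unsigned int done = 0;

	assert(iterations > 0);

	while (done < iterations) {
		sketch_cpu_relax();
		sketch_wfe();
		done++;
	}

	return done;
}
```

On real ARM hardware each sketch_wfe() may stall until an event arrives, so
a bounded run can take noticeably longer than the iteration count suggests.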

IMHO, this is reasonable - it means we get the workaround for 794072
when 754327 is enabled, but still relinquish the dead processor -
either by placing it in a lower power mode when wfe is implemented as
such or by returning it to the hypervisor, or in the case where wfe
is a no-op, we use the workaround specified in 794072 to avoid the
problem.

I personally see these as two entirely orthogonal problems - the 10
nops addresses 794072, and the wfe is an optimisation that makes the
system more efficient when crashed either in terms of power consumption
or by allowing the host/other VMs to make use of the CPU.

I don't see any reason not to use kexec() inside a VM - it has the
potential to provide automated recovery from a failure of the VM's
kernel with the opportunity for saving a crashdump of the failure.
A panic() with a reboot timeout won't do that, and reading the
libvirt documentation, setting on_reboot to "preserve" won't either
(the documentation states "The preserve action for an on_reboot event
is treated as a destroy".)  Surely it has to be a good thing to
avoid having CPUs spinning inside a VM that is doing no useful
work.


* Re: [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2019-01-25 21:03 [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops Russell King
  2019-01-25 23:20 ` Tony Lindgren
@ 2019-02-01 10:19 ` Will Deacon
  2019-02-01 21:20   ` Russell King - ARM Linux admin
  1 sibling, 1 reply; 16+ messages in thread
From: Will Deacon @ 2019-02-01 10:19 UTC (permalink / raw)
  To: Russell King
  Cc: Tony Lindgren, Paul Walmsley, linux-omap, Rajendra Nayak,
	linux-arm-kernel

Hi Russell,

On Fri, Jan 25, 2019 at 09:03:57PM +0000, Russell King wrote:
> Executing loops such as:
> 
> 	while (1)
> 		cpu_relax();
> 
> with interrupts disabled results in a livelock of the entire system,
> as other CPUs are prevented making progress.  This is most noticable
> as a failure of crashdump kexec, which stops just after issuing:
> 
> 	Loading crashdump kernel...
> 
> to the system console.  A workaround for this is to use 10 nops in
> cpu_relax().
> 
> We also use wfe() in while (1) loops to avoid burning cycles in a
> tight loop, giving the CPU a hint that we're not doing anything
> useful.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> ---
> It's been a while since this was posted, Will's suggestion was to use
> 10 nops in cpu_relax() last time around.  I still prefer wfe() in
> these infinite-not-doing-anything-ever loops.
> 
>  arch/arm/include/asm/barrier.h   | 2 ++
>  arch/arm/include/asm/processor.h | 6 +++++-
>  arch/arm/kernel/machine_kexec.c  | 5 ++++-
>  arch/arm/kernel/smp.c            | 4 +++-
>  arch/arm/mach-omap2/prm_common.c | 4 +++-
>  5 files changed, 17 insertions(+), 4 deletions(-)

Thanks, this looks good to me and your explanation later in the thread makes
a lot of sense:

Acked-by: Will Deacon <will.deacon@arm.com>

Feel free to put some of the erratum writeup that I shared in the commit
message, if you like.

Will


* Re: [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2019-02-01 10:19 ` Will Deacon
@ 2019-02-01 21:20   ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 16+ messages in thread
From: Russell King - ARM Linux admin @ 2019-02-01 21:20 UTC (permalink / raw)
  To: Will Deacon
  Cc: Tony Lindgren, Paul Walmsley, linux-omap, Rajendra Nayak,
	linux-arm-kernel

Hi Will,

On Fri, Feb 01, 2019 at 10:19:19AM +0000, Will Deacon wrote:
> Hi Russell,
> 
> On Fri, Jan 25, 2019 at 09:03:57PM +0000, Russell King wrote:
> > Executing loops such as:
> > 
> > 	while (1)
> > 		cpu_relax();
> > 
> > with interrupts disabled results in a livelock of the entire system,
> > as other CPUs are prevented from making progress.  This is most noticeable
> > as a failure of crashdump kexec, which stops just after issuing:
> > 
> > 	Loading crashdump kernel...
> > 
> > to the system console.  A workaround for this is to use 10 nops in
> > cpu_relax().
> > 
> > We also use wfe() in while (1) loops to avoid burning cycles in a
> > tight loop, giving the CPU a hint that we're not doing anything
> > useful.
> > 
> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> > ---
> > It's been a while since this was posted, Will's suggestion was to use
> > 10 nops in cpu_relax() last time around.  I still prefer wfe() in
> > these infinite-not-doing-anything-ever loops.
> > 
> >  arch/arm/include/asm/barrier.h   | 2 ++
> >  arch/arm/include/asm/processor.h | 6 +++++-
> >  arch/arm/kernel/machine_kexec.c  | 5 ++++-
> >  arch/arm/kernel/smp.c            | 4 +++-
> >  arch/arm/mach-omap2/prm_common.c | 4 +++-
> >  5 files changed, 17 insertions(+), 4 deletions(-)
> 
> Thanks, this looks good to me and your explanation later in the thread makes
> a lot of sense:
> 
> Acked-by: Will Deacon <will.deacon@arm.com>
> 
> Feel free to put some of the erratum writeup that I shared in the commit
> message, if you like.

I think it may make more sense to use my writeup as a basis for a
better commit log that explains why we're doing what we're doing.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

* [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2018-06-04  9:42 ` Will Deacon
@ 2018-06-04 18:08   ` Russell King - ARM Linux
  0 siblings, 0 replies; 16+ messages in thread
From: Russell King - ARM Linux @ 2018-06-04 18:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 04, 2018 at 10:42:43AM +0100, Will Deacon wrote:
> Hi Russell,
> 
> On Fri, Jun 01, 2018 at 12:00:16PM +0100, Russell King wrote:
> > Executing loops such as:
> > 
> > 	while (1)
> > 		cpu_relax();
> >
> > with interrupts disabled results in a livelock of the entire system,
> > as other CPUs are prevented from making progress.  This is most noticeable
> > as a failure of crashdump kexec, which stops just after issuing:
> > 
> > 	Loading crashdump kernel...
> > 
> > to the system console.  Two other locations of these loops within the
> > ARM code have been identified and fixed up.
> 
> Can you confirm that this only happens if CONFIG_ARM_ERRATA_754327=y?

CONFIG_ARM_ERRATA_754327=y + patch => works
CONFIG_ARM_ERRATA_754327=y => fails
CONFIG_ARM_ERRATA_754327=n => works

> The only erratum I can find for A9 that matches this behaviour exists
> when the body of the tight loop contains a DMB and some of the possible
> workarounds are:
> 
>   - Add ten NOPs after the DMB
>   - Use DSB instead of DMB in the tight loop
>   - Set bit 16 in the diagnostic control register (p15, c1, 5, 0, c0, 1)

Yes, I think you pointed me at that.  It may be appropriate to mitigate
tight loops that have a termination condition, but all the loops in
question here are infinite, so finding some way to avoid spinning in
them is probably a good idea regardless.

What I'm more interested in with this patch is fixing kexec crashdump
when CONFIG_ARM_ERRATA_754327=y on OMAP4 (and similar) platforms.

> WFE is probably fine (the write-up isn't clear), but if this only occurs
> due to CONFIG_ARM_ERRATA_754327=y it would be nice to mitigate it in the
> alternative cpu_relax() definition itself, which isn't generally possible
> with WFE.

With the WFE, it is no longer "a tight loop", although WFE is only a
hint to the processor, which could ultimately ignore it.  That said, in
all these cases, either:

- we're talking about a secondary CPU, so SMP must be supported
  (which presumably guarantees implementation of SEV/WFE)
or:
- we're the only CPU, so this problem doesn't apply to the infinite loop
  case.
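
As a rough sketch of the loop shape being argued for here, the code
below pairs a relax hint with a WFE hint and degrades the WFE to a no-op
where the instruction is unavailable.  The names wfe_hint(),
relax_hint(), park_cpu() and MAX_DEMO_SPINS are hypothetical stand-ins
and not the kernel's macros, the architecture test is a simplifying
assumption, and the loop is bounded only so that the sketch can
terminate; the loops in the patch are infinite.

```c
#include <assert.h>

/* Hypothetical stand-ins for wfe() and cpu_relax(); not kernel code. */
#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
/* Assumption: ARMv7 or later, where WFE is available. */
#define wfe_hint()	__asm__ __volatile__("wfe" : : : "memory")
#else
/* No WFE here: degrade to a no-op, like the #else branch in the patch. */
#define wfe_hint()	do { } while (0)
#endif

/* Compiler barrier only; the erratum-specific DMB is deliberately omitted. */
#define relax_hint()	__asm__ __volatile__("" : : : "memory")

/* Upper bound for the demo, since each WFE may wait for an event. */
#define MAX_DEMO_SPINS	16UL

/*
 * Bounded variant of the parking loop: relax, then hint that nothing
 * useful is being done.  Returns the number of iterations executed.
 */
static unsigned long park_cpu(unsigned long max_spins)
{
	unsigned long spins = 0;

	assert(max_spins <= MAX_DEMO_SPINS);

	while (spins < max_spins) {
		relax_hint();
		wfe_hint();
		spins++;
	}
	return spins;
}
```

Calling park_cpu(3) runs three iterations and returns 3.  Because WFE
is only a hint, the loop has to remain correct if the instruction
returns immediately, which is why it stays inside the while loop rather
than replacing it.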

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

* [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2018-06-01 11:00 Russell King
  2018-06-01 15:35 ` Tony Lindgren
@ 2018-06-04  9:42 ` Will Deacon
  2018-06-04 18:08   ` Russell King - ARM Linux
  1 sibling, 1 reply; 16+ messages in thread
From: Will Deacon @ 2018-06-04  9:42 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

On Fri, Jun 01, 2018 at 12:00:16PM +0100, Russell King wrote:
> Executing loops such as:
> 
> 	while (1)
> 		cpu_relax();
>
> with interrupts disabled results in a livelock of the entire system,
> as other CPUs are prevented from making progress.  This is most noticeable
> as a failure of crashdump kexec, which stops just after issuing:
> 
> 	Loading crashdump kernel...
> 
> to the system console.  Two other locations of these loops within the
> ARM code have been identified and fixed up.

Can you confirm that this only happens if CONFIG_ARM_ERRATA_754327=y?
The only erratum I can find for A9 that matches this behaviour exists
when the body of the tight loop contains a DMB and some of the possible
workarounds are:

  - Add ten NOPs after the DMB
  - Use DSB instead of DMB in the tight loop
  - Set bit 16 in the diagnostic control register (p15, c1, 5, 0, c0, 1)

WFE is probably fine (the write-up isn't clear), but if this only occurs
due to CONFIG_ARM_ERRATA_754327=y it would be nice to mitigate it in the
alternative cpu_relax() definition itself, which isn't generally possible
with WFE.
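
As a rough illustration of the first workaround in the list above, the
sketch below pads a DMB with ten NOPs inside a relax primitive used by a
polling loop.  The names dmb_ish(), ten_nops(), relax_with_padding() and
spin_until_set() are hypothetical and are not the kernel's definitions.
The non-ARM fallback exists only so the sketch builds elsewhere, and
whether this padding is sufficient on a given part is for the erratum
write-up to say.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
/* Data memory barrier followed by padding, as in the first workaround. */
#define dmb_ish()	__asm__ __volatile__("dmb ish" : : : "memory")
#define ten_nops()	__asm__ __volatile__( \
	"nop; nop; nop; nop; nop; nop; nop; nop; nop; nop")
#else
/* Assumption: off ARMv7, a full fence stands in for the DMB. */
#define dmb_ish()	__atomic_thread_fence(__ATOMIC_SEQ_CST)
#define ten_nops()	do { } while (0)
#endif

/* Hypothetical relax primitive: barrier first, then the ten NOPs. */
#define relax_with_padding()	do { dmb_ish(); ten_nops(); } while (0)

/*
 * Poll *flag until it becomes non-zero or max_spins iterations pass.
 * Returns the number of times the relax primitive was executed.
 */
static unsigned long spin_until_set(const volatile int *flag,
				    unsigned long max_spins)
{
	unsigned long spins = 0;

	assert(flag != NULL);

	while (!*flag && spins < max_spins) {
		relax_with_padding();
		spins++;
	}
	return spins;
}
```

Keeping the padding inside the relax primitive means every caller picks
it up without any change at the call sites, which is the property asked
for when mitigating in the cpu_relax() definition itself.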

Will

* [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2018-06-01 15:55   ` Russell King - ARM Linux
@ 2018-06-01 16:12     ` Tony Lindgren
  0 siblings, 0 replies; 16+ messages in thread
From: Tony Lindgren @ 2018-06-01 16:12 UTC (permalink / raw)
  To: linux-arm-kernel

* Russell King - ARM Linux <linux@armlinux.org.uk> [180601 15:57]:
> On Fri, Jun 01, 2018 at 08:35:12AM -0700, Tony Lindgren wrote:
> > CONFIG_KERNEL_LZMA fails:
> > 
> > Try gzip decompression.
> > Try LZMA decompression.
> > lzma_decompress_file: read on /boot/zImage of 65536 bytes failed
> > kernel: 0xb6abb010 kernel_size: 0x43d0f0
> > MEMORY RANGES
> > 0000000080000000-00000000bfdfffff (0)
> > zImage header: 0x016f2818 0x00000000 0x0043d0f0
> > zImage size 0x43d0f0, file size 0x43d0f0
> > Reserved memory ranges
> 
> This looks like an old kexec binary as it's missing the output from:
> 
>         dbgprintf("zImage requires 0x%08llx bytes\n", (unsigned long long)len);
> 
> Please can you test with the current version - the official
> repository should now be up to date with my version.  Thanks.

OK great. After updating kexec-tools to the latest git version, the
LZMA crashkernel now also boots for me :)

Regards,

Tony

* [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2018-06-01 15:35 ` Tony Lindgren
@ 2018-06-01 15:55   ` Russell King - ARM Linux
  2018-06-01 16:12     ` Tony Lindgren
  0 siblings, 1 reply; 16+ messages in thread
From: Russell King - ARM Linux @ 2018-06-01 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 01, 2018 at 08:35:12AM -0700, Tony Lindgren wrote:
> * Russell King <rmk+kernel@armlinux.org.uk> [180601 11:02]:
> > Executing loops such as:
> > 
> > 	while (1)
> > 		cpu_relax();
> > 
> > with interrupts disabled results in a livelock of the entire system,
> > as other CPUs are prevented from making progress.  This is most noticeable
> > as a failure of crashdump kexec, which stops just after issuing:
> > 
> > 	Loading crashdump kernel...
> > 
> > to the system console.  Two other locations of these loops within the
> > ARM code have been identified and fixed up.
> > 
> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> 
> Works for me thanks:
> 
> Tested-by: Tony Lindgren <tony@atomide.com>

Thanks.

> BTW, do LZMA crashkernels boot for you with crashdump?
> 
> For me LZMA crashkernels fail to boot while GZIP crashkernels
> boot. Some more info below for failing and working output.
> 
> Regards,
> 
> Tony
> 
> 8< ----------------------
> CONFIG_KERNEL_LZMA fails:
> 
> Try gzip decompression.
> Try LZMA decompression.
> lzma_decompress_file: read on /boot/zImage of 65536 bytes failed
> kernel: 0xb6abb010 kernel_size: 0x43d0f0
> MEMORY RANGES
> 0000000080000000-00000000bfdfffff (0)
> zImage header: 0x016f2818 0x00000000 0x0043d0f0
> zImage size 0x43d0f0, file size 0x43d0f0
> Reserved memory ranges

This looks like an old kexec binary as it's missing the output from:

        dbgprintf("zImage requires 0x%08llx bytes\n", (unsigned long long)len);

Please can you test with the current version - the official
repository should now be up to date with my version.  Thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

* [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
  2018-06-01 11:00 Russell King
@ 2018-06-01 15:35 ` Tony Lindgren
  2018-06-01 15:55   ` Russell King - ARM Linux
  2018-06-04  9:42 ` Will Deacon
  1 sibling, 1 reply; 16+ messages in thread
From: Tony Lindgren @ 2018-06-01 15:35 UTC (permalink / raw)
  To: linux-arm-kernel

* Russell King <rmk+kernel@armlinux.org.uk> [180601 11:02]:
> Executing loops such as:
> 
> 	while (1)
> 		cpu_relax();
> 
> with interrupts disabled results in a livelock of the entire system,
> as other CPUs are prevented from making progress.  This is most noticeable
> as a failure of crashdump kexec, which stops just after issuing:
> 
> 	Loading crashdump kernel...
> 
> to the system console.  Two other locations of these loops within the
> ARM code have been identified and fixed up.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Works for me thanks:

Tested-by: Tony Lindgren <tony@atomide.com>

BTW, do LZMA crashkernels boot for you with crashdump?

For me LZMA crashkernels fail to boot while GZIP crashkernels
boot. Some more info below for failing and working output.

Regards,

Tony

8< ----------------------
CONFIG_KERNEL_LZMA fails:

Try gzip decompression.
Try LZMA decompression.
lzma_decompress_file: read on /boot/zImage of 65536 bytes failed
kernel: 0xb6abb010 kernel_size: 0x43d0f0
MEMORY RANGES
0000000080000000-00000000bfdfffff (0)
zImage header: 0x016f2818 0x00000000 0x0043d0f0
zImage size 0x43d0f0, file size 0x43d0f0
Reserved memory ranges
00000000a8000000-00000000abffffff (0)
Coredump memory ranges
0000000080000000-00000000a7ffffff (0)
00000000ac000000-00000000bfdfffff (0)
kernel symbol _stext vaddr =         c0100000
phys offset = 0x80000000, page offset = c0000000
Using 32-bit ELF core format
get_crash_notes_per_cpu: crash_notes addr = be001200, size = 180
Elf header: p_type = 4, p_offset = 0xbe001200 p_paddr = 0xbe001200 p_vaddr = 0x0 p_filesz = 0xb4 p_memsz = 0xb4
get_crash_notes_per_cpu: crash_notes addr = be002200, size = 180
Elf header: p_type = 4, p_offset = 0xbe002200 p_paddr = 0xbe002200 p_vaddr = 0x0 p_filesz = 0xb4 p_memsz = 0xb4
vmcoreinfo header: p_type = 4, p_offset = 0xaeae2000 p_paddr = 0xaeae2000 p_vaddr = 0x0 p_filesz = 0x1024 p_memsz = 0x1024
Elf header: p_type = 1, p_offset = 0x80000000 p_paddr = 0x80000000 p_vaddr = 0xc0000000 p_filesz = 0x28000000 p_memsz = 0x28000000
Elf header: p_type = 1, p_offset = 0xac000000 p_paddr = 0xac000000 p_vaddr = 0xec000000 p_filesz = 0x13e00000 p_memsz = 0x13e00000
elfcorehdr: 0xabf00000
crashkernel: [0xa8000000 - 0xabffffff] (64M)
memory range: [0x80000000 - 0xa7ffffff] (640M)
memory range: [0xac000000 - 0xbfdfffff] (318M)
kernel command line: "console=ttyS2,115200n8 root=/dev/nfs ip=dhcp debug earlyprintk earlycon crashkernel=64M elfcorehdr=0xabf00000 mem=64512K"
kexec_load: entry = 0xa8008000 flags = 0x280001
nr_segments = 3
segment[0].buf   = 0xb6abb010
segment[0].bufsz = 0x43d0f0
segment[0].mem   = 0xa8008000
segment[0].memsz = 0x43e000
segment[1].buf   = 0x53dc80
segment[1].bufsz = 0x128da
segment[1].mem   = 0xa953b000
segment[1].memsz = 0x13000
segment[2].buf   = 0xb6fe8150
segment[2].bufsz = 0x400
segment[2].mem   = 0xabf00000
segment[2].memsz = 0x1000


CONFIG_KERNEL_GZIP works:

Try gzip decompression.
Try LZMA decompression.
lzma_decompress_file: read on /boot/zImage of 65536 bytes failed
kernel: 0xb693f010 kernel_size: 0x5c74a8
MEMORY RANGES
0000000080000000-00000000bfdfffff (0)
zImage header: 0x016f2818 0x00000000 0x005c74a8
zImage size 0x5c74a8, file size 0x5c74a8
Reserved memory ranges
00000000a8000000-00000000abffffff (0)
Coredump memory ranges
0000000080000000-00000000a7ffffff (0)
00000000ac000000-00000000bfdfffff (0)
kernel symbol _stext vaddr =         c0100000
phys offset = 0x80000000, page offset = c0000000
Using 32-bit ELF core format
get_crash_notes_per_cpu: crash_notes addr = be001200, size = 180
Elf header: p_type = 4, p_offset = 0xbe001200 p_paddr = 0xbe001200 p_vaddr = 0x0 p_filesz = 0xb4 p_memsz = 0xb4
get_crash_notes_per_cpu: crash_notes addr = be002200, size = 180
Elf header: p_type = 4, p_offset = 0xbe002200 p_paddr = 0xbe002200 p_vaddr = 0x0 p_filesz = 0xb4 p_memsz = 0xb4
vmcoreinfo header: p_type = 4, p_offset = 0xaeae2000 p_paddr = 0xaeae2000 p_vaddr = 0x0 p_filesz = 0x1024 p_memsz = 0x1024
Elf header: p_type = 1, p_offset = 0x80000000 p_paddr = 0x80000000 p_vaddr = 0xc0000000 p_filesz = 0x28000000 p_memsz = 0x28000000
Elf header: p_type = 1, p_offset = 0xac000000 p_paddr = 0xac000000 p_vaddr = 0xec000000 p_filesz = 0x13e00000 p_memsz = 0x13e00000
elfcorehdr: 0xabf00000
crashkernel: [0xa8000000 - 0xabffffff] (64M)
memory range: [0x80000000 - 0xa7ffffff] (640M)
memory range: [0xac000000 - 0xbfdfffff] (318M)
kernel command line: "console=ttyS2,115200n8 root=/dev/nfs ip=dhcp debug earlyprintk earlycon crashkernel=64M elfcorehdr=0xabf00000 mem=64512K"
kexec_load: entry = 0xa8008000 flags = 0x280001
nr_segments = 3
segment[0].buf   = 0xb693f010
segment[0].bufsz = 0x5c74a8
segment[0].mem   = 0xa8008000
segment[0].memsz = 0x5c8000
segment[1].buf   = 0x476c80
segment[1].bufsz = 0x128da
segment[1].mem   = 0xa9cee000
segment[1].memsz = 0x13000
segment[2].buf   = 0xb6ff6150
segment[2].bufsz = 0x400
segment[2].mem   = 0xabf00000
segment[2].memsz = 0x1000
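
The numbers in the two dumps above can be cross-checked with a little
arithmetic.  The sketch below assumes a 4 KiB page size, which is an
inference from the output and is not stated in it, and uses hypothetical
helper names (round_up_to_page(), range_size_mib()).  Under that
assumption each segment's memsz matches its bufsz rounded up to a page
boundary, and the range sizes match the "(64M)", "(640M)" and "(318M)"
annotations.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, inferred from the memsz values in the dumps. */
#define PAGE_SIZE_ASSUMED	0x1000u

/* Round a buffer size up to the next page boundary. */
static uint32_t round_up_to_page(uint32_t bufsz)
{
	return (bufsz + PAGE_SIZE_ASSUMED - 1) & ~(PAGE_SIZE_ASSUMED - 1);
}

/* Size in MiB of an inclusive physical address range [start, end]. */
static uint32_t range_size_mib(uint32_t start, uint32_t end_inclusive)
{
	assert(end_inclusive >= start);
	/* Widen first so that end - start + 1 cannot overflow 32 bits. */
	return (uint32_t)(((uint64_t)end_inclusive - start + 1) >> 20);
}
```

For example, round_up_to_page(0x43d0f0) gives 0x43e000 for the LZMA
zImage segment, and range_size_mib(0xa8000000, 0xabffffff) gives 64 for
the crashkernel reservation.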

* [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops
@ 2018-06-01 11:00 Russell King
  2018-06-01 15:35 ` Tony Lindgren
  2018-06-04  9:42 ` Will Deacon
  0 siblings, 2 replies; 16+ messages in thread
From: Russell King @ 2018-06-01 11:00 UTC (permalink / raw)
  To: linux-arm-kernel

Executing loops such as:

	while (1)
		cpu_relax();

with interrupts disabled results in a livelock of the entire system,
as other CPUs are prevented from making progress.  This is most noticeable
as a failure of crashdump kexec, which stops just after issuing:

	Loading crashdump kernel...

to the system console.  Two other locations of these loops within the
ARM code have been identified and fixed up.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
v2: use wfe() instead of cpu_do_idle

 arch/arm/include/asm/barrier.h   | 2 ++
 arch/arm/kernel/machine_kexec.c  | 5 ++++-
 arch/arm/kernel/smp.c            | 4 +++-
 arch/arm/mach-omap2/prm_common.c | 4 +++-
 4 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 69772e742a0a..83ae97c049d9 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -11,6 +11,8 @@
 #define sev()	__asm__ __volatile__ ("sev" : : : "memory")
 #define wfe()	__asm__ __volatile__ ("wfe" : : : "memory")
 #define wfi()	__asm__ __volatile__ ("wfi" : : : "memory")
+#else
+#define wfe()	do { } while (0)
 #endif
 
 #if __LINUX_ARM_ARCH__ >= 7
diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index dd2eb5f76b9f..76300f3813e8 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -91,8 +91,11 @@ void machine_crash_nonpanic_core(void *unused)
 
 	set_cpu_online(smp_processor_id(), false);
 	atomic_dec(&waiting_for_crash_ipi);
-	while (1)
+
+	while (1) {
 		cpu_relax();
+		wfe();
+	}
 }
 
 void crash_smp_send_stop(void)
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 9ec9a366ef44..a39fe6ab89a2 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -570,8 +570,10 @@ static void ipi_cpu_stop(unsigned int cpu)
 	local_fiq_disable();
 	local_irq_disable();
 
-	while (1)
+	while (1) {
 		cpu_relax();
+		wfe();
+	}
 }
 
 static DEFINE_PER_CPU(struct completion *, cpu_completion);
diff --git a/arch/arm/mach-omap2/prm_common.c b/arch/arm/mach-omap2/prm_common.c
index 021b5a8b9c0a..bb15a9eec05c 100644
--- a/arch/arm/mach-omap2/prm_common.c
+++ b/arch/arm/mach-omap2/prm_common.c
@@ -522,8 +522,10 @@ void omap_prm_reset_system(void)
 
 	prm_ll_data->reset_system();
 
-	while (1)
+	while (1) {
 		cpu_relax();
+		wfe();
+	}
 }
 
 /**
-- 
2.7.4

end of thread, other threads:[~2019-02-01 21:20 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-25 21:03 [PATCH v2] ARM: avoid Cortex-A9 livelock on tight dmb loops Russell King
2019-01-25 23:20 ` Tony Lindgren
2019-01-26 21:00   ` Paul Walmsley
2019-01-26 23:51     ` Russell King - ARM Linux admin
2019-01-27  1:15       ` Paul Walmsley
2019-01-27 15:28         ` Russell King - ARM Linux admin
2019-01-31 13:58           ` Will Deacon
2019-01-31 22:58             ` Russell King - ARM Linux admin
2019-02-01 10:19 ` Will Deacon
2019-02-01 21:20   ` Russell King - ARM Linux admin
  -- strict thread matches above, loose matches on Subject: below --
2018-06-01 11:00 Russell King
2018-06-01 15:35 ` Tony Lindgren
2018-06-01 15:55   ` Russell King - ARM Linux
2018-06-01 16:12     ` Tony Lindgren
2018-06-04  9:42 ` Will Deacon
2018-06-04 18:08   ` Russell King - ARM Linux
