All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] ARM: shmobile: Fix loop-based delays
@ 2016-01-11 18:41 Geert Uytterhoeven
  2016-01-11 18:41 ` [PATCH 1/3] ARM: shmobile: timer: Fix preset_lpj leading to too short delays Geert Uytterhoeven
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Geert Uytterhoeven @ 2016-01-11 18:41 UTC (permalink / raw)
  To: linux-sh

	Hi Simon, Magnus,

I accidentally noticed that on r8a7740/armadillo, "mdelay(1000)" took
only 333 ms with my personal config, while it took 3675 ms with
shmobile_defconfig.

Finding out why brought me an animating journey through obscure code
and kernel config options...

Apparently loop-based delays may complete early on all shmobile SoCs,
which may lead to subtle driver failures. Fortunately(?) this was
mitigated by:
  1. the ARM arch timer on Cortex A7 and A15, which is used instead of
     loop-based delays if available,
  2. shmobile_defconfig having CONFIG_CPU_BPREDICT_DISABLE enabled
     (why?), which makes loops up to 16 times slower.

This patch series fixes both, and removes a related check for Cortex A8
CPU cores, which are no longer supported anyway.

Tested on a collection of shmobile SoCs with Cortex A7, A9, and A15 CPU
cores, with some verification on BeagleBone Black (Cortex A8).

Thanks for applying!

Geert Uytterhoeven (3):
  ARM: shmobile: timer: Fix preset_lpj leading to too short delays
  ARM: shmobile: timer: Drop support for Cortex A8
  ARM: shmobile: defconfig: Do not enable CONFIG_CPU_BPREDICT_DISABLE

 arch/arm/configs/shmobile_defconfig |  1 -
 arch/arm/mach-shmobile/timer.c      | 26 +++++++++-----------------
 2 files changed, 9 insertions(+), 18 deletions(-)

-- 
1.9.1

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] ARM: shmobile: timer: Fix preset_lpj leading to too short delays
@ 2016-01-11 18:41 ` Geert Uytterhoeven
  2016-01-18  1:42     ` Simon Horman
  0 siblings, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2016-01-11 18:41 UTC (permalink / raw)
  To: linux-sh

On all shmobile ARM SoCs, loop-based delays may complete early, which
can be after only 1/3 (Cortex A9) or 1/2 (Cortex A7 or A15) of the
minimum required time.

This is caused by calculating preset_lpj based on incorrect assumptions
about the number of clock cycles per loop:
  - All of Cortex A7, A9, and A15 run __loop_delay() at 1 loop per
    CPU clock cycle,
  - As of commit 11d4bb1bd067f9d0 ("ARM: 7907/1: lib: delay-loop: Add
    align directive to fix BogoMIPS calculation"), Cortex A8 runs
    __loop_delay() at 1 loop per 2 instead of 3 CPU clock cycles.

On SoCs with Cortex A7 and/or A15 CPU cores, this went unnoticed, as
delays use the ARM arch timer if available. R-Car Gen2 doesn't work if
the arch timer is disabled. However, APE6 can be used without the arch
timer.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
Backporting note: older kernels need modifications in all calls to
shmobile_setup_delay() or shmobile_setup_delay_hz().
---
 arch/arm/mach-shmobile/timer.c | 28 +++++++++++-----------------
 1 file changed, 11 insertions(+), 17 deletions(-)

diff --git a/arch/arm/mach-shmobile/timer.c b/arch/arm/mach-shmobile/timer.c
index c17d4d3881ffc45e..d61014a52b43e205 100644
--- a/arch/arm/mach-shmobile/timer.c
+++ b/arch/arm/mach-shmobile/timer.c
@@ -38,8 +38,7 @@ static void __init shmobile_setup_delay_hz(unsigned int max_cpu_core_hz,
 void __init shmobile_init_delay(void)
 {
 	struct device_node *np, *cpus;
-	bool is_a7_a8_a9 = false;
-	bool is_a15 = false;
+	unsigned int div = 0;
 	bool has_arch_timer = false;
 	u32 max_freq = 0;
 
@@ -53,27 +52,22 @@ void __init shmobile_init_delay(void)
 		if (!of_property_read_u32(np, "clock-frequency", &freq))
 			max_freq = max(max_freq, freq);
 
-		if (of_device_is_compatible(np, "arm,cortex-a8") ||
-		    of_device_is_compatible(np, "arm,cortex-a9")) {
-			is_a7_a8_a9 = true;
-		} else if (of_device_is_compatible(np, "arm,cortex-a7")) {
-			is_a7_a8_a9 = true;
-			has_arch_timer = true;
-		} else if (of_device_is_compatible(np, "arm,cortex-a15")) {
-			is_a15 = true;
+		if (of_device_is_compatible(np, "arm,cortex-a8")) {
+			div = 2;
+		} else if (of_device_is_compatible(np, "arm,cortex-a9")) {
+			div = 1;
+		} else if (of_device_is_compatible(np, "arm,cortex-a7") ||
+			 of_device_is_compatible(np, "arm,cortex-a15")) {
+			div = 1;
 			has_arch_timer = true;
 		}
 	}
 
 	of_node_put(cpus);
 
-	if (!max_freq)
+	if (!max_freq || !div)
 		return;
 
-	if (!has_arch_timer || !IS_ENABLED(CONFIG_ARM_ARCH_TIMER)) {
-		if (is_a7_a8_a9)
-			shmobile_setup_delay_hz(max_freq, 1, 3);
-		else if (is_a15)
-			shmobile_setup_delay_hz(max_freq, 2, 4);
-	}
+	if (!has_arch_timer || !IS_ENABLED(CONFIG_ARM_ARCH_TIMER))
+		shmobile_setup_delay_hz(max_freq, 1, div);
 }
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/3] ARM: shmobile: timer: Drop support for Cortex A8
@ 2016-01-11 18:41 ` Geert Uytterhoeven
  2016-01-18  1:45     ` Simon Horman
  0 siblings, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2016-01-11 18:41 UTC (permalink / raw)
  To: linux-sh

Commit edf4100906044225 ("ARM: shmobile: sh7372 dtsi: Remove Legacy
file") removed the DTS for the last shmobile SoC with a Cortex A8 CPU
core (sh7372 aka SH-Mobile AP4).

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
 arch/arm/mach-shmobile/timer.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/arm/mach-shmobile/timer.c b/arch/arm/mach-shmobile/timer.c
index d61014a52b43e205..15fc9db797ea6359 100644
--- a/arch/arm/mach-shmobile/timer.c
+++ b/arch/arm/mach-shmobile/timer.c
@@ -52,9 +52,7 @@ void __init shmobile_init_delay(void)
 		if (!of_property_read_u32(np, "clock-frequency", &freq))
 			max_freq = max(max_freq, freq);
 
-		if (of_device_is_compatible(np, "arm,cortex-a8")) {
-			div = 2;
-		} else if (of_device_is_compatible(np, "arm,cortex-a9")) {
+		if (of_device_is_compatible(np, "arm,cortex-a9")) {
 			div = 1;
 		} else if (of_device_is_compatible(np, "arm,cortex-a7") ||
 			 of_device_is_compatible(np, "arm,cortex-a15")) {
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] ARM: shmobile: defconfig: Do not enable CONFIG_CPU_BPREDICT_DISABLE
@ 2016-01-11 18:41 ` Geert Uytterhoeven
  2016-01-18  1:48     ` Simon Horman
  0 siblings, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2016-01-11 18:41 UTC (permalink / raw)
  To: linux-sh

Do not disable branch prediction, as this has a big performance impact.

After re-enabling branch prediction, __loop_delay() runs 16 (Cortex A15)
or 11 (Cortex A9) times faster than before.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
This used to be enabled in:

  - ag5evm_defconfig (sh73a0)
  - ape6evm_defconfig (r8a73a4)
  - armadillo800eva_defconfig (r8a7740)
  - koelsch_defconfig (r8a7791)
  - kota2_defconfig (sh73a0)

Was there ever a good reason for that?
---
 arch/arm/configs/shmobile_defconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm/configs/shmobile_defconfig b/arch/arm/configs/shmobile_defconfig
index 969738324a5d5f81..5208b5caacd1a4e2 100644
--- a/arch/arm/configs/shmobile_defconfig
+++ b/arch/arm/configs/shmobile_defconfig
@@ -20,7 +20,6 @@ CONFIG_ARCH_R8A7791=y
 CONFIG_ARCH_R8A7793=y
 CONFIG_ARCH_R8A7794=y
 CONFIG_ARCH_SH73A0=y
-CONFIG_CPU_BPREDICT_DISABLE=y
 CONFIG_PL310_ERRATA_588369=y
 CONFIG_ARM_ERRATA_754322=y
 CONFIG_PCI=y
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] ARM: shmobile: timer: Fix preset_lpj leading to too short delays
  2016-01-11 18:41 ` [PATCH 1/3] ARM: shmobile: timer: Fix preset_lpj leading to too short delays Geert Uytterhoeven
@ 2016-01-18  1:42     ` Simon Horman
  0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2016-01-18  1:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 11, 2016 at 07:41:12PM +0100, Geert Uytterhoeven wrote:
> On all shmobile ARM SoCs, loop-based delays may complete early, which
> can be after only 1/3 (Cortex A9) or 1/2 (Cortex A7 or A15) of the
> minimum required time.
> 
> This is caused by calculating preset_lpj based on incorrect assumptions
> about the number of clock cycles per loop:
>   - All of Cortex A7, A9, and A15 run __loop_delay() at 1 loop per
>     CPU clock cycle,
>   - As of commit 11d4bb1bd067f9d0 ("ARM: 7907/1: lib: delay-loop: Add
>     align directive to fix BogoMIPS calculation"), Cortex A8 runs
>     __loop_delay() at 1 loop per 2 instead of 3 CPU clock cycles.
> 
> On SoCs with Cortex A7 and/or A15 CPU cores, this went unnoticed, as
> delays use the ARM arch timer if available. R-Car Gen2 doesn't work if
> the arch timer is disabled. However, APE6 can be used without the arch
> timer.
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> Backporting note: older kernels need modifications in all calls to
> shmobile_setup_delay() or shmobile_setup_delay_hz().

Thanks, I have queued this up for v4.6.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] ARM: shmobile: timer: Fix preset_lpj leading to too short delays
@ 2016-01-18  1:42     ` Simon Horman
  0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2016-01-18  1:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 11, 2016 at 07:41:12PM +0100, Geert Uytterhoeven wrote:
> On all shmobile ARM SoCs, loop-based delays may complete early, which
> can be after only 1/3 (Cortex A9) or 1/2 (Cortex A7 or A15) of the
> minimum required time.
> 
> This is caused by calculating preset_lpj based on incorrect assumptions
> about the number of clock cycles per loop:
>   - All of Cortex A7, A9, and A15 run __loop_delay() at 1 loop per
>     CPU clock cycle,
>   - As of commit 11d4bb1bd067f9d0 ("ARM: 7907/1: lib: delay-loop: Add
>     align directive to fix BogoMIPS calculation"), Cortex A8 runs
>     __loop_delay() at 1 loop per 2 instead of 3 CPU clock cycles.
> 
> On SoCs with Cortex A7 and/or A15 CPU cores, this went unnoticed, as
> delays use the ARM arch timer if available. R-Car Gen2 doesn't work if
> the arch timer is disabled. However, APE6 can be used without the arch
> timer.
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> Backporting note: older kernels need modifications in all calls to
> shmobile_setup_delay() or shmobile_setup_delay_hz().

Thanks, I have queued this up for v4.6.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] ARM: shmobile: timer: Drop support for Cortex A8
  2016-01-11 18:41 ` [PATCH 2/3] ARM: shmobile: timer: Drop support for Cortex A8 Geert Uytterhoeven
@ 2016-01-18  1:45     ` Simon Horman
  0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2016-01-18  1:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 11, 2016 at 07:41:13PM +0100, Geert Uytterhoeven wrote:
> Commit edf4100906044225 ("ARM: shmobile: sh7372 dtsi: Remove Legacy
> file") removed the DTS for the last shmobile SoC with a Cortex A8 CPU
> core (sh7372 aka SH-Mobile AP4).
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>

Thanks, I have queued this up for v4.6.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 2/3] ARM: shmobile: timer: Drop support for Cortex A8
@ 2016-01-18  1:45     ` Simon Horman
  0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2016-01-18  1:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 11, 2016 at 07:41:13PM +0100, Geert Uytterhoeven wrote:
> Commit edf4100906044225 ("ARM: shmobile: sh7372 dtsi: Remove Legacy
> file") removed the DTS for the last shmobile SoC with a Cortex A8 CPU
> core (sh7372 aka SH-Mobile AP4).
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>

Thanks, I have queued this up for v4.6.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] ARM: shmobile: defconfig: Do not enable CONFIG_CPU_BPREDICT_DISABLE
  2016-01-11 18:41 ` [PATCH 3/3] ARM: shmobile: defconfig: Do not enable CONFIG_CPU_BPREDICT_DISABLE Geert Uytterhoeven
@ 2016-01-18  1:48     ` Simon Horman
  0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2016-01-18  1:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 11, 2016 at 07:41:14PM +0100, Geert Uytterhoeven wrote:
> Do not disable branch prediction, as this has a big performance impact.
> 
> After re-enabling branch prediction, __loop_delay() runs 16 (Cortex A15)
> or 11 (Cortex A9) times faster than before.
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> This used to be enabled in:
> 
>   - ag5evm_defconfig (sh73a0)
>   - ape6evm_defconfig (r8a73a4)
>   - armadillo800eva_defconfig (r8a7740)
>   - koelsch_defconfig (r8a7791)
>   - kota2_defconfig (sh73a0)
> 
> Was there ever a good reason for that?

If that was a concious decision then I believe it was made before my time.
I'm going ahead and queuing up this change for v4.6.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 3/3] ARM: shmobile: defconfig: Do not enable CONFIG_CPU_BPREDICT_DISABLE
@ 2016-01-18  1:48     ` Simon Horman
  0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2016-01-18  1:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 11, 2016 at 07:41:14PM +0100, Geert Uytterhoeven wrote:
> Do not disable branch prediction, as this has a big performance impact.
> 
> After re-enabling branch prediction, __loop_delay() runs 16 (Cortex A15)
> or 11 (Cortex A9) times faster than before.
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> This used to be enabled in:
> 
>   - ag5evm_defconfig (sh73a0)
>   - ape6evm_defconfig (r8a73a4)
>   - armadillo800eva_defconfig (r8a7740)
>   - koelsch_defconfig (r8a7791)
>   - kota2_defconfig (sh73a0)
> 
> Was there ever a good reason for that?

If that was a concious decision then I believe it was made before my time.
I'm going ahead and queuing up this change for v4.6.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2016-01-18  1:48 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-11 18:41 [PATCH 0/3] ARM: shmobile: Fix loop-based delays Geert Uytterhoeven
2016-01-11 18:41 ` [PATCH 1/3] ARM: shmobile: timer: Fix preset_lpj leading to too short delays Geert Uytterhoeven
2016-01-18  1:42   ` Simon Horman
2016-01-18  1:42     ` Simon Horman
2016-01-11 18:41 ` [PATCH 2/3] ARM: shmobile: timer: Drop support for Cortex A8 Geert Uytterhoeven
2016-01-18  1:45   ` Simon Horman
2016-01-18  1:45     ` Simon Horman
2016-01-11 18:41 ` [PATCH 3/3] ARM: shmobile: defconfig: Do not enable CONFIG_CPU_BPREDICT_DISABLE Geert Uytterhoeven
2016-01-18  1:48   ` Simon Horman
2016-01-18  1:48     ` Simon Horman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.