* [PATCH 0/3] ARM: shmobile: Fix loop-based delays
@ 2016-01-11 18:41 Geert Uytterhoeven
2016-01-11 18:41 ` [PATCH 1/3] ARM: shmobile: timer: Fix preset_lpj leading to too short delays Geert Uytterhoeven
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Geert Uytterhoeven @ 2016-01-11 18:41 UTC (permalink / raw)
To: linux-sh
Hi Simon, Magnus,
I accidentally noticed that on r8a7740/armadillo, "mdelay(1000)" took
only 333 ms with my personal config, while it took 3675 ms with
shmobile_defconfig.
Finding out why brought me an animating journey through obscure code
and kernel config options...
Apparently loop-based delays may complete early on all shmobile SoCs,
which may lead to subtle driver failures. Fortunately(?) this was
mitigated by:
1. the ARM arch timer on Cortex A7 and A15, which is used instead of
loop-based delays if available,
2. shmobile_defconfig having CONFIG_CPU_BPREDICT_DISABLE enabled
(why?), which makes loops up to 16 times slower.
This patch series fixes both, and removes a related check for Cortex A8
CPU cores, which are no longer supported anyway.
Tested on a collection of shmobile SoCs with Cortex A7, A9, and A15 CPU
cores, with some verification on BeagleBone Black (Cortex A8).
Thanks for applying!
Geert Uytterhoeven (3):
ARM: shmobile: timer: Fix preset_lpj leading to too short delays
ARM: shmobile: timer: Drop support for Cortex A8
ARM: shmobile: defconfig: Do not enable CONFIG_CPU_BPREDICT_DISABLE
arch/arm/configs/shmobile_defconfig | 1 -
arch/arm/mach-shmobile/timer.c | 26 +++++++++-----------------
2 files changed, 9 insertions(+), 18 deletions(-)
--
1.9.1
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/3] ARM: shmobile: timer: Fix preset_lpj leading to too short delays
@ 2016-01-11 18:41 ` Geert Uytterhoeven
2016-01-18 1:42 ` Simon Horman
0 siblings, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2016-01-11 18:41 UTC (permalink / raw)
To: linux-sh
On all shmobile ARM SoCs, loop-based delays may complete early, which
can be after only 1/3 (Cortex A9) or 1/2 (Cortex A7 or A15) of the
minimum required time.
This is caused by calculating preset_lpj based on incorrect assumptions
about the number of clock cycles per loop:
- All of Cortex A7, A9, and A15 run __loop_delay() at 1 loop per
CPU clock cycle,
- As of commit 11d4bb1bd067f9d0 ("ARM: 7907/1: lib: delay-loop: Add
align directive to fix BogoMIPS calculation"), Cortex A8 runs
__loop_delay() at 1 loop per 2 instead of 3 CPU clock cycles.
On SoCs with Cortex A7 and/or A15 CPU cores, this went unnoticed, as
delays use the ARM arch timer if available. R-Car Gen2 doesn't work if
the arch timer is disabled. However, APE6 can be used without the arch
timer.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
Backporting note: older kernels need modifications in all calls to
shmobile_setup_delay() or shmobile_setup_delay_hz().
---
arch/arm/mach-shmobile/timer.c | 28 +++++++++++-----------------
1 file changed, 11 insertions(+), 17 deletions(-)
diff --git a/arch/arm/mach-shmobile/timer.c b/arch/arm/mach-shmobile/timer.c
index c17d4d3881ffc45e..d61014a52b43e205 100644
--- a/arch/arm/mach-shmobile/timer.c
+++ b/arch/arm/mach-shmobile/timer.c
@@ -38,8 +38,7 @@ static void __init shmobile_setup_delay_hz(unsigned int max_cpu_core_hz,
void __init shmobile_init_delay(void)
{
struct device_node *np, *cpus;
- bool is_a7_a8_a9 = false;
- bool is_a15 = false;
+ unsigned int div = 0;
bool has_arch_timer = false;
u32 max_freq = 0;
@@ -53,27 +52,22 @@ void __init shmobile_init_delay(void)
if (!of_property_read_u32(np, "clock-frequency", &freq))
max_freq = max(max_freq, freq);
- if (of_device_is_compatible(np, "arm,cortex-a8") ||
- of_device_is_compatible(np, "arm,cortex-a9")) {
- is_a7_a8_a9 = true;
- } else if (of_device_is_compatible(np, "arm,cortex-a7")) {
- is_a7_a8_a9 = true;
- has_arch_timer = true;
- } else if (of_device_is_compatible(np, "arm,cortex-a15")) {
- is_a15 = true;
+ if (of_device_is_compatible(np, "arm,cortex-a8")) {
+ div = 2;
+ } else if (of_device_is_compatible(np, "arm,cortex-a9")) {
+ div = 1;
+ } else if (of_device_is_compatible(np, "arm,cortex-a7") ||
+ of_device_is_compatible(np, "arm,cortex-a15")) {
+ div = 1;
has_arch_timer = true;
}
}
of_node_put(cpus);
- if (!max_freq)
+ if (!max_freq || !div)
return;
- if (!has_arch_timer || !IS_ENABLED(CONFIG_ARM_ARCH_TIMER)) {
- if (is_a7_a8_a9)
- shmobile_setup_delay_hz(max_freq, 1, 3);
- else if (is_a15)
- shmobile_setup_delay_hz(max_freq, 2, 4);
- }
+ if (!has_arch_timer || !IS_ENABLED(CONFIG_ARM_ARCH_TIMER))
+ shmobile_setup_delay_hz(max_freq, 1, div);
}
--
1.9.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/3] ARM: shmobile: timer: Drop support for Cortex A8
@ 2016-01-11 18:41 ` Geert Uytterhoeven
2016-01-18 1:45 ` Simon Horman
0 siblings, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2016-01-11 18:41 UTC (permalink / raw)
To: linux-sh
Commit edf4100906044225 ("ARM: shmobile: sh7372 dtsi: Remove Legacy
file") removed the DTS for the last shmobile SoC with a Cortex A8 CPU
core (sh7372 aka SH-Mobile AP4).
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
arch/arm/mach-shmobile/timer.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/arm/mach-shmobile/timer.c b/arch/arm/mach-shmobile/timer.c
index d61014a52b43e205..15fc9db797ea6359 100644
--- a/arch/arm/mach-shmobile/timer.c
+++ b/arch/arm/mach-shmobile/timer.c
@@ -52,9 +52,7 @@ void __init shmobile_init_delay(void)
if (!of_property_read_u32(np, "clock-frequency", &freq))
max_freq = max(max_freq, freq);
- if (of_device_is_compatible(np, "arm,cortex-a8")) {
- div = 2;
- } else if (of_device_is_compatible(np, "arm,cortex-a9")) {
+ if (of_device_is_compatible(np, "arm,cortex-a9")) {
div = 1;
} else if (of_device_is_compatible(np, "arm,cortex-a7") ||
of_device_is_compatible(np, "arm,cortex-a15")) {
--
1.9.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/3] ARM: shmobile: defconfig: Do not enable CONFIG_CPU_BPREDICT_DISABLE
@ 2016-01-11 18:41 ` Geert Uytterhoeven
2016-01-18 1:48 ` Simon Horman
0 siblings, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2016-01-11 18:41 UTC (permalink / raw)
To: linux-sh
Do not disable branch prediction, as this has a big performance impact.
After re-enabling branch prediction, __loop_delay() runs 16 (Cortex A15)
or 11 (Cortex A9) times faster than before.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
This used to be enabled in:
- ag5evm_defconfig (sh73a0)
- ape6evm_defconfig (r8a73a4)
- armadillo800eva_defconfig (r8a7740)
- koelsch_defconfig (r8a7791)
- kota2_defconfig (sh73a0)
Was there ever a good reason for that?
---
arch/arm/configs/shmobile_defconfig | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/arm/configs/shmobile_defconfig b/arch/arm/configs/shmobile_defconfig
index 969738324a5d5f81..5208b5caacd1a4e2 100644
--- a/arch/arm/configs/shmobile_defconfig
+++ b/arch/arm/configs/shmobile_defconfig
@@ -20,7 +20,6 @@ CONFIG_ARCH_R8A7791=y
CONFIG_ARCH_R8A7793=y
CONFIG_ARCH_R8A7794=y
CONFIG_ARCH_SH73A0=y
-CONFIG_CPU_BPREDICT_DISABLE=y
CONFIG_PL310_ERRATA_588369=y
CONFIG_ARM_ERRATA_754322=y
CONFIG_PCI=y
--
1.9.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] ARM: shmobile: timer: Fix preset_lpj leading to too short delays
2016-01-11 18:41 ` [PATCH 1/3] ARM: shmobile: timer: Fix preset_lpj leading to too short delays Geert Uytterhoeven
@ 2016-01-18 1:42 ` Simon Horman
0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2016-01-18 1:42 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 11, 2016 at 07:41:12PM +0100, Geert Uytterhoeven wrote:
> On all shmobile ARM SoCs, loop-based delays may complete early, which
> can be after only 1/3 (Cortex A9) or 1/2 (Cortex A7 or A15) of the
> minimum required time.
>
> This is caused by calculating preset_lpj based on incorrect assumptions
> about the number of clock cycles per loop:
> - All of Cortex A7, A9, and A15 run __loop_delay() at 1 loop per
> CPU clock cycle,
> - As of commit 11d4bb1bd067f9d0 ("ARM: 7907/1: lib: delay-loop: Add
> align directive to fix BogoMIPS calculation"), Cortex A8 runs
> __loop_delay() at 1 loop per 2 instead of 3 CPU clock cycles.
>
> On SoCs with Cortex A7 and/or A15 CPU cores, this went unnoticed, as
> delays use the ARM arch timer if available. R-Car Gen2 doesn't work if
> the arch timer is disabled. However, APE6 can be used without the arch
> timer.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> Backporting note: older kernels need modifications in all calls to
> shmobile_setup_delay() or shmobile_setup_delay_hz().
Thanks, I have queued this up for v4.6.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/3] ARM: shmobile: timer: Fix preset_lpj leading to too short delays
@ 2016-01-18 1:42 ` Simon Horman
0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2016-01-18 1:42 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 11, 2016 at 07:41:12PM +0100, Geert Uytterhoeven wrote:
> On all shmobile ARM SoCs, loop-based delays may complete early, which
> can be after only 1/3 (Cortex A9) or 1/2 (Cortex A7 or A15) of the
> minimum required time.
>
> This is caused by calculating preset_lpj based on incorrect assumptions
> about the number of clock cycles per loop:
> - All of Cortex A7, A9, and A15 run __loop_delay() at 1 loop per
> CPU clock cycle,
> - As of commit 11d4bb1bd067f9d0 ("ARM: 7907/1: lib: delay-loop: Add
> align directive to fix BogoMIPS calculation"), Cortex A8 runs
> __loop_delay() at 1 loop per 2 instead of 3 CPU clock cycles.
>
> On SoCs with Cortex A7 and/or A15 CPU cores, this went unnoticed, as
> delays use the ARM arch timer if available. R-Car Gen2 doesn't work if
> the arch timer is disabled. However, APE6 can be used without the arch
> timer.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> Backporting note: older kernels need modifications in all calls to
> shmobile_setup_delay() or shmobile_setup_delay_hz().
Thanks, I have queued this up for v4.6.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] ARM: shmobile: timer: Drop support for Cortex A8
2016-01-11 18:41 ` [PATCH 2/3] ARM: shmobile: timer: Drop support for Cortex A8 Geert Uytterhoeven
@ 2016-01-18 1:45 ` Simon Horman
0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2016-01-18 1:45 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 11, 2016 at 07:41:13PM +0100, Geert Uytterhoeven wrote:
> Commit edf4100906044225 ("ARM: shmobile: sh7372 dtsi: Remove Legacy
> file") removed the DTS for the last shmobile SoC with a Cortex A8 CPU
> core (sh7372 aka SH-Mobile AP4).
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Thanks, I have queued this up for v4.6.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/3] ARM: shmobile: timer: Drop support for Cortex A8
@ 2016-01-18 1:45 ` Simon Horman
0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2016-01-18 1:45 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 11, 2016 at 07:41:13PM +0100, Geert Uytterhoeven wrote:
> Commit edf4100906044225 ("ARM: shmobile: sh7372 dtsi: Remove Legacy
> file") removed the DTS for the last shmobile SoC with a Cortex A8 CPU
> core (sh7372 aka SH-Mobile AP4).
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Thanks, I have queued this up for v4.6.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] ARM: shmobile: defconfig: Do not enable CONFIG_CPU_BPREDICT_DISABLE
2016-01-11 18:41 ` [PATCH 3/3] ARM: shmobile: defconfig: Do not enable CONFIG_CPU_BPREDICT_DISABLE Geert Uytterhoeven
@ 2016-01-18 1:48 ` Simon Horman
0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2016-01-18 1:48 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 11, 2016 at 07:41:14PM +0100, Geert Uytterhoeven wrote:
> Do not disable branch prediction, as this has a big performance impact.
>
> After re-enabling branch prediction, __loop_delay() runs 16 (Cortex A15)
> or 11 (Cortex A9) times faster than before.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> This used to be enabled in:
>
> - ag5evm_defconfig (sh73a0)
> - ape6evm_defconfig (r8a73a4)
> - armadillo800eva_defconfig (r8a7740)
> - koelsch_defconfig (r8a7791)
> - kota2_defconfig (sh73a0)
>
> Was there ever a good reason for that?
If that was a concious decision then I believe it was made before my time.
I'm going ahead and queuing up this change for v4.6.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 3/3] ARM: shmobile: defconfig: Do not enable CONFIG_CPU_BPREDICT_DISABLE
@ 2016-01-18 1:48 ` Simon Horman
0 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2016-01-18 1:48 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 11, 2016 at 07:41:14PM +0100, Geert Uytterhoeven wrote:
> Do not disable branch prediction, as this has a big performance impact.
>
> After re-enabling branch prediction, __loop_delay() runs 16 (Cortex A15)
> or 11 (Cortex A9) times faster than before.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> This used to be enabled in:
>
> - ag5evm_defconfig (sh73a0)
> - ape6evm_defconfig (r8a73a4)
> - armadillo800eva_defconfig (r8a7740)
> - koelsch_defconfig (r8a7791)
> - kota2_defconfig (sh73a0)
>
> Was there ever a good reason for that?
If that was a concious decision then I believe it was made before my time.
I'm going ahead and queuing up this change for v4.6.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2016-01-18 1:48 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-11 18:41 [PATCH 0/3] ARM: shmobile: Fix loop-based delays Geert Uytterhoeven
2016-01-11 18:41 ` [PATCH 1/3] ARM: shmobile: timer: Fix preset_lpj leading to too short delays Geert Uytterhoeven
2016-01-18 1:42 ` Simon Horman
2016-01-18 1:42 ` Simon Horman
2016-01-11 18:41 ` [PATCH 2/3] ARM: shmobile: timer: Drop support for Cortex A8 Geert Uytterhoeven
2016-01-18 1:45 ` Simon Horman
2016-01-18 1:45 ` Simon Horman
2016-01-11 18:41 ` [PATCH 3/3] ARM: shmobile: defconfig: Do not enable CONFIG_CPU_BPREDICT_DISABLE Geert Uytterhoeven
2016-01-18 1:48 ` Simon Horman
2016-01-18 1:48 ` Simon Horman
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.