All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD
@ 2021-04-13 19:38 Rasmus Villemoes
  2021-04-13 19:38 ` [PATCH v2 1/2] powerpc: lib: remove leftover CONFIG_5xx Rasmus Villemoes
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Rasmus Villemoes @ 2021-04-13 19:38 UTC (permalink / raw)
  To: u-boot

Back in June, I sent a patch [1] to reduce the number of WATCHDOG_RESET
calls in flush_cache() to one per 64K. Wolfgang rejected that, saying
that the threshold should be configurable.

So here's a new version which introduces a config knob. When left at
its default, the generated code is binary identical to what one gets
before these patches.

I stumbled on the CONFIG_5xx and thought I ought to make the config
knob depend on !CONFIG_5xx, but it turns out it's better to just
remove that piece of legacy, hence the first janitorial patch.

[1] https://lists.denx.de/pipermail/u-boot/2020-June/414852.html

Rasmus Villemoes (2):
  powerpc: lib: remove leftover CONFIG_5xx
  powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD

 arch/powerpc/Kconfig     |  1 +
 arch/powerpc/lib/Kconfig |  9 +++++++++
 arch/powerpc/lib/cache.c | 16 ++++++++++++----
 3 files changed, 22 insertions(+), 4 deletions(-)
 create mode 100644 arch/powerpc/lib/Kconfig

-- 
2.29.2

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 1/2] powerpc: lib: remove leftover CONFIG_5xx
  2021-04-13 19:38 [PATCH v2 0/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD Rasmus Villemoes
@ 2021-04-13 19:38 ` Rasmus Villemoes
  2021-04-15  5:30   ` Stefan Roese
  2021-04-13 19:38 ` [PATCH v2 2/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD Rasmus Villemoes
  2021-04-21  9:16 ` [PATCH v3 0/2] " Rasmus Villemoes
  2 siblings, 1 reply; 10+ messages in thread
From: Rasmus Villemoes @ 2021-04-13 19:38 UTC (permalink / raw)
  To: u-boot

CONFIG_5xx hasn't existed since commit 502589777416 (powerpc, 5xx:
remove support for 5xx). Remove this last mention of it.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 arch/powerpc/lib/cache.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/lib/cache.c b/arch/powerpc/lib/cache.c
index 3c3c470bbb..3e487f50fe 100644
--- a/arch/powerpc/lib/cache.c
+++ b/arch/powerpc/lib/cache.c
@@ -11,7 +11,6 @@
 
 void flush_cache(ulong start_addr, ulong size)
 {
-#ifndef CONFIG_5xx
 	ulong addr, start, end;
 
 	start = start_addr & ~(CONFIG_SYS_CACHELINE_SIZE - 1);
@@ -33,5 +32,4 @@ void flush_cache(ulong start_addr, ulong size)
 	asm volatile("sync" : : : "memory");
 	/* flush prefetch queue */
 	asm volatile("isync" : : : "memory");
-#endif
 }
-- 
2.29.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 2/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD
  2021-04-13 19:38 [PATCH v2 0/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD Rasmus Villemoes
  2021-04-13 19:38 ` [PATCH v2 1/2] powerpc: lib: remove leftover CONFIG_5xx Rasmus Villemoes
@ 2021-04-13 19:38 ` Rasmus Villemoes
  2021-04-15  5:34   ` Stefan Roese
  2021-04-21  9:16 ` [PATCH v3 0/2] " Rasmus Villemoes
  2 siblings, 1 reply; 10+ messages in thread
From: Rasmus Villemoes @ 2021-04-13 19:38 UTC (permalink / raw)
  To: u-boot

When flush_cache() is called during boot on our ~7M kernel image, the
hundreds of thousands of WATCHDOG_RESET calls end up adding
significantly to boottime. Flushing a single cache line doesn't take
many microseconds, so doing these calls for every cache line is
complete overkill.

The generic watchdog_reset() provided by wdt-uclass.c actually
contains some rate-limiting logic that should in theory mitigate this,
but alas, that rate-limiting must be disabled on powerpc because of
its get_timer() implementation - get_timer() works just fine until
interrupts are disabled, but it just so happens that the "big"
flush_cache() call happens in the part of bootm where interrupts are
indeed disabled. [1] [2] [3]

I have checked with objdump that the generated code doesn't change
when this option is left at its default value of 0: gcc is smart
enough to see that wd_limit is constant 0, the ">=" comparisons are
tautologically true, hence all assignments to "flushed" are eliminated
as dead stores.

On our board, setting the option to something like 65536 ends up
reducing total boottime by about 0.8 seconds.

[1] https://patchwork.ozlabs.org/project/uboot/patch/20200605111657.28773-1-rasmus.villemoes at prevas.dk/
[2] https://lists.denx.de/pipermail/u-boot/2021-April/446906.html
[3] https://lists.denx.de/pipermail/u-boot/2021-April/447280.html

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 arch/powerpc/Kconfig     |  1 +
 arch/powerpc/lib/Kconfig |  9 +++++++++
 arch/powerpc/lib/cache.c | 14 ++++++++++++--
 3 files changed, 22 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/lib/Kconfig

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 6a2e88fed2..133447648c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -49,5 +49,6 @@ source "arch/powerpc/cpu/mpc83xx/Kconfig"
 source "arch/powerpc/cpu/mpc85xx/Kconfig"
 source "arch/powerpc/cpu/mpc86xx/Kconfig"
 source "arch/powerpc/cpu/mpc8xx/Kconfig"
+source "arch/powerpc/lib/Kconfig"
 
 endmenu
diff --git a/arch/powerpc/lib/Kconfig b/arch/powerpc/lib/Kconfig
new file mode 100644
index 0000000000..b30b5edf7c
--- /dev/null
+++ b/arch/powerpc/lib/Kconfig
@@ -0,0 +1,9 @@
+config CACHE_FLUSH_WATCHDOG_THRESHOLD
+	int "Bytes to flush between WATCHDOG_RESET calls"
+	default 0
+	help
+	  The flush_cache() function periodically, and by default for
+	  every cache line, calls WATCHDOG_RESET(). When flushing a
+	  large area, that may add a significant amount of
+	  overhead. This option allows you to set a threshold for how
+	  many bytes to flush between each WATCHDOG_RESET call.
diff --git a/arch/powerpc/lib/cache.c b/arch/powerpc/lib/cache.c
index 3e487f50fe..115ae8e160 100644
--- a/arch/powerpc/lib/cache.c
+++ b/arch/powerpc/lib/cache.c
@@ -12,6 +12,8 @@
 void flush_cache(ulong start_addr, ulong size)
 {
 	ulong addr, start, end;
+	ulong wd_limit = CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD;
+	ulong flushed = 0;
 
 	start = start_addr & ~(CONFIG_SYS_CACHELINE_SIZE - 1);
 	end = start_addr + size - 1;
@@ -19,7 +21,11 @@ void flush_cache(ulong start_addr, ulong size)
 	for (addr = start; (addr <= end) && (addr >= start);
 			addr += CONFIG_SYS_CACHELINE_SIZE) {
 		asm volatile("dcbst 0,%0" : : "r" (addr) : "memory");
-		WATCHDOG_RESET();
+		flushed += CONFIG_SYS_CACHELINE_SIZE;
+		if (flushed >= wd_limit) {
+			WATCHDOG_RESET();
+			flushed = 0;
+		}
 	}
 	/* wait for all dcbst to complete on bus */
 	asm volatile("sync" : : : "memory");
@@ -27,7 +33,11 @@ void flush_cache(ulong start_addr, ulong size)
 	for (addr = start; (addr <= end) && (addr >= start);
 			addr += CONFIG_SYS_CACHELINE_SIZE) {
 		asm volatile("icbi 0,%0" : : "r" (addr) : "memory");
-		WATCHDOG_RESET();
+		flushed += CONFIG_SYS_CACHELINE_SIZE;
+		if (flushed >= wd_limit) {
+			WATCHDOG_RESET();
+			flushed = 0;
+		}
 	}
 	asm volatile("sync" : : : "memory");
 	/* flush prefetch queue */
-- 
2.29.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 1/2] powerpc: lib: remove leftover CONFIG_5xx
  2021-04-13 19:38 ` [PATCH v2 1/2] powerpc: lib: remove leftover CONFIG_5xx Rasmus Villemoes
@ 2021-04-15  5:30   ` Stefan Roese
  0 siblings, 0 replies; 10+ messages in thread
From: Stefan Roese @ 2021-04-15  5:30 UTC (permalink / raw)
  To: u-boot

On 13.04.21 21:38, Rasmus Villemoes wrote:
> CONFIG_5xx hasn't existed since commit 502589777416 (powerpc, 5xx:
> remove support for 5xx). Remove this last mention of it.
> 
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
>   arch/powerpc/lib/cache.c | 2 --
>   1 file changed, 2 deletions(-)
> 
> diff --git a/arch/powerpc/lib/cache.c b/arch/powerpc/lib/cache.c
> index 3c3c470bbb..3e487f50fe 100644
> --- a/arch/powerpc/lib/cache.c
> +++ b/arch/powerpc/lib/cache.c
> @@ -11,7 +11,6 @@
>   
>   void flush_cache(ulong start_addr, ulong size)
>   {
> -#ifndef CONFIG_5xx
>   	ulong addr, start, end;
>   
>   	start = start_addr & ~(CONFIG_SYS_CACHELINE_SIZE - 1);
> @@ -33,5 +32,4 @@ void flush_cache(ulong start_addr, ulong size)
>   	asm volatile("sync" : : : "memory");
>   	/* flush prefetch queue */
>   	asm volatile("isync" : : : "memory");
> -#endif
>   }
> 

Reviewed-by: Stefan Roese <sr@denx.de>

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 2/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD
  2021-04-13 19:38 ` [PATCH v2 2/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD Rasmus Villemoes
@ 2021-04-15  5:34   ` Stefan Roese
  2021-04-15  6:46     ` Rasmus Villemoes
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Roese @ 2021-04-15  5:34 UTC (permalink / raw)
  To: u-boot

On 13.04.21 21:38, Rasmus Villemoes wrote:
> When flush_cache() is called during boot on our ~7M kernel image, the
> hundreds of thousands of WATCHDOG_RESET calls end up adding
> significantly to boottime. Flushing a single cache line doesn't take
> many microseconds, so doing these calls for every cache line is
> complete overkill.
> 
> The generic watchdog_reset() provided by wdt-uclass.c actually
> contains some rate-limiting logic that should in theory mitigate this,
> but alas, that rate-limiting must be disabled on powerpc because of
> its get_timer() implementation - get_timer() works just fine until
> interrupts are disabled, but it just so happens that the "big"
> flush_cache() call happens in the part of bootm where interrupts are
> indeed disabled. [1] [2] [3]
> 
> I have checked with objdump that the generated code doesn't change
> when this option is left at its default value of 0: gcc is smart
> enough to see that wd_limit is constant 0, the ">=" comparisons are
> tautologically true, hence all assignments to "flushed" are eliminated

Nitpicking:
s/tautologically/autologically. BTW, I've never heard or read this word
before.

> as dead stores.
> 
> On our board, setting the option to something like 65536 ends up
> reducing total boottime by about 0.8 seconds.
> 
> [1] https://patchwork.ozlabs.org/project/uboot/patch/20200605111657.28773-1-rasmus.villemoes at prevas.dk/
> [2] https://lists.denx.de/pipermail/u-boot/2021-April/446906.html
> [3] https://lists.denx.de/pipermail/u-boot/2021-April/447280.html
> 
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
>   arch/powerpc/Kconfig     |  1 +
>   arch/powerpc/lib/Kconfig |  9 +++++++++
>   arch/powerpc/lib/cache.c | 14 ++++++++++++--
>   3 files changed, 22 insertions(+), 2 deletions(-)
>   create mode 100644 arch/powerpc/lib/Kconfig
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 6a2e88fed2..133447648c 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -49,5 +49,6 @@ source "arch/powerpc/cpu/mpc83xx/Kconfig"
>   source "arch/powerpc/cpu/mpc85xx/Kconfig"
>   source "arch/powerpc/cpu/mpc86xx/Kconfig"
>   source "arch/powerpc/cpu/mpc8xx/Kconfig"
> +source "arch/powerpc/lib/Kconfig"
>   
>   endmenu
> diff --git a/arch/powerpc/lib/Kconfig b/arch/powerpc/lib/Kconfig
> new file mode 100644
> index 0000000000..b30b5edf7c
> --- /dev/null
> +++ b/arch/powerpc/lib/Kconfig
> @@ -0,0 +1,9 @@
> +config CACHE_FLUSH_WATCHDOG_THRESHOLD
> +	int "Bytes to flush between WATCHDOG_RESET calls"
> +	default 0
> +	help
> +	  The flush_cache() function periodically, and by default for
> +	  every cache line, calls WATCHDOG_RESET(). When flushing a
> +	  large area, that may add a significant amount of
> +	  overhead. This option allows you to set a threshold for how
> +	  many bytes to flush between each WATCHDOG_RESET call.
> diff --git a/arch/powerpc/lib/cache.c b/arch/powerpc/lib/cache.c
> index 3e487f50fe..115ae8e160 100644
> --- a/arch/powerpc/lib/cache.c
> +++ b/arch/powerpc/lib/cache.c
> @@ -12,6 +12,8 @@
>   void flush_cache(ulong start_addr, ulong size)
>   {
>   	ulong addr, start, end;
> +	ulong wd_limit = CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD;
> +	ulong flushed = 0;
>   
>   	start = start_addr & ~(CONFIG_SYS_CACHELINE_SIZE - 1);
>   	end = start_addr + size - 1;
> @@ -19,7 +21,11 @@ void flush_cache(ulong start_addr, ulong size)
>   	for (addr = start; (addr <= end) && (addr >= start);
>   			addr += CONFIG_SYS_CACHELINE_SIZE) {
>   		asm volatile("dcbst 0,%0" : : "r" (addr) : "memory");
> -		WATCHDOG_RESET();
> +		flushed += CONFIG_SYS_CACHELINE_SIZE;
> +		if (flushed >= wd_limit) {
> +			WATCHDOG_RESET();
> +			flushed = 0;
> +		}
>   	}
>   	/* wait for all dcbst to complete on bus */
>   	asm volatile("sync" : : : "memory");
> @@ -27,7 +33,11 @@ void flush_cache(ulong start_addr, ulong size)
>   	for (addr = start; (addr <= end) && (addr >= start);
>   			addr += CONFIG_SYS_CACHELINE_SIZE) {
>   		asm volatile("icbi 0,%0" : : "r" (addr) : "memory");
> -		WATCHDOG_RESET();
> +		flushed += CONFIG_SYS_CACHELINE_SIZE;
> +		if (flushed >= wd_limit) {
> +			WATCHDOG_RESET();
> +			flushed = 0;
> +		}

It might make sense to pull these 2 identical code snippets into a
function to not duplicate this code.

Other than this:

Reviewed-by: Stefan Roese <sr@denx.de>

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 2/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD
  2021-04-15  5:34   ` Stefan Roese
@ 2021-04-15  6:46     ` Rasmus Villemoes
  2021-04-15  6:55       ` Stefan Roese
  0 siblings, 1 reply; 10+ messages in thread
From: Rasmus Villemoes @ 2021-04-15  6:46 UTC (permalink / raw)
  To: u-boot

On 15/04/2021 07.34, Stefan Roese wrote:
> On 13.04.21 21:38, Rasmus Villemoes wrote:
>>
>> I have checked with objdump that the generated code doesn't change
>> when this option is left at its default value of 0: gcc is smart
>> enough to see that wd_limit is constant 0, the ">=" comparisons are
>> tautologically true, hence all assignments to "flushed" are eliminated
> 
> Nitpicking:
> s/tautologically/autologically. 

No.

https://www.merriam-webster.com/dictionary/tautological
https://en.wikipedia.org/wiki/Tautology_(logic)
"In Mathematical logic, a tautology (from Greek: ??????????) is a
formula or assertion that is true in every possible interpretation."

It even exists in gcc docs in the form of -Wtautological-compare. [But I
do admit that "tautologically true" is a pleonasm; "tautologies" or
"automatically true" could have been used instead. And going full meta,
now that I look up pleonasm in wikipedia to check that I'm using it
right, that article has a reference to "tautology (language)" in its
second sentence.]

>> ????? /* wait for all dcbst to complete on bus */
>> ????? asm volatile("sync" : : : "memory");
>> @@ -27,7 +33,11 @@ void flush_cache(ulong start_addr, ulong size)
>> ????? for (addr = start; (addr <= end) && (addr >= start);
>> ????????????? addr += CONFIG_SYS_CACHELINE_SIZE) {
>> ????????? asm volatile("icbi 0,%0" : : "r" (addr) : "memory");
>> -??????? WATCHDOG_RESET();
>> +??????? flushed += CONFIG_SYS_CACHELINE_SIZE;
>> +??????? if (flushed >= wd_limit) {
>> +??????????? WATCHDOG_RESET();
>> +??????????? flushed = 0;
>> +??????? }
> 
> It might make sense to pull these 2 identical code snippets into a
> function to not duplicate this code.

Something like

static ulong maybe_reset(ulong flushed)
{
  const ulong wd_limit = ...;
  flushed += CONFIG_SYS_CACHELINE_SIZE;
  if (flushed >= wd_limit) {
    WATCHDOG_RESET();
    flushed = 0;
  }
  return flushed;
}

and "flushed = maybe_reset(flushed);"? I don't think that improves
readability.

Thanks,
Rasmus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 2/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD
  2021-04-15  6:46     ` Rasmus Villemoes
@ 2021-04-15  6:55       ` Stefan Roese
  0 siblings, 0 replies; 10+ messages in thread
From: Stefan Roese @ 2021-04-15  6:55 UTC (permalink / raw)
  To: u-boot

On 15.04.21 08:46, Rasmus Villemoes wrote:
> On 15/04/2021 07.34, Stefan Roese wrote:
>> On 13.04.21 21:38, Rasmus Villemoes wrote:
>>>
>>> I have checked with objdump that the generated code doesn't change
>>> when this option is left at its default value of 0: gcc is smart
>>> enough to see that wd_limit is constant 0, the ">=" comparisons are
>>> tautologically true, hence all assignments to "flushed" are eliminated
>>
>> Nitpicking:
>> s/tautologically/autologically.
> 
> No.
> 
> https://www.merriam-webster.com/dictionary/tautological
> https://en.wikipedia.org/wiki/Tautology_(logic)
> "In Mathematical logic, a tautology (from Greek: ??????????) is a
> formula or assertion that is true in every possible interpretation."
> 
> It even exists in gcc docs in the form of -Wtautological-compare. [But I
> do admit that "tautologically true" is a pleonasm; "tautologies" or
> "automatically true" could have been used instead. And going full meta,
> now that I look up pleonasm in wikipedia to check that I'm using it
> right, that article has a reference to "tautology (language)" in its
> second sentence.]

Interesting. Thanks for the detailed info.

>>>  ????? /* wait for all dcbst to complete on bus */
>>>  ????? asm volatile("sync" : : : "memory");
>>> @@ -27,7 +33,11 @@ void flush_cache(ulong start_addr, ulong size)
>>>  ????? for (addr = start; (addr <= end) && (addr >= start);
>>>  ????????????? addr += CONFIG_SYS_CACHELINE_SIZE) {
>>>  ????????? asm volatile("icbi 0,%0" : : "r" (addr) : "memory");
>>> -??????? WATCHDOG_RESET();
>>> +??????? flushed += CONFIG_SYS_CACHELINE_SIZE;
>>> +??????? if (flushed >= wd_limit) {
>>> +??????????? WATCHDOG_RESET();
>>> +??????????? flushed = 0;
>>> +??????? }
>>
>> It might make sense to pull these 2 identical code snippets into a
>> function to not duplicate this code.
> 
> Something like
> 
> static ulong maybe_reset(ulong flushed)
> {
>    const ulong wd_limit = ...;
>    flushed += CONFIG_SYS_CACHELINE_SIZE;
>    if (flushed >= wd_limit) {
>      WATCHDOG_RESET();
>      flushed = 0;
>    }
>    return flushed;
> }
> 
> and "flushed = maybe_reset(flushed);"? I don't think that improves
> readability.

Yes, something like this. I agree that it's not really beautiful but for
my personal taste it's a bit better. But I don't have hard feeling here.
Let's see what other comment on this.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 0/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD
  2021-04-13 19:38 [PATCH v2 0/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD Rasmus Villemoes
  2021-04-13 19:38 ` [PATCH v2 1/2] powerpc: lib: remove leftover CONFIG_5xx Rasmus Villemoes
  2021-04-13 19:38 ` [PATCH v2 2/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD Rasmus Villemoes
@ 2021-04-21  9:16 ` Rasmus Villemoes
  2021-04-21  9:16   ` [PATCH v3 1/2] powerpc: lib: remove leftover CONFIG_5xx Rasmus Villemoes
  2021-04-21  9:16   ` [PATCH v3 2/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD Rasmus Villemoes
  2 siblings, 2 replies; 10+ messages in thread
From: Rasmus Villemoes @ 2021-04-21  9:16 UTC (permalink / raw)
  To: u-boot

v3: Introduce a little helper function as suggested by Stefan. gcc
generates the same code before and after that refactoring - in
particular, the code is still the same as before these patches when
the config option is left at its default.

Include Stefan's Reviewed-bys.

v2 cover letter:

Back in June, I sent a patch [1] to reduce the number of WATCHDOG_RESET
calls in flush_cache() to one per 64K. Wolfgang rejected that, saying
that the threshold should be configurable.

So here's a new version which introduces a config knob. When left at
its default, the generated code is binary identical to what one gets
before these patches.

I stumbled on the CONFIG_5xx and thought I ought to make the config
knob depend on !CONFIG_5xx, but it turns out it's better to just
remove that piece of legacy, hence the first janitorial patch.

[1] https://lists.denx.de/pipermail/u-boot/2020-June/414852.html

Rasmus Villemoes (2):
  powerpc: lib: remove leftover CONFIG_5xx
  powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD

 arch/powerpc/Kconfig     |  1 +
 arch/powerpc/lib/Kconfig |  9 +++++++++
 arch/powerpc/lib/cache.c | 17 +++++++++++++----
 3 files changed, 23 insertions(+), 4 deletions(-)
 create mode 100644 arch/powerpc/lib/Kconfig

-- 
2.29.2

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 1/2] powerpc: lib: remove leftover CONFIG_5xx
  2021-04-21  9:16 ` [PATCH v3 0/2] " Rasmus Villemoes
@ 2021-04-21  9:16   ` Rasmus Villemoes
  2021-04-21  9:16   ` [PATCH v3 2/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD Rasmus Villemoes
  1 sibling, 0 replies; 10+ messages in thread
From: Rasmus Villemoes @ 2021-04-21  9:16 UTC (permalink / raw)
  To: u-boot

CONFIG_5xx hasn't existed since commit 502589777416 (powerpc, 5xx:
remove support for 5xx). Remove this last mention of it.

Reviewed-by: Stefan Roese <sr@denx.de>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 arch/powerpc/lib/cache.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/lib/cache.c b/arch/powerpc/lib/cache.c
index 3c3c470bbb..3e487f50fe 100644
--- a/arch/powerpc/lib/cache.c
+++ b/arch/powerpc/lib/cache.c
@@ -11,7 +11,6 @@
 
 void flush_cache(ulong start_addr, ulong size)
 {
-#ifndef CONFIG_5xx
 	ulong addr, start, end;
 
 	start = start_addr & ~(CONFIG_SYS_CACHELINE_SIZE - 1);
@@ -33,5 +32,4 @@ void flush_cache(ulong start_addr, ulong size)
 	asm volatile("sync" : : : "memory");
 	/* flush prefetch queue */
 	asm volatile("isync" : : : "memory");
-#endif
 }
-- 
2.29.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 2/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD
  2021-04-21  9:16 ` [PATCH v3 0/2] " Rasmus Villemoes
  2021-04-21  9:16   ` [PATCH v3 1/2] powerpc: lib: remove leftover CONFIG_5xx Rasmus Villemoes
@ 2021-04-21  9:16   ` Rasmus Villemoes
  1 sibling, 0 replies; 10+ messages in thread
From: Rasmus Villemoes @ 2021-04-21  9:16 UTC (permalink / raw)
  To: u-boot

When flush_cache() is called during boot on our ~7M kernel image, the
hundreds of thousands of WATCHDOG_RESET calls end up adding
significantly to boottime. Flushing a single cache line doesn't take
many microseconds, so doing these calls for every cache line is
complete overkill.

The generic watchdog_reset() provided by wdt-uclass.c actually
contains some rate-limiting logic that should in theory mitigate this,
but alas, that rate-limiting must be disabled on powerpc because of
its get_timer() implementation - get_timer() works just fine until
interrupts are disabled, but it just so happens that the "big"
flush_cache() call happens in the part of bootm where interrupts are
indeed disabled. [1] [2] [3]

I have checked with objdump that the generated code doesn't change
when this option is left at its default value of 0: gcc is smart
enough to see that the ">=" comparison is tautologically true, hence
all assignments to "flushed" are eliminated as dead stores.

On our board, setting the option to something like 65536 ends up
reducing total boottime by about 0.8 seconds.

[1] https://patchwork.ozlabs.org/project/uboot/patch/20200605111657.28773-1-rasmus.villemoes at prevas.dk/
[2] https://lists.denx.de/pipermail/u-boot/2021-April/446906.html
[3] https://lists.denx.de/pipermail/u-boot/2021-April/447280.html

Reviewed-by: Stefan Roese <sr@denx.de>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 arch/powerpc/Kconfig     |  1 +
 arch/powerpc/lib/Kconfig |  9 +++++++++
 arch/powerpc/lib/cache.c | 15 +++++++++++++--
 3 files changed, 23 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/lib/Kconfig

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index c2c577f60c..1829de9e2c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -49,5 +49,6 @@ source "arch/powerpc/cpu/mpc83xx/Kconfig"
 source "arch/powerpc/cpu/mpc85xx/Kconfig"
 source "arch/powerpc/cpu/mpc86xx/Kconfig"
 source "arch/powerpc/cpu/mpc8xx/Kconfig"
+source "arch/powerpc/lib/Kconfig"
 
 endmenu
diff --git a/arch/powerpc/lib/Kconfig b/arch/powerpc/lib/Kconfig
new file mode 100644
index 0000000000..b30b5edf7c
--- /dev/null
+++ b/arch/powerpc/lib/Kconfig
@@ -0,0 +1,9 @@
+config CACHE_FLUSH_WATCHDOG_THRESHOLD
+	int "Bytes to flush between WATCHDOG_RESET calls"
+	default 0
+	help
+	  The flush_cache() function periodically, and by default for
+	  every cache line, calls WATCHDOG_RESET(). When flushing a
+	  large area, that may add a significant amount of
+	  overhead. This option allows you to set a threshold for how
+	  many bytes to flush between each WATCHDOG_RESET call.
diff --git a/arch/powerpc/lib/cache.c b/arch/powerpc/lib/cache.c
index 3e487f50fe..19162511ce 100644
--- a/arch/powerpc/lib/cache.c
+++ b/arch/powerpc/lib/cache.c
@@ -9,9 +9,20 @@
 #include <asm/cache.h>
 #include <watchdog.h>
 
+static ulong maybe_watchdog_reset(ulong flushed)
+{
+	flushed += CONFIG_SYS_CACHELINE_SIZE;
+	if (flushed >= CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD) {
+		WATCHDOG_RESET();
+		flushed = 0;
+	}
+	return flushed;
+}
+
 void flush_cache(ulong start_addr, ulong size)
 {
 	ulong addr, start, end;
+	ulong flushed = 0;
 
 	start = start_addr & ~(CONFIG_SYS_CACHELINE_SIZE - 1);
 	end = start_addr + size - 1;
@@ -19,7 +30,7 @@ void flush_cache(ulong start_addr, ulong size)
 	for (addr = start; (addr <= end) && (addr >= start);
 			addr += CONFIG_SYS_CACHELINE_SIZE) {
 		asm volatile("dcbst 0,%0" : : "r" (addr) : "memory");
-		WATCHDOG_RESET();
+		flushed = maybe_watchdog_reset(flushed);
 	}
 	/* wait for all dcbst to complete on bus */
 	asm volatile("sync" : : : "memory");
@@ -27,7 +38,7 @@ void flush_cache(ulong start_addr, ulong size)
 	for (addr = start; (addr <= end) && (addr >= start);
 			addr += CONFIG_SYS_CACHELINE_SIZE) {
 		asm volatile("icbi 0,%0" : : "r" (addr) : "memory");
-		WATCHDOG_RESET();
+		flushed = maybe_watchdog_reset(flushed);
 	}
 	asm volatile("sync" : : : "memory");
 	/* flush prefetch queue */
-- 
2.29.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-04-21  9:16 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-13 19:38 [PATCH v2 0/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD Rasmus Villemoes
2021-04-13 19:38 ` [PATCH v2 1/2] powerpc: lib: remove leftover CONFIG_5xx Rasmus Villemoes
2021-04-15  5:30   ` Stefan Roese
2021-04-13 19:38 ` [PATCH v2 2/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD Rasmus Villemoes
2021-04-15  5:34   ` Stefan Roese
2021-04-15  6:46     ` Rasmus Villemoes
2021-04-15  6:55       ` Stefan Roese
2021-04-21  9:16 ` [PATCH v3 0/2] " Rasmus Villemoes
2021-04-21  9:16   ` [PATCH v3 1/2] powerpc: lib: remove leftover CONFIG_5xx Rasmus Villemoes
2021-04-21  9:16   ` [PATCH v3 2/2] powerpc: introduce CONFIG_CACHE_FLUSH_WATCHDOG_THRESHOLD Rasmus Villemoes

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.