* [PATCH] arm: stm32mp1: activate data cache in SPL and before relocation
@ 2020-03-30 7:46 Patrick Delaunay
2020-03-30 9:56 ` Marek Vasut
0 siblings, 1 reply; 6+ messages in thread
From: Patrick Delaunay @ 2020-03-30 7:46 UTC (permalink / raw)
To: u-boot
Activate the data cache in SPL and in U-Boot before relocation
- before relocation, the TLB is located after U-Boot
(CONFIG_SYS_TEXT_BASE and an assumed 2MB max size)
and all the DDR is marked cacheable
- in SPL, the TLB is located at the end of SYSRAM, just after the stack
(CONFIG_SPL_STACK) with size PGTABLE_SIZE = 16kB
and all the SYSRAM is marked cacheable
This patch allows to reduce the execution time, particularly
- for the device tree parsing in U-Boot pre-reloc stage
(dm_extended_scan_fd =>dm_scan_fdt)
- in I2C timing computation in SPL (stm32_i2c_choose_solution())
For example, the result on STM32MP157C-DK2 board is:
1,6s gain for trusted boot chain with TF-A
2,2s gain for basic boot chain with SPL
Signed-off-by: Patrick Delaunay <patrick.delaunay@st.com>
---
Example of bootstage report on STM32MP157C-DK2
1/ For trusted boot chain with TF-A
a) Before the patch
STM32MP> bootstage report
Timer summary in microseconds (9 records):
Mark Elapsed Stage
0 0 reset
583,290 583,290 board_init_f
2,348,898 1,765,608 board_init_r
2,664,580 315,682 id=64
2,704,027 39,447 id=65
2,704,729 702 main_loop
5,563,519 2,858,790 id=175
Accumulated time:
41,696 dm_r
615,561 dm_f
b) After the patch
STM32MP> bootstage report
Timer summary in microseconds (9 records):
Mark Elapsed Stage
0 0 reset
577,841 577,841 board_init_f
722,178 144,337 board_init_r
1,038,458 316,280 id=64
1,078,298 39,840 id=65
1,078,999 701 main_loop
4,169,020 3,090,021 id=175
Accumulated time:
36,330 dm_f
41,999 dm_r
2/ And for the basic boot chain with SPL
a) Before the patch:
STM32MP> bootstage report
Timer summary in microseconds (12 records):
Mark Elapsed Stage
0 0 reset
195,613 195,613 SPL
837,867 642,254 end SPL
840,117 2,250 board_init_f
2,739,639 1,899,522 board_init_r
3,066,815 327,176 id=64
3,103,377 36,562 id=65
3,104,078 701 main_loop
3,142,171 38,093 id=175
Accumulated time:
38,124 dm_spl
41,956 dm_r
648,861 dm_f
b) After the patch
STM32MP> bootstage report
Timer summary in microseconds (12 records):
Mark Elapsed Stage
0 0 reset
195,859 195,859 SPL
330,190 134,331 end SPL
332,408 2,218 board_init_f
482,688 150,280 board_init_r
808,694 326,006 id=64
845,029 36,335 id=65
845,730 701 main_loop
3,281,876 2,436,146 id=175
Accumulated time:
3,169 dm_spl
36,041 dm_f
41,701 dm_r
arch/arm/mach-stm32mp/cpu.c | 28 ++++++++++++++++++++++++-
arch/arm/mach-stm32mp/dram_init.c | 35 +++++++++++++++++++++++++++++++
include/configs/stm32mp1.h | 4 ++--
3 files changed, 64 insertions(+), 3 deletions(-)
diff --git a/arch/arm/mach-stm32mp/cpu.c b/arch/arm/mach-stm32mp/cpu.c
index 36a9205819..579468a1a0 100644
--- a/arch/arm/mach-stm32mp/cpu.c
+++ b/arch/arm/mach-stm32mp/cpu.c
@@ -193,6 +193,26 @@ int arch_cpu_init(void)
{
u32 boot_mode;
+ /*
+ * initialialize the MMU and activate data cache
+ * in SPL or in U- Boot pre-reloc stage
+ */
+ gd->arch.tlb_size = PGTABLE_SIZE;
+#if defined(CONFIG_SPL_BUILD)
+#if (STM32_SYSRAM_BASE + STM32_SYSRAM_SIZE - CONFIG_SPL_STACK < PGTABLE_SIZE)
+#error "The reserved memory for SPL PGTABLE is not enough."
+#endif
+ gd->arch.tlb_addr = CONFIG_SPL_STACK;
+#else
+ /* TLB is located after U-Boot, assumed 2MB as max size */
+ gd->arch.tlb_addr = CONFIG_SYS_TEXT_BASE + SZ_2M;
+#endif
+ dcache_enable();
+ /*
+ * MMU/TLB is updated in enable_caches() for U-Boot after relocation
+ * or is deactivated in U-Boot start.S::cpu_init_cp15 for SPL
+ */
+
/* early armv7 timer init: needed for polling */
timer_init();
@@ -225,7 +245,13 @@ int arch_cpu_init(void)
void enable_caches(void)
{
- /* Enable D-cache. I-cache is already enabled in start.S */
+ /* I-cache is already enabled in start.S */
+ /* deactivate the data cache, early enabled in arch_cpu_init() */
+ dcache_disable();
+ /*
+ * update MMU after relocation, the TLB location was udpated in
+ * board_f.c::reserve_mmu, and enable the data cache
+ */
dcache_enable();
}
diff --git a/arch/arm/mach-stm32mp/dram_init.c b/arch/arm/mach-stm32mp/dram_init.c
index 3233415eff..15b39ecc03 100644
--- a/arch/arm/mach-stm32mp/dram_init.c
+++ b/arch/arm/mach-stm32mp/dram_init.c
@@ -7,9 +7,29 @@
#include <dm.h>
#include <lmb.h>
#include <ram.h>
+#include <asm/cache.h>
DECLARE_GLOBAL_DATA_PTR;
+static void set_mmu_dcache(u32 addr, u32 size)
+{
+ int i;
+ u32 start, end;
+
+ start = addr >> MMU_SECTION_SHIFT;
+ end = ((u64)addr + (u64)size) >> MMU_SECTION_SHIFT;
+
+ for (i = start; i < end; i++) {
+#if defined(CONFIG_SYS_ARM_CACHE_WRITETHROUGH)
+ set_section_dcache(i, DCACHE_WRITETHROUGH);
+#elif defined(CONFIG_SYS_ARM_CACHE_WRITEALLOC)
+ set_section_dcache(i, DCACHE_WRITEALLOC);
+#else
+ set_section_dcache(i, DCACHE_WRITEBACK);
+#endif
+ }
+}
+
int dram_init(void)
{
struct ram_info ram;
@@ -49,3 +69,18 @@ ulong board_get_usable_ram_top(ulong total_size)
return gd->ram_top;
}
+
+void dram_bank_mmu_setup(int bank)
+{
+ if (gd->flags & GD_FLG_RELOC) {
+ debug("%s: bank: %d\n", __func__, bank);
+ set_mmu_dcache(gd->bd->bi_dram[bank].start,
+ gd->bd->bi_dram[bank].size);
+ } else {
+#if defined(CONFIG_SPL_BUILD)
+ set_mmu_dcache(STM32_SYSRAM_BASE, STM32_SYSRAM_SIZE);
+#else
+ set_mmu_dcache(STM32_DDR_BASE, STM32_DDR_SIZE);
+#endif
+ }
+}
diff --git a/include/configs/stm32mp1.h b/include/configs/stm32mp1.h
index c34a720e0c..5203fc93ad 100644
--- a/include/configs/stm32mp1.h
+++ b/include/configs/stm32mp1.h
@@ -58,8 +58,8 @@
/* limit SYSRAM usage to first 128 KB */
#define CONFIG_SPL_MAX_SIZE 0x00020000
-#define CONFIG_SPL_STACK (STM32_SYSRAM_BASE + \
- STM32_SYSRAM_SIZE)
+/* stack at STM32_SYSRAM_BASE + STM32_SYSRAM_SIZE - PGTABLE_SIZE (=16kB) */
+#define CONFIG_SPL_STACK 0x2FFFC000
#endif /* #ifdef CONFIG_SPL */
#define CONFIG_SYS_MEMTEST_START STM32_DDR_BASE
--
2.17.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH] arm: stm32mp1: activate data cache in SPL and before relocation
2020-03-30 7:46 [PATCH] arm: stm32mp1: activate data cache in SPL and before relocation Patrick Delaunay
@ 2020-03-30 9:56 ` Marek Vasut
2020-03-30 13:49 ` Patrick DELAUNAY
0 siblings, 1 reply; 6+ messages in thread
From: Marek Vasut @ 2020-03-30 9:56 UTC (permalink / raw)
To: u-boot
On 3/30/20 9:46 AM, Patrick Delaunay wrote:
[...]
> 2/ And for the basic boot chain with SPL
>
> a) Before the patch:
>
> STM32MP> bootstage report
> Timer summary in microseconds (12 records):
> Mark Elapsed Stage
> 0 0 reset
> 195,613 195,613 SPL
> 837,867 642,254 end SPL
> 840,117 2,250 board_init_f
> 2,739,639 1,899,522 board_init_r
> 3,066,815 327,176 id=64
> 3,103,377 36,562 id=65
> 3,104,078 701 main_loop
> 3,142,171 38,093 id=175
>
> Accumulated time:
> 38,124 dm_spl
> 41,956 dm_r
> 648,861 dm_f
>
> b) After the patch
>
> STM32MP> bootstage report
> Timer summary in microseconds (12 records):
> Mark Elapsed Stage
> 0 0 reset
> 195,859 195,859 SPL
> 330,190 134,331 end SPL
> 332,408 2,218 board_init_f
> 482,688 150,280 board_init_r
> 808,694 326,006 id=64
> 845,029 36,335 id=65
> 845,730 701 main_loop
> 3,281,876 2,436,146 id=175
>
> Accumulated time:
> 3,169 dm_spl
> 36,041 dm_f
> 41,701 dm_r
Nice.
[...]
> diff --git a/arch/arm/mach-stm32mp/cpu.c b/arch/arm/mach-stm32mp/cpu.c
> index 36a9205819..579468a1a0 100644
> --- a/arch/arm/mach-stm32mp/cpu.c
> +++ b/arch/arm/mach-stm32mp/cpu.c
> @@ -193,6 +193,26 @@ int arch_cpu_init(void)
> {
> u32 boot_mode;
>
> + /*
> + * initialialize the MMU and activate data cache
> + * in SPL or in U- Boot pre-reloc stage
> + */
> + gd->arch.tlb_size = PGTABLE_SIZE;
> +#if defined(CONFIG_SPL_BUILD)
> +#if (STM32_SYSRAM_BASE + STM32_SYSRAM_SIZE - CONFIG_SPL_STACK < PGTABLE_SIZE)
> +#error "The reserved memory for SPL PGTABLE is not enough."
> +#endif
Move this check to the beginning of the file, out of the code.
> + gd->arch.tlb_addr = CONFIG_SPL_STACK;
if (IS_ENABLED(SPL_BUILD))
gd->....
else
gd->....
> +#else
> + /* TLB is located after U-Boot, assumed 2MB as max size */
> + gd->arch.tlb_addr = CONFIG_SYS_TEXT_BASE + SZ_2M;
> +#endif
I think you also need to set arch.tlb_size . Also, this should be a
separate function.
> + dcache_enable();
What about icache, shouldn't that one be enabled too ?
> + /*
> + * MMU/TLB is updated in enable_caches() for U-Boot after relocation
> + * or is deactivated in U-Boot start.S::cpu_init_cp15 for SPL
> + */
> +
> /* early armv7 timer init: needed for polling */
> timer_init();
>
> @@ -225,7 +245,13 @@ int arch_cpu_init(void)
>
> void enable_caches(void)
> {
Icache should be enabled here too. You don't know what e.g. a debugger
did to caches.
> - /* Enable D-cache. I-cache is already enabled in start.S */
> + /* I-cache is already enabled in start.S */
> + /* deactivate the data cache, early enabled in arch_cpu_init() */
> + dcache_disable();
> + /*
> + * update MMU after relocation, the TLB location was udpated in
> + * board_f.c::reserve_mmu, and enable the data cache
> + */
> dcache_enable();
> }
>
> diff --git a/arch/arm/mach-stm32mp/dram_init.c b/arch/arm/mach-stm32mp/dram_init.c
> index 3233415eff..15b39ecc03 100644
> --- a/arch/arm/mach-stm32mp/dram_init.c
> +++ b/arch/arm/mach-stm32mp/dram_init.c
> @@ -7,9 +7,29 @@
> #include <dm.h>
> #include <lmb.h>
> #include <ram.h>
> +#include <asm/cache.h>
>
> DECLARE_GLOBAL_DATA_PTR;
>
> +static void set_mmu_dcache(u32 addr, u32 size)
> +{
> + int i;
> + u32 start, end;
> +
> + start = addr >> MMU_SECTION_SHIFT;
> + end = ((u64)addr + (u64)size) >> MMU_SECTION_SHIFT;
Is this a copy of dram_bank_mmu_setup() in arch/arm/lib/cache-cp15.c ?
Why ?
> + for (i = start; i < end; i++) {
> +#if defined(CONFIG_SYS_ARM_CACHE_WRITETHROUGH)
> + set_section_dcache(i, DCACHE_WRITETHROUGH);
> +#elif defined(CONFIG_SYS_ARM_CACHE_WRITEALLOC)
> + set_section_dcache(i, DCACHE_WRITEALLOC);
> +#else
> + set_section_dcache(i, DCACHE_WRITEBACK);
> +#endif
> + }
> +}
[...]
> diff --git a/include/configs/stm32mp1.h b/include/configs/stm32mp1.h
> index c34a720e0c..5203fc93ad 100644
> --- a/include/configs/stm32mp1.h
> +++ b/include/configs/stm32mp1.h
> @@ -58,8 +58,8 @@
>
> /* limit SYSRAM usage to first 128 KB */
> #define CONFIG_SPL_MAX_SIZE 0x00020000
> -#define CONFIG_SPL_STACK (STM32_SYSRAM_BASE + \
> - STM32_SYSRAM_SIZE)
> +/* stack at STM32_SYSRAM_BASE + STM32_SYSRAM_SIZE - PGTABLE_SIZE (=16kB) */
> +#define CONFIG_SPL_STACK 0x2FFFC000
Can't you memalign() the pagetable area instead of this hacking around?
Or use something around board_init_f_alloc_reserve().
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] arm: stm32mp1: activate data cache in SPL and before relocation
2020-03-30 9:56 ` Marek Vasut
@ 2020-03-30 13:49 ` Patrick DELAUNAY
2020-03-30 14:03 ` Marek Vasut
0 siblings, 1 reply; 6+ messages in thread
From: Patrick DELAUNAY @ 2020-03-30 13:49 UTC (permalink / raw)
To: u-boot
Hi Marek,
> From: Marek Vasut <marex@denx.de>
> Sent: lundi 30 mars 2020 11:57
>
> On 3/30/20 9:46 AM, Patrick Delaunay wrote:
> [...]
>
> > 2/ And for the basic boot chain with SPL
> >
> > a) Before the patch:
> >
> > STM32MP> bootstage report
> > Timer summary in microseconds (12 records):
> > Mark Elapsed Stage
> > 0 0 reset
> > 195,613 195,613 SPL
> > 837,867 642,254 end SPL
> > 840,117 2,250 board_init_f
> > 2,739,639 1,899,522 board_init_r
> > 3,066,815 327,176 id=64
> > 3,103,377 36,562 id=65
> > 3,104,078 701 main_loop
> > 3,142,171 38,093 id=175
> >
> > Accumulated time:
> > 38,124 dm_spl
> > 41,956 dm_r
> > 648,861 dm_f
> >
> > b) After the patch
> >
> > STM32MP> bootstage report
> > Timer summary in microseconds (12 records):
> > Mark Elapsed Stage
> > 0 0 reset
> > 195,859 195,859 SPL
> > 330,190 134,331 end SPL
> > 332,408 2,218 board_init_f
> > 482,688 150,280 board_init_r
> > 808,694 326,006 id=64
> > 845,029 36,335 id=65
> > 845,730 701 main_loop
> > 3,281,876 2,436,146 id=175
> >
> > Accumulated time:
> > 3,169 dm_spl
> > 36,041 dm_f
> > 41,701 dm_r
>
> Nice.
>
> [...]
>
> > diff --git a/arch/arm/mach-stm32mp/cpu.c b/arch/arm/mach-stm32mp/cpu.c
> > index 36a9205819..579468a1a0 100644
> > --- a/arch/arm/mach-stm32mp/cpu.c
> > +++ b/arch/arm/mach-stm32mp/cpu.c
> > @@ -193,6 +193,26 @@ int arch_cpu_init(void) {
> > u32 boot_mode;
> >
> > + /*
> > + * initialialize the MMU and activate data cache
> > + * in SPL or in U- Boot pre-reloc stage
> > + */
> > + gd->arch.tlb_size = PGTABLE_SIZE;
> > +#if defined(CONFIG_SPL_BUILD)
> > +#if (STM32_SYSRAM_BASE + STM32_SYSRAM_SIZE -
> CONFIG_SPL_STACK <
> > +PGTABLE_SIZE) #error "The reserved memory for SPL PGTABLE is not
> enough."
> > +#endif
>
> Move this check to the beginning of the file, out of the code.
Ok
> > + gd->arch.tlb_addr = CONFIG_SPL_STACK;
>
> if (IS_ENABLED(SPL_BUILD))
> gd->....
> else
> gd->....
Yes with
IS_ENABLED(CONFIG_SPL_BUILD)
> > +#else
> > + /* TLB is located after U-Boot, assumed 2MB as max size */
> > + gd->arch.tlb_addr = CONFIG_SYS_TEXT_BASE + SZ_2M; #endif
>
> I think you also need to set arch.tlb_size . Also, this should be a separate function.
It is done before the test / common for SPL or U-Boot case.
gd->arch.tlb_size = PGTABLE_SIZE;
but I will create a function early_early_caches()
> > + dcache_enable();
>
> What about icache, shouldn't that one be enabled too ?
It is not needed...
I-cache is already activated in start.S::cpu_init_cp15 for arm V7
But I will add a comment.
> > + /*
> > + * MMU/TLB is updated in enable_caches() for U-Boot after relocation
> > + * or is deactivated in U-Boot start.S::cpu_init_cp15 for SPL
> > + */
> > +
> > /* early armv7 timer init: needed for polling */
> > timer_init();
> >
> > @@ -225,7 +245,13 @@ int arch_cpu_init(void)
> >
> > void enable_caches(void)
> > {
>
> Icache should be enabled here too. You don't know what e.g. a debugger did to
> caches.
>
> > - /* Enable D-cache. I-cache is already enabled in start.S */
> > + /* I-cache is already enabled in start.S */
Not needed for arm V7 (I copy this function from other platfrom ... I don't remember which one)
I-cache is already activated in start.S::cpu_init_cp15 as indicated in this comment
> > + /* deactivate the data cache, early enabled in arch_cpu_init() */
> > + dcache_disable();
> > + /*
> > + * update MMU after relocation, the TLB location was udpated in
> > + * board_f.c::reserve_mmu, and enable the data cache
> > + */
> > dcache_enable();
> > }
> >
> > diff --git a/arch/arm/mach-stm32mp/dram_init.c
> > b/arch/arm/mach-stm32mp/dram_init.c
> > index 3233415eff..15b39ecc03 100644
> > --- a/arch/arm/mach-stm32mp/dram_init.c
> > +++ b/arch/arm/mach-stm32mp/dram_init.c
> > @@ -7,9 +7,29 @@
> > #include <dm.h>
> > #include <lmb.h>
> > #include <ram.h>
> > +#include <asm/cache.h>
> >
> > DECLARE_GLOBAL_DATA_PTR;
> >
> > +static void set_mmu_dcache(u32 addr, u32 size) {
> > + int i;
> > + u32 start, end;
> > +
> > + start = addr >> MMU_SECTION_SHIFT;
> > + end = ((u64)addr + (u64)size) >> MMU_SECTION_SHIFT;
>
> Is this a copy of dram_bank_mmu_setup() in arch/arm/lib/cache-cp15.c ?
> Why ?
It is not just a copy...
set__mmu_dache is only a static helper for function dram_bank_mmu_setup()
I override the default implementation of the weak functon dram_bank_mmu_setup()
1/ mark SYSRAM cacheable in SPL (embedded RAM used by SPL)
2/ mark beginning of DDR cacheable in U-Boot pre-reloc (today all the DDR)
=> I prepare a possible TF-A limitation: when TF-A is running in DDR,
a part of DDR is securized and can't be mapped to avoid speculative access
3/ after relocation, DDR init is performed....
use the default implementation to map all the DDR (gd->bd->bi_dram[0])
PS: in future patches, I will only limit this case for reserved memory part,
with "no-map" tag (also for TF-A requirement)
Now after my explcation I found a issue in my patch...
SPL also use DDR (at least for CONFIG_SYS_SPL_MALLOC_START 0xC0300000) ,
so I need to marked DDR as cacheache also for SPL
> > + for (i = start; i < end; i++) {
> > +#if defined(CONFIG_SYS_ARM_CACHE_WRITETHROUGH)
> > + set_section_dcache(i, DCACHE_WRITETHROUGH); #elif
> > +defined(CONFIG_SYS_ARM_CACHE_WRITEALLOC)
> > + set_section_dcache(i, DCACHE_WRITEALLOC); #else
> > + set_section_dcache(i, DCACHE_WRITEBACK); #endif
> > + }
> > +}
>
> [...]
>
> > diff --git a/include/configs/stm32mp1.h b/include/configs/stm32mp1.h
> > index c34a720e0c..5203fc93ad 100644
> > --- a/include/configs/stm32mp1.h
> > +++ b/include/configs/stm32mp1.h
> > @@ -58,8 +58,8 @@
> >
> > /* limit SYSRAM usage to first 128 KB */
> > #define CONFIG_SPL_MAX_SIZE 0x00020000
> > -#define CONFIG_SPL_STACK (STM32_SYSRAM_BASE + \
> > - STM32_SYSRAM_SIZE)
> > +/* stack at STM32_SYSRAM_BASE + STM32_SYSRAM_SIZE -
> PGTABLE_SIZE (=16kB) */
> > +#define CONFIG_SPL_STACK 0x2FFFC000
>
> Can't you memalign() the pagetable area instead of this hacking around?
> Or use something around board_init_f_alloc_reserve().
It was my first idea: use malloc
But as I try to activate the data cache as soon as possible.
So before spl_early_init call (for spl in board_init_f) and malloc is not yet accessible.
And board_init_f_alloc_reserve is only called in U-Boot board-f.c.....
after the MMU configuration for pre-relocation / not called in SPL.
I don't see this address as hack but a memory configuration of SYSRAM:
SYRAM content (board_f)
- SPL code
- SPL data
- SPL stack (reversed order)
- TTB
But I can move it in BSS as global apl variable, I need to think about it....
It is probably more clean.
Regards
Patrick
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] arm: stm32mp1: activate data cache in SPL and before relocation
2020-03-30 13:49 ` Patrick DELAUNAY
@ 2020-03-30 14:03 ` Marek Vasut
2020-04-03 8:03 ` Patrick DELAUNAY
0 siblings, 1 reply; 6+ messages in thread
From: Marek Vasut @ 2020-03-30 14:03 UTC (permalink / raw)
To: u-boot
On 3/30/20 3:49 PM, Patrick DELAUNAY wrote:
> Hi Marek,
Hi,
[...]
>>> - /* Enable D-cache. I-cache is already enabled in start.S */
>>> + /* I-cache is already enabled in start.S */
>
> Not needed for arm V7 (I copy this function from other platfrom ... I don't remember which one)
Maybe this needs to be de-duplicated if it's a copy ?
[...]
>>> diff --git a/arch/arm/mach-stm32mp/dram_init.c
>>> b/arch/arm/mach-stm32mp/dram_init.c
>>> index 3233415eff..15b39ecc03 100644
>>> --- a/arch/arm/mach-stm32mp/dram_init.c
>>> +++ b/arch/arm/mach-stm32mp/dram_init.c
>>> @@ -7,9 +7,29 @@
>>> #include <dm.h>
>>> #include <lmb.h>
>>> #include <ram.h>
>>> +#include <asm/cache.h>
>>>
>>> DECLARE_GLOBAL_DATA_PTR;
>>>
>>> +static void set_mmu_dcache(u32 addr, u32 size) {
>>> + int i;
>>> + u32 start, end;
>>> +
>>> + start = addr >> MMU_SECTION_SHIFT;
>>> + end = ((u64)addr + (u64)size) >> MMU_SECTION_SHIFT;
>>
>> Is this a copy of dram_bank_mmu_setup() in arch/arm/lib/cache-cp15.c ?
>> Why ?
>
> It is not just a copy...
>
> set__mmu_dache is only a static helper for function dram_bank_mmu_setup()
>
> I override the default implementation of the weak functon dram_bank_mmu_setup()
Can you instead augment the original implementation to cater for this
usecase or would that be too difficult ?
> 1/ mark SYSRAM cacheable in SPL (embedded RAM used by SPL)
>
> 2/ mark beginning of DDR cacheable in U-Boot pre-reloc (today all the DDR)
> => I prepare a possible TF-A limitation: when TF-A is running in DDR,
> a part of DDR is securized and can't be mapped to avoid speculative access
>
> 3/ after relocation, DDR init is performed....
> use the default implementation to map all the DDR (gd->bd->bi_dram[0])
>
> PS: in future patches, I will only limit this case for reserved memory part,
> with "no-map" tag (also for TF-A requirement)
OK, TF-A is just adding more and more problems.
> Now after my explcation I found a issue in my patch...
> SPL also use DDR (at least for CONFIG_SYS_SPL_MALLOC_START 0xC0300000) ,
> so I need to marked DDR as cacheache also for SPL
>
>>> + for (i = start; i < end; i++) {
>>> +#if defined(CONFIG_SYS_ARM_CACHE_WRITETHROUGH)
>>> + set_section_dcache(i, DCACHE_WRITETHROUGH); #elif
>>> +defined(CONFIG_SYS_ARM_CACHE_WRITEALLOC)
>>> + set_section_dcache(i, DCACHE_WRITEALLOC); #else
>>> + set_section_dcache(i, DCACHE_WRITEBACK); #endif
>>> + }
>>> +}
>>
>> [...]
>>
>>> diff --git a/include/configs/stm32mp1.h b/include/configs/stm32mp1.h
>>> index c34a720e0c..5203fc93ad 100644
>>> --- a/include/configs/stm32mp1.h
>>> +++ b/include/configs/stm32mp1.h
>>> @@ -58,8 +58,8 @@
>>>
>>> /* limit SYSRAM usage to first 128 KB */
>>> #define CONFIG_SPL_MAX_SIZE 0x00020000
>>> -#define CONFIG_SPL_STACK (STM32_SYSRAM_BASE + \
>>> - STM32_SYSRAM_SIZE)
>>> +/* stack at STM32_SYSRAM_BASE + STM32_SYSRAM_SIZE -
>> PGTABLE_SIZE (=16kB) */
>>> +#define CONFIG_SPL_STACK 0x2FFFC000
>>
>> Can't you memalign() the pagetable area instead of this hacking around?
>> Or use something around board_init_f_alloc_reserve().
>
> It was my first idea: use malloc
>
> But as I try to activate the data cache as soon as possible.
> So before spl_early_init call (for spl in board_init_f) and malloc is not yet accessible.
>
> And board_init_f_alloc_reserve is only called in U-Boot board-f.c.....
> after the MMU configuration for pre-relocation / not called in SPL.
>
> I don't see this address as hack but a memory configuration of SYSRAM:
>
> SYRAM content (board_f)
> - SPL code
> - SPL data
> - SPL stack (reversed order)
> - TTB
>
> But I can move it in BSS as global apl variable, I need to think about it....
> It is probably more clean.
Please do :)
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] arm: stm32mp1: activate data cache in SPL and before relocation
2020-03-30 14:03 ` Marek Vasut
@ 2020-04-03 8:03 ` Patrick DELAUNAY
2020-04-03 21:37 ` Marek Vasut
0 siblings, 1 reply; 6+ messages in thread
From: Patrick DELAUNAY @ 2020-04-03 8:03 UTC (permalink / raw)
To: u-boot
Hi Marek,
> From: Marek Vasut <marex@denx.de>
> Sent: lundi 30 mars 2020 16:04
>
> On 3/30/20 3:49 PM, Patrick DELAUNAY wrote:
> > Hi Marek,
>
> Hi,
>
> [...]
>
> >>> - /* Enable D-cache. I-cache is already enabled in start.S */
> >>> + /* I-cache is already enabled in start.S */
> >
> > Not needed for arm V7 (I copy this function from other platfrom ... I
> > don't remember which one)
>
> Maybe this needs to be de-duplicated if it's a copy ?
I don't remember....
As I mixed several references
But I found the same content in many arm arch;
arch/arm/mach-imx/mx5/soc.c:67
arch/arm/mach-rockchip/board.c:47
arch/arm/mach-tegra/board.c:271
arch/arm/mach-sunxi/board.c:347
arch/arm/mach-exynos/soc.c:30:
arch/arm/mach-zynq/cpu.c:88:
arch/arm/cpu/armv7/iproc-common/hwinit-common.c:1
arch/arm/mach-u8500/cache.c:14
arch/arm/mach-keystone/init.c:206
And different implementation in
arch/arm/mach-socfpga/misc.c:55
mach-omap2/omap-cache.c:49
void enable_caches(void)
{
/* Enable I cache if not enabled */
if (!icache_status())
icache_enable();
dcache_enable();
}
the issue the weak function empty, so it is mandatory to
redefine the real implementation for each platform.
arch/arm/lib/cache.c:35
/*
* Default implementation of enable_caches()
* Real implementation should be in platform code
*/
__weak void enable_caches(void)
{
puts("WARNING: Caches not enabled\n");
}
[...]
> >>>
> >>> +static void set_mmu_dcache(u32 addr, u32 size) {
> >>> + int i;
> >>> + u32 start, end;
> >>> +
> >>> + start = addr >> MMU_SECTION_SHIFT;
> >>> + end = ((u64)addr + (u64)size) >> MMU_SECTION_SHIFT;
> >>
> >> Is this a copy of dram_bank_mmu_setup() in arch/arm/lib/cache-cp15.c ?
> >> Why ?
> >
> > It is not just a copy...
> >
> > set__mmu_dache is only a static helper for function
> > dram_bank_mmu_setup()
> >
> > I override the default implementation of the weak functon
> > dram_bank_mmu_setup()
>
> Can you instead augment the original implementation to cater for this usecase or
> would that be too difficult ?
Have a generic behavior...
I will propose to protect the access to bd->bi_dram[bank] in dram_bank_mmu_setup
[....]
> >>
> >>> diff --git a/include/configs/stm32mp1.h b/include/configs/stm32mp1.h
> >>> index c34a720e0c..5203fc93ad 100644
> >>> --- a/include/configs/stm32mp1.h
> >>> +++ b/include/configs/stm32mp1.h
> >>> @@ -58,8 +58,8 @@
> >>>
> >>> /* limit SYSRAM usage to first 128 KB */
> >>> #define CONFIG_SPL_MAX_SIZE 0x00020000
> >>> -#define CONFIG_SPL_STACK (STM32_SYSRAM_BASE + \
> >>> - STM32_SYSRAM_SIZE)
> >>> +/* stack at STM32_SYSRAM_BASE + STM32_SYSRAM_SIZE -
> >> PGTABLE_SIZE (=16kB) */
> >>> +#define CONFIG_SPL_STACK 0x2FFFC000
> >>
> >> Can't you memalign() the pagetable area instead of this hacking around?
> >> Or use something around board_init_f_alloc_reserve().
> >
> > It was my first idea: use malloc
> >
> > But as I try to activate the data cache as soon as possible.
> > So before spl_early_init call (for spl in board_init_f) and malloc is not yet
> accessible.
> >
> > And board_init_f_alloc_reserve is only called in U-Boot board-f.c.....
> > after the MMU configuration for pre-relocation / not called in SPL.
> >
> > I don't see this address as hack but a memory configuration of SYSRAM:
> >
> > SYRAM content (board_f)
> > - SPL code
> > - SPL data
> > - SPL stack (reversed order)
> > - TTB
> >
> > But I can move it in BSS as global apl variable, I need to think about it....
> > It is probably more clean.
>
> Please do :)
I willl move it in ".data" section in V2 for SPL and U-Boot.
Even in binary size increase and the SPL load time
by ROM code is increase by 30ms.
Patrick
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] arm: stm32mp1: activate data cache in SPL and before relocation
2020-04-03 8:03 ` Patrick DELAUNAY
@ 2020-04-03 21:37 ` Marek Vasut
0 siblings, 0 replies; 6+ messages in thread
From: Marek Vasut @ 2020-04-03 21:37 UTC (permalink / raw)
To: u-boot
On 4/3/20 10:03 AM, Patrick DELAUNAY wrote:
> Hi Marek,
Hi,
>> From: Marek Vasut <marex@denx.de>
>> Sent: lundi 30 mars 2020 16:04
>>
>> On 3/30/20 3:49 PM, Patrick DELAUNAY wrote:
>>> Hi Marek,
>>
>> Hi,
>>
>> [...]
>>
>>>>> - /* Enable D-cache. I-cache is already enabled in start.S */
>>>>> + /* I-cache is already enabled in start.S */
>>>
>>> Not needed for arm V7 (I copy this function from other platfrom ... I
>>> don't remember which one)
>>
>> Maybe this needs to be de-duplicated if it's a copy ?
>
> I don't remember....
> As I mixed several references
>
> But I found the same content in many arm arch;
>
> arch/arm/mach-imx/mx5/soc.c:67
> arch/arm/mach-rockchip/board.c:47
> arch/arm/mach-tegra/board.c:271
> arch/arm/mach-sunxi/board.c:347
> arch/arm/mach-exynos/soc.c:30:
> arch/arm/mach-zynq/cpu.c:88:
> arch/arm/cpu/armv7/iproc-common/hwinit-common.c:1
> arch/arm/mach-u8500/cache.c:14
> arch/arm/mach-keystone/init.c:206
>
> And different implementation in
> arch/arm/mach-socfpga/misc.c:55
On SoCFPGA, we are sure both need to be enabled, except SPL might not
want to enable dcache, which is why it's implemented there that way.
I dunno about the other platforms.
> mach-omap2/omap-cache.c:49
> void enable_caches(void)
> {
>
> /* Enable I cache if not enabled */
> if (!icache_status())
> icache_enable();
>
> dcache_enable();
> }
>
> the issue the weak function empty, so it is mandatory to
> redefine the real implementation for each platform.
>
> arch/arm/lib/cache.c:35
> /*
> * Default implementation of enable_caches()
> * Real implementation should be in platform code
> */
> __weak void enable_caches(void)
> {
> puts("WARNING: Caches not enabled\n");
> }
Hm, that's a valid point. Then I think we're stuck with
re-reimplementing this one.
> [...]
>
>>>>>
>>>>> +static void set_mmu_dcache(u32 addr, u32 size) {
>>>>> + int i;
>>>>> + u32 start, end;
>>>>> +
>>>>> + start = addr >> MMU_SECTION_SHIFT;
>>>>> + end = ((u64)addr + (u64)size) >> MMU_SECTION_SHIFT;
>>>>
>>>> Is this a copy of dram_bank_mmu_setup() in arch/arm/lib/cache-cp15.c ?
>>>> Why ?
>>>
>>> It is not just a copy...
>>>
>>> set__mmu_dache is only a static helper for function
>>> dram_bank_mmu_setup()
>>>
>>> I override the default implementation of the weak functon
>>> dram_bank_mmu_setup()
>>
>> Can you instead augment the original implementation to cater for this usecase or
>> would that be too difficult ?
>
> Have a generic behavior...
>
> I will propose to protect the access to bd->bi_dram[bank] in dram_bank_mmu_setup
Thanks!
[...]
>>> SYRAM content (board_f)
>>> - SPL code
>>> - SPL data
>>> - SPL stack (reversed order)
>>> - TTB
>>>
>>> But I can move it in BSS as global apl variable, I need to think about it....
>>> It is probably more clean.
>>
>> Please do :)
>
> I willl move it in ".data" section in V2 for SPL and U-Boot.
>
> Even in binary size increase and the SPL load time
> by ROM code is increase by 30ms.
Can it be mallocated instead ? If it's in initialized data, it will add
to the binary size, and I don't think you need it to be initialized data.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-04-03 21:37 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-30 7:46 [PATCH] arm: stm32mp1: activate data cache in SPL and before relocation Patrick Delaunay
2020-03-30 9:56 ` Marek Vasut
2020-03-30 13:49 ` Patrick DELAUNAY
2020-03-30 14:03 ` Marek Vasut
2020-04-03 8:03 ` Patrick DELAUNAY
2020-04-03 21:37 ` Marek Vasut
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.