From mboxrd@z Thu Jan 1 00:00:00 1970 From: bhupesh.sharma at freescale.com Date: Thu, 21 Aug 2014 18:50:14 +0000 Subject: [U-Boot] [Patch v2 3/5] armv8/fsl-lsch3: Release secondary cores from boot hold off with Boot Page In-Reply-To: <53F63B55.3040900@freescale.com> References: <1408480082-4617-1-git-send-email-yorksun@freescale.com> <1408480082-4617-3-git-send-email-yorksun@freescale.com> <20140821134724.GM21734@leverpostej> <53F63B55.3040900@freescale.com> Message-ID: List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: u-boot@lists.denx.de Hi Mark, > -----Original Message----- > From: u-boot-bounces at lists.denx.de [mailto:u-boot-bounces at lists.denx.de] > On Behalf Of York Sun > Sent: Friday, August 22, 2014 12:03 AM > To: Mark Rutland; Basu Arnab-B45036 > Cc: trini at ti.com; u-boot at lists.denx.de; Wood Scott-B07421 > Subject: Re: [U-Boot] [Patch v2 3/5] armv8/fsl-lsch3: Release secondary > cores from boot hold off with Boot Page > > On 08/21/2014 06:47 AM, Mark Rutland wrote: > > Hi York, > > > > I have mostly minor comments this time; this is looking pretty good. > > > > On Tue, Aug 19, 2014 at 09:28:00PM +0100, York Sun wrote: > >> Secondary cores need to be released from holdoff by boot release > >> registers. With GPP bootrom, they can boot from main memory directly. > >> Individual spin table is used for each core. If a single release > >> address is needed, defining macro CONFIG_FSL_SMP_RELEASE_ALL will use > >> the CPU_RELEASE_ADDR. Spin table and the boot page is reserved in > >> device tree so OS won't overwrite. > >> > >> Signed-off-by: York Sun > >> Signed-off-by: Arnab Basu > >> --- > >> v2: Removed copying boot page. Use u-boot image as is in memory. > >> Added dealing with different size of addr_cell in device tree. > >> Added dealing with big- and little-endian. > >> Added flushing spin table after cpu_release(). > >> > >> arch/arm/cpu/armv8/fsl-lsch3/Makefile | 2 + > >> arch/arm/cpu/armv8/fsl-lsch3/cpu.c | 13 ++ > >> arch/arm/cpu/armv8/fsl-lsch3/cpu.h | 1 + > >> arch/arm/cpu/armv8/fsl-lsch3/fdt.c | 56 +++++++ > >> arch/arm/cpu/armv8/fsl-lsch3/lowlevel.S | 128 ++++++++++++- > -- > >> arch/arm/cpu/armv8/fsl-lsch3/mp.c | 172 > +++++++++++++++++++++ > >> arch/arm/cpu/armv8/fsl-lsch3/mp.h | 36 +++++ > >> arch/arm/cpu/armv8/transition.S | 63 +------- > >> arch/arm/include/asm/arch-fsl-lsch3/config.h | 3 +- > >> arch/arm/include/asm/arch-fsl-lsch3/immap_lsch3.h | 35 +++++ > >> arch/arm/include/asm/macro.h | 92 +++++++++++ > >> arch/arm/lib/gic_64.S | 10 +- > >> common/board_f.c | 2 +- > >> 13 files changed, 525 insertions(+), 88 deletions(-) create mode > >> 100644 arch/arm/cpu/armv8/fsl-lsch3/fdt.c > >> create mode 100644 arch/arm/cpu/armv8/fsl-lsch3/mp.c create mode > >> 100644 arch/arm/cpu/armv8/fsl-lsch3/mp.h > >> > >> diff --git a/arch/arm/cpu/armv8/fsl-lsch3/Makefile > >> b/arch/arm/cpu/armv8/fsl-lsch3/Makefile > >> index 9249537..f920eeb 100644 > >> --- a/arch/arm/cpu/armv8/fsl-lsch3/Makefile > >> +++ b/arch/arm/cpu/armv8/fsl-lsch3/Makefile > >> @@ -7,3 +7,5 @@ > >> obj-y += cpu.o > >> obj-y += lowlevel.o > >> obj-y += speed.o > >> +obj-$(CONFIG_MP) += mp.o > >> +obj-$(CONFIG_OF_LIBFDT) += fdt.o > >> diff --git a/arch/arm/cpu/armv8/fsl-lsch3/cpu.c > >> b/arch/arm/cpu/armv8/fsl-lsch3/cpu.c > >> index c129d03..47b947f 100644 > >> --- a/arch/arm/cpu/armv8/fsl-lsch3/cpu.c > >> +++ b/arch/arm/cpu/armv8/fsl-lsch3/cpu.c > >> @@ -11,6 +11,7 @@ > >> #include > >> #include > >> #include "cpu.h" > >> +#include "mp.h" > >> #include "speed.h" > >> #include > >> > >> @@ -434,3 +435,15 @@ int cpu_eth_init(bd_t *bis) #endif > >> return error; > >> } > >> + > >> + > >> +int arch_early_init_r(void) > >> +{ > >> + int rv; > >> + rv = fsl_lsch3_wake_seconday_cores(); > >> + > >> + if (rv) > >> + printf("Did not wake secondary cores\n"); > >> + > >> + return 0; > >> +} > >> diff --git a/arch/arm/cpu/armv8/fsl-lsch3/cpu.h > >> b/arch/arm/cpu/armv8/fsl-lsch3/cpu.h > >> index 28544d7..2e3312b 100644 > >> --- a/arch/arm/cpu/armv8/fsl-lsch3/cpu.h > >> +++ b/arch/arm/cpu/armv8/fsl-lsch3/cpu.h > >> @@ -5,3 +5,4 @@ > >> */ > >> > >> int fsl_qoriq_core_to_cluster(unsigned int core); > >> +u32 cpu_mask(void); > >> diff --git a/arch/arm/cpu/armv8/fsl-lsch3/fdt.c > >> b/arch/arm/cpu/armv8/fsl-lsch3/fdt.c > >> new file mode 100644 > >> index 0000000..2dbcdcb > >> --- /dev/null > >> +++ b/arch/arm/cpu/armv8/fsl-lsch3/fdt.c > >> @@ -0,0 +1,56 @@ > >> +/* > >> + * Copyright 2014 Freescale Semiconductor, Inc. > >> + * > >> + * SPDX-License-Identifier: GPL-2.0+ > >> + */ > >> + > >> +#include > >> +#include > >> +#include > >> +#include "mp.h" > >> + > >> +#ifdef CONFIG_MP > >> +void ft_fixup_cpu(void *blob) > >> +{ > >> + int off; > >> + __maybe_unused u64 spin_tbl_addr = (u64)get_spin_tbl_addr(); > >> + fdt32_t *reg; > >> + int addr_cells; > >> + u64 val; > >> + size_t *boot_code_size = &(__secondary_boot_code_size); > >> + > >> + off = fdt_node_offset_by_prop_value(blob, -1, "device_type", > >> + "cpus", 4); > > > > I didn't think /cpus had device_type = "cpus". I can't see any > > instances in any DTs I have to hand. Can we not find /cpus by path? > > I will let Arnab to comment on this. He is coordinating with Linux device > tree. Since I contribute to the DTS for FSL ARMv8 SoC, here is my rationale behind the same. I have used the standard ARM cpu device-tree binding documentation as a reference (see [1]) which defined the device_type which it mentions should be set to cpu. Please let me know if I am missing something. [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/arm/cpus.txt > > > >> + of_bus_default_count_cells(blob, off, &addr_cells, NULL); > >> + > >> + off = fdt_node_offset_by_prop_value(blob, -1, "device_type", > "cpu", 4); > >> + while (off != -FDT_ERR_NOTFOUND) { > >> + reg = (fdt32_t *)fdt_getprop(blob, off, "reg", 0); > >> + if (reg) { > >> + val = spin_tbl_addr; #ifndef > >> +CONFIG_FSL_SMP_RELEASE_ALL > >> + val += id_to_core(of_read_number(reg, > addr_cells)) > >> + * SPIN_TABLE_ELEM_SIZE; #endif > >> + val = cpu_to_fdt64(val); > >> + fdt_setprop_string(blob, off, "enable-method", > >> + "spin-table"); > >> + fdt_setprop(blob, off, "cpu-release-addr", > >> + &val, sizeof(val)); > >> + } else { > >> + puts("cpu NULL\n"); > > > > Could we change that to "Warning: found cpu node without reg property" > > or something like that which makes the problem clear? > > Yes, we can. > > > > >> + } > >> + off = fdt_node_offset_by_prop_value(blob, off, > "device_type", > >> + "cpu", 4); > >> + } > >> + > >> + fdt_add_mem_rsv(blob, (uintptr_t)&secondary_boot_code, > >> + *boot_code_size); } #endif > >> + > >> +void ft_cpu_setup(void *blob, bd_t *bd) { #ifdef CONFIG_MP > >> + ft_fixup_cpu(blob); > >> +#endif > >> +} > >> diff --git a/arch/arm/cpu/armv8/fsl-lsch3/lowlevel.S > >> b/arch/arm/cpu/armv8/fsl-lsch3/lowlevel.S > >> index ad32b6c..629e6d4 100644 > >> --- a/arch/arm/cpu/armv8/fsl-lsch3/lowlevel.S > >> +++ b/arch/arm/cpu/armv8/fsl-lsch3/lowlevel.S > >> @@ -8,7 +8,9 @@ > >> > >> #include > >> #include > >> +#include > >> #include > >> +#include "mp.h" > >> > >> ENTRY(lowlevel_init) > >> mov x29, lr /* Save LR */ > >> @@ -37,29 +39,119 @@ ENTRY(lowlevel_init) > >> > >> branch_if_master x0, x1, 1f > >> > >> + ldr x0, =secondary_boot_func > >> + blr x0 > >> + > >> +1: > >> +2: > > > > Isn't the '2' label redundant? > > We have some internal code dealing with trust zone between the 1 and 2. It > is not likely to be used in long term since we are trying to move them > into security monitor. I can drop label 2 here. U-boot can still be booted in EL3, as it can be well booted w/o a ATF or EL3 capable s/w running before the same. That's why we have CONFIG_EL3 and CONFIG_EL2 code legs in the u-boot ARMv8 code. > > > > >> + mov lr, x29 /* Restore LR */ > >> + ret > >> +ENDPROC(lowlevel_init) > >> + > >> + /* Keep literals not used by the secondary boot code outside it > */ > >> + .ltorg > >> + > >> + /* Using 64 bit alignment since the spin table is accessed as > data */ > >> + .align 4 > >> + .global secondary_boot_code > >> + /* Secondary Boot Code starts here */ > >> +secondary_boot_code: > >> + .global __spin_table > >> +__spin_table: > >> + .space CONFIG_MAX_CPUS*SPIN_TABLE_ELEM_SIZE > >> + > >> + .align 2 > >> +ENTRY(secondary_boot_func) > >> /* > >> - * Slave should wait for master clearing spin table. > >> - * This sync prevent salves observing incorrect > >> - * value of spin table and jumping to wrong place. > >> + * MPIDR_EL1 Fields: > >> + * MPIDR[1:0] = AFF0_CPUID <- Core ID (0,1) > >> + * MPIDR[7:2] = AFF0_RES > >> + * MPIDR[15:8] = AFF1_CLUSTERID <- Cluster ID (0,1,2,3) > >> + * MPIDR[23:16] = AFF2_CLUSTERID > >> + * MPIDR[24] = MT > >> + * MPIDR[29:25] = RES0 > >> + * MPIDR[30] = U > >> + * MPIDR[31] = ME > >> + * MPIDR[39:32] = AFF3 > >> + * > >> + * Linear Processor ID (LPID) calculation from MPIDR_EL1: > >> + * (We only use AFF0_CPUID and AFF1_CLUSTERID for now > >> + * until AFF2_CLUSTERID and AFF3 have non-zero values) > >> + * > >> + * LPID = MPIDR[15:8] | MPIDR[1:0] > >> */ > >> -#if defined(CONFIG_GICV2) || defined(CONFIG_GICV3) -#ifdef > >> CONFIG_GICV2 > >> - ldr x0, =GICC_BASE > >> + mrs x0, mpidr_el1 > >> + ubfm x1, x0, #8, #15 > >> + ubfm x2, x0, #0, #1 > >> + orr x10, x2, x1, lsl #2 /* x10 has LPID */ > >> + ubfm x9, x0, #0, #15 /* x9 contains MPIDR[15:0] */ > >> + /* > >> + * offset of the spin table element for this core from start of > spin > >> + * table (each elem is padded to 64 bytes) > >> + */ > >> + lsl x1, x10, #6 > >> + ldr x0, =__spin_table > >> + /* physical address of this cpus spin table element */ > >> + add x11, x1, x0 > >> + > >> + str x9, [x11, #16] /* LPID */ > >> + mov x4, #1 > >> + str x4, [x11, #8] /* STATUS */ > >> + dsb sy > >> +#if defined(CONFIG_GICV3) > >> + gic_wait_for_interrupt_m x0 > >> #endif > >> - bl gic_wait_for_interrupt > > > > Why do we only need to wait for GICv3? We previously did so for GICv2. > > I think this is leftover from the attempt to boot all cores > simultaneously. I will let Arnab to comment. > > > > >> + > >> + bl secondary_switch_to_el2 > >> +#ifdef CONFIG_ARMV8_SWITCH_TO_EL1 > >> + secondary_switch_to_el1 > >> #endif > > > > Missing bl? > > Looks so. > > > > > Is there any reason to have the SWITCH_TO_EL1 option other than for > > debugging? > > Good question. I will let Arnab to comment here. > > > > > EL2 is the preferred EL to boot at for Linux and Xen (it gives far > > more flexibility), and if dropping to EL1 is necessary I think it > > would make more sense as a run-time option than a compile-time option. > > > >> > >> - /* > >> - * All processors will enter EL2 and optionally EL1. > >> +slave_cpu: > >> + wfe > >> +#ifdef CONFIG_FSL_SMP_RELEASE_ALL > >> + /* All cores are released from the address in the 1st spin > table > >> + * element > >> */ > >> - bl armv8_switch_to_el2 > >> -#ifdef CONFIG_ARMV8_SWITCH_TO_EL1 > >> - bl armv8_switch_to_el1 > >> + ldr x1, =__spin_table > >> + ldr x0, [x1] > >> +#else > >> + ldr x0, [x11] > >> +#endif > >> + cbz x0, slave_cpu > > > > Similarly is there any reason to have the option of a single release > > addr if we can support unique addresses? > > I think it was used by Linux for some ARM parts. I personally not a fun of > using single release. But if it makes everyone happy, I can keep it. We followed the standard ARMv8 foundation model DTS initially which along with others supported a single release address for all the cores. So, we wanted to comply to the same. > > > > > [...] > > > >> diff --git a/arch/arm/cpu/armv8/fsl-lsch3/mp.c > >> b/arch/arm/cpu/armv8/fsl-lsch3/mp.c > >> new file mode 100644 > >> index 0000000..b29eda9 > >> --- /dev/null > >> +++ b/arch/arm/cpu/armv8/fsl-lsch3/mp.c > >> @@ -0,0 +1,172 @@ > >> +/* > >> + * Copyright 2014 Freescale Semiconductor, Inc. > >> + * > >> + * SPDX-License-Identifier: GPL-2.0+ > >> + */ > >> + > >> +#include > >> +#include > >> +#include > >> +#include > >> +#include > >> +#include "mp.h" > >> + > >> +DECLARE_GLOBAL_DATA_PTR; > >> + > >> +void *get_spin_tbl_addr(void) > >> +{ > >> + void *ptr = (void *)&__spin_table; > >> + > >> + return ptr; > >> +} > > > > I don't think the cast is necessary here. As far as I can see this > > could be just: > > > > void *get_spin_tbl_addr(void) > > { > > return &__spin_table; > > } > > > > Let me try that. > > > [...] > > > >> diff --git a/arch/arm/include/asm/macro.h > >> b/arch/arm/include/asm/macro.h index f77e4b8..0009c28 100644 > >> --- a/arch/arm/include/asm/macro.h > >> +++ b/arch/arm/include/asm/macro.h > >> @@ -105,6 +105,98 @@ lr .req x30 > >> cbz \xreg1, \master_label > >> .endm > >> > >> +.macro armv8_switch_to_el2_m, xreg1 > >> + mov \xreg1, #0x5b1 /* Non-secure EL0/EL1 | HVC | 64bit EL2 > */ > >> + msr scr_el3, \xreg1 > > > > When dropping to EL1 from EL2 we disable HVC via HCR_EL2; presumably > > due to lack of a handler. Would it make sense to do similarly here and > > disable SMC here until we have a user (e.g. PSCI)? > > I will let Arnab to comment here. In my view u-boot should be similar to UEFI and both should drop only to EL2 before booting either Linux or Hypervisor. UEFI code doesn't switch to EL1 before invoking Linux. Thoughts? > > > York > _______________________________________________ > U-Boot mailing list > U-Boot at lists.denx.de > http://lists.denx.de/mailman/listinfo/u-boot