From mboxrd@z Thu Jan 1 00:00:00 1970 From: York Sun Date: Wed, 4 Jun 2014 09:27:30 -0700 Subject: [U-Boot] [Patch v4 2/5] ARMv8: Adjust MMU setup In-Reply-To: <20140602180100.GA887@leverpostej> References: <1401396548-3353-1-git-send-email-yorksun@freescale.com> <1401396548-3353-2-git-send-email-yorksun@freescale.com> <20140602113450.GD13573@leverpostej> <538CA0F5.2060103@freescale.com> <20140602180100.GA887@leverpostej> Message-ID: <538F48F2.3080408@freescale.com> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: u-boot@lists.denx.de On 06/02/2014 11:01 AM, Mark Rutland wrote: > On Mon, Jun 02, 2014 at 05:06:13PM +0100, York Sun wrote: >> On 06/02/2014 04:34 AM, Mark Rutland wrote: >>> On Thu, May 29, 2014 at 09:49:05PM +0100, York Sun wrote: >>>> Make MMU functions reusable. Platform code can setup its own MMU tables. >>> >>> What exactly does platform code need to setup its own tables for? >> >> The general ARMv8 MMU table is not detail enough to control memory attribute >> like cache for all addresses. We have devices mapping to addresses with >> different requirement for cache control. > > And there are no APIs for creating device mappings rather than exporting > the raw pagetable accessors and hard-coding them differently in every > board file? > That's a good question. At this point, only two platforms are using ARMv8 code. I am expecting FSL ARMv8 implementation will stay similar, i.e. covered by the file I added. If that's not the case, or more ARMv8 SoCs need special MMU table, we then should introduce such API. Having a full function MMU API may be an overkill for U-boot. We don't need dynamic MMU anyway. >> >>> >>>> Also fix a typo of "TCR_EL3_IPS_BITS" in cache_v8.c. >>>> >>>> Signed-off-by: York Sun >>>> CC: David Feng >>>> --- >>>> Change log: >>>> v4: new patch, splitted from v3 2/4 >>>> Revise set_pgtable_section() to be reused by platform MMU code >>>> Add inline function set_ttbr_tcr_mair() to be used by this and platform mmu code >>>> >>>> arch/arm/cpu/armv8/cache_v8.c | 49 ++++++++++++++++---------------------- >>>> arch/arm/include/asm/armv8/mmu.h | 23 ++++++++++++++++++ >>>> 2 files changed, 43 insertions(+), 29 deletions(-) >>>> >>>> diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c >>>> index a96ecda..67dbd46 100644 >>>> --- a/arch/arm/cpu/armv8/cache_v8.c >>>> +++ b/arch/arm/cpu/armv8/cache_v8.c >>>> @@ -12,15 +12,14 @@ >>>> DECLARE_GLOBAL_DATA_PTR; >>>> >>>> #ifndef CONFIG_SYS_DCACHE_OFF >>>> - >>>> -static void set_pgtable_section(u64 section, u64 memory_type) >>>> +void set_pgtable_section(u64 *page_table, u64 index, u64 section, >>>> + u64 memory_type) >>>> { >>>> - u64 *page_table = (u64 *)gd->arch.tlb_addr; >>>> u64 value; >>>> >>>> - value = (section << SECTION_SHIFT) | PMD_TYPE_SECT | PMD_SECT_AF; >>>> + value = section | PMD_TYPE_SECT | PMD_SECT_AF; >>>> value |= PMD_ATTRINDX(memory_type); >>>> - page_table[section] = value; >>>> + page_table[index] = value; >>>> } >>>> >>>> /* to activate the MMU we need to set up virtual memory */ >>>> @@ -28,10 +27,13 @@ static void mmu_setup(void) >>>> { >>>> int i, j, el; >>>> bd_t *bd = gd->bd; >>>> + u64 *page_table = (u64 *)gd->arch.tlb_addr; >>>> >>>> /* Setup an identity-mapping for all spaces */ >>>> - for (i = 0; i < (PGTABLE_SIZE >> 3); i++) >>>> - set_pgtable_section(i, MT_DEVICE_NGNRNE); >>>> + for (i = 0; i < (PGTABLE_SIZE >> 3); i++) { >>>> + set_pgtable_section(page_table, i, i << SECTION_SHIFT, >>>> + MT_DEVICE_NGNRNE); >>>> + } >>>> >>>> /* Setup an identity-mapping for all RAM space */ >>>> for (i = 0; i < CONFIG_NR_DRAM_BANKS; i++) { >>>> @@ -39,36 +41,25 @@ static void mmu_setup(void) >>>> ulong end = bd->bi_dram[i].start + bd->bi_dram[i].size; >>>> for (j = start >> SECTION_SHIFT; >>>> j < end >> SECTION_SHIFT; j++) { >>>> - set_pgtable_section(j, MT_NORMAL); >>>> + set_pgtable_section(page_table, j, j << SECTION_SHIFT, >>>> + MT_NORMAL); >>>> } >>>> } >>>> >>>> /* load TTBR0 */ >>>> el = current_el(); >>>> if (el == 1) { >>>> - asm volatile("msr ttbr0_el1, %0" >>>> - : : "r" (gd->arch.tlb_addr) : "memory"); >>>> - asm volatile("msr tcr_el1, %0" >>>> - : : "r" (TCR_FLAGS | TCR_EL1_IPS_BITS) >>>> - : "memory"); >>>> - asm volatile("msr mair_el1, %0" >>>> - : : "r" (MEMORY_ATTRIBUTES) : "memory"); >>>> + set_ttbr_tcr_mair(el, gd->arch.tlb_addr, >>>> + TCR_FLAGS | TCR_EL1_IPS_BITS, >>>> + MEMORY_ATTRIBUTES); >>>> } else if (el == 2) { >>>> - asm volatile("msr ttbr0_el2, %0" >>>> - : : "r" (gd->arch.tlb_addr) : "memory"); >>>> - asm volatile("msr tcr_el2, %0" >>>> - : : "r" (TCR_FLAGS | TCR_EL2_IPS_BITS) >>>> - : "memory"); >>>> - asm volatile("msr mair_el2, %0" >>>> - : : "r" (MEMORY_ATTRIBUTES) : "memory"); >>>> + set_ttbr_tcr_mair(el, gd->arch.tlb_addr, >>>> + TCR_FLAGS | TCR_EL2_IPS_BITS, >>>> + MEMORY_ATTRIBUTES); >>>> } else { >>>> - asm volatile("msr ttbr0_el3, %0" >>>> - : : "r" (gd->arch.tlb_addr) : "memory"); >>>> - asm volatile("msr tcr_el3, %0" >>>> - : : "r" (TCR_FLAGS | TCR_EL2_IPS_BITS) >>>> - : "memory"); >>>> - asm volatile("msr mair_el3, %0" >>>> - : : "r" (MEMORY_ATTRIBUTES) : "memory"); >>>> + set_ttbr_tcr_mair(el, gd->arch.tlb_addr, >>>> + TCR_FLAGS | TCR_EL3_IPS_BITS, >>>> + MEMORY_ATTRIBUTES); >>>> } >>>> >>>> /* enable the mmu */ >>>> diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h >>>> index 1193e76..7de4ff9 100644 >>>> --- a/arch/arm/include/asm/armv8/mmu.h >>>> +++ b/arch/arm/include/asm/armv8/mmu.h >>>> @@ -108,4 +108,27 @@ >>>> TCR_IRGN_WBWA | \ >>>> TCR_T0SZ(VA_BITS)) >>>> >>>> +#ifndef __ASSEMBLY__ >>>> +void set_pgtable_section(u64 *page_table, u64 index, >>>> + u64 section, u64 memory_type); >>>> +static inline void set_ttbr_tcr_mair(int el, u64 table, u64 tcr, u64 attr) >>>> +{ >>>> + asm volatile("dsb sy;isb"); >>> >>> Huh? This wasn't anywhere before. Is the isb necessary? >> >> Probably not, but to make sure all previous instructions finish. > > Which instructions do you care about completing? > > You'll certainly want any page table writes to complete, but is anything > else in-flight at this point in time? Before calling this inline function, MMU tables were created. We want the writing completes before setting these registers. > >>> >>> When is this function expected to be called? Is the MMU expected to be >>> on? >> >> The general ARMv8 code calls this when MMU is off. For the SoC I am enabling, it >> is called when MMU is off for the first time, but MMU on for the 2nd time. > > Ok. > >> >>> >>>> + if (el == 1) { >>>> + asm volatile("msr ttbr0_el1, %0" : : "r" (table) : "memory"); >>>> + asm volatile("msr tcr_el1, %0" : : "r" (tcr) : "memory"); >>>> + asm volatile("msr mair_el1, %0" : : "r" (attr) : "memory"); >>> >>> If the MMU is on, this looks really scary -- what are you expecting to >>> change in a single invocation? >> >> It is not scary for general ARMv8 code. MMU is off then this is called. For FSL >> SoC, I have these called for the 2nd time, when MMU is on. The 2nd time call >> points to a new MMU table. > > Oh but it is ;) > >> >>> >>>> + } else if (el == 2) { >>>> + asm volatile("msr ttbr0_el2, %0" : : "r" (table) : "memory"); >>>> + asm volatile("msr tcr_el2, %0" : : "r" (tcr) : "memory"); >>>> + asm volatile("msr mair_el2, %0" : : "r" (attr) : "memory"); >>>> + } else if (el == 3) { >>>> + asm volatile("msr ttbr0_el3, %0" : : "r" (table) : "memory"); >>>> + asm volatile("msr tcr_el3, %0" : : "r" (tcr) : "memory"); >>>> + asm volatile("msr mair_el3, %0" : : "r" (attr) : "memory"); >>>> + } else { >>>> + hang(); >>>> + } >>> >>> And no synchronisation to ensure that these writes are complete or even >>> ordered w.r.t. each other? >>> >> >> That's why I added asm volatile("dsb sy;isb") before them. The order of these >> write doesn't matter. See the code before my change >> http://git.denx.de/?p=u-boot.git;a=blob;f=arch/arm/cpu/armv8/cache_v8.c;h=a96ecda7e3146996d20a2e46b975dced2769255c;hb=HEAD > > Ok, so that orders them against everything _before_ them, but not with > respect to each other, or anything _after_ them. > > When the MMU is off, you are correct that the ordering of these > operations w.r.t. each other doesn't matter. With the MMU on that's not > true as the CPU can rerder the operations and can walk the page tables > asynchronously. > > Changing the MAIR at runtime is somewhat scary because you may be > changing attributes for entries which are active in the cache and/or > TLBs. I'm not sure I follow why you need to change it once the MMU is on > -- there are plenty of subfields in the MAIR that you could configure > before turning the MMU on for use later. > > Likewise changing the TCR at runtime is somewhat scary because you can > change attributes of active cache and/or TLB entries, change fields that > affect the way page tables are walked (TG*. T*SZ), all asynchronously > w.r.t. the logic that walsk the page tables. > > Changing the TTBR is fine, except that we didn't invalidate old entries > with a TLBI, so we may even have partial translations cahced in the > TLBs, and we can get odd translations generated from the combination of > the old and new tables. Maybe a more proper way to do so is to change TTBR only. That's actually what I really need. Or if I really need to change all of them, I should turn off MMU first. > >> For the 2nd write used by FSL SoC, I need to flush the cache to make sure the >> table is in main DDR and perform TLB invalidation. That's when the loading of >> new MMU table happens. > > I missed the TLB invaliation -- where is that meant to happen? > After flushing the new MMU table into DDR. To your point above, I will send a new version with updating TTBR only. York