* [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3
@ 2016-02-26  0:49 Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 01/11] thunderx: Calculate TCR dynamically Alexander Graf
                   ` (11 more replies)
  0 siblings, 12 replies; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  0:49 UTC (permalink / raw)
  To: u-boot

Howdy,

MMU and page table support on arm64 is currently a mess. Each board
does its own little thing, and the generic code is so limited that
nobody actually uses it.

This patch set tries to clean that up. After this series is applied,
all boards except for the FSL Layerscape ones are converted to the
new generic page table logic and have icache+dcache enabled.

The new code always uses 4k page size. It dynamically allocates 1G or
2M pages for ranges that fit. When a dcache attribute request comes in
that requires a smaller granularity than our previous allocation could
fulfill, pages get automatically split.
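
As a rough illustration of the "1G or 2M pages for ranges that fit" rule
(a standalone sketch, not code from this series), the choice boils down to
an alignment and size check per step. The helper names pick_block_size()
and count_mappings() are made up for this example, and both address and
size are assumed to be 4k aligned:

```c
#include <assert.h>
#include <stdint.h>

/* Sizes one entry can map at levels 1..3 with a 4k granule */
#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)
#define SZ_1G (1ULL << 30)

/*
 * Hypothetical helper: return the largest mapping size (1G, 2M or 4K)
 * usable at addr when `size` bytes are still left to map.
 * Assumption: addr and size are both multiples of 4k and size is non-zero.
 */
static uint64_t pick_block_size(uint64_t addr, uint64_t size)
{
	static const uint64_t sizes[] = { SZ_1G, SZ_2M, SZ_4K };
	unsigned int i;

	assert(size && !(addr & (SZ_4K - 1)) && !(size & (SZ_4K - 1)));

	for (i = 0; i < 3; i++) {
		/* A size fits if enough is left and addr is aligned to it */
		if (size >= sizes[i] && !(addr & (sizes[i] - 1)))
			return sizes[i];
	}

	return 0; /* not reached for 4k aligned input */
}

/* Count how many entries are needed to map [addr, addr + size) */
static unsigned int count_mappings(uint64_t addr, uint64_t size)
{
	unsigned int n = 0;

	while (size) {
		uint64_t step = pick_block_size(addr, size);

		addr += step;
		size -= step;
		n++;
	}

	return n;
}
```

A range of 1G + 2M + 4K starting at address 0 would come out as three
entries under this rule: one 1G block, one 2M block and one 4K page.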

I have tested and verified the code works on HiKey (bare metal),
vexpress64 (Foundation Model) and zynqmp (QEMU). The TX1 target is
untested, but given the simplicity of the maps I doubt it'll break.
ThunderX in theory should also work, but I haven't tested it. I would
be very happy if people with access to those systems could give the patch
set a try.

With this we're a big step closer to a good baseline for EFI payload
support, since we can now just require that all boards always have dcache
enabled.

I would also be incredibly happy if some Freescale people could look
at their MMU code and try to unify it into the now cleaned up generic
code. I don't think we're far off here.


Alex

v1 -> v2:

  - Fix comment for create_table()
  - Rework page table size calculation
  - Move mmu tables into board files
  - New patch: thunderx: Move mmu table into board file

v2 -> v3:

  - Move PGTABLE_SIZE out of the ASSEMBLY section
  - Run pte splitting code in separate page table
  - Replace is_level in count_required_pts with more obvious pte_type
  - New patch: arm64: Disable TTBR1 maps in EL1

v3 -> v4:

  - Fix Lv3 PTEs that are always of type PTE_TYPE_LEVEL (fixes zynqmp)

Alexander Graf (11):
  thunderx: Calculate TCR dynamically
  arm64: Disable TTBR1 maps in EL1
  arm64: Make full va map code more dynamic
  thunderx: Move mmu table into board file
  zymqmp: Replace home grown mmu code with generic table approach
  tegra: Replace home grown mmu code with generic table approach
  vexpress64: Add MMU tables
  dwmmc: Increase retry timeout
  hikey: Add MMU tables
  arm64: Remove non-full-va map code
  arm64: Only allow dcache disabled in SPL builds

 arch/arm/cpu/armv8/cache.S                     |  54 +++
 arch/arm/cpu/armv8/cache_v8.c                  | 551 +++++++++++++++++++------
 arch/arm/cpu/armv8/fsl-layerscape/cpu.c        |  37 +-
 arch/arm/cpu/armv8/zynqmp/cpu.c                | 217 +++-------
 arch/arm/include/asm/arch-fsl-layerscape/cpu.h |  94 ++---
 arch/arm/include/asm/armv8/mmu.h               | 123 ++----
 arch/arm/include/asm/global_data.h             |   7 +-
 arch/arm/include/asm/system.h                  |  12 +-
 arch/arm/mach-tegra/arm64-mmu.c                | 132 +-----
 board/armltd/vexpress64/vexpress64.c           |  21 +
 board/cavium/thunderx/thunderx.c               |  24 ++
 board/hisilicon/hikey/hikey.c                  |  21 +
 doc/README.arm64                               |  20 -
 drivers/mmc/dw_mmc.c                           |   2 +-
 include/configs/hikey.h                        |   4 +-
 include/configs/thunderx_88xx.h                |  30 --
 include/configs/vexpress_aemv8a.h              |   5 +-
 17 files changed, 738 insertions(+), 616 deletions(-)

-- 
1.8.5.6


* [U-Boot] [PATCH v4 01/11] thunderx: Calculate TCR dynamically
  2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
@ 2016-02-26  0:49 ` Alexander Graf
  2016-02-26  2:51   ` Simon Glass
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 02/11] arm64: Disable TTBR1 maps in EL1 Alexander Graf
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  0:49 UTC (permalink / raw)
  To: u-boot

Based on the memory map we can determine a lot of hard coded fields of
TCR, like the maximum VA and max PA we want to support. Calculate those
dynamically to reduce the chance of pitfalls.
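
For readers who want to play with the idea outside of U-Boot, the
derivation can be sketched as a small table lookup. This is only an
illustration under stated assumptions: struct addr_range, struct
tcr_fields and tcr_fields_for() are hypothetical names, the table holds
the address widths selected by the IPS/PS encodings 0..5, and T0SZ is
taken to be 64 minus the number of VA bits. Overflow of base + size is
not handled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one memory map entry */
struct addr_range {
	uint64_t base;
	uint64_t size;
};

/* Hypothetical result type: the TCR fields derived from the map */
struct tcr_fields {
	unsigned int ips;	/* IPS/PS encoding, 0..5 */
	unsigned int va_bits;	/* VA width, chosen equal to the PA width */
	unsigned int t0sz;	/* 64 - va_bits */
};

static struct tcr_fields tcr_fields_for(const struct addr_range *map,
					unsigned int n)
{
	/* Address widths selected by encodings 0, 1, 2, 3, 4, 5 */
	static const unsigned int pa_bits[] = { 32, 36, 40, 42, 44, 48 };
	struct tcr_fields f;
	uint64_t max_addr = 0;
	unsigned int i;

	assert(n > 0);

	/* Find the largest address the map needs to reach */
	for (i = 0; i < n; i++) {
		uint64_t end = map[i].base + map[i].size;

		if (end > max_addr)
			max_addr = end;
	}

	/* Pick the smallest width that still covers max_addr */
	for (i = 0; i < 5; i++) {
		if (max_addr <= (1ULL << pa_bits[i]))
			break;
	}

	f.ips = i;
	f.va_bits = pa_bits[i];
	f.t0sz = 64 - f.va_bits;

	return f;
}
```

Fed with the three ThunderX regions, whose highest end address is
0x880000000000, this picks encoding 5 (48 bits), which matches the
previously hard coded CONFIG_SYS_TCR_EL*_IPS_BITS values.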

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/arm/cpu/armv8/cache_v8.c    | 59 +++++++++++++++++++++++++++++++++++++++-
 arch/arm/include/asm/armv8/mmu.h |  6 +---
 include/configs/thunderx_88xx.h  |  3 --
 3 files changed, 59 insertions(+), 9 deletions(-)

diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c
index 71f0020..9229532 100644
--- a/arch/arm/cpu/armv8/cache_v8.c
+++ b/arch/arm/cpu/armv8/cache_v8.c
@@ -38,6 +38,58 @@ static struct mm_region mem_map[] = CONFIG_SYS_MEM_MAP;
 #define PTL1_ENTRIES CONFIG_SYS_PTL1_ENTRIES
 #define PTL2_ENTRIES CONFIG_SYS_PTL2_ENTRIES
 
+static u64 get_tcr(int el, u64 *pips, u64 *pva_bits)
+{
+	u64 max_addr = 0;
+	u64 ips, va_bits;
+	u64 tcr;
+	int i;
+
+	/* Find the largest address we need to support */
+	for (i = 0; i < ARRAY_SIZE(mem_map); i++)
+		max_addr = max(max_addr, mem_map[i].base + mem_map[i].size);
+
+	/* Calculate the maximum physical (and thus virtual) address */
+	if (max_addr > (1ULL << 44)) {
+		ips = 5;
+		va_bits = 48;
+	} else  if (max_addr > (1ULL << 42)) {
+		ips = 4;
+		va_bits = 44;
+	} else  if (max_addr > (1ULL << 40)) {
+		ips = 3;
+		va_bits = 42;
+	} else  if (max_addr > (1ULL << 36)) {
+		ips = 2;
+		va_bits = 40;
+	} else  if (max_addr > (1ULL << 32)) {
+		ips = 1;
+		va_bits = 36;
+	} else {
+		ips = 0;
+		va_bits = 32;
+	}
+
+	if (el == 1) {
+		tcr = TCR_EL1_RSVD | (ips << 32);
+	} else if (el == 2) {
+		tcr = TCR_EL2_RSVD | (ips << 16);
+	} else {
+		tcr = TCR_EL3_RSVD | (ips << 16);
+	}
+
+	/* PTWs cacheable, inner/outer WBWA and inner shareable */
+	tcr |= TCR_TG0_64K | TCR_SHARED_INNER | TCR_ORGN_WBWA | TCR_IRGN_WBWA;
+	tcr |= TCR_T0SZ(VA_BITS);
+
+	if (pips)
+		*pips = ips;
+	if (pva_bits)
+		*pva_bits = va_bits;
+
+	return tcr;
+}
+
 static void setup_pgtables(void)
 {
 	int l1_e, l2_e;
@@ -110,6 +162,10 @@ __weak void mmu_setup(void)
 	/* Set up page tables only on BSP */
 	if (coreid == BSP_COREID)
 		setup_pgtables();
+
+	el = current_el();
+	set_ttbr_tcr_mair(el, gd->arch.tlb_addr, get_tcr(el, NULL, NULL),
+			  MEMORY_ATTRIBUTES);
 #else
 	/* Setup an identity-mapping for all spaces */
 	for (i = 0; i < (PGTABLE_SIZE >> 3); i++) {
@@ -128,7 +184,6 @@ __weak void mmu_setup(void)
 		}
 	}
 
-#endif
 	/* load TTBR0 */
 	el = current_el();
 	if (el == 1) {
@@ -144,6 +199,8 @@ __weak void mmu_setup(void)
 				  TCR_EL3_RSVD | TCR_FLAGS | TCR_EL3_IPS_BITS,
 				  MEMORY_ATTRIBUTES);
 	}
+#endif
+
 	/* enable the mmu */
 	set_sctlr(get_sctlr() | CR_M);
 }
diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h
index 897f010..39ff745 100644
--- a/arch/arm/include/asm/armv8/mmu.h
+++ b/arch/arm/include/asm/armv8/mmu.h
@@ -159,11 +159,6 @@
 #define TCR_EL1_IPS_BITS	(UL(3) << 32)	/* 42 bits physical address */
 #define TCR_EL2_IPS_BITS	(3 << 16)	/* 42 bits physical address */
 #define TCR_EL3_IPS_BITS	(3 << 16)	/* 42 bits physical address */
-#else
-#define TCR_EL1_IPS_BITS	CONFIG_SYS_TCR_EL1_IPS_BITS
-#define TCR_EL2_IPS_BITS	CONFIG_SYS_TCR_EL2_IPS_BITS
-#define TCR_EL3_IPS_BITS	CONFIG_SYS_TCR_EL3_IPS_BITS
-#endif
 
 /* PTWs cacheable, inner/outer WBWA and inner shareable */
 #define TCR_FLAGS		(TCR_TG0_64K |		\
@@ -171,6 +166,7 @@
 				TCR_ORGN_WBWA |		\
 				TCR_IRGN_WBWA |		\
 				TCR_T0SZ(VA_BITS))
+#endif
 
 #define TCR_EL1_RSVD		(1 << 31)
 #define TCR_EL2_RSVD		(1 << 31 | 1 << 23)
diff --git a/include/configs/thunderx_88xx.h b/include/configs/thunderx_88xx.h
index cece4dd..b9f93ad 100644
--- a/include/configs/thunderx_88xx.h
+++ b/include/configs/thunderx_88xx.h
@@ -50,9 +50,6 @@
 #define CONFIG_SYS_PGTABLE_SIZE		\
 	((CONFIG_SYS_PTL1_ENTRIES + \
 	  CONFIG_SYS_MEM_MAP_SIZE * CONFIG_SYS_PTL2_ENTRIES) * 8)
-#define CONFIG_SYS_TCR_EL1_IPS_BITS	(5UL << 32)
-#define CONFIG_SYS_TCR_EL2_IPS_BITS	(5 << 16)
-#define CONFIG_SYS_TCR_EL3_IPS_BITS	(5 << 16)
 
 /* Link Definitions */
 #define CONFIG_SYS_TEXT_BASE		0x00500000
-- 
1.8.5.6


* [U-Boot] [PATCH v4 02/11] arm64: Disable TTBR1 maps in EL1
  2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 01/11] thunderx: Calculate TCR dynamically Alexander Graf
@ 2016-02-26  0:49 ` Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 03/11] arm64: Make full va map code more dynamic Alexander Graf
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  0:49 UTC (permalink / raw)
  To: u-boot

When running in EL1, AArch64 knows two page table maps. One with addresses
that start with all zeros (TTBR0) and one with addresses that start with all
ones (TTBR1).

In U-Boot we don't care about the high up maps, so just disable them to ensure
we don't walk an invalid page table by accident.
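
To make the TTBR0/TTBR1 split a bit more concrete, here is a simplified
model of which table an EL1 address would be looked up in. It is a sketch
only: classify_va() and enum va_region are invented for this example, both
halves are assumed to use the same VA width, and features such as
top-byte-ignore are left out. The EPD1 bit position matches the
TCR_EPD1_DISABLE define added by this patch.

```c
#include <assert.h>
#include <stdint.h>

#define EPD1_BIT	(1ULL << 23)	/* same bit as TCR_EPD1_DISABLE */

/* Hypothetical classification of a virtual address at EL1 */
enum va_region {
	VA_TTBR0,	/* upper bits all zeros: walked via TTBR0 */
	VA_TTBR1,	/* upper bits all ones: walked via TTBR1 */
	VA_FAULT,	/* no table walk possible for this address */
};

/*
 * Simplified model: addresses whose bits [63:va_bits] are all zero go
 * through TTBR0, all ones go through TTBR1 unless walks via TTBR1 are
 * disabled with EPD1, anything else faults.
 */
static enum va_region classify_va(uint64_t va, unsigned int va_bits,
				  uint64_t tcr)
{
	uint64_t top;

	assert(va_bits >= 16 && va_bits <= 48);

	top = va >> va_bits;

	if (top == 0)
		return VA_TTBR0;
	if (top == (~0ULL >> va_bits))
		return (tcr & EPD1_BIT) ? VA_FAULT : VA_TTBR1;

	return VA_FAULT;
}
```

With EPD1 set, an access to the upper half can no longer trigger a walk
through whatever happens to be in TTBR1, which is the accident this patch
guards against.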

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/arm/cpu/armv8/cache_v8.c    | 2 +-
 arch/arm/include/asm/armv8/mmu.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c
index 9229532..d92f2d1 100644
--- a/arch/arm/cpu/armv8/cache_v8.c
+++ b/arch/arm/cpu/armv8/cache_v8.c
@@ -71,7 +71,7 @@ static u64 get_tcr(int el, u64 *pips, u64 *pva_bits)
 	}
 
 	if (el == 1) {
-		tcr = TCR_EL1_RSVD | (ips << 32);
+		tcr = TCR_EL1_RSVD | (ips << 32) | TCR_EPD1_DISABLE;
 	} else if (el == 2) {
 		tcr = TCR_EL2_RSVD | (ips << 16);
 	} else {
diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h
index 39ff745..0080ae6 100644
--- a/arch/arm/include/asm/armv8/mmu.h
+++ b/arch/arm/include/asm/armv8/mmu.h
@@ -154,6 +154,7 @@
 #define TCR_TG0_4K		(0 << 14)
 #define TCR_TG0_64K		(1 << 14)
 #define TCR_TG0_16K		(2 << 14)
+#define TCR_EPD1_DISABLE	(1 << 23)
 
 #ifndef CONFIG_SYS_FULL_VA
 #define TCR_EL1_IPS_BITS	(UL(3) << 32)	/* 42 bits physical address */
-- 
1.8.5.6


* [U-Boot] [PATCH v4 03/11] arm64: Make full va map code more dynamic
  2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 01/11] thunderx: Calculate TCR dynamically Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 02/11] arm64: Disable TTBR1 maps in EL1 Alexander Graf
@ 2016-02-26  0:49 ` Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 04/11] thunderx: Move mmu table into board file Alexander Graf
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  0:49 UTC (permalink / raw)
  To: u-boot

The idea to generate our page tables from an array of memory ranges
is very sound. However, instead of hard coding the code to create up
to 2 levels of 64k granule page tables, we really should just create
normal 4k page tables that allow us to set caching attributes on 2M
or 4k level later on.
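
For orientation, this is how an address decomposes into per-level table
indices with a 4k granule: 12 offset bits, then 9 bits per level. The
snippet is a standalone sketch; level_shift() mirrors the arithmetic of
level2shift() in the patch, while table_index() is a name invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Lowest address bit translated by a given level (0..3), 4k granule */
static unsigned int level_shift(int level)
{
	assert(level >= 0 && level <= 3);

	/* 12 bits of page offset, 9 bits per level below this one */
	return 12 + 9 * (3 - level);
}

/* Hypothetical helper: index into the table at `level` for address va */
static unsigned int table_index(uint64_t va, int level)
{
	return (va >> level_shift(level)) & 0x1ff;
}
```

The shifts come out as 39, 30, 21 and 12 for levels 0 to 3, so a single
level 1 entry spans 1G, a level 2 entry 2M and a level 3 entry 4k.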

So this patch moves the full_va mapping code to 4k page size and
makes it fully flexible to dynamically create as many levels as
necessary for a map (including dynamic 1G/2M pages). It also adds
support to dynamically split a large map into smaller ones when
some code wants to set dcache attributes.
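
Splitting a large map can be pictured as replacing one block entry with
512 child entries that together cover the same range and inherit its
attributes. The following is a reduced sketch of that step with
hypothetical names (split_entry() and the local type constants); it fills
a caller-provided array instead of allocating a table and does not touch
any live page table:

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES		512
#define TYPE_MASK	3ULL
#define TYPE_BLOCK	1ULL
#define TYPE_TABLE	3ULL	/* also the type of a level 3 page entry */

/*
 * Hypothetical helper: expand the block entry `parent` found at
 * parent_level (1 or 2) into ENTRIES child entries, one level down.
 */
static void split_entry(uint64_t parent, int parent_level,
			uint64_t child[ENTRIES])
{
	int child_level = parent_level + 1;
	unsigned int shift = 12 + 9 * (3 - child_level);
	uint64_t i;

	assert(parent_level >= 1 && parent_level <= 2);
	assert((parent & TYPE_MASK) == TYPE_BLOCK);

	for (i = 0; i < ENTRIES; i++) {
		/* Same attributes, output address advanced per child */
		child[i] = parent | (i << shift);

		/* Level 3 entries use the table type encoding */
		if (child_level == 3)
			child[i] |= TYPE_TABLE;
	}
}
```

Splitting a 1G block yields 512 entries of 2M each, and splitting a 2M
block yields 512 pages of 4k each, so one split costs exactly one extra
page table.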

With all this in place, there is very little reason to create your
own page tables in board specific files.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v1 -> v2:

  - Fix comment for create_table()
  - Rework page table size calculation
  - s/DPRINTF/debug/g
  - Improve panic messages

v2 -> v3:

  - Move PGTABLE_SIZE out of the ASSEMBLY section
  - Run pte splitting code in separate page table
  - Replace is_level in count_required_pts with more obvious pte_type

v3 -> v4:

  - Lv3 PTEs are always of type table
---
 arch/arm/cpu/armv8/cache.S         |  54 +++++
 arch/arm/cpu/armv8/cache_v8.c      | 453 ++++++++++++++++++++++++++++++++-----
 arch/arm/include/asm/armv8/mmu.h   |  68 +++---
 arch/arm/include/asm/global_data.h |   5 +-
 arch/arm/include/asm/system.h      |  14 +-
 include/configs/thunderx_88xx.h    |  14 +-
 6 files changed, 500 insertions(+), 108 deletions(-)

diff --git a/arch/arm/cpu/armv8/cache.S b/arch/arm/cpu/armv8/cache.S
index ab8c089..a9f4fec 100644
--- a/arch/arm/cpu/armv8/cache.S
+++ b/arch/arm/cpu/armv8/cache.S
@@ -10,6 +10,7 @@
 #include <asm-offsets.h>
 #include <config.h>
 #include <asm/macro.h>
+#include <asm/system.h>
 #include <linux/linkage.h>
 
 /*
@@ -160,3 +161,56 @@ ENTRY(__asm_flush_l3_cache)
 	ret
 ENDPROC(__asm_flush_l3_cache)
 	.weak	__asm_flush_l3_cache
+
+/*
+ * void __asm_switch_ttbr(ulong new_ttbr)
+ *
+ * Safely switches to a new page table.
+ */
+ENTRY(__asm_switch_ttbr)
+	/* x2 = SCTLR (alive throughout the function) */
+	switch_el x4, 3f, 2f, 1f
+3:	mrs	x2, sctlr_el3
+	b	0f
+2:	mrs	x2, sctlr_el2
+	b	0f
+1:	mrs	x2, sctlr_el1
+0:
+
+	/* Unset CR_M | CR_C | CR_I from SCTLR to disable all caches */
+	movn	x1, #(CR_M | CR_C | CR_I)
+	and	x1, x2, x1
+	switch_el x4, 3f, 2f, 1f
+3:	msr	sctlr_el3, x1
+	b	0f
+2:	msr	sctlr_el2, x1
+	b	0f
+1:	msr	sctlr_el1, x1
+0:	isb
+
+	/* This call only clobbers x30 (lr) and x9 (unused) */
+	mov	x3, x30
+	bl	__asm_invalidate_tlb_all
+
+	/* From here on we're running safely with caches disabled */
+
+	/* Set TTBR to our first argument */
+	switch_el x4, 3f, 2f, 1f
+3:	msr	ttbr0_el3, x0
+	b	0f
+2:	msr	ttbr0_el2, x0
+	b	0f
+1:	msr	ttbr0_el1, x0
+0:	isb
+
+	/* Restore original SCTLR and thus enable caches again */
+	switch_el x4, 3f, 2f, 1f
+3:	msr	sctlr_el3, x2
+	b	0f
+2:	msr	sctlr_el2, x2
+	b	0f
+1:	msr	sctlr_el1, x2
+0:	isb
+
+	ret	x3
+ENDPROC(__asm_switch_ttbr)
diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c
index d92f2d1..73628c9 100644
--- a/arch/arm/cpu/armv8/cache_v8.c
+++ b/arch/arm/cpu/armv8/cache_v8.c
@@ -2,6 +2,9 @@
  * (C) Copyright 2013
  * David Feng <fenghua@phytium.com.cn>
  *
+ * (C) Copyright 2016
+ * Alexander Graf <agraf@suse.de>
+ *
  * SPDX-License-Identifier:	GPL-2.0+
  */
 
@@ -13,31 +16,28 @@ DECLARE_GLOBAL_DATA_PTR;
 
 #ifndef CONFIG_SYS_DCACHE_OFF
 
-#ifdef CONFIG_SYS_FULL_VA
-static void set_ptl1_entry(u64 index, u64 ptl2_entry)
-{
-	u64 *pgd = (u64 *)gd->arch.tlb_addr;
-	u64 value;
-
-	value = ptl2_entry | PTL1_TYPE_TABLE;
-	pgd[index] = value;
-}
-
-static void set_ptl2_block(u64 ptl1, u64 bfn, u64 address, u64 memory_attrs)
-{
-	u64 *pmd = (u64 *)ptl1;
-	u64 value;
-
-	value = address | PTL2_TYPE_BLOCK | PTL2_BLOCK_AF;
-	value |= memory_attrs;
-	pmd[bfn] = value;
-}
+/*
+ *  With 4k page granule, a virtual address is split into 4 lookup parts
+ *  spanning 9 bits each:
+ *
+ *    _______________________________________________
+ *   |       |       |       |       |       |       |
+ *   |   0   |  Lv0  |  Lv1  |  Lv2  |  Lv3  |  off  |
+ *   |_______|_______|_______|_______|_______|_______|
+ *     63-48   47-39   38-30   29-21   20-12   11-00
+ *
+ *             mask        page size
+ *
+ *    Lv0: FF8000000000       --
+ *    Lv1:   7FC0000000       1G
+ *    Lv2:     3FE00000       2M
+ *    Lv3:       1FF000       4K
+ *    off:          FFF
+ */
 
+#ifdef CONFIG_SYS_FULL_VA
 static struct mm_region mem_map[] = CONFIG_SYS_MEM_MAP;
 
-#define PTL1_ENTRIES CONFIG_SYS_PTL1_ENTRIES
-#define PTL2_ENTRIES CONFIG_SYS_PTL2_ENTRIES
-
 static u64 get_tcr(int el, u64 *pips, u64 *pva_bits)
 {
 	u64 max_addr = 0;
@@ -79,8 +79,8 @@ static u64 get_tcr(int el, u64 *pips, u64 *pva_bits)
 	}
 
 	/* PTWs cacheable, inner/outer WBWA and inner shareable */
-	tcr |= TCR_TG0_64K | TCR_SHARED_INNER | TCR_ORGN_WBWA | TCR_IRGN_WBWA;
-	tcr |= TCR_T0SZ(VA_BITS);
+	tcr |= TCR_TG0_4K | TCR_SHARED_INNER | TCR_ORGN_WBWA | TCR_IRGN_WBWA;
+	tcr |= TCR_T0SZ(va_bits);
 
 	if (pips)
 		*pips = ips;
@@ -90,39 +90,302 @@ static u64 get_tcr(int el, u64 *pips, u64 *pva_bits)
 	return tcr;
 }
 
-static void setup_pgtables(void)
+#define MAX_PTE_ENTRIES 512
+
+static int pte_type(u64 *pte)
+{
+	return *pte & PTE_TYPE_MASK;
+}
+
+/* Returns the LSB number for a PTE on level <level> */
+static int level2shift(int level)
+{
+	/* Page is 12 bits wide, every level translates 9 bits */
+	return (12 + 9 * (3 - level));
+}
+
+static u64 *find_pte(u64 addr, int level)
 {
-	int l1_e, l2_e;
-	unsigned long pmd = 0;
-	unsigned long address;
-
-	/* Setup the PMD pointers */
-	for (l1_e = 0; l1_e < CONFIG_SYS_MEM_MAP_SIZE; l1_e++) {
-		gd->arch.pmd_addr[l1_e] = gd->arch.tlb_addr +
-						PTL1_ENTRIES * sizeof(u64);
-		gd->arch.pmd_addr[l1_e] += PTL2_ENTRIES * sizeof(u64) * l1_e;
-		gd->arch.pmd_addr[l1_e] = ALIGN(gd->arch.pmd_addr[l1_e],
-						0x10000UL);
+	int start_level = 0;
+	u64 *pte;
+	u64 idx;
+	u64 va_bits;
+	int i;
+
+	debug("addr=%llx level=%d\n", addr, level);
+
+	get_tcr(0, NULL, &va_bits);
+	if (va_bits < 39)
+		start_level = 1;
+
+	if (level < start_level)
+		return NULL;
+
+	/* Walk through all page table levels to find our PTE */
+	pte = (u64*)gd->arch.tlb_addr;
+	for (i = start_level; i < 4; i++) {
+		idx = (addr >> level2shift(i)) & 0x1FF;
+		pte += idx;
+		debug("idx=%llx PTE %p at level %d: %llx\n", idx, pte, i, *pte);
+
+		/* Found it */
+		if (i == level)
+			return pte;
+		/* PTE is no table (either invalid or block), can't traverse */
+		if (pte_type(pte) != PTE_TYPE_TABLE)
+			return NULL;
+		/* Off to the next level */
+		pte = (u64*)(*pte & 0x0000fffffffff000ULL);
 	}
 
-	/* Setup the page tables */
-	for (l1_e = 0; l1_e < PTL1_ENTRIES; l1_e++) {
-		if (mem_map[pmd].base ==
-			(uintptr_t)l1_e << PTL2_BITS) {
-			set_ptl1_entry(l1_e, gd->arch.pmd_addr[pmd]);
-
-			for (l2_e = 0; l2_e < PTL2_ENTRIES; l2_e++) {
-				address = mem_map[pmd].base
-					+ (uintptr_t)l2_e * BLOCK_SIZE;
-				set_ptl2_block(gd->arch.pmd_addr[pmd], l2_e,
-					       address, mem_map[pmd].attrs);
+	/* Should never reach here */
+	return NULL;
+}
+
+/* Returns and creates a new full table (512 entries) */
+static u64 *create_table(void)
+{
+	u64 *new_table = (u64*)gd->arch.tlb_fillptr;
+	u64 pt_len = MAX_PTE_ENTRIES * sizeof(u64);
+
+	/* Allocate MAX_PTE_ENTRIES pte entries */
+	gd->arch.tlb_fillptr += pt_len;
+
+	if (gd->arch.tlb_fillptr - gd->arch.tlb_addr > gd->arch.tlb_size)
+		panic("Insufficient RAM for page table: 0x%lx > 0x%lx. "
+		      "Please increase the size in get_page_table_size()",
+			gd->arch.tlb_fillptr - gd->arch.tlb_addr,
+			gd->arch.tlb_size);
+
+	/* Mark all entries as invalid */
+	memset(new_table, 0, pt_len);
+
+	return new_table;
+}
+
+static void set_pte_table(u64 *pte, u64 *table)
+{
+	/* Point *pte to the new table */
+	debug("Setting %p to addr=%p\n", pte, table);
+	*pte = PTE_TYPE_TABLE | (ulong)table;
+}
+
+/* Add one mm_region map entry to the page tables */
+static void add_map(struct mm_region *map)
+{
+	u64 *pte;
+	u64 addr = map->base;
+	u64 size = map->size;
+	u64 attrs = map->attrs | PTE_TYPE_BLOCK | PTE_BLOCK_AF;
+	u64 blocksize;
+	int level;
+	u64 *new_table;
+
+	while (size) {
+		pte = find_pte(addr, 0);
+		if (pte && (pte_type(pte) == PTE_TYPE_FAULT)) {
+			debug("Creating table for addr 0x%llx\n", addr);
+			new_table = create_table();
+			set_pte_table(pte, new_table);
+		}
+
+		for (level = 1; level < 4; level++) {
+			pte = find_pte(addr, level);
+			blocksize = 1ULL << level2shift(level);
+			debug("Checking if pte fits for addr=%llx size=%llx "
+			      "blocksize=%llx\n", addr, size, blocksize);
+			if (size >= blocksize && !(addr & (blocksize - 1))) {
+				/* Page fits, create block PTE */
+				debug("Setting PTE %p to block addr=%llx\n",
+				      pte, addr);
+				*pte = addr | attrs;
+				addr += blocksize;
+				size -= blocksize;
+				break;
+			} else if ((pte_type(pte) == PTE_TYPE_FAULT)) {
+				/* Page doesn't fit, create subpages */
+				debug("Creating subtable for addr 0x%llx "
+				      "blksize=%llx\n", addr, blocksize);
+				new_table = create_table();
+				set_pte_table(pte, new_table);
 			}
+		}
+	}
+}
+
+/* Splits a block PTE into table with subpages spanning the old block */
+static void split_block(u64 *pte, int level)
+{
+	u64 old_pte = *pte;
+	u64 *new_table;
+	u64 i = 0;
+	/* level describes the parent level, we need the child ones */
+	int levelshift = level2shift(level + 1);
+
+	if (pte_type(pte) != PTE_TYPE_BLOCK)
+		panic("PTE %p (%llx) is not a block. Some driver code wants to "
+		      "modify dcache settings for a range not covered in "
+		      "mem_map.", pte, old_pte);
+
+	new_table = create_table();
+	debug("Splitting pte %p (%llx) into %p\n", pte, old_pte, new_table);
+
+	for (i = 0; i < MAX_PTE_ENTRIES; i++) {
+		new_table[i] = old_pte | (i << levelshift);
+
+		/* Level 3 block PTEs have the table type */
+		if ((level + 1) == 3)
+			new_table[i] |= PTE_TYPE_TABLE;
+
+		debug("Setting new_table[%lld] = %llx\n", i, new_table[i]);
+	}
+
+	/* Set the new table into effect */
+	set_pte_table(pte, new_table);
+}
+
+enum pte_type {
+	PTE_INVAL,
+	PTE_BLOCK,
+	PTE_LEVEL,
+};
 
-			pmd++;
-		} else {
-			set_ptl1_entry(l1_e, 0);
+/*
+ * This is a recursively called function to count the number of
+ * page tables we need to cover a particular PTE range. If you
+ * call this with level = -1 you basically get the full 48 bit
+ * coverage.
+ */
+static int count_required_pts(u64 addr, int level, u64 maxaddr)
+{
+	int levelshift = level2shift(level);
+	u64 levelsize = 1ULL << levelshift;
+	u64 levelmask = levelsize - 1;
+	u64 levelend = addr + levelsize;
+	int r = 0;
+	int i;
+	enum pte_type pte_type = PTE_INVAL;
+
+	for (i = 0; i < ARRAY_SIZE(mem_map); i++) {
+		struct mm_region *map = &mem_map[i];
+		u64 start = map->base;
+		u64 end = start + map->size;
+
+		/* Check if the PTE would overlap with the map */
+		if (max(addr, start) <= min(levelend, end)) {
+			start = max(addr, start);
+			end = min(levelend, end);
+
+			/* We need a sub-pt for this level */
+			if ((start & levelmask) || (end & levelmask)) {
+				pte_type = PTE_LEVEL;
+				break;
+			}
+
+			/* Lv0 can not do block PTEs, so do levels here too */
+			if (level <= 0) {
+				pte_type = PTE_LEVEL;
+				break;
+			}
+
+			/* PTE is active, but fits into a block */
+			pte_type = PTE_BLOCK;
 		}
 	}
+
+	/*
+	 * Block PTEs at this level are already covered by the parent page
+	 * table, so we only need to count sub page tables.
+	 */
+	if (pte_type == PTE_LEVEL) {
+		int sublevel = level + 1;
+		u64 sublevelsize = 1ULL << level2shift(sublevel);
+
+		/* Account for the new sub page table ... */
+		r = 1;
+
+		/* ... and for all child page tables that one might have */
+		for (i = 0; i < MAX_PTE_ENTRIES; i++) {
+			r += count_required_pts(addr, sublevel, maxaddr);
+			addr += sublevelsize;
+
+			if (addr >= maxaddr) {
+				/*
+				 * We reached the end of address space, no need
+				 * to look any further.
+				 */
+				break;
+			}
+		}
+	}
+
+	return r;
+}
+
+/* Returns the estimated required size of all page tables */
+u64 get_page_table_size(void)
+{
+	u64 one_pt = MAX_PTE_ENTRIES * sizeof(u64);
+	u64 size = 0;
+	u64 va_bits;
+	int start_level = 0;
+
+	get_tcr(0, NULL, &va_bits);
+	if (va_bits < 39)
+		start_level = 1;
+
+	/* Account for all page tables we would need to cover our memory map */
+	size = one_pt * count_required_pts(0, start_level - 1, 1ULL << va_bits);
+
+	/*
+	 * We need to duplicate our page table once to have an emergency pt to
+	 * resort to when splitting page tables later on
+	 */
+	size *= 2;
+
+	/*
+	 * We may need to split page tables later on if dcache settings change,
+	 * so reserve up to 4 (random pick) page tables for that.
+	 */
+	size += one_pt * 4;
+
+	return size;
+}
+
+static void setup_pgtables(void)
+{
+	int i;
+
+	/*
+	 * Allocate the first level we're on with invalid entries.
+	 * If the starting level is 0 (va_bits >= 39), then this is our
+	 * Lv0 page table, otherwise it's the entry Lv1 page table.
+	 */
+	create_table();
+
+	/* Now add all MMU table entries one after another to the table */
+	for (i = 0; i < ARRAY_SIZE(mem_map); i++)
+		add_map(&mem_map[i]);
+
+	/* Create the same thing once more for our emergency page table */
+	create_table();
+}
+
+static void setup_all_pgtables(void)
+{
+	u64 tlb_addr = gd->arch.tlb_addr;
+
+	/* Reset the fill ptr */
+	gd->arch.tlb_fillptr = tlb_addr;
+
+	/* Create normal system page tables */
+	setup_pgtables();
+
+	/* Create emergency page tables */
+	gd->arch.tlb_addr = gd->arch.tlb_fillptr;
+	setup_pgtables();
+	gd->arch.tlb_emerg = gd->arch.tlb_addr;
+	gd->arch.tlb_addr = tlb_addr;
 }
 
 #else
@@ -157,11 +420,9 @@ __weak void mmu_setup(void)
 	int el;
 
 #ifdef CONFIG_SYS_FULL_VA
-	unsigned long coreid = read_mpidr() & CONFIG_COREID_MASK;
-
-	/* Set up page tables only on BSP */
-	if (coreid == BSP_COREID)
-		setup_pgtables();
+	/* Set up page tables only once */
+	if (!gd->arch.tlb_fillptr)
+		setup_all_pgtables();
 
 	el = current_el();
 	set_ttbr_tcr_mair(el, gd->arch.tlb_addr, get_tcr(el, NULL, NULL),
@@ -311,6 +572,88 @@ void mmu_set_region_dcache_behaviour(phys_addr_t start, size_t size,
 	flush_dcache_range(start, end);
 	asm volatile("dsb sy");
 }
+#else
+static bool is_aligned(u64 addr, u64 size, u64 align)
+{
+	return !(addr & (align - 1)) && !(size & (align - 1));
+}
+
+static u64 set_one_region(u64 start, u64 size, u64 attrs, int level)
+{
+	int levelshift = level2shift(level);
+	u64 levelsize = 1ULL << levelshift;
+	u64 *pte = find_pte(start, level);
+
+	/* Can we just modify the current level block PTE? */
+	if (is_aligned(start, size, levelsize)) {
+		*pte &= ~PMD_ATTRINDX_MASK;
+		*pte |= attrs;
+		debug("Set attrs=%llx pte=%p level=%d\n", attrs, pte, level);
+
+		return levelsize;
+	}
+
+	/* Unaligned or doesn't fit, maybe split block into table */
+	debug("addr=%llx level=%d pte=%p (%llx)\n", start, level, pte, *pte);
+
+	/* Maybe we need to split the block into a table */
+	if (pte_type(pte) == PTE_TYPE_BLOCK)
+		split_block(pte, level);
+
+	/* And then double-check it became a table or already is one */
+	if (pte_type(pte) != PTE_TYPE_TABLE)
+		panic("PTE %p (%llx) for addr=%llx should be a table",
+		      pte, *pte, start);
+
+	/* Roll on to the next page table level */
+	return 0;
+}
+
+void mmu_set_region_dcache_behaviour(phys_addr_t start, size_t size,
+				     enum dcache_option option)
+{
+	u64 attrs = PMD_ATTRINDX(option);
+	u64 real_start = start;
+	u64 real_size = size;
+
+	debug("start=%lx size=%lx\n", (ulong)start, (ulong)size);
+
+	/*
+	 * We can not modify page tables that we're currently running on,
+	 * so we first need to switch to the "emergency" page tables where
+	 * we can safely modify our primary page tables and then switch back
+	 */
+	__asm_switch_ttbr(gd->arch.tlb_emerg);
+
+	/*
+	 * Loop through the address range until we find a page granule that fits
+	 * our alignment constraints, then set it to the new cache attributes
+	 */
+	while (size > 0) {
+		int level;
+		u64 r;
+
+		for (level = 1; level < 4; level++) {
+			r = set_one_region(start, size, attrs, level);
+			if (r) {
+				/* PTE successfully replaced */
+				size -= r;
+				start += r;
+				break;
+			}
+		}
+
+	}
+
+	/* We're done modifying page tables, switch back to our primary ones */
+	__asm_switch_ttbr(gd->arch.tlb_addr);
+
+	/*
+	 * Make sure there's nothing stale in dcache for a region that might
+	 * have caches off now
+	 */
+	flush_dcache_range(real_start, real_start + real_size);
+}
 #endif
 
 #else	/* CONFIG_SYS_DCACHE_OFF */
diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h
index 0080ae6..1c490dc 100644
--- a/arch/arm/include/asm/armv8/mmu.h
+++ b/arch/arm/include/asm/armv8/mmu.h
@@ -26,15 +26,9 @@
 #define VA_BITS			(42)	/* 42 bits virtual address */
 #else
 #define VA_BITS			CONFIG_SYS_VA_BITS
-#define PTL2_BITS		CONFIG_SYS_PTL2_BITS
+#define PTE_BLOCK_BITS		CONFIG_SYS_PTL2_BITS
 #endif
 
-/* PAGE_SHIFT determines the page size */
-#undef  PAGE_SIZE
-#define PAGE_SHIFT		16
-#define PAGE_SIZE		(1 << PAGE_SHIFT)
-#define PAGE_MASK		(~(PAGE_SIZE-1))
-
 /*
  * block/section address mask and size definitions.
  */
@@ -42,10 +36,21 @@
 #define SECTION_SHIFT		29
 #define SECTION_SIZE		(UL(1) << SECTION_SHIFT)
 #define SECTION_MASK		(~(SECTION_SIZE-1))
+
+/* PAGE_SHIFT determines the page size */
+#undef  PAGE_SIZE
+#define PAGE_SHIFT		16
+#define PAGE_SIZE		(1 << PAGE_SHIFT)
+#define PAGE_MASK		(~(PAGE_SIZE-1))
+
 #else
-#define BLOCK_SHIFT		CONFIG_SYS_BLOCK_SHIFT
-#define BLOCK_SIZE		(UL(1) << BLOCK_SHIFT)
-#define BLOCK_MASK		(~(BLOCK_SIZE-1))
+
+/* PAGE_SHIFT determines the page size */
+#undef  PAGE_SIZE
+#define PAGE_SHIFT		12
+#define PAGE_SIZE		(1 << PAGE_SHIFT)
+#define PAGE_MASK		(~(PAGE_SIZE-1))
+
 #endif
 
 /***************************************************************/
@@ -71,39 +76,28 @@
  */
 
 #ifdef CONFIG_SYS_FULL_VA
-/*
- * Level 1 descriptor (PGD).
- */
 
-#define PTL1_TYPE_MASK		(3 << 0)
-#define PTL1_TYPE_TABLE		(3 << 0)
-
-#define PTL1_TABLE_PXN		(1UL << 59)
-#define PTL1_TABLE_XN		(1UL << 60)
-#define PTL1_TABLE_AP		(1UL << 61)
-#define PTL1_TABLE_NS		(1UL << 63)
-
-
-/*
- * Level 2 descriptor (PMD).
- */
+#define PTE_TYPE_MASK		(3 << 0)
+#define PTE_TYPE_FAULT		(0 << 0)
+#define PTE_TYPE_TABLE		(3 << 0)
+#define PTE_TYPE_BLOCK		(1 << 0)
 
-#define PTL2_TYPE_MASK		(3 << 0)
-#define PTL2_TYPE_FAULT		(0 << 0)
-#define PTL2_TYPE_TABLE		(3 << 0)
-#define PTL2_TYPE_BLOCK		(1 << 0)
+#define PTE_TABLE_PXN		(1UL << 59)
+#define PTE_TABLE_XN		(1UL << 60)
+#define PTE_TABLE_AP		(1UL << 61)
+#define PTE_TABLE_NS		(1UL << 63)
 
 /*
  * Block
  */
-#define PTL2_MEMTYPE(x)		((x) << 2)
-#define PTL2_BLOCK_NON_SHARE	(0 << 8)
-#define PTL2_BLOCK_OUTER_SHARE	(2 << 8)
-#define PTL2_BLOCK_INNER_SHARE	(3 << 8)
-#define PTL2_BLOCK_AF		(1 << 10)
-#define PTL2_BLOCK_NG		(1 << 11)
-#define PTL2_BLOCK_PXN		(UL(1) << 53)
-#define PTL2_BLOCK_UXN		(UL(1) << 54)
+#define PTE_BLOCK_MEMTYPE(x)	((x) << 2)
+#define PTE_BLOCK_NON_SHARE	(0 << 8)
+#define PTE_BLOCK_OUTER_SHARE	(2 << 8)
+#define PTE_BLOCK_INNER_SHARE	(3 << 8)
+#define PTE_BLOCK_AF		(1 << 10)
+#define PTE_BLOCK_NG		(1 << 11)
+#define PTE_BLOCK_PXN		(UL(1) << 53)
+#define PTE_BLOCK_UXN		(UL(1) << 54)
 
 #else
 /*
diff --git a/arch/arm/include/asm/global_data.h b/arch/arm/include/asm/global_data.h
index dcfa098..259daa1 100644
--- a/arch/arm/include/asm/global_data.h
+++ b/arch/arm/include/asm/global_data.h
@@ -38,10 +38,11 @@ struct arch_global_data {
 	unsigned long long timer_reset_value;
 #if !(defined(CONFIG_SYS_ICACHE_OFF) && defined(CONFIG_SYS_DCACHE_OFF))
 	unsigned long tlb_addr;
+	unsigned long tlb_size;
 #if defined(CONFIG_SYS_FULL_VA)
-	unsigned long pmd_addr[CONFIG_SYS_PTL1_ENTRIES];
+	unsigned long tlb_fillptr;
+	unsigned long tlb_emerg;
 #endif
-	unsigned long tlb_size;
 #endif
 
 #ifdef CONFIG_OMAP_COMMON
diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
index 026e7ef..9b1cbf2 100644
--- a/arch/arm/include/asm/system.h
+++ b/arch/arm/include/asm/system.h
@@ -17,18 +17,19 @@
 #define CR_WXN		(1 << 19)	/* Write Permision Imply XN	*/
 #define CR_EE		(1 << 25)	/* Exception (Big) Endian	*/
 
-#ifndef CONFIG_SYS_FULL_VA
-#define PGTABLE_SIZE	(0x10000)
-#else
-#define PGTABLE_SIZE	CONFIG_SYS_PGTABLE_SIZE
-#endif
-
 /* 2MB granularity */
 #define MMU_SECTION_SHIFT	21
 #define MMU_SECTION_SIZE	(1 << MMU_SECTION_SHIFT)
 
 #ifndef __ASSEMBLY__
 
+#ifndef CONFIG_SYS_FULL_VA
+#define PGTABLE_SIZE	(0x10000)
+#else
+u64 get_page_table_size(void);
+#define PGTABLE_SIZE	get_page_table_size()
+#endif
+
 enum dcache_option {
 	DCACHE_OFF = 0x3,
 };
@@ -97,6 +98,7 @@ void __asm_flush_dcache_range(u64 start, u64 end);
 void __asm_invalidate_tlb_all(void);
 void __asm_invalidate_icache_all(void);
 int __asm_flush_l3_cache(void);
+void __asm_switch_ttbr(u64 new_ttbr);
 
 void armv8_switch_to_el2(void);
 void armv8_switch_to_el1(void);
diff --git a/include/configs/thunderx_88xx.h b/include/configs/thunderx_88xx.h
index b9f93ad..20b25f7 100644
--- a/include/configs/thunderx_88xx.h
+++ b/include/configs/thunderx_88xx.h
@@ -22,21 +22,19 @@
 
 #define MEM_BASE			0x00500000
 
-#define CONFIG_COREID_MASK             0xffffff
-
 #define CONFIG_SYS_FULL_VA
 
 #define CONFIG_SYS_LOWMEM_BASE		MEM_BASE
 
 #define CONFIG_SYS_MEM_MAP		{{0x000000000000UL, 0x40000000000UL, \
-					  PTL2_MEMTYPE(MT_NORMAL) |	     \
-					  PTL2_BLOCK_NON_SHARE},	     \
+					  PTE_BLOCK_MEMTYPE(MT_NORMAL) |     \
+					  PTE_BLOCK_NON_SHARE},	     \
 					 {0x800000000000UL, 0x40000000000UL, \
-					  PTL2_MEMTYPE(MT_DEVICE_NGNRNE) |   \
-					  PTL2_BLOCK_NON_SHARE},	     \
+					  PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) | \
+					  PTE_BLOCK_NON_SHARE},	     \
 					 {0x840000000000UL, 0x40000000000UL, \
-					  PTL2_MEMTYPE(MT_DEVICE_NGNRNE) |   \
-					  PTL2_BLOCK_NON_SHARE},	     \
+					  PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) | \
+					  PTE_BLOCK_NON_SHARE},	     \
 					}
 
 #define CONFIG_SYS_MEM_MAP_SIZE		3
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [U-Boot] [PATCH v4 04/11] thunderx: Move mmu table into board file
  2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
                   ` (2 preceding siblings ...)
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 03/11] arm64: Make full va map code more dynamic Alexander Graf
@ 2016-02-26  0:49 ` Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 05/11] zynqmp: Replace home grown mmu code with generic table approach Alexander Graf
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  0:49 UTC (permalink / raw)
  To: u-boot

The MMU range table can vary depending on things we may only find
out at runtime. While the very simple ThunderX variant does not
change, other boards will, so move the definition from a static
entry in a header file to the board file.
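
For readers following along, here is a minimal sketch of how a
zero-terminated region list like the one introduced by this patch can be
walked. The struct below is a local stand-in with the same three fields as
struct mm_region; example_mem_map, count_regions() and max_mapped_addr() are
hypothetical names used only for illustration, and the attrs values are
placeholders rather than real PTE bits.

```c
#include <stdint.h>

/* Local stand-in for U-Boot's struct mm_region (same three fields). */
struct mm_region {
	uint64_t base;
	uint64_t size;
	uint64_t attrs;
};

/* Hypothetical map; attrs are placeholders, not real PTE attribute bits. */
struct mm_region example_mem_map[] = {
	{ .base = 0x0ULL,        .size = 0x80000000ULL, .attrs = 0x1 },
	{ .base = 0x80000000ULL, .size = 0x80000000ULL, .attrs = 0x2 },
	{ 0 },	/* list terminator: size == 0 and attrs == 0 */
};

/* Count entries up to the terminator, using the patch's loop condition. */
int count_regions(const struct mm_region *map)
{
	int i;

	for (i = 0; map[i].size || map[i].attrs; i++)
		;
	return i;
}

/* Find the highest end address covered by any region in the list. */
uint64_t max_mapped_addr(const struct mm_region *map)
{
	uint64_t max_addr = 0;
	int i;

	for (i = 0; map[i].size || map[i].attrs; i++) {
		uint64_t end = map[i].base + map[i].size;

		if (end > max_addr)
			max_addr = end;
	}
	return max_addr;
}
```

Because the loop stops at the first entry with both size and attrs equal to
zero, the list length no longer has to be known at compile time, which is
what allows a board to build or pick its table at runtime.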

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/arm/cpu/armv8/cache_v8.c    |  8 +++-----
 arch/arm/include/asm/armv8/mmu.h |  2 ++
 board/cavium/thunderx/thunderx.c | 24 ++++++++++++++++++++++++
 include/configs/thunderx_88xx.h  | 11 -----------
 4 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c
index 73628c9..55c6f2f 100644
--- a/arch/arm/cpu/armv8/cache_v8.c
+++ b/arch/arm/cpu/armv8/cache_v8.c
@@ -36,8 +36,6 @@ DECLARE_GLOBAL_DATA_PTR;
  */
 
 #ifdef CONFIG_SYS_FULL_VA
-static struct mm_region mem_map[] = CONFIG_SYS_MEM_MAP;
-
 static u64 get_tcr(int el, u64 *pips, u64 *pva_bits)
 {
 	u64 max_addr = 0;
@@ -46,7 +44,7 @@ static u64 get_tcr(int el, u64 *pips, u64 *pva_bits)
 	int i;
 
 	/* Find the largest address we need to support */
-	for (i = 0; i < ARRAY_SIZE(mem_map); i++)
+	for (i = 0; mem_map[i].size || mem_map[i].attrs; i++)
 		max_addr = max(max_addr, mem_map[i].base + mem_map[i].size);
 
 	/* Calculate the maximum physical (and thus virtual) address */
@@ -266,7 +264,7 @@ static int count_required_pts(u64 addr, int level, u64 maxaddr)
 	int i;
 	enum pte_type pte_type = PTE_INVAL;
 
-	for (i = 0; i < ARRAY_SIZE(mem_map); i++) {
+	for (i = 0; mem_map[i].size || mem_map[i].attrs; i++) {
 		struct mm_region *map = &mem_map[i];
 		u64 start = map->base;
 		u64 end = start + map->size;
@@ -364,7 +362,7 @@ static void setup_pgtables(void)
 	create_table();
 
 	/* Now add all MMU table entries one after another to the table */
-	for (i = 0; i < ARRAY_SIZE(mem_map); i++)
+	for (i = 0; mem_map[i].size || mem_map[i].attrs; i++)
 		add_map(&mem_map[i]);
 
 	/* Create the same thing once more for our emergency page table */
diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h
index 1c490dc..06126c8 100644
--- a/arch/arm/include/asm/armv8/mmu.h
+++ b/arch/arm/include/asm/armv8/mmu.h
@@ -203,6 +203,8 @@ struct mm_region {
 	u64 size;
 	u64 attrs;
 };
+
+extern struct mm_region *mem_map;
 #endif
 
 #endif /* _ASM_ARMV8_MMU_H_ */
diff --git a/board/cavium/thunderx/thunderx.c b/board/cavium/thunderx/thunderx.c
index b926767..9131a38 100644
--- a/board/cavium/thunderx/thunderx.c
+++ b/board/cavium/thunderx/thunderx.c
@@ -10,6 +10,7 @@
 #include <linux/compiler.h>
 
 #include <cavium/atf.h>
+#include <asm/armv8/mmu.h>
 
 #if !CONFIG_IS_ENABLED(OF_CONTROL)
 #include <dm/platdata.h>
@@ -42,6 +43,29 @@ U_BOOT_DEVICE(thunderx_serial1) = {
 
 DECLARE_GLOBAL_DATA_PTR;
 
+static struct mm_region thunderx_mem_map[] = {
+	{
+		.base = 0x000000000000UL,
+		.size = 0x40000000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) | PTE_BLOCK_NON_SHARE,
+	}, {
+		.base = 0x800000000000UL,
+		.size = 0x40000000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
+			 PTE_BLOCK_NON_SHARE,
+	}, {
+		.base = 0x840000000000UL,
+		.size = 0x40000000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
+			 PTE_BLOCK_NON_SHARE,
+	}, {
+		/* List terminator */
+		0,
+	}
+};
+
+struct mm_region *mem_map = thunderx_mem_map;
+
 int board_init(void)
 {
 	return 0;
diff --git a/include/configs/thunderx_88xx.h b/include/configs/thunderx_88xx.h
index 20b25f7..64e4616 100644
--- a/include/configs/thunderx_88xx.h
+++ b/include/configs/thunderx_88xx.h
@@ -26,17 +26,6 @@
 
 #define CONFIG_SYS_LOWMEM_BASE		MEM_BASE
 
-#define CONFIG_SYS_MEM_MAP		{{0x000000000000UL, 0x40000000000UL, \
-					  PTE_BLOCK_MEMTYPE(MT_NORMAL) |     \
-					  PTE_BLOCK_NON_SHARE},	     \
-					 {0x800000000000UL, 0x40000000000UL, \
-					  PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) | \
-					  PTE_BLOCK_NON_SHARE},	     \
-					 {0x840000000000UL, 0x40000000000UL, \
-					  PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) | \
-					  PTE_BLOCK_NON_SHARE},	     \
-					}
-
 #define CONFIG_SYS_MEM_MAP_SIZE		3
 
 #define CONFIG_SYS_VA_BITS		48
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [U-Boot] [PATCH v4 05/11] zynqmp: Replace home grown mmu code with generic table approach
  2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
                   ` (3 preceding siblings ...)
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 04/11] thunderx: Move mmu table into board file Alexander Graf
@ 2016-02-26  0:49 ` Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 06/11] tegra: " Alexander Graf
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  0:49 UTC (permalink / raw)
  To: u-boot

Now that we have nice table-driven code for creating page tables that gives
us everything we need, move to that.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v1 -> v2:

  - Move mmu tables into board file
---
 arch/arm/cpu/armv8/zynqmp/cpu.c | 217 +++++++++-------------------------------
 include/configs/xilinx_zynqmp.h |   2 +
 2 files changed, 50 insertions(+), 169 deletions(-)

diff --git a/arch/arm/cpu/armv8/zynqmp/cpu.c b/arch/arm/cpu/armv8/zynqmp/cpu.c
index c71f291..5dd3cd8 100644
--- a/arch/arm/cpu/armv8/zynqmp/cpu.c
+++ b/arch/arm/cpu/armv8/zynqmp/cpu.c
@@ -8,6 +8,7 @@
 #include <common.h>
 #include <asm/arch/hardware.h>
 #include <asm/arch/sys_proto.h>
+#include <asm/armv8/mmu.h>
 #include <asm/io.h>
 
 #define ZYNQ_SILICON_VER_MASK	0xF000
@@ -15,6 +16,53 @@
 
 DECLARE_GLOBAL_DATA_PTR;
 
+static struct mm_region zynqmp_mem_map[] = {
+	{
+		.base = 0x0UL,
+		.size = 0x80000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |
+			 PTE_BLOCK_INNER_SHARE
+	}, {
+		.base = 0x80000000UL,
+		.size = 0x70000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
+			 PTE_BLOCK_NON_SHARE |
+			 PTE_BLOCK_PXN | PTE_BLOCK_UXN
+	}, {
+		.base = 0xf8000000UL,
+		.size = 0x07e00000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
+			 PTE_BLOCK_NON_SHARE |
+			 PTE_BLOCK_PXN | PTE_BLOCK_UXN
+	}, {
+		.base = 0xffe00000UL,
+		.size = 0x00200000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |
+			 PTE_BLOCK_INNER_SHARE
+	}, {
+		.base = 0x400000000UL,
+		.size = 0x200000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
+			 PTE_BLOCK_NON_SHARE |
+			 PTE_BLOCK_PXN | PTE_BLOCK_UXN
+	}, {
+		.base = 0x600000000UL,
+		.size = 0x800000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |
+			 PTE_BLOCK_INNER_SHARE
+	}, {
+		.base = 0xe00000000UL,
+		.size = 0xf200000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
+			 PTE_BLOCK_NON_SHARE |
+			 PTE_BLOCK_PXN | PTE_BLOCK_UXN
+	}, {
+		/* List terminator */
+		0,
+	}
+};
+struct mm_region *mem_map = zynqmp_mem_map;
+
 static unsigned int zynqmp_get_silicon_version_secure(void)
 {
 	u32 ver;
@@ -44,172 +92,3 @@ unsigned int zynqmp_get_silicon_version(void)
 
 	return ZYNQMP_CSU_VERSION_SILICON;
 }
-
-#ifndef CONFIG_SYS_DCACHE_OFF
-#include <asm/armv8/mmu.h>
-
-#define SECTION_SHIFT_L1	30UL
-#define SECTION_SHIFT_L2	21UL
-#define BLOCK_SIZE_L0		0x8000000000UL
-#define BLOCK_SIZE_L1		(1 << SECTION_SHIFT_L1)
-#define BLOCK_SIZE_L2		(1 << SECTION_SHIFT_L2)
-
-#define TCR_TG1_4K		(1 << 31)
-#define TCR_EPD1_DISABLE	(1 << 23)
-#define ZYNQMO_VA_BITS		40
-#define ZYNQMP_TCR		TCR_TG1_4K | \
-				TCR_EPD1_DISABLE | \
-				TCR_SHARED_OUTER | \
-				TCR_SHARED_INNER | \
-				TCR_IRGN_WBWA | \
-				TCR_ORGN_WBWA | \
-				TCR_T0SZ(ZYNQMO_VA_BITS)
-
-#define MEMORY_ATTR	PMD_SECT_AF | PMD_SECT_INNER_SHARE |	\
-			PMD_ATTRINDX(MT_NORMAL) |	\
-			PMD_TYPE_SECT
-#define DEVICE_ATTR	PMD_SECT_AF | PMD_SECT_PXN |	\
-			PMD_SECT_UXN | PMD_ATTRINDX(MT_DEVICE_NGNRNE) |	\
-			PMD_TYPE_SECT
-
-/* 4K size is required to place 512 entries in each level */
-#define TLB_TABLE_SIZE	0x1000
-
-struct attr_tbl {
-	u32 num;
-	u64 attr;
-};
-
-static struct attr_tbl attr_tbll1t0[4] = { {16, 0x0},
-					   {8, DEVICE_ATTR},
-					   {32, MEMORY_ATTR},
-					   {456, DEVICE_ATTR}
-					 };
-static struct attr_tbl attr_tbll2t3[4] = { {0x180, DEVICE_ATTR},
-					   {0x40, 0x0},
-					   {0x3F, DEVICE_ATTR},
-					   {0x1, MEMORY_ATTR}
-					 };
-
-/*
- * This mmu table looks as below
- * Level 0 table contains two entries to 512GB sizes. One is Level1 Table 0
- * and other Level1 Table1.
- * Level1 Table0 contains entries for each 1GB from 0 to 511GB.
- * Level1 Table1 contains entries for each 1GB from 512GB to 1TB.
- * Level2 Table0, Level2 Table1, Level2 Table2 and Level2 Table3 contains
- * entries for each 2MB starting from 0GB, 1GB, 2GB and 3GB respectively.
- */
-static void zynqmp_mmu_setup(void)
-{
-	int el;
-	u32 index_attr;
-	u64 i, section_l1t0, section_l1t1;
-	u64 section_l2t0, section_l2t1, section_l2t2, section_l2t3;
-	u64 *level0_table = (u64 *)gd->arch.tlb_addr;
-	u64 *level1_table_0 = (u64 *)(gd->arch.tlb_addr + TLB_TABLE_SIZE);
-	u64 *level1_table_1 = (u64 *)(gd->arch.tlb_addr + (2 * TLB_TABLE_SIZE));
-	u64 *level2_table_0 = (u64 *)(gd->arch.tlb_addr + (3 * TLB_TABLE_SIZE));
-	u64 *level2_table_1 = (u64 *)(gd->arch.tlb_addr + (4 * TLB_TABLE_SIZE));
-	u64 *level2_table_2 = (u64 *)(gd->arch.tlb_addr + (5 * TLB_TABLE_SIZE));
-	u64 *level2_table_3 = (u64 *)(gd->arch.tlb_addr + (6 * TLB_TABLE_SIZE));
-
-	level0_table[0] =
-		(u64)level1_table_0 | PMD_TYPE_TABLE;
-	level0_table[1] =
-		(u64)level1_table_1 | PMD_TYPE_TABLE;
-
-	/*
-	 * set level 1 table 0, covering 0 to 512GB
-	 * set level 1 table 1, covering 512GB to 1TB
-	 */
-	section_l1t0 = 0;
-	section_l1t1 = BLOCK_SIZE_L0;
-
-	index_attr = 0;
-	for (i = 0; i < 512; i++) {
-		level1_table_0[i] = section_l1t0;
-		level1_table_0[i] |= attr_tbll1t0[index_attr].attr;
-		attr_tbll1t0[index_attr].num--;
-		if (attr_tbll1t0[index_attr].num == 0)
-			index_attr++;
-		level1_table_1[i] = section_l1t1;
-		level1_table_1[i] |= DEVICE_ATTR;
-		section_l1t0 += BLOCK_SIZE_L1;
-		section_l1t1 += BLOCK_SIZE_L1;
-	}
-
-	level1_table_0[0] =
-		(u64)level2_table_0 | PMD_TYPE_TABLE;
-	level1_table_0[1] =
-		(u64)level2_table_1 | PMD_TYPE_TABLE;
-	level1_table_0[2] =
-		(u64)level2_table_2 | PMD_TYPE_TABLE;
-	level1_table_0[3] =
-		(u64)level2_table_3 | PMD_TYPE_TABLE;
-
-	section_l2t0 = 0;
-	section_l2t1 = section_l2t0 + BLOCK_SIZE_L1; /* 1GB */
-	section_l2t2 = section_l2t1 + BLOCK_SIZE_L1; /* 2GB */
-	section_l2t3 = section_l2t2 + BLOCK_SIZE_L1; /* 3GB */
-
-	index_attr = 0;
-
-	for (i = 0; i < 512; i++) {
-		level2_table_0[i] = section_l2t0 | MEMORY_ATTR;
-		level2_table_1[i] = section_l2t1 | MEMORY_ATTR;
-		level2_table_2[i] = section_l2t2 | DEVICE_ATTR;
-		level2_table_3[i] = section_l2t3 |
-				    attr_tbll2t3[index_attr].attr;
-		attr_tbll2t3[index_attr].num--;
-		if (attr_tbll2t3[index_attr].num == 0)
-			index_attr++;
-		section_l2t0 += BLOCK_SIZE_L2;
-		section_l2t1 += BLOCK_SIZE_L2;
-		section_l2t2 += BLOCK_SIZE_L2;
-		section_l2t3 += BLOCK_SIZE_L2;
-	}
-
-	/* flush new MMU table */
-	flush_dcache_range(gd->arch.tlb_addr,
-			   gd->arch.tlb_addr + gd->arch.tlb_size);
-
-	/* point TTBR to the new table */
-	el = current_el();
-	set_ttbr_tcr_mair(el, gd->arch.tlb_addr,
-			  ZYNQMP_TCR, MEMORY_ATTRIBUTES);
-
-	set_sctlr(get_sctlr() | CR_M);
-}
-
-int arch_cpu_init(void)
-{
-	icache_enable();
-	__asm_invalidate_dcache_all();
-	__asm_invalidate_tlb_all();
-	return 0;
-}
-
-/*
- * This function is called from lib/board.c.
- * It recreates MMU table in main memory. MMU and d-cache are enabled earlier.
- * There is no need to disable d-cache for this operation.
- */
-void enable_caches(void)
-{
-	/* The data cache is not active unless the mmu is enabled */
-	if (!(get_sctlr() & CR_M)) {
-		invalidate_dcache_all();
-		__asm_invalidate_tlb_all();
-		zynqmp_mmu_setup();
-	}
-	puts("Enabling Caches...\n");
-
-	set_sctlr(get_sctlr() | CR_C);
-}
-
-u64 *arch_get_page_table(void)
-{
-	return (u64 *)(gd->arch.tlb_addr + 0x3000);
-}
-#endif
diff --git a/include/configs/xilinx_zynqmp.h b/include/configs/xilinx_zynqmp.h
index da868b8..08f430c 100644
--- a/include/configs/xilinx_zynqmp.h
+++ b/include/configs/xilinx_zynqmp.h
@@ -29,6 +29,8 @@
 #define CONFIG_SYS_MEMTEST_START	CONFIG_SYS_SDRAM_BASE
 #define CONFIG_SYS_MEMTEST_END		CONFIG_SYS_SDRAM_SIZE
 
+#define CONFIG_SYS_FULL_VA
+
 /* Have release address at the end of 256MB for now */
 #define CPU_RELEASE_ADDR	0xFFFFFF0
 
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [U-Boot] [PATCH v4 06/11] tegra: Replace home grown mmu code with generic table approach
  2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
                   ` (4 preceding siblings ...)
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 05/11] zynqmp: Replace home grown mmu code with generic table approach Alexander Graf
@ 2016-02-26  0:49 ` Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 07/11] vexpress64: Add MMU tables Alexander Graf
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  0:49 UTC (permalink / raw)
  To: u-boot

Now that we have nice table-driven code for creating page tables that gives
us everything we need, move to that.
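
As a rough cross-check of the conversion, the sketch below uses the base and
size values from the new Tegra table and verifies that the regions are
contiguous and end at 2^40, which matches the 40 VA bits (TEGRA_VA_BITS) of
the removed hand-written code. The struct is a local stand-in, the attrs
values are placeholders, and map_is_contiguous(), map_end() and
va_bits_needed() are hypothetical helpers, not U-Boot functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for U-Boot's struct mm_region. */
struct mm_region {
	uint64_t base;
	uint64_t size;
	uint64_t attrs;
};

/* Bases and sizes copied from the patch; attrs are placeholders. */
struct mm_region tegra_like_map[] = {
	{ .base = 0x0ULL,        .size = 0x80000000ULL,   .attrs = 1 },
	{ .base = 0x80000000ULL, .size = 0xff80000000ULL, .attrs = 2 },
	{ 0 },	/* list terminator */
};

/* True if every region starts exactly where the previous one ended. */
bool map_is_contiguous(const struct mm_region *map)
{
	uint64_t expected = map[0].base;
	int i;

	for (i = 0; map[i].size || map[i].attrs; i++) {
		if (map[i].base != expected)
			return false;
		expected = map[i].base + map[i].size;
	}
	return true;
}

/* Highest end address of any region in the list. */
uint64_t map_end(const struct mm_region *map)
{
	uint64_t end = 0;
	int i;

	for (i = 0; map[i].size || map[i].attrs; i++)
		if (map[i].base + map[i].size > end)
			end = map[i].base + map[i].size;
	return end;
}

/* Smallest n such that 2^n >= end, i.e. address bits needed to cover it. */
unsigned int va_bits_needed(uint64_t end)
{
	unsigned int n = 0;

	while (n < 64 && (1ULL << n) < end)
		n++;
	return n;
}
```

Note that a map does not have to be contiguous for the generic code to
accept it; the check is only a convenient way to confirm that this
particular table covers the same 1 TiB range as before.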

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v1 -> v2:

  - Move mmu tables into .c file
---
 arch/arm/mach-tegra/arm64-mmu.c   | 132 +++++---------------------------------
 include/configs/tegra210-common.h |   2 +
 2 files changed, 19 insertions(+), 115 deletions(-)

diff --git a/arch/arm/mach-tegra/arm64-mmu.c b/arch/arm/mach-tegra/arm64-mmu.c
index c227652..501c4f0 100644
--- a/arch/arm/mach-tegra/arm64-mmu.c
+++ b/arch/arm/mach-tegra/arm64-mmu.c
@@ -12,120 +12,22 @@
 #include <asm/system.h>
 #include <asm/armv8/mmu.h>
 
-DECLARE_GLOBAL_DATA_PTR;
-
-#define SECTION_SHIFT_L1	30UL
-#define SECTION_SHIFT_L2	21UL
-#define BLOCK_SIZE_L0		0x8000000000UL
-#define BLOCK_SIZE_L1		(1 << SECTION_SHIFT_L1)
-#define BLOCK_SIZE_L2		(1 << SECTION_SHIFT_L2)
-
-#define TCR_TG1_4K		(1 << 31)
-#define TCR_EPD1_DISABLE	(1 << 23)
-#define TEGRA_VA_BITS		40
-#define TEGRA_TCR		TCR_TG1_4K | \
-				TCR_EPD1_DISABLE | \
-				TCR_SHARED_OUTER | \
-				TCR_SHARED_INNER | \
-				TCR_IRGN_WBWA | \
-				TCR_ORGN_WBWA | \
-				TCR_T0SZ(TEGRA_VA_BITS)
-
-#define MEMORY_ATTR	PMD_SECT_AF | PMD_SECT_INNER_SHARE |	\
-			PMD_ATTRINDX(MT_NORMAL) |	\
-			PMD_TYPE_SECT
-#define DEVICE_ATTR	PMD_SECT_AF | PMD_SECT_PXN |	\
-			PMD_SECT_UXN | PMD_ATTRINDX(MT_DEVICE_NGNRNE) |	\
-			PMD_TYPE_SECT
-
-/* 4K size is required to place 512 entries in each level */
-#define TLB_TABLE_SIZE	0x1000
-
-/*
- * This mmu table looks as below
- * Level 0 table contains two entries to 512GB sizes. One is Level1 Table 0
- * and other Level1 Table1.
- * Level1 Table0 contains entries for each 1GB from 0 to 511GB.
- * Level1 Table1 contains entries for each 1GB from 512GB to 1TB.
- * Level2 Table0, Level2 Table1, Level2 Table2 and Level2 Table3 contains
- * entries for each 2MB starting from 0GB, 1GB, 2GB and 3GB respectively.
- */
-void mmu_setup(void)
-{
-	int el;
-	u64 i, section_l1t0, section_l1t1;
-	u64 section_l2t0, section_l2t1, section_l2t2, section_l2t3;
-	u64 *level0_table = (u64 *)gd->arch.tlb_addr;
-	u64 *level1_table_0 = (u64 *)(gd->arch.tlb_addr + TLB_TABLE_SIZE);
-	u64 *level1_table_1 = (u64 *)(gd->arch.tlb_addr + (2 * TLB_TABLE_SIZE));
-	u64 *level2_table_0 = (u64 *)(gd->arch.tlb_addr + (3 * TLB_TABLE_SIZE));
-	u64 *level2_table_1 = (u64 *)(gd->arch.tlb_addr + (4 * TLB_TABLE_SIZE));
-	u64 *level2_table_2 = (u64 *)(gd->arch.tlb_addr + (5 * TLB_TABLE_SIZE));
-	u64 *level2_table_3 = (u64 *)(gd->arch.tlb_addr + (6 * TLB_TABLE_SIZE));
-
-	/* Invalidate all table entries */
-	memset(level0_table, 0, PGTABLE_SIZE);
-
-	level0_table[0] =
-		(u64)level1_table_0 | PMD_TYPE_TABLE;
-	level0_table[1] =
-		(u64)level1_table_1 | PMD_TYPE_TABLE;
-
-	/*
-	 * set level 1 table 0, covering 0 to 512GB
-	 * set level 1 table 1, covering 512GB to 1TB
-	 */
-	section_l1t0 = 0;
-	section_l1t1 = BLOCK_SIZE_L0;
-
-	for (i = 0; i < 512; i++) {
-		level1_table_0[i] = section_l1t0;
-		if (i >= 4)
-			level1_table_0[i] |= MEMORY_ATTR;
-		level1_table_1[i] = section_l1t1;
-		level1_table_1[i] |= MEMORY_ATTR;
-		section_l1t0 += BLOCK_SIZE_L1;
-		section_l1t1 += BLOCK_SIZE_L1;
+static struct mm_region tegra_mem_map[] = {
+	{
+		.base = 0x0UL,
+		.size = 0x80000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
+			 PTE_BLOCK_NON_SHARE |
+			 PTE_BLOCK_PXN | PTE_BLOCK_UXN
+	}, {
+		.base = 0x80000000UL,
+		.size = 0xff80000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |
+			 PTE_BLOCK_INNER_SHARE
+	}, {
+		/* List terminator */
+		0,
 	}
+};
 
-	level1_table_0[0] =
-		(u64)level2_table_0 | PMD_TYPE_TABLE;
-	level1_table_0[1] =
-		(u64)level2_table_1 | PMD_TYPE_TABLE;
-	level1_table_0[2] =
-		(u64)level2_table_2 | PMD_TYPE_TABLE;
-	level1_table_0[3] =
-		(u64)level2_table_3 | PMD_TYPE_TABLE;
-
-	section_l2t0 = 0;
-	section_l2t1 = section_l2t0 + BLOCK_SIZE_L1; /* 1GB */
-	section_l2t2 = section_l2t1 + BLOCK_SIZE_L1; /* 2GB */
-	section_l2t3 = section_l2t2 + BLOCK_SIZE_L1; /* 3GB */
-
-	for (i = 0; i < 512; i++) {
-		level2_table_0[i] = section_l2t0 | DEVICE_ATTR;
-		level2_table_1[i] = section_l2t1 | DEVICE_ATTR;
-		level2_table_2[i] = section_l2t2 | MEMORY_ATTR;
-		level2_table_3[i] = section_l2t3 | MEMORY_ATTR;
-		section_l2t0 += BLOCK_SIZE_L2;
-		section_l2t1 += BLOCK_SIZE_L2;
-		section_l2t2 += BLOCK_SIZE_L2;
-		section_l2t3 += BLOCK_SIZE_L2;
-	}
-
-	/* flush new MMU table */
-	flush_dcache_range(gd->arch.tlb_addr,
-			   gd->arch.tlb_addr + gd->arch.tlb_size);
-
-	/* point TTBR to the new table */
-	el = current_el();
-	set_ttbr_tcr_mair(el, gd->arch.tlb_addr,
-			  TEGRA_TCR, MEMORY_ATTRIBUTES);
-
-	set_sctlr(get_sctlr() | CR_M);
-}
-
-u64 *arch_get_page_table(void)
-{
-	return (u64 *)(gd->arch.tlb_addr + (3 * TLB_TABLE_SIZE));
-}
+struct mm_region *mem_map = tegra_mem_map;
diff --git a/include/configs/tegra210-common.h b/include/configs/tegra210-common.h
index 8f35a7b..2a6e317 100644
--- a/include/configs/tegra210-common.h
+++ b/include/configs/tegra210-common.h
@@ -13,6 +13,8 @@
 /* Cortex-A57 uses a cache line size of 64 bytes */
 #define CONFIG_SYS_CACHELINE_SIZE	64
 
+#define CONFIG_SYS_FULL_VA
+
 /*
  * NS16550 Configuration
  */
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [U-Boot] [PATCH v4 07/11] vexpress64: Add MMU tables
  2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
                   ` (5 preceding siblings ...)
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 06/11] tegra: " Alexander Graf
@ 2016-02-26  0:49 ` Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 08/11] dwmmc: Increase retry timeout Alexander Graf
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  0:49 UTC (permalink / raw)
  To: u-boot

There's no good excuse for running with caches disabled on AArch64,
so let's just move the vexpress64 target to enable the MMU and run
with caches on.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v1 -> v2:

  - Move tables to .c file
---
 board/armltd/vexpress64/vexpress64.c | 21 +++++++++++++++++++++
 include/configs/vexpress_aemv8a.h    |  6 +++---
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/board/armltd/vexpress64/vexpress64.c b/board/armltd/vexpress64/vexpress64.c
index 6efc8c1..973b579 100644
--- a/board/armltd/vexpress64/vexpress64.c
+++ b/board/armltd/vexpress64/vexpress64.c
@@ -14,6 +14,7 @@
 #include <dm/platdata.h>
 #include <dm/platform_data/serial_pl01x.h>
 #include "pcie.h"
+#include <asm/armv8/mmu.h>
 
 DECLARE_GLOBAL_DATA_PTR;
 
@@ -28,6 +29,26 @@ U_BOOT_DEVICE(vexpress_serials) = {
 	.platdata = &serial_platdata,
 };
 
+static struct mm_region vexpress64_mem_map[] = {
+	{
+		.base = 0x0UL,
+		.size = 0x80000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
+			 PTE_BLOCK_NON_SHARE |
+			 PTE_BLOCK_PXN | PTE_BLOCK_UXN
+	}, {
+		.base = 0x80000000UL,
+		.size = 0xff80000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |
+			 PTE_BLOCK_INNER_SHARE
+	}, {
+		/* List terminator */
+		0,
+	}
+};
+
+struct mm_region *mem_map = vexpress64_mem_map;
+
 /* This function gets replaced by platforms supporting PCIe.
  * The replacement function, eg. on Juno, initialises the PCIe bus.
  */
diff --git a/include/configs/vexpress_aemv8a.h b/include/configs/vexpress_aemv8a.h
index 133041b..ddb9848 100644
--- a/include/configs/vexpress_aemv8a.h
+++ b/include/configs/vexpress_aemv8a.h
@@ -19,9 +19,9 @@
 
 #define CONFIG_SUPPORT_RAW_INITRD
 
-/* Cache Definitions */
-#define CONFIG_SYS_DCACHE_OFF
-#define CONFIG_SYS_ICACHE_OFF
+/* MMU Definitions */
+#define CONFIG_SYS_CACHELINE_SIZE	64
+#define CONFIG_SYS_FULL_VA
 
 #define CONFIG_IDENT_STRING		" vexpress_aemv8a"
 #define CONFIG_BOOTP_VCI_STRING		"U-Boot.armv8.vexpress_aemv8a"
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [U-Boot] [PATCH v4 08/11] dwmmc: Increase retry timeout
  2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
                   ` (6 preceding siblings ...)
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 07/11] vexpress64: Add MMU tables Alexander Graf
@ 2016-02-26  0:49 ` Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 09/11] hikey: Add MMU tables Alexander Graf
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  0:49 UTC (permalink / raw)
  To: u-boot

When enabling dcache on HiKey, we're running into MMC command timeouts
because our retry loop is now faster than the eMMC (or an external SD
card) can answer.

Increase the retry count to the same as the timeout value for status
reports.

The real fix is obviously to not base this whole thing on a cycle counter
but on real wall time, but that would be slightly more intrusive.
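
To illustrate what a wall-time based wait could look like, here is a
minimal host-side sketch. It is not the dw_mmc driver code: now_ms() uses
POSIX clock_gettime() as a stand-in for a millisecond timer such as
get_timer(), and ready_fn, ready_after_n() and wait_ready_ms() are
hypothetical names introduced only for this example.

```c
#define _POSIX_C_SOURCE 199309L

#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in milliseconds; stand-in for a hardware timer read. */
uint64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}

/* Hypothetical "is the device ready?" poll callback. */
typedef bool (*ready_fn)(void *ctx);

/* Example callback: reports ready once the counter in ctx has run down. */
bool ready_after_n(void *ctx)
{
	int *remaining = ctx;

	return (*remaining)-- <= 0;
}

/*
 * Poll until ready() returns true or timeout_ms of wall time has elapsed.
 * Returns 0 on success, -1 on timeout. How fast the CPU spins through the
 * loop no longer changes how long we are willing to wait.
 */
int wait_ready_ms(ready_fn ready, void *ctx, uint64_t timeout_ms)
{
	uint64_t start = now_ms();

	do {
		if (ready(ctx))
			return 0;
	} while (now_ms() - start < timeout_ms);

	return -1;
}
```

With this shape, enabling the dcache makes the loop iterate faster but does
not shorten the effective timeout, which is the property the iteration
counter lacks.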

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 drivers/mmc/dw_mmc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mmc/dw_mmc.c b/drivers/mmc/dw_mmc.c
index 909e3ca..7329f40 100644
--- a/drivers/mmc/dw_mmc.c
+++ b/drivers/mmc/dw_mmc.c
@@ -189,7 +189,7 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd,
 				 data ? DIV_ROUND_UP(data->blocks, 8) : 0);
 	int ret = 0, flags = 0, i;
 	unsigned int timeout = 100000;
-	u32 retry = 10000;
+	u32 retry = 100000;
 	u32 mask, ctrl;
 	ulong start = get_timer(0);
 	struct bounce_buffer bbstate;
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [U-Boot] [PATCH v4 09/11] hikey: Add MMU tables
  2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
                   ` (7 preceding siblings ...)
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 08/11] dwmmc: Increase retry timeout Alexander Graf
@ 2016-02-26  0:49 ` Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 10/11] arm64: Remove non-full-va map code Alexander Graf
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  0:49 UTC (permalink / raw)
  To: u-boot

The HiKey runs with dcache disabled today. There really should be no reason
not to use caches on AArch64, so let's add MMU definitions and enable the
dcache.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v1 -> v2:

  - Move tables to .c file
---
 board/hisilicon/hikey/hikey.c | 21 +++++++++++++++++++++
 include/configs/hikey.h       |  5 +++--
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/board/hisilicon/hikey/hikey.c b/board/hisilicon/hikey/hikey.c
index c4ae40b..1edc807 100644
--- a/board/hisilicon/hikey/hikey.c
+++ b/board/hisilicon/hikey/hikey.c
@@ -19,6 +19,7 @@
 #include <asm/arch/periph.h>
 #include <asm/arch/pinmux.h>
 #include <asm/arch/hi6220.h>
+#include <asm/armv8/mmu.h>
 
 /*TODO drop this table in favour of device tree */
 static const struct hikey_gpio_platdata hi6220_gpio[] = {
@@ -87,6 +88,26 @@ U_BOOT_DEVICE(hikey_seriala) = {
 	.platdata = &serial_platdata,
 };
 
+static struct mm_region hikey_mem_map[] = {
+	{
+		.base = 0x0UL,
+		.size = 0x80000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |
+			 PTE_BLOCK_INNER_SHARE
+	}, {
+		.base = 0x80000000UL,
+		.size = 0x80000000UL,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
+			 PTE_BLOCK_NON_SHARE |
+			 PTE_BLOCK_PXN | PTE_BLOCK_UXN
+	}, {
+		/* List terminator */
+		0,
+	}
+};
+
+struct mm_region *mem_map = hikey_mem_map;
+
 #ifdef CONFIG_BOARD_EARLY_INIT_F
 int board_uart_init(void)
 {
diff --git a/include/configs/hikey.h b/include/configs/hikey.h
index 796861e..d33dcef 100644
--- a/include/configs/hikey.h
+++ b/include/configs/hikey.h
@@ -21,8 +21,9 @@
 
 #define CONFIG_SUPPORT_RAW_INITRD
 
-/* Cache Definitions */
-#define CONFIG_SYS_DCACHE_OFF
+/* MMU Definitions */
+#define CONFIG_SYS_CACHELINE_SIZE	64
+#define CONFIG_SYS_FULL_VA
 
 #define CONFIG_IDENT_STRING		"hikey"
 
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [U-Boot] [PATCH v4 10/11] arm64: Remove non-full-va map code
  2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
                   ` (8 preceding siblings ...)
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 09/11] hikey: Add MMU tables Alexander Graf
@ 2016-02-26  0:49 ` Alexander Graf
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 11/11] arm64: Only allow dcache disabled in SPL builds Alexander Graf
  2016-02-26 18:37 ` [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Stephen Warren
  11 siblings, 0 replies; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  0:49 UTC (permalink / raw)
  To: u-boot

The code path that uses a single page table level with 64k page size and
a 42-bit address space is no longer used by any board in tree, so we can
safely remove it.
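
The 42-bit figure can be reproduced with a little arithmetic, sketched
below. The table size (0x10000) and the 8-byte entry size (PGTABLE_SIZE >> 3)
come from the removed code; the 512 MiB block size (shift of 29) is an
assumption based on the level-2 block size of the 64k translation granule.
All names in the sketch are hypothetical.

```c
#include <stdint.h>

#define OLD_PGTABLE_SIZE	0x10000u	/* from the removed PGTABLE_SIZE define */
#define ENTRY_SIZE_SHIFT	3		/* 8-byte descriptors (PGTABLE_SIZE >> 3) */
#define ASSUMED_SECTION_SHIFT	29		/* assumption: 512 MiB blocks, 64k granule */

/* Integer log2 for powers of two (rounds down otherwise). */
unsigned int ilog2_u64(uint64_t v)
{
	unsigned int n = 0;

	while (v >>= 1)
		n++;
	return n;
}

/* Number of descriptors that fit into the single-level table. */
uint64_t old_entries(void)
{
	return OLD_PGTABLE_SIZE >> ENTRY_SIZE_SHIFT;
}

/* Address bits covered: index bits plus the bits inside one block. */
unsigned int old_va_bits(void)
{
	return ilog2_u64(old_entries()) + ASSUMED_SECTION_SHIFT;
}
```

Under these assumptions the table holds 8192 entries (13 index bits), and
13 + 29 gives the 42 bits mentioned above.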

To clean up code, move the layerscape mmu code to the new defines,
removing redundant field definitions.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v1 -> v2:

  - Add layerscape empty mmu table in .c file
---
 arch/arm/cpu/armv8/cache_v8.c                  | 90 ------------------------
 arch/arm/cpu/armv8/fsl-layerscape/cpu.c        | 37 ++++++++--
 arch/arm/include/asm/arch-fsl-layerscape/cpu.h | 94 +++++++++++++-------------
 arch/arm/include/asm/armv8/mmu.h               | 66 +-----------------
 arch/arm/include/asm/global_data.h             |  2 +-
 arch/arm/include/asm/system.h                  | 12 ++--
 doc/README.arm64                               | 20 ------
 include/configs/hikey.h                        |  1 -
 include/configs/tegra210-common.h              |  2 -
 include/configs/thunderx_88xx.h                | 14 ----
 include/configs/vexpress_aemv8a.h              |  1 -
 include/configs/xilinx_zynqmp.h                |  2 -
 12 files changed, 85 insertions(+), 256 deletions(-)

diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c
index 55c6f2f..df15e00 100644
--- a/arch/arm/cpu/armv8/cache_v8.c
+++ b/arch/arm/cpu/armv8/cache_v8.c
@@ -35,7 +35,6 @@ DECLARE_GLOBAL_DATA_PTR;
  *    off:          FFF
  */
 
-#ifdef CONFIG_SYS_FULL_VA
 static u64 get_tcr(int el, u64 *pips, u64 *pva_bits)
 {
 	u64 max_addr = 0;
@@ -386,38 +385,11 @@ static void setup_all_pgtables(void)
 	gd->arch.tlb_addr = tlb_addr;
 }
 
-#else
-
-inline void set_pgtable_section(u64 *page_table, u64 index, u64 section,
-			 u64 memory_type, u64 attribute)
-{
-	u64 value;
-
-	value = section | PMD_TYPE_SECT | PMD_SECT_AF;
-	value |= PMD_ATTRINDX(memory_type);
-	value |= attribute;
-	page_table[index] = value;
-}
-
-inline void set_pgtable_table(u64 *page_table, u64 index, u64 *table_addr)
-{
-	u64 value;
-
-	value = (u64)table_addr | PMD_TYPE_TABLE;
-	page_table[index] = value;
-}
-#endif
-
 /* to activate the MMU we need to set up virtual memory */
 __weak void mmu_setup(void)
 {
-#ifndef CONFIG_SYS_FULL_VA
-	bd_t *bd = gd->bd;
-	u64 *page_table = (u64 *)gd->arch.tlb_addr, i, j;
-#endif
 	int el;
 
-#ifdef CONFIG_SYS_FULL_VA
 	/* Set up page tables only once */
 	if (!gd->arch.tlb_fillptr)
 		setup_all_pgtables();
@@ -425,40 +397,6 @@ __weak void mmu_setup(void)
 	el = current_el();
 	set_ttbr_tcr_mair(el, gd->arch.tlb_addr, get_tcr(el, NULL, NULL),
 			  MEMORY_ATTRIBUTES);
-#else
-	/* Setup an identity-mapping for all spaces */
-	for (i = 0; i < (PGTABLE_SIZE >> 3); i++) {
-		set_pgtable_section(page_table, i, i << SECTION_SHIFT,
-				    MT_DEVICE_NGNRNE, PMD_SECT_NON_SHARE);
-	}
-
-	/* Setup an identity-mapping for all RAM space */
-	for (i = 0; i < CONFIG_NR_DRAM_BANKS; i++) {
-		ulong start = bd->bi_dram[i].start;
-		ulong end = bd->bi_dram[i].start + bd->bi_dram[i].size;
-		for (j = start >> SECTION_SHIFT;
-		     j < end >> SECTION_SHIFT; j++) {
-			set_pgtable_section(page_table, j, j << SECTION_SHIFT,
-					    MT_NORMAL, PMD_SECT_NON_SHARE);
-		}
-	}
-
-	/* load TTBR0 */
-	el = current_el();
-	if (el == 1) {
-		set_ttbr_tcr_mair(el, gd->arch.tlb_addr,
-				  TCR_EL1_RSVD | TCR_FLAGS | TCR_EL1_IPS_BITS,
-				  MEMORY_ATTRIBUTES);
-	} else if (el == 2) {
-		set_ttbr_tcr_mair(el, gd->arch.tlb_addr,
-				  TCR_EL2_RSVD | TCR_FLAGS | TCR_EL2_IPS_BITS,
-				  MEMORY_ATTRIBUTES);
-	} else {
-		set_ttbr_tcr_mair(el, gd->arch.tlb_addr,
-				  TCR_EL3_RSVD | TCR_FLAGS | TCR_EL3_IPS_BITS,
-				  MEMORY_ATTRIBUTES);
-	}
-#endif
 
 	/* enable the mmu */
 	set_sctlr(get_sctlr() | CR_M);
@@ -544,33 +482,6 @@ u64 *__weak arch_get_page_table(void) {
 	return NULL;
 }
 
-#ifndef CONFIG_SYS_FULL_VA
-void mmu_set_region_dcache_behaviour(phys_addr_t start, size_t size,
-				     enum dcache_option option)
-{
-	u64 *page_table = arch_get_page_table();
-	u64 upto, end;
-
-	if (page_table == NULL)
-		return;
-
-	end = ALIGN(start + size, (1 << MMU_SECTION_SHIFT)) >>
-	      MMU_SECTION_SHIFT;
-	start = start >> MMU_SECTION_SHIFT;
-	for (upto = start; upto < end; upto++) {
-		page_table[upto] &= ~PMD_ATTRINDX_MASK;
-		page_table[upto] |= PMD_ATTRINDX(option);
-	}
-	asm volatile("dsb sy");
-	__asm_invalidate_tlb_all();
-	asm volatile("dsb sy");
-	asm volatile("isb");
-	start = start << MMU_SECTION_SHIFT;
-	end = end << MMU_SECTION_SHIFT;
-	flush_dcache_range(start, end);
-	asm volatile("dsb sy");
-}
-#else
 static bool is_aligned(u64 addr, u64 size, u64 align)
 {
 	return !(addr & (align - 1)) && !(size & (align - 1));
@@ -652,7 +563,6 @@ void mmu_set_region_dcache_behaviour(phys_addr_t start, size_t size,
 	 */
 	flush_dcache_range(real_start, real_start + real_size);
 }
-#endif
 
 #else	/* CONFIG_SYS_DCACHE_OFF */
 
diff --git a/arch/arm/cpu/armv8/fsl-layerscape/cpu.c b/arch/arm/cpu/armv8/fsl-layerscape/cpu.c
index 6ea28ed..7404bd9 100644
--- a/arch/arm/cpu/armv8/fsl-layerscape/cpu.c
+++ b/arch/arm/cpu/armv8/fsl-layerscape/cpu.c
@@ -26,6 +26,14 @@
 
 DECLARE_GLOBAL_DATA_PTR;
 
+static struct mm_region layerscape_mem_map[] = {
+	{
+		/* List terminator */
+		0,
+	}
+};
+struct mm_region *mem_map = layerscape_mem_map;
+
 void cpu_name(char *name)
 {
 	struct ccsr_gur __iomem *gur = (void *)(CONFIG_SYS_FSL_GUTS_ADDR);
@@ -48,6 +56,25 @@ void cpu_name(char *name)
 }
 
 #ifndef CONFIG_SYS_DCACHE_OFF
+static void set_pgtable_section(u64 *page_table, u64 index, u64 section,
+			u64 memory_type, u64 attribute)
+{
+       u64 value;
+
+       value = section | PTE_TYPE_BLOCK | PTE_BLOCK_AF;
+       value |= PMD_ATTRINDX(memory_type);
+       value |= attribute;
+       page_table[index] = value;
+}
+
+static void set_pgtable_table(u64 *page_table, u64 index, u64 *table_addr)
+{
+       u64 value;
+
+       value = (u64)table_addr | PTE_TYPE_TABLE;
+       page_table[index] = value;
+}
+
 /*
  * Set the block entries according to the information of the table.
  */
@@ -114,10 +141,10 @@ static int find_table(const struct sys_mmu_table *list,
 
 		temp_base -= block_size;
 
-		if ((level_table[index - 1] & PMD_TYPE_MASK) ==
-		    PMD_TYPE_TABLE) {
+		if ((level_table[index - 1] & PTE_TYPE_MASK) ==
+		    PTE_TYPE_TABLE) {
 			level_table = (u64 *)(level_table[index - 1] &
-				      ~PMD_TYPE_MASK);
+				      ~PTE_TYPE_MASK);
 			level++;
 			continue;
 		} else {
@@ -220,7 +247,7 @@ static inline int final_secure_ddr(u64 *level0_table,
 	struct table_info table = {};
 	struct sys_mmu_table ddr_entry = {
 		0, 0, BLOCK_SIZE_L1, MT_NORMAL,
-		PMD_SECT_OUTER_SHARE | PMD_SECT_NS
+		PTE_BLOCK_OUTER_SHARE | PTE_BLOCK_NS
 	};
 	u64 index;
 
@@ -243,7 +270,7 @@ static inline int final_secure_ddr(u64 *level0_table,
 	ddr_entry.virt_addr = phys_addr;
 	ddr_entry.phys_addr = phys_addr;
 	ddr_entry.size = CONFIG_SYS_MEM_RESERVE_SECURE;
-	ddr_entry.attribute = PMD_SECT_OUTER_SHARE;
+	ddr_entry.attribute = PTE_BLOCK_OUTER_SHARE;
 	ret = find_table(&ddr_entry, &table, level0_table);
 	if (ret) {
 		printf("MMU error: could not find secure ddr table\n");
diff --git a/arch/arm/include/asm/arch-fsl-layerscape/cpu.h b/arch/arm/include/asm/arch-fsl-layerscape/cpu.h
index 15ade84..93bbda3 100644
--- a/arch/arm/include/asm/arch-fsl-layerscape/cpu.h
+++ b/arch/arm/include/asm/arch-fsl-layerscape/cpu.h
@@ -117,48 +117,48 @@ static const struct sys_mmu_table early_mmu_table[] = {
 #ifdef CONFIG_FSL_LSCH3
 	{ CONFIG_SYS_FSL_CCSR_BASE, CONFIG_SYS_FSL_CCSR_BASE,
 	  CONFIG_SYS_FSL_CCSR_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_OCRAM_BASE, CONFIG_SYS_FSL_OCRAM_BASE,
-	  CONFIG_SYS_FSL_OCRAM_SIZE, MT_NORMAL, PMD_SECT_NON_SHARE },
+	  CONFIG_SYS_FSL_OCRAM_SIZE, MT_NORMAL, PTE_BLOCK_NON_SHARE },
 	/* For IFC Region #1, only the first 4MB is cache-enabled */
 	{ CONFIG_SYS_FSL_IFC_BASE1, CONFIG_SYS_FSL_IFC_BASE1,
-	  CONFIG_SYS_FSL_IFC_SIZE1_1, MT_NORMAL, PMD_SECT_NON_SHARE },
+	  CONFIG_SYS_FSL_IFC_SIZE1_1, MT_NORMAL, PTE_BLOCK_NON_SHARE },
 	{ CONFIG_SYS_FSL_IFC_BASE1 + CONFIG_SYS_FSL_IFC_SIZE1_1,
 	  CONFIG_SYS_FSL_IFC_BASE1 + CONFIG_SYS_FSL_IFC_SIZE1_1,
 	  CONFIG_SYS_FSL_IFC_SIZE1 - CONFIG_SYS_FSL_IFC_SIZE1_1,
-	  MT_DEVICE_NGNRNE, PMD_SECT_NON_SHARE },
+	  MT_DEVICE_NGNRNE, PTE_BLOCK_NON_SHARE },
 	{ CONFIG_SYS_FLASH_BASE, CONFIG_SYS_FSL_IFC_BASE1,
-	  CONFIG_SYS_FSL_IFC_SIZE1, MT_DEVICE_NGNRNE, PMD_SECT_NON_SHARE },
+	  CONFIG_SYS_FSL_IFC_SIZE1, MT_DEVICE_NGNRNE, PTE_BLOCK_NON_SHARE },
 	{ CONFIG_SYS_FSL_DRAM_BASE1, CONFIG_SYS_FSL_DRAM_BASE1,
 	  CONFIG_SYS_FSL_DRAM_SIZE1, MT_NORMAL,
-	  PMD_SECT_OUTER_SHARE | PMD_SECT_NS },
+	  PTE_BLOCK_OUTER_SHARE | PTE_BLOCK_NS },
 	/* Map IFC region #2 up to CONFIG_SYS_FLASH_BASE for NAND boot */
 	{ CONFIG_SYS_FSL_IFC_BASE2, CONFIG_SYS_FSL_IFC_BASE2,
 	  CONFIG_SYS_FLASH_BASE - CONFIG_SYS_FSL_IFC_BASE2,
-	  MT_DEVICE_NGNRNE, PMD_SECT_NON_SHARE },
+	  MT_DEVICE_NGNRNE, PTE_BLOCK_NON_SHARE },
 	{ CONFIG_SYS_FSL_DCSR_BASE, CONFIG_SYS_FSL_DCSR_BASE,
 	  CONFIG_SYS_FSL_DCSR_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_DRAM_BASE2, CONFIG_SYS_FSL_DRAM_BASE2,
 	  CONFIG_SYS_FSL_DRAM_SIZE2, MT_NORMAL,
-	  PMD_SECT_OUTER_SHARE | PMD_SECT_NS },
+	  PTE_BLOCK_OUTER_SHARE | PTE_BLOCK_NS },
 #elif defined(CONFIG_FSL_LSCH2)
 	{ CONFIG_SYS_FSL_CCSR_BASE, CONFIG_SYS_FSL_CCSR_BASE,
 	  CONFIG_SYS_FSL_CCSR_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_OCRAM_BASE, CONFIG_SYS_FSL_OCRAM_BASE,
-	  CONFIG_SYS_FSL_OCRAM_SIZE, MT_NORMAL, PMD_SECT_NON_SHARE },
+	  CONFIG_SYS_FSL_OCRAM_SIZE, MT_NORMAL, PTE_BLOCK_NON_SHARE },
 	{ CONFIG_SYS_FSL_DCSR_BASE, CONFIG_SYS_FSL_DCSR_BASE,
 	  CONFIG_SYS_FSL_DCSR_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_QSPI_BASE, CONFIG_SYS_FSL_QSPI_BASE,
-	  CONFIG_SYS_FSL_QSPI_SIZE, MT_DEVICE_NGNRNE, PMD_SECT_NON_SHARE },
+	  CONFIG_SYS_FSL_QSPI_SIZE, MT_DEVICE_NGNRNE, PTE_BLOCK_NON_SHARE },
 	{ CONFIG_SYS_FSL_IFC_BASE, CONFIG_SYS_FSL_IFC_BASE,
-	  CONFIG_SYS_FSL_IFC_SIZE, MT_DEVICE_NGNRNE, PMD_SECT_NON_SHARE },
+	  CONFIG_SYS_FSL_IFC_SIZE, MT_DEVICE_NGNRNE, PTE_BLOCK_NON_SHARE },
 	{ CONFIG_SYS_FSL_DRAM_BASE1, CONFIG_SYS_FSL_DRAM_BASE1,
-	  CONFIG_SYS_FSL_DRAM_SIZE1, MT_NORMAL, PMD_SECT_OUTER_SHARE },
+	  CONFIG_SYS_FSL_DRAM_SIZE1, MT_NORMAL, PTE_BLOCK_OUTER_SHARE },
 	{ CONFIG_SYS_FSL_DRAM_BASE2, CONFIG_SYS_FSL_DRAM_BASE2,
-	  CONFIG_SYS_FSL_DRAM_SIZE2, MT_NORMAL, PMD_SECT_OUTER_SHARE },
+	  CONFIG_SYS_FSL_DRAM_SIZE2, MT_NORMAL, PTE_BLOCK_OUTER_SHARE },
 #endif
 };
 
@@ -166,96 +166,96 @@ static const struct sys_mmu_table final_mmu_table[] = {
 #ifdef CONFIG_FSL_LSCH3
 	{ CONFIG_SYS_FSL_CCSR_BASE, CONFIG_SYS_FSL_CCSR_BASE,
 	  CONFIG_SYS_FSL_CCSR_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_OCRAM_BASE, CONFIG_SYS_FSL_OCRAM_BASE,
-	  CONFIG_SYS_FSL_OCRAM_SIZE, MT_NORMAL, PMD_SECT_NON_SHARE },
+	  CONFIG_SYS_FSL_OCRAM_SIZE, MT_NORMAL, PTE_BLOCK_NON_SHARE },
 	{ CONFIG_SYS_FSL_DRAM_BASE1, CONFIG_SYS_FSL_DRAM_BASE1,
 	  CONFIG_SYS_FSL_DRAM_SIZE1, MT_NORMAL,
-	  PMD_SECT_OUTER_SHARE | PMD_SECT_NS },
+	  PTE_BLOCK_OUTER_SHARE | PTE_BLOCK_NS },
 	{ CONFIG_SYS_FSL_QSPI_BASE2, CONFIG_SYS_FSL_QSPI_BASE2,
 	  CONFIG_SYS_FSL_QSPI_SIZE2, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_IFC_BASE2, CONFIG_SYS_FSL_IFC_BASE2,
-	  CONFIG_SYS_FSL_IFC_SIZE2, MT_DEVICE_NGNRNE, PMD_SECT_NON_SHARE },
+	  CONFIG_SYS_FSL_IFC_SIZE2, MT_DEVICE_NGNRNE, PTE_BLOCK_NON_SHARE },
 	{ CONFIG_SYS_FSL_DCSR_BASE, CONFIG_SYS_FSL_DCSR_BASE,
 	  CONFIG_SYS_FSL_DCSR_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_MC_BASE, CONFIG_SYS_FSL_MC_BASE,
 	  CONFIG_SYS_FSL_MC_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_NI_BASE, CONFIG_SYS_FSL_NI_BASE,
 	  CONFIG_SYS_FSL_NI_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	/* For QBMAN portal, only the first 64MB is cache-enabled */
 	{ CONFIG_SYS_FSL_QBMAN_BASE, CONFIG_SYS_FSL_QBMAN_BASE,
 	  CONFIG_SYS_FSL_QBMAN_SIZE_1, MT_NORMAL,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN | PMD_SECT_NS },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN | PTE_BLOCK_NS },
 	{ CONFIG_SYS_FSL_QBMAN_BASE + CONFIG_SYS_FSL_QBMAN_SIZE_1,
 	  CONFIG_SYS_FSL_QBMAN_BASE + CONFIG_SYS_FSL_QBMAN_SIZE_1,
 	  CONFIG_SYS_FSL_QBMAN_SIZE - CONFIG_SYS_FSL_QBMAN_SIZE_1,
-	  MT_DEVICE_NGNRNE, PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  MT_DEVICE_NGNRNE, PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_PCIE1_PHYS_ADDR, CONFIG_SYS_PCIE1_PHYS_ADDR,
 	  CONFIG_SYS_PCIE1_PHYS_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_PCIE2_PHYS_ADDR, CONFIG_SYS_PCIE2_PHYS_ADDR,
 	  CONFIG_SYS_PCIE2_PHYS_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_PCIE3_PHYS_ADDR, CONFIG_SYS_PCIE3_PHYS_ADDR,
 	  CONFIG_SYS_PCIE3_PHYS_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 #if defined(CONFIG_LS2080A) || defined(CONFIG_LS2085A)
 	{ CONFIG_SYS_PCIE4_PHYS_ADDR, CONFIG_SYS_PCIE4_PHYS_ADDR,
 	  CONFIG_SYS_PCIE4_PHYS_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 #endif
 	{ CONFIG_SYS_FSL_WRIOP1_BASE, CONFIG_SYS_FSL_WRIOP1_BASE,
 	  CONFIG_SYS_FSL_WRIOP1_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_AIOP1_BASE, CONFIG_SYS_FSL_AIOP1_BASE,
 	  CONFIG_SYS_FSL_AIOP1_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_PEBUF_BASE, CONFIG_SYS_FSL_PEBUF_BASE,
 	  CONFIG_SYS_FSL_PEBUF_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_DRAM_BASE2, CONFIG_SYS_FSL_DRAM_BASE2,
 	  CONFIG_SYS_FSL_DRAM_SIZE2, MT_NORMAL,
-	  PMD_SECT_OUTER_SHARE | PMD_SECT_NS },
+	  PTE_BLOCK_OUTER_SHARE | PTE_BLOCK_NS },
 #elif defined(CONFIG_FSL_LSCH2)
 	{ CONFIG_SYS_FSL_BOOTROM_BASE, CONFIG_SYS_FSL_BOOTROM_BASE,
 	  CONFIG_SYS_FSL_BOOTROM_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_CCSR_BASE, CONFIG_SYS_FSL_CCSR_BASE,
 	  CONFIG_SYS_FSL_CCSR_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_OCRAM_BASE, CONFIG_SYS_FSL_OCRAM_BASE,
-	  CONFIG_SYS_FSL_OCRAM_SIZE, MT_NORMAL, PMD_SECT_NON_SHARE },
+	  CONFIG_SYS_FSL_OCRAM_SIZE, MT_NORMAL, PTE_BLOCK_NON_SHARE },
 	{ CONFIG_SYS_FSL_DCSR_BASE, CONFIG_SYS_FSL_DCSR_BASE,
 	  CONFIG_SYS_FSL_DCSR_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_QSPI_BASE, CONFIG_SYS_FSL_QSPI_BASE,
 	  CONFIG_SYS_FSL_QSPI_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_IFC_BASE, CONFIG_SYS_FSL_IFC_BASE,
-	  CONFIG_SYS_FSL_IFC_SIZE, MT_DEVICE_NGNRNE, PMD_SECT_NON_SHARE },
+	  CONFIG_SYS_FSL_IFC_SIZE, MT_DEVICE_NGNRNE, PTE_BLOCK_NON_SHARE },
 	{ CONFIG_SYS_FSL_DRAM_BASE1, CONFIG_SYS_FSL_DRAM_BASE1,
 	  CONFIG_SYS_FSL_DRAM_SIZE1, MT_NORMAL,
-	  PMD_SECT_OUTER_SHARE | PMD_SECT_NS },
+	  PTE_BLOCK_OUTER_SHARE | PTE_BLOCK_NS },
 	{ CONFIG_SYS_FSL_QBMAN_BASE, CONFIG_SYS_FSL_QBMAN_BASE,
 	  CONFIG_SYS_FSL_QBMAN_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_DRAM_BASE2, CONFIG_SYS_FSL_DRAM_BASE2,
-	  CONFIG_SYS_FSL_DRAM_SIZE2, MT_NORMAL, PMD_SECT_OUTER_SHARE },
+	  CONFIG_SYS_FSL_DRAM_SIZE2, MT_NORMAL, PTE_BLOCK_OUTER_SHARE },
 	{ CONFIG_SYS_PCIE1_PHYS_ADDR, CONFIG_SYS_PCIE1_PHYS_ADDR,
 	  CONFIG_SYS_PCIE1_PHYS_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_PCIE2_PHYS_ADDR, CONFIG_SYS_PCIE2_PHYS_ADDR,
 	  CONFIG_SYS_PCIE2_PHYS_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_PCIE3_PHYS_ADDR, CONFIG_SYS_PCIE3_PHYS_ADDR,
 	  CONFIG_SYS_PCIE3_PHYS_SIZE, MT_DEVICE_NGNRNE,
-	  PMD_SECT_NON_SHARE | PMD_SECT_PXN | PMD_SECT_UXN },
+	  PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN },
 	{ CONFIG_SYS_FSL_DRAM_BASE3, CONFIG_SYS_FSL_DRAM_BASE3,
-	  CONFIG_SYS_FSL_DRAM_SIZE3, MT_NORMAL, PMD_SECT_OUTER_SHARE },
+	  CONFIG_SYS_FSL_DRAM_SIZE3, MT_NORMAL, PTE_BLOCK_OUTER_SHARE },
 #endif
 };
 #endif
diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h
index 06126c8..0d08ed3 100644
--- a/arch/arm/include/asm/armv8/mmu.h
+++ b/arch/arm/include/asm/armv8/mmu.h
@@ -22,28 +22,12 @@
  * calculated specifically.
  */
 
-#ifndef CONFIG_SYS_FULL_VA
-#define VA_BITS			(42)	/* 42 bits virtual address */
-#else
 #define VA_BITS			CONFIG_SYS_VA_BITS
 #define PTE_BLOCK_BITS		CONFIG_SYS_PTL2_BITS
-#endif
 
 /*
  * block/section address mask and size definitions.
  */
-#ifndef CONFIG_SYS_FULL_VA
-#define SECTION_SHIFT		29
-#define SECTION_SIZE		(UL(1) << SECTION_SHIFT)
-#define SECTION_MASK		(~(SECTION_SIZE-1))
-
-/* PAGE_SHIFT determines the page size */
-#undef  PAGE_SIZE
-#define PAGE_SHIFT		16
-#define PAGE_SIZE		(1 << PAGE_SHIFT)
-#define PAGE_MASK		(~(PAGE_SIZE-1))
-
-#else
 
 /* PAGE_SHIFT determines the page size */
 #undef  PAGE_SIZE
@@ -51,8 +35,6 @@
 #define PAGE_SIZE		(1 << PAGE_SHIFT)
 #define PAGE_MASK		(~(PAGE_SIZE-1))
 
-#endif
-
 /***************************************************************/
 
 /*
@@ -75,8 +57,6 @@
  *
  */
 
-#ifdef CONFIG_SYS_FULL_VA
-
 #define PTE_TYPE_MASK		(3 << 0)
 #define PTE_TYPE_FAULT		(0 << 0)
 #define PTE_TYPE_TABLE		(3 << 0)
@@ -91,6 +71,7 @@
  * Block
  */
 #define PTE_BLOCK_MEMTYPE(x)	((x) << 2)
+#define PTE_BLOCK_NS            (1 << 5)
 #define PTE_BLOCK_NON_SHARE	(0 << 8)
 #define PTE_BLOCK_OUTER_SHARE	(2 << 8)
 #define PTE_BLOCK_INNER_SHARE	(3 << 8)
@@ -99,29 +80,6 @@
 #define PTE_BLOCK_PXN		(UL(1) << 53)
 #define PTE_BLOCK_UXN		(UL(1) << 54)
 
-#else
-/*
- * Level 2 descriptor (PMD).
- */
-#define PMD_TYPE_MASK		(3 << 0)
-#define PMD_TYPE_FAULT		(0 << 0)
-#define PMD_TYPE_TABLE		(3 << 0)
-#define PMD_TYPE_SECT		(1 << 0)
-
-/*
- * Section
- */
-#define PMD_SECT_NS		(1 << 5)
-#define PMD_SECT_NON_SHARE	(0 << 8)
-#define PMD_SECT_OUTER_SHARE	(2 << 8)
-#define PMD_SECT_INNER_SHARE	(3 << 8)
-#define PMD_SECT_AF		(1 << 10)
-#define PMD_SECT_NG		(1 << 11)
-#define PMD_SECT_PXN		(UL(1) << 53)
-#define PMD_SECT_UXN		(UL(1) << 54)
-
-#endif
-
 /*
  * AttrIndx[2:0]
  */
@@ -150,33 +108,11 @@
 #define TCR_TG0_16K		(2 << 14)
 #define TCR_EPD1_DISABLE	(1 << 23)
 
-#ifndef CONFIG_SYS_FULL_VA
-#define TCR_EL1_IPS_BITS	(UL(3) << 32)	/* 42 bits physical address */
-#define TCR_EL2_IPS_BITS	(3 << 16)	/* 42 bits physical address */
-#define TCR_EL3_IPS_BITS	(3 << 16)	/* 42 bits physical address */
-
-/* PTWs cacheable, inner/outer WBWA and inner shareable */
-#define TCR_FLAGS		(TCR_TG0_64K |		\
-				TCR_SHARED_INNER |	\
-				TCR_ORGN_WBWA |		\
-				TCR_IRGN_WBWA |		\
-				TCR_T0SZ(VA_BITS))
-#endif
-
 #define TCR_EL1_RSVD		(1 << 31)
 #define TCR_EL2_RSVD		(1 << 31 | 1 << 23)
 #define TCR_EL3_RSVD		(1 << 31 | 1 << 23)
 
 #ifndef __ASSEMBLY__
-#ifndef CONFIG_SYS_FULL_VA
-
-void set_pgtable_section(u64 *page_table, u64 index,
-			 u64 section, u64 memory_type,
-			 u64 attribute);
-void set_pgtable_table(u64 *page_table, u64 index,
-		       u64 *table_addr);
-
-#endif
 static inline void set_ttbr_tcr_mair(int el, u64 table, u64 tcr, u64 attr)
 {
 	asm volatile("dsb sy");
diff --git a/arch/arm/include/asm/global_data.h b/arch/arm/include/asm/global_data.h
index 259daa1..77d2653 100644
--- a/arch/arm/include/asm/global_data.h
+++ b/arch/arm/include/asm/global_data.h
@@ -39,7 +39,7 @@ struct arch_global_data {
 #if !(defined(CONFIG_SYS_ICACHE_OFF) && defined(CONFIG_SYS_DCACHE_OFF))
 	unsigned long tlb_addr;
 	unsigned long tlb_size;
-#if defined(CONFIG_SYS_FULL_VA)
+#if defined(CONFIG_ARM64)
 	unsigned long tlb_fillptr;
 	unsigned long tlb_emerg;
 #endif
diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
index 9b1cbf2..ac1173d 100644
--- a/arch/arm/include/asm/system.h
+++ b/arch/arm/include/asm/system.h
@@ -17,18 +17,14 @@
 #define CR_WXN		(1 << 19)	/* Write Permision Imply XN	*/
 #define CR_EE		(1 << 25)	/* Exception (Big) Endian	*/
 
-/* 2MB granularity */
-#define MMU_SECTION_SHIFT	21
-#define MMU_SECTION_SIZE	(1 << MMU_SECTION_SHIFT)
-
 #ifndef __ASSEMBLY__
 
-#ifndef CONFIG_SYS_FULL_VA
-#define PGTABLE_SIZE	(0x10000)
-#else
 u64 get_page_table_size(void);
 #define PGTABLE_SIZE	get_page_table_size()
-#endif
+
+/* 2MB granularity */
+#define MMU_SECTION_SHIFT	21
+#define MMU_SECTION_SIZE	(1 << MMU_SECTION_SHIFT)
 
 enum dcache_option {
 	DCACHE_OFF = 0x3,
diff --git a/doc/README.arm64 b/doc/README.arm64
index de669cb..f658fa2 100644
--- a/doc/README.arm64
+++ b/doc/README.arm64
@@ -36,26 +36,6 @@ Notes
 6. CONFIG_ARM64 instead of CONFIG_ARMV8 is used to distinguish aarch64 and
    aarch32 specific codes.
 
-7. CONFIG_SYS_FULL_VA is used to enable 2-level page tables. For cores
-   supporting 64k pages it allows usage of full 48+ virtual/physical addresses
-
-   Enabling this option requires the following ones to be defined:
-       - CONFIG_SYS_MEM_MAP - an array of 'struct mm_region' describing the
-         system memory map (start, length, attributes)
-       - CONFIG_SYS_MEM_MAP_SIZE - number of entries in CONFIG_SYS_MEM_MAP
-       - CONFIG_SYS_PTL1_ENTRIES - number of 1st level page table entries
-       - CONFIG_SYS_PTL2_ENTRIES - number of 1nd level page table entries
-         for the largest CONFIG_SYS_MEM_MAP entry
-       - CONFIG_COREID_MASK - the mask value used to get the core from the
-         MPIDR_EL1 register
-       - CONFIG_SYS_PTL2_BITS - number of bits addressed by the 2nd level
-         page tables
-       - CONFIG_SYS_BLOCK_SHIFT - number of bits addressed by a single block
-         entry from L2 page tables
-       - CONFIG_SYS_PGTABLE_SIZE - total size of the page table
-       - CONFIG_SYS_TCR_EL{1,2,3}_IPS_BITS - the IPS field of the TCR_EL{1,2,3}
-
-
 
 
 Contributor
diff --git a/include/configs/hikey.h b/include/configs/hikey.h
index d33dcef..2d9ace9 100644
--- a/include/configs/hikey.h
+++ b/include/configs/hikey.h
@@ -23,7 +23,6 @@
 
 /* MMU Definitions */
 #define CONFIG_SYS_CACHELINE_SIZE	64
-#define CONFIG_SYS_FULL_VA
 
 #define CONFIG_IDENT_STRING		"hikey"
 
diff --git a/include/configs/tegra210-common.h b/include/configs/tegra210-common.h
index 2a6e317..8f35a7b 100644
--- a/include/configs/tegra210-common.h
+++ b/include/configs/tegra210-common.h
@@ -13,8 +13,6 @@
 /* Cortex-A57 uses a cache line size of 64 bytes */
 #define CONFIG_SYS_CACHELINE_SIZE	64
 
-#define CONFIG_SYS_FULL_VA
-
 /*
  * NS16550 Configuration
  */
diff --git a/include/configs/thunderx_88xx.h b/include/configs/thunderx_88xx.h
index 64e4616..736d0a5 100644
--- a/include/configs/thunderx_88xx.h
+++ b/include/configs/thunderx_88xx.h
@@ -22,22 +22,8 @@
 
 #define MEM_BASE			0x00500000
 
-#define CONFIG_SYS_FULL_VA
-
 #define CONFIG_SYS_LOWMEM_BASE		MEM_BASE
 
-#define CONFIG_SYS_MEM_MAP_SIZE		3
-
-#define CONFIG_SYS_VA_BITS		48
-#define CONFIG_SYS_PTL2_BITS		42
-#define CONFIG_SYS_BLOCK_SHIFT		29
-#define CONFIG_SYS_PTL1_ENTRIES		64
-#define CONFIG_SYS_PTL2_ENTRIES		8192
-
-#define CONFIG_SYS_PGTABLE_SIZE		\
-	((CONFIG_SYS_PTL1_ENTRIES + \
-	  CONFIG_SYS_MEM_MAP_SIZE * CONFIG_SYS_PTL2_ENTRIES) * 8)
-
 /* Link Definitions */
 #define CONFIG_SYS_TEXT_BASE		0x00500000
 #define CONFIG_SYS_INIT_SP_ADDR		(CONFIG_SYS_SDRAM_BASE + 0x7fff0)
diff --git a/include/configs/vexpress_aemv8a.h b/include/configs/vexpress_aemv8a.h
index ddb9848..2949170 100644
--- a/include/configs/vexpress_aemv8a.h
+++ b/include/configs/vexpress_aemv8a.h
@@ -21,7 +21,6 @@
 
 /* MMU Definitions */
 #define CONFIG_SYS_CACHELINE_SIZE	64
-#define CONFIG_SYS_FULL_VA
 
 #define CONFIG_IDENT_STRING		" vexpress_aemv8a"
 #define CONFIG_BOOTP_VCI_STRING		"U-Boot.armv8.vexpress_aemv8a"
diff --git a/include/configs/xilinx_zynqmp.h b/include/configs/xilinx_zynqmp.h
index 08f430c..da868b8 100644
--- a/include/configs/xilinx_zynqmp.h
+++ b/include/configs/xilinx_zynqmp.h
@@ -29,8 +29,6 @@
 #define CONFIG_SYS_MEMTEST_START	CONFIG_SYS_SDRAM_BASE
 #define CONFIG_SYS_MEMTEST_END		CONFIG_SYS_SDRAM_SIZE
 
-#define CONFIG_SYS_FULL_VA
-
 /* Have release address at the end of 256MB for now */
 #define CPU_RELEASE_ADDR	0xFFFFFF0
 
-- 
1.8.5.6
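
A note on the PMD_SECT_* to PTE_BLOCK_* rename in this patch: the bit positions stay the same, so a block descriptor built by the now board-local set_pgtable_section() is unchanged. Below is a minimal sketch of that composition. The macro values mirror the PMD_* definitions removed above, PTE_TYPE_BLOCK is assumed to equal the old PMD_TYPE_SECT, and make_block_desc() is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values, copied from the PMD_* definitions this patch removes */
#define PTE_TYPE_BLOCK		(UINT64_C(1) << 0)	/* assumed == PMD_TYPE_SECT */
#define PTE_BLOCK_MEMTYPE(x)	((uint64_t)(x) << 2)
#define PTE_BLOCK_NS		(UINT64_C(1) << 5)
#define PTE_BLOCK_OUTER_SHARE	(UINT64_C(2) << 8)
#define PTE_BLOCK_AF		(UINT64_C(1) << 10)

/* Hypothetical helper: compose one block descriptor the way
 * set_pgtable_section() does before storing it in the table. */
static uint64_t make_block_desc(uint64_t phys, unsigned int attr_index,
				uint64_t attrs)
{
	/* AttrIndx is a 3-bit field that selects a MAIR slot */
	assert(attr_index < 8);

	return phys | PTE_TYPE_BLOCK | PTE_BLOCK_AF |
	       PTE_BLOCK_MEMTYPE(attr_index) | attrs;
}
```

Called as make_block_desc(0x80000000, 4, PTE_BLOCK_OUTER_SHARE | PTE_BLOCK_NS), this gives the same value whether the attribute bits are spelled with the old or the new names. The index 4 for normal memory is an assumption about the MAIR layout.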

* [U-Boot] [PATCH v4 11/11] arm64: Only allow dcache disabled in SPL builds
  2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
                   ` (9 preceding siblings ...)
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 10/11] arm64: Remove non-full-va map code Alexander Graf
@ 2016-02-26  0:49 ` Alexander Graf
  2016-02-26 18:37 ` [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Stephen Warren
  11 siblings, 0 replies; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  0:49 UTC (permalink / raw)
  To: u-boot

Now that we have an easy way to describe memory regions and enable the MMU,
there really shouldn't be anything holding people back from running with
caches enabled on AArch64. To make sure people notice early if they are
missing out on the caching fun, give them a compile error.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/arm/cpu/armv8/cache_v8.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c
index df15e00..d1bd06b 100644
--- a/arch/arm/cpu/armv8/cache_v8.c
+++ b/arch/arm/cpu/armv8/cache_v8.c
@@ -566,6 +566,15 @@ void mmu_set_region_dcache_behaviour(phys_addr_t start, size_t size,
 
 #else	/* CONFIG_SYS_DCACHE_OFF */
 
+/*
+ * For SPL builds, we may want to not have dcache enabled. Any real U-Boot
+ * running however really wants to have dcache and the MMU active. Check that
+ * everything is sane and give the developer a hint if it isn't.
+ */
+#ifndef CONFIG_SPL_BUILD
+#error Please describe your MMU layout in CONFIG_SYS_MEM_MAP and enable dcache.
+#endif
+
 void invalidate_dcache_all(void)
 {
 }
-- 
1.8.5.6

* [U-Boot] [PATCH v4 01/11] thunderx: Calculate TCR dynamically
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 01/11] thunderx: Calculate TCR dynamically Alexander Graf
@ 2016-02-26  2:51   ` Simon Glass
  2016-02-26  8:58     ` Alexander Graf
  0 siblings, 1 reply; 18+ messages in thread
From: Simon Glass @ 2016-02-26  2:51 UTC (permalink / raw)
  To: u-boot

Hi,

On 25 February 2016 at 17:49, Alexander Graf <agraf@suse.de> wrote:
> Based on the memory map we can determine a lot of hard coded fields of
> TCR, like the maximum VA and max PA we want to support. Calculate those
> dynamically to reduce the chance for pit falls.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  arch/arm/cpu/armv8/cache_v8.c    | 59 +++++++++++++++++++++++++++++++++++++++-
>  arch/arm/include/asm/armv8/mmu.h |  6 +---
>  include/configs/thunderx_88xx.h  |  3 --
>  3 files changed, 59 insertions(+), 9 deletions(-)

This approach (extern variable) looks good to me.

Reviewed-by: Simon Glass <sjg@chromium.org>

I didn't know ARRAY_SIZE() worked on external variables, so I learned
something as well.

Regards,
Simon
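
For readers following along, the idea in the quoted commit message (derive the TCR fields from the memory map instead of hard coding them) can be sketched roughly as follows. This is not the code from the patch. struct region is a simplified stand-in for struct mm_region, pick_tcr_fields() is a hypothetical name, and the thresholds are an assumption based on the architectural IPS encodings (0 = 32 bits, 1 = 36, 2 = 40, 3 = 42, 4 = 44, 5 = 48).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for struct mm_region */
struct region {
	uint64_t base;
	uint64_t size;	/* size == 0 terminates the list */
};

struct tcr_fields {
	unsigned int ips;	/* encoded physical address size */
	unsigned int va_bits;	/* T0SZ would be 64 - va_bits */
};

/* Find the highest mapped address and pick the smallest IPS/VA
 * configuration that still covers it. */
static struct tcr_fields pick_tcr_fields(const struct region *map)
{
	struct tcr_fields f = { 0, 32 };
	uint64_t max_addr = 0;

	assert(map != NULL);
	for (; map->size; map++) {
		uint64_t end = map->base + map->size;

		if (end > max_addr)
			max_addr = end;
	}

	if (max_addr > (1ULL << 44)) {
		f.ips = 5; f.va_bits = 48;
	} else if (max_addr > (1ULL << 42)) {
		f.ips = 4; f.va_bits = 44;
	} else if (max_addr > (1ULL << 40)) {
		f.ips = 3; f.va_bits = 42;
	} else if (max_addr > (1ULL << 36)) {
		f.ips = 2; f.va_bits = 40;
	} else if (max_addr > (1ULL << 32)) {
		f.ips = 1; f.va_bits = 36;
	}

	return f;
}
```

A map whose highest region ends at 64 GiB would come out as ips = 1 and va_bits = 36 under these assumptions, which is much smaller than a fixed 42-bit or 48-bit setup.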

* [U-Boot] [PATCH v4 01/11] thunderx: Calculate TCR dynamically
  2016-02-26  2:51   ` Simon Glass
@ 2016-02-26  8:58     ` Alexander Graf
  2016-02-26 15:39       ` Simon Glass
  0 siblings, 1 reply; 18+ messages in thread
From: Alexander Graf @ 2016-02-26  8:58 UTC (permalink / raw)
  To: u-boot



> On 26.02.2016 at 03:51, Simon Glass <sjg@chromium.org> wrote:
> 
> Hi,
> 
>> On 25 February 2016 at 17:49, Alexander Graf <agraf@suse.de> wrote:
>> Based on the memory map we can determine a lot of hard coded fields of
>> TCR, like the maximum VA and max PA we want to support. Calculate those
>> dynamically to reduce the chance for pit falls.
>> 
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>> arch/arm/cpu/armv8/cache_v8.c    | 59 +++++++++++++++++++++++++++++++++++++++-
>> arch/arm/include/asm/armv8/mmu.h |  6 +---
>> include/configs/thunderx_88xx.h  |  3 --
>> 3 files changed, 59 insertions(+), 9 deletions(-)
> 
> This approach (extern variable) looks good to me.
> 
> Reviewed-by: Simon Glass <sjg@chromium.org>
> 
> I didn't know ARRAY_SIZE() worked on external variables, so I leaned
> something also.

It doesn't. Could you please point me to a case where I'm still using ARRAY_SIZE in v4? I thought I had moved everything to null-terminating entries.


Alex

* [U-Boot] [PATCH v4 01/11] thunderx: Calculate TCR dynamically
  2016-02-26  8:58     ` Alexander Graf
@ 2016-02-26 15:39       ` Simon Glass
  2016-02-26 15:44         ` Alexander Graf
  0 siblings, 1 reply; 18+ messages in thread
From: Simon Glass @ 2016-02-26 15:39 UTC (permalink / raw)
  To: u-boot

Hi Alexander,

On 26 February 2016 at 01:58, Alexander Graf <agraf@suse.de> wrote:
>
>
>> On 26.02.2016 at 03:51, Simon Glass <sjg@chromium.org> wrote:
>>
>> Hi,
>>
>>> On 25 February 2016 at 17:49, Alexander Graf <agraf@suse.de> wrote:
>>> Based on the memory map we can determine a lot of hard coded fields of
>>> TCR, like the maximum VA and max PA we want to support. Calculate those
>>> dynamically to reduce the chance for pit falls.
>>>
>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>> ---
>>> arch/arm/cpu/armv8/cache_v8.c    | 59 +++++++++++++++++++++++++++++++++++++++-
>>> arch/arm/include/asm/armv8/mmu.h |  6 +---
>>> include/configs/thunderx_88xx.h  |  3 --
>>> 3 files changed, 59 insertions(+), 9 deletions(-)
>>
>> This approach (extern variable) looks good to me.
>>
>> Reviewed-by: Simon Glass <sjg@chromium.org>
>>
>> I didn't know ARRAY_SIZE() worked on external variables, so I leaned
>> something also.
>
> It doesn't. Could you please point me to a case where I'm still using ARRAY_SIZE in v4? I thought I had moved everything to null-terminating entries.

I imagined some gcc linker sizeof() magic. Oh well.

See near the top of get_tcr() - the patch on this thread.

Regards,
Simon

* [U-Boot] [PATCH v4 01/11] thunderx: Calculate TCR dynamically
  2016-02-26 15:39       ` Simon Glass
@ 2016-02-26 15:44         ` Alexander Graf
  2016-02-26 15:50           ` Simon Glass
  0 siblings, 1 reply; 18+ messages in thread
From: Alexander Graf @ 2016-02-26 15:44 UTC (permalink / raw)
  To: u-boot

On 02/26/2016 04:39 PM, Simon Glass wrote:
> Hi Alexander,
>
> On 26 February 2016 at 01:58, Alexander Graf <agraf@suse.de> wrote:
>>
>>> On 26.02.2016 at 03:51, Simon Glass <sjg@chromium.org> wrote:
>>>
>>> Hi,
>>>
>>>> On 25 February 2016 at 17:49, Alexander Graf <agraf@suse.de> wrote:
>>>> Based on the memory map we can determine a lot of hard coded fields of
>>>> TCR, like the maximum VA and max PA we want to support. Calculate those
>>>> dynamically to reduce the chance for pit falls.
>>>>
>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>> ---
>>>> arch/arm/cpu/armv8/cache_v8.c    | 59 +++++++++++++++++++++++++++++++++++++++-
>>>> arch/arm/include/asm/armv8/mmu.h |  6 +---
>>>> include/configs/thunderx_88xx.h  |  3 --
>>>> 3 files changed, 59 insertions(+), 9 deletions(-)
>>> This approach (extern variable) looks good to me.
>>>
>>> Reviewed-by: Simon Glass <sjg@chromium.org>
>>>
>>> I didn't know ARRAY_SIZE() worked on external variables, so I leaned
>>> something also.
>> It doesn't. Could you please point me to a case where I'm still using ARRAY_SIZE in v4? I thought I had moved everything to null-terminating entries.
> I imagined some gcc linker sizeof() magic. Oh well.
>
> See near the top of get_tcr() - the patch on this thread.

In 1/11 the mem_map is still a locally set array which gets 
preinitialized to a constant array. That one is supported by ARRAY_SIZE.

Later, in "thunderx: Move mmu table into board file", I move the mem_map
variable into the board files and change all ARRAY_SIZE instances to loops
that run until they hit an empty entry. This is mostly to make review
and bisectability easier, since that way only a few things change at a time.
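
To illustrate the difference, here is a minimal standalone sketch (not code from the series). struct region, local_map, count_local() and count_regions() are hypothetical names. ARRAY_SIZE only works where the compiler can see the full array type; once the map is reached through a pointer, only a terminator entry tells you where the list ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(x)	(sizeof(x) / sizeof((x)[0]))

/* Simplified, hypothetical stand-in for struct mm_region */
struct region {
	uint64_t base;
	uint64_t size;
};

/* Defined in this translation unit, so sizeof() sees the whole array */
static struct region local_map[] = {
	{ 0x00000000, 0x40000000 },
	{ 0x40000000, 0x40000000 },
	{ 0, 0 },	/* list terminator */
};

/* Works only because local_map has a complete array type here */
static size_t count_local(void)
{
	return ARRAY_SIZE(local_map) - 1;	/* minus the terminator */
}

/* Through a pointer the size is unknown, so walk to the empty entry */
static size_t count_regions(const struct region *map)
{
	size_t n = 0;

	assert(map != NULL);
	while (map[n].size)
		n++;

	return n;
}
```

Both functions report two regions for local_map, but only count_regions() keeps working when the array lives in a board file and is exported as a plain struct pointer.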


Alex

* [U-Boot] [PATCH v4 01/11] thunderx: Calculate TCR dynamically
  2016-02-26 15:44         ` Alexander Graf
@ 2016-02-26 15:50           ` Simon Glass
  0 siblings, 0 replies; 18+ messages in thread
From: Simon Glass @ 2016-02-26 15:50 UTC (permalink / raw)
  To: u-boot

Hi Alex,

On 26 February 2016 at 08:44, Alexander Graf <agraf@suse.de> wrote:
> On 02/26/2016 04:39 PM, Simon Glass wrote:
>>
>> Hi Alexander,
>>
>> On 26 February 2016 at 01:58, Alexander Graf <agraf@suse.de> wrote:
>>>
>>>
>>>> On 26.02.2016 at 03:51, Simon Glass <sjg@chromium.org> wrote:
>>>>
>>>> Hi,
>>>>
>>>>> On 25 February 2016 at 17:49, Alexander Graf <agraf@suse.de> wrote:
>>>>> Based on the memory map we can determine a lot of hard coded fields of
>>>>> TCR, like the maximum VA and max PA we want to support. Calculate those
>>>>> dynamically to reduce the chance for pit falls.
>>>>>
>>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>>> ---
>>>>> arch/arm/cpu/armv8/cache_v8.c    | 59
>>>>> +++++++++++++++++++++++++++++++++++++++-
>>>>> arch/arm/include/asm/armv8/mmu.h |  6 +---
>>>>> include/configs/thunderx_88xx.h  |  3 --
>>>>> 3 files changed, 59 insertions(+), 9 deletions(-)
>>>>
>>>> This approach (extern variable) looks good to me.
>>>>
>>>> Reviewed-by: Simon Glass <sjg@chromium.org>
>>>>
>>>> I didn't know ARRAY_SIZE() worked on external variables, so I leaned
>>>> something also.
>>>
>>> It doesn't. Could you please point me to a case where I'm still using
>>> ARRAY_SIZE in v4? I thought I had moved everything to null-terminating
>>> entries.
>>
>> I imagined some gcc linker sizeof() magic. Oh well.
>>
>> See near the top of get_tcr() - the patch on this thread.
>
>
> In 1/11 the mem_map is still a locally set array which gets preinitialized
> to a constant array. That one is supported by ARRAY_SIZE.
>
> Later in "thunderx: Move mmu table into board file" I move the mem_map
> variable into board files and change all ARRAY_SIZE instances to loops that
> loop until they hit an empty entry. This is mostly to make review and
> bisectability easier, since that way only few things change at a time.

OK, makes sense. Sorry I haven't reviewed it very closely, but you seem
to have got plenty of input.

Regards,
Simon

* [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3
  2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
                   ` (10 preceding siblings ...)
  2016-02-26  0:49 ` [U-Boot] [PATCH v4 11/11] arm64: Only allow dcache disabled in SPL builds Alexander Graf
@ 2016-02-26 18:37 ` Stephen Warren
  11 siblings, 0 replies; 18+ messages in thread
From: Stephen Warren @ 2016-02-26 18:37 UTC (permalink / raw)
  To: u-boot

On 02/25/2016 05:49 PM, Alexander Graf wrote:
> Howdy,
>
> Currently on arm64 there is a big pile of mess when it comes to MMU
> support and page tables. Each board does its own little thing and the
> generic code is pretty dumb and nobody actually uses it.
>
> This patch set tries to clean that up. After this series is applied,
> all boards except for the FSL Layerscape ones are converted to the
> new generic page table logic and have icache+dcache enabled.
>
> The new code always uses 4k page size. It dynamically allocates 1G or
> 2M pages for ranges that fit. When a dcache attribute request comes in
> that requires a smaller granularity than our previous allocation could
> fulfill, pages get automatically split.
>
> I have tested and verified the code works on HiKey (bare metal),
> vexpress64 (Foundation Model) and zynqmp (QEMU). The TX1 target is
> untested, but given the simplicity of the maps I doubt it'll break.
> ThunderX in theory should also work, but I haven't tested it. I would
> be very happy if people with access to those systems could give the patch
> set a try.
>
> With this we're a big step closer to a good base line for EFI payload
> support, since we can now just require that all boards always have dcache
> enabled.
>
> I would also be incredibly happy if some Freescale people could look
> at their MMU code and try to unify it into the now cleaned up generic
> code. I don't think we're far off here.

Tested-by: Stephen Warren <swarren@nvidia.com>
(On p2371-0000 and p2371-2180)
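
As an illustration of the "1G or 2M pages for ranges that fit" behaviour described in the quoted cover letter, the following is a rough sketch of how a mapping size could be chosen with a 4k granule. It is not taken from the series: `pick_block_size()` and `count_entries()` are hypothetical helpers, and the rule used here (take the largest size to which the address is aligned and that still fits in the remaining length) is an assumption about what "fit" means. Splitting an existing large mapping when a finer dcache attribute is requested is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K	0x1000ULL
#define SZ_2M	0x200000ULL
#define SZ_1G	0x40000000ULL

/*
 * Return the largest mapping size usable at addr: the address must be
 * aligned to that size and at least that many bytes must remain.
 */
static uint64_t pick_block_size(uint64_t addr, uint64_t remaining)
{
	static const uint64_t sizes[] = { SZ_1G, SZ_2M, SZ_4K };

	/* Both values are assumed to be 4k aligned and non-empty. */
	assert(addr % SZ_4K == 0);
	assert(remaining % SZ_4K == 0 && remaining > 0);

	for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		if (addr % sizes[i] == 0 && remaining >= sizes[i])
			return sizes[i];
	}
	return SZ_4K;	/* not reached given the assertions above */
}

/* Count how many table entries a greedy walk over the range would use. */
static unsigned int count_entries(uint64_t addr, uint64_t size)
{
	unsigned int n = 0;

	while (size) {
		uint64_t block = pick_block_size(addr, size);

		addr += block;
		size -= block;
		n++;
	}
	return n;
}
```

Under these assumptions, a range starting at 0x40000000 with length 1G + 2M is covered by one 1G entry followed by one 2M entry, whereas a range that is only 4k aligned falls back to 4k pages until it reaches a 2M boundary.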


Thread overview: 18+ messages
2016-02-26  0:49 [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Alexander Graf
2016-02-26  0:49 ` [U-Boot] [PATCH v4 01/11] thunderx: Calculate TCR dynamically Alexander Graf
2016-02-26  2:51   ` Simon Glass
2016-02-26  8:58     ` Alexander Graf
2016-02-26 15:39       ` Simon Glass
2016-02-26 15:44         ` Alexander Graf
2016-02-26 15:50           ` Simon Glass
2016-02-26  0:49 ` [U-Boot] [PATCH v4 02/11] arm64: Disable TTBR1 maps in EL1 Alexander Graf
2016-02-26  0:49 ` [U-Boot] [PATCH v4 03/11] arm64: Make full va map code more dynamic Alexander Graf
2016-02-26  0:49 ` [U-Boot] [PATCH v4 04/11] thunderx: Move mmu table into board file Alexander Graf
2016-02-26  0:49 ` [U-Boot] [PATCH v4 05/11] zymqmp: Replace home grown mmu code with generic table approach Alexander Graf
2016-02-26  0:49 ` [U-Boot] [PATCH v4 06/11] tegra: " Alexander Graf
2016-02-26  0:49 ` [U-Boot] [PATCH v4 07/11] vexpress64: Add MMU tables Alexander Graf
2016-02-26  0:49 ` [U-Boot] [PATCH v4 08/11] dwmmc: Increase retry timeout Alexander Graf
2016-02-26  0:49 ` [U-Boot] [PATCH v4 09/11] hikey: Add MMU tables Alexander Graf
2016-02-26  0:49 ` [U-Boot] [PATCH v4 10/11] arm64: Remove non-full-va map code Alexander Graf
2016-02-26  0:49 ` [U-Boot] [PATCH v4 11/11] arm64: Only allow dcache disabled in SPL builds Alexander Graf
2016-02-26 18:37 ` [U-Boot] [PATCH v4 00/11] arm64: Unify MMU code v3 Stephen Warren