* [U-Boot] [PATCH 0/5] Enable caches for the RPi2
@ 2016-03-15 17:21 Alexander Graf
  2016-03-15 17:21 ` [U-Boot] [PATCH 1/5] arm64: Add 32bit arm compatible dcache definitions Alexander Graf
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Alexander Graf @ 2016-03-15 17:21 UTC (permalink / raw)
  To: u-boot

This patch set converts the Raspberry Pi 2 system to properly make use of
the caches available in it.

Because we're running in HYP mode, we first need to teach U-Boot how to
make use of HYP registers and the LPAE page layout which is mandated by
hardware when running in HYP mode.

Then while we're at it, also mark the frame buffer cached to speed up
screen updates.

With this patch set, my Raspberry Pi 3 running in AArch32 mode is a *lot*
faster than without.

Please verify that the code works on an RPi2 as well. In theory it should,
but I only have an RPi 3 available here to test on.
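
To make the LPAE layout mentioned above a bit more concrete, here is a small
stand-alone sketch of the sizing arithmetic that patch 2/5 relies on (2MB
sections, 64-bit descriptors, one extra page for the first-level table). The
helper names and the LPAE_* macros are made up for illustration and are not
part of the series; only the numbers are taken from the patch.

```c
#include <stdint.h>

/* Values taken from patch 2/5; the macro names are illustrative only. */
#define LPAE_SECTION_SHIFT	21u	/* 2MB sections with LPAE */
#define LPAE_PTE_SIZE		8u	/* long descriptors are 64 bits wide */
#define LPAE_PAGE_SIZE		4096u

/* Hypothetical helper: number of 2MB section entries needed to cover 4GB */
static inline uint32_t lpae_num_sections(void)
{
	return (uint32_t)((4096ULL * 1024 * 1024) >> LPAE_SECTION_SHIFT);
}

/* Hypothetical helper: second-level tables plus one first-level page */
static inline uint32_t lpae_pgtable_bytes(void)
{
	/* 2048 entries * 8 bytes = 4 pages of second-level descriptors */
	uint32_t l2_bytes = lpae_num_sections() * LPAE_PTE_SIZE;

	/* one more page holds the 4 first-level entries */
	return l2_bytes + LPAE_PAGE_SIZE;
}

/* Hypothetical helper: section index of a 32-bit physical address */
static inline uint32_t lpae_section_index(uint32_t addr)
{
	return addr >> LPAE_SECTION_SHIFT;
}
```

Under these assumptions the total comes out at five 4KB pages, which matches
the PGTABLE_SIZE of (4096 * 5) that the patch selects for LPAE.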

Alexander Graf (5):
  arm64: Add 32bit arm compatible dcache definitions
  arm: Add support for HYP mode and LPAE page tables
  lcd: Fix compile warning in 64bit mode
  RPi: Enable caches for rpi2
  bcm2835 video: Map fb as cached

 arch/arm/include/asm/system.h | 105 +++++++++++++++++++++++++++++++++++++++---
 arch/arm/lib/cache-cp15.c     |  66 +++++++++++++++++++++++---
 arch/arm/mach-bcm283x/init.c  |   7 +++
 common/lcd.c                  |   4 +-
 drivers/video/bcm2835.c       |   6 +++
 include/configs/rpi_2.h       |   2 +-
 6 files changed, 174 insertions(+), 16 deletions(-)

-- 
1.8.5.6


* [U-Boot] [PATCH 1/5] arm64: Add 32bit arm compatible dcache definitions
  2016-03-15 17:21 [U-Boot] [PATCH 0/5] Enable caches for the RPi2 Alexander Graf
@ 2016-03-15 17:21 ` Alexander Graf
  2016-03-16 17:55   ` Andreas Färber
  2016-03-15 17:21 ` [U-Boot] [PATCH 2/5] arm: Add support for HYP mode and LPAE page tables Alexander Graf
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2016-03-15 17:21 UTC (permalink / raw)
  To: u-boot

We want to be able to reuse device drivers from 32bit code, so let's add
definitions for all the dcache options that 32bit code has.

While at it, fix up the DCACHE_OFF configuration. That was setting the bits
to declare a PTE a PTE and left the MAIR index bit at 0. Drop the useless
bits and make the index explicit.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/arm/include/asm/system.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
index ac1173d..832c1db 100644
--- a/arch/arm/include/asm/system.h
+++ b/arch/arm/include/asm/system.h
@@ -26,8 +26,12 @@ u64 get_page_table_size(void);
 #define MMU_SECTION_SHIFT	21
 #define MMU_SECTION_SIZE	(1 << MMU_SECTION_SHIFT)
 
+/* These constants need to be synced to the MT_ types in asm/armv8/mmu.h */
 enum dcache_option {
-	DCACHE_OFF = 0x3,
+	DCACHE_OFF = 0 << 2,
+	DCACHE_WRITETHROUGH = 3 << 2,
+	DCACHE_WRITEBACK = 4 << 2,
+	DCACHE_WRITEALLOC = 4 << 2,
 };
 
 #define isb()				\
-- 
1.8.5.6


* [U-Boot] [PATCH 2/5] arm: Add support for HYP mode and LPAE page tables
  2016-03-15 17:21 [U-Boot] [PATCH 0/5] Enable caches for the RPi2 Alexander Graf
  2016-03-15 17:21 ` [U-Boot] [PATCH 1/5] arm64: Add 32bit arm compatible dcache definitions Alexander Graf
@ 2016-03-15 17:21 ` Alexander Graf
  2016-03-15 17:35   ` Tom Rini
  2016-03-15 17:21 ` [U-Boot] [PATCH 3/5] lcd: Fix compile warning in 64bit mode Alexander Graf
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2016-03-15 17:21 UTC (permalink / raw)
  To: u-boot

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/arm/include/asm/system.h | 99 ++++++++++++++++++++++++++++++++++++++++---
 arch/arm/lib/cache-cp15.c     | 66 ++++++++++++++++++++++++++---
 2 files changed, 153 insertions(+), 12 deletions(-)

diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
index 832c1db..9d0f32b 100644
--- a/arch/arm/include/asm/system.h
+++ b/arch/arm/include/asm/system.h
@@ -176,7 +176,9 @@ void smc_call(struct pt_regs *args);
 #define CR_AFE	(1 << 29)	/* Access flag enable			*/
 #define CR_TE	(1 << 30)	/* Thumb exception enable		*/
 
-#ifndef PGTABLE_SIZE
+#if defined(CONFIG_SYS_ARM_LPAE) && !defined(PGTABLE_SIZE)
+#define PGTABLE_SIZE		(4096 * 5)
+#elif !defined(PGTABLE_SIZE)
 #define PGTABLE_SIZE		(4096 * 4)
 #endif
 
@@ -233,17 +235,50 @@ void save_boot_params_ret(void);
 #define wfi()
 #endif
 
+static inline unsigned long get_cpsr(void)
+{
+	unsigned long cpsr;
+
+	asm volatile("mrs %0, cpsr" : "=r"(cpsr): );
+	return cpsr;
+}
+
+static inline int is_hyp(void)
+{
+#ifdef CONFIG_SYS_ARM_LPAE
+	/* HYP mode requires LPAE ... */
+	return ((get_cpsr() & 0x1f) == 0x1a);
+#else
+	/* ... so without LPAE support we can optimize all hyp code away */
+	return 0;
+#endif
+}
+
 static inline unsigned int get_cr(void)
 {
 	unsigned int val;
-	asm volatile("mrc p15, 0, %0, c1, c0, 0	@ get CR" : "=r" (val) : : "cc");
+
+	if (is_hyp())
+		asm volatile("mrc p15, 4, %0, c1, c0, 0	@ get CR" : "=r" (val)
+								  :
+								  : "cc");
+	else
+		asm volatile("mrc p15, 0, %0, c1, c0, 0	@ get CR" : "=r" (val)
+								  :
+								  : "cc");
 	return val;
 }
 
 static inline void set_cr(unsigned int val)
 {
-	asm volatile("mcr p15, 0, %0, c1, c0, 0	@ set CR"
-	  : : "r" (val) : "cc");
+	if (is_hyp())
+		asm volatile("mcr p15, 4, %0, c1, c0, 0	@ set CR" :
+								  : "r" (val)
+								  : "cc");
+	else
+		asm volatile("mcr p15, 0, %0, c1, c0, 0	@ set CR" :
+								  : "r" (val)
+								  : "cc");
 	isb();
 }
 
@@ -261,12 +296,59 @@ static inline void set_dacr(unsigned int val)
 	isb();
 }
 
-#ifdef CONFIG_CPU_V7
+#ifdef CONFIG_SYS_ARM_LPAE
+/* Long-Descriptor Translation Table Level 1/2 Bits */
+#define TTB_SECT_XN_MASK	(1ULL << 54)
+#define TTB_SECT_NG_MASK	(1 << 11)
+#define TTB_SECT_AF		(1 << 10)
+#define TTB_SECT_SH_MASK	(3 << 8)
+#define TTB_SECT_NS_MASK	(1 << 5)
+#define TTB_SECT_AP		(1 << 6)
+/* Note: TTB AP bits are set elsewhere */
+#define TTB_SECT_MAIR(x)	((x & 0x7) << 2) /* Index into MAIR */
+#define TTB_SECT		(1 << 0)
+#define TTB_PAGETABLE		(3 << 0)
+
+/* TTBCR flags */
+#define TTBCR_EAE		(1 << 31)
+#define TTBCR_T0SZ(x)		((x) << 0)
+#define TTBCR_T1SZ(x)		((x) << 16)
+#define TTBCR_USING_TTBR0	(TTBCR_T0SZ(0) | TTBCR_T1SZ(0))
+#define TTBCR_IRGN0_NC		(0 << 8)
+#define TTBCR_IRGN0_WBWA	(1 << 8)
+#define TTBCR_IRGN0_WT		(2 << 8)
+#define TTBCR_IRGN0_WBNWA	(3 << 8)
+#define TTBCR_IRGN0_MASK	(3 << 8)
+#define TTBCR_ORGN0_NC		(0 << 10)
+#define TTBCR_ORGN0_WBWA	(1 << 10)
+#define TTBCR_ORGN0_WT		(2 << 10)
+#define TTBCR_ORGN0_WBNWA	(3 << 10)
+#define TTBCR_ORGN0_MASK	(3 << 10)
+#define TTBCR_SHARED_NON	(0 << 12)
+#define TTBCR_SHARED_OUTER	(2 << 12)
+#define TTBCR_SHARED_INNER	(3 << 12)
+#define TTBCR_EPD0		(0 << 7)
+
+/*
+ * Memory types
+ */
+#define MEMORY_ATTRIBUTES	((0x00 << (0 * 8)) | (0x88 << (1 * 8)) | \
+				 (0xcc << (2 * 8)) | (0xff << (3 * 8)))
+
+/* options available for data cache on each page */
+enum dcache_option {
+	DCACHE_OFF = TTB_SECT | TTB_SECT_MAIR(0),
+	DCACHE_WRITETHROUGH = TTB_SECT | TTB_SECT_MAIR(1),
+	DCACHE_WRITEBACK = TTB_SECT | TTB_SECT_MAIR(2),
+	DCACHE_WRITEALLOC = TTB_SECT | TTB_SECT_MAIR(3),
+};
+#elif defined(CONFIG_CPU_V7)
 /* Short-Descriptor Translation Table Level 1 Bits */
 #define TTB_SECT_NS_MASK	(1 << 19)
 #define TTB_SECT_NG_MASK	(1 << 17)
 #define TTB_SECT_S_MASK		(1 << 16)
 /* Note: TTB AP bits are set elsewhere */
+#define TTB_SECT_AP		(3 << 10)
 #define TTB_SECT_TEX(x)		((x & 0x7) << 12)
 #define TTB_SECT_DOMAIN(x)	((x & 0xf) << 5)
 #define TTB_SECT_XN_MASK	(1 << 4)
@@ -282,6 +364,7 @@ enum dcache_option {
 	DCACHE_WRITEALLOC = DCACHE_WRITEBACK | TTB_SECT_TEX(1),
 };
 #else
+#define TTB_SECT_AP		(3 << 10)
 /* options available for data cache on each page */
 enum dcache_option {
 	DCACHE_OFF = 0x12,
@@ -293,7 +376,11 @@ enum dcache_option {
 
 /* Size of an MMU section */
 enum {
-	MMU_SECTION_SHIFT	= 20,
+#ifdef CONFIG_SYS_ARM_LPAE
+	MMU_SECTION_SHIFT	= 21, /* 2MB */
+#else
+	MMU_SECTION_SHIFT	= 20, /* 1MB */
+#endif
 	MMU_SECTION_SIZE	= 1 << MMU_SECTION_SHIFT,
 };
 
diff --git a/arch/arm/lib/cache-cp15.c b/arch/arm/lib/cache-cp15.c
index 8e18538..849cb89 100644
--- a/arch/arm/lib/cache-cp15.c
+++ b/arch/arm/lib/cache-cp15.c
@@ -34,11 +34,22 @@ static void cp_delay (void)
 
 void set_section_dcache(int section, enum dcache_option option)
 {
+#ifdef CONFIG_SYS_ARM_LPAE
+	u64 *page_table = (u64 *)gd->arch.tlb_addr;
+	/* Need to set the access flag to not fault */
+	u64 value = TTB_SECT_AP | TTB_SECT_AF;
+#else
 	u32 *page_table = (u32 *)gd->arch.tlb_addr;
-	u32 value;
+	u32 value = TTB_SECT_AP;
+#endif
+
+	/* Add the page offset */
+	value |= ((u32)section << MMU_SECTION_SHIFT);
 
-	value = (section << MMU_SECTION_SHIFT) | (3 << 10);
+	/* Add caching bits */
 	value |= option;
+
+	/* Set PTE */
 	page_table[section] = value;
 }
 
@@ -68,8 +79,9 @@ __weak void dram_bank_mmu_setup(int bank)
 	int	i;
 
 	debug("%s: bank: %d\n", __func__, bank);
-	for (i = bd->bi_dram[bank].start >> 20;
-	     i < (bd->bi_dram[bank].start >> 20) + (bd->bi_dram[bank].size >> 20);
+	for (i = bd->bi_dram[bank].start >> MMU_SECTION_SHIFT;
+	     i < (bd->bi_dram[bank].start >> MMU_SECTION_SHIFT) +
+		 (bd->bi_dram[bank].size >> MMU_SECTION_SHIFT);
 	     i++) {
 #if defined(CONFIG_SYS_ARM_CACHE_WRITETHROUGH)
 		set_section_dcache(i, DCACHE_WRITETHROUGH);
@@ -89,14 +101,56 @@ static inline void mmu_setup(void)
 
 	arm_init_before_mmu();
 	/* Set up an identity-mapping for all 4GB, rw for everyone */
-	for (i = 0; i < 4096; i++)
+	for (i = 0; i < ((4096ULL * 1024 * 1024) >> MMU_SECTION_SHIFT); i++)
 		set_section_dcache(i, DCACHE_OFF);
 
 	for (i = 0; i < CONFIG_NR_DRAM_BANKS; i++) {
 		dram_bank_mmu_setup(i);
 	}
 
-#ifdef CONFIG_CPU_V7
+#ifdef CONFIG_SYS_ARM_LPAE
+	/* Set up 4 PTE entries pointing to our 4 1GB page tables */
+	for (i = 0; i < 4; i++) {
+		u64 *page_table = (u64 *)(gd->arch.tlb_addr + (4096 * 4));
+		u64 tpt = gd->arch.tlb_addr + (4096 * i);
+		page_table[i] = tpt | TTB_PAGETABLE;
+	}
+
+	reg = TTBCR_EAE;
+#if defined(CONFIG_SYS_ARM_CACHE_WRITETHROUGH)
+	reg |= TTBCR_ORGN0_WT | TTBCR_IRGN0_WT;
+#elif defined(CONFIG_SYS_ARM_CACHE_WRITEALLOC)
+	reg |= TTBCR_ORGN0_WBWA | TTBCR_IRGN0_WBWA;
+#else
+	reg |= TTBCR_ORGN0_WBNWA | TTBCR_IRGN0_WBNWA;
+#endif
+
+	if (is_hyp()) {
+		/* Set HTCR to enable LPAE */
+		asm volatile("mcr p15, 4, %0, c2, c0, 2"
+			: : "r" (reg) : "memory");
+		/* Set HTTBR */
+		asm volatile("mcrr p15, 4, %0, %1, c2"
+			:
+			: "r"(gd->arch.tlb_addr + (4096 * 4)), "r"(0)
+			: "memory");
+		/* Set HMAIR */
+		asm volatile("mcr p15, 4, %0, c10, c2, 0"
+			: : "r" (MEMORY_ATTRIBUTES) : "memory");
+	} else {
+		/* Set TTBCR to enable LPAE */
+		asm volatile("mcr p15, 0, %0, c2, c0, 2"
+			: : "r" (reg) : "memory");
+		/* Set 64-bit TTBR0 */
+		asm volatile("mcrr p15, 0, %0, %1, c2"
+			:
+			: "r"(gd->arch.tlb_addr + (4096 * 4)), "r"(0)
+			: "memory");
+		/* Set MAIR */
+		asm volatile("mcr p15, 0, %0, c10, c2, 0"
+			: : "r" (MEMORY_ATTRIBUTES) : "memory");
+	}
+#elif defined(CONFIG_CPU_V7)
 	/* Set TTBR0 */
 	reg = gd->arch.tlb_addr & TTBR0_BASE_ADDR_MASK;
 #if defined(CONFIG_SYS_ARM_CACHE_WRITETHROUGH)
-- 
1.8.5.6


* [U-Boot] [PATCH 3/5] lcd: Fix compile warning in 64bit mode
  2016-03-15 17:21 [U-Boot] [PATCH 0/5] Enable caches for the RPi2 Alexander Graf
  2016-03-15 17:21 ` [U-Boot] [PATCH 1/5] arm64: Add 32bit arm compatible dcache definitions Alexander Graf
  2016-03-15 17:21 ` [U-Boot] [PATCH 2/5] arm: Add support for HYP mode and LPAE page tables Alexander Graf
@ 2016-03-15 17:21 ` Alexander Graf
  2016-03-15 17:21 ` [U-Boot] [PATCH 4/5] RPi: Enable caches for rpi2 Alexander Graf
  2016-03-15 17:21 ` [U-Boot] [PATCH 5/5] bcm2835 video: Map fb as cached Alexander Graf
  4 siblings, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2016-03-15 17:21 UTC (permalink / raw)
  To: u-boot

When compiling the code for 64bit, the lcd code emits warnings because it
casts pointers to 32bit values. Fix it by casting them to ulong instead,
which also matches the function prototype.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 common/lcd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/common/lcd.c b/common/lcd.c
index 51705ad..783626e 100644
--- a/common/lcd.c
+++ b/common/lcd.c
@@ -66,8 +66,8 @@ void lcd_sync(void)
 	int line_length;
 
 	if (lcd_flush_dcache)
-		flush_dcache_range((u32)lcd_base,
-			(u32)(lcd_base + lcd_get_size(&line_length)));
+		flush_dcache_range((ulong)lcd_base,
+			(ulong)(lcd_base + lcd_get_size(&line_length)));
 #endif
 }
 
-- 
1.8.5.6


* [U-Boot] [PATCH 4/5] RPi: Enable caches for rpi2
  2016-03-15 17:21 [U-Boot] [PATCH 0/5] Enable caches for the RPi2 Alexander Graf
                   ` (2 preceding siblings ...)
  2016-03-15 17:21 ` [U-Boot] [PATCH 3/5] lcd: Fix compile warning in 64bit mode Alexander Graf
@ 2016-03-15 17:21 ` Alexander Graf
  2016-03-15 17:21 ` [U-Boot] [PATCH 5/5] bcm2835 video: Map fb as cached Alexander Graf
  4 siblings, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2016-03-15 17:21 UTC (permalink / raw)
  To: u-boot

Now that we have support for running with caches enabled in HYP mode,
opt in to that on the Raspberry Pi 2. This brings a significant performance
boost.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/arm/mach-bcm283x/init.c | 7 +++++++
 include/configs/rpi_2.h      | 2 +-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mach-bcm283x/init.c b/arch/arm/mach-bcm283x/init.c
index d2d366b..2ec87c2 100644
--- a/arch/arm/mach-bcm283x/init.c
+++ b/arch/arm/mach-bcm283x/init.c
@@ -15,3 +15,10 @@ int arch_cpu_init(void)
 
 	return 0;
 }
+
+#ifdef CONFIG_SYS_ARM_LPAE
+void enable_caches(void)
+{
+	dcache_enable();
+}
+#endif
diff --git a/include/configs/rpi_2.h b/include/configs/rpi_2.h
index bea4ebd..14b807a 100644
--- a/include/configs/rpi_2.h
+++ b/include/configs/rpi_2.h
@@ -10,7 +10,7 @@
 #define CONFIG_SKIP_LOWLEVEL_INIT
 #define CONFIG_BCM2836
 #define CONFIG_SYS_CACHELINE_SIZE		64
-#define CONFIG_SYS_DCACHE_OFF
+#define CONFIG_SYS_ARM_LPAE
 
 #include "rpi-common.h"
 
-- 
1.8.5.6


* [U-Boot] [PATCH 5/5] bcm2835 video: Map fb as cached
  2016-03-15 17:21 [U-Boot] [PATCH 0/5] Enable caches for the RPi2 Alexander Graf
                   ` (3 preceding siblings ...)
  2016-03-15 17:21 ` [U-Boot] [PATCH 4/5] RPi: Enable caches for rpi2 Alexander Graf
@ 2016-03-15 17:21 ` Alexander Graf
  2016-03-16 18:00   ` Andreas Färber
  4 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2016-03-15 17:21 UTC (permalink / raw)
  To: u-boot

The bcm2835 frame buffer is in RAM, so we can easily map it as cached and gain
all the glorious performance boost that brings with it.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 drivers/video/bcm2835.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/video/bcm2835.c b/drivers/video/bcm2835.c
index bff1fcb..fe49f2e 100644
--- a/drivers/video/bcm2835.c
+++ b/drivers/video/bcm2835.c
@@ -106,6 +106,12 @@ void lcd_ctrl_init(void *lcdbase)
 
 	gd->fb_base = bus_to_phys(
 		msg_setup->allocate_buffer.body.resp.fb_address);
+
+	/* Enable dcache for the frame buffer */
+        mmu_set_region_dcache_behaviour(gd->fb_base,
+		ALIGN(PAGE_SIZE, msg_setup->allocate_buffer.body.resp.fb_size),
+		DCACHE_WRITEBACK);
+	lcd_set_flush_dcache(1);
 }
 
 void lcd_enable(void)
-- 
1.8.5.6


* [U-Boot] [PATCH 2/5] arm: Add support for HYP mode and LPAE page tables
  2016-03-15 17:21 ` [U-Boot] [PATCH 2/5] arm: Add support for HYP mode and LPAE page tables Alexander Graf
@ 2016-03-15 17:35   ` Tom Rini
  2016-03-16  8:33     ` Alexander Graf
  0 siblings, 1 reply; 13+ messages in thread
From: Tom Rini @ 2016-03-15 17:35 UTC (permalink / raw)
  To: u-boot

On Tue, Mar 15, 2016 at 06:21:45PM +0100, Alexander Graf wrote:

> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  arch/arm/include/asm/system.h | 99 ++++++++++++++++++++++++++++++++++++++++---
>  arch/arm/lib/cache-cp15.c     | 66 ++++++++++++++++++++++++++---
>  2 files changed, 153 insertions(+), 12 deletions(-)

I think in this patch we need to add SYS_ARM_LPAE to arch/arm/Kconfig
and then later select it under ARCH_BCM283X or TARGET_RPI_2 (I don't
know the SoC well enough to say which side is forcing this on us).

-- 
Tom

* [U-Boot] [PATCH 2/5] arm: Add support for HYP mode and LPAE page tables
  2016-03-15 17:35   ` Tom Rini
@ 2016-03-16  8:33     ` Alexander Graf
  2016-03-16 14:25       ` Tom Rini
  0 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2016-03-16  8:33 UTC (permalink / raw)
  To: u-boot



On 15.03.16 18:35, Tom Rini wrote:
> On Tue, Mar 15, 2016 at 06:21:45PM +0100, Alexander Graf wrote:
> 
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>>  arch/arm/include/asm/system.h | 99 ++++++++++++++++++++++++++++++++++++++++---
>>  arch/arm/lib/cache-cp15.c     | 66 ++++++++++++++++++++++++++---
>>  2 files changed, 153 insertions(+), 12 deletions(-)
> 
> I think in this patch we need to add SYS_ARM_LPAE to arch/arm/Kconfig
> and then later select it under ARCH_BCM283X or TARGET_RPI_2 (I don't
> know the SoC well enough to say which side is forcing this on us).

So you'd prefer to go via kconfig rather than a board #define?

The reason we need this is that U-Boot gets entered in HYP mode. This is
not specific to the SoC, it's specific to the way it boots, so it
belongs in the rpi2 configuration. The only other system I'm aware of
that supposedly works like this is the Calxeda Midway.
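
For illustration, the mode check that patch 2/5 performs in is_hyp() can be
written as a pure function on a CPSR value. This is only a sketch: the
function and macro names are invented here, while the 0x1f mask and the 0x1a
mode value are the ones used in the patch.

```c
#include <stdint.h>

/* Illustrative names; the mask and HYP value match is_hyp() in patch 2/5. */
#define CPSR_MODE_MASK	0x1fu	/* CPSR.M[4:0] holds the processor mode */
#define CPSR_MODE_SVC	0x13u	/* Supervisor mode */
#define CPSR_MODE_HYP	0x1au	/* Hyp mode */

/* Hypothetical helper: returns 1 if the given CPSR value is in HYP mode */
static inline int cpsr_is_hyp(uint32_t cpsr)
{
	return (cpsr & CPSR_MODE_MASK) == CPSR_MODE_HYP;
}
```

Taking the CPSR as a parameter instead of reading it with mrs keeps the check
testable on a host build.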


Alex


* [U-Boot] [PATCH 2/5] arm: Add support for HYP mode and LPAE page tables
  2016-03-16  8:33     ` Alexander Graf
@ 2016-03-16 14:25       ` Tom Rini
  0 siblings, 0 replies; 13+ messages in thread
From: Tom Rini @ 2016-03-16 14:25 UTC (permalink / raw)
  To: u-boot

On Wed, Mar 16, 2016 at 09:33:12AM +0100, Alexander Graf wrote:
> 
> 
> On 15.03.16 18:35, Tom Rini wrote:
> > On Tue, Mar 15, 2016 at 06:21:45PM +0100, Alexander Graf wrote:
> > 
> >> Signed-off-by: Alexander Graf <agraf@suse.de>
> >> ---
> >>  arch/arm/include/asm/system.h | 99 ++++++++++++++++++++++++++++++++++++++++---
> >>  arch/arm/lib/cache-cp15.c     | 66 ++++++++++++++++++++++++++---
> >>  2 files changed, 153 insertions(+), 12 deletions(-)
> > 
> > I think in this patch we need to add SYS_ARM_LPAE to arch/arm/Kconfig
> > and then later select it under ARCH_BCM283X or TARGET_RPI_2 (I don't
> > know the SoC well enough to say which side is forcing this on us).
> 
> So you'd prefer to go via kconfig rather than a board #define?

Yes.

> The reason we need this is that U-Boot gets entered in HYP mode. This is
> not specific to the SoC, it's specific to the way it boots, so it
> belongs in the rpi2 configuration. The only other system I'm aware of
> that supposedly works like this is the Calxeda Midway.

Right, OK, so selected by TARGET_RPI_2 it is :)

-- 
Tom

* [U-Boot] [PATCH 1/5] arm64: Add 32bit arm compatible dcache definitions
  2016-03-15 17:21 ` [U-Boot] [PATCH 1/5] arm64: Add 32bit arm compatible dcache definitions Alexander Graf
@ 2016-03-16 17:55   ` Andreas Färber
  2016-03-16 22:31     ` Alexander Graf
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Färber @ 2016-03-16 17:55 UTC (permalink / raw)
  To: u-boot

Am 15.03.2016 um 18:21 schrieb Alexander Graf:
> We want to be able to reuse device drivers from 32bit code, so let's add
> definitions for all the dcache options that 32bit code has.
> 
> While at it, fix up the DCACHE_OFF configuration. That was setting the bits
> to declare a PTE a PTE and left the MAIR index bit at 0. Drop the useless
> bits and make the index explicit.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  arch/arm/include/asm/system.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
> index ac1173d..832c1db 100644
> --- a/arch/arm/include/asm/system.h
> +++ b/arch/arm/include/asm/system.h
> @@ -26,8 +26,12 @@ u64 get_page_table_size(void);
>  #define MMU_SECTION_SHIFT	21
>  #define MMU_SECTION_SIZE	(1 << MMU_SECTION_SHIFT)
>  
> +/* These constants need to be synced to the MT_ types in asm/armv8/mmu.h */
>  enum dcache_option {
> -	DCACHE_OFF = 0x3,
> +	DCACHE_OFF = 0 << 2,
> +	DCACHE_WRITETHROUGH = 3 << 2,
> +	DCACHE_WRITEBACK = 4 << 2,
> +	DCACHE_WRITEALLOC = 4 << 2,

Is it intentional that these two have the same value?

Regards,
Andreas

>  };
>  
>  #define isb()				\

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Graham Norton; HRB 21284 (AG Nürnberg)


* [U-Boot] [PATCH 5/5] bcm2835 video: Map fb as cached
  2016-03-15 17:21 ` [U-Boot] [PATCH 5/5] bcm2835 video: Map fb as cached Alexander Graf
@ 2016-03-16 18:00   ` Andreas Färber
  2016-03-16 22:32     ` Alexander Graf
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Färber @ 2016-03-16 18:00 UTC (permalink / raw)
  To: u-boot

Am 15.03.2016 um 18:21 schrieb Alexander Graf:
> The bcm2835 frame buffer is in RAM, so we can easily map it as cached and gain
> all the glorious performance boost that brings with it.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  drivers/video/bcm2835.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/video/bcm2835.c b/drivers/video/bcm2835.c
> index bff1fcb..fe49f2e 100644
> --- a/drivers/video/bcm2835.c
> +++ b/drivers/video/bcm2835.c
> @@ -106,6 +106,12 @@ void lcd_ctrl_init(void *lcdbase)
>  
>  	gd->fb_base = bus_to_phys(
>  		msg_setup->allocate_buffer.body.resp.fb_address);
> +
> +	/* Enable dcache for the frame buffer */
> +        mmu_set_region_dcache_behaviour(gd->fb_base,

Spaces vs. tab.

Andreas

> +		ALIGN(PAGE_SIZE, msg_setup->allocate_buffer.body.resp.fb_size),
> +		DCACHE_WRITEBACK);
> +	lcd_set_flush_dcache(1);
>  }
>  
>  void lcd_enable(void)
> 


-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Graham Norton; HRB 21284 (AG Nürnberg)


* [U-Boot] [PATCH 1/5] arm64: Add 32bit arm compatible dcache definitions
  2016-03-16 17:55   ` Andreas Färber
@ 2016-03-16 22:31     ` Alexander Graf
  0 siblings, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2016-03-16 22:31 UTC (permalink / raw)
  To: u-boot



On 16.03.16 18:55, Andreas Färber wrote:
> Am 15.03.2016 um 18:21 schrieb Alexander Graf:
>> We want to be able to reuse device drivers from 32bit code, so let's add
>> definitions for all the dcache options that 32bit code has.
>>
>> While at it, fix up the DCACHE_OFF configuration. That was setting the bits
>> to declare a PTE a PTE and left the MAIR index bit at 0. Drop the useless
>> bits and make the index explicit.
>>
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>>  arch/arm/include/asm/system.h | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
>> index ac1173d..832c1db 100644
>> --- a/arch/arm/include/asm/system.h
>> +++ b/arch/arm/include/asm/system.h
>> @@ -26,8 +26,12 @@ u64 get_page_table_size(void);
>>  #define MMU_SECTION_SHIFT	21
>>  #define MMU_SECTION_SIZE	(1 << MMU_SECTION_SHIFT)
>>  
>> +/* These constants need to be synced to the MT_ types in asm/armv8/mmu.h */
>>  enum dcache_option {
>> -	DCACHE_OFF = 0x3,
>> +	DCACHE_OFF = 0 << 2,
>> +	DCACHE_WRITETHROUGH = 3 << 2,
>> +	DCACHE_WRITEBACK = 4 << 2,
>> +	DCACHE_WRITEALLOC = 4 << 2,
> 
> Is it intentional that these two have the same value?

Yes. We don't have any MAIR entry on AArch64 that defines writeback
without alloc:

#define MT_DEVICE_NGNRNE        0
#define MT_DEVICE_NGNRE         1
#define MT_DEVICE_GRE           2
#define MT_NORMAL_NC            3
#define MT_NORMAL               4

#define MEMORY_ATTRIBUTES       ((0x00 << (MT_DEVICE_NGNRNE * 8)) |     \
                                (0x04 << (MT_DEVICE_NGNRE * 8))   |     \
                                (0x0c << (MT_DEVICE_GRE * 8))     |     \
                                (0x44 << (MT_NORMAL_NC * 8))      |     \
                                (UL(0xff) << (MT_NORMAL * 8)))

So MAIR entries 0-4 are:

0: Device memory, Device-nGnRnE memory
1: Device memory, Device-nGnRE memory
2: Device memory, Device-GRE memory
3: Normal Memory, Outer Non-Cacheable, Inner Non-Cacheable
4: Normal Memory, Outer Write-back non-transient, Outer Read Allocate,
Outer Write Allocate, Inner Write-back non-transient, Inner Read
Allocate, Inner Write Allocate

But on armv7 we map memory as non-allocated by default. So I wanted to
make sure we stay compatible with our RAM maps. Basically on armv8,
"writeback" and "writealloc" both mean "cached RAM".
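
As a rough sketch of how this fits together, the snippet below packs the
AArch64 MAIR value quoted above and looks up the attribute byte that a given
dcache option selects. The helper names and the SKETCH_ enum are invented for
this example; the index values and attribute bytes are the ones listed in this
mail and in patch 1/5.

```c
#include <stdint.h>

/* MAIR indices as quoted above from asm/armv8/mmu.h */
#define MT_DEVICE_NGNRNE	0
#define MT_DEVICE_NGNRE		1
#define MT_DEVICE_GRE		2
#define MT_NORMAL_NC		3
#define MT_NORMAL		4

/* Same attribute bytes as the MEMORY_ATTRIBUTES definition quoted above */
#define SKETCH_MAIR	(((uint64_t)0x00 << (MT_DEVICE_NGNRNE * 8)) | \
			 ((uint64_t)0x04 << (MT_DEVICE_NGNRE * 8)) | \
			 ((uint64_t)0x0c << (MT_DEVICE_GRE * 8)) | \
			 ((uint64_t)0x44 << (MT_NORMAL_NC * 8)) | \
			 ((uint64_t)0xff << (MT_NORMAL * 8)))

/* Mirrors the enum values from patch 1/5 under illustrative names */
enum sketch_dcache_option {
	SKETCH_DCACHE_OFF = 0 << 2,
	SKETCH_DCACHE_WRITETHROUGH = 3 << 2,
	SKETCH_DCACHE_WRITEBACK = 4 << 2,
	SKETCH_DCACHE_WRITEALLOC = 4 << 2,
};

/* Hypothetical helper: the MAIR index lives in bits [4:2] of the option */
static inline unsigned int dcache_option_to_mair_index(unsigned int option)
{
	return (option >> 2) & 0x7;
}

/* Hypothetical helper: extract the 8-bit attribute for a MAIR index */
static inline uint8_t mair_attr(uint64_t mair, unsigned int index)
{
	return (uint8_t)(mair >> (index * 8));
}
```

With these values, WRITEBACK and WRITEALLOC both resolve to index 4 and the
attribute byte 0xff, which is why the two enum entries are identical.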


Alex


* [U-Boot] [PATCH 5/5] bcm2835 video: Map fb as cached
  2016-03-16 18:00   ` Andreas Färber
@ 2016-03-16 22:32     ` Alexander Graf
  0 siblings, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2016-03-16 22:32 UTC (permalink / raw)
  To: u-boot



On 16.03.16 19:00, Andreas Färber wrote:
> Am 15.03.2016 um 18:21 schrieb Alexander Graf:
>> The bcm2835 frame buffer is in RAM, so we can easily map it as cached and gain
>> all the glorious performance boost that brings with it.
>>
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>>  drivers/video/bcm2835.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/video/bcm2835.c b/drivers/video/bcm2835.c
>> index bff1fcb..fe49f2e 100644
>> --- a/drivers/video/bcm2835.c
>> +++ b/drivers/video/bcm2835.c
>> @@ -106,6 +106,12 @@ void lcd_ctrl_init(void *lcdbase)
>>  
>>  	gd->fb_base = bus_to_phys(
>>  		msg_setup->allocate_buffer.body.resp.fb_address);
>> +
>> +	/* Enable dcache for the frame buffer */
>> +        mmu_set_region_dcache_behaviour(gd->fb_base,
> 
> Spaces vs. tab.

If the rest of the patches are good in v2, I'd appreciate if whoever
applies this code could fix that up :).


Alex
