* [PATCH 00/14] Random LPAE-related patches
@ 2013-05-17 17:07 Will Deacon
  2013-05-17 17:07 ` [PATCH 01/14] ARM: LPAE: use signed arithmetic for mask definitions Will Deacon
                   ` (15 more replies)
  0 siblings, 16 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

This is a collection of LPAE-related patches (mostly fixes) that have
been kicking around for a while and that I've been collecting in one place.
There were some extra patches for bootmem, but they produced lots of
warnings and need more thought, so they've not been included here.

All comments welcome,

Will

Cyril Chemparathy (9):
  ARM: LPAE: use signed arithmetic for mask definitions
  ARM: LPAE: use phys_addr_t in switch_mm()
  ARM: LPAE: use 64-bit accessors for TTBR registers
  ARM: LPAE: factor out T1SZ and TTBR1 computations
  ARM: LPAE: accomodate >32-bit addresses for page table base
  ARM: mm: use physical addresses in highmem sanity checks
  ARM: fix type of PHYS_PFN_OFFSET to unsigned long
  ARM: mm: cleanup checks for membank overlap with vmalloc area
  ARM: mm: clean up membank size limit checks

Vitaly Andrianov (3):
  ARM: LPAE: use phys_addr_t in alloc_init_pud()
  ARM: LPAE: use phys_addr_t in free_memmap()
  ARM: LPAE: use phys_addr_t for initrd location

Will Deacon (2):
  ARM: lpae: fix definition of PTE_HWTABLE_PTRS
  ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions

 arch/arm/include/asm/memory.h               | 18 +++++++++-
 arch/arm/include/asm/page.h                 |  2 +-
 arch/arm/include/asm/pgtable-3level-hwdef.h | 20 +++++++++++
 arch/arm/include/asm/pgtable-3level.h       |  8 ++---
 arch/arm/include/asm/proc-fns.h             | 26 ++++++++++----
 arch/arm/include/uapi/asm/hwcap.h           |  2 +-
 arch/arm/kernel/head.S                      | 10 +++---
 arch/arm/kernel/setup.c                     |  8 ++++-
 arch/arm/kernel/smp.c                       | 11 ++++--
 arch/arm/mm/context.c                       |  9 ++---
 arch/arm/mm/init.c                          | 19 ++++++-----
 arch/arm/mm/mmu.c                           | 49 +++++++++-----------------
 arch/arm/mm/proc-v7-3level.S                | 53 +++++++++++++++--------------
 13 files changed, 139 insertions(+), 96 deletions(-)

-- 
1.8.2.2


* [PATCH 01/14] ARM: LPAE: use signed arithmetic for mask definitions
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 02/14] ARM: LPAE: use phys_addr_t in alloc_init_pud() Will Deacon
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

From: Cyril Chemparathy <cyril@ti.com>

This patch applies to PAGE_MASK, PMD_MASK, PGDIR_MASK, and SECTION_MASK, where
forcing unsigned long math truncates the mask at 32 bits.  This clearly does
bad things on PAE systems.

This patch fixes this problem by defining these masks as signed quantities.
We then rely on sign extension to do the right thing.

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/page.h           | 2 +-
 arch/arm/include/asm/pgtable-3level.h | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)
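
For illustration only (not part of the patch): a minimal user-space sketch of
the truncation described above.  The names arm_ulong_t, OLD_PAGE_MASK,
NEW_PAGE_MASK, page_base_old() and page_base_new() are hypothetical;
arm_ulong_t stands in for the 32-bit unsigned long of ARM so that the effect
is also visible on a 64-bit host.

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/* Assumption: stand-ins for the 32-bit ARM types, so this runs on any host. */
typedef uint32_t arm_ulong_t;	/* hypothetical: 32-bit 'unsigned long' */
typedef uint64_t phys_addr_t;	/* 64-bit physical address under LPAE */

/* Old form: the mask is a 32-bit unsigned value and gets zero-extended. */
#define OLD_PAGE_MASK	(~((((arm_ulong_t)1) << PAGE_SHIFT) - 1))
/* New form: the mask is a negative int and gets sign-extended. */
#define NEW_PAGE_MASK	(~((1 << PAGE_SHIFT) - 1))

/* Page base computed with the old mask: bits [63:32] are lost. */
static phys_addr_t page_base_old(phys_addr_t pa)
{
	return pa & OLD_PAGE_MASK;
}

/* Page base computed with the new mask: bits [63:32] survive. */
static phys_addr_t page_base_new(phys_addr_t pa)
{
	return pa & NEW_PAGE_MASK;
}
```

With a physical address above 4G, the old mask silently drops the upper word
while the signed mask keeps it.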

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index 812a494..6363f3d 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -13,7 +13,7 @@
 /* PAGE_SHIFT determines the page size */
 #define PAGE_SHIFT		12
 #define PAGE_SIZE		(_AC(1,UL) << PAGE_SHIFT)
-#define PAGE_MASK		(~(PAGE_SIZE-1))
+#define PAGE_MASK		(~((1 << PAGE_SHIFT) - 1))
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 86b8fe3..5b85b21 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -48,16 +48,16 @@
 #define PMD_SHIFT		21
 
 #define PMD_SIZE		(1UL << PMD_SHIFT)
-#define PMD_MASK		(~(PMD_SIZE-1))
+#define PMD_MASK		(~((1 << PMD_SHIFT) - 1))
 #define PGDIR_SIZE		(1UL << PGDIR_SHIFT)
-#define PGDIR_MASK		(~(PGDIR_SIZE-1))
+#define PGDIR_MASK		(~((1 << PGDIR_SHIFT) - 1))
 
 /*
  * section address mask and size definitions.
  */
 #define SECTION_SHIFT		21
 #define SECTION_SIZE		(1UL << SECTION_SHIFT)
-#define SECTION_MASK		(~(SECTION_SIZE-1))
+#define SECTION_MASK		(~((1 << SECTION_SHIFT) - 1))
 
 #define USER_PTRS_PER_PGD	(PAGE_OFFSET / PGDIR_SIZE)
 
-- 
1.8.2.2


* [PATCH 02/14] ARM: LPAE: use phys_addr_t in alloc_init_pud()
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
  2013-05-17 17:07 ` [PATCH 01/14] ARM: LPAE: use signed arithmetic for mask definitions Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 03/14] ARM: LPAE: use phys_addr_t in free_memmap() Will Deacon
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

From: Vitaly Andrianov <vitalya@ti.com>

This patch fixes the alloc_init_pud() function to use phys_addr_t instead of
unsigned long when passing in the phys argument.

This is an extension to commit 97092e0c56830457af0639f6bd904537a150ea4a (ARM:
pgtable: use phys_addr_t for physical addresses), which applied similar changes
elsewhere in the ARM memory management code.

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/mm/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index e0d8565..b7ce65a8 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -673,7 +673,8 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
 }
 
 static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
-	unsigned long end, unsigned long phys, const struct mem_type *type)
+				  unsigned long end, phys_addr_t phys,
+				  const struct mem_type *type)
 {
 	pud_t *pud = pud_offset(pgd, addr);
 	unsigned long next;
-- 
1.8.2.2


* [PATCH 03/14] ARM: LPAE: use phys_addr_t in free_memmap()
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
  2013-05-17 17:07 ` [PATCH 01/14] ARM: LPAE: use signed arithmetic for mask definitions Will Deacon
  2013-05-17 17:07 ` [PATCH 02/14] ARM: LPAE: use phys_addr_t in alloc_init_pud() Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 04/14] ARM: LPAE: use phys_addr_t for initrd location Will Deacon
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

From: Vitaly Andrianov <vitalya@ti.com>

free_memmap() was mistakenly using unsigned long to represent physical
addresses.  This breaks on PAE systems where memory could be placed above
the 32-bit addressable limit.

This patch fixes this function to properly use phys_addr_t instead.

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/mm/init.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
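
For illustration only (not part of the patch): a minimal sketch of the
"round start upwards and end downwards" step done in full 64-bit width.  The
helper names phys_page_align_up() and phys_page_align_down() are hypothetical,
and phys_addr_t is assumed to be 64 bits wide as on an LPAE kernel.

```c
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* assumption: LPAE, 64-bit */

#define PAGE_SHIFT	12
#define PAGE_SIZE	((phys_addr_t)1 << PAGE_SHIFT)

/* Round a physical address up to the next page boundary (cf. PAGE_ALIGN). */
static phys_addr_t phys_page_align_up(phys_addr_t pa)
{
	return (pa + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/* Round a physical address down to its page boundary (cf. "& PAGE_MASK"). */
static phys_addr_t phys_page_align_down(phys_addr_t pa)
{
	return pa & ~(PAGE_SIZE - 1);
}
```

Because every intermediate value is a phys_addr_t, addresses above 4G keep
their upper bits through both roundings.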

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 9a5cdc0..68c914e 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -442,7 +442,7 @@ static inline void
 free_memmap(unsigned long start_pfn, unsigned long end_pfn)
 {
 	struct page *start_pg, *end_pg;
-	unsigned long pg, pgend;
+	phys_addr_t pg, pgend;
 
 	/*
 	 * Convert start_pfn/end_pfn to a struct page pointer.
@@ -454,8 +454,8 @@ free_memmap(unsigned long start_pfn, unsigned long end_pfn)
 	 * Convert to physical addresses, and
 	 * round start upwards and end downwards.
 	 */
-	pg = (unsigned long)PAGE_ALIGN(__pa(start_pg));
-	pgend = (unsigned long)__pa(end_pg) & PAGE_MASK;
+	pg = PAGE_ALIGN(__pa(start_pg));
+	pgend = __pa(end_pg) & PAGE_MASK;
 
 	/*
 	 * If there are free pages between these,
-- 
1.8.2.2


* [PATCH 04/14] ARM: LPAE: use phys_addr_t for initrd location
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (2 preceding siblings ...)
  2013-05-17 17:07 ` [PATCH 03/14] ARM: LPAE: use phys_addr_t in free_memmap() Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 05/14] ARM: LPAE: use phys_addr_t in switch_mm() Will Deacon
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

From: Vitaly Andrianov <vitalya@ti.com>

This patch fixes the initrd setup code to use phys_addr_t instead of assuming
32-bit addressing.  Without this we cannot boot on systems where initrd is
located above the 4G physical address limit.

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/mm/init.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)
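
For illustration only (not part of the patch): a minimal user-space sketch of
why the format string moves to %llx with an explicit cast.  The width of
phys_addr_t depends on the configuration, so it is widened to a fixed 64-bit
type before printing.  format_initrd_region() is a hypothetical helper, and
snprintf() stands in for pr_err().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t phys_addr_t;	/* assumption: LPAE, 64-bit */

/*
 * Format "start+size" the way the patched pr_err() calls do: the start
 * address is cast to a 64-bit type for %llx, the size stays unsigned long.
 */
static int format_initrd_region(char *buf, size_t len,
				phys_addr_t start, unsigned long size)
{
	return snprintf(buf, len, "0x%08llx+0x%08lx",
			(unsigned long long)start, size);
}
```

The %08 width is only a minimum, so an address above 4G simply prints with
more digits.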

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 68c914e..2ffee02 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -36,12 +36,13 @@
 
 #include "mm.h"
 
-static unsigned long phys_initrd_start __initdata = 0;
+static phys_addr_t phys_initrd_start __initdata = 0;
 static unsigned long phys_initrd_size __initdata = 0;
 
 static int __init early_initrd(char *p)
 {
-	unsigned long start, size;
+	phys_addr_t start;
+	unsigned long size;
 	char *endp;
 
 	start = memparse(p, &endp);
@@ -350,14 +351,14 @@ void __init arm_memblock_init(struct meminfo *mi, struct machine_desc *mdesc)
 #ifdef CONFIG_BLK_DEV_INITRD
 	if (phys_initrd_size &&
 	    !memblock_is_region_memory(phys_initrd_start, phys_initrd_size)) {
-		pr_err("INITRD: 0x%08lx+0x%08lx is not a memory region - disabling initrd\n",
-		       phys_initrd_start, phys_initrd_size);
+		pr_err("INITRD: 0x%08llx+0x%08lx is not a memory region - disabling initrd\n",
+		       (u64)phys_initrd_start, phys_initrd_size);
 		phys_initrd_start = phys_initrd_size = 0;
 	}
 	if (phys_initrd_size &&
 	    memblock_is_region_reserved(phys_initrd_start, phys_initrd_size)) {
-		pr_err("INITRD: 0x%08lx+0x%08lx overlaps in-use memory region - disabling initrd\n",
-		       phys_initrd_start, phys_initrd_size);
+		pr_err("INITRD: 0x%08llx+0x%08lx overlaps in-use memory region - disabling initrd\n",
+		       (u64)phys_initrd_start, phys_initrd_size);
 		phys_initrd_start = phys_initrd_size = 0;
 	}
 	if (phys_initrd_size) {
-- 
1.8.2.2


* [PATCH 05/14] ARM: LPAE: use phys_addr_t in switch_mm()
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (3 preceding siblings ...)
  2013-05-17 17:07 ` [PATCH 04/14] ARM: LPAE: use phys_addr_t for initrd location Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 06/14] ARM: LPAE: use 64-bit accessors for TTBR registers Will Deacon
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

From: Cyril Chemparathy <cyril@ti.com>

This patch modifies the switch_mm() processor functions to use phys_addr_t.
On LPAE systems, we now honor the upper 32-bits of the physical address that
is being passed in, and program these into TTBR as expected.

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
[will: fixed up conflict in 3-level switch_mm with big-endian changes]
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/proc-fns.h |  4 ++--
 arch/arm/mm/proc-v7-3level.S    | 16 ++++++++++++----
 2 files changed, 14 insertions(+), 6 deletions(-)
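
For illustration only (not part of the patch): a C sketch of how the two
register words handed to mcrr are put together, with the ASID placed at bit 48
of the 64-bit TTBR0 value (bit 16 of the upper word).  struct ttbr_words and
make_ttbr0() are hypothetical names; the ASID is assumed to be 8 bits wide.

```c
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* assumption: LPAE, 64-bit */

/* The two 32-bit words written by "mcrr p15, 0, rpgdl, rpgdh, c2". */
struct ttbr_words {
	uint32_t lo;	/* rpgdl: TTBR0 bits [31:0] */
	uint32_t hi;	/* rpgdh: TTBR0 bits [63:32] */
};

/* Merge the upper bits of the pgd address with the ASID, as the orr does. */
static struct ttbr_words make_ttbr0(phys_addr_t pgd_phys, uint8_t asid)
{
	struct ttbr_words w;

	w.lo = (uint32_t)pgd_phys;
	w.hi = (uint32_t)(pgd_phys >> 32) | ((uint32_t)asid << (48 - 32));
	return w;
}
```

The rpgdl/rpgdh defines in the diff exist because the register holding the
low word of a 64-bit argument (r0 or r1) depends on endianness.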

diff --git a/arch/arm/include/asm/proc-fns.h b/arch/arm/include/asm/proc-fns.h
index f3628fb..75b5f14 100644
--- a/arch/arm/include/asm/proc-fns.h
+++ b/arch/arm/include/asm/proc-fns.h
@@ -60,7 +60,7 @@ extern struct processor {
 	/*
 	 * Set the page table
 	 */
-	void (*switch_mm)(unsigned long pgd_phys, struct mm_struct *mm);
+	void (*switch_mm)(phys_addr_t pgd_phys, struct mm_struct *mm);
 	/*
 	 * Set a possibly extended PTE.  Non-extended PTEs should
 	 * ignore 'ext'.
@@ -82,7 +82,7 @@ extern void cpu_proc_init(void);
 extern void cpu_proc_fin(void);
 extern int cpu_do_idle(void);
 extern void cpu_dcache_clean_area(void *, int);
-extern void cpu_do_switch_mm(unsigned long pgd_phys, struct mm_struct *mm);
+extern void cpu_do_switch_mm(phys_addr_t pgd_phys, struct mm_struct *mm);
 #ifdef CONFIG_ARM_LPAE
 extern void cpu_set_pte_ext(pte_t *ptep, pte_t pte);
 #else
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 363027e..995857d 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -39,6 +39,14 @@
 #define TTB_FLAGS_SMP	(TTB_IRGN_WBWA|TTB_S|TTB_RGN_OC_WBWA)
 #define PMD_FLAGS_SMP	(PMD_SECT_WBWA|PMD_SECT_S)
 
+#ifndef __ARMEB__
+#  define rpgdl	r0
+#  define rpgdh	r1
+#else
+#  define rpgdl	r1
+#  define rpgdh	r0
+#endif
+
 /*
  * cpu_v7_switch_mm(pgd_phys, tsk)
  *
@@ -47,10 +55,10 @@
  */
 ENTRY(cpu_v7_switch_mm)
 #ifdef CONFIG_MMU
-	mmid	r1, r1				@ get mm->context.id
-	asid	r3, r1
-	mov	r3, r3, lsl #(48 - 32)		@ ASID
-	mcrr	p15, 0, r0, r3, c2		@ set TTB 0
+	mmid	r2, r2
+	asid	r2, r2
+	orr	rpgdh, rpgdh, r2, lsl #(48 - 32)	@ upper 32-bits of pgd
+	mcrr	p15, 0, rpgdl, rpgdh, c2		@ set TTB 0
 	isb
 #endif
 	mov	pc, lr
-- 
1.8.2.2


* [PATCH 06/14] ARM: LPAE: use 64-bit accessors for TTBR registers
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (4 preceding siblings ...)
  2013-05-17 17:07 ` [PATCH 05/14] ARM: LPAE: use phys_addr_t in switch_mm() Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 07/14] ARM: LPAE: factor out T1SZ and TTBR1 computations Will Deacon
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

From: Cyril Chemparathy <cyril@ti.com>

This patch adds TTBR accessor macros, and modifies cpu_get_pgd() and
the LPAE version of cpu_set_reserved_ttbr0() to use these instead.

In the process, we also fix these functions to correctly handle cases
where the physical address lies beyond the 4G limit of 32-bit addressing.

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/proc-fns.h | 22 +++++++++++++++++-----
 arch/arm/mm/context.c           |  9 ++-------
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/arch/arm/include/asm/proc-fns.h b/arch/arm/include/asm/proc-fns.h
index 75b5f14..1c3cf94 100644
--- a/arch/arm/include/asm/proc-fns.h
+++ b/arch/arm/include/asm/proc-fns.h
@@ -116,13 +116,25 @@ extern void cpu_resume(void);
 #define cpu_switch_mm(pgd,mm) cpu_do_switch_mm(virt_to_phys(pgd),mm)
 
 #ifdef CONFIG_ARM_LPAE
+
+#define cpu_get_ttbr(nr)					\
+	({							\
+		u64 ttbr;					\
+		__asm__("mrrc	p15, " #nr ", %Q0, %R0, c2"	\
+			: "=r" (ttbr));				\
+		ttbr;						\
+	})
+
+#define cpu_set_ttbr(nr, val)					\
+	do {							\
+		u64 ttbr = val;					\
+		__asm__("mcrr	p15, " #nr ", %Q0, %R0, c2"	\
+			: : "r" (ttbr));			\
+	} while (0)
+
 #define cpu_get_pgd()	\
 	({						\
-		unsigned long pg, pg2;			\
-		__asm__("mrrc	p15, 0, %0, %1, c2"	\
-			: "=r" (pg), "=r" (pg2)		\
-			:				\
-			: "cc");			\
+		u64 pg = cpu_get_ttbr(0);		\
 		pg &= ~(PTRS_PER_PGD*sizeof(pgd_t)-1);	\
 		(pgd_t *)phys_to_virt(pg);		\
 	})
diff --git a/arch/arm/mm/context.c b/arch/arm/mm/context.c
index 2ac3737..3675e31 100644
--- a/arch/arm/mm/context.c
+++ b/arch/arm/mm/context.c
@@ -20,6 +20,7 @@
 #include <asm/smp_plat.h>
 #include <asm/thread_notify.h>
 #include <asm/tlbflush.h>
+#include <asm/proc-fns.h>
 
 /*
  * On ARMv6, we have the following structure in the Context ID:
@@ -55,17 +56,11 @@ static cpumask_t tlb_flush_pending;
 #ifdef CONFIG_ARM_LPAE
 static void cpu_set_reserved_ttbr0(void)
 {
-	unsigned long ttbl = __pa(swapper_pg_dir);
-	unsigned long ttbh = 0;
-
 	/*
 	 * Set TTBR0 to swapper_pg_dir which contains only global entries. The
 	 * ASID is set to 0.
 	 */
-	asm volatile(
-	"	mcrr	p15, 0, %0, %1, c2		@ set TTBR0\n"
-	:
-	: "r" (ttbl), "r" (ttbh));
+	cpu_set_ttbr(0, __pa(swapper_pg_dir));
 	isb();
 }
 #else
-- 
1.8.2.2


* [PATCH 07/14] ARM: LPAE: factor out T1SZ and TTBR1 computations
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (5 preceding siblings ...)
  2013-05-17 17:07 ` [PATCH 06/14] ARM: LPAE: use 64-bit accessors for TTBR registers Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 08/14] ARM: LPAE: accomodate >32-bit addresses for page table base Will Deacon
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

From: Cyril Chemparathy <cyril@ti.com>

This patch moves the TTBR1 offset calculation and the T1SZ calculation out
of the TTB setup assembly code.  This should not affect functionality in
any way, but improves code readability as well as readability of subsequent
patches in this series.

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/pgtable-3level-hwdef.h | 20 ++++++++++++++++++++
 arch/arm/mm/proc-v7-3level.S                | 29 ++++++++---------------------
 2 files changed, 28 insertions(+), 21 deletions(-)
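
For illustration only (not part of the patch): the TTBR1_SIZE and TTBR1_OFFSET
values evaluated for the three PAGE_OFFSET splits listed in the comment below.
ttbr1_size() and ttbr1_offset() are hypothetical run-time helpers that mirror
the preprocessor definitions.

```c
#include <stdint.h>

/* TTBCR.T1SZ field (already shifted to bit 16) for a given PAGE_OFFSET. */
static uint32_t ttbr1_size(uint32_t page_offset)
{
	return ((page_offset >> 30) - 1) << 16;
}

/* Byte offset added to the page table base before it is loaded into TTBR1. */
static uint32_t ttbr1_offset(uint32_t page_offset)
{
	switch (page_offset) {
	case 0x80000000u:		/* CONFIG_VMSPLIT_2G */
		return 16;		/* skip two 8-byte L1 entries */
	case 0xc0000000u:		/* CONFIG_VMSPLIT_3G */
		return 4096 * (1 + 3);	/* only L2, skip pgd + 3*pmd */
	default:			/* CONFIG_VMSPLIT_1G */
		return 0;
	}
}
```

For PAGE_OFFSET values 0x40000000, 0x80000000 and 0xc0000000 this gives T1SZ
field values 0, 1 and 2, matching the table in the comment.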

diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
index 18f5cef..c6c6e6d 100644
--- a/arch/arm/include/asm/pgtable-3level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
@@ -79,4 +79,24 @@
 #define PHYS_MASK_SHIFT		(40)
 #define PHYS_MASK		((1ULL << PHYS_MASK_SHIFT) - 1)
 
+/*
+ * TTBR0/TTBR1 split (PAGE_OFFSET):
+ *   0x40000000: T0SZ = 2, T1SZ = 0 (not used)
+ *   0x80000000: T0SZ = 0, T1SZ = 1
+ *   0xc0000000: T0SZ = 0, T1SZ = 2
+ *
+ * Only use this feature if PHYS_OFFSET <= PAGE_OFFSET, otherwise
+ * booting secondary CPUs would end up using TTBR1 for the identity
+ * mapping set up in TTBR0.
+ */
+#if defined CONFIG_VMSPLIT_2G
+#define TTBR1_OFFSET	16			/* skip two L1 entries */
+#elif defined CONFIG_VMSPLIT_3G
+#define TTBR1_OFFSET	(4096 * (1 + 3))	/* only L2, skip pgd + 3*pmd */
+#else
+#define TTBR1_OFFSET	0
+#endif
+
+#define TTBR1_SIZE	(((PAGE_OFFSET >> 30) - 1) << 16)
+
 #endif
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 995857d..58ab747 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -114,7 +114,7 @@ ENDPROC(cpu_v7_set_pte_ext)
 	 */
 	.macro	v7_ttb_setup, zero, ttbr0, ttbr1, tmp
 	ldr	\tmp, =swapper_pg_dir		@ swapper_pg_dir virtual address
-	cmp	\ttbr1, \tmp			@ PHYS_OFFSET > PAGE_OFFSET? (branch below)
+	cmp	\ttbr1, \tmp			@ PHYS_OFFSET > PAGE_OFFSET?
 	mrc	p15, 0, \tmp, c2, c0, 2		@ TTB control register
 	orr	\tmp, \tmp, #TTB_EAE
 	ALT_SMP(orr	\tmp, \tmp, #TTB_FLAGS_SMP)
@@ -122,27 +122,14 @@ ENDPROC(cpu_v7_set_pte_ext)
 	ALT_SMP(orr	\tmp, \tmp, #TTB_FLAGS_SMP << 16)
 	ALT_UP(orr	\tmp, \tmp, #TTB_FLAGS_UP << 16)
 	/*
-	 * TTBR0/TTBR1 split (PAGE_OFFSET):
-	 *   0x40000000: T0SZ = 2, T1SZ = 0 (not used)
-	 *   0x80000000: T0SZ = 0, T1SZ = 1
-	 *   0xc0000000: T0SZ = 0, T1SZ = 2
-	 *
-	 * Only use this feature if PHYS_OFFSET <= PAGE_OFFSET, otherwise
-	 * booting secondary CPUs would end up using TTBR1 for the identity
-	 * mapping set up in TTBR0.
+	 * Only use split TTBRs if PHYS_OFFSET <= PAGE_OFFSET (cmp above),
+	 * otherwise booting secondary CPUs would end up using TTBR1 for the
+	 * identity mapping set up in TTBR0.
 	 */
-	bhi	9001f				@ PHYS_OFFSET > PAGE_OFFSET?
-	orr	\tmp, \tmp, #(((PAGE_OFFSET >> 30) - 1) << 16) @ TTBCR.T1SZ
-#if defined CONFIG_VMSPLIT_2G
-	/* PAGE_OFFSET == 0x80000000, T1SZ == 1 */
-	add	\ttbr1, \ttbr1, #1 << 4		@ skip two L1 entries
-#elif defined CONFIG_VMSPLIT_3G
-	/* PAGE_OFFSET == 0xc0000000, T1SZ == 2 */
-	add	\ttbr1, \ttbr1, #4096 * (1 + 3)	@ only L2 used, skip pgd+3*pmd
-#endif
-	/* CONFIG_VMSPLIT_1G does not need TTBR1 adjustment */
-9001:	mcr	p15, 0, \tmp, c2, c0, 2		@ TTB control register
-	mcrr	p15, 1, \ttbr1, \zero, c2	@ load TTBR1
+	orrls	\tmp, \tmp, #TTBR1_SIZE				@ TTBCR.T1SZ
+	mcr	p15, 0, \tmp, c2, c0, 2				@ TTBCR
+	addls	\ttbr1, \ttbr1, #TTBR1_OFFSET
+	mcrr	p15, 1, \ttbr1, \zero, c2			@ load TTBR1
 	.endm
 
 	__CPUINIT
-- 
1.8.2.2


* [PATCH 08/14] ARM: LPAE: accomodate >32-bit addresses for page table base
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (6 preceding siblings ...)
  2013-05-17 17:07 ` [PATCH 07/14] ARM: LPAE: factor out T1SZ and TTBR1 computations Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 09/14] ARM: mm: use physical addresses in highmem sanity checks Will Deacon
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

From: Cyril Chemparathy <cyril@ti.com>

This patch redefines the early boot time use of the R4 register to steal a few
low order bits (ARCH_PGD_SHIFT bits) on LPAE systems.  This allows for up to
38-bit physical addresses.

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/memory.h | 16 ++++++++++++++++
 arch/arm/kernel/head.S        | 10 ++++------
 arch/arm/kernel/smp.c         | 11 +++++++++--
 arch/arm/mm/proc-v7-3level.S  |  8 ++++++++
 4 files changed, 37 insertions(+), 8 deletions(-)
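
For illustration only (not part of the patch): a user-space sketch of the
packing scheme, assuming L1_CACHE_SHIFT is 6 (64-byte cache lines; the real
value depends on the CPU configuration).  arch_pgd_pack() mirrors
get_arch_pgd() with assert() standing in for BUG_ON(); arch_pgd_unpack() is a
hypothetical helper mirroring the lsr/lsl pair in v7_ttb_setup.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* assumption: LPAE, 64-bit */

/* Assumption: L1_CACHE_SHIFT == 6, i.e. 64-byte cache lines. */
#define ARCH_PGD_SHIFT	6
#define ARCH_PGD_MASK	((1 << ARCH_PGD_SHIFT) - 1)

/* Squeeze an aligned physical address into one 32-bit register value. */
static uint32_t arch_pgd_pack(phys_addr_t pgdir)
{
	assert((pgdir & ARCH_PGD_MASK) == 0);	/* stands in for BUG_ON() */
	return (uint32_t)(pgdir >> ARCH_PGD_SHIFT);
}

/* Recover the full address from the packed register value. */
static phys_addr_t arch_pgd_unpack(uint32_t packed)
{
	uint32_t hi = packed >> (32 - ARCH_PGD_SHIFT);	/* upper bits */
	uint32_t lo = packed << ARCH_PGD_SHIFT;		/* lower bits */

	return ((phys_addr_t)hi << 32) | lo;
}
```

Under this assumption a 32-bit register carries 32 + 6 = 38 address bits,
which is where the "up to 38-bit" figure comes from.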

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 57870ab..e506088 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -18,6 +18,8 @@
 #include <linux/types.h>
 #include <linux/sizes.h>
 
+#include <asm/cache.h>
+
 #ifdef CONFIG_NEED_MACH_MEMORY_H
 #include <mach/memory.h>
 #endif
@@ -141,6 +143,20 @@
 #define page_to_phys(page)	(__pfn_to_phys(page_to_pfn(page)))
 #define phys_to_page(phys)	(pfn_to_page(__phys_to_pfn(phys)))
 
+/*
+ * Minimum guaranteed alignment in pgd_alloc().  The page table pointers passed
+ * around in head.S and proc-*.S are shifted by this amount, in order to
+ * leave spare high bits for systems with physical address extension.  This
+ * does not fully accommodate the 40-bit addressing capability of ARM LPAE, but
+ * gives us about 38-bits or so.
+ */
+#ifdef CONFIG_ARM_LPAE
+#define ARCH_PGD_SHIFT		L1_CACHE_SHIFT
+#else
+#define ARCH_PGD_SHIFT		0
+#endif
+#define ARCH_PGD_MASK		((1 << ARCH_PGD_SHIFT) - 1)
+
 #ifndef __ASSEMBLY__
 
 /*
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 8bac553..45e8935 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -156,7 +156,7 @@ ENDPROC(stext)
  *
  * Returns:
  *  r0, r3, r5-r7 corrupted
- *  r4 = physical page table address
+ *  r4 = page table (see ARCH_PGD_SHIFT in asm/memory.h)
  */
 __create_page_tables:
 	pgtbl	r4, r8				@ page table address
@@ -331,6 +331,7 @@ __create_page_tables:
 #endif
 #ifdef CONFIG_ARM_LPAE
 	sub	r4, r4, #0x1000		@ point to the PGD table
+	mov	r4, r4, lsr #ARCH_PGD_SHIFT
 #endif
 	mov	pc, lr
 ENDPROC(__create_page_tables)
@@ -408,7 +409,7 @@ __secondary_data:
  *  r0  = cp#15 control register
  *  r1  = machine ID
  *  r2  = atags or dtb pointer
- *  r4  = page table pointer
+ *  r4  = page table (see ARCH_PGD_SHIFT in asm/memory.h)
  *  r9  = processor ID
  *  r13 = *virtual* address to jump to upon completion
  */
@@ -427,10 +428,7 @@ __enable_mmu:
 #ifdef CONFIG_CPU_ICACHE_DISABLE
 	bic	r0, r0, #CR_I
 #endif
-#ifdef CONFIG_ARM_LPAE
-	mov	r5, #0
-	mcrr	p15, 0, r4, r5, c2		@ load TTBR0
-#else
+#ifndef CONFIG_ARM_LPAE
 	mov	r5, #(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 47ab905..c9c3f3ad 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -78,6 +78,13 @@ void __init smp_set_ops(struct smp_operations *ops)
 		smp_ops = *ops;
 };
 
+static unsigned long get_arch_pgd(pgd_t *pgd)
+{
+	phys_addr_t pgdir = virt_to_phys(pgd);
+	BUG_ON(pgdir & ARCH_PGD_MASK);
+	return pgdir >> ARCH_PGD_SHIFT;
+}
+
 int __cpuinit __cpu_up(unsigned int cpu, struct task_struct *idle)
 {
 	int ret;
@@ -87,8 +94,8 @@ int __cpuinit __cpu_up(unsigned int cpu, struct task_struct *idle)
 	 * its stack and the page tables.
 	 */
 	secondary_data.stack = task_stack_page(idle) + THREAD_START_SP;
-	secondary_data.pgdir = virt_to_phys(idmap_pgd);
-	secondary_data.swapper_pg_dir = virt_to_phys(swapper_pg_dir);
+	secondary_data.pgdir = get_arch_pgd(idmap_pgd);
+	secondary_data.swapper_pg_dir = get_arch_pgd(swapper_pg_dir);
 	__cpuc_flush_dcache_area(&secondary_data, sizeof(secondary_data));
 	outer_clean_range(__pa(&secondary_data), __pa(&secondary_data + 1));
 
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 58ab747..5ffe195 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -114,6 +114,7 @@ ENDPROC(cpu_v7_set_pte_ext)
 	 */
 	.macro	v7_ttb_setup, zero, ttbr0, ttbr1, tmp
 	ldr	\tmp, =swapper_pg_dir		@ swapper_pg_dir virtual address
+	mov	\tmp, \tmp, lsr #ARCH_PGD_SHIFT
 	cmp	\ttbr1, \tmp			@ PHYS_OFFSET > PAGE_OFFSET?
 	mrc	p15, 0, \tmp, c2, c0, 2		@ TTB control register
 	orr	\tmp, \tmp, #TTB_EAE
@@ -128,8 +129,15 @@ ENDPROC(cpu_v7_set_pte_ext)
 	 */
 	orrls	\tmp, \tmp, #TTBR1_SIZE				@ TTBCR.T1SZ
 	mcr	p15, 0, \tmp, c2, c0, 2				@ TTBCR
+	mov	\tmp, \ttbr1, lsr #(32 - ARCH_PGD_SHIFT)	@ upper bits
+	mov	\ttbr1, \ttbr1, lsl #ARCH_PGD_SHIFT		@ lower bits
 	addls	\ttbr1, \ttbr1, #TTBR1_OFFSET
 	mcrr	p15, 1, \ttbr1, \zero, c2			@ load TTBR1
+	mov	\tmp, \ttbr0, lsr #(32 - ARCH_PGD_SHIFT)	@ upper bits
+	mov	\ttbr0, \ttbr0, lsl #ARCH_PGD_SHIFT		@ lower bits
+	mcrr	p15, 0, \ttbr0, \zero, c2			@ load TTBR0
+	mcrr	p15, 1, \ttbr1, \zero, c2			@ load TTBR1
+	mcrr	p15, 0, \ttbr0, \zero, c2			@ load TTBR0
 	.endm
 
 	__CPUINIT
-- 
1.8.2.2


* [PATCH 09/14] ARM: mm: use physical addresses in highmem sanity checks
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (7 preceding siblings ...)
  2013-05-17 17:07 ` [PATCH 08/14] ARM: LPAE: accomodate >32-bit addresses for page table base Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 10/14] ARM: fix type of PHYS_PFN_OFFSET to unsigned long Will Deacon
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

From: Cyril Chemparathy <cyril@ti.com>

This patch modifies the highmem sanity checking code to use physical addresses
instead.  This change eliminates the wrap-around problems associated with the
original virtual address based checks, and this simplifies the code a bit.

The one constraint imposed here is that low physical memory must be mapped in
a monotonically increasing fashion if there are multiple banks of memory,
i.e., x < y must imply pa(x) < pa(y).

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/mm/mmu.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)
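
For illustration only (not part of the patch): a simplified sketch of the
physical-address comparison that replaces the __va()-based checks.
lowmem_size() is a hypothetical helper that returns how much of a bank lies
below vmalloc_limit; the real code additionally splits or truncates the bank
and updates meminfo.

```c
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* assumption: LPAE, 64-bit */

/*
 * Size of the part of [start, start + size) that lies below vmalloc_limit.
 * All arithmetic is on physical addresses, so nothing wraps around the way
 * a 32-bit virtual address could.
 */
static phys_addr_t lowmem_size(phys_addr_t start, phys_addr_t size,
			       phys_addr_t vmalloc_limit)
{
	if (start >= vmalloc_limit)
		return 0;			/* bank is entirely highmem */
	if (size > vmalloc_limit - start)
		return vmalloc_limit - start;	/* bank straddles the limit */
	return size;				/* bank is entirely lowmem */
}
```

The subtraction is only evaluated once start < vmalloc_limit is known, so it
cannot underflow.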

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index b7ce65a8..fc6ff1a 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -988,6 +988,7 @@ phys_addr_t arm_lowmem_limit __initdata = 0;
 void __init sanity_check_meminfo(void)
 {
 	int i, j, highmem = 0;
+	phys_addr_t vmalloc_limit = __pa(vmalloc_min - 1) + 1;
 
 	for (i = 0, j = 0; i < meminfo.nr_banks; i++) {
 		struct membank *bank = &meminfo.bank[j];
@@ -997,8 +998,7 @@ void __init sanity_check_meminfo(void)
 			highmem = 1;
 
 #ifdef CONFIG_HIGHMEM
-		if (__va(bank->start) >= vmalloc_min ||
-		    __va(bank->start) < (void *)PAGE_OFFSET)
+		if (bank->start >= vmalloc_limit)
 			highmem = 1;
 
 		bank->highmem = highmem;
@@ -1007,8 +1007,8 @@ void __init sanity_check_meminfo(void)
 		 * Split those memory banks which are partially overlapping
 		 * the vmalloc area greatly simplifying things later.
 		 */
-		if (!highmem && __va(bank->start) < vmalloc_min &&
-		    bank->size > vmalloc_min - __va(bank->start)) {
+		if (!highmem && bank->start < vmalloc_limit &&
+		    bank->size > vmalloc_limit - bank->start) {
 			if (meminfo.nr_banks >= NR_BANKS) {
 				printk(KERN_CRIT "NR_BANKS too low, "
 						 "ignoring high memory\n");
@@ -1017,12 +1017,12 @@ void __init sanity_check_meminfo(void)
 					(meminfo.nr_banks - i) * sizeof(*bank));
 				meminfo.nr_banks++;
 				i++;
-				bank[1].size -= vmalloc_min - __va(bank->start);
-				bank[1].start = __pa(vmalloc_min - 1) + 1;
+				bank[1].size -= vmalloc_limit - bank->start;
+				bank[1].start = vmalloc_limit;
 				bank[1].highmem = highmem = 1;
 				j++;
 			}
-			bank->size = vmalloc_min - __va(bank->start);
+			bank->size = vmalloc_limit - bank->start;
 		}
 #else
 		bank->highmem = highmem;
@@ -1042,8 +1042,7 @@ void __init sanity_check_meminfo(void)
 		 * Check whether this memory bank would entirely overlap
 		 * the vmalloc area.
 		 */
-		if (__va(bank->start) >= vmalloc_min ||
-		    __va(bank->start) < (void *)PAGE_OFFSET) {
+		if (bank->start >= vmalloc_limit) {
 			printk(KERN_NOTICE "Ignoring RAM at %.8llx-%.8llx "
 			       "(vmalloc region overlap).\n",
 			       (unsigned long long)bank->start,
@@ -1055,9 +1054,8 @@ void __init sanity_check_meminfo(void)
 		 * Check whether this memory bank would partially overlap
 		 * the vmalloc area.
 		 */
-		if (__va(bank->start + bank->size - 1) >= vmalloc_min ||
-		    __va(bank->start + bank->size - 1) <= __va(bank->start)) {
-			unsigned long newsize = vmalloc_min - __va(bank->start);
+		if (bank->start + bank->size > vmalloc_limit) {
+			unsigned long newsize = vmalloc_limit - bank->start;
 			printk(KERN_NOTICE "Truncating RAM at %.8llx-%.8llx "
 			       "to -%.8llx (vmalloc region overlap).\n",
 			       (unsigned long long)bank->start,
-- 
1.8.2.2

* [PATCH 10/14] ARM: fix type of PHYS_PFN_OFFSET to unsigned long
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (8 preceding siblings ...)
  2013-05-17 17:07 ` [PATCH 09/14] ARM: mm: use physical addresses in highmem sanity checks Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 11/14] ARM: mm: cleanup checks for membank overlap with vmalloc area Will Deacon
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

From: Cyril Chemparathy <cyril@ti.com>

On LPAE machines, PHYS_OFFSET evaluates to a phys_addr_t and this type is
inherited by the PHYS_PFN_OFFSET definition as well.  Consequently, the kernel
build emits warnings of the form:

init/main.c: In function 'start_kernel':
init/main.c:588:7: warning: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'phys_addr_t' [-Wformat]

This patch fixes this warning by pinning down the PFN type to unsigned long.

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/memory.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index e506088..584786f 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -223,7 +223,7 @@ static inline unsigned long __phys_to_virt(unsigned long x)
  * direct-mapped view.  We assume this is the first page
  * of RAM in the mem_map as well.
  */
-#define PHYS_PFN_OFFSET	(PHYS_OFFSET >> PAGE_SHIFT)
+#define PHYS_PFN_OFFSET	((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
 
 /*
  * These are *only* valid on the kernel direct mapped RAM memory.
-- 
1.8.2.2
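
For readers following the type change above, here is a minimal userspace
sketch of the issue. It is not part of the patch. It assumes a 64-bit
phys_addr_t, as on LPAE configurations; the typedef, the example
PHYS_OFFSET value, the _OLD/_NEW macro names and the helper
format_pfn_offset() are stand-ins invented for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for the kernel definitions (assumptions for illustration). */
typedef uint64_t phys_addr_t;			/* 64-bit under LPAE */
#define PAGE_SHIFT	12
#define PHYS_OFFSET	((phys_addr_t)0x80000000ULL)	/* hypothetical RAM base */

/* Before the patch: the expression keeps the 64-bit phys_addr_t type. */
#define PHYS_PFN_OFFSET_OLD	(PHYS_OFFSET >> PAGE_SHIFT)
/* After the patch: the PFN is pinned to unsigned long, so "%lx" matches. */
#define PHYS_PFN_OFFSET_NEW	((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))

/* Hypothetical helper: formats the PFN offset with a "%lx" conversion. */
static inline int format_pfn_offset(char *buf, size_t len)
{
	return snprintf(buf, len, "%lx", PHYS_PFN_OFFSET_NEW);
}
```

Passing PHYS_PFN_OFFSET_OLD to the same "%lx" conversion is what would
trigger the -Wformat warning on a 32-bit build, since the argument would
be 64 bits wide there.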

* [PATCH 11/14] ARM: mm: cleanup checks for membank overlap with vmalloc area
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (9 preceding siblings ...)
  2013-05-17 17:07 ` [PATCH 10/14] ARM: fix type of PHYS_PFN_OFFSET to unsigned long Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 12/14] ARM: mm: clean up membank size limit checks Will Deacon
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

From: Cyril Chemparathy <cyril@ti.com>

On Keystone platforms, physical memory is entirely outside the 32-bit
addressable range.  Therefore, the (bank->start > ULONG_MAX) check below marks
the entire system memory as highmem, and this causes unpleasantness all over.

This patch eliminates the extra bank start check (against ULONG_MAX) by
checking bank->start against the physical address corresponding to vmalloc_min
instead.

In the process, this patch also cleans up parts of the highmem sanity check
code by removing what has now become a redundant check for banks that entirely
overlap with the vmalloc range.

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/mm/mmu.c | 19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index fc6ff1a..ae249d1 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -994,15 +994,12 @@ void __init sanity_check_meminfo(void)
 		struct membank *bank = &meminfo.bank[j];
 		*bank = meminfo.bank[i];
 
-		if (bank->start > ULONG_MAX)
-			highmem = 1;
-
-#ifdef CONFIG_HIGHMEM
 		if (bank->start >= vmalloc_limit)
 			highmem = 1;
 
 		bank->highmem = highmem;
 
+#ifdef CONFIG_HIGHMEM
 		/*
 		 * Split those memory banks which are partially overlapping
 		 * the vmalloc area greatly simplifying things later.
@@ -1025,8 +1022,6 @@ void __init sanity_check_meminfo(void)
 			bank->size = vmalloc_limit - bank->start;
 		}
 #else
-		bank->highmem = highmem;
-
 		/*
 		 * Highmem banks not allowed with !CONFIG_HIGHMEM.
 		 */
@@ -1039,18 +1034,6 @@ void __init sanity_check_meminfo(void)
 		}
 
 		/*
-		 * Check whether this memory bank would entirely overlap
-		 * the vmalloc area.
-		 */
-		if (bank->start >= vmalloc_limit) {
-			printk(KERN_NOTICE "Ignoring RAM at %.8llx-%.8llx "
-			       "(vmalloc region overlap).\n",
-			       (unsigned long long)bank->start,
-			       (unsigned long long)bank->start + bank->size - 1);
-			continue;
-		}
-
-		/*
 		 * Check whether this memory bank would partially overlap
 		 * the vmalloc area.
 		 */
-- 
1.8.2.2
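
The difference between the removed check and the remaining one can be
sketched as follows. This is an illustration only, not kernel code: the
helper names are made up, UINT32_MAX models ULONG_MAX on a 32-bit
kernel, and the addresses mentioned afterwards are hypothetical values
for a platform whose RAM sits above 4GiB.

```c
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* 64-bit under LPAE (assumption) */

/* Removed check: any bank starting above 4GiB was forced into highmem. */
static inline int is_highmem_old(phys_addr_t start)
{
	return start > UINT32_MAX;	/* models ULONG_MAX on 32-bit ARM */
}

/* Remaining check: only banks at or above the lowmem limit are highmem. */
static inline int is_highmem_new(phys_addr_t start, phys_addr_t vmalloc_limit)
{
	return start >= vmalloc_limit;
}
```

With a bank starting at 0x800000000 and a vmalloc_limit of 0x830000000,
the old check classifies the bank as highmem while the new one keeps it
in lowmem.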

* [PATCH 12/14] ARM: mm: clean up membank size limit checks
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (10 preceding siblings ...)
  2013-05-17 17:07 ` [PATCH 11/14] ARM: mm: cleanup checks for membank overlap with vmalloc area Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 13/14] ARM: lpae: fix definition of PTE_HWTABLE_PTRS Will Deacon
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

From: Cyril Chemparathy <cyril@ti.com>

This patch cleans up the highmem sanity check code by simplifying the range
checks with a pre-calculated size_limit.  This patch should otherwise have no
functional impact on behavior.

This patch also removes a redundant (bank->start < vmalloc_limit) check, since
this is already covered by the !highmem condition.

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/mm/mmu.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index ae249d1..280f91d 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -992,10 +992,15 @@ void __init sanity_check_meminfo(void)
 
 	for (i = 0, j = 0; i < meminfo.nr_banks; i++) {
 		struct membank *bank = &meminfo.bank[j];
+		phys_addr_t size_limit;
+
 		*bank = meminfo.bank[i];
+		size_limit = bank->size;
 
 		if (bank->start >= vmalloc_limit)
 			highmem = 1;
+		else
+			size_limit = vmalloc_limit - bank->start;
 
 		bank->highmem = highmem;
 
@@ -1004,8 +1009,7 @@ void __init sanity_check_meminfo(void)
 		 * Split those memory banks which are partially overlapping
 		 * the vmalloc area greatly simplifying things later.
 		 */
-		if (!highmem && bank->start < vmalloc_limit &&
-		    bank->size > vmalloc_limit - bank->start) {
+		if (!highmem && bank->size > size_limit) {
 			if (meminfo.nr_banks >= NR_BANKS) {
 				printk(KERN_CRIT "NR_BANKS too low, "
 						 "ignoring high memory\n");
@@ -1014,12 +1018,12 @@ void __init sanity_check_meminfo(void)
 					(meminfo.nr_banks - i) * sizeof(*bank));
 				meminfo.nr_banks++;
 				i++;
-				bank[1].size -= vmalloc_limit - bank->start;
+				bank[1].size -= size_limit;
 				bank[1].start = vmalloc_limit;
 				bank[1].highmem = highmem = 1;
 				j++;
 			}
-			bank->size = vmalloc_limit - bank->start;
+			bank->size = size_limit;
 		}
 #else
 		/*
@@ -1037,14 +1041,13 @@ void __init sanity_check_meminfo(void)
 		 * Check whether this memory bank would partially overlap
 		 * the vmalloc area.
 		 */
-		if (bank->start + bank->size > vmalloc_limit)
-			unsigned long newsize = vmalloc_limit - bank->start;
+		if (bank->size > size_limit) {
 			printk(KERN_NOTICE "Truncating RAM at %.8llx-%.8llx "
 			       "to -%.8llx (vmalloc region overlap).\n",
 			       (unsigned long long)bank->start,
 			       (unsigned long long)bank->start + bank->size - 1,
-			       (unsigned long long)bank->start + newsize - 1);
-			bank->size = newsize;
+			       (unsigned long long)bank->start + size_limit - 1);
+			bank->size = size_limit;
 		}
 #endif
 		if (!bank->highmem && bank->start + bank->size > arm_lowmem_limit)
-- 
1.8.2.2
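
The size_limit logic can be tried outside the kernel with a sketch like
the one below. It is a simplified model under stated assumptions: the
struct is a cut-down stand-in for the kernel's struct membank,
split_bank() is a hypothetical helper, and the meminfo array handling
(memmove, NR_BANKS check) is left out.

```c
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* 64-bit under LPAE (assumption) */

/* Cut-down stand-in for the kernel's struct membank. */
struct membank {
	phys_addr_t start;
	phys_addr_t size;
	int highmem;
};

/*
 * Split *bank at vmalloc_limit. The lowmem part stays in *bank and the
 * remainder, if any, is written to *high. Returns 1 if a split happened.
 */
static inline int split_bank(struct membank *bank, struct membank *high,
			     phys_addr_t vmalloc_limit)
{
	phys_addr_t size_limit;

	if (bank->start >= vmalloc_limit) {
		bank->highmem = 1;	/* entirely above the limit */
		return 0;
	}

	bank->highmem = 0;
	size_limit = vmalloc_limit - bank->start;
	if (bank->size <= size_limit)
		return 0;		/* fits in lowmem, nothing to split */

	high->start = vmalloc_limit;
	high->size = bank->size - size_limit;
	high->highmem = 1;
	bank->size = size_limit;
	return 1;
}
```

Computing size_limit once, right after the highmem decision, is what
lets the later comparisons collapse to bank->size > size_limit.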

* [PATCH 13/14] ARM: lpae: fix definition of PTE_HWTABLE_PTRS
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (11 preceding siblings ...)
  2013-05-17 17:07 ` [PATCH 12/14] ARM: mm: clean up membank size limit checks Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-17 17:07 ` [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions Will Deacon
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

For 2-level page tables, PTE_HWTABLE_PTRS describes the offset between
Linux PTEs and hardware PTEs. On LPAE, there is no distinction (since
we have 64-bit descriptors with plenty of space) so PTE_HWTABLE_PTRS
should be 0. Unfortunately, it is wrongly defined as PTRS_PER_PTE,
meaning that current pte table flushing is off by a page. Luckily,
all current LPAE implementations are SMP, so the hardware walker can
snoop L1.

This patch fixes the broken definition.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/pgtable-3level.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 5b85b21..d03c589 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -33,7 +33,7 @@
 #define PTRS_PER_PMD		512
 #define PTRS_PER_PGD		4
 
-#define PTE_HWTABLE_PTRS	(PTRS_PER_PTE)
+#define PTE_HWTABLE_PTRS	(0)
 #define PTE_HWTABLE_OFF		(0)
 #define PTE_HWTABLE_SIZE	(PTRS_PER_PTE * sizeof(u64))
 
-- 
1.8.2.2
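
The "off by a page" arithmetic can be checked with a small sketch. It
assumes the flush helper computes its start address as
pte + PTE_HWTABLE_PTRS, which is not shown in this patch; the function
name flush_offset_bytes() and the PAGE_SIZE value of 4096 are likewise
assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pte_t;		/* LPAE descriptors are 64-bit */
#define PTRS_PER_PTE	512
#define PAGE_SIZE	4096	/* assumed 4KiB pages */

/*
 * Byte offset from the start of a PTE table to the area that would be
 * cleaned, for a given PTE_HWTABLE_PTRS value (models the pointer
 * arithmetic "pte + PTE_HWTABLE_PTRS").
 */
static inline size_t flush_offset_bytes(size_t hwtable_ptrs)
{
	return hwtable_ptrs * sizeof(pte_t);
}
```

With the old definition the offset is 512 * 8 = 4096 bytes, i.e. exactly
one page past the table; with the fixed definition it is 0.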

* [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (12 preceding siblings ...)
  2013-05-17 17:07 ` [PATCH 13/14] ARM: lpae: fix definition of PTE_HWTABLE_PTRS Will Deacon
@ 2013-05-17 17:07 ` Will Deacon
  2013-05-20 14:18   ` Catalin Marinas
  2013-05-17 18:23 ` [PATCH 00/14] Random LPAE-related patches Santosh Shilimkar
  2013-05-30 13:31 ` Subash Patel
  15 siblings, 1 reply; 30+ messages in thread
From: Will Deacon @ 2013-05-17 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

CPUs implementing LPAE have atomic ldrd/strd instructions, meaning that
userspace software can avoid having to use the exclusive variants of
these instructions if they wish.

This patch advertises the atomicity of these instructions via the
hwcaps, so userspace can detect this CPU feature.

Reported-by: Vladimir Danushevsky <vladimir.danushevsky@oracle.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/uapi/asm/hwcap.h | 2 +-
 arch/arm/kernel/setup.c           | 8 +++++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/uapi/asm/hwcap.h b/arch/arm/include/uapi/asm/hwcap.h
index 3688fd1..6d34d08 100644
--- a/arch/arm/include/uapi/asm/hwcap.h
+++ b/arch/arm/include/uapi/asm/hwcap.h
@@ -25,6 +25,6 @@
 #define HWCAP_IDIVT	(1 << 18)
 #define HWCAP_VFPD32	(1 << 19)	/* set if VFP has 32 regs (not 16) */
 #define HWCAP_IDIV	(HWCAP_IDIVA | HWCAP_IDIVT)
-
+#define HWCAP_LPAE	(1 << 20)
 
 #endif /* _UAPI__ASMARM_HWCAP_H */
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 1522c7a..bdcd4dd 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -355,7 +355,7 @@ void __init early_print(const char *str, ...)
 
 static void __init cpuid_init_hwcaps(void)
 {
-	unsigned int divide_instrs;
+	unsigned int divide_instrs, vmsa;
 
 	if (cpu_architecture() < CPU_ARCH_ARMv7)
 		return;
@@ -368,6 +368,11 @@ static void __init cpuid_init_hwcaps(void)
 	case 1:
 		elf_hwcap |= HWCAP_IDIVT;
 	}
+
+	/* LPAE implies atomic ldrd/strd instructions */
+	vmsa = (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xf) >> 0;
+	if (vmsa >= 5)
+		elf_hwcap |= HWCAP_LPAE;
 }
 
 static void __init feat_v6_fixup(void)
@@ -872,6 +877,7 @@ static const char *hwcap_str[] = {
 	"vfpv4",
 	"idiva",
 	"idivt",
+	"lpae",
 	NULL
 };
 
-- 
1.8.2.2
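
As a rough sketch of how the two sides of this hwcap fit together (not
part of the patch): the first helper mirrors the MMFR0 test in the hunk
above, the second shows the bit test userspace would perform. Both
helper names are hypothetical, and the bit value is the one proposed in
this patch.

```c
#include <stdint.h>

#define HWCAP_LPAE	(1 << 20)	/* value from the hwcap.h hunk above */

/*
 * Kernel-side probe: the patch reads the low nibble of MMFR0 and treats
 * values >= 5 as "LPAE implemented".
 */
static inline int mmfr0_has_lpae(uint32_t mmfr0)
{
	return (mmfr0 & 0xf) >= 5;
}

/* Userspace-side test on a hwcap word supplied by the caller. */
static inline int hwcap_has_atomic_ldrd_strd(unsigned long hwcap)
{
	return (hwcap & HWCAP_LPAE) != 0;
}
```

On an ARM Linux system with glibc 2.16 or later, the hwcap word could be
obtained with getauxval(AT_HWCAP) from <sys/auxv.h> and passed to the
second helper.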

* [PATCH 00/14] Random LPAE-related patches
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (13 preceding siblings ...)
  2013-05-17 17:07 ` [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions Will Deacon
@ 2013-05-17 18:23 ` Santosh Shilimkar
  2013-05-30 13:31 ` Subash Patel
  15 siblings, 0 replies; 30+ messages in thread
From: Santosh Shilimkar @ 2013-05-17 18:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Friday 17 May 2013 10:37 PM, Will Deacon wrote:
> Hello,
> 
> This is a collection of LPAE-related patches (mostly fixes) that have
> been kicking around for a while and I've been collecting in one place.
> There were some extra patches for bootmem, but they produced lots of
> warnings and need more thought, so they've not been included here.
> 
> All comments welcome,
> 
> Will
> 
> Cyril Chemparathy (9):
>   ARM: LPAE: use signed arithmetic for mask definitions
>   ARM: LPAE: use phys_addr_t in switch_mm()
>   ARM: LPAE: use 64-bit accessors for TTBR registers
>   ARM: LPAE: factor out T1SZ and TTBR1 computations
>   ARM: LPAE: accomodate >32-bit addresses for page table base
>   ARM: mm: use physical addresses in highmem sanity checks
>   ARM: fix type of PHYS_PFN_OFFSET to unsigned long
>   ARM: mm: cleanup checks for membank overlap with vmalloc area
>   ARM: mm: clean up membank size limit checks
> 
> Vitaly Andrianov (3):
>   ARM: LPAE: use phys_addr_t in alloc_init_pud()
>   ARM: LPAE: use phys_addr_t in free_memmap()
>   ARM: LPAE: use phys_addr_t for initrd location
> 
> Will Deacon (2):
>   ARM: lpae: fix definition of PTE_HWTABLE_PTRS
>   ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
> 
Thanks for picking up the patches. As discussed off-list, I have
tested these patches along with the remaining bootmem-related patches.

Feel free to use,
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

* [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
  2013-05-17 17:07 ` [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions Will Deacon
@ 2013-05-20 14:18   ` Catalin Marinas
  2013-05-20 14:24     ` Will Deacon
  2013-05-21 18:48     ` Rob Herring
  0 siblings, 2 replies; 30+ messages in thread
From: Catalin Marinas @ 2013-05-20 14:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 17, 2013 at 06:07:53PM +0100, Will Deacon wrote:
> CPUs implementing LPAE have atomic ldrd/strd instructions, meaning that
> userspace software can avoid having to use the exclusive variants of
> these instructions if they wish.
> 
> This patch advertises the atomicity of these instructions via the
> hwcaps, so userspace can detect this CPU feature.
> 
> Reported-by: Vladimir Danushevsky <vladimir.danushevsky@oracle.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
...
> +
> +	/* LPAE implies atomic ldrd/strd instructions */
> +	vmsa = (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xf) >> 0;
> +	if (vmsa >= 5)
> +		elf_hwcap |= HWCAP_LPAE;

As I mentioned in the past, I don't agree with exposing the "LPAE"
feature to user-space, it's not a feature that user space should care
about. An atomic double hwcap is better and you can even make this per
CPU via __v7_proc.

-- 
Catalin

* [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
  2013-05-20 14:18   ` Catalin Marinas
@ 2013-05-20 14:24     ` Will Deacon
  2013-05-20 15:11       ` Catalin Marinas
  2013-05-21 18:48     ` Rob Herring
  1 sibling, 1 reply; 30+ messages in thread
From: Will Deacon @ 2013-05-20 14:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 20, 2013 at 03:18:09PM +0100, Catalin Marinas wrote:
> On Fri, May 17, 2013 at 06:07:53PM +0100, Will Deacon wrote:
> > CPUs implementing LPAE have atomic ldrd/strd instructions, meaning that
> > userspace software can avoid having to use the exclusive variants of
> > these instructions if they wish.
> > 
> > This patch advertises the atomicity of these instructions via the
> > hwcaps, so userspace can detect this CPU feature.
> > 
> > Reported-by: Vladimir Danushevsky <vladimir.danushevsky@oracle.com>
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> ...
> > +
> > +	/* LPAE implies atomic ldrd/strd instructions */
> > +	vmsa = (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xf) >> 0;
> > +	if (vmsa >= 5)
> > +		elf_hwcap |= HWCAP_LPAE;
> 
> As I mentioned in the past, I don't agree with exposing the "LPAE"
> feature to user-space, it's not a feature that user space should care
> about. An atomic double hwcap is better and you can even make this per
> CPU via __v7_proc.

I don't buy the argument that this could be per-CPU: doubleword atomicity
requires support in the whole system -- not just in the CPU. The only way we
can rely on it, is by guarantees made in the architecture, which are made
as part of LPAE.

If this just boils down to a naming issue, then I'm happy to change it, but
we *are* reporting whether LPAE is supported and I can't think of a better
name than that.

Will

* [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
  2013-05-20 14:24     ` Will Deacon
@ 2013-05-20 15:11       ` Catalin Marinas
  2013-05-20 16:04         ` Will Deacon
  0 siblings, 1 reply; 30+ messages in thread
From: Catalin Marinas @ 2013-05-20 15:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 20, 2013 at 03:24:59PM +0100, Will Deacon wrote:
> On Mon, May 20, 2013 at 03:18:09PM +0100, Catalin Marinas wrote:
> > On Fri, May 17, 2013 at 06:07:53PM +0100, Will Deacon wrote:
> > > CPUs implementing LPAE have atomic ldrd/strd instructions, meaning that
> > > userspace software can avoid having to use the exclusive variants of
> > > these instructions if they wish.
> > > 
> > > This patch advertises the atomicity of these instructions via the
> > > hwcaps, so userspace can detect this CPU feature.
> > > 
> > > Reported-by: Vladimir Danushevsky <vladimir.danushevsky@oracle.com>
> > > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > ...
> > > +
> > > +	/* LPAE implies atomic ldrd/strd instructions */
> > > +	vmsa = (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xf) >> 0;
> > > +	if (vmsa >= 5)
> > > +		elf_hwcap |= HWCAP_LPAE;
> > 
> > As I mentioned in the past, I don't agree with exposing the "LPAE"
> > feature to user-space, it's not a feature that user space should care
> > about. An atomic double hwcap is better and you can even make this per
> > CPU via __v7_proc.
> 
> I don't buy the argument that this could be per-CPU: doubleword atomicity
> requires support in the whole system -- not just in the CPU. The only way we
> can rely on it, is by guarantees made in the architecture, which are made
> as part of LPAE.

Well, you know that for A7/A15 you have this feature as they support
LPAE. You can have it as a generic LPAE test (only that the ARM ARM
isn't entirely clear here, so for people asking in the future we could
say it's a feature of the A7/A15/etc.)

> If this just boils down to a naming issue, then I'm happy to change it, but
> we *are* reporting whether LPAE is supported and I can't think of a better
> name than that.

Given that the ARM ARM isn't clear (though this is the case on the
actual implementations), user space may not necessarily assume that
LPAE==atomic doubles. That's why I think reporting the actual atomic
feature is better.

-- 
Catalin

* [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
  2013-05-20 15:11       ` Catalin Marinas
@ 2013-05-20 16:04         ` Will Deacon
  2013-05-20 17:13           ` Catalin Marinas
  0 siblings, 1 reply; 30+ messages in thread
From: Will Deacon @ 2013-05-20 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 20, 2013 at 04:11:58PM +0100, Catalin Marinas wrote:
> On Mon, May 20, 2013 at 03:24:59PM +0100, Will Deacon wrote:
> > On Mon, May 20, 2013 at 03:18:09PM +0100, Catalin Marinas wrote:
> > > As I mentioned in the past, I don't agree with exposing the "LPAE"
> > > feature to user-space, it's not a feature that user space should care
> > > about. An atomic double hwcap is better and you can even make this per
> > > CPU via __v7_proc.
> > 
> > I don't buy the argument that this could be per-CPU: doubleword atomicity
> > requires support in the whole system -- not just in the CPU. The only way we
> > can rely on it, is by guarantees made in the architecture, which are made
> > as part of LPAE.
> 
> Well, you know that for A7/A15 you have this feature as they support
> LPAE. You can have it as a generic LPAE test (only that the ARM ARM
> isn't entirely clear here, so for people asking in the future we could
> say it's a feature of the A7/A15/etc.)

No, as far as Linux is concerned, it's a feature of any system claiming to
have support for LPAE. Given that we allocate page tables using our usual
allocators, all memory that we map as normal must be capable of doubleword
atomics. We make use of this in our atomic64_{read,set} implementations.

> > If this just boils down to a naming issue, then I'm happy to change it, but
> > we *are* reporting whether LPAE is supported and I can't think of a better
> > name than that.
> 
> Given that the ARM ARM isn't clear (though this is the case on the
> actual implementations), user space may not necessarily assume that
> LPAE==atomic doubles. That's why I think reporting the actual atomic
> feature is better.

The ARM ARM isn't too bad: it's just avoiding mandating 64-bit-wide paths
around the entire SoC (and I've checked this with the architects). The only
way we can probe this feature is using the MMFR0 and checking if LPAE is
supported, and that's exactly what userspace will need to rely on. We can
change the name, but the probe code will remain the same so I'm not sure it
makes anything clearer. We had "atomicd" originally, but that sounds like a
techno band.

Will

* [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
  2013-05-20 16:04         ` Will Deacon
@ 2013-05-20 17:13           ` Catalin Marinas
  2013-05-21 18:02             ` Will Deacon
  0 siblings, 1 reply; 30+ messages in thread
From: Catalin Marinas @ 2013-05-20 17:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 20, 2013 at 05:04:07PM +0100, Will Deacon wrote:
> On Mon, May 20, 2013 at 04:11:58PM +0100, Catalin Marinas wrote:
> > On Mon, May 20, 2013 at 03:24:59PM +0100, Will Deacon wrote:
> > > On Mon, May 20, 2013 at 03:18:09PM +0100, Catalin Marinas wrote:
> > > > As I mentioned in the past, I don't agree with exposing the "LPAE"
> > > > feature to user-space, it's not a feature that user space should care
> > > > about. An atomic double hwcap is better and you can even make this per
> > > > CPU via __v7_proc.
> > > 
> > > I don't buy the argument that this could be per-CPU: doubleword atomicity
> > > requires support in the whole system -- not just in the CPU. The only way we
> > > can rely on it, is by guarantees made in the architecture, which are made
> > > as part of LPAE.
> > 
> > Well, you know that for A7/A15 you have this feature as they support
> > LPAE. You can have it as a generic LPAE test (only that the ARM ARM
> > isn't entirely clear here, so for people asking in the future we could
> > say it's a feature of the A7/A15/etc.)
> 
> No, as far as Linux is concerned, it's a feature of any system claiming to
> have support for LPAE. Given that we allocate page tables using our usual
> allocators, all memory that we map as normal must be capable of doubleword
> atomics. We make use of this in our atomic64_{read,set} implementations.

I guess we don't have non-LPAE processors that do atomic doubles (if we
would, they can use __v7_proc explicitly, though with a different name).

> > > If this just boils down to a naming issue, then I'm happy to change it, but
> > > we *are* reporting whether LPAE is supported and I can't think of a better
> > > name than that.
> > 
> > Given that the ARM ARM isn't clear (though this is the case on the
> > actual implementations), user space may not necessarily assume that
> > LPAE==atomic doubles. That's why I think reporting the actual atomic
> > feature is better.
> 
> The ARM ARM isn't too bad: it's just avoiding mandating 64-bit-wide paths
> around the entire SoC (and I've checked this with the architects). The only
> way we can probe this feature is using the MMFR0 and checking if LPAE is
> supported, and that's exactly what userspace will need to rely on.

Well, LPAE implies atomic doubles but I wouldn't say that's the "only"
way, it can always be a feature of the CPU. Now, would the user
developers fully understand the implications of LPAE?

> We can
> change the name, but the probe code will remain the same so I'm not sure it
> makes anything clearer. We had "atomicd" originally, but that sounds like a
> techno band.

We can make it longer, 'atomicdbl', if that's the issue ;).

-- 
Catalin

* [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
  2013-05-20 17:13           ` Catalin Marinas
@ 2013-05-21 18:02             ` Will Deacon
  2013-05-22  8:43               ` Catalin Marinas
  0 siblings, 1 reply; 30+ messages in thread
From: Will Deacon @ 2013-05-21 18:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 20, 2013 at 06:13:52PM +0100, Catalin Marinas wrote:
> On Mon, May 20, 2013 at 05:04:07PM +0100, Will Deacon wrote:
> > On Mon, May 20, 2013 at 04:11:58PM +0100, Catalin Marinas wrote:
> > > Given that the ARM ARM isn't clear (though this is the case on the
> > > actual implementations), user space may not necessarily assume that
> > > LPAE==atomic doubles. That's why I think reporting the actual atomic
> > > feature is better.
> > 
> > The ARM ARM isn't too bad: it's just avoiding mandating 64-bit-wide paths
> > around the entire SoC (and I've checked this with the architects). The only
> > way we can probe this feature is using the MMFR0 and checking if LPAE is
> > supported, and that's exactly what userspace will need to rely on.
> 
> Well, LPAE implies atomic doubles but I wouldn't say that's the "only"
> way, it can always be a feature of the CPU. Now, would the user
> developers fully understand the implications of LPAE?

I don't think it *can* be a feature of the CPU, because it depends on
system-wide support. It could be a feature of an SoC, but per-SoC hwcaps
isn't something we currently support. As I said, the only reason we can even
probe this is because the architecture helps us out.

> > We can
> > change the name, but the probe code will remain the same so I'm not sure it
> > makes anything clearer. We had "atomicd" originally, but that sounds like a
> > techno band.
> 
> We can make it longer, 'atomicdbl', if that's the issue ;).

Argh! :)

Will

* [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
  2013-05-20 14:18   ` Catalin Marinas
  2013-05-20 14:24     ` Will Deacon
@ 2013-05-21 18:48     ` Rob Herring
  2013-05-22  8:47       ` Catalin Marinas
  1 sibling, 1 reply; 30+ messages in thread
From: Rob Herring @ 2013-05-21 18:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 20, 2013 at 9:18 AM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
> On Fri, May 17, 2013 at 06:07:53PM +0100, Will Deacon wrote:
>> CPUs implementing LPAE have atomic ldrd/strd instructions, meaning that
>> userspace software can avoid having to use the exclusive variants of
>> these instructions if they wish.
>>
>> This patch advertises the atomicity of these instructions via the
>> hwcaps, so userspace can detect this CPU feature.
>>
>> Reported-by: Vladimir Danushevsky <vladimir.danushevsky@oracle.com>
>> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ...
>> +
>> +     /* LPAE implies atomic ldrd/strd instructions */
>> +     vmsa = (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xf) >> 0;
>> +     if (vmsa >= 5)
>> +             elf_hwcap |= HWCAP_LPAE;
>
> As I mentioned in the past, I don't agree with exposing the "LPAE"
> feature to user-space, it's not a feature that user space should care
> about. An atomic double hwcap is better and you can even make this per
> CPU via __v7_proc.

How does userspace know whether to install a non-LPAE or LPAE kernel
in a generic way?

Rob

* [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
  2013-05-21 18:02             ` Will Deacon
@ 2013-05-22  8:43               ` Catalin Marinas
  0 siblings, 0 replies; 30+ messages in thread
From: Catalin Marinas @ 2013-05-22  8:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, May 21, 2013 at 07:02:01PM +0100, Will Deacon wrote:
> On Mon, May 20, 2013 at 06:13:52PM +0100, Catalin Marinas wrote:
> > On Mon, May 20, 2013 at 05:04:07PM +0100, Will Deacon wrote:
> > > On Mon, May 20, 2013 at 04:11:58PM +0100, Catalin Marinas wrote:
> > > > Given that the ARM ARM isn't clear (though this is the case on the
> > > > actual implementations), user space may not necessarily assume that
> > > > LPAE==atomic doubles. That's why I think reporting the actual atomic
> > > > feature is better.
> > > 
> > > The ARM ARM isn't too bad: it's just avoiding mandating 64-bit-wide paths
> > > around the entire SoC (and I've checked this with the architects). The only
> > > way we can probe this feature is using the MMFR0 and checking if LPAE is
> > > supported, and that's exactly what userspace will need to rely on.
> > 
> > Well, LPAE implies atomic doubles but I wouldn't say that's the "only"
> > way, it can always be a feature of the CPU. Now, would the user
> > developers fully understand the implications of LPAE?
> 
> I don't think it *can* be a feature of the CPU, because it depends on
> system-wide support. It could be a feature of an SoC, but per-SoC hwcaps
> isn't something we currently support. As I said, the only reason we can even
> probe this is because the architecture helps us out.

LPAE is also a feature of the CPU, not the SoC.

-- 
Catalin

* [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
  2013-05-21 18:48     ` Rob Herring
@ 2013-05-22  8:47       ` Catalin Marinas
  2013-05-22 18:09         ` Will Deacon
  0 siblings, 1 reply; 30+ messages in thread
From: Catalin Marinas @ 2013-05-22  8:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, May 21, 2013 at 07:48:35PM +0100, Rob Herring wrote:
> On Mon, May 20, 2013 at 9:18 AM, Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > On Fri, May 17, 2013 at 06:07:53PM +0100, Will Deacon wrote:
> >> CPUs implementing LPAE have atomic ldrd/strd instructions, meaning that
> >> userspace software can avoid having to use the exclusive variants of
> >> these instructions if they wish.
> >>
> >> This patch advertises the atomicity of these instructions via the
> >> hwcaps, so userspace can detect this CPU feature.
> >>
> >> Reported-by: Vladimir Danushevsky <vladimir.danushevsky@oracle.com>
> >> Signed-off-by: Will Deacon <will.deacon@arm.com>
> > ...
> >> +
> >> +     /* LPAE implies atomic ldrd/strd instructions */
> >> +     vmsa = (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xf) >> 0;
> >> +     if (vmsa >= 5)
> >> +             elf_hwcap |= HWCAP_LPAE;
> >
> > As I mentioned in the past, I don't agree with exposing the "LPAE"
> > feature to user-space, it's not a feature that user space should care
> > about. An atomic double hwcap is better and you can even make this per
> > CPU via __v7_proc.
> 
> How does userspace know whether to install a non-LPAE or LPAE kernel
> in a generic way?

This is a valid reason to expose LPAE to userspace, though elf_hwcap sounds
a bit strange.

-- 
Catalin

^ permalink raw reply	[flat|nested] 30+ messages in thread
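
From the userspace side, a hwcap bit like the one proposed here is read from the auxiliary vector. The sketch below shows roughly how a program might test it with getauxval(). The bit value (1 << 20) is an assumption that does not appear in this thread, so on a real ARM build it should come from <asm/hwcap.h>. The function names are hypothetical.

```c
#include <stdbool.h>
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP (glibc 2.16 or later) */

/*
 * Assumed bit position for the new hwcap; not stated in the thread.
 * Prefer the definition from <asm/hwcap.h> when building for ARM.
 */
#ifndef HWCAP_LPAE
#define HWCAP_LPAE (1UL << 20)
#endif

/* Hypothetical helper: test the LPAE bit in an AT_HWCAP word. */
static bool hwcap_has_lpae(unsigned long hwcap)
{
	return (hwcap & HWCAP_LPAE) != 0;
}

/*
 * Hypothetical helper: query the running system. The result is only
 * meaningful on 32-bit ARM; other architectures use this bit differently.
 */
static bool cpu_has_atomic_ldrd_strd(void)
{
	return hwcap_has_lpae(getauxval(AT_HWCAP));
}
```

A caller would check cpu_has_atomic_ldrd_strd() once at startup and fall back to the exclusive load/store variants when it returns false.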

* [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
  2013-05-22  8:47       ` Catalin Marinas
@ 2013-05-22 18:09         ` Will Deacon
  2013-05-23  8:50           ` Catalin Marinas
  0 siblings, 1 reply; 30+ messages in thread
From: Will Deacon @ 2013-05-22 18:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 22, 2013 at 09:47:44AM +0100, Catalin Marinas wrote:
> On Tue, May 21, 2013 at 07:48:35PM +0100, Rob Herring wrote:
> > On Mon, May 20, 2013 at 9:18 AM, Catalin Marinas
> > <catalin.marinas@arm.com> wrote:
> > > On Fri, May 17, 2013 at 06:07:53PM +0100, Will Deacon wrote:
> > >> CPUs implementing LPAE have atomic ldrd/strd instructions, meaning that
> > >> userspace software can avoid having to use the exclusive variants of
> > >> these instructions if they wish.
> > >>
> > >> This patch advertises the atomicity of these instructions via the
> > >> hwcaps, so userspace can detect this CPU feature.
> > >>
> > >> Reported-by: Vladimir Danushevsky <vladimir.danushevsky@oracle.com>
> > >> Signed-off-by: Will Deacon <will.deacon@arm.com>
> > > ...
> > >> +
> > >> +     /* LPAE implies atomic ldrd/strd instructions */
> > >> +     vmsa = (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xf) >> 0;
> > >> +     if (vmsa >= 5)
> > >> +             elf_hwcap |= HWCAP_LPAE;
> > >
> > > As I mentioned in the past, I don't agree with exposing the "LPAE"
> > > feature to user-space, it's not a feature that user space should care
> > > about. An atomic double hwcap is better and you can even make this per
> > > CPU via __v7_proc.
> > 
> > How does userspace know whether to install a non-LPAE or LPAE kernel
> > in a generic way?
> 
> This is a valid reason to expose LPAE to userspace, though elf_hwcap sounds
> a bit strange.

In lieu of anything else, do you mind if I continue with the patch as it
stands?

Will

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
  2013-05-22 18:09         ` Will Deacon
@ 2013-05-23  8:50           ` Catalin Marinas
  0 siblings, 0 replies; 30+ messages in thread
From: Catalin Marinas @ 2013-05-23  8:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 22, 2013 at 07:09:42PM +0100, Will Deacon wrote:
> On Wed, May 22, 2013 at 09:47:44AM +0100, Catalin Marinas wrote:
> > On Tue, May 21, 2013 at 07:48:35PM +0100, Rob Herring wrote:
> > > On Mon, May 20, 2013 at 9:18 AM, Catalin Marinas
> > > <catalin.marinas@arm.com> wrote:
> > > > On Fri, May 17, 2013 at 06:07:53PM +0100, Will Deacon wrote:
> > > >> CPUs implementing LPAE have atomic ldrd/strd instructions, meaning that
> > > >> userspace software can avoid having to use the exclusive variants of
> > > >> these instructions if they wish.
> > > >>
> > > >> This patch advertises the atomicity of these instructions via the
> > > >> hwcaps, so userspace can detect this CPU feature.
> > > >>
> > > >> Reported-by: Vladimir Danushevsky <vladimir.danushevsky@oracle.com>
> > > >> Signed-off-by: Will Deacon <will.deacon@arm.com>
> > > > ...
> > > >> +
> > > >> +     /* LPAE implies atomic ldrd/strd instructions */
> > > >> +     vmsa = (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xf) >> 0;
> > > >> +     if (vmsa >= 5)
> > > >> +             elf_hwcap |= HWCAP_LPAE;
> > > >
> > > > As I mentioned in the past, I don't agree with exposing the "LPAE"
> > > > feature to user-space, it's not a feature that user space should care
> > > > about. An atomic double hwcap is better and you can even make this per
> > > > CPU via __v7_proc.
> > > 
> > > How does userspace know whether to install a non-LPAE or LPAE kernel
> > > in a generic way?
> > 
> > This is a valid reason to expose LPAE to userspace, though elf_hwcap sounds
> > a bit strange.
> 
> In lieu of anything else, do you mind if I continue with the patch as it
> stands?

OK, I give up ;)

-- 
Catalin

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 00/14] Random LPAE-related patches
  2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
                   ` (14 preceding siblings ...)
  2013-05-17 18:23 ` [PATCH 00/14] Random LPAE-related patches Santosh Shilimkar
@ 2013-05-30 13:31 ` Subash Patel
  2013-05-30 15:03   ` Will Deacon
  15 siblings, 1 reply; 30+ messages in thread
From: Subash Patel @ 2013-05-30 13:31 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

Sorry for the late response; I was on vacation.

Thanks for these patches. I have tested them on the exynos5440
platform. They haven't affected LPAE functionality for us compared with
the previous versions (posted by Cyril). You can use "Tested-By: Subash Patel
<subash.rp@samsung.com>" as you deem necessary.

Regards,
Subash

On 05/17/2013 10:37 PM, Will Deacon wrote:
> Hello,
>
> This is a collection of LPAE-related patches (mostly fixes) that have
> been kicking around for a while and I've been collecting in one place.
> There were some extra patches for bootmem, but they produced lots of
> warnings and need more thought, so they've not been included here.
>
> All comments welcome,
>
> Will
>
> Cyril Chemparathy (9):
>    ARM: LPAE: use signed arithmetic for mask definitions
>    ARM: LPAE: use phys_addr_t in switch_mm()
>    ARM: LPAE: use 64-bit accessors for TTBR registers
>    ARM: LPAE: factor out T1SZ and TTBR1 computations
>    ARM: LPAE: accomodate >32-bit addresses for page table base
>    ARM: mm: use physical addresses in highmem sanity checks
>    ARM: fix type of PHYS_PFN_OFFSET to unsigned long
>    ARM: mm: cleanup checks for membank overlap with vmalloc area
>    ARM: mm: clean up membank size limit checks
>
> Vitaly Andrianov (3):
>    ARM: LPAE: use phys_addr_t in alloc_init_pud()
>    ARM: LPAE: use phys_addr_t in free_memmap()
>    ARM: LPAE: use phys_addr_t for initrd location
>
> Will Deacon (2):
>    ARM: lpae: fix definition of PTE_HWTABLE_PTRS
>    ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
>
>   arch/arm/include/asm/memory.h               | 18 +++++++++-
>   arch/arm/include/asm/page.h                 |  2 +-
>   arch/arm/include/asm/pgtable-3level-hwdef.h | 20 +++++++++++
>   arch/arm/include/asm/pgtable-3level.h       |  8 ++---
>   arch/arm/include/asm/proc-fns.h             | 26 ++++++++++----
>   arch/arm/include/uapi/asm/hwcap.h           |  2 +-
>   arch/arm/kernel/head.S                      | 10 +++---
>   arch/arm/kernel/setup.c                     |  8 ++++-
>   arch/arm/kernel/smp.c                       | 11 ++++--
>   arch/arm/mm/context.c                       |  9 ++---
>   arch/arm/mm/init.c                          | 19 ++++++-----
>   arch/arm/mm/mmu.c                           | 49 +++++++++-----------------
>   arch/arm/mm/proc-v7-3level.S                | 53 +++++++++++++++--------------
>   13 files changed, 139 insertions(+), 96 deletions(-)
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 00/14] Random LPAE-related patches
  2013-05-30 13:31 ` Subash Patel
@ 2013-05-30 15:03   ` Will Deacon
  2013-05-31  0:38     ` Kukjin Kim
  0 siblings, 1 reply; 30+ messages in thread
From: Will Deacon @ 2013-05-30 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, May 30, 2013 at 02:31:09PM +0100, Subash Patel wrote:
> Thanks for these patches. I have tested them on the exynos5440
> platform. They haven't affected LPAE functionality for us compared with
> the previous versions (posted by Cyril). You can use "Tested-By: Subash Patel
> <subash.rp@samsung.com>" as you deem necessary.

Thanks Subash. I'll add your tag and send a pull to Russell when he's back
from holiday.

Will

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 00/14] Random LPAE-related patches
  2013-05-30 15:03   ` Will Deacon
@ 2013-05-31  0:38     ` Kukjin Kim
  0 siblings, 0 replies; 30+ messages in thread
From: Kukjin Kim @ 2013-05-31  0:38 UTC (permalink / raw)
  To: linux-arm-kernel

Will Deacon wrote:
> 
> On Thu, May 30, 2013 at 02:31:09PM +0100, Subash Patel wrote:
> > Thanks for these patches. I have tested them on the exynos5440
> > platform. They haven't affected LPAE functionality for us compared with
> > the previous versions (posted by Cyril). You can use "Tested-By: Subash Patel
> > <subash.rp@samsung.com>" as you deem necessary.
> 
> Thanks Subash. I'll add your tag and send a pull to Russell when he's back
> from holiday.
> 
Sounds good, so we can support LPAE on exynos5440 with mainline soon.

Thanks.

- Kukjin

^ permalink raw reply	[flat|nested] 30+ messages in thread

Thread overview: 30+ messages
2013-05-17 17:07 [PATCH 00/14] Random LPAE-related patches Will Deacon
2013-05-17 17:07 ` [PATCH 01/14] ARM: LPAE: use signed arithmetic for mask definitions Will Deacon
2013-05-17 17:07 ` [PATCH 02/14] ARM: LPAE: use phys_addr_t in alloc_init_pud() Will Deacon
2013-05-17 17:07 ` [PATCH 03/14] ARM: LPAE: use phys_addr_t in free_memmap() Will Deacon
2013-05-17 17:07 ` [PATCH 04/14] ARM: LPAE: use phys_addr_t for initrd location Will Deacon
2013-05-17 17:07 ` [PATCH 05/14] ARM: LPAE: use phys_addr_t in switch_mm() Will Deacon
2013-05-17 17:07 ` [PATCH 06/14] ARM: LPAE: use 64-bit accessors for TTBR registers Will Deacon
2013-05-17 17:07 ` [PATCH 07/14] ARM: LPAE: factor out T1SZ and TTBR1 computations Will Deacon
2013-05-17 17:07 ` [PATCH 08/14] ARM: LPAE: accomodate >32-bit addresses for page table base Will Deacon
2013-05-17 17:07 ` [PATCH 09/14] ARM: mm: use physical addresses in highmem sanity checks Will Deacon
2013-05-17 17:07 ` [PATCH 10/14] ARM: fix type of PHYS_PFN_OFFSET to unsigned long Will Deacon
2013-05-17 17:07 ` [PATCH 11/14] ARM: mm: cleanup checks for membank overlap with vmalloc area Will Deacon
2013-05-17 17:07 ` [PATCH 12/14] ARM: mm: clean up membank size limit checks Will Deacon
2013-05-17 17:07 ` [PATCH 13/14] ARM: lpae: fix definition of PTE_HWTABLE_PTRS Will Deacon
2013-05-17 17:07 ` [PATCH 14/14] ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions Will Deacon
2013-05-20 14:18   ` Catalin Marinas
2013-05-20 14:24     ` Will Deacon
2013-05-20 15:11       ` Catalin Marinas
2013-05-20 16:04         ` Will Deacon
2013-05-20 17:13           ` Catalin Marinas
2013-05-21 18:02             ` Will Deacon
2013-05-22  8:43               ` Catalin Marinas
2013-05-21 18:48     ` Rob Herring
2013-05-22  8:47       ` Catalin Marinas
2013-05-22 18:09         ` Will Deacon
2013-05-23  8:50           ` Catalin Marinas
2013-05-17 18:23 ` [PATCH 00/14] Random LPAE-related patches Santosh Shilimkar
2013-05-30 13:31 ` Subash Patel
2013-05-30 15:03   ` Will Deacon
2013-05-31  0:38     ` Kukjin Kim
