* [PATCH v2 00/20] ARM: Add support for the Large Physical Address Extensions
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

Hi,

This set of patches adds support for the Large Physical Address
Extensions (LPAE) on the ARM architecture (available with the Cortex-A15
processor). LPAE comes with a 3-level page table format (compared to
2 levels for the classic format), allowing up to a 40-bit physical
address space.

The ARM LPAE documentation is available from (free registration needed):

http://infocenter.arm.com/help/topic/com.arm.doc.ddi0406b_virtualization_extns/index.html

The full set of patches (kernel fixes, LPAE and support for an emulated
Versatile Express with Cortex-A15 tile) is available on this branch:

git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm.git arm-lpae

Changelog:

- Upgraded to the latest mainline kernel (2.6.37-rc1), resolving several
  conflicts.
- Enabled CONFIG_ARCH_DMA_ADDR_T_64BIT for future compatibility with the
  unified dma_addr_t patch.
- Changed the PHYS_ADDR_FMT printk format to be ANSI C compliant.
- Alignment faults now raise SIGBUS instead of SIGILL.
- Modified arch/arm/kernel/head.S to use SECTION_SHIFT and reduce the
  number of #ifdefs.
- Modified setup_mm_for_reboot() for LPAE.
- Modified identity_mapping_add/del() for LPAE.
- Removed the FIRST_USER_PGD_NR definition, as the places where it was
  used have been modified.

Any comments are welcome. Thanks.


Catalin Marinas (13):
  ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*
  ARM: LPAE: Factor out 2-level page table definitions into separate
    files
  ARM: LPAE: Do not assume Linux PTEs are always at PTRS_PER_PTE offset
  ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  ARM: LPAE: Introduce the 3-level page table format definitions
  ARM: LPAE: Page table maintenance for the 3-level format
  ARM: LPAE: MMU setup for the 3-level page table format
  ARM: LPAE: Change setup_mm_for_reboot() to work with LPAE
  ARM: LPAE: Remove the FIRST_USER_PGD_NR and USER_PTRS_PER_PGD
    definitions
  ARM: LPAE: Add fault handling support
  ARM: LPAE: Add context switching support
  ARM: LPAE: Add SMP support for the 3-level page table format
  ARM: LPAE: Add the Kconfig entries

Will Deacon (7):
  ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  ARM: LPAE: use phys_addr_t instead of unsigned long for physical
    addresses
  ARM: LPAE: Use generic dma_addr_t type definition
  ARM: LPAE: mark memory banks with start > ULONG_MAX as highmem
  ARM: LPAE: use phys_addr_t for physical start address in early_mem
  ARM: LPAE: add support for ATAG_MEM64
  ARM: LPAE: define printk format for physical addresses and page table
    entries

 arch/arm/Kconfig                            |    2 +-
 arch/arm/include/asm/cpu-multi32.h          |    8 +
 arch/arm/include/asm/cpu-single.h           |    4 +
 arch/arm/include/asm/memory.h               |   17 +-
 arch/arm/include/asm/outercache.h           |   14 +-
 arch/arm/include/asm/page-nommu.h           |    8 +-
 arch/arm/include/asm/page.h                 |   42 +----
 arch/arm/include/asm/pgalloc.h              |   34 +++-
 arch/arm/include/asm/pgtable-2level-hwdef.h |   91 +++++++++
 arch/arm/include/asm/pgtable-2level-types.h |   64 +++++++
 arch/arm/include/asm/pgtable-2level.h       |  147 +++++++++++++++
 arch/arm/include/asm/pgtable-3level-hwdef.h |   78 ++++++++
 arch/arm/include/asm/pgtable-3level-types.h |   55 ++++++
 arch/arm/include/asm/pgtable-3level.h       |  110 +++++++++++
 arch/arm/include/asm/pgtable-hwdef.h        |   81 +--------
 arch/arm/include/asm/pgtable.h              |  267 +++++++++++----------------
 arch/arm/include/asm/proc-fns.h             |   13 ++
 arch/arm/include/asm/setup.h                |   12 +-
 arch/arm/include/asm/types.h                |   24 +--
 arch/arm/kernel/compat.c                    |    4 +-
 arch/arm/kernel/head.S                      |  106 ++++++++----
 arch/arm/kernel/module.c                    |    2 +-
 arch/arm/kernel/setup.c                     |   19 ++-
 arch/arm/kernel/smp.c                       |   43 ++++-
 arch/arm/mm/Kconfig                         |   13 ++
 arch/arm/mm/alignment.c                     |    8 +-
 arch/arm/mm/context.c                       |   18 ++-
 arch/arm/mm/dma-mapping.c                   |    6 +-
 arch/arm/mm/fault.c                         |   88 +++++++++-
 arch/arm/mm/init.c                          |    6 +-
 arch/arm/mm/ioremap.c                       |    8 +-
 arch/arm/mm/mm.h                            |    8 +-
 arch/arm/mm/mmu.c                           |   99 +++++++---
 arch/arm/mm/pgd.c                           |   20 ++-
 arch/arm/mm/proc-macros.S                   |    5 +-
 arch/arm/mm/proc-v7.S                       |  115 +++++++++++-
 36 files changed, 1213 insertions(+), 426 deletions(-)
 create mode 100644 arch/arm/include/asm/pgtable-2level-hwdef.h
 create mode 100644 arch/arm/include/asm/pgtable-2level-types.h
 create mode 100644 arch/arm/include/asm/pgtable-2level.h
 create mode 100644 arch/arm/include/asm/pgtable-3level-hwdef.h
 create mode 100644 arch/arm/include/asm/pgtable-3level-types.h
 create mode 100644 arch/arm/include/asm/pgtable-3level.h



* [PATCH v2 01/20] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

PGDIR_SHIFT and PMD_SHIFT for the classic 2-level page table format have
the same value (21). This patch converts the PGDIR_* uses in the kernel
to the PMD_* equivalent.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/kernel/module.c  |    2 +-
 arch/arm/kernel/smp.c     |    4 ++--
 arch/arm/mm/dma-mapping.c |    6 +++---
 arch/arm/mm/mmu.c         |   16 ++++++++--------
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index d9bd786..6b30f01 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -32,7 +32,7 @@
  * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
  */
 #undef MODULES_VADDR
-#define MODULES_VADDR	(((unsigned long)_etext + ~PGDIR_MASK) & PGDIR_MASK)
+#define MODULES_VADDR	(((unsigned long)_etext + ~PMD_MASK) & PMD_MASK)
 #endif
 
 #ifdef CONFIG_MMU
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 8c19595..40b386c 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -78,7 +78,7 @@ static inline void identity_mapping_add(pgd_t *pgd, unsigned long start,
 	if (cpu_architecture() <= CPU_ARCH_ARMv5TEJ && !cpu_is_xscale())
 		prot |= PMD_BIT4;
 
-	for (addr = start & PGDIR_MASK; addr < end;) {
+	for (addr = start & PMD_MASK; addr < end;) {
 		pmd = pmd_offset(pgd + pgd_index(addr), addr);
 		pmd[0] = __pmd(addr | prot);
 		addr += SECTION_SIZE;
@@ -95,7 +95,7 @@ static inline void identity_mapping_del(pgd_t *pgd, unsigned long start,
 	unsigned long addr;
 	pmd_t *pmd;
 
-	for (addr = start & PGDIR_MASK; addr < end; addr += PGDIR_SIZE) {
+	for (addr = start & PMD_MASK; addr < end; addr += PMD_SIZE) {
 		pmd = pmd_offset(pgd + pgd_index(addr), addr);
 		pmd[0] = __pmd(0);
 		pmd[1] = __pmd(0);
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index e4dd064..2aab1b4 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -120,8 +120,8 @@ static void __dma_free_buffer(struct page *page, size_t size)
 #endif
 
 #define CONSISTENT_OFFSET(x)	(((unsigned long)(x) - CONSISTENT_BASE) >> PAGE_SHIFT)
-#define CONSISTENT_PTE_INDEX(x) (((unsigned long)(x) - CONSISTENT_BASE) >> PGDIR_SHIFT)
-#define NUM_CONSISTENT_PTES (CONSISTENT_DMA_SIZE >> PGDIR_SHIFT)
+#define CONSISTENT_PTE_INDEX(x) (((unsigned long)(x) - CONSISTENT_BASE) >> PMD_SHIFT)
+#define NUM_CONSISTENT_PTES (CONSISTENT_DMA_SIZE >> PMD_SHIFT)
 
 /*
  * These are the page tables (2MB each) covering uncached, DMA consistent allocations
@@ -171,7 +171,7 @@ static int __init consistent_init(void)
 		}
 
 		consistent_pte[i++] = pte;
-		base += (1 << PGDIR_SHIFT);
+		base += (1 << PMD_SHIFT);
 	} while (base < CONSISTENT_END);
 
 	return ret;
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 79c01f5..5e3adca 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -858,14 +858,14 @@ static inline void prepare_page_table(void)
 	/*
 	 * Clear out all the mappings below the kernel image.
 	 */
-	for (addr = 0; addr < MODULES_VADDR; addr += PGDIR_SIZE)
+	for (addr = 0; addr < MODULES_VADDR; addr += PMD_SIZE)
 		pmd_clear(pmd_off_k(addr));
 
 #ifdef CONFIG_XIP_KERNEL
 	/* The XIP kernel is mapped in the module area -- skip over it */
-	addr = ((unsigned long)_etext + PGDIR_SIZE - 1) & PGDIR_MASK;
+	addr = ((unsigned long)_etext + PMD_SIZE - 1) & PMD_MASK;
 #endif
-	for ( ; addr < PAGE_OFFSET; addr += PGDIR_SIZE)
+	for ( ; addr < PAGE_OFFSET; addr += PMD_SIZE)
 		pmd_clear(pmd_off_k(addr));
 
 	/*
@@ -880,7 +880,7 @@ static inline void prepare_page_table(void)
 	 * memory bank, up to the end of the vmalloc region.
 	 */
 	for (addr = __phys_to_virt(end);
-	     addr < VMALLOC_END; addr += PGDIR_SIZE)
+	     addr < VMALLOC_END; addr += PMD_SIZE)
 		pmd_clear(pmd_off_k(addr));
 }
 
@@ -921,7 +921,7 @@ static void __init devicemaps_init(struct machine_desc *mdesc)
 	 */
 	vectors_page = early_alloc(PAGE_SIZE);
 
-	for (addr = VMALLOC_END; addr; addr += PGDIR_SIZE)
+	for (addr = VMALLOC_END; addr; addr += PMD_SIZE)
 		pmd_clear(pmd_off_k(addr));
 
 	/*
@@ -1068,12 +1068,12 @@ void setup_mm_for_reboot(char mode)
 		base_pmdval |= PMD_BIT4;
 
 	for (i = 0; i < FIRST_USER_PGD_NR + USER_PTRS_PER_PGD; i++, pgd++) {
-		unsigned long pmdval = (i << PGDIR_SHIFT) | base_pmdval;
+		unsigned long pmdval = (i << PMD_SHIFT) | base_pmdval;
 		pmd_t *pmd;
 
-		pmd = pmd_off(pgd, i << PGDIR_SHIFT);
+		pmd = pmd_off(pgd, i << PMD_SHIFT);
 		pmd[0] = __pmd(pmdval);
-		pmd[1] = __pmd(pmdval + (1 << (PGDIR_SHIFT - 1)));
+		pmd[1] = __pmd(pmdval + (1 << (PMD_SHIFT - 1)));
 		flush_pmd_entry(pmd);
 	}
 


* [PATCH v2 02/20] ARM: LPAE: Factor out 2-level page table definitions into separate files
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

This patch moves page table definitions from asm/page.h, asm/pgtable.h
and asm/pgtable-hwdef.h into the corresponding *-2level* files.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/page.h                 |   40 +-------
 arch/arm/include/asm/pgtable-2level-hwdef.h |   91 +++++++++++++++++
 arch/arm/include/asm/pgtable-2level-types.h |   64 ++++++++++++
 arch/arm/include/asm/pgtable-2level.h       |  147 +++++++++++++++++++++++++++
 arch/arm/include/asm/pgtable-hwdef.h        |   77 +--------------
 arch/arm/include/asm/pgtable.h              |  139 +-------------------------
 6 files changed, 306 insertions(+), 252 deletions(-)
 create mode 100644 arch/arm/include/asm/pgtable-2level-hwdef.h
 create mode 100644 arch/arm/include/asm/pgtable-2level-types.h
 create mode 100644 arch/arm/include/asm/pgtable-2level.h

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index a485ac3..3848105 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -151,45 +151,7 @@ extern void __cpu_copy_user_highpage(struct page *to, struct page *from,
 #define clear_page(page)	memset((void *)(page), 0, PAGE_SIZE)
 extern void copy_page(void *to, const void *from);
 
-#undef STRICT_MM_TYPECHECKS
-
-#ifdef STRICT_MM_TYPECHECKS
-/*
- * These are used to make use of C type-checking..
- */
-typedef struct { unsigned long pte; } pte_t;
-typedef struct { unsigned long pmd; } pmd_t;
-typedef struct { unsigned long pgd[2]; } pgd_t;
-typedef struct { unsigned long pgprot; } pgprot_t;
-
-#define pte_val(x)      ((x).pte)
-#define pmd_val(x)      ((x).pmd)
-#define pgd_val(x)	((x).pgd[0])
-#define pgprot_val(x)   ((x).pgprot)
-
-#define __pte(x)        ((pte_t) { (x) } )
-#define __pmd(x)        ((pmd_t) { (x) } )
-#define __pgprot(x)     ((pgprot_t) { (x) } )
-
-#else
-/*
- * .. while these make it easier on the compiler
- */
-typedef unsigned long pte_t;
-typedef unsigned long pmd_t;
-typedef unsigned long pgd_t[2];
-typedef unsigned long pgprot_t;
-
-#define pte_val(x)      (x)
-#define pmd_val(x)      (x)
-#define pgd_val(x)	((x)[0])
-#define pgprot_val(x)   (x)
-
-#define __pte(x)        (x)
-#define __pmd(x)        (x)
-#define __pgprot(x)     (x)
-
-#endif /* STRICT_MM_TYPECHECKS */
+#include <asm/pgtable-2level-types.h>
 
 #endif /* CONFIG_MMU */
 
diff --git a/arch/arm/include/asm/pgtable-2level-hwdef.h b/arch/arm/include/asm/pgtable-2level-hwdef.h
new file mode 100644
index 0000000..436529c
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-2level-hwdef.h
@@ -0,0 +1,91 @@
+/*
+ *  arch/arm/include/asm/pgtable-2level-hwdef.h
+ *
+ *  Copyright (C) 1995-2002 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_PGTABLE_2LEVEL_HWDEF_H
+#define _ASM_PGTABLE_2LEVEL_HWDEF_H
+
+/*
+ * Hardware page table definitions.
+ *
+ * + Level 1 descriptor (PMD)
+ *   - common
+ */
+#define PMD_TYPE_MASK		(3 << 0)
+#define PMD_TYPE_FAULT		(0 << 0)
+#define PMD_TYPE_TABLE		(1 << 0)
+#define PMD_TYPE_SECT		(2 << 0)
+#define PMD_BIT4		(1 << 4)
+#define PMD_DOMAIN(x)		((x) << 5)
+#define PMD_PROTECTION		(1 << 9)	/* v5 */
+/*
+ *   - section
+ */
+#define PMD_SECT_BUFFERABLE	(1 << 2)
+#define PMD_SECT_CACHEABLE	(1 << 3)
+#define PMD_SECT_XN		(1 << 4)	/* v6 */
+#define PMD_SECT_AP_WRITE	(1 << 10)
+#define PMD_SECT_AP_READ	(1 << 11)
+#define PMD_SECT_TEX(x)		((x) << 12)	/* v5 */
+#define PMD_SECT_APX		(1 << 15)	/* v6 */
+#define PMD_SECT_S		(1 << 16)	/* v6 */
+#define PMD_SECT_nG		(1 << 17)	/* v6 */
+#define PMD_SECT_SUPER		(1 << 18)	/* v6 */
+#define PMD_SECT_AF		(0)
+
+#define PMD_SECT_UNCACHED	(0)
+#define PMD_SECT_BUFFERED	(PMD_SECT_BUFFERABLE)
+#define PMD_SECT_WT		(PMD_SECT_CACHEABLE)
+#define PMD_SECT_WB		(PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
+#define PMD_SECT_MINICACHE	(PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE)
+#define PMD_SECT_WBWA		(PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
+#define PMD_SECT_NONSHARED_DEV	(PMD_SECT_TEX(2))
+
+/*
+ *   - coarse table (not used)
+ */
+
+/*
+ * + Level 2 descriptor (PTE)
+ *   - common
+ */
+#define PTE_TYPE_MASK		(3 << 0)
+#define PTE_TYPE_FAULT		(0 << 0)
+#define PTE_TYPE_LARGE		(1 << 0)
+#define PTE_TYPE_SMALL		(2 << 0)
+#define PTE_TYPE_EXT		(3 << 0)	/* v5 */
+#define PTE_BUFFERABLE		(1 << 2)
+#define PTE_CACHEABLE		(1 << 3)
+
+/*
+ *   - extended small page/tiny page
+ */
+#define PTE_EXT_XN		(1 << 0)	/* v6 */
+#define PTE_EXT_AP_MASK		(3 << 4)
+#define PTE_EXT_AP0		(1 << 4)
+#define PTE_EXT_AP1		(2 << 4)
+#define PTE_EXT_AP_UNO_SRO	(0 << 4)
+#define PTE_EXT_AP_UNO_SRW	(PTE_EXT_AP0)
+#define PTE_EXT_AP_URO_SRW	(PTE_EXT_AP1)
+#define PTE_EXT_AP_URW_SRW	(PTE_EXT_AP1|PTE_EXT_AP0)
+#define PTE_EXT_TEX(x)		((x) << 6)	/* v5 */
+#define PTE_EXT_APX		(1 << 9)	/* v6 */
+#define PTE_EXT_COHERENT	(1 << 9)	/* XScale3 */
+#define PTE_EXT_SHARED		(1 << 10)	/* v6 */
+#define PTE_EXT_NG		(1 << 11)	/* v6 */
+
+/*
+ *   - small page
+ */
+#define PTE_SMALL_AP_MASK	(0xff << 4)
+#define PTE_SMALL_AP_UNO_SRO	(0x00 << 4)
+#define PTE_SMALL_AP_UNO_SRW	(0x55 << 4)
+#define PTE_SMALL_AP_URO_SRW	(0xaa << 4)
+#define PTE_SMALL_AP_URW_SRW	(0xff << 4)
+
+#endif
diff --git a/arch/arm/include/asm/pgtable-2level-types.h b/arch/arm/include/asm/pgtable-2level-types.h
new file mode 100644
index 0000000..30f6741
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-2level-types.h
@@ -0,0 +1,64 @@
+/*
+ * arch/arm/include/asm/pgtable-2level-types.h
+ *
+ * Copyright (C) 1995-2003 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+#ifndef _ASM_PGTABLE_2LEVEL_TYPES_H
+#define _ASM_PGTABLE_2LEVEL_TYPES_H
+
+#undef STRICT_MM_TYPECHECKS
+
+typedef unsigned long pteval_t;
+
+#ifdef STRICT_MM_TYPECHECKS
+/*
+ * These are used to make use of C type-checking..
+ */
+typedef struct { unsigned long pte; } pte_t;
+typedef struct { unsigned long pmd; } pmd_t;
+typedef struct { unsigned long pgd[2]; } pgd_t;
+typedef struct { unsigned long pgprot; } pgprot_t;
+
+#define pte_val(x)      ((x).pte)
+#define pmd_val(x)      ((x).pmd)
+#define pgd_val(x)	((x).pgd[0])
+#define pgprot_val(x)   ((x).pgprot)
+
+#define __pte(x)        ((pte_t) { (x) } )
+#define __pmd(x)        ((pmd_t) { (x) } )
+#define __pgprot(x)     ((pgprot_t) { (x) } )
+
+#else
+/*
+ * .. while these make it easier on the compiler
+ */
+typedef unsigned long pte_t;
+typedef unsigned long pmd_t;
+typedef unsigned long pgd_t[2];
+typedef unsigned long pgprot_t;
+
+#define pte_val(x)      (x)
+#define pmd_val(x)      (x)
+#define pgd_val(x)	((x)[0])
+#define pgprot_val(x)   (x)
+
+#define __pte(x)        (x)
+#define __pmd(x)        (x)
+#define __pgprot(x)     (x)
+
+#endif /* STRICT_MM_TYPECHECKS */
+
+#endif	/* _ASM_PGTABLE_2LEVEL_TYPES_H */
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
new file mode 100644
index 0000000..d60bda9
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -0,0 +1,147 @@
+/*
+ *  arch/arm/include/asm/pgtable-2level.h
+ *
+ *  Copyright (C) 1995-2002 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_PGTABLE_2LEVEL_H
+#define _ASM_PGTABLE_2LEVEL_H
+
+/*
+ * Hardware-wise, we have a two level page table structure, where the first
+ * level has 4096 entries, and the second level has 256 entries.  Each entry
+ * is one 32-bit word.  Most of the bits in the second level entry are used
+ * by hardware, and there aren't any "accessed" and "dirty" bits.
+ *
+ * Linux on the other hand has a three level page table structure, which can
+ * be wrapped to fit a two level page table structure easily - using the PGD
+ * and PTE only.  However, Linux also expects one "PTE" table per page, and
+ * at least a "dirty" bit.
+ *
+ * Therefore, we tweak the implementation slightly - we tell Linux that we
+ * have 2048 entries in the first level, each of which is 8 bytes (iow, two
+ * hardware pointers to the second level.)  The second level contains two
+ * hardware PTE tables arranged contiguously, followed by Linux versions
+ * which contain the state information Linux needs.  We, therefore, end up
+ * with 512 entries in the "PTE" level.
+ *
+ * This leads to the page tables having the following layout:
+ *
+ *    pgd             pte
+ * |        |
+ * +--------+ +0
+ * |        |-----> +------------+ +0
+ * +- - - - + +4    |  h/w pt 0  |
+ * |        |-----> +------------+ +1024
+ * +--------+ +8    |  h/w pt 1  |
+ * |        |       +------------+ +2048
+ * +- - - - +       | Linux pt 0 |
+ * |        |       +------------+ +3072
+ * +--------+       | Linux pt 1 |
+ * |        |       +------------+ +4096
+ *
+ * See L_PTE_xxx below for definitions of bits in the "Linux pt", and
+ * PTE_xxx for definitions of bits appearing in the "h/w pt".
+ *
+ * PMD_xxx definitions refer to bits in the first level page table.
+ *
+ * The "dirty" bit is emulated by only granting hardware write permission
+ * iff the page is marked "writable" and "dirty" in the Linux PTE.  This
+ * means that a write to a clean page will cause a permission fault, and
+ * the Linux MM layer will mark the page dirty via handle_pte_fault().
+ * For the hardware to notice the permission change, the TLB entry must
+ * be flushed, and ptep_set_access_flags() does that for us.
+ *
+ * The "accessed" or "young" bit is emulated by a similar method; we only
+ * allow accesses to the page if the "young" bit is set.  Accesses to the
+ * page will cause a fault, and handle_pte_fault() will set the young bit
+ * for us as long as the page is marked present in the corresponding Linux
+ * PTE entry.  Again, ptep_set_access_flags() will ensure that the TLB is
+ * up to date.
+ *
+ * However, when the "young" bit is cleared, we deny access to the page
+ * by clearing the hardware PTE.  Currently Linux does not flush the TLB
+ * for us in this case, which means the TLB will retain the translation
+ * until either the TLB entry is evicted under pressure, or a context
+ * switch which changes the user space mapping occurs.
+ */
+#define PTRS_PER_PTE		512
+#define PTRS_PER_PMD		1
+#define PTRS_PER_PGD		2048
+
+/*
+ * PMD_SHIFT determines the size of the area a second-level page table can map
+ * PGDIR_SHIFT determines what a third-level page table entry can map
+ */
+#define PMD_SHIFT		21
+#define PGDIR_SHIFT		21
+
+#define PMD_SIZE		(1UL << PMD_SHIFT)
+#define PMD_MASK		(~(PMD_SIZE-1))
+#define PGDIR_SIZE		(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK		(~(PGDIR_SIZE-1))
+
+/*
+ * This is the lowest virtual address we can permit any user space
+ * mapping to be mapped at.  This is particularly important for
+ * non-high vector CPUs.
+ */
+#define FIRST_USER_ADDRESS	PAGE_SIZE
+
+#define FIRST_USER_PGD_NR	1
+#define USER_PTRS_PER_PGD	((TASK_SIZE/PGDIR_SIZE) - FIRST_USER_PGD_NR)
+
+/*
+ * section address mask and size definitions.
+ */
+#define SECTION_SHIFT		20
+#define SECTION_SIZE		(1UL << SECTION_SHIFT)
+#define SECTION_MASK		(~(SECTION_SIZE-1))
+
+/*
+ * ARMv6 supersection address mask and size definitions.
+ */
+#define SUPERSECTION_SHIFT	24
+#define SUPERSECTION_SIZE	(1UL << SUPERSECTION_SHIFT)
+#define SUPERSECTION_MASK	(~(SUPERSECTION_SIZE-1))
+
+/*
+ * "Linux" PTE definitions.
+ *
+ * We keep two sets of PTEs - the hardware and the linux version.
+ * This allows greater flexibility in the way we map the Linux bits
+ * onto the hardware tables, and allows us to have YOUNG and DIRTY
+ * bits.
+ *
+ * The PTE table pointer refers to the hardware entries; the "Linux"
+ * entries are stored 1024 bytes below.
+ */
+#define L_PTE_PRESENT		(1 << 0)
+#define L_PTE_YOUNG		(1 << 1)
+#define L_PTE_FILE		(1 << 2)	/* only when !PRESENT */
+#define L_PTE_DIRTY		(1 << 6)
+#define L_PTE_WRITE		(1 << 7)
+#define L_PTE_USER		(1 << 8)
+#define L_PTE_EXEC		(1 << 9)
+#define L_PTE_SHARED		(1 << 10)	/* shared(v6), coherent(xsc3) */
+
+/*
+ * These are the memory types, defined to be compatible with
+ * pre-ARMv6 CPUs cacheable and bufferable bits:   XXCB
+ */
+#define L_PTE_MT_UNCACHED	(0x00 << 2)	/* 0000 */
+#define L_PTE_MT_BUFFERABLE	(0x01 << 2)	/* 0001 */
+#define L_PTE_MT_WRITETHROUGH	(0x02 << 2)	/* 0010 */
+#define L_PTE_MT_WRITEBACK	(0x03 << 2)	/* 0011 */
+#define L_PTE_MT_MINICACHE	(0x06 << 2)	/* 0110 (sa1100, xscale) */
+#define L_PTE_MT_WRITEALLOC	(0x07 << 2)	/* 0111 */
+#define L_PTE_MT_DEV_SHARED	(0x04 << 2)	/* 0100 */
+#define L_PTE_MT_DEV_NONSHARED	(0x0c << 2)	/* 1100 */
+#define L_PTE_MT_DEV_WC		(0x09 << 2)	/* 1001 */
+#define L_PTE_MT_DEV_CACHED	(0x0b << 2)	/* 1011 */
+#define L_PTE_MT_MASK		(0x0f << 2)
+
+#endif /* _ASM_PGTABLE_2LEVEL_H */
diff --git a/arch/arm/include/asm/pgtable-hwdef.h b/arch/arm/include/asm/pgtable-hwdef.h
index fd1521d..1831111 100644
--- a/arch/arm/include/asm/pgtable-hwdef.h
+++ b/arch/arm/include/asm/pgtable-hwdef.h
@@ -10,81 +10,6 @@
 #ifndef _ASMARM_PGTABLE_HWDEF_H
 #define _ASMARM_PGTABLE_HWDEF_H
 
-/*
- * Hardware page table definitions.
- *
- * + Level 1 descriptor (PMD)
- *   - common
- */
-#define PMD_TYPE_MASK		(3 << 0)
-#define PMD_TYPE_FAULT		(0 << 0)
-#define PMD_TYPE_TABLE		(1 << 0)
-#define PMD_TYPE_SECT		(2 << 0)
-#define PMD_BIT4		(1 << 4)
-#define PMD_DOMAIN(x)		((x) << 5)
-#define PMD_PROTECTION		(1 << 9)	/* v5 */
-/*
- *   - section
- */
-#define PMD_SECT_BUFFERABLE	(1 << 2)
-#define PMD_SECT_CACHEABLE	(1 << 3)
-#define PMD_SECT_XN		(1 << 4)	/* v6 */
-#define PMD_SECT_AP_WRITE	(1 << 10)
-#define PMD_SECT_AP_READ	(1 << 11)
-#define PMD_SECT_TEX(x)		((x) << 12)	/* v5 */
-#define PMD_SECT_APX		(1 << 15)	/* v6 */
-#define PMD_SECT_S		(1 << 16)	/* v6 */
-#define PMD_SECT_nG		(1 << 17)	/* v6 */
-#define PMD_SECT_SUPER		(1 << 18)	/* v6 */
-
-#define PMD_SECT_UNCACHED	(0)
-#define PMD_SECT_BUFFERED	(PMD_SECT_BUFFERABLE)
-#define PMD_SECT_WT		(PMD_SECT_CACHEABLE)
-#define PMD_SECT_WB		(PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
-#define PMD_SECT_MINICACHE	(PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE)
-#define PMD_SECT_WBWA		(PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
-#define PMD_SECT_NONSHARED_DEV	(PMD_SECT_TEX(2))
-
-/*
- *   - coarse table (not used)
- */
-
-/*
- * + Level 2 descriptor (PTE)
- *   - common
- */
-#define PTE_TYPE_MASK		(3 << 0)
-#define PTE_TYPE_FAULT		(0 << 0)
-#define PTE_TYPE_LARGE		(1 << 0)
-#define PTE_TYPE_SMALL		(2 << 0)
-#define PTE_TYPE_EXT		(3 << 0)	/* v5 */
-#define PTE_BUFFERABLE		(1 << 2)
-#define PTE_CACHEABLE		(1 << 3)
-
-/*
- *   - extended small page/tiny page
- */
-#define PTE_EXT_XN		(1 << 0)	/* v6 */
-#define PTE_EXT_AP_MASK		(3 << 4)
-#define PTE_EXT_AP0		(1 << 4)
-#define PTE_EXT_AP1		(2 << 4)
-#define PTE_EXT_AP_UNO_SRO	(0 << 4)
-#define PTE_EXT_AP_UNO_SRW	(PTE_EXT_AP0)
-#define PTE_EXT_AP_URO_SRW	(PTE_EXT_AP1)
-#define PTE_EXT_AP_URW_SRW	(PTE_EXT_AP1|PTE_EXT_AP0)
-#define PTE_EXT_TEX(x)		((x) << 6)	/* v5 */
-#define PTE_EXT_APX		(1 << 9)	/* v6 */
-#define PTE_EXT_COHERENT	(1 << 9)	/* XScale3 */
-#define PTE_EXT_SHARED		(1 << 10)	/* v6 */
-#define PTE_EXT_NG		(1 << 11)	/* v6 */
-
-/*
- *   - small page
- */
-#define PTE_SMALL_AP_MASK	(0xff << 4)
-#define PTE_SMALL_AP_UNO_SRO	(0x00 << 4)
-#define PTE_SMALL_AP_UNO_SRW	(0x55 << 4)
-#define PTE_SMALL_AP_URO_SRW	(0xaa << 4)
-#define PTE_SMALL_AP_URW_SRW	(0xff << 4)
+#include <asm/pgtable-2level-hwdef.h>
 
 #endif
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index b155414..17e7ba6 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -23,6 +23,8 @@
 #include <mach/vmalloc.h>
 #include <asm/pgtable-hwdef.h>
 
+#include <asm/pgtable-2level.h>
+
 /*
  * Just any arbitrary offset to the start of the vmalloc VM area: the
  * current 8MB value just means that there will be a 8MB "hole" after the
@@ -40,75 +42,6 @@
 #define VMALLOC_START		(((unsigned long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1))
 #endif
 
-/*
- * Hardware-wise, we have a two level page table structure, where the first
- * level has 4096 entries, and the second level has 256 entries.  Each entry
- * is one 32-bit word.  Most of the bits in the second level entry are used
- * by hardware, and there aren't any "accessed" and "dirty" bits.
- *
- * Linux on the other hand has a three level page table structure, which can
- * be wrapped to fit a two level page table structure easily - using the PGD
- * and PTE only.  However, Linux also expects one "PTE" table per page, and
- * at least a "dirty" bit.
- *
- * Therefore, we tweak the implementation slightly - we tell Linux that we
- * have 2048 entries in the first level, each of which is 8 bytes (iow, two
- * hardware pointers to the second level.)  The second level contains two
- * hardware PTE tables arranged contiguously, followed by Linux versions
- * which contain the state information Linux needs.  We, therefore, end up
- * with 512 entries in the "PTE" level.
- *
- * This leads to the page tables having the following layout:
- *
- *    pgd             pte
- * |        |
- * +--------+ +0
- * |        |-----> +------------+ +0
- * +- - - - + +4    |  h/w pt 0  |
- * |        |-----> +------------+ +1024
- * +--------+ +8    |  h/w pt 1  |
- * |        |       +------------+ +2048
- * +- - - - +       | Linux pt 0 |
- * |        |       +------------+ +3072
- * +--------+       | Linux pt 1 |
- * |        |       +------------+ +4096
- *
- * See L_PTE_xxx below for definitions of bits in the "Linux pt", and
- * PTE_xxx for definitions of bits appearing in the "h/w pt".
- *
- * PMD_xxx definitions refer to bits in the first level page table.
- *
- * The "dirty" bit is emulated by only granting hardware write permission
- * iff the page is marked "writable" and "dirty" in the Linux PTE.  This
- * means that a write to a clean page will cause a permission fault, and
- * the Linux MM layer will mark the page dirty via handle_pte_fault().
- * For the hardware to notice the permission change, the TLB entry must
- * be flushed, and ptep_set_access_flags() does that for us.
- *
- * The "accessed" or "young" bit is emulated by a similar method; we only
- * allow accesses to the page if the "young" bit is set.  Accesses to the
- * page will cause a fault, and handle_pte_fault() will set the young bit
- * for us as long as the page is marked present in the corresponding Linux
- * PTE entry.  Again, ptep_set_access_flags() will ensure that the TLB is
- * up to date.
- *
- * However, when the "young" bit is cleared, we deny access to the page
- * by clearing the hardware PTE.  Currently Linux does not flush the TLB
- * for us in this case, which means the TLB will retain the transation
- * until either the TLB entry is evicted under pressure, or a context
- * switch which changes the user space mapping occurs.
- */
-#define PTRS_PER_PTE		512
-#define PTRS_PER_PMD		1
-#define PTRS_PER_PGD		2048
-
-/*
- * PMD_SHIFT determines the size of the area a second-level page table can map
- * PGDIR_SHIFT determines what a third-level page table entry can map
- */
-#define PMD_SHIFT		21
-#define PGDIR_SHIFT		21
-
 #define LIBRARY_TEXT_START	0x0c000000
 
 #ifndef __ASSEMBLY__
@@ -119,74 +52,6 @@ extern void __pgd_error(const char *file, int line, unsigned long val);
 #define pte_ERROR(pte)		__pte_error(__FILE__, __LINE__, pte_val(pte))
 #define pmd_ERROR(pmd)		__pmd_error(__FILE__, __LINE__, pmd_val(pmd))
 #define pgd_ERROR(pgd)		__pgd_error(__FILE__, __LINE__, pgd_val(pgd))
-#endif /* !__ASSEMBLY__ */
-
-#define PMD_SIZE		(1UL << PMD_SHIFT)
-#define PMD_MASK		(~(PMD_SIZE-1))
-#define PGDIR_SIZE		(1UL << PGDIR_SHIFT)
-#define PGDIR_MASK		(~(PGDIR_SIZE-1))
-
-/*
- * This is the lowest virtual address we can permit any user space
- * mapping to be mapped at.  This is particularly important for
- * non-high vector CPUs.
- */
-#define FIRST_USER_ADDRESS	PAGE_SIZE
-
-#define FIRST_USER_PGD_NR	1
-#define USER_PTRS_PER_PGD	((TASK_SIZE/PGDIR_SIZE) - FIRST_USER_PGD_NR)
-
-/*
- * section address mask and size definitions.
- */
-#define SECTION_SHIFT		20
-#define SECTION_SIZE		(1UL << SECTION_SHIFT)
-#define SECTION_MASK		(~(SECTION_SIZE-1))
-
-/*
- * ARMv6 supersection address mask and size definitions.
- */
-#define SUPERSECTION_SHIFT	24
-#define SUPERSECTION_SIZE	(1UL << SUPERSECTION_SHIFT)
-#define SUPERSECTION_MASK	(~(SUPERSECTION_SIZE-1))
-
-/*
- * "Linux" PTE definitions.
- *
- * We keep two sets of PTEs - the hardware and the linux version.
- * This allows greater flexibility in the way we map the Linux bits
- * onto the hardware tables, and allows us to have YOUNG and DIRTY
- * bits.
- *
- * The PTE table pointer refers to the hardware entries; the "Linux"
- * entries are stored 1024 bytes below.
- */
-#define L_PTE_PRESENT		(1 << 0)
-#define L_PTE_YOUNG		(1 << 1)
-#define L_PTE_FILE		(1 << 2)	/* only when !PRESENT */
-#define L_PTE_DIRTY		(1 << 6)
-#define L_PTE_WRITE		(1 << 7)
-#define L_PTE_USER		(1 << 8)
-#define L_PTE_EXEC		(1 << 9)
-#define L_PTE_SHARED		(1 << 10)	/* shared(v6), coherent(xsc3) */
-
-/*
- * These are the memory types, defined to be compatible with
- * pre-ARMv6 CPUs cacheable and bufferable bits:   XXCB
- */
-#define L_PTE_MT_UNCACHED	(0x00 << 2)	/* 0000 */
-#define L_PTE_MT_BUFFERABLE	(0x01 << 2)	/* 0001 */
-#define L_PTE_MT_WRITETHROUGH	(0x02 << 2)	/* 0010 */
-#define L_PTE_MT_WRITEBACK	(0x03 << 2)	/* 0011 */
-#define L_PTE_MT_MINICACHE	(0x06 << 2)	/* 0110 (sa1100, xscale) */
-#define L_PTE_MT_WRITEALLOC	(0x07 << 2)	/* 0111 */
-#define L_PTE_MT_DEV_SHARED	(0x04 << 2)	/* 0100 */
-#define L_PTE_MT_DEV_NONSHARED	(0x0c << 2)	/* 1100 */
-#define L_PTE_MT_DEV_WC		(0x09 << 2)	/* 1001 */
-#define L_PTE_MT_DEV_CACHED	(0x0b << 2)	/* 1011 */
-#define L_PTE_MT_MASK		(0x0f << 2)
-
-#ifndef __ASSEMBLY__
 
 /*
  * The pgprot_* and protection_map entries will be fixed up in runtime


* [PATCH v2 02/20] ARM: LPAE: Factor out 2-level page table definitions into separate files
@ 2010-11-12 18:00   ` Catalin Marinas
  0 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel

This patch moves page table definitions from asm/page.h, asm/pgtable.h
and asm/pgtable-hwdef.h into corresponding *-2level* files.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/page.h                 |   40 +-------
 arch/arm/include/asm/pgtable-2level-hwdef.h |   91 +++++++++++++++++
 arch/arm/include/asm/pgtable-2level-types.h |   64 ++++++++++++
 arch/arm/include/asm/pgtable-2level.h       |  147 +++++++++++++++++++++++++++
 arch/arm/include/asm/pgtable-hwdef.h        |   77 +--------------
 arch/arm/include/asm/pgtable.h              |  139 +-------------------------
 6 files changed, 306 insertions(+), 252 deletions(-)
 create mode 100644 arch/arm/include/asm/pgtable-2level-hwdef.h
 create mode 100644 arch/arm/include/asm/pgtable-2level-types.h
 create mode 100644 arch/arm/include/asm/pgtable-2level.h

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index a485ac3..3848105 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -151,45 +151,7 @@ extern void __cpu_copy_user_highpage(struct page *to, struct page *from,
 #define clear_page(page)	memset((void *)(page), 0, PAGE_SIZE)
 extern void copy_page(void *to, const void *from);
 
-#undef STRICT_MM_TYPECHECKS
-
-#ifdef STRICT_MM_TYPECHECKS
-/*
- * These are used to make use of C type-checking..
- */
-typedef struct { unsigned long pte; } pte_t;
-typedef struct { unsigned long pmd; } pmd_t;
-typedef struct { unsigned long pgd[2]; } pgd_t;
-typedef struct { unsigned long pgprot; } pgprot_t;
-
-#define pte_val(x)      ((x).pte)
-#define pmd_val(x)      ((x).pmd)
-#define pgd_val(x)	((x).pgd[0])
-#define pgprot_val(x)   ((x).pgprot)
-
-#define __pte(x)        ((pte_t) { (x) } )
-#define __pmd(x)        ((pmd_t) { (x) } )
-#define __pgprot(x)     ((pgprot_t) { (x) } )
-
-#else
-/*
- * .. while these make it easier on the compiler
- */
-typedef unsigned long pte_t;
-typedef unsigned long pmd_t;
-typedef unsigned long pgd_t[2];
-typedef unsigned long pgprot_t;
-
-#define pte_val(x)      (x)
-#define pmd_val(x)      (x)
-#define pgd_val(x)	((x)[0])
-#define pgprot_val(x)   (x)
-
-#define __pte(x)        (x)
-#define __pmd(x)        (x)
-#define __pgprot(x)     (x)
-
-#endif /* STRICT_MM_TYPECHECKS */
+#include <asm/pgtable-2level-types.h>
 
 #endif /* CONFIG_MMU */
 
diff --git a/arch/arm/include/asm/pgtable-2level-hwdef.h b/arch/arm/include/asm/pgtable-2level-hwdef.h
new file mode 100644
index 0000000..436529c
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-2level-hwdef.h
@@ -0,0 +1,91 @@
+/*
+ *  arch/arm/include/asm/pgtable-2level-hwdef.h
+ *
+ *  Copyright (C) 1995-2002 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_PGTABLE_2LEVEL_HWDEF_H
+#define _ASM_PGTABLE_2LEVEL_HWDEF_H
+
+/*
+ * Hardware page table definitions.
+ *
+ * + Level 1 descriptor (PMD)
+ *   - common
+ */
+#define PMD_TYPE_MASK		(3 << 0)
+#define PMD_TYPE_FAULT		(0 << 0)
+#define PMD_TYPE_TABLE		(1 << 0)
+#define PMD_TYPE_SECT		(2 << 0)
+#define PMD_BIT4		(1 << 4)
+#define PMD_DOMAIN(x)		((x) << 5)
+#define PMD_PROTECTION		(1 << 9)	/* v5 */
+/*
+ *   - section
+ */
+#define PMD_SECT_BUFFERABLE	(1 << 2)
+#define PMD_SECT_CACHEABLE	(1 << 3)
+#define PMD_SECT_XN		(1 << 4)	/* v6 */
+#define PMD_SECT_AP_WRITE	(1 << 10)
+#define PMD_SECT_AP_READ	(1 << 11)
+#define PMD_SECT_TEX(x)		((x) << 12)	/* v5 */
+#define PMD_SECT_APX		(1 << 15)	/* v6 */
+#define PMD_SECT_S		(1 << 16)	/* v6 */
+#define PMD_SECT_nG		(1 << 17)	/* v6 */
+#define PMD_SECT_SUPER		(1 << 18)	/* v6 */
+#define PMD_SECT_AF		(0)
+
+#define PMD_SECT_UNCACHED	(0)
+#define PMD_SECT_BUFFERED	(PMD_SECT_BUFFERABLE)
+#define PMD_SECT_WT		(PMD_SECT_CACHEABLE)
+#define PMD_SECT_WB		(PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
+#define PMD_SECT_MINICACHE	(PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE)
+#define PMD_SECT_WBWA		(PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
+#define PMD_SECT_NONSHARED_DEV	(PMD_SECT_TEX(2))
+
+/*
+ *   - coarse table (not used)
+ */
+
+/*
+ * + Level 2 descriptor (PTE)
+ *   - common
+ */
+#define PTE_TYPE_MASK		(3 << 0)
+#define PTE_TYPE_FAULT		(0 << 0)
+#define PTE_TYPE_LARGE		(1 << 0)
+#define PTE_TYPE_SMALL		(2 << 0)
+#define PTE_TYPE_EXT		(3 << 0)	/* v5 */
+#define PTE_BUFFERABLE		(1 << 2)
+#define PTE_CACHEABLE		(1 << 3)
+
+/*
+ *   - extended small page/tiny page
+ */
+#define PTE_EXT_XN		(1 << 0)	/* v6 */
+#define PTE_EXT_AP_MASK		(3 << 4)
+#define PTE_EXT_AP0		(1 << 4)
+#define PTE_EXT_AP1		(2 << 4)
+#define PTE_EXT_AP_UNO_SRO	(0 << 4)
+#define PTE_EXT_AP_UNO_SRW	(PTE_EXT_AP0)
+#define PTE_EXT_AP_URO_SRW	(PTE_EXT_AP1)
+#define PTE_EXT_AP_URW_SRW	(PTE_EXT_AP1|PTE_EXT_AP0)
+#define PTE_EXT_TEX(x)		((x) << 6)	/* v5 */
+#define PTE_EXT_APX		(1 << 9)	/* v6 */
+#define PTE_EXT_COHERENT	(1 << 9)	/* XScale3 */
+#define PTE_EXT_SHARED		(1 << 10)	/* v6 */
+#define PTE_EXT_NG		(1 << 11)	/* v6 */
+
+/*
+ *   - small page
+ */
+#define PTE_SMALL_AP_MASK	(0xff << 4)
+#define PTE_SMALL_AP_UNO_SRO	(0x00 << 4)
+#define PTE_SMALL_AP_UNO_SRW	(0x55 << 4)
+#define PTE_SMALL_AP_URO_SRW	(0xaa << 4)
+#define PTE_SMALL_AP_URW_SRW	(0xff << 4)
+
+#endif
diff --git a/arch/arm/include/asm/pgtable-2level-types.h b/arch/arm/include/asm/pgtable-2level-types.h
new file mode 100644
index 0000000..30f6741
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-2level-types.h
@@ -0,0 +1,64 @@
+/*
+ * arch/arm/include/asm/pgtable-2level-types.h
+ *
+ * Copyright (C) 1995-2003 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+#ifndef _ASM_PGTABLE_2LEVEL_TYPES_H
+#define _ASM_PGTABLE_2LEVEL_TYPES_H
+
+#undef STRICT_MM_TYPECHECKS
+
+typedef unsigned long pteval_t;
+
+#ifdef STRICT_MM_TYPECHECKS
+/*
+ * These are used to make use of C type-checking..
+ */
+typedef struct { unsigned long pte; } pte_t;
+typedef struct { unsigned long pmd; } pmd_t;
+typedef struct { unsigned long pgd[2]; } pgd_t;
+typedef struct { unsigned long pgprot; } pgprot_t;
+
+#define pte_val(x)      ((x).pte)
+#define pmd_val(x)      ((x).pmd)
+#define pgd_val(x)	((x).pgd[0])
+#define pgprot_val(x)   ((x).pgprot)
+
+#define __pte(x)        ((pte_t) { (x) } )
+#define __pmd(x)        ((pmd_t) { (x) } )
+#define __pgprot(x)     ((pgprot_t) { (x) } )
+
+#else
+/*
+ * .. while these make it easier on the compiler
+ */
+typedef unsigned long pte_t;
+typedef unsigned long pmd_t;
+typedef unsigned long pgd_t[2];
+typedef unsigned long pgprot_t;
+
+#define pte_val(x)      (x)
+#define pmd_val(x)      (x)
+#define pgd_val(x)	((x)[0])
+#define pgprot_val(x)   (x)
+
+#define __pte(x)        (x)
+#define __pmd(x)        (x)
+#define __pgprot(x)     (x)
+
+#endif /* STRICT_MM_TYPECHECKS */
+
+#endif	/* _ASM_PGTABLE_2LEVEL_TYPES_H */
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
new file mode 100644
index 0000000..d60bda9
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -0,0 +1,147 @@
+/*
+ *  arch/arm/include/asm/pgtable-2level.h
+ *
+ *  Copyright (C) 1995-2002 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_PGTABLE_2LEVEL_H
+#define _ASM_PGTABLE_2LEVEL_H
+
+/*
+ * Hardware-wise, we have a two level page table structure, where the first
+ * level has 4096 entries, and the second level has 256 entries.  Each entry
+ * is one 32-bit word.  Most of the bits in the second level entry are used
+ * by hardware, and there aren't any "accessed" and "dirty" bits.
+ *
+ * Linux on the other hand has a three level page table structure, which can
+ * be wrapped to fit a two level page table structure easily - using the PGD
+ * and PTE only.  However, Linux also expects one "PTE" table per page, and
+ * at least a "dirty" bit.
+ *
+ * Therefore, we tweak the implementation slightly - we tell Linux that we
+ * have 2048 entries in the first level, each of which is 8 bytes (iow, two
+ * hardware pointers to the second level.)  The second level contains two
+ * hardware PTE tables arranged contiguously, followed by Linux versions
+ * which contain the state information Linux needs.  We, therefore, end up
+ * with 512 entries in the "PTE" level.
+ *
+ * This leads to the page tables having the following layout:
+ *
+ *    pgd             pte
+ * |        |
+ * +--------+ +0
+ * |        |-----> +------------+ +0
+ * +- - - - + +4    |  h/w pt 0  |
+ * |        |-----> +------------+ +1024
+ * +--------+ +8    |  h/w pt 1  |
+ * |        |       +------------+ +2048
+ * +- - - - +       | Linux pt 0 |
+ * |        |       +------------+ +3072
+ * +--------+       | Linux pt 1 |
+ * |        |       +------------+ +4096
+ *
+ * See L_PTE_xxx below for definitions of bits in the "Linux pt", and
+ * PTE_xxx for definitions of bits appearing in the "h/w pt".
+ *
+ * PMD_xxx definitions refer to bits in the first level page table.
+ *
+ * The "dirty" bit is emulated by only granting hardware write permission
+ * iff the page is marked "writable" and "dirty" in the Linux PTE.  This
+ * means that a write to a clean page will cause a permission fault, and
+ * the Linux MM layer will mark the page dirty via handle_pte_fault().
+ * For the hardware to notice the permission change, the TLB entry must
+ * be flushed, and ptep_set_access_flags() does that for us.
+ *
+ * The "accessed" or "young" bit is emulated by a similar method; we only
+ * allow accesses to the page if the "young" bit is set.  Accesses to the
+ * page will cause a fault, and handle_pte_fault() will set the young bit
+ * for us as long as the page is marked present in the corresponding Linux
+ * PTE entry.  Again, ptep_set_access_flags() will ensure that the TLB is
+ * up to date.
+ *
+ * However, when the "young" bit is cleared, we deny access to the page
+ * by clearing the hardware PTE.  Currently Linux does not flush the TLB
+ * for us in this case, which means the TLB will retain the translation
+ * until either the TLB entry is evicted under pressure, or a context
+ * switch which changes the user space mapping occurs.
+ */
+#define PTRS_PER_PTE		512
+#define PTRS_PER_PMD		1
+#define PTRS_PER_PGD		2048
+
+/*
+ * PMD_SHIFT determines the size of the area a second-level page table can map
+ * PGDIR_SHIFT determines what a third-level page table entry can map
+ */
+#define PMD_SHIFT		21
+#define PGDIR_SHIFT		21
+
+#define PMD_SIZE		(1UL << PMD_SHIFT)
+#define PMD_MASK		(~(PMD_SIZE-1))
+#define PGDIR_SIZE		(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK		(~(PGDIR_SIZE-1))
+
+/*
+ * This is the lowest virtual address we can permit any user space
+ * mapping to be mapped at.  This is particularly important for
+ * non-high vector CPUs.
+ */
+#define FIRST_USER_ADDRESS	PAGE_SIZE
+
+#define FIRST_USER_PGD_NR	1
+#define USER_PTRS_PER_PGD	((TASK_SIZE/PGDIR_SIZE) - FIRST_USER_PGD_NR)
+
+/*
+ * section address mask and size definitions.
+ */
+#define SECTION_SHIFT		20
+#define SECTION_SIZE		(1UL << SECTION_SHIFT)
+#define SECTION_MASK		(~(SECTION_SIZE-1))
+
+/*
+ * ARMv6 supersection address mask and size definitions.
+ */
+#define SUPERSECTION_SHIFT	24
+#define SUPERSECTION_SIZE	(1UL << SUPERSECTION_SHIFT)
+#define SUPERSECTION_MASK	(~(SUPERSECTION_SIZE-1))
+
+/*
+ * "Linux" PTE definitions.
+ *
+ * We keep two sets of PTEs - the hardware and the linux version.
+ * This allows greater flexibility in the way we map the Linux bits
+ * onto the hardware tables, and allows us to have YOUNG and DIRTY
+ * bits.
+ *
+ * The PTE table pointer refers to the hardware entries; the "Linux"
+ * entries are stored 1024 bytes below.
+ */
+#define L_PTE_PRESENT		(1 << 0)
+#define L_PTE_YOUNG		(1 << 1)
+#define L_PTE_FILE		(1 << 2)	/* only when !PRESENT */
+#define L_PTE_DIRTY		(1 << 6)
+#define L_PTE_WRITE		(1 << 7)
+#define L_PTE_USER		(1 << 8)
+#define L_PTE_EXEC		(1 << 9)
+#define L_PTE_SHARED		(1 << 10)	/* shared(v6), coherent(xsc3) */
+
+/*
+ * These are the memory types, defined to be compatible with
+ * pre-ARMv6 CPUs cacheable and bufferable bits:   XXCB
+ */
+#define L_PTE_MT_UNCACHED	(0x00 << 2)	/* 0000 */
+#define L_PTE_MT_BUFFERABLE	(0x01 << 2)	/* 0001 */
+#define L_PTE_MT_WRITETHROUGH	(0x02 << 2)	/* 0010 */
+#define L_PTE_MT_WRITEBACK	(0x03 << 2)	/* 0011 */
+#define L_PTE_MT_MINICACHE	(0x06 << 2)	/* 0110 (sa1100, xscale) */
+#define L_PTE_MT_WRITEALLOC	(0x07 << 2)	/* 0111 */
+#define L_PTE_MT_DEV_SHARED	(0x04 << 2)	/* 0100 */
+#define L_PTE_MT_DEV_NONSHARED	(0x0c << 2)	/* 1100 */
+#define L_PTE_MT_DEV_WC		(0x09 << 2)	/* 1001 */
+#define L_PTE_MT_DEV_CACHED	(0x0b << 2)	/* 1011 */
+#define L_PTE_MT_MASK		(0x0f << 2)
+
+#endif /* _ASM_PGTABLE_2LEVEL_H */
diff --git a/arch/arm/include/asm/pgtable-hwdef.h b/arch/arm/include/asm/pgtable-hwdef.h
index fd1521d..1831111 100644
--- a/arch/arm/include/asm/pgtable-hwdef.h
+++ b/arch/arm/include/asm/pgtable-hwdef.h
@@ -10,81 +10,6 @@
 #ifndef _ASMARM_PGTABLE_HWDEF_H
 #define _ASMARM_PGTABLE_HWDEF_H
 
-/*
- * Hardware page table definitions.
- *
- * + Level 1 descriptor (PMD)
- *   - common
- */
-#define PMD_TYPE_MASK		(3 << 0)
-#define PMD_TYPE_FAULT		(0 << 0)
-#define PMD_TYPE_TABLE		(1 << 0)
-#define PMD_TYPE_SECT		(2 << 0)
-#define PMD_BIT4		(1 << 4)
-#define PMD_DOMAIN(x)		((x) << 5)
-#define PMD_PROTECTION		(1 << 9)	/* v5 */
-/*
- *   - section
- */
-#define PMD_SECT_BUFFERABLE	(1 << 2)
-#define PMD_SECT_CACHEABLE	(1 << 3)
-#define PMD_SECT_XN		(1 << 4)	/* v6 */
-#define PMD_SECT_AP_WRITE	(1 << 10)
-#define PMD_SECT_AP_READ	(1 << 11)
-#define PMD_SECT_TEX(x)		((x) << 12)	/* v5 */
-#define PMD_SECT_APX		(1 << 15)	/* v6 */
-#define PMD_SECT_S		(1 << 16)	/* v6 */
-#define PMD_SECT_nG		(1 << 17)	/* v6 */
-#define PMD_SECT_SUPER		(1 << 18)	/* v6 */
-
-#define PMD_SECT_UNCACHED	(0)
-#define PMD_SECT_BUFFERED	(PMD_SECT_BUFFERABLE)
-#define PMD_SECT_WT		(PMD_SECT_CACHEABLE)
-#define PMD_SECT_WB		(PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
-#define PMD_SECT_MINICACHE	(PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE)
-#define PMD_SECT_WBWA		(PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
-#define PMD_SECT_NONSHARED_DEV	(PMD_SECT_TEX(2))
-
-/*
- *   - coarse table (not used)
- */
-
-/*
- * + Level 2 descriptor (PTE)
- *   - common
- */
-#define PTE_TYPE_MASK		(3 << 0)
-#define PTE_TYPE_FAULT		(0 << 0)
-#define PTE_TYPE_LARGE		(1 << 0)
-#define PTE_TYPE_SMALL		(2 << 0)
-#define PTE_TYPE_EXT		(3 << 0)	/* v5 */
-#define PTE_BUFFERABLE		(1 << 2)
-#define PTE_CACHEABLE		(1 << 3)
-
-/*
- *   - extended small page/tiny page
- */
-#define PTE_EXT_XN		(1 << 0)	/* v6 */
-#define PTE_EXT_AP_MASK		(3 << 4)
-#define PTE_EXT_AP0		(1 << 4)
-#define PTE_EXT_AP1		(2 << 4)
-#define PTE_EXT_AP_UNO_SRO	(0 << 4)
-#define PTE_EXT_AP_UNO_SRW	(PTE_EXT_AP0)
-#define PTE_EXT_AP_URO_SRW	(PTE_EXT_AP1)
-#define PTE_EXT_AP_URW_SRW	(PTE_EXT_AP1|PTE_EXT_AP0)
-#define PTE_EXT_TEX(x)		((x) << 6)	/* v5 */
-#define PTE_EXT_APX		(1 << 9)	/* v6 */
-#define PTE_EXT_COHERENT	(1 << 9)	/* XScale3 */
-#define PTE_EXT_SHARED		(1 << 10)	/* v6 */
-#define PTE_EXT_NG		(1 << 11)	/* v6 */
-
-/*
- *   - small page
- */
-#define PTE_SMALL_AP_MASK	(0xff << 4)
-#define PTE_SMALL_AP_UNO_SRO	(0x00 << 4)
-#define PTE_SMALL_AP_UNO_SRW	(0x55 << 4)
-#define PTE_SMALL_AP_URO_SRW	(0xaa << 4)
-#define PTE_SMALL_AP_URW_SRW	(0xff << 4)
+#include <asm/pgtable-2level-hwdef.h>
 
 #endif
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index b155414..17e7ba6 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -23,6 +23,8 @@
 #include <mach/vmalloc.h>
 #include <asm/pgtable-hwdef.h>
 
+#include <asm/pgtable-2level.h>
+
 /*
  * Just any arbitrary offset to the start of the vmalloc VM area: the
  * current 8MB value just means that there will be a 8MB "hole" after the
@@ -40,75 +42,6 @@
 #define VMALLOC_START		(((unsigned long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1))
 #endif
 
-/*
- * Hardware-wise, we have a two level page table structure, where the first
- * level has 4096 entries, and the second level has 256 entries.  Each entry
- * is one 32-bit word.  Most of the bits in the second level entry are used
- * by hardware, and there aren't any "accessed" and "dirty" bits.
- *
- * Linux on the other hand has a three level page table structure, which can
- * be wrapped to fit a two level page table structure easily - using the PGD
- * and PTE only.  However, Linux also expects one "PTE" table per page, and
- * at least a "dirty" bit.
- *
- * Therefore, we tweak the implementation slightly - we tell Linux that we
- * have 2048 entries in the first level, each of which is 8 bytes (iow, two
- * hardware pointers to the second level.)  The second level contains two
- * hardware PTE tables arranged contiguously, followed by Linux versions
- * which contain the state information Linux needs.  We, therefore, end up
- * with 512 entries in the "PTE" level.
- *
- * This leads to the page tables having the following layout:
- *
- *    pgd             pte
- * |        |
- * +--------+ +0
- * |        |-----> +------------+ +0
- * +- - - - + +4    |  h/w pt 0  |
- * |        |-----> +------------+ +1024
- * +--------+ +8    |  h/w pt 1  |
- * |        |       +------------+ +2048
- * +- - - - +       | Linux pt 0 |
- * |        |       +------------+ +3072
- * +--------+       | Linux pt 1 |
- * |        |       +------------+ +4096
- *
- * See L_PTE_xxx below for definitions of bits in the "Linux pt", and
- * PTE_xxx for definitions of bits appearing in the "h/w pt".
- *
- * PMD_xxx definitions refer to bits in the first level page table.
- *
- * The "dirty" bit is emulated by only granting hardware write permission
- * iff the page is marked "writable" and "dirty" in the Linux PTE.  This
- * means that a write to a clean page will cause a permission fault, and
- * the Linux MM layer will mark the page dirty via handle_pte_fault().
- * For the hardware to notice the permission change, the TLB entry must
- * be flushed, and ptep_set_access_flags() does that for us.
- *
- * The "accessed" or "young" bit is emulated by a similar method; we only
- * allow accesses to the page if the "young" bit is set.  Accesses to the
- * page will cause a fault, and handle_pte_fault() will set the young bit
- * for us as long as the page is marked present in the corresponding Linux
- * PTE entry.  Again, ptep_set_access_flags() will ensure that the TLB is
- * up to date.
- *
- * However, when the "young" bit is cleared, we deny access to the page
- * by clearing the hardware PTE.  Currently Linux does not flush the TLB
- * for us in this case, which means the TLB will retain the transation
- * until either the TLB entry is evicted under pressure, or a context
- * switch which changes the user space mapping occurs.
- */
-#define PTRS_PER_PTE		512
-#define PTRS_PER_PMD		1
-#define PTRS_PER_PGD		2048
-
-/*
- * PMD_SHIFT determines the size of the area a second-level page table can map
- * PGDIR_SHIFT determines what a third-level page table entry can map
- */
-#define PMD_SHIFT		21
-#define PGDIR_SHIFT		21
-
 #define LIBRARY_TEXT_START	0x0c000000
 
 #ifndef __ASSEMBLY__
@@ -119,74 +52,6 @@ extern void __pgd_error(const char *file, int line, unsigned long val);
 #define pte_ERROR(pte)		__pte_error(__FILE__, __LINE__, pte_val(pte))
 #define pmd_ERROR(pmd)		__pmd_error(__FILE__, __LINE__, pmd_val(pmd))
 #define pgd_ERROR(pgd)		__pgd_error(__FILE__, __LINE__, pgd_val(pgd))
-#endif /* !__ASSEMBLY__ */
-
-#define PMD_SIZE		(1UL << PMD_SHIFT)
-#define PMD_MASK		(~(PMD_SIZE-1))
-#define PGDIR_SIZE		(1UL << PGDIR_SHIFT)
-#define PGDIR_MASK		(~(PGDIR_SIZE-1))
-
-/*
- * This is the lowest virtual address we can permit any user space
- * mapping to be mapped at.  This is particularly important for
- * non-high vector CPUs.
- */
-#define FIRST_USER_ADDRESS	PAGE_SIZE
-
-#define FIRST_USER_PGD_NR	1
-#define USER_PTRS_PER_PGD	((TASK_SIZE/PGDIR_SIZE) - FIRST_USER_PGD_NR)
-
-/*
- * section address mask and size definitions.
- */
-#define SECTION_SHIFT		20
-#define SECTION_SIZE		(1UL << SECTION_SHIFT)
-#define SECTION_MASK		(~(SECTION_SIZE-1))
-
-/*
- * ARMv6 supersection address mask and size definitions.
- */
-#define SUPERSECTION_SHIFT	24
-#define SUPERSECTION_SIZE	(1UL << SUPERSECTION_SHIFT)
-#define SUPERSECTION_MASK	(~(SUPERSECTION_SIZE-1))
-
-/*
- * "Linux" PTE definitions.
- *
- * We keep two sets of PTEs - the hardware and the linux version.
- * This allows greater flexibility in the way we map the Linux bits
- * onto the hardware tables, and allows us to have YOUNG and DIRTY
- * bits.
- *
- * The PTE table pointer refers to the hardware entries; the "Linux"
- * entries are stored 1024 bytes below.
- */
-#define L_PTE_PRESENT		(1 << 0)
-#define L_PTE_YOUNG		(1 << 1)
-#define L_PTE_FILE		(1 << 2)	/* only when !PRESENT */
-#define L_PTE_DIRTY		(1 << 6)
-#define L_PTE_WRITE		(1 << 7)
-#define L_PTE_USER		(1 << 8)
-#define L_PTE_EXEC		(1 << 9)
-#define L_PTE_SHARED		(1 << 10)	/* shared(v6), coherent(xsc3) */
-
-/*
- * These are the memory types, defined to be compatible with
- * pre-ARMv6 CPUs cacheable and bufferable bits:   XXCB
- */
-#define L_PTE_MT_UNCACHED	(0x00 << 2)	/* 0000 */
-#define L_PTE_MT_BUFFERABLE	(0x01 << 2)	/* 0001 */
-#define L_PTE_MT_WRITETHROUGH	(0x02 << 2)	/* 0010 */
-#define L_PTE_MT_WRITEBACK	(0x03 << 2)	/* 0011 */
-#define L_PTE_MT_MINICACHE	(0x06 << 2)	/* 0110 (sa1100, xscale) */
-#define L_PTE_MT_WRITEALLOC	(0x07 << 2)	/* 0111 */
-#define L_PTE_MT_DEV_SHARED	(0x04 << 2)	/* 0100 */
-#define L_PTE_MT_DEV_NONSHARED	(0x0c << 2)	/* 1100 */
-#define L_PTE_MT_DEV_WC		(0x09 << 2)	/* 1001 */
-#define L_PTE_MT_DEV_CACHED	(0x0b << 2)	/* 1011 */
-#define L_PTE_MT_MASK		(0x0f << 2)
-
-#ifndef __ASSEMBLY__
 
 /*
  * The pgprot_* and protection_map entries will be fixed up in runtime

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel; +Cc: Will Deacon

From: Will Deacon <will.deacon@arm.com>

When using 2-level paging, pte_t and pmd_t are typedefs for
unsigned long but phys_addr_t is a typedef for u32.

This patch uses u32 for the page table entry types when
phys_addr_t is not 64-bit, allowing the same conversion
specifier to be used for physical addresses and page table
entries regardless of LPAE.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/page-nommu.h           |    8 ++++----
 arch/arm/include/asm/pgtable-2level-types.h |   18 +++++++++---------
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/arm/include/asm/page-nommu.h b/arch/arm/include/asm/page-nommu.h
index d1b162a..a20641a 100644
--- a/arch/arm/include/asm/page-nommu.h
+++ b/arch/arm/include/asm/page-nommu.h
@@ -29,10 +29,10 @@
 /*
  * These are used to make use of C type-checking..
  */
-typedef unsigned long pte_t;
-typedef unsigned long pmd_t;
-typedef unsigned long pgd_t[2];
-typedef unsigned long pgprot_t;
+typedef u32 pte_t;
+typedef u32 pmd_t;
+typedef u32 pgd_t[2];
+typedef u32 pgprot_t;
 
 #define pte_val(x)      (x)
 #define pmd_val(x)      (x)
diff --git a/arch/arm/include/asm/pgtable-2level-types.h b/arch/arm/include/asm/pgtable-2level-types.h
index 30f6741..adc4928 100644
--- a/arch/arm/include/asm/pgtable-2level-types.h
+++ b/arch/arm/include/asm/pgtable-2level-types.h
@@ -21,16 +21,16 @@
 
 #undef STRICT_MM_TYPECHECKS
 
-typedef unsigned long pteval_t;
+typedef u32 pteval_t;
 
 #ifdef STRICT_MM_TYPECHECKS
 /*
  * These are used to make use of C type-checking..
  */
-typedef struct { unsigned long pte; } pte_t;
-typedef struct { unsigned long pmd; } pmd_t;
-typedef struct { unsigned long pgd[2]; } pgd_t;
-typedef struct { unsigned long pgprot; } pgprot_t;
+typedef struct { u32 pte; } pte_t;
+typedef struct { u32 pmd; } pmd_t;
+typedef struct { u32 pgd[2]; } pgd_t;
+typedef struct { u32 pgprot; } pgprot_t;
 
 #define pte_val(x)      ((x).pte)
 #define pmd_val(x)      ((x).pmd)
@@ -45,10 +45,10 @@ typedef struct { unsigned long pgprot; } pgprot_t;
 /*
  * .. while these make it easier on the compiler
  */
-typedef unsigned long pte_t;
-typedef unsigned long pmd_t;
-typedef unsigned long pgd_t[2];
-typedef unsigned long pgprot_t;
+typedef u32 pte_t;
+typedef u32 pmd_t;
+typedef u32 pgd_t[2];
+typedef u32 pgprot_t;
 
 #define pte_val(x)      (x)
 #define pmd_val(x)      (x)

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [PATCH v2 04/20] ARM: LPAE: Do not assume Linux PTEs are always at PTRS_PER_PTE offset
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

Placing the Linux PTEs at a 2KB offset inside a page is a workaround for
the 2-level page table format, where not enough spare bits are available.
With LPAE this is no longer required. This patch removes that assumption
by introducing a new macro, LINUX_PTE_OFFSET, which is defined to
PTRS_PER_PTE for the 2-level page tables.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/pgalloc.h        |    6 +++---
 arch/arm/include/asm/pgtable-2level.h |    1 +
 arch/arm/include/asm/pgtable.h        |    6 +++---
 arch/arm/mm/fault.c                   |    2 +-
 arch/arm/mm/mmu.c                     |    3 ++-
 5 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/pgalloc.h b/arch/arm/include/asm/pgalloc.h
index b12cc98..c2a1f64 100644
--- a/arch/arm/include/asm/pgalloc.h
+++ b/arch/arm/include/asm/pgalloc.h
@@ -62,7 +62,7 @@ pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr)
 	pte = (pte_t *)__get_free_page(PGALLOC_GFP);
 	if (pte) {
 		clean_dcache_area(pte, sizeof(pte_t) * PTRS_PER_PTE);
-		pte += PTRS_PER_PTE;
+		pte += LINUX_PTE_OFFSET;
 	}
 
 	return pte;
@@ -95,7 +95,7 @@ pte_alloc_one(struct mm_struct *mm, unsigned long addr)
 static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
 {
 	if (pte) {
-		pte -= PTRS_PER_PTE;
+		pte -= LINUX_PTE_OFFSET;
 		free_page((unsigned long)pte);
 	}
 }
@@ -128,7 +128,7 @@ pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *ptep)
 	 * The pmd must be loaded with the physical
 	 * address of the PTE table
 	 */
-	pte_ptr -= PTRS_PER_PTE * sizeof(void *);
+	pte_ptr -= LINUX_PTE_OFFSET * sizeof(void *);
 	__pmd_populate(pmdp, __pa(pte_ptr) | _PAGE_KERNEL_TABLE);
 }
 
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index d60bda9..36bdef7 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -71,6 +71,7 @@
 #define PTRS_PER_PTE		512
 #define PTRS_PER_PMD		1
 #define PTRS_PER_PGD		2048
+#define LINUX_PTE_OFFSET	PTRS_PER_PTE
 
 /*
  * PMD_SHIFT determines the size of the area a second-level page table can map
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 17e7ba6..ea08ab7 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -135,8 +135,8 @@ extern struct page *empty_zero_page;
 #define __pte_map(dir)		pmd_page_vaddr(*(dir))
 #define __pte_unmap(pte)	do { } while (0)
 #else
-#define __pte_map(dir)		((pte_t *)kmap_atomic(pmd_page(*(dir))) + PTRS_PER_PTE)
-#define __pte_unmap(pte)	kunmap_atomic((pte - PTRS_PER_PTE))
+#define __pte_map(dir)		((pte_t *)kmap_atomic(pmd_page(*(dir))) + LINUX_PTE_OFFSET)
+#define __pte_unmap(pte)	kunmap_atomic((pte - LINUX_PTE_OFFSET))
 #endif
 
 #define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,pte,ext)
@@ -232,7 +232,7 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 	unsigned long ptr;
 
 	ptr = pmd_val(pmd) & ~(PTRS_PER_PTE * sizeof(void *) - 1);
-	ptr += PTRS_PER_PTE * sizeof(void *);
+	ptr += LINUX_PTE_OFFSET * sizeof(void *);
 
 	return __va(ptr);
 }
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 1e21e12..5da7b0c 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -108,7 +108,7 @@ void show_pte(struct mm_struct *mm, unsigned long addr)
 
 		pte = pte_offset_map(pmd, addr);
 		printk(", *pte=%08lx", pte_val(*pte));
-		printk(", *ppte=%08lx", pte_val(pte[-PTRS_PER_PTE]));
+		printk(", *ppte=%08lx", pte_val(pte[-LINUX_PTE_OFFSET]));
 		pte_unmap(pte);
 	} while(0);
 
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 5e3adca..7324fbc 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -535,7 +535,8 @@ static void __init *early_alloc(unsigned long sz)
 static pte_t * __init early_pte_alloc(pmd_t *pmd, unsigned long addr, unsigned long prot)
 {
 	if (pmd_none(*pmd)) {
-		pte_t *pte = early_alloc(2 * PTRS_PER_PTE * sizeof(pte_t));
+		pte_t *pte = early_alloc((LINUX_PTE_OFFSET +
+					  PTRS_PER_PTE) * sizeof(pte_t));
 		__pmd_populate(pmd, __pa(pte) | prot);
 	}
 	BUG_ON(pmd_bad(*pmd));

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

The LPAE page table format needs to explicitly disable execution or
write permissions on a page by setting the corresponding bits (similar
to the classic page table format with Access Flag enabled). This patch
introduces null definitions for the 2-level format and the actual noexec
and nowrite bits for the LPAE format. It also changes several PTE
maintenance macros and masks.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/pgtable-2level.h |    2 +
 arch/arm/include/asm/pgtable.h        |   44 +++++++++++++++++++++------------
 arch/arm/mm/mmu.c                     |    6 ++--
 3 files changed, 33 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index 36bdef7..4e21166 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -128,6 +128,8 @@
 #define L_PTE_USER		(1 << 8)
 #define L_PTE_EXEC		(1 << 9)
 #define L_PTE_SHARED		(1 << 10)	/* shared(v6), coherent(xsc3) */
+#define L_PTE_NOEXEC		(0)
+#define L_PTE_NOWRITE		(0)
 
 /*
  * These are the memory types, defined to be compatible with
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index ea08ab7..5bd0e64 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -66,23 +66,23 @@ extern pgprot_t		pgprot_kernel;
 
 #define _MOD_PROT(p, b)	__pgprot(pgprot_val(p) | (b))
 
-#define PAGE_NONE		pgprot_user
-#define PAGE_SHARED		_MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_WRITE)
+#define PAGE_NONE		_MOD_PROT(pgprot_user, L_PTE_NOEXEC | L_PTE_NOWRITE)
+#define PAGE_SHARED		_MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_WRITE | L_PTE_NOEXEC)
 #define PAGE_SHARED_EXEC	_MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_WRITE | L_PTE_EXEC)
-#define PAGE_COPY		_MOD_PROT(pgprot_user, L_PTE_USER)
-#define PAGE_COPY_EXEC		_MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_EXEC)
-#define PAGE_READONLY		_MOD_PROT(pgprot_user, L_PTE_USER)
-#define PAGE_READONLY_EXEC	_MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_EXEC)
-#define PAGE_KERNEL		pgprot_kernel
+#define PAGE_COPY		_MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_NOEXEC | L_PTE_NOWRITE)
+#define PAGE_COPY_EXEC		_MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_EXEC | L_PTE_NOWRITE)
+#define PAGE_READONLY		_MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_NOEXEC | L_PTE_NOWRITE)
+#define PAGE_READONLY_EXEC	_MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_EXEC | L_PTE_NOWRITE)
+#define PAGE_KERNEL		_MOD_PROT(pgprot_kernel, L_PTE_NOEXEC)
 #define PAGE_KERNEL_EXEC	_MOD_PROT(pgprot_kernel, L_PTE_EXEC)
 
-#define __PAGE_NONE		__pgprot(_L_PTE_DEFAULT)
-#define __PAGE_SHARED		__pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_WRITE)
+#define __PAGE_NONE		__pgprot(_L_PTE_DEFAULT | L_PTE_NOEXEC | L_PTE_NOWRITE)
+#define __PAGE_SHARED		__pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_WRITE | L_PTE_NOEXEC)
 #define __PAGE_SHARED_EXEC	__pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_WRITE | L_PTE_EXEC)
-#define __PAGE_COPY		__pgprot(_L_PTE_DEFAULT | L_PTE_USER)
-#define __PAGE_COPY_EXEC	__pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_EXEC)
-#define __PAGE_READONLY		__pgprot(_L_PTE_DEFAULT | L_PTE_USER)
-#define __PAGE_READONLY_EXEC	__pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_EXEC)
+#define __PAGE_COPY		__pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_NOEXEC | L_PTE_NOWRITE)
+#define __PAGE_COPY_EXEC	__pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_EXEC | L_PTE_NOWRITE)
+#define __PAGE_READONLY		__pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_NOEXEC | L_PTE_NOWRITE)
+#define __PAGE_READONLY_EXEC	__pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_EXEC | L_PTE_NOWRITE)
 
 #endif /* __ASSEMBLY__ */
 
@@ -165,12 +165,18 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
  * Undefined behaviour if not..
  */
 #define pte_present(pte)	(pte_val(pte) & L_PTE_PRESENT)
-#define pte_write(pte)		(pte_val(pte) & L_PTE_WRITE)
 #define pte_dirty(pte)		(pte_val(pte) & L_PTE_DIRTY)
 #define pte_young(pte)		(pte_val(pte) & L_PTE_YOUNG)
-#define pte_exec(pte)		(pte_val(pte) & L_PTE_EXEC)
 #define pte_special(pte)	(0)
 
+#ifdef CONFIG_ARM_LPAE
+#define pte_write(pte)		(!(pte_val(pte) & L_PTE_NOWRITE))
+#define pte_exec(pte)		(!(pte_val(pte) & L_PTE_NOEXEC))
+#else
+#define pte_write(pte)		(pte_val(pte) & L_PTE_WRITE)
+#define pte_exec(pte)		(pte_val(pte) & L_PTE_EXEC)
+#endif
+
 #define pte_present_user(pte) \
 	((pte_val(pte) & (L_PTE_PRESENT | L_PTE_USER)) == \
 	 (L_PTE_PRESENT | L_PTE_USER))
@@ -178,8 +184,13 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 #define PTE_BIT_FUNC(fn,op) \
 static inline pte_t pte_##fn(pte_t pte) { pte_val(pte) op; return pte; }
 
+#ifdef CONFIG_ARM_LPAE
+PTE_BIT_FUNC(wrprotect, |= L_PTE_NOWRITE);
+PTE_BIT_FUNC(mkwrite,   &= ~L_PTE_NOWRITE);
+#else
 PTE_BIT_FUNC(wrprotect, &= ~L_PTE_WRITE);
 PTE_BIT_FUNC(mkwrite,   |= L_PTE_WRITE);
+#endif
 PTE_BIT_FUNC(mkclean,   &= ~L_PTE_DIRTY);
 PTE_BIT_FUNC(mkdirty,   |= L_PTE_DIRTY);
 PTE_BIT_FUNC(mkold,     &= ~L_PTE_YOUNG);
@@ -272,7 +283,8 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
-	const unsigned long mask = L_PTE_EXEC | L_PTE_WRITE | L_PTE_USER;
+	const unsigned long mask = L_PTE_EXEC | L_PTE_WRITE | L_PTE_USER |
+		L_PTE_NOEXEC | L_PTE_NOWRITE;
 	pte_val(pte) = (pte_val(pte) & ~mask) | (pgprot_val(newprot) & mask);
 	return pte;
 }
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 7324fbc..0ca33dd 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -191,7 +191,7 @@ void adjust_cr(unsigned long mask, unsigned long set)
 }
 #endif
 
-#define PROT_PTE_DEVICE		L_PTE_PRESENT|L_PTE_YOUNG|L_PTE_DIRTY|L_PTE_WRITE
+#define PROT_PTE_DEVICE		L_PTE_PRESENT|L_PTE_YOUNG|L_PTE_DIRTY|L_PTE_WRITE|L_PTE_NOEXEC
 #define PROT_SECT_DEVICE	PMD_TYPE_SECT|PMD_SECT_AP_WRITE
 
 static struct mem_type mem_types[] = {
@@ -236,13 +236,13 @@ static struct mem_type mem_types[] = {
 	},
 	[MT_LOW_VECTORS] = {
 		.prot_pte  = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
-				L_PTE_EXEC,
+				L_PTE_EXEC | L_PTE_NOWRITE,
 		.prot_l1   = PMD_TYPE_TABLE,
 		.domain    = DOMAIN_USER,
 	},
 	[MT_HIGH_VECTORS] = {
 		.prot_pte  = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
-				L_PTE_USER | L_PTE_EXEC,
+				L_PTE_USER | L_PTE_EXEC | L_PTE_NOWRITE,
 		.prot_l1   = PMD_TYPE_TABLE,
 		.domain    = DOMAIN_USER,
 	},

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [PATCH v2 06/20] ARM: LPAE: Introduce the 3-level page table format definitions
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

This patch introduces the pgtable-3level*.h files with definitions
specific to the LPAE page table format (3 levels of page tables).

Each table is 4KB and has 512 64-bit entries. An entry can point to a
40-bit physical address. The young, write and exec software bits share
the corresponding hardware bits (negated). Other software bits use spare
bits in the PTE.

The patch also changes some variable types from unsigned long or int to
pteval_t or pgprot_t.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/page.h                 |    4 +
 arch/arm/include/asm/pgtable-3level-hwdef.h |   78 ++++++++++++++++++
 arch/arm/include/asm/pgtable-3level-types.h |   55 +++++++++++++
 arch/arm/include/asm/pgtable-3level.h       |  113 +++++++++++++++++++++++++++
 arch/arm/include/asm/pgtable-hwdef.h        |    4 +
 arch/arm/include/asm/pgtable.h              |    6 +-
 arch/arm/mm/mm.h                            |    8 +-
 arch/arm/mm/mmu.c                           |    2 +-
 8 files changed, 264 insertions(+), 6 deletions(-)
 create mode 100644 arch/arm/include/asm/pgtable-3level-hwdef.h
 create mode 100644 arch/arm/include/asm/pgtable-3level-types.h
 create mode 100644 arch/arm/include/asm/pgtable-3level.h

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index 3848105..e5124db 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -151,7 +151,11 @@ extern void __cpu_copy_user_highpage(struct page *to, struct page *from,
 #define clear_page(page)	memset((void *)(page), 0, PAGE_SIZE)
 extern void copy_page(void *to, const void *from);
 
+#ifdef CONFIG_ARM_LPAE
+#include <asm/pgtable-3level-types.h>
+#else
 #include <asm/pgtable-2level-types.h>
+#endif
 
 #endif /* CONFIG_MMU */
 
diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
new file mode 100644
index 0000000..2f99c3c
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
@@ -0,0 +1,78 @@
+/*
+ * arch/arm/include/asm/pgtable-3level-hwdef.h
+ *
+ * Copyright (C) 2010 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+#ifndef _ASM_PGTABLE_3LEVEL_HWDEF_H
+#define _ASM_PGTABLE_3LEVEL_HWDEF_H
+
+#include <linux/const.h>
+#include <asm/pgtable-3level-types.h>
+
+/*
+ * Hardware page table definitions.
+ *
+ * + Level 1/2 descriptor
+ *   - common
+ */
+#define PMD_TYPE_MASK		(_AT(pmd_t, 3) << 0)
+#define PMD_TYPE_FAULT		(_AT(pmd_t, 0) << 0)
+#define PMD_TYPE_TABLE		(_AT(pmd_t, 3) << 0)
+#define PMD_TYPE_SECT		(_AT(pmd_t, 1) << 0)
+#define PMD_BIT4		(_AT(pmd_t, 0))
+#define PMD_DOMAIN(x)		(_AT(pmd_t, 0))
+
+/*
+ *   - section
+ */
+#define PMD_SECT_BUFFERABLE	(_AT(pmd_t, 1) << 2)
+#define PMD_SECT_CACHEABLE	(_AT(pmd_t, 1) << 3)
+#define PMD_SECT_S		(_AT(pmd_t, 3) << 8)
+#define PMD_SECT_AF		(_AT(pmd_t, 1) << 10)
+#define PMD_SECT_nG		(_AT(pmd_t, 1) << 11)
+#ifdef __ASSEMBLY__
+/* avoid 'shift count out of range' warning */
+#define PMD_SECT_XN		(0)
+#else
+#define PMD_SECT_XN		((pmd_t)1 << 54)
+#endif
+#define PMD_SECT_AP_WRITE	(_AT(pmd_t, 0))
+#define PMD_SECT_AP_READ	(_AT(pmd_t, 0))
+#define PMD_SECT_TEX(x)		(_AT(pmd_t, 0))
+
+/*
+ * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
+ */
+#define PMD_SECT_UNCACHED	(_AT(pteval_t, 0) << 2)	/* strongly ordered */
+#define PMD_SECT_BUFFERED	(_AT(pteval_t, 1) << 2)	/* normal non-cacheable */
+#define PMD_SECT_WT		(_AT(pteval_t, 2) << 2)	/* normal inner write-through */
+#define PMD_SECT_WB		(_AT(pteval_t, 3) << 2)	/* normal inner write-back */
+#define PMD_SECT_WBWA		(_AT(pteval_t, 7) << 2)	/* normal inner write-alloc */
+
+/*
+ * + Level 3 descriptor (PTE)
+ */
+#define PTE_TYPE_MASK		(_AT(pteval_t, 3) << 0)
+#define PTE_TYPE_FAULT		(_AT(pteval_t, 0) << 0)
+#define PTE_TYPE_PAGE		(_AT(pteval_t, 3) << 0)
+#define PTE_BUFFERABLE		(_AT(pteval_t, 1) << 2)		/* AttrIndx[0] */
+#define PTE_CACHEABLE		(_AT(pteval_t, 1) << 3)		/* AttrIndx[1] */
+#define PTE_EXT_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
+#define PTE_EXT_AF		(_AT(pteval_t, 1) << 10)	/* Access Flag */
+#define PTE_EXT_NG		(_AT(pteval_t, 1) << 11)	/* nG */
+#define PTE_EXT_XN		(_AT(pteval_t, 1) << 54)	/* XN */
+
+#endif
diff --git a/arch/arm/include/asm/pgtable-3level-types.h b/arch/arm/include/asm/pgtable-3level-types.h
new file mode 100644
index 0000000..c9aca5b
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-3level-types.h
@@ -0,0 +1,55 @@
+/*
+ * arch/arm/include/asm/pgtable-3level-types.h
+ *
+ * Copyright (C) 2010 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+#ifndef _ASM_PGTABLE_3LEVEL_TYPES_H
+#define _ASM_PGTABLE_3LEVEL_TYPES_H
+
+#ifndef __ASSEMBLY__
+
+typedef u64 pteval_t;
+typedef u64 pte_t;
+typedef u64 pmd_t;
+typedef u64 pgd_t;
+typedef u64 pgprot_t;
+
+#define pte_val(x)	(x)
+#define pmd_val(x)	(x)
+#define pgd_val(x)	(x)
+#define pgprot_val(x)	(x)
+
+#define __pte(x)	(x)
+#define __pmd(x)	(x)
+#define __pgd(x)	(x)
+#define __pgprot(x)	(x)
+
+/*
+ * 40-bit physical address supported.
+ */
+#define __PHYSICAL_MASK_SHIFT	(40)
+#define __PHYSICAL_MASK		((1ULL << __PHYSICAL_MASK_SHIFT) - 1)
+
+/*
+ * Mask for extracting the PFN from a PTE value. The PAGE_MASK is
+ * sign-extended to 64-bit because the physical range is larger than the
+ * virtual one.
+ */
+#define PTE_PFN_MASK	((u64)((s32)PAGE_MASK & __PHYSICAL_MASK))
+
+#endif	/* __ASSEMBLY__ */
+
+#endif	/* _ASM_PGTABLE_3LEVEL_TYPES_H */
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
new file mode 100644
index 0000000..5b1482d
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -0,0 +1,113 @@
+/*
+ * arch/arm/include/asm/pgtable-3level.h
+ *
+ * Copyright (C) 2010 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+#ifndef _ASM_PGTABLE_3LEVEL_H
+#define _ASM_PGTABLE_3LEVEL_H
+
+#include <linux/const.h>
+#include <asm/pgtable-3level-types.h>
+
+/*
+ * With LPAE, there are 3 levels of page tables. Each level has 512 entries of
+ * 8 bytes each, occupying a 4K page. The first level table covers a range of
+ * 512GB, each entry representing 1GB. Since we are limited to a 4GB input
+ * address range, only 4 entries in the PGD are used.
+ *
+ * There are enough spare bits in a page table entry for the kernel specific
+ * state.
+ */
+#define PTRS_PER_PTE		512
+#define PTRS_PER_PMD		512
+#define PTRS_PER_PGD		4
+#define LINUX_PTE_OFFSET	0
+
+/*
+ * PGDIR_SHIFT determines the size a top-level page table entry can map.
+ */
+#define PGDIR_SHIFT		30
+
+/*
+ * PMD_SHIFT determines the size a middle-level page table entry can map.
+ */
+#define PMD_SHIFT		21
+
+#define PMD_SIZE		(1UL << PMD_SHIFT)
+#define PMD_MASK		(~(PMD_SIZE-1))
+#define PGDIR_SIZE		(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK		(~(PGDIR_SIZE-1))
+
+/*
+ * This is the lowest virtual address we can permit any user space
+ * mapping to be mapped at.  This is particularly important for
+ * non-high vector CPUs.
+ */
+#define FIRST_USER_ADDRESS	PAGE_SIZE
+
+#define FIRST_USER_PGD_NR	1
+#define USER_PTRS_PER_PGD	((TASK_SIZE/PGDIR_SIZE) - FIRST_USER_PGD_NR)
+
+/*
+ * section address mask and size definitions.
+ */
+#define SECTION_SHIFT		21
+#define SECTION_SIZE		(1UL << SECTION_SHIFT)
+#define SECTION_MASK		(~(SECTION_SIZE-1))
+
+/*
+ * "Linux" PTE definitions for LPAE.
+ *
+ * These bits overlap with the hardware bits but the naming is preserved for
+ * consistency with the classic page table format.
+ */
+#define L_PTE_PRESENT		(_AT(pteval_t, 3) << 0)		/* Valid */
+#define L_PTE_FILE		(_AT(pteval_t, 1) << 2)		/* only when !PRESENT */
+#define L_PTE_BUFFERABLE	(_AT(pteval_t, 1) << 2)		/* AttrIndx[0] */
+#define L_PTE_CACHEABLE		(_AT(pteval_t, 1) << 3)		/* AttrIndx[1] */
+#define L_PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
+#define L_PTE_NOWRITE		(_AT(pteval_t, 1) << 7)		/* AP[2] */
+#define L_PTE_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
+#define L_PTE_YOUNG		(_AT(pteval_t, 1) << 10)	/* AF */
+#define L_PTE_NOEXEC		(_AT(pteval_t, 1) << 54)	/* XN */
+#define L_PTE_DIRTY		(_AT(pteval_t, 1) << 55)	/* unused */
+#define L_PTE_SPECIAL		(_AT(pteval_t, 1) << 56)	/* unused */
+#define L_PTE_EXEC		(_AT(pteval_t, 0))
+#define L_PTE_WRITE		(_AT(pteval_t, 0))
+
+/*
+ * To be used in assembly code with the upper page attributes.
+ */
+#define L_PTE_NOEXEC_HIGH	(1 << (54 - 32))
+#define L_PTE_DIRTY_HIGH	(1 << (55 - 32))
+
+#define L_PTE_SWP_SHIFT		2	/* shift for the swap or file PTE */
+
+/*
+ * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
+ */
+#define L_PTE_MT_UNCACHED	(_AT(pteval_t, 0) << 2)	/* strongly ordered */
+#define L_PTE_MT_BUFFERABLE	(_AT(pteval_t, 1) << 2)	/* normal non-cacheable */
+#define L_PTE_MT_WRITETHROUGH	(_AT(pteval_t, 2) << 2)	/* normal inner write-through */
+#define L_PTE_MT_WRITEBACK	(_AT(pteval_t, 3) << 2)	/* normal inner write-back */
+#define L_PTE_MT_WRITEALLOC	(_AT(pteval_t, 7) << 2)	/* normal inner write-alloc */
+#define L_PTE_MT_DEV_SHARED	(_AT(pteval_t, 4) << 2)	/* device */
+#define L_PTE_MT_DEV_NONSHARED	(_AT(pteval_t, 4) << 2)	/* device */
+#define L_PTE_MT_DEV_WC		(_AT(pteval_t, 1) << 2)	/* normal non-cacheable */
+#define L_PTE_MT_DEV_CACHED	(_AT(pteval_t, 3) << 2)	/* normal inner write-back */
+#define L_PTE_MT_MASK		(_AT(pteval_t, 7) << 2)
+
+#endif /* _ASM_PGTABLE_3LEVEL_H */
diff --git a/arch/arm/include/asm/pgtable-hwdef.h b/arch/arm/include/asm/pgtable-hwdef.h
index 1831111..8426229 100644
--- a/arch/arm/include/asm/pgtable-hwdef.h
+++ b/arch/arm/include/asm/pgtable-hwdef.h
@@ -10,6 +10,10 @@
 #ifndef _ASMARM_PGTABLE_HWDEF_H
 #define _ASMARM_PGTABLE_HWDEF_H
 
+#ifdef CONFIG_ARM_LPAE
+#include <asm/pgtable-3level-hwdef.h>
+#else
 #include <asm/pgtable-2level-hwdef.h>
+#endif
 
 #endif
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 5bd0e64..97a5de3 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -23,7 +23,11 @@
 #include <mach/vmalloc.h>
 #include <asm/pgtable-hwdef.h>
 
+#ifdef CONFIG_ARM_LPAE
+#include <asm/pgtable-3level.h>
+#else
 #include <asm/pgtable-2level.h>
+#endif
 
 /*
  * Just any arbitrary offset to the start of the vmalloc VM area: the
@@ -283,7 +287,7 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
-	const unsigned long mask = L_PTE_EXEC | L_PTE_WRITE | L_PTE_USER |
+	const pteval_t mask = L_PTE_EXEC | L_PTE_WRITE | L_PTE_USER |
 		L_PTE_NOEXEC | L_PTE_NOWRITE;
 	pte_val(pte) = (pte_val(pte) & ~mask) | (pgprot_val(newprot) & mask);
 	return pte;
diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
index 6630620..a62f093 100644
--- a/arch/arm/mm/mm.h
+++ b/arch/arm/mm/mm.h
@@ -16,10 +16,10 @@ static inline pmd_t *pmd_off_k(unsigned long virt)
 }
 
 struct mem_type {
-	unsigned int prot_pte;
-	unsigned int prot_l1;
-	unsigned int prot_sect;
-	unsigned int domain;
+	pgprot_t prot_pte;
+	pgprot_t prot_l1;
+	pgprot_t prot_sect;
+	pgprot_t domain;
 };
 
 const struct mem_type *get_mem_type(unsigned int type);
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 0ca33dd..7c803c4 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -292,7 +292,7 @@ static void __init build_mem_type_table(void)
 {
 	struct cachepolicy *cp;
 	unsigned int cr = get_cr();
-	unsigned int user_pgprot, kern_pgprot, vecs_pgprot;
+	pgprot_t user_pgprot, kern_pgprot, vecs_pgprot;
 	int cpu_arch = cpu_architecture();
 	int i;
 

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [PATCH v2 07/20] ARM: LPAE: Page table maintenance for the 3-level format
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

This patch modifies the pgd/pmd/pte manipulation functions to support
the 3-level page table format. Since LPAE folds the extension bits into
the 64-bit PTE value itself, there is no need for a separate 'ext'
argument to cpu_set_pte_ext(); this patch therefore defines a different
prototype for the function when CONFIG_ARM_LPAE is enabled.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/cpu-multi32.h |    8 ++++
 arch/arm/include/asm/cpu-single.h  |    4 ++
 arch/arm/include/asm/pgalloc.h     |   26 ++++++++++++-
 arch/arm/include/asm/pgtable.h     |   72 ++++++++++++++++++++++++++++++++++++
 arch/arm/include/asm/proc-fns.h    |   13 ++++++
 arch/arm/mm/ioremap.c              |    8 ++-
 arch/arm/mm/pgd.c                  |   18 +++++++--
 arch/arm/mm/proc-v7.S              |    8 ++++
 8 files changed, 149 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/cpu-multi32.h b/arch/arm/include/asm/cpu-multi32.h
index e2b5b0b..985fcf5 100644
--- a/arch/arm/include/asm/cpu-multi32.h
+++ b/arch/arm/include/asm/cpu-multi32.h
@@ -57,7 +57,11 @@ extern struct processor {
 	 * Set a possibly extended PTE.  Non-extended PTEs should
 	 * ignore 'ext'.
 	 */
+#ifdef CONFIG_ARM_LPAE
+	void (*set_pte_ext)(pte_t *ptep, pte_t pte);
+#else
 	void (*set_pte_ext)(pte_t *ptep, pte_t pte, unsigned int ext);
+#endif
 } processor;
 
 #define cpu_proc_init()			processor._proc_init()
@@ -65,5 +69,9 @@ extern struct processor {
 #define cpu_reset(addr)			processor.reset(addr)
 #define cpu_do_idle()			processor._do_idle()
 #define cpu_dcache_clean_area(addr,sz)	processor.dcache_clean_area(addr,sz)
+#ifdef CONFIG_ARM_LPAE
+#define cpu_set_pte_ext(ptep,pte)	processor.set_pte_ext(ptep,pte)
+#else
 #define cpu_set_pte_ext(ptep,pte,ext)	processor.set_pte_ext(ptep,pte,ext)
+#endif
 #define cpu_do_switch_mm(pgd,mm)	processor.switch_mm(pgd,mm)
diff --git a/arch/arm/include/asm/cpu-single.h b/arch/arm/include/asm/cpu-single.h
index f073a6d..f436df2 100644
--- a/arch/arm/include/asm/cpu-single.h
+++ b/arch/arm/include/asm/cpu-single.h
@@ -40,5 +40,9 @@ extern void cpu_proc_fin(void);
 extern int cpu_do_idle(void);
 extern void cpu_dcache_clean_area(void *, int);
 extern void cpu_do_switch_mm(unsigned long pgd_phys, struct mm_struct *mm);
+#ifdef CONFIG_ARM_LPAE
+extern void cpu_set_pte_ext(pte_t *ptep, pte_t pte);
+#else
 extern void cpu_set_pte_ext(pte_t *ptep, pte_t pte, unsigned int ext);
+#endif
 extern void cpu_reset(unsigned long addr) __attribute__((noreturn));
diff --git a/arch/arm/include/asm/pgalloc.h b/arch/arm/include/asm/pgalloc.h
index c2a1f64..64a303d 100644
--- a/arch/arm/include/asm/pgalloc.h
+++ b/arch/arm/include/asm/pgalloc.h
@@ -23,6 +23,26 @@
 #define _PAGE_USER_TABLE	(PMD_TYPE_TABLE | PMD_BIT4 | PMD_DOMAIN(DOMAIN_USER))
 #define _PAGE_KERNEL_TABLE	(PMD_TYPE_TABLE | PMD_BIT4 | PMD_DOMAIN(DOMAIN_KERNEL))
 
+#ifdef CONFIG_ARM_LPAE
+
+static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
+{
+	return (pmd_t *)get_zeroed_page(GFP_KERNEL | __GFP_REPEAT);
+}
+
+static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
+{
+	BUG_ON((unsigned long)pmd & (PAGE_SIZE-1));
+	free_page((unsigned long)pmd);
+}
+
+static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pmd_t *pmd)
+{
+	set_pgd(pgd, __pgd(__pa(pmd) | PMD_TYPE_TABLE));
+}
+
+#else	/* !CONFIG_ARM_LPAE */
+
 /*
  * Since we have only two-level page tables, these are trivial
  */
@@ -30,6 +50,8 @@
 #define pmd_free(mm, pmd)		do { } while (0)
 #define pgd_populate(mm,pmd,pte)	BUG()
 
+#endif	/* CONFIG_ARM_LPAE */
+
 extern pgd_t *get_pgd_slow(struct mm_struct *mm);
 extern void free_pgd_slow(struct mm_struct *mm, pgd_t *pgd);
 
@@ -106,10 +128,12 @@ static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
 	__free_page(pte);
 }
 
-static inline void __pmd_populate(pmd_t *pmdp, unsigned long pmdval)
+static inline void __pmd_populate(pmd_t *pmdp, pmd_t pmdval)
 {
 	pmdp[0] = __pmd(pmdval);
+#ifndef CONFIG_ARM_LPAE
 	pmdp[1] = __pmd(pmdval + 256 * sizeof(pte_t));
+#endif
 	flush_pmd_entry(pmdp);
 }
 
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 97a5de3..41236f0 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -124,7 +124,12 @@ extern pgprot_t		pgprot_kernel;
 extern struct page *empty_zero_page;
 #define ZERO_PAGE(vaddr)	(empty_zero_page)
 
+#ifdef CONFIG_ARM_LPAE
+#define pte_pfn(pte)		((pte_val(pte) & PTE_PFN_MASK) >> PAGE_SHIFT)
+#else
 #define pte_pfn(pte)		(pte_val(pte) >> PAGE_SHIFT)
+#endif
+
 #define pfn_pte(pfn,prot)	(__pte(((pfn) << PAGE_SHIFT) | pgprot_val(prot)))
 
 #define pte_none(pte)		(!pte_val(pte))
@@ -143,7 +148,11 @@ extern struct page *empty_zero_page;
 #define __pte_unmap(pte)	kunmap_atomic((pte - LINUX_PTE_OFFSET))
 #endif
 
+#ifdef CONFIG_ARM_LPAE
+#define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,(pte)|(ext))
+#else
 #define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,pte,ext)
+#endif
 
 #if __LINUX_ARM_ARCH__ < 6
 static inline void __sync_icache_dcache(pte_t pteval)
@@ -226,6 +235,30 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define pmd_present(pmd)	(pmd_val(pmd))
+
+#ifdef CONFIG_ARM_LPAE
+
+#define pmd_bad(pmd)		(!(pmd_val(pmd) & 2))
+
+#define copy_pmd(pmdpd,pmdps)		\
+	do {				\
+		*pmdpd = *pmdps;	\
+		flush_pmd_entry(pmdpd);	\
+	} while (0)
+
+#define pmd_clear(pmdp)			\
+	do {				\
+		*pmdp = __pmd(0);	\
+		clean_pmd_entry(pmdp);	\
+	} while (0)
+
+static inline pte_t *pmd_page_vaddr(pmd_t pmd)
+{
+	return __va(pmd_val(pmd) & PTE_PFN_MASK);
+}
+
+#else	/* !CONFIG_ARM_LPAE */
+
 #define pmd_bad(pmd)		(pmd_val(pmd) & 2)
 
 #define copy_pmd(pmdpd,pmdps)		\
@@ -252,7 +285,13 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 	return __va(ptr);
 }
 
+#endif	/* CONFIG_ARM_LPAE */
+
+#ifdef CONFIG_ARM_LPAE
+#define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PTE_PFN_MASK))
+#else
 #define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd)))
+#endif
 
 /*
  * Conversion functions: convert a page and protection to a page entry,
@@ -260,6 +299,31 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
  */
 #define mk_pte(page,prot)	pfn_pte(page_to_pfn(page),prot)
 
+#ifdef CONFIG_ARM_LPAE
+
+#define pgd_none(pgd)		(!pgd_val(pgd))
+#define pgd_bad(pgd)		(!(pgd_val(pgd) & 2))
+#define pgd_present(pgd)	(pgd_val(pgd))
+
+#define pgd_clear(pgdp)			\
+	do {				\
+		*pgdp = __pgd(0);	\
+		clean_pmd_entry(pgdp);	\
+	} while (0)
+
+#define set_pgd(pgdp, pgd)		\
+	do {				\
+		*pgdp = pgd;		\
+		flush_pmd_entry(pgdp);	\
+	} while (0)
+
+static inline pte_t *pgd_page_vaddr(pgd_t pgd)
+{
+	return __va(pgd_val(pgd) & PTE_PFN_MASK);
+}
+
+#else	/* !CONFIG_ARM_LPAE */
+
 /*
  * The "pgd_xxx()" functions here are trivial for a folded two-level
  * setup: the pgd is never bad, and a pmd always exists (as it's folded
@@ -271,6 +335,8 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 #define pgd_clear(pgdp)		do { } while (0)
 #define set_pgd(pgd,pgdp)	do { } while (0)
 
+#endif	/* CONFIG_ARM_LPAE */
+
 /* to find an entry in a page-table-directory */
 #define pgd_index(addr)		((addr) >> PGDIR_SHIFT)
 
@@ -280,7 +346,13 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 #define pgd_offset_k(addr)	pgd_offset(&init_mm, addr)
 
 /* Find an entry in the second-level page table.. */
+#ifdef CONFIG_ARM_LPAE
+#define pmd_index(addr)		(((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
+#define pmd_offset(pgdp, addr)	((pmd_t *)(pgd_page_vaddr(*(pgdp))) + \
+				 pmd_index(addr))
+#else
 #define pmd_offset(dir, addr)	((pmd_t *)(dir))
+#endif
 
 /* Find an entry in the third-level page table.. */
 #define __pte_index(addr)	(((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
diff --git a/arch/arm/include/asm/proc-fns.h b/arch/arm/include/asm/proc-fns.h
index 8fdae9b..f00ae99 100644
--- a/arch/arm/include/asm/proc-fns.h
+++ b/arch/arm/include/asm/proc-fns.h
@@ -263,6 +263,18 @@
 
 #define cpu_switch_mm(pgd,mm) cpu_do_switch_mm(virt_to_phys(pgd),mm)
 
+#ifdef CONFIG_ARM_LPAE
+#define cpu_get_pgd()	\
+	({						\
+		unsigned long pg, pg2;			\
+		__asm__("mrrc	p15, 0, %0, %1, c2"	\
+			: "=r" (pg), "=r" (pg2)		\
+			:				\
+			: "cc");			\
+		pg &= ~(PTRS_PER_PGD*sizeof(pgd_t)-1);	\
+		(pgd_t *)phys_to_virt(pg);		\
+	})
+#else
 #define cpu_get_pgd()	\
 	({						\
 		unsigned long pg;			\
@@ -271,6 +283,7 @@
 		pg &= ~0x3fff;				\
 		(pgd_t *)phys_to_virt(pg);		\
 	})
+#endif
 
 #endif
 
diff --git a/arch/arm/mm/ioremap.c b/arch/arm/mm/ioremap.c
index 17e7b0b..ccfb2ab 100644
--- a/arch/arm/mm/ioremap.c
+++ b/arch/arm/mm/ioremap.c
@@ -64,7 +64,7 @@ void __check_kvm_seq(struct mm_struct *mm)
 	} while (seq != init_mm.context.kvm_seq);
 }
 
-#ifndef CONFIG_SMP
+#if !defined(CONFIG_SMP) && !defined(CONFIG_ARM_LPAE)
 /*
  * Section support is unsafe on SMP - If you iounmap and ioremap a region,
  * the other CPUs will not see this change until their next context switch.
@@ -195,11 +195,13 @@ void __iomem * __arm_ioremap_pfn_caller(unsigned long pfn,
 	unsigned long addr;
  	struct vm_struct * area;
 
+#ifndef CONFIG_ARM_LPAE
 	/*
 	 * High mappings must be supersection aligned
 	 */
 	if (pfn >= 0x100000 && (__pfn_to_phys(pfn) & ~SUPERSECTION_MASK))
 		return NULL;
+#endif
 
 	/*
 	 * Don't allow RAM to be mapped - this causes problems with ARMv6+
@@ -225,7 +227,7 @@ void __iomem * __arm_ioremap_pfn_caller(unsigned long pfn,
  		return NULL;
  	addr = (unsigned long)area->addr;
 
-#ifndef CONFIG_SMP
+#if !defined(CONFIG_SMP) && !defined(CONFIG_ARM_LPAE)
 	if (DOMAIN_IO == 0 &&
 	    (((cpu_architecture() >= CPU_ARCH_ARMv6) && (get_cr() & CR_XP)) ||
 	       cpu_is_xsc3()) && pfn >= 0x100000 &&
@@ -296,7 +298,7 @@ EXPORT_SYMBOL(__arm_ioremap);
 void __iounmap(volatile void __iomem *io_addr)
 {
 	void *addr = (void *)(PAGE_MASK & (unsigned long)io_addr);
-#ifndef CONFIG_SMP
+#if !defined(CONFIG_SMP) && !defined(CONFIG_ARM_LPAE)
 	struct vm_struct **p, *tmp;
 
 	/*
diff --git a/arch/arm/mm/pgd.c b/arch/arm/mm/pgd.c
index 69bbfc6..e7c149b 100644
--- a/arch/arm/mm/pgd.c
+++ b/arch/arm/mm/pgd.c
@@ -10,6 +10,7 @@
 #include <linux/mm.h>
 #include <linux/gfp.h>
 #include <linux/highmem.h>
+#include <linux/slab.h>
 
 #include <asm/pgalloc.h>
 #include <asm/page.h>
@@ -17,7 +18,15 @@
 
 #include "mm.h"
 
+#ifdef CONFIG_ARM_LPAE
+#define alloc_pgd()	kmalloc(PTRS_PER_PGD * sizeof(pgd_t), GFP_KERNEL)
+#define free_pgd(pgd)	kfree(pgd)
+#define FIRST_KERNEL_PGD_NR	(PAGE_OFFSET >> PGDIR_SHIFT)
+#else
+#define alloc_pgd()	(pgd_t *)__get_free_pages(GFP_KERNEL, 2)
+#define free_pgd(pgd)	free_pages((unsigned long)pgd, 2)
 #define FIRST_KERNEL_PGD_NR	(FIRST_USER_PGD_NR + USER_PTRS_PER_PGD)
+#endif
 
 /*
  * need to get a 16k page for level 1
@@ -28,7 +37,7 @@ pgd_t *get_pgd_slow(struct mm_struct *mm)
 	pmd_t *new_pmd, *init_pmd;
 	pte_t *new_pte, *init_pte;
 
-	new_pgd = (pgd_t *)__get_free_pages(GFP_KERNEL, 2);
+	new_pgd = alloc_pgd();
 	if (!new_pgd)
 		goto no_pgd;
 
@@ -68,7 +77,7 @@ pgd_t *get_pgd_slow(struct mm_struct *mm)
 no_pte:
 	pmd_free(mm, new_pmd);
 no_pmd:
-	free_pages((unsigned long)new_pgd, 2);
+	free_pgd(new_pgd);
 no_pgd:
 	return NULL;
 }
@@ -81,7 +90,8 @@ void free_pgd_slow(struct mm_struct *mm, pgd_t *pgd)
 	if (!pgd)
 		return;
 
-	/* pgd is always present and good */
+	if (pgd_none(*pgd))
+		goto free;
 	pmd = pmd_off(pgd, 0);
 	if (pmd_none(*pmd))
 		goto free;
@@ -96,5 +106,5 @@ void free_pgd_slow(struct mm_struct *mm, pgd_t *pgd)
 	pte_free(mm, pte);
 	pmd_free(mm, pmd);
 free:
-	free_pages((unsigned long) pgd, 2);
+	free_pgd(pgd);
 }
diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index 2b5b20b..1098a49 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -130,6 +130,13 @@ ENDPROC(cpu_v7_switch_mm)
  */
 ENTRY(cpu_v7_set_pte_ext)
 #ifdef CONFIG_MMU
+#ifdef CONFIG_ARM_LPAE
+	tst	r2, #L_PTE_PRESENT
+	beq	1f
+	tst	r3, #1 << (55 - 32)		@ L_PTE_DIRTY
+	orreq	r2, #L_PTE_NOWRITE
+1:	strd	r2, r3, [r0]
+#else	/* !CONFIG_ARM_LPAE */
  ARM(	str	r1, [r0], #-2048	)	@ linux version
  THUMB(	str	r1, [r0]		)	@ linux version
  THUMB(	sub	r0, r0, #2048		)
@@ -162,6 +169,7 @@ ENTRY(cpu_v7_set_pte_ext)
 	moveq	r3, #0
 
 	str	r3, [r0]
+#endif	/* CONFIG_ARM_LPAE */
 	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
 #endif
 	mov	pc, lr

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [PATCH v2 08/20] ARM: LPAE: MMU setup for the 3-level page table format
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

This patch adds the MMU initialisation for the LPAE page table format.
The swapper_pg_dir size with LPAE is 5 rather than 4 pages. The
__v7_setup function configures the TTBRx split based on the PAGE_OFFSET
and sets the corresponding TTB control and MAIRx bits (similar to
PRRR/NMRR for TEX remapping). The 36-bit mappings (supersections) and
a few other memory types in mmu.c are conditionally compiled.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/kernel/head.S    |   96 +++++++++++++++++++++++++++++++------------
 arch/arm/mm/mmu.c         |   32 ++++++++++++++-
 arch/arm/mm/proc-macros.S |    5 +-
 arch/arm/mm/proc-v7.S     |   99 ++++++++++++++++++++++++++++++++++++++++----
 4 files changed, 193 insertions(+), 39 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index dd6b369..fd8a29e 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -21,6 +21,7 @@
 #include <asm/memory.h>
 #include <asm/thread_info.h>
 #include <asm/system.h>
+#include <asm/pgtable.h>
 
 #ifdef CONFIG_DEBUG_LL
 #include <mach/debug-macro.S>
@@ -45,11 +46,20 @@
 #error KERNEL_RAM_VADDR must start at 0xXXXX8000
 #endif
 
+#ifdef CONFIG_ARM_LPAE
+	/* LPAE requires an additional page for the PGD */
+#define PG_DIR_SIZE	0x5000
+#define PTE_WORDS	3
+#else
+#define PG_DIR_SIZE	0x4000
+#define PTE_WORDS	2
+#endif
+
 	.globl	swapper_pg_dir
-	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - 0x4000
+	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - PG_DIR_SIZE
 
 	.macro	pgtbl, rd
-	ldr	\rd, =(KERNEL_RAM_PADDR - 0x4000)
+	ldr	\rd, =(KERNEL_RAM_PADDR - PG_DIR_SIZE)
 	.endm
 
 #ifdef CONFIG_XIP_KERNEL
@@ -129,11 +139,11 @@ __create_page_tables:
 	pgtbl	r4				@ page table address
 
 	/*
-	 * Clear the 16K level 1 swapper page table
+	 * Clear the swapper page table
 	 */
 	mov	r0, r4
 	mov	r3, #0
-	add	r6, r0, #0x4000
+	add	r6, r0, #PG_DIR_SIZE
 1:	str	r3, [r0], #4
 	str	r3, [r0], #4
 	str	r3, [r0], #4
@@ -141,6 +151,23 @@ __create_page_tables:
 	teq	r0, r6
 	bne	1b
 
+#ifdef CONFIG_ARM_LPAE
+	/*
+	 * Build the PGD table (first level) to point to the PMD table. A PGD
+	 * entry is 64-bit wide and the top 32 bits are 0.
+	 */
+	mov	r0, r4
+	add	r3, r4, #0x1000			@ first PMD table address
+	orr	r3, r3, #3			@ PGD table type
+	mov	r6, #4				@ PTRS_PER_PGD
+1:	str	r3, [r0], #8			@ set PGD entry
+	add	r3, r3, #0x1000			@ next PMD table
+	subs	r6, r6, #1
+	bne	1b
+
+	add	r4, r4, #0x1000			@ point to the PMD tables
+#endif
+
 	ldr	r7, [r10, #PROCINFO_MM_MMUFLAGS] @ mm_mmuflags
 
 	/*
@@ -152,30 +179,30 @@ __create_page_tables:
 	sub	r0, r0, r3			@ virt->phys offset
 	add	r5, r5, r0			@ phys __enable_mmu
 	add	r6, r6, r0			@ phys __enable_mmu_end
-	mov	r5, r5, lsr #20
-	mov	r6, r6, lsr #20
+	mov	r5, r5, lsr #SECTION_SHIFT
+	mov	r6, r6, lsr #SECTION_SHIFT
 
-1:	orr	r3, r7, r5, lsl #20		@ flags + kernel base
-	str	r3, [r4, r5, lsl #2]		@ identity mapping
-	teq	r5, r6
-	addne	r5, r5, #1			@ next section
-	bne	1b
+1:	orr	r3, r7, r5, lsl #SECTION_SHIFT	@ flags + kernel base
+	str	r3, [r4, r5, lsl #PTE_WORDS]	@ identity mapping
+	cmp	r5, r6
+	addlo	r5, r5, #SECTION_SHIFT >> 20	@ next section
+	blo	1b
 
 	/*
 	 * Now setup the pagetables for our kernel direct
 	 * mapped region.
 	 */
 	mov	r3, pc
-	mov	r3, r3, lsr #20
-	orr	r3, r7, r3, lsl #20
+	mov	r3, r3, lsr #SECTION_SHIFT
+	orr	r3, r7, r3, lsl #SECTION_SHIFT
 	add	r0, r4,  #(KERNEL_START & 0xff000000) >> 18
-	str	r3, [r0, #(KERNEL_START & 0x00f00000) >> 18]!
+	str	r3, [r0, #(KERNEL_START & 0x00e00000) >> 18]!
 	ldr	r6, =(KERNEL_END - 1)
-	add	r0, r0, #4
+	add	r0, r0, #1 << PTE_WORDS
 	add	r6, r4, r6, lsr #18
 1:	cmp	r0, r6
-	add	r3, r3, #1 << 20
-	strls	r3, [r0], #4
+	add	r3, r3, #1 << SECTION_SHIFT
+	strls	r3, [r0], #1 << PTE_WORDS
 	bls	1b
 
 #ifdef CONFIG_XIP_KERNEL
@@ -198,12 +225,13 @@ __create_page_tables:
 #endif
 
 	/*
-	 * Then map first 1MB of ram in case it contains our boot params.
+	 * Then map first section of RAM in case it contains our boot params.
+	 * It assumes that PAGE_OFFSET is 2MB-aligned.
 	 */
 	add	r0, r4, #PAGE_OFFSET >> 18
 	orr	r6, r7, #(PHYS_OFFSET & 0xff000000)
-	.if	(PHYS_OFFSET & 0x00f00000)
-	orr	r6, r6, #(PHYS_OFFSET & 0x00f00000)
+	.if	(PHYS_OFFSET & 0x00e00000)
+	orr	r6, r6, #(PHYS_OFFSET & 0x00e00000)
 	.endif
 	str	r6, [r0]
 
@@ -216,21 +244,27 @@ __create_page_tables:
 	 */
 	addruart r7, r3
 
-	mov	r3, r3, lsr #20
-	mov	r3, r3, lsl #2
+	mov	r3, r3, lsr #SECTION_SHIFT
+	mov	r3, r3, lsl #PTE_WORDS
 
 	add	r0, r4, r3
 	rsb	r3, r3, #0x4000			@ PTRS_PER_PGD*sizeof(long)
 	cmp	r3, #0x0800			@ limit to 512MB
 	movhi	r3, #0x0800
 	add	r6, r0, r3
-	mov	r3, r7, lsr #20
+	mov	r3, r7, lsr #SECTION_SHIFT
 	ldr	r7, [r10, #PROCINFO_IO_MMUFLAGS] @ io_mmuflags
-	orr	r3, r7, r3, lsl #20
+	orr	r3, r7, r3, lsl #SECTION_SHIFT
+#ifdef CONFIG_ARM_LPAE
+	mov	r7, #1 << (54 - 32)		@ XN
+#endif
 1:	str	r3, [r0], #4
-	add	r3, r3, #1 << 20
-	teq	r0, r6
-	bne	1b
+#ifdef CONFIG_ARM_LPAE
+	str	r7, [r0], #4
+#endif
+	add	r3, r3, #1 << SECTION_SHIFT
+	cmp	r0, r6
+	blo	1b
 
 #else /* CONFIG_DEBUG_ICEDCC */
 	/* we don't need any serial debugging mappings for ICEDCC */
@@ -259,6 +293,9 @@ __create_page_tables:
 	str	r3, [r0]
 #endif
 #endif
+#ifdef CONFIG_ARM_LPAE
+	sub	r4, r4, #0x1000		@ point to the PGD table
+#endif
 	mov	pc, lr
 ENDPROC(__create_page_tables)
 	.ltorg
@@ -344,12 +381,17 @@ __enable_mmu:
 #ifdef CONFIG_CPU_ICACHE_DISABLE
 	bic	r0, r0, #CR_I
 #endif
+#ifdef CONFIG_ARM_LPAE
+	mov	r5, #0
+	mcrr	p15, 0, r4, r5, c2		@ load TTBR0
+#else
 	mov	r5, #(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_IO, DOMAIN_CLIENT))
 	mcr	p15, 0, r5, c3, c0, 0		@ load domain access register
 	mcr	p15, 0, r4, c2, c0, 0		@ load page table pointer
+#endif
 	b	__turn_mmu_on
 ENDPROC(__enable_mmu)
 
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 7c803c4..4147cc6 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -152,6 +152,7 @@ static int __init early_nowrite(char *__unused)
 }
 early_param("nowb", early_nowrite);
 
+#ifndef CONFIG_ARM_LPAE
 static int __init early_ecc(char *p)
 {
 	if (memcmp(p, "on", 2) == 0)
@@ -161,6 +162,7 @@ static int __init early_ecc(char *p)
 	return 0;
 }
 early_param("ecc", early_ecc);
+#endif
 
 static int __init noalign_setup(char *__unused)
 {
@@ -230,10 +232,12 @@ static struct mem_type mem_types[] = {
 		.prot_sect = PMD_TYPE_SECT | PMD_SECT_XN,
 		.domain    = DOMAIN_KERNEL,
 	},
+#ifndef CONFIG_ARM_LPAE
 	[MT_MINICLEAN] = {
 		.prot_sect = PMD_TYPE_SECT | PMD_SECT_XN | PMD_SECT_MINICACHE,
 		.domain    = DOMAIN_KERNEL,
 	},
+#endif
 	[MT_LOW_VECTORS] = {
 		.prot_pte  = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
 				L_PTE_EXEC | L_PTE_NOWRITE,
@@ -425,6 +429,7 @@ static void __init build_mem_type_table(void)
 	 * ARMv6 and above have extended page tables.
 	 */
 	if (cpu_arch >= CPU_ARCH_ARMv6 && (cr & CR_XP)) {
+#ifndef CONFIG_ARM_LPAE
 		/*
 		 * Mark cache clean areas and XIP ROM read only
 		 * from SVC mode and no access from userspace.
@@ -432,6 +437,7 @@ static void __init build_mem_type_table(void)
 		mem_types[MT_ROM].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
 		mem_types[MT_MINICLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
 		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
+#endif
 
 		if (is_smp()) {
 			/*
@@ -470,6 +476,18 @@ static void __init build_mem_type_table(void)
 		mem_types[MT_MEMORY_NONCACHED].prot_sect |= PMD_SECT_BUFFERABLE;
 	}
 
+#ifdef CONFIG_ARM_LPAE
+	/*
+	 * Do not generate access flag faults for the kernel mappings.
+	 */
+	for (i = 0; i < ARRAY_SIZE(mem_types); i++) {
+		mem_types[i].prot_pte |= PTE_EXT_AF;
+		mem_types[i].prot_sect |= PMD_SECT_AF;
+	}
+	kern_pgprot |= PTE_EXT_AF;
+	vecs_pgprot |= PTE_EXT_AF;
+#endif
+
 	for (i = 0; i < 16; i++) {
 		unsigned long v = pgprot_val(protection_map[i]);
 		protection_map[i] = __pgprot(v | user_pgprot);
@@ -587,6 +605,7 @@ static void __init alloc_init_section(pgd_t *pgd, unsigned long addr,
 	}
 }
 
+#ifndef CONFIG_ARM_LPAE
 static void __init create_36bit_mapping(struct map_desc *md,
 					const struct mem_type *type)
 {
@@ -644,6 +663,7 @@ static void __init create_36bit_mapping(struct map_desc *md,
 		pgd += SUPERSECTION_SIZE >> PGDIR_SHIFT;
 	} while (addr != end);
 }
+#endif	/* !CONFIG_ARM_LPAE */
 
 /*
  * Create the page directory entries and any necessary
@@ -674,6 +694,7 @@ static void __init create_mapping(struct map_desc *md)
 
 	type = &mem_types[md->type];
 
+#ifndef CONFIG_ARM_LPAE
 	/*
 	 * Catch 36-bit addresses
 	 */
@@ -681,6 +702,7 @@ static void __init create_mapping(struct map_desc *md)
 		create_36bit_mapping(md, type);
 		return;
 	}
+#endif
 
 	addr = md->virtual & PAGE_MASK;
 	phys = (unsigned long)__pfn_to_phys(md->pfn);
@@ -885,6 +907,14 @@ static inline void prepare_page_table(void)
 		pmd_clear(pmd_off_k(addr));
 }
 
+#ifdef CONFIG_ARM_LPAE
+/* the first page is reserved for pgd */
+#define SWAPPER_PG_DIR_SIZE	(PAGE_SIZE + \
+				 PTRS_PER_PGD * PTRS_PER_PMD * sizeof(pmd_t))
+#else
+#define SWAPPER_PG_DIR_SIZE	(PTRS_PER_PGD * sizeof(pgd_t))
+#endif
+
 /*
  * Reserve the special regions of memory
  */
@@ -894,7 +924,7 @@ void __init arm_mm_memblock_reserve(void)
 	 * Reserve the page tables.  These are already in use,
 	 * and can only be in node 0.
 	 */
-	memblock_reserve(__pa(swapper_pg_dir), PTRS_PER_PGD * sizeof(pgd_t));
+	memblock_reserve(__pa(swapper_pg_dir), SWAPPER_PG_DIR_SIZE);
 
 #ifdef CONFIG_SA1111
 	/*
diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index 337f102..fed053c 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -81,8 +81,9 @@
 #if L_PTE_SHARED != PTE_EXT_SHARED
 #error PTE shared bit mismatch
 #endif
-#if (L_PTE_EXEC+L_PTE_USER+L_PTE_WRITE+L_PTE_DIRTY+L_PTE_YOUNG+\
-     L_PTE_FILE+L_PTE_PRESENT) > L_PTE_SHARED
+#if !defined(CONFIG_ARM_LPAE) && \
+		(L_PTE_EXEC+L_PTE_USER+L_PTE_WRITE+L_PTE_DIRTY+L_PTE_YOUNG+ \
+		 L_PTE_FILE+L_PTE_PRESENT) > L_PTE_SHARED
 #error Invalid Linux PTE bit settings
 #endif
 #endif	/* CONFIG_MMU */
diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index 1098a49..33a8c82 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -19,6 +19,19 @@
 
 #include "proc-macros.S"
 
+#ifdef CONFIG_ARM_LPAE
+#define TTB_IRGN_NC	(0 << 8)
+#define TTB_IRGN_WBWA	(1 << 8)
+#define TTB_IRGN_WT	(2 << 8)
+#define TTB_IRGN_WB	(3 << 8)
+#define TTB_RGN_NC	(0 << 10)
+#define TTB_RGN_OC_WBWA	(1 << 10)
+#define TTB_RGN_OC_WT	(2 << 10)
+#define TTB_RGN_OC_WB	(3 << 10)
+#define TTB_S		(3 << 12)
+#define TTB_NOS		(0)
+#define TTB_EAE		(1 << 31)
+#else
 #define TTB_S		(1 << 1)
 #define TTB_RGN_NC	(0 << 3)
 #define TTB_RGN_OC_WBWA	(1 << 3)
@@ -29,14 +42,15 @@
 #define TTB_IRGN_WBWA	((0 << 0) | (1 << 6))
 #define TTB_IRGN_WT	((1 << 0) | (0 << 6))
 #define TTB_IRGN_WB	((1 << 0) | (1 << 6))
+#endif
 
 /* PTWs cacheable, inner WB not shareable, outer WB not shareable */
-#define TTB_FLAGS_UP	TTB_IRGN_WB|TTB_RGN_OC_WB
-#define PMD_FLAGS_UP	PMD_SECT_WB
+#define TTB_FLAGS_UP	(TTB_IRGN_WB|TTB_RGN_OC_WB)
+#define PMD_FLAGS_UP	(PMD_SECT_WB)
 
 /* PTWs cacheable, inner WBWA shareable, outer WBWA not shareable */
-#define TTB_FLAGS_SMP	TTB_IRGN_WBWA|TTB_S|TTB_NOS|TTB_RGN_OC_WBWA
-#define PMD_FLAGS_SMP	PMD_SECT_WBWA|PMD_SECT_S
+#define TTB_FLAGS_SMP	(TTB_IRGN_WBWA|TTB_S|TTB_NOS|TTB_RGN_OC_WBWA)
+#define PMD_FLAGS_SMP	(PMD_SECT_WBWA|PMD_SECT_S)
 
 ENTRY(cpu_v7_proc_init)
 	mov	pc, lr
@@ -280,10 +294,46 @@ __v7_setup:
 	dsb
 #ifdef CONFIG_MMU
 	mcr	p15, 0, r10, c8, c7, 0		@ invalidate I + D TLBs
+#ifdef CONFIG_ARM_LPAE
+	mov	r5, #TTB_EAE
+	ALT_SMP(orr	r5, r5, #TTB_FLAGS_SMP)
+	ALT_SMP(orr	r5, r5, #TTB_FLAGS_SMP << 16)
+	ALT_UP(orr	r5, r5, #TTB_FLAGS_UP)
+	ALT_UP(orr	r5, r5, #TTB_FLAGS_UP << 16)
+	mrc	p15, 0, r10, c2, c0, 2
+	orr	r10, r10, r5
+#if PHYS_OFFSET <= PAGE_OFFSET
+	/*
+	 * TTBR0/TTBR1 split (PAGE_OFFSET):
+	 *   0x40000000: T0SZ = 2, T1SZ = 0 (not used)
+	 *   0x80000000: T0SZ = 0, T1SZ = 1
+	 *   0xc0000000: T0SZ = 0, T1SZ = 2
+	 *
+	 * Only use this feature if PHYS_OFFSET <= PAGE_OFFSET, otherwise
+	 * booting secondary CPUs would end up using TTBR1 for the identity
+	 * mapping set up in TTBR0.
+	 */
+	orr	r10, r10, #(((PAGE_OFFSET >> 30) - 1) << 16)	@ TTBCR.T1SZ
+#endif
+#endif
 	mcr	p15, 0, r10, c2, c0, 2		@ TTB control register
+#ifdef CONFIG_ARM_LPAE
+	mov	r5, #0
+#if defined CONFIG_VMSPLIT_2G
+	/* PAGE_OFFSET == 0x80000000, T1SZ == 1 */
+	add	r6, r4, #1 << 4			@ skip two L1 entries
+#elif defined CONFIG_VMSPLIT_3G
+	/* PAGE_OFFSET == 0xc0000000, T1SZ == 2 */
+	add	r6, r4, #4096 * (1 + 3)		@ only L2 used, skip pgd+3*pmd
+#else
+	mov	r6, r4
+#endif
+	mcrr	p15, 1, r6, r5, c2		@ load TTBR1
+#else	/* !CONFIG_ARM_LPAE */
 	ALT_SMP(orr	r4, r4, #TTB_FLAGS_SMP)
 	ALT_UP(orr	r4, r4, #TTB_FLAGS_UP)
 	mcr	p15, 0, r4, c2, c0, 1		@ load TTB1
+#endif	/* CONFIG_ARM_LPAE */
 	/*
 	 * Memory region attributes with SCTLR.TRE=1
 	 *
@@ -311,11 +361,33 @@ __v7_setup:
 	 *   NS0 = PRRR[18] = 0		- normal shareable property
 	 *   NS1 = PRRR[19] = 1		- normal shareable property
 	 *   NOS = PRRR[24+n] = 1	- not outer shareable
+	 *
+	 * Memory region attributes for LPAE (defined in pgtable-3level.h):
+	 *
+	 *   n = AttrIndx[2:0]
+	 *
+	 *			n	MAIR
+	 *   UNCACHED		000	00000000
+	 *   BUFFERABLE		001	01000100
+	 *   DEV_WC		001	01000100
+	 *   WRITETHROUGH	010	10101010
+	 *   WRITEBACK		011	11101110
+	 *   DEV_CACHED		011	11101110
+	 *   DEV_SHARED		100	00000100
+	 *   DEV_NONSHARED	100	00000100
+	 *   unused		101
+	 *   unused		110
+	 *   WRITEALLOC		111	11111111
 	 */
+#ifdef CONFIG_ARM_LPAE
+	ldr	r5, =0xeeaa4400			@ MAIR0
+	ldr	r6, =0xff000004			@ MAIR1
+#else
 	ldr	r5, =0xff0a81a8			@ PRRR
 	ldr	r6, =0x40e040e0			@ NMRR
-	mcr	p15, 0, r5, c10, c2, 0		@ write PRRR
-	mcr	p15, 0, r6, c10, c2, 1		@ write NMRR
+#endif
+	mcr	p15, 0, r5, c10, c2, 0		@ write PRRR/MAIR0
+	mcr	p15, 0, r6, c10, c2, 1		@ write NMRR/MAIR1
 #endif
 	adr	r5, v7_crval
 	ldmia	r5, {r5, r6}
@@ -334,14 +406,19 @@ __v7_setup:
 ENDPROC(__v7_setup)
 
 	/*   AT
-	 *  TFR   EV X F   I D LR    S
-	 * .EEE ..EE PUI. .T.T 4RVI ZWRS BLDP WCAM
+	 *  TFR   EV X F   IHD LR    S
+	 * .EEE ..EE PUI. .TAT 4RVI ZWRS BLDP WCAM
 	 * rxxx rrxx xxx0 0101 xxxx xxxx x111 xxxx < forced
 	 *    1    0 110       0011 1100 .111 1101 < we want
+	 *   11    0 110    1  0011 1100 .111 1101 < we want (LPAE)
 	 */
 	.type	v7_crval, #object
 v7_crval:
+#ifdef CONFIG_ARM_LPAE
+	crval	clear=0x0120c302, mmuset=0x30c23c7d, ucset=0x00c01c7c
+#else
 	crval	clear=0x0120c302, mmuset=0x10c03c7d, ucset=0x00c01c7c
+#endif
 
 __v7_setup_stack:
 	.space	4 * 11				@ 11 registers
@@ -416,16 +493,20 @@ __v7_proc_info:
 		PMD_TYPE_SECT | \
 		PMD_SECT_AP_WRITE | \
 		PMD_SECT_AP_READ | \
+		PMD_SECT_AF | \
 		PMD_FLAGS_SMP)
 	ALT_UP(.long \
 		PMD_TYPE_SECT | \
 		PMD_SECT_AP_WRITE | \
 		PMD_SECT_AP_READ | \
+		PMD_SECT_AF | \
 		PMD_FLAGS_UP)
+		/* PMD_SECT_XN is set explicitly in head.S for LPAE */
 	.long   PMD_TYPE_SECT | \
 		PMD_SECT_XN | \
 		PMD_SECT_AP_WRITE | \
-		PMD_SECT_AP_READ
+		PMD_SECT_AP_READ | \
+		PMD_SECT_AF
 	b	__v7_setup
 	.long	cpu_arch_name
 	.long	cpu_elf_name

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [PATCH v2 08/20] ARM: LPAE: MMU setup for the 3-level page table format
@ 2010-11-12 18:00   ` Catalin Marinas
  0 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds the MMU initialisation for the LPAE page table format.
With LPAE, swapper_pg_dir is 5 pages rather than 4. The __v7_setup
function configures the TTBRx split based on PAGE_OFFSET and sets the
corresponding TTB control and MAIRx bits (similar to PRRR/NMRR for TEX
remapping). The 36-bit mappings (supersections) and a few other memory
types in mmu.c are only compiled for the classic page table format.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/kernel/head.S    |   96 +++++++++++++++++++++++++++++++------------
 arch/arm/mm/mmu.c         |   32 ++++++++++++++-
 arch/arm/mm/proc-macros.S |    5 +-
 arch/arm/mm/proc-v7.S     |   99 ++++++++++++++++++++++++++++++++++++++++----
 4 files changed, 193 insertions(+), 39 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index dd6b369..fd8a29e 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -21,6 +21,7 @@
 #include <asm/memory.h>
 #include <asm/thread_info.h>
 #include <asm/system.h>
+#include <asm/pgtable.h>
 
 #ifdef CONFIG_DEBUG_LL
 #include <mach/debug-macro.S>
@@ -45,11 +46,20 @@
 #error KERNEL_RAM_VADDR must start at 0xXXXX8000
 #endif
 
+#ifdef CONFIG_ARM_LPAE
+	/* LPAE requires an additional page for the PGD */
+#define PG_DIR_SIZE	0x5000
+#define PTE_WORDS	3
+#else
+#define PG_DIR_SIZE	0x4000
+#define PTE_WORDS	2
+#endif
+
 	.globl	swapper_pg_dir
-	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - 0x4000
+	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - PG_DIR_SIZE
 
 	.macro	pgtbl, rd
-	ldr	\rd, =(KERNEL_RAM_PADDR - 0x4000)
+	ldr	\rd, =(KERNEL_RAM_PADDR - PG_DIR_SIZE)
 	.endm
 
 #ifdef CONFIG_XIP_KERNEL
@@ -129,11 +139,11 @@ __create_page_tables:
 	pgtbl	r4				@ page table address
 
 	/*
-	 * Clear the 16K level 1 swapper page table
+	 * Clear the swapper page table
 	 */
 	mov	r0, r4
 	mov	r3, #0
-	add	r6, r0, #0x4000
+	add	r6, r0, #PG_DIR_SIZE
 1:	str	r3, [r0], #4
 	str	r3, [r0], #4
 	str	r3, [r0], #4
@@ -141,6 +151,23 @@ __create_page_tables:
 	teq	r0, r6
 	bne	1b
 
+#ifdef CONFIG_ARM_LPAE
+	/*
+	 * Build the PGD table (first level) to point to the PMD table. A PGD
+	 * entry is 64-bit wide and the top 32 bits are 0.
+	 */
+	mov	r0, r4
+	add	r3, r4, #0x1000			@ first PMD table address
+	orr	r3, r3, #3			@ PGD block type
+	mov	r6, #4				@ PTRS_PER_PGD
+1:	str	r3, [r0], #8			@ set PGD entry
+	add	r3, r3, #0x1000			@ next PMD table
+	subs	r6, r6, #1
+	bne	1b
+
+	add	r4, r4, #0x1000			@ point to the PMD tables
+#endif
+
 	ldr	r7, [r10, #PROCINFO_MM_MMUFLAGS] @ mm_mmuflags
 
 	/*
@@ -152,30 +179,30 @@ __create_page_tables:
 	sub	r0, r0, r3			@ virt->phys offset
 	add	r5, r5, r0			@ phys __enable_mmu
 	add	r6, r6, r0			@ phys __enable_mmu_end
-	mov	r5, r5, lsr #20
-	mov	r6, r6, lsr #20
+	mov	r5, r5, lsr #SECTION_SHIFT
+	mov	r6, r6, lsr #SECTION_SHIFT
 
-1:	orr	r3, r7, r5, lsl #20		@ flags + kernel base
-	str	r3, [r4, r5, lsl #2]		@ identity mapping
-	teq	r5, r6
-	addne	r5, r5, #1			@ next section
-	bne	1b
+1:	orr	r3, r7, r5, lsl #SECTION_SHIFT	@ flags + kernel base
+	str	r3, [r4, r5, lsl #PTE_WORDS]	@ identity mapping
+	cmp	r5, r6
+	addlo	r5, r5, #SECTION_SHIFT >> 20	@ next section
+	blo	1b
 
 	/*
 	 * Now setup the pagetables for our kernel direct
 	 * mapped region.
 	 */
 	mov	r3, pc
-	mov	r3, r3, lsr #20
-	orr	r3, r7, r3, lsl #20
+	mov	r3, r3, lsr #SECTION_SHIFT
+	orr	r3, r7, r3, lsl #SECTION_SHIFT
 	add	r0, r4,  #(KERNEL_START & 0xff000000) >> 18
-	str	r3, [r0, #(KERNEL_START & 0x00f00000) >> 18]!
+	str	r3, [r0, #(KERNEL_START & 0x00e00000) >> 18]!
 	ldr	r6, =(KERNEL_END - 1)
-	add	r0, r0, #4
+	add	r0, r0, #1 << PTE_WORDS
 	add	r6, r4, r6, lsr #18
 1:	cmp	r0, r6
-	add	r3, r3, #1 << 20
-	strls	r3, [r0], #4
+	add	r3, r3, #1 << SECTION_SHIFT
+	strls	r3, [r0], #1 << PTE_WORDS
 	bls	1b
 
 #ifdef CONFIG_XIP_KERNEL
@@ -198,12 +225,13 @@ __create_page_tables:
 #endif
 
 	/*
-	 * Then map first 1MB of ram in case it contains our boot params.
+	 * Then map first section of RAM in case it contains our boot params.
+	 * It assumes that PAGE_OFFSET is 2MB-aligned.
 	 */
 	add	r0, r4, #PAGE_OFFSET >> 18
 	orr	r6, r7, #(PHYS_OFFSET & 0xff000000)
-	.if	(PHYS_OFFSET & 0x00f00000)
-	orr	r6, r6, #(PHYS_OFFSET & 0x00f00000)
+	.if	(PHYS_OFFSET & 0x00e00000)
+	orr	r6, r6, #(PHYS_OFFSET & 0x00e00000)
 	.endif
 	str	r6, [r0]
 
@@ -216,21 +244,27 @@ __create_page_tables:
 	 */
 	addruart r7, r3
 
-	mov	r3, r3, lsr #20
-	mov	r3, r3, lsl #2
+	mov	r3, r3, lsr #SECTION_SHIFT
+	mov	r3, r3, lsl #PTE_WORDS
 
 	add	r0, r4, r3
 	rsb	r3, r3, #0x4000			@ PTRS_PER_PGD*sizeof(long)
 	cmp	r3, #0x0800			@ limit to 512MB
 	movhi	r3, #0x0800
 	add	r6, r0, r3
-	mov	r3, r7, lsr #20
+	mov	r3, r7, lsr #SECTION_SHIFT
 	ldr	r7, [r10, #PROCINFO_IO_MMUFLAGS] @ io_mmuflags
-	orr	r3, r7, r3, lsl #20
+	orr	r3, r7, r3, lsl #SECTION_SHIFT
+#ifdef CONFIG_ARM_LPAE
+	mov	r7, #1 << (54 - 32)		@ XN
+#endif
 1:	str	r3, [r0], #4
-	add	r3, r3, #1 << 20
-	teq	r0, r6
-	bne	1b
+#ifdef CONFIG_ARM_LPAE
+	str	r7, [r0], #4
+#endif
+	add	r3, r3, #1 << SECTION_SHIFT
+	cmp	r0, r6
+	blo	1b
 
 #else /* CONFIG_DEBUG_ICEDCC */
 	/* we don't need any serial debugging mappings for ICEDCC */
@@ -259,6 +293,9 @@ __create_page_tables:
 	str	r3, [r0]
 #endif
 #endif
+#ifdef CONFIG_ARM_LPAE
+	sub	r4, r4, #0x1000		@ point to the PGD table
+#endif
 	mov	pc, lr
 ENDPROC(__create_page_tables)
 	.ltorg
@@ -344,12 +381,17 @@ __enable_mmu:
 #ifdef CONFIG_CPU_ICACHE_DISABLE
 	bic	r0, r0, #CR_I
 #endif
+#ifdef CONFIG_ARM_LPAE
+	mov	r5, #0
+	mcrr	p15, 0, r4, r5, c2		@ load TTBR0
+#else
 	mov	r5, #(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_IO, DOMAIN_CLIENT))
 	mcr	p15, 0, r5, c3, c0, 0		@ load domain access register
 	mcr	p15, 0, r4, c2, c0, 0		@ load page table pointer
+#endif
 	b	__turn_mmu_on
 ENDPROC(__enable_mmu)
 
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 7c803c4..4147cc6 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -152,6 +152,7 @@ static int __init early_nowrite(char *__unused)
 }
 early_param("nowb", early_nowrite);
 
+#ifndef CONFIG_ARM_LPAE
 static int __init early_ecc(char *p)
 {
 	if (memcmp(p, "on", 2) == 0)
@@ -161,6 +162,7 @@ static int __init early_ecc(char *p)
 	return 0;
 }
 early_param("ecc", early_ecc);
+#endif
 
 static int __init noalign_setup(char *__unused)
 {
@@ -230,10 +232,12 @@ static struct mem_type mem_types[] = {
 		.prot_sect = PMD_TYPE_SECT | PMD_SECT_XN,
 		.domain    = DOMAIN_KERNEL,
 	},
+#ifndef CONFIG_ARM_LPAE
 	[MT_MINICLEAN] = {
 		.prot_sect = PMD_TYPE_SECT | PMD_SECT_XN | PMD_SECT_MINICACHE,
 		.domain    = DOMAIN_KERNEL,
 	},
+#endif
 	[MT_LOW_VECTORS] = {
 		.prot_pte  = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
 				L_PTE_EXEC | L_PTE_NOWRITE,
@@ -425,6 +429,7 @@ static void __init build_mem_type_table(void)
 	 * ARMv6 and above have extended page tables.
 	 */
 	if (cpu_arch >= CPU_ARCH_ARMv6 && (cr & CR_XP)) {
+#ifndef CONFIG_ARM_LPAE
 		/*
 		 * Mark cache clean areas and XIP ROM read only
 		 * from SVC mode and no access from userspace.
@@ -432,6 +437,7 @@ static void __init build_mem_type_table(void)
 		mem_types[MT_ROM].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
 		mem_types[MT_MINICLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
 		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
+#endif
 
 		if (is_smp()) {
 			/*
@@ -470,6 +476,18 @@ static void __init build_mem_type_table(void)
 		mem_types[MT_MEMORY_NONCACHED].prot_sect |= PMD_SECT_BUFFERABLE;
 	}
 
+#ifdef CONFIG_ARM_LPAE
+	/*
+	 * Do not generate access flag faults for the kernel mappings.
+	 */
+	for (i = 0; i < ARRAY_SIZE(mem_types); i++) {
+		mem_types[i].prot_pte |= PTE_EXT_AF;
+		mem_types[i].prot_sect |= PMD_SECT_AF;
+	}
+	kern_pgprot |= PTE_EXT_AF;
+	vecs_pgprot |= PTE_EXT_AF;
+#endif
+
 	for (i = 0; i < 16; i++) {
 		unsigned long v = pgprot_val(protection_map[i]);
 		protection_map[i] = __pgprot(v | user_pgprot);
@@ -587,6 +605,7 @@ static void __init alloc_init_section(pgd_t *pgd, unsigned long addr,
 	}
 }
 
+#ifndef CONFIG_ARM_LPAE
 static void __init create_36bit_mapping(struct map_desc *md,
 					const struct mem_type *type)
 {
@@ -644,6 +663,7 @@ static void __init create_36bit_mapping(struct map_desc *md,
 		pgd += SUPERSECTION_SIZE >> PGDIR_SHIFT;
 	} while (addr != end);
 }
+#endif	/* !CONFIG_ARM_LPAE */
 
 /*
  * Create the page directory entries and any necessary
@@ -674,6 +694,7 @@ static void __init create_mapping(struct map_desc *md)
 
 	type = &mem_types[md->type];
 
+#ifndef CONFIG_ARM_LPAE
 	/*
 	 * Catch 36-bit addresses
 	 */
@@ -681,6 +702,7 @@ static void __init create_mapping(struct map_desc *md)
 		create_36bit_mapping(md, type);
 		return;
 	}
+#endif
 
 	addr = md->virtual & PAGE_MASK;
 	phys = (unsigned long)__pfn_to_phys(md->pfn);
@@ -885,6 +907,14 @@ static inline void prepare_page_table(void)
 		pmd_clear(pmd_off_k(addr));
 }
 
+#ifdef CONFIG_ARM_LPAE
+/* the first page is reserved for pgd */
+#define SWAPPER_PG_DIR_SIZE	(PAGE_SIZE + \
+				 PTRS_PER_PGD * PTRS_PER_PMD * sizeof(pmd_t))
+#else
+#define SWAPPER_PG_DIR_SIZE	(PTRS_PER_PGD * sizeof(pgd_t))
+#endif
+
 /*
  * Reserve the special regions of memory
  */
@@ -894,7 +924,7 @@ void __init arm_mm_memblock_reserve(void)
 	 * Reserve the page tables.  These are already in use,
 	 * and can only be in node 0.
 	 */
-	memblock_reserve(__pa(swapper_pg_dir), PTRS_PER_PGD * sizeof(pgd_t));
+	memblock_reserve(__pa(swapper_pg_dir), SWAPPER_PG_DIR_SIZE);
 
 #ifdef CONFIG_SA1111
 	/*
diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index 337f102..fed053c 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -81,8 +81,9 @@
 #if L_PTE_SHARED != PTE_EXT_SHARED
 #error PTE shared bit mismatch
 #endif
-#if (L_PTE_EXEC+L_PTE_USER+L_PTE_WRITE+L_PTE_DIRTY+L_PTE_YOUNG+\
-     L_PTE_FILE+L_PTE_PRESENT) > L_PTE_SHARED
+#if !defined(CONFIG_ARM_LPAE) && \
+		(L_PTE_EXEC+L_PTE_USER+L_PTE_WRITE+L_PTE_DIRTY+L_PTE_YOUNG+ \
+		 L_PTE_FILE+L_PTE_PRESENT) > L_PTE_SHARED
 #error Invalid Linux PTE bit settings
 #endif
 #endif	/* CONFIG_MMU */
diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index 1098a49..33a8c82 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -19,6 +19,19 @@
 
 #include "proc-macros.S"
 
+#ifdef CONFIG_ARM_LPAE
+#define TTB_IRGN_NC	(0 << 8)
+#define TTB_IRGN_WBWA	(1 << 8)
+#define TTB_IRGN_WT	(2 << 8)
+#define TTB_IRGN_WB	(3 << 8)
+#define TTB_RGN_NC	(0 << 10)
+#define TTB_RGN_OC_WBWA	(1 << 10)
+#define TTB_RGN_OC_WT	(2 << 10)
+#define TTB_RGN_OC_WB	(3 << 10)
+#define TTB_S		(3 << 12)
+#define TTB_NOS		(0)
+#define TTB_EAE		(1 << 31)
+#else
 #define TTB_S		(1 << 1)
 #define TTB_RGN_NC	(0 << 3)
 #define TTB_RGN_OC_WBWA	(1 << 3)
@@ -29,14 +42,15 @@
 #define TTB_IRGN_WBWA	((0 << 0) | (1 << 6))
 #define TTB_IRGN_WT	((1 << 0) | (0 << 6))
 #define TTB_IRGN_WB	((1 << 0) | (1 << 6))
+#endif
 
 /* PTWs cacheable, inner WB not shareable, outer WB not shareable */
-#define TTB_FLAGS_UP	TTB_IRGN_WB|TTB_RGN_OC_WB
-#define PMD_FLAGS_UP	PMD_SECT_WB
+#define TTB_FLAGS_UP	(TTB_IRGN_WB|TTB_RGN_OC_WB)
+#define PMD_FLAGS_UP	(PMD_SECT_WB)
 
 /* PTWs cacheable, inner WBWA shareable, outer WBWA not shareable */
-#define TTB_FLAGS_SMP	TTB_IRGN_WBWA|TTB_S|TTB_NOS|TTB_RGN_OC_WBWA
-#define PMD_FLAGS_SMP	PMD_SECT_WBWA|PMD_SECT_S
+#define TTB_FLAGS_SMP	(TTB_IRGN_WBWA|TTB_S|TTB_NOS|TTB_RGN_OC_WBWA)
+#define PMD_FLAGS_SMP	(PMD_SECT_WBWA|PMD_SECT_S)
 
 ENTRY(cpu_v7_proc_init)
 	mov	pc, lr
@@ -280,10 +294,46 @@ __v7_setup:
 	dsb
 #ifdef CONFIG_MMU
 	mcr	p15, 0, r10, c8, c7, 0		@ invalidate I + D TLBs
+#ifdef CONFIG_ARM_LPAE
+	mov	r5, #TTB_EAE
+	ALT_SMP(orr	r5, r5, #TTB_FLAGS_SMP)
+	ALT_SMP(orr	r5, r5, #TTB_FLAGS_SMP << 16)
+	ALT_UP(orr	r5, r5, #TTB_FLAGS_UP)
+	ALT_UP(orr	r5, r5, #TTB_FLAGS_UP << 16)
+	mrc	p15, 0, r10, c2, c0, 2
+	orr	r10, r10, r5
+#if PHYS_OFFSET <= PAGE_OFFSET
+	/*
+	 * TTBR0/TTBR1 split (PAGE_OFFSET):
+	 *   0x40000000: T0SZ = 2, T1SZ = 0 (not used)
+	 *   0x80000000: T0SZ = 0, T1SZ = 1
+	 *   0xc0000000: T0SZ = 0, T1SZ = 2
+	 *
+	 * Only use this feature if PHYS_OFFSET <= PAGE_OFFSET, otherwise
+	 * booting secondary CPUs would end up using TTBR1 for the identity
+	 * mapping set up in TTBR0.
+	 */
+	orr	r10, r10, #(((PAGE_OFFSET >> 30) - 1) << 16)	@ TTBCR.T1SZ
+#endif
+#endif
 	mcr	p15, 0, r10, c2, c0, 2		@ TTB control register
+#ifdef CONFIG_ARM_LPAE
+	mov	r5, #0
+#if defined CONFIG_VMSPLIT_2G
+	/* PAGE_OFFSET == 0x80000000, T1SZ == 1 */
+	add	r6, r4, #1 << 4			@ skip two L1 entries
+#elif defined CONFIG_VMSPLIT_3G
+	/* PAGE_OFFSET == 0xc0000000, T1SZ == 2 */
+	add	r6, r4, #4096 * (1 + 3)		@ only L2 used, skip pgd+3*pmd
+#else
+	mov	r6, r4
+#endif
+	mcrr	p15, 1, r6, r5, c2		@ load TTBR1
+#else	/* !CONFIG_ARM_LPAE */
 	ALT_SMP(orr	r4, r4, #TTB_FLAGS_SMP)
 	ALT_UP(orr	r4, r4, #TTB_FLAGS_UP)
 	mcr	p15, 0, r4, c2, c0, 1		@ load TTB1
+#endif	/* CONFIG_ARM_LPAE */
 	/*
 	 * Memory region attributes with SCTLR.TRE=1
 	 *
@@ -311,11 +361,33 @@ __v7_setup:
 	 *   NS0 = PRRR[18] = 0		- normal shareable property
 	 *   NS1 = PRRR[19] = 1		- normal shareable property
 	 *   NOS = PRRR[24+n] = 1	- not outer shareable
+	 *
+	 * Memory region attributes for LPAE (defined in pgtable-3level.h):
+	 *
+	 *   n = AttrIndx[2:0]
+	 *
+	 *			n	MAIR
+	 *   UNCACHED		000	00000000
+	 *   BUFFERABLE		001	01000100
+	 *   DEV_WC		001	01000100
+	 *   WRITETHROUGH	010	10101010
+	 *   WRITEBACK		011	11101110
+	 *   DEV_CACHED		011	11101110
+	 *   DEV_SHARED		100	00000100
+	 *   DEV_NONSHARED	100	00000100
+	 *   unused		101
+	 *   unused		110
+	 *   WRITEALLOC		111	11111111
 	 */
+#ifdef CONFIG_ARM_LPAE
+	ldr	r5, =0xeeaa4400			@ MAIR0
+	ldr	r6, =0xff000004			@ MAIR1
+#else
 	ldr	r5, =0xff0a81a8			@ PRRR
 	ldr	r6, =0x40e040e0			@ NMRR
-	mcr	p15, 0, r5, c10, c2, 0		@ write PRRR
-	mcr	p15, 0, r6, c10, c2, 1		@ write NMRR
+#endif
+	mcr	p15, 0, r5, c10, c2, 0		@ write PRRR/MAIR0
+	mcr	p15, 0, r6, c10, c2, 1		@ write NMRR/MAIR1
 #endif
 	adr	r5, v7_crval
 	ldmia	r5, {r5, r6}
@@ -334,14 +406,19 @@ __v7_setup:
 ENDPROC(__v7_setup)
 
 	/*   AT
-	 *  TFR   EV X F   I D LR    S
-	 * .EEE ..EE PUI. .T.T 4RVI ZWRS BLDP WCAM
+	 *  TFR   EV X F   IHD LR    S
+	 * .EEE ..EE PUI. .TAT 4RVI ZWRS BLDP WCAM
 	 * rxxx rrxx xxx0 0101 xxxx xxxx x111 xxxx < forced
 	 *    1    0 110       0011 1100 .111 1101 < we want
+	 *   11    0 110    1  0011 1100 .111 1101 < we want (LPAE)
 	 */
 	.type	v7_crval, #object
 v7_crval:
+#ifdef CONFIG_ARM_LPAE
+	crval	clear=0x0120c302, mmuset=0x30c23c7d, ucset=0x00c01c7c
+#else
 	crval	clear=0x0120c302, mmuset=0x10c03c7d, ucset=0x00c01c7c
+#endif
 
 __v7_setup_stack:
 	.space	4 * 11				@ 11 registers
@@ -416,16 +493,20 @@ __v7_proc_info:
 		PMD_TYPE_SECT | \
 		PMD_SECT_AP_WRITE | \
 		PMD_SECT_AP_READ | \
+		PMD_SECT_AF | \
 		PMD_FLAGS_SMP)
 	ALT_UP(.long \
 		PMD_TYPE_SECT | \
 		PMD_SECT_AP_WRITE | \
 		PMD_SECT_AP_READ | \
+		PMD_SECT_AF | \
 		PMD_FLAGS_UP)
+		/* PMD_SECT_XN is set explicitly in head.S for LPAE */
 	.long   PMD_TYPE_SECT | \
 		PMD_SECT_XN | \
 		PMD_SECT_AP_WRITE | \
-		PMD_SECT_AP_READ
+		PMD_SECT_AP_READ | \
+		PMD_SECT_AF
 	b	__v7_setup
 	.long	cpu_arch_name
 	.long	cpu_elf_name

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [PATCH v2 09/20] ARM: LPAE: Change setup_mm_for_reboot() to work with LPAE
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

This function assumed that there are only two levels of page tables.
The patch changes the loop over the PMD entries to make it compatible
with LPAE.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/mm/mmu.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 4147cc6..3784acc 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1098,13 +1098,16 @@ void setup_mm_for_reboot(char mode)
 	if (cpu_architecture() <= CPU_ARCH_ARMv5TEJ && !cpu_is_xscale())
 		base_pmdval |= PMD_BIT4;
 
-	for (i = 0; i < FIRST_USER_PGD_NR + USER_PTRS_PER_PGD; i++, pgd++) {
+	for (i = 0; i < TASK_SIZE >> PMD_SHIFT; i++) {
 		unsigned long pmdval = (i << PMD_SHIFT) | base_pmdval;
 		pmd_t *pmd;
+		unsigned long addr = i << PMD_SHIFT;
 
-		pmd = pmd_off(pgd, i << PMD_SHIFT);
+		pmd = pmd_off(pgd + pgd_index(addr), addr);
 		pmd[0] = __pmd(pmdval);
+#ifndef CONFIG_ARM_LPAE
 		pmd[1] = __pmd(pmdval + (1 << (PMD_SHIFT - 1)));
+#endif
 		flush_pmd_entry(pmd);
 	}
 

^ permalink raw reply related	[flat|nested] 154+ messages in thread


* [PATCH v2 10/20] ARM: LPAE: Remove the FIRST_USER_PGD_NR and USER_PTRS_PER_PGD definitions
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

These macros were only used in setup_mm_for_reboot() and
get_pgd_slow(). Both have been modified to no longer use these
definitions, partly because a PGD entry means different things with the
2-level and 3-level page table formats.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/pgtable-2level.h |    3 ---
 arch/arm/include/asm/pgtable-3level.h |    3 ---
 arch/arm/mm/pgd.c                     |    2 +-
 3 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index 4e21166..a0548b6 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -92,9 +92,6 @@
  */
 #define FIRST_USER_ADDRESS	PAGE_SIZE
 
-#define FIRST_USER_PGD_NR	1
-#define USER_PTRS_PER_PGD	((TASK_SIZE/PGDIR_SIZE) - FIRST_USER_PGD_NR)
-
 /*
  * section address mask and size definitions.
  */
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 5b1482d..381b04b 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -58,9 +58,6 @@
  */
 #define FIRST_USER_ADDRESS	PAGE_SIZE
 
-#define FIRST_USER_PGD_NR	1
-#define USER_PTRS_PER_PGD	((TASK_SIZE/PGDIR_SIZE) - FIRST_USER_PGD_NR)
-
 /*
  * section address mask and size definitions.
  */
diff --git a/arch/arm/mm/pgd.c b/arch/arm/mm/pgd.c
index e7c149b..09238fa 100644
--- a/arch/arm/mm/pgd.c
+++ b/arch/arm/mm/pgd.c
@@ -25,7 +25,7 @@
 #else
 #define alloc_pgd()	(pgd_t *)__get_free_pages(GFP_KERNEL, 2)
 #define free_pgd(pgd)	free_pages((unsigned long)pgd, 2)
-#define FIRST_KERNEL_PGD_NR	(FIRST_USER_PGD_NR + USER_PTRS_PER_PGD)
+#define FIRST_KERNEL_PGD_NR	(TASK_SIZE >> PGDIR_SHIFT)
 #endif
 
 /*

^ permalink raw reply related	[flat|nested] 154+ messages in thread


* [PATCH v2 11/20] ARM: LPAE: Add fault handling support
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

The DFSR and IFSR register formats are different when LPAE is enabled.
In addition, DFSR and IFSR share similar encodings for the fault type.
This patch modifies the fault code to handle the new format correctly.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/mm/alignment.c |    8 ++++-
 arch/arm/mm/fault.c     |   80 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 87 insertions(+), 1 deletions(-)

diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index 724ba3b..bc98a6e 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -906,6 +906,12 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	return 0;
 }
 
+#ifdef CONFIG_ARM_LPAE
+#define ALIGNMENT_FAULT		33
+#else
+#define ALIGNMENT_FAULT		1
+#endif
+
 /*
  * This needs to be done after sysctl_init, otherwise sys/ will be
  * overwritten.  Actually, this shouldn't be in sys/ at all since
@@ -939,7 +945,7 @@ static int __init alignment_init(void)
 		ai_usermode = UM_FIXUP;
 	}
 
-	hook_fault_code(1, do_alignment, SIGBUS, BUS_ADRALN,
+	hook_fault_code(ALIGNMENT_FAULT, do_alignment, SIGBUS, BUS_ADRALN,
 			"alignment exception");
 
 	/*
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 5da7b0c..2dde9cd 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -33,10 +33,15 @@
 #define FSR_WRITE		(1 << 11)
 #define FSR_FS4			(1 << 10)
 #define FSR_FS3_0		(15)
+#define FSR_FS5_0		(0x3f)
 
 static inline int fsr_fs(unsigned int fsr)
 {
+#ifdef CONFIG_ARM_LPAE
+	return fsr & FSR_FS5_0;
+#else
 	return (fsr & FSR_FS3_0) | (fsr & FSR_FS4) >> 6;
+#endif
 }
 
 #ifdef CONFIG_MMU
@@ -108,7 +113,9 @@ void show_pte(struct mm_struct *mm, unsigned long addr)
 
 		pte = pte_offset_map(pmd, addr);
 		printk(", *pte=%08lx", pte_val(*pte));
+#ifndef CONFIG_ARM_LPAE
 		printk(", *ppte=%08lx", pte_val(pte[-LINUX_PTE_OFFSET]));
+#endif
 		pte_unmap(pte);
 	} while(0);
 
@@ -467,6 +474,72 @@ static struct fsr_info {
 	int	code;
 	const char *name;
 } fsr_info[] = {
+#ifdef CONFIG_ARM_LPAE
+	{ do_bad,		SIGBUS,  0,		"unknown 0"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 1"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 2"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 3"			},
+	{ do_bad,		SIGBUS,  0,		"reserved translation fault"	},
+	{ do_translation_fault,	SIGSEGV, SEGV_MAPERR,	"level 1 translation fault"	},
+	{ do_translation_fault,	SIGSEGV, SEGV_MAPERR,	"level 2 translation fault"	},
+	{ do_page_fault,	SIGSEGV, SEGV_MAPERR,	"level 3 translation fault"	},
+	{ do_bad,		SIGBUS,  0,		"reserved access flag fault"	},
+	{ do_bad,		SIGSEGV, SEGV_ACCERR,	"level 1 access flag fault"	},
+	{ do_bad,		SIGSEGV, SEGV_ACCERR,	"level 2 access flag fault"	},
+	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 3 access flag fault"	},
+	{ do_bad,		SIGBUS,  0,		"reserved permission fault"	},
+	{ do_bad,		SIGSEGV, SEGV_ACCERR,	"level 1 permission fault"	},
+	{ do_sect_fault,	SIGSEGV, SEGV_ACCERR,	"level 2 permission fault"	},
+	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 3 permission fault"	},
+	{ do_bad,		SIGBUS,  0,		"synchronous external abort"	},
+	{ do_bad,		SIGBUS,  0,		"asynchronous external abort"	},
+	{ do_bad,		SIGBUS,  0,		"unknown 18"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 19"			},
+	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error"	},
+	{ do_bad,		SIGBUS,  0,		"asynchronous parity error"	},
+	{ do_bad,		SIGBUS,  0,		"unknown 26"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 27"			},
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"unknown 32"			},
+	{ do_bad,		SIGBUS,  BUS_ADRALN,	"alignment fault"		},
+	{ do_bad,		SIGBUS,  0,		"debug event"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 35"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 36"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 37"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 38"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 39"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 40"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 41"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 42"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 43"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 44"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 45"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 46"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 47"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 48"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 49"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 50"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 51"			},
+	{ do_bad,		SIGBUS,  0,		"implementation fault (lockdown abort)" },
+	{ do_bad,		SIGBUS,  0,		"unknown 53"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 54"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 55"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 56"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 57"			},
+	{ do_bad,		SIGBUS,  0,		"implementation fault (coprocessor abort)" },
+	{ do_bad,		SIGBUS,  0,		"unknown 59"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 60"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 61"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 62"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 63"			},
+#else	/* !CONFIG_ARM_LPAE */
 	/*
 	 * The following are the standard ARMv3 and ARMv4 aborts.  ARMv5
 	 * defines these to be "precise" aborts.
@@ -508,6 +581,7 @@ static struct fsr_info {
 	{ do_bad,		SIGBUS,  0,		"unknown 29"			   },
 	{ do_bad,		SIGBUS,  0,		"unknown 30"			   },
 	{ do_bad,		SIGBUS,  0,		"unknown 31"			   }
+#endif	/* CONFIG_ARM_LPAE */
 };
 
 void __init
@@ -546,6 +620,9 @@ do_DataAbort(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 }
 
 
+#ifdef CONFIG_ARM_LPAE
+#define ifsr_info	fsr_info
+#else	/* !CONFIG_ARM_LPAE */
 static struct fsr_info ifsr_info[] = {
 	{ do_bad,		SIGBUS,  0,		"unknown 0"			   },
 	{ do_bad,		SIGBUS,  0,		"unknown 1"			   },
@@ -580,6 +657,7 @@ static struct fsr_info ifsr_info[] = {
 	{ do_bad,		SIGBUS,  0,		"unknown 30"			   },
 	{ do_bad,		SIGBUS,  0,		"unknown 31"			   },
 };
+#endif	/* CONFIG_ARM_LPAE */
 
 void __init
 hook_ifault_code(int nr, int (*fn)(unsigned long, unsigned int, struct pt_regs *),
@@ -615,6 +693,7 @@ do_PrefetchAbort(unsigned long addr, unsigned int ifsr, struct pt_regs *regs)
 
 static int __init exceptions_init(void)
 {
+#ifndef CONFIG_ARM_LPAE
 	if (cpu_architecture() >= CPU_ARCH_ARMv6) {
 		hook_fault_code(4, do_translation_fault, SIGSEGV, SEGV_MAPERR,
 				"I-cache maintenance fault");
@@ -630,6 +709,7 @@ static int __init exceptions_init(void)
 		hook_fault_code(6, do_bad, SIGSEGV, SEGV_MAPERR,
 				"section access flag fault");
 	}
+#endif
 
 	return 0;
 }

* [PATCH v2 12/20] ARM: LPAE: Add context switching support
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

With LPAE, the TTBRx registers are 64-bit. The ASID is stored in the
upper bits of TTBR0 rather than in a separate Context ID register. This
patch makes the necessary changes to handle context switching on LPAE.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/mm/context.c |   18 ++++++++++++++++--
 arch/arm/mm/proc-v7.S |    8 +++++++-
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mm/context.c b/arch/arm/mm/context.c
index b0ee9ba..d40d3fa 100644
--- a/arch/arm/mm/context.c
+++ b/arch/arm/mm/context.c
@@ -22,6 +22,20 @@ unsigned int cpu_last_asid = ASID_FIRST_VERSION;
 DEFINE_PER_CPU(struct mm_struct *, current_mm);
 #endif
 
+#ifdef CONFIG_ARM_LPAE
+#define cpu_set_asid(asid) {						\
+	unsigned long ttbl, ttbh;					\
+	asm("	mrrc	p15, 0, %0, %1, c2		@ read TTBR0\n"	\
+	    "	mov	%1, %1, lsl #(48 - 32)		@ set ASID\n"	\
+	    "	mcrr	p15, 0, %0, %1, c2		@ set TTBR0\n"	\
+	    : "=r" (ttbl), "=r" (ttbh)					\
+	    : "r" (asid & ~ASID_MASK));					\
+}
+#else
+#define cpu_set_asid(asid) \
+	asm("	mcr	p15, 0, %0, c13, c0, 1\n" : : "r" (asid))
+#endif
+
 /*
  * We fork()ed a process, and we need a new context for the child
  * to run in.  We reserve version 0 for initial tasks so we will
@@ -37,7 +51,7 @@ void __init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 static void flush_context(void)
 {
 	/* set the reserved ASID before flushing the TLB */
-	asm("mcr	p15, 0, %0, c13, c0, 1\n" : : "r" (0));
+	cpu_set_asid(0);
 	isb();
 	local_flush_tlb_all();
 	if (icache_is_vivt_asid_tagged()) {
@@ -99,7 +113,7 @@ static void reset_context(void *info)
 	set_mm_context(mm, asid);
 
 	/* set the new ASID */
-	asm("mcr	p15, 0, %0, c13, c0, 1\n" : : "r" (mm->context.id));
+	cpu_set_asid(mm->context.id);
 	isb();
 }
 
diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index 33a8c82..b0932c1 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -117,6 +117,11 @@ ENTRY(cpu_v7_switch_mm)
 #ifdef CONFIG_MMU
 	mov	r2, #0
 	ldr	r1, [r1, #MM_CONTEXT_ID]	@ get mm->context.id
+#ifdef CONFIG_ARM_LPAE
+	and	r3, r1, #0xff
+	mov	r3, r3, lsl #(48 - 32)		@ ASID
+	mcrr	p15, 0, r0, r3, c2		@ set TTB 0
+#else	/* !CONFIG_ARM_LPAE */
 	ALT_SMP(orr	r0, r0, #TTB_FLAGS_SMP)
 	ALT_UP(orr	r0, r0, #TTB_FLAGS_UP)
 #ifdef CONFIG_ARM_ERRATA_430973
@@ -124,9 +129,10 @@ ENTRY(cpu_v7_switch_mm)
 #endif
 	mcr	p15, 0, r2, c13, c0, 1		@ set reserved context ID
 	isb
-1:	mcr	p15, 0, r0, c2, c0, 0		@ set TTB 0
+	mcr	p15, 0, r0, c2, c0, 0		@ set TTB 0
 	isb
 	mcr	p15, 0, r1, c13, c0, 1		@ set context ID
+#endif	/* CONFIG_ARM_LPAE */
 	isb
 #endif
 	mov	pc, lr

* [PATCH v2 13/20] ARM: LPAE: Add SMP support for the 3-level page table format
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

With 3-level page tables, starting secondary CPUs requires allocating
the pgd as well. Since LPAE Linux uses TTBR1 for the kernel page tables,
this patch reorders the CPU setup call in the head.S file so that the
swapper_pg_dir is used. TTBR0 is set to the value generated by the
primary CPU.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/kernel/head.S |   10 +++++-----
 arch/arm/kernel/smp.c  |   39 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index fd8a29e..b54d00e 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -321,6 +321,10 @@ ENTRY(secondary_startup)
 	moveq	r0, #'p'			@ yes, error 'p'
 	beq	__error_p
 
+	pgtbl	r4
+	add	r12, r10, #BSYM(PROCINFO_INITFUNC)
+	blx	r12				@ initialise processor
+						@ (return control reg)
 	/*
 	 * Use the page tables supplied from  __cpu_up.
 	 */
@@ -328,12 +332,8 @@ ENTRY(secondary_startup)
 	ldmia	r4, {r5, r7, r12}		@ address to jump to after
 	sub	r4, r4, r5			@ mmu has been enabled
 	ldr	r4, [r7, r4]			@ get secondary_data.pgdir
-	adr	lr, BSYM(__enable_mmu)		@ return address
 	mov	r13, r12			@ __secondary_switched address
- ARM(	add	pc, r10, #PROCINFO_INITFUNC	) @ initialise processor
-						  @ (return control reg)
- THUMB(	add	r12, r10, #PROCINFO_INITFUNC	)
- THUMB(	mov	pc, r12				)
+	b	__enable_mmu
 ENDPROC(secondary_startup)
 
 	/*
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 40b386c..089e2ae 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -82,8 +82,10 @@ static inline void identity_mapping_add(pgd_t *pgd, unsigned long start,
 		pmd = pmd_offset(pgd + pgd_index(addr), addr);
 		pmd[0] = __pmd(addr | prot);
 		addr += SECTION_SIZE;
+#ifndef CONFIG_ARM_LPAE
 		pmd[1] = __pmd(addr | prot);
 		addr += SECTION_SIZE;
+#endif
 		flush_pmd_entry(pmd);
 		outer_clean_range(__pa(pmd), __pa(pmd + 1));
 	}
@@ -98,7 +100,9 @@ static inline void identity_mapping_del(pgd_t *pgd, unsigned long start,
 	for (addr = start & PMD_MASK; addr < end; addr += PMD_SIZE) {
 		pmd = pmd_offset(pgd + pgd_index(addr), addr);
 		pmd[0] = __pmd(0);
+#ifndef CONFIG_ARM_LPAE
 		pmd[1] = __pmd(0);
+#endif
 		clean_pmd_entry(pmd);
 		outer_clean_range(__pa(pmd), __pa(pmd + 1));
 	}
@@ -109,6 +113,10 @@ int __cpuinit __cpu_up(unsigned int cpu)
 	struct cpuinfo_arm *ci = &per_cpu(cpu_data, cpu);
 	struct task_struct *idle = ci->idle;
 	pgd_t *pgd;
+#ifdef CONFIG_ARM_LPAE
+	pgd_t *pgd_phys;
+	pmd_t *pmd;
+#endif
 	int ret;
 
 	/*
@@ -137,8 +145,30 @@ int __cpuinit __cpu_up(unsigned int cpu)
 	 * a 1:1 mapping for the physical address of the kernel.
 	 */
 	pgd = pgd_alloc(&init_mm);
-	if (!pgd)
-		return -ENOMEM;
+	if (!pgd) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+#ifdef CONFIG_ARM_LPAE
+	/*
+	 * Check for overlap between PHYS_OFFSET and PAGE_OFFSET and
+	 * duplicate the pmd to avoid overwriting valid kernel mappings in the
+	 * init_mm page tables. The code assumes that the kernel text and data
+	 * sections are within 1GB of PHYS_OFFSET (maximum range covered by
+	 * a PMD table with LPAE).
+	 */
+	pgd_phys = pgd + pgd_index(PHYS_OFFSET);
+	pmd = pmd_alloc_one(NULL, PHYS_OFFSET);
+	if (!pmd) {
+		ret = -ENOMEM;
+		goto nopmd;
+	}
+	if (pgd_present(*pgd_phys))
+		memcpy(pmd, pmd_offset(pgd_phys, 0),
+		       PTRS_PER_PMD * sizeof(pmd_t));
+	pgd_populate(NULL, pgd_phys, pmd);
+#endif
 
 	if (PHYS_OFFSET != PAGE_OFFSET) {
 #ifndef CONFIG_HOTPLUG_CPU
@@ -192,8 +222,13 @@ int __cpuinit __cpu_up(unsigned int cpu)
 		identity_mapping_del(pgd, __pa(_sdata), __pa(_edata));
 	}
 
+#ifdef CONFIG_ARM_LPAE
+	pmd_free(&init_mm, pmd_offset(pgd_phys, 0));
+nopmd:
+#endif
 	pgd_free(&init_mm, pgd);
 
+out:
 	if (ret) {
 		printk(KERN_CRIT "CPU%u: processor failed to boot\n", cpu);
 

* [PATCH v2 14/20] ARM: LPAE: use phys_addr_t instead of unsigned long for physical addresses
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel; +Cc: Will Deacon

From: Will Deacon <will.deacon@arm.com>

The unsigned long datatype is not sufficient for mapping physical addresses
>= 4GB.

This patch ensures that the phys_addr_t datatype is used to represent
physical addresses which may be beyond the range of an unsigned long.
The virt <-> phys macros are updated accordingly to ensure that virtual
addresses can remain as they are.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/memory.h     |   17 +++++++++--------
 arch/arm/include/asm/outercache.h |   14 ++++++++------
 arch/arm/include/asm/pgalloc.h    |    2 +-
 arch/arm/include/asm/pgtable.h    |    2 +-
 arch/arm/include/asm/setup.h      |    2 +-
 arch/arm/mm/init.c                |    6 +++---
 arch/arm/mm/mmu.c                 |   12 +++++++-----
 7 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 23c2e8e..756252b 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -15,6 +15,7 @@
 
 #include <linux/compiler.h>
 #include <linux/const.h>
+#include <linux/types.h>
 #include <mach/memory.h>
 #include <asm/sizes.h>
 
@@ -138,15 +139,15 @@
  * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
  */
 #ifndef __virt_to_phys
-#define __virt_to_phys(x)	((x) - PAGE_OFFSET + PHYS_OFFSET)
-#define __phys_to_virt(x)	((x) - PHYS_OFFSET + PAGE_OFFSET)
+#define __virt_to_phys(x)	(((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
+#define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
 #endif
 
 /*
  * Convert a physical address to a Page Frame Number and back
  */
-#define	__phys_to_pfn(paddr)	((paddr) >> PAGE_SHIFT)
-#define	__pfn_to_phys(pfn)	((pfn) << PAGE_SHIFT)
+#define	__phys_to_pfn(paddr)	((unsigned long)((paddr) >> PAGE_SHIFT))
+#define	__pfn_to_phys(pfn)	((phys_addr_t)(pfn) << PAGE_SHIFT)
 
 /*
  * Convert a page to/from a physical address
@@ -188,21 +189,21 @@
  * translation for translating DMA addresses.  Use the driver
  * DMA support - see dma-mapping.h.
  */
-static inline unsigned long virt_to_phys(void *x)
+static inline phys_addr_t virt_to_phys(void *x)
 {
 	return __virt_to_phys((unsigned long)(x));
 }
 
-static inline void *phys_to_virt(unsigned long x)
+static inline void *phys_to_virt(phys_addr_t x)
 {
-	return (void *)(__phys_to_virt((unsigned long)(x)));
+	return (void *)(__phys_to_virt(x));
 }
 
 /*
  * Drivers should NOT use these either.
  */
 #define __pa(x)			__virt_to_phys((unsigned long)(x))
-#define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
+#define __va(x)			((void *)__phys_to_virt((phys_addr_t)(x)))
 #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
 
 /*
diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
index fc19009..88ad892 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -21,6 +21,8 @@
 #ifndef __ASM_OUTERCACHE_H
 #define __ASM_OUTERCACHE_H
 
+#include <linux/types.h>
+
 struct outer_cache_fns {
 	void (*inv_range)(unsigned long, unsigned long);
 	void (*clean_range)(unsigned long, unsigned long);
@@ -37,17 +39,17 @@ struct outer_cache_fns {
 
 extern struct outer_cache_fns outer_cache;
 
-static inline void outer_inv_range(unsigned long start, unsigned long end)
+static inline void outer_inv_range(phys_addr_t start, phys_addr_t end)
 {
 	if (outer_cache.inv_range)
 		outer_cache.inv_range(start, end);
 }
-static inline void outer_clean_range(unsigned long start, unsigned long end)
+static inline void outer_clean_range(phys_addr_t start, phys_addr_t end)
 {
 	if (outer_cache.clean_range)
 		outer_cache.clean_range(start, end);
 }
-static inline void outer_flush_range(unsigned long start, unsigned long end)
+static inline void outer_flush_range(phys_addr_t start, phys_addr_t end)
 {
 	if (outer_cache.flush_range)
 		outer_cache.flush_range(start, end);
@@ -73,11 +75,11 @@ static inline void outer_disable(void)
 
 #else
 
-static inline void outer_inv_range(unsigned long start, unsigned long end)
+static inline void outer_inv_range(phys_addr_t start, phys_addr_t end)
 { }
-static inline void outer_clean_range(unsigned long start, unsigned long end)
+static inline void outer_clean_range(phys_addr_t start, phys_addr_t end)
 { }
-static inline void outer_flush_range(unsigned long start, unsigned long end)
+static inline void outer_flush_range(phys_addr_t start, phys_addr_t end)
 { }
 static inline void outer_flush_all(void) { }
 static inline void outer_inv_all(void) { }
diff --git a/arch/arm/include/asm/pgalloc.h b/arch/arm/include/asm/pgalloc.h
index 64a303d..33761b8 100644
--- a/arch/arm/include/asm/pgalloc.h
+++ b/arch/arm/include/asm/pgalloc.h
@@ -159,7 +159,7 @@ pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *ptep)
 static inline void
 pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t ptep)
 {
-	__pmd_populate(pmdp, page_to_pfn(ptep) << PAGE_SHIFT | _PAGE_USER_TABLE);
+	__pmd_populate(pmdp, page_to_phys(ptep) | _PAGE_USER_TABLE);
 }
 #define pmd_pgtable(pmd) pmd_page(pmd)
 
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 41236f0..bfbfcbd 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -130,7 +130,7 @@ extern struct page *empty_zero_page;
 #define pte_pfn(pte)		(pte_val(pte) >> PAGE_SHIFT)
 #endif
 
-#define pfn_pte(pfn,prot)	(__pte(((pfn) << PAGE_SHIFT) | pgprot_val(prot)))
+#define pfn_pte(pfn,prot)	(__pte(((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot)))
 
 #define pte_none(pte)		(!pte_val(pte))
 #define pte_clear(mm,addr,ptep)	set_pte_ext(ptep, __pte(0), 0)
diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h
index f1e5a9b..5092118 100644
--- a/arch/arm/include/asm/setup.h
+++ b/arch/arm/include/asm/setup.h
@@ -199,7 +199,7 @@ static struct tagtable __tagtable_##fn __tag = { tag, fn }
 #endif
 
 struct membank {
-	unsigned long start;
+	phys_addr_t start;
 	unsigned long size;
 	unsigned int highmem;
 };
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 5164069..14a00a1 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -344,7 +344,7 @@ void __init bootmem_init(void)
 	 */
 	arm_bootmem_free(min, max_low, max_high);
 
-	high_memory = __va((max_low << PAGE_SHIFT) - 1) + 1;
+	high_memory = __va(((phys_addr_t)max_low << PAGE_SHIFT) - 1) + 1;
 
 	/*
 	 * This doesn't seem to be used by the Linux memory manager any
@@ -392,8 +392,8 @@ free_memmap(unsigned long start_pfn, unsigned long end_pfn)
 	 * Convert to physical addresses, and
 	 * round start upwards and end downwards.
 	 */
-	pg = PAGE_ALIGN(__pa(start_pg));
-	pgend = __pa(end_pg) & PAGE_MASK;
+	pg = (unsigned long)PAGE_ALIGN(__pa(start_pg));
+	pgend = (unsigned long)__pa(end_pg) & PAGE_MASK;
 
 	/*
 	 * If there are free pages between these,
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 3784acc..b03e431 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -573,7 +573,7 @@ static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
 }
 
 static void __init alloc_init_section(pgd_t *pgd, unsigned long addr,
-				      unsigned long end, unsigned long phys,
+				      unsigned long end, phys_addr_t phys,
 				      const struct mem_type *type)
 {
 	pmd_t *pmd = pmd_offset(pgd, addr);
@@ -609,11 +609,12 @@ static void __init alloc_init_section(pgd_t *pgd, unsigned long addr,
 static void __init create_36bit_mapping(struct map_desc *md,
 					const struct mem_type *type)
 {
-	unsigned long phys, addr, length, end;
+	unsigned long addr, length, end;
+	phys_addr_t phys;
 	pgd_t *pgd;
 
 	addr = md->virtual;
-	phys = (unsigned long)__pfn_to_phys(md->pfn);
+	phys = __pfn_to_phys(md->pfn);
 	length = PAGE_ALIGN(md->length);
 
 	if (!(cpu_architecture() >= CPU_ARCH_ARMv6 || cpu_is_xsc3())) {
@@ -674,7 +675,8 @@ static void __init create_36bit_mapping(struct map_desc *md,
  */
 static void __init create_mapping(struct map_desc *md)
 {
-	unsigned long phys, addr, length, end;
+	unsigned long addr, length, end;
+	phys_addr_t phys;
 	const struct mem_type *type;
 	pgd_t *pgd;
 
@@ -705,7 +707,7 @@ static void __init create_mapping(struct map_desc *md)
 #endif
 
 	addr = md->virtual & PAGE_MASK;
-	phys = (unsigned long)__pfn_to_phys(md->pfn);
+	phys = __pfn_to_phys(md->pfn);
 	length = PAGE_ALIGN(md->length + (md->virtual & ~PAGE_MASK));
 
 	if (type->prot_l1 == 0 && ((addr | phys | length) & ~SECTION_MASK)) {


* [PATCH v2 14/20] ARM: LPAE: use phys_addr_t instead of unsigned long for physical addresses
@ 2010-11-12 18:00   ` Catalin Marinas
  0 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel

From: Will Deacon <will.deacon@arm.com>

The unsigned long datatype is not sufficient for mapping physical addresses
>= 4GB.

This patch ensures that the phys_addr_t datatype is used to represent
physical addresses which may be beyond the range of an unsigned long.
The virt <-> phys macros are updated accordingly to ensure that virtual
addresses can remain as they are.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/memory.h     |   17 +++++++++--------
 arch/arm/include/asm/outercache.h |   14 ++++++++------
 arch/arm/include/asm/pgalloc.h    |    2 +-
 arch/arm/include/asm/pgtable.h    |    2 +-
 arch/arm/include/asm/setup.h      |    2 +-
 arch/arm/mm/init.c                |    6 +++---
 arch/arm/mm/mmu.c                 |   12 +++++++-----
 7 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 23c2e8e..756252b 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -15,6 +15,7 @@
 
 #include <linux/compiler.h>
 #include <linux/const.h>
+#include <linux/types.h>
 #include <mach/memory.h>
 #include <asm/sizes.h>
 
@@ -138,15 +139,15 @@
  * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
  */
 #ifndef __virt_to_phys
-#define __virt_to_phys(x)	((x) - PAGE_OFFSET + PHYS_OFFSET)
-#define __phys_to_virt(x)	((x) - PHYS_OFFSET + PAGE_OFFSET)
+#define __virt_to_phys(x)	(((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
+#define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
 #endif
 
 /*
  * Convert a physical address to a Page Frame Number and back
  */
-#define	__phys_to_pfn(paddr)	((paddr) >> PAGE_SHIFT)
-#define	__pfn_to_phys(pfn)	((pfn) << PAGE_SHIFT)
+#define	__phys_to_pfn(paddr)	((unsigned long)((paddr) >> PAGE_SHIFT))
+#define	__pfn_to_phys(pfn)	((phys_addr_t)(pfn) << PAGE_SHIFT)
 
 /*
  * Convert a page to/from a physical address
@@ -188,21 +189,21 @@
  * translation for translating DMA addresses.  Use the driver
  * DMA support - see dma-mapping.h.
  */
-static inline unsigned long virt_to_phys(void *x)
+static inline phys_addr_t virt_to_phys(void *x)
 {
 	return __virt_to_phys((unsigned long)(x));
 }
 
-static inline void *phys_to_virt(unsigned long x)
+static inline void *phys_to_virt(phys_addr_t x)
 {
-	return (void *)(__phys_to_virt((unsigned long)(x)));
+	return (void *)(__phys_to_virt(x));
 }
 
 /*
  * Drivers should NOT use these either.
  */
 #define __pa(x)			__virt_to_phys((unsigned long)(x))
-#define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
+#define __va(x)			((void *)__phys_to_virt((phys_addr_t)(x)))
 #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
 
 /*
diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
index fc19009..88ad892 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -21,6 +21,8 @@
 #ifndef __ASM_OUTERCACHE_H
 #define __ASM_OUTERCACHE_H
 
+#include <linux/types.h>
+
 struct outer_cache_fns {
 	void (*inv_range)(unsigned long, unsigned long);
 	void (*clean_range)(unsigned long, unsigned long);
@@ -37,17 +39,17 @@ struct outer_cache_fns {
 
 extern struct outer_cache_fns outer_cache;
 
-static inline void outer_inv_range(unsigned long start, unsigned long end)
+static inline void outer_inv_range(phys_addr_t start, phys_addr_t end)
 {
 	if (outer_cache.inv_range)
 		outer_cache.inv_range(start, end);
 }
-static inline void outer_clean_range(unsigned long start, unsigned long end)
+static inline void outer_clean_range(phys_addr_t start, phys_addr_t end)
 {
 	if (outer_cache.clean_range)
 		outer_cache.clean_range(start, end);
 }
-static inline void outer_flush_range(unsigned long start, unsigned long end)
+static inline void outer_flush_range(phys_addr_t start, phys_addr_t end)
 {
 	if (outer_cache.flush_range)
 		outer_cache.flush_range(start, end);
@@ -73,11 +75,11 @@ static inline void outer_disable(void)
 
 #else
 
-static inline void outer_inv_range(unsigned long start, unsigned long end)
+static inline void outer_inv_range(phys_addr_t start, phys_addr_t end)
 { }
-static inline void outer_clean_range(unsigned long start, unsigned long end)
+static inline void outer_clean_range(phys_addr_t start, phys_addr_t end)
 { }
-static inline void outer_flush_range(unsigned long start, unsigned long end)
+static inline void outer_flush_range(phys_addr_t start, phys_addr_t end)
 { }
 static inline void outer_flush_all(void) { }
 static inline void outer_inv_all(void) { }
diff --git a/arch/arm/include/asm/pgalloc.h b/arch/arm/include/asm/pgalloc.h
index 64a303d..33761b8 100644
--- a/arch/arm/include/asm/pgalloc.h
+++ b/arch/arm/include/asm/pgalloc.h
@@ -159,7 +159,7 @@ pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *ptep)
 static inline void
 pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t ptep)
 {
-	__pmd_populate(pmdp, page_to_pfn(ptep) << PAGE_SHIFT | _PAGE_USER_TABLE);
+	__pmd_populate(pmdp, page_to_phys(ptep) | _PAGE_USER_TABLE);
 }
 #define pmd_pgtable(pmd) pmd_page(pmd)
 
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 41236f0..bfbfcbd 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -130,7 +130,7 @@ extern struct page *empty_zero_page;
 #define pte_pfn(pte)		(pte_val(pte) >> PAGE_SHIFT)
 #endif
 
-#define pfn_pte(pfn,prot)	(__pte(((pfn) << PAGE_SHIFT) | pgprot_val(prot)))
+#define pfn_pte(pfn,prot)	(__pte(((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot)))
 
 #define pte_none(pte)		(!pte_val(pte))
 #define pte_clear(mm,addr,ptep)	set_pte_ext(ptep, __pte(0), 0)
diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h
index f1e5a9b..5092118 100644
--- a/arch/arm/include/asm/setup.h
+++ b/arch/arm/include/asm/setup.h
@@ -199,7 +199,7 @@ static struct tagtable __tagtable_##fn __tag = { tag, fn }
 #endif
 
 struct membank {
-	unsigned long start;
+	phys_addr_t start;
 	unsigned long size;
 	unsigned int highmem;
 };
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 5164069..14a00a1 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -344,7 +344,7 @@ void __init bootmem_init(void)
 	 */
 	arm_bootmem_free(min, max_low, max_high);
 
-	high_memory = __va((max_low << PAGE_SHIFT) - 1) + 1;
+	high_memory = __va(((phys_addr_t)max_low << PAGE_SHIFT) - 1) + 1;
 
 	/*
 	 * This doesn't seem to be used by the Linux memory manager any
@@ -392,8 +392,8 @@ free_memmap(unsigned long start_pfn, unsigned long end_pfn)
 	 * Convert to physical addresses, and
 	 * round start upwards and end downwards.
 	 */
-	pg = PAGE_ALIGN(__pa(start_pg));
-	pgend = __pa(end_pg) & PAGE_MASK;
+	pg = (unsigned long)PAGE_ALIGN(__pa(start_pg));
+	pgend = (unsigned long)__pa(end_pg) & PAGE_MASK;
 
 	/*
 	 * If there are free pages between these,
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 3784acc..b03e431 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -573,7 +573,7 @@ static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
 }
 
 static void __init alloc_init_section(pgd_t *pgd, unsigned long addr,
-				      unsigned long end, unsigned long phys,
+				      unsigned long end, phys_addr_t phys,
 				      const struct mem_type *type)
 {
 	pmd_t *pmd = pmd_offset(pgd, addr);
@@ -609,11 +609,12 @@ static void __init alloc_init_section(pgd_t *pgd, unsigned long addr,
 static void __init create_36bit_mapping(struct map_desc *md,
 					const struct mem_type *type)
 {
-	unsigned long phys, addr, length, end;
+	unsigned long addr, length, end;
+	phys_addr_t phys;
 	pgd_t *pgd;
 
 	addr = md->virtual;
-	phys = (unsigned long)__pfn_to_phys(md->pfn);
+	phys = __pfn_to_phys(md->pfn);
 	length = PAGE_ALIGN(md->length);
 
 	if (!(cpu_architecture() >= CPU_ARCH_ARMv6 || cpu_is_xsc3())) {
@@ -674,7 +675,8 @@ static void __init create_36bit_mapping(struct map_desc *md,
  */
 static void __init create_mapping(struct map_desc *md)
 {
-	unsigned long phys, addr, length, end;
+	unsigned long addr, length, end;
+	phys_addr_t phys;
 	const struct mem_type *type;
 	pgd_t *pgd;
 
@@ -705,7 +707,7 @@ static void __init create_mapping(struct map_desc *md)
 #endif
 
 	addr = md->virtual & PAGE_MASK;
-	phys = (unsigned long)__pfn_to_phys(md->pfn);
+	phys = __pfn_to_phys(md->pfn);
 	length = PAGE_ALIGN(md->length + (md->virtual & ~PAGE_MASK));
 
 	if (type->prot_l1 == 0 && ((addr | phys | length) & ~SECTION_MASK)) {


* [PATCH v2 15/20] ARM: LPAE: Use generic dma_addr_t type definition
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel; +Cc: Will Deacon

From: Will Deacon <will.deacon@arm.com>

This patch uses the types.h implementation in asm-generic to define the
dma_addr_t type as the same width as phys_addr_t.

NOTE: this is a temporary patch until the corresponding patches unifying
dma_addr_t and removing dma64_addr_t are merged into mainline.
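As a rough userspace model of the effect (the CONFIG macro is faked here for illustration; in the kernel the selection comes from Kconfig), the point is simply that dma_addr_t inherits whatever width phys_addr_t has:

```c
#include <assert.h>
#include <stdint.h>

/* Faked for this sketch; the kernel sets it via Kconfig. */
#define CONFIG_PHYS_ADDR_T_64BIT

#ifdef CONFIG_PHYS_ADDR_T_64BIT
typedef uint64_t phys_addr_t;
#else
typedef uint32_t phys_addr_t;
#endif

/* asm-generic ties dma_addr_t to the same width as phys_addr_t. */
typedef phys_addr_t dma_addr_t;
```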

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/types.h |   20 +-------------------
 1 files changed, 1 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/types.h b/arch/arm/include/asm/types.h
index 345df01..dc1bdbb 100644
--- a/arch/arm/include/asm/types.h
+++ b/arch/arm/include/asm/types.h
@@ -1,30 +1,12 @@
 #ifndef __ASM_ARM_TYPES_H
 #define __ASM_ARM_TYPES_H
 
-#include <asm-generic/int-ll64.h>
+#include <asm-generic/types.h>
 
-#ifndef __ASSEMBLY__
-
-typedef unsigned short umode_t;
-
-#endif /* __ASSEMBLY__ */
-
-/*
- * These aren't exported outside the kernel to avoid name space clashes
- */
 #ifdef __KERNEL__
 
 #define BITS_PER_LONG 32
 
-#ifndef __ASSEMBLY__
-
-/* Dma addresses are 32-bits wide.  */
-
-typedef u32 dma_addr_t;
-typedef u32 dma64_addr_t;
-
-#endif /* __ASSEMBLY__ */
-
 #endif /* __KERNEL__ */
 
 #endif


* [PATCH v2 16/20] ARM: LPAE: mark memory banks with start > ULONG_MAX as highmem
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel; +Cc: Will Deacon

From: Will Deacon <will.deacon@arm.com>

Memory banks living outside of the 32-bit physical address
space do not have a 1:1 pa <-> va mapping and therefore the
__va macro may wrap.

This patch ensures that such banks are marked as highmem so
that the kernel does not try to split them up when it sees that
the wrapped virtual address overlaps the vmalloc space.
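The wrap can be demonstrated with 32-bit arithmetic. The PAGE_OFFSET and PHYS_OFFSET values below are typical but platform-specific assumptions, and the helper names are invented for this sketch: a bank starting at 4GB comes out as a "virtual" address below PAGE_OFFSET, which is why such banks must be flagged highmem up front.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET	0xC0000000u	/* assumed kernel direct-map base */
#define PHYS_OFFSET	0x80000000u	/* assumed RAM base; platform-specific */

/* A 32-bit __phys_to_virt: the result truncates to unsigned long. */
static uint32_t va32(uint64_t phys)
{
	return (uint32_t)(phys - PHYS_OFFSET + PAGE_OFFSET);
}

/* The new check added by this patch: banks beyond 32 bits are highmem. */
static int bank_is_highmem(uint64_t bank_start)
{
	return bank_start > 0xFFFFFFFFull;	/* bank->start > ULONG_MAX */
}
```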

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/mm/mmu.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index b03e431..787a409 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -785,7 +785,8 @@ static void __init sanity_check_meminfo(void)
 
 #ifdef CONFIG_HIGHMEM
 		if (__va(bank->start) > vmalloc_min ||
-		    __va(bank->start) < (void *)PAGE_OFFSET)
+		    __va(bank->start) < (void *)PAGE_OFFSET ||
+		    bank->start > ULONG_MAX)
 			highmem = 1;
 
 		bank->highmem = highmem;
@@ -794,7 +795,7 @@ static void __init sanity_check_meminfo(void)
 		 * Split those memory banks which are partially overlapping
 		 * the vmalloc area greatly simplifying things later.
 		 */
-		if (__va(bank->start) < vmalloc_min &&
+		if (!highmem && __va(bank->start) < vmalloc_min &&
 		    bank->size > vmalloc_min - __va(bank->start)) {
 			if (meminfo.nr_banks >= NR_BANKS) {
 				printk(KERN_CRIT "NR_BANKS too low, "



* [PATCH v2 17/20] ARM: LPAE: use phys_addr_t for physical start address in early_mem
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel; +Cc: Will Deacon

From: Will Deacon <will.deacon@arm.com>

The physical start address of memory may be > 4GB and therefore
unrepresentable using an unsigned long.

This patch changes early_mem and arm_add_memory to use phys_addr_t
instead of unsigned long for the start address.
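early_mem parses a mem=<size>[M]@<start> command-line option; a minimal userspace sketch (parse_mem_start and parse_mem_size are names invented here, not kernel APIs) shows why the start must land in a 64-bit type:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Minimal model of mem=<size>[M]@<start> parsing. */
static uint64_t parse_mem_start(const char *p)
{
	const char *at = strchr(p, '@');
	/* strtoull keeps a >4GB start intact; a 32-bit strtoul would
	 * truncate it, which is the bug this patch avoids. */
	return at ? strtoull(at + 1, NULL, 0) : 0;
}

static uint64_t parse_mem_size(const char *p)
{
	char *endp;
	uint64_t size = strtoull(p, &endp, 0);
	if (*endp == 'M')	/* accept the common M suffix */
		size <<= 20;
	return size;
}
```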

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/kernel/setup.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 3cadb46..751ac80 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -442,7 +442,7 @@ static struct machine_desc * __init setup_machine(unsigned int nr)
 	return list;
 }
 
-static int __init arm_add_memory(unsigned long start, unsigned long size)
+static int __init arm_add_memory(phys_addr_t start, unsigned long size)
 {
 	struct membank *bank = &meminfo.bank[meminfo.nr_banks];
 
@@ -478,7 +478,8 @@ static int __init arm_add_memory(unsigned long start, unsigned long size)
 static int __init early_mem(char *p)
 {
 	static int usermem __initdata = 0;
-	unsigned long size, start;
+	unsigned long size;
+	phys_addr_t start;
 	char *endp;
 
 	/*



* [PATCH v2 18/20] ARM: LPAE: add support for ATAG_MEM64
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel; +Cc: Will Deacon

From: Will Deacon <will.deacon@arm.com>

LPAE provides support for memory banks with physical addresses of up
to 40 bits.

This patch adds a new atag, ATAG_MEM64, so that the kernel can be
informed about memory that exists above the 4GB boundary.
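The tag body introduced by the hunks below is two naturally aligned 64-bit fields, 16 bytes in total. A quick standalone check (mem64_bank_size is a helper invented here to mirror the 32-bit size truncation done in parse_tag_mem64):

```c
#include <assert.h>
#include <stdint.h>

#define ATAG_MEM64	0x54420002

/* The tag body added by this patch: 64-bit size and start. */
struct tag_mem64 {
	uint64_t size;
	uint64_t start;	/* physical start address */
};

/* Mirrors parse_tag_mem64: only 32 bits of the size are used when
 * handing the bank to arm_add_memory(). */
static uint32_t mem64_bank_size(const struct tag_mem64 *t)
{
	return (uint32_t)t->size;
}
```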

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/setup.h |   10 +++++++++-
 arch/arm/kernel/compat.c     |    4 ++--
 arch/arm/kernel/setup.c      |   12 +++++++++++-
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h
index 5092118..fab849f 100644
--- a/arch/arm/include/asm/setup.h
+++ b/arch/arm/include/asm/setup.h
@@ -43,6 +43,13 @@ struct tag_mem32 {
 	__u32	start;	/* physical start address */
 };
 
+#define ATAG_MEM64	0x54420002
+
+struct tag_mem64 {
+	__u64	size;
+	__u64	start;	/* physical start address */
+};
+
 /* VGA text type displays */
 #define ATAG_VIDEOTEXT	0x54410003
 
@@ -147,7 +154,8 @@ struct tag {
 	struct tag_header hdr;
 	union {
 		struct tag_core		core;
-		struct tag_mem32	mem;
+		struct tag_mem32	mem32;
+		struct tag_mem64	mem64;
 		struct tag_videotext	videotext;
 		struct tag_ramdisk	ramdisk;
 		struct tag_initrd	initrd;
diff --git a/arch/arm/kernel/compat.c b/arch/arm/kernel/compat.c
index 9256523..f224d95 100644
--- a/arch/arm/kernel/compat.c
+++ b/arch/arm/kernel/compat.c
@@ -86,8 +86,8 @@ static struct tag * __init memtag(struct tag *tag, unsigned long start, unsigned
 	tag = tag_next(tag);
 	tag->hdr.tag = ATAG_MEM;
 	tag->hdr.size = tag_size(tag_mem32);
-	tag->u.mem.size = size;
-	tag->u.mem.start = start;
+	tag->u.mem32.size = size;
+	tag->u.mem32.start = start;
 
 	return tag;
 }
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 751ac80..0128db2 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -591,11 +591,21 @@ __tagtable(ATAG_CORE, parse_tag_core);
 
 static int __init parse_tag_mem32(const struct tag *tag)
 {
-	return arm_add_memory(tag->u.mem.start, tag->u.mem.size);
+	return arm_add_memory(tag->u.mem32.start, tag->u.mem32.size);
 }
 
 __tagtable(ATAG_MEM, parse_tag_mem32);
 
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+static int __init parse_tag_mem64(const struct tag *tag)
+{
+	/* We only use 32-bits for the size. */
+	return arm_add_memory(tag->u.mem64.start, (unsigned long)tag->u.mem64.size);
+}
+
+__tagtable(ATAG_MEM64, parse_tag_mem64);
+#endif /* CONFIG_PHYS_ADDR_T_64BIT */
+
 #if defined(CONFIG_VGA_CONSOLE) || defined(CONFIG_DUMMY_CONSOLE)
 struct screen_info screen_info = {
  .orig_video_lines	= 30,



* [PATCH v2 19/20] ARM: LPAE: define printk format for physical addresses and page table entries
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel; +Cc: Will Deacon

From: Will Deacon <will.deacon@arm.com>

Now that the kernel supports both 2-level and 3-level page tables,
physical addresses (and also page table entries) may be 32 or 64 bits
wide depending on the configuration.

This patch adds a conversion specifier (PHYS_ADDR_FMT) which represents
a u32 or u64 depending on the width of a physical address.
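A userspace rendering of the macro, assuming the 64-bit configuration (note that the "%016llx" variant expects the value as unsigned long long, which matches u64 on ARM; fmt_phys is a helper invented for this sketch):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CONFIG_PHYS_ADDR_T_64BIT	/* assumed for this example */

#ifdef CONFIG_PHYS_ADDR_T_64BIT
typedef unsigned long long phys_addr_t;
#define PHYS_ADDR_FMT	"%016llx"
#else
typedef uint32_t phys_addr_t;
#define PHYS_ADDR_FMT	"%08x"
#endif

/* Format a physical address the way the patched printk calls do. */
static const char *fmt_phys(phys_addr_t phys)
{
	static char buf[32];
	snprintf(buf, sizeof(buf), PHYS_ADDR_FMT, phys);
	return buf;
}
```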

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/types.h |    6 ++++++
 arch/arm/kernel/setup.c      |    2 +-
 arch/arm/mm/fault.c          |    8 ++++----
 arch/arm/mm/mmu.c            |   18 +++++++++---------
 4 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/arch/arm/include/asm/types.h b/arch/arm/include/asm/types.h
index dc1bdbb..b740539 100644
--- a/arch/arm/include/asm/types.h
+++ b/arch/arm/include/asm/types.h
@@ -7,6 +7,12 @@
 
 #define BITS_PER_LONG 32
 
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+#define PHYS_ADDR_FMT	"%016llx"
+#else
+#define PHYS_ADDR_FMT	"%08x"
+#endif
+
 #endif /* __KERNEL__ */
 
 #endif
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 0128db2..d143241 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -448,7 +448,7 @@ static int __init arm_add_memory(phys_addr_t start, unsigned long size)
 
 	if (meminfo.nr_banks >= NR_BANKS) {
 		printk(KERN_CRIT "NR_BANKS too low, "
-			"ignoring memory at %#lx\n", start);
+			"ignoring memory at " PHYS_ADDR_FMT "\n", start);
 		return -EINVAL;
 	}
 
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 2dde9cd..8112f77 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -81,7 +81,7 @@ void show_pte(struct mm_struct *mm, unsigned long addr)
 
 	printk(KERN_ALERT "pgd = %p\n", mm->pgd);
 	pgd = pgd_offset(mm, addr);
-	printk(KERN_ALERT "[%08lx] *pgd=%08lx", addr, pgd_val(*pgd));
+	printk(KERN_ALERT "[%08lx] *pgd=" PHYS_ADDR_FMT, addr, pgd_val(*pgd));
 
 	do {
 		pmd_t *pmd;
@@ -97,7 +97,7 @@ void show_pte(struct mm_struct *mm, unsigned long addr)
 
 		pmd = pmd_offset(pgd, addr);
 		if (PTRS_PER_PMD != 1)
-			printk(", *pmd=%08lx", pmd_val(*pmd));
+			printk(", *pmd=" PHYS_ADDR_FMT, pmd_val(*pmd));
 
 		if (pmd_none(*pmd))
 			break;
@@ -112,9 +112,9 @@ void show_pte(struct mm_struct *mm, unsigned long addr)
 			break;
 
 		pte = pte_offset_map(pmd, addr);
-		printk(", *pte=%08lx", pte_val(*pte));
+		printk(", *pte=" PHYS_ADDR_FMT, pte_val(*pte));
 #ifndef CONFIG_ARM_LPAE
-		printk(", *ppte=%08lx", pte_val(pte[-LINUX_PTE_OFFSET]));
+		printk(", *ppte=" PHYS_ADDR_FMT, pte_val(pte[-LINUX_PTE_OFFSET]));
 #endif
 		pte_unmap(pte);
 	} while(0);
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 787a409..f05f8ed 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -619,7 +619,7 @@ static void __init create_36bit_mapping(struct map_desc *md,
 
 	if (!(cpu_architecture() >= CPU_ARCH_ARMv6 || cpu_is_xsc3())) {
 		printk(KERN_ERR "MM: CPU does not support supersection "
-		       "mapping for 0x%08llx at 0x%08lx\n",
+		       "mapping for 0x" PHYS_ADDR_FMT " at 0x%08lx\n",
 		       __pfn_to_phys((u64)md->pfn), addr);
 		return;
 	}
@@ -632,14 +632,14 @@ static void __init create_36bit_mapping(struct map_desc *md,
 	 */
 	if (type->domain) {
 		printk(KERN_ERR "MM: invalid domain in supersection "
-		       "mapping for 0x%08llx at 0x%08lx\n",
+		       "mapping for 0x" PHYS_ADDR_FMT " at 0x%08lx\n",
 		       __pfn_to_phys((u64)md->pfn), addr);
 		return;
 	}
 
 	if ((addr | length | __pfn_to_phys(md->pfn)) & ~SUPERSECTION_MASK) {
-		printk(KERN_ERR "MM: cannot create mapping for "
-		       "0x%08llx at 0x%08lx invalid alignment\n",
+		printk(KERN_ERR "MM: cannot create mapping for 0x" PHYS_ADDR_FMT
+		       " at 0x%08lx invalid alignment\n",
 		       __pfn_to_phys((u64)md->pfn), addr);
 		return;
 	}
@@ -681,16 +681,16 @@ static void __init create_mapping(struct map_desc *md)
 	pgd_t *pgd;
 
 	if (md->virtual != vectors_base() && md->virtual < TASK_SIZE) {
-		printk(KERN_WARNING "BUG: not creating mapping for "
-		       "0x%08llx at 0x%08lx in user region\n",
+		printk(KERN_WARNING "BUG: not creating mapping for 0x"
+		       PHYS_ADDR_FMT " at 0x%08lx in user region\n",
 		       __pfn_to_phys((u64)md->pfn), md->virtual);
 		return;
 	}
 
 	if ((md->type == MT_DEVICE || md->type == MT_ROM) &&
 	    md->virtual >= PAGE_OFFSET && md->virtual < VMALLOC_END) {
-		printk(KERN_WARNING "BUG: mapping for 0x%08llx at 0x%08lx "
-		       "overlaps vmalloc space\n",
+		printk(KERN_WARNING "BUG: mapping for 0x" PHYS_ADDR_FMT
+		       " at 0x%08lx overlaps vmalloc space\n",
 		       __pfn_to_phys((u64)md->pfn), md->virtual);
 	}
 
@@ -711,7 +711,7 @@ static void __init create_mapping(struct map_desc *md)
 	length = PAGE_ALIGN(md->length + (md->virtual & ~PAGE_MASK));
 
 	if (type->prot_l1 == 0 && ((addr | phys | length) & ~SECTION_MASK)) {
-		printk(KERN_WARNING "BUG: map for 0x%08lx at 0x%08lx can not "
+		printk(KERN_WARNING "BUG: map for 0x" PHYS_ADDR_FMT " at 0x%08lx can not "
 		       "be mapped using pages, ignoring.\n",
 		       __pfn_to_phys(md->pfn), addr);
 		return;


 	}
 
@@ -711,7 +711,7 @@ static void __init create_mapping(struct map_desc *md)
 	length = PAGE_ALIGN(md->length + (md->virtual & ~PAGE_MASK));
 
 	if (type->prot_l1 == 0 && ((addr | phys | length) & ~SECTION_MASK)) {
-		printk(KERN_WARNING "BUG: map for 0x%08lx at 0x%08lx can not "
+		printk(KERN_WARNING "BUG: map for 0x" PHYS_ADDR_FMT " at 0x%08lx can not "
 		       "be mapped using pages, ignoring.\n",
 		       __pfn_to_phys(md->pfn), addr);
 		return;

^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [PATCH v2 20/20] ARM: LPAE: Add the Kconfig entries
  2010-11-12 18:00 ` Catalin Marinas
@ 2010-11-12 18:00   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-12 18:00 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

This patch adds the ARM_LPAE and ARCH_PHYS_ADDR_T_64BIT Kconfig entries
allowing LPAE support to be compiled into the kernel.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/Kconfig    |    2 +-
 arch/arm/mm/Kconfig |   13 +++++++++++++
 2 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index f35fe82..e376b7b 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1599,7 +1599,7 @@ config CMDLINE_FORCE
 
 config XIP_KERNEL
 	bool "Kernel Execute-In-Place from ROM"
-	depends on !ZBOOT_ROM
+	depends on !ZBOOT_ROM && !ARM_LPAE
 	help
 	  Execute-In-Place allows the kernel to run from non-volatile storage
 	  directly addressable by the CPU, such as NOR flash. This saves RAM
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 8493ed0..3ca2d15 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -615,6 +615,19 @@ config IO_36
 
 comment "Processor Features"
 
+config ARM_LPAE
+	bool "Support for the Large Physical Address Extension"
+	depends on MMU && CPU_V7
+	help
+	  Say Y if you have an ARMv7 processor supporting the LPAE page table
+	  format and you would like access memory beyond the 4GB limit.
+
+config ARCH_PHYS_ADDR_T_64BIT
+	def_bool ARM_LPAE
+
+config ARCH_DMA_ADDR_T_64BIT
+	def_bool ARM_LPAE
+
 config ARM_THUMB
 	bool "Support Thumb user binaries"
 	depends on CPU_ARM720T || CPU_ARM740T || CPU_ARM920T || CPU_ARM922T || CPU_ARM925T || CPU_ARM926T || CPU_ARM940T || CPU_ARM946E || CPU_ARM1020 || CPU_ARM1020E || CPU_ARM1022 || CPU_ARM1026 || CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_V6 || CPU_V7 || CPU_FEROCEON

^ permalink raw reply related	[flat|nested] 154+ messages in thread


* Re: [PATCH v2 20/20] ARM: LPAE: Add the Kconfig entries
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-13 12:38     ` Sergei Shtylyov
  -1 siblings, 0 replies; 154+ messages in thread
From: Sergei Shtylyov @ 2010-11-13 12:38 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

Hello.

On 12-11-2010 21:00, Catalin Marinas wrote:

> This patch adds the ARM_LPAE and ARCH_PHYS_ADDR_T_64BIT Kconfig entries
> allowing LPAE support to be compiled into the kernel.

> Signed-off-by: Catalin Marinas<catalin.marinas@arm.com>
[...]

> diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
> index 8493ed0..3ca2d15 100644
> --- a/arch/arm/mm/Kconfig
> +++ b/arch/arm/mm/Kconfig
> @@ -615,6 +615,19 @@ config IO_36
>
>   comment "Processor Features"
>
> +config ARM_LPAE
> +	bool "Support for the Large Physical Address Extension"
> +	depends on MMU&&  CPU_V7
> +	help
> +	  Say Y if you have an ARMv7 processor supporting the LPAE page table
> +	  format and you would like access memory beyond the 4GB limit.

    Maybe "like to access"?

WBR, Sergei

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 20/20] ARM: LPAE: Add the Kconfig entries
  2010-11-13 12:38     ` Sergei Shtylyov
@ 2010-11-14 10:11       ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-14 10:11 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: linux-arm-kernel, linux-kernel

On 13 November 2010 12:38, Sergei Shtylyov <sshtylyov@mvista.com> wrote:
> On 12-11-2010 21:00, Catalin Marinas wrote:
>> +config ARM_LPAE
>> +       bool "Support for the Large Physical Address Extension"
>> +       depends on MMU&&  CPU_V7
>> +       help
>> +         Say Y if you have an ARMv7 processor supporting the LPAE page
>> table
>> +         format and you would like access memory beyond the 4GB limit.
>
>   Maybe "like to access"?

Yes, thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 08/20] ARM: LPAE: MMU setup for the 3-level page table format
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-14 10:13     ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-14 10:13 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel

On 12 November 2010 18:00, Catalin Marinas <catalin.marinas@arm.com> wrote:
> This patch adds the MMU initialisation for the LPAE page table format.
> The swapper_pg_dir size with LPAE is 5 rather than 4 pages. The
> __v7_setup function configures the TTBRx split based on the PAGE_OFFSET
> and sets the corresponding TTB control and MAIRx bits (similar to
> PRRR/NMRR for TEX remapping). The 36-bit mappings (supersections) and
> a few other memory types in mmu.c are conditionally compiled.
[...]
> --- a/arch/arm/kernel/head.S
> +++ b/arch/arm/kernel/head.S
> @@ -45,11 +46,20 @@
>  #error KERNEL_RAM_VADDR must start at 0xXXXX8000
>  #endif
>
> +#ifdef CONFIG_ARM_LPAE
> +       /* LPAE requires an additional page for the PGD */
> +#define PG_DIR_SIZE    0x5000
> +#define PTE_WORDS      3
> +#else
> +#define PG_DIR_SIZE    0x4000
> +#define PTE_WORDS      2
> +#endif

This should have been called PTE_ORDER; the PTE_WORDS naming is misleading.
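The sizes quoted in the hunk can be sanity-checked with a small illustrative helper (not kernel code): with the classic 2-level format the first-level table is 4096 four-byte entries, i.e. 16KB (0x4000); with LPAE, four 4KB tables cover the address space at the next level down and one extra page holds the tiny 4-entry top-level PGD, giving 0x5000.

```c
#define PAGE_SIZE 0x1000u

/*
 * Hypothetical helper: bytes reserved for the initial page tables.
 * Classic: one 16KB first-level table (4 pages).
 * LPAE: 4 pages of tables plus 1 page for the 4-entry PGD.
 */
static unsigned int pg_dir_size(int lpae)
{
	return lpae ? 5 * PAGE_SIZE : 4 * PAGE_SIZE;
}
```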

-- 
Catalin

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-14 13:19     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-14 13:19 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel, Will Deacon

On Fri, Nov 12, 2010 at 06:00:23PM +0000, Catalin Marinas wrote:
> From: Will Deacon <will.deacon@arm.com>
> 
> When using 2-level paging, pte_t and pmd_t are typedefs for
> unsigned long but phys_addr_t is a typedef for u32.
> 
> This patch uses u32 for the page table entry types when
> phys_addr_t is not 64-bit, allowing the same conversion
> specifier to be used for physical addresses and page table
> entries regardless of LPAE.

However, code which prints the value of page table entries assumes that
they are unsigned long, and places where we store the raw pte value also
uses 'unsigned long'.

If we're going to make this change, we need to change more places than
this patch covers.  grep for pte_val to help find those places.

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-14 13:19     ` Russell King - ARM Linux
@ 2010-11-14 14:09       ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-14 14:09 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Catalin Marinas, linux-arm-kernel, linux-kernel, Will Deacon

On Sunday, November 14, 2010, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Nov 12, 2010 at 06:00:23PM +0000, Catalin Marinas wrote:
>> From: Will Deacon <will.deacon@arm.com>
>>
>> When using 2-level paging, pte_t and pmd_t are typedefs for
>> unsigned long but phys_addr_t is a typedef for u32.
>>
>> This patch uses u32 for the page table entry types when
>> phys_addr_t is not 64-bit, allowing the same conversion
>> specifier to be used for physical addresses and page table
>> entries regardless of LPAE.
>
> However, code which prints the value of page table entries assumes that
> they are unsigned long, and places where we store the raw pte value also
> uses 'unsigned long'.
>
> If we're going to make this change, we need to change more places than
> this patch covers.  grep for pte_val to help find those places.

Patch 19/20 introduces a common macro for formatting but we should
probably order the patches a bit to avoid problems if anyone is
bisecting in the middle of the series.


-- 
Catalin

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-14 14:09       ` Catalin Marinas
@ 2010-11-14 14:13         ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-14 14:13 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Russell King - ARM Linux, linux-arm-kernel, linux-kernel, Will Deacon

On Sunday, November 14, 2010, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Sunday, November 14, 2010, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> On Fri, Nov 12, 2010 at 06:00:23PM +0000, Catalin Marinas wrote:
>>> From: Will Deacon <will.deacon@arm.com>
>>>
>>> When using 2-level paging, pte_t and pmd_t are typedefs for
>>> unsigned long but phys_addr_t is a typedef for u32.
>>>
>>> This patch uses u32 for the page table entry types when
>>> phys_addr_t is not 64-bit, allowing the same conversion
>>> specifier to be used for physical addresses and page table
>>> entries regardless of LPAE.
>>
>> However, code which prints the value of page table entries assumes that
>> they are unsigned long, and places where we store the raw pte value also
>> uses 'unsigned long'.
>>
>> If we're going to make this change, we need to change more places than
>> this patch covers.  grep for pte_val to help find those places.
>
> Patch 19/20 introduces a common macro for formatting but we should
> probably order the patches a bit to avoid problems if anyone is
> bisecting  in the middle of the series.

Actually not a problem since LPAE is only enabled by the last patch.
There may be some compiler warnings without 19/20, I need to check.

-- 
Catalin

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-14 14:13         ` Catalin Marinas
@ 2010-11-14 15:14           ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-14 15:14 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel, Will Deacon

On Sun, Nov 14, 2010 at 02:13:23PM +0000, Catalin Marinas wrote:
> On Sunday, November 14, 2010, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Sunday, November 14, 2010, Russell King - ARM Linux
> > <linux@arm.linux.org.uk> wrote:
> >> On Fri, Nov 12, 2010 at 06:00:23PM +0000, Catalin Marinas wrote:
> >>> From: Will Deacon <will.deacon@arm.com>
> >>>
> >>> When using 2-level paging, pte_t and pmd_t are typedefs for
> >>> unsigned long but phys_addr_t is a typedef for u32.
> >>>
> >>> This patch uses u32 for the page table entry types when
> >>> phys_addr_t is not 64-bit, allowing the same conversion
> >>> specifier to be used for physical addresses and page table
> >>> entries regardless of LPAE.
> >>
> >> However, code which prints the value of page table entries assumes that
> >> they are unsigned long, and places where we store the raw pte value also
> >> uses 'unsigned long'.
> >>
> >> If we're going to make this change, we need to change more places than
> >> this patch covers.  grep for pte_val to help find those places.
> >
> > Patch 19/20 introduces a common macro for formatting but we should
> > probably order the patches a bit to avoid problems if anyone is
> > bisecting  in the middle of the series.
> 
> Actually not a problem since LPAE is only enabled by the last patch.
> There may be some compiler warnings without 19/20, I need to check.

There will be compiler warnings because u32 is unsigned int, and we
print it as %08lx.  Generic code casts pte values to (long long) and
prints them using %08llx.  We should do the same.

In any case, this patch on its own introduces new compiler warnings.
These need to be fixed in this patch, rather than relying on one later
in the series.
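The convention described here can be sketched in userspace as follows: widen the raw pte value to (long long) at the call site and always print with %08llx, so the same format string works whether the pte is a 32-bit u32 or a 64-bit LPAE entry. The `show_pte_val()` helper is illustrative, not a kernel function.

```c
#include <stdio.h>
#include <stdint.h>

/* 32-bit pte value, as with 2-level paging. */
typedef uint32_t pteval32_t;

int show_pte_val(char *buf, size_t len, pteval32_t pte)
{
	/* Widening cast: %08llx then matches both 32- and 64-bit ptes. */
	return snprintf(buf, len, "*pte=%08llx", (unsigned long long)pte);
}
```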

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-14 15:14           ` Russell King - ARM Linux
@ 2010-11-15  9:39             ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-15  9:39 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel, Will Deacon

On 14 November 2010 15:14, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sun, Nov 14, 2010 at 02:13:23PM +0000, Catalin Marinas wrote:
>> On Sunday, November 14, 2010, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> > On Sunday, November 14, 2010, Russell King - ARM Linux
>> > <linux@arm.linux.org.uk> wrote:
>> >> On Fri, Nov 12, 2010 at 06:00:23PM +0000, Catalin Marinas wrote:
>> >>> From: Will Deacon <will.deacon@arm.com>
>> >>>
>> >>> When using 2-level paging, pte_t and pmd_t are typedefs for
>> >>> unsigned long but phys_addr_t is a typedef for u32.
>> >>>
>> >>> This patch uses u32 for the page table entry types when
>> >>> phys_addr_t is not 64-bit, allowing the same conversion
>> >>> specifier to be used for physical addresses and page table
>> >>> entries regardless of LPAE.
>> >>
>> >> However, code which prints the value of page table entries assumes that
>> >> they are unsigned long, and places where we store the raw pte value also
>> >> uses 'unsigned long'.
>> >>
>> >> If we're going to make this change, we need to change more places than
>> >> this patch covers.  grep for pte_val to help find those places.
>> >
>> > Patch 19/20 introduces a common macro for formatting but we should
>> > probably order the patches a bit to avoid problems if anyone is
>> > bisecting  in the middle of the series.
>>
>> Actually not a problem since LPAE is only enabled by the last patch.
>> There may be some compiler warnings without 19/20, I need to check.
>
> There will be compiler warnings because u32 is unsigned int, and we
> print it as %08lx.  Generic code cases pte values to (long long) and
> prints them using %08llx.  We should do the same.

We still need some kind of macro because with LPAE we need %016llx
since the phys address can go to 40-bit and there are some additional
bits in the top word. Unless you'd like to always print 16 characters
even for 32-bit ptes (or if there is some other printk magic I'm not
aware of).

> In any case, this patch on its own introduces new compiler warnings.
> These need to be fixed in this patch, rather than relying on one later
> in the series.

Yes, we'll look into this.

-- 
Catalin

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-15  9:39             ` Catalin Marinas
@ 2010-11-15  9:47               ` Arnd Bergmann
  -1 siblings, 0 replies; 154+ messages in thread
From: Arnd Bergmann @ 2010-11-15  9:47 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Russell King - ARM Linux, linux-arm-kernel, linux-kernel, Will Deacon

On Monday 15 November 2010 10:39:30 Catalin Marinas wrote:
> > There will be compiler warnings because u32 is unsigned int, and we
> > print it as %08lx.  Generic code cases pte values to (long long) and
> > prints them using %08llx.  We should do the same.
> 
> We still need some kind of macro because with LPAE we need %016llx
> since the phys address can go to 40-bit and there are some additional
> bits in the top word. Unless you'd like to always print 16 characters
> even for 32-bit ptes (or if there is some other printk magic I'm not
> aware of).

Why not just %010llx? That would just be two extra characters.

	Arnd

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-15  9:47               ` Arnd Bergmann
@ 2010-11-15  9:51                 ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-15  9:51 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Russell King - ARM Linux, linux-arm-kernel, linux-kernel, Will Deacon

On 15 November 2010 09:47, Arnd Bergmann <arnd@arndb.de> wrote:
> On Monday 15 November 2010 10:39:30 Catalin Marinas wrote:
>> > There will be compiler warnings because u32 is unsigned int, and we
>> > print it as %08lx.  Generic code casts pte values to (long long) and
>> > prints them using %08llx.  We should do the same.
>>
>> We still need some kind of macro because with LPAE we need %016llx
>> since the phys address can go to 40-bit and there are some additional
>> bits in the top word. Unless you'd like to always print 16 characters
>> even for 32-bit ptes (or if there is some other printk magic I'm not
>> aware of).
>
> Why not just %010llx? That would just be two extra characters.

We still have attributes (like XN, bit 54) stored in the top part of
the pte. This may be of interest when debugging.

-- 
Catalin

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-15  9:39             ` Catalin Marinas
@ 2010-11-15 17:36               ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-15 17:36 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel, Will Deacon

On Mon, Nov 15, 2010 at 09:39:30AM +0000, Catalin Marinas wrote:
> On 14 November 2010 15:14, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > There will be compiler warnings because u32 is unsigned int, and we
> > print it as %08lx.  Generic code casts pte values to (long long) and
> > prints them using %08llx.  We should do the same.
> 
> We still need some kind of macro because with LPAE we need %016llx
> since the phys address can go to 40-bit and there are some additional
> bits in the top word. Unless you'd like to always print 16 characters
> even for 32-bit ptes (or if there is some other printk magic I'm not
> aware of).

Eeh?  %08llx prints 8 characters _minimum_.  If it needs more to represent
the number, it will use more characters.  You surely don't think generic
code is brain dead enough to cast something to a 64-bit long long and
then only print 32 bits of it???

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-15 17:36               ` Russell King - ARM Linux
@ 2010-11-15 17:39                 ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-15 17:39 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel, Will Deacon

On Mon, 2010-11-15 at 17:36 +0000, Russell King - ARM Linux wrote:
> On Mon, Nov 15, 2010 at 09:39:30AM +0000, Catalin Marinas wrote:
> > On 14 November 2010 15:14, Russell King - ARM Linux
> > <linux@arm.linux.org.uk> wrote:
> > > There will be compiler warnings because u32 is unsigned int, and we
> > > print it as %08lx.  Generic code casts pte values to (long long) and
> > > prints them using %08llx.  We should do the same.
> >
> > We still need some kind of macro because with LPAE we need %016llx
> > since the phys address can go to 40-bit and there are some additional
> > bits in the top word. Unless you'd like to always print 16 characters
> > even for 32-bit ptes (or if there is some other printk magic I'm not
> > aware of).
> 
> Eeh?  %08llx prints 8 characters _minimum_.  If it needs more to represent
> the number, it will use more characters.  You surely don't think generic
> code is brain dead enough to cast something to a 64-bit long long and
> then only print 32 bits of it???

That's correct. I was just wondering whether the alignment would look
weird with ptes being printed with different lengths.

Anyway, here comes another set of patches with this update (%08llx in
printk).

-- 
Catalin


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 04/20] ARM: LPAE: Do not assume Linux PTEs are always at PTRS_PER_PTE offset
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-15 17:42     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-15 17:42 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Fri, Nov 12, 2010 at 06:00:24PM +0000, Catalin Marinas wrote:
> Placing the Linux PTEs at a 2KB offset inside a page is a workaround for
> the 2-level page table format where not enough spare bits are available.
> With LPAE this is no longer required. This patch changes such assumption
> by using a different macro, LINUX_PTE_OFFSET, which is defined to
> PTRS_PER_PTE for the 2-level page tables.

Hmm.  I think we should be doing this a different way - in fact, I think
we should switch the order of the linux vs hardware page tables.  This
actually simplifies the code a bit too - notice that we lose the arith.
in __pte_map, __pte_unmap, pmd_page_vaddr, which is all page table
walking stuff.

 arch/arm/include/asm/pgalloc.h |   34 +++++++++++++++-------------------
 arch/arm/include/asm/pgtable.h |   30 +++++++++++++++---------------
 arch/arm/mm/fault.c            |    2 +-
 arch/arm/mm/mmu.c              |    2 +-
 arch/arm/mm/proc-macros.S      |   10 +++++-----
 arch/arm/mm/proc-v7.S          |    8 +++-----
 6 files changed, 40 insertions(+), 46 deletions(-)

diff --git a/arch/arm/include/asm/pgalloc.h b/arch/arm/include/asm/pgalloc.h
index b12cc98..e2a6613 100644
--- a/arch/arm/include/asm/pgalloc.h
+++ b/arch/arm/include/asm/pgalloc.h
@@ -38,6 +38,11 @@ extern void free_pgd_slow(struct mm_struct *mm, pgd_t *pgd);
 
 #define PGALLOC_GFP	(GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO)
 
+static inline void clean_pte_table(void *ptr)
+{
+	clean_dcache_area(ptr + PTE_HWTABLE_OFF, PTE_HWTABLE_SIZE);
+}
+
 /*
  * Allocate one PTE table.
  *
@@ -60,10 +65,8 @@ pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr)
 	pte_t *pte;
 
 	pte = (pte_t *)__get_free_page(PGALLOC_GFP);
-	if (pte) {
-		clean_dcache_area(pte, sizeof(pte_t) * PTRS_PER_PTE);
-		pte += PTRS_PER_PTE;
-	}
+	if (pte)
+		clean_pte_table(pte);
 
 	return pte;
 }
@@ -79,10 +82,8 @@ pte_alloc_one(struct mm_struct *mm, unsigned long addr)
 	pte = alloc_pages(PGALLOC_GFP, 0);
 #endif
 	if (pte) {
-		if (!PageHighMem(pte)) {
-			void *page = page_address(pte);
-			clean_dcache_area(page, sizeof(pte_t) * PTRS_PER_PTE);
-		}
+		if (!PageHighMem(pte))
+			clean_pte_table(page_address(pte));
 		pgtable_page_ctor(pte);
 	}
 
@@ -94,10 +95,8 @@ pte_alloc_one(struct mm_struct *mm, unsigned long addr)
  */
 static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
 {
-	if (pte) {
-		pte -= PTRS_PER_PTE;
+	if (pte)
 		free_page((unsigned long)pte);
-	}
 }
 
 static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
@@ -106,8 +105,9 @@ static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
 	__free_page(pte);
 }
 
-static inline void __pmd_populate(pmd_t *pmdp, unsigned long pmdval)
+static inline void __pmd_populate(pmd_t *pmdp, unsigned long pte, unsigned long prot)
 {
+	unsigned long pmdval = (pte + PTE_HWTABLE_OFF) | prot;
 	pmdp[0] = __pmd(pmdval);
 	pmdp[1] = __pmd(pmdval + 256 * sizeof(pte_t));
 	flush_pmd_entry(pmdp);
@@ -122,20 +122,16 @@ static inline void __pmd_populate(pmd_t *pmdp, unsigned long pmdval)
 static inline void
 pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *ptep)
 {
-	unsigned long pte_ptr = (unsigned long)ptep;
-
 	/*
-	 * The pmd must be loaded with the physical
-	 * address of the PTE table
+	 * The pmd must be loaded with the physical address of the PTE table
 	 */
-	pte_ptr -= PTRS_PER_PTE * sizeof(void *);
-	__pmd_populate(pmdp, __pa(pte_ptr) | _PAGE_KERNEL_TABLE);
+	__pmd_populate(pmdp, __pa(ptep), _PAGE_KERNEL_TABLE);
 }
 
 static inline void
 pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t ptep)
 {
-	__pmd_populate(pmdp, page_to_pfn(ptep) << PAGE_SHIFT | _PAGE_USER_TABLE);
+	__pmd_populate(pmdp, page_to_pfn(ptep) << PAGE_SHIFT, _PAGE_USER_TABLE);
 }
 #define pmd_pgtable(pmd) pmd_page(pmd)
 
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index b155414..d9f1bfa 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -54,7 +54,7 @@
  * Therefore, we tweak the implementation slightly - we tell Linux that we
  * have 2048 entries in the first level, each of which is 8 bytes (iow, two
  * hardware pointers to the second level.)  The second level contains two
- * hardware PTE tables arranged contiguously, followed by Linux versions
+ * hardware PTE tables arranged contiguously, preceded by Linux versions
  * which contain the state information Linux needs.  We, therefore, end up
  * with 512 entries in the "PTE" level.
  *
@@ -62,15 +62,15 @@
  *
  *    pgd             pte
  * |        |
- * +--------+ +0
- * |        |-----> +------------+ +0
+ * +--------+
+ * |        |       +------------+ +0
+ * +- - - - +       | Linux pt 0 |
+ * |        |       +------------+ +1024
+ * +--------+ +0    | Linux pt 1 |
+ * |        |-----> +------------+ +2048
  * +- - - - + +4    |  h/w pt 0  |
- * |        |-----> +------------+ +1024
+ * |        |-----> +------------+ +3072
  * +--------+ +8    |  h/w pt 1  |
- * |        |       +------------+ +2048
- * +- - - - +       | Linux pt 0 |
- * |        |       +------------+ +3072
- * +--------+       | Linux pt 1 |
  * |        |       +------------+ +4096
  *
  * See L_PTE_xxx below for definitions of bits in the "Linux pt", and
@@ -102,6 +102,10 @@
 #define PTRS_PER_PMD		1
 #define PTRS_PER_PGD		2048
 
+#define PTE_HWTABLE_PTRS	(PTRS_PER_PTE)
+#define PTE_HWTABLE_OFF		(PTE_HWTABLE_PTRS * sizeof(pte_t))
+#define PTE_HWTABLE_SIZE	(PTRS_PER_PTE * sizeof(u32))
+
 /*
  * PMD_SHIFT determines the size of the area a second-level page table can map
  * PGDIR_SHIFT determines what a third-level page table entry can map
@@ -270,8 +274,8 @@ extern struct page *empty_zero_page;
 #define __pte_map(dir)		pmd_page_vaddr(*(dir))
 #define __pte_unmap(pte)	do { } while (0)
 #else
-#define __pte_map(dir)		((pte_t *)kmap_atomic(pmd_page(*(dir))) + PTRS_PER_PTE)
-#define __pte_unmap(pte)	kunmap_atomic((pte - PTRS_PER_PTE))
+#define __pte_map(dir)		(pte_t *)kmap_atomic(pmd_page(*(dir)))
+#define __pte_unmap(pte)	kunmap_atomic(pte)
 #endif
 
 #define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,pte,ext)
@@ -364,11 +368,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 
 static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 {
-	unsigned long ptr;
-
-	ptr = pmd_val(pmd) & ~(PTRS_PER_PTE * sizeof(void *) - 1);
-	ptr += PTRS_PER_PTE * sizeof(void *);
-
+	unsigned long ptr = pmd_val(pmd) & PAGE_MASK;
 	return __va(ptr);
 }
 
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 1e21e12..f10f9ba 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -108,7 +108,7 @@ void show_pte(struct mm_struct *mm, unsigned long addr)
 
 		pte = pte_offset_map(pmd, addr);
 		printk(", *pte=%08lx", pte_val(*pte));
-		printk(", *ppte=%08lx", pte_val(pte[-PTRS_PER_PTE]));
+		printk(", *ppte=%08lx", pte_val(pte[PTE_HWTABLE_PTRS]));
 		pte_unmap(pte);
 	} while(0);
 
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 72ad3e1..9963189 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -535,7 +535,7 @@ static pte_t * __init early_pte_alloc(pmd_t *pmd, unsigned long addr, unsigned l
 {
 	if (pmd_none(*pmd)) {
 		pte_t *pte = early_alloc(2 * PTRS_PER_PTE * sizeof(pte_t));
-		__pmd_populate(pmd, __pa(pte) | prot);
+		__pmd_populate(pmd, __pa(pte), prot);
 	}
 	BUG_ON(pmd_bad(*pmd));
 	return pte_offset_kernel(pmd, addr);
diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index 7d63bea..cbedf9c 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -121,7 +121,7 @@
 	.endm
 
 	.macro	armv6_set_pte_ext pfx
-	str	r1, [r0], #-2048		@ linux version
+	str	r1, [r0], #2048			@ linux version
 
 	bic	r3, r1, #0x000003fc
 	bic	r3, r3, #PTE_TYPE_MASK
@@ -170,7 +170,7 @@
  *  1111  0xff	r/w	r/w
  */
 	.macro	armv3_set_pte_ext wc_disable=1
-	str	r1, [r0], #-2048		@ linux version
+	str	r1, [r0], #2048			@ linux version
 
 	eor	r3, r1, #L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_WRITE | L_PTE_DIRTY
 
@@ -193,7 +193,7 @@
 	bicne	r2, r2, #PTE_BUFFERABLE
 #endif
 	.endif
-	str	r2, [r0]			@ hardware version
+	str	r2, [r0]		@ hardware version
 	.endm
 
 
@@ -213,7 +213,7 @@
  *  1111  11	r/w	r/w
  */
 	.macro	xscale_set_pte_ext_prologue
-	str	r1, [r0], #-2048		@ linux version
+	str	r1, [r0]			@ linux version
 
 	eor	r3, r1, #L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_WRITE | L_PTE_DIRTY
 
@@ -232,7 +232,7 @@
 	tst	r3, #L_PTE_PRESENT | L_PTE_YOUNG	@ present and young?
 	movne	r2, #0				@ no -> fault
 
-	str	r2, [r0]			@ hardware version
+	str	r2, [r0, #2048]!		@ hardware version
 	mov	ip, #0
 	mcr	p15, 0, r0, c7, c10, 1		@ clean L1 D line
 	mcr	p15, 0, ip, c7, c10, 4		@ data write barrier
diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index 53cbe22..89c31a6 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -124,15 +124,13 @@ ENDPROC(cpu_v7_switch_mm)
  *	Set a level 2 translation table entry.
  *
  *	- ptep  - pointer to level 2 translation table entry
- *		  (hardware version is stored at -1024 bytes)
+ *		  (hardware version is stored at +2048 bytes)
  *	- pte   - PTE value to store
  *	- ext	- value for extended PTE bits
  */
 ENTRY(cpu_v7_set_pte_ext)
 #ifdef CONFIG_MMU
- ARM(	str	r1, [r0], #-2048	)	@ linux version
- THUMB(	str	r1, [r0]		)	@ linux version
- THUMB(	sub	r0, r0, #2048		)
+	str	r1, [r0]			@ linux version
 
 	bic	r3, r1, #0x000003f0
 	bic	r3, r3, #PTE_TYPE_MASK
@@ -158,7 +156,7 @@ ENTRY(cpu_v7_set_pte_ext)
 	tstne	r1, #L_PTE_PRESENT
 	moveq	r3, #0
 
-	str	r3, [r0]
+	str	r3, [r0, #2048]!
 	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
 #endif
 	mov	pc, lr


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-15 18:30     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-15 18:30 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Fri, Nov 12, 2010 at 06:00:25PM +0000, Catalin Marinas wrote:
> The LPAE page table format needs to explicitly disable execution or
> write permissions on a page by setting the corresponding bits (similar
> to the classic page table format with Access Flag enabled). This patch
> introduces null definitions for the 2-level format and the actual noexec
> and nowrite bits for the LPAE format. It also changes several PTE
> maintenance macros and masks.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm/include/asm/pgtable-2level.h |    2 +
>  arch/arm/include/asm/pgtable.h        |   44 +++++++++++++++++++++------------
>  arch/arm/mm/mmu.c                     |    6 ++--
>  3 files changed, 33 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
> index 36bdef7..4e21166 100644
> --- a/arch/arm/include/asm/pgtable-2level.h
> +++ b/arch/arm/include/asm/pgtable-2level.h
> @@ -128,6 +128,8 @@
>  #define L_PTE_USER		(1 << 8)
>  #define L_PTE_EXEC		(1 << 9)
>  #define L_PTE_SHARED		(1 << 10)	/* shared(v6), coherent(xsc3) */
> +#define L_PTE_NOEXEC		(0)
> +#define L_PTE_NOWRITE		(0)

Let's not make this more complicated than it has to be.  If we need the
inverse of WRITE and EXEC, then that's what we should change everyone to,
not invent a new system to work along side the old system.

We're already inverting the write bit for the vast majority of processors,
and exec has always been inverted by the ARMv6 and v7 code.

^ permalink raw reply	[flat|nested] 154+ messages in thread

* [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
@ 2010-11-15 18:30     ` Russell King - ARM Linux
  0 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-15 18:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 12, 2010 at 06:00:25PM +0000, Catalin Marinas wrote:
> The LPAE page table format needs to explicitly disable execution or
> write permissions on a page by setting the corresponding bits (similar
> to the classic page table format with Access Flag enabled). This patch
> introduces null definitions for the 2-level format and the actual noexec
> and nowrite bits for the LPAE format. It also changes several PTE
> maintenance macros and masks.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm/include/asm/pgtable-2level.h |    2 +
>  arch/arm/include/asm/pgtable.h        |   44 +++++++++++++++++++++------------
>  arch/arm/mm/mmu.c                     |    6 ++--
>  3 files changed, 33 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
> index 36bdef7..4e21166 100644
> --- a/arch/arm/include/asm/pgtable-2level.h
> +++ b/arch/arm/include/asm/pgtable-2level.h
> @@ -128,6 +128,8 @@
>  #define L_PTE_USER		(1 << 8)
>  #define L_PTE_EXEC		(1 << 9)
>  #define L_PTE_SHARED		(1 << 10)	/* shared(v6), coherent(xsc3) */
> +#define L_PTE_NOEXEC		(0)
> +#define L_PTE_NOWRITE		(0)

Let's not make this more complicated than it has to be.  If we need the
inverse of WRITE and EXEC, then that's what we should change everyone to,
not invent a new system to work alongside the old system.

We're already inverting the write bit for the vast majority of processors,
and exec has always been inverted by the ARMv6 and v7 code.

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 06/20] ARM: LPAE: Introduce the 3-level page table format definitions
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-15 18:34     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-15 18:34 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Fri, Nov 12, 2010 at 06:00:26PM +0000, Catalin Marinas wrote:
> This patch introduces the pgtable-3level*.h files with definitions
> specific to the LPAE page table format (3 levels of page tables).
> 
> Each table is 4KB and has 512 64-bit entries. An entry can point to a
> 40-bit physical address. The young, write and exec software bits share
> the corresponding hardware bits (negated). Other software bits use spare
> bits in the PTE.
> 
> The patch also changes some variable types from unsigned long or int to
> pteval_t or pgprot_t.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm/include/asm/page.h                 |    4 +
>  arch/arm/include/asm/pgtable-3level-hwdef.h |   78 ++++++++++++++++++
>  arch/arm/include/asm/pgtable-3level-types.h |   55 +++++++++++++
>  arch/arm/include/asm/pgtable-3level.h       |  113 +++++++++++++++++++++++++++
>  arch/arm/include/asm/pgtable-hwdef.h        |    4 +
>  arch/arm/include/asm/pgtable.h              |    6 +-
>  arch/arm/mm/mm.h                            |    8 +-
>  arch/arm/mm/mmu.c                           |    2 +-
>  8 files changed, 264 insertions(+), 6 deletions(-)
>  create mode 100644 arch/arm/include/asm/pgtable-3level-hwdef.h
>  create mode 100644 arch/arm/include/asm/pgtable-3level-types.h
>  create mode 100644 arch/arm/include/asm/pgtable-3level.h
> 
> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
> index 3848105..e5124db 100644
> --- a/arch/arm/include/asm/page.h
> +++ b/arch/arm/include/asm/page.h
> @@ -151,7 +151,11 @@ extern void __cpu_copy_user_highpage(struct page *to, struct page *from,
>  #define clear_page(page)	memset((void *)(page), 0, PAGE_SIZE)
>  extern void copy_page(void *to, const void *from);
>  
> +#ifdef CONFIG_ARM_LPAE
> +#include <asm/pgtable-3level-types.h>
> +#else
>  #include <asm/pgtable-2level-types.h>
> +#endif
>  
>  #endif /* CONFIG_MMU */
>  
> diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
> new file mode 100644
> index 0000000..2f99c3c
> --- /dev/null
> +++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
> @@ -0,0 +1,78 @@
> +/*
> + * arch/arm/include/asm/pgtable-3level-hwdef.h
> + *
> + * Copyright (C) 2010 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> + */
> +#ifndef _ASM_PGTABLE_3LEVEL_HWDEF_H
> +#define _ASM_PGTABLE_3LEVEL_HWDEF_H
> +
> +#include <linux/const.h>
> +#include <asm/pgtable-3level-types.h>
> +
> +/*
> + * Hardware page table definitions.
> + *
> + * + Level 1/2 descriptor
> + *   - common
> + */
> +#define PMD_TYPE_MASK		(_AT(pmd_t, 3) << 0)
> +#define PMD_TYPE_FAULT		(_AT(pmd_t, 0) << 0)
> +#define PMD_TYPE_TABLE		(_AT(pmd_t, 3) << 0)
> +#define PMD_TYPE_SECT		(_AT(pmd_t, 1) << 0)
> +#define PMD_BIT4		(_AT(pmd_t, 0))
> +#define PMD_DOMAIN(x)		(_AT(pmd_t, 0))

It is really not correct to have these constants typed as pmd_t.
The idea behind pmd_t et al. is to detect when normal arithmetic or
logical operations are performed on page table entries where the
accessors should be used instead.

By typing these as pmd_t, it means operations need to be:

	u32 pmdval = pmd_val(foo) | pmd_val(PMD_TYPE_TABLE);

which is obviously more complicated than is needed.

> diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
> index 6630620..a62f093 100644
> --- a/arch/arm/mm/mm.h
> +++ b/arch/arm/mm/mm.h
> @@ -16,10 +16,10 @@ static inline pmd_t *pmd_off_k(unsigned long virt)
>  }
>  
>  struct mem_type {
> -	unsigned int prot_pte;
> -	unsigned int prot_l1;
> -	unsigned int prot_sect;
> -	unsigned int domain;
> +	pgprot_t prot_pte;
> +	pgprot_t prot_l1;
> +	pgprot_t prot_sect;
> +	pgprot_t domain;

Again, this is wrong.  There's an accessor for pgprot_t typed data.  This
causes code to violate it.

> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index 0ca33dd..7c803c4 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -292,7 +292,7 @@ static void __init build_mem_type_table(void)
>  {
>  	struct cachepolicy *cp;
>  	unsigned int cr = get_cr();
> -	unsigned int user_pgprot, kern_pgprot, vecs_pgprot;
> +	pgprot_t user_pgprot, kern_pgprot, vecs_pgprot;

Ditto.

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 04/20] ARM: LPAE: Do not assume Linux PTEs are always at PTRS_PER_PTE offset
  2010-11-15 17:42     ` Russell King - ARM Linux
@ 2010-11-15 21:46       ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-15 21:46 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On 15 November 2010 17:42, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Nov 12, 2010 at 06:00:24PM +0000, Catalin Marinas wrote:
>> Placing the Linux PTEs at a 2KB offset inside a page is a workaround for
>> the 2-level page table format where not enough spare bits are available.
>> With LPAE this is no longer required. This patch changes such assumption
>> by using a different macro, LINUX_PTE_OFFSET, which is defined to
>> PTRS_PER_PTE for the 2-level page tables.
>
> Hmm.  I think we should be doing this a different way - in fact, I think
> we should switch the order of the linux vs hardware page tables.  This
> actually simplifies the code a bit too - notice that we lose the arith.
> in __pte_map, __pte_unmap, pmd_page_vaddr, which is all page table
> walking stuff.

It looks like a good clean-up to me (though I need some refactoring on
my LPAE patches). Do you plan to push this upstream? If you add a
comment and a signed-off line, I can carry it in my LPAE branch until
it appears in mainline.

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-15  9:47               ` Arnd Bergmann
@ 2010-11-15 22:07                 ` Nicolas Pitre
  -1 siblings, 0 replies; 154+ messages in thread
From: Nicolas Pitre @ 2010-11-15 22:07 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Catalin Marinas, Russell King - ARM Linux, linux-arm-kernel,
	linux-kernel, Will Deacon

On Mon, 15 Nov 2010, Arnd Bergmann wrote:

> On Monday 15 November 2010 10:39:30 Catalin Marinas wrote:
> > > There will be compiler warnings because u32 is unsigned int, and we
> > > print it as %08lx.  Generic code casts pte values to (long long) and
> > > prints them using %08llx.  We should do the same.
> > 
> > We still need some kind of macro because with LPAE we need %016llx
> > since the phys address can go to 40-bit and there are some additional
> > bits in the top word. Unless you'd like to always print 16 characters
> > even for 32-bit ptes (or if there is some other printk magic I'm not
> > aware of).
> 
> Why not just %010llx? That would just be two extra characters.

Not on non-LPAE build, please.


Nicolas

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-15  9:51                 ` Catalin Marinas
@ 2010-11-15 22:11                   ` Nicolas Pitre
  -1 siblings, 0 replies; 154+ messages in thread
From: Nicolas Pitre @ 2010-11-15 22:11 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Arnd Bergmann, Russell King - ARM Linux, linux-arm-kernel,
	linux-kernel, Will Deacon

On Mon, 15 Nov 2010, Catalin Marinas wrote:

> On 15 November 2010 09:47, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Monday 15 November 2010 10:39:30 Catalin Marinas wrote:
> >> > There will be compiler warnings because u32 is unsigned int, and we
> >> > print it as %08lx.  Generic code casts pte values to (long long) and
> >> > prints them using %08llx.  We should do the same.
> >>
> >> We still need some kind of macro because with LPAE we need %016llx
> >> since the phys address can go to 40-bit and there are some additional
> >> bits in the top word. Unless you'd like to always print 16 characters
> >> even for 32-bit ptes (or if there is some other printk magic I'm not
> >> aware of).
> >
> > Why not just %010llx? That would just be two extra characters.
> 
> We still have attributes (like XN, bit 54) stored in the top part of
> the pte. This may be of interest when debugging.

They will be printed if they exist. The %010 in front of llx only means
to have a minimum of 10 zero-padded digits if the value is smaller than
that.

However, not having aligned values will be confusing.  A macro for the
format might be the best compromise.


Nicolas

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 02/20] ARM: LPAE: Factor out 2-level page table definitions into separate files
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-15 23:31     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-15 23:31 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Fri, Nov 12, 2010 at 06:00:22PM +0000, Catalin Marinas wrote:
> This patch moves page table definitions from asm/page.h, asm/pgtable.h
> and asm/pgtable-hwdef.h into corresponding *-2level* files.

This also introduces pteval_t.  It would be useful to have the
introduction of pteval_t as a separate patch, which not only
introduces the typedef, but also makes use of it.

> +#ifndef _ASM_PGTABLE_2LEVEL_TYPES_H
> +#define _ASM_PGTABLE_2LEVEL_TYPES_H
> +
> +#undef STRICT_MM_TYPECHECKS
> +
> +typedef unsigned long pteval_t;
> +
> +#ifdef STRICT_MM_TYPECHECKS
> +/*
> + * These are used to make use of C type-checking..
> + */
> +typedef struct { unsigned long pte; } pte_t;

This should become:

typedef struct { pteval_t pte; } pte_t;

L_PTE_* can then be declared using linux/const.h stuff to typedef them
to pteval_t.  shared_pte_mask also needs to be pteval_t.

As far as the __p*_error() functions, these should probably be passed
the pte/pmd/pgd value itself, rather than first passing them through
__pte_val() et al.

Of course, I now have patches for this and my other points... will sort
them out into a series in the next few days.

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-15 22:11                   ` Nicolas Pitre
@ 2010-11-15 23:35                     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-15 23:35 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Catalin Marinas, Arnd Bergmann, linux-arm-kernel, linux-kernel,
	Will Deacon

On Mon, Nov 15, 2010 at 05:11:50PM -0500, Nicolas Pitre wrote:
> On Mon, 15 Nov 2010, Catalin Marinas wrote:
> 
> > On 15 November 2010 09:47, Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Monday 15 November 2010 10:39:30 Catalin Marinas wrote:
> > >> > There will be compiler warnings because u32 is unsigned int, and we
> > >> > print it as %08lx.  Generic code casts pte values to (long long) and
> > >> > prints them using %08llx.  We should do the same.
> > >>
> > >> We still need some kind of macro because with LPAE we need %016llx
> > >> since the phys address can go to 40-bit and there are some additional
> > >> bits in the top word. Unless you'd like to always print 16 characters
> > >> even for 32-bit ptes (or if there is some other printk magic I'm not
> > >> aware of).
> > >
> > > Why not just %010llx? That would just be two extra characters.
> > 
> > We still have attributes (like XN, bit 54) stored in the top part of
> > the pte. This may be of interest when debugging.
> 
> They will be printed if they exist. The %010 in front of llx only means 
> to have a minimum of 10 zero-paded digits if the value is smaller than 
> that.
> 
> However, not having aligned values will be confusing.  A macro for the 
> format might be the best compromize.

It's what is done in the generic kernel code for page table entries.

        printk(KERN_ALERT
                "BUG: Bad page map in process %s  pte:%08llx pmd:%08llx\n",
                current->comm,
                (long long)pte_val(pte), (long long)pmd_val(*pmd));

In the places where this matters, there isn't any alignment between
lines to worry about:

	printk(", *pmd=%08lx", pmd_val(*pmd));
	printk(", *pte=%08lx", pte_val(*pte));
	printk(", *ppte=%08lx", pte_val(pte[-PTRS_PER_PTE]));

in show_pte() are examples of what needs changing.

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 02/20] ARM: LPAE: Factor out 2-level page table definitions into separate files
  2010-11-15 23:31     ` Russell King - ARM Linux
@ 2010-11-16  9:14       ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-16  9:14 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On 15 November 2010 23:31, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Nov 12, 2010 at 06:00:22PM +0000, Catalin Marinas wrote:
>> This patch moves page table definitions from asm/page.h, asm/pgtable.h
>> and asm/ptgable-hwdef.h into corresponding *-2level* files.
>
> This also introduces pteval_t.  It would be useful to have the
> introduction of pteval_t as a separate patch, which not only
> introduces the typedef, but also makes use of it.

I can do this. I missed this while splitting my initial big diff.

>> +#ifndef _ASM_PGTABLE_2LEVEL_TYPES_H
>> +#define _ASM_PGTABLE_2LEVEL_TYPES_H
>> +
>> +#undef STRICT_MM_TYPECHECKS
>> +
>> +typedef unsigned long pteval_t;
>> +
>> +#ifdef STRICT_MM_TYPECHECKS
>> +/*
>> + * These are used to make use of C type-checking..
>> + */
>> +typedef struct { unsigned long pte; } pte_t;
>
> This should become:
>
> typedef struct { pteval_t pte; } pte_t;
>
> L_PTE_* can then be declared using linux/const.h stuff to typedef them
> to pteval_t.

I already do this for LPAE but can be done for the 2-level definitions
for consistency.

BTW, do you think it's worth adding STRICT_MM_TYPECHECKS for LPAE as
well? It would probably spot some issues.

> shared_pte_mask also needs to be pteval_t.

That's done in the 3rd version of the series.

> As far as the __p*_error() functions, these should probably be passed
> the pte/pmd/pgd value itself, rather than first passing them through
> __pte_val() et.al.
>
> Of couse, I now have patches for this and my other points... will sort
> them out into a series in the next few days.

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-15 22:11                   ` Nicolas Pitre
@ 2010-11-16  9:19                     ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-16  9:19 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Arnd Bergmann, Russell King - ARM Linux, linux-arm-kernel,
	linux-kernel, Will Deacon

On 15 November 2010 22:11, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Mon, 15 Nov 2010, Catalin Marinas wrote:
>
>> On 15 November 2010 09:47, Arnd Bergmann <arnd@arndb.de> wrote:
>> > On Monday 15 November 2010 10:39:30 Catalin Marinas wrote:
>> >> > There will be compiler warnings because u32 is unsigned int, and we
>> >> > print it as %08lx.  Generic code casts pte values to (long long) and
>> >> > prints them using %08llx.  We should do the same.
>> >>
>> >> We still need some kind of macro because with LPAE we need %016llx
>> >> since the phys address can go to 40-bit and there are some additional
>> >> bits in the top word. Unless you'd like to always print 16 characters
>> >> even for 32-bit ptes (or if there is some other printk magic I'm not
>> >> aware of).
>> >
>> > Why not just %010llx? That would just be two extra characters.
>>
>> We still have attributes (like XN, bit 54) stored in the top part of
>> the pte. This may be of interest when debugging.
>
> They will be printed if they exist. The %010 in front of llx only means
> to have a minimum of 10 zero-padded digits if the value is smaller than
> that.
>
> However, not having aligned values will be confusing.  A macro for the
> format might be the best compromise.

We thought about using something like

printk("%0*llx", sizeof(pteval_t) * 2, (long long)pte_val(pte));

but it complicates the code.

Anyway, since these are mainly printed for debugging, we can
probably cope with some lack of alignment (as Russell said, there may
not be any where it matters).

-- 
Catalin


* Re: [PATCH v2 06/20] ARM: LPAE: Introduce the 3-level page table format definitions
  2010-11-15 18:34     ` Russell King - ARM Linux
@ 2010-11-16  9:57       ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-16  9:57 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On 15 November 2010 18:34, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Nov 12, 2010 at 06:00:26PM +0000, Catalin Marinas wrote:
>> +#define PMD_TYPE_MASK                (_AT(pmd_t, 3) << 0)
>> +#define PMD_TYPE_FAULT               (_AT(pmd_t, 0) << 0)
>> +#define PMD_TYPE_TABLE               (_AT(pmd_t, 3) << 0)
>> +#define PMD_TYPE_SECT                (_AT(pmd_t, 1) << 0)
>> +#define PMD_BIT4             (_AT(pmd_t, 0))
>> +#define PMD_DOMAIN(x)                (_AT(pmd_t, 0))
>
> It is really not correct to have these constants type'd as pmd_t.
> The idea behind pmd_t et al. is to detect when normal arithmetic or
> logical operations are performed on page table entries when the
> accessors instead should be used.

OK, I can add pmdval_t (and pgdval_t) and use it for these definitions.

>> diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
>> index 6630620..a62f093 100644
>> --- a/arch/arm/mm/mm.h
>> +++ b/arch/arm/mm/mm.h
>> @@ -16,10 +16,10 @@ static inline pmd_t *pmd_off_k(unsigned long virt)
>>  }
>>
>>  struct mem_type {
>> -     unsigned int prot_pte;
>> -     unsigned int prot_l1;
>> -     unsigned int prot_sect;
>> -     unsigned int domain;
>> +     pgprot_t prot_pte;
>> +     pgprot_t prot_l1;
>> +     pgprot_t prot_sect;
>> +     pgprot_t domain;
>
> Again, this is wrong.  There's an accessor for pgprot_t typed data.  This
> causes code to violate it.

OK, I'll define pgprotval_t and accessors.

-- 
Catalin


* Re: [PATCH v2 02/20] ARM: LPAE: Factor out 2-level page table definitions into separate files
  2010-11-16  9:14       ` Catalin Marinas
@ 2010-11-16  9:59         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-16  9:59 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Tue, Nov 16, 2010 at 09:14:52AM +0000, Catalin Marinas wrote:
> On 15 November 2010 23:31, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > This should become:
> >
> > typedef struct { pteval_t pte; } pte_t;
> >
> > L_PTE_* can then be declared using linux/const.h stuff to typedef them
> > to pteval_t.
> 
> I already do this for LPAE but can be done for the 2-level definitions
> for consistency.

No you don't.  You define the 2nd level definitions using pmd_t which
is _wrong_.  pmd_t is the type of the pmd container, not the pmd value.

> BTW, do you think it's worth adding STRICT_MM_TYPECHECKS for LPAE as
> well? It would probably spot some issues.

Definitely, because it'll throw out warnings for most of your _AT(pmd_t,)
definitions.


* Re: [PATCH v2 02/20] ARM: LPAE: Factor out 2-level page table definitions into separate files
  2010-11-16  9:59         ` Russell King - ARM Linux
@ 2010-11-16 10:02           ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-16 10:02 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On 16 November 2010 09:59, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Tue, Nov 16, 2010 at 09:14:52AM +0000, Catalin Marinas wrote:
>> On 15 November 2010 23:31, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > This should become:
>> >
>> > typedef struct { pteval_t pte; } pte_t;
>> >
>> > L_PTE_* can then be declared using linux/const.h stuff to typedef them
>> > to pteval_t.
>>
>> I already do this for LPAE but can be done for the 2-level definitions
>> for consistency.
>
> No you don't.  You define the 2nd level definitions using pmd_t which
> is _wrong_.  pmd_t is the type of the pmd container, not the pmd value.

I was only referring to L_PTE_*. The PMD_* definitions are wrong indeed.

-- 
Catalin


* Re: [PATCH v2 02/20] ARM: LPAE: Factor out 2-level page table definitions into separate files
  2010-11-16  9:14       ` Catalin Marinas
@ 2010-11-16 10:04         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-16 10:04 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Tue, Nov 16, 2010 at 09:14:52AM +0000, Catalin Marinas wrote:
> > Of course, I now have patches for this and my other points... will sort
> > them out into a series in the next few days.
> 
> Thanks.

BTW, don't post another round of patches just because you've had _some_
comments back - your v2 patches are still being looked through, your v3
patches haven't even been looked at yet.

It took some 4 hours for the mailing list to get through last night's
posting frenzy, so it really isn't worth overloading - is it really
worth bringing the list server to its knees when the job is only half
done?


* Re: [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-15 18:30     ` Russell King - ARM Linux
@ 2010-11-16 10:07       ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-16 10:07 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On 15 November 2010 18:30, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Nov 12, 2010 at 06:00:25PM +0000, Catalin Marinas wrote:
>> --- a/arch/arm/include/asm/pgtable-2level.h
>> +++ b/arch/arm/include/asm/pgtable-2level.h
>> @@ -128,6 +128,8 @@
>>  #define L_PTE_USER           (1 << 8)
>>  #define L_PTE_EXEC           (1 << 9)
>>  #define L_PTE_SHARED         (1 << 10)       /* shared(v6), coherent(xsc3) */
>> +#define L_PTE_NOEXEC         (0)
>> +#define L_PTE_NOWRITE                (0)
>
> Let's not make this more complicated than it has to be.  If we need the
> inverse of WRITE and EXEC, then that's what we should change everyone to,
> not invent a new system to work along side the old system.

Yes, that's fine.

For PMD, we may still need a dummy PMD_SECT_AP_WRITE for the 3-level
definitions unless we change the __pmd() accessor or __pmd_populate().

-- 
Catalin


* Re: [PATCH v2 02/20] ARM: LPAE: Factor out 2-level page table definitions into separate files
  2010-11-16 10:04         ` Russell King - ARM Linux
@ 2010-11-16 10:11           ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-16 10:11 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On 16 November 2010 10:04, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Tue, Nov 16, 2010 at 09:14:52AM +0000, Catalin Marinas wrote:
>> > Of course, I now have patches for this and my other points... will sort
>> > them out into a series in the next few days.
>>
>> Thanks.
>
> BTW, don't post another round of patches just because you've had _some_
> comments back - your v2 patches are still being looked through, your v3
> patches haven't even been looked at yet.

I'll wait for your patches on the PTE offset and then rebase mine on
top. It may be sometime next week as I already have a lot to do.

I posted the v3 patches just to clarify the issues around the 3/20
patch. The other patches are pretty much the same, so you can skip
this version and wait for v4.

-- 
Catalin


* Re: [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-15 18:30     ` Russell King - ARM Linux
@ 2010-11-16 15:18       ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-16 15:18 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On 15 November 2010 18:30, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Nov 12, 2010 at 06:00:25PM +0000, Catalin Marinas wrote:
>> The LPAE page table format needs to explicitly disable execution or
>> write permissions on a page by setting the corresponding bits (similar
>> to the classic page table format with Access Flag enabled). This patch
>> introduces null definitions for the 2-level format and the actual noexec
>> and nowrite bits for the LPAE format. It also changes several PTE
>> maintenance macros and masks.
>>
>> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
>> ---
>>  arch/arm/include/asm/pgtable-2level.h |    2 +
>>  arch/arm/include/asm/pgtable.h        |   44 +++++++++++++++++++++------------
>>  arch/arm/mm/mmu.c                     |    6 ++--
>>  3 files changed, 33 insertions(+), 19 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
>> index 36bdef7..4e21166 100644
>> --- a/arch/arm/include/asm/pgtable-2level.h
>> +++ b/arch/arm/include/asm/pgtable-2level.h
>> @@ -128,6 +128,8 @@
>>  #define L_PTE_USER           (1 << 8)
>>  #define L_PTE_EXEC           (1 << 9)
>>  #define L_PTE_SHARED         (1 << 10)       /* shared(v6), coherent(xsc3) */
>> +#define L_PTE_NOEXEC         (0)
>> +#define L_PTE_NOWRITE                (0)
>
> Let's not make this more complicated than it has to be.  If we need the
> inverse of WRITE and EXEC, then that's what we should change everyone to,
> not invent a new system to work along side the old system.

This adds an additional instruction in set_pte_ext, unless you can
write the bit checking in a better way:

	tst	r1, #L_PTE_NOWRITE
	orrne	r3, r3, #PTE_EXT_APX
	tsteq	r1, #L_PTE_DIRTY
	orreq	r3, r3, #PTE_EXT_APX

-- 
Catalin


* Re: [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-16 15:18       ` Catalin Marinas
@ 2010-11-16 15:32         ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-16 15:32 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On 16 November 2010 15:18, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On 15 November 2010 18:30, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> On Fri, Nov 12, 2010 at 06:00:25PM +0000, Catalin Marinas wrote:
>>> The LPAE page table format needs to explicitly disable execution or
>>> write permissions on a page by setting the corresponding bits (similar
>>> to the classic page table format with Access Flag enabled). This patch
>>> introduces null definitions for the 2-level format and the actual noexec
>>> and nowrite bits for the LPAE format. It also changes several PTE
>>> maintenance macros and masks.
>>>
>>> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
>>> ---
>>>  arch/arm/include/asm/pgtable-2level.h |    2 +
>>>  arch/arm/include/asm/pgtable.h        |   44 +++++++++++++++++++++------------
>>>  arch/arm/mm/mmu.c                     |    6 ++--
>>>  3 files changed, 33 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
>>> index 36bdef7..4e21166 100644
>>> --- a/arch/arm/include/asm/pgtable-2level.h
>>> +++ b/arch/arm/include/asm/pgtable-2level.h
>>> @@ -128,6 +128,8 @@
>>>  #define L_PTE_USER           (1 << 8)
>>>  #define L_PTE_EXEC           (1 << 9)
>>>  #define L_PTE_SHARED         (1 << 10)       /* shared(v6), coherent(xsc3) */
>>> +#define L_PTE_NOEXEC         (0)
>>> +#define L_PTE_NOWRITE                (0)
>>
>> Let's not make this more complicated than it has to be.  If we need the
>> inverse of WRITE and EXEC, then that's what we should change everyone to,
>> not invent a new system to work along side the old system.
>
> This adds an additional instruction in set_pte_ext, unless you can
> write the bit checking in a better way:
>
>        tst     r1, #L_PTE_NOWRITE
>        orrne   r3, r3, #PTE_EXT_APX
>        tsteq   r1, #L_PTE_DIRTY
>        orreq   r3, r3, #PTE_EXT_APX

I think that would work with 3 instructions:

	eor	r1, r1, L_PTE_DIRTY
	tst	r1, #L_PTE_NOWRITE | L_PTE_DIRTY
	orrne	r3, r3, #PTE_EXT_APX

-- 
Catalin


* Re: [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-16 15:18       ` Catalin Marinas
@ 2010-11-16 18:19         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-16 18:19 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Tue, Nov 16, 2010 at 03:18:03PM +0000, Catalin Marinas wrote:
> On 15 November 2010 18:30, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > Let's not make this more complicated than it has to be.  If we need the
> > inverse of WRITE and EXEC, then that's what we should change everyone to,
> > not invent a new system to work along side the old system.
> 
> This adds an additional instruction in set_pte_ext, unless you can
> write the bit checking in a better way:

It actually results in the same number of instructions.  From memory:

ARMv3-ARMv5:
-	eor	r3, r1, #L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_WRITE | L_PTE_DIRTY
-	tst	r3, #L_PTE_WRITE | L_PTE_DIRTY	@ write and dirty?
-	orreq	r2, r2, #PTE_SMALL_AP_UNO_SRW
+	eor	r3, r1, #L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY
+	tst	r3, #L_PTE_RDONLY | L_PTE_DIRTY	@ write and dirty?
+	orreq	r2, r2, #PTE_SMALL_AP_UNO_SRW

and for ARMv6+:

-	tst	r1, #L_PTE_WRITE
-	tstne	r1, #L_PTE_DIRTY
-	orreq	r3, r3, #PTE_EXT_APX
+	eor	r1, r1, #L_PTE_DIRTY
+	tst	r1, #L_PTE_RDONLY | L_PTE_DIRTY
+	orrne	r3, r3, #PTE_EXT_APX


* Re: [PATCH v2 03/20] ARM: LPAE: use u32 instead of unsigned long for 32-bit ptes
  2010-11-14 14:09       ` Catalin Marinas
@ 2010-11-16 19:34         ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-16 19:34 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Russell King - ARM Linux, linux-arm-kernel, linux-kernel, Will Deacon

On Sunday, November 14, 2010, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Sunday, November 14, 2010, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> On Fri, Nov 12, 2010 at 06:00:23PM +0000, Catalin Marinas wrote:
>>> From: Will Deacon <will.deacon@arm.com>
>>>
>>> When using 2-level paging, pte_t and pmd_t are typedefs for
>>> unsigned long but phys_addr_t is a typedef for u32.
>>>
>>> This patch uses u32 for the page table entry types when
>>> phys_addr_t is not 64-bit, allowing the same conversion
>>> specifier to be used for physical addresses and page table
>>> entries regardless of LPAE.
>>
>> However, code which prints the value of page table entries assumes that
>> they are unsigned long, and places where we store the raw pte value also
>> uses 'unsigned long'.
>>
>> If we're going to make this change, we need to change more places than
>> this patch covers.  grep for pte_val to help find those places.
>
> Patch 19/20 introduces a common macro for formatting but we should
> probably order the patches a bit to avoid problems if anyone is
> bisecting  in the middle of the series.

Actually not a problem since LPAE is only enabled by the last patch.
There may be some compiler warnings without 19/20, I need to check.

-- 
Catalin


* Re: [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-15 18:30     ` Russell King - ARM Linux
@ 2010-11-17 17:02       ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-17 17:02 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On 15 November 2010 18:30, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Nov 12, 2010 at 06:00:25PM +0000, Catalin Marinas wrote:
>> --- a/arch/arm/include/asm/pgtable-2level.h
>> +++ b/arch/arm/include/asm/pgtable-2level.h
>> @@ -128,6 +128,8 @@
>>  #define L_PTE_USER           (1 << 8)
>>  #define L_PTE_EXEC           (1 << 9)
>>  #define L_PTE_SHARED         (1 << 10)       /* shared(v6), coherent(xsc3) */
>> +#define L_PTE_NOEXEC         (0)
>> +#define L_PTE_NOWRITE                (0)
>
> Let's not make this more complicated than it has to be.  If we need the
> inverse of WRITE and EXEC, then that's what we should change everyone to,
> not invent a new system to work alongside the old system.

Question on the pgprot_noncached/writecombine/dmacoherent - in the
current implementation we pass L_PTE_EXEC on the dmacoherent macro. Do
we need to pass L_PTE_NOEXEC to the noncached/writecombine ones? I
don't see a reason for any of these to be executable but maybe we can
let the code calling them decide.

Thanks.

-- 
Catalin


* Re: [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-17 17:02       ` Catalin Marinas
@ 2010-11-17 17:16         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-17 17:16 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Wed, Nov 17, 2010 at 05:02:37PM +0000, Catalin Marinas wrote:
> On 15 November 2010 18:30, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Nov 12, 2010 at 06:00:25PM +0000, Catalin Marinas wrote:
> >> --- a/arch/arm/include/asm/pgtable-2level.h
> >> +++ b/arch/arm/include/asm/pgtable-2level.h
> >> @@ -128,6 +128,8 @@
> >>  #define L_PTE_USER           (1 << 8)
> >>  #define L_PTE_EXEC           (1 << 9)
> >>  #define L_PTE_SHARED         (1 << 10)       /* shared(v6), coherent(xsc3) */
> >> +#define L_PTE_NOEXEC         (0)
> >> +#define L_PTE_NOWRITE                (0)
> >
> > Let's not make this more complicated than it has to be.  If we need the
> > inverse of WRITE and EXEC, then that's what we should change everyone to,
> > not invent a new system to work alongside the old system.
> 
> Question on the pgprot_noncached/writecombine/dmacoherent - in the
> current implementation we pass L_PTE_EXEC on the dmacoherent macro.

Erm.  Please look at the code again.


* Re: [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-17 17:16         ` Russell King - ARM Linux
@ 2010-11-17 17:22           ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-17 17:22 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On 17 November 2010 17:16, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Nov 17, 2010 at 05:02:37PM +0000, Catalin Marinas wrote:
>> On 15 November 2010 18:30, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Fri, Nov 12, 2010 at 06:00:25PM +0000, Catalin Marinas wrote:
>> >> --- a/arch/arm/include/asm/pgtable-2level.h
>> >> +++ b/arch/arm/include/asm/pgtable-2level.h
>> >> @@ -128,6 +128,8 @@
>> >>  #define L_PTE_USER           (1 << 8)
>> >>  #define L_PTE_EXEC           (1 << 9)
>> >>  #define L_PTE_SHARED         (1 << 10)       /* shared(v6), coherent(xsc3) */
>> >> +#define L_PTE_NOEXEC         (0)
>> >> +#define L_PTE_NOWRITE                (0)
>> >
>> > Let's not make this more complicated than it has to be.  If we need the
>> > inverse of WRITE and EXEC, then that's what we should change everyone to,
>> > not invent a new system to work alongside the old system.
>>
>> Question on the pgprot_noncached/writecombine/dmacoherent - in the
>> current implementation we pass L_PTE_EXEC on the dmacoherent macro.
>
> Erm.  Please look at the code again.

Ah, good point, that was the mask.

So for dmacoherent we make sure that L_PTE_EXEC is cleared. I suspect
we should now make sure that L_PTE_NOEXEC is set. For the other two,
just leave them as they are.

-- 
Catalin


* Re: [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-17 17:22           ` Catalin Marinas
@ 2010-11-17 17:24             ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-17 17:24 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Wed, Nov 17, 2010 at 05:22:12PM +0000, Catalin Marinas wrote:
> Ah, good point, that was the mask.
> 
> So for dmacoherent we make sure that L_PTE_EXEC is cleared. I suspect
> we should now make sure that L_PTE_NOEXEC is set. For the other two,
> just leave them as they are.

Already done:

 #define pgprot_dmacoherent(prot) \
-       __pgprot_modify(prot, L_PTE_MT_MASK|L_PTE_EXEC, L_PTE_MT_BUFFERABLE)
+       __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_BUFFERABLE|L_PTE_XN)
...
 #define pgprot_dmacoherent(prot) \
-       __pgprot_modify(prot, L_PTE_MT_MASK|L_PTE_EXEC, L_PTE_MT_UNCACHED)
+       __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_UNCACHED|L_PTE_XN)


* Re: [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-17 17:24             ` Russell King - ARM Linux
@ 2010-11-17 17:30               ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-17 17:30 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On Wed, 2010-11-17 at 17:24 +0000, Russell King - ARM Linux wrote:
> On Wed, Nov 17, 2010 at 05:22:12PM +0000, Catalin Marinas wrote:
> > Ah, good point, that was the mask.
> >
> > So for dmacoherent we make sure that L_PTE_EXEC is cleared. I suspect
> > we should now make sure that L_PTE_NOEXEC is set. For the other two,
> > just leave them as they are.
> 
> Already done:
> 
>  #define pgprot_dmacoherent(prot) \
> -       __pgprot_modify(prot, L_PTE_MT_MASK|L_PTE_EXEC, L_PTE_MT_BUFFERABLE)
> +       __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_BUFFERABLE|L_PTE_XN)
> ...
>  #define pgprot_dmacoherent(prot) \
> -       __pgprot_modify(prot, L_PTE_MT_MASK|L_PTE_EXEC, L_PTE_MT_UNCACHED)
> +       __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_UNCACHED|L_PTE_XN)

Are you already doing such changes? Just to avoid duplicating effort
(and use common naming scheme).

-- 
Catalin



* Re: [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-17 17:30               ` Catalin Marinas
@ 2010-11-17 17:32                 ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-17 17:32 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Wed, Nov 17, 2010 at 05:30:33PM +0000, Catalin Marinas wrote:
> On Wed, 2010-11-17 at 17:24 +0000, Russell King - ARM Linux wrote:
> > On Wed, Nov 17, 2010 at 05:22:12PM +0000, Catalin Marinas wrote:
> > > Ah, good point, that was the mask.
> > >
> > > So for dmacoherent we make sure that L_PTE_EXEC is cleared. I suspect
> > > we should now make sure that L_PTE_NOEXEC is set. For the other two,
> > > just leave them as they are.
> > 
> > Already done:
> > 
> >  #define pgprot_dmacoherent(prot) \
> > -       __pgprot_modify(prot, L_PTE_MT_MASK|L_PTE_EXEC, L_PTE_MT_BUFFERABLE)
> > +       __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_BUFFERABLE|L_PTE_XN)
> > ...
> >  #define pgprot_dmacoherent(prot) \
> > -       __pgprot_modify(prot, L_PTE_MT_MASK|L_PTE_EXEC, L_PTE_MT_UNCACHED)
> > +       __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_UNCACHED|L_PTE_XN)
> 
> Are you already doing such changes? Just to avoid duplicating effort
> (and use common naming scheme).

I did say that I had patches for all the issues I raised so far...  They're
just in the process of being posted (if lists.infradead.org this time can
cope with one patch every 20 secs...)


* Re: [PATCH v2 05/20] ARM: LPAE: Introduce L_PTE_NOEXEC and L_PTE_NOWRITE
  2010-11-17 17:32                 ` Russell King - ARM Linux
@ 2010-11-17 17:34                   ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-17 17:34 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On Wed, 2010-11-17 at 17:32 +0000, Russell King - ARM Linux wrote:
> On Wed, Nov 17, 2010 at 05:30:33PM +0000, Catalin Marinas wrote:
> > On Wed, 2010-11-17 at 17:24 +0000, Russell King - ARM Linux wrote:
> > > On Wed, Nov 17, 2010 at 05:22:12PM +0000, Catalin Marinas wrote:
> > > > Ah, good point, that was the mask.
> > > >
> > > > So for dmacoherent we make sure that L_PTE_EXEC is cleared. I suspect
> > > > we should now make sure that L_PTE_NOEXEC is set. For the other two,
> > > > just leave them as they are.
> > >
> > > Already done:
> > >
> > >  #define pgprot_dmacoherent(prot) \
> > > -       __pgprot_modify(prot, L_PTE_MT_MASK|L_PTE_EXEC, L_PTE_MT_BUFFERABLE)
> > > +       __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_BUFFERABLE|L_PTE_XN)
> > > ...
> > >  #define pgprot_dmacoherent(prot) \
> > > -       __pgprot_modify(prot, L_PTE_MT_MASK|L_PTE_EXEC, L_PTE_MT_UNCACHED)
> > > +       __pgprot_modify(prot, L_PTE_MT_MASK, L_PTE_MT_UNCACHED|L_PTE_XN)
> >
> > Are you already doing such changes? Just to avoid duplicating effort
> > (and use common naming scheme).
> 
> I did say that I had patches for all the issues I raised so far...  They're
> just in the process of being posted (if lists.infradead.org this time can
> cope with one patch every 20 secs...)

I wasn't sure which patches, so I did the XN/RDONLY as well (not big
patch though).

I'll rebase my LPAE stuff in the next days and repost.

Thanks.

-- 
Catalin



* Re: [PATCH v2 01/20] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-22 12:43     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 12:43 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Fri, Nov 12, 2010 at 06:00:21PM +0000, Catalin Marinas wrote:
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 8c19595..40b386c 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -78,7 +78,7 @@ static inline void identity_mapping_add(pgd_t *pgd, unsigned long start,
>  	if (cpu_architecture() <= CPU_ARCH_ARMv5TEJ && !cpu_is_xscale())
>  		prot |= PMD_BIT4;
>  
> -	for (addr = start & PGDIR_MASK; addr < end;) {
> +	for (addr = start & PMD_MASK; addr < end;) {
>  		pmd = pmd_offset(pgd + pgd_index(addr), addr);
>  		pmd[0] = __pmd(addr | prot);
>  		addr += SECTION_SIZE;
> @@ -95,7 +95,7 @@ static inline void identity_mapping_del(pgd_t *pgd, unsigned long start,
>  	unsigned long addr;
>  	pmd_t *pmd;
>  
> -	for (addr = start & PGDIR_MASK; addr < end; addr += PGDIR_SIZE) {
> +	for (addr = start & PMD_MASK; addr < end; addr += PMD_SIZE) {
>  		pmd = pmd_offset(pgd + pgd_index(addr), addr);
>  		pmd[0] = __pmd(0);
>  		pmd[1] = __pmd(0);
...
> @@ -1068,12 +1068,12 @@ void setup_mm_for_reboot(char mode)
>  		base_pmdval |= PMD_BIT4;
>  
>  	for (i = 0; i < FIRST_USER_PGD_NR + USER_PTRS_PER_PGD; i++, pgd++) {
> -		unsigned long pmdval = (i << PGDIR_SHIFT) | base_pmdval;
> +		unsigned long pmdval = (i << PMD_SHIFT) | base_pmdval;
>  		pmd_t *pmd;
>  
> -		pmd = pmd_off(pgd, i << PGDIR_SHIFT);
> +		pmd = pmd_off(pgd, i << PMD_SHIFT);
>  		pmd[0] = __pmd(pmdval);
> -		pmd[1] = __pmd(pmdval + (1 << (PGDIR_SHIFT - 1)));
> +		pmd[1] = __pmd(pmdval + (1 << (PMD_SHIFT - 1)));
>  		flush_pmd_entry(pmd);
>  	}
>  

This lot really does need unifying - and in any case this last addition
should be using 'SECTION_SIZE', not something related to PMD shifts.

Strangely, it's something I've done over the weekend...


* Re: [PATCH v2 07/20] ARM: LPAE: Page table maintenance for the 3-level format
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-22 12:58     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 12:58 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Fri, Nov 12, 2010 at 06:00:27PM +0000, Catalin Marinas wrote:
> diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> index 97a5de3..41236f0 100644
> --- a/arch/arm/include/asm/pgtable.h
> +++ b/arch/arm/include/asm/pgtable.h
> @@ -124,7 +124,12 @@ extern pgprot_t		pgprot_kernel;
>  extern struct page *empty_zero_page;
>  #define ZERO_PAGE(vaddr)	(empty_zero_page)
>  
> +#ifdef CONFIG_ARM_LPAE
> +#define pte_pfn(pte)		((pte_val(pte) & PTE_PFN_MASK) >> PAGE_SHIFT)
> +#else
>  #define pte_pfn(pte)		(pte_val(pte) >> PAGE_SHIFT)
> +#endif

Just make LPAE and non-LPAE both provide PTE_PFN_MASK - for non-LPAE
this can be defined as ~0UL to optimize it away.  However, PTE_PFN_MASK
is the wrong name for this - you're not masking out the PFN, but the
physical address.  It only becomes a PFN when you shift.

This is important because...

> +static inline pte_t *pmd_page_vaddr(pmd_t pmd)
> +{
> +	return __va(pmd_val(pmd) & PTE_PFN_MASK);

... here it becomes much more confusing - it suggests that
"pmd_val(pmd) & PTE_PFN_MASK" gives you a PFN, which you then pass to
a function which takes a physical address.

Also, pmd_page_vaddr() in my patches ends up as:

 static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 {
+       return __va(pmd_val(pmd) & PAGE_MASK);
 }

which is almost the same.  I'd suggest that this becomes for both:

 static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 {
        return __va(pmd_val(pmd) & PTE_PFN_MASK & PAGE_MASK);
 }

but with PTE_PFN_MASK more appropriately named.

> +}
> +
> +#else	/* !CONFIG_ARM_LPAE */
> +
>  #define pmd_bad(pmd)		(pmd_val(pmd) & 2)
>  
>  #define copy_pmd(pmdpd,pmdps)		\
> @@ -252,7 +285,13 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
>  	return __va(ptr);
>  }
>  
> +#endif	/* CONFIG_ARM_LPAE */
> +
> +#ifdef CONFIG_ARM_LPAE
> +#define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PTE_PFN_MASK))
> +#else
>  #define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd)))
> +#endif

Ditto.

> diff --git a/arch/arm/include/asm/proc-fns.h b/arch/arm/include/asm/proc-fns.h
> index 8fdae9b..f00ae99 100644
> --- a/arch/arm/include/asm/proc-fns.h
> +++ b/arch/arm/include/asm/proc-fns.h
> @@ -263,6 +263,18 @@
>  
>  #define cpu_switch_mm(pgd,mm) cpu_do_switch_mm(virt_to_phys(pgd),mm)
>  
> +#ifdef CONFIG_ARM_LPAE
> +#define cpu_get_pgd()	\
> +	({						\
> +		unsigned long pg, pg2;			\
> +		__asm__("mrrc	p15, 0, %0, %1, c2"	\
> +			: "=r" (pg), "=r" (pg2)		\
> +			:				\
> +			: "cc");			\
> +		pg &= ~(PTRS_PER_PGD*sizeof(pgd_t)-1);	\
> +		(pgd_t *)phys_to_virt(pg);		\
> +	})
> +#else
>  #define cpu_get_pgd()	\
>  	({						\
>  		unsigned long pg;			\
> @@ -271,6 +283,7 @@
>  		pg &= ~0x3fff;				\

I think this wants updating to use similar math to the one above.

> @@ -81,7 +90,8 @@ void free_pgd_slow(struct mm_struct *mm, pgd_t *pgd)
>  	if (!pgd)
>  		return;
>  
> -	/* pgd is always present and good */
> +	if (pgd_none(*pgd))
> +		goto free;

This actually wants to become something more like:

+       pgd = pgd_base + pgd_index(0);
+       if (pgd_none_or_clear_bad(pgd))
+               goto no_pgd;

+       pmd = pmd_offset(pgd, 0);
+       if (pmd_none_or_clear_bad(pmd))
+               goto no_pmd;

        pte = pmd_pgtable(*pmd);
        pmd_clear(pmd);
        pte_free(mm, pte);
+no_pmd:
+       pgd_clear(pgd);
        pmd_free(mm, pmd);
+no_pgd:
	free_pgd(pgd_base);



* [PATCH v2 07/20] ARM: LPAE: Page table maintenance for the 3-level format
@ 2010-11-22 12:58     ` Russell King - ARM Linux
  0 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 12:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 12, 2010 at 06:00:27PM +0000, Catalin Marinas wrote:
> diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> index 97a5de3..41236f0 100644
> --- a/arch/arm/include/asm/pgtable.h
> +++ b/arch/arm/include/asm/pgtable.h
> @@ -124,7 +124,12 @@ extern pgprot_t		pgprot_kernel;
>  extern struct page *empty_zero_page;
>  #define ZERO_PAGE(vaddr)	(empty_zero_page)
>  
> +#ifdef CONFIG_ARM_LPAE
> +#define pte_pfn(pte)		((pte_val(pte) & PTE_PFN_MASK) >> PAGE_SHIFT)
> +#else
>  #define pte_pfn(pte)		(pte_val(pte) >> PAGE_SHIFT)
> +#endif

Just make LPAE and non-LPAE both provide PTE_PFN_MASK - for non-LPAE
this can be defined as ~0UL to optimize it away.  However, PTE_PFN_MASK
is the wrong name for this - you're not masking out the PFN, but the
physical address.  It only becomes a PFN when you shift.

This is important because...

> +static inline pte_t *pmd_page_vaddr(pmd_t pmd)
> +{
> +	return __va(pmd_val(pmd) & PTE_PFN_MASK);

... here it becomes much more confusing - it suggests that
"pmd_val(pmd) & PTE_PFN_MASK" gives you a PFN, which you then pass to
a function which takes a physical address.

Also, pmd_page_vaddr() in my patches ends up as:

 static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 {
+       return __va(pmd_val(pmd) & PAGE_MASK);
 }

which is almost the same.  I'd suggest that this becomes for both:

 static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 {
        return __va(pmd_val(pmd) & PTE_PFN_MASK & PAGE_MASK);
 }

but with PTE_PFN_MASK more appropriately named.

> +}
> +
> +#else	/* !CONFIG_ARM_LPAE */
> +
>  #define pmd_bad(pmd)		(pmd_val(pmd) & 2)
>  
>  #define copy_pmd(pmdpd,pmdps)		\
> @@ -252,7 +285,13 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
>  	return __va(ptr);
>  }
>  
> +#endif	/* CONFIG_ARM_LPAE */
> +
> +#ifdef CONFIG_ARM_LPAE
> +#define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PTE_PFN_MASK))
> +#else
>  #define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd)))
> +#endif

Ditto.

> diff --git a/arch/arm/include/asm/proc-fns.h b/arch/arm/include/asm/proc-fns.h
> index 8fdae9b..f00ae99 100644
> --- a/arch/arm/include/asm/proc-fns.h
> +++ b/arch/arm/include/asm/proc-fns.h
> @@ -263,6 +263,18 @@
>  
>  #define cpu_switch_mm(pgd,mm) cpu_do_switch_mm(virt_to_phys(pgd),mm)
>  
> +#ifdef CONFIG_ARM_LPAE
> +#define cpu_get_pgd()	\
> +	({						\
> +		unsigned long pg, pg2;			\
> +		__asm__("mrrc	p15, 0, %0, %1, c2"	\
> +			: "=r" (pg), "=r" (pg2)		\
> +			:				\
> +			: "cc");			\
> +		pg &= ~(PTRS_PER_PGD*sizeof(pgd_t)-1);	\
> +		(pgd_t *)phys_to_virt(pg);		\
> +	})
> +#else
>  #define cpu_get_pgd()	\
>  	({						\
>  		unsigned long pg;			\
> @@ -271,6 +283,7 @@
>  		pg &= ~0x3fff;				\

I think this wants updating to use similar math to the one above.

> @@ -81,7 +90,8 @@ void free_pgd_slow(struct mm_struct *mm, pgd_t *pgd)
>  	if (!pgd)
>  		return;
>  
> -	/* pgd is always present and good */
> +	if (pgd_none(*pgd))
> +		goto free;

This actually wants to become something more like:

+       pgd = pgd_base + pgd_index(0);
+       if (pgd_none_or_clear_bad(pgd))
+               goto no_pgd;

+       pmd = pmd_offset(pgd, 0);
+       if (pmd_none_or_clear_bad(pmd))
+               goto no_pmd;

        pte = pmd_pgtable(*pmd);
        pmd_clear(pmd);
        pte_free(mm, pte);
+no_pmd:
+       pgd_clear(pgd);
        pmd_free(mm, pmd);
+no_pgd:
        free_pgd(pgd_base);

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 01/20] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*
  2010-11-22 12:43     ` Russell King - ARM Linux
@ 2010-11-22 13:00       ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-22 13:00 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On Mon, 2010-11-22 at 12:43 +0000, Russell King - ARM Linux wrote:
> On Fri, Nov 12, 2010 at 06:00:21PM +0000, Catalin Marinas wrote:
> > diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> > index 8c19595..40b386c 100644
> > --- a/arch/arm/kernel/smp.c
> > +++ b/arch/arm/kernel/smp.c
> > @@ -78,7 +78,7 @@ static inline void identity_mapping_add(pgd_t *pgd, unsigned long start,
> >       if (cpu_architecture() <= CPU_ARCH_ARMv5TEJ && !cpu_is_xscale())
> >               prot |= PMD_BIT4;
> > 
> > -     for (addr = start & PGDIR_MASK; addr < end;) {
> > +     for (addr = start & PMD_MASK; addr < end;) {
> >               pmd = pmd_offset(pgd + pgd_index(addr), addr);
> >               pmd[0] = __pmd(addr | prot);
> >               addr += SECTION_SIZE;
> > @@ -95,7 +95,7 @@ static inline void identity_mapping_del(pgd_t *pgd, unsigned long start,
> >       unsigned long addr;
> >       pmd_t *pmd;
> > 
> > -     for (addr = start & PGDIR_MASK; addr < end; addr += PGDIR_SIZE) {
> > +     for (addr = start & PMD_MASK; addr < end; addr += PMD_SIZE) {
> >               pmd = pmd_offset(pgd + pgd_index(addr), addr);
> >               pmd[0] = __pmd(0);
> >               pmd[1] = __pmd(0);
> ...
> > @@ -1068,12 +1068,12 @@ void setup_mm_for_reboot(char mode)
> >               base_pmdval |= PMD_BIT4;
> > 
> >       for (i = 0; i < FIRST_USER_PGD_NR + USER_PTRS_PER_PGD; i++, pgd++) {
> > -             unsigned long pmdval = (i << PGDIR_SHIFT) | base_pmdval;
> > +             unsigned long pmdval = (i << PMD_SHIFT) | base_pmdval;
> >               pmd_t *pmd;
> > 
> > -             pmd = pmd_off(pgd, i << PGDIR_SHIFT);
> > +             pmd = pmd_off(pgd, i << PMD_SHIFT);
> >               pmd[0] = __pmd(pmdval);
> > -             pmd[1] = __pmd(pmdval + (1 << (PGDIR_SHIFT - 1)));
> > +             pmd[1] = __pmd(pmdval + (1 << (PMD_SHIFT - 1)));
> >               flush_pmd_entry(pmd);
> >       }
> > 
> 
> This lot really does need unifying - and in any case this last addition
> should be using 'SECTION_SIZE', not something related to PMD shifts.

On the classic page tables, PMD_SHIFT is 21 while SECTION_SHIFT is 20.
They have slightly different meanings.

But we currently have some hacks to cope with PMD_SHIFT being 21 by
writing both pmd[0] and pmd[1] in the same call. The only way I see to
use SECTION_SHIFT is to drop the pmd[] array (but I haven't looked
closely enough). With LPAE I have a few #ifndefs around the pmd[1]
setting.

> Strangely, it's something I've done over the weekend...

OK, the more clean-up the better.

-- 
Catalin

* Re: [PATCH v2 08/20] ARM: LPAE: MMU setup for the 3-level page table format
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-22 13:10     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 13:10 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Fri, Nov 12, 2010 at 06:00:28PM +0000, Catalin Marinas wrote:
> This patch adds the MMU initialisation for the LPAE page table format.
> The swapper_pg_dir size with LPAE is 5 rather than 4 pages. The
> __v7_setup function configures the TTBRx split based on the PAGE_OFFSET
> and sets the corresponding TTB control and MAIRx bits (similar to
> PRRR/NMRR for TEX remapping). The 36-bit mappings (supersections) and
> a few other memory types in mmu.c are conditionally compiled.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm/kernel/head.S    |   96 +++++++++++++++++++++++++++++++------------
>  arch/arm/mm/mmu.c         |   32 ++++++++++++++-
>  arch/arm/mm/proc-macros.S |    5 +-
>  arch/arm/mm/proc-v7.S     |   99 ++++++++++++++++++++++++++++++++++++++++----
>  4 files changed, 193 insertions(+), 39 deletions(-)
> 
> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> index dd6b369..fd8a29e 100644
> --- a/arch/arm/kernel/head.S
> +++ b/arch/arm/kernel/head.S
> @@ -21,6 +21,7 @@
>  #include <asm/memory.h>
>  #include <asm/thread_info.h>
>  #include <asm/system.h>
> +#include <asm/pgtable.h>
>  
>  #ifdef CONFIG_DEBUG_LL
>  #include <mach/debug-macro.S>
> @@ -45,11 +46,20 @@
>  #error KERNEL_RAM_VADDR must start at 0xXXXX8000
>  #endif
>  
> +#ifdef CONFIG_ARM_LPAE
> +	/* LPAE requires an additional page for the PGD */
> +#define PG_DIR_SIZE	0x5000
> +#define PTE_WORDS	3
> +#else
> +#define PG_DIR_SIZE	0x4000
> +#define PTE_WORDS	2

PTE is not the right prefix here - we don't deal with the lowest level
of page tables, which in Linux is called PTE.  I think you mean PMD_WORDS
instead.

> +#endif
> +
>  	.globl	swapper_pg_dir
> -	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - 0x4000
> +	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - PG_DIR_SIZE
>  
>  	.macro	pgtbl, rd
> -	ldr	\rd, =(KERNEL_RAM_PADDR - 0x4000)
> +	ldr	\rd, =(KERNEL_RAM_PADDR - PG_DIR_SIZE)
>  	.endm
>  
>  #ifdef CONFIG_XIP_KERNEL
> @@ -129,11 +139,11 @@ __create_page_tables:
>  	pgtbl	r4				@ page table address
>  
>  	/*
> -	 * Clear the 16K level 1 swapper page table
> +	 * Clear the swapper page table
>  	 */
>  	mov	r0, r4
>  	mov	r3, #0
> -	add	r6, r0, #0x4000
> +	add	r6, r0, #PG_DIR_SIZE
>  1:	str	r3, [r0], #4
>  	str	r3, [r0], #4
>  	str	r3, [r0], #4
> @@ -141,6 +151,23 @@ __create_page_tables:
>  	teq	r0, r6
>  	bne	1b
>  
> +#ifdef CONFIG_ARM_LPAE
> +	/*
> +	 * Build the PGD table (first level) to point to the PMD table. A PGD
> +	 * entry is 64-bit wide and the top 32 bits are 0.
> +	 */
> +	mov	r0, r4
> +	add	r3, r4, #0x1000			@ first PMD table address
> +	orr	r3, r3, #3			@ PGD block type
> +	mov	r6, #4				@ PTRS_PER_PGD
> +1:	str	r3, [r0], #8			@ set PGD entry
> +	add	r3, r3, #0x1000			@ next PMD table
> +	subs	r6, r6, #1
> +	bne	1b
> +
> +	add	r4, r4, #0x1000			@ point to the PMD tables
> +#endif
> +
>  	ldr	r7, [r10, #PROCINFO_MM_MMUFLAGS] @ mm_mmuflags
>  
>  	/*
> @@ -152,30 +179,30 @@ __create_page_tables:
>  	sub	r0, r0, r3			@ virt->phys offset
>  	add	r5, r5, r0			@ phys __enable_mmu
>  	add	r6, r6, r0			@ phys __enable_mmu_end
> -	mov	r5, r5, lsr #20
> -	mov	r6, r6, lsr #20
> +	mov	r5, r5, lsr #SECTION_SHIFT
> +	mov	r6, r6, lsr #SECTION_SHIFT
>  
> -1:	orr	r3, r7, r5, lsl #20		@ flags + kernel base
> -	str	r3, [r4, r5, lsl #2]		@ identity mapping
> -	teq	r5, r6
> -	addne	r5, r5, #1			@ next section
> -	bne	1b
> +1:	orr	r3, r7, r5, lsl #SECTION_SHIFT	@ flags + kernel base
> +	str	r3, [r4, r5, lsl #PTE_WORDS]	@ identity mapping
> +	cmp	r5, r6
> +	addlo	r5, r5, #SECTION_SHIFT >> 20	@ next section
> +	blo	1b
>  
>  	/*
>  	 * Now setup the pagetables for our kernel direct
>  	 * mapped region.
>  	 */
>  	mov	r3, pc
> -	mov	r3, r3, lsr #20
> -	orr	r3, r7, r3, lsl #20
> +	mov	r3, r3, lsr #SECTION_SHIFT
> +	orr	r3, r7, r3, lsl #SECTION_SHIFT
>  	add	r0, r4,  #(KERNEL_START & 0xff000000) >> 18
> -	str	r3, [r0, #(KERNEL_START & 0x00f00000) >> 18]!
> +	str	r3, [r0, #(KERNEL_START & 0x00e00000) >> 18]!
>  	ldr	r6, =(KERNEL_END - 1)
> -	add	r0, r0, #4
> +	add	r0, r0, #1 << PTE_WORDS
>  	add	r6, r4, r6, lsr #18

Are you sure these shifts by 18 places are correct?  They're actually
(val >> SECTION_SHIFT) << 2, so maybe they should be (SECTION_SHIFT -
PMD_WORDS) ?


* Re: [PATCH v2 09/20] ARM: LPAE: Change setup_mm_for_reboot() to work with LPAE
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-22 13:11     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 13:11 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Fri, Nov 12, 2010 at 06:00:29PM +0000, Catalin Marinas wrote:
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index 4147cc6..3784acc 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -1098,13 +1098,16 @@ void setup_mm_for_reboot(char mode)
>  	if (cpu_architecture() <= CPU_ARCH_ARMv5TEJ && !cpu_is_xscale())
>  		base_pmdval |= PMD_BIT4;
>  
> -	for (i = 0; i < FIRST_USER_PGD_NR + USER_PTRS_PER_PGD; i++, pgd++) {
> +	for (i = 0; i < TASK_SIZE >> PMD_SHIFT; i++) {
>  		unsigned long pmdval = (i << PMD_SHIFT) | base_pmdval;
>  		pmd_t *pmd;
> +		unsigned long addr = i << PMD_SHIFT;
>  
> -		pmd = pmd_off(pgd, i << PMD_SHIFT);
> +		pmd = pmd_off(pgd + pgd_index(addr), addr);
>  		pmd[0] = __pmd(pmdval);
> +#ifndef CONFIG_ARM_LPAE
>  		pmd[1] = __pmd(pmdval + (1 << (PMD_SHIFT - 1)));
> +#endif
>  		flush_pmd_entry(pmd);
>  	}

The same is required for the identity mapping code.  If this uses that
code, the problem becomes localized there.


* Re: [PATCH v2 10/20] ARM: LPAE: Remove the FIRST_USER_PGD_NR and USER_PTRS_PER_PGD definitions
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-22 13:11     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 13:11 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Fri, Nov 12, 2010 at 06:00:30PM +0000, Catalin Marinas wrote:
> These macros were only used in setup_mm_for_reboot and get_pgd_slow.
> Both have been modified to no longer use these definitions. One of the
> reasons is the different meaning that PGD has with the 2-level and
> 3-level page tables.

We don't actually need this macro anymore, it can be killed (and I've
already done so.)


* Re: [PATCH v2 11/20] ARM: LPAE: Add fault handling support
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-22 13:15     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 13:15 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Fri, Nov 12, 2010 at 06:00:31PM +0000, Catalin Marinas wrote:
> @@ -108,7 +113,9 @@ void show_pte(struct mm_struct *mm, unsigned long addr)
>  
>  		pte = pte_offset_map(pmd, addr);
>  		printk(", *pte=%08lx", pte_val(*pte));
> +#ifndef CONFIG_ARM_LPAE
>  		printk(", *ppte=%08lx", pte_val(pte[-LINUX_PTE_OFFSET]));
> +#endif

This is an unrelated change - should it be in a different patch?


* Re: [PATCH v2 11/20] ARM: LPAE: Add fault handling support
  2010-11-22 13:15     ` Russell King - ARM Linux
@ 2010-11-22 13:19       ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-22 13:19 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On Mon, 2010-11-22 at 13:15 +0000, Russell King - ARM Linux wrote:
> On Fri, Nov 12, 2010 at 06:00:31PM +0000, Catalin Marinas wrote:
> > @@ -108,7 +113,9 @@ void show_pte(struct mm_struct *mm, unsigned long addr)
> > 
> >               pte = pte_offset_map(pmd, addr);
> >               printk(", *pte=%08lx", pte_val(*pte));
> > +#ifndef CONFIG_ARM_LPAE
> >               printk(", *ppte=%08lx", pte_val(pte[-LINUX_PTE_OFFSET]));
> > +#endif
> 
> This is an unrelated change - should it be in a different patch?

It was intended to be in this patch as I couldn't find a better place.
This patch sorts out the fault handling (and error reporting) for LPAE
and we don't need the additional printk here.

-- 
Catalin

* Re: [PATCH v2 01/20] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*
  2010-11-22 13:00       ` Catalin Marinas
@ 2010-11-22 13:28         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 13:28 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Mon, Nov 22, 2010 at 01:00:10PM +0000, Catalin Marinas wrote:
> On Mon, 2010-11-22 at 12:43 +0000, Russell King - ARM Linux wrote:
> > On Fri, Nov 12, 2010 at 06:00:21PM +0000, Catalin Marinas wrote:
> > > diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> > > index 8c19595..40b386c 100644
> > > --- a/arch/arm/kernel/smp.c
> > > +++ b/arch/arm/kernel/smp.c
> > > @@ -78,7 +78,7 @@ static inline void identity_mapping_add(pgd_t *pgd, unsigned long start,
> > >       if (cpu_architecture() <= CPU_ARCH_ARMv5TEJ && !cpu_is_xscale())
> > >               prot |= PMD_BIT4;
> > > 
> > > -     for (addr = start & PGDIR_MASK; addr < end;) {
> > > +     for (addr = start & PMD_MASK; addr < end;) {
> > >               pmd = pmd_offset(pgd + pgd_index(addr), addr);
> > >               pmd[0] = __pmd(addr | prot);
> > >               addr += SECTION_SIZE;
> > > @@ -95,7 +95,7 @@ static inline void identity_mapping_del(pgd_t *pgd, unsigned long start,
> > >       unsigned long addr;
> > >       pmd_t *pmd;
> > > 
> > > -     for (addr = start & PGDIR_MASK; addr < end; addr += PGDIR_SIZE) {
> > > +     for (addr = start & PMD_MASK; addr < end; addr += PMD_SIZE) {
> > >               pmd = pmd_offset(pgd + pgd_index(addr), addr);
> > >               pmd[0] = __pmd(0);
> > >               pmd[1] = __pmd(0);
> > ...
> > > @@ -1068,12 +1068,12 @@ void setup_mm_for_reboot(char mode)
> > >               base_pmdval |= PMD_BIT4;
> > > 
> > >       for (i = 0; i < FIRST_USER_PGD_NR + USER_PTRS_PER_PGD; i++, pgd++) {
> > > -             unsigned long pmdval = (i << PGDIR_SHIFT) | base_pmdval;
> > > +             unsigned long pmdval = (i << PMD_SHIFT) | base_pmdval;
> > >               pmd_t *pmd;
> > > 
> > > -             pmd = pmd_off(pgd, i << PGDIR_SHIFT);
> > > +             pmd = pmd_off(pgd, i << PMD_SHIFT);
> > >               pmd[0] = __pmd(pmdval);
> > > -             pmd[1] = __pmd(pmdval + (1 << (PGDIR_SHIFT - 1)));
> > > +             pmd[1] = __pmd(pmdval + (1 << (PMD_SHIFT - 1)));
> > >               flush_pmd_entry(pmd);
> > >       }
> > > 
> > 
> > This lot really does need unifying - and in any case this last addition
> > should be using 'SECTION_SIZE', not something related to PMD shifts.
> 
> On the classic page tables, the PMD_SHIFT is 21 while the SECTION_SHIFT
> is 20. They have slightly different meaning.

Correct, and if you look at the code again and analyze what it's doing,
you'll see that it's using the wrong thing.  The code pre-exists the
SECTION_* macros, and was never fixed up when they were introduced.

SECTION_SIZE is the right thing here - it's setting up sections.

Just look at identity_mapping_add() to see how the code _should_ be.


* Re: [PATCH v2 11/20] ARM: LPAE: Add fault handling support
  2010-11-22 13:19       ` Catalin Marinas
@ 2010-11-22 13:32         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 13:32 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Mon, Nov 22, 2010 at 01:19:27PM +0000, Catalin Marinas wrote:
> On Mon, 2010-11-22 at 13:15 +0000, Russell King - ARM Linux wrote:
> > On Fri, Nov 12, 2010 at 06:00:31PM +0000, Catalin Marinas wrote:
> > > @@ -108,7 +113,9 @@ void show_pte(struct mm_struct *mm, unsigned long addr)
> > > 
> > >               pte = pte_offset_map(pmd, addr);
> > >               printk(", *pte=%08lx", pte_val(*pte));
> > > +#ifndef CONFIG_ARM_LPAE
> > >               printk(", *ppte=%08lx", pte_val(pte[-LINUX_PTE_OFFSET]));
> > > +#endif
> > 
> > This is an unrelated change - should it be in a different patch?
> 
> It was intended to be in this patch as I couldn't find a better place.
> This patch sorts out the fault handling (and error reporting) for LPAE
> and we don't need the additional printk here.

It doesn't sort the fault error reporting actually.  With pte_val()
returning u64 constants on LPAE, all the above printk's using %08lx will
issue warnings.

Also, as one of your previous patches changed the non-LPAE stuff to use
u32, which is 'unsigned int', %08lx is wrong for them too, and will cause
the compiler to spit out warnings.

I can only assume this patch hasn't been build-tested, or maybe it has
but the warnings were ignored?

It seems a larger patch is required here - and as such might as well
become a separate "fix fault reporting" patch.
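The class of warning being described, and the fix that later versions of the series adopted (widen to `unsigned long long` and use `%08llx`), can be sketched outside the kernel. The `pteval_demo_t` type and `format_pte` helper here are hypothetical stand-ins for illustration, not the kernel's code:

```c
#include <stdio.h>
#include <string.h>
#include <assert.h>

/* Stand-in for the kernel's pte value type: with LPAE it becomes a
 * 64-bit quantity, so %08lx (which expects unsigned long) no longer
 * matches the argument and the compiler warns. */
typedef unsigned long long pteval_demo_t;

/* Widening the argument to unsigned long long and printing it with
 * %08llx is correct for both the 32-bit (u32) and 64-bit (u64)
 * representations. */
int format_pte(char *buf, size_t n, pteval_demo_t val)
{
        return snprintf(buf, n, "*pte=%08llx", (unsigned long long)val);
}
```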

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 13/20] ARM: LPAE: Add SMP support for the 3-level page table format
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-22 13:37     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 13:37 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Fri, Nov 12, 2010 at 06:00:33PM +0000, Catalin Marinas wrote:
> With 3-level page tables, starting secondary CPUs requires allocating
> the pgd as well. Since LPAE Linux uses TTBR1 for the kernel page tables,
> this patch reorders the CPU setup call in the head.S file so that the
> swapper_pg_dir is used. TTBR0 is set to the value generated by the
> primary CPU.

> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm/kernel/head.S |   10 +++++-----
>  arch/arm/kernel/smp.c  |   39 +++++++++++++++++++++++++++++++++++++--
>  2 files changed, 42 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> index fd8a29e..b54d00e 100644
> --- a/arch/arm/kernel/head.S
> +++ b/arch/arm/kernel/head.S
> @@ -321,6 +321,10 @@ ENTRY(secondary_startup)
>  	moveq	r0, #'p'			@ yes, error 'p'
>  	beq	__error_p
>  
> +	pgtbl	r4
> +	add	r12, r10, #BSYM(PROCINFO_INITFUNC)
> +	blx	r12				@ initialise processor
> +						@ (return control reg)

I really don't like this being different in ordering from the boot
CPU bring up.  If we want to have the init function dealing with
split page tables, we should pass in two pointers for it in both
paths.


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 11/20] ARM: LPAE: Add fault handling support
  2010-11-22 13:32         ` Russell King - ARM Linux
@ 2010-11-22 13:38           ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-22 13:38 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On Mon, 2010-11-22 at 13:32 +0000, Russell King - ARM Linux wrote:
> On Mon, Nov 22, 2010 at 01:19:27PM +0000, Catalin Marinas wrote:
> > On Mon, 2010-11-22 at 13:15 +0000, Russell King - ARM Linux wrote:
> > > On Fri, Nov 12, 2010 at 06:00:31PM +0000, Catalin Marinas wrote:
> > > > @@ -108,7 +113,9 @@ void show_pte(struct mm_struct *mm, unsigned long addr)
> > > >
> > > >               pte = pte_offset_map(pmd, addr);
> > > >               printk(", *pte=%08lx", pte_val(*pte));
> > > > +#ifndef CONFIG_ARM_LPAE
> > > >               printk(", *ppte=%08lx", pte_val(pte[-LINUX_PTE_OFFSET]));
> > > > +#endif
> > >
> > > This is an unrelated change - should it be in a different patch?
> >
> > It was intended to be in this patch as I couldn't find a better place.
> > This patch sorts out the fault handling (and error reporting) for LPAE
> > and we don't need the additional printk here.
> 
> It doesn't sort the fault error reporting actually.  With pte_val()
> returning u64 constants on LPAE, all the above printk's using %08lx will
> issue warnings.
> 
> Also, as one of your previous patches changed the non-LPAE stuff to use
> u32, which is 'unsigned int', %08lx is wrong for them too, and will cause
> the compiler to spit out warnings.

This has been fixed in a subsequent version of the series with the
conversion to %08llx and long long.

> I can only assume this patch hasn't been build-tested, or maybe it has
> but the warnings ignored?

Probably the latter. I run the resulting kernels both on VE (with A9)
and a model supporting A15.

-- 
Catalin



^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 19/20] ARM: LPAE: define printk format for physical addresses and page table entries
  2010-11-12 18:00   ` Catalin Marinas
@ 2010-11-22 13:43     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-22 13:43 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel, Will Deacon

On Fri, Nov 12, 2010 at 06:00:39PM +0000, Catalin Marinas wrote:
> From: Will Deacon <will.deacon@arm.com>
> 
> Now that the kernel supports 2-level and 3-level page tables, physical
> addresses (and also page table entries) may be 32-bit or 64-bit depending
> upon the configuration.
> 
> This patch adds a conversion specifier (PHYS_ADDR_FMT) which represents
> a u32 or u64 depending on the width of a physical address.

I hope this patch is gone in v3.

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 19/20] ARM: LPAE: define printk format for physical addresses and page table entries
  2010-11-22 13:43     ` Russell King - ARM Linux
@ 2010-11-22 13:49       ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-22 13:49 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel, Will Deacon

On Mon, 2010-11-22 at 13:43 +0000, Russell King - ARM Linux wrote:
> On Fri, Nov 12, 2010 at 06:00:39PM +0000, Catalin Marinas wrote:
> > From: Will Deacon <will.deacon@arm.com>
> >
> > Now that the Kernel supports 2 level and 3 level page tables, physical
> > addresses (and also page table entries) may be 32 or 64-bits depending
> > upon the configuration.
> >
> > This patch adds a conversion specifier (PHYS_ADDR_FMT) which represents
> > a u32 or u64 depending on the width of a physical address.
> 
> I hope this patch is gone in v3.

Yes, everything is converted to %08llx now.

Once you complete your clean-up, I'll rebase the LPAE patches on top and
repost a new v4 version.

-- 
Catalin



^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 08/20] ARM: LPAE: MMU setup for the 3-level page table format
  2010-11-22 13:10     ` Russell King - ARM Linux
@ 2010-11-23 11:38       ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-23 11:38 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On 22 November 2010 13:10, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Nov 12, 2010 at 06:00:28PM +0000, Catalin Marinas wrote:
>> This patch adds the MMU initialisation for the LPAE page table format.
>> The swapper_pg_dir size with LPAE is 5 rather than 4 pages. The
>> __v7_setup function configures the TTBRx split based on the PAGE_OFFSET
>> and sets the corresponding TTB control and MAIRx bits (similar to
>> PRRR/NMRR for TEX remapping). The 36-bit mappings (supersections) and
>> a few other memory types in mmu.c are conditionally compiled.
[...]
>> --- a/arch/arm/kernel/head.S
>> +++ b/arch/arm/kernel/head.S
>> @@ -21,6 +21,7 @@
>>  #include <asm/memory.h>
>>  #include <asm/thread_info.h>
>>  #include <asm/system.h>
>> +#include <asm/pgtable.h>
>>
>>  #ifdef CONFIG_DEBUG_LL
>>  #include <mach/debug-macro.S>
>> @@ -45,11 +46,20 @@
>>  #error KERNEL_RAM_VADDR must start at 0xXXXX8000
>>  #endif
>>
>> +#ifdef CONFIG_ARM_LPAE
>> +     /* LPAE requires an additional page for the PGD */
>> +#define PG_DIR_SIZE  0x5000
>> +#define PTE_WORDS    3
>> +#else
>> +#define PG_DIR_SIZE  0x4000
>> +#define PTE_WORDS    2
>
> PTE is not the right prefix here - we don't deal with the lowest level
> of page tables, which in Linux is called PTE.  I think you mean PMD_WORDS
> instead.

It should actually be something like PMD_ORDER, since it is a log2 value.

>>  #ifdef CONFIG_XIP_KERNEL
>> @@ -129,11 +139,11 @@ __create_page_tables:
>>       pgtbl   r4                              @ page table address
>>
>>       /*
>> -      * Clear the 16K level 1 swapper page table
>> +      * Clear the swapper page table
>>        */
>>       mov     r0, r4
>>       mov     r3, #0
>> -     add     r6, r0, #0x4000
>> +     add     r6, r0, #PG_DIR_SIZE
>>  1:   str     r3, [r0], #4
>>       str     r3, [r0], #4
>>       str     r3, [r0], #4
>> @@ -141,6 +151,23 @@ __create_page_tables:
>>       teq     r0, r6
>>       bne     1b
>>
>> +#ifdef CONFIG_ARM_LPAE
>> +     /*
>> +      * Build the PGD table (first level) to point to the PMD table. A PGD
>> +      * entry is 64-bit wide and the top 32 bits are 0.
>> +      */
>> +     mov     r0, r4
>> +     add     r3, r4, #0x1000                 @ first PMD table address
>> +     orr     r3, r3, #3                      @ PGD block type
>> +     mov     r6, #4                          @ PTRS_PER_PGD
>> +1:   str     r3, [r0], #8                    @ set PGD entry
>> +     add     r3, r3, #0x1000                 @ next PMD table
>> +     subs    r6, r6, #1
>> +     bne     1b
>> +
>> +     add     r4, r4, #0x1000                 @ point to the PMD tables
>> +#endif
>> +
>>       ldr     r7, [r10, #PROCINFO_MM_MMUFLAGS] @ mm_mmuflags
>>
>>       /*
>> @@ -152,30 +179,30 @@ __create_page_tables:
>>       sub     r0, r0, r3                      @ virt->phys offset
>>       add     r5, r5, r0                      @ phys __enable_mmu
>>       add     r6, r6, r0                      @ phys __enable_mmu_end
>> -     mov     r5, r5, lsr #20
>> -     mov     r6, r6, lsr #20
>> +     mov     r5, r5, lsr #SECTION_SHIFT
>> +     mov     r6, r6, lsr #SECTION_SHIFT
>>
>> -1:   orr     r3, r7, r5, lsl #20             @ flags + kernel base
>> -     str     r3, [r4, r5, lsl #2]            @ identity mapping
>> -     teq     r5, r6
>> -     addne   r5, r5, #1                      @ next section
>> -     bne     1b
>> +1:   orr     r3, r7, r5, lsl #SECTION_SHIFT  @ flags + kernel base
>> +     str     r3, [r4, r5, lsl #PTE_WORDS]    @ identity mapping
>> +     cmp     r5, r6
>> +     addlo   r5, r5, #SECTION_SHIFT >> 20    @ next section
>> +     blo     1b
>>
>>       /*
>>        * Now setup the pagetables for our kernel direct
>>        * mapped region.
>>        */
>>       mov     r3, pc
>> -     mov     r3, r3, lsr #20
>> -     orr     r3, r7, r3, lsl #20
>> +     mov     r3, r3, lsr #SECTION_SHIFT
>> +     orr     r3, r7, r3, lsl #SECTION_SHIFT
>>       add     r0, r4,  #(KERNEL_START & 0xff000000) >> 18
>> -     str     r3, [r0, #(KERNEL_START & 0x00f00000) >> 18]!
>> +     str     r3, [r0, #(KERNEL_START & 0x00e00000) >> 18]!
>>       ldr     r6, =(KERNEL_END - 1)
>> -     add     r0, r0, #4
>> +     add     r0, r0, #1 << PTE_WORDS
>>       add     r6, r4, r6, lsr #18
>
> Are you sure these shifts by 18 places are correct?  They're actually
> (val >> SECTION_SHIFT) << 2, so maybe they should be (SECTION_SHIFT -
> PMD_WORDS) ?

SECTION_SHIFT - PMD_ORDER is (20 - 2) for classic page tables and (21
- 3) for LPAE. But we could change the 18 to an expression built from
those macros for clarity (the line would get long, though).

-- 
Catalin

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 08/20] ARM: LPAE: MMU setup for the 3-level page table format
  2010-11-23 11:38       ` Catalin Marinas
@ 2010-11-23 17:33         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 154+ messages in thread
From: Russell King - ARM Linux @ 2010-11-23 17:33 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arm-kernel, linux-kernel

On Tue, Nov 23, 2010 at 11:38:15AM +0000, Catalin Marinas wrote:
> On 22 November 2010 13:10, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > Are you sure these shifts by 18 places are correct?  They're actually
> > (val >> SECTION_SHIFT) << 2, so maybe they should be (SECTION_SHIFT -
> > PMD_WORDS) ?
> 
> SECTION_SHIFT - PMD_ORDER is (20 - 2) for classic page tables and (21
> - 3) for LPAE. But we could change the 18 to some macros for
> clarification (the line would be long though).

So yes, it's SECTION_SHIFT - PMD_ORDER, which is how they should be
used IMHO.  I don't see why another macro would be necessary.

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [PATCH v2 08/20] ARM: LPAE: MMU setup for the 3-level page table format
  2010-11-23 17:33         ` Russell King - ARM Linux
@ 2010-11-23 17:35           ` Catalin Marinas
  -1 siblings, 0 replies; 154+ messages in thread
From: Catalin Marinas @ 2010-11-23 17:35 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-arm-kernel, linux-kernel

On Tue, 2010-11-23 at 17:33 +0000, Russell King - ARM Linux wrote:
> On Tue, Nov 23, 2010 at 11:38:15AM +0000, Catalin Marinas wrote:
> > On 22 November 2010 13:10, Russell King - ARM Linux
> > <linux@arm.linux.org.uk> wrote:
> > > Are you sure these shifts by 18 places are correct?  They're actually
> > > (val >> SECTION_SHIFT) << 2, so maybe they should be (SECTION_SHIFT -
> > > PMD_WORDS) ?
> >
> > SECTION_SHIFT - PMD_ORDER is (20 - 2) for classic page tables and (21
> > - 3) for LPAE. But we could change the 18 to some macros for
> > clarification (the line would be long though).
> 
> So yes, it's SECTION_SHIFT - PMD_ORDER, which is how they should be
> used IMHO.  I don't see why another macro would be necessary.

I didn't mean adding another macro but using (SECTION_SHIFT - PMD_ORDER)
on a long line.

-- 
Catalin



^ permalink raw reply	[flat|nested] 154+ messages in thread

2010-11-22 13:19     ` Catalin Marinas
2010-11-22 13:19       ` Catalin Marinas
2010-11-22 13:32       ` Russell King - ARM Linux
2010-11-22 13:32         ` Russell King - ARM Linux
2010-11-22 13:38         ` Catalin Marinas
2010-11-22 13:38           ` Catalin Marinas
2010-11-12 18:00 ` [PATCH v2 12/20] ARM: LPAE: Add context switching support Catalin Marinas
2010-11-12 18:00   ` Catalin Marinas
2010-11-12 18:00 ` [PATCH v2 13/20] ARM: LPAE: Add SMP support for the 3-level page table format Catalin Marinas
2010-11-12 18:00   ` Catalin Marinas
2010-11-22 13:37   ` Russell King - ARM Linux
2010-11-22 13:37     ` Russell King - ARM Linux
2010-11-12 18:00 ` [PATCH v2 14/20] ARM: LPAE: use phys_addr_t instead of unsigned long for physical addresses Catalin Marinas
2010-11-12 18:00   ` Catalin Marinas
2010-11-12 18:00 ` [PATCH v2 15/20] ARM: LPAE: Use generic dma_addr_t type definition Catalin Marinas
2010-11-12 18:00   ` Catalin Marinas
2010-11-12 18:00 ` [PATCH v2 16/20] ARM: LPAE: mark memory banks with start > ULONG_MAX as highmem Catalin Marinas
2010-11-12 18:00   ` Catalin Marinas
2010-11-12 18:00 ` [PATCH v2 17/20] ARM: LPAE: use phys_addr_t for physical start address in early_mem Catalin Marinas
2010-11-12 18:00   ` Catalin Marinas
2010-11-12 18:00 ` [PATCH v2 18/20] ARM: LPAE: add support for ATAG_MEM64 Catalin Marinas
2010-11-12 18:00   ` Catalin Marinas
2010-11-12 18:00 ` [PATCH v2 19/20] ARM: LPAE: define printk format for physical addresses and page table entries Catalin Marinas
2010-11-12 18:00   ` Catalin Marinas
2010-11-22 13:43   ` Russell King - ARM Linux
2010-11-22 13:43     ` Russell King - ARM Linux
2010-11-22 13:49     ` Catalin Marinas
2010-11-22 13:49       ` Catalin Marinas
2010-11-12 18:00 ` [PATCH v2 20/20] ARM: LPAE: Add the Kconfig entries Catalin Marinas
2010-11-12 18:00   ` Catalin Marinas
2010-11-13 12:38   ` Sergei Shtylyov
2010-11-13 12:38     ` Sergei Shtylyov
2010-11-14 10:11     ` Catalin Marinas
2010-11-14 10:11       ` Catalin Marinas
