* [PATCHv2 00/18] arm64: mm: rework page table creation
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

Hi all,

This series reworks the arm64 early page table code, in order to:

(a) Avoid issues with potentially-conflicting TTBR1 TLB entries (as raised in
    Jeremy's thread [1]). This can happen when splitting/merging sections or
    contiguous ranges, and per a pessimistic reading of the ARM ARM may happen
    for changes to other fields in translation table entries.
    
(b) Allow for more complex page table creation early on, with tables created
    with fine-grained permissions as early as possible. In the cases where we
    currently use fine-grained permissions (e.g. DEBUG_RODATA and marking .init
    as non-executable), this is required for the same reasons as (a), as we
    must ensure that changes to page tables do not split/merge sections or
    contiguous regions for memory in active use.

(c) Avoid edge cases where we need to allocate memory before a sufficient
    proportion of the early linear map is in place to accommodate allocations.

This series:

* Introduces the necessary infrastructure to safely swap TTBR1_EL1 (i.e.
  without risking conflicting TLB entries being allocated). The arm64 KASAN
  code is migrated to this.

* Adds helpers to walk page tables by physical address, independent of the
  linear mapping, and modifies __create_mapping and friends to rely on a new
  set of FIX_{PGD,PUD,PMD,PTE} slots to map tables as required for modification.

* Removes the early memblock limit, now that create_mapping does not rely on the
  early linear map. This solves (c), and allows for (b).

* Generates an entirely new set of kernel page tables with fine-grained (i.e.
  page-level) permission boundaries, which can then be safely installed. These
  are created with sufficient granularity such that later changes (currently
  only fixup_init) will not split/merge sections or contiguous regions, and can
  follow a break-before-make approach without affecting the rest of the page
  tables.
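
Taken together, the intended shape of boot-time table creation becomes roughly
the following. This is a conceptual sketch only: cpu_replace_ttbr1 is added by
this series, but the other helper names here are illustrative rather than taken
from any one patch.

    /* Build a complete set of tables while they are still inactive. */
    pgd_t *new_pgd = alloc_new_pgd();                      /* illustrative */
    create_fine_grained_kernel_mappings(new_pgd);          /* illustrative */
    create_linear_mappings(new_pgd);                       /* illustrative */

    /*
     * Install the new tables with no window in which conflicting TLB
     * entries can be allocated: TTBR1_EL1 is swapped while running from
     * the idmap, with TLB invalidation in between.
     */
    cpu_replace_ttbr1(new_pgd);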

There are still opportunities for improvement:

* BUG() when splitting sections or creating overlapping entries in
  create_mapping, as these both indicate serious bugs in kernel page table
  creation.
  
  This will require rework of the EFI runtime services pagetable creation, as
  for kernels with >4K pages EFI memory descriptors may share pages (and currently
  such overlap is assumed to be benign).

* Use ROX mappings for the kernel text and rodata when creating the new tables.
  This avoids potential conflicts from changes to translation tables, and
  gives us better protections earlier.

  Currently the alternatives patching code relies on being able to use the
  kernel mapping to update the text. We cannot rely on any text which itself
  may be patched, and updates may straddle page boundaries, so this is
  non-trivial.

* Clean up usage of swapper_pg_dir so we can switch to the new tables without
  having to reuse the existing pgd. This will allow us to free the original
  pgd (i.e. we can free all the initial tables in one go).

Any and all feedback is welcome.

This series is based on today's arm64 [2] for-next/core branch (commit
c9cd0ed925c0b927), and this version is tagged as
arm64-pagetable-rework-20160104 while the latest version should be in the
unstable branch arm64/pagetable-rework in my git repo [3].

Since v1 [4] (tagged arm64-pagetable-rework-20151209):
* Drop patches taken into the arm64 tree.
* Rebase to arm64 for-next/core.
* Copy early KASAN tables.
* Fix KASAN pgd manipulation.
* Specialise allocators for page tables, in function and naming.
* Update comments.

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-November/386178.html
[2] git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
[3] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git
[4] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/392292.html

Mark Rutland (18):
  asm-generic: make __set_fixmap_offset a static inline
  arm64: mm: specialise pagetable allocators
  arm64: mm: place empty_zero_page in bss
  arm64: unify idmap removal
  arm64: unmap idmap earlier
  arm64: add function to install the idmap
  arm64: mm: add code to safely replace TTBR1_EL1
  arm64: kasan: avoid TLB conflicts
  arm64: mm: move pte_* macros
  arm64: mm: add functions to walk page tables by PA
  arm64: mm: avoid redundant __pa(__va(x))
  arm64: mm: add __{pud,pgd}_populate
  arm64: mm: add functions to walk tables in fixmap
  arm64: mm: use fixmap when creating page tables
  arm64: mm: allocate pagetables anywhere
  arm64: mm: allow passing a pgdir to alloc_init_*
  arm64: ensure _stext and _etext are page-aligned
  arm64: mm: create new fine-grained mappings at boot

 arch/arm64/include/asm/fixmap.h      |  10 ++
 arch/arm64/include/asm/kasan.h       |   3 +
 arch/arm64/include/asm/mmu_context.h |  63 ++++++-
 arch/arm64/include/asm/pgalloc.h     |  26 ++-
 arch/arm64/include/asm/pgtable.h     |  87 +++++++---
 arch/arm64/kernel/head.S             |   1 +
 arch/arm64/kernel/setup.c            |   7 +
 arch/arm64/kernel/smp.c              |   4 +-
 arch/arm64/kernel/suspend.c          |  20 +--
 arch/arm64/kernel/vmlinux.lds.S      |   5 +-
 arch/arm64/mm/kasan_init.c           |  32 ++--
 arch/arm64/mm/mmu.c                  | 311 ++++++++++++++++++-----------------
 arch/arm64/mm/proc.S                 |  27 +++
 include/asm-generic/fixmap.h         |  14 +-
 14 files changed, 381 insertions(+), 229 deletions(-)

-- 
1.9.1

* [PATCHv2 01/18] asm-generic: make __set_fixmap_offset a static inline
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

Currently __set_fixmap_offset is a macro function which has a local
variable called 'addr'. If a caller passes a 'phys' parameter which is
derived from a variable also called 'addr', the local variable will
shadow this, and the compiler will complain about the use of an
uninitialized variable.

It is likely that fixmap users may use the name 'addr' for variables
that may be directly passed to __set_fixmap_offset, or that may be
indirectly generated via other macros. Rather than placing the burden on
callers to avoid the name 'addr', this patch changes __set_fixmap_offset
into a static inline function, avoiding namespace collisions.
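
For illustration, a hypothetical caller of the sort described above (the
fixmap index and the some_phys_address() helper are arbitrary):

    unsigned long addr = some_phys_address();   /* caller's 'addr' */

    /*
     * With the macro version, 'phys' expands to 'addr', which then binds to
     * the macro's own uninitialized local 'addr' instead of the caller's
     * variable, triggering the warning.
     */
    void *va = (void *)set_fixmap_offset(FIX_TEXT_POKE0, addr);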

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 include/asm-generic/fixmap.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/asm-generic/fixmap.h b/include/asm-generic/fixmap.h
index 1cbb833..f9c27b6 100644
--- a/include/asm-generic/fixmap.h
+++ b/include/asm-generic/fixmap.h
@@ -70,13 +70,13 @@ static inline unsigned long virt_to_fix(const unsigned long vaddr)
 #endif
 
 /* Return a pointer with offset calculated */
-#define __set_fixmap_offset(idx, phys, flags)		      \
-({							      \
-	unsigned long addr;				      \
-	__set_fixmap(idx, phys, flags);			      \
-	addr = fix_to_virt(idx) + ((phys) & (PAGE_SIZE - 1)); \
-	addr;						      \
-})
+static inline unsigned long __set_fixmap_offset(enum fixed_addresses idx,
+						phys_addr_t phys,
+						pgprot_t flags)
+{
+	__set_fixmap(idx, phys, flags);
+	return fix_to_virt(idx) + (phys & (PAGE_SIZE - 1));
+}
 
 #define set_fixmap_offset(idx, phys) \
 	__set_fixmap_offset(idx, phys, FIXMAP_PAGE_NORMAL)
-- 
1.9.1

* [PATCHv2 02/18] arm64: mm: specialise pagetable allocators
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

We pass a size parameter to early_alloc and late_alloc, but these are
only ever used to allocate single pages. In late_alloc we always
allocate a single page.

Both allocators provide us with zeroed pages (such that all entries are
invalid), but we have no barriers between allocating a page and adding
that page to existing (live) tables. A concurrent page table walk may
see stale data, leading to a number of issues.

This patch specialises the two allocators for page tables. The size
parameter is removed and the necessary dsb(ishst) is folded into each.
To make it clear that the functions are intended for use for page table
allocation, they are renamed to {early_,late_}pgtable_alloc, with the
related function pointer renamed to pgtable_alloc.

As the dsb(ishst) is now in the allocator, the existing barrier for the
zero page is redundant and thus is removed. The previously missing
include of barrier.h is added.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/mm/mmu.c | 52 +++++++++++++++++++++++++++-------------------------
 1 file changed, 27 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 58faeaa..b25d5cb 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -30,6 +30,7 @@
 #include <linux/slab.h>
 #include <linux/stop_machine.h>
 
+#include <asm/barrier.h>
 #include <asm/cputype.h>
 #include <asm/fixmap.h>
 #include <asm/kernel-pgtable.h>
@@ -62,15 +63,18 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 }
 EXPORT_SYMBOL(phys_mem_access_prot);
 
-static void __init *early_alloc(unsigned long sz)
+static void __init *early_pgtable_alloc(void)
 {
 	phys_addr_t phys;
 	void *ptr;
 
-	phys = memblock_alloc(sz, sz);
+	phys = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
 	BUG_ON(!phys);
 	ptr = __va(phys);
-	memset(ptr, 0, sz);
+	memset(ptr, 0, PAGE_SIZE);
+
+	/* Ensure the zeroed page is visible to the page table walker */
+	dsb(ishst);
 	return ptr;
 }
 
@@ -95,12 +99,12 @@ static void split_pmd(pmd_t *pmd, pte_t *pte)
 static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
 				  unsigned long end, unsigned long pfn,
 				  pgprot_t prot,
-				  void *(*alloc)(unsigned long size))
+				  void *(*pgtable_alloc)(void))
 {
 	pte_t *pte;
 
 	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
-		pte = alloc(PTRS_PER_PTE * sizeof(pte_t));
+		pte = pgtable_alloc();
 		if (pmd_sect(*pmd))
 			split_pmd(pmd, pte);
 		__pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);
@@ -130,7 +134,7 @@ static void split_pud(pud_t *old_pud, pmd_t *pmd)
 static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 				  unsigned long addr, unsigned long end,
 				  phys_addr_t phys, pgprot_t prot,
-				  void *(*alloc)(unsigned long size))
+				  void *(*pgtable_alloc)(void))
 {
 	pmd_t *pmd;
 	unsigned long next;
@@ -139,7 +143,7 @@ static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 	 * Check for initial section mappings in the pgd/pud and remove them.
 	 */
 	if (pud_none(*pud) || pud_sect(*pud)) {
-		pmd = alloc(PTRS_PER_PMD * sizeof(pmd_t));
+		pmd = pgtable_alloc();
 		if (pud_sect(*pud)) {
 			/*
 			 * need to have the 1G of mappings continue to be
@@ -174,7 +178,7 @@ static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 			}
 		} else {
 			alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
-				       prot, alloc);
+				       prot, pgtable_alloc);
 		}
 		phys += next - addr;
 	} while (pmd++, addr = next, addr != end);
@@ -195,13 +199,13 @@ static inline bool use_1G_block(unsigned long addr, unsigned long next,
 static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 				  unsigned long addr, unsigned long end,
 				  phys_addr_t phys, pgprot_t prot,
-				  void *(*alloc)(unsigned long size))
+				  void *(*pgtable_alloc)(void))
 {
 	pud_t *pud;
 	unsigned long next;
 
 	if (pgd_none(*pgd)) {
-		pud = alloc(PTRS_PER_PUD * sizeof(pud_t));
+		pud = pgtable_alloc();
 		pgd_populate(mm, pgd, pud);
 	}
 	BUG_ON(pgd_bad(*pgd));
@@ -234,7 +238,8 @@ static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 				}
 			}
 		} else {
-			alloc_init_pmd(mm, pud, addr, next, phys, prot, alloc);
+			alloc_init_pmd(mm, pud, addr, next, phys, prot,
+				       pgtable_alloc);
 		}
 		phys += next - addr;
 	} while (pud++, addr = next, addr != end);
@@ -247,7 +252,7 @@ static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 static void  __create_mapping(struct mm_struct *mm, pgd_t *pgd,
 				    phys_addr_t phys, unsigned long virt,
 				    phys_addr_t size, pgprot_t prot,
-				    void *(*alloc)(unsigned long size))
+				    void *(*pgtable_alloc)(void))
 {
 	unsigned long addr, length, end, next;
 
@@ -265,18 +270,18 @@ static void  __create_mapping(struct mm_struct *mm, pgd_t *pgd,
 	end = addr + length;
 	do {
 		next = pgd_addr_end(addr, end);
-		alloc_init_pud(mm, pgd, addr, next, phys, prot, alloc);
+		alloc_init_pud(mm, pgd, addr, next, phys, prot, pgtable_alloc);
 		phys += next - addr;
 	} while (pgd++, addr = next, addr != end);
 }
 
-static void *late_alloc(unsigned long size)
+static void *late_pgtable_alloc(void)
 {
-	void *ptr;
-
-	BUG_ON(size > PAGE_SIZE);
-	ptr = (void *)__get_free_page(PGALLOC_GFP);
+	void *ptr = (void *)__get_free_page(PGALLOC_GFP);
 	BUG_ON(!ptr);
+
+	/* Ensure the zeroed page is visible to the page table walker */
+	dsb(ishst);
 	return ptr;
 }
 
@@ -289,7 +294,7 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
 		return;
 	}
 	__create_mapping(&init_mm, pgd_offset_k(virt), phys, virt,
-			 size, prot, early_alloc);
+			 size, prot, early_pgtable_alloc);
 }
 
 void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
@@ -297,7 +302,7 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 			       pgprot_t prot)
 {
 	__create_mapping(mm, pgd_offset(mm, virt), phys, virt, size, prot,
-				late_alloc);
+				late_pgtable_alloc);
 }
 
 static void create_mapping_late(phys_addr_t phys, unsigned long virt,
@@ -310,7 +315,7 @@ static void create_mapping_late(phys_addr_t phys, unsigned long virt,
 	}
 
 	return __create_mapping(&init_mm, pgd_offset_k(virt),
-				phys, virt, size, prot, late_alloc);
+				phys, virt, size, prot, late_pgtable_alloc);
 }
 
 #ifdef CONFIG_DEBUG_RODATA
@@ -460,15 +465,12 @@ void __init paging_init(void)
 	fixup_executable();
 
 	/* allocate the zero page. */
-	zero_page = early_alloc(PAGE_SIZE);
+	zero_page = early_pgtable_alloc();
 
 	bootmem_init();
 
 	empty_zero_page = virt_to_page(zero_page);
 
-	/* Ensure the zero page is visible to the page table walker */
-	dsb(ishst);
-
 	/*
 	 * TTBR0 is only used for the identity mapping at this stage. Make it
 	 * point to zero page to avoid speculatively fetching new entries.
-- 
1.9.1

* [PATCHv2 03/18] arm64: mm: place empty_zero_page in bss
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

Currently the zero page is set up in paging_init, and thus we cannot use
the zero page earlier. We use the zero page as a reserved TTBR value
from which no TLB entries may be allocated (e.g. when uninstalling the
idmap). To enable such usage earlier (as may be required for invasive
changes to the kernel page tables), and to minimise the time that the
idmap is active, we need to be able to use the zero page before
paging_init.

This patch follows the example set by x86, by allocating the zero page
at compile time, in .bss. This means that the zero page itself is
available immediately upon entry to start_kernel (as we zero .bss before
this), and also means that the zero page takes up no space in the raw
Image binary. The associated struct page is allocated in bootmem_init,
and remains unavailable until this time.

Outside of arch code, the only users of empty_zero_page assume that the
empty_zero_page symbol refers to the zeroed memory itself, and that
ZERO_PAGE(x) must be used to acquire the associated struct page,
following the example of x86. This patch also brings arm64 in line with
these assumptions.
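
As a rough illustration of the two usage patterns this aligns with (not code
from this patch):

    struct page *zp = ZERO_PAGE(0);     /* the associated struct page */

    /* empty_zero_page itself is the zeroed memory: */
    bool all_zero = !memchr_inv(empty_zero_page, 0, PAGE_SIZE);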

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/mmu_context.h | 2 +-
 arch/arm64/include/asm/pgtable.h     | 4 ++--
 arch/arm64/kernel/head.S             | 1 +
 arch/arm64/mm/mmu.c                  | 9 +--------
 4 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 2416578..600eacb 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -48,7 +48,7 @@ static inline void contextidr_thread_switch(struct task_struct *next)
  */
 static inline void cpu_set_reserved_ttbr0(void)
 {
-	unsigned long ttbr = page_to_phys(empty_zero_page);
+	unsigned long ttbr = virt_to_phys(empty_zero_page);
 
 	asm(
 	"	msr	ttbr0_el1, %0			// set TTBR0\n"
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 35a318c..382d627 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -121,8 +121,8 @@ extern void __pgd_error(const char *file, int line, unsigned long val);
  * ZERO_PAGE is a global shared page that is always zero: used
  * for zero-mapped memory areas etc..
  */
-extern struct page *empty_zero_page;
-#define ZERO_PAGE(vaddr)	(empty_zero_page)
+extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
+#define ZERO_PAGE(vaddr)	virt_to_page(empty_zero_page)
 
 #define pte_ERROR(pte)		__pte_error(__FILE__, __LINE__, pte_val(pte))
 
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index b363f34..9da5d0c 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -423,6 +423,7 @@ __mmap_switched:
 	str	xzr, [x6], #8			// Clear BSS
 	b	1b
 2:
+	dsb	ishst				// Make zero page visible to PTW
 	adr_l	sp, initial_sp, x4
 	mov	x4, sp
 	and	x4, x4, #~(THREAD_SIZE - 1)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index b25d5cb..cdbf055 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -49,7 +49,7 @@ u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
  * Empty_zero_page is a special page that is used for zero-initialized data
  * and COW.
  */
-struct page *empty_zero_page;
+unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss;
 EXPORT_SYMBOL(empty_zero_page);
 
 pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
@@ -459,18 +459,11 @@ void fixup_init(void)
  */
 void __init paging_init(void)
 {
-	void *zero_page;
-
 	map_mem();
 	fixup_executable();
 
-	/* allocate the zero page. */
-	zero_page = early_pgtable_alloc();
-
 	bootmem_init();
 
-	empty_zero_page = virt_to_page(zero_page);
-
 	/*
 	 * TTBR0 is only used for the identity mapping at this stage. Make it
 	 * point to zero page to avoid speculatively fetching new entries.
-- 
1.9.1

* [PATCHv2 04/18] arm64: unify idmap removal
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

We currently open-code the removal of the idmap and restoration of the
current task's MMU state in a few places.

Before introducing yet more copies of this sequence, unify these to call
a new helper, cpu_uninstall_idmap.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/mmu_context.h | 25 +++++++++++++++++++++++++
 arch/arm64/kernel/setup.c            |  1 +
 arch/arm64/kernel/smp.c              |  4 +---
 arch/arm64/kernel/suspend.c          | 20 ++++----------------
 arch/arm64/mm/mmu.c                  |  4 +---
 5 files changed, 32 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 600eacb..b1b2514 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -27,6 +27,7 @@
 #include <asm-generic/mm_hooks.h>
 #include <asm/cputype.h>
 #include <asm/pgtable.h>
+#include <asm/tlbflush.h>
 
 #ifdef CONFIG_PID_IN_CONTEXTIDR
 static inline void contextidr_thread_switch(struct task_struct *next)
@@ -90,6 +91,30 @@ static inline void cpu_set_default_tcr_t0sz(void)
 }
 
 /*
+ * Remove the idmap from TTBR0_EL1 and install the pgd of the active mm.
+ *
+ * The idmap lives in the same VA range as userspace, but uses global entries
+ * and may use a different TCR_EL1.T0SZ. To avoid issues resulting from
+ * speculative TLB fetches, we must temporarily install the reserved page
+ * tables while we invalidate the TLBs and set up the correct TCR_EL1.T0SZ.
+ *
+ * If current is a not a user task, the mm covers the TTBR1_EL1 page tables,
+ * which should not be installed in TTBR0_EL1. In this case we can leave the
+ * reserved page tables in place.
+ */
+static inline void cpu_uninstall_idmap(void)
+{
+	struct mm_struct *mm = current->active_mm;
+
+	cpu_set_reserved_ttbr0();
+	local_flush_tlb_all();
+	cpu_set_default_tcr_t0sz();
+
+	if (mm != &init_mm)
+		cpu_switch_mm(mm->pgd, mm);
+}
+
+/*
  * It would be nice to return ASIDs back to the allocator, but unfortunately
  * that introduces a race with a generation rollover where we could erroneously
  * free an ASID allocated in a future generation. We could workaround this by
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 8119479..f6621ba 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -62,6 +62,7 @@
 #include <asm/memblock.h>
 #include <asm/efi.h>
 #include <asm/xen/hypervisor.h>
+#include <asm/mmu_context.h>
 
 phys_addr_t __fdt_pointer __initdata;
 
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index b1adc51..68e7f79 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -149,9 +149,7 @@ asmlinkage void secondary_start_kernel(void)
 	 * TTBR0 is only used for the identity mapping at this stage. Make it
 	 * point to zero page to avoid speculatively fetching new entries.
 	 */
-	cpu_set_reserved_ttbr0();
-	local_flush_tlb_all();
-	cpu_set_default_tcr_t0sz();
+	cpu_uninstall_idmap();
 
 	preempt_disable();
 	trace_hardirqs_off();
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 1095aa4..6605539 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -60,7 +60,6 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
  */
 int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 {
-	struct mm_struct *mm = current->active_mm;
 	int ret;
 	unsigned long flags;
 
@@ -87,22 +86,11 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	ret = __cpu_suspend_enter(arg, fn);
 	if (ret == 0) {
 		/*
-		 * We are resuming from reset with TTBR0_EL1 set to the
-		 * idmap to enable the MMU; set the TTBR0 to the reserved
-		 * page tables to prevent speculative TLB allocations, flush
-		 * the local tlb and set the default tcr_el1.t0sz so that
-		 * the TTBR0 address space set-up is properly restored.
-		 * If the current active_mm != &init_mm we entered cpu_suspend
-		 * with mappings in TTBR0 that must be restored, so we switch
-		 * them back to complete the address space configuration
-		 * restoration before returning.
+		 * We are resuming from reset with the idmap active in TTBR0_EL1.
+		 * We must uninstall the idmap and restore the expected MMU
+		 * state before we can possibly return to userspace.
 		 */
-		cpu_set_reserved_ttbr0();
-		local_flush_tlb_all();
-		cpu_set_default_tcr_t0sz();
-
-		if (mm != &init_mm)
-			cpu_switch_mm(mm->pgd, mm);
+		cpu_uninstall_idmap();
 
 		/*
 		 * Restore per-cpu offset before any kernel
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index cdbf055..e85a719 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -468,9 +468,7 @@ void __init paging_init(void)
 	 * TTBR0 is only used for the identity mapping at this stage. Make it
 	 * point to zero page to avoid speculatively fetching new entries.
 	 */
-	cpu_set_reserved_ttbr0();
-	local_flush_tlb_all();
-	cpu_set_default_tcr_t0sz();
+	cpu_uninstall_idmap();
 }
 
 /*
-- 
1.9.1

* [PATCHv2 05/18] arm64: unmap idmap earlier
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

During boot we leave the idmap in place until paging_init, as we
previously had to wait for the zero page to become allocated and
accessible.

Now that we have a statically-allocated zero page, we can uninstall the
idmap much earlier in the boot process, making it far easier to spot
accidental use of physical addresses. This also brings the cold boot
path in line with the secondary boot path.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/setup.c | 6 ++++++
 arch/arm64/mm/mmu.c       | 6 ------
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index f6621ba..cfed56f 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -314,6 +314,12 @@ void __init setup_arch(char **cmdline_p)
 	 */
 	local_async_enable();
 
+	/*
+	 * TTBR0 is only used for the identity mapping at this stage. Make it
+	 * point to zero page to avoid speculatively fetching new entries.
+	 */
+	cpu_uninstall_idmap();
+
 	efi_init();
 	arm64_memblock_init();
 
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index e85a719..c3ea9df 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -463,12 +463,6 @@ void __init paging_init(void)
 	fixup_executable();
 
 	bootmem_init();
-
-	/*
-	 * TTBR0 is only used for the identity mapping at this stage. Make it
-	 * point to zero page to avoid speculatively fetching new entries.
-	 */
-	cpu_uninstall_idmap();
 }
 
 /*
-- 
1.9.1

* [PATCHv2 06/18] arm64: add function to install the idmap
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

In some cases (e.g. when making invasive changes to the kernel page
tables) we will need to execute code from the idmap.

Add a new helper which may be used to install the idmap, complementing
the existing cpu_uninstall_idmap.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/mmu_context.h | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index b1b2514..944f273 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -74,7 +74,7 @@ static inline bool __cpu_uses_extended_idmap(void)
 /*
  * Set TCR.T0SZ to its default value (based on VA_BITS)
  */
-static inline void cpu_set_default_tcr_t0sz(void)
+static inline void __cpu_set_tcr_t0sz(unsigned long t0sz)
 {
 	unsigned long tcr;
 
@@ -87,9 +87,12 @@ static inline void cpu_set_default_tcr_t0sz(void)
 	"	msr	tcr_el1, %0	;"
 	"	isb"
 	: "=&r" (tcr)
-	: "r"(TCR_T0SZ(VA_BITS)), "I"(TCR_T0SZ_OFFSET), "I"(TCR_TxSZ_WIDTH));
+	: "r"(t0sz), "I"(TCR_T0SZ_OFFSET), "I"(TCR_TxSZ_WIDTH));
 }
 
+#define cpu_set_default_tcr_t0sz()	__cpu_set_tcr_t0sz(TCR_T0SZ(VA_BITS))
+#define cpu_set_idmap_tcr_t0sz()	__cpu_set_tcr_t0sz(idmap_t0sz)
+
 /*
  * Remove the idmap from TTBR0_EL1 and install the pgd of the active mm.
  *
@@ -114,6 +117,15 @@ static inline void cpu_uninstall_idmap(void)
 		cpu_switch_mm(mm->pgd, mm);
 }
 
+static inline void cpu_install_idmap(void)
+{
+	cpu_set_reserved_ttbr0();
+	local_flush_tlb_all();
+	cpu_set_idmap_tcr_t0sz();
+
+	cpu_switch_mm(idmap_pg_dir, &init_mm);
+}
+
 /*
  * It would be nice to return ASIDs back to the allocator, but unfortunately
  * that introduces a race with a generation rollover where we could erroneously
-- 
1.9.1

* [PATCHv2 07/18] arm64: mm: add code to safely replace TTBR1_EL1
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

If page tables are modified without suitable TLB maintenance, the ARM
architecture permits multiple TLB entries to be allocated for the same
VA. When this occurs, it is permitted that TLB conflict aborts are
raised in response to synchronous data/instruction accesses, and/or an
amalgamation of the TLB entries may be used as a result of a TLB lookup.

The presence of conflicting TLB entries may result in a variety of
behaviours detrimental to the system (e.g. erroneous physical addresses
may be used by I-cache fetches and/or page table walks). Some of these
cases may result in unexpected changes of hardware state, and/or result
in the (asynchronous) delivery of SError.

To avoid these issues, we must avoid situations where conflicting
entries may be allocated into TLBs. For user and module mappings we can
follow a strict break-before-make approach, but this cannot work for
modifications to the swapper page tables that cover the kernel text and
data.

Instead, this patch adds code which is intended to be executed from the
idmap, which can safely unmap the swapper page tables as it only
requires the idmap to be active. This enables us to uninstall the active
TTBR1_EL1 entry, invalidate TLBs, then install a new TTBR1_EL1 entry
without potentially unmapping code or data required for the sequence.
This avoids the risk of conflict, but requires that updates are staged
in a copy of the swapper page tables prior to being installed.
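
A short usage sketch of the resulting helper, mirroring what the KASAN patch
later in this series does:

    /* Stage all updates in an inactive copy of the tables... */
    memcpy(tmp_pg_dir, swapper_pg_dir, sizeof(tmp_pg_dir));
    dsb(ishst);         /* make the copied entries visible to the walker */

    /*
     * ...then swap TTBR1_EL1 via the idmap, with no window in which old and
     * new (potentially conflicting) entries can coexist in the TLB.
     */
    cpu_replace_ttbr1(tmp_pg_dir);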

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/mmu_context.h | 20 ++++++++++++++++++++
 arch/arm64/mm/proc.S                 | 27 +++++++++++++++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 944f273..280ce2e 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -127,6 +127,26 @@ static inline void cpu_install_idmap(void)
 }
 
 /*
+ * Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD,
+ * avoiding the possibility of conflicting TLB entries being allocated.
+ */
+static inline void cpu_replace_ttbr1(pgd_t *pgd)
+{
+	typedef void (ttbr_replace_func)(phys_addr_t, phys_addr_t);
+	extern ttbr_replace_func idmap_cpu_replace_ttbr1;
+	ttbr_replace_func *replace_phys;
+
+	phys_addr_t pgd_phys = virt_to_phys(pgd);
+	phys_addr_t reserved_phys = virt_to_phys(empty_zero_page);
+
+	replace_phys = (void*)virt_to_phys(idmap_cpu_replace_ttbr1);
+
+	cpu_install_idmap();
+	replace_phys(pgd_phys, reserved_phys);
+	cpu_uninstall_idmap();
+}
+
+/*
  * It would be nice to return ASIDs back to the allocator, but unfortunately
  * that introduces a race with a generation rollover where we could erroneously
  * free an ASID allocated in a future generation. We could workaround this by
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index b6f9053..025dea5 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -139,6 +139,33 @@ ENTRY(cpu_do_switch_mm)
 	ret
 ENDPROC(cpu_do_switch_mm)
 
+	.pushsection ".idmap.text", "ax"
+/*
+ * void idmap_cpu_replace_ttbr1(phys_addr_t new_pgd, phys_addr_t reserved_pgd)
+ *
+ * This is the low-level counterpart to cpu_replace_ttbr1, and should not be
+ * called by anything else. It can only be executed from a TTBR0 mapping.
+ */
+ENTRY(idmap_cpu_replace_ttbr1)
+	mrs	x2, daif
+	msr	daifset, #0xf
+
+	msr	ttbr1_el1, x1
+	isb
+
+	tlbi	vmalle1
+	dsb	nsh
+	isb
+
+	msr	ttbr1_el1, x0
+	isb
+
+	msr	daif, x2
+
+	ret
+ENDPROC(idmap_cpu_replace_ttbr1)
+	.popsection
+
 /*
  *	__cpu_setup
  *
-- 
1.9.1

* [PATCHv2 08/18] arm64: kasan: avoid TLB conflicts
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

The page table modification performed during the KASAN init risks the
allocation of conflicting TLB entries, as it swaps a set of valid global
entries for another without suitable TLB maintenance.

The presence of conflicting TLB entries can result in the delivery of
synchronous TLB conflict aborts, or in erroneous data being returned in
response to a TLB lookup. This can affect
explicit data accesses from software as well as translations performed
asynchronously (e.g. as part of page table walks or speculative I-cache
fetches), and can therefore result in a wide variety of problems.

To avoid this, use cpu_replace_ttbr1 to swap the page tables. This
ensures that when the new tables are installed there are no stale
entries from the old tables which may conflict. As all updates are made
to the tables while they are not active, the updates themselves are
safe.

At the same time, add the missing barrier to ensure that the tmp_pg_dir
entries updated via memcpy are visible to the page table walkers at the
point the tmp_pg_dir is installed. All other page table updates made as
part of KASAN initialisation have the requisite barriers due to the use
of the standard page table accessors.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/mm/kasan_init.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index cf038c7..3e3d280 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -16,6 +16,7 @@
 #include <linux/memblock.h>
 #include <linux/start_kernel.h>
 
+#include <asm/mmu_context.h>
 #include <asm/page.h>
 #include <asm/pgalloc.h>
 #include <asm/pgtable.h>
@@ -108,15 +109,6 @@ static void __init clear_pgds(unsigned long start,
 		set_pgd(pgd_offset_k(start), __pgd(0));
 }
 
-static void __init cpu_set_ttbr1(unsigned long ttbr1)
-{
-	asm(
-	"	msr	ttbr1_el1, %0\n"
-	"	isb"
-	:
-	: "r" (ttbr1));
-}
-
 void __init kasan_init(void)
 {
 	struct memblock_region *reg;
@@ -129,8 +121,8 @@ void __init kasan_init(void)
 	 * setup will be finished.
 	 */
 	memcpy(tmp_pg_dir, swapper_pg_dir, sizeof(tmp_pg_dir));
-	cpu_set_ttbr1(__pa(tmp_pg_dir));
-	flush_tlb_all();
+	dsb(ishst);
+	cpu_replace_ttbr1(tmp_pg_dir);
 
 	clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END);
 
@@ -156,8 +148,7 @@ void __init kasan_init(void)
 	}
 
 	memset(kasan_zero_page, 0, PAGE_SIZE);
-	cpu_set_ttbr1(__pa(swapper_pg_dir));
-	flush_tlb_all();
+	cpu_replace_ttbr1(swapper_pg_dir);
 
 	/* At this point kasan is fully initialized. Enable error messages */
 	init_task.kasan_depth = 0;
-- 
1.9.1

* [PATCHv2 09/18] arm64: mm: move pte_* macros
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

For pmd, pud, and pgd levels of table, functions including p?d_index and
p?d_offset are defined after the p?d_page_vaddr function for the
immediately higher level of table.

The pte functions however are defined much earlier, even though several
rely on the later definition of pmd_page_vaddr. While this isn't
currently a problem as these are macros, it prevents the logical
grouping of later C functions (which cannot rely on prototypes for
functions not yet defined).

Move these definitions after pmd_page_vaddr, for consistency with the
placement of these functions for other levels of table.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 382d627..3603cca 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -134,16 +134,6 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
 #define pte_clear(mm,addr,ptep)	set_pte(ptep, __pte(0))
 #define pte_page(pte)		(pfn_to_page(pte_pfn(pte)))
 
-/* Find an entry in the third-level page table. */
-#define pte_index(addr)		(((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
-
-#define pte_offset_kernel(dir,addr)	(pmd_page_vaddr(*(dir)) + pte_index(addr))
-
-#define pte_offset_map(dir,addr)	pte_offset_kernel((dir), (addr))
-#define pte_offset_map_nested(dir,addr)	pte_offset_kernel((dir), (addr))
-#define pte_unmap(pte)			do { } while (0)
-#define pte_unmap_nested(pte)		do { } while (0)
-
 /*
  * The following only work if pte_present(). Undefined behaviour otherwise.
  */
@@ -441,6 +431,16 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 	return __va(pmd_val(pmd) & PHYS_MASK & (s32)PAGE_MASK);
 }
 
+/* Find an entry in the third-level page table. */
+#define pte_index(addr)		(((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
+
+#define pte_offset_kernel(dir,addr)	(pmd_page_vaddr(*(dir)) + pte_index(addr))
+
+#define pte_offset_map(dir,addr)	pte_offset_kernel((dir), (addr))
+#define pte_offset_map_nested(dir,addr)	pte_offset_kernel((dir), (addr))
+#define pte_unmap(pte)			do { } while (0)
+#define pte_unmap_nested(pte)		do { } while (0)
+
 #define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
 
 /*
-- 
1.9.1

* [PATCHv2 10/18] arm64: mm: add functions to walk page tables by PA
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

To allow us to walk tables allocated into the fixmap, we need to acquire
the physical address of a page, rather than the virtual address in the
linear map.

This patch adds new p??_page_paddr and p??_offset_phys functions to
acquire the physical address of a next-level table, and changes
p??_offset* into macros which simply convert this to a linear map VA.
This renders the p??_page_vaddr functions unused, and hence they are removed.

At the pgd level, a new pgd_offset_raw function is added to find the
relevant PGD entry given the base of a PGD and a virtual address.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 39 +++++++++++++++++++++++----------------
 1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 3603cca..f5742db 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -426,15 +426,16 @@ static inline void pmd_clear(pmd_t *pmdp)
 	set_pmd(pmdp, __pmd(0));
 }
 
-static inline pte_t *pmd_page_vaddr(pmd_t pmd)
+static inline phys_addr_t pmd_page_paddr(pmd_t pmd)
 {
-	return __va(pmd_val(pmd) & PHYS_MASK & (s32)PAGE_MASK);
+	return pmd_val(pmd) & PHYS_MASK & (s32)PAGE_MASK;
 }
 
 /* Find an entry in the third-level page table. */
 #define pte_index(addr)		(((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
 
-#define pte_offset_kernel(dir,addr)	(pmd_page_vaddr(*(dir)) + pte_index(addr))
+#define pte_offset_phys(dir,addr)	(pmd_page_paddr(*(dir)) + pte_index(addr) * sizeof(pte_t))
+#define pte_offset_kernel(dir,addr)	((pte_t *)__va(pte_offset_phys((dir), (addr))))
 
 #define pte_offset_map(dir,addr)	pte_offset_kernel((dir), (addr))
 #define pte_offset_map_nested(dir,addr)	pte_offset_kernel((dir), (addr))
@@ -469,21 +470,23 @@ static inline void pud_clear(pud_t *pudp)
 	set_pud(pudp, __pud(0));
 }
 
-static inline pmd_t *pud_page_vaddr(pud_t pud)
+static inline phys_addr_t pud_page_paddr(pud_t pud)
 {
-	return __va(pud_val(pud) & PHYS_MASK & (s32)PAGE_MASK);
+	return pud_val(pud) & PHYS_MASK & (s32)PAGE_MASK;
 }
 
 /* Find an entry in the second-level page table. */
 #define pmd_index(addr)		(((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
 
-static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
-{
-	return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(addr);
-}
+#define pmd_offset_phys(dir, addr)	(pud_page_paddr(*(dir)) + pmd_index(addr) * sizeof(pmd_t))
+#define pmd_offset(dir, addr)		((pmd_t *)__va(pmd_offset_phys((dir), (addr))))
 
 #define pud_page(pud)		pfn_to_page(__phys_to_pfn(pud_val(pud) & PHYS_MASK))
 
+#else
+
+#define pud_page_paddr(pud)	({ BUILD_BUG(); 0; })
+
 #endif	/* CONFIG_PGTABLE_LEVELS > 2 */
 
 #if CONFIG_PGTABLE_LEVELS > 3
@@ -505,21 +508,23 @@ static inline void pgd_clear(pgd_t *pgdp)
 	set_pgd(pgdp, __pgd(0));
 }
 
-static inline pud_t *pgd_page_vaddr(pgd_t pgd)
+static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
 {
-	return __va(pgd_val(pgd) & PHYS_MASK & (s32)PAGE_MASK);
+	return pgd_val(pgd) & PHYS_MASK & (s32)PAGE_MASK;
 }
 
 /* Find an entry in the frst-level page table. */
 #define pud_index(addr)		(((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
 
-static inline pud_t *pud_offset(pgd_t *pgd, unsigned long addr)
-{
-	return (pud_t *)pgd_page_vaddr(*pgd) + pud_index(addr);
-}
+#define pud_offset_phys(dir, addr)	(pgd_page_paddr(*(dir)) + pud_index(addr) * sizeof(pud_t))
+#define pud_offset(dir, addr)		((pud_t *)__va(pud_offset_phys((dir), (addr))))
 
 #define pgd_page(pgd)		pfn_to_page(__phys_to_pfn(pgd_val(pgd) & PHYS_MASK))
 
+#else
+
+#define pgd_page_paddr(pgd)	({ BUILD_BUG(); 0;})
+
 #endif  /* CONFIG_PGTABLE_LEVELS > 3 */
 
 #define pgd_ERROR(pgd)		__pgd_error(__FILE__, __LINE__, pgd_val(pgd))
@@ -527,7 +532,9 @@ static inline pud_t *pud_offset(pgd_t *pgd, unsigned long addr)
 /* to find an entry in a page-table-directory */
 #define pgd_index(addr)		(((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
 
-#define pgd_offset(mm, addr)	((mm)->pgd+pgd_index(addr))
+#define pgd_offset_raw(pgd, addr)	((pgd) + pgd_index(addr))
+
+#define pgd_offset(mm, addr)	(pgd_offset_raw((mm)->pgd, (addr)))
 
 /* to find an entry in a kernel page-table-directory */
 #define pgd_offset_k(addr)	pgd_offset(&init_mm, addr)
-- 
1.9.1

* [PATCHv2 11/18] arm64: mm: avoid redundant __pa(__va(x))
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

When we "upgrade" to a section mapping, we free any table we made
redundant by giving it back to memblock. To get the PA, we acquire the
physical address and convert this to a VA, then subsequently convert
this back to a PA.

This works currently, but will not work if the tables are not accessed
via linear map VAs (e.g. if we use fixmap slots).

This patch uses {pmd,pud}_page_paddr to acquire the PA. This avoids the
__pa(__va()) round trip, saving some work and avoiding reliance on the
linear mapping.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/mm/mmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index c3ea9df..8879aed 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -171,7 +171,7 @@ static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 			if (!pmd_none(old_pmd)) {
 				flush_tlb_all();
 				if (pmd_table(old_pmd)) {
-					phys_addr_t table = __pa(pte_offset_map(&old_pmd, 0));
+					phys_addr_t table = pmd_page_paddr(old_pmd);
 					if (!WARN_ON_ONCE(slab_is_available()))
 						memblock_free(table, PAGE_SIZE);
 				}
@@ -232,7 +232,7 @@ static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 			if (!pud_none(old_pud)) {
 				flush_tlb_all();
 				if (pud_table(old_pud)) {
-					phys_addr_t table = __pa(pmd_offset(&old_pud, 0));
+					phys_addr_t table = pud_page_paddr(old_pud);
 					if (!WARN_ON_ONCE(slab_is_available()))
 						memblock_free(table, PAGE_SIZE);
 				}
-- 
1.9.1

* [PATCHv2 12/18] arm64: mm: add __{pud,pgd}_populate
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

We currently have __pmd_populate for creating a pmd table entry given
the physical address of a pte, but don't have equivalents for the pud or
pgd levels of table.

To enable us to manipulate tables which are mapped outside of the linear
mapping (where we have a PA, but not a linear map VA), it is useful to
have these functions.

This patch adds __{pud,pgd}_populate. As these should not be called when
the kernel uses folded {pmd,pud}s, in these cases they expand to
BUILD_BUG(). So long as the appropriate checks are made on the {pud,pgd}
entry prior to attempting population, these should be optimized out at
compile time.
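
For example (a hedged caller sketch, not taken from this patch; pmd_phys is
illustrative): with a folded pmd, pud_none() is constant-false, so the call
below and its BUILD_BUG() are eliminated at compile time:

    if (pud_none(*pud))
        __pud_populate(pud, pmd_phys, PMD_TYPE_TABLE);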

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/pgalloc.h | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index c150539..ff98585 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -42,11 +42,20 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
 	free_page((unsigned long)pmd);
 }
 
-static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
+static inline void __pud_populate(pud_t *pud, phys_addr_t pmd, pudval_t prot)
 {
-	set_pud(pud, __pud(__pa(pmd) | PMD_TYPE_TABLE));
+	set_pud(pud, __pud(pmd | prot));
 }
 
+static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
+{
+	__pud_populate(pud, __pa(pmd), PMD_TYPE_TABLE);
+}
+#else
+static inline void __pud_populate(pud_t *pud, phys_addr_t pmd, pudval_t prot)
+{
+	BUILD_BUG();
+}
 #endif	/* CONFIG_PGTABLE_LEVELS > 2 */
 
 #if CONFIG_PGTABLE_LEVELS > 3
@@ -62,11 +71,20 @@ static inline void pud_free(struct mm_struct *mm, pud_t *pud)
 	free_page((unsigned long)pud);
 }
 
-static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
+static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pud, pgdval_t prot)
 {
-	set_pgd(pgd, __pgd(__pa(pud) | PUD_TYPE_TABLE));
+	set_pgd(pgdp, __pgd(pud | prot));
 }
 
+static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
+{
+	__pgd_populate(pgd, __pa(pud), PUD_TYPE_TABLE);
+}
+#else
+static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pud, pgdval_t prot)
+{
+	BUILD_BUG();
+}
 #endif	/* CONFIG_PGTABLE_LEVELS > 3 */
 
 extern pgd_t *pgd_alloc(struct mm_struct *mm);
-- 
1.9.1

* [PATCHv2 13/18] arm64: mm: add functions to walk tables in fixmap
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

As a preparatory step to allow us to allocate early page tables from
unmapped memory using memblock_alloc, add new p??_fixmap* functions that
can be used to walk page tables outside of the linear mapping by using
fixmap slots.
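
The intended usage pattern is of the following form (a sketch in the spirit of
the next patch; the variables are illustrative):

    /*
     * Map the pte table referenced by *pmdp into the FIX_PTE slot, modify
     * an entry, then tear the fixmap mapping down again.
     */
    pte_t *pte = pte_fixmap_offset(pmdp, addr);
    set_pte(pte, pfn_pte(pfn, prot));
    pte_fixmap_unmap();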

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/fixmap.h  | 10 ++++++++++
 arch/arm64/include/asm/pgtable.h | 26 ++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 3097045..1a617d4 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -62,6 +62,16 @@ enum fixed_addresses {
 
 	FIX_BTMAP_END = __end_of_permanent_fixed_addresses,
 	FIX_BTMAP_BEGIN = FIX_BTMAP_END + TOTAL_FIX_BTMAPS - 1,
+
+	/*
+	 * Used for kernel page table creation, so unmapped memory may be used
+	 * for tables.
+	 */
+	FIX_PTE,
+	FIX_PMD,
+	FIX_PUD,
+	FIX_PGD,
+
 	__end_of_fixed_addresses
 };
 
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f5742db..824e7f0 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -57,6 +57,7 @@
 
 #ifndef __ASSEMBLY__
 
+#include <asm/fixmap.h>
 #include <linux/mmdebug.h>
 
 extern void __pte_error(const char *file, int line, unsigned long val);
@@ -442,6 +443,10 @@ static inline phys_addr_t pmd_page_paddr(pmd_t pmd)
 #define pte_unmap(pte)			do { } while (0)
 #define pte_unmap_nested(pte)		do { } while (0)
 
+#define pte_fixmap(addr)		((pte_t *)set_fixmap_offset(FIX_PTE, addr))
+#define pte_fixmap_offset(pmd, addr)	pte_fixmap(pte_offset_phys(pmd, addr))
+#define pte_fixmap_unmap()		clear_fixmap(FIX_PTE)
+
 #define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
 
 /*
@@ -481,12 +486,21 @@ static inline phys_addr_t pud_page_paddr(pud_t pud)
 #define pmd_offset_phys(dir, addr)	(pud_page_paddr(*(dir)) + pmd_index(addr) * sizeof(pmd_t))
 #define pmd_offset(dir, addr)		((pmd_t *)__va(pmd_offset_phys((dir), (addr))))
 
+#define pmd_fixmap(addr)		((pmd_t *)set_fixmap_offset(FIX_PMD, addr))
+#define pmd_fixmap_offset(pud, addr)	pmd_fixmap(pmd_offset_phys(pud, addr))
+#define pmd_fixmap_unmap()		clear_fixmap(FIX_PMD)
+
 #define pud_page(pud)		pfn_to_page(__phys_to_pfn(pud_val(pud) & PHYS_MASK))
 
 #else
 
 #define pud_page_paddr(pud)	({ BUILD_BUG(); 0; })
 
+/* Match pmd_offset folding in <asm/generic/pgtable-nopmd.h> */
+#define pmd_fixmap(addr)		NULL
+#define pmd_fixmap_offset(pudp, addr)	((pmd_t *)pudp)
+#define pmd_fixmap_unmap()
+
 #endif	/* CONFIG_PGTABLE_LEVELS > 2 */
 
 #if CONFIG_PGTABLE_LEVELS > 3
@@ -519,12 +533,21 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
 #define pud_offset_phys(dir, addr)	(pgd_page_paddr(*(dir)) + pud_index(addr) * sizeof(pud_t))
 #define pud_offset(dir, addr)		((pud_t *)__va(pud_offset_phys((dir), (addr))))
 
+#define pud_fixmap(addr)		((pud_t *)set_fixmap_offset(FIX_PUD, addr))
+#define pud_fixmap_offset(pgd, addr)	pud_fixmap(pmd_offset_phys(pgd, addr))
+#define pud_fixmap_unmap()		clear_fixmap(FIX_PUD)
+
 #define pgd_page(pgd)		pfn_to_page(__phys_to_pfn(pgd_val(pgd) & PHYS_MASK))
 
 #else
 
 #define pgd_page_paddr(pgd)	({ BUILD_BUG(); 0;})
 
+/* Match pud_offset folding in <asm/generic/pgtable-nopud.h> */
+#define pud_fixmap(addr)		NULL
+#define pud_fixmap_offset(pgdp, addr)	((pud_t *)pgdp)
+#define pud_fixmap_unmap()
+
 #endif  /* CONFIG_PGTABLE_LEVELS > 3 */
 
 #define pgd_ERROR(pgd)		__pgd_error(__FILE__, __LINE__, pgd_val(pgd))
@@ -539,6 +562,9 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
 /* to find an entry in a kernel page-table-directory */
 #define pgd_offset_k(addr)	pgd_offset(&init_mm, addr)
 
+#define pgd_fixmap(addr)		((pgd_t *)set_fixmap_offset(FIX_PGD, addr))
+#define pgd_fixmap_unmap()		clear_fixmap(FIX_PGD)
+
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
 	const pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY |
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCHv2 14/18] arm64: mm: use fixmap when creating page tables
  2016-01-04 17:56 [PATCHv2 00/18] arm64: mm: rework page table creation Mark Rutland
                   ` (12 preceding siblings ...)
  2016-01-04 17:56 ` [PATCHv2 13/18] arm64: mm: add functions to walk tables in fixmap Mark Rutland
@ 2016-01-04 17:56 ` Mark Rutland
  2016-01-04 22:38   ` Laura Abbott
  2016-01-04 17:56 ` [PATCHv2 15/18] arm64: mm: allocate pagetables anywhere Mark Rutland
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 40+ messages in thread
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

As a prepratory step to allow us to allocate early page tables form
unmapped memory using memblock_alloc, modify the __create_mapping
callees to map and unmap the tables they modify using fixmap entries.

All but the top-level pgd initialisation is performed via the fixmap.
Subsequent patches will inject the pgd physical address, and migrate to
using the FIX_PGD slot.
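
The resulting idiom in each callee (as can be seen in the diff below) is
to map the table being modified, update its entries, then unmap it again,
e.g. for the pte level:

	pte_t *pte = pte_fixmap_offset(pmd, addr);

	do {
		set_pte(pte, pfn_pte(pfn, prot));
		pfn++;
	} while (pte++, addr += PAGE_SIZE, addr != end);

	pte_fixmap_unmap();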

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/pgtable.h |  2 +-
 arch/arm64/mm/mmu.c              | 63 ++++++++++++++++++++++++++--------------
 2 files changed, 42 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 824e7f0..6fbf9fa 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -534,7 +534,7 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
 #define pud_offset(dir, addr)		((pud_t *)__va(pud_offset_phys((dir), (addr))))
 
 #define pud_fixmap(addr)		((pud_t *)set_fixmap_offset(FIX_PUD, addr))
-#define pud_fixmap_offset(pgd, addr)	pud_fixmap(pmd_offset_phys(pgd, addr))
+#define pud_fixmap_offset(pgd, addr)	pud_fixmap(pud_offset_phys(pgd, addr))
 #define pud_fixmap_unmap()		clear_fixmap(FIX_PUD)
 
 #define pgd_page(pgd)		pfn_to_page(__phys_to_pfn(pgd_val(pgd) & PHYS_MASK))
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8879aed..bf09d44 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -63,19 +63,30 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 }
 EXPORT_SYMBOL(phys_mem_access_prot);
 
-static void __init *early_pgtable_alloc(void)
+static phys_addr_t __init early_pgtable_alloc(void)
 {
 	phys_addr_t phys;
 	void *ptr;
 
 	phys = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
 	BUG_ON(!phys);
-	ptr = __va(phys);
+
+	/*
+	 * The FIX_{PGD,PUD,PMD} slots may be in active use, but the FIX_PTE
+	 * slot will be free, so we can (ab)use the FIX_PTE slot to initialise
+	 * any level of table.
+	 */
+	ptr = pte_fixmap(phys);
+
 	memset(ptr, 0, PAGE_SIZE);
 
-	/* Ensure the zeroed page is visible to the page table walker */
-	dsb(ishst);
-	return ptr;
+	/* 
+	 * Implicit barriers also ensure the zeroed page is visible to the page
+	 * table walker
+	 */
+	pte_fixmap_unmap();
+
+	return phys;
 }
 
 /*
@@ -99,24 +110,28 @@ static void split_pmd(pmd_t *pmd, pte_t *pte)
 static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
 				  unsigned long end, unsigned long pfn,
 				  pgprot_t prot,
-				  void *(*pgtable_alloc)(void))
+				  phys_addr_t (*pgtable_alloc)(void))
 {
 	pte_t *pte;
 
 	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
-		pte = pgtable_alloc();
+		phys_addr_t pte_phys = pgtable_alloc();
+		pte = pte_fixmap(pte_phys);
 		if (pmd_sect(*pmd))
 			split_pmd(pmd, pte);
-		__pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);
+		__pmd_populate(pmd, pte_phys, PMD_TYPE_TABLE);
 		flush_tlb_all();
+		pte_fixmap_unmap();
 	}
 	BUG_ON(pmd_bad(*pmd));
 
-	pte = pte_offset_kernel(pmd, addr);
+	pte = pte_fixmap_offset(pmd, addr);
 	do {
 		set_pte(pte, pfn_pte(pfn, prot));
 		pfn++;
 	} while (pte++, addr += PAGE_SIZE, addr != end);
+
+	pte_fixmap_unmap();
 }
 
 static void split_pud(pud_t *old_pud, pmd_t *pmd)
@@ -134,7 +149,7 @@ static void split_pud(pud_t *old_pud, pmd_t *pmd)
 static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 				  unsigned long addr, unsigned long end,
 				  phys_addr_t phys, pgprot_t prot,
-				  void *(*pgtable_alloc)(void))
+				  phys_addr_t (*pgtable_alloc)(void))
 {
 	pmd_t *pmd;
 	unsigned long next;
@@ -143,7 +158,8 @@ static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 	 * Check for initial section mappings in the pgd/pud and remove them.
 	 */
 	if (pud_none(*pud) || pud_sect(*pud)) {
-		pmd = pgtable_alloc();
+		phys_addr_t pmd_phys = pgtable_alloc();
+		pmd = pmd_fixmap(pmd_phys);
 		if (pud_sect(*pud)) {
 			/*
 			 * need to have the 1G of mappings continue to be
@@ -151,12 +167,13 @@ static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 			 */
 			split_pud(pud, pmd);
 		}
-		pud_populate(mm, pud, pmd);
+		__pud_populate(pud, pmd_phys, PUD_TYPE_TABLE);
 		flush_tlb_all();
+		pmd_fixmap_unmap();
 	}
 	BUG_ON(pud_bad(*pud));
 
-	pmd = pmd_offset(pud, addr);
+	pmd = pmd_fixmap_offset(pud, addr);
 	do {
 		next = pmd_addr_end(addr, end);
 		/* try section mapping first */
@@ -182,6 +199,8 @@ static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 		}
 		phys += next - addr;
 	} while (pmd++, addr = next, addr != end);
+
+	pmd_fixmap_unmap();
 }
 
 static inline bool use_1G_block(unsigned long addr, unsigned long next,
@@ -199,18 +218,18 @@ static inline bool use_1G_block(unsigned long addr, unsigned long next,
 static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 				  unsigned long addr, unsigned long end,
 				  phys_addr_t phys, pgprot_t prot,
-				  void *(*pgtable_alloc)(void))
+				  phys_addr_t (*pgtable_alloc)(void))
 {
 	pud_t *pud;
 	unsigned long next;
 
 	if (pgd_none(*pgd)) {
-		pud = pgtable_alloc();
-		pgd_populate(mm, pgd, pud);
+		phys_addr_t pud_phys = pgtable_alloc();
+		__pgd_populate(pgd, pud_phys, PUD_TYPE_TABLE);
 	}
 	BUG_ON(pgd_bad(*pgd));
 
-	pud = pud_offset(pgd, addr);
+	pud = pud_fixmap_offset(pgd, addr);
 	do {
 		next = pud_addr_end(addr, end);
 
@@ -243,6 +262,8 @@ static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 		}
 		phys += next - addr;
 	} while (pud++, addr = next, addr != end);
+
+	pud_fixmap_unmap();
 }
 
 /*
@@ -252,7 +273,7 @@ static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 static void  __create_mapping(struct mm_struct *mm, pgd_t *pgd,
 				    phys_addr_t phys, unsigned long virt,
 				    phys_addr_t size, pgprot_t prot,
-				    void *(*pgtable_alloc)(void))
+				    phys_addr_t (*pgtable_alloc)(void))
 {
 	unsigned long addr, length, end, next;
 
@@ -275,14 +296,12 @@ static void  __create_mapping(struct mm_struct *mm, pgd_t *pgd,
 	} while (pgd++, addr = next, addr != end);
 }
 
-static void *late_pgtable_alloc(void)
+static phys_addr_t late_pgtable_alloc(void)
 {
 	void *ptr = (void *)__get_free_page(PGALLOC_GFP);
 	BUG_ON(!ptr);
-
-	/* Ensure the zeroed page is visible to the page table walker */
 	dsb(ishst);
-	return ptr;
+	return __pa(ptr);
 }
 
 static void __init create_mapping(phys_addr_t phys, unsigned long virt,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCHv2 15/18] arm64: mm: allocate pagetables anywhere
  2016-01-04 17:56 [PATCHv2 00/18] arm64: mm: rework page table creation Mark Rutland
                   ` (13 preceding siblings ...)
  2016-01-04 17:56 ` [PATCHv2 14/18] arm64: mm: use fixmap when creating page tables Mark Rutland
@ 2016-01-04 17:56 ` Mark Rutland
  2016-01-04 17:56 ` [PATCHv2 16/18] arm64: mm: allow passing a pgdir to alloc_init_* Mark Rutland
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 40+ messages in thread
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

Now that create_mapping uses fixmap slots to modify pte, pmd, and pud
entries, we can access page tables anywhere in physical memory,
regardless of the extent of the linear mapping.

Given that, we no longer need to limit memblock allocations during page
table creation, and can leave the limit as its default
MEMBLOCK_ALLOC_ANYWHERE.

We never add memory which would fall outside of the linear map range,
provided that phys_offset and MAX_MEMBLOCK_ADDR are configured
appropriately, so any tables we create will fall within the linear map of
the final tables.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/mm/mmu.c | 35 -----------------------------------
 1 file changed, 35 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index bf09d44..28bc764 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -382,20 +382,6 @@ static void __init __map_memblock(phys_addr_t start, phys_addr_t end)
 static void __init map_mem(void)
 {
 	struct memblock_region *reg;
-	phys_addr_t limit;
-
-	/*
-	 * Temporarily limit the memblock range. We need to do this as
-	 * create_mapping requires puds, pmds and ptes to be allocated from
-	 * memory addressable from the initial direct kernel mapping.
-	 *
-	 * The initial direct kernel mapping, located at swapper_pg_dir, gives
-	 * us PUD_SIZE (with SECTION maps) or PMD_SIZE (without SECTION maps,
-	 * memory starting from PHYS_OFFSET (which must be aligned to 2MB as
-	 * per Documentation/arm64/booting.txt).
-	 */
-	limit = PHYS_OFFSET + SWAPPER_INIT_MAP_SIZE;
-	memblock_set_current_limit(limit);
 
 	/* map all the memory banks */
 	for_each_memblock(memory, reg) {
@@ -407,29 +393,8 @@ static void __init map_mem(void)
 		if (memblock_is_nomap(reg))
 			continue;
 
-		if (ARM64_SWAPPER_USES_SECTION_MAPS) {
-			/*
-			 * For the first memory bank align the start address and
-			 * current memblock limit to prevent create_mapping() from
-			 * allocating pte page tables from unmapped memory. With
-			 * the section maps, if the first block doesn't end on section
-			 * size boundary, create_mapping() will try to allocate a pte
-			 * page, which may be returned from an unmapped area.
-			 * When section maps are not used, the pte page table for the
-			 * current limit is already present in swapper_pg_dir.
-			 */
-			if (start < limit)
-				start = ALIGN(start, SECTION_SIZE);
-			if (end < limit) {
-				limit = end & SECTION_MASK;
-				memblock_set_current_limit(limit);
-			}
-		}
 		__map_memblock(start, end);
 	}
-
-	/* Limit no longer required. */
-	memblock_set_current_limit(MEMBLOCK_ALLOC_ANYWHERE);
 }
 
 static void __init fixup_executable(void)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCHv2 16/18] arm64: mm: allow passing a pgdir to alloc_init_*
  2016-01-04 17:56 [PATCHv2 00/18] arm64: mm: rework page table creation Mark Rutland
                   ` (14 preceding siblings ...)
  2016-01-04 17:56 ` [PATCHv2 15/18] arm64: mm: allocate pagetables anywhere Mark Rutland
@ 2016-01-04 17:56 ` Mark Rutland
  2016-01-04 17:56 ` [PATCHv2 17/18] arm64: ensure _stext and _etext are page-aligned Mark Rutland
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 40+ messages in thread
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

To allow us to initialise pgdirs which are fixmapped, allow explicitly
passing a pgdir rather than an mm. A new __create_pgd_mapping function
is added for this, with existing __create_mapping callers migrated to
this.

The mm argument was previously only used at the top level. Now that it
is redundant at all levels, it is removed. To indicate its newfound
similarity to alloc_init_{pud,pmd,pte}, __create_mapping is renamed to
init_pgd.
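
As a sketch of the new interface (the helper names match the diff below;
phys, virt, size and prot are placeholders), a caller holding only the
physical address of a pgdir can now do:

	pgd_t *pgd = pgd_fixmap(pgd_phys);

	__create_pgd_mapping(pgd, phys, virt, size, prot,
			     early_pgtable_alloc);

	pgd_fixmap_unmap();

The final patch in the series uses this pattern to populate a freshly
allocated set of kernel page tables.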

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/mm/mmu.c | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 28bc764..f3cd8f4 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -146,8 +146,7 @@ static void split_pud(pud_t *old_pud, pmd_t *pmd)
 	} while (pmd++, i++, i < PTRS_PER_PMD);
 }
 
-static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
-				  unsigned long addr, unsigned long end,
+static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
 				  phys_addr_t phys, pgprot_t prot,
 				  phys_addr_t (*pgtable_alloc)(void))
 {
@@ -215,8 +214,7 @@ static inline bool use_1G_block(unsigned long addr, unsigned long next,
 	return true;
 }
 
-static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
-				  unsigned long addr, unsigned long end,
+static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
 				  phys_addr_t phys, pgprot_t prot,
 				  phys_addr_t (*pgtable_alloc)(void))
 {
@@ -257,7 +255,7 @@ static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 				}
 			}
 		} else {
-			alloc_init_pmd(mm, pud, addr, next, phys, prot,
+			alloc_init_pmd(pud, addr, next, phys, prot,
 				       pgtable_alloc);
 		}
 		phys += next - addr;
@@ -270,8 +268,7 @@ static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
  * Create the page directory entries and any necessary page tables for the
  * mapping specified by 'md'.
  */
-static void  __create_mapping(struct mm_struct *mm, pgd_t *pgd,
-				    phys_addr_t phys, unsigned long virt,
+static void init_pgd(pgd_t *pgd, phys_addr_t phys, unsigned long virt,
 				    phys_addr_t size, pgprot_t prot,
 				    phys_addr_t (*pgtable_alloc)(void))
 {
@@ -291,7 +288,7 @@ static void  __create_mapping(struct mm_struct *mm, pgd_t *pgd,
 	end = addr + length;
 	do {
 		next = pgd_addr_end(addr, end);
-		alloc_init_pud(mm, pgd, addr, next, phys, prot, pgtable_alloc);
+		alloc_init_pud(pgd, addr, next, phys, prot, pgtable_alloc);
 		phys += next - addr;
 	} while (pgd++, addr = next, addr != end);
 }
@@ -304,6 +301,14 @@ static phys_addr_t late_pgtable_alloc(void)
 	return __pa(ptr);
 }
 
+static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
+				 unsigned long virt, phys_addr_t size,
+				 pgprot_t prot,
+				 phys_addr_t (*alloc)(void))
+{
+	init_pgd(pgd_offset_raw(pgdir, virt), phys, virt, size, prot, alloc);
+}
+
 static void __init create_mapping(phys_addr_t phys, unsigned long virt,
 				  phys_addr_t size, pgprot_t prot)
 {
@@ -312,16 +317,16 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
 			&phys, virt);
 		return;
 	}
-	__create_mapping(&init_mm, pgd_offset_k(virt), phys, virt,
-			 size, prot, early_pgtable_alloc);
+	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot,
+			     early_pgtable_alloc);
 }
 
 void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 			       unsigned long virt, phys_addr_t size,
 			       pgprot_t prot)
 {
-	__create_mapping(mm, pgd_offset(mm, virt), phys, virt, size, prot,
-				late_pgtable_alloc);
+	__create_pgd_mapping(mm->pgd, phys, virt, size, prot,
+			     late_pgtable_alloc);
 }
 
 static void create_mapping_late(phys_addr_t phys, unsigned long virt,
@@ -333,8 +338,8 @@ static void create_mapping_late(phys_addr_t phys, unsigned long virt,
 		return;
 	}
 
-	return __create_mapping(&init_mm, pgd_offset_k(virt),
-				phys, virt, size, prot, late_pgtable_alloc);
+	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot,
+			     late_pgtable_alloc);
 }
 
 #ifdef CONFIG_DEBUG_RODATA
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCHv2 17/18] arm64: ensure _stext and _etext are page-aligned
  2016-01-04 17:56 [PATCHv2 00/18] arm64: mm: rework page table creation Mark Rutland
                   ` (15 preceding siblings ...)
  2016-01-04 17:56 ` [PATCHv2 16/18] arm64: mm: allow passing a pgdir to alloc_init_* Mark Rutland
@ 2016-01-04 17:56 ` Mark Rutland
  2016-01-04 17:56 ` [PATCHv2 18/18] arm64: mm: create new fine-grained mappings at boot Mark Rutland
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 40+ messages in thread
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

Currently we have separate ALIGN_DEBUG_RO{,_MIN} directives to align
_etext and __init_begin. While we ensure that __init_begin is
page-aligned, we do not provide the same guarantee for _etext. This is
not problematic currently as the alignment of __init_begin is sufficient
to prevent issues when we modify permissions.

Subsequent patches will assume page alignment of segments of the kernel
we wish to map with different permissions. To ensure this, move _etext
after the ALIGN_DEBUG_RO_MIN for the init section. This renders the
prior ALIGN_DEBUG_RO irrelevant, and hence it is removed. Likewise,
upgrade to ALIGN_DEBUG_RO_MIN(PAGE_SIZE) for _stext.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/vmlinux.lds.S | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index f943a84..7de6c39 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -94,7 +94,7 @@ SECTIONS
 		_text = .;
 		HEAD_TEXT
 	}
-	ALIGN_DEBUG_RO
+	ALIGN_DEBUG_RO_MIN(PAGE_SIZE)
 	.text : {			/* Real text segment		*/
 		_stext = .;		/* Text and read-only data	*/
 			__exception_text_start = .;
@@ -115,10 +115,9 @@ SECTIONS
 	RO_DATA(PAGE_SIZE)
 	EXCEPTION_TABLE(8)
 	NOTES
-	ALIGN_DEBUG_RO
-	_etext = .;			/* End of text and rodata section */
 
 	ALIGN_DEBUG_RO_MIN(PAGE_SIZE)
+	_etext = .;			/* End of text and rodata section */
 	__init_begin = .;
 
 	INIT_TEXT_SECTION(8)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCHv2 18/18] arm64: mm: create new fine-grained mappings at boot
  2016-01-04 17:56 [PATCHv2 00/18] arm64: mm: rework page table creation Mark Rutland
                   ` (16 preceding siblings ...)
  2016-01-04 17:56 ` [PATCHv2 17/18] arm64: ensure _stext and _etext are page-aligned Mark Rutland
@ 2016-01-04 17:56 ` Mark Rutland
  2016-01-05  1:08 ` [PATCHv2 00/18] arm64: mm: rework page table creation Laura Abbott
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 40+ messages in thread
From: Mark Rutland @ 2016-01-04 17:56 UTC (permalink / raw)
  To: linux-arm-kernel

At boot we may change the granularity of the tables mapping the kernel
(by splitting or making sections). This may happen when we create the
linear mapping (in __map_memblock), or at any point we try to apply
fine-grained permissions to the kernel (e.g. fixup_executable,
mark_rodata_ro, fixup_init).

Changing the active page tables in this manner may result in multiple
entries for the same address being allocated into TLBs, risking problems
such as TLB conflict aborts or issues derived from the amalgamation of
TLB entries. Generally, a break-before-make (BBM) approach is necessary
to avoid conflicts, but we cannot do this for the kernel tables as it
risks unmapping text or data being used to do so.

Instead, we can create a new set of tables from scratch in the safety of
the existing mappings, and subsequently migrate over to these using the
new cpu_replace_ttbr1 helper, which avoids the two sets of tables being
active simultaneously.

To avoid issues when we later modify permissions of the page tables
(e.g. in fixup_init), we must create the page tables at a granularity
such that later modification does not result in splitting of tables.

This patch applies this strategy, creating a new set of fine-grained
page tables from scratch, and safely migrating to them. The existing
fixmap and kasan shadow page tables are reused in the new fine-grained
tables.
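
In outline, the paging_init changes below implement this as:

	/* allocate a fresh pgd and map it via the fixmap */
	phys_addr_t pgd_phys = early_pgtable_alloc();
	pgd_t *pgd = pgd_fixmap(pgd_phys);

	/* build the fine-grained tables while the old ones are still live */
	map_kernel(pgd);
	map_mem(pgd);

	/* switch over, then copy into swapper_pg_dir and switch again */
	cpu_replace_ttbr1(__va(pgd_phys));
	memcpy(swapper_pg_dir, pgd, PAGE_SIZE);
	cpu_replace_ttbr1(swapper_pg_dir);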

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/kasan.h |   3 +
 arch/arm64/mm/kasan_init.c     |  15 ++++
 arch/arm64/mm/mmu.c            | 153 ++++++++++++++++++++++++-----------------
 3 files changed, 109 insertions(+), 62 deletions(-)

diff --git a/arch/arm64/include/asm/kasan.h b/arch/arm64/include/asm/kasan.h
index 2774fa3..de0d212 100644
--- a/arch/arm64/include/asm/kasan.h
+++ b/arch/arm64/include/asm/kasan.h
@@ -7,6 +7,7 @@
 
 #include <linux/linkage.h>
 #include <asm/memory.h>
+#include <asm/pgtable-types.h>
 
 /*
  * KASAN_SHADOW_START: beginning of the kernel virtual addresses.
@@ -28,10 +29,12 @@
 #define KASAN_SHADOW_OFFSET     (KASAN_SHADOW_END - (1ULL << (64 - 3)))
 
 void kasan_init(void);
+void kasan_copy_shadow(pgd_t *pgdir);
 asmlinkage void kasan_early_init(void);
 
 #else
 static inline void kasan_init(void) { }
+static inline void kasan_copy_shadow(pgd_t *pgdir) { }
 #endif
 
 #endif
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 3e3d280..0ca411f 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -97,6 +97,21 @@ asmlinkage void __init kasan_early_init(void)
 	kasan_map_early_shadow();
 }
 
+/*
+ * Copy the current shadow region into a new pgdir.
+ */
+void __init kasan_copy_shadow(pgd_t *pgdir)
+{
+	pgd_t *pgd, *pgd_new, *pgd_end;
+
+	pgd = pgd_offset_k(KASAN_SHADOW_START);
+	pgd_end = pgd_offset_k(KASAN_SHADOW_END);
+	pgd_new = pgd_offset_raw(pgdir, KASAN_SHADOW_START);
+	do {
+		set_pgd(pgd_new, *pgd);
+	} while (pgd++, pgd_new++, pgd != pgd_end);
+}
+
 static void __init clear_pgds(unsigned long start,
 			unsigned long end)
 {
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index f3cd8f4..e141762 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -33,6 +33,7 @@
 #include <asm/barrier.h>
 #include <asm/cputype.h>
 #include <asm/fixmap.h>
+#include <asm/kasan.h>
 #include <asm/kernel-pgtable.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
@@ -342,49 +343,42 @@ static void create_mapping_late(phys_addr_t phys, unsigned long virt,
 			     late_pgtable_alloc);
 }
 
-#ifdef CONFIG_DEBUG_RODATA
-static void __init __map_memblock(phys_addr_t start, phys_addr_t end)
+static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end)
 {
+
+	unsigned long kernel_start = __pa(_stext);
+	unsigned long kernel_end = __pa(_end);
+
 	/*
-	 * Set up the executable regions using the existing section mappings
-	 * for now. This will get more fine grained later once all memory
-	 * is mapped
+	 * The kernel itself is mapped at page granularity. Map all other
+	 * memory, making sure we don't overwrite the existing kernel mappings.
 	 */
-	unsigned long kernel_x_start = round_down(__pa(_stext), SWAPPER_BLOCK_SIZE);
-	unsigned long kernel_x_end = round_up(__pa(__init_end), SWAPPER_BLOCK_SIZE);
-
-	if (end < kernel_x_start) {
-		create_mapping(start, __phys_to_virt(start),
-			end - start, PAGE_KERNEL);
-	} else if (start >= kernel_x_end) {
-		create_mapping(start, __phys_to_virt(start),
-			end - start, PAGE_KERNEL);
-	} else {
-		if (start < kernel_x_start)
-			create_mapping(start, __phys_to_virt(start),
-				kernel_x_start - start,
-				PAGE_KERNEL);
-		create_mapping(kernel_x_start,
-				__phys_to_virt(kernel_x_start),
-				kernel_x_end - kernel_x_start,
-				PAGE_KERNEL_EXEC);
-		if (kernel_x_end < end)
-			create_mapping(kernel_x_end,
-				__phys_to_virt(kernel_x_end),
-				end - kernel_x_end,
-				PAGE_KERNEL);
+
+	/* No overlap with the kernel. */
+	if (end < kernel_start || start >= kernel_end) {
+		__create_pgd_mapping(pgd, start, __phys_to_virt(start),
+				     end - start, PAGE_KERNEL,
+				     early_pgtable_alloc);
+		return;
 	}
 
+	/*
+	 * This block overlaps the kernel mapping. Map the portion(s) which
+	 * don't overlap.
+	 */
+	if (start < kernel_start)
+		__create_pgd_mapping(pgd, start,
+				     __phys_to_virt(start),
+				     kernel_start - start, PAGE_KERNEL,
+				     early_pgtable_alloc);
+	if (kernel_end < end)
+		__create_pgd_mapping(pgd, kernel_end,
+				     __phys_to_virt(kernel_end),
+				     end - kernel_end, PAGE_KERNEL,
+				     early_pgtable_alloc);
 }
-#else
-static void __init __map_memblock(phys_addr_t start, phys_addr_t end)
-{
-	create_mapping(start, __phys_to_virt(start), end - start,
-			PAGE_KERNEL_EXEC);
-}
-#endif
 
-static void __init map_mem(void)
+static void __init map_mem(pgd_t *pgd)
 {
 	struct memblock_region *reg;
 
@@ -398,33 +392,10 @@ static void __init map_mem(void)
 		if (memblock_is_nomap(reg))
 			continue;
 
-		__map_memblock(start, end);
+		__map_memblock(pgd, start, end);
 	}
 }
 
-static void __init fixup_executable(void)
-{
-#ifdef CONFIG_DEBUG_RODATA
-	/* now that we are actually fully mapped, make the start/end more fine grained */
-	if (!IS_ALIGNED((unsigned long)_stext, SWAPPER_BLOCK_SIZE)) {
-		unsigned long aligned_start = round_down(__pa(_stext),
-							 SWAPPER_BLOCK_SIZE);
-
-		create_mapping(aligned_start, __phys_to_virt(aligned_start),
-				__pa(_stext) - aligned_start,
-				PAGE_KERNEL);
-	}
-
-	if (!IS_ALIGNED((unsigned long)__init_end, SWAPPER_BLOCK_SIZE)) {
-		unsigned long aligned_end = round_up(__pa(__init_end),
-							  SWAPPER_BLOCK_SIZE);
-		create_mapping(__pa(__init_end), (unsigned long)__init_end,
-				aligned_end - __pa(__init_end),
-				PAGE_KERNEL);
-	}
-#endif
-}
-
 #ifdef CONFIG_DEBUG_RODATA
 void mark_rodata_ro(void)
 {
@@ -442,14 +413,72 @@ void fixup_init(void)
 			PAGE_KERNEL);
 }
 
+static void __init map_kernel_chunk(pgd_t *pgd, void *va_start, void *va_end,
+				    pgprot_t prot)
+{
+	phys_addr_t pa_start = __pa(va_start);
+	unsigned long size = va_end - va_start;
+
+	BUG_ON(!PAGE_ALIGNED(pa_start));
+	BUG_ON(!PAGE_ALIGNED(size));
+
+	__create_pgd_mapping(pgd, pa_start, (unsigned long)va_start, size, prot,
+			     early_pgtable_alloc);
+}
+
+/*
+ * Create fine-grained mappings for the kernel.
+ */
+static void __init map_kernel(pgd_t *pgd)
+{
+
+	map_kernel_chunk(pgd, _stext, _etext, PAGE_KERNEL_EXEC);
+	map_kernel_chunk(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC);
+	map_kernel_chunk(pgd, _data, _end, PAGE_KERNEL);
+
+	/*
+	 * The fixmap falls in a separate pgd to the kernel, and doesn't live
+	 * in the carveout for the swapper_pg_dir. We can simply re-use the
+	 * existing dir for the fixmap.
+	 */
+	set_pgd(pgd_offset_raw(pgd, FIXADDR_START), *pgd_offset_k(FIXADDR_START));
+
+	kasan_copy_shadow(pgd);
+}
+
 /*
  * paging_init() sets up the page tables, initialises the zone memory
  * maps and sets up the zero page.
  */
 void __init paging_init(void)
 {
-	map_mem();
-	fixup_executable();
+	phys_addr_t pgd_phys = early_pgtable_alloc();
+	pgd_t *pgd = pgd_fixmap(pgd_phys);
+
+	map_kernel(pgd);
+	map_mem(pgd);
+
+	/*
+	 * We want to reuse the original swapper_pg_dir so we don't have to
+	 * communicate the new address to non-coherent secondaries in
+	 * secondary_entry, and so cpu_switch_mm can generate the address with
+	 * adrp+add rather than a load from some global variable.
+	 *
+	 * To do this we need to go via a temporary pgd.
+	 */
+	cpu_replace_ttbr1(__va(pgd_phys));
+	memcpy(swapper_pg_dir, pgd, PAGE_SIZE);
+	cpu_replace_ttbr1(swapper_pg_dir);
+
+	pgd_fixmap_unmap();
+	memblock_free(pgd_phys, PAGE_SIZE);
+
+	/*
+	 * We only reuse the PGD from the swapper_pg_dir, not the pud + pmd
+	 * allocated with it.
+	 */
+	memblock_free(__pa(swapper_pg_dir) + PAGE_SIZE,
+		      SWAPPER_DIR_SIZE - PAGE_SIZE);
 
 	bootmem_init();
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCHv2 14/18] arm64: mm: use fixmap when creating page tables
  2016-01-04 17:56 ` [PATCHv2 14/18] arm64: mm: use fixmap when creating page tables Mark Rutland
@ 2016-01-04 22:38   ` Laura Abbott
  2016-01-05 10:40     ` Mark Rutland
  0 siblings, 1 reply; 40+ messages in thread
From: Laura Abbott @ 2016-01-04 22:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 01/04/2016 09:56 AM, Mark Rutland wrote:
> As a prepratory step to allow us to allocate early page tables form
"page tables from"

> unmapped memory using memblock_alloc, modify the __create_mapping
> callees to map and unmap the tables they modify using fixmap entries.
>
> All but the top-level pgd initialisation is performed via the fixmap.
> Subsequent patches will inject the pgd physical address, and migrate to
> using the FIX_PGD slot.
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Jeremy Linton <jeremy.linton@arm.com>
> Cc: Laura Abbott <labbott@fedoraproject.org>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>   arch/arm64/include/asm/pgtable.h |  2 +-
>   arch/arm64/mm/mmu.c              | 63 ++++++++++++++++++++++++++--------------
>   2 files changed, 42 insertions(+), 23 deletions(-)
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 824e7f0..6fbf9fa 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -534,7 +534,7 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
>   #define pud_offset(dir, addr)		((pud_t *)__va(pud_offset_phys((dir), (addr))))
>
>   #define pud_fixmap(addr)		((pud_t *)set_fixmap_offset(FIX_PUD, addr))
> -#define pud_fixmap_offset(pgd, addr)	pud_fixmap(pmd_offset_phys(pgd, addr))
> +#define pud_fixmap_offset(pgd, addr)	pud_fixmap(pud_offset_phys(pgd, addr))

Was this supposed to be folded into a previous patch?

>   #define pud_fixmap_unmap()		clear_fixmap(FIX_PUD)
>
>   #define pgd_page(pgd)		pfn_to_page(__phys_to_pfn(pgd_val(pgd) & PHYS_MASK))
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 8879aed..bf09d44 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -63,19 +63,30 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>   }
>   EXPORT_SYMBOL(phys_mem_access_prot);
>
> -static void __init *early_pgtable_alloc(void)
> +static phys_addr_t __init early_pgtable_alloc(void)
>   {
>   	phys_addr_t phys;
>   	void *ptr;
>
>   	phys = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
>   	BUG_ON(!phys);
> -	ptr = __va(phys);
> +
> +	/*
> +	 * The FIX_{PGD,PUD,PMD} slots may be in active use, but the FIX_PTE
> +	 * slot will be free, so we can (ab)use the FIX_PTE slot to initialise
> +	 * any level of table.
> +	 */
> +	ptr = pte_fixmap(phys);
> +
>   	memset(ptr, 0, PAGE_SIZE);
>
> -	/* Ensure the zeroed page is visible to the page table walker */
> -	dsb(ishst);
> -	return ptr;
> +	/*
> +	 * Implicit barriers also ensure the zeroed page is visible to the page
> +	 * table walker
> +	 */
> +	pte_fixmap_unmap();
> +
> +	return phys;
>   }
>
>   /*
> @@ -99,24 +110,28 @@ static void split_pmd(pmd_t *pmd, pte_t *pte)
>   static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
>   				  unsigned long end, unsigned long pfn,
>   				  pgprot_t prot,
> -				  void *(*pgtable_alloc)(void))
> +				  phys_addr_t (*pgtable_alloc)(void))
>   {
>   	pte_t *pte;
>
>   	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
> -		pte = pgtable_alloc();
> +		phys_addr_t pte_phys = pgtable_alloc();
> +		pte = pte_fixmap(pte_phys);
>   		if (pmd_sect(*pmd))
>   			split_pmd(pmd, pte);
> -		__pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);
> +		__pmd_populate(pmd, pte_phys, PMD_TYPE_TABLE);
>   		flush_tlb_all();
> +		pte_fixmap_unmap();
>   	}
>   	BUG_ON(pmd_bad(*pmd));
>
> -	pte = pte_offset_kernel(pmd, addr);
> +	pte = pte_fixmap_offset(pmd, addr);
>   	do {
>   		set_pte(pte, pfn_pte(pfn, prot));
>   		pfn++;
>   	} while (pte++, addr += PAGE_SIZE, addr != end);
> +
> +	pte_fixmap_unmap();
>   }
>
>   static void split_pud(pud_t *old_pud, pmd_t *pmd)
> @@ -134,7 +149,7 @@ static void split_pud(pud_t *old_pud, pmd_t *pmd)
>   static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
>   				  unsigned long addr, unsigned long end,
>   				  phys_addr_t phys, pgprot_t prot,
> -				  void *(*pgtable_alloc)(void))
> +				  phys_addr_t (*pgtable_alloc)(void))
>   {
>   	pmd_t *pmd;
>   	unsigned long next;
> @@ -143,7 +158,8 @@ static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
>   	 * Check for initial section mappings in the pgd/pud and remove them.
>   	 */
>   	if (pud_none(*pud) || pud_sect(*pud)) {
> -		pmd = pgtable_alloc();
> +		phys_addr_t pmd_phys = pgtable_alloc();
> +		pmd = pmd_fixmap(pmd_phys);
>   		if (pud_sect(*pud)) {
>   			/*
>   			 * need to have the 1G of mappings continue to be
> @@ -151,12 +167,13 @@ static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
>   			 */
>   			split_pud(pud, pmd);
>   		}
> -		pud_populate(mm, pud, pmd);
> +		__pud_populate(pud, pmd_phys, PUD_TYPE_TABLE);
>   		flush_tlb_all();
> +		pmd_fixmap_unmap();
>   	}
>   	BUG_ON(pud_bad(*pud));
>
> -	pmd = pmd_offset(pud, addr);
> +	pmd = pmd_fixmap_offset(pud, addr);
>   	do {
>   		next = pmd_addr_end(addr, end);
>   		/* try section mapping first */
> @@ -182,6 +199,8 @@ static void alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
>   		}
>   		phys += next - addr;
>   	} while (pmd++, addr = next, addr != end);
> +
> +	pmd_fixmap_unmap();
>   }
>
>   static inline bool use_1G_block(unsigned long addr, unsigned long next,
> @@ -199,18 +218,18 @@ static inline bool use_1G_block(unsigned long addr, unsigned long next,
>   static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
>   				  unsigned long addr, unsigned long end,
>   				  phys_addr_t phys, pgprot_t prot,
> -				  void *(*pgtable_alloc)(void))
> +				  phys_addr_t (*pgtable_alloc)(void))
>   {
>   	pud_t *pud;
>   	unsigned long next;
>
>   	if (pgd_none(*pgd)) {
> -		pud = pgtable_alloc();
> -		pgd_populate(mm, pgd, pud);
> +		phys_addr_t pud_phys = pgtable_alloc();
> +		__pgd_populate(pgd, pud_phys, PUD_TYPE_TABLE);
>   	}
>   	BUG_ON(pgd_bad(*pgd));
>
> -	pud = pud_offset(pgd, addr);
> +	pud = pud_fixmap_offset(pgd, addr);
>   	do {
>   		next = pud_addr_end(addr, end);
>
> @@ -243,6 +262,8 @@ static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
>   		}
>   		phys += next - addr;
>   	} while (pud++, addr = next, addr != end);
> +
> +	pud_fixmap_unmap();
>   }
>
>   /*
> @@ -252,7 +273,7 @@ static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
>   static void  __create_mapping(struct mm_struct *mm, pgd_t *pgd,
>   				    phys_addr_t phys, unsigned long virt,
>   				    phys_addr_t size, pgprot_t prot,
> -				    void *(*pgtable_alloc)(void))
> +				    phys_addr_t (*pgtable_alloc)(void))
>   {
>   	unsigned long addr, length, end, next;
>
> @@ -275,14 +296,12 @@ static void  __create_mapping(struct mm_struct *mm, pgd_t *pgd,
>   	} while (pgd++, addr = next, addr != end);
>   }
>
> -static void *late_pgtable_alloc(void)
> +static phys_addr_t late_pgtable_alloc(void)
>   {
>   	void *ptr = (void *)__get_free_page(PGALLOC_GFP);
>   	BUG_ON(!ptr);
> -
> -	/* Ensure the zeroed page is visible to the page table walker */
>   	dsb(ishst);

This dropped the comment but not the actual barrier, was that intentional?

> -	return ptr;
> +	return __pa(ptr);
>   }
>
>   static void __init create_mapping(phys_addr_t phys, unsigned long virt,
>

Thanks,
Laura

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 13/18] arm64: mm: add functions to walk tables in fixmap
  2016-01-04 17:56 ` [PATCHv2 13/18] arm64: mm: add functions to walk tables in fixmap Mark Rutland
@ 2016-01-04 22:49   ` Laura Abbott
  2016-01-05 11:08     ` Mark Rutland
  0 siblings, 1 reply; 40+ messages in thread
From: Laura Abbott @ 2016-01-04 22:49 UTC (permalink / raw)
  To: linux-arm-kernel

On 01/04/2016 09:56 AM, Mark Rutland wrote:
> As a preparatory step to allow us to allocate early page tables from
> unmapped memory using memblock_alloc, add new p??_fixmap* functions that
> can be used to walk page tables outside of the linear mapping by using
> fixmap slots.
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Jeremy Linton <jeremy.linton@arm.com>
> Cc: Laura Abbott <labbott@fedoraproject.org>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>   arch/arm64/include/asm/fixmap.h  | 10 ++++++++++
>   arch/arm64/include/asm/pgtable.h | 26 ++++++++++++++++++++++++++
>   2 files changed, 36 insertions(+)
>
> diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
> index 3097045..1a617d4 100644
> --- a/arch/arm64/include/asm/fixmap.h
> +++ b/arch/arm64/include/asm/fixmap.h
> @@ -62,6 +62,16 @@ enum fixed_addresses {
>
>   	FIX_BTMAP_END = __end_of_permanent_fixed_addresses,
>   	FIX_BTMAP_BEGIN = FIX_BTMAP_END + TOTAL_FIX_BTMAPS - 1,
> +
> +	/*
> +	 * Used for kernel page table creation, so unmapped memory may be used
> +	 * for tables.
> +	 */
> +	FIX_PTE,
> +	FIX_PMD,
> +	FIX_PUD,
> +	FIX_PGD,
> +
>   	__end_of_fixed_addresses
>   };
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index f5742db..824e7f0 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -57,6 +57,7 @@
>
>   #ifndef __ASSEMBLY__
>
> +#include <asm/fixmap.h>
>   #include <linux/mmdebug.h>
>
>   extern void __pte_error(const char *file, int line, unsigned long val);
> @@ -442,6 +443,10 @@ static inline phys_addr_t pmd_page_paddr(pmd_t pmd)
>   #define pte_unmap(pte)			do { } while (0)
>   #define pte_unmap_nested(pte)		do { } while (0)
>
> +#define pte_fixmap(addr)		((pte_t *)set_fixmap_offset(FIX_PTE, addr))
> +#define pte_fixmap_offset(pmd, addr)	pte_fixmap(pte_offset_phys(pmd, addr))
> +#define pte_fixmap_unmap()		clear_fixmap(FIX_PTE)
> +
>   #define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
>
>   /*
> @@ -481,12 +486,21 @@ static inline phys_addr_t pud_page_paddr(pud_t pud)
>   #define pmd_offset_phys(dir, addr)	(pud_page_paddr(*(dir)) + pmd_index(addr) * sizeof(pmd_t))
>   #define pmd_offset(dir, addr)		((pmd_t *)__va(pmd_offset_phys((dir), (addr))))
>
> +#define pmd_fixmap(addr)		((pmd_t *)set_fixmap_offset(FIX_PMD, addr))
> +#define pmd_fixmap_offset(pud, addr)	pmd_fixmap(pmd_offset_phys(pud, addr))
> +#define pmd_fixmap_unmap()		clear_fixmap(FIX_PMD)
> +
>   #define pud_page(pud)		pfn_to_page(__phys_to_pfn(pud_val(pud) & PHYS_MASK))
>
>   #else
>
>   #define pud_page_paddr(pud)	({ BUILD_BUG(); 0; })
>
> +/* Match pmd_offset folding in <asm/generic/pgtable-nopmd.h> */
> +#define pmd_fixmap(addr)		NULL
> +#define pmd_fixmap_offset(pudp, addr)	((pmd_t *)pudp)
> +#define pmd_fixmap_unmap()
> +
>   #endif	/* CONFIG_PGTABLE_LEVELS > 2 */
>
>   #if CONFIG_PGTABLE_LEVELS > 3
> @@ -519,12 +533,21 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
>   #define pud_offset_phys(dir, addr)	(pgd_page_paddr(*(dir)) + pud_index(addr) * sizeof(pud_t))
>   #define pud_offset(dir, addr)		((pud_t *)__va(pud_offset_phys((dir), (addr))))
>
> +#define pud_fixmap(addr)		((pud_t *)set_fixmap_offset(FIX_PUD, addr))
> +#define pud_fixmap_offset(pgd, addr)	pud_fixmap(pmd_offset_phys(pgd, addr))
> +#define pud_fixmap_unmap()		clear_fixmap(FIX_PUD)
> +
>   #define pgd_page(pgd)		pfn_to_page(__phys_to_pfn(pgd_val(pgd) & PHYS_MASK))
>
>   #else
>
>   #define pgd_page_paddr(pgd)	({ BUILD_BUG(); 0;})
>
> +/* Match pud_offset folding in <asm/generic/pgtable-nopud.h> */
> +#define pud_fixmap(addr)		NULL
> +#define pud_fixmap_offset(pgdp, addr)	((pud_t *)pgdp)
> +#define pud_fixmap_unmap()
> +
>   #endif  /* CONFIG_PGTABLE_LEVELS > 3 */
>
>   #define pgd_ERROR(pgd)		__pgd_error(__FILE__, __LINE__, pgd_val(pgd))
> @@ -539,6 +562,9 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
>   /* to find an entry in a kernel page-table-directory */
>   #define pgd_offset_k(addr)	pgd_offset(&init_mm, addr)
>
> +#define pgd_fixmap(addr)		((pgd_t *)set_fixmap_offset(FIX_PGD, addr))
> +#define pgd_fixmap_unmap()		clear_fixmap(FIX_PGD)
> +
>   static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
>   {
>   	const pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY |
>

Bikeshed: p??_fixmap_offset doesn't make it obvious that this is an
operation with a side effect. It seems more similar to
p??_offset_kernel which is read only. Perhaps it's the lack of set/map
in the name.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 00/18] arm64: mm: rework page table creation
  2016-01-04 17:56 [PATCHv2 00/18] arm64: mm: rework page table creation Mark Rutland
                   ` (17 preceding siblings ...)
  2016-01-04 17:56 ` [PATCHv2 18/18] arm64: mm: create new fine-grained mappings at boot Mark Rutland
@ 2016-01-05  1:08 ` Laura Abbott
  2016-01-05 11:54   ` Mark Rutland
  2016-01-06 10:24 ` Catalin Marinas
  2016-01-18 14:47 ` Ard Biesheuvel
  20 siblings, 1 reply; 40+ messages in thread
From: Laura Abbott @ 2016-01-05  1:08 UTC (permalink / raw)
  To: linux-arm-kernel

On 01/04/2016 09:56 AM, Mark Rutland wrote:
> Hi all,
>
> This series reworks the arm64 early page table code, in order to:
>
> (a) Avoid issues with potentially-conflicting TTBR1 TLB entries (as raised in
>      Jeremy's thread [1]). This can happen when splitting/merging sections or
>      contiguous ranges, and per a pessimistic reading of the ARM ARM may happen
>      for changes to other fields in translation table entries.
>
> (b) Allow for more complex page table creation early on, with tables created
>      with fine-grained permissions as early as possible. In the cases where we
>      currently use fine-grained permissions (e.g. DEBUG_RODATA and marking .init
>      as non-executable), this is required for the same reasons as (a), as we
>      must ensure that changes to page tables do not split/merge sections or
>      contiguous regions for memory in active use.
>
> (c) Avoid edge cases where we need to allocate memory before a sufficient
>      proportion of the early linear map is in place to accommodate allocations.
>
> This series:
>
> * Introduces the necessary infrastructure to safely swap TTBR1_EL1 (i.e.
>    without risking conflicting TLB entries being allocated). The arm64 KASAN
>    code is migrated to this.
>
> * Adds helpers to walk page tables by physical address, independent of the
>    linear mapping, and modifies __create_mapping and friends to relying on a new
>    set of FIX_{PGD,PUD,PMD,PTE} to map tables as required for modification.
>
> * Removes the early memblock limit, now that create_mapping does not rely on the
>    early linear map. This solves (c), and allows for (b).
>
> * Generates an entirely new set of kernel page tables with fine-grained (i.e.
>    page-level) permission boundaries, which can then be safely installed. These
>    are created with sufficient granularity such that later changes (currently
>    only fixup_init) will not split/merge sections or contiguous regions, and can
>    follow a break-before-make approach without affecting the rest of the page
>    tables.
>
> There are still opportunities for improvement:
>
> * BUG() when splitting sections or creating overlapping entries in
>    create_mapping, as these both indicate serious bugs in kernel page table
>    creation.
>
>    This will require rework to the EFI runtime services pagetable creation, as
>    for >4K page kernels EFI memory descriptors may share pages (and currently
>    such overlap is assumed to be benign).

Given the split_{pmd,pud} were added for DEBUG_RODATA, is there any reason
those can't be dropped now since it sounds like the EFI problem is for overlapping
entries and not splitting?

>
> * Use ROX mappings for the kernel text and rodata when creating the new tables.
>    This avoiding potential conflicts from changes to translation tables, and
>    giving us better protections earlier.
>
>    Currently the alternatives patching code relies on being able to use the
>    kernel mapping to update the text. We cannot rely on any text which itself
>    may be patched, and updates may straddle page boundaries, so this is
>    non-trivial.
>
> * Clean up usage of swapper_pg_dir so we can switch to the new tables without
>    having to reuse the existing pgd. This will allow us to free the original
>    pgd (i.e. we can free all the initial tables in one go).
>
> Any and all feedback is welcome.

This series points out that my attempt to allow set_memory_* to
work on regular kernel memory[1] is broken right now because it breaks down
the larger block sizes. Do you have any suggestions for a cleaner approach
short of requiring all memory mapped with 4K pages? The only solution I see
right now is having a separate copy of page tables to switch to. Any
other idea I come up with would have problems if we tried to invalidate an
entry before breaking it down.

Thanks,
Laura

[1]https://lkml.kernel.org/g/<1447207057-11323-1-git-send-email-labbott@fedoraproject.org>

>
> This series is based on today's arm64 [2] for-next/core branch (commit
> c9cd0ed925c0b927), and this version is tagged as
> arm64-pagetable-rework-20160104 while the latest version should be in the
> unstable branch arm64/pagetable-rework in my git repo [3].
>
> Since v1 [4] (tagged arm64-pagetable-rework-20151209):
> * Drop patches taken into the arm64 tree.
> * Rebase to arm64 for-next/core.
> * Copy early KASAN tables.
> * Fix KASAN pgd manipulation.
> * Specialise allocators for page tables, in function and naming.
> * Update comments.
>
> Thanks,
> Mark.
>
> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-November/386178.html
> [2] git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
> [3] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git
> [4] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/392292.html
>
> Mark Rutland (18):
>    asm-generic: make __set_fixmap_offset a static inline
>    arm64: mm: specialise pagetable allocators
>    arm64: mm: place empty_zero_page in bss
>    arm64: unify idmap removal
>    arm64: unmap idmap earlier
>    arm64: add function to install the idmap
>    arm64: mm: add code to safely replace TTBR1_EL1
>    arm64: kasan: avoid TLB conflicts
>    arm64: mm: move pte_* macros
>    arm64: mm: add functions to walk page tables by PA
>    arm64: mm: avoid redundant __pa(__va(x))
>    arm64: mm: add __{pud,pgd}_populate
>    arm64: mm: add functions to walk tables in fixmap
>    arm64: mm: use fixmap when creating page tables
>    arm64: mm: allocate pagetables anywhere
>    arm64: mm: allow passing a pgdir to alloc_init_*
>    arm64: ensure _stext and _etext are page-aligned
>    arm64: mm: create new fine-grained mappings at boot
>
>   arch/arm64/include/asm/fixmap.h      |  10 ++
>   arch/arm64/include/asm/kasan.h       |   3 +
>   arch/arm64/include/asm/mmu_context.h |  63 ++++++-
>   arch/arm64/include/asm/pgalloc.h     |  26 ++-
>   arch/arm64/include/asm/pgtable.h     |  87 +++++++---
>   arch/arm64/kernel/head.S             |   1 +
>   arch/arm64/kernel/setup.c            |   7 +
>   arch/arm64/kernel/smp.c              |   4 +-
>   arch/arm64/kernel/suspend.c          |  20 +--
>   arch/arm64/kernel/vmlinux.lds.S      |   5 +-
>   arch/arm64/mm/kasan_init.c           |  32 ++--
>   arch/arm64/mm/mmu.c                  | 311 ++++++++++++++++++-----------------
>   arch/arm64/mm/proc.S                 |  27 +++
>   include/asm-generic/fixmap.h         |  14 +-
>   14 files changed, 381 insertions(+), 229 deletions(-)
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 14/18] arm64: mm: use fixmap when creating page tables
  2016-01-04 22:38   ` Laura Abbott
@ 2016-01-05 10:40     ` Mark Rutland
  0 siblings, 0 replies; 40+ messages in thread
From: Mark Rutland @ 2016-01-05 10:40 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Mon, Jan 04, 2016 at 02:38:42PM -0800, Laura Abbott wrote:
> On 01/04/2016 09:56 AM, Mark Rutland wrote:
> >As a prepratory step to allow us to allocate early page tables form
> "page tables from"

Fixed.

> >-#define pud_fixmap_offset(pgd, addr)	pud_fixmap(pmd_offset_phys(pgd, addr))
> >+#define pud_fixmap_offset(pgd, addr)	pud_fixmap(pud_offset_phys(pgd, addr))
> 
> Was this supposed to be folded into a previous patch?

Yes. I've folded that into the prior patch now.

> >-static void *late_pgtable_alloc(void)
> >+static phys_addr_t late_pgtable_alloc(void)
> >  {
> >  	void *ptr = (void *)__get_free_page(PGALLOC_GFP);
> >  	BUG_ON(!ptr);
> >-
> >-	/* Ensure the zeroed page is visible to the page table walker */
> >  	dsb(ishst);
> 
> This dropped the comment but not the actual barrier, was that intentional?

I messed that up when rebasing. Restored now.

Thanks for pointing those out!

Mark.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 13/18] arm64: mm: add functions to walk tables in fixmap
  2016-01-04 22:49   ` Laura Abbott
@ 2016-01-05 11:08     ` Mark Rutland
  0 siblings, 0 replies; 40+ messages in thread
From: Mark Rutland @ 2016-01-05 11:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 04, 2016 at 02:49:46PM -0800, Laura Abbott wrote:
> On 01/04/2016 09:56 AM, Mark Rutland wrote:
> >As a preparatory step to allow us to allocate early page tables from
> >unmapped memory using memblock_alloc, add new p??_fixmap* functions that
> >can be used to walk page tables outside of the linear mapping by using
> >fixmap slots.
> >
> >Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> >Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >Cc: Catalin Marinas <catalin.marinas@arm.com>
> >Cc: Jeremy Linton <jeremy.linton@arm.com>
> >Cc: Laura Abbott <labbott@fedoraproject.org>
> >Cc: Will Deacon <will.deacon@arm.com>
> >---
> >  arch/arm64/include/asm/fixmap.h  | 10 ++++++++++
> >  arch/arm64/include/asm/pgtable.h | 26 ++++++++++++++++++++++++++
> >  2 files changed, 36 insertions(+)

> >+#define pgd_fixmap(addr)		((pgd_t *)set_fixmap_offset(FIX_PGD, addr))
> >+#define pgd_fixmap_unmap()		clear_fixmap(FIX_PGD)

> Bikeshed: p??_fixmap_offset doesn't make it obvious that this is an
> operation with a side effect. It seems more similar to
> p??_offset_kernel which is read only. Perhaps it's the lack of set/map
> in the name.

I agree.

I've locally changed them to p??_set_fixmap{,_offset}, p??_clear_fixmap,
to match the usual fixmap function naming.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 00/18] arm64: mm: rework page table creation
  2016-01-05  1:08 ` [PATCHv2 00/18] arm64: mm: rework page table creation Laura Abbott
@ 2016-01-05 11:54   ` Mark Rutland
  2016-01-05 18:36     ` Laura Abbott
  2016-01-08 19:15     ` Mark Rutland
  0 siblings, 2 replies; 40+ messages in thread
From: Mark Rutland @ 2016-01-05 11:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 04, 2016 at 05:08:58PM -0800, Laura Abbott wrote:
> On 01/04/2016 09:56 AM, Mark Rutland wrote:
> >Hi all,
> >
> >This series reworks the arm64 early page table code, in order to:
> >
> >(a) Avoid issues with potentially-conflicting TTBR1 TLB entries (as raised in
> >     Jeremy's thread [1]). This can happen when splitting/merging sections or
> >     contiguous ranges, and per a pessimistic reading of the ARM ARM may happen
> >     for changes to other fields in translation table entries.
> >
> >(b) Allow for more complex page table creation early on, with tables created
> >     with fine-grained permissions as early as possible. In the cases where we
> >     currently use fine-grained permissions (e.g. DEBUG_RODATA and marking .init
> >     as non-executable), this is required for the same reasons as (a), as we
> >     must ensure that changes to page tables do not split/merge sections or
> >     contiguous regions for memory in active use.

[...]

> >There are still opportunities for improvement:
> >
> >* BUG() when splitting sections or creating overlapping entries in
> >   create_mapping, as these both indicate serious bugs in kernel page table
> >   creation.
> >
> >   This will require rework to the EFI runtime services pagetable creation, as
> >   for >4K page kernels EFI memory descriptors may share pages (and currently
> >   such overlap is assumed to be benign).
> 
> Given the split_{pmd,pud} were added for DEBUG_RODATA, is there any reason
> those can't be dropped now since it sounds like the EFI problem is for overlapping
> entries and not splitting?

Good point. I think they can be removed.

I'll take a look into that.

> This series points out that my attempt to allow set_memory_* to
> work on regular kernel memory[1] is broken right now because it breaks down
> the larger block sizes.

What's the rationale for set_memory_* on kernel mappings? I see
"security", but I couldn't figure out a concrete use-case. Is there any
example of a subsystem that wants to use this?

For statically-allocated data, an alternative approach would be for such
memory to be mapped with minimal permissions from the outset (e.g. being
placed in .rodata), and when elevated permissions are required a
(temporary) memremap'd alias could be used, like what patch_map does to
modify ROX kernel/module text.
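
As a rough sketch of that kind of temporary alias (purely illustrative;
ro_ptr, src and len are made-up names, and locking and page-crossing
concerns are ignored here):

	phys_addr_t phys = __pa(ro_ptr);
	void *alias = (void *)set_fixmap_offset(FIX_TEXT_POKE0, phys);

	memcpy(alias, src, len);
	clear_fixmap(FIX_TEXT_POKE0);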

For dynamically-allocated data, we could create (minimal permission)
mappings in the vmalloc region and pass those around. The linear map
alias would still be writeable, but as the offset between the two isn't
linear (and the owner of that allocation doesn't have to know/care about
the linear map address), it would be much harder to find the linear map
address to attack. An alias with elevated permissions could be used as
required, or if it's a one-time RW->RO switch, the mapping could be
modified in-place as the granularity wouldn't change.

> Do you have any suggestions for a cleaner approach
> short of requiring all memory mapped with 4K pages? The only solution I see
> right now is having a separate copy of page tables to switch to. Any
> other idea I come up with would have problems if we tried to invalidate an
> entry before breaking it down.

The other option I looked into was to have a completely independent
TTBR0 mapping (like the idmap or efi runtime tables), and have that map
code for modifying page tables. That way you could modify the tables
in-place (with TTBR1 disabled for the duration of the modification).

That ended up having its own set of problems, as you could only rely on
self-contained position independent code, which ruled out most kernel
APIs (including locking/atomic primitives due to debug paths). That gets
worse when secondaries are online and you have to synchronise them while
disabling/invalidating/re-enabling the TTBR1 mapping.

Other than that I haven't managed to come up with other functional
ideas. The RCU-like approach is the cleanest I've found so far.

Thanks,
Mark.

> Thanks,
> Laura
> 
> [1]https://lkml.kernel.org/g/<1447207057-11323-1-git-send-email-labbott@fedoraproject.org>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 07/18] arm64: mm: add code to safely replace TTBR1_EL1
  2016-01-04 17:56 ` [PATCHv2 07/18] arm64: mm: add code to safely replace TTBR1_EL1 Mark Rutland
@ 2016-01-05 15:22   ` Catalin Marinas
  2016-01-05 15:45     ` Mark Rutland
  0 siblings, 1 reply; 40+ messages in thread
From: Catalin Marinas @ 2016-01-05 15:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 04, 2016 at 05:56:40PM +0000, Mark Rutland wrote:
> +	.pushsection ".idmap.text", "ax"
> +/*
> + * void idmap_cpu_replace_ttbr1(phys_addr_t new_pgd, phys_addr_t reserved_pgd)
> + *
> + * This is the low-level counterpart to cpu_replace_ttbr1, and should not be
> + * called by anything else. It can only be executed from a TTBR0 mapping.
> + */
> +ENTRY(idmap_cpu_replace_ttbr1)
> +	mrs	x2, daif
> +	msr	daifset, #0xf
> +
> +	msr	ttbr1_el1, x1

Would it work to avoid the second argument and only use adrp, now that
empty_zero_page is at a fixed offset relative to this function?

-- 
Catalin

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 07/18] arm64: mm: add code to safely replace TTBR1_EL1
  2016-01-05 15:22   ` Catalin Marinas
@ 2016-01-05 15:45     ` Mark Rutland
  0 siblings, 0 replies; 40+ messages in thread
From: Mark Rutland @ 2016-01-05 15:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 05, 2016 at 03:22:18PM +0000, Catalin Marinas wrote:
> On Mon, Jan 04, 2016 at 05:56:40PM +0000, Mark Rutland wrote:
> > +	.pushsection ".idmap.text", "ax"
> > +/*
> > + * void idmap_cpu_replace_ttbr1(phys_addr_t new_pgd, phys_addr_t reserved_pgd)
> > + *
> > + * This is the low-level counterpart to cpu_replace_ttbr1, and should not be
> > + * called by anything else. It can only be executed from a TTBR0 mapping.
> > + */
> > +ENTRY(idmap_cpu_replace_ttbr1)
> > +	mrs	x2, daif
> > +	msr	daifset, #0xf
> > +
> > +	msr	ttbr1_el1, x1
> 
> Would it work to avoid the second argument and only use adrp, now that
> empty_zero_page is at a fixed offset relative to this function?

Yes, it would.

I've folded that in locally.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 00/18] arm64: mm: rework page table creation
  2016-01-05 11:54   ` Mark Rutland
@ 2016-01-05 18:36     ` Laura Abbott
  2016-01-05 18:58       ` Mark Rutland
  2016-01-08 19:15     ` Mark Rutland
  1 sibling, 1 reply; 40+ messages in thread
From: Laura Abbott @ 2016-01-05 18:36 UTC (permalink / raw)
  To: linux-arm-kernel

On 01/05/2016 03:54 AM, Mark Rutland wrote:
> On Mon, Jan 04, 2016 at 05:08:58PM -0800, Laura Abbott wrote:
>> On 01/04/2016 09:56 AM, Mark Rutland wrote:
>>> Hi all,
>>>
>>> This series reworks the arm64 early page table code, in order to:
>>>
>>> (a) Avoid issues with potentially-conflicting TTBR1 TLB entries (as raised in
>>>      Jeremy's thread [1]). This can happen when splitting/merging sections or
>>>      contiguous ranges, and per a pessimistic reading of the ARM ARM may happen
>>>      for changes to other fields in translation table entries.
>>>
>>> (b) Allow for more complex page table creation early on, with tables created
>>>      with fine-grained permissions as early as possible. In the cases where we
>>>      currently use fine-grained permissions (e.g. DEBUG_RODATA and marking .init
>>>      as non-executable), this is required for the same reasons as (a), as we
>>>      must ensure that changes to page tables do not split/merge sections or
>>>      contiguous regions for memory in active use.
>
> [...]
>
>>> There are still opportunities for improvement:
>>>
>>> * BUG() when splitting sections or creating overlapping entries in
>>>    create_mapping, as these both indicate serious bugs in kernel page table
>>>    creation.
>>>
>>>    This will require rework to the EFI runtime services pagetable creation, as
>>>    for >4K page kernels EFI memory descriptors may share pages (and currently
>>>    such overlap is assumed to be benign).
>>
>> Given the split_{pmd,pud} were added for DEBUG_RODATA, is there any reason
>> those can't be dropped now since it sounds like the EFI problem is for overlapping
>> entries and not splitting?
>
> Good point. I think they can be removed.
>
> I'll take a look into that.
>
>> This series points out that my attempt to allow set_memory_* to
>> work on regular kernel memory[1] is broken right now because it breaks down
>> the larger block sizes.
>
> What's the rationale for set_memory_* on kernel mappings? I see
> "security", but I couldn't figure out a concrete use-case. Is there any
> example of a subsystem that wants to use this?

 From the description, it sounded like this was possibly new work but
the eBPF interpreter currently supports setting a page read only via
set_memory_ro (see 60a3b2253c413cf601783b070507d7dd6620c954
"net: bpf: make eBPF interpreter images read-only") so it's not
unheard of.
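
The pattern there is pretty small -- roughly something like the below
(paraphrasing rather than quoting the exact code, and the header that
provides set_memory_* may differ):

  #include <asm/cacheflush.h>   /* set_memory_ro() on arm64 */

  /* image is a page-aligned allocation spanning 'npages' pages */
  static void image_lock_ro(void *image, int npages)
  {
          set_memory_ro((unsigned long)image, npages);
  }

which is exactly the kind of permission change that wants page-granular
mappings underneath it.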

>
> For statically-allocated data, an alternative approach would be for such
> memory to be mapped with minimal permissions from the outset (e.g. being
> placed in .rodata), and when elevated permissions are required a
> (temporary) memremap'd alias could be used, like what patch_map does to
> modify ROX kernel/module text.
>
> For dynamically-allocated data, we could create (minimal permission)
> mappings in the vmalloc region and pass those around. The linear map
> alias would still be writeable, but as the offset between the two isn't
> linear (and the owner of that allocation doesn't have to know/care about
> the linear map address), it would be much harder to find the linear map
> address to attack. An alias with elevated permissions could be used as
> required, or if it's a one-time RW->RO switch, the mapping could be
> modified in-place as the granularity wouldn't change.

This would work for new features but probably not for existing features
such as the eBPF interpreter.

>
>> Do you have any suggestions for a cleaner approach
>> short of requiring all memory mapped with 4K pages? The only solution I see
>> right now is having a separate copy of page tables to switch to. Any
>> other idea I come up with would have problems if we tried to invalidate an
>> entry before breaking it down.
>
> The other option I looked into was to have a completely independent
> TTBR0 mapping (like the idmap or efi runtime tables), and have that map
> code for modifying page tables. That way you could modify the tables
> in-place (with TTBR1 disabled for the duration of the modification).
>
> That ended up having its own set of problems, as you could only rely on
> self-contained position independent code, which ruled out most kernel
> APIs (including locking/atomic primitives due to debug paths). That gets
> worse when secondaries are online and you have to synchronise those
> disabling/invalidating/enabling the TTBR1 mapping.
>
> Other than that I haven't managed to come up with other functional
> ideas. The RCU-like approach is the cleanest I've found so far.
>

Yeah, I suspect this is going to remain open for a while. Thanks for
your thoughts.

Laura

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 00/18] arm64: mm: rework page table creation
  2016-01-05 18:36     ` Laura Abbott
@ 2016-01-05 18:58       ` Mark Rutland
  2016-01-05 19:17         ` Laura Abbott
  0 siblings, 1 reply; 40+ messages in thread
From: Mark Rutland @ 2016-01-05 18:58 UTC (permalink / raw)
  To: linux-arm-kernel

> >>This series points out that my attempt to allow set_memory_* to
> >>work on regular kernel memory[1] is broken right now because it breaks down
> >>the larger block sizes.
> >
> >What's the rationale for set_memory_* on kernel mappings? I see
> >"security", but I couldn't figure out a concrete use-case. Is there any
> >example of a subsystem that wants to use this?
> 
> From the description, it sounded like this was possibly new work but
> the eBPF interpreter currently supports setting a page read only via
> set_memory_ro (see 60a3b2253c413cf601783b070507d7dd6620c954
> "net: bpf: make eBPF interpreter images read-only") so it's not
> unheard of.

Oh. For some reason I thought that used the vmalloc area, but evidently
I was mistaken.

That is unfortunate, it would be good to protect the JITed code.

> >For statically-allocated data, an alternative approach would be for such
> >memory to be mapped with minimal permissions from the outset (e.g. being
> >placed in .rodata), and when elevated permissions are required a
> >(temporary) memremap'd alias could be used, like what patch_map does to
> >modify ROX kernel/module text.
> >
> >For dynamically-allocated data, we could create (minimal permission)
> >mappings in the vmalloc region and pass those around. The linear map
> >alias would still be writeable, but as the offset between the two isn't
> >linear (and the owner of that allocation doesn't have to know/care about
> >the linear map address), it would be much harder to find the linear map
> >address to attack. An alias with elevated permissions could be used as
> >required, or if it's a one-time RW->RO switch, the mapping could be
> >modified in-place as the granularity wouldn't change.
> 
> This would work for new features but probably not for existing features
> such as the eBPF interpreter.

Sure.

For eBPF it might be possible to rework the code to support using
separate aliases, but that's probably not going to be easy and that
probably works against some performance requirement. :/

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 00/18] arm64: mm: rework page table creation
  2016-01-05 18:58       ` Mark Rutland
@ 2016-01-05 19:17         ` Laura Abbott
  2016-01-06 11:10           ` Mark Rutland
  0 siblings, 1 reply; 40+ messages in thread
From: Laura Abbott @ 2016-01-05 19:17 UTC (permalink / raw)
  To: linux-arm-kernel

On 01/05/2016 10:58 AM, Mark Rutland wrote:
>>>> This series points out that my attempt to allow set_memory_* to
>>>> work on regular kernel memory[1] is broken right now because it breaks down
>>>> the larger block sizes.
>>>
>>> What's the rationale for set_memory_* on kernel mappings? I see
>>> "security", but I couldn't figure out a concrete use-case. Is there any
>>> example of a subsystem that wants to use this?
>>
>>  From the description, it sounded like this was possibly new work but
>> the eBPF interpreter currently supports setting a page read only via
>> set_memory_ro (see 60a3b2253c413cf601783b070507d7dd6620c954
>> "net: bpf: make eBPF interpreter images read-only") so it's not
>> unheard of.
>
> Oh. For some reason I thought that used the vmalloc area, but evidently
> I was mistaken.
>
> That is unfortunate, it would be good to protect the JITed code.
>
>>> For statically-allocated data, an alternative approach would be for such
>>> memory to be mapped with minimal permissions from the outset (e.g. being
>>> placed in .rodata), and when elevated permissions are required a
>>> (temporary) memremap'd alias could be used, like what patch_map does to
>>> modify ROX kernel/module text.
>>>
>>> For dynamically-allocated data, we could create (minimal permission)
>>> mappings in the vmalloc region and pass those around. The linear map
>>> alias would still be writeable, but as the offset between the two isn't
>>> linear (and the owner of that allocation doesn't have to know/care about
>>> the linear map address), it would be much harder to find the linear map
>>> address to attack. An alias with elevated permissions could be used as
>>> required, or if it's a one-time RW->RO switch, the mapping could be
>>> modified in-place as the granularity wouldn't change.
>>
>> This would work for new features but probably not for existing features
>> such as the eBPF interpreter.
>
> Sure.
>
> For eBPF it might be possible to rework the code to support using
> separate aliases, but that's probably not going to be easy and that
> probably works against some performance requirement. :/

Ah no, you are correct, I misread how the code was working. The eBPF code
does use vmalloc so that can easily be fixed up. I think your suggestion
of either using vmalloc or a special static section is the best
recommendation. If anyone really thinks they need to change any other
memory they can make a proposal.

Sorry for the confusion.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 00/18] arm64: mm: rework page table creation
  2016-01-04 17:56 [PATCHv2 00/18] arm64: mm: rework page table creation Mark Rutland
                   ` (18 preceding siblings ...)
  2016-01-05  1:08 ` [PATCHv2 00/18] arm64: mm: rework page table creation Laura Abbott
@ 2016-01-06 10:24 ` Catalin Marinas
  2016-01-06 11:36   ` Mark Rutland
  2016-01-18 14:47 ` Ard Biesheuvel
  20 siblings, 1 reply; 40+ messages in thread
From: Catalin Marinas @ 2016-01-06 10:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 04, 2016 at 05:56:33PM +0000, Mark Rutland wrote:
> Mark Rutland (18):
>   asm-generic: make __set_fixmap_offset a static inline
>   arm64: mm: specialise pagetable allocators
>   arm64: mm: place empty_zero_page in bss
>   arm64: unify idmap removal
>   arm64: unmap idmap earlier
>   arm64: add function to install the idmap
>   arm64: mm: add code to safely replace TTBR1_EL1
>   arm64: kasan: avoid TLB conflicts
>   arm64: mm: move pte_* macros
>   arm64: mm: add functions to walk page tables by PA
>   arm64: mm: avoid redundant __pa(__va(x))
>   arm64: mm: add __{pud,pgd}_populate
>   arm64: mm: add functions to walk tables in fixmap
>   arm64: mm: use fixmap when creating page tables
>   arm64: mm: allocate pagetables anywhere
>   arm64: mm: allow passing a pgdir to alloc_init_*
>   arm64: ensure _stext and _etext are page-aligned
>   arm64: mm: create new fine-grained mappings at boot

The patches look fine (once you fix the issues Laura raised). Thanks for
putting them together.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

I'll queue them sometime after -rc1, in the meantime keep your branch up
to date so that Ard and Jeremy can base their patches on top.

(now going to look at the KASLR patches, would make more sense once I
read this series ;))

-- 
Catalin

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 00/18] arm64: mm: rework page table creation
  2016-01-05 19:17         ` Laura Abbott
@ 2016-01-06 11:10           ` Mark Rutland
  0 siblings, 0 replies; 40+ messages in thread
From: Mark Rutland @ 2016-01-06 11:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 05, 2016 at 11:17:55AM -0800, Laura Abbott wrote:
> On 01/05/2016 10:58 AM, Mark Rutland wrote:
> >>>>This series points out that my attempt to allow set_memory_* to
> >>>>work on regular kernel memory[1] is broken right now because it breaks down
> >>>>the larger block sizes.
> >>>
> >>>What's the rationale for set_memory_* on kernel mappings? I see
> >>>"security", but I couldn't figure out a concrete use-case. Is there any
> >>>example of a subsystem that wants to use this?
> >>
> >> From the description, it sounded like this was possibly new work but
> >>the eBPF interpreter currently supports setting a page read only via
> >>set_memory_ro (see 60a3b2253c413cf601783b070507d7dd6620c954
> >>"net: bpf: make eBPF interpreter images read-only") so it's not
> >>unheard of.
> >
> >Oh. For some reason I thought that used the vmalloc area, but evidently
> >I was mistaken.
> >
> >That is unfortunate, it would be good to protect the JITed code.
> >
> >>>For statically-allocated data, an alternative approach would be for such
> >>>memory to be mapped with minimal permissions from the outset (e.g. being
> >>>placed in .rodata), and when elevated permissions are required a
> >>>(temporary) memremap'd alias could be used, like what patch_map does to
> >>>modify ROX kernel/module text.
> >>>
> >>>For dynamically-allocated data, we could create (minimal permission)
> >>>mappings in the vmalloc region and pass those around. The linear map
> >>>alias would still be writeable, but as the offset between the two isn't
> >>>linear (and the owner of that allocation doesn't have to know/care about
> >>>the linear map address), it would be much harder to find the linear map
> >>>address to attack. An alias with elevated permissions could be used as
> >>>required, or if it's a one-time RW->RO switch, the mapping could be
> >>>modified in-place as the granularity wouldn't change.
> >>
> >>This would work for new features but probably not for existing features
> >>such as the eBPF interpreter.
> >
> >Sure.
> >
> >For eBPF it might be possible to rework the code to support using
> >separate aliases, but that's probably not going to be easy and that
> >probably works against some performance requirement. :/
> 
> Ah no, you are correct, I misread how the code was working. The eBPF code
> does use vmalloc so that can easily be fixed up.

Ah, phew. Thanks for digging into that.

> I think your suggestion of either using vmalloc or a special static
> section is the best recommendation. If anyone really thinks they need
> to change any other memory they can make a proposal.

Sounds good to me.

> Sorry for the confusion.

No worries.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 00/18] arm64: mm: rework page table creation
  2016-01-06 10:24 ` Catalin Marinas
@ 2016-01-06 11:36   ` Mark Rutland
  2016-01-06 14:23     ` Ard Biesheuvel
  0 siblings, 1 reply; 40+ messages in thread
From: Mark Rutland @ 2016-01-06 11:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 06, 2016 at 10:24:49AM +0000, Catalin Marinas wrote:
> On Mon, Jan 04, 2016 at 05:56:33PM +0000, Mark Rutland wrote:
> > Mark Rutland (18):
> >   asm-generic: make __set_fixmap_offset a static inline
> >   arm64: mm: specialise pagetable allocators
> >   arm64: mm: place empty_zero_page in bss
> >   arm64: unify idmap removal
> >   arm64: unmap idmap earlier
> >   arm64: add function to install the idmap
> >   arm64: mm: add code to safely replace TTBR1_EL1
> >   arm64: kasan: avoid TLB conflicts
> >   arm64: mm: move pte_* macros
> >   arm64: mm: add functions to walk page tables by PA
> >   arm64: mm: avoid redundant __pa(__va(x))
> >   arm64: mm: add __{pud,pgd}_populate
> >   arm64: mm: add functions to walk tables in fixmap
> >   arm64: mm: use fixmap when creating page tables
> >   arm64: mm: allocate pagetables anywhere
> >   arm64: mm: allow passing a pgdir to alloc_init_*
> >   arm64: ensure _stext and _etext are page-aligned
> >   arm64: mm: create new fine-grained mappings at boot
> 
> The patches look fine (once you fix the issues Laura raised). Thanks for
> putting them together.
> 
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Thanks!

I assume that applies to everything even without the suggested
split_{pmd,pud} removal [1], for which I'll cook up a follow-up patch.

> I'll queue them sometime after -rc1, in the meantime keep your branch up
> to date so that Ard and Jeremy can base their patches on top.

Will do.

FWIW I've just updated the branch [2] with said fixes and your
Reviewed-by. I won't send out a v3 just yet to give people time to
digest this version.

Ard, you'll find when rebasing that the compiler will scream at you due
to the p??_fixmap* function renaming. It's fairly mechanical, and if you
have vim handy you just need to run:

:%s /\(pgd\|pud\|pmd\|pte\)_fixmap_unmap/\1_clear_fixmap/g
:%s /\(pgd\|pud\|pmd\|pte\)_fixmap/\1_set_fixmap/g

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/397208.html
[2] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git arm64/pagetable-rework

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 00/18] arm64: mm: rework page table creation
  2016-01-06 11:36   ` Mark Rutland
@ 2016-01-06 14:23     ` Ard Biesheuvel
  0 siblings, 0 replies; 40+ messages in thread
From: Ard Biesheuvel @ 2016-01-06 14:23 UTC (permalink / raw)
  To: linux-arm-kernel

On 6 January 2016 at 12:36, Mark Rutland <mark.rutland@arm.com> wrote:
> On Wed, Jan 06, 2016 at 10:24:49AM +0000, Catalin Marinas wrote:
>> On Mon, Jan 04, 2016 at 05:56:33PM +0000, Mark Rutland wrote:
>> > Mark Rutland (18):
>> >   asm-generic: make __set_fixmap_offset a static inline
>> >   arm64: mm: specialise pagetable allocators
>> >   arm64: mm: place empty_zero_page in bss
>> >   arm64: unify idmap removal
>> >   arm64: unmap idmap earlier
>> >   arm64: add function to install the idmap
>> >   arm64: mm: add code to safely replace TTBR1_EL1
>> >   arm64: kasan: avoid TLB conflicts
>> >   arm64: mm: move pte_* macros
>> >   arm64: mm: add functions to walk page tables by PA
>> >   arm64: mm: avoid redundant __pa(__va(x))
>> >   arm64: mm: add __{pud,pgd}_populate
>> >   arm64: mm: add functions to walk tables in fixmap
>> >   arm64: mm: use fixmap when creating page tables
>> >   arm64: mm: allocate pagetables anywhere
>> >   arm64: mm: allow passing a pgdir to alloc_init_*
>> >   arm64: ensure _stext and _etext are page-aligned
>> >   arm64: mm: create new fine-grained mappings at boot
>>
>> The patches look fine (once you fix the issues Laura raised). Thanks for
>> putting them together.
>>
>> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>
> Thanks!
>
> I assume that applies to everything even without the suggested
> split_{pmd,pud} removal [1], for which I'll cook up a follow-up patch.
>
>> I'll queue them sometime after -rc1, in the meantime keep your branch up
>> to date so that Ard and Jeremy can base their patches on top.
>
> Will do.
>
> FWIW I've just updated the branch [2] with said fixes and your
> Reviewed-by. I won't send out a v3 just yet to give people time to
> digest this version.
>
> Ard, you'll find when rebasing that the compiler will scream at you due
> to the p??_fixmap* function renaming. It's fairly mechanical, and if you
> have vim handy you just need to run:
>
> :%s /\(pgd\|pud\|pmd\|pte\)_fixmap_unmap/\1_clear_fixmap/g
> :%s /\(pgd\|pud\|pmd\|pte\)_fixmap/\1_set_fixmap/g
>

Thanks. That went without a hitch, in fact. I don't actually call any
of these in the KASLR code, just the p??_offset_phys() accessors.

FYI I pushed my latest here

https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-kaslr-v3
git://git.linaro.org/people/ard.biesheuvel/linux-arm.git arm64-kaslr-v3

I'll probably hold off from [rebasing and] sending out my v3 until
your stuff hits post-rc1

-- 
Ard.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 00/18] arm64: mm: rework page table creation
  2016-01-05 11:54   ` Mark Rutland
  2016-01-05 18:36     ` Laura Abbott
@ 2016-01-08 19:15     ` Mark Rutland
  1 sibling, 0 replies; 40+ messages in thread
From: Mark Rutland @ 2016-01-08 19:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 05, 2016 at 11:54:14AM +0000, Mark Rutland wrote:
> On Mon, Jan 04, 2016 at 05:08:58PM -0800, Laura Abbott wrote:
> > On 01/04/2016 09:56 AM, Mark Rutland wrote:
> > >Hi all,
> > >
> > >This series reworks the arm64 early page table code, in order to:
> > >
> > >(a) Avoid issues with potentially-conflicting TTBR1 TLB entries (as raised in
> > >     Jeremy's thread [1]). This can happen when splitting/merging sections or
> > >     contiguous ranges, and per a pessimistic reading of the ARM ARM may happen
> > >     for changes to other fields in translation table entries.
> > >
> > >(b) Allow for more complex page table creation early on, with tables created
> > >     with fine-grained permissions as early as possible. In the cases where we
> > >     currently use fine-grained permissions (e.g. DEBUG_RODATA and marking .init
> > >     as non-executable), this is required for the same reasons as (a), as we
> > >     must ensure that changes to page tables do not split/merge sections or
> > >     contiguous regions for memory in active use.
> 
> [...]
> 
> > >There are still opportunities for improvement:
> > >
> > >* BUG() when splitting sections or creating overlapping entries in
> > >   create_mapping, as these both indicate serious bugs in kernel page table
> > >   creation.
> > >
> > >   This will require rework to the EFI runtime services pagetable creation, as
> > >   for >4K page kernels EFI memory descriptors may share pages (and currently
> > >   such overlap is assumed to be benign).
> > 
> > Given the split_{pmd,pud} were added for DEBUG_RODATA, is there any reason
> > those can't be dropped now since it sounds like the EFI problem is for overlapping
> > entries and not splitting?
> 
> Good point. I think they can be removed.
> 
> I'll take a look into that.

Looking into this further, it turns out there is a set of cases where
we'll currently try to split for !4K page kernels.

Say you have a region starting at a PMD/PUD boundary, which ends somewhere
up to PAGE_SIZE short of the next PMD/PUD boundary. We'll round the end
up to the next PAGE_SIZE boundary and can create a PMD/PUD block entry.

Say another region shares some of that PAGE_SIZE gap. It gets mapped at
PAGE_SIZE granularity, and we try to create a PTE page entry for the
overlap. The pmd entry is valid, so we decide we must split it. Bang.
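
For example, with made-up numbers on a 64K-page kernel (PAGE_SIZE is 64K,
and a PMD block covers 512M):

  region A: 0x40000000 - 0x5fff8000
        end rounded up to 0x60000000, so [0x40000000, 0x60000000) gets a
        single 512M PMD block entry.

  region B: 0x5fff8000 - 0x60020000
        not section-sized, so mapped with 64K pages; its first page needs
        a PTE table under the PMD entry above, which is already a valid
        block, so we'd have to split it.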

A similar set of problems would apply for contiguous PTEs, once we
support those.

For EFI, we could skip the overlap as the spec requires that attributes
are the same within a 64K frame, efi_create_mapping contrives to
ensure that permissions are the same, and we create the mappings in
ascending VA/PA order.

However, we don't want to do that in any other case. Perhaps we can pass
a "strict" parameter and skip in the non-strict case.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 00/18] arm64: mm: rework page table creation
  2016-01-04 17:56 [PATCHv2 00/18] arm64: mm: rework page table creation Mark Rutland
                   ` (19 preceding siblings ...)
  2016-01-06 10:24 ` Catalin Marinas
@ 2016-01-18 14:47 ` Ard Biesheuvel
  20 siblings, 0 replies; 40+ messages in thread
From: Ard Biesheuvel @ 2016-01-18 14:47 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 January 2016 at 18:56, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi all,
>
> This series reworks the arm64 early page table code, in order to:
>
> (a) Avoid issues with potentially-conflicting TTBR1 TLB entries (as raised in
>     Jeremy's thread [1]). This can happen when splitting/merging sections or
>     contiguous ranges, and per a pessimistic reading of the ARM ARM may happen
>     for changes to other fields in translation table entries.
>
> (b) Allow for more complex page table creation early on, with tables created
>     with fine-grained permissions as early as possible. In the cases where we
>     currently use fine-grained permissions (e.g. DEBUG_RODATA and marking .init
>     as non-executable), this is required for the same reasons as (a), as we
>     must ensure that changes to page tables do not split/merge sections or
>     contiguous regions for memory in active use.
>
> (c) Avoid edge cases where we need to allocate memory before a sufficient
>     proportion of the early linear map is in place to accommodate allocations.
>
> This series:
>
> * Introduces the necessary infrastructure to safely swap TTBR1_EL1 (i.e.
>   without risking conflicting TLB entries being allocated). The arm64 KASAN
>   code is migrated to this.
>
> * Adds helpers to walk page tables by physical address, independent of the
>   linear mapping, and modifies __create_mapping and friends to relying on a new
>   set of FIX_{PGD,PUD,PMD,PTE} to map tables as required for modification.
>
> * Removes the early memblock limit, now that create_mapping does not rely on the
>   early linear map. This solves (c), and allows for (b).
>
> * Generates an entirely new set of kernel page tables with fine-grained (i.e.
>   page-level) permission boundaries, which can then be safely installed. These
>   are created with sufficient granularity such that later changes (currently
>   only fixup_init) will not split/merge sections or contiguous regions, and can
>   follow a break-before-make approach without affecting the rest of the page
>   tables.
>
> There are still opportunities for improvement:
>
> * BUG() when splitting sections or creating overlapping entries in
>   create_mapping, as these both indicate serious bugs in kernel page table
>   creation.
>
>   This will require rework to the EFI runtime services pagetable creation, as
>   for >4K page kernels EFI memory descriptors may share pages (and currently
>   such overlap is assumed to be benign).
>
> * Use ROX mappings for the kernel text and rodata when creating the new tables.
>   This avoiding potential conflicts from changes to translation tables, and
>   giving us better protections earlier.
>
>   Currently the alternatives patching code relies on being able to use the
>   kernel mapping to update the text. We cannot rely on any text which itself
>   may be patched, and updates may straddle page boundaries, so this is
>   non-trivial.
>
> * Clean up usage of swapper_pg_dir so we can switch to the new tables without
>   having to reuse the existing pgd. This will allow us to free the original
>   pgd (i.e. we can free all the initial tables in one go).
>
> Any and all feedback is welcome.
>

I have been using these patches as the basis of my KASLR work for a
couple of weeks now, and I haven't encountered any problems. Also, the
patches all look correct to me.

So please have my

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Regards,
Ard.


> This series is based on today's arm64 [2] for-next/core branch (commit
> c9cd0ed925c0b927), and this version is tagged as
> arm64-pagetable-rework-20160104 while the latest version should be in the
> unstable branch arm64/pagetable-rework in my git repo [3].
>
> Since v1 [4] (tagged arm64-pagetable-rework-20151209):
> * Drop patches taken into the arm64 tree.
> * Rebase to arm64 for-next/core.
> * Copy early KASAN tables.
> * Fix KASAN pgd manipulation.
> * Specialise allocators for page tables, in function and naming.
> * Update comments.
>
> Thanks,
> Mark.
>
> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-November/386178.html
> [2] git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
> [3] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git
> [4] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/392292.html
>
> Mark Rutland (18):
>   asm-generic: make __set_fixmap_offset a static inline
>   arm64: mm: specialise pagetable allocators
>   arm64: mm: place empty_zero_page in bss
>   arm64: unify idmap removal
>   arm64: unmap idmap earlier
>   arm64: add function to install the idmap
>   arm64: mm: add code to safely replace TTBR1_EL1
>   arm64: kasan: avoid TLB conflicts
>   arm64: mm: move pte_* macros
>   arm64: mm: add functions to walk page tables by PA
>   arm64: mm: avoid redundant __pa(__va(x))
>   arm64: mm: add __{pud,pgd}_populate
>   arm64: mm: add functions to walk tables in fixmap
>   arm64: mm: use fixmap when creating page tables
>   arm64: mm: allocate pagetables anywhere
>   arm64: mm: allow passing a pgdir to alloc_init_*
>   arm64: ensure _stext and _etext are page-aligned
>   arm64: mm: create new fine-grained mappings at boot
>
>  arch/arm64/include/asm/fixmap.h      |  10 ++
>  arch/arm64/include/asm/kasan.h       |   3 +
>  arch/arm64/include/asm/mmu_context.h |  63 ++++++-
>  arch/arm64/include/asm/pgalloc.h     |  26 ++-
>  arch/arm64/include/asm/pgtable.h     |  87 +++++++---
>  arch/arm64/kernel/head.S             |   1 +
>  arch/arm64/kernel/setup.c            |   7 +
>  arch/arm64/kernel/smp.c              |   4 +-
>  arch/arm64/kernel/suspend.c          |  20 +--
>  arch/arm64/kernel/vmlinux.lds.S      |   5 +-
>  arch/arm64/mm/kasan_init.c           |  32 ++--
>  arch/arm64/mm/mmu.c                  | 311 ++++++++++++++++++-----------------
>  arch/arm64/mm/proc.S                 |  27 +++
>  include/asm-generic/fixmap.h         |  14 +-
>  14 files changed, 381 insertions(+), 229 deletions(-)
>
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 01/18] asm-generic: make __set_fixmap_offset a static inline
  2016-01-04 17:56 ` [PATCHv2 01/18] asm-generic: make __set_fixmap_offset a static inline Mark Rutland
@ 2016-01-19 11:55   ` Mark Rutland
  2016-01-19 14:11     ` Arnd Bergmann
  2016-01-28 15:10   ` Will Deacon
  1 sibling, 1 reply; 40+ messages in thread
From: Mark Rutland @ 2016-01-19 11:55 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Arnd,

Sorry to poke during the merge window.

Are you happy with the below patch, and if so, would you be happy for
this to go via the arm64 tree?

If possible, I'd like to be able to place the series on a stable branch
come -rc1, so that it can be used as the base for other work (e.g. [1]),
and so that we can get it into -next.

Everything else has the appropriate acks collected.

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/398527.html

On Mon, Jan 04, 2016 at 05:56:34PM +0000, Mark Rutland wrote:
> Currently __set_fixmap_offset is a macro function which has a local
> variable called 'addr'. If a caller passes a 'phys' parameter which is
> derived from a variable also called 'addr', the local variable will
> shadow this, and the compiler will complain about the use of an
> uninitialized variable.
> 
> It is likely that fixmap users may use the name 'addr' for variables
> that may be directly passed to __set_fixmap_offset, or that may be
> indirectly generated via other macros. Rather than placing the burden on
> callers to avoid the name 'addr', this patch changes __set_fixmap_offset
> into a static inline function, avoiding namespace collisions.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Jeremy Linton <jeremy.linton@arm.com>
> Cc: Laura Abbott <labbott@fedoraproject.org>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  include/asm-generic/fixmap.h | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/include/asm-generic/fixmap.h b/include/asm-generic/fixmap.h
> index 1cbb833..f9c27b6 100644
> --- a/include/asm-generic/fixmap.h
> +++ b/include/asm-generic/fixmap.h
> @@ -70,13 +70,13 @@ static inline unsigned long virt_to_fix(const unsigned long vaddr)
>  #endif
>  
>  /* Return a pointer with offset calculated */
> -#define __set_fixmap_offset(idx, phys, flags)		      \
> -({							      \
> -	unsigned long addr;				      \
> -	__set_fixmap(idx, phys, flags);			      \
> -	addr = fix_to_virt(idx) + ((phys) & (PAGE_SIZE - 1)); \
> -	addr;						      \
> -})
> +static inline unsigned long __set_fixmap_offset(enum fixed_addresses idx,
> +						phys_addr_t phys,
> +						pgprot_t flags)
> +{
> +	__set_fixmap(idx, phys, flags);
> +	return fix_to_virt(idx) + (phys & (PAGE_SIZE - 1));
> +}
>  
>  #define set_fixmap_offset(idx, phys) \
>  	__set_fixmap_offset(idx, phys, FIXMAP_PAGE_NORMAL)
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 01/18] asm-generic: make __set_fixmap_offset a static inline
  2016-01-19 11:55   ` Mark Rutland
@ 2016-01-19 14:11     ` Arnd Bergmann
  2016-01-19 14:18       ` Mark Rutland
  0 siblings, 1 reply; 40+ messages in thread
From: Arnd Bergmann @ 2016-01-19 14:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 19 January 2016 11:55:40 Mark Rutland wrote:
> Hi Arnd,
> 
> Sorry to poke during the merge window.
> 
> Are you happy with the below patch, and if so, would you be happy for
> this to go via the arm64 tree?
> 
> If possible, I'd like to be able to place the series on a stable branch
> come -rc1, so that it can be used as the base for other work (e.g. [1]),
> and so that we can get it into -next.
> 
> Everything else has the appropriate acks collected.
> 
> 

Yes, please merge it through arm64

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 01/18] asm-generic: make __set_fixmap_offset a static inline
  2016-01-19 14:11     ` Arnd Bergmann
@ 2016-01-19 14:18       ` Mark Rutland
  0 siblings, 0 replies; 40+ messages in thread
From: Mark Rutland @ 2016-01-19 14:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 19, 2016 at 03:11:09PM +0100, Arnd Bergmann wrote:
> On Tuesday 19 January 2016 11:55:40 Mark Rutland wrote:
> > Hi Arnd,
> > 
> > Sorry to poke during the merge window.
> > 
> > Are you happy with the below patch, and if so, would you be happy for
> > this to go via the arm64 tree?
> > 
> > If possible, I'd like to be able to place the series on a stable branch
> > come -rc1, so that it can be used as the base for other work (e.g. [1]),
> > and so that we can get it into -next.
> > 
> > Everything else has the appropriate acks collected.
> > 
> > 
> 
> Yes, please merge it through arm64
> 
> Acked-by: Arnd Bergmann <arnd@arndb.de>

Cheers!

Mark.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCHv2 01/18] asm-generic: make __set_fixmap_offset a static inline
  2016-01-04 17:56 ` [PATCHv2 01/18] asm-generic: make __set_fixmap_offset a static inline Mark Rutland
  2016-01-19 11:55   ` Mark Rutland
@ 2016-01-28 15:10   ` Will Deacon
  1 sibling, 0 replies; 40+ messages in thread
From: Will Deacon @ 2016-01-28 15:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 04, 2016 at 05:56:34PM +0000, Mark Rutland wrote:
> Currently __set_fixmap_offset is a macro function which has a local
> variable called 'addr'. If a caller passes a 'phys' parameter which is
> derived from a variable also called 'addr', the local variable will
> shadow this, and the compiler will complain about the use of an
> uninitialized variable.
> 
> It is likely that fixmap users may use the name 'addr' for variables
> that may be directly passed to __set_fixmap_offset, or that may be
> indirectly generated via other macros. Rather than placing the burden on
> callers to avoid the name 'addr', this patch changes __set_fixmap_offset
> into a static inline function, avoiding namespace collisions.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Jeremy Linton <jeremy.linton@arm.com>
> Cc: Laura Abbott <labbott@fedoraproject.org>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  include/asm-generic/fixmap.h | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)

Acked-by: Will Deacon <will.deacon@arm.com>

Catalin can pick this up for 4.6 along with Arnd's ack.

Will

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread

Thread overview: 40+ messages
2016-01-04 17:56 [PATCHv2 00/18] arm64: mm: rework page table creation Mark Rutland
2016-01-04 17:56 ` [PATCHv2 01/18] asm-generic: make __set_fixmap_offset a static inline Mark Rutland
2016-01-19 11:55   ` Mark Rutland
2016-01-19 14:11     ` Arnd Bergmann
2016-01-19 14:18       ` Mark Rutland
2016-01-28 15:10   ` Will Deacon
2016-01-04 17:56 ` [PATCHv2 02/18] arm64: mm: specialise pagetable allocators Mark Rutland
2016-01-04 17:56 ` [PATCHv2 03/18] arm64: mm: place empty_zero_page in bss Mark Rutland
2016-01-04 17:56 ` [PATCHv2 04/18] arm64: unify idmap removal Mark Rutland
2016-01-04 17:56 ` [PATCHv2 05/18] arm64: unmap idmap earlier Mark Rutland
2016-01-04 17:56 ` [PATCHv2 06/18] arm64: add function to install the idmap Mark Rutland
2016-01-04 17:56 ` [PATCHv2 07/18] arm64: mm: add code to safely replace TTBR1_EL1 Mark Rutland
2016-01-05 15:22   ` Catalin Marinas
2016-01-05 15:45     ` Mark Rutland
2016-01-04 17:56 ` [PATCHv2 08/18] arm64: kasan: avoid TLB conflicts Mark Rutland
2016-01-04 17:56 ` [PATCHv2 09/18] arm64: mm: move pte_* macros Mark Rutland
2016-01-04 17:56 ` [PATCHv2 10/18] arm64: mm: add functions to walk page tables by PA Mark Rutland
2016-01-04 17:56 ` [PATCHv2 11/18] arm64: mm: avoid redundant __pa(__va(x)) Mark Rutland
2016-01-04 17:56 ` [PATCHv2 12/18] arm64: mm: add __{pud,pgd}_populate Mark Rutland
2016-01-04 17:56 ` [PATCHv2 13/18] arm64: mm: add functions to walk tables in fixmap Mark Rutland
2016-01-04 22:49   ` Laura Abbott
2016-01-05 11:08     ` Mark Rutland
2016-01-04 17:56 ` [PATCHv2 14/18] arm64: mm: use fixmap when creating page tables Mark Rutland
2016-01-04 22:38   ` Laura Abbott
2016-01-05 10:40     ` Mark Rutland
2016-01-04 17:56 ` [PATCHv2 15/18] arm64: mm: allocate pagetables anywhere Mark Rutland
2016-01-04 17:56 ` [PATCHv2 16/18] arm64: mm: allow passing a pgdir to alloc_init_* Mark Rutland
2016-01-04 17:56 ` [PATCHv2 17/18] arm64: ensure _stext and _etext are page-aligned Mark Rutland
2016-01-04 17:56 ` [PATCHv2 18/18] arm64: mm: create new fine-grained mappings at boot Mark Rutland
2016-01-05  1:08 ` [PATCHv2 00/18] arm64: mm: rework page table creation Laura Abbott
2016-01-05 11:54   ` Mark Rutland
2016-01-05 18:36     ` Laura Abbott
2016-01-05 18:58       ` Mark Rutland
2016-01-05 19:17         ` Laura Abbott
2016-01-06 11:10           ` Mark Rutland
2016-01-08 19:15     ` Mark Rutland
2016-01-06 10:24 ` Catalin Marinas
2016-01-06 11:36   ` Mark Rutland
2016-01-06 14:23     ` Ard Biesheuvel
2016-01-18 14:47 ` Ard Biesheuvel
