* [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules
@ 2015-04-15 15:34 Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 01/13] arm64: reduce ID map to a single page Ard Biesheuvel
                   ` (12 more replies)
  0 siblings, 13 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

This series came about after Mark Rutland pointed out that the current
FDT placement logic used by the EFI stub is flawed. It then turned out
that the documentation for both the Image and the FDT placement was
incorrect as well, or confusing at the very least.

Changes since v3:
As it turns out, it is quite feasible to add a bias to PHYS_OFFSET during early
boot so that all __pa()/__va() translations are redirected into the kernel
mapping even after it has been moved out of the linear region (as suggested by
Catalin). The primary change in this version is therefore the replacement of
all open-coded page table manipulations with invocations of create_mapping()
et al.
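
As an illustration of the bias trick, a sketch (not part of the series itself;
phys_offset_bias and kernel_va_offset are the names introduced in patch #10):

	/*
	 * early boot: phys_offset_bias = KIMAGE_OFFSET, kernel_va_offset = 0
	 *             -> __va(__pa(swapper_pg_dir)) == swapper_pg_dir,
	 *                i.e. __va() still resolves into the kernel mapping
	 *
	 * map_mem():  phys_offset_bias = 0, kernel_va_offset = KIMAGE_OFFSET
	 *             -> __pa(swapper_pg_dir) is unchanged, while __va() of
	 *                that physical address now returns the linear map
	 *                alias of the same page
	 */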

Other changes:
- map FDT blocks only as needed: after mapping the first block (2 MB or 64 KB
  depending on page size) the FDT size is retrieved and remaining blocks are
  only mapped if necessary;
- add <asm/boot.h> header to have the min/max kernel/fdt alignment/size in a
  single place;
- handle the case where the linear region is too small for all of memory to be
  mapped;
- change the memory reservation logic so that statically allocated translation
  table pages are not reserved unless actually used;
- map the linear region as non-executable after we have moved the kernel text
  out of it;
- deal with mem= limits correctly when the kernel image is high in memory;
- incorporate various other minor review comments from Mark Rutland.

Changes since v2:
This is a complete overhaul of the previous version. The FDT changes are mostly
equivalent, but have been reimplemented in a way that does not rely on the
linear mapping to have been initialized yet. This includes changes to the fixmap
code itself to not rely on that either. Combined with the ID map reduction in
patch #1, this paves the way for relaxing the Image placement requirements as
well, i.e., the kernel Image can now be placed anywhere in memory without
affecting the accessibility of memory below it, and without making the
resulting mapping less efficient because physical and virtual memory are not
relatively aligned.

Changes since v1:
- patch #1: split off reservation of the FDT binary itself from the memreserve
  processing, since the former assumes the FDT is accessed via the linear
  mapping, which we are about to change
- patch #2: mention the older, stricter FDT placement rules in booting.txt,
  get rid of early_print,
  use correct format specifier for phys_addr_t,
  use R/O mapping for FDT,
- patches #3 .. #5: add R-b, minor style and grammar tweaks

Ard Biesheuvel (13):
  arm64: reduce ID map to a single page
  arm64: drop sleep_idmap_phys and clean up cpu_resume()
  of/fdt: split off FDT self reservation from memreserve processing
  arm64: use fixmap region for permanent FDT mapping
  arm64/efi: adapt to relaxed FDT placement requirements
  arm64: implement our own early_init_dt_add_memory_arch()
  arm64: use more granular reservations for static page table
    allocations
  arm64: split off early mapping code from early_fixmap_init()
  arm64: mm: explicitly bootstrap the linear mapping
  arm64: move kernel mapping out of linear region
  arm64: map linear region as non-executable
  arm64: allow kernel Image to be loaded anywhere in physical memory
  arm64/efi: adapt to relaxed kernel Image placement requirements

 Documentation/arm64/booting.txt         |  31 ++--
 arch/arm/mm/init.c                      |   1 +
 arch/arm64/include/asm/boot.h           |  21 +++
 arch/arm64/include/asm/compiler.h       |   2 +
 arch/arm64/include/asm/efi.h            |  10 +-
 arch/arm64/include/asm/fixmap.h         |  15 ++
 arch/arm64/include/asm/memory.h         |  28 +++-
 arch/arm64/include/asm/mmu.h            |   1 +
 arch/arm64/kernel/efi-stub.c            |   5 +-
 arch/arm64/kernel/head.S                |  60 ++-----
 arch/arm64/kernel/setup.c               |  32 ++--
 arch/arm64/kernel/sleep.S               |   9 +-
 arch/arm64/kernel/suspend.c             |   3 -
 arch/arm64/kernel/vmlinux.lds.S         |  50 +++++-
 arch/arm64/mm/Makefile                  |   3 +
 arch/arm64/mm/init.c                    |  64 +++++++-
 arch/arm64/mm/mmu.c                     | 266 ++++++++++++++++++++++----------
 arch/arm64/mm/proc.S                    |   3 +-
 arch/powerpc/kernel/prom.c              |   1 +
 drivers/firmware/efi/libstub/arm-stub.c |   5 +-
 drivers/firmware/efi/libstub/efistub.h  |   1 -
 drivers/firmware/efi/libstub/fdt.c      |  23 +--
 drivers/of/fdt.c                        |  19 ++-
 include/linux/of_fdt.h                  |   2 +
 24 files changed, 442 insertions(+), 213 deletions(-)
 create mode 100644 arch/arm64/include/asm/boot.h

-- 
1.8.3.2

* [PATCH v4 01/13] arm64: reduce ID map to a single page
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 02/13] arm64: drop sleep_idmap_phys and clean up cpu_resume() Ard Biesheuvel
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

Commit ea8c2e112445 ("arm64: Extend the idmap to the whole kernel
image") changed the early page table code so that the entire kernel
Image is covered by the identity map. This allows functions that
need to enable or disable the MMU to reside anywhere in the kernel
Image.

However, this change has the unfortunate side effect that the Image
cannot cross a physical 512 MB alignment boundary anymore, since the
early page table code cannot deal with the Image crossing a /virtual/
512 MB alignment boundary.

So instead, reduce the ID map to a single page, which is populated by
the contents of the .idmap.text section. Only three functions reside
there at the moment: __enable_mmu(), cpu_resume_mmu() and cpu_reset().
If new code is introduced that needs to manipulate the MMU state, it
should be added to this section as well.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/head.S        | 13 +++++++------
 arch/arm64/kernel/sleep.S       |  2 ++
 arch/arm64/kernel/vmlinux.lds.S | 11 ++++++++++-
 arch/arm64/mm/proc.S            |  3 ++-
 4 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 19f915e8f6e0..0c14471d7616 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -382,7 +382,7 @@ __create_page_tables:
 	 * Create the identity mapping.
 	 */
 	mov	x0, x25				// idmap_pg_dir
-	adrp	x3, KERNEL_START		// __pa(KERNEL_START)
+	adrp	x3, __idmap_text_start		// __pa(__idmap_text_start)
 
 #ifndef CONFIG_ARM64_VA_BITS_48
 #define EXTRA_SHIFT	(PGDIR_SHIFT + PAGE_SHIFT - 3)
@@ -405,11 +405,11 @@ __create_page_tables:
 
 	/*
 	 * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the
-	 * entire kernel image can be ID mapped. As T0SZ == (64 - #bits used),
+	 * entire ID map region can be mapped. As T0SZ == (64 - #bits used),
 	 * this number conveniently equals the number of leading zeroes in
-	 * the physical address of KERNEL_END.
+	 * the physical address of __idmap_text_end.
 	 */
-	adrp	x5, KERNEL_END
+	adrp	x5, __idmap_text_end
 	clz	x5, x5
 	cmp	x5, TCR_T0SZ(VA_BITS)	// default T0SZ small enough?
 	b.ge	1f			// .. then skip additional level
@@ -424,8 +424,8 @@ __create_page_tables:
 #endif
 
 	create_pgd_entry x0, x3, x5, x6
-	mov	x5, x3				// __pa(KERNEL_START)
-	adr_l	x6, KERNEL_END			// __pa(KERNEL_END)
+	mov	x5, x3				// __pa(__idmap_text_start)
+	adr_l	x6, __idmap_text_end		// __pa(__idmap_text_end)
 	create_block_map x0, x7, x3, x5, x6
 
 	/*
@@ -669,6 +669,7 @@ ENDPROC(__secondary_switched)
  *
  * other registers depend on the function called upon completion
  */
+	.section	".idmap.text", "ax"
 __enable_mmu:
 	ldr	x5, =vectors
 	msr	vbar_el1, x5
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index ede186cdd452..811e61a2d847 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -130,12 +130,14 @@ ENDPROC(__cpu_suspend_enter)
 /*
  * x0 must contain the sctlr value retrieved from restored context
  */
+	.pushsection	".idmap.text", "ax"
 ENTRY(cpu_resume_mmu)
 	ldr	x3, =cpu_resume_after_mmu
 	msr	sctlr_el1, x0		// restore sctlr_el1
 	isb
 	br	x3			// global jump to virtual address
 ENDPROC(cpu_resume_mmu)
+	.popsection
 cpu_resume_after_mmu:
 	mov	x0, #0			// return zero on success
 	ldp	x19, x20, [sp, #16]
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index a2c29865c3fe..98073332e2d0 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -38,6 +38,12 @@ jiffies = jiffies_64;
 	*(.hyp.text)					\
 	VMLINUX_SYMBOL(__hyp_text_end) = .;
 
+#define IDMAP_TEXT					\
+	. = ALIGN(SZ_4K);				\
+	VMLINUX_SYMBOL(__idmap_text_start) = .;		\
+	*(.idmap.text)					\
+	VMLINUX_SYMBOL(__idmap_text_end) = .;
+
 /*
  * The size of the PE/COFF section that covers the kernel image, which
  * runs from stext to _edata, must be a round multiple of the PE/COFF
@@ -95,6 +101,7 @@ SECTIONS
 			SCHED_TEXT
 			LOCK_TEXT
 			HYPERVISOR_TEXT
+			IDMAP_TEXT
 			*(.fixup)
 			*(.gnu.warning)
 		. = ALIGN(16);
@@ -167,11 +174,13 @@ SECTIONS
 }
 
 /*
- * The HYP init code can't be more than a page long,
+ * The HYP init code and ID map text can't be longer than a page each,
  * and should not cross a page boundary.
  */
 ASSERT(__hyp_idmap_text_end - (__hyp_idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
 	"HYP init code too big or misaligned")
+ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
+	"ID map text too big or misaligned")
 
 /*
  * If padding is applied before .head.text, virt<->phys conversions will fail.
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index cdd754e19b9b..a265934ab0af 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -67,7 +67,7 @@ ENDPROC(cpu_cache_off)
  *
  *	- loc   - location to jump to for soft reset
  */
-	.align	5
+	.pushsection	".idmap.text", "ax"
 ENTRY(cpu_reset)
 	mrs	x1, sctlr_el1
 	bic	x1, x1, #1
@@ -75,6 +75,7 @@ ENTRY(cpu_reset)
 	isb
 	ret	x0
 ENDPROC(cpu_reset)
+	.popsection
 
 ENTRY(cpu_soft_restart)
 	/* Save address of cpu_reset() and reset address */
-- 
1.8.3.2

* [PATCH v4 02/13] arm64: drop sleep_idmap_phys and clean up cpu_resume()
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 01/13] arm64: reduce ID map to a single page Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 03/13] of/fdt: split off FDT self reservation from memreserve processing Ard Biesheuvel
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

Two cleanups of the asm function cpu_resume():
- The global variable sleep_idmap_phys always points to idmap_pg_dir,
  so we can just use that value directly in the CPU resume path.
- Unclutter the load of sleep_save_sp::save_ptr_stash_phys.

Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/sleep.S   | 7 ++-----
 arch/arm64/kernel/suspend.c | 3 ---
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index 811e61a2d847..803cfea41962 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -164,15 +164,12 @@ ENTRY(cpu_resume)
 #else
 	mov	x7, xzr
 #endif
-	adrp	x0, sleep_save_sp
-	add	x0, x0, #:lo12:sleep_save_sp
-	ldr	x0, [x0, #SLEEP_SAVE_SP_PHYS]
+	ldr_l	x0, sleep_save_sp + SLEEP_SAVE_SP_PHYS
 	ldr	x0, [x0, x7, lsl #3]
 	/* load sp from context */
 	ldr	x2, [x0, #CPU_CTX_SP]
-	adrp	x1, sleep_idmap_phys
 	/* load physical address of identity map page table in x1 */
-	ldr	x1, [x1, #:lo12:sleep_idmap_phys]
+	adrp	x1, idmap_pg_dir
 	mov	sp, x2
 	/*
 	 * cpu_do_resume expects x0 to contain context physical address
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index d7daf45ae7a2..f6073c27d65f 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -118,7 +118,6 @@ int __cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 }
 
 struct sleep_save_sp sleep_save_sp;
-phys_addr_t sleep_idmap_phys;
 
 static int __init cpu_suspend_init(void)
 {
@@ -132,9 +131,7 @@ static int __init cpu_suspend_init(void)
 
 	sleep_save_sp.save_ptr_stash = ctx_ptr;
 	sleep_save_sp.save_ptr_stash_phys = virt_to_phys(ctx_ptr);
-	sleep_idmap_phys = virt_to_phys(idmap_pg_dir);
 	__flush_dcache_area(&sleep_save_sp, sizeof(struct sleep_save_sp));
-	__flush_dcache_area(&sleep_idmap_phys, sizeof(sleep_idmap_phys));
 
 	return 0;
 }
-- 
1.8.3.2

* [PATCH v4 03/13] of/fdt: split off FDT self reservation from memreserve processing
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 01/13] arm64: reduce ID map to a single page Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 02/13] arm64: drop sleep_idmap_phys and clean up cpu_resume() Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 04/13] arm64: use fixmap region for permanent FDT mapping Ard Biesheuvel
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

This splits off the reservation of the memory occupied by the FDT
binary itself from the processing of the memory reservations it
contains. This is necessary because the physical address of the FDT,
which is needed to perform the reservation, may not be known to the
FDT driver core, i.e., it may be mapped outside the linear direct
mapping, in which case __pa() returns a bogus value.
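
For architectures that keep accessing the FDT through the linear mapping,
the expected call sequence simply gains one extra call (a sketch mirroring
the arm/arm64/powerpc hunks below):

	/* arch early memblock init (sketch) */
	early_init_fdt_reserve_self();		/* reserve the FDT blob itself */
	early_init_fdt_scan_reserved_mem();	/* process /memreserve/ entries */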

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/mm/init.c         |  1 +
 arch/arm64/mm/init.c       |  1 +
 arch/powerpc/kernel/prom.c |  1 +
 drivers/of/fdt.c           | 19 ++++++++++++++-----
 include/linux/of_fdt.h     |  2 ++
 5 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 1609b022a72f..0b8657c36fe4 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -317,6 +317,7 @@ void __init arm_memblock_init(const struct machine_desc *mdesc)
 	if (mdesc->reserve)
 		mdesc->reserve();
 
+	early_init_fdt_reserve_self();
 	early_init_fdt_scan_reserved_mem();
 
 	/* reserve memory for DMA contiguous allocations */
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index ae85da6307bb..fa2389b0f7f0 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -170,6 +170,7 @@ void __init arm64_memblock_init(void)
 		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
 #endif
 
+	early_init_fdt_reserve_self();
 	early_init_fdt_scan_reserved_mem();
 
 	/* 4GB maximum for 32-bit only capable devices */
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index b8e15c678960..093ccfb384af 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -573,6 +573,7 @@ static void __init early_reserve_mem_dt(void)
 	int len;
 	const __be32 *prop;
 
+	early_init_fdt_reserve_self();
 	early_init_fdt_scan_reserved_mem();
 
 	dt_root = of_get_flat_dt_root();
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 3a896c9aeb74..bbb35cdb06e8 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -561,11 +561,6 @@ void __init early_init_fdt_scan_reserved_mem(void)
 	if (!initial_boot_params)
 		return;
 
-	/* Reserve the dtb region */
-	early_init_dt_reserve_memory_arch(__pa(initial_boot_params),
-					  fdt_totalsize(initial_boot_params),
-					  0);
-
 	/* Process header /memreserve/ fields */
 	for (n = 0; ; n++) {
 		fdt_get_mem_rsv(initial_boot_params, n, &base, &size);
@@ -579,6 +574,20 @@ void __init early_init_fdt_scan_reserved_mem(void)
 }
 
 /**
+ * early_init_fdt_reserve_self() - reserve the memory used by the FDT blob
+ */
+void __init early_init_fdt_reserve_self(void)
+{
+	if (!initial_boot_params)
+		return;
+
+	/* Reserve the dtb region */
+	early_init_dt_reserve_memory_arch(__pa(initial_boot_params),
+					  fdt_totalsize(initial_boot_params),
+					  0);
+}
+
+/**
  * of_scan_flat_dt - scan flattened tree blob and call callback on each.
  * @it: callback function
  * @data: context data pointer
diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
index 0ff360d5b3b3..6ef6b33238d3 100644
--- a/include/linux/of_fdt.h
+++ b/include/linux/of_fdt.h
@@ -62,6 +62,7 @@ extern int early_init_dt_scan_chosen(unsigned long node, const char *uname,
 extern int early_init_dt_scan_memory(unsigned long node, const char *uname,
 				     int depth, void *data);
 extern void early_init_fdt_scan_reserved_mem(void);
+extern void early_init_fdt_reserve_self(void);
 extern void early_init_dt_add_memory_arch(u64 base, u64 size);
 extern int early_init_dt_reserve_memory_arch(phys_addr_t base, phys_addr_t size,
 					     bool no_map);
@@ -89,6 +90,7 @@ extern u64 fdt_translate_address(const void *blob, int node_offset);
 extern void of_fdt_limit_memory(int limit);
 #else /* CONFIG_OF_FLATTREE */
 static inline void early_init_fdt_scan_reserved_mem(void) {}
+static inline void early_init_fdt_reserve_self(void) {}
 static inline const char *of_flat_dt_get_machine_name(void) { return NULL; }
 static inline void unflatten_device_tree(void) {}
 static inline void unflatten_and_copy_device_tree(void) {}
-- 
1.8.3.2

* [PATCH v4 04/13] arm64: use fixmap region for permanent FDT mapping
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2015-04-15 15:34 ` [PATCH v4 03/13] of/fdt: split off FDT self reservation from memreserve processing Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  2015-04-17 15:13   ` Mark Rutland
  2015-04-15 15:34 ` [PATCH v4 05/13] arm64/efi: adapt to relaxed FDT placement requirements Ard Biesheuvel
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

Currently, the FDT blob needs to be in the same 512 MB region as
the kernel, so that it can be mapped into the kernel virtual memory
space very early on using a minimal set of statically allocated
translation tables.

Now that we have early fixmap support, we can relax this restriction,
by moving the permanent FDT mapping to the fixmap region instead.
This way, the FDT blob may be anywhere in memory.

This also moves the vetting of the FDT to mmu.c, since the early
init code in head.S does not handle mapping of the FDT anymore.
At the same time, fix up some comments in head.S that have gone stale.
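
The sizing of the FDT window in the fixmap hunk below can be checked with a
small worked example (illustrative numbers, assuming 4 KB pages, i.e. a 2 MB
mapping granularity):

	/*
	 * dt_virt = dt_virt_base + dt_phys % SZ_2M, so a maximally sized
	 * 2 MB FDT whose physical start lies 1 MB past a 2 MB boundary
	 * occupies window offsets 1 MB to 3 MB, which is why the window is
	 * sized MAX_FDT_SIZE + 2 MB rather than just MAX_FDT_SIZE.
	 */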

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt | 13 +++++----
 arch/arm64/include/asm/boot.h   | 14 +++++++++
 arch/arm64/include/asm/fixmap.h | 15 ++++++++++
 arch/arm64/include/asm/mmu.h    |  1 +
 arch/arm64/kernel/head.S        | 39 +------------------------
 arch/arm64/kernel/setup.c       | 32 +++++++-------------
 arch/arm64/mm/Makefile          |  2 ++
 arch/arm64/mm/init.c            |  1 -
 arch/arm64/mm/mmu.c             | 65 +++++++++++++++++++++++++++++++++++++++++
 9 files changed, 117 insertions(+), 65 deletions(-)
 create mode 100644 arch/arm64/include/asm/boot.h

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index f3c05b5f9f08..53f18e13d51c 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -45,11 +45,14 @@ sees fit.)
 
 Requirement: MANDATORY
 
-The device tree blob (dtb) must be placed on an 8-byte boundary within
-the first 512 megabytes from the start of the kernel image and must not
-cross a 2-megabyte boundary. This is to allow the kernel to map the
-blob using a single section mapping in the initial page tables.
-
+The device tree blob (dtb) must be placed on an 8-byte boundary and must
+not exceed 2 megabytes in size. Since the dtb will be mapped cacheable using
+blocks of up to 2 megabytes in size, it should not be placed within 2 megabytes
+of memreserves or other special carveouts that may be mapped with non-matching
+attributes.
+
+NOTE: versions prior to v4.2 also require that the DTB be placed within
+the 512 MB region starting at text_offset bytes below the kernel Image.
 
 3. Decompress the kernel image
 ------------------------------
diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
new file mode 100644
index 000000000000..81151b67b26b
--- /dev/null
+++ b/arch/arm64/include/asm/boot.h
@@ -0,0 +1,14 @@
+
+#ifndef __ASM_BOOT_H
+#define __ASM_BOOT_H
+
+#include <asm/sizes.h>
+
+/*
+ * arm64 requires the DTB to be 8 byte aligned and
+ * not exceed 2MB in size.
+ */
+#define MIN_FDT_ALIGN		8
+#define MAX_FDT_SIZE		SZ_2M
+
+#endif
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 926495686554..ef66c26eebbe 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -17,6 +17,7 @@
 
 #ifndef __ASSEMBLY__
 #include <linux/kernel.h>
+#include <asm/boot.h>
 #include <asm/page.h>
 
 /*
@@ -32,6 +33,20 @@
  */
 enum fixed_addresses {
 	FIX_HOLE,
+
+	/*
+	 * Reserve a virtual window for the FDT that is 2 MB larger than the
+	 * maximum supported size, and put it at the top of the fixmap region.
+	 * The additional space ensures that any FDT that does not exceed
+	 * MAX_FDT_SIZE can be mapped regardless of whether it crosses any
+	 * 2 MB alignment boundaries.
+	 *
+	 * Keep this at the top so it remains 2 MB aligned.
+	 */
+#define FIX_FDT_SIZE		(MAX_FDT_SIZE + SZ_2M)
+	FIX_FDT_END,
+	FIX_FDT = FIX_FDT_END + FIX_FDT_SIZE / PAGE_SIZE - 1,
+
 	FIX_EARLYCON_MEM_BASE,
 	FIX_TEXT_POKE0,
 	__end_of_permanent_fixed_addresses,
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 3d311761e3c2..79fcfb048884 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -34,5 +34,6 @@ extern void init_mem_pgprot(void);
 extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 			       unsigned long virt, phys_addr_t size,
 			       pgprot_t prot);
+extern void *fixmap_remap_fdt(phys_addr_t dt_phys);
 
 #endif
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 0c14471d7616..c0ff3ce4299e 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -237,8 +237,6 @@ ENTRY(stext)
 	bl	el2_setup			// Drop to EL1, w20=cpu_boot_mode
 	adrp	x24, __PHYS_OFFSET
 	bl	set_cpu_boot_mode_flag
-
-	bl	__vet_fdt
 	bl	__create_page_tables		// x25=TTBR0, x26=TTBR1
 	/*
 	 * The following calls CPU setup code, see arch/arm64/mm/proc.S for
@@ -270,24 +268,6 @@ preserve_boot_args:
 ENDPROC(preserve_boot_args)
 
 /*
- * Determine validity of the x21 FDT pointer.
- * The dtb must be 8-byte aligned and live in the first 512M of memory.
- */
-__vet_fdt:
-	tst	x21, #0x7
-	b.ne	1f
-	cmp	x21, x24
-	b.lt	1f
-	mov	x0, #(1 << 29)
-	add	x0, x0, x24
-	cmp	x21, x0
-	b.ge	1f
-	ret
-1:
-	mov	x21, #0
-	ret
-ENDPROC(__vet_fdt)
-/*
  * Macro to create a table entry to the next page.
  *
  *	tbl:	page table address
@@ -348,8 +328,7 @@ ENDPROC(__vet_fdt)
  * required to get the kernel running. The following sections are required:
  *   - identity mapping to enable the MMU (low address, TTBR0)
  *   - first few MB of the kernel linear mapping to jump to once the MMU has
- *     been enabled, including the FDT blob (TTBR1)
- *   - pgd entry for fixed mappings (TTBR1)
+ *     been enabled
  */
 __create_page_tables:
 	adrp	x25, idmap_pg_dir
@@ -439,22 +418,6 @@ __create_page_tables:
 	create_block_map x0, x7, x3, x5, x6
 
 	/*
-	 * Map the FDT blob (maximum 2MB; must be within 512MB of
-	 * PHYS_OFFSET).
-	 */
-	mov	x3, x21				// FDT phys address
-	and	x3, x3, #~((1 << 21) - 1)	// 2MB aligned
-	mov	x6, #PAGE_OFFSET
-	sub	x5, x3, x24			// subtract PHYS_OFFSET
-	tst	x5, #~((1 << 29) - 1)		// within 512MB?
-	csel	x21, xzr, x21, ne		// zero the FDT pointer
-	b.ne	1f
-	add	x5, x5, x6			// __va(FDT blob)
-	add	x6, x5, #1 << 21		// 2MB for the FDT blob
-	sub	x6, x6, #1			// inclusive range
-	create_block_map x0, x7, x3, x5, x6
-1:
-	/*
 	 * Since the page tables have been populated with non-cacheable
 	 * accesses (MMU disabled), invalidate the idmap and swapper page
 	 * tables again to remove any speculatively loaded cache lines.
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 51ef97274b52..eafad20a4c80 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -103,18 +103,6 @@ static struct resource mem_res[] = {
 #define kernel_code mem_res[0]
 #define kernel_data mem_res[1]
 
-void __init early_print(const char *str, ...)
-{
-	char buf[256];
-	va_list ap;
-
-	va_start(ap, str);
-	vsnprintf(buf, sizeof(buf), str, ap);
-	va_end(ap);
-
-	printk("%s", buf);
-}
-
 /*
  * The recorded values of x0 .. x3 upon kernel entry.
  */
@@ -324,12 +312,14 @@ static void __init setup_processor(void)
 
 static void __init setup_machine_fdt(phys_addr_t dt_phys)
 {
-	if (!dt_phys || !early_init_dt_scan(phys_to_virt(dt_phys))) {
-		early_print("\n"
-			"Error: invalid device tree blob at physical address 0x%p (virtual address 0x%p)\n"
-			"The dtb must be 8-byte aligned and passed in the first 512MB of memory\n"
-			"\nPlease check your bootloader.\n",
-			dt_phys, phys_to_virt(dt_phys));
+	void *dt_virt = fixmap_remap_fdt(dt_phys);
+
+	if (!dt_virt || !early_init_dt_scan(dt_virt)) {
+		pr_crit("\n"
+			"Error: invalid device tree blob at physical address %pa (virtual address 0x%p)\n"
+			"The dtb must be 8-byte aligned and must not exceed 2 MB in size\n"
+			"\nPlease check your bootloader.",
+			&dt_phys, dt_virt);
 
 		while (true)
 			cpu_relax();
@@ -372,6 +362,9 @@ void __init setup_arch(char **cmdline_p)
 {
 	setup_processor();
 
+	early_fixmap_init();
+	early_ioremap_init();
+
 	setup_machine_fdt(__fdt_pointer);
 
 	init_mm.start_code = (unsigned long) _text;
@@ -381,9 +374,6 @@ void __init setup_arch(char **cmdline_p)
 
 	*cmdline_p = boot_command_line;
 
-	early_fixmap_init();
-	early_ioremap_init();
-
 	parse_early_param();
 
 	/*
diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
index 773d37a14039..9d84feb41a16 100644
--- a/arch/arm64/mm/Makefile
+++ b/arch/arm64/mm/Makefile
@@ -4,3 +4,5 @@ obj-y				:= dma-mapping.o extable.o fault.o init.o \
 				   context.o proc.o pageattr.o
 obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
 obj-$(CONFIG_ARM64_PTDUMP)	+= dump.o
+
+CFLAGS_mmu.o			:= -I$(srctree)/scripts/dtc/libfdt/
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index fa2389b0f7f0..ae85da6307bb 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -170,7 +170,6 @@ void __init arm64_memblock_init(void)
 		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
 #endif
 
-	early_init_fdt_reserve_self();
 	early_init_fdt_scan_reserved_mem();
 
 	/* 4GB maximum for 32-bit only capable devices */
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 428aaf86c95b..aa99b7a0d660 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -21,6 +21,7 @@
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/init.h>
+#include <linux/libfdt.h>
 #include <linux/mman.h>
 #include <linux/nodemask.h>
 #include <linux/memblock.h>
@@ -643,3 +644,67 @@ void __set_fixmap(enum fixed_addresses idx,
 		flush_tlb_kernel_range(addr, addr+PAGE_SIZE);
 	}
 }
+
+void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+{
+	const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
+	pgprot_t prot = PAGE_KERNEL | PTE_RDONLY;
+	int granularity, size;
+	void *dt_virt;
+
+	/*
+	 * Check whether the physical FDT address is set and meets the minimum
+	 * alignment requirement. Since we are relying on MIN_FDT_ALIGN to be
+	 * at least 8 bytes so that we can always access the size field of the
+	 * FDT header after mapping the first chunk, double check here if that
+	 * is indeed the case.
+	 */
+	BUILD_BUG_ON(MIN_FDT_ALIGN < 8);
+	if (!dt_phys || dt_phys % MIN_FDT_ALIGN)
+		return NULL;
+
+	/*
+	 * Make sure that the FDT region can be mapped without the need to
+	 * allocate additional translation table pages, so that it is safe
+	 * to call create_mapping() this early.
+	 *
+	 * On 64k pages, the fixmap region will be mapped using PTEs, so we
+	 * need to be in the same PMD as the rest of the fixmap.
+	 * On 4k pages, we'll use section mappings for the region so we only
+	 * have to be in the same PUD.
+	 */
+	BUILD_BUG_ON(dt_virt_base % SZ_2M);
+
+	if (IS_ENABLED(CONFIG_ARM64_64K_PAGES)) {
+		BUILD_BUG_ON(__fix_to_virt(FIX_FDT_END) >> PMD_SHIFT !=
+			     __fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT);
+
+		granularity = PAGE_SIZE;
+	} else {
+		BUILD_BUG_ON(__fix_to_virt(FIX_FDT_END) >> PUD_SHIFT !=
+			     __fix_to_virt(FIX_BTMAP_BEGIN) >> PUD_SHIFT);
+
+		granularity = PMD_SIZE;
+	}
+
+	dt_virt = (void *)dt_virt_base + dt_phys % granularity;
+
+	/* map the first chunk so we can read the size from the header */
+	create_mapping(round_down(dt_phys, granularity), dt_virt_base,
+		       granularity, prot);
+
+	if (fdt_check_header(dt_virt) != 0)
+		return NULL;
+
+	size = fdt_totalsize(dt_virt);
+	if (size > MAX_FDT_SIZE)
+		return NULL;
+
+	if (size > granularity)
+		create_mapping(round_down(dt_phys, granularity), dt_virt_base,
+			       round_up(size, granularity), prot);
+
+	memblock_reserve(dt_phys, size);
+
+	return dt_virt;
+}
-- 
1.8.3.2

* [PATCH v4 05/13] arm64/efi: adapt to relaxed FDT placement requirements
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
                   ` (3 preceding siblings ...)
  2015-04-15 15:34 ` [PATCH v4 04/13] arm64: use fixmap region for permanent FDT mapping Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 06/13] arm64: implement our own early_init_dt_add_memory_arch() Ard Biesheuvel
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

With the relaxed FDT placement requirements in place, we can change
the allocation strategy used by the stub to put the FDT image higher
up in memory. At the same time, reduce the minimal alignment to 8 bytes,
and impose a 2 MB size limit, as per the new requirements.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/efi.h            | 10 +++-------
 drivers/firmware/efi/libstub/arm-stub.c |  5 ++---
 drivers/firmware/efi/libstub/efistub.h  |  1 -
 drivers/firmware/efi/libstub/fdt.c      | 23 +++++++++++++----------
 4 files changed, 18 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index ef572206f1c3..825c85666b6b 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -1,6 +1,7 @@
 #ifndef _ASM_EFI_H
 #define _ASM_EFI_H
 
+#include <asm/boot.h>
 #include <asm/io.h>
 #include <asm/neon.h>
 
@@ -38,13 +39,8 @@ extern void efi_init(void);
 
 /* arch specific definitions used by the stub code */
 
-/*
- * AArch64 requires the DTB to be 8-byte aligned in the first 512MiB from
- * start of kernel and may not cross a 2MiB boundary. We set alignment to
- * 2MiB so we know it won't cross a 2MiB boundary.
- */
-#define EFI_FDT_ALIGN	SZ_2M   /* used by allocate_new_fdt_and_exit_boot() */
-#define MAX_FDT_OFFSET	SZ_512M
+#define EFI_FDT_ALIGN		MIN_FDT_ALIGN
+#define EFI_FDT_MAX_SIZE	MAX_FDT_SIZE
 
 #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__)
 
diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
index dcae482a9a17..f54c76a4fd32 100644
--- a/drivers/firmware/efi/libstub/arm-stub.c
+++ b/drivers/firmware/efi/libstub/arm-stub.c
@@ -269,9 +269,8 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 
 	new_fdt_addr = fdt_addr;
 	status = allocate_new_fdt_and_exit_boot(sys_table, handle,
-				&new_fdt_addr, dram_base + MAX_FDT_OFFSET,
-				initrd_addr, initrd_size, cmdline_ptr,
-				fdt_addr, fdt_size);
+				&new_fdt_addr, initrd_addr, initrd_size,
+				cmdline_ptr, fdt_addr, fdt_size);
 
 	/*
 	 * If all went well, we need to return the FDT address to the
diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
index 47437b16b186..c8e096094ea9 100644
--- a/drivers/firmware/efi/libstub/efistub.h
+++ b/drivers/firmware/efi/libstub/efistub.h
@@ -35,7 +35,6 @@ efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
 efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 					    void *handle,
 					    unsigned long *new_fdt_addr,
-					    unsigned long max_addr,
 					    u64 initrd_addr, u64 initrd_size,
 					    char *cmdline_ptr,
 					    unsigned long fdt_addr,
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index 91da56c4fd54..ace5ed70a88e 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -165,10 +165,6 @@ fdt_set_fail:
 	return EFI_LOAD_ERROR;
 }
 
-#ifndef EFI_FDT_ALIGN
-#define EFI_FDT_ALIGN EFI_PAGE_SIZE
-#endif
-
 /*
  * Allocate memory for a new FDT, then add EFI, commandline, and
  * initrd related fields to the FDT.  This routine increases the
@@ -186,7 +182,6 @@ fdt_set_fail:
 efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 					    void *handle,
 					    unsigned long *new_fdt_addr,
-					    unsigned long max_addr,
 					    u64 initrd_addr, u64 initrd_size,
 					    char *cmdline_ptr,
 					    unsigned long fdt_addr,
@@ -197,6 +192,7 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 	unsigned long mmap_key;
 	efi_memory_desc_t *memory_map, *runtime_map;
 	unsigned long new_fdt_size;
+	void *fdt_alloc;
 	efi_status_t status;
 	int runtime_entry_count = 0;
 
@@ -221,14 +217,21 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 	 * will allocate a bigger buffer if this ends up being too
 	 * small, so a rough guess is OK here.
 	 */
-	new_fdt_size = fdt_size + EFI_PAGE_SIZE;
+	new_fdt_size = fdt_size + EFI_PAGE_SIZE + EFI_FDT_ALIGN;
 	while (1) {
-		status = efi_high_alloc(sys_table, new_fdt_size, EFI_FDT_ALIGN,
-					new_fdt_addr, max_addr);
+		if (new_fdt_size > EFI_FDT_MAX_SIZE) {
+			pr_efi_err(sys_table, "FDT size exceeds EFI_FDT_MAX_SIZE.\n");
+			goto fail;
+		}
+		status = sys_table->boottime->allocate_pool(EFI_LOADER_DATA,
+							    new_fdt_size,
+							    &fdt_alloc);
 		if (status != EFI_SUCCESS) {
 			pr_efi_err(sys_table, "Unable to allocate memory for new device tree.\n");
 			goto fail;
 		}
+		*new_fdt_addr = round_up((unsigned long)fdt_alloc,
+					 EFI_FDT_ALIGN);
 
 		/*
 		 * Now that we have done our final memory allocation (and free)
@@ -258,7 +261,7 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 			 * to get new one that reflects the free/alloc we do
 			 * on the device tree buffer.
 			 */
-			efi_free(sys_table, new_fdt_size, *new_fdt_addr);
+			sys_table->boottime->free_pool(&fdt_alloc);
 			sys_table->boottime->free_pool(memory_map);
 			new_fdt_size += EFI_PAGE_SIZE;
 		} else {
@@ -316,7 +319,7 @@ fail_free_mmap:
 	sys_table->boottime->free_pool(memory_map);
 
 fail_free_new_fdt:
-	efi_free(sys_table, new_fdt_size, *new_fdt_addr);
+	sys_table->boottime->free_pool(&fdt_alloc);
 
 fail:
 	sys_table->boottime->free_pool(runtime_map);
-- 
1.8.3.2

* [PATCH v4 06/13] arm64: implement our own early_init_dt_add_memory_arch()
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
                   ` (4 preceding siblings ...)
  2015-04-15 15:34 ` [PATCH v4 05/13] arm64/efi: adapt to relaxed FDT placement requirements Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 07/13] arm64: use more granular reservations for static page table allocations Ard Biesheuvel
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

Override the __weak early_init_dt_add_memory_arch() with our own
version. This allows us to relax the imposed restrictions at memory
discovery time, which is needed if we want to defer the assignment
of PHYS_OFFSET and make it independent of where the kernel Image
is placed in physical memory.

So copy the generic original, but retain only the check that discards
regions whose size becomes zero when clipped to page alignment.

For now, we will remove the range below PHYS_OFFSET explicitly
until we rework that logic in a subsequent patch. Any memory that
we will not be able to map due to insufficient size of the linear
region is also removed.
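
To illustrate the clipping behaviour (hypothetical numbers, 4 KB pages):

	/*
	 * base = 0x80000800, size = 0x3000: base is not page aligned, so
	 * size is first reduced by 0x800 to 0x2800 and base rounded up to
	 * 0x80001000; size is then truncated to 0x2000, and
	 * memblock_add(0x80001000, 0x2000) registers exactly the pages that
	 * are fully covered by the original region.
	 */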

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/mm/init.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index ae85da6307bb..1599a5c5e94a 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -158,6 +158,15 @@ early_param("mem", early_mem);
 
 void __init arm64_memblock_init(void)
 {
+	/*
+	 * Remove the memory that we will not be able to cover
+	 * with the linear mapping.
+	 */
+	const s64 linear_region_size = -(s64)PAGE_OFFSET;
+
+	memblock_remove(0, memstart_addr);
+	memblock_remove(memstart_addr + linear_region_size, ULLONG_MAX);
+
 	memblock_enforce_memory_limit(memory_limit);
 
 	/*
@@ -372,3 +381,19 @@ static int __init keepinitrd_setup(char *__unused)
 
 __setup("keepinitrd", keepinitrd_setup);
 #endif
+
+void __init early_init_dt_add_memory_arch(u64 base, u64 size)
+{
+	if (!PAGE_ALIGNED(base)) {
+		if (size < PAGE_SIZE - (base & ~PAGE_MASK)) {
+			pr_warn("Ignoring memory block 0x%llx - 0x%llx\n",
+				base, base + size);
+			return;
+		}
+		size -= PAGE_SIZE - (base & ~PAGE_MASK);
+		base = PAGE_ALIGN(base);
+	}
+	size &= PAGE_MASK;
+
+	memblock_add(base, size);
+}
-- 
1.8.3.2

* [PATCH v4 07/13] arm64: use more granular reservations for static page table allocations
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
                   ` (5 preceding siblings ...)
  2015-04-15 15:34 ` [PATCH v4 06/13] arm64: implement our own early_init_dt_add_memory_arch() Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 08/13] arm64: split off early mapping code from early_fixmap_init() Ard Biesheuvel
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

Before introducing new statically allocated page tables and increasing
their alignment in subsequent patches, update the reservation logic
so that only pages that are in actual use end up as reserved with
memblock.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/mm/init.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1599a5c5e94a..0e7d9a2aad39 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -37,6 +37,7 @@
 
 #include <asm/fixmap.h>
 #include <asm/memory.h>
+#include <asm/mmu_context.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
 #include <asm/sizes.h>
@@ -173,11 +174,13 @@ void __init arm64_memblock_init(void)
 	 * Register the kernel text, kernel data, initrd, and initial
 	 * pagetables with memblock.
 	 */
-	memblock_reserve(__pa(_text), _end - _text);
+	memblock_reserve(__pa(_text), __bss_stop - _text);
 #ifdef CONFIG_BLK_DEV_INITRD
 	if (initrd_start)
 		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
 #endif
+	memblock_reserve(__pa(idmap_pg_dir), IDMAP_DIR_SIZE);
+	memblock_reserve(__pa(swapper_pg_dir), SWAPPER_DIR_SIZE);
 
 	early_init_fdt_scan_reserved_mem();
 
-- 
1.8.3.2

* [PATCH v4 08/13] arm64: split off early mapping code from early_fixmap_init()
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
                   ` (6 preceding siblings ...)
  2015-04-15 15:34 ` [PATCH v4 07/13] arm64: use more granular reservations for static page table allocations Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 09/13] arm64: mm: explicitly bootstrap the linear mapping Ard Biesheuvel
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

This splits off and generalises the population of the statically
allocated fixmap page tables, so that we can reuse the code later for
bootstrapping the linear mapping once we move the kernel text mapping
out of the linear region.

This also involves taking into account that table entries at any of
the levels we are populating may have been populated already, since
the fixmap mapping is no longer guaranteed to be disjoint from other
early mappings all the way up to the pgd level.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/compiler.h |  2 ++
 arch/arm64/kernel/vmlinux.lds.S   | 12 ++++----
 arch/arm64/mm/mmu.c               | 60 +++++++++++++++++++++++++++------------
 3 files changed, 51 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/include/asm/compiler.h b/arch/arm64/include/asm/compiler.h
index ee35fd0f2236..dd342af63673 100644
--- a/arch/arm64/include/asm/compiler.h
+++ b/arch/arm64/include/asm/compiler.h
@@ -27,4 +27,6 @@
  */
 #define __asmeq(x, y)  ".ifnc " x "," y " ; .err ; .endif\n\t"
 
+#define __pgdir		__attribute__((section(".pgdir"),aligned(PAGE_SIZE)))
+
 #endif	/* __ASM_COMPILER_H */
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 98073332e2d0..ceec4def354b 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -160,11 +160,13 @@ SECTIONS
 
 	BSS_SECTION(0, 0, 0)
 
-	. = ALIGN(PAGE_SIZE);
-	idmap_pg_dir = .;
-	. += IDMAP_DIR_SIZE;
-	swapper_pg_dir = .;
-	. += SWAPPER_DIR_SIZE;
+	.pgdir (NOLOAD) : ALIGN(PAGE_SIZE) {
+		idmap_pg_dir = .;
+		. += IDMAP_DIR_SIZE;
+		swapper_pg_dir = .;
+		. += SWAPPER_DIR_SIZE;
+		*(.pgdir)
+	}
 
 	_end = .;
 
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index aa99b7a0d660..c27ab20a5ba9 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -342,6 +342,44 @@ static void __init __map_memblock(phys_addr_t start, phys_addr_t end)
 }
 #endif
 
+struct bootstrap_pgtables {
+	pte_t	pte[PTRS_PER_PTE];
+	pmd_t	pmd[PTRS_PER_PMD > 1 ? PTRS_PER_PMD : 0];
+	pud_t	pud[PTRS_PER_PUD > 1 ? PTRS_PER_PUD : 0];
+};
+
+static void __init bootstrap_early_mapping(unsigned long addr,
+					   struct bootstrap_pgtables *reg,
+					   bool pte_level)
+{
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	pgd = pgd_offset_k(addr);
+	if (pgd_none(*pgd)) {
+		clear_page(reg->pud);
+		memblock_reserve(__pa(reg->pud), PAGE_SIZE);
+		pgd_populate(&init_mm, pgd, reg->pud);
+	}
+	pud = pud_offset(pgd, addr);
+	if (pud_none(*pud)) {
+		clear_page(reg->pmd);
+		memblock_reserve(__pa(reg->pmd), PAGE_SIZE);
+		pud_populate(&init_mm, pud, reg->pmd);
+	}
+
+	if (!pte_level)
+		return;
+
+	pmd = pmd_offset(pud, addr);
+	if (pmd_none(*pmd)) {
+		clear_page(reg->pte);
+		memblock_reserve(__pa(reg->pte), PAGE_SIZE);
+		pmd_populate_kernel(&init_mm, pmd, reg->pte);
+	}
+}
+
 static void __init map_mem(void)
 {
 	struct memblock_region *reg;
@@ -555,14 +593,6 @@ void vmemmap_free(unsigned long start, unsigned long end)
 }
 #endif	/* CONFIG_SPARSEMEM_VMEMMAP */
 
-static pte_t bm_pte[PTRS_PER_PTE] __page_aligned_bss;
-#if CONFIG_ARM64_PGTABLE_LEVELS > 2
-static pmd_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss;
-#endif
-#if CONFIG_ARM64_PGTABLE_LEVELS > 3
-static pud_t bm_pud[PTRS_PER_PUD] __page_aligned_bss;
-#endif
-
 static inline pud_t * fixmap_pud(unsigned long addr)
 {
 	pgd_t *pgd = pgd_offset_k(addr);
@@ -592,21 +622,15 @@ static inline pte_t * fixmap_pte(unsigned long addr)
 
 void __init early_fixmap_init(void)
 {
-	pgd_t *pgd;
-	pud_t *pud;
+	static struct bootstrap_pgtables fixmap_bs_pgtables __pgdir;
 	pmd_t *pmd;
-	unsigned long addr = FIXADDR_START;
 
-	pgd = pgd_offset_k(addr);
-	pgd_populate(&init_mm, pgd, bm_pud);
-	pud = pud_offset(pgd, addr);
-	pud_populate(&init_mm, pud, bm_pmd);
-	pmd = pmd_offset(pud, addr);
-	pmd_populate_kernel(&init_mm, pmd, bm_pte);
+	bootstrap_early_mapping(FIXADDR_START, &fixmap_bs_pgtables, true);
+	pmd = fixmap_pmd(FIXADDR_START);
 
 	/*
 	 * The boot-ioremap range spans multiple pmds, for which
-	 * we are not preparted:
+	 * we are not prepared:
 	 */
 	BUILD_BUG_ON((__fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT)
 		     != (__fix_to_virt(FIX_BTMAP_END) >> PMD_SHIFT));
-- 
1.8.3.2

* [PATCH v4 09/13] arm64: mm: explicitly bootstrap the linear mapping
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
                   ` (7 preceding siblings ...)
  2015-04-15 15:34 ` [PATCH v4 08/13] arm64: split off early mapping code from early_fixmap_init() Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  2015-05-07 16:54   ` Catalin Marinas
  2015-04-15 15:34 ` [PATCH v4 10/13] arm64: move kernel mapping out of linear region Ard Biesheuvel
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

In preparation for moving the kernel text out of the linear
mapping, ensure that the part of the kernel Image that contains
the statically allocated page tables is made accessible via the
linear mapping before performing the actual mapping of all of
memory. This is needed by the normal mapping routines, which rely
on the linear mapping to walk the page tables while manipulating
them.
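
As a reminder of why that is the case, a simplified sketch (not taken from
the patch): descending a level in the generic page table accessors goes
through a __va() translation of the physical address stored in the upper
level entry, roughly

	pmd_t *pmd = pmd_offset(pud, addr);
	/* ~ (pmd_t *)__va(pud_val(*pud) & PHYS_MASK) + pmd_index(addr) */

so the pages holding swapper_pg_dir must already be covered by the linear
mapping before create_mapping() starts walking and extending it.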

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/vmlinux.lds.S | 18 ++++++++-
 arch/arm64/mm/mmu.c             | 89 +++++++++++++++++++++++++++--------------
 2 files changed, 75 insertions(+), 32 deletions(-)

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index ceec4def354b..338eaa7bcbfd 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -68,6 +68,17 @@ PECOFF_FILE_ALIGNMENT = 0x200;
 #define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(min);
 #endif
 
+/*
+ * The pgdir region needs to be mappable using a single PMD or PUD sized region,
+ * so it should not cross a 512 MB or 1 GB alignment boundary, respectively
+ * (depending on page size). So align to an upper bound of its size.
+ */
+#if CONFIG_ARM64_PGTABLE_LEVELS == 2
+#define PGDIR_ALIGN	(8 * PAGE_SIZE)
+#else
+#define PGDIR_ALIGN	(16 * PAGE_SIZE)
+#endif
+
 SECTIONS
 {
 	/*
@@ -160,7 +171,7 @@ SECTIONS
 
 	BSS_SECTION(0, 0, 0)
 
-	.pgdir (NOLOAD) : ALIGN(PAGE_SIZE) {
+	.pgdir (NOLOAD) : ALIGN(PGDIR_ALIGN) {
 		idmap_pg_dir = .;
 		. += IDMAP_DIR_SIZE;
 		swapper_pg_dir = .;
@@ -185,6 +196,11 @@ ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
 	"ID map text too big or misaligned")
 
 /*
+ * Check that the chosen PGDIR_ALIGN value is sufficient.
+ */
+ASSERT(SIZEOF(.pgdir) < ALIGNOF(.pgdir), ".pgdir size exceeds its alignment")
+
+/*
  * If padding is applied before .head.text, virt<->phys conversions will fail.
  */
 ASSERT(_text == (PAGE_OFFSET + TEXT_OFFSET), "HEAD is misaligned")
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index c27ab20a5ba9..93e5a2497f01 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -380,26 +380,68 @@ static void __init bootstrap_early_mapping(unsigned long addr,
 	}
 }
 
+static void __init bootstrap_linear_mapping(unsigned long va_offset)
+{
+	/*
+	 * Bootstrap the linear range that covers swapper_pg_dir so that the
+	 * statically allocated page tables as well as newly allocated ones
+	 * are accessible via the linear mapping.
+	 */
+	static struct bootstrap_pgtables linear_bs_pgtables __pgdir;
+	const phys_addr_t swapper_phys = __pa(swapper_pg_dir);
+	unsigned long swapper_virt = __phys_to_virt(swapper_phys) + va_offset;
+	struct memblock_region *reg;
+
+	bootstrap_early_mapping(swapper_virt, &linear_bs_pgtables,
+				IS_ENABLED(CONFIG_ARM64_64K_PAGES));
+
+	/* now find the memblock that covers swapper_pg_dir, and clip */
+	for_each_memblock(memory, reg) {
+		phys_addr_t start = reg->base;
+		phys_addr_t end = start + reg->size;
+		unsigned long vstart, vend;
+
+		if (start > swapper_phys || end <= swapper_phys)
+			continue;
+
+#ifdef CONFIG_ARM64_64K_PAGES
+		/* clip the region to PMD size */
+		vstart = max(swapper_virt & PMD_MASK,
+			     round_up(__phys_to_virt(start + va_offset),
+				      PAGE_SIZE));
+		vend = min(round_up(swapper_virt, PMD_SIZE),
+			   round_down(__phys_to_virt(end + va_offset),
+				      PAGE_SIZE));
+#else
+		/* clip the region to PUD size */
+		vstart = max(swapper_virt & PUD_MASK,
+			     round_up(__phys_to_virt(start + va_offset),
+				      PMD_SIZE));
+		vend = min(round_up(swapper_virt, PUD_SIZE),
+			   round_down(__phys_to_virt(end + va_offset),
+				      PMD_SIZE));
+#endif
+
+		create_mapping(__pa(vstart - va_offset), vstart, vend - vstart,
+			       PAGE_KERNEL_EXEC);
+
+		/*
+		 * Temporarily limit the memblock range. We need to do this as
+		 * create_mapping requires puds, pmds and ptes to be allocated
+		 * from memory addressable from the early linear mapping.
+		 */
+		memblock_set_current_limit(__pa(vend - va_offset));
+
+		return;
+	}
+	BUG();
+}
+
 static void __init map_mem(void)
 {
 	struct memblock_region *reg;
-	phys_addr_t limit;
 
-	/*
-	 * Temporarily limit the memblock range. We need to do this as
-	 * create_mapping requires puds, pmds and ptes to be allocated from
-	 * memory addressable from the initial direct kernel mapping.
-	 *
-	 * The initial direct kernel mapping, located at swapper_pg_dir, gives
-	 * us PUD_SIZE (4K pages) or PMD_SIZE (64K pages) memory starting from
-	 * PHYS_OFFSET (which must be aligned to 2MB as per
-	 * Documentation/arm64/booting.txt).
-	 */
-	if (IS_ENABLED(CONFIG_ARM64_64K_PAGES))
-		limit = PHYS_OFFSET + PMD_SIZE;
-	else
-		limit = PHYS_OFFSET + PUD_SIZE;
-	memblock_set_current_limit(limit);
+	bootstrap_linear_mapping(0);
 
 	/* map all the memory banks */
 	for_each_memblock(memory, reg) {
@@ -409,21 +451,6 @@ static void __init map_mem(void)
 		if (start >= end)
 			break;
 
-#ifndef CONFIG_ARM64_64K_PAGES
-		/*
-		 * For the first memory bank align the start address and
-		 * current memblock limit to prevent create_mapping() from
-		 * allocating pte page tables from unmapped memory.
-		 * When 64K pages are enabled, the pte page table for the
-		 * first PGDIR_SIZE is already present in swapper_pg_dir.
-		 */
-		if (start < limit)
-			start = ALIGN(start, PMD_SIZE);
-		if (end < limit) {
-			limit = end & PMD_MASK;
-			memblock_set_current_limit(limit);
-		}
-#endif
 		__map_memblock(start, end);
 	}
 
-- 
1.8.3.2

* [PATCH v4 10/13] arm64: move kernel mapping out of linear region
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
                   ` (8 preceding siblings ...)
  2015-04-15 15:34 ` [PATCH v4 09/13] arm64: mm: explicitly bootstrap the linear mapping Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  2015-05-08 17:16   ` Catalin Marinas
  2015-04-15 15:34 ` [PATCH v4 11/13] arm64: map linear region as non-executable Ard Biesheuvel
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

This moves the primary mapping of the kernel Image out of
the linear region. This is a preparatory step towards allowing
the kernel Image to reside anywhere in physical memory without
affecting the ability to map all of it efficiently.
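
A worked example of the biased translation introduced below (illustrative
numbers: a kernel Image loaded at physical address 0x80080000 with the
default TEXT_OFFSET of 0x80000, so memstart_addr starts out as 0x80000000):

	/*
	 * _text - PAGE_OFFSET == TEXT_OFFSET - KIMAGE_OFFSET == 0x80000 - SZ_64M
	 *
	 * before map_mem(): PHYS_OFFSET = 0x84000000, kernel_va_offset = 0
	 *	__pa(_text) = (0x80000 - SZ_64M) + 0x84000000 + 0 = 0x80080000
	 *
	 * after map_mem():  PHYS_OFFSET = 0x80000000, kernel_va_offset = SZ_64M
	 *	__pa(_text) = (0x80000 - SZ_64M) + 0x80000000 + SZ_64M = 0x80080000
	 *
	 * i.e. translations of kernel image addresses keep producing the
	 * correct physical address across the switch, while linear addresses
	 * (at or above PAGE_OFFSET) keep using plain PHYS_OFFSET.
	 */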

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/boot.h   |  7 +++++++
 arch/arm64/include/asm/memory.h | 28 ++++++++++++++++++++++++----
 arch/arm64/kernel/head.S        |  8 ++++----
 arch/arm64/kernel/vmlinux.lds.S | 11 +++++++++--
 arch/arm64/mm/mmu.c             | 11 ++++++++++-
 5 files changed, 54 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 81151b67b26b..092d1096ce9a 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -11,4 +11,11 @@
 #define MIN_FDT_ALIGN		8
 #define MAX_FDT_SIZE		SZ_2M
 
+/*
+ * arm64 requires the kernel image to be 2 MB aligned and
+ * not exceed 64 MB in size.
+ */
+#define MIN_KIMG_ALIGN		SZ_2M
+#define MAX_KIMG_SIZE		SZ_64M
+
 #endif
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index f800d45ea226..801331793bd3 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -24,6 +24,7 @@
 #include <linux/compiler.h>
 #include <linux/const.h>
 #include <linux/types.h>
+#include <asm/boot.h>
 #include <asm/sizes.h>
 
 /*
@@ -39,7 +40,12 @@
 #define PCI_IO_SIZE		SZ_16M
 
 /*
- * PAGE_OFFSET - the virtual address of the start of the kernel image (top
+ * Offset below PAGE_OFFSET where to map the kernel Image.
+ */
+#define KIMAGE_OFFSET		MAX_KIMG_SIZE
+
+/*
+ * PAGE_OFFSET - the virtual address of the base of the linear mapping (top
  *		 (VA_BITS - 1))
  * VA_BITS - the maximum number of bits for virtual addresses.
  * TASK_SIZE - the maximum size of a user space task.
@@ -49,7 +55,8 @@
  */
 #define VA_BITS			(CONFIG_ARM64_VA_BITS)
 #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
-#define MODULES_END		(PAGE_OFFSET)
+#define KIMAGE_VADDR		(PAGE_OFFSET - KIMAGE_OFFSET)
+#define MODULES_END		KIMAGE_VADDR
 #define MODULES_VADDR		(MODULES_END - SZ_64M)
 #define PCI_IO_END		(MODULES_VADDR - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
@@ -77,7 +84,11 @@
  * private definitions which should NOT be used outside memory.h
  * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
  */
-#define __virt_to_phys(x)	(((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
+#define __virt_to_phys(x) ({						\
+	long __x = (long)(x) - PAGE_OFFSET;				\
+	__x >= 0 ? (phys_addr_t)(__x + PHYS_OFFSET) : 			\
+		   (phys_addr_t)(__x + PHYS_OFFSET + kernel_va_offset); })
+
 #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
 
 /*
@@ -111,7 +122,16 @@
 
 extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
-#define PHYS_OFFSET		({ memstart_addr; })
+#define PHYS_OFFSET		({ memstart_addr + phys_offset_bias; })
+
+/*
+ * Before the linear mapping has been set up, __va() translations will
+ * not produce usable virtual addresses unless we tweak PHYS_OFFSET to
+ * compensate for the offset between the kernel mapping and the base of
+ * the linear mapping. We will undo this in map_mem().
+ */
+extern u64 phys_offset_bias;
+extern u64 kernel_va_offset;
 
 /*
  * PFNs are used to describe any physical page; this means
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index c0ff3ce4299e..3bf1d339dd8d 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -36,8 +36,6 @@
 #include <asm/page.h>
 #include <asm/virt.h>
 
-#define __PHYS_OFFSET	(KERNEL_START - TEXT_OFFSET)
-
 #if (TEXT_OFFSET & 0xfff) != 0
 #error TEXT_OFFSET must be at least 4KB aligned
 #elif (PAGE_OFFSET & 0x1fffff) != 0
@@ -58,6 +56,8 @@
 
 #define KERNEL_START	_text
 #define KERNEL_END	_end
+#define KERNEL_BASE	(KERNEL_START - TEXT_OFFSET)
+
 
 /*
  * Initial memory map attributes.
@@ -235,7 +235,7 @@ section_table:
 ENTRY(stext)
 	bl	preserve_boot_args
 	bl	el2_setup			// Drop to EL1, w20=cpu_boot_mode
-	adrp	x24, __PHYS_OFFSET
+	adrp	x24, KERNEL_BASE
 	bl	set_cpu_boot_mode_flag
 	bl	__create_page_tables		// x25=TTBR0, x26=TTBR1
 	/*
@@ -411,7 +411,7 @@ __create_page_tables:
 	 * Map the kernel image (starting with PHYS_OFFSET).
 	 */
 	mov	x0, x26				// swapper_pg_dir
-	mov	x5, #PAGE_OFFSET
+	ldr	x5, =KERNEL_BASE
 	create_pgd_entry x0, x5, x3, x6
 	ldr	x6, =KERNEL_END			// __va(KERNEL_END)
 	mov	x3, x24				// phys offset
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 338eaa7bcbfd..8dbb816c0338 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -6,6 +6,7 @@
 
 #include <asm-generic/vmlinux.lds.h>
 #include <asm/thread_info.h>
+#include <asm/boot.h>
 #include <asm/memory.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
@@ -95,7 +96,7 @@ SECTIONS
 		*(.discard.*)
 	}
 
-	. = PAGE_OFFSET + TEXT_OFFSET;
+	. = KIMAGE_VADDR + TEXT_OFFSET;
 
 	.head.text : {
 		_text = .;
@@ -203,4 +204,10 @@ ASSERT(SIZEOF(.pgdir) < ALIGNOF(.pgdir), ".pgdir size exceeds its alignment")
 /*
  * If padding is applied before .head.text, virt<->phys conversions will fail.
  */
-ASSERT(_text == (PAGE_OFFSET + TEXT_OFFSET), "HEAD is misaligned")
+ASSERT(_text == (KIMAGE_VADDR + TEXT_OFFSET), "HEAD is misaligned")
+
+/*
+ * Make sure the memory footprint of the kernel Image does not exceed the limit.
+ */
+ASSERT(_end - _text + TEXT_OFFSET <= MAX_KIMG_SIZE,
+	"Kernel Image memory footprint exceeds MAX_KIMG_SIZE")
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 93e5a2497f01..b457b7e425cc 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -50,6 +50,9 @@ u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
 struct page *empty_zero_page;
 EXPORT_SYMBOL(empty_zero_page);
 
+u64 phys_offset_bias __read_mostly = KIMAGE_OFFSET;
+u64 kernel_va_offset __read_mostly;
+
 pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 			      unsigned long size, pgprot_t vma_prot)
 {
@@ -386,6 +389,9 @@ static void __init bootstrap_linear_mapping(unsigned long va_offset)
 	 * Bootstrap the linear range that covers swapper_pg_dir so that the
 	 * statically allocated page tables as well as newly allocated ones
 	 * are accessible via the linear mapping.
+	 * Since at this point, PHYS_OFFSET is still biased to redirect __va()
+	 * translations into the kernel text mapping, we need to apply an
+	 * explicit va_offset to calculate virtual linear addresses.
 	 */
 	static struct bootstrap_pgtables linear_bs_pgtables __pgdir;
 	const phys_addr_t swapper_phys = __pa(swapper_pg_dir);
@@ -441,7 +447,10 @@ static void __init map_mem(void)
 {
 	struct memblock_region *reg;
 
-	bootstrap_linear_mapping(0);
+	bootstrap_linear_mapping(KIMAGE_OFFSET);
+
+	kernel_va_offset = KIMAGE_OFFSET;
+	phys_offset_bias = 0;
 
 	/* map all the memory banks */
 	for_each_memblock(memory, reg) {
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 11/13] arm64: map linear region as non-executable
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
                   ` (9 preceding siblings ...)
  2015-04-15 15:34 ` [PATCH v4 10/13] arm64: move kernel mapping out of linear region Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 12/13] arm64: allow kernel Image to be loaded anywhere in physical memory Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 13/13] arm64/efi: adapt to relaxed kernel Image placement requirements Ard Biesheuvel
  12 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

Now that we moved the kernel text out of the linear region, there
is no longer a reason to map it as executable. This also allows
us to completely get rid of the __map_memblock() variant that only maps
some of it executable if CONFIG_DEBUG_RODATA is selected.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/mm/mmu.c | 41 ++---------------------------------------
 1 file changed, 2 insertions(+), 39 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index b457b7e425cc..c07ba8bdd8ed 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -303,47 +303,10 @@ static void create_mapping_late(phys_addr_t phys, unsigned long virt,
 				phys, virt, size, prot, late_alloc);
 }
 
-#ifdef CONFIG_DEBUG_RODATA
 static void __init __map_memblock(phys_addr_t start, phys_addr_t end)
 {
-	/*
-	 * Set up the executable regions using the existing section mappings
-	 * for now. This will get more fine grained later once all memory
-	 * is mapped
-	 */
-	unsigned long kernel_x_start = round_down(__pa(_stext), SECTION_SIZE);
-	unsigned long kernel_x_end = round_up(__pa(__init_end), SECTION_SIZE);
-
-	if (end < kernel_x_start) {
-		create_mapping(start, __phys_to_virt(start),
-			end - start, PAGE_KERNEL);
-	} else if (start >= kernel_x_end) {
-		create_mapping(start, __phys_to_virt(start),
-			end - start, PAGE_KERNEL);
-	} else {
-		if (start < kernel_x_start)
-			create_mapping(start, __phys_to_virt(start),
-				kernel_x_start - start,
-				PAGE_KERNEL);
-		create_mapping(kernel_x_start,
-				__phys_to_virt(kernel_x_start),
-				kernel_x_end - kernel_x_start,
-				PAGE_KERNEL_EXEC);
-		if (kernel_x_end < end)
-			create_mapping(kernel_x_end,
-				__phys_to_virt(kernel_x_end),
-				end - kernel_x_end,
-				PAGE_KERNEL);
-	}
-
-}
-#else
-static void __init __map_memblock(phys_addr_t start, phys_addr_t end)
-{
-	create_mapping(start, __phys_to_virt(start), end - start,
-			PAGE_KERNEL_EXEC);
+	create_mapping(start, __phys_to_virt(start), end - start, PAGE_KERNEL);
 }
-#endif
 
 struct bootstrap_pgtables {
 	pte_t	pte[PTRS_PER_PTE];
@@ -429,7 +392,7 @@ static void __init bootstrap_linear_mapping(unsigned long va_offset)
 #endif
 
 		create_mapping(__pa(vstart - va_offset), vstart, vend - vstart,
-			       PAGE_KERNEL_EXEC);
+			       PAGE_KERNEL);
 
 		/*
 		 * Temporarily limit the memblock range. We need to do this as
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 12/13] arm64: allow kernel Image to be loaded anywhere in physical memory
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
                   ` (10 preceding siblings ...)
  2015-04-15 15:34 ` [PATCH v4 11/13] arm64: map linear region as non-executable Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  2015-04-15 15:34 ` [PATCH v4 13/13] arm64/efi: adapt to relaxed kernel Image placement requirements Ard Biesheuvel
  12 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

This relaxes the kernel Image placement requirements, so that it
may be placed at any 2 MB aligned offset in physical memory.

This is accomplished by ignoring PHYS_OFFSET when installing
memblocks, and accounting for the apparent virtual offset of
the kernel Image (in addition to the 64 MB that it is moved
below PAGE_OFFSET). As a result, virtual address references
below PAGE_OFFSET are correctly mapped onto physical references
into the kernel Image regardless of where it sits in memory.
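
As a worked example of the arithmetic in map_mem() below (hypothetical
addresses, 4 KB pages, TEXT_OFFSET assumed to be 0x80000):

/*
 * memblock_start_of_DRAM()	= 0x8060_0000
 *   -> new_memstart_addr	= 0x8000_0000	(rounded down to PUD_MASK)
 * Image loaded at		  0xa008_0000
 *   -> memstart_addr so far	= 0xa000_0000	(Image base - TEXT_OFFSET)
 *
 * kernel_va_offset = 0xa000_0000 - 0x8000_0000 + KIMAGE_OFFSET
 *                  = 0x2000_0000 + SZ_64M
 *                  = 0x2400_0000
 *
 * so __pa(_text) = _text - PAGE_OFFSET + PHYS_OFFSET + kernel_va_offset
 *                = 0xa008_0000, i.e. exactly where the Image was loaded,
 * even though that is nowhere near the start of RAM.
 */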

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt | 20 ++++++++++----------
 arch/arm64/mm/Makefile          |  1 +
 arch/arm64/mm/init.c            | 38 +++++++++++++++++++++++++++++++++++---
 arch/arm64/mm/mmu.c             | 24 ++++++++++++++++++++++--
 4 files changed, 68 insertions(+), 15 deletions(-)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 53f18e13d51c..7bd9feedb6f9 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -113,16 +113,16 @@ Header notes:
   depending on selected features, and is effectively unbound.
 
 The Image must be placed text_offset bytes from a 2MB aligned base
-address near the start of usable system RAM and called there. Memory
-below that base address is currently unusable by Linux, and therefore it
-is strongly recommended that this location is the start of system RAM.
-At least image_size bytes from the start of the image must be free for
-use by the kernel.
-
-Any memory described to the kernel (even that below the 2MB aligned base
-address) which is not marked as reserved from the kernel e.g. with a
-memreserve region in the device tree) will be considered as available to
-the kernel.
+address anywhere in usable system RAM and called there. At least
+image_size bytes from the start of the image must be free for use
+by the kernel.
+NOTE: versions prior to v4.2 cannot make use of memory below the
+physical offset of the Image so it is recommended that the Image be
+placed as close as possible to the start of system RAM.
+
+Any memory described to the kernel which is not marked as reserved from
+the kernel (e.g., with a memreserve region in the device tree) will be
+considered as available to the kernel.
 
 Before jumping into the kernel, the following conditions must be met:
 
diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
index 9d84feb41a16..49e90bab4d57 100644
--- a/arch/arm64/mm/Makefile
+++ b/arch/arm64/mm/Makefile
@@ -6,3 +6,4 @@ obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
 obj-$(CONFIG_ARM64_PTDUMP)	+= dump.o
 
 CFLAGS_mmu.o			:= -I$(srctree)/scripts/dtc/libfdt/
+CFLAGS_init.o			:= -DTEXT_OFFSET=$(TEXT_OFFSET)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 0e7d9a2aad39..98a009885229 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -157,6 +157,38 @@ static int __init early_mem(char *p)
 }
 early_param("mem", early_mem);
 
+static void enforce_memory_limit(void)
+{
+	const phys_addr_t kstart = __pa(_text) - TEXT_OFFSET;
+	const phys_addr_t kend = round_up(__pa(_end), SZ_2M);
+	const u64 ksize = kend - kstart;
+	struct memblock_region *reg;
+
+	if (likely(memory_limit == (phys_addr_t)ULLONG_MAX))
+		return;
+
+	if (WARN(memory_limit < ksize, "mem= limit is unreasonably low"))
+		return;
+
+	/*
+	 * We have to make sure that the kernel image is still covered by
+	 * memblock after we apply the memory limit, even if the kernel image
+	 * is high up in physical memory. So if the kernel image becomes
+	 * inaccessible after the limit is applied, we will lower the limit
+	 * so that it compensates for the kernel image and reapply it. That way,
+	 * we can add back the kernel image region and still honor the limit.
+	 */
+	memblock_enforce_memory_limit(memory_limit);
+
+	for_each_memblock(memory, reg)
+		if (reg->base <= kstart && reg->base + reg->size >= kend)
+			/* kernel image still accessible -> we're done */
+			return;
+
+	memblock_enforce_memory_limit(memory_limit - ksize);
+	memblock_add(kstart, ksize);
+}
+
 void __init arm64_memblock_init(void)
 {
 	/*
@@ -165,10 +197,10 @@ void __init arm64_memblock_init(void)
 	 */
 	const s64 linear_region_size = -(s64)PAGE_OFFSET;
 
-	memblock_remove(0, memstart_addr);
-	memblock_remove(memstart_addr + linear_region_size, ULLONG_MAX);
+	memblock_remove(round_down(memblock_start_of_DRAM(), SZ_1G) +
+			linear_region_size, ULLONG_MAX);
 
-	memblock_enforce_memory_limit(memory_limit);
+	enforce_memory_limit();
 
 	/*
 	 * Register the kernel text, kernel data, initrd, and initial
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index c07ba8bdd8ed..1487824c5896 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -409,10 +409,30 @@ static void __init bootstrap_linear_mapping(unsigned long va_offset)
 static void __init map_mem(void)
 {
 	struct memblock_region *reg;
+	u64 new_memstart_addr = memblock_start_of_DRAM();
+	u64 new_va_offset;
 
-	bootstrap_linear_mapping(KIMAGE_OFFSET);
+	/*
+	 * Select a suitable value for the base of physical memory.
+	 * This should be below the lowest usable physical memory
+	 * address, and aligned to PUD/PMD size so that we can map
+	 * it efficiently.
+	 */
+	if (IS_ENABLED(CONFIG_ARM64_64K_PAGES))
+		new_memstart_addr &= PMD_MASK;
+	else
+		new_memstart_addr &= PUD_MASK;
+
+	/*
+	 * Calculate the offset between the kernel text mapping that exists
+	 * outside of the linear mapping, and its mapping in the linear region.
+	 */
+	new_va_offset = memstart_addr - new_memstart_addr + phys_offset_bias;
+
+	bootstrap_linear_mapping(new_va_offset);
 
-	kernel_va_offset = KIMAGE_OFFSET;
+	memstart_addr = new_memstart_addr;
+	kernel_va_offset = new_va_offset;
 	phys_offset_bias = 0;
 
 	/* map all the memory banks */
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 13/13] arm64/efi: adapt to relaxed kernel Image placement requirements
  2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
                   ` (11 preceding siblings ...)
  2015-04-15 15:34 ` [PATCH v4 12/13] arm64: allow kernel Image to be loaded anywhere in physical memory Ard Biesheuvel
@ 2015-04-15 15:34 ` Ard Biesheuvel
  12 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-04-15 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

This adapts the EFI stub kernel placement to the new relaxed
requirements, by placing the kernel Image at the highest available
2 MB offset in physical memory.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi-stub.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/efi-stub.c b/arch/arm64/kernel/efi-stub.c
index f5374065ad53..60ae3324e26e 100644
--- a/arch/arm64/kernel/efi-stub.c
+++ b/arch/arm64/kernel/efi-stub.c
@@ -10,6 +10,7 @@
  *
  */
 #include <linux/efi.h>
+#include <asm/boot.h>
 #include <asm/efi.h>
 #include <asm/sections.h>
 
@@ -28,8 +29,8 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table,
 	kernel_size = _edata - _text;
 	if (*image_addr != (dram_base + TEXT_OFFSET)) {
 		kernel_memsize = kernel_size + (_end - _edata);
-		status = efi_low_alloc(sys_table, kernel_memsize + TEXT_OFFSET,
-				       SZ_2M, reserve_addr);
+		status = efi_high_alloc(sys_table, kernel_memsize + TEXT_OFFSET,
+				       MIN_KIMG_ALIGN, reserve_addr, ULONG_MAX);
 		if (status != EFI_SUCCESS) {
 			pr_efi_err(sys_table, "Failed to relocate kernel\n");
 			return status;
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 04/13] arm64: use fixmap region for permanent FDT mapping
  2015-04-15 15:34 ` [PATCH v4 04/13] arm64: use fixmap region for permanent FDT mapping Ard Biesheuvel
@ 2015-04-17 15:13   ` Mark Rutland
  0 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2015-04-17 15:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Apr 15, 2015 at 04:34:15PM +0100, Ard Biesheuvel wrote:
> Currently, the FDT blob needs to be in the same 512 MB region as
> the kernel, so that it can be mapped into the kernel virtual memory
> space very early on using a minimal set of statically allocated
> translation tables.
>
> Now that we have early fixmap support, we can relax this restriction,
> by moving the permanent FDT mapping to the fixmap region instead.
> This way, the FDT blob may be anywhere in memory.
>
> This also moves the vetting of the FDT to mmu.c, since the early
> init code in head.S does not handle mapping of the FDT anymore.
> At the same time, fix up some comments in head.S that have gone stale.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  Documentation/arm64/booting.txt | 13 +++++----
>  arch/arm64/include/asm/boot.h   | 14 +++++++++
>  arch/arm64/include/asm/fixmap.h | 15 ++++++++++
>  arch/arm64/include/asm/mmu.h    |  1 +
>  arch/arm64/kernel/head.S        | 39 +------------------------
>  arch/arm64/kernel/setup.c       | 32 +++++++-------------
>  arch/arm64/mm/Makefile          |  2 ++
>  arch/arm64/mm/init.c            |  1 -
>  arch/arm64/mm/mmu.c             | 65 +++++++++++++++++++++++++++++++++++++++++
>  9 files changed, 117 insertions(+), 65 deletions(-)
>  create mode 100644 arch/arm64/include/asm/boot.h
>
> diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
> index f3c05b5f9f08..53f18e13d51c 100644
> --- a/Documentation/arm64/booting.txt
> +++ b/Documentation/arm64/booting.txt
> @@ -45,11 +45,14 @@ sees fit.)
>
>  Requirement: MANDATORY
>
> -The device tree blob (dtb) must be placed on an 8-byte boundary within
> -the first 512 megabytes from the start of the kernel image and must not
> -cross a 2-megabyte boundary. This is to allow the kernel to map the
> -blob using a single section mapping in the initial page tables.
> -
> +The device tree blob (dtb) must be placed on an 8-byte boundary and must
> +not exceed 2 megabytes in size. Since the dtb will be mapped cacheable using
> +blocks of up to 2 megabytes in size, it should not be placed within 2 megabytes
> +of memreserves or other special carveouts that may be mapped with non-matching
> +attributes.

Nit: memreserves are always permitted to be mapped cacheable (following
the ePAPR definition and the de-facto Linux implementation on everything
other than PPC), so those should be fine.

How about:

The device tree blob (dtb) must be placed on an 8-byte boundary and must
not exceed 2 megabytes in size. Since the dtb will be mapped cacheable
using blocks of up to 2 megabytes in size, it must not be placed within
any 2M region which must be mapped with any specific attributes.
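
For illustration, the two mechanically checkable constraints in that
wording correspond to the MIN_FDT_ALIGN/MAX_FDT_SIZE constants this
patch adds to <asm/boot.h>; a trivial sketch (not code from the series):

/*
 * Sketch only: checks the 8-byte alignment and 2 MB size limit; the
 * guidance about carveouts with special attributes cannot be checked
 * this mechanically.
 */
static bool fdt_placement_ok(phys_addr_t dtb_phys, unsigned long dtb_size)
{
	return IS_ALIGNED(dtb_phys, MIN_FDT_ALIGN) && dtb_size <= MAX_FDT_SIZE;
}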

As an aside, we perhaps need a more formal definition of /memreserve/
semantics.

The code itself looks good to me. I'll try to give that a go with some
padded DTBs at some point next week.

Mark.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 09/13] arm64: mm: explicitly bootstrap the linear mapping
  2015-04-15 15:34 ` [PATCH v4 09/13] arm64: mm: explicitly bootstrap the linear mapping Ard Biesheuvel
@ 2015-05-07 16:54   ` Catalin Marinas
  2015-05-07 19:21     ` Ard Biesheuvel
  0 siblings, 1 reply; 24+ messages in thread
From: Catalin Marinas @ 2015-05-07 16:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Apr 15, 2015 at 05:34:20PM +0200, Ard Biesheuvel wrote:
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index ceec4def354b..338eaa7bcbfd 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -68,6 +68,17 @@ PECOFF_FILE_ALIGNMENT = 0x200;
>  #define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(min);
>  #endif
>  
> +/*
> + * The pgdir region needs to be mappable using a single PMD or PUD sized region,
> + * so it should not cross a 512 MB or 1 GB alignment boundary, respectively
> + * (depending on page size). So align to an upper bound of its size.
> + */
> +#if CONFIG_ARM64_PGTABLE_LEVELS == 2
> +#define PGDIR_ALIGN	(8 * PAGE_SIZE)
> +#else
> +#define PGDIR_ALIGN	(16 * PAGE_SIZE)
> +#endif

Isn't 8 pages sufficient in both cases? Unless some other patch changes
the idmap and swapper, I can count maximum 7 pages in total.

> +
>  SECTIONS
>  {
>  	/*
> @@ -160,7 +171,7 @@ SECTIONS
>  
>  	BSS_SECTION(0, 0, 0)
>  
> -	.pgdir (NOLOAD) : ALIGN(PAGE_SIZE) {
> +	.pgdir (NOLOAD) : ALIGN(PGDIR_ALIGN) {
>  		idmap_pg_dir = .;
>  		. += IDMAP_DIR_SIZE;
>  		swapper_pg_dir = .;
> @@ -185,6 +196,11 @@ ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
>  	"ID map text too big or misaligned")
>  
>  /*
> + * Check that the chosen PGDIR_ALIGN value is sufficient.
> + */
> +ASSERT(SIZEOF(.pgdir) < ALIGNOF(.pgdir), ".pgdir size exceeds its alignment")
> +
> +/*
>   * If padding is applied before .head.text, virt<->phys conversions will fail.
>   */
>  ASSERT(_text == (PAGE_OFFSET + TEXT_OFFSET), "HEAD is misaligned")
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index c27ab20a5ba9..93e5a2497f01 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -380,26 +380,68 @@ static void __init bootstrap_early_mapping(unsigned long addr,
>  	}
>  }
>  
> +static void __init bootstrap_linear_mapping(unsigned long va_offset)
> +{
> +	/*
> +	 * Bootstrap the linear range that covers swapper_pg_dir so that the
> +	 * statically allocated page tables as well as newly allocated ones
> +	 * are accessible via the linear mapping.
> +	 */

Just move the comment outside the function.

> +	static struct bootstrap_pgtables linear_bs_pgtables __pgdir;
> +	const phys_addr_t swapper_phys = __pa(swapper_pg_dir);
> +	unsigned long swapper_virt = __phys_to_virt(swapper_phys) + va_offset;
> +	struct memblock_region *reg;
> +
> +	bootstrap_early_mapping(swapper_virt, &linear_bs_pgtables,
> +				IS_ENABLED(CONFIG_ARM64_64K_PAGES));
> +
> +	/* now find the memblock that covers swapper_pg_dir, and clip */
> +	for_each_memblock(memory, reg) {
> +		phys_addr_t start = reg->base;
> +		phys_addr_t end = start + reg->size;
> +		unsigned long vstart, vend;
> +
> +		if (start > swapper_phys || end <= swapper_phys)
> +			continue;
> +
> +#ifdef CONFIG_ARM64_64K_PAGES
> +		/* clip the region to PMD size */
> +		vstart = max(swapper_virt & PMD_MASK,
> +			     round_up(__phys_to_virt(start + va_offset),
> +				      PAGE_SIZE));
> +		vend = min(round_up(swapper_virt, PMD_SIZE),
> +			   round_down(__phys_to_virt(end + va_offset),
> +				      PAGE_SIZE));
> +#else
> +		/* clip the region to PUD size */
> +		vstart = max(swapper_virt & PUD_MASK,
> +			     round_up(__phys_to_virt(start + va_offset),
> +				      PMD_SIZE));
> +		vend = min(round_up(swapper_virt, PUD_SIZE),
> +			   round_down(__phys_to_virt(end + va_offset),
> +				      PMD_SIZE));
> +#endif
> +
> +		create_mapping(__pa(vstart - va_offset), vstart, vend - vstart,
> +			       PAGE_KERNEL_EXEC);
> +
> +		/*
> +		 * Temporarily limit the memblock range. We need to do this as
> +		 * create_mapping requires puds, pmds and ptes to be allocated
> +		 * from memory addressable from the early linear mapping.
> +		 */
> +		memblock_set_current_limit(__pa(vend - va_offset));
> +
> +		return;
> +	}
> +	BUG();
> +}

I'll probably revisit this function after I see the whole series. But in
the meantime, if the kernel is not loaded in the first memblock (in
address order), isn't there a risk that we allocate memory from the
first memblock which is not mapped yet?

-- 
Catalin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 09/13] arm64: mm: explicitly bootstrap the linear mapping
  2015-05-07 16:54   ` Catalin Marinas
@ 2015-05-07 19:21     ` Ard Biesheuvel
  2015-05-08 14:44       ` Catalin Marinas
  0 siblings, 1 reply; 24+ messages in thread
From: Ard Biesheuvel @ 2015-05-07 19:21 UTC (permalink / raw)
  To: linux-arm-kernel

On 7 May 2015 at 18:54, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Wed, Apr 15, 2015 at 05:34:20PM +0200, Ard Biesheuvel wrote:
>> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>> index ceec4def354b..338eaa7bcbfd 100644
>> --- a/arch/arm64/kernel/vmlinux.lds.S
>> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> @@ -68,6 +68,17 @@ PECOFF_FILE_ALIGNMENT = 0x200;
>>  #define ALIGN_DEBUG_RO_MIN(min)              . = ALIGN(min);
>>  #endif
>>
>> +/*
>> + * The pgdir region needs to be mappable using a single PMD or PUD sized region,
>> + * so it should not cross a 512 MB or 1 GB alignment boundary, respectively
>> + * (depending on page size). So align to an upper bound of its size.
>> + */
>> +#if CONFIG_ARM64_PGTABLE_LEVELS == 2
>> +#define PGDIR_ALIGN  (8 * PAGE_SIZE)
>> +#else
>> +#define PGDIR_ALIGN  (16 * PAGE_SIZE)
>> +#endif
>
> Isn't 8 pages sufficient in both cases? Unless some other patch changes
> the idmap and swapper, I can count maximum 7 pages in total.
>

The preceding patch moves the fixmap page tables to this region as well.
But the logic is still incorrect -> we only need 16x for 4 levels (7 +
3 == 10), the remaining ones are all <= 8
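
Spelled out, the corrected logic would look something like this (a
sketch; the page counts are the ones quoted in this thread, not
re-derived from the tree):

/* 7 pages for idmap + swapper, plus 3 for the fixmap tables */
#if CONFIG_ARM64_PGTABLE_LEVELS == 4
#define PGDIR_ALIGN	(16 * PAGE_SIZE)	/* 10 pages -> next power of two */
#else
#define PGDIR_ALIGN	(8 * PAGE_SIZE)		/* all other configs fit in 8 */
#endif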

>> +
>>  SECTIONS
>>  {
>>       /*
>> @@ -160,7 +171,7 @@ SECTIONS
>>
>>       BSS_SECTION(0, 0, 0)
>>
>> -     .pgdir (NOLOAD) : ALIGN(PAGE_SIZE) {
>> +     .pgdir (NOLOAD) : ALIGN(PGDIR_ALIGN) {
>>               idmap_pg_dir = .;
>>               . += IDMAP_DIR_SIZE;
>>               swapper_pg_dir = .;
>> @@ -185,6 +196,11 @@ ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
>>       "ID map text too big or misaligned")
>>
>>  /*
>> + * Check that the chosen PGDIR_ALIGN value is sufficient.
>> + */
>> +ASSERT(SIZEOF(.pgdir) < ALIGNOF(.pgdir), ".pgdir size exceeds its alignment")
>> +
>> +/*
>>   * If padding is applied before .head.text, virt<->phys conversions will fail.
>>   */
>>  ASSERT(_text == (PAGE_OFFSET + TEXT_OFFSET), "HEAD is misaligned")
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index c27ab20a5ba9..93e5a2497f01 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -380,26 +380,68 @@ static void __init bootstrap_early_mapping(unsigned long addr,
>>       }
>>  }
>>
>> +static void __init bootstrap_linear_mapping(unsigned long va_offset)
>> +{
>> +     /*
>> +      * Bootstrap the linear range that covers swapper_pg_dir so that the
>> +      * statically allocated page tables as well as newly allocated ones
>> +      * are accessible via the linear mapping.
>> +      */
>
> Just move the comment outside the function.
>

OK

>> +     static struct bootstrap_pgtables linear_bs_pgtables __pgdir;
>> +     const phys_addr_t swapper_phys = __pa(swapper_pg_dir);
>> +     unsigned long swapper_virt = __phys_to_virt(swapper_phys) + va_offset;
>> +     struct memblock_region *reg;
>> +
>> +     bootstrap_early_mapping(swapper_virt, &linear_bs_pgtables,
>> +                             IS_ENABLED(CONFIG_ARM64_64K_PAGES));
>> +
>> +     /* now find the memblock that covers swapper_pg_dir, and clip */
>> +     for_each_memblock(memory, reg) {
>> +             phys_addr_t start = reg->base;
>> +             phys_addr_t end = start + reg->size;
>> +             unsigned long vstart, vend;
>> +
>> +             if (start > swapper_phys || end <= swapper_phys)
>> +                     continue;
>> +
>> +#ifdef CONFIG_ARM64_64K_PAGES
>> +             /* clip the region to PMD size */
>> +             vstart = max(swapper_virt & PMD_MASK,
>> +                          round_up(__phys_to_virt(start + va_offset),
>> +                                   PAGE_SIZE));
>> +             vend = min(round_up(swapper_virt, PMD_SIZE),
>> +                        round_down(__phys_to_virt(end + va_offset),
>> +                                   PAGE_SIZE));
>> +#else
>> +             /* clip the region to PUD size */
>> +             vstart = max(swapper_virt & PUD_MASK,
>> +                          round_up(__phys_to_virt(start + va_offset),
>> +                                   PMD_SIZE));
>> +             vend = min(round_up(swapper_virt, PUD_SIZE),
>> +                        round_down(__phys_to_virt(end + va_offset),
>> +                                   PMD_SIZE));
>> +#endif
>> +
>> +             create_mapping(__pa(vstart - va_offset), vstart, vend - vstart,
>> +                            PAGE_KERNEL_EXEC);
>> +
>> +             /*
>> +              * Temporarily limit the memblock range. We need to do this as
>> +              * create_mapping requires puds, pmds and ptes to be allocated
>> +              * from memory addressable from the early linear mapping.
>> +              */
>> +             memblock_set_current_limit(__pa(vend - va_offset));
>> +
>> +             return;
>> +     }
>> +     BUG();
>> +}
>
> I'll probably revisit this function after I see the whole series. But in
> the meantime, if the kernel is not loaded in the first memblock (in
> address order), isn't there a risk that we allocate memory from the
> first memblock which is not mapped yet?
>

memblock allocates top down, so it should only allocate from this
region, unless the remaining room is completely reserved. I think that
is a theoretical problem which exists currently as well, i.e., the
boot protocol does not mandate that the 512MB/1GB region containing
the kernel contains unreserved room.

-- 
Ard.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 09/13] arm64: mm: explicitly bootstrap the linear mapping
  2015-05-07 19:21     ` Ard Biesheuvel
@ 2015-05-08 14:44       ` Catalin Marinas
  2015-05-08 15:03         ` Ard Biesheuvel
  0 siblings, 1 reply; 24+ messages in thread
From: Catalin Marinas @ 2015-05-08 14:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, May 07, 2015 at 09:21:28PM +0200, Ard Biesheuvel wrote:
> On 7 May 2015 at 18:54, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Wed, Apr 15, 2015 at 05:34:20PM +0200, Ard Biesheuvel wrote:
> >> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> >> index ceec4def354b..338eaa7bcbfd 100644
> >> --- a/arch/arm64/kernel/vmlinux.lds.S
> >> +++ b/arch/arm64/kernel/vmlinux.lds.S
> >> @@ -68,6 +68,17 @@ PECOFF_FILE_ALIGNMENT = 0x200;
> >>  #define ALIGN_DEBUG_RO_MIN(min)              . = ALIGN(min);
> >>  #endif
> >>
> >> +/*
> >> + * The pgdir region needs to be mappable using a single PMD or PUD sized region,
> >> + * so it should not cross a 512 MB or 1 GB alignment boundary, respectively
> >> + * (depending on page size). So align to an upper bound of its size.
> >> + */
> >> +#if CONFIG_ARM64_PGTABLE_LEVELS == 2
> >> +#define PGDIR_ALIGN  (8 * PAGE_SIZE)
> >> +#else
> >> +#define PGDIR_ALIGN  (16 * PAGE_SIZE)
> >> +#endif
> >
> > Isn't 8 pages sufficient in both cases? Unless some other patch changes
> > the idmap and swapper, I can count maximum 7 pages in total.
> 
> The preceding patch moves the fixmap page tables to this region as well.
> But the logic is still incorrect -> we only need 16x for 4 levels (7 +
> 3 == 10), the remaining ones are all <= 8

You should improve the comment here to include the maths, "upper bound
of its size" is not very clear ;).

> >> +     static struct bootstrap_pgtables linear_bs_pgtables __pgdir;
> >> +     const phys_addr_t swapper_phys = __pa(swapper_pg_dir);
> >> +     unsigned long swapper_virt = __phys_to_virt(swapper_phys) + va_offset;
> >> +     struct memblock_region *reg;
> >> +
> >> +     bootstrap_early_mapping(swapper_virt, &linear_bs_pgtables,
> >> +                             IS_ENABLED(CONFIG_ARM64_64K_PAGES));
> >> +
> >> +     /* now find the memblock that covers swapper_pg_dir, and clip */
> >> +     for_each_memblock(memory, reg) {
> >> +             phys_addr_t start = reg->base;
> >> +             phys_addr_t end = start + reg->size;
> >> +             unsigned long vstart, vend;
> >> +
> >> +             if (start > swapper_phys || end <= swapper_phys)
> >> +                     continue;
> >> +
> >> +#ifdef CONFIG_ARM64_64K_PAGES
> >> +             /* clip the region to PMD size */
> >> +             vstart = max(swapper_virt & PMD_MASK,
> >> +                          round_up(__phys_to_virt(start + va_offset),
> >> +                                   PAGE_SIZE));
> >> +             vend = min(round_up(swapper_virt, PMD_SIZE),
> >> +                        round_down(__phys_to_virt(end + va_offset),
> >> +                                   PAGE_SIZE));
> >> +#else
> >> +             /* clip the region to PUD size */
> >> +             vstart = max(swapper_virt & PUD_MASK,
> >> +                          round_up(__phys_to_virt(start + va_offset),
> >> +                                   PMD_SIZE));
> >> +             vend = min(round_up(swapper_virt, PUD_SIZE),
> >> +                        round_down(__phys_to_virt(end + va_offset),
> >> +                                   PMD_SIZE));
> >> +#endif
> >> +
> >> +             create_mapping(__pa(vstart - va_offset), vstart, vend - vstart,
> >> +                            PAGE_KERNEL_EXEC);
> >> +
> >> +             /*
> >> +              * Temporarily limit the memblock range. We need to do this as
> >> +              * create_mapping requires puds, pmds and ptes to be allocated
> >> +              * from memory addressable from the early linear mapping.
> >> +              */
> >> +             memblock_set_current_limit(__pa(vend - va_offset));
> >> +
> >> +             return;
> >> +     }
> >> +     BUG();
> >> +}
> >
> > I'll probably revisit this function after I see the whole series. But in
> > the meantime, if the kernel is not loaded in the first memblock (in
> > address order), isn't there a risk that we allocate memory from the
> > first memblock which is not mapped yet?
> 
> memblock allocates top down, so it should only allocate from this
> region, unless the remaining room is completely reserved.

I don't like to rely on this, it's not guaranteed behaviour.

> I think that is a theoretical problem which exists currently as well,
> i.e., the boot protocol does not mandate that the 512MB/1GB region
> containing the kernel contains unreserved room.

That's more of a documentation problem, we can make the requirements
clearer. Debugging is probably easier as well, it fails to allocate
memory. But for the other case, not placing the kernel in the first
memblock has high chances of allocating unmapped memory.

Can we not have another set of level 2,3(,4) page tables pre-allocated
in swapper for the first block (start of RAM)? It gets hairy, in total
we would need:

1) idmap
2) swapper
  2.a) kernel image outside the linear mapping
  2.b) fixmap
  2.c) start-of-ram
  2.d) swapper mapping in the linear mapping

Can we avoid accessing 2.d (swapper in linear mapping) until we finished
mapping 2.c? Once we mapped the start of RAM and set the memblock limit,
we can allocate pages to start mapping the rest.

-- 
Catalin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 09/13] arm64: mm: explicitly bootstrap the linear mapping
  2015-05-08 14:44       ` Catalin Marinas
@ 2015-05-08 15:03         ` Ard Biesheuvel
  2015-05-08 16:43           ` Catalin Marinas
  0 siblings, 1 reply; 24+ messages in thread
From: Ard Biesheuvel @ 2015-05-08 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

On 8 May 2015 at 16:44, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Thu, May 07, 2015 at 09:21:28PM +0200, Ard Biesheuvel wrote:
>> On 7 May 2015 at 18:54, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> > On Wed, Apr 15, 2015 at 05:34:20PM +0200, Ard Biesheuvel wrote:
>> >> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>> >> index ceec4def354b..338eaa7bcbfd 100644
>> >> --- a/arch/arm64/kernel/vmlinux.lds.S
>> >> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> >> @@ -68,6 +68,17 @@ PECOFF_FILE_ALIGNMENT = 0x200;
>> >>  #define ALIGN_DEBUG_RO_MIN(min)              . = ALIGN(min);
>> >>  #endif
>> >>
>> >> +/*
>> >> + * The pgdir region needs to be mappable using a single PMD or PUD sized region,
>> >> + * so it should not cross a 512 MB or 1 GB alignment boundary, respectively
>> >> + * (depending on page size). So align to an upper bound of its size.
>> >> + */
>> >> +#if CONFIG_ARM64_PGTABLE_LEVELS == 2
>> >> +#define PGDIR_ALIGN  (8 * PAGE_SIZE)
>> >> +#else
>> >> +#define PGDIR_ALIGN  (16 * PAGE_SIZE)
>> >> +#endif
>> >
>> > Isn't 8 pages sufficient in both cases? Unless some other patch changes
>> > the idmap and swapper, I can count maximum 7 pages in total.
>>
>> The preceding patch moves the fixmap page tables to this region as well.
>> But the logic is still incorrect -> we only need 16x for 4 levels (7 +
>> 3 == 10), the remaining ones are all <= 8
>
> You should improve the comment here to include the maths, "upper bound
> of its size" is not very clear ;).
>

Yes, you are right, it should read 'power-of-2 upper bound'

>> >> +     static struct bootstrap_pgtables linear_bs_pgtables __pgdir;
>> >> +     const phys_addr_t swapper_phys = __pa(swapper_pg_dir);
>> >> +     unsigned long swapper_virt = __phys_to_virt(swapper_phys) + va_offset;
>> >> +     struct memblock_region *reg;
>> >> +
>> >> +     bootstrap_early_mapping(swapper_virt, &linear_bs_pgtables,
>> >> +                             IS_ENABLED(CONFIG_ARM64_64K_PAGES));
>> >> +
>> >> +     /* now find the memblock that covers swapper_pg_dir, and clip */
>> >> +     for_each_memblock(memory, reg) {
>> >> +             phys_addr_t start = reg->base;
>> >> +             phys_addr_t end = start + reg->size;
>> >> +             unsigned long vstart, vend;
>> >> +
>> >> +             if (start > swapper_phys || end <= swapper_phys)
>> >> +                     continue;
>> >> +
>> >> +#ifdef CONFIG_ARM64_64K_PAGES
>> >> +             /* clip the region to PMD size */
>> >> +             vstart = max(swapper_virt & PMD_MASK,
>> >> +                          round_up(__phys_to_virt(start + va_offset),
>> >> +                                   PAGE_SIZE));
>> >> +             vend = min(round_up(swapper_virt, PMD_SIZE),
>> >> +                        round_down(__phys_to_virt(end + va_offset),
>> >> +                                   PAGE_SIZE));
>> >> +#else
>> >> +             /* clip the region to PUD size */
>> >> +             vstart = max(swapper_virt & PUD_MASK,
>> >> +                          round_up(__phys_to_virt(start + va_offset),
>> >> +                                   PMD_SIZE));
>> >> +             vend = min(round_up(swapper_virt, PUD_SIZE),
>> >> +                        round_down(__phys_to_virt(end + va_offset),
>> >> +                                   PMD_SIZE));
>> >> +#endif
>> >> +
>> >> +             create_mapping(__pa(vstart - va_offset), vstart, vend - vstart,
>> >> +                            PAGE_KERNEL_EXEC);
>> >> +
>> >> +             /*
>> >> +              * Temporarily limit the memblock range. We need to do this as
>> >> +              * create_mapping requires puds, pmds and ptes to be allocated
>> >> +              * from memory addressable from the early linear mapping.
>> >> +              */
>> >> +             memblock_set_current_limit(__pa(vend - va_offset));
>> >> +
>> >> +             return;
>> >> +     }
>> >> +     BUG();
>> >> +}
>> >
>> > I'll probably revisit this function after I see the whole series. But in
>> > the meantime, if the kernel is not loaded in the first memblock (in
>> > address order), isn't there a risk that we allocate memory from the
>> > first memblock which is not mapped yet?
>>
>> memblock allocates top down, so it should only allocate from this
>> region, unless the remaining room is completely reserved.
>
> I don't like to rely on this, it's not guaranteed behaviour.
>

Actually, it is. Allocation is always top-down unless you call
memblock_set_bottom_up(), which is a NOP  unless CONFIG_MOVABLE_NODE
is selected.
That is why the memblock limit only limits at the top afaict

>> I think that is a theoretical problem which exists currently as well,
>> i.e., the boot protocol does not mandate that the 512MB/1GB region
>> containing the kernel contains unreserved room.
>
> That's more of a documentation problem, we can make the requirements
> clearer. Debugging is probably easier as well, it fails to allocate
> memory. But for the other case, not placing the kernel in the first
> memblock has high chances of allocating unmapped memory.
>

The only way we could allocate unmapped memory is if the 512 MB /1 GB
sized/aligned intersection with the memblock covering the kernel is
completely reserved, either by the kernel or by other reservations.
Since UEFI allocates from the top as well, a kernel that is loaded
high may end up with little room between the start of the kernel and
the beginning of the intersection. Still quite unlikely imo, since it
would mean that UEFI is using hundreds of megabytes of memory, and it
isn't quite /that/ bad [yet :-)]

> Can we not have another set of level 2,3(,4) page tables pre-allocated
> in swapper for the first block (start of RAM)? It gets hairy, in total
> we would need:
>
> 1) idmap
> 2) swapper
>   2.a) kernel image outside the linear mapping
>   2.b) fixmap
>   2.c) start-of-ram
>   2.d) swapper mapping in the linear mapping
>
> Can we avoid accessing 2.d (swapper in linear mapping) until we finished
> mapping 2.c? Once we mapped the start of RAM and set the memblock limit,
> we can allocate pages to start mapping the rest.
>

I really don't think any of this is necessary tbh

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 09/13] arm64: mm: explicitly bootstrap the linear mapping
  2015-05-08 15:03         ` Ard Biesheuvel
@ 2015-05-08 16:43           ` Catalin Marinas
  2015-05-08 16:59             ` Ard Biesheuvel
  0 siblings, 1 reply; 24+ messages in thread
From: Catalin Marinas @ 2015-05-08 16:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 08, 2015 at 05:03:37PM +0200, Ard Biesheuvel wrote:
> On 8 May 2015 at 16:44, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Thu, May 07, 2015 at 09:21:28PM +0200, Ard Biesheuvel wrote:
> >> On 7 May 2015 at 18:54, Catalin Marinas <catalin.marinas@arm.com> wrote:
> >> > On Wed, Apr 15, 2015 at 05:34:20PM +0200, Ard Biesheuvel wrote:
> >> >> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> >> >> index ceec4def354b..338eaa7bcbfd 100644
> >> >> --- a/arch/arm64/kernel/vmlinux.lds.S
> >> >> +++ b/arch/arm64/kernel/vmlinux.lds.S
> >> >> @@ -68,6 +68,17 @@ PECOFF_FILE_ALIGNMENT = 0x200;
> >> >>  #define ALIGN_DEBUG_RO_MIN(min)              . = ALIGN(min);
> >> >>  #endif
> >> >>
> >> >> +/*
> >> >> + * The pgdir region needs to be mappable using a single PMD or PUD sized region,
> >> >> + * so it should not cross a 512 MB or 1 GB alignment boundary, respectively
> >> >> + * (depending on page size). So align to an upper bound of its size.
> >> >> + */
> >> >> +#if CONFIG_ARM64_PGTABLE_LEVELS == 2
> >> >> +#define PGDIR_ALIGN  (8 * PAGE_SIZE)
> >> >> +#else
> >> >> +#define PGDIR_ALIGN  (16 * PAGE_SIZE)
> >> >> +#endif
> >> >
> >> > Isn't 8 pages sufficient in both cases? Unless some other patch changes
> >> > the idmap and swapper, I can count maximum 7 pages in total.
> >>
> >> The preceding patch moves the fixmap page tables to this region as well.
> >> But the logic is still incorrect -> we only need 16x for 4 levels (7 +
> >> 3 == 10), the remaining ones are all <= 8
> >
> > You should improve the comment here to include the maths, "upper bound
> > of its size" is not very clear ;).
> 
> Yes, you are right, it should read 'power-of-2 upper bound'

And the number of pages required for the initial page tables (I figured
out it's a power of two already ;)).

> >> >> +     static struct bootstrap_pgtables linear_bs_pgtables __pgdir;
> >> >> +     const phys_addr_t swapper_phys = __pa(swapper_pg_dir);
> >> >> +     unsigned long swapper_virt = __phys_to_virt(swapper_phys) + va_offset;
> >> >> +     struct memblock_region *reg;
> >> >> +
> >> >> +     bootstrap_early_mapping(swapper_virt, &linear_bs_pgtables,
> >> >> +                             IS_ENABLED(CONFIG_ARM64_64K_PAGES));
> >> >> +
> >> >> +     /* now find the memblock that covers swapper_pg_dir, and clip */
> >> >> +     for_each_memblock(memory, reg) {
> >> >> +             phys_addr_t start = reg->base;
> >> >> +             phys_addr_t end = start + reg->size;
> >> >> +             unsigned long vstart, vend;
> >> >> +
> >> >> +             if (start > swapper_phys || end <= swapper_phys)
> >> >> +                     continue;
> >> >> +
> >> >> +#ifdef CONFIG_ARM64_64K_PAGES
> >> >> +             /* clip the region to PMD size */
> >> >> +             vstart = max(swapper_virt & PMD_MASK,
> >> >> +                          round_up(__phys_to_virt(start + va_offset),
> >> >> +                                   PAGE_SIZE));
> >> >> +             vend = min(round_up(swapper_virt, PMD_SIZE),
> >> >> +                        round_down(__phys_to_virt(end + va_offset),
> >> >> +                                   PAGE_SIZE));
> >> >> +#else
> >> >> +             /* clip the region to PUD size */
> >> >> +             vstart = max(swapper_virt & PUD_MASK,
> >> >> +                          round_up(__phys_to_virt(start + va_offset),
> >> >> +                                   PMD_SIZE));
> >> >> +             vend = min(round_up(swapper_virt, PUD_SIZE),
> >> >> +                        round_down(__phys_to_virt(end + va_offset),
> >> >> +                                   PMD_SIZE));
> >> >> +#endif
> >> >> +
> >> >> +             create_mapping(__pa(vstart - va_offset), vstart, vend - vstart,
> >> >> +                            PAGE_KERNEL_EXEC);
> >> >> +
> >> >> +             /*
> >> >> +              * Temporarily limit the memblock range. We need to do this as
> >> >> +              * create_mapping requires puds, pmds and ptes to be allocated
> >> >> +              * from memory addressable from the early linear mapping.
> >> >> +              */
> >> >> +             memblock_set_current_limit(__pa(vend - va_offset));
> >> >> +
> >> >> +             return;
> >> >> +     }
> >> >> +     BUG();
> >> >> +}
> >> >
> >> > I'll probably revisit this function after I see the whole series. But in
> >> > the meantime, if the kernel is not loaded in the first memblock (in
> >> > address order), isn't there a risk that we allocate memory from the
> >> > first memblock which is not mapped yet?
> >>
> >> memblock allocates top down, so it should only allocate from this
> >> region, unless the remaining room is completely reserved.
> >
> > I don't like to rely on this, it's not guaranteed behaviour.
> 
> Actually, it is. Allocation is always top-down unless you call
> memblock_set_bottom_up(), which is a NOP  unless CONFIG_MOVABLE_NODE
> is selected.
> That is why the memblock limit only limits at the top afaict

It currently works like this but I'm not sure it is guaranteed to always
behave this way (e.g. someone "improves" the memblock allocator in the
future). And you never know, we may need memory hotplug on arm64 at some
point in the future (together with CONFIG_MOVABLE_NODE).

-- 
Catalin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 09/13] arm64: mm: explicitly bootstrap the linear mapping
  2015-05-08 16:43           ` Catalin Marinas
@ 2015-05-08 16:59             ` Ard Biesheuvel
  0 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-05-08 16:59 UTC (permalink / raw)
  To: linux-arm-kernel

On 8 May 2015 at 18:43, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Fri, May 08, 2015 at 05:03:37PM +0200, Ard Biesheuvel wrote:
>> On 8 May 2015 at 16:44, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> > On Thu, May 07, 2015 at 09:21:28PM +0200, Ard Biesheuvel wrote:
>> >> On 7 May 2015 at 18:54, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> >> > On Wed, Apr 15, 2015 at 05:34:20PM +0200, Ard Biesheuvel wrote:
>> >> >> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>> >> >> index ceec4def354b..338eaa7bcbfd 100644
>> >> >> --- a/arch/arm64/kernel/vmlinux.lds.S
>> >> >> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> >> >> @@ -68,6 +68,17 @@ PECOFF_FILE_ALIGNMENT = 0x200;
>> >> >>  #define ALIGN_DEBUG_RO_MIN(min)              . = ALIGN(min);
>> >> >>  #endif
>> >> >>
>> >> >> +/*
>> >> >> + * The pgdir region needs to be mappable using a single PMD or PUD sized region,
>> >> >> + * so it should not cross a 512 MB or 1 GB alignment boundary, respectively
>> >> >> + * (depending on page size). So align to an upper bound of its size.
>> >> >> + */
>> >> >> +#if CONFIG_ARM64_PGTABLE_LEVELS == 2
>> >> >> +#define PGDIR_ALIGN  (8 * PAGE_SIZE)
>> >> >> +#else
>> >> >> +#define PGDIR_ALIGN  (16 * PAGE_SIZE)
>> >> >> +#endif
>> >> >
>> >> > Isn't 8 pages sufficient in both cases? Unless some other patch changes
>> >> > the idmap and swapper, I can count maximum 7 pages in total.
>> >>
>> >> The preceding patch moves the fixmap page tables to this region as well.
>> >> But the logic is still incorrect -> we only need 16x for 4 levels (7 +
>> >> 3 == 10), the remaining ones are all <= 8
>> >
>> > You should improve the comment here to include the maths, "upper bound
>> > of its size" is not very clear ;).
>>
>> Yes, you are right, it should read 'power-of-2 upper bound'
>
> And the number of pages required for the initial page tables (I figured
> out it's a power of two already ;)).
>

Ok

>> >> >> +     static struct bootstrap_pgtables linear_bs_pgtables __pgdir;
>> >> >> +     const phys_addr_t swapper_phys = __pa(swapper_pg_dir);
>> >> >> +     unsigned long swapper_virt = __phys_to_virt(swapper_phys) + va_offset;
>> >> >> +     struct memblock_region *reg;
>> >> >> +
>> >> >> +     bootstrap_early_mapping(swapper_virt, &linear_bs_pgtables,
>> >> >> +                             IS_ENABLED(CONFIG_ARM64_64K_PAGES));
>> >> >> +
>> >> >> +     /* now find the memblock that covers swapper_pg_dir, and clip */
>> >> >> +     for_each_memblock(memory, reg) {
>> >> >> +             phys_addr_t start = reg->base;
>> >> >> +             phys_addr_t end = start + reg->size;
>> >> >> +             unsigned long vstart, vend;
>> >> >> +
>> >> >> +             if (start > swapper_phys || end <= swapper_phys)
>> >> >> +                     continue;
>> >> >> +
>> >> >> +#ifdef CONFIG_ARM64_64K_PAGES
>> >> >> +             /* clip the region to PMD size */
>> >> >> +             vstart = max(swapper_virt & PMD_MASK,
>> >> >> +                          round_up(__phys_to_virt(start + va_offset),
>> >> >> +                                   PAGE_SIZE));
>> >> >> +             vend = min(round_up(swapper_virt, PMD_SIZE),
>> >> >> +                        round_down(__phys_to_virt(end + va_offset),
>> >> >> +                                   PAGE_SIZE));
>> >> >> +#else
>> >> >> +             /* clip the region to PUD size */
>> >> >> +             vstart = max(swapper_virt & PUD_MASK,
>> >> >> +                          round_up(__phys_to_virt(start + va_offset),
>> >> >> +                                   PMD_SIZE));
>> >> >> +             vend = min(round_up(swapper_virt, PUD_SIZE),
>> >> >> +                        round_down(__phys_to_virt(end + va_offset),
>> >> >> +                                   PMD_SIZE));
>> >> >> +#endif
>> >> >> +
>> >> >> +             create_mapping(__pa(vstart - va_offset), vstart, vend - vstart,
>> >> >> +                            PAGE_KERNEL_EXEC);
>> >> >> +
>> >> >> +             /*
>> >> >> +              * Temporarily limit the memblock range. We need to do this as
>> >> >> +              * create_mapping requires puds, pmds and ptes to be allocated
>> >> >> +              * from memory addressable from the early linear mapping.
>> >> >> +              */
>> >> >> +             memblock_set_current_limit(__pa(vend - va_offset));
>> >> >> +
>> >> >> +             return;
>> >> >> +     }
>> >> >> +     BUG();
>> >> >> +}
>> >> >
>> >> > I'll probably revisit this function after I see the whole series. But in
>> >> > the meantime, if the kernel is not loaded in the first memblock (in
>> >> > address order), isn't there a risk that we allocate memory from the
>> >> > first memblock which is not mapped yet?
>> >>
>> >> memblock allocates top down, so it should only allocate from this
>> >> region, unless the remaining room is completely reserved.
>> >
>> > I don't like to rely on this, it's not guaranteed behaviour.
>>
>> Actually, it is. Allocation is always top-down unless you call
>> memblock_set_bottom_up(), which is a NOP  unless CONFIG_MOVABLE_NODE
>> is selected.
>> That is why the memblock limit only limits at the top afaict
>
> It currently works like this but I'm not sure it is guaranteed to always
> behave this way (e.g. someone "improves" the memblock allocator in the
> future). And you never know, we may need memory hotplug on arm64 at some
> point in the future (together with CONFIG_MOVABLE_NODE).
>

I am not sure it is worth the additional hassle now to take into
consideration what someone may or may not implement at some point in
the future. I am sure memory hotplug is going to take more effort than
just setting the Kconfig option, and I would not be surprised if it
imposed additional restrictions on the placement of the kernel.

So the question is really if allowing the kernel to be placed at
arbitrary offsets in physical memory is worth the hassle in the first
place. There is also a rather nasty interaction with the mem= command
line option (and that does not work 100% correctly in this version of
the series). There is a policy decision to be made there, i.e., if you
remove memory to uphold mem=, where do you remove it? Removing from
the low end may waste precious <4 GB memory but removing from the top
may be impossible if the kernel image is loaded there.

Perhaps we should consider adding an early early memblock allocator
that allocates statically from the __pgdir region. The only problem is
finding a reasonable upper bound for the amount of memory you would
need ...
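
A very rough sketch of that idea (hypothetical names, and the pool size
is a made-up upper bound):

#define EARLY_PGTABLE_POOL_PAGES	16	/* made-up bound */

static u8 early_pgtable_pool[EARLY_PGTABLE_POOL_PAGES * PAGE_SIZE] __pgdir;
static unsigned int early_pgtable_used;

/* hand out page-sized chunks until the real memblock allocator is usable */
static phys_addr_t early_pgtable_alloc(void)
{
	if (early_pgtable_used == EARLY_PGTABLE_POOL_PAGES)
		return 0;	/* pool exhausted: fall back to memblock */
	return __pa(early_pgtable_pool + PAGE_SIZE * early_pgtable_used++);
}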

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 10/13] arm64: move kernel mapping out of linear region
  2015-04-15 15:34 ` [PATCH v4 10/13] arm64: move kernel mapping out of linear region Ard Biesheuvel
@ 2015-05-08 17:16   ` Catalin Marinas
  2015-05-08 17:26     ` Ard Biesheuvel
  0 siblings, 1 reply; 24+ messages in thread
From: Catalin Marinas @ 2015-05-08 17:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Apr 15, 2015 at 05:34:21PM +0200, Ard Biesheuvel wrote:
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index f800d45ea226..801331793bd3 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -24,6 +24,7 @@
>  #include <linux/compiler.h>
>  #include <linux/const.h>
>  #include <linux/types.h>
> +#include <asm/boot.h>
>  #include <asm/sizes.h>
>  
>  /*
> @@ -39,7 +40,12 @@
>  #define PCI_IO_SIZE		SZ_16M
>  
>  /*
> - * PAGE_OFFSET - the virtual address of the start of the kernel image (top
> + * Offset below PAGE_OFFSET where to map the kernel Image.
> + */
> +#define KIMAGE_OFFSET		MAX_KIMG_SIZE
> +
> +/*
> + * PAGE_OFFSET - the virtual address of the base of the linear mapping (top
>   *		 (VA_BITS - 1))
>   * VA_BITS - the maximum number of bits for virtual addresses.
>   * TASK_SIZE - the maximum size of a user space task.
> @@ -49,7 +55,8 @@
>   */
>  #define VA_BITS			(CONFIG_ARM64_VA_BITS)
>  #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
> -#define MODULES_END		(PAGE_OFFSET)
> +#define KIMAGE_VADDR		(PAGE_OFFSET - KIMAGE_OFFSET)
> +#define MODULES_END		KIMAGE_VADDR
>  #define MODULES_VADDR		(MODULES_END - SZ_64M)
>  #define PCI_IO_END		(MODULES_VADDR - SZ_2M)
>  #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
> @@ -77,7 +84,11 @@
>   * private definitions which should NOT be used outside memory.h
>   * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
>   */
> -#define __virt_to_phys(x)	(((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
> +#define __virt_to_phys(x) ({						\
> +	long __x = (long)(x) - PAGE_OFFSET;				\
> +	__x >= 0 ? (phys_addr_t)(__x + PHYS_OFFSET) : 			\
> +		   (phys_addr_t)(__x + PHYS_OFFSET + kernel_va_offset); })

Just wondering, when do we need a __pa on kernel addresses? But it looks
to me like the second case is always (__x + PHYS_OFFSET + KIMAGE_OFFSET).
Before map_mem(), we have phys_offset_bias set but kernel_va_offset 0.
After map_mem(), we reset the former and set the latter. Maybe we can
get rid of kernel_va_offset entirely (see more below about
phys_offset_bias).

> +
>  #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
>  
>  /*
> @@ -111,7 +122,16 @@
>  
>  extern phys_addr_t		memstart_addr;
>  /* PHYS_OFFSET - the physical address of the start of memory. */
> -#define PHYS_OFFSET		({ memstart_addr; })
> +#define PHYS_OFFSET		({ memstart_addr + phys_offset_bias; })
> +
> +/*
> + * Before the linear mapping has been set up, __va() translations will
> + * not produce usable virtual addresses unless we tweak PHYS_OFFSET to
> + * compensate for the offset between the kernel mapping and the base of
> + * the linear mapping. We will undo this in map_mem().
> + */
> +extern u64 phys_offset_bias;
> +extern u64 kernel_va_offset;

Can we not add the bias to memstart_addr during boot and reset it later
in map_mem()? Otherwise the run-time kernel ends up having to do a dummy
addition any time it needs PHYS_OFFSET.
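
Roughly what I have in mind, as a standalone sketch rather than the actual
code (the helper names and values are invented here):

#include <stdio.h>

static unsigned long long memstart_addr;

#define PHYS_OFFSET	(memstart_addr)		/* no per-use addition needed */

static void early_boot(unsigned long long real_memstart,
		       unsigned long long bias)
{
	/* redirect __va()/__pa() into the kernel mapping during early boot */
	memstart_addr = real_memstart + bias;
}

static void map_mem(unsigned long long bias)
{
	/* linear mapping is up: drop the bias again */
	memstart_addr -= bias;
}

int main(void)
{
	unsigned long long bias = 0x4000000;	/* hypothetical KIMAGE_OFFSET */

	early_boot(0x80000000ULL, bias);
	printf("early PHYS_OFFSET:   %#llx\n", PHYS_OFFSET);	/* 0x84000000 */
	map_mem(bias);
	printf("runtime PHYS_OFFSET: %#llx\n", PHYS_OFFSET);	/* 0x80000000 */
	return 0;
}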

-- 
Catalin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 10/13] arm64: move kernel mapping out of linear region
  2015-05-08 17:16   ` Catalin Marinas
@ 2015-05-08 17:26     ` Ard Biesheuvel
  2015-05-08 17:27       ` Ard Biesheuvel
  0 siblings, 1 reply; 24+ messages in thread
From: Ard Biesheuvel @ 2015-05-08 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

On 8 May 2015 at 19:16, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Wed, Apr 15, 2015 at 05:34:21PM +0200, Ard Biesheuvel wrote:
>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
>> index f800d45ea226..801331793bd3 100644
>> --- a/arch/arm64/include/asm/memory.h
>> +++ b/arch/arm64/include/asm/memory.h
>> @@ -24,6 +24,7 @@
>>  #include <linux/compiler.h>
>>  #include <linux/const.h>
>>  #include <linux/types.h>
>> +#include <asm/boot.h>
>>  #include <asm/sizes.h>
>>
>>  /*
>> @@ -39,7 +40,12 @@
>>  #define PCI_IO_SIZE          SZ_16M
>>
>>  /*
>> - * PAGE_OFFSET - the virtual address of the start of the kernel image (top
>> + * Offset below PAGE_OFFSET where to map the kernel Image.
>> + */
>> +#define KIMAGE_OFFSET                MAX_KIMG_SIZE
>> +
>> +/*
>> + * PAGE_OFFSET - the virtual address of the base of the linear mapping (top
>>   *            (VA_BITS - 1))
>>   * VA_BITS - the maximum number of bits for virtual addresses.
>>   * TASK_SIZE - the maximum size of a user space task.
>> @@ -49,7 +55,8 @@
>>   */
>>  #define VA_BITS                      (CONFIG_ARM64_VA_BITS)
>>  #define PAGE_OFFSET          (UL(0xffffffffffffffff) << (VA_BITS - 1))
>> -#define MODULES_END          (PAGE_OFFSET)
>> +#define KIMAGE_VADDR         (PAGE_OFFSET - KIMAGE_OFFSET)
>> +#define MODULES_END          KIMAGE_VADDR
>>  #define MODULES_VADDR                (MODULES_END - SZ_64M)
>>  #define PCI_IO_END           (MODULES_VADDR - SZ_2M)
>>  #define PCI_IO_START         (PCI_IO_END - PCI_IO_SIZE)
>> @@ -77,7 +84,11 @@
>>   * private definitions which should NOT be used outside memory.h
>>   * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
>>   */
>> -#define __virt_to_phys(x)    (((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
>> +#define __virt_to_phys(x) ({                                         \
>> +     long __x = (long)(x) - PAGE_OFFSET;                             \
>> +     __x >= 0 ? (phys_addr_t)(__x + PHYS_OFFSET) :                   \
>> +                (phys_addr_t)(__x + PHYS_OFFSET + kernel_va_offset); })
>
> Just wondering, when do we need a __pa on kernel addresses? But it looks
> to me like the second case is always (__x + PHYS_OFFSET + KIMAGE_OFFSET).

For now, yes. But when the kernel Image moves up in physical memory,
and/or the kernel virtual image moves down in virtual memory (for
kaslr), this offset could increase.
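
As a back-of-the-envelope example of my reading of the quoted macro
(made-up numbers, and the relation below is my own rearrangement, not
code from the series):

#include <stdio.h>

int main(void)
{
	unsigned long long page_offset  = 0xffffffc000000000ULL;
	unsigned long long kimage_vaddr = page_offset - 0x4000000ULL;
	unsigned long long phys_offset  = 0x80000000ULL;
	unsigned long long kimage_phys  = 0x80000000ULL;	/* image at base of RAM */

	/* offset needed so the kernel-VA branch hits the image in memory */
	printf("offset = %#llx\n",
	       (page_offset - kimage_vaddr) + (kimage_phys - phys_offset));

	kimage_phys = 0x90000000ULL;	/* image loaded 256 MB higher */
	printf("offset = %#llx\n",
	       (page_offset - kimage_vaddr) + (kimage_phys - phys_offset));
	return 0;
}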

> Before map_mem(), we have phys_offset_bias set but kernel_va_offset 0.
> After map_mem(), we reset the former and set the latter. Maybe we can
> get rid of kernel_va_offset entirely (see more below about
> phys_offset_bias).
>


>> +
>>  #define __phys_to_virt(x)    ((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
>>
>>  /*
>> @@ -111,7 +122,16 @@
>>
>>  extern phys_addr_t           memstart_addr;
>>  /* PHYS_OFFSET - the physical address of the start of memory. */
>> -#define PHYS_OFFSET          ({ memstart_addr; })
>> +#define PHYS_OFFSET          ({ memstart_addr + phys_offset_bias; })
>> +
>> +/*
>> + * Before the linear mapping has been set up, __va() translations will
>> + * not produce usable virtual addresses unless we tweak PHYS_OFFSET to
>> + * compensate for the offset between the kernel mapping and the base of
>> + * the linear mapping. We will undo this in map_mem().
>> + */
>> +extern u64 phys_offset_bias;
>> +extern u64 kernel_va_offset;
>
> Can we not add the bias to memstart_addr during boot and reset it later
> in map_mem()? Otherwise the run-time kernel ends up having to do a dummy
> addition any time it needs PHYS_OFFSET.
>

Yes, that is how I started out. At some point during development, that
became a bit cumbersome, because for instance, when you remove the
memory that is inaccessible, you want memstart_addr to contain a
meaningful value and not have to undo the bias. But looking at this
version of the series, I think there are no references left to
memstart_addr.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 10/13] arm64: move kernel mapping out of linear region
  2015-05-08 17:26     ` Ard Biesheuvel
@ 2015-05-08 17:27       ` Ard Biesheuvel
  0 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2015-05-08 17:27 UTC (permalink / raw)
  To: linux-arm-kernel

On 8 May 2015 at 19:26, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On 8 May 2015 at 19:16, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> On Wed, Apr 15, 2015 at 05:34:21PM +0200, Ard Biesheuvel wrote:
>>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
>>> index f800d45ea226..801331793bd3 100644
>>> --- a/arch/arm64/include/asm/memory.h
>>> +++ b/arch/arm64/include/asm/memory.h
>>> @@ -24,6 +24,7 @@
>>>  #include <linux/compiler.h>
>>>  #include <linux/const.h>
>>>  #include <linux/types.h>
>>> +#include <asm/boot.h>
>>>  #include <asm/sizes.h>
>>>
>>>  /*
>>> @@ -39,7 +40,12 @@
>>>  #define PCI_IO_SIZE          SZ_16M
>>>
>>>  /*
>>> - * PAGE_OFFSET - the virtual address of the start of the kernel image (top
>>> + * Offset below PAGE_OFFSET where to map the kernel Image.
>>> + */
>>> +#define KIMAGE_OFFSET                MAX_KIMG_SIZE
>>> +
>>> +/*
>>> + * PAGE_OFFSET - the virtual address of the base of the linear mapping (top
>>>   *            (VA_BITS - 1))
>>>   * VA_BITS - the maximum number of bits for virtual addresses.
>>>   * TASK_SIZE - the maximum size of a user space task.
>>> @@ -49,7 +55,8 @@
>>>   */
>>>  #define VA_BITS                      (CONFIG_ARM64_VA_BITS)
>>>  #define PAGE_OFFSET          (UL(0xffffffffffffffff) << (VA_BITS - 1))
>>> -#define MODULES_END          (PAGE_OFFSET)
>>> +#define KIMAGE_VADDR         (PAGE_OFFSET - KIMAGE_OFFSET)
>>> +#define MODULES_END          KIMAGE_VADDR
>>>  #define MODULES_VADDR                (MODULES_END - SZ_64M)
>>>  #define PCI_IO_END           (MODULES_VADDR - SZ_2M)
>>>  #define PCI_IO_START         (PCI_IO_END - PCI_IO_SIZE)
>>> @@ -77,7 +84,11 @@
>>>   * private definitions which should NOT be used outside memory.h
>>>   * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
>>>   */
>>> -#define __virt_to_phys(x)    (((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
>>> +#define __virt_to_phys(x) ({                                         \
>>> +     long __x = (long)(x) - PAGE_OFFSET;                             \
>>> +     __x >= 0 ? (phys_addr_t)(__x + PHYS_OFFSET) :                   \
>>> +                (phys_addr_t)(__x + PHYS_OFFSET + kernel_va_offset); })
>>
>> Just wondering, when do we need a __pa on kernel addresses? But it looks
>> to me like the second case is always (__x + PHYS_OFFSET + KIMAGE_OFFSET).
>
> For now, yes. But when the kernel Image moves up in physical memory,
> and/or the kernel virtual image moves down in virtual memory (for
> kaslr), this offset could increase.
>
>> Before map_mem(), we have phys_offset_bias set but kernel_va_offset 0.
>> After map_mem(), we reset the former and set the latter. Maybe we can
>> get rid of kernel_va_offset entirely (see more below about
>> phys_offset_bias).
>>
>
>
>>> +
>>>  #define __phys_to_virt(x)    ((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
>>>
>>>  /*
>>> @@ -111,7 +122,16 @@
>>>
>>>  extern phys_addr_t           memstart_addr;
>>>  /* PHYS_OFFSET - the physical address of the start of memory. */
>>> -#define PHYS_OFFSET          ({ memstart_addr; })
>>> +#define PHYS_OFFSET          ({ memstart_addr + phys_offset_bias; })
>>> +
>>> +/*
>>> + * Before the linear mapping has been set up, __va() translations will
>>> + * not produce usable virtual addresses unless we tweak PHYS_OFFSET to
>>> + * compensate for the offset between the kernel mapping and the base of
>>> + * the linear mapping. We will undo this in map_mem().
>>> + */
>>> +extern u64 phys_offset_bias;
>>> +extern u64 kernel_va_offset;
>>
>> Can we not add the bias to memstart_addr during boot and reset it later
>> in map_mem()? Otherwise the run-time kernel ends up having to do a dummy
>> addition any time it needs PHYS_OFFSET.
>>
>
> Yes, that is how I started out. At some point during development, that
> became a bit cumbersome, because for instance, when you remove the
> memory that is inaccessible, you want memstart_addr to contain a
> meaningful value and not have to undo the bias. But looking at this
> version of the series, I think there are no references left to
> memstart_addr.


Patch #12 in this series removes the problematic references to
memstart_addr, so I could squash the bias in that patch.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2015-05-08 17:27 UTC | newest]

Thread overview: 24+ messages
2015-04-15 15:34 [PATCH v4 00/13] arm64: update/clarify/relax Image and FDT placement rules Ard Biesheuvel
2015-04-15 15:34 ` [PATCH v4 01/13] arm64: reduce ID map to a single page Ard Biesheuvel
2015-04-15 15:34 ` [PATCH v4 02/13] arm64: drop sleep_idmap_phys and clean up cpu_resume() Ard Biesheuvel
2015-04-15 15:34 ` [PATCH v4 03/13] of/fdt: split off FDT self reservation from memreserve processing Ard Biesheuvel
2015-04-15 15:34 ` [PATCH v4 04/13] arm64: use fixmap region for permanent FDT mapping Ard Biesheuvel
2015-04-17 15:13   ` Mark Rutland
2015-04-15 15:34 ` [PATCH v4 05/13] arm64/efi: adapt to relaxed FDT placement requirements Ard Biesheuvel
2015-04-15 15:34 ` [PATCH v4 06/13] arm64: implement our own early_init_dt_add_memory_arch() Ard Biesheuvel
2015-04-15 15:34 ` [PATCH v4 07/13] arm64: use more granular reservations for static page table allocations Ard Biesheuvel
2015-04-15 15:34 ` [PATCH v4 08/13] arm64: split off early mapping code from early_fixmap_init() Ard Biesheuvel
2015-04-15 15:34 ` [PATCH v4 09/13] arm64: mm: explicitly bootstrap the linear mapping Ard Biesheuvel
2015-05-07 16:54   ` Catalin Marinas
2015-05-07 19:21     ` Ard Biesheuvel
2015-05-08 14:44       ` Catalin Marinas
2015-05-08 15:03         ` Ard Biesheuvel
2015-05-08 16:43           ` Catalin Marinas
2015-05-08 16:59             ` Ard Biesheuvel
2015-04-15 15:34 ` [PATCH v4 10/13] arm64: move kernel mapping out of linear region Ard Biesheuvel
2015-05-08 17:16   ` Catalin Marinas
2015-05-08 17:26     ` Ard Biesheuvel
2015-05-08 17:27       ` Ard Biesheuvel
2015-04-15 15:34 ` [PATCH v4 11/13] arm64: map linear region as non-executable Ard Biesheuvel
2015-04-15 15:34 ` [PATCH v4 12/13] arm64: allow kernel Image to be loaded anywhere in physical memory Ard Biesheuvel
2015-04-15 15:34 ` [PATCH v4 13/13] arm64/efi: adapt to relaxed kernel Image placement requirements Ard Biesheuvel
