* [PATCH v2 00/10] arm64: stable UEFI mappings for kexec
@ 2014-11-06 14:13 ` Ard Biesheuvel
  0 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: leif.lindholm, roy.franz, linux-arm-kernel, mark.rutland, msalter,
	dyoung, linux-efi, matt.fleming, will.deacon, catalin.marinas,
	grant.likely
  Cc: Ard Biesheuvel

This is v2 of the series to update the UEFI memory map handling for the arm64
architecture so that:
- virtual mappings of UEFI Runtime Services are stable across kexec
- userland mappings via /dev/mem use cache attributes that are compatible with
  the memory type of the UEFI memory map entry
- code that can be reused for 32-bit ARM is moved to a common area.

Changes since v1 are primarily the move of reusable infrastructure to
drivers/firmware/efi/virtmap.c, and the newly added handling of /dev/mem
mappings.

The main premise of these patches is that, in order to support kexec, we need
to add code to the kernel that is able to deal with the state of the firmware
after SetVirtualAddressMap() [SVAM] has been called. However, if we are going to
deal with that anyway, why not make that the default state, and have only a
single code path for both cases?

This means SVAM() needs to move to the stub, and hence the code that invents
the virtual layout needs to move with it. The result is that the kernel proper
is entered with the virt_addr members of all EFI_MEMORY_RUNTIME regions
assigned, and the mapping installed into the firmware. The kernel proper needs
to set up the page tables, and switch to them while performing the runtime
services calls. Note that there is also an efi_to_phys() to translate the values
of the fw_vendor and tables fields of the EFI system table. Again, this is
something we need to do anyway under kexec, or we end up handing over state
between one kernel and the next, which implies different code paths between
non-kexec and kexec.
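As an illustration of that translation step, a plain C sketch follows; struct efi_md and efi_to_phys_sketch are simplified stand-ins for efi_memory_desc_t and the kernel's efi_to_phys(), not the actual implementation:

```c
#include <stdint.h>
#include <stddef.h>

#define EFI_PAGE_SIZE	4096ULL
#define EFI_MEMORY_RUNTIME (1ULL << 63)

/* simplified stand-in for efi_memory_desc_t; field names follow the spec */
struct efi_md {
	uint64_t phys_addr;
	uint64_t virt_addr;
	uint64_t num_pages;
	uint64_t attribute;
};

/*
 * Translate a value that the firmware has already converted to the
 * virtual address space (e.g. systab->fw_vendor or systab->tables)
 * back to its physical address, by finding the runtime region whose
 * virtual window covers it. Returns (uint64_t)-1 if nothing matches.
 */
static uint64_t efi_to_phys_sketch(const struct efi_md *map, size_t n,
				   uint64_t virt)
{
	for (size_t i = 0; i < n; i++) {
		const struct efi_md *md = &map[i];
		uint64_t size = md->num_pages * EFI_PAGE_SIZE;

		if (!(md->attribute & EFI_MEMORY_RUNTIME))
			continue;
		if (virt >= md->virt_addr && virt < md->virt_addr + size)
			return md->phys_addr + (virt - md->virt_addr);
	}
	return (uint64_t)-1;
}
```

The same walk over the memory map works in both the kexec and non-kexec cases, which is the point of making the post-SVAM state the default.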

The layout is chosen such that it uses the userland half of the virtual address
space (TTBR0), which means no additional alignment or reservation is required
to ensure that it will always be available.
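A minimal sketch of assigning such a layout, assuming regions are simply packed upwards from an arbitrary low base in the TTBR0 half (the real stub logic may differ in alignment and ordering):

```c
#include <stdint.h>
#include <stddef.h>

#define EFI_PAGE_SIZE	4096ULL
#define EFI_MEMORY_RUNTIME (1ULL << 63)

/* simplified stand-in for efi_memory_desc_t */
struct efi_md {
	uint64_t phys_addr;
	uint64_t virt_addr;
	uint64_t num_pages;
	uint64_t attribute;
};

/*
 * Pack the EFI_MEMORY_RUNTIME regions into a contiguous virtual window
 * starting at 'base' (a userland/TTBR0 address), leaving non-runtime
 * regions untouched. Returns the first unused virtual address.
 */
static uint64_t assign_virt_layout(struct efi_md *map, size_t n,
				   uint64_t base)
{
	for (size_t i = 0; i < n; i++) {
		struct efi_md *md = &map[i];

		if (!(md->attribute & EFI_MEMORY_RUNTIME))
			continue;
		md->virt_addr = base;
		base += md->num_pages * EFI_PAGE_SIZE;
	}
	return base;
}
```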

One thing that may stand out is the reordering of the memory map. The reason
for doing this is that we can use the same memory map as input to SVAM(). The
alternative is allocating memory for it using boot services, but that clutters
up the existing logic a bit between getting the memory map, populating the fdt,
and looping again if it didn't fit.
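The reordering amounts to a stable partition that moves the EFI_MEMORY_RUNTIME descriptors to the front of the map; the following is an illustrative userspace model, not the stub code itself:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define EFI_MEMORY_RUNTIME (1ULL << 63)

/* simplified stand-in for efi_memory_desc_t */
struct efi_md {
	uint64_t phys_addr;
	uint64_t virt_addr;
	uint64_t num_pages;
	uint64_t attribute;
};

/*
 * Stable-partition the descriptors so all EFI_MEMORY_RUNTIME entries
 * come first; returns the number of runtime entries. The runtime
 * prefix can then be passed to SetVirtualAddressMap() in place,
 * without allocating a second map via boot services.
 */
static size_t runtime_entries_first(struct efi_md *map, size_t n)
{
	size_t nr_rt = 0;

	for (size_t i = 0; i < n; i++) {
		struct efi_md tmp = map[i];

		if (!(tmp.attribute & EFI_MEMORY_RUNTIME))
			continue;
		/* shift the non-runtime entries in between up one slot */
		memmove(&map[nr_rt + 1], &map[nr_rt],
			(i - nr_rt) * sizeof(tmp));
		map[nr_rt++] = tmp;
	}
	return nr_rt;
}
```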

Ard Biesheuvel (10):
  arm64/mm: add explicit struct_mm argument to __create_mapping()
  arm64/mm: add create_pgd_mapping() to create private page tables
  efi: split off remapping code from efi_config_init()
  efi: add common infrastructure for stub-installed virtual mapping
  arm64/efi: move SetVirtualAddressMap() to UEFI stub
  arm64/efi: remove free_boot_services() and friends
  arm64/efi: remove idmap manipulations from UEFI code
  arm64/efi: use UEFI memory map unconditionally if available
  arm64/efi: ignore unusable regions instead of reserving them
  arm64/efi: improve /dev/mem mmap() handling under UEFI

 arch/arm64/Kconfig                 |   1 +
 arch/arm64/include/asm/efi.h       |  18 +-
 arch/arm64/include/asm/mmu.h       |   5 +-
 arch/arm64/include/asm/pgtable.h   |   5 +
 arch/arm64/kernel/efi.c            | 351 ++++++-------------------------------
 arch/arm64/kernel/setup.c          |   2 +-
 arch/arm64/mm/mmu.c                |  78 +++++----
 drivers/firmware/efi/Kconfig       |   3 +
 drivers/firmware/efi/Makefile      |   1 +
 drivers/firmware/efi/efi.c         |  49 ++++--
 drivers/firmware/efi/libstub/fdt.c | 107 ++++++++++-
 drivers/firmware/efi/virtmap.c     | 224 +++++++++++++++++++++++
 include/linux/efi.h                |  17 +-
 13 files changed, 496 insertions(+), 365 deletions(-)
 create mode 100644 drivers/firmware/efi/virtmap.c

-- 
1.8.3.2

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v2 01/10] arm64/mm: add explicit struct_mm argument to __create_mapping()
  2014-11-06 14:13 ` Ard Biesheuvel
@ 2014-11-06 14:13     ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: leif.lindholm, roy.franz, linux-arm-kernel, mark.rutland, msalter,
	dyoung, linux-efi, matt.fleming, will.deacon, catalin.marinas,
	grant.likely
  Cc: Ard Biesheuvel

Currently, swapper_pg_dir and idmap_pg_dir share the init_mm mm_struct
instance. To allow the introduction of other pg_dir instances, e.g., for
UEFI's mapping of Runtime Services, make the mm_struct instance an
explicit argument that gets passed down to the pmd and pte instantiation
functions. Note that the consumers (pmd_populate/pgd_populate) of the
mm_struct argument don't actually inspect it, but let's fix it for
correctness' sake.
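The shape of the refactor can be modelled in userspace: an explicit context is threaded down through the helpers instead of each level reaching for the global. The names below are illustrative mocks, not kernel code:

```c
#include <stddef.h>

/* stand-in for struct mm_struct: records page tables installed for it */
struct mock_mm {
	int tables_installed;
};

static struct mock_mm init_mm_mock;

/*
 * Lowest-level populate helper now takes the mm explicitly, mirroring
 * the pud_populate(&init_mm, ...) -> pud_populate(mm, ...) change.
 */
static void populate_mock(struct mock_mm *mm)
{
	mm->tables_installed++;
}

/* intermediate levels only pass the context through */
static void alloc_init_pmd_mock(struct mock_mm *mm, int nr_tables)
{
	for (int i = 0; i < nr_tables; i++)
		populate_mock(mm);
}

/*
 * Top level: callers may now target a private mm (e.g. one set up for
 * the UEFI runtime mappings); NULL falls back to the init mm.
 */
static void create_mapping_mock(struct mock_mm *mm, int nr_tables)
{
	alloc_init_pmd_mock(mm ? mm : &init_mm_mock, nr_tables);
}
```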

Acked-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/mm/mmu.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 0bf90d26e745..83e6713143a3 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -155,9 +155,9 @@ static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 }
 
-static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
-				  unsigned long end, phys_addr_t phys,
-				  int map_io)
+static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
+				  unsigned long addr, unsigned long end,
+				  phys_addr_t phys, int map_io)
 {
 	pmd_t *pmd;
 	unsigned long next;
@@ -177,7 +177,7 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
 	 */
 	if (pud_none(*pud) || pud_bad(*pud)) {
 		pmd = early_alloc(PTRS_PER_PMD * sizeof(pmd_t));
-		pud_populate(&init_mm, pud, pmd);
+		pud_populate(mm, pud, pmd);
 	}
 
 	pmd = pmd_offset(pud, addr);
@@ -201,16 +201,16 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
 	} while (pmd++, addr = next, addr != end);
 }
 
-static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
-				  unsigned long end, unsigned long phys,
-				  int map_io)
+static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
+				  unsigned long addr, unsigned long end,
+				  unsigned long phys, int map_io)
 {
 	pud_t *pud;
 	unsigned long next;
 
 	if (pgd_none(*pgd)) {
 		pud = early_alloc(PTRS_PER_PUD * sizeof(pud_t));
-		pgd_populate(&init_mm, pgd, pud);
+		pgd_populate(mm, pgd, pud);
 	}
 	BUG_ON(pgd_bad(*pgd));
 
@@ -239,7 +239,7 @@ static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
 				flush_tlb_all();
 			}
 		} else {
-			alloc_init_pmd(pud, addr, next, phys, map_io);
+			alloc_init_pmd(mm, pud, addr, next, phys, map_io);
 		}
 		phys += next - addr;
 	} while (pud++, addr = next, addr != end);
@@ -249,9 +249,9 @@ static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
  * Create the page directory entries and any necessary page tables for the
  * mapping specified by 'md'.
  */
-static void __init __create_mapping(pgd_t *pgd, phys_addr_t phys,
-				    unsigned long virt, phys_addr_t size,
-				    int map_io)
+static void __init __create_mapping(struct mm_struct *mm, pgd_t *pgd,
+				    phys_addr_t phys, unsigned long virt,
+				    phys_addr_t size, int map_io)
 {
 	unsigned long addr, length, end, next;
 
@@ -261,7 +261,7 @@ static void __init __create_mapping(pgd_t *pgd, phys_addr_t phys,
 	end = addr + length;
 	do {
 		next = pgd_addr_end(addr, end);
-		alloc_init_pud(pgd, addr, next, phys, map_io);
+		alloc_init_pud(mm, pgd, addr, next, phys, map_io);
 		phys += next - addr;
 	} while (pgd++, addr = next, addr != end);
 }
@@ -274,7 +274,8 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
 			&phys, virt);
 		return;
 	}
-	__create_mapping(pgd_offset_k(virt & PAGE_MASK), phys, virt, size, 0);
+	__create_mapping(&init_mm, pgd_offset_k(virt & PAGE_MASK), phys, virt,
+			 size, 0);
 }
 
 void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
@@ -283,7 +284,7 @@ void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
 		pr_warn("BUG: not creating id mapping for %pa\n", &addr);
 		return;
 	}
-	__create_mapping(&idmap_pg_dir[pgd_index(addr)],
+	__create_mapping(&init_mm, &idmap_pg_dir[pgd_index(addr)],
 			 addr, addr, size, map_io);
 }
 
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 02/10] arm64/mm: add create_pgd_mapping() to create private page tables
  2014-11-06 14:13 ` Ard Biesheuvel
@ 2014-11-06 14:13     ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: leif.lindholm, roy.franz, linux-arm-kernel, mark.rutland, msalter,
	dyoung, linux-efi, matt.fleming, will.deacon, catalin.marinas,
	grant.likely
  Cc: Ard Biesheuvel

For UEFI, we need to install the memory mappings used for Runtime Services
in a dedicated set of page tables. Add create_pgd_mapping(), which allows
us to allocate and install those page table entries early.
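With the prot plumbing in this patch, section (block) attributes are derived from the page-level pgprot_t by clearing the table bit via the new mk_sect_prot() helper. A standalone model of that helper, assuming PTE_TABLE_BIT is bit 1 as on arm64:

```c
#include <stdint.h>

/* bit 1 distinguishes a table descriptor from a block descriptor */
#define PTE_TABLE_BIT_MOCK (1ULL << 1)

typedef uint64_t pgprot_val_mock_t;

/*
 * Model of mk_sect_prot(): derive a section/block descriptor's
 * attributes from a page-level pgprot by clearing the table bit,
 * leaving all other attribute bits intact.
 */
static pgprot_val_mock_t mk_sect_prot_sketch(pgprot_val_mock_t prot)
{
	return prot & ~PTE_TABLE_BIT_MOCK;
}
```

This is why alloc_init_pmd() and alloc_init_pud() can take a single pgprot_t and still emit both block and table mappings from it.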

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/mmu.h     |  3 +++
 arch/arm64/include/asm/pgtable.h |  5 +++++
 arch/arm64/mm/mmu.c              | 44 +++++++++++++++++++++-------------------
 3 files changed, 31 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index c2f006c48bdb..5fd40c43be80 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -33,5 +33,8 @@ extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
 extern void init_mem_pgprot(void);
 /* create an identity mapping for memory (or io if map_io is true) */
 extern void create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io);
+extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
+			       unsigned long virt, phys_addr_t size,
+			       pgprot_t prot);
 
 #endif
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 41a43bf26492..1abe4d08725b 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -264,6 +264,11 @@ static inline pmd_t pte_pmd(pte_t pte)
 	return __pmd(pte_val(pte));
 }
 
+static inline pgprot_t mk_sect_prot(pgprot_t prot)
+{
+	return __pgprot(pgprot_val(prot) & ~PTE_TABLE_BIT);
+}
+
 /*
  * THP definitions.
  */
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 83e6713143a3..b6dc2ce3991a 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -157,20 +157,10 @@ static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
 
 static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 				  unsigned long addr, unsigned long end,
-				  phys_addr_t phys, int map_io)
+				  phys_addr_t phys, pgprot_t prot)
 {
 	pmd_t *pmd;
 	unsigned long next;
-	pmdval_t prot_sect;
-	pgprot_t prot_pte;
-
-	if (map_io) {
-		prot_sect = PROT_SECT_DEVICE_nGnRE;
-		prot_pte = __pgprot(PROT_DEVICE_nGnRE);
-	} else {
-		prot_sect = PROT_SECT_NORMAL_EXEC;
-		prot_pte = PAGE_KERNEL_EXEC;
-	}
 
 	/*
 	 * Check for initial section mappings in the pgd/pud and remove them.
@@ -186,7 +176,8 @@ static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 		/* try section mapping first */
 		if (((addr | next | phys) & ~SECTION_MASK) == 0) {
 			pmd_t old_pmd =*pmd;
-			set_pmd(pmd, __pmd(phys | prot_sect));
+			set_pmd(pmd, __pmd(phys |
+					   pgprot_val(mk_sect_prot(prot))));
 			/*
 			 * Check for previous table entries created during
 			 * boot (__create_page_tables) and flush them.
@@ -195,7 +186,7 @@ static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 				flush_tlb_all();
 		} else {
 			alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
-				       prot_pte);
+				       prot);
 		}
 		phys += next - addr;
 	} while (pmd++, addr = next, addr != end);
@@ -203,7 +194,7 @@ static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 
 static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 				  unsigned long addr, unsigned long end,
-				  unsigned long phys, int map_io)
+				  unsigned long phys, pgprot_t prot)
 {
 	pud_t *pud;
 	unsigned long next;
@@ -221,10 +212,11 @@ static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 		/*
 		 * For 4K granule only, attempt to put down a 1GB block
 		 */
-		if (!map_io && (PAGE_SHIFT == 12) &&
+		if ((PAGE_SHIFT == 12) &&
 		    ((addr | next | phys) & ~PUD_MASK) == 0) {
 			pud_t old_pud = *pud;
-			set_pud(pud, __pud(phys | PROT_SECT_NORMAL_EXEC));
+			set_pud(pud, __pud(phys |
+					   pgprot_val(mk_sect_prot(prot))));
 
 			/*
 			 * If we have an old value for a pud, it will
@@ -239,7 +231,7 @@ static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 				flush_tlb_all();
 			}
 		} else {
-			alloc_init_pmd(mm, pud, addr, next, phys, map_io);
+			alloc_init_pmd(mm, pud, addr, next, phys, prot);
 		}
 		phys += next - addr;
 	} while (pud++, addr = next, addr != end);
@@ -251,7 +243,7 @@ static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
  */
 static void __init __create_mapping(struct mm_struct *mm, pgd_t *pgd,
 				    phys_addr_t phys, unsigned long virt,
-				    phys_addr_t size, int map_io)
+				    phys_addr_t size, pgprot_t prot)
 {
 	unsigned long addr, length, end, next;
 
@@ -261,7 +253,7 @@ static void __init __create_mapping(struct mm_struct *mm, pgd_t *pgd,
 	end = addr + length;
 	do {
 		next = pgd_addr_end(addr, end);
-		alloc_init_pud(mm, pgd, addr, next, phys, map_io);
+		alloc_init_pud(mm, pgd, addr, next, phys, prot);
 		phys += next - addr;
 	} while (pgd++, addr = next, addr != end);
 }
@@ -275,7 +268,7 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
 		return;
 	}
 	__create_mapping(&init_mm, pgd_offset_k(virt & PAGE_MASK), phys, virt,
-			 size, 0);
+			 size, PAGE_KERNEL_EXEC);
 }
 
 void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
@@ -285,7 +278,16 @@ void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
 		return;
 	}
 	__create_mapping(&init_mm, &idmap_pg_dir[pgd_index(addr)],
-			 addr, addr, size, map_io);
+			 addr, addr, size,
+			 map_io ? __pgprot(PROT_DEVICE_nGnRE)
+				: PAGE_KERNEL_EXEC);
+}
+
+void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
+			       unsigned long virt, phys_addr_t size,
+			       pgprot_t prot)
+{
+	__create_mapping(mm, pgd_offset(mm, virt), phys, virt, size, prot);
 }
 
 static void __init map_mem(void)
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 03/10] efi: split off remapping code from efi_config_init()
  2014-11-06 14:13 ` Ard Biesheuvel
@ 2014-11-06 14:13     ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: leif.lindholm, roy.franz, linux-arm-kernel, mark.rutland, msalter,
	dyoung, linux-efi, matt.fleming, will.deacon, catalin.marinas,
	grant.likely
  Cc: Ard Biesheuvel

Split off the remapping code from efi_config_init() so that the caller
can perform its own remapping. This is necessary to correctly handle
virtually remapped UEFI memory regions under kexec, as efi.systab will
have been updated to a virtual address.
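The resulting split can be modelled as follows; the structures and names here are illustrative, not the kernel's (the real efi_config_parse_tables() also prints each table and matches full GUIDs, elided here). Parsing operates on a caller-provided, already-mapped buffer, so a kexec'd kernel can remap the tables itself first:

```c
#include <stddef.h>
#include <string.h>

/* mock config table entry: a guid-like tag plus a pointer-sized value */
struct cfg_entry {
	int guid;
	unsigned long table;
};

/* the parse step: works on memory the caller has already mapped */
static int parse_tables_mock(const struct cfg_entry *tables, int count,
			     int wanted_guid, unsigned long *out)
{
	for (int i = 0; i < count; i++) {
		if (tables[i].guid == wanted_guid) {
			*out = tables[i].table;
			return 0;
		}
	}
	return -1;
}

/*
 * The old entry point becomes a thin wrapper: map, parse, unmap.
 * A kexec-aware caller skips this wrapper and maps the (virtual)
 * table address itself before calling parse_tables_mock().
 */
static int config_init_mock(const struct cfg_entry *fw_tables, int count,
			    int wanted_guid, unsigned long *out)
{
	/* in the kernel this would be early_memremap()/early_memunmap() */
	struct cfg_entry mapped[16];

	if (count > 16)
		return -1;
	memcpy(mapped, fw_tables, count * sizeof(*fw_tables));
	return parse_tables_mock(mapped, count, wanted_guid, out);
}
```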

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 drivers/firmware/efi/efi.c | 49 +++++++++++++++++++++++++++++-----------------
 include/linux/efi.h        |  2 ++
 2 files changed, 33 insertions(+), 18 deletions(-)

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 9035c1b74d58..0de77db4fb88 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -293,9 +293,10 @@ static __init int match_config_table(efi_guid_t *guid,
 	return 0;
 }
 
-int __init efi_config_init(efi_config_table_type_t *arch_tables)
+int __init efi_config_parse_tables(void *config_tables, int count,
+				   efi_config_table_type_t *arch_tables)
 {
-	void *config_tables, *tablep;
+	void *tablep;
 	int i, sz;
 
 	if (efi_enabled(EFI_64BIT))
@@ -303,19 +304,9 @@ int __init efi_config_init(efi_config_table_type_t *arch_tables)
 	else
 		sz = sizeof(efi_config_table_32_t);
 
-	/*
-	 * Let's see what config tables the firmware passed to us.
-	 */
-	config_tables = early_memremap(efi.systab->tables,
-				       efi.systab->nr_tables * sz);
-	if (config_tables == NULL) {
-		pr_err("Could not map Configuration table!\n");
-		return -ENOMEM;
-	}
-
 	tablep = config_tables;
 	pr_info("");
-	for (i = 0; i < efi.systab->nr_tables; i++) {
+	for (i = 0; i < count; i++) {
 		efi_guid_t guid;
 		unsigned long table;
 
@@ -328,8 +319,6 @@ int __init efi_config_init(efi_config_table_type_t *arch_tables)
 			if (table64 >> 32) {
 				pr_cont("\n");
 				pr_err("Table located above 4GB, disabling EFI.\n");
-				early_memunmap(config_tables,
-					       efi.systab->nr_tables * sz);
 				return -EINVAL;
 			}
 #endif
@@ -344,13 +333,37 @@ int __init efi_config_init(efi_config_table_type_t *arch_tables)
 		tablep += sz;
 	}
 	pr_cont("\n");
-	early_memunmap(config_tables, efi.systab->nr_tables * sz);
-
 	set_bit(EFI_CONFIG_TABLES, &efi.flags);
-
 	return 0;
 }
 
+int __init efi_config_init(efi_config_table_type_t *arch_tables)
+{
+	void *config_tables;
+	int sz, ret;
+
+	if (efi_enabled(EFI_64BIT))
+		sz = sizeof(efi_config_table_64_t);
+	else
+		sz = sizeof(efi_config_table_32_t);
+
+	/*
+	 * Let's see what config tables the firmware passed to us.
+	 */
+	config_tables = early_memremap(efi.systab->tables,
+				       efi.systab->nr_tables * sz);
+	if (config_tables == NULL) {
+		pr_err("Could not map Configuration table!\n");
+		return -ENOMEM;
+	}
+
+	ret = efi_config_parse_tables(config_tables, efi.systab->nr_tables,
+				      arch_tables);
+
+	early_memunmap(config_tables, efi.systab->nr_tables * sz);
+	return ret;
+}
+
 #ifdef CONFIG_EFI_VARS_MODULE
 static int __init efi_load_efivars(void)
 {
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 0238d612750e..2dc8577032b3 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -875,6 +875,8 @@ static inline efi_status_t efi_query_variable_store(u32 attributes, unsigned lon
 #endif
 extern void __iomem *efi_lookup_mapped_addr(u64 phys_addr);
 extern int efi_config_init(efi_config_table_type_t *arch_tables);
+extern int efi_config_parse_tables(void *config_tables, int count,
+				   efi_config_table_type_t *arch_tables);
 extern u64 efi_get_iobase (void);
 extern u32 efi_mem_type (unsigned long phys_addr);
 extern u64 efi_mem_attributes (unsigned long phys_addr);
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 04/10] efi: add common infrastructure for stub-installed virtual mapping
  2014-11-06 14:13 ` Ard Biesheuvel
@ 2014-11-06 14:13     ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	mark.rutland-5wv7dgnIgG8, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA, linux-efi-u79uwXL29TY76Z2rM5mHXA,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, will.deacon-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, grant.likely-QSEj5FYQhm4dnm+yROfE0A
  Cc: Ard Biesheuvel

This introduces the common infrastructure, to be shared between arm64
and ARM, that wires the UEFI memory map into system RAM discovery,
creates the virtual mappings for Runtime Services, and aligns cache
attributes between kernel and userland (/dev/mem) mappings for regions
owned by UEFI.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 drivers/firmware/efi/Kconfig   |   3 +
 drivers/firmware/efi/Makefile  |   1 +
 drivers/firmware/efi/virtmap.c | 224 +++++++++++++++++++++++++++++++++++++++++
 include/linux/efi.h            |  15 ++-
 4 files changed, 242 insertions(+), 1 deletion(-)
 create mode 100644 drivers/firmware/efi/virtmap.c

diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
index f712d47f30d8..c71498180e67 100644
--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -60,6 +60,9 @@ config EFI_RUNTIME_WRAPPERS
 config EFI_ARMSTUB
 	bool
 
+config EFI_VIRTMAP
+	bool
+
 endmenu
 
 config UEFI_CPER
diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
index aef6a95adef5..3fd26c0ad40b 100644
--- a/drivers/firmware/efi/Makefile
+++ b/drivers/firmware/efi/Makefile
@@ -8,3 +8,4 @@ obj-$(CONFIG_UEFI_CPER)			+= cper.o
 obj-$(CONFIG_EFI_RUNTIME_MAP)		+= runtime-map.o
 obj-$(CONFIG_EFI_RUNTIME_WRAPPERS)	+= runtime-wrappers.o
 obj-$(CONFIG_EFI_ARM_STUB)		+= libstub/
+obj-$(CONFIG_EFI_VIRTMAP)		+= virtmap.o
diff --git a/drivers/firmware/efi/virtmap.c b/drivers/firmware/efi/virtmap.c
new file mode 100644
index 000000000000..11670186f8e6
--- /dev/null
+++ b/drivers/firmware/efi/virtmap.c
@@ -0,0 +1,224 @@
+/*
+ * Copyright (C) 2014 Linaro Ltd.
+ * Author: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/efi.h>
+#include <linux/mm_types.h>
+#include <linux/rbtree.h>
+#include <linux/rwsem.h>
+#include <linux/spinlock.h>
+#include <linux/atomic.h>
+
+#include <asm/efi.h>
+#include <asm/pgtable.h>
+#include <asm/mmu_context.h>
+#include <asm/mmu.h>
+
+static pgd_t efi_pgd[PTRS_PER_PGD] __page_aligned_bss;
+
+static struct mm_struct efi_mm = {
+	.mm_rb			= RB_ROOT,
+	.pgd			= efi_pgd,
+	.mm_users		= ATOMIC_INIT(2),
+	.mm_count		= ATOMIC_INIT(1),
+	.mmap_sem		= __RWSEM_INITIALIZER(efi_mm.mmap_sem),
+	.page_table_lock	= __SPIN_LOCK_UNLOCKED(efi_mm.page_table_lock),
+	.mmlist			= LIST_HEAD_INIT(efi_mm.mmlist),
+	INIT_MM_CONTEXT(efi_mm)
+};
+
+void efi_virtmap_load(void)
+{
+	WARN_ON(preemptible());
+	efi_set_pgd(&efi_mm);
+}
+
+void efi_virtmap_unload(void)
+{
+	efi_set_pgd(current->active_mm);
+}
+
+static pgprot_t efi_md_access_prot(efi_memory_desc_t *md, pgprot_t prot)
+{
+	if (md->attribute & EFI_MEMORY_WB)
+		return prot;
+	if (md->attribute & (EFI_MEMORY_WT|EFI_MEMORY_WC))
+		return pgprot_writecombine(prot);
+	return pgprot_device(prot);
+}
+
+void __init efi_virtmap_init(void)
+{
+	efi_memory_desc_t *md;
+
+	if (!efi_enabled(EFI_BOOT))
+		return;
+
+	for_each_efi_memory_desc(&memmap, md) {
+		u64 paddr, npages, size;
+		pgprot_t prot;
+
+		if (!(md->attribute & EFI_MEMORY_RUNTIME))
+			continue;
+		if (WARN(md->virt_addr == 0,
+			 "UEFI virtual mapping incomplete or missing -- no entry found for 0x%llx\n",
+			 md->phys_addr))
+			return;
+
+		paddr = md->phys_addr;
+		npages = md->num_pages;
+		memrange_efi_to_native(&paddr, &npages);
+		size = npages << PAGE_SHIFT;
+
+		pr_info("  EFI remap 0x%012llx => %p\n",
+			md->phys_addr, (void *)md->virt_addr);
+
+		/*
+		 * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
+		 * executable, everything else can be mapped with the XN bits
+		 * set.
+		 */
+		if (md->type == EFI_RUNTIME_SERVICES_CODE)
+			prot = efi_md_access_prot(md, PAGE_KERNEL_EXEC);
+		else
+			prot = efi_md_access_prot(md, PAGE_KERNEL);
+
+		create_pgd_mapping(&efi_mm, paddr, md->virt_addr, size, prot);
+	}
+	set_bit(EFI_VIRTMAP, &efi.flags);
+}
+
+/*
+ * Return true for RAM regions that are available for general use.
+ */
+bool efi_mem_is_usable_region(efi_memory_desc_t *md)
+{
+	switch (md->type) {
+	case EFI_LOADER_CODE:
+	case EFI_LOADER_DATA:
+	case EFI_BOOT_SERVICES_CODE:
+	case EFI_BOOT_SERVICES_DATA:
+	case EFI_CONVENTIONAL_MEMORY:
+		return md->attribute & EFI_MEMORY_WB;
+	default:
+		break;
+	}
+	return false;
+}
+
+/*
+ * Translate an EFI virtual address into a physical address: this is necessary,
+ * as some data members of the EFI system table are virtually remapped after
+ * SetVirtualAddressMap() has been called.
+ */
+phys_addr_t efi_to_phys(unsigned long addr)
+{
+	efi_memory_desc_t *md;
+
+	for_each_efi_memory_desc(&memmap, md) {
+		if (!(md->attribute & EFI_MEMORY_RUNTIME))
+			continue;
+		if (md->virt_addr == 0)
+			/* no virtual mapping has been installed by the stub */
+			break;
+		if (md->virt_addr <= addr &&
+		    (addr - md->virt_addr) < (md->num_pages << EFI_PAGE_SHIFT))
+			return md->phys_addr + addr - md->virt_addr;
+	}
+	return addr;
+}
+
+static efi_memory_desc_t *efi_memory_desc(phys_addr_t addr)
+{
+	efi_memory_desc_t *md;
+
+	for_each_efi_memory_desc(&memmap, md)
+		if (md->phys_addr <= addr &&
+		    (addr - md->phys_addr) < (md->num_pages << EFI_PAGE_SHIFT))
+			return md;
+	return NULL;
+}
+
+bool efi_mem_access_prot(unsigned long pfn, pgprot_t *prot)
+{
+	efi_memory_desc_t *md = efi_memory_desc(pfn << PAGE_SHIFT);
+
+	if (!md)
+		return false;
+
+	*prot = efi_md_access_prot(md, *prot);
+	return true;
+}
+
+bool efi_mem_access_allowed(unsigned long pfn, unsigned long size,
+			    bool writable)
+{
+	static const u64 memtype_mask = EFI_MEMORY_UC | EFI_MEMORY_WC |
+					EFI_MEMORY_WT | EFI_MEMORY_WB |
+					EFI_MEMORY_UCE;
+	phys_addr_t addr = pfn << PAGE_SHIFT;
+	efi_memory_desc_t *md;
+
+	/*
+	 * To keep things reasonable and simple, let's only allow mappings
+	 * that are either entirely disjoint from the UEFI memory map, or
+	 * are completely covered by it.
+	 */
+	md = efi_memory_desc(addr);
+	if (!md) {
+		/*
+		 * 'addr' is not covered by the UEFI memory map, so no other
+		 * UEFI memory map entries should intersect with the range
+		 */
+		for_each_efi_memory_desc(&memmap, md)
+			if (min(addr + size, md->phys_addr +
+				(md->num_pages << EFI_PAGE_SHIFT))
+			    > max(addr, md->phys_addr))
+				return false;
+		return true;
+	} else {
+		/*
+		 * Iterate over physically adjacent UEFI memory map entries.
+		 * (This would be a lot easier if the memory map were sorted,
+		 * and while it is in most cases, it is not mandated by the
+		 * UEFI spec so we cannot rely on it.)
+		 */
+		u64 attr = md->attribute & memtype_mask;
+		unsigned long md_size;
+
+		do {
+			/*
+			 * All regions must have the same memory type
+			 * attributes.
+			 */
+			if (attr != (md->attribute & memtype_mask))
+				return false;
+
+			/*
+			 * Under CONFIG_STRICT_DEVMEM, don't allow any of
+			 * the regions with special significance to UEFI
+			 * to be mapped read-write. In case of unusable or
+			 * MMIO memory, don't allow read-only access either.
+			 */
+			if (IS_ENABLED(CONFIG_STRICT_DEVMEM) &&
+			    (md->type == EFI_UNUSABLE_MEMORY ||
+			     md->type == EFI_MEMORY_MAPPED_IO ||
+			     (!efi_mem_is_usable_region(md) && writable)))
+				return false;
+
+			/*
+			 * We are done if this entry covers
+			 * the remainder of the region.
+			 */
+			md_size = md->num_pages << EFI_PAGE_SHIFT;
+			if (md->phys_addr + md_size >= addr + size)
+				return true;
+		} while ((md = efi_memory_desc(md->phys_addr + md_size)));
+	}
+	return false;
+}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 2dc8577032b3..e97abb226541 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -941,7 +941,8 @@ extern int __init efi_setup_pcdp_console(char *);
 #define EFI_MEMMAP		4	/* Can we use EFI memory map? */
 #define EFI_64BIT		5	/* Is the firmware 64-bit? */
 #define EFI_PARAVIRT		6	/* Access is via a paravirt interface */
-#define EFI_ARCH_1		7	/* First arch-specific bit */
+#define EFI_VIRTMAP		7	/* Use virtmap installed by the stub */
+#define EFI_ARCH_1		8	/* First arch-specific bit */
 
 #ifdef CONFIG_EFI
 /*
@@ -952,6 +953,7 @@ static inline bool efi_enabled(int feature)
 	return test_bit(feature, &efi.flags) != 0;
 }
 extern void efi_reboot(enum reboot_mode reboot_mode, const char *__unused);
+extern void efi_virtmap_init(void);
 #else
 static inline bool efi_enabled(int feature)
 {
@@ -959,6 +961,7 @@ static inline bool efi_enabled(int feature)
 }
 static inline void
 efi_reboot(enum reboot_mode reboot_mode, const char *__unused) {}
+static inline void efi_virtmap_init(void) {}
 #endif
 
 /*
@@ -1250,4 +1253,14 @@ efi_status_t handle_cmdline_files(efi_system_table_t *sys_table_arg,
 efi_status_t efi_parse_options(char *cmdline);
 
 bool efi_runtime_disabled(void);
+
+phys_addr_t efi_to_phys(unsigned long addr);
+bool efi_mem_is_usable_region(efi_memory_desc_t *md);
+bool efi_mem_access_prot(unsigned long pfn, pgprot_t *prot);
+bool efi_mem_access_allowed(unsigned long pfn, unsigned long size,
+			    bool writable);
+
+void efi_virtmap_load(void);
+void efi_virtmap_unload(void);
+
 #endif /* _LINUX_EFI_H */
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 05/10] arm64/efi: move SetVirtualAddressMap() to UEFI stub
  2014-11-06 14:13 ` Ard Biesheuvel
@ 2014-11-06 14:13     ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	mark.rutland-5wv7dgnIgG8, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA, linux-efi-u79uwXL29TY76Z2rM5mHXA,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, will.deacon-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, grant.likely-QSEj5FYQhm4dnm+yROfE0A
  Cc: Ard Biesheuvel

In order to support kexec, the kernel needs to be able to deal with the
state of the UEFI firmware after SetVirtualAddressMap() has been called.
To avoid having separate code paths for non-kexec and kexec, let's move
the call to SetVirtualAddressMap() to the stub: this will guarantee us
that it will only be called once (since the stub is not executed during
kexec), and ensures that the UEFI state is identical between kexec and
normal boot.

This implies that the layout of the virtual mapping needs to be created
by the stub as well. All regions are rounded up to a naturally aligned
multiple of 64 KB (for compatibility with 64k pages kernels) and recorded
in the UEFI memory map. The kernel proper reads those values and installs
the mappings in a dedicated set of page tables that are swapped in during
UEFI Runtime Services calls.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/Kconfig                 |   1 +
 arch/arm64/include/asm/efi.h       |  16 +++--
 arch/arm64/kernel/efi.c            | 128 ++++++++-----------------------------
 arch/arm64/kernel/setup.c          |   1 +
 drivers/firmware/efi/libstub/fdt.c | 107 ++++++++++++++++++++++++++++++-
 5 files changed, 144 insertions(+), 109 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 2c3c2ca6f8bc..a6d00b7cf60b 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -393,6 +393,7 @@ config EFI
 	select EFI_RUNTIME_WRAPPERS
 	select EFI_STUB
 	select EFI_ARMSTUB
+	select EFI_VIRTMAP
 	default y
 	help
 	  This option provides support for runtime services provided
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index a34fd3b12e2b..e2cea16f3bd7 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -14,21 +14,27 @@ extern void efi_idmap_init(void);
 
 #define efi_call_virt(f, ...)						\
 ({									\
-	efi_##f##_t *__f = efi.systab->runtime->f;			\
+	efi_##f##_t *__f;						\
 	efi_status_t __s;						\
 									\
-	kernel_neon_begin();						\
+	kernel_neon_begin();	/* disables preemption */		\
+	efi_virtmap_load();						\
+	__f = efi.systab->runtime->f;					\
 	__s = __f(__VA_ARGS__);						\
+	efi_virtmap_unload();						\
 	kernel_neon_end();						\
 	__s;								\
 })
 
 #define __efi_call_virt(f, ...)						\
 ({									\
-	efi_##f##_t *__f = efi.systab->runtime->f;			\
+	efi_##f##_t *__f;						\
 									\
-	kernel_neon_begin();						\
+	kernel_neon_begin();	/* disables preemption */		\
+	efi_virtmap_load();						\
+	__f = efi.systab->runtime->f;					\
 	__f(__VA_ARGS__);						\
+	efi_virtmap_unload();						\
 	kernel_neon_end();						\
 })
 
@@ -44,4 +50,6 @@ extern void efi_idmap_init(void);
 
 #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__)
 
+void efi_set_pgd(struct mm_struct *mm);
+
 #endif /* _ASM_EFI_H */
diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 6fac253bc783..beb5a79d32c3 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -25,11 +25,10 @@
 #include <asm/efi.h>
 #include <asm/tlbflush.h>
 #include <asm/mmu_context.h>
+#include <asm/mmu.h>
 
 struct efi_memory_map memmap;
 
-static efi_runtime_services_t *runtime;
-
 static u64 efi_system_table;
 
 static int uefi_debug __initdata;
@@ -72,6 +71,8 @@ static void __init efi_setup_idmap(void)
 static int __init uefi_init(void)
 {
 	efi_char16_t *c16;
+	void *config_tables;
+	u64 table_size;
 	char vendor[100] = "unknown";
 	int i, retval;
 
@@ -99,7 +100,7 @@ static int __init uefi_init(void)
 			efi.systab->hdr.revision & 0xffff);
 
 	/* Show what we know for posterity */
-	c16 = early_memremap(efi.systab->fw_vendor,
+	c16 = early_memremap(efi_to_phys(efi.systab->fw_vendor),
 			     sizeof(vendor));
 	if (c16) {
 		for (i = 0; i < (int) sizeof(vendor) - 1 && *c16; ++i)
@@ -112,8 +113,14 @@ static int __init uefi_init(void)
 		efi.systab->hdr.revision >> 16,
 		efi.systab->hdr.revision & 0xffff, vendor);
 
-	retval = efi_config_init(NULL);
+	table_size = sizeof(efi_config_table_64_t) * efi.systab->nr_tables;
+	config_tables = early_memremap(efi_to_phys(efi.systab->tables),
+				       table_size);
+
+	retval = efi_config_parse_tables(config_tables,
+					 efi.systab->nr_tables, NULL);
 
+	early_memunmap(config_tables, table_size);
 out:
 	early_memunmap(efi.systab,  sizeof(efi_system_table_t));
 	return retval;
@@ -328,51 +335,9 @@ void __init efi_idmap_init(void)
 	efi_setup_idmap();
 }
 
-static int __init remap_region(efi_memory_desc_t *md, void **new)
-{
-	u64 paddr, vaddr, npages, size;
-
-	paddr = md->phys_addr;
-	npages = md->num_pages;
-	memrange_efi_to_native(&paddr, &npages);
-	size = npages << PAGE_SHIFT;
-
-	if (is_normal_ram(md))
-		vaddr = (__force u64)ioremap_cache(paddr, size);
-	else
-		vaddr = (__force u64)ioremap(paddr, size);
-
-	if (!vaddr) {
-		pr_err("Unable to remap 0x%llx pages @ %p\n",
-		       npages, (void *)paddr);
-		return 0;
-	}
-
-	/* adjust for any rounding when EFI and system pagesize differs */
-	md->virt_addr = vaddr + (md->phys_addr - paddr);
-
-	if (uefi_debug)
-		pr_info("  EFI remap 0x%012llx => %p\n",
-			md->phys_addr, (void *)md->virt_addr);
-
-	memcpy(*new, md, memmap.desc_size);
-	*new += memmap.desc_size;
-
-	return 1;
-}
-
-/*
- * Switch UEFI from an identity map to a kernel virtual map
- */
 static int __init arm64_enter_virtual_mode(void)
 {
-	efi_memory_desc_t *md;
-	phys_addr_t virtmap_phys;
-	void *virtmap, *virt_md;
-	efi_status_t status;
 	u64 mapsize;
-	int count = 0;
-	unsigned long flags;
 
 	if (!efi_enabled(EFI_BOOT)) {
 		pr_info("EFI services will not be available.\n");
@@ -395,79 +360,28 @@ static int __init arm64_enter_virtual_mode(void)
 
 	efi.memmap = &memmap;
 
-	/* Map the runtime regions */
-	virtmap = kmalloc(mapsize, GFP_KERNEL);
-	if (!virtmap) {
-		pr_err("Failed to allocate EFI virtual memmap\n");
-		return -1;
-	}
-	virtmap_phys = virt_to_phys(virtmap);
-	virt_md = virtmap;
-
-	for_each_efi_memory_desc(&memmap, md) {
-		if (!(md->attribute & EFI_MEMORY_RUNTIME))
-			continue;
-		if (!remap_region(md, &virt_md))
-			goto err_unmap;
-		++count;
-	}
-
-	efi.systab = (__force void *)efi_lookup_mapped_addr(efi_system_table);
+	efi.systab = (__force void *)ioremap_cache(efi_system_table,
+						   sizeof(efi_system_table_t));
 	if (!efi.systab) {
-		/*
-		 * If we have no virtual mapping for the System Table at this
-		 * point, the memory map doesn't cover the physical offset where
-		 * it resides. This means the System Table will be inaccessible
-		 * to Runtime Services themselves once the virtual mapping is
-		 * installed.
-		 */
-		pr_err("Failed to remap EFI System Table -- buggy firmware?\n");
-		goto err_unmap;
+		pr_err("Failed to remap EFI System Table\n");
+		return -1;
 	}
 	set_bit(EFI_SYSTEM_TABLES, &efi.flags);
 
-	local_irq_save(flags);
-	cpu_switch_mm(idmap_pg_dir, &init_mm);
-
-	/* Call SetVirtualAddressMap with the physical address of the map */
-	runtime = efi.systab->runtime;
-	efi.set_virtual_address_map = runtime->set_virtual_address_map;
-
-	status = efi.set_virtual_address_map(count * memmap.desc_size,
-					     memmap.desc_size,
-					     memmap.desc_version,
-					     (efi_memory_desc_t *)virtmap_phys);
-	cpu_set_reserved_ttbr0();
-	flush_tlb_all();
-	local_irq_restore(flags);
-
-	kfree(virtmap);
-
 	free_boot_services();
 
-	if (status != EFI_SUCCESS) {
-		pr_err("Failed to set EFI virtual address map! [%lx]\n",
-			status);
+	if (!efi_enabled(EFI_VIRTMAP)) {
+		pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
 		return -1;
 	}
 
 	/* Set up runtime services function pointers */
-	runtime = efi.systab->runtime;
 	efi_native_runtime_setup();
 	set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
 
 	efi.runtime_version = efi.systab->hdr.revision;
 
 	return 0;
-
-err_unmap:
-	/* unmap all mappings that succeeded: there are 'count' of those */
-	for (virt_md = virtmap; count--; virt_md += memmap.desc_size) {
-		md = virt_md;
-		iounmap((__force void __iomem *)md->virt_addr);
-	}
-	kfree(virtmap);
-	return -1;
 }
 early_initcall(arm64_enter_virtual_mode);
 
@@ -484,3 +398,11 @@ static int __init arm64_dmi_init(void)
 	return 0;
 }
 core_initcall(arm64_dmi_init);
+
+void efi_set_pgd(struct mm_struct *mm)
+{
+	cpu_switch_mm(mm->pgd, mm);
+	flush_tlb_all();
+	if (icache_is_aivivt())
+		__flush_icache_all();
+}
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 2437196cc5d4..ac19b2d6a3fc 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -393,6 +393,7 @@ void __init setup_arch(char **cmdline_p)
 	request_standard_resources();
 
 	efi_idmap_init();
+	efi_virtmap_init();
 
 	unflatten_device_tree();
 
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index c846a9608cbd..7129ed55e4bb 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -168,6 +168,68 @@ fdt_set_fail:
 #endif
 
 /*
+ * This is the base address at which to start allocating virtual memory ranges
+ * for UEFI Runtime Services. This is a userland range so that we can use any
+ * allocation we choose, and eliminate the risk of a conflict after kexec.
+ */
+#define EFI_RT_VIRTUAL_BASE	0x40000000
+
+static void update_memory_map(efi_memory_desc_t *memory_map,
+			      unsigned long map_size, unsigned long desc_size,
+			      int *count)
+{
+	u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
+	union {
+		efi_memory_desc_t entry;
+		u8 pad[desc_size];
+	} *p, *q, tmp;
+	int i = map_size / desc_size;
+
+	p = (void *)memory_map;
+	for (q = p; i >= 0; i--, q++) {
+		u64 paddr, size;
+
+		if (!(q->entry.attribute & EFI_MEMORY_RUNTIME))
+			continue;
+
+		/*
+		 * Swap the entries around so that all EFI_MEMORY_RUNTIME
+		 * entries bubble to the top. This will allow us to reuse the
+		 * table as input to SetVirtualAddressMap().
+		 */
+		if (q != p) {
+			tmp = *p;
+			*p = *q;
+			*q = tmp;
+		}
+
+		/*
+		 * Make the mapping compatible with 64k pages: this allows
+		 * a 4k page size kernel to kexec a 64k page size kernel and
+		 * vice versa.
+		 */
+		paddr = round_down(p->entry.phys_addr, SZ_64K);
+		size = round_up(p->entry.num_pages * EFI_PAGE_SIZE +
+				p->entry.phys_addr - paddr, SZ_64K);
+
+		/*
+		 * Avoid wasting memory on PTEs by choosing a virtual base that
+		 * is compatible with section mappings if this region has the
+		 * appropriate size and physical alignment. (Sections are 2 MB
+		 * on 4k granule kernels)
+		 */
+		if (IS_ALIGNED(p->entry.phys_addr, SZ_2M) && size >= SZ_2M)
+			efi_virt_base = round_up(efi_virt_base, SZ_2M);
+
+		p->entry.virt_addr = efi_virt_base + p->entry.phys_addr - paddr;
+		efi_virt_base += size;
+
+		++p;
+		++*count;
+	}
+}
+
+/*
  * Allocate memory for a new FDT, then add EFI, commandline, and
  * initrd related fields to the FDT.  This routine increases the
  * FDT allocation size until the allocated memory is large
@@ -196,6 +258,7 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 	efi_memory_desc_t *memory_map;
 	unsigned long new_fdt_size;
 	efi_status_t status;
+	int runtime_entry_count = 0;
 
 	/*
 	 * Estimate size of new FDT, and allocate memory for it. We
@@ -248,12 +311,52 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 		}
 	}
 
+	/*
+	 * Update the memory map with virtual addresses, and reorder the entries
+	 * so that we can pass it straight into SetVirtualAddressMap()
+	 */
+	update_memory_map(memory_map, map_size, desc_size,
+			  &runtime_entry_count);
+
+	pr_efi(sys_table,
+	       "Exiting boot services and installing virtual address map...\n");
+
 	/* Now we are ready to exit_boot_services.*/
 	status = sys_table->boottime->exit_boot_services(handle, mmap_key);
 
+	if (status == EFI_SUCCESS) {
+		efi_set_virtual_address_map_t *svam;
+
+		/* Install the new virtual address map */
+		svam = sys_table->runtime->set_virtual_address_map;
+		status = svam(runtime_entry_count * desc_size, desc_size,
+			      desc_ver, memory_map);
 
-	if (status == EFI_SUCCESS)
-		return status;
+		/*
+		 * We are beyond the point of no return here, so if the call to
+		 * SetVirtualAddressMap() failed, we need to signal that to the
+		 * incoming kernel but proceed normally otherwise.
+		 */
+		if (status != EFI_SUCCESS) {
+			int i;
+
+			/*
+			 * Set the virtual address field of all
+			 * EFI_MEMORY_RUNTIME entries to 0. This will signal
+			 * the incoming kernel that no virtual translation has
+			 * been installed.
+			 */
+			for (i = 0; i < map_size; i += desc_size) {
+				efi_memory_desc_t *p;
+
+				p = (efi_memory_desc_t *)((u8 *)memory_map + i);
+				if (!(p->attribute & EFI_MEMORY_RUNTIME))
+					break;
+				p->virt_addr = 0;
+			}
+		}
+		return EFI_SUCCESS;
+	}
 
 	pr_efi_err(sys_table, "Exit boot services failed.\n");
 
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 05/10] arm64/efi: move SetVirtualAddressMap() to UEFI stub
@ 2014-11-06 14:13     ` Ard Biesheuvel
  0 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

In order to support kexec, the kernel needs to be able to deal with the
state of the UEFI firmware after SetVirtualAddressMap() has been called.
To avoid having separate code paths for non-kexec and kexec, let's move
the call to SetVirtualAddressMap() to the stub: this guarantees that it
is only called once (since the stub is not executed during kexec), and
ensures that the UEFI state is identical between kexec and normal boot.

This implies that the layout of the virtual mapping needs to be created
by the stub as well. All regions are rounded up to a naturally aligned
multiple of 64 KB (for compatibility with 64 KB page kernels) and recorded
in the UEFI memory map. The kernel proper reads those values and installs
the mappings in a dedicated set of page tables that are swapped in during
UEFI Runtime Services calls.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig                 |   1 +
 arch/arm64/include/asm/efi.h       |  16 +++--
 arch/arm64/kernel/efi.c            | 128 ++++++++-----------------------------
 arch/arm64/kernel/setup.c          |   1 +
 drivers/firmware/efi/libstub/fdt.c | 107 ++++++++++++++++++++++++++++++-
 5 files changed, 144 insertions(+), 109 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 2c3c2ca6f8bc..a6d00b7cf60b 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -393,6 +393,7 @@ config EFI
 	select EFI_RUNTIME_WRAPPERS
 	select EFI_STUB
 	select EFI_ARMSTUB
+	select EFI_VIRTMAP
 	default y
 	help
 	  This option provides support for runtime services provided
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index a34fd3b12e2b..e2cea16f3bd7 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -14,21 +14,27 @@ extern void efi_idmap_init(void);
 
 #define efi_call_virt(f, ...)						\
 ({									\
-	efi_##f##_t *__f = efi.systab->runtime->f;			\
+	efi_##f##_t *__f;						\
 	efi_status_t __s;						\
 									\
-	kernel_neon_begin();						\
+	kernel_neon_begin();	/* disables preemption */		\
+	efi_virtmap_load();						\
+	__f = efi.systab->runtime->f;					\
 	__s = __f(__VA_ARGS__);						\
+	efi_virtmap_unload();						\
 	kernel_neon_end();						\
 	__s;								\
 })
 
 #define __efi_call_virt(f, ...)						\
 ({									\
-	efi_##f##_t *__f = efi.systab->runtime->f;			\
+	efi_##f##_t *__f;						\
 									\
-	kernel_neon_begin();						\
+	kernel_neon_begin();	/* disables preemption */		\
+	efi_virtmap_load();						\
+	__f = efi.systab->runtime->f;					\
 	__f(__VA_ARGS__);						\
+	efi_virtmap_unload();						\
 	kernel_neon_end();						\
 })
 
@@ -44,4 +50,6 @@ extern void efi_idmap_init(void);
 
 #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__)
 
+void efi_set_pgd(struct mm_struct *mm);
+
 #endif /* _ASM_EFI_H */
diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 6fac253bc783..beb5a79d32c3 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -25,11 +25,10 @@
 #include <asm/efi.h>
 #include <asm/tlbflush.h>
 #include <asm/mmu_context.h>
+#include <asm/mmu.h>
 
 struct efi_memory_map memmap;
 
-static efi_runtime_services_t *runtime;
-
 static u64 efi_system_table;
 
 static int uefi_debug __initdata;
@@ -72,6 +71,8 @@ static void __init efi_setup_idmap(void)
 static int __init uefi_init(void)
 {
 	efi_char16_t *c16;
+	void *config_tables;
+	u64 table_size;
 	char vendor[100] = "unknown";
 	int i, retval;
 
@@ -99,7 +100,7 @@ static int __init uefi_init(void)
 			efi.systab->hdr.revision & 0xffff);
 
 	/* Show what we know for posterity */
-	c16 = early_memremap(efi.systab->fw_vendor,
+	c16 = early_memremap(efi_to_phys(efi.systab->fw_vendor),
 			     sizeof(vendor));
 	if (c16) {
 		for (i = 0; i < (int) sizeof(vendor) - 1 && *c16; ++i)
@@ -112,8 +113,14 @@ static int __init uefi_init(void)
 		efi.systab->hdr.revision >> 16,
 		efi.systab->hdr.revision & 0xffff, vendor);
 
-	retval = efi_config_init(NULL);
+	table_size = sizeof(efi_config_table_64_t) * efi.systab->nr_tables;
+	config_tables = early_memremap(efi_to_phys(efi.systab->tables),
+				       table_size);
+
+	retval = efi_config_parse_tables(config_tables,
+					 efi.systab->nr_tables, NULL);
 
+	early_memunmap(config_tables, table_size);
 out:
 	early_memunmap(efi.systab,  sizeof(efi_system_table_t));
 	return retval;
@@ -328,51 +335,9 @@ void __init efi_idmap_init(void)
 	efi_setup_idmap();
 }
 
-static int __init remap_region(efi_memory_desc_t *md, void **new)
-{
-	u64 paddr, vaddr, npages, size;
-
-	paddr = md->phys_addr;
-	npages = md->num_pages;
-	memrange_efi_to_native(&paddr, &npages);
-	size = npages << PAGE_SHIFT;
-
-	if (is_normal_ram(md))
-		vaddr = (__force u64)ioremap_cache(paddr, size);
-	else
-		vaddr = (__force u64)ioremap(paddr, size);
-
-	if (!vaddr) {
-		pr_err("Unable to remap 0x%llx pages @ %p\n",
-		       npages, (void *)paddr);
-		return 0;
-	}
-
-	/* adjust for any rounding when EFI and system pagesize differs */
-	md->virt_addr = vaddr + (md->phys_addr - paddr);
-
-	if (uefi_debug)
-		pr_info("  EFI remap 0x%012llx => %p\n",
-			md->phys_addr, (void *)md->virt_addr);
-
-	memcpy(*new, md, memmap.desc_size);
-	*new += memmap.desc_size;
-
-	return 1;
-}
-
-/*
- * Switch UEFI from an identity map to a kernel virtual map
- */
 static int __init arm64_enter_virtual_mode(void)
 {
-	efi_memory_desc_t *md;
-	phys_addr_t virtmap_phys;
-	void *virtmap, *virt_md;
-	efi_status_t status;
 	u64 mapsize;
-	int count = 0;
-	unsigned long flags;
 
 	if (!efi_enabled(EFI_BOOT)) {
 		pr_info("EFI services will not be available.\n");
@@ -395,79 +360,28 @@ static int __init arm64_enter_virtual_mode(void)
 
 	efi.memmap = &memmap;
 
-	/* Map the runtime regions */
-	virtmap = kmalloc(mapsize, GFP_KERNEL);
-	if (!virtmap) {
-		pr_err("Failed to allocate EFI virtual memmap\n");
-		return -1;
-	}
-	virtmap_phys = virt_to_phys(virtmap);
-	virt_md = virtmap;
-
-	for_each_efi_memory_desc(&memmap, md) {
-		if (!(md->attribute & EFI_MEMORY_RUNTIME))
-			continue;
-		if (!remap_region(md, &virt_md))
-			goto err_unmap;
-		++count;
-	}
-
-	efi.systab = (__force void *)efi_lookup_mapped_addr(efi_system_table);
+	efi.systab = (__force void *)ioremap_cache(efi_system_table,
+						   sizeof(efi_system_table_t));
 	if (!efi.systab) {
-		/*
-		 * If we have no virtual mapping for the System Table at this
-		 * point, the memory map doesn't cover the physical offset where
-		 * it resides. This means the System Table will be inaccessible
-		 * to Runtime Services themselves once the virtual mapping is
-		 * installed.
-		 */
-		pr_err("Failed to remap EFI System Table -- buggy firmware?\n");
-		goto err_unmap;
+		pr_err("Failed to remap EFI System Table\n");
+		return -1;
 	}
 	set_bit(EFI_SYSTEM_TABLES, &efi.flags);
 
-	local_irq_save(flags);
-	cpu_switch_mm(idmap_pg_dir, &init_mm);
-
-	/* Call SetVirtualAddressMap with the physical address of the map */
-	runtime = efi.systab->runtime;
-	efi.set_virtual_address_map = runtime->set_virtual_address_map;
-
-	status = efi.set_virtual_address_map(count * memmap.desc_size,
-					     memmap.desc_size,
-					     memmap.desc_version,
-					     (efi_memory_desc_t *)virtmap_phys);
-	cpu_set_reserved_ttbr0();
-	flush_tlb_all();
-	local_irq_restore(flags);
-
-	kfree(virtmap);
-
 	free_boot_services();
 
-	if (status != EFI_SUCCESS) {
-		pr_err("Failed to set EFI virtual address map! [%lx]\n",
-			status);
+	if (!efi_enabled(EFI_VIRTMAP)) {
+		pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
 		return -1;
 	}
 
 	/* Set up runtime services function pointers */
-	runtime = efi.systab->runtime;
 	efi_native_runtime_setup();
 	set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
 
 	efi.runtime_version = efi.systab->hdr.revision;
 
 	return 0;
-
-err_unmap:
-	/* unmap all mappings that succeeded: there are 'count' of those */
-	for (virt_md = virtmap; count--; virt_md += memmap.desc_size) {
-		md = virt_md;
-		iounmap((__force void __iomem *)md->virt_addr);
-	}
-	kfree(virtmap);
-	return -1;
 }
 early_initcall(arm64_enter_virtual_mode);
 
@@ -484,3 +398,11 @@ static int __init arm64_dmi_init(void)
 	return 0;
 }
 core_initcall(arm64_dmi_init);
+
+void efi_set_pgd(struct mm_struct *mm)
+{
+	cpu_switch_mm(mm->pgd, mm);
+	flush_tlb_all();
+	if (icache_is_aivivt())
+		__flush_icache_all();
+}
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 2437196cc5d4..ac19b2d6a3fc 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -393,6 +393,7 @@ void __init setup_arch(char **cmdline_p)
 	request_standard_resources();
 
 	efi_idmap_init();
+	efi_virtmap_init();
 
 	unflatten_device_tree();
 
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index c846a9608cbd..7129ed55e4bb 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -168,6 +168,68 @@ fdt_set_fail:
 #endif
 
 /*
+ * This is the base address at which to start allocating virtual memory ranges
+ * for UEFI Runtime Services. This is a userland range so that we can use any
+ * allocation we choose, and eliminate the risk of a conflict after kexec.
+ */
+#define EFI_RT_VIRTUAL_BASE	0x40000000
+
+static void update_memory_map(efi_memory_desc_t *memory_map,
+			      unsigned long map_size, unsigned long desc_size,
+			      int *count)
+{
+	u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
+	union {
+		efi_memory_desc_t entry;
+		u8 pad[desc_size];
+	} *p, *q, tmp;
+	int i = map_size / desc_size;
+
+	p = (void *)memory_map;
+	for (q = p; i >= 0; i--, q++) {
+		u64 paddr, size;
+
+		if (!(q->entry.attribute & EFI_MEMORY_RUNTIME))
+			continue;
+
+		/*
+		 * Swap the entries around so that all EFI_MEMORY_RUNTIME
+		 * entries bubble to the top. This will allow us to reuse the
+		 * table as input to SetVirtualAddressMap().
+		 */
+		if (q != p) {
+			tmp = *p;
+			*p = *q;
+			*q = tmp;
+		}
+
+		/*
+		 * Make the mapping compatible with 64k pages: this allows
+		 * a 4k page size kernel to kexec a 64k page size kernel and
+		 * vice versa.
+		 */
+		paddr = round_down(p->entry.phys_addr, SZ_64K);
+		size = round_up(p->entry.num_pages * EFI_PAGE_SIZE +
+				p->entry.phys_addr - paddr, SZ_64K);
+
+		/*
+		 * Avoid wasting memory on PTEs by choosing a virtual base that
+		 * is compatible with section mappings if this region has the
+		 * appropriate size and physical alignment. (Sections are 2 MB
+		 * on 4k granule kernels)
+		 */
+		if (IS_ALIGNED(p->entry.phys_addr, SZ_2M) && size >= SZ_2M)
+			efi_virt_base = round_up(efi_virt_base, SZ_2M);
+
+		p->entry.virt_addr = efi_virt_base + p->entry.phys_addr - paddr;
+		efi_virt_base += size;
+
+		++p;
+		++*count;
+	}
+}
+
+/*
  * Allocate memory for a new FDT, then add EFI, commandline, and
  * initrd related fields to the FDT.  This routine increases the
  * FDT allocation size until the allocated memory is large
@@ -196,6 +258,7 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 	efi_memory_desc_t *memory_map;
 	unsigned long new_fdt_size;
 	efi_status_t status;
+	int runtime_entry_count = 0;
 
 	/*
 	 * Estimate size of new FDT, and allocate memory for it. We
@@ -248,12 +311,52 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 		}
 	}
 
+	/*
+	 * Update the memory map with virtual addresses, and reorder the entries
+	 * so that we can pass it straight into SetVirtualAddressMap()
+	 */
+	update_memory_map(memory_map, map_size, desc_size,
+			  &runtime_entry_count);
+
+	pr_efi(sys_table,
+	       "Exiting boot services and installing virtual address map...\n");
+
 	/* Now we are ready to exit_boot_services.*/
 	status = sys_table->boottime->exit_boot_services(handle, mmap_key);
 
+	if (status == EFI_SUCCESS) {
+		efi_set_virtual_address_map_t *svam;
+
+		/* Install the new virtual address map */
+		svam = sys_table->runtime->set_virtual_address_map;
+		status = svam(runtime_entry_count * desc_size, desc_size,
+			      desc_ver, memory_map);
 
-	if (status == EFI_SUCCESS)
-		return status;
+		/*
+		 * We are beyond the point of no return here, so if the call to
+		 * SetVirtualAddressMap() failed, we need to signal that to the
+		 * incoming kernel but proceed normally otherwise.
+		 */
+		if (status != EFI_SUCCESS) {
+			int i;
+
+			/*
+			 * Set the virtual address field of all
+			 * EFI_MEMORY_RUNTIME entries to 0. This will signal
+			 * the incoming kernel that no virtual translation has
+			 * been installed.
+			 */
+			for (i = 0; i < map_size; i += desc_size) {
+				efi_memory_desc_t *p;
+
+				p = (efi_memory_desc_t *)((u8 *)memory_map + i);
+				if (!(p->attribute & EFI_MEMORY_RUNTIME))
+					break;
+				p->virt_addr = 0;
+			}
+		}
+		return EFI_SUCCESS;
+	}
 
 	pr_efi_err(sys_table, "Exit boot services failed.\n");
 
-- 
1.8.3.2


* [PATCH v2 06/10] arm64/efi: remove free_boot_services() and friends
  2014-11-06 14:13 ` Ard Biesheuvel
@ 2014-11-06 14:13     ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	mark.rutland-5wv7dgnIgG8, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA, linux-efi-u79uwXL29TY76Z2rM5mHXA,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, will.deacon-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, grant.likely-QSEj5FYQhm4dnm+yROfE0A
  Cc: Ard Biesheuvel

Now that we are calling SetVirtualAddressMap() from the stub, there is no
need to reserve boot-only memory regions, which implies that there is also
no reason to free them again later.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/kernel/efi.c | 123 +-----------------------------------------------
 1 file changed, 1 insertion(+), 122 deletions(-)

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index beb5a79d32c3..9f05b4fe9fd5 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -170,9 +170,7 @@ static __init void reserve_regions(void)
 		if (is_normal_ram(md))
 			early_init_dt_add_memory_arch(paddr, size);
 
-		if (is_reserve_region(md) ||
-		    md->type == EFI_BOOT_SERVICES_CODE ||
-		    md->type == EFI_BOOT_SERVICES_DATA) {
+		if (is_reserve_region(md)) {
 			memblock_reserve(paddr, size);
 			if (uefi_debug)
 				pr_cont("*");
@@ -185,123 +183,6 @@ static __init void reserve_regions(void)
 	set_bit(EFI_MEMMAP, &efi.flags);
 }
 
-
-static u64 __init free_one_region(u64 start, u64 end)
-{
-	u64 size = end - start;
-
-	if (uefi_debug)
-		pr_info("  EFI freeing: 0x%012llx-0x%012llx\n",	start, end - 1);
-
-	free_bootmem_late(start, size);
-	return size;
-}
-
-static u64 __init free_region(u64 start, u64 end)
-{
-	u64 map_start, map_end, total = 0;
-
-	if (end <= start)
-		return total;
-
-	map_start = (u64)memmap.phys_map;
-	map_end = PAGE_ALIGN(map_start + (memmap.map_end - memmap.map));
-	map_start &= PAGE_MASK;
-
-	if (start < map_end && end > map_start) {
-		/* region overlaps UEFI memmap */
-		if (start < map_start)
-			total += free_one_region(start, map_start);
-
-		if (map_end < end)
-			total += free_one_region(map_end, end);
-	} else
-		total += free_one_region(start, end);
-
-	return total;
-}
-
-static void __init free_boot_services(void)
-{
-	u64 total_freed = 0;
-	u64 keep_end, free_start, free_end;
-	efi_memory_desc_t *md;
-
-	/*
-	 * If kernel uses larger pages than UEFI, we have to be careful
-	 * not to inadvertantly free memory we want to keep if there is
-	 * overlap at the kernel page size alignment. We do not want to
-	 * free is_reserve_region() memory nor the UEFI memmap itself.
-	 *
-	 * The memory map is sorted, so we keep track of the end of
-	 * any previous region we want to keep, remember any region
-	 * we want to free and defer freeing it until we encounter
-	 * the next region we want to keep. This way, before freeing
-	 * it, we can clip it as needed to avoid freeing memory we
-	 * want to keep for UEFI.
-	 */
-
-	keep_end = 0;
-	free_start = 0;
-
-	for_each_efi_memory_desc(&memmap, md) {
-		u64 paddr, npages, size;
-
-		if (is_reserve_region(md)) {
-			/*
-			 * We don't want to free any memory from this region.
-			 */
-			if (free_start) {
-				/* adjust free_end then free region */
-				if (free_end > md->phys_addr)
-					free_end -= PAGE_SIZE;
-				total_freed += free_region(free_start, free_end);
-				free_start = 0;
-			}
-			keep_end = md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT);
-			continue;
-		}
-
-		if (md->type != EFI_BOOT_SERVICES_CODE &&
-		    md->type != EFI_BOOT_SERVICES_DATA) {
-			/* no need to free this region */
-			continue;
-		}
-
-		/*
-		 * We want to free memory from this region.
-		 */
-		paddr = md->phys_addr;
-		npages = md->num_pages;
-		memrange_efi_to_native(&paddr, &npages);
-		size = npages << PAGE_SHIFT;
-
-		if (free_start) {
-			if (paddr <= free_end)
-				free_end = paddr + size;
-			else {
-				total_freed += free_region(free_start, free_end);
-				free_start = paddr;
-				free_end = paddr + size;
-			}
-		} else {
-			free_start = paddr;
-			free_end = paddr + size;
-		}
-		if (free_start < keep_end) {
-			free_start += PAGE_SIZE;
-			if (free_start >= free_end)
-				free_start = 0;
-		}
-	}
-	if (free_start)
-		total_freed += free_region(free_start, free_end);
-
-	if (total_freed)
-		pr_info("Freed 0x%llx bytes of EFI boot services memory",
-			total_freed);
-}
-
 void __init efi_init(void)
 {
 	struct efi_fdt_params params;
@@ -368,8 +249,6 @@ static int __init arm64_enter_virtual_mode(void)
 	}
 	set_bit(EFI_SYSTEM_TABLES, &efi.flags);
 
-	free_boot_services();
-
 	if (!efi_enabled(EFI_VIRTMAP)) {
 		pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
 		return -1;
-- 
1.8.3.2


* [PATCH v2 06/10] arm64/efi: remove free_boot_services() and friends
@ 2014-11-06 14:13     ` Ard Biesheuvel
  0 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: linux-arm-kernel

Now that we are calling SetVirtualAddressMap() from the stub, there is no
need to reserve boot-only memory regions, which implies that there is also
no reason to free them again later.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi.c | 123 +-----------------------------------------------
 1 file changed, 1 insertion(+), 122 deletions(-)

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index beb5a79d32c3..9f05b4fe9fd5 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -170,9 +170,7 @@ static __init void reserve_regions(void)
 		if (is_normal_ram(md))
 			early_init_dt_add_memory_arch(paddr, size);
 
-		if (is_reserve_region(md) ||
-		    md->type == EFI_BOOT_SERVICES_CODE ||
-		    md->type == EFI_BOOT_SERVICES_DATA) {
+		if (is_reserve_region(md)) {
 			memblock_reserve(paddr, size);
 			if (uefi_debug)
 				pr_cont("*");
@@ -185,123 +183,6 @@ static __init void reserve_regions(void)
 	set_bit(EFI_MEMMAP, &efi.flags);
 }
 
-
-static u64 __init free_one_region(u64 start, u64 end)
-{
-	u64 size = end - start;
-
-	if (uefi_debug)
-		pr_info("  EFI freeing: 0x%012llx-0x%012llx\n",	start, end - 1);
-
-	free_bootmem_late(start, size);
-	return size;
-}
-
-static u64 __init free_region(u64 start, u64 end)
-{
-	u64 map_start, map_end, total = 0;
-
-	if (end <= start)
-		return total;
-
-	map_start = (u64)memmap.phys_map;
-	map_end = PAGE_ALIGN(map_start + (memmap.map_end - memmap.map));
-	map_start &= PAGE_MASK;
-
-	if (start < map_end && end > map_start) {
-		/* region overlaps UEFI memmap */
-		if (start < map_start)
-			total += free_one_region(start, map_start);
-
-		if (map_end < end)
-			total += free_one_region(map_end, end);
-	} else
-		total += free_one_region(start, end);
-
-	return total;
-}
-
-static void __init free_boot_services(void)
-{
-	u64 total_freed = 0;
-	u64 keep_end, free_start, free_end;
-	efi_memory_desc_t *md;
-
-	/*
-	 * If kernel uses larger pages than UEFI, we have to be careful
-	 * not to inadvertantly free memory we want to keep if there is
-	 * overlap at the kernel page size alignment. We do not want to
-	 * free is_reserve_region() memory nor the UEFI memmap itself.
-	 *
-	 * The memory map is sorted, so we keep track of the end of
-	 * any previous region we want to keep, remember any region
-	 * we want to free and defer freeing it until we encounter
-	 * the next region we want to keep. This way, before freeing
-	 * it, we can clip it as needed to avoid freeing memory we
-	 * want to keep for UEFI.
-	 */
-
-	keep_end = 0;
-	free_start = 0;
-
-	for_each_efi_memory_desc(&memmap, md) {
-		u64 paddr, npages, size;
-
-		if (is_reserve_region(md)) {
-			/*
-			 * We don't want to free any memory from this region.
-			 */
-			if (free_start) {
-				/* adjust free_end then free region */
-				if (free_end > md->phys_addr)
-					free_end -= PAGE_SIZE;
-				total_freed += free_region(free_start, free_end);
-				free_start = 0;
-			}
-			keep_end = md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT);
-			continue;
-		}
-
-		if (md->type != EFI_BOOT_SERVICES_CODE &&
-		    md->type != EFI_BOOT_SERVICES_DATA) {
-			/* no need to free this region */
-			continue;
-		}
-
-		/*
-		 * We want to free memory from this region.
-		 */
-		paddr = md->phys_addr;
-		npages = md->num_pages;
-		memrange_efi_to_native(&paddr, &npages);
-		size = npages << PAGE_SHIFT;
-
-		if (free_start) {
-			if (paddr <= free_end)
-				free_end = paddr + size;
-			else {
-				total_freed += free_region(free_start, free_end);
-				free_start = paddr;
-				free_end = paddr + size;
-			}
-		} else {
-			free_start = paddr;
-			free_end = paddr + size;
-		}
-		if (free_start < keep_end) {
-			free_start += PAGE_SIZE;
-			if (free_start >= free_end)
-				free_start = 0;
-		}
-	}
-	if (free_start)
-		total_freed += free_region(free_start, free_end);
-
-	if (total_freed)
-		pr_info("Freed 0x%llx bytes of EFI boot services memory",
-			total_freed);
-}
-
 void __init efi_init(void)
 {
 	struct efi_fdt_params params;
@@ -368,8 +249,6 @@ static int __init arm64_enter_virtual_mode(void)
 	}
 	set_bit(EFI_SYSTEM_TABLES, &efi.flags);
 
-	free_boot_services();
-
 	if (!efi_enabled(EFI_VIRTMAP)) {
 		pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
 		return -1;
-- 
1.8.3.2


* [PATCH v2 07/10] arm64/efi: remove idmap manipulations from UEFI code
  2014-11-06 14:13 ` Ard Biesheuvel
@ 2014-11-06 14:13     ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	mark.rutland-5wv7dgnIgG8, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA, linux-efi-u79uwXL29TY76Z2rM5mHXA,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, will.deacon-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, grant.likely-QSEj5FYQhm4dnm+yROfE0A
  Cc: Ard Biesheuvel

Now that we have moved the call to SetVirtualAddressMap() to the stub,
UEFI has no use for the ID map, so we can drop the code that installs
ID mappings for UEFI memory regions.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/include/asm/efi.h |  2 --
 arch/arm64/include/asm/mmu.h |  2 --
 arch/arm64/kernel/efi.c      | 30 ------------------------------
 arch/arm64/kernel/setup.c    |  1 -
 arch/arm64/mm/mmu.c          | 12 ------------
 5 files changed, 47 deletions(-)

diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index e2cea16f3bd7..5420ad5a4c63 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -6,10 +6,8 @@
 
 #ifdef CONFIG_EFI
 extern void efi_init(void);
-extern void efi_idmap_init(void);
 #else
 #define efi_init()
-#define efi_idmap_init()
 #endif
 
 #define efi_call_virt(f, ...)						\
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 5fd40c43be80..3d311761e3c2 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -31,8 +31,6 @@ extern void paging_init(void);
 extern void setup_mm_for_reboot(void);
 extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
 extern void init_mem_pgprot(void);
-/* create an identity mapping for memory (or io if map_io is true) */
-extern void create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io);
 extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 			       unsigned long virt, phys_addr_t size,
 			       pgprot_t prot);
diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 9f05b4fe9fd5..a48a2f9ab669 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -47,27 +47,6 @@ static int __init is_normal_ram(efi_memory_desc_t *md)
 	return 0;
 }
 
-static void __init efi_setup_idmap(void)
-{
-	struct memblock_region *r;
-	efi_memory_desc_t *md;
-	u64 paddr, npages, size;
-
-	for_each_memblock(memory, r)
-		create_id_mapping(r->base, r->size, 0);
-
-	/* map runtime io spaces */
-	for_each_efi_memory_desc(&memmap, md) {
-		if (!(md->attribute & EFI_MEMORY_RUNTIME) || is_normal_ram(md))
-			continue;
-		paddr = md->phys_addr;
-		npages = md->num_pages;
-		memrange_efi_to_native(&paddr, &npages);
-		size = npages << PAGE_SHIFT;
-		create_id_mapping(paddr, size, 1);
-	}
-}
-
 static int __init uefi_init(void)
 {
 	efi_char16_t *c16;
@@ -207,15 +186,6 @@ void __init efi_init(void)
 	reserve_regions();
 }
 
-void __init efi_idmap_init(void)
-{
-	if (!efi_enabled(EFI_BOOT))
-		return;
-
-	/* boot time idmap_pg_dir is incomplete, so fill in missing parts */
-	efi_setup_idmap();
-}
-
 static int __init arm64_enter_virtual_mode(void)
 {
 	u64 mapsize;
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index ac19b2d6a3fc..a6028490a28f 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -392,7 +392,6 @@ void __init setup_arch(char **cmdline_p)
 	paging_init();
 	request_standard_resources();
 
-	efi_idmap_init();
 	efi_virtmap_init();
 
 	unflatten_device_tree();
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index b6dc2ce3991a..3e8b3732ca87 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -271,18 +271,6 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
 			 size, PAGE_KERNEL_EXEC);
 }
 
-void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
-{
-	if ((addr >> PGDIR_SHIFT) >= ARRAY_SIZE(idmap_pg_dir)) {
-		pr_warn("BUG: not creating id mapping for %pa\n", &addr);
-		return;
-	}
-	__create_mapping(&init_mm, &idmap_pg_dir[pgd_index(addr)],
-			 addr, addr, size,
-			 map_io ? __pgprot(PROT_DEVICE_nGnRE)
-				: PAGE_KERNEL_EXEC);
-}
-
 void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 			       unsigned long virt, phys_addr_t size,
 			       pgprot_t prot)
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 08/10] arm64/efi: use UEFI memory map unconditionally if available
  2014-11-06 14:13 ` Ard Biesheuvel
@ 2014-11-06 14:13     ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	mark.rutland-5wv7dgnIgG8, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA, linux-efi-u79uwXL29TY76Z2rM5mHXA,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, will.deacon-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, grant.likely-QSEj5FYQhm4dnm+yROfE0A
  Cc: Ard Biesheuvel

On systems that boot via UEFI, all memory nodes are deleted from the
device tree, and instead, the size and location of system RAM is derived
from the UEFI memory map. This is handled by reserve_regions(), which not
only reserves the parts of memory that UEFI declares as reserved, but also
installs the memblocks that cover the remaining usable memory.

Currently, reserve_regions() is only called if uefi_init() succeeds.
However, it does not actually depend on anything that uefi_init() does,
and not calling reserve_regions() results in a broken boot, so it is
better to just call it unconditionally.
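As a rough illustration of what reserve_regions() derives from the UEFI
memory map, here is a standalone userspace sketch; the struct and helper
below are simplified, hypothetical stand-ins for the kernel's
efi_memory_desc_t and memblock interfaces, not the real ones:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define EFI_PAGE_SHIFT	12
#define EFI_MEMORY_WB	0x8ULL	/* write-back cacheable => normal RAM */

/* Simplified stand-in for efi_memory_desc_t. */
struct mock_efi_md {
	uint64_t phys_addr;
	uint64_t num_pages;	/* in 4 KB EFI pages */
	uint64_t attribute;
};

/* Tally the RAM that would be handed to memblock_add(). */
static uint64_t usable_ram_bytes(const struct mock_efi_md *map, size_t n)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (map[i].attribute & EFI_MEMORY_WB)
			total += map[i].num_pages << EFI_PAGE_SHIFT;
	return total;
}
```

Since nothing here depends on uefi_init() having parsed the system table,
the same walk can run whether or not uefi_init() succeeded, which is the
point of calling it unconditionally.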

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/kernel/efi.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index a48a2f9ab669..3009c22e2620 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -180,8 +180,7 @@ void __init efi_init(void)
 	memmap.desc_size = params.desc_size;
 	memmap.desc_version = params.desc_ver;
 
-	if (uefi_init() < 0)
-		return;
+	WARN_ON(uefi_init() < 0);
 
 	reserve_regions();
 }
@@ -190,15 +189,13 @@ static int __init arm64_enter_virtual_mode(void)
 {
 	u64 mapsize;
 
-	if (!efi_enabled(EFI_BOOT)) {
-		pr_info("EFI services will not be available.\n");
-		return -1;
-	}
+	if (!efi_enabled(EFI_MEMMAP))
+		return 0;
 
 	mapsize = memmap.map_end - memmap.map;
 	early_memunmap(memmap.map, mapsize);
 
-	if (efi_runtime_disabled()) {
+	if (!efi_enabled(EFI_BOOT) || efi_runtime_disabled()) {
 		pr_info("EFI runtime services will be disabled.\n");
 		return -1;
 	}
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 09/10] arm64/efi: ignore unusable regions instead of reserving them
  2014-11-06 14:13 ` Ard Biesheuvel
@ 2014-11-06 14:13     ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	mark.rutland-5wv7dgnIgG8, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA, linux-efi-u79uwXL29TY76Z2rM5mHXA,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, will.deacon-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, grant.likely-QSEj5FYQhm4dnm+yROfE0A
  Cc: Ard Biesheuvel

This changes the way memblocks are installed based on the contents of
the UEFI memory map. Formerly, all regions would be memblock_add()'ed,
after which unusable regions would be memblock_reserve()'d as well.
To simplify things, but also to allow access to the unusable regions
through mmap(/dev/mem), even with CONFIG_STRICT_DEVMEM set, change
this so that only usable regions are memblock_add()'ed in the first
place.
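The usable/unusable split this patch introduces can be sketched as a
predicate over a memory descriptor. The type codes below are the standard
UEFI specification values; the function itself is an illustrative
reimplementation of the efi_mem_is_usable_region() helper the patch calls,
not the kernel's actual code:

```c
#include <assert.h>
#include <stdint.h>

/* UEFI memory type codes from the UEFI specification. */
enum {
	EFI_LOADER_CODE		= 1,
	EFI_LOADER_DATA		= 2,
	EFI_BOOT_SERVICES_CODE	= 3,
	EFI_BOOT_SERVICES_DATA	= 4,
	EFI_CONVENTIONAL_MEMORY	= 7,
};
#define EFI_MEMORY_WB 0x8ULL	/* write-back cacheable */

/* Simplified stand-in for efi_memory_desc_t. */
struct mock_efi_md {
	uint32_t type;
	uint64_t attribute;
};

/*
 * A region is handed to memblock_add() only if it is normal (write-back)
 * RAM *and* of a type the OS may freely claim after ExitBootServices().
 * Everything else is simply not mapped, rather than mapped and reserved.
 */
static int is_usable_region(const struct mock_efi_md *md)
{
	if (!(md->attribute & EFI_MEMORY_WB))
		return 0;
	switch (md->type) {
	case EFI_LOADER_CODE:
	case EFI_LOADER_DATA:
	case EFI_BOOT_SERVICES_CODE:
	case EFI_BOOT_SERVICES_DATA:
	case EFI_CONVENTIONAL_MEMORY:
		return 1;
	default:
		return 0;	/* reserved/runtime/MMIO: leave out of memblock */
	}
}
```

Because unusable regions never reach memblock at all, they are no longer
covered by the kernel's linear mapping, which is what makes them accessible
through mmap(/dev/mem) even under CONFIG_STRICT_DEVMEM.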

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/kernel/efi.c | 69 +++++++++++++++++++------------------------------
 1 file changed, 26 insertions(+), 43 deletions(-)

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 3009c22e2620..af2214c692d3 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -40,13 +40,6 @@ static int __init uefi_debug_setup(char *str)
 }
 early_param("uefi_debug", uefi_debug_setup);
 
-static int __init is_normal_ram(efi_memory_desc_t *md)
-{
-	if (md->attribute & EFI_MEMORY_WB)
-		return 1;
-	return 0;
-}
-
 static int __init uefi_init(void)
 {
 	efi_char16_t *c16;
@@ -105,28 +98,11 @@ out:
 	return retval;
 }
 
-/*
- * Return true for RAM regions we want to permanently reserve.
- */
-static __init int is_reserve_region(efi_memory_desc_t *md)
-{
-	switch (md->type) {
-	case EFI_LOADER_CODE:
-	case EFI_LOADER_DATA:
-	case EFI_BOOT_SERVICES_CODE:
-	case EFI_BOOT_SERVICES_DATA:
-	case EFI_CONVENTIONAL_MEMORY:
-		return 0;
-	default:
-		break;
-	}
-	return is_normal_ram(md);
-}
-
-static __init void reserve_regions(void)
+static __init void process_memory_map(void)
 {
 	efi_memory_desc_t *md;
 	u64 paddr, npages, size;
+	u32 lost = 0;
 
 	if (uefi_debug)
 		pr_info("Processing EFI memory map:\n");
@@ -134,31 +110,38 @@ static __init void reserve_regions(void)
 	for_each_efi_memory_desc(&memmap, md) {
 		paddr = md->phys_addr;
 		npages = md->num_pages;
+		size = npages << EFI_PAGE_SHIFT;
 
 		if (uefi_debug) {
 			char buf[64];
 
-			pr_info("  0x%012llx-0x%012llx %s",
-				paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1,
+			pr_info("  0x%012llx-0x%012llx %s\n",
+				paddr, paddr + size - 1,
 				efi_md_typeattr_format(buf, sizeof(buf), md));
 		}
 
-		memrange_efi_to_native(&paddr, &npages);
-		size = npages << PAGE_SHIFT;
-
-		if (is_normal_ram(md))
-			early_init_dt_add_memory_arch(paddr, size);
-
-		if (is_reserve_region(md)) {
-			memblock_reserve(paddr, size);
-			if (uefi_debug)
-				pr_cont("*");
-		}
-
-		if (uefi_debug)
-			pr_cont("\n");
+		if (!efi_mem_is_usable_region(md))
+			continue;
+
+		early_init_dt_add_memory_arch(paddr, size);
+
+		/*
+		 * Keep a tally of how much memory we are losing due to
+		 * rounding of regions that are not aligned to the page
+		 * size. We cannot easily recover this memory without
+		 * sorting the memory map and attempting to merge adjacent
+		 * usable regions.
+		 */
+		if (PAGE_SHIFT != EFI_PAGE_SHIFT)
+			lost += (npages << EFI_PAGE_SHIFT) -
+				round_down(max_t(s64, size - PAGE_ALIGN(paddr) +
+						 md->phys_addr, 0),
+					   PAGE_SIZE);
 	}
 
+	if (lost > SZ_1K)
+		pr_warn("efi: lost %u KB of RAM to rounding\n", lost / SZ_1K);
+
 	set_bit(EFI_MEMMAP, &efi.flags);
 }
 
@@ -182,7 +165,7 @@ void __init efi_init(void)
 
 	WARN_ON(uefi_init() < 0);
 
-	reserve_regions();
+	process_memory_map();
 }
 
 static int __init arm64_enter_virtual_mode(void)
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 10/10] arm64/efi: improve /dev/mem mmap() handling under UEFI
  2014-11-06 14:13 ` Ard Biesheuvel
@ 2014-11-06 14:13     ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-06 14:13 UTC (permalink / raw)
  To: leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	mark.rutland-5wv7dgnIgG8, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA, linux-efi-u79uwXL29TY76Z2rM5mHXA,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, will.deacon-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, grant.likely-QSEj5FYQhm4dnm+yROfE0A
  Cc: Ard Biesheuvel

When booting via UEFI, we need to ensure that mappings of arbitrary
physical ranges through /dev/mem do not violate architectural rules
regarding conflicting cacheability attributes of overlapping regions.
Also, when CONFIG_STRICT_DEVMEM is in effect, memory regions with
special significance to UEFI should be protected sufficiently.
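The decision order the patch installs in phys_mem_access_prot() can be
modelled as follows; the enum, struct, and lookup are hypothetical toys,
since in the kernel the first step is efi_mem_access_prot() consulting the
real UEFI memory map:

```c
#include <assert.h>
#include <stdint.h>

enum prot { PROT_CACHED, PROT_UNCACHED };

/* Toy stand-in for a UEFI memory map entry with a derived attribute. */
struct region {
	uint64_t first_pfn, last_pfn;
	enum prot prot;
};

/* Pick mapping attributes for a /dev/mem mapping of @pfn. */
static enum prot access_prot(const struct region *efi_map, int nr,
			     uint64_t pfn, int pfn_is_ram)
{
	int i;

	/* 1. If UEFI describes the page, its memory type wins, so the
	 *    userland mapping cannot conflict with the kernel's own. */
	for (i = 0; i < nr; i++)
		if (pfn >= efi_map[i].first_pfn && pfn <= efi_map[i].last_pfn)
			return efi_map[i].prot;

	/* 2. Otherwise fall back to the pfn_valid() heuristic. */
	return pfn_is_ram ? PROT_CACHED : PROT_UNCACHED;
}
```

The separate phys_mem_access_prot_allowed() hook then rejects the mapping
outright (rather than adjusting its attributes) when the range covers a
region with special significance to UEFI.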

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/mm/mmu.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 3e8b3732ca87..9ae3904f386c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -19,6 +19,7 @@
 
 #include <linux/export.h>
 #include <linux/kernel.h>
+#include <linux/efi.h>
 #include <linux/errno.h>
 #include <linux/init.h>
 #include <linux/mman.h>
@@ -28,6 +29,7 @@
 #include <linux/io.h>
 
 #include <asm/cputype.h>
+#include <asm/efi.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
 #include <asm/sizes.h>
@@ -121,6 +123,8 @@ early_param("cachepolicy", early_cachepolicy);
 pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 			      unsigned long size, pgprot_t vma_prot)
 {
+	if (efi_enabled(EFI_MEMMAP) && efi_mem_access_prot(pfn, &vma_prot))
+		return vma_prot;
 	if (!pfn_valid(pfn))
 		return pgprot_noncached(vma_prot);
 	else if (file->f_flags & O_SYNC)
@@ -129,6 +133,19 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 }
 EXPORT_SYMBOL(phys_mem_access_prot);
 
+/*
+ * This definition of phys_mem_access_prot_allowed() overrides
+ * the __weak definition in drivers/char/mem.c
+ */
+int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
+				 unsigned long size, pgprot_t *prot)
+{
+	if (efi_enabled(EFI_MEMMAP))
+		return efi_mem_access_allowed(pfn, size,
+					      pgprot_val(*prot) & PTE_WRITE);
+	return 1;
+}
+
 static void __init *early_alloc(unsigned long sz)
 {
 	void *ptr = __va(memblock_alloc(sz, sz));
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 02/10] arm64/mm: add create_pgd_mapping() to create private page tables
  2014-11-06 14:13     ` Ard Biesheuvel
@ 2014-11-07 15:08       ` Steve Capper
  -1 siblings, 0 replies; 44+ messages in thread
From: Steve Capper @ 2014-11-07 15:08 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: mark.rutland, linux-efi, catalin.marinas, will.deacon,
	leif.lindholm, roy.franz, matt.fleming, msalter, grant.likely,
	dyoung, linux-arm-kernel

On Thu, Nov 06, 2014 at 03:13:18PM +0100, Ard Biesheuvel wrote:

Hi Ard,
Some comments below:

> For UEFI, we need to install the memory mappings used for Runtime Services
> in a dedicated set of page tables. Add create_pgd_mapping(), which allows
> us to allocate and install those page table entries early.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/include/asm/mmu.h     |  3 +++
>  arch/arm64/include/asm/pgtable.h |  5 +++++
>  arch/arm64/mm/mmu.c              | 44 +++++++++++++++++++++-------------------
>  3 files changed, 31 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> index c2f006c48bdb..5fd40c43be80 100644
> --- a/arch/arm64/include/asm/mmu.h
> +++ b/arch/arm64/include/asm/mmu.h
> @@ -33,5 +33,8 @@ extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
>  extern void init_mem_pgprot(void);
>  /* create an identity mapping for memory (or io if map_io is true) */
>  extern void create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io);
> +extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
> +			       unsigned long virt, phys_addr_t size,
> +			       pgprot_t prot);
>  
>  #endif
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 41a43bf26492..1abe4d08725b 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -264,6 +264,11 @@ static inline pmd_t pte_pmd(pte_t pte)
>  	return __pmd(pte_val(pte));
>  }
>  
> +static inline pgprot_t mk_sect_prot(pgprot_t prot)
> +{
> +	return __pgprot(pgprot_val(prot) & ~PTE_TABLE_BIT);
> +}
> +
>  /*
>   * THP definitions.
>   */
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 83e6713143a3..b6dc2ce3991a 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -157,20 +157,10 @@ static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
>  
>  static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
>  				  unsigned long addr, unsigned long end,
> -				  phys_addr_t phys, int map_io)
> +				  phys_addr_t phys, pgprot_t prot)
>  {
>  	pmd_t *pmd;
>  	unsigned long next;
> -	pmdval_t prot_sect;
> -	pgprot_t prot_pte;
> -
> -	if (map_io) {
> -		prot_sect = PROT_SECT_DEVICE_nGnRE;
> -		prot_pte = __pgprot(PROT_DEVICE_nGnRE);
> -	} else {
> -		prot_sect = PROT_SECT_NORMAL_EXEC;
> -		prot_pte = PAGE_KERNEL_EXEC;
> -	}

Thanks :-)

>  
>  	/*
>  	 * Check for initial section mappings in the pgd/pud and remove them.
> @@ -186,7 +176,8 @@ static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
>  		/* try section mapping first */
>  		if (((addr | next | phys) & ~SECTION_MASK) == 0) {
>  			pmd_t old_pmd =*pmd;
> -			set_pmd(pmd, __pmd(phys | prot_sect));
> +			set_pmd(pmd, __pmd(phys |
> +					   pgprot_val(mk_sect_prot(prot))));
>  			/*
>  			 * Check for previous table entries created during
>  			 * boot (__create_page_tables) and flush them.
> @@ -195,7 +186,7 @@ static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
>  				flush_tlb_all();
>  		} else {
>  			alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
> -				       prot_pte);
> +				       prot);
>  		}
>  		phys += next - addr;
>  	} while (pmd++, addr = next, addr != end);
> @@ -203,7 +194,7 @@ static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
>  
>  static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
>  				  unsigned long addr, unsigned long end,
> -				  unsigned long phys, int map_io)
> +				  unsigned long phys, pgprot_t prot)
>  {
>  	pud_t *pud;
>  	unsigned long next;
> @@ -221,10 +212,11 @@ static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
>  		/*
>  		 * For 4K granule only, attempt to put down a 1GB block
>  		 */
> -		if (!map_io && (PAGE_SHIFT == 12) &&
> +		if ((PAGE_SHIFT == 12) &&
>  		    ((addr | next | phys) & ~PUD_MASK) == 0) {
>  			pud_t old_pud = *pud;
> -			set_pud(pud, __pud(phys | PROT_SECT_NORMAL_EXEC));
> +			set_pud(pud, __pud(phys |
> +					   pgprot_val(mk_sect_prot(prot))));
>  
>  			/*
>  			 * If we have an old value for a pud, it will
> @@ -239,7 +231,7 @@ static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
>  				flush_tlb_all();
>  			}
>  		} else {
> -			alloc_init_pmd(mm, pud, addr, next, phys, map_io);
> +			alloc_init_pmd(mm, pud, addr, next, phys, prot);
>  		}
>  		phys += next - addr;
>  	} while (pud++, addr = next, addr != end);
> @@ -251,7 +243,7 @@ static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
>   */
>  static void __init __create_mapping(struct mm_struct *mm, pgd_t *pgd,
>  				    phys_addr_t phys, unsigned long virt,
> -				    phys_addr_t size, int map_io)
> +				    phys_addr_t size, pgprot_t prot)
>  {
>  	unsigned long addr, length, end, next;
>  
> @@ -261,7 +253,8 @@ static void __init __create_mapping(struct mm_struct *mm, pgd_t *pgd,
>  	end = addr + length;
>  	do {
>  		next = pgd_addr_end(addr, end);
> -		alloc_init_pud(mm, pgd, addr, next, phys, map_io);
> +		alloc_init_pud(mm, pgd, addr, next, phys, prot);
> +			       
>  		phys += next - addr;
>  	} while (pgd++, addr = next, addr != end);
>  }
> @@ -275,7 +268,7 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
>  		return;
>  	}
>  	__create_mapping(&init_mm, pgd_offset_k(virt & PAGE_MASK), phys, virt,
> -			 size, 0);
> +			 size, PAGE_KERNEL_EXEC);
>  }
>  
>  void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
> @@ -285,7 +278,16 @@ void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
>  		return;
>  	}
>  	__create_mapping(&init_mm, &idmap_pg_dir[pgd_index(addr)],
> -			 addr, addr, size, map_io);
> +			 addr, addr, size,
> +			 map_io ? __pgprot(PROT_DEVICE_nGnRE)
> +				: PAGE_KERNEL_EXEC);
> +}

Could you please also change efi_setup_idmap (it's the only caller I
can see for create_id_mapping)?

That way the prototype for create_id_mapping would look like:
void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, pgprot_t prot)

> +
> +void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
> +			       unsigned long virt, phys_addr_t size,
> +			       pgprot_t prot)
> +{
> +	__create_mapping(mm, pgd_offset(mm, virt), phys, virt, size, prot);
>  }
>  
>  static void __init map_mem(void)
> -- 
> 1.8.3.2
> 

Cheers,
-- 
Steve

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 02/10] arm64/mm: add create_pgd_mapping() to create private page tables
  2014-11-07 15:08       ` Steve Capper
@ 2014-11-07 15:12           ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-07 15:12 UTC (permalink / raw)
  To: Steve Capper
  Cc: Leif Lindholm, Roy Franz,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Mark Rutland,
	Mark Salter, Dave Young, linux-efi-u79uwXL29TY76Z2rM5mHXA,
	Matt Fleming, Will Deacon, Catalin Marinas, Grant Likely

On 7 November 2014 16:08, Steve Capper <steve.capper-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
> On Thu, Nov 06, 2014 at 03:13:18PM +0100, Ard Biesheuvel wrote:
>
> Hi Ard,
> Some comments below:
>
>> For UEFI, we need to install the memory mappings used for Runtime Services
>> in a dedicated set of page tables. Add create_pgd_mapping(), which allows
>> us to allocate and install those page table entries early.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>> ---
>>  arch/arm64/include/asm/mmu.h     |  3 +++
>>  arch/arm64/include/asm/pgtable.h |  5 +++++
>>  arch/arm64/mm/mmu.c              | 44 +++++++++++++++++++++-------------------
>>  3 files changed, 31 insertions(+), 21 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
>> index c2f006c48bdb..5fd40c43be80 100644
>> --- a/arch/arm64/include/asm/mmu.h
>> +++ b/arch/arm64/include/asm/mmu.h
>> @@ -33,5 +33,8 @@ extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
>>  extern void init_mem_pgprot(void);
>>  /* create an identity mapping for memory (or io if map_io is true) */
>>  extern void create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io);
>> +extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
>> +                            unsigned long virt, phys_addr_t size,
>> +                            pgprot_t prot);
>>
>>  #endif
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index 41a43bf26492..1abe4d08725b 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -264,6 +264,11 @@ static inline pmd_t pte_pmd(pte_t pte)
>>       return __pmd(pte_val(pte));
>>  }
>>
>> +static inline pgprot_t mk_sect_prot(pgprot_t prot)
>> +{
>> +     return __pgprot(pgprot_val(prot) & ~PTE_TABLE_BIT);
>> +}
>> +
>>  /*
>>   * THP definitions.
>>   */
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 83e6713143a3..b6dc2ce3991a 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -157,20 +157,10 @@ static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
>>
>>  static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
>>                                 unsigned long addr, unsigned long end,
>> -                               phys_addr_t phys, int map_io)
>> +                               phys_addr_t phys, pgprot_t prot)
>>  {
>>       pmd_t *pmd;
>>       unsigned long next;
>> -     pmdval_t prot_sect;
>> -     pgprot_t prot_pte;
>> -
>> -     if (map_io) {
>> -             prot_sect = PROT_SECT_DEVICE_nGnRE;
>> -             prot_pte = __pgprot(PROT_DEVICE_nGnRE);
>> -     } else {
>> -             prot_sect = PROT_SECT_NORMAL_EXEC;
>> -             prot_pte = PAGE_KERNEL_EXEC;
>> -     }
>
> Thanks :-)
>
>>
>>       /*
>>        * Check for initial section mappings in the pgd/pud and remove them.
>> @@ -186,7 +176,8 @@ static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
>>               /* try section mapping first */
>>               if (((addr | next | phys) & ~SECTION_MASK) == 0) {
>>                       pmd_t old_pmd =*pmd;
>> -                     set_pmd(pmd, __pmd(phys | prot_sect));
>> +                     set_pmd(pmd, __pmd(phys |
>> +                                        pgprot_val(mk_sect_prot(prot))));
>>                       /*
>>                        * Check for previous table entries created during
>>                        * boot (__create_page_tables) and flush them.
>> @@ -195,7 +186,7 @@ static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
>>                               flush_tlb_all();
>>               } else {
>>                       alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
>> -                                    prot_pte);
>> +                                    prot);
>>               }
>>               phys += next - addr;
>>       } while (pmd++, addr = next, addr != end);
>> @@ -203,7 +194,7 @@ static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
>>
>>  static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
>>                                 unsigned long addr, unsigned long end,
>> -                               unsigned long phys, int map_io)
>> +                               unsigned long phys, pgprot_t prot)
>>  {
>>       pud_t *pud;
>>       unsigned long next;
>> @@ -221,10 +212,11 @@ static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
>>               /*
>>                * For 4K granule only, attempt to put down a 1GB block
>>                */
>> -             if (!map_io && (PAGE_SHIFT == 12) &&
>> +             if ((PAGE_SHIFT == 12) &&
>>                   ((addr | next | phys) & ~PUD_MASK) == 0) {
>>                       pud_t old_pud = *pud;
>> -                     set_pud(pud, __pud(phys | PROT_SECT_NORMAL_EXEC));
>> +                     set_pud(pud, __pud(phys |
>> +                                        pgprot_val(mk_sect_prot(prot))));
>>
>>                       /*
>>                        * If we have an old value for a pud, it will
>> @@ -239,7 +231,7 @@ static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
>>                               flush_tlb_all();
>>                       }
>>               } else {
>> -                     alloc_init_pmd(mm, pud, addr, next, phys, map_io);
>> +                     alloc_init_pmd(mm, pud, addr, next, phys, prot);
>>               }
>>               phys += next - addr;
>>       } while (pud++, addr = next, addr != end);
>> @@ -251,7 +243,7 @@ static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
>>   */
>>  static void __init __create_mapping(struct mm_struct *mm, pgd_t *pgd,
>>                                   phys_addr_t phys, unsigned long virt,
>> -                                 phys_addr_t size, int map_io)
>> +                                 phys_addr_t size, pgprot_t prot)
>>  {
>>       unsigned long addr, length, end, next;
>>
>> @@ -261,7 +253,8 @@ static void __init __create_mapping(struct mm_struct *mm, pgd_t *pgd,
>>       end = addr + length;
>>       do {
>>               next = pgd_addr_end(addr, end);
>> -             alloc_init_pud(mm, pgd, addr, next, phys, map_io);
>> +             alloc_init_pud(mm, pgd, addr, next, phys, prot);
>> +
>>               phys += next - addr;
>>       } while (pgd++, addr = next, addr != end);
>>  }
>> @@ -275,7 +268,7 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
>>               return;
>>       }
>>       __create_mapping(&init_mm, pgd_offset_k(virt & PAGE_MASK), phys, virt,
>> -                      size, 0);
>> +                      size, PAGE_KERNEL_EXEC);
>>  }
>>
>>  void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
>> @@ -285,7 +278,16 @@ void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
>>               return;
>>       }
>>       __create_mapping(&init_mm, &idmap_pg_dir[pgd_index(addr)],
>> -                      addr, addr, size, map_io);
>> +                      addr, addr, size,
>> +                      map_io ? __pgprot(PROT_DEVICE_nGnRE)
>> +                             : PAGE_KERNEL_EXEC);
>> +}
>
> Could you please also change efi_setup_idmap (it's the only caller I
> can see for create_id_mapping)?
>
> That way the prototype for create_id_mapping would look like:
> void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, pgprot_t prot)
>

I didn't bother because a couple of patches later, this stuff is all
ripped out anyway
(7/10 arm64/efi: remove idmap manipulations from UEFI code), because
there is no longer a need for UEFI to switch to the ID map.

Do you feel I should still change it here, and then remove it later?

Thanks,
Ard.


>> +
>> +void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
>> +                            unsigned long virt, phys_addr_t size,
>> +                            pgprot_t prot)
>> +{
>> +     __create_mapping(mm, pgd_offset(mm, virt), phys, virt, size, prot);
>>  }
>>
>>  static void __init map_mem(void)
>> --
>> 1.8.3.2
>>
>
> Cheers,
> --
> Steve

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 02/10] arm64/mm: add create_pgd_mapping() to create private page tables
  2014-11-07 15:12           ` Ard Biesheuvel
@ 2014-11-07 15:21               ` Steve Capper
  -1 siblings, 0 replies; 44+ messages in thread
From: Steve Capper @ 2014-11-07 15:21 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Leif Lindholm, Roy Franz,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Mark Rutland,
	Mark Salter, Dave Young, linux-efi-u79uwXL29TY76Z2rM5mHXA,
	Matt Fleming, Will Deacon, Catalin Marinas, Grant Likely

On Fri, Nov 07, 2014 at 04:12:34PM +0100, Ard Biesheuvel wrote:
> On 7 November 2014 16:08, Steve Capper <steve.capper-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
> > On Thu, Nov 06, 2014 at 03:13:18PM +0100, Ard Biesheuvel wrote:

[...]

> >>  void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
> >> @@ -285,7 +278,16 @@ void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
> >>               return;
> >>       }
> >>       __create_mapping(&init_mm, &idmap_pg_dir[pgd_index(addr)],
> >> -                      addr, addr, size, map_io);
> >> +                      addr, addr, size,
> >> +                      map_io ? __pgprot(PROT_DEVICE_nGnRE)
> >> +                             : PAGE_KERNEL_EXEC);
> >> +}
> >
> > Could you please also change efi_setup_idmap (it's the only caller I
> > can see for create_id_mapping)?
> >
> > That way the prototype for create_id_mapping would look like:
> > void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, pgprot_t prot)
> >
> 
> I didn't bother because a couple of patches later, this stuff is all
> ripped out anyway
> (7/10 arm64/efi: remove idmap manipulations from UEFI code), because
> there is no longer a need for UEFI to switch to the ID map.
> 
> Do you feel I should still change it here, and then remove it later?

Ahh, I looked at this patch in isolation.
Yeah, this looks fine to me as is then Ard.

Cheers,
-- 
Steve

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 09/10] arm64/efi: ignore unusable regions instead of reserving them
  2014-11-06 14:13     ` Ard Biesheuvel
@ 2014-11-10  4:11         ` Mark Salter
  -1 siblings, 0 replies; 44+ messages in thread
From: Mark Salter @ 2014-11-10  4:11 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	mark.rutland-5wv7dgnIgG8, dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, will.deacon-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, grant.likely-QSEj5FYQhm4dnm+yROfE0A

On Thu, 2014-11-06 at 15:13 +0100, Ard Biesheuvel wrote:
> This changes the way memblocks are installed based on the contents of
> the UEFI memory map. Formerly, all regions would be memblock_add()'ed,
> after which unusable regions would be memblock_reserve()'d as well.
> To simplify things, but also to allow access to the unusable regions
> through mmap(/dev/mem), even with CONFIG_STRICT_DEVMEM set, change
> this so that only usable regions are memblock_add()'ed in the first
> place.

This patch is crashing 64K pagesize kernels during boot. I'm not exactly
sure why, though. Here is what I get on an APM Mustang box:

[    0.046262] Unhandled fault: alignment fault (0x96000021) at 0xfffffc000004f026
[    0.053863] Internal error: : 96000021 [#1] SMP
[    0.058566] Modules linked in:
[    0.061736] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc3+ #11
[    0.068418] Hardware name: APM X-Gene Mustang board (DT)
[    0.073942] task: fffffe0000f46a20 ti: fffffe0000f10000 task.ti: fffffe0000f10000
[    0.081735] PC is at acpi_ns_lookup+0x520/0x734
[    0.086448] LR is at acpi_ns_lookup+0x4ac/0x734
[    0.091151] pc : [<fffffe0000460458>] lr : [<fffffe00004603e4>] pstate: 60000245
[    0.098843] sp : fffffe0000f13b00
[    0.102293] x29: fffffe0000f13b10 x28: 000000000000000a 
[    0.107810] x27: 0000000000000008 x26: 0000000000000000 
[    0.113345] x25: fffffe0000f13c38 x24: fffffe0000fae000 
[    0.118862] x23: 0000000000000008 x22: 0000000000000001 
[    0.124406] x21: fffffe0000972000 x20: 000000000000000a 
[    0.129940] x19: fffffc000004f026 x18: 000000000000000b 
[    0.135483] x17: 0000000000000023 x16: 000000000000059f 
[    0.141026] x15: 0000000000007fff x14: 000000000000038e 
[    0.146543] x13: ff00000000000000 x12: ffffffffffffffff 
[    0.152060] x11: 0000000000000010 x10: 00000000fffffff6 
[    0.157585] x9 : 0000000000000050 x8 : fffffe03fa7401b0 
[    0.163111] x7 : fffffe0001115a00 x6 : fffffe0001115a00 
[    0.168620] x5 : fffffe0001115000 x4 : 0000000000000010 
[    0.174146] x3 : 0000000000000010 x2 : 000000000000000b 
[    0.179689] x1 : 00000000fffffff5 x0 : 0000000000000000 
[    0.185215] 
[    0.186765] Process swapper/0 (pid: 0, stack limit = 0xfffffe0000f10058)
[    0.193724] Stack: (0xfffffe0000f13b00 to 0xfffffe0000f14000)
[    0.199707] 3b00: 00000000 fffffe03 00000666 00000000 00f13bd0 fffffe00 0044f388 fffffe00
...
[    0.539605] Call trace:
[    0.542150] [<fffffe0000460458>] acpi_ns_lookup+0x520/0x734
[    0.547935] [<fffffe000044f384>] acpi_ds_load1_begin_op+0x384/0x4b4
[    0.554445] [<fffffe0000469594>] acpi_ps_build_named_op+0xfc/0x228
[    0.560868] [<fffffe00004698c8>] acpi_ps_create_op+0x208/0x340
[    0.566945] [<fffffe0000468ea8>] acpi_ps_parse_loop+0x208/0x7f8
[    0.573092] [<fffffe000046a6b8>] acpi_ps_parse_aml+0x1c0/0x434
[    0.579153] [<fffffe0000463cb0>] acpi_ns_one_complete_parse+0x1a4/0x1ec
[    0.586017] [<fffffe0000463d88>] acpi_ns_parse_table+0x90/0x130
[    0.592181] [<fffffe0000462f94>] acpi_ns_load_table+0xc8/0x1b0
[    0.598243] [<fffffe0000e71848>] acpi_load_tables+0xf4/0x234
[    0.604122] [<fffffe0000e70a0c>] acpi_early_init+0x78/0xc8
[    0.609828] [<fffffe0000e40928>] start_kernel+0x334/0x3ac
[    0.615431] Code: 2a00037b b9009fbc 36180057 321d037b (b9400260) 
[    0.621777] ---[ end trace cb88537fdc8fa200 ]---
[    0.626586] Kernel panic - not syncing: Attempted to kill the idle task!

I also get a different crash when booting with device tree instead of ACPI.

One thing to note is that early_init_dt_add_memory_arch() is called with
UEFI (4K) pagesize alignment and size, and it clips the passed-in range
to the 64K-aligned range that fits completely within it. This may mean
nothing gets added at all if the given range is smaller than 64K, and
only a subset of the desired range is added in other cases. This could
theoretically leave bits of devicetree and/or initrd unmapped, but that
doesn't seem to be the case here. I'll keep poking at it...

> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> ---
>  arch/arm64/kernel/efi.c | 69 +++++++++++++++++++------------------------------
>  1 file changed, 26 insertions(+), 43 deletions(-)
> 
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index 3009c22e2620..af2214c692d3 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -40,13 +40,6 @@ static int __init uefi_debug_setup(char *str)
>  }
>  early_param("uefi_debug", uefi_debug_setup);
>  
> -static int __init is_normal_ram(efi_memory_desc_t *md)
> -{
> -	if (md->attribute & EFI_MEMORY_WB)
> -		return 1;
> -	return 0;
> -}
> -
>  static int __init uefi_init(void)
>  {
>  	efi_char16_t *c16;
> @@ -105,28 +98,11 @@ out:
>  	return retval;
>  }
>  
> -/*
> - * Return true for RAM regions we want to permanently reserve.
> - */
> -static __init int is_reserve_region(efi_memory_desc_t *md)
> -{
> -	switch (md->type) {
> -	case EFI_LOADER_CODE:
> -	case EFI_LOADER_DATA:
> -	case EFI_BOOT_SERVICES_CODE:
> -	case EFI_BOOT_SERVICES_DATA:
> -	case EFI_CONVENTIONAL_MEMORY:
> -		return 0;
> -	default:
> -		break;
> -	}
> -	return is_normal_ram(md);
> -}
> -
> -static __init void reserve_regions(void)
> +static __init void process_memory_map(void)
>  {
>  	efi_memory_desc_t *md;
>  	u64 paddr, npages, size;
> +	u32 lost = 0;
>  
>  	if (uefi_debug)
>  		pr_info("Processing EFI memory map:\n");
> @@ -134,31 +110,38 @@ static __init void reserve_regions(void)
>  	for_each_efi_memory_desc(&memmap, md) {
>  		paddr = md->phys_addr;
>  		npages = md->num_pages;
> +		size = npages << EFI_PAGE_SHIFT;
>  
>  		if (uefi_debug) {
>  			char buf[64];
>  
> -			pr_info("  0x%012llx-0x%012llx %s",
> -				paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1,
> +			pr_info("  0x%012llx-0x%012llx %s\n",
> +				paddr, paddr + size - 1,
>  				efi_md_typeattr_format(buf, sizeof(buf), md));
>  		}
>  
> -		memrange_efi_to_native(&paddr, &npages);
> -		size = npages << PAGE_SHIFT;
> -
> -		if (is_normal_ram(md))
> -			early_init_dt_add_memory_arch(paddr, size);
> -
> -		if (is_reserve_region(md)) {
> -			memblock_reserve(paddr, size);
> -			if (uefi_debug)
> -				pr_cont("*");
> -		}
> -
> -		if (uefi_debug)
> -			pr_cont("\n");
> +		if (!efi_mem_is_usable_region(md))
> +			continue;
> +
> +		early_init_dt_add_memory_arch(paddr, size);
> +
> +		/*
> +		 * Keep a tally of how much memory we are losing due to
> +		 * rounding of regions that are not aligned to the page
> +		 * size. We cannot easily recover this memory without
> +		 * sorting the memory map and attempting to merge adjacent
> +		 * usable regions.
> +		 */
> +		if (PAGE_SHIFT != EFI_PAGE_SHIFT)
> +			lost += (npages << EFI_PAGE_SHIFT) -
> +				round_down(max_t(s64, size - PAGE_ALIGN(paddr) +
> +						 md->phys_addr, 0),
> +					   PAGE_SIZE);
>  	}
>  
> +	if (lost > SZ_1K)
> +		pr_warn("efi: lost %u KB of RAM to rounding\n", lost / SZ_1K);
> +
>  	set_bit(EFI_MEMMAP, &efi.flags);
>  }
>  
> @@ -182,7 +165,7 @@ void __init efi_init(void)
>  
>  	WARN_ON(uefi_init() < 0);
>  
> -	reserve_regions();
> +	process_memory_map();
>  }
>  
>  static int __init arm64_enter_virtual_mode(void)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 09/10] arm64/efi: ignore unusable regions instead of reserving them
  2014-11-10  4:11         ` Mark Salter
@ 2014-11-10  7:31             ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-10  7:31 UTC (permalink / raw)
  To: Mark Salter
  Cc: Leif Lindholm, Roy Franz,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Mark Rutland,
	Dave Young, linux-efi-u79uwXL29TY76Z2rM5mHXA, Matt Fleming,
	Will Deacon, Catalin Marinas, Grant Likely

On 10 November 2014 05:11, Mark Salter <msalter-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Thu, 2014-11-06 at 15:13 +0100, Ard Biesheuvel wrote:
>> This changes the way memblocks are installed based on the contents of
>> the UEFI memory map. Formerly, all regions would be memblock_add()'ed,
>> after which unusable regions would be memblock_reserve()'d as well.
>> To simplify things, but also to allow access to the unusable regions
>> through mmap(/dev/mem), even with CONFIG_STRICT_DEVMEM set, change
>> this so that only usable regions are memblock_add()'ed in the first
>> place.
>
> This patch is crashing 64K pagesize kernels during boot. I'm not exactly
> sure why, though. Here is what I get on an APM Mustang box:
>

Ah, yes, I meant to mention this patch

https://git.kernel.org/cgit/linux/kernel/git/glikely/linux.git/commit/?id=8cccffc52694938fc88f3d90bc7fed8460e27191

in the cover letter, which addresses this issue at least for the DT case.

-- 
Ard.

> [    0.046262] Unhandled fault: alignment fault (0x96000021) at 0xfffffc000004f026
> [    0.053863] Internal error: : 96000021 [#1] SMP
> [    0.058566] Modules linked in:
> [    0.061736] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc3+ #11
> [    0.068418] Hardware name: APM X-Gene Mustang board (DT)
> [    0.073942] task: fffffe0000f46a20 ti: fffffe0000f10000 task.ti: fffffe0000f10000
> [    0.081735] PC is at acpi_ns_lookup+0x520/0x734
> [    0.086448] LR is at acpi_ns_lookup+0x4ac/0x734
> [    0.091151] pc : [<fffffe0000460458>] lr : [<fffffe00004603e4>] pstate: 60000245
> [    0.098843] sp : fffffe0000f13b00
> [    0.102293] x29: fffffe0000f13b10 x28: 000000000000000a
> [    0.107810] x27: 0000000000000008 x26: 0000000000000000
> [    0.113345] x25: fffffe0000f13c38 x24: fffffe0000fae000
> [    0.118862] x23: 0000000000000008 x22: 0000000000000001
> [    0.124406] x21: fffffe0000972000 x20: 000000000000000a
> [    0.129940] x19: fffffc000004f026 x18: 000000000000000b
> [    0.135483] x17: 0000000000000023 x16: 000000000000059f
> [    0.141026] x15: 0000000000007fff x14: 000000000000038e
> [    0.146543] x13: ff00000000000000 x12: ffffffffffffffff
> [    0.152060] x11: 0000000000000010 x10: 00000000fffffff6
> [    0.157585] x9 : 0000000000000050 x8 : fffffe03fa7401b0
> [    0.163111] x7 : fffffe0001115a00 x6 : fffffe0001115a00
> [    0.168620] x5 : fffffe0001115000 x4 : 0000000000000010
> [    0.174146] x3 : 0000000000000010 x2 : 000000000000000b
> [    0.179689] x1 : 00000000fffffff5 x0 : 0000000000000000
> [    0.185215]
> [    0.186765] Process swapper/0 (pid: 0, stack limit = 0xfffffe0000f10058)
> [    0.193724] Stack: (0xfffffe0000f13b00 to 0xfffffe0000f14000)
> [    0.199707] 3b00: 00000000 fffffe03 00000666 00000000 00f13bd0 fffffe00 0044f388 fffffe00
> ...
> [    0.539605] Call trace:
> [    0.542150] [<fffffe0000460458>] acpi_ns_lookup+0x520/0x734
> [    0.547935] [<fffffe000044f384>] acpi_ds_load1_begin_op+0x384/0x4b4
> [    0.554445] [<fffffe0000469594>] acpi_ps_build_named_op+0xfc/0x228
> [    0.560868] [<fffffe00004698c8>] acpi_ps_create_op+0x208/0x340
> [    0.566945] [<fffffe0000468ea8>] acpi_ps_parse_loop+0x208/0x7f8
> [    0.573092] [<fffffe000046a6b8>] acpi_ps_parse_aml+0x1c0/0x434
> [    0.579153] [<fffffe0000463cb0>] acpi_ns_one_complete_parse+0x1a4/0x1ec
> [    0.586017] [<fffffe0000463d88>] acpi_ns_parse_table+0x90/0x130
> [    0.592181] [<fffffe0000462f94>] acpi_ns_load_table+0xc8/0x1b0
> [    0.598243] [<fffffe0000e71848>] acpi_load_tables+0xf4/0x234
> [    0.604122] [<fffffe0000e70a0c>] acpi_early_init+0x78/0xc8
> [    0.609828] [<fffffe0000e40928>] start_kernel+0x334/0x3ac
> [    0.615431] Code: 2a00037b b9009fbc 36180057 321d037b (b9400260)
> [    0.621777] ---[ end trace cb88537fdc8fa200 ]---
> [    0.626586] Kernel panic - not syncing: Attempted to kill the idle task!
>
> I also get a different crash when booting with device tree instead of ACPI.
>
> One thing is that early_init_dt_add_memory_arch() is called with UEFI
> (4K) pagesize alignment and size and it clips the passed in range such
> that what is added is the 64K aligned range which fits completely within
> the given range. This may mean nothing gets added if the given range is
> smaller than 64K. And only a subset of the desired range is added in
> other cases. This could theoretically leave bits of devicetree and/or
> initrd unmapped, but that doesn't seem to be the case here. I'll keep
> poking at it...
>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>> ---
>>  arch/arm64/kernel/efi.c | 69 +++++++++++++++++++------------------------------
>>  1 file changed, 26 insertions(+), 43 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
>> index 3009c22e2620..af2214c692d3 100644
>> --- a/arch/arm64/kernel/efi.c
>> +++ b/arch/arm64/kernel/efi.c
>> @@ -40,13 +40,6 @@ static int __init uefi_debug_setup(char *str)
>>  }
>>  early_param("uefi_debug", uefi_debug_setup);
>>
>> -static int __init is_normal_ram(efi_memory_desc_t *md)
>> -{
>> -     if (md->attribute & EFI_MEMORY_WB)
>> -             return 1;
>> -     return 0;
>> -}
>> -
>>  static int __init uefi_init(void)
>>  {
>>       efi_char16_t *c16;
>> @@ -105,28 +98,11 @@ out:
>>       return retval;
>>  }
>>
>> -/*
>> - * Return true for RAM regions we want to permanently reserve.
>> - */
>> -static __init int is_reserve_region(efi_memory_desc_t *md)
>> -{
>> -     switch (md->type) {
>> -     case EFI_LOADER_CODE:
>> -     case EFI_LOADER_DATA:
>> -     case EFI_BOOT_SERVICES_CODE:
>> -     case EFI_BOOT_SERVICES_DATA:
>> -     case EFI_CONVENTIONAL_MEMORY:
>> -             return 0;
>> -     default:
>> -             break;
>> -     }
>> -     return is_normal_ram(md);
>> -}
>> -
>> -static __init void reserve_regions(void)
>> +static __init void process_memory_map(void)
>>  {
>>       efi_memory_desc_t *md;
>>       u64 paddr, npages, size;
>> +     u32 lost = 0;
>>
>>       if (uefi_debug)
>>               pr_info("Processing EFI memory map:\n");
>> @@ -134,31 +110,38 @@ static __init void reserve_regions(void)
>>       for_each_efi_memory_desc(&memmap, md) {
>>               paddr = md->phys_addr;
>>               npages = md->num_pages;
>> +             size = npages << EFI_PAGE_SHIFT;
>>
>>               if (uefi_debug) {
>>                       char buf[64];
>>
>> -                     pr_info("  0x%012llx-0x%012llx %s",
>> -                             paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1,
>> +                     pr_info("  0x%012llx-0x%012llx %s\n",
>> +                             paddr, paddr + size - 1,
>>                               efi_md_typeattr_format(buf, sizeof(buf), md));
>>               }
>>
>> -             memrange_efi_to_native(&paddr, &npages);
>> -             size = npages << PAGE_SHIFT;
>> -
>> -             if (is_normal_ram(md))
>> -                     early_init_dt_add_memory_arch(paddr, size);
>> -
>> -             if (is_reserve_region(md)) {
>> -                     memblock_reserve(paddr, size);
>> -                     if (uefi_debug)
>> -                             pr_cont("*");
>> -             }
>> -
>> -             if (uefi_debug)
>> -                     pr_cont("\n");
>> +             if (!efi_mem_is_usable_region(md))
>> +                     continue;
>> +
>> +             early_init_dt_add_memory_arch(paddr, size);
>> +
>> +             /*
>> +              * Keep a tally of how much memory we are losing due to
>> +              * rounding of regions that are not aligned to the page
>> +              * size. We cannot easily recover this memory without
>> +              * sorting the memory map and attempting to merge adjacent
>> +              * usable regions.
>> +              */
>> +             if (PAGE_SHIFT != EFI_PAGE_SHIFT)
>> +                     lost += (npages << EFI_PAGE_SHIFT) -
>> +                             round_down(max_t(s64, size - PAGE_ALIGN(paddr) +
>> +                                              md->phys_addr, 0),
>> +                                        PAGE_SIZE);
>>       }
>>
>> +     if (lost > SZ_1K)
>> +             pr_warn("efi: lost %u KB of RAM to rounding\n", lost / SZ_1K);
>> +
>>       set_bit(EFI_MEMMAP, &efi.flags);
>>  }
>>
>> @@ -182,7 +165,7 @@ void __init efi_init(void)
>>
>>       WARN_ON(uefi_init() < 0);
>>
>> -     reserve_regions();
>> +     process_memory_map();
>>  }
>>
>>  static int __init arm64_enter_virtual_mode(void)
>
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 09/10] arm64/efi: ignore unusable regions instead of reserving them
  2014-11-10  7:31             ` Ard Biesheuvel
@ 2014-11-11 15:42                 ` Mark Salter
  -1 siblings, 0 replies; 44+ messages in thread
From: Mark Salter @ 2014-11-11 15:42 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Leif Lindholm, Roy Franz,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Mark Rutland,
	Dave Young, linux-efi-u79uwXL29TY76Z2rM5mHXA, Matt Fleming,
	Will Deacon, Catalin Marinas, Grant Likely

On Mon, 2014-11-10 at 08:31 +0100, Ard Biesheuvel wrote:
> On 10 November 2014 05:11, Mark Salter <msalter-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > On Thu, 2014-11-06 at 15:13 +0100, Ard Biesheuvel wrote:
> >> This changes the way memblocks are installed based on the contents of
> >> the UEFI memory map. Formerly, all regions would be memblock_add()'ed,
> >> after which unusable regions would be memblock_reserve()'d as well.
> >> To simplify things, but also to allow access to the unusable regions
> >> through mmap(/dev/mem), even with CONFIG_STRICT_DEVMEM set, change
> >> this so that only usable regions are memblock_add()'ed in the first
> >> place.
> >
> > This patch is crashing 64K pagesize kernels during boot. I'm not exactly
> > sure why, though. Here is what I get on an APM Mustang box:
> >
> 
> Ah, yes, I meant to mention this patch
> 
> https://git.kernel.org/cgit/linux/kernel/git/glikely/linux.git/commit/?id=8cccffc52694938fc88f3d90bc7fed8460e27191
> 
> in the cover letter, which addresses this issue at least for the DT
> case.
> 

That isn't the problem. In general, with 64K kernel pages, you can't be
sure that something you need hasn't been left out of the kernel linear
mapping. If Loader Code/Data regions begin and/or end at something other
than a 64K boundary, and such a region is adjacent to a region not being
added, then the unaligned bits end up left out of the linear mapping.
This could be bits of the initramfs or devicetree.

What I don't get with this failure is that it is an alignment fault
which should be masked at EL1 for the kernel. The same unaligned
access happens without this patch and it doesn't generate a fault.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 09/10] arm64/efi: ignore unusable regions instead of reserving them
  2014-11-11 15:42                 ` Mark Salter
@ 2014-11-11 17:12                     ` Mark Salter
  -1 siblings, 0 replies; 44+ messages in thread
From: Mark Salter @ 2014-11-11 17:12 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Leif Lindholm, Roy Franz,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Mark Rutland,
	Dave Young, linux-efi-u79uwXL29TY76Z2rM5mHXA, Matt Fleming,
	Will Deacon, Catalin Marinas, Grant Likely

On Tue, 2014-11-11 at 10:42 -0500, Mark Salter wrote:
> On Mon, 2014-11-10 at 08:31 +0100, Ard Biesheuvel wrote:
> > On 10 November 2014 05:11, Mark Salter <msalter-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > > On Thu, 2014-11-06 at 15:13 +0100, Ard Biesheuvel wrote:
> > >> This changes the way memblocks are installed based on the contents
> > of
> > >> the UEFI memory map. Formerly, all regions would be
> > memblock_add()'ed,
> > >> after which unusable regions would be memblock_reserve()'d as well.
> > >> To simplify things, but also to allow access to the unusable
> > regions
> > >> through mmap(/dev/mem), even with CONFIG_STRICT_DEVMEM set, change
> > >> this so that only usable regions are memblock_add()'ed in the first
> > >> place.
> > >
> > > This patch is crashing 64K pagesize kernels during boot. I'm not
> > exactly
> > > sure why, though. Here is what I get on an APM Mustang box:
> > >
> > 
> > Ah, yes, I meant to mention this patch
> > 
> > https://git.kernel.org/cgit/linux/kernel/git/glikely/linux.git/commit/?id=8cccffc52694938fc88f3d90bc7fed8460e27191
> > 
> > in the cover letter, which addresses this issue at least for the DT
> > case.
> > 
> 
> That isn't the problem. In general, with 64K kernel pages, you can't be
> sure if you leave something you need out of the kernel linear mapping.
> If you have Loader Code/Data regions begin and/or end at something other
> than a 64K boundary and that region is adjacent to a region not being
> added, then you end up leaving out the unaligned bits from the linear
> mapping. This could be bits of the initramfs or devicetree.
> 
> What I don't get with this failure is that it is an alignment fault
> which should be masked at EL1 for the kernel. The same unaligned
> access happens without this patch and it doesn't generate a fault.
> 

Ah, but unaligned accesses are not ignored for device memory.
I have this in include/acpi/acpi_io.h:

static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
					    acpi_size size)
{
#ifdef CONFIG_ARM64
	if (!page_is_ram(phys >> PAGE_SHIFT))
		return ioremap(phys, size);
#endif

	return ioremap_cache(phys, size);
}

Because the table isn't in the linear mapping, it fails the
page_is_ram() test and it gets mapped with ioremap(), leading to
the alignment fault.

If I take out the code inside the #ifdef, I get a different
fault:

[    0.350057] Unhandled fault: synchronous external abort (0x96000010) at 0xfffffe0000fae6f4
[    0.358704] pgd = fffffe0001160000
[    0.362276] [fffffe0000fae6f4] *pgd=0000004001370003, *pud=0000004001370003, *pmd=0000004001370003, *pte=02c00040011a0713
[    0.373746] Internal error: : 96000010 [#1] SMP
[    0.378484] Modules linked in:
[    0.381601] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc4+ #15
[    0.388248] Hardware name: APM X-Gene Mustang board (DT)
[    0.393738] task: fffffe03dbe10000 ti: fffffe03dbf00000 task.ti: fffffe03dbf00000
[    0.401503] PC is at acpi_ex_system_memory_space_handler+0x238/0x2e0
[    0.408160] LR is at acpi_ex_system_memory_space_handler+0x130/0x2e0

That happens because AML is trying to access a hardware register
which has been mapped as normal memory.

So, we need a way to tell a table in RAM from an I/O address in AML.
And page_is_ram() no longer cuts it if the tables are not in the
linear mapping.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 09/10] arm64/efi: ignore unusable regions instead of reserving them
  2014-11-11 17:12                     ` Mark Salter
@ 2014-11-11 17:44                         ` Mark Rutland
  -1 siblings, 0 replies; 44+ messages in thread
From: Mark Rutland @ 2014-11-11 17:44 UTC (permalink / raw)
  To: msalter-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Ard Biesheuvel, Leif Lindholm, Roy Franz,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Dave Young,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Matt Fleming, Will Deacon,
	Catalin Marinas, grant.likely-QSEj5FYQhm4dnm+yROfE0A

On Tue, Nov 11, 2014 at 05:12:09PM +0000, Mark Salter wrote:
> On Tue, 2014-11-11 at 10:42 -0500, Mark Salter wrote:
> > On Mon, 2014-11-10 at 08:31 +0100, Ard Biesheuvel wrote:
> > > On 10 November 2014 05:11, Mark Salter <msalter-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > > > On Thu, 2014-11-06 at 15:13 +0100, Ard Biesheuvel wrote:
> > > >> This changes the way memblocks are installed based on the contents
> > > of
> > > >> the UEFI memory map. Formerly, all regions would be
> > > memblock_add()'ed,
> > > >> after which unusable regions would be memblock_reserve()'d as well.
> > > >> To simplify things, but also to allow access to the unusable
> > > regions
> > > >> through mmap(/dev/mem), even with CONFIG_STRICT_DEVMEM set, change
> > > >> this so that only usable regions are memblock_add()'ed in the first
> > > >> place.
> > > >
> > > > This patch is crashing 64K pagesize kernels during boot. I'm not
> > > exactly
> > > > sure why, though. Here is what I get on an APM Mustang box:
> > > >
> > > 
> > > Ah, yes, I meant to mention this patch
> > > 
> > > https://git.kernel.org/cgit/linux/kernel/git/glikely/linux.git/commit/?id=8cccffc52694938fc88f3d90bc7fed8460e27191
> > > 
> > > in the cover letter, which addresses this issue at least for the DT
> > > case.
> > > 
> > 
> > That isn't the problem. In general, with 64K kernel pages, you can't be
> > sure if you leave something you need out of the kernel linear mapping.

Regardless of 64k pages you can never assume that anything will be in
the linear mapping due to the current relationship between the start of
the linear map and the address the kernel was loaded at.

> > If you have Loader Code/Data regions begin and/or end at something other
> > than a 64K boundary and that region is adjacent to a region not being
> > added, then you end up leaving out the unaligned bits from the linear
> > mapping. This could be bits of the initramfs or devicetree.
> > 
> > What I don't get with this failure is that it is an alignment fault
> > which should be masked at EL1 for the kernel. The same unaligned
> > access happens without this patch and it doesn't generate a fault.
> > 
> 
> Ah, but unaligned accesses are not ignored for device memory.
> I have this in include/acpi/acpi_io.h:
> 
> static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
> 					    acpi_size size)
> {
> #ifdef CONFIG_ARM64
> 	if (!page_is_ram(phys >> PAGE_SHIFT))
> 		return ioremap(phys, size);
> #endif
> 
>        return ioremap_cache(phys, size);
> }
> 
> Because the table isn't in the linear mapping, it fails the
> > page_is_ram() test and it gets mapped with ioremap() leading to
> the alignment fault.
> 
> If I take out the code inside the #ifdef, I get a different
> fault:
> 
> [    0.350057] Unhandled fault: synchronous external abort (0x96000010) at 0xfffffe0000fae6f4
> [    0.358704] pgd = fffffe0001160000
> [    0.362276] [fffffe0000fae6f4] *pgd=0000004001370003, *pud=0000004001370003, *pmd=0000004001370003, *pte=02c00040011a0713
> [    0.373746] Internal error: : 96000010 [#1] SMP
> [    0.378484] Modules linked in:
> [    0.381601] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc4+ #15
> [    0.388248] Hardware name: APM X-Gene Mustang board (DT)
> [    0.393738] task: fffffe03dbe10000 ti: fffffe03dbf00000 task.ti: fffffe03dbf00000
> [    0.401503] PC is at acpi_ex_system_memory_space_handler+0x238/0x2e0
> [    0.408160] LR is at acpi_ex_system_memory_space_handler+0x130/0x2e0
> 
> That happens because AML is trying to access a hardware register
> which has been mapped as normal memory.

Which is why removing the check in the ifdef is completely nonsensical.
We already knew we can't map everything cacheable -- regardless of what
AML does the CPU can prefetch from anything mapped cacheable (or simply
executable) at any time.

> So, we need a way to tell a table in ram from an io address in AML.
> And page_is_ram() no longer cuts it if the tables are not in the
> linear mapping.

If the kernel were loaded at an address above the tables, it would fail
similarly. So page_is_ram() was never sufficient to ensure tables
would be mapped cacheable.

As we haven't decoupled the kernel text mapping from the linear mapping,
and that doesn't look to be happening any time soon, we can't fix up
page_is_ram -- we need something else entirely.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 09/10] arm64/efi: ignore unusable regions instead of reserving them
  2014-11-11 17:44                         ` Mark Rutland
@ 2014-11-11 17:55                           ` Ard Biesheuvel
  -1 siblings, 0 replies; 44+ messages in thread
From: Ard Biesheuvel @ 2014-11-11 17:55 UTC (permalink / raw)
  To: Mark Rutland
  Cc: msalter-H+wXaHxf7aLQT0dZR+AlfA, Leif Lindholm, Roy Franz,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Dave Young,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Matt Fleming, Will Deacon,
	Catalin Marinas, grant.likely-QSEj5FYQhm4dnm+yROfE0A

On 11 November 2014 18:44, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
> On Tue, Nov 11, 2014 at 05:12:09PM +0000, Mark Salter wrote:
>> On Tue, 2014-11-11 at 10:42 -0500, Mark Salter wrote:
>> > On Mon, 2014-11-10 at 08:31 +0100, Ard Biesheuvel wrote:
>> > > On 10 November 2014 05:11, Mark Salter <msalter-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> > > > On Thu, 2014-11-06 at 15:13 +0100, Ard Biesheuvel wrote:
>> > > >> This changes the way memblocks are installed based on the contents
>> > > of
>> > > >> the UEFI memory map. Formerly, all regions would be
>> > > memblock_add()'ed,
>> > > >> after which unusable regions would be memblock_reserve()'d as well.
>> > > >> To simplify things, but also to allow access to the unusable
>> > > regions
>> > > >> through mmap(/dev/mem), even with CONFIG_STRICT_DEVMEM set, change
>> > > >> this so that only usable regions are memblock_add()'ed in the first
>> > > >> place.
>> > > >
>> > > > This patch is crashing 64K pagesize kernels during boot. I'm not
>> > > exactly
>> > > > sure why, though. Here is what I get on an APM Mustang box:
>> > > >
>> > >
>> > > Ah, yes, I meant to mention this patch
>> > >
>> > > https://git.kernel.org/cgit/linux/kernel/git/glikely/linux.git/commit/?id=8cccffc52694938fc88f3d90bc7fed8460e27191
>> > >
>> > > in the cover letter, which addresses this issue at least for the DT
>> > > case.
>> > >
>> >
>> > That isn't the problem. In general, with 64K kernel pages, you can't be
>> > sure if you leave something you need out of the kernel linear mapping.
>
> Regardless of 64k pages you can never assume that anything will be in
> the linear mapping due to the current relationship between the start of
> the linear map and the address the kernel was loaded at.
>
>> > If you have Loader Code/Data regions begin and/or end at something other
>> > than a 64K boundary and that region is adjacent to a region not being
>> > added, then you end up leaving out the unaligned bits from the linear
>> > mapping. This could be bits of the initramfs or devicetree.
>> >
>> > What I don't get with this failure is that it is an alignment fault
>> > which should be masked at EL1 for the kernel. The same unaligned
>> > access happens without this patch and it doesn't generate a fault.
>> >
>>
>> Ah, but unaligned accesses are not ignored for device memory.
>> I have this in include/acpi/acpi_io.h:
>>
>> static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
>>                                           acpi_size size)
>> {
>> #ifdef CONFIG_ARM64
>>       if (!page_is_ram(phys >> PAGE_SHIFT))
>>               return ioremap(phys, size);
>> #endif
>>
>>        return ioremap_cache(phys, size);
>> }
>>
>> Because the table isn't in the linear mapping, it fails the
>> page_is_ram() test and it gets mapped with ioremap() leading to
>> the alignment fault.
>>
>> If I take out the code inside the #ifdef, I get a different
>> fault:
>>
>> [ 0.350057] Unhandled fault: synchronous external abort (0x96000010) at 0xfffffe0000fae6f4
>> [    0.358704] pgd = fffffe0001160000
>> [    0.362276] [fffffe0000fae6f4] *pgd=0000004001370003, *pud=0000004001370003, *pmd=0000004001370003, *pte=02c00040011a0713
>> [    0.373746] Internal error: : 96000010 [#1] SMP
>> [    0.378484] Modules linked in:
>> [    0.381601] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc4+ #15
>> [    0.388248] Hardware name: APM X-Gene Mustang board (DT)
>> [    0.393738] task: fffffe03dbe10000 ti: fffffe03dbf00000 task.ti: fffffe03dbf00000
>> [    0.401503] PC is at acpi_ex_system_memory_space_handler+0x238/0x2e0
>> [    0.408160] LR is at acpi_ex_system_memory_space_handler+0x130/0x2e0
>>
>> That happens because AML is trying to access a hardware register
>> which has been mapped as normal memory.
>
> Which is why removing the check in the ifdef is completely nonsensical.

I am sure we are all in agreement on this part.

> We already knew we can't map everything cacheable -- regardless of what
> AML does the CPU can prefetch from anything mapped cacheable (or simply
> executable) at any time.
>
>> So, we need a way to tell a table in ram from an io address in AML.
>> And page_is_ram() no longer cuts it if the tables are not in the
>> linear mapping.
>
> If the kernel were loaded at an address above the tables, it would fail
> similarly. So the page_is_ram was never sufficient to ensure tables
> would be mapped cacheable.
>
> As we haven't decoupled the kernel text mapping from the linear mapping,
> and that doesn't look to be happening any time soon, we can't fix up
> page_is_ram -- we need something else entirely.
>

Well, if you look at the series, and in particular at the /dev/mem
handling, there is already some code that classifies physical
addresses based on whether they appear in the UEFI memory map, and
with which attributes. I suppose reimplementing page_is_ram() [whose
default implementation is conveniently __weak] to return true for
EFI_MEMORY_WB ranges and false for everything else would do the trick
here, and would arguably be more elegant than matching the string
"System RAM" in the resource table. And, putting the UEFI memory map
central in a range of RAM remapping related functions ensures that
they are always in agreement (or so the theory goes)

-- 
Ard.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 09/10] arm64/efi: ignore unusable regions instead of reserving them
  2014-11-11 17:44                         ` Mark Rutland
@ 2014-11-11 18:23                           ` Mark Salter
  -1 siblings, 0 replies; 44+ messages in thread
From: Mark Salter @ 2014-11-11 18:23 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Ard Biesheuvel, Leif Lindholm, Roy Franz,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Dave Young,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Matt Fleming, Will Deacon,
	Catalin Marinas, grant.likely-QSEj5FYQhm4dnm+yROfE0A

On Tue, 2014-11-11 at 17:44 +0000, Mark Rutland wrote:
> On Tue, Nov 11, 2014 at 05:12:09PM +0000, Mark Salter wrote:
> > On Tue, 2014-11-11 at 10:42 -0500, Mark Salter wrote:
> > > On Mon, 2014-11-10 at 08:31 +0100, Ard Biesheuvel wrote:
> > > > On 10 November 2014 05:11, Mark Salter <msalter-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > > > > On Thu, 2014-11-06 at 15:13 +0100, Ard Biesheuvel wrote:
> > > > >> This changes the way memblocks are installed based on the contents
> > > > of
> > > > >> the UEFI memory map. Formerly, all regions would be
> > > > memblock_add()'ed,
> > > > >> after which unusable regions would be memblock_reserve()'d as well.
> > > > >> To simplify things, but also to allow access to the unusable
> > > > regions
> > > > >> through mmap(/dev/mem), even with CONFIG_STRICT_DEVMEM set, change
> > > > >> this so that only usable regions are memblock_add()'ed in the first
> > > > >> place.
> > > > >
> > > > > This patch is crashing 64K pagesize kernels during boot. I'm not
> > > > exactly
> > > > > sure why, though. Here is what I get on an APM Mustang box:
> > > > >
> > > > 
> > > > Ah, yes, I meant to mention this patch
> > > > 
> > > > https://git.kernel.org/cgit/linux/kernel/git/glikely/linux.git/commit/?id=8cccffc52694938fc88f3d90bc7fed8460e27191
> > > > 
> > > > in the cover letter, which addresses this issue at least for the DT
> > > > case.
> > > > 
> > > 
> > > That isn't the problem. In general, with 64K kernel pages, you can't be
> > > sure if you leave something you need out of the kernel linear mapping.
> 
> Regardless of 64k pages you can never assume that anything will be in
> the linear mapping due to the current relationship between the start of
> the linear map and the address the kernel was loaded at.

We know that the kernel, initramfs, and devicetree must be in the linear
mapping or we have already lost. I don't think we care about anything
else in Loader Data/Code.

If we fix the kernel so that initramfs and devicetree can exist outside
the linear mapping, then fine. Until then, if they do fall above the
kernel and are potentially able to be included the linear mapping, then
you had better take care of the unaligned bits that get clipped by
early_init_dt_add_memory_arch().

> 
> > > If you have Loader Code/Data regions begin and/or end at something other
> > > than a 64K boundary and that region is adjacent to a region not being
> > > added, then you end up leaving out the unaligned bits from the linear
> > > mapping. This could be bits of the initramfs or devicetree.
> > > 
> > > What I don't get with this failure is that it is an alignment fault
> > > which should be masked at EL1 for the kernel. The same unaligned
> > > access happens without this patch and it doesn't generate a fault.
> > > 
> > 
> > Ah, but unaligned accesses are not ignored for device memory.
> > I have this in include/acpi/acpi_io.h:
> > 
> > static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
> > 					    acpi_size size)
> > {
> > #ifdef CONFIG_ARM64
> > 	if (!page_is_ram(phys >> PAGE_SHIFT))
> > 		return ioremap(phys, size);
> > #endif
> > 
> > 	return ioremap_cache(phys, size);
> > }
> > 
> > Because the table isn't in the linear mapping, it fails the
> > page_is_ram() test and gets mapped with ioremap(), leading to
> > the alignment fault.
> > 
> > If I take out the code inside the #ifdef, I get a different
> > fault:
> > 
> > [    0.350057] Unhandled fault: synchronous external abort (0x96000010) at 0xfffffe0000fae6f4
> > [    0.358704] pgd = fffffe0001160000
> > [    0.362276] [fffffe0000fae6f4] *pgd=0000004001370003, *pud=0000004001370003, *pmd=0000004001370003, *pte=02c00040011a0713
> > [    0.373746] Internal error: : 96000010 [#1] SMP
> > [    0.378484] Modules linked in:
> > [    0.381601] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc4+ #15
> > [    0.388248] Hardware name: APM X-Gene Mustang board (DT)
> > [    0.393738] task: fffffe03dbe10000 ti: fffffe03dbf00000 task.ti: fffffe03dbf00000
> > [    0.401503] PC is at acpi_ex_system_memory_space_handler+0x238/0x2e0
> > [    0.408160] LR is at acpi_ex_system_memory_space_handler+0x130/0x2e0
> > 
> > That happens because AML is trying to access a hardware register
> > which has been mapped as normal memory.
> 
> Which is why removing the check in the ifdef is completely nonsensical.
> We already knew we can't map everything cacheable -- regardless of what
> AML does the CPU can prefetch from anything mapped cacheable (or simply
> executable) at any time.
> 
> > So, we need a way to tell a table in ram from an io address in AML.
> > And page_is_ram() no longer cuts it if the tables are not in the
> > linear mapping.
> 
> If the kernel were loaded at an address above the tables, it would fail
> similarly. So page_is_ram() was never sufficient to ensure tables
> would be mapped cacheable.
> 
> As we haven't decoupled the kernel text mapping from the linear mapping,
> and that doesn't look to be happening any time soon, we can't fix up
> page_is_ram -- we need something else entirely.

Agreed.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 09/10] arm64/efi: ignore unusable regions instead of reserving them
  2014-11-11 17:55                           ` Ard Biesheuvel
@ 2014-11-11 18:39                               ` Mark Rutland
  -1 siblings, 0 replies; 44+ messages in thread
From: Mark Rutland @ 2014-11-11 18:39 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: msalter-H+wXaHxf7aLQT0dZR+AlfA, Leif Lindholm, Roy Franz,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Dave Young,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Matt Fleming, Will Deacon,
	Catalin Marinas, grant.likely-QSEj5FYQhm4dnm+yROfE0A

On Tue, Nov 11, 2014 at 05:55:24PM +0000, Ard Biesheuvel wrote:
> On 11 November 2014 18:44, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
> > On Tue, Nov 11, 2014 at 05:12:09PM +0000, Mark Salter wrote:
> >> On Tue, 2014-11-11 at 10:42 -0500, Mark Salter wrote:
> >> > On Mon, 2014-11-10 at 08:31 +0100, Ard Biesheuvel wrote:
> >> > > On 10 November 2014 05:11, Mark Salter <msalter-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> >> > > > On Thu, 2014-11-06 at 15:13 +0100, Ard Biesheuvel wrote:
> >> > > >> This changes the way memblocks are installed based on the contents
> >> > > of
> >> > > >> the UEFI memory map. Formerly, all regions would be
> >> > > memblock_add()'ed,
> >> > > >> after which unusable regions would be memblock_reserve()'d as well.
> >> > > >> To simplify things, but also to allow access to the unusable
> >> > > regions
> >> > > >> through mmap(/dev/mem), even with CONFIG_STRICT_DEVMEM set, change
> >> > > >> this so that only usable regions are memblock_add()'ed in the first
> >> > > >> place.
> >> > > >
> >> > > > This patch is crashing 64K pagesize kernels during boot. I'm not
> >> > > exactly
> >> > > > sure why, though. Here is what I get on an APM Mustang box:
> >> > > >
> >> > >
> >> > > Ah, yes, I meant to mention this patch
> >> > >
> >> > > https://git.kernel.org/cgit/linux/kernel/git/glikely/linux.git/commit/?id=8cccffc52694938fc88f3d90bc7fed8460e27191
> >> > >
> >> > > in the cover letter, which addresses this issue at least for the DT
> >> > > case.
> >> > >
> >> >
> >> > That isn't the problem. In general, with 64K kernel pages, you can't be
> >> > sure if you leave something you need out of the kernel linear mapping.
> >
> > Regardless of 64k pages you can never assume that anything will be in
> > the linear mapping due to the current relationship between the start of
> > the linear map and the address the kernel was loaded at.
> >
> >> > If you have Loader Code/Data regions begin and/or end at something other
> >> > than a 64K boundary and that region is adjacent to a region not being
> >> > added, then you end up leaving out the unaligned bits from the linear
> >> > mapping. This could be bits of the initramfs or devicetree.
> >> >
> >> > What I don't get with this failure is that it is an alignment fault
> >> > which should be masked at EL1 for the kernel. The same unaligned
> >> > access happens without this patch and it doesn't generate a fault.
> >> >
> >>
> >> Ah, but unaligned accesses are not ignored for device memory.
> >> I have this in include/acpi/acpi_io.h:
> >>
> >> static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
> >>                                           acpi_size size)
> >> {
> >> #ifdef CONFIG_ARM64
> >>       if (!page_is_ram(phys >> PAGE_SHIFT))
> >>               return ioremap(phys, size);
> >> #endif
> >>
> >>        return ioremap_cache(phys, size);
> >> }
> >>
> >> Because the table isn't in the linear mapping, it fails the
> >> page_is_ram() test and gets mapped with ioremap(), leading to
> >> the alignment fault.
> >>
> >> If I take out the code inside the #ifdef, I get a different
> >> fault:
> >>
> >> [    0.350057] Unhandled fault: synchronous external abort (0x96000010) at 0xfffffe0000fae6f4
> >> [    0.358704] pgd = fffffe0001160000
> >> [    0.362276] [fffffe0000fae6f4] *pgd=0000004001370003, *pud=0000004001370003, *pmd=0000004001370003, *pte=02c00040011a0713
> >> [    0.373746] Internal error: : 96000010 [#1] SMP
> >> [    0.378484] Modules linked in:
> >> [    0.381601] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc4+ #15
> >> [    0.388248] Hardware name: APM X-Gene Mustang board (DT)
> >> [    0.393738] task: fffffe03dbe10000 ti: fffffe03dbf00000 task.ti: fffffe03dbf00000
> >> [    0.401503] PC is at acpi_ex_system_memory_space_handler+0x238/0x2e0
> >> [    0.408160] LR is at acpi_ex_system_memory_space_handler+0x130/0x2e0
> >>
> >> That happens because AML is trying to access a hardware register
> >> which has been mapped as normal memory.
> >
> > Which is why removing the check in the ifdef is completely nonsensical.
> 
> I am sure we are all in agreement on this part.
> 
> > We already knew we can't map everything cacheable -- regardless of what
> > AML does the CPU can prefetch from anything mapped cacheable (or simply
> > executable) at any time.
> >
> >> So, we need a way to tell a table in ram from an io address in AML.
> >> And page_is_ram() no longer cuts it if the tables are not in the
> >> linear mapping.
> >
> > If the kernel were loaded at an address above the tables, it would fail
> > similarly. So page_is_ram() was never sufficient to ensure tables
> > would be mapped cacheable.
> >
> > As we haven't decoupled the kernel text mapping from the linear mapping,
> > and that doesn't look to be happening any time soon, we can't fix up
> > page_is_ram -- we need something else entirely.
> >
> 
> Well, if you look at the series, and in particular at the /dev/mem
> handling, there is already some code that classifies physical
> addresses based on whether they appear in the UEFI memory map, and
> with which attributes. I suppose reimplementing page_is_ram() [whose
> default implementation is conveniently __weak] to return true for
> EFI_MEMORY_WB ranges and false for everything else would do the trick
> here, and would arguably be more elegant than matching the string
> "System RAM" in the resource table. And, putting the UEFI memory map
> central in a range of RAM remapping related functions ensures that
> they are always in agreement (or so the theory goes)

Sure. We'll have to be careful to fall back to the current behaviour for
!UEFI systems, though.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2014-11-11 18:39 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-06 14:13 [PATCH v2 00/10] arm64: stable UEFI mappings for kexec Ard Biesheuvel
2014-11-06 14:13 ` Ard Biesheuvel
     [not found] ` <1415283206-14713-1-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2014-11-06 14:13   ` [PATCH v2 01/10] arm64/mm: add explicit struct_mm argument to __create_mapping() Ard Biesheuvel
2014-11-06 14:13     ` Ard Biesheuvel
2014-11-06 14:13   ` [PATCH v2 02/10] arm64/mm: add create_pgd_mapping() to create private page tables Ard Biesheuvel
2014-11-06 14:13     ` Ard Biesheuvel
2014-11-07 15:08     ` Steve Capper
2014-11-07 15:08       ` Steve Capper
     [not found]       ` <20141107150830.GA10210-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2014-11-07 15:12         ` Ard Biesheuvel
2014-11-07 15:12           ` Ard Biesheuvel
     [not found]           ` <CAKv+Gu8SuNy8ufq2trZB=0jW6QRZde4QQS3M+jSDuWCTZ257vg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-11-07 15:21             ` Steve Capper
2014-11-07 15:21               ` Steve Capper
2014-11-06 14:13   ` [PATCH v2 03/10] efi: split off remapping code from efi_config_init() Ard Biesheuvel
2014-11-06 14:13     ` Ard Biesheuvel
2014-11-06 14:13   ` [PATCH v2 04/10] efi: add common infrastructure for stub-installed virtual mapping Ard Biesheuvel
2014-11-06 14:13     ` Ard Biesheuvel
2014-11-06 14:13   ` [PATCH v2 05/10] arm64/efi: move SetVirtualAddressMap() to UEFI stub Ard Biesheuvel
2014-11-06 14:13     ` Ard Biesheuvel
2014-11-06 14:13   ` [PATCH v2 06/10] arm64/efi: remove free_boot_services() and friends Ard Biesheuvel
2014-11-06 14:13     ` Ard Biesheuvel
2014-11-06 14:13   ` [PATCH v2 07/10] arm64/efi: remove idmap manipulations from UEFI code Ard Biesheuvel
2014-11-06 14:13     ` Ard Biesheuvel
2014-11-06 14:13   ` [PATCH v2 08/10] arm64/efi: use UEFI memory map unconditionally if available Ard Biesheuvel
2014-11-06 14:13     ` Ard Biesheuvel
2014-11-06 14:13   ` [PATCH v2 09/10] arm64/efi: ignore unusable regions instead of reserving them Ard Biesheuvel
2014-11-06 14:13     ` Ard Biesheuvel
     [not found]     ` <1415283206-14713-10-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2014-11-10  4:11       ` Mark Salter
2014-11-10  4:11         ` Mark Salter
     [not found]         ` <1415592695.32311.91.camel-PDpCo7skNiwAicBL8TP8PQ@public.gmane.org>
2014-11-10  7:31           ` Ard Biesheuvel
2014-11-10  7:31             ` Ard Biesheuvel
     [not found]             ` <CAKv+Gu8ZpDpfJSvDksUfW0L4k9uU0-j9mConGwcdcsmh1XM_2w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-11-11 15:42               ` Mark Salter
2014-11-11 15:42                 ` Mark Salter
     [not found]                 ` <1415720536.32311.113.camel-PDpCo7skNiwAicBL8TP8PQ@public.gmane.org>
2014-11-11 17:12                   ` Mark Salter
2014-11-11 17:12                     ` Mark Salter
     [not found]                     ` <1415725929.32311.130.camel-PDpCo7skNiwAicBL8TP8PQ@public.gmane.org>
2014-11-11 17:44                       ` Mark Rutland
2014-11-11 17:44                         ` Mark Rutland
2014-11-11 17:55                         ` Ard Biesheuvel
2014-11-11 17:55                           ` Ard Biesheuvel
     [not found]                           ` <CAKv+Gu91rqSw5yFmgNPQQNEO0ChzqncUzqrqgkTK5s633TD16Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-11-11 18:39                             ` Mark Rutland
2014-11-11 18:39                               ` Mark Rutland
2014-11-11 18:23                         ` Mark Salter
2014-11-11 18:23                           ` Mark Salter
2014-11-06 14:13   ` [PATCH v2 10/10] arm64/efi: improve /dev/mem mmap() handling under UEFI Ard Biesheuvel
2014-11-06 14:13     ` Ard Biesheuvel
