linux-efi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/8] stable UEFI virtual mappings for kexec
@ 2014-12-22 10:58 Ard Biesheuvel
       [not found] ` <1419245944-2424-1-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 30+ messages in thread
From: Ard Biesheuvel @ 2014-12-22 10:58 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Ard Biesheuvel

This is v4 of the series to update the UEFI memory map handling for the arm64
architecture so that virtual mappings of UEFI Runtime Services are stable across
kexec.

To de-risk the adoption of the subset of patches that are essential to get kexec
working on UEFI systems, in this v4 I dropped all the patches related to the
iomem resource table, /dev/mem permissions and memory attributes etc. These
topics will be revisited asap in a separate series.

The primary changes with respect to v3 in the patches that were kept are:
- instead of reording the memory map so that part of it can double as input
  argument to SetVirtualAddressMap(), increase the allocation size for the
  memory map so that we can use some of it as scratch space and use that to
  prepare the input to SVAM() instead.
- added some acks
- rebased onto v3.19-rc1

NOTE: these changes trigger an issue on AMD Seattle that we (Mark Rutland and I)
consider a firmware bug. It appears that, during the call to SVAM() (which is
called with a 1:1 mapping as per the UEFI spec) the virtual mapping being
installed is dereferenced prematurely. This went unnoticed in the original
situation, as the virtual mappings were just kernel mappings that are always
accessible. However, in the new situation, those mappings are only active
during Runtime Service invocations, and performing any kind of access through
them at any other time triggers a fault.

============== v1 blurb ==================

The main premise of these patches is that, in order to support kexec, we need
to add code to the kernel that is able to deal with the state of the firmware
after SetVirtualAddressMap() [SVAM] has been called. However, if we are going to
deal with that anyway, why not make that the default state, and have only a
single code path for both cases.

This means SVAM() needs to move to the stub, and hence the code that invents
the virtual layout needs to move with it. The result is that the kernel proper
is entered with the virt_addr members of all EFI_MEMORY_RUNTIME regions
assigned, and the mapping installed into the firmware. The kernel proper needs
to set up the page tables, and switch to them while performing the runtime
services calls. Note that there is also an efi_to_phys() to translate the values
of the fw_vendor and tables fields of the EFI system table. Again, this is
something we need to do anyway under kexec, or we end up handing over state
between one kernel and the next, which implies different code paths between
non-kexec and kexec.

The layout is chosen such that it used the userland half of the virtual address
space (TTBR0), which means no additional alignment or reservation is required
to ensure that it will always be available.

One thing that may stand out is the reordering of the memory map. The reason
for doing this is that we can use the same memory map as input to SVAM(). The
alternative is allocating memory for it using boot services, but that clutters
up the existing logic a bit between getting the memory map, populating the fdt,
and loop again if it didn't fit.

Ard Biesheuvel (8):
  arm64/mm: add explicit struct_mm argument to __create_mapping()
  arm64/mm: add create_pgd_mapping() to create private page tables
  efi: split off remapping code from efi_config_init()
  efi: efistub: allow allocation alignment larger than EFI_PAGE_SIZE
  arm64/efi: set EFI_ALLOC_ALIGN to 64 KB
  arm64/efi: move SetVirtualAddressMap() to UEFI stub
  arm64/efi: remove free_boot_services() and friends
  arm64/efi: remove idmap manipulations from UEFI code

 arch/arm64/include/asm/efi.h                   |  24 +-
 arch/arm64/include/asm/mmu.h                   |   5 +-
 arch/arm64/include/asm/pgtable.h               |   5 +
 arch/arm64/kernel/efi.c                        | 360 ++++++++-----------------
 arch/arm64/kernel/setup.c                      |   2 +-
 arch/arm64/mm/mmu.c                            |  60 ++---
 drivers/firmware/efi/efi.c                     |  49 ++--
 drivers/firmware/efi/libstub/efi-stub-helper.c |  25 +-
 drivers/firmware/efi/libstub/fdt.c             | 137 +++++++++-
 include/linux/efi.h                            |   2 +
 10 files changed, 348 insertions(+), 321 deletions(-)

-- 
1.8.3.2

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v4 1/8] arm64/mm: add explicit struct_mm argument to __create_mapping()
       [not found] ` <1419245944-2424-1-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
@ 2014-12-22 10:58   ` Ard Biesheuvel
  2014-12-22 10:58   ` [PATCH v4 2/8] arm64/mm: add create_pgd_mapping() to create private page tables Ard Biesheuvel
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2014-12-22 10:58 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Ard Biesheuvel

Currently, swapper_pg_dir and idmap_pg_dir share the init_mm mm_struct
instance. To allow the introduction of other pg_dir instances, for instance,
for UEFI's mapping of Runtime Services, make the struct_mm instance an
explicit argument that gets passed down to the pmd and pte instantiation
functions. Note that the consumers (pmd_populate/pgd_populate) of the
mm_struct argument don't actually inspect it, but let's fix it for
correctness' sake.

Acked-by: Steve Capper <steve.capper-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/mm/mmu.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 6032f3e3056a..7d5dfe2d3de0 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -156,9 +156,9 @@ static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 }
 
-static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
-				  unsigned long end, phys_addr_t phys,
-				  int map_io)
+static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
+				  unsigned long addr, unsigned long end,
+				  phys_addr_t phys, int map_io)
 {
 	pmd_t *pmd;
 	unsigned long next;
@@ -178,7 +178,7 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
 	 */
 	if (pud_none(*pud) || pud_bad(*pud)) {
 		pmd = early_alloc(PTRS_PER_PMD * sizeof(pmd_t));
-		pud_populate(&init_mm, pud, pmd);
+		pud_populate(mm, pud, pmd);
 	}
 
 	pmd = pmd_offset(pud, addr);
@@ -202,16 +202,16 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
 	} while (pmd++, addr = next, addr != end);
 }
 
-static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
-				  unsigned long end, phys_addr_t phys,
-				  int map_io)
+static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
+				  unsigned long addr, unsigned long end,
+				  phys_addr_t phys, int map_io)
 {
 	pud_t *pud;
 	unsigned long next;
 
 	if (pgd_none(*pgd)) {
 		pud = early_alloc(PTRS_PER_PUD * sizeof(pud_t));
-		pgd_populate(&init_mm, pgd, pud);
+		pgd_populate(mm, pgd, pud);
 	}
 	BUG_ON(pgd_bad(*pgd));
 
@@ -240,7 +240,7 @@ static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
 				flush_tlb_all();
 			}
 		} else {
-			alloc_init_pmd(pud, addr, next, phys, map_io);
+			alloc_init_pmd(mm, pud, addr, next, phys, map_io);
 		}
 		phys += next - addr;
 	} while (pud++, addr = next, addr != end);
@@ -250,9 +250,9 @@ static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
  * Create the page directory entries and any necessary page tables for the
  * mapping specified by 'md'.
  */
-static void __init __create_mapping(pgd_t *pgd, phys_addr_t phys,
-				    unsigned long virt, phys_addr_t size,
-				    int map_io)
+static void __init __create_mapping(struct mm_struct *mm, pgd_t *pgd,
+				    phys_addr_t phys, unsigned long virt,
+				    phys_addr_t size, int map_io)
 {
 	unsigned long addr, length, end, next;
 
@@ -262,7 +262,7 @@ static void __init __create_mapping(pgd_t *pgd, phys_addr_t phys,
 	end = addr + length;
 	do {
 		next = pgd_addr_end(addr, end);
-		alloc_init_pud(pgd, addr, next, phys, map_io);
+		alloc_init_pud(mm, pgd, addr, next, phys, map_io);
 		phys += next - addr;
 	} while (pgd++, addr = next, addr != end);
 }
@@ -275,7 +275,8 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
 			&phys, virt);
 		return;
 	}
-	__create_mapping(pgd_offset_k(virt & PAGE_MASK), phys, virt, size, 0);
+	__create_mapping(&init_mm, pgd_offset_k(virt & PAGE_MASK), phys, virt,
+			 size, 0);
 }
 
 void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
@@ -284,7 +285,7 @@ void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
 		pr_warn("BUG: not creating id mapping for %pa\n", &addr);
 		return;
 	}
-	__create_mapping(&idmap_pg_dir[pgd_index(addr)],
+	__create_mapping(&init_mm, &idmap_pg_dir[pgd_index(addr)],
 			 addr, addr, size, map_io);
 }
 
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 2/8] arm64/mm: add create_pgd_mapping() to create private page tables
       [not found] ` <1419245944-2424-1-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
  2014-12-22 10:58   ` [PATCH v4 1/8] arm64/mm: add explicit struct_mm argument to __create_mapping() Ard Biesheuvel
@ 2014-12-22 10:58   ` Ard Biesheuvel
  2014-12-22 10:58   ` [PATCH v4 3/8] efi: split off remapping code from efi_config_init() Ard Biesheuvel
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2014-12-22 10:58 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Ard Biesheuvel

For UEFI, we need to install the memory mappings used for Runtime Services
in a dedicated set of page tables. Add create_pgd_mapping(), which allows
us to allocate and install those page table entries early.

Reviewed-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/include/asm/mmu.h     |  3 +++
 arch/arm64/include/asm/pgtable.h |  5 +++++
 arch/arm64/mm/mmu.c              | 43 ++++++++++++++++++++--------------------
 3 files changed, 30 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index c2f006c48bdb..5fd40c43be80 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -33,5 +33,8 @@ extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
 extern void init_mem_pgprot(void);
 /* create an identity mapping for memory (or io if map_io is true) */
 extern void create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io);
+extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
+			       unsigned long virt, phys_addr_t size,
+			       pgprot_t prot);
 
 #endif
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index df22314f57cf..d5d8dc17e296 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -264,6 +264,11 @@ static inline pmd_t pte_pmd(pte_t pte)
 	return __pmd(pte_val(pte));
 }
 
+static inline pgprot_t mk_sect_prot(pgprot_t prot)
+{
+	return __pgprot(pgprot_val(prot) & ~PTE_TABLE_BIT);
+}
+
 /*
  * THP definitions.
  */
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 7d5dfe2d3de0..3f3d5aa4a8b1 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -158,20 +158,10 @@ static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
 
 static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 				  unsigned long addr, unsigned long end,
-				  phys_addr_t phys, int map_io)
+				  phys_addr_t phys, pgprot_t prot)
 {
 	pmd_t *pmd;
 	unsigned long next;
-	pmdval_t prot_sect;
-	pgprot_t prot_pte;
-
-	if (map_io) {
-		prot_sect = PROT_SECT_DEVICE_nGnRE;
-		prot_pte = __pgprot(PROT_DEVICE_nGnRE);
-	} else {
-		prot_sect = PROT_SECT_NORMAL_EXEC;
-		prot_pte = PAGE_KERNEL_EXEC;
-	}
 
 	/*
 	 * Check for initial section mappings in the pgd/pud and remove them.
@@ -187,7 +177,8 @@ static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 		/* try section mapping first */
 		if (((addr | next | phys) & ~SECTION_MASK) == 0) {
 			pmd_t old_pmd =*pmd;
-			set_pmd(pmd, __pmd(phys | prot_sect));
+			set_pmd(pmd, __pmd(phys |
+					   pgprot_val(mk_sect_prot(prot))));
 			/*
 			 * Check for previous table entries created during
 			 * boot (__create_page_tables) and flush them.
@@ -196,7 +187,7 @@ static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 				flush_tlb_all();
 		} else {
 			alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
-				       prot_pte);
+				       prot);
 		}
 		phys += next - addr;
 	} while (pmd++, addr = next, addr != end);
@@ -204,7 +195,7 @@ static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
 
 static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 				  unsigned long addr, unsigned long end,
-				  phys_addr_t phys, int map_io)
+				  phys_addr_t phys, pgprot_t prot)
 {
 	pud_t *pud;
 	unsigned long next;
@@ -222,10 +213,11 @@ static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 		/*
 		 * For 4K granule only, attempt to put down a 1GB block
 		 */
-		if (!map_io && (PAGE_SHIFT == 12) &&
+		if ((PAGE_SHIFT == 12) &&
 		    ((addr | next | phys) & ~PUD_MASK) == 0) {
 			pud_t old_pud = *pud;
-			set_pud(pud, __pud(phys | PROT_SECT_NORMAL_EXEC));
+			set_pud(pud, __pud(phys |
+					   pgprot_val(mk_sect_prot(prot))));
 
 			/*
 			 * If we have an old value for a pud, it will
@@ -240,7 +232,7 @@ static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
 				flush_tlb_all();
 			}
 		} else {
-			alloc_init_pmd(mm, pud, addr, next, phys, map_io);
+			alloc_init_pmd(mm, pud, addr, next, phys, prot);
 		}
 		phys += next - addr;
 	} while (pud++, addr = next, addr != end);
@@ -252,7 +244,7 @@ static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
  */
 static void __init __create_mapping(struct mm_struct *mm, pgd_t *pgd,
 				    phys_addr_t phys, unsigned long virt,
-				    phys_addr_t size, int map_io)
+				    phys_addr_t size, pgprot_t prot)
 {
 	unsigned long addr, length, end, next;
 
@@ -262,7 +254,7 @@ static void __init __create_mapping(struct mm_struct *mm, pgd_t *pgd,
 	end = addr + length;
 	do {
 		next = pgd_addr_end(addr, end);
-		alloc_init_pud(mm, pgd, addr, next, phys, map_io);
+		alloc_init_pud(mm, pgd, addr, next, phys, prot);
 		phys += next - addr;
 	} while (pgd++, addr = next, addr != end);
 }
@@ -276,7 +268,7 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
 		return;
 	}
 	__create_mapping(&init_mm, pgd_offset_k(virt & PAGE_MASK), phys, virt,
-			 size, 0);
+			 size, PAGE_KERNEL_EXEC);
 }
 
 void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
@@ -286,7 +278,16 @@ void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
 		return;
 	}
 	__create_mapping(&init_mm, &idmap_pg_dir[pgd_index(addr)],
-			 addr, addr, size, map_io);
+			 addr, addr, size,
+			 map_io ? __pgprot(PROT_DEVICE_nGnRE)
+				: PAGE_KERNEL_EXEC);
+}
+
+void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
+			       unsigned long virt, phys_addr_t size,
+			       pgprot_t prot)
+{
+	__create_mapping(mm, pgd_offset(mm, virt), phys, virt, size, prot);
 }
 
 static void __init map_mem(void)
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 3/8] efi: split off remapping code from efi_config_init()
       [not found] ` <1419245944-2424-1-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
  2014-12-22 10:58   ` [PATCH v4 1/8] arm64/mm: add explicit struct_mm argument to __create_mapping() Ard Biesheuvel
  2014-12-22 10:58   ` [PATCH v4 2/8] arm64/mm: add create_pgd_mapping() to create private page tables Ard Biesheuvel
@ 2014-12-22 10:58   ` Ard Biesheuvel
       [not found]     ` <1419245944-2424-4-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
  2014-12-22 10:59   ` [PATCH v4 4/8] efi: efistub: allow allocation alignment larger than EFI_PAGE_SIZE Ard Biesheuvel
                     ` (4 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Ard Biesheuvel @ 2014-12-22 10:58 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Ard Biesheuvel

Split of the remapping code from efi_config_init() so that the caller
can perform its own remapping. This is necessary to correctly handle
virtually remapped UEFI memory regions under kexec, as efi.systab will
have been updated to a virtual address.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 drivers/firmware/efi/efi.c | 49 +++++++++++++++++++++++++++++-----------------
 include/linux/efi.h        |  2 ++
 2 files changed, 33 insertions(+), 18 deletions(-)

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 9035c1b74d58..0de77db4fb88 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -293,9 +293,10 @@ static __init int match_config_table(efi_guid_t *guid,
 	return 0;
 }
 
-int __init efi_config_init(efi_config_table_type_t *arch_tables)
+int __init efi_config_parse_tables(void *config_tables, int count,
+				   efi_config_table_type_t *arch_tables)
 {
-	void *config_tables, *tablep;
+	void *tablep;
 	int i, sz;
 
 	if (efi_enabled(EFI_64BIT))
@@ -303,19 +304,9 @@ int __init efi_config_init(efi_config_table_type_t *arch_tables)
 	else
 		sz = sizeof(efi_config_table_32_t);
 
-	/*
-	 * Let's see what config tables the firmware passed to us.
-	 */
-	config_tables = early_memremap(efi.systab->tables,
-				       efi.systab->nr_tables * sz);
-	if (config_tables == NULL) {
-		pr_err("Could not map Configuration table!\n");
-		return -ENOMEM;
-	}
-
 	tablep = config_tables;
 	pr_info("");
-	for (i = 0; i < efi.systab->nr_tables; i++) {
+	for (i = 0; i < count; i++) {
 		efi_guid_t guid;
 		unsigned long table;
 
@@ -328,8 +319,6 @@ int __init efi_config_init(efi_config_table_type_t *arch_tables)
 			if (table64 >> 32) {
 				pr_cont("\n");
 				pr_err("Table located above 4GB, disabling EFI.\n");
-				early_memunmap(config_tables,
-					       efi.systab->nr_tables * sz);
 				return -EINVAL;
 			}
 #endif
@@ -344,13 +333,37 @@ int __init efi_config_init(efi_config_table_type_t *arch_tables)
 		tablep += sz;
 	}
 	pr_cont("\n");
-	early_memunmap(config_tables, efi.systab->nr_tables * sz);
-
 	set_bit(EFI_CONFIG_TABLES, &efi.flags);
-
 	return 0;
 }
 
+int __init efi_config_init(efi_config_table_type_t *arch_tables)
+{
+	void *config_tables;
+	int sz, ret;
+
+	if (efi_enabled(EFI_64BIT))
+		sz = sizeof(efi_config_table_64_t);
+	else
+		sz = sizeof(efi_config_table_32_t);
+
+	/*
+	 * Let's see what config tables the firmware passed to us.
+	 */
+	config_tables = early_memremap(efi.systab->tables,
+				       efi.systab->nr_tables * sz);
+	if (config_tables == NULL) {
+		pr_err("Could not map Configuration table!\n");
+		return -ENOMEM;
+	}
+
+	ret = efi_config_parse_tables(config_tables, efi.systab->nr_tables,
+				      arch_tables);
+
+	early_memunmap(config_tables, efi.systab->nr_tables * sz);
+	return ret;
+}
+
 #ifdef CONFIG_EFI_VARS_MODULE
 static int __init efi_load_efivars(void)
 {
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 0238d612750e..2dc8577032b3 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -875,6 +875,8 @@ static inline efi_status_t efi_query_variable_store(u32 attributes, unsigned lon
 #endif
 extern void __iomem *efi_lookup_mapped_addr(u64 phys_addr);
 extern int efi_config_init(efi_config_table_type_t *arch_tables);
+extern int efi_config_parse_tables(void *config_tables, int count,
+				   efi_config_table_type_t *arch_tables);
 extern u64 efi_get_iobase (void);
 extern u32 efi_mem_type (unsigned long phys_addr);
 extern u64 efi_mem_attributes (unsigned long phys_addr);
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 4/8] efi: efistub: allow allocation alignment larger than EFI_PAGE_SIZE
       [not found] ` <1419245944-2424-1-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
                     ` (2 preceding siblings ...)
  2014-12-22 10:58   ` [PATCH v4 3/8] efi: split off remapping code from efi_config_init() Ard Biesheuvel
@ 2014-12-22 10:59   ` Ard Biesheuvel
       [not found]     ` <1419245944-2424-5-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
  2014-12-22 10:59   ` [PATCH v4 5/8] arm64/efi: set EFI_ALLOC_ALIGN to 64 KB Ard Biesheuvel
                     ` (3 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Ard Biesheuvel @ 2014-12-22 10:59 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Ard Biesheuvel

On systems with 64 KB pages, it is preferable for UEFI memory map
entries to be 64 KB aligned multiples of 64 KB, because it relieves
us of having to deal with the residues.
So, if EFI_ALLOC_ALIGN is #define'd by the platform, use it to round
up all memory allocations made.

Acked-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 drivers/firmware/efi/libstub/efi-stub-helper.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
index a920fec8fe88..e766df60fbfb 100644
--- a/drivers/firmware/efi/libstub/efi-stub-helper.c
+++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
@@ -32,6 +32,15 @@
 
 static unsigned long __chunk_size = EFI_READ_CHUNK_SIZE;
 
+/*
+ * Allow the platform to override the allocation granularity: this allows
+ * systems that have the capability to run with a larger page size to deal
+ * with the allocations for initrd and fdt more efficiently.
+ */
+#ifndef EFI_ALLOC_ALIGN
+#define EFI_ALLOC_ALIGN		EFI_PAGE_SIZE
+#endif
+
 struct file_info {
 	efi_file_handle_t *handle;
 	u64 size;
@@ -150,10 +159,10 @@ efi_status_t efi_high_alloc(efi_system_table_t *sys_table_arg,
 	 * a specific address.  We are doing page-based allocations,
 	 * so we must be aligned to a page.
 	 */
-	if (align < EFI_PAGE_SIZE)
-		align = EFI_PAGE_SIZE;
+	if (align < EFI_ALLOC_ALIGN)
+		align = EFI_ALLOC_ALIGN;
 
-	nr_pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
+	nr_pages = round_up(size, EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE;
 again:
 	for (i = 0; i < map_size / desc_size; i++) {
 		efi_memory_desc_t *desc;
@@ -235,10 +244,10 @@ efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
 	 * a specific address.  We are doing page-based allocations,
 	 * so we must be aligned to a page.
 	 */
-	if (align < EFI_PAGE_SIZE)
-		align = EFI_PAGE_SIZE;
+	if (align < EFI_ALLOC_ALIGN)
+		align = EFI_ALLOC_ALIGN;
 
-	nr_pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
+	nr_pages = round_up(size, EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE;
 	for (i = 0; i < map_size / desc_size; i++) {
 		efi_memory_desc_t *desc;
 		unsigned long m = (unsigned long)map;
@@ -292,7 +301,7 @@ void efi_free(efi_system_table_t *sys_table_arg, unsigned long size,
 	if (!size)
 		return;
 
-	nr_pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
+	nr_pages = round_up(size, EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE;
 	efi_call_early(free_pages, addr, nr_pages);
 }
 
@@ -561,7 +570,7 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
 	 * to the preferred address.  If that fails, allocate as low
 	 * as possible while respecting the required alignment.
 	 */
-	nr_pages = round_up(alloc_size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
+	nr_pages = round_up(alloc_size, EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE;
 	status = efi_call_early(allocate_pages,
 				EFI_ALLOCATE_ADDRESS, EFI_LOADER_DATA,
 				nr_pages, &efi_addr);
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 5/8] arm64/efi: set EFI_ALLOC_ALIGN to 64 KB
       [not found] ` <1419245944-2424-1-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
                     ` (3 preceding siblings ...)
  2014-12-22 10:59   ` [PATCH v4 4/8] efi: efistub: allow allocation alignment larger than EFI_PAGE_SIZE Ard Biesheuvel
@ 2014-12-22 10:59   ` Ard Biesheuvel
  2015-01-06 16:37     ` Leif Lindholm
  2014-12-22 10:59   ` [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub Ard Biesheuvel
                     ` (2 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Ard Biesheuvel @ 2014-12-22 10:59 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Ard Biesheuvel

Set EFI_ALLOC_ALIGN to 64 KB so that all allocations done by the stub
are naturally compatible with a 64 KB granule kernel.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/include/asm/efi.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index a34fd3b12e2b..71291253114f 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -44,4 +44,6 @@ extern void efi_idmap_init(void);
 
 #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__)
 
+#define EFI_ALLOC_ALIGN		SZ_64K
+
 #endif /* _ASM_EFI_H */
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
       [not found] ` <1419245944-2424-1-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
                     ` (4 preceding siblings ...)
  2014-12-22 10:59   ` [PATCH v4 5/8] arm64/efi: set EFI_ALLOC_ALIGN to 64 KB Ard Biesheuvel
@ 2014-12-22 10:59   ` Ard Biesheuvel
       [not found]     ` <1419245944-2424-7-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
  2014-12-22 10:59   ` [PATCH v4 7/8] arm64/efi: remove free_boot_services() and friends Ard Biesheuvel
  2014-12-22 10:59   ` [PATCH v4 8/8] arm64/efi: remove idmap manipulations from UEFI code Ard Biesheuvel
  7 siblings, 1 reply; 30+ messages in thread
From: Ard Biesheuvel @ 2014-12-22 10:59 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Ard Biesheuvel

In order to support kexec, the kernel needs to be able to deal with the
state of the UEFI firmware after SetVirtualAddressMap() has been called.
To avoid having separate code paths for non-kexec and kexec, let's move
the call to SetVirtualAddressMap() to the stub: this will guarantee us
that it will only be called once (since the stub is not executed during
kexec), and ensures that the UEFI state is identical between kexec and
normal boot.

This implies that the layout of the virtual mapping needs to be created
by the stub as well. All regions are rounded up to a naturally aligned
multiple of 64 KB (for compatibility with 64k pages kernels) and recorded
in the UEFI memory map. The kernel proper reads those values and installs
the mappings in a dedicated set of page tables that are swapped in during
UEFI Runtime Services calls.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/include/asm/efi.h       |  20 +++-
 arch/arm64/kernel/efi.c            | 223 ++++++++++++++++++++-----------------
 arch/arm64/kernel/setup.c          |   1 +
 drivers/firmware/efi/libstub/fdt.c | 137 ++++++++++++++++++++++-
 4 files changed, 270 insertions(+), 111 deletions(-)

diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 71291253114f..6cc668a378c5 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -7,28 +7,36 @@
 #ifdef CONFIG_EFI
 extern void efi_init(void);
 extern void efi_idmap_init(void);
+extern void efi_virtmap_init(void);
 #else
 #define efi_init()
 #define efi_idmap_init()
+#define efi_virtmap_init
 #endif
 
 #define efi_call_virt(f, ...)						\
 ({									\
-	efi_##f##_t *__f = efi.systab->runtime->f;			\
+	efi_##f##_t *__f;						\
 	efi_status_t __s;						\
 									\
-	kernel_neon_begin();						\
+	kernel_neon_begin();	/* disables preemption */		\
+	efi_virtmap_load();						\
+	__f = efi.systab->runtime->f;					\
 	__s = __f(__VA_ARGS__);						\
+	efi_virtmap_unload();						\
 	kernel_neon_end();						\
 	__s;								\
 })
 
 #define __efi_call_virt(f, ...)						\
 ({									\
-	efi_##f##_t *__f = efi.systab->runtime->f;			\
+	efi_##f##_t *__f;						\
 									\
-	kernel_neon_begin();						\
+	kernel_neon_begin();	/* disables preemption */		\
+	efi_virtmap_load();						\
+	__f = efi.systab->runtime->f;					\
 	__f(__VA_ARGS__);						\
+	efi_virtmap_unload();						\
 	kernel_neon_end();						\
 })
 
@@ -45,5 +53,9 @@ extern void efi_idmap_init(void);
 #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__)
 
 #define EFI_ALLOC_ALIGN		SZ_64K
+#define EFI_VIRTMAP		EFI_ARCH_1
+
+void efi_virtmap_load(void);
+void efi_virtmap_unload(void);
 
 #endif /* _ASM_EFI_H */
diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 6fac253bc783..2ebe67ffb629 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -11,25 +11,30 @@
  *
  */
 
+#include <linux/atomic.h>
 #include <linux/dmi.h>
 #include <linux/efi.h>
 #include <linux/export.h>
 #include <linux/memblock.h>
+#include <linux/mm_types.h>
 #include <linux/bootmem.h>
 #include <linux/of.h>
 #include <linux/of_fdt.h>
+#include <linux/rbtree.h>
+#include <linux/rwsem.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/spinlock.h>
 
 #include <asm/cacheflush.h>
 #include <asm/efi.h>
 #include <asm/tlbflush.h>
 #include <asm/mmu_context.h>
+#include <asm/mmu.h>
+#include <asm/pgtable.h>
 
 struct efi_memory_map memmap;
 
-static efi_runtime_services_t *runtime;
-
 static u64 efi_system_table;
 
 static int uefi_debug __initdata;
@@ -69,9 +74,33 @@ static void __init efi_setup_idmap(void)
 	}
 }
 
+/*
+ * Translate a EFI virtual address into a physical address: this is necessary,
+ * as some data members of the EFI system table are virtually remapped after
+ * SetVirtualAddressMap() has been called.
+ */
+static phys_addr_t efi_to_phys(unsigned long addr)
+{
+	efi_memory_desc_t *md;
+
+	for_each_efi_memory_desc(&memmap, md) {
+		if (!(md->attribute & EFI_MEMORY_RUNTIME))
+			continue;
+		if (md->virt_addr == 0)
+			/* no virtual mapping has been installed by the stub */
+			break;
+		if (md->virt_addr <= addr &&
+		    (addr - md->virt_addr) < (md->num_pages << EFI_PAGE_SHIFT))
+			return md->phys_addr + addr - md->virt_addr;
+	}
+	return addr;
+}
+
 static int __init uefi_init(void)
 {
 	efi_char16_t *c16;
+	void *config_tables;
+	u64 table_size;
 	char vendor[100] = "unknown";
 	int i, retval;
 
@@ -99,7 +128,7 @@ static int __init uefi_init(void)
 			efi.systab->hdr.revision & 0xffff);
 
 	/* Show what we know for posterity */
-	c16 = early_memremap(efi.systab->fw_vendor,
+	c16 = early_memremap(efi_to_phys(efi.systab->fw_vendor),
 			     sizeof(vendor));
 	if (c16) {
 		for (i = 0; i < (int) sizeof(vendor) - 1 && *c16; ++i)
@@ -112,8 +141,14 @@ static int __init uefi_init(void)
 		efi.systab->hdr.revision >> 16,
 		efi.systab->hdr.revision & 0xffff, vendor);
 
-	retval = efi_config_init(NULL);
+	table_size = sizeof(efi_config_table_64_t) * efi.systab->nr_tables;
+	config_tables = early_memremap(efi_to_phys(efi.systab->tables),
+				       table_size);
+
+	retval = efi_config_parse_tables(config_tables,
+					 efi.systab->nr_tables, NULL);
 
+	early_memunmap(config_tables, table_size);
 out:
 	early_memunmap(efi.systab,  sizeof(efi_system_table_t));
 	return retval;
@@ -328,51 +363,9 @@ void __init efi_idmap_init(void)
 	efi_setup_idmap();
 }
 
-static int __init remap_region(efi_memory_desc_t *md, void **new)
-{
-	u64 paddr, vaddr, npages, size;
-
-	paddr = md->phys_addr;
-	npages = md->num_pages;
-	memrange_efi_to_native(&paddr, &npages);
-	size = npages << PAGE_SHIFT;
-
-	if (is_normal_ram(md))
-		vaddr = (__force u64)ioremap_cache(paddr, size);
-	else
-		vaddr = (__force u64)ioremap(paddr, size);
-
-	if (!vaddr) {
-		pr_err("Unable to remap 0x%llx pages @ %p\n",
-		       npages, (void *)paddr);
-		return 0;
-	}
-
-	/* adjust for any rounding when EFI and system pagesize differs */
-	md->virt_addr = vaddr + (md->phys_addr - paddr);
-
-	if (uefi_debug)
-		pr_info("  EFI remap 0x%012llx => %p\n",
-			md->phys_addr, (void *)md->virt_addr);
-
-	memcpy(*new, md, memmap.desc_size);
-	*new += memmap.desc_size;
-
-	return 1;
-}
-
-/*
- * Switch UEFI from an identity map to a kernel virtual map
- */
 static int __init arm64_enter_virtual_mode(void)
 {
-	efi_memory_desc_t *md;
-	phys_addr_t virtmap_phys;
-	void *virtmap, *virt_md;
-	efi_status_t status;
 	u64 mapsize;
-	int count = 0;
-	unsigned long flags;
 
 	if (!efi_enabled(EFI_BOOT)) {
 		pr_info("EFI services will not be available.\n");
@@ -395,79 +388,28 @@ static int __init arm64_enter_virtual_mode(void)
 
 	efi.memmap = &memmap;
 
-	/* Map the runtime regions */
-	virtmap = kmalloc(mapsize, GFP_KERNEL);
-	if (!virtmap) {
-		pr_err("Failed to allocate EFI virtual memmap\n");
-		return -1;
-	}
-	virtmap_phys = virt_to_phys(virtmap);
-	virt_md = virtmap;
-
-	for_each_efi_memory_desc(&memmap, md) {
-		if (!(md->attribute & EFI_MEMORY_RUNTIME))
-			continue;
-		if (!remap_region(md, &virt_md))
-			goto err_unmap;
-		++count;
-	}
-
-	efi.systab = (__force void *)efi_lookup_mapped_addr(efi_system_table);
+	efi.systab = (__force void *)ioremap_cache(efi_system_table,
+						   sizeof(efi_system_table_t));
 	if (!efi.systab) {
-		/*
-		 * If we have no virtual mapping for the System Table at this
-		 * point, the memory map doesn't cover the physical offset where
-		 * it resides. This means the System Table will be inaccessible
-		 * to Runtime Services themselves once the virtual mapping is
-		 * installed.
-		 */
-		pr_err("Failed to remap EFI System Table -- buggy firmware?\n");
-		goto err_unmap;
+		pr_err("Failed to remap EFI System Table\n");
+		return -1;
 	}
 	set_bit(EFI_SYSTEM_TABLES, &efi.flags);
 
-	local_irq_save(flags);
-	cpu_switch_mm(idmap_pg_dir, &init_mm);
-
-	/* Call SetVirtualAddressMap with the physical address of the map */
-	runtime = efi.systab->runtime;
-	efi.set_virtual_address_map = runtime->set_virtual_address_map;
-
-	status = efi.set_virtual_address_map(count * memmap.desc_size,
-					     memmap.desc_size,
-					     memmap.desc_version,
-					     (efi_memory_desc_t *)virtmap_phys);
-	cpu_set_reserved_ttbr0();
-	flush_tlb_all();
-	local_irq_restore(flags);
-
-	kfree(virtmap);
-
 	free_boot_services();
 
-	if (status != EFI_SUCCESS) {
-		pr_err("Failed to set EFI virtual address map! [%lx]\n",
-			status);
+	if (!efi_enabled(EFI_VIRTMAP)) {
+		pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
 		return -1;
 	}
 
 	/* Set up runtime services function pointers */
-	runtime = efi.systab->runtime;
 	efi_native_runtime_setup();
 	set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
 
 	efi.runtime_version = efi.systab->hdr.revision;
 
 	return 0;
-
-err_unmap:
-	/* unmap all mappings that succeeded: there are 'count' of those */
-	for (virt_md = virtmap; count--; virt_md += memmap.desc_size) {
-		md = virt_md;
-		iounmap((__force void __iomem *)md->virt_addr);
-	}
-	kfree(virtmap);
-	return -1;
 }
 early_initcall(arm64_enter_virtual_mode);
 
@@ -484,3 +426,78 @@ static int __init arm64_dmi_init(void)
 	return 0;
 }
 core_initcall(arm64_dmi_init);
+
+static pgd_t efi_pgd[PTRS_PER_PGD] __page_aligned_bss;
+
+static struct mm_struct efi_mm = {
+	.mm_rb			= RB_ROOT,
+	.pgd			= efi_pgd,
+	.mm_users		= ATOMIC_INIT(2),
+	.mm_count		= ATOMIC_INIT(1),
+	.mmap_sem		= __RWSEM_INITIALIZER(efi_mm.mmap_sem),
+	.page_table_lock	= __SPIN_LOCK_UNLOCKED(efi_mm.page_table_lock),
+	.mmlist			= LIST_HEAD_INIT(efi_mm.mmlist),
+	INIT_MM_CONTEXT(efi_mm)
+};
+
+static void efi_set_pgd(struct mm_struct *mm)
+{
+	cpu_switch_mm(mm->pgd, mm);
+	flush_tlb_all();
+	if (icache_is_aivivt())
+		__flush_icache_all();
+}
+
+void efi_virtmap_load(void)
+{
+	WARN_ON(preemptible());
+	efi_set_pgd(&efi_mm);
+}
+
+void efi_virtmap_unload(void)
+{
+	efi_set_pgd(current->active_mm);
+}
+
+void __init efi_virtmap_init(void)
+{
+	efi_memory_desc_t *md;
+
+	if (!efi_enabled(EFI_BOOT))
+		return;
+
+	for_each_efi_memory_desc(&memmap, md) {
+		u64 paddr, npages, size;
+		pgprot_t prot;
+
+		if (!(md->attribute & EFI_MEMORY_RUNTIME))
+			continue;
+		if (WARN(md->virt_addr == 0,
+			 "UEFI virtual mapping incomplete or missing -- no entry found for 0x%llx\n",
+			 md->phys_addr))
+			return;
+
+		paddr = md->phys_addr;
+		npages = md->num_pages;
+		memrange_efi_to_native(&paddr, &npages);
+		size = npages << PAGE_SHIFT;
+
+		pr_info("  EFI remap 0x%012llx => %p\n",
+			md->phys_addr, (void *)md->virt_addr);
+
+		/*
+		 * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
+		 * executable, everything else can be mapped with the XN bits
+		 * set.
+		 */
+		if (!is_normal_ram(md))
+			prot = __pgprot(PROT_DEVICE_nGnRE);
+		else if (md->type == EFI_RUNTIME_SERVICES_CODE)
+			prot = PAGE_KERNEL_EXEC;
+		else
+			prot = PAGE_KERNEL;
+
+		create_pgd_mapping(&efi_mm, paddr, md->virt_addr, size, prot);
+	}
+	set_bit(EFI_VIRTMAP, &efi.flags);
+}
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index b80991166754..d8390f507da0 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -402,6 +402,7 @@ void __init setup_arch(char **cmdline_p)
 	request_standard_resources();
 
 	efi_idmap_init();
+	efi_virtmap_init();
 
 	unflatten_device_tree();
 
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index c846a9608cbd..76bc8abf41d1 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -167,6 +167,94 @@ fdt_set_fail:
 #define EFI_FDT_ALIGN EFI_PAGE_SIZE
 #endif
 
+static efi_status_t get_memory_map(efi_system_table_t *sys_table_arg,
+				   efi_memory_desc_t **map,
+				   unsigned long *map_size,
+				   unsigned long *desc_size,
+				   u32 *desc_ver, unsigned long *key_ptr)
+{
+	efi_status_t status;
+
+	/*
+	 * Call get_memory_map() with 0 size to retrieve the size of the
+	 * required allocation.
+	 */
+	*map_size = 0;
+	status = efi_call_early(get_memory_map, map_size, NULL,
+				key_ptr, desc_size, desc_ver);
+	if (status != EFI_BUFFER_TOO_SMALL)
+		return EFI_LOAD_ERROR;
+
+	/*
+	 * Add an additional efi_memory_desc_t to map_size because we're doing
+	 * an allocation which may be in a new descriptor region. Then double it
+	 * to give us some scratch space to prepare the input virtmap to give
+	 * to SetVirtualAddressMap(). Note that this is EFI_LOADER_DATA memory,
+	 * and the kernel memblock_reserve()'s only the size of the actual
+	 * memory map, so the scratch space is freed again automatically.
+	 */
+	*map_size += *desc_size;
+	status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
+				*map_size * 2, (void **)map);
+	if (status != EFI_SUCCESS)
+		return status;
+
+	status = efi_call_early(get_memory_map, map_size, *map,
+				key_ptr, desc_size, desc_ver);
+	if (status != EFI_SUCCESS)
+		efi_call_early(free_pool, *map);
+	return status;
+}
+
+/*
+ * This is the base address at which to start allocating virtual memory ranges
+ * for UEFI Runtime Services. This is a userland range so that we can use any
+ * allocation we choose, and eliminate the risk of a conflict after kexec.
+ */
+#define EFI_RT_VIRTUAL_BASE	0x40000000
+
+static void update_memory_map(efi_memory_desc_t *memory_map,
+			      unsigned long map_size, unsigned long desc_size,
+			      int *count)
+{
+	u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
+	efi_memory_desc_t *out = (void *)memory_map + map_size;
+	int l;
+
+	for (l = 0; l < map_size; l += desc_size) {
+		efi_memory_desc_t *in = (void *)memory_map + l;
+		u64 paddr, size;
+
+		if (!(in->attribute & EFI_MEMORY_RUNTIME))
+			continue;
+
+		/*
+		 * Make the mapping compatible with 64k pages: this allows
+		 * a 4k page size kernel to kexec a 64k page size kernel and
+		 * vice versa.
+		 */
+		paddr = round_down(in->phys_addr, SZ_64K);
+		size = round_up(in->num_pages * EFI_PAGE_SIZE +
+				in->phys_addr - paddr, SZ_64K);
+
+		/*
+		 * Avoid wasting memory on PTEs by choosing a virtual base that
+		 * is compatible with section mappings if this region has the
+		 * appropriate size and physical alignment. (Sections are 2 MB
+		 * on 4k granule kernels)
+		 */
+		if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
+			efi_virt_base = round_up(efi_virt_base, SZ_2M);
+
+		in->virt_addr = efi_virt_base + in->phys_addr - paddr;
+		efi_virt_base += size;
+
+		memcpy(out, in, desc_size);
+		out = (void *)out + desc_size;
+		++*count;
+	}
+}
+
 /*
  * Allocate memory for a new FDT, then add EFI, commandline, and
  * initrd related fields to the FDT.  This routine increases the
@@ -196,6 +284,7 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 	efi_memory_desc_t *memory_map;
 	unsigned long new_fdt_size;
 	efi_status_t status;
+	int runtime_entry_count = 0;
 
 	/*
 	 * Estimate size of new FDT, and allocate memory for it. We
@@ -216,8 +305,8 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 		 * we can get the memory map key  needed for
 		 * exit_boot_services().
 		 */
-		status = efi_get_memory_map(sys_table, &memory_map, &map_size,
-					    &desc_size, &desc_ver, &mmap_key);
+		status = get_memory_map(sys_table, &memory_map, &map_size,
+					&desc_size, &desc_ver, &mmap_key);
 		if (status != EFI_SUCCESS)
 			goto fail_free_new_fdt;
 
@@ -248,12 +337,52 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 		}
 	}
 
+	/*
+	 * Update the memory map with virtual addresses. The function will also
+	 * populate the spare second half of the memory_map allocation with
+	 * copies of just the EFI_MEMORY_RUNTIME entries so that we can pass it
+	 * straight into SetVirtualAddressMap()
+	 */
+	update_memory_map(memory_map, map_size, desc_size,
+			  &runtime_entry_count);
+
+	pr_efi(sys_table,
+	       "Exiting boot services and installing virtual address map...\n");
+
 	/* Now we are ready to exit_boot_services.*/
 	status = sys_table->boottime->exit_boot_services(handle, mmap_key);
 
+	if (status == EFI_SUCCESS) {
+		efi_set_virtual_address_map_t *svam;
 
-	if (status == EFI_SUCCESS)
-		return status;
+		/* Install the new virtual address map */
+		svam = sys_table->runtime->set_virtual_address_map;
+		status = svam(runtime_entry_count * desc_size, desc_size,
+			      desc_ver, (void *)memory_map + map_size);
+
+		/*
+		 * We are beyond the point of no return here, so if the call to
+		 * SetVirtualAddressMap() failed, we need to signal that to the
+		 * incoming kernel but proceed normally otherwise.
+		 */
+		if (status != EFI_SUCCESS) {
+			int l;
+
+			/*
+			 * Set the virtual address field of all
+			 * EFI_MEMORY_RUNTIME entries to 0. This will signal
+			 * the incoming kernel that no virtual translation has
+			 * been installed.
+			 */
+			for (l = 0; l < map_size; l += desc_size) {
+				efi_memory_desc_t *p = (void *)memory_map + l;
+
+				if (p->attribute & EFI_MEMORY_RUNTIME)
+					p->virt_addr = 0;
+			}
+		}
+		return EFI_SUCCESS;
+	}
 
 	pr_efi_err(sys_table, "Exit boot services failed.\n");
 
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 7/8] arm64/efi: remove free_boot_services() and friends
       [not found] ` <1419245944-2424-1-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
                     ` (5 preceding siblings ...)
  2014-12-22 10:59   ` [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub Ard Biesheuvel
@ 2014-12-22 10:59   ` Ard Biesheuvel
       [not found]     ` <1419245944-2424-8-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
  2014-12-22 10:59   ` [PATCH v4 8/8] arm64/efi: remove idmap manipulations from UEFI code Ard Biesheuvel
  7 siblings, 1 reply; 30+ messages in thread
From: Ard Biesheuvel @ 2014-12-22 10:59 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Ard Biesheuvel

Now that we are calling SetVirtualAddressMap() from the stub, there is no
need to reserve boot-only memory regions, which implies that there is also
no reason to free them again later.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/kernel/efi.c | 123 +-----------------------------------------------
 1 file changed, 1 insertion(+), 122 deletions(-)

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 2ebe67ffb629..e969c5076713 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -198,9 +198,7 @@ static __init void reserve_regions(void)
 		if (is_normal_ram(md))
 			early_init_dt_add_memory_arch(paddr, size);
 
-		if (is_reserve_region(md) ||
-		    md->type == EFI_BOOT_SERVICES_CODE ||
-		    md->type == EFI_BOOT_SERVICES_DATA) {
+		if (is_reserve_region(md)) {
 			memblock_reserve(paddr, size);
 			if (uefi_debug)
 				pr_cont("*");
@@ -213,123 +211,6 @@ static __init void reserve_regions(void)
 	set_bit(EFI_MEMMAP, &efi.flags);
 }
 
-
-static u64 __init free_one_region(u64 start, u64 end)
-{
-	u64 size = end - start;
-
-	if (uefi_debug)
-		pr_info("  EFI freeing: 0x%012llx-0x%012llx\n",	start, end - 1);
-
-	free_bootmem_late(start, size);
-	return size;
-}
-
-static u64 __init free_region(u64 start, u64 end)
-{
-	u64 map_start, map_end, total = 0;
-
-	if (end <= start)
-		return total;
-
-	map_start = (u64)memmap.phys_map;
-	map_end = PAGE_ALIGN(map_start + (memmap.map_end - memmap.map));
-	map_start &= PAGE_MASK;
-
-	if (start < map_end && end > map_start) {
-		/* region overlaps UEFI memmap */
-		if (start < map_start)
-			total += free_one_region(start, map_start);
-
-		if (map_end < end)
-			total += free_one_region(map_end, end);
-	} else
-		total += free_one_region(start, end);
-
-	return total;
-}
-
-static void __init free_boot_services(void)
-{
-	u64 total_freed = 0;
-	u64 keep_end, free_start, free_end;
-	efi_memory_desc_t *md;
-
-	/*
-	 * If kernel uses larger pages than UEFI, we have to be careful
-	 * not to inadvertantly free memory we want to keep if there is
-	 * overlap at the kernel page size alignment. We do not want to
-	 * free is_reserve_region() memory nor the UEFI memmap itself.
-	 *
-	 * The memory map is sorted, so we keep track of the end of
-	 * any previous region we want to keep, remember any region
-	 * we want to free and defer freeing it until we encounter
-	 * the next region we want to keep. This way, before freeing
-	 * it, we can clip it as needed to avoid freeing memory we
-	 * want to keep for UEFI.
-	 */
-
-	keep_end = 0;
-	free_start = 0;
-
-	for_each_efi_memory_desc(&memmap, md) {
-		u64 paddr, npages, size;
-
-		if (is_reserve_region(md)) {
-			/*
-			 * We don't want to free any memory from this region.
-			 */
-			if (free_start) {
-				/* adjust free_end then free region */
-				if (free_end > md->phys_addr)
-					free_end -= PAGE_SIZE;
-				total_freed += free_region(free_start, free_end);
-				free_start = 0;
-			}
-			keep_end = md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT);
-			continue;
-		}
-
-		if (md->type != EFI_BOOT_SERVICES_CODE &&
-		    md->type != EFI_BOOT_SERVICES_DATA) {
-			/* no need to free this region */
-			continue;
-		}
-
-		/*
-		 * We want to free memory from this region.
-		 */
-		paddr = md->phys_addr;
-		npages = md->num_pages;
-		memrange_efi_to_native(&paddr, &npages);
-		size = npages << PAGE_SHIFT;
-
-		if (free_start) {
-			if (paddr <= free_end)
-				free_end = paddr + size;
-			else {
-				total_freed += free_region(free_start, free_end);
-				free_start = paddr;
-				free_end = paddr + size;
-			}
-		} else {
-			free_start = paddr;
-			free_end = paddr + size;
-		}
-		if (free_start < keep_end) {
-			free_start += PAGE_SIZE;
-			if (free_start >= free_end)
-				free_start = 0;
-		}
-	}
-	if (free_start)
-		total_freed += free_region(free_start, free_end);
-
-	if (total_freed)
-		pr_info("Freed 0x%llx bytes of EFI boot services memory",
-			total_freed);
-}
-
 void __init efi_init(void)
 {
 	struct efi_fdt_params params;
@@ -396,8 +277,6 @@ static int __init arm64_enter_virtual_mode(void)
 	}
 	set_bit(EFI_SYSTEM_TABLES, &efi.flags);
 
-	free_boot_services();
-
 	if (!efi_enabled(EFI_VIRTMAP)) {
 		pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
 		return -1;
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 8/8] arm64/efi: remove idmap manipulations from UEFI code
       [not found] ` <1419245944-2424-1-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
                     ` (6 preceding siblings ...)
  2014-12-22 10:59   ` [PATCH v4 7/8] arm64/efi: remove free_boot_services() and friends Ard Biesheuvel
@ 2014-12-22 10:59   ` Ard Biesheuvel
       [not found]     ` <1419245944-2424-9-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
  7 siblings, 1 reply; 30+ messages in thread
From: Ard Biesheuvel @ 2014-12-22 10:59 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Ard Biesheuvel

Now that we have moved the call to SetVirtualAddressMap() to the stub,
UEFI has no use for the ID map, so we can drop the code that installs
ID mappings for UEFI memory regions.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/include/asm/efi.h |  2 --
 arch/arm64/include/asm/mmu.h |  2 --
 arch/arm64/kernel/efi.c      | 30 ------------------------------
 arch/arm64/kernel/setup.c    |  1 -
 arch/arm64/mm/mmu.c          | 12 ------------
 5 files changed, 47 deletions(-)

diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 6cc668a378c5..84c50853b5dc 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -6,11 +6,9 @@
 
 #ifdef CONFIG_EFI
 extern void efi_init(void);
-extern void efi_idmap_init(void);
 extern void efi_virtmap_init(void);
 #else
 #define efi_init()
-#define efi_idmap_init()
 #define efi_virtmap_init
 #endif
 
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 5fd40c43be80..3d311761e3c2 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -31,8 +31,6 @@ extern void paging_init(void);
 extern void setup_mm_for_reboot(void);
 extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
 extern void init_mem_pgprot(void);
-/* create an identity mapping for memory (or io if map_io is true) */
-extern void create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io);
 extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 			       unsigned long virt, phys_addr_t size,
 			       pgprot_t prot);
diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index e969c5076713..d7d2e818c856 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -53,27 +53,6 @@ static int __init is_normal_ram(efi_memory_desc_t *md)
 	return 0;
 }
 
-static void __init efi_setup_idmap(void)
-{
-	struct memblock_region *r;
-	efi_memory_desc_t *md;
-	u64 paddr, npages, size;
-
-	for_each_memblock(memory, r)
-		create_id_mapping(r->base, r->size, 0);
-
-	/* map runtime io spaces */
-	for_each_efi_memory_desc(&memmap, md) {
-		if (!(md->attribute & EFI_MEMORY_RUNTIME) || is_normal_ram(md))
-			continue;
-		paddr = md->phys_addr;
-		npages = md->num_pages;
-		memrange_efi_to_native(&paddr, &npages);
-		size = npages << PAGE_SHIFT;
-		create_id_mapping(paddr, size, 1);
-	}
-}
-
 /*
  * Translate a EFI virtual address into a physical address: this is necessary,
  * as some data members of the EFI system table are virtually remapped after
@@ -235,15 +214,6 @@ void __init efi_init(void)
 	reserve_regions();
 }
 
-void __init efi_idmap_init(void)
-{
-	if (!efi_enabled(EFI_BOOT))
-		return;
-
-	/* boot time idmap_pg_dir is incomplete, so fill in missing parts */
-	efi_setup_idmap();
-}
-
 static int __init arm64_enter_virtual_mode(void)
 {
 	u64 mapsize;
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index d8390f507da0..0771de962f37 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -401,7 +401,6 @@ void __init setup_arch(char **cmdline_p)
 	paging_init();
 	request_standard_resources();
 
-	efi_idmap_init();
 	efi_virtmap_init();
 
 	unflatten_device_tree();
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 3f3d5aa4a8b1..328638548871 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -271,18 +271,6 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
 			 size, PAGE_KERNEL_EXEC);
 }
 
-void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
-{
-	if ((addr >> PGDIR_SHIFT) >= ARRAY_SIZE(idmap_pg_dir)) {
-		pr_warn("BUG: not creating id mapping for %pa\n", &addr);
-		return;
-	}
-	__create_mapping(&init_mm, &idmap_pg_dir[pgd_index(addr)],
-			 addr, addr, size,
-			 map_io ? __pgprot(PROT_DEVICE_nGnRE)
-				: PAGE_KERNEL_EXEC);
-}
-
 void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 			       unsigned long virt, phys_addr_t size,
 			       pgprot_t prot)
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 4/8] efi: efistub: allow allocation alignment larger than EFI_PAGE_SIZE
       [not found]     ` <1419245944-2424-5-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
@ 2014-12-23 16:45       ` Borislav Petkov
       [not found]         ` <20141223164549.GB3810-fF5Pk5pvG8Y@public.gmane.org>
  0 siblings, 1 reply; 30+ messages in thread
From: Borislav Petkov @ 2014-12-23 16:45 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA

On Mon, Dec 22, 2014 at 10:59:00AM +0000, Ard Biesheuvel wrote:
> On systems with 64 KB pages, it is preferable for UEFI memory map
> entries to be 64 KB aligned multiples of 64 KB, because it relieves
> us of having to deal with the residues.
> So, if EFI_ALLOC_ALIGN is #define'd by the platform, use it to round
> up all memory allocations made.
> 
> Acked-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>

Acked-by: Borislav Petkov <bp-l3A5Bk7waGM@public.gmane.org>

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 4/8] efi: efistub: allow allocation alignment larger than EFI_PAGE_SIZE
       [not found]         ` <20141223164549.GB3810-fF5Pk5pvG8Y@public.gmane.org>
@ 2014-12-29  9:25           ` Ard Biesheuvel
       [not found]             ` <CAKv+Gu_+scCHYuXccgsvn9THsbP24OfyjvjN9XtEVe4nMdQtig-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 30+ messages in thread
From: Ard Biesheuvel @ 2014-12-29  9:25 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Roy Franz,
	Mark Rutland, Catalin Marinas, Will Deacon, Matt Fleming,
	Dave Young, Geoff Levand, Mark Salter

On 23 December 2014 at 16:45, Borislav Petkov <bp-Gina5bIWoIWzQB+pC5nmwQ@public.gmane.org> wrote:
> On Mon, Dec 22, 2014 at 10:59:00AM +0000, Ard Biesheuvel wrote:
>> On systems with 64 KB pages, it is preferable for UEFI memory map
>> entries to be 64 KB aligned multiples of 64 KB, because it relieves
>> us of having to deal with the residues.
>> So, if EFI_ALLOC_ALIGN is #define'd by the platform, use it to round
>> up all memory allocations made.
>>
>> Acked-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>
> Acked-by: Borislav Petkov <bp-l3A5Bk7waGM@public.gmane.org>
>

Thanks Boris.

Do you have any comments regarding patches #3 and #6 in this series?
Patch #3 is actual common code, but patch #6 also touches a file under
drivers/firmware/efi/libstub (but it is only used by arm64 atm) With
your acks on those, I could ask Catalin or Will to take the whole
series through the arm64 tree.

Regards,
Ard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 4/8] efi: efistub: allow allocation alignment larger than EFI_PAGE_SIZE
       [not found]             ` <CAKv+Gu_+scCHYuXccgsvn9THsbP24OfyjvjN9XtEVe4nMdQtig-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-12-29  9:47               ` Borislav Petkov
       [not found]                 ` <20141229094732.GA16051-K5JNixvcfoxupOikMc4+xw@public.gmane.org>
  0 siblings, 1 reply; 30+ messages in thread
From: Borislav Petkov @ 2014-12-29  9:47 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Roy Franz,
	Mark Rutland, Catalin Marinas, Will Deacon, Matt Fleming,
	Dave Young, Geoff Levand, Mark Salter

On Mon, Dec 29, 2014 at 09:25:15AM +0000, Ard Biesheuvel wrote:
> Do you have any comments regarding patches #3 and #6 in this series?

Sorry, have had no time so far to take a look. Holidays and all...
You're on the TODO list though :-)

-- 
Regards/Gruss,
    Boris.
--

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 4/8] efi: efistub: allow allocation alignment larger than EFI_PAGE_SIZE
       [not found]                 ` <20141229094732.GA16051-K5JNixvcfoxupOikMc4+xw@public.gmane.org>
@ 2015-01-05 11:24                   ` Ard Biesheuvel
  0 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2015-01-05 11:24 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Roy Franz,
	Mark Rutland, Catalin Marinas, Will Deacon, Matt Fleming,
	Dave Young, Geoff Levand, Mark Salter

On 29 December 2014 at 09:47, Borislav Petkov <bp-Gina5bIWoIWzQB+pC5nmwQ@public.gmane.org> wrote:
> On Mon, Dec 29, 2014 at 09:25:15AM +0000, Ard Biesheuvel wrote:
>> Do you have any comments regarding patches #3 and #6 in this series?
>
> Sorry, have had no time so far to take a look. Holidays and all...
> You're on the TODO list though :-)
>

No worries, business has been slow for everyone, I'm sure.

Regards,
Ard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 3/8] efi: split off remapping code from efi_config_init()
       [not found]     ` <1419245944-2424-4-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
@ 2015-01-05 14:55       ` Matt Fleming
  2015-01-05 21:56       ` Borislav Petkov
  1 sibling, 0 replies; 30+ messages in thread
From: Matt Fleming @ 2015-01-05 14:55 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA

On Mon, 22 Dec, at 10:58:59AM, Ard Biesheuvel wrote:
> Split of the remapping code from efi_config_init() so that the caller
> can perform its own remapping. This is necessary to correctly handle
> virtually remapped UEFI memory regions under kexec, as efi.systab will
> have been updated to a virtual address.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> ---
>  drivers/firmware/efi/efi.c | 49 +++++++++++++++++++++++++++++-----------------
>  include/linux/efi.h        |  2 ++
>  2 files changed, 33 insertions(+), 18 deletions(-)

Looks good Ard.

Acked-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
       [not found]     ` <1419245944-2424-7-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
@ 2015-01-05 16:20       ` Mark Rutland
  2015-01-06 17:13         ` Leif Lindholm
  2015-01-07 18:05         ` Ard Biesheuvel
  2015-01-05 16:54       ` Matt Fleming
  2015-01-07 12:06       ` Leif Lindholm
  2 siblings, 2 replies; 30+ messages in thread
From: Mark Rutland @ 2015-01-05 16:20 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, Catalin Marinas, Will Deacon,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA

Hi Ard,

I have a few (minor) comments below.

On Mon, Dec 22, 2014 at 10:59:02AM +0000, Ard Biesheuvel wrote:
> In order to support kexec, the kernel needs to be able to deal with the
> state of the UEFI firmware after SetVirtualAddressMap() has been called.
> To avoid having separate code paths for non-kexec and kexec, let's move
> the call to SetVirtualAddressMap() to the stub: this will guarantee us
> that it will only be called once (since the stub is not executed during
> kexec), and ensures that the UEFI state is identical between kexec and
> normal boot.
> 
> This implies that the layout of the virtual mapping needs to be created
> by the stub as well. All regions are rounded up to a naturally aligned
> multiple of 64 KB (for compatibility with 64k pages kernels) and recorded
> in the UEFI memory map. The kernel proper reads those values and installs
> the mappings in a dedicated set of page tables that are swapped in during
> UEFI Runtime Services calls.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> ---
>  arch/arm64/include/asm/efi.h       |  20 +++-
>  arch/arm64/kernel/efi.c            | 223 ++++++++++++++++++++-----------------
>  arch/arm64/kernel/setup.c          |   1 +
>  drivers/firmware/efi/libstub/fdt.c | 137 ++++++++++++++++++++++-
>  4 files changed, 270 insertions(+), 111 deletions(-)

[...]

> +static void efi_set_pgd(struct mm_struct *mm)
> +{
> +       cpu_switch_mm(mm->pgd, mm);
> +       flush_tlb_all();
> +       if (icache_is_aivivt())
> +               __flush_icache_all();
> +}

Do we have any idea how often we call runtime services?

I assume not all that often (read the RTC at boot, set/get variables).

If we're nuking the TLBs and I-cache a lot we'll probably need to
reserve an asid for the EFI virtmap.

[...]

> +void __init efi_virtmap_init(void)
> +{
> +       efi_memory_desc_t *md;
> +
> +       if (!efi_enabled(EFI_BOOT))
> +               return;
> +
> +       for_each_efi_memory_desc(&memmap, md) {
> +               u64 paddr, npages, size;
> +               pgprot_t prot;
> +
> +               if (!(md->attribute & EFI_MEMORY_RUNTIME))
> +                       continue;
> +               if (WARN(md->virt_addr == 0,
> +                        "UEFI virtual mapping incomplete or missing -- no entry found for 0x%llx\n",
> +                        md->phys_addr))
> +                       return;

I wonder if we might want to use a different address to signal a bad
mapping (e.g. 0UL-1 as it's not even EFI_PAGE_SIZE aligned), just in
case we have to handle a valid use of zero in future for some reason --
perhaps coming from another OS.

> +
> +               paddr = md->phys_addr;
> +               npages = md->num_pages;
> +               memrange_efi_to_native(&paddr, &npages);
> +               size = npages << PAGE_SHIFT;
> +
> +               pr_info("  EFI remap 0x%012llx => %p\n",

Why not use %016llx? We might only have 48-bit PAs currently, but we may
as well keep the VA and PA equally long when printing out -- that'll
also help to identify issues with garbage in the upper 16 bits of the
PA field.

[...]

> +/*
> + * This is the base address at which to start allocating virtual memory ranges
> + * for UEFI Runtime Services. This is a userland range so that we can use any
> + * allocation we choose, and eliminate the risk of a conflict after kexec.
> + */
> +#define EFI_RT_VIRTUAL_BASE    0x40000000

Nit: I'm not keen on the term "userland" here. You can map stuff to EL0
in the high half of the address space if you wanted, so TTBR0/TTBR1
aren't architecturally user/kernel.

s/userland/low half/, perhaps?

It might also be worth pointing out that the arbitrary base address
isn't zero so as to be less likely to be an idmap.

> +
> +static void update_memory_map(efi_memory_desc_t *memory_map,
> +                             unsigned long map_size, unsigned long desc_size,
> +                             int *count)
> +{
> +       u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
> +       efi_memory_desc_t *out = (void *)memory_map + map_size;
> +       int l;
> +
> +       for (l = 0; l < map_size; l += desc_size) {
> +               efi_memory_desc_t *in = (void *)memory_map + l;
> +               u64 paddr, size;
> +
> +               if (!(in->attribute & EFI_MEMORY_RUNTIME))
> +                       continue;
> +
> +               /*
> +                * Make the mapping compatible with 64k pages: this allows
> +                * a 4k page size kernel to kexec a 64k page size kernel and
> +                * vice versa.
> +                */
> +               paddr = round_down(in->phys_addr, SZ_64K);
> +               size = round_up(in->num_pages * EFI_PAGE_SIZE +
> +                               in->phys_addr - paddr, SZ_64K);
> +
> +               /*
> +                * Avoid wasting memory on PTEs by choosing a virtual base that
> +                * is compatible with section mappings if this region has the
> +                * appropriate size and physical alignment. (Sections are 2 MB
> +                * on 4k granule kernels)
> +                */
> +               if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
> +                       efi_virt_base = round_up(efi_virt_base, SZ_2M);
> +
> +               in->virt_addr = efi_virt_base + in->phys_addr - paddr;
> +               efi_virt_base += size;
> +
> +               memcpy(out, in, desc_size);
> +               out = (void *)out + desc_size;
> +               ++*count;
> +       }
> +}

This feels like this should live in arch/arm64, or under some other
directory specific to arm/arm64. That said, I guess the fdt stuff is
currently arm-specific anyway, so perhaps this is fine.

[...]

> @@ -248,12 +337,52 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
>                 }
>         }
> 
> +       /*
> +        * Update the memory map with virtual addresses. The function will also
> +        * populate the spare second half of the memory_map allocation with
> +        * copies of just the EFI_MEMORY_RUNTIME entries so that we can pass it
> +        * straight into SetVirtualAddressMap()
> +        */
> +       update_memory_map(memory_map, map_size, desc_size,
> +                         &runtime_entry_count);
> +
> +       pr_efi(sys_table,
> +              "Exiting boot services and installing virtual address map...\n");

I believe that the memory map is allowed to change as a result of this
call, so I think this needs to be moved before update_memory_map.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
       [not found]     ` <1419245944-2424-7-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
  2015-01-05 16:20       ` Mark Rutland
@ 2015-01-05 16:54       ` Matt Fleming
       [not found]         ` <20150105165402.GE3163-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org>
  2015-01-07 12:06       ` Leif Lindholm
  2 siblings, 1 reply; 30+ messages in thread
From: Matt Fleming @ 2015-01-05 16:54 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA

On Mon, 22 Dec, at 10:59:02AM, Ard Biesheuvel wrote:
> In order to support kexec, the kernel needs to be able to deal with the
> state of the UEFI firmware after SetVirtualAddressMap() has been called.
> To avoid having separate code paths for non-kexec and kexec, let's move
> the call to SetVirtualAddressMap() to the stub: this will guarantee us
> that it will only be called once (since the stub is not executed during
> kexec), and ensures that the UEFI state is identical between kexec and
> normal boot.
> 
> This implies that the layout of the virtual mapping needs to be created
> by the stub as well. All regions are rounded up to a naturally aligned
> multiple of 64 KB (for compatibility with 64k pages kernels) and recorded
> in the UEFI memory map. The kernel proper reads those values and installs
> the mappings in a dedicated set of page tables that are swapped in during
> UEFI Runtime Services calls.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> ---
>  arch/arm64/include/asm/efi.h       |  20 +++-
>  arch/arm64/kernel/efi.c            | 223 ++++++++++++++++++++-----------------
>  arch/arm64/kernel/setup.c          |   1 +
>  drivers/firmware/efi/libstub/fdt.c | 137 ++++++++++++++++++++++-
>  4 files changed, 270 insertions(+), 111 deletions(-)

[...]

> @@ -45,5 +53,9 @@ extern void efi_idmap_init(void);
>  #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__)
>  
>  #define EFI_ALLOC_ALIGN		SZ_64K
> +#define EFI_VIRTMAP		EFI_ARCH_1

Any chance of documenting what EFI_VIRTMAP means for posterity and why
you want to use one of the EFI arch config bits? How is this different
from EFI_RUNTIME_SERVICES?

Take a look at EFI_OLD_MEMMAP in arch/x86/include/asm/efi.h for a
crackingly well documented example from Borislav.

[...]
  
> diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
> index c846a9608cbd..76bc8abf41d1 100644
> --- a/drivers/firmware/efi/libstub/fdt.c
> +++ b/drivers/firmware/efi/libstub/fdt.c
> @@ -167,6 +167,94 @@ fdt_set_fail:
>  #define EFI_FDT_ALIGN EFI_PAGE_SIZE
>  #endif
>  
> +static efi_status_t get_memory_map(efi_system_table_t *sys_table_arg,
> +				   efi_memory_desc_t **map,
> +				   unsigned long *map_size,
> +				   unsigned long *desc_size,
> +				   u32 *desc_ver, unsigned long *key_ptr)
> +{
> +	efi_status_t status;
> +
> +	/*
> +	 * Call get_memory_map() with 0 size to retrieve the size of the
> +	 * required allocation.
> +	 */
> +	*map_size = 0;
> +	status = efi_call_early(get_memory_map, map_size, NULL,
> +				key_ptr, desc_size, desc_ver);
> +	if (status != EFI_BUFFER_TOO_SMALL)
> +		return EFI_LOAD_ERROR;
> +
> +	/*
> +	 * Add an additional efi_memory_desc_t to map_size because we're doing
> +	 * an allocation which may be in a new descriptor region. Then double it
> +	 * to give us some scratch space to prepare the input virtmap to give
> +	 * to SetVirtualAddressMap(). Note that this is EFI_LOADER_DATA memory,
> +	 * and the kernel memblock_reserve()'s only the size of the actual
> +	 * memory map, so the scratch space is freed again automatically.
> +	 */
> +	*map_size += *desc_size;
> +	status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
> +				*map_size * 2, (void **)map);
> +	if (status != EFI_SUCCESS)
> +		return status;
> +
> +	status = efi_call_early(get_memory_map, map_size, *map,
> +				key_ptr, desc_size, desc_ver);
> +	if (status != EFI_SUCCESS)
> +		efi_call_early(free_pool, *map);
> +	return status;
> +}

We've now got two (slightly different) versions of this function. Is
there any way we could make do with just one?

> +/*
> + * This is the base address at which to start allocating virtual memory ranges
> + * for UEFI Runtime Services. This is a userland range so that we can use any
> + * allocation we choose, and eliminate the risk of a conflict after kexec.
> + */
> +#define EFI_RT_VIRTUAL_BASE	0x40000000
> +
> +static void update_memory_map(efi_memory_desc_t *memory_map,
> +			      unsigned long map_size, unsigned long desc_size,
> +			      int *count)
> +{
> +	u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
> +	efi_memory_desc_t *out = (void *)memory_map + map_size;
> +	int l;
> +
> +	for (l = 0; l < map_size; l += desc_size) {
> +		efi_memory_desc_t *in = (void *)memory_map + l;
> +		u64 paddr, size;
> +
> +		if (!(in->attribute & EFI_MEMORY_RUNTIME))
> +			continue;
> +
> +		/*
> +		 * Make the mapping compatible with 64k pages: this allows
> +		 * a 4k page size kernel to kexec a 64k page size kernel and
> +		 * vice versa.
> +		 */
> +		paddr = round_down(in->phys_addr, SZ_64K);
> +		size = round_up(in->num_pages * EFI_PAGE_SIZE +
> +				in->phys_addr - paddr, SZ_64K);
> +
> +		/*
> +		 * Avoid wasting memory on PTEs by choosing a virtual base that
> +		 * is compatible with section mappings if this region has the
> +		 * appropriate size and physical alignment. (Sections are 2 MB
> +		 * on 4k granule kernels)
> +		 */
> +		if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
> +			efi_virt_base = round_up(efi_virt_base, SZ_2M);
> +
> +		in->virt_addr = efi_virt_base + in->phys_addr - paddr;
> +		efi_virt_base += size;
> +
> +		memcpy(out, in, desc_size);
> +		out = (void *)out + desc_size;
> +		++*count;
> +	}
> +}
> +

fdt.c is starting to become a dumping ground for arm*-specific stuff :-(

Is there no general solution for sharing code between arm and aarch64 in
the kernel so we don't have to stick things like this in
drivers/firmware/efi/?

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 3/8] efi: split off remapping code from efi_config_init()
       [not found]     ` <1419245944-2424-4-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
  2015-01-05 14:55       ` Matt Fleming
@ 2015-01-05 21:56       ` Borislav Petkov
  1 sibling, 0 replies; 30+ messages in thread
From: Borislav Petkov @ 2015-01-05 21:56 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA

On Mon, Dec 22, 2014 at 10:58:59AM +0000, Ard Biesheuvel wrote:
> Split of the remapping code from efi_config_init() so that the caller
> can perform its own remapping. This is necessary to correctly handle
> virtually remapped UEFI memory regions under kexec, as efi.systab will
> have been updated to a virtual address.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>

...

> @@ -344,13 +333,37 @@ int __init efi_config_init(efi_config_table_type_t *arch_tables)
>  		tablep += sz;
>  	}
>  	pr_cont("\n");
> -	early_memunmap(config_tables, efi.systab->nr_tables * sz);
> -
>  	set_bit(EFI_CONFIG_TABLES, &efi.flags);
> -
>  	return 0;
>  }
>  
> +int __init efi_config_init(efi_config_table_type_t *arch_tables)
> +{
> +	void *config_tables;
> +	int sz, ret;
> +
> +	if (efi_enabled(EFI_64BIT))
> +		sz = sizeof(efi_config_table_64_t);
> +	else
> +		sz = sizeof(efi_config_table_32_t);
> +
> +	/*
> +	 * Let's see what config tables the firmware passed to us.
> +	 */
> +	config_tables = early_memremap(efi.systab->tables,
> +				       efi.systab->nr_tables * sz);
> +	if (config_tables == NULL) {
> +		pr_err("Could not map Configuration table!\n");
> +		return -ENOMEM;
> +	}
> +
> +	ret = efi_config_parse_tables(config_tables, efi.systab->nr_tables,
> +				      arch_tables);

Just a nitpick:

You could hand down @sz to this function so that you don't have to
compute it again in efi_config_parse_tables().

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/8] arm64/efi: set EFI_ALLOC_ALIGN to 64 KB
  2014-12-22 10:59   ` [PATCH v4 5/8] arm64/efi: set EFI_ALLOC_ALIGN to 64 KB Ard Biesheuvel
@ 2015-01-06 16:37     ` Leif Lindholm
  0 siblings, 0 replies; 30+ messages in thread
From: Leif Lindholm @ 2015-01-06 16:37 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: mark.rutland, linux-efi, geoff.levand, catalin.marinas,
	will.deacon, roy.franz, matt.fleming, bp, msalter, dyoung,
	linux-arm-kernel

On Mon, Dec 22, 2014 at 10:59:01AM +0000, Ard Biesheuvel wrote:
> Set EFI_ALLOC_ALIGN to 64 KB so that all allocations done by the stub
> are naturally compatible with a 64 KB granule kernel.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/include/asm/efi.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
> index a34fd3b12e2b..71291253114f 100644
> --- a/arch/arm64/include/asm/efi.h
> +++ b/arch/arm64/include/asm/efi.h
> @@ -44,4 +44,6 @@ extern void efi_idmap_init(void);
>  
>  #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__)
>  
> +#define EFI_ALLOC_ALIGN		SZ_64K
> +
>  #endif /* _ASM_EFI_H */
> -- 
> 1.8.3.2

Acked-by: Leif Lindholm <leif.lindholm@linaro.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
  2015-01-05 16:20       ` Mark Rutland
@ 2015-01-06 17:13         ` Leif Lindholm
  2015-01-07 18:05         ` Ard Biesheuvel
  1 sibling, 0 replies; 30+ messages in thread
From: Leif Lindholm @ 2015-01-06 17:13 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Ard Biesheuvel,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, Catalin Marinas, Will Deacon,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA

On Mon, Jan 05, 2015 at 04:20:16PM +0000, Mark Rutland wrote:
> Hi Ard,
> 
> I have a few (minor) comments below.
> 
> On Mon, Dec 22, 2014 at 10:59:02AM +0000, Ard Biesheuvel wrote:
> > In order to support kexec, the kernel needs to be able to deal with the
> > state of the UEFI firmware after SetVirtualAddressMap() has been called.
> > To avoid having separate code paths for non-kexec and kexec, let's move
> > the call to SetVirtualAddressMap() to the stub: this will guarantee us
> > that it will only be called once (since the stub is not executed during
> > kexec), and ensures that the UEFI state is identical between kexec and
> > normal boot.
> > 
> > This implies that the layout of the virtual mapping needs to be created
> > by the stub as well. All regions are rounded up to a naturally aligned
> > multiple of 64 KB (for compatibility with 64k pages kernels) and recorded
> > in the UEFI memory map. The kernel proper reads those values and installs
> > the mappings in a dedicated set of page tables that are swapped in during
> > UEFI Runtime Services calls.
> > 
> > Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> > ---
> >  arch/arm64/include/asm/efi.h       |  20 +++-
> >  arch/arm64/kernel/efi.c            | 223 ++++++++++++++++++++-----------------
> >  arch/arm64/kernel/setup.c          |   1 +
> >  drivers/firmware/efi/libstub/fdt.c | 137 ++++++++++++++++++++++-
> >  4 files changed, 270 insertions(+), 111 deletions(-)
> 
> [...]
> 
> > +static void efi_set_pgd(struct mm_struct *mm)
> > +{
> > +       cpu_switch_mm(mm->pgd, mm);
> > +       flush_tlb_all();
> > +       if (icache_is_aivivt())
> > +               __flush_icache_all();
> > +}
> 
> Do we have any idea how often we call runtime services?

Very rarely.
 
> I assume not all that often (read the RTC at boot, set/get variables).

And in future possibly:
- adding capsules
- getwakeuptime/setwakeuptime
- saving time to RTC on shutdown
- reboot

All of which are uid0-invoked manual operations.

The only exception I can think of is if we're using the efivarfs
backend for pstore.

> If we're nuking the TLBs and I-cache a lot we'll probably need to
> reserve an asid for the EFI virtmap.

> > @@ -248,12 +337,52 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
> >                 }
> >         }
> > 
> > +       /*
> > +        * Update the memory map with virtual addresses. The function will also
> > +        * populate the spare second half of the memory_map allocation with
> > +        * copies of just the EFI_MEMORY_RUNTIME entries so that we can pass it
> > +        * straight into SetVirtualAddressMap()
> > +        */
> > +       update_memory_map(memory_map, map_size, desc_size,
> > +                         &runtime_entry_count);
> > +
> > +       pr_efi(sys_table,
> > +              "Exiting boot services and installing virtual address map...\n");
> 
> I believe that the memory map is allowed to change as a result of this
> call, so I think this needs to be moved before update_memory_map.

You are absolutely correct - but update_memory_map() only modifies the
copy of the memory map. The message needs to move all the way to
before the calll to get_memory_map(), and for practical reasons
perhaps before the while(1) loop.

/
    Leif

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
       [not found]         ` <20150105165402.GE3163-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org>
@ 2015-01-06 18:01           ` Leif Lindholm
       [not found]             ` <20150106180120.GF3827-t77nlHhSwNqAroYi2ySoxKxOck334EZe@public.gmane.org>
  2015-01-07 18:15           ` Ard Biesheuvel
  1 sibling, 1 reply; 30+ messages in thread
From: Leif Lindholm @ 2015-01-06 18:01 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Ard Biesheuvel,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA

On Mon, Jan 05, 2015 at 04:54:02PM +0000, Matt Fleming wrote:

> > +static efi_status_t get_memory_map(efi_system_table_t *sys_table_arg,
> > +				   efi_memory_desc_t **map,
> > +				   unsigned long *map_size,
> > +				   unsigned long *desc_size,
> > +				   u32 *desc_ver, unsigned long *key_ptr)
> > +{
> > +	efi_status_t status;
> > +
> > +	/*
> > +	 * Call get_memory_map() with 0 size to retrieve the size of the
> > +	 * required allocation.
> > +	 */
> > +	*map_size = 0;
> > +	status = efi_call_early(get_memory_map, map_size, NULL,
> > +				key_ptr, desc_size, desc_ver);
> > +	if (status != EFI_BUFFER_TOO_SMALL)
> > +		return EFI_LOAD_ERROR;
> > +
> > +	/*
> > +	 * Add an additional efi_memory_desc_t to map_size because we're doing
> > +	 * an allocation which may be in a new descriptor region. Then double it
> > +	 * to give us some scratch space to prepare the input virtmap to give
> > +	 * to SetVirtualAddressMap(). Note that this is EFI_LOADER_DATA memory,
> > +	 * and the kernel memblock_reserve()'s only the size of the actual
> > +	 * memory map, so the scratch space is freed again automatically.
> > +	 */
> > +	*map_size += *desc_size;
> > +	status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
> > +				*map_size * 2, (void **)map);
> > +	if (status != EFI_SUCCESS)
> > +		return status;
> > +
> > +	status = efi_call_early(get_memory_map, map_size, *map,
> > +				key_ptr, desc_size, desc_ver);
> > +	if (status != EFI_SUCCESS)
> > +		efi_call_early(free_pool, *map);
> > +	return status;
> > +}
> 
> We've now got two (slightly different) versions of this function. Is
> there any way we could make do with just one?

I think all we really need above what efi_get_memory_map() provides is
the scratch space. Would we care about temporarily wasting a little
bit of EFI_LOADER_DATA on all platforms, or could we just swap the
function body in efi-stub-helper.c for Ard's version above?

(I would guess memory maps with <= 32 entries are uncommon anyway, so
the existing version would already make the bootservice call twice.)

> > +/*
> > + * This is the base address at which to start allocating virtual memory ranges
> > + * for UEFI Runtime Services. This is a userland range so that we can use any
> > + * allocation we choose, and eliminate the risk of a conflict after kexec.
> > + */
> > +#define EFI_RT_VIRTUAL_BASE	0x40000000
> > +
> > +static void update_memory_map(efi_memory_desc_t *memory_map,
> > +			      unsigned long map_size, unsigned long desc_size,
> > +			      int *count)
> > +{
> > +	u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
> > +	efi_memory_desc_t *out = (void *)memory_map + map_size;
> > +	int l;
> > +
> > +	for (l = 0; l < map_size; l += desc_size) {
> > +		efi_memory_desc_t *in = (void *)memory_map + l;
> > +		u64 paddr, size;
> > +
> > +		if (!(in->attribute & EFI_MEMORY_RUNTIME))
> > +			continue;
> > +
> > +		/*
> > +		 * Make the mapping compatible with 64k pages: this allows
> > +		 * a 4k page size kernel to kexec a 64k page size kernel and
> > +		 * vice versa.
> > +		 */
> > +		paddr = round_down(in->phys_addr, SZ_64K);
> > +		size = round_up(in->num_pages * EFI_PAGE_SIZE +
> > +				in->phys_addr - paddr, SZ_64K);
> > +
> > +		/*
> > +		 * Avoid wasting memory on PTEs by choosing a virtual base that
> > +		 * is compatible with section mappings if this region has the
> > +		 * appropriate size and physical alignment. (Sections are 2 MB
> > +		 * on 4k granule kernels)
> > +		 */
> > +		if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
> > +			efi_virt_base = round_up(efi_virt_base, SZ_2M);
> > +
> > +		in->virt_addr = efi_virt_base + in->phys_addr - paddr;
> > +		efi_virt_base += size;
> > +
> > +		memcpy(out, in, desc_size);
> > +		out = (void *)out + desc_size;
> > +		++*count;
> > +	}
> > +}
> > +
> 
> fdt.c is starting to become a dumping ground for arm*-specific stuff :-(

Mmm, not optimal.
That said, the only arm*-specific things about this particular
function are the page sizes. Should this move to efi-stub-helper.c
with EFI_RT_VIRTUAL_BASE moved to arch/<x>/include/asm/efi.h and
joined by EFI_RT_VIRTUAL_BASE_ALIGN and EFI_RT_VIRTUAL_REGION_ALIGN?
 
> Is there no general solution for sharing code between arm and aarch64 in
> the kernel so we don't have to stick things like this in
> drivers/firmware/efi/?

Not currently.

/
    Leif

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 7/8] arm64/efi: remove free_boot_services() and friends
       [not found]     ` <1419245944-2424-8-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
@ 2015-01-06 18:06       ` Leif Lindholm
  0 siblings, 0 replies; 30+ messages in thread
From: Leif Lindholm @ 2015-01-06 18:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA

On Mon, Dec 22, 2014 at 10:59:03AM +0000, Ard Biesheuvel wrote:
> Now that we are calling SetVirtualAddressMap() from the stub, there is no
> need to reserve boot-only memory regions, which implies that there is also
> no reason to free them again later.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> ---
>  arch/arm64/kernel/efi.c | 123 +-----------------------------------------------
>  1 file changed, 1 insertion(+), 122 deletions(-)
> 
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index 2ebe67ffb629..e969c5076713 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -198,9 +198,7 @@ static __init void reserve_regions(void)
>  		if (is_normal_ram(md))
>  			early_init_dt_add_memory_arch(paddr, size);
>  
> -		if (is_reserve_region(md) ||
> -		    md->type == EFI_BOOT_SERVICES_CODE ||
> -		    md->type == EFI_BOOT_SERVICES_DATA) {
> +		if (is_reserve_region(md)) {
>  			memblock_reserve(paddr, size);
>  			if (uefi_debug)
>  				pr_cont("*");
> @@ -213,123 +211,6 @@ static __init void reserve_regions(void)
>  	set_bit(EFI_MEMMAP, &efi.flags);
>  }
>  
> -
> -static u64 __init free_one_region(u64 start, u64 end)
> -{
> -	u64 size = end - start;
> -
> -	if (uefi_debug)
> -		pr_info("  EFI freeing: 0x%012llx-0x%012llx\n",	start, end - 1);
> -
> -	free_bootmem_late(start, size);
> -	return size;
> -}
> -
> -static u64 __init free_region(u64 start, u64 end)
> -{
> -	u64 map_start, map_end, total = 0;
> -
> -	if (end <= start)
> -		return total;
> -
> -	map_start = (u64)memmap.phys_map;
> -	map_end = PAGE_ALIGN(map_start + (memmap.map_end - memmap.map));
> -	map_start &= PAGE_MASK;
> -
> -	if (start < map_end && end > map_start) {
> -		/* region overlaps UEFI memmap */
> -		if (start < map_start)
> -			total += free_one_region(start, map_start);
> -
> -		if (map_end < end)
> -			total += free_one_region(map_end, end);
> -	} else
> -		total += free_one_region(start, end);
> -
> -	return total;
> -}
> -
> -static void __init free_boot_services(void)
> -{
> -	u64 total_freed = 0;
> -	u64 keep_end, free_start, free_end;
> -	efi_memory_desc_t *md;
> -
> -	/*
> -	 * If kernel uses larger pages than UEFI, we have to be careful
> -	 * not to inadvertantly free memory we want to keep if there is
> -	 * overlap at the kernel page size alignment. We do not want to
> -	 * free is_reserve_region() memory nor the UEFI memmap itself.
> -	 *
> -	 * The memory map is sorted, so we keep track of the end of
> -	 * any previous region we want to keep, remember any region
> -	 * we want to free and defer freeing it until we encounter
> -	 * the next region we want to keep. This way, before freeing
> -	 * it, we can clip it as needed to avoid freeing memory we
> -	 * want to keep for UEFI.
> -	 */
> -
> -	keep_end = 0;
> -	free_start = 0;
> -
> -	for_each_efi_memory_desc(&memmap, md) {
> -		u64 paddr, npages, size;
> -
> -		if (is_reserve_region(md)) {
> -			/*
> -			 * We don't want to free any memory from this region.
> -			 */
> -			if (free_start) {
> -				/* adjust free_end then free region */
> -				if (free_end > md->phys_addr)
> -					free_end -= PAGE_SIZE;
> -				total_freed += free_region(free_start, free_end);
> -				free_start = 0;
> -			}
> -			keep_end = md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT);
> -			continue;
> -		}
> -
> -		if (md->type != EFI_BOOT_SERVICES_CODE &&
> -		    md->type != EFI_BOOT_SERVICES_DATA) {
> -			/* no need to free this region */
> -			continue;
> -		}
> -
> -		/*
> -		 * We want to free memory from this region.
> -		 */
> -		paddr = md->phys_addr;
> -		npages = md->num_pages;
> -		memrange_efi_to_native(&paddr, &npages);
> -		size = npages << PAGE_SHIFT;
> -
> -		if (free_start) {
> -			if (paddr <= free_end)
> -				free_end = paddr + size;
> -			else {
> -				total_freed += free_region(free_start, free_end);
> -				free_start = paddr;
> -				free_end = paddr + size;
> -			}
> -		} else {
> -			free_start = paddr;
> -			free_end = paddr + size;
> -		}
> -		if (free_start < keep_end) {
> -			free_start += PAGE_SIZE;
> -			if (free_start >= free_end)
> -				free_start = 0;
> -		}
> -	}
> -	if (free_start)
> -		total_freed += free_region(free_start, free_end);
> -
> -	if (total_freed)
> -		pr_info("Freed 0x%llx bytes of EFI boot services memory",
> -			total_freed);
> -}
> -
>  void __init efi_init(void)
>  {
>  	struct efi_fdt_params params;
> @@ -396,8 +277,6 @@ static int __init arm64_enter_virtual_mode(void)
>  	}
>  	set_bit(EFI_SYSTEM_TABLES, &efi.flags);
>  
> -	free_boot_services();
> -
>  	if (!efi_enabled(EFI_VIRTMAP)) {
>  		pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
>  		return -1;
> -- 
> 1.8.3.2

Acked-by: Leif Lindholm <leif.lindholm-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 8/8] arm64/efi: remove idmap manipulations from UEFI code
       [not found]     ` <1419245944-2424-9-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
@ 2015-01-06 18:17       ` Leif Lindholm
  0 siblings, 0 replies; 30+ messages in thread
From: Leif Lindholm @ 2015-01-06 18:17 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA

On Mon, Dec 22, 2014 at 10:59:04AM +0000, Ard Biesheuvel wrote:
> Now that we have moved the call to SetVirtualAddressMap() to the stub,
> UEFI has no use for the ID map, so we can drop the code that installs
> ID mappings for UEFI memory regions.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> ---
>  arch/arm64/include/asm/efi.h |  2 --
>  arch/arm64/include/asm/mmu.h |  2 --
>  arch/arm64/kernel/efi.c      | 30 ------------------------------
>  arch/arm64/kernel/setup.c    |  1 -
>  arch/arm64/mm/mmu.c          | 12 ------------
>  5 files changed, 47 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
> index 6cc668a378c5..84c50853b5dc 100644
> --- a/arch/arm64/include/asm/efi.h
> +++ b/arch/arm64/include/asm/efi.h
> @@ -6,11 +6,9 @@
>  
>  #ifdef CONFIG_EFI
>  extern void efi_init(void);
> -extern void efi_idmap_init(void);
>  extern void efi_virtmap_init(void);
>  #else
>  #define efi_init()
> -#define efi_idmap_init()
>  #define efi_virtmap_init
>  #endif
>  
> diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> index 5fd40c43be80..3d311761e3c2 100644
> --- a/arch/arm64/include/asm/mmu.h
> +++ b/arch/arm64/include/asm/mmu.h
> @@ -31,8 +31,6 @@ extern void paging_init(void);
>  extern void setup_mm_for_reboot(void);
>  extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
>  extern void init_mem_pgprot(void);
> -/* create an identity mapping for memory (or io if map_io is true) */
> -extern void create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io);
>  extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
>  			       unsigned long virt, phys_addr_t size,
>  			       pgprot_t prot);
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index e969c5076713..d7d2e818c856 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -53,27 +53,6 @@ static int __init is_normal_ram(efi_memory_desc_t *md)
>  	return 0;
>  }
>  
> -static void __init efi_setup_idmap(void)
> -{
> -	struct memblock_region *r;
> -	efi_memory_desc_t *md;
> -	u64 paddr, npages, size;
> -
> -	for_each_memblock(memory, r)
> -		create_id_mapping(r->base, r->size, 0);
> -
> -	/* map runtime io spaces */
> -	for_each_efi_memory_desc(&memmap, md) {
> -		if (!(md->attribute & EFI_MEMORY_RUNTIME) || is_normal_ram(md))
> -			continue;
> -		paddr = md->phys_addr;
> -		npages = md->num_pages;
> -		memrange_efi_to_native(&paddr, &npages);
> -		size = npages << PAGE_SHIFT;
> -		create_id_mapping(paddr, size, 1);
> -	}
> -}
> -
>  /*
>   * Translate a EFI virtual address into a physical address: this is necessary,
>   * as some data members of the EFI system table are virtually remapped after
> @@ -235,15 +214,6 @@ void __init efi_init(void)
>  	reserve_regions();
>  }
>  
> -void __init efi_idmap_init(void)
> -{
> -	if (!efi_enabled(EFI_BOOT))
> -		return;
> -
> -	/* boot time idmap_pg_dir is incomplete, so fill in missing parts */
> -	efi_setup_idmap();
> -}
> -
>  static int __init arm64_enter_virtual_mode(void)
>  {
>  	u64 mapsize;
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index d8390f507da0..0771de962f37 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -401,7 +401,6 @@ void __init setup_arch(char **cmdline_p)
>  	paging_init();
>  	request_standard_resources();
>  
> -	efi_idmap_init();
>  	efi_virtmap_init();
>  
>  	unflatten_device_tree();
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 3f3d5aa4a8b1..328638548871 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -271,18 +271,6 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
>  			 size, PAGE_KERNEL_EXEC);
>  }
>  
> -void __init create_id_mapping(phys_addr_t addr, phys_addr_t size, int map_io)
> -{
> -	if ((addr >> PGDIR_SHIFT) >= ARRAY_SIZE(idmap_pg_dir)) {
> -		pr_warn("BUG: not creating id mapping for %pa\n", &addr);
> -		return;
> -	}
> -	__create_mapping(&init_mm, &idmap_pg_dir[pgd_index(addr)],
> -			 addr, addr, size,
> -			 map_io ? __pgprot(PROT_DEVICE_nGnRE)
> -				: PAGE_KERNEL_EXEC);
> -}
> -
>  void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
>  			       unsigned long virt, phys_addr_t size,
>  			       pgprot_t prot)
> -- 
> 1.8.3.2

So, this one will clearly need to be reworked if it is to fit on top
of the patches I sent out today. However, this is trivial rework, so:

Acked-by: Leif Lindholm <leif.lindholm-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
       [not found]     ` <1419245944-2424-7-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
  2015-01-05 16:20       ` Mark Rutland
  2015-01-05 16:54       ` Matt Fleming
@ 2015-01-07 12:06       ` Leif Lindholm
       [not found]         ` <20150107120608.GJ3827-t77nlHhSwNqAroYi2ySoxKxOck334EZe@public.gmane.org>
  2 siblings, 1 reply; 30+ messages in thread
From: Leif Lindholm @ 2015-01-07 12:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA

On Mon, Dec 22, 2014 at 10:59:02AM +0000, Ard Biesheuvel wrote:
> In order to support kexec, the kernel needs to be able to deal with the
> state of the UEFI firmware after SetVirtualAddressMap() has been called.
> To avoid having separate code paths for non-kexec and kexec, let's move
> the call to SetVirtualAddressMap() to the stub: this will guarantee us
> that it will only be called once (since the stub is not executed during
> kexec), and ensures that the UEFI state is identical between kexec and
> normal boot.
> 
> This implies that the layout of the virtual mapping needs to be created
> by the stub as well. All regions are rounded up to a naturally aligned
> multiple of 64 KB (for compatibility with 64k pages kernels) and recorded
> in the UEFI memory map. The kernel proper reads those values and installs
> the mappings in a dedicated set of page tables that are swapped in during
> UEFI Runtime Services calls.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> ---
>  arch/arm64/include/asm/efi.h       |  20 +++-
>  arch/arm64/kernel/efi.c            | 223 ++++++++++++++++++++-----------------
>  arch/arm64/kernel/setup.c          |   1 +
>  drivers/firmware/efi/libstub/fdt.c | 137 ++++++++++++++++++++++-
>  4 files changed, 270 insertions(+), 111 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
> index 71291253114f..6cc668a378c5 100644
> --- a/arch/arm64/include/asm/efi.h
> +++ b/arch/arm64/include/asm/efi.h
> @@ -7,28 +7,36 @@
>  #ifdef CONFIG_EFI
>  extern void efi_init(void);
>  extern void efi_idmap_init(void);
> +extern void efi_virtmap_init(void);
>  #else
>  #define efi_init()
>  #define efi_idmap_init()
> +#define efi_virtmap_init
>  #endif
>  
>  #define efi_call_virt(f, ...)						\
>  ({									\
> -	efi_##f##_t *__f = efi.systab->runtime->f;			\
> +	efi_##f##_t *__f;						\
>  	efi_status_t __s;						\
>  									\
> -	kernel_neon_begin();						\
> +	kernel_neon_begin();	/* disables preemption */		\

Nitpick: adding comment to otherwise untouched source line.

> +	efi_virtmap_load();						\
> +	__f = efi.systab->runtime->f;					\
>  	__s = __f(__VA_ARGS__);						\
> +	efi_virtmap_unload();						\
>  	kernel_neon_end();						\
>  	__s;								\
>  })
>  
>  #define __efi_call_virt(f, ...)						\
>  ({									\
> -	efi_##f##_t *__f = efi.systab->runtime->f;			\
> +	efi_##f##_t *__f;						\
>  									\
> -	kernel_neon_begin();						\
> +	kernel_neon_begin();	/* disables preemption */		\

Same nitpick.

> +	efi_virtmap_load();						\
> +	__f = efi.systab->runtime->f;					\
>  	__f(__VA_ARGS__);						\
> +	efi_virtmap_unload();						\
>  	kernel_neon_end();						\
>  })
>  
> @@ -45,5 +53,9 @@ extern void efi_idmap_init(void);
>  #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__)
>  
>  #define EFI_ALLOC_ALIGN		SZ_64K
> +#define EFI_VIRTMAP		EFI_ARCH_1
> +
> +void efi_virtmap_load(void);
> +void efi_virtmap_unload(void);
>  
>  #endif /* _ASM_EFI_H */
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index 6fac253bc783..2ebe67ffb629 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -11,25 +11,30 @@
>   *
>   */
>  
> +#include <linux/atomic.h>
>  #include <linux/dmi.h>
>  #include <linux/efi.h>
>  #include <linux/export.h>
>  #include <linux/memblock.h>
> +#include <linux/mm_types.h>
>  #include <linux/bootmem.h>
>  #include <linux/of.h>
>  #include <linux/of_fdt.h>
> +#include <linux/rbtree.h>
> +#include <linux/rwsem.h>
>  #include <linux/sched.h>
>  #include <linux/slab.h>
> +#include <linux/spinlock.h>
>  
>  #include <asm/cacheflush.h>
>  #include <asm/efi.h>
>  #include <asm/tlbflush.h>
>  #include <asm/mmu_context.h>
> +#include <asm/mmu.h>
> +#include <asm/pgtable.h>
>  
>  struct efi_memory_map memmap;
>  
> -static efi_runtime_services_t *runtime;
> -
>  static u64 efi_system_table;
>  
>  static int uefi_debug __initdata;
> @@ -69,9 +74,33 @@ static void __init efi_setup_idmap(void)
>  	}
>  }
>  
> +/*
> + * Translate a EFI virtual address into a physical address: this is necessary,
> + * as some data members of the EFI system table are virtually remapped after
> + * SetVirtualAddressMap() has been called.
> + */
> +static phys_addr_t efi_to_phys(unsigned long addr)
> +{
> +	efi_memory_desc_t *md;
> +
> +	for_each_efi_memory_desc(&memmap, md) {
> +		if (!(md->attribute & EFI_MEMORY_RUNTIME))
> +			continue;
> +		if (md->virt_addr == 0)
> +			/* no virtual mapping has been installed by the stub */
> +			break;
> +		if (md->virt_addr <= addr &&
> +		    (addr - md->virt_addr) < (md->num_pages << EFI_PAGE_SHIFT))
> +			return md->phys_addr + addr - md->virt_addr;
> +	}
> +	return addr;
> +}
> +
>  static int __init uefi_init(void)
>  {
>  	efi_char16_t *c16;
> +	void *config_tables;
> +	u64 table_size;
>  	char vendor[100] = "unknown";
>  	int i, retval;
>  
> @@ -99,7 +128,7 @@ static int __init uefi_init(void)
>  			efi.systab->hdr.revision & 0xffff);
>  
>  	/* Show what we know for posterity */
> -	c16 = early_memremap(efi.systab->fw_vendor,
> +	c16 = early_memremap(efi_to_phys(efi.systab->fw_vendor),
>  			     sizeof(vendor));
>  	if (c16) {
>  		for (i = 0; i < (int) sizeof(vendor) - 1 && *c16; ++i)
> @@ -112,8 +141,14 @@ static int __init uefi_init(void)
>  		efi.systab->hdr.revision >> 16,
>  		efi.systab->hdr.revision & 0xffff, vendor);
>  
> -	retval = efi_config_init(NULL);
> +	table_size = sizeof(efi_config_table_64_t) * efi.systab->nr_tables;
> +	config_tables = early_memremap(efi_to_phys(efi.systab->tables),
> +				       table_size);
> +
> +	retval = efi_config_parse_tables(config_tables,
> +					 efi.systab->nr_tables, NULL);
>  
> +	early_memunmap(config_tables, table_size);
>  out:
>  	early_memunmap(efi.systab,  sizeof(efi_system_table_t));
>  	return retval;
> @@ -328,51 +363,9 @@ void __init efi_idmap_init(void)
>  	efi_setup_idmap();
>  }
>  
> -static int __init remap_region(efi_memory_desc_t *md, void **new)
> -{
> -	u64 paddr, vaddr, npages, size;
> -
> -	paddr = md->phys_addr;
> -	npages = md->num_pages;
> -	memrange_efi_to_native(&paddr, &npages);
> -	size = npages << PAGE_SHIFT;
> -
> -	if (is_normal_ram(md))
> -		vaddr = (__force u64)ioremap_cache(paddr, size);
> -	else
> -		vaddr = (__force u64)ioremap(paddr, size);
> -
> -	if (!vaddr) {
> -		pr_err("Unable to remap 0x%llx pages @ %p\n",
> -		       npages, (void *)paddr);
> -		return 0;
> -	}
> -
> -	/* adjust for any rounding when EFI and system pagesize differs */
> -	md->virt_addr = vaddr + (md->phys_addr - paddr);
> -
> -	if (uefi_debug)
> -		pr_info("  EFI remap 0x%012llx => %p\n",
> -			md->phys_addr, (void *)md->virt_addr);
> -
> -	memcpy(*new, md, memmap.desc_size);
> -	*new += memmap.desc_size;
> -
> -	return 1;
> -}
> -
> -/*
> - * Switch UEFI from an identity map to a kernel virtual map
> - */

No function description at all?
Arguably this function could change name now as well, since UEFI will
already be in virtual mode. arm64_enable_runtime_map()?

>  static int __init arm64_enter_virtual_mode(void)
>  {
> -	efi_memory_desc_t *md;
> -	phys_addr_t virtmap_phys;
> -	void *virtmap, *virt_md;
> -	efi_status_t status;
>  	u64 mapsize;
> -	int count = 0;
> -	unsigned long flags;
>  
>  	if (!efi_enabled(EFI_BOOT)) {
>  		pr_info("EFI services will not be available.\n");
> @@ -395,79 +388,28 @@ static int __init arm64_enter_virtual_mode(void)
>  
>  	efi.memmap = &memmap;
>  
> -	/* Map the runtime regions */
> -	virtmap = kmalloc(mapsize, GFP_KERNEL);
> -	if (!virtmap) {
> -		pr_err("Failed to allocate EFI virtual memmap\n");
> -		return -1;
> -	}
> -	virtmap_phys = virt_to_phys(virtmap);
> -	virt_md = virtmap;
> -
> -	for_each_efi_memory_desc(&memmap, md) {
> -		if (!(md->attribute & EFI_MEMORY_RUNTIME))
> -			continue;
> -		if (!remap_region(md, &virt_md))
> -			goto err_unmap;
> -		++count;
> -	}
> -
> -	efi.systab = (__force void *)efi_lookup_mapped_addr(efi_system_table);
> +	efi.systab = (__force void *)ioremap_cache(efi_system_table,
> +						   sizeof(efi_system_table_t));
>  	if (!efi.systab) {
> -		/*
> -		 * If we have no virtual mapping for the System Table at this
> -		 * point, the memory map doesn't cover the physical offset where
> -		 * it resides. This means the System Table will be inaccessible
> -		 * to Runtime Services themselves once the virtual mapping is
> -		 * installed.
> -		 */
> -		pr_err("Failed to remap EFI System Table -- buggy firmware?\n");
> -		goto err_unmap;
> +		pr_err("Failed to remap EFI System Table\n");
> +		return -1;
>  	}
>  	set_bit(EFI_SYSTEM_TABLES, &efi.flags);
>  
> -	local_irq_save(flags);
> -	cpu_switch_mm(idmap_pg_dir, &init_mm);
> -
> -	/* Call SetVirtualAddressMap with the physical address of the map */
> -	runtime = efi.systab->runtime;
> -	efi.set_virtual_address_map = runtime->set_virtual_address_map;
> -
> -	status = efi.set_virtual_address_map(count * memmap.desc_size,
> -					     memmap.desc_size,
> -					     memmap.desc_version,
> -					     (efi_memory_desc_t *)virtmap_phys);
> -	cpu_set_reserved_ttbr0();
> -	flush_tlb_all();
> -	local_irq_restore(flags);
> -
> -	kfree(virtmap);
> -
>  	free_boot_services();
>  
> -	if (status != EFI_SUCCESS) {
> -		pr_err("Failed to set EFI virtual address map! [%lx]\n",
> -			status);
> +	if (!efi_enabled(EFI_VIRTMAP)) {
> +		pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
>  		return -1;
>  	}
>  
>  	/* Set up runtime services function pointers */
> -	runtime = efi.systab->runtime;
>  	efi_native_runtime_setup();
>  	set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
>  
>  	efi.runtime_version = efi.systab->hdr.revision;
>  
>  	return 0;
> -
> -err_unmap:
> -	/* unmap all mappings that succeeded: there are 'count' of those */
> -	for (virt_md = virtmap; count--; virt_md += memmap.desc_size) {
> -		md = virt_md;
> -		iounmap((__force void __iomem *)md->virt_addr);
> -	}
> -	kfree(virtmap);
> -	return -1;
>  }
>  early_initcall(arm64_enter_virtual_mode);
>  
> @@ -484,3 +426,78 @@ static int __init arm64_dmi_init(void)
>  	return 0;
>  }
>  core_initcall(arm64_dmi_init);
> +
> +static pgd_t efi_pgd[PTRS_PER_PGD] __page_aligned_bss;
> +
> +static struct mm_struct efi_mm = {
> +	.mm_rb			= RB_ROOT,
> +	.pgd			= efi_pgd,
> +	.mm_users		= ATOMIC_INIT(2),
> +	.mm_count		= ATOMIC_INIT(1),
> +	.mmap_sem		= __RWSEM_INITIALIZER(efi_mm.mmap_sem),
> +	.page_table_lock	= __SPIN_LOCK_UNLOCKED(efi_mm.page_table_lock),
> +	.mmlist			= LIST_HEAD_INIT(efi_mm.mmlist),
> +	INIT_MM_CONTEXT(efi_mm)
> +};
> +
> +static void efi_set_pgd(struct mm_struct *mm)
> +{
> +	cpu_switch_mm(mm->pgd, mm);
> +	flush_tlb_all();
> +	if (icache_is_aivivt())
> +		__flush_icache_all();
> +}
> +
> +void efi_virtmap_load(void)
> +{
> +	WARN_ON(preemptible());
> +	efi_set_pgd(&efi_mm);
> +}
> +
> +void efi_virtmap_unload(void)
> +{
> +	efi_set_pgd(current->active_mm);
> +}
> +
> +void __init efi_virtmap_init(void)
> +{
> +	efi_memory_desc_t *md;
> +
> +	if (!efi_enabled(EFI_BOOT))
> +		return;
> +
> +	for_each_efi_memory_desc(&memmap, md) {
> +		u64 paddr, npages, size;
> +		pgprot_t prot;
> +
> +		if (!(md->attribute & EFI_MEMORY_RUNTIME))
> +			continue;
> +		if (WARN(md->virt_addr == 0,
> +			 "UEFI virtual mapping incomplete or missing -- no entry found for 0x%llx\n",
> +			 md->phys_addr))
> +			return;
> +
> +		paddr = md->phys_addr;
> +		npages = md->num_pages;
> +		memrange_efi_to_native(&paddr, &npages);
> +		size = npages << PAGE_SHIFT;
> +
> +		pr_info("  EFI remap 0x%012llx => %p\n",
> +			md->phys_addr, (void *)md->virt_addr);
> +
> +		/*
> +		 * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
> +		 * executable, everything else can be mapped with the XN bits
> +		 * set.
> +		 */
> +		if (!is_normal_ram(md))
> +			prot = __pgprot(PROT_DEVICE_nGnRE);
> +		else if (md->type == EFI_RUNTIME_SERVICES_CODE)
> +			prot = PAGE_KERNEL_EXEC;
> +		else
> +			prot = PAGE_KERNEL;
> +
> +		create_pgd_mapping(&efi_mm, paddr, md->virt_addr, size, prot);
> +	}
> +	set_bit(EFI_VIRTMAP, &efi.flags);
> +}
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index b80991166754..d8390f507da0 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -402,6 +402,7 @@ void __init setup_arch(char **cmdline_p)
>  	request_standard_resources();
>  
>  	efi_idmap_init();
> +	efi_virtmap_init();

Could these two be merged together into one function?
Say efi_memmap_init()?

>  
>  	unflatten_device_tree();
>  
> diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
> index c846a9608cbd..76bc8abf41d1 100644
> --- a/drivers/firmware/efi/libstub/fdt.c
> +++ b/drivers/firmware/efi/libstub/fdt.c
> @@ -167,6 +167,94 @@ fdt_set_fail:
>  #define EFI_FDT_ALIGN EFI_PAGE_SIZE
>  #endif
>  
> +static efi_status_t get_memory_map(efi_system_table_t *sys_table_arg,
> +				   efi_memory_desc_t **map,
> +				   unsigned long *map_size,
> +				   unsigned long *desc_size,
> +				   u32 *desc_ver, unsigned long *key_ptr)
> +{
> +	efi_status_t status;
> +
> +	/*
> +	 * Call get_memory_map() with 0 size to retrieve the size of the
> +	 * required allocation.
> +	 */
> +	*map_size = 0;
> +	status = efi_call_early(get_memory_map, map_size, NULL,
> +				key_ptr, desc_size, desc_ver);
> +	if (status != EFI_BUFFER_TOO_SMALL)
> +		return EFI_LOAD_ERROR;
> +
> +	/*
> +	 * Add an additional efi_memory_desc_t to map_size because we're doing
> +	 * an allocation which may be in a new descriptor region. Then double it
> +	 * to give us some scratch space to prepare the input virtmap to give
> +	 * to SetVirtualAddressMap(). Note that this is EFI_LOADER_DATA memory,
> +	 * and the kernel memblock_reserve()'s only the size of the actual
> +	 * memory map, so the scratch space is freed again automatically.
> +	 */
> +	*map_size += *desc_size;
> +	status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
> +				*map_size * 2, (void **)map);
> +	if (status != EFI_SUCCESS)
> +		return status;
> +
> +	status = efi_call_early(get_memory_map, map_size, *map,
> +				key_ptr, desc_size, desc_ver);
> +	if (status != EFI_SUCCESS)
> +		efi_call_early(free_pool, *map);
> +	return status;
> +}
> +
> +/*
> + * This is the base address at which to start allocating virtual memory ranges
> + * for UEFI Runtime Services. This is a userland range so that we can use any
> + * allocation we choose, and eliminate the risk of a conflict after kexec.
> + */
> +#define EFI_RT_VIRTUAL_BASE	0x40000000
> +
> +static void update_memory_map(efi_memory_desc_t *memory_map,
> +			      unsigned long map_size, unsigned long desc_size,
> +			      int *count)
> +{
> +	u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
> +	efi_memory_desc_t *out = (void *)memory_map + map_size;
> +	int l;
> +
> +	for (l = 0; l < map_size; l += desc_size) {
> +		efi_memory_desc_t *in = (void *)memory_map + l;
> +		u64 paddr, size;
> +
> +		if (!(in->attribute & EFI_MEMORY_RUNTIME))
> +			continue;
> +
> +		/*
> +		 * Make the mapping compatible with 64k pages: this allows
> +		 * a 4k page size kernel to kexec a 64k page size kernel and
> +		 * vice versa.
> +		 */
> +		paddr = round_down(in->phys_addr, SZ_64K);
> +		size = round_up(in->num_pages * EFI_PAGE_SIZE +
> +				in->phys_addr - paddr, SZ_64K);
> +
> +		/*
> +		 * Avoid wasting memory on PTEs by choosing a virtual base that
> +		 * is compatible with section mappings if this region has the
> +		 * appropriate size and physical alignment. (Sections are 2 MB
> +		 * on 4k granule kernels)
> +		 */
> +		if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
> +			efi_virt_base = round_up(efi_virt_base, SZ_2M);
> +
> +		in->virt_addr = efi_virt_base + in->phys_addr - paddr;
> +		efi_virt_base += size;
> +
> +		memcpy(out, in, desc_size);
> +		out = (void *)out + desc_size;
> +		++*count;
> +	}
> +}
> +
>  /*
>   * Allocate memory for a new FDT, then add EFI, commandline, and
>   * initrd related fields to the FDT.  This routine increases the
> @@ -196,6 +284,7 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
>  	efi_memory_desc_t *memory_map;
>  	unsigned long new_fdt_size;
>  	efi_status_t status;
> +	int runtime_entry_count = 0;
>  
>  	/*
>  	 * Estimate size of new FDT, and allocate memory for it. We
> @@ -216,8 +305,8 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
>  		 * we can get the memory map key  needed for
>  		 * exit_boot_services().
>  		 */
> -		status = efi_get_memory_map(sys_table, &memory_map, &map_size,
> -					    &desc_size, &desc_ver, &mmap_key);
> +		status = get_memory_map(sys_table, &memory_map, &map_size,
> +					&desc_size, &desc_ver, &mmap_key);
>  		if (status != EFI_SUCCESS)
>  			goto fail_free_new_fdt;
>  
> @@ -248,12 +337,52 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
>  		}
>  	}
>  
> +	/*
> +	 * Update the memory map with virtual addresses. The function will also
> +	 * populate the spare second half of the memory_map allocation with
> +	 * copies of just the EFI_MEMORY_RUNTIME entries so that we can pass it
> +	 * straight into SetVirtualAddressMap()
> +	 */
> +	update_memory_map(memory_map, map_size, desc_size,
> +			  &runtime_entry_count);
> +
> +	pr_efi(sys_table,
> +	       "Exiting boot services and installing virtual address map...\n");
> +
>  	/* Now we are ready to exit_boot_services.*/
>  	status = sys_table->boottime->exit_boot_services(handle, mmap_key);
>  
> +	if (status == EFI_SUCCESS) {
> +		efi_set_virtual_address_map_t *svam;
>  
> -	if (status == EFI_SUCCESS)
> -		return status;
> +		/* Install the new virtual address map */
> +		svam = sys_table->runtime->set_virtual_address_map;
> +		status = svam(runtime_entry_count * desc_size, desc_size,
> +			      desc_ver, (void *)memory_map + map_size);
> +
> +		/*
> +		 * We are beyond the point of no return here, so if the call to
> +		 * SetVirtualAddressMap() failed, we need to signal that to the
> +		 * incoming kernel but proceed normally otherwise.
> +		 */
> +		if (status != EFI_SUCCESS) {
> +			int l;
> +
> +			/*
> +			 * Set the virtual address field of all
> +			 * EFI_MEMORY_RUNTIME entries to 0. This will signal
> +			 * the incoming kernel that no virtual translation has
> +			 * been installed.
> +			 */
> +			for (l = 0; l < map_size; l += desc_size) {
> +				efi_memory_desc_t *p = (void *)memory_map + l;
> +
> +				if (p->attribute & EFI_MEMORY_RUNTIME)
> +					p->virt_addr = 0;
> +			}
> +		}
> +		return EFI_SUCCESS;
> +	}
>  
>  	pr_efi_err(sys_table, "Exit boot services failed.\n");
>  
> -- 
> 1.8.3.2

Apart from this, and other comments in the thread, looks good.

/
    Leif

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
       [not found]         ` <20150107120608.GJ3827-t77nlHhSwNqAroYi2ySoxKxOck334EZe@public.gmane.org>
@ 2015-01-07 12:16           ` Ard Biesheuvel
       [not found]             ` <CAKv+Gu_k0_anqvm24P8J4WaMrFLMxOhQETAfUb6xmzaJ+Z2vBQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 30+ messages in thread
From: Ard Biesheuvel @ 2015-01-07 12:16 UTC (permalink / raw)
  To: Leif Lindholm
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Roy Franz, Mark Rutland,
	Catalin Marinas, Will Deacon, Matt Fleming, Borislav Petkov,
	Dave Young, Geoff Levand, Mark Salter

On 7 January 2015 at 12:06, Leif Lindholm <leif.lindholm-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
> On Mon, Dec 22, 2014 at 10:59:02AM +0000, Ard Biesheuvel wrote:
>> In order to support kexec, the kernel needs to be able to deal with the
>> state of the UEFI firmware after SetVirtualAddressMap() has been called.
>> To avoid having separate code paths for non-kexec and kexec, let's move
>> the call to SetVirtualAddressMap() to the stub: this will guarantee us
>> that it will only be called once (since the stub is not executed during
>> kexec), and ensures that the UEFI state is identical between kexec and
>> normal boot.
>>
>> This implies that the layout of the virtual mapping needs to be created
>> by the stub as well. All regions are rounded up to a naturally aligned
>> multiple of 64 KB (for compatibility with 64k pages kernels) and recorded
>> in the UEFI memory map. The kernel proper reads those values and installs
>> the mappings in a dedicated set of page tables that are swapped in during
>> UEFI Runtime Services calls.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>> ---
>>  arch/arm64/include/asm/efi.h       |  20 +++-
>>  arch/arm64/kernel/efi.c            | 223 ++++++++++++++++++++-----------------
>>  arch/arm64/kernel/setup.c          |   1 +
>>  drivers/firmware/efi/libstub/fdt.c | 137 ++++++++++++++++++++++-
>>  4 files changed, 270 insertions(+), 111 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
>> index 71291253114f..6cc668a378c5 100644
>> --- a/arch/arm64/include/asm/efi.h
>> +++ b/arch/arm64/include/asm/efi.h
>> @@ -7,28 +7,36 @@
>>  #ifdef CONFIG_EFI
>>  extern void efi_init(void);
>>  extern void efi_idmap_init(void);
>> +extern void efi_virtmap_init(void);
>>  #else
>>  #define efi_init()
>>  #define efi_idmap_init()
>> +#define efi_virtmap_init
>>  #endif
>>
>>  #define efi_call_virt(f, ...)                                                \
>>  ({                                                                   \
>> -     efi_##f##_t *__f = efi.systab->runtime->f;                      \
>> +     efi_##f##_t *__f;                                               \
>>       efi_status_t __s;                                               \
>>                                                                       \
>> -     kernel_neon_begin();                                            \
>> +     kernel_neon_begin();    /* disables preemption */               \
>
> Nitpick: adding comment to otherwise untouched source line.
>
>> +     efi_virtmap_load();                                             \
>> +     __f = efi.systab->runtime->f;                                   \
>>       __s = __f(__VA_ARGS__);                                         \
>> +     efi_virtmap_unload();                                           \
>>       kernel_neon_end();                                              \
>>       __s;                                                            \
>>  })
>>
>>  #define __efi_call_virt(f, ...)                                              \
>>  ({                                                                   \
>> -     efi_##f##_t *__f = efi.systab->runtime->f;                      \
>> +     efi_##f##_t *__f;                                               \
>>                                                                       \
>> -     kernel_neon_begin();                                            \
>> +     kernel_neon_begin();    /* disables preemption */               \
>
> Same nitpick.
>

Is there anything wrong with that?
Would you prefer the comment to be on a separate line?

>> +     efi_virtmap_load();                                             \
>> +     __f = efi.systab->runtime->f;                                   \
>>       __f(__VA_ARGS__);                                               \
>> +     efi_virtmap_unload();                                           \
>>       kernel_neon_end();                                              \
>>  })
>>
>> @@ -45,5 +53,9 @@ extern void efi_idmap_init(void);
>>  #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__)
>>
>>  #define EFI_ALLOC_ALIGN              SZ_64K
>> +#define EFI_VIRTMAP          EFI_ARCH_1
>> +
>> +void efi_virtmap_load(void);
>> +void efi_virtmap_unload(void);
>>
>>  #endif /* _ASM_EFI_H */
>> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
>> index 6fac253bc783..2ebe67ffb629 100644
>> --- a/arch/arm64/kernel/efi.c
>> +++ b/arch/arm64/kernel/efi.c
>> @@ -11,25 +11,30 @@
>>   *
>>   */
>>
>> +#include <linux/atomic.h>
>>  #include <linux/dmi.h>
>>  #include <linux/efi.h>
>>  #include <linux/export.h>
>>  #include <linux/memblock.h>
>> +#include <linux/mm_types.h>
>>  #include <linux/bootmem.h>
>>  #include <linux/of.h>
>>  #include <linux/of_fdt.h>
>> +#include <linux/rbtree.h>
>> +#include <linux/rwsem.h>
>>  #include <linux/sched.h>
>>  #include <linux/slab.h>
>> +#include <linux/spinlock.h>
>>
>>  #include <asm/cacheflush.h>
>>  #include <asm/efi.h>
>>  #include <asm/tlbflush.h>
>>  #include <asm/mmu_context.h>
>> +#include <asm/mmu.h>
>> +#include <asm/pgtable.h>
>>
>>  struct efi_memory_map memmap;
>>
>> -static efi_runtime_services_t *runtime;
>> -
>>  static u64 efi_system_table;
>>
>>  static int uefi_debug __initdata;
>> @@ -69,9 +74,33 @@ static void __init efi_setup_idmap(void)
>>       }
>>  }
>>
>> +/*
>> + * Translate a EFI virtual address into a physical address: this is necessary,
>> + * as some data members of the EFI system table are virtually remapped after
>> + * SetVirtualAddressMap() has been called.
>> + */
>> +static phys_addr_t efi_to_phys(unsigned long addr)
>> +{
>> +     efi_memory_desc_t *md;
>> +
>> +     for_each_efi_memory_desc(&memmap, md) {
>> +             if (!(md->attribute & EFI_MEMORY_RUNTIME))
>> +                     continue;
>> +             if (md->virt_addr == 0)
>> +                     /* no virtual mapping has been installed by the stub */
>> +                     break;
>> +             if (md->virt_addr <= addr &&
>> +                 (addr - md->virt_addr) < (md->num_pages << EFI_PAGE_SHIFT))
>> +                     return md->phys_addr + addr - md->virt_addr;
>> +     }
>> +     return addr;
>> +}
>> +
>>  static int __init uefi_init(void)
>>  {
>>       efi_char16_t *c16;
>> +     void *config_tables;
>> +     u64 table_size;
>>       char vendor[100] = "unknown";
>>       int i, retval;
>>
>> @@ -99,7 +128,7 @@ static int __init uefi_init(void)
>>                       efi.systab->hdr.revision & 0xffff);
>>
>>       /* Show what we know for posterity */
>> -     c16 = early_memremap(efi.systab->fw_vendor,
>> +     c16 = early_memremap(efi_to_phys(efi.systab->fw_vendor),
>>                            sizeof(vendor));
>>       if (c16) {
>>               for (i = 0; i < (int) sizeof(vendor) - 1 && *c16; ++i)
>> @@ -112,8 +141,14 @@ static int __init uefi_init(void)
>>               efi.systab->hdr.revision >> 16,
>>               efi.systab->hdr.revision & 0xffff, vendor);
>>
>> -     retval = efi_config_init(NULL);
>> +     table_size = sizeof(efi_config_table_64_t) * efi.systab->nr_tables;
>> +     config_tables = early_memremap(efi_to_phys(efi.systab->tables),
>> +                                    table_size);
>> +
>> +     retval = efi_config_parse_tables(config_tables,
>> +                                      efi.systab->nr_tables, NULL);
>>
>> +     early_memunmap(config_tables, table_size);
>>  out:
>>       early_memunmap(efi.systab,  sizeof(efi_system_table_t));
>>       return retval;
>> @@ -328,51 +363,9 @@ void __init efi_idmap_init(void)
>>       efi_setup_idmap();
>>  }
>>
>> -static int __init remap_region(efi_memory_desc_t *md, void **new)
>> -{
>> -     u64 paddr, vaddr, npages, size;
>> -
>> -     paddr = md->phys_addr;
>> -     npages = md->num_pages;
>> -     memrange_efi_to_native(&paddr, &npages);
>> -     size = npages << PAGE_SHIFT;
>> -
>> -     if (is_normal_ram(md))
>> -             vaddr = (__force u64)ioremap_cache(paddr, size);
>> -     else
>> -             vaddr = (__force u64)ioremap(paddr, size);
>> -
>> -     if (!vaddr) {
>> -             pr_err("Unable to remap 0x%llx pages @ %p\n",
>> -                    npages, (void *)paddr);
>> -             return 0;
>> -     }
>> -
>> -     /* adjust for any rounding when EFI and system pagesize differs */
>> -     md->virt_addr = vaddr + (md->phys_addr - paddr);
>> -
>> -     if (uefi_debug)
>> -             pr_info("  EFI remap 0x%012llx => %p\n",
>> -                     md->phys_addr, (void *)md->virt_addr);
>> -
>> -     memcpy(*new, md, memmap.desc_size);
>> -     *new += memmap.desc_size;
>> -
>> -     return 1;
>> -}
>> -
>> -/*
>> - * Switch UEFI from an identity map to a kernel virtual map
>> - */
>
> No function description at all?

Seems I was a bit lazy there.

> Arguably this function could change name now as well, since UEFI will
> already be in virtual mode. arm64_enable_runtime_map()?
>

OK

>>  static int __init arm64_enter_virtual_mode(void)
>>  {
>> -     efi_memory_desc_t *md;
>> -     phys_addr_t virtmap_phys;
>> -     void *virtmap, *virt_md;
>> -     efi_status_t status;
>>       u64 mapsize;
>> -     int count = 0;
>> -     unsigned long flags;
>>
>>       if (!efi_enabled(EFI_BOOT)) {
>>               pr_info("EFI services will not be available.\n");
>> @@ -395,79 +388,28 @@ static int __init arm64_enter_virtual_mode(void)
>>
>>       efi.memmap = &memmap;
>>
>> -     /* Map the runtime regions */
>> -     virtmap = kmalloc(mapsize, GFP_KERNEL);
>> -     if (!virtmap) {
>> -             pr_err("Failed to allocate EFI virtual memmap\n");
>> -             return -1;
>> -     }
>> -     virtmap_phys = virt_to_phys(virtmap);
>> -     virt_md = virtmap;
>> -
>> -     for_each_efi_memory_desc(&memmap, md) {
>> -             if (!(md->attribute & EFI_MEMORY_RUNTIME))
>> -                     continue;
>> -             if (!remap_region(md, &virt_md))
>> -                     goto err_unmap;
>> -             ++count;
>> -     }
>> -
>> -     efi.systab = (__force void *)efi_lookup_mapped_addr(efi_system_table);
>> +     efi.systab = (__force void *)ioremap_cache(efi_system_table,
>> +                                                sizeof(efi_system_table_t));
>>       if (!efi.systab) {
>> -             /*
>> -              * If we have no virtual mapping for the System Table at this
>> -              * point, the memory map doesn't cover the physical offset where
>> -              * it resides. This means the System Table will be inaccessible
>> -              * to Runtime Services themselves once the virtual mapping is
>> -              * installed.
>> -              */
>> -             pr_err("Failed to remap EFI System Table -- buggy firmware?\n");
>> -             goto err_unmap;
>> +             pr_err("Failed to remap EFI System Table\n");
>> +             return -1;
>>       }
>>       set_bit(EFI_SYSTEM_TABLES, &efi.flags);
>>
>> -     local_irq_save(flags);
>> -     cpu_switch_mm(idmap_pg_dir, &init_mm);
>> -
>> -     /* Call SetVirtualAddressMap with the physical address of the map */
>> -     runtime = efi.systab->runtime;
>> -     efi.set_virtual_address_map = runtime->set_virtual_address_map;
>> -
>> -     status = efi.set_virtual_address_map(count * memmap.desc_size,
>> -                                          memmap.desc_size,
>> -                                          memmap.desc_version,
>> -                                          (efi_memory_desc_t *)virtmap_phys);
>> -     cpu_set_reserved_ttbr0();
>> -     flush_tlb_all();
>> -     local_irq_restore(flags);
>> -
>> -     kfree(virtmap);
>> -
>>       free_boot_services();
>>
>> -     if (status != EFI_SUCCESS) {
>> -             pr_err("Failed to set EFI virtual address map! [%lx]\n",
>> -                     status);
>> +     if (!efi_enabled(EFI_VIRTMAP)) {
>> +             pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
>>               return -1;
>>       }
>>
>>       /* Set up runtime services function pointers */
>> -     runtime = efi.systab->runtime;
>>       efi_native_runtime_setup();
>>       set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
>>
>>       efi.runtime_version = efi.systab->hdr.revision;
>>
>>       return 0;
>> -
>> -err_unmap:
>> -     /* unmap all mappings that succeeded: there are 'count' of those */
>> -     for (virt_md = virtmap; count--; virt_md += memmap.desc_size) {
>> -             md = virt_md;
>> -             iounmap((__force void __iomem *)md->virt_addr);
>> -     }
>> -     kfree(virtmap);
>> -     return -1;
>>  }
>>  early_initcall(arm64_enter_virtual_mode);
>>
>> @@ -484,3 +426,78 @@ static int __init arm64_dmi_init(void)
>>       return 0;
>>  }
>>  core_initcall(arm64_dmi_init);
>> +
>> +static pgd_t efi_pgd[PTRS_PER_PGD] __page_aligned_bss;
>> +
>> +static struct mm_struct efi_mm = {
>> +     .mm_rb                  = RB_ROOT,
>> +     .pgd                    = efi_pgd,
>> +     .mm_users               = ATOMIC_INIT(2),
>> +     .mm_count               = ATOMIC_INIT(1),
>> +     .mmap_sem               = __RWSEM_INITIALIZER(efi_mm.mmap_sem),
>> +     .page_table_lock        = __SPIN_LOCK_UNLOCKED(efi_mm.page_table_lock),
>> +     .mmlist                 = LIST_HEAD_INIT(efi_mm.mmlist),
>> +     INIT_MM_CONTEXT(efi_mm)
>> +};
>> +
>> +static void efi_set_pgd(struct mm_struct *mm)
>> +{
>> +     cpu_switch_mm(mm->pgd, mm);
>> +     flush_tlb_all();
>> +     if (icache_is_aivivt())
>> +             __flush_icache_all();
>> +}
>> +
>> +void efi_virtmap_load(void)
>> +{
>> +     WARN_ON(preemptible());
>> +     efi_set_pgd(&efi_mm);
>> +}
>> +
>> +void efi_virtmap_unload(void)
>> +{
>> +     efi_set_pgd(current->active_mm);
>> +}
>> +
>> +void __init efi_virtmap_init(void)
>> +{
>> +     efi_memory_desc_t *md;
>> +
>> +     if (!efi_enabled(EFI_BOOT))
>> +             return;
>> +
>> +     for_each_efi_memory_desc(&memmap, md) {
>> +             u64 paddr, npages, size;
>> +             pgprot_t prot;
>> +
>> +             if (!(md->attribute & EFI_MEMORY_RUNTIME))
>> +                     continue;
>> +             if (WARN(md->virt_addr == 0,
>> +                      "UEFI virtual mapping incomplete or missing -- no entry found for 0x%llx\n",
>> +                      md->phys_addr))
>> +                     return;
>> +
>> +             paddr = md->phys_addr;
>> +             npages = md->num_pages;
>> +             memrange_efi_to_native(&paddr, &npages);
>> +             size = npages << PAGE_SHIFT;
>> +
>> +             pr_info("  EFI remap 0x%012llx => %p\n",
>> +                     md->phys_addr, (void *)md->virt_addr);
>> +
>> +             /*
>> +              * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
>> +              * executable, everything else can be mapped with the XN bits
>> +              * set.
>> +              */
>> +             if (!is_normal_ram(md))
>> +                     prot = __pgprot(PROT_DEVICE_nGnRE);
>> +             else if (md->type == EFI_RUNTIME_SERVICES_CODE)
>> +                     prot = PAGE_KERNEL_EXEC;
>> +             else
>> +                     prot = PAGE_KERNEL;
>> +
>> +             create_pgd_mapping(&efi_mm, paddr, md->virt_addr, size, prot);
>> +     }
>> +     set_bit(EFI_VIRTMAP, &efi.flags);
>> +}
>> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
>> index b80991166754..d8390f507da0 100644
>> --- a/arch/arm64/kernel/setup.c
>> +++ b/arch/arm64/kernel/setup.c
>> @@ -402,6 +402,7 @@ void __init setup_arch(char **cmdline_p)
>>       request_standard_resources();
>>
>>       efi_idmap_init();
>> +     efi_virtmap_init();
>
> Could these two be merged together into one function?
> Say efi_memmap_init()?
>

Well, I decided to do it like this because efi_idmap_init() gets
removed in its entirety (including this invocation) in a subsequent
patch.

>>
>>       unflatten_device_tree();
>>
>> diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
>> index c846a9608cbd..76bc8abf41d1 100644
>> --- a/drivers/firmware/efi/libstub/fdt.c
>> +++ b/drivers/firmware/efi/libstub/fdt.c
>> @@ -167,6 +167,94 @@ fdt_set_fail:
>>  #define EFI_FDT_ALIGN EFI_PAGE_SIZE
>>  #endif
>>
>> +static efi_status_t get_memory_map(efi_system_table_t *sys_table_arg,
>> +                                efi_memory_desc_t **map,
>> +                                unsigned long *map_size,
>> +                                unsigned long *desc_size,
>> +                                u32 *desc_ver, unsigned long *key_ptr)
>> +{
>> +     efi_status_t status;
>> +
>> +     /*
>> +      * Call get_memory_map() with 0 size to retrieve the size of the
>> +      * required allocation.
>> +      */
>> +     *map_size = 0;
>> +     status = efi_call_early(get_memory_map, map_size, NULL,
>> +                             key_ptr, desc_size, desc_ver);
>> +     if (status != EFI_BUFFER_TOO_SMALL)
>> +             return EFI_LOAD_ERROR;
>> +
>> +     /*
>> +      * Add an additional efi_memory_desc_t to map_size because we're doing
>> +      * an allocation which may be in a new descriptor region. Then double it
>> +      * to give us some scratch space to prepare the input virtmap to give
>> +      * to SetVirtualAddressMap(). Note that this is EFI_LOADER_DATA memory,
>> +      * and the kernel memblock_reserve()'s only the size of the actual
>> +      * memory map, so the scratch space is freed again automatically.
>> +      */
>> +     *map_size += *desc_size;
>> +     status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
>> +                             *map_size * 2, (void **)map);
>> +     if (status != EFI_SUCCESS)
>> +             return status;
>> +
>> +     status = efi_call_early(get_memory_map, map_size, *map,
>> +                             key_ptr, desc_size, desc_ver);
>> +     if (status != EFI_SUCCESS)
>> +             efi_call_early(free_pool, *map);
>> +     return status;
>> +}
>> +
>> +/*
>> + * This is the base address at which to start allocating virtual memory ranges
>> + * for UEFI Runtime Services. This is a userland range so that we can use any
>> + * allocation we choose, and eliminate the risk of a conflict after kexec.
>> + */
>> +#define EFI_RT_VIRTUAL_BASE  0x40000000
>> +
>> +static void update_memory_map(efi_memory_desc_t *memory_map,
>> +                           unsigned long map_size, unsigned long desc_size,
>> +                           int *count)
>> +{
>> +     u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
>> +     efi_memory_desc_t *out = (void *)memory_map + map_size;
>> +     int l;
>> +
>> +     for (l = 0; l < map_size; l += desc_size) {
>> +             efi_memory_desc_t *in = (void *)memory_map + l;
>> +             u64 paddr, size;
>> +
>> +             if (!(in->attribute & EFI_MEMORY_RUNTIME))
>> +                     continue;
>> +
>> +             /*
>> +              * Make the mapping compatible with 64k pages: this allows
>> +              * a 4k page size kernel to kexec a 64k page size kernel and
>> +              * vice versa.
>> +              */
>> +             paddr = round_down(in->phys_addr, SZ_64K);
>> +             size = round_up(in->num_pages * EFI_PAGE_SIZE +
>> +                             in->phys_addr - paddr, SZ_64K);
>> +
>> +             /*
>> +              * Avoid wasting memory on PTEs by choosing a virtual base that
>> +              * is compatible with section mappings if this region has the
>> +              * appropriate size and physical alignment. (Sections are 2 MB
>> +              * on 4k granule kernels)
>> +              */
>> +             if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
>> +                     efi_virt_base = round_up(efi_virt_base, SZ_2M);
>> +
>> +             in->virt_addr = efi_virt_base + in->phys_addr - paddr;
>> +             efi_virt_base += size;
>> +
>> +             memcpy(out, in, desc_size);
>> +             out = (void *)out + desc_size;
>> +             ++*count;
>> +     }
>> +}
>> +
>>  /*
>>   * Allocate memory for a new FDT, then add EFI, commandline, and
>>   * initrd related fields to the FDT.  This routine increases the
>> @@ -196,6 +284,7 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
>>       efi_memory_desc_t *memory_map;
>>       unsigned long new_fdt_size;
>>       efi_status_t status;
>> +     int runtime_entry_count = 0;
>>
>>       /*
>>        * Estimate size of new FDT, and allocate memory for it. We
>> @@ -216,8 +305,8 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
>>                * we can get the memory map key  needed for
>>                * exit_boot_services().
>>                */
>> -             status = efi_get_memory_map(sys_table, &memory_map, &map_size,
>> -                                         &desc_size, &desc_ver, &mmap_key);
>> +             status = get_memory_map(sys_table, &memory_map, &map_size,
>> +                                     &desc_size, &desc_ver, &mmap_key);
>>               if (status != EFI_SUCCESS)
>>                       goto fail_free_new_fdt;
>>
>> @@ -248,12 +337,52 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
>>               }
>>       }
>>
>> +     /*
>> +      * Update the memory map with virtual addresses. The function will also
>> +      * populate the spare second half of the memory_map allocation with
>> +      * copies of just the EFI_MEMORY_RUNTIME entries so that we can pass it
>> +      * straight into SetVirtualAddressMap()
>> +      */
>> +     update_memory_map(memory_map, map_size, desc_size,
>> +                       &runtime_entry_count);
>> +
>> +     pr_efi(sys_table,
>> +            "Exiting boot services and installing virtual address map...\n");
>> +
>>       /* Now we are ready to exit_boot_services.*/
>>       status = sys_table->boottime->exit_boot_services(handle, mmap_key);
>>
>> +     if (status == EFI_SUCCESS) {
>> +             efi_set_virtual_address_map_t *svam;
>>
>> -     if (status == EFI_SUCCESS)
>> -             return status;
>> +             /* Install the new virtual address map */
>> +             svam = sys_table->runtime->set_virtual_address_map;
>> +             status = svam(runtime_entry_count * desc_size, desc_size,
>> +                           desc_ver, (void *)memory_map + map_size);
>> +
>> +             /*
>> +              * We are beyond the point of no return here, so if the call to
>> +              * SetVirtualAddressMap() failed, we need to signal that to the
>> +              * incoming kernel but proceed normally otherwise.
>> +              */
>> +             if (status != EFI_SUCCESS) {
>> +                     int l;
>> +
>> +                     /*
>> +                      * Set the virtual address field of all
>> +                      * EFI_MEMORY_RUNTIME entries to 0. This will signal
>> +                      * the incoming kernel that no virtual translation has
>> +                      * been installed.
>> +                      */
>> +                     for (l = 0; l < map_size; l += desc_size) {
>> +                             efi_memory_desc_t *p = (void *)memory_map + l;
>> +
>> +                             if (p->attribute & EFI_MEMORY_RUNTIME)
>> +                                     p->virt_addr = 0;
>> +                     }
>> +             }
>> +             return EFI_SUCCESS;
>> +     }
>>
>>       pr_efi_err(sys_table, "Exit boot services failed.\n");
>>
>> --
>> 1.8.3.2
>
> Apart from this, and other comments in the thread, looks good.
>
> /
>     Leif
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
       [not found]             ` <CAKv+Gu_k0_anqvm24P8J4WaMrFLMxOhQETAfUb6xmzaJ+Z2vBQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-01-07 12:25               ` Leif Lindholm
       [not found]                 ` <20150107122515.GK3827-t77nlHhSwNqAroYi2ySoxKxOck334EZe@public.gmane.org>
  0 siblings, 1 reply; 30+ messages in thread
From: Leif Lindholm @ 2015-01-07 12:25 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Roy Franz, Mark Rutland,
	Catalin Marinas, Will Deacon, Matt Fleming, Borislav Petkov,
	Dave Young, Geoff Levand, Mark Salter

On Wed, Jan 07, 2015 at 12:16:02PM +0000, Ard Biesheuvel wrote:
> >> diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
> >> index 71291253114f..6cc668a378c5 100644
> >> --- a/arch/arm64/include/asm/efi.h
> >> +++ b/arch/arm64/include/asm/efi.h
> >> @@ -7,28 +7,36 @@
> >>  #ifdef CONFIG_EFI
> >>  extern void efi_init(void);
> >>  extern void efi_idmap_init(void);
> >> +extern void efi_virtmap_init(void);
> >>  #else
> >>  #define efi_init()
> >>  #define efi_idmap_init()
> >> +#define efi_virtmap_init
> >>  #endif
> >>
> >>  #define efi_call_virt(f, ...)                                                \
> >>  ({                                                                   \
> >> -     efi_##f##_t *__f = efi.systab->runtime->f;                      \
> >> +     efi_##f##_t *__f;                                               \
> >>       efi_status_t __s;                                               \
> >>                                                                       \
> >> -     kernel_neon_begin();                                            \
> >> +     kernel_neon_begin();    /* disables preemption */               \
> >
> > Nitpick: adding comment to otherwise untouched source line.
> >
> >> +     efi_virtmap_load();                                             \
> >> +     __f = efi.systab->runtime->f;                                   \
> >>       __s = __f(__VA_ARGS__);                                         \
> >> +     efi_virtmap_unload();                                           \
> >>       kernel_neon_end();                                              \
> >>       __s;                                                            \
> >>  })
> >>
> >>  #define __efi_call_virt(f, ...)                                              \
> >>  ({                                                                   \
> >> -     efi_##f##_t *__f = efi.systab->runtime->f;                      \
> >> +     efi_##f##_t *__f;                                               \
> >>                                                                       \
> >> -     kernel_neon_begin();                                            \
> >> +     kernel_neon_begin();    /* disables preemption */               \
> >
> > Same nitpick.
> >
> 
> Is there anything wrong with that?

I said nitpick.

My (very minor) objection is that a (very reasonable) comment is added
to existing functionality by a patch that adds new functionality. It
makes the git blame/praise output less clear.

> Would you prefer the comment to be on a separate line?

I would _prefer_ the comments to be a separate patch.
But again, a nitpick.
 
> >> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> >> index b80991166754..d8390f507da0 100644
> >> --- a/arch/arm64/kernel/setup.c
> >> +++ b/arch/arm64/kernel/setup.c
> >> @@ -402,6 +402,7 @@ void __init setup_arch(char **cmdline_p)
> >>       request_standard_resources();
> >>
> >>       efi_idmap_init();
> >> +     efi_virtmap_init();
> >
> > Could these two be merged together into one function?
> > Say efi_memmap_init()?
> >
> 
> Well, I decided to do it like this because efi_idmap_init() gets
> removed in its entirety (including this invocation) in a subsequent
> patch.

Ah yeah. I knew that yesterday. Ignore me. More coffee.
 
/
    Leif

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
       [not found]                 ` <20150107122515.GK3827-t77nlHhSwNqAroYi2ySoxKxOck334EZe@public.gmane.org>
@ 2015-01-07 12:30                   ` Ard Biesheuvel
       [not found]                     ` <CAKv+Gu-aRkoRj5zVXrsBc0APCehZ9W926J6fxH380K9EURXe_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 30+ messages in thread
From: Ard Biesheuvel @ 2015-01-07 12:30 UTC (permalink / raw)
  To: Leif Lindholm
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Roy Franz, Mark Rutland,
	Catalin Marinas, Will Deacon, Matt Fleming, Borislav Petkov,
	Dave Young, Geoff Levand, Mark Salter

On 7 January 2015 at 12:25, Leif Lindholm <leif.lindholm-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
> On Wed, Jan 07, 2015 at 12:16:02PM +0000, Ard Biesheuvel wrote:
>> >> diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
>> >> index 71291253114f..6cc668a378c5 100644
>> >> --- a/arch/arm64/include/asm/efi.h
>> >> +++ b/arch/arm64/include/asm/efi.h
>> >> @@ -7,28 +7,36 @@
>> >>  #ifdef CONFIG_EFI
>> >>  extern void efi_init(void);
>> >>  extern void efi_idmap_init(void);
>> >> +extern void efi_virtmap_init(void);
>> >>  #else
>> >>  #define efi_init()
>> >>  #define efi_idmap_init()
>> >> +#define efi_virtmap_init
>> >>  #endif
>> >>
>> >>  #define efi_call_virt(f, ...)                                                \
>> >>  ({                                                                   \
>> >> -     efi_##f##_t *__f = efi.systab->runtime->f;                      \
>> >> +     efi_##f##_t *__f;                                               \
>> >>       efi_status_t __s;                                               \
>> >>                                                                       \
>> >> -     kernel_neon_begin();                                            \
>> >> +     kernel_neon_begin();    /* disables preemption */               \
>> >
>> > Nitpick: adding comment to otherwise untouched source line.
>> >
>> >> +     efi_virtmap_load();                                             \
>> >> +     __f = efi.systab->runtime->f;                                   \
>> >>       __s = __f(__VA_ARGS__);                                         \
>> >> +     efi_virtmap_unload();                                           \
>> >>       kernel_neon_end();                                              \
>> >>       __s;                                                            \
>> >>  })
>> >>
>> >>  #define __efi_call_virt(f, ...)                                              \
>> >>  ({                                                                   \
>> >> -     efi_##f##_t *__f = efi.systab->runtime->f;                      \
>> >> +     efi_##f##_t *__f;                                               \
>> >>                                                                       \
>> >> -     kernel_neon_begin();                                            \
>> >> +     kernel_neon_begin();    /* disables preemption */               \
>> >
>> > Same nitpick.
>> >
>>
>> Is there anything wrong with that?
>
> I said nitpick.
>
> My (very minor) objection is that a (very reasonable) comment is added
> to existing functionality by a patch that adds new functionality. It
> makes the git blame/praise output less clear.
>
>> Would you prefer the comment to be on a separate line?
>
> I would _prefer_ the comments to be a separate patch.
> But again, a nitpick.
>

Well, adding the comment is relevant to this patch, as we need to
disable preemption now before switching to the new address space. In
fact, it might be better even to drop the comment, and add an explicit
(if redundant) preempt_disable/preempt_enable pair.


>> >> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
>> >> index b80991166754..d8390f507da0 100644
>> >> --- a/arch/arm64/kernel/setup.c
>> >> +++ b/arch/arm64/kernel/setup.c
>> >> @@ -402,6 +402,7 @@ void __init setup_arch(char **cmdline_p)
>> >>       request_standard_resources();
>> >>
>> >>       efi_idmap_init();
>> >> +     efi_virtmap_init();
>> >
>> > Could these two be merged together into one function?
>> > Say efi_memmap_init()?
>> >
>>
>> Well, I decided to do it like this because efi_idmap_init() gets
>> removed in its entirety (including this invocation) in a subsequent
>> patch.
>
> Ah yeah. I knew that yesterday. Ignore me. More coffee.
>
> /
>     Leif

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
       [not found]                     ` <CAKv+Gu-aRkoRj5zVXrsBc0APCehZ9W926J6fxH380K9EURXe_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-01-07 12:56                       ` Leif Lindholm
  0 siblings, 0 replies; 30+ messages in thread
From: Leif Lindholm @ 2015-01-07 12:56 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Roy Franz, Mark Rutland,
	Catalin Marinas, Will Deacon, Matt Fleming, Borislav Petkov,
	Dave Young, Geoff Levand, Mark Salter

On Wed, Jan 07, 2015 at 12:30:27PM +0000, Ard Biesheuvel wrote:
> >> >> +     kernel_neon_begin();    /* disables preemption */               \
> >> >
> >> > Same nitpick.
> >> >
> >>
> >> Is there anything wrong with that?
> >
> > I said nitpick.
> >
> > My (very minor) objection is that a (very reasonable) comment is added
> > to existing functionality by a patch that adds new functionality. It
> > makes the git blame/praise output less clear.
> >
> >> Would you prefer the comment to be on a separate line?
> >
> > I would _prefer_ the comments to be a separate patch.
> > But again, a nitpick.
> >
> 
> Well, adding the comment is relevant to this patch, as we need to
> disable preemption now before switching to the new address space. In
> fact, it might be better even to drop the comment, and add an explicit
> (if redundant) preempt_disable/preempt_enable pair.

That would be more clear, yes.

/
    Leif

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
  2015-01-05 16:20       ` Mark Rutland
  2015-01-06 17:13         ` Leif Lindholm
@ 2015-01-07 18:05         ` Ard Biesheuvel
  1 sibling, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2015-01-07 18:05 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, Catalin Marinas, Will Deacon,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA

On 5 January 2015 at 16:20, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
> Hi Ard,
>
> I have a few (minor) comments below.
>
> On Mon, Dec 22, 2014 at 10:59:02AM +0000, Ard Biesheuvel wrote:
>> In order to support kexec, the kernel needs to be able to deal with the
>> state of the UEFI firmware after SetVirtualAddressMap() has been called.
>> To avoid having separate code paths for non-kexec and kexec, let's move
>> the call to SetVirtualAddressMap() to the stub: this will guarantee us
>> that it will only be called once (since the stub is not executed during
>> kexec), and ensures that the UEFI state is identical between kexec and
>> normal boot.
>>
>> This implies that the layout of the virtual mapping needs to be created
>> by the stub as well. All regions are rounded up to a naturally aligned
>> multiple of 64 KB (for compatibility with 64k pages kernels) and recorded
>> in the UEFI memory map. The kernel proper reads those values and installs
>> the mappings in a dedicated set of page tables that are swapped in during
>> UEFI Runtime Services calls.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>> ---
>>  arch/arm64/include/asm/efi.h       |  20 +++-
>>  arch/arm64/kernel/efi.c            | 223 ++++++++++++++++++++-----------------
>>  arch/arm64/kernel/setup.c          |   1 +
>>  drivers/firmware/efi/libstub/fdt.c | 137 ++++++++++++++++++++++-
>>  4 files changed, 270 insertions(+), 111 deletions(-)
>
> [...]
>
>> +static void efi_set_pgd(struct mm_struct *mm)
>> +{
>> +       cpu_switch_mm(mm->pgd, mm);
>> +       flush_tlb_all();
>> +       if (icache_is_aivivt())
>> +               __flush_icache_all();
>> +}
>
> Do we have any idea how often we call runtime services?
>
> I assume not all that often (read the RTC at boot, set/get variables).
>
> If we're nuking the TLBs and I-cache a lot we'll probably need to
> reserve an asid for the EFI virtmap.
>

As pointed out by Leif, we should not be calling runtime services
often enough for this to have an impact imo

>> +void __init efi_virtmap_init(void)
>> +{
>> +       efi_memory_desc_t *md;
>> +
>> +       if (!efi_enabled(EFI_BOOT))
>> +               return;
>> +
>> +       for_each_efi_memory_desc(&memmap, md) {
>> +               u64 paddr, npages, size;
>> +               pgprot_t prot;
>> +
>> +               if (!(md->attribute & EFI_MEMORY_RUNTIME))
>> +                       continue;
>> +               if (WARN(md->virt_addr == 0,
>> +                        "UEFI virtual mapping incomplete or missing -- no entry found for 0x%llx\n",
>> +                        md->phys_addr))
>> +                       return;
>
> I wonder if we might want to use a different address to signal a bad
> mapping (e.g. 0UL-1 as it's not even EFI_PAGE_SIZE aligned), just in
> case we have to handle a valid use of zero in future for some reason --
> perhaps coming from another OS.
>

The spec suggests that a failed call to SetVirtualAddressMap() means
that the old 1:1 mapping is still in effect, and booting can continue
as if the call was never made (This is suggested by the list of return
codes, which are all based on validation of the input) That is also
why efi_to_phys() returns the physical address if it doesn't find a
virtual address, i.e., so that we can at least initialize EFI and the
memory map. So afaict, there is really no reason to distinguish
between 'SVAM() was not called' and 'SVAM() was called but didn't like
its input args'.

>> +
>> +               paddr = md->phys_addr;
>> +               npages = md->num_pages;
>> +               memrange_efi_to_native(&paddr, &npages);
>> +               size = npages << PAGE_SHIFT;
>> +
>> +               pr_info("  EFI remap 0x%012llx => %p\n",
>
> Why not use %016llx? We might only have 48-bit PAs currently, but we may
> as well keep the VA and PA equally long when printing out -- that'll
> also help to identify issues with garbage in the upper 16 bits of the
> PA field.
>
> [...]
>

Sure, why not. It is code I carried over from the previous version,
but I am happy to change it here


>> +/*
>> + * This is the base address at which to start allocating virtual memory ranges
>> + * for UEFI Runtime Services. This is a userland range so that we can use any
>> + * allocation we choose, and eliminate the risk of a conflict after kexec.
>> + */
>> +#define EFI_RT_VIRTUAL_BASE    0x40000000
>
> Nit: I'm not keen on the term "userland" here. You can map stuff to EL0
> in the high half of the address space if you wanted, so TTBR0/TTBR1
> aren't architecturally user/kernel.
>
> s/userland/low half/, perhaps?
>

OK

> It might also be worth pointing out that the arbitrary base address
> isn't zero so as to be less likely to be an idmap.
>

It is the largest non-zero base2 boundary that can reasonably be
shared with v7, even with a 2:2 split.
I will add a comment to that effect.

>> +
>> +static void update_memory_map(efi_memory_desc_t *memory_map,
>> +                             unsigned long map_size, unsigned long desc_size,
>> +                             int *count)
>> +{
>> +       u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
>> +       efi_memory_desc_t *out = (void *)memory_map + map_size;
>> +       int l;
>> +
>> +       for (l = 0; l < map_size; l += desc_size) {
>> +               efi_memory_desc_t *in = (void *)memory_map + l;
>> +               u64 paddr, size;
>> +
>> +               if (!(in->attribute & EFI_MEMORY_RUNTIME))
>> +                       continue;
>> +
>> +               /*
>> +                * Make the mapping compatible with 64k pages: this allows
>> +                * a 4k page size kernel to kexec a 64k page size kernel and
>> +                * vice versa.
>> +                */
>> +               paddr = round_down(in->phys_addr, SZ_64K);
>> +               size = round_up(in->num_pages * EFI_PAGE_SIZE +
>> +                               in->phys_addr - paddr, SZ_64K);
>> +
>> +               /*
>> +                * Avoid wasting memory on PTEs by choosing a virtual base that
>> +                * is compatible with section mappings if this region has the
>> +                * appropriate size and physical alignment. (Sections are 2 MB
>> +                * on 4k granule kernels)
>> +                */
>> +               if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
>> +                       efi_virt_base = round_up(efi_virt_base, SZ_2M);
>> +
>> +               in->virt_addr = efi_virt_base + in->phys_addr - paddr;
>> +               efi_virt_base += size;
>> +
>> +               memcpy(out, in, desc_size);
>> +               out = (void *)out + desc_size;
>> +               ++*count;
>> +       }
>> +}
>
> This feels like this should live in arch/arm64, or under some other
> directory specific to arm/arm64. That said, I guess the fdt stuff is
> currently arm-specific anyway, so perhaps this is fine.
>
> [...]
>

I can move this function (and get_memory_map) to arm-stub.c instead.

>> @@ -248,12 +337,52 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
>>                 }
>>         }
>>
>> +       /*
>> +        * Update the memory map with virtual addresses. The function will also
>> +        * populate the spare second half of the memory_map allocation with
>> +        * copies of just the EFI_MEMORY_RUNTIME entries so that we can pass it
>> +        * straight into SetVirtualAddressMap()
>> +        */
>> +       update_memory_map(memory_map, map_size, desc_size,
>> +                         &runtime_entry_count);
>> +
>> +       pr_efi(sys_table,
>> +              "Exiting boot services and installing virtual address map...\n");
>
> I believe that the memory map is allowed to change as a result of this
> call, so I think this needs to be moved before update_memory_map.
>

Yes, you mentioned that before but I forgot about it.
I will fix that as well.

-- 
Ard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
       [not found]         ` <20150105165402.GE3163-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org>
  2015-01-06 18:01           ` Leif Lindholm
@ 2015-01-07 18:15           ` Ard Biesheuvel
  1 sibling, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2015-01-07 18:15 UTC (permalink / raw)
  To: Matt Fleming
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Roy Franz,
	Mark Rutland, Catalin Marinas, Will Deacon, Matt Fleming,
	Borislav Petkov, Dave Young, Geoff Levand, Mark Salter

On 5 January 2015 at 16:54, Matt Fleming <matt-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org> wrote:
> On Mon, 22 Dec, at 10:59:02AM, Ard Biesheuvel wrote:
>> In order to support kexec, the kernel needs to be able to deal with the
>> state of the UEFI firmware after SetVirtualAddressMap() has been called.
>> To avoid having separate code paths for non-kexec and kexec, let's move
>> the call to SetVirtualAddressMap() to the stub: this will guarantee us
>> that it will only be called once (since the stub is not executed during
>> kexec), and ensures that the UEFI state is identical between kexec and
>> normal boot.
>>
>> This implies that the layout of the virtual mapping needs to be created
>> by the stub as well. All regions are rounded up to a naturally aligned
>> multiple of 64 KB (for compatibility with 64k pages kernels) and recorded
>> in the UEFI memory map. The kernel proper reads those values and installs
>> the mappings in a dedicated set of page tables that are swapped in during
>> UEFI Runtime Services calls.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>> ---
>>  arch/arm64/include/asm/efi.h       |  20 +++-
>>  arch/arm64/kernel/efi.c            | 223 ++++++++++++++++++++-----------------
>>  arch/arm64/kernel/setup.c          |   1 +
>>  drivers/firmware/efi/libstub/fdt.c | 137 ++++++++++++++++++++++-
>>  4 files changed, 270 insertions(+), 111 deletions(-)
>
> [...]
>
>> @@ -45,5 +53,9 @@ extern void efi_idmap_init(void);
>>  #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__)
>>
>>  #define EFI_ALLOC_ALIGN              SZ_64K
>> +#define EFI_VIRTMAP          EFI_ARCH_1
>
> Any chance of documenting what EFI_VIRTMAP means for posterity and why
> you want to use one of the EFI arch config bits? How is this different
> from EFI_RUNTIME_SERVICES?
>

Will do. It means the virtmap page tables have been set up, based on
the virtual-to-physical mapping that the stub inserted into the memory
map.
This becomes a prereq for enabling the runtime services.

> Take a look at EFI_OLD_MEMMAP in arch/x86/include/asm/efi.h for a
> crackingly well documented example from Borislav.
>
> [...]
>
>> diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
>> index c846a9608cbd..76bc8abf41d1 100644
>> --- a/drivers/firmware/efi/libstub/fdt.c
>> +++ b/drivers/firmware/efi/libstub/fdt.c
>> @@ -167,6 +167,94 @@ fdt_set_fail:
>>  #define EFI_FDT_ALIGN EFI_PAGE_SIZE
>>  #endif
>>
>> +static efi_status_t get_memory_map(efi_system_table_t *sys_table_arg,
>> +                                efi_memory_desc_t **map,
>> +                                unsigned long *map_size,
>> +                                unsigned long *desc_size,
>> +                                u32 *desc_ver, unsigned long *key_ptr)
>> +{
>> +     efi_status_t status;
>> +
>> +     /*
>> +      * Call get_memory_map() with 0 size to retrieve the size of the
>> +      * required allocation.
>> +      */
>> +     *map_size = 0;
>> +     status = efi_call_early(get_memory_map, map_size, NULL,
>> +                             key_ptr, desc_size, desc_ver);
>> +     if (status != EFI_BUFFER_TOO_SMALL)
>> +             return EFI_LOAD_ERROR;
>> +
>> +     /*
>> +      * Add an additional efi_memory_desc_t to map_size because we're doing
>> +      * an allocation which may be in a new descriptor region. Then double it
>> +      * to give us some scratch space to prepare the input virtmap to give
>> +      * to SetVirtualAddressMap(). Note that this is EFI_LOADER_DATA memory,
>> +      * and the kernel memblock_reserve()'s only the size of the actual
>> +      * memory map, so the scratch space is freed again automatically.
>> +      */
>> +     *map_size += *desc_size;
>> +     status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
>> +                             *map_size * 2, (void **)map);
>> +     if (status != EFI_SUCCESS)
>> +             return status;
>> +
>> +     status = efi_call_early(get_memory_map, map_size, *map,
>> +                             key_ptr, desc_size, desc_ver);
>> +     if (status != EFI_SUCCESS)
>> +             efi_call_early(free_pool, *map);
>> +     return status;
>> +}
>
> We've now got two (slightly different) versions of this function. Is
> there any way we could make do with just one?
>

I have seen some ludicrously large memory maps on some of my x86
boxes, so I was reluctant to just double the size for all users. I am
happy to make this the default implementation, but I am not sure if
x86 frees the second half implicitly as arm does.

>> +/*
>> + * This is the base address at which to start allocating virtual memory ranges
>> + * for UEFI Runtime Services. This is a userland range so that we can use any
>> + * allocation we choose, and eliminate the risk of a conflict after kexec.
>> + */
>> +#define EFI_RT_VIRTUAL_BASE  0x40000000
>> +
>> +static void update_memory_map(efi_memory_desc_t *memory_map,
>> +                           unsigned long map_size, unsigned long desc_size,
>> +                           int *count)
>> +{
>> +     u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
>> +     efi_memory_desc_t *out = (void *)memory_map + map_size;
>> +     int l;
>> +
>> +     for (l = 0; l < map_size; l += desc_size) {
>> +             efi_memory_desc_t *in = (void *)memory_map + l;
>> +             u64 paddr, size;
>> +
>> +             if (!(in->attribute & EFI_MEMORY_RUNTIME))
>> +                     continue;
>> +
>> +             /*
>> +              * Make the mapping compatible with 64k pages: this allows
>> +              * a 4k page size kernel to kexec a 64k page size kernel and
>> +              * vice versa.
>> +              */
>> +             paddr = round_down(in->phys_addr, SZ_64K);
>> +             size = round_up(in->num_pages * EFI_PAGE_SIZE +
>> +                             in->phys_addr - paddr, SZ_64K);
>> +
>> +             /*
>> +              * Avoid wasting memory on PTEs by choosing a virtual base that
>> +              * is compatible with section mappings if this region has the
>> +              * appropriate size and physical alignment. (Sections are 2 MB
>> +              * on 4k granule kernels)
>> +              */
>> +             if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
>> +                     efi_virt_base = round_up(efi_virt_base, SZ_2M);
>> +
>> +             in->virt_addr = efi_virt_base + in->phys_addr - paddr;
>> +             efi_virt_base += size;
>> +
>> +             memcpy(out, in, desc_size);
>> +             out = (void *)out + desc_size;
>> +             ++*count;
>> +     }
>> +}
>> +
>
> fdt.c is starting to become a dumping ground for arm*-specific stuff :-(
>
> Is there no general solution for sharing code between arm and aarch64 in
> the kernel so we don't have to stick things like this in
> drivers/firmware/efi/?
>

I will put these functions in arm-stub.c instead.

-- 
Ard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
       [not found]             ` <20150106180120.GF3827-t77nlHhSwNqAroYi2ySoxKxOck334EZe@public.gmane.org>
@ 2015-01-12 10:43               ` Matt Fleming
  0 siblings, 0 replies; 30+ messages in thread
From: Matt Fleming @ 2015-01-12 10:43 UTC (permalink / raw)
  To: Leif Lindholm
  Cc: Ard Biesheuvel,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	dyoung-H+wXaHxf7aLQT0dZR+AlfA,
	geoff.levand-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA

On Tue, 06 Jan, at 06:01:20PM, Leif Lindholm wrote:
> On Mon, Jan 05, 2015 at 04:54:02PM +0000, Matt Fleming wrote:
> 
> I think all we really need above what efi_get_memory_map() provides is
> the scratch space. Would we care about temporarily wasting a little
> bit of EFI_LOADER_DATA on all platforms, or could we just swap the
> function body in efi-stub-helper.c for Ard's version above?
> 
> (I would guess memory maps with <= 32 entries are uncommon anyway, so
> the existing version would already make the bootservice call twice.)
 
I've no concerns about using the additional scratch space.

[...]

> Mmm, not optimal.
> That said, the only arm*-specific things about this particular
> function are the page sizes. Should this move to efi-stub-helper.c
> with EFI_RT_VIRTUAL_BASE moved to arch/<x>/include/asm/efi.h and
> joined by EFI_RT_VIRTUAL_BASE_ALIGN and EFI_RT_VIRTUAL_REGION_ALIGN?

Yeah, that would be an improvement. It's possible we could reuse some of
this code for x86's EFI virtual mapping code that Borislav wrote
(maybe).

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2015-01-12 10:43 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-22 10:58 [PATCH v4 0/8] stable UEFI virtual mappings for kexec Ard Biesheuvel
     [not found] ` <1419245944-2424-1-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2014-12-22 10:58   ` [PATCH v4 1/8] arm64/mm: add explicit struct_mm argument to __create_mapping() Ard Biesheuvel
2014-12-22 10:58   ` [PATCH v4 2/8] arm64/mm: add create_pgd_mapping() to create private page tables Ard Biesheuvel
2014-12-22 10:58   ` [PATCH v4 3/8] efi: split off remapping code from efi_config_init() Ard Biesheuvel
     [not found]     ` <1419245944-2424-4-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2015-01-05 14:55       ` Matt Fleming
2015-01-05 21:56       ` Borislav Petkov
2014-12-22 10:59   ` [PATCH v4 4/8] efi: efistub: allow allocation alignment larger than EFI_PAGE_SIZE Ard Biesheuvel
     [not found]     ` <1419245944-2424-5-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2014-12-23 16:45       ` Borislav Petkov
     [not found]         ` <20141223164549.GB3810-fF5Pk5pvG8Y@public.gmane.org>
2014-12-29  9:25           ` Ard Biesheuvel
     [not found]             ` <CAKv+Gu_+scCHYuXccgsvn9THsbP24OfyjvjN9XtEVe4nMdQtig-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-12-29  9:47               ` Borislav Petkov
     [not found]                 ` <20141229094732.GA16051-K5JNixvcfoxupOikMc4+xw@public.gmane.org>
2015-01-05 11:24                   ` Ard Biesheuvel
2014-12-22 10:59   ` [PATCH v4 5/8] arm64/efi: set EFI_ALLOC_ALIGN to 64 KB Ard Biesheuvel
2015-01-06 16:37     ` Leif Lindholm
2014-12-22 10:59   ` [PATCH v4 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub Ard Biesheuvel
     [not found]     ` <1419245944-2424-7-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2015-01-05 16:20       ` Mark Rutland
2015-01-06 17:13         ` Leif Lindholm
2015-01-07 18:05         ` Ard Biesheuvel
2015-01-05 16:54       ` Matt Fleming
     [not found]         ` <20150105165402.GE3163-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org>
2015-01-06 18:01           ` Leif Lindholm
     [not found]             ` <20150106180120.GF3827-t77nlHhSwNqAroYi2ySoxKxOck334EZe@public.gmane.org>
2015-01-12 10:43               ` Matt Fleming
2015-01-07 18:15           ` Ard Biesheuvel
2015-01-07 12:06       ` Leif Lindholm
     [not found]         ` <20150107120608.GJ3827-t77nlHhSwNqAroYi2ySoxKxOck334EZe@public.gmane.org>
2015-01-07 12:16           ` Ard Biesheuvel
     [not found]             ` <CAKv+Gu_k0_anqvm24P8J4WaMrFLMxOhQETAfUb6xmzaJ+Z2vBQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-07 12:25               ` Leif Lindholm
     [not found]                 ` <20150107122515.GK3827-t77nlHhSwNqAroYi2ySoxKxOck334EZe@public.gmane.org>
2015-01-07 12:30                   ` Ard Biesheuvel
     [not found]                     ` <CAKv+Gu-aRkoRj5zVXrsBc0APCehZ9W926J6fxH380K9EURXe_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-07 12:56                       ` Leif Lindholm
2014-12-22 10:59   ` [PATCH v4 7/8] arm64/efi: remove free_boot_services() and friends Ard Biesheuvel
     [not found]     ` <1419245944-2424-8-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2015-01-06 18:06       ` Leif Lindholm
2014-12-22 10:59   ` [PATCH v4 8/8] arm64/efi: remove idmap manipulations from UEFI code Ard Biesheuvel
     [not found]     ` <1419245944-2424-9-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2015-01-06 18:17       ` Leif Lindholm

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).