linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] EFI memmap and other fixes, v3
@ 2014-01-18 11:48 Borislav Petkov
  2014-01-18 11:48 ` [PATCH 1/5] x86, ptdump: Add the functionality to dump an arbitrary pagetable Borislav Petkov
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Borislav Petkov @ 2014-01-18 11:48 UTC (permalink / raw)
  To: Linux EFI
  Cc: LKML, Borislav Petkov, Matt Fleming, Matthew Garrett,
	H. Peter Anvin, Toshi Kani

From: Borislav Petkov <bp@suse.de>

Hi all,

this is the latest incarnation which should hopefully work just fine. It
tpasses esting at least on all our boxes so we have *some* coverage.

It took us a long time to debug the unmapping path and realize how
exactly we're doing the PGD sharing between the kernel and the EFI page
table. I've commented on this verbosely in 4/5 for future reference
because this stuff is not trivial.

We definitely should have this in mind when we do more changes on how we
handle the EFI page table and what we do with it, maybe even decouple it
completely from the kernel one. I dunno - it is all future stuff to pay
attention to.

Anyway, if anyone with an EFI box wants to give it a run, feel free to
do so.

As always, all comments and suggestions are appreciated.

Thanks.

Changelog:

* v2:

here's v2 rebased and rediffed against tip (which has the relevant
efi branches).

* v1:

this is the result of Toshi and me debugging a #GP on one of his big HP
boxes sporting UEFI. Each commit message should be self-explanatory so
please look there.

This has more or less an RFC nature thus I'm sending it now to collect
feedback. It is going to wait in the EFI queue anyway after the kexec
stuff gets sorted out first.

Borislav Petkov (5):
  x86, ptdump: Add the functionality to dump an arbitrary pagetable
  efi: Dump the EFI page table
  x86, pageattr: Export page unmapping interface
  efi: Make efi virtual runtime map passing more robust
  efi: Split efi_enter_virtual_mode

 arch/x86/Kconfig                     |   9 ++
 arch/x86/include/asm/efi.h           |   4 +-
 arch/x86/include/asm/pgtable.h       |   3 +-
 arch/x86/include/asm/pgtable_types.h |   2 +
 arch/x86/mm/dump_pagetables.c        |  84 +++++++++-----
 arch/x86/mm/pageattr.c               |  44 +++++---
 arch/x86/platform/efi/efi.c          | 210 ++++++++++++++++++++++++++---------
 arch/x86/platform/efi/efi_32.c       |   7 +-
 arch/x86/platform/efi/efi_64.c       |  41 ++++++-
 include/linux/efi.h                  |   1 +
 10 files changed, 304 insertions(+), 101 deletions(-)

-- 
1.8.5.2.192.g7794a68


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/5] x86, ptdump: Add the functionality to dump an arbitrary pagetable
  2014-01-18 11:48 [PATCH 0/5] EFI memmap and other fixes, v3 Borislav Petkov
@ 2014-01-18 11:48 ` Borislav Petkov
  2014-01-18 11:48 ` [PATCH 2/5] efi: Dump the EFI page table Borislav Petkov
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2014-01-18 11:48 UTC (permalink / raw)
  To: Linux EFI
  Cc: LKML, Borislav Petkov, Arjan van de Ven, Matt Fleming,
	Matthew Garrett, H. Peter Anvin, Toshi Kani

From: Borislav Petkov <bp@suse.de>

With reusing the ->trampoline_pgd page table for mapping EFI regions in
order to use them after having switched to EFI virtual mode, it is very
useful to be able to dump aforementioned page table in dmesg. This adds
that functionality through the walk_pgd_level() interface which can be
called from somewhere else.

The original functionality of dumping to debugfs remains untouched.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/include/asm/pgtable.h |  3 +-
 arch/x86/mm/dump_pagetables.c  | 84 +++++++++++++++++++++++++++---------------
 2 files changed, 56 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index bbc8b12fa443..b459ddf27d64 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -15,9 +15,10 @@
 	 : (prot))
 
 #ifndef __ASSEMBLY__
-
 #include <asm/x86_init.h>
 
+void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd);
+
 /*
  * ZERO_PAGE is a global shared page that is always zero: used
  * for zero-mapped memory areas etc..
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index 0002a3a33081..20621d753d5f 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -30,6 +30,7 @@ struct pg_state {
 	unsigned long start_address;
 	unsigned long current_address;
 	const struct addr_marker *marker;
+	bool to_dmesg;
 };
 
 struct addr_marker {
@@ -88,10 +89,28 @@ static struct addr_marker address_markers[] = {
 #define PUD_LEVEL_MULT (PTRS_PER_PMD * PMD_LEVEL_MULT)
 #define PGD_LEVEL_MULT (PTRS_PER_PUD * PUD_LEVEL_MULT)
 
+#define pt_dump_seq_printf(m, to_dmesg, fmt, args...)		\
+({								\
+	if (to_dmesg)					\
+		printk(KERN_INFO fmt, ##args);			\
+	else							\
+		if (m)						\
+			seq_printf(m, fmt, ##args);		\
+})
+
+#define pt_dump_cont_printf(m, to_dmesg, fmt, args...)		\
+({								\
+	if (to_dmesg)					\
+		printk(KERN_CONT fmt, ##args);			\
+	else							\
+		if (m)						\
+			seq_printf(m, fmt, ##args);		\
+})
+
 /*
  * Print a readable form of a pgprot_t to the seq_file
  */
-static void printk_prot(struct seq_file *m, pgprot_t prot, int level)
+static void printk_prot(struct seq_file *m, pgprot_t prot, int level, bool dmsg)
 {
 	pgprotval_t pr = pgprot_val(prot);
 	static const char * const level_name[] =
@@ -99,47 +118,47 @@ static void printk_prot(struct seq_file *m, pgprot_t prot, int level)
 
 	if (!pgprot_val(prot)) {
 		/* Not present */
-		seq_printf(m, "                          ");
+		pt_dump_cont_printf(m, dmsg, "                          ");
 	} else {
 		if (pr & _PAGE_USER)
-			seq_printf(m, "USR ");
+			pt_dump_cont_printf(m, dmsg, "USR ");
 		else
-			seq_printf(m, "    ");
+			pt_dump_cont_printf(m, dmsg, "    ");
 		if (pr & _PAGE_RW)
-			seq_printf(m, "RW ");
+			pt_dump_cont_printf(m, dmsg, "RW ");
 		else
-			seq_printf(m, "ro ");
+			pt_dump_cont_printf(m, dmsg, "ro ");
 		if (pr & _PAGE_PWT)
-			seq_printf(m, "PWT ");
+			pt_dump_cont_printf(m, dmsg, "PWT ");
 		else
-			seq_printf(m, "    ");
+			pt_dump_cont_printf(m, dmsg, "    ");
 		if (pr & _PAGE_PCD)
-			seq_printf(m, "PCD ");
+			pt_dump_cont_printf(m, dmsg, "PCD ");
 		else
-			seq_printf(m, "    ");
+			pt_dump_cont_printf(m, dmsg, "    ");
 
 		/* Bit 9 has a different meaning on level 3 vs 4 */
 		if (level <= 3) {
 			if (pr & _PAGE_PSE)
-				seq_printf(m, "PSE ");
+				pt_dump_cont_printf(m, dmsg, "PSE ");
 			else
-				seq_printf(m, "    ");
+				pt_dump_cont_printf(m, dmsg, "    ");
 		} else {
 			if (pr & _PAGE_PAT)
-				seq_printf(m, "pat ");
+				pt_dump_cont_printf(m, dmsg, "pat ");
 			else
-				seq_printf(m, "    ");
+				pt_dump_cont_printf(m, dmsg, "    ");
 		}
 		if (pr & _PAGE_GLOBAL)
-			seq_printf(m, "GLB ");
+			pt_dump_cont_printf(m, dmsg, "GLB ");
 		else
-			seq_printf(m, "    ");
+			pt_dump_cont_printf(m, dmsg, "    ");
 		if (pr & _PAGE_NX)
-			seq_printf(m, "NX ");
+			pt_dump_cont_printf(m, dmsg, "NX ");
 		else
-			seq_printf(m, "x  ");
+			pt_dump_cont_printf(m, dmsg, "x  ");
 	}
-	seq_printf(m, "%s\n", level_name[level]);
+	pt_dump_cont_printf(m, dmsg, "%s\n", level_name[level]);
 }
 
 /*
@@ -178,7 +197,8 @@ static void note_page(struct seq_file *m, struct pg_state *st,
 		st->current_prot = new_prot;
 		st->level = level;
 		st->marker = address_markers;
-		seq_printf(m, "---[ %s ]---\n", st->marker->name);
+		pt_dump_seq_printf(m, st->to_dmesg, "---[ %s ]---\n",
+				   st->marker->name);
 	} else if (prot != cur || level != st->level ||
 		   st->current_address >= st->marker[1].start_address) {
 		const char *unit = units;
@@ -188,17 +208,17 @@ static void note_page(struct seq_file *m, struct pg_state *st,
 		/*
 		 * Now print the actual finished series
 		 */
-		seq_printf(m, "0x%0*lx-0x%0*lx   ",
-			   width, st->start_address,
-			   width, st->current_address);
+		pt_dump_seq_printf(m, st->to_dmesg,  "0x%0*lx-0x%0*lx   ",
+				   width, st->start_address,
+				   width, st->current_address);
 
 		delta = (st->current_address - st->start_address) >> 10;
 		while (!(delta & 1023) && unit[1]) {
 			delta >>= 10;
 			unit++;
 		}
-		seq_printf(m, "%9lu%c ", delta, *unit);
-		printk_prot(m, st->current_prot, st->level);
+		pt_dump_cont_printf(m, st->to_dmesg, "%9lu%c ", delta, *unit);
+		printk_prot(m, st->current_prot, st->level, st->to_dmesg);
 
 		/*
 		 * We print markers for special areas of address space,
@@ -207,7 +227,8 @@ static void note_page(struct seq_file *m, struct pg_state *st,
 		 */
 		if (st->current_address >= st->marker[1].start_address) {
 			st->marker++;
-			seq_printf(m, "---[ %s ]---\n", st->marker->name);
+			pt_dump_seq_printf(m, st->to_dmesg, "---[ %s ]---\n",
+					   st->marker->name);
 		}
 
 		st->start_address = st->current_address;
@@ -296,7 +317,7 @@ static void walk_pud_level(struct seq_file *m, struct pg_state *st, pgd_t addr,
 #define pgd_none(a)  pud_none(__pud(pgd_val(a)))
 #endif
 
-static void walk_pgd_level(struct seq_file *m)
+void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd)
 {
 #ifdef CONFIG_X86_64
 	pgd_t *start = (pgd_t *) &init_level4_pgt;
@@ -304,9 +325,12 @@ static void walk_pgd_level(struct seq_file *m)
 	pgd_t *start = swapper_pg_dir;
 #endif
 	int i;
-	struct pg_state st;
+	struct pg_state st = {};
 
-	memset(&st, 0, sizeof(st));
+	if (pgd) {
+		start = pgd;
+		st.to_dmesg = true;
+	}
 
 	for (i = 0; i < PTRS_PER_PGD; i++) {
 		st.current_address = normalize_addr(i * PGD_LEVEL_MULT);
@@ -331,7 +355,7 @@ static void walk_pgd_level(struct seq_file *m)
 
 static int ptdump_show(struct seq_file *m, void *v)
 {
-	walk_pgd_level(m);
+	ptdump_walk_pgd_level(m, NULL);
 	return 0;
 }
 
-- 
1.8.5.2.192.g7794a68


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] efi: Dump the EFI page table
  2014-01-18 11:48 [PATCH 0/5] EFI memmap and other fixes, v3 Borislav Petkov
  2014-01-18 11:48 ` [PATCH 1/5] x86, ptdump: Add the functionality to dump an arbitrary pagetable Borislav Petkov
@ 2014-01-18 11:48 ` Borislav Petkov
  2014-01-20 13:38   ` Matt Fleming
  2014-01-18 11:48 ` [PATCH 3/5] x86, pageattr: Export page unmapping interface Borislav Petkov
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2014-01-18 11:48 UTC (permalink / raw)
  To: Linux EFI
  Cc: LKML, Borislav Petkov, Matt Fleming, Matthew Garrett,
	H. Peter Anvin, Toshi Kani

From: Borislav Petkov <bp@suse.de>

This is very useful for debugging issues with the recently added
pagetable switching code for EFI virtual mode.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/Kconfig               | 9 +++++++++
 arch/x86/include/asm/efi.h     | 1 +
 arch/x86/platform/efi/efi.c    | 1 +
 arch/x86/platform/efi/efi_32.c | 1 +
 arch/x86/platform/efi/efi_64.c | 9 +++++++++
 5 files changed, 21 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 02241d8dcab6..ba0d86cca926 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1614,6 +1614,15 @@ config EFI_STUB
 
 	  See Documentation/efi-stub.txt for more information.
 
+config EFI_PGT_DUMP
+	bool "Dump the EFI pagetable"
+	depends on EFI && X86_PTDUMP
+	---help---
+	  Enable this if you want to dump the EFI page table before
+	  enabling virtual mode. This can be used to debug miscellaneous
+	  issues with the mapping of the EFI runtime regions into that
+	  table.
+
 config SECCOMP
 	def_bool y
 	prompt "Enable seccomp to safely compute untrusted bytecode"
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 3b978c472d08..dfc319e7a7b8 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -132,6 +132,7 @@ extern void __init efi_map_region_fixed(efi_memory_desc_t *md);
 extern void efi_sync_low_kernel_mappings(void);
 extern void efi_setup_page_tables(void);
 extern void __init old_map_region(efi_memory_desc_t *md);
+extern void __init efi_dump_pagetable(void);
 
 struct efi_setup_data {
 	u64 fw_vendor;
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index d62ec87a2b26..35450ab3a123 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -1033,6 +1033,7 @@ void __init efi_enter_virtual_mode(void)
 
 	efi_setup_page_tables();
 	efi_sync_low_kernel_mappings();
+	efi_dump_pagetable();
 
 	if (!efi_setup) {
 		status = phys_efi_set_virtual_address_map(
diff --git a/arch/x86/platform/efi/efi_32.c b/arch/x86/platform/efi/efi_32.c
index 249b183cf417..b9b827cbfecc 100644
--- a/arch/x86/platform/efi/efi_32.c
+++ b/arch/x86/platform/efi/efi_32.c
@@ -41,6 +41,7 @@ static unsigned long efi_rt_eflags;
 
 void efi_sync_low_kernel_mappings(void) {}
 void efi_setup_page_tables(void) {}
+void __init efi_dump_pagetable(void) {}
 
 void __init efi_map_region(efi_memory_desc_t *md)
 {
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 6284f158a47d..76fce828c31d 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -233,3 +233,12 @@ void __init parse_efi_setup(u64 phys_addr, u32 data_len)
 {
 	efi_setup = phys_addr + sizeof(struct setup_data);
 }
+
+void __init efi_dump_pagetable(void)
+{
+#ifdef CONFIG_EFI_PGT_DUMP
+	pgd_t *pgd = (pgd_t *)__va(real_mode_header->trampoline_pgd);
+
+	ptdump_walk_pgd_level(NULL, pgd);
+#endif
+}
-- 
1.8.5.2.192.g7794a68


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] x86, pageattr: Export page unmapping interface
  2014-01-18 11:48 [PATCH 0/5] EFI memmap and other fixes, v3 Borislav Petkov
  2014-01-18 11:48 ` [PATCH 1/5] x86, ptdump: Add the functionality to dump an arbitrary pagetable Borislav Petkov
  2014-01-18 11:48 ` [PATCH 2/5] efi: Dump the EFI page table Borislav Petkov
@ 2014-01-18 11:48 ` Borislav Petkov
  2014-01-18 11:48 ` [PATCH 4/5] efi: Make efi virtual runtime map passing more robust Borislav Petkov
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2014-01-18 11:48 UTC (permalink / raw)
  To: Linux EFI
  Cc: LKML, Borislav Petkov, Matt Fleming, Matthew Garrett,
	H. Peter Anvin, Toshi Kani

From: Borislav Petkov <bp@suse.de>

We will use it in efi so expose it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/include/asm/pgtable_types.h |  2 ++
 arch/x86/mm/pageattr.c               | 44 +++++++++++++++++++++++++-----------
 2 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index a83aa44bb1fb..765a4f52d6cd 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -384,6 +384,8 @@ extern pte_t *lookup_address(unsigned long address, unsigned int *level);
 extern phys_addr_t slow_virt_to_phys(void *__address);
 extern int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
 				   unsigned numpages, unsigned long page_flags);
+void kernel_unmap_pages_in_pgd(pgd_t *root, unsigned long address,
+			       unsigned numpages);
 #endif	/* !__ASSEMBLY__ */
 
 #endif /* _ASM_X86_PGTABLE_DEFS_H */
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index b3b19f46c016..a3488689e301 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -692,6 +692,18 @@ static bool try_to_free_pmd_page(pmd_t *pmd)
 	return true;
 }
 
+static bool try_to_free_pud_page(pud_t *pud)
+{
+	int i;
+
+	for (i = 0; i < PTRS_PER_PUD; i++)
+		if (!pud_none(pud[i]))
+			return false;
+
+	free_page((unsigned long)pud);
+	return true;
+}
+
 static bool unmap_pte_range(pmd_t *pmd, unsigned long start, unsigned long end)
 {
 	pte_t *pte = pte_offset_kernel(pmd, start);
@@ -805,6 +817,16 @@ static void unmap_pud_range(pgd_t *pgd, unsigned long start, unsigned long end)
 	 */
 }
 
+static void unmap_pgd_range(pgd_t *root, unsigned long addr, unsigned long end)
+{
+	pgd_t *pgd_entry = root + pgd_index(addr);
+
+	unmap_pud_range(pgd_entry, addr, end);
+
+	if (try_to_free_pud_page((pud_t *)pgd_page_vaddr(*pgd_entry)))
+		pgd_clear(pgd_entry);
+}
+
 static int alloc_pte_page(pmd_t *pmd)
 {
 	pte_t *pte = (pte_t *)get_zeroed_page(GFP_KERNEL | __GFP_NOTRACK);
@@ -999,9 +1021,8 @@ static int populate_pud(struct cpa_data *cpa, unsigned long start, pgd_t *pgd,
 static int populate_pgd(struct cpa_data *cpa, unsigned long addr)
 {
 	pgprot_t pgprot = __pgprot(_KERNPG_TABLE);
-	bool allocd_pgd = false;
-	pgd_t *pgd_entry;
 	pud_t *pud = NULL;	/* shut up gcc */
+	pgd_t *pgd_entry;
 	int ret;
 
 	pgd_entry = cpa->pgd + pgd_index(addr);
@@ -1015,7 +1036,6 @@ static int populate_pgd(struct cpa_data *cpa, unsigned long addr)
 			return -1;
 
 		set_pgd(pgd_entry, __pgd(__pa(pud) | _KERNPG_TABLE));
-		allocd_pgd = true;
 	}
 
 	pgprot_val(pgprot) &= ~pgprot_val(cpa->mask_clr);
@@ -1023,19 +1043,11 @@ static int populate_pgd(struct cpa_data *cpa, unsigned long addr)
 
 	ret = populate_pud(cpa, addr, pgd_entry, pgprot);
 	if (ret < 0) {
-		unmap_pud_range(pgd_entry, addr,
+		unmap_pgd_range(cpa->pgd, addr,
 				addr + (cpa->numpages << PAGE_SHIFT));
-
-		if (allocd_pgd) {
-			/*
-			 * If I allocated this PUD page, I can just as well
-			 * free it in this error path.
-			 */
-			pgd_clear(pgd_entry);
-			free_page((unsigned long)pud);
-		}
 		return ret;
 	}
+
 	cpa->numpages = ret;
 	return 0;
 }
@@ -1861,6 +1873,12 @@ out:
 	return retval;
 }
 
+void kernel_unmap_pages_in_pgd(pgd_t *root, unsigned long address,
+			       unsigned numpages)
+{
+	unmap_pgd_range(root, address, address + (numpages << PAGE_SHIFT));
+}
+
 /*
  * The testcases use internal knowledge of the implementation that shouldn't
  * be exposed to the rest of the kernel. Include these directly here.
-- 
1.8.5.2.192.g7794a68


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] efi: Make efi virtual runtime map passing more robust
  2014-01-18 11:48 [PATCH 0/5] EFI memmap and other fixes, v3 Borislav Petkov
                   ` (2 preceding siblings ...)
  2014-01-18 11:48 ` [PATCH 3/5] x86, pageattr: Export page unmapping interface Borislav Petkov
@ 2014-01-18 11:48 ` Borislav Petkov
  2014-01-18 11:48 ` [PATCH 5/5] efi: Split efi_enter_virtual_mode Borislav Petkov
  2014-01-23  9:06 ` [PATCH 0/5] EFI memmap and other fixes, v3 Matt Fleming
  5 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2014-01-18 11:48 UTC (permalink / raw)
  To: Linux EFI
  Cc: LKML, Borislav Petkov, Matt Fleming, Matthew Garrett,
	H. Peter Anvin, Toshi Kani

From: Borislav Petkov <bp@suse.de>

Currently, running SetVirtualAddressMap() and passing the physical
address of the virtual map array was working only by a lucky coincidence
because the memory was present in the EFI page table too. Until Toshi
went and booted this on a big HP box - the krealloc() manner of resizing
the memmap we're doing did allocate from such physical addresses which
were not mapped anymore and boom:

http://lkml.kernel.org/r/1386806463.1791.295.camel@misato.fc.hp.com

One way to take care of that issue is to reimplement the krealloc thing
but with pages. We start with contiguous pages of order 1, i.e. 2 pages,
and when we deplete that memory (shouldn't happen all that often but you
know firmware) we realloc the next power-of-two pages.

Having the pages, it is much more handy and easy to map them into the
EFI page table with the already existing mapping code which we're using
for building the virtual mappings.

Thanks to Toshi Kani and Matt for the great debugging help.

Reported-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/include/asm/efi.h     |  3 +-
 arch/x86/platform/efi/efi.c    | 99 +++++++++++++++++++++++++++++++++---------
 arch/x86/platform/efi/efi_32.c |  6 ++-
 arch/x86/platform/efi/efi_64.c | 32 ++++++++++++--
 4 files changed, 114 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index dfc319e7a7b8..b473f2541e67 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -130,7 +130,8 @@ extern void efi_memory_uc(u64 addr, unsigned long size);
 extern void __init efi_map_region(efi_memory_desc_t *md);
 extern void __init efi_map_region_fixed(efi_memory_desc_t *md);
 extern void efi_sync_low_kernel_mappings(void);
-extern void efi_setup_page_tables(void);
+extern int efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages);
+extern void efi_cleanup_page_tables(unsigned long pa_memmap, unsigned num_pages);
 extern void __init old_map_region(efi_memory_desc_t *md);
 extern void __init efi_dump_pagetable(void);
 
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 35450ab3a123..b79a85c8a6aa 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -938,14 +938,36 @@ static void __init efi_map_regions_fixed(void)
 
 }
 
+static void *realloc_pages(void *old_memmap, int old_shift)
+{
+	void *ret;
+
+	ret = (void *)__get_free_pages(GFP_KERNEL, old_shift + 1);
+	if (!ret)
+		goto out;
+
+	/*
+	 * A first-time allocation doesn't have anything to copy.
+	 */
+	if (!old_memmap)
+		return ret;
+
+	memcpy(ret, old_memmap, PAGE_SIZE << old_shift);
+
+out:
+	free_pages((unsigned long)old_memmap, old_shift);
+	return ret;
+}
+
 /*
- * Map efi memory ranges for runtime serivce and update new_memmap with virtual
- * addresses.
+ * Map the efi memory ranges of the runtime services and update new_mmap with
+ * virtual addresses.
  */
-static void * __init efi_map_regions(int *count)
+static void * __init efi_map_regions(int *count, int *pg_shift)
 {
+	void *p, *new_memmap = NULL;
+	unsigned long left = 0;
 	efi_memory_desc_t *md;
-	void *p, *tmp, *new_memmap = NULL;
 
 	for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
 		md = p;
@@ -960,20 +982,23 @@ static void * __init efi_map_regions(int *count)
 		efi_map_region(md);
 		get_systab_virt_addr(md);
 
-		tmp = krealloc(new_memmap, (*count + 1) * memmap.desc_size,
-			       GFP_KERNEL);
-		if (!tmp)
-			goto out;
-		new_memmap = tmp;
+		if (left < memmap.desc_size) {
+			new_memmap = realloc_pages(new_memmap, *pg_shift);
+			if (!new_memmap)
+				return NULL;
+
+			left += PAGE_SIZE << *pg_shift;
+			(*pg_shift)++;
+		}
+
 		memcpy(new_memmap + (*count * memmap.desc_size), md,
 		       memmap.desc_size);
+
+		left -= memmap.desc_size;
 		(*count)++;
 	}
 
 	return new_memmap;
-out:
-	kfree(new_memmap);
-	return NULL;
 }
 
 /*
@@ -999,9 +1024,9 @@ out:
  */
 void __init efi_enter_virtual_mode(void)
 {
-	efi_status_t status;
+	int err, count = 0, pg_shift = 0;
 	void *new_memmap = NULL;
-	int err, count = 0;
+	efi_status_t status;
 
 	efi.systab = NULL;
 
@@ -1018,20 +1043,24 @@ void __init efi_enter_virtual_mode(void)
 		efi_map_regions_fixed();
 	} else {
 		efi_merge_regions();
-		new_memmap = efi_map_regions(&count);
+		new_memmap = efi_map_regions(&count, &pg_shift);
 		if (!new_memmap) {
 			pr_err("Error reallocating memory, EFI runtime non-functional!\n");
 			return;
 		}
-	}
 
-	err = save_runtime_map();
-	if (err)
-		pr_err("Error saving runtime map, efi runtime on kexec non-functional!!\n");
+		err = save_runtime_map();
+		if (err)
+			pr_err("Error saving runtime map, efi runtime on kexec non-functional!!\n");
+	}
 
 	BUG_ON(!efi.systab);
 
-	efi_setup_page_tables();
+	if (!efi_setup) {
+		if (efi_setup_page_tables(__pa(new_memmap), 1 << pg_shift))
+			return;
+	}
+
 	efi_sync_low_kernel_mappings();
 	efi_dump_pagetable();
 
@@ -1073,7 +1102,35 @@ void __init efi_enter_virtual_mode(void)
 	if (efi_enabled(EFI_OLD_MEMMAP) && (__supported_pte_mask & _PAGE_NX))
 		runtime_code_page_mkexec();
 
-	kfree(new_memmap);
+
+	/*
+	 * We mapped the descriptor array into the EFI pagetable above but we're
+	 * not unmapping it here. Here's why:
+	 *
+	 * We're copying select PGDs from the kernel page table to the EFI page
+	 * table and when we do so and make changes to those PGDs like unmapping
+	 * stuff from them, those changes appear in the kernel page table and we
+	 * go boom.
+	 *
+	 * From setup_real_mode():
+	 *
+	 * ...
+	 * trampoline_pgd[0] = init_level4_pgt[pgd_index(__PAGE_OFFSET)].pgd;
+	 *
+	 * In this particular case, our allocation is in PGD 0 of the EFI page
+	 * table but we've copied that PGD from PGD[272] of the EFI page table:
+	 *
+	 *	pgd_index(__PAGE_OFFSET = 0xffff880000000000) = 272
+	 *
+	 * where the direct memory mapping in kernel space is.
+	 *
+	 * new_memmap's VA comes from that direct mapping and thus clearing it,
+	 * it would get cleared in the kernel page table too.
+	 *
+	 * efi_cleanup_page_tables(__pa(new_memmap), 1 << pg_shift);
+	 */
+	if (!efi_setup)
+		free_pages((unsigned long)new_memmap, pg_shift);
 
 	/* clean DUMMY object */
 	efi.set_variable(efi_dummy_name, &EFI_DUMMY_GUID,
diff --git a/arch/x86/platform/efi/efi_32.c b/arch/x86/platform/efi/efi_32.c
index b9b827cbfecc..7ad87ee09525 100644
--- a/arch/x86/platform/efi/efi_32.c
+++ b/arch/x86/platform/efi/efi_32.c
@@ -40,8 +40,12 @@
 static unsigned long efi_rt_eflags;
 
 void efi_sync_low_kernel_mappings(void) {}
-void efi_setup_page_tables(void) {}
 void __init efi_dump_pagetable(void) {}
+int efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
+{
+	return 0;
+}
+void efi_cleanup_page_tables(unsigned long pa_memmap, unsigned num_pages) {}
 
 void __init efi_map_region(efi_memory_desc_t *md)
 {
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 76fce828c31d..0998f3a536ff 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -137,12 +137,38 @@ void efi_sync_low_kernel_mappings(void)
 		sizeof(pgd_t) * num_pgds);
 }
 
-void efi_setup_page_tables(void)
+int efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 {
+	pgd_t *pgd;
+
+	if (efi_enabled(EFI_OLD_MEMMAP))
+		return 0;
+
 	efi_scratch.efi_pgt = (pgd_t *)(unsigned long)real_mode_header->trampoline_pgd;
+	pgd = __va(efi_scratch.efi_pgt);
 
-	if (!efi_enabled(EFI_OLD_MEMMAP))
-		efi_scratch.use_pgd = true;
+	/*
+	 * It can happen that the physical address of new_memmap lands in memory
+	 * which is not mapped in the EFI page table. Therefore we need to go
+	 * and ident-map those pages containing the map before calling
+	 * phys_efi_set_virtual_address_map().
+	 */
+	if (kernel_map_pages_in_pgd(pgd, pa_memmap, pa_memmap, num_pages, _PAGE_NX)) {
+		pr_err("Error ident-mapping new memmap (0x%lx)!\n", pa_memmap);
+		return 1;
+	}
+
+	efi_scratch.use_pgd = true;
+
+
+	return 0;
+}
+
+void efi_cleanup_page_tables(unsigned long pa_memmap, unsigned num_pages)
+{
+	pgd_t *pgd = (pgd_t *)__va(real_mode_header->trampoline_pgd);
+
+	kernel_unmap_pages_in_pgd(pgd, pa_memmap, num_pages);
 }
 
 static void __init __map_region(efi_memory_desc_t *md, u64 va)
-- 
1.8.5.2.192.g7794a68


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] efi: Split efi_enter_virtual_mode
  2014-01-18 11:48 [PATCH 0/5] EFI memmap and other fixes, v3 Borislav Petkov
                   ` (3 preceding siblings ...)
  2014-01-18 11:48 ` [PATCH 4/5] efi: Make efi virtual runtime map passing more robust Borislav Petkov
@ 2014-01-18 11:48 ` Borislav Petkov
  2014-01-20 13:44   ` Matt Fleming
  2014-01-23  9:06 ` [PATCH 0/5] EFI memmap and other fixes, v3 Matt Fleming
  5 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2014-01-18 11:48 UTC (permalink / raw)
  To: Linux EFI
  Cc: LKML, Borislav Petkov, Matt Fleming, Matthew Garrett,
	H. Peter Anvin, Toshi Kani

From: Borislav Petkov <bp@suse.de>

... into a kexec flavor for better code readability and simplicity. The
original one was getting ugly with ifdeffery.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/platform/efi/efi.c | 148 +++++++++++++++++++++++++++++---------------
 include/linux/efi.h         |   1 +
 2 files changed, 99 insertions(+), 50 deletions(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index b79a85c8a6aa..bcc3e5117fd7 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -891,8 +891,9 @@ static void __init get_systab_virt_addr(efi_memory_desc_t *md)
 	}
 }
 
-static int __init save_runtime_map(void)
+static void __init save_runtime_map(void)
 {
+#ifdef CONFIG_KEXEC
 	efi_memory_desc_t *md;
 	void *tmp, *p, *q = NULL;
 	int count = 0;
@@ -914,28 +915,12 @@ static int __init save_runtime_map(void)
 	}
 
 	efi_runtime_map_setup(q, count, memmap.desc_size);
+	return;
 
-	return 0;
 out:
 	kfree(q);
-	return -ENOMEM;
-}
-
-/*
- * Map efi regions which were passed via setup_data. The virt_addr is a fixed
- * addr which was used in first kernel of a kexec boot.
- */
-static void __init efi_map_regions_fixed(void)
-{
-	void *p;
-	efi_memory_desc_t *md;
-
-	for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
-		md = p;
-		efi_map_region_fixed(md); /* FIXME: add error handling */
-		get_systab_virt_addr(md);
-	}
-
+	pr_err("Error saving runtime map, efi runtime on kexec non-functional!!\n");
+#endif
 }
 
 static void *realloc_pages(void *old_memmap, int old_shift)
@@ -1001,6 +986,72 @@ static void * __init efi_map_regions(int *count, int *pg_shift)
 	return new_memmap;
 }
 
+void __init kexec_enter_virtual_mode(void)
+{
+#ifdef CONFIG_KEXEC
+	efi_memory_desc_t *md;
+	void *p;
+
+	efi.systab = NULL;
+
+	/*
+	 * We don't do virtual mode, since we don't do runtime services, on
+	 * non-native EFI
+	 */
+	if (!efi_is_native()) {
+		efi_unmap_memmap();
+		return;
+	}
+
+	/*
+	* Map efi regions which were passed via setup_data. The virt_addr is a
+	* fixed addr which was used in first kernel of a kexec boot.
+	*/
+	for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
+		md = p;
+		efi_map_region_fixed(md); /* FIXME: add error handling */
+		get_systab_virt_addr(md);
+	}
+
+	save_runtime_map();
+
+	BUG_ON(!efi.systab);
+
+	efi_sync_low_kernel_mappings();
+
+	/*
+	 * Now that EFI is in virtual mode, update the function
+	 * pointers in the runtime service table to the new virtual addresses.
+	 *
+	 * Call EFI services through wrapper functions.
+	 */
+	efi.runtime_version = efi_systab.hdr.revision;
+	efi.get_time = virt_efi_get_time;
+	efi.set_time = virt_efi_set_time;
+	efi.get_wakeup_time = virt_efi_get_wakeup_time;
+	efi.set_wakeup_time = virt_efi_set_wakeup_time;
+	efi.get_variable = virt_efi_get_variable;
+	efi.get_next_variable = virt_efi_get_next_variable;
+	efi.set_variable = virt_efi_set_variable;
+	efi.get_next_high_mono_count = virt_efi_get_next_high_mono_count;
+	efi.reset_system = virt_efi_reset_system;
+	efi.set_virtual_address_map = NULL;
+	efi.query_variable_info = virt_efi_query_variable_info;
+	efi.update_capsule = virt_efi_update_capsule;
+	efi.query_capsule_caps = virt_efi_query_capsule_caps;
+
+	if (efi_enabled(EFI_OLD_MEMMAP) && (__supported_pte_mask & _PAGE_NX))
+		runtime_code_page_mkexec();
+
+	/* clean DUMMY object */
+	efi.set_variable(efi_dummy_name, &EFI_DUMMY_GUID,
+			 EFI_VARIABLE_NON_VOLATILE |
+			 EFI_VARIABLE_BOOTSERVICE_ACCESS |
+			 EFI_VARIABLE_RUNTIME_ACCESS,
+			 0, NULL);
+#endif
+}
+
 /*
  * This function will switch the EFI runtime services to virtual mode.
  * Essentially, we look through the EFI memmap and map every region that
@@ -1020,11 +1071,12 @@ static void * __init efi_map_regions(int *count, int *pg_shift)
  *
  * Specially for kexec boot, efi runtime maps in previous kernel should
  * be passed in via setup_data. In that case runtime ranges will be mapped
- * to the same virtual addresses as the first kernel.
+ * to the same virtual addresses as the first kernel, see
+ * kexec_enter_virtual_mode().
  */
-void __init efi_enter_virtual_mode(void)
+static void __init __efi_enter_virtual_mode(void)
 {
-	int err, count = 0, pg_shift = 0;
+	int count = 0, pg_shift = 0;
 	void *new_memmap = NULL;
 	efi_status_t status;
 
@@ -1039,43 +1091,33 @@ void __init efi_enter_virtual_mode(void)
 		return;
 	}
 
-	if (efi_setup) {
-		efi_map_regions_fixed();
-	} else {
-		efi_merge_regions();
-		new_memmap = efi_map_regions(&count, &pg_shift);
-		if (!new_memmap) {
-			pr_err("Error reallocating memory, EFI runtime non-functional!\n");
-			return;
-		}
-
-		err = save_runtime_map();
-		if (err)
-			pr_err("Error saving runtime map, efi runtime on kexec non-functional!!\n");
+	efi_merge_regions();
+	new_memmap = efi_map_regions(&count, &pg_shift);
+	if (!new_memmap) {
+		pr_err("Error reallocating memory, EFI runtime non-functional!\n");
+		return;
 	}
 
+	save_runtime_map();
+
 	BUG_ON(!efi.systab);
 
-	if (!efi_setup) {
-		if (efi_setup_page_tables(__pa(new_memmap), 1 << pg_shift))
-			return;
-	}
+	if (efi_setup_page_tables(__pa(new_memmap), 1 << pg_shift))
+		return;
 
 	efi_sync_low_kernel_mappings();
 	efi_dump_pagetable();
 
-	if (!efi_setup) {
-		status = phys_efi_set_virtual_address_map(
+	status = phys_efi_set_virtual_address_map(
 			memmap.desc_size * count,
 			memmap.desc_size,
 			memmap.desc_version,
 			(efi_memory_desc_t *)__pa(new_memmap));
 
-		if (status != EFI_SUCCESS) {
-			pr_alert("Unable to switch EFI into virtual mode (status=%lx)!\n",
-				 status);
-			panic("EFI call to SetVirtualAddressMap() failed!");
-		}
+	if (status != EFI_SUCCESS) {
+		pr_alert("Unable to switch EFI into virtual mode (status=%lx)!\n",
+			 status);
+		panic("EFI call to SetVirtualAddressMap() failed!");
 	}
 
 	/*
@@ -1102,7 +1144,6 @@ void __init efi_enter_virtual_mode(void)
 	if (efi_enabled(EFI_OLD_MEMMAP) && (__supported_pte_mask & _PAGE_NX))
 		runtime_code_page_mkexec();
 
-
 	/*
 	 * We mapped the descriptor array into the EFI pagetable above but we're
 	 * not unmapping it here. Here's why:
@@ -1129,8 +1170,7 @@ void __init efi_enter_virtual_mode(void)
 	 *
 	 * efi_cleanup_page_tables(__pa(new_memmap), 1 << pg_shift);
 	 */
-	if (!efi_setup)
-		free_pages((unsigned long)new_memmap, pg_shift);
+	free_pages((unsigned long)new_memmap, pg_shift);
 
 	/* clean DUMMY object */
 	efi.set_variable(efi_dummy_name, &EFI_DUMMY_GUID,
@@ -1140,6 +1180,14 @@ void __init efi_enter_virtual_mode(void)
 			 0, NULL);
 }
 
+void __init efi_enter_virtual_mode(void)
+{
+	if (efi_setup)
+		kexec_enter_virtual_mode();
+	else
+		__efi_enter_virtual_mode();
+}
+
 /*
  * Convenience functions to obtain memory types and attributes
  */
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 0a819e7a60c9..44ea66ce41e0 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -594,6 +594,7 @@ extern void efi_map_pal_code (void);
 extern void efi_memmap_walk (efi_freemem_callback_t callback, void *arg);
 extern void efi_gettimeofday (struct timespec *ts);
 extern void efi_enter_virtual_mode (void);	/* switch EFI to virtual mode, if possible */
+extern void kexec_enter_virtual_mode(void);
 #ifdef CONFIG_X86
 extern void efi_late_init(void);
 extern void efi_free_boot_services(void);
-- 
1.8.5.2.192.g7794a68


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] efi: Dump the EFI page table
  2014-01-18 11:48 ` [PATCH 2/5] efi: Dump the EFI page table Borislav Petkov
@ 2014-01-20 13:38   ` Matt Fleming
  2014-01-20 13:41     ` Borislav Petkov
  0 siblings, 1 reply; 15+ messages in thread
From: Matt Fleming @ 2014-01-20 13:38 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux EFI, LKML, Borislav Petkov, Matthew Garrett,
	H. Peter Anvin, Toshi Kani

On Sat, 18 Jan, at 12:48:15PM, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> This is very useful for debugging issues with the recently added
> pagetable switching code for EFI virtual mode.
> 
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Tested-by: Toshi Kani <toshi.kani@hp.com>
> ---
>  arch/x86/Kconfig               | 9 +++++++++
>  arch/x86/include/asm/efi.h     | 1 +
>  arch/x86/platform/efi/efi.c    | 1 +
>  arch/x86/platform/efi/efi_32.c | 1 +
>  arch/x86/platform/efi/efi_64.c | 9 +++++++++
>  5 files changed, 21 insertions(+)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 02241d8dcab6..ba0d86cca926 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1614,6 +1614,15 @@ config EFI_STUB
>  
>  	  See Documentation/efi-stub.txt for more information.
>  
> +config EFI_PGT_DUMP
> +	bool "Dump the EFI pagetable"
> +	depends on EFI && X86_PTDUMP
> +	---help---
> +	  Enable this if you want to dump the EFI page table before
> +	  enabling virtual mode. This can be used to debug miscellaneous
> +	  issues with the mapping of the EFI runtime regions into that
> +	  table.
> +

Would this be more suitable in arch/x86/Kconfig.debug?

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] efi: Dump the EFI page table
  2014-01-20 13:38   ` Matt Fleming
@ 2014-01-20 13:41     ` Borislav Petkov
  0 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2014-01-20 13:41 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Linux EFI, LKML, Borislav Petkov, Matthew Garrett,
	H. Peter Anvin, Toshi Kani

On Mon, Jan 20, 2014 at 01:38:02PM +0000, Matt Fleming wrote:
> > +config EFI_PGT_DUMP
> > +	bool "Dump the EFI pagetable"
> > +	depends on EFI && X86_PTDUMP
> > +	---help---
> > +	  Enable this if you want to dump the EFI page table before
> > +	  enabling virtual mode. This can be used to debug miscellaneous
> > +	  issues with the mapping of the EFI runtime regions into that
> > +	  table.
> > +
> 
> Would this be more suitable in arch/x86/Kconfig.debug?

I wanted to keep it close to CONFIG_EFI if we decide to make EFI into a
menu some day but I don't have a strong preference.

Just move it when applying.

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] efi: Split efi_enter_virtual_mode
  2014-01-18 11:48 ` [PATCH 5/5] efi: Split efi_enter_virtual_mode Borislav Petkov
@ 2014-01-20 13:44   ` Matt Fleming
  2014-01-20 13:46     ` Borislav Petkov
  0 siblings, 1 reply; 15+ messages in thread
From: Matt Fleming @ 2014-01-20 13:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux EFI, LKML, Borislav Petkov, Matthew Garrett,
	H. Peter Anvin, Toshi Kani

On Sat, 18 Jan, at 12:48:18PM, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> ... into a kexec flavor for better code readability and simplicity. The
> original one was getting ugly with ifdeffery.
> 
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Tested-by: Toshi Kani <toshi.kani@hp.com>
> ---
>  arch/x86/platform/efi/efi.c | 148 +++++++++++++++++++++++++++++---------------
>  include/linux/efi.h         |   1 +
>  2 files changed, 99 insertions(+), 50 deletions(-)

[...]

> @@ -1001,6 +986,72 @@ static void * __init efi_map_regions(int *count, int *pg_shift)
>  	return new_memmap;
>  }
>  
> +void __init kexec_enter_virtual_mode(void)
> +{

Could this be static for now?

> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index 0a819e7a60c9..44ea66ce41e0 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -594,6 +594,7 @@ extern void efi_map_pal_code (void);
>  extern void efi_memmap_walk (efi_freemem_callback_t callback, void *arg);
>  extern void efi_gettimeofday (struct timespec *ts);
>  extern void efi_enter_virtual_mode (void);	/* switch EFI to virtual mode, if possible */
> +extern void kexec_enter_virtual_mode(void);
>  #ifdef CONFIG_X86
>  extern void efi_late_init(void);
>  extern void efi_free_boot_services(void);

Which would allow us to drop this change.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] efi: Split efi_enter_virtual_mode
  2014-01-20 13:44   ` Matt Fleming
@ 2014-01-20 13:46     ` Borislav Petkov
  2014-01-20 14:48       ` Matt Fleming
  0 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2014-01-20 13:46 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Linux EFI, LKML, Borislav Petkov, Matthew Garrett,
	H. Peter Anvin, Toshi Kani

On Mon, Jan 20, 2014 at 01:44:07PM +0000, Matt Fleming wrote:
> > +void __init kexec_enter_virtual_mode(void)
> > +{
> 
> Could this be static for now?

Didn't you wanna do arch/x86/platform/efi/kexec.c anyway?

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] efi: Split efi_enter_virtual_mode
  2014-01-20 13:46     ` Borislav Petkov
@ 2014-01-20 14:48       ` Matt Fleming
  0 siblings, 0 replies; 15+ messages in thread
From: Matt Fleming @ 2014-01-20 14:48 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux EFI, LKML, Borislav Petkov, Matthew Garrett,
	H. Peter Anvin, Toshi Kani

On Mon, 20 Jan, at 02:46:25PM, Borislav Petkov wrote:
> On Mon, Jan 20, 2014 at 01:44:07PM +0000, Matt Fleming wrote:
> > > +void __init kexec_enter_virtual_mode(void)
> > > +{
> > 
> > Could this be static for now?
> 
> Didn't you wanna do arch/x86/platform/efi/kexec.c anyway?

Yes, but the patches are easier to review if we make
kexec_enter_virtual_mode() global when moving it into a new file.

Plus it makes this patch that little bit smaller because the
modification to efi.h is unnecessary.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/5] EFI memmap and other fixes, v3
  2014-01-18 11:48 [PATCH 0/5] EFI memmap and other fixes, v3 Borislav Petkov
                   ` (4 preceding siblings ...)
  2014-01-18 11:48 ` [PATCH 5/5] efi: Split efi_enter_virtual_mode Borislav Petkov
@ 2014-01-23  9:06 ` Matt Fleming
  2014-01-23 16:30   ` Toshi Kani
  5 siblings, 1 reply; 15+ messages in thread
From: Matt Fleming @ 2014-01-23  9:06 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux EFI, LKML, Borislav Petkov, Matthew Garrett,
	H. Peter Anvin, Toshi Kani

On Sat, 18 Jan, at 12:48:13PM, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> Hi all,
> 
> this is the latest incarnation which should hopefully work just fine. It
> tpasses esting at least on all our boxes so we have *some* coverage.
> 
> It took us a long time to debug the unmapping path and realize how
> exactly we're doing the PGD sharing between the kernel and the EFI page
> table. I've commented on this verbosely in 4/5 for future reference
> because this stuff is not trivial.
> 
> We definitely should have this in mind when we do more changes on how we
> handle the EFI page table and what we do with it, maybe even decouple it
> completely from the kernel one. I dunno - it is all future stuff to pay
> attention to.
> 
> Anyway, if anyone with an EFI box wants to give it a run, feel free to
> do so.
> 
> As always, all comments and suggestions are appreciated.

Thanks Borislav. I picked these up and applied them to my 'next' branch.

You'll notice that I haven't stuck them in my 'urgent' branch, so these
won't be sent to Linus before v3.15 despite the fact that they're bug
fixes. I think Toshi's box is special enough that most people shouldn't
hit this issue, and since these patches involve rewriting the way we do
SetVirtualAddressMap() I'm happy for people to hammer on the stuff that
did make it into the merge window without this added complication.

Having said that, if people think it's worth the effort I don't mind
setting up a new branch containing this series that can be sent to Linus
much sooner.

Thoughts?

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/5] EFI memmap and other fixes, v3
  2014-01-23  9:06 ` [PATCH 0/5] EFI memmap and other fixes, v3 Matt Fleming
@ 2014-01-23 16:30   ` Toshi Kani
  2014-01-23 16:51     ` Borislav Petkov
  0 siblings, 1 reply; 15+ messages in thread
From: Toshi Kani @ 2014-01-23 16:30 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Borislav Petkov, Linux EFI, LKML, Borislav Petkov,
	Matthew Garrett, H. Peter Anvin

On Thu, 2014-01-23 at 09:06 +0000, Matt Fleming wrote:
> On Sat, 18 Jan, at 12:48:13PM, Borislav Petkov wrote:
> > From: Borislav Petkov <bp@suse.de>
> > 
> > Hi all,
> > 
> > this is the latest incarnation which should hopefully work just fine. It
> > tpasses esting at least on all our boxes so we have *some* coverage.
> > 
> > It took us a long time to debug the unmapping path and realize how
> > exactly we're doing the PGD sharing between the kernel and the EFI page
> > table. I've commented on this verbosely in 4/5 for future reference
> > because this stuff is not trivial.
> > 
> > We definitely should have this in mind when we do more changes on how we
> > handle the EFI page table and what we do with it, maybe even decouple it
> > completely from the kernel one. I dunno - it is all future stuff to pay
> > attention to.
> > 
> > Anyway, if anyone with an EFI box wants to give it a run, feel free to
> > do so.
> > 
> > As always, all comments and suggestions are appreciated.
> 
> Thanks Borislav. I picked these up and applied them to my 'next' branch.
> 
> You'll notice that I haven't stuck them in my 'urgent' branch, so these
> won't be sent to Linus before v3.15 despite the fact that they're bug
> fixes. I think Toshi's box is special enough that most people shouldn't
> hit this issue, and since these patches involve rewriting the way we do
> SetVirtualAddressMap() I'm happy for people to hammer on the stuff that
> did make it into the merge window without this added complication.
> 
> Having said that, if people think it's worth the effort I don't mind
> setting up a new branch containing this series that can be sent to Linus
> much sooner.

Since this issue is quite serious on our systems (3.14 kernels won't
boot up), I'd really appreciate if the fixes can be made into some
3.14-rc.  It can hit on any EFI systems with large memory.

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/5] EFI memmap and other fixes, v3
  2014-01-23 16:30   ` Toshi Kani
@ 2014-01-23 16:51     ` Borislav Petkov
  2014-01-23 16:56       ` Toshi Kani
  0 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2014-01-23 16:51 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Matt Fleming, Linux EFI, LKML, Borislav Petkov, Matthew Garrett,
	H. Peter Anvin

On Thu, Jan 23, 2014 at 09:30:19AM -0700, Toshi Kani wrote:
> Since this issue is quite serious on our systems (3.14 kernels won't
> boot up), I'd really appreciate if the fixes can be made into some
> 3.14-rc.  It can hit on any EFI systems with large memory.

That's why we have "efi=old_map" - so that you can use the old mapping
method which is a chicken bit for the new scheme.

And I told you privately already that you're better off merging Matt's
'next' branch than us exposing this to the world too early. So stop
pushing this when there are clearly other solutions for your case and be
patient!

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/5] EFI memmap and other fixes, v3
  2014-01-23 16:51     ` Borislav Petkov
@ 2014-01-23 16:56       ` Toshi Kani
  0 siblings, 0 replies; 15+ messages in thread
From: Toshi Kani @ 2014-01-23 16:56 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Matt Fleming, Linux EFI, LKML, Borislav Petkov, Matthew Garrett,
	H. Peter Anvin

On Thu, 2014-01-23 at 17:51 +0100, Borislav Petkov wrote:
> On Thu, Jan 23, 2014 at 09:30:19AM -0700, Toshi Kani wrote:
> > Since this issue is quite serious on our systems (3.14 kernels won't
> > boot up), I'd really appreciate if the fixes can be made into some
> > 3.14-rc.  It can hit on any EFI systems with large memory.
> 
> That's why we have "efi=old_map" - so that you can use the old mapping
> method which is a chicken bit for the new scheme.

Good point.  Yes, that will help us a lot.

> And I told you privately already that you're better off merging Matt's
> 'next' branch than us exposing this to the world too early. So stop
> pushing this when there are clearly other solutions for your case and be
> patient!

Right, but I just responded to Matt's query. :-) 

Thanks,
-Toshi



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-01-23 17:02 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-18 11:48 [PATCH 0/5] EFI memmap and other fixes, v3 Borislav Petkov
2014-01-18 11:48 ` [PATCH 1/5] x86, ptdump: Add the functionality to dump an arbitrary pagetable Borislav Petkov
2014-01-18 11:48 ` [PATCH 2/5] efi: Dump the EFI page table Borislav Petkov
2014-01-20 13:38   ` Matt Fleming
2014-01-20 13:41     ` Borislav Petkov
2014-01-18 11:48 ` [PATCH 3/5] x86, pageattr: Export page unmapping interface Borislav Petkov
2014-01-18 11:48 ` [PATCH 4/5] efi: Make efi virtual runtime map passing more robust Borislav Petkov
2014-01-18 11:48 ` [PATCH 5/5] efi: Split efi_enter_virtual_mode Borislav Petkov
2014-01-20 13:44   ` Matt Fleming
2014-01-20 13:46     ` Borislav Petkov
2014-01-20 14:48       ` Matt Fleming
2014-01-23  9:06 ` [PATCH 0/5] EFI memmap and other fixes, v3 Matt Fleming
2014-01-23 16:30   ` Toshi Kani
2014-01-23 16:51     ` Borislav Petkov
2014-01-23 16:56       ` Toshi Kani

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).