All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] Misc GDT fixes and a cleanup
@ 2017-03-22 21:32 Andy Lutomirski
  2017-03-22 21:32 ` [PATCH 1/7] selftests/x86/ldt_gdt_32: Work around a glibc sigaction bug Andy Lutomirski
                   ` (8 more replies)
  0 siblings, 9 replies; 32+ messages in thread
From: Andy Lutomirski @ 2017-03-22 21:32 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Borislav Petkov, Boris Ostrovsky, Juergen Gross,
	Thomas Garnier, Andy Lutomirski

Hi all-

This applies to tip:x86/mm.  For ease of testing, the series is here, too:

https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/tag/?h=review_20170322_gdt_and_wp

This fixes a few issues, most of which appear to be rather old.  For
whatever reason, Thomas' GDT series unearthed them.  (And one is a
genuine bug in Thomas' code but, in his defense, he might have
cut-and-pasted it verbatim from the identical bug in the EFI code.)

The last three patches are cleanups I did while tracking these down.

Boris, any chance you could test this series on Xen?  The 64-bit
case works for me, but I'm having issues testing on 32-bit right
now.

Ingo, the first patch should address your concerns from the earlier
version.

Andy Lutomirski (7):
  selftests/x86/ldt_gdt_32: Work around a glibc sigaction bug
  x86/gdt: Fix setup_fixmap_gdt() to use the correct PA
  x86/efi/32: Fix EFI on systems where the percpu GDT is virtually
    mapped
  x86/boot/32: Defer resyncing initial_page_table until percpu is set up
  x86/gdt: Get rid of the get_*_gdt_*_vaddr() helpers
  x86/xen/gdt: Use X86_FEATURE_XENPV instead of globals for the GDT
    fixup
  x86/boot/32: Rewrite test_wp_bit()

 arch/x86/include/asm/desc.h           | 21 +++-------------
 arch/x86/include/asm/processor.h      |  2 --
 arch/x86/kernel/cpu/common.c          | 28 ++++++++++++---------
 arch/x86/kernel/cpu/proc.c            |  5 ++--
 arch/x86/kernel/setup.c               | 17 -------------
 arch/x86/kernel/setup_percpu.c        | 21 ++++++++++++++++
 arch/x86/kvm/vmx.c                    |  4 +--
 arch/x86/mm/init_32.c                 | 44 +++++++--------------------------
 arch/x86/platform/efi/efi_32.c        |  2 +-
 arch/x86/xen/enlighten.c              |  4 ---
 tools/testing/selftests/x86/ldt_gdt.c | 46 +++++++++++++++++++++++++++++++++++
 11 files changed, 101 insertions(+), 93 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 1/7] selftests/x86/ldt_gdt_32: Work around a glibc sigaction bug
  2017-03-22 21:32 [PATCH 0/7] Misc GDT fixes and a cleanup Andy Lutomirski
@ 2017-03-22 21:32 ` Andy Lutomirski
  2017-03-23  9:13   ` [tip:x86/mm] selftests/x86/ldt_gdt_32: Work around a glibc sigaction() bug tip-bot for Andy Lutomirski
  2017-03-22 21:32 ` [PATCH 2/7] x86/gdt: Fix setup_fixmap_gdt() to use the correct PA Andy Lutomirski
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Andy Lutomirski @ 2017-03-22 21:32 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Borislav Petkov, Boris Ostrovsky, Juergen Gross,
	Thomas Garnier, Andy Lutomirski, stable

i386 glibc is buggy and calls the sigaction syscall incorrectly.
This is asymptomatic for normal programs, but it blows up on
programs that do evil things with segmentation.  ldt_gdt an example
of such an evil program.

This doesn't appear to be a regression -- I think I just got lucky
with the uninitialized memory that glibc threw at the kernel when I
wrote the test.

This hackish fix manually issues sigaction(2) syscalls to undo the
damage.  Without the fix, ldt_gdt_32 segfaults; with the fix, it
passes for me.

See https://sourceware.org/bugzilla/show_bug.cgi?id=21269

Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 tools/testing/selftests/x86/ldt_gdt.c | 46 +++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/tools/testing/selftests/x86/ldt_gdt.c b/tools/testing/selftests/x86/ldt_gdt.c
index f6121612e769..b9a22f18566a 100644
--- a/tools/testing/selftests/x86/ldt_gdt.c
+++ b/tools/testing/selftests/x86/ldt_gdt.c
@@ -409,6 +409,51 @@ static void *threadproc(void *ctx)
 	}
 }
 
+#ifdef __i386__
+
+#ifndef SA_RESTORE
+#define SA_RESTORER 0x04000000
+#endif
+
+/*
+ * The UAPI header calls this 'struct sigaction', which conflicts with
+ * glibc.  Sigh.
+ */
+struct fake_ksigaction {
+	void *handler;  /* the real type is nasty */
+	unsigned long sa_flags;
+	void (*sa_restorer)(void);
+	unsigned char sigset[8];
+};
+
+static void fix_sa_restorer(int sig)
+{
+	struct fake_ksigaction ksa;
+
+	if (syscall(SYS_rt_sigaction, sig, NULL, &ksa, 8) == 0) {
+		/*
+		 * glibc has a nasty bug: it sometimes writes garbage to
+		 * sa_restorer.  This interacts quite badly with anything
+		 * that fiddles with SS because it can trigger legacy
+		 * stack switching.  Patch it up.  See:
+		 *
+		 * https://sourceware.org/bugzilla/show_bug.cgi?id=21269
+		 */
+		if (!(ksa.sa_flags & SA_RESTORER) && ksa.sa_restorer) {
+			ksa.sa_restorer = NULL;
+			if (syscall(SYS_rt_sigaction, sig, &ksa, NULL,
+				    sizeof(ksa.sigset)) != 0)
+				err(1, "rt_sigaction");
+		}
+	}
+}
+#else
+static void fix_sa_restorer(int sig)
+{
+	/* 64-bit glibc works fine. */
+}
+#endif
+
 static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
 		       int flags)
 {
@@ -420,6 +465,7 @@ static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
 	if (sigaction(sig, &sa, 0))
 		err(1, "sigaction");
 
+	fix_sa_restorer(sig);
 }
 
 static jmp_buf jmpbuf;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 2/7] x86/gdt: Fix setup_fixmap_gdt() to use the correct PA
  2017-03-22 21:32 [PATCH 0/7] Misc GDT fixes and a cleanup Andy Lutomirski
  2017-03-22 21:32 ` [PATCH 1/7] selftests/x86/ldt_gdt_32: Work around a glibc sigaction bug Andy Lutomirski
@ 2017-03-22 21:32 ` Andy Lutomirski
  2017-03-23  9:13   ` [tip:x86/mm] " tip-bot for Andy Lutomirski
  2017-03-22 21:32   ` Andy Lutomirski
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Andy Lutomirski @ 2017-03-22 21:32 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Borislav Petkov, Boris Ostrovsky, Juergen Gross,
	Thomas Garnier, Andy Lutomirski

__pa() cannot be used on percpu pointers because they may be
virtually mapped.  Use per_cpu_ptr_to_phys() instead.

This fixes a boot crash on a some 32-bit configurations.  I assume
this is related to which allocation strategy is chosen by the percpu
core.

Fixes: 69218e47994d x86: ("Remap GDT tables in the fixmap section")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/desc.h  | 6 ++++++
 arch/x86/kernel/cpu/common.c | 4 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index ec05f9c1a62c..bde11696b893 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -98,6 +98,12 @@ static inline unsigned long get_current_gdt_ro_vaddr(void)
 	return (unsigned long)get_current_gdt_ro();
 }
 
+/* Provide the physical address of the GDT page. */
+static inline phys_addr_t get_cpu_gdt_paddr(unsigned int cpu)
+{
+	return per_cpu_ptr_to_phys(get_cpu_gdt_rw(cpu));
+}
+
 #ifdef CONFIG_X86_64
 
 static inline void pack_gate(gate_desc *gate, unsigned type, unsigned long func,
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f8e22dbad86c..f6e20e2dbfa5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -461,8 +461,8 @@ pgprot_t pg_fixmap_gdt_flags = PAGE_KERNEL;
 /* Setup the fixmap mapping only once per-processor */
 static inline void setup_fixmap_gdt(int cpu)
 {
-	__set_fixmap(get_cpu_gdt_ro_index(cpu),
-		     __pa(get_cpu_gdt_rw(cpu)), pg_fixmap_gdt_flags);
+	__set_fixmap(get_cpu_gdt_ro_index(cpu), get_cpu_gdt_paddr(cpu),
+		     pg_fixmap_gdt_flags);
 }
 
 /* Load the original GDT from the per-cpu structure */
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 3/7] x86/efi/32: Fix EFI on systems where the percpu GDT is virtually mapped
@ 2017-03-22 21:32   ` Andy Lutomirski
  0 siblings, 0 replies; 32+ messages in thread
From: Andy Lutomirski @ 2017-03-22 21:32 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Borislav Petkov, Boris Ostrovsky, Juergen Gross,
	Thomas Garnier, Andy Lutomirski, Matt Fleming, Ard Biesheuvel,
	linux-efi

__pa on a percpu pointer is invalid.  This bug appears to go *waaay*
back, and I guess it's just never been triggered.

Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: linux-efi@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/platform/efi/efi_32.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/platform/efi/efi_32.c b/arch/x86/platform/efi/efi_32.c
index 950071171436..3481268da3d0 100644
--- a/arch/x86/platform/efi/efi_32.c
+++ b/arch/x86/platform/efi/efi_32.c
@@ -68,7 +68,7 @@ pgd_t * __init efi_call_phys_prolog(void)
 	load_cr3(initial_page_table);
 	__flush_tlb_all();
 
-	gdt_descr.address = __pa(get_cpu_gdt_rw(0));
+	gdt_descr.address = get_cpu_gdt_paddr(0);
 	gdt_descr.size = GDT_SIZE - 1;
 	load_gdt(&gdt_descr);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 3/7] x86/efi/32: Fix EFI on systems where the percpu GDT is virtually mapped
@ 2017-03-22 21:32   ` Andy Lutomirski
  0 siblings, 0 replies; 32+ messages in thread
From: Andy Lutomirski @ 2017-03-22 21:32 UTC (permalink / raw)
  To: x86-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, Borislav Petkov,
	Boris Ostrovsky, Juergen Gross, Thomas Garnier, Andy Lutomirski,
	Matt Fleming, Ard Biesheuvel, linux-efi-u79uwXL29TY76Z2rM5mHXA

__pa on a percpu pointer is invalid.  This bug appears to go *waaay*
back, and I guess it's just never been triggered.

Cc: Matt Fleming <matt-mF/unelCI9GS6iBeEJttW/XRex20P6io@public.gmane.org>
Cc: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
Cc: linux-efi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Signed-off-by: Andy Lutomirski <luto-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 arch/x86/platform/efi/efi_32.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/platform/efi/efi_32.c b/arch/x86/platform/efi/efi_32.c
index 950071171436..3481268da3d0 100644
--- a/arch/x86/platform/efi/efi_32.c
+++ b/arch/x86/platform/efi/efi_32.c
@@ -68,7 +68,7 @@ pgd_t * __init efi_call_phys_prolog(void)
 	load_cr3(initial_page_table);
 	__flush_tlb_all();
 
-	gdt_descr.address = __pa(get_cpu_gdt_rw(0));
+	gdt_descr.address = get_cpu_gdt_paddr(0);
 	gdt_descr.size = GDT_SIZE - 1;
 	load_gdt(&gdt_descr);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 4/7] x86/boot/32: Defer resyncing initial_page_table until percpu is set up
@ 2017-03-22 21:32   ` Andy Lutomirski
  0 siblings, 0 replies; 32+ messages in thread
From: Andy Lutomirski @ 2017-03-22 21:32 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Borislav Petkov, Boris Ostrovsky, Juergen Gross,
	Thomas Garnier, Andy Lutomirski, Matt Fleming, Ard Biesheuvel,
	linux-efi

The x86 smpboot trampoline expects initial_page_table to have the
GDT mapped.  If the GDT ends up in a virtually mapped percpu page,
then it won't be in the page tables at all until percpu areas are
set up.  The result will be a triple fault the first time that the
CPU attempts to access the GDT after LGDT loads the percpu GDT.

This appears to be an old bug, but somehow the GDT fixmap rework
is triggering it.  This seems to have something to do with the
memory layout.

Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: linux-efi@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/setup.c        | 15 ---------------
 arch/x86/kernel/setup_percpu.c | 21 +++++++++++++++++++++
 2 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 4bf0c8926a1c..56b1177155db 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1226,21 +1226,6 @@ void __init setup_arch(char **cmdline_p)
 
 	kasan_init();
 
-#ifdef CONFIG_X86_32
-	/* sync back kernel address range */
-	clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
-			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
-			KERNEL_PGD_PTRS);
-
-	/*
-	 * sync back low identity map too.  It is used for example
-	 * in the 32-bit EFI stub.
-	 */
-	clone_pgd_range(initial_page_table,
-			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
-			min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
-#endif
-
 	tboot_probe();
 
 	map_vsyscall();
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index 11338b0b3ad2..bb1e8cc0bc84 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -288,4 +288,25 @@ void __init setup_per_cpu_areas(void)
 
 	/* Setup cpu initialized, callin, callout masks */
 	setup_cpu_local_masks();
+
+#ifdef CONFIG_X86_32
+	/*
+	 * Sync back kernel address range.  We want to make sure that
+	 * all kernel mappings, including percpu mappings, are available
+	 * in the smpboot asm.  We can't reliably pick up percpu
+	 * mappings using vmalloc_fault(), because exception dispatch
+	 * needs percpu data.
+	 */
+	clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
+			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
+			KERNEL_PGD_PTRS);
+
+	/*
+	 * sync back low identity map too.  It is used for example
+	 * in the 32-bit EFI stub.
+	 */
+	clone_pgd_range(initial_page_table,
+			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
+			min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
+#endif
 }
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 4/7] x86/boot/32: Defer resyncing initial_page_table until percpu is set up
@ 2017-03-22 21:32   ` Andy Lutomirski
  0 siblings, 0 replies; 32+ messages in thread
From: Andy Lutomirski @ 2017-03-22 21:32 UTC (permalink / raw)
  To: x86-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, Borislav Petkov,
	Boris Ostrovsky, Juergen Gross, Thomas Garnier, Andy Lutomirski,
	Matt Fleming, Ard Biesheuvel, linux-efi-u79uwXL29TY76Z2rM5mHXA

The x86 smpboot trampoline expects initial_page_table to have the
GDT mapped.  If the GDT ends up in a virtually mapped percpu page,
then it won't be in the page tables at all until percpu areas are
set up.  The result will be a triple fault the first time that the
CPU attempts to access the GDT after LGDT loads the percpu GDT.

This appears to be an old bug, but somehow the GDT fixmap rework
is triggering it.  This seems to have something to do with the
memory layout.

Cc: Matt Fleming <matt-mF/unelCI9GS6iBeEJttW/XRex20P6io@public.gmane.org>
Cc: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
Cc: linux-efi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Signed-off-by: Andy Lutomirski <luto-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 arch/x86/kernel/setup.c        | 15 ---------------
 arch/x86/kernel/setup_percpu.c | 21 +++++++++++++++++++++
 2 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 4bf0c8926a1c..56b1177155db 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1226,21 +1226,6 @@ void __init setup_arch(char **cmdline_p)
 
 	kasan_init();
 
-#ifdef CONFIG_X86_32
-	/* sync back kernel address range */
-	clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
-			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
-			KERNEL_PGD_PTRS);
-
-	/*
-	 * sync back low identity map too.  It is used for example
-	 * in the 32-bit EFI stub.
-	 */
-	clone_pgd_range(initial_page_table,
-			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
-			min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
-#endif
-
 	tboot_probe();
 
 	map_vsyscall();
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index 11338b0b3ad2..bb1e8cc0bc84 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -288,4 +288,25 @@ void __init setup_per_cpu_areas(void)
 
 	/* Setup cpu initialized, callin, callout masks */
 	setup_cpu_local_masks();
+
+#ifdef CONFIG_X86_32
+	/*
+	 * Sync back kernel address range.  We want to make sure that
+	 * all kernel mappings, including percpu mappings, are available
+	 * in the smpboot asm.  We can't reliably pick up percpu
+	 * mappings using vmalloc_fault(), because exception dispatch
+	 * needs percpu data.
+	 */
+	clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
+			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
+			KERNEL_PGD_PTRS);
+
+	/*
+	 * sync back low identity map too.  It is used for example
+	 * in the 32-bit EFI stub.
+	 */
+	clone_pgd_range(initial_page_table,
+			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
+			min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
+#endif
 }
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 5/7] x86/gdt: Get rid of the get_*_gdt_*_vaddr() helpers
  2017-03-22 21:32 [PATCH 0/7] Misc GDT fixes and a cleanup Andy Lutomirski
                   ` (3 preceding siblings ...)
  2017-03-22 21:32   ` Andy Lutomirski
@ 2017-03-22 21:32 ` Andy Lutomirski
  2017-03-23  9:15   ` [tip:x86/mm] " tip-bot for Andy Lutomirski
  2017-03-22 21:32 ` [PATCH 6/7] x86/xen/gdt: Use X86_FEATURE_XENPV instead of globals for the GDT fixup Andy Lutomirski
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Andy Lutomirski @ 2017-03-22 21:32 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Borislav Petkov, Boris Ostrovsky, Juergen Gross,
	Thomas Garnier, Andy Lutomirski

There's a single caller that is only there because it's passing a
pointer into a function (vmcs_writel()) that takes an unsigned long.
Let's just cast it in place rather than having a bunch of trivial
helpers.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/desc.h | 20 --------------------
 arch/x86/kvm/vmx.c          |  4 ++--
 2 files changed, 2 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index bde11696b893..17cb46e8a184 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -53,22 +53,12 @@ static inline struct desc_struct *get_cpu_gdt_rw(unsigned int cpu)
 	return per_cpu(gdt_page, cpu).gdt;
 }
 
-static inline unsigned long get_cpu_gdt_rw_vaddr(unsigned int cpu)
-{
-	return (unsigned long)get_cpu_gdt_rw(cpu);
-}
-
 /* Provide the current original GDT */
 static inline struct desc_struct *get_current_gdt_rw(void)
 {
 	return this_cpu_ptr(&gdt_page)->gdt;
 }
 
-static inline unsigned long get_current_gdt_rw_vaddr(void)
-{
-	return (unsigned long)get_current_gdt_rw();
-}
-
 /* Get the fixmap index for a specific processor */
 static inline unsigned int get_cpu_gdt_ro_index(int cpu)
 {
@@ -82,22 +72,12 @@ static inline struct desc_struct *get_cpu_gdt_ro(int cpu)
 	return (struct desc_struct *)__fix_to_virt(idx);
 }
 
-static inline unsigned long get_cpu_gdt_ro_vaddr(int cpu)
-{
-	return (unsigned long)get_cpu_gdt_ro(cpu);
-}
-
 /* Provide the current read-only GDT */
 static inline struct desc_struct *get_current_gdt_ro(void)
 {
 	return get_cpu_gdt_ro(smp_processor_id());
 }
 
-static inline unsigned long get_current_gdt_ro_vaddr(void)
-{
-	return (unsigned long)get_current_gdt_ro();
-}
-
 /* Provide the physical address of the GDT page. */
 static inline phys_addr_t get_cpu_gdt_paddr(unsigned int cpu)
 {
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 596a76d82b11..3acde663dc58 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2264,7 +2264,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	}
 
 	if (!already_loaded) {
-		unsigned long gdt = get_current_gdt_ro_vaddr();
+		void *gdt = get_current_gdt_ro();
 		unsigned long sysenter_esp;
 
 		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
@@ -2275,7 +2275,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 		 */
 		vmcs_writel(HOST_TR_BASE,
 			    (unsigned long)this_cpu_ptr(&cpu_tss));
-		vmcs_writel(HOST_GDTR_BASE, gdt);   /* 22.2.4 */
+		vmcs_writel(HOST_GDTR_BASE, (unsigned long)gdt);   /* 22.2.4 */
 
 		/*
 		 * VM exits change the host TR limit to 0x67 after a VM
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 6/7] x86/xen/gdt: Use X86_FEATURE_XENPV instead of globals for the GDT fixup
  2017-03-22 21:32 [PATCH 0/7] Misc GDT fixes and a cleanup Andy Lutomirski
                   ` (4 preceding siblings ...)
  2017-03-22 21:32 ` [PATCH 5/7] x86/gdt: Get rid of the get_*_gdt_*_vaddr() helpers Andy Lutomirski
@ 2017-03-22 21:32 ` Andy Lutomirski
  2017-03-23  9:15   ` [tip:x86/mm] " tip-bot for Andy Lutomirski
  2017-03-22 21:32 ` [PATCH 7/7] x86/boot/32: Rewrite test_wp_bit() Andy Lutomirski
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Andy Lutomirski @ 2017-03-22 21:32 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Borislav Petkov, Boris Ostrovsky, Juergen Gross,
	Thomas Garnier, Andy Lutomirski

Xen imposes special requirements on the GDT.  Rather than using a
global variable for the pgprot, just use an explicit special case
for Xen -- this makes it clearer what's going on.  It also debloats
64-bit kernels very slightly.

Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/desc.h  |  1 -
 arch/x86/kernel/cpu/common.c | 28 +++++++++++++++++-----------
 arch/x86/xen/enlighten.c     |  3 ---
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index 17cb46e8a184..d0a21b12dd58 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -39,7 +39,6 @@ extern struct desc_ptr idt_descr;
 extern gate_desc idt_table[];
 extern const struct desc_ptr debug_idt_descr;
 extern gate_desc debug_idt_table[];
-extern pgprot_t pg_fixmap_gdt_flags;
 
 struct gdt_page {
 	struct desc_struct gdt[GDT_ENTRIES];
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f6e20e2dbfa5..8ee32119144d 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -448,21 +448,27 @@ void load_percpu_segment(int cpu)
 	load_stack_canary_segment();
 }
 
-/*
- * On 64-bit the GDT remapping is read-only.
- * A global is used for Xen to change the default when required.
- */
+/* Setup the fixmap mapping only once per-processor */
+static inline void setup_fixmap_gdt(int cpu)
+{
 #ifdef CONFIG_X86_64
-pgprot_t pg_fixmap_gdt_flags = PAGE_KERNEL_RO;
+	/* On 64-bit systems, we use a read-only fixmap GDT. */
+	pgprot_t prot = PAGE_KERNEL_RO;
 #else
-pgprot_t pg_fixmap_gdt_flags = PAGE_KERNEL;
+	/*
+	 * On native 32-bit systems, the GDT cannot be read-only because
+	 * our double fault handler uses a task gate, and entering through
+	 * a task gate needs to change an available TSS to busy.  If the GDT
+	 * is read-only, that will triple fault.
+	 *
+	 * On Xen PV, the GDT must be read-only because the hypervisor requires
+	 * it.
+	 */
+	pgprot_t prot = boot_cpu_has(X86_FEATURE_XENPV) ?
+		PAGE_KERNEL_RO : PAGE_KERNEL;
 #endif
 
-/* Setup the fixmap mapping only once per-processor */
-static inline void setup_fixmap_gdt(int cpu)
-{
-	__set_fixmap(get_cpu_gdt_ro_index(cpu), get_cpu_gdt_paddr(cpu),
-		     pg_fixmap_gdt_flags);
+	__set_fixmap(get_cpu_gdt_ro_index(cpu), get_cpu_gdt_paddr(cpu), prot);
 }
 
 /* Load the original GDT from the per-cpu structure */
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 08faa61de5f7..4951fcf95143 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1545,9 +1545,6 @@ asmlinkage __visible void __init xen_start_kernel(void)
 	 */
 	xen_initial_gdt = &per_cpu(gdt_page, 0);
 
-	/* GDT can only be remapped RO */
-	pg_fixmap_gdt_flags = PAGE_KERNEL_RO;
-
 	xen_smp_init();
 
 #ifdef CONFIG_ACPI_NUMA
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 7/7] x86/boot/32: Rewrite test_wp_bit()
  2017-03-22 21:32 [PATCH 0/7] Misc GDT fixes and a cleanup Andy Lutomirski
                   ` (5 preceding siblings ...)
  2017-03-22 21:32 ` [PATCH 6/7] x86/xen/gdt: Use X86_FEATURE_XENPV instead of globals for the GDT fixup Andy Lutomirski
@ 2017-03-22 21:32 ` Andy Lutomirski
  2017-03-23  7:31 ` [PATCH 0/7] Misc GDT fixes and a cleanup Ingo Molnar
  2017-03-23 12:18 ` Boris Ostrovsky
  8 siblings, 0 replies; 32+ messages in thread
From: Andy Lutomirski @ 2017-03-22 21:32 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Borislav Petkov, Boris Ostrovsky, Juergen Gross,
	Thomas Garnier, Andy Lutomirski

This code seems to be very old and has gotten only minor updates.
Nowadays we have a shiny function probe_kernel_write() that does
more or less exactly what we need.  Use it.

While we're at it, remove cpuinfo_x86::wp_works_ok, since we panic
on kernels where wp doesn't work okay.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/processor.h |  2 --
 arch/x86/kernel/cpu/proc.c       |  5 ++---
 arch/x86/kernel/setup.c          |  2 --
 arch/x86/mm/init_32.c            | 44 ++++++++--------------------------------
 arch/x86/xen/enlighten.c         |  1 -
 5 files changed, 11 insertions(+), 43 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index edf42c4ac8c8..2c7695c9c684 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -90,8 +90,6 @@ struct cpuinfo_x86 {
 	__u8			x86_model;
 	__u8			x86_mask;
 #ifdef CONFIG_X86_32
-	char			wp_works_ok;	/* It doesn't on 386's */
-
 	/* Problems on some 486Dx4's and old 386's: */
 	char			rfu;
 	char			pad0;
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index 18ca99f2798b..6df621ae62a7 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -31,14 +31,13 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c)
 		   "fpu\t\t: %s\n"
 		   "fpu_exception\t: %s\n"
 		   "cpuid level\t: %d\n"
-		   "wp\t\t: %s\n",
+		   "wp\t\t: yes\n",
 		   static_cpu_has_bug(X86_BUG_FDIV) ? "yes" : "no",
 		   static_cpu_has_bug(X86_BUG_F00F) ? "yes" : "no",
 		   static_cpu_has_bug(X86_BUG_COMA) ? "yes" : "no",
 		   static_cpu_has(X86_FEATURE_FPU) ? "yes" : "no",
 		   static_cpu_has(X86_FEATURE_FPU) ? "yes" : "no",
-		   c->cpuid_level,
-		   c->wp_works_ok ? "yes" : "no");
+		   c->cpuid_level);
 }
 #else
 static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 56b1177155db..462b7c69443d 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -175,11 +175,9 @@ static struct resource bss_resource = {
 #ifdef CONFIG_X86_32
 /* cpu data as detected by the assembly code in head.S */
 struct cpuinfo_x86 new_cpu_data = {
-	.wp_works_ok = -1,
 };
 /* common cpu data for all cpus */
 struct cpuinfo_x86 boot_cpu_data __read_mostly = {
-	.wp_works_ok = -1,
 };
 EXPORT_SYMBOL(boot_cpu_data);
 
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 5ed3c141bbd5..097089a5e4d5 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -56,8 +56,6 @@
 
 unsigned long highstart_pfn, highend_pfn;
 
-static noinline int do_test_wp_bit(void);
-
 bool __read_mostly __vmalloc_start_set = false;
 
 /*
@@ -726,20 +724,21 @@ void __init paging_init(void)
  */
 static void __init test_wp_bit(void)
 {
+	char z = 0;
+
 	printk(KERN_INFO
   "Checking if this processor honours the WP bit even in supervisor mode...");
 
-	/* Any page-aligned address will do, the test is non-destructive */
-	__set_fixmap(FIX_WP_TEST, __pa(&swapper_pg_dir), PAGE_KERNEL_RO);
-	boot_cpu_data.wp_works_ok = do_test_wp_bit();
-	clear_fixmap(FIX_WP_TEST);
+	__set_fixmap(FIX_WP_TEST, __pa_symbol(empty_zero_page), PAGE_KERNEL_RO);
 
-	if (!boot_cpu_data.wp_works_ok) {
+	if (probe_kernel_write((char *)fix_to_virt(FIX_WP_TEST), &z, 1) == 0) {
 		printk(KERN_CONT "No.\n");
 		panic("Linux doesn't support CPUs with broken WP.");
-	} else {
-		printk(KERN_CONT "Ok.\n");
 	}
+
+	clear_fixmap(FIX_WP_TEST);
+
+	printk(KERN_CONT "Ok.\n");
 }
 
 void __init mem_init(void)
@@ -821,8 +820,7 @@ void __init mem_init(void)
 	BUG_ON(VMALLOC_START				>= VMALLOC_END);
 	BUG_ON((unsigned long)high_memory		> VMALLOC_START);
 
-	if (boot_cpu_data.wp_works_ok < 0)
-		test_wp_bit();
+	test_wp_bit();
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
@@ -850,30 +848,6 @@ int arch_remove_memory(u64 start, u64 size)
 #endif
 #endif
 
-/*
- * This function cannot be __init, since exceptions don't work in that
- * section.  Put this after the callers, so that it cannot be inlined.
- */
-static noinline int do_test_wp_bit(void)
-{
-	char tmp_reg;
-	int flag;
-
-	__asm__ __volatile__(
-		"	movb %0, %1	\n"
-		"1:	movb %1, %0	\n"
-		"	xorl %2, %2	\n"
-		"2:			\n"
-		_ASM_EXTABLE(1b,2b)
-		:"=m" (*(char *)fix_to_virt(FIX_WP_TEST)),
-		 "=q" (tmp_reg),
-		 "=r" (flag)
-		:"2" (1)
-		:"memory");
-
-	return flag;
-}
-
 int kernel_set_to_readonly __read_mostly;
 
 void set_kernel_text_rw(void)
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 4951fcf95143..6efa0cc425a2 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1595,7 +1595,6 @@ asmlinkage __visible void __init xen_start_kernel(void)
 	/* set up basic CPUID stuff */
 	cpu_detect(&new_cpu_data);
 	set_cpu_cap(&new_cpu_data, X86_FEATURE_FPU);
-	new_cpu_data.wp_works_ok = 1;
 	new_cpu_data.x86_capability[CPUID_1_EDX] = cpuid_edx(1);
 #endif
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 0/7] Misc GDT fixes and a cleanup
  2017-03-22 21:32 [PATCH 0/7] Misc GDT fixes and a cleanup Andy Lutomirski
                   ` (6 preceding siblings ...)
  2017-03-22 21:32 ` [PATCH 7/7] x86/boot/32: Rewrite test_wp_bit() Andy Lutomirski
@ 2017-03-23  7:31 ` Ingo Molnar
  2017-03-23 12:18 ` Boris Ostrovsky
  8 siblings, 0 replies; 32+ messages in thread
From: Ingo Molnar @ 2017-03-23  7:31 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, linux-kernel, Borislav Petkov, Boris Ostrovsky,
	Juergen Gross, Thomas Garnier


* Andy Lutomirski <luto@kernel.org> wrote:

> Hi all-
> 
> This applies to tip:x86/mm.  For ease of testing, the series is here, too:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/tag/?h=review_20170322_gdt_and_wp
> 
> This fixes a few issues, most of which appear to be rather old.  For
> whatever reason, Thomas' GDT series unearthed them.  (And one is a
> genuine bug in Thomas' code but, in his defense, he might have
> cut-and-pasted it verbatim from the identical bug in the EFI code.)
> 
> The last three patches are cleanups I did while tracking these down.
> 
> Boris, any chance you could test this series on Xen?  The 64-bit
> case works for me, but I'm having issues testing on 32-bit right
> now.
> 
> Ingo, the first patch should address your concerns from the earlier
> version.
> 
> Andy Lutomirski (7):
>   selftests/x86/ldt_gdt_32: Work around a glibc sigaction bug
>   x86/gdt: Fix setup_fixmap_gdt() to use the correct PA
>   x86/efi/32: Fix EFI on systems where the percpu GDT is virtually
>     mapped
>   x86/boot/32: Defer resyncing initial_page_table until percpu is set up
>   x86/gdt: Get rid of the get_*_gdt_*_vaddr() helpers
>   x86/xen/gdt: Use X86_FEATURE_XENPV instead of globals for the GDT
>     fixup
>   x86/boot/32: Rewrite test_wp_bit()

Ok, looks mostly good to me and I've applied the first 6 patches to tip:x86/mm and 
will push them out if everything tests out fine.

Regarding patch #7: could you please split the last patch into two, and rebase the 
wp_works_ok removal on the very latest tip:x86/mm tree? There's some pending 
changes in tip:x86/process that conflict badly, so I've merged it into tip:x86/mm 
for a conflict-free base. That patch is better split in two anyway, as it does two 
only marginally related things.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [tip:x86/mm] selftests/x86/ldt_gdt_32: Work around a glibc sigaction() bug
  2017-03-22 21:32 ` [PATCH 1/7] selftests/x86/ldt_gdt_32: Work around a glibc sigaction bug Andy Lutomirski
@ 2017-03-23  9:13   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 32+ messages in thread
From: tip-bot for Andy Lutomirski @ 2017-03-23  9:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jpoimboe, dvlasenk, tglx, luto, torvalds, jgross, thgarnie,
	boris.ostrovsky, hpa, brgerst, linux-kernel, bp, mingo, peterz

Commit-ID:  65973dd3fd31151823f4b8c289eebbb3fb7e6bc0
Gitweb:     http://git.kernel.org/tip/65973dd3fd31151823f4b8c289eebbb3fb7e6bc0
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Wed, 22 Mar 2017 14:32:29 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 23 Mar 2017 08:25:07 +0100

selftests/x86/ldt_gdt_32: Work around a glibc sigaction() bug

i386 glibc is buggy and calls the sigaction syscall incorrectly.

This is asymptomatic for normal programs, but it blows up on
programs that do evil things with segmentation.  The ldt_gdt
self-test is an example of such an evil program.

This doesn't appear to be a regression -- I think I just got lucky
with the uninitialized memory that glibc threw at the kernel when I
wrote the test.

This hackish fix manually issues sigaction(2) syscalls to undo the
damage.  Without the fix, ldt_gdt_32 segfaults; with the fix, it
passes for me.

See: https://sourceware.org/bugzilla/show_bug.cgi?id=21269

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/aaab0f9f93c9af25396f01232608c163a760a668.1490218061.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/testing/selftests/x86/ldt_gdt.c | 46 +++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/tools/testing/selftests/x86/ldt_gdt.c b/tools/testing/selftests/x86/ldt_gdt.c
index f612161..b9a22f1 100644
--- a/tools/testing/selftests/x86/ldt_gdt.c
+++ b/tools/testing/selftests/x86/ldt_gdt.c
@@ -409,6 +409,51 @@ static void *threadproc(void *ctx)
 	}
 }
 
+#ifdef __i386__
+
+#ifndef SA_RESTORE
+#define SA_RESTORER 0x04000000
+#endif
+
+/*
+ * The UAPI header calls this 'struct sigaction', which conflicts with
+ * glibc.  Sigh.
+ */
+struct fake_ksigaction {
+	void *handler;  /* the real type is nasty */
+	unsigned long sa_flags;
+	void (*sa_restorer)(void);
+	unsigned char sigset[8];
+};
+
+static void fix_sa_restorer(int sig)
+{
+	struct fake_ksigaction ksa;
+
+	if (syscall(SYS_rt_sigaction, sig, NULL, &ksa, 8) == 0) {
+		/*
+		 * glibc has a nasty bug: it sometimes writes garbage to
+		 * sa_restorer.  This interacts quite badly with anything
+		 * that fiddles with SS because it can trigger legacy
+		 * stack switching.  Patch it up.  See:
+		 *
+		 * https://sourceware.org/bugzilla/show_bug.cgi?id=21269
+		 */
+		if (!(ksa.sa_flags & SA_RESTORER) && ksa.sa_restorer) {
+			ksa.sa_restorer = NULL;
+			if (syscall(SYS_rt_sigaction, sig, &ksa, NULL,
+				    sizeof(ksa.sigset)) != 0)
+				err(1, "rt_sigaction");
+		}
+	}
+}
+#else
+static void fix_sa_restorer(int sig)
+{
+	/* 64-bit glibc works fine. */
+}
+#endif
+
 static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
 		       int flags)
 {
@@ -420,6 +465,7 @@ static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
 	if (sigaction(sig, &sa, 0))
 		err(1, "sigaction");
 
+	fix_sa_restorer(sig);
 }
 
 static jmp_buf jmpbuf;

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [tip:x86/mm] x86/gdt: Fix setup_fixmap_gdt() to use the correct PA
  2017-03-22 21:32 ` [PATCH 2/7] x86/gdt: Fix setup_fixmap_gdt() to use the correct PA Andy Lutomirski
@ 2017-03-23  9:13   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 32+ messages in thread
From: tip-bot for Andy Lutomirski @ 2017-03-23  9:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jgross, mingo, boris.ostrovsky, peterz, bp, brgerst,
	linux-kernel, thgarnie, dvlasenk, jpoimboe, luto, tglx, hpa,
	torvalds

Commit-ID:  aa4ea675528f3fa11e9663e5a32f55a81c34dcac
Gitweb:     http://git.kernel.org/tip/aa4ea675528f3fa11e9663e5a32f55a81c34dcac
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Wed, 22 Mar 2017 14:32:30 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 23 Mar 2017 08:25:07 +0100

x86/gdt: Fix setup_fixmap_gdt() to use the correct PA

__pa() cannot be used on percpu pointers because they may be
virtually mapped.  Use per_cpu_ptr_to_phys() instead.

This fixes a boot crash on a some 32-bit configurations.  I assume
this is related to which allocation strategy is chosen by the percpu
core.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 69218e47994d x86: ("Remap GDT tables in the fixmap section")
Link: http://lkml.kernel.org/r/22e0069c29fba31998f193201e359eebfdac4960.1490218061.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/desc.h  | 6 ++++++
 arch/x86/kernel/cpu/common.c | 4 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index ec05f9c..bde1169 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -98,6 +98,12 @@ static inline unsigned long get_current_gdt_ro_vaddr(void)
 	return (unsigned long)get_current_gdt_ro();
 }
 
+/* Provide the physical address of the GDT page. */
+static inline phys_addr_t get_cpu_gdt_paddr(unsigned int cpu)
+{
+	return per_cpu_ptr_to_phys(get_cpu_gdt_rw(cpu));
+}
+
 #ifdef CONFIG_X86_64
 
 static inline void pack_gate(gate_desc *gate, unsigned type, unsigned long func,
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f8e22db..f6e20e2 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -461,8 +461,8 @@ pgprot_t pg_fixmap_gdt_flags = PAGE_KERNEL;
 /* Setup the fixmap mapping only once per-processor */
 static inline void setup_fixmap_gdt(int cpu)
 {
-	__set_fixmap(get_cpu_gdt_ro_index(cpu),
-		     __pa(get_cpu_gdt_rw(cpu)), pg_fixmap_gdt_flags);
+	__set_fixmap(get_cpu_gdt_ro_index(cpu), get_cpu_gdt_paddr(cpu),
+		     pg_fixmap_gdt_flags);
 }
 
 /* Load the original GDT from the per-cpu structure */

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [tip:x86/mm] x86/efi/32: Fix EFI on systems where the per-cpu GDT is virtually mapped
  2017-03-22 21:32   ` Andy Lutomirski
  (?)
@ 2017-03-23  9:14   ` tip-bot for Andy Lutomirski
  -1 siblings, 0 replies; 32+ messages in thread
From: tip-bot for Andy Lutomirski @ 2017-03-23  9:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: brgerst, torvalds, ard.biesheuvel, dvlasenk, jpoimboe, bp,
	thgarnie, peterz, matt, hpa, luto, tglx, boris.ostrovsky,
	linux-kernel, mingo, jgross

Commit-ID:  3fa1cabbc3b61224ef33d3ca4a1a96998529bc68
Gitweb:     http://git.kernel.org/tip/3fa1cabbc3b61224ef33d3ca4a1a96998529bc68
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Wed, 22 Mar 2017 14:32:31 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 23 Mar 2017 08:25:07 +0100

x86/efi/32: Fix EFI on systems where the per-cpu GDT is virtually mapped

__pa() on a per-cpu pointer is invalid.  This bug appears to go *waaay*
back, and I guess it's just never been triggered.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/5ba1d3ffca85e1a5b3ac99265ebe55df4cf0dbe4.1490218061.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/platform/efi/efi_32.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/platform/efi/efi_32.c b/arch/x86/platform/efi/efi_32.c
index 9500711..3481268 100644
--- a/arch/x86/platform/efi/efi_32.c
+++ b/arch/x86/platform/efi/efi_32.c
@@ -68,7 +68,7 @@ pgd_t * __init efi_call_phys_prolog(void)
 	load_cr3(initial_page_table);
 	__flush_tlb_all();
 
-	gdt_descr.address = __pa(get_cpu_gdt_rw(0));
+	gdt_descr.address = get_cpu_gdt_paddr(0);
 	gdt_descr.size = GDT_SIZE - 1;
 	load_gdt(&gdt_descr);
 

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
  2017-03-22 21:32   ` Andy Lutomirski
  (?)
@ 2017-03-23  9:14   ` tip-bot for Andy Lutomirski
  2017-05-08  6:31       ` Jan Kiszka
  -1 siblings, 1 reply; 32+ messages in thread
From: tip-bot for Andy Lutomirski @ 2017-03-23  9:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, linux-kernel, hpa, boris.ostrovsky, bp, jpoimboe,
	torvalds, matt, luto, brgerst, tglx, thgarnie, mingo, dvlasenk,
	jgross, ard.biesheuvel

Commit-ID:  23b2a4ddebdd17fad265b4bb77256c2e4ec37dee
Gitweb:     http://git.kernel.org/tip/23b2a4ddebdd17fad265b4bb77256c2e4ec37dee
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Wed, 22 Mar 2017 14:32:32 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 23 Mar 2017 08:25:08 +0100

x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up

The x86 smpboot trampoline expects initial_page_table to have the
GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
then it won't be in the page tables at all until perc-pu areas are
set up.  The result will be a triple fault the first time that the
CPU attempts to access the GDT after LGDT loads the perc-pu GDT.

This appears to be an old bug, but somehow the GDT fixmap rework
is triggering it.  This seems to have something to do with the
memory layout.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/a553264a5972c6a86f9b5caac237470a0c74a720.1490218061.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/setup.c        | 15 ---------------
 arch/x86/kernel/setup_percpu.c | 21 +++++++++++++++++++++
 2 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 4bf0c89..56b1177 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1226,21 +1226,6 @@ void __init setup_arch(char **cmdline_p)
 
 	kasan_init();
 
-#ifdef CONFIG_X86_32
-	/* sync back kernel address range */
-	clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
-			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
-			KERNEL_PGD_PTRS);
-
-	/*
-	 * sync back low identity map too.  It is used for example
-	 * in the 32-bit EFI stub.
-	 */
-	clone_pgd_range(initial_page_table,
-			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
-			min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
-#endif
-
 	tboot_probe();
 
 	map_vsyscall();
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index 11338b0..bb1e8cc 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -288,4 +288,25 @@ void __init setup_per_cpu_areas(void)
 
 	/* Setup cpu initialized, callin, callout masks */
 	setup_cpu_local_masks();
+
+#ifdef CONFIG_X86_32
+	/*
+	 * Sync back kernel address range.  We want to make sure that
+	 * all kernel mappings, including percpu mappings, are available
+	 * in the smpboot asm.  We can't reliably pick up percpu
+	 * mappings using vmalloc_fault(), because exception dispatch
+	 * needs percpu data.
+	 */
+	clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
+			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
+			KERNEL_PGD_PTRS);
+
+	/*
+	 * sync back low identity map too.  It is used for example
+	 * in the 32-bit EFI stub.
+	 */
+	clone_pgd_range(initial_page_table,
+			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
+			min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
+#endif
 }

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [tip:x86/mm] x86/gdt: Get rid of the get_*_gdt_*_vaddr() helpers
  2017-03-22 21:32 ` [PATCH 5/7] x86/gdt: Get rid of the get_*_gdt_*_vaddr() helpers Andy Lutomirski
@ 2017-03-23  9:15   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 32+ messages in thread
From: tip-bot for Andy Lutomirski @ 2017-03-23  9:15 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: boris.ostrovsky, jpoimboe, tglx, thgarnie, dvlasenk,
	linux-kernel, torvalds, hpa, peterz, jgross, brgerst, luto, bp,
	mingo

Commit-ID:  59c58ceb29d0f030eddb36a3a9dbadcc499786a6
Gitweb:     http://git.kernel.org/tip/59c58ceb29d0f030eddb36a3a9dbadcc499786a6
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Wed, 22 Mar 2017 14:32:33 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 23 Mar 2017 08:25:08 +0100

x86/gdt: Get rid of the get_*_gdt_*_vaddr() helpers

There's a single caller that is only there because it's passing a
pointer into a function (vmcs_writel()) that takes an unsigned long.
Let's just cast it in place rather than having a bunch of trivial
helpers.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/46108fb35e1699252b1b6a85039303ff562c9836.1490218061.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/desc.h | 20 --------------------
 arch/x86/kvm/vmx.c          |  4 ++--
 2 files changed, 2 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index bde1169..17cb46e 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -53,22 +53,12 @@ static inline struct desc_struct *get_cpu_gdt_rw(unsigned int cpu)
 	return per_cpu(gdt_page, cpu).gdt;
 }
 
-static inline unsigned long get_cpu_gdt_rw_vaddr(unsigned int cpu)
-{
-	return (unsigned long)get_cpu_gdt_rw(cpu);
-}
-
 /* Provide the current original GDT */
 static inline struct desc_struct *get_current_gdt_rw(void)
 {
 	return this_cpu_ptr(&gdt_page)->gdt;
 }
 
-static inline unsigned long get_current_gdt_rw_vaddr(void)
-{
-	return (unsigned long)get_current_gdt_rw();
-}
-
 /* Get the fixmap index for a specific processor */
 static inline unsigned int get_cpu_gdt_ro_index(int cpu)
 {
@@ -82,22 +72,12 @@ static inline struct desc_struct *get_cpu_gdt_ro(int cpu)
 	return (struct desc_struct *)__fix_to_virt(idx);
 }
 
-static inline unsigned long get_cpu_gdt_ro_vaddr(int cpu)
-{
-	return (unsigned long)get_cpu_gdt_ro(cpu);
-}
-
 /* Provide the current read-only GDT */
 static inline struct desc_struct *get_current_gdt_ro(void)
 {
 	return get_cpu_gdt_ro(smp_processor_id());
 }
 
-static inline unsigned long get_current_gdt_ro_vaddr(void)
-{
-	return (unsigned long)get_current_gdt_ro();
-}
-
 /* Provide the physical address of the GDT page. */
 static inline phys_addr_t get_cpu_gdt_paddr(unsigned int cpu)
 {
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 596a76d..3acde66 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2264,7 +2264,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	}
 
 	if (!already_loaded) {
-		unsigned long gdt = get_current_gdt_ro_vaddr();
+		void *gdt = get_current_gdt_ro();
 		unsigned long sysenter_esp;
 
 		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
@@ -2275,7 +2275,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 		 */
 		vmcs_writel(HOST_TR_BASE,
 			    (unsigned long)this_cpu_ptr(&cpu_tss));
-		vmcs_writel(HOST_GDTR_BASE, gdt);   /* 22.2.4 */
+		vmcs_writel(HOST_GDTR_BASE, (unsigned long)gdt);   /* 22.2.4 */
 
 		/*
 		 * VM exits change the host TR limit to 0x67 after a VM

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [tip:x86/mm] x86/xen/gdt: Use X86_FEATURE_XENPV instead of globals for the GDT fixup
  2017-03-22 21:32 ` [PATCH 6/7] x86/xen/gdt: Use X86_FEATURE_XENPV instead of globals for the GDT fixup Andy Lutomirski
@ 2017-03-23  9:15   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 32+ messages in thread
From: tip-bot for Andy Lutomirski @ 2017-03-23  9:15 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, torvalds, boris.ostrovsky, tglx, dvlasenk, bp,
	brgerst, jpoimboe, peterz, hpa, thgarnie, mingo, jgross, luto

Commit-ID:  b23adb7d3f7d1d7cce03db9704de67a99ceeda38
Gitweb:     http://git.kernel.org/tip/b23adb7d3f7d1d7cce03db9704de67a99ceeda38
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Wed, 22 Mar 2017 14:32:34 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 23 Mar 2017 08:25:08 +0100

x86/xen/gdt: Use X86_FEATURE_XENPV instead of globals for the GDT fixup

Xen imposes special requirements on the GDT.  Rather than using a
global variable for the pgprot, just use an explicit special case
for Xen -- this makes it clearer what's going on.  It also debloats
64-bit kernels very slightly.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e9ea96abbfd6a8c87753849171bb5987ecfeb523.1490218061.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/desc.h  |  1 -
 arch/x86/kernel/cpu/common.c | 28 +++++++++++++++++-----------
 arch/x86/xen/enlighten.c     |  3 ---
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index 17cb46e..d0a21b1 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -39,7 +39,6 @@ extern struct desc_ptr idt_descr;
 extern gate_desc idt_table[];
 extern const struct desc_ptr debug_idt_descr;
 extern gate_desc debug_idt_table[];
-extern pgprot_t pg_fixmap_gdt_flags;
 
 struct gdt_page {
 	struct desc_struct gdt[GDT_ENTRIES];
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f6e20e2..8ee3211 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -448,21 +448,27 @@ void load_percpu_segment(int cpu)
 	load_stack_canary_segment();
 }
 
-/*
- * On 64-bit the GDT remapping is read-only.
- * A global is used for Xen to change the default when required.
- */
+/* Setup the fixmap mapping only once per-processor */
+static inline void setup_fixmap_gdt(int cpu)
+{
 #ifdef CONFIG_X86_64
-pgprot_t pg_fixmap_gdt_flags = PAGE_KERNEL_RO;
+	/* On 64-bit systems, we use a read-only fixmap GDT. */
+	pgprot_t prot = PAGE_KERNEL_RO;
 #else
-pgprot_t pg_fixmap_gdt_flags = PAGE_KERNEL;
+	/*
+	 * On native 32-bit systems, the GDT cannot be read-only because
+	 * our double fault handler uses a task gate, and entering through
+	 * a task gate needs to change an available TSS to busy.  If the GDT
+	 * is read-only, that will triple fault.
+	 *
+	 * On Xen PV, the GDT must be read-only because the hypervisor requires
+	 * it.
+	 */
+	pgprot_t prot = boot_cpu_has(X86_FEATURE_XENPV) ?
+		PAGE_KERNEL_RO : PAGE_KERNEL;
 #endif
 
-/* Setup the fixmap mapping only once per-processor */
-static inline void setup_fixmap_gdt(int cpu)
-{
-	__set_fixmap(get_cpu_gdt_ro_index(cpu), get_cpu_gdt_paddr(cpu),
-		     pg_fixmap_gdt_flags);
+	__set_fixmap(get_cpu_gdt_ro_index(cpu), get_cpu_gdt_paddr(cpu), prot);
 }
 
 /* Load the original GDT from the per-cpu structure */
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 08faa61..4951fcf 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1545,9 +1545,6 @@ asmlinkage __visible void __init xen_start_kernel(void)
 	 */
 	xen_initial_gdt = &per_cpu(gdt_page, 0);
 
-	/* GDT can only be remapped RO */
-	pg_fixmap_gdt_flags = PAGE_KERNEL_RO;
-
 	xen_smp_init();
 
 #ifdef CONFIG_ACPI_NUMA

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 0/7] Misc GDT fixes and a cleanup
  2017-03-22 21:32 [PATCH 0/7] Misc GDT fixes and a cleanup Andy Lutomirski
                   ` (7 preceding siblings ...)
  2017-03-23  7:31 ` [PATCH 0/7] Misc GDT fixes and a cleanup Ingo Molnar
@ 2017-03-23 12:18 ` Boris Ostrovsky
  8 siblings, 0 replies; 32+ messages in thread
From: Boris Ostrovsky @ 2017-03-23 12:18 UTC (permalink / raw)
  To: Andy Lutomirski, x86
  Cc: linux-kernel, Borislav Petkov, Juergen Gross, Thomas Garnier

On 03/22/2017 05:32 PM, Andy Lutomirski wrote:
> Hi all-
>
> This applies to tip:x86/mm.  For ease of testing, the series is here, too:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/tag/?h=review_20170322_gdt_and_wp
>
> This fixes a few issues, most of which appear to be rather old.  For
> whatever reason, Thomas' GDT series unearthed them.  (And one is a
> genuine bug in Thomas' code but, in his defense, he might have
> cut-and-pasted it verbatim from the identical bug in the EFI code.)
>
> The last three patches are cleanups I did while tracking these down.
>
> Boris, any chance you could test this series on Xen?  The 64-bit
> case works for me, but I'm having issues testing on 32-bit right
> now.

Yes, this is all good. Tests passed.

-boris

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
@ 2017-05-08  6:31       ` Jan Kiszka
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Kiszka @ 2017-05-08  6:31 UTC (permalink / raw)
  To: luto, mingo, x86, linux-efi
  Cc: linux-kernel, peterz, torvalds, jpoimboe, boris.ostrovsky, bp,
	hpa, matt, tglx, brgerst, thgarnie, dvlasenk, jgross,
	ard.biesheuvel

On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
> The x86 smpboot trampoline expects initial_page_table to have the
> GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
> then it won't be in the page tables at all until perc-pu areas are
> set up.  The result will be a triple fault the first time that the
> CPU attempts to access the GDT after LGDT loads the perc-pu GDT.
> 
> This appears to be an old bug, but somehow the GDT fixmap rework
> is triggering it.  This seems to have something to do with the
> memory layout.
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Brian Gerst <brgerst@gmail.com>
> Cc: Denys Vlasenko <dvlasenk@redhat.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Garnier <thgarnie@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: linux-efi@vger.kernel.org
> Link: http://lkml.kernel.org/r/a553264a5972c6a86f9b5caac237470a0c74a720.1490218061.git.luto@kernel.org
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>  arch/x86/kernel/setup.c        | 15 ---------------
>  arch/x86/kernel/setup_percpu.c | 21 +++++++++++++++++++++
>  2 files changed, 21 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 4bf0c89..56b1177 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -1226,21 +1226,6 @@ void __init setup_arch(char **cmdline_p)
>  
>  	kasan_init();
>  
> -#ifdef CONFIG_X86_32
> -	/* sync back kernel address range */
> -	clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
> -			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
> -			KERNEL_PGD_PTRS);
> -
> -	/*
> -	 * sync back low identity map too.  It is used for example
> -	 * in the 32-bit EFI stub.
> -	 */
> -	clone_pgd_range(initial_page_table,
> -			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
> -			min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
> -#endif
> -
>  	tboot_probe();
>  
>  	map_vsyscall();
> diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
> index 11338b0..bb1e8cc 100644
> --- a/arch/x86/kernel/setup_percpu.c
> +++ b/arch/x86/kernel/setup_percpu.c
> @@ -288,4 +288,25 @@ void __init setup_per_cpu_areas(void)
>  
>  	/* Setup cpu initialized, callin, callout masks */
>  	setup_cpu_local_masks();
> +
> +#ifdef CONFIG_X86_32
> +	/*
> +	 * Sync back kernel address range.  We want to make sure that
> +	 * all kernel mappings, including percpu mappings, are available
> +	 * in the smpboot asm.  We can't reliably pick up percpu
> +	 * mappings using vmalloc_fault(), because exception dispatch
> +	 * needs percpu data.
> +	 */
> +	clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
> +			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
> +			KERNEL_PGD_PTRS);
> +
> +	/*
> +	 * sync back low identity map too.  It is used for example
> +	 * in the 32-bit EFI stub.
> +	 */
> +	clone_pgd_range(initial_page_table,
> +			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
> +			min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
> +#endif
>  }
> 

This breaks the boot on our Intel Quark platform (IOT2000, similar to
Galileo Gen2). Reverting it over master makes it work again. Any idea
what goes wrong? Let me know how I can help debugging this.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
@ 2017-05-08  6:31       ` Jan Kiszka
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Kiszka @ 2017-05-08  6:31 UTC (permalink / raw)
  To: luto-DgEjT+Ai2ygdnm+yROfE0A, mingo-DgEjT+Ai2ygdnm+yROfE0A, x86,
	linux-efi
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	jpoimboe-H+wXaHxf7aLQT0dZR+AlfA,
	boris.ostrovsky-QHcLZuEGTsvQT0dZR+AlfA,
	bp-Gina5bIWoIWzQB+pC5nmwQ, hpa-YMNOUZJC4hwAvxtiuMwx3w,
	matt-mF/unelCI9GS6iBeEJttW/XRex20P6io,
	tglx-hfZtesqFncYOwBW4kG4KsQ, brgerst-Re5JQEeQqe8AvxtiuMwx3w,
	thgarnie-hpIqsD4AKlfQT0dZR+AlfA, dvlasenk-H+wXaHxf7aLQT0dZR+AlfA,
	jgross-IBi9RG/b67k, ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A

On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
> The x86 smpboot trampoline expects initial_page_table to have the
> GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
> then it won't be in the page tables at all until perc-pu areas are
> set up.  The result will be a triple fault the first time that the
> CPU attempts to access the GDT after LGDT loads the perc-pu GDT.
> 
> This appears to be an old bug, but somehow the GDT fixmap rework
> is triggering it.  This seems to have something to do with the
> memory layout.
> 
> Signed-off-by: Andy Lutomirski <luto-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> Cc: Boris Ostrovsky <boris.ostrovsky-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> Cc: Borislav Petkov <bp-Gina5bIWoIWzQB+pC5nmwQ@public.gmane.org>
> Cc: Brian Gerst <brgerst-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Denys Vlasenko <dvlasenk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: H. Peter Anvin <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
> Cc: Josh Poimboeuf <jpoimboe-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Juergen Gross <jgross-IBi9RG/b67k@public.gmane.org>
> Cc: Linus Torvalds <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
> Cc: Matt Fleming <matt-mF/unelCI9GS6iBeEJttW/XRex20P6io@public.gmane.org>
> Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
> Cc: Thomas Garnier <thgarnie-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Cc: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
> Cc: linux-efi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Link: http://lkml.kernel.org/r/a553264a5972c6a86f9b5caac237470a0c74a720.1490218061.git.luto-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org
> Signed-off-by: Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> ---
>  arch/x86/kernel/setup.c        | 15 ---------------
>  arch/x86/kernel/setup_percpu.c | 21 +++++++++++++++++++++
>  2 files changed, 21 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 4bf0c89..56b1177 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -1226,21 +1226,6 @@ void __init setup_arch(char **cmdline_p)
>  
>  	kasan_init();
>  
> -#ifdef CONFIG_X86_32
> -	/* sync back kernel address range */
> -	clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
> -			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
> -			KERNEL_PGD_PTRS);
> -
> -	/*
> -	 * sync back low identity map too.  It is used for example
> -	 * in the 32-bit EFI stub.
> -	 */
> -	clone_pgd_range(initial_page_table,
> -			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
> -			min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
> -#endif
> -
>  	tboot_probe();
>  
>  	map_vsyscall();
> diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
> index 11338b0..bb1e8cc 100644
> --- a/arch/x86/kernel/setup_percpu.c
> +++ b/arch/x86/kernel/setup_percpu.c
> @@ -288,4 +288,25 @@ void __init setup_per_cpu_areas(void)
>  
>  	/* Setup cpu initialized, callin, callout masks */
>  	setup_cpu_local_masks();
> +
> +#ifdef CONFIG_X86_32
> +	/*
> +	 * Sync back kernel address range.  We want to make sure that
> +	 * all kernel mappings, including percpu mappings, are available
> +	 * in the smpboot asm.  We can't reliably pick up percpu
> +	 * mappings using vmalloc_fault(), because exception dispatch
> +	 * needs percpu data.
> +	 */
> +	clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
> +			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
> +			KERNEL_PGD_PTRS);
> +
> +	/*
> +	 * sync back low identity map too.  It is used for example
> +	 * in the 32-bit EFI stub.
> +	 */
> +	clone_pgd_range(initial_page_table,
> +			swapper_pg_dir     + KERNEL_PGD_BOUNDARY,
> +			min(KERNEL_PGD_PTRS, KERNEL_PGD_BOUNDARY));
> +#endif
>  }
> 

This breaks the boot on our Intel Quark platform (IOT2000, similar to
Galileo Gen2). Reverting it over master makes it work again. Any idea
what goes wrong? Let me know how I can help debugging this.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
@ 2017-05-08  9:32         ` Andy Shevchenko
  0 siblings, 0 replies; 32+ messages in thread
From: Andy Shevchenko @ 2017-05-08  9:32 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Andy Lutomirski, Ingo Molnar, x86, linux-efi, linux-kernel,
	Peter Zijlstra (Intel),
	Linus Torvalds, jpoimboe, Boris Ostrovsky, Borislav Petkov,
	H. Peter Anvin, Matt Fleming, Thomas Gleixner, Brian Gerst,
	thgarnie, Denys Vlasenko, Juergen Gross, Ard Biesheuvel

On Mon, May 8, 2017 at 9:31 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
>> The x86 smpboot trampoline expects initial_page_table to have the
>> GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
>> then it won't be in the page tables at all until perc-pu areas are
>> set up.  The result will be a triple fault the first time that the
>> CPU attempts to access the GDT after LGDT loads the perc-pu GDT.
>>
>> This appears to be an old bug, but somehow the GDT fixmap rework
>> is triggering it.  This seems to have something to do with the
>> memory layout.

> This breaks the boot on our Intel Quark platform (IOT2000, similar to
> Galileo Gen2). Reverting it over master makes it work again. Any idea
> what goes wrong? Let me know how I can help debugging this.

JFYI: As of today linux-next when _kexec:ed_ works fine to me

Perhaps I can test this later with direct boot from SD card.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
@ 2017-05-08  9:32         ` Andy Shevchenko
  0 siblings, 0 replies; 32+ messages in thread
From: Andy Shevchenko @ 2017-05-08  9:32 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Andy Lutomirski, Ingo Molnar, x86, linux-efi,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Peter Zijlstra (Intel),
	Linus Torvalds, jpoimboe-H+wXaHxf7aLQT0dZR+AlfA, Boris Ostrovsky,
	Borislav Petkov, H. Peter Anvin, Matt Fleming, Thomas Gleixner,
	Brian Gerst, thgarnie-hpIqsD4AKlfQT0dZR+AlfA, Denys Vlasenko,
	Juergen Gross, Ard Biesheuvel

On Mon, May 8, 2017 at 9:31 AM, Jan Kiszka <jan.kiszka-kv7WeFo6aLtBDgjK7y7TUQ@public.gmane.org> wrote:
> On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
>> The x86 smpboot trampoline expects initial_page_table to have the
>> GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
>> then it won't be in the page tables at all until perc-pu areas are
>> set up.  The result will be a triple fault the first time that the
>> CPU attempts to access the GDT after LGDT loads the perc-pu GDT.
>>
>> This appears to be an old bug, but somehow the GDT fixmap rework
>> is triggering it.  This seems to have something to do with the
>> memory layout.

> This breaks the boot on our Intel Quark platform (IOT2000, similar to
> Galileo Gen2). Reverting it over master makes it work again. Any idea
> what goes wrong? Let me know how I can help debugging this.

JFYI: As of today linux-next when _kexec:ed_ works fine to me

Perhaps I can test this later with direct boot from SD card.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
@ 2017-05-08 11:21           ` Andy Lutomirski
  0 siblings, 0 replies; 32+ messages in thread
From: Andy Lutomirski @ 2017-05-08 11:21 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Jan Kiszka, Andy Lutomirski, Ingo Molnar, x86, linux-efi,
	linux-kernel, Peter Zijlstra (Intel),
	Linus Torvalds, Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov,
	H. Peter Anvin, Matt Fleming, Thomas Gleixner, Brian Gerst,
	Thomas Garnier, Denys Vlasenko, Juergen Gross, Ard Biesheuvel

On Mon, May 8, 2017 at 2:32 AM, Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> On Mon, May 8, 2017 at 9:31 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
>>> The x86 smpboot trampoline expects initial_page_table to have the
>>> GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
>>> then it won't be in the page tables at all until perc-pu areas are
>>> set up.  The result will be a triple fault the first time that the
>>> CPU attempts to access the GDT after LGDT loads the perc-pu GDT.
>>>
>>> This appears to be an old bug, but somehow the GDT fixmap rework
>>> is triggering it.  This seems to have something to do with the
>>> memory layout.
>
>> This breaks the boot on our Intel Quark platform (IOT2000, similar to
>> Galileo Gen2). Reverting it over master makes it work again. Any idea
>> what goes wrong? Let me know how I can help debugging this.
>
> JFYI: As of today linux-next when _kexec:ed_ works fine to me
>
> Perhaps I can test this later with direct boot from SD card.
>

The most likely explanation is that there's some code that needs the
page table synced and runs before setup_per_cpu_areas().  The relevant
init code is:

    setup_arch(&command_line);
    mm_init_cpumask(&init_mm);
    setup_command_line(command_line);
    setup_nr_cpu_ids();
    setup_per_cpu_areas();

so I didn't move it very far.  It would be awesome if we could get a
backtrace when the failure happens, but it's likely to be a triple
fault.  Is this an EFI boot?  I bet the failure is in efi_init().

Could you try reverting just the deletions in the patch?  I.e. try a
kernel with both the old and the new copies of the code I moved.

--Andy

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
@ 2017-05-08 11:21           ` Andy Lutomirski
  0 siblings, 0 replies; 32+ messages in thread
From: Andy Lutomirski @ 2017-05-08 11:21 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Jan Kiszka, Andy Lutomirski, Ingo Molnar, x86, linux-efi,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Peter Zijlstra (Intel),
	Linus Torvalds, Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov,
	H. Peter Anvin, Matt Fleming, Thomas Gleixner, Brian Gerst,
	Thomas Garnier, Denys Vlasenko, Juergen Gross, Ard Biesheuvel

On Mon, May 8, 2017 at 2:32 AM, Andy Shevchenko
<andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Mon, May 8, 2017 at 9:31 AM, Jan Kiszka <jan.kiszka-kv7WeFo6aLtBDgjK7y7TUQ@public.gmane.org> wrote:
>> On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
>>> The x86 smpboot trampoline expects initial_page_table to have the
>>> GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
>>> then it won't be in the page tables at all until perc-pu areas are
>>> set up.  The result will be a triple fault the first time that the
>>> CPU attempts to access the GDT after LGDT loads the perc-pu GDT.
>>>
>>> This appears to be an old bug, but somehow the GDT fixmap rework
>>> is triggering it.  This seems to have something to do with the
>>> memory layout.
>
>> This breaks the boot on our Intel Quark platform (IOT2000, similar to
>> Galileo Gen2). Reverting it over master makes it work again. Any idea
>> what goes wrong? Let me know how I can help debugging this.
>
> JFYI: As of today linux-next when _kexec:ed_ works fine to me
>
> Perhaps I can test this later with direct boot from SD card.
>

The most likely explanation is that there's some code that needs the
page table synced and runs before setup_per_cpu_areas().  The relevant
init code is:

    setup_arch(&command_line);
    mm_init_cpumask(&init_mm);
    setup_command_line(command_line);
    setup_nr_cpu_ids();
    setup_per_cpu_areas();

so I didn't move it very far.  It would be awesome if we could get a
backtrace when the failure happens, but it's likely to be a triple
fault.  Is this an EFI boot?  I bet the failure is in efi_init().

Could you try reverting just the deletions in the patch?  I.e. try a
kernel with both the old and the new copies of the code I moved.

--Andy

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
  2017-05-08 11:21           ` Andy Lutomirski
  (?)
@ 2017-05-08 12:34           ` Jan Kiszka
  2017-05-08 14:45               ` Andy Shevchenko
  2017-05-08 17:53               ` Jan Kiszka
  -1 siblings, 2 replies; 32+ messages in thread
From: Jan Kiszka @ 2017-05-08 12:34 UTC (permalink / raw)
  To: Andy Lutomirski, Andy Shevchenko
  Cc: Ingo Molnar, x86, linux-efi, linux-kernel, Peter Zijlstra (Intel),
	Linus Torvalds, Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov,
	H. Peter Anvin, Matt Fleming, Thomas Gleixner, Brian Gerst,
	Thomas Garnier, Denys Vlasenko, Juergen Gross, Ard Biesheuvel

On 2017-05-08 13:21, Andy Lutomirski wrote:
> On Mon, May 8, 2017 at 2:32 AM, Andy Shevchenko
> <andy.shevchenko@gmail.com> wrote:
>> On Mon, May 8, 2017 at 9:31 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>> On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
>>>> The x86 smpboot trampoline expects initial_page_table to have the
>>>> GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
>>>> then it won't be in the page tables at all until perc-pu areas are
>>>> set up.  The result will be a triple fault the first time that the
>>>> CPU attempts to access the GDT after LGDT loads the perc-pu GDT.
>>>>
>>>> This appears to be an old bug, but somehow the GDT fixmap rework
>>>> is triggering it.  This seems to have something to do with the
>>>> memory layout.
>>
>>> This breaks the boot on our Intel Quark platform (IOT2000, similar to
>>> Galileo Gen2). Reverting it over master makes it work again. Any idea
>>> what goes wrong? Let me know how I can help debugging this.
>>
>> JFYI: As of today linux-next when _kexec:ed_ works fine to me
>>
>> Perhaps I can test this later with direct boot from SD card.
>>
> 
> The most likely explanation is that there's some code that needs the
> page table synced and runs before setup_per_cpu_areas().  The relevant
> init code is:
> 
>     setup_arch(&command_line);
>     mm_init_cpumask(&init_mm);
>     setup_command_line(command_line);
>     setup_nr_cpu_ids();
>     setup_per_cpu_areas();
> 
> so I didn't move it very far.  It would be awesome if we could get a
> backtrace when the failure happens, but it's likely to be a triple
> fault.  Is this an EFI boot?  I bet the failure is in efi_init().

Yes, it's an EFI thing. Unfortunately, I didn't make
earlycon/earlyprintk work yet.

> 
> Could you try reverting just the deletions in the patch?  I.e. try a
> kernel with both the old and the new copies of the code I moved.

Let me try that later. I can also move the new code around to nail down
the dependency.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
@ 2017-05-08 14:45               ` Andy Shevchenko
  0 siblings, 0 replies; 32+ messages in thread
From: Andy Shevchenko @ 2017-05-08 14:45 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Andy Lutomirski, Ingo Molnar, x86, linux-efi, linux-kernel,
	Peter Zijlstra (Intel),
	Linus Torvalds, Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov,
	H. Peter Anvin, Matt Fleming, Thomas Gleixner, Brian Gerst,
	Thomas Garnier, Denys Vlasenko, Juergen Gross, Ard Biesheuvel

On Mon, May 8, 2017 at 3:34 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2017-05-08 13:21, Andy Lutomirski wrote:

> Yes, it's an EFI thing. Unfortunately, I didn't make
> earlycon/earlyprintk work yet.

Quark does use HS UART for the console and we have no support for such
in x86/earlyprintk. earlycon should work if you add proper parameters
to the command line (including port base address, no ttySx), though I
don't remember if it makes really "early".

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
@ 2017-05-08 14:45               ` Andy Shevchenko
  0 siblings, 0 replies; 32+ messages in thread
From: Andy Shevchenko @ 2017-05-08 14:45 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Andy Lutomirski, Ingo Molnar, x86, linux-efi,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Peter Zijlstra (Intel),
	Linus Torvalds, Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov,
	H. Peter Anvin, Matt Fleming, Thomas Gleixner, Brian Gerst,
	Thomas Garnier, Denys Vlasenko, Juergen Gross, Ard Biesheuvel

On Mon, May 8, 2017 at 3:34 PM, Jan Kiszka <jan.kiszka-kv7WeFo6aLtBDgjK7y7TUQ@public.gmane.org> wrote:
> On 2017-05-08 13:21, Andy Lutomirski wrote:

> Yes, it's an EFI thing. Unfortunately, I didn't make
> earlycon/earlyprintk work yet.

Quark does use HS UART for the console and we have no support for such
in x86/earlyprintk. earlycon should work if you add proper parameters
to the command line (including port base address, no ttySx), though I
don't remember if it makes really "early".

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
  2017-05-08 14:45               ` Andy Shevchenko
  (?)
@ 2017-05-08 15:24               ` Jan Kiszka
  -1 siblings, 0 replies; 32+ messages in thread
From: Jan Kiszka @ 2017-05-08 15:24 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Andy Lutomirski, Ingo Molnar, x86, linux-efi, linux-kernel,
	Peter Zijlstra (Intel),
	Linus Torvalds, Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov,
	H. Peter Anvin, Matt Fleming, Thomas Gleixner, Brian Gerst,
	Thomas Garnier, Denys Vlasenko, Juergen Gross, Ard Biesheuvel

On 2017-05-08 16:45, Andy Shevchenko wrote:
> On Mon, May 8, 2017 at 3:34 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> On 2017-05-08 13:21, Andy Lutomirski wrote:
> 
>> Yes, it's an EFI thing. Unfortunately, I didn't make
>> earlycon/earlyprintk work yet.
> 
> Quark does use HS UART for the console and we have no support for such
> in x86/earlyprintk. earlycon should work if you add proper parameters
> to the command line (including port base address, no ttySx), though I
> don't remember if it makes really "early".

Ah, that explains it. I've tried earlycon, also with the apparently
correct params, but it didn't print anything in this case. And
earlyprintk=efi seems to use a framebuffer, while our devices are
headless...

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
@ 2017-05-08 17:53               ` Jan Kiszka
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Kiszka @ 2017-05-08 17:53 UTC (permalink / raw)
  To: Andy Lutomirski, Andy Shevchenko
  Cc: Ingo Molnar, x86, linux-efi, linux-kernel, Peter Zijlstra (Intel),
	Linus Torvalds, Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov,
	H. Peter Anvin, Matt Fleming, Thomas Gleixner, Brian Gerst,
	Thomas Garnier, Denys Vlasenko, Juergen Gross, Ard Biesheuvel

On 2017-05-08 14:34, Jan Kiszka wrote:
> On 2017-05-08 13:21, Andy Lutomirski wrote:
>> On Mon, May 8, 2017 at 2:32 AM, Andy Shevchenko
>> <andy.shevchenko@gmail.com> wrote:
>>> On Mon, May 8, 2017 at 9:31 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>> On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
>>>>> The x86 smpboot trampoline expects initial_page_table to have the
>>>>> GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
>>>>> then it won't be in the page tables at all until perc-pu areas are
>>>>> set up.  The result will be a triple fault the first time that the
>>>>> CPU attempts to access the GDT after LGDT loads the perc-pu GDT.
>>>>>
>>>>> This appears to be an old bug, but somehow the GDT fixmap rework
>>>>> is triggering it.  This seems to have something to do with the
>>>>> memory layout.
>>>
>>>> This breaks the boot on our Intel Quark platform (IOT2000, similar to
>>>> Galileo Gen2). Reverting it over master makes it work again. Any idea
>>>> what goes wrong? Let me know how I can help debugging this.
>>>
>>> JFYI: As of today linux-next when _kexec:ed_ works fine to me
>>>
>>> Perhaps I can test this later with direct boot from SD card.
>>>
>>
>> The most likely explanation is that there's some code that needs the
>> page table synced and runs before setup_per_cpu_areas().  The relevant
>> init code is:
>>
>>     setup_arch(&command_line);
>>     mm_init_cpumask(&init_mm);
>>     setup_command_line(command_line);
>>     setup_nr_cpu_ids();
>>     setup_per_cpu_areas();
>>
>> so I didn't move it very far.  It would be awesome if we could get a
>> backtrace when the failure happens, but it's likely to be a triple
>> fault.  Is this an EFI boot?  I bet the failure is in efi_init().
> 
> Yes, it's an EFI thing. Unfortunately, I didn't make
> earlycon/earlyprintk work yet.
> 
>>
>> Could you try reverting just the deletions in the patch?  I.e. try a
>> kernel with both the old and the new copies of the code I moved.
> 
> Let me try that later. I can also move the new code around to nail down
> the dependency.
> 

I found the reason: your patch is very discriminating! Not the whole
world is multicore yet. ;)

setup_per_cpu_areas() is taken from mm/percpu.c in case of !CONFIG_SMP.
So the new home for the resync is not even built.

Any suggestions how to refactor things instead?

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
@ 2017-05-08 17:53               ` Jan Kiszka
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Kiszka @ 2017-05-08 17:53 UTC (permalink / raw)
  To: Andy Lutomirski, Andy Shevchenko
  Cc: Ingo Molnar, x86, linux-efi, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Peter Zijlstra (Intel),
	Linus Torvalds, Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov,
	H. Peter Anvin, Matt Fleming, Thomas Gleixner, Brian Gerst,
	Thomas Garnier, Denys Vlasenko, Juergen Gross, Ard Biesheuvel

On 2017-05-08 14:34, Jan Kiszka wrote:
> On 2017-05-08 13:21, Andy Lutomirski wrote:
>> On Mon, May 8, 2017 at 2:32 AM, Andy Shevchenko
>> <andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>> On Mon, May 8, 2017 at 9:31 AM, Jan Kiszka <jan.kiszka-kv7WeFo6aLtBDgjK7y7TUQ@public.gmane.org> wrote:
>>>> On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
>>>>> The x86 smpboot trampoline expects initial_page_table to have the
>>>>> GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
>>>>> then it won't be in the page tables at all until perc-pu areas are
>>>>> set up.  The result will be a triple fault the first time that the
>>>>> CPU attempts to access the GDT after LGDT loads the perc-pu GDT.
>>>>>
>>>>> This appears to be an old bug, but somehow the GDT fixmap rework
>>>>> is triggering it.  This seems to have something to do with the
>>>>> memory layout.
>>>
>>>> This breaks the boot on our Intel Quark platform (IOT2000, similar to
>>>> Galileo Gen2). Reverting it over master makes it work again. Any idea
>>>> what goes wrong? Let me know how I can help debugging this.
>>>
>>> JFYI: As of today linux-next when _kexec:ed_ works fine to me
>>>
>>> Perhaps I can test this later with direct boot from SD card.
>>>
>>
>> The most likely explanation is that there's some code that needs the
>> page table synced and runs before setup_per_cpu_areas().  The relevant
>> init code is:
>>
>>     setup_arch(&command_line);
>>     mm_init_cpumask(&init_mm);
>>     setup_command_line(command_line);
>>     setup_nr_cpu_ids();
>>     setup_per_cpu_areas();
>>
>> so I didn't move it very far.  It would be awesome if we could get a
>> backtrace when the failure happens, but it's likely to be a triple
>> fault.  Is this an EFI boot?  I bet the failure is in efi_init().
> 
> Yes, it's an EFI thing. Unfortunately, I didn't make
> earlycon/earlyprintk work yet.
> 
>>
>> Could you try reverting just the deletions in the patch?  I.e. try a
>> kernel with both the old and the new copies of the code I moved.
> 
> Let me try that later. I can also move the new code around to nail down
> the dependency.
> 

I found the reason: your patch is very discriminating! Not the whole
world is multicore yet. ;)

setup_per_cpu_areas() is taken from mm/percpu.c in case of !CONFIG_SMP.
So the new home for the resync is not even built.

Any suggestions how to refactor things instead?

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
@ 2017-05-09  0:03                 ` Andy Lutomirski
  0 siblings, 0 replies; 32+ messages in thread
From: Andy Lutomirski @ 2017-05-09  0:03 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Andy Lutomirski, Andy Shevchenko, Ingo Molnar, x86, linux-efi,
	linux-kernel, Peter Zijlstra (Intel),
	Linus Torvalds, Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov,
	H. Peter Anvin, Matt Fleming, Thomas Gleixner, Brian Gerst,
	Thomas Garnier, Denys Vlasenko, Juergen Gross, Ard Biesheuvel

On Mon, May 8, 2017 at 10:53 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2017-05-08 14:34, Jan Kiszka wrote:
>> On 2017-05-08 13:21, Andy Lutomirski wrote:
>>> On Mon, May 8, 2017 at 2:32 AM, Andy Shevchenko
>>> <andy.shevchenko@gmail.com> wrote:
>>>> On Mon, May 8, 2017 at 9:31 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>> On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
>>>>>> The x86 smpboot trampoline expects initial_page_table to have the
>>>>>> GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
>>>>>> then it won't be in the page tables at all until perc-pu areas are
>>>>>> set up.  The result will be a triple fault the first time that the
>>>>>> CPU attempts to access the GDT after LGDT loads the perc-pu GDT.
>>>>>>
>>>>>> This appears to be an old bug, but somehow the GDT fixmap rework
>>>>>> is triggering it.  This seems to have something to do with the
>>>>>> memory layout.
>>>>
>>>>> This breaks the boot on our Intel Quark platform (IOT2000, similar to
>>>>> Galileo Gen2). Reverting it over master makes it work again. Any idea
>>>>> what goes wrong? Let me know how I can help debugging this.
>>>>
>>>> JFYI: As of today linux-next when _kexec:ed_ works fine to me
>>>>
>>>> Perhaps I can test this later with direct boot from SD card.
>>>>
>>>
>>> The most likely explanation is that there's some code that needs the
>>> page table synced and runs before setup_per_cpu_areas().  The relevant
>>> init code is:
>>>
>>>     setup_arch(&command_line);
>>>     mm_init_cpumask(&init_mm);
>>>     setup_command_line(command_line);
>>>     setup_nr_cpu_ids();
>>>     setup_per_cpu_areas();
>>>
>>> so I didn't move it very far.  It would be awesome if we could get a
>>> backtrace when the failure happens, but it's likely to be a triple
>>> fault.  Is this an EFI boot?  I bet the failure is in efi_init().
>>
>> Yes, it's an EFI thing. Unfortunately, I didn't make
>> earlycon/earlyprintk work yet.
>>
>>>
>>> Could you try reverting just the deletions in the patch?  I.e. try a
>>> kernel with both the old and the new copies of the code I moved.
>>
>> Let me try that later. I can also move the new code around to nail down
>> the dependency.
>>
>
> I found the reason: your patch is very discriminating! Not the whole
> world is multicore yet. ;)
>
> setup_per_cpu_areas() is taken from mm/percpu.c in case of !CONFIG_SMP.
> So the new home for the resync is not even built.

D'oh!

>
> Any suggestions how to refactor things instead?

efi_init() seems okay, but it makes me nervous.  I think the partial
revert is the right fix.  Patch coming.

>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RDA ITP SES-DE
> Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
@ 2017-05-09  0:03                 ` Andy Lutomirski
  0 siblings, 0 replies; 32+ messages in thread
From: Andy Lutomirski @ 2017-05-09  0:03 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Andy Lutomirski, Andy Shevchenko, Ingo Molnar, x86, linux-efi,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Peter Zijlstra (Intel),
	Linus Torvalds, Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov,
	H. Peter Anvin, Matt Fleming, Thomas Gleixner, Brian Gerst,
	Thomas Garnier, Denys Vlasenko, Juergen Gross, Ard Biesheuvel

On Mon, May 8, 2017 at 10:53 AM, Jan Kiszka <jan.kiszka-kv7WeFo6aLtBDgjK7y7TUQ@public.gmane.org> wrote:
> On 2017-05-08 14:34, Jan Kiszka wrote:
>> On 2017-05-08 13:21, Andy Lutomirski wrote:
>>> On Mon, May 8, 2017 at 2:32 AM, Andy Shevchenko
>>> <andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>> On Mon, May 8, 2017 at 9:31 AM, Jan Kiszka <jan.kiszka-kv7WeFo6aLtBDgjK7y7TUQ@public.gmane.org> wrote:
>>>>> On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
>>>>>> The x86 smpboot trampoline expects initial_page_table to have the
>>>>>> GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
>>>>>> then it won't be in the page tables at all until perc-pu areas are
>>>>>> set up.  The result will be a triple fault the first time that the
>>>>>> CPU attempts to access the GDT after LGDT loads the perc-pu GDT.
>>>>>>
>>>>>> This appears to be an old bug, but somehow the GDT fixmap rework
>>>>>> is triggering it.  This seems to have something to do with the
>>>>>> memory layout.
>>>>
>>>>> This breaks the boot on our Intel Quark platform (IOT2000, similar to
>>>>> Galileo Gen2). Reverting it over master makes it work again. Any idea
>>>>> what goes wrong? Let me know how I can help debugging this.
>>>>
>>>> JFYI: As of today linux-next when _kexec:ed_ works fine to me
>>>>
>>>> Perhaps I can test this later with direct boot from SD card.
>>>>
>>>
>>> The most likely explanation is that there's some code that needs the
>>> page table synced and runs before setup_per_cpu_areas().  The relevant
>>> init code is:
>>>
>>>     setup_arch(&command_line);
>>>     mm_init_cpumask(&init_mm);
>>>     setup_command_line(command_line);
>>>     setup_nr_cpu_ids();
>>>     setup_per_cpu_areas();
>>>
>>> so I didn't move it very far.  It would be awesome if we could get a
>>> backtrace when the failure happens, but it's likely to be a triple
>>> fault.  Is this an EFI boot?  I bet the failure is in efi_init().
>>
>> Yes, it's an EFI thing. Unfortunately, I didn't make
>> earlycon/earlyprintk work yet.
>>
>>>
>>> Could you try reverting just the deletions in the patch?  I.e. try a
>>> kernel with both the old and the new copies of the code I moved.
>>
>> Let me try that later. I can also move the new code around to nail down
>> the dependency.
>>
>
> I found the reason: your patch is very discriminating! Not the whole
> world is multicore yet. ;)
>
> setup_per_cpu_areas() is taken from mm/percpu.c in case of !CONFIG_SMP.
> So the new home for the resync is not even built.

D'oh!

>
> Any suggestions how to refactor things instead?

efi_init() seems okay, but it makes me nervous.  I think the partial
revert is the right fix.  Patch coming.

>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RDA ITP SES-DE
> Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2017-05-09  0:04 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-22 21:32 [PATCH 0/7] Misc GDT fixes and a cleanup Andy Lutomirski
2017-03-22 21:32 ` [PATCH 1/7] selftests/x86/ldt_gdt_32: Work around a glibc sigaction bug Andy Lutomirski
2017-03-23  9:13   ` [tip:x86/mm] selftests/x86/ldt_gdt_32: Work around a glibc sigaction() bug tip-bot for Andy Lutomirski
2017-03-22 21:32 ` [PATCH 2/7] x86/gdt: Fix setup_fixmap_gdt() to use the correct PA Andy Lutomirski
2017-03-23  9:13   ` [tip:x86/mm] " tip-bot for Andy Lutomirski
2017-03-22 21:32 ` [PATCH 3/7] x86/efi/32: Fix EFI on systems where the percpu GDT is virtually mapped Andy Lutomirski
2017-03-22 21:32   ` Andy Lutomirski
2017-03-23  9:14   ` [tip:x86/mm] x86/efi/32: Fix EFI on systems where the per-cpu " tip-bot for Andy Lutomirski
2017-03-22 21:32 ` [PATCH 4/7] x86/boot/32: Defer resyncing initial_page_table until percpu is set up Andy Lutomirski
2017-03-22 21:32   ` Andy Lutomirski
2017-03-23  9:14   ` [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu " tip-bot for Andy Lutomirski
2017-05-08  6:31     ` Jan Kiszka
2017-05-08  6:31       ` Jan Kiszka
2017-05-08  9:32       ` Andy Shevchenko
2017-05-08  9:32         ` Andy Shevchenko
2017-05-08 11:21         ` Andy Lutomirski
2017-05-08 11:21           ` Andy Lutomirski
2017-05-08 12:34           ` Jan Kiszka
2017-05-08 14:45             ` Andy Shevchenko
2017-05-08 14:45               ` Andy Shevchenko
2017-05-08 15:24               ` Jan Kiszka
2017-05-08 17:53             ` Jan Kiszka
2017-05-08 17:53               ` Jan Kiszka
2017-05-09  0:03               ` Andy Lutomirski
2017-05-09  0:03                 ` Andy Lutomirski
2017-03-22 21:32 ` [PATCH 5/7] x86/gdt: Get rid of the get_*_gdt_*_vaddr() helpers Andy Lutomirski
2017-03-23  9:15   ` [tip:x86/mm] " tip-bot for Andy Lutomirski
2017-03-22 21:32 ` [PATCH 6/7] x86/xen/gdt: Use X86_FEATURE_XENPV instead of globals for the GDT fixup Andy Lutomirski
2017-03-23  9:15   ` [tip:x86/mm] " tip-bot for Andy Lutomirski
2017-03-22 21:32 ` [PATCH 7/7] x86/boot/32: Rewrite test_wp_bit() Andy Lutomirski
2017-03-23  7:31 ` [PATCH 0/7] Misc GDT fixes and a cleanup Ingo Molnar
2017-03-23 12:18 ` Boris Ostrovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.