linux-mm.kvack.org archive mirror
* [PATCHv3 0/5] x86: 5-level related changes into decompression code
@ 2017-12-04 12:40 Kirill A. Shutemov
  2017-12-04 12:40 ` [PATCHv3 1/5] x86/boot/compressed/64: Detect and handle 5-level paging at boot-time Kirill A. Shutemov
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2017-12-04 12:40 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
	Kirill A. Shutemov

Hi Ingo,

Here are updated changes that prepare the code for boot-time switching between
paging modes and handle booting in 5-level mode when the bootloader puts the
kernel image above 4G but hasn't enabled 5-level paging for us.

The first two patches can be backported to v4.14 to provide a sensible error
message when a CONFIG_X86_5LEVEL=y kernel is booted on hardware that doesn't
support the feature.

Please review and consider applying.

Kirill A. Shutemov (5):
  x86/boot/compressed/64: Detect and handle 5-level paging at boot-time
  x86/boot/compressed/64: Print error if 5-level paging is not supported
  x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c
  x86/boot/compressed/64: Introduce place_trampoline()
  x86/boot/compressed/64: Handle 5-level paging boot if kernel is above
    4G

 arch/x86/boot/compressed/Makefile                  |   3 +-
 arch/x86/boot/compressed/head_64.S                 | 108 +++++++++++++--------
 .../boot/compressed/{pagetable.c => kaslr_64.c}    |   0
 arch/x86/boot/compressed/misc.c                    |  16 +++
 arch/x86/boot/compressed/pgtable.h                 |  18 ++++
 arch/x86/boot/compressed/pgtable_64.c              |  61 ++++++++++++
 6 files changed, 166 insertions(+), 40 deletions(-)
 rename arch/x86/boot/compressed/{pagetable.c => kaslr_64.c} (100%)
 create mode 100644 arch/x86/boot/compressed/pgtable.h
 create mode 100644 arch/x86/boot/compressed/pgtable_64.c

-- 
2.15.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* [PATCHv3 1/5] x86/boot/compressed/64: Detect and handle 5-level paging at boot-time
  2017-12-04 12:40 [PATCHv3 0/5] x86: 5-level related changes into decompression code Kirill A. Shutemov
@ 2017-12-04 12:40 ` Kirill A. Shutemov
  2017-12-04 20:29   ` Thomas Gleixner
  2017-12-04 12:40 ` [PATCHv3 2/5] x86/boot/compressed/64: Print error if 5-level paging is not supported Kirill A. Shutemov
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2017-12-04 12:40 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
	Kirill A. Shutemov, stable

This patch prepares the decompression code for boot-time switching between
4- and 5-level paging.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>	[4.14+]
---
 arch/x86/boot/compressed/Makefile     |  1 +
 arch/x86/boot/compressed/head_64.S    | 16 ++++++++++++----
 arch/x86/boot/compressed/pgtable_64.c | 18 ++++++++++++++++++
 3 files changed, 31 insertions(+), 4 deletions(-)
 create mode 100644 arch/x86/boot/compressed/pgtable_64.c

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 1e9c322e973a..f25e1530e064 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -80,6 +80,7 @@ vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr.o
 ifdef CONFIG_X86_64
 	vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/pagetable.o
 	vmlinux-objs-y += $(obj)/mem_encrypt.o
+	vmlinux-objs-y += $(obj)/pgtable_64.o
 endif
 
 $(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 20919b4f3133..fc313e29fe2c 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -305,10 +305,18 @@ ENTRY(startup_64)
 	leaq	boot_stack_end(%rbx), %rsp
 
 #ifdef CONFIG_X86_5LEVEL
-	/* Check if 5-level paging has already enabled */
-	movq	%cr4, %rax
-	testl	$X86_CR4_LA57, %eax
-	jnz	lvl5
+	/*
+	 * Check if we need to enable 5-level paging.
+	 * RSI holds real mode data and needs to be preserved across
+	 * a function call.
+	 */
+	pushq	%rsi
+	call	l5_paging_required
+	popq	%rsi
+
+	/* If l5_paging_required() returned zero, we're done here. */
+	cmpq	$0, %rax
+	je	lvl5
 
 	/*
 	 * At this point we are in long mode with 4-level paging enabled,
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
new file mode 100644
index 000000000000..eed3a2c3b577
--- /dev/null
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -0,0 +1,18 @@
+#include <asm/processor.h>
+
+int l5_paging_required(void)
+{
+	/* Check i leaf 7 is supported. */
+	if (native_cpuid_eax(0) < 7)
+		return 0;
+
+	/* Check if la57 is supported. */
+	if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
+		return 0;
+
+	/* Check if 5-level paging has already been enabled. */
+	if (native_read_cr4() & X86_CR4_LA57)
+		return 0;
+
+	return 1;
+}
-- 
2.15.0

* [PATCHv3 2/5] x86/boot/compressed/64: Print error if 5-level paging is not supported
  2017-12-04 12:40 [PATCHv3 0/5] x86: 5-level related changes into decompression code Kirill A. Shutemov
  2017-12-04 12:40 ` [PATCHv3 1/5] x86/boot/compressed/64: Detect and handle 5-level paging at boot-time Kirill A. Shutemov
@ 2017-12-04 12:40 ` Kirill A. Shutemov
  2017-12-04 19:31   ` Borislav Petkov
  2017-12-04 12:40 ` [PATCHv3 3/5] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c Kirill A. Shutemov
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2017-12-04 12:40 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
	Kirill A. Shutemov, stable

We cannot proceed booting if the machine doesn't support the paging mode
the kernel was compiled for.

Getting an error the usual way -- via validate_cpu() -- is not going to
work. We need to enable the appropriate paging mode before that, otherwise
the kernel would triple-fault during KASLR setup.

This code will go away once we get support for boot-time switching
between paging modes.
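
For readers following along, the CPUID test behind this error can be
sketched in plain C. This is only an illustration of the check, not the code
added by the patch: la57_supported() is a hypothetical helper, and the caller
is assumed to have already read CPUID leaf 0 (EAX) and leaf 7, subleaf 0
(ECX). The X86_FEATURE_LA57 value follows the kernel's "word * 32 + bit"
feature encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Kernel feature numbers are word * 32 + bit; LA57 is word 16, bit 16. */
#define X86_FEATURE_LA57	(16 * 32 + 16)

/*
 * Hypothetical helper, not part of the patch.
 * max_basic_leaf: value of CPUID(0).EAX
 * leaf7_ecx:      value of CPUID(7, subleaf 0).ECX
 */
int la57_supported(uint32_t max_basic_leaf, uint32_t leaf7_ecx)
{
	/* Leaf 7 must exist before its output can be trusted. */
	if (max_basic_leaf < 7)
		return 0;

	/* "& 31" drops the word index and leaves the bit position (16). */
	return (leaf7_ecx & (1U << (X86_FEATURE_LA57 & 31))) != 0;
}
```

A CPU that does not report leaf 7 at all is treated as lacking LA57, which
is the case the error message in this patch is meant to catch.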

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>	[4.14+]
---
 arch/x86/boot/compressed/misc.c       | 16 ++++++++++++++++
 arch/x86/boot/compressed/pgtable_64.c |  2 +-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b50c42455e25..f7f8d9f76e15 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -169,6 +169,16 @@ void __puthex(unsigned long value)
 	}
 }
 
+static int l5_supported(void)
+{
+	/* Check if leaf 7 is supported. */
+	if (native_cpuid_eax(0) < 7)
+		return 0;
+
+	/* Check if la57 is supported. */
+	return native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31));
+}
+
 #if CONFIG_X86_NEED_RELOCS
 static void handle_relocations(void *output, unsigned long output_len,
 			       unsigned long virt_addr)
@@ -362,6 +372,12 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 	console_init();
 	debug_putstr("early console in extract_kernel\n");
 
+	if (IS_ENABLED(CONFIG_X86_5LEVEL) && !l5_supported()) {
+		error("This linux kernel as configured requires 5-level paging\n"
+			"This CPU does not support the required 'cr4.la57' feature\n"
+			"Unable to boot - please use a kernel appropriate for your CPU\n");
+	}
+
 	free_mem_ptr     = heap;	/* Heap */
 	free_mem_end_ptr = heap + BOOT_HEAP_SIZE;
 
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index eed3a2c3b577..7bcf03b376da 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -2,7 +2,7 @@
 
 int l5_paging_required(void)
 {
-	/* Check i leaf 7 is supported. */
+	/* Check if leaf 7 is supported. */
 	if (native_cpuid_eax(0) < 7)
 		return 0;
 
-- 
2.15.0

* [PATCHv3 3/5] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c
  2017-12-04 12:40 [PATCHv3 0/5] x86: 5-level related changes into decompression code Kirill A. Shutemov
  2017-12-04 12:40 ` [PATCHv3 1/5] x86/boot/compressed/64: Detect and handle 5-level paging at boot-time Kirill A. Shutemov
  2017-12-04 12:40 ` [PATCHv3 2/5] x86/boot/compressed/64: Print error if 5-level paging is not supported Kirill A. Shutemov
@ 2017-12-04 12:40 ` Kirill A. Shutemov
  2017-12-04 12:40 ` [PATCHv3 4/5] x86/boot/compressed/64: Introduce place_trampoline() Kirill A. Shutemov
  2017-12-04 12:40 ` [PATCHv3 5/5] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G Kirill A. Shutemov
  4 siblings, 0 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2017-12-04 12:40 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
	Kirill A. Shutemov

The name of the file -- pagetable.c -- is misleading: it only contains
helpers used for KASLR in 64-bit mode.

Let's rename the file to reflect its content.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/boot/compressed/Makefile                    | 2 +-
 arch/x86/boot/compressed/{pagetable.c => kaslr_64.c} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename arch/x86/boot/compressed/{pagetable.c => kaslr_64.c} (100%)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index f25e1530e064..1f734cd98fd3 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -78,7 +78,7 @@ vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
 vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
 vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr.o
 ifdef CONFIG_X86_64
-	vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/pagetable.o
+	vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr_64.o
 	vmlinux-objs-y += $(obj)/mem_encrypt.o
 	vmlinux-objs-y += $(obj)/pgtable_64.o
 endif
diff --git a/arch/x86/boot/compressed/pagetable.c b/arch/x86/boot/compressed/kaslr_64.c
similarity index 100%
rename from arch/x86/boot/compressed/pagetable.c
rename to arch/x86/boot/compressed/kaslr_64.c
-- 
2.15.0

* [PATCHv3 4/5] x86/boot/compressed/64: Introduce place_trampoline()
  2017-12-04 12:40 [PATCHv3 0/5] x86: 5-level related changes into decompression code Kirill A. Shutemov
                   ` (2 preceding siblings ...)
  2017-12-04 12:40 ` [PATCHv3 3/5] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c Kirill A. Shutemov
@ 2017-12-04 12:40 ` Kirill A. Shutemov
  2017-12-04 12:40 ` [PATCHv3 5/5] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G Kirill A. Shutemov
  4 siblings, 0 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2017-12-04 12:40 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
	Kirill A. Shutemov

If the bootloader enables 64-bit mode with 4-level paging, we might need to
switch over to 5-level paging. The switch requires disabling paging.
This works fine if the kernel itself is loaded below 4G.

If the bootloader puts the kernel above 4G (not sure if anybody does this),
we would lose control as soon as paging is disabled, because the code
becomes unreachable.

To handle the situation, we need a trampoline in lower memory that would
take care of switching on 5-level paging.

Apart from the trampoline itself, we also need a place to store the top-level
page table in lower memory, as we don't have a way to load a 64-bit value
into CR3 from 32-bit mode. We only really need 8 bytes there, as we only use
the very first entry of the page table, but we allocate a whole page
anyway. We cannot have the code in the same page, because there's a hazard
that a CPU would read the page table speculatively and get confused by
seeing garbage.
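
As a rough illustration of why a single 8-byte entry is enough, the sketch
below builds such a top-level table in ordinary C. It is not the patch code:
build_top_table() and top_index_5level() are hypothetical names, the 0x7
flags mirror the "leaq 0x7 (%rdi)" in head_64.S (the C code in this patch
uses _PAGE_TABLE_NOENC instead), and the table is a plain array rather than
real page-table memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PTRS_PER_TABLE	512
#define ENTRY_FLAGS	0x7ULL	/* present | writable | user (assumed, see lead-in) */

/* Index into the 5-level top table: virtual address bits 56:48. */
unsigned int top_index_5level(uint64_t vaddr)
{
	return (unsigned int)((vaddr >> 48) & (PTRS_PER_TABLE - 1));
}

/* Zero the new top-level table and point entry 0 at the old 4-level root. */
void build_top_table(uint64_t table[PTRS_PER_TABLE], uint64_t old_root_phys)
{
	memset(table, 0, PTRS_PER_TABLE * sizeof(table[0]));
	table[0] = old_root_phys | ENTRY_FLAGS;
}
```

Every address below 2^48 has top-level index 0, so the mappings that were
reachable through the old 4-level root stay reachable through entry 0 of the
new table.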

This patch introduces paging_prepare(), which checks if we need to enable
5-level paging, then finds the right spot in lower memory for the trampoline,
copies the trampoline code there and sets up a new top-level page table for
5-level paging.
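
The placement logic can be tried out in userspace with a small sketch like
the one below. It follows the arithmetic in paging_prepare(), but takes the
two BIOS data area words as parameters instead of reading physical addresses
0x40e and 0x413; pick_trampoline_start() is a hypothetical name used only
for this illustration, and PAGE_SIZE is assumed to be 4096.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE		4096UL
#define TRAMPOLINE_32BIT_SIZE	(2 * PAGE_SIZE)
#define BIOS_START_MIN		0x20000UL	/* 128K, less than this is insane */
#define BIOS_START_MAX		0x9f000UL	/* 640K, absolute maximum */

/*
 * ebda_segment: 16-bit word the BIOS stores at 0x40e (EBDA segment)
 * base_mem_kb:  16-bit word the BIOS stores at 0x413 (base memory in KiB)
 */
unsigned long pick_trampoline_start(uint16_t ebda_segment, uint16_t base_mem_kb)
{
	unsigned long ebda_start = (unsigned long)ebda_segment << 4;
	unsigned long bios_start = (unsigned long)base_mem_kb << 10;

	/* Fall back to the conservative maximum if the BIOS value looks bogus. */
	if (bios_start < BIOS_START_MIN || bios_start > BIOS_START_MAX)
		bios_start = BIOS_START_MAX;

	/* Stay below the EBDA if it sits lower than the end of base memory. */
	if (ebda_start > BIOS_START_MIN && ebda_start < bios_start)
		bios_start = ebda_start;

	/* Reserve two pages below that point, rounded down to a page boundary. */
	return (bios_start - TRAMPOLINE_32BIT_SIZE) & ~(PAGE_SIZE - 1);
}
```

With both BIOS words zeroed, the fallback applies and the trampoline lands
at 0x9d000, two pages below BIOS_START_MAX.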

At this point we do all the preparation, but do not yet use the trampoline.
That will be done in the following patch.
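
paging_prepare() returns two things in one register: the trampoline address
and, in bit 0, whether 5-level paging is required. A minimal sketch of that
encoding follows; the helper names are hypothetical, and it relies on the
trampoline address being page aligned so that bit 0 is always free.

```c
#include <assert.h>

/* Pack a page-aligned address and a one-bit flag into a single word. */
unsigned long encode_result(unsigned long trampoline_start, int l5_required)
{
	return trampoline_start | (l5_required ? 1UL : 0UL);
}

/* Recover the address: same effect as "andq $(~1UL), %rcx" in head_64.S. */
unsigned long decode_address(unsigned long result)
{
	return result & ~1UL;
}

/* Recover the flag: same effect as "testq $1, %rax" in head_64.S. */
int decode_l5_required(unsigned long result)
{
	return (int)(result & 1UL);
}
```

The round trip only works for aligned addresses; an odd address would be
indistinguishable from an even one with the flag set.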

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/boot/compressed/head_64.S    | 54 ++++++++++++++++-------------
 arch/x86/boot/compressed/pgtable.h    | 18 ++++++++++
 arch/x86/boot/compressed/pgtable_64.c | 65 +++++++++++++++++++++++++++++------
 3 files changed, 103 insertions(+), 34 deletions(-)
 create mode 100644 arch/x86/boot/compressed/pgtable.h

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index fc313e29fe2c..33a47d5c6445 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -304,33 +304,45 @@ ENTRY(startup_64)
 	/* Set up the stack */
 	leaq	boot_stack_end(%rbx), %rsp
 
-#ifdef CONFIG_X86_5LEVEL
-	/*
-	 * Check if we need to enable 5-level paging.
-	 * RSI holds real mode data and needs to be preserved across
-	 * a function call.
-	 */
-	pushq	%rsi
-	call	l5_paging_required
-	popq	%rsi
-
-	/* If l5_paging_required() returned zero, we're done here. */
-	cmpq	$0, %rax
-	je	lvl5
-
 	/*
 	 * At this point we are in long mode with 4-level paging enabled,
-	 * but we want to enable 5-level paging.
+	 * but we might want to enable 5-level paging.
 	 *
 	 * The problem is that we cannot do it directly. Setting LA57 in
 	 * long mode would trigger #GP. So we need to switch off long mode
 	 * first.
 	 *
-	 * NOTE: This is not going to work if bootloader put us above 4G
-	 * limit.
+	 * We also need a trampoline in lower memory to switch from 4- to
+	 * 5-level paging for cases when the bootloader puts the kernel above
+	 * 4G, but doesn't enable 5-level paging for us.
+	 *
+	 * For the trampoline, the top-level page table must be in lower memory
+	 * as we have no way to load a 64-bit value into CR3 in 32-bit mode.
+	 *
+	 * We go through the trampoline even if we don't have to: if we're
+	 * already in 5-level paging mode or if we don't need to switch to it.
+	 * This way the trampoline code gets tested not only in the special
+	 * rare case, but on every boot.
+	 */
+
+	/*
+	 * paging_prepare() sets up the trampoline and checks if we need to
+	 * enable 5-level paging.
+	 *
+	 * The address of the trampoline is returned in RAX. Bit 0 is used to
+	 * encode if we need to enable 5-level paging.
 	 *
-	 * The first step is go into compatibility mode.
+	 * RSI holds real mode data and needs to be preserved across
+	 * a function call.
 	 */
+	pushq	%rsi
+	call	paging_prepare
+	popq	%rsi
+	movq	%rax, %rcx
+	andq	$(~1UL), %rcx
+
+	testq	$1, %rax
+	jz	lvl5
 
 	/* Clear additional page table */
 	leaq	lvl5_pgtable(%rbx), %rdi
@@ -352,7 +364,6 @@ ENTRY(startup_64)
 	pushq	%rax
 	lretq
 lvl5:
-#endif
 
 	/* Zero EFLAGS */
 	pushq	$0
@@ -490,7 +501,7 @@ relocated:
 	jmp	*%rax
 
 	.code32
-#ifdef CONFIG_X86_5LEVEL
+ENTRY(trampoline_32bit_src)
 compatible_mode:
 	/* Setup data and stack segments */
 	movl	$__KERNEL_DS, %eax
@@ -526,7 +537,6 @@ compatible_mode:
 	movl	%eax, %cr0
 
 	lret
-#endif
 
 no_longmode:
 	/* This isn't an x86-64 CPU so hang */
@@ -585,7 +595,5 @@ boot_stack_end:
 	.balign 4096
 pgtable:
 	.fill BOOT_PGT_SIZE, 1, 0
-#ifdef CONFIG_X86_5LEVEL
 lvl5_pgtable:
 	.fill PAGE_SIZE, 1, 0
-#endif
diff --git a/arch/x86/boot/compressed/pgtable.h b/arch/x86/boot/compressed/pgtable.h
new file mode 100644
index 000000000000..0261d4ab62e6
--- /dev/null
+++ b/arch/x86/boot/compressed/pgtable.h
@@ -0,0 +1,18 @@
+#ifndef BOOT_COMPRESSED_PAGETABLE_H
+#define BOOT_COMPRESSED_PAGETABLE_H
+
+#define TRAMPOLINE_32BIT_SIZE		(2 * PAGE_SIZE)
+
+#define TRAMPOLINE_32BIT_PGTABLE_OFF	0
+
+#define TRAMPOLINE_32BIT_CODE_OFF	PAGE_SIZE
+#define TRAMPOLINE_32BIT_CODE_SIZE	0x50
+
+#define TRAMPOLINE_32BIT_STACK_END	TRAMPOLINE_32BIT_SIZE
+
+#ifndef __ASSEMBLER__
+
+extern void (*trampoline_32bit_src)(void *return_ptr);
+
+#endif /* __ASSEMBLER__ */
+#endif /* BOOT_COMPRESSED_PAGETABLE_H */
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index 7bcf03b376da..9c11f4c26d35 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -1,18 +1,61 @@
 #include <asm/processor.h>
+#include "pgtable.h"
+#include "../string.h"
 
-int l5_paging_required(void)
+#define BIOS_START_MIN		0x20000U	/* 128K, less than this is insane */
+#define BIOS_START_MAX		0x9f000U	/* 640K, absolute maximum */
+
+unsigned long paging_prepare(void)
 {
-	/* Check if leaf 7 is supported. */
-	if (native_cpuid_eax(0) < 7)
-		return 0;
+	unsigned long bios_start, ebda_start, trampoline_start, *trampoline;
+	int l5_required = 0;
+
+	/* Check if la57 is desired and supported */
+	if (IS_ENABLED(CONFIG_X86_5LEVEL) && native_cpuid_eax(0) >= 7 &&
+			(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
+		l5_required = 1;
+
+	/*
+	 * Find suitable spot for trampoline.
+	 * Based on reserve_bios_regions().
+	 */
+
+	ebda_start = *(unsigned short *)0x40e << 4;
+	bios_start = *(unsigned short *)0x413 << 10;
+
+	if (bios_start < BIOS_START_MIN || bios_start > BIOS_START_MAX)
+		bios_start = BIOS_START_MAX;
+
+	if (ebda_start > BIOS_START_MIN && ebda_start < bios_start)
+		bios_start = ebda_start;
+
+	/* Place trampoline below end of low memory, aligned to 4k */
+	trampoline_start = bios_start - TRAMPOLINE_32BIT_SIZE;
+	trampoline_start = round_down(trampoline_start, PAGE_SIZE);
+
+	trampoline = (unsigned long *)trampoline_start;
+
+	/* Clear trampoline memory first */
+	memset(trampoline, 0, TRAMPOLINE_32BIT_SIZE);
 
-	/* Check if la57 is supported. */
-	if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
-		return 0;
+	/* Copy trampoline code in place */
+	memcpy(trampoline + TRAMPOLINE_32BIT_CODE_OFF / sizeof(unsigned long),
+			&trampoline_32bit_src, TRAMPOLINE_32BIT_CODE_SIZE);
 
-	/* Check if 5-level paging has already been enabled. */
-	if (native_read_cr4() & X86_CR4_LA57)
-		return 0;
+	if (l5_required) {
+		/*
+		 * For 5-level paging, set up the current CR3 as the first and
+		 * only entry in a new top-level page table.
+		 */
+		trampoline[0] = __native_read_cr3() + _PAGE_TABLE_NOENC;
+	} else {
+		/*
+		 * For 4-level paging, copy current top-level page table.
+		 * It might be above 4G and be inaccessible from 32-bit mode.
+		 */
+		memcpy(trampoline, (void *)__native_read_cr3(), PAGE_SIZE);
+	}
 
-	return 1;
+	/* Bit 0 is used to encode if 5-level paging is required */
+	return trampoline_start | l5_required;
 }
-- 
2.15.0

* [PATCHv3 5/5] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G
  2017-12-04 12:40 [PATCHv3 0/5] x86: 5-level related changes into decompression code Kirill A. Shutemov
                   ` (3 preceding siblings ...)
  2017-12-04 12:40 ` [PATCHv3 4/5] x86/boot/compressed/64: Introduce place_trampoline() Kirill A. Shutemov
@ 2017-12-04 12:40 ` Kirill A. Shutemov
  4 siblings, 0 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2017-12-04 12:40 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
	Kirill A. Shutemov

This patch addresses a shortcoming in the current boot process on machines
that support 5-level paging.

If the bootloader enables 64-bit mode with 4-level paging, we need to
switch over to 5-level paging. The switch requires disabling paging.
This works fine if the kernel itself is loaded below 4G.

If the bootloader puts the kernel above 4G (not sure if anybody does this),
we would lose control as soon as paging is disabled, because the code
becomes unreachable.
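
The "unreachable" condition can be stated as a simple range check. The
sketch below is only an illustration with a hypothetical helper name: with
paging disabled the CPU runs in 32-bit mode, so it is assumed here that only
the first 4G of physical memory can be addressed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if the region [start, start + size) lies entirely below 4G,
 * i.e. it stays addressable once paging is switched off.
 */
int reachable_from_32bit(uint64_t start, uint64_t size)
{
	const uint64_t limit = 1ULL << 32;

	/* Written this way to avoid overflow in start + size. */
	return size <= limit && start <= limit - size;
}
```

Code for which this check fails is exactly what the low-memory trampoline is
meant to cover: the switch runs from a copy below 4G and then jumps back to
the 64-bit return address.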

This patch implements a trampoline in lower memory to handle this
situation.

We only need the memory for a very short time, until the main kernel image
sets up its own page tables.

We go through the trampoline even if we don't have to: if we're already in
5-level paging mode or if we don't need to switch to it. This way the
trampoline code gets tested not only in the special rare case, but on every
boot.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/boot/compressed/head_64.S | 72 +++++++++++++++++++++++---------------
 1 file changed, 43 insertions(+), 29 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 33a47d5c6445..525972ca27b7 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -33,6 +33,7 @@
 #include <asm/processor-flags.h>
 #include <asm/asm-offsets.h>
 #include <asm/bootparam.h>
+#include "pgtable.h"
 
 /*
  * Locally defined symbols should be marked hidden:
@@ -339,31 +340,22 @@ ENTRY(startup_64)
 	call	paging_prepare
 	popq	%rsi
 	movq	%rax, %rcx
-	andq	$(~1UL), %rcx
-
-	testq	$1, %rax
-	jz	lvl5
-
-	/* Clear additional page table */
-	leaq	lvl5_pgtable(%rbx), %rdi
-	xorq	%rax, %rax
-	movq	$(PAGE_SIZE/8), %rcx
-	rep	stosq
 
 	/*
-	 * Setup current CR3 as the first and only entry in a new top level
-	 * page table.
+	 * Load address of trampoline_return into RDI.
+	 * It will be used by trampoline to return to main code.
 	 */
-	movq	%cr3, %rdi
-	leaq	0x7 (%rdi), %rax
-	movq	%rax, lvl5_pgtable(%rbx)
+	leaq	trampoline_return(%rip), %rdi
 
 	/* Switch to compatibility mode (CS.L = 0 CS.D = 1) via far return */
 	pushq	$__KERNEL32_CS
-	leaq	compatible_mode(%rip), %rax
+	andq	$(~1UL), %rax /* Clear bit 0: it encodes if 5-level paging is needed */
+	leaq	TRAMPOLINE_32BIT_CODE_OFF(%rax), %rax
 	pushq	%rax
 	lretq
-lvl5:
+trampoline_return:
+	/* Restore stack, 32-bit trampoline uses own stack */
+	leaq	boot_stack_end(%rbx), %rsp
 
 	/* Zero EFLAGS */
 	pushq	$0
@@ -501,36 +493,51 @@ relocated:
 	jmp	*%rax
 
 	.code32
+/*
+ * This is the 32-bit trampoline that will be copied over to low memory.
+ *
+ * RDI contains return address (might be above 4G).
+ * ECX contains the base address of trampoline memory.
+ * Bit 0 of ECX encodes if 5-level paging is required.
+ */
 ENTRY(trampoline_32bit_src)
-compatible_mode:
 	/* Setup data and stack segments */
 	movl	$__KERNEL_DS, %eax
 	movl	%eax, %ds
 	movl	%eax, %ss
 
+	movl	%ecx, %edx
+	andl	$(~1UL), %edx
+
+	/* Setup new stack at the end of trampoline memory */
+	leal	TRAMPOLINE_32BIT_STACK_END (%edx), %esp
+
 	/* Disable paging */
 	movl	%cr0, %eax
 	btrl	$X86_CR0_PG_BIT, %eax
 	movl	%eax, %cr0
 
-	/* Point CR3 to 5-level paging */
-	leal	lvl5_pgtable(%ebx), %eax
+	/* Point CR3 to trampoline top level page table */
+	leal	TRAMPOLINE_32BIT_PGTABLE_OFF (%edx), %eax
 	movl	%eax, %cr3
 
 	/* Enable PAE and LA57 mode */
 	movl	%cr4, %eax
-	orl	$(X86_CR4_PAE | X86_CR4_LA57), %eax
+	orl	$X86_CR4_PAE, %eax
+
+	/* Bit 0 of ECX encodes if 5-level paging is required */
+	testl	$1, %ecx
+	jz	1f
+	orl	$X86_CR4_LA57, %eax
+1:
 	movl	%eax, %cr4
 
-	/* Calculate address we are running at */
-	call	1f
-1:	popl	%edi
-	subl	$1b, %edi
+	/* Calculate address of paging_enabled once we are in trampoline */
+	leal	paging_enabled - trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_OFF (%edx), %eax
 
 	/* Prepare stack for far return to Long Mode */
 	pushl	$__KERNEL_CS
-	leal	lvl5(%edi), %eax
-	push	%eax
+	pushl	%eax
 
 	/* Enable paging back */
 	movl	$(X86_CR0_PG | X86_CR0_PE), %eax
@@ -538,6 +545,15 @@ compatible_mode:
 
 	lret
 
+	.code64
+paging_enabled:
+	/* Return from trampoline */
+	jmp	*%rdi
+
+	/* Bound size of trampoline code */
+	.org	trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_SIZE
+
+	.code32
 no_longmode:
 	/* This isn't an x86-64 CPU so hang */
 1:
@@ -595,5 +611,3 @@ boot_stack_end:
 	.balign 4096
 pgtable:
 	.fill BOOT_PGT_SIZE, 1, 0
-lvl5_pgtable:
-	.fill PAGE_SIZE, 1, 0
-- 
2.15.0

* Re: [PATCHv3 2/5] x86/boot/compressed/64: Print error if 5-level paging is not supported
  2017-12-04 12:40 ` [PATCHv3 2/5] x86/boot/compressed/64: Print error if 5-level paging is not supported Kirill A. Shutemov
@ 2017-12-04 19:31   ` Borislav Petkov
  0 siblings, 0 replies; 9+ messages in thread
From: Borislav Petkov @ 2017-12-04 19:31 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin,
	Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov, Andi Kleen,
	linux-mm, linux-kernel, stable

On Mon, Dec 04, 2017 at 03:40:56PM +0300, Kirill A. Shutemov wrote:
> We cannot proceed booting if the machine doesn't support the paging mode
> the kernel was compiled for.
> 
> Getting an error the usual way -- via validate_cpu() -- is not going to
> work. We need to enable the appropriate paging mode before that, otherwise
> the kernel would triple-fault during KASLR setup.
> 
> This code will go away once we get support for boot-time switching
> between paging modes.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: <stable@vger.kernel.org>	[4.14+]
> ---
>  arch/x86/boot/compressed/misc.c       | 16 ++++++++++++++++
>  arch/x86/boot/compressed/pgtable_64.c |  2 +-
>  2 files changed, 17 insertions(+), 1 deletion(-)

Reported-and-tested-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

* Re: [PATCHv3 1/5] x86/boot/compressed/64: Detect and handle 5-level paging at boot-time
  2017-12-04 12:40 ` [PATCHv3 1/5] x86/boot/compressed/64: Detect and handle 5-level paging at boot-time Kirill A. Shutemov
@ 2017-12-04 20:29   ` Thomas Gleixner
  2017-12-04 21:15     ` Kirill A. Shutemov
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2017-12-04 20:29 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Ingo Molnar, x86, H. Peter Anvin, Linus Torvalds,
	Andy Lutomirski, Cyrill Gorcunov, Borislav Petkov, Andi Kleen,
	linux-mm, linux-kernel, stable

On Mon, 4 Dec 2017, Kirill A. Shutemov wrote:

> This patch prepares the decompression code for boot-time switching between
> 4- and 5-level paging.

This is the very wrong reason for tagging this commit stable.

> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: <stable@vger.kernel.org>	[4.14+]

Adding cc stable requires a Fixes tag as well.

> +int l5_paging_required(void)
> +{
> +	/* Check i leaf 7 is supported. */

So you introduce the typo here and then you fix it in the next patch, which
is the actual bug fix, as a completely unrelated hunk.

--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -2,7 +2,7 @@
 
 int l5_paging_required(void)
 {
-       /* Check i leaf 7 is supported. */
+       /* Check if leaf 7 is supported. */

That's just careless and sloppy.

I fixed it up once more along with the lousy changelogs because this crap,
which you did not even think about addressing when shoving your 5-level
support into 4.14, needs to be fixed.

I'm really tired of your sloppiness. You waste everyone's time just by
ignoring feedback and continuing to do what you think is enough. "Works for
me" is _NOT_ enough for kernel development.

I'm not even looking at the rest of the series unless someone else has the
stomach to do so and sends a Reviewed-by.

Alternatively you can sit down and look at the changelogs and the code and
figure out whether it matches what I told you over and over. Once you think
it does, then please feel free to resend it, but be sure that I'm going to
apply the most restrictive crap filter on anything which comes from you
from now on.

Thanks,

	tglx

* Re: [PATCHv3 1/5] x86/boot/compressed/64: Detect and handle 5-level paging at boot-time
  2017-12-04 20:29   ` Thomas Gleixner
@ 2017-12-04 21:15     ` Kirill A. Shutemov
  0 siblings, 0 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2017-12-04 21:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kirill A. Shutemov, Ingo Molnar, x86, H. Peter Anvin,
	Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel, stable

On Mon, Dec 04, 2017 at 09:29:45PM +0100, Thomas Gleixner wrote:
> On Mon, 4 Dec 2017, Kirill A. Shutemov wrote:
> 
> > This patch prepares the decompression code for boot-time switching between
> > 4- and 5-level paging.
> 
> This is the very wrong reason for tagging this commit stable.
> 
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Cc: <stable@vger.kernel.org>	[4.14+]
> 
> Adding cc stable requires a Fixes tag as well.
> 
> > +int l5_paging_required(void)
> > +{
> > +	/* Check i leaf 7 is supported. */
> 
> So you introduce the typo here and then you fix it in the next patch, which
> is the actual bug fix, as a completely unrelated hunk.
> 
> --- a/arch/x86/boot/compressed/pgtable_64.c
> +++ b/arch/x86/boot/compressed/pgtable_64.c
> @@ -2,7 +2,7 @@
>  
>  int l5_paging_required(void)
>  {
> -       /* Check i leaf 7 is supported. */
> +       /* Check if leaf 7 is supported. */
> 
> That's just careless and sloppy.
> 
> I fixed it up once more along with the lousy changelogs because this crap,
> which you did not even think about addressing when shoving your 5-level
> support into 4.14, needs to be fixed.
> 
> I'm really tired of your sloppiness. You waste everyone's time just by
> ignoring feedback and continuing to do what you think is enough. "Works for
> me" is _NOT_ enough for kernel development.

Sorry. I screwed it up.

I'll do my best to not waste your time again.

> I'm not even looking at the rest of the series unless someone else has the
> stomach to do so and sends a Reviewed-by.
> 
> Alternatively you can sit down and look at the changelogs and the code and
> figure out whether it matches what I told you over and over. Once you think
> it does, then please feel free to resend it, but be sure that I'm going to
> apply the most restrictive crap filter on anything which comes from you
> from now on.

Fair enough. I'll recheck everything in the morning and send them again.

Thanks,
  and sorry again for wasting your time.

-- 
 Kirill A. Shutemov
