* [PATCHv2 0/4] x86: 5-level related changes into decompression code
@ 2017-11-10 22:06 ` Kirill A. Shutemov
  0 siblings, 0 replies; 46+ messages in thread
From: Kirill A. Shutemov @ 2017-11-10 22:06 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
	Kirill A. Shutemov

Hi Ingo,

Here are updated changes that prepare the code for boot-time switching
between paging modes and handle booting in 5-level mode when the bootloader
puts the kernel image above 4G but hasn't enabled 5-level paging for us.

I've updated the patches based on your feedback.

Please review and consider applying.

Kirill A. Shutemov (4):
  x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c
  x86/boot/compressed/64: Detect and handle 5-level paging at boot-time
  x86/boot/compressed/64: Introduce place_trampoline()
  x86/boot/compressed/64: Handle 5-level paging boot if kernel is above
    4G

 arch/x86/boot/compressed/Makefile                  |   3 +-
 arch/x86/boot/compressed/head_64.S                 | 108 +++++++++++++--------
 .../boot/compressed/{pagetable.c => kaslr_64.c}    |   0
 arch/x86/boot/compressed/pgtable.h                 |  18 ++++
 arch/x86/boot/compressed/pgtable_64.c              |  61 ++++++++++++
 5 files changed, 150 insertions(+), 40 deletions(-)
 rename arch/x86/boot/compressed/{pagetable.c => kaslr_64.c} (100%)
 create mode 100644 arch/x86/boot/compressed/pgtable.h
 create mode 100644 arch/x86/boot/compressed/pgtable_64.c

-- 
2.14.2

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCHv2 1/4] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c
  2017-11-10 22:06 ` Kirill A. Shutemov
@ 2017-11-10 22:06   ` Kirill A. Shutemov
  -1 siblings, 0 replies; 46+ messages in thread
From: Kirill A. Shutemov @ 2017-11-10 22:06 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
	Kirill A. Shutemov

The name of the file -- pagetable.c -- is misleading: it only contains
helpers used for KASLR in 64-bit mode.

Let's rename the file to reflect its content.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/boot/compressed/Makefile                    | 2 +-
 arch/x86/boot/compressed/{pagetable.c => kaslr_64.c} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename arch/x86/boot/compressed/{pagetable.c => kaslr_64.c} (100%)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 1e9c322e973a..ae0be0b923e1 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -78,7 +78,7 @@ vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
 vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
 vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr.o
 ifdef CONFIG_X86_64
-	vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/pagetable.o
+	vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr_64.o
 	vmlinux-objs-y += $(obj)/mem_encrypt.o
 endif
 
diff --git a/arch/x86/boot/compressed/pagetable.c b/arch/x86/boot/compressed/kaslr_64.c
similarity index 100%
rename from arch/x86/boot/compressed/pagetable.c
rename to arch/x86/boot/compressed/kaslr_64.c
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCHv2 2/4] x86/boot/compressed/64: Detect and handle 5-level paging at boot-time
  2017-11-10 22:06 ` Kirill A. Shutemov
@ 2017-11-10 22:06   ` Kirill A. Shutemov
  -1 siblings, 0 replies; 46+ messages in thread
From: Kirill A. Shutemov @ 2017-11-10 22:06 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
	Kirill A. Shutemov

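This patch prepares the decompression code for boot-time switching between
4- and 5-level paging. The check moves from assembly into a C helper,
l5_paging_required(), which also verifies via CPUID that the CPU supports
LA57 before deciding that a switch is needed.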

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
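For readers following the logic outside the boot environment, here is a
minimal user-space sketch of the decision l5_paging_required() makes. It is
not the kernel code: the function name l5_switch_needed() and its parameters
are hypothetical, and the CPUID and CR4 values are passed in by the caller
because CR4 cannot be read from user space. The bit positions (CPUID leaf 7
ECX bit 16, CR4 bit 12) are the architectural LA57 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural LA57 bits: CPUID.(EAX=7,ECX=0):ECX[16] and CR4[12]. */
#define LA57_CPUID_ECX_BIT	16
#define CR4_LA57_BIT		12

/*
 * Hypothetical pure-function model of l5_paging_required(): the caller
 * supplies the values the boot code obtains from CPUID and CR4.
 */
bool l5_switch_needed(uint32_t max_cpuid_leaf, uint32_t leaf7_ecx,
		      uint64_t cr4)
{
	/* Leaf 7 must exist before its ECX output means anything. */
	if (max_cpuid_leaf < 7)
		return false;

	/* The CPU must advertise LA57. */
	if (!(leaf7_ecx & (UINT32_C(1) << LA57_CPUID_ECX_BIT)))
		return false;

	/* Nothing to do if 5-level paging is already enabled. */
	if (cr4 & (UINT64_C(1) << CR4_LA57_BIT))
		return false;

	return true;
}
```

The order of the checks matters only for the first one: querying leaf 7 on
a CPU whose maximum basic leaf is lower would return unrelated data.
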
 arch/x86/boot/compressed/Makefile     |  1 +
 arch/x86/boot/compressed/head_64.S    | 16 ++++++++++++----
 arch/x86/boot/compressed/pgtable_64.c | 18 ++++++++++++++++++
 3 files changed, 31 insertions(+), 4 deletions(-)
 create mode 100644 arch/x86/boot/compressed/pgtable_64.c

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index ae0be0b923e1..1f734cd98fd3 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -80,6 +80,7 @@ vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr.o
 ifdef CONFIG_X86_64
 	vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr_64.o
 	vmlinux-objs-y += $(obj)/mem_encrypt.o
+	vmlinux-objs-y += $(obj)/pgtable_64.o
 endif
 
 $(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 20919b4f3133..fc313e29fe2c 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -305,10 +305,18 @@ ENTRY(startup_64)
 	leaq	boot_stack_end(%rbx), %rsp
 
 #ifdef CONFIG_X86_5LEVEL
-	/* Check if 5-level paging has already enabled */
-	movq	%cr4, %rax
-	testl	$X86_CR4_LA57, %eax
-	jnz	lvl5
+	/*
+	 * Check if we need to enable 5-level paging.
+	 * RSI holds real mode data and need to be preserved across
+	 * a function call.
+	 */
+	pushq	%rsi
+	call	l5_paging_required
+	popq	%rsi
+
+	/* If l5_paging_required() returned zero, we're done here. */
+	cmpq	$0, %rax
+	je	lvl5
 
 	/*
 	 * At this point we are in long mode with 4-level paging enabled,
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
new file mode 100644
index 000000000000..eed3a2c3b577
--- /dev/null
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -0,0 +1,18 @@
+#include <asm/processor.h>
+
+int l5_paging_required(void)
+{
+	/* Check i leaf 7 is supported. */
+	if (native_cpuid_eax(0) < 7)
+		return 0;
+
+	/* Check if la57 is supported. */
+	if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
+		return 0;
+
+	/* Check if 5-level paging has already been enabled. */
+	if (native_read_cr4() & X86_CR4_LA57)
+		return 0;
+
+	return 1;
+}
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCHv2 3/4] x86/boot/compressed/64: Introduce place_trampoline()
  2017-11-10 22:06 ` Kirill A. Shutemov
@ 2017-11-10 22:06   ` Kirill A. Shutemov
  -1 siblings, 0 replies; 46+ messages in thread
From: Kirill A. Shutemov @ 2017-11-10 22:06 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
	Kirill A. Shutemov

If the bootloader enables 64-bit mode with 4-level paging, we might need to
switch over to 5-level paging. The switch requires disabling paging, which
works fine if the kernel itself is loaded below 4G.

If the bootloader puts the kernel above 4G (not sure if anybody does this),
we would lose control as soon as paging is disabled, because the code
becomes unreachable.

To handle this situation, we need a trampoline in lower memory that takes
care of switching on 5-level paging.

Apart from the trampoline itself, we also need a place in lower memory to
store the top-level page table, as we don't have a way to load a 64-bit
value into CR3 from 32-bit mode. We only really need 8 bytes there, as we
only use the very first entry of the page table, but we allocate a whole
page anyway. We cannot put the code in the same page, because there's a
hazard that a CPU would read the page table speculatively and get confused
by seeing garbage.

This patch introduces paging_prepare(), which checks if we need to enable
5-level paging, finds the right spot in lower memory for the trampoline,
copies the trampoline code there, and sets up a new top-level page table
for 5-level paging.

At this point we do all the preparation, but do not use the trampoline
yet. That will be done in the following patch.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
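The placement arithmetic in paging_prepare() can be tried out in user space
with the sketch below. It is an illustration under stated assumptions, not
the kernel code: pick_trampoline_start() and SKETCH_PAGE_SIZE are
hypothetical names, and the two BIOS data area words (EBDA segment at 0x40e,
base memory size in KiB at 0x413) are passed in as arguments instead of
being read from low memory.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096UL	/* stand-in for the kernel's PAGE_SIZE */
#define TRAMPOLINE_32BIT_SIZE	(2 * SKETCH_PAGE_SIZE)
#define BIOS_START_MIN		0x20000UL	/* 128K, less than this is insane */
#define BIOS_START_MAX		0x9f000UL	/* 640K, absolute maximum */

/*
 * Hypothetical model of the address selection: clamp the end of usable
 * low memory, then reserve two pages below it, aligned down to 4k.
 */
unsigned long pick_trampoline_start(uint16_t ebda_segment, uint16_t base_mem_kb)
{
	unsigned long ebda_start = (unsigned long)ebda_segment << 4;
	unsigned long bios_start = (unsigned long)base_mem_kb << 10;

	/* Distrust implausible BIOS values and fall back to the maximum. */
	if (bios_start < BIOS_START_MIN || bios_start > BIOS_START_MAX)
		bios_start = BIOS_START_MAX;

	/* Stay below the EBDA if it sits under the chosen limit. */
	if (ebda_start > BIOS_START_MIN && ebda_start < bios_start)
		bios_start = ebda_start;

	return (bios_start - TRAMPOLINE_32BIT_SIZE) & ~(SKETCH_PAGE_SIZE - 1);
}
```

For example, with an EBDA segment of 0x9fc0 and 639K of base memory, both
values exceed BIOS_START_MAX, so the limit becomes 0x9f000 and the
trampoline lands at 0x9d000: the page table page at 0x9d000, the code page
at 0x9e000.
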
 arch/x86/boot/compressed/head_64.S    | 54 ++++++++++++++++-------------
 arch/x86/boot/compressed/pgtable.h    | 18 ++++++++++
 arch/x86/boot/compressed/pgtable_64.c | 65 +++++++++++++++++++++++++++++------
 3 files changed, 103 insertions(+), 34 deletions(-)
 create mode 100644 arch/x86/boot/compressed/pgtable.h

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index fc313e29fe2c..33a47d5c6445 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -304,33 +304,45 @@ ENTRY(startup_64)
 	/* Set up the stack */
 	leaq	boot_stack_end(%rbx), %rsp
 
-#ifdef CONFIG_X86_5LEVEL
-	/*
-	 * Check if we need to enable 5-level paging.
-	 * RSI holds real mode data and need to be preserved across
-	 * a function call.
-	 */
-	pushq	%rsi
-	call	l5_paging_required
-	popq	%rsi
-
-	/* If l5_paging_required() returned zero, we're done here. */
-	cmpq	$0, %rax
-	je	lvl5
-
 	/*
 	 * At this point we are in long mode with 4-level paging enabled,
-	 * but we want to enable 5-level paging.
+	 * but we might want to enable 5-level paging.
 	 *
 	 * The problem is that we cannot do it directly. Setting LA57 in
 	 * long mode would trigger #GP. So we need to switch off long mode
 	 * first.
 	 *
-	 * NOTE: This is not going to work if bootloader put us above 4G
-	 * limit.
+	 * We also need trampoline in lower memory to switch from 4- to 5-level
+	 * paging for cases when bootloader put kernel above 4G, but didn't
+	 * enable 5-level paging for us.
+	 *
+	 * For trampoline, we have to have top page table in lower memory as we
+	 * don't have a way to load 64-bit value into CR3 from 32-bit mode.
+	 *
+	 * We go through trampoline even if we don't have to: if we're already
+	 * in 5-level paging mode or if we don't need to switch to it. This way
+	 * the trampoline code gets tested not only in special rare case, but
+	 * on every boot.
+	 */
+
+	/*
+	 * paging_prepare() would setup trampoline and check if we need to
+	 * enable 5-level paging.
+	 *
+	 * Address of trampoline is returned in RAX. Bit 0 is used to
+	 * encode if we need to enable 5-level paging.
 	 *
-	 * The first step is go into compatibility mode.
+	 * RSI holds real mode data and need to be preserved across
+	 * a function call.
 	 */
+	pushq	%rsi
+	call	paging_prepare
+	popq	%rsi
+	movq	%rax, %rcx
+	andq	$(~1UL), %rcx
+
+	testq	$1, %rax
+	jz	lvl5
 
 	/* Clear additional page table */
 	leaq	lvl5_pgtable(%rbx), %rdi
@@ -352,7 +364,6 @@ ENTRY(startup_64)
 	pushq	%rax
 	lretq
 lvl5:
-#endif
 
 	/* Zero EFLAGS */
 	pushq	$0
@@ -490,7 +501,7 @@ relocated:
 	jmp	*%rax
 
 	.code32
-#ifdef CONFIG_X86_5LEVEL
+ENTRY(trampoline_32bit_src)
 compatible_mode:
 	/* Setup data and stack segments */
 	movl	$__KERNEL_DS, %eax
@@ -526,7 +537,6 @@ compatible_mode:
 	movl	%eax, %cr0
 
 	lret
-#endif
 
 no_longmode:
 	/* This isn't an x86-64 CPU so hang */
@@ -585,7 +595,5 @@ boot_stack_end:
 	.balign 4096
 pgtable:
 	.fill BOOT_PGT_SIZE, 1, 0
-#ifdef CONFIG_X86_5LEVEL
 lvl5_pgtable:
 	.fill PAGE_SIZE, 1, 0
-#endif
diff --git a/arch/x86/boot/compressed/pgtable.h b/arch/x86/boot/compressed/pgtable.h
new file mode 100644
index 000000000000..0261d4ab62e6
--- /dev/null
+++ b/arch/x86/boot/compressed/pgtable.h
@@ -0,0 +1,18 @@
+#ifndef BOOT_COMPRESSED_PAGETABLE_H
+#define BOOT_COMPRESSED_PAGETABLE_H
+
+#define TRAMPOLINE_32BIT_SIZE		(2 * PAGE_SIZE)
+
+#define TRAMPOLINE_32BIT_PGTABLE_OFF	0
+
+#define TRAMPOLINE_32BIT_CODE_OFF	PAGE_SIZE
+#define TRAMPOLINE_32BIT_CODE_SIZE	0x50
+
+#define TRAMPOLINE_32BIT_STACK_END	TRAMPOLINE_32BIT_SIZE
+
+#ifndef __ASSEMBLER__
+
+extern void (*trampoline_32bit_src)(void *return_ptr);
+
+#endif /* __ASSEMBLER__ */
+#endif /* BOOT_COMPRESSED_PAGETABLE_H */
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index eed3a2c3b577..a2ab6b9cf258 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -1,18 +1,61 @@
 #include <asm/processor.h>
+#include "pgtable.h"
+#include "../string.h"
 
-int l5_paging_required(void)
+#define BIOS_START_MIN		0x20000U	/* 128K, less than this is insane */
+#define BIOS_START_MAX		0x9f000U	/* 640K, absolute maximum */
+
+unsigned long paging_prepare(void)
 {
-	/* Check i leaf 7 is supported. */
-	if (native_cpuid_eax(0) < 7)
-		return 0;
+	unsigned long bios_start, ebda_start, trampoline_start, *trampoline;
+	int l5_required = 0;
+
+	/* Check if la57 is desired and supported */
+	if (IS_ENABLED(CONFIG_X86_5LEVEL) && native_cpuid_eax(0) >= 7 &&
+			(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
+		l5_required = 1;
+
+	/*
+	 * Find suitable spot for trampoline.
+	 * Based on reserve_bios_regions().
+	 */
+
+	ebda_start = *(unsigned short *)0x40e << 4;
+	bios_start = *(unsigned short *)0x413 << 10;
+
+	if (bios_start < BIOS_START_MIN || bios_start > BIOS_START_MAX)
+		bios_start = BIOS_START_MAX;
+
+	if (ebda_start > BIOS_START_MIN && ebda_start < bios_start)
+		bios_start = ebda_start;
+
+	/* Place trampoline below end of low memory, aligned to 4k */
+	trampoline_start = bios_start - TRAMPOLINE_32BIT_SIZE;
+	trampoline_start = round_down(trampoline_start, PAGE_SIZE);
+
+	trampoline = (unsigned long *)trampoline_start;
+
+	/* Clear trampoline memory first */
+	memset(trampoline, 0, TRAMPOLINE_32BIT_SIZE);
 
-	/* Check if la57 is supported. */
-	if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
-		return 0;
+	/* Copy trampoline code in place */
+	memcpy(trampoline + TRAMPOLINE_32BIT_CODE_OFF / sizeof(unsigned long),
+			&trampoline_32bit_src, TRAMPOLINE_32BIT_CODE_SIZE);
 
-	/* Check if 5-level paging has already been enabled. */
-	if (native_read_cr4() & X86_CR4_LA57)
-		return 0;
+	if (l5_required) {
+		/*
+		 * For 5-level paging setup current CR3 as the first and the
+		 * only entry in a new top level page table.
+		 */
+		trampoline[0] = __read_cr3() + _PAGE_TABLE_NOENC;
+	} else {
+		/*
+		 * For 4-level paging, copy current top-level page table.
+		 * It might be above 4G and be inaccessible from 32-bit mode.
+		 */
+		memcpy(trampoline, (void *)__read_cr3(), PAGE_SIZE);
+	}
 
-	return 1;
+	/* Bit 0 is used to encode if 5-level paging is required */
+	return trampoline_start | l5_required;
 }
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCHv2 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G
  2017-11-10 22:06 ` Kirill A. Shutemov
@ 2017-11-10 22:06   ` Kirill A. Shutemov
  -1 siblings, 0 replies; 46+ messages in thread
From: Kirill A. Shutemov @ 2017-11-10 22:06 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
	Kirill A. Shutemov

This patch addresses a shortcoming in the current boot process on machines
that support 5-level paging.

If the bootloader enables 64-bit mode with 4-level paging, we need to
switch over to 5-level paging. The switch requires disabling paging, which
works fine if the kernel itself is loaded below 4G.

If the bootloader puts the kernel above 4G (not sure if anybody does this),
we would lose control as soon as paging is disabled, because the code
becomes unreachable.

This patch implements a trampoline in lower memory to handle this
situation.

We only need the memory for a very short time, until the main kernel image
sets up its own page tables.

We go through the trampoline even if we don't have to: if we're already in
5-level paging mode or if we don't need to switch to it. This way the
trampoline code gets tested on every boot, not only in a rare special
case.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
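The value handed from paging_prepare() to the trampoline packs two things
into one register: the page-aligned trampoline base and, in bit 0, whether
5-level paging is required. The sketch below models that encoding and the
offsets from pgtable.h in plain C. The struct and function names are
hypothetical, and SKETCH_PAGE_SIZE stands in for the kernel's PAGE_SIZE.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE		4096UL
#define TRAMPOLINE_32BIT_PGTABLE_OFF	0UL
#define TRAMPOLINE_32BIT_CODE_OFF	SKETCH_PAGE_SIZE
#define TRAMPOLINE_32BIT_STACK_END	(2 * SKETCH_PAGE_SIZE)

/* Hypothetical view of what the assembly derives from the encoded value. */
struct trampoline_info {
	unsigned long base;		/* start of trampoline memory */
	unsigned long pgtable;		/* top-level page table, loaded into CR3 */
	unsigned long code;		/* entry point of the copied 32-bit code */
	unsigned long stack_end;	/* initial ESP, stack grows down from here */
	bool l5_required;		/* bit 0 of the encoded value */
};

/* The base is page aligned, so bit 0 is free to carry the flag. */
unsigned long encode_trampoline(unsigned long base, bool l5_required)
{
	return base | (l5_required ? 1UL : 0UL);
}

struct trampoline_info decode_trampoline(unsigned long encoded)
{
	struct trampoline_info t;

	t.l5_required = (encoded & 1UL) != 0;
	t.base = encoded & ~1UL;
	t.pgtable = t.base + TRAMPOLINE_32BIT_PGTABLE_OFF;
	t.code = t.base + TRAMPOLINE_32BIT_CODE_OFF;
	t.stack_end = t.base + TRAMPOLINE_32BIT_STACK_END;
	return t;
}
```

With a base of 0x9d000, the far return targets 0x9e000 and the 32-bit stack
starts at 0x9f000. In this layout the stack shares the code page and grows
down toward the copied code, which occupies only the first 0x50 bytes.
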
 arch/x86/boot/compressed/head_64.S | 72 +++++++++++++++++++++++---------------
 1 file changed, 43 insertions(+), 29 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 33a47d5c6445..525972ca27b7 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -33,6 +33,7 @@
 #include <asm/processor-flags.h>
 #include <asm/asm-offsets.h>
 #include <asm/bootparam.h>
+#include "pgtable.h"
 
 /*
  * Locally defined symbols should be marked hidden:
@@ -339,31 +340,22 @@ ENTRY(startup_64)
 	call	paging_prepare
 	popq	%rsi
 	movq	%rax, %rcx
-	andq	$(~1UL), %rcx
-
-	testq	$1, %rax
-	jz	lvl5
-
-	/* Clear additional page table */
-	leaq	lvl5_pgtable(%rbx), %rdi
-	xorq	%rax, %rax
-	movq	$(PAGE_SIZE/8), %rcx
-	rep	stosq
 
 	/*
-	 * Setup current CR3 as the first and only entry in a new top level
-	 * page table.
+	 * Load address of trampoline_return into RDI.
+	 * It will be used by trampoline to return to main code.
 	 */
-	movq	%cr3, %rdi
-	leaq	0x7 (%rdi), %rax
-	movq	%rax, lvl5_pgtable(%rbx)
+	leaq	trampoline_return(%rip), %rdi
 
 	/* Switch to compatibility mode (CS.L = 0 CS.D = 1) via far return */
 	pushq	$__KERNEL32_CS
-	leaq	compatible_mode(%rip), %rax
+	andq	$(~1UL), %rax /* Clear bit 0, which encodes whether 5-level paging is needed */
+	leaq	TRAMPOLINE_32BIT_CODE_OFF(%rax), %rax
 	pushq	%rax
 	lretq
-lvl5:
+trampoline_return:
+	/* Restore stack, 32-bit trampoline uses own stack */
+	leaq	boot_stack_end(%rbx), %rsp
 
 	/* Zero EFLAGS */
 	pushq	$0
@@ -501,36 +493,51 @@ relocated:
 	jmp	*%rax
 
 	.code32
+/*
+ * This is 32-bit trampoline that will be copied over to low memory.
+ *
+ * RDI contains return address (might be above 4G).
+ * ECX contains the base address of trampoline memory.
+ * Bit 0 of ECX encodes if 5-level paging is required.
+ */
 ENTRY(trampoline_32bit_src)
-compatible_mode:
 	/* Setup data and stack segments */
 	movl	$__KERNEL_DS, %eax
 	movl	%eax, %ds
 	movl	%eax, %ss
 
+	movl	%ecx, %edx
+	andl	$(~1UL), %edx
+
+	/* Setup new stack at the end of trampoline memory */
+	leal	TRAMPOLINE_32BIT_STACK_END (%edx), %esp
+
 	/* Disable paging */
 	movl	%cr0, %eax
 	btrl	$X86_CR0_PG_BIT, %eax
 	movl	%eax, %cr0
 
-	/* Point CR3 to 5-level paging */
-	leal	lvl5_pgtable(%ebx), %eax
+	/* Point CR3 to trampoline top level page table */
+	leal	TRAMPOLINE_32BIT_PGTABLE_OFF (%edx), %eax
 	movl	%eax, %cr3
 
 	/* Enable PAE and LA57 mode */
 	movl	%cr4, %eax
-	orl	$(X86_CR4_PAE | X86_CR4_LA57), %eax
+	orl	$X86_CR4_PAE, %eax
+
+	/* Bit 0 of ECX encodes if 5-level paging is required */
+	testl	$1, %ecx
+	jz	1f
+	orl	$X86_CR4_LA57, %eax
+1:
 	movl	%eax, %cr4
 
-	/* Calculate address we are running at */
-	call	1f
-1:	popl	%edi
-	subl	$1b, %edi
+	/* Calculate address of paging_enabled once we are in trampoline */
+	leal	paging_enabled - trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_OFF (%edx), %eax
 
 	/* Prepare stack for far return to Long Mode */
 	pushl	$__KERNEL_CS
-	leal	lvl5(%edi), %eax
-	push	%eax
+	pushl	%eax
 
 	/* Enable paging back */
 	movl	$(X86_CR0_PG | X86_CR0_PE), %eax
@@ -538,6 +545,15 @@ compatible_mode:
 
 	lret
 
+	.code64
+paging_enabled:
+	/* Return from trampoline */
+	jmp	*%rdi
+
+	/* Bound size of trampoline code */
+	.org	trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_SIZE
+
+	.code32
 no_longmode:
 	/* This isn't an x86-64 CPU so hang */
 1:
@@ -595,5 +611,3 @@ boot_stack_end:
 	.balign 4096
 pgtable:
 	.fill BOOT_PGT_SIZE, 1, 0
-lvl5_pgtable:
-	.fill PAGE_SIZE, 1, 0
-- 
2.14.2
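
The calling convention described in the trampoline comment above (base
address of the trampoline memory in ECX, with bit 0 saying whether 5-level
paging is required) can be sketched in C. This is only an illustration
under assumptions: the helper names and the example address 0x9d000 are
hypothetical and do not come from the patch; the only things taken from
the patch are the "clear bit 0" mask and the "test bit 0" check.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not part of the patch: pack a one-bit flag into
 * bit 0 of an address that is known to be at least 2-byte aligned.
 */
static inline uint64_t encode_trampoline(uint64_t base, int need_la57)
{
	/* The scheme only works if bit 0 of the address is unused. */
	assert((base & 1) == 0);
	return base | (need_la57 ? 1 : 0);
}

/* Recover the base address; mirrors "andq $(~1UL), %rax". */
static inline uint64_t trampoline_base(uint64_t encoded)
{
	return encoded & ~1ULL;
}

/* Recover the flag; mirrors "testl $1, %ecx". */
static inline int trampoline_needs_la57(uint64_t encoded)
{
	return (int)(encoded & 1);
}
```

For example, trampoline_base(encode_trampoline(0x9d000, 1)) gives back
0x9d000, and trampoline_needs_la57() on the same value returns 1. Passing
both pieces in one register avoids needing a second argument register
across the mode switch.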

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-10 22:06 ` Kirill A. Shutemov
@ 2017-11-22  8:09   ` Kirill A. Shutemov
  -1 siblings, 0 replies; 46+ messages in thread
From: Kirill A. Shutemov @ 2017-11-22  8:09 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kirill A. Shutemov, x86, Thomas Gleixner, H. Peter Anvin,
	Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Borislav Petkov, Andi Kleen, linux-mm, linux-kernel

On Sat, Nov 11, 2017 at 01:06:41AM +0300, Kirill A. Shutemov wrote:
> Hi Ingo,
> 
> Here's updated changes that prepare the code to boot-time switching between
> paging modes and handle booting in 5-level mode when bootloader put kernel
> image above 4G, but haven't enabled 5-level paging for us.
> 
> I've updated patches based on your feedback.
> 
> Please review and consider applying.

Gentle ping.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-10 22:06 ` Kirill A. Shutemov
@ 2017-11-29 15:49   ` Borislav Petkov
  -1 siblings, 0 replies; 46+ messages in thread
From: Borislav Petkov @ 2017-11-29 15:49 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin,
	Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov, Andi Kleen,
	linux-mm, linux-kernel

On Sat, Nov 11, 2017 at 01:06:41AM +0300, Kirill A. Shutemov wrote:
> Hi Ingo,
> 
> Here's updated changes that prepare the code to boot-time switching between
> paging modes and handle booting in 5-level mode when bootloader put kernel
> image above 4G, but haven't enabled 5-level paging for us.

Btw, if I enable CONFIG_X86_5LEVEL with 4.15-rc1 on an AMD box, the box
triple-faults and ends up spinning in a reboot loop. Even though it
should say:

early console in setup code
This kernel requires the following features not present on the CPU:
la57 
Unable to boot - please use a kernel appropriate for your CPU.

and halt.

A kvm guest still does that but baremetal triple-faults.

Ideas?

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 15:49   ` Borislav Petkov
@ 2017-11-29 16:13     ` Kirill A. Shutemov
  -1 siblings, 0 replies; 46+ messages in thread
From: Kirill A. Shutemov @ 2017-11-29 16:13 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kirill A. Shutemov, Ingo Molnar, x86, Thomas Gleixner,
	H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Andi Kleen, linux-mm, linux-kernel

On Wed, Nov 29, 2017 at 04:49:08PM +0100, Borislav Petkov wrote:
> On Sat, Nov 11, 2017 at 01:06:41AM +0300, Kirill A. Shutemov wrote:
> > Hi Ingo,
> > 
> > Here's updated changes that prepare the code to boot-time switching between
> > paging modes and handle booting in 5-level mode when bootloader put kernel
> > image above 4G, but haven't enabled 5-level paging for us.
> 
> Btw, if I enable CONFIG_X86_5LEVEL with 4.15-rc1 on an AMD box, the box
> triple-faults and ends up spinning in a reboot loop. Even though it
> should say:
> 
> early console in setup code
> This kernel requires the following features not present on the CPU:
> la57 
> Unable to boot - please use a kernel appropriate for your CPU.
> 
> and halt.
> 
> A kvm guest still does that but baremetal triple-faults.
> 
> Ideas?

Looks like we call check_cpuflags() too late: 5-level paging gets enabled
before image decompression starts.

For qemu/kvm it works because it's supported in softmmu, even if not
advertised in cpuid.

I'm not sure it's worth fixing on its own. I would rather get the boot-time
switching code upstream sooner. That will make the problem go away naturally.
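
A user-space sketch of the kind of check that would have to happen before
CR4.LA57 is set: LA57 support is reported in CPUID leaf 7, subleaf 0, ECX
bit 16. The function and macro names below are hypothetical, and this uses
the compiler's <cpuid.h> helper rather than anything from the
decompression code.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID.(EAX=7,ECX=0):ECX bit 16 reports 5-level paging (LA57). */
#define CPUID_7_0_ECX_LA57	(1u << 16)

/* Pure helper so the bit test can be checked without real hardware. */
static inline bool ecx_has_la57(uint32_t leaf7_ecx)
{
	return (leaf7_ecx & CPUID_7_0_ECX_LA57) != 0;
}

/* Hypothetical wrapper: ask the running CPU whether it advertises LA57. */
static inline bool cpu_has_la57(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid_count() returns 0 if leaf 7 is not supported. */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return false;
	return ecx_has_la57(ecx);
#else
	return false;	/* not an x86 build, so no LA57 */
#endif
}
```

The point of the ordering is that the result of such a check has to be
known before the paging mode is switched, not afterwards.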

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 16:13     ` Kirill A. Shutemov
@ 2017-11-29 16:40       ` Thomas Gleixner
  -1 siblings, 0 replies; 46+ messages in thread
From: Thomas Gleixner @ 2017-11-29 16:40 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Borislav Petkov, Kirill A. Shutemov, Ingo Molnar, x86,
	H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Andi Kleen, linux-mm, linux-kernel

On Wed, 29 Nov 2017, Kirill A. Shutemov wrote:

> On Wed, Nov 29, 2017 at 04:49:08PM +0100, Borislav Petkov wrote:
> > On Sat, Nov 11, 2017 at 01:06:41AM +0300, Kirill A. Shutemov wrote:
> > > Hi Ingo,
> > > 
> > > Here's updated changes that prepare the code to boot-time switching between
> > > paging modes and handle booting in 5-level mode when bootloader put kernel
> > > image above 4G, but haven't enabled 5-level paging for us.
> > 
> > Btw, if I enable CONFIG_X86_5LEVEL with 4.15-rc1 on an AMD box, the box
> > triple-faults and ends up spinning in a reboot loop. Even though it
> > should say:
> > 
> > early console in setup code
> > This kernel requires the following features not present on the CPU:
> > la57 
> > Unable to boot - please use a kernel appropriate for your CPU.
> > 
> > and halt.
> > 
> > A kvm guest still does that but baremetal triple-faults.
> > 
> > Ideas?
> 
> Looks like we call check_cpuflags() too late. 5-level paging gets enabled
> before image decompression started.
> 
> For qemu/kvm it works because it's supported in softmmu, even if not
> advertised in cpuid.
> 
> I'm not sure if it worth fixing on its own. I would rather get boot-time
> switching code upstream sooner. It will get problem go away naturally.

It needs to be fixed now, because that problem exists in 4.14.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 16:40       ` Thomas Gleixner
@ 2017-11-29 17:08         ` Kirill A. Shutemov
  -1 siblings, 0 replies; 46+ messages in thread
From: Kirill A. Shutemov @ 2017-11-29 17:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, Kirill A. Shutemov, Ingo Molnar, x86,
	H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Andi Kleen, linux-mm, linux-kernel

On Wed, Nov 29, 2017 at 05:40:32PM +0100, Thomas Gleixner wrote:
> On Wed, 29 Nov 2017, Kirill A. Shutemov wrote:
> 
> > On Wed, Nov 29, 2017 at 04:49:08PM +0100, Borislav Petkov wrote:
> > > On Sat, Nov 11, 2017 at 01:06:41AM +0300, Kirill A. Shutemov wrote:
> > > > Hi Ingo,
> > > > 
> > > > Here's updated changes that prepare the code to boot-time switching between
> > > > paging modes and handle booting in 5-level mode when bootloader put kernel
> > > > image above 4G, but haven't enabled 5-level paging for us.
> > > 
> > > Btw, if I enable CONFIG_X86_5LEVEL with 4.15-rc1 on an AMD box, the box
> > > triple-faults and ends up spinning in a reboot loop. Even though it
> > > should say:
> > > 
> > > early console in setup code
> > > This kernel requires the following features not present on the CPU:
> > > la57 
> > > Unable to boot - please use a kernel appropriate for your CPU.
> > > 
> > > and halt.
> > > 
> > > A kvm guest still does that but baremetal triple-faults.
> > > 
> > > Ideas?
> > 
> > Looks like we call check_cpuflags() too late. 5-level paging gets enabled
> > before image decompression started.
> > 
> > For qemu/kvm it works because it's supported in softmmu, even if not
> > advertised in cpuid.
> > 
> > I'm not sure if it worth fixing on its own. I would rather get boot-time
> > switching code upstream sooner. It will get problem go away naturally.
> 
> It needs to be fixed now. Because that problem exists in 4.14

Okay.

We're really early in the boot -- startup_64 in decompression code -- and
I don't know a way to print a message there. Is there a way?

no_longmode is handled by just hanging the machine. Is that enough for the
no_la57 case too?

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 17:08         ` Kirill A. Shutemov
@ 2017-11-29 17:48           ` Borislav Petkov
  -1 siblings, 0 replies; 46+ messages in thread
From: Borislav Petkov @ 2017-11-29 17:48 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Thomas Gleixner, Kirill A. Shutemov, Ingo Molnar, x86,
	H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Andi Kleen, linux-mm, linux-kernel

On Wed, Nov 29, 2017 at 08:08:31PM +0300, Kirill A. Shutemov wrote:
> We're really early in the boot -- startup_64 in decompression code -- and
> I don't know a way print a message there. Is there a way?
> 
> no_longmode handled by just hanging the machine. Is it enough for no_la57
> case too?

Patch pls.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 17:48           ` Borislav Petkov
@ 2017-11-29 19:01             ` H. Peter Anvin
  -1 siblings, 0 replies; 46+ messages in thread
From: H. Peter Anvin @ 2017-11-29 19:01 UTC (permalink / raw)
  To: Borislav Petkov, Kirill A. Shutemov
  Cc: Thomas Gleixner, Kirill A. Shutemov, Ingo Molnar, x86,
	Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov, Andi Kleen,
	linux-mm, linux-kernel

On 11/29/17 09:48, Borislav Petkov wrote:
> On Wed, Nov 29, 2017 at 08:08:31PM +0300, Kirill A. Shutemov wrote:
>> We're really early in the boot -- startup_64 in decompression code -- and
>> I don't know a way print a message there. Is there a way?
>>
>> no_longmode handled by just hanging the machine. Is it enough for no_la57
>> case too?
> 
> Patch pls.
> 

I don't think there is any way to get a message out here.  It's too late
to use the firmware, and too early to use anything native.

no_longmode in startup_64 is an oxymoron -- it simply can't happen,
although of course we can enter at the 32-bit entry point with that problem.

We can hang the machine, or we can triple-fault it in the hope of
triggering a reset, and that way if the bootloader has been configured
with a backup kernel there is a hope of recovery.

Triple-faulting is trivial:

	push $0
	push $0
	lidt (%rsp)		/* %esp for 32-bit mode */
	ud2
	/* WTF? */
1:	hlt
	jmp 1b

This will either hang the machine or reboot it, depending on whether the
reboot-on-triple-fault logic in the chipset actually works.
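
Why the sequence above ends in a triple fault can be modelled in a few
lines of C. The two pushes leave 16 zero bytes on the stack, so lidt loads
an IDT with base 0 and limit 0. In 64-bit mode each gate is 16 bytes, so
no vector fits inside that limit: ud2 raises #UD (vector 6), delivering it
fails, the resulting #GP (vector 13) and then #DF (vector 8) cannot be
delivered either, and the CPU shuts down. The helper name is hypothetical
and the model covers the 64-bit case only (32-bit gates are 8 bytes).

```c
#include <stdbool.h>
#include <stdint.h>

/* Size of one IDT gate descriptor in long mode, in bytes. */
#define IDT_GATE_SIZE_64	16u

/*
 * A vector can be delivered only if its whole gate descriptor lies
 * within the IDT limit (the limit is the offset of the last valid byte).
 */
static inline bool idt_vector_in_limit(uint16_t idt_limit, uint8_t vector)
{
	uint32_t last_byte = (uint32_t)vector * IDT_GATE_SIZE_64 +
			     (IDT_GATE_SIZE_64 - 1);

	return last_byte <= idt_limit;
}
```

With idt_limit == 0 the function returns false for vectors 6, 13 and 8
alike, which is the chain of three failed deliveries described above.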

	-hpa

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 19:01             ` H. Peter Anvin
@ 2017-11-29 19:19               ` Borislav Petkov
  -1 siblings, 0 replies; 46+ messages in thread
From: Borislav Petkov @ 2017-11-29 19:19 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Kirill A. Shutemov, Thomas Gleixner, Kirill A. Shutemov,
	Ingo Molnar, x86, Linus Torvalds, Andy Lutomirski,
	Cyrill Gorcunov, Andi Kleen, linux-mm, linux-kernel

On Wed, Nov 29, 2017 at 11:01:35AM -0800, H. Peter Anvin wrote:
> We can hang the machine, or we can triple-fault it in the hope of
> triggering a reset, and that way if the bootloader has been configured
> with a backup kernel there is a hope of recovery.

Well, it triple-faults right now and that's not really user-friendly. If
we can't dump a message then we should make X86_5LEVEL depend on BROKEN
for the time being...

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 17:08         ` Kirill A. Shutemov
@ 2017-11-29 20:58           ` Andi Kleen
  -1 siblings, 0 replies; 46+ messages in thread
From: Andi Kleen @ 2017-11-29 20:58 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Thomas Gleixner, Borislav Petkov, Kirill A. Shutemov,
	Ingo Molnar, x86, H. Peter Anvin, Linus Torvalds,
	Andy Lutomirski, Cyrill Gorcunov, linux-mm, linux-kernel

> We're really early in the boot -- startup_64 in decompression code -- and
> I don't know a way to print a message there. Is there a way?
> 
> no_longmode is handled by just hanging the machine. Is it enough for the
> no_la57 case too?

The way to handle it is to check it early in the real mode boot code when you 
can still print messages. That is how missing long mode is handled.

-Andi

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 20:58           ` Andi Kleen
@ 2017-11-29 21:03             ` hpa
  -1 siblings, 0 replies; 46+ messages in thread
From: hpa @ 2017-11-29 21:03 UTC (permalink / raw)
  To: Andi Kleen, Kirill A. Shutemov
  Cc: Thomas Gleixner, Borislav Petkov, Kirill A. Shutemov,
	Ingo Molnar, x86, Linus Torvalds, Andy Lutomirski,
	Cyrill Gorcunov, linux-mm, linux-kernel

On November 29, 2017 12:58:15 PM PST, Andi Kleen <ak@linux.intel.com> wrote:
>> We're really early in the boot -- startup_64 in decompression code -- and
>> I don't know a way to print a message there. Is there a way?
>>
>> no_longmode is handled by just hanging the machine. Is it enough for the
>> no_la57 case too?
>
> The way to handle it is to check it early in the real mode boot code when
> you can still print messages. That is how missing long mode is handled.
>
> -Andi

Yes, and that test should be done automatically.  However, we also check at several later points in case that code is bypassed by the bootloader.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 19:19               ` Borislav Petkov
@ 2017-11-29 21:33                 ` H. Peter Anvin
  -1 siblings, 0 replies; 46+ messages in thread
From: H. Peter Anvin @ 2017-11-29 21:33 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kirill A. Shutemov, Thomas Gleixner, Kirill A. Shutemov,
	Ingo Molnar, x86, Linus Torvalds, Andy Lutomirski,
	Cyrill Gorcunov, Andi Kleen, linux-mm, linux-kernel

On 11/29/17 11:19, Borislav Petkov wrote:
> On Wed, Nov 29, 2017 at 11:01:35AM -0800, H. Peter Anvin wrote:
>> We can hang the machine, or we can triple-fault it in the hope of
>> triggering a reset, and that way if the bootloader has been configured
>> with a backup kernel there is a hope of recovery.
> 
> Well, it triple-faults right now and that's not really user-friendly. If
> we can't dump a message then we should make X86_5LEVEL depend on BROKEN
> for the time being...
> 

You can't dump a message about *anything* if the bootloader bypasses the
checks that happen before we leave the firmware behind.  This is what
this is about.  For BIOS or EFI boots that go through the proper stub
functions we will print a message just fine, as we already validate the
"required features" structure (although please do verify that the
relevant words are indeed being checked.)
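
The "required features" validation mentioned above boils down to
comparing a mask of required feature bits against what the CPU reports,
word by word. A minimal C sketch of that idea follows. It is not the
kernel's actual code: the word count NWORDS, the function name
check_required() and the array layout are assumptions made purely for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define NWORDS 2	/* hypothetical number of 32-bit feature words */

/*
 * Fill missing[] with the required bits that the CPU does not report
 * and return the total number of missing bits. A caller that can still
 * reach the firmware console would print a message when the result is
 * non-zero.
 */
static int check_required(const uint32_t required[NWORDS],
			  const uint32_t present[NWORDS],
			  uint32_t missing[NWORDS])
{
	int count = 0;

	for (size_t i = 0; i < NWORDS; i++) {
		missing[i] = required[i] & ~present[i];
		/* Count set bits by clearing the lowest one each round. */
		for (uint32_t m = missing[i]; m; m &= m - 1)
			count++;
	}
	return count;
}
```

The point of returning the per-word masks as well as the count is that
an error message can then name the exact word and bit that is missing.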

However, if the bootloader jumps straight into the code what do you
expect it to do?  We have no real concept about what we'd need to do to
issue a message as we really don't know what devices are available on
the system, etc.  If the screen_info field in struct boot_params has
been initialized then we actually *do* know how to write to the screen
-- if you are okay with including a text font etc. since modern systems
boot in graphics mode.

What else could we do?  I guess we could add a new field -- which
bootloaders would have to add support for -- for a callback to the
bootloader in case of an early-detected fatal kernel initialization
error.  This would have some... interesting(*)... issues with it, and
wouldn't resolve anything for existing bootloaders, but perhaps it is a
worthwhile extension going forward.

	-hpa

(*) The bootloader would have to be prepared for a largely undefined CPU
    state, in a rarely executed path.  However, it is arguably no worse
    than what we have now.  Current bootloaders *can* at least know all
    the memory the kernel will use before the kernel's own memory
    management takes over, so it is possible for it to allocate the
    kernel in such a way that its own code/data is preserved.

    It is at least possible to determine which major CPU mode we are
    running in when we get to that entrypoint.  The following code
    snippet will do it:

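entry:
	.code16
	dec %ax		/* 0x48: dec in 16/32-bit mode, REX.W prefix in 64-bit mode */
	mov $0,%ax	/* 0xb8: takes imm16, imm32 or (with REX.W) imm64 */
	jmp 16f		/* reached only in 16-bit mode; immediate bytes otherwise */
	nop
	nop
	jmp 32f		/* reached in 32-bit mode; still inside the imm64 in 64-bit mode */
	.code64
	jmp code_64	/* first instruction after the 10-byte 64-bit mov */
	.code32
32:	jmp code_32
	.code16
16:	/* Arbitrary 16-bit code can start here */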
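
To see why the snippet above lands in a different place in each mode,
the byte stream can be walked with a toy decoder. The C sketch below is
illustrative only: the byte array, the zero jump displacements, and the
names stub, cpu_mode and first_jump_offset() are assumptions made for
this example (the short-jump encoding of "jmp code_64" in particular is
a placeholder), and the decoder understands only the four opcodes that
occur in the stub.

```c
#include <stddef.h>
#include <stdint.h>

enum cpu_mode { MODE_16 = 16, MODE_32 = 32, MODE_64 = 64 };

/* Machine code of the stub; jump displacements are placeholders. */
static const uint8_t stub[] = {
	0x48,			/* dec %ax / dec %eax / REX.W prefix */
	0xb8, 0x00, 0x00,	/* mov $0,%ax (opcode + imm16) */
	0xeb, 0x00,		/* jmp 16f */
	0x90, 0x90,		/* nop; nop */
	0xeb, 0x00,		/* jmp 32f */
	0xeb, 0x00,		/* jmp code_64 (placeholder encoding) */
};

/*
 * Return the offset of the first jmp (0xeb) the CPU would actually
 * execute in the given mode, or -1 if an unknown opcode is hit.
 */
static int first_jump_offset(enum cpu_mode mode)
{
	size_t pc = 0;

	while (pc < sizeof(stub)) {
		uint8_t op = stub[pc];

		if (op == 0x48 && mode != MODE_64) {
			pc += 1;		/* plain one-byte dec */
		} else if (op == 0x48 && pc + 1 < sizeof(stub) &&
			   stub[pc + 1] == 0xb8) {
			pc += 2 + 8;		/* REX.W + mov + imm64 */
		} else if (op == 0xb8) {
			pc += 1 + (mode == MODE_16 ? 2 : 4);
		} else if (op == 0x90) {
			pc += 1;		/* nop */
		} else if (op == 0xeb) {
			return (int)pc;		/* first executed jump */
		} else {
			return -1;
		}
	}
	return -1;
}
```

Under these assumptions the first executed jump sits at offset 4 in
16-bit mode (jmp 16f), offset 8 in 32-bit mode (jmp 32f) and offset 10
in 64-bit mode (jmp code_64), which is the three-way dispatch the
assembly relies on.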

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 21:33                 ` H. Peter Anvin
@ 2017-11-29 22:31                   ` Borislav Petkov
  -1 siblings, 0 replies; 46+ messages in thread
From: Borislav Petkov @ 2017-11-29 22:31 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Kirill A. Shutemov, Thomas Gleixner, Kirill A. Shutemov,
	Ingo Molnar, x86, Linus Torvalds, Andy Lutomirski,
	Cyrill Gorcunov, Andi Kleen, linux-mm, linux-kernel

On Wed, Nov 29, 2017 at 01:33:28PM -0800, H. Peter Anvin wrote:
> You can't dump a message about *anything* if the bootloader bypasses the
> checks that happen before we leave the firmware behind.  This is what
> this is about.  For BIOS or EFI boot that go through the proper stub
> functions we will print a message just fine, as we already validate the
> "required features" structure (although please do verify that the
> relevant words are indeed being checked.)

A couple of points:

* so this box here has a normal grub installation and apparently grub
jumps to some other entry point.

* I'm not convinced we need to do everything you typed because this is
only a temporary issue and once X86_5LEVEL is complete, it should work.
I mean, it needs to work otherwise forget single-system image and I
don't think we want to give that up.

> However, if the bootloader jumps straight into the code what do you
> expect it to do?  We have no real concept about what we'd need to do to
> issue a message as we really don't know what devices are available on
> the system, etc.  If the screen_info field in struct boot_params has
> been initialized then we actually *do* know how to write to the screen
> -- if you are okay with including a text font etc. since modern systems
> boot in graphics mode.

We switch to text mode and dump our message. Can we do that?

I wouldn't want to do any of this back'n'forth between kernel and boot
loader because that sounds fragile, at least to me. And again, I'm
not convinced we should spend too much energy on this as the issue is
temporary AFAICT.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 22:31                   ` Borislav Petkov
@ 2017-11-29 23:24                     ` H. Peter Anvin
  -1 siblings, 0 replies; 46+ messages in thread
From: H. Peter Anvin @ 2017-11-29 23:24 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kirill A. Shutemov, Thomas Gleixner, Kirill A. Shutemov,
	Ingo Molnar, x86, Linus Torvalds, Andy Lutomirski,
	Cyrill Gorcunov, Andi Kleen, linux-mm, linux-kernel

On 11/29/17 14:31, Borislav Petkov wrote:
> 
> A couple of points:
> 
> * so this box here has a normal grub installation and apparently grub
> jumps to some other entry point.
> 

Yes, Grub as a matter of policy(!) does everything in the most braindead
way possible.  You have to use "linux16" or "linuxefi" to make it do
something sane.

> * I'm not convinced we need to do everything you typed because this is
> only a temporary issue and once X86_5LEVEL is complete, it should work.
> I mean, it needs to work otherwise forget single-system image and I
> don't think we want to give that up.
> 
>> However, if the bootloader jumps straight into the code what do you
>> expect it to do?  We have no real concept about what we'd need to do to
>> issue a message as we really don't know what devices are available on
>> the system, etc.  If the screen_info field in struct boot_params has
>> been initialized then we actually *do* know how to write to the screen
>> -- if you are okay with including a text font etc. since modern systems
>> boot in graphics mode.
> 
> We switch to text mode and dump our message. Can we do that?

What is text mode?  It is hardware that is going away(*), and you don't
even know if you have a display screen on your system at all, or how
you'd have to configure your display hardware even if it is "mostly" VGA.

> I wouldn't want to do any of this back'n'forth between kernel and boot
> loader because that sounds fragile, at least to me. And again, I'm
> not convinced we should spend too much energy on this as the issue is
> temporary AFAICT.

Well, it's not just limited to 5-level mode; it's kind of a general issue.
We have had this issue for a very, very long time -- all the way back to
i386 PAE at the very least.  I'm personally OK with triple-faulting the
CPU in this case.

	-hpa


(*) And for good reason -- it is completely memory-latency-bound as you
    have an indirect reference for every byte you fetch.  In a UMA
    system this sucks up an insane amount of system bandwidth, unless
    you are willing to burn the area of having a 16K SRAM cache.

    VGA hardware, additionally, has a bunch of insane operations that
    have to be memory-mapped.  The resulting hardware screws with
    pretty much any sane GPU implementation, so I'm fully expecting that
    as soon as GPUs no longer come with a CBIOS option ROM, VGA hardware
    will be dropped more or less immediately.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 23:24                     ` H. Peter Anvin
@ 2017-11-30  1:27                       ` Konrad Rzeszutek Wilk
  -1 siblings, 0 replies; 46+ messages in thread
From: Konrad Rzeszutek Wilk @ 2017-11-30  1:27 UTC (permalink / raw)
  To: H. Peter Anvin, daniel.kiper
  Cc: Borislav Petkov, Kirill A. Shutemov, Thomas Gleixner,
	Kirill A. Shutemov, Ingo Molnar, x86, Linus Torvalds,
	Andy Lutomirski, Cyrill Gorcunov, Andi Kleen, linux-mm,
	linux-kernel

On Wed, Nov 29, 2017 at 03:24:53PM -0800, H. Peter Anvin wrote:
> On 11/29/17 14:31, Borislav Petkov wrote:
> > 
> > A couple of points:
> > 
> > * so this box here has a normal grub installation and apparently grub
> > jumps to some other entry point.

Ouch. Perhaps you can report this on grub-devel mailing list? And also
what version, since I am not sure if this is a distro-specific version?

> > 
> 
> Yes, Grub as a matter of policy(!) does everything in the most braindead

There is a policy on this? Could you point me to it - it would
be enlightening to read it :-)

> way possible.  You have to use "linux16" or "linuxefi" to make it do
> something sane.

The Linux bootparams structure is _only_ for Linux. Or are there other
OSes that use the same structure to pass information?

AFAICT the linuxefi does not exist upstream.
> 
> > * I'm not convinced we need to do everything you typed because this is
> > only a temporary issue and once X86_5LEVEL is complete, it should work.
> > I mean, it needs to work otherwise forget single-system image and I
> > don't think we want to give that up.
> > 
> >> However, if the bootloader jumps straight into the code what do you
> >> expect it to do?  We have no real concept about what we'd need to do to
> >> issue a message as we really don't know what devices are available on
> >> the system, etc.  If the screen_info field in struct boot_params has
> >> been initialized then we actually *do* know how to write to the screen
> >> -- if you are okay with including a text font etc. since modern systems
> >> boot in graphics mode.
> > 
> > We switch to text mode and dump our message. Can we do that?
> 
> What is text mode?  It is hardware that is going away(*), and you don't
> even know if you have a display screen on your system at all, or how
> you'd have to configure your display hardware even if it is "mostly" VGA.
> 
> > I wouldn't want to do any of this back'n'forth between kernel and boot
> > loader because that sounds fragile, at least to me. And again, I'm
> > not convinced we should spend too much energy on this as the issue is
> > temporary AFAICT.
> 
> Well, it's not just limited to 5-level mode; it's kind of a general issue.
> We have had this issue for a very, very long time -- all the way back to
> i386 PAE at the very least.  I'm personally OK with triple-faulting the
> CPU in this case.
> 
> 	-hpa
> 
> 
> (*) And for good reason -- it is completely memory-latency-bound as you
>     have an indirect reference for every byte you fetch.  In a UMA
>     system this sucks up an insane amount of system bandwidth, unless
>     you are willing to burn the area of having a 16K SRAM cache.
> 
>     VGA hardware, additionally, has a bunch of insane operations that
>     have to be memory-mapped.  The resulting hardware screws with
>     pretty much any sane GPU implementation, so I'm fully expecting that
>     as soon as GPUs no longer come with a CBIOS option ROM VGA hardware
>     will be dropped more or less immediately.

Woot! RIP VGA..

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 17:48           ` Borislav Petkov
@ 2017-11-30  7:31             ` Kirill A. Shutemov
  -1 siblings, 0 replies; 46+ messages in thread
From: Kirill A. Shutemov @ 2017-11-30  7:31 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, x86,
	H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Andi Kleen, linux-mm, linux-kernel

On Wed, Nov 29, 2017 at 05:48:51PM +0000, Borislav Petkov wrote:
> On Wed, Nov 29, 2017 at 08:08:31PM +0300, Kirill A. Shutemov wrote:
> > We're really early in the boot -- startup_64 in decompression code -- and
> > I don't know a way to print a message there. Is there a way?
> > 
> > no_longmode is handled by just hanging the machine. Is it enough for the
> > no_la57 case too?
> 
> Patch pls.

The patch below, on top of patch 2/4 from this patchset, would do the trick.

Please give it a shot.

From 95b5489d1f4ea03c6226d13eb6797825234489d6 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Thu, 30 Nov 2017 10:23:53 +0300
Subject: [PATCH] x86/boot/compressed/64: Print error if 5-level paging is not
 supported

We cannot proceed with booting if the machine doesn't support the paging
mode the kernel was compiled for.

Reporting the error the usual way -- via validate_cpu() -- is not going to
work. We need to enable the appropriate paging mode before that, otherwise
the kernel would triple-fault during KASLR setup.

This code will go away once we get support for boot-time switching
between paging modes.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/boot/compressed/misc.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b50c42455e25..5205e848dc33 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -40,6 +40,8 @@
 /* Functions used by the included decompressor code below. */
 void *memmove(void *dest, const void *src, size_t n);
 
+int l5_paging_required(void);
+
 /*
  * This is set up by the setup-routine at boot-time
  */
@@ -362,6 +364,13 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 	console_init();
 	debug_putstr("early console in extract_kernel\n");
 
+	if (IS_ENABLED(CONFIG_X86_5LEVEL) && !l5_paging_required()) {
+		error("The kernel is compiled with 5-level paging enabled, "
+				"but the CPU doesn't support la57\n"
+				"Unable to boot - please use "
+				"a kernel appropriate for your CPU.\n");
+	}
+
 	free_mem_ptr     = heap;	/* Heap */
 	free_mem_end_ptr = heap + BOOT_HEAP_SIZE;
 
-- 
 Kirill A. Shutemov

^ permalink raw reply related	[flat|nested] 46+ messages in thread
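
For readers unfamiliar with the feature the patch tests for, here is a minimal user-space sketch of the two bits involved. It assumes a GCC- or Clang-compatible compiler that ships <cpuid.h>. The helper names (ecx_reports_la57, cr4_has_la57, cpu_supports_la57) are hypothetical and do not appear in the patch; l5_paging_required() itself comes from patch 2/4 and is not reproduced here. The bit positions are the architectural ones: CPUID leaf 7, subleaf 0, ECX bit 16 reports LA57 support, and CR4 bit 12 enables it.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang helper header; assumed to be available */
#endif

/* CPUID.(EAX=7,ECX=0):ECX bit 16 reports LA57 (57-bit linear addresses). */
#define CPUID7_ECX_LA57	(UINT32_C(1) << 16)
/* CR4 bit 12 (CR4.LA57) switches 5-level paging on. */
#define CR4_LA57	(UINT64_C(1) << 12)

/* Hypothetical helper: does this CPUID leaf 7 ECX value advertise LA57? */
static inline bool ecx_reports_la57(uint32_t ecx)
{
	return (ecx & CPUID7_ECX_LA57) != 0;
}

/* Hypothetical helper: is LA57 set in this CR4 value? */
static inline bool cr4_has_la57(uint64_t cr4)
{
	return (cr4 & CR4_LA57) != 0;
}

/*
 * Hypothetical user-space probe. Reading CR4 needs ring 0, so user space
 * can only ask CPUID whether the CPU is able to do 5-level paging.
 */
static inline bool cpu_supports_la57(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid_count() returns 0 when the leaf is not supported. */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return false;
	return ecx_reports_la57(ecx);
#else
	return false;	/* not an x86 CPU */
#endif
}
```

Calling cpu_supports_la57() from a small test program shows whether a given machine would pass or trip the check discussed above. The decompressor cannot use such a header and has to issue CPUID itself.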

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-29 23:24                     ` H. Peter Anvin
@ 2017-11-30 10:12                       ` Borislav Petkov
  -1 siblings, 0 replies; 46+ messages in thread
From: Borislav Petkov @ 2017-11-30 10:12 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Kirill A. Shutemov, Thomas Gleixner, Kirill A. Shutemov,
	Ingo Molnar, x86, Linus Torvalds, Andy Lutomirski,
	Cyrill Gorcunov, Andi Kleen, linux-mm, linux-kernel

On Wed, Nov 29, 2017 at 03:24:53PM -0800, H. Peter Anvin wrote:
> Yes, Grub as a matter of policy(!) does everything in the most braindead
> way possible.  You have to use "linux16" or "linuxefi" to make it do
> something sane.

Good to know, thx.

> What is text mode?  It is hardware that is going away(*), and you don't
> even know if you have a display screen on your system at all, or how
> you'd have to configure your display hardware even if it is "mostly" VGA.

Ok, let me take a stab completely in the dark here: can we ask FW to
switch to some mode which is "suitable" for printing messages?

It would mean we'd have to switch back to real mode where we could do
something ala arch/x86/boot/bioscall.S

After we've printed something, we halt.

If there's no screen, we only halt - it's not like we can magically get
a fairy to connect a screen to the system.

> Well, it's not just limited to 5-level mode; it's kind of a general issue.
> We have had this issue for a very, very long time -- all the way back to
> i386 PAE at the very least.

I realize that, judging by your reaction. And yes, we should try to find
a proper solution here in the long run.

> I'm personally OK with triple-faulting the CPU in this case.

Except that is not really user-friendly, as I mentioned already. A message
could save other users a bunch of time looking for why TF the kernel
doesn't boot, only to realize they enabled an option which is not ready
yet. Which should have depended on BROKEN when it went upstream, btw.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-30  7:31             ` Kirill A. Shutemov
@ 2017-11-30 10:14               ` Borislav Petkov
  -1 siblings, 0 replies; 46+ messages in thread
From: Borislav Petkov @ 2017-11-30 10:14 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, x86,
	H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Andi Kleen, linux-mm, linux-kernel

On Thu, Nov 30, 2017 at 10:31:31AM +0300, Kirill A. Shutemov wrote:
> On Wed, Nov 29, 2017 at 05:48:51PM +0000, Borislav Petkov wrote:
> > On Wed, Nov 29, 2017 at 08:08:31PM +0300, Kirill A. Shutemov wrote:
> > > We're really early in the boot -- startup_64 in decompression code -- and
> > > > I don't know a way to print a message there. Is there a way?
> > > > 
> > > > no_longmode is handled by just hanging the machine. Is that enough for the
> > > > no_la57 case too?
> > 
> > Patch pls.
> 
> The patch below, applied on top of patch 2/4 from this patchset, would do the trick.
> 
> Please give it a shot.

Yap, that works. Thanks!

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
  2017-11-30  7:31             ` Kirill A. Shutemov
@ 2017-11-30 15:45               ` Joe Perches
  -1 siblings, 0 replies; 46+ messages in thread
From: Joe Perches @ 2017-11-30 15:45 UTC (permalink / raw)
  To: Kirill A. Shutemov, Borislav Petkov
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, x86,
	H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
	Andi Kleen, linux-mm, linux-kernel

On Thu, 2017-11-30 at 10:31 +0300, Kirill A. Shutemov wrote:
> On Wed, Nov 29, 2017 at 05:48:51PM +0000, Borislav Petkov wrote:
> > On Wed, Nov 29, 2017 at 08:08:31PM +0300, Kirill A. Shutemov wrote:
> > > We're really early in the boot -- startup_64 in decompression code -- and
> > > > I don't know a way to print a message there. Is there a way?
> > > > 
> > > > no_longmode is handled by just hanging the machine. Is that enough for the
> > > > no_la57 case too?
> > 
> > Patch pls.
> 
> The patch below, applied on top of patch 2/4 from this patchset, would do the trick.
> 
> Please give it a shot.
> 
> From 95b5489d1f4ea03c6226d13eb6797825234489d6 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Thu, 30 Nov 2017 10:23:53 +0300
> Subject: [PATCH] x86/boot/compressed/64: Print error if 5-level paging is not
>  supported
> 
> We cannot proceed booting if the machine doesn't support the paging mode
> the kernel was compiled for.
> 
> Getting an error the usual way -- via validate_cpu() -- is not going to
> work. We need to enable the appropriate paging mode before that, otherwise
> the kernel would triple-fault during KASLR setup.
> 
> This code will go away once we get support for boot-time switching
> between paging modes.

trivia:

> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
[]
> @@ -362,6 +364,13 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
>  	console_init();
>  	debug_putstr("early console in extract_kernel\n");
>  
> +	if (IS_ENABLED(CONFIG_X86_5LEVEL) && !l5_paging_required()) {
> +		error("The kernel is compiled with 5-level paging enabled, "
> +				"but the CPU doesn't support la57\n"

la57 is lanthanum, perhaps something less obscure or more
readily searchable?  Maybe cr4.la57?  it?

Maybe something like:

"This linux kernel as configured requires 5-level paging\n"
"This CPU does not support the required 'cr4.la57' feature\n"
"Unable to boot - please use a kernel appropriate for your CPU\n"

And please use complete coalesced single lines.

> +				"Unable to boot - please use "
> +				"a kernel appropriate for your CPU.\n");

Here too.  Thanks.

^ permalink raw reply	[flat|nested] 46+ messages in thread
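
The request for "complete coalesced single lines" can be illustrated with a small sketch. It keeps each output line of the message in one string literal, so that searching the source for any printed line finds it; adjacent literals are still joined by the compiler into a single string. The wording is the one suggested in the review above. The names la57_error_msg, count_lines and message_mentions are hypothetical and exist only for this illustration; in the patch the string would be passed to error().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical constant: one string literal per output line, none of them
 * split in the middle of a sentence.
 */
static const char la57_error_msg[] =
	"This linux kernel as configured requires 5-level paging\n"
	"This CPU does not support the required 'cr4.la57' feature\n"
	"Unable to boot - please use a kernel appropriate for your CPU\n";

/* Count how many newline-terminated lines a message contains. */
static inline size_t count_lines(const char *s)
{
	size_t n = 0;

	for (; *s; s++)
		if (*s == '\n')
			n++;
	return n;
}

/* Does the message contain this exact phrase, as a text search would want? */
static inline bool message_mentions(const char *msg, const char *phrase)
{
	return strstr(msg, phrase) != NULL;
}
```

With the original split ("please use " followed by "a kernel appropriate for your CPU.\n"), the runtime string is the same, but a search of the source tree for the phrase "please use a kernel" finds nothing, which is the practical reason for the request.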

end of thread, other threads:[~2017-11-30 15:45 UTC | newest]

Thread overview: 46+ messages
2017-11-10 22:06 [PATCHv2 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
2017-11-10 22:06 ` Kirill A. Shutemov
2017-11-10 22:06 ` [PATCHv2 1/4] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c Kirill A. Shutemov
2017-11-10 22:06   ` Kirill A. Shutemov
2017-11-10 22:06 ` [PATCHv2 2/4] x86/boot/compressed/64: Detect and handle 5-level paging at boot-time Kirill A. Shutemov
2017-11-10 22:06   ` Kirill A. Shutemov
2017-11-10 22:06 ` [PATCHv2 3/4] x86/boot/compressed/64: Introduce place_trampoline() Kirill A. Shutemov
2017-11-10 22:06   ` Kirill A. Shutemov
2017-11-10 22:06 ` [PATCHv2 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G Kirill A. Shutemov
2017-11-10 22:06   ` Kirill A. Shutemov
2017-11-22  8:09 ` [PATCHv2 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
2017-11-22  8:09   ` Kirill A. Shutemov
2017-11-29 15:49 ` Borislav Petkov
2017-11-29 15:49   ` Borislav Petkov
2017-11-29 16:13   ` Kirill A. Shutemov
2017-11-29 16:13     ` Kirill A. Shutemov
2017-11-29 16:40     ` Thomas Gleixner
2017-11-29 16:40       ` Thomas Gleixner
2017-11-29 17:08       ` Kirill A. Shutemov
2017-11-29 17:08         ` Kirill A. Shutemov
2017-11-29 17:48         ` Borislav Petkov
2017-11-29 17:48           ` Borislav Petkov
2017-11-29 19:01           ` H. Peter Anvin
2017-11-29 19:01             ` H. Peter Anvin
2017-11-29 19:19             ` Borislav Petkov
2017-11-29 19:19               ` Borislav Petkov
2017-11-29 21:33               ` H. Peter Anvin
2017-11-29 21:33                 ` H. Peter Anvin
2017-11-29 22:31                 ` Borislav Petkov
2017-11-29 22:31                   ` Borislav Petkov
2017-11-29 23:24                   ` H. Peter Anvin
2017-11-29 23:24                     ` H. Peter Anvin
2017-11-30  1:27                     ` Konrad Rzeszutek Wilk
2017-11-30  1:27                       ` Konrad Rzeszutek Wilk
2017-11-30 10:12                     ` Borislav Petkov
2017-11-30 10:12                       ` Borislav Petkov
2017-11-30  7:31           ` Kirill A. Shutemov
2017-11-30  7:31             ` Kirill A. Shutemov
2017-11-30 10:14             ` Borislav Petkov
2017-11-30 10:14               ` Borislav Petkov
2017-11-30 15:45             ` Joe Perches
2017-11-30 15:45               ` Joe Perches
2017-11-29 20:58         ` Andi Kleen
2017-11-29 20:58           ` Andi Kleen
2017-11-29 21:03           ` hpa
2017-11-29 21:03             ` hpa
