* [PATCHv2 0/4] x86: 5-level related changes into decompression code
@ 2017-11-10 22:06 Kirill A. Shutemov
2017-11-10 22:06 ` [PATCHv2 1/4] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c Kirill A. Shutemov
` (5 more replies)
0 siblings, 6 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2017-11-10 22:06 UTC (permalink / raw)
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
Kirill A. Shutemov
Hi Ingo,
Here are updated changes that prepare the code for boot-time switching between
paging modes and handle booting in 5-level mode when the bootloader puts the
kernel image above 4G but hasn't enabled 5-level paging for us.
I've updated the patches based on your feedback.
Please review and consider applying.
Kirill A. Shutemov (4):
x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c
x86/boot/compressed/64: Detect and handle 5-level paging at boot-time
x86/boot/compressed/64: Introduce place_trampoline()
x86/boot/compressed/64: Handle 5-level paging boot if kernel is above
4G
arch/x86/boot/compressed/Makefile | 3 +-
arch/x86/boot/compressed/head_64.S | 108 +++++++++++++--------
.../boot/compressed/{pagetable.c => kaslr_64.c} | 0
arch/x86/boot/compressed/pgtable.h | 18 ++++
arch/x86/boot/compressed/pgtable_64.c | 61 ++++++++++++
5 files changed, 150 insertions(+), 40 deletions(-)
rename arch/x86/boot/compressed/{pagetable.c => kaslr_64.c} (100%)
create mode 100644 arch/x86/boot/compressed/pgtable.h
create mode 100644 arch/x86/boot/compressed/pgtable_64.c
--
2.14.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCHv2 1/4] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c
2017-11-10 22:06 [PATCHv2 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
@ 2017-11-10 22:06 ` Kirill A. Shutemov
2017-11-10 22:06 ` [PATCHv2 2/4] x86/boot/compressed/64: Detect and handle 5-level paging at boot-time Kirill A. Shutemov
` (4 subsequent siblings)
5 siblings, 0 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2017-11-10 22:06 UTC (permalink / raw)
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
Kirill A. Shutemov
The name of the file -- pagetable.c -- is misleading: it only contains
helpers used for KASLR in 64-bit mode.
Let's rename the file to reflect its content.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/boot/compressed/Makefile | 2 +-
arch/x86/boot/compressed/{pagetable.c => kaslr_64.c} | 0
2 files changed, 1 insertion(+), 1 deletion(-)
rename arch/x86/boot/compressed/{pagetable.c => kaslr_64.c} (100%)
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 1e9c322e973a..ae0be0b923e1 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -78,7 +78,7 @@ vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr.o
ifdef CONFIG_X86_64
- vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/pagetable.o
+ vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr_64.o
vmlinux-objs-y += $(obj)/mem_encrypt.o
endif
diff --git a/arch/x86/boot/compressed/pagetable.c b/arch/x86/boot/compressed/kaslr_64.c
similarity index 100%
rename from arch/x86/boot/compressed/pagetable.c
rename to arch/x86/boot/compressed/kaslr_64.c
--
2.14.2
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCHv2 2/4] x86/boot/compressed/64: Detect and handle 5-level paging at boot-time
2017-11-10 22:06 [PATCHv2 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
2017-11-10 22:06 ` [PATCHv2 1/4] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c Kirill A. Shutemov
@ 2017-11-10 22:06 ` Kirill A. Shutemov
2017-11-10 22:06 ` [PATCHv2 3/4] x86/boot/compressed/64: Introduce place_trampoline() Kirill A. Shutemov
` (3 subsequent siblings)
5 siblings, 0 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2017-11-10 22:06 UTC (permalink / raw)
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
Kirill A. Shutemov
This patch prepares the decompression code for boot-time switching between
4- and 5-level paging.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/boot/compressed/Makefile | 1 +
arch/x86/boot/compressed/head_64.S | 16 ++++++++++++----
arch/x86/boot/compressed/pgtable_64.c | 18 ++++++++++++++++++
3 files changed, 31 insertions(+), 4 deletions(-)
create mode 100644 arch/x86/boot/compressed/pgtable_64.c
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index ae0be0b923e1..1f734cd98fd3 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -80,6 +80,7 @@ vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr.o
ifdef CONFIG_X86_64
vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr_64.o
vmlinux-objs-y += $(obj)/mem_encrypt.o
+ vmlinux-objs-y += $(obj)/pgtable_64.o
endif
$(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 20919b4f3133..fc313e29fe2c 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -305,10 +305,18 @@ ENTRY(startup_64)
leaq boot_stack_end(%rbx), %rsp
#ifdef CONFIG_X86_5LEVEL
- /* Check if 5-level paging has already enabled */
- movq %cr4, %rax
- testl $X86_CR4_LA57, %eax
- jnz lvl5
+ /*
+ * Check if we need to enable 5-level paging.
+ * RSI holds real mode data and need to be preserved across
+ * a function call.
+ */
+ pushq %rsi
+ call l5_paging_required
+ popq %rsi
+
+ /* If l5_paging_required() returned zero, we're done here. */
+ cmpq $0, %rax
+ je lvl5
/*
* At this point we are in long mode with 4-level paging enabled,
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
new file mode 100644
index 000000000000..eed3a2c3b577
--- /dev/null
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -0,0 +1,18 @@
+#include <asm/processor.h>
+
+int l5_paging_required(void)
+{
+ /* Check i leaf 7 is supported. */
+ if (native_cpuid_eax(0) < 7)
+ return 0;
+
+ /* Check if la57 is supported. */
+ if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
+ return 0;
+
+ /* Check if 5-level paging has already been enabled. */
+ if (native_read_cr4() & X86_CR4_LA57)
+ return 0;
+
+ return 1;
+}
--
2.14.2
^ permalink raw reply related [flat|nested] 23+ messages in thread
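For readers following along, the decision made by l5_paging_required() in the patch above can be modelled as a pure function of three inputs. This is only a sketch, not the kernel code: the name needs_la57_switch() and its parameters are hypothetical, and the raw register values are passed in as arguments rather than read with CPUID or from CR4. The bit positions are the architectural ones (CPUID leaf 7, subleaf 0, ECX bit 16 for LA57; CR4 bit 12 for CR4.LA57).

```c
#include <stdint.h>

#define CPUID_7_ECX_LA57  (UINT32_C(1) << 16)  /* CPUID.(EAX=7,ECX=0):ECX[16] */
#define CR4_LA57          (UINT64_C(1) << 12)  /* CR4.LA57 */

/*
 * Hypothetical model of l5_paging_required(): returns 1 only when the CPU
 * supports LA57 and 5-level paging has not been enabled yet.
 *
 * max_basic_leaf: EAX returned by CPUID leaf 0
 * leaf7_ecx:      ECX returned by CPUID leaf 7, subleaf 0
 * cr4:            current value of CR4
 */
static int needs_la57_switch(uint32_t max_basic_leaf, uint32_t leaf7_ecx,
			     uint64_t cr4)
{
	/* Leaf 7 must exist before its ECX value can be trusted. */
	if (max_basic_leaf < 7)
		return 0;

	/* The CPU must advertise LA57. */
	if (!(leaf7_ecx & CPUID_7_ECX_LA57))
		return 0;

	/* Nothing to do if the bootloader already enabled 5-level paging. */
	if (cr4 & CR4_LA57)
		return 0;

	return 1;
}
```

The order of the checks mirrors the patch: leaf availability first, then the feature bit, then the current CR4 state.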
* [PATCHv2 3/4] x86/boot/compressed/64: Introduce place_trampoline()
2017-11-10 22:06 [PATCHv2 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
2017-11-10 22:06 ` [PATCHv2 1/4] x86/boot/compressed/64: Rename pagetable.c to kaslr_64.c Kirill A. Shutemov
2017-11-10 22:06 ` [PATCHv2 2/4] x86/boot/compressed/64: Detect and handle 5-level paging at boot-time Kirill A. Shutemov
@ 2017-11-10 22:06 ` Kirill A. Shutemov
2017-11-10 22:06 ` [PATCHv2 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G Kirill A. Shutemov
` (2 subsequent siblings)
5 siblings, 0 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2017-11-10 22:06 UTC (permalink / raw)
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
Kirill A. Shutemov
If the bootloader enables 64-bit mode with 4-level paging, we might need to
switch over to 5-level paging. The switch requires disabling paging.
It works fine if the kernel itself is loaded below 4G.
If the bootloader puts the kernel above 4G (not sure if anybody does this),
we would lose control as soon as paging is disabled, as the code becomes
unreachable.
To handle the situation, we need a trampoline in lower memory that would
take care of switching on 5-level paging.
Apart from the trampoline itself, we also need a place to store the top-level
page table in lower memory, as we don't have a way to load a 64-bit value into
CR3 from 32-bit mode. We only really need 8 bytes there, as we only use
the very first entry of the page table, but we allocate a whole page
anyway. We cannot put the code in the same page because there's a hazard that
a CPU would read the page table speculatively and get confused by seeing
garbage.
This patch introduces paging_prepare(), which checks if we need to enable
5-level paging, then finds the right spot in lower memory for the trampoline,
copies the trampoline code there and sets up a new top-level page table for
5-level paging.
At this point we do all the preparation, but do not yet use the trampoline.
That will be done in the following patch.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/boot/compressed/head_64.S | 54 ++++++++++++++++-------------
arch/x86/boot/compressed/pgtable.h | 18 ++++++++++
arch/x86/boot/compressed/pgtable_64.c | 65 +++++++++++++++++++++++++++++------
3 files changed, 103 insertions(+), 34 deletions(-)
create mode 100644 arch/x86/boot/compressed/pgtable.h
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index fc313e29fe2c..33a47d5c6445 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -304,33 +304,45 @@ ENTRY(startup_64)
/* Set up the stack */
leaq boot_stack_end(%rbx), %rsp
-#ifdef CONFIG_X86_5LEVEL
- /*
- * Check if we need to enable 5-level paging.
- * RSI holds real mode data and need to be preserved across
- * a function call.
- */
- pushq %rsi
- call l5_paging_required
- popq %rsi
-
- /* If l5_paging_required() returned zero, we're done here. */
- cmpq $0, %rax
- je lvl5
-
/*
* At this point we are in long mode with 4-level paging enabled,
- * but we want to enable 5-level paging.
+ * but we might want to enable 5-level paging.
*
* The problem is that we cannot do it directly. Setting LA57 in
* long mode would trigger #GP. So we need to switch off long mode
* first.
*
- * NOTE: This is not going to work if bootloader put us above 4G
- * limit.
+ * We also need trampoline in lower memory to switch from 4- to 5-level
+ * paging for cases when bootloader put kernel above 4G, but didn't
+ * enable 5-level paging for us.
+ *
+ * For trampoline, we have to have top page table in lower memory as we
+ * don't have a way to load 64-bit value into CR3 from 32-bit mode.
+ *
+ * We go through trampoline even if we don't have to: if we're already
+ * in 5-level paging mode or if we don't need to switch to it. This way
+ * the trampoline code gets tested not only in special rare case, but
+ * on every boot.
+ */
+
+ /*
+ * paging_prepare() would setup trampoline and check if we need to
+ * enable 5-level paging.
+ *
+ * Address of trampoline is returned in RAX. The bit 0 is used to
+ * encode if we need to enable 5-level paging.
*
- * The first step is go into compatibility mode.
+ * RSI holds real mode data and need to be preserved across
+ * a function call.
*/
+ pushq %rsi
+ call paging_prepare
+ popq %rsi
+ movq %rax, %rcx
+ andq $(~1UL), %rcx
+
+ testq $1, %rax
+ jz lvl5
/* Clear additional page table */
leaq lvl5_pgtable(%rbx), %rdi
@@ -352,7 +364,6 @@ ENTRY(startup_64)
pushq %rax
lretq
lvl5:
-#endif
/* Zero EFLAGS */
pushq $0
@@ -490,7 +501,7 @@ relocated:
jmp *%rax
.code32
-#ifdef CONFIG_X86_5LEVEL
+ENTRY(trampoline_32bit_src)
compatible_mode:
/* Setup data and stack segments */
movl $__KERNEL_DS, %eax
@@ -526,7 +537,6 @@ compatible_mode:
movl %eax, %cr0
lret
-#endif
no_longmode:
/* This isn't an x86-64 CPU so hang */
@@ -585,7 +595,5 @@ boot_stack_end:
.balign 4096
pgtable:
.fill BOOT_PGT_SIZE, 1, 0
-#ifdef CONFIG_X86_5LEVEL
lvl5_pgtable:
.fill PAGE_SIZE, 1, 0
-#endif
diff --git a/arch/x86/boot/compressed/pgtable.h b/arch/x86/boot/compressed/pgtable.h
new file mode 100644
index 000000000000..0261d4ab62e6
--- /dev/null
+++ b/arch/x86/boot/compressed/pgtable.h
@@ -0,0 +1,18 @@
+#ifndef BOOT_COMPRESSED_PAGETABLE_H
+#define BOOT_COMPRESSED_PAGETABLE_H
+
+#define TRAMPOLINE_32BIT_SIZE (2 * PAGE_SIZE)
+
+#define TRAMPOLINE_32BIT_PGTABLE_OFF 0
+
+#define TRAMPOLINE_32BIT_CODE_OFF PAGE_SIZE
+#define TRAMPOLINE_32BIT_CODE_SIZE 0x50
+
+#define TRAMPOLINE_32BIT_STACK_END TRAMPOLINE_32BIT_SIZE
+
+#ifndef __ASSEMBLER__
+
+extern void (*trampoline_32bit_src)(void *return_ptr);
+
+#endif /* __ASSEMBLER__ */
+#endif /* BOOT_COMPRESSED_PAGETABLE_H */
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index eed3a2c3b577..a2ab6b9cf258 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -1,18 +1,61 @@
#include <asm/processor.h>
+#include "pgtable.h"
+#include "../string.h"
-int l5_paging_required(void)
+#define BIOS_START_MIN 0x20000U /* 128K, less than this is insane */
+#define BIOS_START_MAX 0x9f000U /* 640K, absolute maximum */
+
+unsigned long paging_prepare(void)
{
- /* Check i leaf 7 is supported. */
- if (native_cpuid_eax(0) < 7)
- return 0;
+ unsigned long bios_start, ebda_start, trampoline_start, *trampoline;
+ int l5_required = 0;
+
+ /* Check if la57 is desired and supported */
+ if (IS_ENABLED(CONFIG_X86_5LEVEL) && native_cpuid_eax(0) >= 7 &&
+ (native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
+ l5_required = 1;
+
+ /*
+ * Find suitable spot for trampoline.
+ * Based on reserve_bios_regions().
+ */
+
+ ebda_start = *(unsigned short *)0x40e << 4;
+ bios_start = *(unsigned short *)0x413 << 10;
+
+ if (bios_start < BIOS_START_MIN || bios_start > BIOS_START_MAX)
+ bios_start = BIOS_START_MAX;
+
+ if (ebda_start > BIOS_START_MIN && ebda_start < bios_start)
+ bios_start = ebda_start;
+
+ /* Place trampoline below end of low memory, aligned to 4k */
+ trampoline_start = bios_start - TRAMPOLINE_32BIT_SIZE;
+ trampoline_start = round_down(trampoline_start, PAGE_SIZE);
+
+ trampoline = (unsigned long *)trampoline_start;
+
+ /* Clear trampoline memory first */
+ memset(trampoline, 0, TRAMPOLINE_32BIT_SIZE);
- /* Check if la57 is supported. */
- if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
- return 0;
+ /* Copy trampoline code in place */
+ memcpy(trampoline + TRAMPOLINE_32BIT_CODE_OFF / sizeof(unsigned long),
+ &trampoline_32bit_src, TRAMPOLINE_32BIT_CODE_SIZE);
- /* Check if 5-level paging has already been enabled. */
- if (native_read_cr4() & X86_CR4_LA57)
- return 0;
+ if (l5_required) {
+ /*
+ * For 5-level paging setup current CR3 as the first and the
+ * only entry in a new top level page table.
+ */
+ trampoline[0] = __read_cr3() + _PAGE_TABLE_NOENC;
+ } else {
+ /*
+ * For 4-level paging, copy current top-level page table.
+ * It might be above 4G and be unaccessible from 32-bit mode.
+ */
+ memcpy(trampoline, (void *)__read_cr3(), PAGE_SIZE);
+ }
- return 1;
+ /* Bit 0 is used to encode if 5-level paging is required */
+ return trampoline_start | l5_required;
}
--
2.14.2
^ permalink raw reply related [flat|nested] 23+ messages in thread
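The placement arithmetic in paging_prepare() and the bit-0 encoding of its return value can be tried out in isolation. The sketch below is an assumption-laden model rather than the patch itself: trampoline_place(), trampoline_base() and trampoline_wants_l5() are hypothetical names, and the two BIOS Data Area words (the EBDA segment at 0x40e and the base memory size in KiB at 0x413) are passed in as arguments instead of being read from low memory. The constants are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE              0x1000ul
#define TRAMPOLINE_32BIT_SIZE  (2 * PAGE_SIZE)
#define BIOS_START_MIN         0x20000ul  /* 128K, less than this is insane */
#define BIOS_START_MAX         0x9f000ul  /* 640K, absolute maximum */

/*
 * Hypothetical model of the placement logic in paging_prepare().
 * bda_ebda_seg: 16-bit word stored at 0x40e (EBDA segment)
 * bda_base_kb:  16-bit word stored at 0x413 (base memory size in KiB)
 */
static unsigned long trampoline_place(uint16_t bda_ebda_seg,
				      uint16_t bda_base_kb, int l5_required)
{
	unsigned long ebda_start = (unsigned long)bda_ebda_seg << 4;
	unsigned long bios_start = (unsigned long)bda_base_kb << 10;
	unsigned long start;

	/* Fall back to the maximum if the reported value is implausible. */
	if (bios_start < BIOS_START_MIN || bios_start > BIOS_START_MAX)
		bios_start = BIOS_START_MAX;

	/* Stay below the EBDA if it sits under the end of low memory. */
	if (ebda_start > BIOS_START_MIN && ebda_start < bios_start)
		bios_start = ebda_start;

	/* Two pages below the end of usable low memory, aligned to 4k. */
	start = (bios_start - TRAMPOLINE_32BIT_SIZE) & ~(PAGE_SIZE - 1);

	/* Page alignment guarantees that bit 0 is free to carry the flag. */
	assert((start & 1) == 0);
	return start | (l5_required ? 1ul : 0ul);
}

/* Decoding helpers, matching the andq/testq pair in head_64.S. */
static unsigned long trampoline_base(unsigned long ret)
{
	return ret & ~1ul;
}

static int trampoline_wants_l5(unsigned long ret)
{
	return (int)(ret & 1);
}
```

With an EBDA segment of 0x8000 and 640 KiB of base memory, for instance, the model clamps bios_start to 0x9f000, lowers it to the EBDA at 0x80000 and places the trampoline at 0x7e000.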
* [PATCHv2 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G
2017-11-10 22:06 [PATCHv2 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
` (2 preceding siblings ...)
2017-11-10 22:06 ` [PATCHv2 3/4] x86/boot/compressed/64: Introduce place_trampoline() Kirill A. Shutemov
@ 2017-11-10 22:06 ` Kirill A. Shutemov
2017-11-22 8:09 ` [PATCHv2 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
2017-11-29 15:49 ` Borislav Petkov
5 siblings, 0 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2017-11-10 22:06 UTC (permalink / raw)
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Borislav Petkov, Andi Kleen, linux-mm, linux-kernel,
Kirill A. Shutemov
This patch addresses a shortcoming in the current boot process on machines
that support 5-level paging.
If the bootloader enables 64-bit mode with 4-level paging, we need to
switch over to 5-level paging. The switch requires disabling paging.
It works fine if the kernel itself is loaded below 4G.
If the bootloader puts the kernel above 4G (not sure if anybody does this),
we would lose control as soon as paging is disabled, as the code becomes
unreachable.
This patch implements a trampoline in lower memory to handle this
situation.
We only need the memory for a very short time, until the main kernel image
sets up its own page tables.
We go through the trampoline even if we don't have to: if we're already in
5-level paging mode or if we don't need to switch to it. This way the
trampoline code gets tested not only in the special rare case, but on every
boot.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/boot/compressed/head_64.S | 72 +++++++++++++++++++++++---------------
1 file changed, 43 insertions(+), 29 deletions(-)
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 33a47d5c6445..525972ca27b7 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -33,6 +33,7 @@
#include <asm/processor-flags.h>
#include <asm/asm-offsets.h>
#include <asm/bootparam.h>
+#include "pgtable.h"
/*
* Locally defined symbols should be marked hidden:
@@ -339,31 +340,22 @@ ENTRY(startup_64)
call paging_prepare
popq %rsi
movq %rax, %rcx
- andq $(~1UL), %rcx
-
- testq $1, %rax
- jz lvl5
-
- /* Clear additional page table */
- leaq lvl5_pgtable(%rbx), %rdi
- xorq %rax, %rax
- movq $(PAGE_SIZE/8), %rcx
- rep stosq
/*
- * Setup current CR3 as the first and only entry in a new top level
- * page table.
+ * Load address of trampoline_return into RDI.
+ * It will be used by trampoline to return to main code.
*/
- movq %cr3, %rdi
- leaq 0x7 (%rdi), %rax
- movq %rax, lvl5_pgtable(%rbx)
+ leaq trampoline_return(%rip), %rdi
/* Switch to compatibility mode (CS.L = 0 CS.D = 1) via far return */
pushq $__KERNEL32_CS
- leaq compatible_mode(%rip), %rax
+ andq $(~1UL), %rax /* Clear bit 0: encode if 5-level paging needed */
+ leaq TRAMPOLINE_32BIT_CODE_OFF(%rax), %rax
pushq %rax
lretq
-lvl5:
+trampoline_return:
+ /* Restore stack, 32-bit trampoline uses own stack */
+ leaq boot_stack_end(%rbx), %rsp
/* Zero EFLAGS */
pushq $0
@@ -501,36 +493,51 @@ relocated:
jmp *%rax
.code32
+/*
+ * This is 32-bit trampoline that will be copied over to low memory.
+ *
+ * RDI contains return address (might be above 4G).
+ * ECX contains the base address of trampoline memory.
+ * Bit 0 of ECX encodes if 5-level paging is required.
+ */
ENTRY(trampoline_32bit_src)
-compatible_mode:
/* Setup data and stack segments */
movl $__KERNEL_DS, %eax
movl %eax, %ds
movl %eax, %ss
+ movl %ecx, %edx
+ andl $(~1UL), %edx
+
+ /* Setup new stack at the end of trampoline memory */
+ leal TRAMPOLINE_32BIT_STACK_END (%edx), %esp
+
/* Disable paging */
movl %cr0, %eax
btrl $X86_CR0_PG_BIT, %eax
movl %eax, %cr0
- /* Point CR3 to 5-level paging */
- leal lvl5_pgtable(%ebx), %eax
+ /* Point CR3 to trampoline top level page table */
+ leal TRAMPOLINE_32BIT_PGTABLE_OFF (%edx), %eax
movl %eax, %cr3
/* Enable PAE and LA57 mode */
movl %cr4, %eax
- orl $(X86_CR4_PAE | X86_CR4_LA57), %eax
+ orl $X86_CR4_PAE, %eax
+
+ /* Bit 0 of ECX encodes if 5-level paging is required */
+ testl $1, %ecx
+ jz 1f
+ orl $X86_CR4_LA57, %eax
+1:
movl %eax, %cr4
- /* Calculate address we are running at */
- call 1f
-1: popl %edi
- subl $1b, %edi
+ /* Calculate address of paging_enabled once we are in trampoline */
+ leal paging_enabled - trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_OFF (%edx), %eax
/* Prepare stack for far return to Long Mode */
pushl $__KERNEL_CS
- leal lvl5(%edi), %eax
- push %eax
+ pushl %eax
/* Enable paging back */
movl $(X86_CR0_PG | X86_CR0_PE), %eax
@@ -538,6 +545,15 @@ compatible_mode:
lret
+ .code64
+paging_enabled:
+ /* Return from trampoline */
+ jmp *%rdi
+
+ /* Bound size of trampoline code */
+ .org trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_SIZE
+
+ .code32
no_longmode:
/* This isn't an x86-64 CPU so hang */
1:
@@ -595,5 +611,3 @@ boot_stack_end:
.balign 4096
pgtable:
.fill BOOT_PGT_SIZE, 1, 0
-lvl5_pgtable:
- .fill PAGE_SIZE, 1, 0
--
2.14.2
^ permalink raw reply related [flat|nested] 23+ messages in thread
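What the 32-bit trampoline derives from ECX can be summarised in a few lines of C. This is a hedged sketch of the arithmetic only, not of the mode switch itself: struct trampoline_plan and plan_trampoline() are hypothetical names introduced for illustration, the offsets come from pgtable.h in patch 3, and the CR4 bit positions (PAE is bit 5, LA57 is bit 12) are the architectural ones.

```c
#include <stdint.h>

#define PAGE_SIZE                     0x1000u
#define TRAMPOLINE_32BIT_SIZE         (2 * PAGE_SIZE)
#define TRAMPOLINE_32BIT_PGTABLE_OFF  0u
#define TRAMPOLINE_32BIT_CODE_OFF     PAGE_SIZE
#define TRAMPOLINE_32BIT_STACK_END    TRAMPOLINE_32BIT_SIZE

#define CR4_PAE   (1u << 5)
#define CR4_LA57  (1u << 12)

/* Hypothetical summary of the values the trampoline loads. */
struct trampoline_plan {
	uint32_t cr3;    /* top-level page table: first trampoline page */
	uint32_t esp;    /* stack grows down from the end of the area */
	uint32_t entry;  /* copied trampoline code: second page */
	uint32_t cr4;    /* CR4 value written before paging is re-enabled */
};

/*
 * ecx:     value returned by paging_prepare(); bit 0 is the LA57 flag
 * old_cr4: CR4 as found on entry to the trampoline
 */
static struct trampoline_plan plan_trampoline(uint32_t ecx, uint32_t old_cr4)
{
	uint32_t base = ecx & ~1u;  /* strip the flag to get the base address */
	struct trampoline_plan p;

	p.cr3 = base + TRAMPOLINE_32BIT_PGTABLE_OFF;
	p.esp = base + TRAMPOLINE_32BIT_STACK_END;
	p.entry = base + TRAMPOLINE_32BIT_CODE_OFF;

	/* PAE is always needed for long mode; LA57 only when requested. */
	p.cr4 = old_cr4 | CR4_PAE;
	if (ecx & 1)
		p.cr4 |= CR4_LA57;

	return p;
}
```

Keeping the page table in the first page and the code in the second matches the commit message of patch 3, which avoids sharing a page between code and the page table.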
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-10 22:06 [PATCHv2 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
` (3 preceding siblings ...)
2017-11-10 22:06 ` [PATCHv2 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G Kirill A. Shutemov
@ 2017-11-22 8:09 ` Kirill A. Shutemov
2017-11-29 15:49 ` Borislav Petkov
5 siblings, 0 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2017-11-22 8:09 UTC (permalink / raw)
To: Ingo Molnar
Cc: Kirill A. Shutemov, x86, Thomas Gleixner, H. Peter Anvin,
Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Borislav Petkov, Andi Kleen, linux-mm, linux-kernel
On Sat, Nov 11, 2017 at 01:06:41AM +0300, Kirill A. Shutemov wrote:
> Hi Ingo,
>
> Here's updated changes that prepare the code to boot-time switching between
> paging modes and handle booting in 5-level mode when bootloader put kernel
> image above 4G, but haven't enabled 5-level paging for us.
>
> I've updated patches based on your feedback.
>
> Please review and consider applying.
Gentle ping.
--
Kirill A. Shutemov
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-10 22:06 [PATCHv2 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
` (4 preceding siblings ...)
2017-11-22 8:09 ` [PATCHv2 0/4] x86: 5-level related changes into decompression code Kirill A. Shutemov
@ 2017-11-29 15:49 ` Borislav Petkov
2017-11-29 16:13 ` Kirill A. Shutemov
5 siblings, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2017-11-29 15:49 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin,
Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov, Andi Kleen,
linux-mm, linux-kernel
On Sat, Nov 11, 2017 at 01:06:41AM +0300, Kirill A. Shutemov wrote:
> Hi Ingo,
>
> Here's updated changes that prepare the code to boot-time switching between
> paging modes and handle booting in 5-level mode when bootloader put kernel
> image above 4G, but haven't enabled 5-level paging for us.
Btw, if I enable CONFIG_X86_5LEVEL with 4.15-rc1 on an AMD box, the box
triple-faults and ends up spinning in a reboot loop. Even though it
should say:
early console in setup code
This kernel requires the following features not present on the CPU:
la57
Unable to boot - please use a kernel appropriate for your CPU.
and halt.
A kvm guest still does that but baremetal triple-faults.
Ideas?
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 15:49 ` Borislav Petkov
@ 2017-11-29 16:13 ` Kirill A. Shutemov
2017-11-29 16:40 ` Thomas Gleixner
0 siblings, 1 reply; 23+ messages in thread
From: Kirill A. Shutemov @ 2017-11-29 16:13 UTC (permalink / raw)
To: Borislav Petkov
Cc: Kirill A. Shutemov, Ingo Molnar, x86, Thomas Gleixner,
H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Andi Kleen, linux-mm, linux-kernel
On Wed, Nov 29, 2017 at 04:49:08PM +0100, Borislav Petkov wrote:
> On Sat, Nov 11, 2017 at 01:06:41AM +0300, Kirill A. Shutemov wrote:
> > Hi Ingo,
> >
> > Here's updated changes that prepare the code to boot-time switching between
> > paging modes and handle booting in 5-level mode when bootloader put kernel
> > image above 4G, but haven't enabled 5-level paging for us.
>
> Btw, if I enable CONFIG_X86_5LEVEL with 4.15-rc1 on an AMD box, the box
> triple-faults and ends up spinning in a reboot loop. Even though it
> should say:
>
> early console in setup code
> This kernel requires the following features not present on the CPU:
> la57
> Unable to boot - please use a kernel appropriate for your CPU.
>
> and halt.
>
> A kvm guest still does that but baremetal triple-faults.
>
> Ideas?
Looks like we call check_cpuflags() too late: 5-level paging gets enabled
before image decompression starts.
For qemu/kvm it works because it's supported in softmmu, even if it's not
advertised in cpuid.
I'm not sure it's worth fixing on its own. I would rather get the boot-time
switching code upstream sooner. That will make the problem go away naturally.
--
Kirill A. Shutemov
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 16:13 ` Kirill A. Shutemov
@ 2017-11-29 16:40 ` Thomas Gleixner
2017-11-29 17:08 ` Kirill A. Shutemov
0 siblings, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2017-11-29 16:40 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Borislav Petkov, Kirill A. Shutemov, Ingo Molnar, x86,
H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Andi Kleen, linux-mm, linux-kernel
On Wed, 29 Nov 2017, Kirill A. Shutemov wrote:
> On Wed, Nov 29, 2017 at 04:49:08PM +0100, Borislav Petkov wrote:
> > On Sat, Nov 11, 2017 at 01:06:41AM +0300, Kirill A. Shutemov wrote:
> > > Hi Ingo,
> > >
> > > Here's updated changes that prepare the code to boot-time switching between
> > > paging modes and handle booting in 5-level mode when bootloader put kernel
> > > image above 4G, but haven't enabled 5-level paging for us.
> >
> > Btw, if I enable CONFIG_X86_5LEVEL with 4.15-rc1 on an AMD box, the box
> > triple-faults and ends up spinning in a reboot loop. Even though it
> > should say:
> >
> > early console in setup code
> > This kernel requires the following features not present on the CPU:
> > la57
> > Unable to boot - please use a kernel appropriate for your CPU.
> >
> > and halt.
> >
> > A kvm guest still does that but baremetal triple-faults.
> >
> > Ideas?
>
> Looks like we call check_cpuflags() too late. 5-level paging gets enabled
> before image decompression started.
>
> For qemu/kvm it works because it's supported in softmmu, even if not
> advertised in cpuid.
>
> I'm not sure if it worth fixing on its own. I would rather get boot-time
> switching code upstream sooner. It will get problem go away naturally.
It needs to be fixed now, because that problem exists in 4.14.
Thanks,
tglx
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 16:40 ` Thomas Gleixner
@ 2017-11-29 17:08 ` Kirill A. Shutemov
2017-11-29 17:48 ` Borislav Petkov
2017-11-29 20:58 ` Andi Kleen
0 siblings, 2 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2017-11-29 17:08 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Borislav Petkov, Kirill A. Shutemov, Ingo Molnar, x86,
H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Andi Kleen, linux-mm, linux-kernel
On Wed, Nov 29, 2017 at 05:40:32PM +0100, Thomas Gleixner wrote:
> On Wed, 29 Nov 2017, Kirill A. Shutemov wrote:
>
> > On Wed, Nov 29, 2017 at 04:49:08PM +0100, Borislav Petkov wrote:
> > > On Sat, Nov 11, 2017 at 01:06:41AM +0300, Kirill A. Shutemov wrote:
> > > > Hi Ingo,
> > > >
> > > > Here's updated changes that prepare the code to boot-time switching between
> > > > paging modes and handle booting in 5-level mode when bootloader put kernel
> > > > image above 4G, but haven't enabled 5-level paging for us.
> > >
> > > Btw, if I enable CONFIG_X86_5LEVEL with 4.15-rc1 on an AMD box, the box
> > > triple-faults and ends up spinning in a reboot loop. Even though it
> > > should say:
> > >
> > > early console in setup code
> > > This kernel requires the following features not present on the CPU:
> > > la57
> > > Unable to boot - please use a kernel appropriate for your CPU.
> > >
> > > and halt.
> > >
> > > A kvm guest still does that but baremetal triple-faults.
> > >
> > > Ideas?
> >
> > Looks like we call check_cpuflags() too late. 5-level paging gets enabled
> > before image decompression started.
> >
> > For qemu/kvm it works because it's supported in softmmu, even if not
> > advertised in cpuid.
> >
> > I'm not sure if it worth fixing on its own. I would rather get boot-time
> > switching code upstream sooner. It will get problem go away naturally.
>
> It needs to be fixed now. Because that problem exists in 4.14
Okay.
We're really early in the boot -- startup_64 in decompression code -- and
I don't know a way to print a message there. Is there a way?
no_longmode is handled by just hanging the machine. Is that enough for the
no_la57 case too?
--
Kirill A. Shutemov
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 17:08 ` Kirill A. Shutemov
@ 2017-11-29 17:48 ` Borislav Petkov
2017-11-29 19:01 ` H. Peter Anvin
2017-11-30 7:31 ` Kirill A. Shutemov
2017-11-29 20:58 ` Andi Kleen
1 sibling, 2 replies; 23+ messages in thread
From: Borislav Petkov @ 2017-11-29 17:48 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Kirill A. Shutemov, Ingo Molnar, x86,
H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Andi Kleen, linux-mm, linux-kernel
On Wed, Nov 29, 2017 at 08:08:31PM +0300, Kirill A. Shutemov wrote:
> We're really early in the boot -- startup_64 in decompression code -- and
> I don't know a way print a message there. Is there a way?
>
> no_longmode handled by just hanging the machine. Is it enough for no_la57
> case too?
Patch pls.
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 17:48 ` Borislav Petkov
@ 2017-11-29 19:01 ` H. Peter Anvin
2017-11-29 19:19 ` Borislav Petkov
2017-11-30 7:31 ` Kirill A. Shutemov
1 sibling, 1 reply; 23+ messages in thread
From: H. Peter Anvin @ 2017-11-29 19:01 UTC (permalink / raw)
To: Borislav Petkov, Kirill A. Shutemov
Cc: Thomas Gleixner, Kirill A. Shutemov, Ingo Molnar, x86,
Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov, Andi Kleen,
linux-mm, linux-kernel
On 11/29/17 09:48, Borislav Petkov wrote:
> On Wed, Nov 29, 2017 at 08:08:31PM +0300, Kirill A. Shutemov wrote:
>> We're really early in the boot -- startup_64 in decompression code -- and
>> I don't know a way print a message there. Is there a way?
>>
>> no_longmode handled by just hanging the machine. Is it enough for no_la57
>> case too?
>
> Patch pls.
>
I don't think there is any way to get a message out here. It's too late
to use the firmware, and too early to use anything native.
no_longmode in startup_64 is an oxymoron -- it simply can't happen,
although of course we can enter at the 32-bit entry point with that problem.
We can hang the machine, or we can triple-fault it in the hope of
triggering a reset, and that way if the bootloader has been configured
with a backup kernel there is a hope of recovery.
Triple-faulting is trivial:
	push $0
	push $0
	lidt (%rsp)	/* %esp for 32-bit mode */
	ud2
	/* WTF? */
1:	hlt
	jmp 1b
This will either hang the machine or reboot it, depending on if the
reboot-on-triple-fault logic in the chipset actually works.
-hpa
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 19:01 ` H. Peter Anvin
@ 2017-11-29 19:19 ` Borislav Petkov
2017-11-29 21:33 ` H. Peter Anvin
0 siblings, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2017-11-29 19:19 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Kirill A. Shutemov, Thomas Gleixner, Kirill A. Shutemov,
Ingo Molnar, x86, Linus Torvalds, Andy Lutomirski,
Cyrill Gorcunov, Andi Kleen, linux-mm, linux-kernel
On Wed, Nov 29, 2017 at 11:01:35AM -0800, H. Peter Anvin wrote:
> We can hang the machine, or we can triple-fault it in the hope of
> triggering a reset, and that way if the bootloader has been configured
> with a backup kernel there is a hope of recovery.
Well, it triple-faults right now and that's not really user-friendly. If
we can't dump a message then we should make X86_5LEVEL depend on BROKEN
for the time being...
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 17:08 ` Kirill A. Shutemov
2017-11-29 17:48 ` Borislav Petkov
@ 2017-11-29 20:58 ` Andi Kleen
2017-11-29 21:03 ` hpa
1 sibling, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2017-11-29 20:58 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Thomas Gleixner, Borislav Petkov, Kirill A. Shutemov,
Ingo Molnar, x86, H. Peter Anvin, Linus Torvalds,
Andy Lutomirski, Cyrill Gorcunov, linux-mm, linux-kernel
> We're really early in the boot -- startup_64 in decompression code -- and
> I don't know a way print a message there. Is there a way?
>
> no_longmode handled by just hanging the machine. Is it enough for no_la57
> case too?
The way to handle it is to check it early in the real mode boot code when you
can still print messages. That is how missing long mode is handled.
-Andi
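
For reference, a hedged sketch of what such an early check looks for: LA57
support is reported in CPUID leaf 7, subleaf 0, ECX bit 16. The user-space C
below uses the compiler's <cpuid.h> helper __get_cpuid_count() (GCC/Clang,
x86 only). The names ecx_has_la57() and cpu_supports_la57() are made up for
illustration, and this is not the kernel's real-mode boot code.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID.(EAX=7,ECX=0):ECX bit 16 reports 5-level paging (LA57) support. */
#define CPUID7_ECX_LA57 (1u << 16)

/* Hypothetical helper: test the LA57 bit in an already-fetched ECX value. */
static int ecx_has_la57(uint32_t cpuid7_ecx)
{
	return (cpuid7_ecx & CPUID7_ECX_LA57) != 0;
}

/* Hypothetical helper: query the running CPU; returns 0 on non-x86 builds. */
static int cpu_supports_la57(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid_count() returns 0 if leaf 7 is not available. */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 0;
	return ecx_has_la57(ecx);
#else
	return 0;
#endif
}
```

A kernel built with CONFIG_X86_5LEVEL would want the equivalent of
cpu_supports_la57() to be true before CR4.LA57 is set; the point made in the
mail is that the check has to run while a message can still be printed.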
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 20:58 ` Andi Kleen
@ 2017-11-29 21:03 ` hpa
0 siblings, 0 replies; 23+ messages in thread
From: hpa @ 2017-11-29 21:03 UTC (permalink / raw)
To: Andi Kleen, Kirill A. Shutemov
Cc: Thomas Gleixner, Borislav Petkov, Kirill A. Shutemov,
Ingo Molnar, x86, Linus Torvalds, Andy Lutomirski,
Cyrill Gorcunov, linux-mm, linux-kernel
On November 29, 2017 12:58:15 PM PST, Andi Kleen <ak@linux.intel.com> wrote:
>> We're really early in the boot -- startup_64 in decompression code -- and
>> I don't know a way print a message there. Is there a way?
>>
>> no_longmode handled by just hanging the machine. Is it enough for no_la57
>> case too?
>
>The way to handle it is to check it early in the real mode boot code when you
>can still print messages. That is how missing long mode is handled.
>
>-Andi
Yes, and that test should be done automatically. However, we also check at several later points in case that code is bypassed by the bootloader.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 19:19 ` Borislav Petkov
@ 2017-11-29 21:33 ` H. Peter Anvin
2017-11-29 22:31 ` Borislav Petkov
0 siblings, 1 reply; 23+ messages in thread
From: H. Peter Anvin @ 2017-11-29 21:33 UTC (permalink / raw)
To: Borislav Petkov
Cc: Kirill A. Shutemov, Thomas Gleixner, Kirill A. Shutemov,
Ingo Molnar, x86, Linus Torvalds, Andy Lutomirski,
Cyrill Gorcunov, Andi Kleen, linux-mm, linux-kernel
On 11/29/17 11:19, Borislav Petkov wrote:
> On Wed, Nov 29, 2017 at 11:01:35AM -0800, H. Peter Anvin wrote:
>> We can hang the machine, or we can triple-fault it in the hope of
>> triggering a reset, and that way if the bootloader has been configured
>> with a backup kernel there is a hope of recovery.
>
> Well, it triple-faults right now and that's not really user-friendly. If
> we can't dump a message than we should make X86_5LEVEL depend on BROKEN
> for the time being...
>
You can't dump a message about *anything* if the bootloader bypasses the
checks that happen before we leave the firmware behind. This is what
this is about. For BIOS or EFI boot that go through the proper stub
functions we will print a message just fine, as we already validate the
"required features" structure (although please do verify that the
relevant words are indeed being checked.)
However, if the bootloader jumps straight into the code what do you
expect it to do? We have no real concept about what we'd need to do to
issue a message as we really don't know what devices are available on
the system, etc. If the screen_info field in struct boot_params has
been initialized then we actually *do* know how to write to the screen
-- if you are okay with including a text font etc. since modern systems
boot in graphics mode.
What else could we do? I guess we could add a new field -- which
bootloaders would have to add support for -- for a callback to the
bootloader in case of an early-detected fatal kernel initialization
error. This would have some... interesting(*)... issues with it, and
wouldn't resolve anything for existing bootloaders, but perhaps it is a
worthwhile extension going forward.
-hpa
(*) The bootloader would have to be prepared for a largely undefined CPU
state, in a rarely executed path. However, it is arguably no worse
than what we have now. Current bootloaders *can* at least know all
the memory the kernel will use before the kernel's own memory
management takes over, so it is possible for it to allocate the
kernel in such a way that its own code/data is preserved.
It is at least possible to determine which major CPU mode we are
running in when we get to that entrypoint. The following code
snippet will do it:
entry:
	.code16
	dec %ax
	mov $0,%ax
	jmp 16f
	nop
	nop
	jmp 32f
	.code64
	jmp code_64
	.code32
32:	jmp code_32
	.code16
16:	/* Arbitrary 16-bit code can start here */
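
A minimal sketch (not from the original mail) of why the probe above can work:
the same bytes decode differently in each mode. The byte 0x48 is "dec %ax" in
16-bit code, "dec %eax" in 32-bit code, and a REX.W prefix in 64-bit code, so
the following mov consumes a 2-, 4- or 8-byte immediate and execution reaches a
different jmp each time. The C model below only walks that byte layout. The
names mode_probe and first_jump_offset(), the zeroed jump displacements, and
the assumption that every jmp assembles to the 2-byte short form are all
illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding of the probe, with all jumps in short (EB rel8) form. */
static const uint8_t mode_probe[] = {
	0x48,			/* 16/32-bit: dec %ax/%eax; 64-bit: REX.W prefix */
	0xB8, 0x00, 0x00,	/* mov $0,%ax as assembled for 16-bit code */
	0xEB, 0x00,		/* jmp 16f (displacement is a placeholder) */
	0x90, 0x90,		/* nop; nop */
	0xEB, 0x00,		/* jmp 32f (displacement is a placeholder) */
	0xEB, 0x00,		/* jmp code_64 (displacement is a placeholder) */
};

/*
 * Return the offset of the first jmp the CPU would execute when it starts
 * decoding the probe in a 16-, 32- or 64-bit code segment, or -1 if the
 * bytes do not match the expected layout.
 */
static int first_jump_offset(const uint8_t *code, size_t len, int mode_bits)
{
	size_t pos;

	if (mode_bits != 16 && mode_bits != 32 && mode_bits != 64)
		return -1;
	if (len < 2 || code[0] != 0x48 || code[1] != 0xB8)
		return -1;

	/* 0x48 is consumed either as dec or as REX.W; both occupy one byte. */
	pos = 2;
	/* The mov immediate matches the operand size: 2, 4 or 8 bytes. */
	pos += (size_t)mode_bits / 8;

	/* Skip any nops that were not swallowed by the immediate. */
	while (pos < len && code[pos] == 0x90)
		pos++;

	if (pos >= len || code[pos] != 0xEB)
		return -1;
	return (int)pos;
}
```

With this layout the 16-bit path lands on the jmp at offset 4, the 32-bit path
on the jmp at offset 8 (after the two nops), and the 64-bit path on the jmp at
offset 10, which is why exactly two nops pad the sequence.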
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 21:33 ` H. Peter Anvin
@ 2017-11-29 22:31 ` Borislav Petkov
2017-11-29 23:24 ` H. Peter Anvin
0 siblings, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2017-11-29 22:31 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Kirill A. Shutemov, Thomas Gleixner, Kirill A. Shutemov,
Ingo Molnar, x86, Linus Torvalds, Andy Lutomirski,
Cyrill Gorcunov, Andi Kleen, linux-mm, linux-kernel
On Wed, Nov 29, 2017 at 01:33:28PM -0800, H. Peter Anvin wrote:
> You can't dump a message about *anything* if the bootloader bypasses the
> checks that happen before we leave the firmware behind. This is what
> this is about. For BIOS or EFI boot that go through the proper stub
> functions we will print a message just fine, as we already validate the
> "required features" structure (although please do verify that the
> relevant words are indeed being checked.)
A couple of points:
* so this box here has a normal grub installation and apparently grub
jumps to some other entry point.
* I'm not convinced we need to do everything you typed because this is
only a temporary issue and once X86_5LEVEL is complete, it should work.
I mean, it needs to work otherwise forget single-system image and I
don't think we want to give that up.
> However, if the bootloader jumps straight into the code what do you
> expect it to do? We have no real concept about what we'd need to do to
> issue a message as we really don't know what devices are available on
> the system, etc. If the screen_info field in struct boot_params has
> been initialized then we actually *do* know how to write to the screen
> -- if you are okay with including a text font etc. since modern systems
> boot in graphics mode.
We switch to text mode and dump our message. Can we do that?
I wouldn't want to do any of this back'n'forth between kernel and boot
loader because that sounds fragile, at least to me. And again, I'm
not convinced we should spend too much energy on this as the issue is
temporary AFAICT.
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 22:31 ` Borislav Petkov
@ 2017-11-29 23:24 ` H. Peter Anvin
2017-11-30 1:27 ` Konrad Rzeszutek Wilk
2017-11-30 10:12 ` Borislav Petkov
0 siblings, 2 replies; 23+ messages in thread
From: H. Peter Anvin @ 2017-11-29 23:24 UTC (permalink / raw)
To: Borislav Petkov
Cc: Kirill A. Shutemov, Thomas Gleixner, Kirill A. Shutemov,
Ingo Molnar, x86, Linus Torvalds, Andy Lutomirski,
Cyrill Gorcunov, Andi Kleen, linux-mm, linux-kernel
On 11/29/17 14:31, Borislav Petkov wrote:
>
> A couple of points:
>
> * so this box here has a normal grub installation and apparently grub
> jumps to some other entry point.
>
Yes, Grub as a matter of policy(!) does everything in the most braindead
way possible. You have to use "linux16" or "linuxefi" to make it do
something sane.
> * I'm not convinced we need to do everything you typed because this is
> only a temporary issue and once X86_5LEVEL is complete, it should work.
> I mean, it needs to work otherwise forget single-system image and I
> don't think we want to give that up.
>
>> However, if the bootloader jumps straight into the code what do you
>> expect it to do? We have no real concept about what we'd need to do to
>> issue a message as we really don't know what devices are available on
>> the system, etc. If the screen_info field in struct boot_params has
>> been initialized then we actually *do* know how to write to the screen
>> -- if you are okay with including a text font etc. since modern systems
>> boot in graphics mode.
>
> We switch to text mode and dump our message. Can we do that?
What is text mode? It is hardware that is going away(*), and you don't
even know if you have a display screen on your system at all, or how
you'd have to configure your display hardware even if it is "mostly" VGA.
> I wouldn't want to do any of this back'n'forth between kernel and boot
> loader because that sounds fragile, at least to me. And again, I'm
> not convinced we should spend too much energy on this as the issue is
> temporary AFAICT.
Well, it's not just limited to 5-level mode; it's kind of a general issue.
We have had this issue for a very, very long time -- all the way back to
i386 PAE at the very least. I'm personally OK with triple-faulting the
CPU in this case.
-hpa
(*) And for good reason -- it is completely memory-latency-bound as you
have an indirect reference for every byte you fetch. In a UMA
system this sucks up an insane amount of system bandwidth, unless
you are willing to burn the area of having a 16K SRAM cache.
VGA hardware, additionally, has a bunch of insane operations that
have to be memory-mapped. The resulting hardware screws with
pretty much any sane GPU implementation, so I'm fully expecting that
as soon as GPUs no longer come with a CBIOS option ROM VGA hardware
will be dropped more or less immediately.
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 23:24 ` H. Peter Anvin
@ 2017-11-30 1:27 ` Konrad Rzeszutek Wilk
2017-11-30 10:12 ` Borislav Petkov
1 sibling, 0 replies; 23+ messages in thread
From: Konrad Rzeszutek Wilk @ 2017-11-30 1:27 UTC (permalink / raw)
To: H. Peter Anvin, daniel.kiper
Cc: Borislav Petkov, Kirill A. Shutemov, Thomas Gleixner,
Kirill A. Shutemov, Ingo Molnar, x86, Linus Torvalds,
Andy Lutomirski, Cyrill Gorcunov, Andi Kleen, linux-mm,
linux-kernel
On Wed, Nov 29, 2017 at 03:24:53PM -0800, H. Peter Anvin wrote:
> On 11/29/17 14:31, Borislav Petkov wrote:
> >
> > A couple of points:
> >
> > * so this box here has a normal grub installation and apparently grub
> > jumps to some other entry point.
Ouch. Perhaps you can report this on grub-devel mailing list? And also
what version, since I am not sure if this is a distro-specific version?
> >
>
> Yes, Grub as a matter of policy(!) does everything in the most braindead
There is a policy on this? Could you point me to it - it would
be enlightening to read it :-)
> way possible. You have to use "linux16" or "linuxefi" to make it do
> something sane.
The Linux bootparams structure is _only_ for Linux. Or are there other
OSes that use the same structure to pass information?
AFAICT linuxefi does not exist upstream.
>
> > * I'm not convinced we need to do everything you typed because this is
> > only a temporary issue and once X86_5LEVEL is complete, it should work.
> > I mean, it needs to work otherwise forget single-system image and I
> > don't think we want to give that up.
> >
> >> However, if the bootloader jumps straight into the code what do you
> >> expect it to do? We have no real concept about what we'd need to do to
> >> issue a message as we really don't know what devices are available on
> >> the system, etc. If the screen_info field in struct boot_params has
> >> been initialized then we actually *do* know how to write to the screen
> >> -- if you are okay with including a text font etc. since modern systems
> >> boot in graphics mode.
> >
> > We switch to text mode and dump our message. Can we do that?
>
> What is text mode? It is hardware that is going away(*), and you don't
> even know if you have a display screen on your system at all, or how
> you'd have to configure your display hardware even if it is "mostly" VGA.
>
> > I wouldn't want to do any of this back'n'forth between kernel and boot
> > loader because that sounds fragile, at least to me. And again, I'm
> > not convinced we should spend too much energy on this as the issue is
> > temporary AFAICT.
>
> Well, it's not just limited to 5-level mode; it's kind a general issue.
> We have had this issue for a very, very long time -- all the way back to
> i386 PAE at the very least. I'm personally OK with triple-faulting the
> CPU in this case.
>
> -hpa
>
>
> (*) And for good reason -- it is completely memory-latency-bound as you
> have an indirect reference for every byte you fetch. In a UMA
> system this sucks up an insane amount of system bandwidth, unless
> you are willing to burn the area of having a 16K SRAM cache.
>
> VGA hardware, additionally, has a bunch of insane operations that
> have to be memory-mapped. The resulting hardware screws with
> pretty much any sane GPU implementation, so I'm fully expecting that
> as soon as GPUs no longer come with a CBIOS option ROM VGA hardware
> will be dropped more or less immediately.
Woot! RIP VGA..
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 17:48 ` Borislav Petkov
2017-11-29 19:01 ` H. Peter Anvin
@ 2017-11-30 7:31 ` Kirill A. Shutemov
2017-11-30 10:14 ` Borislav Petkov
2017-11-30 15:45 ` Joe Perches
1 sibling, 2 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2017-11-30 7:31 UTC (permalink / raw)
To: Borislav Petkov
Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, x86,
H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Andi Kleen, linux-mm, linux-kernel
On Wed, Nov 29, 2017 at 05:48:51PM +0000, Borislav Petkov wrote:
> On Wed, Nov 29, 2017 at 08:08:31PM +0300, Kirill A. Shutemov wrote:
> > We're really early in the boot -- startup_64 in decompression code -- and
> > I don't know a way print a message there. Is there a way?
> >
> > no_longmode handled by just hanging the machine. Is it enough for no_la57
> > case too?
>
> Patch pls.
The patch below, on top of patch 2/4 from this patchset, would do the trick.
Please give it a shot.
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-29 23:24 ` H. Peter Anvin
2017-11-30 1:27 ` Konrad Rzeszutek Wilk
@ 2017-11-30 10:12 ` Borislav Petkov
1 sibling, 0 replies; 23+ messages in thread
From: Borislav Petkov @ 2017-11-30 10:12 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Kirill A. Shutemov, Thomas Gleixner, Kirill A. Shutemov,
Ingo Molnar, x86, Linus Torvalds, Andy Lutomirski,
Cyrill Gorcunov, Andi Kleen, linux-mm, linux-kernel
On Wed, Nov 29, 2017 at 03:24:53PM -0800, H. Peter Anvin wrote:
> Yes, Grub as a matter of policy(!) does everything in the most braindead
> way possible. You have to use "linux16" or "linuxefi" to make it do
> something sane.
Good to know, thx.
> What is text mode? It is hardware that is going away(*), and you don't
> even know if you have a display screen on your system at all, or how
> you'd have to configure your display hardware even if it is "mostly" VGA.
Ok, let me take a stab completely in the dark here: can we ask FW to
switch to some mode which is "suitable" for printing messages?
It would mean we'd have to switch back to real mode where we could do
something ala arch/x86/boot/bioscall.S
After we've printed something, we halt.
If there's no screen, we only halt - it's not like we can magically get
a fairy to connect a screen to the system.
> Well, it's not just limited to 5-level mode; it's kind a general issue.
> We have had this issue for a very, very long time -- all the way back to
> i386 PAE at the very least.
I realize that, judging by your reaction. And yes, we should try to find
a proper solution here in the long run.
> I'm personally OK with triple-faulting the CPU in this case.
Except that is not really user-friendly, as I mentioned already. A message
could save other users a bunch of time looking for why TF the kernel
doesn't boot, only to realize they enabled an option which is not ready
yet. Which should have depended on BROKEN when it went upstream, btw.
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-30 7:31 ` Kirill A. Shutemov
@ 2017-11-30 10:14 ` Borislav Petkov
2017-11-30 15:45 ` Joe Perches
1 sibling, 0 replies; 23+ messages in thread
From: Borislav Petkov @ 2017-11-30 10:14 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, x86,
H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Andi Kleen, linux-mm, linux-kernel
On Thu, Nov 30, 2017 at 10:31:31AM +0300, Kirill A. Shutemov wrote:
> On Wed, Nov 29, 2017 at 05:48:51PM +0000, Borislav Petkov wrote:
> > On Wed, Nov 29, 2017 at 08:08:31PM +0300, Kirill A. Shutemov wrote:
> > > We're really early in the boot -- startup_64 in decompression code -- and
> > > I don't know a way print a message there. Is there a way?
> > >
> > > no_longmode handled by just hanging the machine. Is it enough for no_la57
> > > case too?
> >
> > Patch pls.
>
> The patch below on top of patch 2/4 from this patch would do the trick.
>
> Please give it a shot.
Yap, that works. Thanks!
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
* Re: [PATCHv2 0/4] x86: 5-level related changes into decompression code
2017-11-30 7:31 ` Kirill A. Shutemov
2017-11-30 10:14 ` Borislav Petkov
@ 2017-11-30 15:45 ` Joe Perches
1 sibling, 0 replies; 23+ messages in thread
From: Joe Perches @ 2017-11-30 15:45 UTC (permalink / raw)
To: Kirill A. Shutemov, Borislav Petkov
Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, x86,
H. Peter Anvin, Linus Torvalds, Andy Lutomirski, Cyrill Gorcunov,
Andi Kleen, linux-mm, linux-kernel
On Thu, 2017-11-30 at 10:31 +0300, Kirill A. Shutemov wrote:
> On Wed, Nov 29, 2017 at 05:48:51PM +0000, Borislav Petkov wrote:
> > On Wed, Nov 29, 2017 at 08:08:31PM +0300, Kirill A. Shutemov wrote:
> > > We're really early in the boot -- startup_64 in decompression code -- and
> > > I don't know a way print a message there. Is there a way?
> > >
> > > no_longmode handled by just hanging the machine. Is it enough for no_la57
> > > case too?
> >
> > Patch pls.
>
> The patch below on top of patch 2/4 from this patch would do the trick.
>
> Please give it a shot.
>
> From 95b5489d1f4ea03c6226d13eb6797825234489d6 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Thu, 30 Nov 2017 10:23:53 +0300
> Subject: [PATCH] x86/boot/compressed/64: Print error if 5-level paging is not
> supported
>
> We cannot proceed booting if the machine doesn't support the paging mode
> kernel was compiled for.
>
> Getting error the usual way -- via validate_cpu() -- is not going to
> work. We need to enable appropriate paging mode before that, otherwise
> kernel would triple-fault during KASLR setup.
>
> This code will go away once we get support for boot-time switching
> between paging modes.
trivia:
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
[]
> @@ -362,6 +364,13 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
> console_init();
> debug_putstr("early console in extract_kernel\n");
>
> + if (IS_ENABLED(CONFIG_X86_5LEVEL) && !l5_paging_required()) {
> + error("The kernel is compiled with 5-level paging enabled, "
> + "but the CPU doesn't support la57\n"
la57 is lanthanum; perhaps something less obscure or more
readily searchable? Maybe cr4.la57?
Maybe something like:
"This linux kernel as configured requires 5-level paging\n"
"This CPU does not support the required 'cr4.la57' feature\n"
"Unable to boot - please use a kernel appropriate for your CPU\n"
And please use complete coalesced single lines.
> + "Unable to boot - please use "
> + "a kernel appropriate for your CPU.\n");
Here too. Thanks.