* [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code
@ 2024-01-29 18:05 Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 01/19] efi/libstub: Add generic support for parsing mem_encrypt= Ard Biesheuvel
                   ` (18 more replies)
  0 siblings, 19 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

This is a follow-up to my RFC [0] that proposed to build the entire core
kernel with -fPIC, to reduce the likelihood that code that runs
extremely early from the 1:1 mapping of memory will misbehave.

This is needed to address reports that SEV boot on Clang-built kernels
is broken, due to the fact that this early code attempts to access
kernel virtual addresses that are not mapped yet. Kevin has suggested
some workarounds for this [1], but this is really something that
requires a more rigorous approach, rather than addressing a couple of
symptoms of the underlying defect.

As it turns out, the use of -fPIE for the entire kernel is neither
necessary nor sufficient, and has its own set of problems, including the
fact that the small code model with PIE uses FS rather than GS as the
per-CPU segment register, and only recent GCC and Clang versions permit
this to be overridden on the command line.

But the real problem is that even position independent code is not
guaranteed to execute correctly at any offset unless all statically
initialized pointer variables use the same translation as the code.

So instead, v2 and later revisions of this series take the following
approach:
- clean up and refactor the startup code so that the primary startup
  code executes from the 1:1 mapping but nothing else;
- define a new text section type .pi.text and enforce that it can only
  call into other .pi.text sections;
- (tbd) require that objects containing .pi.text sections are built with
  -fPIC, and disallow any absolute references from such objects.

The latter point is not implemented yet in this v3, but this could be
done rather straightforwardly. (The EFI stub already does something
similar across all architectures.)
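
To illustrate the .pi.text idea, a minimal sketch of what such a marker
could look like is shown below. (The real macro and linker script
plumbing are introduced by patches 10-13; the names here are only an
assumption for illustration.)

  #define __pitext	__section(".pi.text")

  /*
   * Runs from the 1:1 mapping, so it may only call other __pitext code;
   * the modpost change in patch 13 warns about calls into ordinary .text.
   */
  static void __pitext early_example(void)
  {
  }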

Changes since v2: [2]
- move command line parsing out of early startup code entirely
- fix LTO and instrumentation related build warnings reported by Nathan
- omit PTI related PGD/P4D setters when creating the early page tables,
  instead of pulling that code into the 'early' set

[0] https://lkml.kernel.org/r/20240122090851.851120-7-ardb%2Bgit%40google.com
[1] https://lore.kernel.org/all/20240111223650.3502633-1-kevinloughlin@google.com/T/#u
[2] https://lkml.kernel.org/r/20240125112818.2016733-19-ardb%2Bgit%40google.com

Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Dionna Glaze <dionnaglaze@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: llvm@lists.linux.dev

Ard Biesheuvel (19):
  efi/libstub: Add generic support for parsing mem_encrypt=
  x86/boot: Move mem_encrypt= parsing to the decompressor
  x86/startup_64: Drop long return to initial_code pointer
  x86/startup_64: Simplify calculation of initial page table address
  x86/startup_64: Simplify CR4 handling in startup code
  x86/startup_64: Drop global variables keeping track of LA57 state
  x86/startup_64: Simplify virtual switch on primary boot
  x86/head64: Replace pointer fixups with PIE codegen
  x86/head64: Simplify GDT/IDT initialization code
  asm-generic: Add special .pi.text section for position independent
    code
  x86: Move return_thunk to __pitext section
  x86/head64: Move early startup code into __pitext
  modpost: Warn about calls from __pitext into other text sections
  x86/coco: Make cc_set_mask() static inline
  x86/sev: Make all code reachable from 1:1 mapping __pitext
  x86/sev: Avoid WARN() in early code
  x86/sev: Use PIC codegen for early SEV startup code
  x86/sev: Drop inline asm LEA instructions for RIP-relative references
  x86/startup_64: Don't bother setting up GS before the kernel is mapped

 arch/x86/Makefile                              |   8 +
 arch/x86/boot/compressed/Makefile              |   2 +-
 arch/x86/boot/compressed/misc.c                |  22 +++
 arch/x86/boot/compressed/pgtable_64.c          |   2 -
 arch/x86/boot/compressed/sev.c                 |   6 +
 arch/x86/coco/core.c                           |   7 +-
 arch/x86/include/asm/coco.h                    |   8 +-
 arch/x86/include/asm/desc.h                    |   3 +-
 arch/x86/include/asm/init.h                    |   2 -
 arch/x86/include/asm/mem_encrypt.h             |   8 +-
 arch/x86/include/asm/pgtable_64.h              |  12 +-
 arch/x86/include/asm/pgtable_64_types.h        |  15 +-
 arch/x86/include/asm/setup.h                   |   4 +-
 arch/x86/include/asm/sev.h                     |   6 +-
 arch/x86/include/uapi/asm/bootparam.h          |   2 +
 arch/x86/kernel/Makefile                       |   7 +
 arch/x86/kernel/cpu/common.c                   |   2 -
 arch/x86/kernel/head64.c                       | 206 +++++++-------------
 arch/x86/kernel/head_64.S                      | 156 +++++----------
 arch/x86/kernel/sev-shared.c                   |  54 +++--
 arch/x86/kernel/sev.c                          |  27 ++-
 arch/x86/kernel/vmlinux.lds.S                  |   3 +-
 arch/x86/lib/Makefile                          |  13 --
 arch/x86/lib/memcpy_64.S                       |   3 +-
 arch/x86/lib/memset_64.S                       |   3 +-
 arch/x86/lib/retpoline.S                       |   2 +-
 arch/x86/mm/Makefile                           |   2 +-
 arch/x86/mm/kasan_init_64.c                    |   3 -
 arch/x86/mm/mem_encrypt_boot.S                 |   3 +-
 arch/x86/mm/mem_encrypt_identity.c             |  98 +++-------
 drivers/firmware/efi/libstub/efi-stub-helper.c |   8 +
 drivers/firmware/efi/libstub/efistub.h         |   2 +-
 drivers/firmware/efi/libstub/x86-stub.c        |   6 +
 include/asm-generic/vmlinux.lds.h              |   3 +
 include/linux/init.h                           |  12 ++
 scripts/mod/modpost.c                          |  11 +-
 tools/objtool/check.c                          |  26 +--
 37 files changed, 319 insertions(+), 438 deletions(-)


base-commit: aa8eff72842021f52600392b245fb82d113afa8a
-- 
2.43.0.429.g432eaa2c6b-goog


* [PATCH v3 01/19] efi/libstub: Add generic support for parsing mem_encrypt=
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-01-31  7:31   ` Borislav Petkov
  2024-01-29 18:05 ` [PATCH v3 02/19] x86/boot: Move mem_encrypt= parsing to the decompressor Ard Biesheuvel
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

Parse the mem_encrypt= command line parameter from the EFI stub if
CONFIG_ARCH_HAS_MEM_ENCRYPT=y, so that it can be passed to the early
boot code by the arch code in the stub.

This avoids the need for the core kernel to do any string parsing very
early in the boot.
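
For clarity, the resulting tri-state, as consumed by the x86 stub change
in the next patch, is:

  efi_mem_encrypt > 0    mem_encrypt=on was passed
  efi_mem_encrypt < 0    mem_encrypt=off was passed
  efi_mem_encrypt == 0   parameter absent; existing default behavior applies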

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 drivers/firmware/efi/libstub/efi-stub-helper.c | 8 ++++++++
 drivers/firmware/efi/libstub/efistub.h         | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
index bfa30625f5d0..3dc2f9aaf08d 100644
--- a/drivers/firmware/efi/libstub/efi-stub-helper.c
+++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
@@ -24,6 +24,8 @@ static bool efi_noinitrd;
 static bool efi_nosoftreserve;
 static bool efi_disable_pci_dma = IS_ENABLED(CONFIG_EFI_DISABLE_PCI_DMA);
 
+int efi_mem_encrypt;
+
 bool __pure __efi_soft_reserve_enabled(void)
 {
 	return !efi_nosoftreserve;
@@ -75,6 +77,12 @@ efi_status_t efi_parse_options(char const *cmdline)
 			efi_noinitrd = true;
 		} else if (IS_ENABLED(CONFIG_X86_64) && !strcmp(param, "no5lvl")) {
 			efi_no5lvl = true;
+		} else if (IS_ENABLED(CONFIG_ARCH_HAS_MEM_ENCRYPT) &&
+			   !strcmp(param, "mem_encrypt") && val) {
+			if (parse_option_str(val, "on"))
+				efi_mem_encrypt = 1;
+			else if (parse_option_str(val, "off"))
+				efi_mem_encrypt = -1;
 		} else if (!strcmp(param, "efi") && val) {
 			efi_nochunk = parse_option_str(val, "nochunk");
 			efi_novamap |= parse_option_str(val, "novamap");
diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
index 212687c30d79..a1c6ab24cd99 100644
--- a/drivers/firmware/efi/libstub/efistub.h
+++ b/drivers/firmware/efi/libstub/efistub.h
@@ -37,8 +37,8 @@ extern bool efi_no5lvl;
 extern bool efi_nochunk;
 extern bool efi_nokaslr;
 extern int efi_loglevel;
+extern int efi_mem_encrypt;
 extern bool efi_novamap;
-
 extern const efi_system_table_t *efi_system_table;
 
 typedef union efi_dxe_services_table efi_dxe_services_table_t;
-- 
2.43.0.429.g432eaa2c6b-goog


* [PATCH v3 02/19] x86/boot: Move mem_encrypt= parsing to the decompressor
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 01/19] efi/libstub: Add generic support for parsing mem_encrypt= Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-01-31  8:35   ` Borislav Petkov
  2024-01-29 18:05 ` [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer Ard Biesheuvel
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

The early SME/SEV code parses the command line very early, in order to
decide whether or not memory encryption should be enabled, which needs
to occur even before the initial page tables are created.

This is problematic for a number of reasons:
- this early code runs from the 1:1 mapping provided by the decompressor
  or firmware, which uses a different translation than the one assumed by
  the linker, and so the code needs to be built in a special way;
- parsing external input while the entire kernel image is still mapped
  writable is a bad idea in general, and really does not belong in
  security-minded code;
- the current code ignores the built-in command line entirely (although
  this appears to be the case for the entire decompressor)

Given that the decompressor/EFI stub is an intrinsic part of the x86
bootable kernel image, move the command line parsing there and out of
the core kernel. This removes the need to build lib/cmdline.o in a
special way, or to use RIP-relative LEA instructions in inline asm
blocks.

This involves a pair of new xloadflags in the setup header to indicate
that a) mem_encrypt= was provided, and b) whether it was set to on or
off. What this actually means in terms of default behavior when the
command line parameter is omitted is left up to the existing logic -
this permits the same flags to be reused if the need arises.
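
Concretely, the mapping implemented below is:

  mem_encrypt=   XLF_MEM_ENCRYPTION   XLF_MEM_ENCRYPTION_ENABLED
  on             set                  set
  off            set                  clear
  <absent>       clear                clear  (existing default logic applies)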

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/boot/compressed/misc.c         | 22 ++++++++++
 arch/x86/include/uapi/asm/bootparam.h   |  2 +
 arch/x86/lib/Makefile                   | 13 ------
 arch/x86/mm/mem_encrypt_identity.c      | 45 +++-----------------
 drivers/firmware/efi/libstub/x86-stub.c |  6 +++
 5 files changed, 37 insertions(+), 51 deletions(-)

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b99e08e6815b..d63a2dc7d0b1 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -357,6 +357,26 @@ unsigned long decompress_kernel(unsigned char *outbuf, unsigned long virt_addr,
 	return entry;
 }
 
+/*
+ * Set the memory encryption xloadflag based on the mem_encrypt= command line
+ * parameter, if provided. If not, the consumer of the flag decides what the
+ * default behavior should be.
+ */
+static void set_mem_encrypt_flag(struct setup_header *hdr)
+{
+	hdr->xloadflags &= ~(XLF_MEM_ENCRYPTION | XLF_MEM_ENCRYPTION_ENABLED);
+
+	if (IS_ENABLED(CONFIG_ARCH_HAS_MEM_ENCRYPT)) {
+		int on = cmdline_find_option_bool("mem_encrypt=on");
+		int off = cmdline_find_option_bool("mem_encrypt=off");
+
+		if (on || off)
+			hdr->xloadflags |= XLF_MEM_ENCRYPTION;
+		if (on > off)
+			hdr->xloadflags |= XLF_MEM_ENCRYPTION_ENABLED;
+	}
+}
+
 /*
  * The compressed kernel image (ZO), has been moved so that its position
  * is against the end of the buffer used to hold the uncompressed kernel
@@ -387,6 +407,8 @@ asmlinkage __visible void *extract_kernel(void *rmode, unsigned char *output)
 	/* Clear flags intended for solely in-kernel use. */
 	boot_params_ptr->hdr.loadflags &= ~KASLR_FLAG;
 
+	set_mem_encrypt_flag(&boot_params_ptr->hdr);
+
 	sanitize_boot_params(boot_params_ptr);
 
 	if (boot_params_ptr->screen_info.orig_video_mode == 7) {
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index 01d19fc22346..316784e17d38 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -38,6 +38,8 @@
 #define XLF_EFI_KEXEC			(1<<4)
 #define XLF_5LEVEL			(1<<5)
 #define XLF_5LEVEL_ENABLED		(1<<6)
+#define XLF_MEM_ENCRYPTION		(1<<7)
+#define XLF_MEM_ENCRYPTION_ENABLED	(1<<8)
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index ea3a28e7b613..f0dae4fb6d07 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -14,19 +14,6 @@ ifdef CONFIG_KCSAN
 CFLAGS_REMOVE_delay.o = $(CC_FLAGS_FTRACE)
 endif
 
-# Early boot use of cmdline; don't instrument it
-ifdef CONFIG_AMD_MEM_ENCRYPT
-KCOV_INSTRUMENT_cmdline.o := n
-KASAN_SANITIZE_cmdline.o  := n
-KCSAN_SANITIZE_cmdline.o  := n
-
-ifdef CONFIG_FUNCTION_TRACER
-CFLAGS_REMOVE_cmdline.o = -pg
-endif
-
-CFLAGS_cmdline.o := -fno-stack-protector -fno-jump-tables
-endif
-
 inat_tables_script = $(srctree)/arch/x86/tools/gen-insn-attr-x86.awk
 inat_tables_maps = $(srctree)/arch/x86/lib/x86-opcode-map.txt
 quiet_cmd_inat_tables = GEN     $@
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 7f72472a34d6..06466f6d5966 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -43,7 +43,6 @@
 
 #include <asm/setup.h>
 #include <asm/sections.h>
-#include <asm/cmdline.h>
 #include <asm/coco.h>
 #include <asm/sev.h>
 
@@ -95,10 +94,6 @@ struct sme_populate_pgd_data {
  */
 static char sme_workarea[2 * PMD_SIZE] __section(".init.scratch");
 
-static char sme_cmdline_arg[] __initdata = "mem_encrypt";
-static char sme_cmdline_on[]  __initdata = "on";
-static char sme_cmdline_off[] __initdata = "off";
-
 static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd)
 {
 	unsigned long pgd_start, pgd_end, pgd_size;
@@ -504,11 +499,9 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
 
 void __init sme_enable(struct boot_params *bp)
 {
-	const char *cmdline_ptr, *cmdline_arg, *cmdline_on, *cmdline_off;
 	unsigned int eax, ebx, ecx, edx;
 	unsigned long feature_mask;
 	unsigned long me_mask;
-	char buffer[16];
 	bool snp;
 	u64 msr;
 
@@ -570,42 +563,18 @@ void __init sme_enable(struct boot_params *bp)
 		msr = __rdmsr(MSR_AMD64_SYSCFG);
 		if (!(msr & MSR_AMD64_SYSCFG_MEM_ENCRYPT))
 			return;
+
+		if (bp->hdr.xloadflags & XLF_MEM_ENCRYPTION) {
+			if (bp->hdr.xloadflags & XLF_MEM_ENCRYPTION_ENABLED)
+				sme_me_mask = me_mask;
+		} else if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT)) {
+			sme_me_mask = me_mask;
+		}
 	} else {
 		/* SEV state cannot be controlled by a command line option */
 		sme_me_mask = me_mask;
-		goto out;
 	}
 
-	/*
-	 * Fixups have not been applied to phys_base yet and we're running
-	 * identity mapped, so we must obtain the address to the SME command
-	 * line argument data using rip-relative addressing.
-	 */
-	asm ("lea sme_cmdline_arg(%%rip), %0"
-	     : "=r" (cmdline_arg)
-	     : "p" (sme_cmdline_arg));
-	asm ("lea sme_cmdline_on(%%rip), %0"
-	     : "=r" (cmdline_on)
-	     : "p" (sme_cmdline_on));
-	asm ("lea sme_cmdline_off(%%rip), %0"
-	     : "=r" (cmdline_off)
-	     : "p" (sme_cmdline_off));
-
-	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT))
-		sme_me_mask = me_mask;
-
-	cmdline_ptr = (const char *)((u64)bp->hdr.cmd_line_ptr |
-				     ((u64)bp->ext_cmd_line_ptr << 32));
-
-	if (cmdline_find_option(cmdline_ptr, cmdline_arg, buffer, sizeof(buffer)) < 0)
-		goto out;
-
-	if (!strncmp(buffer, cmdline_on, sizeof(buffer)))
-		sme_me_mask = me_mask;
-	else if (!strncmp(buffer, cmdline_off, sizeof(buffer)))
-		sme_me_mask = 0;
-
-out:
 	if (sme_me_mask) {
 		physical_mask &= ~sme_me_mask;
 		cc_vendor = CC_VENDOR_AMD;
diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index 0d510c9a06a4..66e336cca0cc 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -879,6 +879,12 @@ void __noreturn efi_stub_entry(efi_handle_t handle,
 		}
 	}
 
+	if (IS_ENABLED(CONFIG_ARCH_HAS_MEM_ENCRYPT) && efi_mem_encrypt) {
+		hdr->xloadflags |= XLF_MEM_ENCRYPTION;
+		if (efi_mem_encrypt > 0)
+			hdr->xloadflags |= XLF_MEM_ENCRYPTION_ENABLED;
+	}
+
 	status = efi_decompress_kernel(&kernel_entry);
 	if (status != EFI_SUCCESS) {
 		efi_err("Failed to decompress kernel\n");
-- 
2.43.0.429.g432eaa2c6b-goog


* [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 01/19] efi/libstub: Add generic support for parsing mem_encrypt= Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 02/19] x86/boot: Move mem_encrypt= parsing to the decompressor Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-01-31 13:44   ` Borislav Petkov
  2024-01-31 18:14   ` [tip: x86/boot] " tip-bot2 for Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 04/19] x86/startup_64: Simplify calculation of initial page table address Ard Biesheuvel
                   ` (15 subsequent siblings)
  18 siblings, 2 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

Since commit 866b556efa12 ("x86/head/64: Install startup GDT"), the
primary startup sequence sets the code segment register (CS) to __KERNEL_CS
before calling into the startup code shared between primary and
secondary boot.

This means that the far return via a pushed __KERNEL_CS selector is no
longer needed, and a simple indirect call to the initial_code pointer is
sufficient here.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/kernel/head_64.S | 35 ++------------------
 1 file changed, 3 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index d4918d03efb4..4017a49d7b76 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -428,39 +428,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	movq	%r15, %rdi
 
 .Ljump_to_C_code:
-	/*
-	 * Jump to run C code and to be on a real kernel address.
-	 * Since we are running on identity-mapped space we have to jump
-	 * to the full 64bit address, this is only possible as indirect
-	 * jump.  In addition we need to ensure %cs is set so we make this
-	 * a far return.
-	 *
-	 * Note: do not change to far jump indirect with 64bit offset.
-	 *
-	 * AMD does not support far jump indirect with 64bit offset.
-	 * AMD64 Architecture Programmer's Manual, Volume 3: states only
-	 *	JMP FAR mem16:16 FF /5 Far jump indirect,
-	 *		with the target specified by a far pointer in memory.
-	 *	JMP FAR mem16:32 FF /5 Far jump indirect,
-	 *		with the target specified by a far pointer in memory.
-	 *
-	 * Intel64 does support 64bit offset.
-	 * Software Developer Manual Vol 2: states:
-	 *	FF /5 JMP m16:16 Jump far, absolute indirect,
-	 *		address given in m16:16
-	 *	FF /5 JMP m16:32 Jump far, absolute indirect,
-	 *		address given in m16:32.
-	 *	REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
-	 *		address given in m16:64.
-	 */
-	pushq	$.Lafter_lret	# put return address on stack for unwinder
 	xorl	%ebp, %ebp	# clear frame pointer
-	movq	initial_code(%rip), %rax
-	pushq	$__KERNEL_CS	# set correct cs
-	pushq	%rax		# target address in negative space
-	lretq
-.Lafter_lret:
-	ANNOTATE_NOENDBR
+	ANNOTATE_RETPOLINE_SAFE
+	callq	*initial_code(%rip)
+	int3
 SYM_CODE_END(secondary_startup_64)
 
 #include "verify_cpu.S"
-- 
2.43.0.429.g432eaa2c6b-goog


* [PATCH v3 04/19] x86/startup_64: Simplify calculation of initial page table address
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-02-05 10:40   ` Borislav Petkov
  2024-01-29 18:05 ` [PATCH v3 05/19] x86/startup_64: Simplify CR4 handling in startup code Ard Biesheuvel
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

Determining the address of the initial page table to program into CR3
involves:
- taking the physical address
- adding the SME encryption mask

On the primary entry path, the code is mapped using a 1:1 virtual to
physical translation, so the physical address can be taken directly
using a RIP-relative LEA instruction.

On the secondary entry path, the address can be obtained by taking the
offset from the virtual kernel base (__START_KERNEL_map) and adding the
physical kernel base.

This is all very straightforward, but the current code makes a mess of
it. Clean this up.
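
Concretely, after this change the CR3 value is derived as follows
(illustrative pseudocode, not the actual asm):

  /* primary boot: __startup_64() returns the SME encryption mask */
  cr3 = sme_mask + rip_relative(early_top_pgt);	/* 1:1 mapped, i.e. physical */

  /* secondary boot */
  cr3 = phys_base + (init_top_pgt - __START_KERNEL_map) + sme_me_mask;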

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/kernel/head_64.S | 25 ++++++--------------
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 4017a49d7b76..6d24c2014759 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -113,13 +113,11 @@ SYM_CODE_START_NOALIGN(startup_64)
 	call	__startup_64
 
 	/* Form the CR3 value being sure to include the CR3 modifier */
-	addq	$(early_top_pgt - __START_KERNEL_map), %rax
+	leaq	early_top_pgt(%rip), %rcx
+	addq	%rcx, %rax
 
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 	mov	%rax, %rdi
-	mov	%rax, %r14
-
-	addq	phys_base(%rip), %rdi
 
 	/*
 	 * For SEV guests: Verify that the C-bit is correct. A malicious
@@ -128,12 +126,6 @@ SYM_CODE_START_NOALIGN(startup_64)
 	 * the next RET instruction.
 	 */
 	call	sev_verify_cbit
-
-	/*
-	 * Restore CR3 value without the phys_base which will be added
-	 * below, before writing %cr3.
-	 */
-	 mov	%r14, %rax
 #endif
 
 	jmp 1f
@@ -173,18 +165,18 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	/* Clear %R15 which holds the boot_params pointer on the boot CPU */
 	xorq	%r15, %r15
 
+	/* Derive the runtime physical address of init_top_pgt[] */
+	movq	phys_base(%rip), %rax
+	addq	$(init_top_pgt - __START_KERNEL_map), %rax
+
 	/*
 	 * Retrieve the modifier (SME encryption mask if SME is active) to be
 	 * added to the initial pgdir entry that will be programmed into CR3.
 	 */
 #ifdef CONFIG_AMD_MEM_ENCRYPT
-	movq	sme_me_mask, %rax
-#else
-	xorq	%rax, %rax
+	addq	sme_me_mask(%rip), %rax
 #endif
 
-	/* Form the CR3 value being sure to include the CR3 modifier */
-	addq	$(init_top_pgt - __START_KERNEL_map), %rax
 1:
 
 #ifdef CONFIG_X86_MCE
@@ -211,9 +203,6 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 #endif
 	movq	%rcx, %cr4
 
-	/* Setup early boot stage 4-/5-level pagetables. */
-	addq	phys_base(%rip), %rax
-
 	/*
 	 * Switch to new page-table
 	 *
-- 
2.43.0.429.g432eaa2c6b-goog


* [PATCH v3 05/19] x86/startup_64: Simplify CR4 handling in startup code
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (3 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 04/19] x86/startup_64: Simplify calculation of initial page table address Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-02-06 18:21   ` Borislav Petkov
  2024-01-29 18:05 ` [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state Ard Biesheuvel
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

When executing in long mode, the CR4.PAE and CR4.LA57 control bits
cannot be updated, and so they can simply be preserved rather than
reasoning about whether or not they need to be set. CR4.PSE has no
effect in long mode, so it can be omitted.

CR4.PGE is only toggled to flush the TLB: it is cleared if it was set,
and subsequently re-enabled. So there is no need to set it first just to
disable and re-enable it later.

CR4.MCE must be preserved unless the kernel was built without
CONFIG_X86_MCE, in which case it must be cleared.

Reimplement the above logic in a more straightforward way, by defining a
mask of CR4 bits to preserve, and applying that to CR4 at the point
where it needs to be updated anyway.
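
Rendered as C, the resulting logic is roughly the following (the patch
implements this in asm; the sketch below is only for illustration):

  unsigned long keep = X86_CR4_PAE | X86_CR4_PGE | X86_CR4_LA57;
  unsigned long cr4;
  bool pge_was_set;

  if (IS_ENABLED(CONFIG_X86_MCE))
          keep |= X86_CR4_MCE;

  cr4 = native_read_cr4() & keep;
  do {
          pge_was_set = cr4 & X86_CR4_PGE;
          cr4 ^= X86_CR4_PGE;      /* btcl: flip PGE, remember the old value */
          native_write_cr4(cr4);   /* clearing PGE flushes the TLB */
  } while (pge_was_set);           /* go around once more to re-enable PGE */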

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/kernel/head_64.S | 27 ++++++++------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 6d24c2014759..ca46995205d4 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -179,6 +179,12 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 
 1:
 
+	/*
+	 * Define a mask of CR4 bits to preserve. PAE and LA57 cannot be
+	 * modified while paging remains enabled. PGE will be toggled below if
+	 * it is already set.
+	 */
+	movl	$(X86_CR4_PAE | X86_CR4_PGE | X86_CR4_LA57), %edx
 #ifdef CONFIG_X86_MCE
 	/*
 	 * Preserve CR4.MCE if the kernel will enable #MC support.
@@ -187,22 +193,9 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	 * configured will crash the system regardless of the CR4.MCE value set
 	 * here.
 	 */
-	movq	%cr4, %rcx
-	andl	$X86_CR4_MCE, %ecx
-#else
-	movl	$0, %ecx
+	orl	$X86_CR4_MCE, %edx
 #endif
 
-	/* Enable PAE mode, PSE, PGE and LA57 */
-	orl	$(X86_CR4_PAE | X86_CR4_PSE | X86_CR4_PGE), %ecx
-#ifdef CONFIG_X86_5LEVEL
-	testb	$1, __pgtable_l5_enabled(%rip)
-	jz	1f
-	orl	$X86_CR4_LA57, %ecx
-1:
-#endif
-	movq	%rcx, %cr4
-
 	/*
 	 * Switch to new page-table
 	 *
@@ -218,10 +211,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	 * entries from the identity mapping are flushed.
 	 */
 	movq	%cr4, %rcx
-	movq	%rcx, %rax
-	xorq	$X86_CR4_PGE, %rcx
+	andl	%edx, %ecx
+0:	btcl	$X86_CR4_PGE_BIT, %ecx
 	movq	%rcx, %cr4
-	movq	%rax, %cr4
+	jc	0b
 
 	/* Ensure I am executing from virtual addresses */
 	movq	$1f, %rax
-- 
2.43.0.429.g432eaa2c6b-goog


* [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (4 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 05/19] x86/startup_64: Simplify CR4 handling in startup code Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-02-07 13:29   ` Borislav Petkov
  2024-01-29 18:05 ` [PATCH v3 07/19] x86/startup_64: Simplify virtual switch on primary boot Ard Biesheuvel
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

On x86_64, the core kernel is entered in long mode, which implies that
paging is enabled. This means that the CR4.LA57 control bit is
guaranteed to be in sync with the number of paging levels used by the
kernel, and there is no need to store this in a variable.

There is also no need to store the calculated values of pgdir_shift and
ptrs_per_p4d in variables, as they can easily be determined on the fly.
Other assignments of global variables related to the number of paging
levels can be deferred to the primary C entrypoint that actually runs
from the kernel virtual mapping.

This removes the need to write to __ro_after_init variables from the
code that executes extremely early via the 1:1 mapping.
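
With this change, the early code derives the paging geometry directly
from CR4, along the lines of (see the diff below for the actual
definitions):

  pgtable_l5_enabled()  ->  !!(native_read_cr4() & X86_CR4_LA57)
  PGDIR_SHIFT           ->  pgtable_l5_enabled() ? 48 : 39
  PTRS_PER_P4D          ->  pgtable_l5_enabled() ? 512 : 1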

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/boot/compressed/pgtable_64.c   |  2 -
 arch/x86/include/asm/pgtable_64_types.h | 15 +++---
 arch/x86/kernel/cpu/common.c            |  2 -
 arch/x86/kernel/head64.c                | 52 ++++----------------
 arch/x86/mm/kasan_init_64.c             |  3 --
 arch/x86/mm/mem_encrypt_identity.c      |  9 ----
 6 files changed, 15 insertions(+), 68 deletions(-)

diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index 51f957b24ba7..0586cc216aa6 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -128,8 +128,6 @@ asmlinkage void configure_5level_paging(struct boot_params *bp, void *pgtable)
 
 		/* Initialize variables for 5-level paging */
 		__pgtable_l5_enabled = 1;
-		pgdir_shift = 48;
-		ptrs_per_p4d = 512;
 	}
 
 	/*
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 38b54b992f32..ecc010fbb377 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -22,28 +22,25 @@ typedef struct { pteval_t pte; } pte_t;
 typedef struct { pmdval_t pmd; } pmd_t;
 
 #ifdef CONFIG_X86_5LEVEL
+#ifdef USE_EARLY_PGTABLE_L5
 extern unsigned int __pgtable_l5_enabled;
 
-#ifdef USE_EARLY_PGTABLE_L5
 /*
- * cpu_feature_enabled() is not available in early boot code.
- * Use variable instead.
+ * CR4.LA57 may not be set to its final value yet in the early boot code.
+ * Use a variable instead.
  */
 static inline bool pgtable_l5_enabled(void)
 {
 	return __pgtable_l5_enabled;
 }
 #else
-#define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)
+#define pgtable_l5_enabled() !!(native_read_cr4() & X86_CR4_LA57)
 #endif /* USE_EARLY_PGTABLE_L5 */
 
 #else
 #define pgtable_l5_enabled() 0
 #endif /* CONFIG_X86_5LEVEL */
 
-extern unsigned int pgdir_shift;
-extern unsigned int ptrs_per_p4d;
-
 #endif	/* !__ASSEMBLY__ */
 
 #define SHARED_KERNEL_PMD	0
@@ -53,7 +50,7 @@ extern unsigned int ptrs_per_p4d;
 /*
  * PGDIR_SHIFT determines what a top-level page table entry can map
  */
-#define PGDIR_SHIFT	pgdir_shift
+#define PGDIR_SHIFT	(pgtable_l5_enabled() ? 48 : 39)
 #define PTRS_PER_PGD	512
 
 /*
@@ -61,7 +58,7 @@ extern unsigned int ptrs_per_p4d;
  */
 #define P4D_SHIFT		39
 #define MAX_PTRS_PER_P4D	512
-#define PTRS_PER_P4D		ptrs_per_p4d
+#define PTRS_PER_P4D		(pgtable_l5_enabled() ? 512 : 1)
 #define P4D_SIZE		(_AC(1, UL) << P4D_SHIFT)
 #define P4D_MASK		(~(P4D_SIZE - 1))
 
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0b97bcde70c6..20ac11a2c06b 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1,6 +1,4 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* cpu_feature_enabled() cannot be used this early */
-#define USE_EARLY_PGTABLE_L5
 
 #include <linux/memblock.h>
 #include <linux/linkage.h>
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index dc0956067944..d636bb02213f 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -7,9 +7,6 @@
 
 #define DISABLE_BRANCH_PROFILING
 
-/* cpu_feature_enabled() cannot be used this early */
-#define USE_EARLY_PGTABLE_L5
-
 #include <linux/init.h>
 #include <linux/linkage.h>
 #include <linux/types.h>
@@ -50,14 +47,6 @@ extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
 static unsigned int __initdata next_early_pgt;
 pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
 
-#ifdef CONFIG_X86_5LEVEL
-unsigned int __pgtable_l5_enabled __ro_after_init;
-unsigned int pgdir_shift __ro_after_init = 39;
-EXPORT_SYMBOL(pgdir_shift);
-unsigned int ptrs_per_p4d __ro_after_init = 1;
-EXPORT_SYMBOL(ptrs_per_p4d);
-#endif
-
 #ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
 unsigned long page_offset_base __ro_after_init = __PAGE_OFFSET_BASE_L4;
 EXPORT_SYMBOL(page_offset_base);
@@ -95,37 +84,6 @@ static unsigned long __head *fixup_long(void *ptr, unsigned long physaddr)
 	return fixup_pointer(ptr, physaddr);
 }
 
-#ifdef CONFIG_X86_5LEVEL
-static unsigned int __head *fixup_int(void *ptr, unsigned long physaddr)
-{
-	return fixup_pointer(ptr, physaddr);
-}
-
-static bool __head check_la57_support(unsigned long physaddr)
-{
-	/*
-	 * 5-level paging is detected and enabled at kernel decompression
-	 * stage. Only check if it has been enabled there.
-	 */
-	if (!(native_read_cr4() & X86_CR4_LA57))
-		return false;
-
-	*fixup_int(&__pgtable_l5_enabled, physaddr) = 1;
-	*fixup_int(&pgdir_shift, physaddr) = 48;
-	*fixup_int(&ptrs_per_p4d, physaddr) = 512;
-	*fixup_long(&page_offset_base, physaddr) = __PAGE_OFFSET_BASE_L5;
-	*fixup_long(&vmalloc_base, physaddr) = __VMALLOC_BASE_L5;
-	*fixup_long(&vmemmap_base, physaddr) = __VMEMMAP_BASE_L5;
-
-	return true;
-}
-#else
-static bool __head check_la57_support(unsigned long physaddr)
-{
-	return false;
-}
-#endif
-
 static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdval_t *pmd)
 {
 	unsigned long vaddr, vaddr_end;
@@ -189,7 +147,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	int i;
 	unsigned int *next_pgt_ptr;
 
-	la57 = check_la57_support(physaddr);
+	la57 = pgtable_l5_enabled();
 
 	/* Is the address too large? */
 	if (physaddr >> MAX_PHYSMEM_BITS)
@@ -486,6 +444,14 @@ asmlinkage __visible void __init __noreturn x86_64_start_kernel(char * real_mode
 				(__START_KERNEL & PGDIR_MASK)));
 	BUILD_BUG_ON(__fix_to_virt(__end_of_fixed_addresses) <= MODULES_END);
 
+#ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
+	if (pgtable_l5_enabled()) {
+		page_offset_base	= __PAGE_OFFSET_BASE_L5;
+		vmalloc_base		= __VMALLOC_BASE_L5;
+		vmemmap_base		= __VMEMMAP_BASE_L5;
+	}
+#endif
+
 	cr4_init_shadow();
 
 	/* Kill off the identity-map trampoline */
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 0302491d799d..85ae1ef840cc 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -2,9 +2,6 @@
 #define DISABLE_BRANCH_PROFILING
 #define pr_fmt(fmt) "kasan: " fmt
 
-/* cpu_feature_enabled() cannot be used this early */
-#define USE_EARLY_PGTABLE_L5
-
 #include <linux/memblock.h>
 #include <linux/kasan.h>
 #include <linux/kdebug.h>
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 06466f6d5966..2e195866a7fe 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -27,15 +27,6 @@
 #undef CONFIG_PARAVIRT_XXL
 #undef CONFIG_PARAVIRT_SPINLOCKS
 
-/*
- * This code runs before CPU feature bits are set. By default, the
- * pgtable_l5_enabled() function uses bit X86_FEATURE_LA57 to determine if
- * 5-level paging is active, so that won't work here. USE_EARLY_PGTABLE_L5
- * is provided to handle this situation and, instead, use a variable that
- * has been set by the early boot code.
- */
-#define USE_EARLY_PGTABLE_L5
-
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/mem_encrypt.h>
-- 
2.43.0.429.g432eaa2c6b-goog


* [PATCH v3 07/19] x86/startup_64: Simplify virtual switch on primary boot
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (5 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-02-07 14:50   ` Borislav Petkov
  2024-01-29 18:05 ` [PATCH v3 08/19] x86/head64: Replace pointer fixups with PIE codegen Ard Biesheuvel
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

The secondary startup code is used on the primary boot path as well, but
in this case, the initial part runs from a 1:1 mapping, until an
explicit cross-jump is made to the kernel virtual mapping of the same
code.

On the secondary boot path, this jump is pointless as the code already
executes from the mapping targeted by the jump. So combine this
cross-jump with the jump from startup_64() into the common boot path.
This simplifies the execution flow, and clearly separates code that runs
from a 1:1 mapping from code that runs from the kernel virtual mapping.

Note that this requires a page table switch, so hoist the CR3 assignment
into startup_64() as well.

Given that the secondary startup code does not require a special
placement inside the executable, move it to the .text section.
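
The resulting flow is roughly:

  startup_64 (primary boot, runs from the 1:1 mapping)
      -> load CR3 with early_top_pgt
      -> jmp *common_startup_64     # cross-jump to the kernel virtual mapping

  secondary_startup_64 (runs from the kernel virtual mapping)
      -> load CR3 with init_top_pgt
      -> falls through to common_startup_64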

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/kernel/head_64.S | 41 +++++++++-----------
 1 file changed, 19 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index ca46995205d4..953b82be4cd4 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -39,7 +39,6 @@ L4_START_KERNEL = l4_index(__START_KERNEL_map)
 
 L3_START_KERNEL = pud_index(__START_KERNEL_map)
 
-	.text
 	__HEAD
 	.code64
 SYM_CODE_START_NOALIGN(startup_64)
@@ -128,9 +127,19 @@ SYM_CODE_START_NOALIGN(startup_64)
 	call	sev_verify_cbit
 #endif
 
-	jmp 1f
+	/*
+	 * Switch to early_top_pgt which still has the identity mappings
+	 * present.
+	 */
+	movq	%rax, %cr3
+
+	/* Branch to the common startup code at its kernel virtual address */
+	movq	$common_startup_64, %rax
+	ANNOTATE_RETPOLINE_SAFE
+	jmp	*%rax
 SYM_CODE_END(startup_64)
 
+	.text
 SYM_CODE_START(secondary_startup_64)
 	UNWIND_HINT_END_OF_STACK
 	ANNOTATE_NOENDBR
@@ -176,8 +185,15 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 	addq	sme_me_mask(%rip), %rax
 #endif
+	/*
+	 * Switch to the init_top_pgt here, away from the trampoline_pgd and
+	 * unmap the identity mapped ranges.
+	 */
+	movq	%rax, %cr3
 
-1:
+SYM_INNER_LABEL(common_startup_64, SYM_L_LOCAL)
+	UNWIND_HINT_END_OF_STACK
+	ANNOTATE_NOENDBR // above
 
 	/*
 	 * Define a mask of CR4 bits to preserve. PAE and LA57 cannot be
@@ -195,17 +211,6 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	 */
 	orl	$X86_CR4_MCE, %edx
 #endif
-
-	/*
-	 * Switch to new page-table
-	 *
-	 * For the boot CPU this switches to early_top_pgt which still has the
-	 * identity mappings present. The secondary CPUs will switch to the
-	 * init_top_pgt here, away from the trampoline_pgd and unmap the
-	 * identity mapped ranges.
-	 */
-	movq	%rax, %cr3
-
 	/*
 	 * Do a global TLB flush after the CR3 switch to make sure the TLB
 	 * entries from the identity mapping are flushed.
@@ -216,14 +221,6 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	movq	%rcx, %cr4
 	jc	0b
 
-	/* Ensure I am executing from virtual addresses */
-	movq	$1f, %rax
-	ANNOTATE_RETPOLINE_SAFE
-	jmp	*%rax
-1:
-	UNWIND_HINT_END_OF_STACK
-	ANNOTATE_NOENDBR // above
-
 #ifdef CONFIG_SMP
 	/*
 	 * For parallel boot, the APIC ID is read from the APIC, and then
-- 
2.43.0.429.g432eaa2c6b-goog


* [PATCH v3 08/19] x86/head64: Replace pointer fixups with PIE codegen
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (6 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 07/19] x86/startup_64: Simplify virtual switch on primary boot Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-02-12 10:29   ` Borislav Petkov
  2024-01-29 18:05 ` [PATCH v3 09/19] x86/head64: Simplify GDT/IDT initialization code Ard Biesheuvel
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

Some of the C code in head64.c may be called from a different virtual
address than it was linked at. Currently, we deal with this by using
ordinary, position dependent codegen, and fixing up all symbol
references on the fly. This is fragile and tricky to maintain. It is
also unnecessary: we can use position independent codegen (with hidden
visibility) to ensure that all compiler generated symbol references are
RIP-relative, removing the need for fixups entirely.

It does mean we need explicit references to kernel virtual addresses to
be generated by hand, so generate those using a movabs instruction in
inline asm in the handful of places where we actually need this.
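
The difference matters because a C reference such as &_text is emitted
RIP-relative under -fpie and therefore yields the address the code is
actually running from (the 1:1 mapped address), whereas an absolute
immediate reference yields the link-time address, i.e. the kernel
virtual address. A rough sketch, mirroring the __va_symbol() macro added
below:

  extern char _text[];
  unsigned long va;

  /* absolute reference: link-time address, i.e. the kernel virtual address */
  asm("movq $_text, %0" : "=r"(va));

  /* -fpie codegen: RIP-relative, i.e. the address we are running from */
  va = (unsigned long)&_text;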

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/Makefile                 |  8 ++
 arch/x86/boot/compressed/Makefile |  2 +-
 arch/x86/include/asm/desc.h       |  3 +-
 arch/x86/include/asm/setup.h      |  4 +-
 arch/x86/kernel/Makefile          |  5 ++
 arch/x86/kernel/head64.c          | 88 +++++++-------------
 arch/x86/kernel/head_64.S         |  5 +-
 7 files changed, 51 insertions(+), 64 deletions(-)

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 1a068de12a56..2b5954e75318 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -168,6 +168,14 @@ else
         KBUILD_CFLAGS += -mcmodel=kernel
         KBUILD_RUSTFLAGS += -Cno-redzone=y
         KBUILD_RUSTFLAGS += -Ccode-model=kernel
+
+	PIE_CFLAGS-$(CONFIG_STACKPROTECTOR)	+= -fno-stack-protector
+	PIE_CFLAGS-$(CONFIG_LTO)		+= -fno-lto
+
+	PIE_CFLAGS := -fpie -mcmodel=small $(PIE_CFLAGS-y) \
+		      -include $(srctree)/include/linux/hidden.h
+
+	export PIE_CFLAGS
 endif
 
 #
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index f19c038409aa..bccee07eae60 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -84,7 +84,7 @@ LDFLAGS_vmlinux += -T
 hostprogs	:= mkpiggy
 HOST_EXTRACFLAGS += -I$(srctree)/tools/include
 
-sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|__bss_start\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
+sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABbCDGRSTtVW] \(_text\|__bss_start\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
 
 quiet_cmd_voffset = VOFFSET $@
       cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@
diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index ab97b22ac04a..2e9809feeacd 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -134,7 +134,8 @@ static inline void paravirt_free_ldt(struct desc_struct *ldt, unsigned entries)
 
 #define store_ldt(ldt) asm("sldt %0" : "=m"(ldt))
 
-static inline void native_write_idt_entry(gate_desc *idt, int entry, const gate_desc *gate)
+static __always_inline void
+native_write_idt_entry(gate_desc *idt, int entry, const gate_desc *gate)
 {
 	memcpy(&idt[entry], gate, sizeof(*gate));
 }
diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index 5c83729c8e71..b004f1b9a052 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -47,8 +47,8 @@ extern unsigned long saved_video_mode;
 
 extern void reserve_standard_io_resources(void);
 extern void i386_reserve_resources(void);
-extern unsigned long __startup_64(unsigned long physaddr, struct boot_params *bp);
-extern void startup_64_setup_env(unsigned long physbase);
+extern unsigned long __startup_64(struct boot_params *bp);
+extern void startup_64_setup_env(void);
 extern void early_setup_idt(void);
 extern void __init do_early_exception(struct pt_regs *regs, int trapnr);
 
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 0000325ab98f..42db41b04d8e 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -21,6 +21,11 @@ CFLAGS_REMOVE_sev.o = -pg
 CFLAGS_REMOVE_rethook.o = -pg
 endif
 
+# head64.c contains C code that may execute from a different virtual address
+# than it was linked at, so we always build it using PIE codegen
+CFLAGS_head64.o += $(PIE_CFLAGS)
+UBSAN_SANITIZE_head64.o					:= n
+
 KASAN_SANITIZE_head$(BITS).o				:= n
 KASAN_SANITIZE_dumpstack.o				:= n
 KASAN_SANITIZE_dumpstack_$(BITS).o			:= n
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index d636bb02213f..a4a380494703 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -74,15 +74,10 @@ static struct desc_ptr startup_gdt_descr __initdata = {
 	.address = 0,
 };
 
-static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
-{
-	return ptr - (void *)_text + (void *)physaddr;
-}
-
-static unsigned long __head *fixup_long(void *ptr, unsigned long physaddr)
-{
-	return fixup_pointer(ptr, physaddr);
-}
+#define __va_symbol(sym) ({						\
+	unsigned long __v;						\
+	asm("movq $" __stringify(sym) ", %0":"=r"(__v));		\
+	__v; })
 
 static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdval_t *pmd)
 {
@@ -99,8 +94,8 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
 	 * attribute.
 	 */
 	if (sme_get_me_mask()) {
-		vaddr = (unsigned long)__start_bss_decrypted;
-		vaddr_end = (unsigned long)__end_bss_decrypted;
+		vaddr = __va_symbol(__start_bss_decrypted);
+		vaddr_end = __va_symbol(__end_bss_decrypted);
 
 		for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
 			/*
@@ -127,25 +122,17 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
 	return sme_get_me_mask();
 }
 
-/* Code in __startup_64() can be relocated during execution, but the compiler
- * doesn't have to generate PC-relative relocations when accessing globals from
- * that function. Clang actually does not generate them, which leads to
- * boot-time crashes. To work around this problem, every global pointer must
- * be adjusted using fixup_pointer().
- */
-unsigned long __head __startup_64(unsigned long physaddr,
-				  struct boot_params *bp)
+unsigned long __head __startup_64(struct boot_params *bp)
 {
+	unsigned long physaddr = (unsigned long)_text;
 	unsigned long load_delta, *p;
 	unsigned long pgtable_flags;
 	pgdval_t *pgd;
 	p4dval_t *p4d;
 	pudval_t *pud;
 	pmdval_t *pmd, pmd_entry;
-	pteval_t *mask_ptr;
 	bool la57;
 	int i;
-	unsigned int *next_pgt_ptr;
 
 	la57 = pgtable_l5_enabled();
 
@@ -157,7 +144,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	 * Compute the delta between the address I am compiled to run at
 	 * and the address I am actually running at.
 	 */
-	load_delta = physaddr - (unsigned long)(_text - __START_KERNEL_map);
+	load_delta = physaddr - (__va_symbol(_text) - __START_KERNEL_map);
 
 	/* Is the address not 2M aligned? */
 	if (load_delta & ~PMD_MASK)
@@ -168,26 +155,24 @@ unsigned long __head __startup_64(unsigned long physaddr,
 
 	/* Fixup the physical addresses in the page table */
 
-	pgd = fixup_pointer(early_top_pgt, physaddr);
+	pgd = (pgdval_t *)early_top_pgt;
 	p = pgd + pgd_index(__START_KERNEL_map);
 	if (la57)
 		*p = (unsigned long)level4_kernel_pgt;
 	else
 		*p = (unsigned long)level3_kernel_pgt;
-	*p += _PAGE_TABLE_NOENC - __START_KERNEL_map + load_delta;
+	*p += _PAGE_TABLE_NOENC + sme_get_me_mask();
 
 	if (la57) {
-		p4d = fixup_pointer(level4_kernel_pgt, physaddr);
+		p4d = (p4dval_t *)level4_kernel_pgt;
 		p4d[511] += load_delta;
 	}
 
-	pud = fixup_pointer(level3_kernel_pgt, physaddr);
-	pud[510] += load_delta;
-	pud[511] += load_delta;
+	level3_kernel_pgt[510].pud += load_delta;
+	level3_kernel_pgt[511].pud += load_delta;
 
-	pmd = fixup_pointer(level2_fixmap_pgt, physaddr);
 	for (i = FIXMAP_PMD_TOP; i > FIXMAP_PMD_TOP - FIXMAP_PMD_NUM; i--)
-		pmd[i] += load_delta;
+		level2_fixmap_pgt[i].pmd += load_delta;
 
 	/*
 	 * Set up the identity mapping for the switchover.  These
@@ -196,15 +181,13 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	 * it avoids problems around wraparound.
 	 */
 
-	next_pgt_ptr = fixup_pointer(&next_early_pgt, physaddr);
-	pud = fixup_pointer(early_dynamic_pgts[(*next_pgt_ptr)++], physaddr);
-	pmd = fixup_pointer(early_dynamic_pgts[(*next_pgt_ptr)++], physaddr);
+	pud = (pudval_t *)early_dynamic_pgts[next_early_pgt++];
+	pmd = (pmdval_t *)early_dynamic_pgts[next_early_pgt++];
 
 	pgtable_flags = _KERNPG_TABLE_NOENC + sme_get_me_mask();
 
 	if (la57) {
-		p4d = fixup_pointer(early_dynamic_pgts[(*next_pgt_ptr)++],
-				    physaddr);
+		p4d = (p4dval_t *)early_dynamic_pgts[next_early_pgt++];
 
 		i = (physaddr >> PGDIR_SHIFT) % PTRS_PER_PGD;
 		pgd[i + 0] = (pgdval_t)p4d + pgtable_flags;
@@ -225,8 +208,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
 
 	pmd_entry = __PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL;
 	/* Filter out unsupported __PAGE_KERNEL_* bits: */
-	mask_ptr = fixup_pointer(&__supported_pte_mask, physaddr);
-	pmd_entry &= *mask_ptr;
+	pmd_entry &= __supported_pte_mask;
 	pmd_entry += sme_get_me_mask();
 	pmd_entry +=  physaddr;
 
@@ -252,14 +234,14 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	 * error, causing the BIOS to halt the system.
 	 */
 
-	pmd = fixup_pointer(level2_kernel_pgt, physaddr);
+	pmd = (pmdval_t *)level2_kernel_pgt;
 
 	/* invalidate pages before the kernel image */
-	for (i = 0; i < pmd_index((unsigned long)_text); i++)
+	for (i = 0; i < pmd_index(__va_symbol(_text)); i++)
 		pmd[i] &= ~_PAGE_PRESENT;
 
 	/* fixup pages that are part of the kernel image */
-	for (; i <= pmd_index((unsigned long)_end); i++)
+	for (; i <= pmd_index(__va_symbol(_end)); i++)
 		if (pmd[i] & _PAGE_PRESENT)
 			pmd[i] += load_delta;
 
@@ -271,7 +253,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	 * Fixup phys_base - remove the memory encryption mask to obtain
 	 * the true physical address.
 	 */
-	*fixup_long(&phys_base, physaddr) += load_delta - sme_get_me_mask();
+	phys_base += load_delta - sme_get_me_mask();
 
 	return sme_postprocess_startup(bp, pmd);
 }
@@ -553,22 +535,16 @@ static void set_bringup_idt_handler(gate_desc *idt, int n, void *handler)
 }
 
 /* This runs while still in the direct mapping */
-static void __head startup_64_load_idt(unsigned long physbase)
+static void __head startup_64_load_idt(void)
 {
-	struct desc_ptr *desc = fixup_pointer(&bringup_idt_descr, physbase);
-	gate_desc *idt = fixup_pointer(bringup_idt_table, physbase);
-
-
-	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
-		void *handler;
+	gate_desc *idt = bringup_idt_table;
 
+	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
 		/* VMM Communication Exception */
-		handler = fixup_pointer(vc_no_ghcb, physbase);
-		set_bringup_idt_handler(idt, X86_TRAP_VC, handler);
-	}
+		set_bringup_idt_handler(idt, X86_TRAP_VC, vc_no_ghcb);
 
-	desc->address = (unsigned long)idt;
-	native_load_idt(desc);
+	bringup_idt_descr.address = (unsigned long)idt;
+	native_load_idt(&bringup_idt_descr);
 }
 
 /* This is used when running on kernel addresses */
@@ -587,10 +563,10 @@ void early_setup_idt(void)
 /*
  * Setup boot CPU state needed before kernel switches to virtual addresses.
  */
-void __head startup_64_setup_env(unsigned long physbase)
+void __head startup_64_setup_env(void)
 {
 	/* Load GDT */
-	startup_gdt_descr.address = (unsigned long)fixup_pointer(startup_gdt, physbase);
+	startup_gdt_descr.address = (unsigned long)startup_gdt;
 	native_load_gdt(&startup_gdt_descr);
 
 	/* New GDT is live - reload data segment registers */
@@ -598,5 +574,5 @@ void __head startup_64_setup_env(unsigned long physbase)
 		     "movl %%eax, %%ss\n"
 		     "movl %%eax, %%es\n" : : "a"(__KERNEL_DS) : "memory");
 
-	startup_64_load_idt(physbase);
+	startup_64_load_idt();
 }
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 953b82be4cd4..b0508e84f756 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -67,8 +67,6 @@ SYM_CODE_START_NOALIGN(startup_64)
 	/* Set up the stack for verify_cpu() */
 	leaq	(__end_init_task - PTREGS_SIZE)(%rip), %rsp
 
-	leaq	_text(%rip), %rdi
-
 	/* Setup GSBASE to allow stack canary access for C code */
 	movl	$MSR_GS_BASE, %ecx
 	leaq	INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
@@ -107,8 +105,7 @@ SYM_CODE_START_NOALIGN(startup_64)
 	 * is active) to be added to the initial pgdir entry that will be
 	 * programmed into CR3.
 	 */
-	leaq	_text(%rip), %rdi
-	movq	%r15, %rsi
+	movq	%r15, %rdi
 	call	__startup_64
 
 	/* Form the CR3 value being sure to include the CR3 modifier */
-- 
2.43.0.429.g432eaa2c6b-goog


* [PATCH v3 09/19] x86/head64: Simplify GDT/IDT initialization code
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (7 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 08/19] x86/head64: Replace pointer fixups with PIE codegen Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-02-12 14:37   ` Borislav Petkov
  2024-01-29 18:05 ` [PATCH v3 10/19] asm-generic: Add special .pi.text section for position independent code Ard Biesheuvel
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

There used to be two separate code paths for programming the IDT early:
one that was called via the 1:1 mapping, and one via the kernel virtual
mapping, where the former used explicit pointer fixups to obtain 1:1
mapped addresses.

That distinction is now gone, so the GDT/IDT init code can be unified and
simplified accordingly.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/kernel/head64.c | 57 +++++++-------------
 1 file changed, 18 insertions(+), 39 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index a4a380494703..58c58c66dec9 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -59,21 +59,12 @@ EXPORT_SYMBOL(vmemmap_base);
 /*
  * GDT used on the boot CPU before switching to virtual addresses.
  */
-static struct desc_struct startup_gdt[GDT_ENTRIES] __initdata = {
+static struct desc_struct startup_gdt[GDT_ENTRIES] __initconst = {
 	[GDT_ENTRY_KERNEL32_CS]         = GDT_ENTRY_INIT(DESC_CODE32, 0, 0xfffff),
 	[GDT_ENTRY_KERNEL_CS]           = GDT_ENTRY_INIT(DESC_CODE64, 0, 0xfffff),
 	[GDT_ENTRY_KERNEL_DS]           = GDT_ENTRY_INIT(DESC_DATA64, 0, 0xfffff),
 };
 
-/*
- * Address needs to be set at runtime because it references the startup_gdt
- * while the kernel still uses a direct mapping.
- */
-static struct desc_ptr startup_gdt_descr __initdata = {
-	.size = sizeof(startup_gdt)-1,
-	.address = 0,
-};
-
 #define __va_symbol(sym) ({						\
 	unsigned long __v;						\
 	asm("movq $" __stringify(sym) ", %0":"=r"(__v));		\
@@ -517,47 +508,32 @@ void __init __noreturn x86_64_start_reservations(char *real_mode_data)
  */
 static gate_desc bringup_idt_table[NUM_EXCEPTION_VECTORS] __page_aligned_data;
 
-static struct desc_ptr bringup_idt_descr = {
-	.size		= (NUM_EXCEPTION_VECTORS * sizeof(gate_desc)) - 1,
-	.address	= 0, /* Set at runtime */
-};
-
-static void set_bringup_idt_handler(gate_desc *idt, int n, void *handler)
-{
-#ifdef CONFIG_AMD_MEM_ENCRYPT
-	struct idt_data data;
-	gate_desc desc;
-
-	init_idt_data(&data, n, handler);
-	idt_init_desc(&desc, &data);
-	native_write_idt_entry(idt, n, &desc);
-#endif
-}
-
-/* This runs while still in the direct mapping */
-static void __head startup_64_load_idt(void)
+static void early_load_idt(void (*handler)(void))
 {
 	gate_desc *idt = bringup_idt_table;
+	struct desc_ptr bringup_idt_descr;
+
+	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
+		struct idt_data data;
+		gate_desc desc;
 
-	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
 		/* VMM Communication Exception */
-		set_bringup_idt_handler(idt, X86_TRAP_VC, vc_no_ghcb);
+		init_idt_data(&data, X86_TRAP_VC, handler);
+		idt_init_desc(&desc, &data);
+		native_write_idt_entry(idt, X86_TRAP_VC, &desc);
+	}
 
 	bringup_idt_descr.address = (unsigned long)idt;
+	bringup_idt_descr.size = sizeof(bringup_idt_table);
 	native_load_idt(&bringup_idt_descr);
 }
 
-/* This is used when running on kernel addresses */
 void early_setup_idt(void)
 {
-	/* VMM Communication Exception */
-	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
+	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
 		setup_ghcb();
-		set_bringup_idt_handler(bringup_idt_table, X86_TRAP_VC, vc_boot_ghcb);
-	}
 
-	bringup_idt_descr.address = (unsigned long)bringup_idt_table;
-	native_load_idt(&bringup_idt_descr);
+	early_load_idt(vc_boot_ghcb);
 }
 
 /*
@@ -565,8 +541,11 @@ void early_setup_idt(void)
  */
 void __head startup_64_setup_env(void)
 {
+	struct desc_ptr startup_gdt_descr;
+
 	/* Load GDT */
 	startup_gdt_descr.address = (unsigned long)startup_gdt;
+	startup_gdt_descr.size = sizeof(startup_gdt) - 1;
 	native_load_gdt(&startup_gdt_descr);
 
 	/* New GDT is live - reload data segment registers */
@@ -574,5 +553,5 @@ void __head startup_64_setup_env(void)
 		     "movl %%eax, %%ss\n"
 		     "movl %%eax, %%es\n" : : "a"(__KERNEL_DS) : "memory");
 
-	startup_64_load_idt();
+	early_load_idt(vc_no_ghcb);
 }
-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 10/19] asm-generic: Add special .pi.text section for position independent code
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (8 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 09/19] x86/head64: Simplify GDT/IDT initialization code Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 11/19] x86: Move return_thunk to __pitext section Ard Biesheuvel
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

Add a special .pi.text section that architectures will use to carry code
that can be called while the kernel is executing from a different
virtual address than its link time address. This is typically needed by
very early boot code that executes from a 1:1 mapping, and may need to
call into other code to perform preparatory tasks that must be completed
before switching to the kernel's ordinary virtual mapping.

Note that this implies that the code in question cannot generally be
instrumented safely, and so the contents are combined with the existing
.noinstr.text section, making .pi.text a proper subset of the former.
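
As a rough illustration (hypothetical code, not taken from this series),
C code opts in via the __pitext marker introduced below; the function
and everything it calls must then avoid absolute references and
instrumented helpers:

  /*
   * Hypothetical helper: safe to call from the 1:1 mapping because it
   * only touches its argument and locals.
   */
  static unsigned long __pitext early_pmd_align(unsigned long addr)
  {
          return ALIGN(addr, PMD_SIZE);
  }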

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 include/asm-generic/vmlinux.lds.h |  3 +++
 include/linux/init.h              | 12 +++++++++
 scripts/mod/modpost.c             |  5 +++-
 tools/objtool/check.c             | 26 ++++++++------------
 4 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 5dd3a61d673d..70c9767cac5a 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -553,6 +553,9 @@
 		__cpuidle_text_start = .;				\
 		*(.cpuidle.text)					\
 		__cpuidle_text_end = .;					\
+		__pi_text_start = .;					\
+		*(.pi.text)						\
+		__pi_text_end = .;					\
 		__noinstr_text_end = .;
 
 /*
diff --git a/include/linux/init.h b/include/linux/init.h
index 3fa3f6241350..85bb701b664c 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -55,6 +55,17 @@
 #define __exitdata	__section(".exit.data")
 #define __exit_call	__used __section(".exitcall.exit")
 
+/*
+ * __pitext should be used to mark code that can execute correctly from a
+ * different virtual offset than the kernel was linked at. This is used for
+ * code that is called extremely early during boot.
+ *
+ * Note that this is incompatible with KAsan, which applies an affine
+ * translation to the virtual address to obtain the shadow address which is
+ * strictly tied to the kernel's virtual address space.
+ */
+#define __pitext	__section(".pi.text") __no_sanitize_address notrace
+
 /*
  * modpost check for section mismatches during the kernel build.
  * A section mismatch happens when there are references from a
@@ -92,6 +103,7 @@
 
 /* For assembly routines */
 #define __HEAD		.section	".head.text","ax"
+#define __PITEXT	.section	".pi.text","ax"
 #define __INIT		.section	".init.text","ax"
 #define __FINIT		.previous
 
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 795b21154446..962d00df47ab 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -813,9 +813,12 @@ static void check_section(const char *modname, struct elf_info *elf,
 
 #define INIT_SECTIONS      ".init.*"
 
-#define ALL_TEXT_SECTIONS  ".init.text", ".meminit.text", ".exit.text", \
+#define ALL_PI_TEXT_SECTIONS  ".pi.text", ".pi.text.*"
+#define ALL_NON_PI_TEXT_SECTIONS  ".init.text", ".meminit.text", ".exit.text", \
 		TEXT_SECTIONS, OTHER_TEXT_SECTIONS
 
+#define ALL_TEXT_SECTIONS  ALL_NON_PI_TEXT_SECTIONS, ALL_PI_TEXT_SECTIONS
+
 enum mismatch {
 	TEXTDATA_TO_ANY_INIT_EXIT,
 	XXXINIT_TO_SOME_INIT,
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 548ec3cd7c00..af8f23a96037 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -389,6 +389,7 @@ static int decode_instructions(struct objtool_file *file)
 		if (!strcmp(sec->name, ".noinstr.text") ||
 		    !strcmp(sec->name, ".entry.text") ||
 		    !strcmp(sec->name, ".cpuidle.text") ||
+		    !strncmp(sec->name, ".pi.text", 8) ||
 		    !strncmp(sec->name, ".text..__x86.", 13))
 			sec->noinstr = true;
 
@@ -4234,23 +4235,16 @@ static int validate_noinstr_sections(struct objtool_file *file)
 {
 	struct section *sec;
 	int warnings = 0;
+	static char const *noinstr_sections[] = {
+		".noinstr.text", ".entry.text", ".cpuidle.text", ".pi.text",
+	};
 
-	sec = find_section_by_name(file->elf, ".noinstr.text");
-	if (sec) {
-		warnings += validate_section(file, sec);
-		warnings += validate_unwind_hints(file, sec);
-	}
-
-	sec = find_section_by_name(file->elf, ".entry.text");
-	if (sec) {
-		warnings += validate_section(file, sec);
-		warnings += validate_unwind_hints(file, sec);
-	}
-
-	sec = find_section_by_name(file->elf, ".cpuidle.text");
-	if (sec) {
-		warnings += validate_section(file, sec);
-		warnings += validate_unwind_hints(file, sec);
+	for (int i = 0; i < ARRAY_SIZE(noinstr_sections); i++) {
+		sec = find_section_by_name(file->elf, noinstr_sections[i]);
+		if (sec) {
+			warnings += validate_section(file, sec);
+			warnings += validate_unwind_hints(file, sec);
+		}
 	}
 
 	return warnings;
-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 11/19] x86: Move return_thunk to __pitext section
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (9 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 10/19] asm-generic: Add special .pi.text section for position independent code Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 12/19] x86/head64: Move early startup code into __pitext Ard Biesheuvel
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

The x86 return thunk will function correctly even when it is called via
a different virtual mapping than the one it was linked at, so it can
safely be moved to .pi.text. This allows other code in that section to
call it.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/kernel/vmlinux.lds.S | 2 +-
 arch/x86/lib/retpoline.S      | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index a349dbfc6d5a..77262e804250 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -134,7 +134,7 @@ SECTIONS
 		SOFTIRQENTRY_TEXT
 #ifdef CONFIG_RETPOLINE
 		*(.text..__x86.indirect_thunk)
-		*(.text..__x86.return_thunk)
+		*(.pi.text..__x86.return_thunk)
 #endif
 		STATIC_CALL_TEXT
 
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 7b2589877d06..003b35445bbb 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -136,7 +136,7 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
  * relocations for same-section JMPs and that breaks the returns
  * detection logic in apply_returns() and in objtool.
  */
-	.section .text..__x86.return_thunk
+	.section .pi.text..__x86.return_thunk, "ax"
 
 #ifdef CONFIG_CPU_SRSO
 
-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 12/19] x86/head64: Move early startup code into __pitext
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (10 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 11/19] x86: Move return_thunk to __pitext section Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 13/19] modpost: Warn about calls from __pitext into other text sections Ard Biesheuvel
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

The boot CPU runs some early startup C code using a 1:1 mapping of
memory, which deviates from the normal kernel virtual mapping that is
used for calculating statically initialized pointer variables.

This makes it necessary to strictly limit which C code will actually be
called from that early boot path. Implement this by moving the early
startup code into __pitext.
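
To illustrate the failure mode this guards against (a made-up example,
not code from the tree): a statically initialized pointer holds the
link-time virtual address, so dereferencing it while still running from
the 1:1 mapping faults before the kernel page tables are live.

  static int early_value = 1;
  static int *early_ptr = &early_value;	/* link-time address */

  static int __pitext broken_early_read(void)
  {
          return *early_ptr;	/* unsafe when called via the 1:1 mapping */
  }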

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/include/asm/init.h |  2 --
 arch/x86/kernel/head64.c    |  9 ++++----
 arch/x86/kernel/head_64.S   | 24 ++++++++++++--------
 3 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
index cc9ccf61b6bd..5f1d3c421f68 100644
--- a/arch/x86/include/asm/init.h
+++ b/arch/x86/include/asm/init.h
@@ -2,8 +2,6 @@
 #ifndef _ASM_X86_INIT_H
 #define _ASM_X86_INIT_H
 
-#define __head	__section(".head.text")
-
 struct x86_mapping_info {
 	void *(*alloc_pgt_page)(void *); /* allocate buf for page table */
 	void *context;			 /* context for alloc_pgt_page */
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 58c58c66dec9..0ecd36f5326a 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -70,7 +70,8 @@ static struct desc_struct startup_gdt[GDT_ENTRIES] __initconst = {
 	asm("movq $" __stringify(sym) ", %0":"=r"(__v));		\
 	__v; })
 
-static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdval_t *pmd)
+static unsigned long __pitext sme_postprocess_startup(struct boot_params *bp,
+						      pmdval_t *pmd)
 {
 	unsigned long vaddr, vaddr_end;
 	int i;
@@ -113,7 +114,7 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
 	return sme_get_me_mask();
 }
 
-unsigned long __head __startup_64(struct boot_params *bp)
+unsigned long __pitext __startup_64(struct boot_params *bp)
 {
 	unsigned long physaddr = (unsigned long)_text;
 	unsigned long load_delta, *p;
@@ -508,7 +509,7 @@ void __init __noreturn x86_64_start_reservations(char *real_mode_data)
  */
 static gate_desc bringup_idt_table[NUM_EXCEPTION_VECTORS] __page_aligned_data;
 
-static void early_load_idt(void (*handler)(void))
+static void __pitext early_load_idt(void (*handler)(void))
 {
 	gate_desc *idt = bringup_idt_table;
 	struct desc_ptr bringup_idt_descr;
@@ -539,7 +540,7 @@ void early_setup_idt(void)
 /*
  * Setup boot CPU state needed before kernel switches to virtual addresses.
  */
-void __head startup_64_setup_env(void)
+void __pitext startup_64_setup_env(void)
 {
 	struct desc_ptr startup_gdt_descr;
 
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index b0508e84f756..e671caafd932 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -42,6 +42,15 @@ L3_START_KERNEL = pud_index(__START_KERNEL_map)
 	__HEAD
 	.code64
 SYM_CODE_START_NOALIGN(startup_64)
+	UNWIND_HINT_END_OF_STACK
+	jmp	primary_startup_64
+SYM_CODE_END(startup_64)
+
+	__PITEXT
+#include "verify_cpu.S"
+#include "sev_verify_cbit.S"
+
+SYM_CODE_START_LOCAL(primary_startup_64)
 	UNWIND_HINT_END_OF_STACK
 	/*
 	 * At this point the CPU runs in 64bit mode CS.L = 1 CS.D = 0,
@@ -131,10 +140,12 @@ SYM_CODE_START_NOALIGN(startup_64)
 	movq	%rax, %cr3
 
 	/* Branch to the common startup code at its kernel virtual address */
-	movq	$common_startup_64, %rax
 	ANNOTATE_RETPOLINE_SAFE
-	jmp	*%rax
-SYM_CODE_END(startup_64)
+	jmp	*.Lcommon_startup_64(%rip)
+SYM_CODE_END(primary_startup_64)
+
+	__INITRODATA
+SYM_DATA_LOCAL(.Lcommon_startup_64, .quad common_startup_64)
 
 	.text
 SYM_CODE_START(secondary_startup_64)
@@ -410,9 +421,6 @@ SYM_INNER_LABEL(common_startup_64, SYM_L_LOCAL)
 	int3
 SYM_CODE_END(secondary_startup_64)
 
-#include "verify_cpu.S"
-#include "sev_verify_cbit.S"
-
 #if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_AMD_MEM_ENCRYPT)
 /*
  * Entry point for soft restart of a CPU. Invoked from xxx_play_dead() for
@@ -539,10 +547,8 @@ SYM_CODE_END(early_idt_handler_common)
  * paravirtualized INTERRUPT_RETURN and pv-ops don't work that early.
  *
  * XXX it does, fix this.
- *
- * This handler will end up in the .init.text section and not be
- * available to boot secondary CPUs.
  */
+	__PITEXT
 SYM_CODE_START_NOALIGN(vc_no_ghcb)
 	UNWIND_HINT_IRET_REGS offset=8
 	ENDBR
-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 13/19] modpost: Warn about calls from __pitext into other text sections
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (11 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 12/19] x86/head64: Move early startup code into __pitext Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 14/19] x86/coco: Make cc_set_mask() static inline Ard Biesheuvel
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

Ensure that code that is marked as being able to safely run from a 1:1
mapping does not call into other code which might lack that property.
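
For example (hypothetical code, not part of this patch), the new check
flags a .pi.text function that calls into ordinary .text:

  static void plain_helper(void)		/* ends up in .text */
  {
  }

  static void __pitext early_caller(void)	/* ends up in .pi.text */
  {
          plain_helper();			/* flagged by modpost */
  }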

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 scripts/mod/modpost.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 962d00df47ab..33b56d6b4e7b 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -825,6 +825,7 @@ enum mismatch {
 	ANY_INIT_TO_ANY_EXIT,
 	ANY_EXIT_TO_ANY_INIT,
 	EXTABLE_TO_NON_TEXT,
+	PI_TEXT_TO_NON_PI_TEXT,
 };
 
 /**
@@ -887,6 +888,11 @@ static const struct sectioncheck sectioncheck[] = {
 	.bad_tosec = { ".altinstr_replacement", NULL },
 	.good_tosec = {ALL_TEXT_SECTIONS , NULL},
 	.mismatch = EXTABLE_TO_NON_TEXT,
+},
+{
+	.fromsec = { ALL_PI_TEXT_SECTIONS, NULL },
+	.bad_tosec = { ALL_NON_PI_TEXT_SECTIONS, NULL },
+	.mismatch = PI_TEXT_TO_NON_PI_TEXT,
 }
 };
 
-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 14/19] x86/coco: Make cc_set_mask() static inline
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (12 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 13/19] modpost: Warn about calls from __pitext into other text sections Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-01-30 23:16   ` Kevin Loughlin
  2024-01-29 18:05 ` [PATCH v3 15/19] x86/sev: Make all code reachable from 1:1 mapping __pitext Ard Biesheuvel
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

Setting the cc_mask global variable may be done early in the boot while
running from a 1:1 translation. This code is built with -fPIC in order
to support this.

Make cc_set_mask() static inline so it can execute safely in this
context as well.
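
A hypothetical early caller would then look like this (sketch only; the
exact call site and mask value are assumptions, not part of this patch):

  static void __pitext example_sme_enable(u64 me_mask)
  {
          cc_vendor = CC_VENDOR_AMD;
          cc_set_mask(me_mask);
  }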

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/coco/core.c        | 7 +------
 arch/x86/include/asm/coco.h | 8 +++++++-
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index eeec9986570e..d07be9d05cd0 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -14,7 +14,7 @@
 #include <asm/processor.h>
 
 enum cc_vendor cc_vendor __ro_after_init = CC_VENDOR_NONE;
-static u64 cc_mask __ro_after_init;
+u64 cc_mask __ro_after_init;
 
 static bool noinstr intel_cc_platform_has(enum cc_attr attr)
 {
@@ -148,8 +148,3 @@ u64 cc_mkdec(u64 val)
 	}
 }
 EXPORT_SYMBOL_GPL(cc_mkdec);
-
-__init void cc_set_mask(u64 mask)
-{
-	cc_mask = mask;
-}
diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h
index 6ae2d16a7613..ecc29d6136ad 100644
--- a/arch/x86/include/asm/coco.h
+++ b/arch/x86/include/asm/coco.h
@@ -13,7 +13,13 @@ enum cc_vendor {
 extern enum cc_vendor cc_vendor;
 
 #ifdef CONFIG_ARCH_HAS_CC_PLATFORM
-void cc_set_mask(u64 mask);
+static inline void cc_set_mask(u64 mask)
+{
+	extern u64 cc_mask;
+
+	cc_mask = mask;
+}
+
 u64 cc_mkenc(u64 val);
 u64 cc_mkdec(u64 val);
 #else
-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 15/19] x86/sev: Make all code reachable from 1:1 mapping __pitext
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (13 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 14/19] x86/coco: Make cc_set_mask() static inline Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 16/19] x86/sev: Avoid WARN() in early code Ard Biesheuvel
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

We cannot safely call any code when still executing from the 1:1 mapping
at early boot. The SEV init code in particular does a fair amount of
work this early, and calls into ordinary APIs, which is not safe, as
these may be instrumented by the sanitizers or by things like
CONFIG_DEBUG_VM or CONFIG_DEBUG_VIRTUAL.

So annotate all SEV code used early as __pitext, along with some of the
shared code that it relies on. Also override some definitions of the
__pa/__va translation macros to avoid pulling in debug versions.
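
As a sketch of why the debug variants are a problem (illustrative, not
code from this patch): with CONFIG_DEBUG_VIRTUAL=y, __pa() expands to an
out-of-line call into __phys_addr() that performs sanity checks, whereas
__pa_nodebug() remains a pure address calculation that is safe to use
from the 1:1 mapping:

  static unsigned long __pitext early_phys(unsigned long vaddr)
  {
          return __pa_nodebug(vaddr);
  }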

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/boot/compressed/sev.c     |  6 +++
 arch/x86/include/asm/mem_encrypt.h |  8 ++--
 arch/x86/include/asm/pgtable_64.h  | 12 +++++-
 arch/x86/include/asm/sev.h         |  6 +--
 arch/x86/kernel/head64.c           | 20 ++++++----
 arch/x86/kernel/sev-shared.c       | 40 +++++++++++---------
 arch/x86/kernel/sev.c              | 14 +++----
 arch/x86/lib/memcpy_64.S           |  3 +-
 arch/x86/lib/memset_64.S           |  3 +-
 arch/x86/mm/mem_encrypt_boot.S     |  3 +-
 arch/x86/mm/mem_encrypt_identity.c | 35 ++++++++---------
 11 files changed, 90 insertions(+), 60 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 073291832f44..ada6cd8d600b 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -25,6 +25,9 @@
 #include "error.h"
 #include "../msr.h"
 
+#undef __pa_nodebug
+#define __pa_nodebug __pa
+
 static struct ghcb boot_ghcb_page __aligned(PAGE_SIZE);
 struct ghcb *boot_ghcb;
 
@@ -116,6 +119,9 @@ static bool fault_in_kernel_space(unsigned long address)
 #undef __init
 #define __init
 
+#undef __pitext
+#define __pitext
+
 #define __BOOT_COMPRESSED
 
 /* Basic instruction decoding support needed */
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 359ada486fa9..48469e22a75e 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -46,8 +46,8 @@ void __init sme_unmap_bootdata(char *real_mode_data);
 
 void __init sme_early_init(void);
 
-void __init sme_encrypt_kernel(struct boot_params *bp);
-void __init sme_enable(struct boot_params *bp);
+void sme_encrypt_kernel(struct boot_params *bp);
+void sme_enable(struct boot_params *bp);
 
 int __init early_set_memory_decrypted(unsigned long vaddr, unsigned long size);
 int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size);
@@ -75,8 +75,8 @@ static inline void __init sme_unmap_bootdata(char *real_mode_data) { }
 
 static inline void __init sme_early_init(void) { }
 
-static inline void __init sme_encrypt_kernel(struct boot_params *bp) { }
-static inline void __init sme_enable(struct boot_params *bp) { }
+static inline void sme_encrypt_kernel(struct boot_params *bp) { }
+static inline void sme_enable(struct boot_params *bp) { }
 
 static inline void sev_es_init_vc_handling(void) { }
 
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 24af25b1551a..3a6d90f47f32 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -139,12 +139,17 @@ static inline pud_t native_pudp_get_and_clear(pud_t *xp)
 #endif
 }
 
+static inline void set_p4d_kernel(p4d_t *p4dp, p4d_t p4d)
+{
+	WRITE_ONCE(*p4dp, p4d);
+}
+
 static inline void native_set_p4d(p4d_t *p4dp, p4d_t p4d)
 {
 	pgd_t pgd;
 
 	if (pgtable_l5_enabled() || !IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION)) {
-		WRITE_ONCE(*p4dp, p4d);
+		set_p4d_kernel(p4dp, p4d);
 		return;
 	}
 
@@ -158,6 +163,11 @@ static inline void native_p4d_clear(p4d_t *p4d)
 	native_set_p4d(p4d, native_make_p4d(0));
 }
 
+static inline void set_pgd_kernel(pgd_t *pgdp, pgd_t pgd)
+{
+	WRITE_ONCE(*pgdp, pgd);
+}
+
 static inline void native_set_pgd(pgd_t *pgdp, pgd_t pgd)
 {
 	WRITE_ONCE(*pgdp, pti_set_user_pgtbl(pgdp, pgd));
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 5b4a1ce3d368..e3b55bd15ce1 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -201,14 +201,14 @@ struct snp_guest_request_ioctl;
 void setup_ghcb(void);
 void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
 					 unsigned long npages);
-void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
-					unsigned long npages);
+void early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+				 unsigned long npages);
 void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
 void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
 void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
 void snp_set_wakeup_secondary_cpu(void);
 bool snp_init(struct boot_params *bp);
-void __init __noreturn snp_abort(void);
+void __noreturn snp_abort(void);
 int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct snp_guest_request_ioctl *rio);
 void snp_accept_memory(phys_addr_t start, phys_addr_t end);
 u64 snp_get_unsupported_features(u64 status);
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 0ecd36f5326a..b014f81e0eac 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -91,16 +91,20 @@ static unsigned long __pitext sme_postprocess_startup(struct boot_params *bp,
 
 		for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
 			/*
-			 * On SNP, transition the page to shared in the RMP table so that
-			 * it is consistent with the page table attribute change.
+			 * On SNP, transition the page to shared in the RMP
+			 * table so that it is consistent with the page table
+			 * attribute change.
 			 *
-			 * __start_bss_decrypted has a virtual address in the high range
-			 * mapping (kernel .text). PVALIDATE, by way of
-			 * early_snp_set_memory_shared(), requires a valid virtual
-			 * address but the kernel is currently running off of the identity
-			 * mapping so use __pa() to get a *currently* valid virtual address.
+			 * __start_bss_decrypted has a virtual address in the
+			 * high range mapping (kernel .text). PVALIDATE, by way
+			 * of early_snp_set_memory_shared(), requires a valid
+			 * virtual address but the kernel is currently running
+			 * off of the identity mapping so use __pa() to get a
+			 * *currently* valid virtual address.
 			 */
-			early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
+			early_snp_set_memory_shared(__pa_nodebug(vaddr),
+						    __pa_nodebug(vaddr),
+						    PTRS_PER_PMD);
 
 			i = pmd_index(vaddr);
 			pmd[i] -= sme_get_me_mask();
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 5db24d0fc557..481dbd009ce9 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -93,7 +93,8 @@ static bool __init sev_es_check_cpu_features(void)
 	return true;
 }
 
-static void __noreturn sev_es_terminate(unsigned int set, unsigned int reason)
+static __always_inline void __noreturn sev_es_terminate(unsigned int set,
+							unsigned int reason)
 {
 	u64 val = GHCB_MSR_TERM_REQ;
 
@@ -226,10 +227,9 @@ static enum es_result verify_exception_info(struct ghcb *ghcb, struct es_em_ctxt
 	return ES_VMM_ERROR;
 }
 
-static enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
-					  struct es_em_ctxt *ctxt,
-					  u64 exit_code, u64 exit_info_1,
-					  u64 exit_info_2)
+static enum es_result __pitext
+sev_es_ghcb_hv_call(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
+		    u64 exit_code, u64 exit_info_1, u64 exit_info_2)
 {
 	/* Fill in protocol and format specifiers */
 	ghcb->protocol_version = ghcb_version;
@@ -239,13 +239,13 @@ static enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
 	ghcb_set_sw_exit_info_1(ghcb, exit_info_1);
 	ghcb_set_sw_exit_info_2(ghcb, exit_info_2);
 
-	sev_es_wr_ghcb_msr(__pa(ghcb));
+	sev_es_wr_ghcb_msr(__pa_nodebug(ghcb));
 	VMGEXIT();
 
 	return verify_exception_info(ghcb, ctxt);
 }
 
-static int __sev_cpuid_hv(u32 fn, int reg_idx, u32 *reg)
+static int __pitext __sev_cpuid_hv(u32 fn, int reg_idx, u32 *reg)
 {
 	u64 val;
 
@@ -260,7 +260,7 @@ static int __sev_cpuid_hv(u32 fn, int reg_idx, u32 *reg)
 	return 0;
 }
 
-static int __sev_cpuid_hv_msr(struct cpuid_leaf *leaf)
+static int __pitext __sev_cpuid_hv_msr(struct cpuid_leaf *leaf)
 {
 	int ret;
 
@@ -283,7 +283,9 @@ static int __sev_cpuid_hv_msr(struct cpuid_leaf *leaf)
 	return ret;
 }
 
-static int __sev_cpuid_hv_ghcb(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
+static int __pitext __sev_cpuid_hv_ghcb(struct ghcb *ghcb,
+					struct es_em_ctxt *ctxt,
+					struct cpuid_leaf *leaf)
 {
 	u32 cr4 = native_read_cr4();
 	int ret;
@@ -316,7 +318,8 @@ static int __sev_cpuid_hv_ghcb(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struc
 	return ES_OK;
 }
 
-static int sev_cpuid_hv(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
+static int __pitext sev_cpuid_hv(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
+				 struct cpuid_leaf *leaf)
 {
 	return ghcb ? __sev_cpuid_hv_ghcb(ghcb, ctxt, leaf)
 		    : __sev_cpuid_hv_msr(leaf);
@@ -395,7 +398,7 @@ static u32 snp_cpuid_calc_xsave_size(u64 xfeatures_en, bool compacted)
 	return xsave_size;
 }
 
-static bool
+static bool __pitext
 snp_cpuid_get_validated_func(struct cpuid_leaf *leaf)
 {
 	const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
@@ -431,14 +434,16 @@ snp_cpuid_get_validated_func(struct cpuid_leaf *leaf)
 	return false;
 }
 
-static void snp_cpuid_hv(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
+static void __pitext snp_cpuid_hv(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
+				  struct cpuid_leaf *leaf)
 {
 	if (sev_cpuid_hv(ghcb, ctxt, leaf))
 		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID_HV);
 }
 
-static int snp_cpuid_postprocess(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
-				 struct cpuid_leaf *leaf)
+static int __pitext snp_cpuid_postprocess(struct ghcb *ghcb,
+					 struct es_em_ctxt *ctxt,
+					 struct cpuid_leaf *leaf)
 {
 	struct cpuid_leaf leaf_hv = *leaf;
 
@@ -532,7 +537,8 @@ static int snp_cpuid_postprocess(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
  * Returns -EOPNOTSUPP if feature not enabled. Any other non-zero return value
  * should be treated as fatal by caller.
  */
-static int snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
+static int __pitext snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
+			      struct cpuid_leaf *leaf)
 {
 	const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
 
@@ -574,7 +580,7 @@ static int snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_le
  * page yet, so it only supports the MSR based communication with the
  * hypervisor and only the CPUID exit-code.
  */
-void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
+void __pitext do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
 {
 	unsigned int subfn = lower_bits(regs->cx, 32);
 	unsigned int fn = lower_bits(regs->ax, 32);
@@ -1052,7 +1058,7 @@ static struct cc_blob_sev_info *find_cc_blob_setup_data(struct boot_params *bp)
  * mapping needs to be updated in sync with all the changes to virtual memory
  * layout and related mapping facilities throughout the boot process.
  */
-static void __init setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
+static void __pitext setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
 {
 	const struct snp_cpuid_table *cpuid_table_fw, *cpuid_table;
 	int i;
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 1ec753331524..62981b463b76 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -682,8 +682,8 @@ static u64 __init get_jump_table_addr(void)
 	return ret;
 }
 
-static void early_set_pages_state(unsigned long vaddr, unsigned long paddr,
-				  unsigned long npages, enum psc_op op)
+static void __pitext early_set_pages_state(unsigned long vaddr, unsigned long paddr,
+					   unsigned long npages, enum psc_op op)
 {
 	unsigned long paddr_end;
 	u64 val;
@@ -758,8 +758,8 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
 	early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_PRIVATE);
 }
 
-void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
-					unsigned long npages)
+void __pitext early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+					  unsigned long npages)
 {
 	/*
 	 * This can be invoked in early boot while running identity mapped, so
@@ -2062,7 +2062,7 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
  *
  * Scan for the blob in that order.
  */
-static __init struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
+static __pitext struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
 {
 	struct cc_blob_sev_info *cc_info;
 
@@ -2088,7 +2088,7 @@ static __init struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
 	return cc_info;
 }
 
-bool __init snp_init(struct boot_params *bp)
+bool __pitext snp_init(struct boot_params *bp)
 {
 	struct cc_blob_sev_info *cc_info;
 
@@ -2110,7 +2110,7 @@ bool __init snp_init(struct boot_params *bp)
 	return true;
 }
 
-void __init __noreturn snp_abort(void)
+void __pitext __noreturn snp_abort(void)
 {
 	sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
 }
diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S
index 0ae2e1712e2e..f56cb062d874 100644
--- a/arch/x86/lib/memcpy_64.S
+++ b/arch/x86/lib/memcpy_64.S
@@ -2,13 +2,14 @@
 /* Copyright 2002 Andi Kleen */
 
 #include <linux/export.h>
+#include <linux/init.h>
 #include <linux/linkage.h>
 #include <linux/cfi_types.h>
 #include <asm/errno.h>
 #include <asm/cpufeatures.h>
 #include <asm/alternative.h>
 
-.section .noinstr.text, "ax"
+	__PITEXT
 
 /*
  * memcpy - Copy a memory block.
diff --git a/arch/x86/lib/memset_64.S b/arch/x86/lib/memset_64.S
index 0199d56cb479..455424dcadc0 100644
--- a/arch/x86/lib/memset_64.S
+++ b/arch/x86/lib/memset_64.S
@@ -2,11 +2,12 @@
 /* Copyright 2002 Andi Kleen, SuSE Labs */
 
 #include <linux/export.h>
+#include <linux/init.h>
 #include <linux/linkage.h>
 #include <asm/cpufeatures.h>
 #include <asm/alternative.h>
 
-.section .noinstr.text, "ax"
+	__PITEXT
 
 /*
  * ISO C memset - set a memory block to a byte value. This function uses fast
diff --git a/arch/x86/mm/mem_encrypt_boot.S b/arch/x86/mm/mem_encrypt_boot.S
index e25288ee33c2..f951f4f86e5c 100644
--- a/arch/x86/mm/mem_encrypt_boot.S
+++ b/arch/x86/mm/mem_encrypt_boot.S
@@ -7,6 +7,7 @@
  * Author: Tom Lendacky <thomas.lendacky@amd.com>
  */
 
+#include <linux/init.h>
 #include <linux/linkage.h>
 #include <linux/pgtable.h>
 #include <asm/page.h>
@@ -14,7 +15,7 @@
 #include <asm/msr-index.h>
 #include <asm/nospec-branch.h>
 
-	.text
+	__PITEXT
 	.code64
 SYM_FUNC_START(sme_encrypt_execute)
 
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 2e195866a7fe..bc39e04de980 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -85,7 +85,8 @@ struct sme_populate_pgd_data {
  */
 static char sme_workarea[2 * PMD_SIZE] __section(".init.scratch");
 
-static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd)
+
+static void __pitext sme_clear_pgd(struct sme_populate_pgd_data *ppd)
 {
 	unsigned long pgd_start, pgd_end, pgd_size;
 	pgd_t *pgd_p;
@@ -100,7 +101,7 @@ static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd)
 	memset(pgd_p, 0, pgd_size);
 }
 
-static pud_t __init *sme_prepare_pgd(struct sme_populate_pgd_data *ppd)
+static pud_t __pitext *sme_prepare_pgd(struct sme_populate_pgd_data *ppd)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -112,7 +113,7 @@ static pud_t __init *sme_prepare_pgd(struct sme_populate_pgd_data *ppd)
 		p4d = ppd->pgtable_area;
 		memset(p4d, 0, sizeof(*p4d) * PTRS_PER_P4D);
 		ppd->pgtable_area += sizeof(*p4d) * PTRS_PER_P4D;
-		set_pgd(pgd, __pgd(PGD_FLAGS | __pa(p4d)));
+		set_pgd_kernel(pgd, __pgd(PGD_FLAGS | __pa(p4d)));
 	}
 
 	p4d = p4d_offset(pgd, ppd->vaddr);
@@ -120,7 +121,7 @@ static pud_t __init *sme_prepare_pgd(struct sme_populate_pgd_data *ppd)
 		pud = ppd->pgtable_area;
 		memset(pud, 0, sizeof(*pud) * PTRS_PER_PUD);
 		ppd->pgtable_area += sizeof(*pud) * PTRS_PER_PUD;
-		set_p4d(p4d, __p4d(P4D_FLAGS | __pa(pud)));
+		set_p4d_kernel(p4d, __p4d(P4D_FLAGS | __pa(pud)));
 	}
 
 	pud = pud_offset(p4d, ppd->vaddr);
@@ -137,7 +138,7 @@ static pud_t __init *sme_prepare_pgd(struct sme_populate_pgd_data *ppd)
 	return pud;
 }
 
-static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
+static void __pitext sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
 {
 	pud_t *pud;
 	pmd_t *pmd;
@@ -153,7 +154,7 @@ static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
 	set_pmd(pmd, __pmd(ppd->paddr | ppd->pmd_flags));
 }
 
-static void __init sme_populate_pgd(struct sme_populate_pgd_data *ppd)
+static void __pitext sme_populate_pgd(struct sme_populate_pgd_data *ppd)
 {
 	pud_t *pud;
 	pmd_t *pmd;
@@ -179,7 +180,7 @@ static void __init sme_populate_pgd(struct sme_populate_pgd_data *ppd)
 		set_pte(pte, __pte(ppd->paddr | ppd->pte_flags));
 }
 
-static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
+static void __pitext __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
 {
 	while (ppd->vaddr < ppd->vaddr_end) {
 		sme_populate_pgd_large(ppd);
@@ -189,7 +190,7 @@ static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
 	}
 }
 
-static void __init __sme_map_range_pte(struct sme_populate_pgd_data *ppd)
+static void __pitext __sme_map_range_pte(struct sme_populate_pgd_data *ppd)
 {
 	while (ppd->vaddr < ppd->vaddr_end) {
 		sme_populate_pgd(ppd);
@@ -199,7 +200,7 @@ static void __init __sme_map_range_pte(struct sme_populate_pgd_data *ppd)
 	}
 }
 
-static void __init __sme_map_range(struct sme_populate_pgd_data *ppd,
+static void __pitext __sme_map_range(struct sme_populate_pgd_data *ppd,
 				   pmdval_t pmd_flags, pteval_t pte_flags)
 {
 	unsigned long vaddr_end;
@@ -223,22 +224,22 @@ static void __init __sme_map_range(struct sme_populate_pgd_data *ppd,
 	__sme_map_range_pte(ppd);
 }
 
-static void __init sme_map_range_encrypted(struct sme_populate_pgd_data *ppd)
+static void __pitext sme_map_range_encrypted(struct sme_populate_pgd_data *ppd)
 {
 	__sme_map_range(ppd, PMD_FLAGS_ENC, PTE_FLAGS_ENC);
 }
 
-static void __init sme_map_range_decrypted(struct sme_populate_pgd_data *ppd)
+static void __pitext sme_map_range_decrypted(struct sme_populate_pgd_data *ppd)
 {
 	__sme_map_range(ppd, PMD_FLAGS_DEC, PTE_FLAGS_DEC);
 }
 
-static void __init sme_map_range_decrypted_wp(struct sme_populate_pgd_data *ppd)
+static void __pitext sme_map_range_decrypted_wp(struct sme_populate_pgd_data *ppd)
 {
 	__sme_map_range(ppd, PMD_FLAGS_DEC_WP, PTE_FLAGS_DEC_WP);
 }
 
-static unsigned long __init sme_pgtable_calc(unsigned long len)
+static unsigned long __pitext sme_pgtable_calc(unsigned long len)
 {
 	unsigned long entries = 0, tables = 0;
 
@@ -275,7 +276,7 @@ static unsigned long __init sme_pgtable_calc(unsigned long len)
 	return entries + tables;
 }
 
-void __init sme_encrypt_kernel(struct boot_params *bp)
+void __pitext sme_encrypt_kernel(struct boot_params *bp)
 {
 	unsigned long workarea_start, workarea_end, workarea_len;
 	unsigned long execute_start, execute_end, execute_len;
@@ -310,8 +311,8 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
 	 */
 
 	/* Physical addresses gives us the identity mapped virtual addresses */
-	kernel_start = __pa_symbol(_text);
-	kernel_end = ALIGN(__pa_symbol(_end), PMD_SIZE);
+	kernel_start = __pa(_text);
+	kernel_end = ALIGN(__pa(_end), PMD_SIZE);
 	kernel_len = kernel_end - kernel_start;
 
 	initrd_start = 0;
@@ -488,7 +489,7 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
 	native_write_cr3(__native_read_cr3());
 }
 
-void __init sme_enable(struct boot_params *bp)
+void __pitext sme_enable(struct boot_params *bp)
 {
 	unsigned int eax, ebx, ecx, edx;
 	unsigned long feature_mask;
-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 16/19] x86/sev: Avoid WARN() in early code
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (14 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 15/19] x86/sev: Make all code reachable from 1:1 mapping __pitext Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 17/19] x86/sev: Use PIC codegen for early SEV startup code Ard Biesheuvel
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

Drop uses of WARN() from code that is reachable from the early primary
boot path which executes via the initial 1:1 mapping before the kernel
page tables are populated. This is unsafe and mostly pointless, given
that printk() does not actually work yet at this point.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/kernel/sev.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 62981b463b76..94bf054bbde3 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -698,7 +698,7 @@ static void __pitext early_set_pages_state(unsigned long vaddr, unsigned long pa
 		if (op == SNP_PAGE_STATE_SHARED) {
 			/* Page validation must be rescinded before changing to shared */
 			ret = pvalidate(vaddr, RMP_PG_SIZE_4K, false);
-			if (WARN(ret, "Failed to validate address 0x%lx ret %d", paddr, ret))
+			if (ret)
 				goto e_term;
 		}
 
@@ -711,21 +711,16 @@ static void __pitext early_set_pages_state(unsigned long vaddr, unsigned long pa
 
 		val = sev_es_rd_ghcb_msr();
 
-		if (WARN(GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP,
-			 "Wrong PSC response code: 0x%x\n",
-			 (unsigned int)GHCB_RESP_CODE(val)))
+		if (GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP)
 			goto e_term;
 
-		if (WARN(GHCB_MSR_PSC_RESP_VAL(val),
-			 "Failed to change page state to '%s' paddr 0x%lx error 0x%llx\n",
-			 op == SNP_PAGE_STATE_PRIVATE ? "private" : "shared",
-			 paddr, GHCB_MSR_PSC_RESP_VAL(val)))
+		if (GHCB_MSR_PSC_RESP_VAL(val))
 			goto e_term;
 
 		if (op == SNP_PAGE_STATE_PRIVATE) {
 			/* Page validation must be performed after changing to private */
 			ret = pvalidate(vaddr, RMP_PG_SIZE_4K, true);
-			if (WARN(ret, "Failed to validate address 0x%lx ret %d", paddr, ret))
+			if (ret)
 				goto e_term;
 		}
 
-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 17/19] x86/sev: Use PIC codegen for early SEV startup code
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (15 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 16/19] x86/sev: Avoid WARN() in early code Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 18/19] x86/sev: Drop inline asm LEA instructions for RIP-relative references Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 19/19] x86/startup_64: Don't bother setting up GS before the kernel is mapped Ard Biesheuvel
  18 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

Use PIC codegen for the compilation units containing code that may be
called very early during the boot, at which point the CPU still runs
from the 1:1 mapping of memory. This is necessary to prevent the
compiler from emitting absolute symbol references to addresses that are
not mapped yet.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/kernel/Makefile      | 2 ++
 arch/x86/kernel/vmlinux.lds.S | 1 +
 arch/x86/mm/Makefile          | 2 +-
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 42db41b04d8e..3819b65c64ec 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -24,7 +24,9 @@ endif
 # head64.c contains C code that may execute from a different virtual address
 # than it was linked at, so we always build it using PIE codegen
 CFLAGS_head64.o += $(PIE_CFLAGS)
+CFLAGS_sev.o += $(PIE_CFLAGS)
 UBSAN_SANITIZE_head64.o					:= n
+UBSAN_SANITIZE_sev.o					:= n
 
 KASAN_SANITIZE_head$(BITS).o				:= n
 KASAN_SANITIZE_dumpstack.o				:= n
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 77262e804250..bbdccb6362a9 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -182,6 +182,7 @@ SECTIONS
 
 		DATA_DATA
 		CONSTRUCTORS
+		*(.data.rel .data.rel.*)
 
 		/* rarely changed data like cpu maps */
 		READ_MOSTLY_DATA(INTERNODE_CACHE_BYTES)
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index c80febc44cd2..f3bb8b415348 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -31,7 +31,7 @@ obj-y				+= pat/
 
 # Make sure __phys_addr has no stackprotector
 CFLAGS_physaddr.o		:= -fno-stack-protector
-CFLAGS_mem_encrypt_identity.o	:= -fno-stack-protector
+CFLAGS_mem_encrypt_identity.o	:= $(PIE_CFLAGS)
 
 CFLAGS_fault.o := -I $(srctree)/$(src)/../include/asm/trace
 
-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 18/19] x86/sev: Drop inline asm LEA instructions for RIP-relative references
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (16 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 17/19] x86/sev: Use PIC codegen for early SEV startup code Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  2024-01-29 18:05 ` [PATCH v3 19/19] x86/startup_64: Don't bother setting up GS before the kernel is mapped Ard Biesheuvel
  18 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

The SEV code that may run early is now built with -fPIC, so there is
no longer a need for explicit RIP-relative references in inline asm:
the compiler now emits RIP-relative accesses on its own.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/kernel/sev-shared.c       | 14 +-------------
 arch/x86/mm/mem_encrypt_identity.c | 11 +----------
 2 files changed, 2 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 481dbd009ce9..1cfbc6d0df89 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -325,21 +325,9 @@ static int __pitext sev_cpuid_hv(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
 		    : __sev_cpuid_hv_msr(leaf);
 }
 
-/*
- * This may be called early while still running on the initial identity
- * mapping. Use RIP-relative addressing to obtain the correct address
- * while running with the initial identity mapping as well as the
- * switch-over to kernel virtual addresses later.
- */
 static const struct snp_cpuid_table *snp_cpuid_get_table(void)
 {
-	void *ptr;
-
-	asm ("lea cpuid_table_copy(%%rip), %0"
-	     : "=r" (ptr)
-	     : "p" (&cpuid_table_copy));
-
-	return ptr;
+	return &cpuid_table_copy;
 }
 
 /*
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index bc39e04de980..d01e6b1256c6 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -85,7 +85,6 @@ struct sme_populate_pgd_data {
  */
 static char sme_workarea[2 * PMD_SIZE] __section(".init.scratch");
 
-
 static void __pitext sme_clear_pgd(struct sme_populate_pgd_data *ppd)
 {
 	unsigned long pgd_start, pgd_end, pgd_size;
@@ -329,14 +328,6 @@ void __pitext sme_encrypt_kernel(struct boot_params *bp)
 	}
 #endif
 
-	/*
-	 * We're running identity mapped, so we must obtain the address to the
-	 * SME encryption workarea using rip-relative addressing.
-	 */
-	asm ("lea sme_workarea(%%rip), %0"
-	     : "=r" (workarea_start)
-	     : "p" (sme_workarea));
-
 	/*
 	 * Calculate required number of workarea bytes needed:
 	 *   executable encryption area size:
@@ -346,7 +337,7 @@ void __pitext sme_encrypt_kernel(struct boot_params *bp)
 	 *   pagetable structures for the encryption of the kernel
 	 *   pagetable structures for workarea (in case not currently mapped)
 	 */
-	execute_start = workarea_start;
+	execute_start = workarea_start = (unsigned long)sme_workarea;
 	execute_end = execute_start + (PAGE_SIZE * 2) + PMD_SIZE;
 	execute_len = execute_end - execute_start;
 
-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v3 19/19] x86/startup_64: Don't bother setting up GS before the kernel is mapped
  2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
                   ` (17 preceding siblings ...)
  2024-01-29 18:05 ` [PATCH v3 18/19] x86/sev: Drop inline asm LEA instructions for RIP-relative references Ard Biesheuvel
@ 2024-01-29 18:05 ` Ard Biesheuvel
  18 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-29 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Kevin Loughlin, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

From: Ard Biesheuvel <ardb@kernel.org>

The code that executes from the early 1:1 mapping of the kernel should
set up the kernel page tables and nothing else. C code that is linked
into this code path is severely restricted in what it can do, and is
therefore required to remain uninstrumented. It is also built with -fPIC
and without stack protector support.

This makes it unnecessary to enable per-CPU variable access this early,
and for the boot CPU, the initialization that occurs in the common CPU
startup path is sufficient.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/kernel/head_64.S | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index e671caafd932..ae211cb62a1e 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -76,13 +76,6 @@ SYM_CODE_START_LOCAL(primary_startup_64)
 	/* Set up the stack for verify_cpu() */
 	leaq	(__end_init_task - PTREGS_SIZE)(%rip), %rsp
 
-	/* Setup GSBASE to allow stack canary access for C code */
-	movl	$MSR_GS_BASE, %ecx
-	leaq	INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
-	movl	%edx, %eax
-	shrq	$32,  %rdx
-	wrmsr
-
 	call	startup_64_setup_env
 
 	/* Now switch to __KERNEL_CS so IRET works reliably */
-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 14/19] x86/coco: Make cc_set_mask() static inline
  2024-01-29 18:05 ` [PATCH v3 14/19] x86/coco: Make cc_set_mask() static inline Ard Biesheuvel
@ 2024-01-30 23:16   ` Kevin Loughlin
  2024-01-30 23:36     ` Ard Biesheuvel
  0 siblings, 1 reply; 52+ messages in thread
From: Kevin Loughlin @ 2024-01-30 23:16 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, Ard Biesheuvel, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, Jan 29, 2024 at 10:06 AM Ard Biesheuvel <ardb+git@google.com> wrote:
>
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Setting the cc_mask global variable may be done early in the boot while
> running fromm a 1:1 translation. This code is built with -fPIC in order
> to support this.
>
> Make cc_set_mask() static inline so it can execute safely in this
> context as well.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/x86/coco/core.c        | 7 +------
>  arch/x86/include/asm/coco.h | 8 +++++++-
>  2 files changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
> index eeec9986570e..d07be9d05cd0 100644
> --- a/arch/x86/coco/core.c
> +++ b/arch/x86/coco/core.c
> @@ -14,7 +14,7 @@
>  #include <asm/processor.h>
>
>  enum cc_vendor cc_vendor __ro_after_init = CC_VENDOR_NONE;
> -static u64 cc_mask __ro_after_init;
> +u64 cc_mask __ro_after_init;
>
>  static bool noinstr intel_cc_platform_has(enum cc_attr attr)
>  {
> @@ -148,8 +148,3 @@ u64 cc_mkdec(u64 val)
>         }
>  }
>  EXPORT_SYMBOL_GPL(cc_mkdec);
> -
> -__init void cc_set_mask(u64 mask)
> -{
> -       cc_mask = mask;
> -}
> diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h
> index 6ae2d16a7613..ecc29d6136ad 100644
> --- a/arch/x86/include/asm/coco.h
> +++ b/arch/x86/include/asm/coco.h
> @@ -13,7 +13,13 @@ enum cc_vendor {
>  extern enum cc_vendor cc_vendor;
>
>  #ifdef CONFIG_ARCH_HAS_CC_PLATFORM
> -void cc_set_mask(u64 mask);
> +static inline void cc_set_mask(u64 mask)

In the inline functions I changed/added to core.c in [0], I saw an
objtool warning on clang builds when using inline instead of
__always_inline; I did not see the same warning for gcc. Should we
similarly use __always_inline to strictly enforce that here?

[0] https://lore.kernel.org/lkml/20240130220845.1978329-2-kevinloughlin@google.com/#Z31arch:x86:coco:core.c
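
Concretely, the stricter variant would be (sketch only, for
illustration):

  static __always_inline void cc_set_mask(u64 mask)
  {
          extern u64 cc_mask;

          cc_mask = mask;
  }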

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 14/19] x86/coco: Make cc_set_mask() static inline
  2024-01-30 23:16   ` Kevin Loughlin
@ 2024-01-30 23:36     ` Ard Biesheuvel
  0 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-30 23:36 UTC (permalink / raw)
  To: Kevin Loughlin
  Cc: Ard Biesheuvel, linux-kernel, Tom Lendacky, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Wed, 31 Jan 2024 at 00:16, Kevin Loughlin <kevinloughlin@google.com> wrote:
>
> On Mon, Jan 29, 2024 at 10:06 AM Ard Biesheuvel <ardb+git@google.com> wrote:
> >
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Setting the cc_mask global variable may be done early in the boot while
> > running from a 1:1 translation. This code is built with -fPIC in order
> > to support this.
> >
> > Make cc_set_mask() static inline so it can execute safely in this
> > context as well.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  arch/x86/coco/core.c        | 7 +------
> >  arch/x86/include/asm/coco.h | 8 +++++++-
> >  2 files changed, 8 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
> > index eeec9986570e..d07be9d05cd0 100644
> > --- a/arch/x86/coco/core.c
> > +++ b/arch/x86/coco/core.c
> > @@ -14,7 +14,7 @@
> >  #include <asm/processor.h>
> >
> >  enum cc_vendor cc_vendor __ro_after_init = CC_VENDOR_NONE;
> > -static u64 cc_mask __ro_after_init;
> > +u64 cc_mask __ro_after_init;
> >
> >  static bool noinstr intel_cc_platform_has(enum cc_attr attr)
> >  {
> > @@ -148,8 +148,3 @@ u64 cc_mkdec(u64 val)
> >         }
> >  }
> >  EXPORT_SYMBOL_GPL(cc_mkdec);
> > -
> > -__init void cc_set_mask(u64 mask)
> > -{
> > -       cc_mask = mask;
> > -}
> > diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h
> > index 6ae2d16a7613..ecc29d6136ad 100644
> > --- a/arch/x86/include/asm/coco.h
> > +++ b/arch/x86/include/asm/coco.h
> > @@ -13,7 +13,13 @@ enum cc_vendor {
> >  extern enum cc_vendor cc_vendor;
> >
> >  #ifdef CONFIG_ARCH_HAS_CC_PLATFORM
> > -void cc_set_mask(u64 mask);
> > +static inline void cc_set_mask(u64 mask)
>
> In the inline functions I changed/added to core.c in [0], I saw an
> objtool warning on clang builds when using inline instead of
> __always_inline; I did not see the same warning for gcc. Should we
> similarly use __always_inline here to strictly enforce inlining?
>
> [0] https://lore.kernel.org/lkml/20240130220845.1978329-2-kevinloughlin@google.com/#Z31arch:x86:coco:core.c

This assembles to a single instruction

movq %rsi, cc_mask(%rip)

and the definition is in a header file, so I'm not convinced it makes
a difference.

And looking at your series, I think there is no need to modify coco.c
at all if you just take this patch instead: the other code in that
file should not be called early at all (unless our downstream has
substantial changes there).

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 01/19] efi/libstub: Add generic support for parsing mem_encrypt=
  2024-01-29 18:05 ` [PATCH v3 01/19] efi/libstub: Add generic support for parsing mem_encrypt= Ard Biesheuvel
@ 2024-01-31  7:31   ` Borislav Petkov
  2024-02-01 16:23     ` Kevin Loughlin
  0 siblings, 1 reply; 52+ messages in thread
From: Borislav Petkov @ 2024-01-31  7:31 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, Ard Biesheuvel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, Jan 29, 2024 at 07:05:04PM +0100, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> Parse the mem_encrypt= command line parameter from the EFI stub if
> CONFIG_ARCH_HAS_MEM_ENCRYPT=y, so that it can be passed to the early
> boot code by the arch code in the stub.

I guess all systems which do memory encryption are EFI systems anyway so
we should not worry about the old ones...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 02/19] x86/boot: Move mem_encrypt= parsing to the decompressor
  2024-01-29 18:05 ` [PATCH v3 02/19] x86/boot: Move mem_encrypt= parsing to the decompressor Ard Biesheuvel
@ 2024-01-31  8:35   ` Borislav Petkov
  2024-01-31  9:12     ` Ard Biesheuvel
  0 siblings, 1 reply; 52+ messages in thread
From: Borislav Petkov @ 2024-01-31  8:35 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, Ard Biesheuvel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, Jan 29, 2024 at 07:05:05PM +0100, Ard Biesheuvel wrote:
> +/*
> + * Set the memory encryption xloadflag based on the mem_encrypt= command line
> + * parameter, if provided. If not, the consumer of the flag decides what the
> + * default behavior should be.
> + */
> +static void set_mem_encrypt_flag(struct setup_header *hdr)

parse_mem_encrypt

> +{
> +	hdr->xloadflags &= ~(XLF_MEM_ENCRYPTION | XLF_MEM_ENCRYPTION_ENABLED);
> +
> +	if (IS_ENABLED(CONFIG_ARCH_HAS_MEM_ENCRYPT)) {

That's unconditionally enabled on x86:

	select ARCH_HAS_MEM_ENCRYPT

in x86/Kconfig.

Which sounds like you need a single XLF_MEM_ENCRYPT and simplify this
more.

> +		int on = cmdline_find_option_bool("mem_encrypt=on");
> +		int off = cmdline_find_option_bool("mem_encrypt=off");
> +
> +		if (on || off)
> +			hdr->xloadflags |= XLF_MEM_ENCRYPTION;
> +		if (on > off)
> +			hdr->xloadflags |= XLF_MEM_ENCRYPTION_ENABLED;
> +	}
> +}

Otherwise, I like the simplification.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 02/19] x86/boot: Move mem_encrypt= parsing to the decompressor
  2024-01-31  8:35   ` Borislav Petkov
@ 2024-01-31  9:12     ` Ard Biesheuvel
  2024-01-31  9:29       ` Borislav Petkov
  0 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-31  9:12 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Wed, Jan 31, 2024 at 9:35 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Jan 29, 2024 at 07:05:05PM +0100, Ard Biesheuvel wrote:
> > +/*
> > + * Set the memory encryption xloadflag based on the mem_encrypt= command line
> > + * parameter, if provided. If not, the consumer of the flag decides what the
> > + * default behavior should be.
> > + */
> > +static void set_mem_encrypt_flag(struct setup_header *hdr)
>
> parse_mem_encrypt
>

OK

> > +{
> > +     hdr->xloadflags &= ~(XLF_MEM_ENCRYPTION | XLF_MEM_ENCRYPTION_ENABLED);
> > +
> > +     if (IS_ENABLED(CONFIG_ARCH_HAS_MEM_ENCRYPT)) {
>
> That's unconditionally enabled on x86:
>
>         select ARCH_HAS_MEM_ENCRYPT
>
> in x86/Kconfig.
>
> Which sounds like you need a single XLF_MEM_ENCRYPT and simplify this
> more.
>

OK, but that only means I can drop the if().

The reason we need two flags is because there is no default value to
use when the command line param is absent.

There is CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT but that one is AMD
specific. There is CONFIG_X86_MEM_ENCRYPT which is shared between
SME/SEV and TDX, which has no default setting.
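
To spell out the tri-state the two flags encode, a consumer would
decode them roughly like this (sketch only; the enum and helper name
are made up for illustration, the XLF_* names are the ones from the
patch):

enum me_cmdline { ME_UNSET, ME_OFF, ME_ON };

static enum me_cmdline mem_encrypt_cmdline(const struct setup_header *hdr)
{
        if (!(hdr->xloadflags & XLF_MEM_ENCRYPTION))
                return ME_UNSET;  /* no mem_encrypt= given: consumer picks the default */

        return (hdr->xloadflags & XLF_MEM_ENCRYPTION_ENABLED) ? ME_ON : ME_OFF;
}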

> > +             int on = cmdline_find_option_bool("mem_encrypt=on");
> > +             int off = cmdline_find_option_bool("mem_encrypt=off");
> > +
> > +             if (on || off)
> > +                     hdr->xloadflags |= XLF_MEM_ENCRYPTION;
> > +             if (on > off)
> > +                     hdr->xloadflags |= XLF_MEM_ENCRYPTION_ENABLED;
> > +     }
> > +}
>
> Otherwise, I like the simplification.
>

Cheers.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 02/19] x86/boot: Move mem_encrypt= parsing to the decompressor
  2024-01-31  9:12     ` Ard Biesheuvel
@ 2024-01-31  9:29       ` Borislav Petkov
  2024-01-31  9:59         ` Ard Biesheuvel
  2024-02-01 14:17         ` Tom Lendacky
  0 siblings, 2 replies; 52+ messages in thread
From: Borislav Petkov @ 2024-01-31  9:29 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Wed, Jan 31, 2024 at 10:12:13AM +0100, Ard Biesheuvel wrote:
> The reason we need two flags is because there is no default value to
> use when the command line param is absent.

I think absent means memory encryption disabled like with every other
option which is not present...

> There is CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT but that one is AMD

... yes, and I'm thinking that it is time we kill this. I don't think
anything uses it. It was meant well at the time.

Let's wait for Tom to wake up first, though, as he might have some
objections...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 02/19] x86/boot: Move mem_encrypt= parsing to the decompressor
  2024-01-31  9:29       ` Borislav Petkov
@ 2024-01-31  9:59         ` Ard Biesheuvel
  2024-02-01 14:17         ` Tom Lendacky
  1 sibling, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-31  9:59 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Wed, 31 Jan 2024 at 10:30, Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Jan 31, 2024 at 10:12:13AM +0100, Ard Biesheuvel wrote:
> > The reason we need two flags is because there is no default value to
> > use when the command line param is absent.
>
> I think absent means memory encryption disabled like with every other
> option which is not present...
>
> > There is CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT but that one is AMD
>
> ... yes, and I'm thinking that it is time we kill this. I don't think
> anything uses it. It was meant well at the time.
>
> Let's wait for Tom to wake up first, though, as he might have some
> objections...
>

OK, yeah, that would help.

AIUI this is for SME only anyway - SEV ignores this, and I suppose TDX
will do the same.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer
  2024-01-29 18:05 ` [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer Ard Biesheuvel
@ 2024-01-31 13:44   ` Borislav Petkov
  2024-01-31 13:57     ` Ard Biesheuvel
  2024-01-31 18:14   ` [tip: x86/boot] " tip-bot2 for Ard Biesheuvel
  1 sibling, 1 reply; 52+ messages in thread
From: Borislav Petkov @ 2024-01-31 13:44 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, Ard Biesheuvel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, Jan 29, 2024 at 07:05:06PM +0100, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> Since commit 866b556efa12 ("x86/head/64: Install startup GDT"), the
> primary startup sequence sets the code segment register (CS) to __KERNEL_CS
> before calling into the startup code shared between primary and
> secondary boot.
> 
> This means a simple indirect call is sufficient here.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/x86/kernel/head_64.S | 35 ++------------------
>  1 file changed, 3 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index d4918d03efb4..4017a49d7b76 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -428,39 +428,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
>  	movq	%r15, %rdi
>  
>  .Ljump_to_C_code:
> -	/*
> -	 * Jump to run C code and to be on a real kernel address.
> -	 * Since we are running on identity-mapped space we have to jump
> -	 * to the full 64bit address, this is only possible as indirect
> -	 * jump.  In addition we need to ensure %cs is set so we make this
> -	 * a far return.
> -	 *
> -	 * Note: do not change to far jump indirect with 64bit offset.
> -	 *
> -	 * AMD does not support far jump indirect with 64bit offset.
> -	 * AMD64 Architecture Programmer's Manual, Volume 3: states only
> -	 *	JMP FAR mem16:16 FF /5 Far jump indirect,
> -	 *		with the target specified by a far pointer in memory.
> -	 *	JMP FAR mem16:32 FF /5 Far jump indirect,
> -	 *		with the target specified by a far pointer in memory.
> -	 *
> -	 * Intel64 does support 64bit offset.
> -	 * Software Developer Manual Vol 2: states:
> -	 *	FF /5 JMP m16:16 Jump far, absolute indirect,
> -	 *		address given in m16:16
> -	 *	FF /5 JMP m16:32 Jump far, absolute indirect,
> -	 *		address given in m16:32.
> -	 *	REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
> -	 *		address given in m16:64.
> -	 */
> -	pushq	$.Lafter_lret	# put return address on stack for unwinder
>  	xorl	%ebp, %ebp	# clear frame pointer
> -	movq	initial_code(%rip), %rax
> -	pushq	$__KERNEL_CS	# set correct cs
> -	pushq	%rax		# target address in negative space
> -	lretq
> -.Lafter_lret:
> -	ANNOTATE_NOENDBR
> +	ANNOTATE_RETPOLINE_SAFE
> +	callq	*initial_code(%rip)
> +	int3
>  SYM_CODE_END(secondary_startup_64)
>  
>  #include "verify_cpu.S"

objtool doesn't like it yet:

vmlinux.o: warning: objtool: verify_cpu+0x0: stack state mismatch: cfa1=4+8 cfa2=-1+0

Once we've solved this, I'll take this one even now - very nice cleanup!

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer
  2024-01-31 13:44   ` Borislav Petkov
@ 2024-01-31 13:57     ` Ard Biesheuvel
  2024-01-31 14:07       ` Ard Biesheuvel
  0 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-31 13:57 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Wed, 31 Jan 2024 at 14:45, Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Jan 29, 2024 at 07:05:06PM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Since commit 866b556efa12 ("x86/head/64: Install startup GDT"), the
> > primary startup sequence sets the code segment register (CS) to __KERNEL_CS
> > before calling into the startup code shared between primary and
> > secondary boot.
> >
> > This means a simple indirect call is sufficient here.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  arch/x86/kernel/head_64.S | 35 ++------------------
> >  1 file changed, 3 insertions(+), 32 deletions(-)
> >
> > diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> > index d4918d03efb4..4017a49d7b76 100644
> > --- a/arch/x86/kernel/head_64.S
> > +++ b/arch/x86/kernel/head_64.S
> > @@ -428,39 +428,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
> >       movq    %r15, %rdi
> >
> >  .Ljump_to_C_code:
> > -     /*
> > -      * Jump to run C code and to be on a real kernel address.
> > -      * Since we are running on identity-mapped space we have to jump
> > -      * to the full 64bit address, this is only possible as indirect
> > -      * jump.  In addition we need to ensure %cs is set so we make this
> > -      * a far return.
> > -      *
> > -      * Note: do not change to far jump indirect with 64bit offset.
> > -      *
> > -      * AMD does not support far jump indirect with 64bit offset.
> > -      * AMD64 Architecture Programmer's Manual, Volume 3: states only
> > -      *      JMP FAR mem16:16 FF /5 Far jump indirect,
> > -      *              with the target specified by a far pointer in memory.
> > -      *      JMP FAR mem16:32 FF /5 Far jump indirect,
> > -      *              with the target specified by a far pointer in memory.
> > -      *
> > -      * Intel64 does support 64bit offset.
> > -      * Software Developer Manual Vol 2: states:
> > -      *      FF /5 JMP m16:16 Jump far, absolute indirect,
> > -      *              address given in m16:16
> > -      *      FF /5 JMP m16:32 Jump far, absolute indirect,
> > -      *              address given in m16:32.
> > -      *      REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
> > -      *              address given in m16:64.
> > -      */
> > -     pushq   $.Lafter_lret   # put return address on stack for unwinder
> >       xorl    %ebp, %ebp      # clear frame pointer
> > -     movq    initial_code(%rip), %rax
> > -     pushq   $__KERNEL_CS    # set correct cs
> > -     pushq   %rax            # target address in negative space
> > -     lretq
> > -.Lafter_lret:
> > -     ANNOTATE_NOENDBR
> > +     ANNOTATE_RETPOLINE_SAFE
> > +     callq   *initial_code(%rip)
> > +     int3
> >  SYM_CODE_END(secondary_startup_64)
> >
> >  #include "verify_cpu.S"
>
> objtool doesn't like it yet:
>
> vmlinux.o: warning: objtool: verify_cpu+0x0: stack state mismatch: cfa1=4+8 cfa2=-1+0
>
> Once we've solved this, I'll take this one even now - very nice cleanup!
>

s/int3/RET seems to do the trick.

As long as there is an instruction that follows the callq, the
unwinder will see secondary_startup_64 at the base of the call stack.
We never return here anyway.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer
  2024-01-31 13:57     ` Ard Biesheuvel
@ 2024-01-31 14:07       ` Ard Biesheuvel
  2024-01-31 16:29         ` Borislav Petkov
  0 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-01-31 14:07 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Wed, 31 Jan 2024 at 14:57, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Wed, 31 Jan 2024 at 14:45, Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Mon, Jan 29, 2024 at 07:05:06PM +0100, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > Since commit 866b556efa12 ("x86/head/64: Install startup GDT"), the
> > > primary startup sequence sets the code segment register (CS) to __KERNEL_CS
> > > before calling into the startup code shared between primary and
> > > secondary boot.
> > >
> > > This means a simple indirect call is sufficient here.
> > >
> > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > > ---
> > >  arch/x86/kernel/head_64.S | 35 ++------------------
> > >  1 file changed, 3 insertions(+), 32 deletions(-)
> > >
> > > diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> > > index d4918d03efb4..4017a49d7b76 100644
> > > --- a/arch/x86/kernel/head_64.S
> > > +++ b/arch/x86/kernel/head_64.S
> > > @@ -428,39 +428,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
> > >       movq    %r15, %rdi
> > >
> > >  .Ljump_to_C_code:
> > > -     /*
> > > -      * Jump to run C code and to be on a real kernel address.
> > > -      * Since we are running on identity-mapped space we have to jump
> > > -      * to the full 64bit address, this is only possible as indirect
> > > -      * jump.  In addition we need to ensure %cs is set so we make this
> > > -      * a far return.
> > > -      *
> > > -      * Note: do not change to far jump indirect with 64bit offset.
> > > -      *
> > > -      * AMD does not support far jump indirect with 64bit offset.
> > > -      * AMD64 Architecture Programmer's Manual, Volume 3: states only
> > > -      *      JMP FAR mem16:16 FF /5 Far jump indirect,
> > > -      *              with the target specified by a far pointer in memory.
> > > -      *      JMP FAR mem16:32 FF /5 Far jump indirect,
> > > -      *              with the target specified by a far pointer in memory.
> > > -      *
> > > -      * Intel64 does support 64bit offset.
> > > -      * Software Developer Manual Vol 2: states:
> > > -      *      FF /5 JMP m16:16 Jump far, absolute indirect,
> > > -      *              address given in m16:16
> > > -      *      FF /5 JMP m16:32 Jump far, absolute indirect,
> > > -      *              address given in m16:32.
> > > -      *      REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
> > > -      *              address given in m16:64.
> > > -      */
> > > -     pushq   $.Lafter_lret   # put return address on stack for unwinder
> > >       xorl    %ebp, %ebp      # clear frame pointer
> > > -     movq    initial_code(%rip), %rax
> > > -     pushq   $__KERNEL_CS    # set correct cs
> > > -     pushq   %rax            # target address in negative space
> > > -     lretq
> > > -.Lafter_lret:
> > > -     ANNOTATE_NOENDBR
> > > +     ANNOTATE_RETPOLINE_SAFE
> > > +     callq   *initial_code(%rip)
> > > +     int3
> > >  SYM_CODE_END(secondary_startup_64)
> > >
> > >  #include "verify_cpu.S"
> >
> > objtool doesn't like it yet:
> >
> > vmlinux.o: warning: objtool: verify_cpu+0x0: stack state mismatch: cfa1=4+8 cfa2=-1+0
> >
> > Once we've solved this, I'll take this one even now - very nice cleanup!
> >
>
> s/int3/RET seems to do the trick.
>

Or ud2, even better.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer
  2024-01-31 14:07       ` Ard Biesheuvel
@ 2024-01-31 16:29         ` Borislav Petkov
  0 siblings, 0 replies; 52+ messages in thread
From: Borislav Petkov @ 2024-01-31 16:29 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Wed, Jan 31, 2024 at 03:07:50PM +0100, Ard Biesheuvel wrote:
> > s/int3/RET seems to do the trick.
> >
> or ud2, even better,

Yap, that does it. And yes, we don't return here. I guess objtool
complains because

"7. file: warning: objtool: func()+0x5c: stack state mismatch

   The instruction's frame pointer state is inconsistent, depending on
   which execution path was taken to reach the instruction.

   ...

   Another possibility is that the code has some asm or inline asm which
   does some unusual things to the stack or the frame pointer.  In such
   cases it's probably appropriate to use the unwind hint macros in
   asm/unwind_hints.h.
"

Lemme test this one a bit on my machines and queue it.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [tip: x86/boot] x86/startup_64: Drop long return to initial_code pointer
  2024-01-29 18:05 ` [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer Ard Biesheuvel
  2024-01-31 13:44   ` Borislav Petkov
@ 2024-01-31 18:14   ` tip-bot2 for Ard Biesheuvel
  1 sibling, 0 replies; 52+ messages in thread
From: tip-bot2 for Ard Biesheuvel @ 2024-01-31 18:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Ard Biesheuvel, Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/boot branch of tip:

Commit-ID:     15675706241887ed7fdad9e91f4bf977b9896d0f
Gitweb:        https://git.kernel.org/tip/15675706241887ed7fdad9e91f4bf977b9896d0f
Author:        Ard Biesheuvel <ardb@kernel.org>
AuthorDate:    Mon, 29 Jan 2024 19:05:06 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 31 Jan 2024 18:31:21 +01:00

x86/startup_64: Drop long return to initial_code pointer

Since

  866b556efa12 ("x86/head/64: Install startup GDT")

the primary startup sequence sets the code segment register (CS) to
__KERNEL_CS before calling into the startup code shared between primary
and secondary boot.

This means a simple indirect call is sufficient here.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240129180502.4069817-24-ardb+git@google.com
---
 arch/x86/kernel/head_64.S | 35 +++--------------------------------
 1 file changed, 3 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index d4918d0..bfbac50 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -428,39 +428,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	movq	%r15, %rdi
 
 .Ljump_to_C_code:
-	/*
-	 * Jump to run C code and to be on a real kernel address.
-	 * Since we are running on identity-mapped space we have to jump
-	 * to the full 64bit address, this is only possible as indirect
-	 * jump.  In addition we need to ensure %cs is set so we make this
-	 * a far return.
-	 *
-	 * Note: do not change to far jump indirect with 64bit offset.
-	 *
-	 * AMD does not support far jump indirect with 64bit offset.
-	 * AMD64 Architecture Programmer's Manual, Volume 3: states only
-	 *	JMP FAR mem16:16 FF /5 Far jump indirect,
-	 *		with the target specified by a far pointer in memory.
-	 *	JMP FAR mem16:32 FF /5 Far jump indirect,
-	 *		with the target specified by a far pointer in memory.
-	 *
-	 * Intel64 does support 64bit offset.
-	 * Software Developer Manual Vol 2: states:
-	 *	FF /5 JMP m16:16 Jump far, absolute indirect,
-	 *		address given in m16:16
-	 *	FF /5 JMP m16:32 Jump far, absolute indirect,
-	 *		address given in m16:32.
-	 *	REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
-	 *		address given in m16:64.
-	 */
-	pushq	$.Lafter_lret	# put return address on stack for unwinder
 	xorl	%ebp, %ebp	# clear frame pointer
-	movq	initial_code(%rip), %rax
-	pushq	$__KERNEL_CS	# set correct cs
-	pushq	%rax		# target address in negative space
-	lretq
-.Lafter_lret:
-	ANNOTATE_NOENDBR
+	ANNOTATE_RETPOLINE_SAFE
+	callq	*initial_code(%rip)
+	ud2
 SYM_CODE_END(secondary_startup_64)
 
 #include "verify_cpu.S"

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 02/19] x86/boot: Move mem_encrypt= parsing to the decompressor
  2024-01-31  9:29       ` Borislav Petkov
  2024-01-31  9:59         ` Ard Biesheuvel
@ 2024-02-01 14:17         ` Tom Lendacky
  2024-02-01 16:15           ` Ard Biesheuvel
  1 sibling, 1 reply; 52+ messages in thread
From: Tom Lendacky @ 2024-02-01 14:17 UTC (permalink / raw)
  To: Borislav Petkov, Ard Biesheuvel
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Dionna Glaze,
	Thomas Gleixner, Ingo Molnar, Dave Hansen, Andy Lutomirski,
	Arnd Bergmann, Nathan Chancellor, Nick Desaulniers, Justin Stitt,
	Kees Cook, Brian Gerst, linux-arch, llvm

On 1/31/24 03:29, Borislav Petkov wrote:
> On Wed, Jan 31, 2024 at 10:12:13AM +0100, Ard Biesheuvel wrote:
>> The reason we need two flags is because there is no default value to
>> use when the command line param is absent.
> 
> I think absent means memory encryption disabled like with every other
> option which is not present...
> 
>> There is CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT but that one is AMD
> 
> ... yes, and I'm thinking that it is time we kill this. I don't think
> anything uses it. It was meant well at the time.
> 
> Let's wait for Tom to wake up first, though, as he might have some
> objections...

I don't know if anyone is using the AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT 
config option, but I don't have an issue removing it.

Thanks,
Tom

> 
> Thx.
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 02/19] x86/boot: Move mem_encrypt= parsing to the decompressor
  2024-02-01 14:17         ` Tom Lendacky
@ 2024-02-01 16:15           ` Ard Biesheuvel
  2024-02-02 16:35             ` [PATCH] x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT Borislav Petkov
  0 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-02-01 16:15 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Borislav Petkov, Ard Biesheuvel, linux-kernel, Kevin Loughlin,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Thu, 1 Feb 2024 at 15:17, Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 1/31/24 03:29, Borislav Petkov wrote:
> > On Wed, Jan 31, 2024 at 10:12:13AM +0100, Ard Biesheuvel wrote:
> >> The reason we need two flags is because there is no default value to
> >> use when the command line param is absent.
> >
> > I think absent means memory encryption disabled like with every other
> > option which is not present...
> >
> >> There is CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT but that one is AMD
> >
> > ... yes, and I'm thinking that it is time we kill this. I don't think
> > anything uses it. It was meant well at the time.
> >
> > Let's wait for Tom to wake up first, though, as he might have some
> > objections...
>
> I don't know if anyone is using the AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
> config option, but I don't have an issue removing it.
>

OK, I'll remove it in the next rev.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 01/19] efi/libstub: Add generic support for parsing mem_encrypt=
  2024-01-31  7:31   ` Borislav Petkov
@ 2024-02-01 16:23     ` Kevin Loughlin
  2024-02-01 16:28       ` Ard Biesheuvel
  0 siblings, 1 reply; 52+ messages in thread
From: Kevin Loughlin @ 2024-02-01 16:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ard Biesheuvel, linux-kernel, Ard Biesheuvel, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm, Conrad Grobler, Andri Saar, Sidharth Telang

On Tue, Jan 30, 2024 at 11:32 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Jan 29, 2024 at 07:05:04PM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Parse the mem_encrypt= command line parameter from the EFI stub if
> > CONFIG_ARCH_HAS_MEM_ENCRYPT=y, so that it can be passed to the early
> > boot code by the arch code in the stub.
>
> I guess all systems which do memory encryption are EFI systems anyway so
> we should not worry about the old ones...

There is at least one non-EFI firmware supporting memory encryption:
Oak stage0 firmware [0]. However, Ard's patch seems simple enough to
adopt in non-EFI firmware(s) if needed. I merely wanted to point out
the existence of non-EFI memory encryption systems for potential
future cases (e.g. when reviewing more complex patches at the
firmware interface).

[0] https://github.com/project-oak/oak/tree/main/stage0_bin

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 01/19] efi/libstub: Add generic support for parsing mem_encrypt=
  2024-02-01 16:23     ` Kevin Loughlin
@ 2024-02-01 16:28       ` Ard Biesheuvel
  0 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-02-01 16:28 UTC (permalink / raw)
  To: Kevin Loughlin
  Cc: Borislav Petkov, Ard Biesheuvel, linux-kernel, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm, Conrad Grobler, Andri Saar, Sidharth Telang

On Thu, 1 Feb 2024 at 17:23, Kevin Loughlin <kevinloughlin@google.com> wrote:
>
> On Tue, Jan 30, 2024 at 11:32 PM Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Mon, Jan 29, 2024 at 07:05:04PM +0100, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > Parse the mem_encrypt= command line parameter from the EFI stub if
> > > CONFIG_ARCH_HAS_MEM_ENCRYPT=y, so that it can be passed to the early
> > > boot code by the arch code in the stub.
> >
> > I guess all systems which do memory encryption are EFI systems anyway so
> > we should not worry about the old ones...
>
> There is at least one non-EFI firmware supporting memory encryption:
> Oak stage0 firmware [0]. However, I think Ard's patch seems simple
> enough to adopt in non-EFI firmware(s) if needed. I merely wanted to
> point out the existence of non-EFI memory encryption systems for
> potential future cases (ex: reviewing more complex patches at the
> firmware interface).
>
> [0] https://github.com/project-oak/oak/tree/main/stage0_bin
>

The second patch in this series actually implements the mem_encrypt=
parsing for both EFI and non-EFI boot. I just broke this out into a
separate patch because it affects architectures other than x86.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH] x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
  2024-02-01 16:15           ` Ard Biesheuvel
@ 2024-02-02 16:35             ` Borislav Petkov
  2024-02-02 16:47               ` Ard Biesheuvel
  2024-02-03 10:50               ` [tip: x86/sev] " tip-bot2 for Borislav Petkov (AMD)
  0 siblings, 2 replies; 52+ messages in thread
From: Borislav Petkov @ 2024-02-02 16:35 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Tom Lendacky, Ard Biesheuvel, linux-kernel, Kevin Loughlin,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Thu, Feb 01, 2024 at 05:15:51PM +0100, Ard Biesheuvel wrote:
> OK, I'll remove it in the next rev.

Considering how it simplifies sme_enable() even more, I'd like to
expedite this one.

Thx.

---
From: "Borislav Petkov (AMD)" <bp@alien8.de>
Date: Fri, 2 Feb 2024 17:29:32 +0100
Subject: [PATCH] x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT

It was meant well at the time but nothing's using it so get rid of it.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
 Documentation/admin-guide/kernel-parameters.txt  |  4 +---
 Documentation/arch/x86/amd-memory-encryption.rst | 16 ++++++++--------
 arch/x86/Kconfig                                 | 13 -------------
 arch/x86/mm/mem_encrypt_identity.c               | 11 +----------
 4 files changed, 10 insertions(+), 34 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 31b3a25680d0..2cb70a384af8 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3320,9 +3320,7 @@
 
 	mem_encrypt=	[X86-64] AMD Secure Memory Encryption (SME) control
 			Valid arguments: on, off
-			Default (depends on kernel configuration option):
-			  on  (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y)
-			  off (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=n)
+			Default: off
 			mem_encrypt=on:		Activate SME
 			mem_encrypt=off:	Do not activate SME
 
diff --git a/Documentation/arch/x86/amd-memory-encryption.rst b/Documentation/arch/x86/amd-memory-encryption.rst
index 07caa8fff852..414bc7402ae7 100644
--- a/Documentation/arch/x86/amd-memory-encryption.rst
+++ b/Documentation/arch/x86/amd-memory-encryption.rst
@@ -87,14 +87,14 @@ The state of SME in the Linux kernel can be documented as follows:
 	  kernel is non-zero).
 
 SME can also be enabled and activated in the BIOS. If SME is enabled and
-activated in the BIOS, then all memory accesses will be encrypted and it will
-not be necessary to activate the Linux memory encryption support.  If the BIOS
-merely enables SME (sets bit 23 of the MSR_AMD64_SYSCFG), then Linux can activate
-memory encryption by default (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
-by supplying mem_encrypt=on on the kernel command line.  However, if BIOS does
-not enable SME, then Linux will not be able to activate memory encryption, even
-if configured to do so by default or the mem_encrypt=on command line parameter
-is specified.
+activated in the BIOS, then all memory accesses will be encrypted and it
+will not be necessary to activate the Linux memory encryption support.
+
+If the BIOS merely enables SME (sets bit 23 of the MSR_AMD64_SYSCFG),
+then memory encryption can be enabled by supplying mem_encrypt=on on the
+kernel command line.  However, if BIOS does not enable SME, then Linux
+will not be able to activate memory encryption, even if configured to do
+so by default or the mem_encrypt=on command line parameter is specified.
 
 Secure Nested Paging (SNP)
 ==========================
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5edec175b9bf..58d3593bc4f2 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1539,19 +1539,6 @@ config AMD_MEM_ENCRYPT
 	  This requires an AMD processor that supports Secure Memory
 	  Encryption (SME).
 
-config AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
-	bool "Activate AMD Secure Memory Encryption (SME) by default"
-	depends on AMD_MEM_ENCRYPT
-	help
-	  Say yes to have system memory encrypted by default if running on
-	  an AMD processor that supports Secure Memory Encryption (SME).
-
-	  If set to Y, then the encryption of system memory can be
-	  deactivated with the mem_encrypt=off command line option.
-
-	  If set to N, then the encryption of system memory can be
-	  activated with the mem_encrypt=on command line option.
-
 # Common NUMA Features
 config NUMA
 	bool "NUMA Memory Allocation and Scheduler Support"
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 7f72472a34d6..efe9f217fcf9 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -97,7 +97,6 @@ static char sme_workarea[2 * PMD_SIZE] __section(".init.scratch");
 
 static char sme_cmdline_arg[] __initdata = "mem_encrypt";
 static char sme_cmdline_on[]  __initdata = "on";
-static char sme_cmdline_off[] __initdata = "off";
 
 static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd)
 {
@@ -504,7 +503,7 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
 
 void __init sme_enable(struct boot_params *bp)
 {
-	const char *cmdline_ptr, *cmdline_arg, *cmdline_on, *cmdline_off;
+	const char *cmdline_ptr, *cmdline_arg, *cmdline_on;
 	unsigned int eax, ebx, ecx, edx;
 	unsigned long feature_mask;
 	unsigned long me_mask;
@@ -587,12 +586,6 @@ void __init sme_enable(struct boot_params *bp)
 	asm ("lea sme_cmdline_on(%%rip), %0"
 	     : "=r" (cmdline_on)
 	     : "p" (sme_cmdline_on));
-	asm ("lea sme_cmdline_off(%%rip), %0"
-	     : "=r" (cmdline_off)
-	     : "p" (sme_cmdline_off));
-
-	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT))
-		sme_me_mask = me_mask;
 
 	cmdline_ptr = (const char *)((u64)bp->hdr.cmd_line_ptr |
 				     ((u64)bp->ext_cmd_line_ptr << 32));
@@ -602,8 +595,6 @@ void __init sme_enable(struct boot_params *bp)
 
 	if (!strncmp(buffer, cmdline_on, sizeof(buffer)))
 		sme_me_mask = me_mask;
-	else if (!strncmp(buffer, cmdline_off, sizeof(buffer)))
-		sme_me_mask = 0;
 
 out:
 	if (sme_me_mask) {
-- 
2.43.0


-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH] x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
  2024-02-02 16:35             ` [PATCH] x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT Borislav Petkov
@ 2024-02-02 16:47               ` Ard Biesheuvel
  2024-02-03 10:50               ` [tip: x86/sev] " tip-bot2 for Borislav Petkov (AMD)
  1 sibling, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-02-02 16:47 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Tom Lendacky, Ard Biesheuvel, linux-kernel, Kevin Loughlin,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Fri, 2 Feb 2024 at 17:35, Borislav Petkov <bp@alien8.de> wrote:
>
> On Thu, Feb 01, 2024 at 05:15:51PM +0100, Ard Biesheuvel wrote:
> > OK, I'll remove it in the next rev.
>
> Considering how it simplifies sme_enable() even more, I'd like to
> expedite this one.
>
> Thx.
>
> ---
> From: "Borislav Petkov (AMD)" <bp@alien8.de>
> Date: Fri, 2 Feb 2024 17:29:32 +0100
> Subject: [PATCH] x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
>
> It was meant well at the time but nothing's using it so get rid of it.
>
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> ---
>  Documentation/admin-guide/kernel-parameters.txt  |  4 +---
>  Documentation/arch/x86/amd-memory-encryption.rst | 16 ++++++++--------
>  arch/x86/Kconfig                                 | 13 -------------
>  arch/x86/mm/mem_encrypt_identity.c               | 11 +----------
>  4 files changed, 10 insertions(+), 34 deletions(-)
>

Works for me.

Acked-by: Ard Biesheuvel <ardb@kernel.org>

> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 31b3a25680d0..2cb70a384af8 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -3320,9 +3320,7 @@
>
>         mem_encrypt=    [X86-64] AMD Secure Memory Encryption (SME) control
>                         Valid arguments: on, off
> -                       Default (depends on kernel configuration option):
> -                         on  (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y)
> -                         off (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=n)
> +                       Default: off
>                         mem_encrypt=on:         Activate SME
>                         mem_encrypt=off:        Do not activate SME
>
> diff --git a/Documentation/arch/x86/amd-memory-encryption.rst b/Documentation/arch/x86/amd-memory-encryption.rst
> index 07caa8fff852..414bc7402ae7 100644
> --- a/Documentation/arch/x86/amd-memory-encryption.rst
> +++ b/Documentation/arch/x86/amd-memory-encryption.rst
> @@ -87,14 +87,14 @@ The state of SME in the Linux kernel can be documented as follows:
>           kernel is non-zero).
>
>  SME can also be enabled and activated in the BIOS. If SME is enabled and
> -activated in the BIOS, then all memory accesses will be encrypted and it will
> -not be necessary to activate the Linux memory encryption support.  If the BIOS
> -merely enables SME (sets bit 23 of the MSR_AMD64_SYSCFG), then Linux can activate
> -memory encryption by default (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
> -by supplying mem_encrypt=on on the kernel command line.  However, if BIOS does
> -not enable SME, then Linux will not be able to activate memory encryption, even
> -if configured to do so by default or the mem_encrypt=on command line parameter
> -is specified.
> +activated in the BIOS, then all memory accesses will be encrypted and it
> +will not be necessary to activate the Linux memory encryption support.
> +
> +If the BIOS merely enables SME (sets bit 23 of the MSR_AMD64_SYSCFG),
> +then memory encryption can be enabled by supplying mem_encrypt=on on the
> +kernel command line.  However, if BIOS does not enable SME, then Linux
> +will not be able to activate memory encryption, even if configured to do
> +so by default or the mem_encrypt=on command line parameter is specified.
>
>  Secure Nested Paging (SNP)
>  ==========================
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 5edec175b9bf..58d3593bc4f2 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1539,19 +1539,6 @@ config AMD_MEM_ENCRYPT
>           This requires an AMD processor that supports Secure Memory
>           Encryption (SME).
>
> -config AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
> -       bool "Activate AMD Secure Memory Encryption (SME) by default"
> -       depends on AMD_MEM_ENCRYPT
> -       help
> -         Say yes to have system memory encrypted by default if running on
> -         an AMD processor that supports Secure Memory Encryption (SME).
> -
> -         If set to Y, then the encryption of system memory can be
> -         deactivated with the mem_encrypt=off command line option.
> -
> -         If set to N, then the encryption of system memory can be
> -         activated with the mem_encrypt=on command line option.
> -
>  # Common NUMA Features
>  config NUMA
>         bool "NUMA Memory Allocation and Scheduler Support"
> diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
> index 7f72472a34d6..efe9f217fcf9 100644
> --- a/arch/x86/mm/mem_encrypt_identity.c
> +++ b/arch/x86/mm/mem_encrypt_identity.c
> @@ -97,7 +97,6 @@ static char sme_workarea[2 * PMD_SIZE] __section(".init.scratch");
>
>  static char sme_cmdline_arg[] __initdata = "mem_encrypt";
>  static char sme_cmdline_on[]  __initdata = "on";
> -static char sme_cmdline_off[] __initdata = "off";
>
>  static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd)
>  {
> @@ -504,7 +503,7 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
>
>  void __init sme_enable(struct boot_params *bp)
>  {
> -       const char *cmdline_ptr, *cmdline_arg, *cmdline_on, *cmdline_off;
> +       const char *cmdline_ptr, *cmdline_arg, *cmdline_on;
>         unsigned int eax, ebx, ecx, edx;
>         unsigned long feature_mask;
>         unsigned long me_mask;
> @@ -587,12 +586,6 @@ void __init sme_enable(struct boot_params *bp)
>         asm ("lea sme_cmdline_on(%%rip), %0"
>              : "=r" (cmdline_on)
>              : "p" (sme_cmdline_on));
> -       asm ("lea sme_cmdline_off(%%rip), %0"
> -            : "=r" (cmdline_off)
> -            : "p" (sme_cmdline_off));
> -
> -       if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT))
> -               sme_me_mask = me_mask;
>
>         cmdline_ptr = (const char *)((u64)bp->hdr.cmd_line_ptr |
>                                      ((u64)bp->ext_cmd_line_ptr << 32));
> @@ -602,8 +595,6 @@ void __init sme_enable(struct boot_params *bp)
>
>         if (!strncmp(buffer, cmdline_on, sizeof(buffer)))
>                 sme_me_mask = me_mask;
> -       else if (!strncmp(buffer, cmdline_off, sizeof(buffer)))
> -               sme_me_mask = 0;
>
>  out:
>         if (sme_me_mask) {
> --
> 2.43.0
>
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [tip: x86/sev] x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
  2024-02-02 16:35             ` [PATCH] x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT Borislav Petkov
  2024-02-02 16:47               ` Ard Biesheuvel
@ 2024-02-03 10:50               ` tip-bot2 for Borislav Petkov (AMD)
  1 sibling, 0 replies; 52+ messages in thread
From: tip-bot2 for Borislav Petkov (AMD) @ 2024-02-03 10:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Borislav Petkov (AMD), Ard Biesheuvel, x86, linux-kernel

The following commit has been merged into the x86/sev branch of tip:

Commit-ID:     29956748339aa8757a7e2f927a8679dd08f24bb6
Gitweb:        https://git.kernel.org/tip/29956748339aa8757a7e2f927a8679dd08f24bb6
Author:        Borislav Petkov (AMD) <bp@alien8.de>
AuthorDate:    Fri, 02 Feb 2024 17:29:32 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Sat, 03 Feb 2024 11:38:17 +01:00

x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT

It was meant well at the time but nothing's using it so get rid of it.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240202163510.GDZb0Zvj8qOndvFOiZ@fat_crate.local
---
 Documentation/admin-guide/kernel-parameters.txt  |  4 +---
 Documentation/arch/x86/amd-memory-encryption.rst | 16 +++++++--------
 arch/x86/Kconfig                                 | 13 +------------
 arch/x86/mm/mem_encrypt_identity.c               | 11 +----------
 4 files changed, 10 insertions(+), 34 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 31b3a25..2cb70a3 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3320,9 +3320,7 @@
 
 	mem_encrypt=	[X86-64] AMD Secure Memory Encryption (SME) control
 			Valid arguments: on, off
-			Default (depends on kernel configuration option):
-			  on  (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y)
-			  off (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=n)
+			Default: off
 			mem_encrypt=on:		Activate SME
 			mem_encrypt=off:	Do not activate SME
 
diff --git a/Documentation/arch/x86/amd-memory-encryption.rst b/Documentation/arch/x86/amd-memory-encryption.rst
index 07caa8f..414bc74 100644
--- a/Documentation/arch/x86/amd-memory-encryption.rst
+++ b/Documentation/arch/x86/amd-memory-encryption.rst
@@ -87,14 +87,14 @@ The state of SME in the Linux kernel can be documented as follows:
 	  kernel is non-zero).
 
 SME can also be enabled and activated in the BIOS. If SME is enabled and
-activated in the BIOS, then all memory accesses will be encrypted and it will
-not be necessary to activate the Linux memory encryption support.  If the BIOS
-merely enables SME (sets bit 23 of the MSR_AMD64_SYSCFG), then Linux can activate
-memory encryption by default (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
-by supplying mem_encrypt=on on the kernel command line.  However, if BIOS does
-not enable SME, then Linux will not be able to activate memory encryption, even
-if configured to do so by default or the mem_encrypt=on command line parameter
-is specified.
+activated in the BIOS, then all memory accesses will be encrypted and it
+will not be necessary to activate the Linux memory encryption support.
+
+If the BIOS merely enables SME (sets bit 23 of the MSR_AMD64_SYSCFG),
+then memory encryption can be enabled by supplying mem_encrypt=on on the
+kernel command line.  However, if BIOS does not enable SME, then Linux
+will not be able to activate memory encryption, even if configured to do
+so by default or the mem_encrypt=on command line parameter is specified.
 
 Secure Nested Paging (SNP)
 ==========================
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5edec17..58d3593 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1539,19 +1539,6 @@ config AMD_MEM_ENCRYPT
 	  This requires an AMD processor that supports Secure Memory
 	  Encryption (SME).
 
-config AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
-	bool "Activate AMD Secure Memory Encryption (SME) by default"
-	depends on AMD_MEM_ENCRYPT
-	help
-	  Say yes to have system memory encrypted by default if running on
-	  an AMD processor that supports Secure Memory Encryption (SME).
-
-	  If set to Y, then the encryption of system memory can be
-	  deactivated with the mem_encrypt=off command line option.
-
-	  If set to N, then the encryption of system memory can be
-	  activated with the mem_encrypt=on command line option.
-
 # Common NUMA Features
 config NUMA
 	bool "NUMA Memory Allocation and Scheduler Support"
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 7f72472..efe9f21 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -97,7 +97,6 @@ static char sme_workarea[2 * PMD_SIZE] __section(".init.scratch");
 
 static char sme_cmdline_arg[] __initdata = "mem_encrypt";
 static char sme_cmdline_on[]  __initdata = "on";
-static char sme_cmdline_off[] __initdata = "off";
 
 static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd)
 {
@@ -504,7 +503,7 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
 
 void __init sme_enable(struct boot_params *bp)
 {
-	const char *cmdline_ptr, *cmdline_arg, *cmdline_on, *cmdline_off;
+	const char *cmdline_ptr, *cmdline_arg, *cmdline_on;
 	unsigned int eax, ebx, ecx, edx;
 	unsigned long feature_mask;
 	unsigned long me_mask;
@@ -587,12 +586,6 @@ void __init sme_enable(struct boot_params *bp)
 	asm ("lea sme_cmdline_on(%%rip), %0"
 	     : "=r" (cmdline_on)
 	     : "p" (sme_cmdline_on));
-	asm ("lea sme_cmdline_off(%%rip), %0"
-	     : "=r" (cmdline_off)
-	     : "p" (sme_cmdline_off));
-
-	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT))
-		sme_me_mask = me_mask;
 
 	cmdline_ptr = (const char *)((u64)bp->hdr.cmd_line_ptr |
 				     ((u64)bp->ext_cmd_line_ptr << 32));
@@ -602,8 +595,6 @@ void __init sme_enable(struct boot_params *bp)
 
 	if (!strncmp(buffer, cmdline_on, sizeof(buffer)))
 		sme_me_mask = me_mask;
-	else if (!strncmp(buffer, cmdline_off, sizeof(buffer)))
-		sme_me_mask = 0;
 
 out:
 	if (sme_me_mask) {

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 04/19] x86/startup_64: Simplify calculation of initial page table address
  2024-01-29 18:05 ` [PATCH v3 04/19] x86/startup_64: Simplify calculation of initial page table address Ard Biesheuvel
@ 2024-02-05 10:40   ` Borislav Petkov
  0 siblings, 0 replies; 52+ messages in thread
From: Borislav Petkov @ 2024-02-05 10:40 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, Ard Biesheuvel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, Jan 29, 2024 at 07:05:07PM +0100, Ard Biesheuvel wrote:
> This is all very straight-forward, but the current code makes a mess of
> this.

That's because of a lot of histerical raisins and us not wanting to
break this. I'm single-stepping through all these changes very carefully
to make sure nothing breaks.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 05/19] x86/startup_64: Simplify CR4 handling in startup code
  2024-01-29 18:05 ` [PATCH v3 05/19] x86/startup_64: Simplify CR4 handling in startup code Ard Biesheuvel
@ 2024-02-06 18:21   ` Borislav Petkov
  2024-02-07 10:38     ` Ard Biesheuvel
  0 siblings, 1 reply; 52+ messages in thread
From: Borislav Petkov @ 2024-02-06 18:21 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, Ard Biesheuvel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, Jan 29, 2024 at 07:05:08PM +0100, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> When executing in long mode, the CR4.PAE and CR4.LA57 control bits
> cannot be updated,

"Long mode requires PAE to be enabled in order to use the 64-bit
page-translation data structures to translate 64-bit virtual addresses
to 52-bit physical addresses."

which is actually already enabled at that point:

cr4            0x20                [ PAE ]

"5-Level paging is enabled by setting CR4[LA57]=1 when EFER[LMA]=1.
CR4[LA57] is ignored when long mode is not active (EFER[LMA]=0)."

and if I had a 5-level guest, it would have LA57 already set too.

So I think you mean "When paging is enabled" as dhansen correctly points
out.

> and so they can simply be preserved rather than reason about whether
> or not they need to be set. CR4.PSE has no effect in long mode so it
> can be omitted.

f4c5ca985012 ("x86_64: Show CR4.PSE on auxiliaries like on BSP")

Please don't forget about git history before doing changes here.

> CR4.PGE is used to flush the TLBs, by clearing it if it was set, and

... to flush TLB entries with the global bit set.
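
I.e., the usual toggle, roughly (just a sketch, using the existing
native_read_cr4()/native_write_cr4() helpers):

        unsigned long cr4 = native_read_cr4();

        if (cr4 & X86_CR4_PGE) {
                native_write_cr4(cr4 & ~X86_CR4_PGE);  /* clearing PGE flushes global TLB entries */
                native_write_cr4(cr4);                 /* restore PGE */
        }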

And just like the above commit says, I think the CR4 settings across all
CPUs on the machine should be the same. So we want to keep PSE.

Removing the CONFIG_X86_5LEVEL ifdeffery is nice, OTOH.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 05/19] x86/startup_64: Simplify CR4 handling in startup code
  2024-02-06 18:21   ` Borislav Petkov
@ 2024-02-07 10:38     ` Ard Biesheuvel
  0 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-02-07 10:38 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Tue, 6 Feb 2024 at 18:21, Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Jan 29, 2024 at 07:05:08PM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > When executing in long mode, the CR4.PAE and CR4.LA57 control bits
> > cannot be updated,
>
> "Long mode requires PAE to be enabled in order to use the 64-bit
> page-translation data structures to translate 64-bit virtual addresses
> to 52-bit physical addresses."
>
> which is actually already enabled at that point:
>
> cr4            0x20                [ PAE ]
>
> "5-Level paging is enabled by setting CR4[LA57]=1 when EFER[LMA]=1.
> CR4[LA57] is ignored when long mode is not active (EFER[LMA]=0)."
>
> and if I had a 5-level guest, it would have LA57 already set too.
>
> So I think you mean "When paging is enabled" as dhansen correctly points
> out.
>

Ack.

> > and so they can simply be preserved rather than reasoning about whether
> > or not they need to be set. CR4.PSE has no effect in long mode so it
> > can be omitted.
>
> f4c5ca985012 ("x86_64: Show CR4.PSE on auxiliaries like on BSP")
>
> Please don't forget about git history before doing changes here.
>

My bad - I misunderstood what is going on here.

> > CR4.PGE is used to flush the TLBs, by clearing it if it was set, and
>
> ... to flush TLB entries with the global bit set.
>
> And just like the above commit says, I think the CR4 settings across all
> CPUs on the machine should be the same. So we want to keep PSE.
>
> Removing the CONFIG_X86_5LEVEL ifdeffery is nice, OTOH.
>

Cheers.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state
  2024-01-29 18:05 ` [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state Ard Biesheuvel
@ 2024-02-07 13:29   ` Borislav Petkov
  2024-02-09 13:55     ` Ard Biesheuvel
  0 siblings, 1 reply; 52+ messages in thread
From: Borislav Petkov @ 2024-02-07 13:29 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, Ard Biesheuvel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, Jan 29, 2024 at 07:05:09PM +0100, Ard Biesheuvel wrote:
>  static inline bool pgtable_l5_enabled(void)
>  {
>  	return __pgtable_l5_enabled;
>  }
>  #else
> -#define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)
> +#define pgtable_l5_enabled() !!(native_read_cr4() & X86_CR4_LA57)
>  #endif /* USE_EARLY_PGTABLE_L5 */

Can we drop this ifdeffery and simply have __pgtable_l5_enabled always
present and contain the correct value?

So that we don't have an expensive CR4 read hidden in
pgtable_l5_enabled()?

For the sake of simplicity, pgtable_l5_enabled() can be defined outside
of CONFIG_X86_5LEVEL and since both vendors support 5level now, might as
well start dropping the CONFIG ifdeffery slowly...
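
I.e., something along these lines, just as a sketch of the direction
(not a concrete proposal):

  /* single definition, no USE_EARLY_PGTABLE_L5 / CONFIG_X86_5LEVEL ifdeffery */
  extern unsigned int __pgtable_l5_enabled;

  static inline bool pgtable_l5_enabled(void)
  {
          return __pgtable_l5_enabled;
  }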

Other than that - a nice cleanup!

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 07/19] x86/startup_64: Simplify virtual switch on primary boot
  2024-01-29 18:05 ` [PATCH v3 07/19] x86/startup_64: Simplify virtual switch on primary boot Ard Biesheuvel
@ 2024-02-07 14:50   ` Borislav Petkov
  0 siblings, 0 replies; 52+ messages in thread
From: Borislav Petkov @ 2024-02-07 14:50 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, Ard Biesheuvel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, Jan 29, 2024 at 07:05:10PM +0100, Ard Biesheuvel wrote:
> +SYM_INNER_LABEL(common_startup_64, SYM_L_LOCAL)
> +	UNWIND_HINT_END_OF_STACK
> +	ANNOTATE_NOENDBR // above
			^^^^^^^^^^

leftover comment.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state
  2024-02-07 13:29   ` Borislav Petkov
@ 2024-02-09 13:55     ` Ard Biesheuvel
  2024-02-10 10:40       ` Borislav Petkov
  0 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-02-09 13:55 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Wed, 7 Feb 2024 at 13:29, Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Jan 29, 2024 at 07:05:09PM +0100, Ard Biesheuvel wrote:
> >  static inline bool pgtable_l5_enabled(void)
> >  {
> >       return __pgtable_l5_enabled;
> >  }
> >  #else
> > -#define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)
> > +#define pgtable_l5_enabled() !!(native_read_cr4() & X86_CR4_LA57)
> >  #endif /* USE_EARLY_PGTABLE_L5 */
>
> Can we drop this ifdeffery and simply have __pgtable_l5_enabled always
> present and contain the correct value?
>

I was trying to get rid of global variable assignments and accesses
from the 1:1 mapping, but since we cannot get rid of those entirely,
we might just keep __pgtable_l5_enabled but use RIP_REL_REF() in the
accessors, and move the assignment to the asm startup code.
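
Something like this, roughly (only a sketch, assuming
__pgtable_l5_enabled stays around as an ordinary global):

  static inline bool pgtable_l5_enabled(void)
  {
          /* RIP-relative access works from both the 1:1 and kernel mappings */
          return RIP_REL_REF(__pgtable_l5_enabled);
  }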

> So that we don't have an expensive CR4 read hidden in
> pgtable_l5_enabled()?
>

Yeah, I didn't realize it was expensive. Alternatively, we might do
something like

static __always_inline bool pgtable_l5_enabled(void)
{
   unsigned long r;
   bool ret;

   asm(ALTERNATIVE_TERNARY(
       "movq %%cr4, %[reg] \n\t btl %[la57], %k[reg]" CC_SET(c),
       %P[feat], "stc", "clc")
       : [reg] "=r" (r), CC_OUT(c) (ret)
       : [feat] "i" (X86_FEATURE_LA57),
         [la57] "i" (X86_CR4_LA57_BIT)
       : "cc");
   return ret;
}

but we'd still have two versions in that case.

> For the sake of simplicity, pgtable_l5_enabled() can be defined outside
> of CONFIG_X86_5LEVEL and since both vendors support 5level now, might as
> well start dropping the CONFIG ifdeffery slowly...
>
> Other than that - a nice cleanup!
>

Thanks.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state
  2024-02-09 13:55     ` Ard Biesheuvel
@ 2024-02-10 10:40       ` Borislav Petkov
  2024-02-11 22:36         ` Ard Biesheuvel
  0 siblings, 1 reply; 52+ messages in thread
From: Borislav Petkov @ 2024-02-10 10:40 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Fri, Feb 09, 2024 at 01:55:02PM +0000, Ard Biesheuvel wrote:
> I was trying to get rid of global variable assignments and accesses
> from the 1:1 mapping, but since we cannot get rid of those entirely,
> we might just keep __pgtable_l5_enabled but use RIP_REL_REF() in the
> accessors, and move the assignment to the asm startup code.

Yeah.

>    asm(ALTERNATIVE_TERNARY(
>        "movq %%cr4, %[reg] \n\t btl %[la57], %k[reg]" CC_SET(c),
>        %P[feat], "stc", "clc")
>        : [reg] "=r" (r), CC_OUT(c) (ret)
>        : [feat] "i" (X86_FEATURE_LA57),
>          [la57] "i" (X86_CR4_LA57_BIT)
>        : "cc");

Creative :)

> but we'd still have two versions in that case.

Yap. RIP_REL_REF() ain't too bad ...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state
  2024-02-10 10:40       ` Borislav Petkov
@ 2024-02-11 22:36         ` Ard Biesheuvel
  0 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-02-11 22:36 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Sat, 10 Feb 2024 at 11:41, Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Feb 09, 2024 at 01:55:02PM +0000, Ard Biesheuvel wrote:
> > I was trying to get rid of global variable assignments and accesses
> > from the 1:1 mapping, but since we cannot get rid of those entirely,
> > we might just keep __pgtable_l5_enabled but use RIP_REL_REF() in the
> > accessors, and move the assignment to the asm startup code.
>
> Yeah.
>
> >    asm(ALTERNATIVE_TERNARY(
> >        "movq %%cr4, %[reg] \n\t btl %[la57], %k[reg]" CC_SET(c),
> >        %P[feat], "stc", "clc")
> >        : [reg] "=r" (r), CC_OUT(c) (ret)
> >        : [feat] "i" (X86_FEATURE_LA57),
> >          [la57] "i" (X86_CR4_LA57_BIT)
> >        : "cc");
>
> Creative :)
>
> > but we'd still have two versions in that case.
>
> Yap. RIP_REL_REF() ain't too bad ...
>

We can actually rip all of that stuff out, and have only a single
implementation of pgtable_l5_enabled() that is not based on a variable
at all. It results in a nice cleanup, but I'll keep it as a separate
patch in the next revision so we can easily drop it if preferred.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 08/19] x86/head64: Replace pointer fixups with PIE codegen
  2024-01-29 18:05 ` [PATCH v3 08/19] x86/head64: Replace pointer fixups with PIE codegen Ard Biesheuvel
@ 2024-02-12 10:29   ` Borislav Petkov
  2024-02-12 11:52     ` Ard Biesheuvel
  0 siblings, 1 reply; 52+ messages in thread
From: Borislav Petkov @ 2024-02-12 10:29 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, Ard Biesheuvel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, Jan 29, 2024 at 07:05:11PM +0100, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> Some of the C code in head64.c may be called from a different virtual
> address than it was linked at. Currently, we deal with this by using

Yeah, make passive pls: "Currently, this is done by using... "

> ordinary, position dependent codegen, and fixing up all symbol
> references on the fly. This is fragile and tricky to maintain. It is
> also unnecessary: we can use position independent codegen (with hidden
		   ^^^
Ditto: "use ..."

In the comments below too, pls, where it says "we".

> visibility) to ensure that all compiler generated symbol references are
> RIP-relative, removing the need for fixups entirely.
> 
> It does mean we need explicit references to kernel virtual addresses to
> be generated by hand, so generate those using a movabs instruction in
> inline asm in the handful of places where we actually need this.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/x86/Makefile                 |  8 ++
>  arch/x86/boot/compressed/Makefile |  2 +-
>  arch/x86/include/asm/desc.h       |  3 +-
>  arch/x86/include/asm/setup.h      |  4 +-
>  arch/x86/kernel/Makefile          |  5 ++
>  arch/x86/kernel/head64.c          | 88 +++++++-------------
>  arch/x86/kernel/head_64.S         |  5 +-
>  7 files changed, 51 insertions(+), 64 deletions(-)
> 
> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> index 1a068de12a56..2b5954e75318 100644
> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -168,6 +168,14 @@ else
>          KBUILD_CFLAGS += -mcmodel=kernel
>          KBUILD_RUSTFLAGS += -Cno-redzone=y
>          KBUILD_RUSTFLAGS += -Ccode-model=kernel
> +
> +	PIE_CFLAGS-$(CONFIG_STACKPROTECTOR)	+= -fno-stack-protector

Main Makefile has

KBUILD_CFLAGS += -fno-PIE

and this ends up being:

gcc -Wp,-MMD,arch/x86/kernel/.head64.s.d -nostdinc ... -fno-PIE ... -fpie ... -fverbose-asm -S -o arch/x86/kernel/head64.s arch/x86/kernel/head64.c

Can you pls remove -fno-PIE from those TUs which use PIE_CFLAGS so that
there's no confusion when staring at V=1 output?

> +	PIE_CFLAGS-$(CONFIG_LTO)		+= -fno-lto
> +
> +	PIE_CFLAGS := -fpie -mcmodel=small $(PIE_CFLAGS-y) \
> +		      -include $(srctree)/include/linux/hidden.h
> +
> +	export PIE_CFLAGS
>  endif
>  
>  #

Other than that, that code becomes much more readable, cool!

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 08/19] x86/head64: Replace pointer fixups with PIE codegen
  2024-02-12 10:29   ` Borislav Petkov
@ 2024-02-12 11:52     ` Ard Biesheuvel
  2024-02-12 14:18       ` Borislav Petkov
  0 siblings, 1 reply; 52+ messages in thread
From: Ard Biesheuvel @ 2024-02-12 11:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, 12 Feb 2024 at 11:29, Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Jan 29, 2024 at 07:05:11PM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Some of the C code in head64.c may be called from a different virtual
> > address than it was linked at. Currently, we deal with this by using
>
> Yeah, make passive pls: "Currently, this is done by using... "
>
> > ordinary, position dependent codegen, and fixing up all symbol
> > references on the fly. This is fragile and tricky to maintain. It is
> > also unnecessary: we can use position independent codegen (with hidden
>                    ^^^
> Ditto: "use ..."
>
> In the comments below too, pls, where it says "we".
>

Ack.

> > visibility) to ensure that all compiler generated symbol references are
> > RIP-relative, removing the need for fixups entirely.
> >
> > It does mean we need explicit references to kernel virtual addresses to
> > be generated by hand, so generate those using a movabs instruction in
> > inline asm in the handful of places where we actually need this.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  arch/x86/Makefile                 |  8 ++
> >  arch/x86/boot/compressed/Makefile |  2 +-
> >  arch/x86/include/asm/desc.h       |  3 +-
> >  arch/x86/include/asm/setup.h      |  4 +-
> >  arch/x86/kernel/Makefile          |  5 ++
> >  arch/x86/kernel/head64.c          | 88 +++++++-------------
> >  arch/x86/kernel/head_64.S         |  5 +-
> >  7 files changed, 51 insertions(+), 64 deletions(-)
> >
> > diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> > index 1a068de12a56..2b5954e75318 100644
> > --- a/arch/x86/Makefile
> > +++ b/arch/x86/Makefile
> > @@ -168,6 +168,14 @@ else
> >          KBUILD_CFLAGS += -mcmodel=kernel
> >          KBUILD_RUSTFLAGS += -Cno-redzone=y
> >          KBUILD_RUSTFLAGS += -Ccode-model=kernel
> > +
> > +     PIE_CFLAGS-$(CONFIG_STACKPROTECTOR)     += -fno-stack-protector
>
> Main Makefile has
>
> KBUILD_CFLAGS += -fno-PIE
>
> and this ends up being:
>
> gcc -Wp,-MMD,arch/x86/kernel/.head64.s.d -nostdinc ... -fno-PIE ... -fpie ... -fverbose-asm -S -o arch/x86/kernel/head64.s arch/x86/kernel/head64.c
>
> Can you pls remove -fno-PIE from those TUs which use PIE_CFLAGS so that
> there's no confusion when staring at V=1 output?
>

Yeah. That would mean adding PIE_CFLAGS_REMOVE alongside PIE_CFLAGS
and applying both in every place it is used, but we are only dealing
with a handful of object files here.


> > +     PIE_CFLAGS-$(CONFIG_LTO)                += -fno-lto
> > +
> > +     PIE_CFLAGS := -fpie -mcmodel=small $(PIE_CFLAGS-y) \
> > +                   -include $(srctree)/include/linux/hidden.h
> > +
> > +     export PIE_CFLAGS
> >  endif
> >
> >  #
>
> Other than that, that code becomes much more readable, cool!
>

Thanks. But now that we have RIP_REL_REF(), I might split the cleanup
from the actual switch to -fpie, which I am still a bit on the fence
about, given different compiler versions, LTO, etc.

RIP_REL_REF(foo) just turns into 'foo' when compiling with -fpie and
we could drop those piecemeal once we are confident that -fpie does
not cause any regressions.
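
(For reference, the kind of construct this refers to - a sketch, not
necessarily the exact definition in the tree:)

  #ifndef __pic__
  static __always_inline __pure void *rip_rel_ptr(void *p)
  {
          /* force a RIP-relative reference to the object behind p */
          asm("leaq %c1(%%rip), %0" : "=r" (p) : "i" (p));
          return p;
  }
  #define RIP_REL_REF(var)        (*(typeof(&(var)))rip_rel_ptr(&(var)))
  #else
  /* with -fpie the compiler already emits RIP-relative references */
  #define RIP_REL_REF(var)        (var)
  #endif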

Note that I have some reservations now about .pi.text as well: it is a
bit intrusive, and on x86, we might just as well move everything that
executes from the 1:1 mapping into .head.text, and teach objtool that
those sections should not contain any ELF relocations involving
absolute addresses. But this is another thing that I want to spend a
bit more time on before I respin it, so I will just do the cleanup in
the next revision, and add the rigid correctness checks the next
cycle.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 08/19] x86/head64: Replace pointer fixups with PIE codegen
  2024-02-12 11:52     ` Ard Biesheuvel
@ 2024-02-12 14:18       ` Borislav Petkov
  0 siblings, 0 replies; 52+ messages in thread
From: Borislav Petkov @ 2024-02-12 14:18 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, Feb 12, 2024 at 12:52:01PM +0100, Ard Biesheuvel wrote:
> Yeah. That would means adding PIE_CFLAGS_REMOVE alongside PIE_CFLAGS
> and applying both in every place it is used, but we are only dealing
> with a handful of object files here.

Right.

And we already have such a thing with PURGATORY_CFLAGS_REMOVE.

> Thanks. But now that we have RIP_REL_REF(), I might split the cleanup
> from the actual switch to -fpie, which I am still a bit on the fence
> about, given different compiler versions, LTO, etc.

Tell me about it. Considering how much jumping through hoops we had to
do in recent years to accommodate building the source with the different
compilers, I'm all for being very conservative here.

> RIP_REL_REF(foo) just turns into 'foo' when compiling with -fpie and
> we could drop those piecemeal once we are confident that -fpie does
> not cause any regressions.

Ack.

> Note that I have some reservations now about .pi.text as well: it is a
> bit intrusive, and on x86, we might just as well move everything that
> executes from the 1:1 mapping into .head.text, and teach objtool that
> those sections should not contain any ELF relocations involving
> absolute addresses. But this is another thing that I want to spend a
> bit more time on before I respin it, so I will just do the cleanup in
> the next revision, and add the rigid correctness checks the next
> cycle.

I am fully on board with being conservative and doing things in small
steps considering how many bugs tend to fall out when the stuff hits
upstream. So going slowly and making sure our sanity is intact is a very
good idea!

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 09/19] x86/head64: Simplify GDT/IDT initialization code
  2024-01-29 18:05 ` [PATCH v3 09/19] x86/head64: Simplify GDT/IDT initialization code Ard Biesheuvel
@ 2024-02-12 14:37   ` Borislav Petkov
  2024-02-12 15:23     ` Ard Biesheuvel
  0 siblings, 1 reply; 52+ messages in thread
From: Borislav Petkov @ 2024-02-12 14:37 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, Ard Biesheuvel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, Jan 29, 2024 at 07:05:12PM +0100, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> There used to be two separate code paths for programming the IDT early:
> one that was called via the 1:1 mapping, and one via the kernel virtual
> mapping, where the former used explicit pointer fixups to obtain 1:1
> mapped addresses.
> 
> That distinction is now gone so the GDT/IDT init code can be unified and
> simplified accordingly.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/x86/kernel/head64.c | 57 +++++++-------------
>  1 file changed, 18 insertions(+), 39 deletions(-)

Ok, I don't see anything wrong here and since this one is the last of
the cleanup, lemme stop here so that you can send a new revision. We can
deal with whether we want .pi.text later.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 09/19] x86/head64: Simplify GDT/IDT initialization code
  2024-02-12 14:37   ` Borislav Petkov
@ 2024-02-12 15:23     ` Ard Biesheuvel
  0 siblings, 0 replies; 52+ messages in thread
From: Ard Biesheuvel @ 2024-02-12 15:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ard Biesheuvel, linux-kernel, Kevin Loughlin, Tom Lendacky,
	Dionna Glaze, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Andy Lutomirski, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Justin Stitt, Kees Cook, Brian Gerst,
	linux-arch, llvm

On Mon, 12 Feb 2024 at 15:37, Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Jan 29, 2024 at 07:05:12PM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > There used to be two separate code paths for programming the IDT early:
> > one that was called via the 1:1 mapping, and one via the kernel virtual
> > mapping, where the former used explicit pointer fixups to obtain 1:1
> > mapped addresses.
> >
> > That distinction is now gone so the GDT/IDT init code can be unified and
> > simplified accordingly.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  arch/x86/kernel/head64.c | 57 +++++++-------------
> >  1 file changed, 18 insertions(+), 39 deletions(-)
>
> Ok, I don't see anything wrong here and since this one is the last of
> the cleanup, lemme stop here so that you can send a new revision. We can
> deal with whether we want .pi.text later.
>

OK.

I'll have the next rev out shortly, thanks.

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2024-02-12 15:24 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-29 18:05 [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 01/19] efi/libstub: Add generic support for parsing mem_encrypt= Ard Biesheuvel
2024-01-31  7:31   ` Borislav Petkov
2024-02-01 16:23     ` Kevin Loughlin
2024-02-01 16:28       ` Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 02/19] x86/boot: Move mem_encrypt= parsing to the decompressor Ard Biesheuvel
2024-01-31  8:35   ` Borislav Petkov
2024-01-31  9:12     ` Ard Biesheuvel
2024-01-31  9:29       ` Borislav Petkov
2024-01-31  9:59         ` Ard Biesheuvel
2024-02-01 14:17         ` Tom Lendacky
2024-02-01 16:15           ` Ard Biesheuvel
2024-02-02 16:35             ` [PATCH] x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT Borislav Petkov
2024-02-02 16:47               ` Ard Biesheuvel
2024-02-03 10:50               ` [tip: x86/sev] " tip-bot2 for Borislav Petkov (AMD)
2024-01-29 18:05 ` [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer Ard Biesheuvel
2024-01-31 13:44   ` Borislav Petkov
2024-01-31 13:57     ` Ard Biesheuvel
2024-01-31 14:07       ` Ard Biesheuvel
2024-01-31 16:29         ` Borislav Petkov
2024-01-31 18:14   ` [tip: x86/boot] " tip-bot2 for Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 04/19] x86/startup_64: Simplify calculation of initial page table address Ard Biesheuvel
2024-02-05 10:40   ` Borislav Petkov
2024-01-29 18:05 ` [PATCH v3 05/19] x86/startup_64: Simplify CR4 handling in startup code Ard Biesheuvel
2024-02-06 18:21   ` Borislav Petkov
2024-02-07 10:38     ` Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state Ard Biesheuvel
2024-02-07 13:29   ` Borislav Petkov
2024-02-09 13:55     ` Ard Biesheuvel
2024-02-10 10:40       ` Borislav Petkov
2024-02-11 22:36         ` Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 07/19] x86/startup_64: Simplify virtual switch on primary boot Ard Biesheuvel
2024-02-07 14:50   ` Borislav Petkov
2024-01-29 18:05 ` [PATCH v3 08/19] x86/head64: Replace pointer fixups with PIE codegen Ard Biesheuvel
2024-02-12 10:29   ` Borislav Petkov
2024-02-12 11:52     ` Ard Biesheuvel
2024-02-12 14:18       ` Borislav Petkov
2024-01-29 18:05 ` [PATCH v3 09/19] x86/head64: Simplify GDT/IDT initialization code Ard Biesheuvel
2024-02-12 14:37   ` Borislav Petkov
2024-02-12 15:23     ` Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 10/19] asm-generic: Add special .pi.text section for position independent code Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 11/19] x86: Move return_thunk to __pitext section Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 12/19] x86/head64: Move early startup code into __pitext Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 13/19] modpost: Warn about calls from __pitext into other text sections Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 14/19] x86/coco: Make cc_set_mask() static inline Ard Biesheuvel
2024-01-30 23:16   ` Kevin Loughlin
2024-01-30 23:36     ` Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 15/19] x86/sev: Make all code reachable from 1:1 mapping __pitext Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 16/19] x86/sev: Avoid WARN() in early code Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 17/19] x86/sev: Use PIC codegen for early SEV startup code Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 18/19] x86/sev: Drop inline asm LEA instructions for RIP-relative references Ard Biesheuvel
2024-01-29 18:05 ` [PATCH v3 19/19] x86/startup_64: Don't bother setting up GS before the kernel is mapped Ard Biesheuvel
