kexec.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/9] kexec x86 purgatory cleanup
@ 2024-04-24 15:53 Ard Biesheuvel
  2024-04-24 15:53 ` [RFC PATCH 1/9] x86/purgatory: Drop function entry padding from purgatory Ard Biesheuvel
                   ` (9 more replies)
  0 siblings, 10 replies; 19+ messages in thread
From: Ard Biesheuvel @ 2024-04-24 15:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Ard Biesheuvel, Arnd Bergmann, Eric Biederman, kexec,
	Nathan Chancellor, Nick Desaulniers, Kees Cook, Bill Wendling,
	Justin Stitt, Masahiro Yamada

From: Ard Biesheuvel <ardb@kernel.org>

The kexec purgatory is built like a kernel module, i.e., a partially
linked ELF object where each section is allocated and placed
individually, and all relocations need to be fixed up, even place
relative ones.

This makes sense for kernel modules, which share the address space with
the core kernel, and contain unresolved references that need to be wired
up to symbols in other modules or the kernel itself.

The purgatory, however, is a fully linked binary without any external
references, or any overlap with the kernel's virtual address space. So
it makes much more sense to create a fully linked ELF executable that
can just be loaded and run anywhere in memory.

The purgatory build on x86 has already switched over to position
independent codegen, which only leaves a handful of absolute references,
which can either be dropped (patch #3) or converted into a RIP-relative
one (patch #4). That leaves a purgatory executable that can run at any
offset in memory with applying any relocations whatsoever.

Some tweaks are needed to deal with the difference between partially
(ET_REL) and fully (ET_DYN/ET_EXEC) linked ELF objects, but with those
in place, a substantial amount of complicated ELF allocation, placement
and patching/relocation code can simply be dropped.

The last patch in the series removes this code from the generic kexec
implementation, but this can only be done once other architectures apply
the same changes proposed here for x86 (powerpc, s390 and riscv all
implement the purgatory using the shared logic)

Link: https://lore.kernel.org/all/CAKwvOd=3Jrzju++=Ve61=ZdeshxUM=K3-bGMNREnGOQgNw=aag@mail.gmail.com/
Link: https://lore.kernel.org/all/20240418201705.3673200-2-ardb+git@google.com/

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: kexec@lists.infradead.org
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Bill Wendling <morbo@google.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>

Ard Biesheuvel (9):
  x86/purgatory: Drop function entry padding from purgatory
  x86/purgatory: Simplify stack handling
  x86/purgatory: Drop pointless GDT switch
  x86/purgatory: Avoid absolute reference to GDT
  x86/purgatory: Simplify GDT and drop data segment
  kexec: Add support for fully linked purgatory executables
  x86/purgatory: Use fully linked PIE ELF executable
  x86/purgatory: Simplify references to regs array
  kexec: Drop support for partially linked purgatory executables

 arch/x86/include/asm/kexec.h       |   8 -
 arch/x86/kernel/kexec-bzimage64.c  |   8 -
 arch/x86/kernel/machine_kexec_64.c | 127 ----------
 arch/x86/purgatory/Makefile        |  17 +-
 arch/x86/purgatory/entry64.S       |  96 ++++----
 arch/x86/purgatory/setup-x86_64.S  |  31 +--
 arch/x86/purgatory/stack.S         |  18 --
 include/asm-generic/purgatory.lds  |  34 +++
 kernel/kexec_file.c                | 255 +++-----------------
 9 files changed, 125 insertions(+), 469 deletions(-)
 delete mode 100644 arch/x86/purgatory/stack.S
 create mode 100644 include/asm-generic/purgatory.lds

-- 
2.44.0.769.g3c40516874-goog


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC PATCH 1/9] x86/purgatory: Drop function entry padding from purgatory
  2024-04-24 15:53 [RFC PATCH 0/9] kexec x86 purgatory cleanup Ard Biesheuvel
@ 2024-04-24 15:53 ` Ard Biesheuvel
  2024-04-24 15:53 ` [RFC PATCH 2/9] x86/purgatory: Simplify stack handling Ard Biesheuvel
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Ard Biesheuvel @ 2024-04-24 15:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Ard Biesheuvel, Arnd Bergmann, Eric Biederman, kexec,
	Nathan Chancellor, Nick Desaulniers, Kees Cook, Bill Wendling,
	Justin Stitt, Masahiro Yamada

From: Ard Biesheuvel <ardb@kernel.org>

The purgatory is a completely separate ELF executable carried inside the
kernel as an opaque binary blob. This means that function entry padding
and the associated ELF metadata are not exposed to the branch tracking
and code patching machinery, and can there be dropped from the purgatory
binary.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/purgatory/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index a18591f6e6d9..2df4a4b70ff5 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -23,6 +23,9 @@ KBUILD_CFLAGS := $(filter-out -fprofile-sample-use=% -fprofile-use=%,$(KBUILD_CF
 # by kexec. Remove -flto=* flags.
 KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO),$(KBUILD_CFLAGS))
 
+# Drop the function entry padding, which is not needed here
+KBUILD_CFLAGS := $(filter-out $(PADDING_CFLAGS),$(KBUILD_CFLAGS))
+
 # When linking purgatory.ro with -r unresolved symbols are not checked,
 # also link a purgatory.chk binary without -r to check for unresolved symbols.
 PURGATORY_LDFLAGS := -e purgatory_start -z nodefaultlib
-- 
2.44.0.769.g3c40516874-goog


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 2/9] x86/purgatory: Simplify stack handling
  2024-04-24 15:53 [RFC PATCH 0/9] kexec x86 purgatory cleanup Ard Biesheuvel
  2024-04-24 15:53 ` [RFC PATCH 1/9] x86/purgatory: Drop function entry padding from purgatory Ard Biesheuvel
@ 2024-04-24 15:53 ` Ard Biesheuvel
  2024-04-24 18:26   ` Nathan Chancellor
  2024-04-24 15:53 ` [RFC PATCH 3/9] x86/purgatory: Drop pointless GDT switch Ard Biesheuvel
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Ard Biesheuvel @ 2024-04-24 15:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Ard Biesheuvel, Arnd Bergmann, Eric Biederman, kexec,
	Nathan Chancellor, Nick Desaulniers, Kees Cook, Bill Wendling,
	Justin Stitt, Masahiro Yamada

From: Ard Biesheuvel <ardb@kernel.org>

The x86 purgatory, which does little more than verify a SHA-256 hash of
the loaded segments, currently uses three different stacks:
- one in .bss that is used to call the purgatory C code
- one in .rodata that is only used to switch to an updated code segment
  descriptor in the GDT
- one in .data, which allows it to be prepopulated from the kexec loader
  in theory, but this is not actually being taken advantage of.

Simplify this, by dropping the latter two stacks, as well as the loader
logic that programs RSP.

Both the stacks in .bss and .data are 4k aligned, but 16 byte alignment
is more than sufficient.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/include/asm/kexec.h      |  1 -
 arch/x86/kernel/kexec-bzimage64.c |  8 --------
 arch/x86/purgatory/entry64.S      |  8 --------
 arch/x86/purgatory/setup-x86_64.S |  2 +-
 arch/x86/purgatory/stack.S        | 18 ------------------
 5 files changed, 1 insertion(+), 36 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 91ca9a9ee3a2..ee7b32565e5f 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -163,7 +163,6 @@ struct kexec_entry64_regs {
 	uint64_t rcx;
 	uint64_t rdx;
 	uint64_t rbx;
-	uint64_t rsp;
 	uint64_t rbp;
 	uint64_t rsi;
 	uint64_t rdi;
diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
index cde167b0ea92..f5bf1b7d01a6 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -400,7 +400,6 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
 	unsigned long bootparam_load_addr, kernel_load_addr, initrd_load_addr;
 	struct bzimage64_data *ldata;
 	struct kexec_entry64_regs regs64;
-	void *stack;
 	unsigned int setup_hdr_offset = offsetof(struct boot_params, hdr);
 	unsigned int efi_map_offset, efi_map_sz, efi_setup_data_offset;
 	struct kexec_buf kbuf = { .image = image, .buf_max = ULONG_MAX,
@@ -550,14 +549,7 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
 	regs64.rbx = 0; /* Bootstrap Processor */
 	regs64.rsi = bootparam_load_addr;
 	regs64.rip = kernel_load_addr + 0x200;
-	stack = kexec_purgatory_get_symbol_addr(image, "stack_end");
-	if (IS_ERR(stack)) {
-		pr_err("Could not find address of symbol stack_end\n");
-		ret = -EINVAL;
-		goto out_free_params;
-	}
 
-	regs64.rsp = (unsigned long)stack;
 	ret = kexec_purgatory_get_set_symbol(image, "entry64_regs", &regs64,
 					     sizeof(regs64), 0);
 	if (ret)
diff --git a/arch/x86/purgatory/entry64.S b/arch/x86/purgatory/entry64.S
index 0b4390ce586b..9913877b0dbe 100644
--- a/arch/x86/purgatory/entry64.S
+++ b/arch/x86/purgatory/entry64.S
@@ -26,8 +26,6 @@ SYM_CODE_START(entry64)
 	movl    %eax, %fs
 	movl    %eax, %gs
 
-	/* Setup new stack */
-	leaq    stack_init(%rip), %rsp
 	pushq   $0x10 /* CS */
 	leaq    new_cs_exit(%rip), %rax
 	pushq   %rax
@@ -41,7 +39,6 @@ new_cs_exit:
 	movq	rdx(%rip), %rdx
 	movq	rsi(%rip), %rsi
 	movq	rdi(%rip), %rdi
-	movq    rsp(%rip), %rsp
 	movq	rbp(%rip), %rbp
 	movq	r8(%rip), %r8
 	movq	r9(%rip), %r9
@@ -63,7 +60,6 @@ rax:	.quad 0x0
 rcx:	.quad 0x0
 rdx:	.quad 0x0
 rbx:	.quad 0x0
-rsp:	.quad 0x0
 rbp:	.quad 0x0
 rsi:	.quad 0x0
 rdi:	.quad 0x0
@@ -97,7 +93,3 @@ SYM_DATA_START_LOCAL(gdt)
 	/* 0x18 4GB flat data segment */
 	.word 0xFFFF, 0x0000, 0x9200, 0x00CF
 SYM_DATA_END_LABEL(gdt, SYM_L_LOCAL, gdt_end)
-
-SYM_DATA_START_LOCAL(stack)
-	.quad   0, 0
-SYM_DATA_END_LABEL(stack, SYM_L_LOCAL, stack_init)
diff --git a/arch/x86/purgatory/setup-x86_64.S b/arch/x86/purgatory/setup-x86_64.S
index 89d9e9e53fcd..2d10ff88851d 100644
--- a/arch/x86/purgatory/setup-x86_64.S
+++ b/arch/x86/purgatory/setup-x86_64.S
@@ -53,7 +53,7 @@ SYM_DATA_START_LOCAL(gdt)
 SYM_DATA_END_LABEL(gdt, SYM_L_LOCAL, gdt_end)
 
 	.bss
-	.balign 4096
+	.balign 16
 SYM_DATA_START_LOCAL(lstack)
 	.skip 4096
 SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end)
diff --git a/arch/x86/purgatory/stack.S b/arch/x86/purgatory/stack.S
deleted file mode 100644
index 1ef507ca50a5..000000000000
--- a/arch/x86/purgatory/stack.S
+++ /dev/null
@@ -1,18 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * purgatory:  stack
- *
- * Copyright (C) 2014 Red Hat Inc.
- */
-
-#include <linux/linkage.h>
-
-	/* A stack for the loaded kernel.
-	 * Separate and in the data section so it can be prepopulated.
-	 */
-	.data
-	.balign 4096
-
-SYM_DATA_START(stack)
-	.skip 4096
-SYM_DATA_END_LABEL(stack, SYM_L_GLOBAL, stack_end)
-- 
2.44.0.769.g3c40516874-goog


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 3/9] x86/purgatory: Drop pointless GDT switch
  2024-04-24 15:53 [RFC PATCH 0/9] kexec x86 purgatory cleanup Ard Biesheuvel
  2024-04-24 15:53 ` [RFC PATCH 1/9] x86/purgatory: Drop function entry padding from purgatory Ard Biesheuvel
  2024-04-24 15:53 ` [RFC PATCH 2/9] x86/purgatory: Simplify stack handling Ard Biesheuvel
@ 2024-04-24 15:53 ` Ard Biesheuvel
  2024-04-24 15:53 ` [RFC PATCH 4/9] x86/purgatory: Avoid absolute reference to GDT Ard Biesheuvel
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Ard Biesheuvel @ 2024-04-24 15:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Ard Biesheuvel, Arnd Bergmann, Eric Biederman, kexec,
	Nathan Chancellor, Nick Desaulniers, Kees Cook, Bill Wendling,
	Justin Stitt, Masahiro Yamada

From: Ard Biesheuvel <ardb@kernel.org>

The x86 purgatory switches to a new GDT twice, and the first time, it
doesn't even bother to switch to the new code segment. Given that data
segment selectors are ignored in long mode, and the fact that the GDT is
reprogrammed again after returning from purgatory(), the first switch is
entirely pointless and can just be dropped altogether.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/purgatory/setup-x86_64.S | 29 --------------------
 1 file changed, 29 deletions(-)

diff --git a/arch/x86/purgatory/setup-x86_64.S b/arch/x86/purgatory/setup-x86_64.S
index 2d10ff88851d..f160fc729cbe 100644
--- a/arch/x86/purgatory/setup-x86_64.S
+++ b/arch/x86/purgatory/setup-x86_64.S
@@ -15,17 +15,6 @@
 	.code64
 
 SYM_CODE_START(purgatory_start)
-	/* Load a gdt so I know what the segment registers are */
-	lgdt	gdt(%rip)
-
-	/* load the data segments */
-	movl	$0x18, %eax	/* data segment */
-	movl	%eax, %ds
-	movl	%eax, %es
-	movl	%eax, %ss
-	movl	%eax, %fs
-	movl	%eax, %gs
-
 	/* Setup a stack */
 	leaq	lstack_end(%rip), %rsp
 
@@ -34,24 +23,6 @@ SYM_CODE_START(purgatory_start)
 	jmp	entry64
 SYM_CODE_END(purgatory_start)
 
-	.section ".rodata"
-	.balign 16
-SYM_DATA_START_LOCAL(gdt)
-	/* 0x00 unusable segment
-	 * 0x08 unused
-	 * so use them as the gdt ptr
-	 */
-	.word	gdt_end - gdt - 1
-	.quad	gdt
-	.word	0, 0, 0
-
-	/* 0x10 4GB flat code segment */
-	.word	0xFFFF, 0x0000, 0x9A00, 0x00AF
-
-	/* 0x18 4GB flat data segment */
-	.word	0xFFFF, 0x0000, 0x9200, 0x00CF
-SYM_DATA_END_LABEL(gdt, SYM_L_LOCAL, gdt_end)
-
 	.bss
 	.balign 16
 SYM_DATA_START_LOCAL(lstack)
-- 
2.44.0.769.g3c40516874-goog


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 4/9] x86/purgatory: Avoid absolute reference to GDT
  2024-04-24 15:53 [RFC PATCH 0/9] kexec x86 purgatory cleanup Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2024-04-24 15:53 ` [RFC PATCH 3/9] x86/purgatory: Drop pointless GDT switch Ard Biesheuvel
@ 2024-04-24 15:53 ` Ard Biesheuvel
  2024-04-24 17:38   ` Brian Gerst
  2024-04-24 15:53 ` [RFC PATCH 5/9] x86/purgatory: Simplify GDT and drop data segment Ard Biesheuvel
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Ard Biesheuvel @ 2024-04-24 15:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Ard Biesheuvel, Arnd Bergmann, Eric Biederman, kexec,
	Nathan Chancellor, Nick Desaulniers, Kees Cook, Bill Wendling,
	Justin Stitt, Masahiro Yamada

From: Ard Biesheuvel <ardb@kernel.org>

The purgatory is almost entirely position independent, without any need
for any relocation processing at load time except for the reference to
the GDT in the entry code. Generate this reference at runtime instead,
to remove the last R_X86_64_64 relocation from this code.

While the GDT itself needs to be preserved in memory as long as it is
live, the GDT descriptor that is used to program the GDT can be
discarded so it can be allocated on the stack.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/purgatory/entry64.S | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/purgatory/entry64.S b/arch/x86/purgatory/entry64.S
index 9913877b0dbe..888661d9db9c 100644
--- a/arch/x86/purgatory/entry64.S
+++ b/arch/x86/purgatory/entry64.S
@@ -16,7 +16,11 @@
 
 SYM_CODE_START(entry64)
 	/* Setup a gdt that should be preserved */
-	lgdt gdt(%rip)
+	leaq	gdt(%rip), %rax
+	pushq	%rax
+	pushw	$gdt_end - gdt - 1
+	lgdt	(%rsp)
+	addq	$10, %rsp
 
 	/* load the data segments */
 	movl    $0x18, %eax     /* data segment */
@@ -83,8 +87,8 @@ SYM_DATA_START_LOCAL(gdt)
 	 * 0x08 unused
 	 * so use them as gdt ptr
 	 */
-	.word gdt_end - gdt - 1
-	.quad gdt
+	.word 0
+	.quad 0
 	.word 0, 0, 0
 
 	/* 0x10 4GB flat code segment */
-- 
2.44.0.769.g3c40516874-goog


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 5/9] x86/purgatory: Simplify GDT and drop data segment
  2024-04-24 15:53 [RFC PATCH 0/9] kexec x86 purgatory cleanup Ard Biesheuvel
                   ` (3 preceding siblings ...)
  2024-04-24 15:53 ` [RFC PATCH 4/9] x86/purgatory: Avoid absolute reference to GDT Ard Biesheuvel
@ 2024-04-24 15:53 ` Ard Biesheuvel
  2024-04-24 15:53 ` [RFC PATCH 6/9] kexec: Add support for fully linked purgatory executables Ard Biesheuvel
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Ard Biesheuvel @ 2024-04-24 15:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Ard Biesheuvel, Arnd Bergmann, Eric Biederman, kexec,
	Nathan Chancellor, Nick Desaulniers, Kees Cook, Bill Wendling,
	Justin Stitt, Masahiro Yamada

From: Ard Biesheuvel <ardb@kernel.org>

Data segment selectors are ignored in long mode so there is no point in
programming them. So clear them instead. This only leaves the code
segment entry in the GDT, which can be moved up a slot now that the
second slot is no longer used as the GDT descriptor.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/purgatory/entry64.S | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/arch/x86/purgatory/entry64.S b/arch/x86/purgatory/entry64.S
index 888661d9db9c..3d09781d4f9a 100644
--- a/arch/x86/purgatory/entry64.S
+++ b/arch/x86/purgatory/entry64.S
@@ -23,14 +23,14 @@ SYM_CODE_START(entry64)
 	addq	$10, %rsp
 
 	/* load the data segments */
-	movl    $0x18, %eax     /* data segment */
+	xorl    %eax, %eax     /* data segment */
 	movl    %eax, %ds
 	movl    %eax, %es
 	movl    %eax, %ss
 	movl    %eax, %fs
 	movl    %eax, %gs
 
-	pushq   $0x10 /* CS */
+	pushq   $0x8 /* CS */
 	leaq    new_cs_exit(%rip), %rax
 	pushq   %rax
 	lretq
@@ -84,16 +84,9 @@ SYM_DATA_END(entry64_regs)
 SYM_DATA_START_LOCAL(gdt)
 	/*
 	 * 0x00 unusable segment
-	 * 0x08 unused
-	 * so use them as gdt ptr
 	 */
-	.word 0
 	.quad 0
-	.word 0, 0, 0
 
-	/* 0x10 4GB flat code segment */
+	/* 0x8 4GB flat code segment */
 	.word 0xFFFF, 0x0000, 0x9A00, 0x00AF
-
-	/* 0x18 4GB flat data segment */
-	.word 0xFFFF, 0x0000, 0x9200, 0x00CF
 SYM_DATA_END_LABEL(gdt, SYM_L_LOCAL, gdt_end)
-- 
2.44.0.769.g3c40516874-goog


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 6/9] kexec: Add support for fully linked purgatory executables
  2024-04-24 15:53 [RFC PATCH 0/9] kexec x86 purgatory cleanup Ard Biesheuvel
                   ` (4 preceding siblings ...)
  2024-04-24 15:53 ` [RFC PATCH 5/9] x86/purgatory: Simplify GDT and drop data segment Ard Biesheuvel
@ 2024-04-24 15:53 ` Ard Biesheuvel
  2024-04-24 15:53 ` [RFC PATCH 7/9] x86/purgatory: Use fully linked PIE ELF executable Ard Biesheuvel
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Ard Biesheuvel @ 2024-04-24 15:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Ard Biesheuvel, Arnd Bergmann, Eric Biederman, kexec,
	Nathan Chancellor, Nick Desaulniers, Kees Cook, Bill Wendling,
	Justin Stitt, Masahiro Yamada

From: Ard Biesheuvel <ardb@kernel.org>

The purgatory ELF object is typically a partially linked object, which
puts the burden on the kexec loader to lay out the executable in memory,
and this involves (among other things) deciding the placement of the
sections in memory, and fixing up all relocations (relative and absolute
ones)

All of this can be greatly simplified by using a fully linked PIE ELF
executable instead, constructed in a way that removes the need for any
relocation processing or layout and allocation of individual sections.

By gathering all allocatable sections into a single PT_LOAD segment, and
relying on RIP-relative references, all relocations will be applied by
the linker, and the segment simply needs to be copied into memory.

So add a linker script and some minimal handling in generic code, which
can be used by architectures to opt into this approach. This will be
wired up for x86 in a subsequent patch.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 include/asm-generic/purgatory.lds | 34 ++++++++++
 kernel/kexec_file.c               | 68 +++++++++++++++++++-
 2 files changed, 101 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/purgatory.lds b/include/asm-generic/purgatory.lds
new file mode 100644
index 000000000000..260c457f7608
--- /dev/null
+++ b/include/asm-generic/purgatory.lds
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+PHDRS
+{
+	text PT_LOAD FLAGS(7) FILEHDR PHDRS;
+}
+
+SECTIONS
+{
+	. = SIZEOF_HEADERS;
+
+	.text : {
+		*(.text .rodata* .kexec-purgatory .data*)
+	} :text
+
+	.bss : {
+		*(.bss .dynbss)
+	} :text
+
+	.rela.dyn : {
+		*(.rela.*)
+	}
+
+	.symtab 0 : { *(.symtab) }
+	.strtab 0 : { *(.strtab) }
+	.shstrtab 0 : { *(.shstrtab) }
+
+	/DISCARD/ : {
+		*(.interp .modinfo .dynsym .dynstr .hash .gnu.* .dynamic .comment)
+		*(.got .plt .got.plt .plt.got .note.* .eh_frame .sframe)
+	}
+}
+
+ASSERT(SIZEOF(.rela.dyn) == 0, "Absolute relocations detected");
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index bef2f6f2571b..6379f8dfc29f 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -1010,6 +1010,62 @@ static int kexec_apply_relocations(struct kimage *image)
 	return 0;
 }
 
+/*
+ * kexec_load_purgatory_pie - Load the position independent purgatory object.
+ * @pi:		Purgatory info struct.
+ * @kbuf:	Memory parameters to use.
+ *
+ * Load a purgatory PIE executable. This is a fully linked executable
+ * consisting of a single PT_LOAD segment that does not require any relocation
+ * processing.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+static int kexec_load_purgatory_pie(struct purgatory_info *pi,
+				    struct kexec_buf *kbuf)
+{
+	const Elf_Phdr *phdr = (void *)pi->ehdr + pi->ehdr->e_phoff;
+	int ret;
+
+	if (pi->ehdr->e_phnum != 1)
+		return -EINVAL;
+
+	kbuf->bufsz = phdr->p_filesz;
+	kbuf->memsz = phdr->p_memsz;
+	kbuf->buf_align = phdr->p_align;
+
+	kbuf->buffer = vzalloc(kbuf->bufsz);
+	if (!kbuf->buffer)
+		return -ENOMEM;
+
+	ret = kexec_add_buffer(kbuf);
+	if (ret)
+		goto out_free_kbuf;
+
+	kbuf->image->start = kbuf->mem + pi->ehdr->e_entry;
+
+	pi->sechdrs = vcalloc(pi->ehdr->e_shnum, pi->ehdr->e_shentsize);
+	if (!pi->sechdrs)
+		goto out_free_kbuf;
+
+	pi->purgatory_buf = memcpy(kbuf->buffer,
+				   (void *)pi->ehdr + phdr->p_offset,
+				   kbuf->bufsz);
+
+	memcpy(pi->sechdrs, (void *)pi->ehdr + pi->ehdr->e_shoff,
+	       pi->ehdr->e_shnum * pi->ehdr->e_shentsize);
+
+	for (int i = 0; i < pi->ehdr->e_shnum; i++)
+		if (pi->sechdrs[i].sh_flags & SHF_ALLOC)
+			pi->sechdrs[i].sh_addr += kbuf->mem;
+
+	return 0;
+
+out_free_kbuf:
+	vfree(kbuf->buffer);
+	return ret;
+}
+
 /*
  * kexec_load_purgatory - Load and relocate the purgatory object.
  * @image:	Image to add the purgatory to.
@@ -1031,6 +1087,9 @@ int kexec_load_purgatory(struct kimage *image, struct kexec_buf *kbuf)
 
 	pi->ehdr = (const Elf_Ehdr *)kexec_purgatory;
 
+	if (pi->ehdr->e_type != ET_REL)
+		return kexec_load_purgatory_pie(pi, kbuf);
+
 	ret = kexec_purgatory_setup_kbuf(pi, kbuf);
 	if (ret)
 		return ret;
@@ -1087,7 +1146,8 @@ static const Elf_Sym *kexec_purgatory_find_symbol(struct purgatory_info *pi,
 
 		/* Go through symbols for a match */
 		for (k = 0; k < sechdrs[i].sh_size/sizeof(Elf_Sym); k++) {
-			if (ELF_ST_BIND(syms[k].st_info) != STB_GLOBAL)
+			if (pi->ehdr->e_type == ET_REL &&
+			    ELF_ST_BIND(syms[k].st_info) != STB_GLOBAL)
 				continue;
 
 			if (strcmp(strtab + syms[k].st_name, name) != 0)
@@ -1159,6 +1219,12 @@ int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
 
 	sym_buf = (char *)pi->purgatory_buf + sec->sh_offset + sym->st_value;
 
+	if (pi->ehdr->e_type != ET_REL) {
+		const Elf_Shdr *shdr = (void *)pi->ehdr + pi->ehdr->e_shoff;
+
+		sym_buf -= shdr[sym->st_shndx].sh_addr;
+	}
+
 	if (get_value)
 		memcpy((void *)buf, sym_buf, size);
 	else
-- 
2.44.0.769.g3c40516874-goog


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 7/9] x86/purgatory: Use fully linked PIE ELF executable
  2024-04-24 15:53 [RFC PATCH 0/9] kexec x86 purgatory cleanup Ard Biesheuvel
                   ` (5 preceding siblings ...)
  2024-04-24 15:53 ` [RFC PATCH 6/9] kexec: Add support for fully linked purgatory executables Ard Biesheuvel
@ 2024-04-24 15:53 ` Ard Biesheuvel
  2024-04-24 15:53 ` [RFC PATCH 8/9] x86/purgatory: Simplify references to regs array Ard Biesheuvel
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Ard Biesheuvel @ 2024-04-24 15:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Ard Biesheuvel, Arnd Bergmann, Eric Biederman, kexec,
	Nathan Chancellor, Nick Desaulniers, Kees Cook, Bill Wendling,
	Justin Stitt, Masahiro Yamada

From: Ard Biesheuvel <ardb@kernel.org>

Now that the generic support is in place, switch to a fully linked PIE
ELF executable for the purgatory, so that it can be loaded as a single,
fully relocated image. This allows a lot of ugly post-processing logic
to simply be dropped.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/include/asm/kexec.h       |   7 --
 arch/x86/kernel/machine_kexec_64.c | 127 --------------------
 arch/x86/purgatory/Makefile        |  14 +--
 3 files changed, 5 insertions(+), 143 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index ee7b32565e5f..c7cacc2e9dfb 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -191,13 +191,6 @@ void arch_kexec_unprotect_crashkres(void);
 #define arch_kexec_unprotect_crashkres arch_kexec_unprotect_crashkres
 
 #ifdef CONFIG_KEXEC_FILE
-struct purgatory_info;
-int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
-				     Elf_Shdr *section,
-				     const Elf_Shdr *relsec,
-				     const Elf_Shdr *symtab);
-#define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add
-
 int arch_kimage_file_post_load_cleanup(struct kimage *image);
 #define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
 #endif
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index bc0a5348b4a6..ded924423e50 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -371,133 +371,6 @@ void machine_kexec(struct kimage *image)
 /* arch-dependent functionality related to kexec file-based syscall */
 
 #ifdef CONFIG_KEXEC_FILE
-/*
- * Apply purgatory relocations.
- *
- * @pi:		Purgatory to be relocated.
- * @section:	Section relocations applying to.
- * @relsec:	Section containing RELAs.
- * @symtabsec:	Corresponding symtab.
- *
- * TODO: Some of the code belongs to generic code. Move that in kexec.c.
- */
-int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
-				     Elf_Shdr *section, const Elf_Shdr *relsec,
-				     const Elf_Shdr *symtabsec)
-{
-	unsigned int i;
-	Elf64_Rela *rel;
-	Elf64_Sym *sym;
-	void *location;
-	unsigned long address, sec_base, value;
-	const char *strtab, *name, *shstrtab;
-	const Elf_Shdr *sechdrs;
-
-	/* String & section header string table */
-	sechdrs = (void *)pi->ehdr + pi->ehdr->e_shoff;
-	strtab = (char *)pi->ehdr + sechdrs[symtabsec->sh_link].sh_offset;
-	shstrtab = (char *)pi->ehdr + sechdrs[pi->ehdr->e_shstrndx].sh_offset;
-
-	rel = (void *)pi->ehdr + relsec->sh_offset;
-
-	pr_debug("Applying relocate section %s to %u\n",
-		 shstrtab + relsec->sh_name, relsec->sh_info);
-
-	for (i = 0; i < relsec->sh_size / sizeof(*rel); i++) {
-
-		/*
-		 * rel[i].r_offset contains byte offset from beginning
-		 * of section to the storage unit affected.
-		 *
-		 * This is location to update. This is temporary buffer
-		 * where section is currently loaded. This will finally be
-		 * loaded to a different address later, pointed to by
-		 * ->sh_addr. kexec takes care of moving it
-		 *  (kexec_load_segment()).
-		 */
-		location = pi->purgatory_buf;
-		location += section->sh_offset;
-		location += rel[i].r_offset;
-
-		/* Final address of the location */
-		address = section->sh_addr + rel[i].r_offset;
-
-		/*
-		 * rel[i].r_info contains information about symbol table index
-		 * w.r.t which relocation must be made and type of relocation
-		 * to apply. ELF64_R_SYM() and ELF64_R_TYPE() macros get
-		 * these respectively.
-		 */
-		sym = (void *)pi->ehdr + symtabsec->sh_offset;
-		sym += ELF64_R_SYM(rel[i].r_info);
-
-		if (sym->st_name)
-			name = strtab + sym->st_name;
-		else
-			name = shstrtab + sechdrs[sym->st_shndx].sh_name;
-
-		pr_debug("Symbol: %s info: %02x shndx: %02x value=%llx size: %llx\n",
-			 name, sym->st_info, sym->st_shndx, sym->st_value,
-			 sym->st_size);
-
-		if (sym->st_shndx == SHN_UNDEF) {
-			pr_err("Undefined symbol: %s\n", name);
-			return -ENOEXEC;
-		}
-
-		if (sym->st_shndx == SHN_COMMON) {
-			pr_err("symbol '%s' in common section\n", name);
-			return -ENOEXEC;
-		}
-
-		if (sym->st_shndx == SHN_ABS)
-			sec_base = 0;
-		else if (sym->st_shndx >= pi->ehdr->e_shnum) {
-			pr_err("Invalid section %d for symbol %s\n",
-			       sym->st_shndx, name);
-			return -ENOEXEC;
-		} else
-			sec_base = pi->sechdrs[sym->st_shndx].sh_addr;
-
-		value = sym->st_value;
-		value += sec_base;
-		value += rel[i].r_addend;
-
-		switch (ELF64_R_TYPE(rel[i].r_info)) {
-		case R_X86_64_NONE:
-			break;
-		case R_X86_64_64:
-			*(u64 *)location = value;
-			break;
-		case R_X86_64_32:
-			*(u32 *)location = value;
-			if (value != *(u32 *)location)
-				goto overflow;
-			break;
-		case R_X86_64_32S:
-			*(s32 *)location = value;
-			if ((s64)value != *(s32 *)location)
-				goto overflow;
-			break;
-		case R_X86_64_PC32:
-		case R_X86_64_PLT32:
-			value -= (u64)address;
-			*(u32 *)location = value;
-			break;
-		default:
-			pr_err("Unknown rela relocation: %llu\n",
-			       ELF64_R_TYPE(rel[i].r_info));
-			return -ENOEXEC;
-		}
-	}
-	return 0;
-
-overflow:
-	pr_err("Overflow in relocation type %d value 0x%lx\n",
-	       (int)ELF64_R_TYPE(rel[i].r_info), value);
-	return -ENOEXEC;
-}
-
 int arch_kimage_file_post_load_cleanup(struct kimage *image)
 {
 	vfree(image->elf_headers);
diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 2df4a4b70ff5..acc09799af2a 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -26,12 +26,11 @@ KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO),$(KBUILD_CFLAGS))
 # Drop the function entry padding, which is not needed here
 KBUILD_CFLAGS := $(filter-out $(PADDING_CFLAGS),$(KBUILD_CFLAGS))
 
-# When linking purgatory.ro with -r unresolved symbols are not checked,
-# also link a purgatory.chk binary without -r to check for unresolved symbols.
 PURGATORY_LDFLAGS := -e purgatory_start -z nodefaultlib
-LDFLAGS_purgatory.ro := -r $(PURGATORY_LDFLAGS)
-LDFLAGS_purgatory.chk := $(PURGATORY_LDFLAGS)
-targets += purgatory.ro purgatory.chk
+PURGATORY_LDFLAGS += -T $(srctree)/include/asm-generic/purgatory.lds -pie
+PURGATORY_LDFLAGS += --orphan-handling=$(CONFIG_LD_ORPHAN_WARN_LEVEL)
+LDFLAGS_purgatory.ro := $(PURGATORY_LDFLAGS)
+targets += purgatory.ro
 
 # Sanitizer, etc. runtimes are unavailable and cannot be linked here.
 GCOV_PROFILE	:= n
@@ -87,9 +86,6 @@ asflags-remove-y		+= $(foreach x, -g -gdwarf-4 -gdwarf-5, $(x) -Wa,$(x))
 $(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
 		$(call if_changed,ld)
 
-$(obj)/purgatory.chk: $(obj)/purgatory.ro FORCE
-		$(call if_changed,ld)
-
-$(obj)/kexec-purgatory.o: $(obj)/purgatory.ro $(obj)/purgatory.chk
+$(obj)/kexec-purgatory.o: $(obj)/purgatory.ro
 
 obj-y += kexec-purgatory.o
-- 
2.44.0.769.g3c40516874-goog


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 8/9] x86/purgatory: Simplify references to regs array
  2024-04-24 15:53 [RFC PATCH 0/9] kexec x86 purgatory cleanup Ard Biesheuvel
                   ` (6 preceding siblings ...)
  2024-04-24 15:53 ` [RFC PATCH 7/9] x86/purgatory: Use fully linked PIE ELF executable Ard Biesheuvel
@ 2024-04-24 15:53 ` Ard Biesheuvel
  2024-04-24 15:53 ` [RFC PATCH 9/9] kexec: Drop support for partially linked purgatory executables Ard Biesheuvel
  2024-04-24 20:04 ` [RFC PATCH 0/9] kexec x86 purgatory cleanup Eric W. Biederman
  9 siblings, 0 replies; 19+ messages in thread
From: Ard Biesheuvel @ 2024-04-24 15:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Ard Biesheuvel, Arnd Bergmann, Eric Biederman, kexec,
	Nathan Chancellor, Nick Desaulniers, Kees Cook, Bill Wendling,
	Justin Stitt, Masahiro Yamada

From: Ard Biesheuvel <ardb@kernel.org>

Use a single symbol reference and offset addressing to load the contents
of the register file from memory, instead of using a symbol reference
for each, which results in larger code and more ELF overhead. While at
it, rename the individual labels with an .L prefix so they are omitted
from the ELF symbol table.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/purgatory/entry64.S | 67 ++++++++++----------
 1 file changed, 34 insertions(+), 33 deletions(-)

diff --git a/arch/x86/purgatory/entry64.S b/arch/x86/purgatory/entry64.S
index 3d09781d4f9a..56487fb7fa1d 100644
--- a/arch/x86/purgatory/entry64.S
+++ b/arch/x86/purgatory/entry64.S
@@ -37,45 +37,46 @@ SYM_CODE_START(entry64)
 new_cs_exit:
 
 	/* Load the registers */
-	movq	rax(%rip), %rax
-	movq	rbx(%rip), %rbx
-	movq	rcx(%rip), %rcx
-	movq	rdx(%rip), %rdx
-	movq	rsi(%rip), %rsi
-	movq	rdi(%rip), %rdi
-	movq	rbp(%rip), %rbp
-	movq	r8(%rip), %r8
-	movq	r9(%rip), %r9
-	movq	r10(%rip), %r10
-	movq	r11(%rip), %r11
-	movq	r12(%rip), %r12
-	movq	r13(%rip), %r13
-	movq	r14(%rip), %r14
-	movq	r15(%rip), %r15
+	leaq	entry64_regs(%rip), %r15
+	movq	0x00(%r15), %rax
+	movq	0x08(%r15), %rcx
+	movq	0x10(%r15), %rdx
+	movq	0x18(%r15), %rbx
+	movq	0x20(%r15), %rbp
+	movq	0x28(%r15), %rsi
+	movq	0x30(%r15), %rdi
+	movq	0x38(%r15), %r8
+	movq	0x40(%r15), %r9
+	movq	0x48(%r15), %r10
+	movq	0x50(%r15), %r11
+	movq	0x58(%r15), %r12
+	movq	0x60(%r15), %r13
+	movq	0x68(%r15), %r14
+	movq	0x70(%r15), %r15
 
 	/* Jump to the new code... */
-	jmpq	*rip(%rip)
+	jmpq	*.Lrip(%rip)
 SYM_CODE_END(entry64)
 
 	.section ".rodata"
-	.balign 4
+	.balign	8
 SYM_DATA_START(entry64_regs)
-rax:	.quad 0x0
-rcx:	.quad 0x0
-rdx:	.quad 0x0
-rbx:	.quad 0x0
-rbp:	.quad 0x0
-rsi:	.quad 0x0
-rdi:	.quad 0x0
-r8:	.quad 0x0
-r9:	.quad 0x0
-r10:	.quad 0x0
-r11:	.quad 0x0
-r12:	.quad 0x0
-r13:	.quad 0x0
-r14:	.quad 0x0
-r15:	.quad 0x0
-rip:	.quad 0x0
+.Lrax:	.quad	0x0
+.Lrcx:	.quad	0x0
+.Lrdx:	.quad	0x0
+.Lrbx:	.quad	0x0
+.Lrbp:	.quad	0x0
+.Lrsi:	.quad	0x0
+.Lrdi:	.quad	0x0
+.Lr8:	.quad	0x0
+.Lr9:	.quad	0x0
+.Lr10:	.quad	0x0
+.Lr11:	.quad	0x0
+.Lr12:	.quad	0x0
+.Lr13:	.quad	0x0
+.Lr14:	.quad	0x0
+.Lr15:	.quad	0x0
+.Lrip:	.quad	0x0
 SYM_DATA_END(entry64_regs)
 
 	/* GDT */
-- 
2.44.0.769.g3c40516874-goog


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 9/9] kexec: Drop support for partially linked purgatory executables
  2024-04-24 15:53 [RFC PATCH 0/9] kexec x86 purgatory cleanup Ard Biesheuvel
                   ` (7 preceding siblings ...)
  2024-04-24 15:53 ` [RFC PATCH 8/9] x86/purgatory: Simplify references to regs array Ard Biesheuvel
@ 2024-04-24 15:53 ` Ard Biesheuvel
  2024-04-24 20:04 ` [RFC PATCH 0/9] kexec x86 purgatory cleanup Eric W. Biederman
  9 siblings, 0 replies; 19+ messages in thread
From: Ard Biesheuvel @ 2024-04-24 15:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Ard Biesheuvel, Arnd Bergmann, Eric Biederman, kexec,
	Nathan Chancellor, Nick Desaulniers, Kees Cook, Bill Wendling,
	Justin Stitt, Masahiro Yamada

From: Ard Biesheuvel <ardb@kernel.org>

Remove the handling of purgatories that are allocated, loaded and
relocated as individual ELF sections, which requires a lot of
post-processing on the part of the kexec loader. This has been
superseded by the use of fully linked PIE executables, which do not
require such post-processing.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 kernel/kexec_file.c | 271 +-------------------
 1 file changed, 14 insertions(+), 257 deletions(-)

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 6379f8dfc29f..782a1247558c 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -808,228 +808,31 @@ static int kexec_calculate_store_digests(struct kimage *image)
 
 #ifdef CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY
 /*
- * kexec_purgatory_setup_kbuf - prepare buffer to load purgatory.
- * @pi:		Purgatory to be loaded.
- * @kbuf:	Buffer to setup.
- *
- * Allocates the memory needed for the buffer. Caller is responsible to free
- * the memory after use.
- *
- * Return: 0 on success, negative errno on error.
- */
-static int kexec_purgatory_setup_kbuf(struct purgatory_info *pi,
-				      struct kexec_buf *kbuf)
-{
-	const Elf_Shdr *sechdrs;
-	unsigned long bss_align;
-	unsigned long bss_sz;
-	unsigned long align;
-	int i, ret;
-
-	sechdrs = (void *)pi->ehdr + pi->ehdr->e_shoff;
-	kbuf->buf_align = bss_align = 1;
-	kbuf->bufsz = bss_sz = 0;
-
-	for (i = 0; i < pi->ehdr->e_shnum; i++) {
-		if (!(sechdrs[i].sh_flags & SHF_ALLOC))
-			continue;
-
-		align = sechdrs[i].sh_addralign;
-		if (sechdrs[i].sh_type != SHT_NOBITS) {
-			if (kbuf->buf_align < align)
-				kbuf->buf_align = align;
-			kbuf->bufsz = ALIGN(kbuf->bufsz, align);
-			kbuf->bufsz += sechdrs[i].sh_size;
-		} else {
-			if (bss_align < align)
-				bss_align = align;
-			bss_sz = ALIGN(bss_sz, align);
-			bss_sz += sechdrs[i].sh_size;
-		}
-	}
-	kbuf->bufsz = ALIGN(kbuf->bufsz, bss_align);
-	kbuf->memsz = kbuf->bufsz + bss_sz;
-	if (kbuf->buf_align < bss_align)
-		kbuf->buf_align = bss_align;
-
-	kbuf->buffer = vzalloc(kbuf->bufsz);
-	if (!kbuf->buffer)
-		return -ENOMEM;
-	pi->purgatory_buf = kbuf->buffer;
-
-	ret = kexec_add_buffer(kbuf);
-	if (ret)
-		goto out;
-
-	return 0;
-out:
-	vfree(pi->purgatory_buf);
-	pi->purgatory_buf = NULL;
-	return ret;
-}
-
-/*
- * kexec_purgatory_setup_sechdrs - prepares the pi->sechdrs buffer.
- * @pi:		Purgatory to be loaded.
- * @kbuf:	Buffer prepared to store purgatory.
- *
- * Allocates the memory needed for the buffer. Caller is responsible to free
- * the memory after use.
- *
- * Return: 0 on success, negative errno on error.
- */
-static int kexec_purgatory_setup_sechdrs(struct purgatory_info *pi,
-					 struct kexec_buf *kbuf)
-{
-	unsigned long bss_addr;
-	unsigned long offset;
-	size_t sechdrs_size;
-	Elf_Shdr *sechdrs;
-	int i;
-
-	/*
-	 * The section headers in kexec_purgatory are read-only. In order to
-	 * have them modifiable make a temporary copy.
-	 */
-	sechdrs_size = array_size(sizeof(Elf_Shdr), pi->ehdr->e_shnum);
-	sechdrs = vzalloc(sechdrs_size);
-	if (!sechdrs)
-		return -ENOMEM;
-	memcpy(sechdrs, (void *)pi->ehdr + pi->ehdr->e_shoff, sechdrs_size);
-	pi->sechdrs = sechdrs;
-
-	offset = 0;
-	bss_addr = kbuf->mem + kbuf->bufsz;
-	kbuf->image->start = pi->ehdr->e_entry;
-
-	for (i = 0; i < pi->ehdr->e_shnum; i++) {
-		unsigned long align;
-		void *src, *dst;
-
-		if (!(sechdrs[i].sh_flags & SHF_ALLOC))
-			continue;
-
-		align = sechdrs[i].sh_addralign;
-		if (sechdrs[i].sh_type == SHT_NOBITS) {
-			bss_addr = ALIGN(bss_addr, align);
-			sechdrs[i].sh_addr = bss_addr;
-			bss_addr += sechdrs[i].sh_size;
-			continue;
-		}
-
-		offset = ALIGN(offset, align);
-
-		/*
-		 * Check if the segment contains the entry point, if so,
-		 * calculate the value of image->start based on it.
-		 * If the compiler has produced more than one .text section
-		 * (Eg: .text.hot), they are generally after the main .text
-		 * section, and they shall not be used to calculate
-		 * image->start. So do not re-calculate image->start if it
-		 * is not set to the initial value, and warn the user so they
-		 * have a chance to fix their purgatory's linker script.
-		 */
-		if (sechdrs[i].sh_flags & SHF_EXECINSTR &&
-		    pi->ehdr->e_entry >= sechdrs[i].sh_addr &&
-		    pi->ehdr->e_entry < (sechdrs[i].sh_addr
-					 + sechdrs[i].sh_size) &&
-		    !WARN_ON(kbuf->image->start != pi->ehdr->e_entry)) {
-			kbuf->image->start -= sechdrs[i].sh_addr;
-			kbuf->image->start += kbuf->mem + offset;
-		}
-
-		src = (void *)pi->ehdr + sechdrs[i].sh_offset;
-		dst = pi->purgatory_buf + offset;
-		memcpy(dst, src, sechdrs[i].sh_size);
-
-		sechdrs[i].sh_addr = kbuf->mem + offset;
-		sechdrs[i].sh_offset = offset;
-		offset += sechdrs[i].sh_size;
-	}
-
-	return 0;
-}
-
-static int kexec_apply_relocations(struct kimage *image)
-{
-	int i, ret;
-	struct purgatory_info *pi = &image->purgatory_info;
-	const Elf_Shdr *sechdrs;
-
-	sechdrs = (void *)pi->ehdr + pi->ehdr->e_shoff;
-
-	for (i = 0; i < pi->ehdr->e_shnum; i++) {
-		const Elf_Shdr *relsec;
-		const Elf_Shdr *symtab;
-		Elf_Shdr *section;
-
-		relsec = sechdrs + i;
-
-		if (relsec->sh_type != SHT_RELA &&
-		    relsec->sh_type != SHT_REL)
-			continue;
-
-		/*
-		 * For section of type SHT_RELA/SHT_REL,
-		 * ->sh_link contains section header index of associated
-		 * symbol table. And ->sh_info contains section header
-		 * index of section to which relocations apply.
-		 */
-		if (relsec->sh_info >= pi->ehdr->e_shnum ||
-		    relsec->sh_link >= pi->ehdr->e_shnum)
-			return -ENOEXEC;
-
-		section = pi->sechdrs + relsec->sh_info;
-		symtab = sechdrs + relsec->sh_link;
-
-		if (!(section->sh_flags & SHF_ALLOC))
-			continue;
-
-		/*
-		 * symtab->sh_link contain section header index of associated
-		 * string table.
-		 */
-		if (symtab->sh_link >= pi->ehdr->e_shnum)
-			/* Invalid section number? */
-			continue;
-
-		/*
-		 * Respective architecture needs to provide support for applying
-		 * relocations of type SHT_RELA/SHT_REL.
-		 */
-		if (relsec->sh_type == SHT_RELA)
-			ret = arch_kexec_apply_relocations_add(pi, section,
-							       relsec, symtab);
-		else if (relsec->sh_type == SHT_REL)
-			ret = arch_kexec_apply_relocations(pi, section,
-							   relsec, symtab);
-		if (ret)
-			return ret;
-	}
-
-	return 0;
-}
-
-/*
- * kexec_load_purgatory_pie - Load the position independent purgatory object.
- * @pi:		Purgatory info struct.
+ * kexec_load_purgatory - Load and relocate the purgatory object.
+ * @image:	Image to add the purgatory to.
  * @kbuf:	Memory parameters to use.
  *
- * Load a purgatory PIE executable. This is a fully linked executable
- * consisting of a single PT_LOAD segment that does not require any relocation
- * processing.
+ * Allocates the memory needed for image->purgatory_info.sechdrs and
+ * image->purgatory_info.purgatory_buf/kbuf->buffer. Caller is responsible
+ * to free the memory after use.
  *
  * Return: 0 on success, negative errno on error.
  */
-static int kexec_load_purgatory_pie(struct purgatory_info *pi,
-				    struct kexec_buf *kbuf)
+int kexec_load_purgatory(struct kimage *image, struct kexec_buf *kbuf)
 {
-	const Elf_Phdr *phdr = (void *)pi->ehdr + pi->ehdr->e_phoff;
+	struct purgatory_info *pi = &image->purgatory_info;
+	const Elf_Phdr *phdr;
 	int ret;
 
+	if (kexec_purgatory_size <= 0)
+		return -EINVAL;
+
+	pi->ehdr = (const Elf_Ehdr *)kexec_purgatory;
 	if (pi->ehdr->e_phnum != 1)
 		return -EINVAL;
 
+	phdr = (void *)pi->ehdr + pi->ehdr->e_phoff;
+
 	kbuf->bufsz = phdr->p_filesz;
 	kbuf->memsz = phdr->p_memsz;
 	kbuf->buf_align = phdr->p_align;
@@ -1066,52 +869,6 @@ static int kexec_load_purgatory_pie(struct purgatory_info *pi,
 	return ret;
 }
 
-/*
- * kexec_load_purgatory - Load and relocate the purgatory object.
- * @image:	Image to add the purgatory to.
- * @kbuf:	Memory parameters to use.
- *
- * Allocates the memory needed for image->purgatory_info.sechdrs and
- * image->purgatory_info.purgatory_buf/kbuf->buffer. Caller is responsible
- * to free the memory after use.
- *
- * Return: 0 on success, negative errno on error.
- */
-int kexec_load_purgatory(struct kimage *image, struct kexec_buf *kbuf)
-{
-	struct purgatory_info *pi = &image->purgatory_info;
-	int ret;
-
-	if (kexec_purgatory_size <= 0)
-		return -EINVAL;
-
-	pi->ehdr = (const Elf_Ehdr *)kexec_purgatory;
-
-	if (pi->ehdr->e_type != ET_REL)
-		return kexec_load_purgatory_pie(pi, kbuf);
-
-	ret = kexec_purgatory_setup_kbuf(pi, kbuf);
-	if (ret)
-		return ret;
-
-	ret = kexec_purgatory_setup_sechdrs(pi, kbuf);
-	if (ret)
-		goto out_free_kbuf;
-
-	ret = kexec_apply_relocations(image);
-	if (ret)
-		goto out;
-
-	return 0;
-out:
-	vfree(pi->sechdrs);
-	pi->sechdrs = NULL;
-out_free_kbuf:
-	vfree(pi->purgatory_buf);
-	pi->purgatory_buf = NULL;
-	return ret;
-}
-
 /*
  * kexec_purgatory_find_symbol - find a symbol in the purgatory
  * @pi:		Purgatory to search in.
-- 
2.44.0.769.g3c40516874-goog


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 4/9] x86/purgatory: Avoid absolute reference to GDT
  2024-04-24 15:53 ` [RFC PATCH 4/9] x86/purgatory: Avoid absolute reference to GDT Ard Biesheuvel
@ 2024-04-24 17:38   ` Brian Gerst
  2024-04-24 17:53     ` Ard Biesheuvel
  0 siblings, 1 reply; 19+ messages in thread
From: Brian Gerst @ 2024-04-24 17:38 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, x86, Ard Biesheuvel, Arnd Bergmann, Eric Biederman,
	kexec, Nathan Chancellor, Nick Desaulniers, Kees Cook,
	Bill Wendling, Justin Stitt, Masahiro Yamada

On Wed, Apr 24, 2024 at 12:06 PM Ard Biesheuvel <ardb+git@google.com> wrote:
>
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The purgatory is almost entirely position independent, without any need
> for any relocation processing at load time except for the reference to
> the GDT in the entry code. Generate this reference at runtime instead,
> to remove the last R_X86_64_64 relocation from this code.
>
> While the GDT itself needs to be preserved in memory as long as it is
> live, the GDT descriptor that is used to program the GDT can be
> discarded so it can be allocated on the stack.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/x86/purgatory/entry64.S | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/purgatory/entry64.S b/arch/x86/purgatory/entry64.S
> index 9913877b0dbe..888661d9db9c 100644
> --- a/arch/x86/purgatory/entry64.S
> +++ b/arch/x86/purgatory/entry64.S
> @@ -16,7 +16,11 @@
>
>  SYM_CODE_START(entry64)
>         /* Setup a gdt that should be preserved */
> -       lgdt gdt(%rip)
> +       leaq    gdt(%rip), %rax
> +       pushq   %rax
> +       pushw   $gdt_end - gdt - 1
> +       lgdt    (%rsp)
> +       addq    $10, %rsp

This misaligns the stack, pushing 16 bytes on the stack but only
removing 10 (decimal).

>
>         /* load the data segments */
>         movl    $0x18, %eax     /* data segment */
> @@ -83,8 +87,8 @@ SYM_DATA_START_LOCAL(gdt)
>          * 0x08 unused
>          * so use them as gdt ptr

obsolete comment

>          */
> -       .word gdt_end - gdt - 1
> -       .quad gdt
> +       .word 0
> +       .quad 0
>         .word 0, 0, 0

This can be condensed down to:
        .quad 0, 0

>
>         /* 0x10 4GB flat code segment */
> --
> 2.44.0.769.g3c40516874-goog

Brian Gerst

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 4/9] x86/purgatory: Avoid absolute reference to GDT
  2024-04-24 17:38   ` Brian Gerst
@ 2024-04-24 17:53     ` Ard Biesheuvel
  2024-04-24 19:00       ` Brian Gerst
  0 siblings, 1 reply; 19+ messages in thread
From: Ard Biesheuvel @ 2024-04-24 17:53 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Ard Biesheuvel, linux-kernel, x86, Arnd Bergmann, Eric Biederman,
	kexec, Nathan Chancellor, Nick Desaulniers, Kees Cook,
	Bill Wendling, Justin Stitt, Masahiro Yamada

Hi Brian,

Thanks for taking a look.

On Wed, 24 Apr 2024 at 19:39, Brian Gerst <brgerst@gmail.com> wrote:
>
> On Wed, Apr 24, 2024 at 12:06 PM Ard Biesheuvel <ardb+git@google.com> wrote:
> >
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > The purgatory is almost entirely position independent, without any need
> > for any relocation processing at load time except for the reference to
> > the GDT in the entry code. Generate this reference at runtime instead,
> > to remove the last R_X86_64_64 relocation from this code.
> >
> > While the GDT itself needs to be preserved in memory as long as it is
> > live, the GDT descriptor that is used to program the GDT can be
> > discarded so it can be allocated on the stack.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  arch/x86/purgatory/entry64.S | 10 +++++++---
> >  1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/purgatory/entry64.S b/arch/x86/purgatory/entry64.S
> > index 9913877b0dbe..888661d9db9c 100644
> > --- a/arch/x86/purgatory/entry64.S
> > +++ b/arch/x86/purgatory/entry64.S
> > @@ -16,7 +16,11 @@
> >
> >  SYM_CODE_START(entry64)
> >         /* Setup a gdt that should be preserved */
> > -       lgdt gdt(%rip)
> > +       leaq    gdt(%rip), %rax
> > +       pushq   %rax
> > +       pushw   $gdt_end - gdt - 1
> > +       lgdt    (%rsp)
> > +       addq    $10, %rsp
>
> This misaligns the stack, pushing 16 bytes on the stack but only
> removing 10 (decimal).
>

pushw subtracts 2 from RSP and stores a word. So the total size stored
is 10 decimal not 16.

> >
> >         /* load the data segments */
> >         movl    $0x18, %eax     /* data segment */
> > @@ -83,8 +87,8 @@ SYM_DATA_START_LOCAL(gdt)
> >          * 0x08 unused
> >          * so use them as gdt ptr
>
> obsolete comment
>
> >          */
> > -       .word gdt_end - gdt - 1
> > -       .quad gdt
> > +       .word 0
> > +       .quad 0
> >         .word 0, 0, 0
>
> This can be condensed down to:
>         .quad 0, 0
>

This code and the comment are removed in the next patch.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 2/9] x86/purgatory: Simplify stack handling
  2024-04-24 15:53 ` [RFC PATCH 2/9] x86/purgatory: Simplify stack handling Ard Biesheuvel
@ 2024-04-24 18:26   ` Nathan Chancellor
  2024-04-26 21:32     ` Justin Stitt
  0 siblings, 1 reply; 19+ messages in thread
From: Nathan Chancellor @ 2024-04-24 18:26 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, x86, Ard Biesheuvel, Arnd Bergmann, Eric Biederman,
	kexec, Nick Desaulniers, Kees Cook, Bill Wendling, Justin Stitt,
	Masahiro Yamada

On Wed, Apr 24, 2024 at 05:53:12PM +0200, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> The x86 purgatory, which does little more than verify a SHA-256 hash of
> the loaded segments, currently uses three different stacks:
> - one in .bss that is used to call the purgatory C code
> - one in .rodata that is only used to switch to an updated code segment
>   descriptor in the GDT
> - one in .data, which allows it to be prepopulated from the kexec loader
>   in theory, but this is not actually being taken advantage of.
> 
> Simplify this, by dropping the latter two stacks, as well as the loader
> logic that programs RSP.
> 
> Both the stacks in .bss and .data are 4k aligned, but 16 byte alignment
> is more than sufficient.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/x86/include/asm/kexec.h      |  1 -
>  arch/x86/kernel/kexec-bzimage64.c |  8 --------
>  arch/x86/purgatory/entry64.S      |  8 --------
>  arch/x86/purgatory/setup-x86_64.S |  2 +-
>  arch/x86/purgatory/stack.S        | 18 ------------------

This needs a small fix up to build.

  make[6]: *** No rule to make target 'arch/x86/purgatory/stack.o', needed by 'arch/x86/purgatory/purgatory.ro'.

diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index acc09799af2a..2b6b2fb033d6 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 OBJECT_FILES_NON_STANDARD := y
 
-purgatory-y := purgatory.o stack.o setup-x86_$(BITS).o sha256.o entry64.o string.o
+purgatory-y := purgatory.o setup-x86_$(BITS).o sha256.o entry64.o string.o
 
 targets += $(purgatory-y)
 PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 4/9] x86/purgatory: Avoid absolute reference to GDT
  2024-04-24 17:53     ` Ard Biesheuvel
@ 2024-04-24 19:00       ` Brian Gerst
  0 siblings, 0 replies; 19+ messages in thread
From: Brian Gerst @ 2024-04-24 19:00 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Ard Biesheuvel, linux-kernel, x86, Arnd Bergmann, Eric Biederman,
	kexec, Nathan Chancellor, Nick Desaulniers, Kees Cook,
	Bill Wendling, Justin Stitt, Masahiro Yamada

On Wed, Apr 24, 2024 at 1:53 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Hi Brian,
>
> Thanks for taking a look.
>
> On Wed, 24 Apr 2024 at 19:39, Brian Gerst <brgerst@gmail.com> wrote:
> >
> > On Wed, Apr 24, 2024 at 12:06 PM Ard Biesheuvel <ardb+git@google.com> wrote:
> > >
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > The purgatory is almost entirely position independent, without any need
> > > for any relocation processing at load time except for the reference to
> > > the GDT in the entry code. Generate this reference at runtime instead,
> > > to remove the last R_X86_64_64 relocation from this code.
> > >
> > > While the GDT itself needs to be preserved in memory as long as it is
> > > live, the GDT descriptor that is used to program the GDT can be
> > > discarded so it can be allocated on the stack.
> > >
> > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > > ---
> > >  arch/x86/purgatory/entry64.S | 10 +++++++---
> > >  1 file changed, 7 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/x86/purgatory/entry64.S b/arch/x86/purgatory/entry64.S
> > > index 9913877b0dbe..888661d9db9c 100644
> > > --- a/arch/x86/purgatory/entry64.S
> > > +++ b/arch/x86/purgatory/entry64.S
> > > @@ -16,7 +16,11 @@
> > >
> > >  SYM_CODE_START(entry64)
> > >         /* Setup a gdt that should be preserved */
> > > -       lgdt gdt(%rip)
> > > +       leaq    gdt(%rip), %rax
> > > +       pushq   %rax
> > > +       pushw   $gdt_end - gdt - 1
> > > +       lgdt    (%rsp)
> > > +       addq    $10, %rsp
> >
> > This misaligns the stack, pushing 16 bytes on the stack but only
> > removing 10 (decimal).
> >
>
> pushw subtracts 2 from RSP and stores a word. So the total size stored
> is 10 decimal not 16.

I didn't notice the pushw, since I didn't think a 16-bit push was even
possible in 64-bit mode. Unexpected, but clever.


Brian Gerst

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/9] kexec x86 purgatory cleanup
  2024-04-24 15:53 [RFC PATCH 0/9] kexec x86 purgatory cleanup Ard Biesheuvel
                   ` (8 preceding siblings ...)
  2024-04-24 15:53 ` [RFC PATCH 9/9] kexec: Drop support for partially linked purgatory executables Ard Biesheuvel
@ 2024-04-24 20:04 ` Eric W. Biederman
  2024-04-24 20:52   ` Ard Biesheuvel
  9 siblings, 1 reply; 19+ messages in thread
From: Eric W. Biederman @ 2024-04-24 20:04 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, x86, Ard Biesheuvel, Arnd Bergmann, kexec,
	Nathan Chancellor, Nick Desaulniers, Kees Cook, Bill Wendling,
	Justin Stitt, Masahiro Yamada

Ard Biesheuvel <ardb+git@google.com> writes:

> From: Ard Biesheuvel <ardb@kernel.org>
>
> The kexec purgatory is built like a kernel module, i.e., a partially
> linked ELF object where each section is allocated and placed
> individually, and all relocations need to be fixed up, even place
> relative ones.
>
> This makes sense for kernel modules, which share the address space with
> the core kernel, and contain unresolved references that need to be wired
> up to symbols in other modules or the kernel itself.
>
> The purgatory, however, is a fully linked binary without any external
> references, or any overlap with the kernel's virtual address space. So
> it makes much more sense to create a fully linked ELF executable that
> can just be loaded and run anywhere in memory.

It does have external references that are resolved when it is loaded.

Further it is at least my impression that non-PIC code is more
efficient.  PIC typically requires silly things like Global Offset
Tables that non-PIC code does not.  At first glance this looks like a
code passivization.

Now at lot of functionality has been stripped out of purgatory so maybe
in it's stripped down this make sense, but I want to challenge the
notion that this is the obvious thing to do.

> The purgatory build on x86 has already switched over to position
> independent codegen, which only leaves a handful of absolute references,
> which can either be dropped (patch #3) or converted into a RIP-relative
> one (patch #4). That leaves a purgatory executable that can run at any
> offset in memory with applying any relocations whatsoever.

I missed that conversation.  Do you happen to have a pointer?  I would
think the 32bit code is where the PIC would be most costly as the 32bit
x86 instruction set predates PIC being a common compilation target.

> Some tweaks are needed to deal with the difference between partially
> (ET_REL) and fully (ET_DYN/ET_EXEC) linked ELF objects, but with those
> in place, a substantial amount of complicated ELF allocation, placement
> and patching/relocation code can simply be dropped.

Really?  As I recall it only needed to handle a single allocation type,
and there were good reasons (at least when I wrote it) to patch symbols.

Again maybe the fact that people have removed 90% of the functionality
makes this make sense, but that is not obvious at first glance.

> The last patch in the series removes this code from the generic kexec
> implementation, but this can only be done once other architectures apply
> the same changes proposed here for x86 (powerpc, s390 and riscv all
> implement the purgatory using the shared logic)
>
> Link: https://lore.kernel.org/all/CAKwvOd=3Jrzju++=Ve61=ZdeshxUM=K3-bGMNREnGOQgNw=aag@mail.gmail.com/
> Link: https://lore.kernel.org/all/20240418201705.3673200-2-ardb+git@google.com/
>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: kexec@lists.infradead.org
> Cc: Nathan Chancellor <nathan@kernel.org>
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Bill Wendling <morbo@google.com>
> Cc: Justin Stitt <justinstitt@google.com>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
>
> Ard Biesheuvel (9):
>   x86/purgatory: Drop function entry padding from purgatory
>   x86/purgatory: Simplify stack handling
>   x86/purgatory: Drop pointless GDT switch
>   x86/purgatory: Avoid absolute reference to GDT
>   x86/purgatory: Simplify GDT and drop data segment
>   kexec: Add support for fully linked purgatory executables
>   x86/purgatory: Use fully linked PIE ELF executable
>   x86/purgatory: Simplify references to regs array
>   kexec: Drop support for partially linked purgatory executables
>
>  arch/x86/include/asm/kexec.h       |   8 -
>  arch/x86/kernel/kexec-bzimage64.c  |   8 -
>  arch/x86/kernel/machine_kexec_64.c | 127 ----------
>  arch/x86/purgatory/Makefile        |  17 +-
>  arch/x86/purgatory/entry64.S       |  96 ++++----
>  arch/x86/purgatory/setup-x86_64.S  |  31 +--
>  arch/x86/purgatory/stack.S         |  18 --
>  include/asm-generic/purgatory.lds  |  34 +++
>  kernel/kexec_file.c                | 255 +++-----------------
>  9 files changed, 125 insertions(+), 469 deletions(-)
>  delete mode 100644 arch/x86/purgatory/stack.S
>  create mode 100644 include/asm-generic/purgatory.lds

Eric

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/9] kexec x86 purgatory cleanup
  2024-04-24 20:04 ` [RFC PATCH 0/9] kexec x86 purgatory cleanup Eric W. Biederman
@ 2024-04-24 20:52   ` Ard Biesheuvel
  0 siblings, 0 replies; 19+ messages in thread
From: Ard Biesheuvel @ 2024-04-24 20:52 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, x86, Arnd Bergmann, kexec, Nathan Chancellor,
	Nick Desaulniers, Kees Cook, Bill Wendling, Justin Stitt,
	Masahiro Yamada

On Wed, 24 Apr 2024 at 22:04, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Ard Biesheuvel <ardb+git@google.com> writes:
>
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > The kexec purgatory is built like a kernel module, i.e., a partially
> > linked ELF object where each section is allocated and placed
> > individually, and all relocations need to be fixed up, even place
> > relative ones.
> >
> > This makes sense for kernel modules, which share the address space with
> > the core kernel, and contain unresolved references that need to be wired
> > up to symbols in other modules or the kernel itself.
> >
> > The purgatory, however, is a fully linked binary without any external
> > references, or any overlap with the kernel's virtual address space. So
> > it makes much more sense to create a fully linked ELF executable that
> > can just be loaded and run anywhere in memory.
>
> It does have external references that are resolved when it is loaded.
>

It doesn't today, and it hasn't for a while, at least since commit

e4160b2e4b02377c67f8ecd05786811598f39acd
x86/purgatory: Fail the build if purgatory.ro has missing symbols

which forces a build failure on unresolved external references, by
doing a full link of the purgatory.

> Further it is at least my impression that non-PIC code is more
> efficient.  PIC typically requires silly things like Global Offset
> Tables that non-PIC code does not.  At first glance this looks like a
> code passivization.
>

Given that the 64-bit purgatory can be loaded in memory that is not
32-bit addressable, the PIC code is essentially a given, since the
large code model is much worse (it uses 64-bit immediate for all
function and variable symbols, and therefore always uses indirect
calls)

Please refer to

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/build&id=cba786af84a0f9716204e09f518ce3b7ada8555e

for more details. (Getting pulled into that discussion is how I ended
up looking into the purgatory in more detail)

> Now at lot of functionality has been stripped out of purgatory so maybe
> in it's stripped down this make sense, but I want to challenge the
> notion that this is the obvious thing to do.
>

The diffstat speaks for itself - on x86, much of the allocation and
relocation logic can simply be dropped when building the purgatory in
this manner.

> > The purgatory build on x86 has already switched over to position
> > independent codegen, which only leaves a handful of absolute references,
> > which can either be dropped (patch #3) or converted into a RIP-relative
> > one (patch #4). That leaves a purgatory executable that can run at any
> > offset in memory with applying any relocations whatsoever.
>
> I missed that conversation.  Do you happen to have a pointer?  I would
> think the 32bit code is where the PIC would be most costly as the 32bit
> x86 instruction set predates PIC being a common compilation target.
>

See link above. Note that this none of this is about 32-bit code - the
purgatory as it exists today never drops out of long mode (and no
32-bit version appears to exist)

> > Some tweaks are needed to deal with the difference between partially
> > (ET_REL) and fully (ET_DYN/ET_EXEC) linked ELF objects, but with those
> > in place, a substantial amount of complicated ELF allocation, placement
> > and patching/relocation code can simply be dropped.
>
> Really?  As I recall it only needed to handle a single allocation type,
> and there were good reasons (at least when I wrote it) to patch symbols.
>
> Again maybe the fact that people have removed 90% of the functionality
> makes this make sense, but that is not obvious at first glance.
>

Again, the patches and the diffstat speak for themselves - the linker
applies all the relocations at build time, and emits all the sections
into a single ELF segment that can be copied into memory and executed
directly (modulo poking values into the global variables for the
sha256 digest and the segment list)

The last patch in the series shows which code we could drop from the
generic kexec_file_load() implementation once other architectures
adopt this scheme.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 2/9] x86/purgatory: Simplify stack handling
  2024-04-24 18:26   ` Nathan Chancellor
@ 2024-04-26 21:32     ` Justin Stitt
  2024-04-26 21:53       ` Nathan Chancellor
  0 siblings, 1 reply; 19+ messages in thread
From: Justin Stitt @ 2024-04-26 21:32 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Ard Biesheuvel, linux-kernel, x86, Ard Biesheuvel, Arnd Bergmann,
	Eric Biederman, kexec, Nick Desaulniers, Kees Cook,
	Bill Wendling, Masahiro Yamada

Hi,

On Wed, Apr 24, 2024 at 11:26:59AM -0700, Nathan Chancellor wrote:
> On Wed, Apr 24, 2024 at 05:53:12PM +0200, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> > 
> > The x86 purgatory, which does little more than verify a SHA-256 hash of
> > the loaded segments, currently uses three different stacks:
> > - one in .bss that is used to call the purgatory C code
> > - one in .rodata that is only used to switch to an updated code segment
> >   descriptor in the GDT
> > - one in .data, which allows it to be prepopulated from the kexec loader
> >   in theory, but this is not actually being taken advantage of.
> > 
> > Simplify this, by dropping the latter two stacks, as well as the loader
> > logic that programs RSP.
> > 
> > Both the stacks in .bss and .data are 4k aligned, but 16 byte alignment
> > is more than sufficient.
> > 
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  arch/x86/include/asm/kexec.h      |  1 -
> >  arch/x86/kernel/kexec-bzimage64.c |  8 --------
> >  arch/x86/purgatory/entry64.S      |  8 --------
> >  arch/x86/purgatory/setup-x86_64.S |  2 +-
> >  arch/x86/purgatory/stack.S        | 18 ------------------
> 
> This needs a small fix up to build.
> 
>   make[6]: *** No rule to make target 'arch/x86/purgatory/stack.o', needed by 'arch/x86/purgatory/purgatory.ro'.
> 

I was trying to reproduce this build failure, but to no avail. I am
curious what your build target / build command was.

It is clear that stack.S has been removed so your change makes sense, I
don't doubt that -- I just cannot get that specific error message you
encountered (what is a .ro file supposed to be, anyway?).

> diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
> index acc09799af2a..2b6b2fb033d6 100644
> --- a/arch/x86/purgatory/Makefile
> +++ b/arch/x86/purgatory/Makefile
> @@ -1,7 +1,7 @@
>  # SPDX-License-Identifier: GPL-2.0
>  OBJECT_FILES_NON_STANDARD := y
>  
> -purgatory-y := purgatory.o stack.o setup-x86_$(BITS).o sha256.o entry64.o string.o
> +purgatory-y := purgatory.o setup-x86_$(BITS).o sha256.o entry64.o string.o
>  
>  targets += $(purgatory-y)
>  PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))

Thanks
Justin

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 2/9] x86/purgatory: Simplify stack handling
  2024-04-26 21:32     ` Justin Stitt
@ 2024-04-26 21:53       ` Nathan Chancellor
  2024-04-26 22:01         ` Justin Stitt
  0 siblings, 1 reply; 19+ messages in thread
From: Nathan Chancellor @ 2024-04-26 21:53 UTC (permalink / raw)
  To: Justin Stitt
  Cc: Ard Biesheuvel, linux-kernel, x86, Ard Biesheuvel, Arnd Bergmann,
	Eric Biederman, kexec, Nick Desaulniers, Kees Cook,
	Bill Wendling, Masahiro Yamada

On Fri, Apr 26, 2024 at 09:32:52PM +0000, Justin Stitt wrote:
> Hi,
> 
> On Wed, Apr 24, 2024 at 11:26:59AM -0700, Nathan Chancellor wrote:
> > On Wed, Apr 24, 2024 at 05:53:12PM +0200, Ard Biesheuvel wrote:
> > >  arch/x86/purgatory/stack.S        | 18 ------------------
> > 
> > This needs a small fix up to build.
> > 
> >   make[6]: *** No rule to make target 'arch/x86/purgatory/stack.o', needed by 'arch/x86/purgatory/purgatory.ro'.
> > 
> 
> I was trying to reproduce this build failure, but to no avail. I am
> curious what your build target / build command was.
> 
> It is clear that stack.S has been removed so your change makes sense, I
> don't doubt that -- I just cannot get that specific error message you

Odd, I was using my distribution configuration for the test but it is
easily reproducible with allmodconfig:

  $ make -skj"$(nproc)" ARCH=x86_64 LLVM=1 allmodconfig arch/x86/purgatory/
  make[6]: *** No rule to make target 'arch/x86/purgatory/stack.o', needed by 'arch/x86/purgatory/purgatory.ro'.
  ...

> encountered (what is a .ro file supposed to be, anyway?).

Read only? Relocatable object? *shrug*

Cheers,
Nathan

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 2/9] x86/purgatory: Simplify stack handling
  2024-04-26 21:53       ` Nathan Chancellor
@ 2024-04-26 22:01         ` Justin Stitt
  0 siblings, 0 replies; 19+ messages in thread
From: Justin Stitt @ 2024-04-26 22:01 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Ard Biesheuvel, linux-kernel, x86, Ard Biesheuvel, Arnd Bergmann,
	Eric Biederman, kexec, Nick Desaulniers, Kees Cook,
	Bill Wendling, Masahiro Yamada

On Fri, Apr 26, 2024 at 2:53 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Fri, Apr 26, 2024 at 09:32:52PM +0000, Justin Stitt wrote:
> > Hi,
> >
> > On Wed, Apr 24, 2024 at 11:26:59AM -0700, Nathan Chancellor wrote:
> > > On Wed, Apr 24, 2024 at 05:53:12PM +0200, Ard Biesheuvel wrote:
> > > >  arch/x86/purgatory/stack.S        | 18 ------------------
> > >
> > > This needs a small fix up to build.
> > >
> > >   make[6]: *** No rule to make target 'arch/x86/purgatory/stack.o', needed by 'arch/x86/purgatory/purgatory.ro'.
> > >
> >
> > I was trying to reproduce this build failure, but to no avail. I am
> > curious what your build target / build command was.
> >
> > It is clear that stack.S has been removed so your change makes sense, I
> > don't doubt that -- I just cannot get that specific error message you
>
> Odd, I was using my distribution configuration for the test but it is
> easily reproducible with allmodconfig:
>
>   $ make -skj"$(nproc)" ARCH=x86_64 LLVM=1 allmodconfig arch/x86/purgatory/
>   make[6]: *** No rule to make target 'arch/x86/purgatory/stack.o', needed by 'arch/x86/purgatory/purgatory.ro'.
>   ...

Agh, I was just doing a defconfig followed by a menuconfig to manually
enable all the kexec and purgatory stuff. I wonder which one I missed.

allyes/allmodconfig is what I needed here :thumbs_up:

>
> > encountered (what is a .ro file supposed to be, anyway?).
>
> Read only? Relocatable object? *shrug*
>
> Cheers,
> Nathan

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2024-04-26 22:02 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-24 15:53 [RFC PATCH 0/9] kexec x86 purgatory cleanup Ard Biesheuvel
2024-04-24 15:53 ` [RFC PATCH 1/9] x86/purgatory: Drop function entry padding from purgatory Ard Biesheuvel
2024-04-24 15:53 ` [RFC PATCH 2/9] x86/purgatory: Simplify stack handling Ard Biesheuvel
2024-04-24 18:26   ` Nathan Chancellor
2024-04-26 21:32     ` Justin Stitt
2024-04-26 21:53       ` Nathan Chancellor
2024-04-26 22:01         ` Justin Stitt
2024-04-24 15:53 ` [RFC PATCH 3/9] x86/purgatory: Drop pointless GDT switch Ard Biesheuvel
2024-04-24 15:53 ` [RFC PATCH 4/9] x86/purgatory: Avoid absolute reference to GDT Ard Biesheuvel
2024-04-24 17:38   ` Brian Gerst
2024-04-24 17:53     ` Ard Biesheuvel
2024-04-24 19:00       ` Brian Gerst
2024-04-24 15:53 ` [RFC PATCH 5/9] x86/purgatory: Simplify GDT and drop data segment Ard Biesheuvel
2024-04-24 15:53 ` [RFC PATCH 6/9] kexec: Add support for fully linked purgatory executables Ard Biesheuvel
2024-04-24 15:53 ` [RFC PATCH 7/9] x86/purgatory: Use fully linked PIE ELF executable Ard Biesheuvel
2024-04-24 15:53 ` [RFC PATCH 8/9] x86/purgatory: Simplify references to regs array Ard Biesheuvel
2024-04-24 15:53 ` [RFC PATCH 9/9] kexec: Drop support for partially linked purgatory executables Ard Biesheuvel
2024-04-24 20:04 ` [RFC PATCH 0/9] kexec x86 purgatory cleanup Eric W. Biederman
2024-04-24 20:52   ` Ard Biesheuvel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).