linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address
@ 2019-09-19 16:05 Kairui Song
  2019-09-25  9:51 ` Jarkko Sakkinen
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Kairui Song @ 2019-09-19 16:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Matthew Garrett, Jarkko Sakkinen, Baoquan He, Dave Young, x86,
	linux-efi, Kairui Song

Currently, kernel fails to boot on some HyperV VMs when using EFI.
And it's a potential issue on all platforms.

It's caused a broken kernel relocation on EFI systems, when below three
conditions are met:

1. Kernel image is not loaded to the default address (LOAD_PHYSICAL_ADDR)
   by the loader.
2. There isn't enough room to contain the kernel, starting from the
   default load address (eg. something else occupied part the region).
3. In the memmap provided by EFI firmware, there is a memory region
   starts below LOAD_PHYSICAL_ADDR, and suitable for containing the
   kernel.

Efi stub will perform a kernel relocation when condition 1 is met. But
due to condition 2, efi stub can't relocate kernel to the preferred
address, so it fallback to query and alloc from EFI firmware for lowest
usable memory region.

It's incorrect to use the lowest memory address. In later stage, kernel
will assume LOAD_PHYSICAL_ADDR as the minimal acceptable relocate address,
but efi stub will end up relocating kernel below it.

Then before the kernel decompressing. Kernel will do another relocation
to address not lower than LOAD_PHYSICAL_ADDR, this time the relocate will
over write the blockage at the default load address, which efi stub tried
to avoid, and lead to unexpected behavior. Beside, the memory region it
writes to is not allocated from EFI firmware, which is also wrong.

To fix it, just don't let efi stub relocate the kernel to any address
lower than lowest acceptable address.

Signed-off-by: Kairui Song <kasong@redhat.com>

---

Update from V1:
 - Redo the commit message.

 arch/x86/boot/compressed/eboot.c               |  8 +++++---
 drivers/firmware/efi/libstub/arm32-stub.c      |  2 +-
 drivers/firmware/efi/libstub/arm64-stub.c      |  2 +-
 drivers/firmware/efi/libstub/efi-stub-helper.c | 12 ++++++++----
 include/linux/efi.h                            |  5 +++--
 5 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index 936bdb924ec2..8207e8aa297e 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -13,6 +13,7 @@
 #include <asm/e820/types.h>
 #include <asm/setup.h>
 #include <asm/desc.h>
+#include <asm/boot.h>
 
 #include "../string.h"
 #include "eboot.h"
@@ -432,7 +433,7 @@ struct boot_params *make_boot_params(struct efi_config *c)
 	}
 
 	status = efi_low_alloc(sys_table, 0x4000, 1,
-			       (unsigned long *)&boot_params);
+			       (unsigned long *)&boot_params, 0);
 	if (status != EFI_SUCCESS) {
 		efi_printk(sys_table, "Failed to allocate lowmem for boot params\n");
 		return NULL;
@@ -817,7 +818,7 @@ efi_main(struct efi_config *c, struct boot_params *boot_params)
 
 	gdt->size = 0x800;
 	status = efi_low_alloc(sys_table, gdt->size, 8,
-			   (unsigned long *)&gdt->address);
+			       (unsigned long *)&gdt->address, 0);
 	if (status != EFI_SUCCESS) {
 		efi_printk(sys_table, "Failed to allocate memory for 'gdt'\n");
 		goto fail;
@@ -842,7 +843,8 @@ efi_main(struct efi_config *c, struct boot_params *boot_params)
 		status = efi_relocate_kernel(sys_table, &bzimage_addr,
 					     hdr->init_size, hdr->init_size,
 					     hdr->pref_address,
-					     hdr->kernel_alignment);
+					     hdr->kernel_alignment,
+					     LOAD_PHYSICAL_ADDR);
 		if (status != EFI_SUCCESS) {
 			efi_printk(sys_table, "efi_relocate_kernel() failed!\n");
 			goto fail;
diff --git a/drivers/firmware/efi/libstub/arm32-stub.c b/drivers/firmware/efi/libstub/arm32-stub.c
index e8f7aefb6813..bf6f954d6afe 100644
--- a/drivers/firmware/efi/libstub/arm32-stub.c
+++ b/drivers/firmware/efi/libstub/arm32-stub.c
@@ -220,7 +220,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table,
 	*image_size = image->image_size;
 	status = efi_relocate_kernel(sys_table, image_addr, *image_size,
 				     *image_size,
-				     dram_base + MAX_UNCOMP_KERNEL_SIZE, 0);
+				     dram_base + MAX_UNCOMP_KERNEL_SIZE, 0, 0);
 	if (status != EFI_SUCCESS) {
 		pr_efi_err(sys_table, "Failed to relocate kernel.\n");
 		efi_free(sys_table, *reserve_size, *reserve_addr);
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 1550d244e996..3d2e517e10f4 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -140,7 +140,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table_arg,
 	if (status != EFI_SUCCESS) {
 		*reserve_size = kernel_memsize + TEXT_OFFSET;
 		status = efi_low_alloc(sys_table_arg, *reserve_size,
-				       MIN_KIMG_ALIGN, reserve_addr);
+				       MIN_KIMG_ALIGN, reserve_addr, 0);
 
 		if (status != EFI_SUCCESS) {
 			pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
index 3caae7f2cf56..00b00a2562aa 100644
--- a/drivers/firmware/efi/libstub/efi-stub-helper.c
+++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
@@ -260,11 +260,11 @@ efi_status_t efi_high_alloc(efi_system_table_t *sys_table_arg,
 }
 
 /*
- * Allocate at the lowest possible address.
+ * Allocate at the lowest possible address that is not below 'min'.
  */
 efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
 			   unsigned long size, unsigned long align,
-			   unsigned long *addr)
+			   unsigned long *addr, unsigned long min)
 {
 	unsigned long map_size, desc_size, buff_size;
 	efi_memory_desc_t *map;
@@ -311,6 +311,9 @@ efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
 		start = desc->phys_addr;
 		end = start + desc->num_pages * EFI_PAGE_SIZE;
 
+		if (start < min)
+			start = min;
+
 		/*
 		 * Don't allocate at 0x0. It will confuse code that
 		 * checks pointers against NULL. Skip the first 8
@@ -698,7 +701,8 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
 				 unsigned long image_size,
 				 unsigned long alloc_size,
 				 unsigned long preferred_addr,
-				 unsigned long alignment)
+				 unsigned long alignment,
+				 unsigned long min_addr)
 {
 	unsigned long cur_image_addr;
 	unsigned long new_addr = 0;
@@ -732,7 +736,7 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
 	 */
 	if (status != EFI_SUCCESS) {
 		status = efi_low_alloc(sys_table_arg, alloc_size, alignment,
-				       &new_addr);
+				       &new_addr, min_addr);
 	}
 	if (status != EFI_SUCCESS) {
 		pr_efi_err(sys_table_arg, "Failed to allocate usable memory for kernel.\n");
diff --git a/include/linux/efi.h b/include/linux/efi.h
index f87fabea4a85..cc947c0f3e06 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -1587,7 +1587,7 @@ efi_status_t efi_get_memory_map(efi_system_table_t *sys_table_arg,
 
 efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
 			   unsigned long size, unsigned long align,
-			   unsigned long *addr);
+			   unsigned long *addr, unsigned long min);
 
 efi_status_t efi_high_alloc(efi_system_table_t *sys_table_arg,
 			    unsigned long size, unsigned long align,
@@ -1598,7 +1598,8 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
 				 unsigned long image_size,
 				 unsigned long alloc_size,
 				 unsigned long preferred_addr,
-				 unsigned long alignment);
+				 unsigned long alignment,
+				 unsigned long min_addr);
 
 efi_status_t handle_cmdline_files(efi_system_table_t *sys_table_arg,
 				  efi_loaded_image_t *image,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address
  2019-09-19 16:05 [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address Kairui Song
@ 2019-09-25  9:51 ` Jarkko Sakkinen
  2019-09-25  9:55 ` Baoquan He
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Jarkko Sakkinen @ 2019-09-25  9:51 UTC (permalink / raw)
  To: Kairui Song
  Cc: linux-kernel, Ard Biesheuvel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Matthew Garrett, Baoquan He, Dave Young, x86,
	linux-efi

On Fri, Sep 20, 2019 at 12:05:21AM +0800, Kairui Song wrote:
> Currently, kernel fails to boot on some HyperV VMs when using EFI.
> And it's a potential issue on all platforms.
> 
> It's caused a broken kernel relocation on EFI systems, when below three
> conditions are met:
> 
> 1. Kernel image is not loaded to the default address (LOAD_PHYSICAL_ADDR)
>    by the loader.
> 2. There isn't enough room to contain the kernel, starting from the
>    default load address (eg. something else occupied part the region).
> 3. In the memmap provided by EFI firmware, there is a memory region
>    starts below LOAD_PHYSICAL_ADDR, and suitable for containing the
>    kernel.
> 
> Efi stub will perform a kernel relocation when condition 1 is met. But
> due to condition 2, efi stub can't relocate kernel to the preferred
> address, so it fallback to query and alloc from EFI firmware for lowest
> usable memory region.
> 
> It's incorrect to use the lowest memory address. In later stage, kernel
> will assume LOAD_PHYSICAL_ADDR as the minimal acceptable relocate address,
> but efi stub will end up relocating kernel below it.
> 
> Then before the kernel decompressing. Kernel will do another relocation
> to address not lower than LOAD_PHYSICAL_ADDR, this time the relocate will
> over write the blockage at the default load address, which efi stub tried
> to avoid, and lead to unexpected behavior. Beside, the memory region it
> writes to is not allocated from EFI firmware, which is also wrong.
> 
> To fix it, just don't let efi stub relocate the kernel to any address
> lower than lowest acceptable address.
> 
> Signed-off-by: Kairui Song <kasong@redhat.com>

Acked-by:  <jarkko.sakkinen@linux.intel.com>

/Jarkko

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address
  2019-09-19 16:05 [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address Kairui Song
  2019-09-25  9:51 ` Jarkko Sakkinen
@ 2019-09-25  9:55 ` Baoquan He
  2019-09-25 17:35   ` Kairui Song
  2019-09-25 15:25 ` Ard Biesheuvel
  2019-10-11 13:23 ` Borislav Petkov
  3 siblings, 1 reply; 9+ messages in thread
From: Baoquan He @ 2019-09-25  9:55 UTC (permalink / raw)
  To: Kairui Song
  Cc: linux-kernel, Ard Biesheuvel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Matthew Garrett, Jarkko Sakkinen, Dave Young,
	x86

On 09/20/19 at 12:05am, Kairui Song wrote:
> Currently, kernel fails to boot on some HyperV VMs when using EFI.
> And it's a potential issue on all platforms.
> 
> It's caused a broken kernel relocation on EFI systems, when below three
> conditions are met:
> 
> 1. Kernel image is not loaded to the default address (LOAD_PHYSICAL_ADDR)
>    by the loader.
> 2. There isn't enough room to contain the kernel, starting from the
>    default load address (eg. something else occupied part the region).
> 3. In the memmap provided by EFI firmware, there is a memory region
>    starts below LOAD_PHYSICAL_ADDR, and suitable for containing the
>    kernel.

Thanks for the effort, Kairui.

Let me summarize what I got from this issue, please correct me if
anything missed:

***
Problem:
This bug is reported on Hyper-V platform. The kernel will reset to
firmware w/o any console printing in 1st kernel and kdump kernel
sometime.

***
Root cause:
With debugging, the resetting to firmware is triggered when execute
'rep     movsq' line of /boot/compressed/head_64.S. The reason is that
efi boot stub may put kernel image below 16M, then later head_64.S will
relocate kernel to 16M directly. That relocation will conflict with some
efi reserved region, then cause the resetting.

A more detail process based on the problem occurred on that HyperV
machine:

- kernel (INIT_SIZE: 56820K) got loaded at 0x3c881000 (not aligned,
  and not equal to pref_address 0x1000000), need to relocate.

- efi_relocate_kernel is called, try to allocate INIT_SIZE of memory
  at pref_address, failed, something else occupied this region.

- efi_relocate_kernel call efi_low_alloc as fallback, and got the address
  0x800000 (Below 0x1000000)

- Later in arch/x86/boot/compressed/head_64.S:108, LOAD_PHYSICAL_ADDR is
  force used as the new load address as the current address is lower than
  that. Then kernel try relocate to 0x1000000.

- However the memory starting from 0x1000000 is not allocated from EFI
  firmware, writing to this region caused the system to reset.

***
Solution:
Alwasys search area above LOAD_PHYSICAL_ADDR, namely 16M to put kernel
image in /boot/compressed/eboot.c. Then efi boot stub in eboot.c will
search an suitable area in efi memmap, to make sure no any reserved
region will conflict with the target area of kernel image. Besides,
kernel won't be relocated in /boot/compressed/head_64.S since it has
been above 16M. 

#ifdef CONFIG_RELOCATABLE
        leaq    startup_32(%rip) /* - $startup_32 */, %rbp
        movl    BP_kernel_alignment(%rsi), %eax
        decl    %eax
        addq    %rax, %rbp
        notq    %rax
        andq    %rax, %rbp
        cmpq    $LOAD_PHYSICAL_ADDR, %rbp
        jge     1f
#endif
        movq    $LOAD_PHYSICAL_ADDR, %rbp
1:

        /* Target address to relocate to for decompression */
        movl    BP_init_size(%rsi), %ebx
        subl    $_end, %ebx
        addq    %rbp, %rbx


***
I have one concerns about this patch:

Why this only happen in Hyper-V platform. Qemu/kvm, baremetal, vmware
ESI don't have this issue? What's the difference?


By the way, I personally like this way better. Because it is fixing a
potention issue. Efi boot stub code may put kernel below 16M, but the
relocation code in boot/compressed/head_64.S doesn't consider the
possible conflict, and head_64.S have no way to know the efi memmap
information. If this patch can't be accepted, woring around it in
Hyper-V may be a way.

Thanks
Baoquan

> 
> Efi stub will perform a kernel relocation when condition 1 is met. But
> due to condition 2, efi stub can't relocate kernel to the preferred
> address, so it fallback to query and alloc from EFI firmware for lowest
> usable memory region.
> 
> It's incorrect to use the lowest memory address. In later stage, kernel
> will assume LOAD_PHYSICAL_ADDR as the minimal acceptable relocate address,
> but efi stub will end up relocating kernel below it.
> 
> Then before the kernel decompressing. Kernel will do another relocation
> to address not lower than LOAD_PHYSICAL_ADDR, this time the relocate will
> over write the blockage at the default load address, which efi stub tried
> to avoid, and lead to unexpected behavior. Beside, the memory region it
> writes to is not allocated from EFI firmware, which is also wrong.
> 
> To fix it, just don't let efi stub relocate the kernel to any address
> lower than lowest acceptable address.
> 
> Signed-off-by: Kairui Song <kasong@redhat.com>
> 
> ---
> 
> Update from V1:
>  - Redo the commit message.
> 
>  arch/x86/boot/compressed/eboot.c               |  8 +++++---
>  drivers/firmware/efi/libstub/arm32-stub.c      |  2 +-
>  drivers/firmware/efi/libstub/arm64-stub.c      |  2 +-
>  drivers/firmware/efi/libstub/efi-stub-helper.c | 12 ++++++++----
>  include/linux/efi.h                            |  5 +++--
>  5 files changed, 18 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
> index 936bdb924ec2..8207e8aa297e 100644
> --- a/arch/x86/boot/compressed/eboot.c
> +++ b/arch/x86/boot/compressed/eboot.c
> @@ -13,6 +13,7 @@
>  #include <asm/e820/types.h>
>  #include <asm/setup.h>
>  #include <asm/desc.h>
> +#include <asm/boot.h>
>  
>  #include "../string.h"
>  #include "eboot.h"
> @@ -432,7 +433,7 @@ struct boot_params *make_boot_params(struct efi_config *c)
>  	}
>  
>  	status = efi_low_alloc(sys_table, 0x4000, 1,
> -			       (unsigned long *)&boot_params);
> +			       (unsigned long *)&boot_params, 0);
>  	if (status != EFI_SUCCESS) {
>  		efi_printk(sys_table, "Failed to allocate lowmem for boot params\n");
>  		return NULL;
> @@ -817,7 +818,7 @@ efi_main(struct efi_config *c, struct boot_params *boot_params)
>  
>  	gdt->size = 0x800;
>  	status = efi_low_alloc(sys_table, gdt->size, 8,
> -			   (unsigned long *)&gdt->address);
> +			       (unsigned long *)&gdt->address, 0);
>  	if (status != EFI_SUCCESS) {
>  		efi_printk(sys_table, "Failed to allocate memory for 'gdt'\n");
>  		goto fail;
> @@ -842,7 +843,8 @@ efi_main(struct efi_config *c, struct boot_params *boot_params)
>  		status = efi_relocate_kernel(sys_table, &bzimage_addr,
>  					     hdr->init_size, hdr->init_size,
>  					     hdr->pref_address,
> -					     hdr->kernel_alignment);
> +					     hdr->kernel_alignment,
> +					     LOAD_PHYSICAL_ADDR);
>  		if (status != EFI_SUCCESS) {
>  			efi_printk(sys_table, "efi_relocate_kernel() failed!\n");
>  			goto fail;
> diff --git a/drivers/firmware/efi/libstub/arm32-stub.c b/drivers/firmware/efi/libstub/arm32-stub.c
> index e8f7aefb6813..bf6f954d6afe 100644
> --- a/drivers/firmware/efi/libstub/arm32-stub.c
> +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> @@ -220,7 +220,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table,
>  	*image_size = image->image_size;
>  	status = efi_relocate_kernel(sys_table, image_addr, *image_size,
>  				     *image_size,
> -				     dram_base + MAX_UNCOMP_KERNEL_SIZE, 0);
> +				     dram_base + MAX_UNCOMP_KERNEL_SIZE, 0, 0);
>  	if (status != EFI_SUCCESS) {
>  		pr_efi_err(sys_table, "Failed to relocate kernel.\n");
>  		efi_free(sys_table, *reserve_size, *reserve_addr);
> diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
> index 1550d244e996..3d2e517e10f4 100644
> --- a/drivers/firmware/efi/libstub/arm64-stub.c
> +++ b/drivers/firmware/efi/libstub/arm64-stub.c
> @@ -140,7 +140,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table_arg,
>  	if (status != EFI_SUCCESS) {
>  		*reserve_size = kernel_memsize + TEXT_OFFSET;
>  		status = efi_low_alloc(sys_table_arg, *reserve_size,
> -				       MIN_KIMG_ALIGN, reserve_addr);
> +				       MIN_KIMG_ALIGN, reserve_addr, 0);
>  
>  		if (status != EFI_SUCCESS) {
>  			pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
> diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
> index 3caae7f2cf56..00b00a2562aa 100644
> --- a/drivers/firmware/efi/libstub/efi-stub-helper.c
> +++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
> @@ -260,11 +260,11 @@ efi_status_t efi_high_alloc(efi_system_table_t *sys_table_arg,
>  }
>  
>  /*
> - * Allocate at the lowest possible address.
> + * Allocate at the lowest possible address that is not below 'min'.
>   */
>  efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
>  			   unsigned long size, unsigned long align,
> -			   unsigned long *addr)
> +			   unsigned long *addr, unsigned long min)
>  {
>  	unsigned long map_size, desc_size, buff_size;
>  	efi_memory_desc_t *map;
> @@ -311,6 +311,9 @@ efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
>  		start = desc->phys_addr;
>  		end = start + desc->num_pages * EFI_PAGE_SIZE;
>  
> +		if (start < min)
> +			start = min;
> +
>  		/*
>  		 * Don't allocate at 0x0. It will confuse code that
>  		 * checks pointers against NULL. Skip the first 8
> @@ -698,7 +701,8 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
>  				 unsigned long image_size,
>  				 unsigned long alloc_size,
>  				 unsigned long preferred_addr,
> -				 unsigned long alignment)
> +				 unsigned long alignment,
> +				 unsigned long min_addr)
>  {
>  	unsigned long cur_image_addr;
>  	unsigned long new_addr = 0;
> @@ -732,7 +736,7 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
>  	 */
>  	if (status != EFI_SUCCESS) {
>  		status = efi_low_alloc(sys_table_arg, alloc_size, alignment,
> -				       &new_addr);
> +				       &new_addr, min_addr);
>  	}
>  	if (status != EFI_SUCCESS) {
>  		pr_efi_err(sys_table_arg, "Failed to allocate usable memory for kernel.\n");
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index f87fabea4a85..cc947c0f3e06 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -1587,7 +1587,7 @@ efi_status_t efi_get_memory_map(efi_system_table_t *sys_table_arg,
>  
>  efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
>  			   unsigned long size, unsigned long align,
> -			   unsigned long *addr);
> +			   unsigned long *addr, unsigned long min);
>  
>  efi_status_t efi_high_alloc(efi_system_table_t *sys_table_arg,
>  			    unsigned long size, unsigned long align,
> @@ -1598,7 +1598,8 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
>  				 unsigned long image_size,
>  				 unsigned long alloc_size,
>  				 unsigned long preferred_addr,
> -				 unsigned long alignment);
> +				 unsigned long alignment,
> +				 unsigned long min_addr);
>  
>  efi_status_t handle_cmdline_files(efi_system_table_t *sys_table_arg,
>  				  efi_loaded_image_t *image,
> -- 
> 2.21.0
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address
  2019-09-19 16:05 [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address Kairui Song
  2019-09-25  9:51 ` Jarkko Sakkinen
  2019-09-25  9:55 ` Baoquan He
@ 2019-09-25 15:25 ` Ard Biesheuvel
  2019-09-25 17:36   ` Kairui Song
  2019-10-11 13:23 ` Borislav Petkov
  3 siblings, 1 reply; 9+ messages in thread
From: Ard Biesheuvel @ 2019-09-25 15:25 UTC (permalink / raw)
  To: Kairui Song
  Cc: Linux Kernel Mailing List, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Matthew Garrett, Jarkko Sakkinen, Baoquan He,
	Dave Young, the arch/x86 maintainers, linux-efi

On Thu, 19 Sep 2019 at 18:06, Kairui Song <kasong@redhat.com> wrote:
>
> Currently, kernel fails to boot on some HyperV VMs when using EFI.
> And it's a potential issue on all platforms.
>
> It's caused a broken kernel relocation on EFI systems, when below three
> conditions are met:
>
> 1. Kernel image is not loaded to the default address (LOAD_PHYSICAL_ADDR)
>    by the loader.
> 2. There isn't enough room to contain the kernel, starting from the
>    default load address (eg. something else occupied part the region).
> 3. In the memmap provided by EFI firmware, there is a memory region
>    starts below LOAD_PHYSICAL_ADDR, and suitable for containing the
>    kernel.
>
> Efi stub will perform a kernel relocation when condition 1 is met. But
> due to condition 2, efi stub can't relocate kernel to the preferred
> address, so it fallback to query and alloc from EFI firmware for lowest
> usable memory region.
>
> It's incorrect to use the lowest memory address. In later stage, kernel
> will assume LOAD_PHYSICAL_ADDR as the minimal acceptable relocate address,
> but efi stub will end up relocating kernel below it.
>
> Then before the kernel decompressing. Kernel will do another relocation
> to address not lower than LOAD_PHYSICAL_ADDR, this time the relocate will
> over write the blockage at the default load address, which efi stub tried
> to avoid, and lead to unexpected behavior. Beside, the memory region it
> writes to is not allocated from EFI firmware, which is also wrong.
>
> To fix it, just don't let efi stub relocate the kernel to any address
> lower than lowest acceptable address.
>
> Signed-off-by: Kairui Song <kasong@redhat.com>
>

Hello Kairui,

This patch looks correct to me, but it needs an ack from the x86
maintainers, since the rules around LOAD_PHYSICAL_ADDR are specific to
the x86 architecture.


> ---
>
> Update from V1:
>  - Redo the commit message.
>
>  arch/x86/boot/compressed/eboot.c               |  8 +++++---
>  drivers/firmware/efi/libstub/arm32-stub.c      |  2 +-
>  drivers/firmware/efi/libstub/arm64-stub.c      |  2 +-
>  drivers/firmware/efi/libstub/efi-stub-helper.c | 12 ++++++++----
>  include/linux/efi.h                            |  5 +++--
>  5 files changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
> index 936bdb924ec2..8207e8aa297e 100644
> --- a/arch/x86/boot/compressed/eboot.c
> +++ b/arch/x86/boot/compressed/eboot.c
> @@ -13,6 +13,7 @@
>  #include <asm/e820/types.h>
>  #include <asm/setup.h>
>  #include <asm/desc.h>
> +#include <asm/boot.h>
>
>  #include "../string.h"
>  #include "eboot.h"
> @@ -432,7 +433,7 @@ struct boot_params *make_boot_params(struct efi_config *c)
>         }
>
>         status = efi_low_alloc(sys_table, 0x4000, 1,
> -                              (unsigned long *)&boot_params);
> +                              (unsigned long *)&boot_params, 0);
>         if (status != EFI_SUCCESS) {
>                 efi_printk(sys_table, "Failed to allocate lowmem for boot params\n");
>                 return NULL;
> @@ -817,7 +818,7 @@ efi_main(struct efi_config *c, struct boot_params *boot_params)
>
>         gdt->size = 0x800;
>         status = efi_low_alloc(sys_table, gdt->size, 8,
> -                          (unsigned long *)&gdt->address);
> +                              (unsigned long *)&gdt->address, 0);
>         if (status != EFI_SUCCESS) {
>                 efi_printk(sys_table, "Failed to allocate memory for 'gdt'\n");
>                 goto fail;
> @@ -842,7 +843,8 @@ efi_main(struct efi_config *c, struct boot_params *boot_params)
>                 status = efi_relocate_kernel(sys_table, &bzimage_addr,
>                                              hdr->init_size, hdr->init_size,
>                                              hdr->pref_address,
> -                                            hdr->kernel_alignment);
> +                                            hdr->kernel_alignment,
> +                                            LOAD_PHYSICAL_ADDR);
>                 if (status != EFI_SUCCESS) {
>                         efi_printk(sys_table, "efi_relocate_kernel() failed!\n");
>                         goto fail;
> diff --git a/drivers/firmware/efi/libstub/arm32-stub.c b/drivers/firmware/efi/libstub/arm32-stub.c
> index e8f7aefb6813..bf6f954d6afe 100644
> --- a/drivers/firmware/efi/libstub/arm32-stub.c
> +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> @@ -220,7 +220,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table,
>         *image_size = image->image_size;
>         status = efi_relocate_kernel(sys_table, image_addr, *image_size,
>                                      *image_size,
> -                                    dram_base + MAX_UNCOMP_KERNEL_SIZE, 0);
> +                                    dram_base + MAX_UNCOMP_KERNEL_SIZE, 0, 0);
>         if (status != EFI_SUCCESS) {
>                 pr_efi_err(sys_table, "Failed to relocate kernel.\n");
>                 efi_free(sys_table, *reserve_size, *reserve_addr);
> diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
> index 1550d244e996..3d2e517e10f4 100644
> --- a/drivers/firmware/efi/libstub/arm64-stub.c
> +++ b/drivers/firmware/efi/libstub/arm64-stub.c
> @@ -140,7 +140,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table_arg,
>         if (status != EFI_SUCCESS) {
>                 *reserve_size = kernel_memsize + TEXT_OFFSET;
>                 status = efi_low_alloc(sys_table_arg, *reserve_size,
> -                                      MIN_KIMG_ALIGN, reserve_addr);
> +                                      MIN_KIMG_ALIGN, reserve_addr, 0);
>
>                 if (status != EFI_SUCCESS) {
>                         pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
> diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
> index 3caae7f2cf56..00b00a2562aa 100644
> --- a/drivers/firmware/efi/libstub/efi-stub-helper.c
> +++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
> @@ -260,11 +260,11 @@ efi_status_t efi_high_alloc(efi_system_table_t *sys_table_arg,
>  }
>
>  /*
> - * Allocate at the lowest possible address.
> + * Allocate at the lowest possible address that is not below 'min'.
>   */
>  efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
>                            unsigned long size, unsigned long align,
> -                          unsigned long *addr)
> +                          unsigned long *addr, unsigned long min)
>  {
>         unsigned long map_size, desc_size, buff_size;
>         efi_memory_desc_t *map;
> @@ -311,6 +311,9 @@ efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
>                 start = desc->phys_addr;
>                 end = start + desc->num_pages * EFI_PAGE_SIZE;
>
> +               if (start < min)
> +                       start = min;
> +
>                 /*
>                  * Don't allocate at 0x0. It will confuse code that
>                  * checks pointers against NULL. Skip the first 8
> @@ -698,7 +701,8 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
>                                  unsigned long image_size,
>                                  unsigned long alloc_size,
>                                  unsigned long preferred_addr,
> -                                unsigned long alignment)
> +                                unsigned long alignment,
> +                                unsigned long min_addr)
>  {
>         unsigned long cur_image_addr;
>         unsigned long new_addr = 0;
> @@ -732,7 +736,7 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
>          */
>         if (status != EFI_SUCCESS) {
>                 status = efi_low_alloc(sys_table_arg, alloc_size, alignment,
> -                                      &new_addr);
> +                                      &new_addr, min_addr);
>         }
>         if (status != EFI_SUCCESS) {
>                 pr_efi_err(sys_table_arg, "Failed to allocate usable memory for kernel.\n");
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index f87fabea4a85..cc947c0f3e06 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -1587,7 +1587,7 @@ efi_status_t efi_get_memory_map(efi_system_table_t *sys_table_arg,
>
>  efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
>                            unsigned long size, unsigned long align,
> -                          unsigned long *addr);
> +                          unsigned long *addr, unsigned long min);
>
>  efi_status_t efi_high_alloc(efi_system_table_t *sys_table_arg,
>                             unsigned long size, unsigned long align,
> @@ -1598,7 +1598,8 @@ efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
>                                  unsigned long image_size,
>                                  unsigned long alloc_size,
>                                  unsigned long preferred_addr,
> -                                unsigned long alignment);
> +                                unsigned long alignment,
> +                                unsigned long min_addr);
>
>  efi_status_t handle_cmdline_files(efi_system_table_t *sys_table_arg,
>                                   efi_loaded_image_t *image,
> --
> 2.21.0
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address
  2019-09-25  9:55 ` Baoquan He
@ 2019-09-25 17:35   ` Kairui Song
  0 siblings, 0 replies; 9+ messages in thread
From: Kairui Song @ 2019-09-25 17:35 UTC (permalink / raw)
  To: Baoquan He
  Cc: Linux Kernel Mailing List, Ard Biesheuvel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Matthew Garrett, Jarkko Sakkinen,
	Dave Young, the arch/x86 maintainers

On Wed, Sep 25, 2019 at 5:55 PM Baoquan He <bhe@redhat.com> wrote:
>
> On 09/20/19 at 12:05am, Kairui Song wrote:
> > Currently, kernel fails to boot on some HyperV VMs when using EFI.
> > And it's a potential issue on all platforms.
> >
> > It's caused a broken kernel relocation on EFI systems, when below three
> > conditions are met:
> >
> > 1. Kernel image is not loaded to the default address (LOAD_PHYSICAL_ADDR)
> >    by the loader.
> > 2. There isn't enough room to contain the kernel, starting from the
> >    default load address (eg. something else occupied part the region).
> > 3. In the memmap provided by EFI firmware, there is a memory region
> >    starts below LOAD_PHYSICAL_ADDR, and suitable for containing the
> >    kernel.
>
> Thanks for the effort, Kairui.
>
> Let me summarize what I got from this issue, please correct me if
> anything missed:
>
> ***
> Problem:
> This bug is reported on Hyper-V platform. The kernel will reset to
> firmware w/o any console printing in 1st kernel and kdump kernel
> sometime.
>
> ***
> Root cause:
> With debugging, the resetting to firmware is triggered when execute
> 'rep     movsq' line of /boot/compressed/head_64.S. The reason is that
> efi boot stub may put kernel image below 16M, then later head_64.S will
> relocate kernel to 16M directly. That relocation will conflict with some
> efi reserved region, then cause the resetting.
>
> A more detail process based on the problem occurred on that HyperV
> machine:
>
> - kernel (INIT_SIZE: 56820K) got loaded at 0x3c881000 (not aligned,
>   and not equal to pref_address 0x1000000), need to relocate.
>
> - efi_relocate_kernel is called, try to allocate INIT_SIZE of memory
>   at pref_address, failed, something else occupied this region.
>
> - efi_relocate_kernel call efi_low_alloc as fallback, and got the address
>   0x800000 (Below 0x1000000)
>
> - Later in arch/x86/boot/compressed/head_64.S:108, LOAD_PHYSICAL_ADDR is
>   force used as the new load address as the current address is lower than
>   that. Then kernel try relocate to 0x1000000.
>
> - However the memory starting from 0x1000000 is not allocated from EFI
>   firmware, writing to this region caused the system to reset.
>
> ***
> Solution:
> Alwasys search area above LOAD_PHYSICAL_ADDR, namely 16M to put kernel
> image in /boot/compressed/eboot.c. Then efi boot stub in eboot.c will
> search an suitable area in efi memmap, to make sure no any reserved
> region will conflict with the target area of kernel image. Besides,
> kernel won't be relocated in /boot/compressed/head_64.S since it has
> been above 16M.
>
> #ifdef CONFIG_RELOCATABLE
>         leaq    startup_32(%rip) /* - $startup_32 */, %rbp
>         movl    BP_kernel_alignment(%rsi), %eax
>         decl    %eax
>         addq    %rax, %rbp
>         notq    %rax
>         andq    %rax, %rbp
>         cmpq    $LOAD_PHYSICAL_ADDR, %rbp
>         jge     1f
> #endif
>         movq    $LOAD_PHYSICAL_ADDR, %rbp
> 1:
>
>         /* Target address to relocate to for decompression */
>         movl    BP_init_size(%rsi), %ebx
>         subl    $_end, %ebx
>         addq    %rbp, %rbx
>

Hi Baoquan,

Yes, it's all correct. Thanks for adding these details.

>
> ***
> I have one concerns about this patch:
>
> Why this only happen in Hyper-V platform. Qemu/kvm, baremetal, vmware
> ESI don't have this issue? What's the difference?

Let me post part the efi memmap on that machine (and btw the kernel
size is 55M):

kernel: efi: mem00: type=7, attr=0xf,
range=[0x0000000000000000-0x0000000000080000) (0MB)
kernel: efi: mem01: type=4, attr=0xf,
range=[0x0000000000080000-0x0000000000081000) (0MB)
kernel: efi: mem02: type=2, attr=0xf,
range=[0x0000000000081000-0x0000000000082000) (0MB)
kernel: efi: mem03: type=7, attr=0xf,
range=[0x0000000000082000-0x00000000000a0000) (0MB)
kernel: efi: mem04: type=4, attr=0xf,
range=[0x0000000000100000-0x000000000062a000) (5MB)
kernel: efi: mem05: type=7, attr=0xf,
range=[0x000000000062a000-0x0000000004200000) (59MB)
kernel: efi: mem06: type=4, attr=0xf,
range=[0x0000000004200000-0x0000000004400000) (2MB)
kernel: efi: mem07: type=7, attr=0xf,
range=[0x0000000004400000-0x00000000045c6000) (1MB)
kernel: efi: mem08: type=4, attr=0xf,
range=[0x00000000045c6000-0x00000000045e6000) (0MB)
kernel: efi: mem09: type=3, attr=0xf,
range=[0x00000000045e6000-0x000000000460b000) (0MB)
kernel: efi: mem10: type=4, attr=0xf,
range=[0x000000000460b000-0x0000000004613000) (0MB)
kernel: efi: mem11: type=3, attr=0xf,
range=[0x0000000004613000-0x000000000462b000) (0MB)
kernel: efi: mem12: type=7, attr=0xf,
range=[0x000000000462b000-0x0000000004800000) (1MB)
kernel: efi: mem13: type=2, attr=0xf,
range=[0x0000000004800000-0x0000000007f7d000) (55MB)
kernel: efi: mem14: type=7, attr=0xf,
range=[0x0000000007f7d000-0x0000000039a39000) (794MB)
kernel: efi: mem15: type=2, attr=0xf,
range=[0x0000000039a39000-0x0000000040000000) (101MB)
kernel: efi: mem16: type=7, attr=0xf,
range=[0x0000000040000000-0x000000004263d000) (38MB)
kernel: efi: mem17: type=2, attr=0xf,
range=[0x000000004263d000-0x000000007fff2000) (985MB)
kernel: efi: mem18: type=0, attr=0xf,
range=[0x000000007fff2000-0x000000007fff3000) (0MB)
kernel: efi: mem19: type=7, attr=0xf,
range=[0x000000007fff3000-0x00000000f6aaf000) (1898MB)
kernel: efi: mem20: type=2, attr=0xf,
range=[0x00000000f6aaf000-0x00000000f6ab0000) (0MB)
kernel: efi: mem21: type=1, attr=0xf,
range=[0x00000000f6ab0000-0x00000000f6bcd000) (1MB)
kernel: efi: mem22: type=2, attr=0xf,
range=[0x00000000f6bcd000-0x00000000f6cec000) (1MB)
kernel: efi: mem23: type=1, attr=0xf,
range=[0x00000000f6cec000-0x00000000f6dfb000) (1MB)
kernel: efi: mem24: type=6, attr=0x800000000000000f,
range=[0x00000000f6dfb000-0x00000000f6e06000) (0MB)
kernel: efi: mem25: type=9, attr=0xf,
range=[0x00000000f6e06000-0x00000000f6e07000) (0MB)
kernel: efi: mem26: type=3, attr=0xf,
range=[0x00000000f6e07000-0x00000000f6eea000) (0MB)
kernel: efi: mem27: type=9, attr=0xf,
range=[0x00000000f6eea000-0x00000000f6ef2000) (0MB)
kernel: efi: mem28: type=6, attr=0x800000000000000f,
range=[0x00000000f6ef2000-0x00000000f6f1b000) (0MB)
kernel: efi: mem29: type=7, attr=0xf,
range=[0x00000000f6f1b000-0x00000000f73c1000) (4MB)
kernel: efi: mem30: type=4, attr=0xf,
range=[0x00000000f73c1000-0x00000000f7e1b000) (10MB)
kernel: efi: mem31: type=3, attr=0xf,
range=[0x00000000f7e1b000-0x00000000f7f9b000) (1MB)
kernel: efi: mem32: type=5, attr=0x800000000000000f,
range=[0x00000000f7f9b000-0x00000000f7fcb000) (0MB)
kernel: efi: mem33: type=6, attr=0x800000000000000f,
range=[0x00000000f7fcb000-0x00000000f7fef000) (0MB)
kernel: efi: mem34: type=0, attr=0xf,
range=[0x00000000f7fef000-0x00000000f7ff3000) (0MB)
kernel: efi: mem35: type=9, attr=0xf,
range=[0x00000000f7ff3000-0x00000000f7ffb000) (0MB)
kernel: efi: mem36: type=10, attr=0xf,
range=[0x00000000f7ffb000-0x00000000f7fff000) (0MB)
kernel: efi: mem37: type=4, attr=0xf,
range=[0x00000000f7fff000-0x00000000f8000000) (0MB)
kernel: efi: mem38: type=7, attr=0xf,
range=[0x0000000100000000-0x0000000108000000) (128MB)
kernel: efi: mem39: type=0, attr=0x1,
range=[0x00000000000c0000-0x0000000000100000) (0MB)

You see, there is a region:
kernel: efi: mem05: type=7, attr=0xf,
range=[0x000000000062a000-0x0000000004200000) (59MB)

Which fits the kernel, and it's below 0x1000000 (16M), and the loader
didn't load the kernel to a prefered address (16M), so efi-stub will
relocate kernel to that low region.
I didn't observe any other platform's firmware will provide a region
starts below 16M and large enough to contain kernel, and load kernel
into a strange address at the same time.

>
> By the way, I personally like this way better. Because it is fixing a
> potention issue. Efi boot stub code may put kernel below 16M, but the
> relocation code in boot/compressed/head_64.S doesn't consider the
> possible conflict, and head_64.S have no way to know the efi memmap
> information. If this patch can't be accepted, woring around it in
> Hyper-V may be a way.
>
> Thanks
> Baoquan
>

Thanks for the review!

-- 
Best Regards,
Kairui Song

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address
  2019-09-25 15:25 ` Ard Biesheuvel
@ 2019-09-25 17:36   ` Kairui Song
  2019-10-11 10:18     ` Kairui Song
  0 siblings, 1 reply; 9+ messages in thread
From: Kairui Song @ 2019-09-25 17:36 UTC (permalink / raw)
  To: Ard Biesheuvel, the arch/x86 maintainers
  Cc: Linux Kernel Mailing List, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Matthew Garrett, Jarkko Sakkinen, Baoquan He,
	Dave Young, linux-efi

On Wed, Sep 25, 2019 at 11:25 PM Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
>
> On Thu, 19 Sep 2019 at 18:06, Kairui Song <kasong@redhat.com> wrote:
> >
> > Currently, kernel fails to boot on some HyperV VMs when using EFI.
> > And it's a potential issue on all platforms.
> >
> > It's caused a broken kernel relocation on EFI systems, when below three
> > conditions are met:
> >
> > 1. Kernel image is not loaded to the default address (LOAD_PHYSICAL_ADDR)
> >    by the loader.
> > 2. There isn't enough room to contain the kernel, starting from the
> >    default load address (eg. something else occupied part the region).
> > 3. In the memmap provided by EFI firmware, there is a memory region
> >    starts below LOAD_PHYSICAL_ADDR, and suitable for containing the
> >    kernel.
> >
> > Efi stub will perform a kernel relocation when condition 1 is met. But
> > due to condition 2, efi stub can't relocate kernel to the preferred
> > address, so it fallback to query and alloc from EFI firmware for lowest
> > usable memory region.
> >
> > It's incorrect to use the lowest memory address. In later stage, kernel
> > will assume LOAD_PHYSICAL_ADDR as the minimal acceptable relocate address,
> > but efi stub will end up relocating kernel below it.
> >
> > Then before the kernel decompressing. Kernel will do another relocation
> > to address not lower than LOAD_PHYSICAL_ADDR, this time the relocate will
> > over write the blockage at the default load address, which efi stub tried
> > to avoid, and lead to unexpected behavior. Beside, the memory region it
> > writes to is not allocated from EFI firmware, which is also wrong.
> >
> > To fix it, just don't let efi stub relocate the kernel to any address
> > lower than lowest acceptable address.
> >
> > Signed-off-by: Kairui Song <kasong@redhat.com>
> >
>
> Hello Kairui,
>
> This patch looks correct to me, but it needs an ack from the x86
> maintainers, since the rules around LOAD_PHYSICAL_ADDR are specific to
> the x86 architecture.
>
>

Thanks for the review, Ard.

Can any x86 maintainer help provide some review?

-- 
Best Regards,
Kairui Song

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address
  2019-09-25 17:36   ` Kairui Song
@ 2019-10-11 10:18     ` Kairui Song
  0 siblings, 0 replies; 9+ messages in thread
From: Kairui Song @ 2019-10-11 10:18 UTC (permalink / raw)
  To: the arch/x86 maintainers
  Cc: Linux Kernel Mailing List, Ard Biesheuvel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Matthew Garrett, Jarkko Sakkinen,
	Baoquan He, Dave Young, linux-efi

On Thu, Sep 26, 2019 at 1:36 AM Kairui Song <kasong@redhat.com> wrote:
>
> On Wed, Sep 25, 2019 at 11:25 PM Ard Biesheuvel
> <ard.biesheuvel@linaro.org> wrote:
> >
> > On Thu, 19 Sep 2019 at 18:06, Kairui Song <kasong@redhat.com> wrote:
> > >
> > > Currently, kernel fails to boot on some HyperV VMs when using EFI.
> > > And it's a potential issue on all platforms.
> > >
> > > It's caused a broken kernel relocation on EFI systems, when below three
> > > conditions are met:
> > >
> > > 1. Kernel image is not loaded to the default address (LOAD_PHYSICAL_ADDR)
> > >    by the loader.
> > > 2. There isn't enough room to contain the kernel, starting from the
> > >    default load address (eg. something else occupied part the region).
> > > 3. In the memmap provided by EFI firmware, there is a memory region
> > >    starts below LOAD_PHYSICAL_ADDR, and suitable for containing the
> > >    kernel.
> > >
> > > Efi stub will perform a kernel relocation when condition 1 is met. But
> > > due to condition 2, efi stub can't relocate kernel to the preferred
> > > address, so it fallback to query and alloc from EFI firmware for lowest
> > > usable memory region.
> > >
> > > It's incorrect to use the lowest memory address. In later stage, kernel
> > > will assume LOAD_PHYSICAL_ADDR as the minimal acceptable relocate address,
> > > but efi stub will end up relocating kernel below it.
> > >
> > > Then before the kernel decompressing. Kernel will do another relocation
> > > to address not lower than LOAD_PHYSICAL_ADDR, this time the relocate will
> > > over write the blockage at the default load address, which efi stub tried
> > > to avoid, and lead to unexpected behavior. Beside, the memory region it
> > > writes to is not allocated from EFI firmware, which is also wrong.
> > >
> > > To fix it, just don't let efi stub relocate the kernel to any address
> > > lower than lowest acceptable address.
> > >
> > > Signed-off-by: Kairui Song <kasong@redhat.com>
> > >
> >
> > Hello Kairui,
> >
> > This patch looks correct to me, but it needs an ack from the x86
> > maintainers, since the rules around LOAD_PHYSICAL_ADDR are specific to
> > the x86 architecture.
> >
> >
>
> Thanks for the review, Ard.
>
> Can any x86 maintainer help provide some review?
>
> --
> Best Regards,
> Kairui Song

Ping? Any comments?

--
Best Regards,
Kairui Song

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address
  2019-09-19 16:05 [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address Kairui Song
                   ` (2 preceding siblings ...)
  2019-09-25 15:25 ` Ard Biesheuvel
@ 2019-10-11 13:23 ` Borislav Petkov
  2019-10-12  3:46   ` Kairui Song
  3 siblings, 1 reply; 9+ messages in thread
From: Borislav Petkov @ 2019-10-11 13:23 UTC (permalink / raw)
  To: Kairui Song
  Cc: linux-kernel, Ard Biesheuvel, Thomas Gleixner, Ingo Molnar,
	Matthew Garrett, Jarkko Sakkinen, Baoquan He, Dave Young, x86,
	linux-efi

On Fri, Sep 20, 2019 at 12:05:21AM +0800, Kairui Song wrote:
> Currently, kernel fails to boot on some HyperV VMs when using EFI.
> And it's a potential issue on all platforms.
> 
> It's caused a broken kernel relocation on EFI systems, when below three
> conditions are met:
> 
> 1. Kernel image is not loaded to the default address (LOAD_PHYSICAL_ADDR)
>    by the loader.
> 2. There isn't enough room to contain the kernel, starting from the
>    default load address (eg. something else occupied part the region).
> 3. In the memmap provided by EFI firmware, there is a memory region
>    starts below LOAD_PHYSICAL_ADDR, and suitable for containing the
>    kernel.
> 
> Efi stub will perform a kernel relocation when condition 1 is met. But
> due to condition 2, efi stub can't relocate kernel to the preferred
> address, so it fallback to query and alloc from EFI firmware for lowest
> usable memory region.
> 
> It's incorrect to use the lowest memory address. In later stage, kernel
> will assume LOAD_PHYSICAL_ADDR as the minimal acceptable relocate address,
> but efi stub will end up relocating kernel below it.

So far, so good.

> Then before the kernel decompressing. Kernel will do another relocation
> to address not lower than LOAD_PHYSICAL_ADDR, this time the relocate will
> over write the blockage at the default load address, which efi stub tried
> to avoid, and lead to unexpected behavior. Beside, the memory region it
> writes to is not allocated from EFI firmware, which is also wrong.

This paragraph is an unreadable mess and should be rewritten in simple,
declarative sentences.

The patch itself looks ok.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address
  2019-10-11 13:23 ` Borislav Petkov
@ 2019-10-12  3:46   ` Kairui Song
  0 siblings, 0 replies; 9+ messages in thread
From: Kairui Song @ 2019-10-12  3:46 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Kernel Mailing List, Ard Biesheuvel, Thomas Gleixner,
	Ingo Molnar, Matthew Garrett, Jarkko Sakkinen, Baoquan He,
	Dave Young, the arch/x86 maintainers, linux-efi

On Fri, Oct 11, 2019 at 9:24 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Sep 20, 2019 at 12:05:21AM +0800, Kairui Song wrote:
> > Currently, kernel fails to boot on some HyperV VMs when using EFI.
> > And it's a potential issue on all platforms.
> >
> > It's caused a broken kernel relocation on EFI systems, when below three
> > conditions are met:
> >
> > 1. Kernel image is not loaded to the default address (LOAD_PHYSICAL_ADDR)
> >    by the loader.
> > 2. There isn't enough room to contain the kernel, starting from the
> >    default load address (eg. something else occupied part the region).
> > 3. In the memmap provided by EFI firmware, there is a memory region
> >    starts below LOAD_PHYSICAL_ADDR, and suitable for containing the
> >    kernel.
> >
> > Efi stub will perform a kernel relocation when condition 1 is met. But
> > due to condition 2, efi stub can't relocate kernel to the preferred
> > address, so it fallback to query and alloc from EFI firmware for lowest
> > usable memory region.
> >
> > It's incorrect to use the lowest memory address. In later stage, kernel
> > will assume LOAD_PHYSICAL_ADDR as the minimal acceptable relocate address,
> > but efi stub will end up relocating kernel below it.
>
> So far, so good.
>
> > Then before the kernel decompressing. Kernel will do another relocation
> > to address not lower than LOAD_PHYSICAL_ADDR, this time the relocate will
> > over write the blockage at the default load address, which efi stub tried
> > to avoid, and lead to unexpected behavior. Beside, the memory region it
> > writes to is not allocated from EFI firmware, which is also wrong.
>
> This paragraph is an unreadable mess and should be rewritten in simple,
> declarative sentences.
>
> The patch itself looks ok.

Thanks. I've sent V3 updating that paragraph.

>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
--
Best Regards,
Kairui Song

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-10-12  3:46 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-19 16:05 [PATCH v2] x86, efi: never relocate kernel below lowest acceptable address Kairui Song
2019-09-25  9:51 ` Jarkko Sakkinen
2019-09-25  9:55 ` Baoquan He
2019-09-25 17:35   ` Kairui Song
2019-09-25 15:25 ` Ard Biesheuvel
2019-09-25 17:36   ` Kairui Song
2019-10-11 10:18     ` Kairui Song
2019-10-11 13:23 ` Borislav Petkov
2019-10-12  3:46   ` Kairui Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).