linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case
@ 2017-06-24 14:16 Baoquan He
  2017-06-24 14:23 ` Baoquan He
  0 siblings, 1 reply; 12+ messages in thread
From: Baoquan He @ 2017-06-24 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: Baoquan He

Kdump kernel will reset to firmware after crash is trigered when
crashkernel=xxM,high is added to kernel command line. Kexec has the
same phenomenon. This only happened on system with kaslr code
compiled in and kernel option 'nokaslr'is added. Both of them works
well when kaslr is enabled.

When crashkernel high is set or kexec case, kexec/kdump kernel will be
put above 4G. Since we assign the original loading address of kernel to
virt_addr as initial value, the virt_addr will be larger than 1G if kaslr
is disabled, it exceeds the kernel mapping size which is only 1G. Then
it will cause relocation handling error in handle_relocations().

In fact this is a known issue and fixed in commit:

... f285f4a ("x86, boot: Skip relocs when load address unchanged")

But above fix was lost carelessly in later commit:

... 8391c73 ("x86/KASLR: Randomize virtual address separately")

To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as initial value.
If no kaslr is taken, it will skip the relocation handling and jump out.

Fixes: 8391c73 ("x86/KASLR: Randomize virtual address separately")
Tested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
"H. Peter Anvin" <hpa@zytor.com>
Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar <mingo@redhat.com>
x86@kernel.org
Kees Cook <keescook@chromium.org
Baoquan He <bhe@redhat.com>
Dave Jiang <dave.jiang@intel.com>
Yinghai Lu <yinghai@kernel.org>
Arnd Bergmann <arnd@arndb.de>
Thomas Garnier <thgarnie@google.com>
---
 arch/x86/boot/compressed/kaslr.c | 3 ---
 arch/x86/boot/compressed/misc.c  | 4 ++--
 arch/x86/boot/compressed/misc.h  | 2 --
 3 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index fe318b4..91f27ab 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -625,9 +625,6 @@ void choose_random_location(unsigned long input,
 {
 	unsigned long random_addr, min_addr;
 
-	/* By default, keep output position unchanged. */
-	*virt_addr = *output;
-
 	if (cmdline_find_option_bool("nokaslr")) {
 		warn("KASLR disabled: 'nokaslr' on cmdline.");
 		return;
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b3c5a5f0..c945acd 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -338,7 +338,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 				  unsigned long output_len)
 {
 	const unsigned long kernel_total_size = VO__end - VO__text;
-	unsigned long virt_addr = (unsigned long)output;
+	unsigned long virt_addr = LOAD_PHYSICAL_ADDR;
 
 	/* Retain x86 boot parameters pointer passed from startup_32/64. */
 	boot_params = rmode;
@@ -397,7 +397,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 #ifndef CONFIG_RELOCATABLE
 	if ((unsigned long)output != LOAD_PHYSICAL_ADDR)
 		error("Destination address does not match LOAD_PHYSICAL_ADDR");
-	if ((unsigned long)output != virt_addr)
+	if (virt_addr != LOAD_PHYSICAL_ADDR)
 		error("Destination virtual address changed when not relocatable");
 #endif
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 1c8355e..766a521 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -81,8 +81,6 @@ static inline void choose_random_location(unsigned long input,
 					  unsigned long output_size,
 					  unsigned long *virt_addr)
 {
-	/* No change from existing output location. */
-	*virt_addr = *output;
 }
 #endif
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case
  2017-06-24 14:16 [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case Baoquan He
@ 2017-06-24 14:23 ` Baoquan He
  0 siblings, 0 replies; 12+ messages in thread
From: Baoquan He @ 2017-06-24 14:23 UTC (permalink / raw)
  To: linux-kernel

Sorry, forgot adding Cc: before maintainers contact, NACK this patch and
will repost.

On 06/24/17 at 10:16pm, Baoquan He wrote:
> Kdump kernel will reset to firmware after crash is trigered when
> crashkernel=xxM,high is added to kernel command line. Kexec has the
> same phenomenon. This only happened on system with kaslr code
> compiled in and kernel option 'nokaslr'is added. Both of them works
> well when kaslr is enabled.
> 
> When crashkernel high is set or kexec case, kexec/kdump kernel will be
> put above 4G. Since we assign the original loading address of kernel to
> virt_addr as initial value, the virt_addr will be larger than 1G if kaslr
> is disabled, it exceeds the kernel mapping size which is only 1G. Then
> it will cause relocation handling error in handle_relocations().
> 
> In fact this is a known issue and fixed in commit:
> 
> ... f285f4a ("x86, boot: Skip relocs when load address unchanged")
> 
> But above fix was lost carelessly in later commit:
> 
> ... 8391c73 ("x86/KASLR: Randomize virtual address separately")
> 
> To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as initial value.
> If no kaslr is taken, it will skip the relocation handling and jump out.
> 
> Fixes: 8391c73 ("x86/KASLR: Randomize virtual address separately")
> Tested-by: Dave Young <dyoung@redhat.com>
> Signed-off-by: Baoquan He <bhe@redhat.com>
> "H. Peter Anvin" <hpa@zytor.com>
> Thomas Gleixner <tglx@linutronix.de>
> Ingo Molnar <mingo@redhat.com>
> x86@kernel.org
> Kees Cook <keescook@chromium.org
> Baoquan He <bhe@redhat.com>
> Dave Jiang <dave.jiang@intel.com>
> Yinghai Lu <yinghai@kernel.org>
> Arnd Bergmann <arnd@arndb.de>
> Thomas Garnier <thgarnie@google.com>
> ---
>  arch/x86/boot/compressed/kaslr.c | 3 ---
>  arch/x86/boot/compressed/misc.c  | 4 ++--
>  arch/x86/boot/compressed/misc.h  | 2 --
>  3 files changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
> index fe318b4..91f27ab 100644
> --- a/arch/x86/boot/compressed/kaslr.c
> +++ b/arch/x86/boot/compressed/kaslr.c
> @@ -625,9 +625,6 @@ void choose_random_location(unsigned long input,
>  {
>  	unsigned long random_addr, min_addr;
>  
> -	/* By default, keep output position unchanged. */
> -	*virt_addr = *output;
> -
>  	if (cmdline_find_option_bool("nokaslr")) {
>  		warn("KASLR disabled: 'nokaslr' on cmdline.");
>  		return;
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index b3c5a5f0..c945acd 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -338,7 +338,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
>  				  unsigned long output_len)
>  {
>  	const unsigned long kernel_total_size = VO__end - VO__text;
> -	unsigned long virt_addr = (unsigned long)output;
> +	unsigned long virt_addr = LOAD_PHYSICAL_ADDR;
>  
>  	/* Retain x86 boot parameters pointer passed from startup_32/64. */
>  	boot_params = rmode;
> @@ -397,7 +397,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
>  #ifndef CONFIG_RELOCATABLE
>  	if ((unsigned long)output != LOAD_PHYSICAL_ADDR)
>  		error("Destination address does not match LOAD_PHYSICAL_ADDR");
> -	if ((unsigned long)output != virt_addr)
> +	if (virt_addr != LOAD_PHYSICAL_ADDR)
>  		error("Destination virtual address changed when not relocatable");
>  #endif
>  
> diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> index 1c8355e..766a521 100644
> --- a/arch/x86/boot/compressed/misc.h
> +++ b/arch/x86/boot/compressed/misc.h
> @@ -81,8 +81,6 @@ static inline void choose_random_location(unsigned long input,
>  					  unsigned long output_size,
>  					  unsigned long *virt_addr)
>  {
> -	/* No change from existing output location. */
> -	*virt_addr = *output;
>  }
>  #endif
>  
> -- 
> 2.5.5
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case
  2017-07-06 13:28       ` Baoquan He
@ 2017-07-06 13:54         ` Baoquan He
  0 siblings, 0 replies; 12+ messages in thread
From: Baoquan He @ 2017-07-06 13:54 UTC (permalink / raw)
  To: Kees Cook
  Cc: LKML, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86,
	Kees Cook <keescook@chromium.org, Yinghai Lu, Arnd Bergmann,
	Thomas Garnier

On 07/06/17 at 09:28pm, Baoquan He wrote:
> > > Do you mean the handling in boot/compressed/head_64.S? Whatever it does,
> > > it's only for physical address. The virtual address mapping is not
> > > touched. Here virt_addr respresents the offset between
> > > 0xffffffff80000000, 0xffffffffc0000000. And if skip the relocation
> > > handling, we can see that the '_text' will be mapped to
> > > 0xffffffff81000000 forever, no matter where physical address of '_text'
> > > is. So here LOAD_PHYSICAL_ADDR is default value of 'virt_addr'.
> > 
> > So, this isn't always true. The output address is the address assigned
> > by head_*.S for the extraction destination. We can't just change that,
> > since it's based on where to perform the extraction, etc. All the
> > logic in there is designed to make sure we're actually >=
> > LOAD_PHYSICAL_ADDR:
> > 
> >         cmpl    $LOAD_PHYSICAL_ADDR, %ebx
> >         jge     1f
> > #endif
> >         movl    $LOAD_PHYSICAL_ADDR, %ebx
> > 1:
> 
> Yes, here it's trying to do an adjustment. To align the original loading
> address if it's not aligned to CONFIG_PHYSICAL_ALIGN, and to make it be
> LOAD_PHYSICAL_ADDR if it's smaller than LOAD_PHYSICAL_ADDR. Other than
> those, nothing else be done. So let me conclude the possible original
> 'output' value:
> 1) default case
>   It's LOAD_PHYSICAL_ADDR Which is the default loading address.
> 2) kexec/kdump
>   The output will be at the top of the available physical ram.
> 3) Modified bootloader
>   Which could put kernel anywhere by modifying code of bootloader. This
>   is the case that we need do above alignment and adjustment to avoid
>   unpleasant kernel error.
> 
> Except of above 3 cases, physical address of kernel could be changed by
> physical address randomization.
> 4) physical address randomization
> 
> However, for virtual address, there are only two chances to decide:
> 1) arch/x86/kernel/vmlinux.lds.S
>   we just map .text at __START_KERNEL
> #define __PHYSICAL_START        ALIGN(CONFIG_PHYSICAL_START, \                                                                                   
>                                       CONFIG_PHYSICAL_ALIGN)
> 
> #define __START_KERNEL          (__START_KERNEL_map + __PHYSICAL_START)
> 2) handle_relocations :: boot/compressed/misc.c
> In handle_relocations(), we DO pass in the physical 'output', but that
> is used to calculate 'map' which is used to position those memory
> unit where the referenced address need be relocated. 
>   
> So later, in arch/x86/kernel/head_64.S, we will store the delta which is
> between 16M and the actual kernel loading address into phys_base. So as

Here I was wrong about phys_base. It has been changed in the latest code
in Linus's tree. Please check arch/x86/kernel/head64.c: 

void __head __startup_64(unsigned long physaddr)                                                                                                 
{
	...
	load_delta = physaddr - (unsigned long)(_text - __START_KERNEL_map);
	...

	/* Fixup phys_base */
        p = fixup_pointer(&phys_base, physaddr);
        *p += load_delta;
}
Here physaddr should be 'output' which is the place kernel decompressed
at.


> long as we can translate physical address and virtual address of kernel
> by below formula, everything is going very well. 
> 
>     y = x - __START_KERNEL_map + phys_base.
>    
> We really don't need to tie physical address and virtual address of
> kernel together in any cases. They can be randomized or not randomized
> completely separately. As long as we keep above translation formula
> working well, everything need not be worried.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case
  2017-07-05 19:06     ` Kees Cook
@ 2017-07-06 13:28       ` Baoquan He
  2017-07-06 13:54         ` Baoquan He
  0 siblings, 1 reply; 12+ messages in thread
From: Baoquan He @ 2017-07-06 13:28 UTC (permalink / raw)
  To: Kees Cook
  Cc: LKML, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86,
	Kees Cook <keescook@chromium.org, Yinghai Lu, Arnd Bergmann,
	Thomas Garnier

On 07/05/17 at 12:06pm, Kees Cook wrote:
> On Tue, Jun 27, 2017 at 4:24 PM, Baoquan He <bhe@redhat.com> wrote:
> > Below code was added to fix the kexec/kdump kernel with kaslr disabled,
> > at that time kernel kaslr physical address and virtual address
> > randomization are coupled. What it was doing is to randomize physical
> > address in 1G range and add the delta onto the starting address of
> > virtual address, 0xffffffff80000000.
> >
> > But now the physical and virtual address randomization has been
> > separated. It means that whether each of them is changed or not
> > randomly, the randomization wont' be impacted. So below change you
> > provided will has two problems:
> 
> Well, it's only sometimes separated. When it's still tied together,
> the (identity-mapped) physical randomization _is_ the virtual
> randomization.

I am a little confused about this. I can't think of a case that
in x86 64 virtual address and physical address need be tied together to
do randomization. And in what conditions they are separated, could you
give cases with a detailed address value?

But in x86 32bit, it does couple physical and virtual address
randomization together. Because it just randomizes to get an value
between 16M and 512M, then use the delta between 16M and the random
value to do the relocation handling for virtual address.

Just write my understanding at below, please add comment or point out
mistakes of mine.
> 
> > 1st, the 'virt_addr' represents the offset of virtual address
> > randomization between 0xffffffff80000000 and 0xffffffffc0000000, should
> > not get a initial value '(unsigned long)output'. The 'output' could be
> > any value between 16M and 64T. It makes no sense to make the assignment
> > and could bring confusion to code readers.
> >
> > 2nd, for x86 64, we do the relocation handling if and only if virtual
> > address is randomized to a different value, namely the offset 'virt_addr'
> > has a value which is not 16M. You can see this in handle_relocations().
> >
> >         if (IS_ENABLED(CONFIG_X86_64))
> >                 delta = virt_addr - LOAD_PHYSICAL_ADDR;
> >
> >         if (!delta) {
> >                 debug_putstr("No relocation needed... ");
> >                 return;
> >         }
> 
> On x86_64, the kernel is built so that if the virt address matches
> LOAD_PHYSICAL_ADDR, we don't have to do relocations.

Agreed.

> 
> > Now below code will bring a bug that if physical address randomization fail
> > to get a different value, but virtual address randomization may succeed to
> > get one, then it won't do relocation handling. This contradicts with the
> > design and implementation of the current code.
> 
> Let's address this specific issue (below).
> 
> >
> >>
> >> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> >> index b3c5a5f030ce..4496a05d1f8a 100644
> >> --- a/arch/x86/boot/compressed/misc.c
> >> +++ b/arch/x86/boot/compressed/misc.c
> >> @@ -339,6 +339,7 @@ asmlinkage __visible void *extract_kernel(void
> >> *rmode, memptr heap,
> >>  {
> >>         const unsigned long kernel_total_size = VO__end - VO__text;
> >>         unsigned long virt_addr = (unsigned long)output;
> >> +       unsigned long virt_addr_orig = virt_addr;
> >>
> >>         /* Retain x86 boot parameters pointer passed from startup_32/64. */
> >>         boot_params = rmode;
> >> @@ -405,7 +406,12 @@ asmlinkage __visible void *extract_kernel(void
> >> *rmode, memptr heap,
> >>         __decompress(input_data, input_len, NULL, NULL, output, output_len,
> >>                         NULL, error);
> >>         parse_elf(output);
> >> -       handle_relocations(output, output_len, virt_addr);
> >> +       /*
> >> +        * 32-bit always performs relocations. 64-bit relocations are only
> >> +        * needed if kASLR has chosen a different load address.
> >> +        */
> >> +       if (!IS_ENABLED(CONFIG_X86_64) || virt_addr != virt_addr_orig)
> >> +               handle_relocations(output, output_len, virt_addr);
> >>         debug_putstr("done.\nBooting the kernel.\n");
> >>         return output;
> >>  }
> >>
> >> > diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> >> > index b3c5a5f0..c945acd 100644
> >> > --- a/arch/x86/boot/compressed/misc.c
> >> > +++ b/arch/x86/boot/compressed/misc.c
> >> > @@ -338,7 +338,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
> >> >                                   unsigned long output_len)
> >> >  {
> >> >         const unsigned long kernel_total_size = VO__end - VO__text;
> >> > -       unsigned long virt_addr = (unsigned long)output;
> >> > +       unsigned long virt_addr = LOAD_PHYSICAL_ADDR;
> >>
> >> I don't like this being hard-coded. The logic for the output address
> >> is already handled in the .S files, and may not be LOAD_PHYSICAL_ADDR.
> >> I'd prefer just adding back the simple test of virt_addr changing.
> >
> > Do you mean the handling in boot/compressed/head_64.S? Whatever it does,
> > it's only for physical address. The virtual address mapping is not
> > touched. Here virt_addr respresents the offset between
> > 0xffffffff80000000, 0xffffffffc0000000. And if skip the relocation
> > handling, we can see that the '_text' will be mapped to
> > 0xffffffff81000000 forever, no matter where physical address of '_text'
> > is. So here LOAD_PHYSICAL_ADDR is default value of 'virt_addr'.
> 
> So, this isn't always true. The output address is the address assigned
> by head_*.S for the extraction destination. We can't just change that,
> since it's based on where to perform the extraction, etc. All the
> logic in there is designed to make sure we're actually >=
> LOAD_PHYSICAL_ADDR:
> 
>         cmpl    $LOAD_PHYSICAL_ADDR, %ebx
>         jge     1f
> #endif
>         movl    $LOAD_PHYSICAL_ADDR, %ebx
> 1:

Yes, here it's trying to do an adjustment. To align the original loading
address if it's not aligned to CONFIG_PHYSICAL_ALIGN, and to make it be
LOAD_PHYSICAL_ADDR if it's smaller than LOAD_PHYSICAL_ADDR. Other than
those, nothing else be done. So let me conclude the possible original
'output' value:
1) default case
  It's LOAD_PHYSICAL_ADDR Which is the default loading address.
2) kexec/kdump
  The output will be at the top of the available physical ram.
3) Modified bootloader
  Which could put kernel anywhere by modifying code of bootloader. This
  is the case that we need do above alignment and adjustment to avoid
  unpleasant kernel error.

Except of above 3 cases, physical address of kernel could be changed by
physical address randomization.
4) physical address randomization

However, for virtual address, there are only two chances to decide:
1) arch/x86/kernel/vmlinux.lds.S
  we just map .text at __START_KERNEL
#define __PHYSICAL_START        ALIGN(CONFIG_PHYSICAL_START, \                                                                                   
                                      CONFIG_PHYSICAL_ALIGN)

#define __START_KERNEL          (__START_KERNEL_map + __PHYSICAL_START)
2) handle_relocations :: boot/compressed/misc.c
In handle_relocations(), we DO pass in the physical 'output', but that
is used to calculate 'map' which is used to position those memory
unit where the referenced address need be relocated. 
  
So later, in arch/x86/kernel/head_64.S, we will store the delta which is
between 16M and the actual kernel loading address into phys_base. So as
long as we can translate physical address and virtual address of kernel
by below formula, everything is going very well. 

    y = x - __START_KERNEL_map + phys_base.
   
We really don't need to tie physical address and virtual address of
kernel together in any cases. They can be randomized or not randomized
completely separately. As long as we keep above translation formula
working well, everything need not be worried.
> 
> We can't undo that since the calculation performed there is critical
> to getting it into the right place.
> 
> >> >
> >> >         /* Retain x86 boot parameters pointer passed from startup_32/64. */
> >> >         boot_params = rmode;
> >> > @@ -397,7 +397,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
> >> >  #ifndef CONFIG_RELOCATABLE
> >> >         if ((unsigned long)output != LOAD_PHYSICAL_ADDR)
> >> >                 error("Destination address does not match LOAD_PHYSICAL_ADDR");
> >> > -       if ((unsigned long)output != virt_addr)
> >> > +       if (virt_addr != LOAD_PHYSICAL_ADDR)
> >> >                 error("Destination virtual address changed when not relocatable");
> >> >  #endif
> >> >
> >> > diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> >> > index 1c8355e..766a521 100644
> >> > --- a/arch/x86/boot/compressed/misc.h
> >> > +++ b/arch/x86/boot/compressed/misc.h
> >> > @@ -81,8 +81,6 @@ static inline void choose_random_location(unsigned long input,
> >> >                                           unsigned long output_size,
> >> >                                           unsigned long *virt_addr)
> >> >  {
> >> > -       /* No change from existing output location. */
> >> > -       *virt_addr = *output;
> >> >  }
> >> >  #endif
> 
> So, the code in question is this:
> 
> static void handle_relocations(void *output, unsigned long output_len,
>                                unsigned long virt_addr)
> {
>         int *reloc;
>         unsigned long delta, map, ptr;
>         unsigned long min_addr = (unsigned long)output;
>         unsigned long max_addr = min_addr + (VO___bss_start - VO__text);
> 
>         /*
>          * Calculate the delta between where vmlinux was linked to load
>          * and where it was actually loaded.
>          */
>         delta = min_addr - LOAD_PHYSICAL_ADDR;
> 
>         /*
>          * The kernel contains a table of relocation addresses. Those
>          * addresses have the final load address of the kernel in virtual
>          * memory. We are currently working in the self map. So we need to
>          * create an adjustment for kernel memory addresses to the self map.
>          * This will involve subtracting out the base address of the kernel.
>          */
>         map = delta - __START_KERNEL_map;
> 
>         /*
>          * 32-bit always performs relocations. 64-bit relocations are only
>          * needed if KASLR has chosen a different starting address offset
>          * from __START_KERNEL_map.
>          */
>         if (IS_ENABLED(CONFIG_X86_64))
>                 delta = virt_addr - LOAD_PHYSICAL_ADDR;
> 
>         if (!delta) {
>                 debug_putstr("No relocation needed... ");
>                 return;
>         }
>         debug_putstr("Performing relocations... ");
> 
> The first use of delta is to figure out the correct relocation offset,
> which is based on the relocation tables, etc.
Yes
> 
> The second use is for decoupled phys/virt (x86_64 only), but I don't
> think these need to be separate, do they? I think "delta" should only
> ever operate on the virt_addr, yes?

Yes. And if yes, for x86_64, there's nothing related to physical
address, here physical address is only used to find out the ram unit
where relocation need be done.

> 
> in misc.c, this chooses a physical location and a virtual:
> 
>         choose_random_location((unsigned long)input_data, input_len,
>                                 (unsigned long *)&output,
>                                 max(output_len, kernel_total_size),
>                                 &virt_addr);
> 
> Then we place the kernel in the physical location:
> 
>         __decompress(input_data, input_len, NULL, NULL, output, output_len,
>                         NULL, error);
> 
> And finally handle relocations:
> 
>         handle_relocations(output, output_len, virt_addr);
> 
> "output" is the identity-mapped physical (and possibly virtual)
> location, and virt_addr is the virtual.

No, output here is physical address, and yes., it's also virtual address
which is identity mappped. But when we do the relocation handling, we
can only think it as physical address. Here it's virtual only because we
need to operate on ram unit under protection mode with paging enabled.

> 
> So, it looks to me like the calculations at the top of
> handle_relocations() may not be correct for the case you're hitting?
> 
> What are the values there in that case? We must not hard-code output,
> so handle_relocations() has to be entirely self-contained with its
> logic.

Well, I may get a little bit what confusion you might have. First of
all, that 'virt_addr' is not the real virtual address, it's just an
offset in [0xffffffff80000000], 0xffffffffc0000000). Its value range is
between 16M and 1G. And for physical address 'output', we need avoid
many things, initramfs, kernel cmdline, etc. While for virt_addr, we
don't need to avoid anything, because it's just an offset and we can map
kernel from any places to 'virt_addr'. And secondly, hard-coding 'virt_add'
is only hard-coding the virtual randomzation offset, it won't impact
'output' at all. 'ouput' doesn't rely on and isn't related to
'virt_addr' at all, for x86 64.

> 
> I actually think the bug might be that choose_random_location() in the
> nokaslr case doesn't explicitly set virt_addr to output, like the
> no-op does:


> 
> void choose_random_location(unsigned long input,
>                             unsigned long input_size,
>                             unsigned long *output,
>                             unsigned long output_size,
>                             unsigned long *virt_addr)
> {
>         unsigned long random_addr, min_addr;
> 
>         if (cmdline_find_option_bool("nokaslr")) {
>                 warn("KASLR disabled: 'nokaslr' on cmdline.");
>                 return;
>         }
> 
> Perhaps we need a few fixes:
> 
> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
> index 91f27ab970ef..e1e57c626093 100644
> --- a/arch/x86/boot/compressed/kaslr.c
> +++ b/arch/x86/boot/compressed/kaslr.c
> @@ -627,6 +627,7 @@ void choose_random_location(unsigned long input,
> 
>         if (cmdline_find_option_bool("nokaslr")) {
>                 warn("KASLR disabled: 'nokaslr' on cmdline.");
> +               *virt_addr = *output;
>                 return;
>         }
> 
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index 00241c815524..3eb754e4f823 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -198,12 +198,13 @@ static void handle_relocations(void *output,
> unsigned long output_len,
>          * needed if KASLR has chosen a different starting address offset
>          * from __START_KERNEL_map.
>          */
> -       if (IS_ENABLED(CONFIG_X86_64))
> +       if (IS_ENABLED(CONFIG_X86_64)) {
>                 delta = virt_addr - LOAD_PHYSICAL_ADDR;
> 
> -       if (!delta) {
> -               debug_putstr("No relocation needed... ");
> -               return;
> +               if (!delta) {
> +                       debug_putstr("No relocation needed... ");
> +                       return;
> +               }
>         }
>         debug_putstr("Performing relocations... ");
> 
> 
> 
> 
> 
> -- 
> Kees Cook
> Pixel Security

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case
  2017-06-27 23:24   ` Baoquan He
  2017-06-27 23:33     ` Baoquan He
@ 2017-07-05 19:06     ` Kees Cook
  2017-07-06 13:28       ` Baoquan He
  1 sibling, 1 reply; 12+ messages in thread
From: Kees Cook @ 2017-07-05 19:06 UTC (permalink / raw)
  To: Baoquan He
  Cc: LKML, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86,
	Kees Cook <keescook@chromium.org, Yinghai Lu, Arnd Bergmann,
	Thomas Garnier

On Tue, Jun 27, 2017 at 4:24 PM, Baoquan He <bhe@redhat.com> wrote:
> Hi Kees,
>
> On 06/27/17 at 03:42pm, Kees Cook wrote:
>> On Sat, Jun 24, 2017 at 7:25 AM, Baoquan He <bhe@redhat.com> wrote:
>> > Kdump kernel will reset to firmware after crash is trigered when
>> > crashkernel=xxM,high is added to kernel command line. Kexec has the
>> > same phenomenon. This only happened on system with kaslr code
>> > compiled in and kernel option 'nokaslr'is added. Both of them works
>> > well when kaslr is enabled.
>> >
>> > When crashkernel high is set or kexec case, kexec/kdump kernel will be
>> > put above 4G. Since we assign the original loading address of kernel to
>> > virt_addr as initial value, the virt_addr will be larger than 1G if kaslr
>> > is disabled, it exceeds the kernel mapping size which is only 1G. Then
>> > it will cause relocation handling error in handle_relocations().
>> >
>> > In fact this is a known issue and fixed in commit:
>> >
>> > ... f285f4a ("x86, boot: Skip relocs when load address unchanged")
>> >
>> > But above fix was lost carelessly in later commit:
>> >
>> > ... 8391c73 ("x86/KASLR: Randomize virtual address separately")
>>
>> Hrm, this changes more than just fixing this missing logic. How about
>> just simply:
>
> Below code was added to fix the kexec/kdump kernel with kaslr disabled,
> at that time kernel kaslr physical address and virtual address
> randomization are coupled. What it was doing is to randomize physical
> address in 1G range and add the delta onto the starting address of
> virtual address, 0xffffffff80000000.
>
> But now the physical and virtual address randomization has been
> separated. It means that whether each of them is changed or not
> randomly, the randomization wont' be impacted. So below change you
> provided will has two problems:

Well, it's only sometimes separated. When it's still tied together,
the (identity-mapped) physical randomization _is_ the virtual
randomization.

> 1st, the 'virt_addr' represents the offset of virtual address
> randomization between 0xffffffff80000000 and 0xffffffffc0000000, should
> not get a initial value '(unsigned long)output'. The 'output' could be
> any value between 16M and 64T. It makes no sense to make the assignment
> and could bring confusion to code readers.
>
> 2nd, for x86 64, we do the relocation handling if and only if virtual
> address is randomized to a different value, namely the offset 'virt_addr'
> has a value which is not 16M. You can see this in handle_relocations().
>
>         if (IS_ENABLED(CONFIG_X86_64))
>                 delta = virt_addr - LOAD_PHYSICAL_ADDR;
>
>         if (!delta) {
>                 debug_putstr("No relocation needed... ");
>                 return;
>         }

On x86_64, the kernel is built so that if the virt address matches
LOAD_PHYSICAL_ADDR, we don't have to do relocations.

> Now below code will bring a bug that if physical address randomization fail
> to get a different value, but virtual address randomization may succeed to
> get one, then it won't do relocation handling. This contradicts with the
> design and implementation of the current code.

Let's address this specific issue (below).

>
>>
>> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
>> index b3c5a5f030ce..4496a05d1f8a 100644
>> --- a/arch/x86/boot/compressed/misc.c
>> +++ b/arch/x86/boot/compressed/misc.c
>> @@ -339,6 +339,7 @@ asmlinkage __visible void *extract_kernel(void
>> *rmode, memptr heap,
>>  {
>>         const unsigned long kernel_total_size = VO__end - VO__text;
>>         unsigned long virt_addr = (unsigned long)output;
>> +       unsigned long virt_addr_orig = virt_addr;
>>
>>         /* Retain x86 boot parameters pointer passed from startup_32/64. */
>>         boot_params = rmode;
>> @@ -405,7 +406,12 @@ asmlinkage __visible void *extract_kernel(void
>> *rmode, memptr heap,
>>         __decompress(input_data, input_len, NULL, NULL, output, output_len,
>>                         NULL, error);
>>         parse_elf(output);
>> -       handle_relocations(output, output_len, virt_addr);
>> +       /*
>> +        * 32-bit always performs relocations. 64-bit relocations are only
>> +        * needed if kASLR has chosen a different load address.
>> +        */
>> +       if (!IS_ENABLED(CONFIG_X86_64) || virt_addr != virt_addr_orig)
>> +               handle_relocations(output, output_len, virt_addr);
>>         debug_putstr("done.\nBooting the kernel.\n");
>>         return output;
>>  }
>>
>> > diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
>> > index b3c5a5f0..c945acd 100644
>> > --- a/arch/x86/boot/compressed/misc.c
>> > +++ b/arch/x86/boot/compressed/misc.c
>> > @@ -338,7 +338,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
>> >                                   unsigned long output_len)
>> >  {
>> >         const unsigned long kernel_total_size = VO__end - VO__text;
>> > -       unsigned long virt_addr = (unsigned long)output;
>> > +       unsigned long virt_addr = LOAD_PHYSICAL_ADDR;
>>
>> I don't like this being hard-coded. The logic for the output address
>> is already handled in the .S files, and may not be LOAD_PHYSICAL_ADDR.
>> I'd prefer just adding back the simple test of virt_addr changing.
>
> Do you mean the handling in boot/compressed/head_64.S? Whatever it does,
> it's only for physical address. The virtual address mapping is not
> touched. Here virt_addr respresents the offset between
> 0xffffffff80000000, 0xffffffffc0000000. And if skip the relocation
> handling, we can see that the '_text' will be mapped to
> 0xffffffff81000000 forever, no matter where physical address of '_text'
> is. So here LOAD_PHYSICAL_ADDR is default value of 'virt_addr'.

So, this isn't always true. The output address is the address assigned
by head_*.S for the extraction destination. We can't just change that,
since it's based on where to perform the extraction, etc. All the
logic in there is designed to make sure we're actually >=
LOAD_PHYSICAL_ADDR:

        cmpl    $LOAD_PHYSICAL_ADDR, %ebx
        jge     1f
#endif
        movl    $LOAD_PHYSICAL_ADDR, %ebx
1:

We can't undo that since the calculation performed there is critical
to getting it into the right place.

>> >
>> >         /* Retain x86 boot parameters pointer passed from startup_32/64. */
>> >         boot_params = rmode;
>> > @@ -397,7 +397,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
>> >  #ifndef CONFIG_RELOCATABLE
>> >         if ((unsigned long)output != LOAD_PHYSICAL_ADDR)
>> >                 error("Destination address does not match LOAD_PHYSICAL_ADDR");
>> > -       if ((unsigned long)output != virt_addr)
>> > +       if (virt_addr != LOAD_PHYSICAL_ADDR)
>> >                 error("Destination virtual address changed when not relocatable");
>> >  #endif
>> >
>> > diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
>> > index 1c8355e..766a521 100644
>> > --- a/arch/x86/boot/compressed/misc.h
>> > +++ b/arch/x86/boot/compressed/misc.h
>> > @@ -81,8 +81,6 @@ static inline void choose_random_location(unsigned long input,
>> >                                           unsigned long output_size,
>> >                                           unsigned long *virt_addr)
>> >  {
>> > -       /* No change from existing output location. */
>> > -       *virt_addr = *output;
>> >  }
>> >  #endif

So, the code in question is this:

static void handle_relocations(void *output, unsigned long output_len,
                               unsigned long virt_addr)
{
        int *reloc;
        unsigned long delta, map, ptr;
        unsigned long min_addr = (unsigned long)output;
        unsigned long max_addr = min_addr + (VO___bss_start - VO__text);

        /*
         * Calculate the delta between where vmlinux was linked to load
         * and where it was actually loaded.
         */
        delta = min_addr - LOAD_PHYSICAL_ADDR;

        /*
         * The kernel contains a table of relocation addresses. Those
         * addresses have the final load address of the kernel in virtual
         * memory. We are currently working in the self map. So we need to
         * create an adjustment for kernel memory addresses to the self map.
         * This will involve subtracting out the base address of the kernel.
         */
        map = delta - __START_KERNEL_map;

        /*
         * 32-bit always performs relocations. 64-bit relocations are only
         * needed if KASLR has chosen a different starting address offset
         * from __START_KERNEL_map.
         */
        if (IS_ENABLED(CONFIG_X86_64))
                delta = virt_addr - LOAD_PHYSICAL_ADDR;

        if (!delta) {
                debug_putstr("No relocation needed... ");
                return;
        }
        debug_putstr("Performing relocations... ");

The first use of delta is to figure out the correct relocation offset,
which is based on the relocation tables, etc.

The second use is for decoupled phys/virt (x86_64 only), but I don't
think these need to be separate, do they? I think "delta" should only
ever operate on the virt_addr, yes?

in misc.c, this chooses a physical location and a virtual:

        choose_random_location((unsigned long)input_data, input_len,
                                (unsigned long *)&output,
                                max(output_len, kernel_total_size),
                                &virt_addr);

Then we place the kernel in the physical location:

        __decompress(input_data, input_len, NULL, NULL, output, output_len,
                        NULL, error);

And finally handle relocations:

        handle_relocations(output, output_len, virt_addr);

"output" is the identity-mapped physical (and possibly virtual)
location, and virt_addr is the virtual.

So, it looks to me like the calculations at the top of
handle_relocations() may not be correct for the case you're hitting?

What are the values there in that case? We must not hard-code output,
so handle_relocations() has to be entirely self-contained with its
logic.

I actually think the bug might be that choose_random_location() in the
nokaslr case doesn't explicitly set virt_addr to output, like the
no-op does:

void choose_random_location(unsigned long input,
                            unsigned long input_size,
                            unsigned long *output,
                            unsigned long output_size,
                            unsigned long *virt_addr)
{
        unsigned long random_addr, min_addr;

        if (cmdline_find_option_bool("nokaslr")) {
                warn("KASLR disabled: 'nokaslr' on cmdline.");
                return;
        }

Perhaps we need a few fixes:

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 91f27ab970ef..e1e57c626093 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -627,6 +627,7 @@ void choose_random_location(unsigned long input,

        if (cmdline_find_option_bool("nokaslr")) {
                warn("KASLR disabled: 'nokaslr' on cmdline.");
+               *virt_addr = *output;
                return;
        }

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index 00241c815524..3eb754e4f823 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -198,12 +198,13 @@ static void handle_relocations(void *output,
unsigned long output_len,
         * needed if KASLR has chosen a different starting address offset
         * from __START_KERNEL_map.
         */
-       if (IS_ENABLED(CONFIG_X86_64))
+       if (IS_ENABLED(CONFIG_X86_64)) {
                delta = virt_addr - LOAD_PHYSICAL_ADDR;

-       if (!delta) {
-               debug_putstr("No relocation needed... ");
-               return;
+               if (!delta) {
+                       debug_putstr("No relocation needed... ");
+                       return;
+               }
        }
        debug_putstr("Performing relocations... ");





-- 
Kees Cook
Pixel Security

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case
  2017-06-27 23:24   ` Baoquan He
@ 2017-06-27 23:33     ` Baoquan He
  2017-07-05 19:06     ` Kees Cook
  1 sibling, 0 replies; 12+ messages in thread
From: Baoquan He @ 2017-06-27 23:33 UTC (permalink / raw)
  To: Kees Cook
  Cc: LKML, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86,
	Kees Cook <keescook@chromium.org, Yinghai Lu, Arnd Bergmann,
	Thomas Garnier

On 06/28/17 at 07:24am, Baoquan He wrote:
> Hi Kees,
> 
> On 06/27/17 at 03:42pm, Kees Cook wrote:
> > On Sat, Jun 24, 2017 at 7:25 AM, Baoquan He <bhe@redhat.com> wrote:
> > > Kdump kernel will reset to firmware after crash is trigered when
> > > crashkernel=xxM,high is added to kernel command line. Kexec has the
> > > same phenomenon. This only happened on system with kaslr code
> > > compiled in and kernel option 'nokaslr'is added. Both of them works
> > > well when kaslr is enabled.
> > >
> > > When crashkernel high is set or kexec case, kexec/kdump kernel will be
> > > put above 4G. Since we assign the original loading address of kernel to
> > > virt_addr as initial value, the virt_addr will be larger than 1G if kaslr
> > > is disabled, it exceeds the kernel mapping size which is only 1G. Then
> > > it will cause relocation handling error in handle_relocations().
> > >
> > > In fact this is a known issue and fixed in commit:
> > >
> > > ... f285f4a ("x86, boot: Skip relocs when load address unchanged")
> > >
> > > But above fix was lost carelessly in later commit:
> > >
> > > ... 8391c73 ("x86/KASLR: Randomize virtual address separately")
> > 
> > Hrm, this changes more than just fixing this missing logic. How about
> > just simply:
> 
> Below code was added to fix the kexec/kdump kernel with kaslr disabled,
> at that time kernel kaslr physical address and virtual address
> randomization are coupled. What it was doing is to randomize physical
> address in 1G range and add the delta onto the starting address of
> virtual address, 0xffffffff80000000.
> 
> But now the physical and virtual address randomization has been
> separated. It means that whether each of them is changed or not
> randomly, randomization wont' be impacted. So below change you
		      ~~^ of the other one
> provided will has two problems:
> 
> 1st, the 'virt_addr' represents the offset of virtual address
> randomization between 0xffffffff80000000 and 0xffffffffc0000000, should
> not get a initial value '(unsigned long)output'. The 'output' could be
> any value between 16M and 64T. It makes no sense to make the assignment
> and could bring confusion to code readers.
> 
> 2nd, for x86 64, we do the relocation handling if and only if virtual
> address is randomized to a different value, namely the offset 'virt_addr'
> has a value which is not 16M. You can see this in handle_relocations().
> 
> 	if (IS_ENABLED(CONFIG_X86_64))  
>                 delta = virt_addr - LOAD_PHYSICAL_ADDR; 
>   
>         if (!delta) {          
>                 debug_putstr("No relocation needed... ");
>                 return;
>         }
> Now below code will bring a bug that if physical address randomization fail
> to get a different value, but virtual address randomization may succeed to
> get one, then it won't do relocation handling. This contradicts with the
> design and implementation of the current code.
> 
> > 
> > diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> > index b3c5a5f030ce..4496a05d1f8a 100644
> > --- a/arch/x86/boot/compressed/misc.c
> > +++ b/arch/x86/boot/compressed/misc.c
> > @@ -339,6 +339,7 @@ asmlinkage __visible void *extract_kernel(void
> > *rmode, memptr heap,
> >  {
> >         const unsigned long kernel_total_size = VO__end - VO__text;
> >         unsigned long virt_addr = (unsigned long)output;
> > +       unsigned long virt_addr_orig = virt_addr;
> > 
> >         /* Retain x86 boot parameters pointer passed from startup_32/64. */
> >         boot_params = rmode;
> > @@ -405,7 +406,12 @@ asmlinkage __visible void *extract_kernel(void
> > *rmode, memptr heap,
> >         __decompress(input_data, input_len, NULL, NULL, output, output_len,
> >                         NULL, error);
> >         parse_elf(output);
> > -       handle_relocations(output, output_len, virt_addr);
> > +       /*
> > +        * 32-bit always performs relocations. 64-bit relocations are only
> > +        * needed if kASLR has chosen a different load address.
> > +        */
> > +       if (!IS_ENABLED(CONFIG_X86_64) || virt_addr != virt_addr_orig)
> > +               handle_relocations(output, output_len, virt_addr);
> >         debug_putstr("done.\nBooting the kernel.\n");
> >         return output;
> >  }
> > 
> > > diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> > > index b3c5a5f0..c945acd 100644
> > > --- a/arch/x86/boot/compressed/misc.c
> > > +++ b/arch/x86/boot/compressed/misc.c
> > > @@ -338,7 +338,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
> > >                                   unsigned long output_len)
> > >  {
> > >         const unsigned long kernel_total_size = VO__end - VO__text;
> > > -       unsigned long virt_addr = (unsigned long)output;
> > > +       unsigned long virt_addr = LOAD_PHYSICAL_ADDR;
> > 
> > I don't like this being hard-coded. The logic for the output address
> > is already handled in the .S files, and may not be LOAD_PHYSICAL_ADDR.
> > I'd prefer just adding back the simple test of virt_addr changing.
> 
> Do you mean the handling in boot/compressed/head_64.S? Whatever it does,
> it's only for physical address. The virtual address mapping is not
> touched. Here virt_addr respresents the offset between
> 0xffffffff80000000, 0xffffffffc0000000. And if skip the relocation
> handling, we can see that the '_text' will be mapped to
> 0xffffffff81000000 forever, no matter where physical address of '_text'
> is. So here LOAD_PHYSICAL_ADDR is default value of 'virt_addr'.
> 
> > >
> > >         /* Retain x86 boot parameters pointer passed from startup_32/64. */
> > >         boot_params = rmode;
> > > @@ -397,7 +397,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
> > >  #ifndef CONFIG_RELOCATABLE
> > >         if ((unsigned long)output != LOAD_PHYSICAL_ADDR)
> > >                 error("Destination address does not match LOAD_PHYSICAL_ADDR");
> > > -       if ((unsigned long)output != virt_addr)
> > > +       if (virt_addr != LOAD_PHYSICAL_ADDR)
> > >                 error("Destination virtual address changed when not relocatable");
> > >  #endif
> > >
> > > diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> > > index 1c8355e..766a521 100644
> > > --- a/arch/x86/boot/compressed/misc.h
> > > +++ b/arch/x86/boot/compressed/misc.h
> > > @@ -81,8 +81,6 @@ static inline void choose_random_location(unsigned long input,
> > >                                           unsigned long output_size,
> > >                                           unsigned long *virt_addr)
> > >  {
> > > -       /* No change from existing output location. */
> > > -       *virt_addr = *output;
> > >  }
> > >  #endif
> > >
> > > --
> > > 2.5.5
> > >
> > 
> > 
> > 
> > -- 
> > Kees Cook
> > Pixel Security

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case
  2017-06-27 22:42 ` Kees Cook
@ 2017-06-27 23:24   ` Baoquan He
  2017-06-27 23:33     ` Baoquan He
  2017-07-05 19:06     ` Kees Cook
  0 siblings, 2 replies; 12+ messages in thread
From: Baoquan He @ 2017-06-27 23:24 UTC (permalink / raw)
  To: Kees Cook
  Cc: LKML, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86,
	Kees Cook <keescook@chromium.org, Yinghai Lu, Arnd Bergmann,
	Thomas Garnier

Hi Kees,

On 06/27/17 at 03:42pm, Kees Cook wrote:
> On Sat, Jun 24, 2017 at 7:25 AM, Baoquan He <bhe@redhat.com> wrote:
> > Kdump kernel will reset to firmware after crash is trigered when
> > crashkernel=xxM,high is added to kernel command line. Kexec has the
> > same phenomenon. This only happened on system with kaslr code
> > compiled in and kernel option 'nokaslr'is added. Both of them works
> > well when kaslr is enabled.
> >
> > When crashkernel high is set or kexec case, kexec/kdump kernel will be
> > put above 4G. Since we assign the original loading address of kernel to
> > virt_addr as initial value, the virt_addr will be larger than 1G if kaslr
> > is disabled, it exceeds the kernel mapping size which is only 1G. Then
> > it will cause relocation handling error in handle_relocations().
> >
> > In fact this is a known issue and fixed in commit:
> >
> > ... f285f4a ("x86, boot: Skip relocs when load address unchanged")
> >
> > But above fix was lost carelessly in later commit:
> >
> > ... 8391c73 ("x86/KASLR: Randomize virtual address separately")
> 
> Hrm, this changes more than just fixing this missing logic. How about
> just simply:

Below code was added to fix the kexec/kdump kernel with kaslr disabled,
at that time kernel kaslr physical address and virtual address
randomization are coupled. What it was doing is to randomize physical
address in 1G range and add the delta onto the starting address of
virtual address, 0xffffffff80000000.

But now the physical and virtual address randomization has been
separated. It means that whether each of them is changed or not
randomly, the randomization wont' be impacted. So below change you
provided will has two problems:

1st, the 'virt_addr' represents the offset of virtual address
randomization between 0xffffffff80000000 and 0xffffffffc0000000, should
not get a initial value '(unsigned long)output'. The 'output' could be
any value between 16M and 64T. It makes no sense to make the assignment
and could bring confusion to code readers.

2nd, for x86 64, we do the relocation handling if and only if virtual
address is randomized to a different value, namely the offset 'virt_addr'
has a value which is not 16M. You can see this in handle_relocations().

	if (IS_ENABLED(CONFIG_X86_64))  
                delta = virt_addr - LOAD_PHYSICAL_ADDR; 
  
        if (!delta) {          
                debug_putstr("No relocation needed... ");
                return;
        }
Now below code will bring a bug that if physical address randomization fail
to get a different value, but virtual address randomization may succeed to
get one, then it won't do relocation handling. This contradicts with the
design and implementation of the current code.

> 
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index b3c5a5f030ce..4496a05d1f8a 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -339,6 +339,7 @@ asmlinkage __visible void *extract_kernel(void
> *rmode, memptr heap,
>  {
>         const unsigned long kernel_total_size = VO__end - VO__text;
>         unsigned long virt_addr = (unsigned long)output;
> +       unsigned long virt_addr_orig = virt_addr;
> 
>         /* Retain x86 boot parameters pointer passed from startup_32/64. */
>         boot_params = rmode;
> @@ -405,7 +406,12 @@ asmlinkage __visible void *extract_kernel(void
> *rmode, memptr heap,
>         __decompress(input_data, input_len, NULL, NULL, output, output_len,
>                         NULL, error);
>         parse_elf(output);
> -       handle_relocations(output, output_len, virt_addr);
> +       /*
> +        * 32-bit always performs relocations. 64-bit relocations are only
> +        * needed if kASLR has chosen a different load address.
> +        */
> +       if (!IS_ENABLED(CONFIG_X86_64) || virt_addr != virt_addr_orig)
> +               handle_relocations(output, output_len, virt_addr);
>         debug_putstr("done.\nBooting the kernel.\n");
>         return output;
>  }
> 
> > diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> > index b3c5a5f0..c945acd 100644
> > --- a/arch/x86/boot/compressed/misc.c
> > +++ b/arch/x86/boot/compressed/misc.c
> > @@ -338,7 +338,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
> >                                   unsigned long output_len)
> >  {
> >         const unsigned long kernel_total_size = VO__end - VO__text;
> > -       unsigned long virt_addr = (unsigned long)output;
> > +       unsigned long virt_addr = LOAD_PHYSICAL_ADDR;
> 
> I don't like this being hard-coded. The logic for the output address
> is already handled in the .S files, and may not be LOAD_PHYSICAL_ADDR.
> I'd prefer just adding back the simple test of virt_addr changing.

Do you mean the handling in boot/compressed/head_64.S? Whatever it does,
it's only for physical address. The virtual address mapping is not
touched. Here virt_addr respresents the offset between
0xffffffff80000000, 0xffffffffc0000000. And if skip the relocation
handling, we can see that the '_text' will be mapped to
0xffffffff81000000 forever, no matter where physical address of '_text'
is. So here LOAD_PHYSICAL_ADDR is default value of 'virt_addr'.

> >
> >         /* Retain x86 boot parameters pointer passed from startup_32/64. */
> >         boot_params = rmode;
> > @@ -397,7 +397,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
> >  #ifndef CONFIG_RELOCATABLE
> >         if ((unsigned long)output != LOAD_PHYSICAL_ADDR)
> >                 error("Destination address does not match LOAD_PHYSICAL_ADDR");
> > -       if ((unsigned long)output != virt_addr)
> > +       if (virt_addr != LOAD_PHYSICAL_ADDR)
> >                 error("Destination virtual address changed when not relocatable");
> >  #endif
> >
> > diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> > index 1c8355e..766a521 100644
> > --- a/arch/x86/boot/compressed/misc.h
> > +++ b/arch/x86/boot/compressed/misc.h
> > @@ -81,8 +81,6 @@ static inline void choose_random_location(unsigned long input,
> >                                           unsigned long output_size,
> >                                           unsigned long *virt_addr)
> >  {
> > -       /* No change from existing output location. */
> > -       *virt_addr = *output;
> >  }
> >  #endif
> >
> > --
> > 2.5.5
> >
> 
> 
> 
> -- 
> Kees Cook
> Pixel Security

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case
       [not found] <1498314309-18502-1-git-send-email-bhe@redhat.com>
  2017-06-26  9:47 ` Ingo Molnar
@ 2017-06-27 22:42 ` Kees Cook
  2017-06-27 23:24   ` Baoquan He
  1 sibling, 1 reply; 12+ messages in thread
From: Kees Cook @ 2017-06-27 22:42 UTC (permalink / raw)
  To: Baoquan He
  Cc: LKML, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86,
	Kees Cook <keescook@chromium.org, Yinghai Lu, Arnd Bergmann,
	Thomas Garnier

On Sat, Jun 24, 2017 at 7:25 AM, Baoquan He <bhe@redhat.com> wrote:
> Kdump kernel will reset to firmware after crash is trigered when
> crashkernel=xxM,high is added to kernel command line. Kexec has the
> same phenomenon. This only happened on system with kaslr code
> compiled in and kernel option 'nokaslr'is added. Both of them works
> well when kaslr is enabled.
>
> When crashkernel high is set or kexec case, kexec/kdump kernel will be
> put above 4G. Since we assign the original loading address of kernel to
> virt_addr as initial value, the virt_addr will be larger than 1G if kaslr
> is disabled, it exceeds the kernel mapping size which is only 1G. Then
> it will cause relocation handling error in handle_relocations().
>
> In fact this is a known issue and fixed in commit:
>
> ... f285f4a ("x86, boot: Skip relocs when load address unchanged")
>
> But above fix was lost carelessly in later commit:
>
> ... 8391c73 ("x86/KASLR: Randomize virtual address separately")

Hrm, this changes more than just fixing this missing logic. How about
just simply:

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b3c5a5f030ce..4496a05d1f8a 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -339,6 +339,7 @@ asmlinkage __visible void *extract_kernel(void
*rmode, memptr heap,
 {
        const unsigned long kernel_total_size = VO__end - VO__text;
        unsigned long virt_addr = (unsigned long)output;
+       unsigned long virt_addr_orig = virt_addr;

        /* Retain x86 boot parameters pointer passed from startup_32/64. */
        boot_params = rmode;
@@ -405,7 +406,12 @@ asmlinkage __visible void *extract_kernel(void
*rmode, memptr heap,
        __decompress(input_data, input_len, NULL, NULL, output, output_len,
                        NULL, error);
        parse_elf(output);
-       handle_relocations(output, output_len, virt_addr);
+       /*
+        * 32-bit always performs relocations. 64-bit relocations are only
+        * needed if kASLR has chosen a different load address.
+        */
+       if (!IS_ENABLED(CONFIG_X86_64) || virt_addr != virt_addr_orig)
+               handle_relocations(output, output_len, virt_addr);
        debug_putstr("done.\nBooting the kernel.\n");
        return output;
 }


>
> To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as initial value.
> If no kaslr is taken, it will skip the relocation handling and jump out.
>
> Fixes: 8391c73 ("x86/KASLR: Randomize virtual address separately")
> Tested-by: Dave Young <dyoung@redhat.com>
> Signed-off-by: Baoquan He <bhe@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: x86@kernel.org
> Cc: Kees Cook <keescook@chromium.org
> Cc: Yinghai Lu <yinghai@kernel.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Thomas Garnier <thgarnie@google.com>
> ---
>  arch/x86/boot/compressed/kaslr.c | 3 ---
>  arch/x86/boot/compressed/misc.c  | 4 ++--
>  arch/x86/boot/compressed/misc.h  | 2 --
>  3 files changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
> index fe318b4..91f27ab 100644
> --- a/arch/x86/boot/compressed/kaslr.c
> +++ b/arch/x86/boot/compressed/kaslr.c
> @@ -625,9 +625,6 @@ void choose_random_location(unsigned long input,
>  {
>         unsigned long random_addr, min_addr;
>
> -       /* By default, keep output position unchanged. */
> -       *virt_addr = *output;
> -
>         if (cmdline_find_option_bool("nokaslr")) {
>                 warn("KASLR disabled: 'nokaslr' on cmdline.");
>                 return;
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index b3c5a5f0..c945acd 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -338,7 +338,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
>                                   unsigned long output_len)
>  {
>         const unsigned long kernel_total_size = VO__end - VO__text;
> -       unsigned long virt_addr = (unsigned long)output;
> +       unsigned long virt_addr = LOAD_PHYSICAL_ADDR;

I don't like this being hard-coded. The logic for the output address
is already handled in the .S files, and may not be LOAD_PHYSICAL_ADDR.
I'd prefer just adding back the simple test of virt_addr changing.

-Kees

>
>         /* Retain x86 boot parameters pointer passed from startup_32/64. */
>         boot_params = rmode;
> @@ -397,7 +397,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
>  #ifndef CONFIG_RELOCATABLE
>         if ((unsigned long)output != LOAD_PHYSICAL_ADDR)
>                 error("Destination address does not match LOAD_PHYSICAL_ADDR");
> -       if ((unsigned long)output != virt_addr)
> +       if (virt_addr != LOAD_PHYSICAL_ADDR)
>                 error("Destination virtual address changed when not relocatable");
>  #endif
>
> diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> index 1c8355e..766a521 100644
> --- a/arch/x86/boot/compressed/misc.h
> +++ b/arch/x86/boot/compressed/misc.h
> @@ -81,8 +81,6 @@ static inline void choose_random_location(unsigned long input,
>                                           unsigned long output_size,
>                                           unsigned long *virt_addr)
>  {
> -       /* No change from existing output location. */
> -       *virt_addr = *output;
>  }
>  #endif
>
> --
> 2.5.5
>



-- 
Kees Cook
Pixel Security

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case
  2017-06-27  8:34     ` Ingo Molnar
@ 2017-06-27  8:55       ` Baoquan He
  0 siblings, 0 replies; 12+ messages in thread
From: Baoquan He @ 2017-06-27  8:55 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Thomas Gleixner, H. Peter Anvin

On 06/27/17 at 10:34am, Ingo Molnar wrote:
> 
> * Baoquan He <bhe@redhat.com> wrote:
> 
> > As you suggested, we can add a checking to see if the virt_addr is
> > bigger than 1G, and print warning if exceed or hang there with error
> > message.
> 
> Could you try a patch for that, and see whether it catches this particular bug? 
> (before the fix is applied.)

Sure, below code change should catch it. Just I am struggling to decide
if I should add a new local variable and assign
 max(output_len, kernel_total_size) to it, and the name of the new local
variable is really hard to choose. Let me run a test on below code.

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index c945acd8fa33..00241c815524 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -390,6 +390,8 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 #ifdef CONFIG_X86_64
 	if (heap > 0x3fffffffffffUL)
 		error("Destination address too large");
+	if (virt_addr + max(output_len, kernel_total_size) > KERNEL_IMAGE_SIZE)
+		error("Destination virtual address is beyond the kernel mapping area");
 #else
 	if (heap > ((-__PAGE_OFFSET-(128<<20)-1) & 0x7fffffff))
 		error("Destination address too large");

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case
  2017-06-26 10:43   ` Baoquan He
@ 2017-06-27  8:34     ` Ingo Molnar
  2017-06-27  8:55       ` Baoquan He
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2017-06-27  8:34 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-kernel, Thomas Gleixner, H. Peter Anvin


* Baoquan He <bhe@redhat.com> wrote:

> As you suggested, we can add a checking to see if the virt_addr is
> bigger than 1G, and print warning if exceed or hang there with error
> message.

Could you try a patch for that, and see whether it catches this particular bug? 
(before the fix is applied.)

Could be a 2 patch series: first the patch that adds the warning, then the fix.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case
  2017-06-26  9:47 ` Ingo Molnar
@ 2017-06-26 10:43   ` Baoquan He
  2017-06-27  8:34     ` Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Baoquan He @ 2017-06-26 10:43 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Thomas Gleixner, H. Peter Anvin

Hi Ingo,

Thanks for looking into this patch!

On 06/26/17 at 11:47am, Ingo Molnar wrote:
> 
> * Baoquan He <bhe@redhat.com> wrote:
> 
> > Kdump kernel will reset to firmware after crash is trigered when
> > crashkernel=xxM,high is added to kernel command line. Kexec has the
> > same phenomenon. This only happened on system with kaslr code
> > compiled in and kernel option 'nokaslr'is added. Both of them works
> > well when kaslr is enabled.
> > 
> > When crashkernel high is set or kexec case, kexec/kdump kernel will be
> > put above 4G. Since we assign the original loading address of kernel to
> > virt_addr as initial value, the virt_addr will be larger than 1G if kaslr
> > is disabled, it exceeds the kernel mapping size which is only 1G. Then
> > it will cause relocation handling error in handle_relocations().
> 
> So instead of whacking yet another kexec mole, how could we turn this into a more 
> debuggable warning (either during build or during the failed bootup) instead of a 
> crash and reset (triple fault?) back to the BIOS screen?

This is a good question.

In fact this might not be kexe only problem. It's actually a code bug.
For x86_64, kernel text kaslr is separated into physical and virtual
address randomization. And here the virtual addr randmoization, can only
be done inside [0xffffffff80000000, 0xffffffffc0000000), the 1G area.
When we do the virtual address randomization, we only randomize to get
an offset between 0 and 1G, then add this offset onto the starting
address, 0xffffffff80000000. In the current code, virt_addr represents
the offset, which is a little confusing. In my original patch, it's
named as virt_offset(the link of original patch is pasted below). Kees
helped refactor the code, that corner case could be forgotton.

https://github.com/baoquan-he/linux/commit/034fbed941c60099368a40058cdfd0b4cac76040

If from the point of view of the offset of virtual address, which is among
kernel mapping area [0xffffffff80000000, 0xffffffffc0000000), the 'output'
which is the original loading address of kernel, absolutely should not be
assigned to virt_addr (better renamed as virt_offset or something else).

Here, kexec/kdump is only a pratical use case. I know someone ever
modifed bootloader to make kernel can be loaded arbitrary address, which
is not 16M, decided at compiled time. Then kernel will fail too.

As you suggested, we can add a checking to see if the virt_addr is
bigger than 1G, and print warning if exceed or hang there with error
message.

> 
> If kexec/kdump wants to do crazy things they should at least be _debuggable_ in a 
> straightforward manner.

So, it may not be fault of kexec/kdump, it's a code bug. I guess we
allow kernel to be loaded at any address, but not the address decided at
compiled time, 16M, right?

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case
       [not found] <1498314309-18502-1-git-send-email-bhe@redhat.com>
@ 2017-06-26  9:47 ` Ingo Molnar
  2017-06-26 10:43   ` Baoquan He
  2017-06-27 22:42 ` Kees Cook
  1 sibling, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2017-06-26  9:47 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-kernel, Thomas Gleixner, H. Peter Anvin


* Baoquan He <bhe@redhat.com> wrote:

> Kdump kernel will reset to firmware after crash is trigered when
> crashkernel=xxM,high is added to kernel command line. Kexec has the
> same phenomenon. This only happened on system with kaslr code
> compiled in and kernel option 'nokaslr'is added. Both of them works
> well when kaslr is enabled.
> 
> When crashkernel high is set or kexec case, kexec/kdump kernel will be
> put above 4G. Since we assign the original loading address of kernel to
> virt_addr as initial value, the virt_addr will be larger than 1G if kaslr
> is disabled, it exceeds the kernel mapping size which is only 1G. Then
> it will cause relocation handling error in handle_relocations().

So instead of whacking yet another kexec mole, how could we turn this into a more 
debuggable warning (either during build or during the failed bootup) instead of a 
crash and reset (triple fault?) back to the BIOS screen?

If kexec/kdump wants to do crazy things they should at least be _debuggable_ in a 
straightforward manner.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-07-06 13:54 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-24 14:16 [PATCH] x86/boot/KASLR: Skip relocation handling in no kaslr case Baoquan He
2017-06-24 14:23 ` Baoquan He
     [not found] <1498314309-18502-1-git-send-email-bhe@redhat.com>
2017-06-26  9:47 ` Ingo Molnar
2017-06-26 10:43   ` Baoquan He
2017-06-27  8:34     ` Ingo Molnar
2017-06-27  8:55       ` Baoquan He
2017-06-27 22:42 ` Kees Cook
2017-06-27 23:24   ` Baoquan He
2017-06-27 23:33     ` Baoquan He
2017-07-05 19:06     ` Kees Cook
2017-07-06 13:28       ` Baoquan He
2017-07-06 13:54         ` Baoquan He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).