linux-efi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* New EFI thunk alignment WARN_ON in 5.6 triggers multiple times
@ 2020-02-12 11:44 Hans de Goede
  2020-02-12 11:53 ` Ard Biesheuvel
  0 siblings, 1 reply; 12+ messages in thread
From: Hans de Goede @ 2020-02-12 11:44 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux-efi

Hi Ard,

While booting 5.6-rc1 on one of my test machines I noticed the WARN_ON
on line 198 of arch/x86/platform/efi/efi_64.c trigger many times.

I've done some debugging on this an this is caused by the following
call path:

drivers/firmware/efi/vars.c: efivar_init():

         unsigned long variable_name_size = 1024;
         efi_char16_t *variable_name;
         efi_guid_t vendor_guid;

         variable_name = kzalloc(variable_name_size, GFP_KERNEL);
         if (!variable_name) {
                 printk(KERN_ERR "efivars: Memory allocation failed.\n");
                 return -ENOMEM;
         }

	...


         do {
                 variable_name_size = 1024;

                 status = ops->get_next_variable(&variable_name_size,
                                                 variable_name,
                                                 &vendor_guid);
	...

arch/x86/platform/efi/efi_64.c: efi_thunk_get_next_variable()

	...
         phys_vendor = virt_to_phys_or_null(vendor);
	...

arch/x86/platform/efi/efi_64.c: virt_to_phys_or_null_size()

	...
	WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
	...

Specifically the problem is that the efi_guid_t vendor_guid has an 8 bytes
aligned address and the WARN_ON checks for it being aligned to\
sizeof(efi_guid_t) which is 16 bytes.

I've fixed this for now with the following local fix, but I'm not sure
what the alignment rules actually are so I'm not sure this is correct:

--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -181,6 +181,7 @@ static inline phys_addr_t
  virt_to_phys_or_null_size(void *va, unsigned long size)
  {
  	bool bad_size;
+	int alignment;

  	if (!va)
  		return 0;
@@ -195,7 +196,8 @@ virt_to_phys_or_null_size(void *va, unsigned long size)
  	 */
  	bad_size = size > PAGE_SIZE || !is_power_of_2(size);

-	WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
+	alignment = size > 8 ? 8 : size;
+	WARN_ON(!IS_ALIGNED((unsigned long)va, alignment) || bad_size);

  	return slow_virt_to_phys(va);
  }


I have a feeling that this is the right thing to do, but as said I'm not 100%
sure. If you can confirm that this is the right fix, then I can submit this
upstream.

Regards,

Hans


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: New EFI thunk alignment WARN_ON in 5.6 triggers multiple times
  2020-02-12 11:44 New EFI thunk alignment WARN_ON in 5.6 triggers multiple times Hans de Goede
@ 2020-02-12 11:53 ` Ard Biesheuvel
  2020-02-12 15:00   ` Hans de Goede
  0 siblings, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2020-02-12 11:53 UTC (permalink / raw)
  To: Hans de Goede; +Cc: linux-efi

On Wed, 12 Feb 2020 at 12:44, Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi Ard,
>
> While booting 5.6-rc1 on one of my test machines I noticed the WARN_ON
> on line 198 of arch/x86/platform/efi/efi_64.c trigger many times.
>
> I've done some debugging on this an this is caused by the following
> call path:
>
> drivers/firmware/efi/vars.c: efivar_init():
>
>          unsigned long variable_name_size = 1024;
>          efi_char16_t *variable_name;
>          efi_guid_t vendor_guid;
>
>          variable_name = kzalloc(variable_name_size, GFP_KERNEL);
>          if (!variable_name) {
>                  printk(KERN_ERR "efivars: Memory allocation failed.\n");
>                  return -ENOMEM;
>          }
>
>         ...
>
>
>          do {
>                  variable_name_size = 1024;
>
>                  status = ops->get_next_variable(&variable_name_size,
>                                                  variable_name,
>                                                  &vendor_guid);
>         ...
>
> arch/x86/platform/efi/efi_64.c: efi_thunk_get_next_variable()
>
>         ...
>          phys_vendor = virt_to_phys_or_null(vendor);
>         ...
>
> arch/x86/platform/efi/efi_64.c: virt_to_phys_or_null_size()
>
>         ...
>         WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
>         ...
>
> Specifically the problem is that the efi_guid_t vendor_guid has an 8 bytes
> aligned address and the WARN_ON checks for it being aligned to\
> sizeof(efi_guid_t) which is 16 bytes.
>
> I've fixed this for now with the following local fix, but I'm not sure
> what the alignment rules actually are so I'm not sure this is correct:
>
> --- a/arch/x86/platform/efi/efi_64.c
> +++ b/arch/x86/platform/efi/efi_64.c
> @@ -181,6 +181,7 @@ static inline phys_addr_t
>   virt_to_phys_or_null_size(void *va, unsigned long size)
>   {
>         bool bad_size;
> +       int alignment;
>
>         if (!va)
>                 return 0;
> @@ -195,7 +196,8 @@ virt_to_phys_or_null_size(void *va, unsigned long size)
>          */
>         bad_size = size > PAGE_SIZE || !is_power_of_2(size);
>
> -       WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> +       alignment = size > 8 ? 8 : size;
> +       WARN_ON(!IS_ALIGNED((unsigned long)va, alignment) || bad_size);
>
>         return slow_virt_to_phys(va);
>   }
>
>
> I have a feeling that this is the right thing to do, but as said I'm not 100%
> sure. If you can confirm that this is the right fix, then I can submit this
> upstream.
>


It seems that the purpose of the alignment check is to ensure that the
data does not cross a page boundary, so that the data is guaranteed to
be contiguous in the physical address space as well.

So in that light, the fix is most definitely wrong, although I am not
sure how this is supposed to work generally.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: New EFI thunk alignment WARN_ON in 5.6 triggers multiple times
  2020-02-12 11:53 ` Ard Biesheuvel
@ 2020-02-12 15:00   ` Hans de Goede
  2020-02-12 21:08     ` Ard Biesheuvel
  2020-02-12 21:57     ` Arvind Sankar
  0 siblings, 2 replies; 12+ messages in thread
From: Hans de Goede @ 2020-02-12 15:00 UTC (permalink / raw)
  To: Ard Biesheuvel, Andy Lutomirski, Matt Fleming; +Cc: linux-efi

Hi,

On 2/12/20 12:53 PM, Ard Biesheuvel wrote:
> On Wed, 12 Feb 2020 at 12:44, Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi Ard,
>>
>> While booting 5.6-rc1 on one of my test machines I noticed the WARN_ON
>> on line 198 of arch/x86/platform/efi/efi_64.c trigger many times.
>>
>> I've done some debugging on this an this is caused by the following
>> call path:
>>
>> drivers/firmware/efi/vars.c: efivar_init():
>>
>>           unsigned long variable_name_size = 1024;
>>           efi_char16_t *variable_name;
>>           efi_guid_t vendor_guid;
>>
>>           variable_name = kzalloc(variable_name_size, GFP_KERNEL);
>>           if (!variable_name) {
>>                   printk(KERN_ERR "efivars: Memory allocation failed.\n");
>>                   return -ENOMEM;
>>           }
>>
>>          ...
>>
>>
>>           do {
>>                   variable_name_size = 1024;
>>
>>                   status = ops->get_next_variable(&variable_name_size,
>>                                                   variable_name,
>>                                                   &vendor_guid);
>>          ...
>>
>> arch/x86/platform/efi/efi_64.c: efi_thunk_get_next_variable()
>>
>>          ...
>>           phys_vendor = virt_to_phys_or_null(vendor);
>>          ...
>>
>> arch/x86/platform/efi/efi_64.c: virt_to_phys_or_null_size()
>>
>>          ...
>>          WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
>>          ...
>>
>> Specifically the problem is that the efi_guid_t vendor_guid has an 8 bytes
>> aligned address and the WARN_ON checks for it being aligned to\
>> sizeof(efi_guid_t) which is 16 bytes.
>>
>> I've fixed this for now with the following local fix, but I'm not sure
>> what the alignment rules actually are so I'm not sure this is correct:
>>
>> --- a/arch/x86/platform/efi/efi_64.c
>> +++ b/arch/x86/platform/efi/efi_64.c
>> @@ -181,6 +181,7 @@ static inline phys_addr_t
>>    virt_to_phys_or_null_size(void *va, unsigned long size)
>>    {
>>          bool bad_size;
>> +       int alignment;
>>
>>          if (!va)
>>                  return 0;
>> @@ -195,7 +196,8 @@ virt_to_phys_or_null_size(void *va, unsigned long size)
>>           */
>>          bad_size = size > PAGE_SIZE || !is_power_of_2(size);
>>
>> -       WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
>> +       alignment = size > 8 ? 8 : size;
>> +       WARN_ON(!IS_ALIGNED((unsigned long)va, alignment) || bad_size);
>>
>>          return slow_virt_to_phys(va);
>>    }
>>
>>
>> I have a feeling that this is the right thing to do, but as said I'm not 100%
>> sure. If you can confirm that this is the right fix, then I can submit this
>> upstream.
>>
> 
> 
> It seems that the purpose of the alignment check is to ensure that the
> data does not cross a page boundary, so that the data is guaranteed to
> be contiguous in the physical address space as well.
> 
> So in that light, the fix is most definitely wrong, although I am not
> sure how this is supposed to work generally.

I'm not sure that is what it is trying to check, if that is what it is
trying to check then the code is certainly wrong.

Let me first quote the entire check:

         /*
          * A fully aligned variable on the stack is guaranteed not to
          * cross a page bounary. Try to catch strings on the stack by
          * checking that 'size' is a power of two.
          */
         bad_size = size > PAGE_SIZE || !is_power_of_2(size);

         WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);

AFAIK EFI is using the identity map, and the kernel stack is
physically contiguous, so crossing a page boundary should not a
problem. Also notice how the bad_size thing is talking about
page boundary crossing, but the thing triggering is the
IS_ALIGNED check. AFAIK there is no requirement for a struct, e.g.
an UUID (which is the problem here) to be aligned to its size,
it just needs to be 8 byte / 64 bit aligned, which it is yet
the IS_ALIGNED check is failing because it is checking for
a larger, in this case twice as large, but possibly it will
end up checking for a much larger alignment.

As said I'm not exactly familiar with the alignment requirements
but the current check certainly seems wrong.

It seems that we have had this around for a while, it stems from:

###
 From f6697df36bdf0bf7fce984605c2918d4a7b4269f Mon Sep 17 00:00:00 2001
From: Matt Fleming <matt@codeblueprint.co.uk>
Date: Sat, 12 Nov 2016 21:04:24 +0000
Subject: [PATCH v3] x86/efi: Prevent mixed mode boot corruption with
  CONFIG_VMAP_STACK=y

Booting an EFI mixed mode kernel has been crashing since commit:

   e37e43a497d5 ("x86/mm/64: Enable vmapped stacks (CONFIG_HAVE_ARCH_VMAP_STACK=y

The user-visible effect in my test setup was the kernel being unable
to find the root file system ramdisk. This was likely caused by silent
memory or page table corruption.

Enabling CONFIG_DEBUG_VIRTUAL=y immediately flagged the thunking code as
abusing virt_to_phys() because it was passing addresses that were not
part of the kernel direct mapping.

Use the slow version instead, which correctly handles all memory
regions by performing a page table walk.

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
###

But I guess the recent changes are triggering the alignment check now.

Alternatively we could make the GUID kalloc-ed instead of having it
on the stack, but that seems unnecessary.

Andy, Matt, do you have any input on this?

Regards,

Hans



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: New EFI thunk alignment WARN_ON in 5.6 triggers multiple times
  2020-02-12 15:00   ` Hans de Goede
@ 2020-02-12 21:08     ` Ard Biesheuvel
  2020-02-13  7:36       ` Hans de Goede
  2020-02-12 21:57     ` Arvind Sankar
  1 sibling, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2020-02-12 21:08 UTC (permalink / raw)
  To: Hans de Goede; +Cc: Andy Lutomirski, Matt Fleming, linux-efi

On Wed, 12 Feb 2020 at 16:00, Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 2/12/20 12:53 PM, Ard Biesheuvel wrote:
> > On Wed, 12 Feb 2020 at 12:44, Hans de Goede <hdegoede@redhat.com> wrote:
> >>
> >> Hi Ard,
> >>
> >> While booting 5.6-rc1 on one of my test machines I noticed the WARN_ON
> >> on line 198 of arch/x86/platform/efi/efi_64.c trigger many times.
> >>
> >> I've done some debugging on this an this is caused by the following
> >> call path:
> >>
> >> drivers/firmware/efi/vars.c: efivar_init():
> >>
> >>           unsigned long variable_name_size = 1024;
> >>           efi_char16_t *variable_name;
> >>           efi_guid_t vendor_guid;
> >>
> >>           variable_name = kzalloc(variable_name_size, GFP_KERNEL);
> >>           if (!variable_name) {
> >>                   printk(KERN_ERR "efivars: Memory allocation failed.\n");
> >>                   return -ENOMEM;
> >>           }
> >>
> >>          ...
> >>
> >>
> >>           do {
> >>                   variable_name_size = 1024;
> >>
> >>                   status = ops->get_next_variable(&variable_name_size,
> >>                                                   variable_name,
> >>                                                   &vendor_guid);
> >>          ...
> >>
> >> arch/x86/platform/efi/efi_64.c: efi_thunk_get_next_variable()
> >>
> >>          ...
> >>           phys_vendor = virt_to_phys_or_null(vendor);
> >>          ...
> >>
> >> arch/x86/platform/efi/efi_64.c: virt_to_phys_or_null_size()
> >>
> >>          ...
> >>          WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> >>          ...
> >>
> >> Specifically the problem is that the efi_guid_t vendor_guid has an 8 bytes
> >> aligned address and the WARN_ON checks for it being aligned to\
> >> sizeof(efi_guid_t) which is 16 bytes.
> >>
> >> I've fixed this for now with the following local fix, but I'm not sure
> >> what the alignment rules actually are so I'm not sure this is correct:
> >>
> >> --- a/arch/x86/platform/efi/efi_64.c
> >> +++ b/arch/x86/platform/efi/efi_64.c
> >> @@ -181,6 +181,7 @@ static inline phys_addr_t
> >>    virt_to_phys_or_null_size(void *va, unsigned long size)
> >>    {
> >>          bool bad_size;
> >> +       int alignment;
> >>
> >>          if (!va)
> >>                  return 0;
> >> @@ -195,7 +196,8 @@ virt_to_phys_or_null_size(void *va, unsigned long size)
> >>           */
> >>          bad_size = size > PAGE_SIZE || !is_power_of_2(size);
> >>
> >> -       WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> >> +       alignment = size > 8 ? 8 : size;
> >> +       WARN_ON(!IS_ALIGNED((unsigned long)va, alignment) || bad_size);
> >>
> >>          return slow_virt_to_phys(va);
> >>    }
> >>
> >>
> >> I have a feeling that this is the right thing to do, but as said I'm not 100%
> >> sure. If you can confirm that this is the right fix, then I can submit this
> >> upstream.
> >>
> >
> >
> > It seems that the purpose of the alignment check is to ensure that the
> > data does not cross a page boundary, so that the data is guaranteed to
> > be contiguous in the physical address space as well.
> >
> > So in that light, the fix is most definitely wrong, although I am not
> > sure how this is supposed to work generally.
>
> I'm not sure that is what it is trying to check, if that is what it is
> trying to check then the code is certainly wrong.
>
> Let me first quote the entire check:
>
>          /*
>           * A fully aligned variable on the stack is guaranteed not to
>           * cross a page bounary. Try to catch strings on the stack by
>           * checking that 'size' is a power of two.
>           */
>          bad_size = size > PAGE_SIZE || !is_power_of_2(size);
>
>          WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
>
> AFAIK EFI is using the identity map, and the kernel stack is
> physically contiguous, so crossing a page boundary should not a
> problem.

CONFIG_HAVE_ARCH_VMAP_STACK=y means the kernel stack may not be
contiguous in physical memory, which is why this was added in the
first place.

We do allocate a special stack for mixed mode, but we only switch to
it in the .S thunking code, so at this point, we are still running
from the vmap'ed stack

> Also notice how the bad_size thing is talking about
> page boundary crossing, but the thing triggering is the
> IS_ALIGNED check. AFAIK there is no requirement for a struct, e.g.
> an UUID (which is the problem here) to be aligned to its size,
> it just needs to be 8 byte / 64 bit aligned, which it is yet
> the IS_ALIGNED check is failing because it is checking for
> a larger, in this case twice as large, but possibly it will
> end up checking for a much larger alignment.
>

The idea is that a data structure of size < PAGE_SIZE is guaranteed
not to cross a page boundary if it is aligned to a power-of-2 upper
bound of its size. This has nothing to do with the data type or the
minimal alignment of any of its constituent parts.

> As said I'm not exactly familiar with the alignment requirements
> but the current check certainly seems wrong.
>
> It seems that we have had this around for a while, it stems from:
>
> ###
>  From f6697df36bdf0bf7fce984605c2918d4a7b4269f Mon Sep 17 00:00:00 2001
> From: Matt Fleming <matt@codeblueprint.co.uk>
> Date: Sat, 12 Nov 2016 21:04:24 +0000
> Subject: [PATCH v3] x86/efi: Prevent mixed mode boot corruption with
>   CONFIG_VMAP_STACK=y
>
> Booting an EFI mixed mode kernel has been crashing since commit:
>
>    e37e43a497d5 ("x86/mm/64: Enable vmapped stacks (CONFIG_HAVE_ARCH_VMAP_STACK=y
>
> The user-visible effect in my test setup was the kernel being unable
> to find the root file system ramdisk. This was likely caused by silent
> memory or page table corruption.
>
> Enabling CONFIG_DEBUG_VIRTUAL=y immediately flagged the thunking code as
> abusing virt_to_phys() because it was passing addresses that were not
> part of the kernel direct mapping.
>
> Use the slow version instead, which correctly handles all memory
> regions by performing a page table walk.
>
> Suggested-by: Andy Lutomirski <luto@amacapital.net>
> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
> ###
>
> But I guess the recent changes are triggering the alignment check now.
>
> Alternatively we could make the GUID kalloc-ed instead of having it
> on the stack, but that seems unnecessary.
>
> Andy, Matt, do you have any input on this?
>
> Regards,
>
> Hans
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: New EFI thunk alignment WARN_ON in 5.6 triggers multiple times
  2020-02-12 15:00   ` Hans de Goede
  2020-02-12 21:08     ` Ard Biesheuvel
@ 2020-02-12 21:57     ` Arvind Sankar
  2020-02-13  0:34       ` Arvind Sankar
  1 sibling, 1 reply; 12+ messages in thread
From: Arvind Sankar @ 2020-02-12 21:57 UTC (permalink / raw)
  To: Hans de Goede; +Cc: Ard Biesheuvel, Andy Lutomirski, Matt Fleming, linux-efi

On Wed, Feb 12, 2020 at 04:00:24PM +0100, Hans de Goede wrote:
> Hi,
> 
> On 2/12/20 12:53 PM, Ard Biesheuvel wrote:
> > 
> > 
> > It seems that the purpose of the alignment check is to ensure that the
> > data does not cross a page boundary, so that the data is guaranteed to
> > be contiguous in the physical address space as well.
> > 
> > So in that light, the fix is most definitely wrong, although I am not
> > sure how this is supposed to work generally.
> 
> I'm not sure that is what it is trying to check, if that is what it is
> trying to check then the code is certainly wrong.
> 
> Let me first quote the entire check:
> 
>          /*
>           * A fully aligned variable on the stack is guaranteed not to
>           * cross a page bounary. Try to catch strings on the stack by
>           * checking that 'size' is a power of two.
>           */
>          bad_size = size > PAGE_SIZE || !is_power_of_2(size);
> 
>          WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> 
> AFAIK EFI is using the identity map, and the kernel stack is
> physically contiguous, so crossing a page boundary should not a
> problem. Also notice how the bad_size thing is talking about
> page boundary crossing, but the thing triggering is the
> IS_ALIGNED check. AFAIK there is no requirement for a struct, e.g.
> an UUID (which is the problem here) to be aligned to its size,
> it just needs to be 8 byte / 64 bit aligned, which it is yet
> the IS_ALIGNED check is failing because it is checking for
> a larger, in this case twice as large, but possibly it will
> end up checking for a much larger alignment.
> 
> As said I'm not exactly familiar with the alignment requirements
> but the current check certainly seems wrong.

If VMAP_STACK is enabled, the stack pages may not be physically
contiguous. So the check is trying to warn about all cases where the
object might cross a page boundary and so not be physically contiguous,
even if the current call is ok because it doesn't cross the page
boundary.

It should probably also error out rather than making the call if it is
actually the case that it spans non-physically contiguous pages.

However, I'm also getting confused about how mixed-mode works at all if
we have more than 4Gb RAM, which it seems we want to support as we take
pains to allocate a stack below 4Gb for the thunk. But in this case, the
kernel text and stack could be above 4Gb physically, so rather than
using a 1:1 map we'd need to find some space in the low 4Gb of virtual
addresses to map those, but I don't see where we do this? Also, that
dynamically allocated variable_name could be above 4G as well.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: New EFI thunk alignment WARN_ON in 5.6 triggers multiple times
  2020-02-12 21:57     ` Arvind Sankar
@ 2020-02-13  0:34       ` Arvind Sankar
  2020-02-13  7:34         ` Hans de Goede
  0 siblings, 1 reply; 12+ messages in thread
From: Arvind Sankar @ 2020-02-13  0:34 UTC (permalink / raw)
  To: Hans de Goede; +Cc: Ard Biesheuvel, Andy Lutomirski, Matt Fleming, linux-efi

On Wed, Feb 12, 2020 at 04:57:19PM -0500, Arvind Sankar wrote:
> However, I'm also getting confused about how mixed-mode works at all if
> we have more than 4Gb RAM, which it seems we want to support as we take
> pains to allocate a stack below 4Gb for the thunk. But in this case, the
> kernel text and stack could be above 4Gb physically, so rather than
> using a 1:1 map we'd need to find some space in the low 4Gb of virtual
> addresses to map those, but I don't see where we do this? Also, that
> dynamically allocated variable_name could be above 4G as well.

Verified in QEMU that mixed mode crashes if physical RAM can extend
above 4G. Are there any such real devices in existence? Should we check
that and disable runtime services if so?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: New EFI thunk alignment WARN_ON in 5.6 triggers multiple times
  2020-02-13  0:34       ` Arvind Sankar
@ 2020-02-13  7:34         ` Hans de Goede
  0 siblings, 0 replies; 12+ messages in thread
From: Hans de Goede @ 2020-02-13  7:34 UTC (permalink / raw)
  To: Arvind Sankar; +Cc: Ard Biesheuvel, Andy Lutomirski, Matt Fleming, linux-efi

Hi,

On 2/13/20 1:34 AM, Arvind Sankar wrote:
> On Wed, Feb 12, 2020 at 04:57:19PM -0500, Arvind Sankar wrote:
>> However, I'm also getting confused about how mixed-mode works at all if
>> we have more than 4Gb RAM, which it seems we want to support as we take
>> pains to allocate a stack below 4Gb for the thunk. But in this case, the
>> kernel text and stack could be above 4Gb physically, so rather than
>> using a 1:1 map we'd need to find some space in the low 4Gb of virtual
>> addresses to map those, but I don't see where we do this? Also, that
>> dynamically allocated variable_name could be above 4G as well.
> 
> Verified in QEMU that mixed mode crashes if physical RAM can extend
> above 4G. Are there any such real devices in existence? Should we check
> that and disable runtime services if so?

AFAIK mixed-mode is mostly used on Bay Trail and maybe on one or 2 Cherry
Trail devices. Bay Trail does not go above 2G RAM and Cherry Trail does not
go above 4G RAM so in practice this should not be a problem.

I guess we could/should still document this somewhere ?  Or maybe add a check
just for the sake of correctness?

Regards,

Hans


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: New EFI thunk alignment WARN_ON in 5.6 triggers multiple times
  2020-02-12 21:08     ` Ard Biesheuvel
@ 2020-02-13  7:36       ` Hans de Goede
  2020-02-13  8:03         ` Ard Biesheuvel
  0 siblings, 1 reply; 12+ messages in thread
From: Hans de Goede @ 2020-02-13  7:36 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: Andy Lutomirski, Matt Fleming, linux-efi

Hi,

On 2/12/20 10:08 PM, Ard Biesheuvel wrote:
> On Wed, 12 Feb 2020 at 16:00, Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi,
>>
>> On 2/12/20 12:53 PM, Ard Biesheuvel wrote:
>>> On Wed, 12 Feb 2020 at 12:44, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>
>>>> Hi Ard,
>>>>
>>>> While booting 5.6-rc1 on one of my test machines I noticed the WARN_ON
>>>> on line 198 of arch/x86/platform/efi/efi_64.c trigger many times.
>>>>
>>>> I've done some debugging on this an this is caused by the following
>>>> call path:
>>>>
>>>> drivers/firmware/efi/vars.c: efivar_init():
>>>>
>>>>            unsigned long variable_name_size = 1024;
>>>>            efi_char16_t *variable_name;
>>>>            efi_guid_t vendor_guid;
>>>>
>>>>            variable_name = kzalloc(variable_name_size, GFP_KERNEL);
>>>>            if (!variable_name) {
>>>>                    printk(KERN_ERR "efivars: Memory allocation failed.\n");
>>>>                    return -ENOMEM;
>>>>            }
>>>>
>>>>           ...
>>>>
>>>>
>>>>            do {
>>>>                    variable_name_size = 1024;
>>>>
>>>>                    status = ops->get_next_variable(&variable_name_size,
>>>>                                                    variable_name,
>>>>                                                    &vendor_guid);
>>>>           ...
>>>>
>>>> arch/x86/platform/efi/efi_64.c: efi_thunk_get_next_variable()
>>>>
>>>>           ...
>>>>            phys_vendor = virt_to_phys_or_null(vendor);
>>>>           ...
>>>>
>>>> arch/x86/platform/efi/efi_64.c: virt_to_phys_or_null_size()
>>>>
>>>>           ...
>>>>           WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
>>>>           ...
>>>>
>>>> Specifically the problem is that the efi_guid_t vendor_guid has an 8 bytes
>>>> aligned address and the WARN_ON checks for it being aligned to\
>>>> sizeof(efi_guid_t) which is 16 bytes.
>>>>
>>>> I've fixed this for now with the following local fix, but I'm not sure
>>>> what the alignment rules actually are so I'm not sure this is correct:
>>>>
>>>> --- a/arch/x86/platform/efi/efi_64.c
>>>> +++ b/arch/x86/platform/efi/efi_64.c
>>>> @@ -181,6 +181,7 @@ static inline phys_addr_t
>>>>     virt_to_phys_or_null_size(void *va, unsigned long size)
>>>>     {
>>>>           bool bad_size;
>>>> +       int alignment;
>>>>
>>>>           if (!va)
>>>>                   return 0;
>>>> @@ -195,7 +196,8 @@ virt_to_phys_or_null_size(void *va, unsigned long size)
>>>>            */
>>>>           bad_size = size > PAGE_SIZE || !is_power_of_2(size);
>>>>
>>>> -       WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
>>>> +       alignment = size > 8 ? 8 : size;
>>>> +       WARN_ON(!IS_ALIGNED((unsigned long)va, alignment) || bad_size);
>>>>
>>>>           return slow_virt_to_phys(va);
>>>>     }
>>>>
>>>>
>>>> I have a feeling that this is the right thing to do, but as said I'm not 100%
>>>> sure. If you can confirm that this is the right fix, then I can submit this
>>>> upstream.
>>>>
>>>
>>>
>>> It seems that the purpose of the alignment check is to ensure that the
>>> data does not cross a page boundary, so that the data is guaranteed to
>>> be contiguous in the physical address space as well.
>>>
>>> So in that light, the fix is most definitely wrong, although I am not
>>> sure how this is supposed to work generally.
>>
>> I'm not sure that is what it is trying to check, if that is what it is
>> trying to check then the code is certainly wrong.
>>
>> Let me first quote the entire check:
>>
>>           /*
>>            * A fully aligned variable on the stack is guaranteed not to
>>            * cross a page bounary. Try to catch strings on the stack by
>>            * checking that 'size' is a power of two.
>>            */
>>           bad_size = size > PAGE_SIZE || !is_power_of_2(size);
>>
>>           WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
>>
>> AFAIK EFI is using the identity map, and the kernel stack is
>> physically contiguous, so crossing a page boundary should not a
>> problem.
> 
> CONFIG_HAVE_ARCH_VMAP_STACK=y means the kernel stack may not be
> contiguous in physical memory, which is why this was added in the
> first place.
> 
> We do allocate a special stack for mixed mode, but we only switch to
> it in the .S thunking code, so at this point, we are still running
> from the vmap'ed stack
> 
>> Also notice how the bad_size thing is talking about
>> page boundary crossing, but the thing triggering is the
>> IS_ALIGNED check. AFAIK there is no requirement for a struct, e.g.
>> an UUID (which is the problem here) to be aligned to its size,
>> it just needs to be 8 byte / 64 bit aligned, which it is yet
>> the IS_ALIGNED check is failing because it is checking for
>> a larger, in this case twice as large, but possibly it will
>> end up checking for a much larger alignment.
>>
> 
> The idea is that a data structure of size < PAGE_SIZE is guaranteed
> not to cross a page boundary if it is aligned to a power-of-2 upper
> bound of its size. This has nothing to do with the data type or the
> minimal alignment of any of its constituent parts.

Ok, so I guess that the correct fix is to switch to kmalloc-ing
"efi_guid_t vendor_guid" in efivar_init() instead of declaring it on
the stack?

Regards,

Hans


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: New EFI thunk alignment WARN_ON in 5.6 triggers multiple times
  2020-02-13  7:36       ` Hans de Goede
@ 2020-02-13  8:03         ` Ard Biesheuvel
  2020-02-13  8:55           ` Ard Biesheuvel
  0 siblings, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2020-02-13  8:03 UTC (permalink / raw)
  To: Hans de Goede; +Cc: Andy Lutomirski, Matt Fleming, linux-efi, Arvind Sankar

On Thu, 13 Feb 2020 at 08:36, Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 2/12/20 10:08 PM, Ard Biesheuvel wrote:
> > On Wed, 12 Feb 2020 at 16:00, Hans de Goede <hdegoede@redhat.com> wrote:
> >>
> >> Hi,
> >>
> >> On 2/12/20 12:53 PM, Ard Biesheuvel wrote:
> >>> On Wed, 12 Feb 2020 at 12:44, Hans de Goede <hdegoede@redhat.com> wrote:
> >>>>
> >>>> Hi Ard,
> >>>>
> >>>> While booting 5.6-rc1 on one of my test machines I noticed the WARN_ON
> >>>> on line 198 of arch/x86/platform/efi/efi_64.c trigger many times.
> >>>>
> >>>> I've done some debugging on this an this is caused by the following
> >>>> call path:
> >>>>
> >>>> drivers/firmware/efi/vars.c: efivar_init():
> >>>>
> >>>>            unsigned long variable_name_size = 1024;
> >>>>            efi_char16_t *variable_name;
> >>>>            efi_guid_t vendor_guid;
> >>>>
> >>>>            variable_name = kzalloc(variable_name_size, GFP_KERNEL);
> >>>>            if (!variable_name) {
> >>>>                    printk(KERN_ERR "efivars: Memory allocation failed.\n");
> >>>>                    return -ENOMEM;
> >>>>            }
> >>>>
> >>>>           ...
> >>>>
> >>>>
> >>>>            do {
> >>>>                    variable_name_size = 1024;
> >>>>
> >>>>                    status = ops->get_next_variable(&variable_name_size,
> >>>>                                                    variable_name,
> >>>>                                                    &vendor_guid);
> >>>>           ...
> >>>>
> >>>> arch/x86/platform/efi/efi_64.c: efi_thunk_get_next_variable()
> >>>>
> >>>>           ...
> >>>>            phys_vendor = virt_to_phys_or_null(vendor);
> >>>>           ...
> >>>>
> >>>> arch/x86/platform/efi/efi_64.c: virt_to_phys_or_null_size()
> >>>>
> >>>>           ...
> >>>>           WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> >>>>           ...
> >>>>
> >>>> Specifically the problem is that the efi_guid_t vendor_guid has an 8 bytes
> >>>> aligned address and the WARN_ON checks for it being aligned to\
> >>>> sizeof(efi_guid_t) which is 16 bytes.
> >>>>
> >>>> I've fixed this for now with the following local fix, but I'm not sure
> >>>> what the alignment rules actually are so I'm not sure this is correct:
> >>>>
> >>>> --- a/arch/x86/platform/efi/efi_64.c
> >>>> +++ b/arch/x86/platform/efi/efi_64.c
> >>>> @@ -181,6 +181,7 @@ static inline phys_addr_t
> >>>>     virt_to_phys_or_null_size(void *va, unsigned long size)
> >>>>     {
> >>>>           bool bad_size;
> >>>> +       int alignment;
> >>>>
> >>>>           if (!va)
> >>>>                   return 0;
> >>>> @@ -195,7 +196,8 @@ virt_to_phys_or_null_size(void *va, unsigned long size)
> >>>>            */
> >>>>           bad_size = size > PAGE_SIZE || !is_power_of_2(size);
> >>>>
> >>>> -       WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> >>>> +       alignment = size > 8 ? 8 : size;
> >>>> +       WARN_ON(!IS_ALIGNED((unsigned long)va, alignment) || bad_size);
> >>>>
> >>>>           return slow_virt_to_phys(va);
> >>>>     }
> >>>>
> >>>>
> >>>> I have a feeling that this is the right thing to do, but as said I'm not 100%
> >>>> sure. If you can confirm that this is the right fix, then I can submit this
> >>>> upstream.
> >>>>
> >>>
> >>>
> >>> It seems that the purpose of the alignment check is to ensure that the
> >>> data does not cross a page boundary, so that the data is guaranteed to
> >>> be contiguous in the physical address space as well.
> >>>
> >>> So in that light, the fix is most definitely wrong, although I am not
> >>> sure how this is supposed to work generally.
> >>
> >> I'm not sure that is what it is trying to check, if that is what it is
> >> trying to check then the code is certainly wrong.
> >>
> >> Let me first quote the entire check:
> >>
> >>           /*
> >>            * A fully aligned variable on the stack is guaranteed not to
> >>            * cross a page bounary. Try to catch strings on the stack by
> >>            * checking that 'size' is a power of two.
> >>            */
> >>           bad_size = size > PAGE_SIZE || !is_power_of_2(size);
> >>
> >>           WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> >>
> >> AFAIK EFI is using the identity map, and the kernel stack is
> >> physically contiguous, so crossing a page boundary should not a
> >> problem.
> >
> > CONFIG_HAVE_ARCH_VMAP_STACK=y means the kernel stack may not be
> > contiguous in physical memory, which is why this was added in the
> > first place.
> >
> > We do allocate a special stack for mixed mode, but we only switch to
> > it in the .S thunking code, so at this point, we are still running
> > from the vmap'ed stack
> >
> >> Also notice how the bad_size thing is talking about
> >> page boundary crossing, but the thing triggering is the
> >> IS_ALIGNED check. AFAIK there is no requirement for a struct, e.g.
> >> an UUID (which is the problem here) to be aligned to its size,
> >> it just needs to be 8 byte / 64 bit aligned, which it is yet
> >> the IS_ALIGNED check is failing because it is checking for
> >> a larger, in this case twice as large, but possibly it will
> >> end up checking for a much larger alignment.
> >>
> >
> > The idea is that a data structure of size < PAGE_SIZE is guaranteed
> > not to cross a page boundary if it is aligned to a power-of-2 upper
> > bound of its size. This has nothing to do with the data type or the
> > minimal alignment of any of its constituent parts.
>
> Ok, so I guess that the correct fix is to switch to kmalloc-ing
> "efi_guid_t vendor_guid" in efivar_init() instead of declaring it on
> the stack?
>

I'd prefer it if we updated the efi_thunk_* routines to use a
efi_guid_t on its stack that is naturally aligned, along the lines of

--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -658,6 +658,7 @@ static efi_status_t
 efi_thunk_get_variable(efi_char16_t *name, efi_guid_t *vendor,
                       u32 *attr, unsigned long *data_size, void *data)
 {
+       efi_guid_t guid __aligned(sizeof(efi_guid_t)) = *vendor;
        efi_status_t status;
        u32 phys_name, phys_vendor, phys_attr;
        u32 phys_data_size, phys_data;
@@ -666,7 +667,7 @@ efi_thunk_get_variable(efi_char16_t *name,
efi_guid_t *vendor,
        spin_lock_irqsave(&efi_runtime_lock, flags);

        phys_data_size = virt_to_phys_or_null(data_size);
-       phys_vendor = virt_to_phys_or_null(vendor);
+       phys_vendor = virt_to_phys_or_null(&guid);
        phys_name = virt_to_phys_or_null_size(name, efi_name_size(name));
        phys_attr = virt_to_phys_or_null(attr);
        phys_data = virt_to_phys_or_null_size(data, *data_size);


As for the crashes with >4 GB memory size: I suppose we should just
cap the memory to 4GB when mixed mode is detected, although I think
this is mostly a theoretical case.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: New EFI thunk alignment WARN_ON in 5.6 triggers multiple times
  2020-02-13  8:03         ` Ard Biesheuvel
@ 2020-02-13  8:55           ` Ard Biesheuvel
  2020-02-13 10:03             ` Hans de Goede
  0 siblings, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2020-02-13  8:55 UTC (permalink / raw)
  To: Hans de Goede; +Cc: Andy Lutomirski, Matt Fleming, linux-efi, Arvind Sankar

On Thu, 13 Feb 2020 at 09:03, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Thu, 13 Feb 2020 at 08:36, Hans de Goede <hdegoede@redhat.com> wrote:
> >
> > Hi,
> >
> > On 2/12/20 10:08 PM, Ard Biesheuvel wrote:
> > > On Wed, 12 Feb 2020 at 16:00, Hans de Goede <hdegoede@redhat.com> wrote:
> > >>
> > >> Hi,
> > >>
> > >> On 2/12/20 12:53 PM, Ard Biesheuvel wrote:
> > >>> On Wed, 12 Feb 2020 at 12:44, Hans de Goede <hdegoede@redhat.com> wrote:
> > >>>>
> > >>>> Hi Ard,
> > >>>>
> > >>>> While booting 5.6-rc1 on one of my test machines I noticed the WARN_ON
> > >>>> on line 198 of arch/x86/platform/efi/efi_64.c trigger many times.
> > >>>>
> > >>>> I've done some debugging on this an this is caused by the following
> > >>>> call path:
> > >>>>
> > >>>> drivers/firmware/efi/vars.c: efivar_init():
> > >>>>
> > >>>>            unsigned long variable_name_size = 1024;
> > >>>>            efi_char16_t *variable_name;
> > >>>>            efi_guid_t vendor_guid;
> > >>>>
> > >>>>            variable_name = kzalloc(variable_name_size, GFP_KERNEL);
> > >>>>            if (!variable_name) {
> > >>>>                    printk(KERN_ERR "efivars: Memory allocation failed.\n");
> > >>>>                    return -ENOMEM;
> > >>>>            }
> > >>>>
> > >>>>           ...
> > >>>>
> > >>>>
> > >>>>            do {
> > >>>>                    variable_name_size = 1024;
> > >>>>
> > >>>>                    status = ops->get_next_variable(&variable_name_size,
> > >>>>                                                    variable_name,
> > >>>>                                                    &vendor_guid);
> > >>>>           ...
> > >>>>
> > >>>> arch/x86/platform/efi/efi_64.c: efi_thunk_get_next_variable()
> > >>>>
> > >>>>           ...
> > >>>>            phys_vendor = virt_to_phys_or_null(vendor);
> > >>>>           ...
> > >>>>
> > >>>> arch/x86/platform/efi/efi_64.c: virt_to_phys_or_null_size()
> > >>>>
> > >>>>           ...
> > >>>>           WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> > >>>>           ...
> > >>>>
> > >>>> Specifically the problem is that the efi_guid_t vendor_guid has an 8 bytes
> > >>>> aligned address and the WARN_ON checks for it being aligned to\
> > >>>> sizeof(efi_guid_t) which is 16 bytes.
> > >>>>
> > >>>> I've fixed this for now with the following local fix, but I'm not sure
> > >>>> what the alignment rules actually are so I'm not sure this is correct:
> > >>>>
> > >>>> --- a/arch/x86/platform/efi/efi_64.c
> > >>>> +++ b/arch/x86/platform/efi/efi_64.c
> > >>>> @@ -181,6 +181,7 @@ static inline phys_addr_t
> > >>>>     virt_to_phys_or_null_size(void *va, unsigned long size)
> > >>>>     {
> > >>>>           bool bad_size;
> > >>>> +       int alignment;
> > >>>>
> > >>>>           if (!va)
> > >>>>                   return 0;
> > >>>> @@ -195,7 +196,8 @@ virt_to_phys_or_null_size(void *va, unsigned long size)
> > >>>>            */
> > >>>>           bad_size = size > PAGE_SIZE || !is_power_of_2(size);
> > >>>>
> > >>>> -       WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> > >>>> +       alignment = size > 8 ? 8 : size;
> > >>>> +       WARN_ON(!IS_ALIGNED((unsigned long)va, alignment) || bad_size);
> > >>>>
> > >>>>           return slow_virt_to_phys(va);
> > >>>>     }
> > >>>>
> > >>>>
> > >>>> I have a feeling that this is the right thing to do, but as said I'm not 100%
> > >>>> sure. If you can confirm that this is the right fix, then I can submit this
> > >>>> upstream.
> > >>>>
> > >>>
> > >>>
> > >>> It seems that the purpose of the alignment check is to ensure that the
> > >>> data does not cross a page boundary, so that the data is guaranteed to
> > >>> be contiguous in the physical address space as well.
> > >>>
> > >>> So in that light, the fix is most definitely wrong, although I am not
> > >>> sure how this is supposed to work generally.
> > >>
> > >> I'm not sure that is what it is trying to check, if that is what it is
> > >> trying to check then the code is certainly wrong.
> > >>
> > >> Let me first quote the entire check:
> > >>
> > >>           /*
> > >>            * A fully aligned variable on the stack is guaranteed not to
> > >>            * cross a page bounary. Try to catch strings on the stack by
> > >>            * checking that 'size' is a power of two.
> > >>            */
> > >>           bad_size = size > PAGE_SIZE || !is_power_of_2(size);
> > >>
> > >>           WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> > >>
> > >> AFAIK EFI is using the identity map, and the kernel stack is
> > >> physically contiguous, so crossing a page boundary should not a
> > >> problem.
> > >
> > > CONFIG_HAVE_ARCH_VMAP_STACK=y means the kernel stack may not be
> > > contiguous in physical memory, which is why this was added in the
> > > first place.
> > >
> > > We do allocate a special stack for mixed mode, but we only switch to
> > > it in the .S thunking code, so at this point, we are still running
> > > from the vmap'ed stack
> > >
> > >> Also notice how the bad_size thing is talking about
> > >> page boundary crossing, but the thing triggering is the
> > >> IS_ALIGNED check. AFAIK there is no requirement for a struct, e.g.
> > >> an UUID (which is the problem here) to be aligned to its size,
> > >> it just needs to be 8 byte / 64 bit aligned, which it is yet
> > >> the IS_ALIGNED check is failing because it is checking for
> > >> a larger, in this case twice as large, but possibly it will
> > >> end up checking for a much larger alignment.
> > >>
> > >
> > > The idea is that a data structure of size < PAGE_SIZE is guaranteed
> > > not to cross a page boundary if it is aligned to a power-of-2 upper
> > > bound of its size. This has nothing to do with the data type or the
> > > minimal alignment of any of its constituent parts.
> >
> > Ok, so I guess that the correct fix is to switch to kmalloc-ing
> > "efi_guid_t vendor_guid" in efivar_init() instead of declaring it on
> > the stack?
> >
>
> I'd prefer it if we updated the efi_thunk_* routines to use a
> efi_guid_t on its stack that is naturally aligned, along the lines of
>
> --- a/arch/x86/platform/efi/efi_64.c
> +++ b/arch/x86/platform/efi/efi_64.c
> @@ -658,6 +658,7 @@ static efi_status_t
>  efi_thunk_get_variable(efi_char16_t *name, efi_guid_t *vendor,
>                        u32 *attr, unsigned long *data_size, void *data)
>  {
> +       efi_guid_t guid __aligned(sizeof(efi_guid_t)) = *vendor;
>         efi_status_t status;
>         u32 phys_name, phys_vendor, phys_attr;
>         u32 phys_data_size, phys_data;
> @@ -666,7 +667,7 @@ efi_thunk_get_variable(efi_char16_t *name,
> efi_guid_t *vendor,
>         spin_lock_irqsave(&efi_runtime_lock, flags);
>
>         phys_data_size = virt_to_phys_or_null(data_size);
> -       phys_vendor = virt_to_phys_or_null(vendor);
> +       phys_vendor = virt_to_phys_or_null(&guid);
>         phys_name = virt_to_phys_or_null_size(name, efi_name_size(name));
>         phys_attr = virt_to_phys_or_null(attr);
>         phys_data = virt_to_phys_or_null_size(data, *data_size);
>
>
> As for the crashes with >4 GB memory size: I suppose we should just
> cap the memory to 4GB when mixed mode is detected, although I think
> this is mostly a theoretical case.

Hmm, actually, I don't think we can rely on alignment == 16, given
that the compiler assumes that RSP is 16 byte aligned already, which
is not the case in reality.

So we'll probably have to add some compiler foo as well.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: New EFI thunk alignment WARN_ON in 5.6 triggers multiple times
  2020-02-13  8:55           ` Ard Biesheuvel
@ 2020-02-13 10:03             ` Hans de Goede
  2020-02-13 10:18               ` Ard Biesheuvel
  0 siblings, 1 reply; 12+ messages in thread
From: Hans de Goede @ 2020-02-13 10:03 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: Andy Lutomirski, Matt Fleming, linux-efi, Arvind Sankar

Hi,

On 2/13/20 9:55 AM, Ard Biesheuvel wrote:
> On Thu, 13 Feb 2020 at 09:03, Ard Biesheuvel <ardb@kernel.org> wrote:
>>
>> On Thu, 13 Feb 2020 at 08:36, Hans de Goede <hdegoede@redhat.com> wrote:
>>>
>>> Hi,
>>>
>>> On 2/12/20 10:08 PM, Ard Biesheuvel wrote:
>>>> On Wed, 12 Feb 2020 at 16:00, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 2/12/20 12:53 PM, Ard Biesheuvel wrote:
>>>>>> On Wed, 12 Feb 2020 at 12:44, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>>>>
>>>>>>> Hi Ard,
>>>>>>>
>>>>>>> While booting 5.6-rc1 on one of my test machines I noticed the WARN_ON
>>>>>>> on line 198 of arch/x86/platform/efi/efi_64.c trigger many times.
>>>>>>>
>>>>>>> I've done some debugging on this an this is caused by the following
>>>>>>> call path:
>>>>>>>
>>>>>>> drivers/firmware/efi/vars.c: efivar_init():
>>>>>>>
>>>>>>>             unsigned long variable_name_size = 1024;
>>>>>>>             efi_char16_t *variable_name;
>>>>>>>             efi_guid_t vendor_guid;
>>>>>>>
>>>>>>>             variable_name = kzalloc(variable_name_size, GFP_KERNEL);
>>>>>>>             if (!variable_name) {
>>>>>>>                     printk(KERN_ERR "efivars: Memory allocation failed.\n");
>>>>>>>                     return -ENOMEM;
>>>>>>>             }
>>>>>>>
>>>>>>>            ...
>>>>>>>
>>>>>>>
>>>>>>>             do {
>>>>>>>                     variable_name_size = 1024;
>>>>>>>
>>>>>>>                     status = ops->get_next_variable(&variable_name_size,
>>>>>>>                                                     variable_name,
>>>>>>>                                                     &vendor_guid);
>>>>>>>            ...
>>>>>>>
>>>>>>> arch/x86/platform/efi/efi_64.c: efi_thunk_get_next_variable()
>>>>>>>
>>>>>>>            ...
>>>>>>>             phys_vendor = virt_to_phys_or_null(vendor);
>>>>>>>            ...
>>>>>>>
>>>>>>> arch/x86/platform/efi/efi_64.c: virt_to_phys_or_null_size()
>>>>>>>
>>>>>>>            ...
>>>>>>>            WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
>>>>>>>            ...
>>>>>>>
>>>>>>> Specifically the problem is that the efi_guid_t vendor_guid has an 8 bytes
>>>>>>> aligned address and the WARN_ON checks for it being aligned to\
>>>>>>> sizeof(efi_guid_t) which is 16 bytes.
>>>>>>>
>>>>>>> I've fixed this for now with the following local fix, but I'm not sure
>>>>>>> what the alignment rules actually are so I'm not sure this is correct:
>>>>>>>
>>>>>>> --- a/arch/x86/platform/efi/efi_64.c
>>>>>>> +++ b/arch/x86/platform/efi/efi_64.c
>>>>>>> @@ -181,6 +181,7 @@ static inline phys_addr_t
>>>>>>>      virt_to_phys_or_null_size(void *va, unsigned long size)
>>>>>>>      {
>>>>>>>            bool bad_size;
>>>>>>> +       int alignment;
>>>>>>>
>>>>>>>            if (!va)
>>>>>>>                    return 0;
>>>>>>> @@ -195,7 +196,8 @@ virt_to_phys_or_null_size(void *va, unsigned long size)
>>>>>>>             */
>>>>>>>            bad_size = size > PAGE_SIZE || !is_power_of_2(size);
>>>>>>>
>>>>>>> -       WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
>>>>>>> +       alignment = size > 8 ? 8 : size;
>>>>>>> +       WARN_ON(!IS_ALIGNED((unsigned long)va, alignment) || bad_size);
>>>>>>>
>>>>>>>            return slow_virt_to_phys(va);
>>>>>>>      }
>>>>>>>
>>>>>>>
>>>>>>> I have a feeling that this is the right thing to do, but as said I'm not 100%
>>>>>>> sure. If you can confirm that this is the right fix, then I can submit this
>>>>>>> upstream.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> It seems that the purpose of the alignment check is to ensure that the
>>>>>> data does not cross a page boundary, so that the data is guaranteed to
>>>>>> be contiguous in the physical address space as well.
>>>>>>
>>>>>> So in that light, the fix is most definitely wrong, although I am not
>>>>>> sure how this is supposed to work generally.
>>>>>
>>>>> I'm not sure that is what it is trying to check, if that is what it is
>>>>> trying to check then the code is certainly wrong.
>>>>>
>>>>> Let me first quote the entire check:
>>>>>
>>>>>            /*
>>>>>             * A fully aligned variable on the stack is guaranteed not to
>>>>>             * cross a page bounary. Try to catch strings on the stack by
>>>>>             * checking that 'size' is a power of two.
>>>>>             */
>>>>>            bad_size = size > PAGE_SIZE || !is_power_of_2(size);
>>>>>
>>>>>            WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
>>>>>
>>>>> AFAIK EFI is using the identity map, and the kernel stack is
>>>>> physically contiguous, so crossing a page boundary should not a
>>>>> problem.
>>>>
>>>> CONFIG_HAVE_ARCH_VMAP_STACK=y means the kernel stack may not be
>>>> contiguous in physical memory, which is why this was added in the
>>>> first place.
>>>>
>>>> We do allocate a special stack for mixed mode, but we only switch to
>>>> it in the .S thunking code, so at this point, we are still running
>>>> from the vmap'ed stack
>>>>
>>>>> Also notice how the bad_size thing is talking about
>>>>> page boundary crossing, but the thing triggering is the
>>>>> IS_ALIGNED check. AFAIK there is no requirement for a struct, e.g.
>>>>> an UUID (which is the problem here) to be aligned to its size,
>>>>> it just needs to be 8 byte / 64 bit aligned, which it is yet
>>>>> the IS_ALIGNED check is failing because it is checking for
>>>>> a larger, in this case twice as large, but possibly it will
>>>>> end up checking for a much larger alignment.
>>>>>
>>>>
>>>> The idea is that a data structure of size < PAGE_SIZE is guaranteed
>>>> not to cross a page boundary if it is aligned to a power-of-2 upper
>>>> bound of its size. This has nothing to do with the data type or the
>>>> minimal alignment of any of its constituent parts.
>>>
>>> Ok, so I guess that the correct fix is to switch to kmalloc-ing
>>> "efi_guid_t vendor_guid" in efivar_init() instead of declaring it on
>>> the stack?
>>>
>>
>> I'd prefer it if we updated the efi_thunk_* routines to use a
>> efi_guid_t on its stack that is naturally aligned, along the lines of
>>
>> --- a/arch/x86/platform/efi/efi_64.c
>> +++ b/arch/x86/platform/efi/efi_64.c
>> @@ -658,6 +658,7 @@ static efi_status_t
>>   efi_thunk_get_variable(efi_char16_t *name, efi_guid_t *vendor,
>>                         u32 *attr, unsigned long *data_size, void *data)
>>   {
>> +       efi_guid_t guid __aligned(sizeof(efi_guid_t)) = *vendor;
>>          efi_status_t status;
>>          u32 phys_name, phys_vendor, phys_attr;
>>          u32 phys_data_size, phys_data;
>> @@ -666,7 +667,7 @@ efi_thunk_get_variable(efi_char16_t *name,
>> efi_guid_t *vendor,
>>          spin_lock_irqsave(&efi_runtime_lock, flags);
>>
>>          phys_data_size = virt_to_phys_or_null(data_size);
>> -       phys_vendor = virt_to_phys_or_null(vendor);
>> +       phys_vendor = virt_to_phys_or_null(&guid);
>>          phys_name = virt_to_phys_or_null_size(name, efi_name_size(name));
>>          phys_attr = virt_to_phys_or_null(attr);
>>          phys_data = virt_to_phys_or_null_size(data, *data_size);
>>
>>
>> As for the crashes with >4 GB memory size: I suppose we should just
>> cap the memory to 4GB when mixed mode is detected, although I think
>> this is mostly a theoretical case.
> 
> Hmm, actually, I don't think we can rely on alignment == 16, given
> that the compiler assumes that RSP is 16 byte aligned already, which
> is not the case in reality.
> 
> So we'll probably have to add some compiler foo as well.

It really seems easier to me to just kmalloc everything which is
bigger then 8 bytes? I see you are modifying the efi_thunk_get_variable
function, my idea was to do the kmalloc in efivar_init(), that already
kmalloc's the variable_name, so doing the same for the vendor guid inside
efi_init() seems like an ok solution to me?

Regards,

Hans


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: New EFI thunk alignment WARN_ON in 5.6 triggers multiple times
  2020-02-13 10:03             ` Hans de Goede
@ 2020-02-13 10:18               ` Ard Biesheuvel
  0 siblings, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2020-02-13 10:18 UTC (permalink / raw)
  To: Hans de Goede; +Cc: Andy Lutomirski, Matt Fleming, linux-efi, Arvind Sankar

On Thu, 13 Feb 2020 at 11:03, Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 2/13/20 9:55 AM, Ard Biesheuvel wrote:
> > On Thu, 13 Feb 2020 at 09:03, Ard Biesheuvel <ardb@kernel.org> wrote:
> >>
> >> On Thu, 13 Feb 2020 at 08:36, Hans de Goede <hdegoede@redhat.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> On 2/12/20 10:08 PM, Ard Biesheuvel wrote:
> >>>> On Wed, 12 Feb 2020 at 16:00, Hans de Goede <hdegoede@redhat.com> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> On 2/12/20 12:53 PM, Ard Biesheuvel wrote:
> >>>>>> On Wed, 12 Feb 2020 at 12:44, Hans de Goede <hdegoede@redhat.com> wrote:
> >>>>>>>
> >>>>>>> Hi Ard,
> >>>>>>>
> >>>>>>> While booting 5.6-rc1 on one of my test machines I noticed the WARN_ON
> >>>>>>> on line 198 of arch/x86/platform/efi/efi_64.c trigger many times.
> >>>>>>>
> >>>>>>> I've done some debugging on this an this is caused by the following
> >>>>>>> call path:
> >>>>>>>
> >>>>>>> drivers/firmware/efi/vars.c: efivar_init():
> >>>>>>>
> >>>>>>>             unsigned long variable_name_size = 1024;
> >>>>>>>             efi_char16_t *variable_name;
> >>>>>>>             efi_guid_t vendor_guid;
> >>>>>>>
> >>>>>>>             variable_name = kzalloc(variable_name_size, GFP_KERNEL);
> >>>>>>>             if (!variable_name) {
> >>>>>>>                     printk(KERN_ERR "efivars: Memory allocation failed.\n");
> >>>>>>>                     return -ENOMEM;
> >>>>>>>             }
> >>>>>>>
> >>>>>>>            ...
> >>>>>>>
> >>>>>>>
> >>>>>>>             do {
> >>>>>>>                     variable_name_size = 1024;
> >>>>>>>
> >>>>>>>                     status = ops->get_next_variable(&variable_name_size,
> >>>>>>>                                                     variable_name,
> >>>>>>>                                                     &vendor_guid);
> >>>>>>>            ...
> >>>>>>>
> >>>>>>> arch/x86/platform/efi/efi_64.c: efi_thunk_get_next_variable()
> >>>>>>>
> >>>>>>>            ...
> >>>>>>>             phys_vendor = virt_to_phys_or_null(vendor);
> >>>>>>>            ...
> >>>>>>>
> >>>>>>> arch/x86/platform/efi/efi_64.c: virt_to_phys_or_null_size()
> >>>>>>>
> >>>>>>>            ...
> >>>>>>>            WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> >>>>>>>            ...
> >>>>>>>
> >>>>>>> Specifically the problem is that the efi_guid_t vendor_guid has an 8 bytes
> >>>>>>> aligned address and the WARN_ON checks for it being aligned to\
> >>>>>>> sizeof(efi_guid_t) which is 16 bytes.
> >>>>>>>
> >>>>>>> I've fixed this for now with the following local fix, but I'm not sure
> >>>>>>> what the alignment rules actually are so I'm not sure this is correct:
> >>>>>>>
> >>>>>>> --- a/arch/x86/platform/efi/efi_64.c
> >>>>>>> +++ b/arch/x86/platform/efi/efi_64.c
> >>>>>>> @@ -181,6 +181,7 @@ static inline phys_addr_t
> >>>>>>>      virt_to_phys_or_null_size(void *va, unsigned long size)
> >>>>>>>      {
> >>>>>>>            bool bad_size;
> >>>>>>> +       int alignment;
> >>>>>>>
> >>>>>>>            if (!va)
> >>>>>>>                    return 0;
> >>>>>>> @@ -195,7 +196,8 @@ virt_to_phys_or_null_size(void *va, unsigned long size)
> >>>>>>>             */
> >>>>>>>            bad_size = size > PAGE_SIZE || !is_power_of_2(size);
> >>>>>>>
> >>>>>>> -       WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> >>>>>>> +       alignment = size > 8 ? 8 : size;
> >>>>>>> +       WARN_ON(!IS_ALIGNED((unsigned long)va, alignment) || bad_size);
> >>>>>>>
> >>>>>>>            return slow_virt_to_phys(va);
> >>>>>>>      }
> >>>>>>>
> >>>>>>>
> >>>>>>> I have a feeling that this is the right thing to do, but as said I'm not 100%
> >>>>>>> sure. If you can confirm that this is the right fix, then I can submit this
> >>>>>>> upstream.
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> It seems that the purpose of the alignment check is to ensure that the
> >>>>>> data does not cross a page boundary, so that the data is guaranteed to
> >>>>>> be contiguous in the physical address space as well.
> >>>>>>
> >>>>>> So in that light, the fix is most definitely wrong, although I am not
> >>>>>> sure how this is supposed to work generally.
> >>>>>
> >>>>> I'm not sure that is what it is trying to check, if that is what it is
> >>>>> trying to check then the code is certainly wrong.
> >>>>>
> >>>>> Let me first quote the entire check:
> >>>>>
> >>>>>            /*
> >>>>>             * A fully aligned variable on the stack is guaranteed not to
> >>>>>             * cross a page bounary. Try to catch strings on the stack by
> >>>>>             * checking that 'size' is a power of two.
> >>>>>             */
> >>>>>            bad_size = size > PAGE_SIZE || !is_power_of_2(size);
> >>>>>
> >>>>>            WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
> >>>>>
> >>>>> AFAIK EFI is using the identity map, and the kernel stack is
> >>>>> physically contiguous, so crossing a page boundary should not a
> >>>>> problem.
> >>>>
> >>>> CONFIG_HAVE_ARCH_VMAP_STACK=y means the kernel stack may not be
> >>>> contiguous in physical memory, which is why this was added in the
> >>>> first place.
> >>>>
> >>>> We do allocate a special stack for mixed mode, but we only switch to
> >>>> it in the .S thunking code, so at this point, we are still running
> >>>> from the vmap'ed stack
> >>>>
> >>>>> Also notice how the bad_size thing is talking about
> >>>>> page boundary crossing, but the thing triggering is the
> >>>>> IS_ALIGNED check. AFAIK there is no requirement for a struct, e.g.
> >>>>> an UUID (which is the problem here) to be aligned to its size,
> >>>>> it just needs to be 8 byte / 64 bit aligned, which it is yet
> >>>>> the IS_ALIGNED check is failing because it is checking for
> >>>>> a larger, in this case twice as large, but possibly it will
> >>>>> end up checking for a much larger alignment.
> >>>>>
> >>>>
> >>>> The idea is that a data structure of size < PAGE_SIZE is guaranteed
> >>>> not to cross a page boundary if it is aligned to a power-of-2 upper
> >>>> bound of its size. This has nothing to do with the data type or the
> >>>> minimal alignment of any of its constituent parts.
> >>>
> >>> Ok, so I guess that the correct fix is to switch to kmalloc-ing
> >>> "efi_guid_t vendor_guid" in efivar_init() instead of declaring it on
> >>> the stack?
> >>>
> >>
> >> I'd prefer it if we updated the efi_thunk_* routines to use a
> >> efi_guid_t on its stack that is naturally aligned, along the lines of
> >>
> >> --- a/arch/x86/platform/efi/efi_64.c
> >> +++ b/arch/x86/platform/efi/efi_64.c
> >> @@ -658,6 +658,7 @@ static efi_status_t
> >>   efi_thunk_get_variable(efi_char16_t *name, efi_guid_t *vendor,
> >>                         u32 *attr, unsigned long *data_size, void *data)
> >>   {
> >> +       efi_guid_t guid __aligned(sizeof(efi_guid_t)) = *vendor;
> >>          efi_status_t status;
> >>          u32 phys_name, phys_vendor, phys_attr;
> >>          u32 phys_data_size, phys_data;
> >> @@ -666,7 +667,7 @@ efi_thunk_get_variable(efi_char16_t *name,
> >> efi_guid_t *vendor,
> >>          spin_lock_irqsave(&efi_runtime_lock, flags);
> >>
> >>          phys_data_size = virt_to_phys_or_null(data_size);
> >> -       phys_vendor = virt_to_phys_or_null(vendor);
> >> +       phys_vendor = virt_to_phys_or_null(&guid);
> >>          phys_name = virt_to_phys_or_null_size(name, efi_name_size(name));
> >>          phys_attr = virt_to_phys_or_null(attr);
> >>          phys_data = virt_to_phys_or_null_size(data, *data_size);
> >>
> >>
> >> As for the crashes with >4 GB memory size: I suppose we should just
> >> cap the memory to 4GB when mixed mode is detected, although I think
> >> this is mostly a theoretical case.
> >
> > Hmm, actually, I don't think we can rely on alignment == 16, given
> > that the compiler assumes that RSP is 16 byte aligned already, which
> > is not the case in reality.
> >
> > So we'll probably have to add some compiler foo as well.
>
> It really seems easier to me to just kmalloc everything which is
> bigger then 8 bytes? I see you are modifying the efi_thunk_get_variable
> function, my idea was to do the kmalloc in efivar_init(), that already
> kmalloc's the variable_name, so doing the same for the vendor guid inside
> efi_init() seems like an ok solution to me?
>

No, I really don't want these mixed mode constraints to leak into other code.

I'll have some patches out in a minute.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-02-13 10:18 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-12 11:44 New EFI thunk alignment WARN_ON in 5.6 triggers multiple times Hans de Goede
2020-02-12 11:53 ` Ard Biesheuvel
2020-02-12 15:00   ` Hans de Goede
2020-02-12 21:08     ` Ard Biesheuvel
2020-02-13  7:36       ` Hans de Goede
2020-02-13  8:03         ` Ard Biesheuvel
2020-02-13  8:55           ` Ard Biesheuvel
2020-02-13 10:03             ` Hans de Goede
2020-02-13 10:18               ` Ard Biesheuvel
2020-02-12 21:57     ` Arvind Sankar
2020-02-13  0:34       ` Arvind Sankar
2020-02-13  7:34         ` Hans de Goede

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).