The following changes since commit 1e4b044d22517cae7047c99038abb444423243ca:

  Linux 4.18-rc4 (2018-07-08 16:34:02 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-urgent

for you to fetch changes up to d7f2e972e702d329fe11d6956df99dfc31211c25:

  efi/x86: remove pointless call to PciIo->Attributes() (2018-07-11 10:52:46 +0200)

----------------------------------------------------------------
A single fix for the x86 PCI I/O protocol handling code that got
broken for mixed mode (64-bit Linux/x86 on 32-bit UEFI) after a
fix was applied in -rc2 to fix it for ordinary 64-bit Linux/x86.

----------------------------------------------------------------
Ard Biesheuvel (1):
      efi/x86: remove pointless call to PciIo->Attributes()

 arch/x86/boot/compressed/eboot.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

When it was first introduced, the EFI stub code that copies the
contents of PCI option ROMs was only intended to do so if the
EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM attribute was *not* set.

The reason was that the UEFI spec permits PCI option ROM images to be
provided by the platform directly, rather than via the ROM BAR, and in
this case, the OS can only access them at runtime if they are preserved
at boot time by copying them from the areas described by PciIo->RomImage
and PciIo->RomSize.

However, it implemented this check erroneously, as can be seen in commit
dd5fc854de5fd ("EFI: Stash ROMs if they're not in the PCI BAR"):

    if (!attributes & EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM)
            continue;

and given that the numeric value of EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM is
0x4000, this condition never becomes true, and so the option ROMs were
copied unconditionally.

This was spotted and 'fixed' by commit 886d751a2ea99a160 ("x86, efi:
correct precedence of operators in setup_efi_pci"), which inadvertently
inverted the logic at the same time, defeating the purpose of the code,
since it then only preserved option ROM images that can be read from
the ROM BAR as well.

Unsurprisingly, this broke some systems, and so the check was removed
entirely in commit 739701888f5d ("x86, efi: remove attribute check from
setup_efi_pci"). It is debatable whether this check should have been
included in the first place, since the option ROM image provided to the
UEFI driver by the firmware may be different from the one that is
actually present in the card's flash ROM, and so whatever PciIo->RomImage
points at should be preferred regardless of whether the attribute is set.

As this was the only use of the attributes field, we can remove the call
to PciIo->Attributes() entirely, which is especially nice because its
prototype involves uint64_t type by-value arguments which the EFI mixed
mode has trouble dealing with.

Tested-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/x86/boot/compressed/eboot.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index e57665b4ba1c..e98522ea6f09 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -114,18 +114,12 @@ __setup_efi_pci(efi_pci_io_protocol_t *pci, struct pci_setup_rom **__rom)
 	struct pci_setup_rom *rom = NULL;
 	efi_status_t status;
 	unsigned long size;
-	uint64_t attributes, romsize;
+	uint64_t romsize;
 	void *romimage;
 
-	status = efi_call_proto(efi_pci_io_protocol, attributes, pci,
-				EfiPciIoAttributeOperationGet, 0ULL,
-				&attributes);
-	if (status != EFI_SUCCESS)
-		return status;
-
 	/*
-	 * Some firmware images contain EFI function pointers at the place where the
-	 * romimage and romsize fields are supposed to be. Typically the EFI
+	 * Some firmware images contain EFI function pointers at the place where
+	 * the romimage and romsize fields are supposed to be. Typically the EFI
 	 * code is mapped at high addresses, translating to an unrealistically
 	 * large romsize. The UEFI spec limits the size of option ROMs to 16
 	 * MiB so we reject any ROMs over 16 MiB in size to catch this.
-- 
2.17.1

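The operator precedence slip described in the commit message above is easy to
reproduce in isolation. The following is only a minimal sketch, not the stub
code itself: the three function names are made up for illustration, and the
only value taken from the commit message is 0x4000. It compares the check as
first written, the check after the precedence "fix", and the check that
matches the intent as the commit message describes it (skip the copy when the
attribute is set).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Numeric value quoted in the commit message. */
#define EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM 0x4000ULL

/*
 * As first written: unary ! binds tighter than &, so this is
 * (!attributes) & 0x4000. !attributes is 0 or 1, and neither value
 * has bit 14 set, so the result is always false.
 */
bool skip_rom_as_written(uint64_t attributes)
{
	return !attributes & EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM;
}

/* After the precedence "fix": skips the copy when the bit is clear. */
bool skip_rom_after_precedence_fix(uint64_t attributes)
{
	return !(attributes & EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM);
}

/*
 * Intent as described in the commit message: copy only if the
 * attribute is *not* set, i.e. skip when the bit is set.
 */
bool skip_rom_as_described(uint64_t attributes)
{
	return (attributes & EFI_PCI_IO_ATTRIBUTE_EMBEDDED_ROM) != 0;
}

/* Zero and non-zero inputs cover both values !attributes can take. */
void check_as_written_never_skips(void)
{
	assert(!skip_rom_as_written(0));
	assert(!skip_rom_as_written(UINT64_MAX));
}
```

Comparing the second and third helpers shows the inversion: they return
opposite answers for every input, which is why the "fixed" check preserved
exactly the ROMs the original author meant to skip.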
* Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> The following changes since commit 1e4b044d22517cae7047c99038abb444423243ca:
>
> Linux 4.18-rc4 (2018-07-08 16:34:02 -0700)
>
> are available in the Git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-urgent
>
> for you to fetch changes up to d7f2e972e702d329fe11d6956df99dfc31211c25:
>
> efi/x86: remove pointless call to PciIo->Attributes() (2018-07-11 10:52:46 +0200)
>
> ----------------------------------------------------------------
> A single fix for the x86 PCI I/O protocol handling code that got
> broken for mixed mode (64-bit Linux/x86 on 32-bit UEFI) after a
> fix was applied in -rc2 to fix it for ordinary 64-bit Linux/x86.
Just curious, because it's unclear from the changelog, what was the symptom, a
boot hang, instant reboot, or some other misbehavior? Also, what's the scope of
the fix: were all 64-bit on 32-bit UEFI mixed-mode bootups affected, or only a
certain subset?
Thanks,
Ingo
Hi,
On 11-07-18 12:13, Ingo Molnar wrote:
>
> * Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>
>> The following changes since commit 1e4b044d22517cae7047c99038abb444423243ca:
>>
>> Linux 4.18-rc4 (2018-07-08 16:34:02 -0700)
>>
>> are available in the Git repository at:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-urgent
>>
>> for you to fetch changes up to d7f2e972e702d329fe11d6956df99dfc31211c25:
>>
>> efi/x86: remove pointless call to PciIo->Attributes() (2018-07-11 10:52:46 +0200)
>>
>> ----------------------------------------------------------------
>> A single fix for the x86 PCI I/O protocol handling code that got
>> broken for mixed mode (64-bit Linux/x86 on 32-bit UEFI) after a
>> fix was applied in -rc2 to fix it for ordinary 64-bit Linux/x86.
>
> Just curious, because it's unclear from the changelog, what was the symptom, a
> boot hang, instant reboot, or some other misbehavior? Also, what's the scope of
> the fix: were all 64-bit on 32-bit UEFI mixed-mode bootups affected, or only a
> certain subset?
The problem was a reboot (resulting in a boot loop). I may have tested this
on multiple Bay Trail based devices, I don't remember, but it was an issue on
Bay Trail devices, which typically use a 32-bit UEFI even though they have a
64-bit capable CPU. There are some rare Bay Trail devices with a 64-bit UEFI,
but they are the exception.
Regards,
Hans
On 11 July 2018 at 12:13, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>
>> The following changes since commit 1e4b044d22517cae7047c99038abb444423243ca:
>>
>> Linux 4.18-rc4 (2018-07-08 16:34:02 -0700)
>>
>> are available in the Git repository at:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-urgent
>>
>> for you to fetch changes up to d7f2e972e702d329fe11d6956df99dfc31211c25:
>>
>> efi/x86: remove pointless call to PciIo->Attributes() (2018-07-11 10:52:46 +0200)
>>
>> ----------------------------------------------------------------
>> A single fix for the x86 PCI I/O protocol handling code that got
>> broken for mixed mode (64-bit Linux/x86 on 32-bit UEFI) after a
>> fix was applied in -rc2 to fix it for ordinary 64-bit Linux/x86.
>
> Just curious, because it's unclear from the changelog, what was the symptom, a
> boot hang, instant reboot, or some other misbehavior?

Hans reported that his mixed mode tablet would not boot at all any
more, but enter a reboot loop without any logs printed by the kernel.

> Also, what's the scope of
> the fix: were all 64-bit on 32-bit UEFI mixed-mode bootups affected, or only a
> certain subset?
>

Any mixed mode system with PCI is likely to be affected. I have added
a QEMU mixed mode config to my boot test environment to catch errors
like this one.

The unfortunate thing here is that this uncovered a fundamental issue
with mixed mode, i.e., that any UEFI protocol prototype involving
64-bit by-value parameters needs to be special cased in the stub code,
which is rather tedious. There is one other call that is potentially
affected, a file open call in the initrd handling code, but that
specific occurrence happens to work unmodified. This patch removes the
other one. Going forward, we will have to carefully review UEFI
protocol invocations for mixed mode compatibility.

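To make the 64-bit by-value problem concrete, here is a small sketch of the
two stack layouts involved. It is not the kernel's thunk code: the function
names and the sample values are invented for illustration, and the argument
shape (pointer, 32-bit operation, 64-bit value, pointer) is simply taken from
the argument list of the call removed by the patch. The first helper mimics a
conversion that stores one 32-bit word per argument; the second stores what a
32-bit callee would expect for that shape, with the 64-bit value occupying two
consecutive words, low word first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One 32-bit word per argument, upper halves dropped.
 * Returns the number of stack words written.
 */
size_t push_one_word_each(const uint64_t *args, size_t nargs, uint32_t *stack)
{
	assert(stack != NULL);
	for (size_t i = 0; i < nargs; i++)
		stack[i] = (uint32_t)args[i];
	return nargs;
}

/*
 * Layout for a (pointer, u32, u64 by value, pointer) argument list
 * under a 32-bit stack-based convention: the u64 takes two words.
 */
size_t push_attributes_args(uint32_t this_ptr, uint32_t op, uint64_t attrs,
			    uint32_t result_ptr, uint32_t *stack)
{
	assert(stack != NULL);
	stack[0] = this_ptr;
	stack[1] = op;
	stack[2] = (uint32_t)attrs;		/* low half */
	stack[3] = (uint32_t)(attrs >> 32);	/* high half */
	stack[4] = result_ptr;
	return 5;
}
```

With four arguments the first helper writes four words, so the word a callee
would read as the high half of the 64-bit value actually holds the result
pointer, and the slot where it looks for the result pointer was never written.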
* Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:

> On 11 July 2018 at 12:13, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > * Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> >
> >> The following changes since commit 1e4b044d22517cae7047c99038abb444423243ca:
> >>
> >> Linux 4.18-rc4 (2018-07-08 16:34:02 -0700)
> >>
> >> are available in the Git repository at:
> >>
> >> git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-urgent
> >>
> >> for you to fetch changes up to d7f2e972e702d329fe11d6956df99dfc31211c25:
> >>
> >> efi/x86: remove pointless call to PciIo->Attributes() (2018-07-11 10:52:46 +0200)
> >>
> >> ----------------------------------------------------------------
> >> A single fix for the x86 PCI I/O protocol handling code that got
> >> broken for mixed mode (64-bit Linux/x86 on 32-bit UEFI) after a
> >> fix was applied in -rc2 to fix it for ordinary 64-bit Linux/x86.
> >
> > Just curious, because it's unclear from the changelog, what was the symptom, a
> > boot hang, instant reboot, or some other misbehavior?
>
> Hans reported that his mixed mode tablet would not boot at all any
> more, but enter a reboot loop without any logs printed by the kernel.
>
> > Also, what's the scope of
> > the fix: were all 64-bit on 32-bit UEFI mixed-mode bootups affected, or only a
> > certain subset?
> >
>
> Any mixed mode system with PCI is likely to be affected. I have added
> a QEMU mixed mode config to my boot test environment to catch errors
> like this one.

Ok, I've added this information to the commit - will be useful to backporters,
to judge the severity of the bug fixed.

> The unfortunate thing here is that this uncovered a fundamental issue with mixed
> mode, i.e., that any UEFI protocol prototype involving 64-bit by-value
> parameters needs to be special cased in the stub code, which is rather tedious.
> There is one other call that is potentially affected, a file open call in the
> initrd handling code, but that specific occurrence happens to work unmodified.
> This patch removes the other one. Going forward, we will have to carefully
> review UEFI protocol invocations for mixed mode compatibility.

Yeah. Is there any, more systematic way to detect such problems perhaps at an
earlier stage, other than careful review which will often fail to find such bugs?
Also, testing is good, but could we perhaps do something on a deeper level -
automate the casting, generate a warning on suspicious patterns, etc. etc?

Thanks,

	Ingo

On 11 July 2018 at 13:14, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>
>> On 11 July 2018 at 12:13, Ingo Molnar <mingo@kernel.org> wrote:
>> >
>> > * Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>> >
>> >> The following changes since commit 1e4b044d22517cae7047c99038abb444423243ca:
>> >>
>> >> Linux 4.18-rc4 (2018-07-08 16:34:02 -0700)
>> >>
>> >> are available in the Git repository at:
>> >>
>> >> git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-urgent
>> >>
>> >> for you to fetch changes up to d7f2e972e702d329fe11d6956df99dfc31211c25:
>> >>
>> >> efi/x86: remove pointless call to PciIo->Attributes() (2018-07-11 10:52:46 +0200)
>> >>
>> >> ----------------------------------------------------------------
>> >> A single fix for the x86 PCI I/O protocol handling code that got
>> >> broken for mixed mode (64-bit Linux/x86 on 32-bit UEFI) after a
>> >> fix was applied in -rc2 to fix it for ordinary 64-bit Linux/x86.
>> >
>> > Just curious, because it's unclear from the changelog, what was the symptom, a
>> > boot hang, instant reboot, or some other misbehavior?
>>
>> Hans reported that his mixed mode tablet would not boot at all any
>> more, but enter a reboot loop without any logs printed by the kernel.
>>
>> > Also, what's the scope of
>> > the fix: were all 64-bit on 32-bit UEFI mixed-mode bootups affected, or only a
>> > certain subset?
>> >
>>
>> Any mixed mode system with PCI is likely to be affected. I have added
>> a QEMU mixed mode config to my boot test environment to catch errors
>> like this one.
>
> Ok, I've added this information to the commit - will be useful to backporters,
> to judge the severity of the bug fixed.
>

Perhaps it wasn't clear from the commit log that only v4.18-rc2 and
later is affected by the mixed mode issue, since that is when a fix
for ordinary 64-bit x86 was applied that affected v4.18-rc1.

>> The unfortunate thing here is that this uncovered a fundamental issue with mixed
>> mode, i.e., that any UEFI protocol prototype involving 64-bit by-value
>> parameters needs to be special cased in the stub code, which is rather tedious.
>> There is one other call that is potentially affected, a file open call in the
>> initrd handling code, but that specific occurrence happens to work unmodified.
>> This patch removes the other one. Going forward, we will have to carefully
>> review UEFI protocol invocations for mixed mode compatibility.
>
> Yeah. Is there any, more systematic way to detect such problems perhaps at an
> earlier stage, other than careful review which will often fail to find such bugs?
> Also, testing is good, but could we perhaps do something on a deeper level -
> automate the casting, generate a warning on suspicious patterns, etc. etc?
>

The main problem is the assumption that we can convert any call using
the SysV/x86_64 calling convention to the IA32 calling convention by
pushing a 32-bit word for each argument passed in a register. This
assumption holds most of the time, but not all of the time, and any
argument passed by register that takes up more than a single 32-bit
slot is problematic. Note that EFI_PHYSICAL_ADDRESS is always defined
as 64 bits wide, and is widely used in UEFI. Fortunately, it is mostly
passed by reference, and pointers are 32-bit in mixed mode, so there we
dodge the issue.

To me, it is a bit surprising that GCC cannot do this for us, i.e., we
set some __attribute__(()) on a function declaration to inform the
compiler that it should use the 32-bit calling convention. But I guess
there are issues that complicate this in ways that my limited
understanding of low level x86 does not cover.

In any case, the only way to automate this would be to find *some* way
to instantiate the thunking code specifically for each prototype that
we invoke at runtime.

The most naive approach would be to classify functions as

(u32, u32, u32, u32, u32, ...)
(u64, u32, u32, u32, u32, ...)
(u32, u64, u32, u32, u32, ...)
(u64, u64, u32, u32, u32, ...)

etc etc and have a static library containing the thunking routine for
each one, and wire them up as appropriate. Of course, there is no
point in exhaustively generating each one if we know that only the
file open() call deviates from the first entry.

However, the EFI stub code is not expected to expand that much, and so
for the time being, I'm fine with a combination of review and rigorous
testing.

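One way to picture the classification above is to encode each prototype as a
bitmask in which bit i is set when argument i is a 64-bit by-value parameter,
so the four classes listed correspond to masks 0, 1, 2 and 3. The sketch below
is purely illustrative and is not how the stub is implemented: the type and
function names, and the cap of eight arguments, are assumptions made for the
example. It only computes how many 32-bit slots a class needs and lays the
arguments out accordingly; stack alignment, return values and the actual mode
switch are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THUNK_ARGS 8	/* assumption: arbitrary cap for this sketch */

/* Bit i set: argument i is a 64-bit value passed by value. */
typedef uint8_t arg_class_t;

/* Number of 32-bit stack slots a call of this class occupies. */
size_t ia32_slot_count(size_t nargs, arg_class_t cls)
{
	size_t slots = nargs;

	assert(nargs <= MAX_THUNK_ARGS);
	for (size_t i = 0; i < nargs; i++)
		if (cls & (1u << i))
			slots++;	/* a u64 needs one extra slot */
	return slots;
}

/*
 * Lay out the arguments as 32-bit slots, low word first for u64s.
 * The caller provides room for 2 * nargs slots.
 */
size_t ia32_marshal(const uint64_t *args, size_t nargs, arg_class_t cls,
		    uint32_t *slots)
{
	size_t n = 0;

	assert(nargs <= MAX_THUNK_ARGS);
	for (size_t i = 0; i < nargs; i++) {
		slots[n++] = (uint32_t)args[i];
		if (cls & (1u << i))
			slots[n++] = (uint32_t)(args[i] >> 32);
	}
	return n;
}
```

A table of per-class thunks, as suggested above, could then be indexed by the
same mask, and only the masks that actually occur in the stub would need an
entry.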
* Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:

> On 11 July 2018 at 13:14, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > * Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> >
> >> On 11 July 2018 at 12:13, Ingo Molnar <mingo@kernel.org> wrote:
> >> >
> >> > * Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> >> >
> >> >> The following changes since commit 1e4b044d22517cae7047c99038abb444423243ca:
> >> >>
> >> >> Linux 4.18-rc4 (2018-07-08 16:34:02 -0700)
> >> >>
> >> >> are available in the Git repository at:
> >> >>
> >> >> git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-urgent
> >> >>
> >> >> for you to fetch changes up to d7f2e972e702d329fe11d6956df99dfc31211c25:
> >> >>
> >> >> efi/x86: remove pointless call to PciIo->Attributes() (2018-07-11 10:52:46 +0200)
> >> >>
> >> >> ----------------------------------------------------------------
> >> >> A single fix for the x86 PCI I/O protocol handling code that got
> >> >> broken for mixed mode (64-bit Linux/x86 on 32-bit UEFI) after a
> >> >> fix was applied in -rc2 to fix it for ordinary 64-bit Linux/x86.
> >> >
> >> > Just curious, because it's unclear from the changelog, what was the symptom, a
> >> > boot hang, instant reboot, or some other misbehavior?
> >>
> >> Hans reported that his mixed mode tablet would not boot at all any
> >> more, but enter a reboot loop without any logs printed by the kernel.
> >>
> >> > Also, what's the scope of
> >> > the fix: were all 64-bit on 32-bit UEFI mixed-mode bootups affected, or only a
> >> > certain subset?
> >> >
> >>
> >> Any mixed mode system with PCI is likely to be affected. I have added
> >> a QEMU mixed mode config to my boot test environment to catch errors
> >> like this one.
> >
> > Ok, I've added this information to the commit - will be useful to backporters,
> > to judge the severity of the bug fixed.
> >
>
> Perhaps it wasn't clear from the commit log that only v4.18-rc2 and
> later is affected by the mixed mode issue, since that is when a fix
> for ordinary 64-bit x86 was applied that affected v4.18-rc1.

Ah, ok. Still, if for whatever reason the commit that introduced the problem is
backported, this one will be too. The chain of sha1's seemed rather long, so
there's a chance for that.

> However, the EFI stub code is not expected to expand that much, and so for the
> time being, I'm fine with a combination of review and rigorous testing

Fair enough!

Thanks,

	Ingo