On Thu, Aug 08, 2019 at 08:03:49AM +0200, Jan Beulich wrote:
> On 08.08.2019 04:53, Marek Marczykowski-Górecki wrote:
> > On Wed, Aug 07, 2019 at 09:26:00PM +0200, Marek Marczykowski-Górecki wrote:
> > > Ok, regardless of adding proper option for that, I've hardcoded map_bs=1
> > > and it still crashes, just slightly differently:
> > >
> > > Xen call trace:
> > > [<0000000000000080>] 0000000000000080
> > > [<8c2b0398e0000daa>] 8c2b0398e0000daa
> > >
> > > Pagetable walk from ffffffff858483a1:
> > > L4[0x1ff] = 0000000000000000 ffffffffffffffff
> > >
> > > ****************************************
> > > Panic on CPU 0:
> > > FATAL PAGE FAULT
> > > [error_code=0002]
> > > Faulting linear address: ffffffff858483a1
> > > ****************************************
> > >
> > > Full message attached.
> >
> > After playing more with it and also know workarounds for various EFI
> > issues, I've found a way to boot it: avoid calling Exit BootServices.
> > There was a patch from Konrad adding /noexit option, that never get
> > committed. Similar to efi=mapbs option, I'd add efi=no-exitboot too
> > (once efi=mapbs patch is accepted).
> >
> > Anyway, I'm curious what exactly is wrong here. Is it that the firmware
> > is not happy about lack of SetVirtualAddressMap call? FWIW, the crash is
> > during GetVariable RS call. I've verified that the function itself is
> > within EfiRuntimeServicesCode, but I don't feel like tracing Lenovo
> > UEFI...
>
> This suggests that the firmware zaps a few too many pointers
> during ExitBootServices(). Perhaps internally they check
> whether pointers point into BootServices* memory, and hence the
> wrong marking in the memory map has consequences beyond the OS
> re-using such memory?
>
> A proper answer to your question can of course only be given
> by someone knowing this specific firmware version.

I explored it a bit more and talked with a few people doing firmware
development, and came to a few conclusions:

1. Not calling SetVirtualAddressMap(), while technically legal, is
   pretty uncommon and not recommended if you want to avoid less tested
   (aka buggy) UEFI code paths.

2. Every UEFI call made before SetVirtualAddressMap() should be done
   with flat physical memory. This includes the SetVirtualAddressMap()
   call itself. Implicitly this means such calls can legally access
   memory areas not marked with EFI_MEMORY_RUNTIME.

For the second point, the relevant part of the UEFI 2.7 spec (Runtime
Services - Virtual Memory Services chapter):

    This section contains function definitions for the virtual memory
    support that may be optionally used by an operating system at
    runtime. If an operating system chooses to make EFI runtime service
    calls in a virtual addressing mode instead of the flat physical
    mode, then the operating system must use the services in this
    section to switch the EFI runtime services from flat physical
    addressing to virtual addressing.
    (...)
    The call to SetVirtualAddressMap() must be done with the physical
    mappings. On successful return from this function, the system must
    then make any future calls with the newly assigned virtual
    mappings. All address space mappings must be done in accordance to
    the cacheability flags as specified in the original address map.

I've tried to poke around this part of the Xen code, including
resurrecting SetVirtualAddressMap() (#define USE_SET_VIRTUAL_ADDRESS_MAP
in common/efi/boot.c), and (unsurprisingly) hit multiple issues:

 - at this point, Xen is already relocated and paging is enabled
 - SetVirtualAddressMap() is indeed not happy about being called with
   the new address map already in place
 - the directmap (which that code points at) is mapped with NX, which
   breaks the EfiRuntimeServicesCode area

Then I've tried a different approach: call SetVirtualAddressMap(), but
with an address map that tries to pretend physical addressing (the code
under #ifndef USE_SET_VIRTUAL_ADDRESS_MAP).
This mostly worked; I needed only a few changes:

 - set VirtualStart back to PhysicalStart in that memory map (it was
   set to the directmap)
 - map boot services (at least for the duration of the
   SetVirtualAddressMap() call; I haven't tried unmapping them later)
 - call SetVirtualAddressMap() with that "1:1" map in place, using
   efi_rs_enter/efi_rs_leave.

This fixed the issue for me: runtime services now work even without
disabling the ExitBootServices() call, and without any extra
platform-specific command line arguments. I think it also shouldn't
break kexec, since it uses a 1:1-like map, but I haven't tried. One
should simply ignore the EFI_UNSUPPORTED return code (I don't know how
to avoid the call altogether after kexec).

Any thoughts? If the above sounds good, I'll clean up the patch and
submit it.

BTW, does it qualify for 4.13? On one hand it may be seen as a bugfix
(it fixes booting on some UEFI firmwares), but on the other hand, I
can't think of all the side effects.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
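
P.S. To make the "1:1" map idea a bit more concrete, here is a rough
standalone sketch. It is not the actual patch and has not been run
against Xen: the type and function names (efi_mem_desc_t,
make_identity_map, set_virtual_map_ok) are made up for illustration,
and the struct is only a stand-in mirroring the layout of the spec's
EFI_MEMORY_DESCRIPTOR. It shows two of the points above: resetting
VirtualStart to PhysicalStart for runtime entries, and treating
EFI_UNSUPPORTED as non-fatal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the UEFI definitions, so the sketch compiles on its own. */
#define EFI_MEMORY_RUNTIME 0x8000000000000000ULL
#define EFI_SUCCESS        0ULL
#define EFI_UNSUPPORTED    0x8000000000000003ULL

typedef uint64_t efi_status_t;

/* Hypothetical mirror of EFI_MEMORY_DESCRIPTOR (same field order). */
typedef struct {
    uint32_t type;
    uint32_t pad;
    uint64_t physical_start;
    uint64_t virtual_start;
    uint64_t number_of_pages;
    uint64_t attribute;
} efi_mem_desc_t;

/*
 * Make every EFI_MEMORY_RUNTIME descriptor claim
 * VirtualStart == PhysicalStart, i.e. a map that "pretends" to be
 * physical addressing. desc_size is the stride reported by
 * GetMemoryMap(), which may be larger than sizeof(efi_mem_desc_t).
 * Returns the number of descriptors that were rewritten.
 */
size_t make_identity_map(void *map, size_t map_size, size_t desc_size)
{
    size_t changed = 0;

    assert(desc_size >= sizeof(efi_mem_desc_t));

    for (size_t off = 0; off + desc_size <= map_size; off += desc_size) {
        efi_mem_desc_t *d = (efi_mem_desc_t *)((char *)map + off);

        if (d->attribute & EFI_MEMORY_RUNTIME) {
            d->virtual_start = d->physical_start;
            changed++;
        }
    }
    return changed;
}

/*
 * Assumption for illustration: EFI_UNSUPPORTED from
 * SetVirtualAddressMap() (e.g. after kexec) is simply ignored.
 */
int set_virtual_map_ok(efi_status_t status)
{
    return status == EFI_SUCCESS || status == EFI_UNSUPPORTED;
}
```

In real code the adjusted map would then be passed to
SetVirtualAddressMap() between efi_rs_enter() and efi_rs_leave(), with
boot services memory still mapped for the duration of the call.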