On Tue, Oct 08, 2019 at 04:19:13PM +0200, Jan Beulich wrote: > On 08.10.2019 15:52, Marek Marczykowski-Górecki wrote: > > On Tue, Oct 08, 2019 at 03:08:29PM +0200, Jan Beulich wrote: > >> On 08.10.2019 13:50, Marek Marczykowski-Górecki wrote: > >>> I explored it a bit more and talked with a few people doing firmware > >>> development and few conclusions: > >>> 1. Not calling SetVirtualAddressMap(), while technically legal, is > >>> pretty uncommon and not recommended if you want to avoid less tested > >>> (aka buggy) UEFI code paths. > >>> 2. Every UEFI call before SetVirtualAddressMap() call should be done > >>> with flat physical memory. This include SetVirtualAddressMap() call > >>> itself. Implicitly this means such calls can legally access memory areas > >>> not marked with EFI_MEMORY_RUNTIME. > >> > >> I don't think this is quite right - whether non-runtime memory may > >> be touched depends exclusively on ExitBootServices() (not) having > >> got called (yet). > > > > That would be logical. In practice however we have evidences firmware > > vendors have different opinion... A comment from Linux (already quoted > > here 2 months ago): > > > > /* > > * The UEFI specification makes it clear that the operating system is > > * free to do whatever it wants with boot services code after > > * ExitBootServices() has been called. Ignoring this recommendation a > > * significant bunch of EFI implementations continue calling into boot > > * services code (SetVirtualAddressMap). In order to work around such > > * buggy implementations we reserve boot services region during EFI > > * init and make sure it stays executable. Then, after > > * SetVirtualAddressMap(), it is discarded. > > * > > * However, some boot services regions contain data that is required > > * by drivers, so we need to track which memory ranges can never be > > * freed. This is done by tagging those regions with the > > * EFI_MEMORY_RUNTIME attribute. > > * > > * Any driver that wants to mark a region as reserved must use > > * efi_mem_reserve() which will insert a new EFI memory descriptor > > * into efi.memmap (splitting existing regions if necessary) and tag > > * it with EFI_MEMORY_RUNTIME. > > */ > > But you realize that the comment specifically talks about > the call tree underneath SetVirtualAddressMap() being the violator. > As long as we don't call this function, we're unaffected as far as > this comment goes. Well, this very thread proves it isn't only about SetVirtualAddressMap(). I _guess_ it's about calls before SetVirtualAddressMap() returns (which supposedly do some cleanups). In Linux case, it is only that one call. > > Regardless of SetVirtualAddressMap() discussion, I propose to > > automatically map boot services code/data, to make Xen work on more > > machines (even if _we_ consider those buggy). > > I remain on my prior position: Adding command line triggerable > workarounds for such cases is fine. Defaulting to assume buggy > firmware is acceptable only if this means no extra penalty to > systems with conforming firmware. Hence, for the case at hand, > I object to doing this automatically; we already have the > /mapbs workaround in place to deal with the case when xen.efi > is used. Judging from the title here there may need to be an > addition to also allow triggering this from the MB2 boot path. What about mirroring Linux behavior? I.e. mapping those regions for the SetVirtualAddressMap() time (when enabled) and unmapping after - unless tagged with EFI_MEMORY_RUNTIME? Similarly to Andrew, I'd really prefer for Xen to work out of the box, with as little as possible manual tweaks needed. > >>> Then I've tried a different approach: call SetVirtualAddressMap(), but > >>> with an address map that tries to pretend physical addressing (the code > >>> under #ifndef USE_SET_VIRTUAL_ADDRESS_MAP). This mostly worked, I needed > >>> only few changes: > >>> - set VirtualStart back to PhysicalStart in that memory map (it was set > >>> to directmap) > >>> - map boot services (at least for the SetVirtualAddressMap() call time, > >>> but haven't tried unmapping it later) > >>> - call SetVirtualAddressMap() with that "1:1" map in place, using > >>> efi_rs_enter/efi_rs_leave. > >>> > >>> This fixed the issue for me, now runtime services do work even without > >>> disabling ExitBootServices() call. And without any extra > >>> platform-specific command line arguments. And I think it also shouldn't break > >>> kexec, since it uses 1:1-like map, but I haven't tried. One should > >>> simply ignore EFI_UNSUPPORTED return code (I don't know how to avoid the > >>> call at all after kexec). > >> > >> That's the point - it can't be avoided, and hence it failing is not > >> an option. Or else there needs to be a protocol telling kexec what > >> it is to expect, and that it in particular can't change the virtual > >> address map to its liking. Back at the time when I put together the > >> EFI booting code, no such protocol existed, and hence calling > >> SetVirtualAddressMap() was not an option. I have no idea whether > >> things have changed in the meantime. > > > > Hmm, how is it different from the current situation? Not calling > > SetVirtualAddressMap() means UEFI will not be notified about new address > > map. It does _not_ mean it will use 1:1 map, it will use what was > > previously set. > > What do you mean by "previously set"? SetVirtualAddressMap() can be > invoked only once during a given session (i.e. without intervening > boot). How would other than a 1:1 map have got set? Like, in the very next sentence of my answer: > > What if Xen was kexec'ed from Linux? > > In Linux case, it looks like it passes around the EFI memory map using > > some Linux-specific mechanism, but I don't find it particularly > > appealing option. > > Indeed. > > > What about something in between: make this SetVirtualAddressMap() call > > compile-time option (kconfig), depending on !CONFIG_KEXEC ? And when > > enabled, properly handle SetVirtualAddressMap() failure. > > What is "proper handling" here? Logging the error and either panic or disabling runtime services (I tend to the latter). > > I my case, > > where I do care about supporting various UEFI implementations, I don't > > need kexec support. And apparently people carrying about kexec don't > > have problems with lack of SetVirtualAddressMap(), so that would be > > win-win, no? > > Allowing SetVirtualAddressMap() when !KEXEC would be fine with me. > The fly in the ointment here is that we'd prefer not to have such > Kconfig options (at least not without EXPERT qualifier), as > (security) supporting all the possible combinations would be a > nightmare. If an EXPERT dependency is okay with you, then I'll be > looking forward to your patch. EXPERT is fine with me. -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing?