From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Jan Beulich"
Subject: Re: Domain Save Image Format proposal (draft B)
Date: Tue, 11 Feb 2014 13:20:25 +0000
Message-ID: <52FA31A9020000780011B261@nat28.tlf.novell.com>
References: <52F90A71.40802@citrix.com> <52FA043B020000780011B10C@nat28.tlf.novell.com> <52FA1FC8.7010104@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <52FA1FC8.7010104@citrix.com>
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: David Vrabel
Cc: Shriram Rajagopalan, "Xen-devel@lists.xen.org", Ian Jackson, Ian Campbell, Stefano Stabellini
List-Id: xen-devel@lists.xenproject.org

>>> On 11.02.14 at 14:04, David Vrabel wrote:
> On 11/02/14 10:06, Jan Beulich wrote:
>>>>> On 10.02.14 at 18:20, David Vrabel wrote:
>>> Domain Header
>>> -------------
>>>
>>> The domain header includes general properties of the domain.
>>>
>>>       0     1     2     3     4     5     6     7 octet
>>>     +-----------+-----------+-----------+-------------+
>>>     | arch      | type      | page_shift| (reserved)  |
>>>     +-----------+-----------+-----------+-------------+
>>>
>>> --------------------------------------------------------------------
>>> Field       Description
>>> ----------- --------------------------------------------------------
>>> arch        0x0000: Reserved.
>>>
>>>             0x0001: x86.
>>>
>>>             0x0002: ARM.
>>>
>>> type        0x0000: Reserved.
>>>
>>>             0x0001: x86 PV.
>>>
>>>             0x0002 - 0xFFFF: Reserved.
>>
>> So how would ARM, x86 HVM, and x86 PVH be expressed?
>
> Something like:
>
> 0x0001: x86 PV.
> 0x0002: x86 HVM.
> 0x0003: x86 PVH.
> 0x0004: ARM.

Ah, so the list above wasn't meant to be exhaustive (for the current
set of things to care about).

> Which does make the arch field a bit redundant, I suppose.

Indeed.

>>> P2M
>>> ---
>>>
>>> [ This is a more flexible replacement for the old p2m_size field and
>>> p2m array.
>>> ]
>>>
>>> The P2M record contains a portion of the source domain's P2M.
>>> Multiple P2M records may be sent if the source P2M changes during the
>>> stream.
>>>
>>>       0     1     2     3     4     5     6     7 octet
>>>     +-------------------------------------------------+
>>>     | pfn_begin                                       |
>>>     +-------------------------------------------------+
>>>     | pfn_end                                         |
>>>     +-------------------------------------------------+
>>>     | mfn[0]                                          |
>>>     +-------------------------------------------------+
>>>     ...
>>>     +-------------------------------------------------+
>>>     | mfn[N-1]                                        |
>>>     +-------------------------------------------------+
>>>
>>> --------------------------------------------------------------------
>>> Field       Description
>>> ----------- --------------------------------------------------------
>>> pfn_begin   The first PFN in this portion of the P2M
>>>
>>> pfn_end     One past the last PFN in this portion of the P2M.
>>
>> I'd favor an inclusive range here, such that if we ever reach a
>> fully populatable 64-bit PFN space (on some future architecture)
>> there'd still be no issue with special casing the then unavoidable
>> wraparound.
>
> Ok, but 64-bit PFN space would suggest 76 bit of address space which
> seems somewhat far off.  Is that something we want to consider now?

If it's as cheap as using an inclusive range instead of a half-inclusive
one, I'd say yes.

>>> Legacy Images (x86 only)
>>> ========================
>>>
>>> Restoring legacy images from older tools shall be handled by
>>> translating the legacy format image into this new format.
>>>
>>> It shall not be possible to save in the legacy format.
>>>
>>> There are two different legacy images depending on whether they were
>>> generated by a 32-bit or a 64-bit toolstack. These shall be
>>> distinguished by inspecting octets 4-7 in the image.  If these are
>>> zero then it is a 64-bit image.
>>>
>>> Toolstack  Field                            Value
>>> ---------  -----                            -----
>>> 64-bit     Bit 31-63 of the p2m_size field  0 (since p2m_size < 2^32^)
>>
>> Afaics this is being determined via xc_domain_maximum_gpfn(),
>> which I don't think guarantees the result to be limited to 2^32.
>> Or in fact the libxc interface wrongly limits the value (by
>> truncating the "long" returned from the hypercall to an "int"). So
>> in practice consistent images would have the field limited to 2^31
>> on 64-bit tool stacks (since for larger values the negative function
>> return value would get converted by sign-extension, but all sorts
>> of other trouble would result due to the now huge p2m_size).
>
> For the handling of legacy images I think we need to only consider
> images that could have been practically generated by older tools.

Right. That's what I meant to say with everything following the
first sentence.

>>> Future Extensions
>>> =================
>>>
>>> All changes to this format require the image version to be increased.
>>
>> Oh, okay, this partly deals with the first question above. Question
>> is whether that's a useful requirement, i.e. whether that wouldn't
>> lead to an inflation of versions needing conversion (for a tool stack
>> that wants to support more than just migration from N-1).
>
> Only legacy images would be converted to the newest format.  I would
> expect version V-1 images would be handled by (mostly) the same code as
> V images.  Particularly if V is V-1 with extra record types.

Just consider distros, namely such that have a lower release frequency
than we. They'd necessarily want to cover at least the range between
their V'-1 and V', which could just end up being V-2 ... V from Xen
pov, but - especially if we were to further shorten the release
cycle - could easily become a larger range.

Jan
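
The wraparound concern in the P2M discussion above can be sketched in C
roughly as follows. This is only an illustration under stated
assumptions: the helper names are hypothetical and belong neither to the
proposed format nor to libxc, and PFNs are assumed to be held in a
uint64_t. It shows that a "one past the last PFN" end value wraps to 0
for a range reaching the top of a 64-bit PFN space, while an inclusive
last PFN stays representable.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; not part of the proposed
 * image format or of libxc.
 */

/* Half-open encoding: pfn_end is one past the last PFN.  For a last PFN
 * of UINT64_MAX this unsigned addition wraps around to 0. */
static uint64_t half_open_end(uint64_t last_pfn)
{
    return last_pfn + 1;
}

/* A half-open range [begin, end) only looks well-formed if begin < end,
 * so a wrapped end of 0 needs special casing. */
static bool half_open_valid(uint64_t begin, uint64_t end)
{
    return begin < end;
}

/* An inclusive range [first, last] is well-formed if first <= last,
 * including when last is the highest possible PFN. */
static bool inclusive_valid(uint64_t first, uint64_t last)
{
    return first <= last;
}

/* Number of mfn[] entries minus one; this always fits in 64 bits. */
static uint64_t inclusive_count_minus_one(uint64_t first, uint64_t last)
{
    return last - first;
}
```

For a range covering the top four PFNs, half_open_end(UINT64_MAX) yields
0 and the plain begin < end check rejects the range, whereas the
inclusive form passes its check with no special case.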