On Tue, Feb 11, 2014 at 5:58 AM, David Vrabel wrote:
> On 10/02/14 20:00, Shriram Rajagopalan wrote:
> > On Mon, Feb 10, 2014 at 9:20 AM, David Vrabel wrote:
> >
> > It's tempting to adopt all the TCP-style madness for transferring a set
> > of structured data. Why this endian-ness mess? Am I missing something
> > here? I am assuming that the lion's share of Xen's deployment is on x86
> > (not including Amazon). So that leaves ARM. Why not let these
> > processors take the hit of endian-ness conversion?
>
> I'm not sure I would characterize a spec being precise about byte
> ordering as "endianness mess".
>
> I think it would be a pretty poor specification if it didn't specify
> byte ordering -- we can't have the tools having to make assumptions
> about the ordering.

Totally agree. But as someone else put it (and you did as well), my point
was that it's sufficient to specify the byte order once, somewhere in the
image header, while making sure (as you put it below) that the current use
cases don't have to go through needless endian conversion.

> However, I do think it can be specified in such a way that all the
> current use cases don't have to do any byte swapping (except for the
> minimal header).
>
> > +-----------------------+-------------------------+
> > | checksum              | (reserved)              |
> > +-----------------------+-------------------------+
> >
> > I am assuming that the checksum field is present only
> > for debugging purposes? Otherwise, I see no reason for the
> > computational overhead, given that we are already sending data
> > over a reliable channel + IIRC we already have an image-wide checksum
> > when saving the image to disk.
>
> I'm not aware of any image wide checksum.

Yep. I was mistaken.

> The checksum seems like a potentially useful feature but I don't have a
> requirement for it so if no one else thinks it is useful it can be removed.

My suggestion: when saving the image to disk, why not have a single
image-wide checksum to ensure that the image being restored from disk is
still valid?

> > PAGE_DATA
> > ---------
> [...]
> > --------------------------------------------------------------------
> > Field       Description
> > ----------- --------------------------------------------------------
> > count       Number of pages described in this record.
> >
> > pfn         An array of count PFNs. Bits 63-60 contain
> >             the XEN_DOMCTL_PFINFO_* value for that PFN.
> >
> > page_data   page_size octets of uncompressed page contents for each
> >             page set as present in the pfn array.
> > --------------------------------------------------------------------
> >
> > s/uncompressed/(compressed/uncompressed)/
> > (Remus sends compressed data)
>
> No. I think compressed page data should have its own record type. The
> current scheme of mode flipping records seems crazy to me.

What record flipping? For page compression, Remus basically sends a simple
XOR+RLE encoded sequence of bytes, preceded by a 4-byte length field.
Instead of the usual 4K of per-page page_data, this compressed chunk is
sent. The additional code on the remote side is just an extra "if" block
that uses xc_uncompress instead of memcpy to get the uncompressed page. It
would not change the way the PAGE_DATA record is transmitted.

Though, one potentially cooler addition could be to use the option field of
the record header to indicate whether the data is compressed or not. Given
that we have 64 bits, we could even go as far as specifying the type of
compression module used (e.g., none, remus, gzip, etc.).
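To make that concrete, here is a rough sketch of what the receive side
could look like. Everything in it is hypothetical -- the rec_hdr layout,
the COMP_* values and the remus/gzip helpers are stand-ins I made up for
illustration, not anything from the draft spec:

/* Rough sketch only: the rec_hdr layout, the comp_type values and the
 * remus/gzip helpers below are hypothetical stand-ins, not part of the
 * draft spec. */

#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Hypothetical encoding of the compression module in the low bits of
 * the record header's options field. */
enum comp_type {
    COMP_NONE  = 0,   /* raw 4K pages, as in the current draft */
    COMP_REMUS = 1,   /* XOR+RLE chunk, 4-byte length prefix   */
    COMP_GZIP  = 2,   /* deflate-compressed chunk              */
};

struct rec_hdr {
    uint32_t type;
    uint32_t length;
    uint64_t options;   /* low 8 bits: compression module (hypothetical) */
};

/* Placeholder for the real XOR+RLE decoder; in Remus the receiver keeps
 * the previous copy of the page, so it would decode on top of 'page'. */
static int remus_uncompress(const void *in, uint32_t in_len, void *page)
{
    (void)in; (void)in_len; (void)page;
    return -1;
}

/* Placeholder: would inflate in_len octets into a 4K page. */
static int gzip_uncompress(const void *in, uint32_t in_len, void *page)
{
    (void)in; (void)in_len; (void)page;
    return -1;
}

/* Turn one page's worth of record payload into a raw 4K page. */
int unpack_page(const struct rec_hdr *h, const void *data,
                uint32_t data_len, void *page)
{
    switch (h->options & 0xff) {
    case COMP_NONE:
        if (data_len != PAGE_SIZE)
            return -1;
        memcpy(page, data, PAGE_SIZE);   /* today's path */
        return 0;
    case COMP_REMUS:
        return remus_uncompress(data, data_len, page);
    case COMP_GZIP:
        return gzip_uncompress(data, data_len, page);
    default:
        return -1;                       /* unknown module: reject */
    }
}

The COMP_NONE case is just today's memcpy path, and unknown module values
can simply be rejected, so older tools fail cleanly rather than mis-parse
the stream.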
Per-record compression might be really helpful when one wants to
save/restore large images (an 8GB VM, for example) to/from disk. Is this
better or worse than simply gzipping the entire saved image? I don't know
yet. For live migration, though, it would be pretty helpful, especially
when migrating over long-latency networks. Remus' compression technique
cannot be used for live migration, as it requires a previous version of
each page for the XOR+RLE compression; but gzip and other such compression
algorithms would be pretty handy in the live migration case, over a WAN or
even a clogged LAN where there are tons of VMs being moved back and forth.

Feel free to shoot down this idea if it seems infeasible.

Thanks
shriram