From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:50294) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YG30v-00034T-J1 for qemu-devel@nongnu.org; Tue, 27 Jan 2015 05:04:58 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YG30s-00050A-93 for qemu-devel@nongnu.org; Tue, 27 Jan 2015 05:04:53 -0500 Received: from mx1.redhat.com ([209.132.183.28]:37075) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YG30s-0004zO-0W for qemu-devel@nongnu.org; Tue, 27 Jan 2015 05:04:50 -0500 Date: Tue, 27 Jan 2015 10:04:31 +0000 From: "Dr. David Alan Gilbert" Message-ID: <20150127100430.GB2425@work-vm> References: <1412358473-31398-1-git-send-email-dgilbert@redhat.com> <1412358473-31398-44-git-send-email-dgilbert@redhat.com> <20141113031039.GE7291@voom.fritz.box> <20141217182131.GF31071@work-vm> <20150127045055.GC19081@voom.fritz.box> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150127045055.GC19081@voom.fritz.box> Subject: Re: [Qemu-devel] [PATCH v4 43/47] Host page!=target page: Cleanup bitmaps List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: David Gibson Cc: aarcange@redhat.com, yamahata@private.email.ne.jp, quintela@redhat.com, cristian.klein@cs.umu.se, qemu-devel@nongnu.org, amit.shah@redhat.com, yanghy@cn.fujitsu.com * David Gibson (david@gibson.dropbear.id.au) wrote: > On Wed, Dec 17, 2014 at 06:21:34PM +0000, Dr. David Alan Gilbert wrote: > > * David Gibson (david@gibson.dropbear.id.au) wrote: > > > On Fri, Oct 03, 2014 at 06:47:49PM +0100, Dr. David Alan Gilbert (git) wrote: > > > > From: "Dr. David Alan Gilbert" > > > > > > > > Prior to the start of postcopy, ensure that everything that will > > > > be transferred later is a whole host-page in size. > > > > > > > > This is accomplished by discarding partially transferred host pages > > > > and marking any that are partially dirty as fully dirty. > > > > > > > > Signed-off-by: Dr. David Alan Gilbert > > > > --- > > > > arch_init.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- > > > > 1 file changed, 111 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/arch_init.c b/arch_init.c > > > > index 1fe4fab..aac250c 100644 > > > > --- a/arch_init.c > > > > +++ b/arch_init.c > > > > @@ -1024,7 +1024,6 @@ static uint32_t get_32bits_map(unsigned long *map, int64_t start) > > > > * A helper to put 32 bits into a bit map; trivial for HOST_LONG_BITS=32 > > > > * messier for 64; the bitmaps are actually long's that are 32 or 64bit > > > > */ > > > > -__attribute__ (( unused )) /* Until later in patch series */ > > > > static void put_32bits_map(unsigned long *map, int64_t start, > > > > uint32_t v) > > > > { > > > > @@ -1153,15 +1152,126 @@ static int pc_each_ram_discard(MigrationState *ms) > > > > } > > > > > > > > /* > > > > + * Utility for the outgoing postcopy code. > > > > + * > > > > + * Discard any partially sent host-page size chunks, mark any partially > > > > + * dirty host-page size chunks as all dirty. > > > > + * > > > > + * Returns: 0 on success > > > > + */ > > > > +static int postcopy_chunk_hostpages(MigrationState *ms) > > > > +{ > > > > + struct RAMBlock *block; > > > > + unsigned int host_bits = sysconf(_SC_PAGESIZE) / TARGET_PAGE_SIZE; > > > > + uint32_t host_mask; > > > > + > > > > + /* Should be a power of 2 */ > > > > + assert(host_bits && !(host_bits & (host_bits - 1))); > > > > + /* > > > > + * If the host_bits isn't a division of 32 (the minimum long size) > > > > + * then the code gets a lot more complex; disallow for now > > > > + * (I'm not aware of a system where it's true anyway) > > > > + */ > > > > + assert((32 % host_bits) == 0); > > > > > > This assert makes the first one redundant. > > > > True I guess, removed the power of 2 check. > > > > > > > > > > +/* > > > > * Transmit the set of pages to be discarded after precopy to the target > > > > * these are pages that have been sent previously but have been dirtied > > > > * Hopefully this is pretty sparse > > > > */ > > > > int ram_postcopy_send_discard_bitmap(MigrationState *ms) > > > > { > > > > + int ret; > > > > + > > > > /* This should be our last sync, the src is now paused */ > > > > migration_bitmap_sync(); > > > > > > > > + /* Deal with TPS != HPS */ > > > > + ret = postcopy_chunk_hostpages(ms); > > > > + if (ret) { > > > > + return ret; > > > > + } > > > > > > This really seems like a bogus thing to be doing on the outgoing > > > migration side. Doesn't the host page size constraint come from the > > > destination (due to the need to atomically instate pages). Source > > > host page size == destination host page size doesn't seem like it > > > should be an inherent constraint > > > > It's not an inherent constraint; it just makes life messier. I had > > some code to deal with it but it complicates things even more, and > > I've not got anything to test that rare case with; if someone is > > desperate for it then it can be added. > > So, I'm all for deferring implementation improvements that we don't > need for the time being. > > What worries me though, is having the source have to make assumptions > about how the migration stream will be processed on the destination > that aren't somehow baked into the protocol itself. i.e. I think we > should really try to avoid the possibility of migration streams that > are structurally sound, and look like they should be valid, but > aren't, because of subtle constraints in the order and manner in which > the destination needs to process the individual chunks. Agreed; see below. > > > and it's not clear why you can't do > > > this rounding out to host page sized chunks on the receive end. > > > > The source keeps track of which pages still need sending, and so > > has to update that list when it tells the destination to perform > > a discard. > > Ah. > > > If the destination discards more than the source told it to (for > > example because it has bigger host-pages) the source would need > > to update it's map of the pages that still need sending. > > I'm beginning to wonder if what we really need is for early in the > migration process the destination to tell the host what granularity of > updates it can handle (based on its page size). > > Perhaps the short summary is that I don't think we need to actually > handle the case of different source and dest host page sizes. BUT, > if that does happen the migration process should be able to detect > that that's what's gone wrong and print a meaningful error, rather > than having the destination blow up part way through and deep the code > because chunk constraints necessary for the dest host page size > haven't been met by the source. Right; I cut the problem in the opposite direction and made the source send the destination its page sizes in the 'advise' message and the destination validates it and spits a: Postcopy needs matching host page sizes (s=%d d=%d)" (That's in 21/47 Add wrappers and handlers....) It's just a little easier to do it that way rather than having to make the source wait for the destination. Dave > > -- > David Gibson | I'll have my music baroque, and my code > david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ > | _way_ _around_! > http://www.ozlabs.org/~dgibson -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK