From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:47432)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1X5GLY-0002gP-4Y for qemu-devel@nongnu.org;
	Thu, 10 Jul 2014 11:33:27 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1X5GLQ-0005NX-LK for qemu-devel@nongnu.org;
	Thu, 10 Jul 2014 11:33:20 -0400
Received: from mx1.redhat.com ([209.132.183.28]:61613)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1X5GLQ-0005NE-E5 for qemu-devel@nongnu.org;
	Thu, 10 Jul 2014 11:33:12 -0400
Date: Thu, 10 Jul 2014 17:33:05 +0200
From: Andrea Arcangeli
Message-ID: <20140710153305.GA20989@redhat.com>
References: <1404495717-4239-1-git-send-email-dgilbert@redhat.com>
	<53B7D36B.4050800@redhat.com> <20140707140229.GA3443@work-vm>
	<53BAB03B.5000308@redhat.com> <20140710112921.GG2627@work-vm>
	<53BE8B8E.5020505@redhat.com> <20140710133742.GH2627@work-vm>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20140710133742.GH2627@work-vm>
Subject: Re: [Qemu-devel] [PATCH 00/46] Postcopy implementation
To: "Dr. David Alan Gilbert"
Cc: yamahata@private.email.ne.jp, lilei@linux.vnet.ibm.com,
	quintela@redhat.com, qemu-devel@nongnu.org, Paolo Bonzini

On Thu, Jul 10, 2014 at 02:37:43PM +0100, Dr. David Alan Gilbert wrote:
> * Eric Blake (eblake@redhat.com) wrote:
> > Is there any need for an
> > event telling libvirt that enough pre-copy has occurred to make a
> > postcopy worthwhile?
>
> I'm not sure that qemu knows much more than management does at that
> point; any such decision you can make based on an arbitrary cut off
> (i.e. migration is taking too long) or you could consider something
> based on some of the other stats that migration already exposes
> (like the dirty pages stats); if we've got any more stats that you
> need we can always expose them.
> > Agreed; although we can just do that independently of this big patch set.

It can be independent, yes, but I think such an event is needed (and
once we add it I hope we can get rid of the polling libvirt is doing
for pure precopy too).

I think for very large guests what should happen is a single _lazy_
pass of precopy and then immediately postcopy. That's why I think an
event that notifies libvirt when it should issue the postcopy command
is good: it makes it possible to implement the single _lazy_ pass and
nothing more than that.

qemu should stop precopy and the source guest just before sending the
event, so that libvirt can assign all storage to the destination just
before issuing the postcopy command. Once qemu has raised the event,
the guest in the source qemu must never run again.

So it is actually the same event that is needed in pure precopy too
(except that with precopy+postcopy the "precopy complete" event will
fire much sooner).

We'll still need a parameter to precopy to tell qemu when precopy
should stop.

The single lazy precopy pass would consist of clearing the dirty
bitmap and starting precopy; then, if a page is found dirty by the
time precopy tries to send it, we skip it. In precopy we only send
the pages that haven't been modified yet by the time we reach them.
Heavily modified pages will be sent purely through postcopy.

Ultimately postcopy will be a page sorting feature that massively
decreases the downtime latency and caps the maximum amount of data
transferred on the network at 2*ramsize, without having to slow down
the guest artificially.

We'll also know in advance exactly the maximum time it takes to
migrate a large host, no matter the load in it (2*ramsize divided by
the network bandwidth available at migration time). It'll be totally
deterministic, no black magic slowdowns anymore.
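
To make the single lazy pass and the 2*ramsize bound concrete, here is
a minimal Python sketch. It is only a toy model, not qemu code: the
names lazy_precopy_pass, guest_writes and worst_case_seconds are made
up for illustration, and it assumes that every page still marked dirty
when the source guest stops (skipped during the pass, or dirtied after
it was already sent) is left for postcopy.

```python
def lazy_precopy_pass(num_pages, guest_writes):
    """Toy model of one lazy precopy pass (hypothetical, not qemu's API).

    guest_writes maps a page index i to the set of pages the guest
    dirties just before precopy reaches page i.
    Returns (pages sent in precopy, pages left for postcopy).
    """
    dirty = [False] * num_pages              # clear the dirty bitmap
    sent_in_precopy = []
    for page in range(num_pages):            # a single linear walk
        for written in guest_writes.get(page, ()):
            dirty[written] = True            # the guest keeps running
        if dirty[page]:
            continue                         # dirty when reached: skip it
        sent_in_precopy.append(page)
    # Assumption: the source guest is stopped here, so whatever is still
    # dirty (skipped, or re-dirtied after being sent) goes via postcopy.
    postcopy_pages = [p for p in range(num_pages) if dirty[p]]
    return sent_in_precopy, postcopy_pages


def worst_case_seconds(ram_bytes, bandwidth_bytes_per_s):
    """Upper bound from the text: 2*ramsize divided by the bandwidth."""
    return 2 * ram_bytes / bandwidth_bytes_per_s


sent, post = lazy_precopy_pass(8, {0: {5}, 3: {1, 3}, 6: {5}})
print(sent)   # [0, 1, 2, 4, 6, 7]
print(post)   # [1, 3, 5]
print(worst_case_seconds(100, 50))   # 4.0
```

In this model each page crosses the network at most once in precopy
and at most once in postcopy, which is where the 2*ramsize ceiling
comes from: page 1 above is sent twice, pages 3 and 5 only through
postcopy, and everything else only through precopy.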