From: Juan Quintela
Reply-To: quintela@redhat.com
Date: Wed, 19 Apr 2017 13:18:36 +0200
Message-ID: <8737d4hchf.fsf@secure.mitica>
In-Reply-To: <20170419111328.GB2099@work-vm> (David Alan Gilbert's message of
 "Wed, 19 Apr 2017 12:13:37 +0100")
References: <20170412091819.GB4955@noname.str.redhat.com>
 <20170418144725.GJ21261@stefanha-x1.localdomain>
 <20170418153256.GE9236@noname.redhat.com>
 <20170419111328.GB2099@work-vm>
Subject: Re: [Qemu-devel] [Qemu-block] migrate -b problems
To: "Dr. David Alan Gilbert"
Cc: Kevin Wolf, Stefan Hajnoczi, qemu-block@nongnu.org, famz@redhat.com,
 qemu-devel@nongnu.org, stefanha@redhat.com

"Dr. David Alan Gilbert" wrote:
> * Kevin Wolf (kwolf@redhat.com) wrote:
>> Am 18.04.2017 um 16:47 hat Stefan Hajnoczi geschrieben:
>> > On Wed, Apr 12, 2017 at 11:18:19AM +0200, Kevin Wolf wrote:
>> > > After getting assertion failure reports for block migration at the
>> > > last minute, we just hacked around it by commenting out op blocker
>> > > assertions for the 2.9 release, but now we need to see how to fix
>> > > things properly. Luckily, get_maintainer.pl doesn't report me, but
>> > > only you. :-)
>> > >
>> > > The main problem I see with the block migration code (on the
>> > > destination) is that it abuses the BlockBackend that belongs to the
>> > > guest device to make its own writes to the image file. If the guest
>> > > isn't allowed to write to the image (which it now isn't during
>> > > incoming migration, since it would conflict with the newer style of
>> > > block migration using an NBD server), writing to this BlockBackend
>> > > doesn't work any more.
>> > >
>> > > So what should really happen is that incoming block migration
>> > > creates its own BlockBackend for writing to the image. Now we don't
>> > > want to do this anew for every incoming block, but ideally we'd
>> > > just create all necessary BlockBackends upfront and then keep using
>> > > them throughout the whole migration. Is there a way to get some
>> > > setup/teardown callbacks at the start/end of the migration that
>> > > could initialise and free such global data?
>> >
>> > It can be done at the beginning of block_load(), similar to
>> > block_mig_state.bmds_list, which is created in init_blk_migration()
>> > at save time.
>>
>> The difference is that block_load() is the counterpart of
>> block_save_iterate(), not of init_blk_migration(). That is, it is
>> called for each chunk of block migration data, which is interleaved
>> with normal RAM migration chunks.
>>
>> So we can either create each BlockBackend the first time we need it in
>> block_load(), or create BlockBackends for all existing device BBs and
>> BDSes the first time block_load() is called. We still need some place
>> to actually free the BlockBackends again when the migration completes.
>>
>> Dave suggested migration state notifiers, which looked like an option,
>> but at least the existing migration states aren't enough, because the
>> BlockBackends need to go away before blk_resume_after_migration() is
>> called, but MIGRATION_STATUS_COMPLETED is set only afterwards.
>>
>> > We can also move the if (blk != blk_prev) blk_invalidate_cache()
>> > code out of the load loop. It should be done once when setting up
>> > BlockBackends.
>>
>> Same problem as above: while saving has setup/cleanup callbacks, we
>> only have the iterate callback for loading.
>
> Yes, and while we have the notifier chain on the source for migration
> state changes, we don't have a notifier on the destination.
>
> If we just add a load_cleanup member to SaveVMHandlers and call all of
> them at the end of an inbound migration, would that be enough?
> (And define 'end'.)

We already have a setup() one; that should be enough, no?

We also need a cleanup() one; that is what I am going to add (rough
sketches of both pieces follow below).

Anything else that is needed for this particular problem?

Thanks, Juan.
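
To make Kevin's dedicated-BlockBackend idea above concrete, here is a
minimal sketch, assuming the 2.9-era permission API (blk_new() taking a
perm/shared_perm pair, blk_insert_bs() taking an Error pointer);
blk_mig_dst_blk_new() is an illustrative name, not an existing function:

    /* Rough sketch, not a committed fix: give incoming block migration
     * its own BlockBackend instead of writing through the BlockBackend
     * that belongs to the guest device. */
    static BlockBackend *blk_mig_dst_blk_new(BlockDriverState *bs,
                                             Error **errp)
    {
        BlockBackend *blk;

        /* Request write permission for ourselves; tolerate whatever
         * permissions other users (e.g. the NBD server) already hold. */
        blk = blk_new(BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE,
                      BLK_PERM_ALL);
        if (blk_insert_bs(blk, bs, errp) < 0) {
            blk_unref(blk);
            return NULL;
        }
        return blk;
    }

One such BlockBackend per migrated device, created the first time
block_load() sees the device (or for all device BDSes on the first
block_load() call) and released from whatever cleanup hook we end up
with, would keep block migration's writes off the guest's BlockBackend.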
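
For the cleanup hook itself, here is a sketch of the SaveVMHandlers
direction discussed above; the field names load_setup/load_cleanup and
the helper qemu_loadvm_state_cleanup() are assumptions, not code that
exists at the time of this mail:

    /* Sketch: inbound counterparts to the existing save-side hooks. */
    typedef struct SaveVMHandlers {
        /* ... existing save-side hooks and load_state ... */
        int (*load_state)(QEMUFile *f, void *opaque, int version_id);

        /* Proposed additions: */
        int (*load_setup)(QEMUFile *f, void *opaque); /* before 1st chunk */
        int (*load_cleanup)(void *opaque);   /* end of inbound migration */
    } SaveVMHandlers;

    /* Called once when the incoming migration finishes (or fails);
     * block migration would free its destination BlockBackends from its
     * load_cleanup hook, before blk_resume_after_migration() runs. */
    static void qemu_loadvm_state_cleanup(void)
    {
        SaveStateEntry *se;

        QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
            if (se->ops && se->ops->load_cleanup) {
                se->ops->load_cleanup(se->opaque);
            }
        }
    }

A load_setup() hook would also give the blk_invalidate_cache() code that
Stefan mentions a natural home outside the load loop.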