On Wed, Apr 12, 2017 at 11:18:19AM +0200, Kevin Wolf wrote:
> after getting assertion failure reports for block migration in the last
> minute, we just hacked around it by commenting out op blocker assertions
> for the 2.9 release, but now we need to see how to fix things properly.
> Luckily, get_maintainer.pl doesn't report me, but only you. :-)
>
> The main problem I see with the block migration code (on the
> destination) is that it abuses the BlockBackend that belongs to the
> guest device to make its own writes to the image file. If the guest
> isn't allowed to write to the image (which it now isn't during incoming
> migration since it would conflict with the newer style of block
> migration using an NBD server), writing to this BlockBackend doesn't
> work any more.
>
> So what should really happen is that incoming block migration creates
> its own BlockBackend for writing to the image. Now we don't want to do
> this anew for every incoming block, but ideally we'd just create all
> necessary BlockBackends upfront and then keep using them throughout the
> whole migration. Is there a way to get some setup/teardown callbacks
> at the start/end of the migration that could initialise and free such
> global data?

It can be done at the beginning of block_load(), similar to
block_mig_state.bmds_list, which is created in init_blk_migration() at
save time.

We can also move the "if (blk != blk_prev) blk_invalidate_cache()" code
out of the load loop. It should be done once, when setting up the
BlockBackends.

> The other problem with block migration is that it uses a BlockBackend
> name to identify which device is migrated. However, there can be images
> that are not attached to any BlockBackend, or if it is, the BlockBackend
> might be anonymous, so this doesn't work. I suppose changing the field
> to "device name if available, node-name otherwise" would solve this.

Yes, that sounds good and is backwards compatible.
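
For illustration only, the "device name if available, node-name otherwise"
rule could look roughly like the sketch below. It is not QEMU code: DemoNode,
demo_migration_id() and demo_find_node() are hypothetical stand-ins, and the
sketch assumes every node has a non-NULL node-name while an unattached or
anonymous BlockBackend shows up as a NULL or empty device name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a block node; not a real QEMU type. */
typedef struct DemoNode {
    const char *device_name; /* BlockBackend name; NULL or "" if none/anonymous */
    const char *node_name;   /* assumed to be always set in this sketch */
} DemoNode;

/* Pick the identifier put on the wire: device name if available,
 * node-name otherwise. */
static const char *demo_migration_id(const DemoNode *n)
{
    if (n->device_name != NULL && n->device_name[0] != '\0') {
        return n->device_name;
    }
    return n->node_name;
}

/* Destination side: find the node whose identifier matches the
 * incoming one. Returns NULL if nothing matches. */
static const DemoNode *demo_find_node(const DemoNode *nodes, size_t count,
                                      const char *id)
{
    for (size_t i = 0; i < count; i++) {
        const char *name = demo_migration_id(&nodes[i]);
        if (name != NULL && strcmp(name, id) == 0) {
            return &nodes[i];
        }
    }
    return NULL;
}
```

A stream that carries a device name still resolves exactly as before, which
is why the change stays backwards compatible; only nodes without a named
BlockBackend fall back to the node-name.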