On Mon, Oct 11, 2021 at 01:31:16AM -0400, Jagannathan Raman wrote:
> +static void vfu_mig_state_running(vfu_ctx_t *vfu_ctx)
> +{
> +    VfuObject *o = vfu_get_private(vfu_ctx);
> +    VfuObjectClass *k = VFU_OBJECT_GET_CLASS(OBJECT(o));
> +    static int migrated_devs;
> +    Error *local_err = NULL;
> +    int ret;
> +
> +    /**
> +     * TODO: move to VFU_MIGR_STATE_RESUME handler. Presently, the
> +     * VMSD data from source is not available at RESUME state.
> +     * Working on a fix for this.
> +     */
> +    if (!o->vfu_mig_file) {
> +        o->vfu_mig_file = qemu_fopen_ops(o, &vfu_mig_fops_load, false);
> +    }
> +
> +    ret = qemu_remote_loadvm(o->vfu_mig_file);
> +    if (ret) {
> +        error_setg(&error_abort, "vfu: failed to restore device state");
> +        return;
> +    }
> +
> +    qemu_file_shutdown(o->vfu_mig_file);
> +    o->vfu_mig_file = NULL;
> +
> +    /* VFU_MIGR_STATE_RUNNING begins here */
> +    if (++migrated_devs == k->nr_devs) {

See below about migrated_devs.

> +        bdrv_invalidate_cache_all(&local_err);
> +        if (local_err) {
> +            error_report_err(local_err);
> +            return;
> +        }
> +
> +        vm_start();
> +    }
> +}
> +
> +static void vfu_mig_state_stop(vfu_ctx_t *vfu_ctx)
> +{
> +    VfuObject *o = vfu_get_private(vfu_ctx);
> +    VfuObjectClass *k = VFU_OBJECT_GET_CLASS(OBJECT(o));
> +    static int migrated_devs;
> +
> +    /**
> +     * note: calling bdrv_inactivate_all() is not the best approach.
> +     *
> +     * Ideally, we would identify the block devices (if any) indirectly
> +     * linked (such as via a scs-hd device) to each of the migrated devices,

s/scs/scsi/

> +     * and inactivate them individually. This is essential while operating
> +     * the server in a storage daemon mode, with devices from different VMs.
> +     *
> +     * However, we currently don't have this capability. As such, we need to
> +     * inactivate all devices at the same time when migration is completed.
> +     */
> +    if (++migrated_devs == k->nr_devs) {
> +        bdrv_inactivate_all();
> +        vm_stop(RUN_STATE_PAUSED);

The order of these two calls is reversed compared to migration/migration.c:
first we pause the VM, then we inactivate the disks.

I also think we need to zero migrated_devs in case migration fails and we
try to migrate again later:

  migrated_devs = 0;

Even that is not quite right, because migration may fail after only some
of the VfuObjects have been stopped. A different approach to counting
devices is necessary, such as zeroing migrated_devs in
vfu_mig_state_stop_and_copy(). There is a rough sketch of both points at
the end of this mail.

> @@ -422,6 +722,35 @@ static void vfu_object_machine_done(Notifier *notifier, void *data)
>          return;
>      }
>  
> +    /*
> +     * TODO: The 0x20000 number used below is a temporary. We are working on
> +     * a cleaner fix for this.
> +     *
> +     * The libvfio-user library assumes that the remote knows the size of
> +     * the data to be migrated at boot time, but that is not the case with
> +     * VMSDs, as it can contain a variable-size buffer. 0x20000 is used
> +     * as a sufficiently large buffer to demonstrate migration, but that
> +     * cannot be used as a solution.
> +     *
> +     */

My question from the previous revision was not answered:

libvfio-user has the vfu_migration_callbacks_t interface, which lets the
device save/load more data regardless of the size of the migration region.
I don't see the issue here, since the region doesn't need to be sized to
fit the savevm data?
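To make that concrete, here is roughly what I have in mind. This is an
untested sketch: the callback signatures are from my reading of
libvfio-user.h, I am glossing over prepare_data() and the offset
bookkeeping, and the vfu_mig_buf, vfu_mig_buf_size, and vfu_mig_buf_read
fields are hypothetical stand-ins for wherever the server accumulates the
savevm data:

  static uint64_t vfu_mig_get_pending_bytes(vfu_ctx_t *vfu_ctx)
  {
      VfuObject *o = vfu_get_private(vfu_ctx);

      /* The client's copy loop is driven by this value, not by the
       * size of the migration region, so the region never needs to be
       * sized to fit the whole VMSD. */
      return o->vfu_mig_buf_size - o->vfu_mig_buf_read;
  }

  static ssize_t vfu_mig_read_data(vfu_ctx_t *vfu_ctx, void *buf,
                                   uint64_t count, uint64_t offset)
  {
      VfuObject *o = vfu_get_private(vfu_ctx);

      /* The client pulls the data in region-sized (or smaller) chunks;
       * each call hands out the next piece, so the total transferred
       * can exceed the region size. */
      if (offset + count > o->vfu_mig_buf_size) {
          count = o->vfu_mig_buf_size - offset;
      }
      memcpy(buf, o->vfu_mig_buf + offset, count);
      o->vfu_mig_buf_read = offset + count;

      return count;
  }

In other words, the region is just a window onto the stream, so the
0x20000 workaround shouldn't be needed.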
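And going back to the migrated_devs point in vfu_mig_state_stop(), this is
the shape I meant, only rearranging the code already in this patch (with
the caveat above that resetting the counter here still does not cover a
migration that fails partway through):

  if (++migrated_devs == k->nr_devs) {
      migrated_devs = 0; /* let a later migration attempt start fresh */

      /* same order as migration/migration.c: pause the VM first, then
       * inactivate the disks */
      vm_stop(RUN_STATE_PAUSED);
      bdrv_inactivate_all();
  }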