On Wed, Dec 09, 2020 at 06:08:14PM +0100, Eugenio Perez Martin wrote: > On Mon, Dec 7, 2020 at 6:42 PM Stefan Hajnoczi wrote: > > On Fri, Nov 20, 2020 at 07:50:45PM +0100, Eugenio Pérez wrote: > > > +{ > > > + struct vhost_vring_file file = { > > > + .index = idx > > > + }; > > > + VirtQueue *vq = virtio_get_queue(dev->vdev, idx); > > > + VhostShadowVirtqueue *svq; > > > + int r; > > > + > > > + svq = g_new0(VhostShadowVirtqueue, 1); > > > + svq->vq = vq; > > > + > > > + r = event_notifier_init(&svq->hdev_notifier, 0); > > > + assert(r == 0); > > > + > > > + file.fd = event_notifier_get_fd(&svq->hdev_notifier); > > > + r = dev->vhost_ops->vhost_set_vring_kick(dev, &file); > > > + assert(r == 0); > > > + > > > + return svq; > > > +} > > > > I guess there are assumptions about the status of the device? Does the > > virtqueue need to be disabled when this function is called? > > > > Yes. Maybe an assertion checking the notification state? Sounds good. > > > + > > > +static int vhost_sw_live_migration_stop(struct vhost_dev *dev) > > > +{ > > > + int idx; > > > + > > > + vhost_dev_enable_notifiers(dev, dev->vdev); > > > + for (idx = 0; idx < dev->nvqs; ++idx) { > > > + vhost_sw_lm_shadow_vq_free(dev->sw_lm_shadow_vq[idx]); > > > + } > > > + > > > + return 0; > > > +} > > > + > > > +static int vhost_sw_live_migration_start(struct vhost_dev *dev) > > > +{ > > > + int idx; > > > + > > > + for (idx = 0; idx < dev->nvqs; ++idx) { > > > + dev->sw_lm_shadow_vq[idx] = vhost_sw_lm_shadow_vq(dev, idx); > > > + } > > > + > > > + vhost_dev_disable_notifiers(dev, dev->vdev); > > > > There is a race condition if the guest kicks the vq while this is > > happening. The shadow vq hdev_notifier needs to be set so the vhost > > device checks the virtqueue for requests that slipped in during the > > race window. > > > > I'm not sure if I follow you. If I understand correctly, > vhost_dev_disable_notifiers calls virtio_bus_cleanup_host_notifier, > and the latter calls virtio_queue_host_notifier_read. That's why the > documentation says "This might actually run the qemu handlers right > away, so virtio in qemu must be completely setup when this is > called.". Am I missing something? There are at least two cases: 1. Virtqueue kicks that come before vhost_dev_disable_notifiers(). vhost_dev_disable_notifiers() notices that and calls virtio_queue_notify_vq(). Will handle_sw_lm_vq() be invoked or is the device's vq handler function invoked? 2. Virtqueue kicks that come in after vhost_dev_disable_notifiers() returns. We hold the QEMU global mutex so the vCPU thread cannot perform MMIO/PIO dispatch immediately. The vCPU thread's ioctl(KVM_RUN) has already returned and will dispatch dispatch the MMIO/PIO access inside QEMU as soon as the global mutex is released. In other words, we're not taking the kvm.ko ioeventfd path but memory_region_dispatch_write_eventfds() should signal the ioeventfd that is registered at the time the dispatch occurs. Is that eventfd handled by handle_sw_lm_vq()? Neither of these cases are obvious from the code. At least comments would help but I suspect restructuring the code so the critical ioeventfd state changes happen in a sequence would make it even clearer.