On Wed, Dec 09, 2020 at 04:26:50AM -0500, Jason Wang wrote:
> ----- Original Message -----
> > On Fri, Nov 20, 2020 at 07:50:38PM +0100, Eugenio Pérez wrote:
> > > This series enables vDPA software assisted live migration for
> > > vhost-net devices. This is a new method of vhost device migration:
> > > Instead of relying on the vDPA device's dirty logging capability,
> > > SW assisted LM intercepts the dataplane, forwarding the descriptors
> > > between VM and device.
> >
> > Pros:
> > + vhost/vDPA devices don't need to implement dirty memory logging
> > + Obsoletes ioctl(VHOST_SET_LOG_BASE) and friends
> >
> > Cons:
> > - Not generic, relies on vhost-net-specific ioctls
> > - Doesn't support VIRTIO Shared Memory Regions
> >   https://github.com/oasis-tcs/virtio-spec/blob/master/shared-mem.tex
>
> I may be missing something, but my understanding is that it's the
> responsibility of the device to migrate this part?

Good point. You're right.

> > - Performance (see below)
> >
> > I think performance will be significantly lower when the shadow vq is
> > enabled. Imagine a vDPA device with hardware vq doorbell registers
> > mapped into the guest so the guest driver can directly kick the device.
> > When the shadow vq is enabled a vmexit is needed to write to the shadow
> > vq ioeventfd, then the host kernel scheduler switches to a QEMU thread
> > to read the ioeventfd, the descriptors are translated, QEMU writes to
> > the vhost hdev kick fd, the host kernel scheduler switches to the vhost
> > worker thread, vhost/vDPA notifies the virtqueue, and finally the
> > vDPA driver writes to the hardware vq doorbell register. That is a lot
> > of overhead compared to writing to an exitless MMIO register!
>
> I think it's a balance. E.g., we can poll the virtqueue to have an
> exitless doorbell.
>
> >
> > If the shadow vq was implemented in drivers/vhost/ and QEMU used the
> > existing ioctl(VHOST_SET_LOG_BASE) approach, then the overhead would be
> > reduced to just one set of ioeventfd/irqfd. In other words, the QEMU
> > dirty memory logging happens asynchronously and isn't in the dataplane.
> >
> > In addition, hardware that supports dirty memory logging as well as
> > software vDPA devices could completely eliminate the shadow vq for even
> > better performance.
>
> Yes. That's our plan. But the interface might require more thought.
>
> E.g., is the bitmap a good approach? To me, reporting dirty pages via a
> virtqueue is better since it has a smaller footprint and is
> self-throttled.
>
> And we need an address space other than the one used by the guest for
> either the bitmap or the virtqueue.
>
> >
> > But performance is a question of "is it good enough?". Maybe this
> > approach is okay and users don't expect good performance while dirty
> > memory logging is enabled.
>
> Yes, and actually such a slowdown may help the migration converge.
>
> Note that the whole idea is to try to have a generic solution for all
> types of devices. It's good to consider the performance, but for the
> first stage it should be sufficient to make it work and then consider
> optimizing on top.

Moving the shadow vq to the kernel later would be quite a big change
requiring rewriting much of the code. That's why I mentioned this now
before a lot of effort is invested in a QEMU implementation.

> > I just wanted to share the idea of moving the
> > shadow vq into the kernel in case you like that approach better.
>
> My understanding is to keep the kernel as simple as possible and leave
> the policies to userspace as much as possible. E.g., it requires us to
> disable doorbell mapping and irq offloading, all of which were under
> the control of userspace.

If the performance is acceptable with the QEMU approach then I think
that's the best place to implement it. It looks high-overhead though, so
maybe one of the first things to do is to run benchmarks to collect data
on how it performs?

Stefan
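
For readers following the bitmap-versus-virtqueue question above, here is a
rough sketch of what bitmap-style dirty logging amounts to and how its
footprint scales with guest memory. It is only an illustration in Python,
not QEMU or vhost code: the 4 KiB logging granularity and the names
PAGE_SIZE, bitmap_bytes, mark_dirty and dirty_pages are assumptions made
for this example.

```python
PAGE_SIZE = 0x1000  # assumption: one dirty bit per 4 KiB page


def bitmap_bytes(mem_size):
    """Bytes needed to hold one dirty bit per page of guest memory."""
    pages = (mem_size + PAGE_SIZE - 1) // PAGE_SIZE
    return (pages + 7) // 8


def mark_dirty(bitmap, addr, length):
    """Set the dirty bit of every page touched by [addr, addr + length)."""
    if length <= 0:
        return
    first = addr // PAGE_SIZE
    last = (addr + length - 1) // PAGE_SIZE
    for page in range(first, last + 1):
        bitmap[page // 8] |= 1 << (page % 8)


def dirty_pages(bitmap):
    """Return the page numbers whose dirty bit is set, in ascending order."""
    return [
        i * 8 + bit
        for i, byte in enumerate(bitmap)
        for bit in range(8)
        if (byte >> bit) & 1
    ]
```

Under these assumptions a 4 GiB guest needs a 128 KiB bitmap regardless of
how many pages are actually dirty, and a write that straddles a page
boundary marks both pages. A virtqueue-based report would instead grow with
the number of dirtied pages, which is the trade-off raised in the quoted
reply.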