Live Migration of Virtio Virtual Function

* Live Migration of Virtio Virtual Function
@ 2021-08-12 12:08 Max Gurtovoy
  2021-08-17  8:51 ` [virtio-comment] " Jason Wang
  0 siblings, 1 reply; 33+ messages in thread
From: Max Gurtovoy @ 2021-08-12 12:08 UTC (permalink / raw)
  To: virtio-comment, Jason Wang, Michael S. Tsirkin, cohuck
  Cc: Parav Pandit, Shahaf Shuler, Ariel Adam, Amnon Ilan, Bodong Wang,
	Jason Gunthorpe, Stefan Hajnoczi, Eugenio Perez Martin,
	Liran Liss, Oren Duer

[-- Attachment #1: Type: text/plain, Size: 1880 bytes --]

Hi all,
Live migration is one of the most important features of virtualization and virtio devices are oftenly found in virtual environments.

The migration process is managed by a migration SW that is running on the hypervisor and the VM is not aware of the process at all.

Unlike the vDPA case, a real pci Virtual Function state resides in the HW.

In our vision, in order to fulfil the Live migration requirements for virtual functions, each physical function device must implement migration operations. Using these operations, it will be able to master the migration process for the virtual function devices. Each capable physical function device has a supervisor permissions to change the virtual function operational states, save/restore its internal state and start/stop dirty pages tracking.

An example of this approach can be seen in the way NVIDIA performs live migration of a ConnectX NIC function:
https://github.com/jgunthorpe/linux/commits/mlx5_vfio_pci

NVIDIAs SNAP technology enables hardware-accelerated software defined PCIe devices. virtio-blk/virtio-net/virtio-fs SNAP used for storage and networking solutions. The host OS/hypervisor uses its standard drivers that are implemented according to a well-known VIRTIO specifications.

In order to implement Live Migration for these virtual function devices, that use a standard drivers as mentioned, the specification should define how HW vendor should build their devices and for SW developers to adjust the drivers.
This will enable specification compliant vendor agnostic solution.

This is exactly how we built the migration driver for ConnectX (internal HW design doc) and I guess that this is the way other vendors work.

For that, I would like to know if the approach of "PF that controls the VF live migration process" is acceptable by the VIRTIO technical group ?

Cheers,
-Max.

[-- Attachment #2: Type: text/html, Size: 4357 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread