All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration
@ 2023-02-08  9:42 Eugenio Pérez
  2023-02-08  9:42 ` [PATCH v2 01/13] vdpa net: move iova tree creation from init to start Eugenio Pérez
                   ` (14 more replies)
  0 siblings, 15 replies; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

It's possible to migrate vdpa net devices if they are shadowed from the
start.  But always shadowing the dataplane effectively breaks its host
passthrough, so it's not convenient in vDPA scenarios.

This series enables dynamically switching to shadow mode only at
migration time.  This allows full data virtqueue passthrough whenever
QEMU is not migrating.
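
In rough terms, the switch is driven by a migration state notifier, as
in the sketch below (patch 07 of this series contains the real code;
this is only an illustration of the idea):

    static void vdpa_net_migration_state_notifier(Notifier *notifier,
                                                  void *data)
    {
        MigrationState *migration = data;
        VhostVDPAState *s = container_of(notifier, VhostVDPAState,
                                         migration_state);

        if (migration_in_setup(migration)) {
            /* Stop the device and restart it with SVQ and logging */
            vhost_vdpa_net_log_global_enable(s, true);
        } else if (migration_has_failed(migration)) {
            /* Migration aborted: switch back to passthrough */
            vhost_vdpa_net_log_global_enable(s, false);
        }
    }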

In this series only net devices with no CVQ are migratable.  CVQ adds
additional state that would make the series bigger, and it was still
somewhat controversial in the previous RFC, so let's split it out.

The first patch delays the creation of the iova tree (used by SVQ for
address translation) until it is really needed, and makes it easier to
dynamically move to and from SVQ mode.

Patches 02 to 05 handle suspending the device and fetching its vq state
(base) at the switch to SVQ mode.  The new _F_SUSPEND feature is
negotiated, and the device stop flow is changed so the state can be
fetched trusting the device will not modify it.
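
In pseudo-code, the resulting stop flow looks roughly like this (a
sketch only; function names follow the patches below):

    /* vhost_vdpa_dev_start(dev, false), patch 03: */
    vhost_vdpa_suspend(dev);    /* VHOST_VDPA_SUSPEND ioctl, if _F_SUSPEND */

    /* vhost.c then fetches the vring bases, now stable thanks to the
     * suspend.  This ends up in vhost_vdpa_get_vring_base(). */
    vhost_virtqueue_stop(hdev, vdev, ...);

    /* Patch 04: the reset is moved after get_vring_base, through the
     * new vhost_reset_status backend op. */
    hdev->vhost_ops->vhost_reset_status(hdev);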

Since the vhost backend must offer VHOST_F_LOG_ALL to be migratable, the
remaining patches except the last one add the needed migration blockers
so vhost-vdpa can offer it safely.  They also add the handling of this
feature.

Finally, the last patch makes the vhost-vdpa backend offer
VHOST_F_LOG_ALL so QEMU migrates the device as long as no other blocker
has been added.

Successfully tested with vdpa_sim_net (with patch [1] applied) and with
the QEMU emulated device through vp_vdpa, with some restrictions:
* No CVQ. No features that didn't work with SVQ previously (packed, ...)
* VIRTIO_RING_F_STATE patches implementing [2].
* Expose _F_SUSPEND, but ignore it and suspend on ring state fetch like
  DPDK.
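
For reference, a typical invocation in these tests looks like the
following (the vhostdev path is just an example):

    -netdev vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=hostnet0 \
    -device virtio-net-pci,netdev=hostnet0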

Comments are welcome.

v2:
- Check for SUSPEND in vhost_dev.backend_cap, as .backend_features is
  empty at the moment of the check.

v1:
- Omit all code working with CVQ and block migration if the device supports
  CVQ.
- Remove spurious kick.
- Move all possible checks for migration to vhost-vdpa instead of the net
  backend. Move them to init code from start code.
- Suspend on vhost_vdpa_dev_start(false) instead of in vhost-vdpa net backend.
- Properly split the suspend-after-getting-base and status_reset patches.
- Add possible TODOs to points where this series can improve in the future.
- Check the state of migration using migration_in_setup and
  migration_has_failed instead of checking all the possible migration status in
  a switch.
- Add TODO with possible low-hanging fruit using RESUME ops.
- Always offer _F_LOG from virtio/vhost-vdpa and let migration blockers do
  their thing instead of adding a variable.
- RFC v2 at https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02574.html

RFC v2:
- Use a migration listener instead of a memory listener to know when
  the migration starts.
- Add things not picked up with the ASID patches, like enabling rings
  after driver_ok
- Add rewinding on the migration source, not the destination
- RFC v1 at https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01664.html

[1] https://lore.kernel.org/lkml/20230203142501.300125-1-eperezma@redhat.com/T/
[2] https://lists.oasis-open.org/archives/virtio-comment/202103/msg00036.html

Eugenio Pérez (13):
  vdpa net: move iova tree creation from init to start
  vdpa: Negotiate _F_SUSPEND feature
  vdpa: add vhost_vdpa_suspend
  vdpa: move vhost reset after get vring base
  vdpa: rewind at get_base, not set_base
  vdpa net: allow VHOST_F_LOG_ALL
  vdpa: add vdpa net migration state notifier
  vdpa: disable RAM block discard only for the first device
  vdpa net: block migration if the device has CVQ
  vdpa: block migration if device has unsupported features
  vdpa: block migration if dev does not have _F_SUSPEND
  vdpa: block migration if SVQ does not admit a feature
  vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices

 include/hw/virtio/vhost-backend.h |   4 +
 hw/virtio/vhost-vdpa.c            | 126 +++++++++++++++-----
 hw/virtio/vhost.c                 |   3 +
 net/vhost-vdpa.c                  | 192 +++++++++++++++++++++++++-----
 hw/virtio/trace-events            |   1 +
 5 files changed, 267 insertions(+), 59 deletions(-)

-- 
2.31.1




^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 01/13] vdpa net: move iova tree creation from init to start
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-13  6:50     ` Si-Wei Liu
  2023-02-08  9:42 ` [PATCH v2 02/13] vdpa: Negotiate _F_SUSPEND feature Eugenio Pérez
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

Only create the iova_tree if and when it is needed.

The last VQ is still responsible for the cleanup, but this change allows
merging both cleanup functions.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 net/vhost-vdpa.c | 99 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 71 insertions(+), 28 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index de5ed8ff22..a9e6c8f28e 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -178,13 +178,9 @@ err_init:
 static void vhost_vdpa_cleanup(NetClientState *nc)
 {
     VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
-    struct vhost_dev *dev = &s->vhost_net->dev;
 
     qemu_vfree(s->cvq_cmd_out_buffer);
     qemu_vfree(s->status);
-    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
-        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
-    }
     if (s->vhost_net) {
         vhost_net_cleanup(s->vhost_net);
         g_free(s->vhost_net);
@@ -234,10 +230,64 @@ static ssize_t vhost_vdpa_receive(NetClientState *nc, const uint8_t *buf,
     return size;
 }
 
+/** From any vdpa net client, get the netclient of first queue pair */
+static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
+{
+    NICState *nic = qemu_get_nic(s->nc.peer);
+    NetClientState *nc0 = qemu_get_peer(nic->ncs, 0);
+
+    return DO_UPCAST(VhostVDPAState, nc, nc0);
+}
+
+static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
+{
+    struct vhost_vdpa *v = &s->vhost_vdpa;
+
+    if (v->shadow_vqs_enabled) {
+        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
+                                           v->iova_range.last);
+    }
+}
+
+static int vhost_vdpa_net_data_start(NetClientState *nc)
+{
+    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+    struct vhost_vdpa *v = &s->vhost_vdpa;
+
+    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+
+    if (v->index == 0) {
+        vhost_vdpa_net_data_start_first(s);
+        return 0;
+    }
+
+    if (v->shadow_vqs_enabled) {
+        VhostVDPAState *s0 = vhost_vdpa_net_first_nc_vdpa(s);
+        v->iova_tree = s0->vhost_vdpa.iova_tree;
+    }
+
+    return 0;
+}
+
+static void vhost_vdpa_net_client_stop(NetClientState *nc)
+{
+    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+    struct vhost_dev *dev;
+
+    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+
+    dev = s->vhost_vdpa.dev;
+    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
+        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
+    }
+}
+
 static NetClientInfo net_vhost_vdpa_info = {
         .type = NET_CLIENT_DRIVER_VHOST_VDPA,
         .size = sizeof(VhostVDPAState),
         .receive = vhost_vdpa_receive,
+        .start = vhost_vdpa_net_data_start,
+        .stop = vhost_vdpa_net_client_stop,
         .cleanup = vhost_vdpa_cleanup,
         .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
         .has_ufo = vhost_vdpa_has_ufo,
@@ -351,7 +401,7 @@ dma_map_err:
 
 static int vhost_vdpa_net_cvq_start(NetClientState *nc)
 {
-    VhostVDPAState *s;
+    VhostVDPAState *s, *s0;
     struct vhost_vdpa *v;
     uint64_t backend_features;
     int64_t cvq_group;
@@ -425,6 +475,15 @@ out:
         return 0;
     }
 
+    s0 = vhost_vdpa_net_first_nc_vdpa(s);
+    if (s0->vhost_vdpa.iova_tree) {
+        /* SVQ is already configured for all virtqueues */
+        v->iova_tree = s0->vhost_vdpa.iova_tree;
+    } else {
+        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
+                                           v->iova_range.last);
+    }
+
     r = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer,
                                vhost_vdpa_net_cvq_cmd_page_len(), false);
     if (unlikely(r < 0)) {
@@ -449,15 +508,9 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
     if (s->vhost_vdpa.shadow_vqs_enabled) {
         vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
         vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->status);
-        if (!s->always_svq) {
-            /*
-             * If only the CVQ is shadowed we can delete this safely.
-             * If all the VQs are shadows this will be needed by the time the
-             * device is started again to register SVQ vrings and similar.
-             */
-            g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
-        }
     }
+
+    vhost_vdpa_net_client_stop(nc);
 }
 
 static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
@@ -667,8 +720,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
                                        int nvqs,
                                        bool is_datapath,
                                        bool svq,
-                                       struct vhost_vdpa_iova_range iova_range,
-                                       VhostIOVATree *iova_tree)
+                                       struct vhost_vdpa_iova_range iova_range)
 {
     NetClientState *nc = NULL;
     VhostVDPAState *s;
@@ -690,7 +742,6 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
     s->vhost_vdpa.shadow_vqs_enabled = svq;
     s->vhost_vdpa.iova_range = iova_range;
     s->vhost_vdpa.shadow_data = svq;
-    s->vhost_vdpa.iova_tree = iova_tree;
     if (!is_datapath) {
         s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
                                             vhost_vdpa_net_cvq_cmd_page_len());
@@ -760,7 +811,6 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     uint64_t features;
     int vdpa_device_fd;
     g_autofree NetClientState **ncs = NULL;
-    g_autoptr(VhostIOVATree) iova_tree = NULL;
     struct vhost_vdpa_iova_range iova_range;
     NetClientState *nc;
     int queue_pairs, r, i = 0, has_cvq = 0;
@@ -812,12 +862,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
         goto err;
     }
 
-    if (opts->x_svq) {
-        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
-            goto err_svq;
-        }
-
-        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
+    if (opts->x_svq && !vhost_vdpa_net_valid_svq_features(features, errp)) {
+        goto err;
     }
 
     ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
@@ -825,7 +871,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     for (i = 0; i < queue_pairs; i++) {
         ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
                                      vdpa_device_fd, i, 2, true, opts->x_svq,
-                                     iova_range, iova_tree);
+                                     iova_range);
         if (!ncs[i])
             goto err;
     }
@@ -833,13 +879,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     if (has_cvq) {
         nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
                                  vdpa_device_fd, i, 1, false,
-                                 opts->x_svq, iova_range, iova_tree);
+                                 opts->x_svq, iova_range);
         if (!nc)
             goto err;
     }
 
-    /* iova_tree ownership belongs to last NetClientState */
-    g_steal_pointer(&iova_tree);
     return 0;
 
 err:
@@ -849,7 +893,6 @@ err:
         }
     }
 
-err_svq:
     qemu_close(vdpa_device_fd);
 
     return -1;
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 02/13] vdpa: Negotiate _F_SUSPEND feature
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
  2023-02-08  9:42 ` [PATCH v2 01/13] vdpa net: move iova tree creation from init to start Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-08  9:42 ` [PATCH v2 03/13] vdpa: add vhost_vdpa_suspend Eugenio Pérez
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

This is needed for QEMU to know it can suspend the device to retrieve
its status and enable SVQ with it, so the whole process is transparent
to the guest.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 542e003101..2e79fbe4b2 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -659,7 +659,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
     uint64_t features;
     uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 |
         0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH |
-        0x1ULL << VHOST_BACKEND_F_IOTLB_ASID;
+        0x1ULL << VHOST_BACKEND_F_IOTLB_ASID |
+        0x1ULL << VHOST_BACKEND_F_SUSPEND;
     int r;
 
     if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) {
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 03/13] vdpa: add vhost_vdpa_suspend
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
  2023-02-08  9:42 ` [PATCH v2 01/13] vdpa net: move iova tree creation from init to start Eugenio Pérez
  2023-02-08  9:42 ` [PATCH v2 02/13] vdpa: Negotiate _F_SUSPEND feature Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-21  5:27     ` Jason Wang
  2023-02-08  9:42 ` [PATCH v2 04/13] vdpa: move vhost reset after get vring base Eugenio Pérez
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

The function vhost.c:vhost_dev_stop fetches the vring base so the vq
state can be migrated to other devices.  However, this is unreliable in
vdpa, since we didn't signal the device to suspend the queues, making
the value fetched useless.

Suspend the device if possible before fetching the first and subsequent
vring bases.

Moreover, vdpa totally resets and wipes the device before fetching the
vring base of the last device, making that operation useless there. This
will be fixed in later patches of this series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 19 +++++++++++++++++++
 hw/virtio/trace-events |  1 +
 2 files changed, 20 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 2e79fbe4b2..cbbe92ffe8 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1108,6 +1108,24 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
     }
 }
 
+static void vhost_vdpa_suspend(struct vhost_dev *dev)
+{
+    struct vhost_vdpa *v = dev->opaque;
+    int r;
+
+    if (!vhost_vdpa_first_dev(dev) ||
+        !(dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
+        return;
+    }
+
+    trace_vhost_vdpa_suspend(dev);
+    r = ioctl(v->device_fd, VHOST_VDPA_SUSPEND);
+    if (unlikely(r)) {
+        error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
+        /* Not aborting since we're called from stop context */
+    }
+}
+
 static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
 {
     struct vhost_vdpa *v = dev->opaque;
@@ -1122,6 +1140,7 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
         }
         vhost_vdpa_set_vring_ready(dev);
     } else {
+        vhost_vdpa_suspend(dev);
         vhost_vdpa_svqs_stop(dev);
         vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
     }
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index a87c5f39a2..8f8d05cf9b 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -50,6 +50,7 @@ vhost_vdpa_set_vring_ready(void *dev) "dev: %p"
 vhost_vdpa_dump_config(void *dev, const char *line) "dev: %p %s"
 vhost_vdpa_set_config(void *dev, uint32_t offset, uint32_t size, uint32_t flags) "dev: %p offset: %"PRIu32" size: %"PRIu32" flags: 0x%"PRIx32
 vhost_vdpa_get_config(void *dev, void *config, uint32_t config_len) "dev: %p config: %p config_len: %"PRIu32
+vhost_vdpa_suspend(void *dev) "dev: %p"
 vhost_vdpa_dev_start(void *dev, bool started) "dev: %p started: %d"
 vhost_vdpa_set_log_base(void *dev, uint64_t base, unsigned long long size, int refcnt, int fd, void *log) "dev: %p base: 0x%"PRIx64" size: %llu refcnt: %d fd: %d log: %p"
 vhost_vdpa_set_vring_addr(void *dev, unsigned int index, unsigned int flags, uint64_t desc_user_addr, uint64_t used_user_addr, uint64_t avail_user_addr, uint64_t log_guest_addr) "dev: %p index: %u flags: 0x%x desc_user_addr: 0x%"PRIx64" used_user_addr: 0x%"PRIx64" avail_user_addr: 0x%"PRIx64" log_guest_addr: 0x%"PRIx64
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 04/13] vdpa: move vhost reset after get vring base
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
                   ` (2 preceding siblings ...)
  2023-02-08  9:42 ` [PATCH v2 03/13] vdpa: add vhost_vdpa_suspend Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-21  5:36     ` Jason Wang
  2023-02-08  9:42 ` [PATCH v2 05/13] vdpa: rewind at get_base, not set_base Eugenio Pérez
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

The function vhost.c:vhost_dev_stop calls the vhost op
vhost_dev_start(false). In the case of vdpa this totally resets and
wipes the device, making the fetching of the vring base (virtqueue
state) useless.

The kernel backend does not use the vhost_dev_start vhost op callback,
but vhost-user does. A patch to make vhost_user_dev_start more similar
to vdpa is desirable, but it can be added on top.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/vhost-backend.h |  4 ++++
 hw/virtio/vhost-vdpa.c            | 22 ++++++++++++++++------
 hw/virtio/vhost.c                 |  3 +++
 3 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index c5ab49051e..ec3fbae58d 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -130,6 +130,9 @@ typedef bool (*vhost_force_iommu_op)(struct vhost_dev *dev);
 
 typedef int (*vhost_set_config_call_op)(struct vhost_dev *dev,
                                        int fd);
+
+typedef void (*vhost_reset_status_op)(struct vhost_dev *dev);
+
 typedef struct VhostOps {
     VhostBackendType backend_type;
     vhost_backend_init vhost_backend_init;
@@ -177,6 +180,7 @@ typedef struct VhostOps {
     vhost_get_device_id_op vhost_get_device_id;
     vhost_force_iommu_op vhost_force_iommu;
     vhost_set_config_call_op vhost_set_config_call;
+    vhost_reset_status_op vhost_reset_status;
 } VhostOps;
 
 int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index cbbe92ffe8..26e38a6aab 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1152,14 +1152,23 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
     if (started) {
         memory_listener_register(&v->listener, &address_space_memory);
         return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
-    } else {
-        vhost_vdpa_reset_device(dev);
-        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
-                                   VIRTIO_CONFIG_S_DRIVER);
-        memory_listener_unregister(&v->listener);
+    }
 
-        return 0;
+    return 0;
+}
+
+static void vhost_vdpa_reset_status(struct vhost_dev *dev)
+{
+    struct vhost_vdpa *v = dev->opaque;
+
+    if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
+        return;
     }
+
+    vhost_vdpa_reset_device(dev);
+    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
+                                VIRTIO_CONFIG_S_DRIVER);
+    memory_listener_unregister(&v->listener);
 }
 
 static int vhost_vdpa_set_log_base(struct vhost_dev *dev, uint64_t base,
@@ -1346,4 +1355,5 @@ const VhostOps vdpa_ops = {
         .vhost_vq_get_addr = vhost_vdpa_vq_get_addr,
         .vhost_force_iommu = vhost_vdpa_force_iommu,
         .vhost_set_config_call = vhost_vdpa_set_config_call,
+        .vhost_reset_status = vhost_vdpa_reset_status,
 };
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index eb8c4c378c..a266396576 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -2049,6 +2049,9 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
                              hdev->vqs + i,
                              hdev->vq_index + i);
     }
+    if (hdev->vhost_ops->vhost_reset_status) {
+        hdev->vhost_ops->vhost_reset_status(hdev);
+    }
 
     if (vhost_dev_has_iommu(hdev)) {
         if (hdev->vhost_ops->vhost_set_iotlb_callback) {
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 05/13] vdpa: rewind at get_base, not set_base
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
                   ` (3 preceding siblings ...)
  2023-02-08  9:42 ` [PATCH v2 04/13] vdpa: move vhost reset after get vring base Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-21  5:40     ` Jason Wang
  2023-02-08  9:42 ` [PATCH v2 06/13] vdpa net: allow VHOST_F_LOG_ALL Eugenio Pérez
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

At this moment it is only possible to migrate to a vdpa device running
with x-svq=on. As a protective measure, the rewind of the inflight
descriptors was done at the destination. That way, if the source sent a
virtqueue with in-use descriptors, they were always discarded.

Since this series also allows migrating to passthrough devices with no
SVQ, the right thing to do is to rewind at the source so the vring bases
are correct.

Support for inflight descriptors may be added in the future.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 26e38a6aab..d99db0bd03 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1211,18 +1211,7 @@ static int vhost_vdpa_set_vring_base(struct vhost_dev *dev,
                                        struct vhost_vring_state *ring)
 {
     struct vhost_vdpa *v = dev->opaque;
-    VirtQueue *vq = virtio_get_queue(dev->vdev, ring->index);
 
-    /*
-     * vhost-vdpa devices does not support in-flight requests. Set all of them
-     * as available.
-     *
-     * TODO: This is ok for networking, but other kinds of devices might
-     * have problems with these retransmissions.
-     */
-    while (virtqueue_rewind(vq, 1)) {
-        continue;
-    }
     if (v->shadow_vqs_enabled) {
         /*
          * Device vring base was set at device start. SVQ base is handled by
@@ -1241,6 +1230,19 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev,
     int ret;
 
     if (v->shadow_vqs_enabled) {
+        VirtQueue *vq = virtio_get_queue(dev->vdev, ring->index);
+
+        /*
+         * vhost-vdpa devices does not support in-flight requests. Set all of
+         * them as available.
+         *
+         * TODO: This is ok for networking, but other kinds of devices might
+         * have problems with these retransmissions.
+         */
+        while (virtqueue_rewind(vq, 1)) {
+            continue;
+        }
+
         ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index);
         return 0;
     }
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 06/13] vdpa net: allow VHOST_F_LOG_ALL
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
                   ` (4 preceding siblings ...)
  2023-02-08  9:42 ` [PATCH v2 05/13] vdpa: rewind at get_base, not set_base Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-08  9:42 ` [PATCH v2 07/13] vdpa: add vdpa net migration state notifier Eugenio Pérez
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

Since some actions moved from the init function to the start function,
the device features may not be the parent vdpa device's, but the ones
returned by the vhost backend.  If the transition to SVQ is supported,
the vhost backend will return _F_LOG_ALL to signal that the device is
migratable.

Add VHOST_F_LOG_ALL.  HW dirty page tracking can be added on top of this
change if the device supports it in the future.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 net/vhost-vdpa.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index a9e6c8f28e..dd686b4514 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -98,6 +98,8 @@ static const uint64_t vdpa_svq_device_features =
     BIT_ULL(VIRTIO_NET_F_MQ) |
     BIT_ULL(VIRTIO_F_ANY_LAYOUT) |
     BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) |
+    /* VHOST_F_LOG_ALL is exposed by SVQ */
+    BIT_ULL(VHOST_F_LOG_ALL) |
     BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
     BIT_ULL(VIRTIO_NET_F_STANDBY);
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 07/13] vdpa: add vdpa net migration state notifier
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
                   ` (5 preceding siblings ...)
  2023-02-08  9:42 ` [PATCH v2 06/13] vdpa net: allow VHOST_F_LOG_ALL Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-13  6:50     ` Si-Wei Liu
  2023-02-22  3:55     ` Jason Wang
  2023-02-08  9:42 ` [PATCH v2 08/13] vdpa: disable RAM block discard only for the first device Eugenio Pérez
                   ` (7 subsequent siblings)
  14 siblings, 2 replies; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

This allows net to restart the device backend to configure SVQ on it.

Ideally, these changes should not be net specific. However, the vdpa net
backend is the one with enough knowledge to configure everything, for a
few reasons:
* Queues might need to be shadowed or not depending on their kind
  (control vs data).
* Queues need to share the same map translations (iova tree).

Because of that it is cleaner to restart the whole net backend and
configure it again as expected, similar to how vhost-kernel moves
between userspace and passthrough.

If more kinds of devices need dynamic switching to SVQ we can create a
callback struct like VhostOps and move most of the code there.
VhostOps cannot be reused since all vdpa backends share them, and
specializing them just for networking would be too heavy.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
v3:
* Add TODO to use the resume operation in the future.
* Use migration_in_setup and migration_has_failed instead of a
  complicated switch case.
---
 net/vhost-vdpa.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index dd686b4514..bca13f97fd 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -26,12 +26,14 @@
 #include <err.h>
 #include "standard-headers/linux/virtio_net.h"
 #include "monitor/monitor.h"
+#include "migration/misc.h"
 #include "hw/virtio/vhost.h"
 
 /* Todo:need to add the multiqueue support here */
 typedef struct VhostVDPAState {
     NetClientState nc;
     struct vhost_vdpa vhost_vdpa;
+    Notifier migration_state;
     VHostNetState *vhost_net;
 
     /* Control commands shadow buffers */
@@ -241,10 +243,79 @@ static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
     return DO_UPCAST(VhostVDPAState, nc, nc0);
 }
 
+static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
+{
+    struct vhost_vdpa *v = &s->vhost_vdpa;
+    VirtIONet *n;
+    VirtIODevice *vdev;
+    int data_queue_pairs, cvq, r;
+    NetClientState *peer;
+
+    /* We are only called on the first data vqs and only if x-svq is not set */
+    if (s->vhost_vdpa.shadow_vqs_enabled == enable) {
+        return;
+    }
+
+    vdev = v->dev->vdev;
+    n = VIRTIO_NET(vdev);
+    if (!n->vhost_started) {
+        return;
+    }
+
+    data_queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
+    cvq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ) ?
+                                  n->max_ncs - n->max_queue_pairs : 0;
+    /*
+     * TODO: vhost_net_stop does suspend, get_base and reset. We can be smarter
+     * in the future and resume the device if read-only operations between
+     * suspend and reset goes wrong.
+     */
+    vhost_net_stop(vdev, n->nic->ncs, data_queue_pairs, cvq);
+
+    peer = s->nc.peer;
+    for (int i = 0; i < data_queue_pairs + cvq; i++) {
+        VhostVDPAState *vdpa_state;
+        NetClientState *nc;
+
+        if (i < data_queue_pairs) {
+            nc = qemu_get_peer(peer, i);
+        } else {
+            nc = qemu_get_peer(peer, n->max_queue_pairs);
+        }
+
+        vdpa_state = DO_UPCAST(VhostVDPAState, nc, nc);
+        vdpa_state->vhost_vdpa.shadow_data = enable;
+
+        if (i < data_queue_pairs) {
+            /* Do not override CVQ shadow_vqs_enabled */
+            vdpa_state->vhost_vdpa.shadow_vqs_enabled = enable;
+        }
+    }
+
+    r = vhost_net_start(vdev, n->nic->ncs, data_queue_pairs, cvq);
+    if (unlikely(r < 0)) {
+        error_report("unable to start vhost net: %s(%d)", g_strerror(-r), -r);
+    }
+}
+
+static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data)
+{
+    MigrationState *migration = data;
+    VhostVDPAState *s = container_of(notifier, VhostVDPAState,
+                                     migration_state);
+
+    if (migration_in_setup(migration)) {
+        vhost_vdpa_net_log_global_enable(s, true);
+    } else if (migration_has_failed(migration)) {
+        vhost_vdpa_net_log_global_enable(s, false);
+    }
+}
+
 static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
 {
     struct vhost_vdpa *v = &s->vhost_vdpa;
 
+    add_migration_state_change_notifier(&s->migration_state);
     if (v->shadow_vqs_enabled) {
         v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
                                            v->iova_range.last);
@@ -278,6 +349,10 @@ static void vhost_vdpa_net_client_stop(NetClientState *nc)
 
     assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
 
+    if (s->vhost_vdpa.index == 0) {
+        remove_migration_state_change_notifier(&s->migration_state);
+    }
+
     dev = s->vhost_vdpa.dev;
     if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
         g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
@@ -741,6 +816,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
     s->vhost_vdpa.device_fd = vdpa_device_fd;
     s->vhost_vdpa.index = queue_pair_index;
     s->always_svq = svq;
+    s->migration_state.notify = vdpa_net_migration_state_notifier;
     s->vhost_vdpa.shadow_vqs_enabled = svq;
     s->vhost_vdpa.iova_range = iova_range;
     s->vhost_vdpa.shadow_data = svq;
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 08/13] vdpa: disable RAM block discard only for the first device
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
                   ` (6 preceding siblings ...)
  2023-02-08  9:42 ` [PATCH v2 07/13] vdpa: add vdpa net migration state notifier Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-08  9:42 ` [PATCH v2 09/13] vdpa net: block migration if the device has CVQ Eugenio Pérez
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

Although it does not make a big difference, it's more correct and
simplifies the cleanup path in subsequent patches.

Move ram_block_discard_disable(false) call to the top of
vhost_vdpa_cleanup because:
* We cannot use vhost_vdpa_first_dev after dev->opaque = NULL
  assignment.
* Improve the stack order in cleanup: since it is the last action taken
  in init, it should be the first in cleanup.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index d99db0bd03..84a6b9690b 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -431,16 +431,6 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
     trace_vhost_vdpa_init(dev, opaque);
     int ret;
 
-    /*
-     * Similar to VFIO, we end up pinning all guest memory and have to
-     * disable discarding of RAM.
-     */
-    ret = ram_block_discard_disable(true);
-    if (ret) {
-        error_report("Cannot set discarding of RAM broken");
-        return ret;
-    }
-
     v = opaque;
     v->dev = dev;
     dev->opaque =  opaque ;
@@ -452,6 +442,16 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
         return 0;
     }
 
+    /*
+     * Similar to VFIO, we end up pinning all guest memory and have to
+     * disable discarding of RAM.
+     */
+    ret = ram_block_discard_disable(true);
+    if (ret) {
+        error_report("Cannot set discarding of RAM broken");
+        return ret;
+    }
+
     vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
                                VIRTIO_CONFIG_S_DRIVER);
 
@@ -577,12 +577,15 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
     assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
     v = dev->opaque;
     trace_vhost_vdpa_cleanup(dev, v);
+    if (vhost_vdpa_first_dev(dev)) {
+        ram_block_discard_disable(false);
+    }
+
     vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
     memory_listener_unregister(&v->listener);
     vhost_vdpa_svq_cleanup(dev);
 
     dev->opaque = NULL;
-    ram_block_discard_disable(false);
 
     return 0;
 }
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 09/13] vdpa net: block migration if the device has CVQ
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
                   ` (7 preceding siblings ...)
  2023-02-08  9:42 ` [PATCH v2 08/13] vdpa: disable RAM block discard only for the first device Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-13  6:50     ` Si-Wei Liu
  2023-02-22  4:00     ` Jason Wang
  2023-02-08  9:42 ` [PATCH v2 10/13] vdpa: block migration if device has unsupported features Eugenio Pérez
                   ` (5 subsequent siblings)
  14 siblings, 2 replies; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

Devices with CVQ need to migrate state beyond the vq state.  Leaving
this to a future series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index bca13f97fd..309861e56c 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -955,11 +955,17 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     }
 
     if (has_cvq) {
+        VhostVDPAState *s;
+
         nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
                                  vdpa_device_fd, i, 1, false,
                                  opts->x_svq, iova_range);
         if (!nc)
             goto err;
+
+        s = DO_UPCAST(VhostVDPAState, nc, nc);
+        error_setg(&s->vhost_vdpa.dev->migration_blocker,
+                   "net vdpa cannot migrate with MQ feature");
     }
 
     return 0;
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 10/13] vdpa: block migration if device has unsupported features
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
                   ` (8 preceding siblings ...)
  2023-02-08  9:42 ` [PATCH v2 09/13] vdpa net: block migration if the device has CVQ Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-08  9:42 ` [PATCH v2 11/13] vdpa: block migration if dev does not have _F_SUSPEND Eugenio Pérez
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

A vdpa net device must be initialized with SVQ in order to be migratable
at this moment, and the initialization code verifies some conditions.
If the device is not initialized with the x-svq parameter, it will not
expose _F_LOG, so the vhost subsystem will block VM migration from its
initialization.

Next patches change this, so we need to verify migration conditions
differently.

QEMU only supports a subset of net features in SVQ, and it cannot
migrate state that it cannot track or restore in the destination.  Add a
migration blocker if the device offers an unsupported feature.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 309861e56c..a0c4d5de2c 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -952,6 +952,15 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
                                      iova_range);
         if (!ncs[i])
             goto err;
+
+        if (i == 0) {
+            VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, ncs[0]);
+
+            if (!s->vhost_vdpa.dev->migration_blocker) {
+                vhost_vdpa_net_valid_svq_features(features,
+                                        &s->vhost_vdpa.dev->migration_blocker);
+            }
+        }
     }
 
     if (has_cvq) {
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 11/13] vdpa: block migration if dev does not have _F_SUSPEND
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
                   ` (9 preceding siblings ...)
  2023-02-08  9:42 ` [PATCH v2 10/13] vdpa: block migration if device has unsupported features Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-22  4:05     ` Jason Wang
  2023-02-08  9:42 ` [PATCH v2 12/13] vdpa: block migration if SVQ does not admit a feature Eugenio Pérez
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

Next patches enable devices to be migrated even if the vdpa netdev has
not been started with x-svq. However, not all devices are migratable, so
we need to block migration when we detect that.

Block vhost-vdpa device migration if it does not offer _F_SUSPEND and it
has not been started with x-svq.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 84a6b9690b..9d30cf9b3c 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -442,6 +442,27 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
         return 0;
     }
 
+    /*
+     * If dev->shadow_vqs_enabled at initialization that means the device has
+     * been started with x-svq=on, so don't block migration
+     */
+    if (dev->migration_blocker == NULL && !v->shadow_vqs_enabled) {
+        uint64_t backend_features;
+
+        /* We don't have dev->backend_features yet */
+        ret = vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES,
+                              &backend_features);
+        if (unlikely(ret)) {
+            error_setg_errno(errp, -ret, "Could not get backend features");
+            return ret;
+        }
+
+        if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
+            error_setg(&dev->migration_blocker,
+                "vhost-vdpa backend lacks VHOST_BACKEND_F_SUSPEND feature.");
+        }
+    }
+
     /*
      * Similar to VFIO, we end up pinning all guest memory and have to
      * disable discarding of RAM.
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 12/13] vdpa: block migration if SVQ does not admit a feature
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
                   ` (10 preceding siblings ...)
  2023-02-08  9:42 ` [PATCH v2 11/13] vdpa: block migration if dev does not have _F_SUSPEND Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-08  9:42 ` [PATCH v2 13/13] vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices Eugenio Pérez
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

Next patches enable devices to be migrated even if the vdpa netdev has
not been started with x-svq. However, not all devices are migratable, so
we need to block migration when we detect that.

Block migration if we detect that the device exposes a feature SVQ does
not know how to work with.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 9d30cf9b3c..13a86a2bb1 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -460,6 +460,15 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
         if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
             error_setg(&dev->migration_blocker,
                 "vhost-vdpa backend lacks VHOST_BACKEND_F_SUSPEND feature.");
+        } else {
+            /* We don't have dev->features yet */
+            uint64_t features;
+            ret = vhost_vdpa_get_dev_features(dev, &features);
+            if (unlikely(ret)) {
+                error_setg_errno(errp, -ret, "Could not get device features");
+                return ret;
+            }
+            vhost_svq_valid_features(features, &dev->migration_blocker);
         }
     }
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 13/13] vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
                   ` (11 preceding siblings ...)
  2023-02-08  9:42 ` [PATCH v2 12/13] vdpa: block migration if SVQ does not admit a feature Eugenio Pérez
@ 2023-02-08  9:42 ` Eugenio Pérez
  2023-02-22  4:07     ` Jason Wang
  2023-02-08 10:29   ` Alvaro Karsz
  2023-02-10 12:57 ` Gautam Dawar
  14 siblings, 1 reply; 68+ messages in thread
From: Eugenio Pérez @ 2023-02-08  9:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

vhost-vdpa devices can return this feature now that migration blockers
have been set for the cases where some required feature is missing.

Expose VHOST_F_LOG_ALL unconditionally, not only when SVQ is enabled.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 13a86a2bb1..5fddc77c5c 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1319,10 +1319,9 @@ static int vhost_vdpa_set_vring_call(struct vhost_dev *dev,
 static int vhost_vdpa_get_features(struct vhost_dev *dev,
                                      uint64_t *features)
 {
-    struct vhost_vdpa *v = dev->opaque;
     int ret = vhost_vdpa_get_dev_features(dev, features);
 
-    if (ret == 0 && v->shadow_vqs_enabled) {
+    if (ret == 0) {
         /* Add SVQ logging capabilities */
         *features |= BIT_ULL(VHOST_F_LOG_ALL);
     }
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
@ 2023-02-08 10:29   ` Alvaro Karsz
  2023-02-08  9:42 ` [PATCH v2 02/13] vdpa: Negotiate _F_SUSPEND feature Eugenio Pérez
                     ` (13 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Alvaro Karsz @ 2023-02-08 10:29 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: Laurent Vivier, Parav Pandit, Cindy Lu, Michael S. Tsirkin,
	qemu-devel, Gautam Dawar, virtualization, Harpreet Singh Anand,
	Lei Yang, Stefan Hajnoczi, Eli Cohen, longpeng2, Shannon Nelson,
	Liuxiangdong

Hi Eugenio, thanks for the series!

I tested the series with our DPU, SolidNET.

The test went as follows:

- Create 2 virtio net vdpa devices, each device in a separate VF.
- Start 2 VMs with the vdpa device as a single network device, without
shadow vq.
   The source VM with "-netdev
vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=hostnet0"
   The destination VM with "-netdev
vhost-vdpa,vhostdev=/dev/vhost-vdpa-1,id=hostnet0"
- Boot the source VM, test the network by pinging.
- Migrate
- Test the destination VM network.

Everything worked fine.

Tested-by: Alvaro Karsz <alvaro.karsz@solid-run.com>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration
  2023-02-08 10:29   ` Alvaro Karsz
  (?)
@ 2023-02-09 14:38   ` Lei Yang
  -1 siblings, 0 replies; 68+ messages in thread
From: Lei Yang @ 2023-02-09 14:38 UTC (permalink / raw)
  To: Alvaro Karsz
  Cc: Eugenio Pérez, qemu-devel, Harpreet Singh Anand,
	Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, Zhu Lingshan,
	Liuxiangdong, Shannon Nelson, Parav Pandit, Gautam Dawar,
	Eli Cohen, Stefan Hajnoczi, Laurent Vivier, longpeng2,
	virtualization, Stefano Garzarella, si-wei.liu

QE tested this series on RHEL, creating two vdpa_sim devices and booting
two VMs without shadow vq. The migration was successful and everything
worked fine.

Tested-by: Lei Yang <leiyang@redhat.com>

Alvaro Karsz <alvaro.karsz@solid-run.com> 于2023年2月8日周三 18:29写道:
>
> HI Eugenio, thanks for the series!
>
> I tested the series with our DPU, SolidNET.
>
> The test went as follow:
>
> - Create 2 virtio net vdpa devices, every device in a separated VF.
> - Start 2 VMs with the vdpa device as a single network device, without
> shadow vq.
>    The source VM with "-netdev
> vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=hostnet0"
>    The destination VM with "-netdev
> vhost-vdpa,vhostdev=/dev/vhost-vdpa-1,id=hostnet0"
> - Boot the source VM, test the network by pinging.
> - Migrate
> - Test the destination VM network.
>
> Everything worked fine.
>
> Tested-by: Alvaro Karsz <alvaro.karsz@solid-run.com>
>



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration
  2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
                   ` (13 preceding siblings ...)
  2023-02-08 10:29   ` Alvaro Karsz
@ 2023-02-10 12:57 ` Gautam Dawar
  2023-02-15 18:40   ` Eugenio Perez Martin
  14 siblings, 1 reply; 68+ messages in thread
From: Gautam Dawar @ 2023-02-10 12:57 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

Hi Eugenio,

I've tested this patch series on a Xilinx/AMD SN1022 device without
control vq, and VM live migration between two hosts worked fine.

Tested-by: Gautam Dawar <gautam.dawar@amd.com>


Here is some minor feedback:

Pls fix the typo (Dynamycally -> Dynamically) in the Subject.

On 2/8/23 15:12, Eugenio Pérez wrote:
>
> It's possible to migrate vdpa net devices if they are shadowed from the
>
> start.  But to always shadow the dataplane is to effectively break its host
>
> passthrough, so its not convenient in vDPA scenarios.
I believe you meant efficient instead of convenient.
>
>
>
> This series enables dynamically switching to shadow mode only at
>
> migration time.  This allows full data virtqueues passthrough all the
>
> time qemu is not migrating.
>
>
>
> In this series only net devices with no CVQ are migratable.  CVQ adds
>
> additional state that would make the series bigger and still had some
>
> controversy on previous RFC, so let's split it.
>
>
>
> The first patch delays the creation of the iova tree until it is really needed,
>
> and makes it easier to dynamically move from and to SVQ mode.
It would help to add some detail on the iova tree being referred to here.
>
>
>
> Next patches from 02 to 05 handle the suspending and getting of vq state (base)
>
> of the device at the switch to SVQ mode.  The new _F_SUSPEND feature is
>
> negotiated and stop device flow is changed so the state can be fetched trusting
>
> the device will not modify it.
>
>
>
> Since vhost backend must offer VHOST_F_LOG_ALL to be migratable, last patches
>
> but the last one add the needed migration blockers so vhost-vdpa can offer it

"last patches but the last one"?

Thanks.

>
> safely.  They also add the handling of this feature.
>
> Finally, the last patch makes virtio vhost-vdpa backend to offer
> VHOST_F_LOG_ALL so qemu migrate the device as long as no other blocker has been
> added.
>
> Successfully tested with vdpa_sim_net with patch [1] applied and with the qemu
> emulated device with vp_vdpa with some restrictions:
> * No CVQ. No feature that didn't work with SVQ previously (packed, ...)
> * VIRTIO_RING_F_STATE patches implementing [2].
> * Expose _F_SUSPEND, but ignore it and suspend on ring state fetch like
>    DPDK.
>
> Comments are welcome.
>
> v2:
> - Check for SUSPEND in vhost_dev.backend_cap, as .backend_features is empty at
>    the check moment.
>
> v1:
> - Omit all code working with CVQ and block migration if the device supports
>    CVQ.
> - Remove spurious kick.
Even with the spurious kick, the datapath didn't resume at the destination 
VM after LM, as the kick happened before DRIVER_OK. So IMO, the vdpa parent 
driver will need to simulate a kick after creating/starting the HW rings, 
along the lines sketched below.
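
Something like this in the parent driver could work; this is only a sketch
where my_vdpa_*, hw_kick_vq() and the hw struct layout are made-up names,
not actual driver code:

    /* Sketch: once the DRIVER_OK transition has started the HW rings,
     * re-kick every ready vq so kicks that arrived earlier are not lost. */
    static void my_vdpa_set_status(struct vdpa_device *vdev, u8 status)
    {
        struct my_vdpa *hw = container_of(vdev, struct my_vdpa, vdpa);
        int i;

        my_vdpa_apply_status(hw, status);   /* program the device */

        if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
            for (i = 0; i < hw->num_vqs; i++) {
                if (hw->vqs[i].ready)
                    hw_kick_vq(hw, i);      /* ring the HW doorbell */
            }
        }
    }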
>
> - Move all possible checks for migration to vhost-vdpa instead of the net
>    backend. Move them to init code from start code.
> - Suspend on vhost_vdpa_dev_start(false) instead of in vhost-vdpa net backend.
> - Properly split suspend after geting base and adding of status_reset patches.
> - Add possible TODOs to points where this series can improve in the future.
> - Check the state of migration using migration_in_setup and
>    migration_has_failed instead of checking all the possible migration status in
>    a switch.
> - Add TODO with possible low hand fruit using RESUME ops.
> - Always offer _F_LOG from virtio/vhost-vdpa and let migration blockers do
>    their thing instead of adding a variable.
> - RFC v2 at https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02574.html
>
> RFC v2:
> - Use a migration listener instead of a memory listener to know when
>    the migration starts.
> - Add stuff not picked with ASID patches, like enable rings after
>    driver_ok
> - Add rewinding on the migration src, not in dst
> - RFC v1 at https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01664.html
>
> [1] https://lore.kernel.org/lkml/20230203142501.300125-1-eperezma@redhat.com/T/
> [2] https://lists.oasis-open.org/archives/virtio-comment/202103/msg00036.html
>
> Eugenio Pérez (13):
>    vdpa net: move iova tree creation from init to start
>    vdpa: Negotiate _F_SUSPEND feature
>    vdpa: add vhost_vdpa_suspend
>    vdpa: move vhost reset after get vring base
>    vdpa: rewind at get_base, not set_base
>    vdpa net: allow VHOST_F_LOG_ALL
>    vdpa: add vdpa net migration state notifier
>    vdpa: disable RAM block discard only for the first device
>    vdpa net: block migration if the device has CVQ
>    vdpa: block migration if device has unsupported features
>    vdpa: block migration if dev does not have _F_SUSPEND
>    vdpa: block migration if SVQ does not admit a feature
>    vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices
>
>   include/hw/virtio/vhost-backend.h |   4 +
>   hw/virtio/vhost-vdpa.c            | 126 +++++++++++++++-----
>   hw/virtio/vhost.c                 |   3 +
>   net/vhost-vdpa.c                  | 192 +++++++++++++++++++++++++-----
>   hw/virtio/trace-events            |   1 +
>   5 files changed, 267 insertions(+), 59 deletions(-)
>
> --
> 2.31.1


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 01/13] vdpa net: move iova tree creation from init to start
  2023-02-08  9:42 ` [PATCH v2 01/13] vdpa net: move iova tree creation from init to start Eugenio Pérez
@ 2023-02-13  6:50     ` Si-Wei Liu
  0 siblings, 0 replies; 68+ messages in thread
From: Si-Wei Liu @ 2023-02-13  6:50 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Laurent Vivier, Parav Pandit, Cindy Lu, Michael S. Tsirkin,
	Gautam Dawar, virtualization, Harpreet Singh Anand, Lei Yang,
	Stefan Hajnoczi, Eli Cohen, longpeng2, Shannon Nelson,
	Liuxiangdong



On 2/8/2023 1:42 AM, Eugenio Pérez wrote:
> Only create iova_tree if and when it is needed.
>
> The cleanup keeps being responsible of last VQ but this change allows it
> to merge both cleanup functions.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> Acked-by: Jason Wang <jasowang@redhat.com>
> ---
>   net/vhost-vdpa.c | 99 ++++++++++++++++++++++++++++++++++--------------
>   1 file changed, 71 insertions(+), 28 deletions(-)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index de5ed8ff22..a9e6c8f28e 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -178,13 +178,9 @@ err_init:
>   static void vhost_vdpa_cleanup(NetClientState *nc)
>   {
>       VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> -    struct vhost_dev *dev = &s->vhost_net->dev;
>   
>       qemu_vfree(s->cvq_cmd_out_buffer);
>       qemu_vfree(s->status);
> -    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> -        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> -    }
>       if (s->vhost_net) {
>           vhost_net_cleanup(s->vhost_net);
>           g_free(s->vhost_net);
> @@ -234,10 +230,64 @@ static ssize_t vhost_vdpa_receive(NetClientState *nc, const uint8_t *buf,
>       return size;
>   }
>   
> +/** From any vdpa net client, get the netclient of first queue pair */
> +static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
> +{
> +    NICState *nic = qemu_get_nic(s->nc.peer);
> +    NetClientState *nc0 = qemu_get_peer(nic->ncs, 0);
> +
> +    return DO_UPCAST(VhostVDPAState, nc, nc0);
> +}
> +
> +static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
> +{
> +    struct vhost_vdpa *v = &s->vhost_vdpa;
> +
> +    if (v->shadow_vqs_enabled) {
> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> +                                           v->iova_range.last);
> +    }
> +}
> +
> +static int vhost_vdpa_net_data_start(NetClientState *nc)
> +{
> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> +    struct vhost_vdpa *v = &s->vhost_vdpa;
> +
> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> +
> +    if (v->index == 0) {
> +        vhost_vdpa_net_data_start_first(s);
> +        return 0;
> +    }
> +
> +    if (v->shadow_vqs_enabled) {
> +        VhostVDPAState *s0 = vhost_vdpa_net_first_nc_vdpa(s);
> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
> +    }
> +
> +    return 0;
> +}
> +
> +static void vhost_vdpa_net_client_stop(NetClientState *nc)
> +{
> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> +    struct vhost_dev *dev;
> +
> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> +
> +    dev = s->vhost_vdpa.dev;
> +    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> +        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> +    }
> +}
> +
>   static NetClientInfo net_vhost_vdpa_info = {
>           .type = NET_CLIENT_DRIVER_VHOST_VDPA,
>           .size = sizeof(VhostVDPAState),
>           .receive = vhost_vdpa_receive,
> +        .start = vhost_vdpa_net_data_start,
> +        .stop = vhost_vdpa_net_client_stop,
>           .cleanup = vhost_vdpa_cleanup,
>           .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
>           .has_ufo = vhost_vdpa_has_ufo,
> @@ -351,7 +401,7 @@ dma_map_err:
>   
>   static int vhost_vdpa_net_cvq_start(NetClientState *nc)
>   {
> -    VhostVDPAState *s;
> +    VhostVDPAState *s, *s0;
>       struct vhost_vdpa *v;
>       uint64_t backend_features;
>       int64_t cvq_group;
> @@ -425,6 +475,15 @@ out:
>           return 0;
>       }
>   
> +    s0 = vhost_vdpa_net_first_nc_vdpa(s);
> +    if (s0->vhost_vdpa.iova_tree) {
> +        /* SVQ is already configured for all virtqueues */
> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
> +    } else {
> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> +                                           v->iova_range.last);
I wonder how this case could happen; vhost_vdpa_net_data_start_first() 
should've allocated an iova tree on the first data vq. Are zero data vqs 
ever possible on net vhost-vdpa?

Thanks,
-Siwei
> +    }
> +
>       r = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer,
>                                  vhost_vdpa_net_cvq_cmd_page_len(), false);
>       if (unlikely(r < 0)) {
> @@ -449,15 +508,9 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
>       if (s->vhost_vdpa.shadow_vqs_enabled) {
>           vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
>           vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->status);
> -        if (!s->always_svq) {
> -            /*
> -             * If only the CVQ is shadowed we can delete this safely.
> -             * If all the VQs are shadows this will be needed by the time the
> -             * device is started again to register SVQ vrings and similar.
> -             */
> -            g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> -        }
>       }
> +
> +    vhost_vdpa_net_client_stop(nc);
>   }
>   
>   static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
> @@ -667,8 +720,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>                                          int nvqs,
>                                          bool is_datapath,
>                                          bool svq,
> -                                       struct vhost_vdpa_iova_range iova_range,
> -                                       VhostIOVATree *iova_tree)
> +                                       struct vhost_vdpa_iova_range iova_range)
>   {
>       NetClientState *nc = NULL;
>       VhostVDPAState *s;
> @@ -690,7 +742,6 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>       s->vhost_vdpa.shadow_vqs_enabled = svq;
>       s->vhost_vdpa.iova_range = iova_range;
>       s->vhost_vdpa.shadow_data = svq;
> -    s->vhost_vdpa.iova_tree = iova_tree;
>       if (!is_datapath) {
>           s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
>                                               vhost_vdpa_net_cvq_cmd_page_len());
> @@ -760,7 +811,6 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>       uint64_t features;
>       int vdpa_device_fd;
>       g_autofree NetClientState **ncs = NULL;
> -    g_autoptr(VhostIOVATree) iova_tree = NULL;
>       struct vhost_vdpa_iova_range iova_range;
>       NetClientState *nc;
>       int queue_pairs, r, i = 0, has_cvq = 0;
> @@ -812,12 +862,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>           goto err;
>       }
>   
> -    if (opts->x_svq) {
> -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
> -            goto err_svq;
> -        }
> -
> -        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
> +    if (opts->x_svq && !vhost_vdpa_net_valid_svq_features(features, errp)) {
> +        goto err;
>       }
>   
>       ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
> @@ -825,7 +871,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>       for (i = 0; i < queue_pairs; i++) {
>           ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>                                        vdpa_device_fd, i, 2, true, opts->x_svq,
> -                                     iova_range, iova_tree);
> +                                     iova_range);
>           if (!ncs[i])
>               goto err;
>       }
> @@ -833,13 +879,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>       if (has_cvq) {
>           nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>                                    vdpa_device_fd, i, 1, false,
> -                                 opts->x_svq, iova_range, iova_tree);
> +                                 opts->x_svq, iova_range);
>           if (!nc)
>               goto err;
>       }
>   
> -    /* iova_tree ownership belongs to last NetClientState */
> -    g_steal_pointer(&iova_tree);
>       return 0;
>   
>   err:
> @@ -849,7 +893,6 @@ err:
>           }
>       }
>   
> -err_svq:
>       qemu_close(vdpa_device_fd);
>   
>       return -1;


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 07/13] vdpa: add vdpa net migration state notifier
  2023-02-08  9:42 ` [PATCH v2 07/13] vdpa: add vdpa net migration state notifier Eugenio Pérez
@ 2023-02-13  6:50     ` Si-Wei Liu
  2023-02-22  3:55     ` Jason Wang
  1 sibling, 0 replies; 68+ messages in thread
From: Si-Wei Liu @ 2023-02-13  6:50 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Laurent Vivier, Parav Pandit, Cindy Lu, Michael S. Tsirkin,
	Gautam Dawar, virtualization, Harpreet Singh Anand, Lei Yang,
	Stefan Hajnoczi, Eli Cohen, longpeng2, Shannon Nelson,
	Liuxiangdong


On 2/8/2023 1:42 AM, Eugenio Pérez wrote:
> This allows net to restart the device backend to configure SVQ on it.
>
> Ideally, these changes should not be net specific. However, the vdpa net
> backend is the one with enough knowledge to configure everything because
> of some reasons:
> * Queues might need to be shadowed or not depending on its kind (control
>    vs data).
> * Queues need to share the same map translations (iova tree).
>
> Because of that it is cleaner to restart the whole net backend and
> configure again as expected, similar to how vhost-kernel moves between
> userspace and passthrough.
>
> If more kinds of devices need dynamic switching to SVQ we can create a
> callback struct like VhostOps and move most of the code there.
> VhostOps cannot be reused since all vdpa backend share them, and to
> personalize just for networking would be too heavy.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> v3:
> * Add TODO to use the resume operation in the future.
> * Use migration_in_setup and migration_has_failed instead of a
>    complicated switch case.
> ---
>   net/vhost-vdpa.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 76 insertions(+)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index dd686b4514..bca13f97fd 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -26,12 +26,14 @@
>   #include <err.h>
>   #include "standard-headers/linux/virtio_net.h"
>   #include "monitor/monitor.h"
> +#include "migration/misc.h"
>   #include "hw/virtio/vhost.h"
>   
>   /* Todo:need to add the multiqueue support here */
>   typedef struct VhostVDPAState {
>       NetClientState nc;
>       struct vhost_vdpa vhost_vdpa;
> +    Notifier migration_state;
>       VHostNetState *vhost_net;
>   
>       /* Control commands shadow buffers */
> @@ -241,10 +243,79 @@ static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
>       return DO_UPCAST(VhostVDPAState, nc, nc0);
>   }
>   
> +static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
> +{
> +    struct vhost_vdpa *v = &s->vhost_vdpa;
> +    VirtIONet *n;
> +    VirtIODevice *vdev;
> +    int data_queue_pairs, cvq, r;
> +    NetClientState *peer;
> +
> +    /* We are only called on the first data vqs and only if x-svq is not set */
> +    if (s->vhost_vdpa.shadow_vqs_enabled == enable) {
> +        return;
> +    }
> +
> +    vdev = v->dev->vdev;
> +    n = VIRTIO_NET(vdev);
> +    if (!n->vhost_started) {
> +        return;
What if vhost gets started after migration has started? Will svq still be 
(dynamically) enabled during vhost_dev_start()? I don't see the relevant 
code to deal with that.

> +    }
> +
> +    data_queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
> +    cvq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ) ?
> +                                  n->max_ncs - n->max_queue_pairs : 0;
> +    /*
> +     * TODO: vhost_net_stop does suspend, get_base and reset. We can be smarter
> +     * in the future and resume the device if read-only operations between
> +     * suspend and reset goes wrong.
> +     */
> +    vhost_net_stop(vdev, n->nic->ncs, data_queue_pairs, cvq);
> +
> +    peer = s->nc.peer;
> +    for (int i = 0; i < data_queue_pairs + cvq; i++) {
> +        VhostVDPAState *vdpa_state;
> +        NetClientState *nc;
> +
> +        if (i < data_queue_pairs) {
> +            nc = qemu_get_peer(peer, i);
> +        } else {
> +            nc = qemu_get_peer(peer, n->max_queue_pairs);
> +        }
> +
> +        vdpa_state = DO_UPCAST(VhostVDPAState, nc, nc);
> +        vdpa_state->vhost_vdpa.shadow_data = enable;
I don't get why shadow_data is set on the cvq's vhost_vdpa. This may result 
in an address space collision: the data vqs' iova getting improperly 
allocated in the cvq's address space in vhost_vdpa_listener_region_{add,del}(). 
Note that currently there's an issue where the guest VM's memory listener 
registration is always hooked to the last vq, which could be the cvq in a 
different iova address space, VHOST_VDPA_NET_CVQ_ASID.

Thanks,
-Siwei
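
To illustrate the layout I'm worried about (VHOST_VDPA_NET_CVQ_ASID is 1 in
net/vhost-vdpa.c; the rest is a summary of the above, not code from this
series):

    /*
     * x-svq with the CVQ isolated in its own group:
     *
     *   data vqs -> ASID 0: guest memory, translated through the iova
     *               tree only when the data vqs are shadowed
     *   cvq      -> VHOST_VDPA_NET_CVQ_ASID (1): only the shadow buffers
     *
     * The guest memory listener is registered on the last vhost_vdpa,
     * which can be the cvq one, so setting shadow_data there could make
     * guest maps get iova allocated in the wrong address space.
     */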

> +
> +        if (i < data_queue_pairs) {
> +            /* Do not override CVQ shadow_vqs_enabled */
> +            vdpa_state->vhost_vdpa.shadow_vqs_enabled = enable;
> +        }
> +    }
> +
> +    r = vhost_net_start(vdev, n->nic->ncs, data_queue_pairs, cvq);
> +    if (unlikely(r < 0)) {
> +        error_report("unable to start vhost net: %s(%d)", g_strerror(-r), -r);
> +    }
> +}
> +
> +static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data)
> +{
> +    MigrationState *migration = data;
> +    VhostVDPAState *s = container_of(notifier, VhostVDPAState,
> +                                     migration_state);
> +
> +    if (migration_in_setup(migration)) {
> +        vhost_vdpa_net_log_global_enable(s, true);
> +    } else if (migration_has_failed(migration)) {
> +        vhost_vdpa_net_log_global_enable(s, false);
> +    }
> +}
> +
>   static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
>   {
>       struct vhost_vdpa *v = &s->vhost_vdpa;
>   
> +    add_migration_state_change_notifier(&s->migration_state);
>       if (v->shadow_vqs_enabled) {
>           v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
>                                              v->iova_range.last);
> @@ -278,6 +349,10 @@ static void vhost_vdpa_net_client_stop(NetClientState *nc)
>   
>       assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>   
> +    if (s->vhost_vdpa.index == 0) {
> +        remove_migration_state_change_notifier(&s->migration_state);
> +    }
> +
>       dev = s->vhost_vdpa.dev;
>       if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
>           g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> @@ -741,6 +816,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>       s->vhost_vdpa.device_fd = vdpa_device_fd;
>       s->vhost_vdpa.index = queue_pair_index;
>       s->always_svq = svq;
> +    s->migration_state.notify = vdpa_net_migration_state_notifier;
>       s->vhost_vdpa.shadow_vqs_enabled = svq;
>       s->vhost_vdpa.iova_range = iova_range;
>       s->vhost_vdpa.shadow_data = svq;


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 09/13] vdpa net: block migration if the device has CVQ
  2023-02-08  9:42 ` [PATCH v2 09/13] vdpa net: block migration if the device has CVQ Eugenio Pérez
@ 2023-02-13  6:50     ` Si-Wei Liu
  2023-02-22  4:00     ` Jason Wang
  1 sibling, 0 replies; 68+ messages in thread
From: Si-Wei Liu @ 2023-02-13  6:50 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Laurent Vivier, Parav Pandit, Cindy Lu, Michael S. Tsirkin,
	Gautam Dawar, virtualization, Harpreet Singh Anand, Lei Yang,
	Stefan Hajnoczi, Eli Cohen, longpeng2, Shannon Nelson,
	Liuxiangdong



On 2/8/2023 1:42 AM, Eugenio Pérez wrote:
> Devices with CVQ needs to migrate state beyond vq state.  Leaving this
> to future series.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   net/vhost-vdpa.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index bca13f97fd..309861e56c 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -955,11 +955,17 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>       }
>   
>       if (has_cvq) {
> +        VhostVDPAState *s;
> +
>           nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>                                    vdpa_device_fd, i, 1, false,
>                                    opts->x_svq, iova_range);
>           if (!nc)
>               goto err;
> +
> +        s = DO_UPCAST(VhostVDPAState, nc, nc);
> +        error_setg(&s->vhost_vdpa.dev->migration_blocker,
> +                   "net vdpa cannot migrate with MQ feature");
I'm not sure how this can work: migration_blocker is only checked, and gets 
added, from vhost_dev_init(), which has already run through 
net_vhost_vdpa_init() above. The same question applies to the next patch of 
this series.

Thanks,
-Siwei
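
If the blocker really has to be installed after vhost_dev_init() has run,
maybe something at device start time is needed instead. Just a sketch of
what I mean; the migration_blocker field on VhostVDPAState is made up here,
and migrate_add_blocker() is the generic helper from "migration/blocker.h":

    /* Sketch: add the migration blocker at start time instead of relying
     * on vhost_dev_init() having consumed dev->migration_blocker. */
    static int vhost_vdpa_net_add_cvq_blocker(VhostVDPAState *s)
    {
        Error *err = NULL;

        error_setg(&s->migration_blocker,
                   "net vdpa cannot migrate with MQ feature");
        if (migrate_add_blocker(s->migration_blocker, &err) < 0) {
            error_report_err(err);
            error_free(s->migration_blocker);
            s->migration_blocker = NULL;
            return -EBUSY;
        }

        return 0;
    }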

>       }
>   
>       return 0;


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 01/13] vdpa net: move iova tree creation from init to start
  2023-02-13  6:50     ` Si-Wei Liu
  (?)
@ 2023-02-13 11:14     ` Eugenio Perez Martin
  2023-02-14  1:45         ` Si-Wei Liu
  -1 siblings, 1 reply; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-13 11:14 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella

On Mon, Feb 13, 2023 at 7:51 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 2/8/2023 1:42 AM, Eugenio Pérez wrote:
> > Only create iova_tree if and when it is needed.
> >
> > The cleanup keeps being responsible of last VQ but this change allows it
> > to merge both cleanup functions.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > Acked-by: Jason Wang <jasowang@redhat.com>
> > ---
> >   net/vhost-vdpa.c | 99 ++++++++++++++++++++++++++++++++++--------------
> >   1 file changed, 71 insertions(+), 28 deletions(-)
> >
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index de5ed8ff22..a9e6c8f28e 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -178,13 +178,9 @@ err_init:
> >   static void vhost_vdpa_cleanup(NetClientState *nc)
> >   {
> >       VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > -    struct vhost_dev *dev = &s->vhost_net->dev;
> >
> >       qemu_vfree(s->cvq_cmd_out_buffer);
> >       qemu_vfree(s->status);
> > -    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> > -        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> > -    }
> >       if (s->vhost_net) {
> >           vhost_net_cleanup(s->vhost_net);
> >           g_free(s->vhost_net);
> > @@ -234,10 +230,64 @@ static ssize_t vhost_vdpa_receive(NetClientState *nc, const uint8_t *buf,
> >       return size;
> >   }
> >
> > +/** From any vdpa net client, get the netclient of first queue pair */
> > +static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
> > +{
> > +    NICState *nic = qemu_get_nic(s->nc.peer);
> > +    NetClientState *nc0 = qemu_get_peer(nic->ncs, 0);
> > +
> > +    return DO_UPCAST(VhostVDPAState, nc, nc0);
> > +}
> > +
> > +static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
> > +{
> > +    struct vhost_vdpa *v = &s->vhost_vdpa;
> > +
> > +    if (v->shadow_vqs_enabled) {
> > +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> > +                                           v->iova_range.last);
> > +    }
> > +}
> > +
> > +static int vhost_vdpa_net_data_start(NetClientState *nc)
> > +{
> > +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +    struct vhost_vdpa *v = &s->vhost_vdpa;
> > +
> > +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > +
> > +    if (v->index == 0) {
> > +        vhost_vdpa_net_data_start_first(s);
> > +        return 0;
> > +    }
> > +
> > +    if (v->shadow_vqs_enabled) {
> > +        VhostVDPAState *s0 = vhost_vdpa_net_first_nc_vdpa(s);
> > +        v->iova_tree = s0->vhost_vdpa.iova_tree;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +static void vhost_vdpa_net_client_stop(NetClientState *nc)
> > +{
> > +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +    struct vhost_dev *dev;
> > +
> > +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > +
> > +    dev = s->vhost_vdpa.dev;
> > +    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> > +        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> > +    }
> > +}
> > +
> >   static NetClientInfo net_vhost_vdpa_info = {
> >           .type = NET_CLIENT_DRIVER_VHOST_VDPA,
> >           .size = sizeof(VhostVDPAState),
> >           .receive = vhost_vdpa_receive,
> > +        .start = vhost_vdpa_net_data_start,
> > +        .stop = vhost_vdpa_net_client_stop,
> >           .cleanup = vhost_vdpa_cleanup,
> >           .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
> >           .has_ufo = vhost_vdpa_has_ufo,
> > @@ -351,7 +401,7 @@ dma_map_err:
> >
> >   static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> >   {
> > -    VhostVDPAState *s;
> > +    VhostVDPAState *s, *s0;
> >       struct vhost_vdpa *v;
> >       uint64_t backend_features;
> >       int64_t cvq_group;
> > @@ -425,6 +475,15 @@ out:
> >           return 0;
> >       }
> >
> > +    s0 = vhost_vdpa_net_first_nc_vdpa(s);
> > +    if (s0->vhost_vdpa.iova_tree) {
> > +        /* SVQ is already configured for all virtqueues */
> > +        v->iova_tree = s0->vhost_vdpa.iova_tree;
> > +    } else {
> > +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> > +                                           v->iova_range.last);
> I wonder how this case could happen, vhost_vdpa_net_data_start_first()
> should've allocated an iova tree on the first data vq. Is zero data vq
> ever possible on net vhost-vdpa?
>

It's the case of current qemu master when only the CVQ is being
shadowed. It's not that "there are no data vqs": if that case were
possible, the CVQ vhost-vdpa state would be s0.

The case is that, since the CVQ vhost-vdpa is the only one being
shadowed, only the CVQ has an iova tree.

With this series applied and with no migration running, the case is
the same as before: only the CVQ gets shadowed. When migration starts,
all vqs are shadowed and they share the iova tree.

Thanks!
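
Putting the two situations side by side (a summary, not code from the
series):

    /*
     * Current master, x-svq=off, CVQ isolated for tracking:
     *   - data vqs are passthrough, so vhost_vdpa_net_data_start_first()
     *     creates no iova tree (s0->vhost_vdpa.iova_tree == NULL)
     *   - vhost_vdpa_net_cvq_start() takes the else branch and creates a
     *     tree just for the CVQ shadow buffers
     *
     * Migration running (or x-svq=on):
     *   - the first data vq creates the tree, and every other vq, CVQ
     *     included, reuses it through vhost_vdpa_net_first_nc_vdpa()
     */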

> Thanks,
> -Siwei
> > +    }
> > +
> >       r = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer,
> >                                  vhost_vdpa_net_cvq_cmd_page_len(), false);
> >       if (unlikely(r < 0)) {
> > @@ -449,15 +508,9 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
> >       if (s->vhost_vdpa.shadow_vqs_enabled) {
> >           vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
> >           vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->status);
> > -        if (!s->always_svq) {
> > -            /*
> > -             * If only the CVQ is shadowed we can delete this safely.
> > -             * If all the VQs are shadows this will be needed by the time the
> > -             * device is started again to register SVQ vrings and similar.
> > -             */
> > -            g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> > -        }
> >       }
> > +
> > +    vhost_vdpa_net_client_stop(nc);
> >   }
> >
> >   static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
> > @@ -667,8 +720,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >                                          int nvqs,
> >                                          bool is_datapath,
> >                                          bool svq,
> > -                                       struct vhost_vdpa_iova_range iova_range,
> > -                                       VhostIOVATree *iova_tree)
> > +                                       struct vhost_vdpa_iova_range iova_range)
> >   {
> >       NetClientState *nc = NULL;
> >       VhostVDPAState *s;
> > @@ -690,7 +742,6 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >       s->vhost_vdpa.shadow_vqs_enabled = svq;
> >       s->vhost_vdpa.iova_range = iova_range;
> >       s->vhost_vdpa.shadow_data = svq;
> > -    s->vhost_vdpa.iova_tree = iova_tree;
> >       if (!is_datapath) {
> >           s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
> >                                               vhost_vdpa_net_cvq_cmd_page_len());
> > @@ -760,7 +811,6 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >       uint64_t features;
> >       int vdpa_device_fd;
> >       g_autofree NetClientState **ncs = NULL;
> > -    g_autoptr(VhostIOVATree) iova_tree = NULL;
> >       struct vhost_vdpa_iova_range iova_range;
> >       NetClientState *nc;
> >       int queue_pairs, r, i = 0, has_cvq = 0;
> > @@ -812,12 +862,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >           goto err;
> >       }
> >
> > -    if (opts->x_svq) {
> > -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
> > -            goto err_svq;
> > -        }
> > -
> > -        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
> > +    if (opts->x_svq && !vhost_vdpa_net_valid_svq_features(features, errp)) {
> > +        goto err;
> >       }
> >
> >       ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
> > @@ -825,7 +871,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >       for (i = 0; i < queue_pairs; i++) {
> >           ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >                                        vdpa_device_fd, i, 2, true, opts->x_svq,
> > -                                     iova_range, iova_tree);
> > +                                     iova_range);
> >           if (!ncs[i])
> >               goto err;
> >       }
> > @@ -833,13 +879,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >       if (has_cvq) {
> >           nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >                                    vdpa_device_fd, i, 1, false,
> > -                                 opts->x_svq, iova_range, iova_tree);
> > +                                 opts->x_svq, iova_range);
> >           if (!nc)
> >               goto err;
> >       }
> >
> > -    /* iova_tree ownership belongs to last NetClientState */
> > -    g_steal_pointer(&iova_tree);
> >       return 0;
> >
> >   err:
> > @@ -849,7 +893,6 @@ err:
> >           }
> >       }
> >
> > -err_svq:
> >       qemu_close(vdpa_device_fd);
> >
> >       return -1;
>



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 07/13] vdpa: add vdpa net migration state notifier
  2023-02-13  6:50     ` Si-Wei Liu
  (?)
@ 2023-02-13 15:51     ` Eugenio Perez Martin
  -1 siblings, 0 replies; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-13 15:51 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella

On Mon, Feb 13, 2023 at 7:51 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
> On 2/8/2023 1:42 AM, Eugenio Pérez wrote:
> > This allows net to restart the device backend to configure SVQ on it.
> >
> > Ideally, these changes should not be net specific. However, the vdpa net
> > backend is the one with enough knowledge to configure everything because
> > of some reasons:
> > * Queues might need to be shadowed or not depending on its kind (control
> >    vs data).
> > * Queues need to share the same map translations (iova tree).
> >
> > Because of that it is cleaner to restart the whole net backend and
> > configure again as expected, similar to how vhost-kernel moves between
> > userspace and passthrough.
> >
> > If more kinds of devices need dynamic switching to SVQ we can create a
> > callback struct like VhostOps and move most of the code there.
> > VhostOps cannot be reused since all vdpa backend share them, and to
> > personalize just for networking would be too heavy.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> > v3:
> > * Add TODO to use the resume operation in the future.
> > * Use migration_in_setup and migration_has_failed instead of a
> >    complicated switch case.
> > ---
> >   net/vhost-vdpa.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 76 insertions(+)
> >
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index dd686b4514..bca13f97fd 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -26,12 +26,14 @@
> >   #include <err.h>
> >   #include "standard-headers/linux/virtio_net.h"
> >   #include "monitor/monitor.h"
> > +#include "migration/misc.h"
> >   #include "hw/virtio/vhost.h"
> >
> >   /* Todo:need to add the multiqueue support here */
> >   typedef struct VhostVDPAState {
> >       NetClientState nc;
> >       struct vhost_vdpa vhost_vdpa;
> > +    Notifier migration_state;
> >       VHostNetState *vhost_net;
> >
> >       /* Control commands shadow buffers */
> > @@ -241,10 +243,79 @@ static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
> >       return DO_UPCAST(VhostVDPAState, nc, nc0);
> >   }
> >
> > +static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
> > +{
> > +    struct vhost_vdpa *v = &s->vhost_vdpa;
> > +    VirtIONet *n;
> > +    VirtIODevice *vdev;
> > +    int data_queue_pairs, cvq, r;
> > +    NetClientState *peer;
> > +
> > +    /* We are only called on the first data vqs and only if x-svq is not set */
> > +    if (s->vhost_vdpa.shadow_vqs_enabled == enable) {
> > +        return;
> > +    }
> > +
> > +    vdev = v->dev->vdev;
> > +    n = VIRTIO_NET(vdev);
> > +    if (!n->vhost_started) {
> > +        return;
> What if vhost gets started after migration is started, will svq still be
> (dynamically) enabled during vhost_dev_start()? I don't see relevant
> code to deal with it?
>

Good catch. v->shadow_vqs_enabled must be updated even if
!n->vhost_started; that should be the only code change needed. See the
sketch below.

Also, the migration listener must be registered at qemu startup, not at
device start.
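
A minimal sketch of what I mean, on top of this patch (most likely the loop
over the peers needs the same treatment; this only shows the early-return
path):

    vdev = v->dev->vdev;
    n = VIRTIO_NET(vdev);
    if (!n->vhost_started) {
        /* Still record the desired mode, so the next vhost_dev_start()
         * configures (or skips) SVQ accordingly. */
        s->vhost_vdpa.shadow_vqs_enabled = enable;
        return;
    }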

> > +    }
> > +
> > +    data_queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
> > +    cvq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ) ?
> > +                                  n->max_ncs - n->max_queue_pairs : 0;
> > +    /*
> > +     * TODO: vhost_net_stop does suspend, get_base and reset. We can be smarter
> > +     * in the future and resume the device if read-only operations between
> > +     * suspend and reset goes wrong.
> > +     */
> > +    vhost_net_stop(vdev, n->nic->ncs, data_queue_pairs, cvq);
> > +
> > +    peer = s->nc.peer;
> > +    for (int i = 0; i < data_queue_pairs + cvq; i++) {
> > +        VhostVDPAState *vdpa_state;
> > +        NetClientState *nc;
> > +
> > +        if (i < data_queue_pairs) {
> > +            nc = qemu_get_peer(peer, i);
> > +        } else {
> > +            nc = qemu_get_peer(peer, n->max_queue_pairs);
> > +        }
> > +
> > +        vdpa_state = DO_UPCAST(VhostVDPAState, nc, nc);
> > +        vdpa_state->vhost_vdpa.shadow_data = enable;
> Don't get why shadow_data is set on cvq's vhost_vdpa? This may result in
> address space collision: data vq's iova getting improperly allocated on
> cvq's address space in vhost_vdpa_listener_region_{add,del}(). Noted
> currently there's an issue where guest VM's memory listener registration
> is always hooked to the last vq, which could be on the cvq in a
> different iova address space VHOST_VDPA_NET_CVQ_ASID.
>

Let me answer in reverse. The guest VM's memory listener registration is
effectively always hooked to the last vq; that's why shadow_data is
needed.

In the past it was enough with v->shadow_vqs_enabled. However, since
the introduction of ASID support and CVQ tracking through it, the
listener (hooked at the CVQ) needs to know whether it should use the
iova tree or not. That's why a separate variable, shadow_data, is
needed.

That way, it may happen that cvq vhost_vdpa->shadow_vqs_enabled = true
but cvq vhost_vdpa->shadow_data = false.

Is that clearer?

Thanks!
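
Summarizing the flag combinations on the CVQ's vhost_vdpa, as I understand
them:

    /*
     * All vqs shadowed (x-svq=on, or migration in progress):
     *     shadow_vqs_enabled == true,  shadow_data == true
     *
     * Only the CVQ shadowed (CVQ tracking through its ASID, no migration):
     *     shadow_vqs_enabled == true,  shadow_data == false
     *
     * The memory listener (hooked at the CVQ) checks shadow_data, so guest
     * memory only goes through the iova tree when the data vqs are shadowed.
     */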

> Thanks,
> -Siwei
>
> > +
> > +        if (i < data_queue_pairs) {
> > +            /* Do not override CVQ shadow_vqs_enabled */
> > +            vdpa_state->vhost_vdpa.shadow_vqs_enabled = enable;
> > +        }
> > +    }
> > +
> > +    r = vhost_net_start(vdev, n->nic->ncs, data_queue_pairs, cvq);
> > +    if (unlikely(r < 0)) {
> > +        error_report("unable to start vhost net: %s(%d)", g_strerror(-r), -r);
> > +    }
> > +}
> > +
> > +static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data)
> > +{
> > +    MigrationState *migration = data;
> > +    VhostVDPAState *s = container_of(notifier, VhostVDPAState,
> > +                                     migration_state);
> > +
> > +    if (migration_in_setup(migration)) {
> > +        vhost_vdpa_net_log_global_enable(s, true);
> > +    } else if (migration_has_failed(migration)) {
> > +        vhost_vdpa_net_log_global_enable(s, false);
> > +    }
> > +}
> > +
> >   static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
> >   {
> >       struct vhost_vdpa *v = &s->vhost_vdpa;
> >
> > +    add_migration_state_change_notifier(&s->migration_state);
> >       if (v->shadow_vqs_enabled) {
> >           v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> >                                              v->iova_range.last);
> > @@ -278,6 +349,10 @@ static void vhost_vdpa_net_client_stop(NetClientState *nc)
> >
> >       assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >
> > +    if (s->vhost_vdpa.index == 0) {
> > +        remove_migration_state_change_notifier(&s->migration_state);
> > +    }
> > +
> >       dev = s->vhost_vdpa.dev;
> >       if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> >           g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> > @@ -741,6 +816,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >       s->vhost_vdpa.device_fd = vdpa_device_fd;
> >       s->vhost_vdpa.index = queue_pair_index;
> >       s->always_svq = svq;
> > +    s->migration_state.notify = vdpa_net_migration_state_notifier;
> >       s->vhost_vdpa.shadow_vqs_enabled = svq;
> >       s->vhost_vdpa.iova_range = iova_range;
> >       s->vhost_vdpa.shadow_data = svq;
>



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 01/13] vdpa net: move iova tree creation from init to start
  2023-02-13 11:14     ` Eugenio Perez Martin
@ 2023-02-14  1:45         ` Si-Wei Liu
  0 siblings, 0 replies; 68+ messages in thread
From: Si-Wei Liu @ 2023-02-14  1:45 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Laurent Vivier, Parav Pandit, Cindy Lu, Michael S. Tsirkin,
	qemu-devel, Gautam Dawar, virtualization, Harpreet Singh Anand,
	Lei Yang, Stefan Hajnoczi, Eli Cohen, longpeng2, Shannon Nelson,
	Liuxiangdong



On 2/13/2023 3:14 AM, Eugenio Perez Martin wrote:
> On Mon, Feb 13, 2023 at 7:51 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 2/8/2023 1:42 AM, Eugenio Pérez wrote:
>>> Only create iova_tree if and when it is needed.
>>>
>>> The cleanup keeps being responsible of last VQ but this change allows it
>>> to merge both cleanup functions.
>>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> Acked-by: Jason Wang <jasowang@redhat.com>
>>> ---
>>>    net/vhost-vdpa.c | 99 ++++++++++++++++++++++++++++++++++--------------
>>>    1 file changed, 71 insertions(+), 28 deletions(-)
>>>
>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>> index de5ed8ff22..a9e6c8f28e 100644
>>> --- a/net/vhost-vdpa.c
>>> +++ b/net/vhost-vdpa.c
>>> @@ -178,13 +178,9 @@ err_init:
>>>    static void vhost_vdpa_cleanup(NetClientState *nc)
>>>    {
>>>        VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>>> -    struct vhost_dev *dev = &s->vhost_net->dev;
>>>
>>>        qemu_vfree(s->cvq_cmd_out_buffer);
>>>        qemu_vfree(s->status);
>>> -    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
>>> -        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
>>> -    }
>>>        if (s->vhost_net) {
>>>            vhost_net_cleanup(s->vhost_net);
>>>            g_free(s->vhost_net);
>>> @@ -234,10 +230,64 @@ static ssize_t vhost_vdpa_receive(NetClientState *nc, const uint8_t *buf,
>>>        return size;
>>>    }
>>>
>>> +/** From any vdpa net client, get the netclient of first queue pair */
>>> +static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
>>> +{
>>> +    NICState *nic = qemu_get_nic(s->nc.peer);
>>> +    NetClientState *nc0 = qemu_get_peer(nic->ncs, 0);
>>> +
>>> +    return DO_UPCAST(VhostVDPAState, nc, nc0);
>>> +}
>>> +
>>> +static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
>>> +{
>>> +    struct vhost_vdpa *v = &s->vhost_vdpa;
>>> +
>>> +    if (v->shadow_vqs_enabled) {
>>> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
>>> +                                           v->iova_range.last);
>>> +    }
>>> +}
>>> +
>>> +static int vhost_vdpa_net_data_start(NetClientState *nc)
>>> +{
>>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>>> +    struct vhost_vdpa *v = &s->vhost_vdpa;
>>> +
>>> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>>> +
>>> +    if (v->index == 0) {
>>> +        vhost_vdpa_net_data_start_first(s);
>>> +        return 0;
>>> +    }
>>> +
>>> +    if (v->shadow_vqs_enabled) {
>>> +        VhostVDPAState *s0 = vhost_vdpa_net_first_nc_vdpa(s);
>>> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static void vhost_vdpa_net_client_stop(NetClientState *nc)
>>> +{
>>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>>> +    struct vhost_dev *dev;
>>> +
>>> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>>> +
>>> +    dev = s->vhost_vdpa.dev;
>>> +    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
>>> +        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
>>> +    }
>>> +}
>>> +
>>>    static NetClientInfo net_vhost_vdpa_info = {
>>>            .type = NET_CLIENT_DRIVER_VHOST_VDPA,
>>>            .size = sizeof(VhostVDPAState),
>>>            .receive = vhost_vdpa_receive,
>>> +        .start = vhost_vdpa_net_data_start,
>>> +        .stop = vhost_vdpa_net_client_stop,
>>>            .cleanup = vhost_vdpa_cleanup,
>>>            .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
>>>            .has_ufo = vhost_vdpa_has_ufo,
>>> @@ -351,7 +401,7 @@ dma_map_err:
>>>
>>>    static int vhost_vdpa_net_cvq_start(NetClientState *nc)
>>>    {
>>> -    VhostVDPAState *s;
>>> +    VhostVDPAState *s, *s0;
>>>        struct vhost_vdpa *v;
>>>        uint64_t backend_features;
>>>        int64_t cvq_group;
>>> @@ -425,6 +475,15 @@ out:
>>>            return 0;
>>>        }
>>>
>>> +    s0 = vhost_vdpa_net_first_nc_vdpa(s);
>>> +    if (s0->vhost_vdpa.iova_tree) {
>>> +        /* SVQ is already configured for all virtqueues */
>>> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
>>> +    } else {
>>> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
>>> +                                           v->iova_range.last);
>> I wonder how this case could happen, vhost_vdpa_net_data_start_first()
>> should've allocated an iova tree on the first data vq. Is zero data vq
>> ever possible on net vhost-vdpa?
>>
> It's the case of the current qemu master when only CVQ is being
> shadowed. It's not that "there are no data vq": If that case were
> possible, CVQ vhost-vdpa state would be s0.
>
> The case is that since only CVQ vhost-vdpa is the one being migrated,
> only CVQ has an iova tree.
OK, so this corresponds to the case where live migration is not started 
and CVQ starts in its own address space of VHOST_VDPA_NET_CVQ_ASID. 
Thanks for explaining it!

>
> With this series applied and with no migration running, the case is
> the same as before: only SVQ gets shadowed. When migration starts, all
> vqs are migrated, and share iova tree.
I wonder what the reason is to share the iova tree when migration
starts; I think CVQ may stay in its own VHOST_VDPA_NET_CVQ_ASID still?

Actually there's a discrepancy in vhost_vdpa_net_log_global_enable(); I
don't see explicit code to switch from VHOST_VDPA_NET_CVQ_ASID to
VHOST_VDPA_GUEST_PA_ASID for the CVQ. This is the address space
collision I mentioned earlier:

9585@1676093788.259201:vhost_vdpa_dma_map vdpa:0x7ff13088a190 fd: 16 
msg_type: 2 asid: 0 iova: 0x1000 size: 0x2000 uaddr: 0x55a5a7ff3000 
perm: 0x1 type: 2
9585@1676093788.279923:vhost_vdpa_dma_map vdpa:0x7ff13088a190 fd: 16 
msg_type: 2 asid: 0 iova: 0x3000 size: 0x1000 uaddr: 0x55a5a7ff6000 
perm: 0x3 type: 2
9585@1676093788.290529:vhost_vdpa_set_vring_addr dev: 0x55a5a77cec20 
index: 0 flags: 0x0 desc_user_addr: 0x1000 used_user_addr: 0x3000 
avail_user_addr: 0x2000 log_guest_addr: 0x0
:
:
9585@1676093788.543567:vhost_vdpa_dma_map vdpa:0x7ff1302b6190 fd: 16 
msg_type: 2 asid: 0 iova: 0x16000 size: 0x2000 uaddr: 0x55a5a7959000 
perm: 0x1 type: 2
9585@1676093788.576923:vhost_vdpa_dma_map vdpa:0x7ff1302b6190 fd: 16 
msg_type: 2 asid: 0 iova: 0x18000 size: 0x1000 uaddr: 0x55a5a795c000 
perm: 0x3 type: 2
9585@1676093788.593881:vhost_vdpa_set_vring_addr dev: 0x55a5a7580930 
index: 7 flags: 0x0 desc_user_addr: 0x16000 used_user_addr: 0x18000 
avail_user_addr: 0x17000 log_guest_addr: 0x0
9585@1676093788.593904:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16 
msg_type: 2 asid: 1 iova: 0x19000 size: 0x1000 uaddr: 0x55a5a77f8000 
perm: 0x1 type: 2
9585@1676093788.606448:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16 
msg_type: 2 asid: 1 iova: 0x1a000 size: 0x1000 uaddr: 0x55a5a77fa000 
perm: 0x3 type: 2
9585@1676093788.616253:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16 
msg_type: 2 asid: 1 iova: 0x1b000 size: 0x1000 uaddr: 0x55a5a795f000 
perm: 0x1 type: 2
9585@1676093788.625956:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16 
msg_type: 2 asid: 1 iova: 0x1c000 size: 0x1000 uaddr: 0x55a5a7f4e000 
perm: 0x3 type: 2
9585@1676093788.635655:vhost_vdpa_set_vring_addr dev: 0x55a5a7580ec0 
index: 8 flags: 0x0 desc_user_addr: 0x1b000 used_user_addr: 0x1c000 
avail_user_addr: 0x1b400 log_guest_addr: 0x0
9585@1676093788.635667:vhost_vdpa_listener_region_add vdpa: 
0x7ff13026d190 iova 0x0 llend 0xa0000 vaddr: 0x7fef1fe00000 read-only: 0
9585@1676093788.635670:vhost_vdpa_listener_begin_batch 
vdpa:0x7ff13026d190 fd: 16 msg_type: 2 type: 5
9585@1676093788.635677:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16 
msg_type: 2 asid: 0 iova: 0x0 size: 0xa0000 uaddr: 0x7fef1fe00000 perm: 
0x3 type: 2
2023-02-11T05:36:28.635686Z qemu-system-x86_64: failed to write, fd=16, 
errno=14 (Bad address)
2023-02-11T05:36:28.635721Z qemu-system-x86_64: vhost vdpa map fail!
2023-02-11T05:36:28.635744Z qemu-system-x86_64: vhost-vdpa: DMA mapping 
failed, unable to continue


Regards,
-Siwei
>
> Thanks!
>
>> Thanks,
>> -Siwei
>>> +    }
>>> +
>>>        r = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer,
>>>                                   vhost_vdpa_net_cvq_cmd_page_len(), false);
>>>        if (unlikely(r < 0)) {
>>> @@ -449,15 +508,9 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
>>>        if (s->vhost_vdpa.shadow_vqs_enabled) {
>>>            vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
>>>            vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->status);
>>> -        if (!s->always_svq) {
>>> -            /*
>>> -             * If only the CVQ is shadowed we can delete this safely.
>>> -             * If all the VQs are shadows this will be needed by the time the
>>> -             * device is started again to register SVQ vrings and similar.
>>> -             */
>>> -            g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
>>> -        }
>>>        }
>>> +
>>> +    vhost_vdpa_net_client_stop(nc);
>>>    }
>>>
>>>    static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
>>> @@ -667,8 +720,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>>>                                           int nvqs,
>>>                                           bool is_datapath,
>>>                                           bool svq,
>>> -                                       struct vhost_vdpa_iova_range iova_range,
>>> -                                       VhostIOVATree *iova_tree)
>>> +                                       struct vhost_vdpa_iova_range iova_range)
>>>    {
>>>        NetClientState *nc = NULL;
>>>        VhostVDPAState *s;
>>> @@ -690,7 +742,6 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>>>        s->vhost_vdpa.shadow_vqs_enabled = svq;
>>>        s->vhost_vdpa.iova_range = iova_range;
>>>        s->vhost_vdpa.shadow_data = svq;
>>> -    s->vhost_vdpa.iova_tree = iova_tree;
>>>        if (!is_datapath) {
>>>            s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
>>>                                                vhost_vdpa_net_cvq_cmd_page_len());
>>> @@ -760,7 +811,6 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>        uint64_t features;
>>>        int vdpa_device_fd;
>>>        g_autofree NetClientState **ncs = NULL;
>>> -    g_autoptr(VhostIOVATree) iova_tree = NULL;
>>>        struct vhost_vdpa_iova_range iova_range;
>>>        NetClientState *nc;
>>>        int queue_pairs, r, i = 0, has_cvq = 0;
>>> @@ -812,12 +862,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>            goto err;
>>>        }
>>>
>>> -    if (opts->x_svq) {
>>> -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
>>> -            goto err_svq;
>>> -        }
>>> -
>>> -        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
>>> +    if (opts->x_svq && !vhost_vdpa_net_valid_svq_features(features, errp)) {
>>> +        goto err;
>>>        }
>>>
>>>        ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
>>> @@ -825,7 +871,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>        for (i = 0; i < queue_pairs; i++) {
>>>            ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>>>                                         vdpa_device_fd, i, 2, true, opts->x_svq,
>>> -                                     iova_range, iova_tree);
>>> +                                     iova_range);
>>>            if (!ncs[i])
>>>                goto err;
>>>        }
>>> @@ -833,13 +879,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>        if (has_cvq) {
>>>            nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>>>                                     vdpa_device_fd, i, 1, false,
>>> -                                 opts->x_svq, iova_range, iova_tree);
>>> +                                 opts->x_svq, iova_range);
>>>            if (!nc)
>>>                goto err;
>>>        }
>>>
>>> -    /* iova_tree ownership belongs to last NetClientState */
>>> -    g_steal_pointer(&iova_tree);
>>>        return 0;
>>>
>>>    err:
>>> @@ -849,7 +893,6 @@ err:
>>>            }
>>>        }
>>>
>>> -err_svq:
>>>        qemu_close(vdpa_device_fd);
>>>
>>>        return -1;


^ permalink raw reply	[flat|nested] 68+ messages in thread


* Re: [PATCH v2 09/13] vdpa net: block migration if the device has CVQ
  2023-02-13  6:50     ` Si-Wei Liu
  (?)
@ 2023-02-14 18:06     ` Eugenio Perez Martin
  -1 siblings, 0 replies; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-14 18:06 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella

On Mon, Feb 13, 2023 at 7:51 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 2/8/2023 1:42 AM, Eugenio Pérez wrote:
> > Devices with CVQ needs to migrate state beyond vq state.  Leaving this
> > to future series.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >   net/vhost-vdpa.c | 6 ++++++
> >   1 file changed, 6 insertions(+)
> >
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index bca13f97fd..309861e56c 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -955,11 +955,17 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >       }
> >
> >       if (has_cvq) {
> > +        VhostVDPAState *s;
> > +
> >           nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >                                    vdpa_device_fd, i, 1, false,
> >                                    opts->x_svq, iova_range);
> >           if (!nc)
> >               goto err;
> > +
> > +        s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +        error_setg(&s->vhost_vdpa.dev->migration_blocker,
> > +                   "net vdpa cannot migrate with MQ feature");
> Not sure how this can work: migration_blocker is only checked and gets
> added from vhost_dev_init(), which is already done through
> net_vhost_vdpa_init() above. Same question applies to the next patch of
> this series.
>

Good catch, fixing in v3.
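
The blocker needs to exist before vhost_dev_init() runs, so it probably
needs to live in VhostVDPAState and get registered at start time.
Something in this direction (just a sketch, the final shape may differ):

    /* net_init_vhost_vdpa(), after creating the CVQ netclient */
    s = DO_UPCAST(VhostVDPAState, nc, nc);
    error_setg(&s->migration_blocker,
               "net vdpa cannot migrate with CVQ feature");

    /* vhost_vdpa_net_data_start_first() */
    if (s->migration_blocker) {
        migrate_add_blocker(s->migration_blocker, &error_abort);
    }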

Thanks!



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 01/13] vdpa net: move iova tree creation from init to start
  2023-02-14  1:45         ` Si-Wei Liu
  (?)
@ 2023-02-14 19:07         ` Eugenio Perez Martin
  2023-02-16  2:14             ` Si-Wei Liu
  -1 siblings, 1 reply; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-14 19:07 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella

On Tue, Feb 14, 2023 at 2:45 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 2/13/2023 3:14 AM, Eugenio Perez Martin wrote:
> > On Mon, Feb 13, 2023 at 7:51 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >>
> >>
> >> On 2/8/2023 1:42 AM, Eugenio Pérez wrote:
> >>> Only create iova_tree if and when it is needed.
> >>>
> >>> The cleanup keeps being responsible of last VQ but this change allows it
> >>> to merge both cleanup functions.
> >>>
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> Acked-by: Jason Wang <jasowang@redhat.com>
> >>> ---
> >>>    net/vhost-vdpa.c | 99 ++++++++++++++++++++++++++++++++++--------------
> >>>    1 file changed, 71 insertions(+), 28 deletions(-)
> >>>
> >>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> >>> index de5ed8ff22..a9e6c8f28e 100644
> >>> --- a/net/vhost-vdpa.c
> >>> +++ b/net/vhost-vdpa.c
> >>> @@ -178,13 +178,9 @@ err_init:
> >>>    static void vhost_vdpa_cleanup(NetClientState *nc)
> >>>    {
> >>>        VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>> -    struct vhost_dev *dev = &s->vhost_net->dev;
> >>>
> >>>        qemu_vfree(s->cvq_cmd_out_buffer);
> >>>        qemu_vfree(s->status);
> >>> -    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> >>> -        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> >>> -    }
> >>>        if (s->vhost_net) {
> >>>            vhost_net_cleanup(s->vhost_net);
> >>>            g_free(s->vhost_net);
> >>> @@ -234,10 +230,64 @@ static ssize_t vhost_vdpa_receive(NetClientState *nc, const uint8_t *buf,
> >>>        return size;
> >>>    }
> >>>
> >>> +/** From any vdpa net client, get the netclient of first queue pair */
> >>> +static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
> >>> +{
> >>> +    NICState *nic = qemu_get_nic(s->nc.peer);
> >>> +    NetClientState *nc0 = qemu_get_peer(nic->ncs, 0);
> >>> +
> >>> +    return DO_UPCAST(VhostVDPAState, nc, nc0);
> >>> +}
> >>> +
> >>> +static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
> >>> +{
> >>> +    struct vhost_vdpa *v = &s->vhost_vdpa;
> >>> +
> >>> +    if (v->shadow_vqs_enabled) {
> >>> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> >>> +                                           v->iova_range.last);
> >>> +    }
> >>> +}
> >>> +
> >>> +static int vhost_vdpa_net_data_start(NetClientState *nc)
> >>> +{
> >>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>> +    struct vhost_vdpa *v = &s->vhost_vdpa;
> >>> +
> >>> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >>> +
> >>> +    if (v->index == 0) {
> >>> +        vhost_vdpa_net_data_start_first(s);
> >>> +        return 0;
> >>> +    }
> >>> +
> >>> +    if (v->shadow_vqs_enabled) {
> >>> +        VhostVDPAState *s0 = vhost_vdpa_net_first_nc_vdpa(s);
> >>> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
> >>> +    }
> >>> +
> >>> +    return 0;
> >>> +}
> >>> +
> >>> +static void vhost_vdpa_net_client_stop(NetClientState *nc)
> >>> +{
> >>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>> +    struct vhost_dev *dev;
> >>> +
> >>> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >>> +
> >>> +    dev = s->vhost_vdpa.dev;
> >>> +    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> >>> +        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> >>> +    }
> >>> +}
> >>> +
> >>>    static NetClientInfo net_vhost_vdpa_info = {
> >>>            .type = NET_CLIENT_DRIVER_VHOST_VDPA,
> >>>            .size = sizeof(VhostVDPAState),
> >>>            .receive = vhost_vdpa_receive,
> >>> +        .start = vhost_vdpa_net_data_start,
> >>> +        .stop = vhost_vdpa_net_client_stop,
> >>>            .cleanup = vhost_vdpa_cleanup,
> >>>            .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
> >>>            .has_ufo = vhost_vdpa_has_ufo,
> >>> @@ -351,7 +401,7 @@ dma_map_err:
> >>>
> >>>    static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> >>>    {
> >>> -    VhostVDPAState *s;
> >>> +    VhostVDPAState *s, *s0;
> >>>        struct vhost_vdpa *v;
> >>>        uint64_t backend_features;
> >>>        int64_t cvq_group;
> >>> @@ -425,6 +475,15 @@ out:
> >>>            return 0;
> >>>        }
> >>>
> >>> +    s0 = vhost_vdpa_net_first_nc_vdpa(s);
> >>> +    if (s0->vhost_vdpa.iova_tree) {
> >>> +        /* SVQ is already configured for all virtqueues */
> >>> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
> >>> +    } else {
> >>> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> >>> +                                           v->iova_range.last);
> >> I wonder how this case could happen, vhost_vdpa_net_data_start_first()
> >> should've allocated an iova tree on the first data vq. Is zero data vq
> >> ever possible on net vhost-vdpa?
> >>
> > It's the case of the current qemu master when only CVQ is being
> > shadowed. It's not that "there are no data vq": If that case were
> > possible, CVQ vhost-vdpa state would be s0.
> >
> > The case is that since only CVQ vhost-vdpa is the one being migrated,
> > only CVQ has an iova tree.
> OK, so this corresponds to the case where live migration is not started
> and CVQ starts in its own address space of VHOST_VDPA_NET_CVQ_ASID.
> Thanks for explaining it!
>
> >
> > With this series applied and with no migration running, the case is
> > the same as before: only SVQ gets shadowed. When migration starts, all
> > vqs are migrated, and share iova tree.
> I wonder what the reason is to share the iova tree when migration
> starts; I think CVQ may stay in its own VHOST_VDPA_NET_CVQ_ASID still?
>
> Actually there's a discrepancy in vhost_vdpa_net_log_global_enable(); I
> don't see explicit code to switch from VHOST_VDPA_NET_CVQ_ASID to
> VHOST_VDPA_GUEST_PA_ASID for the CVQ. This is the address space
> collision I mentioned earlier:
>

There is no such change. This code only migrates devices with no CVQ,
as they have their own difficulties.

In the previous RFC there was no such change either. Since it's hard
to modify the passthrough devices' IOVA tree, CVQ AS updates keep using
VHOST_VDPA_NET_CVQ_ASID.

They both share the same IOVA tree though, just for simplicity. If
address space exhaustion is a problem we can make them independent,
but this complicates the code a little bit.
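
For reference, the CVQ buffer mapping does roughly this (simplified from
vhost_vdpa_cvq_map_buf, omitting the error unwinding):

    DMAMap map = {
        .translated_addr = (hwaddr)(uintptr_t)buf,
        .size = size - 1,
        .perm = write ? IOMMU_RW : IOMMU_RO,
    };
    int r;

    /* iova ranges are allocated from the tree shared with the data vqs */
    r = vhost_iova_tree_map_alloc(v->iova_tree, &map);
    if (unlikely(r != IOVA_OK)) {
        return -ENOMEM;
    }

    /* ...but the mapping itself is installed in the CVQ address space */
    r = vhost_vdpa_dma_map(v, VHOST_VDPA_NET_CVQ_ASID, map.iova, size,
                           buf, !write);

That way iova allocations cannot overlap between ASIDs, at the cost of
sharing the iova range.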

> 9585@1676093788.259201:vhost_vdpa_dma_map vdpa:0x7ff13088a190 fd: 16
> msg_type: 2 asid: 0 iova: 0x1000 size: 0x2000 uaddr: 0x55a5a7ff3000
> perm: 0x1 type: 2
> 9585@1676093788.279923:vhost_vdpa_dma_map vdpa:0x7ff13088a190 fd: 16
> msg_type: 2 asid: 0 iova: 0x3000 size: 0x1000 uaddr: 0x55a5a7ff6000
> perm: 0x3 type: 2
> 9585@1676093788.290529:vhost_vdpa_set_vring_addr dev: 0x55a5a77cec20
> index: 0 flags: 0x0 desc_user_addr: 0x1000 used_user_addr: 0x3000
> avail_user_addr: 0x2000 log_guest_addr: 0x0
> :
> :
> 9585@1676093788.543567:vhost_vdpa_dma_map vdpa:0x7ff1302b6190 fd: 16
> msg_type: 2 asid: 0 iova: 0x16000 size: 0x2000 uaddr: 0x55a5a7959000
> perm: 0x1 type: 2
> 9585@1676093788.576923:vhost_vdpa_dma_map vdpa:0x7ff1302b6190 fd: 16
> msg_type: 2 asid: 0 iova: 0x18000 size: 0x1000 uaddr: 0x55a5a795c000
> perm: 0x3 type: 2
> 9585@1676093788.593881:vhost_vdpa_set_vring_addr dev: 0x55a5a7580930
> index: 7 flags: 0x0 desc_user_addr: 0x16000 used_user_addr: 0x18000
> avail_user_addr: 0x17000 log_guest_addr: 0x0
> 9585@1676093788.593904:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> msg_type: 2 asid: 1 iova: 0x19000 size: 0x1000 uaddr: 0x55a5a77f8000
> perm: 0x1 type: 2
> 9585@1676093788.606448:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> msg_type: 2 asid: 1 iova: 0x1a000 size: 0x1000 uaddr: 0x55a5a77fa000
> perm: 0x3 type: 2
> 9585@1676093788.616253:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> msg_type: 2 asid: 1 iova: 0x1b000 size: 0x1000 uaddr: 0x55a5a795f000
> perm: 0x1 type: 2
> 9585@1676093788.625956:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> msg_type: 2 asid: 1 iova: 0x1c000 size: 0x1000 uaddr: 0x55a5a7f4e000
> perm: 0x3 type: 2
> 9585@1676093788.635655:vhost_vdpa_set_vring_addr dev: 0x55a5a7580ec0
> index: 8 flags: 0x0 desc_user_addr: 0x1b000 used_user_addr: 0x1c000
> avail_user_addr: 0x1b400 log_guest_addr: 0x0
> 9585@1676093788.635667:vhost_vdpa_listener_region_add vdpa:
> 0x7ff13026d190 iova 0x0 llend 0xa0000 vaddr: 0x7fef1fe00000 read-only: 0
> 9585@1676093788.635670:vhost_vdpa_listener_begin_batch
> vdpa:0x7ff13026d190 fd: 16 msg_type: 2 type: 5
> 9585@1676093788.635677:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> msg_type: 2 asid: 0 iova: 0x0 size: 0xa0000 uaddr: 0x7fef1fe00000 perm:
> 0x3 type: 2
> 2023-02-11T05:36:28.635686Z qemu-system-x86_64: failed to write, fd=16,
> errno=14 (Bad address)
> 2023-02-11T05:36:28.635721Z qemu-system-x86_64: vhost vdpa map fail!
> 2023-02-11T05:36:28.635744Z qemu-system-x86_64: vhost-vdpa: DMA mapping
> failed, unable to continue
>

I'm not sure how you get to this. Maybe you were able to start the
migration because the CVQ migration blocker was not effectively added?

Thanks!


>
> Regards,
> -Siwei
> >
> > Thanks!
> >
> >> Thanks,
> >> -Siwei
> >>> +    }
> >>> +
> >>>        r = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer,
> >>>                                   vhost_vdpa_net_cvq_cmd_page_len(), false);
> >>>        if (unlikely(r < 0)) {
> >>> @@ -449,15 +508,9 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
> >>>        if (s->vhost_vdpa.shadow_vqs_enabled) {
> >>>            vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
> >>>            vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->status);
> >>> -        if (!s->always_svq) {
> >>> -            /*
> >>> -             * If only the CVQ is shadowed we can delete this safely.
> >>> -             * If all the VQs are shadows this will be needed by the time the
> >>> -             * device is started again to register SVQ vrings and similar.
> >>> -             */
> >>> -            g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> >>> -        }
> >>>        }
> >>> +
> >>> +    vhost_vdpa_net_client_stop(nc);
> >>>    }
> >>>
> >>>    static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
> >>> @@ -667,8 +720,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >>>                                           int nvqs,
> >>>                                           bool is_datapath,
> >>>                                           bool svq,
> >>> -                                       struct vhost_vdpa_iova_range iova_range,
> >>> -                                       VhostIOVATree *iova_tree)
> >>> +                                       struct vhost_vdpa_iova_range iova_range)
> >>>    {
> >>>        NetClientState *nc = NULL;
> >>>        VhostVDPAState *s;
> >>> @@ -690,7 +742,6 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >>>        s->vhost_vdpa.shadow_vqs_enabled = svq;
> >>>        s->vhost_vdpa.iova_range = iova_range;
> >>>        s->vhost_vdpa.shadow_data = svq;
> >>> -    s->vhost_vdpa.iova_tree = iova_tree;
> >>>        if (!is_datapath) {
> >>>            s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
> >>>                                                vhost_vdpa_net_cvq_cmd_page_len());
> >>> @@ -760,7 +811,6 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>        uint64_t features;
> >>>        int vdpa_device_fd;
> >>>        g_autofree NetClientState **ncs = NULL;
> >>> -    g_autoptr(VhostIOVATree) iova_tree = NULL;
> >>>        struct vhost_vdpa_iova_range iova_range;
> >>>        NetClientState *nc;
> >>>        int queue_pairs, r, i = 0, has_cvq = 0;
> >>> @@ -812,12 +862,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>            goto err;
> >>>        }
> >>>
> >>> -    if (opts->x_svq) {
> >>> -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
> >>> -            goto err_svq;
> >>> -        }
> >>> -
> >>> -        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
> >>> +    if (opts->x_svq && !vhost_vdpa_net_valid_svq_features(features, errp)) {
> >>> +        goto err;
> >>>        }
> >>>
> >>>        ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
> >>> @@ -825,7 +871,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>        for (i = 0; i < queue_pairs; i++) {
> >>>            ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >>>                                         vdpa_device_fd, i, 2, true, opts->x_svq,
> >>> -                                     iova_range, iova_tree);
> >>> +                                     iova_range);
> >>>            if (!ncs[i])
> >>>                goto err;
> >>>        }
> >>> @@ -833,13 +879,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>        if (has_cvq) {
> >>>            nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >>>                                     vdpa_device_fd, i, 1, false,
> >>> -                                 opts->x_svq, iova_range, iova_tree);
> >>> +                                 opts->x_svq, iova_range);
> >>>            if (!nc)
> >>>                goto err;
> >>>        }
> >>>
> >>> -    /* iova_tree ownership belongs to last NetClientState */
> >>> -    g_steal_pointer(&iova_tree);
> >>>        return 0;
> >>>
> >>>    err:
> >>> @@ -849,7 +893,6 @@ err:
> >>>            }
> >>>        }
> >>>
> >>> -err_svq:
> >>>        qemu_close(vdpa_device_fd);
> >>>
> >>>        return -1;
>



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration
  2023-02-10 12:57 ` Gautam Dawar
@ 2023-02-15 18:40   ` Eugenio Perez Martin
  2023-02-16 13:50     ` Lei Yang
  0 siblings, 1 reply; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-15 18:40 UTC (permalink / raw)
  To: Gautam Dawar
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella,
	si-wei.liu

On Fri, Feb 10, 2023 at 1:58 PM Gautam Dawar <gdawar@amd.com> wrote:
>
> Hi Eugenio,
>
> I've tested this patch series on Xilinx/AMD SN1022 device without
> control vq and VM Live Migration between two hosts worked fine.
>
> Tested-by: Gautam Dawar <gautam.dawar@amd.com>
>

Thanks for the testing!

>
> Here is some minor feedback:
>
> Pls fix the typo (Dynamycally -> Dynamically) in the Subject.
>
> On 2/8/23 15:12, Eugenio Pérez wrote:
> >
> >
> > It's possible to migrate vdpa net devices if they are shadowed from the
> >
> > start.  But to always shadow the dataplane is to effectively break its host
> >
> > passthrough, so its not convenient in vDPA scenarios.
> I believe you meant efficient instead of convenient.
> >
> >
> >
> > This series enables dynamically switching to shadow mode only at
> >
> > migration time.  This allows full data virtqueues passthrough all the
> >
> > time qemu is not migrating.
> >
> >
> >
> > In this series only net devices with no CVQ are migratable.  CVQ adds
> >
> > additional state that would make the series bigger and still had some
> >
> > controversy on previous RFC, so let's split it.
> >
> >
> >
> > The first patch delays the creation of the iova tree until it is really needed,
> >
> > and makes it easier to dynamically move from and to SVQ mode.
It would help to add some detail on the iova tree being referred to here.
> >
> >
> >
> > Next patches from 02 to 05 handle the suspending and getting of vq state (base)
> >
> > of the device at the switch to SVQ mode.  The new _F_SUSPEND feature is
> >
> > negotiated and stop device flow is changed so the state can be fetched trusting
> >
> > the device will not modify it.
> >
> >
> >
> > Since vhost backend must offer VHOST_F_LOG_ALL to be migratable, last patches
> >
> > but the last one add the needed migration blockers so vhost-vdpa can offer it
>
> "last patches but the last one"?
>

I think I solved all of the above in v3, thanks for pointing them out!

Would it be possible to test with v3 too?

> Thanks.
>
> >
> > safely.  They also add the handling of this feature.
> >
> >
> >
> > Finally, the last patch makes virtio vhost-vdpa backend to offer
> >
> > VHOST_F_LOG_ALL so qemu migrate the device as long as no other blocker has been
> >
> > added.
> >
> >
> >
> > Successfully tested with vdpa_sim_net with patch [1] applied and with the qemu
> >
> > emulated device with vp_vdpa with some restrictions:
> >
> > * No CVQ. No feature that didn't work with SVQ previously (packed, ...)
> >
> > * VIRTIO_RING_F_STATE patches implementing [2].
> >
> > * Expose _F_SUSPEND, but ignore it and suspend on ring state fetch like
> >
> >    DPDK.
> >
> >
> >
> > Comments are welcome.
> >
> >
> >
> > v2:
> >
> > - Check for SUSPEND in vhost_dev.backend_cap, as .backend_features is empty at
> >
> >    the check moment.
> >
> >
> >
> > v1:
> >
> > - Omit all code working with CVQ and block migration if the device supports
> >
> >    CVQ.
> >
> > - Remove spurious kick.
> Even with the spurious kick, the datapath didn't resume at the
> destination VM after LM as the kick happened before DRIVER_OK. So IMO,
> it will be required that the vdpa parent driver simulates a kick after
> creating/starting the HW rings.

Right, it did not solve the issue.

If I'm not wrong, all vdpa drivers are moving to that model, checking
for new avail descriptors right after DRIVER_OK. Maybe it is better to
keep this discussion at patch 12/13 of RFC v2?
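
To make it concrete, the model I mean is something like this on the
parent driver's side (a completely hypothetical sketch, all the
example_* names are made up):

    static void example_vdpa_set_status(struct vdpa_device *vdpa, u8 status)
    {
        struct example_vdpa *p = to_example_vdpa(vdpa);
        int i;

        if ((status & VIRTIO_CONFIG_S_DRIVER_OK) &&
            !(p->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
            /* Rings were just started: scan avail as if a kick arrived */
            for (i = 0; i < p->nvqs; i++) {
                example_vdpa_process_vq(p, i);
            }
        }
        p->status = status;
    }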

Thanks!

> >
> > - Move all possible checks for migration to vhost-vdpa instead of the net
> >
> >    backend. Move them to init code from start code.
> >
> > - Suspend on vhost_vdpa_dev_start(false) instead of in vhost-vdpa net backend.
> >
> > - Properly split suspend after geting base and adding of status_reset patches.
> >
> > - Add possible TODOs to points where this series can improve in the future.
> >
> > - Check the state of migration using migration_in_setup and
> >
> >    migration_has_failed instead of checking all the possible migration status in
> >
> >    a switch.
> >
> > - Add TODO with possible low hand fruit using RESUME ops.
> >
> > - Always offer _F_LOG from virtio/vhost-vdpa and let migration blockers do
> >
> >    their thing instead of adding a variable.
> >
> > - RFC v2 at https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02574.html
> >
> >
> >
> > RFC v2:
> >
> > - Use a migration listener instead of a memory listener to know when
> >
> >    the migration starts.
> >
> > - Add stuff not picked with ASID patches, like enable rings after
> >
> >    driver_ok
> >
> > - Add rewinding on the migration src, not in dst
> >
> > - RFC v1 at https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01664.html
> >
> >
> >
> > [1] https://lore.kernel.org/lkml/20230203142501.300125-1-eperezma@redhat.com/T/
> >
> > [2] https://lists.oasis-open.org/archives/virtio-comment/202103/msg00036.html
> >
> >
> >
> > Eugenio Pérez (13):
> >
> >    vdpa net: move iova tree creation from init to start
> >
> >    vdpa: Negotiate _F_SUSPEND feature
> >
> >    vdpa: add vhost_vdpa_suspend
> >
> >    vdpa: move vhost reset after get vring base
> >
> >    vdpa: rewind at get_base, not set_base
> >
> >    vdpa net: allow VHOST_F_LOG_ALL
> >
> >    vdpa: add vdpa net migration state notifier
> >
> >    vdpa: disable RAM block discard only for the first device
> >
> >    vdpa net: block migration if the device has CVQ
> >
> >    vdpa: block migration if device has unsupported features
> >
> >    vdpa: block migration if dev does not have _F_SUSPEND
> >
> >    vdpa: block migration if SVQ does not admit a feature
> >
> >    vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices
> >
> >
> >
> >   include/hw/virtio/vhost-backend.h |   4 +
> >
> >   hw/virtio/vhost-vdpa.c            | 126 +++++++++++++++-----
> >
> >   hw/virtio/vhost.c                 |   3 +
> >
> >   net/vhost-vdpa.c                  | 192 +++++++++++++++++++++++++-----
> >
> >   hw/virtio/trace-events            |   1 +
> >
> >   5 files changed, 267 insertions(+), 59 deletions(-)
> >
> >
> >
> > --
> >
> > 2.31.1
> >
> >
> >
> >
>



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 01/13] vdpa net: move iova tree creation from init to start
  2023-02-14 19:07         ` Eugenio Perez Martin
@ 2023-02-16  2:14             ` Si-Wei Liu
  0 siblings, 0 replies; 68+ messages in thread
From: Si-Wei Liu @ 2023-02-16  2:14 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Laurent Vivier, Parav Pandit, Cindy Lu, Michael S. Tsirkin,
	qemu-devel, Gautam Dawar, virtualization, Harpreet Singh Anand,
	Lei Yang, Stefan Hajnoczi, Eli Cohen, longpeng2, Shannon Nelson,
	Liuxiangdong



On 2/14/2023 11:07 AM, Eugenio Perez Martin wrote:
> On Tue, Feb 14, 2023 at 2:45 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 2/13/2023 3:14 AM, Eugenio Perez Martin wrote:
>>> On Mon, Feb 13, 2023 at 7:51 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>>>
>>>> On 2/8/2023 1:42 AM, Eugenio Pérez wrote:
>>>>> Only create iova_tree if and when it is needed.
>>>>>
>>>>> The cleanup keeps being responsible of last VQ but this change allows it
>>>>> to merge both cleanup functions.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> Acked-by: Jason Wang <jasowang@redhat.com>
>>>>> ---
>>>>>     net/vhost-vdpa.c | 99 ++++++++++++++++++++++++++++++++++--------------
>>>>>     1 file changed, 71 insertions(+), 28 deletions(-)
>>>>>
>>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>>>> index de5ed8ff22..a9e6c8f28e 100644
>>>>> --- a/net/vhost-vdpa.c
>>>>> +++ b/net/vhost-vdpa.c
>>>>> @@ -178,13 +178,9 @@ err_init:
>>>>>     static void vhost_vdpa_cleanup(NetClientState *nc)
>>>>>     {
>>>>>         VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>>>>> -    struct vhost_dev *dev = &s->vhost_net->dev;
>>>>>
>>>>>         qemu_vfree(s->cvq_cmd_out_buffer);
>>>>>         qemu_vfree(s->status);
>>>>> -    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
>>>>> -        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
>>>>> -    }
>>>>>         if (s->vhost_net) {
>>>>>             vhost_net_cleanup(s->vhost_net);
>>>>>             g_free(s->vhost_net);
>>>>> @@ -234,10 +230,64 @@ static ssize_t vhost_vdpa_receive(NetClientState *nc, const uint8_t *buf,
>>>>>         return size;
>>>>>     }
>>>>>
>>>>> +/** From any vdpa net client, get the netclient of first queue pair */
>>>>> +static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
>>>>> +{
>>>>> +    NICState *nic = qemu_get_nic(s->nc.peer);
>>>>> +    NetClientState *nc0 = qemu_get_peer(nic->ncs, 0);
>>>>> +
>>>>> +    return DO_UPCAST(VhostVDPAState, nc, nc0);
>>>>> +}
>>>>> +
>>>>> +static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
>>>>> +{
>>>>> +    struct vhost_vdpa *v = &s->vhost_vdpa;
>>>>> +
>>>>> +    if (v->shadow_vqs_enabled) {
>>>>> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
>>>>> +                                           v->iova_range.last);
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static int vhost_vdpa_net_data_start(NetClientState *nc)
>>>>> +{
>>>>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>>>>> +    struct vhost_vdpa *v = &s->vhost_vdpa;
>>>>> +
>>>>> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>>>>> +
>>>>> +    if (v->index == 0) {
>>>>> +        vhost_vdpa_net_data_start_first(s);
>>>>> +        return 0;
>>>>> +    }
>>>>> +
>>>>> +    if (v->shadow_vqs_enabled) {
>>>>> +        VhostVDPAState *s0 = vhost_vdpa_net_first_nc_vdpa(s);
>>>>> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
>>>>> +    }
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static void vhost_vdpa_net_client_stop(NetClientState *nc)
>>>>> +{
>>>>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>>>>> +    struct vhost_dev *dev;
>>>>> +
>>>>> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>>>>> +
>>>>> +    dev = s->vhost_vdpa.dev;
>>>>> +    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
>>>>> +        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
>>>>> +    }
>>>>> +}
>>>>> +
>>>>>     static NetClientInfo net_vhost_vdpa_info = {
>>>>>             .type = NET_CLIENT_DRIVER_VHOST_VDPA,
>>>>>             .size = sizeof(VhostVDPAState),
>>>>>             .receive = vhost_vdpa_receive,
>>>>> +        .start = vhost_vdpa_net_data_start,
>>>>> +        .stop = vhost_vdpa_net_client_stop,
>>>>>             .cleanup = vhost_vdpa_cleanup,
>>>>>             .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
>>>>>             .has_ufo = vhost_vdpa_has_ufo,
>>>>> @@ -351,7 +401,7 @@ dma_map_err:
>>>>>
>>>>>     static int vhost_vdpa_net_cvq_start(NetClientState *nc)
>>>>>     {
>>>>> -    VhostVDPAState *s;
>>>>> +    VhostVDPAState *s, *s0;
>>>>>         struct vhost_vdpa *v;
>>>>>         uint64_t backend_features;
>>>>>         int64_t cvq_group;
>>>>> @@ -425,6 +475,15 @@ out:
>>>>>             return 0;
>>>>>         }
>>>>>
>>>>> +    s0 = vhost_vdpa_net_first_nc_vdpa(s);
>>>>> +    if (s0->vhost_vdpa.iova_tree) {
>>>>> +        /* SVQ is already configured for all virtqueues */
>>>>> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
>>>>> +    } else {
>>>>> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
>>>>> +                                           v->iova_range.last);
>>>> I wonder how this case could happen, vhost_vdpa_net_data_start_first()
>>>> should've allocated an iova tree on the first data vq. Is zero data vq
>>>> ever possible on net vhost-vdpa?
>>>>
>>> It's the case of the current qemu master when only CVQ is being
>>> shadowed. It's not that "there are no data vq": If that case were
>>> possible, CVQ vhost-vdpa state would be s0.
>>>
>>> The case is that since only CVQ vhost-vdpa is the one being migrated,
>>> only CVQ has an iova tree.
>> OK, so this corresponds to the case where live migration is not started
>> and CVQ starts in its own address space of VHOST_VDPA_NET_CVQ_ASID.
>> Thanks for explaining it!
>>
>>> With this series applied and with no migration running, the case is
>>> the same as before: only SVQ gets shadowed. When migration starts, all
>>> vqs are migrated, and share iova tree.
>> I wonder what the reason is to share the iova tree when migration
>> starts; I think CVQ may stay in its own VHOST_VDPA_NET_CVQ_ASID still?
>>
>> Actually there's a discrepancy in vhost_vdpa_net_log_global_enable(); I
>> don't see explicit code to switch from VHOST_VDPA_NET_CVQ_ASID to
>> VHOST_VDPA_GUEST_PA_ASID for the CVQ. This is the address space
>> collision I mentioned earlier:
>>
> There is no such change. This code only migrates devices with no CVQ,
> as they have their own difficulties.
>
> In the previous RFC there was no such change either. Since it's hard
> to modify the passthrough devices' IOVA tree, CVQ AS updates keep using
> VHOST_VDPA_NET_CVQ_ASID.
That's my understanding too; the current code doesn't support changing
the AS once it is set, although the uAPI doesn't prohibit it.
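
For reference, the uAPI I mean is the group->ASID assignment, driven
from userspace roughly like this (kernel-side checks aside, nothing at
this level forbids issuing it again with a different asid):

    struct vhost_vring_state state = {
        .index = group,  /* virtqueue group */
        .num = asid,     /* target address space id */
    };
    r = ioctl(device_fd, VHOST_VDPA_SET_GROUP_ASID, &state);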

> They both share the same IOVA tree though, just for simplicity.
It would be good to document this assumption somewhere in the code; it's
not easy to infer that userspace doesn't have the same view as the kernel
in terms of which iova tree is being used.
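
Even something as simple as a comment where the tree gets reused would
do, e.g. (sketch):

    /*
     * CVQ keeps its own VHOST_VDPA_NET_CVQ_ASID, but it shares the
     * userspace VhostIOVATree with the data vqs so that iova
     * allocations never overlap across address spaces.
     */
    v->iova_tree = s0->vhost_vdpa.iova_tree;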

>   If
> address space exhaustion is a problem we can make them independent,
> but this complicates the code a little bit.
>
>> 9585@1676093788.259201:vhost_vdpa_dma_map vdpa:0x7ff13088a190 fd: 16
>> msg_type: 2 asid: 0 iova: 0x1000 size: 0x2000 uaddr: 0x55a5a7ff3000
>> perm: 0x1 type: 2
>> 9585@1676093788.279923:vhost_vdpa_dma_map vdpa:0x7ff13088a190 fd: 16
>> msg_type: 2 asid: 0 iova: 0x3000 size: 0x1000 uaddr: 0x55a5a7ff6000
>> perm: 0x3 type: 2
>> 9585@1676093788.290529:vhost_vdpa_set_vring_addr dev: 0x55a5a77cec20
>> index: 0 flags: 0x0 desc_user_addr: 0x1000 used_user_addr: 0x3000
>> avail_user_addr: 0x2000 log_guest_addr: 0x0
>> :
>> :
>> 9585@1676093788.543567:vhost_vdpa_dma_map vdpa:0x7ff1302b6190 fd: 16
>> msg_type: 2 asid: 0 iova: 0x16000 size: 0x2000 uaddr: 0x55a5a7959000
>> perm: 0x1 type: 2
>> 9585@1676093788.576923:vhost_vdpa_dma_map vdpa:0x7ff1302b6190 fd: 16
>> msg_type: 2 asid: 0 iova: 0x18000 size: 0x1000 uaddr: 0x55a5a795c000
>> perm: 0x3 type: 2
>> 9585@1676093788.593881:vhost_vdpa_set_vring_addr dev: 0x55a5a7580930
>> index: 7 flags: 0x0 desc_user_addr: 0x16000 used_user_addr: 0x18000
>> avail_user_addr: 0x17000 log_guest_addr: 0x0
>> 9585@1676093788.593904:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
>> msg_type: 2 asid: 1 iova: 0x19000 size: 0x1000 uaddr: 0x55a5a77f8000
>> perm: 0x1 type: 2
>> 9585@1676093788.606448:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
>> msg_type: 2 asid: 1 iova: 0x1a000 size: 0x1000 uaddr: 0x55a5a77fa000
>> perm: 0x3 type: 2
>> 9585@1676093788.616253:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
>> msg_type: 2 asid: 1 iova: 0x1b000 size: 0x1000 uaddr: 0x55a5a795f000
>> perm: 0x1 type: 2
>> 9585@1676093788.625956:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
>> msg_type: 2 asid: 1 iova: 0x1c000 size: 0x1000 uaddr: 0x55a5a7f4e000
>> perm: 0x3 type: 2
>> 9585@1676093788.635655:vhost_vdpa_set_vring_addr dev: 0x55a5a7580ec0
>> index: 8 flags: 0x0 desc_user_addr: 0x1b000 used_user_addr: 0x1c000
>> avail_user_addr: 0x1b400 log_guest_addr: 0x0
>> 9585@1676093788.635667:vhost_vdpa_listener_region_add vdpa:
>> 0x7ff13026d190 iova 0x0 llend 0xa0000 vaddr: 0x7fef1fe00000 read-only: 0
>> 9585@1676093788.635670:vhost_vdpa_listener_begin_batch
>> vdpa:0x7ff13026d190 fd: 16 msg_type: 2 type: 5
>> 9585@1676093788.635677:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
>> msg_type: 2 asid: 0 iova: 0x0 size: 0xa0000 uaddr: 0x7fef1fe00000 perm:
>> 0x3 type: 2
>> 2023-02-11T05:36:28.635686Z qemu-system-x86_64: failed to write, fd=16,
>> errno=14 (Bad address)
>> 2023-02-11T05:36:28.635721Z qemu-system-x86_64: vhost vdpa map fail!
>> 2023-02-11T05:36:28.635744Z qemu-system-x86_64: vhost-vdpa: DMA mapping
>> failed, unable to continue
>>
> I'm not sure how you get to this. Maybe you were able to start the
> migration because the CVQ migration blocker was not effectively added?
It's something else: the line below, at the start of
vhost_vdpa_net_cvq_start(), would override shadow_data on the CVQ.

     v->shadow_data = s->always_svq;

Which leads to my previous question: why does shadow_data need to apply to
the CVQ, and why is the userspace iova shared between the data queues and
the CVQ?

-Siwei


>
> Thanks!
>
>
>> Regards,
>> -Siwei
>>> Thanks!
>>>
>>>> Thanks,
>>>> -Siwei
>>>>> +    }
>>>>> +
>>>>>         r = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer,
>>>>>                                    vhost_vdpa_net_cvq_cmd_page_len(), false);
>>>>>         if (unlikely(r < 0)) {
>>>>> @@ -449,15 +508,9 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
>>>>>         if (s->vhost_vdpa.shadow_vqs_enabled) {
>>>>>             vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
>>>>>             vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->status);
>>>>> -        if (!s->always_svq) {
>>>>> -            /*
>>>>> -             * If only the CVQ is shadowed we can delete this safely.
>>>>> -             * If all the VQs are shadows this will be needed by the time the
>>>>> -             * device is started again to register SVQ vrings and similar.
>>>>> -             */
>>>>> -            g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
>>>>> -        }
>>>>>         }
>>>>> +
>>>>> +    vhost_vdpa_net_client_stop(nc);
>>>>>     }
>>>>>
>>>>>     static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
>>>>> @@ -667,8 +720,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>>>>>                                            int nvqs,
>>>>>                                            bool is_datapath,
>>>>>                                            bool svq,
>>>>> -                                       struct vhost_vdpa_iova_range iova_range,
>>>>> -                                       VhostIOVATree *iova_tree)
>>>>> +                                       struct vhost_vdpa_iova_range iova_range)
>>>>>     {
>>>>>         NetClientState *nc = NULL;
>>>>>         VhostVDPAState *s;
>>>>> @@ -690,7 +742,6 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>>>>>         s->vhost_vdpa.shadow_vqs_enabled = svq;
>>>>>         s->vhost_vdpa.iova_range = iova_range;
>>>>>         s->vhost_vdpa.shadow_data = svq;
>>>>> -    s->vhost_vdpa.iova_tree = iova_tree;
>>>>>         if (!is_datapath) {
>>>>>             s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
>>>>>                                                 vhost_vdpa_net_cvq_cmd_page_len());
>>>>> @@ -760,7 +811,6 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>>>         uint64_t features;
>>>>>         int vdpa_device_fd;
>>>>>         g_autofree NetClientState **ncs = NULL;
>>>>> -    g_autoptr(VhostIOVATree) iova_tree = NULL;
>>>>>         struct vhost_vdpa_iova_range iova_range;
>>>>>         NetClientState *nc;
>>>>>         int queue_pairs, r, i = 0, has_cvq = 0;
>>>>> @@ -812,12 +862,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>>>             goto err;
>>>>>         }
>>>>>
>>>>> -    if (opts->x_svq) {
>>>>> -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
>>>>> -            goto err_svq;
>>>>> -        }
>>>>> -
>>>>> -        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
>>>>> +    if (opts->x_svq && !vhost_vdpa_net_valid_svq_features(features, errp)) {
>>>>> +        goto err;
>>>>>         }
>>>>>
>>>>>         ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
>>>>> @@ -825,7 +871,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>>>         for (i = 0; i < queue_pairs; i++) {
>>>>>             ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>>>>>                                          vdpa_device_fd, i, 2, true, opts->x_svq,
>>>>> -                                     iova_range, iova_tree);
>>>>> +                                     iova_range);
>>>>>             if (!ncs[i])
>>>>>                 goto err;
>>>>>         }
>>>>> @@ -833,13 +879,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>>>         if (has_cvq) {
>>>>>             nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>>>>>                                      vdpa_device_fd, i, 1, false,
>>>>> -                                 opts->x_svq, iova_range, iova_tree);
>>>>> +                                 opts->x_svq, iova_range);
>>>>>             if (!nc)
>>>>>                 goto err;
>>>>>         }
>>>>>
>>>>> -    /* iova_tree ownership belongs to last NetClientState */
>>>>> -    g_steal_pointer(&iova_tree);
>>>>>         return 0;
>>>>>
>>>>>     err:
>>>>> @@ -849,7 +893,6 @@ err:
>>>>>             }
>>>>>         }
>>>>>
>>>>> -err_svq:
>>>>>         qemu_close(vdpa_device_fd);
>>>>>
>>>>>         return -1;

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 01/13] vdpa net: move iova tree creation from init to start
  2023-02-16  2:14             ` Si-Wei Liu
  (?)
@ 2023-02-16  7:35             ` Eugenio Perez Martin
  2023-02-17  7:38                 ` Si-Wei Liu
  -1 siblings, 1 reply; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-16  7:35 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella

On Thu, Feb 16, 2023 at 3:15 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 2/14/2023 11:07 AM, Eugenio Perez Martin wrote:
> > On Tue, Feb 14, 2023 at 2:45 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >>
> >>
> >> On 2/13/2023 3:14 AM, Eugenio Perez Martin wrote:
> >>> On Mon, Feb 13, 2023 at 7:51 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >>>>
> >>>> On 2/8/2023 1:42 AM, Eugenio Pérez wrote:
> >>>>> Only create iova_tree if and when it is needed.
> >>>>>
> >>>>> The cleanup keeps being responsible of last VQ but this change allows it
> >>>>> to merge both cleanup functions.
> >>>>>
> >>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>>>> Acked-by: Jason Wang <jasowang@redhat.com>
> >>>>> ---
> >>>>>     net/vhost-vdpa.c | 99 ++++++++++++++++++++++++++++++++++--------------
> >>>>>     1 file changed, 71 insertions(+), 28 deletions(-)
> >>>>>
> >>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> >>>>> index de5ed8ff22..a9e6c8f28e 100644
> >>>>> --- a/net/vhost-vdpa.c
> >>>>> +++ b/net/vhost-vdpa.c
> >>>>> @@ -178,13 +178,9 @@ err_init:
> >>>>>     static void vhost_vdpa_cleanup(NetClientState *nc)
> >>>>>     {
> >>>>>         VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>>>> -    struct vhost_dev *dev = &s->vhost_net->dev;
> >>>>>
> >>>>>         qemu_vfree(s->cvq_cmd_out_buffer);
> >>>>>         qemu_vfree(s->status);
> >>>>> -    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> >>>>> -        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> >>>>> -    }
> >>>>>         if (s->vhost_net) {
> >>>>>             vhost_net_cleanup(s->vhost_net);
> >>>>>             g_free(s->vhost_net);
> >>>>> @@ -234,10 +230,64 @@ static ssize_t vhost_vdpa_receive(NetClientState *nc, const uint8_t *buf,
> >>>>>         return size;
> >>>>>     }
> >>>>>
> >>>>> +/** From any vdpa net client, get the netclient of first queue pair */
> >>>>> +static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
> >>>>> +{
> >>>>> +    NICState *nic = qemu_get_nic(s->nc.peer);
> >>>>> +    NetClientState *nc0 = qemu_get_peer(nic->ncs, 0);
> >>>>> +
> >>>>> +    return DO_UPCAST(VhostVDPAState, nc, nc0);
> >>>>> +}
> >>>>> +
> >>>>> +static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
> >>>>> +{
> >>>>> +    struct vhost_vdpa *v = &s->vhost_vdpa;
> >>>>> +
> >>>>> +    if (v->shadow_vqs_enabled) {
> >>>>> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> >>>>> +                                           v->iova_range.last);
> >>>>> +    }
> >>>>> +}
> >>>>> +
> >>>>> +static int vhost_vdpa_net_data_start(NetClientState *nc)
> >>>>> +{
> >>>>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>>>> +    struct vhost_vdpa *v = &s->vhost_vdpa;
> >>>>> +
> >>>>> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >>>>> +
> >>>>> +    if (v->index == 0) {
> >>>>> +        vhost_vdpa_net_data_start_first(s);
> >>>>> +        return 0;
> >>>>> +    }
> >>>>> +
> >>>>> +    if (v->shadow_vqs_enabled) {
> >>>>> +        VhostVDPAState *s0 = vhost_vdpa_net_first_nc_vdpa(s);
> >>>>> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
> >>>>> +    }
> >>>>> +
> >>>>> +    return 0;
> >>>>> +}
> >>>>> +
> >>>>> +static void vhost_vdpa_net_client_stop(NetClientState *nc)
> >>>>> +{
> >>>>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>>>> +    struct vhost_dev *dev;
> >>>>> +
> >>>>> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >>>>> +
> >>>>> +    dev = s->vhost_vdpa.dev;
> >>>>> +    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> >>>>> +        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> >>>>> +    }
> >>>>> +}
> >>>>> +
> >>>>>     static NetClientInfo net_vhost_vdpa_info = {
> >>>>>             .type = NET_CLIENT_DRIVER_VHOST_VDPA,
> >>>>>             .size = sizeof(VhostVDPAState),
> >>>>>             .receive = vhost_vdpa_receive,
> >>>>> +        .start = vhost_vdpa_net_data_start,
> >>>>> +        .stop = vhost_vdpa_net_client_stop,
> >>>>>             .cleanup = vhost_vdpa_cleanup,
> >>>>>             .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
> >>>>>             .has_ufo = vhost_vdpa_has_ufo,
> >>>>> @@ -351,7 +401,7 @@ dma_map_err:
> >>>>>
> >>>>>     static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> >>>>>     {
> >>>>> -    VhostVDPAState *s;
> >>>>> +    VhostVDPAState *s, *s0;
> >>>>>         struct vhost_vdpa *v;
> >>>>>         uint64_t backend_features;
> >>>>>         int64_t cvq_group;
> >>>>> @@ -425,6 +475,15 @@ out:
> >>>>>             return 0;
> >>>>>         }
> >>>>>
> >>>>> +    s0 = vhost_vdpa_net_first_nc_vdpa(s);
> >>>>> +    if (s0->vhost_vdpa.iova_tree) {
> >>>>> +        /* SVQ is already configured for all virtqueues */
> >>>>> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
> >>>>> +    } else {
> >>>>> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> >>>>> +                                           v->iova_range.last);
> >>>> I wonder how this case could happen, vhost_vdpa_net_data_start_first()
> >>>> should've allocated an iova tree on the first data vq. Is zero data vq
> >>>> ever possible on net vhost-vdpa?
> >>>>
> >>> It's the case of the current qemu master when only CVQ is being
> >>> shadowed. It's not that "there are no data vq": If that case were
> >>> possible, CVQ vhost-vdpa state would be s0.
> >>>
> >>> The case is that since only CVQ vhost-vdpa is the one being migrated,
> >>> only CVQ has an iova tree.
> >> OK, so this corresponds to the case where live migration is not started
> >> and CVQ starts in its own address space of VHOST_VDPA_NET_CVQ_ASID.
> >> Thanks for explaining it!
> >>
> >>> With this series applied and with no migration running, the case is
> >>> the same as before: only SVQ gets shadowed. When migration starts, all
> >>> vqs are migrated, and share iova tree.
> >> I wonder what is the reason to share the iova tree when migration
> >> starts, I think CVQ may stay on its own VHOST_VDPA_NET_CVQ_ASID still?
> >>
> >> Actually there's a discrepancy in vhost_vdpa_net_log_global_enable(): I
> >> don't see explicit code to switch from VHOST_VDPA_NET_CVQ_ASID to
> >> VHOST_VDPA_GUEST_PA_ASID for the CVQ. This is the address space
> >> collision I mentioned earlier:
> >>
> > There is no such change. This code only migrates devices with no CVQ,
> > as they have their own difficulties.
> >
> > In the previous RFC there was no such change either. Since it's hard
> > to modify passthrough devices IOVA tree, CVQ AS updates keep being
> > VHOST_VDPA_NET_CVQ_ASID.
> That's my understanding too: the current code doesn't support changing
> the AS once it is set, although the uAPI doesn't prohibit it.
>
> > They both share the same IOVA tree though, just for simplicity.
> It would be good to document this assumption somewhere in the code; it's
> not easy to infer that userspace doesn't share the kernel's view of the
> iova tree being used.
>
> >   If
> > address space exhaustion is a problem we can make them independent,
> > but this complicates the code a little bit.
> >
> >> 9585@1676093788.259201:vhost_vdpa_dma_map vdpa:0x7ff13088a190 fd: 16
> >> msg_type: 2 asid: 0 iova: 0x1000 size: 0x2000 uaddr: 0x55a5a7ff3000
> >> perm: 0x1 type: 2
> >> 9585@1676093788.279923:vhost_vdpa_dma_map vdpa:0x7ff13088a190 fd: 16
> >> msg_type: 2 asid: 0 iova: 0x3000 size: 0x1000 uaddr: 0x55a5a7ff6000
> >> perm: 0x3 type: 2
> >> 9585@1676093788.290529:vhost_vdpa_set_vring_addr dev: 0x55a5a77cec20
> >> index: 0 flags: 0x0 desc_user_addr: 0x1000 used_user_addr: 0x3000
> >> avail_user_addr: 0x2000 log_guest_addr: 0x0
> >> :
> >> :
> >> 9585@1676093788.543567:vhost_vdpa_dma_map vdpa:0x7ff1302b6190 fd: 16
> >> msg_type: 2 asid: 0 iova: 0x16000 size: 0x2000 uaddr: 0x55a5a7959000
> >> perm: 0x1 type: 2
> >> 9585@1676093788.576923:vhost_vdpa_dma_map vdpa:0x7ff1302b6190 fd: 16
> >> msg_type: 2 asid: 0 iova: 0x18000 size: 0x1000 uaddr: 0x55a5a795c000
> >> perm: 0x3 type: 2
> >> 9585@1676093788.593881:vhost_vdpa_set_vring_addr dev: 0x55a5a7580930
> >> index: 7 flags: 0x0 desc_user_addr: 0x16000 used_user_addr: 0x18000
> >> avail_user_addr: 0x17000 log_guest_addr: 0x0
> >> 9585@1676093788.593904:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> >> msg_type: 2 asid: 1 iova: 0x19000 size: 0x1000 uaddr: 0x55a5a77f8000
> >> perm: 0x1 type: 2
> >> 9585@1676093788.606448:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> >> msg_type: 2 asid: 1 iova: 0x1a000 size: 0x1000 uaddr: 0x55a5a77fa000
> >> perm: 0x3 type: 2
> >> 9585@1676093788.616253:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> >> msg_type: 2 asid: 1 iova: 0x1b000 size: 0x1000 uaddr: 0x55a5a795f000
> >> perm: 0x1 type: 2
> >> 9585@1676093788.625956:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> >> msg_type: 2 asid: 1 iova: 0x1c000 size: 0x1000 uaddr: 0x55a5a7f4e000
> >> perm: 0x3 type: 2
> >> 9585@1676093788.635655:vhost_vdpa_set_vring_addr dev: 0x55a5a7580ec0
> >> index: 8 flags: 0x0 desc_user_addr: 0x1b000 used_user_addr: 0x1c000
> >> avail_user_addr: 0x1b400 log_guest_addr: 0x0
> >> 9585@1676093788.635667:vhost_vdpa_listener_region_add vdpa:
> >> 0x7ff13026d190 iova 0x0 llend 0xa0000 vaddr: 0x7fef1fe00000 read-only: 0
> >> 9585@1676093788.635670:vhost_vdpa_listener_begin_batch
> >> vdpa:0x7ff13026d190 fd: 16 msg_type: 2 type: 5
> >> 9585@1676093788.635677:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> >> msg_type: 2 asid: 0 iova: 0x0 size: 0xa0000 uaddr: 0x7fef1fe00000 perm:
> >> 0x3 type: 2
> >> 2023-02-11T05:36:28.635686Z qemu-system-x86_64: failed to write, fd=16,
> >> errno=14 (Bad address)
> >> 2023-02-11T05:36:28.635721Z qemu-system-x86_64: vhost vdpa map fail!
> >> 2023-02-11T05:36:28.635744Z qemu-system-x86_64: vhost-vdpa: DMA mapping
> >> failed, unable to continue
> >>
> > I'm not sure how you get to this. Maybe you were able to start the
> > migration because the CVQ migration blocker was not effectively added?
> It's something else: the line below, at the start of
> vhost_vdpa_net_cvq_start(), would override shadow_data on the CVQ.
>
>      v->shadow_data = s->always_svq;
>
> Which leads to my previous question: why does shadow_data need to apply to
> the CVQ
>

Ok, I'm proposing some documentation here. I'll send a new patch
adding it to the sources if you think it is complete.

shadow_data needs to apply to the CVQ because the memory listener is
registered against the CVQ's vhost_vdpa, and the listener needs to know
whether the data vqs are passthrough or shadowed. We could register the
listener against a different vhost_vdpa, but then its lifecycle gets
complicated.
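
As a code comment in net/vhost-vdpa.c it could read something like this
(only a sketch; the exact wording and placement are up for discussion):

    /*
     * shadow_data also applies to the CVQ's vhost_vdpa: the memory
     * listener is registered against it, and the listener must know
     * whether the data vqs are passthrough or shadowed.  Registering
     * the listener against a different vhost_vdpa would complicate
     * its lifecycle.
     */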
---

For completeness, the original discussion was [1].

> and why the userspace iova is shared between data queues and CVQ.

It's not shared unless the device does not support ASID. They only
share the iova tree because the tree tracks translations, not the
memory itself, so its lifecycle is easier. Each piece of memory's
lifecycle is tracked differently:
* Guest memory is tracked by the memory listener itself, so we get
all the regions at register / unregister and in its own updates.
* SVQ vrings are tracked in vhost_vdpa->shadow_vqs[i].
* CVQ shadow buffers are tracked in the net VhostVDPAState.
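
As a comment block it might read (again only a sketch, wording mine):

    /*
     * The iova tree only stores IOVA <-> HVA translations; it does not
     * track the memory behind them.  Each kind of memory has its own
     * lifecycle:
     * - Guest memory: tracked by the memory listener (region add /
     *   delete plus its own updates).
     * - SVQ vrings: tracked in vhost_vdpa->shadow_vqs[i].
     * - CVQ shadow buffers: tracked in the net VhostVDPAState.
     * Data vqs and CVQ may live in different ASIDs and still share
     * this tree, since sharing only translations keeps the lifecycle
     * simple.
     */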
---

I'll send a new series adding these two pieces of documentation if you
think they are complete. Please let me know if you'd add or remove
something.

Note that this code is already in qemu master, so this doc should not
block this series, correct?

Thanks!

[1] https://mail.gnu.org/archive/html/qemu-devel/2022-11/msg02033.html

> -Siwei
>
>
> >
> > Thanks!
> >
> >
> >> Regards,
> >> -Siwei
> >>> Thanks!
> >>>
> >>>> Thanks,
> >>>> -Siwei
> >>>>> +    }
> >>>>> +
> >>>>>         r = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer,
> >>>>>                                    vhost_vdpa_net_cvq_cmd_page_len(), false);
> >>>>>         if (unlikely(r < 0)) {
> >>>>> @@ -449,15 +508,9 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
> >>>>>         if (s->vhost_vdpa.shadow_vqs_enabled) {
> >>>>>             vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
> >>>>>             vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->status);
> >>>>> -        if (!s->always_svq) {
> >>>>> -            /*
> >>>>> -             * If only the CVQ is shadowed we can delete this safely.
> >>>>> -             * If all the VQs are shadows this will be needed by the time the
> >>>>> -             * device is started again to register SVQ vrings and similar.
> >>>>> -             */
> >>>>> -            g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> >>>>> -        }
> >>>>>         }
> >>>>> +
> >>>>> +    vhost_vdpa_net_client_stop(nc);
> >>>>>     }
> >>>>>
> >>>>>     static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
> >>>>> @@ -667,8 +720,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >>>>>                                            int nvqs,
> >>>>>                                            bool is_datapath,
> >>>>>                                            bool svq,
> >>>>> -                                       struct vhost_vdpa_iova_range iova_range,
> >>>>> -                                       VhostIOVATree *iova_tree)
> >>>>> +                                       struct vhost_vdpa_iova_range iova_range)
> >>>>>     {
> >>>>>         NetClientState *nc = NULL;
> >>>>>         VhostVDPAState *s;
> >>>>> @@ -690,7 +742,6 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >>>>>         s->vhost_vdpa.shadow_vqs_enabled = svq;
> >>>>>         s->vhost_vdpa.iova_range = iova_range;
> >>>>>         s->vhost_vdpa.shadow_data = svq;
> >>>>> -    s->vhost_vdpa.iova_tree = iova_tree;
> >>>>>         if (!is_datapath) {
> >>>>>             s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
> >>>>>                                                 vhost_vdpa_net_cvq_cmd_page_len());
> >>>>> @@ -760,7 +811,6 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>>>         uint64_t features;
> >>>>>         int vdpa_device_fd;
> >>>>>         g_autofree NetClientState **ncs = NULL;
> >>>>> -    g_autoptr(VhostIOVATree) iova_tree = NULL;
> >>>>>         struct vhost_vdpa_iova_range iova_range;
> >>>>>         NetClientState *nc;
> >>>>>         int queue_pairs, r, i = 0, has_cvq = 0;
> >>>>> @@ -812,12 +862,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>>>             goto err;
> >>>>>         }
> >>>>>
> >>>>> -    if (opts->x_svq) {
> >>>>> -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
> >>>>> -            goto err_svq;
> >>>>> -        }
> >>>>> -
> >>>>> -        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
> >>>>> +    if (opts->x_svq && !vhost_vdpa_net_valid_svq_features(features, errp)) {
> >>>>> +        goto err;
> >>>>>         }
> >>>>>
> >>>>>         ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
> >>>>> @@ -825,7 +871,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>>>         for (i = 0; i < queue_pairs; i++) {
> >>>>>             ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >>>>>                                          vdpa_device_fd, i, 2, true, opts->x_svq,
> >>>>> -                                     iova_range, iova_tree);
> >>>>> +                                     iova_range);
> >>>>>             if (!ncs[i])
> >>>>>                 goto err;
> >>>>>         }
> >>>>> @@ -833,13 +879,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>>>         if (has_cvq) {
> >>>>>             nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >>>>>                                      vdpa_device_fd, i, 1, false,
> >>>>> -                                 opts->x_svq, iova_range, iova_tree);
> >>>>> +                                 opts->x_svq, iova_range);
> >>>>>             if (!nc)
> >>>>>                 goto err;
> >>>>>         }
> >>>>>
> >>>>> -    /* iova_tree ownership belongs to last NetClientState */
> >>>>> -    g_steal_pointer(&iova_tree);
> >>>>>         return 0;
> >>>>>
> >>>>>     err:
> >>>>> @@ -849,7 +893,6 @@ err:
> >>>>>             }
> >>>>>         }
> >>>>>
> >>>>> -err_svq:
> >>>>>         qemu_close(vdpa_device_fd);
> >>>>>
> >>>>>         return -1;
>



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration
  2023-02-15 18:40   ` Eugenio Perez Martin
@ 2023-02-16 13:50     ` Lei Yang
  0 siblings, 0 replies; 68+ messages in thread
From: Lei Yang @ 2023-02-16 13:50 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Gautam Dawar, qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu

QE tested v3 of this series again: created two vdpa_sim devices and
booted two VMs without shadow virtqueues. The migration was successful
and everything worked fine.

Tested-by: Lei Yang <leiyang@redhat.com>

Eugenio Perez Martin <eperezma@redhat.com> wrote on Thu, Feb 16, 2023 at 02:41:
>
> On Fri, Feb 10, 2023 at 1:58 PM Gautam Dawar <gdawar@amd.com> wrote:
> >
> > Hi Eugenio,
> >
> > I've tested this patch series on Xilinx/AMD SN1022 device without
> > control vq and VM Live Migration between two hosts worked fine.
> >
> > Tested-by: Gautam Dawar <gautam.dawar@amd.com>
> >
>
> Thanks for the testing!
>
> >
> > Here is some minor feedback:
> >
> > Pls fix the typo (Dynamycally -> Dynamically) in the Subject.
> >
> > On 2/8/23 15:12, Eugenio Pérez wrote:
> > > CAUTION: This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email.
> > >
> > >
> > > It's possible to migrate vdpa net devices if they are shadowed from the
> > >
> > > start.  But to always shadow the dataplane is to effectively break its host
> > >
> > > passthrough, so its not convenient in vDPA scenarios.
> > I believe you meant efficient instead of convenient.
> > >
> > >
> > >
> > > This series enables dynamically switching to shadow mode only at
> > >
> > > migration time.  This allows full data virtqueues passthrough all the
> > >
> > > time qemu is not migrating.
> > >
> > >
> > >
> > > In this series only net devices with no CVQ are migratable.  CVQ adds
> > >
> > > additional state that would make the series bigger and still had some
> > >
> > > controversy on previous RFC, so let's split it.
> > >
> > >
> > >
> > > The first patch delays the creation of the iova tree until it is really needed,
> > >
> > > and makes it easier to dynamically move from and to SVQ mode.
> > It would help adding some detail on the iova tree being referred to here.
> > >
> > >
> > >
> > > Next patches from 02 to 05 handle the suspending and getting of vq state (base)
> > >
> > > of the device at the switch to SVQ mode.  The new _F_SUSPEND feature is
> > >
> > > negotiated and stop device flow is changed so the state can be fetched trusting
> > >
> > > the device will not modify it.
> > >
> > >
> > >
> > > Since vhost backend must offer VHOST_F_LOG_ALL to be migratable, last patches
> > >
> > > but the last one add the needed migration blockers so vhost-vdpa can offer it
> >
> > "last patches but the last one"?
> >
>
> I think I solved all of the above in v3, thanks for notifying them!
>
> Would it be possible to test with v3 too?
>
> > Thanks.
> >
> > >
> > > safely.  They also add the handling of this feature.
> > >
> > >
> > >
> > > Finally, the last patch makes virtio vhost-vdpa backend to offer
> > >
> > > VHOST_F_LOG_ALL so qemu migrate the device as long as no other blocker has been
> > >
> > > added.
> > >
> > >
> > >
> > > Successfully tested with vdpa_sim_net with patch [1] applied and with the qemu
> > >
> > > emulated device with vp_vdpa with some restrictions:
> > >
> > > * No CVQ. No feature that didn't work with SVQ previously (packed, ...)
> > >
> > > * VIRTIO_RING_F_STATE patches implementing [2].
> > >
> > > * Expose _F_SUSPEND, but ignore it and suspend on ring state fetch like
> > >
> > >    DPDK.
> > >
> > >
> > >
> > > Comments are welcome.
> > >
> > >
> > >
> > > v2:
> > >
> > > - Check for SUSPEND in vhost_dev.backend_cap, as .backend_features is empty at
> > >
> > >    the check moment.
> > >
> > >
> > >
> > > v1:
> > >
> > > - Omit all code working with CVQ and block migration if the device supports
> > >
> > >    CVQ.
> > >
> > > - Remove spurious kick.
> > Even with the spurious kick, datapath didn't resume at destination VM
> > after LM as kick happened before DRIVER_OK. So IMO, it will be required
> > that the vdpa parent driver simulates a kick after creating/starting HW
> > rings.
>
> Right, it did not solve the issue.
>
> If I'm not wrong all vdpa drivers are moving to that model, checking
> for new avail descriptors right after DRIVER_OK. Maybe it is better to
> keep this discussion at patch 12/13 on RFC v2?
>
> Thanks!
>
> > >
> > > - Move all possible checks for migration to vhost-vdpa instead of the net
> > >
> > >    backend. Move them to init code from start code.
> > >
> > > - Suspend on vhost_vdpa_dev_start(false) instead of in vhost-vdpa net backend.
> > >
> > > - Properly split suspend after geting base and adding of status_reset patches.
> > >
> > > - Add possible TODOs to points where this series can improve in the future.
> > >
> > > - Check the state of migration using migration_in_setup and
> > >
> > >    migration_has_failed instead of checking all the possible migration status in
> > >
> > >    a switch.
> > >
> > > - Add TODO with possible low hand fruit using RESUME ops.
> > >
> > > - Always offer _F_LOG from virtio/vhost-vdpa and let migration blockers do
> > >
> > >    their thing instead of adding a variable.
> > >
> > > - RFC v2 at https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02574.html
> > >
> > >
> > >
> > > RFC v2:
> > >
> > > - Use a migration listener instead of a memory listener to know when
> > >
> > >    the migration starts.
> > >
> > > - Add stuff not picked with ASID patches, like enable rings after
> > >
> > >    driver_ok
> > >
> > > - Add rewinding on the migration src, not in dst
> > >
> > > - RFC v1 at https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01664.html
> > >
> > >
> > >
> > > [1] https://lore.kernel.org/lkml/20230203142501.300125-1-eperezma@redhat.com/T/
> > >
> > > [2] https://lists.oasis-open.org/archives/virtio-comment/202103/msg00036.html
> > >
> > >
> > >
> > > Eugenio Pérez (13):
> > >
> > >    vdpa net: move iova tree creation from init to start
> > >
> > >    vdpa: Negotiate _F_SUSPEND feature
> > >
> > >    vdpa: add vhost_vdpa_suspend
> > >
> > >    vdpa: move vhost reset after get vring base
> > >
> > >    vdpa: rewind at get_base, not set_base
> > >
> > >    vdpa net: allow VHOST_F_LOG_ALL
> > >
> > >    vdpa: add vdpa net migration state notifier
> > >
> > >    vdpa: disable RAM block discard only for the first device
> > >
> > >    vdpa net: block migration if the device has CVQ
> > >
> > >    vdpa: block migration if device has unsupported features
> > >
> > >    vdpa: block migration if dev does not have _F_SUSPEND
> > >
> > >    vdpa: block migration if SVQ does not admit a feature
> > >
> > >    vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices
> > >
> > >
> > >
> > >   include/hw/virtio/vhost-backend.h |   4 +
> > >
> > >   hw/virtio/vhost-vdpa.c            | 126 +++++++++++++++-----
> > >
> > >   hw/virtio/vhost.c                 |   3 +
> > >
> > >   net/vhost-vdpa.c                  | 192 +++++++++++++++++++++++++-----
> > >
> > >   hw/virtio/trace-events            |   1 +
> > >
> > >   5 files changed, 267 insertions(+), 59 deletions(-)
> > >
> > >
> > >
> > > --
> > >
> > > 2.31.1
> > >
> > >
> > >
> > >
> >
>



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 01/13] vdpa net: move iova tree creation from init to start
  2023-02-16  7:35             ` Eugenio Perez Martin
@ 2023-02-17  7:38                 ` Si-Wei Liu
  0 siblings, 0 replies; 68+ messages in thread
From: Si-Wei Liu @ 2023-02-17  7:38 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella



On 2/15/2023 11:35 PM, Eugenio Perez Martin wrote:
> On Thu, Feb 16, 2023 at 3:15 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 2/14/2023 11:07 AM, Eugenio Perez Martin wrote:
>>> On Tue, Feb 14, 2023 at 2:45 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>>>
>>>> On 2/13/2023 3:14 AM, Eugenio Perez Martin wrote:
>>>>> On Mon, Feb 13, 2023 at 7:51 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>>>>> On 2/8/2023 1:42 AM, Eugenio Pérez wrote:
>>>>>>> Only create iova_tree if and when it is needed.
>>>>>>>
>>>>>>> The cleanup keeps being responsible of last VQ but this change allows it
>>>>>>> to merge both cleanup functions.
>>>>>>>
>>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>>>> Acked-by: Jason Wang <jasowang@redhat.com>
>>>>>>> ---
>>>>>>>      net/vhost-vdpa.c | 99 ++++++++++++++++++++++++++++++++++--------------
>>>>>>>      1 file changed, 71 insertions(+), 28 deletions(-)
>>>>>>>
>>>>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>>>>>> index de5ed8ff22..a9e6c8f28e 100644
>>>>>>> --- a/net/vhost-vdpa.c
>>>>>>> +++ b/net/vhost-vdpa.c
>>>>>>> @@ -178,13 +178,9 @@ err_init:
>>>>>>>      static void vhost_vdpa_cleanup(NetClientState *nc)
>>>>>>>      {
>>>>>>>          VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>>>>>>> -    struct vhost_dev *dev = &s->vhost_net->dev;
>>>>>>>
>>>>>>>          qemu_vfree(s->cvq_cmd_out_buffer);
>>>>>>>          qemu_vfree(s->status);
>>>>>>> -    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
>>>>>>> -        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
>>>>>>> -    }
>>>>>>>          if (s->vhost_net) {
>>>>>>>              vhost_net_cleanup(s->vhost_net);
>>>>>>>              g_free(s->vhost_net);
>>>>>>> @@ -234,10 +230,64 @@ static ssize_t vhost_vdpa_receive(NetClientState *nc, const uint8_t *buf,
>>>>>>>          return size;
>>>>>>>      }
>>>>>>>
>>>>>>> +/** From any vdpa net client, get the netclient of first queue pair */
>>>>>>> +static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
>>>>>>> +{
>>>>>>> +    NICState *nic = qemu_get_nic(s->nc.peer);
>>>>>>> +    NetClientState *nc0 = qemu_get_peer(nic->ncs, 0);
>>>>>>> +
>>>>>>> +    return DO_UPCAST(VhostVDPAState, nc, nc0);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
>>>>>>> +{
>>>>>>> +    struct vhost_vdpa *v = &s->vhost_vdpa;
>>>>>>> +
>>>>>>> +    if (v->shadow_vqs_enabled) {
>>>>>>> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
>>>>>>> +                                           v->iova_range.last);
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static int vhost_vdpa_net_data_start(NetClientState *nc)
>>>>>>> +{
>>>>>>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>>>>>>> +    struct vhost_vdpa *v = &s->vhost_vdpa;
>>>>>>> +
>>>>>>> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>>>>>>> +
>>>>>>> +    if (v->index == 0) {
>>>>>>> +        vhost_vdpa_net_data_start_first(s);
>>>>>>> +        return 0;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if (v->shadow_vqs_enabled) {
>>>>>>> +        VhostVDPAState *s0 = vhost_vdpa_net_first_nc_vdpa(s);
>>>>>>> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void vhost_vdpa_net_client_stop(NetClientState *nc)
>>>>>>> +{
>>>>>>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>>>>>>> +    struct vhost_dev *dev;
>>>>>>> +
>>>>>>> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>>>>>>> +
>>>>>>> +    dev = s->vhost_vdpa.dev;
>>>>>>> +    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
>>>>>>> +        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>>      static NetClientInfo net_vhost_vdpa_info = {
>>>>>>>              .type = NET_CLIENT_DRIVER_VHOST_VDPA,
>>>>>>>              .size = sizeof(VhostVDPAState),
>>>>>>>              .receive = vhost_vdpa_receive,
>>>>>>> +        .start = vhost_vdpa_net_data_start,
>>>>>>> +        .stop = vhost_vdpa_net_client_stop,
>>>>>>>              .cleanup = vhost_vdpa_cleanup,
>>>>>>>              .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
>>>>>>>              .has_ufo = vhost_vdpa_has_ufo,
>>>>>>> @@ -351,7 +401,7 @@ dma_map_err:
>>>>>>>
>>>>>>>      static int vhost_vdpa_net_cvq_start(NetClientState *nc)
>>>>>>>      {
>>>>>>> -    VhostVDPAState *s;
>>>>>>> +    VhostVDPAState *s, *s0;
>>>>>>>          struct vhost_vdpa *v;
>>>>>>>          uint64_t backend_features;
>>>>>>>          int64_t cvq_group;
>>>>>>> @@ -425,6 +475,15 @@ out:
>>>>>>>              return 0;
>>>>>>>          }
>>>>>>>
>>>>>>> +    s0 = vhost_vdpa_net_first_nc_vdpa(s);
>>>>>>> +    if (s0->vhost_vdpa.iova_tree) {
>>>>>>> +        /* SVQ is already configured for all virtqueues */
>>>>>>> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
>>>>>>> +    } else {
>>>>>>> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
>>>>>>> +                                           v->iova_range.last);
>>>>>> I wonder how this case could happen, vhost_vdpa_net_data_start_first()
>>>>>> should've allocated an iova tree on the first data vq. Is zero data vq
>>>>>> ever possible on net vhost-vdpa?
>>>>>>
>>>>> It's the case of the current qemu master when only CVQ is being
>>>>> shadowed. It's not that "there are no data vq": If that case were
>>>>> possible, CVQ vhost-vdpa state would be s0.
>>>>>
>>>>> The case is that since only CVQ vhost-vdpa is the one being migrated,
>>>>> only CVQ has an iova tree.
>>>> OK, so this corresponds to the case where live migration is not started
>>>> and CVQ starts in its own address space of VHOST_VDPA_NET_CVQ_ASID.
>>>> Thanks for explaining it!
>>>>
>>>>> With this series applied and with no migration running, the case is
>>>>> the same as before: only SVQ gets shadowed. When migration starts, all
>>>>> vqs are migrated, and share iova tree.
>>>> I wonder what is the reason to share the iova tree when migration
>>>> starts, I think CVQ may stay on its own VHOST_VDPA_NET_CVQ_ASID still?
>>>>
>>>> Actually there's a discrepancy in vhost_vdpa_net_log_global_enable(): I
>>>> don't see explicit code to switch from VHOST_VDPA_NET_CVQ_ASID to
>>>> VHOST_VDPA_GUEST_PA_ASID for the CVQ. This is the address space
>>>> collision I mentioned earlier:
>>>>
>>> There is no such change. This code only migrates devices with no CVQ,
>>> as they have their own difficulties.
>>>
>>> In the previous RFC there was no such change either. Since it's hard
>>> to modify passthrough devices IOVA tree, CVQ AS updates keep being
>>> VHOST_VDPA_NET_CVQ_ASID.
>> That's my understanding too: the current code doesn't support changing
>> the AS once it is set, although the uAPI doesn't prohibit it.
>>
>>> They both share the same IOVA tree though, just for simplicity.
>> It would be good to document this assumption somewhere in the code; it's
>> not easy to infer that userspace doesn't share the kernel's view of the
>> iova tree being used.
>>
>>>    If
>>> address space exhaustion is a problem we can make them independent,
>>> but this complicates the code a little bit.
>>>
>>>> 9585@1676093788.259201:vhost_vdpa_dma_map vdpa:0x7ff13088a190 fd: 16
>>>> msg_type: 2 asid: 0 iova: 0x1000 size: 0x2000 uaddr: 0x55a5a7ff3000
>>>> perm: 0x1 type: 2
>>>> 9585@1676093788.279923:vhost_vdpa_dma_map vdpa:0x7ff13088a190 fd: 16
>>>> msg_type: 2 asid: 0 iova: 0x3000 size: 0x1000 uaddr: 0x55a5a7ff6000
>>>> perm: 0x3 type: 2
>>>> 9585@1676093788.290529:vhost_vdpa_set_vring_addr dev: 0x55a5a77cec20
>>>> index: 0 flags: 0x0 desc_user_addr: 0x1000 used_user_addr: 0x3000
>>>> avail_user_addr: 0x2000 log_guest_addr: 0x0
>>>> :
>>>> :
>>>> 9585@1676093788.543567:vhost_vdpa_dma_map vdpa:0x7ff1302b6190 fd: 16
>>>> msg_type: 2 asid: 0 iova: 0x16000 size: 0x2000 uaddr: 0x55a5a7959000
>>>> perm: 0x1 type: 2
>>>> 9585@1676093788.576923:vhost_vdpa_dma_map vdpa:0x7ff1302b6190 fd: 16
>>>> msg_type: 2 asid: 0 iova: 0x18000 size: 0x1000 uaddr: 0x55a5a795c000
>>>> perm: 0x3 type: 2
>>>> 9585@1676093788.593881:vhost_vdpa_set_vring_addr dev: 0x55a5a7580930
>>>> index: 7 flags: 0x0 desc_user_addr: 0x16000 used_user_addr: 0x18000
>>>> avail_user_addr: 0x17000 log_guest_addr: 0x0
>>>> 9585@1676093788.593904:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
>>>> msg_type: 2 asid: 1 iova: 0x19000 size: 0x1000 uaddr: 0x55a5a77f8000
>>>> perm: 0x1 type: 2
>>>> 9585@1676093788.606448:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
>>>> msg_type: 2 asid: 1 iova: 0x1a000 size: 0x1000 uaddr: 0x55a5a77fa000
>>>> perm: 0x3 type: 2
>>>> 9585@1676093788.616253:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
>>>> msg_type: 2 asid: 1 iova: 0x1b000 size: 0x1000 uaddr: 0x55a5a795f000
>>>> perm: 0x1 type: 2
>>>> 9585@1676093788.625956:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
>>>> msg_type: 2 asid: 1 iova: 0x1c000 size: 0x1000 uaddr: 0x55a5a7f4e000
>>>> perm: 0x3 type: 2
>>>> 9585@1676093788.635655:vhost_vdpa_set_vring_addr dev: 0x55a5a7580ec0
>>>> index: 8 flags: 0x0 desc_user_addr: 0x1b000 used_user_addr: 0x1c000
>>>> avail_user_addr: 0x1b400 log_guest_addr: 0x0
>>>> 9585@1676093788.635667:vhost_vdpa_listener_region_add vdpa:
>>>> 0x7ff13026d190 iova 0x0 llend 0xa0000 vaddr: 0x7fef1fe00000 read-only: 0
>>>> 9585@1676093788.635670:vhost_vdpa_listener_begin_batch
>>>> vdpa:0x7ff13026d190 fd: 16 msg_type: 2 type: 5
>>>> 9585@1676093788.635677:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
>>>> msg_type: 2 asid: 0 iova: 0x0 size: 0xa0000 uaddr: 0x7fef1fe00000 perm:
>>>> 0x3 type: 2
>>>> 2023-02-11T05:36:28.635686Z qemu-system-x86_64: failed to write, fd=16,
>>>> errno=14 (Bad address)
>>>> 2023-02-11T05:36:28.635721Z qemu-system-x86_64: vhost vdpa map fail!
>>>> 2023-02-11T05:36:28.635744Z qemu-system-x86_64: vhost-vdpa: DMA mapping
>>>> failed, unable to continue
>>>>
>>> I'm not sure how you get to this. Maybe you were able to start the
>>> migration because the CVQ migration blocker was not effectively added?
>> It's something else: the line below, at the start of
>> vhost_vdpa_net_cvq_start(), would override shadow_data on the CVQ.
>>
>>       v->shadow_data = s->always_svq;
>>
>> Which leads to my previous question why shadow_data needs to apply to
>> the CVQ
>>
> Ok, I'm proposing some documentation here. I'll send a new patch
> adding it to the sources if you think it is complete.
It's fine, I don't intend to block on this. But what I really meant is
that there is a bug in the line I pointed out earlier. shadow_data is
already set by net_vhost_vdpa_init() at init time (for the x-svq=on
case). For the x-svq=off case, vhost_vdpa_net_log_global_enable() sets
shadow_data to true on the CVQ within the migration notifier; that's
correct and expected. However, the subsequent vhost_net_start() call
right after would go into vhost_vdpa_net_cvq_start(), which
inadvertently sets the CVQ's shadow_data back to false. That defeats
the purpose of using shadow_data to indicate that iova translation on
the shadowed CVQ should go through the *shared* iova tree. You can say
migration with CVQ is blocked anyway, so this code path doesn't get
exposed for now, but it still causes conflict and confusion for readers
trying to understand what the code attempts to achieve. Maybe remove
this line, or move it to vhost_vdpa_net_cvq_stop()?
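
As an untested sketch, just to illustrate the second option (only the
shadow_data assignment is from the current code; the hunk placement is
indicative):

@@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
-    v->shadow_data = s->always_svq;

@@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
+    v->shadow_data = s->always_svq;

That way the value set by the migration notifier survives until the
device is actually stopped.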

> Shadow_data needs to apply to CVQ because memory_listener is
> registered against CVQ,
It's bound to the last virtqueue pair which is not necessarily a CVQ.
>   and memory listener needs to know if data vqs
> are passthrough or shadowed. We could register the memory listener
> against a different vhost_vdpa, but then its lifecycle gets complicated.
The lifecycle can remain the same, but the code will be a lot messier
for sure. :)

> ---
>
> For completeness, the original discussion was [1].
>
>> and why the userspace iova is shared between data queues and CVQ.
> It's not shared unless the device does not support ASID. They only
> share the iova tree because the iova tree is not used for tracking
> memory itself but only translations, so its lifecycle is easier. Each
> piece of memory's lifecycle is tracked differently:
> * Guest's memory is tracked by the memory listener itself, so we got
> all the regions at register / unregister and in its own updates.
> * SVQ vrings are tracked in vhost_vdpa->shadow_vqs[i].
> * CVQ shadow buffers are tracked in net VhostVDPAState.
> ---
>
> I'll send a new series adding the two pieces of doc if you think they
> are complete. Please let me know if you'd add or remove something.
No you don't have to. Just leave it as-is.

What I thought about making the two iova trees independent was not just
for translation but also to keep them in sync with the kernel's IOVA
address space, so that switching mode causes less churn by sending down
a thinner iova update for the unmap and map cycle. For now sharing the
iova tree is fine. I'll see if there's another alternative to keep guest
memory identity mapped 1:1 on the iova tree across the mode switch.
Future work you don't have to worry about now.
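
Just to make the idea concrete, a rough sketch of what identity mapping
could look like. This is a hypothetical helper: vhost_iova_tree_insert()
is an assumed insert-at-fixed-iova primitive that VhostIOVATree does not
expose today, so take it as an illustration only:

/*
 * Hypothetical: pin guest memory at iova == GPA so that switching to or
 * from SVQ mode needs no unmap/map cycle for guest regions.
 */
static int vhost_iova_tree_map_identity(VhostIOVATree *tree,
                                        hwaddr gpa, hwaddr size)
{
    DMAMap map = {
        .iova = gpa,              /* identity: iova == GPA */
        .translated_addr = gpa,
        .size = size - 1,         /* DMAMap sizes are inclusive */
        .perm = IOMMU_RW,
    };

    /* assumed primitive: insert at a caller-chosen iova */
    return vhost_iova_tree_insert(tree, &map);
}

SVQ-only allocations (shadow vrings, CVQ buffers) would then be carved
out of iova ranges that guest memory does not cover.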

Thanks,
-Siwei

>
> Note that this code is already on qemu master so this doc should not
> block this series, correct?
>
> Thanks!
>
> [1] https://mail.gnu.org/archive/html/qemu-devel/2022-11/msg02033.html
>
>> -Siwei
>>
>>
>>> Thanks!
>>>
>>>
>>>> Regards,
>>>> -Siwei
>>>>> Thanks!
>>>>>
>>>>>> Thanks,
>>>>>> -Siwei
>>>>>>> +    }
>>>>>>> +
>>>>>>>          r = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer,
>>>>>>>                                     vhost_vdpa_net_cvq_cmd_page_len(), false);
>>>>>>>          if (unlikely(r < 0)) {
>>>>>>> @@ -449,15 +508,9 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
>>>>>>>          if (s->vhost_vdpa.shadow_vqs_enabled) {
>>>>>>>              vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
>>>>>>>              vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->status);
>>>>>>> -        if (!s->always_svq) {
>>>>>>> -            /*
>>>>>>> -             * If only the CVQ is shadowed we can delete this safely.
>>>>>>> -             * If all the VQs are shadows this will be needed by the time the
>>>>>>> -             * device is started again to register SVQ vrings and similar.
>>>>>>> -             */
>>>>>>> -            g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
>>>>>>> -        }
>>>>>>>          }
>>>>>>> +
>>>>>>> +    vhost_vdpa_net_client_stop(nc);
>>>>>>>      }
>>>>>>>
>>>>>>>      static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
>>>>>>> @@ -667,8 +720,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>>>>>>>                                             int nvqs,
>>>>>>>                                             bool is_datapath,
>>>>>>>                                             bool svq,
>>>>>>> -                                       struct vhost_vdpa_iova_range iova_range,
>>>>>>> -                                       VhostIOVATree *iova_tree)
>>>>>>> +                                       struct vhost_vdpa_iova_range iova_range)
>>>>>>>      {
>>>>>>>          NetClientState *nc = NULL;
>>>>>>>          VhostVDPAState *s;
>>>>>>> @@ -690,7 +742,6 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>>>>>>>          s->vhost_vdpa.shadow_vqs_enabled = svq;
>>>>>>>          s->vhost_vdpa.iova_range = iova_range;
>>>>>>>          s->vhost_vdpa.shadow_data = svq;
>>>>>>> -    s->vhost_vdpa.iova_tree = iova_tree;
>>>>>>>          if (!is_datapath) {
>>>>>>>              s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
>>>>>>>                                                  vhost_vdpa_net_cvq_cmd_page_len());
>>>>>>> @@ -760,7 +811,6 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>>>>>          uint64_t features;
>>>>>>>          int vdpa_device_fd;
>>>>>>>          g_autofree NetClientState **ncs = NULL;
>>>>>>> -    g_autoptr(VhostIOVATree) iova_tree = NULL;
>>>>>>>          struct vhost_vdpa_iova_range iova_range;
>>>>>>>          NetClientState *nc;
>>>>>>>          int queue_pairs, r, i = 0, has_cvq = 0;
>>>>>>> @@ -812,12 +862,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>>>>>              goto err;
>>>>>>>          }
>>>>>>>
>>>>>>> -    if (opts->x_svq) {
>>>>>>> -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
>>>>>>> -            goto err_svq;
>>>>>>> -        }
>>>>>>> -
>>>>>>> -        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
>>>>>>> +    if (opts->x_svq && !vhost_vdpa_net_valid_svq_features(features, errp)) {
>>>>>>> +        goto err;
>>>>>>>          }
>>>>>>>
>>>>>>>          ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
>>>>>>> @@ -825,7 +871,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>>>>>          for (i = 0; i < queue_pairs; i++) {
>>>>>>>              ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>>>>>>>                                           vdpa_device_fd, i, 2, true, opts->x_svq,
>>>>>>> -                                     iova_range, iova_tree);
>>>>>>> +                                     iova_range);
>>>>>>>              if (!ncs[i])
>>>>>>>                  goto err;
>>>>>>>          }
>>>>>>> @@ -833,13 +879,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>>>>>          if (has_cvq) {
>>>>>>>              nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>>>>>>>                                       vdpa_device_fd, i, 1, false,
>>>>>>> -                                 opts->x_svq, iova_range, iova_tree);
>>>>>>> +                                 opts->x_svq, iova_range);
>>>>>>>              if (!nc)
>>>>>>>                  goto err;
>>>>>>>          }
>>>>>>>
>>>>>>> -    /* iova_tree ownership belongs to last NetClientState */
>>>>>>> -    g_steal_pointer(&iova_tree);
>>>>>>>          return 0;
>>>>>>>
>>>>>>>      err:
>>>>>>> @@ -849,7 +893,6 @@ err:
>>>>>>>              }
>>>>>>>          }
>>>>>>>
>>>>>>> -err_svq:
>>>>>>>          qemu_close(vdpa_device_fd);
>>>>>>>
>>>>>>>          return -1;



^ permalink raw reply	[flat|nested] 68+ messages in thread


* Re: [PATCH v2 01/13] vdpa net: move iova tree creation from init to start
  2023-02-17  7:38                 ` Si-Wei Liu
  (?)
@ 2023-02-17 13:55                 ` Eugenio Perez Martin
  -1 siblings, 0 replies; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-17 13:55 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Jason Wang, Cindy Lu, alvaro.karsz,
	Zhu Lingshan, Lei Yang, Liuxiangdong, Shannon Nelson,
	Parav Pandit, Gautam Dawar, Eli Cohen, Stefan Hajnoczi,
	Laurent Vivier, longpeng2, virtualization, Stefano Garzarella

On Fri, Feb 17, 2023 at 8:39 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 2/15/2023 11:35 PM, Eugenio Perez Martin wrote:
> > On Thu, Feb 16, 2023 at 3:15 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >>
> >>
> >> On 2/14/2023 11:07 AM, Eugenio Perez Martin wrote:
> >>> On Tue, Feb 14, 2023 at 2:45 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >>>>
> >>>> On 2/13/2023 3:14 AM, Eugenio Perez Martin wrote:
> >>>>> On Mon, Feb 13, 2023 at 7:51 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >>>>>> On 2/8/2023 1:42 AM, Eugenio Pérez wrote:
> >>>>>>> Only create iova_tree if and when it is needed.
> >>>>>>>
> >>>>>>> The cleanup keeps being responsible of last VQ but this change allows it
> >>>>>>> to merge both cleanup functions.
> >>>>>>>
> >>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>>>>>> Acked-by: Jason Wang <jasowang@redhat.com>
> >>>>>>> ---
> >>>>>>>      net/vhost-vdpa.c | 99 ++++++++++++++++++++++++++++++++++--------------
> >>>>>>>      1 file changed, 71 insertions(+), 28 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> >>>>>>> index de5ed8ff22..a9e6c8f28e 100644
> >>>>>>> --- a/net/vhost-vdpa.c
> >>>>>>> +++ b/net/vhost-vdpa.c
> >>>>>>> @@ -178,13 +178,9 @@ err_init:
> >>>>>>>      static void vhost_vdpa_cleanup(NetClientState *nc)
> >>>>>>>      {
> >>>>>>>          VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>>>>>> -    struct vhost_dev *dev = &s->vhost_net->dev;
> >>>>>>>
> >>>>>>>          qemu_vfree(s->cvq_cmd_out_buffer);
> >>>>>>>          qemu_vfree(s->status);
> >>>>>>> -    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> >>>>>>> -        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> >>>>>>> -    }
> >>>>>>>          if (s->vhost_net) {
> >>>>>>>              vhost_net_cleanup(s->vhost_net);
> >>>>>>>              g_free(s->vhost_net);
> >>>>>>> @@ -234,10 +230,64 @@ static ssize_t vhost_vdpa_receive(NetClientState *nc, const uint8_t *buf,
> >>>>>>>          return size;
> >>>>>>>      }
> >>>>>>>
> >>>>>>> +/** From any vdpa net client, get the netclient of first queue pair */
> >>>>>>> +static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
> >>>>>>> +{
> >>>>>>> +    NICState *nic = qemu_get_nic(s->nc.peer);
> >>>>>>> +    NetClientState *nc0 = qemu_get_peer(nic->ncs, 0);
> >>>>>>> +
> >>>>>>> +    return DO_UPCAST(VhostVDPAState, nc, nc0);
> >>>>>>> +}
> >>>>>>> +
> >>>>>>> +static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
> >>>>>>> +{
> >>>>>>> +    struct vhost_vdpa *v = &s->vhost_vdpa;
> >>>>>>> +
> >>>>>>> +    if (v->shadow_vqs_enabled) {
> >>>>>>> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> >>>>>>> +                                           v->iova_range.last);
> >>>>>>> +    }
> >>>>>>> +}
> >>>>>>> +
> >>>>>>> +static int vhost_vdpa_net_data_start(NetClientState *nc)
> >>>>>>> +{
> >>>>>>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>>>>>> +    struct vhost_vdpa *v = &s->vhost_vdpa;
> >>>>>>> +
> >>>>>>> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >>>>>>> +
> >>>>>>> +    if (v->index == 0) {
> >>>>>>> +        vhost_vdpa_net_data_start_first(s);
> >>>>>>> +        return 0;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    if (v->shadow_vqs_enabled) {
> >>>>>>> +        VhostVDPAState *s0 = vhost_vdpa_net_first_nc_vdpa(s);
> >>>>>>> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    return 0;
> >>>>>>> +}
> >>>>>>> +
> >>>>>>> +static void vhost_vdpa_net_client_stop(NetClientState *nc)
> >>>>>>> +{
> >>>>>>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>>>>>> +    struct vhost_dev *dev;
> >>>>>>> +
> >>>>>>> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >>>>>>> +
> >>>>>>> +    dev = s->vhost_vdpa.dev;
> >>>>>>> +    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> >>>>>>> +        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> >>>>>>> +    }
> >>>>>>> +}
> >>>>>>> +
> >>>>>>>      static NetClientInfo net_vhost_vdpa_info = {
> >>>>>>>              .type = NET_CLIENT_DRIVER_VHOST_VDPA,
> >>>>>>>              .size = sizeof(VhostVDPAState),
> >>>>>>>              .receive = vhost_vdpa_receive,
> >>>>>>> +        .start = vhost_vdpa_net_data_start,
> >>>>>>> +        .stop = vhost_vdpa_net_client_stop,
> >>>>>>>              .cleanup = vhost_vdpa_cleanup,
> >>>>>>>              .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
> >>>>>>>              .has_ufo = vhost_vdpa_has_ufo,
> >>>>>>> @@ -351,7 +401,7 @@ dma_map_err:
> >>>>>>>
> >>>>>>>      static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> >>>>>>>      {
> >>>>>>> -    VhostVDPAState *s;
> >>>>>>> +    VhostVDPAState *s, *s0;
> >>>>>>>          struct vhost_vdpa *v;
> >>>>>>>          uint64_t backend_features;
> >>>>>>>          int64_t cvq_group;
> >>>>>>> @@ -425,6 +475,15 @@ out:
> >>>>>>>              return 0;
> >>>>>>>          }
> >>>>>>>
> >>>>>>> +    s0 = vhost_vdpa_net_first_nc_vdpa(s);
> >>>>>>> +    if (s0->vhost_vdpa.iova_tree) {
> >>>>>>> +        /* SVQ is already configured for all virtqueues */
> >>>>>>> +        v->iova_tree = s0->vhost_vdpa.iova_tree;
> >>>>>>> +    } else {
> >>>>>>> +        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> >>>>>>> +                                           v->iova_range.last);
> >>>>>> I wonder how this case could happen, vhost_vdpa_net_data_start_first()
> >>>>>> should've allocated an iova tree on the first data vq. Is zero data vq
> >>>>>> ever possible on net vhost-vdpa?
> >>>>>>
> >>>>> It's the case of the current qemu master when only CVQ is being
> >>>>> shadowed. It's not that "there are no data vq": If that case were
> >>>>> possible, CVQ vhost-vdpa state would be s0.
> >>>>>
> >>>>> The case is that since only CVQ vhost-vdpa is the one being migrated,
> >>>>> only CVQ has an iova tree.
> >>>> OK, so this corresponds to the case where live migration is not started
> >>>> and CVQ starts in its own address space of VHOST_VDPA_NET_CVQ_ASID.
> >>>> Thanks for explaining it!
> >>>>
> >>>>> With this series applied and with no migration running, the case is
> >>>>> the same as before: only the CVQ gets shadowed. When migration starts,
> >>>>> all vqs are migrated and share the iova tree.
> >>>> I wonder what is the reason to share the iova tree when migration
> >>>> starts, I think CVQ may stay on its own VHOST_VDPA_NET_CVQ_ASID still?
> >>>>
> >>>> Actually there's a discrepancy in vhost_vdpa_net_log_global_enable(): I
> >>>> don't see explicit code to switch from VHOST_VDPA_NET_CVQ_ASID to
> >>>> VHOST_VDPA_GUEST_PA_ASID for the CVQ. This is the address space
> >>>> collision I mentioned earlier:
> >>>>
> >>> There is no such change. This code only migrates devices with no CVQ,
> >>> as they have their own difficulties.
> >>>
> >>> In the previous RFC there was no such change either. Since it's hard
> >>> to modify a passthrough device's IOVA tree, CVQ AS updates keep using
> >>> VHOST_VDPA_NET_CVQ_ASID.
> >> That's my understanding too, the current code doesn't support changing
> >> AS once it is set, although uAPI doesn't prohibit it.
> >>
> >>> They both share the same IOVA tree though, just for simplicity.
> >> It would be good to document this assumption somewhere in the code; it's
> >> not easy to infer that userspace doesn't have the same view as the
> >> kernel in terms of the iova tree being used.
> >>
> >>>    If
> >>> address space exhaustion is a problem we can make them independent,
> >>> but this complicates the code a little bit.
> >>>
> >>>> 9585@1676093788.259201:vhost_vdpa_dma_map vdpa:0x7ff13088a190 fd: 16
> >>>> msg_type: 2 asid: 0 iova: 0x1000 size: 0x2000 uaddr: 0x55a5a7ff3000
> >>>> perm: 0x1 type: 2
> >>>> 9585@1676093788.279923:vhost_vdpa_dma_map vdpa:0x7ff13088a190 fd: 16
> >>>> msg_type: 2 asid: 0 iova: 0x3000 size: 0x1000 uaddr: 0x55a5a7ff6000
> >>>> perm: 0x3 type: 2
> >>>> 9585@1676093788.290529:vhost_vdpa_set_vring_addr dev: 0x55a5a77cec20
> >>>> index: 0 flags: 0x0 desc_user_addr: 0x1000 used_user_addr: 0x3000
> >>>> avail_user_addr: 0x2000 log_guest_addr: 0x0
> >>>> :
> >>>> :
> >>>> 9585@1676093788.543567:vhost_vdpa_dma_map vdpa:0x7ff1302b6190 fd: 16
> >>>> msg_type: 2 asid: 0 iova: 0x16000 size: 0x2000 uaddr: 0x55a5a7959000
> >>>> perm: 0x1 type: 2
> >>>> 9585@1676093788.576923:vhost_vdpa_dma_map vdpa:0x7ff1302b6190 fd: 16
> >>>> msg_type: 2 asid: 0 iova: 0x18000 size: 0x1000 uaddr: 0x55a5a795c000
> >>>> perm: 0x3 type: 2
> >>>> 9585@1676093788.593881:vhost_vdpa_set_vring_addr dev: 0x55a5a7580930
> >>>> index: 7 flags: 0x0 desc_user_addr: 0x16000 used_user_addr: 0x18000
> >>>> avail_user_addr: 0x17000 log_guest_addr: 0x0
> >>>> 9585@1676093788.593904:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> >>>> msg_type: 2 asid: 1 iova: 0x19000 size: 0x1000 uaddr: 0x55a5a77f8000
> >>>> perm: 0x1 type: 2
> >>>> 9585@1676093788.606448:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> >>>> msg_type: 2 asid: 1 iova: 0x1a000 size: 0x1000 uaddr: 0x55a5a77fa000
> >>>> perm: 0x3 type: 2
> >>>> 9585@1676093788.616253:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> >>>> msg_type: 2 asid: 1 iova: 0x1b000 size: 0x1000 uaddr: 0x55a5a795f000
> >>>> perm: 0x1 type: 2
> >>>> 9585@1676093788.625956:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> >>>> msg_type: 2 asid: 1 iova: 0x1c000 size: 0x1000 uaddr: 0x55a5a7f4e000
> >>>> perm: 0x3 type: 2
> >>>> 9585@1676093788.635655:vhost_vdpa_set_vring_addr dev: 0x55a5a7580ec0
> >>>> index: 8 flags: 0x0 desc_user_addr: 0x1b000 used_user_addr: 0x1c000
> >>>> avail_user_addr: 0x1b400 log_guest_addr: 0x0
> >>>> 9585@1676093788.635667:vhost_vdpa_listener_region_add vdpa:
> >>>> 0x7ff13026d190 iova 0x0 llend 0xa0000 vaddr: 0x7fef1fe00000 read-only: 0
> >>>> 9585@1676093788.635670:vhost_vdpa_listener_begin_batch
> >>>> vdpa:0x7ff13026d190 fd: 16 msg_type: 2 type: 5
> >>>> 9585@1676093788.635677:vhost_vdpa_dma_map vdpa:0x7ff13026d190 fd: 16
> >>>> msg_type: 2 asid: 0 iova: 0x0 size: 0xa0000 uaddr: 0x7fef1fe00000 perm:
> >>>> 0x3 type: 2
> >>>> 2023-02-11T05:36:28.635686Z qemu-system-x86_64: failed to write, fd=16,
> >>>> errno=14 (Bad address)
> >>>> 2023-02-11T05:36:28.635721Z qemu-system-x86_64: vhost vdpa map fail!
> >>>> 2023-02-11T05:36:28.635744Z qemu-system-x86_64: vhost-vdpa: DMA mapping
> >>>> failed, unable to continue
> >>>>
> >>> I'm not sure how you get to this. Maybe you were able to start the
> >>> migration because the CVQ migration blocker was not effectively added?
> >> It's something else: the line below, at the start of
> >> vhost_vdpa_net_cvq_start(), would override shadow_data on the CVQ.
> >>
> >>       v->shadow_data = s->always_svq;
> >>
> >> Which leads to my previous question why shadow_data needs to apply to
> >> the CVQ
> >>
> > Ok, I'm proposing some documentation here. I'll send a new patch
> > adding it to the sources if you think it is complete.
> It's fine, I don't intend to block on this. But what I really meant is
> that there is a bug in the line I pointed out earlier. shadow_data is
> already set by net_vhost_vdpa_init() at init time (for the x-svq=on
> case). For the x-svq=off case, vhost_vdpa_net_log_global_enable() sets
> shadow_data to true on the CVQ within the migration notifier; that's
> correct and expected. However, the subsequent vhost_net_start() call
> right after would go into vhost_vdpa_net_cvq_start(), which
> inadvertently sets the CVQ's shadow_data back to false. That defeats
> the purpose of using shadow_data to indicate that iova translation on
> the shadowed CVQ should go through the *shared* iova tree. You can say
> migration with CVQ is blocked anyway, so this code path doesn't get
> exposed for now, but it still causes conflict and confusion for readers
> trying to understand what the code attempts to achieve. Maybe remove
> this line, or move it to vhost_vdpa_net_cvq_stop()?
>

Ok now I get you. Thank you very much for the catches and
explanations! I'll remove that and those CVQ leftovers for the next
version.

> > Shadow_data needs to apply to CVQ because memory_listener is
> > registered against CVQ,
> It's bound to the last virtqueue pair which is not necessarily a CVQ.
> >   and memory listener needs to know if data vqs
> > are passthrough or shadowed. We could register the memory listener
> > against a different vhost_vdpa, but then its lifecycle gets complicated.
> The lifecycle can remain the same, but the code will be a lot messier
> for sure. :)
>
> > ---
> >
> > For completeness, the original discussion was [1].
> >
> >> and why the userspace iova is shared between data queues and CVQ.
> > It's not shared unless the device does not support ASID. They only
> > share the iova tree because the iova tree is not used for tracking
> > memory itself but only translations, so its lifecycle is easier. Each
> > piece of memory's lifecycle is tracked differently:
> > * Guest's memory is tracked by the memory listener itself, so we got
> > all the regions at register / unregister and in its own updates.
> > * SVQ vrings are tracked in vhost_vdpa->shadow_vqs[i].
> > * CVQ shadow buffers are tracked in net VhostVDPAState.
> > ---
> >
> > I'll send a new series adding the two pieces of doc if you think they
> > are complete. Please let me know if you'd add or remove something.
> No you don't have to. Just leave it as-is.
>
> What I thought about making the two iova trees independent was not just
> for translation but also to keep them in sync with the kernel's IOVA
> address space, so that switching mode causes less churn by sending down
> a thinner iova update for the unmap and map cycle. For now sharing the
> iova tree is fine. I'll see if there's another alternative to keep guest
> memory identity mapped 1:1 on the iova tree across the mode switch.
> Future work you don't have to worry about now.
>

Got it.

Thanks!

> Thanks,
> -Siwei
>
> >
> > Note that this code is already on qemu master so this doc should not
> > block this series, correct?
> >
> > Thanks!
> >
> > [1] https://mail.gnu.org/archive/html/qemu-devel/2022-11/msg02033.html
> >
> >> -Siwei
> >>
> >>
> >>> Thanks!
> >>>
> >>>
> >>>> Regards,
> >>>> -Siwei
> >>>>> Thanks!
> >>>>>
> >>>>>> Thanks,
> >>>>>> -Siwei
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>>          r = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer,
> >>>>>>>                                     vhost_vdpa_net_cvq_cmd_page_len(), false);
> >>>>>>>          if (unlikely(r < 0)) {
> >>>>>>> @@ -449,15 +508,9 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
> >>>>>>>          if (s->vhost_vdpa.shadow_vqs_enabled) {
> >>>>>>>              vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
> >>>>>>>              vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->status);
> >>>>>>> -        if (!s->always_svq) {
> >>>>>>> -            /*
> >>>>>>> -             * If only the CVQ is shadowed we can delete this safely.
> >>>>>>> -             * If all the VQs are shadows this will be needed by the time the
> >>>>>>> -             * device is started again to register SVQ vrings and similar.
> >>>>>>> -             */
> >>>>>>> -            g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> >>>>>>> -        }
> >>>>>>>          }
> >>>>>>> +
> >>>>>>> +    vhost_vdpa_net_client_stop(nc);
> >>>>>>>      }
> >>>>>>>
> >>>>>>>      static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
> >>>>>>> @@ -667,8 +720,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >>>>>>>                                             int nvqs,
> >>>>>>>                                             bool is_datapath,
> >>>>>>>                                             bool svq,
> >>>>>>> -                                       struct vhost_vdpa_iova_range iova_range,
> >>>>>>> -                                       VhostIOVATree *iova_tree)
> >>>>>>> +                                       struct vhost_vdpa_iova_range iova_range)
> >>>>>>>      {
> >>>>>>>          NetClientState *nc = NULL;
> >>>>>>>          VhostVDPAState *s;
> >>>>>>> @@ -690,7 +742,6 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >>>>>>>          s->vhost_vdpa.shadow_vqs_enabled = svq;
> >>>>>>>          s->vhost_vdpa.iova_range = iova_range;
> >>>>>>>          s->vhost_vdpa.shadow_data = svq;
> >>>>>>> -    s->vhost_vdpa.iova_tree = iova_tree;
> >>>>>>>          if (!is_datapath) {
> >>>>>>>              s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
> >>>>>>>                                                  vhost_vdpa_net_cvq_cmd_page_len());
> >>>>>>> @@ -760,7 +811,6 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>>>>>          uint64_t features;
> >>>>>>>          int vdpa_device_fd;
> >>>>>>>          g_autofree NetClientState **ncs = NULL;
> >>>>>>> -    g_autoptr(VhostIOVATree) iova_tree = NULL;
> >>>>>>>          struct vhost_vdpa_iova_range iova_range;
> >>>>>>>          NetClientState *nc;
> >>>>>>>          int queue_pairs, r, i = 0, has_cvq = 0;
> >>>>>>> @@ -812,12 +862,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>>>>>              goto err;
> >>>>>>>          }
> >>>>>>>
> >>>>>>> -    if (opts->x_svq) {
> >>>>>>> -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
> >>>>>>> -            goto err_svq;
> >>>>>>> -        }
> >>>>>>> -
> >>>>>>> -        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
> >>>>>>> +    if (opts->x_svq && !vhost_vdpa_net_valid_svq_features(features, errp)) {
> >>>>>>> +        goto err;
> >>>>>>>          }
> >>>>>>>
> >>>>>>>          ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
> >>>>>>> @@ -825,7 +871,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>>>>>          for (i = 0; i < queue_pairs; i++) {
> >>>>>>>              ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >>>>>>>                                           vdpa_device_fd, i, 2, true, opts->x_svq,
> >>>>>>> -                                     iova_range, iova_tree);
> >>>>>>> +                                     iova_range);
> >>>>>>>              if (!ncs[i])
> >>>>>>>                  goto err;
> >>>>>>>          }
> >>>>>>> @@ -833,13 +879,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>>>>>          if (has_cvq) {
> >>>>>>>              nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >>>>>>>                                       vdpa_device_fd, i, 1, false,
> >>>>>>> -                                 opts->x_svq, iova_range, iova_tree);
> >>>>>>> +                                 opts->x_svq, iova_range);
> >>>>>>>              if (!nc)
> >>>>>>>                  goto err;
> >>>>>>>          }
> >>>>>>>
> >>>>>>> -    /* iova_tree ownership belongs to last NetClientState */
> >>>>>>> -    g_steal_pointer(&iova_tree);
> >>>>>>>          return 0;
> >>>>>>>
> >>>>>>>      err:
> >>>>>>> @@ -849,7 +893,6 @@ err:
> >>>>>>>              }
> >>>>>>>          }
> >>>>>>>
> >>>>>>> -err_svq:
> >>>>>>>          qemu_close(vdpa_device_fd);
> >>>>>>>
> >>>>>>>          return -1;
>



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 03/13] vdpa: add vhost_vdpa_suspend
  2023-02-08  9:42 ` [PATCH v2 03/13] vdpa: add vhost_vdpa_suspend Eugenio Pérez
@ 2023-02-21  5:27     ` Jason Wang
  0 siblings, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-21  5:27 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Laurent Vivier, Parav Pandit, Cindy Lu, Michael S. Tsirkin,
	Gautam Dawar, virtualization, Harpreet Singh Anand, Lei Yang,
	Stefan Hajnoczi, Eli Cohen, longpeng2, Shannon Nelson,
	Liuxiangdong


On 2023/2/8 17:42, Eugenio Pérez wrote:
> The function vhost.c:vhost_dev_stop fetches the vring base so the vq
> state can be migrated to other devices.  However, this is unreliable in
> vdpa, since we didn't signal the device to suspend the queues, making
> the value fetched useless.
>
> Suspend the device if possible before fetching first and subsequent
> vring bases.
>
> Moreover, vdpa totally resets and wipes the device at the last device
> before fetching its vrings base, making that operation useless in the
> last device. This will be fixed in later patches of this series.


It would be better not to introduce a bug first and then fix it in the
following patch.


>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   hw/virtio/vhost-vdpa.c | 19 +++++++++++++++++++
>   hw/virtio/trace-events |  1 +
>   2 files changed, 20 insertions(+)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 2e79fbe4b2..cbbe92ffe8 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -1108,6 +1108,24 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
>       }
>   }
>   
> +static void vhost_vdpa_suspend(struct vhost_dev *dev)
> +{
> +    struct vhost_vdpa *v = dev->opaque;
> +    int r;
> +
> +    if (!vhost_vdpa_first_dev(dev) ||


Any reason we need to use vhost_vdpa_first_dev() instead of replacing the

if (started) {
} else {
     vhost_vdpa_reset_device(dev);
     ....
}


We check

if (dev->vq_index + dev->nvqs != dev->vq_index_end) in
vhost_vdpa_dev_start() but vhost_vdpa_first_dev() inside
vhost_vdpa_suspend(). This will result in code that is hard to maintain.
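
Something like a symmetric pair of helpers could keep both range checks
in one place (untested sketch; vhost_vdpa_last_dev() is a made-up name,
and vhost_vdpa_first_dev() is assumed to keep its current meaning):

static bool vhost_vdpa_first_dev(struct vhost_dev *dev)
{
    return dev->vq_index == 0;
}

/* hypothetical counterpart, so callers don't open-code the range check */
static bool vhost_vdpa_last_dev(struct vhost_dev *dev)
{
    return dev->vq_index + dev->nvqs == dev->vq_index_end;
}

Then both call sites would read symmetrically instead of mixing a helper
with an open-coded check.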

Thanks


> +        !(dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
> +        return;
> +    }
> +
> +    trace_vhost_vdpa_suspend(dev);
> +    r = ioctl(v->device_fd, VHOST_VDPA_SUSPEND);
> +    if (unlikely(r)) {
> +        error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
> +        /* Not aborting since we're called from stop context */
> +    }
> +}
> +
>   static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>   {
>       struct vhost_vdpa *v = dev->opaque;
> @@ -1122,6 +1140,7 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>           }
>           vhost_vdpa_set_vring_ready(dev);
>       } else {
> +        vhost_vdpa_suspend(dev);
>           vhost_vdpa_svqs_stop(dev);
>           vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
>       }
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index a87c5f39a2..8f8d05cf9b 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -50,6 +50,7 @@ vhost_vdpa_set_vring_ready(void *dev) "dev: %p"
>   vhost_vdpa_dump_config(void *dev, const char *line) "dev: %p %s"
>   vhost_vdpa_set_config(void *dev, uint32_t offset, uint32_t size, uint32_t flags) "dev: %p offset: %"PRIu32" size: %"PRIu32" flags: 0x%"PRIx32
>   vhost_vdpa_get_config(void *dev, void *config, uint32_t config_len) "dev: %p config: %p config_len: %"PRIu32
> +vhost_vdpa_suspend(void *dev) "dev: %p"
>   vhost_vdpa_dev_start(void *dev, bool started) "dev: %p started: %d"
>   vhost_vdpa_set_log_base(void *dev, uint64_t base, unsigned long long size, int refcnt, int fd, void *log) "dev: %p base: 0x%"PRIx64" size: %llu refcnt: %d fd: %d log: %p"
>   vhost_vdpa_set_vring_addr(void *dev, unsigned int index, unsigned int flags, uint64_t desc_user_addr, uint64_t used_user_addr, uint64_t avail_user_addr, uint64_t log_guest_addr) "dev: %p index: %u flags: 0x%x desc_user_addr: 0x%"PRIx64" used_user_addr: 0x%"PRIx64" avail_user_addr: 0x%"PRIx64" log_guest_addr: 0x%"PRIx64


^ permalink raw reply	[flat|nested] 68+ messages in thread


* Re: [PATCH v2 03/13] vdpa: add vhost_vdpa_suspend
  2023-02-21  5:27     ` Jason Wang
@ 2023-02-21  5:33       ` Jason Wang
  -1 siblings, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-21  5:33 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu


On 2023/2/21 13:27, Jason Wang wrote:
>
> On 2023/2/8 17:42, Eugenio Pérez wrote:
>> The function vhost.c:vhost_dev_stop fetches the vring base so the vq
>> state can be migrated to other devices.  However, this is unreliable in
>> vdpa, since we didn't signal the device to suspend the queues, making
>> the value fetched useless.
>>
>> Suspend the device if possible before fetching the first and subsequent
>> vring bases.
>>
>> Moreover, vdpa totally resets and wipes the device at the last device
>> before fetching its vring bases, making that operation useless in the last
>> device. This will be fixed in later patches of this series.
>
>
> It would be better not to introduce a bug first and then fix it in the 
> following patch.
>
>
>>
>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>> ---
>>   hw/virtio/vhost-vdpa.c | 19 +++++++++++++++++++
>>   hw/virtio/trace-events |  1 +
>>   2 files changed, 20 insertions(+)
>>
>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>> index 2e79fbe4b2..cbbe92ffe8 100644
>> --- a/hw/virtio/vhost-vdpa.c
>> +++ b/hw/virtio/vhost-vdpa.c
>> @@ -1108,6 +1108,24 @@ static void vhost_vdpa_svqs_stop(struct 
>> vhost_dev *dev)
>>       }
>>   }
>>   +static void vhost_vdpa_suspend(struct vhost_dev *dev)
>> +{
>> +    struct vhost_vdpa *v = dev->opaque;
>> +    int r;
>> +
>> +    if (!vhost_vdpa_first_dev(dev) ||
>
>
> Any reason we need to use vhost_vdpa_first_dev() instead of replacing the
>
> if (started) {
> } else {
>     vhost_vdpa_reset_device(dev);
>     ....
> }


Ok, I think I kind of understand. So I think we need to re-order the 
patches; at least patch 4 should come before this patch?

Thanks


>
>
> We check
>
> if (dev->vq_index + dev->nvqs != dev->vq_index_end) in 
> vhost_vdpa_dev_start() but vhost_vdpa_first_dev() inside 
> vhost_vdpa_suspend(). This will result in code that is hard to maintain.
>
> Thanks
>
>
>> +        !(dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
>> +        return;
>> +    }
>> +
>> +    trace_vhost_vdpa_suspend(dev);
>> +    r = ioctl(v->device_fd, VHOST_VDPA_SUSPEND);
>> +    if (unlikely(r)) {
>> +        error_report("Cannot suspend: %s(%d)", g_strerror(errno), 
>> errno);
>> +        /* Not aborting since we're called from stop context */
>> +    }
>> +}
>> +
>>   static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>>   {
>>       struct vhost_vdpa *v = dev->opaque;
>> @@ -1122,6 +1140,7 @@ static int vhost_vdpa_dev_start(struct 
>> vhost_dev *dev, bool started)
>>           }
>>           vhost_vdpa_set_vring_ready(dev);
>>       } else {
>> +        vhost_vdpa_suspend(dev);
>>           vhost_vdpa_svqs_stop(dev);
>>           vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
>>       }
>> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
>> index a87c5f39a2..8f8d05cf9b 100644
>> --- a/hw/virtio/trace-events
>> +++ b/hw/virtio/trace-events
>> @@ -50,6 +50,7 @@ vhost_vdpa_set_vring_ready(void *dev) "dev: %p"
>>   vhost_vdpa_dump_config(void *dev, const char *line) "dev: %p %s"
>>   vhost_vdpa_set_config(void *dev, uint32_t offset, uint32_t size, 
>> uint32_t flags) "dev: %p offset: %"PRIu32" size: %"PRIu32" flags: 
>> 0x%"PRIx32
>>   vhost_vdpa_get_config(void *dev, void *config, uint32_t config_len) 
>> "dev: %p config: %p config_len: %"PRIu32
>> +vhost_vdpa_suspend(void *dev) "dev: %p"
>>   vhost_vdpa_dev_start(void *dev, bool started) "dev: %p started: %d"
>>   vhost_vdpa_set_log_base(void *dev, uint64_t base, unsigned long 
>> long size, int refcnt, int fd, void *log) "dev: %p base: 0x%"PRIx64" 
>> size: %llu refcnt: %d fd: %d log: %p"
>>   vhost_vdpa_set_vring_addr(void *dev, unsigned int index, unsigned 
>> int flags, uint64_t desc_user_addr, uint64_t used_user_addr, uint64_t 
>> avail_user_addr, uint64_t log_guest_addr) "dev: %p index: %u flags: 
>> 0x%x desc_user_addr: 0x%"PRIx64" used_user_addr: 0x%"PRIx64" 
>> avail_user_addr: 0x%"PRIx64" log_guest_addr: 0x%"PRIx64



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 04/13] vdpa: move vhost reset after get vring base
  2023-02-08  9:42 ` [PATCH v2 04/13] vdpa: move vhost reset after get vring base Eugenio Pérez
@ 2023-02-21  5:36     ` Jason Wang
  0 siblings, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-21  5:36 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Laurent Vivier, Parav Pandit, Cindy Lu, Michael S. Tsirkin,
	Gautam Dawar, virtualization, Harpreet Singh Anand, Lei Yang,
	Stefan Hajnoczi, Eli Cohen, longpeng2, Shannon Nelson,
	Liuxiangdong


On 2023/2/8 17:42, Eugenio Pérez wrote:
> The function vhost.c:vhost_dev_stop calls vhost operation
> vhost_dev_start(false). In the case of vdpa, it totally resets and wipes
> the device, making the fetching of the vring base (virtqueue state)
> useless.
>
> The kernel backend does not use the vhost_dev_start vhost op callback, but
> vhost-user does. A patch to make vhost_user_dev_start more similar to vdpa
> is desirable, but it can be added on top.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   include/hw/virtio/vhost-backend.h |  4 ++++
>   hw/virtio/vhost-vdpa.c            | 22 ++++++++++++++++------
>   hw/virtio/vhost.c                 |  3 +++
>   3 files changed, 23 insertions(+), 6 deletions(-)
>
> diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> index c5ab49051e..ec3fbae58d 100644
> --- a/include/hw/virtio/vhost-backend.h
> +++ b/include/hw/virtio/vhost-backend.h
> @@ -130,6 +130,9 @@ typedef bool (*vhost_force_iommu_op)(struct vhost_dev *dev);
>   
>   typedef int (*vhost_set_config_call_op)(struct vhost_dev *dev,
>                                          int fd);
> +
> +typedef void (*vhost_reset_status_op)(struct vhost_dev *dev);
> +
>   typedef struct VhostOps {
>       VhostBackendType backend_type;
>       vhost_backend_init vhost_backend_init;
> @@ -177,6 +180,7 @@ typedef struct VhostOps {
>       vhost_get_device_id_op vhost_get_device_id;
>       vhost_force_iommu_op vhost_force_iommu;
>       vhost_set_config_call_op vhost_set_config_call;
> +    vhost_reset_status_op vhost_reset_status;
>   } VhostOps;
>   
>   int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index cbbe92ffe8..26e38a6aab 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -1152,14 +1152,23 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>       if (started) {
>           memory_listener_register(&v->listener, &address_space_memory);
>           return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> -    } else {
> -        vhost_vdpa_reset_device(dev);
> -        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> -                                   VIRTIO_CONFIG_S_DRIVER);
> -        memory_listener_unregister(&v->listener);
> +    }
>   
> -        return 0;
> +    return 0;
> +}
> +
> +static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> +{
> +    struct vhost_vdpa *v = dev->opaque;
> +
> +    if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
> +        return;
>       }
> +
> +    vhost_vdpa_reset_device(dev);
> +    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> +                                VIRTIO_CONFIG_S_DRIVER);
> +    memory_listener_unregister(&v->listener);
>   }
>   
>   static int vhost_vdpa_set_log_base(struct vhost_dev *dev, uint64_t base,
> @@ -1346,4 +1355,5 @@ const VhostOps vdpa_ops = {
>           .vhost_vq_get_addr = vhost_vdpa_vq_get_addr,
>           .vhost_force_iommu = vhost_vdpa_force_iommu,
>           .vhost_set_config_call = vhost_vdpa_set_config_call,
> +        .vhost_reset_status = vhost_vdpa_reset_status,
>   };
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index eb8c4c378c..a266396576 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -2049,6 +2049,9 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>                                hdev->vqs + i,
>                                hdev->vq_index + i);
>       }
> +    if (hdev->vhost_ops->vhost_reset_status) {
> +        hdev->vhost_ops->vhost_reset_status(hdev);
> +    }


This looks racy: if we don't suspend/reset the device, the device can move 
last_avail_idx even after get_vring_base()?

Instead of doing things like this, should we fall back to 
virtio_queue_restore_last_avail_idx() in this case?
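For reference, a fallback of that shape already exists when fetching the
base fails; an abridged sketch of vhost_virtqueue_stop() in vhost.c, quoted
from memory, so treat the exact lines as approximate:

    r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
    if (r < 0) {
        VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
        /*
         * Connection to the backend is broken, so let's sync internal
         * last avail idx to the device used idx.
         */
        virtio_queue_restore_last_avail_idx(vdev, idx);
    } else {
        virtio_queue_set_last_avail_idx(vdev, idx, state.num);
    }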

Thanks


>   
>       if (vhost_dev_has_iommu(hdev)) {
>           if (hdev->vhost_ops->vhost_set_iotlb_callback) {


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 05/13] vdpa: rewind at get_base, not set_base
  2023-02-08  9:42 ` [PATCH v2 05/13] vdpa: rewind at get_base, not set_base Eugenio Pérez
@ 2023-02-21  5:40     ` Jason Wang
  0 siblings, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-21  5:40 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu


On 2023/2/8 17:42, Eugenio Pérez wrote:
> At this moment it is only possible to migrate to a vdpa device running
> with x-svq=on. As a protective measure, the rewind of the inflight
> descriptors was done at the destination. That way, if the source sent a
> virtqueue with in-use descriptors, they are always discarded.
>
> Since this series also allows migrating to passthrough devices with no
> SVQ, the right thing to do is to rewind at the source so the bases of the
> vrings are correct.
>
> Support for inflight descriptors may be added in the future.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>


Acked-by: Jason Wang <jasowang@redhat.com>

Thanks


> ---
>   hw/virtio/vhost-vdpa.c | 24 +++++++++++++-----------
>   1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 26e38a6aab..d99db0bd03 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -1211,18 +1211,7 @@ static int vhost_vdpa_set_vring_base(struct vhost_dev *dev,
>                                          struct vhost_vring_state *ring)
>   {
>       struct vhost_vdpa *v = dev->opaque;
> -    VirtQueue *vq = virtio_get_queue(dev->vdev, ring->index);
>   
> -    /*
> -     * vhost-vdpa devices do not support in-flight requests. Set all of them
> -     * as available.
> -     *
> -     * TODO: This is ok for networking, but other kinds of devices might
> -     * have problems with these retransmissions.
> -     */
> -    while (virtqueue_rewind(vq, 1)) {
> -        continue;
> -    }
>       if (v->shadow_vqs_enabled) {
>           /*
>            * Device vring base was set at device start. SVQ base is handled by
> @@ -1241,6 +1230,19 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev,
>       int ret;
>   
>       if (v->shadow_vqs_enabled) {
> +        VirtQueue *vq = virtio_get_queue(dev->vdev, ring->index);
> +
> +        /*
> +         * vhost-vdpa devices do not support in-flight requests. Set all of
> +         * them as available.
> +         *
> +         * TODO: This is ok for networking, but other kinds of devices might
> +         * have problems with these retransmissions.
> +         */
> +        while (virtqueue_rewind(vq, 1)) {
> +            continue;
> +        }
> +
>           ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index);
>           return 0;
>       }



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 03/13] vdpa: add vhost_vdpa_suspend
  2023-02-21  5:33       ` Jason Wang
  (?)
@ 2023-02-21  7:05       ` Eugenio Perez Martin
  -1 siblings, 0 replies; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-21  7:05 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu

On Tue, Feb 21, 2023 at 6:33 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2023/2/21 13:27, Jason Wang wrote:
> >
> > On 2023/2/8 17:42, Eugenio Pérez wrote:
> >> The function vhost.c:vhost_dev_stop fetches the vring base so the vq
> >> state can be migrated to other devices.  However, this is unreliable in
> >> vdpa, since we didn't signal the device to suspend the queues, making
> >> the value fetched useless.
> >>
> >> Suspend the device if possible before fetching the first and subsequent
> >> vring bases.
> >>
> >> Moreover, vdpa totally resets and wipes the device at the last device
> >> before fetching its vring bases, making that operation useless in the last
> >> device. This will be fixed in later patches of this series.
> >
> >
> > It would be better not to introduce a bug first and then fix it in the
> > following patch.
> >
> >
> >>
> >> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >> ---
> >>   hw/virtio/vhost-vdpa.c | 19 +++++++++++++++++++
> >>   hw/virtio/trace-events |  1 +
> >>   2 files changed, 20 insertions(+)
> >>
> >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >> index 2e79fbe4b2..cbbe92ffe8 100644
> >> --- a/hw/virtio/vhost-vdpa.c
> >> +++ b/hw/virtio/vhost-vdpa.c
> >> @@ -1108,6 +1108,24 @@ static void vhost_vdpa_svqs_stop(struct
> >> vhost_dev *dev)
> >>       }
> >>   }
> >>   +static void vhost_vdpa_suspend(struct vhost_dev *dev)
> >> +{
> >> +    struct vhost_vdpa *v = dev->opaque;
> >> +    int r;
> >> +
> >> +    if (!vhost_vdpa_first_dev(dev) ||
> >
> >
> > Any reason we need to use vhost_vdpa_first_dev() instead of replacing the
> >
> > if (started) {
> > } else {
> >     vhost_vdpa_reset_device(dev);
> >     ....
> > }

I can also move the check to vhost_vdpa_dev_start, for sure.

>
> Ok, I think I kind of understand. So I think we need to re-order the
> patches; at least patch 4 should come before this patch?
>

I think it is doable, yes. I'll check and come back to you.

Thanks!

> Thanks
>
>
> >
> >
> > We check
> >
> > if (dev->vq_index + dev->nvqs != dev->vq_index_end) in
> > vhost_vdpa_dev_start() but vhost_vdpa_first_dev() inside
> > vhost_vdpa_suspend(). This will result in code that is hard to maintain.
> >
> > Thanks
> >
> >
> >> +        !(dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
> >> +        return;
> >> +    }
> >> +
> >> +    trace_vhost_vdpa_suspend(dev);
> >> +    r = ioctl(v->device_fd, VHOST_VDPA_SUSPEND);
> >> +    if (unlikely(r)) {
> >> +        error_report("Cannot suspend: %s(%d)", g_strerror(errno),
> >> errno);
> >> +        /* Not aborting since we're called from stop context */
> >> +    }
> >> +}
> >> +
> >>   static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> >>   {
> >>       struct vhost_vdpa *v = dev->opaque;
> >> @@ -1122,6 +1140,7 @@ static int vhost_vdpa_dev_start(struct
> >> vhost_dev *dev, bool started)
> >>           }
> >>           vhost_vdpa_set_vring_ready(dev);
> >>       } else {
> >> +        vhost_vdpa_suspend(dev);
> >>           vhost_vdpa_svqs_stop(dev);
> >>           vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> >>       }
> >> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> >> index a87c5f39a2..8f8d05cf9b 100644
> >> --- a/hw/virtio/trace-events
> >> +++ b/hw/virtio/trace-events
> >> @@ -50,6 +50,7 @@ vhost_vdpa_set_vring_ready(void *dev) "dev: %p"
> >>   vhost_vdpa_dump_config(void *dev, const char *line) "dev: %p %s"
> >>   vhost_vdpa_set_config(void *dev, uint32_t offset, uint32_t size,
> >> uint32_t flags) "dev: %p offset: %"PRIu32" size: %"PRIu32" flags:
> >> 0x%"PRIx32
> >>   vhost_vdpa_get_config(void *dev, void *config, uint32_t config_len)
> >> "dev: %p config: %p config_len: %"PRIu32
> >> +vhost_vdpa_suspend(void *dev) "dev: %p"
> >>   vhost_vdpa_dev_start(void *dev, bool started) "dev: %p started: %d"
> >>   vhost_vdpa_set_log_base(void *dev, uint64_t base, unsigned long
> >> long size, int refcnt, int fd, void *log) "dev: %p base: 0x%"PRIx64"
> >> size: %llu refcnt: %d fd: %d log: %p"
> >>   vhost_vdpa_set_vring_addr(void *dev, unsigned int index, unsigned
> >> int flags, uint64_t desc_user_addr, uint64_t used_user_addr, uint64_t
> >> avail_user_addr, uint64_t log_guest_addr) "dev: %p index: %u flags:
> >> 0x%x desc_user_addr: 0x%"PRIx64" used_user_addr: 0x%"PRIx64"
> >> avail_user_addr: 0x%"PRIx64" log_guest_addr: 0x%"PRIx64
>



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 04/13] vdpa: move vhost reset after get vring base
  2023-02-21  5:36     ` Jason Wang
  (?)
@ 2023-02-21  7:07     ` Eugenio Perez Martin
  2023-02-22  3:43         ` Jason Wang
  -1 siblings, 1 reply; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-21  7:07 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu

On Tue, Feb 21, 2023 at 6:36 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2023/2/8 17:42, Eugenio Pérez wrote:
> > The function vhost.c:vhost_dev_stop calls vhost operation
> > vhost_dev_start(false). In the case of vdpa, it totally resets and wipes
> > the device, making the fetching of the vring base (virtqueue state)
> > useless.
> >
> > The kernel backend does not use the vhost_dev_start vhost op callback, but
> > vhost-user does. A patch to make vhost_user_dev_start more similar to vdpa
> > is desirable, but it can be added on top.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >   include/hw/virtio/vhost-backend.h |  4 ++++
> >   hw/virtio/vhost-vdpa.c            | 22 ++++++++++++++++------
> >   hw/virtio/vhost.c                 |  3 +++
> >   3 files changed, 23 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> > index c5ab49051e..ec3fbae58d 100644
> > --- a/include/hw/virtio/vhost-backend.h
> > +++ b/include/hw/virtio/vhost-backend.h
> > @@ -130,6 +130,9 @@ typedef bool (*vhost_force_iommu_op)(struct vhost_dev *dev);
> >
> >   typedef int (*vhost_set_config_call_op)(struct vhost_dev *dev,
> >                                          int fd);
> > +
> > +typedef void (*vhost_reset_status_op)(struct vhost_dev *dev);
> > +
> >   typedef struct VhostOps {
> >       VhostBackendType backend_type;
> >       vhost_backend_init vhost_backend_init;
> > @@ -177,6 +180,7 @@ typedef struct VhostOps {
> >       vhost_get_device_id_op vhost_get_device_id;
> >       vhost_force_iommu_op vhost_force_iommu;
> >       vhost_set_config_call_op vhost_set_config_call;
> > +    vhost_reset_status_op vhost_reset_status;
> >   } VhostOps;
> >
> >   int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index cbbe92ffe8..26e38a6aab 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -1152,14 +1152,23 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> >       if (started) {
> >           memory_listener_register(&v->listener, &address_space_memory);
> >           return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> > -    } else {
> > -        vhost_vdpa_reset_device(dev);
> > -        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > -                                   VIRTIO_CONFIG_S_DRIVER);
> > -        memory_listener_unregister(&v->listener);
> > +    }
> >
> > -        return 0;
> > +    return 0;
> > +}
> > +
> > +static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> > +{
> > +    struct vhost_vdpa *v = dev->opaque;
> > +
> > +    if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
> > +        return;
> >       }
> > +
> > +    vhost_vdpa_reset_device(dev);
> > +    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > +                                VIRTIO_CONFIG_S_DRIVER);
> > +    memory_listener_unregister(&v->listener);
> >   }
> >
> >   static int vhost_vdpa_set_log_base(struct vhost_dev *dev, uint64_t base,
> > @@ -1346,4 +1355,5 @@ const VhostOps vdpa_ops = {
> >           .vhost_vq_get_addr = vhost_vdpa_vq_get_addr,
> >           .vhost_force_iommu = vhost_vdpa_force_iommu,
> >           .vhost_set_config_call = vhost_vdpa_set_config_call,
> > +        .vhost_reset_status = vhost_vdpa_reset_status,
> >   };
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index eb8c4c378c..a266396576 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -2049,6 +2049,9 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
> >                                hdev->vqs + i,
> >                                hdev->vq_index + i);
> >       }
> > +    if (hdev->vhost_ops->vhost_reset_status) {
> > +        hdev->vhost_ops->vhost_reset_status(hdev);
> > +    }
>
>
> This looks racy: if we don't suspend/reset the device, the device can move
> last_avail_idx even after get_vring_base()?
>
> Instead of doing things like this, should we fall back to
> virtio_queue_restore_last_avail_idx() in this case?
>

Right, we can track whether the device is suspended / in SVQ mode and then
return an error from vring_get_base if it is not. Would that work?
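A minimal sketch of that idea (the "suspended" field and the error path
here are hypothetical, not in this version of the series):

    static int vhost_vdpa_get_vring_base(struct vhost_dev *dev,
                                         struct vhost_vring_state *ring)
    {
        struct vhost_vdpa *v = dev->opaque;

        if (v->shadow_vqs_enabled) {
            /* SVQ owns the state, so QEMU's view is already reliable */
            ring->num = virtio_queue_get_last_avail_idx(dev->vdev,
                                                        ring->index);
            return 0;
        }

        if (!v->suspended) {
            /* A running device can keep moving last_avail_idx under us */
            return -EINVAL;
        }

        return vhost_vdpa_call(dev, VHOST_GET_VRING_BASE, ring);
    }

Here "suspended" would be set once the VHOST_VDPA_SUSPEND ioctl succeeds
and cleared again on device reset.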

Thanks!

> Thanks
>
>
> >
> >       if (vhost_dev_has_iommu(hdev)) {
> >           if (hdev->vhost_ops->vhost_set_iotlb_callback) {
>



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 04/13] vdpa: move vhost reset after get vring base
  2023-02-21  7:07     ` Eugenio Perez Martin
@ 2023-02-22  3:43         ` Jason Wang
  0 siblings, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-22  3:43 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Laurent Vivier, Parav Pandit, Cindy Lu, Michael S. Tsirkin,
	qemu-devel, Gautam Dawar, virtualization, Harpreet Singh Anand,
	Lei Yang, Stefan Hajnoczi, Eli Cohen, longpeng2, Shannon Nelson,
	Liuxiangdong

On Tue, Feb 21, 2023 at 3:08 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Tue, Feb 21, 2023 at 6:36 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> > On 2023/2/8 17:42, Eugenio Pérez wrote:
> > > The function vhost.c:vhost_dev_stop calls vhost operation
> > > vhost_dev_start(false). In the case of vdpa, it totally resets and wipes
> > > the device, making the fetching of the vring base (virtqueue state)
> > > useless.
> > >
> > > The kernel backend does not use the vhost_dev_start vhost op callback, but
> > > vhost-user does. A patch to make vhost_user_dev_start more similar to vdpa
> > > is desirable, but it can be added on top.
> > >
> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > ---
> > >   include/hw/virtio/vhost-backend.h |  4 ++++
> > >   hw/virtio/vhost-vdpa.c            | 22 ++++++++++++++++------
> > >   hw/virtio/vhost.c                 |  3 +++
> > >   3 files changed, 23 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> > > index c5ab49051e..ec3fbae58d 100644
> > > --- a/include/hw/virtio/vhost-backend.h
> > > +++ b/include/hw/virtio/vhost-backend.h
> > > @@ -130,6 +130,9 @@ typedef bool (*vhost_force_iommu_op)(struct vhost_dev *dev);
> > >
> > >   typedef int (*vhost_set_config_call_op)(struct vhost_dev *dev,
> > >                                          int fd);
> > > +
> > > +typedef void (*vhost_reset_status_op)(struct vhost_dev *dev);
> > > +
> > >   typedef struct VhostOps {
> > >       VhostBackendType backend_type;
> > >       vhost_backend_init vhost_backend_init;
> > > @@ -177,6 +180,7 @@ typedef struct VhostOps {
> > >       vhost_get_device_id_op vhost_get_device_id;
> > >       vhost_force_iommu_op vhost_force_iommu;
> > >       vhost_set_config_call_op vhost_set_config_call;
> > > +    vhost_reset_status_op vhost_reset_status;
> > >   } VhostOps;
> > >
> > >   int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
> > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > index cbbe92ffe8..26e38a6aab 100644
> > > --- a/hw/virtio/vhost-vdpa.c
> > > +++ b/hw/virtio/vhost-vdpa.c
> > > @@ -1152,14 +1152,23 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> > >       if (started) {
> > >           memory_listener_register(&v->listener, &address_space_memory);
> > >           return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> > > -    } else {
> > > -        vhost_vdpa_reset_device(dev);
> > > -        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > > -                                   VIRTIO_CONFIG_S_DRIVER);
> > > -        memory_listener_unregister(&v->listener);
> > > +    }
> > >
> > > -        return 0;
> > > +    return 0;
> > > +}
> > > +
> > > +static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> > > +{
> > > +    struct vhost_vdpa *v = dev->opaque;
> > > +
> > > +    if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
> > > +        return;
> > >       }
> > > +
> > > +    vhost_vdpa_reset_device(dev);
> > > +    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > > +                                VIRTIO_CONFIG_S_DRIVER);
> > > +    memory_listener_unregister(&v->listener);
> > >   }
> > >
> > >   static int vhost_vdpa_set_log_base(struct vhost_dev *dev, uint64_t base,
> > > @@ -1346,4 +1355,5 @@ const VhostOps vdpa_ops = {
> > >           .vhost_vq_get_addr = vhost_vdpa_vq_get_addr,
> > >           .vhost_force_iommu = vhost_vdpa_force_iommu,
> > >           .vhost_set_config_call = vhost_vdpa_set_config_call,
> > > +        .vhost_reset_status = vhost_vdpa_reset_status,
> > >   };
> > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > index eb8c4c378c..a266396576 100644
> > > --- a/hw/virtio/vhost.c
> > > +++ b/hw/virtio/vhost.c
> > > @@ -2049,6 +2049,9 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
> > >                                hdev->vqs + i,
> > >                                hdev->vq_index + i);
> > >       }
> > > +    if (hdev->vhost_ops->vhost_reset_status) {
> > > +        hdev->vhost_ops->vhost_reset_status(hdev);
> > > +    }
> >
> >
> > This looks racy: if we don't suspend/reset the device, the device can move
> > last_avail_idx even after get_vring_base()?
> >
> > Instead of doing things like this, should we fall back to
> > virtio_queue_restore_last_avail_idx() in this case?
> >
>
> Right, we can track whether the device is suspended / in SVQ mode and then
> return an error from vring_get_base if it is not. Would that work?

When we don't support suspend, yes.

Thanks

>
> Thanks!
>
> > Thanks
> >
> >
> > >
> > >       if (vhost_dev_has_iommu(hdev)) {
> > >           if (hdev->vhost_ops->vhost_set_iotlb_callback) {
> >
>
>


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 07/13] vdpa: add vdpa net migration state notifier
  2023-02-08  9:42 ` [PATCH v2 07/13] vdpa: add vdpa net migration state notifier Eugenio Pérez
@ 2023-02-22  3:55     ` Jason Wang
  2023-02-22  3:55     ` Jason Wang
  1 sibling, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-22  3:55 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Laurent Vivier, Parav Pandit, Cindy Lu, Michael S. Tsirkin,
	Gautam Dawar, virtualization, Harpreet Singh Anand, Lei Yang,
	Stefan Hajnoczi, Eli Cohen, longpeng2, Shannon Nelson,
	Liuxiangdong


On 2023/2/8 17:42, Eugenio Pérez wrote:
> This allows net to restart the device backend to configure SVQ on it.
>
> Ideally, these changes should not be net specific. However, the vdpa net
> backend is the one with enough knowledge to configure everything, for
> several reasons:
> * Queues might need to be shadowed or not depending on their kind (control
>    vs data).
> * Queues need to share the same map translations (iova tree).
>
> Because of that, it is cleaner to restart the whole net backend and
> configure it again as expected, similar to how vhost-kernel moves between
> userspace and passthrough.
>
> If more kinds of devices need dynamic switching to SVQ we can create a
> callback struct like VhostOps and move most of the code there.
> VhostOps cannot be reused since all vdpa backends share them, and to
> personalize just for networking would be too heavy.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> v3:
> * Add TODO to use the resume operation in the future.
> * Use migration_in_setup and migration_has_failed instead of a
>    complicated switch case.
> ---
>   net/vhost-vdpa.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 76 insertions(+)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index dd686b4514..bca13f97fd 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -26,12 +26,14 @@
>   #include <err.h>
>   #include "standard-headers/linux/virtio_net.h"
>   #include "monitor/monitor.h"
> +#include "migration/misc.h"
>   #include "hw/virtio/vhost.h"
>   
>   /* Todo:need to add the multiqueue support here */
>   typedef struct VhostVDPAState {
>       NetClientState nc;
>       struct vhost_vdpa vhost_vdpa;
> +    Notifier migration_state;
>       VHostNetState *vhost_net;
>   
>       /* Control commands shadow buffers */
> @@ -241,10 +243,79 @@ static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
>       return DO_UPCAST(VhostVDPAState, nc, nc0);
>   }
>   
> +static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
> +{
> +    struct vhost_vdpa *v = &s->vhost_vdpa;
> +    VirtIONet *n;
> +    VirtIODevice *vdev;
> +    int data_queue_pairs, cvq, r;
> +    NetClientState *peer;
> +
> +    /* We are only called on the first data vq and only if x-svq is not set */
> +    if (s->vhost_vdpa.shadow_vqs_enabled == enable) {
> +        return;
> +    }
> +
> +    vdev = v->dev->vdev;
> +    n = VIRTIO_NET(vdev);


Let's tweak the code to move those initializations to the beginning of 
the function.
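Something along these lines (a sketch of the suggested tweak only; the
rest of the function body stays as in the patch):

    static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s,
                                                 bool enable)
    {
        struct vhost_vdpa *v = &s->vhost_vdpa;
        VirtIODevice *vdev = v->dev->vdev;
        VirtIONet *n = VIRTIO_NET(vdev);

        /* Only called on the first data vq and only if x-svq is not set */
        if (v->shadow_vqs_enabled == enable || !n->vhost_started) {
            return;
        }
        /* ... rest unchanged ... */
    }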


> +    if (!n->vhost_started) {
> +        return;
> +    }


What happens if the vhost is started during live migration?


> +
> +    data_queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
> +    cvq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ) ?
> +                                  n->max_ncs - n->max_queue_pairs : 0;
> +    /*
> +     * TODO: vhost_net_stop does suspend, get_base and reset. We can be smarter
> +     * in the future and resume the device if read-only operations between
> +     * suspend and reset go wrong.
> +     */
> +    vhost_net_stop(vdev, n->nic->ncs, data_queue_pairs, cvq);
> +
> +    peer = s->nc.peer;
> +    for (int i = 0; i < data_queue_pairs + cvq; i++) {
> +        VhostVDPAState *vdpa_state;
> +        NetClientState *nc;
> +
> +        if (i < data_queue_pairs) {
> +            nc = qemu_get_peer(peer, i);
> +        } else {
> +            nc = qemu_get_peer(peer, n->max_queue_pairs);
> +        }
> +
> +        vdpa_state = DO_UPCAST(VhostVDPAState, nc, nc);
> +        vdpa_state->vhost_vdpa.shadow_data = enable;
> +
> +        if (i < data_queue_pairs) {
> +            /* Do not override CVQ shadow_vqs_enabled */
> +            vdpa_state->vhost_vdpa.shadow_vqs_enabled = enable;
> +        }


I wonder what happens if the number of queue pairs is changed during 
live migration? Should we assign all qps in this case?
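If so, a sketch of what assigning all of them could look like
(hypothetical, reusing the fields from the patch above):

    /* Cover every possible queue pair, not only the currently active ones */
    for (int i = 0; i < n->max_queue_pairs; i++) {
        VhostVDPAState *vdpa_state =
            DO_UPCAST(VhostVDPAState, nc, qemu_get_peer(peer, i));

        vdpa_state->vhost_vdpa.shadow_data = enable;
        vdpa_state->vhost_vdpa.shadow_vqs_enabled = enable;
    }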

Thanks


> +    }
> +
> +    r = vhost_net_start(vdev, n->nic->ncs, data_queue_pairs, cvq);
> +    if (unlikely(r < 0)) {
> +        error_report("unable to start vhost net: %s(%d)", g_strerror(-r), -r);
> +    }
> +}
> +
> +static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data)
> +{
> +    MigrationState *migration = data;
> +    VhostVDPAState *s = container_of(notifier, VhostVDPAState,
> +                                     migration_state);
> +
> +    if (migration_in_setup(migration)) {
> +        vhost_vdpa_net_log_global_enable(s, true);
> +    } else if (migration_has_failed(migration)) {
> +        vhost_vdpa_net_log_global_enable(s, false);
> +    }
> +}
> +
>   static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
>   {
>       struct vhost_vdpa *v = &s->vhost_vdpa;
>   
> +    add_migration_state_change_notifier(&s->migration_state);
>       if (v->shadow_vqs_enabled) {
>           v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
>                                              v->iova_range.last);
> @@ -278,6 +349,10 @@ static void vhost_vdpa_net_client_stop(NetClientState *nc)
>   
>       assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>   
> +    if (s->vhost_vdpa.index == 0) {
> +        remove_migration_state_change_notifier(&s->migration_state);
> +    }
> +
>       dev = s->vhost_vdpa.dev;
>       if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
>           g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> @@ -741,6 +816,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>       s->vhost_vdpa.device_fd = vdpa_device_fd;
>       s->vhost_vdpa.index = queue_pair_index;
>       s->always_svq = svq;
> +    s->migration_state.notify = vdpa_net_migration_state_notifier;
>       s->vhost_vdpa.shadow_vqs_enabled = svq;
>       s->vhost_vdpa.iova_range = iova_range;
>       s->vhost_vdpa.shadow_data = svq;


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 07/13] vdpa: add vdpa net migration state notifier
@ 2023-02-22  3:55     ` Jason Wang
  0 siblings, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-22  3:55 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu


在 2023/2/8 17:42, Eugenio Pérez 写道:
> This allows net to restart the device backend to configure SVQ on it.
>
> Ideally, these changes should not be net specific. However, the vdpa net
> backend is the one with enough knowledge to configure everything because
> of some reasons:
> * Queues might need to be shadowed or not depending on its kind (control
>    vs data).
> * Queues need to share the same map translations (iova tree).
>
> Because of that it is cleaner to restart the whole net backend and
> configure again as expected, similar to how vhost-kernel moves between
> userspace and passthrough.
>
> If more kinds of devices need dynamic switching to SVQ we can create a
> callback struct like VhostOps and move most of the code there.
> VhostOps cannot be reused since all vdpa backend share them, and to
> personalize just for networking would be too heavy.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> v3:
> * Add TODO to use the resume operation in the future.
> * Use migration_in_setup and migration_has_failed instead of a
>    complicated switch case.
> ---
>   net/vhost-vdpa.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 76 insertions(+)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index dd686b4514..bca13f97fd 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -26,12 +26,14 @@
>   #include <err.h>
>   #include "standard-headers/linux/virtio_net.h"
>   #include "monitor/monitor.h"
> +#include "migration/misc.h"
>   #include "hw/virtio/vhost.h"
>   
>   /* Todo:need to add the multiqueue support here */
>   typedef struct VhostVDPAState {
>       NetClientState nc;
>       struct vhost_vdpa vhost_vdpa;
> +    Notifier migration_state;
>       VHostNetState *vhost_net;
>   
>       /* Control commands shadow buffers */
> @@ -241,10 +243,79 @@ static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
>       return DO_UPCAST(VhostVDPAState, nc, nc0);
>   }
>   
> +static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
> +{
> +    struct vhost_vdpa *v = &s->vhost_vdpa;
> +    VirtIONet *n;
> +    VirtIODevice *vdev;
> +    int data_queue_pairs, cvq, r;
> +    NetClientState *peer;
> +
> +    /* We are only called on the first data vqs and only if x-svq is not set */
> +    if (s->vhost_vdpa.shadow_vqs_enabled == enable) {
> +        return;
> +    }
> +
> +    vdev = v->dev->vdev;
> +    n = VIRTIO_NET(vdev);


Let's tweak the code to move those initialization to the beginning of 
the function.


> +    if (!n->vhost_started) {
> +        return;
> +    }


What happens if the vhost is started during the live migration?


> +
> +    data_queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
> +    cvq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ) ?
> +                                  n->max_ncs - n->max_queue_pairs : 0;
> +    /*
> +     * TODO: vhost_net_stop does suspend, get_base and reset. We can be smarter
> +     * in the future and resume the device if read-only operations between
> +     * suspend and reset goes wrong.
> +     */
> +    vhost_net_stop(vdev, n->nic->ncs, data_queue_pairs, cvq);
> +
> +    peer = s->nc.peer;
> +    for (int i = 0; i < data_queue_pairs + cvq; i++) {
> +        VhostVDPAState *vdpa_state;
> +        NetClientState *nc;
> +
> +        if (i < data_queue_pairs) {
> +            nc = qemu_get_peer(peer, i);
> +        } else {
> +            nc = qemu_get_peer(peer, n->max_queue_pairs);
> +        }
> +
> +        vdpa_state = DO_UPCAST(VhostVDPAState, nc, nc);
> +        vdpa_state->vhost_vdpa.shadow_data = enable;
> +
> +        if (i < data_queue_pairs) {
> +            /* Do not override CVQ shadow_vqs_enabled */
> +            vdpa_state->vhost_vdpa.shadow_vqs_enabled = enable;
> +        }


I wonder what happens if the number of queue pairs is changed during 
live migration? Should we assign all qps in this case?

Thanks


> +    }
> +
> +    r = vhost_net_start(vdev, n->nic->ncs, data_queue_pairs, cvq);
> +    if (unlikely(r < 0)) {
> +        error_report("unable to start vhost net: %s(%d)", g_strerror(-r), -r);
> +    }
> +}
> +
> +static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data)
> +{
> +    MigrationState *migration = data;
> +    VhostVDPAState *s = container_of(notifier, VhostVDPAState,
> +                                     migration_state);
> +
> +    if (migration_in_setup(migration)) {
> +        vhost_vdpa_net_log_global_enable(s, true);
> +    } else if (migration_has_failed(migration)) {
> +        vhost_vdpa_net_log_global_enable(s, false);
> +    }
> +}
> +
>   static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
>   {
>       struct vhost_vdpa *v = &s->vhost_vdpa;
>   
> +    add_migration_state_change_notifier(&s->migration_state);
>       if (v->shadow_vqs_enabled) {
>           v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
>                                              v->iova_range.last);
> @@ -278,6 +349,10 @@ static void vhost_vdpa_net_client_stop(NetClientState *nc)
>   
>       assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>   
> +    if (s->vhost_vdpa.index == 0) {
> +        remove_migration_state_change_notifier(&s->migration_state);
> +    }
> +
>       dev = s->vhost_vdpa.dev;
>       if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
>           g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> @@ -741,6 +816,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>       s->vhost_vdpa.device_fd = vdpa_device_fd;
>       s->vhost_vdpa.index = queue_pair_index;
>       s->always_svq = svq;
> +    s->migration_state.notify = vdpa_net_migration_state_notifier;
>       s->vhost_vdpa.shadow_vqs_enabled = svq;
>       s->vhost_vdpa.iova_range = iova_range;
>       s->vhost_vdpa.shadow_data = svq;




* Re: [PATCH v2 09/13] vdpa net: block migration if the device has CVQ
  2023-02-08  9:42 ` [PATCH v2 09/13] vdpa net: block migration if the device has CVQ Eugenio Pérez
@ 2023-02-22  4:00     ` Jason Wang
  1 sibling, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-22  4:00 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Laurent Vivier, Parav Pandit, Cindy Lu, Michael S. Tsirkin,
	Gautam Dawar, virtualization, Harpreet Singh Anand, Lei Yang,
	Stefan Hajnoczi, Eli Cohen, longpeng2, Shannon Nelson,
	Liuxiangdong


On 2023/2/8 17:42, Eugenio Pérez wrote:
> Devices with CVQ needs to migrate state beyond vq state.  Leaving this
> to future series.


I may have missed something, but what is missing to support CVQ/MQ?

Thanks


>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   net/vhost-vdpa.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index bca13f97fd..309861e56c 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -955,11 +955,17 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>       }
>   
>       if (has_cvq) {
> +        VhostVDPAState *s;
> +
>           nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>                                    vdpa_device_fd, i, 1, false,
>                                    opts->x_svq, iova_range);
>           if (!nc)
>               goto err;
> +
> +        s = DO_UPCAST(VhostVDPAState, nc, nc);
> +        error_setg(&s->vhost_vdpa.dev->migration_blocker,
> +                   "net vdpa cannot migrate with MQ feature");
>       }
>   
>       return 0;



* Re: [PATCH v2 11/13] vdpa: block migration if dev does not have _F_SUSPEND
  2023-02-08  9:42 ` [PATCH v2 11/13] vdpa: block migration if dev does not have _F_SUSPEND Eugenio Pérez
@ 2023-02-22  4:05     ` Jason Wang
  0 siblings, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-22  4:05 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu


On 2023/2/8 17:42, Eugenio Pérez wrote:
> Next patches enable devices to be migrated even if vdpa netdev has not
> been started with x-svq. However, not all devices are migratable, so we
> need to block migration if we detect that.
>
> Block vhost-vdpa device migration if it does not offer _F_SUSPEND and it
> has not been started with x-svq.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   hw/virtio/vhost-vdpa.c | 21 +++++++++++++++++++++
>   1 file changed, 21 insertions(+)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 84a6b9690b..9d30cf9b3c 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -442,6 +442,27 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
>           return 0;
>       }
>   
> +    /*
> +     * If dev->shadow_vqs_enabled at initialization that means the device has
> +     * been started with x-svq=on, so don't block migration
> +     */
> +    if (dev->migration_blocker == NULL && !v->shadow_vqs_enabled) {
> +        uint64_t backend_features;
> +
> +        /* We don't have dev->backend_features yet */
> +        ret = vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES,
> +                              &backend_features);
> +        if (unlikely(ret)) {
> +            error_setg_errno(errp, -ret, "Could not get backend features");
> +            return ret;
> +        }
> +
> +        if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
> +            error_setg(&dev->migration_blocker,
> +                "vhost-vdpa backend lacks VHOST_BACKEND_F_SUSPEND feature.");
> +        }


I wonder why not let the device decide? For a networking device, we can
probably live without suspend.

Thanks


> +    }
> +
>       /*
>        * Similar to VFIO, we end up pinning all guest memory and have to
>        * disable discarding of RAM.





* Re: [PATCH v2 13/13] vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices
  2023-02-08  9:42 ` [PATCH v2 13/13] vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices Eugenio Pérez
@ 2023-02-22  4:07     ` Jason Wang
  0 siblings, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-22  4:07 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu


On 2023/2/8 17:42, Eugenio Pérez wrote:
> vhost-vdpa devices can return this features now that blockers have been
> set in case some features are not met.
>
> Expose VHOST_F_LOG_ALL only in that case.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---


Acked-by: Jason Wang <jasowang@redhat.com>

Thanks


>   hw/virtio/vhost-vdpa.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 13a86a2bb1..5fddc77c5c 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -1319,10 +1319,9 @@ static int vhost_vdpa_set_vring_call(struct vhost_dev *dev,
>   static int vhost_vdpa_get_features(struct vhost_dev *dev,
>                                        uint64_t *features)
>   {
> -    struct vhost_vdpa *v = dev->opaque;
>       int ret = vhost_vdpa_get_dev_features(dev, features);
>   
> -    if (ret == 0 && v->shadow_vqs_enabled) {
> +    if (ret == 0) {
>           /* Add SVQ logging capabilities */
>           *features |= BIT_ULL(VHOST_F_LOG_ALL);
>       }





* Re: [PATCH v2 07/13] vdpa: add vdpa net migration state notifier
  2023-02-22  3:55     ` Jason Wang
@ 2023-02-22  7:23     ` Eugenio Perez Martin
  -1 siblings, 0 replies; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-22  7:23 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu

On Wed, Feb 22, 2023 at 4:56 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2023/2/8 17:42, Eugenio Pérez wrote:
> > This allows net to restart the device backend to configure SVQ on it.
> >
> > Ideally, these changes should not be net specific. However, the vdpa net
> > backend is the one with enough knowledge to configure everything because
> > of some reasons:
> > * Queues might need to be shadowed or not depending on its kind (control
> >    vs data).
> > * Queues need to share the same map translations (iova tree).
> >
> > Because of that it is cleaner to restart the whole net backend and
> > configure again as expected, similar to how vhost-kernel moves between
> > userspace and passthrough.
> >
> > If more kinds of devices need dynamic switching to SVQ we can create a
> > callback struct like VhostOps and move most of the code there.
> > VhostOps cannot be reused since all vdpa backend share them, and to
> > personalize just for networking would be too heavy.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> > v3:
> > * Add TODO to use the resume operation in the future.
> > * Use migration_in_setup and migration_has_failed instead of a
> >    complicated switch case.
> > ---
> >   net/vhost-vdpa.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 76 insertions(+)
> >
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index dd686b4514..bca13f97fd 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -26,12 +26,14 @@
> >   #include <err.h>
> >   #include "standard-headers/linux/virtio_net.h"
> >   #include "monitor/monitor.h"
> > +#include "migration/misc.h"
> >   #include "hw/virtio/vhost.h"
> >
> >   /* Todo:need to add the multiqueue support here */
> >   typedef struct VhostVDPAState {
> >       NetClientState nc;
> >       struct vhost_vdpa vhost_vdpa;
> > +    Notifier migration_state;
> >       VHostNetState *vhost_net;
> >
> >       /* Control commands shadow buffers */
> > @@ -241,10 +243,79 @@ static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
> >       return DO_UPCAST(VhostVDPAState, nc, nc0);
> >   }
> >
> > +static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
> > +{
> > +    struct vhost_vdpa *v = &s->vhost_vdpa;
> > +    VirtIONet *n;
> > +    VirtIODevice *vdev;
> > +    int data_queue_pairs, cvq, r;
> > +    NetClientState *peer;
> > +
> > +    /* We are only called on the first data vqs and only if x-svq is not set */
> > +    if (s->vhost_vdpa.shadow_vqs_enabled == enable) {
> > +        return;
> > +    }
> > +
> > +    vdev = v->dev->vdev;
> > +    n = VIRTIO_NET(vdev);
>
>
> Let's tweak the code to move those initialization to the beginning of
> the function.
>

Sure.

>
> > +    if (!n->vhost_started) {
> > +        return;
> > +    }
>
>
> What happens if the vhost is started during the live migration?
>

This is solved in v3 by also checking the migration state at
vhost_vdpa_net_data_start_first [1]. However, that created a few more
complications and more complex code, as Si-Wei points out.

Recent changes due to virtio reset make it easier to move all this
code to hw/virtio/vhost-vdpa.c, where different kinds of vDPA devices
can share it. I'll send a new version that way.
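
For reference, the v3 approach is roughly the following (a sketch, not
the exact v3 code; it assumes migrate_get_current() is usable from
net/vhost-vdpa.c, and a fuller check would also cover the active
migration states, not only setup):

    static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
    {
        struct vhost_vdpa *v = &s->vhost_vdpa;

        add_migration_state_change_notifier(&s->migration_state);

        /*
         * If the backend starts while a migration is already in flight,
         * the notifier may have fired before us, so shadow from the start.
         */
        if (!v->shadow_vqs_enabled &&
            migration_in_setup(migrate_get_current())) {
            v->shadow_vqs_enabled = true;
            v->shadow_data = true;
        }
        /* ... rest of the start path unchanged ... */
    }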

>
> > +
> > +    data_queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
> > +    cvq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ) ?
> > +                                  n->max_ncs - n->max_queue_pairs : 0;
> > +    /*
> > +     * TODO: vhost_net_stop does suspend, get_base and reset. We can be smarter
> > +     * in the future and resume the device if read-only operations between
> > +     * suspend and reset goes wrong.
> > +     */
> > +    vhost_net_stop(vdev, n->nic->ncs, data_queue_pairs, cvq);
> > +
> > +    peer = s->nc.peer;
> > +    for (int i = 0; i < data_queue_pairs + cvq; i++) {
> > +        VhostVDPAState *vdpa_state;
> > +        NetClientState *nc;
> > +
> > +        if (i < data_queue_pairs) {
> > +            nc = qemu_get_peer(peer, i);
> > +        } else {
> > +            nc = qemu_get_peer(peer, n->max_queue_pairs);
> > +        }
> > +
> > +        vdpa_state = DO_UPCAST(VhostVDPAState, nc, nc);
> > +        vdpa_state->vhost_vdpa.shadow_data = enable;
> > +
> > +        if (i < data_queue_pairs) {
> > +            /* Do not override CVQ shadow_vqs_enabled */
> > +            vdpa_state->vhost_vdpa.shadow_vqs_enabled = enable;
> > +        }
>
>
> I wonder what happens if the number of queue pairs is changed during
> live migration? Should we assign all qps in this case?
>

Migration is blocked in this series if the device has the CVQ feature.

Thanks!

[1] https://patchwork.kernel.org/project/qemu-devel/patch/20230215173850.298832-9-eperezma@redhat.com/

> Thanks
>
>
> > +    }
> > +
> > +    r = vhost_net_start(vdev, n->nic->ncs, data_queue_pairs, cvq);
> > +    if (unlikely(r < 0)) {
> > +        error_report("unable to start vhost net: %s(%d)", g_strerror(-r), -r);
> > +    }
> > +}
> > +
> > +static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data)
> > +{
> > +    MigrationState *migration = data;
> > +    VhostVDPAState *s = container_of(notifier, VhostVDPAState,
> > +                                     migration_state);
> > +
> > +    if (migration_in_setup(migration)) {
> > +        vhost_vdpa_net_log_global_enable(s, true);
> > +    } else if (migration_has_failed(migration)) {
> > +        vhost_vdpa_net_log_global_enable(s, false);
> > +    }
> > +}
> > +
> >   static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
> >   {
> >       struct vhost_vdpa *v = &s->vhost_vdpa;
> >
> > +    add_migration_state_change_notifier(&s->migration_state);
> >       if (v->shadow_vqs_enabled) {
> >           v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> >                                              v->iova_range.last);
> > @@ -278,6 +349,10 @@ static void vhost_vdpa_net_client_stop(NetClientState *nc)
> >
> >       assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >
> > +    if (s->vhost_vdpa.index == 0) {
> > +        remove_migration_state_change_notifier(&s->migration_state);
> > +    }
> > +
> >       dev = s->vhost_vdpa.dev;
> >       if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> >           g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> > @@ -741,6 +816,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >       s->vhost_vdpa.device_fd = vdpa_device_fd;
> >       s->vhost_vdpa.index = queue_pair_index;
> >       s->always_svq = svq;
> > +    s->migration_state.notify = vdpa_net_migration_state_notifier;
> >       s->vhost_vdpa.shadow_vqs_enabled = svq;
> >       s->vhost_vdpa.iova_range = iova_range;
> >       s->vhost_vdpa.shadow_data = svq;
>




* Re: [PATCH v2 09/13] vdpa net: block migration if the device has CVQ
  2023-02-22  4:00     ` Jason Wang
@ 2023-02-22  7:28     ` Eugenio Perez Martin
  2023-02-23  2:41         ` Jason Wang
  -1 siblings, 1 reply; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-22  7:28 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu

On Wed, Feb 22, 2023 at 5:01 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2023/2/8 17:42, Eugenio Pérez wrote:
> > Devices with CVQ needs to migrate state beyond vq state.  Leaving this
> > to future series.
>
>
> I may miss something but what is missed to support CVQ/MQ?
>

To restore, on the migration destination, all the device state set by
CVQ on the source (MAC, MQ, ...) before the data vqs start. We don't
have a reliable way to keep the data vqs from starting before that
state is restored [1].
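
In other words, the destination would need something like this before
the data vqs start (a hypothetical sketch; vhost_vdpa_net_cvq_cmd() is
an assumed helper that sends one command through the shadow CVQ and
waits for the device's ack):

    static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
    {
        /* Replay the MAC the source configured through CVQ */
        return vhost_vdpa_net_cvq_cmd(s, VIRTIO_NET_CTRL_MAC,
                                      VIRTIO_NET_CTRL_MAC_ADDR_SET,
                                      n->mac, sizeof(n->mac));
    }

And the equivalent for the number of queue pairs
(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET), offloads, etc.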

Thanks!

[1] https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02652.html

> Thanks
>
>
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >   net/vhost-vdpa.c | 6 ++++++
> >   1 file changed, 6 insertions(+)
> >
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index bca13f97fd..309861e56c 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -955,11 +955,17 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >       }
> >
> >       if (has_cvq) {
> > +        VhostVDPAState *s;
> > +
> >           nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >                                    vdpa_device_fd, i, 1, false,
> >                                    opts->x_svq, iova_range);
> >           if (!nc)
> >               goto err;
> > +
> > +        s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +        error_setg(&s->vhost_vdpa.dev->migration_blocker,
> > +                   "net vdpa cannot migrate with MQ feature");
> >       }
> >
> >       return 0;
>




* Re: [PATCH v2 11/13] vdpa: block migration if dev does not have _F_SUSPEND
  2023-02-22  4:05     ` Jason Wang
@ 2023-02-22 14:25     ` Eugenio Perez Martin
  2023-02-23  2:38         ` Jason Wang
  -1 siblings, 1 reply; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-22 14:25 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu

On Wed, Feb 22, 2023 at 5:05 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2023/2/8 17:42, Eugenio Pérez wrote:
> > Next patches enable devices to be migrated even if vdpa netdev has not
> > been started with x-svq. However, not all devices are migratable, so we
> > need to block migration if we detect that.
> >
> > Block vhost-vdpa device migration if it does not offer _F_SUSPEND and it
> > has not been started with x-svq.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >   hw/virtio/vhost-vdpa.c | 21 +++++++++++++++++++++
> >   1 file changed, 21 insertions(+)
> >
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index 84a6b9690b..9d30cf9b3c 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -442,6 +442,27 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> >           return 0;
> >       }
> >
> > +    /*
> > +     * If dev->shadow_vqs_enabled at initialization that means the device has
> > +     * been started with x-svq=on, so don't block migration
> > +     */
> > +    if (dev->migration_blocker == NULL && !v->shadow_vqs_enabled) {
> > +        uint64_t backend_features;
> > +
> > +        /* We don't have dev->backend_features yet */
> > +        ret = vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES,
> > +                              &backend_features);
> > +        if (unlikely(ret)) {
> > +            error_setg_errno(errp, -ret, "Could not get backend features");
> > +            return ret;
> > +        }
> > +
> > +        if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
> > +            error_setg(&dev->migration_blocker,
> > +                "vhost-vdpa backend lacks VHOST_BACKEND_F_SUSPEND feature.");
> > +        }
>
>
> I wonder why not let the device to decide? For networking device, we can
> live without suspend probably.
>

Right, but how can we know whether this is a net device at init time?
I don't think a switch on vhost_vdpa_get_device_id(dev) is elegant.

If the parent device does not need to be suspended, I'd go with
exposing a suspend ioctl that does nothing in the parent device. After
that, it could even choose to return an error for GET_VRING_BASE.
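
For example (a sketch against the in-kernel vdpa_config_ops; the
my_vdpa_* names are made up):

    /* Nothing to quiesce: the rings can be read back safely at any time */
    static int my_vdpa_suspend(struct vdpa_device *vdev)
    {
        return 0;
    }

    static const struct vdpa_config_ops my_vdpa_ops = {
        /* ... the rest of the ops elided ... */
        .suspend = my_vdpa_suspend,
    };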

If we want to implement it as a fallback in qemu, I'd go for
implementing it on top of this series. There are a few operations we
could move to a device-kind-specific ops struct.

Would it make sense to you?

Thanks!


> Thanks
>
>
> > +    }
> > +
> >       /*
> >        * Similar to VFIO, we end up pinning all guest memory and have to
> >        * disable discarding of RAM.
>




* Re: [PATCH v2 11/13] vdpa: block migration if dev does not have _F_SUSPEND
  2023-02-22 14:25     ` Eugenio Perez Martin
@ 2023-02-23  2:38         ` Jason Wang
  0 siblings, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-23  2:38 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu


On 2023/2/22 22:25, Eugenio Perez Martin wrote:
> On Wed, Feb 22, 2023 at 5:05 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2023/2/8 17:42, Eugenio Pérez wrote:
>>> Next patches enable devices to be migrated even if vdpa netdev has not
>>> been started with x-svq. However, not all devices are migratable, so we
>>> need to block migration if we detect that.
>>>
>>> Block vhost-vdpa device migration if it does not offer _F_SUSPEND and it
>>> has not been started with x-svq.
>>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>>    hw/virtio/vhost-vdpa.c | 21 +++++++++++++++++++++
>>>    1 file changed, 21 insertions(+)
>>>
>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>> index 84a6b9690b..9d30cf9b3c 100644
>>> --- a/hw/virtio/vhost-vdpa.c
>>> +++ b/hw/virtio/vhost-vdpa.c
>>> @@ -442,6 +442,27 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
>>>            return 0;
>>>        }
>>>
>>> +    /*
>>> +     * If dev->shadow_vqs_enabled at initialization that means the device has
>>> +     * been started with x-svq=on, so don't block migration
>>> +     */
>>> +    if (dev->migration_blocker == NULL && !v->shadow_vqs_enabled) {
>>> +        uint64_t backend_features;
>>> +
>>> +        /* We don't have dev->backend_features yet */
>>> +        ret = vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES,
>>> +                              &backend_features);
>>> +        if (unlikely(ret)) {
>>> +            error_setg_errno(errp, -ret, "Could not get backend features");
>>> +            return ret;
>>> +        }
>>> +
>>> +        if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
>>> +            error_setg(&dev->migration_blocker,
>>> +                "vhost-vdpa backend lacks VHOST_BACKEND_F_SUSPEND feature.");
>>> +        }
>>
>> I wonder why not let the device to decide? For networking device, we can
>> live without suspend probably.
>>
> Right, but how can we know if this is a net device in init? I don't
> think a switch (vhost_vdpa_get_device_id(dev)) is elegant.


I meant the caller of vhost_vdpa_init(), which is net_init_vhost_vdpa().

Thanks


>
> If the parent device does not need to be suspended i'd go with
> exposing a suspend ioctl but do nothing in the parent device. After
> that, it could even choose to return an error for GET_VRING_BASE.
>
> If we want to implement it as a fallback in qemu, I'd go for
> implementing it on top of this series. There are a few operations we
> could move to a device-kind specific ops.
>
> Would it make sense to you?
>
> Thanks!
>
>
>> Thanks
>>
>>
>>> +    }
>>> +
>>>        /*
>>>         * Similar to VFIO, we end up pinning all guest memory and have to
>>>         * disable discarding of RAM.





* Re: [PATCH v2 09/13] vdpa net: block migration if the device has CVQ
  2023-02-22  7:28     ` Eugenio Perez Martin
@ 2023-02-23  2:41         ` Jason Wang
  0 siblings, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-23  2:41 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu


On 2023/2/22 15:28, Eugenio Perez Martin wrote:
> On Wed, Feb 22, 2023 at 5:01 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2023/2/8 17:42, Eugenio Pérez wrote:
>>> Devices with CVQ needs to migrate state beyond vq state.  Leaving this
>>> to future series.
>>
>> I may miss something but what is missed to support CVQ/MQ?
>>
> To restore all the device state set by CVQ in the migration source
> (MAC, MQ, ...) before data vqs start. We don't have a reliable way to
> not start data vqs until the device [1].
>
> Thanks!
>
> [1] https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02652.html


Right. It might be worth mentioning this defect in either the change
log or somewhere in the code as a comment.
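
Something as simple as this would do (a sketch; the comment and the
message wording are just an example):

    /*
     * TODO: CVQ devices are not migratable yet: the destination needs to
     * restore the state set through CVQ (MAC, MQ, ...) before the data
     * vqs start, and there is no reliable way to hold the data vqs back.
     */
    error_setg(&s->vhost_vdpa.dev->migration_blocker,
               "net vdpa cannot migrate with CVQ feature");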

(Btw, I think we should fix those vDPA drivers).

Thanks


>
>> Thanks
>>
>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>>    net/vhost-vdpa.c | 6 ++++++
>>>    1 file changed, 6 insertions(+)
>>>
>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>> index bca13f97fd..309861e56c 100644
>>> --- a/net/vhost-vdpa.c
>>> +++ b/net/vhost-vdpa.c
>>> @@ -955,11 +955,17 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>        }
>>>
>>>        if (has_cvq) {
>>> +        VhostVDPAState *s;
>>> +
>>>            nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>>>                                     vdpa_device_fd, i, 1, false,
>>>                                     opts->x_svq, iova_range);
>>>            if (!nc)
>>>                goto err;
>>> +
>>> +        s = DO_UPCAST(VhostVDPAState, nc, nc);
>>> +        error_setg(&s->vhost_vdpa.dev->migration_blocker,
>>> +                   "net vdpa cannot migrate with MQ feature");
>>>        }
>>>
>>>        return 0;





* Re: [PATCH v2 11/13] vdpa: block migration if dev does not have _F_SUSPEND
  2023-02-23  2:38         ` Jason Wang
@ 2023-02-23 11:06         ` Eugenio Perez Martin
  2023-02-24  3:16             ` Jason Wang
  -1 siblings, 1 reply; 68+ messages in thread
From: Eugenio Perez Martin @ 2023-02-23 11:06 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Harpreet Singh Anand, Gonglei (Arei),
	Michael S. Tsirkin, Cindy Lu, alvaro.karsz, Zhu Lingshan,
	Lei Yang, Liuxiangdong, Shannon Nelson, Parav Pandit,
	Gautam Dawar, Eli Cohen, Stefan Hajnoczi, Laurent Vivier,
	longpeng2, virtualization, Stefano Garzarella, si-wei.liu

On Thu, Feb 23, 2023 at 3:38 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2023/2/22 22:25, Eugenio Perez Martin wrote:
> > On Wed, Feb 22, 2023 at 5:05 AM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> On 2023/2/8 17:42, Eugenio Pérez wrote:
> >>> Next patches enable devices to be migrated even if vdpa netdev has not
> >>> been started with x-svq. However, not all devices are migratable, so we
> >>> need to block migration if we detect that.
> >>>
> >>> Block vhost-vdpa device migration if it does not offer _F_SUSPEND and it
> >>> has not been started with x-svq.
> >>>
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> ---
> >>>    hw/virtio/vhost-vdpa.c | 21 +++++++++++++++++++++
> >>>    1 file changed, 21 insertions(+)
> >>>
> >>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>> index 84a6b9690b..9d30cf9b3c 100644
> >>> --- a/hw/virtio/vhost-vdpa.c
> >>> +++ b/hw/virtio/vhost-vdpa.c
> >>> @@ -442,6 +442,27 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> >>>            return 0;
> >>>        }
> >>>
> >>> +    /*
> >>> +     * If dev->shadow_vqs_enabled at initialization that means the device has
> >>> +     * been started with x-svq=on, so don't block migration
> >>> +     */
> >>> +    if (dev->migration_blocker == NULL && !v->shadow_vqs_enabled) {
> >>> +        uint64_t backend_features;
> >>> +
> >>> +        /* We don't have dev->backend_features yet */
> >>> +        ret = vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES,
> >>> +                              &backend_features);
> >>> +        if (unlikely(ret)) {
> >>> +            error_setg_errno(errp, -ret, "Could not get backend features");
> >>> +            return ret;
> >>> +        }
> >>> +
> >>> +        if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
> >>> +            error_setg(&dev->migration_blocker,
> >>> +                "vhost-vdpa backend lacks VHOST_BACKEND_F_SUSPEND feature.");
> >>> +        }
> >>
> >> I wonder why not let the device to decide? For networking device, we can
> >> live without suspend probably.
> >>
> > Right, but how can we know if this is a net device in init? I don't
> > think a switch (vhost_vdpa_get_device_id(dev)) is elegant.
>
>
> I meant the caller of vhost_vdpa_init() which is net_init_vhost_vdpa().
>

That's doable but I'm not sure if it is convenient.

Since we're always offering _F_LOG, I thought of the lack of
_F_SUSPEND as the default migration blocker for other kinds of
devices, like blk. If we move this code to net_init_vhost_vdpa, every
other device is in charge of blocking migration by itself.

I guess the right action is to use a variable similar to
vhost_vdpa->f_log_all. It defaults to false, and the device can choose
whether to export it or not. This way, the device does not migrate by
default, and the equivalent of net_init_vhost_vdpa could choose
whether to offer _F_LOG with SVQ or not.
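
As a sketch (the "migratable" name is made up; it would replace the
unconditional offer of patch 13/13):

    static int vhost_vdpa_get_features(struct vhost_dev *dev,
                                       uint64_t *features)
    {
        struct vhost_vdpa *v = dev->opaque;
        int ret = vhost_vdpa_get_dev_features(dev, features);

        /* false by default; the backend sets it when migration can work */
        if (ret == 0 && v->migratable) {
            /* Add SVQ logging capabilities */
            *features |= BIT_ULL(VHOST_F_LOG_ALL);
        }
        return ret;
    }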

OTOH I guess other kinds of devices already must place blockers beyond
_F_LOG, so maybe it makes sense to always offer _F_LOG even if
_F_SUSPEND is not offered? Stefano G., would that break vhost-vdpa-blk
support?

Thanks!

> Thanks
>
>
> >
> > If the parent device does not need to be suspended i'd go with
> > exposing a suspend ioctl but do nothing in the parent device. After
> > that, it could even choose to return an error for GET_VRING_BASE.
> >
> > If we want to implement it as a fallback in qemu, I'd go for
> > implementing it on top of this series. There are a few operations we
> > could move to a device-kind specific ops.
> >
> > Would it make sense to you?
> >
> > Thanks!
> >
> >
> >> Thanks
> >>
> >>
> >>> +    }
> >>> +
> >>>        /*
> >>>         * Similar to VFIO, we end up pinning all guest memory and have to
> >>>         * disable discarding of RAM.
>




* Re: [PATCH v2 11/13] vdpa: block migration if dev does not have _F_SUSPEND
  2023-02-23 11:06         ` Eugenio Perez Martin
@ 2023-02-24  3:16             ` Jason Wang
  0 siblings, 0 replies; 68+ messages in thread
From: Jason Wang @ 2023-02-24  3:16 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Laurent Vivier, Parav Pandit, Cindy Lu, Michael S. Tsirkin,
	qemu-devel, Gautam Dawar, virtualization, Harpreet Singh Anand,
	Lei Yang, Stefan Hajnoczi, Eli Cohen, longpeng2, Shannon Nelson,
	Liuxiangdong

On Thu, Feb 23, 2023 at 7:07 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Thu, Feb 23, 2023 at 3:38 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> > On 2023/2/22 22:25, Eugenio Perez Martin wrote:
> > > On Wed, Feb 22, 2023 at 5:05 AM Jason Wang <jasowang@redhat.com> wrote:
> > >>
> > >> On 2023/2/8 17:42, Eugenio Pérez wrote:
> > >>> Next patches enable devices to be migrated even if vdpa netdev has not
> > >>> been started with x-svq. However, not all devices are migratable, so we
> > >>> need to block migration if we detect that.
> > >>>
> > >>> Block vhost-vdpa device migration if it does not offer _F_SUSPEND and it
> > >>> has not been started with x-svq.
> > >>>
> > >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > >>> ---
> > >>>    hw/virtio/vhost-vdpa.c | 21 +++++++++++++++++++++
> > >>>    1 file changed, 21 insertions(+)
> > >>>
> > >>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > >>> index 84a6b9690b..9d30cf9b3c 100644
> > >>> --- a/hw/virtio/vhost-vdpa.c
> > >>> +++ b/hw/virtio/vhost-vdpa.c
> > >>> @@ -442,6 +442,27 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> > >>>            return 0;
> > >>>        }
> > >>>
> > >>> +    /*
> > >>> +     * If dev->shadow_vqs_enabled at initialization that means the device has
> > >>> +     * been started with x-svq=on, so don't block migration
> > >>> +     */
> > >>> +    if (dev->migration_blocker == NULL && !v->shadow_vqs_enabled) {
> > >>> +        uint64_t backend_features;
> > >>> +
> > >>> +        /* We don't have dev->backend_features yet */
> > >>> +        ret = vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES,
> > >>> +                              &backend_features);
> > >>> +        if (unlikely(ret)) {
> > >>> +            error_setg_errno(errp, -ret, "Could not get backend features");
> > >>> +            return ret;
> > >>> +        }
> > >>> +
> > >>> +        if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
> > >>> +            error_setg(&dev->migration_blocker,
> > >>> +                "vhost-vdpa backend lacks VHOST_BACKEND_F_SUSPEND feature.");
> > >>> +        }
> > >>
> > >> I wonder why not let the device to decide? For networking device, we can
> > >> live without suspend probably.
> > >>
> > > Right, but how can we know if this is a net device in init? I don't
> > > think a switch (vhost_vdpa_get_device_id(dev)) is elegant.
> >
> >
> > I meant the caller of vhost_vdpa_init() which is net_init_vhost_vdpa().
> >
>
> That's doable but I'm not sure if it is convenient.

So it's a question of whether or not we try to let migration work
without suspending. If we don't, there's no need to bother. Looking at
the current vhost-net implementation, it tries to make migration work
even when get_vring_base() fails, so it may be worth a try if it
doesn't add too much complexity. But I'm fine with going either way.
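
For reference, the fallback in hw/virtio/vhost.c's
vhost_virtqueue_stop() looks roughly like this (paraphrased from
memory, so take the details with a grain of salt):

    r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
    if (r < 0) {
        VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
        /* The backend is gone; fall back to QEMU's last avail index. */
        virtio_queue_restore_last_avail_idx(vdev, idx);
    } else {
        virtio_queue_set_last_avail_idx(vdev, idx, state.num);
    }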

>
> Since we're always offering _F_LOG I thought of the lack of _F_SUSPEND
> as the default migration blocker for other kinds of devices like blk.

Or we can have this by default and allow a specific type of device to
clear it?

> If we move this code to net_init_vhost_vdpa, all other devices are in
> charge of block migration by themselves.
>
> I guess the right action is to use a variable similar to
> vhost_vdpa->f_log_all. It defaults to false, and the device can choose
> if it should export it or not. This way, the device does not migrate
> by default, and the equivalent of net_init_vhost_vdpa could choose
> whether to offer _F_LOG with SVQ or not.

Looks similar to what I think above.

>
> OTOH I guess other kinds of devices already must place blockers beyond
> _F_LOG, so maybe it makes sense to always offer _F_LOG even if
> _F_SUSPEND is not offered?

I don't see any dependency between the two features. Technically,
there could be devices that have neither _F_LOG nor _F_SUSPEND.

Thanks

> Stefano G., would that break vhost-vdpa-blk
> support?
>
> Thanks!
>
> > Thanks
> >
> >
> > >
> > > If the parent device does not need to be suspended i'd go with
> > > exposing a suspend ioctl but do nothing in the parent device. After
> > > that, it could even choose to return an error for GET_VRING_BASE.
> > >
> > > If we want to implement it as a fallback in qemu, I'd go for
> > > implementing it on top of this series. There are a few operations we
> > > could move to a device-kind specific ops.
> > >
> > > Would it make sense to you?
> > >
> > > Thanks!
> > >
> > >
> > >> Thanks
> > >>
> > >>
> > >>> +    }
> > >>> +
> > >>>        /*
> > >>>         * Similar to VFIO, we end up pinning all guest memory and have to
> > >>>         * disable discarding of RAM.
> >
>



end of thread

Thread overview: 68+ messages
2023-02-08  9:42 [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Eugenio Pérez
2023-02-08  9:42 ` [PATCH v2 01/13] vdpa net: move iova tree creation from init to start Eugenio Pérez
2023-02-13  6:50   ` Si-Wei Liu
2023-02-13 11:14     ` Eugenio Perez Martin
2023-02-14  1:45       ` Si-Wei Liu
2023-02-14 19:07         ` Eugenio Perez Martin
2023-02-16  2:14           ` Si-Wei Liu
2023-02-16  7:35             ` Eugenio Perez Martin
2023-02-17  7:38               ` Si-Wei Liu
2023-02-17 13:55                 ` Eugenio Perez Martin
2023-02-08  9:42 ` [PATCH v2 02/13] vdpa: Negotiate _F_SUSPEND feature Eugenio Pérez
2023-02-08  9:42 ` [PATCH v2 03/13] vdpa: add vhost_vdpa_suspend Eugenio Pérez
2023-02-21  5:27   ` Jason Wang
2023-02-21  5:33     ` Jason Wang
2023-02-21  7:05       ` Eugenio Perez Martin
2023-02-08  9:42 ` [PATCH v2 04/13] vdpa: move vhost reset after get vring base Eugenio Pérez
2023-02-21  5:36   ` Jason Wang
2023-02-21  7:07     ` Eugenio Perez Martin
2023-02-22  3:43       ` Jason Wang
2023-02-08  9:42 ` [PATCH v2 05/13] vdpa: rewind at get_base, not set_base Eugenio Pérez
2023-02-21  5:40   ` Jason Wang
2023-02-08  9:42 ` [PATCH v2 06/13] vdpa net: allow VHOST_F_LOG_ALL Eugenio Pérez
2023-02-08  9:42 ` [PATCH v2 07/13] vdpa: add vdpa net migration state notifier Eugenio Pérez
2023-02-13  6:50   ` Si-Wei Liu
2023-02-13 15:51     ` Eugenio Perez Martin
2023-02-22  3:55   ` Jason Wang
2023-02-22  7:23     ` Eugenio Perez Martin
2023-02-08  9:42 ` [PATCH v2 08/13] vdpa: disable RAM block discard only for the first device Eugenio Pérez
2023-02-08  9:42 ` [PATCH v2 09/13] vdpa net: block migration if the device has CVQ Eugenio Pérez
2023-02-13  6:50   ` Si-Wei Liu
2023-02-14 18:06     ` Eugenio Perez Martin
2023-02-22  4:00   ` Jason Wang
2023-02-22  7:28     ` Eugenio Perez Martin
2023-02-23  2:41       ` Jason Wang
2023-02-08  9:42 ` [PATCH v2 10/13] vdpa: block migration if device has unsupported features Eugenio Pérez
2023-02-08  9:42 ` [PATCH v2 11/13] vdpa: block migration if dev does not have _F_SUSPEND Eugenio Pérez
2023-02-22  4:05   ` Jason Wang
2023-02-22 14:25     ` Eugenio Perez Martin
2023-02-23  2:38       ` Jason Wang
2023-02-23 11:06         ` Eugenio Perez Martin
2023-02-24  3:16           ` Jason Wang
2023-02-08  9:42 ` [PATCH v2 12/13] vdpa: block migration if SVQ does not admit a feature Eugenio Pérez
2023-02-08  9:42 ` [PATCH v2 13/13] vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices Eugenio Pérez
2023-02-22  4:07   ` Jason Wang
2023-02-08 10:29 ` [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration Alvaro Karsz
2023-02-09 14:38   ` Lei Yang
2023-02-10 12:57 ` Gautam Dawar
2023-02-15 18:40   ` Eugenio Perez Martin
2023-02-16 13:50     ` Lei Yang
