* [PATCH 00/13] Dynamically switch to vhost shadow virtqueues at vdpa net migration
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

It's possible to migrate vdpa net devices if they are shadowed from the
start.  But always shadowing the dataplane effectively breaks host
passthrough, so it is not convenient in vDPA scenarios.

This series enables dynamically switching to shadow mode only at
migration time.  This allows full passthrough of the data virtqueues
whenever qemu is not migrating.
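
Conceptually, the switch hooks into qemu's migration state notifier; this is
a simplified copy of the handler added in patch 07 (error handling omitted):

    static void vdpa_net_migration_state_notifier(Notifier *notifier,
                                                  void *data)
    {
        MigrationState *migration = data;
        VhostVDPAState *s = container_of(notifier, VhostVDPAState,
                                         migration_state);

        if (migration_in_setup(migration)) {
            /* stop the backend, shadow every data vq, start it again */
            vhost_vdpa_net_log_global_enable(s, true);
        } else if (migration_has_failed(migration)) {
            /* migration aborted: return to full passthrough */
            vhost_vdpa_net_log_global_enable(s, false);
        }
    }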

In this series only net devices with no CVQ are migratable.  CVQ adds
additional state that would make the series bigger, and it still had some
controversy in the previous RFC, so let's split it out.

The first patch delays the creation of the iova tree until it is really
needed, making it easier to dynamically move to and from SVQ mode.

Patches 02 to 05 handle suspending the device and fetching the vq state
(base) at the switch to SVQ mode.  The new _F_SUSPEND feature is negotiated
and the device stop flow is changed so the state can be fetched trusting
that the device will not modify it.
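
In rough terms, the device stop flow after these patches becomes (a
simplified sketch; the real calls are spread across vhost_dev_stop() and
the vdpa backend ops):

    vhost_vdpa_suspend(dev);                   /* patch 03: freeze the rings */
    for (each virtqueue) {
        vhost_vdpa_get_vring_base(dev, &ring); /* state can now be trusted */
    }
    vhost_vdpa_reset_status(dev);              /* patch 04: reset only after
                                                  the bases have been read */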

Since the vhost backend must offer VHOST_F_LOG_ALL to be migratable, the
remaining patches except the last one add the needed migration blockers so
vhost-vdpa can offer it safely.  They also add the handling of this feature.
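
The blockers are installed by filling the vhost_dev migration_blocker field
at initialization time; for example, from patch 11:

    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
        error_setg(&dev->migration_blocker,
            "vhost-vdpa backend lacks VHOST_BACKEND_F_SUSPEND feature.");
    }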

Finally, the last patch makes the vhost-vdpa backend offer VHOST_F_LOG_ALL
so qemu can migrate the device as long as no other blocker has been added.

Successfully tested with vdpa_sim_net with patch [1] applied, and with the
qemu emulated device with vp_vdpa under some restrictions:
* No CVQ. No features that did not previously work with SVQ (packed, ...).
* VIRTIO_RING_F_STATE patches implementing [2].
* _F_SUSPEND is exposed, but ignored; suspend on ring state fetch like
  DPDK does.

Comments are welcome.

v1:
- Omit all code working with CVQ and block migration if the device supports
  CVQ.
- Remove spurious kick.
- Move all possible migration checks to vhost-vdpa instead of the net
  backend. Move them from the start code to the init code.
- Suspend on vhost_vdpa_dev_start(false) instead of in vhost-vdpa net backend.
- Properly split the suspend-after-getting-base change and the addition of
  the status_reset patches.
- Add possible TODOs to points where this series can improve in the future.
- Check the state of migration using migration_in_setup and
  migration_has_failed instead of checking all the possible migration
  statuses in a switch.
- Add TODO with possible low-hanging fruit using RESUME ops.
- Always offer _F_LOG from virtio/vhost-vdpa and let migration blockers do
  their thing instead of adding a variable.
- RFC v2 at https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02574.html

RFC v2:
- Use a migration listener instead of a memory listener to know when
  the migration starts.
- Add changes not picked with the ASID patches, like enabling rings after
  driver_ok.
- Rewind on the migration source, not the destination.
- RFC v1 at https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01664.html

[1] https://lore.kernel.org/lkml/20230203142501.300125-1-eperezma@redhat.com/T/
[2] https://lists.oasis-open.org/archives/virtio-comment/202103/msg00036.html

Eugenio Pérez (13):
  vdpa net: move iova tree creation from init to start
  vdpa: Negotiate _F_SUSPEND feature
  vdpa: add vhost_vdpa_suspend
  vdpa: move vhost reset after get vring base
  vdpa: rewind at get_base, not set_base
  vdpa net: allow VHOST_F_LOG_ALL
  vdpa: add vdpa net migration state notifier
  vdpa: disable RAM block discard only for the first device
  vdpa net: block migration if the device has CVQ
  vdpa: block migration if device has unsupported features
  vdpa: block migration if dev does not have _F_SUSPEND
  vdpa: block migration if SVQ does not admit a feature
  vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices

 include/hw/virtio/vhost-backend.h |   4 +
 hw/virtio/vhost-vdpa.c            | 126 +++++++++++++++-----
 hw/virtio/vhost.c                 |   3 +
 net/vhost-vdpa.c                  | 192 +++++++++++++++++++++++++-----
 hw/virtio/trace-events            |   1 +
 5 files changed, 267 insertions(+), 59 deletions(-)

-- 
2.31.1





* [PATCH 01/13] vdpa net: move iova tree creation from init to start
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

Only create iova_tree if and when it is needed.

The cleanup remains the responsibility of the last VQ, but this change
allows both cleanup functions to be merged.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 net/vhost-vdpa.c | 99 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 71 insertions(+), 28 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index de5ed8ff22..a9e6c8f28e 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -178,13 +178,9 @@ err_init:
 static void vhost_vdpa_cleanup(NetClientState *nc)
 {
     VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
-    struct vhost_dev *dev = &s->vhost_net->dev;
 
     qemu_vfree(s->cvq_cmd_out_buffer);
     qemu_vfree(s->status);
-    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
-        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
-    }
     if (s->vhost_net) {
         vhost_net_cleanup(s->vhost_net);
         g_free(s->vhost_net);
@@ -234,10 +230,64 @@ static ssize_t vhost_vdpa_receive(NetClientState *nc, const uint8_t *buf,
     return size;
 }
 
+/** From any vdpa net client, get the netclient of first queue pair */
+static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
+{
+    NICState *nic = qemu_get_nic(s->nc.peer);
+    NetClientState *nc0 = qemu_get_peer(nic->ncs, 0);
+
+    return DO_UPCAST(VhostVDPAState, nc, nc0);
+}
+
+static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
+{
+    struct vhost_vdpa *v = &s->vhost_vdpa;
+
+    if (v->shadow_vqs_enabled) {
+        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
+                                           v->iova_range.last);
+    }
+}
+
+static int vhost_vdpa_net_data_start(NetClientState *nc)
+{
+    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+    struct vhost_vdpa *v = &s->vhost_vdpa;
+
+    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+
+    if (v->index == 0) {
+        vhost_vdpa_net_data_start_first(s);
+        return 0;
+    }
+
+    if (v->shadow_vqs_enabled) {
+        VhostVDPAState *s0 = vhost_vdpa_net_first_nc_vdpa(s);
+        v->iova_tree = s0->vhost_vdpa.iova_tree;
+    }
+
+    return 0;
+}
+
+static void vhost_vdpa_net_client_stop(NetClientState *nc)
+{
+    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+    struct vhost_dev *dev;
+
+    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+
+    dev = s->vhost_vdpa.dev;
+    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
+        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
+    }
+}
+
 static NetClientInfo net_vhost_vdpa_info = {
         .type = NET_CLIENT_DRIVER_VHOST_VDPA,
         .size = sizeof(VhostVDPAState),
         .receive = vhost_vdpa_receive,
+        .start = vhost_vdpa_net_data_start,
+        .stop = vhost_vdpa_net_client_stop,
         .cleanup = vhost_vdpa_cleanup,
         .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
         .has_ufo = vhost_vdpa_has_ufo,
@@ -351,7 +401,7 @@ dma_map_err:
 
 static int vhost_vdpa_net_cvq_start(NetClientState *nc)
 {
-    VhostVDPAState *s;
+    VhostVDPAState *s, *s0;
     struct vhost_vdpa *v;
     uint64_t backend_features;
     int64_t cvq_group;
@@ -425,6 +475,15 @@ out:
         return 0;
     }
 
+    s0 = vhost_vdpa_net_first_nc_vdpa(s);
+    if (s0->vhost_vdpa.iova_tree) {
+        /* SVQ is already configured for all virtqueues */
+        v->iova_tree = s0->vhost_vdpa.iova_tree;
+    } else {
+        v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
+                                           v->iova_range.last);
+    }
+
     r = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer,
                                vhost_vdpa_net_cvq_cmd_page_len(), false);
     if (unlikely(r < 0)) {
@@ -449,15 +508,9 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
     if (s->vhost_vdpa.shadow_vqs_enabled) {
         vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
         vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->status);
-        if (!s->always_svq) {
-            /*
-             * If only the CVQ is shadowed we can delete this safely.
-             * If all the VQs are shadows this will be needed by the time the
-             * device is started again to register SVQ vrings and similar.
-             */
-            g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
-        }
     }
+
+    vhost_vdpa_net_client_stop(nc);
 }
 
 static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
@@ -667,8 +720,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
                                        int nvqs,
                                        bool is_datapath,
                                        bool svq,
-                                       struct vhost_vdpa_iova_range iova_range,
-                                       VhostIOVATree *iova_tree)
+                                       struct vhost_vdpa_iova_range iova_range)
 {
     NetClientState *nc = NULL;
     VhostVDPAState *s;
@@ -690,7 +742,6 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
     s->vhost_vdpa.shadow_vqs_enabled = svq;
     s->vhost_vdpa.iova_range = iova_range;
     s->vhost_vdpa.shadow_data = svq;
-    s->vhost_vdpa.iova_tree = iova_tree;
     if (!is_datapath) {
         s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
                                             vhost_vdpa_net_cvq_cmd_page_len());
@@ -760,7 +811,6 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     uint64_t features;
     int vdpa_device_fd;
     g_autofree NetClientState **ncs = NULL;
-    g_autoptr(VhostIOVATree) iova_tree = NULL;
     struct vhost_vdpa_iova_range iova_range;
     NetClientState *nc;
     int queue_pairs, r, i = 0, has_cvq = 0;
@@ -812,12 +862,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
         goto err;
     }
 
-    if (opts->x_svq) {
-        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
-            goto err_svq;
-        }
-
-        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
+    if (opts->x_svq && !vhost_vdpa_net_valid_svq_features(features, errp)) {
+        goto err;
     }
 
     ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
@@ -825,7 +871,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     for (i = 0; i < queue_pairs; i++) {
         ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
                                      vdpa_device_fd, i, 2, true, opts->x_svq,
-                                     iova_range, iova_tree);
+                                     iova_range);
         if (!ncs[i])
             goto err;
     }
@@ -833,13 +879,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     if (has_cvq) {
         nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
                                  vdpa_device_fd, i, 1, false,
-                                 opts->x_svq, iova_range, iova_tree);
+                                 opts->x_svq, iova_range);
         if (!nc)
             goto err;
     }
 
-    /* iova_tree ownership belongs to last NetClientState */
-    g_steal_pointer(&iova_tree);
     return 0;
 
 err:
@@ -849,7 +893,6 @@ err:
         }
     }
 
-err_svq:
     qemu_close(vdpa_device_fd);
 
     return -1;
-- 
2.31.1




* [PATCH 02/13] vdpa: Negotiate _F_SUSPEND feature
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

This is needed for qemu to know it can suspend the device to retrieve
its status and enable SVQ with it, so the whole process is transparent to
the guest.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 542e003101..2e79fbe4b2 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -659,7 +659,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
     uint64_t features;
     uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 |
         0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH |
-        0x1ULL << VHOST_BACKEND_F_IOTLB_ASID;
+        0x1ULL << VHOST_BACKEND_F_IOTLB_ASID |
+        0x1ULL << VHOST_BACKEND_F_SUSPEND;
     int r;
 
     if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) {
-- 
2.31.1




* [PATCH 03/13] vdpa: add vhost_vdpa_suspend
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

The function vhost.c:vhost_dev_stop fetches the vring base so the vq
state can be migrated to other devices.  However, this is unreliable in
vdpa, since we did not signal the device to suspend the queues, making
the fetched value useless.

Suspend the device if possible before fetching the first and subsequent
vring bases.

Moreover, vdpa totally resets and wipes the device when stopping the last
vhost device, before its vring bases are fetched, making that operation
useless for the last device. This will be fixed in later patches of this
series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 19 +++++++++++++++++++
 hw/virtio/trace-events |  1 +
 2 files changed, 20 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 2e79fbe4b2..b681841cdc 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1108,6 +1108,24 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
     }
 }
 
+static void vhost_vdpa_suspend(struct vhost_dev *dev)
+{
+    struct vhost_vdpa *v = dev->opaque;
+    int r;
+
+    if (!vhost_vdpa_first_dev(dev) ||
+        !(dev->backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
+        return;
+    }
+
+    trace_vhost_vdpa_suspend(dev);
+    r = ioctl(v->device_fd, VHOST_VDPA_SUSPEND);
+    if (unlikely(r)) {
+        error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
+        /* Not aborting since we're called from stop context */
+    }
+}
+
 static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
 {
     struct vhost_vdpa *v = dev->opaque;
@@ -1122,6 +1140,7 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
         }
         vhost_vdpa_set_vring_ready(dev);
     } else {
+        vhost_vdpa_suspend(dev);
         vhost_vdpa_svqs_stop(dev);
         vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
     }
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index a87c5f39a2..8f8d05cf9b 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -50,6 +50,7 @@ vhost_vdpa_set_vring_ready(void *dev) "dev: %p"
 vhost_vdpa_dump_config(void *dev, const char *line) "dev: %p %s"
 vhost_vdpa_set_config(void *dev, uint32_t offset, uint32_t size, uint32_t flags) "dev: %p offset: %"PRIu32" size: %"PRIu32" flags: 0x%"PRIx32
 vhost_vdpa_get_config(void *dev, void *config, uint32_t config_len) "dev: %p config: %p config_len: %"PRIu32
+vhost_vdpa_suspend(void *dev) "dev: %p"
 vhost_vdpa_dev_start(void *dev, bool started) "dev: %p started: %d"
 vhost_vdpa_set_log_base(void *dev, uint64_t base, unsigned long long size, int refcnt, int fd, void *log) "dev: %p base: 0x%"PRIx64" size: %llu refcnt: %d fd: %d log: %p"
 vhost_vdpa_set_vring_addr(void *dev, unsigned int index, unsigned int flags, uint64_t desc_user_addr, uint64_t used_user_addr, uint64_t avail_user_addr, uint64_t log_guest_addr) "dev: %p index: %u flags: 0x%x desc_user_addr: 0x%"PRIx64" used_user_addr: 0x%"PRIx64" avail_user_addr: 0x%"PRIx64" log_guest_addr: 0x%"PRIx64
-- 
2.31.1




* [PATCH 04/13] vdpa: move vhost reset after get vring base
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

The function vhost.c:vhost_dev_stop calls the vhost operation
vhost_dev_start(false). In the case of vdpa it totally resets and wipes
the device, making the fetching of the vring base (virtqueue state)
useless.

The kernel backend does not use the vhost_dev_start vhost op callback, but
vhost-user does. A patch to make vhost_user_dev_start more similar to vdpa
is desirable, but it can be added on top.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/vhost-backend.h |  4 ++++
 hw/virtio/vhost-vdpa.c            | 22 ++++++++++++++++------
 hw/virtio/vhost.c                 |  3 +++
 3 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index c5ab49051e..ec3fbae58d 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -130,6 +130,9 @@ typedef bool (*vhost_force_iommu_op)(struct vhost_dev *dev);
 
 typedef int (*vhost_set_config_call_op)(struct vhost_dev *dev,
                                        int fd);
+
+typedef void (*vhost_reset_status_op)(struct vhost_dev *dev);
+
 typedef struct VhostOps {
     VhostBackendType backend_type;
     vhost_backend_init vhost_backend_init;
@@ -177,6 +180,7 @@ typedef struct VhostOps {
     vhost_get_device_id_op vhost_get_device_id;
     vhost_force_iommu_op vhost_force_iommu;
     vhost_set_config_call_op vhost_set_config_call;
+    vhost_reset_status_op vhost_reset_status;
 } VhostOps;
 
 int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index b681841cdc..6d20d70601 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1152,14 +1152,23 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
     if (started) {
         memory_listener_register(&v->listener, &address_space_memory);
         return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
-    } else {
-        vhost_vdpa_reset_device(dev);
-        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
-                                   VIRTIO_CONFIG_S_DRIVER);
-        memory_listener_unregister(&v->listener);
+    }
 
-        return 0;
+    return 0;
+}
+
+static void vhost_vdpa_reset_status(struct vhost_dev *dev)
+{
+    struct vhost_vdpa *v = dev->opaque;
+
+    if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
+        return;
     }
+
+    vhost_vdpa_reset_device(dev);
+    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
+                                VIRTIO_CONFIG_S_DRIVER);
+    memory_listener_unregister(&v->listener);
 }
 
 static int vhost_vdpa_set_log_base(struct vhost_dev *dev, uint64_t base,
@@ -1346,4 +1355,5 @@ const VhostOps vdpa_ops = {
         .vhost_vq_get_addr = vhost_vdpa_vq_get_addr,
         .vhost_force_iommu = vhost_vdpa_force_iommu,
         .vhost_set_config_call = vhost_vdpa_set_config_call,
+        .vhost_reset_status = vhost_vdpa_reset_status,
 };
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index eb8c4c378c..a266396576 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -2049,6 +2049,9 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
                              hdev->vqs + i,
                              hdev->vq_index + i);
     }
+    if (hdev->vhost_ops->vhost_reset_status) {
+        hdev->vhost_ops->vhost_reset_status(hdev);
+    }
 
     if (vhost_dev_has_iommu(hdev)) {
         if (hdev->vhost_ops->vhost_set_iotlb_callback) {
-- 
2.31.1




* [PATCH 05/13] vdpa: rewind at get_base, not set_base
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

At this moment it is only possible to migrate to a vdpa device running
with x-svq=on. As a protective measure, the rewind of the inflight
descriptors was done at the destination. That way, if the source sent a
virtqueue with in-use descriptors, they are always discarded.

Since this series also allows migrating to passthrough devices with no
SVQ, the right thing to do is to rewind at the source so the vring bases
are correct.

Support for inflight descriptors may be added in the future.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 6d20d70601..6385c16a55 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1211,18 +1211,7 @@ static int vhost_vdpa_set_vring_base(struct vhost_dev *dev,
                                        struct vhost_vring_state *ring)
 {
     struct vhost_vdpa *v = dev->opaque;
-    VirtQueue *vq = virtio_get_queue(dev->vdev, ring->index);
 
-    /*
-     * vhost-vdpa devices does not support in-flight requests. Set all of them
-     * as available.
-     *
-     * TODO: This is ok for networking, but other kinds of devices might
-     * have problems with these retransmissions.
-     */
-    while (virtqueue_rewind(vq, 1)) {
-        continue;
-    }
     if (v->shadow_vqs_enabled) {
         /*
          * Device vring base was set at device start. SVQ base is handled by
@@ -1241,6 +1230,19 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev,
     int ret;
 
     if (v->shadow_vqs_enabled) {
+        VirtQueue *vq = virtio_get_queue(dev->vdev, ring->index);
+
+        /*
+         * vhost-vdpa devices does not support in-flight requests. Set all of
+         * them as available.
+         *
+         * TODO: This is ok for networking, but other kinds of devices might
+         * have problems with these retransmissions.
+         */
+        while (virtqueue_rewind(vq, 1)) {
+            continue;
+        }
+
         ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index);
         return 0;
     }
-- 
2.31.1




* [PATCH 06/13] vdpa net: allow VHOST_F_LOG_ALL
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

Since some actions move to the start function instead of init, the
device features may not be the parent vdpa device's, but the ones
returned by the vhost backend.  If the transition to SVQ is supported,
the vhost backend will return _F_LOG_ALL to signal that the device is
migratable.

Add VHOST_F_LOG_ALL.  HW dirty page tracking can be added on top of this
change if the device supports it in the future.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 net/vhost-vdpa.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index a9e6c8f28e..dd686b4514 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -98,6 +98,8 @@ static const uint64_t vdpa_svq_device_features =
     BIT_ULL(VIRTIO_NET_F_MQ) |
     BIT_ULL(VIRTIO_F_ANY_LAYOUT) |
     BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) |
+    /* VHOST_F_LOG_ALL is exposed by SVQ */
+    BIT_ULL(VHOST_F_LOG_ALL) |
     BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
     BIT_ULL(VIRTIO_NET_F_STANDBY);
 
-- 
2.31.1




* [PATCH 07/13] vdpa: add vdpa net migration state notifier
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

This allows net to restart the device backend to configure SVQ on it.

Ideally, these changes should not be net specific. However, the vdpa net
backend is the one with enough knowledge to configure everything, for a
few reasons:
* Queues might need to be shadowed or not depending on their kind
  (control vs data).
* Queues need to share the same map translations (iova tree).

Because of that it is cleaner to restart the whole net backend and
configure it again as expected, similar to how vhost-kernel moves between
userspace and passthrough.

If more kinds of devices need dynamic switching to SVQ we can create a
callback struct like VhostOps and move most of the code there; a rough
sketch follows.  VhostOps cannot be reused since all vdpa backends share
them, and specializing them just for networking would be too heavy.
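
A purely illustrative sketch of what such a struct could look like (the
names and fields are made up here, not part of this series):

    /* Hypothetical per-device-kind hooks for dynamic SVQ switching */
    typedef struct VhostSVQSwitchOps {
        /* Stop the backend, reconfigure shadowing of every vq, restart it */
        int (*svq_switch)(struct vhost_dev *dev, bool shadow);
        /* Migrate device state beyond the vring bases (e.g. CVQ state) */
        int (*svq_save_extra_state)(struct vhost_dev *dev);
    } VhostSVQSwitchOps;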

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
v3:
* Add TODO to use the resume operation in the future.
* Use migration_in_setup and migration_has_failed instead of a
  complicated switch case.
---
 net/vhost-vdpa.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index dd686b4514..bca13f97fd 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -26,12 +26,14 @@
 #include <err.h>
 #include "standard-headers/linux/virtio_net.h"
 #include "monitor/monitor.h"
+#include "migration/misc.h"
 #include "hw/virtio/vhost.h"
 
 /* Todo:need to add the multiqueue support here */
 typedef struct VhostVDPAState {
     NetClientState nc;
     struct vhost_vdpa vhost_vdpa;
+    Notifier migration_state;
     VHostNetState *vhost_net;
 
     /* Control commands shadow buffers */
@@ -241,10 +243,79 @@ static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
     return DO_UPCAST(VhostVDPAState, nc, nc0);
 }
 
+static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
+{
+    struct vhost_vdpa *v = &s->vhost_vdpa;
+    VirtIONet *n;
+    VirtIODevice *vdev;
+    int data_queue_pairs, cvq, r;
+    NetClientState *peer;
+
+    /* We are only called on the first data vqs and only if x-svq is not set */
+    if (s->vhost_vdpa.shadow_vqs_enabled == enable) {
+        return;
+    }
+
+    vdev = v->dev->vdev;
+    n = VIRTIO_NET(vdev);
+    if (!n->vhost_started) {
+        return;
+    }
+
+    data_queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
+    cvq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ) ?
+                                  n->max_ncs - n->max_queue_pairs : 0;
+    /*
+     * TODO: vhost_net_stop does suspend, get_base and reset. We can be smarter
+     * in the future and resume the device if read-only operations between
+     * suspend and reset goes wrong.
+     */
+    vhost_net_stop(vdev, n->nic->ncs, data_queue_pairs, cvq);
+
+    peer = s->nc.peer;
+    for (int i = 0; i < data_queue_pairs + cvq; i++) {
+        VhostVDPAState *vdpa_state;
+        NetClientState *nc;
+
+        if (i < data_queue_pairs) {
+            nc = qemu_get_peer(peer, i);
+        } else {
+            nc = qemu_get_peer(peer, n->max_queue_pairs);
+        }
+
+        vdpa_state = DO_UPCAST(VhostVDPAState, nc, nc);
+        vdpa_state->vhost_vdpa.shadow_data = enable;
+
+        if (i < data_queue_pairs) {
+            /* Do not override CVQ shadow_vqs_enabled */
+            vdpa_state->vhost_vdpa.shadow_vqs_enabled = enable;
+        }
+    }
+
+    r = vhost_net_start(vdev, n->nic->ncs, data_queue_pairs, cvq);
+    if (unlikely(r < 0)) {
+        error_report("unable to start vhost net: %s(%d)", g_strerror(-r), -r);
+    }
+}
+
+static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data)
+{
+    MigrationState *migration = data;
+    VhostVDPAState *s = container_of(notifier, VhostVDPAState,
+                                     migration_state);
+
+    if (migration_in_setup(migration)) {
+        vhost_vdpa_net_log_global_enable(s, true);
+    } else if (migration_has_failed(migration)) {
+        vhost_vdpa_net_log_global_enable(s, false);
+    }
+}
+
 static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
 {
     struct vhost_vdpa *v = &s->vhost_vdpa;
 
+    add_migration_state_change_notifier(&s->migration_state);
     if (v->shadow_vqs_enabled) {
         v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
                                            v->iova_range.last);
@@ -278,6 +349,10 @@ static void vhost_vdpa_net_client_stop(NetClientState *nc)
 
     assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
 
+    if (s->vhost_vdpa.index == 0) {
+        remove_migration_state_change_notifier(&s->migration_state);
+    }
+
     dev = s->vhost_vdpa.dev;
     if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
         g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
@@ -741,6 +816,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
     s->vhost_vdpa.device_fd = vdpa_device_fd;
     s->vhost_vdpa.index = queue_pair_index;
     s->always_svq = svq;
+    s->migration_state.notify = vdpa_net_migration_state_notifier;
     s->vhost_vdpa.shadow_vqs_enabled = svq;
     s->vhost_vdpa.iova_range = iova_range;
     s->vhost_vdpa.shadow_data = svq;
-- 
2.31.1




* [PATCH 08/13] vdpa: disable RAM block discard only for the first device
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

Although it does not make a big difference, it's more correct and
simplifies the cleanup path in subsequent patches.

Move the ram_block_discard_disable(false) call to the top of
vhost_vdpa_cleanup because:
* We cannot use vhost_vdpa_first_dev after the dev->opaque = NULL
  assignment.
* It improves the stack order in cleanup: since it is the last action
  taken in init, it should be the first in cleanup.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 6385c16a55..b0d69df9e6 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -431,16 +431,6 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
     trace_vhost_vdpa_init(dev, opaque);
     int ret;
 
-    /*
-     * Similar to VFIO, we end up pinning all guest memory and have to
-     * disable discarding of RAM.
-     */
-    ret = ram_block_discard_disable(true);
-    if (ret) {
-        error_report("Cannot set discarding of RAM broken");
-        return ret;
-    }
-
     v = opaque;
     v->dev = dev;
     dev->opaque =  opaque ;
@@ -452,6 +442,16 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
         return 0;
     }
 
+    /*
+     * Similar to VFIO, we end up pinning all guest memory and have to
+     * disable discarding of RAM.
+     */
+    ret = ram_block_discard_disable(true);
+    if (ret) {
+        error_report("Cannot set discarding of RAM broken");
+        return ret;
+    }
+
     vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
                                VIRTIO_CONFIG_S_DRIVER);
 
@@ -577,12 +577,15 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
     assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
     v = dev->opaque;
     trace_vhost_vdpa_cleanup(dev, v);
+    if (vhost_vdpa_first_dev(dev)) {
+        ram_block_discard_disable(false);
+    }
+
     vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
     memory_listener_unregister(&v->listener);
     vhost_vdpa_svq_cleanup(dev);
 
     dev->opaque = NULL;
-    ram_block_discard_disable(false);
 
     return 0;
 }
-- 
2.31.1




* [PATCH 09/13] vdpa net: block migration if the device has CVQ
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

Devices with CVQ need to migrate state beyond the vq state.  This is left
to a future series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index bca13f97fd..309861e56c 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -955,11 +955,17 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     }
 
     if (has_cvq) {
+        VhostVDPAState *s;
+
         nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
                                  vdpa_device_fd, i, 1, false,
                                  opts->x_svq, iova_range);
         if (!nc)
             goto err;
+
+        s = DO_UPCAST(VhostVDPAState, nc, nc);
+        error_setg(&s->vhost_vdpa.dev->migration_blocker,
+                   "net vdpa cannot migrate with CVQ feature");
     }
 
     return 0;
-- 
2.31.1




* [PATCH 10/13] vdpa: block migration if device has unsupported features
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

A vdpa net device must initialize with SVQ in order to be migratable at
this moment, and the initialization code verifies some conditions.  If the
device is not initialized with the x-svq parameter, it will not expose
_F_LOG, so the vhost subsystem will block VM migration from its
initialization.

Next patches change this, so we need to verify migration conditions
differently.

QEMU only supports a subset of net features in SVQ, and it cannot migrate
state that it cannot track or restore in the destination.  Add a migration
blocker if the device offers an unsupported feature.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 309861e56c..a0c4d5de2c 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -952,6 +952,15 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
                                      iova_range);
         if (!ncs[i])
             goto err;
+
+        if (i == 0) {
+            VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, ncs[0]);
+
+            if (!s->vhost_vdpa.dev->migration_blocker) {
+                vhost_vdpa_net_valid_svq_features(features,
+                                        &s->vhost_vdpa.dev->migration_blocker);
+            }
+        }
     }
 
     if (has_cvq) {
-- 
2.31.1




* [PATCH 11/13] vdpa: block migration if dev does not have _F_SUSPEND
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

Next patches enable devices to be migrated even if the vdpa netdev has not
been started with x-svq. However, not all devices are migratable, so we
need to block migration if we detect that.

Block vhost-vdpa device migration if it does not offer _F_SUSPEND and it
has not been started with x-svq.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index b0d69df9e6..c01411af30 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -442,6 +442,27 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
         return 0;
     }
 
+    /*
+     * If dev->shadow_vqs_enabled at initialization that means the device has
+     * been started with x-svq=on, so don't block migration
+     */
+    if (dev->migration_blocker == NULL && !v->shadow_vqs_enabled) {
+        uint64_t backend_features;
+
+        /* We don't have dev->backend_features yet */
+        ret = vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES,
+                              &backend_features);
+        if (unlikely(ret)) {
+            error_setg_errno(errp, -ret, "Could not get backend features");
+            return ret;
+        }
+
+        if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
+            error_setg(&dev->migration_blocker,
+                "vhost-vdpa backend lacks VHOST_BACKEND_F_SUSPEND feature.");
+        }
+    }
+
     /*
      * Similar to VFIO, we end up pinning all guest memory and have to
      * disable discarding of RAM.
-- 
2.31.1




* [PATCH 12/13] vdpa: block migration if SVQ does not admit a feature
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

Next patches enable devices to be migrated even if the vdpa netdev has not
been started with x-svq. However, not all devices are migratable, so we
need to block migration if we detect that.

Block migration if we detect that the device exposes a feature SVQ does
not know how to work with.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index c01411af30..2de0796e66 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -460,6 +460,15 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
         if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
             error_setg(&dev->migration_blocker,
                 "vhost-vdpa backend lacks VHOST_BACKEND_F_SUSPEND feature.");
+        } else {
+            /* We don't have dev->features yet */
+            uint64_t features;
+            ret = vhost_vdpa_get_dev_features(dev, &features);
+            if (unlikely(ret)) {
+                error_setg_errno(errp, -ret, "Could not get device features");
+                return ret;
+            }
+            vhost_svq_valid_features(features, &dev->migration_blocker);
         }
     }
 
-- 
2.31.1




* [PATCH 13/13] vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices
From: Eugenio Pérez @ 2023-02-03 19:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Cindy Lu, Liuxiangdong, alvaro.karsz, Zhu Lingshan, Jason Wang,
	longpeng2, Michael S. Tsirkin, Shannon Nelson, Eli Cohen,
	Stefan Hajnoczi, virtualization, Parav Pandit, si-wei.liu,
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Harpreet Singh Anand, Gonglei (Arei)

vhost-vdpa devices can return this feature now that blockers have been
set for the cases where some required features are not met.

Expose VHOST_F_LOG_ALL only in that case.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 2de0796e66..a94ce254a9 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1319,10 +1319,9 @@ static int vhost_vdpa_set_vring_call(struct vhost_dev *dev,
 static int vhost_vdpa_get_features(struct vhost_dev *dev,
                                      uint64_t *features)
 {
-    struct vhost_vdpa *v = dev->opaque;
     int ret = vhost_vdpa_get_dev_features(dev, features);
 
-    if (ret == 0 && v->shadow_vqs_enabled) {
+    if (ret == 0) {
         /* Add SVQ logging capabilities */
         *features |= BIT_ULL(VHOST_F_LOG_ALL);
     }
-- 
2.31.1


