* [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ
@ 2022-07-08 10:49 Eugenio Pérez
  2022-07-08 10:49 ` [PATCH 01/22] vhost: Return earlier if used buffers overrun Eugenio Pérez
                   ` (21 more replies)
  0 siblings, 22 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

The control virtqueue is used by the networking device to accept various
commands from the driver. It is a requirement for supporting advanced
configurations.

QEMU issues an rx filter change event when the device's MAC address has
changed and the previous one has not yet been queried by external agents.

Shadow VirtQueue (SVQ) already makes it possible to track the state of
virtqueues, effectively intercepting them so QEMU can track which regions of
memory are dirtied by device actions and need migration. However, this does
not cover the networking device state as seen by the driver, which is set
through CVQ messages such as MAC address changes.

This series uses the SVQ infrastructure to intercept the networking control
messages used by the device. This way, QEMU is able to update the VirtIONet
device model and react to them. In particular, this series enables rx filter
change notification.
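As a rough sketch of what intercepting one CVQ command involves (header
layout and constants come from the virtio spec; the model-update hook below
is a hypothetical stand-in for the real VirtIONet handlers):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Command header and ack values as defined by the virtio spec. */
#define VIRTIO_NET_OK  0
#define VIRTIO_NET_ERR 1
#define VIRTIO_NET_CTRL_MAC          1
#define VIRTIO_NET_CTRL_MAC_ADDR_SET 1

struct virtio_net_ctrl_hdr {
    uint8_t class;
    uint8_t cmd;
};

/* Hypothetical stand-in for the VirtIONet device model state. */
static uint8_t model_mac[6];

/*
 * Parse one CVQ command buffer and mirror its effect into the local
 * model, returning the ack byte the device model would write back.
 */
static uint8_t handle_ctrl(const uint8_t *buf, size_t len)
{
    struct virtio_net_ctrl_hdr hdr;

    if (len < sizeof(hdr)) {
        return VIRTIO_NET_ERR;
    }
    memcpy(&hdr, buf, sizeof(hdr));

    if (hdr.class == VIRTIO_NET_CTRL_MAC &&
        hdr.cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET &&
        len >= sizeof(hdr) + sizeof(model_mac)) {
        memcpy(model_mac, buf + sizeof(hdr), sizeof(model_mac));
        return VIRTIO_NET_OK;
    }
    return VIRTIO_NET_ERR;
}
```

The series routes the equivalent parsing through the existing virtio-net
handlers (exposed in patch 4) rather than duplicating it like this.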

This is a prerequisite for achieving a net vdpa device with CVQ live
migration. It is a stripped-down version of [1], with error paths checked
and no migration enabled.

The first patch fixes a memory leak that could occur if the device tricked
QEMU into thinking it had returned more buffers than the SVQ size. This
should not be possible, but we are a bit safer this way.

The next nine patches reorder and clean the code base so it is easier to
apply the later ones. No functional change should result from them.

Patches 11 to 16 expose an SVQ API so other parts of QEMU can interact with
it. In particular, it will be used by vhost-vdpa net to handle CVQ
messages.
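The intended CVQ round trip can be mocked like this (the mock_* helpers are
toy stand-ins; the real vhost_svq_inject/vhost_svq_poll added by these
patches live in hw/virtio/vhost-shadow-virtqueue.c and their signatures may
differ):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for the SVQ API added in patches 11-16. */
typedef struct MockSvq {
    uint8_t ack;    /* ack byte the "device" will answer with */
    int in_flight;  /* injected buffer not yet used by the device */
} MockSvq;

/* vhost_svq_inject(): queue a buffer the guest never sees. */
static int mock_svq_inject(MockSvq *svq, const void *out, size_t out_len)
{
    (void)out;
    (void)out_len;
    svq->in_flight = 1;
    return 0;
}

/* vhost_svq_poll(): wait for the device to use the buffer, return ack. */
static uint8_t mock_svq_poll(MockSvq *svq)
{
    svq->in_flight = 0;      /* mock device answers immediately */
    return svq->ack;
}

/* CVQ round trip as vhost-vdpa net would drive it: inject, then poll. */
static uint8_t mock_send_ctrl_cmd(MockSvq *svq, const void *cmd, size_t len)
{
    if (mock_svq_inject(svq, cmd, len) != 0) {
        return 1; /* VIRTIO_NET_ERR */
    }
    return mock_svq_poll(svq);
}
```

Only after the poll returns success does the series feed the acknowledged
command into the virtio-net device model (patches 17 to 19).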

Patches 17 to 19 enable updating the virtio-net device model for each
CVQ message acknowledged by the device.

The last patches enable the x-svq parameter, forbidding device migration
since the state is not yet restored in the destination's vdpa device. That
will be added in a later series, building on this work.
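For reference, enabling the parameter might look like this (the vdpa device
path and guest-facing options are hypothetical and depend on the host
setup):

```shell
# x-svq=on enables the shadow virtqueue for the vhost-vdpa netdev;
# migration is blocked while it is in use (patch 21).
qemu-system-x86_64 \
    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vdpa0,x-svq=on \
    -device virtio-net-pci,netdev=vdpa0
```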

Comments are welcome.

[1] https://patchwork.kernel.org/project/qemu-devel/cover/20220706184008.1649478-1-eperezma@redhat.com/

Eugenio Pérez (22):
  vhost: Return earlier if used buffers overrun
  vhost: move descriptor translation to vhost_svq_vring_write_descs
  vdpa: Clean vhost_vdpa_dev_start(dev, false)
  virtio-net: Expose ctrl virtqueue logic
  vhost: Decouple vhost_svq_add_split from VirtQueueElement
  vhost: Reorder vhost_svq_last_desc_of_chain
  vhost: Add SVQElement
  vhost: Move last chain id to SVQ element
  vhost: Add opaque member to SVQElement
  vdpa: Small rename of error labels
  vhost: add vhost_svq_push_elem
  vhost: Add vhost_svq_inject
  vhost: add vhost_svq_poll
  vhost: Add custom used buffer callback
  vhost: Add svq avail_handler callback
  vhost: add detach SVQ operation
  vdpa: Export vhost_vdpa_dma_map and unmap calls
  vdpa: manual forward CVQ buffers
  vdpa: Buffer CVQ support on shadow virtqueue
  vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs
  vdpa: Add device migration blocker
  vdpa: Add x-svq to NetdevVhostVDPAOptions

 qapi/net.json                      |   9 +-
 hw/virtio/vhost-shadow-virtqueue.h |  64 +++-
 include/hw/virtio/vhost-vdpa.h     |   8 +
 include/hw/virtio/virtio-net.h     |   4 +
 hw/net/virtio-net.c                |  84 ++---
 hw/virtio/vhost-shadow-virtqueue.c | 287 +++++++++++++----
 hw/virtio/vhost-vdpa.c             |  63 ++--
 net/vhost-vdpa.c                   | 473 ++++++++++++++++++++++++++++-
 8 files changed, 855 insertions(+), 137 deletions(-)

-- 
2.31.1




^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 01/22] vhost: Return earlier if used buffers overrun
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
@ 2022-07-08 10:49 ` Eugenio Pérez
  2022-07-08 10:49 ` [PATCH 02/22] vhost: move descriptor translation to vhost_svq_vring_write_descs Eugenio Pérez
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

On a used-buffer overrun, the previous code missed the avail buffer it had
just picked from the queue. The used queue still ends up blocked forever in
that case, but it is cleaner to perform the check before calling
vhost_svq_get_buf.

Fixes: 100890f7cad50 ("vhost: Shadow virtqueue buffers forwarding")
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 56c96ebd13..9280285435 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -405,19 +405,21 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
         vhost_svq_disable_notification(svq);
         while (true) {
             uint32_t len;
-            g_autofree VirtQueueElement *elem = vhost_svq_get_buf(svq, &len);
-            if (!elem) {
-                break;
-            }
+            g_autofree VirtQueueElement *elem = NULL;
 
             if (unlikely(i >= svq->vring.num)) {
                 qemu_log_mask(LOG_GUEST_ERROR,
                          "More than %u used buffers obtained in a %u size SVQ",
                          i, svq->vring.num);
-                virtqueue_fill(vq, elem, len, i);
-                virtqueue_flush(vq, i);
+                virtqueue_flush(vq, svq->vring.num);
                 return;
             }
+
+            elem = vhost_svq_get_buf(svq, &len);
+            if (!elem) {
+                break;
+            }
+
             virtqueue_fill(vq, elem, len, i++);
         }
 
-- 
2.31.1




* [PATCH 02/22] vhost: move descriptor translation to vhost_svq_vring_write_descs
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
  2022-07-08 10:49 ` [PATCH 01/22] vhost: Return earlier if used buffers overrun Eugenio Pérez
@ 2022-07-08 10:49 ` Eugenio Pérez
  2022-07-08 10:49 ` [PATCH 03/22] vdpa: Clean vhost_vdpa_dev_start(dev, false) Eugenio Pérez
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

The translation is done for both in and out descriptors, so it is better
placed here.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 38 +++++++++++++++++++++---------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 9280285435..115d769b86 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -122,17 +122,35 @@ static bool vhost_svq_translate_addr(const VhostShadowVirtqueue *svq,
     return true;
 }
 
-static void vhost_vring_write_descs(VhostShadowVirtqueue *svq, hwaddr *sg,
-                                    const struct iovec *iovec, size_t num,
-                                    bool more_descs, bool write)
+/**
+ * Write descriptors to SVQ vring
+ *
+ * @svq: The shadow virtqueue
+ * @sg: Cache for hwaddr
+ * @iovec: The iovec from the guest
+ * @num: iovec length
+ * @more_descs: True if more descriptors come in the chain
+ * @write: True if they are writeable descriptors
+ *
+ * Return true if success, false otherwise and print error.
+ */
+static bool vhost_svq_vring_write_descs(VhostShadowVirtqueue *svq, hwaddr *sg,
+                                        const struct iovec *iovec, size_t num,
+                                        bool more_descs, bool write)
 {
     uint16_t i = svq->free_head, last = svq->free_head;
     unsigned n;
     uint16_t flags = write ? cpu_to_le16(VRING_DESC_F_WRITE) : 0;
     vring_desc_t *descs = svq->vring.desc;
+    bool ok;
 
     if (num == 0) {
-        return;
+        return true;
+    }
+
+    ok = vhost_svq_translate_addr(svq, sg, iovec, num);
+    if (unlikely(!ok)) {
+        return false;
     }
 
     for (n = 0; n < num; n++) {
@@ -150,6 +168,7 @@ static void vhost_vring_write_descs(VhostShadowVirtqueue *svq, hwaddr *sg,
     }
 
     svq->free_head = le16_to_cpu(svq->desc_next[last]);
+    return true;
 }
 
 static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
@@ -169,21 +188,18 @@ static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
         return false;
     }
 
-    ok = vhost_svq_translate_addr(svq, sgs, elem->out_sg, elem->out_num);
+    ok = vhost_svq_vring_write_descs(svq, sgs, elem->out_sg, elem->out_num,
+                                     elem->in_num > 0, false);
     if (unlikely(!ok)) {
         return false;
     }
-    vhost_vring_write_descs(svq, sgs, elem->out_sg, elem->out_num,
-                            elem->in_num > 0, false);
-
 
-    ok = vhost_svq_translate_addr(svq, sgs, elem->in_sg, elem->in_num);
+    ok = vhost_svq_vring_write_descs(svq, sgs, elem->in_sg, elem->in_num, false,
+                                     true);
     if (unlikely(!ok)) {
         return false;
     }
 
-    vhost_vring_write_descs(svq, sgs, elem->in_sg, elem->in_num, false, true);
-
     /*
      * Put the entry in the available array (but don't update avail->idx until
      * they do sync).
-- 
2.31.1




* [PATCH 03/22] vdpa: Clean vhost_vdpa_dev_start(dev, false)
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
  2022-07-08 10:49 ` [PATCH 01/22] vhost: Return earlier if used buffers overrun Eugenio Pérez
  2022-07-08 10:49 ` [PATCH 02/22] vhost: move descriptor translation to vhost_svq_vring_write_descs Eugenio Pérez
@ 2022-07-08 10:49 ` Eugenio Pérez
  2022-07-08 10:49 ` [PATCH 04/22] virtio-net: Expose ctrl virtqueue logic Eugenio Pérez
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

The return value is never checked and this is a cleanup path, so assume
success.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 33 ++++++++++-----------------------
 1 file changed, 10 insertions(+), 23 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 66f054a12c..d6ba4a492a 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -872,41 +872,35 @@ static int vhost_vdpa_svq_set_fds(struct vhost_dev *dev,
 /**
  * Unmap a SVQ area in the device
  */
-static bool vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v,
+static void vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v,
                                       const DMAMap *needle)
 {
     const DMAMap *result = vhost_iova_tree_find_iova(v->iova_tree, needle);
     hwaddr size;
-    int r;
 
     if (unlikely(!result)) {
         error_report("Unable to find SVQ address to unmap");
-        return false;
+        return;
     }
 
     size = ROUND_UP(result->size, qemu_real_host_page_size());
-    r = vhost_vdpa_dma_unmap(v, result->iova, size);
-    return r == 0;
+    vhost_vdpa_dma_unmap(v, result->iova, size);
 }
 
-static bool vhost_vdpa_svq_unmap_rings(struct vhost_dev *dev,
+static void vhost_vdpa_svq_unmap_rings(struct vhost_dev *dev,
                                        const VhostShadowVirtqueue *svq)
 {
     DMAMap needle = {};
     struct vhost_vdpa *v = dev->opaque;
     struct vhost_vring_addr svq_addr;
-    bool ok;
 
     vhost_svq_get_vring_addr(svq, &svq_addr);
 
     needle.translated_addr = svq_addr.desc_user_addr;
-    ok = vhost_vdpa_svq_unmap_ring(v, &needle);
-    if (unlikely(!ok)) {
-        return false;
-    }
+    vhost_vdpa_svq_unmap_ring(v, &needle);
 
     needle.translated_addr = svq_addr.used_user_addr;
-    return vhost_vdpa_svq_unmap_ring(v, &needle);
+    vhost_vdpa_svq_unmap_ring(v, &needle);
 }
 
 /**
@@ -1066,23 +1060,19 @@ err:
     return false;
 }
 
-static bool vhost_vdpa_svqs_stop(struct vhost_dev *dev)
+static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
 {
     struct vhost_vdpa *v = dev->opaque;
 
     if (!v->shadow_vqs) {
-        return true;
+        return;
     }
 
     for (unsigned i = 0; i < v->shadow_vqs->len; ++i) {
         VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
-        bool ok = vhost_vdpa_svq_unmap_rings(dev, svq);
-        if (unlikely(!ok)) {
-            return false;
-        }
+        vhost_vdpa_svq_unmap_rings(dev, svq);
     }
 
-    return true;
 }
 
 static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
@@ -1099,10 +1089,7 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
         }
         vhost_vdpa_set_vring_ready(dev);
     } else {
-        ok = vhost_vdpa_svqs_stop(dev);
-        if (unlikely(!ok)) {
-            return -1;
-        }
+        vhost_vdpa_svqs_stop(dev);
         vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
     }
 
-- 
2.31.1




* [PATCH 04/22] virtio-net: Expose ctrl virtqueue logic
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (2 preceding siblings ...)
  2022-07-08 10:49 ` [PATCH 03/22] vdpa: Clean vhost_vdpa_dev_start(dev, false) Eugenio Pérez
@ 2022-07-08 10:49 ` Eugenio Pérez
  2022-07-08 10:49 ` [PATCH 05/22] vhost: Decouple vhost_svq_add_split from VirtQueueElement Eugenio Pérez
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

This allows external vhost-net devices to modify the state of the VirtIO
device model once the vhost-vdpa device has acknowledged the control
commands.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/virtio-net.h |  4 ++
 hw/net/virtio-net.c            | 84 ++++++++++++++++++++--------------
 2 files changed, 53 insertions(+), 35 deletions(-)

diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index eb87032627..42caea0d1d 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -218,6 +218,10 @@ struct VirtIONet {
     struct EBPFRSSContext ebpf_rss;
 };
 
+size_t virtio_net_handle_ctrl_iov(VirtIODevice *vdev,
+                                  const struct iovec *in_sg, unsigned in_num,
+                                  const struct iovec *out_sg,
+                                  unsigned out_num);
 void virtio_net_set_netclient_name(VirtIONet *n, const char *name,
                                    const char *type);
 
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 7ad948ee7c..53bb92c9f1 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1434,57 +1434,71 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
     return VIRTIO_NET_OK;
 }
 
-static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
+size_t virtio_net_handle_ctrl_iov(VirtIODevice *vdev,
+                                  const struct iovec *in_sg, unsigned in_num,
+                                  const struct iovec *out_sg,
+                                  unsigned out_num)
 {
     VirtIONet *n = VIRTIO_NET(vdev);
     struct virtio_net_ctrl_hdr ctrl;
     virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
-    VirtQueueElement *elem;
     size_t s;
     struct iovec *iov, *iov2;
-    unsigned int iov_cnt;
+
+    if (iov_size(in_sg, in_num) < sizeof(status) ||
+        iov_size(out_sg, out_num) < sizeof(ctrl)) {
+        virtio_error(vdev, "virtio-net ctrl missing headers");
+        return 0;
+    }
+
+    iov2 = iov = g_memdup2(out_sg, sizeof(struct iovec) * out_num);
+    s = iov_to_buf(iov, out_num, 0, &ctrl, sizeof(ctrl));
+    iov_discard_front(&iov, &out_num, sizeof(ctrl));
+    if (s != sizeof(ctrl)) {
+        status = VIRTIO_NET_ERR;
+    } else if (ctrl.class == VIRTIO_NET_CTRL_RX) {
+        status = virtio_net_handle_rx_mode(n, ctrl.cmd, iov, out_num);
+    } else if (ctrl.class == VIRTIO_NET_CTRL_MAC) {
+        status = virtio_net_handle_mac(n, ctrl.cmd, iov, out_num);
+    } else if (ctrl.class == VIRTIO_NET_CTRL_VLAN) {
+        status = virtio_net_handle_vlan_table(n, ctrl.cmd, iov, out_num);
+    } else if (ctrl.class == VIRTIO_NET_CTRL_ANNOUNCE) {
+        status = virtio_net_handle_announce(n, ctrl.cmd, iov, out_num);
+    } else if (ctrl.class == VIRTIO_NET_CTRL_MQ) {
+        status = virtio_net_handle_mq(n, ctrl.cmd, iov, out_num);
+    } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
+        status = virtio_net_handle_offloads(n, ctrl.cmd, iov, out_num);
+    }
+
+    s = iov_from_buf(in_sg, in_num, 0, &status, sizeof(status));
+    assert(s == sizeof(status));
+
+    g_free(iov2);
+    return sizeof(status);
+}
+
+static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
+{
+    VirtQueueElement *elem;
 
     for (;;) {
+        size_t written;
         elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
         if (!elem) {
             break;
         }
-        if (iov_size(elem->in_sg, elem->in_num) < sizeof(status) ||
-            iov_size(elem->out_sg, elem->out_num) < sizeof(ctrl)) {
-            virtio_error(vdev, "virtio-net ctrl missing headers");
+
+        written = virtio_net_handle_ctrl_iov(vdev, elem->in_sg, elem->in_num,
+                                             elem->out_sg, elem->out_num);
+        if (written > 0) {
+            virtqueue_push(vq, elem, written);
+            virtio_notify(vdev, vq);
+            g_free(elem);
+        } else {
             virtqueue_detach_element(vq, elem, 0);
             g_free(elem);
             break;
         }
-
-        iov_cnt = elem->out_num;
-        iov2 = iov = g_memdup2(elem->out_sg,
-                               sizeof(struct iovec) * elem->out_num);
-        s = iov_to_buf(iov, iov_cnt, 0, &ctrl, sizeof(ctrl));
-        iov_discard_front(&iov, &iov_cnt, sizeof(ctrl));
-        if (s != sizeof(ctrl)) {
-            status = VIRTIO_NET_ERR;
-        } else if (ctrl.class == VIRTIO_NET_CTRL_RX) {
-            status = virtio_net_handle_rx_mode(n, ctrl.cmd, iov, iov_cnt);
-        } else if (ctrl.class == VIRTIO_NET_CTRL_MAC) {
-            status = virtio_net_handle_mac(n, ctrl.cmd, iov, iov_cnt);
-        } else if (ctrl.class == VIRTIO_NET_CTRL_VLAN) {
-            status = virtio_net_handle_vlan_table(n, ctrl.cmd, iov, iov_cnt);
-        } else if (ctrl.class == VIRTIO_NET_CTRL_ANNOUNCE) {
-            status = virtio_net_handle_announce(n, ctrl.cmd, iov, iov_cnt);
-        } else if (ctrl.class == VIRTIO_NET_CTRL_MQ) {
-            status = virtio_net_handle_mq(n, ctrl.cmd, iov, iov_cnt);
-        } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
-            status = virtio_net_handle_offloads(n, ctrl.cmd, iov, iov_cnt);
-        }
-
-        s = iov_from_buf(elem->in_sg, elem->in_num, 0, &status, sizeof(status));
-        assert(s == sizeof(status));
-
-        virtqueue_push(vq, elem, sizeof(status));
-        virtio_notify(vdev, vq);
-        g_free(iov2);
-        g_free(elem);
     }
 }
 
-- 
2.31.1




* [PATCH 05/22] vhost: Decouple vhost_svq_add_split from VirtQueueElement
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (3 preceding siblings ...)
  2022-07-08 10:49 ` [PATCH 04/22] virtio-net: Expose ctrl virtqueue logic Eugenio Pérez
@ 2022-07-08 10:49 ` Eugenio Pérez
  2022-07-08 10:49 ` [PATCH 06/22] vhost: Reorder vhost_svq_last_desc_of_chain Eugenio Pérez
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

VirtQueueElement comes from the guest, but we are heading toward an SVQ
that can inject elements without the guest's knowledge.

To do so, make this function accept sg buffers directly instead of a
VirtQueueElement.

Add vhost_svq_add_element to keep the element-based convenience wrapper.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 38 +++++++++++++++++++++---------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 115d769b86..2d70f832e9 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -172,30 +172,32 @@ static bool vhost_svq_vring_write_descs(VhostShadowVirtqueue *svq, hwaddr *sg,
 }
 
 static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
-                                VirtQueueElement *elem, unsigned *head)
+                                const struct iovec *out_sg, size_t out_num,
+                                const struct iovec *in_sg, size_t in_num,
+                                unsigned *head)
 {
     unsigned avail_idx;
     vring_avail_t *avail = svq->vring.avail;
     bool ok;
-    g_autofree hwaddr *sgs = g_new(hwaddr, MAX(elem->out_num, elem->in_num));
+    g_autofree hwaddr *sgs = NULL;
 
     *head = svq->free_head;
 
     /* We need some descriptors here */
-    if (unlikely(!elem->out_num && !elem->in_num)) {
+    if (unlikely(!out_num && !in_num)) {
         qemu_log_mask(LOG_GUEST_ERROR,
                       "Guest provided element with no descriptors");
         return false;
     }
 
-    ok = vhost_svq_vring_write_descs(svq, sgs, elem->out_sg, elem->out_num,
-                                     elem->in_num > 0, false);
+    sgs = g_new(hwaddr, MAX(out_num, in_num));
+    ok = vhost_svq_vring_write_descs(svq, sgs, out_sg, out_num, in_num > 0,
+                                     false);
     if (unlikely(!ok)) {
         return false;
     }
 
-    ok = vhost_svq_vring_write_descs(svq, sgs, elem->in_sg, elem->in_num, false,
-                                     true);
+    ok = vhost_svq_vring_write_descs(svq, sgs, in_sg, in_num, false, true);
     if (unlikely(!ok)) {
         return false;
     }
@@ -222,10 +224,13 @@ static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
  * takes ownership of the element: In case of failure, it is free and the SVQ
  * is considered broken.
  */
-static bool vhost_svq_add(VhostShadowVirtqueue *svq, VirtQueueElement *elem)
+static bool vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
+                          size_t out_num, const struct iovec *in_sg,
+                          size_t in_num, VirtQueueElement *elem)
 {
     unsigned qemu_head;
-    bool ok = vhost_svq_add_split(svq, elem, &qemu_head);
+    bool ok = vhost_svq_add_split(svq, out_sg, out_num, in_sg, in_num,
+                                  &qemu_head);
     if (unlikely(!ok)) {
         g_free(elem);
         return false;
@@ -249,6 +254,18 @@ static void vhost_svq_kick(VhostShadowVirtqueue *svq)
     event_notifier_set(&svq->hdev_kick);
 }
 
+static bool vhost_svq_add_element(VhostShadowVirtqueue *svq,
+                                  VirtQueueElement *elem)
+{
+    bool ok = vhost_svq_add(svq, elem->out_sg, elem->out_num, elem->in_sg,
+                            elem->in_num, elem);
+    if (ok) {
+        vhost_svq_kick(svq);
+    }
+
+    return ok;
+}
+
 /**
  * Forward available buffers.
  *
@@ -301,12 +318,11 @@ static void vhost_handle_guest_kick(VhostShadowVirtqueue *svq)
                 return;
             }
 
-            ok = vhost_svq_add(svq, elem);
+            ok = vhost_svq_add_element(svq, g_steal_pointer(&elem));
             if (unlikely(!ok)) {
                 /* VQ is broken, just return and ignore any other kicks */
                 return;
             }
-            vhost_svq_kick(svq);
         }
 
         virtio_queue_set_notification(svq->vq, true);
-- 
2.31.1




* [PATCH 06/22] vhost: Reorder vhost_svq_last_desc_of_chain
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (4 preceding siblings ...)
  2022-07-08 10:49 ` [PATCH 05/22] vhost: Decouple vhost_svq_add_split from VirtQueueElement Eugenio Pérez
@ 2022-07-08 10:49 ` Eugenio Pérez
  2022-07-08 10:49 ` [PATCH 07/22] vhost: Add SVQElement Eugenio Pérez
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

SVQ is going to store it in SVQElement, so it is needed before the add
functions.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 2d70f832e9..a4d5d7bae0 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -217,6 +217,16 @@ static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
     return true;
 }
 
+static uint16_t vhost_svq_last_desc_of_chain(const VhostShadowVirtqueue *svq,
+                                             uint16_t num, uint16_t i)
+{
+    for (uint16_t j = 0; j < (num - 1); ++j) {
+        i = le16_to_cpu(svq->desc_next[i]);
+    }
+
+    return i;
+}
+
 /**
  * Add an element to a SVQ.
  *
@@ -374,16 +384,6 @@ static void vhost_svq_disable_notification(VhostShadowVirtqueue *svq)
     svq->vring.avail->flags |= cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
 }
 
-static uint16_t vhost_svq_last_desc_of_chain(const VhostShadowVirtqueue *svq,
-                                             uint16_t num, uint16_t i)
-{
-    for (uint16_t j = 0; j < (num - 1); ++j) {
-        i = le16_to_cpu(svq->desc_next[i]);
-    }
-
-    return i;
-}
-
 static VirtQueueElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
                                            uint32_t *len)
 {
-- 
2.31.1




* [PATCH 07/22] vhost: Add SVQElement
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (5 preceding siblings ...)
  2022-07-08 10:49 ` [PATCH 06/22] vhost: Reorder vhost_svq_last_desc_of_chain Eugenio Pérez
@ 2022-07-08 10:49 ` Eugenio Pérez
  2022-07-08 10:49 ` [PATCH 08/22] vhost: Move last chain id to SVQ element Eugenio Pérez
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

This will allow SVQ to attach metadata to the different queue elements. To
simplify the change, this patch only stores the actual element.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h |  8 ++++--
 hw/virtio/vhost-shadow-virtqueue.c | 41 ++++++++++++++++++++----------
 2 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index c132c994e9..0b34f48037 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -15,6 +15,10 @@
 #include "standard-headers/linux/vhost_types.h"
 #include "hw/virtio/vhost-iova-tree.h"
 
+typedef struct SVQElement {
+    VirtQueueElement *elem;
+} SVQElement;
+
 /* Shadow virtqueue to relay notifications */
 typedef struct VhostShadowVirtqueue {
     /* Shadow vring */
@@ -47,8 +51,8 @@ typedef struct VhostShadowVirtqueue {
     /* IOVA mapping */
     VhostIOVATree *iova_tree;
 
-    /* Map for use the guest's descriptors */
-    VirtQueueElement **ring_id_maps;
+    /* Each element context */
+    SVQElement *ring_id_maps;
 
     /* Next VirtQueue element that guest made available */
     VirtQueueElement *next_guest_avail_elem;
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index a4d5d7bae0..d50e1383f5 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -246,7 +246,7 @@ static bool vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
         return false;
     }
 
-    svq->ring_id_maps[qemu_head] = elem;
+    svq->ring_id_maps[qemu_head].elem = elem;
     return true;
 }
 
@@ -384,15 +384,25 @@ static void vhost_svq_disable_notification(VhostShadowVirtqueue *svq)
     svq->vring.avail->flags |= cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
 }
 
-static VirtQueueElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
-                                           uint32_t *len)
+static bool vhost_svq_is_empty_elem(SVQElement elem)
+{
+    return elem.elem == NULL;
+}
+
+static SVQElement vhost_svq_empty_elem(void)
+{
+    return (SVQElement){};
+}
+
+static SVQElement vhost_svq_get_buf(VhostShadowVirtqueue *svq, uint32_t *len)
 {
     const vring_used_t *used = svq->vring.used;
     vring_used_elem_t used_elem;
+    SVQElement svq_elem = vhost_svq_empty_elem();
     uint16_t last_used, last_used_chain, num;
 
     if (!vhost_svq_more_used(svq)) {
-        return NULL;
+        return svq_elem;
     }
 
     /* Only get used array entries after they have been exposed by dev */
@@ -405,24 +415,25 @@ static VirtQueueElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
     if (unlikely(used_elem.id >= svq->vring.num)) {
         qemu_log_mask(LOG_GUEST_ERROR, "Device %s says index %u is used",
                       svq->vdev->name, used_elem.id);
-        return NULL;
+        return svq_elem;
     }
 
-    if (unlikely(!svq->ring_id_maps[used_elem.id])) {
+    svq_elem = svq->ring_id_maps[used_elem.id];
+    svq->ring_id_maps[used_elem.id] = vhost_svq_empty_elem();
+    if (unlikely(vhost_svq_is_empty_elem(svq_elem))) {
         qemu_log_mask(LOG_GUEST_ERROR,
             "Device %s says index %u is used, but it was not available",
             svq->vdev->name, used_elem.id);
-        return NULL;
+        return svq_elem;
     }
 
-    num = svq->ring_id_maps[used_elem.id]->in_num +
-          svq->ring_id_maps[used_elem.id]->out_num;
+    num = svq_elem.elem->in_num + svq_elem.elem->out_num;
     last_used_chain = vhost_svq_last_desc_of_chain(svq, num, used_elem.id);
     svq->desc_next[last_used_chain] = svq->free_head;
     svq->free_head = used_elem.id;
 
     *len = used_elem.len;
-    return g_steal_pointer(&svq->ring_id_maps[used_elem.id]);
+    return svq_elem;
 }
 
 static void vhost_svq_flush(VhostShadowVirtqueue *svq,
@@ -437,6 +448,7 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
         vhost_svq_disable_notification(svq);
         while (true) {
             uint32_t len;
+            SVQElement svq_elem;
             g_autofree VirtQueueElement *elem = NULL;
 
             if (unlikely(i >= svq->vring.num)) {
@@ -447,11 +459,12 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
                 return;
             }
 
-            elem = vhost_svq_get_buf(svq, &len);
-            if (!elem) {
+            svq_elem = vhost_svq_get_buf(svq, &len);
+            if (vhost_svq_is_empty_elem(svq_elem)) {
                 break;
             }
 
+            elem = g_steal_pointer(&svq_elem.elem);
             virtqueue_fill(vq, elem, len, i++);
         }
 
@@ -594,7 +607,7 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
     memset(svq->vring.desc, 0, driver_size);
     svq->vring.used = qemu_memalign(qemu_real_host_page_size(), device_size);
     memset(svq->vring.used, 0, device_size);
-    svq->ring_id_maps = g_new0(VirtQueueElement *, svq->vring.num);
+    svq->ring_id_maps = g_new0(SVQElement, svq->vring.num);
     svq->desc_next = g_new0(uint16_t, svq->vring.num);
     for (unsigned i = 0; i < svq->vring.num - 1; i++) {
         svq->desc_next[i] = cpu_to_le16(i + 1);
@@ -619,7 +632,7 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
 
     for (unsigned i = 0; i < svq->vring.num; ++i) {
         g_autofree VirtQueueElement *elem = NULL;
-        elem = g_steal_pointer(&svq->ring_id_maps[i]);
+        elem = g_steal_pointer(&svq->ring_id_maps[i].elem);
         if (elem) {
             virtqueue_detach_element(svq->vq, elem, 0);
         }
-- 
2.31.1




* [PATCH 08/22] vhost: Move last chain id to SVQ element
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (6 preceding siblings ...)
  2022-07-08 10:49 ` [PATCH 07/22] vhost: Add SVQElement Eugenio Pérez
@ 2022-07-08 10:49 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 09/22] vhost: Add opaque member to SVQElement Eugenio Pérez
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

We will allow the SVQ user to store opaque data for each element, so it
is easier to store this kind of information at avail time.
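The chain walk this patch caches can be sketched standalone (a simplified model; the helper name and the flat `desc_next` array are illustrative, not the QEMU API):

```c
#include <stdint.h>

/* Follow num - 1 links in the next-descriptor table to find the id of
 * the last descriptor of a chain that starts at head. Storing this
 * result at avail time means the used path need not re-walk the chain. */
uint16_t last_desc_of_chain(const uint16_t *desc_next, uint16_t num,
                            uint16_t head)
{
    for (num -= 1; num; num -= 1) {
        head = desc_next[head];
    }
    return head;
}
```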

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h |  3 +++
 hw/virtio/vhost-shadow-virtqueue.c | 14 ++++++++------
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 0b34f48037..5646d875cb 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -17,6 +17,9 @@
 
 typedef struct SVQElement {
     VirtQueueElement *elem;
+
+    /* Last descriptor of the chain */
+    uint32_t last_chain_id;
 } SVQElement;
 
 /* Shadow virtqueue to relay notifications */
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index d50e1383f5..635b6b359f 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -238,7 +238,9 @@ static bool vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
                           size_t out_num, const struct iovec *in_sg,
                           size_t in_num, VirtQueueElement *elem)
 {
+    SVQElement *svq_elem;
     unsigned qemu_head;
+    size_t n;
     bool ok = vhost_svq_add_split(svq, out_sg, out_num, in_sg, in_num,
                                   &qemu_head);
     if (unlikely(!ok)) {
@@ -246,7 +248,10 @@ static bool vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
         return false;
     }
 
-    svq->ring_id_maps[qemu_head].elem = elem;
+    n = out_num + in_num;
+    svq_elem = &svq->ring_id_maps[qemu_head];
+    svq_elem->elem = elem;
+    svq_elem->last_chain_id = vhost_svq_last_desc_of_chain(svq, n, qemu_head);
     return true;
 }
 
@@ -399,7 +404,7 @@ static SVQElement vhost_svq_get_buf(VhostShadowVirtqueue *svq, uint32_t *len)
     const vring_used_t *used = svq->vring.used;
     vring_used_elem_t used_elem;
     SVQElement svq_elem = vhost_svq_empty_elem();
-    uint16_t last_used, last_used_chain, num;
+    uint16_t last_used;
 
     if (!vhost_svq_more_used(svq)) {
         return svq_elem;
@@ -427,11 +432,8 @@ static SVQElement vhost_svq_get_buf(VhostShadowVirtqueue *svq, uint32_t *len)
         return svq_elem;
     }
 
-    num = svq_elem.elem->in_num + svq_elem.elem->out_num;
-    last_used_chain = vhost_svq_last_desc_of_chain(svq, num, used_elem.id);
-    svq->desc_next[last_used_chain] = svq->free_head;
+    svq->desc_next[svq_elem.last_chain_id] = svq->free_head;
     svq->free_head = used_elem.id;
-
     *len = used_elem.len;
     return svq_elem;
 }
-- 
2.31.1




* [PATCH 09/22] vhost: Add opaque member to SVQElement
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (7 preceding siblings ...)
  2022-07-08 10:49 ` [PATCH 08/22] vhost: Move last chain id to SVQ element Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 10/22] vdpa: Small rename of error labels Eugenio Pérez
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

When qemu injects buffers into the vdpa device, this member will hold
contextual data. If the SVQ has no custom operation, it will hold the
VirtQueueElement pointer.
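The zero-initialized sentinel pattern the series relies on can be modeled in isolation (struct and helper names here are illustrative stand-ins, not the QEMU ones):

```c
#include <stddef.h>
#include <stdint.h>

/* The per-element context keeps only an opaque pointer plus metadata;
 * an all-zero context doubles as the "no element in flight" sentinel. */
typedef struct {
    void *opaque;
    uint32_t last_chain_id;
} SvqElem;

SvqElem svq_empty_elem(void)
{
    return (SvqElem){0};
}

int svq_elem_is_empty(SvqElem e)
{
    return e.opaque == NULL;
}
```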

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h |  3 ++-
 hw/virtio/vhost-shadow-virtqueue.c | 13 +++++++------
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 5646d875cb..3e1bea12ca 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -16,7 +16,8 @@
 #include "hw/virtio/vhost-iova-tree.h"
 
 typedef struct SVQElement {
-    VirtQueueElement *elem;
+    /* Opaque data */
+    void *opaque;
 
     /* Last descriptor of the chain */
     uint32_t last_chain_id;
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 635b6b359f..01caa5887e 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -236,7 +236,7 @@ static uint16_t vhost_svq_last_desc_of_chain(const VhostShadowVirtqueue *svq,
  */
 static bool vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
                           size_t out_num, const struct iovec *in_sg,
-                          size_t in_num, VirtQueueElement *elem)
+                          size_t in_num, void *opaque)
 {
     SVQElement *svq_elem;
     unsigned qemu_head;
@@ -244,13 +244,12 @@ static bool vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
     bool ok = vhost_svq_add_split(svq, out_sg, out_num, in_sg, in_num,
                                   &qemu_head);
     if (unlikely(!ok)) {
-        g_free(elem);
         return false;
     }
 
     n = out_num + in_num;
     svq_elem = &svq->ring_id_maps[qemu_head];
-    svq_elem->elem = elem;
+    svq_elem->opaque = opaque;
     svq_elem->last_chain_id = vhost_svq_last_desc_of_chain(svq, n, qemu_head);
     return true;
 }
@@ -276,6 +275,8 @@ static bool vhost_svq_add_element(VhostShadowVirtqueue *svq,
                             elem->in_num, elem);
     if (ok) {
         vhost_svq_kick(svq);
+    } else {
+        g_free(elem);
     }
 
     return ok;
@@ -391,7 +392,7 @@ static void vhost_svq_disable_notification(VhostShadowVirtqueue *svq)
 
 static bool vhost_svq_is_empty_elem(SVQElement elem)
 {
-    return elem.elem == NULL;
+    return elem.opaque == NULL;
 }
 
 static SVQElement vhost_svq_empty_elem(void)
@@ -466,7 +467,7 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
                 break;
             }
 
-            elem = g_steal_pointer(&svq_elem.elem);
+            elem = g_steal_pointer(&svq_elem.opaque);
             virtqueue_fill(vq, elem, len, i++);
         }
 
@@ -634,7 +635,7 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
 
     for (unsigned i = 0; i < svq->vring.num; ++i) {
         g_autofree VirtQueueElement *elem = NULL;
-        elem = g_steal_pointer(&svq->ring_id_maps[i].elem);
+        elem = g_steal_pointer(&svq->ring_id_maps[i].opaque);
         if (elem) {
             virtqueue_detach_element(svq->vq, elem, 0);
         }
-- 
2.31.1




* [PATCH 10/22] vdpa: Small rename of error labels
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (8 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 09/22] vhost: Add opaque member to SVQElement Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 11/22] vhost: add vhost_svq_push_elem Eugenio Pérez
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

This makes later patches cleaner.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index d6ba4a492a..fccfc832ea 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1024,7 +1024,7 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
         int r;
         bool ok = vhost_vdpa_svq_setup(dev, svq, i, &err);
         if (unlikely(!ok)) {
-            goto err;
+            goto err_svq_setup;
         }
 
         vhost_svq_start(svq, dev->vdev, vq);
@@ -1049,8 +1049,7 @@ err_set_addr:
 err_map:
     vhost_svq_stop(g_ptr_array_index(v->shadow_vqs, i));
 
-err:
-    error_reportf_err(err, "Cannot setup SVQ %u: ", i);
+err_svq_setup:
     for (unsigned j = 0; j < i; ++j) {
         VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, j);
         vhost_vdpa_svq_unmap_rings(dev, svq);
-- 
2.31.1




* [PATCH 11/22] vhost: add vhost_svq_push_elem
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (9 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 10/22] vdpa: Small rename of error labels Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 12/22] vhost: Add vhost_svq_inject Eugenio Pérez
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

This function allows external SVQ users to return the guest's available
buffers to it.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h |  2 ++
 hw/virtio/vhost-shadow-virtqueue.c | 16 ++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 3e1bea12ca..855fa82e3e 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -82,6 +82,8 @@ typedef struct VhostShadowVirtqueue {
 
 bool vhost_svq_valid_features(uint64_t features, Error **errp);
 
+void vhost_svq_push_elem(VhostShadowVirtqueue *svq,
+                         const VirtQueueElement *elem, uint32_t len);
 void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
 void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
 void vhost_svq_get_vring_addr(const VhostShadowVirtqueue *svq,
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 01caa5887e..2b0a268655 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -439,6 +439,22 @@ static SVQElement vhost_svq_get_buf(VhostShadowVirtqueue *svq, uint32_t *len)
     return svq_elem;
 }
 
+/**
+ * Push an element to SVQ, returning it to the guest.
+ */
+void vhost_svq_push_elem(VhostShadowVirtqueue *svq,
+                         const VirtQueueElement *elem, uint32_t len)
+{
+    virtqueue_push(svq->vq, elem, len);
+    if (svq->next_guest_avail_elem) {
+        /*
+         * Avail ring was full when vhost_svq_flush was called, so it's a
+         * good moment to make more descriptors available if possible.
+         */
+        vhost_handle_guest_kick(svq);
+    }
+}
+
 static void vhost_svq_flush(VhostShadowVirtqueue *svq,
                             bool check_for_avail_queue)
 {
-- 
2.31.1




* [PATCH 12/22] vhost: Add vhost_svq_inject
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (10 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 11/22] vhost: add vhost_svq_push_elem Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 13/22] vhost: add vhost_svq_poll Eugenio Pérez
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

This allows qemu to inject buffers into the device.
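The slot check that guards injection can be modeled standalone (a hypothetical reduced ring; only the free-slot accounting of the real path is shown):

```c
#include <errno.h>
#include <stddef.h>

typedef struct {
    size_t num_free;   /* descriptors currently available in the ring */
} Ring;

/* Refuse a chain of out_num + in_num descriptors that does not fit,
 * mirroring the -ENOMEM path of vhost_svq_inject(). */
int ring_inject(Ring *r, size_t out_num, size_t in_num)
{
    size_t num = out_num + in_num;

    if (num > r->num_free) {
        return -ENOMEM;
    }
    r->num_free -= num;
    return 0;
}
```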

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h |  2 ++
 hw/virtio/vhost-shadow-virtqueue.c | 37 ++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 855fa82e3e..09b87078af 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -84,6 +84,8 @@ bool vhost_svq_valid_features(uint64_t features, Error **errp);
 
 void vhost_svq_push_elem(VhostShadowVirtqueue *svq,
                          const VirtQueueElement *elem, uint32_t len);
+int vhost_svq_inject(VhostShadowVirtqueue *svq, const struct iovec *iov,
+                     size_t out_num, size_t in_num, void *opaque);
 void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
 void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
 void vhost_svq_get_vring_addr(const VhostShadowVirtqueue *svq,
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 2b0a268655..4d59954f1b 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -282,6 +282,43 @@ static bool vhost_svq_add_element(VhostShadowVirtqueue *svq,
     return ok;
 }
 
+/**
+ * Inject a chain of buffers to the device
+ *
+ * @svq: Shadow VirtQueue
+ * @iov: I/O vector
+ * @out_num: Number of front out descriptors
+ * @in_num: Number of last input descriptors
+ * @opaque: Contextual data to store in descriptor
+ *
+ * Return 0 on success, -ENOMEM if the queue is full, -EINVAL on add failure
+ */
+int vhost_svq_inject(VhostShadowVirtqueue *svq, const struct iovec *iov,
+                     size_t out_num, size_t in_num, void *opaque)
+{
+    bool ok;
+    size_t num = out_num + in_num;
+
+    /*
+     * All vhost_svq_inject calls are controlled by qemu, so we won't hit
+     * these assertions.
+     */
+    assert(out_num || in_num);
+
+    if (unlikely(num > vhost_svq_available_slots(svq))) {
+        error_report("Injecting in a full queue");
+        return -ENOMEM;
+    }
+
+    ok = vhost_svq_add(svq, iov, out_num, iov + out_num, in_num, opaque);
+    if (unlikely(!ok)) {
+        return -EINVAL;
+    }
+
+    vhost_svq_kick(svq);
+    return 0;
+}
+
 /**
  * Forward available buffers.
  *
-- 
2.31.1




* [PATCH 13/22] vhost: add vhost_svq_poll
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (11 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 12/22] vhost: Add vhost_svq_inject Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 14/22] vhost: Add custom used buffer callback Eugenio Pérez
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

It allows the Shadow Control VirtQueue to wait for the device to use the
commands that restore the net device state after a live migration.
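The blocking wait can be sketched with POSIX poll(2) in place of g_poll() (an analogous standalone sketch, not the QEMU code; the helper name is illustrative):

```c
#include <poll.h>
#include <unistd.h>

/* Block until fd becomes readable, the way vhost_svq_poll() waits on
 * the device call fd. Returns 1 when readable, -1 on poll failure or
 * unexpected revents. */
int wait_readable(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int r = poll(&pfd, 1, -1);

    if (r < 0) {
        return -1;
    }
    if (pfd.revents & ~POLLIN) {
        return -1;
    }
    return 1;
}
```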

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h |  1 +
 hw/virtio/vhost-shadow-virtqueue.c | 54 ++++++++++++++++++++++++++++--
 2 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 09b87078af..57ff97ce4f 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -86,6 +86,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq,
                          const VirtQueueElement *elem, uint32_t len);
 int vhost_svq_inject(VhostShadowVirtqueue *svq, const struct iovec *iov,
                      size_t out_num, size_t in_num, void *opaque);
+ssize_t vhost_svq_poll(VhostShadowVirtqueue *svq);
 void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
 void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
 void vhost_svq_get_vring_addr(const VhostShadowVirtqueue *svq,
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 4d59954f1b..f4affa52ee 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -10,6 +10,8 @@
 #include "qemu/osdep.h"
 #include "hw/virtio/vhost-shadow-virtqueue.h"
 
+#include <glib/gpoll.h>
+
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "qemu/main-loop.h"
@@ -492,10 +494,11 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq,
     }
 }
 
-static void vhost_svq_flush(VhostShadowVirtqueue *svq,
-                            bool check_for_avail_queue)
+static size_t vhost_svq_flush(VhostShadowVirtqueue *svq,
+                              bool check_for_avail_queue)
 {
     VirtQueue *vq = svq->vq;
+    size_t ret = 0;
 
     /* Forward as many used buffers as possible. */
     do {
@@ -512,7 +515,7 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
                          "More than %u used buffers obtained in a %u size SVQ",
                          i, svq->vring.num);
                 virtqueue_flush(vq, svq->vring.num);
-                return;
+                return ret;
             }
 
             svq_elem = vhost_svq_get_buf(svq, &len);
@@ -522,6 +525,7 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
 
             elem = g_steal_pointer(&svq_elem.opaque);
             virtqueue_fill(vq, elem, len, i++);
+            ret++;
         }
 
         virtqueue_flush(vq, i);
@@ -535,6 +539,50 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
             vhost_handle_guest_kick(svq);
         }
     } while (!vhost_svq_enable_notification(svq));
+
+    return ret;
+}
+
+/**
+ * Poll the SVQ for device used buffers.
+ *
+ * This function races with the main event loop SVQ polling, so extra
+ * synchronization is needed.
+ *
+ * Return the number of descriptors read from the device.
+ */
+ssize_t vhost_svq_poll(VhostShadowVirtqueue *svq)
+{
+    int fd = event_notifier_get_fd(&svq->hdev_call);
+    GPollFD poll_fd = {
+        .fd = fd,
+        .events = G_IO_IN,
+    };
+    assert(fd >= 0);
+    int r = g_poll(&poll_fd, 1, -1);
+
+    if (unlikely(r < 0)) {
+        error_report("Cannot poll device call fd "G_POLLFD_FORMAT": (%d) %s",
+                     poll_fd.fd, errno, g_strerror(errno));
+        return -errno;
+    }
+
+    if (r == 0) {
+        return 0;
+    }
+
+    if (unlikely(poll_fd.revents & ~(G_IO_IN))) {
+        error_report(
+            "Error polling device call fd "G_POLLFD_FORMAT": revents=%d",
+            poll_fd.fd, poll_fd.revents);
+        return -1;
+    }
+
+    /*
+     * Max return value of vhost_svq_flush is (uint16_t)-1, so it's safe to
+     * convert to ssize_t.
+     */
+    return vhost_svq_flush(svq, false);
 }
 
 /**
-- 
2.31.1




* [PATCH 14/22] vhost: Add custom used buffer callback
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (12 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 13/22] vhost: add vhost_svq_poll Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 15/22] vhost: Add svq avail_handler callback Eugenio Pérez
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

The callback allows SVQ users to observe VirtQueue requests and
responses. QEMU can use this to synchronize the virtio device model
state, allowing it to be migrated with minimal changes to the migration
code.

If callbacks are specified at SVQ creation, buffers need to be injected
into the device using vhost_svq_inject. Opaque data must be given with
each buffer, and it is returned to the callback when used_handler is
called.

In the case of networking, this will be used to inspect the status of
control virtqueue messages from the device.
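The dispatch between the owner callback and the default guest path can be modeled standalone (hypothetical names; the counter stands in for virtqueue_fill()):

```c
#include <stddef.h>
#include <stdint.h>

typedef struct Svq Svq;
typedef void (*UsedCb)(Svq *svq, void *opaque, uint32_t written);

typedef struct {
    UsedCb used_handler;
} SvqOps;

struct Svq {
    const SvqOps *ops;       /* NULL means default guest-visible path */
    uint32_t default_filled; /* stand-in for virtqueue_fill() */
};

uint32_t cb_written; /* accumulated by the sample callback below */

void count_used(Svq *svq, void *opaque, uint32_t written)
{
    (void)svq;
    (void)opaque;
    cb_written += written;
}

/* Mirror of the flush hunk: run the owner callback when ops is set,
 * fall back to the default fill path otherwise. */
void dispatch_used(Svq *svq, void *opaque, uint32_t written)
{
    if (svq->ops) {
        svq->ops->used_handler(svq, opaque, written);
    } else {
        svq->default_filled += written;
    }
}
```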

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h | 15 ++++++++++++++-
 hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++------
 hw/virtio/vhost-vdpa.c             |  3 ++-
 3 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 57ff97ce4f..96ce7aa62e 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -23,6 +23,15 @@ typedef struct SVQElement {
     uint32_t last_chain_id;
 } SVQElement;
 
+typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
+typedef void (*VirtQueueUsedCallback)(VhostShadowVirtqueue *svq,
+                                      void *used_elem_opaque,
+                                      uint32_t written);
+
+typedef struct VhostShadowVirtqueueOps {
+    VirtQueueUsedCallback used_handler;
+} VhostShadowVirtqueueOps;
+
 /* Shadow virtqueue to relay notifications */
 typedef struct VhostShadowVirtqueue {
     /* Shadow vring */
@@ -67,6 +76,9 @@ typedef struct VhostShadowVirtqueue {
      */
     uint16_t *desc_next;
 
+    /* Caller callbacks */
+    const VhostShadowVirtqueueOps *ops;
+
     /* Next head to expose to the device */
     uint16_t shadow_avail_idx;
 
@@ -98,7 +110,8 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
                      VirtQueue *vq);
 void vhost_svq_stop(VhostShadowVirtqueue *svq);
 
-VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree);
+VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree,
+                                    const VhostShadowVirtqueueOps *ops);
 
 void vhost_svq_free(gpointer vq);
 G_DEFINE_AUTOPTR_CLEANUP_FUNC(VhostShadowVirtqueue, vhost_svq_free);
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index f4affa52ee..40183f8afd 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -306,6 +306,7 @@ int vhost_svq_inject(VhostShadowVirtqueue *svq, const struct iovec *iov,
      * assertions.
      */
     assert(out_num || in_num);
+    assert(svq->ops);
 
     if (unlikely(num > vhost_svq_available_slots(svq))) {
         error_report("Injecting in a full queue");
@@ -508,7 +509,6 @@ static size_t vhost_svq_flush(VhostShadowVirtqueue *svq,
         while (true) {
             uint32_t len;
             SVQElement svq_elem;
-            g_autofree VirtQueueElement *elem = NULL;
 
             if (unlikely(i >= svq->vring.num)) {
                 qemu_log_mask(LOG_GUEST_ERROR,
@@ -523,13 +523,20 @@ static size_t vhost_svq_flush(VhostShadowVirtqueue *svq,
                 break;
             }
 
-            elem = g_steal_pointer(&svq_elem.opaque);
-            virtqueue_fill(vq, elem, len, i++);
+            if (svq->ops) {
+                svq->ops->used_handler(svq, svq_elem.opaque, len);
+            } else {
+                g_autofree VirtQueueElement *elem = NULL;
+                elem = g_steal_pointer(&svq_elem.opaque);
+                virtqueue_fill(vq, elem, len, i++);
+            }
             ret++;
         }
 
-        virtqueue_flush(vq, i);
-        event_notifier_set(&svq->svq_call);
+        if (i > 0) {
+            virtqueue_flush(vq, i);
+            event_notifier_set(&svq->svq_call);
+        }
 
         if (check_for_avail_queue && svq->next_guest_avail_elem) {
             /*
@@ -758,12 +765,14 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
  * shadow methods and file descriptors.
  *
  * @iova_tree: Tree to perform descriptors translations
+ * @ops: SVQ owner callbacks
  *
  * Returns the new virtqueue or NULL.
  *
  * In case of error, reason is reported through error_report.
  */
-VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree)
+VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree,
+                                    const VhostShadowVirtqueueOps *ops)
 {
     g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
     int r;
@@ -785,6 +794,7 @@ VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree)
     event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
     event_notifier_set_handler(&svq->hdev_call, vhost_svq_handle_call);
     svq->iova_tree = iova_tree;
+    svq->ops = ops;
     return g_steal_pointer(&svq);
 
 err_init_hdev_call:
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index fccfc832ea..25f7146fe4 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -418,8 +418,9 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
 
     shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
     for (unsigned n = 0; n < hdev->nvqs; ++n) {
-        g_autoptr(VhostShadowVirtqueue) svq = vhost_svq_new(v->iova_tree);
+        g_autoptr(VhostShadowVirtqueue) svq;
 
+        svq = vhost_svq_new(v->iova_tree, NULL);
         if (unlikely(!svq)) {
             error_setg(errp, "Cannot create svq %u", n);
             return -1;
-- 
2.31.1




* [PATCH 15/22] vhost: Add svq avail_handler callback
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (13 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 14/22] vhost: Add custom used buffer callback Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 16/22] vhost: add detach SVQ operation Eugenio Pérez
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

This allows external handlers to be aware of new buffers that the guest
places in the virtqueue.

When this callback is defined, ownership of the guest's virtqueue
element is transferred to the callback. This means that if the user
wants to forward the descriptor it needs to inject it manually. The
callback is also free to process the command by itself and return the
element with svq_push.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h | 23 ++++++++++++++++++++++-
 hw/virtio/vhost-shadow-virtqueue.c | 13 +++++++++++--
 hw/virtio/vhost-vdpa.c             |  2 +-
 3 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 96ce7aa62e..cfc891e2e8 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -24,11 +24,28 @@ typedef struct SVQElement {
 } SVQElement;
 
 typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
+
+/**
+ * Callback to handle an avail buffer.
+ *
+ * @svq:  Shadow virtqueue
+ * @elem:  Element placed in the queue by the guest
+ * @vq_callback_opaque:  Opaque
+ *
+ * Returns true if the vq is running as expected, false otherwise.
+ *
+ * Note that ownership of elem is transferred to the callback.
+ */
+typedef bool (*VirtQueueAvailCallback)(VhostShadowVirtqueue *svq,
+                                       VirtQueueElement *elem,
+                                       void *vq_callback_opaque);
+
 typedef void (*VirtQueueUsedCallback)(VhostShadowVirtqueue *svq,
                                       void *used_elem_opaque,
                                       uint32_t written);
 
 typedef struct VhostShadowVirtqueueOps {
+    VirtQueueAvailCallback avail_handler;
     VirtQueueUsedCallback used_handler;
 } VhostShadowVirtqueueOps;
 
@@ -79,6 +96,9 @@ typedef struct VhostShadowVirtqueue {
     /* Caller callbacks */
     const VhostShadowVirtqueueOps *ops;
 
+    /* Caller callbacks opaque */
+    void *ops_opaque;
+
     /* Next head to expose to the device */
     uint16_t shadow_avail_idx;
 
@@ -111,7 +131,8 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
 void vhost_svq_stop(VhostShadowVirtqueue *svq);
 
 VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree,
-                                    const VhostShadowVirtqueueOps *ops);
+                                    const VhostShadowVirtqueueOps *ops,
+                                    void *ops_opaque);
 
 void vhost_svq_free(gpointer vq);
 G_DEFINE_AUTOPTR_CLEANUP_FUNC(VhostShadowVirtqueue, vhost_svq_free);
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 40183f8afd..78579b9e0b 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -374,7 +374,13 @@ static void vhost_handle_guest_kick(VhostShadowVirtqueue *svq)
                 return;
             }
 
-            ok = vhost_svq_add_element(svq, g_steal_pointer(&elem));
+            if (svq->ops) {
+                ok = svq->ops->avail_handler(svq, g_steal_pointer(&elem),
+                                             svq->ops_opaque);
+            } else {
+                ok = vhost_svq_add_element(svq, g_steal_pointer(&elem));
+            }
+
             if (unlikely(!ok)) {
                 /* VQ is broken, just return and ignore any other kicks */
                 return;
@@ -766,13 +772,15 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
  *
  * @iova_tree: Tree to perform descriptors translations
  * @ops: SVQ owner callbacks
+ * @ops_opaque: ops opaque pointer
  *
  * Returns the new virtqueue or NULL.
  *
  * In case of error, reason is reported through error_report.
  */
 VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree,
-                                    const VhostShadowVirtqueueOps *ops)
+                                    const VhostShadowVirtqueueOps *ops,
+                                    void *ops_opaque)
 {
     g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
     int r;
@@ -795,6 +803,7 @@ VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree,
     event_notifier_set_handler(&svq->hdev_call, vhost_svq_handle_call);
     svq->iova_tree = iova_tree;
     svq->ops = ops;
+    svq->ops_opaque = ops_opaque;
     return g_steal_pointer(&svq);
 
 err_init_hdev_call:
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 25f7146fe4..9a4f00c114 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -420,7 +420,7 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
     for (unsigned n = 0; n < hdev->nvqs; ++n) {
         g_autoptr(VhostShadowVirtqueue) svq;
 
-        svq = vhost_svq_new(v->iova_tree, NULL);
+        svq = vhost_svq_new(v->iova_tree, NULL, NULL);
         if (unlikely(!svq)) {
             error_setg(errp, "Cannot create svq %u", n);
             return -1;
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 16/22] vhost: add detach SVQ operation
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (14 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 15/22] vhost: Add svq avail_handler callback Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 17/22] vdpa: Export vhost_vdpa_dma_map and unmap calls Eugenio Pérez
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

Add a detach operation to SVQ so it can notify the caller that it needs
to discard the element.
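
A minimal sketch of the stop-path contract, with illustrative names rather than the QEMU API: the detach callback surrenders the guest element (if any) and frees its own wrapper, and SVQ detaches whatever comes back:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct GuestElem { int id; } GuestElem;

/* Wrapper the CVQ owner stores as the per-descriptor opaque. */
typedef struct Wrapper {
    GuestElem *guest_elem;   /* returned to SVQ on detach */
} Wrapper;

typedef GuestElem *(*DetachCb)(void *opaque);

/* Owner callback: free the wrapper, surrender the guest element. */
static GuestElem *detach_handler(void *opaque)
{
    Wrapper *w = opaque;
    GuestElem *e = w->guest_elem;
    free(w);
    return e;
}

/* Mirrors the vhost_svq_stop() loop: detach every pending element. */
static int svq_stop(void **ring, size_t n, DetachCb cb)
{
    int detached = 0;
    for (size_t i = 0; i < n; i++) {
        void *opaque = ring[i];
        ring[i] = NULL;
        if (!opaque) {
            continue;
        }
        GuestElem *e = cb ? cb(opaque) : opaque;
        if (e) {
            free(e);         /* virtqueue_detach_element() in QEMU */
            detached++;
        }
    }
    return detached;
}
```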

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h | 11 +++++++++++
 hw/virtio/vhost-shadow-virtqueue.c | 11 ++++++++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index cfc891e2e8..dc0059adc6 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -44,9 +44,20 @@ typedef void (*VirtQueueUsedCallback)(VhostShadowVirtqueue *svq,
                                       void *used_elem_opaque,
                                       uint32_t written);
 
+/**
+ * Detach the element from the shadow virtqueue.  SVQ needs to free it, and
+ * the element can no longer be pushed or discarded.
+ *
+ * @elem_opaque: The element opaque
+ *
+ * Returns the guest element to detach and free, if any.
+ */
+typedef VirtQueueElement *(*VirtQueueDetachCallback)(void *elem_opaque);
+
 typedef struct VhostShadowVirtqueueOps {
     VirtQueueAvailCallback avail_handler;
     VirtQueueUsedCallback used_handler;
+    VirtQueueDetachCallback detach_handler;
 } VhostShadowVirtqueueOps;
 
 /* Shadow virtqueue to relay notifications */
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 78579b9e0b..626691ac4e 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -749,7 +749,16 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
 
     for (unsigned i = 0; i < svq->vring.num; ++i) {
         g_autofree VirtQueueElement *elem = NULL;
-        elem = g_steal_pointer(&svq->ring_id_maps[i].opaque);
+        void *opaque = g_steal_pointer(&svq->ring_id_maps[i].opaque);
+
+        if (!opaque) {
+            continue;
+        } else if (svq->ops) {
+            elem = svq->ops->detach_handler(opaque);
+        } else {
+            elem = opaque;
+        }
+
         if (elem) {
             virtqueue_detach_element(svq->vq, elem, 0);
         }
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 17/22] vdpa: Export vhost_vdpa_dma_map and unmap calls
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (15 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 16/22] vhost: add detach SVQ operation Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 18/22] vdpa: manual forward CVQ buffers Eugenio Pérez
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

Shadow CVQ will copy buffers into qemu's VA space, so we avoid TOCTOU
attacks that could set different states in the qemu device model and the
vdpa device.
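
The TOCTOU-avoidance idea is simply validate-and-use on a private copy. A hedged sketch, with a hypothetical handle_cmd() rather than any QEMU function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command header, modeled after virtio_net_ctrl_hdr. */
struct ctrl_hdr { uint8_t cls, cmd; };

/* Snapshot the shared (guest-visible) memory once, then validate and
 * use only the private copy: the guest cannot flip the bytes between
 * the check and the use. */
static int handle_cmd(const volatile uint8_t *shared, size_t len,
                      struct ctrl_hdr *out)
{
    uint8_t copy[64];
    size_t i;

    if (len < sizeof(struct ctrl_hdr) || len > sizeof(copy)) {
        return -1;
    }
    for (i = 0; i < len; i++) {   /* single snapshot of shared memory */
        copy[i] = shared[i];
    }
    memcpy(out, copy, sizeof(*out));
    /* Treat class 1 (VIRTIO_NET_CTRL_MAC) as the only known class here. */
    return out->cls == 1 ? 0 : -1;
}
```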

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/vhost-vdpa.h | 4 ++++
 hw/virtio/vhost-vdpa.c         | 7 +++----
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index a29dbb3f53..7214eb47dc 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -39,4 +39,8 @@ typedef struct vhost_vdpa {
     VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
 } VhostVDPA;
 
+int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
+                       void *vaddr, bool readonly);
+int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
+
 #endif
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 9a4f00c114..7d2922ccbf 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -71,8 +71,8 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
     return false;
 }
 
-static int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
-                              void *vaddr, bool readonly)
+int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
+                       void *vaddr, bool readonly)
 {
     struct vhost_msg_v2 msg = {};
     int fd = v->device_fd;
@@ -97,8 +97,7 @@ static int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
     return ret;
 }
 
-static int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova,
-                                hwaddr size)
+int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size)
 {
     struct vhost_msg_v2 msg = {};
     int fd = v->device_fd;
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 18/22] vdpa: manual forward CVQ buffers
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (16 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 17/22] vdpa: Export vhost_vdpa_dma_map and unmap calls Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 19/22] vdpa: Buffer CVQ support on shadow virtqueue Eugenio Pérez
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

Do a simple forwarding of CVQ buffers, the same work SVQ would do by
itself, but through the new callbacks. No functional change intended.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/vhost-vdpa.h |  3 ++
 hw/virtio/vhost-vdpa.c         |  3 +-
 net/vhost-vdpa.c               | 59 ++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 7214eb47dc..1111d85643 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -15,6 +15,7 @@
 #include <gmodule.h>
 
 #include "hw/virtio/vhost-iova-tree.h"
+#include "hw/virtio/vhost-shadow-virtqueue.h"
 #include "hw/virtio/virtio.h"
 #include "standard-headers/linux/vhost_types.h"
 
@@ -35,6 +36,8 @@ typedef struct vhost_vdpa {
     /* IOVA mapping used by the Shadow Virtqueue */
     VhostIOVATree *iova_tree;
     GPtrArray *shadow_vqs;
+    const VhostShadowVirtqueueOps *shadow_vq_ops;
+    void *shadow_vq_ops_opaque;
     struct vhost_dev *dev;
     VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
 } VhostVDPA;
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7d2922ccbf..c1162daecc 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -419,7 +419,8 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
     for (unsigned n = 0; n < hdev->nvqs; ++n) {
         g_autoptr(VhostShadowVirtqueue) svq;
 
-        svq = vhost_svq_new(v->iova_tree, NULL, NULL);
+        svq = vhost_svq_new(v->iova_tree, v->shadow_vq_ops,
+                            v->shadow_vq_ops_opaque);
         if (unlikely(!svq)) {
             error_setg(errp, "Cannot create svq %u", n);
             return -1;
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index df1e69ee72..8558ad7a01 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -11,11 +11,14 @@
 
 #include "qemu/osdep.h"
 #include "clients.h"
+#include "hw/virtio/virtio-net.h"
 #include "net/vhost_net.h"
 #include "net/vhost-vdpa.h"
 #include "hw/virtio/vhost-vdpa.h"
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
+#include "qemu/log.h"
+#include "qemu/memalign.h"
 #include "qemu/option.h"
 #include "qapi/error.h"
 #include <linux/vhost.h>
@@ -187,6 +190,58 @@ static NetClientInfo net_vhost_vdpa_info = {
         .check_peer_type = vhost_vdpa_check_peer_type,
 };
 
+/**
+ * Forward buffer for the moment.
+ */
+static bool vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
+                                             VirtQueueElement *guest_elem,
+                                             void *opaque)
+{
+    g_autofree VirtQueueElement *elem = guest_elem;
+    unsigned int n = elem->out_num + elem->in_num;
+    g_autofree struct iovec *iov = g_new(struct iovec, n);
+    size_t in_len;
+    virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
+    int r;
+
+    memcpy(iov, elem->out_sg, elem->out_num * sizeof(*iov));
+    memcpy(iov + elem->out_num, elem->in_sg, elem->in_num * sizeof(*iov));
+
+    r = vhost_svq_inject(svq, iov, elem->out_num, elem->in_num, elem);
+    if (unlikely(r != 0)) {
+        goto err;
+    }
+
+    /* Now elem belongs to SVQ */
+    g_steal_pointer(&elem);
+    return true;
+
+err:
+    in_len = iov_from_buf(elem->in_sg, elem->in_num, 0, &status,
+                          sizeof(status));
+    vhost_svq_push_elem(svq, elem, in_len);
+    return true;
+}
+
+static VirtQueueElement *vhost_vdpa_net_handle_ctrl_detach(void *elem_opaque)
+{
+    return elem_opaque;
+}
+
+static void vhost_vdpa_net_handle_ctrl_used(VhostShadowVirtqueue *svq,
+                                            void *vq_elem_opaque,
+                                            uint32_t dev_written)
+{
+    g_autofree VirtQueueElement *guest_elem = vq_elem_opaque;
+    vhost_svq_push_elem(svq, guest_elem, sizeof(virtio_net_ctrl_ack));
+}
+
+static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
+    .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
+    .used_handler = vhost_vdpa_net_handle_ctrl_used,
+    .detach_handler = vhost_vdpa_net_handle_ctrl_detach,
+};
+
 static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
                                            const char *device,
                                            const char *name,
@@ -211,6 +266,10 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
 
     s->vhost_vdpa.device_fd = vdpa_device_fd;
     s->vhost_vdpa.index = queue_pair_index;
+    if (!is_datapath) {
+        s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
+        s->vhost_vdpa.shadow_vq_ops_opaque = s;
+    }
     ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa, queue_pair_index, nvqs);
     if (ret) {
         qemu_del_net_client(nc);
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 19/22] vdpa: Buffer CVQ support on shadow virtqueue
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (17 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 18/22] vdpa: manual forward CVQ buffers Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 20/22] vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs Eugenio Pérez
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

Introduce the control virtqueue support for vDPA shadow virtqueue. This
is needed for advanced networking features like rx filtering.

Virtio-net control VQ now copies the descriptors to qemu's VA, so we
avoid TOCTOU with the guest's or device's memory every time there is a
device model change. Otherwise, the guest could change the memory
content in the time between qemu reading it and the device reading it.

Likewise, qemu does not share the memory of the command with the device:
it exposes another copy to it.

To demonstrate command handling, VIRTIO_NET_F_CTRL_MAC_ADDR is
implemented.  If the virtio-net driver changes the MAC address, the
virtio-net device model will be updated with the new one, and an rx
filtering change event will be raised.

Other CVQ commands could be added here straightforwardly, but they have
not been tested.
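
The copy also linearizes a possible descriptor chain into one contiguous buffer before exposing it to the device. A sketch of the idea (not QEMU's iov_to_buf(); iovec_s is a stand-in for struct iovec):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct iovec_s { const void *iov_base; size_t iov_len; };

/* Copy a scatter-gather list into one contiguous buffer, stopping at
 * the destination capacity; returns the number of bytes copied. */
static size_t linearize(const struct iovec_s *sg, size_t n,
                        void *dst, size_t cap)
{
    size_t done = 0;
    size_t i;

    for (i = 0; i < n && done < cap; i++) {
        size_t chunk = sg[i].iov_len;
        if (chunk > cap - done) {
            chunk = cap - done;   /* truncate the last segment */
        }
        memcpy((char *)dst + done, sg[i].iov_base, chunk);
        done += chunk;
    }
    return done;
}
```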

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 334 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 322 insertions(+), 12 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 8558ad7a01..3ae74f7fb5 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -28,6 +28,26 @@
 #include "monitor/monitor.h"
 #include "hw/virtio/vhost.h"
 
+typedef struct CVQElement {
+    /* Device's in and out buffer */
+    void *in_buf, *out_buf;
+
+    /* Optional guest element from where this cvqelement was created */
+    VirtQueueElement *guest_elem;
+
+    /* Control header sent by the guest. */
+    struct virtio_net_ctrl_hdr ctrl;
+
+    /* vhost-vdpa device, for cleanup reasons */
+    struct vhost_vdpa *vdpa;
+
+    /* Length of out data */
+    size_t out_len;
+
+    /* Copy of the out data sent by the guest excluding ctrl. */
+    uint8_t out_data[];
+} CVQElement;
+
 /* Todo:need to add the multiqueue support here */
 typedef struct VhostVDPAState {
     NetClientState nc;
@@ -191,29 +211,277 @@ static NetClientInfo net_vhost_vdpa_info = {
 };
 
 /**
- * Forward buffer for the moment.
+ * Unmap a buffer previously mapped for the device by a CVQ element
+ *
+ * @elem: CVQ element that owns the buffer
+ * @addr: qemu VA of the buffer to unmap and free
+ *
+ * Looks up the IOVA translation of addr, unmaps it from the device and
+ * removes it from the IOVA tree, then frees the buffer.
+ *
+ * TODO: Adapt to net/vhost-vdpa format
+ * Prints an error message in case of error
+ */
+static void vhost_vdpa_cvq_unmap_buf(CVQElement *elem, void *addr)
+{
+    struct vhost_vdpa *v = elem->vdpa;
+    VhostIOVATree *tree = v->iova_tree;
+    DMAMap needle = {
+        /*
+         * No need to specify size or to look for more translations since
+         * this contiguous chunk was allocated by us.
+         */
+        .translated_addr = (hwaddr)(uintptr_t)addr,
+    };
+    const DMAMap *map = vhost_iova_tree_find_iova(tree, &needle);
+    int r;
+
+    if (unlikely(!map)) {
+        error_report("Cannot locate expected map");
+        goto err;
+    }
+
+    r = vhost_vdpa_dma_unmap(v, map->iova, map->size + 1);
+    if (unlikely(r != 0)) {
+        error_report("Device cannot unmap: %s(%d)", g_strerror(r), r);
+    }
+
+    vhost_iova_tree_remove(tree, map);
+
+err:
+    qemu_vfree(addr);
+}
+
+static void vhost_vdpa_cvq_delete_elem(CVQElement *elem)
+{
+    if (elem->out_buf) {
+        vhost_vdpa_cvq_unmap_buf(elem, g_steal_pointer(&elem->out_buf));
+    }
+
+    if (elem->in_buf) {
+        vhost_vdpa_cvq_unmap_buf(elem, g_steal_pointer(&elem->in_buf));
+    }
+
+    /* Guest element must have been returned to the guest or freed by now */
+    assert(!elem->guest_elem);
+
+    g_free(elem);
+}
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(CVQElement, vhost_vdpa_cvq_delete_elem);
+
+static int vhost_vdpa_net_cvq_svq_inject(VhostShadowVirtqueue *svq,
+                                         CVQElement *cvq_elem,
+                                         size_t out_len)
+{
+    const struct iovec iov[] = {
+        {
+            .iov_base = cvq_elem->out_buf,
+            .iov_len = out_len,
+        },{
+            .iov_base = cvq_elem->in_buf,
+            .iov_len = sizeof(virtio_net_ctrl_ack),
+        }
+    };
+
+    return vhost_svq_inject(svq, iov, 1, 1, cvq_elem);
+}
+
+static void *vhost_vdpa_cvq_alloc_buf(struct vhost_vdpa *v,
+                                      const uint8_t *out_data, size_t data_len,
+                                      bool write)
+{
+    DMAMap map = {};
+    size_t buf_len = ROUND_UP(data_len, qemu_real_host_page_size());
+    void *buf = qemu_memalign(qemu_real_host_page_size(), buf_len);
+    int r;
+
+    if (!write) {
+        memcpy(buf, out_data, data_len);
+        memset(buf + data_len, 0, buf_len - data_len);
+    } else {
+        memset(buf, 0, buf_len);
+    }
+
+    map.translated_addr = (hwaddr)(uintptr_t)buf;
+    map.size = buf_len - 1;
+    map.perm = write ? IOMMU_RW : IOMMU_RO;
+    r = vhost_iova_tree_map_alloc(v->iova_tree, &map);
+    if (unlikely(r != IOVA_OK)) {
+        error_report("Cannot map injected element");
+        goto err;
+    }
+
+    r = vhost_vdpa_dma_map(v, map.iova, buf_len, buf, !write);
+    if (unlikely(r < 0)) {
+        goto dma_map_err;
+    }
+
+    return buf;
+
+dma_map_err:
+    vhost_iova_tree_remove(v->iova_tree, &map);
+
+err:
+    qemu_vfree(buf);
+    return NULL;
+}
+
+/**
+ * Allocate an element suitable to be injected
+ *
+ * @iov: The iovec
+ * @out_num: Number of out elements, placed first in iov
+ * @in_num: Number of in elements, placed after out ones
+ * @elem: Optional guest element from where this one was created
+ *
+ * TODO: Do we need a sg for out_num? I think not
+ */
+static CVQElement *vhost_vdpa_cvq_alloc_elem(VhostVDPAState *s,
+                                             struct virtio_net_ctrl_hdr ctrl,
+                                             const struct iovec *out_sg,
+                                             size_t out_num, size_t out_size,
+                                             VirtQueueElement *elem)
+{
+    g_autoptr(CVQElement) cvq_elem = g_malloc(sizeof(CVQElement) + out_size);
+    g_autofree VirtQueueElement *guest_elem = elem;
+    uint8_t *out_cursor = cvq_elem->out_data;
+    struct vhost_vdpa *v = &s->vhost_vdpa;
+
+    /* Start with a clean base */
+    memset(cvq_elem, 0, sizeof(*cvq_elem));
+    cvq_elem->vdpa = &s->vhost_vdpa;
+
+    /*
+     * Linearize element. If guest had a descriptor chain, we expose the device
+     * a single buffer.
+     */
+    cvq_elem->out_len = out_size;
+    memcpy(out_cursor, &ctrl, sizeof(ctrl));
+    out_size -= sizeof(ctrl);
+    out_cursor += sizeof(ctrl);
+    iov_to_buf(out_sg, out_num, 0, out_cursor, out_size);
+
+    cvq_elem->out_buf = vhost_vdpa_cvq_alloc_buf(v, cvq_elem->out_data,
+                                                 out_size, false);
+    if (unlikely(!cvq_elem->out_buf)) {
+        return NULL;
+    }
+
+    cvq_elem->in_buf = vhost_vdpa_cvq_alloc_buf(v, NULL,
+                                                sizeof(virtio_net_ctrl_ack),
+                                                true);
+    if (unlikely(!cvq_elem->in_buf)) {
+        return NULL;
+    }
+
+    cvq_elem->guest_elem = g_steal_pointer(&guest_elem);
+    cvq_elem->ctrl = ctrl;
+    return g_steal_pointer(&cvq_elem);
+}
+
+/**
+ * iov_size with an upper limit. It's assumed UINT64_MAX is an invalid
+ * iov_size.
+ */
+static uint64_t vhost_vdpa_net_iov_len(const struct iovec *iov,
+                                       unsigned int iov_cnt, size_t max)
+{
+    uint64_t len = 0;
+
+    for (unsigned int i = 0; len < max && i < iov_cnt; i++) {
+        bool overflow = uadd64_overflow(iov[i].iov_len, len, &len);
+        if (unlikely(overflow)) {
+            return UINT64_MAX;
+        }
+    }
+
+    return len;
+}
+
+static CVQElement *vhost_vdpa_net_cvq_copy_elem(VhostVDPAState *s,
+                                                VirtQueueElement *elem)
+{
+    struct virtio_net_ctrl_hdr ctrl;
+    g_autofree struct iovec *iov = NULL;
+    struct iovec *iov2;
+    unsigned int out_num = elem->out_num;
+    size_t n, out_size = 0;
+
+    /* TODO: in buffer MUST have only a single entry with a char? size */
+    if (unlikely(vhost_vdpa_net_iov_len(elem->in_sg, elem->in_num,
+                                        sizeof(virtio_net_ctrl_ack))
+                                              < sizeof(virtio_net_ctrl_ack))) {
+        return NULL;
+    }
+
+    n = iov_to_buf(elem->out_sg, out_num, 0, &ctrl, sizeof(ctrl));
+    if (unlikely(n != sizeof(ctrl))) {
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid out size\n", __func__);
+        return NULL;
+    }
+
+    iov = iov2 = g_memdup2(elem->out_sg, sizeof(struct iovec) * elem->out_num);
+    iov_discard_front(&iov2, &out_num, sizeof(ctrl));
+    switch (ctrl.class) {
+    case VIRTIO_NET_CTRL_MAC:
+        switch (ctrl.cmd) {
+        case VIRTIO_NET_CTRL_MAC_ADDR_SET:
+            if (likely(vhost_vdpa_net_iov_len(iov2, out_num, 6) >= 6)) {
+                out_size += 6;
+                break;
+            }
+
+            qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid mac size\n", __func__);
+            return NULL;
+        default:
+            qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid mac cmd %u\n",
+                          __func__, ctrl.cmd);
+            return NULL;
+        };
+        break;
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid control class %u\n",
+                      __func__, ctrl.class);
+        return NULL;
+    };
+
+    return vhost_vdpa_cvq_alloc_elem(s, ctrl, iov2, out_num,
+                                     sizeof(ctrl) + out_size, elem);
+}
+
+/**
+ * Validate and copy control virtqueue commands.
+ *
+ * Following QEMU guidelines, we offer a copy of the buffers to the device to
+ * prevent TOCTOU bugs.  This function also checks that the buffer lengths
+ * are as expected.
  */
 static bool vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
                                              VirtQueueElement *guest_elem,
                                              void *opaque)
 {
+    VhostVDPAState *s = opaque;
+    g_autoptr(CVQElement) cvq_elem = NULL;
     g_autofree VirtQueueElement *elem = guest_elem;
-    unsigned int n = elem->out_num + elem->in_num;
-    g_autofree struct iovec *iov = g_new(struct iovec, n);
-    size_t in_len;
+    size_t out_size, in_len;
     virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
     int r;
 
-    memcpy(iov, elem->out_sg, elem->out_num * sizeof(*iov));
-    memcpy(iov + elem->out_num, elem->in_sg, elem->in_num * sizeof(*iov));
+    cvq_elem = vhost_vdpa_net_cvq_copy_elem(s, elem);
+    if (unlikely(!cvq_elem)) {
+        goto err;
+    }
 
-    r = vhost_svq_inject(svq, iov, elem->out_num, elem->in_num, elem);
+    /* out size validated at vhost_vdpa_net_cvq_copy_elem */
+    out_size = iov_size(elem->out_sg, elem->out_num);
+    r = vhost_vdpa_net_cvq_svq_inject(svq, cvq_elem, out_size);
     if (unlikely(r != 0)) {
         goto err;
     }
 
-    /* Now elem belongs to SVQ */
-    g_steal_pointer(&elem);
+    cvq_elem->guest_elem = g_steal_pointer(&elem);
+    /* Now CVQ elem belongs to SVQ */
+    g_steal_pointer(&cvq_elem);
     return true;
 
 err:
@@ -225,15 +493,57 @@ err:
 
 static VirtQueueElement *vhost_vdpa_net_handle_ctrl_detach(void *elem_opaque)
 {
-    return elem_opaque;
+    g_autoptr(CVQElement) cvq_elem = elem_opaque;
+    return g_steal_pointer(&cvq_elem->guest_elem);
 }
 
 static void vhost_vdpa_net_handle_ctrl_used(VhostShadowVirtqueue *svq,
                                             void *vq_elem_opaque,
                                             uint32_t dev_written)
 {
-    g_autofree VirtQueueElement *guest_elem = vq_elem_opaque;
-    vhost_svq_push_elem(svq, guest_elem, sizeof(virtio_net_ctrl_ack));
+    g_autoptr(CVQElement) cvq_elem = vq_elem_opaque;
+    virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
+    const struct iovec out = {
+        .iov_base = cvq_elem->out_data,
+        .iov_len = cvq_elem->out_len,
+    };
+    const struct iovec in = {
+        .iov_base = &status,
+        .iov_len = sizeof(status),
+    };
+    g_autofree VirtQueueElement *guest_elem = NULL;
+
+    if (unlikely(dev_written < sizeof(status))) {
+        error_report("Insufficient written data (%llu)",
+                     (long long unsigned)dev_written);
+        goto out;
+    }
+
+    switch (cvq_elem->ctrl.class) {
+    case VIRTIO_NET_CTRL_MAC:
+        break;
+    default:
+        error_report("Unexpected ctrl class %u", cvq_elem->ctrl.class);
+        goto out;
+    };
+
+    memcpy(&status, cvq_elem->in_buf, sizeof(status));
+    if (status != VIRTIO_NET_OK) {
+        goto out;
+    }
+
+    status = VIRTIO_NET_ERR;
+    virtio_net_handle_ctrl_iov(svq->vdev, &in, 1, &out, 1);
+    if (status != VIRTIO_NET_OK) {
+        error_report("Bad CVQ processing in model");
+        goto out;
+    }
+
+out:
+    guest_elem = g_steal_pointer(&cvq_elem->guest_elem);
+    iov_from_buf(guest_elem->in_sg, guest_elem->in_num, 0, &status,
+                 sizeof(status));
+    vhost_svq_push_elem(svq, guest_elem, sizeof(status));
 }
 
 static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 20/22] vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (18 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 19/22] vdpa: Buffer CVQ support on shadow virtqueue Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 21/22] vdpa: Add device migration blocker Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 22/22] vdpa: Add x-svq to NetdevVhostVDPAOptions Eugenio Pérez
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

Knowing the device features is needed for the CVQ SVQ, so SVQ knows
whether it can handle all commands. Extract the feature query from
vhost_vdpa_get_max_queue_pairs so we can reuse it.
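
With the features value queried once, both the queue-pair probe and the CVQ shadow logic can test bits from the same value. A small sketch using the VIRTIO_NET_F_CTRL_VQ bit number from the virtio spec (the helper name is illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_NET_F_CTRL_VQ 17   /* feature bit number, per the virtio spec */

/* Test a single feature bit in a 64-bit feature word. */
static int has_ctrl_vq(uint64_t features)
{
    return (features & (1ULL << VIRTIO_NET_F_CTRL_VQ)) != 0;
}
```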

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 3ae74f7fb5..b6ed30bec3 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -588,20 +588,24 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
     return nc;
 }
 
-static int vhost_vdpa_get_max_queue_pairs(int fd, int *has_cvq, Error **errp)
+static int vhost_vdpa_get_features(int fd, uint64_t *features, Error **errp)
+{
+    int ret = ioctl(fd, VHOST_GET_FEATURES, features);
+    if (unlikely(ret < 0)) {
+        error_setg_errno(errp, errno,
+                         "Fail to query features from vhost-vDPA device");
+    }
+    return ret;
+}
+
+static int vhost_vdpa_get_max_queue_pairs(int fd, uint64_t features,
+                                          int *has_cvq, Error **errp)
 {
     unsigned long config_size = offsetof(struct vhost_vdpa_config, buf);
     g_autofree struct vhost_vdpa_config *config = NULL;
     __virtio16 *max_queue_pairs;
-    uint64_t features;
     int ret;
 
-    ret = ioctl(fd, VHOST_GET_FEATURES, &features);
-    if (ret) {
-        error_setg(errp, "Fail to query features from vhost-vDPA device");
-        return ret;
-    }
-
     if (features & (1 << VIRTIO_NET_F_CTRL_VQ)) {
         *has_cvq = 1;
     } else {
@@ -631,10 +635,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
                         NetClientState *peer, Error **errp)
 {
     const NetdevVhostVDPAOptions *opts;
+    uint64_t features;
     int vdpa_device_fd;
     g_autofree NetClientState **ncs = NULL;
     NetClientState *nc;
-    int queue_pairs, i, has_cvq = 0;
+    int queue_pairs, r, i, has_cvq = 0;
 
     assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
     opts = &netdev->u.vhost_vdpa;
@@ -648,7 +653,12 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
         return -errno;
     }
 
-    queue_pairs = vhost_vdpa_get_max_queue_pairs(vdpa_device_fd,
+    r = vhost_vdpa_get_features(vdpa_device_fd, &features, errp);
+    if (unlikely(r < 0)) {
+        return r;
+    }
+
+    queue_pairs = vhost_vdpa_get_max_queue_pairs(vdpa_device_fd, features,
                                                  &has_cvq, errp);
     if (queue_pairs < 0) {
         qemu_close(vdpa_device_fd);
-- 
2.31.1




* [PATCH 21/22] vdpa: Add device migration blocker
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (19 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 20/22] vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 10:50 ` [PATCH 22/22] vdpa: Add x-svq to NetdevVhostVDPAOptions Eugenio Pérez
  21 siblings, 0 replies; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

The device may need to add migration blockers, for example if the vdpa
device uses features that are not compatible with migration.

Add that possibility here.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/vhost-vdpa.h |  1 +
 hw/virtio/vhost-vdpa.c         | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 1111d85643..d10a89303e 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -35,6 +35,7 @@ typedef struct vhost_vdpa {
     bool shadow_vqs_enabled;
     /* IOVA mapping used by the Shadow Virtqueue */
     VhostIOVATree *iova_tree;
+    Error *migration_blocker;
     GPtrArray *shadow_vqs;
     const VhostShadowVirtqueueOps *shadow_vq_ops;
     void *shadow_vq_ops_opaque;
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index c1162daecc..764a81b57f 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -20,6 +20,7 @@
 #include "hw/virtio/vhost-shadow-virtqueue.h"
 #include "hw/virtio/vhost-vdpa.h"
 #include "exec/address-spaces.h"
+#include "migration/blocker.h"
 #include "qemu/cutils.h"
 #include "qemu/main-loop.h"
 #include "cpu.h"
@@ -1016,6 +1017,13 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
         return true;
     }
 
+    if (v->migration_blocker) {
+        int r = migrate_add_blocker(v->migration_blocker, &err);
+        if (unlikely(r < 0)) {
+            goto err_migration_blocker;
+        }
+    }
+
     for (i = 0; i < v->shadow_vqs->len; ++i) {
         VirtQueue *vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
         VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
@@ -1057,6 +1065,9 @@ err_svq_setup:
         vhost_svq_stop(svq);
     }
 
+err_migration_blocker:
+    error_reportf_err(err, "Cannot setup SVQ %u: ", i);
+
     return false;
 }
 
@@ -1073,6 +1084,9 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
         vhost_vdpa_svq_unmap_rings(dev, svq);
     }
 
+    if (v->migration_blocker) {
+        migrate_del_blocker(v->migration_blocker);
+    }
 }
 
 static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
-- 
2.31.1




* [PATCH 22/22] vdpa: Add x-svq to NetdevVhostVDPAOptions
  2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (20 preceding siblings ...)
  2022-07-08 10:50 ` [PATCH 21/22] vdpa: Add device migration blocker Eugenio Pérez
@ 2022-07-08 10:50 ` Eugenio Pérez
  2022-07-08 12:52   ` Markus Armbruster
  21 siblings, 1 reply; 25+ messages in thread
From: Eugenio Pérez @ 2022-07-08 10:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella,
	Markus Armbruster, Paolo Bonzini, Harpreet Singh Anand,
	Jason Wang, Michael S. Tsirkin, Eli Cohen, Parav Pandit,
	Cornelia Huck

Finally offer the possibility to enable SVQ from the command line.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 qapi/net.json    |  9 +++++-
 net/vhost-vdpa.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 79 insertions(+), 4 deletions(-)

diff --git a/qapi/net.json b/qapi/net.json
index 9af11e9a3b..75ba2cb989 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -445,12 +445,19 @@
 # @queues: number of queues to be created for multiqueue vhost-vdpa
 #          (default: 1)
 #
+# @x-svq: Start device with (experimental) shadow virtqueue. (Since 7.1)
+#         (default: false)
+#
+# Features:
+# @unstable: Member @x-svq is experimental.
+#
 # Since: 5.1
 ##
 { 'struct': 'NetdevVhostVDPAOptions',
   'data': {
     '*vhostdev':     'str',
-    '*queues':       'int' } }
+    '*queues':       'int',
+    '*x-svq':        {'type': 'bool', 'features' : [ 'unstable'] } } }
 
 ##
 # @NetdevVmnetHostOptions:
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index b6ed30bec3..a6ebc234c0 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -92,6 +92,30 @@ const int vdpa_feature_bits[] = {
     VHOST_INVALID_FEATURE_BIT
 };
 
+/** Supported device specific feature bits with SVQ */
+static const uint64_t vdpa_svq_device_features =
+    BIT_ULL(VIRTIO_NET_F_CSUM) |
+    BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) |
+    BIT_ULL(VIRTIO_NET_F_CTRL_GUEST_OFFLOADS) |
+    BIT_ULL(VIRTIO_NET_F_MTU) |
+    BIT_ULL(VIRTIO_NET_F_MAC) |
+    BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) |
+    BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) |
+    BIT_ULL(VIRTIO_NET_F_GUEST_ECN) |
+    BIT_ULL(VIRTIO_NET_F_GUEST_UFO) |
+    BIT_ULL(VIRTIO_NET_F_HOST_TSO4) |
+    BIT_ULL(VIRTIO_NET_F_HOST_TSO6) |
+    BIT_ULL(VIRTIO_NET_F_HOST_ECN) |
+    BIT_ULL(VIRTIO_NET_F_HOST_UFO) |
+    BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) |
+    BIT_ULL(VIRTIO_NET_F_STATUS) |
+    BIT_ULL(VIRTIO_NET_F_CTRL_VQ) |
+    BIT_ULL(VIRTIO_NET_F_MQ) |
+    BIT_ULL(VIRTIO_F_ANY_LAYOUT) |
+    BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) |
+    BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
+    BIT_ULL(VIRTIO_NET_F_STANDBY);
+
 VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
 {
     VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
@@ -150,7 +174,11 @@ err_init:
 static void vhost_vdpa_cleanup(NetClientState *nc)
 {
     VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+    struct vhost_dev *dev = &s->vhost_net->dev;
 
+    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
+        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
+    }
     if (s->vhost_net) {
         vhost_net_cleanup(s->vhost_net);
         g_free(s->vhost_net);
@@ -398,6 +426,14 @@ static uint64_t vhost_vdpa_net_iov_len(const struct iovec *iov,
     return len;
 }
 
+static int vhost_vdpa_get_iova_range(int fd,
+                                     struct vhost_vdpa_iova_range *iova_range)
+{
+    int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
+
+    return ret < 0 ? -errno : 0;
+}
+
 static CVQElement *vhost_vdpa_net_cvq_copy_elem(VhostVDPAState *s,
                                                 VirtQueueElement *elem)
 {
@@ -558,7 +594,9 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
                                            int vdpa_device_fd,
                                            int queue_pair_index,
                                            int nvqs,
-                                           bool is_datapath)
+                                           bool is_datapath,
+                                           bool svq,
+                                           VhostIOVATree *iova_tree)
 {
     NetClientState *nc = NULL;
     VhostVDPAState *s;
@@ -576,9 +614,13 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
 
     s->vhost_vdpa.device_fd = vdpa_device_fd;
     s->vhost_vdpa.index = queue_pair_index;
+    s->vhost_vdpa.shadow_vqs_enabled = svq;
+    s->vhost_vdpa.iova_tree = iova_tree;
     if (!is_datapath) {
         s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
         s->vhost_vdpa.shadow_vq_ops_opaque = s;
+        error_setg(&s->vhost_vdpa.migration_blocker,
+                   "Migration disabled: vhost-vdpa uses CVQ.");
     }
     ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa, queue_pair_index, nvqs);
     if (ret) {
@@ -638,6 +680,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     uint64_t features;
     int vdpa_device_fd;
     g_autofree NetClientState **ncs = NULL;
+    g_autoptr(VhostIOVATree) iova_tree = NULL;
     NetClientState *nc;
     int queue_pairs, r, i, has_cvq = 0;
 
@@ -665,22 +708,45 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
         return queue_pairs;
     }
 
+    if (opts->x_svq) {
+        struct vhost_vdpa_iova_range iova_range;
+
+        uint64_t invalid_dev_features =
+            features & ~vdpa_svq_device_features &
+            /* Transport are all accepted at this point */
+            ~MAKE_64BIT_MASK(VIRTIO_TRANSPORT_F_START,
+                             VIRTIO_TRANSPORT_F_END - VIRTIO_TRANSPORT_F_START);
+
+        if (invalid_dev_features) {
+            error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
+                       invalid_dev_features);
+            goto err_svq;
+        }
+
+        vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
+        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
+    }
+
     ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
 
     for (i = 0; i < queue_pairs; i++) {
         ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
-                                     vdpa_device_fd, i, 2, true);
+                                     vdpa_device_fd, i, 2, true, opts->x_svq,
+                                     iova_tree);
         if (!ncs[i])
             goto err;
     }
 
     if (has_cvq) {
         nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
-                                 vdpa_device_fd, i, 1, false);
+                                 vdpa_device_fd, i, 1, false,
+                                 opts->x_svq, iova_tree);
         if (!nc)
             goto err;
     }
 
+    /* iova_tree ownership belongs to last NetClientState */
+    g_steal_pointer(&iova_tree);
     return 0;
 
 err:
@@ -689,6 +755,8 @@ err:
             qemu_del_net_client(ncs[i]);
         }
     }
+
+err_svq:
     qemu_close(vdpa_device_fd);
 
     return -1;
-- 
2.31.1




* Re: [PATCH 22/22] vdpa: Add x-svq to NetdevVhostVDPAOptions
  2022-07-08 10:50 ` [PATCH 22/22] vdpa: Add x-svq to NetdevVhostVDPAOptions Eugenio Pérez
@ 2022-07-08 12:52   ` Markus Armbruster
  2022-07-11  7:17     ` Eugenio Perez Martin
  0 siblings, 1 reply; 25+ messages in thread
From: Markus Armbruster @ 2022-07-08 12:52 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella, Paolo Bonzini,
	Harpreet Singh Anand, Jason Wang, Michael S. Tsirkin, Eli Cohen,
	Parav Pandit, Cornelia Huck

Eugenio Pérez <eperezma@redhat.com> writes:

> Finally offering the possibility to enable SVQ from the command line.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>

Please carry forward Acked-by and Reviewed-by you received for prior
revisions unless you change something that invalidates them.  This
ensures reviewers get credit, and also saves them time: if the tag is
still there, nothing much changed, and no need to look at it again.




* Re: [PATCH 22/22] vdpa: Add x-svq to NetdevVhostVDPAOptions
  2022-07-08 12:52   ` Markus Armbruster
@ 2022-07-11  7:17     ` Eugenio Perez Martin
  0 siblings, 0 replies; 25+ messages in thread
From: Eugenio Perez Martin @ 2022-07-11  7:17 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: qemu-level, Eric Blake, Stefan Hajnoczi, Liuxiangdong, Cindy Lu,
	Zhu Lingshan, Gonglei (Arei),
	Laurent Vivier, Gautam Dawar, Stefano Garzarella, Paolo Bonzini,
	Harpreet Singh Anand, Jason Wang, Michael S. Tsirkin, Eli Cohen,
	Parav Pandit, Cornelia Huck

On Fri, Jul 8, 2022 at 2:53 PM Markus Armbruster <armbru@redhat.com> wrote:
>
> Eugenio Pérez <eperezma@redhat.com> writes:
>
> > Finally offering the possibility to enable SVQ from the command line.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>
> Please carry forward Acked-by and Reviewed-by you received for prior
> revisions unless you change something that invalidates them.  This
> ensures reviewers get credit, and also saves them time: if the tag is
> still there, nothing much changed, and no need to look at it again.
>

I'm sorry, I didn't carry forward because I was not sure if we agreed
on the previous behavior. Now that I understand better what you meant,
yes, it should have been carried. I'll make sure I put it in for the
next revisions.

Thanks!




end of thread, other threads:[~2022-07-11  7:19 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-08 10:49 [PATCH 00/22] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
2022-07-08 10:49 ` [PATCH 01/22] vhost: Return earlier if used buffers overrun Eugenio Pérez
2022-07-08 10:49 ` [PATCH 02/22] vhost: move descriptor translation to vhost_svq_vring_write_descs Eugenio Pérez
2022-07-08 10:49 ` [PATCH 03/22] vdpa: Clean vhost_vdpa_dev_start(dev, false) Eugenio Pérez
2022-07-08 10:49 ` [PATCH 04/22] virtio-net: Expose ctrl virtqueue logic Eugenio Pérez
2022-07-08 10:49 ` [PATCH 05/22] vhost: Decouple vhost_svq_add_split from VirtQueueElement Eugenio Pérez
2022-07-08 10:49 ` [PATCH 06/22] vhost: Reorder vhost_svq_last_desc_of_chain Eugenio Pérez
2022-07-08 10:49 ` [PATCH 07/22] vhost: Add SVQElement Eugenio Pérez
2022-07-08 10:49 ` [PATCH 08/22] vhost: Move last chain id to SVQ element Eugenio Pérez
2022-07-08 10:50 ` [PATCH 09/22] vhost: Add opaque member to SVQElement Eugenio Pérez
2022-07-08 10:50 ` [PATCH 10/22] vdpa: Small rename of error labels Eugenio Pérez
2022-07-08 10:50 ` [PATCH 11/22] vhost: add vhost_svq_push_elem Eugenio Pérez
2022-07-08 10:50 ` [PATCH 12/22] vhost: Add vhost_svq_inject Eugenio Pérez
2022-07-08 10:50 ` [PATCH 13/22] vhost: add vhost_svq_poll Eugenio Pérez
2022-07-08 10:50 ` [PATCH 14/22] vhost: Add custom used buffer callback Eugenio Pérez
2022-07-08 10:50 ` [PATCH 15/22] vhost: Add svq avail_handler callback Eugenio Pérez
2022-07-08 10:50 ` [PATCH 16/22] vhost: add detach SVQ operation Eugenio Pérez
2022-07-08 10:50 ` [PATCH 17/22] vdpa: Export vhost_vdpa_dma_map and unmap calls Eugenio Pérez
2022-07-08 10:50 ` [PATCH 18/22] vdpa: manual forward CVQ buffers Eugenio Pérez
2022-07-08 10:50 ` [PATCH 19/22] vdpa: Buffer CVQ support on shadow virtqueue Eugenio Pérez
2022-07-08 10:50 ` [PATCH 20/22] vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs Eugenio Pérez
2022-07-08 10:50 ` [PATCH 21/22] vdpa: Add device migration blocker Eugenio Pérez
2022-07-08 10:50 ` [PATCH 22/22] vdpa: Add x-svq to NetdevVhostVDPAOptions Eugenio Pérez
2022-07-08 12:52   ` Markus Armbruster
2022-07-11  7:17     ` Eugenio Perez Martin
