* [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ
@ 2022-07-14 16:31 Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 01/19] vhost: move descriptor translation to vhost_svq_vring_write_descs Eugenio Pérez
                   ` (18 more replies)
  0 siblings, 19 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

The control virtqueue (CVQ) is used by the networking device to accept various
commands from the driver. It is a must for supporting advanced configurations.

An rx filter change event is issued by qemu when the device's MAC address
changes, provided the previous change has not yet been queried by external agents.

Shadow VirtQueue (SVQ) already makes it possible to track the state of
virtqueues, effectively intercepting them so qemu can track which regions of
memory are dirtied by device actions and need migration. However, this does not
cover the networking device state as seen by the driver, which is set through
CVQ messages such as MAC address changes.

This series uses the SVQ infrastructure to intercept the networking control
messages used by the device. This way, qemu is able to update the VirtIONet
device model and react to them. In particular, this series enables rx filter
change notification.
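The interception flow can be sketched as a stand-alone model (illustrative
only, not qemu's actual API: the header layout follows the virtio spec, but
`device_ack` and `handle_cvq_cmd` are hypothetical stand-ins for the SVQ
forwarding path):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Control header layout from the virtio spec (struct virtio_net_ctrl_hdr). */
struct ctrl_hdr {
    uint8_t class;
    uint8_t cmd;
};

#define CTRL_MAC      1  /* VIRTIO_NET_CTRL_MAC */
#define CTRL_MAC_SET  1  /* VIRTIO_NET_CTRL_MAC_ADDR_SET */
#define NET_OK        0  /* VIRTIO_NET_OK */

static uint8_t model_mac[6];  /* qemu's device-model view of the MAC */

/* Hypothetical stand-in: the real code forwards the buffer through SVQ
 * to the vdpa device and polls for the status byte. */
static uint8_t device_ack(const struct ctrl_hdr *hdr)
{
    (void)hdr;
    return NET_OK;
}

/* Only after the device acks is the model updated, so the model never
 * runs ahead of the real device state. */
static int handle_cvq_cmd(const struct ctrl_hdr *hdr, const uint8_t *data)
{
    if (device_ack(hdr) != NET_OK) {
        return -1;
    }
    if (hdr->class == CTRL_MAC && hdr->cmd == CTRL_MAC_SET) {
        memcpy(model_mac, data, sizeof(model_mac));
        /* here qemu would also emit the rx filter change event */
    }
    return 0;
}
```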

This is a prerequisite for achieving a net vdpa device with CVQ live migration.
It is a stripped-down version of [1], with error paths checked and no migration
enabled.

The first nine patches reorder and clean the code base so it is easier to apply
the later ones. No functional change should be noticeable from these changes.

Patches 11 to 14 expose the SVQ API so that other parts of qemu can interact
with it. In particular, they will be used by vhost-vdpa net to handle CVQ
messages.

Patches 15 to 17 enable the update of the virtio-net device model for each
CVQ message acknowledged by the device.

The last patches enable the x-svq parameter, forbidding device migration since
the device state is not restored in the destination's vdpa device yet. This
will be added in a later series building on this work.

Comments are welcome.

v2:
- (Comments from series [1]).
- Actively poll for the CVQ answer instead of relying on the async used callback
- Do not offer a new buffer to the device but reuse qemu's
- Use vhost_svq_add instead of the no-longer-needed vhost_svq_inject
- Delete used and detach callbacks, not needed anymore
- Embed members of SVQElement in VirtQueueElement
- Reuse the same buffers for all CVQ commands

[1] https://patchwork.kernel.org/project/qemu-devel/cover/20220706184008.1649478-1-eperezma@redhat.com/

Eugenio Pérez (19):
  vhost: move descriptor translation to vhost_svq_vring_write_descs
  virtio-net: Expose MAC_TABLE_ENTRIES
  virtio-net: Expose ctrl virtqueue logic
  vhost: Reorder vhost_svq_kick
  vhost: Move vhost_svq_kick call to vhost_svq_add
  vhost: Check for queue full at vhost_svq_add
  vhost: Decouple vhost_svq_add from VirtQueueElement
  vhost: Add SVQElement
  vhost: Track number of descs in SVQElement
  vhost: add vhost_svq_push_elem
  vhost: Expose vhost_svq_add
  vhost: add vhost_svq_poll
  vhost: Add svq avail_handler callback
  vdpa: Export vhost_vdpa_dma_map and unmap calls
  vdpa: manual forward CVQ buffers
  vdpa: Buffer CVQ support on shadow virtqueue
  vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs
  vdpa: Add device migration blocker
  vdpa: Add x-svq to NetdevVhostVDPAOptions

 qapi/net.json                      |   9 +-
 hw/virtio/vhost-shadow-virtqueue.h |  52 ++++-
 include/hw/virtio/vhost-vdpa.h     |   8 +
 include/hw/virtio/virtio-net.h     |   7 +
 hw/net/virtio-net.c                |  85 ++++---
 hw/virtio/vhost-shadow-virtqueue.c | 216 +++++++++++------
 hw/virtio/vhost-vdpa.c             |  25 +-
 net/vhost-vdpa.c                   | 357 +++++++++++++++++++++++++++--
 8 files changed, 635 insertions(+), 124 deletions(-)

-- 
2.31.1





* [PATCH v2 01/19] vhost: move descriptor translation to vhost_svq_vring_write_descs
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 02/19] virtio-net: Expose MAC_TABLE_ENTRIES Eugenio Pérez
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

Address translation is done for both in and out descriptors, so it is better
placed here.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 38 +++++++++++++++++++++---------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 56c96ebd13..e2184a4481 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -122,17 +122,35 @@ static bool vhost_svq_translate_addr(const VhostShadowVirtqueue *svq,
     return true;
 }
 
-static void vhost_vring_write_descs(VhostShadowVirtqueue *svq, hwaddr *sg,
-                                    const struct iovec *iovec, size_t num,
-                                    bool more_descs, bool write)
+/**
+ * Write descriptors to SVQ vring
+ *
+ * @svq: The shadow virtqueue
+ * @sg: Cache for hwaddr
+ * @iovec: The iovec from the guest
+ * @num: iovec length
+ * @more_descs: True if more descriptors come in the chain
+ * @write: True if they are writeable descriptors
+ *
+ * Return true if success, false otherwise and print error.
+ */
+static bool vhost_svq_vring_write_descs(VhostShadowVirtqueue *svq, hwaddr *sg,
+                                        const struct iovec *iovec, size_t num,
+                                        bool more_descs, bool write)
 {
     uint16_t i = svq->free_head, last = svq->free_head;
     unsigned n;
     uint16_t flags = write ? cpu_to_le16(VRING_DESC_F_WRITE) : 0;
     vring_desc_t *descs = svq->vring.desc;
+    bool ok;
 
     if (num == 0) {
-        return;
+        return true;
+    }
+
+    ok = vhost_svq_translate_addr(svq, sg, iovec, num);
+    if (unlikely(!ok)) {
+        return false;
     }
 
     for (n = 0; n < num; n++) {
@@ -150,6 +168,7 @@ static void vhost_vring_write_descs(VhostShadowVirtqueue *svq, hwaddr *sg,
     }
 
     svq->free_head = le16_to_cpu(svq->desc_next[last]);
+    return true;
 }
 
 static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
@@ -169,21 +188,18 @@ static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
         return false;
     }
 
-    ok = vhost_svq_translate_addr(svq, sgs, elem->out_sg, elem->out_num);
+    ok = vhost_svq_vring_write_descs(svq, sgs, elem->out_sg, elem->out_num,
+                                     elem->in_num > 0, false);
     if (unlikely(!ok)) {
         return false;
     }
-    vhost_vring_write_descs(svq, sgs, elem->out_sg, elem->out_num,
-                            elem->in_num > 0, false);
-
 
-    ok = vhost_svq_translate_addr(svq, sgs, elem->in_sg, elem->in_num);
+    ok = vhost_svq_vring_write_descs(svq, sgs, elem->in_sg, elem->in_num, false,
+                                     true);
     if (unlikely(!ok)) {
         return false;
     }
 
-    vhost_vring_write_descs(svq, sgs, elem->in_sg, elem->in_num, false, true);
-
     /*
      * Put the entry in the available array (but don't update avail->idx until
      * they do sync).
-- 
2.31.1




* [PATCH v2 02/19] virtio-net: Expose MAC_TABLE_ENTRIES
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 01/19] vhost: move descriptor translation to vhost_svq_vring_write_descs Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 03/19] virtio-net: Expose ctrl virtqueue logic Eugenio Pérez
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

The vhost-vdpa control virtqueue needs to know the maximum number of entries
supported by the virtio-net device, so it can tell whether it is possible to
apply the filter.
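The kind of check this enables can be sketched as follows (`mac_table_fits` is
a hypothetical helper, not a function from this series; a MAC table set command
carries a unicast and a multicast list, and the model holds at most
MAC_TABLE_ENTRIES entries in total before falling back to promiscuous modes):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAC_TABLE_ENTRIES 64  /* now exposed in virtio-net.h */

/* Hypothetical helper: decide whether a VIRTIO_NET_CTRL_MAC_TABLE_SET
 * command can be applied exactly by the device model. */
static bool mac_table_fits(uint32_t uni_entries, uint32_t multi_entries)
{
    return uni_entries + multi_entries <= MAC_TABLE_ENTRIES;
}
```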

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/virtio-net.h | 3 +++
 hw/net/virtio-net.c            | 1 -
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index eb87032627..cce1c554f7 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -35,6 +35,9 @@ OBJECT_DECLARE_SIMPLE_TYPE(VirtIONet, VIRTIO_NET)
  * and latency. */
 #define TX_BURST 256
 
+/* Maximum VIRTIO_NET_CTRL_MAC_TABLE_SET unicast + multicast entries. */
+#define MAC_TABLE_ENTRIES    64
+
 typedef struct virtio_net_conf
 {
     uint32_t txtimer;
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 7ad948ee7c..f83e96e4ce 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -49,7 +49,6 @@
 
 #define VIRTIO_NET_VM_VERSION    11
 
-#define MAC_TABLE_ENTRIES    64
 #define MAX_VLAN    (1 << 12)   /* Per 802.1Q definition */
 
 /* previously fixed value */
-- 
2.31.1




* [PATCH v2 03/19] virtio-net: Expose ctrl virtqueue logic
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 01/19] vhost: move descriptor translation to vhost_svq_vring_write_descs Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 02/19] virtio-net: Expose MAC_TABLE_ENTRIES Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 04/19] vhost: Reorder vhost_svq_kick Eugenio Pérez
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

This allows external vhost-net devices to modify the state of the
VirtIO device model once the vhost-vdpa device has acknowledged the
control commands.
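The new entry point takes raw scatter-gather lists instead of a
VirtQueueElement, so a caller outside the virtqueue path can feed it buffers it
assembled itself. A minimal sketch of that calling convention, with a trivial
stand-in (`handle_ctrl_iov` below is not the real virtio_net_handle_ctrl_iov,
only a function of the same shape):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

typedef uint8_t ctrl_ack;
#define NET_OK 0

/* Stand-in with the same shape as virtio_net_handle_ctrl_iov: read the
 * command from out_sg, write the one-byte status into in_sg, and return
 * the number of bytes written (0 on error). */
static size_t handle_ctrl_iov(const struct iovec *in_sg, unsigned in_num,
                              const struct iovec *out_sg, unsigned out_num)
{
    if (in_num < 1 || in_sg[0].iov_len < sizeof(ctrl_ack)) {
        return 0;  /* missing status buffer: error path */
    }
    (void)out_sg;
    (void)out_num;
    ctrl_ack status = NET_OK;
    memcpy(in_sg[0].iov_base, &status, sizeof(status));
    return sizeof(status);
}
```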

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/virtio-net.h |  4 ++
 hw/net/virtio-net.c            | 84 ++++++++++++++++++++--------------
 2 files changed, 53 insertions(+), 35 deletions(-)

diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index cce1c554f7..ef234ffe7e 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -221,6 +221,10 @@ struct VirtIONet {
     struct EBPFRSSContext ebpf_rss;
 };
 
+size_t virtio_net_handle_ctrl_iov(VirtIODevice *vdev,
+                                  const struct iovec *in_sg, unsigned in_num,
+                                  const struct iovec *out_sg,
+                                  unsigned out_num);
 void virtio_net_set_netclient_name(VirtIONet *n, const char *name,
                                    const char *type);
 
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index f83e96e4ce..dd0d056fde 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1433,57 +1433,71 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
     return VIRTIO_NET_OK;
 }
 
-static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
+size_t virtio_net_handle_ctrl_iov(VirtIODevice *vdev,
+                                  const struct iovec *in_sg, unsigned in_num,
+                                  const struct iovec *out_sg,
+                                  unsigned out_num)
 {
     VirtIONet *n = VIRTIO_NET(vdev);
     struct virtio_net_ctrl_hdr ctrl;
     virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
-    VirtQueueElement *elem;
     size_t s;
     struct iovec *iov, *iov2;
-    unsigned int iov_cnt;
+
+    if (iov_size(in_sg, in_num) < sizeof(status) ||
+        iov_size(out_sg, out_num) < sizeof(ctrl)) {
+        virtio_error(vdev, "virtio-net ctrl missing headers");
+        return 0;
+    }
+
+    iov2 = iov = g_memdup2(out_sg, sizeof(struct iovec) * out_num);
+    s = iov_to_buf(iov, out_num, 0, &ctrl, sizeof(ctrl));
+    iov_discard_front(&iov, &out_num, sizeof(ctrl));
+    if (s != sizeof(ctrl)) {
+        status = VIRTIO_NET_ERR;
+    } else if (ctrl.class == VIRTIO_NET_CTRL_RX) {
+        status = virtio_net_handle_rx_mode(n, ctrl.cmd, iov, out_num);
+    } else if (ctrl.class == VIRTIO_NET_CTRL_MAC) {
+        status = virtio_net_handle_mac(n, ctrl.cmd, iov, out_num);
+    } else if (ctrl.class == VIRTIO_NET_CTRL_VLAN) {
+        status = virtio_net_handle_vlan_table(n, ctrl.cmd, iov, out_num);
+    } else if (ctrl.class == VIRTIO_NET_CTRL_ANNOUNCE) {
+        status = virtio_net_handle_announce(n, ctrl.cmd, iov, out_num);
+    } else if (ctrl.class == VIRTIO_NET_CTRL_MQ) {
+        status = virtio_net_handle_mq(n, ctrl.cmd, iov, out_num);
+    } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
+        status = virtio_net_handle_offloads(n, ctrl.cmd, iov, out_num);
+    }
+
+    s = iov_from_buf(in_sg, in_num, 0, &status, sizeof(status));
+    assert(s == sizeof(status));
+
+    g_free(iov2);
+    return sizeof(status);
+}
+
+static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
+{
+    VirtQueueElement *elem;
 
     for (;;) {
+        size_t written;
         elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
         if (!elem) {
             break;
         }
-        if (iov_size(elem->in_sg, elem->in_num) < sizeof(status) ||
-            iov_size(elem->out_sg, elem->out_num) < sizeof(ctrl)) {
-            virtio_error(vdev, "virtio-net ctrl missing headers");
+
+        written = virtio_net_handle_ctrl_iov(vdev, elem->in_sg, elem->in_num,
+                                             elem->out_sg, elem->out_num);
+        if (written > 0) {
+            virtqueue_push(vq, elem, written);
+            virtio_notify(vdev, vq);
+            g_free(elem);
+        } else {
             virtqueue_detach_element(vq, elem, 0);
             g_free(elem);
             break;
         }
-
-        iov_cnt = elem->out_num;
-        iov2 = iov = g_memdup2(elem->out_sg,
-                               sizeof(struct iovec) * elem->out_num);
-        s = iov_to_buf(iov, iov_cnt, 0, &ctrl, sizeof(ctrl));
-        iov_discard_front(&iov, &iov_cnt, sizeof(ctrl));
-        if (s != sizeof(ctrl)) {
-            status = VIRTIO_NET_ERR;
-        } else if (ctrl.class == VIRTIO_NET_CTRL_RX) {
-            status = virtio_net_handle_rx_mode(n, ctrl.cmd, iov, iov_cnt);
-        } else if (ctrl.class == VIRTIO_NET_CTRL_MAC) {
-            status = virtio_net_handle_mac(n, ctrl.cmd, iov, iov_cnt);
-        } else if (ctrl.class == VIRTIO_NET_CTRL_VLAN) {
-            status = virtio_net_handle_vlan_table(n, ctrl.cmd, iov, iov_cnt);
-        } else if (ctrl.class == VIRTIO_NET_CTRL_ANNOUNCE) {
-            status = virtio_net_handle_announce(n, ctrl.cmd, iov, iov_cnt);
-        } else if (ctrl.class == VIRTIO_NET_CTRL_MQ) {
-            status = virtio_net_handle_mq(n, ctrl.cmd, iov, iov_cnt);
-        } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
-            status = virtio_net_handle_offloads(n, ctrl.cmd, iov, iov_cnt);
-        }
-
-        s = iov_from_buf(elem->in_sg, elem->in_num, 0, &status, sizeof(status));
-        assert(s == sizeof(status));
-
-        virtqueue_push(vq, elem, sizeof(status));
-        virtio_notify(vdev, vq);
-        g_free(iov2);
-        g_free(elem);
     }
 }
 
-- 
2.31.1




* [PATCH v2 04/19] vhost: Reorder vhost_svq_kick
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (2 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 03/19] virtio-net: Expose ctrl virtqueue logic Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 05/19] vhost: Move vhost_svq_kick call to vhost_svq_add Eugenio Pérez
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

Future code needs to call it from vhost_svq_add.

No functional change intended.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index e2184a4481..fd1839cec5 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -215,6 +215,20 @@ static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
     return true;
 }
 
+static void vhost_svq_kick(VhostShadowVirtqueue *svq)
+{
+    /*
+     * We need to expose the available array entries before checking the used
+     * flags
+     */
+    smp_mb();
+    if (svq->vring.used->flags & VRING_USED_F_NO_NOTIFY) {
+        return;
+    }
+
+    event_notifier_set(&svq->hdev_kick);
+}
+
 /**
  * Add an element to a SVQ.
  *
@@ -235,20 +249,6 @@ static bool vhost_svq_add(VhostShadowVirtqueue *svq, VirtQueueElement *elem)
     return true;
 }
 
-static void vhost_svq_kick(VhostShadowVirtqueue *svq)
-{
-    /*
-     * We need to expose the available array entries before checking the used
-     * flags
-     */
-    smp_mb();
-    if (svq->vring.used->flags & VRING_USED_F_NO_NOTIFY) {
-        return;
-    }
-
-    event_notifier_set(&svq->hdev_kick);
-}
-
 /**
  * Forward available buffers.
  *
-- 
2.31.1




* [PATCH v2 05/19] vhost: Move vhost_svq_kick call to vhost_svq_add
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (3 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 04/19] vhost: Reorder vhost_svq_kick Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 06/19] vhost: Check for queue full at vhost_svq_add Eugenio Pérez
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

The series needs to expose vhost_svq_add with full functionality, including
the kick.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index fd1839cec5..e5a4a62daa 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -246,6 +246,7 @@ static bool vhost_svq_add(VhostShadowVirtqueue *svq, VirtQueueElement *elem)
     }
 
     svq->ring_id_maps[qemu_head] = elem;
+    vhost_svq_kick(svq);
     return true;
 }
 
@@ -306,7 +307,6 @@ static void vhost_handle_guest_kick(VhostShadowVirtqueue *svq)
                 /* VQ is broken, just return and ignore any other kicks */
                 return;
             }
-            vhost_svq_kick(svq);
         }
 
         virtio_queue_set_notification(svq->vq, true);
-- 
2.31.1




* [PATCH v2 06/19] vhost: Check for queue full at vhost_svq_add
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (4 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 05/19] vhost: Move vhost_svq_kick call to vhost_svq_add Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 07/19] vhost: Decouple vhost_svq_add from VirtQueueElement Eugenio Pérez
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

The series needs to expose vhost_svq_add with full functionality, including
the check for a full queue.
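With the errno-style return value this patch introduces, a caller can tell a
transiently full queue (-ENOSPC, retry once descriptors are used) from a broken
element (-EINVAL). A toy model of that dispatch (`svq_add` here is a sketch,
not the real vhost_svq_add):

```c
#include <assert.h>
#include <errno.h>

/* Toy queue with a fixed number of free descriptor slots. */
static unsigned free_slots = 2;

static int svq_add(unsigned ndescs)
{
    if (ndescs == 0) {
        return -EINVAL;  /* broken element: no descriptors at all */
    }
    if (ndescs > free_slots) {
        return -ENOSPC;  /* queue full: caller should queue it and retry */
    }
    free_slots -= ndescs;
    return 0;
}
```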

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 59 +++++++++++++++++-------------
 1 file changed, 33 insertions(+), 26 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index e5a4a62daa..aee9891a67 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -233,21 +233,29 @@ static void vhost_svq_kick(VhostShadowVirtqueue *svq)
  * Add an element to a SVQ.
  *
  * The caller must check that there is enough slots for the new element. It
- * takes ownership of the element: In case of failure, it is free and the SVQ
- * is considered broken.
+ * takes ownership of the element: In case of failure not ENOSPC, it is free.
+ *
+ * Return -EINVAL if element is invalid, -ENOSPC if dev queue is full
  */
-static bool vhost_svq_add(VhostShadowVirtqueue *svq, VirtQueueElement *elem)
+static int vhost_svq_add(VhostShadowVirtqueue *svq, VirtQueueElement *elem)
 {
     unsigned qemu_head;
-    bool ok = vhost_svq_add_split(svq, elem, &qemu_head);
+    unsigned ndescs = elem->in_num + elem->out_num;
+    bool ok;
+
+    if (unlikely(ndescs > vhost_svq_available_slots(svq))) {
+        return -ENOSPC;
+    }
+
+    ok = vhost_svq_add_split(svq, elem, &qemu_head);
     if (unlikely(!ok)) {
         g_free(elem);
-        return false;
+        return -EINVAL;
     }
 
     svq->ring_id_maps[qemu_head] = elem;
     vhost_svq_kick(svq);
-    return true;
+    return 0;
 }
 
 /**
@@ -274,7 +282,7 @@ static void vhost_handle_guest_kick(VhostShadowVirtqueue *svq)
 
         while (true) {
             VirtQueueElement *elem;
-            bool ok;
+            int r;
 
             if (svq->next_guest_avail_elem) {
                 elem = g_steal_pointer(&svq->next_guest_avail_elem);
@@ -286,25 +294,24 @@ static void vhost_handle_guest_kick(VhostShadowVirtqueue *svq)
                 break;
             }
 
-            if (elem->out_num + elem->in_num > vhost_svq_available_slots(svq)) {
-                /*
-                 * This condition is possible since a contiguous buffer in GPA
-                 * does not imply a contiguous buffer in qemu's VA
-                 * scatter-gather segments. If that happens, the buffer exposed
-                 * to the device needs to be a chain of descriptors at this
-                 * moment.
-                 *
-                 * SVQ cannot hold more available buffers if we are here:
-                 * queue the current guest descriptor and ignore further kicks
-                 * until some elements are used.
-                 */
-                svq->next_guest_avail_elem = elem;
-                return;
-            }
-
-            ok = vhost_svq_add(svq, elem);
-            if (unlikely(!ok)) {
-                /* VQ is broken, just return and ignore any other kicks */
+            r = vhost_svq_add(svq, elem);
+            if (unlikely(r != 0)) {
+                if (r == -ENOSPC) {
+                    /*
+                     * This condition is possible since a contiguous buffer in
+                     * GPA does not imply a contiguous buffer in qemu's VA
+                     * scatter-gather segments. If that happens, the buffer
+                     * exposed to the device needs to be a chain of descriptors
+                     * at this moment.
+                     *
+                     * SVQ cannot hold more available buffers if we are here:
+                     * queue the current guest descriptor and ignore kicks
+                     * until some elements are used.
+                     */
+                    svq->next_guest_avail_elem = elem;
+                }
+
+                /* VQ is full or broken, just return and ignore kicks */
                 return;
             }
         }
-- 
2.31.1




* [PATCH v2 07/19] vhost: Decouple vhost_svq_add from VirtQueueElement
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (5 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 06/19] vhost: Check for queue full at vhost_svq_add Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 08/19] vhost: Add SVQElement Eugenio Pérez
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

A VirtQueueElement comes from the guest, but we are moving towards allowing
SVQ to modify the element presented to the device without the guest's
knowledge.

To do so, make SVQ accept sg buffers directly, instead of using
VirtQueueElement.

Add vhost_svq_add_element to keep the convenience of element-based callers.
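The decoupling pattern can be sketched in miniature (illustrative shapes only,
not qemu's types; the real element is VirtQueueElement and the real functions
are vhost_svq_add and vhost_svq_add_element):

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Simplified element, mirroring VirtQueueElement's sg fields. */
struct element {
    struct iovec *out_sg; size_t out_num;
    struct iovec *in_sg;  size_t in_num;
};

/* Core API works on raw sg lists, so callers may pass buffers the guest
 * never saw. Here it only counts the descriptors it would add. */
static int svq_add(const struct iovec *out_sg, size_t out_num,
                   const struct iovec *in_sg, size_t in_num)
{
    (void)out_sg;
    (void)in_sg;
    return (int)(out_num + in_num);
}

/* Convenience wrapper keeps the old element-based call sites working. */
static int svq_add_element(const struct element *e)
{
    return svq_add(e->out_sg, e->out_num, e->in_sg, e->in_num);
}
```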

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 33 ++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index aee9891a67..b005a457c6 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -172,30 +172,31 @@ static bool vhost_svq_vring_write_descs(VhostShadowVirtqueue *svq, hwaddr *sg,
 }
 
 static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
-                                VirtQueueElement *elem, unsigned *head)
+                                const struct iovec *out_sg, size_t out_num,
+                                const struct iovec *in_sg, size_t in_num,
+                                unsigned *head)
 {
     unsigned avail_idx;
     vring_avail_t *avail = svq->vring.avail;
     bool ok;
-    g_autofree hwaddr *sgs = g_new(hwaddr, MAX(elem->out_num, elem->in_num));
+    g_autofree hwaddr *sgs = g_new(hwaddr, MAX(out_num, in_num));
 
     *head = svq->free_head;
 
     /* We need some descriptors here */
-    if (unlikely(!elem->out_num && !elem->in_num)) {
+    if (unlikely(!out_num && !in_num)) {
         qemu_log_mask(LOG_GUEST_ERROR,
                       "Guest provided element with no descriptors");
         return false;
     }
 
-    ok = vhost_svq_vring_write_descs(svq, sgs, elem->out_sg, elem->out_num,
-                                     elem->in_num > 0, false);
+    ok = vhost_svq_vring_write_descs(svq, sgs, out_sg, out_num, in_num > 0,
+                                     false);
     if (unlikely(!ok)) {
         return false;
     }
 
-    ok = vhost_svq_vring_write_descs(svq, sgs, elem->in_sg, elem->in_num, false,
-                                     true);
+    ok = vhost_svq_vring_write_descs(svq, sgs, in_sg, in_num, false, true);
     if (unlikely(!ok)) {
         return false;
     }
@@ -237,17 +238,19 @@ static void vhost_svq_kick(VhostShadowVirtqueue *svq)
  *
  * Return -EINVAL if element is invalid, -ENOSPC if dev queue is full
  */
-static int vhost_svq_add(VhostShadowVirtqueue *svq, VirtQueueElement *elem)
+static int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
+                          size_t out_num, const struct iovec *in_sg,
+                          size_t in_num, VirtQueueElement *elem)
 {
     unsigned qemu_head;
-    unsigned ndescs = elem->in_num + elem->out_num;
+    unsigned ndescs = in_num + out_num;
     bool ok;
 
     if (unlikely(ndescs > vhost_svq_available_slots(svq))) {
         return -ENOSPC;
     }
 
-    ok = vhost_svq_add_split(svq, elem, &qemu_head);
+    ok = vhost_svq_add_split(svq, out_sg, out_num, in_sg, in_num, &qemu_head);
     if (unlikely(!ok)) {
         g_free(elem);
         return -EINVAL;
@@ -258,6 +261,14 @@ static int vhost_svq_add(VhostShadowVirtqueue *svq, VirtQueueElement *elem)
     return 0;
 }
 
+/* Convenience wrapper to add a guest's element to SVQ */
+static int vhost_svq_add_element(VhostShadowVirtqueue *svq,
+                                 VirtQueueElement *elem)
+{
+    return vhost_svq_add(svq, elem->out_sg, elem->out_num, elem->in_sg,
+                         elem->in_num, elem);
+}
+
 /**
  * Forward available buffers.
  *
@@ -294,7 +305,7 @@ static void vhost_handle_guest_kick(VhostShadowVirtqueue *svq)
                 break;
             }
 
-            r = vhost_svq_add(svq, elem);
+            r = vhost_svq_add_element(svq, elem);
             if (unlikely(r != 0)) {
                 if (r == -ENOSPC) {
                     /*
-- 
2.31.1




* [PATCH v2 08/19] vhost: Add SVQElement
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (6 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 07/19] vhost: Decouple vhost_svq_add from VirtQueueElement Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 09/19] vhost: Track number of descs in SVQElement Eugenio Pérez
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

This will allow SVQ to attach context to the different queue elements.

This patch only stores the actual element; no functional change intended.
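Embedding the VirtQueueElement as the first member lets code recover the SVQ
wrapper from a pointer to the inner element. A minimal illustration of the
idiom (toy struct names, not qemu's; the macro is a container_of-style helper
written out by hand):

```c
#include <assert.h>
#include <stddef.h>

struct virt_queue_element { int index; };

/* SVQ wrapper: the guest element plus SVQ-private context. */
struct svq_element {
    struct virt_queue_element elem;  /* embedded first member */
    int svq_context;
};

/* container_of-style recovery of the wrapper from the embedded member. */
#define SVQ_ELEM_FROM_ELEM(p) \
    ((struct svq_element *)((char *)(p) - offsetof(struct svq_element, elem)))
```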

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h |  8 ++++++--
 hw/virtio/vhost-shadow-virtqueue.c | 32 ++++++++++++++++--------------
 2 files changed, 23 insertions(+), 17 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index c132c994e9..f35d4b8f90 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -15,6 +15,10 @@
 #include "standard-headers/linux/vhost_types.h"
 #include "hw/virtio/vhost-iova-tree.h"
 
+typedef struct SVQElement {
+    VirtQueueElement elem;
+} SVQElement;
+
 /* Shadow virtqueue to relay notifications */
 typedef struct VhostShadowVirtqueue {
     /* Shadow vring */
@@ -48,10 +52,10 @@ typedef struct VhostShadowVirtqueue {
     VhostIOVATree *iova_tree;
 
     /* Map for use the guest's descriptors */
-    VirtQueueElement **ring_id_maps;
+    SVQElement **ring_id_maps;
 
     /* Next VirtQueue element that guest made available */
-    VirtQueueElement *next_guest_avail_elem;
+    SVQElement *next_guest_avail_elem;
 
     /*
      * Backup next field for each descriptor so we can recover securely, not
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index b005a457c6..442ca3cbd3 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -240,7 +240,7 @@ static void vhost_svq_kick(VhostShadowVirtqueue *svq)
  */
 static int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
                           size_t out_num, const struct iovec *in_sg,
-                          size_t in_num, VirtQueueElement *elem)
+                          size_t in_num, SVQElement *svq_elem)
 {
     unsigned qemu_head;
     unsigned ndescs = in_num + out_num;
@@ -252,21 +252,22 @@ static int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
 
     ok = vhost_svq_add_split(svq, out_sg, out_num, in_sg, in_num, &qemu_head);
     if (unlikely(!ok)) {
-        g_free(elem);
+        g_free(svq_elem);
         return -EINVAL;
     }
 
-    svq->ring_id_maps[qemu_head] = elem;
+    svq->ring_id_maps[qemu_head] = svq_elem;
     vhost_svq_kick(svq);
     return 0;
 }
 
 /* Convenience wrapper to add a guest's element to SVQ */
 static int vhost_svq_add_element(VhostShadowVirtqueue *svq,
-                                 VirtQueueElement *elem)
+                                 SVQElement *svq_elem)
 {
+    VirtQueueElement *elem = &svq_elem->elem;
     return vhost_svq_add(svq, elem->out_sg, elem->out_num, elem->in_sg,
-                         elem->in_num, elem);
+                         elem->in_num, svq_elem);
 }
 
 /**
@@ -292,7 +293,7 @@ static void vhost_handle_guest_kick(VhostShadowVirtqueue *svq)
         virtio_queue_set_notification(svq->vq, false);
 
         while (true) {
-            VirtQueueElement *elem;
+            SVQElement *elem;
             int r;
 
             if (svq->next_guest_avail_elem) {
@@ -386,9 +387,10 @@ static uint16_t vhost_svq_last_desc_of_chain(const VhostShadowVirtqueue *svq,
     return i;
 }
 
-static VirtQueueElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
-                                           uint32_t *len)
+static SVQElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
+                                     uint32_t *len)
 {
+    SVQElement *elem;
     const vring_used_t *used = svq->vring.used;
     vring_used_elem_t used_elem;
     uint16_t last_used, last_used_chain, num;
@@ -417,8 +419,8 @@ static VirtQueueElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
         return NULL;
     }
 
-    num = svq->ring_id_maps[used_elem.id]->in_num +
-          svq->ring_id_maps[used_elem.id]->out_num;
+    elem = svq->ring_id_maps[used_elem.id];
+    num = elem->elem.in_num + elem->elem.out_num;
     last_used_chain = vhost_svq_last_desc_of_chain(svq, num, used_elem.id);
     svq->desc_next[last_used_chain] = svq->free_head;
     svq->free_head = used_elem.id;
@@ -439,8 +441,8 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
         vhost_svq_disable_notification(svq);
         while (true) {
             uint32_t len;
-            g_autofree VirtQueueElement *elem = vhost_svq_get_buf(svq, &len);
-            if (!elem) {
+            g_autofree SVQElement *svq_elem = vhost_svq_get_buf(svq, &len);
+            if (!svq_elem) {
                 break;
             }
 
@@ -448,11 +450,11 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
                 qemu_log_mask(LOG_GUEST_ERROR,
                          "More than %u used buffers obtained in a %u size SVQ",
                          i, svq->vring.num);
-                virtqueue_fill(vq, elem, len, i);
+                virtqueue_fill(vq, &svq_elem->elem, len, i);
                 virtqueue_flush(vq, i);
                 return;
             }
-            virtqueue_fill(vq, elem, len, i++);
+            virtqueue_fill(vq, &svq_elem->elem, len, i++);
         }
 
         virtqueue_flush(vq, i);
@@ -594,7 +596,7 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
     memset(svq->vring.desc, 0, driver_size);
     svq->vring.used = qemu_memalign(qemu_real_host_page_size(), device_size);
     memset(svq->vring.used, 0, device_size);
-    svq->ring_id_maps = g_new0(VirtQueueElement *, svq->vring.num);
+    svq->ring_id_maps = g_new0(SVQElement *, svq->vring.num);
     svq->desc_next = g_new0(uint16_t, svq->vring.num);
     for (unsigned i = 0; i < svq->vring.num - 1; i++) {
         svq->desc_next[i] = cpu_to_le16(i + 1);
-- 
2.31.1




* [PATCH v2 09/19] vhost: Track number of descs in SVQElement
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (7 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 08/19] vhost: Add SVQElement Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-15  4:10   ` Jason Wang
  2022-07-14 16:31 ` [PATCH v2 10/19] vhost: add vhost_svq_push_elem Eugenio Pérez
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

Since CVQ will be able to modify elements, the number of descriptors in
the guest's chain may not match the number of descriptors exposed to the
device. Track it separately.
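The point is that once CVQ rewrites a guest chain into its own buffers, the count of descriptors to free on the device ring must be the count that was exposed, not the guest's in_num + out_num. A minimal sketch under that assumption (hypothetical names, not the QEMU structs):

```c
#include <assert.h>

/* Illustrative model: the guest's chain and the exposed chain can differ. */
typedef struct {
    unsigned guest_out, guest_in;  /* guest's scatter-gather counts */
    unsigned ndescs;               /* descriptors actually exposed */
} SvqElem;

/* Record the exposed count at add time, e.g. after CVQ copied several
 * guest out-buffers into one contiguous shadow buffer. */
static void expose_rewritten(SvqElem *e, unsigned dev_out, unsigned dev_in)
{
    e->ndescs = dev_out + dev_in;
}

/* Freeing the descriptor chain must use the exposed count... */
static unsigned descs_to_free(const SvqElem *e)
{
    return e->ndescs;  /* ...NOT e->guest_out + e->guest_in */
}
```

This mirrors why the patch replaces `elem->elem.in_num + elem->elem.out_num` with `elem->ndescs` in `vhost_svq_get_buf`.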

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h |  6 ++++++
 hw/virtio/vhost-shadow-virtqueue.c | 10 +++++-----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index f35d4b8f90..143c86a568 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -17,6 +17,12 @@
 
 typedef struct SVQElement {
     VirtQueueElement elem;
+
+    /*
+     * Number of descriptors exposed to the device. May or may not match
+     * guest's
+     */
+    unsigned int ndescs;
 } SVQElement;
 
 /* Shadow virtqueue to relay notifications */
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 442ca3cbd3..3b112c4ec8 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -243,10 +243,10 @@ static int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
                           size_t in_num, SVQElement *svq_elem)
 {
     unsigned qemu_head;
-    unsigned ndescs = in_num + out_num;
+    svq_elem->ndescs = in_num + out_num;
     bool ok;
 
-    if (unlikely(ndescs > vhost_svq_available_slots(svq))) {
+    if (unlikely(svq_elem->ndescs > vhost_svq_available_slots(svq))) {
         return -ENOSPC;
     }
 
@@ -393,7 +393,7 @@ static SVQElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
     SVQElement *elem;
     const vring_used_t *used = svq->vring.used;
     vring_used_elem_t used_elem;
-    uint16_t last_used, last_used_chain, num;
+    uint16_t last_used, last_used_chain;
 
     if (!vhost_svq_more_used(svq)) {
         return NULL;
@@ -420,8 +420,8 @@ static SVQElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
     }
 
     elem = svq->ring_id_maps[used_elem.id];
-    num = elem->elem.in_num + elem->elem.out_num;
-    last_used_chain = vhost_svq_last_desc_of_chain(svq, num, used_elem.id);
+    last_used_chain = vhost_svq_last_desc_of_chain(svq, elem->ndescs,
+                                                   used_elem.id);
     svq->desc_next[last_used_chain] = svq->free_head;
     svq->free_head = used_elem.id;
 
-- 
2.31.1




* [PATCH v2 10/19] vhost: add vhost_svq_push_elem
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (8 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 09/19] vhost: Track number of descs in SVQElement Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 11/19] vhost: Expose vhost_svq_add Eugenio Pérez
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

This function allows external SVQ users to return the guest's available
buffers.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h |  3 +++
 hw/virtio/vhost-shadow-virtqueue.c | 16 ++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 143c86a568..69b352c707 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -84,6 +84,9 @@ typedef struct VhostShadowVirtqueue {
 
 bool vhost_svq_valid_features(uint64_t features, Error **errp);
 
+void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
+                         uint32_t len);
+
 void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
 void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
 void vhost_svq_get_vring_addr(const VhostShadowVirtqueue *svq,
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 3b112c4ec8..95a8ab8477 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -429,6 +429,22 @@ static SVQElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
     return g_steal_pointer(&svq->ring_id_maps[used_elem.id]);
 }
 
+/**
+ * Push an element to SVQ, returning it to the guest.
+ */
+void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *svq_elem,
+                         uint32_t len)
+{
+    virtqueue_push(svq->vq, &svq_elem->elem, len);
+    if (svq->next_guest_avail_elem) {
+        /*
+         * Avail ring was full when vhost_svq_flush was called, so it's a
+         * good moment to make more descriptors available if possible.
+         */
+        vhost_handle_guest_kick(svq);
+    }
+}
+
 static void vhost_svq_flush(VhostShadowVirtqueue *svq,
                             bool check_for_avail_queue)
 {
-- 
2.31.1




* [PATCH v2 11/19] vhost: Expose vhost_svq_add
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (9 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 10/19] vhost: add vhost_svq_push_elem Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 12/19] vhost: add vhost_svq_poll Eugenio Pérez
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

This allows external parts of SVQ to forward custom buffers to the
device.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h | 3 +++
 hw/virtio/vhost-shadow-virtqueue.c | 6 +++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 69b352c707..1692541cbb 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -86,6 +86,9 @@ bool vhost_svq_valid_features(uint64_t features, Error **errp);
 
 void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
                          uint32_t len);
+int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
+                  size_t out_num, const struct iovec *in_sg, size_t in_num,
+                  SVQElement *elem);
 
 void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
 void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 95a8ab8477..5244896358 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -238,9 +238,9 @@ static void vhost_svq_kick(VhostShadowVirtqueue *svq)
  *
  * Return -EINVAL if element is invalid, -ENOSPC if dev queue is full
  */
-static int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
-                          size_t out_num, const struct iovec *in_sg,
-                          size_t in_num, SVQElement *svq_elem)
+int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
+                  size_t out_num, const struct iovec *in_sg, size_t in_num,
+                  SVQElement *svq_elem)
 {
     unsigned qemu_head;
     svq_elem->ndescs = in_num + out_num;
-- 
2.31.1




* [PATCH v2 12/19] vhost: add vhost_svq_poll
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (10 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 11/19] vhost: Expose vhost_svq_add Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-15  3:58   ` Jason Wang
  2022-07-14 16:31 ` [PATCH v2 13/19] vhost: Add svq avail_handler callback Eugenio Pérez
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

It allows the Shadow Control VirtQueue to wait for the device to use the
available buffers.
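The shape of `vhost_svq_poll` is a busy-wait on the used ring with a read barrier between iterations, so each retry observes a fresh `used_idx`. A standalone sketch of that pattern, using a C11 acquire load in place of QEMU's `smp_rmb()` (names are hypothetical):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Minimal model of the busy-poll: spin until the device side publishes a
 * used index newer than the one we last consumed. */
typedef struct {
    _Atomic uint16_t used_idx;  /* written by the "device" side */
    uint16_t last_seen;         /* consumer's shadow copy */
} PollRing;

static uint16_t poll_for_used(PollRing *r)
{
    uint16_t idx;
    do {
        /* acquire load stands in for smp_rmb(): re-read, don't cache */
        idx = atomic_load_explicit(&r->used_idx, memory_order_acquire);
    } while (idx == r->last_seen);
    r->last_seen = idx;
    return idx;
}
```

As in the patch, this only terminates because the device eventually writes the index; the real function additionally holds the BQL, which is why polling from the caller is safe there.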

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h |  1 +
 hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 1692541cbb..b5c6e3b3b4 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -89,6 +89,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
 int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
                   size_t out_num, const struct iovec *in_sg, size_t in_num,
                   SVQElement *elem);
+size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
 
 void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
 void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 5244896358..31a267f721 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -486,6 +486,28 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
     } while (!vhost_svq_enable_notification(svq));
 }
 
+/**
+ * Poll the SVQ for one device used buffer.
+ *
+ * This function races with the main event loop's SVQ polling, so extra
+ * synchronization is needed.
+ *
+ * Return the length written by the device.
+ */
+size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
+{
+    do {
+        uint32_t len;
+        SVQElement *elem = vhost_svq_get_buf(svq, &len);
+        if (elem) {
+            return len;
+        }
+
+        /* Make sure we read new used_idx */
+        smp_rmb();
+    } while (true);
+}
+
 /**
  * Forward used buffers.
  *
-- 
2.31.1




* [PATCH v2 13/19] vhost: Add svq avail_handler callback
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (11 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 12/19] vhost: add vhost_svq_poll Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 14/19] vdpa: Export vhost_vdpa_dma_map and unmap calls Eugenio Pérez
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

This allows external handlers to be aware of new buffers that the guest
places in the virtqueue.

When this callback is defined the ownership of the guest's virtqueue
element is transferred to the callback. This means that if the user
wants to forward the descriptor it needs to manually inject it. The
callback is also free to process the command by itself and use the
element with svq_push.
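The dispatch the patch adds is a classic ops-struct-with-fallback: if the owner installed an `avail_handler` it takes ownership of the element; otherwise the default forwarding path runs. A compact sketch of that control flow (all names hypothetical, not the QEMU types):

```c
#include <assert.h>
#include <stddef.h>

typedef struct Svq Svq;
typedef int (*AvailCb)(Svq *svq, int elem_id, void *opaque);

typedef struct {
    AvailCb avail_handler;
} SvqOps;

struct Svq {
    const SvqOps *ops;   /* NULL means "no owner callbacks" */
    void *ops_opaque;
    int forwarded;       /* counts default-path forwards, for the sketch */
};

static int default_forward(Svq *svq, int elem_id)
{
    (void)elem_id;
    svq->forwarded++;
    return 0;
}

/* Mirrors the patch's vhost_handle_guest_kick branch. */
static int handle_avail(Svq *svq, int elem_id)
{
    if (svq->ops) {
        return svq->ops->avail_handler(svq, elem_id, svq->ops_opaque);
    }
    return default_forward(svq, elem_id);
}
```

Keeping the callbacks in a const ops struct (rather than per-instance function pointers) matches how the patch threads `ops`/`ops_opaque` through `vhost_svq_new`.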

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h | 31 +++++++++++++++++++++++++++++-
 hw/virtio/vhost-shadow-virtqueue.c | 14 ++++++++++++--
 hw/virtio/vhost-vdpa.c             |  3 ++-
 3 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index b5c6e3b3b4..965ca88706 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -25,6 +25,27 @@ typedef struct SVQElement {
     unsigned int ndescs;
 } SVQElement;
 
+typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
+
+/**
+ * Callback to handle an avail buffer.
+ *
+ * @svq:  Shadow virtqueue
+ * @elem:  Element placed in the queue by the guest
+ * @vq_callback_opaque:  Opaque
+ *
+ * Returns 0 if the vq is running as expected.
+ *
+ * Note that ownership of elem is transferred to the callback.
+ */
+typedef int (*VirtQueueAvailCallback)(VhostShadowVirtqueue *svq,
+                                      SVQElement *elem,
+                                      void *vq_callback_opaque);
+
+typedef struct VhostShadowVirtqueueOps {
+    VirtQueueAvailCallback avail_handler;
+} VhostShadowVirtqueueOps;
+
 /* Shadow virtqueue to relay notifications */
 typedef struct VhostShadowVirtqueue {
     /* Shadow vring */
@@ -69,6 +90,12 @@ typedef struct VhostShadowVirtqueue {
      */
     uint16_t *desc_next;
 
+    /* Caller callbacks */
+    const VhostShadowVirtqueueOps *ops;
+
+    /* Caller callbacks opaque */
+    void *ops_opaque;
+
     /* Next head to expose to the device */
     uint16_t shadow_avail_idx;
 
@@ -102,7 +129,9 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
                      VirtQueue *vq);
 void vhost_svq_stop(VhostShadowVirtqueue *svq);
 
-VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree);
+VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree,
+                                    const VhostShadowVirtqueueOps *ops,
+                                    void *ops_opaque);
 
 void vhost_svq_free(gpointer vq);
 G_DEFINE_AUTOPTR_CLEANUP_FUNC(VhostShadowVirtqueue, vhost_svq_free);
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 31a267f721..85b2d49326 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -306,7 +306,11 @@ static void vhost_handle_guest_kick(VhostShadowVirtqueue *svq)
                 break;
             }
 
-            r = vhost_svq_add_element(svq, elem);
+            if (svq->ops) {
+                r = svq->ops->avail_handler(svq, elem, svq->ops_opaque);
+            } else {
+                r = vhost_svq_add_element(svq, elem);
+            }
             if (unlikely(r != 0)) {
                 if (r == -ENOSPC) {
                     /*
@@ -681,12 +685,16 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
  * shadow methods and file descriptors.
  *
  * @iova_tree: Tree to perform descriptors translations
+ * @ops: SVQ owner callbacks
+ * @ops_opaque: ops opaque pointer
  *
  * Returns the new virtqueue or NULL.
  *
  * In case of error, reason is reported through error_report.
  */
-VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree)
+VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree,
+                                    const VhostShadowVirtqueueOps *ops,
+                                    void *ops_opaque)
 {
     g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
     int r;
@@ -708,6 +716,8 @@ VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree)
     event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
     event_notifier_set_handler(&svq->hdev_call, vhost_svq_handle_call);
     svq->iova_tree = iova_tree;
+    svq->ops = ops;
+    svq->ops_opaque = ops_opaque;
     return g_steal_pointer(&svq);
 
 err_init_hdev_call:
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 66f054a12c..0b13e98471 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -418,8 +418,9 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
 
     shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
     for (unsigned n = 0; n < hdev->nvqs; ++n) {
-        g_autoptr(VhostShadowVirtqueue) svq = vhost_svq_new(v->iova_tree);
+        g_autoptr(VhostShadowVirtqueue) svq;
 
+        svq = vhost_svq_new(v->iova_tree, NULL, NULL);
         if (unlikely(!svq)) {
             error_setg(errp, "Cannot create svq %u", n);
             return -1;
-- 
2.31.1




* [PATCH v2 14/19] vdpa: Export vhost_vdpa_dma_map and unmap calls
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (12 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 13/19] vhost: Add svq avail_handler callback Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 15/19] vdpa: manual forward CVQ buffers Eugenio Pérez
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

Shadow CVQ will copy buffers into qemu's VA, so we avoid TOCTOU attacks
from a guest that could otherwise set different state in the qemu device
model and the vdpa device.

To do so, it needs to be able to map these new buffers to the device.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 include/hw/virtio/vhost-vdpa.h | 4 ++++
 hw/virtio/vhost-vdpa.c         | 7 +++----
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index a29dbb3f53..7214eb47dc 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -39,4 +39,8 @@ typedef struct vhost_vdpa {
     VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
 } VhostVDPA;
 
+int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
+                       void *vaddr, bool readonly);
+int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
+
 #endif
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 0b13e98471..96997210be 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -71,8 +71,8 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
     return false;
 }
 
-static int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
-                              void *vaddr, bool readonly)
+int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
+                       void *vaddr, bool readonly)
 {
     struct vhost_msg_v2 msg = {};
     int fd = v->device_fd;
@@ -97,8 +97,7 @@ static int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
     return ret;
 }
 
-static int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova,
-                                hwaddr size)
+int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size)
 {
     struct vhost_msg_v2 msg = {};
     int fd = v->device_fd;
-- 
2.31.1




* [PATCH v2 15/19] vdpa: manual forward CVQ buffers
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (13 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 14/19] vdpa: Export vhost_vdpa_dma_map and unmap calls Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-15  4:08   ` Jason Wang
  2022-07-14 16:31 ` [PATCH v2 16/19] vdpa: Buffer CVQ support on shadow virtqueue Eugenio Pérez
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

Do a simple forwarding of CVQ buffers: the same work SVQ already does,
but through callbacks. No functional change intended.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/vhost-vdpa.h |  3 ++
 hw/virtio/vhost-vdpa.c         |  3 +-
 net/vhost-vdpa.c               | 58 ++++++++++++++++++++++++++++++++++
 3 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 7214eb47dc..1111d85643 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -15,6 +15,7 @@
 #include <gmodule.h>
 
 #include "hw/virtio/vhost-iova-tree.h"
+#include "hw/virtio/vhost-shadow-virtqueue.h"
 #include "hw/virtio/virtio.h"
 #include "standard-headers/linux/vhost_types.h"
 
@@ -35,6 +36,8 @@ typedef struct vhost_vdpa {
     /* IOVA mapping used by the Shadow Virtqueue */
     VhostIOVATree *iova_tree;
     GPtrArray *shadow_vqs;
+    const VhostShadowVirtqueueOps *shadow_vq_ops;
+    void *shadow_vq_ops_opaque;
     struct vhost_dev *dev;
     VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
 } VhostVDPA;
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 96997210be..beaaa7049a 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -419,7 +419,8 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
     for (unsigned n = 0; n < hdev->nvqs; ++n) {
         g_autoptr(VhostShadowVirtqueue) svq;
 
-        svq = vhost_svq_new(v->iova_tree, NULL, NULL);
+        svq = vhost_svq_new(v->iova_tree, v->shadow_vq_ops,
+                            v->shadow_vq_ops_opaque);
         if (unlikely(!svq)) {
             error_setg(errp, "Cannot create svq %u", n);
             return -1;
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index df1e69ee72..805c9dd6b6 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -11,11 +11,14 @@
 
 #include "qemu/osdep.h"
 #include "clients.h"
+#include "hw/virtio/virtio-net.h"
 #include "net/vhost_net.h"
 #include "net/vhost-vdpa.h"
 #include "hw/virtio/vhost-vdpa.h"
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
+#include "qemu/log.h"
+#include "qemu/memalign.h"
 #include "qemu/option.h"
 #include "qapi/error.h"
 #include <linux/vhost.h>
@@ -187,6 +190,57 @@ static NetClientInfo net_vhost_vdpa_info = {
         .check_peer_type = vhost_vdpa_check_peer_type,
 };
 
+/**
+ * Forward the buffer for the moment.
+ */
+static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
+                                            SVQElement *svq_elem, void *opaque)
+{
+    VirtQueueElement *elem = &svq_elem->elem;
+    unsigned int n = elem->out_num + elem->in_num;
+    g_autofree struct iovec *dev_buffers = g_new(struct iovec, n);
+    size_t in_len, dev_written;
+    virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
+    int r;
+
+    memcpy(dev_buffers, elem->out_sg, elem->out_num);
+    memcpy(dev_buffers + elem->out_num, elem->in_sg, elem->in_num);
+
+    r = vhost_svq_add(svq, &dev_buffers[0], elem->out_num, &dev_buffers[1],
+                      elem->in_num, svq_elem);
+    if (unlikely(r != 0)) {
+        if (unlikely(r == -ENOSPC)) {
+            qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
+                          __func__);
+        }
+        goto out;
+    }
+
+    /*
+     * We can poll here since we've had BQL from the time we sent the
+     * descriptor. Also, we need to take the answer before SVQ pulls by itself,
+     * when BQL is released
+     */
+    dev_written = vhost_svq_poll(svq);
+    if (unlikely(dev_written < sizeof(status))) {
+        error_report("Insufficient written data (%zu)", dev_written);
+    }
+
+out:
+    in_len = iov_from_buf(elem->in_sg, elem->in_num, 0, &status,
+                          sizeof(status));
+    if (unlikely(in_len < sizeof(status))) {
+        error_report("Bad device CVQ written length");
+    }
+    vhost_svq_push_elem(svq, svq_elem, MIN(in_len, sizeof(status)));
+    g_free(svq_elem);
+    return r;
+}
+
+static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
+    .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
+};
+
 static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
                                            const char *device,
                                            const char *name,
@@ -211,6 +265,10 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
 
     s->vhost_vdpa.device_fd = vdpa_device_fd;
     s->vhost_vdpa.index = queue_pair_index;
+    if (!is_datapath) {
+        s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
+        s->vhost_vdpa.shadow_vq_ops_opaque = s;
+    }
     ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa, queue_pair_index, nvqs);
     if (ret) {
         qemu_del_net_client(nc);
-- 
2.31.1




* [PATCH v2 16/19] vdpa: Buffer CVQ support on shadow virtqueue
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (14 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 15/19] vdpa: manual forward CVQ buffers Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 17/19] vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs Eugenio Pérez
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

Introduce the control virtqueue support for vDPA shadow virtqueue. This
is needed for advanced networking features like rx filtering.

Virtio-net control VQ copies the descriptors to qemu's VA, so we avoid
TOCTOU issues with the guest's or device's memory every time there is a
device model change.  Otherwise, the guest could change the memory
contents in the window between qemu reading it and the device reading it.

To demonstrate command handling, VIRTIO_NET_F_CTRL_MAC_ADDR is
implemented.  If the virtio-net driver changes the MAC address, the
virtio-net device model will be updated with the new one, and an rx
filtering change event will be raised.

More CVQ commands could be added here straightforwardly, but they have
not been tested.
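The copy-before-use idea can be sketched outside of QEMU; the helper name
and shape below are illustrative, not QEMU's actual API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Illustrative sketch: flatten a guest scatter-gather list into a
 * private buffer before parsing it, so a concurrent guest write cannot
 * change the bytes between the check and the use (TOCTOU).
 */
static size_t copy_iov(const struct iovec *sg, size_t num,
                       void *buf, size_t len)
{
    size_t done = 0;

    for (size_t i = 0; i < num && done < len; i++) {
        size_t n = sg[i].iov_len;

        if (n > len - done) {
            n = len - done;
        }
        memcpy((char *)buf + done, sg[i].iov_base, n);
        done += n;
    }
    return done;
}
```

After this copy, validation and forwarding operate only on the private
buffer, never on guest memory.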

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 211 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 204 insertions(+), 7 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 805c9dd6b6..bc115a1455 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -33,6 +33,9 @@ typedef struct VhostVDPAState {
     NetClientState nc;
     struct vhost_vdpa vhost_vdpa;
     VHostNetState *vhost_net;
+
+    /* Control commands shadow buffers */
+    void *cvq_cmd_out_buffer, *cvq_cmd_in_buffer;
     bool started;
 } VhostVDPAState;
 
@@ -131,6 +134,8 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
 {
     VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
 
+    qemu_vfree(s->cvq_cmd_out_buffer);
+    qemu_vfree(s->cvq_cmd_in_buffer);
     if (s->vhost_net) {
         vhost_net_cleanup(s->vhost_net);
         g_free(s->vhost_net);
@@ -190,24 +195,191 @@ static NetClientInfo net_vhost_vdpa_info = {
         .check_peer_type = vhost_vdpa_check_peer_type,
 };
 
+static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
+{
+    VhostIOVATree *tree = v->iova_tree;
+    DMAMap needle = {
+        /*
+         * No need to specify size or to look for more translations since
+         * this contiguous chunk was allocated by us.
+         */
+        .translated_addr = (hwaddr)(uintptr_t)addr,
+    };
+    const DMAMap *map = vhost_iova_tree_find_iova(tree, &needle);
+    int r;
+
+    if (unlikely(!map)) {
+        error_report("Cannot locate expected map");
+        return;
+    }
+
+    r = vhost_vdpa_dma_unmap(v, map->iova, map->size + 1);
+    if (unlikely(r != 0)) {
+        error_report("Device cannot unmap: %s(%d)", g_strerror(r), r);
+    }
+
+    vhost_iova_tree_remove(tree, map);
+}
+
+static size_t vhost_vdpa_net_cvq_cmd_len(void)
+{
+    /*
+     * MAC_TABLE_SET is the ctrl command that produces the longest out buffer.
+     * The in buffer is always 1 byte, so it should fit here.
+     */
+    return sizeof(struct virtio_net_ctrl_hdr) +
+           2 * sizeof(struct virtio_net_ctrl_mac) +
+           MAC_TABLE_ENTRIES * ETH_ALEN;
+}
+
+static size_t vhost_vdpa_net_cvq_cmd_page_len(void)
+{
+    return ROUND_UP(vhost_vdpa_net_cvq_cmd_len(), qemu_real_host_page_size());
+}
+
+/** Copy and map a guest buffer. */
+static bool vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v,
+                                   const struct iovec *out_data,
+                                   size_t out_num, size_t data_len, void *buf,
+                                   size_t *written, bool write)
+{
+    DMAMap map = {};
+    int r;
+
+    if (unlikely(!data_len)) {
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid length of %s buffer\n",
+                      __func__, write ? "in" : "out");
+        return false;
+    }
+
+    *written = iov_to_buf(out_data, out_num, 0, buf, data_len);
+    map.translated_addr = (hwaddr)(uintptr_t)buf;
+    map.size = vhost_vdpa_net_cvq_cmd_page_len() - 1;
+    map.perm = write ? IOMMU_RW : IOMMU_RO;
+    r = vhost_iova_tree_map_alloc(v->iova_tree, &map);
+    if (unlikely(r != IOVA_OK)) {
+        error_report("Cannot map injected element");
+        return false;
+    }
+
+    r = vhost_vdpa_dma_map(v, map.iova, vhost_vdpa_net_cvq_cmd_page_len(), buf,
+                           !write);
+    if (unlikely(r < 0)) {
+        goto dma_map_err;
+    }
+
+    return true;
+
+dma_map_err:
+    vhost_iova_tree_remove(v->iova_tree, &map);
+    return false;
+}
+
 /**
- * Forward buffer for the moment.
+ * Copy the guest element into a dedicated buffer suitable to be sent to NIC
+ *
+ * @iov: [0] is the out buffer, [1] is the in one
+ */
+static bool vhost_vdpa_net_cvq_map_elem(VhostVDPAState *s,
+                                        VirtQueueElement *elem,
+                                        struct iovec *iov)
+{
+    size_t in_copied;
+    bool ok;
+
+    iov[0].iov_base = s->cvq_cmd_out_buffer;
+    ok = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, elem->out_sg, elem->out_num,
+                                vhost_vdpa_net_cvq_cmd_len(), iov[0].iov_base,
+                                &iov[0].iov_len, false);
+    if (unlikely(!ok)) {
+        return false;
+    }
+
+    iov[1].iov_base = s->cvq_cmd_in_buffer;
+    ok = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, NULL, 0,
+                                sizeof(virtio_net_ctrl_ack), iov[1].iov_base,
+                                &in_copied, true);
+    if (unlikely(!ok)) {
+        vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
+        return false;
+    }
+
+    iov[1].iov_len = sizeof(virtio_net_ctrl_ack);
+    return true;
+}
+
+/**
+ * Do not forward commands not supported by SVQ. Otherwise, the device could
+ * accept it and qemu would not know how to update the device model.
+ */
+static bool vhost_vdpa_net_cvq_validate_cmd(const struct iovec *out,
+                                            size_t out_num)
+{
+    struct virtio_net_ctrl_hdr ctrl;
+    size_t n;
+
+    n = iov_to_buf(out, out_num, 0, &ctrl, sizeof(ctrl));
+    if (unlikely(n < sizeof(ctrl))) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: invalid length of out buffer %zu\n", __func__, n);
+        return false;
+    }
+
+    switch (ctrl.class) {
+    case VIRTIO_NET_CTRL_MAC:
+        switch (ctrl.cmd) {
+        case VIRTIO_NET_CTRL_MAC_ADDR_SET:
+            return true;
+        default:
+            qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid mac cmd %u\n",
+                          __func__, ctrl.cmd);
+        };
+        break;
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid control class %u\n",
+                      __func__, ctrl.class);
+    };
+
+    return false;
+}
+
+/**
+ * Validate and copy control virtqueue commands.
+ *
+ * Following QEMU guidelines, we offer a copy of the buffers to the device to
+ * prevent TOCTOU bugs.
  */
 static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
                                             SVQElement *svq_elem, void *opaque)
 {
     VirtQueueElement *elem = &svq_elem->elem;
-    unsigned int n = elem->out_num + elem->in_num;
-    g_autofree struct iovec *dev_buffers = g_new(struct iovec, n);
+    VhostVDPAState *s = opaque;
     size_t in_len, dev_written;
     virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
+    /* out and in buffers sent to the device */
+    struct iovec dev_buffers[2] = {
+        { .iov_base = s->cvq_cmd_out_buffer },
+        { .iov_base = s->cvq_cmd_in_buffer },
+    };
+    /* in buffer used for device model */
+    const struct iovec in = {
+        .iov_base = &status,
+        .iov_len = sizeof(status),
+    };
     int r;
+    bool ok;
 
-    memcpy(dev_buffers, elem->out_sg, elem->out_num);
-    memcpy(dev_buffers + elem->out_num, elem->in_sg, elem->in_num);
+    ok = vhost_vdpa_net_cvq_map_elem(s, elem, dev_buffers);
+    if (unlikely(!ok)) {
+        goto out;
+    }
 
-    r = vhost_svq_add(svq, &dev_buffers[0], elem->out_num, &dev_buffers[1],
-                      elem->in_num, svq_elem);
+    ok = vhost_vdpa_net_cvq_validate_cmd(&dev_buffers[0], 1);
+    if (unlikely(!ok)) {
+        goto out;
+    }
+
+    r = vhost_svq_add(svq, &dev_buffers[0], 1, &dev_buffers[1], 1, svq_elem);
     if (unlikely(r != 0)) {
         if (unlikely(r == -ENOSPC)) {
             qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
@@ -224,6 +396,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
     dev_written = vhost_svq_poll(svq);
     if (unlikely(dev_written < sizeof(status))) {
         error_report("Insufficient written data (%zu)", dev_written);
+        goto out;
+    }
+
+    memcpy(&status, dev_buffers[1].iov_base, sizeof(status));
+    if (status != VIRTIO_NET_OK) {
+        goto out;
+    }
+
+    status = VIRTIO_NET_ERR;
+    virtio_net_handle_ctrl_iov(svq->vdev, &in, 1, dev_buffers, 1);
+    if (status != VIRTIO_NET_OK) {
+        error_report("Bad CVQ processing in model");
     }
 
 out:
@@ -234,6 +418,12 @@ out:
     }
     vhost_svq_push_elem(svq, svq_elem, MIN(in_len, sizeof(status)));
     g_free(svq_elem);
+    if (dev_buffers[0].iov_base) {
+        vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, dev_buffers[0].iov_base);
+    }
+    if (dev_buffers[1].iov_base) {
+        vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, dev_buffers[1].iov_base);
+    }
     return r;
 }
 
@@ -266,6 +456,13 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
     s->vhost_vdpa.device_fd = vdpa_device_fd;
     s->vhost_vdpa.index = queue_pair_index;
     if (!is_datapath) {
+        s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
+                                            vhost_vdpa_net_cvq_cmd_page_len());
+        memset(s->cvq_cmd_out_buffer, 0, vhost_vdpa_net_cvq_cmd_page_len());
+        s->cvq_cmd_in_buffer = qemu_memalign(qemu_real_host_page_size(),
+                                            vhost_vdpa_net_cvq_cmd_page_len());
+        memset(s->cvq_cmd_in_buffer, 0, vhost_vdpa_net_cvq_cmd_page_len());
+
         s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
         s->vhost_vdpa.shadow_vq_ops_opaque = s;
     }
-- 
2.31.1




* [PATCH v2 17/19] vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (15 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 16/19] vdpa: Buffer CVQ support on shadow virtqueue Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 18/19] vdpa: Add device migration blocker Eugenio Pérez
  2022-07-14 16:31 ` [PATCH v2 19/19] vdpa: Add x-svq to NetdevVhostVDPAOptions Eugenio Pérez
  18 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

Knowing the device features is needed by the CVQ SVQ, so SVQ knows
whether it can handle all commands or not. Extract the feature query
from vhost_vdpa_get_max_queue_pairs so we can reuse it.
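Once the feature word comes from a separate helper, the CVQ check reduces
to a pure bit test. A minimal sketch, with a hypothetical helper name and
the bit number taken from the virtio specification:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative sketch: with the feature word fetched by a dedicated
 * helper, deciding whether the device has a control virtqueue is a
 * pure bit test on it.  VIRTIO_NET_F_CTRL_VQ is bit 17 in the virtio
 * specification; features_has_cvq() is a hypothetical name.
 */
#define VIRTIO_NET_F_CTRL_VQ 17

static int features_has_cvq(uint64_t features)
{
    return !!(features & (1ULL << VIRTIO_NET_F_CTRL_VQ));
}
```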

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 net/vhost-vdpa.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index bc115a1455..7ccf9eaf4d 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -474,20 +474,24 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
     return nc;
 }
 
-static int vhost_vdpa_get_max_queue_pairs(int fd, int *has_cvq, Error **errp)
+static int vhost_vdpa_get_features(int fd, uint64_t *features, Error **errp)
+{
+    int ret = ioctl(fd, VHOST_GET_FEATURES, features);
+    if (unlikely(ret < 0)) {
+        error_setg_errno(errp, errno,
+                         "Failed to query features from vhost-vDPA device");
+    }
+    return ret;
+}
+
+static int vhost_vdpa_get_max_queue_pairs(int fd, uint64_t features,
+                                          int *has_cvq, Error **errp)
 {
     unsigned long config_size = offsetof(struct vhost_vdpa_config, buf);
     g_autofree struct vhost_vdpa_config *config = NULL;
     __virtio16 *max_queue_pairs;
-    uint64_t features;
     int ret;
 
-    ret = ioctl(fd, VHOST_GET_FEATURES, &features);
-    if (ret) {
-        error_setg(errp, "Fail to query features from vhost-vDPA device");
-        return ret;
-    }
-
     if (features & (1 << VIRTIO_NET_F_CTRL_VQ)) {
         *has_cvq = 1;
     } else {
@@ -517,10 +521,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
                         NetClientState *peer, Error **errp)
 {
     const NetdevVhostVDPAOptions *opts;
+    uint64_t features;
     int vdpa_device_fd;
     g_autofree NetClientState **ncs = NULL;
     NetClientState *nc;
-    int queue_pairs, i, has_cvq = 0;
+    int queue_pairs, r, i, has_cvq = 0;
 
     assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
     opts = &netdev->u.vhost_vdpa;
@@ -534,7 +539,12 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
         return -errno;
     }
 
-    queue_pairs = vhost_vdpa_get_max_queue_pairs(vdpa_device_fd,
+    r = vhost_vdpa_get_features(vdpa_device_fd, &features, errp);
+    if (unlikely(r < 0)) {
+        return r;
+    }
+
+    queue_pairs = vhost_vdpa_get_max_queue_pairs(vdpa_device_fd, features,
                                                  &has_cvq, errp);
     if (queue_pairs < 0) {
         qemu_close(vdpa_device_fd);
-- 
2.31.1




* [PATCH v2 18/19] vdpa: Add device migration blocker
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (16 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 17/19] vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-15  4:03   ` Jason Wang
  2022-07-14 16:31 ` [PATCH v2 19/19] vdpa: Add x-svq to NetdevVhostVDPAOptions Eugenio Pérez
  18 siblings, 1 reply; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

Since the vhost-vdpa device is exposing _F_LOG, add a migration blocker
if it uses CVQ.
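The lifecycle this patch follows can be modeled with a toy blocker
counter; this is not QEMU's migration API, only a sketch of the
add-on-start / delete-on-stop pattern:

```c
#include <assert.h>

/*
 * Toy model: a blocker is registered when SVQ starts with CVQ, removed
 * when it stops, and migration is refused while any blocker exists.
 */
static int blockers;

static void blocker_add(void)
{
    blockers++;
}

static void blocker_del(void)
{
    blockers--;
}

static int migration_blocked(void)
{
    return blockers > 0;
}
```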

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/vhost-vdpa.h |  1 +
 hw/virtio/vhost-vdpa.c         | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 1111d85643..d10a89303e 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -35,6 +35,7 @@ typedef struct vhost_vdpa {
     bool shadow_vqs_enabled;
     /* IOVA mapping used by the Shadow Virtqueue */
     VhostIOVATree *iova_tree;
+    Error *migration_blocker;
     GPtrArray *shadow_vqs;
     const VhostShadowVirtqueueOps *shadow_vq_ops;
     void *shadow_vq_ops_opaque;
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index beaaa7049a..795ed5a049 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -20,6 +20,7 @@
 #include "hw/virtio/vhost-shadow-virtqueue.h"
 #include "hw/virtio/vhost-vdpa.h"
 #include "exec/address-spaces.h"
+#include "migration/blocker.h"
 #include "qemu/cutils.h"
 #include "qemu/main-loop.h"
 #include "cpu.h"
@@ -1022,6 +1023,13 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
         return true;
     }
 
+    if (v->migration_blocker) {
+        int r = migrate_add_blocker(v->migration_blocker, &err);
+        if (unlikely(r < 0)) {
+            goto err_migration_blocker;
+        }
+    }
+
     for (i = 0; i < v->shadow_vqs->len; ++i) {
         VirtQueue *vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
         VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
@@ -1064,6 +1072,9 @@ err:
         vhost_svq_stop(svq);
     }
 
+err_migration_blocker:
+    error_reportf_err(err, "Cannot setup SVQ %u: ", i);
+
     return false;
 }
 
@@ -1083,6 +1094,9 @@ static bool vhost_vdpa_svqs_stop(struct vhost_dev *dev)
         }
     }
 
+    if (v->migration_blocker) {
+        migrate_del_blocker(v->migration_blocker);
+    }
     return true;
 }
 
-- 
2.31.1




* [PATCH v2 19/19] vdpa: Add x-svq to NetdevVhostVDPAOptions
  2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
                   ` (17 preceding siblings ...)
  2022-07-14 16:31 ` [PATCH v2 18/19] vdpa: Add device migration blocker Eugenio Pérez
@ 2022-07-14 16:31 ` Eugenio Pérez
  2022-07-15  4:13   ` Jason Wang
  18 siblings, 1 reply; 42+ messages in thread
From: Eugenio Pérez @ 2022-07-14 16:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Jason Wang, Gonglei (Arei)

Finally, offer the possibility to enable SVQ from the command line.
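For example, a guest could be started with SVQ enabled like this; the
vhost-vdpa character device path and the netdev id are placeholders for
the local setup:

```shell
# Hypothetical invocation: vhost-vdpa backend with the experimental
# shadow virtqueue enabled via x-svq.
qemu-system-x86_64 \
    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=on,id=vdpa0 \
    -device virtio-net-pci,netdev=vdpa0
```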

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
---
 qapi/net.json    |  9 +++++-
 net/vhost-vdpa.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 77 insertions(+), 4 deletions(-)

diff --git a/qapi/net.json b/qapi/net.json
index 9af11e9a3b..75ba2cb989 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -445,12 +445,19 @@
 # @queues: number of queues to be created for multiqueue vhost-vdpa
 #          (default: 1)
 #
+# @x-svq: Start device with (experimental) shadow virtqueue. (Since 7.1)
+#         (default: false)
+#
+# Features:
+# @unstable: Member @x-svq is experimental.
+#
 # Since: 5.1
 ##
 { 'struct': 'NetdevVhostVDPAOptions',
   'data': {
     '*vhostdev':     'str',
-    '*queues':       'int' } }
+    '*queues':       'int',
+    '*x-svq':        {'type': 'bool', 'features' : [ 'unstable'] } } }
 
 ##
 # @NetdevVmnetHostOptions:
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 7ccf9eaf4d..85148a5114 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -75,6 +75,28 @@ const int vdpa_feature_bits[] = {
     VHOST_INVALID_FEATURE_BIT
 };
 
+/** Supported device specific feature bits with SVQ */
+static const uint64_t vdpa_svq_device_features =
+    BIT_ULL(VIRTIO_NET_F_CSUM) |
+    BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) |
+    BIT_ULL(VIRTIO_NET_F_MTU) |
+    BIT_ULL(VIRTIO_NET_F_MAC) |
+    BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) |
+    BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) |
+    BIT_ULL(VIRTIO_NET_F_GUEST_ECN) |
+    BIT_ULL(VIRTIO_NET_F_GUEST_UFO) |
+    BIT_ULL(VIRTIO_NET_F_HOST_TSO4) |
+    BIT_ULL(VIRTIO_NET_F_HOST_TSO6) |
+    BIT_ULL(VIRTIO_NET_F_HOST_ECN) |
+    BIT_ULL(VIRTIO_NET_F_HOST_UFO) |
+    BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) |
+    BIT_ULL(VIRTIO_NET_F_STATUS) |
+    BIT_ULL(VIRTIO_NET_F_CTRL_VQ) |
+    BIT_ULL(VIRTIO_F_ANY_LAYOUT) |
+    BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) |
+    BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
+    BIT_ULL(VIRTIO_NET_F_STANDBY);
+
 VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
 {
     VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
@@ -133,9 +155,13 @@ err_init:
 static void vhost_vdpa_cleanup(NetClientState *nc)
 {
     VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+    struct vhost_dev *dev = &s->vhost_net->dev;
 
     qemu_vfree(s->cvq_cmd_out_buffer);
     qemu_vfree(s->cvq_cmd_in_buffer);
+    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
+        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
+    }
     if (s->vhost_net) {
         vhost_net_cleanup(s->vhost_net);
         g_free(s->vhost_net);
@@ -437,7 +463,9 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
                                            int vdpa_device_fd,
                                            int queue_pair_index,
                                            int nvqs,
-                                           bool is_datapath)
+                                           bool is_datapath,
+                                           bool svq,
+                                           VhostIOVATree *iova_tree)
 {
     NetClientState *nc = NULL;
     VhostVDPAState *s;
@@ -455,6 +483,8 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
 
     s->vhost_vdpa.device_fd = vdpa_device_fd;
     s->vhost_vdpa.index = queue_pair_index;
+    s->vhost_vdpa.shadow_vqs_enabled = svq;
+    s->vhost_vdpa.iova_tree = iova_tree;
     if (!is_datapath) {
         s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
                                             vhost_vdpa_net_cvq_cmd_page_len());
@@ -465,6 +495,8 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
 
         s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
         s->vhost_vdpa.shadow_vq_ops_opaque = s;
+        error_setg(&s->vhost_vdpa.migration_blocker,
+                   "Migration disabled: vhost-vdpa uses CVQ.");
     }
     ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa, queue_pair_index, nvqs);
     if (ret) {
@@ -474,6 +506,14 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
     return nc;
 }
 
+static int vhost_vdpa_get_iova_range(int fd,
+                                     struct vhost_vdpa_iova_range *iova_range)
+{
+    int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
+
+    return ret < 0 ? -errno : 0;
+}
+
 static int vhost_vdpa_get_features(int fd, uint64_t *features, Error **errp)
 {
     int ret = ioctl(fd, VHOST_GET_FEATURES, features);
@@ -524,6 +564,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     uint64_t features;
     int vdpa_device_fd;
     g_autofree NetClientState **ncs = NULL;
+    g_autoptr(VhostIOVATree) iova_tree = NULL;
     NetClientState *nc;
     int queue_pairs, r, i, has_cvq = 0;
 
@@ -551,22 +592,45 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
         return queue_pairs;
     }
 
+    if (opts->x_svq) {
+        struct vhost_vdpa_iova_range iova_range;
+
+        uint64_t invalid_dev_features =
+            features & ~vdpa_svq_device_features &
+            /* Transport are all accepted at this point */
+            ~MAKE_64BIT_MASK(VIRTIO_TRANSPORT_F_START,
+                             VIRTIO_TRANSPORT_F_END - VIRTIO_TRANSPORT_F_START);
+
+        if (invalid_dev_features) {
+            error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
+                       invalid_dev_features);
+            goto err_svq;
+        }
+
+        vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
+        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
+    }
+
     ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
 
     for (i = 0; i < queue_pairs; i++) {
         ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
-                                     vdpa_device_fd, i, 2, true);
+                                     vdpa_device_fd, i, 2, true, opts->x_svq,
+                                     iova_tree);
         if (!ncs[i])
             goto err;
     }
 
     if (has_cvq) {
         nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
-                                 vdpa_device_fd, i, 1, false);
+                                 vdpa_device_fd, i, 1, false,
+                                 opts->x_svq, iova_tree);
         if (!nc)
             goto err;
     }
 
+    /* iova_tree ownership belongs to last NetClientState */
+    g_steal_pointer(&iova_tree);
     return 0;
 
 err:
@@ -575,6 +639,8 @@ err:
             qemu_del_net_client(ncs[i]);
         }
     }
+
+err_svq:
     qemu_close(vdpa_device_fd);
 
     return -1;
-- 
2.31.1




* Re: [PATCH v2 12/19] vhost: add vhost_svq_poll
  2022-07-14 16:31 ` [PATCH v2 12/19] vhost: add vhost_svq_poll Eugenio Pérez
@ 2022-07-15  3:58   ` Jason Wang
  2022-07-15  5:38     ` Eugenio Perez Martin
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Wang @ 2022-07-15  3:58 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> It allows the Shadow Control VirtQueue to wait for the device to use the
> available buffers.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  hw/virtio/vhost-shadow-virtqueue.h |  1 +
>  hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++++++++
>  2 files changed, 23 insertions(+)
>
> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> index 1692541cbb..b5c6e3b3b4 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.h
> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> @@ -89,6 +89,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
>  int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
>                    size_t out_num, const struct iovec *in_sg, size_t in_num,
>                    SVQElement *elem);
> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
>
>  void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
>  void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> index 5244896358..31a267f721 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.c
> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> @@ -486,6 +486,28 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
>      } while (!vhost_svq_enable_notification(svq));
>  }
>
> +/**
> + * Poll the SVQ for one device used buffer.
> + *
> + * This function races with the main event loop SVQ polling, so extra
> + * synchronization is needed.
> + *
> + * Return the length written by the device.
> + */
> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> +{
> +    do {
> +        uint32_t len;
> +        SVQElement *elem = vhost_svq_get_buf(svq, &len);
> +        if (elem) {
> +            return len;
> +        }
> +
> +        /* Make sure we read new used_idx */
> +        smp_rmb();

There's already one smp_rmb(0 in vhost_svq_get_buf(). So this seems useless?

Thanks

> +    } while (true);
> +}
> +
>  /**
>   * Forward used buffers.
>   *
> --
> 2.31.1
>

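The re-read-with-barrier pattern under discussion can be modeled with
C11 atomics; QEMU uses its own smp_rmb() and barrier helpers, so this is
only an illustrative sketch:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Minimal model of the busy-poll: the used index must be re-read with
 * (at least) acquire ordering on every iteration, so the load is not
 * hoisted out of the loop.  The point raised above is that
 * vhost_svq_get_buf() already contains such a barrier, which would make
 * a second one in the caller redundant.
 */
static _Atomic uint16_t used_idx;

static int svq_more_used(uint16_t last_seen)
{
    return atomic_load_explicit(&used_idx, memory_order_acquire) !=
           last_seen;
}
```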



* Re: [PATCH v2 18/19] vdpa: Add device migration blocker
  2022-07-14 16:31 ` [PATCH v2 18/19] vdpa: Add device migration blocker Eugenio Pérez
@ 2022-07-15  4:03   ` Jason Wang
  2022-07-15  5:39     ` Eugenio Perez Martin
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Wang @ 2022-07-15  4:03 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> Since the vhost-vdpa device is exposing _F_LOG,

I may miss something but I think it doesn't?

Note that the features were fetched from the vDPA parent.

Thanks

> add a migration blocker if
> it uses CVQ.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  include/hw/virtio/vhost-vdpa.h |  1 +
>  hw/virtio/vhost-vdpa.c         | 14 ++++++++++++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> index 1111d85643..d10a89303e 100644
> --- a/include/hw/virtio/vhost-vdpa.h
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -35,6 +35,7 @@ typedef struct vhost_vdpa {
>      bool shadow_vqs_enabled;
>      /* IOVA mapping used by the Shadow Virtqueue */
>      VhostIOVATree *iova_tree;
> +    Error *migration_blocker;
>      GPtrArray *shadow_vqs;
>      const VhostShadowVirtqueueOps *shadow_vq_ops;
>      void *shadow_vq_ops_opaque;
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index beaaa7049a..795ed5a049 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -20,6 +20,7 @@
>  #include "hw/virtio/vhost-shadow-virtqueue.h"
>  #include "hw/virtio/vhost-vdpa.h"
>  #include "exec/address-spaces.h"
> +#include "migration/blocker.h"
>  #include "qemu/cutils.h"
>  #include "qemu/main-loop.h"
>  #include "cpu.h"
> @@ -1022,6 +1023,13 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
>          return true;
>      }
>
> +    if (v->migration_blocker) {
> +        int r = migrate_add_blocker(v->migration_blocker, &err);
> +        if (unlikely(r < 0)) {
> +            goto err_migration_blocker;
> +        }
> +    }
> +
>      for (i = 0; i < v->shadow_vqs->len; ++i) {
>          VirtQueue *vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
>          VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
> @@ -1064,6 +1072,9 @@ err:
>          vhost_svq_stop(svq);
>      }
>
> +err_migration_blocker:
> +    error_reportf_err(err, "Cannot setup SVQ %u: ", i);
> +
>      return false;
>  }
>
> @@ -1083,6 +1094,9 @@ static bool vhost_vdpa_svqs_stop(struct vhost_dev *dev)
>          }
>      }
>
> +    if (v->migration_blocker) {
> +        migrate_del_blocker(v->migration_blocker);
> +    }
>      return true;
>  }
>
> --
> 2.31.1
>




* Re: [PATCH v2 15/19] vdpa: manual forward CVQ buffers
  2022-07-14 16:31 ` [PATCH v2 15/19] vdpa: manual forward CVQ buffers Eugenio Pérez
@ 2022-07-15  4:08   ` Jason Wang
  2022-07-15  5:33     ` Eugenio Perez Martin
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Wang @ 2022-07-15  4:08 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> Do a simple forwarding of CVQ buffers, the same work SVQ could do but
> through callbacks. No functional change intended.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  include/hw/virtio/vhost-vdpa.h |  3 ++
>  hw/virtio/vhost-vdpa.c         |  3 +-
>  net/vhost-vdpa.c               | 58 ++++++++++++++++++++++++++++++++++
>  3 files changed, 63 insertions(+), 1 deletion(-)
>
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> index 7214eb47dc..1111d85643 100644
> --- a/include/hw/virtio/vhost-vdpa.h
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -15,6 +15,7 @@
>  #include <gmodule.h>
>
>  #include "hw/virtio/vhost-iova-tree.h"
> +#include "hw/virtio/vhost-shadow-virtqueue.h"
>  #include "hw/virtio/virtio.h"
>  #include "standard-headers/linux/vhost_types.h"
>
> @@ -35,6 +36,8 @@ typedef struct vhost_vdpa {
>      /* IOVA mapping used by the Shadow Virtqueue */
>      VhostIOVATree *iova_tree;
>      GPtrArray *shadow_vqs;
> +    const VhostShadowVirtqueueOps *shadow_vq_ops;
> +    void *shadow_vq_ops_opaque;
>      struct vhost_dev *dev;
>      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
>  } VhostVDPA;
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 96997210be..beaaa7049a 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -419,7 +419,8 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
>      for (unsigned n = 0; n < hdev->nvqs; ++n) {
>          g_autoptr(VhostShadowVirtqueue) svq;
>
> -        svq = vhost_svq_new(v->iova_tree, NULL, NULL);
> +        svq = vhost_svq_new(v->iova_tree, v->shadow_vq_ops,
> +                            v->shadow_vq_ops_opaque);
>          if (unlikely(!svq)) {
>              error_setg(errp, "Cannot create svq %u", n);
>              return -1;
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index df1e69ee72..805c9dd6b6 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -11,11 +11,14 @@
>
>  #include "qemu/osdep.h"
>  #include "clients.h"
> +#include "hw/virtio/virtio-net.h"
>  #include "net/vhost_net.h"
>  #include "net/vhost-vdpa.h"
>  #include "hw/virtio/vhost-vdpa.h"
>  #include "qemu/config-file.h"
>  #include "qemu/error-report.h"
> +#include "qemu/log.h"
> +#include "qemu/memalign.h"
>  #include "qemu/option.h"
>  #include "qapi/error.h"
>  #include <linux/vhost.h>
> @@ -187,6 +190,57 @@ static NetClientInfo net_vhost_vdpa_info = {
>          .check_peer_type = vhost_vdpa_check_peer_type,
>  };
>
> +/**
> + * Forward buffer for the moment.
> + */
> +static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
> +                                            SVQElement *svq_elem, void *opaque)
> +{
> +    VirtQueueElement *elem = &svq_elem->elem;
> +    unsigned int n = elem->out_num + elem->in_num;
> +    g_autofree struct iovec *dev_buffers = g_new(struct iovec, n);
> +    size_t in_len, dev_written;
> +    virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
> +    int r;
> +
> +    memcpy(dev_buffers, elem->out_sg, elem->out_num);
> +    memcpy(dev_buffers + elem->out_num, elem->in_sg, elem->in_num);
> +
> +    r = vhost_svq_add(svq, &dev_buffers[0], elem->out_num, &dev_buffers[1],
> +                      elem->in_num, svq_elem);
> +    if (unlikely(r != 0)) {
> +        if (unlikely(r == -ENOSPC)) {
> +            qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
> +                          __func__);
> +        }
> +        goto out;
> +    }
> +
> +    /*
> +     * We can poll here since we've had BQL from the time we sent the
> +     * descriptor. Also, we need to take the answer before SVQ pulls by itself,
> +     * when BQL is released
> +     */
> +    dev_written = vhost_svq_poll(svq);
> +    if (unlikely(dev_written < sizeof(status))) {
> +        error_report("Insufficient written data (%zu)", dev_written);
> +    }
> +
> +out:
> +    in_len = iov_from_buf(elem->in_sg, elem->in_num, 0, &status,
> +                          sizeof(status));
> +    if (unlikely(in_len < sizeof(status))) {
> +        error_report("Bad device CVQ written length");
> +    }
> +    vhost_svq_push_elem(svq, svq_elem, MIN(in_len, sizeof(status)));
> +    g_free(svq_elem);
> +    return r;
> +}
> +
> +static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
> +    .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
> +};

I wonder if it's possible to even remove this handler. Can we let the
kick be handled by virtio_net_handle_ctrl() in virtio-net.c?

Thanks

> +
>  static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>                                             const char *device,
>                                             const char *name,
> @@ -211,6 +265,10 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>
>      s->vhost_vdpa.device_fd = vdpa_device_fd;
>      s->vhost_vdpa.index = queue_pair_index;
> +    if (!is_datapath) {
> +        s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
> +        s->vhost_vdpa.shadow_vq_ops_opaque = s;
> +    }
>      ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa, queue_pair_index, nvqs);
>      if (ret) {
>          qemu_del_net_client(nc);
> --
> 2.31.1
>




* Re: [PATCH v2 09/19] vhost: Track number of descs in SVQElement
  2022-07-14 16:31 ` [PATCH v2 09/19] vhost: Track number of descs in SVQElement Eugenio Pérez
@ 2022-07-15  4:10   ` Jason Wang
  2022-07-15  5:41     ` Eugenio Perez Martin
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Wang @ 2022-07-15  4:10 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> Since CVQ will be able to modify elements, the number of descriptors in
> the guest may not match with the number of descriptors exposed. Track
> separately.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  hw/virtio/vhost-shadow-virtqueue.h |  6 ++++++
>  hw/virtio/vhost-shadow-virtqueue.c | 10 +++++-----
>  2 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> index f35d4b8f90..143c86a568 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.h
> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> @@ -17,6 +17,12 @@
>
>  typedef struct SVQElement {
>      VirtQueueElement elem;
> +
> +    /*
> +     * Number of descriptors exposed to the device. May or may not match
> +     * guest's
> +     */
> +    unsigned int ndescs;
>  } SVQElement;

Can we simplify things further by moving ndescs into a dedicated array in the svq?

Then we don't need to bother with introducing SVQElement.

Thanks

>
>  /* Shadow virtqueue to relay notifications */
> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> index 442ca3cbd3..3b112c4ec8 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.c
> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> @@ -243,10 +243,10 @@ static int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
>                            size_t in_num, SVQElement *svq_elem)
>  {
>      unsigned qemu_head;
> -    unsigned ndescs = in_num + out_num;
> +    svq_elem->ndescs = in_num + out_num;
>      bool ok;
>
> -    if (unlikely(ndescs > vhost_svq_available_slots(svq))) {
> +    if (unlikely(svq_elem->ndescs > vhost_svq_available_slots(svq))) {
>          return -ENOSPC;
>      }
>
> @@ -393,7 +393,7 @@ static SVQElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
>      SVQElement *elem;
>      const vring_used_t *used = svq->vring.used;
>      vring_used_elem_t used_elem;
> -    uint16_t last_used, last_used_chain, num;
> +    uint16_t last_used, last_used_chain;
>
>      if (!vhost_svq_more_used(svq)) {
>          return NULL;
> @@ -420,8 +420,8 @@ static SVQElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
>      }
>
>      elem = svq->ring_id_maps[used_elem.id];
> -    num = elem->elem.in_num + elem->elem.out_num;
> -    last_used_chain = vhost_svq_last_desc_of_chain(svq, num, used_elem.id);
> +    last_used_chain = vhost_svq_last_desc_of_chain(svq, elem->ndescs,
> +                                                   used_elem.id);
>      svq->desc_next[last_used_chain] = svq->free_head;
>      svq->free_head = used_elem.id;
>
> --
> 2.31.1
>




* Re: [PATCH v2 19/19] vdpa: Add x-svq to NetdevVhostVDPAOptions
  2022-07-14 16:31 ` [PATCH v2 19/19] vdpa: Add x-svq to NetdevVhostVDPAOptions Eugenio Pérez
@ 2022-07-15  4:13   ` Jason Wang
  2022-07-15  6:09     ` Eugenio Perez Martin
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Wang @ 2022-07-15  4:13 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> Finally offering the possibility to enable SVQ from the command line.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> Acked-by: Markus Armbruster <armbru@redhat.com>
> ---
>  qapi/net.json    |  9 +++++-
>  net/vhost-vdpa.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 77 insertions(+), 4 deletions(-)
>
> diff --git a/qapi/net.json b/qapi/net.json
> index 9af11e9a3b..75ba2cb989 100644
> --- a/qapi/net.json
> +++ b/qapi/net.json
> @@ -445,12 +445,19 @@
>  # @queues: number of queues to be created for multiqueue vhost-vdpa
>  #          (default: 1)
>  #
> +# @x-svq: Start device with (experimental) shadow virtqueue. (Since 7.1)
> +#         (default: false)
> +#
> +# Features:
> +# @unstable: Member @x-svq is experimental.
> +#
>  # Since: 5.1
>  ##
>  { 'struct': 'NetdevVhostVDPAOptions',
>    'data': {
>      '*vhostdev':     'str',
> -    '*queues':       'int' } }
> +    '*queues':       'int',
> +    '*x-svq':        {'type': 'bool', 'features' : [ 'unstable'] } } }
>
>  ##
>  # @NetdevVmnetHostOptions:
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 7ccf9eaf4d..85148a5114 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -75,6 +75,28 @@ const int vdpa_feature_bits[] = {
>      VHOST_INVALID_FEATURE_BIT
>  };
>
> +/** Supported device specific feature bits with SVQ */
> +static const uint64_t vdpa_svq_device_features =
> +    BIT_ULL(VIRTIO_NET_F_CSUM) |
> +    BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) |
> +    BIT_ULL(VIRTIO_NET_F_MTU) |
> +    BIT_ULL(VIRTIO_NET_F_MAC) |
> +    BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) |
> +    BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) |
> +    BIT_ULL(VIRTIO_NET_F_GUEST_ECN) |
> +    BIT_ULL(VIRTIO_NET_F_GUEST_UFO) |
> +    BIT_ULL(VIRTIO_NET_F_HOST_TSO4) |
> +    BIT_ULL(VIRTIO_NET_F_HOST_TSO6) |
> +    BIT_ULL(VIRTIO_NET_F_HOST_ECN) |
> +    BIT_ULL(VIRTIO_NET_F_HOST_UFO) |
> +    BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) |
> +    BIT_ULL(VIRTIO_NET_F_STATUS) |
> +    BIT_ULL(VIRTIO_NET_F_CTRL_VQ) |
> +    BIT_ULL(VIRTIO_F_ANY_LAYOUT) |
> +    BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) |
> +    BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
> +    BIT_ULL(VIRTIO_NET_F_STANDBY);

We need to have a plan for the full feature support, like indirect,
event_index, and packed.

I can help in developing some of these if you wish.

Thanks

> +
>  VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
>  {
>      VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> @@ -133,9 +155,13 @@ err_init:
>  static void vhost_vdpa_cleanup(NetClientState *nc)
>  {
>      VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> +    struct vhost_dev *dev = &s->vhost_net->dev;
>
>      qemu_vfree(s->cvq_cmd_out_buffer);
>      qemu_vfree(s->cvq_cmd_in_buffer);
> +    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> +        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> +    }
>      if (s->vhost_net) {
>          vhost_net_cleanup(s->vhost_net);
>          g_free(s->vhost_net);
> @@ -437,7 +463,9 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>                                             int vdpa_device_fd,
>                                             int queue_pair_index,
>                                             int nvqs,
> -                                           bool is_datapath)
> +                                           bool is_datapath,
> +                                           bool svq,
> +                                           VhostIOVATree *iova_tree)
>  {
>      NetClientState *nc = NULL;
>      VhostVDPAState *s;
> @@ -455,6 +483,8 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>
>      s->vhost_vdpa.device_fd = vdpa_device_fd;
>      s->vhost_vdpa.index = queue_pair_index;
> +    s->vhost_vdpa.shadow_vqs_enabled = svq;
> +    s->vhost_vdpa.iova_tree = iova_tree;
>      if (!is_datapath) {
>          s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
>                                              vhost_vdpa_net_cvq_cmd_page_len());
> @@ -465,6 +495,8 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>
>          s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
>          s->vhost_vdpa.shadow_vq_ops_opaque = s;
> +        error_setg(&s->vhost_vdpa.migration_blocker,
> +                   "Migration disabled: vhost-vdpa uses CVQ.");
>      }
>      ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa, queue_pair_index, nvqs);
>      if (ret) {
> @@ -474,6 +506,14 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>      return nc;
>  }
>
> +static int vhost_vdpa_get_iova_range(int fd,
> +                                     struct vhost_vdpa_iova_range *iova_range)
> +{
> +    int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
> +
> +    return ret < 0 ? -errno : 0;
> +}
> +
>  static int vhost_vdpa_get_features(int fd, uint64_t *features, Error **errp)
>  {
>      int ret = ioctl(fd, VHOST_GET_FEATURES, features);
> @@ -524,6 +564,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>      uint64_t features;
>      int vdpa_device_fd;
>      g_autofree NetClientState **ncs = NULL;
> +    g_autoptr(VhostIOVATree) iova_tree = NULL;
>      NetClientState *nc;
>      int queue_pairs, r, i, has_cvq = 0;
>
> @@ -551,22 +592,45 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>          return queue_pairs;
>      }
>
> +    if (opts->x_svq) {
> +        struct vhost_vdpa_iova_range iova_range;
> +
> +        uint64_t invalid_dev_features =
> +            features & ~vdpa_svq_device_features &
> +            /* Transport are all accepted at this point */
> +            ~MAKE_64BIT_MASK(VIRTIO_TRANSPORT_F_START,
> +                             VIRTIO_TRANSPORT_F_END - VIRTIO_TRANSPORT_F_START);
> +
> +        if (invalid_dev_features) {
> +            error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
> +                       invalid_dev_features);
> +            goto err_svq;
> +        }
> +
> +        vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
> +        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
> +    }
> +
>      ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
>
>      for (i = 0; i < queue_pairs; i++) {
>          ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> -                                     vdpa_device_fd, i, 2, true);
> +                                     vdpa_device_fd, i, 2, true, opts->x_svq,
> +                                     iova_tree);
>          if (!ncs[i])
>              goto err;
>      }
>
>      if (has_cvq) {
>          nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> -                                 vdpa_device_fd, i, 1, false);
> +                                 vdpa_device_fd, i, 1, false,
> +                                 opts->x_svq, iova_tree);
>          if (!nc)
>              goto err;
>      }
>
> +    /* iova_tree ownership belongs to last NetClientState */
> +    g_steal_pointer(&iova_tree);
>      return 0;
>
>  err:
> @@ -575,6 +639,8 @@ err:
>              qemu_del_net_client(ncs[i]);
>          }
>      }
> +
> +err_svq:
>      qemu_close(vdpa_device_fd);
>
>      return -1;
> --
> 2.31.1
>




* Re: [PATCH v2 15/19] vdpa: manual forward CVQ buffers
  2022-07-15  4:08   ` Jason Wang
@ 2022-07-15  5:33     ` Eugenio Perez Martin
  2022-07-15  8:44       ` Jason Wang
  0 siblings, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2022-07-15  5:33 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 6:08 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > Do a simple forwarding of CVQ buffers, the same work SVQ could do but
> > through callbacks. No functional change intended.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >  include/hw/virtio/vhost-vdpa.h |  3 ++
> >  hw/virtio/vhost-vdpa.c         |  3 +-
> >  net/vhost-vdpa.c               | 58 ++++++++++++++++++++++++++++++++++
> >  3 files changed, 63 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > index 7214eb47dc..1111d85643 100644
> > --- a/include/hw/virtio/vhost-vdpa.h
> > +++ b/include/hw/virtio/vhost-vdpa.h
> > @@ -15,6 +15,7 @@
> >  #include <gmodule.h>
> >
> >  #include "hw/virtio/vhost-iova-tree.h"
> > +#include "hw/virtio/vhost-shadow-virtqueue.h"
> >  #include "hw/virtio/virtio.h"
> >  #include "standard-headers/linux/vhost_types.h"
> >
> > @@ -35,6 +36,8 @@ typedef struct vhost_vdpa {
> >      /* IOVA mapping used by the Shadow Virtqueue */
> >      VhostIOVATree *iova_tree;
> >      GPtrArray *shadow_vqs;
> > +    const VhostShadowVirtqueueOps *shadow_vq_ops;
> > +    void *shadow_vq_ops_opaque;
> >      struct vhost_dev *dev;
> >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> >  } VhostVDPA;
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index 96997210be..beaaa7049a 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -419,7 +419,8 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> >      for (unsigned n = 0; n < hdev->nvqs; ++n) {
> >          g_autoptr(VhostShadowVirtqueue) svq;
> >
> > -        svq = vhost_svq_new(v->iova_tree, NULL, NULL);
> > +        svq = vhost_svq_new(v->iova_tree, v->shadow_vq_ops,
> > +                            v->shadow_vq_ops_opaque);
> >          if (unlikely(!svq)) {
> >              error_setg(errp, "Cannot create svq %u", n);
> >              return -1;
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index df1e69ee72..805c9dd6b6 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -11,11 +11,14 @@
> >
> >  #include "qemu/osdep.h"
> >  #include "clients.h"
> > +#include "hw/virtio/virtio-net.h"
> >  #include "net/vhost_net.h"
> >  #include "net/vhost-vdpa.h"
> >  #include "hw/virtio/vhost-vdpa.h"
> >  #include "qemu/config-file.h"
> >  #include "qemu/error-report.h"
> > +#include "qemu/log.h"
> > +#include "qemu/memalign.h"
> >  #include "qemu/option.h"
> >  #include "qapi/error.h"
> >  #include <linux/vhost.h>
> > @@ -187,6 +190,57 @@ static NetClientInfo net_vhost_vdpa_info = {
> >          .check_peer_type = vhost_vdpa_check_peer_type,
> >  };
> >
> > +/**
> > + * Forward buffer for the moment.
> > + */
> > +static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
> > +                                            SVQElement *svq_elem, void *opaque)
> > +{
> > +    VirtQueueElement *elem = &svq_elem->elem;
> > +    unsigned int n = elem->out_num + elem->in_num;
> > +    g_autofree struct iovec *dev_buffers = g_new(struct iovec, n);
> > +    size_t in_len, dev_written;
> > +    virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
> > +    int r;
> > +
> > +    memcpy(dev_buffers, elem->out_sg, elem->out_num);
> > +    memcpy(dev_buffers + elem->out_num, elem->in_sg, elem->in_num);
> > +
> > +    r = vhost_svq_add(svq, &dev_buffers[0], elem->out_num, &dev_buffers[1],
> > +                      elem->in_num, svq_elem);
> > +    if (unlikely(r != 0)) {
> > +        if (unlikely(r == -ENOSPC)) {
> > +            qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
> > +                          __func__);
> > +        }
> > +        goto out;
> > +    }
> > +
> > +    /*
> > +     * We can poll here since we've had BQL from the time we sent the
> > +     * descriptor. Also, we need to take the answer before SVQ pulls by itself,
> > +     * when BQL is released
> > +     */
> > +    dev_written = vhost_svq_poll(svq);
> > +    if (unlikely(dev_written < sizeof(status))) {
> > +        error_report("Insufficient written data (%zu)", dev_written);
> > +    }
> > +
> > +out:
> > +    in_len = iov_from_buf(elem->in_sg, elem->in_num, 0, &status,
> > +                          sizeof(status));
> > +    if (unlikely(in_len < sizeof(status))) {
> > +        error_report("Bad device CVQ written length");
> > +    }
> > +    vhost_svq_push_elem(svq, svq_elem, MIN(in_len, sizeof(status)));
> > +    g_free(svq_elem);
> > +    return r;
> > +}
> > +
> > +static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
> > +    .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
> > +};
>
> I wonder if it's possible to even remove this handler. Can we let the
> kick be handled by virtio_net_handle_ctrl() in virtio-net.c?
>

I kind of drafted that here:
https://lists.nongnu.org/archive/html/qemu-devel/2022-07/msg02652.html

But I'm not sure about the part about not enabling the guest event
notifiers. I'd say we would need to move SVQ into VirtIODevice itself,
which seems rather complicated at this moment.

Without applying that patch, all of the events of vDPA devices go
either to the device itself or to the registered SVQ handlers. If we
go that route, we add the vq event handler, and it would depend on the
SVQ state plus the specific vq.

I'm totally ok with doing it as the first thing for the next merge
window, or now if I have the certainty that it's the accepted way to
go. Otherwise, I think we will again be unable to deliver migration
with SVQ for this merge window.

Thanks!




* Re: [PATCH v2 12/19] vhost: add vhost_svq_poll
  2022-07-15  3:58   ` Jason Wang
@ 2022-07-15  5:38     ` Eugenio Perez Martin
  2022-07-15  8:47       ` Jason Wang
  0 siblings, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2022-07-15  5:38 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > It allows the Shadow Control VirtQueue to wait for the device to use the
> > available buffers.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >  hw/virtio/vhost-shadow-virtqueue.h |  1 +
> >  hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++++++++
> >  2 files changed, 23 insertions(+)
> >
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> > index 1692541cbb..b5c6e3b3b4 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.h
> > +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > @@ -89,6 +89,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
> >  int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
> >                    size_t out_num, const struct iovec *in_sg, size_t in_num,
> >                    SVQElement *elem);
> > +size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
> >
> >  void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
> >  void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> > index 5244896358..31a267f721 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.c
> > +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > @@ -486,6 +486,28 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
> >      } while (!vhost_svq_enable_notification(svq));
> >  }
> >
> > +/**
> > + * Poll the SVQ for one device used buffer.
> > + *
> > + * This function race with main event loop SVQ polling, so extra
> > + * synchronization is needed.
> > + *
> > + * Return the length written by the device.
> > + */
> > +size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > +{
> > +    do {
> > +        uint32_t len;
> > +        SVQElement *elem = vhost_svq_get_buf(svq, &len);
> > +        if (elem) {
> > +            return len;
> > +        }
> > +
> > +        /* Make sure we read new used_idx */
> > +        smp_rmb();
>
> There's already one smp_rmb(0 in vhost_svq_get_buf(). So this seems useless?
>

That rmb is after checking for new entries with (vq->last_used_idx !=
svq->shadow_used_idx), to avoid reordering the used_idx read with the
read of the actual used entry. So my understanding is that the
compiler is free to skip that check within the while loop.

Maybe the right solution is to add it to vhost_svq_more_used() after
the condition (vq->last_used_idx != svq->shadow_used_idx) is false?

Thanks!


> Thanks
>
> > +    } while (true);
> > +}
> > +
> >  /**
> >   * Forward used buffers.
> >   *
> > --
> > 2.31.1
> >
>




* Re: [PATCH v2 18/19] vdpa: Add device migration blocker
  2022-07-15  4:03   ` Jason Wang
@ 2022-07-15  5:39     ` Eugenio Perez Martin
  2022-07-15  8:50       ` Jason Wang
  0 siblings, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2022-07-15  5:39 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 6:03 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > Since the vhost-vdpa device is exposing _F_LOG,
>
> I may miss something but I think it doesn't?
>

It's at vhost_vdpa_get_features. As long as SVQ is enabled, it's
exposing VHOST_F_LOG_ALL.

Thanks!

> Note that the features were fetched from the vDPA parent.
>
> Thanks
>
> > adding a migration blocker if
> > it uses CVQ.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >  include/hw/virtio/vhost-vdpa.h |  1 +
> >  hw/virtio/vhost-vdpa.c         | 14 ++++++++++++++
> >  2 files changed, 15 insertions(+)
> >
> > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > index 1111d85643..d10a89303e 100644
> > --- a/include/hw/virtio/vhost-vdpa.h
> > +++ b/include/hw/virtio/vhost-vdpa.h
> > @@ -35,6 +35,7 @@ typedef struct vhost_vdpa {
> >      bool shadow_vqs_enabled;
> >      /* IOVA mapping used by the Shadow Virtqueue */
> >      VhostIOVATree *iova_tree;
> > +    Error *migration_blocker;
> >      GPtrArray *shadow_vqs;
> >      const VhostShadowVirtqueueOps *shadow_vq_ops;
> >      void *shadow_vq_ops_opaque;
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index beaaa7049a..795ed5a049 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -20,6 +20,7 @@
> >  #include "hw/virtio/vhost-shadow-virtqueue.h"
> >  #include "hw/virtio/vhost-vdpa.h"
> >  #include "exec/address-spaces.h"
> > +#include "migration/blocker.h"
> >  #include "qemu/cutils.h"
> >  #include "qemu/main-loop.h"
> >  #include "cpu.h"
> > @@ -1022,6 +1023,13 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
> >          return true;
> >      }
> >
> > +    if (v->migration_blocker) {
> > +        int r = migrate_add_blocker(v->migration_blocker, &err);
> > +        if (unlikely(r < 0)) {
> > +            goto err_migration_blocker;
> > +        }
> > +    }
> > +
> >      for (i = 0; i < v->shadow_vqs->len; ++i) {
> >          VirtQueue *vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
> >          VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
> > @@ -1064,6 +1072,9 @@ err:
> >          vhost_svq_stop(svq);
> >      }
> >
> > +err_migration_blocker:
> > +    error_reportf_err(err, "Cannot setup SVQ %u: ", i);
> > +
> >      return false;
> >  }
> >
> > @@ -1083,6 +1094,9 @@ static bool vhost_vdpa_svqs_stop(struct vhost_dev *dev)
> >          }
> >      }
> >
> > +    if (v->migration_blocker) {
> > +        migrate_del_blocker(v->migration_blocker);
> > +    }
> >      return true;
> >  }
> >
> > --
> > 2.31.1
> >
>




* Re: [PATCH v2 09/19] vhost: Track number of descs in SVQElement
  2022-07-15  4:10   ` Jason Wang
@ 2022-07-15  5:41     ` Eugenio Perez Martin
  0 siblings, 0 replies; 42+ messages in thread
From: Eugenio Perez Martin @ 2022-07-15  5:41 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 6:10 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > Since CVQ will be able to modify elements, the number of descriptors in
> > the guest may not match with the number of descriptors exposed. Track
> > separately.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >  hw/virtio/vhost-shadow-virtqueue.h |  6 ++++++
> >  hw/virtio/vhost-shadow-virtqueue.c | 10 +++++-----
> >  2 files changed, 11 insertions(+), 5 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> > index f35d4b8f90..143c86a568 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.h
> > +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > @@ -17,6 +17,12 @@
> >
> >  typedef struct SVQElement {
> >      VirtQueueElement elem;
> > +
> > +    /*
> > +     * Number of descriptors exposed to the device. May or may not match
> > +     * guest's
> > +     */
> > +    unsigned int ndescs;
> >  } SVQElement;
>
> Can we simplify things further by moving ndescs into a dedicated array in the svq?
>
> Then we don't need to bother with introducing SVQElement.
>

Yes, I'll move to a desc_state.

Thanks!

> Thanks
>
> >
> >  /* Shadow virtqueue to relay notifications */
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> > index 442ca3cbd3..3b112c4ec8 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.c
> > +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > @@ -243,10 +243,10 @@ static int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
> >                            size_t in_num, SVQElement *svq_elem)
> >  {
> >      unsigned qemu_head;
> > -    unsigned ndescs = in_num + out_num;
> > +    svq_elem->ndescs = in_num + out_num;
> >      bool ok;
> >
> > -    if (unlikely(ndescs > vhost_svq_available_slots(svq))) {
> > +    if (unlikely(svq_elem->ndescs > vhost_svq_available_slots(svq))) {
> >          return -ENOSPC;
> >      }
> >
> > @@ -393,7 +393,7 @@ static SVQElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
> >      SVQElement *elem;
> >      const vring_used_t *used = svq->vring.used;
> >      vring_used_elem_t used_elem;
> > -    uint16_t last_used, last_used_chain, num;
> > +    uint16_t last_used, last_used_chain;
> >
> >      if (!vhost_svq_more_used(svq)) {
> >          return NULL;
> > @@ -420,8 +420,8 @@ static SVQElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
> >      }
> >
> >      elem = svq->ring_id_maps[used_elem.id];
> > -    num = elem->elem.in_num + elem->elem.out_num;
> > -    last_used_chain = vhost_svq_last_desc_of_chain(svq, num, used_elem.id);
> > +    last_used_chain = vhost_svq_last_desc_of_chain(svq, elem->ndescs,
> > +                                                   used_elem.id);
> >      svq->desc_next[last_used_chain] = svq->free_head;
> >      svq->free_head = used_elem.id;
> >
> > --
> > 2.31.1
> >
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 19/19] vdpa: Add x-svq to NetdevVhostVDPAOptions
  2022-07-15  4:13   ` Jason Wang
@ 2022-07-15  6:09     ` Eugenio Perez Martin
  0 siblings, 0 replies; 42+ messages in thread
From: Eugenio Perez Martin @ 2022-07-15  6:09 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 6:13 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > Finally offering the possibility to enable SVQ from the command line.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > Acked-by: Markus Armbruster <armbru@redhat.com>
> > ---
> >  qapi/net.json    |  9 +++++-
> >  net/vhost-vdpa.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++--
> >  2 files changed, 77 insertions(+), 4 deletions(-)
> >
> > diff --git a/qapi/net.json b/qapi/net.json
> > index 9af11e9a3b..75ba2cb989 100644
> > --- a/qapi/net.json
> > +++ b/qapi/net.json
> > @@ -445,12 +445,19 @@
> >  # @queues: number of queues to be created for multiqueue vhost-vdpa
> >  #          (default: 1)
> >  #
> > +# @x-svq: Start device with (experimental) shadow virtqueue. (Since 7.1)
> > +#         (default: false)
> > +#
> > +# Features:
> > +# @unstable: Member @x-svq is experimental.
> > +#
> >  # Since: 5.1
> >  ##
> >  { 'struct': 'NetdevVhostVDPAOptions',
> >    'data': {
> >      '*vhostdev':     'str',
> > -    '*queues':       'int' } }
> > +    '*queues':       'int',
> > +    '*x-svq':        {'type': 'bool', 'features' : [ 'unstable'] } } }
> >
> >  ##
> >  # @NetdevVmnetHostOptions:
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index 7ccf9eaf4d..85148a5114 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -75,6 +75,28 @@ const int vdpa_feature_bits[] = {
> >      VHOST_INVALID_FEATURE_BIT
> >  };
> >
> > +/** Supported device specific feature bits with SVQ */
> > +static const uint64_t vdpa_svq_device_features =
> > +    BIT_ULL(VIRTIO_NET_F_CSUM) |
> > +    BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) |
> > +    BIT_ULL(VIRTIO_NET_F_MTU) |
> > +    BIT_ULL(VIRTIO_NET_F_MAC) |
> > +    BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) |
> > +    BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) |
> > +    BIT_ULL(VIRTIO_NET_F_GUEST_ECN) |
> > +    BIT_ULL(VIRTIO_NET_F_GUEST_UFO) |
> > +    BIT_ULL(VIRTIO_NET_F_HOST_TSO4) |
> > +    BIT_ULL(VIRTIO_NET_F_HOST_TSO6) |
> > +    BIT_ULL(VIRTIO_NET_F_HOST_ECN) |
> > +    BIT_ULL(VIRTIO_NET_F_HOST_UFO) |
> > +    BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) |
> > +    BIT_ULL(VIRTIO_NET_F_STATUS) |
> > +    BIT_ULL(VIRTIO_NET_F_CTRL_VQ) |
> > +    BIT_ULL(VIRTIO_F_ANY_LAYOUT) |
> > +    BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) |
> > +    BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
> > +    BIT_ULL(VIRTIO_NET_F_STANDBY);
>
> We need to have a plan for the full feature support like
>
> indirect, event_index, and packed.
>

Event idx is almost straightforward to develop. Packed has more code
but it's equally doable. In-order should not be hard either.

Indirect is a little bit more complicated because of the indirect
table. I guess we will need to either allocate a big buffer where we
can obtain indirect tables and cvq buffers, or allocate & map them
individually.

Note that we can half-support them. Enabling them in the guest's
vring is as easy as accepting the feature in SVQ, since SVQ can
easily translate one format to the other. I know the interesting part
is the shadow vring, to speed up communication with the device, but
it's still a first step in that direction if needed.
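As a standalone sketch of that translation step (the helper name is hypothetical, and real SVQ code would also have to translate guest addresses through the IOVA tree rather than dereference them directly), flattening an indirect table into a direct chain could look like:

```c
#include <stdint.h>
#include <stddef.h>

/* Flag values follow the virtio spec for split virtqueues. */
#define VRING_DESC_F_NEXT     1
#define VRING_DESC_F_INDIRECT 4

struct vring_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

/* Expand one guest descriptor into `out`, following an indirect table
 * if present.  Returns the number of flattened descriptors, or 0 if
 * `out` is too small.  Guest addresses are assumed to be already
 * translated to something dereferenceable, which is not true in SVQ. */
static size_t flatten_desc(const struct vring_desc *d,
                           struct vring_desc *out, size_t out_cap)
{
    if (!(d->flags & VRING_DESC_F_INDIRECT)) {
        if (out_cap < 1) {
            return 0;
        }
        out[0] = *d;
        return 1;
    }

    const struct vring_desc *table =
        (const struct vring_desc *)(uintptr_t)d->addr;
    size_t n = d->len / sizeof(struct vring_desc);
    if (n > out_cap) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        out[i] = table[i];
        out[i].flags &= ~VRING_DESC_F_INDIRECT; /* now a direct chain */
    }
    return n;
}
```

This is the "half-support" shape: the guest negotiates VIRTIO_RING_F_INDIRECT_DESC, while the device only ever sees direct chains in the shadow vring.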

> I can help in developing some of these if you wish.
>

We could plan for the next release cycle for sure.

Thanks!

> Thanks
>
> > +
> >  VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
> >  {
> >      VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > @@ -133,9 +155,13 @@ err_init:
> >  static void vhost_vdpa_cleanup(NetClientState *nc)
> >  {
> >      VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +    struct vhost_dev *dev = &s->vhost_net->dev;
> >
> >      qemu_vfree(s->cvq_cmd_out_buffer);
> >      qemu_vfree(s->cvq_cmd_in_buffer);
> > +    if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> > +        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> > +    }
> >      if (s->vhost_net) {
> >          vhost_net_cleanup(s->vhost_net);
> >          g_free(s->vhost_net);
> > @@ -437,7 +463,9 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >                                             int vdpa_device_fd,
> >                                             int queue_pair_index,
> >                                             int nvqs,
> > -                                           bool is_datapath)
> > +                                           bool is_datapath,
> > +                                           bool svq,
> > +                                           VhostIOVATree *iova_tree)
> >  {
> >      NetClientState *nc = NULL;
> >      VhostVDPAState *s;
> > @@ -455,6 +483,8 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >
> >      s->vhost_vdpa.device_fd = vdpa_device_fd;
> >      s->vhost_vdpa.index = queue_pair_index;
> > +    s->vhost_vdpa.shadow_vqs_enabled = svq;
> > +    s->vhost_vdpa.iova_tree = iova_tree;
> >      if (!is_datapath) {
> >          s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
> >                                              vhost_vdpa_net_cvq_cmd_page_len());
> > @@ -465,6 +495,8 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >
> >          s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
> >          s->vhost_vdpa.shadow_vq_ops_opaque = s;
> > +        error_setg(&s->vhost_vdpa.migration_blocker,
> > +                   "Migration disabled: vhost-vdpa uses CVQ.");
> >      }
> >      ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa, queue_pair_index, nvqs);
> >      if (ret) {
> > @@ -474,6 +506,14 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >      return nc;
> >  }
> >
> > +static int vhost_vdpa_get_iova_range(int fd,
> > +                                     struct vhost_vdpa_iova_range *iova_range)
> > +{
> > +    int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
> > +
> > +    return ret < 0 ? -errno : 0;
> > +}
> > +
> >  static int vhost_vdpa_get_features(int fd, uint64_t *features, Error **errp)
> >  {
> >      int ret = ioctl(fd, VHOST_GET_FEATURES, features);
> > @@ -524,6 +564,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >      uint64_t features;
> >      int vdpa_device_fd;
> >      g_autofree NetClientState **ncs = NULL;
> > +    g_autoptr(VhostIOVATree) iova_tree = NULL;
> >      NetClientState *nc;
> >      int queue_pairs, r, i, has_cvq = 0;
> >
> > @@ -551,22 +592,45 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >          return queue_pairs;
> >      }
> >
> > +    if (opts->x_svq) {
> > +        struct vhost_vdpa_iova_range iova_range;
> > +
> > +        uint64_t invalid_dev_features =
> > +            features & ~vdpa_svq_device_features &
> > +            /* Transport are all accepted at this point */
> > +            ~MAKE_64BIT_MASK(VIRTIO_TRANSPORT_F_START,
> > +                             VIRTIO_TRANSPORT_F_END - VIRTIO_TRANSPORT_F_START);
> > +
> > +        if (invalid_dev_features) {
> > +            error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
> > +                       invalid_dev_features);
> > +            goto err_svq;
> > +        }
> > +
> > +        vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
> > +        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
> > +    }
> > +
> >      ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
> >
> >      for (i = 0; i < queue_pairs; i++) {
> >          ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> > -                                     vdpa_device_fd, i, 2, true);
> > +                                     vdpa_device_fd, i, 2, true, opts->x_svq,
> > +                                     iova_tree);
> >          if (!ncs[i])
> >              goto err;
> >      }
> >
> >      if (has_cvq) {
> >          nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> > -                                 vdpa_device_fd, i, 1, false);
> > +                                 vdpa_device_fd, i, 1, false,
> > +                                 opts->x_svq, iova_tree);
> >          if (!nc)
> >              goto err;
> >      }
> >
> > +    /* iova_tree ownership belongs to last NetClientState */
> > +    g_steal_pointer(&iova_tree);
> >      return 0;
> >
> >  err:
> > @@ -575,6 +639,8 @@ err:
> >              qemu_del_net_client(ncs[i]);
> >          }
> >      }
> > +
> > +err_svq:
> >      qemu_close(vdpa_device_fd);
> >
> >      return -1;
> > --
> > 2.31.1
> >
>




* Re: [PATCH v2 15/19] vdpa: manual forward CVQ buffers
  2022-07-15  5:33     ` Eugenio Perez Martin
@ 2022-07-15  8:44       ` Jason Wang
  2022-07-15  9:01         ` Eugenio Perez Martin
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Wang @ 2022-07-15  8:44 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 1:34 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 6:08 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >
> > > Do a simple forwarding of CVQ buffers, the same work SVQ could do but
> > > through callbacks. No functional change intended.
> > >
> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > ---
> > >  include/hw/virtio/vhost-vdpa.h |  3 ++
> > >  hw/virtio/vhost-vdpa.c         |  3 +-
> > >  net/vhost-vdpa.c               | 58 ++++++++++++++++++++++++++++++++++
> > >  3 files changed, 63 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > index 7214eb47dc..1111d85643 100644
> > > --- a/include/hw/virtio/vhost-vdpa.h
> > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > @@ -15,6 +15,7 @@
> > >  #include <gmodule.h>
> > >
> > >  #include "hw/virtio/vhost-iova-tree.h"
> > > +#include "hw/virtio/vhost-shadow-virtqueue.h"
> > >  #include "hw/virtio/virtio.h"
> > >  #include "standard-headers/linux/vhost_types.h"
> > >
> > > @@ -35,6 +36,8 @@ typedef struct vhost_vdpa {
> > >      /* IOVA mapping used by the Shadow Virtqueue */
> > >      VhostIOVATree *iova_tree;
> > >      GPtrArray *shadow_vqs;
> > > +    const VhostShadowVirtqueueOps *shadow_vq_ops;
> > > +    void *shadow_vq_ops_opaque;
> > >      struct vhost_dev *dev;
> > >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > >  } VhostVDPA;
> > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > index 96997210be..beaaa7049a 100644
> > > --- a/hw/virtio/vhost-vdpa.c
> > > +++ b/hw/virtio/vhost-vdpa.c
> > > @@ -419,7 +419,8 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> > >      for (unsigned n = 0; n < hdev->nvqs; ++n) {
> > >          g_autoptr(VhostShadowVirtqueue) svq;
> > >
> > > -        svq = vhost_svq_new(v->iova_tree, NULL, NULL);
> > > +        svq = vhost_svq_new(v->iova_tree, v->shadow_vq_ops,
> > > +                            v->shadow_vq_ops_opaque);
> > >          if (unlikely(!svq)) {
> > >              error_setg(errp, "Cannot create svq %u", n);
> > >              return -1;
> > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > index df1e69ee72..805c9dd6b6 100644
> > > --- a/net/vhost-vdpa.c
> > > +++ b/net/vhost-vdpa.c
> > > @@ -11,11 +11,14 @@
> > >
> > >  #include "qemu/osdep.h"
> > >  #include "clients.h"
> > > +#include "hw/virtio/virtio-net.h"
> > >  #include "net/vhost_net.h"
> > >  #include "net/vhost-vdpa.h"
> > >  #include "hw/virtio/vhost-vdpa.h"
> > >  #include "qemu/config-file.h"
> > >  #include "qemu/error-report.h"
> > > +#include "qemu/log.h"
> > > +#include "qemu/memalign.h"
> > >  #include "qemu/option.h"
> > >  #include "qapi/error.h"
> > >  #include <linux/vhost.h>
> > > @@ -187,6 +190,57 @@ static NetClientInfo net_vhost_vdpa_info = {
> > >          .check_peer_type = vhost_vdpa_check_peer_type,
> > >  };
> > >
> > > +/**
> > > + * Forward buffer for the moment.
> > > + */
> > > +static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
> > > +                                            SVQElement *svq_elem, void *opaque)
> > > +{
> > > +    VirtQueueElement *elem = &svq_elem->elem;
> > > +    unsigned int n = elem->out_num + elem->in_num;
> > > +    g_autofree struct iovec *dev_buffers = g_new(struct iovec, n);
> > > +    size_t in_len, dev_written;
> > > +    virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
> > > +    int r;
> > > +
> > > +    memcpy(dev_buffers, elem->out_sg, elem->out_num);
> > > +    memcpy(dev_buffers + elem->out_num, elem->in_sg, elem->in_num);
> > > +
> > > +    r = vhost_svq_add(svq, &dev_buffers[0], elem->out_num, &dev_buffers[1],
> > > +                      elem->in_num, svq_elem);
> > > +    if (unlikely(r != 0)) {
> > > +        if (unlikely(r == -ENOSPC)) {
> > > +            qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
> > > +                          __func__);
> > > +        }
> > > +        goto out;
> > > +    }
> > > +
> > > +    /*
> > > +     * We can poll here since we've had BQL from the time we sent the
> > > +     * descriptor. Also, we need to take the answer before SVQ pulls by itself,
> > > +     * when BQL is released
> > > +     */
> > > +    dev_written = vhost_svq_poll(svq);
> > > +    if (unlikely(dev_written < sizeof(status))) {
> > > +        error_report("Insufficient written data (%zu)", dev_written);
> > > +    }
> > > +
> > > +out:
> > > +    in_len = iov_from_buf(elem->in_sg, elem->in_num, 0, &status,
> > > +                          sizeof(status));
> > > +    if (unlikely(in_len < sizeof(status))) {
> > > +        error_report("Bad device CVQ written length");
> > > +    }
> > > +    vhost_svq_push_elem(svq, svq_elem, MIN(in_len, sizeof(status)));
> > > +    g_free(svq_elem);
> > > +    return r;
> > > +}
> > > +
> > > +static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
> > > +    .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
> > > +};
> >
> > I wonder if it's possible to even remove this handler. Can we let the
> > kick to be handled by virtio_net_handler_ctrl() in virtio-net.c?
> >
>
> I kind of drafted that here:
> https://lists.nongnu.org/archive/html/qemu-devel/2022-07/msg02652.html
>
> But I'm not sure about the part of not enabling the guest event
> notifiers. I'd say we would need to move SVQ to the VirtIODevice
> itself, which seems rather complicated at this moment.
>
> Without applying that patch, all of the events of vDPA devices go
> either to device itself or to the SVQ registered handlers. If we go
> that route, we add the vq event handler, and that would depend on the
> svq state + specific vq.

What I have in mind is to make it work more like the existing vhost
backends. In that case the shadow virtqueue is just one more way to
communicate with the vhost backend:

1) vhost-net using TAP ioctl
2) vhost-user using UNIX domain socket
3) vhost-vDPA is using shadow cvq

>
> I'm totally ok to do it the first thing for the next merge window, or
> if I have the certainty it's the accepted way to go. Otherwise, I
> think we will not be able to deliver at least migration with SVQ for
> this merge window again.

So I think it's fine for the current stage. We can improve it in the future.

Thanks

>
> Thanks!
>




* Re: [PATCH v2 12/19] vhost: add vhost_svq_poll
  2022-07-15  5:38     ` Eugenio Perez Martin
@ 2022-07-15  8:47       ` Jason Wang
  2022-07-15 17:05         ` Eugenio Perez Martin
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Wang @ 2022-07-15  8:47 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 1:39 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >
> > > It allows the Shadow Control VirtQueue to wait for the device to use the
> > > available buffers.
> > >
> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > ---
> > >  hw/virtio/vhost-shadow-virtqueue.h |  1 +
> > >  hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++++++++
> > >  2 files changed, 23 insertions(+)
> > >
> > > diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> > > index 1692541cbb..b5c6e3b3b4 100644
> > > --- a/hw/virtio/vhost-shadow-virtqueue.h
> > > +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > > @@ -89,6 +89,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
> > >  int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
> > >                    size_t out_num, const struct iovec *in_sg, size_t in_num,
> > >                    SVQElement *elem);
> > > +size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
> > >
> > >  void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
> > >  void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
> > > diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> > > index 5244896358..31a267f721 100644
> > > --- a/hw/virtio/vhost-shadow-virtqueue.c
> > > +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > > @@ -486,6 +486,28 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
> > >      } while (!vhost_svq_enable_notification(svq));
> > >  }
> > >
> > > +/**
> > > + * Poll the SVQ for one device used buffer.
> > > + *
> > > + * This function race with main event loop SVQ polling, so extra
> > > + * synchronization is needed.
> > > + *
> > > + * Return the length written by the device.
> > > + */
> > > +size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > > +{
> > > +    do {
> > > +        uint32_t len;
> > > +        SVQElement *elem = vhost_svq_get_buf(svq, &len);
> > > +        if (elem) {
> > > +            return len;
> > > +        }
> > > +
> > > +        /* Make sure we read new used_idx */
> > > +        smp_rmb();
> >
> > There's already one smp_rmb() in vhost_svq_get_buf(). So this seems useless?
> >
>
> That rmb comes after checking for new entries with (vq->last_used_idx !=
> svq->shadow_used_idx), to avoid reordering the used_idx read with the
> read of the actual used entry. So my understanding is that the compiler
> is free to skip that check within the while loop.

What do you mean by "that check" here?

>
> Maybe the right solution is to add it in vhost_svq_more_used after the
> condition (vq->last_used_idx != svq->shadow_used_idx) is false?

I'm not sure I get the goal of the smp_rmb() here. What barrier does it pair?

Since we are in a busy loop, we will read the new used_idx for sure,
and we can't forecast when the used_idx is committed to memory.
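As a hedged, self-contained illustration of what such a barrier pairs with: the "device" side publishes used_idx with release semantics after filling the used entry, and the polling side loads it with acquire semantics (the ordering the rmb is meant to provide) before reading the entry. C11 atomics stand in for QEMU's barrier macros here.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

static uint32_t used_ring[1];
static _Atomic uint16_t used_idx;

static void *device_thread(void *arg)
{
    (void)arg;
    used_ring[0] = 0x1234;                       /* 1: fill the used entry */
    atomic_store_explicit(&used_idx, 1,
                          memory_order_release); /* 2: publish the index   */
    return NULL;
}

/* Busy-poll as vhost_svq_poll() does; returns the entry once visible */
static uint32_t poll_used(void)
{
    pthread_t tid;
    uint32_t val;

    pthread_create(&tid, NULL, device_thread, NULL);

    while (atomic_load_explicit(&used_idx, memory_order_acquire) == 0) {
        /* spin: every iteration re-reads used_idx from memory */
    }
    /* The acquire load orders this read after the device's write (1) */
    val = used_ring[0];

    pthread_join(tid, NULL);
    return val;
}
```

Without the acquire ordering (or an equivalent rmb between the index load and the entry read), the poller could observe the new used_idx but still read a stale used entry.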

Thanks

>
> Thanks!
>
>
> > Thanks
> >
> > > +    } while (true);
> > > +}
> > > +
> > >  /**
> > >   * Forward used buffers.
> > >   *
> > > --
> > > 2.31.1
> > >
> >
>




* Re: [PATCH v2 18/19] vdpa: Add device migration blocker
  2022-07-15  5:39     ` Eugenio Perez Martin
@ 2022-07-15  8:50       ` Jason Wang
  2022-07-15  9:05         ` Eugenio Perez Martin
  2022-07-22 13:29         ` Eugenio Perez Martin
  0 siblings, 2 replies; 42+ messages in thread
From: Jason Wang @ 2022-07-15  8:50 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 1:40 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 6:03 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >
> > > Since the vhost-vdpa device is exposing _F_LOG,
> >
> > I may miss something but I think it doesn't?
> >
>
> It's at vhost_vdpa_get_features. As long as SVQ is enabled, it's
> exposing VHOST_F_LOG_ALL.
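The mechanism being discussed can be sketched as follows (a simplified, hypothetical helper; it only illustrates the idea that qemu can advertise VHOST_F_LOG_ALL itself when SVQ does the dirty tracking, regardless of what the vDPA parent offers):

```c
#include <stdint.h>

#define VHOST_F_LOG_ALL_BIT 26          /* VHOST_F_LOG_ALL in vhost.h */
#define BIT_ULL(n) (1ULL << (n))

/* With SVQ enabled, qemu intercepts the vrings and tracks dirty
 * memory itself, so it can report _F_LOG support to the vhost core
 * even when the parent device does not offer it. */
static uint64_t get_features(uint64_t parent_features, int svq_enabled)
{
    uint64_t features = parent_features;

    if (svq_enabled) {
        features |= BIT_ULL(VHOST_F_LOG_ALL_BIT);
    }
    return features;
}
```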

Ok, so this needs to be specified in the change log. But I'm kind of
confused here: we do want to allow migration to work, so why do we
disable it?

Thanks

>
> Thanks!
>
> > Note that the features were fetched from the vDPA parent.
> >
> > Thanks
> >
> > > adding a migration blocker if
> > > it uses CVQ.
> > >
> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > ---
> > >  include/hw/virtio/vhost-vdpa.h |  1 +
> > >  hw/virtio/vhost-vdpa.c         | 14 ++++++++++++++
> > >  2 files changed, 15 insertions(+)
> > >
> > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > index 1111d85643..d10a89303e 100644
> > > --- a/include/hw/virtio/vhost-vdpa.h
> > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > @@ -35,6 +35,7 @@ typedef struct vhost_vdpa {
> > >      bool shadow_vqs_enabled;
> > >      /* IOVA mapping used by the Shadow Virtqueue */
> > >      VhostIOVATree *iova_tree;
> > > +    Error *migration_blocker;
> > >      GPtrArray *shadow_vqs;
> > >      const VhostShadowVirtqueueOps *shadow_vq_ops;
> > >      void *shadow_vq_ops_opaque;
> > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > index beaaa7049a..795ed5a049 100644
> > > --- a/hw/virtio/vhost-vdpa.c
> > > +++ b/hw/virtio/vhost-vdpa.c
> > > @@ -20,6 +20,7 @@
> > >  #include "hw/virtio/vhost-shadow-virtqueue.h"
> > >  #include "hw/virtio/vhost-vdpa.h"
> > >  #include "exec/address-spaces.h"
> > > +#include "migration/blocker.h"
> > >  #include "qemu/cutils.h"
> > >  #include "qemu/main-loop.h"
> > >  #include "cpu.h"
> > > @@ -1022,6 +1023,13 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
> > >          return true;
> > >      }
> > >
> > > +    if (v->migration_blocker) {
> > > +        int r = migrate_add_blocker(v->migration_blocker, &err);
> > > +        if (unlikely(r < 0)) {
> > > +            goto err_migration_blocker;
> > > +        }
> > > +    }
> > > +
> > >      for (i = 0; i < v->shadow_vqs->len; ++i) {
> > >          VirtQueue *vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
> > >          VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
> > > @@ -1064,6 +1072,9 @@ err:
> > >          vhost_svq_stop(svq);
> > >      }
> > >
> > > +err_migration_blocker:
> > > +    error_reportf_err(err, "Cannot setup SVQ %u: ", i);
> > > +
> > >      return false;
> > >  }
> > >
> > > @@ -1083,6 +1094,9 @@ static bool vhost_vdpa_svqs_stop(struct vhost_dev *dev)
> > >          }
> > >      }
> > >
> > > +    if (v->migration_blocker) {
> > > +        migrate_del_blocker(v->migration_blocker);
> > > +    }
> > >      return true;
> > >  }
> > >
> > > --
> > > 2.31.1
> > >
> >
>




* Re: [PATCH v2 15/19] vdpa: manual forward CVQ buffers
  2022-07-15  8:44       ` Jason Wang
@ 2022-07-15  9:01         ` Eugenio Perez Martin
  0 siblings, 0 replies; 42+ messages in thread
From: Eugenio Perez Martin @ 2022-07-15  9:01 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 10:44 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 1:34 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Fri, Jul 15, 2022 at 6:08 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > >
> > > > Do a simple forwarding of CVQ buffers, the same work SVQ could do but
> > > > through callbacks. No functional change intended.
> > > >
> > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > ---
> > > >  include/hw/virtio/vhost-vdpa.h |  3 ++
> > > >  hw/virtio/vhost-vdpa.c         |  3 +-
> > > >  net/vhost-vdpa.c               | 58 ++++++++++++++++++++++++++++++++++
> > > >  3 files changed, 63 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > > index 7214eb47dc..1111d85643 100644
> > > > --- a/include/hw/virtio/vhost-vdpa.h
> > > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > > @@ -15,6 +15,7 @@
> > > >  #include <gmodule.h>
> > > >
> > > >  #include "hw/virtio/vhost-iova-tree.h"
> > > > +#include "hw/virtio/vhost-shadow-virtqueue.h"
> > > >  #include "hw/virtio/virtio.h"
> > > >  #include "standard-headers/linux/vhost_types.h"
> > > >
> > > > @@ -35,6 +36,8 @@ typedef struct vhost_vdpa {
> > > >      /* IOVA mapping used by the Shadow Virtqueue */
> > > >      VhostIOVATree *iova_tree;
> > > >      GPtrArray *shadow_vqs;
> > > > +    const VhostShadowVirtqueueOps *shadow_vq_ops;
> > > > +    void *shadow_vq_ops_opaque;
> > > >      struct vhost_dev *dev;
> > > >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > > >  } VhostVDPA;
> > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > > index 96997210be..beaaa7049a 100644
> > > > --- a/hw/virtio/vhost-vdpa.c
> > > > +++ b/hw/virtio/vhost-vdpa.c
> > > > @@ -419,7 +419,8 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> > > >      for (unsigned n = 0; n < hdev->nvqs; ++n) {
> > > >          g_autoptr(VhostShadowVirtqueue) svq;
> > > >
> > > > -        svq = vhost_svq_new(v->iova_tree, NULL, NULL);
> > > > +        svq = vhost_svq_new(v->iova_tree, v->shadow_vq_ops,
> > > > +                            v->shadow_vq_ops_opaque);
> > > >          if (unlikely(!svq)) {
> > > >              error_setg(errp, "Cannot create svq %u", n);
> > > >              return -1;
> > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > > index df1e69ee72..805c9dd6b6 100644
> > > > --- a/net/vhost-vdpa.c
> > > > +++ b/net/vhost-vdpa.c
> > > > @@ -11,11 +11,14 @@
> > > >
> > > >  #include "qemu/osdep.h"
> > > >  #include "clients.h"
> > > > +#include "hw/virtio/virtio-net.h"
> > > >  #include "net/vhost_net.h"
> > > >  #include "net/vhost-vdpa.h"
> > > >  #include "hw/virtio/vhost-vdpa.h"
> > > >  #include "qemu/config-file.h"
> > > >  #include "qemu/error-report.h"
> > > > +#include "qemu/log.h"
> > > > +#include "qemu/memalign.h"
> > > >  #include "qemu/option.h"
> > > >  #include "qapi/error.h"
> > > >  #include <linux/vhost.h>
> > > > @@ -187,6 +190,57 @@ static NetClientInfo net_vhost_vdpa_info = {
> > > >          .check_peer_type = vhost_vdpa_check_peer_type,
> > > >  };
> > > >
> > > > +/**
> > > > + * Forward buffer for the moment.
> > > > + */
> > > > +static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
> > > > +                                            SVQElement *svq_elem, void *opaque)
> > > > +{
> > > > +    VirtQueueElement *elem = &svq_elem->elem;
> > > > +    unsigned int n = elem->out_num + elem->in_num;
> > > > +    g_autofree struct iovec *dev_buffers = g_new(struct iovec, n);
> > > > +    size_t in_len, dev_written;
> > > > +    virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
> > > > +    int r;
> > > > +
> > > > +    memcpy(dev_buffers, elem->out_sg, elem->out_num);
> > > > +    memcpy(dev_buffers + elem->out_num, elem->in_sg, elem->in_num);
> > > > +
> > > > +    r = vhost_svq_add(svq, &dev_buffers[0], elem->out_num, &dev_buffers[1],
> > > > +                      elem->in_num, svq_elem);
> > > > +    if (unlikely(r != 0)) {
> > > > +        if (unlikely(r == -ENOSPC)) {
> > > > +            qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
> > > > +                          __func__);
> > > > +        }
> > > > +        goto out;
> > > > +    }
> > > > +
> > > > +    /*
> > > > +     * We can poll here since we've had BQL from the time we sent the
> > > > +     * descriptor. Also, we need to take the answer before SVQ pulls by itself,
> > > > +     * when BQL is released
> > > > +     */
> > > > +    dev_written = vhost_svq_poll(svq);
> > > > +    if (unlikely(dev_written < sizeof(status))) {
> > > > +        error_report("Insufficient written data (%zu)", dev_written);
> > > > +    }
> > > > +
> > > > +out:
> > > > +    in_len = iov_from_buf(elem->in_sg, elem->in_num, 0, &status,
> > > > +                          sizeof(status));
> > > > +    if (unlikely(in_len < sizeof(status))) {
> > > > +        error_report("Bad device CVQ written length");
> > > > +    }
> > > > +    vhost_svq_push_elem(svq, svq_elem, MIN(in_len, sizeof(status)));
> > > > +    g_free(svq_elem);
> > > > +    return r;
> > > > +}
> > > > +
> > > > +static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
> > > > +    .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
> > > > +};
> > >
> > > I wonder if it's possible to even remove this handler. Can we let the
> > > kick to be handled by virtio_net_handler_ctrl() in virtio-net.c?
> > >
> >
> > I kind of drafted that here:
> > https://lists.nongnu.org/archive/html/qemu-devel/2022-07/msg02652.html
> >
> > But I'm not sure about the part of not enabling the guest event
> > notifiers. I'd say we would need to move SVQ to the VirtIODevice
> > itself, which seems rather complicated at this moment.
> >
> > Without applying that patch, all of the events of vDPA devices go
> > either to device itself or to the SVQ registered handlers. If we go
> > that route, we add the vq event handler, and that would depend on the
> > svq state + specific vq.
>
> So what's in my mind is to let it work more like existing vhost
> backends. In that case the shadow virtqueue is only used as one way to
> communicate with the vhost-backend:
>
> 1) vhost-net using TAP ioctl
> 2) vhost-user using UNIX domain socket
> 3) vhost-vDPA is using shadow cvq
>

Yes, but how do you get the SVQ event handlers called?

Data vqs currently register themselves via
vhost_svq_set_svq_kick_fd / vhost_svq_new. With this proposal, data
vqs would stay that way but cvq would be registered by other means,
making SVQ harder to reason about.

With this proposal, the right thing to do is to move everything to
the virtqueues' handle_output callbacks. I like the proposal a lot:
that way we slim down SVQ considerably, and SVQ will benefit from
every improvement we make to VirtQueue kick handling (coroutines,
iothreads, ...). However, moving everything comes with problems: SVQ
code can be called outside of the event loop.

Both that and not calling vhost_dev_enable_notifiers would be severely
untested at this point. I'm not sure whether other vhost devices do
that, but I think not.

Thanks!



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 18/19] vdpa: Add device migration blocker
  2022-07-15  8:50       ` Jason Wang
@ 2022-07-15  9:05         ` Eugenio Perez Martin
  2022-07-15 16:13           ` Eugenio Perez Martin
  2022-07-22 13:29         ` Eugenio Perez Martin
  1 sibling, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2022-07-15  9:05 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 10:51 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 1:40 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Fri, Jul 15, 2022 at 6:03 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > >
> > > > Since the vhost-vdpa device is exposing _F_LOG,
> > >
> > > I may miss something but I think it doesn't?
> > >
> >
> > It's at vhost_vdpa_get_features. As long as SVQ is enabled, it's
> > exposing VHOST_F_LOG_ALL.
>
> Ok, so this needs to be specified in the change log.

Got it, I'll add a note to the changelog.

> But I'm kind of
> confused here, we do want to allow migration to work so why we disable
> it?
>

With the x-svq parameter, migration of simple devices with no cvq "is
possible". It has intrinsic problems, like not being able to emit the
gratuitous ARP, but it is possible and traffic can continue.

But devices with a cvq require restoring the state at the destination.
That part is not implemented, so migration is blocked at the moment.

In the immediate future not all cases (i.e., "net features") will be
available: net/vhost-net.c (or virtio-net.c?) needs to know how to
inject the state at the destination to restore the guest-visible
configuration. It's simple code, but it needs to be developed, so the
migration blocker is kept for these features. Hopefully we will reach
a point where all features supported by virtio-net.c are supported,
but the right thing to do is to merge the basic ones first.

Thanks!




* Re: [PATCH v2 18/19] vdpa: Add device migration blocker
  2022-07-15  9:05         ` Eugenio Perez Martin
@ 2022-07-15 16:13           ` Eugenio Perez Martin
  0 siblings, 0 replies; 42+ messages in thread
From: Eugenio Perez Martin @ 2022-07-15 16:13 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 11:05 AM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 10:51 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Fri, Jul 15, 2022 at 1:40 PM Eugenio Perez Martin
> > <eperezma@redhat.com> wrote:
> > >
> > > On Fri, Jul 15, 2022 at 6:03 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > >
> > > > > Since the vhost-vdpa device is exposing _F_LOG,
> > > >
> > > > I may miss something but I think it doesn't?
> > > >
> > >
> > > It's at vhost_vdpa_get_features. As long as SVQ is enabled, it's
> > > exposing VHOST_F_LOG_ALL.
> >
> > Ok, so this needs to be specified in the change log.
>
> Got it, I'll write some note.
>
> > But I'm kind of
> > confused here, we do want to allow migration to work so why we disable
> > it?
> >
>

Adding here:
Without the x-svq parameter, migration is disabled unless the actual
vdpa device backend supports _F_LOG_ALL by itself. There is no such
thing in the Linux kernel at the moment.

> With x-svq parameter, migration of simple devices with no cvq "is
> possible". It has intrinsic problems like can't emit the gratuitous
> arp but it's possible and traffic can continue.
>
> But devices with cvq require to restore the state at the destination.
> That part is not implemented, so it's blocked at the moment.
>
> In the immediate future not all cases (as "net features") will be
> available: net/vhost-net.c (or virtio-net.c?) needs to know how to
> inject the state at the destination to restore the guest visible
> configuration. It's simple code, but it needs to be developed. So
> migration blocker is kept for these features. Hopefully, we will reach
> a point where all features supported by virtio-net.c will be
> supported, but the right thing to do is to merge basic ones first.
>
> Thanks!




* Re: [PATCH v2 12/19] vhost: add vhost_svq_poll
  2022-07-15  8:47       ` Jason Wang
@ 2022-07-15 17:05         ` Eugenio Perez Martin
  2022-07-19  7:38           ` Jason Wang
  0 siblings, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2022-07-15 17:05 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 10:48 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 1:39 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Fri, Jul 15, 2022 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > >
> > > > It allows the Shadow Control VirtQueue to wait for the device to use the
> > > > available buffers.
> > > >
> > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > ---
> > > >  hw/virtio/vhost-shadow-virtqueue.h |  1 +
> > > >  hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++++++++
> > > >  2 files changed, 23 insertions(+)
> > > >
> > > > diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> > > > index 1692541cbb..b5c6e3b3b4 100644
> > > > --- a/hw/virtio/vhost-shadow-virtqueue.h
> > > > +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > > > @@ -89,6 +89,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
> > > >  int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
> > > >                    size_t out_num, const struct iovec *in_sg, size_t in_num,
> > > >                    SVQElement *elem);
> > > > +size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
> > > >
> > > >  void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
> > > >  void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
> > > > diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> > > > index 5244896358..31a267f721 100644
> > > > --- a/hw/virtio/vhost-shadow-virtqueue.c
> > > > +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > > > @@ -486,6 +486,28 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
> > > >      } while (!vhost_svq_enable_notification(svq));
> > > >  }
> > > >
> > > > +/**
> > > > + * Poll the SVQ for one device used buffer.
> > > > + *
> > > > + * This function race with main event loop SVQ polling, so extra
> > > > + * synchronization is needed.
> > > > + *
> > > > + * Return the length written by the device.
> > > > + */
> > > > +size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > > > +{
> > > > +    do {
> > > > +        uint32_t len;
> > > > +        SVQElement *elem = vhost_svq_get_buf(svq, &len);
> > > > +        if (elem) {
> > > > +            return len;
> > > > +        }
> > > > +
> > > > +        /* Make sure we read new used_idx */
> > > > +        smp_rmb();
> > >
> > > There's already one smp_rmb() in vhost_svq_get_buf(). So this seems useless?
> > >
> >
> > That rmb is after checking for new entries with (vq->last_used_idx !=
> > svq->shadow_used_idx) , to avoid reordering used_idx read with the
> > actual used entry. So my understanding is
> > that the compiler is free to skip that check within the while loop.
>
> What do you mean by "that check" here?
>

The first check of (presumably cached) last_used_idx !=
shadow_used_idx. Right before the SVQ's vring fetch of
svq->vring.used->idx.

> >
> > Maybe the right solution is to add it in vhost_svq_more_used after the
> > condition (vq->last_used_idx != svq->shadow_used_idx) is false?
>
> I'm not sure I get the goal of the smp_rmb() here. What barrier does it pair?
>

It pairs with the actual device memory write.

Note that I'm worried about compiler optimizations or reordering
causing an infinite loop, not that the memory is updated properly.

Let's say we just returned from vhost_svq_add, and avail_idx -
used_idx > 0. The device still has not updated the SVQ vring used idx,
and qemu is fast enough to complete a full call of vhost_svq_get_buf
before the device increments the used index. We can then execute the
whole of vhost_svq_get_buf without hitting a memory barrier.

If the compiler inlines enough and we delete the new smp_rmb barrier,
this is what it sees:

size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
{
    do {
        more_used = false;
        // The next conditional returns false
        if (svq->last_used_idx != svq->shadow_used_idx) {
            goto useful;
        }

        svq->shadow_used_idx = cpu_to_le16(svq->vring.used->idx);

        // next conditional returns false too
        if (!(svq->last_used_idx != svq->shadow_used_idx)) {
            continue;
        }

useful:
        // actual code to handle new used buffer
        break;
    } while (true);
}

And qemu does not need to read any of the variables again, since
nothing modifies them within the loop before the "useful" label
(svq->vring.used->idx, svq->last_used_idx, svq->shadow_used_idx). So
the compiler could freely rewrite it as:

size_t vhost_svq_poll(VhostShadowVirtqueue *svq) {
    if (svq->last_used_idx == svq->shadow_used_idx &&
        svq->last_used_idx == svq->vring.used->idx) {
            for (;;);
    }
}

That's why I think the right place for the barrier is right after the
caller sees (potentially cached) last_used_idx == shadow_used_idx; it
needs to read a value paired with the device's barrier on the SVQ
vring.

We didn't have that problem before, since we clear the event_notifier
right before the do {} while (), and the event loop should hit a
memory barrier in the next select / poll / read / whatever syscall
when it checks that the event notifier fd is set again.

> Since we are in the busy loop, we will read the new used_idx for
> sure,

I'm not so sure of that, but maybe I've missed something.

I'm sending v3 with this comment pending, so we can iterate faster.

Thanks!

> and we can't forecast when the used_idx is committed to memory.
>




* Re: [PATCH v2 12/19] vhost: add vhost_svq_poll
  2022-07-15 17:05         ` Eugenio Perez Martin
@ 2022-07-19  7:38           ` Jason Wang
  2022-07-19  8:42             ` Eugenio Perez Martin
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Wang @ 2022-07-19  7:38 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)


On 2022/7/16 01:05, Eugenio Perez Martin wrote:
> On Fri, Jul 15, 2022 at 10:48 AM Jason Wang <jasowang@redhat.com> wrote:
>> On Fri, Jul 15, 2022 at 1:39 PM Eugenio Perez Martin
>> <eperezma@redhat.com> wrote:
>>> On Fri, Jul 15, 2022 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>>>> It allows the Shadow Control VirtQueue to wait for the device to use the
>>>>> available buffers.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>>   hw/virtio/vhost-shadow-virtqueue.h |  1 +
>>>>>   hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++++++++
>>>>>   2 files changed, 23 insertions(+)
>>>>>
>>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
>>>>> index 1692541cbb..b5c6e3b3b4 100644
>>>>> --- a/hw/virtio/vhost-shadow-virtqueue.h
>>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
>>>>> @@ -89,6 +89,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
>>>>>   int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
>>>>>                     size_t out_num, const struct iovec *in_sg, size_t in_num,
>>>>>                     SVQElement *elem);
>>>>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
>>>>>
>>>>>   void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
>>>>>   void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
>>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
>>>>> index 5244896358..31a267f721 100644
>>>>> --- a/hw/virtio/vhost-shadow-virtqueue.c
>>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
>>>>> @@ -486,6 +486,28 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
>>>>>       } while (!vhost_svq_enable_notification(svq));
>>>>>   }
>>>>>
>>>>> +/**
>>>>> + * Poll the SVQ for one device used buffer.
>>>>> + *
>>>>> + * This function race with main event loop SVQ polling, so extra
>>>>> + * synchronization is needed.
>>>>> + *
>>>>> + * Return the length written by the device.
>>>>> + */
>>>>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
>>>>> +{
>>>>> +    do {
>>>>> +        uint32_t len;
>>>>> +        SVQElement *elem = vhost_svq_get_buf(svq, &len);
>>>>> +        if (elem) {
>>>>> +            return len;
>>>>> +        }
>>>>> +
>>>>> +        /* Make sure we read new used_idx */
>>>>> +        smp_rmb();
>>>> There's already one smp_rmb() in vhost_svq_get_buf(). So this seems useless?
>>>>
>>> That rmb is after checking for new entries with (vq->last_used_idx !=
>>> svq->shadow_used_idx) , to avoid reordering used_idx read with the
>>> actual used entry. So my understanding is
>>> that the compiler is free to skip that check within the while loop.
>> What do you mean by "that check" here?
>>
> The first check of (presumably cached) last_used_idx !=
> shadow_used_idx. Right before the SVQ's vring fetch of
> svq->vring.used->idx.
>
>>> Maybe the right solution is to add it in vhost_svq_more_used after the
>>> condition (vq->last_used_idx != svq->shadow_used_idx) is false?
>> I'm not sure I get the goal of the smp_rmb() here. What barrier does it pair?
>>
> It pairs with the actual device memory write.
>
> Note that I'm worried about compiler optimizations or reordering
> causing an infinite loop, not that the memory is updated properly.
>
> Let's say we just returned from vhost_svq_add, and avail_idx -
> used_idx > 0. The device still did not update SVQ vring used idx, and
> qemu is very fast so it completes a full call of vhost_svq_get_buf
> before the device is able to increment the used index. We can trace
> the full vhost_svq_get_buf without a memory barrier.
>
> If the compiler inlines enough and we delete the new smp_rmb barrier,
> this is what it sees:
>
> size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> {
>      do {
>          more_used = false
>          // The next conditional returns false
>          if (svq->last_used_idx != svq->shadow_used_idx) {
>              goto useful;
>          }
>
>          svq->shadow_used_idx = cpu_to_le16(svq->vring.used->idx);
>
>          // next conditional returns false too
>          if (!(svq->last_used_idx != svq->shadow_used_idx))
>              continue;
>
> useful:
>          // actual code to handle new used buffer
>          break;
>          }
>      }
> }
>
> And qemu does not need to read again none of the variables since
> nothing modifies them within the loop before "useful" tag
> (svq->vring.used->idx, svq->last_used_idx, svq->shadow_used_idx). So
> it could freely rewrite as:
>
> size_t vhost_svq_poll(VhostShadowVirtqueue *svq) {
>      if (svq->last_used_idx == svq->shadow_used_idx &&
>          svq->last_used_idx == svq->vring.used->idx) {
>              for (;;);
>      }
> }
>
> That's why I think the right place for the mb is right after caller
> code see (potentially cached) last_used_idx == shadow_used_idx, and it
> needs to read a value paired with the "device's mb" in the SVQ vring.


I think you need "volatile" instead of the memory barriers. If I
understand correctly, you want a load from memory instead of from a
register here.

Thanks


>
> We didn't have that problem before, since we clear event_notifier
> right before the do{}while(), and event loop should hit a memory
> barrier in the next select / poll / read / whatever syscall to check
> that event notifier fd is set again.
>
>> Since we are in the busy loop, we will read the for new used_idx for
>> sure,
> I'm not so sure of that, but maybe I've missed something.
>
> I'm sending v3 with this comment pending, so we can iterate faster.
>
> Thanks!
>
>> and we can't forecast when the used_idx is committed to memory.
>>




* Re: [PATCH v2 12/19] vhost: add vhost_svq_poll
  2022-07-19  7:38           ` Jason Wang
@ 2022-07-19  8:42             ` Eugenio Perez Martin
  2022-07-19  8:48               ` Jason Wang
  0 siblings, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2022-07-19  8:42 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Tue, Jul 19, 2022 at 9:38 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2022/7/16 01:05, Eugenio Perez Martin wrote:
> > On Fri, Jul 15, 2022 at 10:48 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Fri, Jul 15, 2022 at 1:39 PM Eugenio Perez Martin
> >> <eperezma@redhat.com> wrote:
> >>> On Fri, Jul 15, 2022 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
> >>>> On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >>>>> It allows the Shadow Control VirtQueue to wait for the device to use the
> >>>>> available buffers.
> >>>>>
> >>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>>>> ---
> >>>>>   hw/virtio/vhost-shadow-virtqueue.h |  1 +
> >>>>>   hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++++++++
> >>>>>   2 files changed, 23 insertions(+)
> >>>>>
> >>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> >>>>> index 1692541cbb..b5c6e3b3b4 100644
> >>>>> --- a/hw/virtio/vhost-shadow-virtqueue.h
> >>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> >>>>> @@ -89,6 +89,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
> >>>>>   int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
> >>>>>                     size_t out_num, const struct iovec *in_sg, size_t in_num,
> >>>>>                     SVQElement *elem);
> >>>>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
> >>>>>
> >>>>>   void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
> >>>>>   void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
> >>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> >>>>> index 5244896358..31a267f721 100644
> >>>>> --- a/hw/virtio/vhost-shadow-virtqueue.c
> >>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> >>>>> @@ -486,6 +486,28 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
> >>>>>       } while (!vhost_svq_enable_notification(svq));
> >>>>>   }
> >>>>>
> >>>>> +/**
> >>>>> + * Poll the SVQ for one device used buffer.
> >>>>> + *
> >>>>> + * This function race with main event loop SVQ polling, so extra
> >>>>> + * synchronization is needed.
> >>>>> + *
> >>>>> + * Return the length written by the device.
> >>>>> + */
> >>>>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> >>>>> +{
> >>>>> +    do {
> >>>>> +        uint32_t len;
> >>>>> +        SVQElement *elem = vhost_svq_get_buf(svq, &len);
> >>>>> +        if (elem) {
> >>>>> +            return len;
> >>>>> +        }
> >>>>> +
> >>>>> +        /* Make sure we read new used_idx */
> >>>>> +        smp_rmb();
> >>>> There's already one smp_rmb() in vhost_svq_get_buf(). So this seems useless?
> >>>>
> >>> That rmb is after checking for new entries with (vq->last_used_idx !=
> >>> svq->shadow_used_idx) , to avoid reordering used_idx read with the
> >>> actual used entry. So my understanding is
> >>> that the compiler is free to skip that check within the while loop.
> >> What do you mean by "that check" here?
> >>
> > The first check of (presumably cached) last_used_idx !=
> > shadow_used_idx. Right before the SVQ's vring fetch of
> > svq->vring.used->idx.
> >
> >>> Maybe the right solution is to add it in vhost_svq_more_used after the
> >>> condition (vq->last_used_idx != svq->shadow_used_idx) is false?
> >> I'm not sure I get the goal of the smp_rmb() here. What barrier does it pair?
> >>
> > It pairs with the actual device memory write.
> >
> > Note that I'm worried about compiler optimizations or reordering
> > causing an infinite loop, not that the memory is updated properly.
> >
> > Let's say we just returned from vhost_svq_add, and avail_idx -
> > used_idx > 0. The device still did not update SVQ vring used idx, and
> > qemu is very fast so it completes a full call of vhost_svq_get_buf
> > before the device is able to increment the used index. We can trace
> > the full vhost_svq_get_buf without a memory barrier.
> >
> > If the compiler inlines enough and we delete the new smp_rmb barrier,
> > this is what it sees:
> >
> > size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > {
> >      do {
> >          more_used = false
> >          // The next conditional returns false
> >          if (svq->last_used_idx != svq->shadow_used_idx) {
> >              goto useful;
> >          }
> >
> >          svq->shadow_used_idx = cpu_to_le16(svq->vring.used->idx);
> >
> >          // next conditional returns false too
> >          if (!(svq->last_used_idx != svq->shadow_used_idx))
> >              continue;
> >
> > useful:
> >          // actual code to handle new used buffer
> >          break;
> >          }
> >      }
> > }
> >
> > And qemu does not need to read again none of the variables since
> > nothing modifies them within the loop before "useful" tag
> > (svq->vring.used->idx, svq->last_used_idx, svq->shadow_used_idx). So
> > it could freely rewrite as:
> >
> > size_t vhost_svq_poll(VhostShadowVirtqueue *svq) {
> >      if (svq->last_used_idx == svq->shadow_used_idx &&
> >          svq->last_used_idx == svq->vring.used->idx) {
> >              for (;;);
> >      }
> > }
> >
> > That's why I think the right place for the mb is right after caller
> > code see (potentially cached) last_used_idx == shadow_used_idx, and it
> > needs to read a value paired with the "device's mb" in the SVQ vring.
>
>
> I think you need "volatile" instead of the memory barriers.

The kernel's READ_ONCE implementation uses a volatile cast but also a
memory read barrier after it. I guess that is because the compiler can
reorder non-volatile accesses around volatile ones. In the proposed
code, that barrier is provided by the vhost_svq_more_used caller
(vhost_svq_get_buf). I think no other caller should need it.

Thanks!

> If I
> understand correctly, you want the load from the memory instead of the
> registers here.
>
> Thanks
>
>
> >
> > We didn't have that problem before, since we clear event_notifier
> > right before the do{}while(), and event loop should hit a memory
> > barrier in the next select / poll / read / whatever syscall to check
> > that event notifier fd is set again.
> >
> >> Since we are in the busy loop, we will read the for new used_idx for
> >> sure,
> > I'm not so sure of that, but maybe I've missed something.
> >
> > I'm sending v3 with this comment pending, so we can iterate faster.
> >
> > Thanks!
> >
> >> and we can't forecast when the used_idx is committed to memory.
> >>
>




* Re: [PATCH v2 12/19] vhost: add vhost_svq_poll
  2022-07-19  8:42             ` Eugenio Perez Martin
@ 2022-07-19  8:48               ` Jason Wang
  2022-07-19  9:09                 ` Eugenio Perez Martin
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Wang @ 2022-07-19  8:48 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Tue, Jul 19, 2022 at 4:42 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Tue, Jul 19, 2022 at 9:38 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> > On 2022/7/16 01:05, Eugenio Perez Martin wrote:
> > > On Fri, Jul 15, 2022 at 10:48 AM Jason Wang <jasowang@redhat.com> wrote:
> > >> On Fri, Jul 15, 2022 at 1:39 PM Eugenio Perez Martin
> > >> <eperezma@redhat.com> wrote:
> > >>> On Fri, Jul 15, 2022 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
> > >>>> On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >>>>> It allows the Shadow Control VirtQueue to wait for the device to use the
> > >>>>> available buffers.
> > >>>>>
> > >>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > >>>>> ---
> > >>>>>   hw/virtio/vhost-shadow-virtqueue.h |  1 +
> > >>>>>   hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++++++++
> > >>>>>   2 files changed, 23 insertions(+)
> > >>>>>
> > >>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> > >>>>> index 1692541cbb..b5c6e3b3b4 100644
> > >>>>> --- a/hw/virtio/vhost-shadow-virtqueue.h
> > >>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > >>>>> @@ -89,6 +89,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
> > >>>>>   int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
> > >>>>>                     size_t out_num, const struct iovec *in_sg, size_t in_num,
> > >>>>>                     SVQElement *elem);
> > >>>>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
> > >>>>>
> > >>>>>   void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
> > >>>>>   void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
> > >>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> > >>>>> index 5244896358..31a267f721 100644
> > >>>>> --- a/hw/virtio/vhost-shadow-virtqueue.c
> > >>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > >>>>> @@ -486,6 +486,28 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
> > >>>>>       } while (!vhost_svq_enable_notification(svq));
> > >>>>>   }
> > >>>>>
> > >>>>> +/**
> > >>>>> + * Poll the SVQ for one device used buffer.
> > >>>>> + *
> > >>>>> + * This function race with main event loop SVQ polling, so extra
> > >>>>> + * synchronization is needed.
> > >>>>> + *
> > >>>>> + * Return the length written by the device.
> > >>>>> + */
> > >>>>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > >>>>> +{
> > >>>>> +    do {
> > >>>>> +        uint32_t len;
> > >>>>> +        SVQElement *elem = vhost_svq_get_buf(svq, &len);
> > >>>>> +        if (elem) {
> > >>>>> +            return len;
> > >>>>> +        }
> > >>>>> +
> > >>>>> +        /* Make sure we read new used_idx */
> > >>>>> +        smp_rmb();
> > >>>> There's already one smp_rmb() in vhost_svq_get_buf(). So this seems useless?
> > >>>>
> > >>> That rmb is after checking for new entries with (vq->last_used_idx !=
> > >>> svq->shadow_used_idx) , to avoid reordering used_idx read with the
> > >>> actual used entry. So my understanding is
> > >>> that the compiler is free to skip that check within the while loop.
> > >> What do you mean by "that check" here?
> > >>
> > > The first check of (presumably cached) last_used_idx !=
> > > shadow_used_idx. Right before the SVQ's vring fetch of
> > > svq->vring.used->idx.
> > >
> > >>> Maybe the right solution is to add it in vhost_svq_more_used after the
> > >>> condition (vq->last_used_idx != svq->shadow_used_idx) is false?
> > >> I'm not sure I get the goal of the smp_rmb() here. What barrier does it pair?
> > >>
> > > It pairs with the actual device memory write.
> > >
> > > Note that I'm worried about compiler optimizations or reordering
> > > causing an infinite loop, not that the memory is updated properly.
> > >
> > > Let's say we just returned from vhost_svq_add, and avail_idx -
> > > used_idx > 0. The device still did not update SVQ vring used idx, and
> > > qemu is very fast so it completes a full call of vhost_svq_get_buf
> > > before the device is able to increment the used index. We can trace
> > > the full vhost_svq_get_buf without a memory barrier.
> > >
> > > If the compiler inlines enough and we delete the new smp_rmb barrier,
> > > this is what it sees:
> > >
> > > size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > > {
> > >      do {
> > >          more_used = false
> > >          // The next conditional returns false
> > >          if (svq->last_used_idx != svq->shadow_used_idx) {
> > >              goto useful;
> > >          }
> > >
> > >          svq->shadow_used_idx = cpu_to_le16(svq->vring.used->idx);
> > >
> > >          // next conditional returns false too
> > >          if (!(svq->last_used_idx != svq->shadow_used_idx))
> > >              continue;
> > >
> > > useful:
> > >          // actual code to handle new used buffer
> > >          break;
> > >          }
> > >      }
> > > }
> > >
> > > And qemu does not need to read again none of the variables since
> > > nothing modifies them within the loop before "useful" tag
> > > (svq->vring.used->idx, svq->last_used_idx, svq->shadow_used_idx). So
> > > it could freely rewrite as:
> > >
> > > size_t vhost_svq_poll(VhostShadowVirtqueue *svq) {
> > >      if (svq->last_used_idx == svq->shadow_used_idx &&
> > >          svq->last_used_idx == svq->vring.used->idx) {
> > >              for (;;);
> > >      }
> > > }
> > >
> > > That's why I think the right place for the mb is right after caller
> > > code see (potentially cached) last_used_idx == shadow_used_idx, and it
> > > needs to read a value paired with the "device's mb" in the SVQ vring.
> >
> >
> > I think you need "volatile" instead of the memory barriers.
>
> The kernel's READ_ONCE implementation uses a volatile casting but also
> a memory read barrier after it.

This sounds strange; volatile should not be tied to any barriers. And
this is what I've seen in the kernel's code:

/*
 * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
 * atomicity. Note that this may result in tears!
 */
#ifndef __READ_ONCE
#define __READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
#endif

Thanks

> I guess it's because the compiler can
> reorder non-volatile accesses around volatile ones. In the proposed
> code, that barrier is provided by the vhost_svq_more_used caller
> (vhost_svq_get_buf). I think no other caller should need it.
>
> Thanks!
>
> > If I
> > understand correctly, you want the load from the memory instead of the
> > registers here.
> >
> > Thanks
> >
> >
> > >
> > > We didn't have that problem before, since we clear event_notifier
> > > right before the do{}while(), and event loop should hit a memory
> > > barrier in the next select / poll / read / whatever syscall to check
> > > that event notifier fd is set again.
> > >
> > >> Since we are in the busy loop, we will read the new used_idx for
> > >> sure,
> > > I'm not so sure of that, but maybe I've missed something.
> > >
> > > I'm sending v3 with this comment pending, so we can iterate faster.
> > >
> > > Thanks!
> > >
> > >> and we can't forecast when the used_idx is committed to memory.
> > >>
> >
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 12/19] vhost: add vhost_svq_poll
  2022-07-19  8:48               ` Jason Wang
@ 2022-07-19  9:09                 ` Eugenio Perez Martin
  0 siblings, 0 replies; 42+ messages in thread
From: Eugenio Perez Martin @ 2022-07-19  9:09 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Tue, Jul 19, 2022 at 10:49 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Tue, Jul 19, 2022 at 4:42 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Tue, Jul 19, 2022 at 9:38 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > >
> > > On 2022/7/16 01:05, Eugenio Perez Martin wrote:
> > > > On Fri, Jul 15, 2022 at 10:48 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >> On Fri, Jul 15, 2022 at 1:39 PM Eugenio Perez Martin
> > > >> <eperezma@redhat.com> wrote:
> > > >>> On Fri, Jul 15, 2022 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >>>> On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > >>>>> It allows the Shadow Control VirtQueue to wait for the device to use the
> > > >>>>> available buffers.
> > > >>>>>
> > > >>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > >>>>> ---
> > > >>>>>   hw/virtio/vhost-shadow-virtqueue.h |  1 +
> > > >>>>>   hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++++++++
> > > >>>>>   2 files changed, 23 insertions(+)
> > > >>>>>
> > > >>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> > > >>>>> index 1692541cbb..b5c6e3b3b4 100644
> > > >>>>> --- a/hw/virtio/vhost-shadow-virtqueue.h
> > > >>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > > >>>>> @@ -89,6 +89,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
> > > >>>>>   int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
> > > >>>>>                     size_t out_num, const struct iovec *in_sg, size_t in_num,
> > > >>>>>                     SVQElement *elem);
> > > >>>>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
> > > >>>>>
> > > >>>>>   void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
> > > >>>>>   void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
> > > >>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> > > >>>>> index 5244896358..31a267f721 100644
> > > >>>>> --- a/hw/virtio/vhost-shadow-virtqueue.c
> > > >>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > > >>>>> @@ -486,6 +486,28 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
> > > >>>>>       } while (!vhost_svq_enable_notification(svq));
> > > >>>>>   }
> > > >>>>>
> > > >>>>> +/**
> > > >>>>> + * Poll the SVQ for one device used buffer.
> > > >>>>> + *
> > > >>>>> + * This function race with main event loop SVQ polling, so extra
> > > >>>>> + * synchronization is needed.
> > > >>>>> + *
> > > >>>>> + * Return the length written by the device.
> > > >>>>> + */
> > > >>>>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > > >>>>> +{
> > > >>>>> +    do {
> > > >>>>> +        uint32_t len;
> > > >>>>> +        SVQElement *elem = vhost_svq_get_buf(svq, &len);
> > > >>>>> +        if (elem) {
> > > >>>>> +            return len;
> > > >>>>> +        }
> > > >>>>> +
> > > >>>>> +        /* Make sure we read new used_idx */
> > > >>>>> +        smp_rmb();
> > > >>>> There's already one smp_rmb(0 in vhost_svq_get_buf(). So this seems useless?
> > > >>>>
> > > >>> That rmb is after checking for new entries with (vq->last_used_idx !=
> > > >>> svq->shadow_used_idx), to avoid reordering the used_idx read with the
> > > >>> actual used entry. So my understanding is
> > > >>> that the compiler is free to skip that check within the while loop.
> > > >> What do you mean by "that check" here?
> > > >>
> > > > The first check of (presumably cached) last_used_idx !=
> > > > shadow_used_idx. Right before the SVQ's vring fetch of
> > > > svq->vring.used->idx.
> > > >
> > > >>> Maybe the right solution is to add it in vhost_svq_more_used after the
> > > >>> condition (vq->last_used_idx != svq->shadow_used_idx) is false?
> > > >> I'm not sure I get the goal of the smp_rmb() here. What barrier does it pair?
> > > >>
> > > > It pairs with the actual device memory write.
> > > >
> > > > Note that I'm worried about compiler optimizations or reordering
> > > > causing an infinite loop, not that the memory is updated properly.
> > > >
> > > > Let's say we just returned from vhost_svq_add, and avail_idx -
> > > > used_idx > 0. The device still did not update SVQ vring used idx, and
> > > > qemu is very fast so it completes a full call of vhost_svq_get_buf
> > > > before the device is able to increment the used index. We can trace
> > > > the full vhost_svq_get_buf without a memory barrier.
> > > >
> > > > If the compiler inlines enough and we delete the new smp_rmb barrier,
> > > > this is what it sees:
> > > >
> > > > size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > > > {
> > > >      do {
> > > >          more_used = false;
> > > >          // The next conditional returns false
> > > >          if (svq->last_used_idx != svq->shadow_used_idx) {
> > > >              goto useful;
> > > >          }
> > > >
> > > >          svq->shadow_used_idx = cpu_to_le16(svq->vring.used->idx);
> > > >
> > > >          // next conditional returns false too
> > > >          if (!(svq->last_used_idx != svq->shadow_used_idx)) {
> > > >              continue;
> > > >          }
> > > >
> > > > useful:
> > > >          // actual code to handle new used buffer
> > > >          break;
> > > >      } while (true);
> > > > }
> > > >
> > > > And qemu does not need to read any of the variables again, since
> > > > nothing modifies them within the loop before the "useful" tag
> > > > (svq->vring.used->idx, svq->last_used_idx, svq->shadow_used_idx). So
> > > > it could freely rewrite it as:
> > > >
> > > > size_t vhost_svq_poll(VhostShadowVirtqueue *svq) {
> > > >      if (svq->last_used_idx == svq->shadow_used_idx &&
> > > >          svq->last_used_idx == svq->vring.used->idx) {
> > > >              for (;;);
> > > >      }
> > > > }
> > > >
> > > > That's why I think the right place for the mb is right after the
> > > > caller code sees the (potentially cached) last_used_idx ==
> > > > shadow_used_idx, and it needs to read a value paired with the
> > > > "device's mb" in the SVQ vring.
> > >
> > >
> > > I think you need "volatile" instead of the memory barriers.
> >
> > The kernel's READ_ONCE implementation uses a volatile cast but also
> > a memory read barrier after it.
>
> This sounds strange; volatile should not be tied to any barriers. This
> is what I've seen in the kernel's code:
>
> /*
>  * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
>  * atomicity. Note that this may result in tears!
>  */
> #ifndef __READ_ONCE
> #define __READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
> #endif
>

Sorry, I wrote the message too fast without context,

I was looking at the tools/virtio/ringtest/main.h definition. Many other
userland implementations use only the volatile part.

We should be fine with only the volatile part (the definition you just
gave), as we already have the memory barrier from the caller. Looking
at tools/include/linux/compiler.h, we're in the second case:
"Ensuring that the compiler does not fold, spindle, or otherwise
mutilate accesses that either do not require ordering or that interact
with an explicit memory barrier or atomic instruction that provides
the required ordering."

Thanks!

> Thanks
>
> > I guess it's because the compiler can
> > reorder non-volatile accesses around volatile ones. In the proposed
> > code, that barrier is provided by the vhost_svq_more_used caller
> > (vhost_svq_get_buf). I think no other caller should need it.
> >
> > Thanks!
> >
> > > If I
> > > understand correctly, you want the load from the memory instead of the
> > > registers here.
> > >
> > > Thanks
> > >
> > >
> > > >
> > > > We didn't have that problem before, since we clear event_notifier
> > > > right before the do{}while(), and event loop should hit a memory
> > > > barrier in the next select / poll / read / whatever syscall to check
> > > > that event notifier fd is set again.
> > > >
> > > >> Since we are in the busy loop, we will read the new used_idx for
> > > >> sure,
> > > > I'm not so sure of that, but maybe I've missed something.
> > > >
> > > > I'm sending v3 with this comment pending, so we can iterate faster.
> > > >
> > > > Thanks!
> > > >
> > > >> and we can't forecast when the used_idx is committed to memory.
> > > >>
> > >
> >
>




* Re: [PATCH v2 18/19] vdpa: Add device migration blocker
  2022-07-15  8:50       ` Jason Wang
  2022-07-15  9:05         ` Eugenio Perez Martin
@ 2022-07-22 13:29         ` Eugenio Perez Martin
  1 sibling, 0 replies; 42+ messages in thread
From: Eugenio Perez Martin @ 2022-07-22 13:29 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Gautam Dawar, Paolo Bonzini, Michael S. Tsirkin,
	Stefano Garzarella, Eric Blake, Zhu Lingshan, Stefan Hajnoczi,
	Markus Armbruster, Cornelia Huck, Parav Pandit, Laurent Vivier,
	Liuxiangdong, Eli Cohen, Cindy Lu, Harpreet Singh Anand,
	Gonglei (Arei)

On Fri, Jul 15, 2022 at 10:51 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jul 15, 2022 at 1:40 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Fri, Jul 15, 2022 at 6:03 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > >
> > > > Since the vhost-vdpa device is exposing _F_LOG,
> > >
> > > I may miss something but I think it doesn't?
> > >
> >
> > It's at vhost_vdpa_get_features. As long as SVQ is enabled, it's
> > exposing VHOST_F_LOG_ALL.
>
> Ok, so this needs to be specified in the change log.

I tried to add the entry in the changelog but I don't have the
permission to do so.

Something like "Add new experimental x-svq option to migrate simple
vhost-vdpa net devices without CVQ"?

Thanks!




end of thread

Thread overview: 42+ messages
2022-07-14 16:31 [PATCH v2 00/19] vdpa net devices Rx filter change notification with Shadow VQ Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 01/19] vhost: move descriptor translation to vhost_svq_vring_write_descs Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 02/19] virtio-net: Expose MAC_TABLE_ENTRIES Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 03/19] virtio-net: Expose ctrl virtqueue logic Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 04/19] vhost: Reorder vhost_svq_kick Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 05/19] vhost: Move vhost_svq_kick call to vhost_svq_add Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 06/19] vhost: Check for queue full at vhost_svq_add Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 07/19] vhost: Decouple vhost_svq_add from VirtQueueElement Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 08/19] vhost: Add SVQElement Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 09/19] vhost: Track number of descs in SVQElement Eugenio Pérez
2022-07-15  4:10   ` Jason Wang
2022-07-15  5:41     ` Eugenio Perez Martin
2022-07-14 16:31 ` [PATCH v2 10/19] vhost: add vhost_svq_push_elem Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 11/19] vhost: Expose vhost_svq_add Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 12/19] vhost: add vhost_svq_poll Eugenio Pérez
2022-07-15  3:58   ` Jason Wang
2022-07-15  5:38     ` Eugenio Perez Martin
2022-07-15  8:47       ` Jason Wang
2022-07-15 17:05         ` Eugenio Perez Martin
2022-07-19  7:38           ` Jason Wang
2022-07-19  8:42             ` Eugenio Perez Martin
2022-07-19  8:48               ` Jason Wang
2022-07-19  9:09                 ` Eugenio Perez Martin
2022-07-14 16:31 ` [PATCH v2 13/19] vhost: Add svq avail_handler callback Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 14/19] vdpa: Export vhost_vdpa_dma_map and unmap calls Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 15/19] vdpa: manual forward CVQ buffers Eugenio Pérez
2022-07-15  4:08   ` Jason Wang
2022-07-15  5:33     ` Eugenio Perez Martin
2022-07-15  8:44       ` Jason Wang
2022-07-15  9:01         ` Eugenio Perez Martin
2022-07-14 16:31 ` [PATCH v2 16/19] vdpa: Buffer CVQ support on shadow virtqueue Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 17/19] vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs Eugenio Pérez
2022-07-14 16:31 ` [PATCH v2 18/19] vdpa: Add device migration blocker Eugenio Pérez
2022-07-15  4:03   ` Jason Wang
2022-07-15  5:39     ` Eugenio Perez Martin
2022-07-15  8:50       ` Jason Wang
2022-07-15  9:05         ` Eugenio Perez Martin
2022-07-15 16:13           ` Eugenio Perez Martin
2022-07-22 13:29         ` Eugenio Perez Martin
2022-07-14 16:31 ` [PATCH v2 19/19] vdpa: Add x-svq to NetdevVhostVDPAOptions Eugenio Pérez
2022-07-15  4:13   ` Jason Wang
2022-07-15  6:09     ` Eugenio Perez Martin
