* [PATCH v6 00/10] ASID support in vhost-vdpa net
@ 2022-11-08 17:07 Eugenio Pérez
  2022-11-08 17:07 ` [PATCH v6 01/10] vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop Eugenio Pérez
From: Eugenio Pérez @ 2022-11-08 17:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Jason Wang, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

Control VQ is the mechanism net devices use to send changes to the device
state, such as the number of active queues or the MAC address.

QEMU needs to intercept this queue so it can track these changes and migrate
the device. It has been able to do so since commit 1576dbb5bbc4 ("vdpa: Add
x-svq to NetdevVhostVDPAOptions"). However, enabling x-svq implies shadowing
all of the VirtIO device's virtqueues, which hurts performance.

This series adds address space isolation, so the guest and the device
communicate directly for the data virtqueues (passthrough), while CVQ
communication is split in two: the guest communicates with QEMU, and QEMU
forwards the commands to the device.

Comments are welcome. Thanks!

v6:
- Do not allocate SVQ resources like file descriptors if SVQ cannot be used.
- Disable shadow CVQ if the device does not support it because of net
  features.

v5:
- Move vring state in vhost_vdpa_get_vring_group instead of using a
  parameter.
- Rename VHOST_VDPA_NET_CVQ_PASSTHROUGH to VHOST_VDPA_NET_DATA_ASID

v4:
- Rebased on the latest CVQ start series, which allocates CVQ cmd buffers at
  load
- Squash vhost_vdpa_cvq_group_is_independent.
- Do not check for the CVQ index in vhost_vdpa_net_prepare; only one callback
  is registered in that NetClientInfo.
- Add a comment specifying the behavior if the device does not support _F_ASID
- Update headers to a later Linux commit so as not to remove SETUP_RNG_SEED

v3:
- Do not return an error, but just print a warning, if vdpa device
  initialization fails while getting the number of ASes / VQ groups
- Delete extra newline

v2:
- As commented on series [1], handle the vhost_net backend through
  NetClientInfo callbacks instead of directly.
- Fix not freeing SVQ properly when device does not support CVQ
- Add missing BIT_ULL when checking the device's backend features for
  _F_ASID.

Eugenio Pérez (10):
  vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop
  vhost: set SVQ device call handler at SVQ start
  vhost: Allocate SVQ device file descriptors at device start
  vdpa: add vhost_vdpa_net_valid_svq_features
  vdpa: move SVQ vring features check to net/
  vdpa: Allocate SVQ unconditionally
  vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap
  vdpa: Store x-svq parameter in VhostVDPAState
  vdpa: Add listener_shadow_vq to vhost_vdpa
  vdpa: Always start CVQ in SVQ mode

 include/hw/virtio/vhost-vdpa.h     |  10 +-
 hw/virtio/vhost-shadow-virtqueue.c |  35 +-----
 hw/virtio/vhost-vdpa.c             | 114 ++++++++++---------
 net/vhost-vdpa.c                   | 171 ++++++++++++++++++++++++++---
 hw/virtio/trace-events             |   4 +-
 5 files changed, 222 insertions(+), 112 deletions(-)

-- 
2.31.1




* [PATCH v6 01/10] vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop
  2022-11-08 17:07 [PATCH v6 00/10] ASID support in vhost-vdpa net Eugenio Pérez
@ 2022-11-08 17:07 ` Eugenio Pérez
  2022-11-10  5:21   ` Jason Wang
  2022-11-08 17:07 ` [PATCH v6 02/10] vhost: set SVQ device call handler at SVQ start Eugenio Pérez
From: Eugenio Pérez @ 2022-11-08 17:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Jason Wang, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

These functions used to rely on v->shadow_vqs != NULL to know whether SVQ
must be started or not.

This will no longer be valid, as QEMU is going to allocate SVQ
unconditionally (but only start it conditionally).

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7468e44b87..7f0ff4df5b 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1029,7 +1029,7 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
     Error *err = NULL;
     unsigned i;
 
-    if (!v->shadow_vqs) {
+    if (!v->shadow_vqs_enabled) {
         return true;
     }
 
@@ -1082,7 +1082,7 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
 {
     struct vhost_vdpa *v = dev->opaque;
 
-    if (!v->shadow_vqs) {
+    if (!v->shadow_vqs_enabled) {
         return;
     }
 
-- 
2.31.1



* [PATCH v6 02/10] vhost: set SVQ device call handler at SVQ start
  2022-11-08 17:07 [PATCH v6 00/10] ASID support in vhost-vdpa net Eugenio Pérez
  2022-11-08 17:07 ` [PATCH v6 01/10] vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop Eugenio Pérez
@ 2022-11-08 17:07 ` Eugenio Pérez
  2022-11-10  5:22   ` Jason Wang
  2022-11-08 17:07 ` [PATCH v6 03/10] vhost: Allocate SVQ device file descriptors at device start Eugenio Pérez
From: Eugenio Pérez @ 2022-11-08 17:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Jason Wang, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

By the end of this series, CVQ is shadowed as long as the features support
it.

Since we don't know at QEMU startup whether this is supported, move the
setting of the event notifier handler from QEMU initialization to SVQ start.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 5bd14cad96..264ddc166d 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -648,6 +648,7 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
 {
     size_t desc_size, driver_size, device_size;
 
+    event_notifier_set_handler(&svq->hdev_call, vhost_svq_handle_call);
     svq->next_guest_avail_elem = NULL;
     svq->shadow_avail_idx = 0;
     svq->shadow_used_idx = 0;
@@ -704,6 +705,7 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
     g_free(svq->desc_state);
     qemu_vfree(svq->vring.desc);
     qemu_vfree(svq->vring.used);
+    event_notifier_set_handler(&svq->hdev_call, NULL);
 }
 
 /**
@@ -740,7 +742,6 @@ VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree,
     }
 
     event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
-    event_notifier_set_handler(&svq->hdev_call, vhost_svq_handle_call);
     svq->iova_tree = iova_tree;
     svq->ops = ops;
     svq->ops_opaque = ops_opaque;
@@ -763,7 +764,6 @@ void vhost_svq_free(gpointer pvq)
     VhostShadowVirtqueue *vq = pvq;
     vhost_svq_stop(vq);
     event_notifier_cleanup(&vq->hdev_kick);
-    event_notifier_set_handler(&vq->hdev_call, NULL);
     event_notifier_cleanup(&vq->hdev_call);
     g_free(vq);
 }
-- 
2.31.1



* [PATCH v6 03/10] vhost: Allocate SVQ device file descriptors at device start
  2022-11-08 17:07 [PATCH v6 00/10] ASID support in vhost-vdpa net Eugenio Pérez
  2022-11-08 17:07 ` [PATCH v6 01/10] vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop Eugenio Pérez
  2022-11-08 17:07 ` [PATCH v6 02/10] vhost: set SVQ device call handler at SVQ start Eugenio Pérez
@ 2022-11-08 17:07 ` Eugenio Pérez
  2022-11-10  5:28   ` Jason Wang
  2022-11-08 17:07 ` [PATCH v6 04/10] vdpa: add vhost_vdpa_net_valid_svq_features Eugenio Pérez
From: Eugenio Pérez @ 2022-11-08 17:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Jason Wang, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

The next patches will start the control SVQ if possible. However, we no
longer know whether that will be possible at QEMU boot.

Delay the creation of the device file descriptors until device start, when we
know it.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 31 ++------------------------
 hw/virtio/vhost-vdpa.c             | 35 ++++++++++++++++++++++++------
 2 files changed, 30 insertions(+), 36 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 264ddc166d..3b05bab44d 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -715,43 +715,18 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
  * @iova_tree: Tree to perform descriptors translations
  * @ops: SVQ owner callbacks
  * @ops_opaque: ops opaque pointer
- *
- * Returns the new virtqueue or NULL.
- *
- * In case of error, reason is reported through error_report.
  */
 VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree,
                                     const VhostShadowVirtqueueOps *ops,
                                     void *ops_opaque)
 {
-    g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
-    int r;
-
-    r = event_notifier_init(&svq->hdev_kick, 0);
-    if (r != 0) {
-        error_report("Couldn't create kick event notifier: %s (%d)",
-                     g_strerror(errno), errno);
-        goto err_init_hdev_kick;
-    }
-
-    r = event_notifier_init(&svq->hdev_call, 0);
-    if (r != 0) {
-        error_report("Couldn't create call event notifier: %s (%d)",
-                     g_strerror(errno), errno);
-        goto err_init_hdev_call;
-    }
+    VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
 
     event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
     svq->iova_tree = iova_tree;
     svq->ops = ops;
     svq->ops_opaque = ops_opaque;
-    return g_steal_pointer(&svq);
-
-err_init_hdev_call:
-    event_notifier_cleanup(&svq->hdev_kick);
-
-err_init_hdev_kick:
-    return NULL;
+    return svq;
 }
 
 /**
@@ -763,7 +738,5 @@ void vhost_svq_free(gpointer pvq)
 {
     VhostShadowVirtqueue *vq = pvq;
     vhost_svq_stop(vq);
-    event_notifier_cleanup(&vq->hdev_kick);
-    event_notifier_cleanup(&vq->hdev_call);
     g_free(vq);
 }
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7f0ff4df5b..3df2775760 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -428,15 +428,11 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
 
     shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
     for (unsigned n = 0; n < hdev->nvqs; ++n) {
-        g_autoptr(VhostShadowVirtqueue) svq;
+        VhostShadowVirtqueue *svq;
 
         svq = vhost_svq_new(v->iova_tree, v->shadow_vq_ops,
                             v->shadow_vq_ops_opaque);
-        if (unlikely(!svq)) {
-            error_setg(errp, "Cannot create svq %u", n);
-            return -1;
-        }
-        g_ptr_array_add(shadow_vqs, g_steal_pointer(&svq));
+        g_ptr_array_add(shadow_vqs, svq);
     }
 
     v->shadow_vqs = g_steal_pointer(&shadow_vqs);
@@ -864,11 +860,23 @@ static int vhost_vdpa_svq_set_fds(struct vhost_dev *dev,
     const EventNotifier *event_notifier = &svq->hdev_kick;
     int r;
 
+    r = event_notifier_init(&svq->hdev_kick, 0);
+    if (r != 0) {
+        error_setg_errno(errp, -r, "Couldn't create kick event notifier");
+        goto err_init_hdev_kick;
+    }
+
+    r = event_notifier_init(&svq->hdev_call, 0);
+    if (r != 0) {
+        error_setg_errno(errp, -r, "Couldn't create call event notifier");
+        goto err_init_hdev_call;
+    }
+
     file.fd = event_notifier_get_fd(event_notifier);
     r = vhost_vdpa_set_vring_dev_kick(dev, &file);
     if (unlikely(r != 0)) {
         error_setg_errno(errp, -r, "Can't set device kick fd");
-        return r;
+        goto err_init_set_dev_fd;
     }
 
     event_notifier = &svq->hdev_call;
@@ -876,8 +884,18 @@ static int vhost_vdpa_svq_set_fds(struct vhost_dev *dev,
     r = vhost_vdpa_set_vring_dev_call(dev, &file);
     if (unlikely(r != 0)) {
         error_setg_errno(errp, -r, "Can't set device call fd");
+        goto err_init_set_dev_fd;
     }
 
+    return 0;
+
+err_init_set_dev_fd:
+    event_notifier_set_handler(&svq->hdev_call, NULL);
+
+err_init_hdev_call:
+    event_notifier_cleanup(&svq->hdev_kick);
+
+err_init_hdev_kick:
     return r;
 }
 
@@ -1089,6 +1107,9 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
     for (unsigned i = 0; i < v->shadow_vqs->len; ++i) {
         VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
         vhost_vdpa_svq_unmap_rings(dev, svq);
+
+        event_notifier_cleanup(&svq->hdev_kick);
+        event_notifier_cleanup(&svq->hdev_call);
     }
 }
 
-- 
2.31.1



* [PATCH v6 04/10] vdpa: add vhost_vdpa_net_valid_svq_features
  2022-11-08 17:07 [PATCH v6 00/10] ASID support in vhost-vdpa net Eugenio Pérez
  2022-11-08 17:07 ` [PATCH v6 03/10] vhost: Allocate SVQ device file descriptors at device start Eugenio Pérez
@ 2022-11-08 17:07 ` Eugenio Pérez
  2022-11-10  5:29   ` Jason Wang
  2022-11-08 17:07 ` [PATCH v6 05/10] vdpa: move SVQ vring features check to net/ Eugenio Pérez
From: Eugenio Pérez @ 2022-11-08 17:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Jason Wang, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

It will be reused at vdpa device start, so extract it into its own function.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index e370ecb8eb..d3b1de481b 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -106,6 +106,22 @@ VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
     return s->vhost_net;
 }
 
+static bool vhost_vdpa_net_valid_svq_features(uint64_t features, Error **errp)
+{
+    uint64_t invalid_dev_features =
+        features & ~vdpa_svq_device_features &
+        /* Transport are all accepted at this point */
+        ~MAKE_64BIT_MASK(VIRTIO_TRANSPORT_F_START,
+                         VIRTIO_TRANSPORT_F_END - VIRTIO_TRANSPORT_F_START);
+
+    if (invalid_dev_features) {
+        error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
+                   invalid_dev_features);
+    }
+
+    return !invalid_dev_features;
+}
+
 static int vhost_vdpa_net_check_device_id(struct vhost_net *net)
 {
     uint32_t device_id;
@@ -675,15 +691,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     if (opts->x_svq) {
         struct vhost_vdpa_iova_range iova_range;
 
-        uint64_t invalid_dev_features =
-            features & ~vdpa_svq_device_features &
-            /* Transport are all accepted at this point */
-            ~MAKE_64BIT_MASK(VIRTIO_TRANSPORT_F_START,
-                             VIRTIO_TRANSPORT_F_END - VIRTIO_TRANSPORT_F_START);
-
-        if (invalid_dev_features) {
-            error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
-                       invalid_dev_features);
+        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
             goto err_svq;
         }
 
-- 
2.31.1



* [PATCH v6 05/10] vdpa: move SVQ vring features check to net/
  2022-11-08 17:07 [PATCH v6 00/10] ASID support in vhost-vdpa net Eugenio Pérez
  2022-11-08 17:07 ` [PATCH v6 04/10] vdpa: add vhost_vdpa_net_valid_svq_features Eugenio Pérez
@ 2022-11-08 17:07 ` Eugenio Pérez
  2022-11-10  5:40   ` Jason Wang
  2022-11-08 17:07 ` [PATCH v6 06/10] vdpa: Allocate SVQ unconditionally Eugenio Pérez
From: Eugenio Pérez @ 2022-11-08 17:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Jason Wang, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

The next patches will start the control SVQ if possible. However, we no
longer know whether that will be possible at QEMU boot.

Since the moved checks will already be evaluated in net/ to decide whether it
is OK to shadow CVQ, move them there.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 33 ++-------------------------------
 net/vhost-vdpa.c       |  3 ++-
 2 files changed, 4 insertions(+), 32 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 3df2775760..146f0dcb40 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -402,29 +402,9 @@ static int vhost_vdpa_get_dev_features(struct vhost_dev *dev,
     return ret;
 }
 
-static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
-                               Error **errp)
+static void vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v)
 {
     g_autoptr(GPtrArray) shadow_vqs = NULL;
-    uint64_t dev_features, svq_features;
-    int r;
-    bool ok;
-
-    if (!v->shadow_vqs_enabled) {
-        return 0;
-    }
-
-    r = vhost_vdpa_get_dev_features(hdev, &dev_features);
-    if (r != 0) {
-        error_setg_errno(errp, -r, "Can't get vdpa device features");
-        return r;
-    }
-
-    svq_features = dev_features;
-    ok = vhost_svq_valid_features(svq_features, errp);
-    if (unlikely(!ok)) {
-        return -1;
-    }
 
     shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
     for (unsigned n = 0; n < hdev->nvqs; ++n) {
@@ -436,7 +416,6 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
     }
 
     v->shadow_vqs = g_steal_pointer(&shadow_vqs);
-    return 0;
 }
 
 static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
@@ -461,11 +440,7 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
     dev->opaque =  opaque ;
     v->listener = vhost_vdpa_memory_listener;
     v->msg_type = VHOST_IOTLB_MSG_V2;
-    ret = vhost_vdpa_init_svq(dev, v, errp);
-    if (ret) {
-        goto err;
-    }
-
+    vhost_vdpa_init_svq(dev, v);
     vhost_vdpa_get_iova_range(v);
 
     if (!vhost_vdpa_first_dev(dev)) {
@@ -476,10 +451,6 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
                                VIRTIO_CONFIG_S_DRIVER);
 
     return 0;
-
-err:
-    ram_block_discard_disable(false);
-    return ret;
 }
 
 static void vhost_vdpa_host_notifier_uninit(struct vhost_dev *dev,
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index d3b1de481b..fb35b17ab4 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -117,9 +117,10 @@ static bool vhost_vdpa_net_valid_svq_features(uint64_t features, Error **errp)
     if (invalid_dev_features) {
         error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
                    invalid_dev_features);
+        return false;
     }
 
-    return !invalid_dev_features;
+    return vhost_svq_valid_features(features, errp);
 }
 
 static int vhost_vdpa_net_check_device_id(struct vhost_net *net)
-- 
2.31.1



* [PATCH v6 06/10] vdpa: Allocate SVQ unconditionally
  2022-11-08 17:07 [PATCH v6 00/10] ASID support in vhost-vdpa net Eugenio Pérez
  2022-11-08 17:07 ` [PATCH v6 05/10] vdpa: move SVQ vring features check to net/ Eugenio Pérez
@ 2022-11-08 17:07 ` Eugenio Pérez
  2022-11-08 17:07 ` [PATCH v6 07/10] vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap Eugenio Pérez
From: Eugenio Pérez @ 2022-11-08 17:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Jason Wang, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

SVQ may or may not run in a device depending on runtime conditions (for
example, whether the device can move CVQ to its own vq group).

Allocate the SVQ array unconditionally at startup, since it's hard to move
this allocation elsewhere.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 146f0dcb40..23efb8f49d 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -547,10 +547,6 @@ static void vhost_vdpa_svq_cleanup(struct vhost_dev *dev)
     struct vhost_vdpa *v = dev->opaque;
     size_t idx;
 
-    if (!v->shadow_vqs) {
-        return;
-    }
-
     for (idx = 0; idx < v->shadow_vqs->len; ++idx) {
         vhost_svq_stop(g_ptr_array_index(v->shadow_vqs, idx));
     }
-- 
2.31.1



* [PATCH v6 07/10] vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap
  2022-11-08 17:07 [PATCH v6 00/10] ASID support in vhost-vdpa net Eugenio Pérez
  2022-11-08 17:07 ` [PATCH v6 06/10] vdpa: Allocate SVQ unconditionally Eugenio Pérez
@ 2022-11-08 17:07 ` Eugenio Pérez
  2022-11-10  5:50   ` Jason Wang
  2022-11-08 17:07 ` [PATCH v6 08/10] vdpa: Store x-svq parameter in VhostVDPAState Eugenio Pérez
From: Eugenio Pérez @ 2022-11-08 17:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Jason Wang, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

So the caller can choose the destination ASID.

There is no need to update the batch functions, as at the moment they will
always be called from memory listener updates. Memory listener updates will
always update ASID 0, as it's the passthrough ASID.

All vhost devices' ASIDs are 0 at this moment.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
v5:
* Solve conflict, now vhost_vdpa_svq_unmap_ring returns void
* Change comment on zero initialization.

v4: Add comment specifying behavior if device does not support _F_ASID

v3: Deleted unneeded space
---
 include/hw/virtio/vhost-vdpa.h |  8 +++++---
 hw/virtio/vhost-vdpa.c         | 29 +++++++++++++++++++----------
 net/vhost-vdpa.c               |  6 +++---
 hw/virtio/trace-events         |  4 ++--
 4 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 1111d85643..6560bb9d78 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -29,6 +29,7 @@ typedef struct vhost_vdpa {
     int index;
     uint32_t msg_type;
     bool iotlb_batch_begin_sent;
+    uint32_t address_space_id;
     MemoryListener listener;
     struct vhost_vdpa_iova_range iova_range;
     uint64_t acked_features;
@@ -42,8 +43,9 @@ typedef struct vhost_vdpa {
     VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
 } VhostVDPA;
 
-int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
-                       void *vaddr, bool readonly);
-int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
+int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
+                       hwaddr size, void *vaddr, bool readonly);
+int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
+                         hwaddr size);
 
 #endif
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 23efb8f49d..8fd32ba32b 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -72,22 +72,24 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
     return false;
 }
 
-int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
-                       void *vaddr, bool readonly)
+int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
+                       hwaddr size, void *vaddr, bool readonly)
 {
     struct vhost_msg_v2 msg = {};
     int fd = v->device_fd;
     int ret = 0;
 
     msg.type = v->msg_type;
+    msg.asid = asid; /* 0 if vdpa device does not support asid */
     msg.iotlb.iova = iova;
     msg.iotlb.size = size;
     msg.iotlb.uaddr = (uint64_t)(uintptr_t)vaddr;
     msg.iotlb.perm = readonly ? VHOST_ACCESS_RO : VHOST_ACCESS_RW;
     msg.iotlb.type = VHOST_IOTLB_UPDATE;
 
-   trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.iotlb.iova, msg.iotlb.size,
-                            msg.iotlb.uaddr, msg.iotlb.perm, msg.iotlb.type);
+    trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.asid, msg.iotlb.iova,
+                             msg.iotlb.size, msg.iotlb.uaddr, msg.iotlb.perm,
+                             msg.iotlb.type);
 
     if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
         error_report("failed to write, fd=%d, errno=%d (%s)",
@@ -98,18 +100,24 @@ int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
     return ret;
 }
 
-int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size)
+int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
+                         hwaddr size)
 {
     struct vhost_msg_v2 msg = {};
     int fd = v->device_fd;
     int ret = 0;
 
     msg.type = v->msg_type;
+    /*
+     * The caller must set asid = 0 if the device does not support asid.
+     * This is not an ABI break since it is set to 0 by the initializer anyway.
+     */
+    msg.asid = asid;
     msg.iotlb.iova = iova;
     msg.iotlb.size = size;
     msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
 
-    trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.iotlb.iova,
+    trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.asid, msg.iotlb.iova,
                                msg.iotlb.size, msg.iotlb.type);
 
     if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
@@ -229,7 +237,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
     }
 
     vhost_vdpa_iotlb_batch_begin_once(v);
-    ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
+    ret = vhost_vdpa_dma_map(v, 0, iova, int128_get64(llsize),
                              vaddr, section->readonly);
     if (ret) {
         error_report("vhost vdpa map fail!");
@@ -303,7 +311,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
         vhost_iova_tree_remove(v->iova_tree, *result);
     }
     vhost_vdpa_iotlb_batch_begin_once(v);
-    ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
+    ret = vhost_vdpa_dma_unmap(v, 0, iova, int128_get64(llsize));
     if (ret) {
         error_report("vhost_vdpa dma unmap error!");
     }
@@ -884,7 +892,7 @@ static void vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v, hwaddr addr)
     }
 
     size = ROUND_UP(result->size, qemu_real_host_page_size());
-    r = vhost_vdpa_dma_unmap(v, result->iova, size);
+    r = vhost_vdpa_dma_unmap(v, v->address_space_id, result->iova, size);
     if (unlikely(r < 0)) {
         error_report("Unable to unmap SVQ vring: %s (%d)", g_strerror(-r), -r);
         return;
@@ -924,7 +932,8 @@ static bool vhost_vdpa_svq_map_ring(struct vhost_vdpa *v, DMAMap *needle,
         return false;
     }
 
-    r = vhost_vdpa_dma_map(v, needle->iova, needle->size + 1,
+    r = vhost_vdpa_dma_map(v, v->address_space_id, needle->iova,
+                           needle->size + 1,
                            (void *)(uintptr_t)needle->translated_addr,
                            needle->perm == IOMMU_RO);
     if (unlikely(r != 0)) {
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index fb35b17ab4..ca1acc0410 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -258,7 +258,7 @@ static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
         return;
     }
 
-    r = vhost_vdpa_dma_unmap(v, map->iova, map->size + 1);
+    r = vhost_vdpa_dma_unmap(v, v->address_space_id, map->iova, map->size + 1);
     if (unlikely(r != 0)) {
         error_report("Device cannot unmap: %s(%d)", g_strerror(r), r);
     }
@@ -298,8 +298,8 @@ static int vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v, void *buf, size_t size,
         return r;
     }
 
-    r = vhost_vdpa_dma_map(v, map.iova, vhost_vdpa_net_cvq_cmd_page_len(), buf,
-                           !write);
+    r = vhost_vdpa_dma_map(v, v->address_space_id, map.iova,
+                           vhost_vdpa_net_cvq_cmd_page_len(), buf, !write);
     if (unlikely(r < 0)) {
         goto dma_map_err;
     }
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 820dadc26c..0ad9390307 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -30,8 +30,8 @@ vhost_user_write(uint32_t req, uint32_t flags) "req:%d flags:0x%"PRIx32""
 vhost_user_create_notifier(int idx, void *n) "idx:%d n:%p"
 
 # vhost-vdpa.c
-vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
-vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
+vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint32_t asid, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" asid: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
+vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint32_t asid, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" asid: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
 vhost_vdpa_listener_begin_batch(void *v, int fd, uint32_t msg_type, uint8_t type)  "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
 vhost_vdpa_listener_commit(void *v, int fd, uint32_t msg_type, uint8_t type)  "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
 vhost_vdpa_listener_region_add(void *vdpa, uint64_t iova, uint64_t llend, void *vaddr, bool readonly) "vdpa: %p iova 0x%"PRIx64" llend 0x%"PRIx64" vaddr: %p read-only: %d"
-- 
2.31.1



* [PATCH v6 08/10] vdpa: Store x-svq parameter in VhostVDPAState
  2022-11-08 17:07 [PATCH v6 00/10] ASID support in vhost-vdpa net Eugenio Pérez
                   ` (6 preceding siblings ...)
  2022-11-08 17:07 ` [PATCH v6 07/10] vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap Eugenio Pérez
@ 2022-11-08 17:07 ` Eugenio Pérez
  2022-11-08 17:07 ` [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa Eugenio Pérez
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 46+ messages in thread
From: Eugenio Pérez @ 2022-11-08 17:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Jason Wang, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

CVQ can be shadowed in two ways:
- The device has the x-svq=on parameter (current way)
- The device can isolate CVQ in its own vq group

QEMU needs to check for the second condition dynamically, because the
CVQ index is not known at initialization time. Since the check is
dynamic, CVQ isolation could vary with different conditions, making it
possible to go from "not isolated group" to "isolated".

Save the cmdline parameter in an extra field so we never disable CVQ
SVQ in case the device was started with it.
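
The resulting rule can be sketched as pure logic (a hypothetical
helper, not QEMU code; the names are illustrative):

```c
#include <stdbool.h>

/* Hypothetical sketch of the decision described above: the x-svq cmdline
 * value is kept in its own field (always_svq) so that a dynamic "CVQ is
 * not isolated" result can never turn off shadowing the user asked for. */
struct svq_decision {
    bool shadow_data; /* shadow the data virtqueues */
    bool shadow_cvq;  /* shadow the control virtqueue */
};

static struct svq_decision decide_svq(bool always_svq, bool cvq_isolated)
{
    struct svq_decision d;

    /* Data vqs are shadowed only when the user passed x-svq=on. */
    d.shadow_data = always_svq;
    /* CVQ is shadowed on user request or when the device isolates it. */
    d.shadow_cvq = always_svq || cvq_isolated;
    return d;
}
```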

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index ca1acc0410..85a318faca 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -38,6 +38,8 @@ typedef struct VhostVDPAState {
     void *cvq_cmd_out_buffer;
     virtio_net_ctrl_ack *status;
 
+    /* The device always has SVQ enabled */
+    bool always_svq;
     bool started;
 } VhostVDPAState;
 
@@ -566,6 +568,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
 
     s->vhost_vdpa.device_fd = vdpa_device_fd;
     s->vhost_vdpa.index = queue_pair_index;
+    s->always_svq = svq;
     s->vhost_vdpa.shadow_vqs_enabled = svq;
     s->vhost_vdpa.iova_tree = iova_tree;
     if (!is_datapath) {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa
  2022-11-08 17:07 [PATCH v6 00/10] ASID support in vhost-vdpa net Eugenio Pérez
                   ` (7 preceding siblings ...)
  2022-11-08 17:07 ` [PATCH v6 08/10] vdpa: Store x-svq parameter in VhostVDPAState Eugenio Pérez
@ 2022-11-08 17:07 ` Eugenio Pérez
  2022-11-10  6:00   ` Jason Wang
  2022-11-08 17:07 ` [PATCH v6 10/10] vdpa: Always start CVQ in SVQ mode Eugenio Pérez
  2022-11-10 12:25 ` [PATCH v6 00/10] ASID support in vhost-vdpa net Michael S. Tsirkin
  10 siblings, 1 reply; 46+ messages in thread
From: Eugenio Pérez @ 2022-11-08 17:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Jason Wang, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

The memory listener that tells the device how to convert GPA to qemu's
VA is registered against the CVQ vhost_vdpa. This series maps the
memory listener translations to ASID 0, while it maps the CVQ ones to
ASID 1.

Let's tell the listener whether it needs to register its translations
on the iova tree or not.
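
The effect of the new flag can be sketched like this (a hypothetical,
heavily simplified model, not QEMU code; the flat-offset "translation"
stands in for the real IOVA tree allocation):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch of what listener_shadow_vq controls: whether the
 * memory listener hands the device an IOVA-tree translation (shadowed
 * case) or the guest physical address unchanged (passthrough case). */
static uint64_t listener_dma_addr(bool listener_shadow_vq, uint64_t gpa,
                                  uint64_t iova_base)
{
    if (listener_shadow_vq) {
        /* Shadowed: translate through the IOVA tree, modeled here as a
         * flat offset for illustration only. */
        return iova_base + gpa;
    }
    /* Passthrough: the device uses guest physical addresses directly. */
    return gpa;
}
```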

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
v5: Solve conflict about vhost_iova_tree_remove accepting mem_region by
    value.
---
 include/hw/virtio/vhost-vdpa.h | 2 ++
 hw/virtio/vhost-vdpa.c         | 6 +++---
 net/vhost-vdpa.c               | 1 +
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 6560bb9d78..0c3ed2d69b 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -34,6 +34,8 @@ typedef struct vhost_vdpa {
     struct vhost_vdpa_iova_range iova_range;
     uint64_t acked_features;
     bool shadow_vqs_enabled;
+    /* The listener must send iova tree addresses, not GPA */
+    bool listener_shadow_vq;
     /* IOVA mapping used by the Shadow Virtqueue */
     VhostIOVATree *iova_tree;
     GPtrArray *shadow_vqs;
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 8fd32ba32b..e3914fa40e 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -220,7 +220,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
                                          vaddr, section->readonly);
 
     llsize = int128_sub(llend, int128_make64(iova));
-    if (v->shadow_vqs_enabled) {
+    if (v->listener_shadow_vq) {
         int r;
 
         mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
@@ -247,7 +247,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
     return;
 
 fail_map:
-    if (v->shadow_vqs_enabled) {
+    if (v->listener_shadow_vq) {
         vhost_iova_tree_remove(v->iova_tree, mem_region);
     }
 
@@ -292,7 +292,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
 
     llsize = int128_sub(llend, int128_make64(iova));
 
-    if (v->shadow_vqs_enabled) {
+    if (v->listener_shadow_vq) {
         const DMAMap *result;
         const void *vaddr = memory_region_get_ram_ptr(section->mr) +
             section->offset_within_region +
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 85a318faca..02780ee37b 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -570,6 +570,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
     s->vhost_vdpa.index = queue_pair_index;
     s->always_svq = svq;
     s->vhost_vdpa.shadow_vqs_enabled = svq;
+    s->vhost_vdpa.listener_shadow_vq = svq;
     s->vhost_vdpa.iova_tree = iova_tree;
     if (!is_datapath) {
         s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v6 10/10] vdpa: Always start CVQ in SVQ mode
  2022-11-08 17:07 [PATCH v6 00/10] ASID support in vhost-vdpa net Eugenio Pérez
                   ` (8 preceding siblings ...)
  2022-11-08 17:07 ` [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa Eugenio Pérez
@ 2022-11-08 17:07 ` Eugenio Pérez
  2022-11-10  6:24   ` Jason Wang
  2022-11-10 12:25 ` [PATCH v6 00/10] ASID support in vhost-vdpa net Michael S. Tsirkin
  10 siblings, 1 reply; 46+ messages in thread
From: Eugenio Pérez @ 2022-11-08 17:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Jason Wang, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

Isolate the control virtqueue in its own group, allowing QEMU to
intercept control commands while letting the dataplane run totally
passthrough to the guest.
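
The isolation check performed at CVQ start can be sketched as pure
logic (a hypothetical helper; the real patch queries each group with
the VHOST_VDPA_GET_VRING_GROUP ioctl instead of taking an array):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of the CVQ isolation check: CVQ can be moved to
 * its own ASID only if no data virtqueue shares its vq group. CVQ is
 * assumed to be the last virtqueue, as in virtio-net. */
static bool cvq_group_is_isolated(const uint32_t *groups, size_t nvqs)
{
    uint32_t cvq_group = groups[nvqs - 1];

    for (size_t i = 0; i + 1 < nvqs; ++i) {
        if (groups[i] == cvq_group) {
            /* A data vq shares the CVQ group: cannot isolate. */
            return false;
        }
    }
    return true;
}
```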

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
v6:
* Disable control SVQ if the device does not support it because of
features.

v5:
* Fix not adding CVQ buffers when x-svq=on is specified.
* Move vring state in vhost_vdpa_get_vring_group instead of using a
  parameter.
* Rename VHOST_VDPA_NET_CVQ_PASSTHROUGH to VHOST_VDPA_NET_DATA_ASID

v4:
* Squash vhost_vdpa_cvq_group_is_independent.
* Rebased on last CVQ start series, that allocated CVQ cmd bufs at load
* Do not check for cvq index on vhost_vdpa_net_prepare, we only have one
  that callback registered in that NetClientInfo.

v3:
* Make asid related queries print a warning instead of returning an
  error and stop the start of qemu.
---
 hw/virtio/vhost-vdpa.c |   3 +-
 net/vhost-vdpa.c       | 138 ++++++++++++++++++++++++++++++++++++++---
 2 files changed, 132 insertions(+), 9 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index e3914fa40e..6401e7efb1 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -648,7 +648,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
 {
     uint64_t features;
     uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 |
-        0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH;
+        0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH |
+        0x1ULL << VHOST_BACKEND_F_IOTLB_ASID;
     int r;
 
     if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) {
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 02780ee37b..7245ea70c6 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -38,6 +38,9 @@ typedef struct VhostVDPAState {
     void *cvq_cmd_out_buffer;
     virtio_net_ctrl_ack *status;
 
+    /* Number of address spaces supported by the device */
+    unsigned address_space_num;
+
     /* The device always has SVQ enabled */
     bool always_svq;
     bool started;
@@ -101,6 +104,9 @@ static const uint64_t vdpa_svq_device_features =
     BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
     BIT_ULL(VIRTIO_NET_F_STANDBY);
 
+#define VHOST_VDPA_NET_DATA_ASID 0
+#define VHOST_VDPA_NET_CVQ_ASID 1
+
 VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
 {
     VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
@@ -242,6 +248,34 @@ static NetClientInfo net_vhost_vdpa_info = {
         .check_peer_type = vhost_vdpa_check_peer_type,
 };
 
+static uint32_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
+{
+    struct vhost_vring_state state = {
+        .index = vq_index,
+    };
+    int r = ioctl(device_fd, VHOST_VDPA_GET_VRING_GROUP, &state);
+
+    return r < 0 ? 0 : state.num;
+}
+
+static int vhost_vdpa_set_address_space_id(struct vhost_vdpa *v,
+                                           unsigned vq_group,
+                                           unsigned asid_num)
+{
+    struct vhost_vring_state asid = {
+        .index = vq_group,
+        .num = asid_num,
+    };
+    int ret;
+
+    ret = ioctl(v->device_fd, VHOST_VDPA_SET_GROUP_ASID, &asid);
+    if (unlikely(ret < 0)) {
+        warn_report("Can't set vq group %u asid %u, errno=%d (%s)",
+            asid.index, asid.num, errno, g_strerror(errno));
+    }
+    return ret;
+}
+
 static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
 {
     VhostIOVATree *tree = v->iova_tree;
@@ -316,11 +350,54 @@ dma_map_err:
 static int vhost_vdpa_net_cvq_start(NetClientState *nc)
 {
     VhostVDPAState *s;
-    int r;
+    struct vhost_vdpa *v;
+    uint32_t cvq_group;
+    int cvq_index, r;
 
     assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
 
     s = DO_UPCAST(VhostVDPAState, nc, nc);
+    v = &s->vhost_vdpa;
+
+    v->listener_shadow_vq = s->always_svq;
+    v->shadow_vqs_enabled = s->always_svq;
+    s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_DATA_ASID;
+
+    if (s->always_svq) {
+        goto out;
+    }
+
+    if (s->address_space_num < 2) {
+        return 0;
+    }
+
+    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
+        return 0;
+    }
+
+    /**
+     * Check if all data virtqueues of the device are in a different vq
+     * group than the last one (CVQ). The CVQ group is kept in cvq_group.
+     */
+    cvq_index = v->dev->vq_index_end - 1;
+    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
+    for (int i = 0; i < cvq_index; ++i) {
+        uint32_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
+
+        if (unlikely(group == cvq_group)) {
+            warn_report("CVQ %u group is the same as VQ %u one (%u)", cvq_group,
+                        i, group);
+            return 0;
+        }
+    }
+
+    r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
+    if (r == 0) {
+        v->shadow_vqs_enabled = true;
+        s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_CVQ_ASID;
+    }
+
+out:
     if (!s->vhost_vdpa.shadow_vqs_enabled) {
         return 0;
     }
@@ -542,12 +619,38 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
     .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
 };
 
+static uint32_t vhost_vdpa_get_as_num(int vdpa_device_fd)
+{
+    uint64_t features;
+    unsigned num_as;
+    int r;
+
+    r = ioctl(vdpa_device_fd, VHOST_GET_BACKEND_FEATURES, &features);
+    if (unlikely(r < 0)) {
+        warn_report("Cannot get backend features");
+        return 1;
+    }
+
+    if (!(features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
+        return 1;
+    }
+
+    r = ioctl(vdpa_device_fd, VHOST_VDPA_GET_AS_NUM, &num_as);
+    if (unlikely(r < 0)) {
+        warn_report("Cannot retrieve number of supported ASs");
+        return 1;
+    }
+
+    return num_as;
+}
+
 static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
                                            const char *device,
                                            const char *name,
                                            int vdpa_device_fd,
                                            int queue_pair_index,
                                            int nvqs,
+                                           unsigned nas,
                                            bool is_datapath,
                                            bool svq,
                                            VhostIOVATree *iova_tree)
@@ -566,6 +669,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
     qemu_set_info_str(nc, TYPE_VHOST_VDPA);
     s = DO_UPCAST(VhostVDPAState, nc, nc);
 
+    s->address_space_num = nas;
     s->vhost_vdpa.device_fd = vdpa_device_fd;
     s->vhost_vdpa.index = queue_pair_index;
     s->always_svq = svq;
@@ -652,6 +756,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     g_autoptr(VhostIOVATree) iova_tree = NULL;
     NetClientState *nc;
     int queue_pairs, r, i = 0, has_cvq = 0;
+    unsigned num_as = 1;
+    bool svq_cvq;
 
     assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
     opts = &netdev->u.vhost_vdpa;
@@ -693,12 +799,28 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
         return queue_pairs;
     }
 
-    if (opts->x_svq) {
-        struct vhost_vdpa_iova_range iova_range;
+    svq_cvq = opts->x_svq;
+    if (has_cvq && !opts->x_svq) {
+        num_as = vhost_vdpa_get_as_num(vdpa_device_fd);
+        svq_cvq = num_as > 1;
+    }
+
+    if (opts->x_svq || svq_cvq) {
+        Error *warn = NULL;
 
-        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
-            goto err_svq;
+        svq_cvq = vhost_vdpa_net_valid_svq_features(features,
+                                                   opts->x_svq ? errp : &warn);
+        if (!svq_cvq) {
+            if (opts->x_svq) {
+                goto err_svq;
+            } else {
+                warn_reportf_err(warn, "Cannot shadow CVQ: ");
+            }
         }
+    }
+
+    if (opts->x_svq || svq_cvq) {
+        struct vhost_vdpa_iova_range iova_range;
 
         vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
         iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
@@ -708,15 +830,15 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
 
     for (i = 0; i < queue_pairs; i++) {
         ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
-                                     vdpa_device_fd, i, 2, true, opts->x_svq,
-                                     iova_tree);
+                                     vdpa_device_fd, i, 2, num_as, true,
+                                     opts->x_svq, iova_tree);
         if (!ncs[i])
             goto err;
     }
 
     if (has_cvq) {
         nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
-                                 vdpa_device_fd, i, 1, false,
+                                 vdpa_device_fd, i, 1, num_as, false,
                                  opts->x_svq, iova_tree);
         if (!nc)
             goto err;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 01/10] vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop
  2022-11-08 17:07 ` [PATCH v6 01/10] vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop Eugenio Pérez
@ 2022-11-10  5:21   ` Jason Wang
  2022-11-10 12:54     ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-10  5:21 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> This function used to rely on v->shadow_vqs != NULL to know whether it
> must start SVQ or not.
>
> This is not going to be valid anymore, as qemu is going to allocate SVQ
> unconditionally (but it will only start them conditionally).

It might be a waste of memory if we did this. Any reason for this?

Thanks

>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  hw/virtio/vhost-vdpa.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 7468e44b87..7f0ff4df5b 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -1029,7 +1029,7 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
>      Error *err = NULL;
>      unsigned i;
>
> -    if (!v->shadow_vqs) {
> +    if (!v->shadow_vqs_enabled) {
>          return true;
>      }
>
> @@ -1082,7 +1082,7 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
>  {
>      struct vhost_vdpa *v = dev->opaque;
>
> -    if (!v->shadow_vqs) {
> +    if (!v->shadow_vqs_enabled) {
>          return;
>      }
>
> --
> 2.31.1
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 02/10] vhost: set SVQ device call handler at SVQ start
  2022-11-08 17:07 ` [PATCH v6 02/10] vhost: set SVQ device call handler at SVQ start Eugenio Pérez
@ 2022-11-10  5:22   ` Jason Wang
  0 siblings, 0 replies; 46+ messages in thread
From: Jason Wang @ 2022-11-10  5:22 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> By the end of this series CVQ is shadowed as long as the features
> support it.
>
> Since we don't know whether this is supported when qemu starts, move
> the event notifier handler setting to SVQ start, instead of qemu
> startup.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks

> ---
>  hw/virtio/vhost-shadow-virtqueue.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> index 5bd14cad96..264ddc166d 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.c
> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> @@ -648,6 +648,7 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
>  {
>      size_t desc_size, driver_size, device_size;
>
> +    event_notifier_set_handler(&svq->hdev_call, vhost_svq_handle_call);
>      svq->next_guest_avail_elem = NULL;
>      svq->shadow_avail_idx = 0;
>      svq->shadow_used_idx = 0;
> @@ -704,6 +705,7 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
>      g_free(svq->desc_state);
>      qemu_vfree(svq->vring.desc);
>      qemu_vfree(svq->vring.used);
> +    event_notifier_set_handler(&svq->hdev_call, NULL);
>  }
>
>  /**
> @@ -740,7 +742,6 @@ VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree,
>      }
>
>      event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
> -    event_notifier_set_handler(&svq->hdev_call, vhost_svq_handle_call);
>      svq->iova_tree = iova_tree;
>      svq->ops = ops;
>      svq->ops_opaque = ops_opaque;
> @@ -763,7 +764,6 @@ void vhost_svq_free(gpointer pvq)
>      VhostShadowVirtqueue *vq = pvq;
>      vhost_svq_stop(vq);
>      event_notifier_cleanup(&vq->hdev_kick);
> -    event_notifier_set_handler(&vq->hdev_call, NULL);
>      event_notifier_cleanup(&vq->hdev_call);
>      g_free(vq);
>  }
> --
> 2.31.1
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 03/10] vhost: Allocate SVQ device file descriptors at device start
  2022-11-08 17:07 ` [PATCH v6 03/10] vhost: Allocate SVQ device file descriptors at device start Eugenio Pérez
@ 2022-11-10  5:28   ` Jason Wang
  0 siblings, 0 replies; 46+ messages in thread
From: Jason Wang @ 2022-11-10  5:28 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> The next patches will start control SVQ if possible. However, we don't
> know if that will be possible at qemu boot anymore.
>
> Delay creating the device file descriptors until we know it at device start.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks

> ---
>  hw/virtio/vhost-shadow-virtqueue.c | 31 ++------------------------
>  hw/virtio/vhost-vdpa.c             | 35 ++++++++++++++++++++++++------
>  2 files changed, 30 insertions(+), 36 deletions(-)
>
> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> index 264ddc166d..3b05bab44d 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.c
> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> @@ -715,43 +715,18 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
>   * @iova_tree: Tree to perform descriptors translations
>   * @ops: SVQ owner callbacks
>   * @ops_opaque: ops opaque pointer
> - *
> - * Returns the new virtqueue or NULL.
> - *
> - * In case of error, reason is reported through error_report.
>   */
>  VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree,
>                                      const VhostShadowVirtqueueOps *ops,
>                                      void *ops_opaque)
>  {
> -    g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
> -    int r;
> -
> -    r = event_notifier_init(&svq->hdev_kick, 0);
> -    if (r != 0) {
> -        error_report("Couldn't create kick event notifier: %s (%d)",
> -                     g_strerror(errno), errno);
> -        goto err_init_hdev_kick;
> -    }
> -
> -    r = event_notifier_init(&svq->hdev_call, 0);
> -    if (r != 0) {
> -        error_report("Couldn't create call event notifier: %s (%d)",
> -                     g_strerror(errno), errno);
> -        goto err_init_hdev_call;
> -    }
> +    VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
>
>      event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
>      svq->iova_tree = iova_tree;
>      svq->ops = ops;
>      svq->ops_opaque = ops_opaque;
> -    return g_steal_pointer(&svq);
> -
> -err_init_hdev_call:
> -    event_notifier_cleanup(&svq->hdev_kick);
> -
> -err_init_hdev_kick:
> -    return NULL;
> +    return svq;
>  }
>
>  /**
> @@ -763,7 +738,5 @@ void vhost_svq_free(gpointer pvq)
>  {
>      VhostShadowVirtqueue *vq = pvq;
>      vhost_svq_stop(vq);
> -    event_notifier_cleanup(&vq->hdev_kick);
> -    event_notifier_cleanup(&vq->hdev_call);
>      g_free(vq);
>  }
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 7f0ff4df5b..3df2775760 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -428,15 +428,11 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
>
>      shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
>      for (unsigned n = 0; n < hdev->nvqs; ++n) {
> -        g_autoptr(VhostShadowVirtqueue) svq;
> +        VhostShadowVirtqueue *svq;
>
>          svq = vhost_svq_new(v->iova_tree, v->shadow_vq_ops,
>                              v->shadow_vq_ops_opaque);
> -        if (unlikely(!svq)) {
> -            error_setg(errp, "Cannot create svq %u", n);
> -            return -1;
> -        }
> -        g_ptr_array_add(shadow_vqs, g_steal_pointer(&svq));
> +        g_ptr_array_add(shadow_vqs, svq);
>      }
>
>      v->shadow_vqs = g_steal_pointer(&shadow_vqs);
> @@ -864,11 +860,23 @@ static int vhost_vdpa_svq_set_fds(struct vhost_dev *dev,
>      const EventNotifier *event_notifier = &svq->hdev_kick;
>      int r;
>
> +    r = event_notifier_init(&svq->hdev_kick, 0);
> +    if (r != 0) {
> +        error_setg_errno(errp, -r, "Couldn't create kick event notifier");
> +        goto err_init_hdev_kick;
> +    }
> +
> +    r = event_notifier_init(&svq->hdev_call, 0);
> +    if (r != 0) {
> +        error_setg_errno(errp, -r, "Couldn't create call event notifier");
> +        goto err_init_hdev_call;
> +    }
> +
>      file.fd = event_notifier_get_fd(event_notifier);
>      r = vhost_vdpa_set_vring_dev_kick(dev, &file);
>      if (unlikely(r != 0)) {
>          error_setg_errno(errp, -r, "Can't set device kick fd");
> -        return r;
> +        goto err_init_set_dev_fd;
>      }
>
>      event_notifier = &svq->hdev_call;
> @@ -876,8 +884,18 @@ static int vhost_vdpa_svq_set_fds(struct vhost_dev *dev,
>      r = vhost_vdpa_set_vring_dev_call(dev, &file);
>      if (unlikely(r != 0)) {
>          error_setg_errno(errp, -r, "Can't set device call fd");
> +        goto err_init_set_dev_fd;
>      }
>
> +    return 0;
> +
> +err_init_set_dev_fd:
> +    event_notifier_set_handler(&svq->hdev_call, NULL);
> +
> +err_init_hdev_call:
> +    event_notifier_cleanup(&svq->hdev_kick);
> +
> +err_init_hdev_kick:
>      return r;
>  }
>
> @@ -1089,6 +1107,9 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
>      for (unsigned i = 0; i < v->shadow_vqs->len; ++i) {
>          VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
>          vhost_vdpa_svq_unmap_rings(dev, svq);
> +
> +        event_notifier_cleanup(&svq->hdev_kick);
> +        event_notifier_cleanup(&svq->hdev_call);
>      }
>  }
>
> --
> 2.31.1
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 04/10] vdpa: add vhost_vdpa_net_valid_svq_features
  2022-11-08 17:07 ` [PATCH v6 04/10] vdpa: add vhost_vdpa_net_valid_svq_features Eugenio Pérez
@ 2022-11-10  5:29   ` Jason Wang
  0 siblings, 0 replies; 46+ messages in thread
From: Jason Wang @ 2022-11-10  5:29 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> It will be reused at vdpa device start, so let's extract it into its own function.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks

>  net/vhost-vdpa.c | 26 +++++++++++++++++---------
>  1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index e370ecb8eb..d3b1de481b 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -106,6 +106,22 @@ VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
>      return s->vhost_net;
>  }
>
> +static bool vhost_vdpa_net_valid_svq_features(uint64_t features, Error **errp)
> +{
> +    uint64_t invalid_dev_features =
> +        features & ~vdpa_svq_device_features &
> +        /* Transport are all accepted at this point */
> +        ~MAKE_64BIT_MASK(VIRTIO_TRANSPORT_F_START,
> +                         VIRTIO_TRANSPORT_F_END - VIRTIO_TRANSPORT_F_START);
> +
> +    if (invalid_dev_features) {
> +        error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
> +                   invalid_dev_features);
> +    }
> +
> +    return !invalid_dev_features;
> +}
> +
>  static int vhost_vdpa_net_check_device_id(struct vhost_net *net)
>  {
>      uint32_t device_id;
> @@ -675,15 +691,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>      if (opts->x_svq) {
>          struct vhost_vdpa_iova_range iova_range;
>
> -        uint64_t invalid_dev_features =
> -            features & ~vdpa_svq_device_features &
> -            /* Transport are all accepted at this point */
> -            ~MAKE_64BIT_MASK(VIRTIO_TRANSPORT_F_START,
> -                             VIRTIO_TRANSPORT_F_END - VIRTIO_TRANSPORT_F_START);
> -
> -        if (invalid_dev_features) {
> -            error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
> -                       invalid_dev_features);
> +        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
>              goto err_svq;
>          }
>
> --
> 2.31.1
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 05/10] vdpa: move SVQ vring features check to net/
  2022-11-08 17:07 ` [PATCH v6 05/10] vdpa: move SVQ vring features check to net/ Eugenio Pérez
@ 2022-11-10  5:40   ` Jason Wang
  2022-11-10 13:09     ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-10  5:40 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Cindy Lu, Eli Cohen,
	Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini


On 2022/11/9 01:07, Eugenio Pérez wrote:
> The next patches will start control SVQ if possible. However, we don't
> know if that will be possible at qemu boot anymore.


If I'm not wrong, there's no device-specific feature checked in the
function. So it should be general enough to be used by devices other
than net. Then I don't see any advantage of doing this.

Thanks


>
> Since the moved checks will be already evaluated at net/ to know if it
> is ok to shadow CVQ, move them.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   hw/virtio/vhost-vdpa.c | 33 ++-------------------------------
>   net/vhost-vdpa.c       |  3 ++-
>   2 files changed, 4 insertions(+), 32 deletions(-)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 3df2775760..146f0dcb40 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -402,29 +402,9 @@ static int vhost_vdpa_get_dev_features(struct vhost_dev *dev,
>       return ret;
>   }
>   
> -static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> -                               Error **errp)
> +static void vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v)
>   {
>       g_autoptr(GPtrArray) shadow_vqs = NULL;
> -    uint64_t dev_features, svq_features;
> -    int r;
> -    bool ok;
> -
> -    if (!v->shadow_vqs_enabled) {
> -        return 0;
> -    }
> -
> -    r = vhost_vdpa_get_dev_features(hdev, &dev_features);
> -    if (r != 0) {
> -        error_setg_errno(errp, -r, "Can't get vdpa device features");
> -        return r;
> -    }
> -
> -    svq_features = dev_features;
> -    ok = vhost_svq_valid_features(svq_features, errp);
> -    if (unlikely(!ok)) {
> -        return -1;
> -    }
>   
>       shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
>       for (unsigned n = 0; n < hdev->nvqs; ++n) {
> @@ -436,7 +416,6 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
>       }
>   
>       v->shadow_vqs = g_steal_pointer(&shadow_vqs);
> -    return 0;
>   }
>   
>   static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> @@ -461,11 +440,7 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
>       dev->opaque =  opaque ;
>       v->listener = vhost_vdpa_memory_listener;
>       v->msg_type = VHOST_IOTLB_MSG_V2;
> -    ret = vhost_vdpa_init_svq(dev, v, errp);
> -    if (ret) {
> -        goto err;
> -    }
> -
> +    vhost_vdpa_init_svq(dev, v);
>       vhost_vdpa_get_iova_range(v);
>   
>       if (!vhost_vdpa_first_dev(dev)) {
> @@ -476,10 +451,6 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
>                                  VIRTIO_CONFIG_S_DRIVER);
>   
>       return 0;
> -
> -err:
> -    ram_block_discard_disable(false);
> -    return ret;
>   }
>   
>   static void vhost_vdpa_host_notifier_uninit(struct vhost_dev *dev,
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index d3b1de481b..fb35b17ab4 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -117,9 +117,10 @@ static bool vhost_vdpa_net_valid_svq_features(uint64_t features, Error **errp)
>       if (invalid_dev_features) {
>           error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
>                      invalid_dev_features);
> +        return false;
>       }
>   
> -    return !invalid_dev_features;
> +    return vhost_svq_valid_features(features, errp);
>   }
>   
>   static int vhost_vdpa_net_check_device_id(struct vhost_net *net)


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 07/10] vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap
  2022-11-08 17:07 ` [PATCH v6 07/10] vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap Eugenio Pérez
@ 2022-11-10  5:50   ` Jason Wang
  2022-11-10 13:22     ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-10  5:50 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> So the caller can choose which ASID the mapping is destined for.
>
> No need to update the batch functions as they will always be called from
> memory listener updates at the moment. Memory listener updates will
> always update ASID 0, as it's the passthrough ASID.
>
> All vhost devices' ASIDs are 0 at this moment.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> v5:
> * Solve conflict, now vhost_vdpa_svq_unmap_ring returns void
> * Change comment on zero initialization.
>
> v4: Add comment specifying behavior if device does not support _F_ASID
>
> v3: Deleted unneeded space
> ---
>  include/hw/virtio/vhost-vdpa.h |  8 +++++---
>  hw/virtio/vhost-vdpa.c         | 29 +++++++++++++++++++----------
>  net/vhost-vdpa.c               |  6 +++---
>  hw/virtio/trace-events         |  4 ++--
>  4 files changed, 29 insertions(+), 18 deletions(-)
>
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> index 1111d85643..6560bb9d78 100644
> --- a/include/hw/virtio/vhost-vdpa.h
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -29,6 +29,7 @@ typedef struct vhost_vdpa {
>      int index;
>      uint32_t msg_type;
>      bool iotlb_batch_begin_sent;
> +    uint32_t address_space_id;

So the trick is to let device-specific code zero this during allocation?

>      MemoryListener listener;
>      struct vhost_vdpa_iova_range iova_range;
>      uint64_t acked_features;
> @@ -42,8 +43,9 @@ typedef struct vhost_vdpa {
>      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
>  } VhostVDPA;
>
> -int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> -                       void *vaddr, bool readonly);
> -int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> +int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> +                       hwaddr size, void *vaddr, bool readonly);
> +int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> +                         hwaddr size);
>
>  #endif
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 23efb8f49d..8fd32ba32b 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -72,22 +72,24 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
>      return false;
>  }
>
> -int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> -                       void *vaddr, bool readonly)
> +int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> +                       hwaddr size, void *vaddr, bool readonly)
>  {
>      struct vhost_msg_v2 msg = {};
>      int fd = v->device_fd;
>      int ret = 0;
>
>      msg.type = v->msg_type;
> +    msg.asid = asid; /* 0 if vdpa device does not support asid */

The comment here is confusing. If this is a requirement, we need either

1) doc this

or

2) perform necessary checks in the function itself.
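For illustration, option (2) could look like the following standalone sketch. This is a hedged example, not the actual QEMU code: vdpa_effective_asid() is a hypothetical helper, and the _F_IOTLB_ASID bit value is an assumption taken from the Linux vhost UAPI contemporary with this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit number from the Linux vhost UAPI (vhost_types.h); assumed to be
 * 0x3 as in kernels contemporary with this series. */
#define VHOST_BACKEND_F_IOTLB_ASID 0x3

/*
 * Hypothetical helper sketching option (2): clamp the ASID inside the
 * map/unmap path instead of trusting every caller to pass 0. The
 * backend features are passed in so the logic runs without a device.
 */
uint32_t vdpa_effective_asid(uint64_t backend_features, uint32_t asid)
{
    bool has_asid = backend_features & (1ULL << VHOST_BACKEND_F_IOTLB_ASID);

    /* Without _F_IOTLB_ASID the kernel only understands ASID 0 */
    return has_asid ? asid : 0;
}
```

With such a helper, the map/unmap paths could not silently send a non-zero ASID to a backend that cannot interpret it.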

>      msg.iotlb.iova = iova;
>      msg.iotlb.size = size;
>      msg.iotlb.uaddr = (uint64_t)(uintptr_t)vaddr;
>      msg.iotlb.perm = readonly ? VHOST_ACCESS_RO : VHOST_ACCESS_RW;
>      msg.iotlb.type = VHOST_IOTLB_UPDATE;
>
> -   trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.iotlb.iova, msg.iotlb.size,
> -                            msg.iotlb.uaddr, msg.iotlb.perm, msg.iotlb.type);
> +    trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.asid, msg.iotlb.iova,
> +                             msg.iotlb.size, msg.iotlb.uaddr, msg.iotlb.perm,
> +                             msg.iotlb.type);
>
>      if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
>          error_report("failed to write, fd=%d, errno=%d (%s)",
> @@ -98,18 +100,24 @@ int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
>      return ret;
>  }
>
> -int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size)
> +int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> +                         hwaddr size)
>  {
>      struct vhost_msg_v2 msg = {};
>      int fd = v->device_fd;
>      int ret = 0;
>
>      msg.type = v->msg_type;
> +    /*
> +     * The caller must set asid = 0 if the device does not support asid.
> +     * This is not an ABI break since it is set to 0 by the initializer anyway.
> +     */
> +    msg.asid = asid;
>      msg.iotlb.iova = iova;
>      msg.iotlb.size = size;
>      msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
>
> -    trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.iotlb.iova,
> +    trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.asid, msg.iotlb.iova,
>                                 msg.iotlb.size, msg.iotlb.type);
>
>      if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
> @@ -229,7 +237,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>      }
>
>      vhost_vdpa_iotlb_batch_begin_once(v);
> -    ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
> +    ret = vhost_vdpa_dma_map(v, 0, iova, int128_get64(llsize),

Can we use v->address_space_id here? Then we don't need to modify this
line when we support the multiple-ASID logic in the future.

Thanks

>                               vaddr, section->readonly);
>      if (ret) {
>          error_report("vhost vdpa map fail!");
> @@ -303,7 +311,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
>          vhost_iova_tree_remove(v->iova_tree, *result);
>      }
>      vhost_vdpa_iotlb_batch_begin_once(v);
> -    ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
> +    ret = vhost_vdpa_dma_unmap(v, 0, iova, int128_get64(llsize));
>      if (ret) {
>          error_report("vhost_vdpa dma unmap error!");
>      }
> @@ -884,7 +892,7 @@ static void vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v, hwaddr addr)
>      }
>
>      size = ROUND_UP(result->size, qemu_real_host_page_size());
> -    r = vhost_vdpa_dma_unmap(v, result->iova, size);
> +    r = vhost_vdpa_dma_unmap(v, v->address_space_id, result->iova, size);
>      if (unlikely(r < 0)) {
>          error_report("Unable to unmap SVQ vring: %s (%d)", g_strerror(-r), -r);
>          return;
> @@ -924,7 +932,8 @@ static bool vhost_vdpa_svq_map_ring(struct vhost_vdpa *v, DMAMap *needle,
>          return false;
>      }
>
> -    r = vhost_vdpa_dma_map(v, needle->iova, needle->size + 1,
> +    r = vhost_vdpa_dma_map(v, v->address_space_id, needle->iova,
> +                           needle->size + 1,
>                             (void *)(uintptr_t)needle->translated_addr,
>                             needle->perm == IOMMU_RO);
>      if (unlikely(r != 0)) {
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index fb35b17ab4..ca1acc0410 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -258,7 +258,7 @@ static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
>          return;
>      }
>
> -    r = vhost_vdpa_dma_unmap(v, map->iova, map->size + 1);
> +    r = vhost_vdpa_dma_unmap(v, v->address_space_id, map->iova, map->size + 1);
>      if (unlikely(r != 0)) {
>          error_report("Device cannot unmap: %s(%d)", g_strerror(r), r);
>      }
> @@ -298,8 +298,8 @@ static int vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v, void *buf, size_t size,
>          return r;
>      }
>
> -    r = vhost_vdpa_dma_map(v, map.iova, vhost_vdpa_net_cvq_cmd_page_len(), buf,
> -                           !write);
> +    r = vhost_vdpa_dma_map(v, v->address_space_id, map.iova,
> +                           vhost_vdpa_net_cvq_cmd_page_len(), buf, !write);
>      if (unlikely(r < 0)) {
>          goto dma_map_err;
>      }
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index 820dadc26c..0ad9390307 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -30,8 +30,8 @@ vhost_user_write(uint32_t req, uint32_t flags) "req:%d flags:0x%"PRIx32""
>  vhost_user_create_notifier(int idx, void *n) "idx:%d n:%p"
>
>  # vhost-vdpa.c
> -vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
> -vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
> +vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint32_t asid, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" asid: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
> +vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint32_t asid, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" asid: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
>  vhost_vdpa_listener_begin_batch(void *v, int fd, uint32_t msg_type, uint8_t type)  "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
>  vhost_vdpa_listener_commit(void *v, int fd, uint32_t msg_type, uint8_t type)  "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
>  vhost_vdpa_listener_region_add(void *vdpa, uint64_t iova, uint64_t llend, void *vaddr, bool readonly) "vdpa: %p iova 0x%"PRIx64" llend 0x%"PRIx64" vaddr: %p read-only: %d"
> --
> 2.31.1
>



* Re: [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa
  2022-11-08 17:07 ` [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa Eugenio Pérez
@ 2022-11-10  6:00   ` Jason Wang
  2022-11-10 13:47     ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-10  6:00 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> The memory listener that tells the device how to convert GPA to QEMU's
> VA is registered against the CVQ vhost_vdpa. This series tries to map
> the memory listener translations to ASID 0, while it maps the CVQ ones
> to ASID 1.
>
> Let's tell the listener whether it needs to register them in the IOVA
> tree or not.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> v5: Solve conflict about vhost_iova_tree_remove accepting mem_region by
>     value.
> ---
>  include/hw/virtio/vhost-vdpa.h | 2 ++
>  hw/virtio/vhost-vdpa.c         | 6 +++---
>  net/vhost-vdpa.c               | 1 +
>  3 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> index 6560bb9d78..0c3ed2d69b 100644
> --- a/include/hw/virtio/vhost-vdpa.h
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -34,6 +34,8 @@ typedef struct vhost_vdpa {
>      struct vhost_vdpa_iova_range iova_range;
>      uint64_t acked_features;
>      bool shadow_vqs_enabled;
> +    /* The listener must send iova tree addresses, not GPA */
> +    bool listener_shadow_vq;
>      /* IOVA mapping used by the Shadow Virtqueue */
>      VhostIOVATree *iova_tree;
>      GPtrArray *shadow_vqs;
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 8fd32ba32b..e3914fa40e 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -220,7 +220,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>                                           vaddr, section->readonly);
>
>      llsize = int128_sub(llend, int128_make64(iova));
> -    if (v->shadow_vqs_enabled) {
> +    if (v->listener_shadow_vq) {
>          int r;
>
>          mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
> @@ -247,7 +247,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>      return;
>
>  fail_map:
> -    if (v->shadow_vqs_enabled) {
> +    if (v->listener_shadow_vq) {
>          vhost_iova_tree_remove(v->iova_tree, mem_region);
>      }
>
> @@ -292,7 +292,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
>
>      llsize = int128_sub(llend, int128_make64(iova));
>
> -    if (v->shadow_vqs_enabled) {
> +    if (v->listener_shadow_vq) {
>          const DMAMap *result;
>          const void *vaddr = memory_region_get_ram_ptr(section->mr) +
>              section->offset_within_region +
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 85a318faca..02780ee37b 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -570,6 +570,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>      s->vhost_vdpa.index = queue_pair_index;
>      s->always_svq = svq;
>      s->vhost_vdpa.shadow_vqs_enabled = svq;
> +    s->vhost_vdpa.listener_shadow_vq = svq;

Any chance those two can differ?

Thanks

>      s->vhost_vdpa.iova_tree = iova_tree;
>      if (!is_datapath) {
>          s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
> --
> 2.31.1
>



* Re: [PATCH v6 10/10] vdpa: Always start CVQ in SVQ mode
  2022-11-08 17:07 ` [PATCH v6 10/10] vdpa: Always start CVQ in SVQ mode Eugenio Pérez
@ 2022-11-10  6:24   ` Jason Wang
  2022-11-10 16:07     ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-10  6:24 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Parav Pandit, Stefan Hajnoczi, Si-Wei Liu, Laurent Vivier,
	Harpreet Singh Anand, Michael S. Tsirkin, Gautam Dawar,
	Liuxiangdong, Stefano Garzarella, Cindy Lu, Eli Cohen,
	Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini


在 2022/11/9 01:07, Eugenio Pérez 写道:
> Isolate the control virtqueue in its own group, allowing QEMU to
> intercept control commands while letting the dataplane run fully
> passthrough for the guest.


I think we need to tweak the title to "vdpa: Always start CVQ in SVQ 
mode if possible". Since SVQ for CVQ can't be enabled without ASID support?


>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> v6:
> * Disable control SVQ if the device does not support it because of
> features.
>
> v5:
> * Fixing the not adding cvq buffers when x-svq=on is specified.
> * Move vring state in vhost_vdpa_get_vring_group instead of using a
>    parameter.
> * Rename VHOST_VDPA_NET_CVQ_PASSTHROUGH to VHOST_VDPA_NET_DATA_ASID
>
> v4:
> * Squash vhost_vdpa_cvq_group_is_independent.
> * Rebased on last CVQ start series, that allocated CVQ cmd bufs at load
> * Do not check for cvq index on vhost_vdpa_net_prepare, we only have one
>    that callback registered in that NetClientInfo.
>
> v3:
> * Make asid related queries print a warning instead of returning an
>    error and stop the start of qemu.
> ---
>   hw/virtio/vhost-vdpa.c |   3 +-
>   net/vhost-vdpa.c       | 138 ++++++++++++++++++++++++++++++++++++++---
>   2 files changed, 132 insertions(+), 9 deletions(-)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index e3914fa40e..6401e7efb1 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -648,7 +648,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
>   {
>       uint64_t features;
>       uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 |
> -        0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH;
> +        0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH |
> +        0x1ULL << VHOST_BACKEND_F_IOTLB_ASID;
>       int r;
>   
>       if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) {
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 02780ee37b..7245ea70c6 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -38,6 +38,9 @@ typedef struct VhostVDPAState {
>       void *cvq_cmd_out_buffer;
>       virtio_net_ctrl_ack *status;
>   
> +    /* Number of address spaces supported by the device */
> +    unsigned address_space_num;


I'm not sure this is the best place to store things like this since it 
can cause confusion. We will have multiple VhostVDPAState when 
multiqueue is enabled.


> +
>       /* The device always have SVQ enabled */
>       bool always_svq;
>       bool started;
> @@ -101,6 +104,9 @@ static const uint64_t vdpa_svq_device_features =
>       BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
>       BIT_ULL(VIRTIO_NET_F_STANDBY);
>   
> +#define VHOST_VDPA_NET_DATA_ASID 0
> +#define VHOST_VDPA_NET_CVQ_ASID 1
> +
>   VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
>   {
>       VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> @@ -242,6 +248,34 @@ static NetClientInfo net_vhost_vdpa_info = {
>           .check_peer_type = vhost_vdpa_check_peer_type,
>   };
>   
> +static uint32_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> +{
> +    struct vhost_vring_state state = {
> +        .index = vq_index,
> +    };
> +    int r = ioctl(device_fd, VHOST_VDPA_GET_VRING_GROUP, &state);
> +
> +    return r < 0 ? 0 : state.num;


Assuming 0 when ioctl() fails is probably not a good idea: errors in 
ioctl might be hidden. It would be better to fall back to 0 only when 
ASID is not supported.
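The fallback described here can be separated from the ioctl itself; below is a minimal sketch under the assumption that the caller already knows whether the backend advertises _F_IOTLB_ASID. Names are illustrative, not QEMU's, and the ioctl result is injected as a parameter so the logic runs without a vdpa device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reworking of vhost_vdpa_get_vring_group() along the
 * lines suggested above: only fall back to group 0 when the backend
 * genuinely lacks _F_IOTLB_ASID, and surface real ioctl errors to the
 * caller instead of hiding them behind a default. */
int get_vring_group(bool has_asid_feature, int ioctl_ret,
                    uint32_t ioctl_num, uint32_t *group)
{
    if (!has_asid_feature) {
        *group = 0;          /* legacy backend: single implicit group */
        return 0;
    }
    if (ioctl_ret < 0) {
        return ioctl_ret;    /* real failure: do not hide it */
    }
    *group = ioctl_num;      /* value returned by the device */
    return 0;
}
```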


> +}
> +
> +static int vhost_vdpa_set_address_space_id(struct vhost_vdpa *v,
> +                                           unsigned vq_group,
> +                                           unsigned asid_num)
> +{
> +    struct vhost_vring_state asid = {
> +        .index = vq_group,
> +        .num = asid_num,
> +    };
> +    int ret;
> +
> +    ret = ioctl(v->device_fd, VHOST_VDPA_SET_GROUP_ASID, &asid);
> +    if (unlikely(ret < 0)) {
> +        warn_report("Can't set vq group %u asid %u, errno=%d (%s)",
> +            asid.index, asid.num, errno, g_strerror(errno));
> +    }
> +    return ret;
> +}
> +
>   static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
>   {
>       VhostIOVATree *tree = v->iova_tree;
> @@ -316,11 +350,54 @@ dma_map_err:
>   static int vhost_vdpa_net_cvq_start(NetClientState *nc)
>   {
>       VhostVDPAState *s;
> -    int r;
> +    struct vhost_vdpa *v;
> +    uint32_t cvq_group;
> +    int cvq_index, r;
>   
>       assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>   
>       s = DO_UPCAST(VhostVDPAState, nc, nc);
> +    v = &s->vhost_vdpa;
> +
> +    v->listener_shadow_vq = s->always_svq;
> +    v->shadow_vqs_enabled = s->always_svq;
> +    s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_DATA_ASID;
> +
> +    if (s->always_svq) {
> +        goto out;
> +    }
> +
> +    if (s->address_space_num < 2) {
> +        return 0;
> +    }
> +
> +    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> +        return 0;
> +    }


Any reason we do the above check during the start/stop? It should be 
easier to do that in the initialization.


> +
> +    /**
> +     * Check if all the virtqueues of the virtio device are in a different vq
> +     * than the last vq. VQ group of last group passed in cvq_group.
> +     */
> +    cvq_index = v->dev->vq_index_end - 1;
> +    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
> +    for (int i = 0; i < cvq_index; ++i) {
> +        uint32_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
> +
> +        if (unlikely(group == cvq_group)) {
> +            warn_report("CVQ %u group is the same as VQ %u one (%u)", cvq_group,
> +                        i, group);
> +            return 0;
> +        }
> +    }
> +
> +    r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
> +    if (r == 0) {
> +        v->shadow_vqs_enabled = true;
> +        s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_CVQ_ASID;
> +    }
> +
> +out:
>       if (!s->vhost_vdpa.shadow_vqs_enabled) {
>           return 0;
>       }
> @@ -542,12 +619,38 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
>       .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
>   };
>   
> +static uint32_t vhost_vdpa_get_as_num(int vdpa_device_fd)
> +{
> +    uint64_t features;
> +    unsigned num_as;
> +    int r;
> +
> +    r = ioctl(vdpa_device_fd, VHOST_GET_BACKEND_FEATURES, &features);
> +    if (unlikely(r < 0)) {
> +        warn_report("Cannot get backend features");
> +        return 1;
> +    }
> +
> +    if (!(features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
> +        return 1;
> +    }
> +
> +    r = ioctl(vdpa_device_fd, VHOST_VDPA_GET_AS_NUM, &num_as);
> +    if (unlikely(r < 0)) {
> +        warn_report("Cannot retrieve number of supported ASs");
> +        return 1;


Let's return an error here. This helps to identify bugs in QEMU or the kernel.
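A hedged sketch of what propagating the error could look like, with ioctl results injected as parameters so the flow is testable without a vdpa device. The bit value 0x3 for _F_IOTLB_ASID is an assumption from the contemporary UAPI, and the function name is illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical variant of vhost_vdpa_get_as_num() following the review:
 * silently reporting one AS is kept only for backends without the
 * feature; real ioctl failures are returned to the caller. */
int get_as_num(int features_ret, uint64_t features,
               int as_ret, unsigned as_ioctl, unsigned *num_as)
{
    if (features_ret < 0) {
        return features_ret;          /* cannot query backend features */
    }
    if (!(features & (1ULL << 0x3))) {
        *num_as = 1;                  /* no _F_IOTLB_ASID: one AS only */
        return 0;
    }
    if (as_ret < 0) {
        return as_ret;   /* device claims ASID support but query failed */
    }
    *num_as = as_ioctl;
    return 0;
}
```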


> +    }
> +
> +    return num_as;
> +}
> +
>   static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>                                              const char *device,
>                                              const char *name,
>                                              int vdpa_device_fd,
>                                              int queue_pair_index,
>                                              int nvqs,
> +                                           unsigned nas,
>                                              bool is_datapath,
>                                              bool svq,
>                                              VhostIOVATree *iova_tree)
> @@ -566,6 +669,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>       qemu_set_info_str(nc, TYPE_VHOST_VDPA);
>       s = DO_UPCAST(VhostVDPAState, nc, nc);
>   
> +    s->address_space_num = nas;
>       s->vhost_vdpa.device_fd = vdpa_device_fd;
>       s->vhost_vdpa.index = queue_pair_index;
>       s->always_svq = svq;
> @@ -652,6 +756,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>       g_autoptr(VhostIOVATree) iova_tree = NULL;
>       NetClientState *nc;
>       int queue_pairs, r, i = 0, has_cvq = 0;
> +    unsigned num_as = 1;
> +    bool svq_cvq;
>   
>       assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>       opts = &netdev->u.vhost_vdpa;
> @@ -693,12 +799,28 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>           return queue_pairs;
>       }
>   
> -    if (opts->x_svq) {
> -        struct vhost_vdpa_iova_range iova_range;
> +    svq_cvq = opts->x_svq;
> +    if (has_cvq && !opts->x_svq) {
> +        num_as = vhost_vdpa_get_as_num(vdpa_device_fd);
> +        svq_cvq = num_as > 1;
> +    }


The above check is not easy to follow, how about?

svq_cvq = vhost_vdpa_get_as_num() > 1 ? true : opts->x_svq;
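The resulting decision can also be captured in one predicate. A minimal sketch with illustrative names, assuming num_as has already been queried from the device:

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the CVQ-shadowing decision as discussed in this thread:
 * x_svq is the user's explicit flag forcing shadowing of all vqs;
 * otherwise only shadow CVQ when the device has one and exposes more
 * than one address space. */
bool want_svq_cvq(bool x_svq, bool has_cvq, unsigned num_as)
{
    if (x_svq) {
        return true;              /* user forced shadowing of all vqs */
    }
    return has_cvq && num_as > 1; /* shadow only CVQ when ASIDs allow */
}
```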


> +
> +    if (opts->x_svq || svq_cvq) {


Any chance we can have opts->x_svq = true but svq_cvq = false? Checking 
svq_cvq seems sufficient here.


> +        Error *warn = NULL;
>   
> -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
> -            goto err_svq;
> +        svq_cvq = vhost_vdpa_net_valid_svq_features(features,
> +                                                   opts->x_svq ? errp : &warn);
> +        if (!svq_cvq) {


Same question as above.


> +            if (opts->x_svq) {
> +                goto err_svq;
> +            } else {
> +                warn_reportf_err(warn, "Cannot shadow CVQ: ");
> +            }
>           }
> +    }
> +
> +    if (opts->x_svq || svq_cvq) {
> +        struct vhost_vdpa_iova_range iova_range;
>   
>           vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
>           iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
> @@ -708,15 +830,15 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>   
>       for (i = 0; i < queue_pairs; i++) {
>           ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> -                                     vdpa_device_fd, i, 2, true, opts->x_svq,
> -                                     iova_tree);
> +                                     vdpa_device_fd, i, 2, num_as, true,


I don't get why we need to pass num_as to a specific vhost_vdpa 
structure. It should be sufficient to pass the asid there.

Thanks


> +                                     opts->x_svq, iova_tree);
>           if (!ncs[i])
>               goto err;
>       }
>   
>       if (has_cvq) {
>           nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> -                                 vdpa_device_fd, i, 1, false,
> +                                 vdpa_device_fd, i, 1, num_as, false,
>                                    opts->x_svq, iova_tree);
>           if (!nc)
>               goto err;



* Re: [PATCH v6 00/10] ASID support in vhost-vdpa net
  2022-11-08 17:07 [PATCH v6 00/10] ASID support in vhost-vdpa net Eugenio Pérez
                   ` (9 preceding siblings ...)
  2022-11-08 17:07 ` [PATCH v6 10/10] vdpa: Always start CVQ in SVQ mode Eugenio Pérez
@ 2022-11-10 12:25 ` Michael S. Tsirkin
  2022-11-10 12:56   ` Eugenio Perez Martin
  10 siblings, 1 reply; 46+ messages in thread
From: Michael S. Tsirkin @ 2022-11-10 12:25 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Gautam Dawar, Liuxiangdong,
	Stefano Garzarella, Jason Wang, Cindy Lu, Eli Cohen,
	Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Tue, Nov 08, 2022 at 06:07:45PM +0100, Eugenio Pérez wrote:
> Control VQ is the way net devices use to send changes to the device state, like
> the number of active queues or its mac address.
> 
> QEMU needs to intercept this queue so it can track these changes and is able to
> migrate the device. It can do it from 1576dbb5bbc4 ("vdpa: Add x-svq to
> NetdevVhostVDPAOptions"). However, to enable x-svq implies to shadow all VirtIO
> device's virtqueues, which will damage performance.
> 
> This series adds address space isolation, so the device and the guest
> communicate directly with them (passthrough) and CVQ communication is split in
> two: The guest communicates with QEMU and QEMU forwards the commands to the
> device.
> 
> Comments are welcome. Thanks!


This is not 7.2 material, right?

> v6:
> - Do not allocate SVQ resources like file descriptors if SVQ cannot be used.
> - Disable shadow CVQ if the device does not support it because of net
>   features.
> 
> v5:
> - Move vring state in vhost_vdpa_get_vring_group instead of using a
>   parameter.
> - Rename VHOST_VDPA_NET_CVQ_PASSTHROUGH to VHOST_VDPA_NET_DATA_ASID
> 
> v4:
> - Rebased on last CVQ start series, that allocated CVQ cmd bufs at load
> - Squash vhost_vdpa_cvq_group_is_independent.
> - Do not check for cvq index on vhost_vdpa_net_prepare, we only have one
>   that callback registered in that NetClientInfo.
> - Add comment specifying behavior if device does not support _F_ASID
> - Update headers to a later Linux commit to not to remove SETUP_RNG_SEED
> 
> v3:
> - Do not return an error but just print a warning if vdpa device initialization
>   returns failure while getting AS num of VQ groups
> - Delete extra newline
> 
> v2:
> - Much as commented on series [1], handle vhost_net backend through
>   NetClientInfo callbacks instead of directly.
> - Fix not freeing SVQ properly when device does not support CVQ
> - Add BIT_ULL missed checking device's backend feature for _F_ASID.
> 
> Eugenio Pérez (10):
>   vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop
>   vhost: set SVQ device call handler at SVQ start
>   vhost: Allocate SVQ device file descriptors at device start
>   vdpa: add vhost_vdpa_net_valid_svq_features
>   vdpa: move SVQ vring features check to net/
>   vdpa: Allocate SVQ unconditionally
>   vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap
>   vdpa: Store x-svq parameter in VhostVDPAState
>   vdpa: Add listener_shadow_vq to vhost_vdpa
>   vdpa: Always start CVQ in SVQ mode
> 
>  include/hw/virtio/vhost-vdpa.h     |  10 +-
>  hw/virtio/vhost-shadow-virtqueue.c |  35 +-----
>  hw/virtio/vhost-vdpa.c             | 114 ++++++++++---------
>  net/vhost-vdpa.c                   | 171 ++++++++++++++++++++++++++---
>  hw/virtio/trace-events             |   4 +-
>  5 files changed, 222 insertions(+), 112 deletions(-)
> 
> -- 
> 2.31.1
> 



* Re: [PATCH v6 01/10] vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop
  2022-11-10  5:21   ` Jason Wang
@ 2022-11-10 12:54     ` Eugenio Perez Martin
  2022-11-11  7:24       ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-10 12:54 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Thu, Nov 10, 2022 at 6:22 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > This function used to rely on v->shadow_vqs != NULL to know whether
> > it must start SVQ or not.
> >
> > This is not going to be valid anymore, as QEMU is going to allocate
> > SVQs unconditionally (but it will only start them conditionally).
>
> It might be a waste of memory if we did this. Any reason for this?
>

Well, it's modelled after vhost_vdpa notifier member [1].

But sure, we can reduce the memory usage if SVQ is not used. The first
function that needs it is vhost_set_vring_kick, but I think it is not
a good place for the delayed allocation.

Would it work to move the allocation to the vhost_set_features vhost op?
It seems unlikely to me that callbacks that can affect SVQ would be
called earlier than that one. Or maybe to create a new one and call it
first in vhost.c:vhost_dev_start?

Thanks!

[1] The notifier member already allocates VIRTIO_QUEUE_MAX
VhostVDPAHostNotifier for each vhost_vdpa. It would be easy to reduce
that to at least the number of virtqueues on a vhost_vdpa. Should I
send a patch for this one?
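The delayed-allocation idea can be sketched independently of the vhost callbacks; names below are illustrative, not the real QEMU API, and g_ptr_array is replaced by plain calloc so the sketch is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct svq_state {
    void **shadow_vqs;     /* lazily allocated array, one slot per vq */
    size_t nvqs;
    bool shadow_vqs_enabled;
};

/* Returns 0 on success, -1 on allocation failure. Safe to call more
 * than once from any callback (set_features, dev_start, ...): the
 * array is only allocated on the first call. */
int svq_ensure_allocated(struct svq_state *s, size_t nvqs)
{
    if (s->shadow_vqs) {
        return 0;          /* already allocated: no-op */
    }
    s->shadow_vqs = calloc(nvqs, sizeof(*s->shadow_vqs));
    if (!s->shadow_vqs) {
        return -1;
    }
    s->nvqs = nvqs;
    return 0;
}
```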


> Thanks
>
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >  hw/virtio/vhost-vdpa.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index 7468e44b87..7f0ff4df5b 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -1029,7 +1029,7 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
> >      Error *err = NULL;
> >      unsigned i;
> >
> > -    if (!v->shadow_vqs) {
> > +    if (!v->shadow_vqs_enabled) {
> >          return true;
> >      }
> >
> > @@ -1082,7 +1082,7 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
> >  {
> >      struct vhost_vdpa *v = dev->opaque;
> >
> > -    if (!v->shadow_vqs) {
> > +    if (!v->shadow_vqs_enabled) {
> >          return;
> >      }
> >
> > --
> > 2.31.1
> >
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 00/10] ASID support in vhost-vdpa net
  2022-11-10 12:25 ` [PATCH v6 00/10] ASID support in vhost-vdpa net Michael S. Tsirkin
@ 2022-11-10 12:56   ` Eugenio Perez Martin
  0 siblings, 0 replies; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-10 12:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Gautam Dawar, Liuxiangdong,
	Stefano Garzarella, Jason Wang, Cindy Lu, Eli Cohen,
	Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Thu, Nov 10, 2022 at 1:25 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Nov 08, 2022 at 06:07:45PM +0100, Eugenio Pérez wrote:
> > Control VQ is the way net devices use to send changes to the device state, like
> > the number of active queues or its mac address.
> >
> > QEMU needs to intercept this queue so it can track these changes and is able to
> > migrate the device. It can do it from 1576dbb5bbc4 ("vdpa: Add x-svq to
> > NetdevVhostVDPAOptions"). However, enabling x-svq implies shadowing all the
> > VirtIO device's virtqueues, which hurts performance.
> >
> > This series adds address space isolation, so the device and the guest
> > communicate directly (passthrough) while CVQ communication is split in
> > two: The guest communicates with QEMU and QEMU forwards the commands to the
> > device.
> >
> > Comments are welcome. Thanks!
>
>
> This is not 7.2 material, right?
>

No, it is not.

I should have noted it after PATCH in the subject, sorry.

> > v6:
> > - Do not allocate SVQ resources like file descriptors if SVQ cannot be used.
> > - Disable shadow CVQ if the device does not support it because of net
> >   features.
> >
> > v5:
> > - Move vring state in vhost_vdpa_get_vring_group instead of using a
> >   parameter.
> > - Rename VHOST_VDPA_NET_CVQ_PASSTHROUGH to VHOST_VDPA_NET_DATA_ASID
> >
> > v4:
> > - Rebased on last CVQ start series, that allocated CVQ cmd bufs at load
> > - Squash vhost_vdpa_cvq_group_is_independent.
> > - Do not check for cvq index on vhost_vdpa_net_prepare, we only have one
> >   that callback registered in that NetClientInfo.
> > - Add comment specifying behavior if device does not support _F_ASID
> > - Update headers to a later Linux commit to not to remove SETUP_RNG_SEED
> >
> > v3:
> > - Do not return an error but just print a warning if vdpa device initialization
> >   returns failure while getting AS num of VQ groups
> > - Delete extra newline
> >
> > v2:
> > - Much as commented on series [1], handle vhost_net backend through
> >   NetClientInfo callbacks instead of directly.
> > - Fix not freeing SVQ properly when device does not support CVQ
> > - Add BIT_ULL missed checking device's backend feature for _F_ASID.
> >
> > Eugenio Pérez (10):
> >   vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop
> >   vhost: set SVQ device call handler at SVQ start
> >   vhost: Allocate SVQ device file descriptors at device start
> >   vdpa: add vhost_vdpa_net_valid_svq_features
> >   vdpa: move SVQ vring features check to net/
> >   vdpa: Allocate SVQ unconditionally
> >   vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap
> >   vdpa: Store x-svq parameter in VhostVDPAState
> >   vdpa: Add listener_shadow_vq to vhost_vdpa
> >   vdpa: Always start CVQ in SVQ mode
> >
> >  include/hw/virtio/vhost-vdpa.h     |  10 +-
> >  hw/virtio/vhost-shadow-virtqueue.c |  35 +-----
> >  hw/virtio/vhost-vdpa.c             | 114 ++++++++++---------
> >  net/vhost-vdpa.c                   | 171 ++++++++++++++++++++++++++---
> >  hw/virtio/trace-events             |   4 +-
> >  5 files changed, 222 insertions(+), 112 deletions(-)
> >
> > --
> > 2.31.1
> >
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 05/10] vdpa: move SVQ vring features check to net/
  2022-11-10  5:40   ` Jason Wang
@ 2022-11-10 13:09     ` Eugenio Perez Martin
  2022-11-11  7:34       ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-10 13:09 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Thu, Nov 10, 2022 at 6:40 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2022/11/9 01:07, Eugenio Pérez 写道:
> > The next patches will start the control SVQ if possible. However, we no
> > longer know at qemu boot whether that will be possible.
>
>
> If I was not wrong, there's no device specific feature that is checked
> in the function. So it should be general enough to be used by devices
> other than net. Then I don't see any advantage of doing this.
>

Because vhost_vdpa_init_svq is called at qemu boot, failing if it is
not possible to shadow the virtqueue.

Now the CVQ will be shadowed if possible, so we need to check this at
device start, not at initialization. Storing this information at boot
time is no longer valid, because v->shadow_vqs_enabled is no longer
valid at that point.

Thanks!

> Thanks
>
>
> >
> > Since the moved checks will be already evaluated at net/ to know if it
> > is ok to shadow CVQ, move them.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >   hw/virtio/vhost-vdpa.c | 33 ++-------------------------------
> >   net/vhost-vdpa.c       |  3 ++-
> >   2 files changed, 4 insertions(+), 32 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index 3df2775760..146f0dcb40 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -402,29 +402,9 @@ static int vhost_vdpa_get_dev_features(struct vhost_dev *dev,
> >       return ret;
> >   }
> >
> > -static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> > -                               Error **errp)
> > +static void vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v)
> >   {
> >       g_autoptr(GPtrArray) shadow_vqs = NULL;
> > -    uint64_t dev_features, svq_features;
> > -    int r;
> > -    bool ok;
> > -
> > -    if (!v->shadow_vqs_enabled) {
> > -        return 0;
> > -    }
> > -
> > -    r = vhost_vdpa_get_dev_features(hdev, &dev_features);
> > -    if (r != 0) {
> > -        error_setg_errno(errp, -r, "Can't get vdpa device features");
> > -        return r;
> > -    }
> > -
> > -    svq_features = dev_features;
> > -    ok = vhost_svq_valid_features(svq_features, errp);
> > -    if (unlikely(!ok)) {
> > -        return -1;
> > -    }
> >
> >       shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
> >       for (unsigned n = 0; n < hdev->nvqs; ++n) {
> > @@ -436,7 +416,6 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> >       }
> >
> >       v->shadow_vqs = g_steal_pointer(&shadow_vqs);
> > -    return 0;
> >   }
> >
> >   static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> > @@ -461,11 +440,7 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> >       dev->opaque =  opaque ;
> >       v->listener = vhost_vdpa_memory_listener;
> >       v->msg_type = VHOST_IOTLB_MSG_V2;
> > -    ret = vhost_vdpa_init_svq(dev, v, errp);
> > -    if (ret) {
> > -        goto err;
> > -    }
> > -
> > +    vhost_vdpa_init_svq(dev, v);
> >       vhost_vdpa_get_iova_range(v);
> >
> >       if (!vhost_vdpa_first_dev(dev)) {
> > @@ -476,10 +451,6 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> >                                  VIRTIO_CONFIG_S_DRIVER);
> >
> >       return 0;
> > -
> > -err:
> > -    ram_block_discard_disable(false);
> > -    return ret;
> >   }
> >
> >   static void vhost_vdpa_host_notifier_uninit(struct vhost_dev *dev,
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index d3b1de481b..fb35b17ab4 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -117,9 +117,10 @@ static bool vhost_vdpa_net_valid_svq_features(uint64_t features, Error **errp)
> >       if (invalid_dev_features) {
> >           error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
> >                      invalid_dev_features);
> > +        return false;
> >       }
> >
> > -    return !invalid_dev_features;
> > +    return vhost_svq_valid_features(features, errp);
> >   }
> >
> >   static int vhost_vdpa_net_check_device_id(struct vhost_net *net)
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 07/10] vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap
  2022-11-10  5:50   ` Jason Wang
@ 2022-11-10 13:22     ` Eugenio Perez Martin
  2022-11-11  7:41       ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-10 13:22 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Thu, Nov 10, 2022 at 6:51 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > So the caller can choose which ASID each mapping is destined to.
> >
> > No need to update the batch functions as they will always be called from
> > memory listener updates at the moment. Memory listener updates will
> > always update ASID 0, as it's the passthrough ASID.
> >
> > All vhost devices' ASIDs are 0 at this moment.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> > v5:
> > * Solve conflict, now vhost_vdpa_svq_unmap_ring returns void
> > * Change comment on zero initialization.
> >
> > v4: Add comment specifying behavior if device does not support _F_ASID
> >
> > v3: Deleted unneeded space
> > ---
> >  include/hw/virtio/vhost-vdpa.h |  8 +++++---
> >  hw/virtio/vhost-vdpa.c         | 29 +++++++++++++++++++----------
> >  net/vhost-vdpa.c               |  6 +++---
> >  hw/virtio/trace-events         |  4 ++--
> >  4 files changed, 29 insertions(+), 18 deletions(-)
> >
> > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > index 1111d85643..6560bb9d78 100644
> > --- a/include/hw/virtio/vhost-vdpa.h
> > +++ b/include/hw/virtio/vhost-vdpa.h
> > @@ -29,6 +29,7 @@ typedef struct vhost_vdpa {
> >      int index;
> >      uint32_t msg_type;
> >      bool iotlb_batch_begin_sent;
> > +    uint32_t address_space_id;
>
> So the trick is let device specific code to zero this during allocation?
>

Yes, but I don't see how that is a trick :). All other members also
rely on being 0 at allocation.

> >      MemoryListener listener;
> >      struct vhost_vdpa_iova_range iova_range;
> >      uint64_t acked_features;
> > @@ -42,8 +43,9 @@ typedef struct vhost_vdpa {
> >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> >  } VhostVDPA;
> >
> > -int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> > -                       void *vaddr, bool readonly);
> > -int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> > +int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> > +                       hwaddr size, void *vaddr, bool readonly);
> > +int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> > +                         hwaddr size);
> >
> >  #endif
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index 23efb8f49d..8fd32ba32b 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -72,22 +72,24 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> >      return false;
> >  }
> >
> > -int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> > -                       void *vaddr, bool readonly)
> > +int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> > +                       hwaddr size, void *vaddr, bool readonly)
> >  {
> >      struct vhost_msg_v2 msg = {};
> >      int fd = v->device_fd;
> >      int ret = 0;
> >
> >      msg.type = v->msg_type;
> > +    msg.asid = asid; /* 0 if vdpa device does not support asid */
>
> The comment here is confusing. If this is a requirement, we need either
>
> 1) doc this
>
> or
>
> 2) perform necessary checks in the function itself.
>

I only documented it in vhost_vdpa_dma_unmap, and I only realize that
now. Would it work to just copy that comment here?

> >      msg.iotlb.iova = iova;
> >      msg.iotlb.size = size;
> >      msg.iotlb.uaddr = (uint64_t)(uintptr_t)vaddr;
> >      msg.iotlb.perm = readonly ? VHOST_ACCESS_RO : VHOST_ACCESS_RW;
> >      msg.iotlb.type = VHOST_IOTLB_UPDATE;
> >
> > -   trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.iotlb.iova, msg.iotlb.size,
> > -                            msg.iotlb.uaddr, msg.iotlb.perm, msg.iotlb.type);
> > +    trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.asid, msg.iotlb.iova,
> > +                             msg.iotlb.size, msg.iotlb.uaddr, msg.iotlb.perm,
> > +                             msg.iotlb.type);
> >
> >      if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
> >          error_report("failed to write, fd=%d, errno=%d (%s)",
> > @@ -98,18 +100,24 @@ int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> >      return ret;
> >  }
> >
> > -int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size)
> > +int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> > +                         hwaddr size)
> >  {
> >      struct vhost_msg_v2 msg = {};
> >      int fd = v->device_fd;
> >      int ret = 0;
> >
> >      msg.type = v->msg_type;
> > +    /*
> > +     * The caller must set asid = 0 if the device does not support asid.
> > +     * This is not an ABI break since it is set to 0 by the initializer anyway.
> > +     */
> > +    msg.asid = asid;
> >      msg.iotlb.iova = iova;
> >      msg.iotlb.size = size;
> >      msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
> >
> > -    trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.iotlb.iova,
> > +    trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.asid, msg.iotlb.iova,
> >                                 msg.iotlb.size, msg.iotlb.type);
> >
> >      if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
> > @@ -229,7 +237,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >      }
> >
> >      vhost_vdpa_iotlb_batch_begin_once(v);
> > -    ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
> > +    ret = vhost_vdpa_dma_map(v, 0, iova, int128_get64(llsize),
>
> Can we use v->address_space_id here? Then we don't need to modify this
> line when we support multiple asids logic in the future.
>

The registered memory listener is the one of the last vhost_vdpa, the
one that handles the last queue.

If all data virtqueues are not shadowed but CVQ is,
v->address_space_id is 1 with the current code. But the listener is
actually mapping the ASID 0, not 1.

Another alternative is to register it against the last data virtqueue
instead of the last queue of the vhost_vdpa. But that is hard to
express in a generic way in virtio/vhost-vdpa.c. Maybe add a boolean
indicating which vhost_vdpa should register its memory listener?

It seems easier to me to simply use ASID 0 for GPA translations. If SVQ
is enabled for all queues, then ASID 0 maps GPA to qemu's VA plus the
SVQ vrings. If it is not, ASID 0 always maps GPA to qemu's VA.
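
In other words, the layout I'm proposing behaves like this (a sketch:
the two constants come from the series, but listener_asid and
svq_map_asid are made-up helper names, not the actual patch code):

```c
#include <stdbool.h>
#include <stdint.h>

/* ASID layout used by the series for -net vhost-vdpa. */
#define VHOST_VDPA_NET_DATA_ASID 0
#define VHOST_VDPA_NET_CVQ_ASID  1

/* The memory listener always installs GPA -> VA translations in ASID 0,
 * no matter which vhost_vdpa it happens to be registered against. */
static uint32_t listener_asid(void)
{
    return VHOST_VDPA_NET_DATA_ASID;
}

/* SVQ ring/buffer mappings use the ASID assigned to that vhost_vdpa:
 * 1 when only CVQ is shadowed, 0 when every queue is shadowed
 * (x-svq=on), since then SVQ shares the address space with the data
 * virtqueues. */
static uint32_t svq_map_asid(bool cvq_isolated)
{
    return cvq_isolated ? VHOST_VDPA_NET_CVQ_ASID
                        : VHOST_VDPA_NET_DATA_ASID;
}
```

That way the listener code never needs to know which vhost_vdpa it was
registered on.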

Thanks!

> Thanks
>
> >                               vaddr, section->readonly);
> >      if (ret) {
> >          error_report("vhost vdpa map fail!");
> > @@ -303,7 +311,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> >          vhost_iova_tree_remove(v->iova_tree, *result);
> >      }
> >      vhost_vdpa_iotlb_batch_begin_once(v);
> > -    ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
> > +    ret = vhost_vdpa_dma_unmap(v, 0, iova, int128_get64(llsize));
> >      if (ret) {
> >          error_report("vhost_vdpa dma unmap error!");
> >      }
> > @@ -884,7 +892,7 @@ static void vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v, hwaddr addr)
> >      }
> >
> >      size = ROUND_UP(result->size, qemu_real_host_page_size());
> > -    r = vhost_vdpa_dma_unmap(v, result->iova, size);
> > +    r = vhost_vdpa_dma_unmap(v, v->address_space_id, result->iova, size);
> >      if (unlikely(r < 0)) {
> >          error_report("Unable to unmap SVQ vring: %s (%d)", g_strerror(-r), -r);
> >          return;
> > @@ -924,7 +932,8 @@ static bool vhost_vdpa_svq_map_ring(struct vhost_vdpa *v, DMAMap *needle,
> >          return false;
> >      }
> >
> > -    r = vhost_vdpa_dma_map(v, needle->iova, needle->size + 1,
> > +    r = vhost_vdpa_dma_map(v, v->address_space_id, needle->iova,
> > +                           needle->size + 1,
> >                             (void *)(uintptr_t)needle->translated_addr,
> >                             needle->perm == IOMMU_RO);
> >      if (unlikely(r != 0)) {
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index fb35b17ab4..ca1acc0410 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -258,7 +258,7 @@ static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
> >          return;
> >      }
> >
> > -    r = vhost_vdpa_dma_unmap(v, map->iova, map->size + 1);
> > +    r = vhost_vdpa_dma_unmap(v, v->address_space_id, map->iova, map->size + 1);
> >      if (unlikely(r != 0)) {
> >          error_report("Device cannot unmap: %s(%d)", g_strerror(r), r);
> >      }
> > @@ -298,8 +298,8 @@ static int vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v, void *buf, size_t size,
> >          return r;
> >      }
> >
> > -    r = vhost_vdpa_dma_map(v, map.iova, vhost_vdpa_net_cvq_cmd_page_len(), buf,
> > -                           !write);
> > +    r = vhost_vdpa_dma_map(v, v->address_space_id, map.iova,
> > +                           vhost_vdpa_net_cvq_cmd_page_len(), buf, !write);
> >      if (unlikely(r < 0)) {
> >          goto dma_map_err;
> >      }
> > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> > index 820dadc26c..0ad9390307 100644
> > --- a/hw/virtio/trace-events
> > +++ b/hw/virtio/trace-events
> > @@ -30,8 +30,8 @@ vhost_user_write(uint32_t req, uint32_t flags) "req:%d flags:0x%"PRIx32""
> >  vhost_user_create_notifier(int idx, void *n) "idx:%d n:%p"
> >
> >  # vhost-vdpa.c
> > -vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
> > -vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
> > +vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint32_t asid, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" asid: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
> > +vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint32_t asid, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" asid: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
> >  vhost_vdpa_listener_begin_batch(void *v, int fd, uint32_t msg_type, uint8_t type)  "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
> >  vhost_vdpa_listener_commit(void *v, int fd, uint32_t msg_type, uint8_t type)  "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
> >  vhost_vdpa_listener_region_add(void *vdpa, uint64_t iova, uint64_t llend, void *vaddr, bool readonly) "vdpa: %p iova 0x%"PRIx64" llend 0x%"PRIx64" vaddr: %p read-only: %d"
> > --
> > 2.31.1
> >
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa
  2022-11-10  6:00   ` Jason Wang
@ 2022-11-10 13:47     ` Eugenio Perez Martin
  2022-11-11  7:48       ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-10 13:47 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Thu, Nov 10, 2022 at 7:01 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > The memory listener that tells the device how to convert GPA to qemu's
> > VA is registered against the CVQ vhost_vdpa. This series tries to map the
> > memory listener translations to ASID 0, while it maps the CVQ ones to
> > ASID 1.
> >
> > Let's tell the listener whether it needs to register them in the iova
> > tree or not.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> > v5: Solve conflict about vhost_iova_tree_remove accepting mem_region by
> >     value.
> > ---
> >  include/hw/virtio/vhost-vdpa.h | 2 ++
> >  hw/virtio/vhost-vdpa.c         | 6 +++---
> >  net/vhost-vdpa.c               | 1 +
> >  3 files changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > index 6560bb9d78..0c3ed2d69b 100644
> > --- a/include/hw/virtio/vhost-vdpa.h
> > +++ b/include/hw/virtio/vhost-vdpa.h
> > @@ -34,6 +34,8 @@ typedef struct vhost_vdpa {
> >      struct vhost_vdpa_iova_range iova_range;
> >      uint64_t acked_features;
> >      bool shadow_vqs_enabled;
> > +    /* The listener must send iova tree addresses, not GPA */
> > +    bool listener_shadow_vq;
> >      /* IOVA mapping used by the Shadow Virtqueue */
> >      VhostIOVATree *iova_tree;
> >      GPtrArray *shadow_vqs;
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index 8fd32ba32b..e3914fa40e 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -220,7 +220,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >                                           vaddr, section->readonly);
> >
> >      llsize = int128_sub(llend, int128_make64(iova));
> > -    if (v->shadow_vqs_enabled) {
> > +    if (v->listener_shadow_vq) {
> >          int r;
> >
> >          mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
> > @@ -247,7 +247,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >      return;
> >
> >  fail_map:
> > -    if (v->shadow_vqs_enabled) {
> > +    if (v->listener_shadow_vq) {
> >          vhost_iova_tree_remove(v->iova_tree, mem_region);
> >      }
> >
> > @@ -292,7 +292,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> >
> >      llsize = int128_sub(llend, int128_make64(iova));
> >
> > -    if (v->shadow_vqs_enabled) {
> > +    if (v->listener_shadow_vq) {
> >          const DMAMap *result;
> >          const void *vaddr = memory_region_get_ram_ptr(section->mr) +
> >              section->offset_within_region +
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index 85a318faca..02780ee37b 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -570,6 +570,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >      s->vhost_vdpa.index = queue_pair_index;
> >      s->always_svq = svq;
> >      s->vhost_vdpa.shadow_vqs_enabled = svq;
> > +    s->vhost_vdpa.listener_shadow_vq = svq;
>
> Any chance those above two can differ?
>

If CVQ is shadowed but data VQs are not, shadow_vqs_enabled is true
but listener_shadow_vq is not.

It becomes clearer in the next commit, where only shadow_vqs_enabled is
set to true at vhost_vdpa_net_cvq_start.
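
To illustrate how the two flags can diverge (a sketch with a made-up
helper, cvq_flags is not in the patch; only the flag names are):

```c
#include <stdbool.h>

/* shadow_vqs_enabled: this vhost_vdpa's vrings go through SVQ.
 * listener_shadow_vq: the memory listener must send iova-tree
 * addresses instead of GPA. */
struct flags {
    bool shadow_vqs_enabled;
    bool listener_shadow_vq;
};

static struct flags cvq_flags(bool x_svq, bool cvq_isolated)
{
    struct flags f;

    /* Only with x-svq=on is everything shadowed, listener included. */
    f.listener_shadow_vq = x_svq;
    /* CVQ itself is shadowed either way, as long as it can be
     * isolated in its own ASID. */
    f.shadow_vqs_enabled = x_svq || cvq_isolated;
    return f;
}
```

So for the CVQ vhost_vdpa with x-svq=off but an independent CVQ group,
shadow_vqs_enabled is true while listener_shadow_vq stays false.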

Thanks!

> Thanks
>
> >      s->vhost_vdpa.iova_tree = iova_tree;
> >      if (!is_datapath) {
> >          s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
> > --
> > 2.31.1
> >
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 10/10] vdpa: Always start CVQ in SVQ mode
  2022-11-10  6:24   ` Jason Wang
@ 2022-11-10 16:07     ` Eugenio Perez Martin
  2022-11-11  8:02       ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-10 16:07 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Thu, Nov 10, 2022 at 7:25 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2022/11/9 01:07, Eugenio Pérez 写道:
> > Isolate the control virtqueue in its own group, allowing QEMU to
> > intercept control commands while letting the dataplane run fully
> > passthrough to the guest.
>
>
> I think we need to tweak the title to "vdpa: Always start CVQ in SVQ
> mode if possible". Since SVQ for CVQ can't be enabled without ASID support?
>

Yes, I totally agree.

>
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> > v6:
> > * Disable control SVQ if the device does not support it because of
> > features.
> >
> > v5:
> > * Fixing the not adding cvq buffers when x-svq=on is specified.
> > * Move vring state in vhost_vdpa_get_vring_group instead of using a
> >    parameter.
> > * Rename VHOST_VDPA_NET_CVQ_PASSTHROUGH to VHOST_VDPA_NET_DATA_ASID
> >
> > v4:
> > * Squash vhost_vdpa_cvq_group_is_independent.
> > * Rebased on last CVQ start series, that allocated CVQ cmd bufs at load
> > * Do not check for cvq index on vhost_vdpa_net_prepare, we only have one
> >    that callback registered in that NetClientInfo.
> >
> > v3:
> > * Make asid related queries print a warning instead of returning an
> >    error and stop the start of qemu.
> > ---
> >   hw/virtio/vhost-vdpa.c |   3 +-
> >   net/vhost-vdpa.c       | 138 ++++++++++++++++++++++++++++++++++++++---
> >   2 files changed, 132 insertions(+), 9 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index e3914fa40e..6401e7efb1 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -648,7 +648,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
> >   {
> >       uint64_t features;
> >       uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 |
> > -        0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH;
> > +        0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH |
> > +        0x1ULL << VHOST_BACKEND_F_IOTLB_ASID;
> >       int r;
> >
> >       if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) {
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index 02780ee37b..7245ea70c6 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -38,6 +38,9 @@ typedef struct VhostVDPAState {
> >       void *cvq_cmd_out_buffer;
> >       virtio_net_ctrl_ack *status;
> >
> > +    /* Number of address spaces supported by the device */
> > +    unsigned address_space_num;
>
>
> I'm not sure this is the best place to store thing like this since it
> can cause confusion. We will have multiple VhostVDPAState when
> multiqueue is enabled.
>

I think we can delete this and query it at each device start.

>
> > +
> >       /* The device always have SVQ enabled */
> >       bool always_svq;
> >       bool started;
> > @@ -101,6 +104,9 @@ static const uint64_t vdpa_svq_device_features =
> >       BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
> >       BIT_ULL(VIRTIO_NET_F_STANDBY);
> >
> > +#define VHOST_VDPA_NET_DATA_ASID 0
> > +#define VHOST_VDPA_NET_CVQ_ASID 1
> > +
> >   VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
> >   {
> >       VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > @@ -242,6 +248,34 @@ static NetClientInfo net_vhost_vdpa_info = {
> >           .check_peer_type = vhost_vdpa_check_peer_type,
> >   };
> >
> > +static uint32_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> > +{
> > +    struct vhost_vring_state state = {
> > +        .index = vq_index,
> > +    };
> > +    int r = ioctl(device_fd, VHOST_VDPA_GET_VRING_GROUP, &state);
> > +
> > +    return r < 0 ? 0 : state.num;
>
>
> Assume 0 when ioctl() fail is probably not a good idea: errors in ioctl
> might be hidden. It would be better to fallback to 0 when ASID is not
> supported.
>

Did I misunderstand you on [1]?

>
> > +}
> > +
> > +static int vhost_vdpa_set_address_space_id(struct vhost_vdpa *v,
> > +                                           unsigned vq_group,
> > +                                           unsigned asid_num)
> > +{
> > +    struct vhost_vring_state asid = {
> > +        .index = vq_group,
> > +        .num = asid_num,
> > +    };
> > +    int ret;
> > +
> > +    ret = ioctl(v->device_fd, VHOST_VDPA_SET_GROUP_ASID, &asid);
> > +    if (unlikely(ret < 0)) {
> > +        warn_report("Can't set vq group %u asid %u, errno=%d (%s)",
> > +            asid.index, asid.num, errno, g_strerror(errno));
> > +    }
> > +    return ret;
> > +}
> > +
> >   static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
> >   {
> >       VhostIOVATree *tree = v->iova_tree;
> > @@ -316,11 +350,54 @@ dma_map_err:
> >   static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> >   {
> >       VhostVDPAState *s;
> > -    int r;
> > +    struct vhost_vdpa *v;
> > +    uint32_t cvq_group;
> > +    int cvq_index, r;
> >
> >       assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >
> >       s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +    v = &s->vhost_vdpa;
> > +
> > +    v->listener_shadow_vq = s->always_svq;
> > +    v->shadow_vqs_enabled = s->always_svq;
> > +    s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_DATA_ASID;
> > +
> > +    if (s->always_svq) {
> > +        goto out;
> > +    }
> > +
> > +    if (s->address_space_num < 2) {
> > +        return 0;
> > +    }
> > +
> > +    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > +        return 0;
> > +    }
>
>
> Any reason we do the above check during the start/stop? It should be
> easier to do that in the initialization.
>

Maybe we can store it as a member of VhostVDPAState? It would be
duplicated there, like the current number of ASes.

>
> > +
> > +    /**
> > +     * Check if all the virtqueues of the virtio device are in a different vq
> > +     * than the last vq. VQ group of last group passed in cvq_group.
> > +     */
> > +    cvq_index = v->dev->vq_index_end - 1;
> > +    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
> > +    for (int i = 0; i < cvq_index; ++i) {
> > +        uint32_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
> > +
> > +        if (unlikely(group == cvq_group)) {
> > +            warn_report("CVQ %u group is the same as VQ %u one (%u)", cvq_group,
> > +                        i, group);
> > +            return 0;
> > +        }
> > +    }
> > +
> > +    r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
> > +    if (r == 0) {
> > +        v->shadow_vqs_enabled = true;
> > +        s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_CVQ_ASID;
> > +    }
> > +
> > +out:
> >       if (!s->vhost_vdpa.shadow_vqs_enabled) {
> >           return 0;
> >       }
> > @@ -542,12 +619,38 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
> >       .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
> >   };
> >
> > +static uint32_t vhost_vdpa_get_as_num(int vdpa_device_fd)
> > +{
> > +    uint64_t features;
> > +    unsigned num_as;
> > +    int r;
> > +
> > +    r = ioctl(vdpa_device_fd, VHOST_GET_BACKEND_FEATURES, &features);
> > +    if (unlikely(r < 0)) {
> > +        warn_report("Cannot get backend features");
> > +        return 1;
> > +    }
> > +
> > +    if (!(features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
> > +        return 1;
> > +    }
> > +
> > +    r = ioctl(vdpa_device_fd, VHOST_VDPA_GET_AS_NUM, &num_as);
> > +    if (unlikely(r < 0)) {
> > +        warn_report("Cannot retrieve number of supported ASs");
> > +        return 1;
>
>
> Let's return error here. This help. to identify bugs of qemu or kernel.
>

Same comment as with VHOST_VDPA_GET_VRING_GROUP.

>
> > +    }
> > +
> > +    return num_as;
> > +}
> > +
> >   static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >                                              const char *device,
> >                                              const char *name,
> >                                              int vdpa_device_fd,
> >                                              int queue_pair_index,
> >                                              int nvqs,
> > +                                           unsigned nas,
> >                                              bool is_datapath,
> >                                              bool svq,
> >                                              VhostIOVATree *iova_tree)
> > @@ -566,6 +669,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >       qemu_set_info_str(nc, TYPE_VHOST_VDPA);
> >       s = DO_UPCAST(VhostVDPAState, nc, nc);
> >
> > +    s->address_space_num = nas;
> >       s->vhost_vdpa.device_fd = vdpa_device_fd;
> >       s->vhost_vdpa.index = queue_pair_index;
> >       s->always_svq = svq;
> > @@ -652,6 +756,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >       g_autoptr(VhostIOVATree) iova_tree = NULL;
> >       NetClientState *nc;
> >       int queue_pairs, r, i = 0, has_cvq = 0;
> > +    unsigned num_as = 1;
> > +    bool svq_cvq;
> >
> >       assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >       opts = &netdev->u.vhost_vdpa;
> > @@ -693,12 +799,28 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >           return queue_pairs;
> >       }
> >
> > -    if (opts->x_svq) {
> > -        struct vhost_vdpa_iova_range iova_range;
> > +    svq_cvq = opts->x_svq;
> > +    if (has_cvq && !opts->x_svq) {
> > +        num_as = vhost_vdpa_get_as_num(vdpa_device_fd);
> > +        svq_cvq = num_as > 1;
> > +    }
>
>
> The above check is not easy to follow, how about?
>
> svq_cvq = vhost_vdpa_get_as_num() > 1 ? true : opts->x_svq;
>

That would allocate the iova tree even if CVQ is not used in the
guest. And num_as is reused later, although we could query the device
for it at device start to avoid this.

If anything, the linear conversion would be:
svq_cvq = opts->x_svq || (has_cvq && vhost_vdpa_get_as_num(vdpa_device_fd) > 1)

That way we avoid the AS_NUM ioctl when it is not needed.
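As a quick illustration (the stub function and the call counter are
hypothetical, standing in for the real ioctl path), the short-circuit
form only issues the AS_NUM query when x-svq is off and the device
actually has a CVQ:

```c
#include <stdbool.h>

/* Counts how many times the (stubbed) VHOST_VDPA_GET_AS_NUM query runs. */
static int as_num_queries;

static unsigned stub_get_as_num(void)
{
    ++as_num_queries;
    return 2; /* pretend the device exposes two address spaces */
}

/* Mirrors the proposed expression: || and && short-circuit left to right,
 * so the query is skipped when x_svq already forces shadowing or when the
 * device has no CVQ at all. */
static bool decide_svq_cvq(bool x_svq, bool has_cvq)
{
    return x_svq || (has_cvq && stub_get_as_num() > 1);
}
```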

>
> > +
> > +    if (opts->x_svq || svq_cvq) {
>
>
> Any chance we can have opts->x_svq = true but svq_cvq = false? Checking
> svq_cvq seems sufficient here.
>

The reverse is possible: to have svq_cvq but not opts->x_svq.

Depending on which one is set, this code emits a warning or a fatal error.

>
> > +        Error *warn = NULL;
> >
> > -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
> > -            goto err_svq;
> > +        svq_cvq = vhost_vdpa_net_valid_svq_features(features,
> > +                                                   opts->x_svq ? errp : &warn);
> > +        if (!svq_cvq) {
>
>
> Same question as above.
>
>
> > +            if (opts->x_svq) {
> > +                goto err_svq;
> > +            } else {
> > +                warn_reportf_err(warn, "Cannot shadow CVQ: ");
> > +            }
> >           }
> > +    }
> > +
> > +    if (opts->x_svq || svq_cvq) {
> > +        struct vhost_vdpa_iova_range iova_range;
> >
> >           vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
> >           iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
> > @@ -708,15 +830,15 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >
> >       for (i = 0; i < queue_pairs; i++) {
> >           ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> > -                                     vdpa_device_fd, i, 2, true, opts->x_svq,
> > -                                     iova_tree);
> > +                                     vdpa_device_fd, i, 2, num_as, true,
>
>
> I don't get why we need pass num_as to a specific vhost_vdpa structure.
> It should be sufficient to pass asid there.
>

The ASID is not known at this time; it is only known at device start.
This is because we cannot check earlier whether the CVQ is in its own
vq group: we don't know the control virtqueue index until the guest
acknowledges the features.

Thanks!

[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg901685.html


>
> > +                                     opts->x_svq, iova_tree);
> >           if (!ncs[i])
> >               goto err;
> >       }
> >
> >       if (has_cvq) {
> >           nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> > -                                 vdpa_device_fd, i, 1, false,
> > +                                 vdpa_device_fd, i, 1, num_as, false,
> >                                    opts->x_svq, iova_tree);
> >           if (!nc)
> >               goto err;
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 01/10] vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop
  2022-11-10 12:54     ` Eugenio Perez Martin
@ 2022-11-11  7:24       ` Jason Wang
  0 siblings, 0 replies; 46+ messages in thread
From: Jason Wang @ 2022-11-11  7:24 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini


在 2022/11/10 20:54, Eugenio Perez Martin 写道:
> On Thu, Nov 10, 2022 at 6:22 AM Jason Wang <jasowang@redhat.com> wrote:
>> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>> This function used to trust in v->shadow_vqs != NULL to know if it must
>>> start svq or not.
>>>
>>> This is not going to be valid anymore, as qemu is going to allocate svq
>>> unconditionally (but it will only start them conditionally).
>> It might be a waste of memory if we did this. Any reason for this?
>>
> Well, it's modelled after vhost_vdpa notifier member [1].


Right, this could be optimized in the future as well.


>
> But sure, we can reduce the memory usage if SVQ is not used. The first
> function that needs it is vhost_set_vring_kick, but I don't think that
> is a good place for the delayed allocation.
>
> Would it work to move the allocation to the vhost_set_features vhost op?
> It seems unlikely to me that callbacks that can affect SVQ are called
> earlier than that one. Or maybe create a new op and call it first in
> vhost.c:vhost_dev_start?


Rethink about this, so I think we can leave this in the future.

Thanks


>
> Thanks!
>
> [1] The notifier member already allocates VIRTIO_QUEUE_MAX
> VhostVDPAHostNotifier entries for each vhost_vdpa. It would be easy to
> reduce that to at least the number of virtqueues on a vhost_vdpa. Should
> I send a patch for this one?
>
>
>> Thanks
>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>>   hw/virtio/vhost-vdpa.c | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>> index 7468e44b87..7f0ff4df5b 100644
>>> --- a/hw/virtio/vhost-vdpa.c
>>> +++ b/hw/virtio/vhost-vdpa.c
>>> @@ -1029,7 +1029,7 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
>>>       Error *err = NULL;
>>>       unsigned i;
>>>
>>> -    if (!v->shadow_vqs) {
>>> +    if (!v->shadow_vqs_enabled) {
>>>           return true;
>>>       }
>>>
>>> @@ -1082,7 +1082,7 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
>>>   {
>>>       struct vhost_vdpa *v = dev->opaque;
>>>
>>> -    if (!v->shadow_vqs) {
>>> +    if (!v->shadow_vqs_enabled) {
>>>           return;
>>>       }
>>>
>>> --
>>> 2.31.1
>>>
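The reduction proposed in [1] could look roughly like this (illustrative
type and field names, not the actual QEMU structures):

```c
#include <stdlib.h>

#define VIRTIO_QUEUE_MAX 1024 /* upper bound the current code allocates for */

typedef struct {
    void *addr; /* placeholder for the mapped notifier page */
} HostNotifier;

typedef struct {
    unsigned nvqs;          /* virtqueues this vhost_vdpa actually handles */
    HostNotifier *notifier; /* sized to nvqs instead of VIRTIO_QUEUE_MAX */
} VdpaDev;

/* Allocate only what this device needs: a net data device handles two vqs,
 * so the array shrinks from VIRTIO_QUEUE_MAX entries to two. */
static int vdpa_dev_init(VdpaDev *dev, unsigned nvqs)
{
    dev->nvqs = nvqs;
    dev->notifier = calloc(nvqs, sizeof(*dev->notifier));
    return dev->notifier ? 0 : -1;
}
```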


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 05/10] vdpa: move SVQ vring features check to net/
  2022-11-10 13:09     ` Eugenio Perez Martin
@ 2022-11-11  7:34       ` Jason Wang
  2022-11-11  7:55         ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-11  7:34 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini


在 2022/11/10 21:09, Eugenio Perez Martin 写道:
> On Thu, Nov 10, 2022 at 6:40 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> 在 2022/11/9 01:07, Eugenio Pérez 写道:
>>> The next patches will start control SVQ if possible. However, we don't
>>> know if that will be possible at qemu boot anymore.
>>
>> If I was not wrong, there's no device specific feature that is checked
>> in the function. So it should be general enough to be used by devices
>> other than net. Then I don't see any advantage of doing this.
>>
> Because vhost_vdpa_init_svq is called at qemu boot, and it fails if it
> is not possible to shadow the virtqueue.
>
> Now the CVQ will be shadowed if possible, so we need to check this at
> device start, not at initialization.


Any reason we can't check this at device start? We don't need 
driver_features and we can do any probing to make sure cvq has an unique 
group during initialization time.


>   Storing this information at boot time is no longer valid, because
> v->shadow_vqs_enabled is not known at that point anymore.


Ok, but this doesn't explain why it is net specific rather than vhost-vdpa specific.

Thanks


>
> Thanks!
>
>> Thanks
>>
>>
>>> Since the moved checks will be already evaluated at net/ to know if it
>>> is ok to shadow CVQ, move them.
>>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>>    hw/virtio/vhost-vdpa.c | 33 ++-------------------------------
>>>    net/vhost-vdpa.c       |  3 ++-
>>>    2 files changed, 4 insertions(+), 32 deletions(-)
>>>
>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>> index 3df2775760..146f0dcb40 100644
>>> --- a/hw/virtio/vhost-vdpa.c
>>> +++ b/hw/virtio/vhost-vdpa.c
>>> @@ -402,29 +402,9 @@ static int vhost_vdpa_get_dev_features(struct vhost_dev *dev,
>>>        return ret;
>>>    }
>>>
>>> -static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
>>> -                               Error **errp)
>>> +static void vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v)
>>>    {
>>>        g_autoptr(GPtrArray) shadow_vqs = NULL;
>>> -    uint64_t dev_features, svq_features;
>>> -    int r;
>>> -    bool ok;
>>> -
>>> -    if (!v->shadow_vqs_enabled) {
>>> -        return 0;
>>> -    }
>>> -
>>> -    r = vhost_vdpa_get_dev_features(hdev, &dev_features);
>>> -    if (r != 0) {
>>> -        error_setg_errno(errp, -r, "Can't get vdpa device features");
>>> -        return r;
>>> -    }
>>> -
>>> -    svq_features = dev_features;
>>> -    ok = vhost_svq_valid_features(svq_features, errp);
>>> -    if (unlikely(!ok)) {
>>> -        return -1;
>>> -    }
>>>
>>>        shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
>>>        for (unsigned n = 0; n < hdev->nvqs; ++n) {
>>> @@ -436,7 +416,6 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
>>>        }
>>>
>>>        v->shadow_vqs = g_steal_pointer(&shadow_vqs);
>>> -    return 0;
>>>    }
>>>
>>>    static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
>>> @@ -461,11 +440,7 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
>>>        dev->opaque =  opaque ;
>>>        v->listener = vhost_vdpa_memory_listener;
>>>        v->msg_type = VHOST_IOTLB_MSG_V2;
>>> -    ret = vhost_vdpa_init_svq(dev, v, errp);
>>> -    if (ret) {
>>> -        goto err;
>>> -    }
>>> -
>>> +    vhost_vdpa_init_svq(dev, v);
>>>        vhost_vdpa_get_iova_range(v);
>>>
>>>        if (!vhost_vdpa_first_dev(dev)) {
>>> @@ -476,10 +451,6 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
>>>                                   VIRTIO_CONFIG_S_DRIVER);
>>>
>>>        return 0;
>>> -
>>> -err:
>>> -    ram_block_discard_disable(false);
>>> -    return ret;
>>>    }
>>>
>>>    static void vhost_vdpa_host_notifier_uninit(struct vhost_dev *dev,
>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>> index d3b1de481b..fb35b17ab4 100644
>>> --- a/net/vhost-vdpa.c
>>> +++ b/net/vhost-vdpa.c
>>> @@ -117,9 +117,10 @@ static bool vhost_vdpa_net_valid_svq_features(uint64_t features, Error **errp)
>>>        if (invalid_dev_features) {
>>>            error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
>>>                       invalid_dev_features);
>>> +        return false;
>>>        }
>>>
>>> -    return !invalid_dev_features;
>>> +    return vhost_svq_valid_features(features, errp);
>>>    }
>>>
>>>    static int vhost_vdpa_net_check_device_id(struct vhost_net *net)


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 07/10] vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap
  2022-11-10 13:22     ` Eugenio Perez Martin
@ 2022-11-11  7:41       ` Jason Wang
  2022-11-11 13:02         ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-11  7:41 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini


在 2022/11/10 21:22, Eugenio Perez Martin 写道:
> On Thu, Nov 10, 2022 at 6:51 AM Jason Wang <jasowang@redhat.com> wrote:
>> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>> So the caller can choose which ASID is destined.
>>>
>>> No need to update the batch functions as they will always be called from
>>> memory listener updates at the moment. Memory listener updates will
>>> always update ASID 0, as it's the passthrough ASID.
>>>
>>> All vhost devices's ASID are 0 at this moment.
>>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>> v5:
>>> * Solve conflict, now vhost_vdpa_svq_unmap_ring returns void
>>> * Change comment on zero initialization.
>>>
>>> v4: Add comment specifying behavior if device does not support _F_ASID
>>>
>>> v3: Deleted unneeded space
>>> ---
>>>   include/hw/virtio/vhost-vdpa.h |  8 +++++---
>>>   hw/virtio/vhost-vdpa.c         | 29 +++++++++++++++++++----------
>>>   net/vhost-vdpa.c               |  6 +++---
>>>   hw/virtio/trace-events         |  4 ++--
>>>   4 files changed, 29 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
>>> index 1111d85643..6560bb9d78 100644
>>> --- a/include/hw/virtio/vhost-vdpa.h
>>> +++ b/include/hw/virtio/vhost-vdpa.h
>>> @@ -29,6 +29,7 @@ typedef struct vhost_vdpa {
>>>       int index;
>>>       uint32_t msg_type;
>>>       bool iotlb_batch_begin_sent;
>>> +    uint32_t address_space_id;
>> So the trick is let device specific code to zero this during allocation?
>>
> Yes, but I don't see how that is a trick :). All the other parameters
> also rely on being zero-initialized at allocation.
>
>>>       MemoryListener listener;
>>>       struct vhost_vdpa_iova_range iova_range;
>>>       uint64_t acked_features;
>>> @@ -42,8 +43,9 @@ typedef struct vhost_vdpa {
>>>       VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
>>>   } VhostVDPA;
>>>
>>> -int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
>>> -                       void *vaddr, bool readonly);
>>> -int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
>>> +int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
>>> +                       hwaddr size, void *vaddr, bool readonly);
>>> +int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
>>> +                         hwaddr size);
>>>
>>>   #endif
>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>> index 23efb8f49d..8fd32ba32b 100644
>>> --- a/hw/virtio/vhost-vdpa.c
>>> +++ b/hw/virtio/vhost-vdpa.c
>>> @@ -72,22 +72,24 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
>>>       return false;
>>>   }
>>>
>>> -int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
>>> -                       void *vaddr, bool readonly)
>>> +int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
>>> +                       hwaddr size, void *vaddr, bool readonly)
>>>   {
>>>       struct vhost_msg_v2 msg = {};
>>>       int fd = v->device_fd;
>>>       int ret = 0;
>>>
>>>       msg.type = v->msg_type;
>>> +    msg.asid = asid; /* 0 if vdpa device does not support asid */
>> The comment here is confusing. If this is a requirement, we need either
>>
>> 1) doc this
>>
>> or
>>
>> 2) perform necessary checks in the function itself.
>>
> I only documented it in vhost_vdpa_dma_unmap and now I realize it.
> Would it work to just copy that comment here?


Probably, and let's move the comment above the function definition.


>
>>>       msg.iotlb.iova = iova;
>>>       msg.iotlb.size = size;
>>>       msg.iotlb.uaddr = (uint64_t)(uintptr_t)vaddr;
>>>       msg.iotlb.perm = readonly ? VHOST_ACCESS_RO : VHOST_ACCESS_RW;
>>>       msg.iotlb.type = VHOST_IOTLB_UPDATE;
>>>
>>> -   trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.iotlb.iova, msg.iotlb.size,
>>> -                            msg.iotlb.uaddr, msg.iotlb.perm, msg.iotlb.type);
>>> +    trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.asid, msg.iotlb.iova,
>>> +                             msg.iotlb.size, msg.iotlb.uaddr, msg.iotlb.perm,
>>> +                             msg.iotlb.type);
>>>
>>>       if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
>>>           error_report("failed to write, fd=%d, errno=%d (%s)",
>>> @@ -98,18 +100,24 @@ int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
>>>       return ret;
>>>   }
>>>
>>> -int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size)
>>> +int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
>>> +                         hwaddr size)
>>>   {
>>>       struct vhost_msg_v2 msg = {};
>>>       int fd = v->device_fd;
>>>       int ret = 0;
>>>
>>>       msg.type = v->msg_type;
>>> +    /*
>>> +     * The caller must set asid = 0 if the device does not support asid.
>>> +     * This is not an ABI break since it is set to 0 by the initializer anyway.
>>> +     */
>>> +    msg.asid = asid;
>>>       msg.iotlb.iova = iova;
>>>       msg.iotlb.size = size;
>>>       msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
>>>
>>> -    trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.iotlb.iova,
>>> +    trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.asid, msg.iotlb.iova,
>>>                                  msg.iotlb.size, msg.iotlb.type);
>>>
>>>       if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
>>> @@ -229,7 +237,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>>>       }
>>>
>>>       vhost_vdpa_iotlb_batch_begin_once(v);
>>> -    ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
>>> +    ret = vhost_vdpa_dma_map(v, 0, iova, int128_get64(llsize),
>> Can we use v->address_space_id here? Then we don't need to modify this
>> line when we support multiple asids logic in the future.
>>
> The registered memory listener is the one of the last vhost_vdpa, the
> one that handles the last queue.
>
> If all data virtqueues are not shadowed but CVQ is,
> v->address_space_id is 1 with the current code.


Ok, right. So let's add a comment here. It would be even better to 
define the macro for data vq asid in this patch.
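A sketch of that split (macro names as used elsewhere in this series; the
listener always targets the data ASID because it translates guest GPA):

```c
#include <stdint.h>

/* ASID assignments used by this series. */
#define VHOST_VDPA_NET_DATA_ASID 0 /* guest-visible GPA mappings */
#define VHOST_VDPA_NET_CVQ_ASID  1 /* shadow CVQ buffers in QEMU VA */

/* The memory listener maps guest memory, so it uses the data ASID even
 * when it happens to be registered on the vhost_vdpa instance whose
 * address_space_id is the CVQ one. */
static uint32_t listener_asid(void)
{
    return VHOST_VDPA_NET_DATA_ASID;
}
```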


Thanks


>   But the listener is
> actually mapping the ASID 0, not 1.
>
> Another alternative is to register it to the last data virtqueue, not
> the last queue of vhost_vdpa. But it is hard to express that in a
> generic way at virtio/vhost-vdpa.c. Maybe add a boolean indicating
> which vhost_vdpa should register its memory listener?
>
> It seems easier to me to simply assign 0 at GPA translations. If SVQ
> is enabled for all queues, then 0 is GPA to qemu's VA + SVQ stuff. If
> it is not, 0 is always GPA to qemu's VA.
>
> Thanks!
>
>> Thanks
>>
>>>                                vaddr, section->readonly);
>>>       if (ret) {
>>>           error_report("vhost vdpa map fail!");
>>> @@ -303,7 +311,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
>>>           vhost_iova_tree_remove(v->iova_tree, *result);
>>>       }
>>>       vhost_vdpa_iotlb_batch_begin_once(v);
>>> -    ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
>>> +    ret = vhost_vdpa_dma_unmap(v, 0, iova, int128_get64(llsize));
>>>       if (ret) {
>>>           error_report("vhost_vdpa dma unmap error!");
>>>       }
>>> @@ -884,7 +892,7 @@ static void vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v, hwaddr addr)
>>>       }
>>>
>>>       size = ROUND_UP(result->size, qemu_real_host_page_size());
>>> -    r = vhost_vdpa_dma_unmap(v, result->iova, size);
>>> +    r = vhost_vdpa_dma_unmap(v, v->address_space_id, result->iova, size);
>>>       if (unlikely(r < 0)) {
>>>           error_report("Unable to unmap SVQ vring: %s (%d)", g_strerror(-r), -r);
>>>           return;
>>> @@ -924,7 +932,8 @@ static bool vhost_vdpa_svq_map_ring(struct vhost_vdpa *v, DMAMap *needle,
>>>           return false;
>>>       }
>>>
>>> -    r = vhost_vdpa_dma_map(v, needle->iova, needle->size + 1,
>>> +    r = vhost_vdpa_dma_map(v, v->address_space_id, needle->iova,
>>> +                           needle->size + 1,
>>>                              (void *)(uintptr_t)needle->translated_addr,
>>>                              needle->perm == IOMMU_RO);
>>>       if (unlikely(r != 0)) {
>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>> index fb35b17ab4..ca1acc0410 100644
>>> --- a/net/vhost-vdpa.c
>>> +++ b/net/vhost-vdpa.c
>>> @@ -258,7 +258,7 @@ static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
>>>           return;
>>>       }
>>>
>>> -    r = vhost_vdpa_dma_unmap(v, map->iova, map->size + 1);
>>> +    r = vhost_vdpa_dma_unmap(v, v->address_space_id, map->iova, map->size + 1);
>>>       if (unlikely(r != 0)) {
>>>           error_report("Device cannot unmap: %s(%d)", g_strerror(r), r);
>>>       }
>>> @@ -298,8 +298,8 @@ static int vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v, void *buf, size_t size,
>>>           return r;
>>>       }
>>>
>>> -    r = vhost_vdpa_dma_map(v, map.iova, vhost_vdpa_net_cvq_cmd_page_len(), buf,
>>> -                           !write);
>>> +    r = vhost_vdpa_dma_map(v, v->address_space_id, map.iova,
>>> +                           vhost_vdpa_net_cvq_cmd_page_len(), buf, !write);
>>>       if (unlikely(r < 0)) {
>>>           goto dma_map_err;
>>>       }
>>> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
>>> index 820dadc26c..0ad9390307 100644
>>> --- a/hw/virtio/trace-events
>>> +++ b/hw/virtio/trace-events
>>> @@ -30,8 +30,8 @@ vhost_user_write(uint32_t req, uint32_t flags) "req:%d flags:0x%"PRIx32""
>>>   vhost_user_create_notifier(int idx, void *n) "idx:%d n:%p"
>>>
>>>   # vhost-vdpa.c
>>> -vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
>>> -vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
>>> +vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint32_t asid, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" asid: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
>>> +vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint32_t asid, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" asid: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
>>>   vhost_vdpa_listener_begin_batch(void *v, int fd, uint32_t msg_type, uint8_t type)  "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
>>>   vhost_vdpa_listener_commit(void *v, int fd, uint32_t msg_type, uint8_t type)  "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
>>>   vhost_vdpa_listener_region_add(void *vdpa, uint64_t iova, uint64_t llend, void *vaddr, bool readonly) "vdpa: %p iova 0x%"PRIx64" llend 0x%"PRIx64" vaddr: %p read-only: %d"
>>> --
>>> 2.31.1
>>>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa
  2022-11-10 13:47     ` Eugenio Perez Martin
@ 2022-11-11  7:48       ` Jason Wang
  2022-11-11 13:12         ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-11  7:48 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini


在 2022/11/10 21:47, Eugenio Perez Martin 写道:
> On Thu, Nov 10, 2022 at 7:01 AM Jason Wang <jasowang@redhat.com> wrote:
>> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>> The memory listener that tells the device how to convert GPA to qemu's
>>> va is registered against CVQ vhost_vdpa. This series tries to map the
>>> memory listener translations to ASID 0, while it maps the CVQ ones to
>>> ASID 1.
>>>
>>> Let's tell the listener if it needs to register them on iova tree or
>>> not.
>>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>> v5: Solve conflict about vhost_iova_tree_remove accepting mem_region by
>>>      value.
>>> ---
>>>   include/hw/virtio/vhost-vdpa.h | 2 ++
>>>   hw/virtio/vhost-vdpa.c         | 6 +++---
>>>   net/vhost-vdpa.c               | 1 +
>>>   3 files changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
>>> index 6560bb9d78..0c3ed2d69b 100644
>>> --- a/include/hw/virtio/vhost-vdpa.h
>>> +++ b/include/hw/virtio/vhost-vdpa.h
>>> @@ -34,6 +34,8 @@ typedef struct vhost_vdpa {
>>>       struct vhost_vdpa_iova_range iova_range;
>>>       uint64_t acked_features;
>>>       bool shadow_vqs_enabled;
>>> +    /* The listener must send iova tree addresses, not GPA */


Btw, cindy's vIOMMU series will make it not necessarily GPA any more.


>>> +    bool listener_shadow_vq;
>>>       /* IOVA mapping used by the Shadow Virtqueue */
>>>       VhostIOVATree *iova_tree;
>>>       GPtrArray *shadow_vqs;
>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>> index 8fd32ba32b..e3914fa40e 100644
>>> --- a/hw/virtio/vhost-vdpa.c
>>> +++ b/hw/virtio/vhost-vdpa.c
>>> @@ -220,7 +220,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>>>                                            vaddr, section->readonly);
>>>
>>>       llsize = int128_sub(llend, int128_make64(iova));
>>> -    if (v->shadow_vqs_enabled) {
>>> +    if (v->listener_shadow_vq) {
>>>           int r;
>>>
>>>           mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
>>> @@ -247,7 +247,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>>>       return;
>>>
>>>   fail_map:
>>> -    if (v->shadow_vqs_enabled) {
>>> +    if (v->listener_shadow_vq) {
>>>           vhost_iova_tree_remove(v->iova_tree, mem_region);
>>>       }
>>>
>>> @@ -292,7 +292,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
>>>
>>>       llsize = int128_sub(llend, int128_make64(iova));
>>>
>>> -    if (v->shadow_vqs_enabled) {
>>> +    if (v->listener_shadow_vq) {
>>>           const DMAMap *result;
>>>           const void *vaddr = memory_region_get_ram_ptr(section->mr) +
>>>               section->offset_within_region +
>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>> index 85a318faca..02780ee37b 100644
>>> --- a/net/vhost-vdpa.c
>>> +++ b/net/vhost-vdpa.c
>>> @@ -570,6 +570,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>>>       s->vhost_vdpa.index = queue_pair_index;
>>>       s->always_svq = svq;
>>>       s->vhost_vdpa.shadow_vqs_enabled = svq;
>>> +    s->vhost_vdpa.listener_shadow_vq = svq;
>> Any chance those above two can differ?
>>
> If CVQ is shadowed but data VQs are not, shadow_vqs_enabled is true
> but listener_shadow_vq is not.
>
> It is more clear in the next commit, where only shadow_vqs_enabled is
> set to true at vhost_vdpa_net_cvq_start.


Ok, the name looks a little bit confusing. I wonder if it's better to
use shadow_cvq and shadow_data?

Thanks


>
> Thanks!
>
>> Thanks
>>
>>>       s->vhost_vdpa.iova_tree = iova_tree;
>>>       if (!is_datapath) {
>>>           s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
>>> --
>>> 2.31.1
>>>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 05/10] vdpa: move SVQ vring features check to net/
  2022-11-11  7:34       ` Jason Wang
@ 2022-11-11  7:55         ` Eugenio Perez Martin
  2022-11-11  8:07           ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-11  7:55 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Fri, Nov 11, 2022 at 8:34 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2022/11/10 21:09, Eugenio Perez Martin 写道:
> > On Thu, Nov 10, 2022 at 6:40 AM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> 在 2022/11/9 01:07, Eugenio Pérez 写道:
> >>> The next patches will start control SVQ if possible. However, we don't
> >>> know if that will be possible at qemu boot anymore.
> >>
> >> If I was not wrong, there's no device specific feature that is checked
> >> in the function. So it should be general enough to be used by devices
> >> other than net. Then I don't see any advantage of doing this.
> >>
> > Because vhost_vdpa_init_svq is called at qemu boot, failing if it is
> > not possible to shadow the Virtqueue.
> >
> > Now the CVQ will be shadowed if possible, so we need to check this at
> > device start, not at initialization.
>
>
> Any reason we can't check this at device start? We don't need
> driver_features and we can do any probing to make sure cvq has an unique
> group during initialization time.
>

We need the CVQ index to check if it has an independent group. The CVQ
index depends on the features the guest acks:
* If it acks _F_MQ, it is the last one.
* If it doesn't, the CVQ index is 2.

We cannot have acked features at initialization, and they could
change: It is valid for a guest to ack _F_MQ, then reset the device,
then not ack it.

>
> >   To store this information at boot
> > time is not valid anymore, because v->shadow_vqs_enabled is not valid
> > at this time anymore.
>
>
> Ok, but this doesn't explain why it is net specific but vhost-vdpa specific.
>

We can try to move it to a vhost op, but we have the same problem as
the svq array allocation: We don't have the right place in vhost ops
to check this. Maybe vhost_set_features is the right one here?

Thanks!

> Thanks
>
>
> >
> > Thanks!
> >
> >> Thanks
> >>
> >>
> >>> Since the moved checks will be already evaluated at net/ to know if it
> >>> is ok to shadow CVQ, move them.
> >>>
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> ---
> >>>    hw/virtio/vhost-vdpa.c | 33 ++-------------------------------
> >>>    net/vhost-vdpa.c       |  3 ++-
> >>>    2 files changed, 4 insertions(+), 32 deletions(-)
> >>>
> >>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>> index 3df2775760..146f0dcb40 100644
> >>> --- a/hw/virtio/vhost-vdpa.c
> >>> +++ b/hw/virtio/vhost-vdpa.c
> >>> @@ -402,29 +402,9 @@ static int vhost_vdpa_get_dev_features(struct vhost_dev *dev,
> >>>        return ret;
> >>>    }
> >>>
> >>> -static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> >>> -                               Error **errp)
> >>> +static void vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v)
> >>>    {
> >>>        g_autoptr(GPtrArray) shadow_vqs = NULL;
> >>> -    uint64_t dev_features, svq_features;
> >>> -    int r;
> >>> -    bool ok;
> >>> -
> >>> -    if (!v->shadow_vqs_enabled) {
> >>> -        return 0;
> >>> -    }
> >>> -
> >>> -    r = vhost_vdpa_get_dev_features(hdev, &dev_features);
> >>> -    if (r != 0) {
> >>> -        error_setg_errno(errp, -r, "Can't get vdpa device features");
> >>> -        return r;
> >>> -    }
> >>> -
> >>> -    svq_features = dev_features;
> >>> -    ok = vhost_svq_valid_features(svq_features, errp);
> >>> -    if (unlikely(!ok)) {
> >>> -        return -1;
> >>> -    }
> >>>
> >>>        shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
> >>>        for (unsigned n = 0; n < hdev->nvqs; ++n) {
> >>> @@ -436,7 +416,6 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> >>>        }
> >>>
> >>>        v->shadow_vqs = g_steal_pointer(&shadow_vqs);
> >>> -    return 0;
> >>>    }
> >>>
> >>>    static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> >>> @@ -461,11 +440,7 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> >>>        dev->opaque =  opaque ;
> >>>        v->listener = vhost_vdpa_memory_listener;
> >>>        v->msg_type = VHOST_IOTLB_MSG_V2;
> >>> -    ret = vhost_vdpa_init_svq(dev, v, errp);
> >>> -    if (ret) {
> >>> -        goto err;
> >>> -    }
> >>> -
> >>> +    vhost_vdpa_init_svq(dev, v);
> >>>        vhost_vdpa_get_iova_range(v);
> >>>
> >>>        if (!vhost_vdpa_first_dev(dev)) {
> >>> @@ -476,10 +451,6 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> >>>                                   VIRTIO_CONFIG_S_DRIVER);
> >>>
> >>>        return 0;
> >>> -
> >>> -err:
> >>> -    ram_block_discard_disable(false);
> >>> -    return ret;
> >>>    }
> >>>
> >>>    static void vhost_vdpa_host_notifier_uninit(struct vhost_dev *dev,
> >>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> >>> index d3b1de481b..fb35b17ab4 100644
> >>> --- a/net/vhost-vdpa.c
> >>> +++ b/net/vhost-vdpa.c
> >>> @@ -117,9 +117,10 @@ static bool vhost_vdpa_net_valid_svq_features(uint64_t features, Error **errp)
> >>>        if (invalid_dev_features) {
> >>>            error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
> >>>                       invalid_dev_features);
> >>> +        return false;
> >>>        }
> >>>
> >>> -    return !invalid_dev_features;
> >>> +    return vhost_svq_valid_features(features, errp);
> >>>    }
> >>>
> >>>    static int vhost_vdpa_net_check_device_id(struct vhost_net *net)
>



* Re: [PATCH v6 10/10] vdpa: Always start CVQ in SVQ mode
  2022-11-10 16:07     ` Eugenio Perez Martin
@ 2022-11-11  8:02       ` Jason Wang
  2022-11-11 14:38         ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-11  8:02 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini


On 2022/11/11 00:07, Eugenio Perez Martin wrote:
> On Thu, Nov 10, 2022 at 7:25 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2022/11/9 01:07, Eugenio Pérez wrote:
>>> Isolate control virtqueue in its own group, allowing to intercept control
>>> commands but letting dataplane run totally passthrough to the guest.
>>
>> I think we need to tweak the title to "vdpa: Always start CVQ in SVQ
>> mode if possible". Since SVQ for CVQ can't be enabled without ASID support?
>>
> Yes, I totally agree.


Btw, I wonder if it's worth removing the "x-" prefix for the shadow
virtqueue option. It can help devices that lack ASID support but want to
do live migration.


>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>> v6:
>>> * Disable control SVQ if the device does not support it because of
>>> features.
>>>
>>> v5:
>>> * Fixing the not adding cvq buffers when x-svq=on is specified.
>>> * Move vring state in vhost_vdpa_get_vring_group instead of using a
>>>     parameter.
>>> * Rename VHOST_VDPA_NET_CVQ_PASSTHROUGH to VHOST_VDPA_NET_DATA_ASID
>>>
>>> v4:
>>> * Squash vhost_vdpa_cvq_group_is_independent.
>>> * Rebased on last CVQ start series, that allocated CVQ cmd bufs at load
>>> * Do not check for cvq index on vhost_vdpa_net_prepare, we only have one
>>>     that callback registered in that NetClientInfo.
>>>
>>> v3:
>>> * Make asid related queries print a warning instead of returning an
>>>     error and stop the start of qemu.
>>> ---
>>>    hw/virtio/vhost-vdpa.c |   3 +-
>>>    net/vhost-vdpa.c       | 138 ++++++++++++++++++++++++++++++++++++++---
>>>    2 files changed, 132 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>> index e3914fa40e..6401e7efb1 100644
>>> --- a/hw/virtio/vhost-vdpa.c
>>> +++ b/hw/virtio/vhost-vdpa.c
>>> @@ -648,7 +648,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
>>>    {
>>>        uint64_t features;
>>>        uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 |
>>> -        0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH;
>>> +        0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH |
>>> +        0x1ULL << VHOST_BACKEND_F_IOTLB_ASID;
>>>        int r;
>>>
>>>        if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) {
>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>> index 02780ee37b..7245ea70c6 100644
>>> --- a/net/vhost-vdpa.c
>>> +++ b/net/vhost-vdpa.c
>>> @@ -38,6 +38,9 @@ typedef struct VhostVDPAState {
>>>        void *cvq_cmd_out_buffer;
>>>        virtio_net_ctrl_ack *status;
>>>
>>> +    /* Number of address spaces supported by the device */
>>> +    unsigned address_space_num;
>>
>> I'm not sure this is the best place to store thing like this since it
>> can cause confusion. We will have multiple VhostVDPAState when
>> multiqueue is enabled.
>>
> I think we can delete this and ask it on each device start.
>
>>> +
>>>        /* The device always have SVQ enabled */
>>>        bool always_svq;
>>>        bool started;
>>> @@ -101,6 +104,9 @@ static const uint64_t vdpa_svq_device_features =
>>>        BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
>>>        BIT_ULL(VIRTIO_NET_F_STANDBY);
>>>
>>> +#define VHOST_VDPA_NET_DATA_ASID 0
>>> +#define VHOST_VDPA_NET_CVQ_ASID 1
>>> +
>>>    VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
>>>    {
>>>        VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>>> @@ -242,6 +248,34 @@ static NetClientInfo net_vhost_vdpa_info = {
>>>            .check_peer_type = vhost_vdpa_check_peer_type,
>>>    };
>>>
>>> +static uint32_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
>>> +{
>>> +    struct vhost_vring_state state = {
>>> +        .index = vq_index,
>>> +    };
>>> +    int r = ioctl(device_fd, VHOST_VDPA_GET_VRING_GROUP, &state);
>>> +
>>> +    return r < 0 ? 0 : state.num;
>>
>> Assume 0 when ioctl() fail is probably not a good idea: errors in ioctl
>> might be hidden. It would be better to fallback to 0 when ASID is not
>> supported.
>>
> Did I misunderstand you on [1]?


Nope. I think I was wrong at that time then :( Sorry for that.

We should differentiate between two cases:

1) no ASID support, so 0 is assumed

2) something went wrong in the ioctl; it's not necessarily ENOTSUPP.


>
>>> +}
>>> +
>>> +static int vhost_vdpa_set_address_space_id(struct vhost_vdpa *v,
>>> +                                           unsigned vq_group,
>>> +                                           unsigned asid_num)
>>> +{
>>> +    struct vhost_vring_state asid = {
>>> +        .index = vq_group,
>>> +        .num = asid_num,
>>> +    };
>>> +    int ret;
>>> +
>>> +    ret = ioctl(v->device_fd, VHOST_VDPA_SET_GROUP_ASID, &asid);
>>> +    if (unlikely(ret < 0)) {
>>> +        warn_report("Can't set vq group %u asid %u, errno=%d (%s)",
>>> +            asid.index, asid.num, errno, g_strerror(errno));
>>> +    }
>>> +    return ret;
>>> +}
>>> +
>>>    static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
>>>    {
>>>        VhostIOVATree *tree = v->iova_tree;
>>> @@ -316,11 +350,54 @@ dma_map_err:
>>>    static int vhost_vdpa_net_cvq_start(NetClientState *nc)
>>>    {
>>>        VhostVDPAState *s;
>>> -    int r;
>>> +    struct vhost_vdpa *v;
>>> +    uint32_t cvq_group;
>>> +    int cvq_index, r;
>>>
>>>        assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>>>
>>>        s = DO_UPCAST(VhostVDPAState, nc, nc);
>>> +    v = &s->vhost_vdpa;
>>> +
>>> +    v->listener_shadow_vq = s->always_svq;
>>> +    v->shadow_vqs_enabled = s->always_svq;
>>> +    s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_DATA_ASID;
>>> +
>>> +    if (s->always_svq) {
>>> +        goto out;
>>> +    }
>>> +
>>> +    if (s->address_space_num < 2) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
>>> +        return 0;
>>> +    }
>>
>> Any reason we do the above check during the start/stop? It should be
>> easier to do that in the initialization.
>>
> We can store it as a member of VhostVDPAState maybe? They will be
> duplicated like the current number of AS.


I meant each VhostVDPAState just needs to know the ASID it needs to use. 
There's no need to know the total number of address spaces or do the 
validation on it during start (the validation could be done during 
initialization).


>
>>> +
>>> +    /**
>>> +     * Check if all the virtqueues of the virtio device are in a different vq
>>> +     * than the last vq. VQ group of last group passed in cvq_group.
>>> +     */
>>> +    cvq_index = v->dev->vq_index_end - 1;
>>> +    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
>>> +    for (int i = 0; i < cvq_index; ++i) {
>>> +        uint32_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
>>> +
>>> +        if (unlikely(group == cvq_group)) {
>>> +            warn_report("CVQ %u group is the same as VQ %u one (%u)", cvq_group,
>>> +                        i, group);
>>> +            return 0;
>>> +        }
>>> +    }
>>> +
>>> +    r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
>>> +    if (r == 0) {
>>> +        v->shadow_vqs_enabled = true;
>>> +        s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_CVQ_ASID;
>>> +    }
>>> +
>>> +out:
>>>        if (!s->vhost_vdpa.shadow_vqs_enabled) {
>>>            return 0;
>>>        }
>>> @@ -542,12 +619,38 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
>>>        .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
>>>    };
>>>
>>> +static uint32_t vhost_vdpa_get_as_num(int vdpa_device_fd)
>>> +{
>>> +    uint64_t features;
>>> +    unsigned num_as;
>>> +    int r;
>>> +
>>> +    r = ioctl(vdpa_device_fd, VHOST_GET_BACKEND_FEATURES, &features);
>>> +    if (unlikely(r < 0)) {
>>> +        warn_report("Cannot get backend features");
>>> +        return 1;
>>> +    }
>>> +
>>> +    if (!(features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
>>> +        return 1;
>>> +    }
>>> +
>>> +    r = ioctl(vdpa_device_fd, VHOST_VDPA_GET_AS_NUM, &num_as);
>>> +    if (unlikely(r < 0)) {
>>> +        warn_report("Cannot retrieve number of supported ASs");
>>> +        return 1;
>>
>> Let's return error here. This help. to identify bugs of qemu or kernel.
>>
> Same comment as with VHOST_VDPA_GET_VRING_GROUP.
>
>>> +    }
>>> +
>>> +    return num_as;
>>> +}
>>> +
>>>    static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>>>                                               const char *device,
>>>                                               const char *name,
>>>                                               int vdpa_device_fd,
>>>                                               int queue_pair_index,
>>>                                               int nvqs,
>>> +                                           unsigned nas,
>>>                                               bool is_datapath,
>>>                                               bool svq,
>>>                                               VhostIOVATree *iova_tree)
>>> @@ -566,6 +669,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>>>        qemu_set_info_str(nc, TYPE_VHOST_VDPA);
>>>        s = DO_UPCAST(VhostVDPAState, nc, nc);
>>>
>>> +    s->address_space_num = nas;
>>>        s->vhost_vdpa.device_fd = vdpa_device_fd;
>>>        s->vhost_vdpa.index = queue_pair_index;
>>>        s->always_svq = svq;
>>> @@ -652,6 +756,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>        g_autoptr(VhostIOVATree) iova_tree = NULL;
>>>        NetClientState *nc;
>>>        int queue_pairs, r, i = 0, has_cvq = 0;
>>> +    unsigned num_as = 1;
>>> +    bool svq_cvq;
>>>
>>>        assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>>>        opts = &netdev->u.vhost_vdpa;
>>> @@ -693,12 +799,28 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>            return queue_pairs;
>>>        }
>>>
>>> -    if (opts->x_svq) {
>>> -        struct vhost_vdpa_iova_range iova_range;
>>> +    svq_cvq = opts->x_svq;
>>> +    if (has_cvq && !opts->x_svq) {
>>> +        num_as = vhost_vdpa_get_as_num(vdpa_device_fd);
>>> +        svq_cvq = num_as > 1;
>>> +    }
>>
>> The above check is not easy to follow, how about?
>>
>> svq_cvq = vhost_vdpa_get_as_num() > 1 ? true : opts->x_svq;
>>
> That would allocate the iova tree even if CVQ is not used in the
> guest. And num_as is reused later, although we could ask the device for
> it at device start to avoid this.


Ok.


>
> If anything, the linear conversion would be:
> svq_cvq = opts->x_svq || (has_cvq && vhost_vdpa_get_as_num(vdpa_device_fd))
>
> So we avoid the AS_NUM ioctl if not needed.


So when !opts->x_svq, we need to check num_as is at least 2?


>
>>> +
>>> +    if (opts->x_svq || svq_cvq) {
>>
>> Any chance we can have opts->x_svq = true but svq_cvq = false? Checking
>> svq_cvq seems sufficient here.
>>
> The reverse is possible, to have svq_cvq but no opts->x_svq.
>
> Depending on that, this code emits a warning or a fatal error.


Ok, as replied in the previous patch, I think we need a better name for 
those ones.

if (opts->x_svq) {
    shadow_data_vq = true;
    if (has_cvq) {
        shadow_cvq = true;
    }
} else if (num_as >= 2 && has_cvq) {
    shadow_cvq = true;
}

The other logic can just check shadow_cvq or shadow_data_vq individually.


>
>>> +        Error *warn = NULL;
>>>
>>> -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
>>> -            goto err_svq;
>>> +        svq_cvq = vhost_vdpa_net_valid_svq_features(features,
>>> +                                                   opts->x_svq ? errp : &warn);
>>> +        if (!svq_cvq) {
>>
>> Same question as above.
>>
>>
>>> +            if (opts->x_svq) {
>>> +                goto err_svq;
>>> +            } else {
>>> +                warn_reportf_err(warn, "Cannot shadow CVQ: ");
>>> +            }
>>>            }
>>> +    }
>>> +
>>> +    if (opts->x_svq || svq_cvq) {
>>> +        struct vhost_vdpa_iova_range iova_range;
>>>
>>>            vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
>>>            iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
>>> @@ -708,15 +830,15 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>
>>>        for (i = 0; i < queue_pairs; i++) {
>>>            ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>>> -                                     vdpa_device_fd, i, 2, true, opts->x_svq,
>>> -                                     iova_tree);
>>> +                                     vdpa_device_fd, i, 2, num_as, true,
>>
>> I don't get why we need pass num_as to a specific vhost_vdpa structure.
>> It should be sufficient to pass asid there.
>>
> ASID is not known at this time, but at device's start. This is because
> we cannot ask if the CVQ is in its own vq group, because we don't know
> the control virtqueue index until the guest acknowledges the different
> features.


We can probe those during initialization, I think, e.g. by doing some 
negotiation in the initialization phase.

Thanks


>
> Thanks!
>
> [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg901685.html
>
>
>>> +                                     opts->x_svq, iova_tree);
>>>            if (!ncs[i])
>>>                goto err;
>>>        }
>>>
>>>        if (has_cvq) {
>>>            nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>>> -                                 vdpa_device_fd, i, 1, false,
>>> +                                 vdpa_device_fd, i, 1, num_as, false,
>>>                                     opts->x_svq, iova_tree);
>>>            if (!nc)
>>>                goto err;



* Re: [PATCH v6 05/10] vdpa: move SVQ vring features check to net/
  2022-11-11  7:55         ` Eugenio Perez Martin
@ 2022-11-11  8:07           ` Jason Wang
  2022-11-11 12:58             ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-11  8:07 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Fri, Nov 11, 2022 at 3:56 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Fri, Nov 11, 2022 at 8:34 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> > On 2022/11/10 21:09, Eugenio Perez Martin wrote:
> > > On Thu, Nov 10, 2022 at 6:40 AM Jason Wang <jasowang@redhat.com> wrote:
> > >>
> > >> On 2022/11/9 01:07, Eugenio Pérez wrote:
> > >>> The next patches will start control SVQ if possible. However, we don't
> > >>> know if that will be possible at qemu boot anymore.
> > >>
> > >> If I was not wrong, there's no device specific feature that is checked
> > >> in the function. So it should be general enough to be used by devices
> > >> other than net. Then I don't see any advantage of doing this.
> > >>
> > > Because vhost_vdpa_init_svq is called at qemu boot, failing if it is
> > > not possible to shadow the Virtqueue.
> > >
> > > Now the CVQ will be shadowed if possible, so we need to check this at
> > > device start, not at initialization.
> >
> >
> > Any reason we can't check this at device start? We don't need
> > driver_features and we can do any probing to make sure cvq has an unique
> > group during initialization time.
> >
>
> We need the CVQ index to check if it has an independent group. CVQ
> index depends on the features the guest acks:
> * If it acks _F_MQ, it is the last one.
> * If it doesn't, CVQ idx is 2.
>
> We cannot have acked features at initialization, and they could
> change: It is valid for a guest to ack _F_MQ, then reset the device,
> then not ack it.

Can we do some probing by negotiating _F_MQ if the device offers it,
then we can know if cvq has a unique group?

>
> >
> > >   To store this information at boot
> > > time is not valid anymore, because v->shadow_vqs_enabled is not valid
> > > at this time anymore.
> >
> >
> > Ok, but this doesn't explain why it is net specific but vhost-vdpa specific.
> >
>
> We can try to move it to a vhost op, but we have the same problem as
> the svq array allocation: We don't have the right place in vhost ops
> to check this. Maybe vhost_set_features is the right one here?

If we can do all the probing at the initialization phase, we can do
everything there.

Thanks

>
> Thanks!
>
> > Thanks
> >
> >
> > >
> > > Thanks!
> > >
> > >> Thanks
> > >>
> > >>
> > >>> Since the moved checks will be already evaluated at net/ to know if it
> > >>> is ok to shadow CVQ, move them.
> > >>>
> > >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > >>> ---
> > >>>    hw/virtio/vhost-vdpa.c | 33 ++-------------------------------
> > >>>    net/vhost-vdpa.c       |  3 ++-
> > >>>    2 files changed, 4 insertions(+), 32 deletions(-)
> > >>>
> > >>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > >>> index 3df2775760..146f0dcb40 100644
> > >>> --- a/hw/virtio/vhost-vdpa.c
> > >>> +++ b/hw/virtio/vhost-vdpa.c
> > >>> @@ -402,29 +402,9 @@ static int vhost_vdpa_get_dev_features(struct vhost_dev *dev,
> > >>>        return ret;
> > >>>    }
> > >>>
> > >>> -static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> > >>> -                               Error **errp)
> > >>> +static void vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v)
> > >>>    {
> > >>>        g_autoptr(GPtrArray) shadow_vqs = NULL;
> > >>> -    uint64_t dev_features, svq_features;
> > >>> -    int r;
> > >>> -    bool ok;
> > >>> -
> > >>> -    if (!v->shadow_vqs_enabled) {
> > >>> -        return 0;
> > >>> -    }
> > >>> -
> > >>> -    r = vhost_vdpa_get_dev_features(hdev, &dev_features);
> > >>> -    if (r != 0) {
> > >>> -        error_setg_errno(errp, -r, "Can't get vdpa device features");
> > >>> -        return r;
> > >>> -    }
> > >>> -
> > >>> -    svq_features = dev_features;
> > >>> -    ok = vhost_svq_valid_features(svq_features, errp);
> > >>> -    if (unlikely(!ok)) {
> > >>> -        return -1;
> > >>> -    }
> > >>>
> > >>>        shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
> > >>>        for (unsigned n = 0; n < hdev->nvqs; ++n) {
> > >>> @@ -436,7 +416,6 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> > >>>        }
> > >>>
> > >>>        v->shadow_vqs = g_steal_pointer(&shadow_vqs);
> > >>> -    return 0;
> > >>>    }
> > >>>
> > >>>    static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> > >>> @@ -461,11 +440,7 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> > >>>        dev->opaque =  opaque ;
> > >>>        v->listener = vhost_vdpa_memory_listener;
> > >>>        v->msg_type = VHOST_IOTLB_MSG_V2;
> > >>> -    ret = vhost_vdpa_init_svq(dev, v, errp);
> > >>> -    if (ret) {
> > >>> -        goto err;
> > >>> -    }
> > >>> -
> > >>> +    vhost_vdpa_init_svq(dev, v);
> > >>>        vhost_vdpa_get_iova_range(v);
> > >>>
> > >>>        if (!vhost_vdpa_first_dev(dev)) {
> > >>> @@ -476,10 +451,6 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> > >>>                                   VIRTIO_CONFIG_S_DRIVER);
> > >>>
> > >>>        return 0;
> > >>> -
> > >>> -err:
> > >>> -    ram_block_discard_disable(false);
> > >>> -    return ret;
> > >>>    }
> > >>>
> > >>>    static void vhost_vdpa_host_notifier_uninit(struct vhost_dev *dev,
> > >>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > >>> index d3b1de481b..fb35b17ab4 100644
> > >>> --- a/net/vhost-vdpa.c
> > >>> +++ b/net/vhost-vdpa.c
> > >>> @@ -117,9 +117,10 @@ static bool vhost_vdpa_net_valid_svq_features(uint64_t features, Error **errp)
> > >>>        if (invalid_dev_features) {
> > >>>            error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
> > >>>                       invalid_dev_features);
> > >>> +        return false;
> > >>>        }
> > >>>
> > >>> -    return !invalid_dev_features;
> > >>> +    return vhost_svq_valid_features(features, errp);
> > >>>    }
> > >>>
> > >>>    static int vhost_vdpa_net_check_device_id(struct vhost_net *net)
> >
>



* Re: [PATCH v6 05/10] vdpa: move SVQ vring features check to net/
  2022-11-11  8:07           ` Jason Wang
@ 2022-11-11 12:58             ` Eugenio Perez Martin
  2022-11-14  4:26               ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-11 12:58 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Fri, Nov 11, 2022 at 9:07 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Nov 11, 2022 at 3:56 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Fri, Nov 11, 2022 at 8:34 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > >
> > > On 2022/11/10 21:09, Eugenio Perez Martin wrote:
> > > > On Thu, Nov 10, 2022 at 6:40 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >>
> > > >> On 2022/11/9 01:07, Eugenio Pérez wrote:
> > > >>> The next patches will start control SVQ if possible. However, we don't
> > > >>> know if that will be possible at qemu boot anymore.
> > > >>
> > > >> If I was not wrong, there's no device specific feature that is checked
> > > >> in the function. So it should be general enough to be used by devices
> > > >> other than net. Then I don't see any advantage of doing this.
> > > >>
> > > > Because vhost_vdpa_init_svq is called at qemu boot, failing if it is
> > > > not possible to shadow the Virtqueue.
> > > >
> > > > Now the CVQ will be shadowed if possible, so we need to check this at
> > > > device start, not at initialization.
> > >
> > >
> > > Any reason we can't check this at device start? We don't need
> > > driver_features and we can do any probing to make sure cvq has an unique
> > > group during initialization time.
> > >
> >
> > We need the CVQ index to check if it has an independent group. CVQ
> > index depends on the features the guest acks:
> > * If it acks _F_MQ, it is the last one.
> > * If it doesn't, CVQ idx is 2.
> >
> > We cannot have acked features at initialization, and they could
> > change: It is valid for a guest to ack _F_MQ, then reset the device,
> > then not ack it.
>
> Can we do some probing by negotiating _F_MQ if the device offers it,
> then we can know if cvq has a unique group?
>

What if the guest does not ack _F_MQ?

To be complete, it would go like this:

* Probe: negotiate _F_MQ, check for a unique CVQ group.
* Probe: negotiate !_F_MQ, check for a unique CVQ group.
* Actually negotiate with the guest's feature set.
* React to failures, probably the same way as when the CVQ is not
isolated: disabling SVQ?

To me it seems simpler to specify somehow that the vq must be independent.

Thanks!

> >
> > >
> > > >   To store this information at boot
> > > > time is not valid anymore, because v->shadow_vqs_enabled is not valid
> > > > at this time anymore.
> > >
> > >
> > > Ok, but this doesn't explain why it is net specific but vhost-vdpa specific.
> > >
> >
> > We can try to move it to a vhost op, but we have the same problem as
> > the svq array allocation: We don't have the right place in vhost ops
> > to check this. Maybe vhost_set_features is the right one here?
>
> If we can do all the probing at the initialization phase, we can do
> everything there.
>
> Thanks
>
> >
> > Thanks!
> >
> > > Thanks
> > >
> > >
> > > >
> > > > Thanks!
> > > >
> > > >> Thanks
> > > >>
> > > >>
> > > >>> Since the moved checks will be already evaluated at net/ to know if it
> > > >>> is ok to shadow CVQ, move them.
> > > >>>
> > > >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > >>> ---
> > > >>>    hw/virtio/vhost-vdpa.c | 33 ++-------------------------------
> > > >>>    net/vhost-vdpa.c       |  3 ++-
> > > >>>    2 files changed, 4 insertions(+), 32 deletions(-)
> > > >>>
> > > >>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > >>> index 3df2775760..146f0dcb40 100644
> > > >>> --- a/hw/virtio/vhost-vdpa.c
> > > >>> +++ b/hw/virtio/vhost-vdpa.c
> > > >>> @@ -402,29 +402,9 @@ static int vhost_vdpa_get_dev_features(struct vhost_dev *dev,
> > > >>>        return ret;
> > > >>>    }
> > > >>>
> > > >>> -static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> > > >>> -                               Error **errp)
> > > >>> +static void vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v)
> > > >>>    {
> > > >>>        g_autoptr(GPtrArray) shadow_vqs = NULL;
> > > >>> -    uint64_t dev_features, svq_features;
> > > >>> -    int r;
> > > >>> -    bool ok;
> > > >>> -
> > > >>> -    if (!v->shadow_vqs_enabled) {
> > > >>> -        return 0;
> > > >>> -    }
> > > >>> -
> > > >>> -    r = vhost_vdpa_get_dev_features(hdev, &dev_features);
> > > >>> -    if (r != 0) {
> > > >>> -        error_setg_errno(errp, -r, "Can't get vdpa device features");
> > > >>> -        return r;
> > > >>> -    }
> > > >>> -
> > > >>> -    svq_features = dev_features;
> > > >>> -    ok = vhost_svq_valid_features(svq_features, errp);
> > > >>> -    if (unlikely(!ok)) {
> > > >>> -        return -1;
> > > >>> -    }
> > > >>>
> > > >>>        shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
> > > >>>        for (unsigned n = 0; n < hdev->nvqs; ++n) {
> > > >>> @@ -436,7 +416,6 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> > > >>>        }
> > > >>>
> > > >>>        v->shadow_vqs = g_steal_pointer(&shadow_vqs);
> > > >>> -    return 0;
> > > >>>    }
> > > >>>
> > > >>>    static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> > > >>> @@ -461,11 +440,7 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> > > >>>        dev->opaque =  opaque ;
> > > >>>        v->listener = vhost_vdpa_memory_listener;
> > > >>>        v->msg_type = VHOST_IOTLB_MSG_V2;
> > > >>> -    ret = vhost_vdpa_init_svq(dev, v, errp);
> > > >>> -    if (ret) {
> > > >>> -        goto err;
> > > >>> -    }
> > > >>> -
> > > >>> +    vhost_vdpa_init_svq(dev, v);
> > > >>>        vhost_vdpa_get_iova_range(v);
> > > >>>
> > > >>>        if (!vhost_vdpa_first_dev(dev)) {
> > > >>> @@ -476,10 +451,6 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> > > >>>                                   VIRTIO_CONFIG_S_DRIVER);
> > > >>>
> > > >>>        return 0;
> > > >>> -
> > > >>> -err:
> > > >>> -    ram_block_discard_disable(false);
> > > >>> -    return ret;
> > > >>>    }
> > > >>>
> > > >>>    static void vhost_vdpa_host_notifier_uninit(struct vhost_dev *dev,
> > > >>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > >>> index d3b1de481b..fb35b17ab4 100644
> > > >>> --- a/net/vhost-vdpa.c
> > > >>> +++ b/net/vhost-vdpa.c
> > > >>> @@ -117,9 +117,10 @@ static bool vhost_vdpa_net_valid_svq_features(uint64_t features, Error **errp)
> > > >>>        if (invalid_dev_features) {
> > > >>>            error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
> > > >>>                       invalid_dev_features);
> > > >>> +        return false;
> > > >>>        }
> > > >>>
> > > >>> -    return !invalid_dev_features;
> > > >>> +    return vhost_svq_valid_features(features, errp);
> > > >>>    }
> > > >>>
> > > >>>    static int vhost_vdpa_net_check_device_id(struct vhost_net *net)
> > >
> >
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 07/10] vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap
  2022-11-11  7:41       ` Jason Wang
@ 2022-11-11 13:02         ` Eugenio Perez Martin
  2022-11-14  4:27           ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-11 13:02 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Fri, Nov 11, 2022 at 8:41 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2022/11/10 21:22, Eugenio Perez Martin 写道:
> > On Thu, Nov 10, 2022 at 6:51 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >>> So the caller can choose which ASID the mapping is destined to.
> >>>
> >>> No need to update the batch functions as they will always be called from
> >>> memory listener updates at the moment. Memory listener updates will
> >>> always update ASID 0, as it's the passthrough ASID.
> >>>
> >>> All vhost devices' ASIDs are 0 at this moment.
> >>>
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> ---
> >>> v5:
> >>> * Solve conflict, now vhost_vdpa_svq_unmap_ring returns void
> >>> * Change comment on zero initialization.
> >>>
> >>> v4: Add comment specifying behavior if device does not support _F_ASID
> >>>
> >>> v3: Deleted unneeded space
> >>> ---
> >>>   include/hw/virtio/vhost-vdpa.h |  8 +++++---
> >>>   hw/virtio/vhost-vdpa.c         | 29 +++++++++++++++++++----------
> >>>   net/vhost-vdpa.c               |  6 +++---
> >>>   hw/virtio/trace-events         |  4 ++--
> >>>   4 files changed, 29 insertions(+), 18 deletions(-)
> >>>
> >>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> >>> index 1111d85643..6560bb9d78 100644
> >>> --- a/include/hw/virtio/vhost-vdpa.h
> >>> +++ b/include/hw/virtio/vhost-vdpa.h
> >>> @@ -29,6 +29,7 @@ typedef struct vhost_vdpa {
> >>>       int index;
> >>>       uint32_t msg_type;
> >>>       bool iotlb_batch_begin_sent;
> >>> +    uint32_t address_space_id;
> >> So the trick is to let device-specific code zero this during allocation?
> >>
> > Yes, but I don't see how that is a trick :). All the other members also
> > rely on being 0 at allocation.
> >
> >>>       MemoryListener listener;
> >>>       struct vhost_vdpa_iova_range iova_range;
> >>>       uint64_t acked_features;
> >>> @@ -42,8 +43,9 @@ typedef struct vhost_vdpa {
> >>>       VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> >>>   } VhostVDPA;
> >>>
> >>> -int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> >>> -                       void *vaddr, bool readonly);
> >>> -int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> >>> +int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> >>> +                       hwaddr size, void *vaddr, bool readonly);
> >>> +int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> >>> +                         hwaddr size);
> >>>
> >>>   #endif
> >>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>> index 23efb8f49d..8fd32ba32b 100644
> >>> --- a/hw/virtio/vhost-vdpa.c
> >>> +++ b/hw/virtio/vhost-vdpa.c
> >>> @@ -72,22 +72,24 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> >>>       return false;
> >>>   }
> >>>
> >>> -int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> >>> -                       void *vaddr, bool readonly)
> >>> +int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> >>> +                       hwaddr size, void *vaddr, bool readonly)
> >>>   {
> >>>       struct vhost_msg_v2 msg = {};
> >>>       int fd = v->device_fd;
> >>>       int ret = 0;
> >>>
> >>>       msg.type = v->msg_type;
> >>> +    msg.asid = asid; /* 0 if vdpa device does not support asid */
> >> The comment here is confusing. If this is a requirement, we need either
> >>
> >> 1) doc this
> >>
> >> or
> >>
> >> 2) perform necessary checks in the function itself.
> >>
> > I only documented it in vhost_vdpa_dma_unmap, and I only realize that now.
> > Would it work to just copy that comment here?
>
>
> Probably, and let's move the comment above the function definition.
>
>
> >
> >>>       msg.iotlb.iova = iova;
> >>>       msg.iotlb.size = size;
> >>>       msg.iotlb.uaddr = (uint64_t)(uintptr_t)vaddr;
> >>>       msg.iotlb.perm = readonly ? VHOST_ACCESS_RO : VHOST_ACCESS_RW;
> >>>       msg.iotlb.type = VHOST_IOTLB_UPDATE;
> >>>
> >>> -   trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.iotlb.iova, msg.iotlb.size,
> >>> -                            msg.iotlb.uaddr, msg.iotlb.perm, msg.iotlb.type);
> >>> +    trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.asid, msg.iotlb.iova,
> >>> +                             msg.iotlb.size, msg.iotlb.uaddr, msg.iotlb.perm,
> >>> +                             msg.iotlb.type);
> >>>
> >>>       if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
> >>>           error_report("failed to write, fd=%d, errno=%d (%s)",
> >>> @@ -98,18 +100,24 @@ int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> >>>       return ret;
> >>>   }
> >>>
> >>> -int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size)
> >>> +int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> >>> +                         hwaddr size)
> >>>   {
> >>>       struct vhost_msg_v2 msg = {};
> >>>       int fd = v->device_fd;
> >>>       int ret = 0;
> >>>
> >>>       msg.type = v->msg_type;
> >>> +    /*
> >>> +     * The caller must set asid = 0 if the device does not support asid.
> >>> +     * This is not an ABI break since it is set to 0 by the initializer anyway.
> >>> +     */
> >>> +    msg.asid = asid;
> >>>       msg.iotlb.iova = iova;
> >>>       msg.iotlb.size = size;
> >>>       msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
> >>>
> >>> -    trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.iotlb.iova,
> >>> +    trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.asid, msg.iotlb.iova,
> >>>                                  msg.iotlb.size, msg.iotlb.type);
> >>>
> >>>       if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
> >>> @@ -229,7 +237,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >>>       }
> >>>
> >>>       vhost_vdpa_iotlb_batch_begin_once(v);
> >>> -    ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
> >>> +    ret = vhost_vdpa_dma_map(v, 0, iova, int128_get64(llsize),
> >> Can we use v->address_space_id here? Then we don't need to modify this
> >> line when we support multiple asids logic in the future.
> >>
> > The registered memory listener is the one belonging to the last
> > vhost_vdpa, the one that handles the last queue.
> >
> > If all data virtqueues are not shadowed but CVQ is,
> > v->address_space_id is 1 with the current code.
>
>
> Ok, right. So let's add a comment here. It would be even better to
> define the macro for data vq asid in this patch.
>

I agree that it must be changed to a macro.

However, the _ASID macros currently belong to net/. Maybe declare
VHOST_VDPA_GPA_ASID in include/hw/virtio/vhost-vdpa.h and keep
VHOST_VDPA_NET_CVQ_ASID in net/vhost-vdpa.c?

Thanks!

>
> Thanks
>
>
> >   But the listener is
> > actually mapping the ASID 0, not 1.
> >
> > Another alternative is to register it to the last data virtqueue, not
> > the last queue of vhost_vdpa. But that is hard to express in a generic
> > way in hw/virtio/vhost-vdpa.c. Should we add a boolean indicating which
> > vhost_vdpa's memory listener we want to register?
> >
> > It seems easier to me to simply assign 0 at GPA translations. If SVQ
> > is enabled for all queues, then 0 is GPA to qemu's VA + SVQ stuff. If
> > it is not, 0 is always GPA to qemu's VA.
> >
> > Thanks!
> >
> >> Thanks
> >>
> >>>                                vaddr, section->readonly);
> >>>       if (ret) {
> >>>           error_report("vhost vdpa map fail!");
> >>> @@ -303,7 +311,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> >>>           vhost_iova_tree_remove(v->iova_tree, *result);
> >>>       }
> >>>       vhost_vdpa_iotlb_batch_begin_once(v);
> >>> -    ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
> >>> +    ret = vhost_vdpa_dma_unmap(v, 0, iova, int128_get64(llsize));
> >>>       if (ret) {
> >>>           error_report("vhost_vdpa dma unmap error!");
> >>>       }
> >>> @@ -884,7 +892,7 @@ static void vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v, hwaddr addr)
> >>>       }
> >>>
> >>>       size = ROUND_UP(result->size, qemu_real_host_page_size());
> >>> -    r = vhost_vdpa_dma_unmap(v, result->iova, size);
> >>> +    r = vhost_vdpa_dma_unmap(v, v->address_space_id, result->iova, size);
> >>>       if (unlikely(r < 0)) {
> >>>           error_report("Unable to unmap SVQ vring: %s (%d)", g_strerror(-r), -r);
> >>>           return;
> >>> @@ -924,7 +932,8 @@ static bool vhost_vdpa_svq_map_ring(struct vhost_vdpa *v, DMAMap *needle,
> >>>           return false;
> >>>       }
> >>>
> >>> -    r = vhost_vdpa_dma_map(v, needle->iova, needle->size + 1,
> >>> +    r = vhost_vdpa_dma_map(v, v->address_space_id, needle->iova,
> >>> +                           needle->size + 1,
> >>>                              (void *)(uintptr_t)needle->translated_addr,
> >>>                              needle->perm == IOMMU_RO);
> >>>       if (unlikely(r != 0)) {
> >>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> >>> index fb35b17ab4..ca1acc0410 100644
> >>> --- a/net/vhost-vdpa.c
> >>> +++ b/net/vhost-vdpa.c
> >>> @@ -258,7 +258,7 @@ static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
> >>>           return;
> >>>       }
> >>>
> >>> -    r = vhost_vdpa_dma_unmap(v, map->iova, map->size + 1);
> >>> +    r = vhost_vdpa_dma_unmap(v, v->address_space_id, map->iova, map->size + 1);
> >>>       if (unlikely(r != 0)) {
> >>>           error_report("Device cannot unmap: %s(%d)", g_strerror(r), r);
> >>>       }
> >>> @@ -298,8 +298,8 @@ static int vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v, void *buf, size_t size,
> >>>           return r;
> >>>       }
> >>>
> >>> -    r = vhost_vdpa_dma_map(v, map.iova, vhost_vdpa_net_cvq_cmd_page_len(), buf,
> >>> -                           !write);
> >>> +    r = vhost_vdpa_dma_map(v, v->address_space_id, map.iova,
> >>> +                           vhost_vdpa_net_cvq_cmd_page_len(), buf, !write);
> >>>       if (unlikely(r < 0)) {
> >>>           goto dma_map_err;
> >>>       }
> >>> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> >>> index 820dadc26c..0ad9390307 100644
> >>> --- a/hw/virtio/trace-events
> >>> +++ b/hw/virtio/trace-events
> >>> @@ -30,8 +30,8 @@ vhost_user_write(uint32_t req, uint32_t flags) "req:%d flags:0x%"PRIx32""
> >>>   vhost_user_create_notifier(int idx, void *n) "idx:%d n:%p"
> >>>
> >>>   # vhost-vdpa.c
> >>> -vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
> >>> -vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
> >>> +vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint32_t asid, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" asid: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
> >>> +vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint32_t asid, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" asid: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
> >>>   vhost_vdpa_listener_begin_batch(void *v, int fd, uint32_t msg_type, uint8_t type)  "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
> >>>   vhost_vdpa_listener_commit(void *v, int fd, uint32_t msg_type, uint8_t type)  "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
> >>>   vhost_vdpa_listener_region_add(void *vdpa, uint64_t iova, uint64_t llend, void *vaddr, bool readonly) "vdpa: %p iova 0x%"PRIx64" llend 0x%"PRIx64" vaddr: %p read-only: %d"
> >>> --
> >>> 2.31.1
> >>>
>



* Re: [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa
  2022-11-11  7:48       ` Jason Wang
@ 2022-11-11 13:12         ` Eugenio Perez Martin
  2022-11-14  4:30           ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-11 13:12 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Fri, Nov 11, 2022 at 8:49 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2022/11/10 21:47, Eugenio Perez Martin 写道:
> > On Thu, Nov 10, 2022 at 7:01 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >>> The memory listener that tells the device how to convert GPA to qemu's
> >>> va is registered against the CVQ vhost_vdpa. This series tries to map the
> >>> memory listener translations to ASID 0, while it maps the CVQ ones to
> >>> ASID 1.
> >>>
> >>> Let's tell the listener if it needs to register them on iova tree or
> >>> not.
> >>>
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> ---
> >>> v5: Solve conflict about vhost_iova_tree_remove accepting mem_region by
> >>>      value.
> >>> ---
> >>>   include/hw/virtio/vhost-vdpa.h | 2 ++
> >>>   hw/virtio/vhost-vdpa.c         | 6 +++---
> >>>   net/vhost-vdpa.c               | 1 +
> >>>   3 files changed, 6 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> >>> index 6560bb9d78..0c3ed2d69b 100644
> >>> --- a/include/hw/virtio/vhost-vdpa.h
> >>> +++ b/include/hw/virtio/vhost-vdpa.h
> >>> @@ -34,6 +34,8 @@ typedef struct vhost_vdpa {
> >>>       struct vhost_vdpa_iova_range iova_range;
> >>>       uint64_t acked_features;
> >>>       bool shadow_vqs_enabled;
> >>> +    /* The listener must send iova tree addresses, not GPA */
>
>
> Btw, cindy's vIOMMU series will make it not necessarily GPA any more.
>

Yes, this comment should be tuned then. But the SVQ iova_tree will not
be equal to the vIOMMU one because of the shadow vrings.

But maybe SVQ can inspect both instead of having all the duplicated entries.

>
> >>> +    bool listener_shadow_vq;
> >>>       /* IOVA mapping used by the Shadow Virtqueue */
> >>>       VhostIOVATree *iova_tree;
> >>>       GPtrArray *shadow_vqs;
> >>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>> index 8fd32ba32b..e3914fa40e 100644
> >>> --- a/hw/virtio/vhost-vdpa.c
> >>> +++ b/hw/virtio/vhost-vdpa.c
> >>> @@ -220,7 +220,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >>>                                            vaddr, section->readonly);
> >>>
> >>>       llsize = int128_sub(llend, int128_make64(iova));
> >>> -    if (v->shadow_vqs_enabled) {
> >>> +    if (v->listener_shadow_vq) {
> >>>           int r;
> >>>
> >>>           mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
> >>> @@ -247,7 +247,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >>>       return;
> >>>
> >>>   fail_map:
> >>> -    if (v->shadow_vqs_enabled) {
> >>> +    if (v->listener_shadow_vq) {
> >>>           vhost_iova_tree_remove(v->iova_tree, mem_region);
> >>>       }
> >>>
> >>> @@ -292,7 +292,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> >>>
> >>>       llsize = int128_sub(llend, int128_make64(iova));
> >>>
> >>> -    if (v->shadow_vqs_enabled) {
> >>> +    if (v->listener_shadow_vq) {
> >>>           const DMAMap *result;
> >>>           const void *vaddr = memory_region_get_ram_ptr(section->mr) +
> >>>               section->offset_within_region +
> >>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> >>> index 85a318faca..02780ee37b 100644
> >>> --- a/net/vhost-vdpa.c
> >>> +++ b/net/vhost-vdpa.c
> >>> @@ -570,6 +570,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >>>       s->vhost_vdpa.index = queue_pair_index;
> >>>       s->always_svq = svq;
> >>>       s->vhost_vdpa.shadow_vqs_enabled = svq;
> >>> +    s->vhost_vdpa.listener_shadow_vq = svq;
> >> Any chance those above two can differ?
> >>
> > If CVQ is shadowed but data VQs are not, shadow_vqs_enabled is true
> > but listener_shadow_vq is not.
> >
> > It is more clear in the next commit, where only shadow_vqs_enabled is
> > set to true at vhost_vdpa_net_cvq_start.
>
>
> Ok, the name looks a little bit confusing. I wonder if it's better to
> use shadow_cvq and shadow_data ?
>

I'm ok with renaming it, but struct vhost_vdpa is generic across all
kinds of devices, and it does not know whether it is on the datapath or
not at the moment.

Maybe listener_uses_iova_tree?

Thanks!



* Re: [PATCH v6 10/10] vdpa: Always start CVQ in SVQ mode
  2022-11-11  8:02       ` Jason Wang
@ 2022-11-11 14:38         ` Eugenio Perez Martin
  2022-11-14  4:36           ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-11 14:38 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Fri, Nov 11, 2022 at 9:03 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2022/11/11 00:07, Eugenio Perez Martin 写道:
> > On Thu, Nov 10, 2022 at 7:25 AM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> 在 2022/11/9 01:07, Eugenio Pérez 写道:
> >>> Isolate the control virtqueue in its own group, allowing QEMU to intercept
> >>> control commands while letting the dataplane run totally passthrough to the guest.
> >>
> >> I think we need to tweak the title to "vdpa: Always start CVQ in SVQ
> >> mode if possible". Since SVQ for CVQ can't be enabled without ASID support?
> >>
> > Yes, I totally agree.
>
>
> Btw, I wonder if it's worth removing the "x-" prefix for the shadow
> virtqueue. It can help devices without ASID support that want to do
> live migration.
>

Sure I can propose on top.

>
> >
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> ---
> >>> v6:
> >>> * Disable control SVQ if the device does not support it because of
> >>> features.
> >>>
> >>> v5:
> >>> * Fixing the not adding cvq buffers when x-svq=on is specified.
> >>> * Move vring state in vhost_vdpa_get_vring_group instead of using a
> >>>     parameter.
> >>> * Rename VHOST_VDPA_NET_CVQ_PASSTHROUGH to VHOST_VDPA_NET_DATA_ASID
> >>>
> >>> v4:
> >>> * Squash vhost_vdpa_cvq_group_is_independent.
> >>> * Rebased on last CVQ start series, that allocated CVQ cmd bufs at load
> >>> * Do not check for cvq index on vhost_vdpa_net_prepare, we only have one
> >>>     that callback registered in that NetClientInfo.
> >>>
> >>> v3:
> >>> * Make asid related queries print a warning instead of returning an
> >>>     error and stop the start of qemu.
> >>> ---
> >>>    hw/virtio/vhost-vdpa.c |   3 +-
> >>>    net/vhost-vdpa.c       | 138 ++++++++++++++++++++++++++++++++++++++---
> >>>    2 files changed, 132 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>> index e3914fa40e..6401e7efb1 100644
> >>> --- a/hw/virtio/vhost-vdpa.c
> >>> +++ b/hw/virtio/vhost-vdpa.c
> >>> @@ -648,7 +648,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
> >>>    {
> >>>        uint64_t features;
> >>>        uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 |
> >>> -        0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH;
> >>> +        0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH |
> >>> +        0x1ULL << VHOST_BACKEND_F_IOTLB_ASID;
> >>>        int r;
> >>>
> >>>        if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) {
> >>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> >>> index 02780ee37b..7245ea70c6 100644
> >>> --- a/net/vhost-vdpa.c
> >>> +++ b/net/vhost-vdpa.c
> >>> @@ -38,6 +38,9 @@ typedef struct VhostVDPAState {
> >>>        void *cvq_cmd_out_buffer;
> >>>        virtio_net_ctrl_ack *status;
> >>>
> >>> +    /* Number of address spaces supported by the device */
> >>> +    unsigned address_space_num;
> >>
> >> I'm not sure this is the best place to store thing like this since it
> >> can cause confusion. We will have multiple VhostVDPAState when
> >> multiqueue is enabled.
> >>
> > I think we can delete this and ask it on each device start.
> >
> >>> +
> >>>        /* The device always have SVQ enabled */
> >>>        bool always_svq;
> >>>        bool started;
> >>> @@ -101,6 +104,9 @@ static const uint64_t vdpa_svq_device_features =
> >>>        BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
> >>>        BIT_ULL(VIRTIO_NET_F_STANDBY);
> >>>
> >>> +#define VHOST_VDPA_NET_DATA_ASID 0
> >>> +#define VHOST_VDPA_NET_CVQ_ASID 1
> >>> +
> >>>    VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
> >>>    {
> >>>        VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>> @@ -242,6 +248,34 @@ static NetClientInfo net_vhost_vdpa_info = {
> >>>            .check_peer_type = vhost_vdpa_check_peer_type,
> >>>    };
> >>>
> >>> +static uint32_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> >>> +{
> >>> +    struct vhost_vring_state state = {
> >>> +        .index = vq_index,
> >>> +    };
> >>> +    int r = ioctl(device_fd, VHOST_VDPA_GET_VRING_GROUP, &state);
> >>> +
> >>> +    return r < 0 ? 0 : state.num;
> >>
> >> Assuming 0 when ioctl() fails is probably not a good idea: errors in
> >> ioctl might be hidden. It would be better to fall back to 0 only when
> >> ASID is not supported.
> >>
> > Did I misunderstand you on [1]?
>
>
> Nope. I think I was wrong at that time then :( Sorry for that.
>
> We should differ from the case
>
> 1) no ASID support so 0 is assumed
>
> 2) something wrong in the case of ioctl, it's not necessarily a ENOTSUPP.
>

What action should we take here? Isn't it better to disable SVQ and
let the device run?

>
> >
> >>> +}
> >>> +
> >>> +static int vhost_vdpa_set_address_space_id(struct vhost_vdpa *v,
> >>> +                                           unsigned vq_group,
> >>> +                                           unsigned asid_num)
> >>> +{
> >>> +    struct vhost_vring_state asid = {
> >>> +        .index = vq_group,
> >>> +        .num = asid_num,
> >>> +    };
> >>> +    int ret;
> >>> +
> >>> +    ret = ioctl(v->device_fd, VHOST_VDPA_SET_GROUP_ASID, &asid);
> >>> +    if (unlikely(ret < 0)) {
> >>> +        warn_report("Can't set vq group %u asid %u, errno=%d (%s)",
> >>> +            asid.index, asid.num, errno, g_strerror(errno));
> >>> +    }
> >>> +    return ret;
> >>> +}
> >>> +
> >>>    static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
> >>>    {
> >>>        VhostIOVATree *tree = v->iova_tree;
> >>> @@ -316,11 +350,54 @@ dma_map_err:
> >>>    static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> >>>    {
> >>>        VhostVDPAState *s;
> >>> -    int r;
> >>> +    struct vhost_vdpa *v;
> >>> +    uint32_t cvq_group;
> >>> +    int cvq_index, r;
> >>>
> >>>        assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >>>
> >>>        s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>> +    v = &s->vhost_vdpa;
> >>> +
> >>> +    v->listener_shadow_vq = s->always_svq;
> >>> +    v->shadow_vqs_enabled = s->always_svq;
> >>> +    s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_DATA_ASID;
> >>> +
> >>> +    if (s->always_svq) {
> >>> +        goto out;
> >>> +    }
> >>> +
> >>> +    if (s->address_space_num < 2) {
> >>> +        return 0;
> >>> +    }
> >>> +
> >>> +    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> >>> +        return 0;
> >>> +    }
> >>
> >> Any reason we do the above check during the start/stop? It should be
> >> easier to do that in the initialization.
> >>
> > We can store it as a member of VhostVDPAState maybe? They will be
> > duplicated like the current number of AS.
>
>
> I meant each VhostVDPAState just needs to know the ASID it needs to use.
> There's no need to know the total number of address spaces or do the
> validation on it during start (the validation could be done during
> initialization).
>

I thought we were talking about the virtio features.

So let's omit this check and simply try to set the ASID? The worst case
is -ENOTSUPP or -EINVAL, so the actions to take are the same as if we
did not have enough address spaces.

>
> >
> >>> +
> >>> +    /**
> >>> +     * Check if all the virtqueues of the virtio device are in a different vq
> >>> +     * group than the last vq. The VQ group of the last vq is passed in cvq_group.
> >>> +     */
> >>> +    cvq_index = v->dev->vq_index_end - 1;
> >>> +    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
> >>> +    for (int i = 0; i < cvq_index; ++i) {
> >>> +        uint32_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
> >>> +
> >>> +        if (unlikely(group == cvq_group)) {
> >>> +            warn_report("CVQ %u group is the same as VQ %u one (%u)", cvq_group,
> >>> +                        i, group);
> >>> +            return 0;
> >>> +        }
> >>> +    }
> >>> +
> >>> +    r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
> >>> +    if (r == 0) {
> >>> +        v->shadow_vqs_enabled = true;
> >>> +        s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_CVQ_ASID;
> >>> +    }
> >>> +
> >>> +out:
> >>>        if (!s->vhost_vdpa.shadow_vqs_enabled) {
> >>>            return 0;
> >>>        }
> >>> @@ -542,12 +619,38 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
> >>>        .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
> >>>    };
> >>>
> >>> +static uint32_t vhost_vdpa_get_as_num(int vdpa_device_fd)
> >>> +{
> >>> +    uint64_t features;
> >>> +    unsigned num_as;
> >>> +    int r;
> >>> +
> >>> +    r = ioctl(vdpa_device_fd, VHOST_GET_BACKEND_FEATURES, &features);
> >>> +    if (unlikely(r < 0)) {
> >>> +        warn_report("Cannot get backend features");
> >>> +        return 1;
> >>> +    }
> >>> +
> >>> +    if (!(features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
> >>> +        return 1;
> >>> +    }
> >>> +
> >>> +    r = ioctl(vdpa_device_fd, VHOST_VDPA_GET_AS_NUM, &num_as);
> >>> +    if (unlikely(r < 0)) {
> >>> +        warn_report("Cannot retrieve number of supported ASs");
> >>> +        return 1;
> >>
> >> Let's return error here. This help. to identify bugs of qemu or kernel.
> >>
> > Same comment as with VHOST_VDPA_GET_VRING_GROUP.
> >
> >>> +    }
> >>> +
> >>> +    return num_as;
> >>> +}
> >>> +
> >>>    static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >>>                                               const char *device,
> >>>                                               const char *name,
> >>>                                               int vdpa_device_fd,
> >>>                                               int queue_pair_index,
> >>>                                               int nvqs,
> >>> +                                           unsigned nas,
> >>>                                               bool is_datapath,
> >>>                                               bool svq,
> >>>                                               VhostIOVATree *iova_tree)
> >>> @@ -566,6 +669,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >>>        qemu_set_info_str(nc, TYPE_VHOST_VDPA);
> >>>        s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>>
> >>> +    s->address_space_num = nas;
> >>>        s->vhost_vdpa.device_fd = vdpa_device_fd;
> >>>        s->vhost_vdpa.index = queue_pair_index;
> >>>        s->always_svq = svq;
> >>> @@ -652,6 +756,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>        g_autoptr(VhostIOVATree) iova_tree = NULL;
> >>>        NetClientState *nc;
> >>>        int queue_pairs, r, i = 0, has_cvq = 0;
> >>> +    unsigned num_as = 1;
> >>> +    bool svq_cvq;
> >>>
> >>>        assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >>>        opts = &netdev->u.vhost_vdpa;
> >>> @@ -693,12 +799,28 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>            return queue_pairs;
> >>>        }
> >>>
> >>> -    if (opts->x_svq) {
> >>> -        struct vhost_vdpa_iova_range iova_range;
> >>> +    svq_cvq = opts->x_svq;
> >>> +    if (has_cvq && !opts->x_svq) {
> >>> +        num_as = vhost_vdpa_get_as_num(vdpa_device_fd);
> >>> +        svq_cvq = num_as > 1;
> >>> +    }
> >>
> >> The above check is not easy to follow, how about?
> >>
> >> svq_cvq = vhost_vdpa_get_as_num() > 1 ? true : opts->x_svq;
> >>
> > That would allocate the iova tree even if CVQ is not used in the
> > guest. And num_as is reused later, although we can ask it to the
> > device at device start to avoid this.
>
>
> Ok.
>
>
> >
> > If any, the linear conversion would be:
> > svq_cvq = opts->x_svq || (has_cvq && vhost_vdpa_get_as_num(vdpa_device_fd))
> >
> > So we avoid the AS_NUM ioctl if not needed.
>
>
> So when !opts->x_svq, we need to check num_as is at least 2?
>

As I think you proposed, we can simply try to set CVQ ASID and react to -EINVAL.

But this code is trying not to allocate the iova_tree if we're sure it
will not be needed. Maybe it is easier to always allocate an empty iova
tree and only fill it if needed?

>
> >
> >>> +
> >>> +    if (opts->x_svq || svq_cvq) {
> >>
> >> Any chance we can have opts->x_svq = true but svq_cvq = false? Checking
> >> svq_cvq seems sufficient here.
> >>
> > The reverse is possible, to have svq_cvq but no opts->x_svq.
> >
> > Depending on that, this code emits a warning or a fatal error.
>
>
> Ok, as replied in the previous patch, I think we need a better name for
> those ones.
>

cvq_svq can be renamed for sure. x_svq can be aliased with other
variable if needed too.

> if (opts->x_svq) {
>          shadow_data_vq = true;
>          if(has_cvq) shadow_cvq = true;
> } else if (num_as >= 2 && has_cvq) {
>          shadow_cvq = true;
> }
>
> The other logic can just check shadow_cvq or shadow_data_vq individually.
>

Not sure if shadow_data_vq is accurate. It sounds to me like "only
shadow the data virtqueues, but not CVQ".

shadow_device and shadow_cvq?

>
> >
> >>> +        Error *warn = NULL;
> >>>
> >>> -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
> >>> -            goto err_svq;
> >>> +        svq_cvq = vhost_vdpa_net_valid_svq_features(features,
> >>> +                                                   opts->x_svq ? errp : &warn);
> >>> +        if (!svq_cvq) {
> >>
> >> Same question as above.
> >>
> >>
> >>> +            if (opts->x_svq) {
> >>> +                goto err_svq;
> >>> +            } else {
> >>> +                warn_reportf_err(warn, "Cannot shadow CVQ: ");
> >>> +            }
> >>>            }
> >>> +    }
> >>> +
> >>> +    if (opts->x_svq || svq_cvq) {
> >>> +        struct vhost_vdpa_iova_range iova_range;
> >>>
> >>>            vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
> >>>            iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
> >>> @@ -708,15 +830,15 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >>>
> >>>        for (i = 0; i < queue_pairs; i++) {
> >>>            ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >>> -                                     vdpa_device_fd, i, 2, true, opts->x_svq,
> >>> -                                     iova_tree);
> >>> +                                     vdpa_device_fd, i, 2, num_as, true,
> >>
> >> I don't get why we need pass num_as to a specific vhost_vdpa structure.
> >> It should be sufficient to pass asid there.
> >>
> > ASID is not known at this time, only at device start. This is because
> > we cannot ask whether the CVQ is in its own vq group: we don't know
> > the control virtqueue index until the guest acknowledges the
> > features.
>
>
> We can probe those during initialization I think. E.g doing some
> negotiation in the initialization phase.
>

We've developed this idea in other threads; let's continue there.

Thanks!

> Thanks
>
>
> >
> > Thanks!
> >
> > [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg901685.html
> >
> >
> >>> +                                     opts->x_svq, iova_tree);
> >>>            if (!ncs[i])
> >>>                goto err;
> >>>        }
> >>>
> >>>        if (has_cvq) {
> >>>            nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >>> -                                 vdpa_device_fd, i, 1, false,
> >>> +                                 vdpa_device_fd, i, 1, num_as, false,
> >>>                                     opts->x_svq, iova_tree);
> >>>            if (!nc)
> >>>                goto err;
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 05/10] vdpa: move SVQ vring features check to net/
  2022-11-11 12:58             ` Eugenio Perez Martin
@ 2022-11-14  4:26               ` Jason Wang
  2022-11-14 10:10                 ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-14  4:26 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini


On 2022/11/11 20:58, Eugenio Perez Martin wrote:
> On Fri, Nov 11, 2022 at 9:07 AM Jason Wang <jasowang@redhat.com> wrote:
>> On Fri, Nov 11, 2022 at 3:56 PM Eugenio Perez Martin
>> <eperezma@redhat.com> wrote:
>>> On Fri, Nov 11, 2022 at 8:34 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>
>>>> On 2022/11/10 21:09, Eugenio Perez Martin wrote:
>>>>> On Thu, Nov 10, 2022 at 6:40 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> On 2022/11/9 01:07, Eugenio Pérez wrote:
>>>>>>> The next patches will start control SVQ if possible. However, we no
>>>>>>> longer know at qemu boot whether that will be possible.
>>>>>> If I'm not wrong, there's no device-specific feature checked in that
>>>>>> function, so it should be general enough to be used by devices other
>>>>>> than net. Then I don't see any advantage of doing this.
>>>>>>
>>>>> Because vhost_vdpa_init_svq is called at qemu boot, and it fails if it
>>>>> is not possible to shadow the virtqueue.
>>>>>
>>>>> Now the CVQ will be shadowed if possible, so we need to check this at
>>>>> device start, not at initialization.
>>>>
>>>> Any reason we can't do this check during initialization? We don't need
>>>> driver_features, and we can do some probing then to make sure cvq has a
>>>> unique group.
>>>>
>>> We need the CVQ index to check if it has an independent group. CVQ
>>> index depends on the features the guest's ack:
>>> * If it acks _F_MQ, it is the last one.
>>> * If it doesn't, CVQ idx is 2.
>>>
>>> We cannot have acked features at initialization, and they could
>>> change: It is valid for a guest to ack _F_MQ, then reset the device,
>>> then not ack it.
>> Can we do some probing by negotiating _F_MQ if the device offers it?
>> Then we can know whether cvq has a unique group.
>>
> What if the guest does not ack _F_MQ?
>
> To be complete, it would go like:
>
> * Probe: negotiate _F_MQ, check for a unique group.
> * Probe: negotiate !_F_MQ, check for a unique group.


I think it should be a bug if a device presents a unique virtqueue group
that depends on a specific feature. That is to say, we can do a single
round of probing instead of trying it twice here.


> * Actually negotiate with the guest's feature set.
> * React to failures. Probably the same way as if the CVQ is not
> isolated, disabling SVQ?
>
> To me it seems simpler to specify somehow that the vq must be independent.


It's just a suggestion; if you prefer doing it at start time, I'm fine.
But we need to document the reason, maybe with a comment.

Thanks


>
> Thanks!
>
>>>>>    Storing this information at boot
>>>>> time is no longer valid, because v->shadow_vqs_enabled is not valid
>>>>> at that time anymore.
>>>>
>>>> Ok, but this doesn't explain why it is net specific rather than vhost-vdpa specific.
>>>>
>>> We can try to move it to a vhost op, but we have the same problem as
>>> the svq array allocation: We don't have the right place in vhost ops
>>> to check this. Maybe vhost_set_features is the right one here?
>> If we can do all the probing at the initialization phase, we can do
>> everything there.
>>
>> Thanks
>>
>>> Thanks!
>>>
>>>> Thanks
>>>>
>>>>
>>>>> Thanks!
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>> Since the moved checks will be already evaluated at net/ to know if it
>>>>>>> is ok to shadow CVQ, move them.
>>>>>>>
>>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>>>> ---
>>>>>>>     hw/virtio/vhost-vdpa.c | 33 ++-------------------------------
>>>>>>>     net/vhost-vdpa.c       |  3 ++-
>>>>>>>     2 files changed, 4 insertions(+), 32 deletions(-)
>>>>>>>
>>>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>>>>>> index 3df2775760..146f0dcb40 100644
>>>>>>> --- a/hw/virtio/vhost-vdpa.c
>>>>>>> +++ b/hw/virtio/vhost-vdpa.c
>>>>>>> @@ -402,29 +402,9 @@ static int vhost_vdpa_get_dev_features(struct vhost_dev *dev,
>>>>>>>         return ret;
>>>>>>>     }
>>>>>>>
>>>>>>> -static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
>>>>>>> -                               Error **errp)
>>>>>>> +static void vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v)
>>>>>>>     {
>>>>>>>         g_autoptr(GPtrArray) shadow_vqs = NULL;
>>>>>>> -    uint64_t dev_features, svq_features;
>>>>>>> -    int r;
>>>>>>> -    bool ok;
>>>>>>> -
>>>>>>> -    if (!v->shadow_vqs_enabled) {
>>>>>>> -        return 0;
>>>>>>> -    }
>>>>>>> -
>>>>>>> -    r = vhost_vdpa_get_dev_features(hdev, &dev_features);
>>>>>>> -    if (r != 0) {
>>>>>>> -        error_setg_errno(errp, -r, "Can't get vdpa device features");
>>>>>>> -        return r;
>>>>>>> -    }
>>>>>>> -
>>>>>>> -    svq_features = dev_features;
>>>>>>> -    ok = vhost_svq_valid_features(svq_features, errp);
>>>>>>> -    if (unlikely(!ok)) {
>>>>>>> -        return -1;
>>>>>>> -    }
>>>>>>>
>>>>>>>         shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
>>>>>>>         for (unsigned n = 0; n < hdev->nvqs; ++n) {
>>>>>>> @@ -436,7 +416,6 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
>>>>>>>         }
>>>>>>>
>>>>>>>         v->shadow_vqs = g_steal_pointer(&shadow_vqs);
>>>>>>> -    return 0;
>>>>>>>     }
>>>>>>>
>>>>>>>     static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
>>>>>>> @@ -461,11 +440,7 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
>>>>>>>         dev->opaque =  opaque ;
>>>>>>>         v->listener = vhost_vdpa_memory_listener;
>>>>>>>         v->msg_type = VHOST_IOTLB_MSG_V2;
>>>>>>> -    ret = vhost_vdpa_init_svq(dev, v, errp);
>>>>>>> -    if (ret) {
>>>>>>> -        goto err;
>>>>>>> -    }
>>>>>>> -
>>>>>>> +    vhost_vdpa_init_svq(dev, v);
>>>>>>>         vhost_vdpa_get_iova_range(v);
>>>>>>>
>>>>>>>         if (!vhost_vdpa_first_dev(dev)) {
>>>>>>> @@ -476,10 +451,6 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
>>>>>>>                                    VIRTIO_CONFIG_S_DRIVER);
>>>>>>>
>>>>>>>         return 0;
>>>>>>> -
>>>>>>> -err:
>>>>>>> -    ram_block_discard_disable(false);
>>>>>>> -    return ret;
>>>>>>>     }
>>>>>>>
>>>>>>>     static void vhost_vdpa_host_notifier_uninit(struct vhost_dev *dev,
>>>>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>>>>>> index d3b1de481b..fb35b17ab4 100644
>>>>>>> --- a/net/vhost-vdpa.c
>>>>>>> +++ b/net/vhost-vdpa.c
>>>>>>> @@ -117,9 +117,10 @@ static bool vhost_vdpa_net_valid_svq_features(uint64_t features, Error **errp)
>>>>>>>         if (invalid_dev_features) {
>>>>>>>             error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
>>>>>>>                        invalid_dev_features);
>>>>>>> +        return false;
>>>>>>>         }
>>>>>>>
>>>>>>> -    return !invalid_dev_features;
>>>>>>> +    return vhost_svq_valid_features(features, errp);
>>>>>>>     }
>>>>>>>
>>>>>>>     static int vhost_vdpa_net_check_device_id(struct vhost_net *net)



* Re: [PATCH v6 07/10] vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap
  2022-11-11 13:02         ` Eugenio Perez Martin
@ 2022-11-14  4:27           ` Jason Wang
  0 siblings, 0 replies; 46+ messages in thread
From: Jason Wang @ 2022-11-14  4:27 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini


On 2022/11/11 21:02, Eugenio Perez Martin wrote:
> On Fri, Nov 11, 2022 at 8:41 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2022/11/10 21:22, Eugenio Perez Martin wrote:
>>> On Thu, Nov 10, 2022 at 6:51 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>>>> So the caller can choose which ASID is destined.
>>>>>
>>>>> No need to update the batch functions as they will always be called from
>>>>> memory listener updates at the moment. Memory listener updates will
>>>>> always update ASID 0, as it's the passthrough ASID.
>>>>>
>>>>> All vhost devices's ASID are 0 at this moment.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>> v5:
>>>>> * Solve conflict, now vhost_vdpa_svq_unmap_ring returns void
>>>>> * Change comment on zero initialization.
>>>>>
>>>>> v4: Add comment specifying behavior if device does not support _F_ASID
>>>>>
>>>>> v3: Deleted unneeded space
>>>>> ---
>>>>>    include/hw/virtio/vhost-vdpa.h |  8 +++++---
>>>>>    hw/virtio/vhost-vdpa.c         | 29 +++++++++++++++++++----------
>>>>>    net/vhost-vdpa.c               |  6 +++---
>>>>>    hw/virtio/trace-events         |  4 ++--
>>>>>    4 files changed, 29 insertions(+), 18 deletions(-)
>>>>>
>>>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
>>>>> index 1111d85643..6560bb9d78 100644
>>>>> --- a/include/hw/virtio/vhost-vdpa.h
>>>>> +++ b/include/hw/virtio/vhost-vdpa.h
>>>>> @@ -29,6 +29,7 @@ typedef struct vhost_vdpa {
>>>>>        int index;
>>>>>        uint32_t msg_type;
>>>>>        bool iotlb_batch_begin_sent;
>>>>> +    uint32_t address_space_id;
>>>> So the trick is let device specific code to zero this during allocation?
>>>>
>>> Yes, but I don't see how that is a trick :). All the other fields are
>>> also trusted to be 0 at allocation.
>>>
>>>>>        MemoryListener listener;
>>>>>        struct vhost_vdpa_iova_range iova_range;
>>>>>        uint64_t acked_features;
>>>>> @@ -42,8 +43,9 @@ typedef struct vhost_vdpa {
>>>>>        VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
>>>>>    } VhostVDPA;
>>>>>
>>>>> -int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
>>>>> -                       void *vaddr, bool readonly);
>>>>> -int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
>>>>> +int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
>>>>> +                       hwaddr size, void *vaddr, bool readonly);
>>>>> +int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
>>>>> +                         hwaddr size);
>>>>>
>>>>>    #endif
>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>>>> index 23efb8f49d..8fd32ba32b 100644
>>>>> --- a/hw/virtio/vhost-vdpa.c
>>>>> +++ b/hw/virtio/vhost-vdpa.c
>>>>> @@ -72,22 +72,24 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
>>>>>        return false;
>>>>>    }
>>>>>
>>>>> -int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
>>>>> -                       void *vaddr, bool readonly)
>>>>> +int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
>>>>> +                       hwaddr size, void *vaddr, bool readonly)
>>>>>    {
>>>>>        struct vhost_msg_v2 msg = {};
>>>>>        int fd = v->device_fd;
>>>>>        int ret = 0;
>>>>>
>>>>>        msg.type = v->msg_type;
>>>>> +    msg.asid = asid; /* 0 if vdpa device does not support asid */
>>>> The comment here is confusing. If this is a requirement, we need either
>>>>
>>>> 1) doc this
>>>>
>>>> or
>>>>
>>>> 2) perform necessary checks in the function itself.
>>>>
>>> I only documented it in vhost_vdpa_dma_unmap, and I only realize that
>>> now. Would it work to just copy that comment here?
>>
>> Probably, and let's move the comment above the function definition.
>>
>>
>>>>>        msg.iotlb.iova = iova;
>>>>>        msg.iotlb.size = size;
>>>>>        msg.iotlb.uaddr = (uint64_t)(uintptr_t)vaddr;
>>>>>        msg.iotlb.perm = readonly ? VHOST_ACCESS_RO : VHOST_ACCESS_RW;
>>>>>        msg.iotlb.type = VHOST_IOTLB_UPDATE;
>>>>>
>>>>> -   trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.iotlb.iova, msg.iotlb.size,
>>>>> -                            msg.iotlb.uaddr, msg.iotlb.perm, msg.iotlb.type);
>>>>> +    trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.asid, msg.iotlb.iova,
>>>>> +                             msg.iotlb.size, msg.iotlb.uaddr, msg.iotlb.perm,
>>>>> +                             msg.iotlb.type);
>>>>>
>>>>>        if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
>>>>>            error_report("failed to write, fd=%d, errno=%d (%s)",
>>>>> @@ -98,18 +100,24 @@ int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
>>>>>        return ret;
>>>>>    }
>>>>>
>>>>> -int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size)
>>>>> +int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
>>>>> +                         hwaddr size)
>>>>>    {
>>>>>        struct vhost_msg_v2 msg = {};
>>>>>        int fd = v->device_fd;
>>>>>        int ret = 0;
>>>>>
>>>>>        msg.type = v->msg_type;
>>>>> +    /*
>>>>> +     * The caller must set asid = 0 if the device does not support asid.
>>>>> +     * This is not an ABI break since it is set to 0 by the initializer anyway.
>>>>> +     */
>>>>> +    msg.asid = asid;
>>>>>        msg.iotlb.iova = iova;
>>>>>        msg.iotlb.size = size;
>>>>>        msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
>>>>>
>>>>> -    trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.iotlb.iova,
>>>>> +    trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.asid, msg.iotlb.iova,
>>>>>                                   msg.iotlb.size, msg.iotlb.type);
>>>>>
>>>>>        if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
>>>>> @@ -229,7 +237,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>>>>>        }
>>>>>
>>>>>        vhost_vdpa_iotlb_batch_begin_once(v);
>>>>> -    ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
>>>>> +    ret = vhost_vdpa_dma_map(v, 0, iova, int128_get64(llsize),
>>>> Can we use v->address_space_id here? Then we don't need to modify this
>>>> line when we support multiple asids logic in the future.
>>>>
>>> The registered memory listener is the one of the last vhost_vdpa, the
>>> one that handles the last queue.
>>>
>>> If all data virtqueues are not shadowed but CVQ is,
>>> v->address_space_id is 1 with the current code.
>>
>> Ok, right. So let's add a comment here. It would be even better to
>> define the macro for data vq asid in this patch.
>>
> I agree that it must be changed to a macro.
>
> However, the _ASID macros currently belong to net/. Maybe declare
> VHOST_VDPA_GPA_ASID in include/hw/virtio/vhost-vdpa.h and keep
> VHOST_VDPA_NET_CVQ_ASID in net/vhost-vdpa.c?
>
> Thanks!


That should be fine.

Thanks


>
>> Thanks
>>
>>
>>>    But the listener is
>>> actually mapping the ASID 0, not 1.
>>>
>>> Another alternative is to register it to the last data virtqueue, not
>>> the last queue of vhost_vdpa. But that is hard to express in a
>>> generic way at virtio/vhost-vdpa.c. Maybe a boolean indicating which
>>> vhost_vdpa should register its memory listener?
>>>
>>> It seems easier to me to simply use ASID 0 for GPA translations. If SVQ
>>> is enabled for all queues, then ASID 0 maps GPA to qemu's VA plus the
>>> SVQ regions. If it is not, ASID 0 always maps GPA to qemu's VA.
>>>
>>> Thanks!
>>>
>>>> Thanks
>>>>
>>>>>                                 vaddr, section->readonly);
>>>>>        if (ret) {
>>>>>            error_report("vhost vdpa map fail!");
>>>>> @@ -303,7 +311,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
>>>>>            vhost_iova_tree_remove(v->iova_tree, *result);
>>>>>        }
>>>>>        vhost_vdpa_iotlb_batch_begin_once(v);
>>>>> -    ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
>>>>> +    ret = vhost_vdpa_dma_unmap(v, 0, iova, int128_get64(llsize));
>>>>>        if (ret) {
>>>>>            error_report("vhost_vdpa dma unmap error!");
>>>>>        }
>>>>> @@ -884,7 +892,7 @@ static void vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v, hwaddr addr)
>>>>>        }
>>>>>
>>>>>        size = ROUND_UP(result->size, qemu_real_host_page_size());
>>>>> -    r = vhost_vdpa_dma_unmap(v, result->iova, size);
>>>>> +    r = vhost_vdpa_dma_unmap(v, v->address_space_id, result->iova, size);
>>>>>        if (unlikely(r < 0)) {
>>>>>            error_report("Unable to unmap SVQ vring: %s (%d)", g_strerror(-r), -r);
>>>>>            return;
>>>>> @@ -924,7 +932,8 @@ static bool vhost_vdpa_svq_map_ring(struct vhost_vdpa *v, DMAMap *needle,
>>>>>            return false;
>>>>>        }
>>>>>
>>>>> -    r = vhost_vdpa_dma_map(v, needle->iova, needle->size + 1,
>>>>> +    r = vhost_vdpa_dma_map(v, v->address_space_id, needle->iova,
>>>>> +                           needle->size + 1,
>>>>>                               (void *)(uintptr_t)needle->translated_addr,
>>>>>                               needle->perm == IOMMU_RO);
>>>>>        if (unlikely(r != 0)) {
>>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>>>> index fb35b17ab4..ca1acc0410 100644
>>>>> --- a/net/vhost-vdpa.c
>>>>> +++ b/net/vhost-vdpa.c
>>>>> @@ -258,7 +258,7 @@ static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
>>>>>            return;
>>>>>        }
>>>>>
>>>>> -    r = vhost_vdpa_dma_unmap(v, map->iova, map->size + 1);
>>>>> +    r = vhost_vdpa_dma_unmap(v, v->address_space_id, map->iova, map->size + 1);
>>>>>        if (unlikely(r != 0)) {
>>>>>            error_report("Device cannot unmap: %s(%d)", g_strerror(r), r);
>>>>>        }
>>>>> @@ -298,8 +298,8 @@ static int vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v, void *buf, size_t size,
>>>>>            return r;
>>>>>        }
>>>>>
>>>>> -    r = vhost_vdpa_dma_map(v, map.iova, vhost_vdpa_net_cvq_cmd_page_len(), buf,
>>>>> -                           !write);
>>>>> +    r = vhost_vdpa_dma_map(v, v->address_space_id, map.iova,
>>>>> +                           vhost_vdpa_net_cvq_cmd_page_len(), buf, !write);
>>>>>        if (unlikely(r < 0)) {
>>>>>            goto dma_map_err;
>>>>>        }
>>>>> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
>>>>> index 820dadc26c..0ad9390307 100644
>>>>> --- a/hw/virtio/trace-events
>>>>> +++ b/hw/virtio/trace-events
>>>>> @@ -30,8 +30,8 @@ vhost_user_write(uint32_t req, uint32_t flags) "req:%d flags:0x%"PRIx32""
>>>>>    vhost_user_create_notifier(int idx, void *n) "idx:%d n:%p"
>>>>>
>>>>>    # vhost-vdpa.c
>>>>> -vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
>>>>> -vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
>>>>> +vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint32_t asid, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" asid: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
>>>>> +vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint32_t asid, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" asid: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
>>>>>    vhost_vdpa_listener_begin_batch(void *v, int fd, uint32_t msg_type, uint8_t type)  "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
>>>>>    vhost_vdpa_listener_commit(void *v, int fd, uint32_t msg_type, uint8_t type)  "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
>>>>>    vhost_vdpa_listener_region_add(void *vdpa, uint64_t iova, uint64_t llend, void *vaddr, bool readonly) "vdpa: %p iova 0x%"PRIx64" llend 0x%"PRIx64" vaddr: %p read-only: %d"
>>>>> --
>>>>> 2.31.1
>>>>>



* Re: [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa
  2022-11-11 13:12         ` Eugenio Perez Martin
@ 2022-11-14  4:30           ` Jason Wang
  2022-11-14 16:30             ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-14  4:30 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini


On 2022/11/11 21:12, Eugenio Perez Martin wrote:
> On Fri, Nov 11, 2022 at 8:49 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2022/11/10 21:47, Eugenio Perez Martin wrote:
>>> On Thu, Nov 10, 2022 at 7:01 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>>>> The memory listener that tells the device how to convert GPA to qemu's
>>>>> VA is registered against the CVQ vhost_vdpa. This series tries to map
>>>>> the memory listener translations to ASID 0, while it maps the CVQ ones
>>>>> to ASID 1.
>>>>>
>>>>> Let's tell the listener if it needs to register them on iova tree or
>>>>> not.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>> v5: Solve conflict about vhost_iova_tree_remove accepting mem_region by
>>>>>       value.
>>>>> ---
>>>>>    include/hw/virtio/vhost-vdpa.h | 2 ++
>>>>>    hw/virtio/vhost-vdpa.c         | 6 +++---
>>>>>    net/vhost-vdpa.c               | 1 +
>>>>>    3 files changed, 6 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
>>>>> index 6560bb9d78..0c3ed2d69b 100644
>>>>> --- a/include/hw/virtio/vhost-vdpa.h
>>>>> +++ b/include/hw/virtio/vhost-vdpa.h
>>>>> @@ -34,6 +34,8 @@ typedef struct vhost_vdpa {
>>>>>        struct vhost_vdpa_iova_range iova_range;
>>>>>        uint64_t acked_features;
>>>>>        bool shadow_vqs_enabled;
>>>>> +    /* The listener must send iova tree addresses, not GPA */
>>
>> Btw, Cindy's vIOMMU series will make it not necessarily GPA anymore.
>>
> Yes, this comment should be tuned then. But the SVQ iova_tree will not
> be equal to the vIOMMU one because of the shadow vrings.
>
> But maybe SVQ can inspect both instead of having all the duplicated entries.
>
>>>>> +    bool listener_shadow_vq;
>>>>>        /* IOVA mapping used by the Shadow Virtqueue */
>>>>>        VhostIOVATree *iova_tree;
>>>>>        GPtrArray *shadow_vqs;
>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>>>> index 8fd32ba32b..e3914fa40e 100644
>>>>> --- a/hw/virtio/vhost-vdpa.c
>>>>> +++ b/hw/virtio/vhost-vdpa.c
>>>>> @@ -220,7 +220,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>>>>>                                             vaddr, section->readonly);
>>>>>
>>>>>        llsize = int128_sub(llend, int128_make64(iova));
>>>>> -    if (v->shadow_vqs_enabled) {
>>>>> +    if (v->listener_shadow_vq) {
>>>>>            int r;
>>>>>
>>>>>            mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
>>>>> @@ -247,7 +247,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>>>>>        return;
>>>>>
>>>>>    fail_map:
>>>>> -    if (v->shadow_vqs_enabled) {
>>>>> +    if (v->listener_shadow_vq) {
>>>>>            vhost_iova_tree_remove(v->iova_tree, mem_region);
>>>>>        }
>>>>>
>>>>> @@ -292,7 +292,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
>>>>>
>>>>>        llsize = int128_sub(llend, int128_make64(iova));
>>>>>
>>>>> -    if (v->shadow_vqs_enabled) {
>>>>> +    if (v->listener_shadow_vq) {
>>>>>            const DMAMap *result;
>>>>>            const void *vaddr = memory_region_get_ram_ptr(section->mr) +
>>>>>                section->offset_within_region +
>>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>>>> index 85a318faca..02780ee37b 100644
>>>>> --- a/net/vhost-vdpa.c
>>>>> +++ b/net/vhost-vdpa.c
>>>>> @@ -570,6 +570,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>>>>>        s->vhost_vdpa.index = queue_pair_index;
>>>>>        s->always_svq = svq;
>>>>>        s->vhost_vdpa.shadow_vqs_enabled = svq;
>>>>> +    s->vhost_vdpa.listener_shadow_vq = svq;
>>>> Any chance those above two can differ?
>>>>
>>> If CVQ is shadowed but data VQs are not, shadow_vqs_enabled is true
>>> but listener_shadow_vq is not.
>>>
>>> It is more clear in the next commit, where only shadow_vqs_enabled is
>>> set to true at vhost_vdpa_net_cvq_start.
>>
>> Ok, the name looks a little bit confusing. I wonder if it's better to
>> use shadow_cvq and shadow_data?
>>
> I'm ok with renaming it, but struct vhost_vdpa is generic across all
> kinds of devices, and at the moment it does not know whether it belongs
> to the datapath or not.
>
> Maybe listener_uses_iova_tree?


I think "iova_tree" is something internal to the SVQ implementation;
it's better to define the name from the vhost_vdpa level's point of view.

Thanks


>
> Thanks!
>



* Re: [PATCH v6 10/10] vdpa: Always start CVQ in SVQ mode
  2022-11-11 14:38         ` Eugenio Perez Martin
@ 2022-11-14  4:36           ` Jason Wang
  0 siblings, 0 replies; 46+ messages in thread
From: Jason Wang @ 2022-11-14  4:36 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini


On 2022/11/11 22:38, Eugenio Perez Martin wrote:
> On Fri, Nov 11, 2022 at 9:03 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2022/11/11 00:07, Eugenio Perez Martin wrote:
>>> On Thu, Nov 10, 2022 at 7:25 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2022/11/9 01:07, Eugenio Pérez wrote:
>>>>> Isolate control virtqueue in its own group, allowing to intercept control
>>>>> commands but letting dataplane run totally passthrough to the guest.
>>>> I think we need to tweak the title to "vdpa: Always start CVQ in SVQ
>>>> mode if possible", since SVQ for CVQ can't be enabled without ASID support.
>>>>
>>> Yes, I totally agree.
>>
>> Btw, I wonder if it's worth removing the "x-" prefix for the shadow
>> virtqueue. It can help devices that lack ASID support but want to do
>> live migration.
>>
> Sure, I can propose that on top.
>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>> v6:
>>>>> * Disable control SVQ if the device does not support it because of
>>>>> features.
>>>>>
>>>>> v5:
>>>>> * Fixing the not adding cvq buffers when x-svq=on is specified.
>>>>> * Move vring state in vhost_vdpa_get_vring_group instead of using a
>>>>>      parameter.
>>>>> * Rename VHOST_VDPA_NET_CVQ_PASSTHROUGH to VHOST_VDPA_NET_DATA_ASID
>>>>>
>>>>> v4:
>>>>> * Squash vhost_vdpa_cvq_group_is_independent.
>>>>> * Rebased on last CVQ start series, that allocated CVQ cmd bufs at load
>>>>> * Do not check for cvq index on vhost_vdpa_net_prepare, we only have one
>>>>>      that callback registered in that NetClientInfo.
>>>>>
>>>>> v3:
>>>>> * Make asid related queries print a warning instead of returning an
>>>>>      error and stop the start of qemu.
>>>>> ---
>>>>>     hw/virtio/vhost-vdpa.c |   3 +-
>>>>>     net/vhost-vdpa.c       | 138 ++++++++++++++++++++++++++++++++++++++---
>>>>>     2 files changed, 132 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>>>> index e3914fa40e..6401e7efb1 100644
>>>>> --- a/hw/virtio/vhost-vdpa.c
>>>>> +++ b/hw/virtio/vhost-vdpa.c
>>>>> @@ -648,7 +648,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
>>>>>     {
>>>>>         uint64_t features;
>>>>>         uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 |
>>>>> -        0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH;
>>>>> +        0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH |
>>>>> +        0x1ULL << VHOST_BACKEND_F_IOTLB_ASID;
>>>>>         int r;
>>>>>
>>>>>         if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) {
>>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>>>> index 02780ee37b..7245ea70c6 100644
>>>>> --- a/net/vhost-vdpa.c
>>>>> +++ b/net/vhost-vdpa.c
>>>>> @@ -38,6 +38,9 @@ typedef struct VhostVDPAState {
>>>>>         void *cvq_cmd_out_buffer;
>>>>>         virtio_net_ctrl_ack *status;
>>>>>
>>>>> +    /* Number of address spaces supported by the device */
>>>>> +    unsigned address_space_num;
>>>> I'm not sure this is the best place to store things like this, since it
>>>> can cause confusion: we will have multiple VhostVDPAState structures
>>>> when multiqueue is enabled.
>>>>
>>> I think we can delete this and query it at each device start.
>>>
>>>>> +
>>>>>         /* The device always have SVQ enabled */
>>>>>         bool always_svq;
>>>>>         bool started;
>>>>> @@ -101,6 +104,9 @@ static const uint64_t vdpa_svq_device_features =
>>>>>         BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
>>>>>         BIT_ULL(VIRTIO_NET_F_STANDBY);
>>>>>
>>>>> +#define VHOST_VDPA_NET_DATA_ASID 0
>>>>> +#define VHOST_VDPA_NET_CVQ_ASID 1
>>>>> +
>>>>>     VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
>>>>>     {
>>>>>         VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>>>>> @@ -242,6 +248,34 @@ static NetClientInfo net_vhost_vdpa_info = {
>>>>>             .check_peer_type = vhost_vdpa_check_peer_type,
>>>>>     };
>>>>>
>>>>> +static uint32_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
>>>>> +{
>>>>> +    struct vhost_vring_state state = {
>>>>> +        .index = vq_index,
>>>>> +    };
>>>>> +    int r = ioctl(device_fd, VHOST_VDPA_GET_VRING_GROUP, &state);
>>>>> +
>>>>> +    return r < 0 ? 0 : state.num;
>>>> Assuming 0 when ioctl() fails is probably not a good idea: errors in ioctl
>>>> might be hidden. It would be better to fallback to 0 when ASID is not
>>>> supported.
>>>>
>>> Did I misunderstand you on [1]?
>>
>> Nope. I think I was wrong at that time then :( Sorry for that.
>>
>> We should differentiate between the cases:
>>
>> 1) no ASID support, so 0 is assumed
>>
>> 2) something wrong in the ioctl; it's not necessarily ENOTSUPP.
>>
> What action should we take here? Isn't it better to disable SVQ and
> let the device run?


The function should fail instead of assuming 0, the same way other vhost 
ioctls do.
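Something along these lines would propagate the error; the struct and ioctl
request number below are illustrative stand-ins for the <linux/vhost.h>
definitions, not the real uapi:

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Illustrative stand-ins; the real definitions live in <linux/vhost.h>. */
struct vring_state {
    unsigned int index;
    unsigned int num;
};
#define GET_VRING_GROUP_REQ _IOWR(0xaf, 0x7b, struct vring_state)

/* Return 0 and fill *group on success, or a negative errno on failure,
 * instead of silently mapping every ioctl error to group 0. */
static int get_vring_group(int device_fd, unsigned vq_index, uint32_t *group)
{
    struct vring_state state = { .index = vq_index };

    if (ioctl(device_fd, GET_VRING_GROUP_REQ, &state) < 0) {
        return -errno;
    }
    *group = (uint32_t)state.num;
    return 0;
}
```

The caller can then abort the start path on any error instead of quietly
falling back to the default group.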


>
>>>>> +}
>>>>> +
>>>>> +static int vhost_vdpa_set_address_space_id(struct vhost_vdpa *v,
>>>>> +                                           unsigned vq_group,
>>>>> +                                           unsigned asid_num)
>>>>> +{
>>>>> +    struct vhost_vring_state asid = {
>>>>> +        .index = vq_group,
>>>>> +        .num = asid_num,
>>>>> +    };
>>>>> +    int ret;
>>>>> +
>>>>> +    ret = ioctl(v->device_fd, VHOST_VDPA_SET_GROUP_ASID, &asid);
>>>>> +    if (unlikely(ret < 0)) {
>>>>> +        warn_report("Can't set vq group %u asid %u, errno=%d (%s)",
>>>>> +            asid.index, asid.num, errno, g_strerror(errno));
>>>>> +    }
>>>>> +    return ret;
>>>>> +}
>>>>> +
>>>>>     static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
>>>>>     {
>>>>>         VhostIOVATree *tree = v->iova_tree;
>>>>> @@ -316,11 +350,54 @@ dma_map_err:
>>>>>     static int vhost_vdpa_net_cvq_start(NetClientState *nc)
>>>>>     {
>>>>>         VhostVDPAState *s;
>>>>> -    int r;
>>>>> +    struct vhost_vdpa *v;
>>>>> +    uint32_t cvq_group;
>>>>> +    int cvq_index, r;
>>>>>
>>>>>         assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>>>>>
>>>>>         s = DO_UPCAST(VhostVDPAState, nc, nc);
>>>>> +    v = &s->vhost_vdpa;
>>>>> +
>>>>> +    v->listener_shadow_vq = s->always_svq;
>>>>> +    v->shadow_vqs_enabled = s->always_svq;
>>>>> +    s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_DATA_ASID;
>>>>> +
>>>>> +    if (s->always_svq) {
>>>>> +        goto out;
>>>>> +    }
>>>>> +
>>>>> +    if (s->address_space_num < 2) {
>>>>> +        return 0;
>>>>> +    }
>>>>> +
>>>>> +    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
>>>>> +        return 0;
>>>>> +    }
>>>> Any reason we do the above check during the start/stop? It should be
>>>> easier to do that in the initialization.
>>>>
>>> We can store it as a member of VhostVDPAState maybe? They will be
>>> duplicated like the current number of AS.
>>
>> I meant each VhostVDPAState just need to know the ASID it needs to use.
>> There's no need to know the total number of address spaces or do the
>> validation on it during start (the validation could be done during
>> initialization).
>>
> I thought we were talking about the virtio features.
>
> So let's omit this check and simply try to set ASID? The worst case is
> an -ENOTSUPP or -EINVAL, so the actions to take are the same as if we
> don't have enough AS.


See the discussion in other thread:

1) do the probing of the ASID support during initialization; we can simply
try to set the ASID here and fail the vhost dev start
2) do the probing each time during vhost start

Personally I like 1, but if you want to go with 2 it should also be fine
(it needs some comments).
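Option 1 could look roughly like this, treating only ENOTSUP/EINVAL as "no
CVQ isolation" and anything else as a hard failure. The struct and ioctl
request number are illustrative stand-ins for the <linux/vhost.h> uapi:

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/ioctl.h>

/* Illustrative stand-ins; the real definitions live in <linux/vhost.h>. */
struct vring_state {
    unsigned int index;
    unsigned int num;
};
#define SET_GROUP_ASID_REQ _IOW(0xaf, 0x7c, struct vring_state)

/* Probe whether the CVQ group can be moved to its own ASID.  ENOTSUP or
 * EINVAL mean "no isolation, fall back to a single ASID"; any other
 * error is a real failure that should abort the vhost dev start. */
static int probe_cvq_asid(int device_fd, unsigned cvq_group, unsigned asid,
                          bool *isolated)
{
    struct vring_state s = { .index = cvq_group, .num = asid };

    if (ioctl(device_fd, SET_GROUP_ASID_REQ, &s) < 0) {
        if (errno == ENOTSUP || errno == EINVAL) {
            *isolated = false;
            return 0;
        }
        return -errno;
    }
    *isolated = true;
    return 0;
}
```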


>
>>>>> +
>>>>> +    /**
>>>>> +     * Check if all the virtqueues of the virtio device are in a different vq
>>>>> +     * than the last vq. VQ group of last group passed in cvq_group.
>>>>> +     */
>>>>> +    cvq_index = v->dev->vq_index_end - 1;
>>>>> +    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
>>>>> +    for (int i = 0; i < cvq_index; ++i) {
>>>>> +        uint32_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
>>>>> +
>>>>> +        if (unlikely(group == cvq_group)) {
>>>>> +            warn_report("CVQ %u group is the same as VQ %u one (%u)", cvq_group,
>>>>> +                        i, group);
>>>>> +            return 0;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
>>>>> +    if (r == 0) {
>>>>> +        v->shadow_vqs_enabled = true;
>>>>> +        s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_CVQ_ASID;
>>>>> +    }
>>>>> +
>>>>> +out:
>>>>>         if (!s->vhost_vdpa.shadow_vqs_enabled) {
>>>>>             return 0;
>>>>>         }
>>>>> @@ -542,12 +619,38 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
>>>>>         .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
>>>>>     };
>>>>>
>>>>> +static uint32_t vhost_vdpa_get_as_num(int vdpa_device_fd)
>>>>> +{
>>>>> +    uint64_t features;
>>>>> +    unsigned num_as;
>>>>> +    int r;
>>>>> +
>>>>> +    r = ioctl(vdpa_device_fd, VHOST_GET_BACKEND_FEATURES, &features);
>>>>> +    if (unlikely(r < 0)) {
>>>>> +        warn_report("Cannot get backend features");
>>>>> +        return 1;
>>>>> +    }
>>>>> +
>>>>> +    if (!(features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
>>>>> +        return 1;
>>>>> +    }
>>>>> +
>>>>> +    r = ioctl(vdpa_device_fd, VHOST_VDPA_GET_AS_NUM, &num_as);
>>>>> +    if (unlikely(r < 0)) {
>>>>> +        warn_report("Cannot retrieve number of supported ASs");
>>>>> +        return 1;
>>>> Let's return an error here. This helps to identify bugs in QEMU or the kernel.
>>>>
>>> Same comment as with VHOST_VDPA_GET_VRING_GROUP.
>>>
>>>>> +    }
>>>>> +
>>>>> +    return num_as;
>>>>> +}
>>>>> +
>>>>>     static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>>>>>                                                const char *device,
>>>>>                                                const char *name,
>>>>>                                                int vdpa_device_fd,
>>>>>                                                int queue_pair_index,
>>>>>                                                int nvqs,
>>>>> +                                           unsigned nas,
>>>>>                                                bool is_datapath,
>>>>>                                                bool svq,
>>>>>                                                VhostIOVATree *iova_tree)
>>>>> @@ -566,6 +669,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>>>>>         qemu_set_info_str(nc, TYPE_VHOST_VDPA);
>>>>>         s = DO_UPCAST(VhostVDPAState, nc, nc);
>>>>>
>>>>> +    s->address_space_num = nas;
>>>>>         s->vhost_vdpa.device_fd = vdpa_device_fd;
>>>>>         s->vhost_vdpa.index = queue_pair_index;
>>>>>         s->always_svq = svq;
>>>>> @@ -652,6 +756,8 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>>>         g_autoptr(VhostIOVATree) iova_tree = NULL;
>>>>>         NetClientState *nc;
>>>>>         int queue_pairs, r, i = 0, has_cvq = 0;
>>>>> +    unsigned num_as = 1;
>>>>> +    bool svq_cvq;
>>>>>
>>>>>         assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>>>>>         opts = &netdev->u.vhost_vdpa;
>>>>> @@ -693,12 +799,28 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>>>             return queue_pairs;
>>>>>         }
>>>>>
>>>>> -    if (opts->x_svq) {
>>>>> -        struct vhost_vdpa_iova_range iova_range;
>>>>> +    svq_cvq = opts->x_svq;
>>>>> +    if (has_cvq && !opts->x_svq) {
>>>>> +        num_as = vhost_vdpa_get_as_num(vdpa_device_fd);
>>>>> +        svq_cvq = num_as > 1;
>>>>> +    }
>>>> The above check is not easy to follow, how about?
>>>>
>>>> svq_cvq = vhost_vdpa_get_as_num() > 1 ? true : opts->x_svq;
>>>>
>>> That would allocate the iova tree even if CVQ is not used in the
>>> guest. And num_as is reused later, although we can ask it to the
>>> device at device start to avoid this.
>>
>> Ok.
>>
>>
>>> If any, the linear conversion would be:
>>> svq_cvq = opts->x_svq || (has_cvq && vhost_vdpa_get_as_num(vdpa_device_fd))
>>>
>>> So we avoid the AS_NUM ioctl if not needed.
>>
>> So when !opts->x_svq, we need to check num_as is at least 2?
>>
> As I think you proposed, we can simply try to set CVQ ASID and react to -EINVAL.
>
> But this code here is trying not to allocate the iova_tree if we're
> sure it will not be needed. Maybe it is easier to always allocate the
> empty iova tree and only fill it if needed?


I feel it should work.


>
>>>>> +
>>>>> +    if (opts->x_svq || svq_cvq) {
>>>> Any chance we can have opts->x_svq = true but svq_cvq = false? Checking
>>>> svq_cvq seems sufficient here.
>>>>
>>> The reverse is possible, to have svq_cvq but no opts->x_svq.
>>>
>>> Depending on that, this code emits a warning or a fatal error.
>>
>> Ok, as replied in the previous patch, I think we need a better name for
>> those ones.
>>
> svq_cvq can be renamed for sure. x_svq can be aliased with another
> variable if needed too.
>
>> if (opts->x_svq) {
>>           shadow_data_vq = true;
>>           if(has_cvq) shadow_cvq = true;
>> } else if (num_as >= 2 && has_cvq) {
>>           shadow_cvq = true;
>> }
>>
>> The other logic can just check shadow_cvq or shadow_data_vq individually.
>>
> Not sure if shadow_data_vq is accurate. It sounds to me like "only
> shadow the data virtqueues but not CVQ".
>
> shadow_device and shadow_cvq?


Should work, let's see.
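The proposal above boils down to a small decision table; a pure-logic
sketch (the names follow the suggestion in this thread and are not final):

```c
#include <stdbool.h>

/* Decide which virtqueues to shadow: x_svq forces everything; otherwise
 * only CVQ is shadowed, and only when the device has one and offers at
 * least two address spaces. */
static void decide_shadowing(bool x_svq, bool has_cvq, unsigned num_as,
                             bool *shadow_data, bool *shadow_cvq)
{
    *shadow_data = x_svq;
    *shadow_cvq = has_cvq && (x_svq || num_as >= 2);
}
```

The rest of the logic can then check shadow_cvq or shadow_data
individually, as suggested above.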

Thanks


>
>>>>> +        Error *warn = NULL;
>>>>>
>>>>> -        if (!vhost_vdpa_net_valid_svq_features(features, errp)) {
>>>>> -            goto err_svq;
>>>>> +        svq_cvq = vhost_vdpa_net_valid_svq_features(features,
>>>>> +                                                   opts->x_svq ? errp : &warn);
>>>>> +        if (!svq_cvq) {
>>>> Same question as above.
>>>>
>>>>
>>>>> +            if (opts->x_svq) {
>>>>> +                goto err_svq;
>>>>> +            } else {
>>>>> +                warn_reportf_err(warn, "Cannot shadow CVQ: ");
>>>>> +            }
>>>>>             }
>>>>> +    }
>>>>> +
>>>>> +    if (opts->x_svq || svq_cvq) {
>>>>> +        struct vhost_vdpa_iova_range iova_range;
>>>>>
>>>>>             vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
>>>>>             iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
>>>>> @@ -708,15 +830,15 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>>>>>
>>>>>         for (i = 0; i < queue_pairs; i++) {
>>>>>             ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>>>>> -                                     vdpa_device_fd, i, 2, true, opts->x_svq,
>>>>> -                                     iova_tree);
>>>>> +                                     vdpa_device_fd, i, 2, num_as, true,
>>>> I don't get why we need pass num_as to a specific vhost_vdpa structure.
>>>> It should be sufficient to pass asid there.
>>>>
>>> ASID is not known at this time, but at device's start. This is because
>>> we cannot ask if the CVQ is in its own vq group, because we don't know
>>> the control virtqueue index until the guest acknowledges the different
>>> features.
>>
>> We can probe those during initialization I think. E.g doing some
>> negotiation in the initialization phase.
>>
> We've developed this idea in other threads, let's continue there better.
>
> Thanks!
>
>> Thanks
>>
>>
>>> Thanks!
>>>
>>> [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg901685.html
>>>
>>>
>>>>> +                                     opts->x_svq, iova_tree);
>>>>>             if (!ncs[i])
>>>>>                 goto err;
>>>>>         }
>>>>>
>>>>>         if (has_cvq) {
>>>>>             nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>>>>> -                                 vdpa_device_fd, i, 1, false,
>>>>> +                                 vdpa_device_fd, i, 1, num_as, false,
>>>>>                                      opts->x_svq, iova_tree);
>>>>>             if (!nc)
>>>>>                 goto err;


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 05/10] vdpa: move SVQ vring features check to net/
  2022-11-14  4:26               ` Jason Wang
@ 2022-11-14 10:10                 ` Eugenio Perez Martin
  0 siblings, 0 replies; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-14 10:10 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Mon, Nov 14, 2022 at 5:26 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2022/11/11 20:58, Eugenio Perez Martin 写道:
> > On Fri, Nov 11, 2022 at 9:07 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Fri, Nov 11, 2022 at 3:56 PM Eugenio Perez Martin
> >> <eperezma@redhat.com> wrote:
> >>> On Fri, Nov 11, 2022 at 8:34 AM Jason Wang <jasowang@redhat.com> wrote:
> >>>>
> >>>> 在 2022/11/10 21:09, Eugenio Perez Martin 写道:
> >>>>> On Thu, Nov 10, 2022 at 6:40 AM Jason Wang <jasowang@redhat.com> wrote:
> >>>>>> 在 2022/11/9 01:07, Eugenio Pérez 写道:
> >>>>>>> The next patches will start control SVQ if possible. However, we don't
> >>>>>>> know if that will be possible at qemu boot anymore.
> >>>>>> If I was not wrong, there's no device specific feature that is checked
> >>>>>> in the function. So it should be general enough to be used by devices
> >>>>>> other than net. Then I don't see any advantage of doing this.
> >>>>>>
> >>>>> Because vhost_vdpa_init_svq is called at qemu boot, failing if it is
> >>>>> not possible to shadow the Virtqueue.
> >>>>>
> >>>>> Now the CVQ will be shadowed if possible, so we need to check this at
> >>>>> device start, not at initialization.
> >>>>
> >>>> Any reason we can't check this at initialization instead of device
> >>>> start? We don't need driver_features and we can do probing to make sure
> >>>> cvq has a unique group at initialization time.
> >>>>
> >>> We need the CVQ index to check if it has an independent group. CVQ
> >>> index depends on the features the guest's ack:
> >>> * If it acks _F_MQ, it is the last one.
> >>> * If it doesn't, CVQ idx is 2.
> >>>
> >>> We cannot have acked features at initialization, and they could
> >>> change: It is valid for a guest to ack _F_MQ, then reset the device,
> >>> then not ack it.
> >> Can we do some probing by negotiating _F_MQ if the device offers it,
> >> then we can know if cvq has a unique group?
> >>
> > What if the guest does not ack _F_MQ?
> >
> > To be completed it would go like:
> >
> > * Probe negotiate _F_MQ, check unique group,
> > * Probe negotiate !_F_MQ, check unique group,
>
>
> I think it should be a bug if device present a unique virtqueue group
> that depends on a specific feature. That is to say, we can do a single
> round of probing instead of try it twice here.
>

I didn't mean a single virtqueue group, but that CVQ has its own group.

From the vhost POV the valid behavior is feature-dependent already: if
_F_MQ is not negotiated, the vq in a different group must be the 3rd one.
If it is negotiated, the vq in its own group must be the last one.

Since the check of the virtqueue group already depends on the vq index, I
think it would be a mistake to change a vq group's ASID without checking
whether it is independent or not.
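That feature-dependent index rule can be written down as follows (a
sketch; the names are illustrative, not QEMU's actual helpers):

```c
#include <stdint.h>

#define VIRTIO_NET_F_MQ 22 /* feature bit number, from the VirtIO spec */

/* Without _F_MQ only one data queue pair (vqs 0 and 1) is used, so CVQ
 * is vq 2; with _F_MQ it is the vq right after the last data vq. */
static unsigned cvq_index(uint64_t acked_features, unsigned max_queue_pairs)
{
    if (acked_features & (1ULL << VIRTIO_NET_F_MQ)) {
        return 2 * max_queue_pairs;
    }
    return 2;
}
```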

The consequences of not checking that are:
* We would move the whole dataplane vq group to ASID 1, so the device
would not work properly.
* The device would still be reported as migratable.

>
> > * Actually negotiate with the guest's feature set.
> > * React to failures. Probably the same way as if the CVQ is not
> > isolated, disabling SVQ?
> >
> > To me it seems simpler to specify somehow that the vq must be independent.
>
>
> It's just a suggestion, if you think doing it at the start, I'm fine.

Don't get me wrong, we can make changes to move in that direction and I
think it's a good idea, especially to reduce syscalls. I'm just trying to
put all the scenarios on the table, because maybe I'm missing something
needed to solve them, or maybe we can ignore them.

Thanks!

> But we need document the reason with a comment maybe.
>
> Thanks
>
>
> >
> > Thanks!
> >
> >>>>>    To store this information at boot
> >>>>> time is not valid anymore, because v->shadow_vqs_enabled is not valid
> >>>>> at this time anymore.
> >>>>
> >>>> Ok, but this doesn't explain why it is net specific but vhost-vdpa specific.
> >>>>
> >>> We can try to move it to a vhost op, but we have the same problem as
> >>> the svq array allocation: We don't have the right place in vhost ops
> >>> to check this. Maybe vhost_set_features is the right one here?
> >> If we can do all the probing at the initialization phase, we can do
> >> everything there.
> >>
> >> Thanks
> >>
> >>> Thanks!
> >>>
> >>>> Thanks
> >>>>
> >>>>
> >>>>> Thanks!
> >>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>>
> >>>>>>> Since the moved checks will be already evaluated at net/ to know if it
> >>>>>>> is ok to shadow CVQ, move them.
> >>>>>>>
> >>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>>>>>> ---
> >>>>>>>     hw/virtio/vhost-vdpa.c | 33 ++-------------------------------
> >>>>>>>     net/vhost-vdpa.c       |  3 ++-
> >>>>>>>     2 files changed, 4 insertions(+), 32 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>>>>>> index 3df2775760..146f0dcb40 100644
> >>>>>>> --- a/hw/virtio/vhost-vdpa.c
> >>>>>>> +++ b/hw/virtio/vhost-vdpa.c
> >>>>>>> @@ -402,29 +402,9 @@ static int vhost_vdpa_get_dev_features(struct vhost_dev *dev,
> >>>>>>>         return ret;
> >>>>>>>     }
> >>>>>>>
> >>>>>>> -static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> >>>>>>> -                               Error **errp)
> >>>>>>> +static void vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v)
> >>>>>>>     {
> >>>>>>>         g_autoptr(GPtrArray) shadow_vqs = NULL;
> >>>>>>> -    uint64_t dev_features, svq_features;
> >>>>>>> -    int r;
> >>>>>>> -    bool ok;
> >>>>>>> -
> >>>>>>> -    if (!v->shadow_vqs_enabled) {
> >>>>>>> -        return 0;
> >>>>>>> -    }
> >>>>>>> -
> >>>>>>> -    r = vhost_vdpa_get_dev_features(hdev, &dev_features);
> >>>>>>> -    if (r != 0) {
> >>>>>>> -        error_setg_errno(errp, -r, "Can't get vdpa device features");
> >>>>>>> -        return r;
> >>>>>>> -    }
> >>>>>>> -
> >>>>>>> -    svq_features = dev_features;
> >>>>>>> -    ok = vhost_svq_valid_features(svq_features, errp);
> >>>>>>> -    if (unlikely(!ok)) {
> >>>>>>> -        return -1;
> >>>>>>> -    }
> >>>>>>>
> >>>>>>>         shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
> >>>>>>>         for (unsigned n = 0; n < hdev->nvqs; ++n) {
> >>>>>>> @@ -436,7 +416,6 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
> >>>>>>>         }
> >>>>>>>
> >>>>>>>         v->shadow_vqs = g_steal_pointer(&shadow_vqs);
> >>>>>>> -    return 0;
> >>>>>>>     }
> >>>>>>>
> >>>>>>>     static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> >>>>>>> @@ -461,11 +440,7 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> >>>>>>>         dev->opaque =  opaque ;
> >>>>>>>         v->listener = vhost_vdpa_memory_listener;
> >>>>>>>         v->msg_type = VHOST_IOTLB_MSG_V2;
> >>>>>>> -    ret = vhost_vdpa_init_svq(dev, v, errp);
> >>>>>>> -    if (ret) {
> >>>>>>> -        goto err;
> >>>>>>> -    }
> >>>>>>> -
> >>>>>>> +    vhost_vdpa_init_svq(dev, v);
> >>>>>>>         vhost_vdpa_get_iova_range(v);
> >>>>>>>
> >>>>>>>         if (!vhost_vdpa_first_dev(dev)) {
> >>>>>>> @@ -476,10 +451,6 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> >>>>>>>                                    VIRTIO_CONFIG_S_DRIVER);
> >>>>>>>
> >>>>>>>         return 0;
> >>>>>>> -
> >>>>>>> -err:
> >>>>>>> -    ram_block_discard_disable(false);
> >>>>>>> -    return ret;
> >>>>>>>     }
> >>>>>>>
> >>>>>>>     static void vhost_vdpa_host_notifier_uninit(struct vhost_dev *dev,
> >>>>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> >>>>>>> index d3b1de481b..fb35b17ab4 100644
> >>>>>>> --- a/net/vhost-vdpa.c
> >>>>>>> +++ b/net/vhost-vdpa.c
> >>>>>>> @@ -117,9 +117,10 @@ static bool vhost_vdpa_net_valid_svq_features(uint64_t features, Error **errp)
> >>>>>>>         if (invalid_dev_features) {
> >>>>>>>             error_setg(errp, "vdpa svq does not work with features 0x%" PRIx64,
> >>>>>>>                        invalid_dev_features);
> >>>>>>> +        return false;
> >>>>>>>         }
> >>>>>>>
> >>>>>>> -    return !invalid_dev_features;
> >>>>>>> +    return vhost_svq_valid_features(features, errp);
> >>>>>>>     }
> >>>>>>>
> >>>>>>>     static int vhost_vdpa_net_check_device_id(struct vhost_net *net)
>



* Re: [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa
  2022-11-14  4:30           ` Jason Wang
@ 2022-11-14 16:30             ` Eugenio Perez Martin
  2022-11-15  3:04               ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-14 16:30 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Mon, Nov 14, 2022 at 5:30 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2022/11/11 21:12, Eugenio Perez Martin 写道:
> > On Fri, Nov 11, 2022 at 8:49 AM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> 在 2022/11/10 21:47, Eugenio Perez Martin 写道:
> >>> On Thu, Nov 10, 2022 at 7:01 AM Jason Wang <jasowang@redhat.com> wrote:
> >>>> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >>>>> The memory listener that thells the device how to convert GPA to qemu's
> >>>>> va is registered against CVQ vhost_vdpa. This series try to map the
> >>>>> memory listener translations to ASID 0, while it maps the CVQ ones to
> >>>>> ASID 1.
> >>>>>
> >>>>> Let's tell the listener if it needs to register them on iova tree or
> >>>>> not.
> >>>>>
> >>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>>>> ---
> >>>>> v5: Solve conflict about vhost_iova_tree_remove accepting mem_region by
> >>>>>       value.
> >>>>> ---
> >>>>>    include/hw/virtio/vhost-vdpa.h | 2 ++
> >>>>>    hw/virtio/vhost-vdpa.c         | 6 +++---
> >>>>>    net/vhost-vdpa.c               | 1 +
> >>>>>    3 files changed, 6 insertions(+), 3 deletions(-)
> >>>>>
> >>>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> >>>>> index 6560bb9d78..0c3ed2d69b 100644
> >>>>> --- a/include/hw/virtio/vhost-vdpa.h
> >>>>> +++ b/include/hw/virtio/vhost-vdpa.h
> >>>>> @@ -34,6 +34,8 @@ typedef struct vhost_vdpa {
> >>>>>        struct vhost_vdpa_iova_range iova_range;
> >>>>>        uint64_t acked_features;
> >>>>>        bool shadow_vqs_enabled;
> >>>>> +    /* The listener must send iova tree addresses, not GPA */
> >>
> >> Btw, cindy's vIOMMU series will make it not necessarily GPA any more.
> >>
> > Yes, this comment should be tuned then. But the SVQ iova_tree will not
> > be equal to vIOMMU one because shadow vrings.
> >
> > But maybe SVQ can inspect both instead of having all the duplicated entries.
> >
> >>>>> +    bool listener_shadow_vq;
> >>>>>        /* IOVA mapping used by the Shadow Virtqueue */
> >>>>>        VhostIOVATree *iova_tree;
> >>>>>        GPtrArray *shadow_vqs;
> >>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>>>> index 8fd32ba32b..e3914fa40e 100644
> >>>>> --- a/hw/virtio/vhost-vdpa.c
> >>>>> +++ b/hw/virtio/vhost-vdpa.c
> >>>>> @@ -220,7 +220,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >>>>>                                             vaddr, section->readonly);
> >>>>>
> >>>>>        llsize = int128_sub(llend, int128_make64(iova));
> >>>>> -    if (v->shadow_vqs_enabled) {
> >>>>> +    if (v->listener_shadow_vq) {
> >>>>>            int r;
> >>>>>
> >>>>>            mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
> >>>>> @@ -247,7 +247,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >>>>>        return;
> >>>>>
> >>>>>    fail_map:
> >>>>> -    if (v->shadow_vqs_enabled) {
> >>>>> +    if (v->listener_shadow_vq) {
> >>>>>            vhost_iova_tree_remove(v->iova_tree, mem_region);
> >>>>>        }
> >>>>>
> >>>>> @@ -292,7 +292,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> >>>>>
> >>>>>        llsize = int128_sub(llend, int128_make64(iova));
> >>>>>
> >>>>> -    if (v->shadow_vqs_enabled) {
> >>>>> +    if (v->listener_shadow_vq) {
> >>>>>            const DMAMap *result;
> >>>>>            const void *vaddr = memory_region_get_ram_ptr(section->mr) +
> >>>>>                section->offset_within_region +
> >>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> >>>>> index 85a318faca..02780ee37b 100644
> >>>>> --- a/net/vhost-vdpa.c
> >>>>> +++ b/net/vhost-vdpa.c
> >>>>> @@ -570,6 +570,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >>>>>        s->vhost_vdpa.index = queue_pair_index;
> >>>>>        s->always_svq = svq;
> >>>>>        s->vhost_vdpa.shadow_vqs_enabled = svq;
> >>>>> +    s->vhost_vdpa.listener_shadow_vq = svq;
> >>>> Any chance those above two can differ?
> >>>>
> >>> If CVQ is shadowed but data VQs are not, shadow_vqs_enabled is true
> >>> but listener_shadow_vq is not.
> >>>
> >>> It is more clear in the next commit, where only shadow_vqs_enabled is
> >>> set to true at vhost_vdpa_net_cvq_start.
> >>
> >> Ok, the name looks a little bit confusing. I wonder if it's better to
> >> use shadow_cvq and shadow_data ?
> >>
> > I'm ok with renaming it, but struct vhost_vdpa is generic across all
> > kind of devices, and it does not know if it is a datapath or not for
> > the moment.
> >
> > Maybe listener_uses_iova_tree?
>
>
> I think "iova_tree" is something that is internal to svq implementation,
> it's better to define the name from the view of vhost_vdpa level.
>

I don't get this; the vhost_vdpa struct already has a pointer to its iova_tree.

Thanks!



* Re: [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa
  2022-11-14 16:30             ` Eugenio Perez Martin
@ 2022-11-15  3:04               ` Jason Wang
  2022-11-15 11:24                 ` Eugenio Perez Martin
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2022-11-15  3:04 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Tue, Nov 15, 2022 at 12:31 AM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Mon, Nov 14, 2022 at 5:30 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> > 在 2022/11/11 21:12, Eugenio Perez Martin 写道:
> > > On Fri, Nov 11, 2022 at 8:49 AM Jason Wang <jasowang@redhat.com> wrote:
> > >>
> > >> 在 2022/11/10 21:47, Eugenio Perez Martin 写道:
> > >>> On Thu, Nov 10, 2022 at 7:01 AM Jason Wang <jasowang@redhat.com> wrote:
> > >>>> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >>>>> The memory listener that thells the device how to convert GPA to qemu's
> > >>>>> va is registered against CVQ vhost_vdpa. This series try to map the
> > >>>>> memory listener translations to ASID 0, while it maps the CVQ ones to
> > >>>>> ASID 1.
> > >>>>>
> > >>>>> Let's tell the listener if it needs to register them on iova tree or
> > >>>>> not.
> > >>>>>
> > >>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > >>>>> ---
> > >>>>> v5: Solve conflict about vhost_iova_tree_remove accepting mem_region by
> > >>>>>       value.
> > >>>>> ---
> > >>>>>    include/hw/virtio/vhost-vdpa.h | 2 ++
> > >>>>>    hw/virtio/vhost-vdpa.c         | 6 +++---
> > >>>>>    net/vhost-vdpa.c               | 1 +
> > >>>>>    3 files changed, 6 insertions(+), 3 deletions(-)
> > >>>>>
> > >>>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > >>>>> index 6560bb9d78..0c3ed2d69b 100644
> > >>>>> --- a/include/hw/virtio/vhost-vdpa.h
> > >>>>> +++ b/include/hw/virtio/vhost-vdpa.h
> > >>>>> @@ -34,6 +34,8 @@ typedef struct vhost_vdpa {
> > >>>>>        struct vhost_vdpa_iova_range iova_range;
> > >>>>>        uint64_t acked_features;
> > >>>>>        bool shadow_vqs_enabled;
> > >>>>> +    /* The listener must send iova tree addresses, not GPA */
> > >>
> > >> Btw, cindy's vIOMMU series will make it not necessarily GPA any more.
> > >>
> > > Yes, this comment should be tuned then. But the SVQ iova_tree will not
> > > be equal to vIOMMU one because shadow vrings.
> > >
> > > But maybe SVQ can inspect both instead of having all the duplicated entries.
> > >
> > >>>>> +    bool listener_shadow_vq;
> > >>>>>        /* IOVA mapping used by the Shadow Virtqueue */
> > >>>>>        VhostIOVATree *iova_tree;
> > >>>>>        GPtrArray *shadow_vqs;
> > >>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > >>>>> index 8fd32ba32b..e3914fa40e 100644
> > >>>>> --- a/hw/virtio/vhost-vdpa.c
> > >>>>> +++ b/hw/virtio/vhost-vdpa.c
> > >>>>> @@ -220,7 +220,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > >>>>>                                             vaddr, section->readonly);
> > >>>>>
> > >>>>>        llsize = int128_sub(llend, int128_make64(iova));
> > >>>>> -    if (v->shadow_vqs_enabled) {
> > >>>>> +    if (v->listener_shadow_vq) {
> > >>>>>            int r;
> > >>>>>
> > >>>>>            mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
> > >>>>> @@ -247,7 +247,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > >>>>>        return;
> > >>>>>
> > >>>>>    fail_map:
> > >>>>> -    if (v->shadow_vqs_enabled) {
> > >>>>> +    if (v->listener_shadow_vq) {
> > >>>>>            vhost_iova_tree_remove(v->iova_tree, mem_region);
> > >>>>>        }
> > >>>>>
> > >>>>> @@ -292,7 +292,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> > >>>>>
> > >>>>>        llsize = int128_sub(llend, int128_make64(iova));
> > >>>>>
> > >>>>> -    if (v->shadow_vqs_enabled) {
> > >>>>> +    if (v->listener_shadow_vq) {
> > >>>>>            const DMAMap *result;
> > >>>>>            const void *vaddr = memory_region_get_ram_ptr(section->mr) +
> > >>>>>                section->offset_within_region +
> > >>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > >>>>> index 85a318faca..02780ee37b 100644
> > >>>>> --- a/net/vhost-vdpa.c
> > >>>>> +++ b/net/vhost-vdpa.c
> > >>>>> @@ -570,6 +570,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> > >>>>>        s->vhost_vdpa.index = queue_pair_index;
> > >>>>>        s->always_svq = svq;
> > >>>>>        s->vhost_vdpa.shadow_vqs_enabled = svq;
> > >>>>> +    s->vhost_vdpa.listener_shadow_vq = svq;
> > >>>> Any chance those above two can differ?
> > >>>>
> > >>> If CVQ is shadowed but data VQs are not, shadow_vqs_enabled is true
> > >>> but listener_shadow_vq is not.
> > >>>
> > >>> It is more clear in the next commit, where only shadow_vqs_enabled is
> > >>> set to true at vhost_vdpa_net_cvq_start.
> > >>
> > >> Ok, the name looks a little bit confusing. I wonder if it's better to
> > >> use shadow_cvq and shadow_data ?
> > >>
> > > I'm ok with renaming it, but struct vhost_vdpa is generic across all
> > > kind of devices, and it does not know if it is a datapath or not for
> > > the moment.
> > >
> > > Maybe listener_uses_iova_tree?
> >
> >
> > I think "iova_tree" is something that is internal to svq implementation,
> > it's better to define the name from the view of vhost_vdpa level.
> >
>
> I don't get this, vhost_vdpa struct already has a pointer to its iova_tree.

Yes, this is a suggestion to improve the readability of the code. What I
meant is to have a name that shows why we need to use the iova_tree,
rather than "uses_iova_tree".

Thanks

>
> Thanks!
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa
  2022-11-15  3:04               ` Jason Wang
@ 2022-11-15 11:24                 ` Eugenio Perez Martin
  2022-11-16  3:33                   ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Eugenio Perez Martin @ 2022-11-15 11:24 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Tue, Nov 15, 2022 at 4:04 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Tue, Nov 15, 2022 at 12:31 AM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Mon, Nov 14, 2022 at 5:30 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > >
> > > 在 2022/11/11 21:12, Eugenio Perez Martin 写道:
> > > > On Fri, Nov 11, 2022 at 8:49 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >>
> > > >> 在 2022/11/10 21:47, Eugenio Perez Martin 写道:
> > > >>> On Thu, Nov 10, 2022 at 7:01 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >>>> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > >>>>> The memory listener that tells the device how to convert GPA to qemu's
> > > >>>>> va is registered against CVQ vhost_vdpa. This series tries to map the
> > > >>>>> memory listener translations to ASID 0, while it maps the CVQ ones to
> > > >>>>> ASID 1.
> > > >>>>>
> > > >>>>> Let's tell the listener whether it needs to register them on the iova
> > > >>>>> tree or not.
> > > >>>>>
> > > >>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > >>>>> ---
> > > >>>>> v5: Solve conflict about vhost_iova_tree_remove accepting mem_region by
> > > >>>>>       value.
> > > >>>>> ---
> > > >>>>>    include/hw/virtio/vhost-vdpa.h | 2 ++
> > > >>>>>    hw/virtio/vhost-vdpa.c         | 6 +++---
> > > >>>>>    net/vhost-vdpa.c               | 1 +
> > > >>>>>    3 files changed, 6 insertions(+), 3 deletions(-)
> > > >>>>>
> > > >>>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > >>>>> index 6560bb9d78..0c3ed2d69b 100644
> > > >>>>> --- a/include/hw/virtio/vhost-vdpa.h
> > > >>>>> +++ b/include/hw/virtio/vhost-vdpa.h
> > > >>>>> @@ -34,6 +34,8 @@ typedef struct vhost_vdpa {
> > > >>>>>        struct vhost_vdpa_iova_range iova_range;
> > > >>>>>        uint64_t acked_features;
> > > >>>>>        bool shadow_vqs_enabled;
> > > >>>>> +    /* The listener must send iova tree addresses, not GPA */
> > > >>
> > > >> Btw, Cindy's vIOMMU series will make it not necessarily GPA any more.
> > > >>
> > > > Yes, this comment should be tuned then. But the SVQ iova_tree will not
> > > > be equal to the vIOMMU one because of the shadow vrings.
> > > >
> > > > But maybe SVQ can inspect both instead of having all the duplicated entries.
> > > >
> > > >>>>> +    bool listener_shadow_vq;
> > > >>>>>        /* IOVA mapping used by the Shadow Virtqueue */
> > > >>>>>        VhostIOVATree *iova_tree;
> > > >>>>>        GPtrArray *shadow_vqs;
> > > >>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > >>>>> index 8fd32ba32b..e3914fa40e 100644
> > > >>>>> --- a/hw/virtio/vhost-vdpa.c
> > > >>>>> +++ b/hw/virtio/vhost-vdpa.c
> > > >>>>> @@ -220,7 +220,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > >>>>>                                             vaddr, section->readonly);
> > > >>>>>
> > > >>>>>        llsize = int128_sub(llend, int128_make64(iova));
> > > >>>>> -    if (v->shadow_vqs_enabled) {
> > > >>>>> +    if (v->listener_shadow_vq) {
> > > >>>>>            int r;
> > > >>>>>
> > > >>>>>            mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
> > > >>>>> @@ -247,7 +247,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > >>>>>        return;
> > > >>>>>
> > > >>>>>    fail_map:
> > > >>>>> -    if (v->shadow_vqs_enabled) {
> > > >>>>> +    if (v->listener_shadow_vq) {
> > > >>>>>            vhost_iova_tree_remove(v->iova_tree, mem_region);
> > > >>>>>        }
> > > >>>>>
> > > >>>>> @@ -292,7 +292,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> > > >>>>>
> > > >>>>>        llsize = int128_sub(llend, int128_make64(iova));
> > > >>>>>
> > > >>>>> -    if (v->shadow_vqs_enabled) {
> > > >>>>> +    if (v->listener_shadow_vq) {
> > > >>>>>            const DMAMap *result;
> > > >>>>>            const void *vaddr = memory_region_get_ram_ptr(section->mr) +
> > > >>>>>                section->offset_within_region +
> > > >>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > >>>>> index 85a318faca..02780ee37b 100644
> > > >>>>> --- a/net/vhost-vdpa.c
> > > >>>>> +++ b/net/vhost-vdpa.c
> > > >>>>> @@ -570,6 +570,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> > > >>>>>        s->vhost_vdpa.index = queue_pair_index;
> > > >>>>>        s->always_svq = svq;
> > > >>>>>        s->vhost_vdpa.shadow_vqs_enabled = svq;
> > > >>>>> +    s->vhost_vdpa.listener_shadow_vq = svq;
> > > >>>> Any chance those above two can differ?
> > > >>>>
> > > >>> If CVQ is shadowed but data VQs are not, shadow_vqs_enabled is true
> > > >>> but listener_shadow_vq is not.
> > > >>>
> > > >>> It is more clear in the next commit, where only shadow_vqs_enabled is
> > > >>> set to true at vhost_vdpa_net_cvq_start.
> > > >>
> > > >> Ok, the name looks a little bit confusing. I wonder if it's better to
> > > >> use shadow_cvq and shadow_data?
> > > >>
> > > > > I'm ok with renaming it, but struct vhost_vdpa is generic across all
> > > > > kinds of devices, and it does not know whether it is a datapath or
> > > > > not for the moment.
> > > >
> > > > Maybe listener_uses_iova_tree?
> > >
> > >
> > > > I think "iova_tree" is something that is internal to the SVQ
> > > > implementation; it's better to define the name from the vhost_vdpa
> > > > level's point of view.
> > >
> >
> > I don't get this, vhost_vdpa struct already has a pointer to its iova_tree.
>
> Yes, this is a suggestion to improve the readability of the code. What I
> meant is to pick a name that explains why we need to use the iova_tree,
> rather than just "uses_iova_tree".
>

I understand.

Knowing that the listener will always be bound to data vqs (whether net,
blk, ...), I think it is ok to rename it to shadow_data.

But I think there is no way to add shadow_cvq properly from
hw/virtio/vhost-vdpa.c, since we don't know whether the vhost_vdpa
belongs to a datapath or not. Would it work just to rename
listener_shadow_vq to shadow_data?

Thanks!



* Re: [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa
  2022-11-15 11:24                 ` Eugenio Perez Martin
@ 2022-11-16  3:33                   ` Jason Wang
  0 siblings, 0 replies; 46+ messages in thread
From: Jason Wang @ 2022-11-16  3:33 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Stefan Hajnoczi, Si-Wei Liu,
	Laurent Vivier, Harpreet Singh Anand, Michael S. Tsirkin,
	Gautam Dawar, Liuxiangdong, Stefano Garzarella, Cindy Lu,
	Eli Cohen, Cornelia Huck, Zhu Lingshan, kvm, Gonglei (Arei),
	Paolo Bonzini

On Tue, Nov 15, 2022 at 7:25 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Tue, Nov 15, 2022 at 4:04 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Tue, Nov 15, 2022 at 12:31 AM Eugenio Perez Martin
> > <eperezma@redhat.com> wrote:
> > >
> > > On Mon, Nov 14, 2022 at 5:30 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > >
> > > > 在 2022/11/11 21:12, Eugenio Perez Martin 写道:
> > > > > On Fri, Nov 11, 2022 at 8:49 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >>
> > > > >> 在 2022/11/10 21:47, Eugenio Perez Martin 写道:
> > > > >>> On Thu, Nov 10, 2022 at 7:01 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >>>> On Wed, Nov 9, 2022 at 1:08 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > >>>>> The memory listener that tells the device how to convert GPA to qemu's
> > > > >>>>> va is registered against CVQ vhost_vdpa. This series tries to map the
> > > > >>>>> memory listener translations to ASID 0, while it maps the CVQ ones to
> > > > >>>>> ASID 1.
> > > > >>>>>
> > > > >>>>> Let's tell the listener whether it needs to register them on the iova
> > > > >>>>> tree or not.
> > > > >>>>>
> > > > >>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > >>>>> ---
> > > > >>>>> v5: Solve conflict about vhost_iova_tree_remove accepting mem_region by
> > > > >>>>>       value.
> > > > >>>>> ---
> > > > >>>>>    include/hw/virtio/vhost-vdpa.h | 2 ++
> > > > >>>>>    hw/virtio/vhost-vdpa.c         | 6 +++---
> > > > >>>>>    net/vhost-vdpa.c               | 1 +
> > > > >>>>>    3 files changed, 6 insertions(+), 3 deletions(-)
> > > > >>>>>
> > > > >>>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > > >>>>> index 6560bb9d78..0c3ed2d69b 100644
> > > > >>>>> --- a/include/hw/virtio/vhost-vdpa.h
> > > > >>>>> +++ b/include/hw/virtio/vhost-vdpa.h
> > > > >>>>> @@ -34,6 +34,8 @@ typedef struct vhost_vdpa {
> > > > >>>>>        struct vhost_vdpa_iova_range iova_range;
> > > > >>>>>        uint64_t acked_features;
> > > > >>>>>        bool shadow_vqs_enabled;
> > > > >>>>> +    /* The listener must send iova tree addresses, not GPA */
> > > > >>
> > > > >> Btw, Cindy's vIOMMU series will make it not necessarily GPA any more.
> > > > >>
> > > > > Yes, this comment should be tuned then. But the SVQ iova_tree will not
> > > > > be equal to the vIOMMU one because of the shadow vrings.
> > > > >
> > > > > But maybe SVQ can inspect both instead of having all the duplicated entries.
> > > > >
> > > > >>>>> +    bool listener_shadow_vq;
> > > > >>>>>        /* IOVA mapping used by the Shadow Virtqueue */
> > > > >>>>>        VhostIOVATree *iova_tree;
> > > > >>>>>        GPtrArray *shadow_vqs;
> > > > >>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > > >>>>> index 8fd32ba32b..e3914fa40e 100644
> > > > >>>>> --- a/hw/virtio/vhost-vdpa.c
> > > > >>>>> +++ b/hw/virtio/vhost-vdpa.c
> > > > >>>>> @@ -220,7 +220,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > > >>>>>                                             vaddr, section->readonly);
> > > > >>>>>
> > > > >>>>>        llsize = int128_sub(llend, int128_make64(iova));
> > > > >>>>> -    if (v->shadow_vqs_enabled) {
> > > > >>>>> +    if (v->listener_shadow_vq) {
> > > > >>>>>            int r;
> > > > >>>>>
> > > > >>>>>            mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
> > > > >>>>> @@ -247,7 +247,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > > >>>>>        return;
> > > > >>>>>
> > > > >>>>>    fail_map:
> > > > >>>>> -    if (v->shadow_vqs_enabled) {
> > > > >>>>> +    if (v->listener_shadow_vq) {
> > > > >>>>>            vhost_iova_tree_remove(v->iova_tree, mem_region);
> > > > >>>>>        }
> > > > >>>>>
> > > > >>>>> @@ -292,7 +292,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> > > > >>>>>
> > > > >>>>>        llsize = int128_sub(llend, int128_make64(iova));
> > > > >>>>>
> > > > >>>>> -    if (v->shadow_vqs_enabled) {
> > > > >>>>> +    if (v->listener_shadow_vq) {
> > > > >>>>>            const DMAMap *result;
> > > > >>>>>            const void *vaddr = memory_region_get_ram_ptr(section->mr) +
> > > > >>>>>                section->offset_within_region +
> > > > >>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > > >>>>> index 85a318faca..02780ee37b 100644
> > > > >>>>> --- a/net/vhost-vdpa.c
> > > > >>>>> +++ b/net/vhost-vdpa.c
> > > > >>>>> @@ -570,6 +570,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> > > > >>>>>        s->vhost_vdpa.index = queue_pair_index;
> > > > >>>>>        s->always_svq = svq;
> > > > >>>>>        s->vhost_vdpa.shadow_vqs_enabled = svq;
> > > > >>>>> +    s->vhost_vdpa.listener_shadow_vq = svq;
> > > > >>>> Any chance those above two can differ?
> > > > >>>>
> > > > >>> If CVQ is shadowed but data VQs are not, shadow_vqs_enabled is true
> > > > >>> but listener_shadow_vq is not.
> > > > >>>
> > > > >>> It is more clear in the next commit, where only shadow_vqs_enabled is
> > > > >>> set to true at vhost_vdpa_net_cvq_start.
> > > > >>
> > > > >> Ok, the name looks a little bit confusing. I wonder if it's better to
> > > > >> use shadow_cvq and shadow_data?
> > > > >>
> > > > > I'm ok with renaming it, but struct vhost_vdpa is generic across all
> > > > > kinds of devices, and it does not know whether it is a datapath or
> > > > > not for the moment.
> > > > >
> > > > > Maybe listener_uses_iova_tree?
> > > >
> > > >
> > > > I think "iova_tree" is something that is internal to svq implementation,
> > > > it's better to define the name from the view of vhost_vdpa level.
> > > >
> > >
> > > I don't get this, vhost_vdpa struct already has a pointer to its iova_tree.
> >
> > Yes, this is a suggestion to improve the readability of the code. What I
> > meant is to pick a name that explains why we need to use the iova_tree,
> > rather than just "uses_iova_tree".
> >
>
> I understand.
>
> Knowing that the listener will always be bound to data vqs (whether net,
> blk, ...), I think it is ok to rename it to shadow_data.
>
> But I think there is no way to add shadow_cvq properly from
> hw/virtio/vhost-vdpa.c, since we don't know whether the vhost_vdpa
> belongs to a datapath or not. Would it work just to rename
> listener_shadow_vq to shadow_data?

This should work.

Thanks

>
> Thanks!
>



end of thread, other threads:[~2022-11-16  3:35 UTC | newest]

Thread overview: 46+ messages
2022-11-08 17:07 [PATCH v6 00/10] ASID support in vhost-vdpa net Eugenio Pérez
2022-11-08 17:07 ` [PATCH v6 01/10] vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop Eugenio Pérez
2022-11-10  5:21   ` Jason Wang
2022-11-10 12:54     ` Eugenio Perez Martin
2022-11-11  7:24       ` Jason Wang
2022-11-08 17:07 ` [PATCH v6 02/10] vhost: set SVQ device call handler at SVQ start Eugenio Pérez
2022-11-10  5:22   ` Jason Wang
2022-11-08 17:07 ` [PATCH v6 03/10] vhost: Allocate SVQ device file descriptors at device start Eugenio Pérez
2022-11-10  5:28   ` Jason Wang
2022-11-08 17:07 ` [PATCH v6 04/10] vdpa: add vhost_vdpa_net_valid_svq_features Eugenio Pérez
2022-11-10  5:29   ` Jason Wang
2022-11-08 17:07 ` [PATCH v6 05/10] vdpa: move SVQ vring features check to net/ Eugenio Pérez
2022-11-10  5:40   ` Jason Wang
2022-11-10 13:09     ` Eugenio Perez Martin
2022-11-11  7:34       ` Jason Wang
2022-11-11  7:55         ` Eugenio Perez Martin
2022-11-11  8:07           ` Jason Wang
2022-11-11 12:58             ` Eugenio Perez Martin
2022-11-14  4:26               ` Jason Wang
2022-11-14 10:10                 ` Eugenio Perez Martin
2022-11-08 17:07 ` [PATCH v6 06/10] vdpa: Allocate SVQ unconditionally Eugenio Pérez
2022-11-08 17:07 ` [PATCH v6 07/10] vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap Eugenio Pérez
2022-11-10  5:50   ` Jason Wang
2022-11-10 13:22     ` Eugenio Perez Martin
2022-11-11  7:41       ` Jason Wang
2022-11-11 13:02         ` Eugenio Perez Martin
2022-11-14  4:27           ` Jason Wang
2022-11-08 17:07 ` [PATCH v6 08/10] vdpa: Store x-svq parameter in VhostVDPAState Eugenio Pérez
2022-11-08 17:07 ` [PATCH v6 09/10] vdpa: Add listener_shadow_vq to vhost_vdpa Eugenio Pérez
2022-11-10  6:00   ` Jason Wang
2022-11-10 13:47     ` Eugenio Perez Martin
2022-11-11  7:48       ` Jason Wang
2022-11-11 13:12         ` Eugenio Perez Martin
2022-11-14  4:30           ` Jason Wang
2022-11-14 16:30             ` Eugenio Perez Martin
2022-11-15  3:04               ` Jason Wang
2022-11-15 11:24                 ` Eugenio Perez Martin
2022-11-16  3:33                   ` Jason Wang
2022-11-08 17:07 ` [PATCH v6 10/10] vdpa: Always start CVQ in SVQ mode Eugenio Pérez
2022-11-10  6:24   ` Jason Wang
2022-11-10 16:07     ` Eugenio Perez Martin
2022-11-11  8:02       ` Jason Wang
2022-11-11 14:38         ` Eugenio Perez Martin
2022-11-14  4:36           ` Jason Wang
2022-11-10 12:25 ` [PATCH v6 00/10] ASID support in vhost-vdpa net Michael S. Tsirkin
2022-11-10 12:56   ` Eugenio Perez Martin
