* [PATCH v3 0/5] Move ASID test to vhost-vdpa net initialization
@ 2023-05-09 15:44 Eugenio Pérez
  2023-05-09 15:44 ` [PATCH v3 1/5] vdpa: Remove status in reset tracing Eugenio Pérez
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Eugenio Pérez @ 2023-05-09 15:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Zhu Lingshan, longpeng2, Stefano Garzarella,
	Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	Jason Wang, si-wei.liu, Shannon Nelson, Lei Yang,
	Michael S. Tsirkin, Laurent Vivier, Cindy Lu

QEMU v8.0 is able to switch dynamically between vhost-vdpa passthrough
and SVQ mode as long as the net device does not have a CVQ.  The net
device state tracked (and migrated) through the CVQ requires special
care.

A prerequisite for adding CVQ to that framework is to determine at
initialization time whether a device with CVQ is migratable.  The
solution is to always shadow only the CVQ, using vq groups and ASIDs
for that.

However, the current QEMU version only checks the ASID at device start
(when the driver sets the DRIVER_OK status bit), not at device
initialization.  A check at initialization time is required; otherwise,
the guest would be able to set and remove migration blockers at will [1].

This series is a prerequisite for migration of vhost-vdpa net devices
with CVQ.  However, it already makes sense on its own, as it reduces the
number of ioctls at migration time, shrinking the error paths there.

[1] https://lore.kernel.org/qemu-devel/2616f0cd-f9e8-d183-ea78-db1be4825d9c@redhat.com/
---
v3:
* Only record cvq_isolated, true if the device has CVQ isolated in both
  !MQ and MQ configurations.
* Drop the cache of the CVQ group; it can be done on top.

v2:
* Take the reset of the device out of vhost_vdpa_cvq_is_isolated
  (reported by Lei Yang).
* Expand patch messages to address Stefano Garzarella's questions.

Eugenio Pérez (5):
  vdpa: Remove status in reset tracing
  vdpa: add vhost_vdpa_reset_status_fd
  vdpa: add vhost_vdpa_set_dev_features_fd
  vdpa: return errno in vhost_vdpa_get_vring_group error
  vdpa: move CVQ isolation check to net_init_vhost_vdpa

 include/hw/virtio/vhost-vdpa.h |   2 +
 hw/virtio/vhost-vdpa.c         |  78 ++++++++++-----
 net/vhost-vdpa.c               | 171 ++++++++++++++++++++++++++-------
 hw/virtio/trace-events         |   2 +-
 4 files changed, 192 insertions(+), 61 deletions(-)

-- 
2.31.1




^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3 1/5] vdpa: Remove status in reset tracing
  2023-05-09 15:44 [PATCH v3 0/5] Move ASID test to vhost-vdpa net initialization Eugenio Pérez
@ 2023-05-09 15:44 ` Eugenio Pérez
  2023-05-09 15:44 ` [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd Eugenio Pérez
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Eugenio Pérez @ 2023-05-09 15:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Zhu Lingshan, longpeng2, Stefano Garzarella,
	Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	Jason Wang, si-wei.liu, Shannon Nelson, Lei Yang,
	Michael S. Tsirkin, Laurent Vivier, Cindy Lu

The traced status is always 0, and it is not useful once the call is
routed through a file descriptor.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 2 +-
 hw/virtio/trace-events | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index bc6bad23d5..bbabea18f3 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -716,7 +716,7 @@ static int vhost_vdpa_reset_device(struct vhost_dev *dev)
     uint8_t status = 0;
 
     ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
-    trace_vhost_vdpa_reset_device(dev, status);
+    trace_vhost_vdpa_reset_device(dev);
     v->suspended = false;
     return ret;
 }
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 8f8d05cf9b..6265231683 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -44,7 +44,7 @@ vhost_vdpa_set_mem_table(void *dev, uint32_t nregions, uint32_t padding) "dev: %
 vhost_vdpa_dump_regions(void *dev, int i, uint64_t guest_phys_addr, uint64_t memory_size, uint64_t userspace_addr, uint64_t flags_padding) "dev: %p %d: guest_phys_addr: 0x%"PRIx64" memory_size: 0x%"PRIx64" userspace_addr: 0x%"PRIx64" flags_padding: 0x%"PRIx64
 vhost_vdpa_set_features(void *dev, uint64_t features) "dev: %p features: 0x%"PRIx64
 vhost_vdpa_get_device_id(void *dev, uint32_t device_id) "dev: %p device_id %"PRIu32
-vhost_vdpa_reset_device(void *dev, uint8_t status) "dev: %p status: 0x%"PRIx8
+vhost_vdpa_reset_device(void *dev) "dev: %p"
 vhost_vdpa_get_vq_index(void *dev, int idx, int vq_idx) "dev: %p idx: %d vq idx: %d"
 vhost_vdpa_set_vring_ready(void *dev) "dev: %p"
 vhost_vdpa_dump_config(void *dev, const char *line) "dev: %p %s"
-- 
2.31.1




* [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd
  2023-05-09 15:44 [PATCH v3 0/5] Move ASID test to vhost-vdpa net initialization Eugenio Pérez
  2023-05-09 15:44 ` [PATCH v3 1/5] vdpa: Remove status in reset tracing Eugenio Pérez
@ 2023-05-09 15:44 ` Eugenio Pérez
  2023-05-17  3:14   ` Jason Wang
  2023-05-09 15:44 ` [PATCH v3 3/5] vdpa: add vhost_vdpa_set_dev_features_fd Eugenio Pérez
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Eugenio Pérez @ 2023-05-09 15:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Zhu Lingshan, longpeng2, Stefano Garzarella,
	Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	Jason Wang, si-wei.liu, Shannon Nelson, Lei Yang,
	Michael S. Tsirkin, Laurent Vivier, Cindy Lu

This allows external subsystems like vhost-net to reset a vhost-vdpa
device, since vhost-net does not have any struct vhost_dev by the time
we need to use it.

It is used in subsequent patches to negotiate features
and probe for CVQ ASID isolation.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/vhost-vdpa.h |  1 +
 hw/virtio/vhost-vdpa.c         | 58 +++++++++++++++++++++++-----------
 2 files changed, 41 insertions(+), 18 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index c278a2a8de..28de7da91e 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -54,6 +54,7 @@ typedef struct vhost_vdpa {
     VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
 } VhostVDPA;
 
+void vhost_vdpa_reset_status_fd(int fd);
 int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range);
 
 int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index bbabea18f3..7a2053b8d9 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -335,38 +335,45 @@ static const MemoryListener vhost_vdpa_memory_listener = {
     .region_del = vhost_vdpa_listener_region_del,
 };
 
-static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
-                             void *arg)
+static int vhost_vdpa_dev_fd(const struct vhost_dev *dev)
 {
     struct vhost_vdpa *v = dev->opaque;
-    int fd = v->device_fd;
-    int ret;
 
     assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
+    return v->device_fd;
+}
+
+static int vhost_vdpa_call_fd(int fd, unsigned long int request, void *arg)
+{
+    int ret = ioctl(fd, request, arg);
 
-    ret = ioctl(fd, request, arg);
     return ret < 0 ? -errno : ret;
 }
 
-static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
+static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
+                           void *arg)
+{
+    return vhost_vdpa_call_fd(vhost_vdpa_dev_fd(dev), request, arg);
+}
+
+static int vhost_vdpa_add_status_fd(int fd, uint8_t status)
 {
     uint8_t s;
     int ret;
 
-    trace_vhost_vdpa_add_status(dev, status);
-    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
+    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
     if (ret < 0) {
         return ret;
     }
 
     s |= status;
 
-    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
+    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &s);
     if (ret < 0) {
         return ret;
     }
 
-    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
+    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
     if (ret < 0) {
         return ret;
     }
@@ -378,6 +385,12 @@ static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
     return 0;
 }
 
+static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
+{
+    trace_vhost_vdpa_add_status(dev, status);
+    return vhost_vdpa_add_status_fd(vhost_vdpa_dev_fd(dev), status);
+}
+
 int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range)
 {
     int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
@@ -709,16 +722,20 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
     return ret;
 }
 
+static int vhost_vdpa_reset_device_fd(int fd)
+{
+    uint8_t status = 0;
+
+    return vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &status);
+}
+
 static int vhost_vdpa_reset_device(struct vhost_dev *dev)
 {
     struct vhost_vdpa *v = dev->opaque;
-    int ret;
-    uint8_t status = 0;
 
-    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
-    trace_vhost_vdpa_reset_device(dev);
     v->suspended = false;
-    return ret;
+    trace_vhost_vdpa_reset_device(dev);
+    return vhost_vdpa_reset_device_fd(vhost_vdpa_dev_fd(dev));
 }
 
 static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
@@ -1170,6 +1187,13 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
     return 0;
 }
 
+void vhost_vdpa_reset_status_fd(int fd)
+{
+    vhost_vdpa_reset_device_fd(fd);
+    vhost_vdpa_add_status_fd(fd, VIRTIO_CONFIG_S_ACKNOWLEDGE |
+                                 VIRTIO_CONFIG_S_DRIVER);
+}
+
 static void vhost_vdpa_reset_status(struct vhost_dev *dev)
 {
     struct vhost_vdpa *v = dev->opaque;
@@ -1178,9 +1202,7 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
         return;
     }
 
-    vhost_vdpa_reset_device(dev);
-    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
-                               VIRTIO_CONFIG_S_DRIVER);
+    vhost_vdpa_reset_status_fd(vhost_vdpa_dev_fd(dev));
     memory_listener_unregister(&v->listener);
 }
 
-- 
2.31.1




* [PATCH v3 3/5] vdpa: add vhost_vdpa_set_dev_features_fd
  2023-05-09 15:44 [PATCH v3 0/5] Move ASID test to vhost-vdpa net initialization Eugenio Pérez
  2023-05-09 15:44 ` [PATCH v3 1/5] vdpa: Remove status in reset tracing Eugenio Pérez
  2023-05-09 15:44 ` [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd Eugenio Pérez
@ 2023-05-09 15:44 ` Eugenio Pérez
  2023-05-09 15:44 ` [PATCH v3 4/5] vdpa: return errno in vhost_vdpa_get_vring_group error Eugenio Pérez
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Eugenio Pérez @ 2023-05-09 15:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Zhu Lingshan, longpeng2, Stefano Garzarella,
	Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	Jason Wang, si-wei.liu, Shannon Nelson, Lei Yang,
	Michael S. Tsirkin, Laurent Vivier, Cindy Lu

This allows external subsystems like vhost-net to set the features of a
vhost-vdpa device.  It is used in subsequent patches to negotiate
features and probe for CVQ ASID isolation.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/vhost-vdpa.h |  1 +
 hw/virtio/vhost-vdpa.c         | 20 +++++++++++++-------
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 28de7da91e..a9cb6f3a32 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -55,6 +55,7 @@ typedef struct vhost_vdpa {
 } VhostVDPA;
 
 void vhost_vdpa_reset_status_fd(int fd);
+int vhost_vdpa_set_dev_features_fd(int fd, uint64_t features);
 int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range);
 
 int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7a2053b8d9..acd5be46a9 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -651,11 +651,22 @@ static int vhost_vdpa_set_mem_table(struct vhost_dev *dev,
     return 0;
 }
 
+int vhost_vdpa_set_dev_features_fd(int fd, uint64_t features)
+{
+    int ret;
+
+    ret = vhost_vdpa_call_fd(fd, VHOST_SET_FEATURES, &features);
+    if (ret) {
+        return ret;
+    }
+
+    return vhost_vdpa_add_status_fd(fd, VIRTIO_CONFIG_S_FEATURES_OK);
+}
+
 static int vhost_vdpa_set_features(struct vhost_dev *dev,
                                    uint64_t features)
 {
     struct vhost_vdpa *v = dev->opaque;
-    int ret;
 
     if (!vhost_vdpa_first_dev(dev)) {
         return 0;
@@ -678,12 +689,7 @@ static int vhost_vdpa_set_features(struct vhost_dev *dev,
     }
 
     trace_vhost_vdpa_set_features(dev, features);
-    ret = vhost_vdpa_call(dev, VHOST_SET_FEATURES, &features);
-    if (ret) {
-        return ret;
-    }
-
-    return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK);
+    return vhost_vdpa_set_dev_features_fd(vhost_vdpa_dev_fd(dev), features);
 }
 
 static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
-- 
2.31.1




* [PATCH v3 4/5] vdpa: return errno in vhost_vdpa_get_vring_group error
  2023-05-09 15:44 [PATCH v3 0/5] Move ASID test to vhost-vdpa net initialization Eugenio Pérez
                   ` (2 preceding siblings ...)
  2023-05-09 15:44 ` [PATCH v3 3/5] vdpa: add vhost_vdpa_set_dev_features_fd Eugenio Pérez
@ 2023-05-09 15:44 ` Eugenio Pérez
  2023-05-09 15:44 ` [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa Eugenio Pérez
  2023-05-17  6:18 ` [PATCH v3 0/5] Move ASID test to vhost-vdpa net initialization Lei Yang
  5 siblings, 0 replies; 19+ messages in thread
From: Eugenio Pérez @ 2023-05-09 15:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Zhu Lingshan, longpeng2, Stefano Garzarella,
	Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	Jason Wang, si-wei.liu, Shannon Nelson, Lei Yang,
	Michael S. Tsirkin, Laurent Vivier, Cindy Lu

We need to tell the errors apart in the caller, as some errors are
expected in a normal workflow.  In particular, parent drivers in recent
kernels with VHOST_BACKEND_F_IOTLB_ASID may not support vring groups.
In that case, -ENOTSUP is returned.

This is the case for vp_vdpa in Linux 6.2.

The next patches in this series use that information to decide whether
to abort.  They also properly return an errp instead of printing with
error_report.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 37cdc84562..3fb833fe76 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -362,6 +362,14 @@ static NetClientInfo net_vhost_vdpa_info = {
         .check_peer_type = vhost_vdpa_check_peer_type,
 };
 
+/**
+ * Get vring virtqueue group
+ *
+ * @device_fd  vdpa device fd
+ * @vq_index   Virtqueue index
+ *
+ * Return -errno in case of error, or vq group if success.
+ */
 static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
 {
     struct vhost_vring_state state = {
@@ -370,6 +378,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
     int r = ioctl(device_fd, VHOST_VDPA_GET_VRING_GROUP, &state);
 
     if (unlikely(r < 0)) {
+        r = -errno;
         error_report("Cannot get VQ %u group: %s", vq_index,
                      g_strerror(errno));
         return r;
-- 
2.31.1




* [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa
  2023-05-09 15:44 [PATCH v3 0/5] Move ASID test to vhost-vdpa net initialization Eugenio Pérez
                   ` (3 preceding siblings ...)
  2023-05-09 15:44 ` [PATCH v3 4/5] vdpa: return errno in vhost_vdpa_get_vring_group error Eugenio Pérez
@ 2023-05-09 15:44 ` Eugenio Pérez
  2023-05-17  3:59   ` Jason Wang
  2023-05-17  6:18 ` [PATCH v3 0/5] Move ASID test to vhost-vdpa net initialization Lei Yang
  5 siblings, 1 reply; 19+ messages in thread
From: Eugenio Pérez @ 2023-05-09 15:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Zhu Lingshan, longpeng2, Stefano Garzarella,
	Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	Jason Wang, si-wei.liu, Shannon Nelson, Lei Yang,
	Michael S. Tsirkin, Laurent Vivier, Cindy Lu

Evaluating CVQ isolation at device start time instead of initialization
time would let the guest dynamically add or remove migration blockers.

Moving the check to initialization also reduces the number of ioctls
during migration, reducing failure possibilities.

As a drawback, we need to check for CVQ isolation twice: once with MQ
not negotiated and once acking it, as long as the device supports it.
This is because vring ASID / group management is based on vq indexes,
but we don't know the index of the CVQ before negotiating MQ.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
v2: Take the reset of the device out of vhost_vdpa_cvq_is_isolated
v3: Only record cvq_isolated, true if the device has CVQ isolated in
    both !MQ and MQ configurations.
---
 net/vhost-vdpa.c | 178 +++++++++++++++++++++++++++++++++++------------
 1 file changed, 135 insertions(+), 43 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 3fb833fe76..29054b77a9 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -43,6 +43,10 @@ typedef struct VhostVDPAState {
 
     /* The device always have SVQ enabled */
     bool always_svq;
+
+    /* The device can isolate CVQ in its own ASID */
+    bool cvq_isolated;
+
     bool started;
 } VhostVDPAState;
 
@@ -362,15 +366,8 @@ static NetClientInfo net_vhost_vdpa_info = {
         .check_peer_type = vhost_vdpa_check_peer_type,
 };
 
-/**
- * Get vring virtqueue group
- *
- * @device_fd  vdpa device fd
- * @vq_index   Virtqueue index
- *
- * Return -errno in case of error, or vq group if success.
- */
-static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
+static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
+                                          Error **errp)
 {
     struct vhost_vring_state state = {
         .index = vq_index,
@@ -379,8 +376,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
 
     if (unlikely(r < 0)) {
         r = -errno;
-        error_report("Cannot get VQ %u group: %s", vq_index,
-                     g_strerror(errno));
+        error_setg_errno(errp, errno, "Cannot get VQ %u group", vq_index);
         return r;
     }
 
@@ -480,9 +476,9 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
 {
     VhostVDPAState *s, *s0;
     struct vhost_vdpa *v;
-    uint64_t backend_features;
     int64_t cvq_group;
-    int cvq_index, r;
+    int r;
+    Error *err = NULL;
 
     assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
 
@@ -502,41 +498,22 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
     /*
      * If we early return in these cases SVQ will not be enabled. The migration
      * will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
-     *
-     * Calling VHOST_GET_BACKEND_FEATURES as they are not available in v->dev
-     * yet.
      */
-    r = ioctl(v->device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
-    if (unlikely(r < 0)) {
-        error_report("Cannot get vdpa backend_features: %s(%d)",
-            g_strerror(errno), errno);
-        return -1;
+    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
+        return 0;
     }
-    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID)) ||
-        !vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
+
+    if (!s->cvq_isolated) {
         return 0;
     }
 
-    /*
-     * Check if all the virtqueues of the virtio device are in a different vq
-     * than the last vq. VQ group of last group passed in cvq_group.
-     */
-    cvq_index = v->dev->vq_index_end - 1;
-    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
+    cvq_group = vhost_vdpa_get_vring_group(v->device_fd,
+                                           v->dev->vq_index_end - 1,
+                                           &err);
     if (unlikely(cvq_group < 0)) {
+        error_report_err(err);
         return cvq_group;
     }
-    for (int i = 0; i < cvq_index; ++i) {
-        int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
-
-        if (unlikely(group < 0)) {
-            return group;
-        }
-
-        if (group == cvq_group) {
-            return 0;
-        }
-    }
 
     r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
     if (unlikely(r < 0)) {
@@ -799,6 +776,111 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
     .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
 };
 
+/**
+ * Probe the device to check whether the control virtqueue is isolated.
+ *
+ * @device_fd vhost-vdpa file descriptor
+ * @features features to negotiate
+ * @cvq_index Control vq index
+ *
+ * Returns <0 in case of error, 0 if false and 1 if true
+ */
+static int vhost_vdpa_cvq_is_isolated(int device_fd, uint64_t features,
+                                      unsigned cvq_index, Error **errp)
+{
+    int64_t cvq_group;
+    int r;
+
+    r = vhost_vdpa_set_dev_features_fd(device_fd, features);
+    if (unlikely(r < 0)) {
+        error_setg_errno(errp, -r, "Cannot set device features");
+        return r;
+    }
+
+    cvq_group = vhost_vdpa_get_vring_group(device_fd, cvq_index, errp);
+    if (unlikely(cvq_group < 0)) {
+        return cvq_group;
+    }
+
+    for (int i = 0; i < cvq_index; ++i) {
+        int64_t group = vhost_vdpa_get_vring_group(device_fd, i, errp);
+
+        if (unlikely(group < 0)) {
+            return group;
+        }
+
+        if (group == (int64_t)cvq_group) {
+            return 0;
+        }
+    }
+
+    return 1;
+}
+
+/**
+ * Probe if CVQ is isolated when the device is MQ and when it is not MQ
+ *
+ * @device_fd         The vdpa device fd
+ * @features          Features offered by the device.
+ * @cvq_index         The control vq index if mq is negotiated. Ignored
+ *                    otherwise.
+ *
+ * Returns <0 in case of failure, 0 if false and 1 if true.
+ */
+static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
+                                          int cvq_index, Error **errp)
+{
+    uint64_t backend_features;
+    int r;
+
+    ERRP_GUARD();
+
+    r = ioctl(device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
+    if (unlikely(r < 0)) {
+        error_setg_errno(errp, errno, "Cannot get vdpa backend_features");
+        return r;
+    }
+
+    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
+        return 0;
+    }
+
+    r = vhost_vdpa_cvq_is_isolated(device_fd,
+                                   features & ~BIT_ULL(VIRTIO_NET_F_MQ), 2,
+                                   errp);
+    if (unlikely(r < 0)) {
+        if (r != -ENOTSUP) {
+            return r;
+        }
+
+        /*
+         * The kernel reports VHOST_BACKEND_F_IOTLB_ASID if the vdpa frontend
+         * supports ASID even if the parent driver does not.  The CVQ cannot be
+         * isolated in this case.
+         */
+        error_free(*errp);
+        *errp = NULL;
+        return 0;
+    }
+
+    if (r == 0) {
+        return 0;
+    }
+
+    vhost_vdpa_reset_status_fd(device_fd);
+    if (!(features & BIT_ULL(VIRTIO_NET_F_MQ))) {
+        return 0;
+    }
+
+    r = vhost_vdpa_cvq_is_isolated(device_fd, features, cvq_index * 2, errp);
+    if (unlikely(r < 0)) {
+        return r;
+    }
+
+    vhost_vdpa_reset_status_fd(device_fd);
+    return r;
+}
+
 static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
                                        const char *device,
                                        const char *name,
@@ -808,16 +890,25 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
                                        bool is_datapath,
                                        bool svq,
                                        struct vhost_vdpa_iova_range iova_range,
-                                       uint64_t features)
+                                       uint64_t features,
+                                       Error **errp)
 {
     NetClientState *nc = NULL;
     VhostVDPAState *s;
     int ret = 0;
     assert(name);
+    int cvq_isolated;
+
     if (is_datapath) {
         nc = qemu_new_net_client(&net_vhost_vdpa_info, peer, device,
                                  name);
     } else {
+        cvq_isolated = vhost_vdpa_probe_cvq_isolation(vdpa_device_fd, features,
+                                                      queue_pair_index, errp);
+        if (unlikely(cvq_isolated < 0)) {
+            return NULL;
+        }
+
         nc = qemu_new_net_control_client(&net_vhost_vdpa_cvq_info, peer,
                                          device, name);
     }
@@ -844,6 +935,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
 
         s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
         s->vhost_vdpa.shadow_vq_ops_opaque = s;
+        s->cvq_isolated = cvq_isolated;
 
         /*
          * TODO: We cannot migrate devices with CVQ as there is no way to set
@@ -972,7 +1064,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     for (i = 0; i < queue_pairs; i++) {
         ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
                                      vdpa_device_fd, i, 2, true, opts->x_svq,
-                                     iova_range, features);
+                                     iova_range, features, errp);
         if (!ncs[i])
             goto err;
     }
@@ -980,7 +1072,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     if (has_cvq) {
         nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
                                  vdpa_device_fd, i, 1, false,
-                                 opts->x_svq, iova_range, features);
+                                 opts->x_svq, iova_range, features, errp);
         if (!nc)
             goto err;
     }
-- 
2.31.1




* Re: [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd
  2023-05-09 15:44 ` [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd Eugenio Pérez
@ 2023-05-17  3:14   ` Jason Wang
  2023-05-17  5:46     ` Eugenio Perez Martin
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2023-05-17  3:14 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	si-wei.liu, Shannon Nelson, Lei Yang, Michael S. Tsirkin,
	Laurent Vivier, Cindy Lu

On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> This allows to reset a vhost-vdpa device from external subsystems like
> vhost-net, since it does not have any struct vhost_dev by the time we
> need to use it.
>
> It is used in subsequent patches to negotiate features
> and probe for CVQ ASID isolation.
>
> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  include/hw/virtio/vhost-vdpa.h |  1 +
>  hw/virtio/vhost-vdpa.c         | 58 +++++++++++++++++++++++-----------
>  2 files changed, 41 insertions(+), 18 deletions(-)
>
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> index c278a2a8de..28de7da91e 100644
> --- a/include/hw/virtio/vhost-vdpa.h
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -54,6 +54,7 @@ typedef struct vhost_vdpa {
>      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
>  } VhostVDPA;
>
> +void vhost_vdpa_reset_status_fd(int fd);
>  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range);
>
>  int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index bbabea18f3..7a2053b8d9 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -335,38 +335,45 @@ static const MemoryListener vhost_vdpa_memory_listener = {
>      .region_del = vhost_vdpa_listener_region_del,
>  };
>
> -static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> -                             void *arg)
> +static int vhost_vdpa_dev_fd(const struct vhost_dev *dev)
>  {
>      struct vhost_vdpa *v = dev->opaque;
> -    int fd = v->device_fd;
> -    int ret;
>
>      assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> +    return v->device_fd;
> +}

Nit: unless the vhost_dev structure is opaque to the upper layer, I
don't see any advantage in having a dedicated indirect helper to get
device_fd.

> +
> +static int vhost_vdpa_call_fd(int fd, unsigned long int request, void *arg)
> +{
> +    int ret = ioctl(fd, request, arg);
>
> -    ret = ioctl(fd, request, arg);
>      return ret < 0 ? -errno : ret;
>  }
>
> -static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> +static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> +                           void *arg)
> +{
> +    return vhost_vdpa_call_fd(vhost_vdpa_dev_fd(dev), request, arg);
> +}
> +
> +static int vhost_vdpa_add_status_fd(int fd, uint8_t status)
>  {
>      uint8_t s;
>      int ret;
>
> -    trace_vhost_vdpa_add_status(dev, status);
> -    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
>      if (ret < 0) {
>          return ret;
>      }
>
>      s |= status;
>
> -    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
> +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &s);
>      if (ret < 0) {
>          return ret;
>      }
>
> -    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
>      if (ret < 0) {
>          return ret;
>      }
> @@ -378,6 +385,12 @@ static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
>      return 0;
>  }
>
> +static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> +{
> +    trace_vhost_vdpa_add_status(dev, status);
> +    return vhost_vdpa_add_status_fd(vhost_vdpa_dev_fd(dev), status);
> +}
> +
>  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range)
>  {
>      int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
> @@ -709,16 +722,20 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
>      return ret;
>  }
>
> +static int vhost_vdpa_reset_device_fd(int fd)
> +{
> +    uint8_t status = 0;
> +
> +    return vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &status);
> +}
> +
>  static int vhost_vdpa_reset_device(struct vhost_dev *dev)
>  {
>      struct vhost_vdpa *v = dev->opaque;
> -    int ret;
> -    uint8_t status = 0;
>
> -    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> -    trace_vhost_vdpa_reset_device(dev);
>      v->suspended = false;
> -    return ret;
> +    trace_vhost_vdpa_reset_device(dev);
> +    return vhost_vdpa_reset_device_fd(vhost_vdpa_dev_fd(dev));
>  }
>
>  static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
> @@ -1170,6 +1187,13 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>      return 0;
>  }
>
> +void vhost_vdpa_reset_status_fd(int fd)
> +{
> +    vhost_vdpa_reset_device_fd(fd);
> +    vhost_vdpa_add_status_fd(fd, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> +                                 VIRTIO_CONFIG_S_DRIVER);

I would like to rename this function since it does more than just reset.

Thanks

> +}
> +
>  static void vhost_vdpa_reset_status(struct vhost_dev *dev)
>  {
>      struct vhost_vdpa *v = dev->opaque;
> @@ -1178,9 +1202,7 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
>          return;
>      }
>
> -    vhost_vdpa_reset_device(dev);
> -    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> -                               VIRTIO_CONFIG_S_DRIVER);
> +    vhost_vdpa_reset_status_fd(vhost_vdpa_dev_fd(dev));
>      memory_listener_unregister(&v->listener);
>  }
>
> --
> 2.31.1
>



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa
  2023-05-09 15:44 ` [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa Eugenio Pérez
@ 2023-05-17  3:59   ` Jason Wang
  2023-05-17  6:29     ` Eugenio Perez Martin
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2023-05-17  3:59 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	si-wei.liu, Shannon Nelson, Lei Yang, Michael S. Tsirkin,
	Laurent Vivier, Cindy Lu

On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> Evaluating it at start time instead of initialization time may make the
> guest capable of dynamically adding or removing migration blockers.
>
> Also, moving to initialization reduces the number of ioctls in the
> migration, reducing failure possibilities.
>
> As a drawback we need to check for CVQ isolation twice: one time with no
> MQ negotiated and another one acking it, as long as the device supports
> it.  This is because Vring ASID / group management is based on vq
> indexes, but we don't know the index of CVQ before negotiating MQ.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> v2: Take out the reset of the device from vhost_vdpa_cvq_is_isolated
> v3: Only record cvq_isolated, true if the device have cvq isolated in
>     both !MQ and MQ configurations.
> ---
>  net/vhost-vdpa.c | 178 +++++++++++++++++++++++++++++++++++------------
>  1 file changed, 135 insertions(+), 43 deletions(-)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 3fb833fe76..29054b77a9 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -43,6 +43,10 @@ typedef struct VhostVDPAState {
>
>      /* The device always have SVQ enabled */
>      bool always_svq;
> +
> +    /* The device can isolate CVQ in its own ASID */
> +    bool cvq_isolated;
> +
>      bool started;
>  } VhostVDPAState;
>
> @@ -362,15 +366,8 @@ static NetClientInfo net_vhost_vdpa_info = {
>          .check_peer_type = vhost_vdpa_check_peer_type,
>  };
>
> -/**
> - * Get vring virtqueue group
> - *
> - * @device_fd  vdpa device fd
> - * @vq_index   Virtqueue index
> - *
> - * Return -errno in case of error, or vq group if success.
> - */
> -static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> +static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
> +                                          Error **errp)
>  {
>      struct vhost_vring_state state = {
>          .index = vq_index,
> @@ -379,8 +376,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
>
>      if (unlikely(r < 0)) {
>          r = -errno;
> -        error_report("Cannot get VQ %u group: %s", vq_index,
> -                     g_strerror(errno));
> +        error_setg_errno(errp, errno, "Cannot get VQ %u group", vq_index);
>          return r;
>      }
>
> @@ -480,9 +476,9 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
>  {
>      VhostVDPAState *s, *s0;
>      struct vhost_vdpa *v;
> -    uint64_t backend_features;
>      int64_t cvq_group;
> -    int cvq_index, r;
> +    int r;
> +    Error *err = NULL;
>
>      assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>
> @@ -502,41 +498,22 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
>      /*
>       * If we early return in these cases SVQ will not be enabled. The migration
>       * will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
> -     *
> -     * Calling VHOST_GET_BACKEND_FEATURES as they are not available in v->dev
> -     * yet.
>       */
> -    r = ioctl(v->device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> -    if (unlikely(r < 0)) {
> -        error_report("Cannot get vdpa backend_features: %s(%d)",
> -            g_strerror(errno), errno);
> -        return -1;
> +    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> +        return 0;
>      }
> -    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID)) ||
> -        !vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> +
> +    if (!s->cvq_isolated) {
>          return 0;
>      }
>
> -    /*
> -     * Check if all the virtqueues of the virtio device are in a different vq
> -     * than the last vq. VQ group of last group passed in cvq_group.
> -     */
> -    cvq_index = v->dev->vq_index_end - 1;
> -    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
> +    cvq_group = vhost_vdpa_get_vring_group(v->device_fd,
> +                                           v->dev->vq_index_end - 1,
> +                                           &err);
>      if (unlikely(cvq_group < 0)) {
> +        error_report_err(err);
>          return cvq_group;
>      }
> -    for (int i = 0; i < cvq_index; ++i) {
> -        int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
> -
> -        if (unlikely(group < 0)) {
> -            return group;
> -        }
> -
> -        if (group == cvq_group) {
> -            return 0;
> -        }
> -    }
>
>      r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
>      if (unlikely(r < 0)) {
> @@ -799,6 +776,111 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
>      .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
>  };
>
> +/**
> + * Probe the device to check control virtqueue is isolated.
> + *
> + * @device_fd vhost-vdpa file descriptor
> + * @features features to negotiate
> + * @cvq_index Control vq index
> + *
> + * Returns -1 in case of error, 0 if false and 1 if true
> + */
> +static int vhost_vdpa_cvq_is_isolated(int device_fd, uint64_t features,
> +                                      unsigned cvq_index, Error **errp)
> +{
> +    int64_t cvq_group;
> +    int r;
> +
> +    r = vhost_vdpa_set_dev_features_fd(device_fd, features);
> +    if (unlikely(r < 0)) {
> +        error_setg_errno(errp, -r, "Cannot set device features");
> +        return r;
> +    }
> +
> +    cvq_group = vhost_vdpa_get_vring_group(device_fd, cvq_index, errp);
> +    if (unlikely(cvq_group < 0)) {
> +        return cvq_group;
> +    }
> +
> +    for (int i = 0; i < cvq_index; ++i) {
> +        int64_t group = vhost_vdpa_get_vring_group(device_fd, i, errp);
> +
> +        if (unlikely(group < 0)) {
> +            return group;
> +        }
> +
> +        if (group == (int64_t)cvq_group) {
> +            return 0;
> +        }
> +    }
> +
> +    return 1;
> +}
> +
> +/**
> + * Probe if CVQ is isolated when the device is MQ and when it is not MQ
> + *
> + * @device_fd         The vdpa device fd
> + * @features          Features offered by the device.
> + * @cvq_index         The control vq index if mq is negotiated. Ignored
> + *                    otherwise.
> + *
> + * Returns <0 in case of failure, 0 if false and 1 if true.
> + */
> +static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
> +                                          int cvq_index, Error **errp)
> +{
> +    uint64_t backend_features;
> +    int r;
> +
> +    ERRP_GUARD();
> +
> +    r = ioctl(device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> +    if (unlikely(r < 0)) {
> +        error_setg_errno(errp, errno, "Cannot get vdpa backend_features");
> +        return r;
> +    }
> +
> +    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
> +        return 0;
> +    }
> +
> +    r = vhost_vdpa_cvq_is_isolated(device_fd,
> +                                   features & ~BIT_ULL(VIRTIO_NET_F_MQ), 2,
> +                                   errp);
> +    if (unlikely(r < 0)) {
> +        if (r != -ENOTSUP) {
> +            return r;
> +        }
> +
> +        /*
> +         * The kernel report VHOST_BACKEND_F_IOTLB_ASID if the vdpa frontend
> +         * support ASID even if the parent driver does not.  The CVQ cannot be
> +         * isolated in this case.
> +         */
> +        error_free(*errp);
> +        *errp = NULL;
> +        return 0;
> +    }
> +
> +    if (r == 0) {
> +        return 0;
> +    }
> +
> +    vhost_vdpa_reset_status_fd(device_fd);
> +    if (!(features & BIT_ULL(VIRTIO_NET_F_MQ))) {
> +        return 0;
> +    }
> +
> +    r = vhost_vdpa_cvq_is_isolated(device_fd, features, cvq_index * 2, errp);

I think checking this once should be sufficient. That is to say, it
should be a bug if there's hardware that puts cvq in a dedicated group
in MQ but not in SQ.

Thanks

> +    if (unlikely(r < 0)) {
> +        return r;
> +    }
> +
> +    vhost_vdpa_reset_status_fd(device_fd);
> +    return r;
> +}
> +
>  static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>                                         const char *device,
>                                         const char *name,
> @@ -808,16 +890,25 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>                                         bool is_datapath,
>                                         bool svq,
>                                         struct vhost_vdpa_iova_range iova_range,
> -                                       uint64_t features)
> +                                       uint64_t features,
> +                                       Error **errp)
>  {
>      NetClientState *nc = NULL;
>      VhostVDPAState *s;
>      int ret = 0;
>      assert(name);
> +    int cvq_isolated;
> +
>      if (is_datapath) {
>          nc = qemu_new_net_client(&net_vhost_vdpa_info, peer, device,
>                                   name);
>      } else {
> +        cvq_isolated = vhost_vdpa_probe_cvq_isolation(vdpa_device_fd, features,
> +                                                      queue_pair_index, errp);
> +        if (unlikely(cvq_isolated < 0)) {
> +            return NULL;
> +        }
> +
>          nc = qemu_new_net_control_client(&net_vhost_vdpa_cvq_info, peer,
>                                           device, name);
>      }
> @@ -844,6 +935,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>
>          s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
>          s->vhost_vdpa.shadow_vq_ops_opaque = s;
> +        s->cvq_isolated = cvq_isolated;
>
>          /*
>           * TODO: We cannot migrate devices with CVQ as there is no way to set
> @@ -972,7 +1064,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>      for (i = 0; i < queue_pairs; i++) {
>          ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>                                       vdpa_device_fd, i, 2, true, opts->x_svq,
> -                                     iova_range, features);
> +                                     iova_range, features, errp);
>          if (!ncs[i])
>              goto err;
>      }
> @@ -980,7 +1072,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>      if (has_cvq) {
>          nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>                                   vdpa_device_fd, i, 1, false,
> -                                 opts->x_svq, iova_range, features);
> +                                 opts->x_svq, iova_range, features, errp);
>          if (!nc)
>              goto err;
>      }
> --
> 2.31.1
>




* Re: [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd
  2023-05-17  3:14   ` Jason Wang
@ 2023-05-17  5:46     ` Eugenio Perez Martin
  2023-05-17  5:49       ` Jason Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Eugenio Perez Martin @ 2023-05-17  5:46 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	si-wei.liu, Shannon Nelson, Lei Yang, Michael S. Tsirkin,
	Laurent Vivier, Cindy Lu

On Wed, May 17, 2023 at 5:14 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > This allows to reset a vhost-vdpa device from external subsystems like
> > vhost-net, since it does not have any struct vhost_dev by the time we
> > need to use it.
> >
> > It is used in subsequent patches to negotiate features
> > and probe for CVQ ASID isolation.
> >
> > Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >  include/hw/virtio/vhost-vdpa.h |  1 +
> >  hw/virtio/vhost-vdpa.c         | 58 +++++++++++++++++++++++-----------
> >  2 files changed, 41 insertions(+), 18 deletions(-)
> >
> > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > index c278a2a8de..28de7da91e 100644
> > --- a/include/hw/virtio/vhost-vdpa.h
> > +++ b/include/hw/virtio/vhost-vdpa.h
> > @@ -54,6 +54,7 @@ typedef struct vhost_vdpa {
> >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> >  } VhostVDPA;
> >
> > +void vhost_vdpa_reset_status_fd(int fd);
> >  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range);
> >
> >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index bbabea18f3..7a2053b8d9 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -335,38 +335,45 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> >      .region_del = vhost_vdpa_listener_region_del,
> >  };
> >
> > -static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > -                             void *arg)
> > +static int vhost_vdpa_dev_fd(const struct vhost_dev *dev)
> >  {
> >      struct vhost_vdpa *v = dev->opaque;
> > -    int fd = v->device_fd;
> > -    int ret;
> >
> >      assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> > +    return v->device_fd;
> > +}
>
> Nit: unless the vhost_dev structure is opaque to the upper layer, I
> don't see any advantage for having a dedicated indirect helper to get
> device_fd.
>

The purpose was to not duplicate the assert, but sure it's not mandatory.

> > +
> > +static int vhost_vdpa_call_fd(int fd, unsigned long int request, void *arg)
> > +{
> > +    int ret = ioctl(fd, request, arg);
> >
> > -    ret = ioctl(fd, request, arg);
> >      return ret < 0 ? -errno : ret;
> >  }
> >
> > -static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > +static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > +                           void *arg)
> > +{
> > +    return vhost_vdpa_call_fd(vhost_vdpa_dev_fd(dev), request, arg);
> > +}
> > +
> > +static int vhost_vdpa_add_status_fd(int fd, uint8_t status)
> >  {
> >      uint8_t s;
> >      int ret;
> >
> > -    trace_vhost_vdpa_add_status(dev, status);
> > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> > +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
> >      if (ret < 0) {
> >          return ret;
> >      }
> >
> >      s |= status;
> >
> > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
> > +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &s);
> >      if (ret < 0) {
> >          return ret;
> >      }
> >
> > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> > +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
> >      if (ret < 0) {
> >          return ret;
> >      }
> > @@ -378,6 +385,12 @@ static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> >      return 0;
> >  }
> >
> > +static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > +{
> > +    trace_vhost_vdpa_add_status(dev, status);
> > +    return vhost_vdpa_add_status_fd(vhost_vdpa_dev_fd(dev), status);
> > +}
> > +
> >  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range)
> >  {
> >      int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
> > @@ -709,16 +722,20 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> >      return ret;
> >  }
> >
> > +static int vhost_vdpa_reset_device_fd(int fd)
> > +{
> > +    uint8_t status = 0;
> > +
> > +    return vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &status);
> > +}
> > +
> >  static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> >  {
> >      struct vhost_vdpa *v = dev->opaque;
> > -    int ret;
> > -    uint8_t status = 0;
> >
> > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> > -    trace_vhost_vdpa_reset_device(dev);
> >      v->suspended = false;
> > -    return ret;
> > +    trace_vhost_vdpa_reset_device(dev);
> > +    return vhost_vdpa_reset_device_fd(vhost_vdpa_dev_fd(dev));
> >  }
> >
> >  static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
> > @@ -1170,6 +1187,13 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> >      return 0;
> >  }
> >
> > +void vhost_vdpa_reset_status_fd(int fd)
> > +{
> > +    vhost_vdpa_reset_device_fd(fd);
> > +    vhost_vdpa_add_status_fd(fd, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > +                                 VIRTIO_CONFIG_S_DRIVER);
>
> I would like to rename this function since it does more than just reset.
>

vhost_vdpa_set_ready?

Thanks!

> Thanks
>
> > +}
> > +
> >  static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> >  {
> >      struct vhost_vdpa *v = dev->opaque;
> > @@ -1178,9 +1202,7 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> >          return;
> >      }
> >
> > -    vhost_vdpa_reset_device(dev);
> > -    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > -                               VIRTIO_CONFIG_S_DRIVER);
> > +    vhost_vdpa_reset_status_fd(vhost_vdpa_dev_fd(dev));
> >      memory_listener_unregister(&v->listener);
> >  }
> >
> > --
> > 2.31.1
> >
>




* Re: [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd
  2023-05-17  5:46     ` Eugenio Perez Martin
@ 2023-05-17  5:49       ` Jason Wang
  2023-05-24 17:36         ` Eugenio Perez Martin
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2023-05-17  5:49 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	si-wei.liu, Shannon Nelson, Lei Yang, Michael S. Tsirkin,
	Laurent Vivier, Cindy Lu

On Wed, May 17, 2023 at 1:46 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Wed, May 17, 2023 at 5:14 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >
> > > This allows to reset a vhost-vdpa device from external subsystems like
> > > vhost-net, since it does not have any struct vhost_dev by the time we
> > > need to use it.
> > >
> > > It is used in subsequent patches to negotiate features
> > > and probe for CVQ ASID isolation.
> > >
> > > Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > ---
> > >  include/hw/virtio/vhost-vdpa.h |  1 +
> > >  hw/virtio/vhost-vdpa.c         | 58 +++++++++++++++++++++++-----------
> > >  2 files changed, 41 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > index c278a2a8de..28de7da91e 100644
> > > --- a/include/hw/virtio/vhost-vdpa.h
> > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > @@ -54,6 +54,7 @@ typedef struct vhost_vdpa {
> > >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > >  } VhostVDPA;
> > >
> > > +void vhost_vdpa_reset_status_fd(int fd);
> > >  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range);
> > >
> > >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > index bbabea18f3..7a2053b8d9 100644
> > > --- a/hw/virtio/vhost-vdpa.c
> > > +++ b/hw/virtio/vhost-vdpa.c
> > > @@ -335,38 +335,45 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> > >      .region_del = vhost_vdpa_listener_region_del,
> > >  };
> > >
> > > -static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > > -                             void *arg)
> > > +static int vhost_vdpa_dev_fd(const struct vhost_dev *dev)
> > >  {
> > >      struct vhost_vdpa *v = dev->opaque;
> > > -    int fd = v->device_fd;
> > > -    int ret;
> > >
> > >      assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> > > +    return v->device_fd;
> > > +}
> >
> > Nit: unless the vhost_dev structure is opaque to the upper layer, I
> > don't see any advantage for having a dedicated indirect helper to get
> > device_fd.
> >
>
> The purpose was to not duplicate the assert, but sure it's not mandatory.

Ok, but I think for new code, we'd better avoid asserts as much as possible.

>
> > > +
> > > +static int vhost_vdpa_call_fd(int fd, unsigned long int request, void *arg)
> > > +{
> > > +    int ret = ioctl(fd, request, arg);
> > >
> > > -    ret = ioctl(fd, request, arg);
> > >      return ret < 0 ? -errno : ret;
> > >  }
> > >
> > > -static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > > +static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > > +                           void *arg)
> > > +{
> > > +    return vhost_vdpa_call_fd(vhost_vdpa_dev_fd(dev), request, arg);
> > > +}
> > > +
> > > +static int vhost_vdpa_add_status_fd(int fd, uint8_t status)
> > >  {
> > >      uint8_t s;
> > >      int ret;
> > >
> > > -    trace_vhost_vdpa_add_status(dev, status);
> > > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> > > +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
> > >      if (ret < 0) {
> > >          return ret;
> > >      }
> > >
> > >      s |= status;
> > >
> > > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
> > > +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &s);
> > >      if (ret < 0) {
> > >          return ret;
> > >      }
> > >
> > > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> > > +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
> > >      if (ret < 0) {
> > >          return ret;
> > >      }
> > > @@ -378,6 +385,12 @@ static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > >      return 0;
> > >  }
> > >
> > > +static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > > +{
> > > +    trace_vhost_vdpa_add_status(dev, status);
> > > +    return vhost_vdpa_add_status_fd(vhost_vdpa_dev_fd(dev), status);
> > > +}
> > > +
> > >  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range)
> > >  {
> > >      int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
> > > @@ -709,16 +722,20 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> > >      return ret;
> > >  }
> > >
> > > +static int vhost_vdpa_reset_device_fd(int fd)
> > > +{
> > > +    uint8_t status = 0;
> > > +
> > > +    return vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &status);
> > > +}
> > > +
> > >  static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> > >  {
> > >      struct vhost_vdpa *v = dev->opaque;
> > > -    int ret;
> > > -    uint8_t status = 0;
> > >
> > > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> > > -    trace_vhost_vdpa_reset_device(dev);
> > >      v->suspended = false;
> > > -    return ret;
> > > +    trace_vhost_vdpa_reset_device(dev);
> > > +    return vhost_vdpa_reset_device_fd(vhost_vdpa_dev_fd(dev));
> > >  }
> > >
> > >  static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
> > > @@ -1170,6 +1187,13 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> > >      return 0;
> > >  }
> > >
> > > +void vhost_vdpa_reset_status_fd(int fd)
> > > +{
> > > +    vhost_vdpa_reset_device_fd(fd);
> > > +    vhost_vdpa_add_status_fd(fd, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > > +                                 VIRTIO_CONFIG_S_DRIVER);
> >
> > I would like to rename this function since it does more than just reset.
> >
>
> vhost_vdpa_set_ready?

But it doesn't set DRIVER_OK, so that name might be somewhat misleading.

Thanks

>
> Thanks!
>
> > Thanks
> >
> > > +}
> > > +
> > >  static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> > >  {
> > >      struct vhost_vdpa *v = dev->opaque;
> > > @@ -1178,9 +1202,7 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> > >          return;
> > >      }
> > >
> > > -    vhost_vdpa_reset_device(dev);
> > > -    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > > -                               VIRTIO_CONFIG_S_DRIVER);
> > > +    vhost_vdpa_reset_status_fd(vhost_vdpa_dev_fd(dev));
> > >      memory_listener_unregister(&v->listener);
> > >  }
> > >
> > > --
> > > 2.31.1
> > >
> >
>




* Re: [PATCH v3 0/5] Move ASID test to vhost-vdpa net initialization
  2023-05-09 15:44 [PATCH v3 0/5] Move ASID test to vhost-vdpa net initialization Eugenio Pérez
                   ` (4 preceding siblings ...)
  2023-05-09 15:44 ` [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa Eugenio Pérez
@ 2023-05-17  6:18 ` Lei Yang
  5 siblings, 0 replies; 19+ messages in thread
From: Lei Yang @ 2023-05-17  6:18 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	Jason Wang, si-wei.liu, Shannon Nelson, Michael S. Tsirkin,
	Laurent Vivier, Cindy Lu

QE tested this series with sanity testing on the vdpa_sim device.
Everything works fine and there are no new regression problems.

Tested-by: Lei Yang <leiyang@redhat.com>



On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> QEMU v8.0 is able to switch dynamically between vhost-vdpa passthrough
> and SVQ mode as long as the net device does not have CVQ.  The net device
> state followed (and migrated) by CVQ requires special care.
>
> A pre-requisite to add CVQ to that framework is to determine if devices with
> CVQ are migratable or not at initialization time.  The solution to it is to
> always shadow only CVQ, and vq groups and ASID are used for that.
>
> However, current qemu version only checks ASID at device start (as "driver set
> DRIVER_OK status bit"), not at device initialization.  A check at
> initialization time is required.  Otherwise, the guest would be able to set
> and remove migration blockers at will [1].
>
> This series is a requisite for migration of vhost-vdpa net devices with CVQ.
> However it already makes sense by its own, as it reduces the number of ioctls
> at migration time, decreasing the error paths there.
>
> [1] https://lore.kernel.org/qemu-devel/2616f0cd-f9e8-d183-ea78-db1be4825d9c@redhat.com/
> ---
> v3:
> * Only record cvq_isolated, true if the device have cvq isolated in both !MQ
> * and MQ configurations.
> * Drop the cache of cvq group, it can be done on top
>
> v2:
> * Take out the reset of the device from vhost_vdpa_cvq_is_isolated
>   (reported by Lei Yang).
> * Expand patch messages by Stefano G. questions.
>
> Eugenio Pérez (5):
>   vdpa: Remove status in reset tracing
>   vdpa: add vhost_vdpa_reset_status_fd
>   vdpa: add vhost_vdpa_set_dev_features_fd
>   vdpa: return errno in vhost_vdpa_get_vring_group error
>   vdpa: move CVQ isolation check to net_init_vhost_vdpa
>
>  include/hw/virtio/vhost-vdpa.h |   2 +
>  hw/virtio/vhost-vdpa.c         |  78 ++++++++++-----
>  net/vhost-vdpa.c               | 171 ++++++++++++++++++++++++++-------
>  hw/virtio/trace-events         |   2 +-
>  4 files changed, 192 insertions(+), 61 deletions(-)
>
> --
> 2.31.1
>
>




* Re: [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa
  2023-05-17  3:59   ` Jason Wang
@ 2023-05-17  6:29     ` Eugenio Perez Martin
  2023-05-18  5:49       ` Jason Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Eugenio Perez Martin @ 2023-05-17  6:29 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	si-wei.liu, Shannon Nelson, Lei Yang, Michael S. Tsirkin,
	Laurent Vivier, Cindy Lu

On Wed, May 17, 2023 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > Evaluating it at start time instead of initialization time may make the
> > guest capable of dynamically adding or removing migration blockers.
> >
> > Also, moving to initialization reduces the number of ioctls in the
> > migration, reducing failure possibilities.
> >
> > As a drawback we need to check for CVQ isolation twice: one time with no
> > MQ negotiated and another one acking it, as long as the device supports
> > it.  This is because Vring ASID / group management is based on vq
> > indexes, but we don't know the index of CVQ before negotiating MQ.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> > v2: Take out the reset of the device from vhost_vdpa_cvq_is_isolated
> > v3: Only record cvq_isolated, true if the device have cvq isolated in
> >     both !MQ and MQ configurations.
> > ---
> >  net/vhost-vdpa.c | 178 +++++++++++++++++++++++++++++++++++------------
> >  1 file changed, 135 insertions(+), 43 deletions(-)
> >
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index 3fb833fe76..29054b77a9 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -43,6 +43,10 @@ typedef struct VhostVDPAState {
> >
> >      /* The device always have SVQ enabled */
> >      bool always_svq;
> > +
> > +    /* The device can isolate CVQ in its own ASID */
> > +    bool cvq_isolated;
> > +
> >      bool started;
> >  } VhostVDPAState;
> >
> > @@ -362,15 +366,8 @@ static NetClientInfo net_vhost_vdpa_info = {
> >          .check_peer_type = vhost_vdpa_check_peer_type,
> >  };
> >
> > -/**
> > - * Get vring virtqueue group
> > - *
> > - * @device_fd  vdpa device fd
> > - * @vq_index   Virtqueue index
> > - *
> > - * Return -errno in case of error, or vq group if success.
> > - */
> > -static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> > +static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
> > +                                          Error **errp)
> >  {
> >      struct vhost_vring_state state = {
> >          .index = vq_index,
> > @@ -379,8 +376,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> >
> >      if (unlikely(r < 0)) {
> >          r = -errno;
> > -        error_report("Cannot get VQ %u group: %s", vq_index,
> > -                     g_strerror(errno));
> > +        error_setg_errno(errp, errno, "Cannot get VQ %u group", vq_index);
> >          return r;
> >      }
> >
> > @@ -480,9 +476,9 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> >  {
> >      VhostVDPAState *s, *s0;
> >      struct vhost_vdpa *v;
> > -    uint64_t backend_features;
> >      int64_t cvq_group;
> > -    int cvq_index, r;
> > +    int r;
> > +    Error *err = NULL;
> >
> >      assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >
> > @@ -502,41 +498,22 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> >      /*
> >       * If we early return in these cases SVQ will not be enabled. The migration
> >       * will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
> > -     *
> > -     * Calling VHOST_GET_BACKEND_FEATURES as they are not available in v->dev
> > -     * yet.
> >       */
> > -    r = ioctl(v->device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> > -    if (unlikely(r < 0)) {
> > -        error_report("Cannot get vdpa backend_features: %s(%d)",
> > -            g_strerror(errno), errno);
> > -        return -1;
> > +    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > +        return 0;
> >      }
> > -    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID)) ||
> > -        !vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > +
> > +    if (!s->cvq_isolated) {
> >          return 0;
> >      }
> >
> > -    /*
> > -     * Check if all the virtqueues of the virtio device are in a different vq
> > -     * than the last vq. VQ group of last group passed in cvq_group.
> > -     */
> > -    cvq_index = v->dev->vq_index_end - 1;
> > -    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
> > +    cvq_group = vhost_vdpa_get_vring_group(v->device_fd,
> > +                                           v->dev->vq_index_end - 1,
> > +                                           &err);
> >      if (unlikely(cvq_group < 0)) {
> > +        error_report_err(err);
> >          return cvq_group;
> >      }
> > -    for (int i = 0; i < cvq_index; ++i) {
> > -        int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
> > -
> > -        if (unlikely(group < 0)) {
> > -            return group;
> > -        }
> > -
> > -        if (group == cvq_group) {
> > -            return 0;
> > -        }
> > -    }
> >
> >      r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
> >      if (unlikely(r < 0)) {
> > @@ -799,6 +776,111 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
> >      .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
> >  };
> >
> > +/**
> > + * Probe the device to check control virtqueue is isolated.
> > + *
> > + * @device_fd vhost-vdpa file descriptor
> > + * @features features to negotiate
> > + * @cvq_index Control vq index
> > + *
> > + * Returns -1 in case of error, 0 if false and 1 if true
> > + */
> > +static int vhost_vdpa_cvq_is_isolated(int device_fd, uint64_t features,
> > +                                      unsigned cvq_index, Error **errp)
> > +{
> > +    int64_t cvq_group;
> > +    int r;
> > +
> > +    r = vhost_vdpa_set_dev_features_fd(device_fd, features);
> > +    if (unlikely(r < 0)) {
> > +        error_setg_errno(errp, -r, "Cannot set device features");
> > +        return r;
> > +    }
> > +
> > +    cvq_group = vhost_vdpa_get_vring_group(device_fd, cvq_index, errp);
> > +    if (unlikely(cvq_group < 0)) {
> > +        return cvq_group;
> > +    }
> > +
> > +    for (int i = 0; i < cvq_index; ++i) {
> > +        int64_t group = vhost_vdpa_get_vring_group(device_fd, i, errp);
> > +
> > +        if (unlikely(group < 0)) {
> > +            return group;
> > +        }
> > +
> > +        if (group == (int64_t)cvq_group) {
> > +            return 0;
> > +        }
> > +    }
> > +
> > +    return 1;
> > +}
> > +
> > +/**
> > + * Probe if CVQ is isolated when the device is MQ and when it is not MQ
> > + *
> > + * @device_fd         The vdpa device fd
> > + * @features          Features offered by the device.
> > + * @cvq_index         The control vq index if mq is negotiated. Ignored
> > + *                    otherwise.
> > + *
> > + * Returns <0 in case of failure, 0 if false and 1 if true.
> > + */
> > +static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
> > +                                          int cvq_index, Error **errp)
> > +{
> > +    uint64_t backend_features;
> > +    int r;
> > +
> > +    ERRP_GUARD();
> > +
> > +    r = ioctl(device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> > +    if (unlikely(r < 0)) {
> > +        error_setg_errno(errp, errno, "Cannot get vdpa backend_features");
> > +        return r;
> > +    }
> > +
> > +    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
> > +        return 0;
> > +    }
> > +
> > +    r = vhost_vdpa_cvq_is_isolated(device_fd,
> > +                                   features & ~BIT_ULL(VIRTIO_NET_F_MQ), 2,
> > +                                   errp);
> > +    if (unlikely(r < 0)) {
> > +        if (r != -ENOTSUP) {
> > +            return r;
> > +        }
> > +
> > +        /*
> > +         * The kernel report VHOST_BACKEND_F_IOTLB_ASID if the vdpa frontend
> > +         * support ASID even if the parent driver does not.  The CVQ cannot be
> > +         * isolated in this case.
> > +         */
> > +        error_free(*errp);
> > +        *errp = NULL;
> > +        return 0;
> > +    }
> > +
> > +    if (r == 0) {
> > +        return 0;
> > +    }
> > +
> > +    vhost_vdpa_reset_status_fd(device_fd);
> > +    if (!(features & BIT_ULL(VIRTIO_NET_F_MQ))) {
> > +        return 0;
> > +    }
> > +
> > +    r = vhost_vdpa_cvq_is_isolated(device_fd, features, cvq_index * 2, errp);
>
> I think checking this once should be sufficient. That is to say, it
> should be a bug if there's hardware that puts cvq in a dedicated group
> in MQ but not in SQ.
>

This is checking that the NIC is not buggy :). Otherwise, we would be
giving the guest access to the CVQ shadow vring. And, currently, SVQ
code assumes only QEMU can access it.

But maybe this made more sense in previous versions, where the series
also cached the cvq group here. If I understand you correctly, it is
enough to check that CVQ is isolated in SQ and assume it will also be
isolated in MQ, right? I can modify the patch that way if you confirm
this.

Thanks!

> Thanks
>
> > +    if (unlikely(r < 0)) {
> > +        return r;
> > +    }
> > +
> > +    vhost_vdpa_reset_status_fd(device_fd);
> > +    return r;
> > +}
> > +
> >  static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >                                         const char *device,
> >                                         const char *name,
> > @@ -808,16 +890,25 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >                                         bool is_datapath,
> >                                         bool svq,
> >                                         struct vhost_vdpa_iova_range iova_range,
> > -                                       uint64_t features)
> > +                                       uint64_t features,
> > +                                       Error **errp)
> >  {
> >      NetClientState *nc = NULL;
> >      VhostVDPAState *s;
> >      int ret = 0;
> >      assert(name);
> > +    int cvq_isolated;
> > +
> >      if (is_datapath) {
> >          nc = qemu_new_net_client(&net_vhost_vdpa_info, peer, device,
> >                                   name);
> >      } else {
> > +        cvq_isolated = vhost_vdpa_probe_cvq_isolation(vdpa_device_fd, features,
> > +                                                      queue_pair_index, errp);
> > +        if (unlikely(cvq_isolated < 0)) {
> > +            return NULL;
> > +        }
> > +
> >          nc = qemu_new_net_control_client(&net_vhost_vdpa_cvq_info, peer,
> >                                           device, name);
> >      }
> > @@ -844,6 +935,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >
> >          s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
> >          s->vhost_vdpa.shadow_vq_ops_opaque = s;
> > +        s->cvq_isolated = cvq_isolated;
> >
> >          /*
> >           * TODO: We cannot migrate devices with CVQ as there is no way to set
> > @@ -972,7 +1064,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >      for (i = 0; i < queue_pairs; i++) {
> >          ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >                                       vdpa_device_fd, i, 2, true, opts->x_svq,
> > -                                     iova_range, features);
> > +                                     iova_range, features, errp);
> >          if (!ncs[i])
> >              goto err;
> >      }
> > @@ -980,7 +1072,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >      if (has_cvq) {
> >          nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >                                   vdpa_device_fd, i, 1, false,
> > -                                 opts->x_svq, iova_range, features);
> > +                                 opts->x_svq, iova_range, features, errp);
> >          if (!nc)
> >              goto err;
> >      }
> > --
> > 2.31.1
> >
>



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa
  2023-05-17  6:29     ` Eugenio Perez Martin
@ 2023-05-18  5:49       ` Jason Wang
  2023-05-18  6:36         ` Eugenio Perez Martin
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2023-05-18  5:49 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	si-wei.liu, Shannon Nelson, Lei Yang, Michael S. Tsirkin,
	Laurent Vivier, Cindy Lu

On Wed, May 17, 2023 at 2:30 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Wed, May 17, 2023 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >
> > > Evaluating it at start time instead of initialization time may make the
> > > guest capable of dynamically adding or removing migration blockers.
> > >
> > > Also, moving to initialization reduces the number of ioctls in the
> > > migration, reducing failure possibilities.
> > >
> > > As a drawback we need to check for CVQ isolation twice: one time with no
> > > MQ negotiated and another one acking it, as long as the device supports
> > > it.  This is because Vring ASID / group management is based on vq
> > > indexes, but we don't know the index of CVQ before negotiating MQ.
> > >
> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > ---
> > > v2: Take out the reset of the device from vhost_vdpa_cvq_is_isolated
> > > v3: Only record cvq_isolated, true if the device has cvq isolated in
> > >     both !MQ and MQ configurations.
> > > ---
> > >  net/vhost-vdpa.c | 178 +++++++++++++++++++++++++++++++++++------------
> > >  1 file changed, 135 insertions(+), 43 deletions(-)
> > >
> > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > index 3fb833fe76..29054b77a9 100644
> > > --- a/net/vhost-vdpa.c
> > > +++ b/net/vhost-vdpa.c
> > > @@ -43,6 +43,10 @@ typedef struct VhostVDPAState {
> > >
> > >      /* The device always have SVQ enabled */
> > >      bool always_svq;
> > > +
> > > +    /* The device can isolate CVQ in its own ASID */
> > > +    bool cvq_isolated;
> > > +
> > >      bool started;
> > >  } VhostVDPAState;
> > >
> > > @@ -362,15 +366,8 @@ static NetClientInfo net_vhost_vdpa_info = {
> > >          .check_peer_type = vhost_vdpa_check_peer_type,
> > >  };
> > >
> > > -/**
> > > - * Get vring virtqueue group
> > > - *
> > > - * @device_fd  vdpa device fd
> > > - * @vq_index   Virtqueue index
> > > - *
> > > - * Return -errno in case of error, or vq group if success.
> > > - */
> > > -static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> > > +static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
> > > +                                          Error **errp)
> > >  {
> > >      struct vhost_vring_state state = {
> > >          .index = vq_index,
> > > @@ -379,8 +376,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> > >
> > >      if (unlikely(r < 0)) {
> > >          r = -errno;
> > > -        error_report("Cannot get VQ %u group: %s", vq_index,
> > > -                     g_strerror(errno));
> > > +        error_setg_errno(errp, errno, "Cannot get VQ %u group", vq_index);
> > >          return r;
> > >      }
> > >
> > > @@ -480,9 +476,9 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> > >  {
> > >      VhostVDPAState *s, *s0;
> > >      struct vhost_vdpa *v;
> > > -    uint64_t backend_features;
> > >      int64_t cvq_group;
> > > -    int cvq_index, r;
> > > +    int r;
> > > +    Error *err = NULL;
> > >
> > >      assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > >
> > > @@ -502,41 +498,22 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> > >      /*
> > >       * If we early return in these cases SVQ will not be enabled. The migration
> > >       * will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
> > > -     *
> > > -     * Calling VHOST_GET_BACKEND_FEATURES as they are not available in v->dev
> > > -     * yet.
> > >       */
> > > -    r = ioctl(v->device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> > > -    if (unlikely(r < 0)) {
> > > -        error_report("Cannot get vdpa backend_features: %s(%d)",
> > > -            g_strerror(errno), errno);
> > > -        return -1;
> > > +    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > > +        return 0;
> > >      }
> > > -    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID)) ||
> > > -        !vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > > +
> > > +    if (!s->cvq_isolated) {
> > >          return 0;
> > >      }
> > >
> > > -    /*
> > > -     * Check if all the virtqueues of the virtio device are in a different vq
> > > -     * than the last vq. VQ group of last group passed in cvq_group.
> > > -     */
> > > -    cvq_index = v->dev->vq_index_end - 1;
> > > -    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
> > > +    cvq_group = vhost_vdpa_get_vring_group(v->device_fd,
> > > +                                           v->dev->vq_index_end - 1,
> > > +                                           &err);
> > >      if (unlikely(cvq_group < 0)) {
> > > +        error_report_err(err);
> > >          return cvq_group;
> > >      }
> > > -    for (int i = 0; i < cvq_index; ++i) {
> > > -        int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
> > > -
> > > -        if (unlikely(group < 0)) {
> > > -            return group;
> > > -        }
> > > -
> > > -        if (group == cvq_group) {
> > > -            return 0;
> > > -        }
> > > -    }
> > >
> > >      r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
> > >      if (unlikely(r < 0)) {
> > > @@ -799,6 +776,111 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
> > >      .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
> > >  };
> > >
> > > +/**
> > > + * Probe the device to check control virtqueue is isolated.
> > > + *
> > > + * @device_fd vhost-vdpa file descriptor
> > > + * @features features to negotiate
> > > + * @cvq_index Control vq index
> > > + *
> > > + * Returns -1 in case of error, 0 if false and 1 if true
> > > + */
> > > +static int vhost_vdpa_cvq_is_isolated(int device_fd, uint64_t features,
> > > +                                      unsigned cvq_index, Error **errp)
> > > +{
> > > +    int64_t cvq_group;
> > > +    int r;
> > > +
> > > +    r = vhost_vdpa_set_dev_features_fd(device_fd, features);
> > > +    if (unlikely(r < 0)) {
> > > +        error_setg_errno(errp, -r, "Cannot set device features");
> > > +        return r;
> > > +    }
> > > +
> > > +    cvq_group = vhost_vdpa_get_vring_group(device_fd, cvq_index, errp);
> > > +    if (unlikely(cvq_group < 0)) {
> > > +        return cvq_group;
> > > +    }
> > > +
> > > +    for (int i = 0; i < cvq_index; ++i) {
> > > +        int64_t group = vhost_vdpa_get_vring_group(device_fd, i, errp);
> > > +
> > > +        if (unlikely(group < 0)) {
> > > +            return group;
> > > +        }
> > > +
> > > +        if (group == (int64_t)cvq_group) {
> > > +            return 0;
> > > +        }
> > > +    }
> > > +
> > > +    return 1;
> > > +}
> > > +
> > > +/**
> > > + * Probe if CVQ is isolated when the device is MQ and when it is not MQ
> > > + *
> > > + * @device_fd         The vdpa device fd
> > > + * @features          Features offered by the device.
> > > + * @cvq_index         The control vq index if mq is negotiated. Ignored
> > > + *                    otherwise.
> > > + *
> > > + * Returns <0 in case of failure, 0 if false and 1 if true.
> > > + */
> > > +static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
> > > +                                          int cvq_index, Error **errp)
> > > +{
> > > +    uint64_t backend_features;
> > > +    int r;
> > > +
> > > +    ERRP_GUARD();
> > > +
> > > +    r = ioctl(device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> > > +    if (unlikely(r < 0)) {
> > > +        error_setg_errno(errp, errno, "Cannot get vdpa backend_features");
> > > +        return r;
> > > +    }
> > > +
> > > +    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
> > > +        return 0;
> > > +    }
> > > +
> > > +    r = vhost_vdpa_cvq_is_isolated(device_fd,
> > > +                                   features & ~BIT_ULL(VIRTIO_NET_F_MQ), 2,
> > > +                                   errp);
> > > +    if (unlikely(r < 0)) {
> > > +        if (r != -ENOTSUP) {
> > > +            return r;
> > > +        }
> > > +
> > > +        /*
> > > +         * The kernel report VHOST_BACKEND_F_IOTLB_ASID if the vdpa frontend
> > > +         * support ASID even if the parent driver does not.  The CVQ cannot be
> > > +         * isolated in this case.
> > > +         */
> > > +        error_free(*errp);
> > > +        *errp = NULL;
> > > +        return 0;
> > > +    }
> > > +
> > > +    if (r == 0) {
> > > +        return 0;
> > > +    }
> > > +
> > > +    vhost_vdpa_reset_status_fd(device_fd);
> > > +    if (!(features & BIT_ULL(VIRTIO_NET_F_MQ))) {
> > > +        return 0;
> > > +    }
> > > +
> > > +    r = vhost_vdpa_cvq_is_isolated(device_fd, features, cvq_index * 2, errp);
> >
> > I think checking this once should be sufficient. That is to say, it
> > should be a bug if there's hardware that puts cvq in a dedicated group
> > in MQ but not in SQ.
> >
>
> This is checking that the NIC is not buggy :). Otherwise, we would be
> giving the guest access to the CVQ shadow vring. And, currently, SVQ
> code assumes only QEMU can access it.

Just to make sure we are on the same page: I meant that the hardware
would have to be buggy for the isolation of CVQ to differ between
single queue and multiqueue.

>
> But maybe this made more sense in previous versions, where the series
> also cached the cvq group here. If I understand you correctly, it is
> enough to check that CVQ is isolated in SQ and assume it will also be
> isolated in MQ, right? I can modify the patch that way if you confirm
> this.

I think so, or just negotiate with what hardware provides us and check.

Thanks

>
> Thanks!
>
> > Thanks
> >
> > > +    if (unlikely(r < 0)) {
> > > +        return r;
> > > +    }
> > > +
> > > +    vhost_vdpa_reset_status_fd(device_fd);
> > > +    return r;
> > > +}
> > > +
> > >  static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> > >                                         const char *device,
> > >                                         const char *name,
> > > @@ -808,16 +890,25 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> > >                                         bool is_datapath,
> > >                                         bool svq,
> > >                                         struct vhost_vdpa_iova_range iova_range,
> > > -                                       uint64_t features)
> > > +                                       uint64_t features,
> > > +                                       Error **errp)
> > >  {
> > >      NetClientState *nc = NULL;
> > >      VhostVDPAState *s;
> > >      int ret = 0;
> > >      assert(name);
> > > +    int cvq_isolated;
> > > +
> > >      if (is_datapath) {
> > >          nc = qemu_new_net_client(&net_vhost_vdpa_info, peer, device,
> > >                                   name);
> > >      } else {
> > > +        cvq_isolated = vhost_vdpa_probe_cvq_isolation(vdpa_device_fd, features,
> > > +                                                      queue_pair_index, errp);
> > > +        if (unlikely(cvq_isolated < 0)) {
> > > +            return NULL;
> > > +        }
> > > +
> > >          nc = qemu_new_net_control_client(&net_vhost_vdpa_cvq_info, peer,
> > >                                           device, name);
> > >      }
> > > @@ -844,6 +935,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> > >
> > >          s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
> > >          s->vhost_vdpa.shadow_vq_ops_opaque = s;
> > > +        s->cvq_isolated = cvq_isolated;
> > >
> > >          /*
> > >           * TODO: We cannot migrate devices with CVQ as there is no way to set
> > > @@ -972,7 +1064,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> > >      for (i = 0; i < queue_pairs; i++) {
> > >          ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> > >                                       vdpa_device_fd, i, 2, true, opts->x_svq,
> > > -                                     iova_range, features);
> > > +                                     iova_range, features, errp);
> > >          if (!ncs[i])
> > >              goto err;
> > >      }
> > > @@ -980,7 +1072,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> > >      if (has_cvq) {
> > >          nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> > >                                   vdpa_device_fd, i, 1, false,
> > > -                                 opts->x_svq, iova_range, features);
> > > +                                 opts->x_svq, iova_range, features, errp);
> > >          if (!nc)
> > >              goto err;
> > >      }
> > > --
> > > 2.31.1
> > >
> >
>



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa
  2023-05-18  5:49       ` Jason Wang
@ 2023-05-18  6:36         ` Eugenio Perez Martin
  2023-05-18  6:58           ` Jason Wang
  2023-05-18 21:22           ` Michael S. Tsirkin
  0 siblings, 2 replies; 19+ messages in thread
From: Eugenio Perez Martin @ 2023-05-18  6:36 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	si-wei.liu, Shannon Nelson, Lei Yang, Michael S. Tsirkin,
	Laurent Vivier, Cindy Lu

On Thu, May 18, 2023 at 7:50 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, May 17, 2023 at 2:30 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Wed, May 17, 2023 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > >
> > > > Evaluating it at start time instead of initialization time may make the
> > > > guest capable of dynamically adding or removing migration blockers.
> > > >
> > > > Also, moving to initialization reduces the number of ioctls in the
> > > > migration, reducing failure possibilities.
> > > >
> > > > As a drawback we need to check for CVQ isolation twice: one time with no
> > > > MQ negotiated and another one acking it, as long as the device supports
> > > > it.  This is because Vring ASID / group management is based on vq
> > > > indexes, but we don't know the index of CVQ before negotiating MQ.
> > > >
> > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > ---
> > > > v2: Take out the reset of the device from vhost_vdpa_cvq_is_isolated
> > > > v3: Only record cvq_isolated, true if the device has cvq isolated in
> > > >     both !MQ and MQ configurations.
> > > > ---
> > > >  net/vhost-vdpa.c | 178 +++++++++++++++++++++++++++++++++++------------
> > > >  1 file changed, 135 insertions(+), 43 deletions(-)
> > > >
> > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > > index 3fb833fe76..29054b77a9 100644
> > > > --- a/net/vhost-vdpa.c
> > > > +++ b/net/vhost-vdpa.c
> > > > @@ -43,6 +43,10 @@ typedef struct VhostVDPAState {
> > > >
> > > >      /* The device always have SVQ enabled */
> > > >      bool always_svq;
> > > > +
> > > > +    /* The device can isolate CVQ in its own ASID */
> > > > +    bool cvq_isolated;
> > > > +
> > > >      bool started;
> > > >  } VhostVDPAState;
> > > >
> > > > @@ -362,15 +366,8 @@ static NetClientInfo net_vhost_vdpa_info = {
> > > >          .check_peer_type = vhost_vdpa_check_peer_type,
> > > >  };
> > > >
> > > > -/**
> > > > - * Get vring virtqueue group
> > > > - *
> > > > - * @device_fd  vdpa device fd
> > > > - * @vq_index   Virtqueue index
> > > > - *
> > > > - * Return -errno in case of error, or vq group if success.
> > > > - */
> > > > -static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> > > > +static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
> > > > +                                          Error **errp)
> > > >  {
> > > >      struct vhost_vring_state state = {
> > > >          .index = vq_index,
> > > > @@ -379,8 +376,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> > > >
> > > >      if (unlikely(r < 0)) {
> > > >          r = -errno;
> > > > -        error_report("Cannot get VQ %u group: %s", vq_index,
> > > > -                     g_strerror(errno));
> > > > +        error_setg_errno(errp, errno, "Cannot get VQ %u group", vq_index);
> > > >          return r;
> > > >      }
> > > >
> > > > @@ -480,9 +476,9 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> > > >  {
> > > >      VhostVDPAState *s, *s0;
> > > >      struct vhost_vdpa *v;
> > > > -    uint64_t backend_features;
> > > >      int64_t cvq_group;
> > > > -    int cvq_index, r;
> > > > +    int r;
> > > > +    Error *err = NULL;
> > > >
> > > >      assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > > >
> > > > @@ -502,41 +498,22 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> > > >      /*
> > > >       * If we early return in these cases SVQ will not be enabled. The migration
> > > >       * will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
> > > > -     *
> > > > -     * Calling VHOST_GET_BACKEND_FEATURES as they are not available in v->dev
> > > > -     * yet.
> > > >       */
> > > > -    r = ioctl(v->device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> > > > -    if (unlikely(r < 0)) {
> > > > -        error_report("Cannot get vdpa backend_features: %s(%d)",
> > > > -            g_strerror(errno), errno);
> > > > -        return -1;
> > > > +    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > > > +        return 0;
> > > >      }
> > > > -    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID)) ||
> > > > -        !vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > > > +
> > > > +    if (!s->cvq_isolated) {
> > > >          return 0;
> > > >      }
> > > >
> > > > -    /*
> > > > -     * Check if all the virtqueues of the virtio device are in a different vq
> > > > -     * than the last vq. VQ group of last group passed in cvq_group.
> > > > -     */
> > > > -    cvq_index = v->dev->vq_index_end - 1;
> > > > -    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
> > > > +    cvq_group = vhost_vdpa_get_vring_group(v->device_fd,
> > > > +                                           v->dev->vq_index_end - 1,
> > > > +                                           &err);
> > > >      if (unlikely(cvq_group < 0)) {
> > > > +        error_report_err(err);
> > > >          return cvq_group;
> > > >      }
> > > > -    for (int i = 0; i < cvq_index; ++i) {
> > > > -        int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
> > > > -
> > > > -        if (unlikely(group < 0)) {
> > > > -            return group;
> > > > -        }
> > > > -
> > > > -        if (group == cvq_group) {
> > > > -            return 0;
> > > > -        }
> > > > -    }
> > > >
> > > >      r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
> > > >      if (unlikely(r < 0)) {
> > > > @@ -799,6 +776,111 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
> > > >      .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
> > > >  };
> > > >
> > > > +/**
> > > > + * Probe the device to check control virtqueue is isolated.
> > > > + *
> > > > + * @device_fd vhost-vdpa file descriptor
> > > > + * @features features to negotiate
> > > > + * @cvq_index Control vq index
> > > > + *
> > > > + * Returns -1 in case of error, 0 if false and 1 if true
> > > > + */
> > > > +static int vhost_vdpa_cvq_is_isolated(int device_fd, uint64_t features,
> > > > +                                      unsigned cvq_index, Error **errp)
> > > > +{
> > > > +    int64_t cvq_group;
> > > > +    int r;
> > > > +
> > > > +    r = vhost_vdpa_set_dev_features_fd(device_fd, features);
> > > > +    if (unlikely(r < 0)) {
> > > > +        error_setg_errno(errp, -r, "Cannot set device features");
> > > > +        return r;
> > > > +    }
> > > > +
> > > > +    cvq_group = vhost_vdpa_get_vring_group(device_fd, cvq_index, errp);
> > > > +    if (unlikely(cvq_group < 0)) {
> > > > +        return cvq_group;
> > > > +    }
> > > > +
> > > > +    for (int i = 0; i < cvq_index; ++i) {
> > > > +        int64_t group = vhost_vdpa_get_vring_group(device_fd, i, errp);
> > > > +
> > > > +        if (unlikely(group < 0)) {
> > > > +            return group;
> > > > +        }
> > > > +
> > > > +        if (group == (int64_t)cvq_group) {
> > > > +            return 0;
> > > > +        }
> > > > +    }
> > > > +
> > > > +    return 1;
> > > > +}
> > > > +
> > > > +/**
> > > > + * Probe if CVQ is isolated when the device is MQ and when it is not MQ
> > > > + *
> > > > + * @device_fd         The vdpa device fd
> > > > + * @features          Features offered by the device.
> > > > + * @cvq_index         The control vq index if mq is negotiated. Ignored
> > > > + *                    otherwise.
> > > > + *
> > > > + * Returns <0 in case of failure, 0 if false and 1 if true.
> > > > + */
> > > > +static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
> > > > +                                          int cvq_index, Error **errp)
> > > > +{
> > > > +    uint64_t backend_features;
> > > > +    int r;
> > > > +
> > > > +    ERRP_GUARD();
> > > > +
> > > > +    r = ioctl(device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> > > > +    if (unlikely(r < 0)) {
> > > > +        error_setg_errno(errp, errno, "Cannot get vdpa backend_features");
> > > > +        return r;
> > > > +    }
> > > > +
> > > > +    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
> > > > +        return 0;
> > > > +    }
> > > > +
> > > > +    r = vhost_vdpa_cvq_is_isolated(device_fd,
> > > > +                                   features & ~BIT_ULL(VIRTIO_NET_F_MQ), 2,
> > > > +                                   errp);
> > > > +    if (unlikely(r < 0)) {
> > > > +        if (r != -ENOTSUP) {
> > > > +            return r;
> > > > +        }
> > > > +
> > > > +        /*
> > > > +         * The kernel reports VHOST_BACKEND_F_IOTLB_ASID if the vdpa frontend
> > > > +         * supports ASID even if the parent driver does not.  The CVQ cannot be
> > > > +         * isolated in this case.
> > > > +         */
> > > > +        error_free(*errp);
> > > > +        *errp = NULL;
> > > > +        return 0;
> > > > +    }
> > > > +
> > > > +    if (r == 0) {
> > > > +        return 0;
> > > > +    }
> > > > +
> > > > +    vhost_vdpa_reset_status_fd(device_fd);
> > > > +    if (!(features & BIT_ULL(VIRTIO_NET_F_MQ))) {
> > > > +        return 0;
> > > > +    }
> > > > +
> > > > +    r = vhost_vdpa_cvq_is_isolated(device_fd, features, cvq_index * 2, errp);
> > >
> > > I think checking this once should be sufficient. That is to say, it
> > > should be a bug if there's hardware that puts cvq in a dedicated group
> > > in MQ but not in SQ.
> > >
> >
> > This is checking that the NIC is not buggy :). Otherwise, we're giving
> > the guest access to the CVQ shadow vring. And, currently, SVQ code
> > assumes only QEMU can access it.
>
> Just to make sure we are on the same page: I meant the hardware
> would be buggy if the isolation of cvq is not consistent between
> single and multiqueue.
>

Yes, I got you.

The problem with that particular bug is that we would handle the
guest's vring with the wrong IOVA tree. Since QEMU no longer sanitizes
those descriptors, the device could be used to write to QEMU memory.
At this time only the SVQ vring and in buffers should be writable that
way, so it's not a big deal.

This can also happen if the device is buggy in other ways: for
example, reporting that CVQ is isolated at VHOST_VDPA_GET_VRING_GROUP
but then handling maps while ignoring the ASID parameter. There is no
protection against that, so I agree this double check makes little
sense.

> >
> > But maybe this made more sense in previous versions, where the series
> > also cached the cvq group here. If I understand you correctly, it is
> > enough to check that CVQ is isolated in SQ, and assume it will be
> > isolated also in MQ, right? I can modify the patch that way if you
> > confirm this.
>
> I think so, or just negotiate with what hardware provides us and check.
>

Always probing with SQ makes the code simpler, but let me know if you
think there are advantages to probing otherwise.

Thanks!



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa
  2023-05-18  6:36         ` Eugenio Perez Martin
@ 2023-05-18  6:58           ` Jason Wang
  2023-05-18 21:22           ` Michael S. Tsirkin
  1 sibling, 0 replies; 19+ messages in thread
From: Jason Wang @ 2023-05-18  6:58 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	si-wei.liu, Shannon Nelson, Lei Yang, Michael S. Tsirkin,
	Laurent Vivier, Cindy Lu

On Thu, May 18, 2023 at 2:37 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Thu, May 18, 2023 at 7:50 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, May 17, 2023 at 2:30 PM Eugenio Perez Martin
> > <eperezma@redhat.com> wrote:
> > >
> > > On Wed, May 17, 2023 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > >
> > > > > Evaluating it at start time instead of initialization time may make the
> > > > > guest capable of dynamically adding or removing migration blockers.
> > > > >
> > > > > Also, moving to initialization reduces the number of ioctls in the
> > > > > migration, reducing failure possibilities.
> > > > >
> > > > > As a drawback we need to check for CVQ isolation twice: one time with no
> > > > > MQ negotiated and another one acking it, as long as the device supports
> > > > > it.  This is because Vring ASID / group management is based on vq
> > > > > indexes, but we don't know the index of CVQ before negotiating MQ.
> > > > >
> > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > ---
> > > > > v2: Take out the reset of the device from vhost_vdpa_cvq_is_isolated
> > > > > v3: Only record cvq_isolated, true if the device has cvq isolated in
> > > > >     both !MQ and MQ configurations.
> > > > > ---
> > > > >  net/vhost-vdpa.c | 178 +++++++++++++++++++++++++++++++++++------------
> > > > >  1 file changed, 135 insertions(+), 43 deletions(-)
> > > > >
> > > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > > > index 3fb833fe76..29054b77a9 100644
> > > > > --- a/net/vhost-vdpa.c
> > > > > +++ b/net/vhost-vdpa.c
> > > > > @@ -43,6 +43,10 @@ typedef struct VhostVDPAState {
> > > > >
> > > > >      /* The device always have SVQ enabled */
> > > > >      bool always_svq;
> > > > > +
> > > > > +    /* The device can isolate CVQ in its own ASID */
> > > > > +    bool cvq_isolated;
> > > > > +
> > > > >      bool started;
> > > > >  } VhostVDPAState;
> > > > >
> > > > > @@ -362,15 +366,8 @@ static NetClientInfo net_vhost_vdpa_info = {
> > > > >          .check_peer_type = vhost_vdpa_check_peer_type,
> > > > >  };
> > > > >
> > > > > -/**
> > > > > - * Get vring virtqueue group
> > > > > - *
> > > > > - * @device_fd  vdpa device fd
> > > > > - * @vq_index   Virtqueue index
> > > > > - *
> > > > > - * Return -errno in case of error, or vq group if success.
> > > > > - */
> > > > > -static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> > > > > +static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
> > > > > +                                          Error **errp)
> > > > >  {
> > > > >      struct vhost_vring_state state = {
> > > > >          .index = vq_index,
> > > > > @@ -379,8 +376,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> > > > >
> > > > >      if (unlikely(r < 0)) {
> > > > >          r = -errno;
> > > > > -        error_report("Cannot get VQ %u group: %s", vq_index,
> > > > > -                     g_strerror(errno));
> > > > > +        error_setg_errno(errp, errno, "Cannot get VQ %u group", vq_index);
> > > > >          return r;
> > > > >      }
> > > > >
> > > > > @@ -480,9 +476,9 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> > > > >  {
> > > > >      VhostVDPAState *s, *s0;
> > > > >      struct vhost_vdpa *v;
> > > > > -    uint64_t backend_features;
> > > > >      int64_t cvq_group;
> > > > > -    int cvq_index, r;
> > > > > +    int r;
> > > > > +    Error *err = NULL;
> > > > >
> > > > >      assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > > > >
> > > > > @@ -502,41 +498,22 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> > > > >      /*
> > > > >       * If we early return in these cases SVQ will not be enabled. The migration
> > > > >       * will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
> > > > > -     *
> > > > > -     * Calling VHOST_GET_BACKEND_FEATURES as they are not available in v->dev
> > > > > -     * yet.
> > > > >       */
> > > > > -    r = ioctl(v->device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> > > > > -    if (unlikely(r < 0)) {
> > > > > -        error_report("Cannot get vdpa backend_features: %s(%d)",
> > > > > -            g_strerror(errno), errno);
> > > > > -        return -1;
> > > > > +    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > > > > +        return 0;
> > > > >      }
> > > > > -    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID)) ||
> > > > > -        !vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > > > > +
> > > > > +    if (!s->cvq_isolated) {
> > > > >          return 0;
> > > > >      }
> > > > >
> > > > > -    /*
> > > > > -     * Check if all the virtqueues of the virtio device are in a different vq
> > > > > -     * than the last vq. VQ group of last group passed in cvq_group.
> > > > > -     */
> > > > > -    cvq_index = v->dev->vq_index_end - 1;
> > > > > -    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
> > > > > +    cvq_group = vhost_vdpa_get_vring_group(v->device_fd,
> > > > > +                                           v->dev->vq_index_end - 1,
> > > > > +                                           &err);
> > > > >      if (unlikely(cvq_group < 0)) {
> > > > > +        error_report_err(err);
> > > > >          return cvq_group;
> > > > >      }
> > > > > -    for (int i = 0; i < cvq_index; ++i) {
> > > > > -        int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
> > > > > -
> > > > > -        if (unlikely(group < 0)) {
> > > > > -            return group;
> > > > > -        }
> > > > > -
> > > > > -        if (group == cvq_group) {
> > > > > -            return 0;
> > > > > -        }
> > > > > -    }
> > > > >
> > > > >      r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
> > > > >      if (unlikely(r < 0)) {
> > > > > @@ -799,6 +776,111 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
> > > > >      .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
> > > > >  };
> > > > >
> > > > > +/**
> > > > > + * Probe the device to check whether the control virtqueue is isolated.
> > > > > + *
> > > > > + * @device_fd vhost-vdpa file descriptor
> > > > > + * @features features to negotiate
> > > > > + * @cvq_index Control vq index
> > > > > + *
> > > > > + * Returns <0 in case of error, 0 if false and 1 if true
> > > > > + */
> > > > > +static int vhost_vdpa_cvq_is_isolated(int device_fd, uint64_t features,
> > > > > +                                      unsigned cvq_index, Error **errp)
> > > > > +{
> > > > > +    int64_t cvq_group;
> > > > > +    int r;
> > > > > +
> > > > > +    r = vhost_vdpa_set_dev_features_fd(device_fd, features);
> > > > > +    if (unlikely(r < 0)) {
> > > > > +        error_setg_errno(errp, -r, "Cannot set device features");
> > > > > +        return r;
> > > > > +    }
> > > > > +
> > > > > +    cvq_group = vhost_vdpa_get_vring_group(device_fd, cvq_index, errp);
> > > > > +    if (unlikely(cvq_group < 0)) {
> > > > > +        return cvq_group;
> > > > > +    }
> > > > > +
> > > > > +    for (int i = 0; i < cvq_index; ++i) {
> > > > > +        int64_t group = vhost_vdpa_get_vring_group(device_fd, i, errp);
> > > > > +
> > > > > +        if (unlikely(group < 0)) {
> > > > > +            return group;
> > > > > +        }
> > > > > +
> > > > > +        if (group == (int64_t)cvq_group) {
> > > > > +            return 0;
> > > > > +        }
> > > > > +    }
> > > > > +
> > > > > +    return 1;
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * Probe if CVQ is isolated when the device is MQ and when it is not MQ
> > > > > + *
> > > > > + * @device_fd         The vdpa device fd
> > > > > + * @features          Features offered by the device.
> > > > > + * @cvq_index         The control vq index if mq is negotiated. Ignored
> > > > > + *                    otherwise.
> > > > > + *
> > > > > + * Returns <0 in case of failure, 0 if false and 1 if true.
> > > > > + */
> > > > > +static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
> > > > > +                                          int cvq_index, Error **errp)
> > > > > +{
> > > > > +    uint64_t backend_features;
> > > > > +    int r;
> > > > > +
> > > > > +    ERRP_GUARD();
> > > > > +
> > > > > +    r = ioctl(device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> > > > > +    if (unlikely(r < 0)) {
> > > > > +        error_setg_errno(errp, errno, "Cannot get vdpa backend_features");
> > > > > +        return r;
> > > > > +    }
> > > > > +
> > > > > +    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
> > > > > +        return 0;
> > > > > +    }
> > > > > +
> > > > > +    r = vhost_vdpa_cvq_is_isolated(device_fd,
> > > > > +                                   features & ~BIT_ULL(VIRTIO_NET_F_MQ), 2,
> > > > > +                                   errp);
> > > > > +    if (unlikely(r < 0)) {
> > > > > +        if (r != -ENOTSUP) {
> > > > > +            return r;
> > > > > +        }
> > > > > +
> > > > > +        /*
> > > > > +         * The kernel reports VHOST_BACKEND_F_IOTLB_ASID if the vdpa frontend
> > > > > +         * supports ASID even if the parent driver does not.  The CVQ cannot be
> > > > > +         * isolated in this case.
> > > > > +         */
> > > > > +        error_free(*errp);
> > > > > +        *errp = NULL;
> > > > > +        return 0;
> > > > > +    }
> > > > > +
> > > > > +    if (r == 0) {
> > > > > +        return 0;
> > > > > +    }
> > > > > +
> > > > > +    vhost_vdpa_reset_status_fd(device_fd);
> > > > > +    if (!(features & BIT_ULL(VIRTIO_NET_F_MQ))) {
> > > > > +        return 0;
> > > > > +    }
> > > > > +
> > > > > +    r = vhost_vdpa_cvq_is_isolated(device_fd, features, cvq_index * 2, errp);
> > > >
> > > > I think checking this once should be sufficient. That is to say, it
> > > > should be a bug if there's hardware that puts cvq in a dedicated group
> > > > in MQ but not in SQ.
> > > >
> > >
> > > This is checking that the NIC is not buggy :). Otherwise, we're giving
> > > the guest access to the CVQ shadow vring. And, currently, SVQ code
> > > assumes only QEMU can access it.
> >
> > Just to make sure we are on the same page: I meant the hardware
> > would be buggy if the isolation of cvq is not consistent between
> > single and multiqueue.
> >
>
> Yes, I got you.
>
> The problem with that particular bug is that we would handle the
> guest's vring with the wrong IOVA tree. Since QEMU no longer sanitizes
> those descriptors, the device could be used to write to QEMU memory.
> At this time only the SVQ vring and in buffers should be writable that
> way, so it's not a big deal.
>
> This can also happen if the device is buggy in other ways: for
> example, reporting that CVQ is isolated at VHOST_VDPA_GET_VRING_GROUP
> but then handling maps while ignoring the ASID parameter. There is no
> protection against that, so I agree this double check makes little
> sense.
>
> > >
> > > But maybe this made more sense in previous versions, where the series
> > > also cached the cvq group here. If I understand you correctly, it is
> > > enough to check that CVQ is isolated in SQ, and assume it will be
> > > isolated also in MQ, right? I can modify the patch that way if you
> > > confirm this.
> >
> > I think so, or just negotiate with what hardware provides us and check.
> >
>
> Always probing with SQ makes the code simpler, but let me know if you
> think there are advantages to probing otherwise.

It should be fine.

Thanks

>
> Thanks!
>



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa
  2023-05-18  6:36         ` Eugenio Perez Martin
  2023-05-18  6:58           ` Jason Wang
@ 2023-05-18 21:22           ` Michael S. Tsirkin
  2023-05-19  4:50             ` Eugenio Perez Martin
  1 sibling, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2023-05-18 21:22 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Jason Wang, qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	si-wei.liu, Shannon Nelson, Lei Yang, Laurent Vivier, Cindy Lu

On Thu, May 18, 2023 at 08:36:22AM +0200, Eugenio Perez Martin wrote:
> On Thu, May 18, 2023 at 7:50 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, May 17, 2023 at 2:30 PM Eugenio Perez Martin
> > <eperezma@redhat.com> wrote:
> > >
> > > On Wed, May 17, 2023 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > >
> > > > > Evaluating it at start time instead of initialization time may make the
> > > > > guest capable of dynamically adding or removing migration blockers.
> > > > >
> > > > > Also, moving to initialization reduces the number of ioctls in the
> > > > > migration, reducing failure possibilities.
> > > > >
> > > > > As a drawback we need to check for CVQ isolation twice: one time with no
> > > > > MQ negotiated and another one acking it, as long as the device supports
> > > > > it.  This is because Vring ASID / group management is based on vq
> > > > > indexes, but we don't know the index of CVQ before negotiating MQ.
> > > > >
> > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > ---
> > > > > v2: Take out the reset of the device from vhost_vdpa_cvq_is_isolated
> > > > > v3: Only record cvq_isolated, true if the device has cvq isolated in
> > > > >     both !MQ and MQ configurations.
> > > > > ---
> > > > >  net/vhost-vdpa.c | 178 +++++++++++++++++++++++++++++++++++------------
> > > > >  1 file changed, 135 insertions(+), 43 deletions(-)
> > > > >
> > > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > > > index 3fb833fe76..29054b77a9 100644
> > > > > --- a/net/vhost-vdpa.c
> > > > > +++ b/net/vhost-vdpa.c
> > > > > @@ -43,6 +43,10 @@ typedef struct VhostVDPAState {
> > > > >
> > > > >      /* The device always have SVQ enabled */
> > > > >      bool always_svq;
> > > > > +
> > > > > +    /* The device can isolate CVQ in its own ASID */
> > > > > +    bool cvq_isolated;
> > > > > +
> > > > >      bool started;
> > > > >  } VhostVDPAState;
> > > > >
> > > > > @@ -362,15 +366,8 @@ static NetClientInfo net_vhost_vdpa_info = {
> > > > >          .check_peer_type = vhost_vdpa_check_peer_type,
> > > > >  };
> > > > >
> > > > > -/**
> > > > > - * Get vring virtqueue group
> > > > > - *
> > > > > - * @device_fd  vdpa device fd
> > > > > - * @vq_index   Virtqueue index
> > > > > - *
> > > > > - * Return -errno in case of error, or vq group if success.
> > > > > - */
> > > > > -static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> > > > > +static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
> > > > > +                                          Error **errp)
> > > > >  {
> > > > >      struct vhost_vring_state state = {
> > > > >          .index = vq_index,
> > > > > @@ -379,8 +376,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> > > > >
> > > > >      if (unlikely(r < 0)) {
> > > > >          r = -errno;
> > > > > -        error_report("Cannot get VQ %u group: %s", vq_index,
> > > > > -                     g_strerror(errno));
> > > > > +        error_setg_errno(errp, errno, "Cannot get VQ %u group", vq_index);
> > > > >          return r;
> > > > >      }
> > > > >
> > > > > @@ -480,9 +476,9 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> > > > >  {
> > > > >      VhostVDPAState *s, *s0;
> > > > >      struct vhost_vdpa *v;
> > > > > -    uint64_t backend_features;
> > > > >      int64_t cvq_group;
> > > > > -    int cvq_index, r;
> > > > > +    int r;
> > > > > +    Error *err = NULL;
> > > > >
> > > > >      assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > > > >
> > > > > @@ -502,41 +498,22 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> > > > >      /*
> > > > >       * If we early return in these cases SVQ will not be enabled. The migration
> > > > >       * will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
> > > > > -     *
> > > > > -     * Calling VHOST_GET_BACKEND_FEATURES as they are not available in v->dev
> > > > > -     * yet.
> > > > >       */
> > > > > -    r = ioctl(v->device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> > > > > -    if (unlikely(r < 0)) {
> > > > > -        error_report("Cannot get vdpa backend_features: %s(%d)",
> > > > > -            g_strerror(errno), errno);
> > > > > -        return -1;
> > > > > +    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > > > > +        return 0;
> > > > >      }
> > > > > -    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID)) ||
> > > > > -        !vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > > > > +
> > > > > +    if (!s->cvq_isolated) {
> > > > >          return 0;
> > > > >      }
> > > > >
> > > > > -    /*
> > > > > -     * Check if all the virtqueues of the virtio device are in a different vq
> > > > > -     * than the last vq. VQ group of last group passed in cvq_group.
> > > > > -     */
> > > > > -    cvq_index = v->dev->vq_index_end - 1;
> > > > > -    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
> > > > > +    cvq_group = vhost_vdpa_get_vring_group(v->device_fd,
> > > > > +                                           v->dev->vq_index_end - 1,
> > > > > +                                           &err);
> > > > >      if (unlikely(cvq_group < 0)) {
> > > > > +        error_report_err(err);
> > > > >          return cvq_group;
> > > > >      }
> > > > > -    for (int i = 0; i < cvq_index; ++i) {
> > > > > -        int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
> > > > > -
> > > > > -        if (unlikely(group < 0)) {
> > > > > -            return group;
> > > > > -        }
> > > > > -
> > > > > -        if (group == cvq_group) {
> > > > > -            return 0;
> > > > > -        }
> > > > > -    }
> > > > >
> > > > >      r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
> > > > >      if (unlikely(r < 0)) {
> > > > > @@ -799,6 +776,111 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
> > > > >      .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
> > > > >  };
> > > > >
> > > > > +/**
> > > > > + * Probe the device to check whether the control virtqueue is isolated.
> > > > > + *
> > > > > + * @device_fd vhost-vdpa file descriptor
> > > > > + * @features features to negotiate
> > > > > + * @cvq_index Control vq index
> > > > > + *
> > > > > + * Returns <0 in case of error, 0 if false and 1 if true
> > > > > + */
> > > > > +static int vhost_vdpa_cvq_is_isolated(int device_fd, uint64_t features,
> > > > > +                                      unsigned cvq_index, Error **errp)
> > > > > +{
> > > > > +    int64_t cvq_group;
> > > > > +    int r;
> > > > > +
> > > > > +    r = vhost_vdpa_set_dev_features_fd(device_fd, features);
> > > > > +    if (unlikely(r < 0)) {
> > > > > +        error_setg_errno(errp, -r, "Cannot set device features");
> > > > > +        return r;
> > > > > +    }
> > > > > +
> > > > > +    cvq_group = vhost_vdpa_get_vring_group(device_fd, cvq_index, errp);
> > > > > +    if (unlikely(cvq_group < 0)) {
> > > > > +        return cvq_group;
> > > > > +    }
> > > > > +
> > > > > +    for (int i = 0; i < cvq_index; ++i) {
> > > > > +        int64_t group = vhost_vdpa_get_vring_group(device_fd, i, errp);
> > > > > +
> > > > > +        if (unlikely(group < 0)) {
> > > > > +            return group;
> > > > > +        }
> > > > > +
> > > > > +        if (group == (int64_t)cvq_group) {
> > > > > +            return 0;
> > > > > +        }
> > > > > +    }
> > > > > +
> > > > > +    return 1;
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * Probe if CVQ is isolated when the device is MQ and when it is not MQ
> > > > > + *
> > > > > + * @device_fd         The vdpa device fd
> > > > > + * @features          Features offered by the device.
> > > > > + * @cvq_index         The control vq index if mq is negotiated. Ignored
> > > > > + *                    otherwise.
> > > > > + *
> > > > > + * Returns <0 in case of failure, 0 if false and 1 if true.
> > > > > + */
> > > > > +static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
> > > > > +                                          int cvq_index, Error **errp)
> > > > > +{
> > > > > +    uint64_t backend_features;
> > > > > +    int r;
> > > > > +
> > > > > +    ERRP_GUARD();
> > > > > +
> > > > > +    r = ioctl(device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> > > > > +    if (unlikely(r < 0)) {
> > > > > +        error_setg_errno(errp, errno, "Cannot get vdpa backend_features");
> > > > > +        return r;
> > > > > +    }
> > > > > +
> > > > > +    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
> > > > > +        return 0;
> > > > > +    }
> > > > > +
> > > > > +    r = vhost_vdpa_cvq_is_isolated(device_fd,
> > > > > +                                   features & ~BIT_ULL(VIRTIO_NET_F_MQ), 2,
> > > > > +                                   errp);
> > > > > +    if (unlikely(r < 0)) {
> > > > > +        if (r != -ENOTSUP) {
> > > > > +            return r;
> > > > > +        }
> > > > > +
> > > > > +        /*
> > > > > +         * The kernel reports VHOST_BACKEND_F_IOTLB_ASID if the vdpa frontend
> > > > > +         * supports ASID even if the parent driver does not.  The CVQ cannot be
> > > > > +         * isolated in this case.
> > > > > +         */
> > > > > +        error_free(*errp);
> > > > > +        *errp = NULL;
> > > > > +        return 0;
> > > > > +    }
> > > > > +
> > > > > +    if (r == 0) {
> > > > > +        return 0;
> > > > > +    }
> > > > > +
> > > > > +    vhost_vdpa_reset_status_fd(device_fd);
> > > > > +    if (!(features & BIT_ULL(VIRTIO_NET_F_MQ))) {
> > > > > +        return 0;
> > > > > +    }
> > > > > +
> > > > > +    r = vhost_vdpa_cvq_is_isolated(device_fd, features, cvq_index * 2, errp);
> > > >
> > > > I think checking this once should be sufficient. That is to say, it
> > > > should be a bug if there's hardware that puts cvq in a dedicated group
> > > > in MQ but not in SQ.
> > > >
> > >
> > > This is checking that the NIC is not buggy :). Otherwise, we're giving
> > > the guest access to the CVQ shadow vring. And, currently, SVQ code
> > > assumes only QEMU can access it.
> >
> > Just to make sure we are on the same page: I meant the hardware
> > would be buggy if the isolation of cvq is not consistent between
> > single and multiqueue.
> >
> 
> Yes, I got you.
> 
> The problem with that particular bug is that we would handle the
> guest's vring with the wrong IOVA tree. Since QEMU no longer sanitizes
> those descriptors, the device could be used to write to QEMU memory.
> At this time only the SVQ vring and in buffers should be writable that
> way, so it's not a big deal.
>
> This can also happen if the device is buggy in other ways: for
> example, reporting that CVQ is isolated at VHOST_VDPA_GET_VRING_GROUP
> but then handling maps while ignoring the ASID parameter. There is no
> protection against that, so I agree this double check makes little
> sense.

Ok so you will repost with this check removed?

> > >
> > > But maybe this made more sense in previous versions, where the series
> > > also cached the cvq group here. If I understand you correctly, it is
> > > enough to check that CVQ is isolated in SQ, and assume it will be
> > > isolated also in MQ, right? I can modify the patch that way if you
> > > confirm this.
> >
> > I think so, or just negotiate with what hardware provides us and check.
> >
> 
> Always probing with SQ makes the code simpler, but let me know if you
> think there are advantages to probing otherwise.
> 
> Thanks!



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa
  2023-05-18 21:22           ` Michael S. Tsirkin
@ 2023-05-19  4:50             ` Eugenio Perez Martin
  0 siblings, 0 replies; 19+ messages in thread
From: Eugenio Perez Martin @ 2023-05-19  4:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	si-wei.liu, Shannon Nelson, Lei Yang, Laurent Vivier, Cindy Lu

On Thu, May 18, 2023 at 11:23 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, May 18, 2023 at 08:36:22AM +0200, Eugenio Perez Martin wrote:
> > On Thu, May 18, 2023 at 7:50 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Wed, May 17, 2023 at 2:30 PM Eugenio Perez Martin
> > > <eperezma@redhat.com> wrote:
> > > >
> > > > On Wed, May 17, 2023 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > > >
> > > > > > Evaluating it at start time instead of initialization time may make the
> > > > > > guest capable of dynamically adding or removing migration blockers.
> > > > > >
> > > > > > Also, moving to initialization reduces the number of ioctls in the
> > > > > > migration, reducing failure possibilities.
> > > > > >
> > > > > > As a drawback we need to check for CVQ isolation twice: one time with no
> > > > > > MQ negotiated and another one acking it, as long as the device supports
> > > > > > it.  This is because Vring ASID / group management is based on vq
> > > > > > indexes, but we don't know the index of CVQ before negotiating MQ.
> > > > > >
> > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > ---
> > > > > > v2: Take out the reset of the device from vhost_vdpa_cvq_is_isolated
> > > > > > v3: Only record cvq_isolated, true if the device has cvq isolated in
> > > > > >     both !MQ and MQ configurations.
> > > > > > ---
> > > > > >  net/vhost-vdpa.c | 178 +++++++++++++++++++++++++++++++++++------------
> > > > > >  1 file changed, 135 insertions(+), 43 deletions(-)
> > > > > >
> > > > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > > > > index 3fb833fe76..29054b77a9 100644
> > > > > > --- a/net/vhost-vdpa.c
> > > > > > +++ b/net/vhost-vdpa.c
> > > > > > @@ -43,6 +43,10 @@ typedef struct VhostVDPAState {
> > > > > >
> > > > > >      /* The device always have SVQ enabled */
> > > > > >      bool always_svq;
> > > > > > +
> > > > > > +    /* The device can isolate CVQ in its own ASID */
> > > > > > +    bool cvq_isolated;
> > > > > > +
> > > > > >      bool started;
> > > > > >  } VhostVDPAState;
> > > > > >
> > > > > > @@ -362,15 +366,8 @@ static NetClientInfo net_vhost_vdpa_info = {
> > > > > >          .check_peer_type = vhost_vdpa_check_peer_type,
> > > > > >  };
> > > > > >
> > > > > > -/**
> > > > > > - * Get vring virtqueue group
> > > > > > - *
> > > > > > - * @device_fd  vdpa device fd
> > > > > > - * @vq_index   Virtqueue index
> > > > > > - *
> > > > > > - * Return -errno in case of error, or vq group if success.
> > > > > > - */
> > > > > > -static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> > > > > > +static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
> > > > > > +                                          Error **errp)
> > > > > >  {
> > > > > >      struct vhost_vring_state state = {
> > > > > >          .index = vq_index,
> > > > > > @@ -379,8 +376,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> > > > > >
> > > > > >      if (unlikely(r < 0)) {
> > > > > >          r = -errno;
> > > > > > -        error_report("Cannot get VQ %u group: %s", vq_index,
> > > > > > -                     g_strerror(errno));
> > > > > > +        error_setg_errno(errp, errno, "Cannot get VQ %u group", vq_index);
> > > > > >          return r;
> > > > > >      }
> > > > > >
> > > > > > @@ -480,9 +476,9 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> > > > > >  {
> > > > > >      VhostVDPAState *s, *s0;
> > > > > >      struct vhost_vdpa *v;
> > > > > > -    uint64_t backend_features;
> > > > > >      int64_t cvq_group;
> > > > > > -    int cvq_index, r;
> > > > > > +    int r;
> > > > > > +    Error *err = NULL;
> > > > > >
> > > > > >      assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > > > > >
> > > > > > @@ -502,41 +498,22 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
> > > > > >      /*
> > > > > >       * If we early return in these cases SVQ will not be enabled. The migration
> > > > > >       * will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
> > > > > > -     *
> > > > > > -     * Calling VHOST_GET_BACKEND_FEATURES as they are not available in v->dev
> > > > > > -     * yet.
> > > > > >       */
> > > > > > -    r = ioctl(v->device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> > > > > > -    if (unlikely(r < 0)) {
> > > > > > -        error_report("Cannot get vdpa backend_features: %s(%d)",
> > > > > > -            g_strerror(errno), errno);
> > > > > > -        return -1;
> > > > > > +    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > > > > > +        return 0;
> > > > > >      }
> > > > > > -    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID)) ||
> > > > > > -        !vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> > > > > > +
> > > > > > +    if (!s->cvq_isolated) {
> > > > > >          return 0;
> > > > > >      }
> > > > > >
> > > > > > -    /*
> > > > > > -     * Check if all the virtqueues of the virtio device are in a different vq
> > > > > > -     * than the last vq. VQ group of last group passed in cvq_group.
> > > > > > -     */
> > > > > > -    cvq_index = v->dev->vq_index_end - 1;
> > > > > > -    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
> > > > > > +    cvq_group = vhost_vdpa_get_vring_group(v->device_fd,
> > > > > > +                                           v->dev->vq_index_end - 1,
> > > > > > +                                           &err);
> > > > > >      if (unlikely(cvq_group < 0)) {
> > > > > > +        error_report_err(err);
> > > > > >          return cvq_group;
> > > > > >      }
> > > > > > -    for (int i = 0; i < cvq_index; ++i) {
> > > > > > -        int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
> > > > > > -
> > > > > > -        if (unlikely(group < 0)) {
> > > > > > -            return group;
> > > > > > -        }
> > > > > > -
> > > > > > -        if (group == cvq_group) {
> > > > > > -            return 0;
> > > > > > -        }
> > > > > > -    }
> > > > > >
> > > > > >      r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
> > > > > >      if (unlikely(r < 0)) {
> > > > > > @@ -799,6 +776,111 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
> > > > > >      .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
> > > > > >  };
> > > > > >
> > > > > > +/**
> > > > > > + * Probe the device to check whether the control virtqueue is isolated.
> > > > > > + *
> > > > > > + * @device_fd vhost-vdpa file descriptor
> > > > > > + * @features features to negotiate
> > > > > > + * @cvq_index Control vq index
> > > > > > + *
> > > > > > + * Returns <0 in case of error, 0 if false and 1 if true
> > > > > > + */
> > > > > > +static int vhost_vdpa_cvq_is_isolated(int device_fd, uint64_t features,
> > > > > > +                                      unsigned cvq_index, Error **errp)
> > > > > > +{
> > > > > > +    int64_t cvq_group;
> > > > > > +    int r;
> > > > > > +
> > > > > > +    r = vhost_vdpa_set_dev_features_fd(device_fd, features);
> > > > > > +    if (unlikely(r < 0)) {
> > > > > > +        error_setg_errno(errp, -r, "Cannot set device features");
> > > > > > +        return r;
> > > > > > +    }
> > > > > > +
> > > > > > +    cvq_group = vhost_vdpa_get_vring_group(device_fd, cvq_index, errp);
> > > > > > +    if (unlikely(cvq_group < 0)) {
> > > > > > +        return cvq_group;
> > > > > > +    }
> > > > > > +
> > > > > > +    for (int i = 0; i < cvq_index; ++i) {
> > > > > > +        int64_t group = vhost_vdpa_get_vring_group(device_fd, i, errp);
> > > > > > +
> > > > > > +        if (unlikely(group < 0)) {
> > > > > > +            return group;
> > > > > > +        }
> > > > > > +
> > > > > > +        if (group == (int64_t)cvq_group) {
> > > > > > +            return 0;
> > > > > > +        }
> > > > > > +    }
> > > > > > +
> > > > > > +    return 1;
> > > > > > +}
> > > > > > +
> > > > > > +/**
> > > > > > + * Probe if CVQ is isolated when the device is MQ and when it is not MQ
> > > > > > + *
> > > > > > + * @device_fd         The vdpa device fd
> > > > > > + * @features          Features offered by the device.
> > > > > > + * @cvq_index         The control vq index if mq is negotiated. Ignored
> > > > > > + *                    otherwise.
> > > > > > + *
> > > > > > + * Returns <0 in case of failure, 0 if false and 1 if true.
> > > > > > + */
> > > > > > +static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
> > > > > > +                                          int cvq_index, Error **errp)
> > > > > > +{
> > > > > > +    uint64_t backend_features;
> > > > > > +    int r;
> > > > > > +
> > > > > > +    ERRP_GUARD();
> > > > > > +
> > > > > > +    r = ioctl(device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> > > > > > +    if (unlikely(r < 0)) {
> > > > > > +        error_setg_errno(errp, errno, "Cannot get vdpa backend_features");
> > > > > > +        return r;
> > > > > > +    }
> > > > > > +
> > > > > > +    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
> > > > > > +        return 0;
> > > > > > +    }
> > > > > > +
> > > > > > +    r = vhost_vdpa_cvq_is_isolated(device_fd,
> > > > > > +                                   features & ~BIT_ULL(VIRTIO_NET_F_MQ), 2,
> > > > > > +                                   errp);
> > > > > > +    if (unlikely(r < 0)) {
> > > > > > +        if (r != -ENOTSUP) {
> > > > > > +            return r;
> > > > > > +        }
> > > > > > +
> > > > > > +        /*
> > > > > > +         * The kernel reports VHOST_BACKEND_F_IOTLB_ASID if the vdpa frontend
> > > > > > +         * supports ASID even if the parent driver does not.  The CVQ cannot be
> > > > > > +         * isolated in this case.
> > > > > > +         */
> > > > > > +        error_free(*errp);
> > > > > > +        *errp = NULL;
> > > > > > +        return 0;
> > > > > > +    }
> > > > > > +
> > > > > > +    if (r == 0) {
> > > > > > +        return 0;
> > > > > > +    }
> > > > > > +
> > > > > > +    vhost_vdpa_reset_status_fd(device_fd);
> > > > > > +    if (!(features & BIT_ULL(VIRTIO_NET_F_MQ))) {
> > > > > > +        return 0;
> > > > > > +    }
> > > > > > +
> > > > > > +    r = vhost_vdpa_cvq_is_isolated(device_fd, features, cvq_index * 2, errp);
> > > > >
> > > > > I think checking this once should be sufficient. That is to say, it
> > > > > would be a bug if there's hardware that puts cvq in a dedicated group
> > > > > in MQ but not in SQ.
> > > > >
> > > >
> > > > This is checking that the NIC is not buggy :). Otherwise, we're giving
> > > > the guest access to the CVQ shadow vring. And, currently, SVQ code
> > > > assumes only QEMU can access it.
> > >
> > > Just to make sure we are on the same page: I meant the hardware
> > > would be buggy if the isolation of cvq is not consistent between
> > > single and multiqueue.
> > >
> >
> > Yes, I got you.
> >
> > The problem with that particular bug is that we would handle the
> > guest's vring with the wrong IOVA tree. Since QEMU no longer sanitizes
> > those descriptors, the device could be used to write to QEMU memory.
> > At this time only the SVQ vring and in buffers should be writable this
> > way, so it's not a big deal.
> >
> > This can also happen if the device is buggy in other ways. For
> > example, reporting that CVQ is isolated via VHOST_VDPA_GET_VRING_GROUP
> > but then ignoring the ASID parameter when handling maps. There is no
> > protection against that, so I agree this double check makes little sense.
>
> Ok so you will repost with this check removed?
>

Yes, I'll repost it.

Thanks!
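As a standalone illustration of the check under discussion, the group-comparison loop of vhost_vdpa_cvq_is_isolated() can be sketched against a mock per-vq group lookup. The get_group_fn type and the two example topologies are hypothetical stand-ins for the VHOST_VDPA_GET_VRING_GROUP ioctl, not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for VHOST_VDPA_GET_VRING_GROUP: maps a vq index
 * to its vq group id; a negative value models an ioctl failure. */
typedef int64_t (*get_group_fn)(unsigned vq_index);

/* Mirror of the probe loop: the CVQ is isolated iff no data vq (indexes
 * 0..cvq_index-1) shares the CVQ's group.  Returns 1 if isolated, 0 if
 * not, <0 on lookup error. */
static int cvq_is_isolated(get_group_fn get_group, unsigned cvq_index)
{
    int64_t cvq_group = get_group(cvq_index);

    if (cvq_group < 0) {
        return (int)cvq_group;
    }

    for (unsigned i = 0; i < cvq_index; ++i) {
        int64_t group = get_group(i);

        if (group < 0) {
            return (int)group;
        }
        if (group == cvq_group) {
            return 0; /* a data vq shares the CVQ group */
        }
    }

    return 1;
}

/* Two example topologies: data vqs in group 0 with the CVQ in group 1,
 * and everything in a single group 0. */
static int64_t isolated_topology(unsigned vq) { return vq == 2 ? 1 : 0; }
static int64_t flat_topology(unsigned vq) { (void)vq; return 0; }
```

Probing only the single-queue layout then reduces to one cvq_is_isolated() call with cvq_index == 2.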

> > > >
> > > > But maybe this made more sense in previous versions, where the series
> > > > also cached the cvq group here. If I understand you correctly, it is
> > > > enough to check that CVQ is isolated in SQ, and assume it will also be
> > > > isolated in MQ, right? I can modify the patch that way if you
> > > > confirm this.
> > >
> > > I think so, or just negotiate with what hardware provides us and check.
> > >
> >
> > Always probing with SQ makes the code simpler, but let me know if you
> > think there are advantages to probing otherwise.
> >
> > Thanks!
>




* Re: [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd
  2023-05-17  5:49       ` Jason Wang
@ 2023-05-24 17:36         ` Eugenio Perez Martin
  2023-05-26  4:10           ` Jason Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Eugenio Perez Martin @ 2023-05-24 17:36 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	si-wei.liu, Shannon Nelson, Lei Yang, Michael S. Tsirkin,
	Laurent Vivier, Cindy Lu

On Wed, May 17, 2023 at 7:49 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, May 17, 2023 at 1:46 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Wed, May 17, 2023 at 5:14 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > >
> > > > This allows resetting a vhost-vdpa device from external subsystems like
> > > > vhost-net, which does not have any struct vhost_dev by the time we
> > > > need to use it.
> > > >
> > > > It is used in subsequent patches to negotiate features
> > > > and probe for CVQ ASID isolation.
> > > >
> > > > Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
> > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > ---
> > > >  include/hw/virtio/vhost-vdpa.h |  1 +
> > > >  hw/virtio/vhost-vdpa.c         | 58 +++++++++++++++++++++++-----------
> > > >  2 files changed, 41 insertions(+), 18 deletions(-)
> > > >
> > > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > > index c278a2a8de..28de7da91e 100644
> > > > --- a/include/hw/virtio/vhost-vdpa.h
> > > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > > @@ -54,6 +54,7 @@ typedef struct vhost_vdpa {
> > > >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > > >  } VhostVDPA;
> > > >
> > > > +void vhost_vdpa_reset_status_fd(int fd);
> > > >  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range);
> > > >
> > > >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > > index bbabea18f3..7a2053b8d9 100644
> > > > --- a/hw/virtio/vhost-vdpa.c
> > > > +++ b/hw/virtio/vhost-vdpa.c
> > > > @@ -335,38 +335,45 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> > > >      .region_del = vhost_vdpa_listener_region_del,
> > > >  };
> > > >
> > > > -static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > > > -                             void *arg)
> > > > +static int vhost_vdpa_dev_fd(const struct vhost_dev *dev)
> > > >  {
> > > >      struct vhost_vdpa *v = dev->opaque;
> > > > -    int fd = v->device_fd;
> > > > -    int ret;
> > > >
> > > >      assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> > > > +    return v->device_fd;
> > > > +}
> > >
> > > Nit: unless the vhost_dev structure is opaque to the upper layer, I
> > > don't see any advantage for having a dedicated indirect helper to get
> > > device_fd.
> > >
> >
> > The purpose was to not duplicate the assert, but sure it's not mandatory.
>
> Ok, but I think for new codes, we'd better avoid assert as much as possible.
>

I'll remove it, thanks!

> >
> > > > +
> > > > +static int vhost_vdpa_call_fd(int fd, unsigned long int request, void *arg)
> > > > +{
> > > > +    int ret = ioctl(fd, request, arg);
> > > >
> > > > -    ret = ioctl(fd, request, arg);
> > > >      return ret < 0 ? -errno : ret;
> > > >  }
> > > >
> > > > -static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > > > +static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > > > +                           void *arg)
> > > > +{
> > > > +    return vhost_vdpa_call_fd(vhost_vdpa_dev_fd(dev), request, arg);
> > > > +}
> > > > +
> > > > +static int vhost_vdpa_add_status_fd(int fd, uint8_t status)
> > > >  {
> > > >      uint8_t s;
> > > >      int ret;
> > > >
> > > > -    trace_vhost_vdpa_add_status(dev, status);
> > > > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> > > > +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
> > > >      if (ret < 0) {
> > > >          return ret;
> > > >      }
> > > >
> > > >      s |= status;
> > > >
> > > > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
> > > > +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &s);
> > > >      if (ret < 0) {
> > > >          return ret;
> > > >      }
> > > >
> > > > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> > > > +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
> > > >      if (ret < 0) {
> > > >          return ret;
> > > >      }
> > > > @@ -378,6 +385,12 @@ static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > > >      return 0;
> > > >  }
> > > >
> > > > +static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > > > +{
> > > > +    trace_vhost_vdpa_add_status(dev, status);
> > > > +    return vhost_vdpa_add_status_fd(vhost_vdpa_dev_fd(dev), status);
> > > > +}
> > > > +
> > > >  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range)
> > > >  {
> > > >      int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
> > > > @@ -709,16 +722,20 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> > > >      return ret;
> > > >  }
> > > >
> > > > +static int vhost_vdpa_reset_device_fd(int fd)
> > > > +{
> > > > +    uint8_t status = 0;
> > > > +
> > > > +    return vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &status);
> > > > +}
> > > > +
> > > >  static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> > > >  {
> > > >      struct vhost_vdpa *v = dev->opaque;
> > > > -    int ret;
> > > > -    uint8_t status = 0;
> > > >
> > > > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> > > > -    trace_vhost_vdpa_reset_device(dev);
> > > >      v->suspended = false;
> > > > -    return ret;
> > > > +    trace_vhost_vdpa_reset_device(dev);
> > > > +    return vhost_vdpa_reset_device_fd(vhost_vdpa_dev_fd(dev));
> > > >  }
> > > >
> > > >  static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
> > > > @@ -1170,6 +1187,13 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> > > >      return 0;
> > > >  }
> > > >
> > > > +void vhost_vdpa_reset_status_fd(int fd)
> > > > +{
> > > > +    vhost_vdpa_reset_device_fd(fd);
> > > > +    vhost_vdpa_add_status_fd(fd, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > > > +                                 VIRTIO_CONFIG_S_DRIVER);
> > >
> > > I would like to rename this function since it does more than just reset.
> > >
> >
> > vhost_vdpa_set_ready?
>
> But it doesn't set DRIVER_OK so it might be somehow misleading.
>

I'm still thinking about this. I don't like the original name either
(vhost_vdpa_reset_status), but that one was already in the code before
this patch. Renaming it would make this patch's scope bigger.

The main idea of this function is to stop duplicating code between
net/vhost-vdpa and hw/virtio/vdpa. One possibility is to not export
vhost_vdpa_reset_status_fd but export vhost_vdpa_reset_device_fd and
vhost_vdpa_add_status_fd individually. Does that seem better?

The other extreme is to not export anything and call the raw ioctl, as
the previous net/vhost-vdpa code did. There is certainly a middle
ground, like exporting only one of them.

What do you think?

Thanks!
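For reference, the sequence that vhost_vdpa_reset_status_fd() performs — a full reset (writing status 0) followed by re-acknowledging ACKNOWLEDGE | DRIVER through the read-modify-write-verify helper — can be sketched against a mock status register. The mock register and helper names here are illustrative only, not the QEMU API:

```c
#include <assert.h>
#include <stdint.h>

#define ACKNOWLEDGE 0x1 /* VIRTIO_CONFIG_S_ACKNOWLEDGE */
#define DRIVER      0x2 /* VIRTIO_CONFIG_S_DRIVER */

/* Mock status byte standing in for the device behind the vhost-vdpa fd. */
static uint8_t mock_status;

static int mock_get_status(uint8_t *s) { *s = mock_status; return 0; }
static int mock_set_status(uint8_t s) { mock_status = s; return 0; }

/* Read-modify-write-verify, as in vhost_vdpa_add_status_fd(): OR the new
 * bits in and confirm the device kept them. */
static int add_status(uint8_t status)
{
    uint8_t s;
    int ret = mock_get_status(&s);

    if (ret < 0) {
        return ret;
    }
    ret = mock_set_status(s | status);
    if (ret < 0) {
        return ret;
    }
    ret = mock_get_status(&s);
    if (ret < 0) {
        return ret;
    }
    return (s & status) == status ? 0 : -5; /* -EIO: device dropped bits */
}

/* The combined operation under discussion: reset, then re-acknowledge. */
static void reset_status(void)
{
    mock_set_status(0); /* SET_STATUS with 0 is the uAPI reset */
    add_status(ACKNOWLEDGE | DRIVER);
}
```

Splitting the exported surface then amounts to choosing whether callers see the combined reset_status() or the two primitives it composes.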

> Thanks
>
> >
> > Thanks!
> >
> > > Thanks
> > >
> > > > +}
> > > > +
> > > >  static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> > > >  {
> > > >      struct vhost_vdpa *v = dev->opaque;
> > > > @@ -1178,9 +1202,7 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> > > >          return;
> > > >      }
> > > >
> > > > -    vhost_vdpa_reset_device(dev);
> > > > -    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > > > -                               VIRTIO_CONFIG_S_DRIVER);
> > > > +    vhost_vdpa_reset_status_fd(vhost_vdpa_dev_fd(dev));
> > > >      memory_listener_unregister(&v->listener);
> > > >  }
> > > >
> > > > --
> > > > 2.31.1
> > > >
> > >
> >
>




* Re: [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd
  2023-05-24 17:36         ` Eugenio Perez Martin
@ 2023-05-26  4:10           ` Jason Wang
  0 siblings, 0 replies; 19+ messages in thread
From: Jason Wang @ 2023-05-26  4:10 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Parav Pandit, Zhu Lingshan, longpeng2,
	Stefano Garzarella, Gautam Dawar, Gonglei (Arei),
	Harpreet Singh Anand, alvaro.karsz, Liuxiangdong, Dragos Tatulea,
	si-wei.liu, Shannon Nelson, Lei Yang, Michael S. Tsirkin,
	Laurent Vivier, Cindy Lu

On Thu, May 25, 2023 at 1:37 AM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Wed, May 17, 2023 at 7:49 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, May 17, 2023 at 1:46 PM Eugenio Perez Martin
> > <eperezma@redhat.com> wrote:
> > >
> > > On Wed, May 17, 2023 at 5:14 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > > >
> > > > > This allows resetting a vhost-vdpa device from external subsystems like
> > > > > vhost-net, which does not have any struct vhost_dev by the time we
> > > > > need to use it.
> > > > >
> > > > > It is used in subsequent patches to negotiate features
> > > > > and probe for CVQ ASID isolation.
> > > > >
> > > > > Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
> > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > ---
> > > > >  include/hw/virtio/vhost-vdpa.h |  1 +
> > > > >  hw/virtio/vhost-vdpa.c         | 58 +++++++++++++++++++++++-----------
> > > > >  2 files changed, 41 insertions(+), 18 deletions(-)
> > > > >
> > > > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > > > index c278a2a8de..28de7da91e 100644
> > > > > --- a/include/hw/virtio/vhost-vdpa.h
> > > > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > > > @@ -54,6 +54,7 @@ typedef struct vhost_vdpa {
> > > > >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > > > >  } VhostVDPA;
> > > > >
> > > > > +void vhost_vdpa_reset_status_fd(int fd);
> > > > >  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range);
> > > > >
> > > > >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> > > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > > > index bbabea18f3..7a2053b8d9 100644
> > > > > --- a/hw/virtio/vhost-vdpa.c
> > > > > +++ b/hw/virtio/vhost-vdpa.c
> > > > > @@ -335,38 +335,45 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> > > > >      .region_del = vhost_vdpa_listener_region_del,
> > > > >  };
> > > > >
> > > > > -static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > > > > -                             void *arg)
> > > > > +static int vhost_vdpa_dev_fd(const struct vhost_dev *dev)
> > > > >  {
> > > > >      struct vhost_vdpa *v = dev->opaque;
> > > > > -    int fd = v->device_fd;
> > > > > -    int ret;
> > > > >
> > > > >      assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> > > > > +    return v->device_fd;
> > > > > +}
> > > >
> > > > Nit: unless the vhost_dev structure is opaque to the upper layer, I
> > > > don't see any advantage for having a dedicated indirect helper to get
> > > > device_fd.
> > > >
> > >
> > > The purpose was to not duplicate the assert, but sure it's not mandatory.
> >
> > Ok, but I think for new codes, we'd better avoid assert as much as possible.
> >
>
> I'll remove it, thanks!
>
> > >
> > > > > +
> > > > > +static int vhost_vdpa_call_fd(int fd, unsigned long int request, void *arg)
> > > > > +{
> > > > > +    int ret = ioctl(fd, request, arg);
> > > > >
> > > > > -    ret = ioctl(fd, request, arg);
> > > > >      return ret < 0 ? -errno : ret;
> > > > >  }
> > > > >
> > > > > -static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > > > > +static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > > > > +                           void *arg)
> > > > > +{
> > > > > +    return vhost_vdpa_call_fd(vhost_vdpa_dev_fd(dev), request, arg);
> > > > > +}
> > > > > +
> > > > > +static int vhost_vdpa_add_status_fd(int fd, uint8_t status)
> > > > >  {
> > > > >      uint8_t s;
> > > > >      int ret;
> > > > >
> > > > > -    trace_vhost_vdpa_add_status(dev, status);
> > > > > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> > > > > +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
> > > > >      if (ret < 0) {
> > > > >          return ret;
> > > > >      }
> > > > >
> > > > >      s |= status;
> > > > >
> > > > > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
> > > > > +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &s);
> > > > >      if (ret < 0) {
> > > > >          return ret;
> > > > >      }
> > > > >
> > > > > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> > > > > +    ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
> > > > >      if (ret < 0) {
> > > > >          return ret;
> > > > >      }
> > > > > @@ -378,6 +385,12 @@ static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > > > >      return 0;
> > > > >  }
> > > > >
> > > > > +static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > > > > +{
> > > > > +    trace_vhost_vdpa_add_status(dev, status);
> > > > > +    return vhost_vdpa_add_status_fd(vhost_vdpa_dev_fd(dev), status);
> > > > > +}
> > > > > +
> > > > >  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range *iova_range)
> > > > >  {
> > > > >      int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
> > > > > @@ -709,16 +722,20 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> > > > >      return ret;
> > > > >  }
> > > > >
> > > > > +static int vhost_vdpa_reset_device_fd(int fd)
> > > > > +{
> > > > > +    uint8_t status = 0;
> > > > > +
> > > > > +    return vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &status);
> > > > > +}
> > > > > +
> > > > >  static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> > > > >  {
> > > > >      struct vhost_vdpa *v = dev->opaque;
> > > > > -    int ret;
> > > > > -    uint8_t status = 0;
> > > > >
> > > > > -    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> > > > > -    trace_vhost_vdpa_reset_device(dev);
> > > > >      v->suspended = false;
> > > > > -    return ret;
> > > > > +    trace_vhost_vdpa_reset_device(dev);
> > > > > +    return vhost_vdpa_reset_device_fd(vhost_vdpa_dev_fd(dev));
> > > > >  }
> > > > >
> > > > >  static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
> > > > > @@ -1170,6 +1187,13 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> > > > >      return 0;
> > > > >  }
> > > > >
> > > > > +void vhost_vdpa_reset_status_fd(int fd)
> > > > > +{
> > > > > +    vhost_vdpa_reset_device_fd(fd);
> > > > > +    vhost_vdpa_add_status_fd(fd, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > > > > +                                 VIRTIO_CONFIG_S_DRIVER);
> > > >
> > > > I would like to rename this function since it does more than just reset.
> > > >
> > >
> > > vhost_vdpa_set_ready?
> >
> > But it doesn't set DRIVER_OK so it might be somehow misleading.
> >
>
> I'm still thinking about this. I don't like the original name either
> (vhost_vdpa_reset_status), but that one was already in the code before
> this patch. Renaming it would make this patch's scope bigger.
>
> The main idea of this function is to stop duplicating code between
> net/vhost-vdpa and hw/virtio/vdpa. One possibility is to not export
> vhost_vdpa_reset_status_fd but export vhost_vdpa_reset_device_fd and
> vhost_vdpa_add_status_fd individually. Does that seem better?

Probably not; the reason is that at the uAPI level, we don't actually
differentiate reset from other status writes:

#define VHOST_VDPA_SET_STATUS           _IOW(VHOST_VIRTIO, 0x72, __u8)

>
> The other extreme is to not export anything and call the raw ioctl, as
> the previous net/vhost-vdpa code did. There is certainly a middle
> ground, like exporting only one of them.

This might be better.

Thanks

>
> What do you think?
>
> Thanks!
>
> > Thanks
> >
> > >
> > > Thanks!
> > >
> > > > Thanks
> > > >
> > > > > +}
> > > > > +
> > > > >  static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> > > > >  {
> > > > >      struct vhost_vdpa *v = dev->opaque;
> > > > > @@ -1178,9 +1202,7 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> > > > >          return;
> > > > >      }
> > > > >
> > > > > -    vhost_vdpa_reset_device(dev);
> > > > > -    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > > > > -                               VIRTIO_CONFIG_S_DRIVER);
> > > > > +    vhost_vdpa_reset_status_fd(vhost_vdpa_dev_fd(dev));
> > > > >      memory_listener_unregister(&v->listener);
> > > > >  }
> > > > >
> > > > > --
> > > > > 2.31.1
> > > > >
> > > >
> > >
> >
>




end of thread, other threads:[~2023-05-26  4:11 UTC | newest]

Thread overview: 19+ messages
2023-05-09 15:44 [PATCH v3 0/5] Move ASID test to vhost-vdpa net initialization Eugenio Pérez
2023-05-09 15:44 ` [PATCH v3 1/5] vdpa: Remove status in reset tracing Eugenio Pérez
2023-05-09 15:44 ` [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd Eugenio Pérez
2023-05-17  3:14   ` Jason Wang
2023-05-17  5:46     ` Eugenio Perez Martin
2023-05-17  5:49       ` Jason Wang
2023-05-24 17:36         ` Eugenio Perez Martin
2023-05-26  4:10           ` Jason Wang
2023-05-09 15:44 ` [PATCH v3 3/5] vdpa: add vhost_vdpa_set_dev_features_fd Eugenio Pérez
2023-05-09 15:44 ` [PATCH v3 4/5] vdpa: return errno in vhost_vdpa_get_vring_group error Eugenio Pérez
2023-05-09 15:44 ` [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa Eugenio Pérez
2023-05-17  3:59   ` Jason Wang
2023-05-17  6:29     ` Eugenio Perez Martin
2023-05-18  5:49       ` Jason Wang
2023-05-18  6:36         ` Eugenio Perez Martin
2023-05-18  6:58           ` Jason Wang
2023-05-18 21:22           ` Michael S. Tsirkin
2023-05-19  4:50             ` Eugenio Perez Martin
2023-05-17  6:18 ` [PATCH v3 0/5] Move ASID test to vhost-vdpa net initialization Lei Yang
