* [PATCH v4 0/2] Move ASID test to vhost-vdpa net initialization
@ 2023-05-26 15:31 Eugenio Pérez
  2023-05-26 15:31 ` [PATCH v4 1/2] vdpa: return errno in vhost_vdpa_get_vring_group error Eugenio Pérez
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Eugenio Pérez @ 2023-05-26 15:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Gonglei (Arei),
	longpeng2, Shannon Nelson, Laurent Vivier, si-wei.liu, Cindy Lu,
	Lei Yang, Harpreet Singh Anand, Dragos Tatulea,
	Stefano Garzarella, Gautam Dawar, Jason Wang, Liuxiangdong,
	Michael S. Tsirkin, alvaro.karsz, Zhu Lingshan

QEMU v8.0 is able to switch dynamically between vhost-vdpa passthrough
and SVQ mode as long as the net device does not have CVQ.  The net device
state followed (and migrated) by CVQ requires special care.

A prerequisite for adding CVQ to that framework is determining at
initialization time whether a device with CVQ is migratable.  The solution is
to always shadow only the CVQ, using vq groups and ASIDs for that.

However, the current QEMU version only checks the ASID at device start (i.e.
when the driver sets the DRIVER_OK status bit), not at device initialization.
A check at initialization time is required; otherwise, the guest would be able
to set and remove migration blockers at will [1].

This series is a prerequisite for migration of vhost-vdpa net devices with
CVQ.  However, it already makes sense on its own, as it reduces the number of
ioctls at migration time, shrinking the error paths there.

[1] https://lore.kernel.org/qemu-devel/2616f0cd-f9e8-d183-ea78-db1be4825d9c@redhat.com/
---
v4:
* Only probe one of MQ or !MQ.
* Merge vhost_vdpa_cvq_is_isolated in vhost_vdpa_probe_cvq_isolation
* Call ioctl directly instead of adding functions.

v3:
* Only record cvq_isolated, true if the device has CVQ isolated in both !MQ
  and MQ configurations.
* Drop the cache of the cvq group; it can be done on top.

v2:
* Take out the reset of the device from vhost_vdpa_cvq_is_isolated
  (reported by Lei Yang).
* Expand patch messages per Stefano G.'s questions.

Eugenio Pérez (2):
  vdpa: return errno in vhost_vdpa_get_vring_group error
  vdpa: move CVQ isolation check to net_init_vhost_vdpa

 net/vhost-vdpa.c | 147 ++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 112 insertions(+), 35 deletions(-)

-- 
2.31.1





* [PATCH v4 1/2] vdpa: return errno in vhost_vdpa_get_vring_group error
  2023-05-26 15:31 [PATCH v4 0/2] Move ASID test to vhost-vdpa net initialization Eugenio Pérez
@ 2023-05-26 15:31 ` Eugenio Pérez
  2023-05-26 15:31 ` [PATCH v4 2/2] vdpa: move CVQ isolation check to net_init_vhost_vdpa Eugenio Pérez
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Eugenio Pérez @ 2023-05-26 15:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Gonglei (Arei),
	longpeng2, Shannon Nelson, Laurent Vivier, si-wei.liu, Cindy Lu,
	Lei Yang, Harpreet Singh Anand, Dragos Tatulea,
	Stefano Garzarella, Gautam Dawar, Jason Wang, Liuxiangdong,
	Michael S. Tsirkin, alvaro.karsz, Zhu Lingshan

We need to be able to tell in the caller which error occurred, as some errors
are expected in a normal workflow.  In particular, with recent kernels that
advertise VHOST_BACKEND_F_IOTLB_ASID, the parent driver may still not support
vring groups.  In that case, -ENOTSUP is returned.

This is the case of vp_vdpa in Linux 6.2.

Later patches in this series will use that information to decide whether to
abort.  They also properly return an errp instead of printing with
error_report.
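As a hedged illustration of the caller pattern this change enables (the stub
and helper below are hypothetical; only the -ENOTSUP convention comes from the
patch):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for vhost_vdpa_get_vring_group(): with a parent
 * driver that lacks vring-group support, the ioctl fails and the function
 * now propagates -ENOTSUP instead of a bare -1. */
static int64_t fake_get_vring_group(int parent_supports_groups)
{
    if (!parent_supports_groups) {
        return -ENOTSUP;    /* expected in a normal workflow */
    }
    return 0;               /* a valid vq group id */
}

/* With the errno visible, a caller can separate the expected case
 * ("no isolation, continue without shadowing CVQ") from fatal errors. */
static int classify(int64_t r)
{
    if (r >= 0) {
        return 1;                        /* got a vq group */
    }
    return r == -ENOTSUP ? 0 : -1;       /* 0: not supported, -1: abort */
}
```

With a plain -1 return, both branches of `classify` would be
indistinguishable, which is the motivation for this patch.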

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 37cdc84562..3fb833fe76 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -362,6 +362,14 @@ static NetClientInfo net_vhost_vdpa_info = {
         .check_peer_type = vhost_vdpa_check_peer_type,
 };
 
+/**
+ * Get vring virtqueue group
+ *
+ * @device_fd  vdpa device fd
+ * @vq_index   Virtqueue index
+ *
+ * Return -errno in case of error, or vq group if success.
+ */
 static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
 {
     struct vhost_vring_state state = {
@@ -370,6 +378,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
     int r = ioctl(device_fd, VHOST_VDPA_GET_VRING_GROUP, &state);
 
     if (unlikely(r < 0)) {
+        r = -errno;
         error_report("Cannot get VQ %u group: %s", vq_index,
                      g_strerror(errno));
         return r;
-- 
2.31.1




* [PATCH v4 2/2] vdpa: move CVQ isolation check to net_init_vhost_vdpa
  2023-05-26 15:31 [PATCH v4 0/2] Move ASID test to vhost-vdpa net initialization Eugenio Pérez
  2023-05-26 15:31 ` [PATCH v4 1/2] vdpa: return errno in vhost_vdpa_get_vring_group error Eugenio Pérez
@ 2023-05-26 15:31 ` Eugenio Pérez
  2023-05-31  1:44   ` Jason Wang
  2023-06-02 11:29 ` [PATCH v4 0/2] Move ASID test to vhost-vdpa net initialization Lei Yang
  2023-06-23 13:04 ` Peter Maydell
  3 siblings, 1 reply; 7+ messages in thread
From: Eugenio Pérez @ 2023-05-26 15:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Gonglei (Arei),
	longpeng2, Shannon Nelson, Laurent Vivier, si-wei.liu, Cindy Lu,
	Lei Yang, Harpreet Singh Anand, Dragos Tatulea,
	Stefano Garzarella, Gautam Dawar, Jason Wang, Liuxiangdong,
	Michael S. Tsirkin, alvaro.karsz, Zhu Lingshan

Evaluating CVQ isolation at start time instead of initialization time would
let the guest dynamically add or remove migration blockers.

Also, moving the check to initialization reduces the number of ioctls during
migration, reducing failure possibilities.

As a drawback, we need to check for CVQ isolation twice: once with MQ not
negotiated and once acking it, as long as the device supports it.  This is
because vring ASID / group management is based on vq indexes, but we don't
know the index of CVQ before negotiating MQ.
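The vq-index dependency described above can be sketched concretely (a
hypothetical helper following the virtio-net virtqueue layout, not code from
the patch):

```c
#include <assert.h>
#include <stdbool.h>

/* virtio-net lays out virtqueues as rx0, tx0, rx1, tx1, ..., with the
 * control vq placed after all data vqs.  Without MQ the device exposes a
 * single data pair, so CVQ sits at index 2; with MQ negotiated it sits at
 * max_queue_pairs * 2.  Hence CVQ's index is unknown until the outcome of
 * the MQ feature negotiation is known. */
static unsigned cvq_index(bool mq_negotiated, unsigned max_queue_pairs)
{
    return mq_negotiated ? max_queue_pairs * 2 : 2;
}
```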

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
v2: Take out the reset of the device from vhost_vdpa_cvq_is_isolated
v3: Only record cvq_isolated, true if the device has CVQ isolated in
    both !MQ and MQ configurations.
v4: Only probe one of MQ or !MQ.
v4: Merge vhost_vdpa_cvq_is_isolated in vhost_vdpa_probe_cvq_isolation
    as there is only one call now.
v4: Call ioctl directly instead of adding functions.
---
 net/vhost-vdpa.c | 154 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 111 insertions(+), 43 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 3fb833fe76..0cafccd334 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -43,6 +43,10 @@ typedef struct VhostVDPAState {
 
     /* The device always have SVQ enabled */
     bool always_svq;
+
+    /* The device can isolate CVQ in its own ASID */
+    bool cvq_isolated;
+
     bool started;
 } VhostVDPAState;
 
@@ -362,15 +366,8 @@ static NetClientInfo net_vhost_vdpa_info = {
         .check_peer_type = vhost_vdpa_check_peer_type,
 };
 
-/**
- * Get vring virtqueue group
- *
- * @device_fd  vdpa device fd
- * @vq_index   Virtqueue index
- *
- * Return -errno in case of error, or vq group if success.
- */
-static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
+static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
+                                          Error **errp)
 {
     struct vhost_vring_state state = {
         .index = vq_index,
@@ -379,8 +376,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
 
     if (unlikely(r < 0)) {
         r = -errno;
-        error_report("Cannot get VQ %u group: %s", vq_index,
-                     g_strerror(errno));
+        error_setg_errno(errp, errno, "Cannot get VQ %u group", vq_index);
         return r;
     }
 
@@ -480,9 +476,9 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
 {
     VhostVDPAState *s, *s0;
     struct vhost_vdpa *v;
-    uint64_t backend_features;
     int64_t cvq_group;
-    int cvq_index, r;
+    int r;
+    Error *err = NULL;
 
     assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
 
@@ -502,41 +498,22 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
     /*
      * If we early return in these cases SVQ will not be enabled. The migration
      * will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
-     *
-     * Calling VHOST_GET_BACKEND_FEATURES as they are not available in v->dev
-     * yet.
      */
-    r = ioctl(v->device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
-    if (unlikely(r < 0)) {
-        error_report("Cannot get vdpa backend_features: %s(%d)",
-            g_strerror(errno), errno);
-        return -1;
+    if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
+        return 0;
     }
-    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID)) ||
-        !vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
+
+    if (!s->cvq_isolated) {
         return 0;
     }
 
-    /*
-     * Check if all the virtqueues of the virtio device are in a different vq
-     * than the last vq. VQ group of last group passed in cvq_group.
-     */
-    cvq_index = v->dev->vq_index_end - 1;
-    cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
+    cvq_group = vhost_vdpa_get_vring_group(v->device_fd,
+                                           v->dev->vq_index_end - 1,
+                                           &err);
     if (unlikely(cvq_group < 0)) {
+        error_report_err(err);
         return cvq_group;
     }
-    for (int i = 0; i < cvq_index; ++i) {
-        int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
-
-        if (unlikely(group < 0)) {
-            return group;
-        }
-
-        if (group == cvq_group) {
-            return 0;
-        }
-    }
 
     r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
     if (unlikely(r < 0)) {
@@ -799,6 +776,87 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
     .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
 };
 
+/**
+ * Probe if CVQ is isolated
+ *
+ * @device_fd         The vdpa device fd
+ * @features          Features offered by the device.
+ * @cvq_index         The control vq pair index
+ *
+ * Returns <0 in case of failure, 0 if false and 1 if true.
+ */
+static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
+                                          int cvq_index, Error **errp)
+{
+    uint64_t backend_features;
+    int64_t cvq_group;
+    uint8_t status = VIRTIO_CONFIG_S_ACKNOWLEDGE |
+                     VIRTIO_CONFIG_S_DRIVER |
+                     VIRTIO_CONFIG_S_FEATURES_OK;
+    int r;
+
+    ERRP_GUARD();
+
+    r = ioctl(device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
+    if (unlikely(r < 0)) {
+        error_setg_errno(errp, errno, "Cannot get vdpa backend_features");
+        return r;
+    }
+
+    if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID))) {
+        return 0;
+    }
+
+    r = ioctl(device_fd, VHOST_SET_FEATURES, &features);
+    if (unlikely(r)) {
+        error_setg_errno(errp, errno, "Cannot set features");
+        return r;
+    }
+
+    r = ioctl(device_fd, VHOST_VDPA_SET_STATUS, &status);
+    if (unlikely(r)) {
+        error_setg_errno(errp, errno, "Cannot set device status");
+        goto out;
+    }
+
+    cvq_group = vhost_vdpa_get_vring_group(device_fd, cvq_index, errp);
+    if (unlikely(cvq_group < 0)) {
+        if (cvq_group != -ENOTSUP) {
+            r = cvq_group;
+            goto out;
+        }
+
+        /*
+         * The kernel reports VHOST_BACKEND_F_IOTLB_ASID if the vdpa frontend
+         * supports ASID even if the parent driver does not.  The CVQ cannot
+         * be isolated in this case.
+         */
+        error_free(*errp);
+        *errp = NULL;
+        r = 0;
+        goto out;
+    }
+
+    for (int i = 0; i < cvq_index; ++i) {
+        int64_t group = vhost_vdpa_get_vring_group(device_fd, i, errp);
+        if (unlikely(group < 0)) {
+            r = group;
+            goto out;
+        }
+
+        if (group == (int64_t)cvq_group) {
+            r = 0;
+            goto out;
+        }
+    }
+
+    r = 1;
+
+out:
+    status = 0;
+    ioctl(device_fd, VHOST_VDPA_SET_STATUS, &status);
+    return r;
+}
+
 static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
                                        const char *device,
                                        const char *name,
@@ -808,16 +866,25 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
                                        bool is_datapath,
                                        bool svq,
                                        struct vhost_vdpa_iova_range iova_range,
-                                       uint64_t features)
+                                       uint64_t features,
+                                       Error **errp)
 {
     NetClientState *nc = NULL;
     VhostVDPAState *s;
     int ret = 0;
     assert(name);
+    int cvq_isolated;
+
     if (is_datapath) {
         nc = qemu_new_net_client(&net_vhost_vdpa_info, peer, device,
                                  name);
     } else {
+        cvq_isolated = vhost_vdpa_probe_cvq_isolation(vdpa_device_fd, features,
+                                                      queue_pair_index * 2, errp);
+        if (unlikely(cvq_isolated < 0)) {
+            return NULL;
+        }
+
         nc = qemu_new_net_control_client(&net_vhost_vdpa_cvq_info, peer,
                                          device, name);
     }
@@ -844,6 +911,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
 
         s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
         s->vhost_vdpa.shadow_vq_ops_opaque = s;
+        s->cvq_isolated = cvq_isolated;
 
         /*
          * TODO: We cannot migrate devices with CVQ as there is no way to set
@@ -972,7 +1040,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     for (i = 0; i < queue_pairs; i++) {
         ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
                                      vdpa_device_fd, i, 2, true, opts->x_svq,
-                                     iova_range, features);
+                                     iova_range, features, errp);
         if (!ncs[i])
             goto err;
     }
@@ -980,7 +1048,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
     if (has_cvq) {
         nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
                                  vdpa_device_fd, i, 1, false,
-                                 opts->x_svq, iova_range, features);
+                                 opts->x_svq, iova_range, features, errp);
         if (!nc)
             goto err;
     }
-- 
2.31.1




* Re: [PATCH v4 2/2] vdpa: move CVQ isolation check to net_init_vhost_vdpa
  2023-05-26 15:31 ` [PATCH v4 2/2] vdpa: move CVQ isolation check to net_init_vhost_vdpa Eugenio Pérez
@ 2023-05-31  1:44   ` Jason Wang
  0 siblings, 0 replies; 7+ messages in thread
From: Jason Wang @ 2023-05-31  1:44 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Gonglei (Arei),
	longpeng2, Shannon Nelson, Laurent Vivier, si-wei.liu, Cindy Lu,
	Lei Yang, Harpreet Singh Anand, Dragos Tatulea,
	Stefano Garzarella, Gautam Dawar, Liuxiangdong,
	Michael S. Tsirkin, alvaro.karsz, Zhu Lingshan

On Fri, May 26, 2023 at 11:31 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> [...]
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks





* Re: [PATCH v4 0/2] Move ASID test to vhost-vdpa net initialization
  2023-05-26 15:31 [PATCH v4 0/2] Move ASID test to vhost-vdpa net initialization Eugenio Pérez
  2023-05-26 15:31 ` [PATCH v4 1/2] vdpa: return errno in vhost_vdpa_get_vring_group error Eugenio Pérez
  2023-05-26 15:31 ` [PATCH v4 2/2] vdpa: move CVQ isolation check to net_init_vhost_vdpa Eugenio Pérez
@ 2023-06-02 11:29 ` Lei Yang
  2023-06-23 13:04 ` Peter Maydell
  3 siblings, 0 replies; 7+ messages in thread
From: Lei Yang @ 2023-06-02 11:29 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Gonglei (Arei),
	longpeng2, Shannon Nelson, Laurent Vivier, si-wei.liu, Cindy Lu,
	Harpreet Singh Anand, Dragos Tatulea, Stefano Garzarella,
	Gautam Dawar, Jason Wang, Liuxiangdong, Michael S. Tsirkin,
	alvaro.karsz, Zhu Lingshan

QE did a sanity test on v4 of this series using the vdpa_sim device;
everything is working fine.

Tested-by: Lei Yang <leiyang@redhat.com>

On Fri, May 26, 2023 at 11:31 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> [...]
>




* Re: [PATCH v4 0/2] Move ASID test to vhost-vdpa net initialization
  2023-05-26 15:31 [PATCH v4 0/2] Move ASID test to vhost-vdpa net initialization Eugenio Pérez
                   ` (2 preceding siblings ...)
  2023-06-02 11:29 ` [PATCH v4 0/2] Move ASID test to vhost-vdpa net initialization Lei Yang
@ 2023-06-23 13:04 ` Peter Maydell
  2023-06-25  9:11   ` Eugenio Perez Martin
  3 siblings, 1 reply; 7+ messages in thread
From: Peter Maydell @ 2023-06-23 13:04 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: qemu-devel, Parav Pandit, Gonglei (Arei),
	longpeng2, Shannon Nelson, Laurent Vivier, si-wei.liu, Cindy Lu,
	Lei Yang, Harpreet Singh Anand, Dragos Tatulea,
	Stefano Garzarella, Gautam Dawar, Jason Wang, Liuxiangdong,
	Michael S. Tsirkin, alvaro.karsz, Zhu Lingshan

On Fri, 26 May 2023 at 16:32, Eugenio Pérez <eperezma@redhat.com> wrote:
>
> [...]

Hi -- since you're working on the net_init_vhost_vdpa() code,
would you mind having a look at Coverity CID 1490785 ?
This is about a leak of the vdpa_device_fd. We fixed one
instance of that leak in commit aed5da45daf734ddc54 but
it looks like there's still a different leak:

    for (i = 0; i < queue_pairs; i++) {
        ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
                                     vdpa_device_fd, i, 2, true, opts->x_svq,
                                     iova_range, features);
        if (!ncs[i])
            goto err;
    }

    if (has_cvq) {
        nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
                                 vdpa_device_fd, i, 1, false,
                                 opts->x_svq, iova_range, features);
        if (!nc)
            goto err;
    }

    return 0;

In this code, if queue_pairs is non-zero we will use
vdpa_device_fd because we pass it to net_vhost_vdpa_init().
Similarly, if has_cvq is true then we'll also use the fd.
But if queue_pairs is zero and has_cvq is false then we
will not do anything with the fd, and will return 0,
leaking the file descriptor.

Maybe this combination is not supposed to happen, but
I can't see anything in vhost_vdpa_get_max_queue_pairs()
or in this function which guards against it. If it's
an invalid setup we should detect it and return an
error, I think.

thanks
-- PMM



* Re: [PATCH v4 0/2] Move ASID test to vhost-vdpa net initialization
  2023-06-23 13:04 ` Peter Maydell
@ 2023-06-25  9:11   ` Eugenio Perez Martin
  0 siblings, 0 replies; 7+ messages in thread
From: Eugenio Perez Martin @ 2023-06-25  9:11 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-devel, Parav Pandit, Gonglei (Arei),
	longpeng2, Shannon Nelson, Laurent Vivier, si-wei.liu, Cindy Lu,
	Lei Yang, Harpreet Singh Anand, Dragos Tatulea,
	Stefano Garzarella, Gautam Dawar, Jason Wang, Liuxiangdong,
	Michael S. Tsirkin, alvaro.karsz, Zhu Lingshan

On Fri, Jun 23, 2023 at 3:06 PM Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Fri, 26 May 2023 at 16:32, Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > QEMU v8.0 is able to switch dynamically between vhost-vdpa passthrough
> > and SVQ mode as long as the net device does not have CVQ.  The net device
> > state followed (and migrated) by CVQ requires special care.
> >
> > A pre-requisite to add CVQ to that framework is to determine if devices with
> > CVQ are migratable or not at initialization time.  The solution is to
> > always shadow only the CVQ, using vq groups and ASIDs for that.
> >
> > However, the current QEMU version only checks the ASID at device start
> > (i.e. when the driver sets the DRIVER_OK status bit), not at device
> > initialization.  A check at initialization time is required; otherwise,
> > the guest would be able to set and remove migration blockers at will [1].
> >
> > This series is a prerequisite for migration of vhost-vdpa net devices
> > with CVQ.  However, it already makes sense on its own, as it reduces the
> > number of ioctls at migration time, decreasing the error paths there.
>
> Hi -- since you're working on the net_init_vhost_vdpa() code,
> would you mind having a look at Coverity CID 1490785 ?
> This is about a leak of the vdpa_device_fd. We fixed one
> instance of that leak in commit aed5da45daf734ddc54 but
> it looks like there's still a different leak:
>
>     for (i = 0; i < queue_pairs; i++) {
>         ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>                                      vdpa_device_fd, i, 2, true, opts->x_svq,
>                                      iova_range, features);
>         if (!ncs[i])
>             goto err;
>     }
>
>     if (has_cvq) {
>         nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>                                  vdpa_device_fd, i, 1, false,
>                                  opts->x_svq, iova_range, features);
>         if (!nc)
>             goto err;
>     }
>
>     return 0;
>
> In this code, if queue_pairs is non-zero we will use
> vdpa_device_fd because we pass it to net_vhost_vdpa_init().
> Similarly, if has_cvq is true then we'll also use the fd.
> But if queue_pairs is zero and has_cvq is false then we
> will not do anything with the fd, and will return 0,
> leaking the file descriptor.
>
> Maybe this combination is not supposed to happen, but
> I can't see anything in vhost_vdpa_get_max_queue_pairs()
> or in this function which guards against it. If it's
> an invalid setup we should detect it and return an
> error, I think.
>

Yes, a device that returns 0 max_vq_pairs would be invalid. I will
introduce a guard for that.

Thanks for the heads up!

> thanks
> -- PMM
>




end of thread, other threads:[~2023-06-25  9:13 UTC | newest]

Thread overview: 7+ messages
2023-05-26 15:31 [PATCH v4 0/2] Move ASID test to vhost-vdpa net initialization Eugenio Pérez
2023-05-26 15:31 ` [PATCH v4 1/2] vdpa: return errno in vhost_vdpa_get_vring_group error Eugenio Pérez
2023-05-26 15:31 ` [PATCH v4 2/2] vdpa: move CVQ isolation check to net_init_vhost_vdpa Eugenio Pérez
2023-05-31  1:44   ` Jason Wang
2023-06-02 11:29 ` [PATCH v4 0/2] Move ASID test to vhost-vdpa net initialization Lei Yang
2023-06-23 13:04 ` Peter Maydell
2023-06-25  9:11   ` Eugenio Perez Martin
