All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH vhost v2 0/8] vdpa/mlx5: Add support for resumable vqs
@ 2023-12-05 10:46 Dragos Tatulea
  2023-12-05 10:46 ` [PATCH mlx5-vhost v2 1/8] vdpa/mlx5: Expose resumable vq capability Dragos Tatulea
                   ` (7 more replies)
  0 siblings, 8 replies; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-05 10:46 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, Eugenio Perez Martin,
	Si-Wei Liu, Saeed Mahameed, Leon Romanovsky, virtualization,
	Gal Pressman
  Cc: Dragos Tatulea, kvm, linux-kernel, Parav Pandit, Xuan Zhuo

Add support for resumable vqs in the driver. This is a firmware feature
that can be used for the following benefits:
- Full device .suspend/.resume.
- .set_map doesn't need to destroy and create new vqs anymore just to
  update the map. When resumable vqs are supported it is enough to
  suspend the vqs, set the new maps, and then resume the vqs.

The first patch exposes the relevant bits in mlx5_ifc.h. That means it
needs to be applied to the mlx5-vhost tree [0] first. Once applied
there, the change has to be pulled from mlx5-vhost into the vhost tree
and only then the remaining patches can be applied. Same flow as the vq
descriptor mappings patchset [1].

To be able to use resumable vqs properly, support for selectively modifying
vq parameters was needed. This is what the second part of the series
consists of.

The middle part adds support for resumable vqs.

The last part of the series introduces reference counting for mrs which
is necessary to avoid freeing mkeys too early or leaking them.

* Changes in v2:
- Added mr refcounting patches.
- Deleted unnecessary patch: "vdpa/mlx5: Split function into locked and
  unlocked variants"
- Small print improvement in "Introduce per vq and device resume"
  patch.
- Patch 1/7 has been applied to mlx5-vhost branch.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/log/?h=mlx5-vhost
[1] https://lore.kernel.org/virtualization/20231018171456.1624030-2-dtatulea@nvidia.com/


Dragos Tatulea (8):
  vdpa/mlx5: Expose resumable vq capability
  vdpa/mlx5: Allow modifying multiple vq fields in one modify command
  vdpa/mlx5: Introduce per vq and device resume
  vdpa/mlx5: Mark vq addrs for modification in hw vq
  vdpa/mlx5: Mark vq state for modification in hw vq
  vdpa/mlx5: Use vq suspend/resume during .set_map
  vdpa/mlx5: Introduce reference counting to mrs
  vdpa/mlx5: Add mkey leak detection

 drivers/vdpa/mlx5/core/mlx5_vdpa.h |  10 +-
 drivers/vdpa/mlx5/core/mr.c        |  69 ++++++++--
 drivers/vdpa/mlx5/net/mlx5_vnet.c  | 209 ++++++++++++++++++++++++++---
 include/linux/mlx5/mlx5_ifc.h      |   3 +-
 include/linux/mlx5/mlx5_ifc_vdpa.h |   4 +
 5 files changed, 257 insertions(+), 38 deletions(-)

-- 
2.42.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH mlx5-vhost v2 1/8] vdpa/mlx5: Expose resumable vq capability
  2023-12-05 10:46 [PATCH vhost v2 0/8] vdpa/mlx5: Add support for resumable vqs Dragos Tatulea
@ 2023-12-05 10:46 ` Dragos Tatulea
  2023-12-05 10:46 ` [PATCH vhost v2 2/8] vdpa/mlx5: Allow modifying multiple vq fields in one modify command Dragos Tatulea
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-05 10:46 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, Eugenio Perez Martin,
	Si-Wei Liu, Saeed Mahameed, Leon Romanovsky, virtualization,
	Gal Pressman
  Cc: Dragos Tatulea, kvm, linux-kernel, Parav Pandit, Xuan Zhuo

Necessary for checking if resumable vqs are supported by the hardware.
Actual support will be added in a downstream patch.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/linux/mlx5/mlx5_ifc.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 6f3631425f38..9eaceaf6bcb0 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1236,7 +1236,8 @@ struct mlx5_ifc_virtio_emulation_cap_bits {
 
 	u8	   reserved_at_c0[0x13];
 	u8         desc_group_mkey_supported[0x1];
-	u8         reserved_at_d4[0xc];
+	u8         freeze_to_rdy_supported[0x1];
+	u8         reserved_at_d5[0xb];
 
 	u8         reserved_at_e0[0x20];
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH vhost v2 2/8] vdpa/mlx5: Allow modifying multiple vq fields in one modify command
  2023-12-05 10:46 [PATCH vhost v2 0/8] vdpa/mlx5: Add support for resumable vqs Dragos Tatulea
  2023-12-05 10:46 ` [PATCH mlx5-vhost v2 1/8] vdpa/mlx5: Expose resumable vq capability Dragos Tatulea
@ 2023-12-05 10:46 ` Dragos Tatulea
  2023-12-05 10:46 ` [PATCH vhost v2 3/8] vdpa/mlx5: Introduce per vq and device resume Dragos Tatulea
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-05 10:46 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, Eugenio Perez Martin,
	Si-Wei Liu, Saeed Mahameed, Leon Romanovsky, virtualization,
	Gal Pressman
  Cc: Dragos Tatulea, kvm, linux-kernel, Parav Pandit, Xuan Zhuo

Add a bitmask variable that tracks hw vq field changes that
are supposed to be modified on next hw vq change command.

This will be useful to set multiple vq fields when resuming the vq.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 48 +++++++++++++++++++++++++------
 1 file changed, 40 insertions(+), 8 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 26ba7da6b410..1e08a8805640 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -120,6 +120,9 @@ struct mlx5_vdpa_virtqueue {
 	u16 avail_idx;
 	u16 used_idx;
 	int fw_state;
+
+	u64 modified_fields;
+
 	struct msi_map map;
 
 	/* keep last in the struct */
@@ -1181,7 +1184,19 @@ static bool is_valid_state_change(int oldstate, int newstate)
 	}
 }
 
-static int modify_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq, int state)
+static bool modifiable_virtqueue_fields(struct mlx5_vdpa_virtqueue *mvq)
+{
+	/* Only state is always modifiable */
+	if (mvq->modified_fields & ~MLX5_VIRTQ_MODIFY_MASK_STATE)
+		return mvq->fw_state == MLX5_VIRTIO_NET_Q_OBJECT_STATE_INIT ||
+		       mvq->fw_state == MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND;
+
+	return true;
+}
+
+static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
+			    struct mlx5_vdpa_virtqueue *mvq,
+			    int state)
 {
 	int inlen = MLX5_ST_SZ_BYTES(modify_virtio_net_q_in);
 	u32 out[MLX5_ST_SZ_DW(modify_virtio_net_q_out)] = {};
@@ -1193,6 +1208,9 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtque
 	if (mvq->fw_state == MLX5_VIRTIO_NET_Q_OBJECT_NONE)
 		return 0;
 
+	if (!modifiable_virtqueue_fields(mvq))
+		return -EINVAL;
+
 	if (!is_valid_state_change(mvq->fw_state, state))
 		return -EINVAL;
 
@@ -1208,17 +1226,28 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtque
 	MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
 
 	obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
-	MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select,
-		   MLX5_VIRTQ_MODIFY_MASK_STATE);
-	MLX5_SET(virtio_net_q_object, obj_context, state, state);
+	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE)
+		MLX5_SET(virtio_net_q_object, obj_context, state, state);
+
+	MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
 	err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
 	kfree(in);
 	if (!err)
 		mvq->fw_state = state;
 
+	mvq->modified_fields = 0;
+
 	return err;
 }
 
+static int modify_virtqueue_state(struct mlx5_vdpa_net *ndev,
+				  struct mlx5_vdpa_virtqueue *mvq,
+				  unsigned int state)
+{
+	mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_STATE;
+	return modify_virtqueue(ndev, mvq, state);
+}
+
 static int counter_set_alloc(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
 {
 	u32 in[MLX5_ST_SZ_DW(create_virtio_q_counters_in)] = {};
@@ -1347,7 +1376,7 @@ static int setup_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
 		goto err_vq;
 
 	if (mvq->ready) {
-		err = modify_virtqueue(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY);
+		err = modify_virtqueue_state(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY);
 		if (err) {
 			mlx5_vdpa_warn(&ndev->mvdev, "failed to modify to ready vq idx %d(%d)\n",
 				       idx, err);
@@ -1382,7 +1411,7 @@ static void suspend_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *m
 	if (mvq->fw_state != MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY)
 		return;
 
-	if (modify_virtqueue(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND))
+	if (modify_virtqueue_state(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND))
 		mlx5_vdpa_warn(&ndev->mvdev, "modify to suspend failed\n");
 
 	if (query_virtqueue(ndev, mvq, &attr)) {
@@ -1407,6 +1436,7 @@ static void teardown_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *
 		return;
 
 	suspend_vq(ndev, mvq);
+	mvq->modified_fields = 0;
 	destroy_virtqueue(ndev, mvq);
 	dealloc_vector(ndev, mvq);
 	counter_set_dealloc(ndev, mvq);
@@ -2207,7 +2237,7 @@ static void mlx5_vdpa_set_vq_ready(struct vdpa_device *vdev, u16 idx, bool ready
 	if (!ready) {
 		suspend_vq(ndev, mvq);
 	} else {
-		err = modify_virtqueue(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY);
+		err = modify_virtqueue_state(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY);
 		if (err) {
 			mlx5_vdpa_warn(mvdev, "modify VQ %d to ready failed (%d)\n", idx, err);
 			ready = false;
@@ -2804,8 +2834,10 @@ static void clear_vqs_ready(struct mlx5_vdpa_net *ndev)
 {
 	int i;
 
-	for (i = 0; i < ndev->mvdev.max_vqs; i++)
+	for (i = 0; i < ndev->mvdev.max_vqs; i++) {
 		ndev->vqs[i].ready = false;
+		ndev->vqs[i].modified_fields = 0;
+	}
 
 	ndev->mvdev.cvq.ready = false;
 }
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH vhost v2 3/8] vdpa/mlx5: Introduce per vq and device resume
  2023-12-05 10:46 [PATCH vhost v2 0/8] vdpa/mlx5: Add support for resumable vqs Dragos Tatulea
  2023-12-05 10:46 ` [PATCH mlx5-vhost v2 1/8] vdpa/mlx5: Expose resumable vq capability Dragos Tatulea
  2023-12-05 10:46 ` [PATCH vhost v2 2/8] vdpa/mlx5: Allow modifying multiple vq fields in one modify command Dragos Tatulea
@ 2023-12-05 10:46 ` Dragos Tatulea
  2023-12-05 10:46 ` [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq Dragos Tatulea
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-05 10:46 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, Eugenio Perez Martin,
	Si-Wei Liu, Saeed Mahameed, Leon Romanovsky, virtualization,
	Gal Pressman
  Cc: Dragos Tatulea, kvm, linux-kernel, Parav Pandit, Xuan Zhuo

Implement vdpa vq and device resume if capability detected. Add support
for suspend -> ready state change.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 69 +++++++++++++++++++++++++++----
 1 file changed, 62 insertions(+), 7 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 1e08a8805640..f8f088cced50 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1170,7 +1170,12 @@ static int query_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueu
 	return err;
 }
 
-static bool is_valid_state_change(int oldstate, int newstate)
+static bool is_resumable(struct mlx5_vdpa_net *ndev)
+{
+	return ndev->mvdev.vdev.config->resume;
+}
+
+static bool is_valid_state_change(int oldstate, int newstate, bool resumable)
 {
 	switch (oldstate) {
 	case MLX5_VIRTIO_NET_Q_OBJECT_STATE_INIT:
@@ -1178,6 +1183,7 @@ static bool is_valid_state_change(int oldstate, int newstate)
 	case MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY:
 		return newstate == MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND;
 	case MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND:
+		return resumable ? newstate == MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY : false;
 	case MLX5_VIRTIO_NET_Q_OBJECT_STATE_ERR:
 	default:
 		return false;
@@ -1200,6 +1206,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
 {
 	int inlen = MLX5_ST_SZ_BYTES(modify_virtio_net_q_in);
 	u32 out[MLX5_ST_SZ_DW(modify_virtio_net_q_out)] = {};
+	bool state_change = false;
 	void *obj_context;
 	void *cmd_hdr;
 	void *in;
@@ -1211,9 +1218,6 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
 	if (!modifiable_virtqueue_fields(mvq))
 		return -EINVAL;
 
-	if (!is_valid_state_change(mvq->fw_state, state))
-		return -EINVAL;
-
 	in = kzalloc(inlen, GFP_KERNEL);
 	if (!in)
 		return -ENOMEM;
@@ -1226,17 +1230,29 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
 	MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
 
 	obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
-	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE)
+
+	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
+		if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
+			err = -EINVAL;
+			goto done;
+		}
+
 		MLX5_SET(virtio_net_q_object, obj_context, state, state);
+		state_change = true;
+	}
 
 	MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
 	err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
-	kfree(in);
-	if (!err)
+	if (err)
+		goto done;
+
+	if (state_change)
 		mvq->fw_state = state;
 
 	mvq->modified_fields = 0;
 
+done:
+	kfree(in);
 	return err;
 }
 
@@ -1430,6 +1446,24 @@ static void suspend_vqs(struct mlx5_vdpa_net *ndev)
 		suspend_vq(ndev, &ndev->vqs[i]);
 }
 
+static void resume_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
+{
+	if (!mvq->initialized || !is_resumable(ndev))
+		return;
+
+	if (mvq->fw_state != MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND)
+		return;
+
+	if (modify_virtqueue_state(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY))
+		mlx5_vdpa_warn(&ndev->mvdev, "modify to resume failed for vq %u\n", mvq->index);
+}
+
+static void resume_vqs(struct mlx5_vdpa_net *ndev)
+{
+	for (int i = 0; i < ndev->mvdev.max_vqs; i++)
+		resume_vq(ndev, &ndev->vqs[i]);
+}
+
 static void teardown_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
 {
 	if (!mvq->initialized)
@@ -3261,6 +3295,23 @@ static int mlx5_vdpa_suspend(struct vdpa_device *vdev)
 	return 0;
 }
 
+static int mlx5_vdpa_resume(struct vdpa_device *vdev)
+{
+	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+	struct mlx5_vdpa_net *ndev;
+
+	ndev = to_mlx5_vdpa_ndev(mvdev);
+
+	mlx5_vdpa_info(mvdev, "resuming device\n");
+
+	down_write(&ndev->reslock);
+	mvdev->suspended = false;
+	resume_vqs(ndev);
+	register_link_notifier(ndev);
+	up_write(&ndev->reslock);
+	return 0;
+}
+
 static int mlx5_set_group_asid(struct vdpa_device *vdev, u32 group,
 			       unsigned int asid)
 {
@@ -3317,6 +3368,7 @@ static const struct vdpa_config_ops mlx5_vdpa_ops = {
 	.get_vq_dma_dev = mlx5_get_vq_dma_dev,
 	.free = mlx5_vdpa_free,
 	.suspend = mlx5_vdpa_suspend,
+	.resume = mlx5_vdpa_resume, /* Op disabled if not supported. */
 };
 
 static int query_mtu(struct mlx5_core_dev *mdev, u16 *mtu)
@@ -3688,6 +3740,9 @@ static int mlx5v_probe(struct auxiliary_device *adev,
 	if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, desc_group_mkey_supported))
 		mgtdev->vdpa_ops.get_vq_desc_group = NULL;
 
+	if (!MLX5_CAP_DEV_VDPA_EMULATION(mdev, freeze_to_rdy_supported))
+		mgtdev->vdpa_ops.resume = NULL;
+
 	err = vdpa_mgmtdev_register(&mgtdev->mgtdev);
 	if (err)
 		goto reg_err;
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-05 10:46 [PATCH vhost v2 0/8] vdpa/mlx5: Add support for resumable vqs Dragos Tatulea
                   ` (2 preceding siblings ...)
  2023-12-05 10:46 ` [PATCH vhost v2 3/8] vdpa/mlx5: Introduce per vq and device resume Dragos Tatulea
@ 2023-12-05 10:46 ` Dragos Tatulea
  2023-12-12 19:21   ` Eugenio Perez Martin
  2023-12-05 10:46 ` [PATCH vhost v2 5/8] vdpa/mlx5: Mark vq state " Dragos Tatulea
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-05 10:46 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, Eugenio Perez Martin,
	Si-Wei Liu, Saeed Mahameed, Leon Romanovsky, virtualization,
	Gal Pressman
  Cc: Dragos Tatulea, kvm, linux-kernel, Parav Pandit, Xuan Zhuo

Addresses get set by .set_vq_address. hw vq addresses will be updated on
next modify_virtqueue.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
 include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index f8f088cced50..80e066de0866 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
 	bool state_change = false;
 	void *obj_context;
 	void *cmd_hdr;
+	void *vq_ctx;
 	void *in;
 	int err;
 
@@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
 	MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
 
 	obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
+	vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
 
 	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
 		if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
@@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
 		state_change = true;
 	}
 
+	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
+		MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
+		MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
+		MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
+	}
+
 	MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
 	err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
 	if (err)
@@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
 	mvq->desc_addr = desc_area;
 	mvq->device_addr = device_area;
 	mvq->driver_addr = driver_area;
+	mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
 	return 0;
 }
 
diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
index b86d51a855f6..9594ac405740 100644
--- a/include/linux/mlx5/mlx5_ifc_vdpa.h
+++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
@@ -145,6 +145,7 @@ enum {
 	MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
 	MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
 	MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
+	MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
 	MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
 };
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH vhost v2 5/8] vdpa/mlx5: Mark vq state for modification in hw vq
  2023-12-05 10:46 [PATCH vhost v2 0/8] vdpa/mlx5: Add support for resumable vqs Dragos Tatulea
                   ` (3 preceding siblings ...)
  2023-12-05 10:46 ` [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq Dragos Tatulea
@ 2023-12-05 10:46 ` Dragos Tatulea
  2023-12-05 10:46 ` [PATCH vhost v2 6/8] vdpa/mlx5: Use vq suspend/resume during .set_map Dragos Tatulea
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-05 10:46 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, Eugenio Perez Martin,
	Si-Wei Liu, Saeed Mahameed, Leon Romanovsky, virtualization,
	Gal Pressman
  Cc: Dragos Tatulea, kvm, linux-kernel, Parav Pandit, Xuan Zhuo

.set_vq_state will set the indices and mark the fields to be modified in
the hw vq.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c  | 8 ++++++++
 include/linux/mlx5/mlx5_ifc_vdpa.h | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 80e066de0866..d6c8506cec8f 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1249,6 +1249,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
 		MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
 	}
 
+	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_AVAIL_IDX)
+		MLX5_SET(virtio_net_q_object, obj_context, hw_available_index, mvq->avail_idx);
+
+	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_USED_IDX)
+		MLX5_SET(virtio_net_q_object, obj_context, hw_used_index, mvq->used_idx);
+
 	MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
 	err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
 	if (err)
@@ -2328,6 +2334,8 @@ static int mlx5_vdpa_set_vq_state(struct vdpa_device *vdev, u16 idx,
 
 	mvq->used_idx = state->split.avail_index;
 	mvq->avail_idx = state->split.avail_index;
+	mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_AVAIL_IDX |
+				MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_USED_IDX;
 	return 0;
 }
 
diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
index 9594ac405740..32e712106e68 100644
--- a/include/linux/mlx5/mlx5_ifc_vdpa.h
+++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
@@ -146,6 +146,8 @@ enum {
 	MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
 	MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
 	MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
+	MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_AVAIL_IDX       = (u64)1 << 7,
+	MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_USED_IDX        = (u64)1 << 8,
 	MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
 };
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH vhost v2 6/8] vdpa/mlx5: Use vq suspend/resume during .set_map
  2023-12-05 10:46 [PATCH vhost v2 0/8] vdpa/mlx5: Add support for resumable vqs Dragos Tatulea
                   ` (4 preceding siblings ...)
  2023-12-05 10:46 ` [PATCH vhost v2 5/8] vdpa/mlx5: Mark vq state " Dragos Tatulea
@ 2023-12-05 10:46 ` Dragos Tatulea
  2023-12-12 19:22   ` Eugenio Perez Martin
  2023-12-05 10:46 ` [PATCH vhost v2 7/8] vdpa/mlx5: Introduce reference counting to mrs Dragos Tatulea
  2023-12-05 10:46 ` [PATCH vhost v2 8/8] vdpa/mlx5: Add mkey leak detection Dragos Tatulea
  7 siblings, 1 reply; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-05 10:46 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, Eugenio Perez Martin,
	Si-Wei Liu, Saeed Mahameed, Leon Romanovsky, virtualization,
	Gal Pressman
  Cc: Dragos Tatulea, kvm, linux-kernel, Parav Pandit, Xuan Zhuo

Instead of tearing down and setting up vq resources, use vq
suspend/resume during .set_map to speed things up a bit.

The vq mr is updated with the new mapping while the vqs are suspended.

If the device doesn't support resumable vqs, do the old teardown and
setup dance.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c  | 46 ++++++++++++++++++++++++------
 include/linux/mlx5/mlx5_ifc_vdpa.h |  1 +
 2 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index d6c8506cec8f..6a21223d97a8 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1206,6 +1206,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
 {
 	int inlen = MLX5_ST_SZ_BYTES(modify_virtio_net_q_in);
 	u32 out[MLX5_ST_SZ_DW(modify_virtio_net_q_out)] = {};
+	struct mlx5_vdpa_dev *mvdev = &ndev->mvdev;
 	bool state_change = false;
 	void *obj_context;
 	void *cmd_hdr;
@@ -1255,6 +1256,24 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
 	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_USED_IDX)
 		MLX5_SET(virtio_net_q_object, obj_context, hw_used_index, mvq->used_idx);
 
+	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY) {
+		struct mlx5_vdpa_mr *mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]];
+
+		if (mr)
+			MLX5_SET(virtio_q, vq_ctx, virtio_q_mkey, mr->mkey);
+		else
+			mvq->modified_fields &= ~MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY;
+	}
+
+	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY) {
+		struct mlx5_vdpa_mr *mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_DESC_GROUP]];
+
+		if (mr && MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, desc_group_mkey_supported))
+			MLX5_SET(virtio_q, vq_ctx, desc_group_mkey, mr->mkey);
+		else
+			mvq->modified_fields &= ~MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY;
+	}
+
 	MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
 	err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
 	if (err)
@@ -2784,24 +2803,35 @@ static int mlx5_vdpa_change_map(struct mlx5_vdpa_dev *mvdev,
 				unsigned int asid)
 {
 	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+	bool teardown = !is_resumable(ndev);
 	int err;
 
 	suspend_vqs(ndev);
-	err = save_channels_info(ndev);
-	if (err)
-		return err;
+	if (teardown) {
+		err = save_channels_info(ndev);
+		if (err)
+			return err;
 
-	teardown_driver(ndev);
+		teardown_driver(ndev);
+	}
 
 	mlx5_vdpa_update_mr(mvdev, new_mr, asid);
 
+	for (int i = 0; i < ndev->cur_num_vqs; i++)
+		ndev->vqs[i].modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY |
+						MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY;
+
 	if (!(mvdev->status & VIRTIO_CONFIG_S_DRIVER_OK) || mvdev->suspended)
 		return 0;
 
-	restore_channels_info(ndev);
-	err = setup_driver(mvdev);
-	if (err)
-		return err;
+	if (teardown) {
+		restore_channels_info(ndev);
+		err = setup_driver(mvdev);
+		if (err)
+			return err;
+	}
+
+	resume_vqs(ndev);
 
 	return 0;
 }
diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
index 32e712106e68..40371c916cf9 100644
--- a/include/linux/mlx5/mlx5_ifc_vdpa.h
+++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
@@ -148,6 +148,7 @@ enum {
 	MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
 	MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_AVAIL_IDX       = (u64)1 << 7,
 	MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_USED_IDX        = (u64)1 << 8,
+	MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY            = (u64)1 << 11,
 	MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
 };
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH vhost v2 7/8] vdpa/mlx5: Introduce reference counting to mrs
  2023-12-05 10:46 [PATCH vhost v2 0/8] vdpa/mlx5: Add support for resumable vqs Dragos Tatulea
                   ` (5 preceding siblings ...)
  2023-12-05 10:46 ` [PATCH vhost v2 6/8] vdpa/mlx5: Use vq suspend/resume during .set_map Dragos Tatulea
@ 2023-12-05 10:46 ` Dragos Tatulea
  2023-12-12 18:26   ` Eugenio Perez Martin
  2023-12-05 10:46 ` [PATCH vhost v2 8/8] vdpa/mlx5: Add mkey leak detection Dragos Tatulea
  7 siblings, 1 reply; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-05 10:46 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, Eugenio Perez Martin,
	Si-Wei Liu, Saeed Mahameed, Leon Romanovsky, virtualization,
	Gal Pressman
  Cc: Dragos Tatulea, kvm, linux-kernel, Parav Pandit, Xuan Zhuo

Deleting the old mr during mr update (.set_map) and then modifying the
vqs with the new mr is not a good flow for firmware. The firmware
expects that mkeys are deleted after there are no more vqs referencing
them.

Introduce reference counting for mrs to fix this. It is the only way to
make sure that mkeys are not in use by vqs.

An mr reference is taken when the mr is associated to the mr asid table
and when the mr is linked to the vq on create/modify. The reference is
released when the mkey is unlinked from the vq (trough modify/destroy)
and from the mr asid table.

To make things consistent, get rid of mlx5_vdpa_destroy_mr and use
get/put semantics everywhere.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
---
 drivers/vdpa/mlx5/core/mlx5_vdpa.h |  8 +++--
 drivers/vdpa/mlx5/core/mr.c        | 50 ++++++++++++++++++++----------
 drivers/vdpa/mlx5/net/mlx5_vnet.c  | 45 ++++++++++++++++++++++-----
 3 files changed, 78 insertions(+), 25 deletions(-)

diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
index 84547d998bcf..1a0d27b6e09a 100644
--- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
@@ -35,6 +35,8 @@ struct mlx5_vdpa_mr {
 	struct vhost_iotlb *iotlb;
 
 	bool user_mr;
+
+	refcount_t refcount;
 };
 
 struct mlx5_vdpa_resources {
@@ -118,8 +120,10 @@ int mlx5_vdpa_destroy_mkey(struct mlx5_vdpa_dev *mvdev, u32 mkey);
 struct mlx5_vdpa_mr *mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
 					 struct vhost_iotlb *iotlb);
 void mlx5_vdpa_destroy_mr_resources(struct mlx5_vdpa_dev *mvdev);
-void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev,
-			  struct mlx5_vdpa_mr *mr);
+void mlx5_vdpa_get_mr(struct mlx5_vdpa_dev *mvdev,
+		      struct mlx5_vdpa_mr *mr);
+void mlx5_vdpa_put_mr(struct mlx5_vdpa_dev *mvdev,
+		      struct mlx5_vdpa_mr *mr);
 void mlx5_vdpa_update_mr(struct mlx5_vdpa_dev *mvdev,
 			 struct mlx5_vdpa_mr *mr,
 			 unsigned int asid);
diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
index 2197c46e563a..c7dc8914354a 100644
--- a/drivers/vdpa/mlx5/core/mr.c
+++ b/drivers/vdpa/mlx5/core/mr.c
@@ -498,32 +498,52 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr
 
 static void _mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr)
 {
+	if (WARN_ON(!mr))
+		return;
+
 	if (mr->user_mr)
 		destroy_user_mr(mvdev, mr);
 	else
 		destroy_dma_mr(mvdev, mr);
 
 	vhost_iotlb_free(mr->iotlb);
+
+	kfree(mr);
 }
 
-void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev,
-			  struct mlx5_vdpa_mr *mr)
+static void _mlx5_vdpa_put_mr(struct mlx5_vdpa_dev *mvdev,
+			      struct mlx5_vdpa_mr *mr)
 {
 	if (!mr)
 		return;
 
+	if (refcount_dec_and_test(&mr->refcount))
+		_mlx5_vdpa_destroy_mr(mvdev, mr);
+}
+
+void mlx5_vdpa_put_mr(struct mlx5_vdpa_dev *mvdev,
+		      struct mlx5_vdpa_mr *mr)
+{
 	mutex_lock(&mvdev->mr_mtx);
+	_mlx5_vdpa_put_mr(mvdev, mr);
+	mutex_unlock(&mvdev->mr_mtx);
+}
 
-	_mlx5_vdpa_destroy_mr(mvdev, mr);
+static void _mlx5_vdpa_get_mr(struct mlx5_vdpa_dev *mvdev,
+			      struct mlx5_vdpa_mr *mr)
+{
+	if (!mr)
+		return;
 
-	for (int i = 0; i < MLX5_VDPA_NUM_AS; i++) {
-		if (mvdev->mr[i] == mr)
-			mvdev->mr[i] = NULL;
-	}
+	refcount_inc(&mr->refcount);
+}
 
+void mlx5_vdpa_get_mr(struct mlx5_vdpa_dev *mvdev,
+		      struct mlx5_vdpa_mr *mr)
+{
+	mutex_lock(&mvdev->mr_mtx);
+	_mlx5_vdpa_get_mr(mvdev, mr);
 	mutex_unlock(&mvdev->mr_mtx);
-
-	kfree(mr);
 }
 
 void mlx5_vdpa_update_mr(struct mlx5_vdpa_dev *mvdev,
@@ -534,20 +554,16 @@ void mlx5_vdpa_update_mr(struct mlx5_vdpa_dev *mvdev,
 
 	mutex_lock(&mvdev->mr_mtx);
 
+	_mlx5_vdpa_put_mr(mvdev, old_mr);
 	mvdev->mr[asid] = new_mr;
-	if (old_mr) {
-		_mlx5_vdpa_destroy_mr(mvdev, old_mr);
-		kfree(old_mr);
-	}
 
 	mutex_unlock(&mvdev->mr_mtx);
-
 }
 
 void mlx5_vdpa_destroy_mr_resources(struct mlx5_vdpa_dev *mvdev)
 {
 	for (int i = 0; i < MLX5_VDPA_NUM_AS; i++)
-		mlx5_vdpa_destroy_mr(mvdev, mvdev->mr[i]);
+		mlx5_vdpa_update_mr(mvdev, NULL, i);
 
 	prune_iotlb(mvdev->cvq.iotlb);
 }
@@ -607,6 +623,8 @@ struct mlx5_vdpa_mr *mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
 	if (err)
 		goto out_err;
 
+	refcount_set(&mr->refcount, 1);
+
 	return mr;
 
 out_err:
@@ -651,7 +669,7 @@ int mlx5_vdpa_reset_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
 	if (asid >= MLX5_VDPA_NUM_AS)
 		return -EINVAL;
 
-	mlx5_vdpa_destroy_mr(mvdev, mvdev->mr[asid]);
+	mlx5_vdpa_update_mr(mvdev, NULL, asid);
 
 	if (asid == 0 && MLX5_CAP_GEN(mvdev->mdev, umem_uid_0)) {
 		if (mlx5_vdpa_create_dma_mr(mvdev))
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 6a21223d97a8..133cbb66dcfe 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -123,6 +123,9 @@ struct mlx5_vdpa_virtqueue {
 
 	u64 modified_fields;
 
+	struct mlx5_vdpa_mr *vq_mr;
+	struct mlx5_vdpa_mr *desc_mr;
+
 	struct msi_map map;
 
 	/* keep last in the struct */
@@ -946,6 +949,14 @@ static int create_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtque
 	kfree(in);
 	mvq->virtq_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
 
+	mlx5_vdpa_get_mr(mvdev, vq_mr);
+	mvq->vq_mr = vq_mr;
+
+	if (vq_desc_mr && MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, desc_group_mkey_supported)) {
+		mlx5_vdpa_get_mr(mvdev, vq_desc_mr);
+		mvq->desc_mr = vq_desc_mr;
+	}
+
 	return 0;
 
 err_cmd:
@@ -972,6 +983,12 @@ static void destroy_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtq
 	}
 	mvq->fw_state = MLX5_VIRTIO_NET_Q_OBJECT_NONE;
 	umems_destroy(ndev, mvq);
+
+	mlx5_vdpa_put_mr(&ndev->mvdev, mvq->vq_mr);
+	mvq->vq_mr = NULL;
+
+	mlx5_vdpa_put_mr(&ndev->mvdev, mvq->desc_mr);
+	mvq->desc_mr = NULL;
 }
 
 static u32 get_rqpn(struct mlx5_vdpa_virtqueue *mvq, bool fw)
@@ -1207,6 +1224,8 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
 	int inlen = MLX5_ST_SZ_BYTES(modify_virtio_net_q_in);
 	u32 out[MLX5_ST_SZ_DW(modify_virtio_net_q_out)] = {};
 	struct mlx5_vdpa_dev *mvdev = &ndev->mvdev;
+	struct mlx5_vdpa_mr *desc_mr = NULL;
+	struct mlx5_vdpa_mr *vq_mr = NULL;
 	bool state_change = false;
 	void *obj_context;
 	void *cmd_hdr;
@@ -1257,19 +1276,19 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
 		MLX5_SET(virtio_net_q_object, obj_context, hw_used_index, mvq->used_idx);
 
 	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY) {
-		struct mlx5_vdpa_mr *mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]];
+		vq_mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]];
 
-		if (mr)
-			MLX5_SET(virtio_q, vq_ctx, virtio_q_mkey, mr->mkey);
+		if (vq_mr)
+			MLX5_SET(virtio_q, vq_ctx, virtio_q_mkey, vq_mr->mkey);
 		else
 			mvq->modified_fields &= ~MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY;
 	}
 
 	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY) {
-		struct mlx5_vdpa_mr *mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_DESC_GROUP]];
+		desc_mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_DESC_GROUP]];
 
-		if (mr && MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, desc_group_mkey_supported))
-			MLX5_SET(virtio_q, vq_ctx, desc_group_mkey, mr->mkey);
+		if (desc_mr && MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, desc_group_mkey_supported))
+			MLX5_SET(virtio_q, vq_ctx, desc_group_mkey, desc_mr->mkey);
 		else
 			mvq->modified_fields &= ~MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY;
 	}
@@ -1282,6 +1301,18 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
 	if (state_change)
 		mvq->fw_state = state;
 
+	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY) {
+		mlx5_vdpa_put_mr(mvdev, mvq->vq_mr);
+		mlx5_vdpa_get_mr(mvdev, vq_mr);
+		mvq->vq_mr = vq_mr;
+	}
+
+	if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY) {
+		mlx5_vdpa_put_mr(mvdev, mvq->desc_mr);
+		mlx5_vdpa_get_mr(mvdev, desc_mr);
+		mvq->desc_mr = desc_mr;
+	}
+
 	mvq->modified_fields = 0;
 
 done:
@@ -3095,7 +3126,7 @@ static int set_map_data(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb,
 	return mlx5_vdpa_update_cvq_iotlb(mvdev, iotlb, asid);
 
 out_err:
-	mlx5_vdpa_destroy_mr(mvdev, new_mr);
+	mlx5_vdpa_put_mr(mvdev, new_mr);
 	return err;
 }
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH vhost v2 8/8] vdpa/mlx5: Add mkey leak detection
  2023-12-05 10:46 [PATCH vhost v2 0/8] vdpa/mlx5: Add support for resumable vqs Dragos Tatulea
                   ` (6 preceding siblings ...)
  2023-12-05 10:46 ` [PATCH vhost v2 7/8] vdpa/mlx5: Introduce reference counting to mrs Dragos Tatulea
@ 2023-12-05 10:46 ` Dragos Tatulea
  2023-12-12 18:32   ` Eugenio Perez Martin
  7 siblings, 1 reply; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-05 10:46 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, Eugenio Perez Martin,
	Si-Wei Liu, Saeed Mahameed, Leon Romanovsky, virtualization,
	Gal Pressman
  Cc: Dragos Tatulea, kvm, linux-kernel, Parav Pandit, Xuan Zhuo

Track allocated mrs in a list and show warning when leaks are detected
on device free or reset.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
---
 drivers/vdpa/mlx5/core/mlx5_vdpa.h |  2 ++
 drivers/vdpa/mlx5/core/mr.c        | 23 +++++++++++++++++++++++
 drivers/vdpa/mlx5/net/mlx5_vnet.c  |  2 ++
 3 files changed, 27 insertions(+)

diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
index 1a0d27b6e09a..50aac8fe57ef 100644
--- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
@@ -37,6 +37,7 @@ struct mlx5_vdpa_mr {
 	bool user_mr;
 
 	refcount_t refcount;
+	struct list_head mr_list;
 };
 
 struct mlx5_vdpa_resources {
@@ -95,6 +96,7 @@ struct mlx5_vdpa_dev {
 	u32 generation;
 
 	struct mlx5_vdpa_mr *mr[MLX5_VDPA_NUM_AS];
+	struct list_head mr_list_head;
 	/* serialize mr access */
 	struct mutex mr_mtx;
 	struct mlx5_control_vq cvq;
diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
index c7dc8914354a..4758914ccf86 100644
--- a/drivers/vdpa/mlx5/core/mr.c
+++ b/drivers/vdpa/mlx5/core/mr.c
@@ -508,6 +508,8 @@ static void _mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_
 
 	vhost_iotlb_free(mr->iotlb);
 
+	list_del(&mr->mr_list);
+
 	kfree(mr);
 }
 
@@ -560,12 +562,31 @@ void mlx5_vdpa_update_mr(struct mlx5_vdpa_dev *mvdev,
 	mutex_unlock(&mvdev->mr_mtx);
 }
 
+static void mlx5_vdpa_show_mr_leaks(struct mlx5_vdpa_dev *mvdev)
+{
+	struct mlx5_vdpa_mr *mr;
+
+	mutex_lock(&mvdev->mr_mtx);
+
+	list_for_each_entry(mr, &mvdev->mr_list_head, mr_list) {
+
+		mlx5_vdpa_warn(mvdev, "mkey still alive after resource delete: "
+				      "mr: %p, mkey: 0x%x, refcount: %u\n",
+				       mr, mr->mkey, refcount_read(&mr->refcount));
+	}
+
+	mutex_unlock(&mvdev->mr_mtx);
+
+}
+
 void mlx5_vdpa_destroy_mr_resources(struct mlx5_vdpa_dev *mvdev)
 {
 	for (int i = 0; i < MLX5_VDPA_NUM_AS; i++)
 		mlx5_vdpa_update_mr(mvdev, NULL, i);
 
 	prune_iotlb(mvdev->cvq.iotlb);
+
+	mlx5_vdpa_show_mr_leaks(mvdev);
 }
 
 static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
@@ -592,6 +613,8 @@ static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
 	if (err)
 		goto err_iotlb;
 
+	list_add_tail(&mr->mr_list, &mvdev->mr_list_head);
+
 	return 0;
 
 err_iotlb:
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 133cbb66dcfe..778821bab7d9 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -3722,6 +3722,8 @@ static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev *v_mdev, const char *name,
 	if (err)
 		goto err_mpfs;
 
+	INIT_LIST_HEAD(&mvdev->mr_list_head);
+
 	if (MLX5_CAP_GEN(mvdev->mdev, umem_uid_0)) {
 		err = mlx5_vdpa_create_dma_mr(mvdev);
 		if (err)
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 7/8] vdpa/mlx5: Introduce reference counting to mrs
  2023-12-05 10:46 ` [PATCH vhost v2 7/8] vdpa/mlx5: Introduce reference counting to mrs Dragos Tatulea
@ 2023-12-12 18:26   ` Eugenio Perez Martin
  0 siblings, 0 replies; 31+ messages in thread
From: Eugenio Perez Martin @ 2023-12-12 18:26 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: Michael S . Tsirkin, Jason Wang, Si-Wei Liu, Saeed Mahameed,
	Leon Romanovsky, virtualization, Gal Pressman, kvm, linux-kernel,
	Parav Pandit, Xuan Zhuo

On Tue, Dec 5, 2023 at 11:47 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> Deleting the old mr during mr update (.set_map) and then modifying the
> vqs with the new mr is not a good flow for firmware. The firmware
> expects that mkeys are deleted after there are no more vqs referencing
> them.
>
> Introduce reference counting for mrs to fix this. It is the only way to
> make sure that mkeys are not in use by vqs.
>
> An mr reference is taken when the mr is associated to the mr asid table
> and when the mr is linked to the vq on create/modify. The reference is
> released when the mkey is unlinked from the vq (trough modify/destroy)
> and from the mr asid table.
>
> To make things consistent, get rid of mlx5_vdpa_destroy_mr and use
> get/put semantics everywhere.
>
> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> Reviewed-by: Gal Pressman <gal@nvidia.com>

Acked-by: Eugenio Pérez <eperezma@redhat.com>

> ---
>  drivers/vdpa/mlx5/core/mlx5_vdpa.h |  8 +++--
>  drivers/vdpa/mlx5/core/mr.c        | 50 ++++++++++++++++++++----------
>  drivers/vdpa/mlx5/net/mlx5_vnet.c  | 45 ++++++++++++++++++++++-----
>  3 files changed, 78 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> index 84547d998bcf..1a0d27b6e09a 100644
> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> @@ -35,6 +35,8 @@ struct mlx5_vdpa_mr {
>         struct vhost_iotlb *iotlb;
>
>         bool user_mr;
> +
> +       refcount_t refcount;
>  };
>
>  struct mlx5_vdpa_resources {
> @@ -118,8 +120,10 @@ int mlx5_vdpa_destroy_mkey(struct mlx5_vdpa_dev *mvdev, u32 mkey);
>  struct mlx5_vdpa_mr *mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
>                                          struct vhost_iotlb *iotlb);
>  void mlx5_vdpa_destroy_mr_resources(struct mlx5_vdpa_dev *mvdev);
> -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev,
> -                         struct mlx5_vdpa_mr *mr);
> +void mlx5_vdpa_get_mr(struct mlx5_vdpa_dev *mvdev,
> +                     struct mlx5_vdpa_mr *mr);
> +void mlx5_vdpa_put_mr(struct mlx5_vdpa_dev *mvdev,
> +                     struct mlx5_vdpa_mr *mr);
>  void mlx5_vdpa_update_mr(struct mlx5_vdpa_dev *mvdev,
>                          struct mlx5_vdpa_mr *mr,
>                          unsigned int asid);
> diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
> index 2197c46e563a..c7dc8914354a 100644
> --- a/drivers/vdpa/mlx5/core/mr.c
> +++ b/drivers/vdpa/mlx5/core/mr.c
> @@ -498,32 +498,52 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr
>
>  static void _mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr)
>  {
> +       if (WARN_ON(!mr))
> +               return;
> +
>         if (mr->user_mr)
>                 destroy_user_mr(mvdev, mr);
>         else
>                 destroy_dma_mr(mvdev, mr);
>
>         vhost_iotlb_free(mr->iotlb);
> +
> +       kfree(mr);
>  }
>
> -void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev,
> -                         struct mlx5_vdpa_mr *mr)
> +static void _mlx5_vdpa_put_mr(struct mlx5_vdpa_dev *mvdev,
> +                             struct mlx5_vdpa_mr *mr)
>  {
>         if (!mr)
>                 return;
>
> +       if (refcount_dec_and_test(&mr->refcount))
> +               _mlx5_vdpa_destroy_mr(mvdev, mr);
> +}
> +
> +void mlx5_vdpa_put_mr(struct mlx5_vdpa_dev *mvdev,
> +                     struct mlx5_vdpa_mr *mr)
> +{
>         mutex_lock(&mvdev->mr_mtx);
> +       _mlx5_vdpa_put_mr(mvdev, mr);
> +       mutex_unlock(&mvdev->mr_mtx);
> +}
>
> -       _mlx5_vdpa_destroy_mr(mvdev, mr);
> +static void _mlx5_vdpa_get_mr(struct mlx5_vdpa_dev *mvdev,
> +                             struct mlx5_vdpa_mr *mr)
> +{
> +       if (!mr)
> +               return;
>
> -       for (int i = 0; i < MLX5_VDPA_NUM_AS; i++) {
> -               if (mvdev->mr[i] == mr)
> -                       mvdev->mr[i] = NULL;
> -       }
> +       refcount_inc(&mr->refcount);
> +}
>
> +void mlx5_vdpa_get_mr(struct mlx5_vdpa_dev *mvdev,
> +                     struct mlx5_vdpa_mr *mr)
> +{
> +       mutex_lock(&mvdev->mr_mtx);
> +       _mlx5_vdpa_get_mr(mvdev, mr);
>         mutex_unlock(&mvdev->mr_mtx);
> -
> -       kfree(mr);
>  }
>
>  void mlx5_vdpa_update_mr(struct mlx5_vdpa_dev *mvdev,
> @@ -534,20 +554,16 @@ void mlx5_vdpa_update_mr(struct mlx5_vdpa_dev *mvdev,
>
>         mutex_lock(&mvdev->mr_mtx);
>
> +       _mlx5_vdpa_put_mr(mvdev, old_mr);
>         mvdev->mr[asid] = new_mr;
> -       if (old_mr) {
> -               _mlx5_vdpa_destroy_mr(mvdev, old_mr);
> -               kfree(old_mr);
> -       }
>
>         mutex_unlock(&mvdev->mr_mtx);
> -
>  }
>
>  void mlx5_vdpa_destroy_mr_resources(struct mlx5_vdpa_dev *mvdev)
>  {
>         for (int i = 0; i < MLX5_VDPA_NUM_AS; i++)
> -               mlx5_vdpa_destroy_mr(mvdev, mvdev->mr[i]);
> +               mlx5_vdpa_update_mr(mvdev, NULL, i);
>
>         prune_iotlb(mvdev->cvq.iotlb);
>  }
> @@ -607,6 +623,8 @@ struct mlx5_vdpa_mr *mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
>         if (err)
>                 goto out_err;
>
> +       refcount_set(&mr->refcount, 1);
> +
>         return mr;
>
>  out_err:
> @@ -651,7 +669,7 @@ int mlx5_vdpa_reset_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
>         if (asid >= MLX5_VDPA_NUM_AS)
>                 return -EINVAL;
>
> -       mlx5_vdpa_destroy_mr(mvdev, mvdev->mr[asid]);
> +       mlx5_vdpa_update_mr(mvdev, NULL, asid);
>
>         if (asid == 0 && MLX5_CAP_GEN(mvdev->mdev, umem_uid_0)) {
>                 if (mlx5_vdpa_create_dma_mr(mvdev))
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 6a21223d97a8..133cbb66dcfe 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -123,6 +123,9 @@ struct mlx5_vdpa_virtqueue {
>
>         u64 modified_fields;
>
> +       struct mlx5_vdpa_mr *vq_mr;
> +       struct mlx5_vdpa_mr *desc_mr;
> +
>         struct msi_map map;
>
>         /* keep last in the struct */
> @@ -946,6 +949,14 @@ static int create_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtque
>         kfree(in);
>         mvq->virtq_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
>
> +       mlx5_vdpa_get_mr(mvdev, vq_mr);
> +       mvq->vq_mr = vq_mr;
> +
> +       if (vq_desc_mr && MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, desc_group_mkey_supported)) {
> +               mlx5_vdpa_get_mr(mvdev, vq_desc_mr);
> +               mvq->desc_mr = vq_desc_mr;
> +       }
> +
>         return 0;
>
>  err_cmd:
> @@ -972,6 +983,12 @@ static void destroy_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtq
>         }
>         mvq->fw_state = MLX5_VIRTIO_NET_Q_OBJECT_NONE;
>         umems_destroy(ndev, mvq);
> +
> +       mlx5_vdpa_put_mr(&ndev->mvdev, mvq->vq_mr);
> +       mvq->vq_mr = NULL;
> +
> +       mlx5_vdpa_put_mr(&ndev->mvdev, mvq->desc_mr);
> +       mvq->desc_mr = NULL;
>  }
>
>  static u32 get_rqpn(struct mlx5_vdpa_virtqueue *mvq, bool fw)
> @@ -1207,6 +1224,8 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
>         int inlen = MLX5_ST_SZ_BYTES(modify_virtio_net_q_in);
>         u32 out[MLX5_ST_SZ_DW(modify_virtio_net_q_out)] = {};
>         struct mlx5_vdpa_dev *mvdev = &ndev->mvdev;
> +       struct mlx5_vdpa_mr *desc_mr = NULL;
> +       struct mlx5_vdpa_mr *vq_mr = NULL;
>         bool state_change = false;
>         void *obj_context;
>         void *cmd_hdr;
> @@ -1257,19 +1276,19 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
>                 MLX5_SET(virtio_net_q_object, obj_context, hw_used_index, mvq->used_idx);
>
>         if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY) {
> -               struct mlx5_vdpa_mr *mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]];
> +               vq_mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]];
>
> -               if (mr)
> -                       MLX5_SET(virtio_q, vq_ctx, virtio_q_mkey, mr->mkey);
> +               if (vq_mr)
> +                       MLX5_SET(virtio_q, vq_ctx, virtio_q_mkey, vq_mr->mkey);
>                 else
>                         mvq->modified_fields &= ~MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY;
>         }
>
>         if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY) {
> -               struct mlx5_vdpa_mr *mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_DESC_GROUP]];
> +               desc_mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_DESC_GROUP]];
>
> -               if (mr && MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, desc_group_mkey_supported))
> -                       MLX5_SET(virtio_q, vq_ctx, desc_group_mkey, mr->mkey);
> +               if (desc_mr && MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, desc_group_mkey_supported))
> +                       MLX5_SET(virtio_q, vq_ctx, desc_group_mkey, desc_mr->mkey);
>                 else
>                         mvq->modified_fields &= ~MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY;
>         }
> @@ -1282,6 +1301,18 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
>         if (state_change)
>                 mvq->fw_state = state;
>
> +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY) {
> +               mlx5_vdpa_put_mr(mvdev, mvq->vq_mr);
> +               mlx5_vdpa_get_mr(mvdev, vq_mr);
> +               mvq->vq_mr = vq_mr;
> +       }
> +
> +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY) {
> +               mlx5_vdpa_put_mr(mvdev, mvq->desc_mr);
> +               mlx5_vdpa_get_mr(mvdev, desc_mr);
> +               mvq->desc_mr = desc_mr;
> +       }
> +
>         mvq->modified_fields = 0;
>
>  done:
> @@ -3095,7 +3126,7 @@ static int set_map_data(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb,
>         return mlx5_vdpa_update_cvq_iotlb(mvdev, iotlb, asid);
>
>  out_err:
> -       mlx5_vdpa_destroy_mr(mvdev, new_mr);
> +       mlx5_vdpa_put_mr(mvdev, new_mr);
>         return err;
>  }
>
> --
> 2.42.0
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 8/8] vdpa/mlx5: Add mkey leak detection
  2023-12-05 10:46 ` [PATCH vhost v2 8/8] vdpa/mlx5: Add mkey leak detection Dragos Tatulea
@ 2023-12-12 18:32   ` Eugenio Perez Martin
  0 siblings, 0 replies; 31+ messages in thread
From: Eugenio Perez Martin @ 2023-12-12 18:32 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: Michael S . Tsirkin, Jason Wang, Si-Wei Liu, Saeed Mahameed,
	Leon Romanovsky, virtualization, Gal Pressman, kvm, linux-kernel,
	Parav Pandit, Xuan Zhuo

On Tue, Dec 5, 2023 at 11:47 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> Track allocated mrs in a list and show warning when leaks are detected
> on device free or reset.
>
> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> Reviewed-by: Gal Pressman <gal@nvidia.com>

Acked-by: Eugenio Pérez <eperezma@redhat.com>

> ---
>  drivers/vdpa/mlx5/core/mlx5_vdpa.h |  2 ++
>  drivers/vdpa/mlx5/core/mr.c        | 23 +++++++++++++++++++++++
>  drivers/vdpa/mlx5/net/mlx5_vnet.c  |  2 ++
>  3 files changed, 27 insertions(+)
>
> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> index 1a0d27b6e09a..50aac8fe57ef 100644
> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> @@ -37,6 +37,7 @@ struct mlx5_vdpa_mr {
>         bool user_mr;
>
>         refcount_t refcount;
> +       struct list_head mr_list;
>  };
>
>  struct mlx5_vdpa_resources {
> @@ -95,6 +96,7 @@ struct mlx5_vdpa_dev {
>         u32 generation;
>
>         struct mlx5_vdpa_mr *mr[MLX5_VDPA_NUM_AS];
> +       struct list_head mr_list_head;
>         /* serialize mr access */
>         struct mutex mr_mtx;
>         struct mlx5_control_vq cvq;
> diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
> index c7dc8914354a..4758914ccf86 100644
> --- a/drivers/vdpa/mlx5/core/mr.c
> +++ b/drivers/vdpa/mlx5/core/mr.c
> @@ -508,6 +508,8 @@ static void _mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_
>
>         vhost_iotlb_free(mr->iotlb);
>
> +       list_del(&mr->mr_list);
> +
>         kfree(mr);
>  }
>
> @@ -560,12 +562,31 @@ void mlx5_vdpa_update_mr(struct mlx5_vdpa_dev *mvdev,
>         mutex_unlock(&mvdev->mr_mtx);
>  }
>
> +static void mlx5_vdpa_show_mr_leaks(struct mlx5_vdpa_dev *mvdev)
> +{
> +       struct mlx5_vdpa_mr *mr;
> +
> +       mutex_lock(&mvdev->mr_mtx);
> +
> +       list_for_each_entry(mr, &mvdev->mr_list_head, mr_list) {
> +
> +               mlx5_vdpa_warn(mvdev, "mkey still alive after resource delete: "
> +                                     "mr: %p, mkey: 0x%x, refcount: %u\n",
> +                                      mr, mr->mkey, refcount_read(&mr->refcount));
> +       }
> +
> +       mutex_unlock(&mvdev->mr_mtx);
> +
> +}
> +
>  void mlx5_vdpa_destroy_mr_resources(struct mlx5_vdpa_dev *mvdev)
>  {
>         for (int i = 0; i < MLX5_VDPA_NUM_AS; i++)
>                 mlx5_vdpa_update_mr(mvdev, NULL, i);
>
>         prune_iotlb(mvdev->cvq.iotlb);
> +
> +       mlx5_vdpa_show_mr_leaks(mvdev);
>  }
>
>  static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
> @@ -592,6 +613,8 @@ static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
>         if (err)
>                 goto err_iotlb;
>
> +       list_add_tail(&mr->mr_list, &mvdev->mr_list_head);
> +
>         return 0;
>
>  err_iotlb:
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 133cbb66dcfe..778821bab7d9 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -3722,6 +3722,8 @@ static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev *v_mdev, const char *name,
>         if (err)
>                 goto err_mpfs;
>
> +       INIT_LIST_HEAD(&mvdev->mr_list_head);
> +
>         if (MLX5_CAP_GEN(mvdev->mdev, umem_uid_0)) {
>                 err = mlx5_vdpa_create_dma_mr(mvdev);
>                 if (err)
> --
> 2.42.0
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-05 10:46 ` [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq Dragos Tatulea
@ 2023-12-12 19:21   ` Eugenio Perez Martin
  2023-12-12 19:44     ` Dragos Tatulea
  2023-12-12 23:44     ` Si-Wei Liu
  0 siblings, 2 replies; 31+ messages in thread
From: Eugenio Perez Martin @ 2023-12-12 19:21 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: Michael S . Tsirkin, Jason Wang, Si-Wei Liu, Saeed Mahameed,
	Leon Romanovsky, virtualization, Gal Pressman, kvm, linux-kernel,
	Parav Pandit, Xuan Zhuo

On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> Addresses get set by .set_vq_address. hw vq addresses will be updated on
> next modify_virtqueue.
>
> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> Reviewed-by: Gal Pressman <gal@nvidia.com>
> Acked-by: Eugenio Pérez <eperezma@redhat.com>

I'm kind of ok with this patch and the next one about state, but I
didn't ack them in the previous series.

My main concern is that it is not valid to change the vq address after
DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
change at this moment. I'm not sure about vq state in vDPA, but vhost
forbids changing it with an active backend.

Suspend is not defined in VirtIO at this moment though, so maybe it is
ok to decide that all of these parameters may change during suspend.
Maybe the best thing is to protect this with a vDPA feature flag.

Jason, what do you think?

Thanks!

> ---
>  drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
>  include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
>  2 files changed, 10 insertions(+)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index f8f088cced50..80e066de0866 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
>         bool state_change = false;
>         void *obj_context;
>         void *cmd_hdr;
> +       void *vq_ctx;
>         void *in;
>         int err;
>
> @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
>         MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
>
>         obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
>
>         if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
>                 if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
>                 state_change = true;
>         }
>
> +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> +       }
> +
>         MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
>         err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
>         if (err)
> @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
>         mvq->desc_addr = desc_area;
>         mvq->device_addr = device_area;
>         mvq->driver_addr = driver_area;
> +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
>         return 0;
>  }
>
> diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> index b86d51a855f6..9594ac405740 100644
> --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> @@ -145,6 +145,7 @@ enum {
>         MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
>         MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
>         MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
>         MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
>  };
>
> --
> 2.42.0
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 6/8] vdpa/mlx5: Use vq suspend/resume during .set_map
  2023-12-05 10:46 ` [PATCH vhost v2 6/8] vdpa/mlx5: Use vq suspend/resume during .set_map Dragos Tatulea
@ 2023-12-12 19:22   ` Eugenio Perez Martin
  0 siblings, 0 replies; 31+ messages in thread
From: Eugenio Perez Martin @ 2023-12-12 19:22 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: Michael S . Tsirkin, Jason Wang, Si-Wei Liu, Saeed Mahameed,
	Leon Romanovsky, virtualization, Gal Pressman, kvm, linux-kernel,
	Parav Pandit, Xuan Zhuo

On Tue, Dec 5, 2023 at 11:47 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> Instead of tearing down and setting up vq resources, use vq
> suspend/resume during .set_map to speed things up a bit.
>
> The vq mr is updated with the new mapping while the vqs are suspended.
>
> If the device doesn't support resumable vqs, do the old teardown and
> setup dance.
>
> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> Reviewed-by: Gal Pressman <gal@nvidia.com>
> Acked-by: Eugenio Pérez <eperezma@redhat.com>

I didn't ack it, but I'm ok with it, so:

Acked-by: Eugenio Pérez <eperezma@redhat.com>

Thanks!

> ---
>  drivers/vdpa/mlx5/net/mlx5_vnet.c  | 46 ++++++++++++++++++++++++------
>  include/linux/mlx5/mlx5_ifc_vdpa.h |  1 +
>  2 files changed, 39 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index d6c8506cec8f..6a21223d97a8 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1206,6 +1206,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
>  {
>         int inlen = MLX5_ST_SZ_BYTES(modify_virtio_net_q_in);
>         u32 out[MLX5_ST_SZ_DW(modify_virtio_net_q_out)] = {};
> +       struct mlx5_vdpa_dev *mvdev = &ndev->mvdev;
>         bool state_change = false;
>         void *obj_context;
>         void *cmd_hdr;
> @@ -1255,6 +1256,24 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
>         if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_USED_IDX)
>                 MLX5_SET(virtio_net_q_object, obj_context, hw_used_index, mvq->used_idx);
>
> +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY) {
> +               struct mlx5_vdpa_mr *mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]];
> +
> +               if (mr)
> +                       MLX5_SET(virtio_q, vq_ctx, virtio_q_mkey, mr->mkey);
> +               else
> +                       mvq->modified_fields &= ~MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY;
> +       }
> +
> +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY) {
> +               struct mlx5_vdpa_mr *mr = mvdev->mr[mvdev->group2asid[MLX5_VDPA_DATAVQ_DESC_GROUP]];
> +
> +               if (mr && MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, desc_group_mkey_supported))
> +                       MLX5_SET(virtio_q, vq_ctx, desc_group_mkey, mr->mkey);
> +               else
> +                       mvq->modified_fields &= ~MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY;
> +       }
> +
>         MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
>         err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
>         if (err)
> @@ -2784,24 +2803,35 @@ static int mlx5_vdpa_change_map(struct mlx5_vdpa_dev *mvdev,
>                                 unsigned int asid)
>  {
>         struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> +       bool teardown = !is_resumable(ndev);
>         int err;
>
>         suspend_vqs(ndev);
> -       err = save_channels_info(ndev);
> -       if (err)
> -               return err;
> +       if (teardown) {
> +               err = save_channels_info(ndev);
> +               if (err)
> +                       return err;
>
> -       teardown_driver(ndev);
> +               teardown_driver(ndev);
> +       }
>
>         mlx5_vdpa_update_mr(mvdev, new_mr, asid);
>
> +       for (int i = 0; i < ndev->cur_num_vqs; i++)
> +               ndev->vqs[i].modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY |
> +                                               MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY;
> +
>         if (!(mvdev->status & VIRTIO_CONFIG_S_DRIVER_OK) || mvdev->suspended)
>                 return 0;
>
> -       restore_channels_info(ndev);
> -       err = setup_driver(mvdev);
> -       if (err)
> -               return err;
> +       if (teardown) {
> +               restore_channels_info(ndev);
> +               err = setup_driver(mvdev);
> +               if (err)
> +                       return err;
> +       }
> +
> +       resume_vqs(ndev);
>
>         return 0;
>  }
> diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> index 32e712106e68..40371c916cf9 100644
> --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> @@ -148,6 +148,7 @@ enum {
>         MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
>         MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_AVAIL_IDX       = (u64)1 << 7,
>         MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_USED_IDX        = (u64)1 << 8,
> +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_MKEY            = (u64)1 << 11,
>         MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
>  };
>
> --
> 2.42.0
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-12 19:21   ` Eugenio Perez Martin
@ 2023-12-12 19:44     ` Dragos Tatulea
  2023-12-12 23:44     ` Si-Wei Liu
  1 sibling, 0 replies; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-12 19:44 UTC (permalink / raw)
  To: eperezma
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, kvm, mst, Saeed Mahameed, jasowang,
	leon

On Tue, 2023-12-12 at 20:21 +0100, Eugenio Perez Martin wrote:
> On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > 
> > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > next modify_virtqueue.
> > 
> > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> 
> I'm kind of ok with this patch and the next one about state, but I
> didn't ack them in the previous series.
> 
Sorry about the Ack misplacement. I got confused.

> My main concern is that it is not valid to change the vq address after
> DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> change at this moment. I'm not sure about vq state in vDPA, but vhost
> forbids changing it with an active backend.
> 
> Suspend is not defined in VirtIO at this moment though, so maybe it is
> ok to decide that all of these parameters may change during suspend.
> Maybe the best thing is to protect this with a vDPA feature flag.
> 
> Jason, what do you think?
> 
> Thanks!
> 
> > ---
> >  drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
> >  include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
> >  2 files changed, 10 insertions(+)
> > 
> > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > index f8f088cced50..80e066de0866 100644
> > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> >         bool state_change = false;
> >         void *obj_context;
> >         void *cmd_hdr;
> > +       void *vq_ctx;
> >         void *in;
> >         int err;
> > 
> > @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> >         MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
> > 
> >         obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> > +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
> > 
> >         if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
> >                 if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> > @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> >                 state_change = true;
> >         }
> > 
> > +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> > +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> > +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> > +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> > +       }
> > +
> >         MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
> >         err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
> >         if (err)
> > @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
> >         mvq->desc_addr = desc_area;
> >         mvq->device_addr = device_area;
> >         mvq->driver_addr = driver_area;
> > +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
> >         return 0;
> >  }
> > 
> > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > index b86d51a855f6..9594ac405740 100644
> > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > @@ -145,6 +145,7 @@ enum {
> >         MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
> >         MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
> >         MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
> >         MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
> >  };
> > 
> > --
> > 2.42.0
> > 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-12 19:21   ` Eugenio Perez Martin
  2023-12-12 19:44     ` Dragos Tatulea
@ 2023-12-12 23:44     ` Si-Wei Liu
  2023-12-14 13:39       ` Dragos Tatulea
  1 sibling, 1 reply; 31+ messages in thread
From: Si-Wei Liu @ 2023-12-12 23:44 UTC (permalink / raw)
  To: Eugenio Perez Martin, Dragos Tatulea
  Cc: Michael S . Tsirkin, Jason Wang, Saeed Mahameed, Leon Romanovsky,
	virtualization, Gal Pressman, kvm, linux-kernel, Parav Pandit,
	Xuan Zhuo



On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>> Addresses get set by .set_vq_address. hw vq addresses will be updated on
>> next modify_virtqueue.
>>
>> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
>> Reviewed-by: Gal Pressman <gal@nvidia.com>
>> Acked-by: Eugenio Pérez <eperezma@redhat.com>
> I'm kind of ok with this patch and the next one about state, but I
> didn't ack them in the previous series.
>
> My main concern is that it is not valid to change the vq address after
> DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> change at this moment. I'm not sure about vq state in vDPA, but vhost
> forbids changing it with an active backend.
>
> Suspend is not defined in VirtIO at this moment though, so maybe it is
> ok to decide that all of these parameters may change during suspend.
> Maybe the best thing is to protect this with a vDPA feature flag.
I think protect with vDPA feature flag could work, while on the other 
hand vDPA means vendor specific optimization is possible around suspend 
and resume (in case it helps performance), which doesn't have to be 
backed by virtio spec. Same applies to vhost user backend features, 
variations there were not backed by spec either. Of course, we should 
try best to make the default behavior backward compatible with 
virtio-based backend, but that circles back to no suspend definition in 
the current virtio spec, for which I hope we don't cease development on 
vDPA indefinitely. After all, the virtio based vdap backend can well 
define its own feature flag to describe (minor difference in) the 
suspend behavior based on the later spec once it is formed in future.

Regards,
-Siwei



>
> Jason, what do you think?
>
> Thanks!
>
>> ---
>>   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
>>   include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
>>   2 files changed, 10 insertions(+)
>>
>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> index f8f088cced50..80e066de0866 100644
>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
>>          bool state_change = false;
>>          void *obj_context;
>>          void *cmd_hdr;
>> +       void *vq_ctx;
>>          void *in;
>>          int err;
>>
>> @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
>>          MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
>>
>>          obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
>> +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
>>
>>          if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
>>                  if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
>> @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
>>                  state_change = true;
>>          }
>>
>> +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
>> +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
>> +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
>> +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
>> +       }
>> +
>>          MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
>>          err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
>>          if (err)
>> @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
>>          mvq->desc_addr = desc_area;
>>          mvq->device_addr = device_area;
>>          mvq->driver_addr = driver_area;
>> +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
>>          return 0;
>>   }
>>
>> diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
>> index b86d51a855f6..9594ac405740 100644
>> --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
>> +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
>> @@ -145,6 +145,7 @@ enum {
>>          MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
>>          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
>>          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
>> +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
>>          MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
>>   };
>>
>> --
>> 2.42.0
>>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-12 23:44     ` Si-Wei Liu
@ 2023-12-14 13:39       ` Dragos Tatulea
  2023-12-14 13:45         ` Michael S. Tsirkin
  0 siblings, 1 reply; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-14 13:39 UTC (permalink / raw)
  To: si-wei.liu, eperezma
  Cc: xuanzhuo, Parav Pandit, virtualization, Gal Pressman,
	linux-kernel, kvm, mst, Saeed Mahameed, jasowang, leon

On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> 
> On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > next modify_virtqueue.
> > > 
> > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > I'm kind of ok with this patch and the next one about state, but I
> > didn't ack them in the previous series.
> > 
> > My main concern is that it is not valid to change the vq address after
> > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > forbids changing it with an active backend.
> > 
> > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > ok to decide that all of these parameters may change during suspend.
> > Maybe the best thing is to protect this with a vDPA feature flag.
> I think protect with vDPA feature flag could work, while on the other 
> hand vDPA means vendor specific optimization is possible around suspend 
> and resume (in case it helps performance), which doesn't have to be 
> backed by virtio spec. Same applies to vhost user backend features, 
> variations there were not backed by spec either. Of course, we should 
> try best to make the default behavior backward compatible with 
> virtio-based backend, but that circles back to no suspend definition in 
> the current virtio spec, for which I hope we don't cease development on 
> vDPA indefinitely. After all, the virtio based vdap backend can well 
> define its own feature flag to describe (minor difference in) the 
> suspend behavior based on the later spec once it is formed in future.
> 
So what is the way forward here? From what I understand the options are:

1) Add a vdpa feature flag for changing device properties while suspended.

2) Drop these 2 patches from the series for now. Not sure if this makes sense as
this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
code won't work anymore. This means the series would be less well tested.

Are there other possible options? What do you think?

[0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip

Thanks,
Dragos

> Regards,
> -Siwei
> 
> 
> 
> > 
> > Jason, what do you think?
> > 
> > Thanks!
> > 
> > > ---
> > >   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
> > >   include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
> > >   2 files changed, 10 insertions(+)
> > > 
> > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > index f8f088cced50..80e066de0866 100644
> > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > >          bool state_change = false;
> > >          void *obj_context;
> > >          void *cmd_hdr;
> > > +       void *vq_ctx;
> > >          void *in;
> > >          int err;
> > > 
> > > @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > >          MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
> > > 
> > >          obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> > > +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
> > > 
> > >          if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
> > >                  if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> > > @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > >                  state_change = true;
> > >          }
> > > 
> > > +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> > > +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> > > +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> > > +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> > > +       }
> > > +
> > >          MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
> > >          err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
> > >          if (err)
> > > @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
> > >          mvq->desc_addr = desc_area;
> > >          mvq->device_addr = device_area;
> > >          mvq->driver_addr = driver_area;
> > > +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
> > >          return 0;
> > >   }
> > > 
> > > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > index b86d51a855f6..9594ac405740 100644
> > > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > @@ -145,6 +145,7 @@ enum {
> > >          MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
> > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
> > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > > +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
> > >          MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
> > >   };
> > > 
> > > --
> > > 2.42.0
> > > 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-14 13:39       ` Dragos Tatulea
@ 2023-12-14 13:45         ` Michael S. Tsirkin
  2023-12-14 15:51           ` Dragos Tatulea
  0 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2023-12-14 13:45 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: si-wei.liu, eperezma, xuanzhuo, Parav Pandit, virtualization,
	Gal Pressman, linux-kernel, kvm, Saeed Mahameed, jasowang, leon

On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > 
> > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > next modify_virtqueue.
> > > > 
> > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > I'm kind of ok with this patch and the next one about state, but I
> > > didn't ack them in the previous series.
> > > 
> > > My main concern is that it is not valid to change the vq address after
> > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > forbids changing it with an active backend.
> > > 
> > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > ok to decide that all of these parameters may change during suspend.
> > > Maybe the best thing is to protect this with a vDPA feature flag.
> > I think protect with vDPA feature flag could work, while on the other 
> > hand vDPA means vendor specific optimization is possible around suspend 
> > and resume (in case it helps performance), which doesn't have to be 
> > backed by virtio spec. Same applies to vhost user backend features, 
> > variations there were not backed by spec either. Of course, we should 
> > try best to make the default behavior backward compatible with 
> > virtio-based backend, but that circles back to no suspend definition in 
> > the current virtio spec, for which I hope we don't cease development on 
> > vDPA indefinitely. After all, the virtio based vdap backend can well 
> > define its own feature flag to describe (minor difference in) the 
> > suspend behavior based on the later spec once it is formed in future.
> > 
> So what is the way forward here? From what I understand the options are:
> 
> 1) Add a vdpa feature flag for changing device properties while suspended.
> 
> 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> code won't work anymore. This means the series would be less well tested.
> 
> Are there other possible options? What do you think?
> 
> [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip

I am fine with either of these.

> Thanks,
> Dragos
> 
> > Regards,
> > -Siwei
> > 
> > 
> > 
> > > 
> > > Jason, what do you think?
> > > 
> > > Thanks!
> > > 
> > > > ---
> > > >   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
> > > >   include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
> > > >   2 files changed, 10 insertions(+)
> > > > 
> > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > index f8f088cced50..80e066de0866 100644
> > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > >          bool state_change = false;
> > > >          void *obj_context;
> > > >          void *cmd_hdr;
> > > > +       void *vq_ctx;
> > > >          void *in;
> > > >          int err;
> > > > 
> > > > @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > >          MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
> > > > 
> > > >          obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> > > > +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
> > > > 
> > > >          if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
> > > >                  if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> > > > @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > >                  state_change = true;
> > > >          }
> > > > 
> > > > +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> > > > +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> > > > +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> > > > +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> > > > +       }
> > > > +
> > > >          MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
> > > >          err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
> > > >          if (err)
> > > > @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
> > > >          mvq->desc_addr = desc_area;
> > > >          mvq->device_addr = device_area;
> > > >          mvq->driver_addr = driver_area;
> > > > +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
> > > >          return 0;
> > > >   }
> > > > 
> > > > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > index b86d51a855f6..9594ac405740 100644
> > > > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > @@ -145,6 +145,7 @@ enum {
> > > >          MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
> > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
> > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > > > +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
> > > >          MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
> > > >   };
> > > > 
> > > > --
> > > > 2.42.0
> > > > 
> > 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-14 13:45         ` Michael S. Tsirkin
@ 2023-12-14 15:51           ` Dragos Tatulea
  2023-12-14 18:30             ` Eugenio Perez Martin
  0 siblings, 1 reply; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-14 15:51 UTC (permalink / raw)
  To: mst
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, eperezma, virtualization,
	linux-kernel, si-wei.liu, kvm, jasowang, Saeed Mahameed, leon

On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > 
> > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > next modify_virtqueue.
> > > > > 
> > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > I'm kind of ok with this patch and the next one about state, but I
> > > > didn't ack them in the previous series.
> > > > 
> > > > My main concern is that it is not valid to change the vq address after
> > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > forbids changing it with an active backend.
> > > > 
> > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > ok to decide that all of these parameters may change during suspend.
> > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > I think protect with vDPA feature flag could work, while on the other 
> > > hand vDPA means vendor specific optimization is possible around suspend 
> > > and resume (in case it helps performance), which doesn't have to be 
> > > backed by virtio spec. Same applies to vhost user backend features, 
> > > variations there were not backed by spec either. Of course, we should 
> > > try best to make the default behavior backward compatible with 
> > > virtio-based backend, but that circles back to no suspend definition in 
> > > the current virtio spec, for which I hope we don't cease development on 
> > > vDPA indefinitely. After all, the virtio based vdap backend can well 
> > > define its own feature flag to describe (minor difference in) the 
> > > suspend behavior based on the later spec once it is formed in future.
> > > 
> > So what is the way forward here? From what I understand the options are:
> > 
> > 1) Add a vdpa feature flag for changing device properties while suspended.
> > 
> > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > code won't work anymore. This means the series would be less well tested.
> > 
> > Are there other possible options? What do you think?
> > 
> > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> 
> I am fine with either of these.
> 
How about allowing the change only under the following conditions:
  vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&  
VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set

?

Thanks,
Dragos

> > Thanks,
> > Dragos
> > 
> > > Regards,
> > > -Siwei
> > > 
> > > 
> > > 
> > > > 
> > > > Jason, what do you think?
> > > > 
> > > > Thanks!
> > > > 
> > > > > ---
> > > > >   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
> > > > >   include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
> > > > >   2 files changed, 10 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > index f8f088cced50..80e066de0866 100644
> > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > >          bool state_change = false;
> > > > >          void *obj_context;
> > > > >          void *cmd_hdr;
> > > > > +       void *vq_ctx;
> > > > >          void *in;
> > > > >          int err;
> > > > > 
> > > > > @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > >          MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
> > > > > 
> > > > >          obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> > > > > +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
> > > > > 
> > > > >          if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
> > > > >                  if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> > > > > @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > >                  state_change = true;
> > > > >          }
> > > > > 
> > > > > +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> > > > > +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> > > > > +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> > > > > +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> > > > > +       }
> > > > > +
> > > > >          MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
> > > > >          err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
> > > > >          if (err)
> > > > > @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
> > > > >          mvq->desc_addr = desc_area;
> > > > >          mvq->device_addr = device_area;
> > > > >          mvq->driver_addr = driver_area;
> > > > > +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
> > > > >          return 0;
> > > > >   }
> > > > > 
> > > > > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > index b86d51a855f6..9594ac405740 100644
> > > > > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > @@ -145,6 +145,7 @@ enum {
> > > > >          MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
> > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
> > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > > > > +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
> > > > >          MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
> > > > >   };
> > > > > 
> > > > > --
> > > > > 2.42.0
> > > > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-14 15:51           ` Dragos Tatulea
@ 2023-12-14 18:30             ` Eugenio Perez Martin
  2023-12-15 12:35               ` Dragos Tatulea
  0 siblings, 1 reply; 31+ messages in thread
From: Eugenio Perez Martin @ 2023-12-14 18:30 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: mst, xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, kvm, jasowang, Saeed Mahameed, leon

On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > >
> > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > next modify_virtqueue.
> > > > > >
> > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > didn't ack them in the previous series.
> > > > >
> > > > > My main concern is that it is not valid to change the vq address after
> > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > forbids changing it with an active backend.
> > > > >
> > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > ok to decide that all of these parameters may change during suspend.
> > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > I think protect with vDPA feature flag could work, while on the other
> > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > and resume (in case it helps performance), which doesn't have to be
> > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > variations there were not backed by spec either. Of course, we should
> > > > try best to make the default behavior backward compatible with
> > > > virtio-based backend, but that circles back to no suspend definition in
> > > > the current virtio spec, for which I hope we don't cease development on
> > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > define its own feature flag to describe (minor difference in) the
> > > > suspend behavior based on the later spec once it is formed in future.
> > > >
> > > So what is the way forward here? From what I understand the options are:
> > >
> > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > >
> > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > code won't work anymore. This means the series would be less well tested.
> > >
> > > Are there other possible options? What do you think?
> > >
> > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> >
> > I am fine with either of these.
> >
> How about allowing the change only under the following conditions:
>   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
>
> ?

I think the best option by far is 1, as there is no hint in the
combination of these 3 indicating that you can change device
properties in the suspended state.

>
> Thanks,
> Dragos
>
> > > Thanks,
> > > Dragos
> > >
> > > > Regards,
> > > > -Siwei
> > > >
> > > >
> > > >
> > > > >
> > > > > Jason, what do you think?
> > > > >
> > > > > Thanks!
> > > > >
> > > > > > ---
> > > > > >   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
> > > > > >   include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
> > > > > >   2 files changed, 10 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > index f8f088cced50..80e066de0866 100644
> > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > >          bool state_change = false;
> > > > > >          void *obj_context;
> > > > > >          void *cmd_hdr;
> > > > > > +       void *vq_ctx;
> > > > > >          void *in;
> > > > > >          int err;
> > > > > >
> > > > > > @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > >          MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
> > > > > >
> > > > > >          obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> > > > > > +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
> > > > > >
> > > > > >          if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
> > > > > >                  if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> > > > > > @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > >                  state_change = true;
> > > > > >          }
> > > > > >
> > > > > > +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> > > > > > +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> > > > > > +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> > > > > > +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> > > > > > +       }
> > > > > > +
> > > > > >          MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
> > > > > >          err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
> > > > > >          if (err)
> > > > > > @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
> > > > > >          mvq->desc_addr = desc_area;
> > > > > >          mvq->device_addr = device_area;
> > > > > >          mvq->driver_addr = driver_area;
> > > > > > +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
> > > > > >          return 0;
> > > > > >   }
> > > > > >
> > > > > > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > index b86d51a855f6..9594ac405740 100644
> > > > > > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > @@ -145,6 +145,7 @@ enum {
> > > > > >          MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
> > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
> > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > > > > > +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
> > > > > >          MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
> > > > > >   };
> > > > > >
> > > > > > --
> > > > > > 2.42.0
> > > > > >
> > > >
> > >
> >
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-14 18:30             ` Eugenio Perez Martin
@ 2023-12-15 12:35               ` Dragos Tatulea
  2023-12-15 14:13                 ` Dragos Tatulea
  0 siblings, 1 reply; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-15 12:35 UTC (permalink / raw)
  To: eperezma
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, kvm, mst, Saeed Mahameed, jasowang,
	leon

On Thu, 2023-12-14 at 19:30 +0100, Eugenio Perez Martin wrote:
> On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > 
> > On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > > > 
> > > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > > next modify_virtqueue.
> > > > > > > 
> > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > > didn't ack them in the previous series.
> > > > > > 
> > > > > > My main concern is that it is not valid to change the vq address after
> > > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > > forbids changing it with an active backend.
> > > > > > 
> > > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > > ok to decide that all of these parameters may change during suspend.
> > > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > > I think protect with vDPA feature flag could work, while on the other
> > > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > > and resume (in case it helps performance), which doesn't have to be
> > > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > > variations there were not backed by spec either. Of course, we should
> > > > > try best to make the default behavior backward compatible with
> > > > > virtio-based backend, but that circles back to no suspend definition in
> > > > > the current virtio spec, for which I hope we don't cease development on
> > > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > > define its own feature flag to describe (minor difference in) the
> > > > > suspend behavior based on the later spec once it is formed in future.
> > > > > 
> > > > So what is the way forward here? From what I understand the options are:
> > > > 
> > > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > > > 
> > > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > > code won't work anymore. This means the series would be less well tested.
> > > > 
> > > > Are there other possible options? What do you think?
> > > > 
> > > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> > > 
> > > I am fine with either of these.
> > > 
> > How about allowing the change only under the following conditions:
> >   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
> > 
> > ?
> 
> I think the best option by far is 1, as there is no hint in the
> combination of these 3 indicating that you can change device
> properties in the suspended state.
> 
Sure. Will respin a v3 without these two patches.

Another series can implement option 2 and add these 2 patches on top.

> > 
> > Thanks,
> > Dragos
> > 
> > > > Thanks,
> > > > Dragos
> > > > 
> > > > > Regards,
> > > > > -Siwei
> > > > > 
> > > > > 
> > > > > 
> > > > > > 
> > > > > > Jason, what do you think?
> > > > > > 
> > > > > > Thanks!
> > > > > > 
> > > > > > > ---
> > > > > > >   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
> > > > > > >   include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
> > > > > > >   2 files changed, 10 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > index f8f088cced50..80e066de0866 100644
> > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > >          bool state_change = false;
> > > > > > >          void *obj_context;
> > > > > > >          void *cmd_hdr;
> > > > > > > +       void *vq_ctx;
> > > > > > >          void *in;
> > > > > > >          int err;
> > > > > > > 
> > > > > > > @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > >          MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
> > > > > > > 
> > > > > > >          obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> > > > > > > +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
> > > > > > > 
> > > > > > >          if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
> > > > > > >                  if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> > > > > > > @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > >                  state_change = true;
> > > > > > >          }
> > > > > > > 
> > > > > > > +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> > > > > > > +       }
> > > > > > > +
> > > > > > >          MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
> > > > > > >          err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
> > > > > > >          if (err)
> > > > > > > @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
> > > > > > >          mvq->desc_addr = desc_area;
> > > > > > >          mvq->device_addr = device_area;
> > > > > > >          mvq->driver_addr = driver_area;
> > > > > > > +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
> > > > > > >          return 0;
> > > > > > >   }
> > > > > > > 
> > > > > > > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > index b86d51a855f6..9594ac405740 100644
> > > > > > > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > @@ -145,6 +145,7 @@ enum {
> > > > > > >          MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
> > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
> > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > > > > > > +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
> > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
> > > > > > >   };
> > > > > > > 
> > > > > > > --
> > > > > > > 2.42.0
> > > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-15 12:35               ` Dragos Tatulea
@ 2023-12-15 14:13                 ` Dragos Tatulea
  2023-12-15 17:56                   ` Eugenio Perez Martin
  0 siblings, 1 reply; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-15 14:13 UTC (permalink / raw)
  To: eperezma
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, mst, kvm, Saeed Mahameed, jasowang,
	leon

On Fri, 2023-12-15 at 12:35 +0000, Dragos Tatulea wrote:
> On Thu, 2023-12-14 at 19:30 +0100, Eugenio Perez Martin wrote:
> > On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > 
> > > On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > > > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > > > > 
> > > > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > > > next modify_virtqueue.
> > > > > > > > 
> > > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > > > didn't ack them in the previous series.
> > > > > > > 
> > > > > > > My main concern is that it is not valid to change the vq address after
> > > > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > > > forbids changing it with an active backend.
> > > > > > > 
> > > > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > > > ok to decide that all of these parameters may change during suspend.
> > > > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > > > I think protect with vDPA feature flag could work, while on the other
> > > > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > > > and resume (in case it helps performance), which doesn't have to be
> > > > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > > > variations there were not backed by spec either. Of course, we should
> > > > > > try best to make the default behavior backward compatible with
> > > > > > virtio-based backend, but that circles back to no suspend definition in
> > > > > > the current virtio spec, for which I hope we don't cease development on
> > > > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > > > define its own feature flag to describe (minor difference in) the
> > > > > > suspend behavior based on the later spec once it is formed in future.
> > > > > > 
> > > > > So what is the way forward here? From what I understand the options are:
> > > > > 
> > > > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > > > > 
> > > > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > > > code won't work anymore. This means the series would be less well tested.
> > > > > 
> > > > > Are there other possible options? What do you think?
> > > > > 
> > > > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> > > > 
> > > > I am fine with either of these.
> > > > 
> > > How about allowing the change only under the following conditions:
> > >   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
> > > 
> > > ?
> > 
> > I think the best option by far is 1, as there is no hint in the
> > combination of these 3 indicating that you can change device
> > properties in the suspended state.
> > 
> Sure. Will respin a v3 without these two patches.
> 
> Another series can implement option 2 and add these 2 patches on top.
Hmm...I misunderstood your statement and sent a erroneus v3. You said that
having a feature flag is the best option.

Will add a feature flag in v4: is this similar to the
VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag?

Thanks,
Dragos 

> > > Thanks,
> > > Dragos
> > > 
> > > > > Thanks,
> > > > > Dragos
> > > > > 
> > > > > > Regards,
> > > > > > -Siwei
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > Jason, what do you think?
> > > > > > > 
> > > > > > > Thanks!
> > > > > > > 
> > > > > > > > ---
> > > > > > > >   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
> > > > > > > >   include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
> > > > > > > >   2 files changed, 10 insertions(+)
> > > > > > > > 
> > > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > index f8f088cced50..80e066de0866 100644
> > > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > >          bool state_change = false;
> > > > > > > >          void *obj_context;
> > > > > > > >          void *cmd_hdr;
> > > > > > > > +       void *vq_ctx;
> > > > > > > >          void *in;
> > > > > > > >          int err;
> > > > > > > > 
> > > > > > > > @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > >          MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
> > > > > > > > 
> > > > > > > >          obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> > > > > > > > +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
> > > > > > > > 
> > > > > > > >          if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
> > > > > > > >                  if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> > > > > > > > @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > >                  state_change = true;
> > > > > > > >          }
> > > > > > > > 
> > > > > > > > +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> > > > > > > > +       }
> > > > > > > > +
> > > > > > > >          MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
> > > > > > > >          err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
> > > > > > > >          if (err)
> > > > > > > > @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
> > > > > > > >          mvq->desc_addr = desc_area;
> > > > > > > >          mvq->device_addr = device_area;
> > > > > > > >          mvq->driver_addr = driver_area;
> > > > > > > > +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
> > > > > > > >          return 0;
> > > > > > > >   }
> > > > > > > > 
> > > > > > > > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > index b86d51a855f6..9594ac405740 100644
> > > > > > > > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > @@ -145,6 +145,7 @@ enum {
> > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
> > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
> > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > > > > > > > +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
> > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
> > > > > > > >   };
> > > > > > > > 
> > > > > > > > --
> > > > > > > > 2.42.0
> > > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-15 14:13                 ` Dragos Tatulea
@ 2023-12-15 17:56                   ` Eugenio Perez Martin
  2023-12-16 11:03                     ` Dragos Tatulea
  0 siblings, 1 reply; 31+ messages in thread
From: Eugenio Perez Martin @ 2023-12-15 17:56 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, mst, kvm, Saeed Mahameed, jasowang,
	leon

On Fri, Dec 15, 2023 at 3:13 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> On Fri, 2023-12-15 at 12:35 +0000, Dragos Tatulea wrote:
> > On Thu, 2023-12-14 at 19:30 +0100, Eugenio Perez Martin wrote:
> > > On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > >
> > > > On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > > > > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > > > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > > > > >
> > > > > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > > > > next modify_virtqueue.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > > > > didn't ack them in the previous series.
> > > > > > > >
> > > > > > > > My main concern is that it is not valid to change the vq address after
> > > > > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > > > > forbids changing it with an active backend.
> > > > > > > >
> > > > > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > > > > ok to decide that all of these parameters may change during suspend.
> > > > > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > > > > I think protect with vDPA feature flag could work, while on the other
> > > > > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > > > > and resume (in case it helps performance), which doesn't have to be
> > > > > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > > > > variations there were not backed by spec either. Of course, we should
> > > > > > > try best to make the default behavior backward compatible with
> > > > > > > virtio-based backend, but that circles back to no suspend definition in
> > > > > > > the current virtio spec, for which I hope we don't cease development on
> > > > > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > > > > define its own feature flag to describe (minor difference in) the
> > > > > > > suspend behavior based on the later spec once it is formed in future.
> > > > > > >
> > > > > > So what is the way forward here? From what I understand the options are:
> > > > > >
> > > > > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > > > > >
> > > > > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > > > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > > > > code won't work anymore. This means the series would be less well tested.
> > > > > >
> > > > > > Are there other possible options? What do you think?
> > > > > >
> > > > > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> > > > >
> > > > > I am fine with either of these.
> > > > >
> > > > How about allowing the change only under the following conditions:
> > > >   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
> > > >
> > > > ?
> > >
> > > I think the best option by far is 1, as there is no hint in the
> > > combination of these 3 indicating that you can change device
> > > properties in the suspended state.
> > >
> > Sure. Will respin a v3 without these two patches.
> >
> > Another series can implement option 2 and add these 2 patches on top.
> Hmm...I misunderstood your statement and sent a erroneus v3. You said that
> having a feature flag is the best option.
>
> Will add a feature flag in v4: is this similar to the
> VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag?
>

Right, it should be easy to return it from .get_backend_features op if
the FW returns that capability, isn't it?

> Thanks,
> Dragos
>
> > > > Thanks,
> > > > Dragos
> > > >
> > > > > > Thanks,
> > > > > > Dragos
> > > > > >
> > > > > > > Regards,
> > > > > > > -Siwei
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Jason, what do you think?
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > > ---
> > > > > > > > >   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
> > > > > > > > >   include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
> > > > > > > > >   2 files changed, 10 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > index f8f088cced50..80e066de0866 100644
> > > > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > >          bool state_change = false;
> > > > > > > > >          void *obj_context;
> > > > > > > > >          void *cmd_hdr;
> > > > > > > > > +       void *vq_ctx;
> > > > > > > > >          void *in;
> > > > > > > > >          int err;
> > > > > > > > >
> > > > > > > > > @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > >          MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
> > > > > > > > >
> > > > > > > > >          obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> > > > > > > > > +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
> > > > > > > > >
> > > > > > > > >          if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
> > > > > > > > >                  if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> > > > > > > > > @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > >                  state_change = true;
> > > > > > > > >          }
> > > > > > > > >
> > > > > > > > > +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> > > > > > > > > +       }
> > > > > > > > > +
> > > > > > > > >          MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
> > > > > > > > >          err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
> > > > > > > > >          if (err)
> > > > > > > > > @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
> > > > > > > > >          mvq->desc_addr = desc_area;
> > > > > > > > >          mvq->device_addr = device_area;
> > > > > > > > >          mvq->driver_addr = driver_area;
> > > > > > > > > +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
> > > > > > > > >          return 0;
> > > > > > > > >   }
> > > > > > > > >
> > > > > > > > > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > index b86d51a855f6..9594ac405740 100644
> > > > > > > > > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > @@ -145,6 +145,7 @@ enum {
> > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
> > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
> > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > > > > > > > > +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
> > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
> > > > > > > > >   };
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.42.0
> > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-15 17:56                   ` Eugenio Perez Martin
@ 2023-12-16 11:03                     ` Dragos Tatulea
  2023-12-18 10:16                       ` Eugenio Perez Martin
  0 siblings, 1 reply; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-16 11:03 UTC (permalink / raw)
  To: eperezma
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, kvm, mst, Saeed Mahameed, jasowang,
	leon

On Fri, 2023-12-15 at 18:56 +0100, Eugenio Perez Martin wrote:
> On Fri, Dec 15, 2023 at 3:13 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > 
> > On Fri, 2023-12-15 at 12:35 +0000, Dragos Tatulea wrote:
> > > On Thu, 2023-12-14 at 19:30 +0100, Eugenio Perez Martin wrote:
> > > > On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > 
> > > > > On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > > > > > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > > > > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > > > > > > 
> > > > > > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > > > > > next modify_virtqueue.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > > > > > didn't ack them in the previous series.
> > > > > > > > > 
> > > > > > > > > My main concern is that it is not valid to change the vq address after
> > > > > > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > > > > > forbids changing it with an active backend.
> > > > > > > > > 
> > > > > > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > > > > > ok to decide that all of these parameters may change during suspend.
> > > > > > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > > > > > I think protect with vDPA feature flag could work, while on the other
> > > > > > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > > > > > and resume (in case it helps performance), which doesn't have to be
> > > > > > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > > > > > variations there were not backed by spec either. Of course, we should
> > > > > > > > try best to make the default behavior backward compatible with
> > > > > > > > virtio-based backend, but that circles back to no suspend definition in
> > > > > > > > the current virtio spec, for which I hope we don't cease development on
> > > > > > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > > > > > define its own feature flag to describe (minor difference in) the
> > > > > > > > suspend behavior based on the later spec once it is formed in future.
> > > > > > > > 
> > > > > > > So what is the way forward here? From what I understand the options are:
> > > > > > > 
> > > > > > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > > > > > > 
> > > > > > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > > > > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > > > > > code won't work anymore. This means the series would be less well tested.
> > > > > > > 
> > > > > > > Are there other possible options? What do you think?
> > > > > > > 
> > > > > > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> > > > > > 
> > > > > > I am fine with either of these.
> > > > > > 
> > > > > How about allowing the change only under the following conditions:
> > > > >   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
> > > > > 
> > > > > ?
> > > > 
> > > > I think the best option by far is 1, as there is no hint in the
> > > > combination of these 3 indicating that you can change device
> > > > properties in the suspended state.
> > > > 
> > > Sure. Will respin a v3 without these two patches.
> > > 
> > > Another series can implement option 2 and add these 2 patches on top.
> > Hmm...I misunderstood your statement and sent a erroneus v3. You said that
> > having a feature flag is the best option.
> > 
> > Will add a feature flag in v4: is this similar to the
> > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag?
> > 
> 
> Right, it should be easy to return it from .get_backend_features op if
> the FW returns that capability, isn't it?
> 
Yes, that's easy. But I wonder if we need one feature bit for each type of 
change:
- VHOST_BACKEND_F_CHANGEABLE_VQ_ADDR_IN_SUSPEND
- VHOST_BACKEND_F_CHANGEABLE_VQ_STATE_IN_SUSPEND

Or would a big one VHOST_BACKEND_F_CAN_RECONFIG_VQ_IN_SUSPEND suffice?

To me having individual feature bits makes sense. But it could also takes too
many bits if more changes are required.

Thanks,
Dragos

> > Thanks,
> > Dragos
> > 
> > > > > Thanks,
> > > > > Dragos
> > > > > 
> > > > > > > Thanks,
> > > > > > > Dragos
> > > > > > > 
> > > > > > > > Regards,
> > > > > > > > -Siwei
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Jason, what do you think?
> > > > > > > > > 
> > > > > > > > > Thanks!
> > > > > > > > > 
> > > > > > > > > > ---
> > > > > > > > > >   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
> > > > > > > > > >   include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
> > > > > > > > > >   2 files changed, 10 insertions(+)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > > index f8f088cced50..80e066de0866 100644
> > > > > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > > @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > > >          bool state_change = false;
> > > > > > > > > >          void *obj_context;
> > > > > > > > > >          void *cmd_hdr;
> > > > > > > > > > +       void *vq_ctx;
> > > > > > > > > >          void *in;
> > > > > > > > > >          int err;
> > > > > > > > > > 
> > > > > > > > > > @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > > >          MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
> > > > > > > > > > 
> > > > > > > > > >          obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> > > > > > > > > > +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
> > > > > > > > > > 
> > > > > > > > > >          if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
> > > > > > > > > >                  if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> > > > > > > > > > @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > > >                  state_change = true;
> > > > > > > > > >          }
> > > > > > > > > > 
> > > > > > > > > > +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> > > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> > > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> > > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> > > > > > > > > > +       }
> > > > > > > > > > +
> > > > > > > > > >          MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
> > > > > > > > > >          err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
> > > > > > > > > >          if (err)
> > > > > > > > > > @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
> > > > > > > > > >          mvq->desc_addr = desc_area;
> > > > > > > > > >          mvq->device_addr = device_area;
> > > > > > > > > >          mvq->driver_addr = driver_area;
> > > > > > > > > > +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
> > > > > > > > > >          return 0;
> > > > > > > > > >   }
> > > > > > > > > > 
> > > > > > > > > > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > > index b86d51a855f6..9594ac405740 100644
> > > > > > > > > > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > > @@ -145,6 +145,7 @@ enum {
> > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
> > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
> > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > > > > > > > > > +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
> > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
> > > > > > > > > >   };
> > > > > > > > > > 
> > > > > > > > > > --
> > > > > > > > > > 2.42.0
> > > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-16 11:03                     ` Dragos Tatulea
@ 2023-12-18 10:16                       ` Eugenio Perez Martin
  2023-12-18 10:52                         ` Dragos Tatulea
  0 siblings, 1 reply; 31+ messages in thread
From: Eugenio Perez Martin @ 2023-12-18 10:16 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, kvm, mst, Saeed Mahameed, jasowang,
	leon

On Sat, Dec 16, 2023 at 12:03 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> On Fri, 2023-12-15 at 18:56 +0100, Eugenio Perez Martin wrote:
> > On Fri, Dec 15, 2023 at 3:13 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > >
> > > On Fri, 2023-12-15 at 12:35 +0000, Dragos Tatulea wrote:
> > > > On Thu, 2023-12-14 at 19:30 +0100, Eugenio Perez Martin wrote:
> > > > > On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > >
> > > > > > On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > > > > > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > > > > > > >
> > > > > > > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > > > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > > > > > > next modify_virtqueue.
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > > > > > > didn't ack them in the previous series.
> > > > > > > > > >
> > > > > > > > > > My main concern is that it is not valid to change the vq address after
> > > > > > > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > > > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > > > > > > forbids changing it with an active backend.
> > > > > > > > > >
> > > > > > > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > > > > > > ok to decide that all of these parameters may change during suspend.
> > > > > > > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > > > > > > I think protect with vDPA feature flag could work, while on the other
> > > > > > > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > > > > > > and resume (in case it helps performance), which doesn't have to be
> > > > > > > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > > > > > > variations there were not backed by spec either. Of course, we should
> > > > > > > > > try best to make the default behavior backward compatible with
> > > > > > > > > virtio-based backend, but that circles back to no suspend definition in
> > > > > > > > > the current virtio spec, for which I hope we don't cease development on
> > > > > > > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > > > > > > define its own feature flag to describe (minor difference in) the
> > > > > > > > > suspend behavior based on the later spec once it is formed in future.
> > > > > > > > >
> > > > > > > > So what is the way forward here? From what I understand the options are:
> > > > > > > >
> > > > > > > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > > > > > > >
> > > > > > > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > > > > > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > > > > > > code won't work anymore. This means the series would be less well tested.
> > > > > > > >
> > > > > > > > Are there other possible options? What do you think?
> > > > > > > >
> > > > > > > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> > > > > > >
> > > > > > > I am fine with either of these.
> > > > > > >
> > > > > > How about allowing the change only under the following conditions:
> > > > > >   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
> > > > > >
> > > > > > ?
> > > > >
> > > > > I think the best option by far is 1, as there is no hint in the
> > > > > combination of these 3 indicating that you can change device
> > > > > properties in the suspended state.
> > > > >
> > > > Sure. Will respin a v3 without these two patches.
> > > >
> > > > Another series can implement option 2 and add these 2 patches on top.
> > > Hmm...I misunderstood your statement and sent a erroneus v3. You said that
> > > having a feature flag is the best option.
> > >
> > > Will add a feature flag in v4: is this similar to the
> > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag?
> > >
> >
> > Right, it should be easy to return it from .get_backend_features op if
> > the FW returns that capability, isn't it?
> >
> Yes, that's easy. But I wonder if we need one feature bit for each type of
> change:
> - VHOST_BACKEND_F_CHANGEABLE_VQ_ADDR_IN_SUSPEND
> - VHOST_BACKEND_F_CHANGEABLE_VQ_STATE_IN_SUSPEND
>

I'd say yes. Although we could configure SVQ initial state in userland
as different than 0 for this first step, it would be needed in the
long term.

> Or would a big one VHOST_BACKEND_F_CAN_RECONFIG_VQ_IN_SUSPEND suffice?
>

I'd say "reconfig vq" is not valid as mlx driver doesn't allow
changing queue sizes, for example, isn't it? To define that it is
valid to change "all parameters" seems very confident.

> To me having individual feature bits makes sense. But it could also takes too
> many bits if more changes are required.
>

Yes, that's a good point. Maybe it is valid to define a subset of
features that can be changed., but I think it is way clearer to just
check for individual feature bits.

> Thanks,
> Dragos
>
> > > Thanks,
> > > Dragos
> > >
> > > > > > Thanks,
> > > > > > Dragos
> > > > > >
> > > > > > > > Thanks,
> > > > > > > > Dragos
> > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > -Siwei
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Jason, what do you think?
> > > > > > > > > >
> > > > > > > > > > Thanks!
> > > > > > > > > >
> > > > > > > > > > > ---
> > > > > > > > > > >   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
> > > > > > > > > > >   include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
> > > > > > > > > > >   2 files changed, 10 insertions(+)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > > > index f8f088cced50..80e066de0866 100644
> > > > > > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > > > @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > > > >          bool state_change = false;
> > > > > > > > > > >          void *obj_context;
> > > > > > > > > > >          void *cmd_hdr;
> > > > > > > > > > > +       void *vq_ctx;
> > > > > > > > > > >          void *in;
> > > > > > > > > > >          int err;
> > > > > > > > > > >
> > > > > > > > > > > @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > > > >          MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
> > > > > > > > > > >
> > > > > > > > > > >          obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> > > > > > > > > > > +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
> > > > > > > > > > >
> > > > > > > > > > >          if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
> > > > > > > > > > >                  if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> > > > > > > > > > > @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > > > >                  state_change = true;
> > > > > > > > > > >          }
> > > > > > > > > > >
> > > > > > > > > > > +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> > > > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> > > > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> > > > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> > > > > > > > > > > +       }
> > > > > > > > > > > +
> > > > > > > > > > >          MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
> > > > > > > > > > >          err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
> > > > > > > > > > >          if (err)
> > > > > > > > > > > @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
> > > > > > > > > > >          mvq->desc_addr = desc_area;
> > > > > > > > > > >          mvq->device_addr = device_area;
> > > > > > > > > > >          mvq->driver_addr = driver_area;
> > > > > > > > > > > +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
> > > > > > > > > > >          return 0;
> > > > > > > > > > >   }
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > > > index b86d51a855f6..9594ac405740 100644
> > > > > > > > > > > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > > > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > > > @@ -145,6 +145,7 @@ enum {
> > > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
> > > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
> > > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > > > > > > > > > > +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
> > > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
> > > > > > > > > > >   };
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > 2.42.0
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-18 10:16                       ` Eugenio Perez Martin
@ 2023-12-18 10:52                         ` Dragos Tatulea
  2023-12-18 12:06                           ` Eugenio Perez Martin
  0 siblings, 1 reply; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-18 10:52 UTC (permalink / raw)
  To: eperezma
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, mst, kvm, Saeed Mahameed, jasowang,
	leon

On Mon, 2023-12-18 at 11:16 +0100, Eugenio Perez Martin wrote:
> On Sat, Dec 16, 2023 at 12:03 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > 
> > On Fri, 2023-12-15 at 18:56 +0100, Eugenio Perez Martin wrote:
> > > On Fri, Dec 15, 2023 at 3:13 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > 
> > > > On Fri, 2023-12-15 at 12:35 +0000, Dragos Tatulea wrote:
> > > > > On Thu, 2023-12-14 at 19:30 +0100, Eugenio Perez Martin wrote:
> > > > > > On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > 
> > > > > > > On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > > > > > > > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > > > > > > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > > > > > > > > 
> > > > > > > > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > > > > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > > > > > > > next modify_virtqueue.
> > > > > > > > > > > > 
> > > > > > > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > > > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > > > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > > > > > > > didn't ack them in the previous series.
> > > > > > > > > > > 
> > > > > > > > > > > My main concern is that it is not valid to change the vq address after
> > > > > > > > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > > > > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > > > > > > > forbids changing it with an active backend.
> > > > > > > > > > > 
> > > > > > > > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > > > > > > > ok to decide that all of these parameters may change during suspend.
> > > > > > > > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > > > > > > > I think protect with vDPA feature flag could work, while on the other
> > > > > > > > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > > > > > > > and resume (in case it helps performance), which doesn't have to be
> > > > > > > > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > > > > > > > variations there were not backed by spec either. Of course, we should
> > > > > > > > > > try best to make the default behavior backward compatible with
> > > > > > > > > > virtio-based backend, but that circles back to no suspend definition in
> > > > > > > > > > the current virtio spec, for which I hope we don't cease development on
> > > > > > > > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > > > > > > > define its own feature flag to describe (minor difference in) the
> > > > > > > > > > suspend behavior based on the later spec once it is formed in future.
> > > > > > > > > > 
> > > > > > > > > So what is the way forward here? From what I understand the options are:
> > > > > > > > > 
> > > > > > > > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > > > > > > > > 
> > > > > > > > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > > > > > > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > > > > > > > code won't work anymore. This means the series would be less well tested.
> > > > > > > > > 
> > > > > > > > > Are there other possible options? What do you think?
> > > > > > > > > 
> > > > > > > > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> > > > > > > > 
> > > > > > > > I am fine with either of these.
> > > > > > > > 
> > > > > > > How about allowing the change only under the following conditions:
> > > > > > >   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
> > > > > > > 
> > > > > > > ?
> > > > > > 
> > > > > > I think the best option by far is 1, as there is no hint in the
> > > > > > combination of these 3 indicating that you can change device
> > > > > > properties in the suspended state.
> > > > > > 
> > > > > Sure. Will respin a v3 without these two patches.
> > > > > 
> > > > > Another series can implement option 2 and add these 2 patches on top.
> > > > Hmm...I misunderstood your statement and sent a erroneus v3. You said that
> > > > having a feature flag is the best option.
> > > > 
> > > > Will add a feature flag in v4: is this similar to the
> > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag?
> > > > 
> > > 
> > > Right, it should be easy to return it from .get_backend_features op if
> > > the FW returns that capability, isn't it?
> > > 
> > Yes, that's easy. But I wonder if we need one feature bit for each type of
> > change:
> > - VHOST_BACKEND_F_CHANGEABLE_VQ_ADDR_IN_SUSPEND
> > - VHOST_BACKEND_F_CHANGEABLE_VQ_STATE_IN_SUSPEND
> > 
> 
> I'd say yes. Although we could configure SVQ initial state in userland
> as different than 0 for this first step, it would be needed in the
> long term.
> 
> > Or would a big one VHOST_BACKEND_F_CAN_RECONFIG_VQ_IN_SUSPEND suffice?
> > 
> 
> I'd say "reconfig vq" is not valid as mlx driver doesn't allow
> changing queue sizes, for example, isn't it? 
> 
Modifying the queue size for a vq is indeed not supported by the mlx device.

> To define that it is
> valid to change "all parameters" seems very confident.
> 
Ack

> > To me having individual feature bits makes sense. But it could also takes too
> > many bits if more changes are required.
> > 
> 
> Yes, that's a good point. Maybe it is valid to define a subset of
> features that can be changed., but I think it is way clearer to just
> check for individual feature bits.
> 
I will prepare extra patches with the 2 feature bits approach.

Is it necessary to add checks in the vdpa core that block changing these
properties if the state is driver ok and the device doesn't support the feature?

> > Thanks,
> > Dragos
> > 
> > > > Thanks,
> > > > Dragos
> > > > 
> > > > > > > Thanks,
> > > > > > > Dragos
> > > > > > > 
> > > > > > > > > Thanks,
> > > > > > > > > Dragos
> > > > > > > > > 
> > > > > > > > > > Regards,
> > > > > > > > > > -Siwei
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > Jason, what do you think?
> > > > > > > > > > > 
> > > > > > > > > > > Thanks!
> > > > > > > > > > > 
> > > > > > > > > > > > ---
> > > > > > > > > > > >   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
> > > > > > > > > > > >   include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
> > > > > > > > > > > >   2 files changed, 10 insertions(+)
> > > > > > > > > > > > 
> > > > > > > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > > > > index f8f088cced50..80e066de0866 100644
> > > > > > > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > > > > @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > > > > >          bool state_change = false;
> > > > > > > > > > > >          void *obj_context;
> > > > > > > > > > > >          void *cmd_hdr;
> > > > > > > > > > > > +       void *vq_ctx;
> > > > > > > > > > > >          void *in;
> > > > > > > > > > > >          int err;
> > > > > > > > > > > > 
> > > > > > > > > > > > @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > > > > >          MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
> > > > > > > > > > > > 
> > > > > > > > > > > >          obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> > > > > > > > > > > > +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
> > > > > > > > > > > > 
> > > > > > > > > > > >          if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
> > > > > > > > > > > >                  if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> > > > > > > > > > > > @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > > > > >                  state_change = true;
> > > > > > > > > > > >          }
> > > > > > > > > > > > 
> > > > > > > > > > > > +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> > > > > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> > > > > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> > > > > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> > > > > > > > > > > > +       }
> > > > > > > > > > > > +
> > > > > > > > > > > >          MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
> > > > > > > > > > > >          err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
> > > > > > > > > > > >          if (err)
> > > > > > > > > > > > @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
> > > > > > > > > > > >          mvq->desc_addr = desc_area;
> > > > > > > > > > > >          mvq->device_addr = device_area;
> > > > > > > > > > > >          mvq->driver_addr = driver_area;
> > > > > > > > > > > > +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
> > > > > > > > > > > >          return 0;
> > > > > > > > > > > >   }
> > > > > > > > > > > > 
> > > > > > > > > > > > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > > > > index b86d51a855f6..9594ac405740 100644
> > > > > > > > > > > > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > > > > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > > > > @@ -145,6 +145,7 @@ enum {
> > > > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
> > > > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
> > > > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > > > > > > > > > > > +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
> > > > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
> > > > > > > > > > > >   };
> > > > > > > > > > > > 
> > > > > > > > > > > > --
> > > > > > > > > > > > 2.42.0
> > > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-18 10:52                         ` Dragos Tatulea
@ 2023-12-18 12:06                           ` Eugenio Perez Martin
  2023-12-18 13:58                             ` Dragos Tatulea
  0 siblings, 1 reply; 31+ messages in thread
From: Eugenio Perez Martin @ 2023-12-18 12:06 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, mst, kvm, Saeed Mahameed, jasowang,
	leon

On Mon, Dec 18, 2023 at 11:52 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> On Mon, 2023-12-18 at 11:16 +0100, Eugenio Perez Martin wrote:
> > On Sat, Dec 16, 2023 at 12:03 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > >
> > > On Fri, 2023-12-15 at 18:56 +0100, Eugenio Perez Martin wrote:
> > > > On Fri, Dec 15, 2023 at 3:13 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > >
> > > > > On Fri, 2023-12-15 at 12:35 +0000, Dragos Tatulea wrote:
> > > > > > On Thu, 2023-12-14 at 19:30 +0100, Eugenio Perez Martin wrote:
> > > > > > > On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > >
> > > > > > > > On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > > > > > > > > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > > > > > > > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > > > > > > > > >
> > > > > > > > > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > > > > > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > > > > > > > > next modify_virtqueue.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > > > > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > > > > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > > > > > > > > didn't ack them in the previous series.
> > > > > > > > > > > >
> > > > > > > > > > > > My main concern is that it is not valid to change the vq address after
> > > > > > > > > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > > > > > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > > > > > > > > forbids changing it with an active backend.
> > > > > > > > > > > >
> > > > > > > > > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > > > > > > > > ok to decide that all of these parameters may change during suspend.
> > > > > > > > > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > > > > > > > > I think protect with vDPA feature flag could work, while on the other
> > > > > > > > > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > > > > > > > > and resume (in case it helps performance), which doesn't have to be
> > > > > > > > > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > > > > > > > > variations there were not backed by spec either. Of course, we should
> > > > > > > > > > > try best to make the default behavior backward compatible with
> > > > > > > > > > > virtio-based backend, but that circles back to no suspend definition in
> > > > > > > > > > > the current virtio spec, for which I hope we don't cease development on
> > > > > > > > > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > > > > > > > > define its own feature flag to describe (minor difference in) the
> > > > > > > > > > > suspend behavior based on the later spec once it is formed in future.
> > > > > > > > > > >
> > > > > > > > > > So what is the way forward here? From what I understand the options are:
> > > > > > > > > >
> > > > > > > > > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > > > > > > > > >
> > > > > > > > > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > > > > > > > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > > > > > > > > code won't work anymore. This means the series would be less well tested.
> > > > > > > > > >
> > > > > > > > > > Are there other possible options? What do you think?
> > > > > > > > > >
> > > > > > > > > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> > > > > > > > >
> > > > > > > > > I am fine with either of these.
> > > > > > > > >
> > > > > > > > How about allowing the change only under the following conditions:
> > > > > > > >   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
> > > > > > > >
> > > > > > > > ?
> > > > > > >
> > > > > > > I think the best option by far is 1, as there is no hint in the
> > > > > > > combination of these 3 indicating that you can change device
> > > > > > > properties in the suspended state.
> > > > > > >
> > > > > > Sure. Will respin a v3 without these two patches.
> > > > > >
> > > > > > Another series can implement option 2 and add these 2 patches on top.
> > > > > Hmm...I misunderstood your statement and sent a erroneus v3. You said that
> > > > > having a feature flag is the best option.
> > > > >
> > > > > Will add a feature flag in v4: is this similar to the
> > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag?
> > > > >
> > > >
> > > > Right, it should be easy to return it from .get_backend_features op if
> > > > the FW returns that capability, isn't it?
> > > >
> > > Yes, that's easy. But I wonder if we need one feature bit for each type of
> > > change:
> > > - VHOST_BACKEND_F_CHANGEABLE_VQ_ADDR_IN_SUSPEND
> > > - VHOST_BACKEND_F_CHANGEABLE_VQ_STATE_IN_SUSPEND
> > >
> >
> > I'd say yes. Although we could configure SVQ initial state in userland
> > as different than 0 for this first step, it would be needed in the
> > long term.
> >
> > > Or would a big one VHOST_BACKEND_F_CAN_RECONFIG_VQ_IN_SUSPEND suffice?
> > >
> >
> > I'd say "reconfig vq" is not valid as mlx driver doesn't allow
> > changing queue sizes, for example, isn't it?
> >
> Modifying the queue size for a vq is indeed not supported by the mlx device.
>
> > To define that it is
> > valid to change "all parameters" seems very confident.
> >
> Ack
>
> > > To me having individual feature bits makes sense. But it could also takes too
> > > many bits if more changes are required.
> > >
> >
> > Yes, that's a good point. Maybe it is valid to define a subset of
> > features that can be changed., but I think it is way clearer to just
> > check for individual feature bits.
> >
> I will prepare extra patches with the 2 feature bits approach.
>
> Is it necessary to add checks in the vdpa core that block changing these
> properties if the state is driver ok and the device doesn't support the feature?
>

Yes, I think it is better to protect for changes in vdpa core.

> > > Thanks,
> > > Dragos
> > >
> > > > > Thanks,
> > > > > Dragos
> > > > >
> > > > > > > > Thanks,
> > > > > > > > Dragos
> > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Dragos
> > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > -Siwei
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Jason, what do you think?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks!
> > > > > > > > > > > >
> > > > > > > > > > > > > ---
> > > > > > > > > > > > >   drivers/vdpa/mlx5/net/mlx5_vnet.c  | 9 +++++++++
> > > > > > > > > > > > >   include/linux/mlx5/mlx5_ifc_vdpa.h | 1 +
> > > > > > > > > > > > >   2 files changed, 10 insertions(+)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > > > > > index f8f088cced50..80e066de0866 100644
> > > > > > > > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > > > > > > @@ -1209,6 +1209,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > > > > > >          bool state_change = false;
> > > > > > > > > > > > >          void *obj_context;
> > > > > > > > > > > > >          void *cmd_hdr;
> > > > > > > > > > > > > +       void *vq_ctx;
> > > > > > > > > > > > >          void *in;
> > > > > > > > > > > > >          int err;
> > > > > > > > > > > > >
> > > > > > > > > > > > > @@ -1230,6 +1231,7 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > > > > > >          MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, uid, ndev->mvdev.res.uid);
> > > > > > > > > > > > >
> > > > > > > > > > > > >          obj_context = MLX5_ADDR_OF(modify_virtio_net_q_in, in, obj_context);
> > > > > > > > > > > > > +       vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context);
> > > > > > > > > > > > >
> > > > > > > > > > > > >          if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_STATE) {
> > > > > > > > > > > > >                  if (!is_valid_state_change(mvq->fw_state, state, is_resumable(ndev))) {
> > > > > > > > > > > > > @@ -1241,6 +1243,12 @@ static int modify_virtqueue(struct mlx5_vdpa_net *ndev,
> > > > > > > > > > > > >                  state_change = true;
> > > > > > > > > > > > >          }
> > > > > > > > > > > > >
> > > > > > > > > > > > > +       if (mvq->modified_fields & MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS) {
> > > > > > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr);
> > > > > > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr);
> > > > > > > > > > > > > +               MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr);
> > > > > > > > > > > > > +       }
> > > > > > > > > > > > > +
> > > > > > > > > > > > >          MLX5_SET64(virtio_net_q_object, obj_context, modify_field_select, mvq->modified_fields);
> > > > > > > > > > > > >          err = mlx5_cmd_exec(ndev->mvdev.mdev, in, inlen, out, sizeof(out));
> > > > > > > > > > > > >          if (err)
> > > > > > > > > > > > > @@ -2202,6 +2210,7 @@ static int mlx5_vdpa_set_vq_address(struct vdpa_device *vdev, u16 idx, u64 desc_
> > > > > > > > > > > > >          mvq->desc_addr = desc_area;
> > > > > > > > > > > > >          mvq->device_addr = device_area;
> > > > > > > > > > > > >          mvq->driver_addr = driver_area;
> > > > > > > > > > > > > +       mvq->modified_fields |= MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS;
> > > > > > > > > > > > >          return 0;
> > > > > > > > > > > > >   }
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git a/include/linux/mlx5/mlx5_ifc_vdpa.h b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > > > > > index b86d51a855f6..9594ac405740 100644
> > > > > > > > > > > > > --- a/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > > > > > +++ b/include/linux/mlx5/mlx5_ifc_vdpa.h
> > > > > > > > > > > > > @@ -145,6 +145,7 @@ enum {
> > > > > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_STATE                    = (u64)1 << 0,
> > > > > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_PARAMS      = (u64)1 << 3,
> > > > > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DIRTY_BITMAP_DUMP_ENABLE = (u64)1 << 4,
> > > > > > > > > > > > > +       MLX5_VIRTQ_MODIFY_MASK_VIRTIO_Q_ADDRS           = (u64)1 << 6,
> > > > > > > > > > > > >          MLX5_VIRTQ_MODIFY_MASK_DESC_GROUP_MKEY          = (u64)1 << 14,
> > > > > > > > > > > > >   };
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > 2.42.0
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-18 12:06                           ` Eugenio Perez Martin
@ 2023-12-18 13:58                             ` Dragos Tatulea
  2023-12-19  7:24                               ` Eugenio Perez Martin
  0 siblings, 1 reply; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-18 13:58 UTC (permalink / raw)
  To: eperezma
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, kvm, mst, Saeed Mahameed, jasowang,
	leon

On Mon, 2023-12-18 at 13:06 +0100, Eugenio Perez Martin wrote:
> On Mon, Dec 18, 2023 at 11:52 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > 
> > On Mon, 2023-12-18 at 11:16 +0100, Eugenio Perez Martin wrote:
> > > On Sat, Dec 16, 2023 at 12:03 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > 
> > > > On Fri, 2023-12-15 at 18:56 +0100, Eugenio Perez Martin wrote:
> > > > > On Fri, Dec 15, 2023 at 3:13 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > 
> > > > > > On Fri, 2023-12-15 at 12:35 +0000, Dragos Tatulea wrote:
> > > > > > > On Thu, 2023-12-14 at 19:30 +0100, Eugenio Perez Martin wrote:
> > > > > > > > On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > 
> > > > > > > > > On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > > > > > > > > > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > > > > > > > > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > > > > > > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > > > > > > > > > next modify_virtqueue.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > > > > > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > > > > > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > > > > > > > > > didn't ack them in the previous series.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > My main concern is that it is not valid to change the vq address after
> > > > > > > > > > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > > > > > > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > > > > > > > > > forbids changing it with an active backend.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > > > > > > > > > ok to decide that all of these parameters may change during suspend.
> > > > > > > > > > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > > > > > > > > > I think protect with vDPA feature flag could work, while on the other
> > > > > > > > > > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > > > > > > > > > and resume (in case it helps performance), which doesn't have to be
> > > > > > > > > > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > > > > > > > > > variations there were not backed by spec either. Of course, we should
> > > > > > > > > > > > try best to make the default behavior backward compatible with
> > > > > > > > > > > > virtio-based backend, but that circles back to no suspend definition in
> > > > > > > > > > > > the current virtio spec, for which I hope we don't cease development on
> > > > > > > > > > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > > > > > > > > > define its own feature flag to describe (minor difference in) the
> > > > > > > > > > > > suspend behavior based on the later spec once it is formed in future.
> > > > > > > > > > > > 
> > > > > > > > > > > So what is the way forward here? From what I understand the options are:
> > > > > > > > > > > 
> > > > > > > > > > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > > > > > > > > > > 
> > > > > > > > > > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > > > > > > > > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > > > > > > > > > code won't work anymore. This means the series would be less well tested.
> > > > > > > > > > > 
> > > > > > > > > > > Are there other possible options? What do you think?
> > > > > > > > > > > 
> > > > > > > > > > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> > > > > > > > > > 
> > > > > > > > > > I am fine with either of these.
> > > > > > > > > > 
> > > > > > > > > How about allowing the change only under the following conditions:
> > > > > > > > >   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> > > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
> > > > > > > > > 
> > > > > > > > > ?
> > > > > > > > 
> > > > > > > > I think the best option by far is 1, as there is no hint in the
> > > > > > > > combination of these 3 indicating that you can change device
> > > > > > > > properties in the suspended state.
> > > > > > > > 
> > > > > > > Sure. Will respin a v3 without these two patches.
> > > > > > > 
> > > > > > > Another series can implement option 2 and add these 2 patches on top.
> > > > > > Hmm...I misunderstood your statement and sent a erroneus v3. You said that
> > > > > > having a feature flag is the best option.
> > > > > > 
> > > > > > Will add a feature flag in v4: is this similar to the
> > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag?
> > > > > > 
> > > > > 
> > > > > Right, it should be easy to return it from .get_backend_features op if
> > > > > the FW returns that capability, isn't it?
> > > > > 
> > > > Yes, that's easy. But I wonder if we need one feature bit for each type of
> > > > change:
> > > > - VHOST_BACKEND_F_CHANGEABLE_VQ_ADDR_IN_SUSPEND
> > > > - VHOST_BACKEND_F_CHANGEABLE_VQ_STATE_IN_SUSPEND
> > > > 
> > > 
> > > I'd say yes. Although we could configure SVQ initial state in userland
> > > as different than 0 for this first step, it would be needed in the
> > > long term.
> > > 
> > > > Or would a big one VHOST_BACKEND_F_CAN_RECONFIG_VQ_IN_SUSPEND suffice?
> > > > 
> > > 
> > > I'd say "reconfig vq" is not valid as mlx driver doesn't allow
> > > changing queue sizes, for example, isn't it?
> > > 
> > Modifying the queue size for a vq is indeed not supported by the mlx device.
> > 
> > > To define that it is
> > > valid to change "all parameters" seems very confident.
> > > 
> > Ack
> > 
> > > > To me having individual feature bits makes sense. But it could also takes too
> > > > many bits if more changes are required.
> > > > 
> > > 
> > > Yes, that's a good point. Maybe it is valid to define a subset of
> > > features that can be changed., but I think it is way clearer to just
> > > check for individual feature bits.
> > > 
> > I will prepare extra patches with the 2 feature bits approach.
> > 
> > Is it necessary to add checks in the vdpa core that block changing these
> > properties if the state is driver ok and the device doesn't support the feature?
> > 
> 
> Yes, I think it is better to protect for changes in vdpa core.
> 
Hmmm... there is no suspended state available. I would only add checks for the
DRIVER_OK state of the device because adding a is_suspended state/op seems out
of scope for this series. Any thoughts?


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-18 13:58                             ` Dragos Tatulea
@ 2023-12-19  7:24                               ` Eugenio Perez Martin
  2023-12-19 11:16                                 ` Dragos Tatulea
  0 siblings, 1 reply; 31+ messages in thread
From: Eugenio Perez Martin @ 2023-12-19  7:24 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, kvm, mst, Saeed Mahameed, jasowang,
	leon

On Mon, Dec 18, 2023 at 2:58 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> On Mon, 2023-12-18 at 13:06 +0100, Eugenio Perez Martin wrote:
> > On Mon, Dec 18, 2023 at 11:52 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > >
> > > On Mon, 2023-12-18 at 11:16 +0100, Eugenio Perez Martin wrote:
> > > > On Sat, Dec 16, 2023 at 12:03 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > >
> > > > > On Fri, 2023-12-15 at 18:56 +0100, Eugenio Perez Martin wrote:
> > > > > > On Fri, Dec 15, 2023 at 3:13 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > >
> > > > > > > On Fri, 2023-12-15 at 12:35 +0000, Dragos Tatulea wrote:
> > > > > > > > On Thu, 2023-12-14 at 19:30 +0100, Eugenio Perez Martin wrote:
> > > > > > > > > On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > > > > > > > > > > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > > > > > > > > > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > > > > > > > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > > > > > > > > > > next modify_virtqueue.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > > > > > > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > > > > > > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > > > > > > > > > > didn't ack them in the previous series.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > My main concern is that it is not valid to change the vq address after
> > > > > > > > > > > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > > > > > > > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > > > > > > > > > > forbids changing it with an active backend.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > > > > > > > > > > ok to decide that all of these parameters may change during suspend.
> > > > > > > > > > > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > > > > > > > > > > I think protect with vDPA feature flag could work, while on the other
> > > > > > > > > > > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > > > > > > > > > > and resume (in case it helps performance), which doesn't have to be
> > > > > > > > > > > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > > > > > > > > > > variations there were not backed by spec either. Of course, we should
> > > > > > > > > > > > > try best to make the default behavior backward compatible with
> > > > > > > > > > > > > virtio-based backend, but that circles back to no suspend definition in
> > > > > > > > > > > > > the current virtio spec, for which I hope we don't cease development on
> > > > > > > > > > > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > > > > > > > > > > define its own feature flag to describe (minor difference in) the
> > > > > > > > > > > > > suspend behavior based on the later spec once it is formed in future.
> > > > > > > > > > > > >
> > > > > > > > > > > > So what is the way forward here? From what I understand the options are:
> > > > > > > > > > > >
> > > > > > > > > > > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > > > > > > > > > > >
> > > > > > > > > > > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > > > > > > > > > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > > > > > > > > > > code won't work anymore. This means the series would be less well tested.
> > > > > > > > > > > >
> > > > > > > > > > > > Are there other possible options? What do you think?
> > > > > > > > > > > >
> > > > > > > > > > > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> > > > > > > > > > >
> > > > > > > > > > > I am fine with either of these.
> > > > > > > > > > >
> > > > > > > > > > How about allowing the change only under the following conditions:
> > > > > > > > > >   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> > > > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
> > > > > > > > > >
> > > > > > > > > > ?
> > > > > > > > >
> > > > > > > > > I think the best option by far is 1, as there is no hint in the
> > > > > > > > > combination of these 3 indicating that you can change device
> > > > > > > > > properties in the suspended state.
> > > > > > > > >
> > > > > > > > Sure. Will respin a v3 without these two patches.
> > > > > > > >
> > > > > > > > Another series can implement option 2 and add these 2 patches on top.
> > > > > > > Hmm...I misunderstood your statement and sent a erroneus v3. You said that
> > > > > > > having a feature flag is the best option.
> > > > > > >
> > > > > > > Will add a feature flag in v4: is this similar to the
> > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag?
> > > > > > >
> > > > > >
> > > > > > Right, it should be easy to return it from .get_backend_features op if
> > > > > > the FW returns that capability, isn't it?
> > > > > >
> > > > > Yes, that's easy. But I wonder if we need one feature bit for each type of
> > > > > change:
> > > > > - VHOST_BACKEND_F_CHANGEABLE_VQ_ADDR_IN_SUSPEND
> > > > > - VHOST_BACKEND_F_CHANGEABLE_VQ_STATE_IN_SUSPEND
> > > > >
> > > >
> > > > I'd say yes. Although we could configure SVQ initial state in userland
> > > > as different than 0 for this first step, it would be needed in the
> > > > long term.
> > > >
> > > > > Or would a big one VHOST_BACKEND_F_CAN_RECONFIG_VQ_IN_SUSPEND suffice?
> > > > >
> > > >
> > > > I'd say "reconfig vq" is not valid as mlx driver doesn't allow
> > > > changing queue sizes, for example, isn't it?
> > > >
> > > Modifying the queue size for a vq is indeed not supported by the mlx device.
> > >
> > > > To define that it is
> > > > valid to change "all parameters" seems very confident.
> > > >
> > > Ack
> > >
> > > > > To me having individual feature bits makes sense. But it could also takes too
> > > > > many bits if more changes are required.
> > > > >
> > > >
> > > > Yes, that's a good point. Maybe it is valid to define a subset of
> > > > features that can be changed., but I think it is way clearer to just
> > > > check for individual feature bits.
> > > >
> > > I will prepare extra patches with the 2 feature bits approach.
> > >
> > > Is it necessary to add checks in the vdpa core that block changing these
> > > properties if the state is driver ok and the device doesn't support the feature?
> > >
> >
> > Yes, I think it is better to protect for changes in vdpa core.
> >
> Hmmm... there is no suspended state available. I would only add checks for the
> DRIVER_OK state of the device because adding a is_suspended state/op seems out
> of scope for this series. Any thoughts?
>

I can develop it so you can include it in your series for sure, I will
send it ASAP.

Thanks!


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-19  7:24                               ` Eugenio Perez Martin
@ 2023-12-19 11:16                                 ` Dragos Tatulea
  2023-12-19 14:02                                   ` Eugenio Perez Martin
  0 siblings, 1 reply; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-19 11:16 UTC (permalink / raw)
  To: eperezma
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, mst, kvm, Saeed Mahameed, jasowang,
	leon

On Tue, 2023-12-19 at 08:24 +0100, Eugenio Perez Martin wrote:
> On Mon, Dec 18, 2023 at 2:58 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > 
> > On Mon, 2023-12-18 at 13:06 +0100, Eugenio Perez Martin wrote:
> > > On Mon, Dec 18, 2023 at 11:52 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > 
> > > > On Mon, 2023-12-18 at 11:16 +0100, Eugenio Perez Martin wrote:
> > > > > On Sat, Dec 16, 2023 at 12:03 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > 
> > > > > > On Fri, 2023-12-15 at 18:56 +0100, Eugenio Perez Martin wrote:
> > > > > > > On Fri, Dec 15, 2023 at 3:13 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > 
> > > > > > > > On Fri, 2023-12-15 at 12:35 +0000, Dragos Tatulea wrote:
> > > > > > > > > On Thu, 2023-12-14 at 19:30 +0100, Eugenio Perez Martin wrote:
> > > > > > > > > > On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > > 
> > > > > > > > > > > On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > > > > > > > > > > > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > > > > > > > > > > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > > > > > > > > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > > > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > > > > > > > > > > > next modify_virtqueue.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > > > > > > > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > > > > > > > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > > > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > > > > > > > > > > > didn't ack them in the previous series.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > My main concern is that it is not valid to change the vq address after
> > > > > > > > > > > > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > > > > > > > > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > > > > > > > > > > > forbids changing it with an active backend.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > > > > > > > > > > > ok to decide that all of these parameters may change during suspend.
> > > > > > > > > > > > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > > > > > > > > > > > I think protect with vDPA feature flag could work, while on the other
> > > > > > > > > > > > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > > > > > > > > > > > and resume (in case it helps performance), which doesn't have to be
> > > > > > > > > > > > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > > > > > > > > > > > variations there were not backed by spec either. Of course, we should
> > > > > > > > > > > > > > try best to make the default behavior backward compatible with
> > > > > > > > > > > > > > virtio-based backend, but that circles back to no suspend definition in
> > > > > > > > > > > > > > the current virtio spec, for which I hope we don't cease development on
> > > > > > > > > > > > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > > > > > > > > > > > define its own feature flag to describe (minor difference in) the
> > > > > > > > > > > > > > suspend behavior based on the later spec once it is formed in future.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > So what is the way forward here? From what I understand the options are:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > > > > > > > > > > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > > > > > > > > > > > code won't work anymore. This means the series would be less well tested.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Are there other possible options? What do you think?
> > > > > > > > > > > > > 
> > > > > > > > > > > > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> > > > > > > > > > > > 
> > > > > > > > > > > > I am fine with either of these.
> > > > > > > > > > > > 
> > > > > > > > > > > How about allowing the change only under the following conditions:
> > > > > > > > > > >   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> > > > > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
> > > > > > > > > > > 
> > > > > > > > > > > ?
> > > > > > > > > > 
> > > > > > > > > > I think the best option by far is 1, as there is no hint in the
> > > > > > > > > > combination of these 3 indicating that you can change device
> > > > > > > > > > properties in the suspended state.
> > > > > > > > > > 
> > > > > > > > > Sure. Will respin a v3 without these two patches.
> > > > > > > > > 
> > > > > > > > > Another series can implement option 2 and add these 2 patches on top.
> > > > > > > > Hmm...I misunderstood your statement and sent a erroneus v3. You said that
> > > > > > > > having a feature flag is the best option.
> > > > > > > > 
> > > > > > > > Will add a feature flag in v4: is this similar to the
> > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag?
> > > > > > > > 
> > > > > > > 
> > > > > > > Right, it should be easy to return it from .get_backend_features op if
> > > > > > > the FW returns that capability, isn't it?
> > > > > > > 
> > > > > > Yes, that's easy. But I wonder if we need one feature bit for each type of
> > > > > > change:
> > > > > > - VHOST_BACKEND_F_CHANGEABLE_VQ_ADDR_IN_SUSPEND
> > > > > > - VHOST_BACKEND_F_CHANGEABLE_VQ_STATE_IN_SUSPEND
> > > > > > 
> > > > > 
> > > > > I'd say yes. Although we could configure SVQ initial state in userland
> > > > > as different than 0 for this first step, it would be needed in the
> > > > > long term.
> > > > > 
> > > > > > Or would a big one VHOST_BACKEND_F_CAN_RECONFIG_VQ_IN_SUSPEND suffice?
> > > > > > 
> > > > > 
> > > > > I'd say "reconfig vq" is not valid as mlx driver doesn't allow
> > > > > changing queue sizes, for example, isn't it?
> > > > > 
> > > > Modifying the queue size for a vq is indeed not supported by the mlx device.
> > > > 
> > > > > To define that it is
> > > > > valid to change "all parameters" seems very confident.
> > > > > 
> > > > Ack
> > > > 
> > > > > > To me having individual feature bits makes sense. But it could also takes too
> > > > > > many bits if more changes are required.
> > > > > > 
> > > > > 
> > > > > Yes, that's a good point. Maybe it is valid to define a subset of
> > > > > features that can be changed., but I think it is way clearer to just
> > > > > check for individual feature bits.
> > > > > 
> > > > I will prepare extra patches with the 2 feature bits approach.
> > > > 
> > > > Is it necessary to add checks in the vdpa core that block changing these
> > > > properties if the state is driver ok and the device doesn't support the feature?
> > > > 
> > > 
> > > Yes, I think it is better to protect for changes in vdpa core.
> > > 
> > Hmmm... there is no suspended state available. I would only add checks for the
> > DRIVER_OK state of the device because adding a is_suspended state/op seems out
> > of scope for this series. Any thoughts?
> > 
> 
> I can develop it so you can include it in your series for sure, I will
> send it ASAP.
> 
If it's a matter of:
- Adding a suspended state to struct vhost_vdpa.
- Setting it to true on successful device suspend.
- Clearing it on successful device resume and device reset.

I can add this patch. I'm just not sure about the locking part. But maybe I can
send it and we can debate on the code.

Thanks,
Dragos

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-19 11:16                                 ` Dragos Tatulea
@ 2023-12-19 14:02                                   ` Eugenio Perez Martin
  2023-12-19 15:11                                     ` Dragos Tatulea
  0 siblings, 1 reply; 31+ messages in thread
From: Eugenio Perez Martin @ 2023-12-19 14:02 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, mst, kvm, Saeed Mahameed, jasowang,
	leon

On Tue, Dec 19, 2023 at 12:16 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> On Tue, 2023-12-19 at 08:24 +0100, Eugenio Perez Martin wrote:
> > On Mon, Dec 18, 2023 at 2:58 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > >
> > > On Mon, 2023-12-18 at 13:06 +0100, Eugenio Perez Martin wrote:
> > > > On Mon, Dec 18, 2023 at 11:52 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > >
> > > > > On Mon, 2023-12-18 at 11:16 +0100, Eugenio Perez Martin wrote:
> > > > > > On Sat, Dec 16, 2023 at 12:03 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > >
> > > > > > > On Fri, 2023-12-15 at 18:56 +0100, Eugenio Perez Martin wrote:
> > > > > > > > On Fri, Dec 15, 2023 at 3:13 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > >
> > > > > > > > > On Fri, 2023-12-15 at 12:35 +0000, Dragos Tatulea wrote:
> > > > > > > > > > On Thu, 2023-12-14 at 19:30 +0100, Eugenio Perez Martin wrote:
> > > > > > > > > > > On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > > > > > > > > > > > > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > > > > > > > > > > > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > > > > > > > > > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > > > > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > > > > > > > > > > > > next modify_virtqueue.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > > > > > > > > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > > > > > > > > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > > > > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > > > > > > > > > > > > didn't ack them in the previous series.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > My main concern is that it is not valid to change the vq address after
> > > > > > > > > > > > > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > > > > > > > > > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > > > > > > > > > > > > forbids changing it with an active backend.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > > > > > > > > > > > > ok to decide that all of these parameters may change during suspend.
> > > > > > > > > > > > > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > > > > > > > > > > > > I think protect with vDPA feature flag could work, while on the other
> > > > > > > > > > > > > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > > > > > > > > > > > > and resume (in case it helps performance), which doesn't have to be
> > > > > > > > > > > > > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > > > > > > > > > > > > variations there were not backed by spec either. Of course, we should
> > > > > > > > > > > > > > > try best to make the default behavior backward compatible with
> > > > > > > > > > > > > > > virtio-based backend, but that circles back to no suspend definition in
> > > > > > > > > > > > > > > the current virtio spec, for which I hope we don't cease development on
> > > > > > > > > > > > > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > > > > > > > > > > > > define its own feature flag to describe (minor difference in) the
> > > > > > > > > > > > > > > suspend behavior based on the later spec once it is formed in future.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > So what is the way forward here? From what I understand the options are:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > > > > > > > > > > > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > > > > > > > > > > > > code won't work anymore. This means the series would be less well tested.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Are there other possible options? What do you think?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am fine with either of these.
> > > > > > > > > > > > >
> > > > > > > > > > > > How about allowing the change only under the following conditions:
> > > > > > > > > > > >   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> > > > > > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
> > > > > > > > > > > >
> > > > > > > > > > > > ?
> > > > > > > > > > >
> > > > > > > > > > > I think the best option by far is 1, as there is no hint in the
> > > > > > > > > > > combination of these 3 indicating that you can change device
> > > > > > > > > > > properties in the suspended state.
> > > > > > > > > > >
> > > > > > > > > > Sure. Will respin a v3 without these two patches.
> > > > > > > > > >
> > > > > > > > > > Another series can implement option 2 and add these 2 patches on top.
> > > > > > > > > Hmm...I misunderstood your statement and sent a erroneus v3. You said that
> > > > > > > > > having a feature flag is the best option.
> > > > > > > > >
> > > > > > > > > Will add a feature flag in v4: is this similar to the
> > > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag?
> > > > > > > > >
> > > > > > > >
> > > > > > > > Right, it should be easy to return it from .get_backend_features op if
> > > > > > > > the FW returns that capability, isn't it?
> > > > > > > >
> > > > > > > Yes, that's easy. But I wonder if we need one feature bit for each type of
> > > > > > > change:
> > > > > > > - VHOST_BACKEND_F_CHANGEABLE_VQ_ADDR_IN_SUSPEND
> > > > > > > - VHOST_BACKEND_F_CHANGEABLE_VQ_STATE_IN_SUSPEND
> > > > > > >
> > > > > >
> > > > > > I'd say yes. Although we could configure SVQ initial state in userland
> > > > > > as different than 0 for this first step, it would be needed in the
> > > > > > long term.
> > > > > >
> > > > > > > Or would a big one VHOST_BACKEND_F_CAN_RECONFIG_VQ_IN_SUSPEND suffice?
> > > > > > >
> > > > > >
> > > > > > I'd say "reconfig vq" is not valid as mlx driver doesn't allow
> > > > > > changing queue sizes, for example, isn't it?
> > > > > >
> > > > > Modifying the queue size for a vq is indeed not supported by the mlx device.
> > > > >
> > > > > > To define that it is
> > > > > > valid to change "all parameters" seems very confident.
> > > > > >
> > > > > Ack
> > > > >
> > > > > > > To me having individual feature bits makes sense. But it could also takes too
> > > > > > > many bits if more changes are required.
> > > > > > >
> > > > > >
> > > > > > Yes, that's a good point. Maybe it is valid to define a subset of
> > > > > > features that can be changed., but I think it is way clearer to just
> > > > > > check for individual feature bits.
> > > > > >
> > > > > I will prepare extra patches with the 2 feature bits approach.
> > > > >
> > > > > Is it necessary to add checks in the vdpa core that block changing these
> > > > > properties if the state is driver ok and the device doesn't support the feature?
> > > > >
> > > >
> > > > Yes, I think it is better to protect for changes in vdpa core.
> > > >
> > > Hmmm... there is no suspended state available. I would only add checks for the
> > > DRIVER_OK state of the device because adding a is_suspended state/op seems out
> > > of scope for this series. Any thoughts?
> > >
> >
> > I can develop it so you can include it in your series for sure, I will
> > send it ASAP.
> >
> If it's a matter of:
> - Adding a suspended state to struct vhost_vdpa.
> - Setting it to true on successful device suspend.
> - Clearing it on successful device resume and device reset.
>
> I can add this patch. I'm just not sure about the locking part. But maybe I can
> send it and we can debate on the code.
>

Yes, that was the plan basically.

I think vhost_vdpa_suspend / vhost_vdpa_resume are already called from
vhost_vdpa_unlocked_ioctl with the lock acquired, is that what you
mean?

Thanks!


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq
  2023-12-19 14:02                                   ` Eugenio Perez Martin
@ 2023-12-19 15:11                                     ` Dragos Tatulea
  0 siblings, 0 replies; 31+ messages in thread
From: Dragos Tatulea @ 2023-12-19 15:11 UTC (permalink / raw)
  To: eperezma
  Cc: xuanzhuo, Parav Pandit, Gal Pressman, virtualization,
	linux-kernel, si-wei.liu, kvm, mst, Saeed Mahameed, jasowang,
	leon

On Tue, 2023-12-19 at 15:02 +0100, Eugenio Perez Martin wrote:
> On Tue, Dec 19, 2023 at 12:16 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > 
> > On Tue, 2023-12-19 at 08:24 +0100, Eugenio Perez Martin wrote:
> > > On Mon, Dec 18, 2023 at 2:58 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > 
> > > > On Mon, 2023-12-18 at 13:06 +0100, Eugenio Perez Martin wrote:
> > > > > On Mon, Dec 18, 2023 at 11:52 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > 
> > > > > > On Mon, 2023-12-18 at 11:16 +0100, Eugenio Perez Martin wrote:
> > > > > > > On Sat, Dec 16, 2023 at 12:03 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > 
> > > > > > > > On Fri, 2023-12-15 at 18:56 +0100, Eugenio Perez Martin wrote:
> > > > > > > > > On Fri, Dec 15, 2023 at 3:13 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > 
> > > > > > > > > > On Fri, 2023-12-15 at 12:35 +0000, Dragos Tatulea wrote:
> > > > > > > > > > > On Thu, 2023-12-14 at 19:30 +0100, Eugenio Perez Martin wrote:
> > > > > > > > > > > > On Thu, Dec 14, 2023 at 4:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > On Thu, 2023-12-14 at 08:45 -0500, Michael S. Tsirkin wrote:
> > > > > > > > > > > > > > On Thu, Dec 14, 2023 at 01:39:55PM +0000, Dragos Tatulea wrote:
> > > > > > > > > > > > > > > On Tue, 2023-12-12 at 15:44 -0800, Si-Wei Liu wrote:
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > On 12/12/2023 11:21 AM, Eugenio Perez Martin wrote:
> > > > > > > > > > > > > > > > > On Tue, Dec 5, 2023 at 11:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > > > > > > > > > > > > > > > > Addresses get set by .set_vq_address. hw vq addresses will be updated on
> > > > > > > > > > > > > > > > > > next modify_virtqueue.
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > > > > > > > > > > > > > > > > > Reviewed-by: Gal Pressman <gal@nvidia.com>
> > > > > > > > > > > > > > > > > > Acked-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > > > > > > > > I'm kind of ok with this patch and the next one about state, but I
> > > > > > > > > > > > > > > > > didn't ack them in the previous series.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > My main concern is that it is not valid to change the vq address after
> > > > > > > > > > > > > > > > > DRIVER_OK in VirtIO, which vDPA follows. Only memory maps are ok to
> > > > > > > > > > > > > > > > > change at this moment. I'm not sure about vq state in vDPA, but vhost
> > > > > > > > > > > > > > > > > forbids changing it with an active backend.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > Suspend is not defined in VirtIO at this moment though, so maybe it is
> > > > > > > > > > > > > > > > > ok to decide that all of these parameters may change during suspend.
> > > > > > > > > > > > > > > > > Maybe the best thing is to protect this with a vDPA feature flag.
> > > > > > > > > > > > > > > > I think protect with vDPA feature flag could work, while on the other
> > > > > > > > > > > > > > > > hand vDPA means vendor specific optimization is possible around suspend
> > > > > > > > > > > > > > > > and resume (in case it helps performance), which doesn't have to be
> > > > > > > > > > > > > > > > backed by virtio spec. Same applies to vhost user backend features,
> > > > > > > > > > > > > > > > variations there were not backed by spec either. Of course, we should
> > > > > > > > > > > > > > > > try best to make the default behavior backward compatible with
> > > > > > > > > > > > > > > > virtio-based backend, but that circles back to no suspend definition in
> > > > > > > > > > > > > > > > the current virtio spec, for which I hope we don't cease development on
> > > > > > > > > > > > > > > > vDPA indefinitely. After all, the virtio based vdap backend can well
> > > > > > > > > > > > > > > > define its own feature flag to describe (minor difference in) the
> > > > > > > > > > > > > > > > suspend behavior based on the later spec once it is formed in future.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > So what is the way forward here? From what I understand the options are:
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 1) Add a vdpa feature flag for changing device properties while suspended.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 2) Drop these 2 patches from the series for now. Not sure if this makes sense as
> > > > > > > > > > > > > > > this. But then Si-Wei's qemu device suspend/resume poc [0] that exercises this
> > > > > > > > > > > > > > > code won't work anymore. This means the series would be less well tested.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Are there other possible options? What do you think?
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > [0] https://github.com/siwliu-kernel/qemu/tree/svq-resume-wip
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > I am fine with either of these.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > How about allowing the change only under the following conditions:
> > > > > > > > > > > > >   vhost_vdpa_can_suspend && vhost_vdpa_can_resume &&
> > > > > > > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is set
> > > > > > > > > > > > > 
> > > > > > > > > > > > > ?
> > > > > > > > > > > > 
> > > > > > > > > > > > I think the best option by far is 1, as there is no hint in the
> > > > > > > > > > > > combination of these 3 indicating that you can change device
> > > > > > > > > > > > properties in the suspended state.
> > > > > > > > > > > > 
> > > > > > > > > > > Sure. Will respin a v3 without these two patches.
> > > > > > > > > > > 
> > > > > > > > > > > Another series can implement option 2 and add these 2 patches on top.
> > > > > > > > > > Hmm...I misunderstood your statement and sent a erroneus v3. You said that
> > > > > > > > > > having a feature flag is the best option.
> > > > > > > > > > 
> > > > > > > > > > Will add a feature flag in v4: is this similar to the
> > > > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag?
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Right, it should be easy to return it from .get_backend_features op if
> > > > > > > > > the FW returns that capability, isn't it?
> > > > > > > > > 
> > > > > > > > Yes, that's easy. But I wonder if we need one feature bit for each type of
> > > > > > > > change:
> > > > > > > > - VHOST_BACKEND_F_CHANGEABLE_VQ_ADDR_IN_SUSPEND
> > > > > > > > - VHOST_BACKEND_F_CHANGEABLE_VQ_STATE_IN_SUSPEND
> > > > > > > > 
> > > > > > > 
> > > > > > > I'd say yes. Although we could configure SVQ initial state in userland
> > > > > > > as different than 0 for this first step, it would be needed in the
> > > > > > > long term.
> > > > > > > 
> > > > > > > > Or would a big one VHOST_BACKEND_F_CAN_RECONFIG_VQ_IN_SUSPEND suffice?
> > > > > > > > 
> > > > > > > 
> > > > > > > I'd say "reconfig vq" is not valid as mlx driver doesn't allow
> > > > > > > changing queue sizes, for example, isn't it?
> > > > > > > 
> > > > > > Modifying the queue size for a vq is indeed not supported by the mlx device.
> > > > > > 
> > > > > > > To define that it is
> > > > > > > valid to change "all parameters" seems very confident.
> > > > > > > 
> > > > > > Ack
> > > > > > 
> > > > > > > > To me having individual feature bits makes sense. But it could also takes too
> > > > > > > > many bits if more changes are required.
> > > > > > > > 
> > > > > > > 
> > > > > > > Yes, that's a good point. Maybe it is valid to define a subset of
> > > > > > > features that can be changed., but I think it is way clearer to just
> > > > > > > check for individual feature bits.
> > > > > > > 
> > > > > > I will prepare extra patches with the 2 feature bits approach.
> > > > > > 
> > > > > > Is it necessary to add checks in the vdpa core that block changing these
> > > > > > properties if the state is driver ok and the device doesn't support the feature?
> > > > > > 
> > > > > 
> > > > > Yes, I think it is better to protect for changes in vdpa core.
> > > > > 
> > > > Hmmm... there is no suspended state available. I would only add checks for the
> > > > DRIVER_OK state of the device because adding a is_suspended state/op seems out
> > > > of scope for this series. Any thoughts?
> > > > 
> > > 
> > > I can develop it so you can include it in your series for sure, I will
> > > send it ASAP.
> > > 
> > If it's a matter of:
> > - Adding a suspended state to struct vhost_vdpa.
> > - Setting it to true on successful device suspend.
> > - Clearing it on successful device resume and device reset.
> > 
> > I can add this patch. I'm just not sure about the locking part. But maybe I can
> > send it and we can debate on the code.
> > 
> 
> Yes, that was the plan basically.
> 
> I think vhost_vdpa_suspend / vhost_vdpa_resume are already called from
> vhost_vdpa_unlocked_ioctl with the lock acquired, is that what you
> mean?
> 
Yes, that's what I wanted to make sure that is correct. I will send the v4 soon.

Thanks,
Dragos

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2023-12-19 15:11 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-05 10:46 [PATCH vhost v2 0/8] vdpa/mlx5: Add support for resumable vqs Dragos Tatulea
2023-12-05 10:46 ` [PATCH mlx5-vhost v2 1/8] vdpa/mlx5: Expose resumable vq capability Dragos Tatulea
2023-12-05 10:46 ` [PATCH vhost v2 2/8] vdpa/mlx5: Allow modifying multiple vq fields in one modify command Dragos Tatulea
2023-12-05 10:46 ` [PATCH vhost v2 3/8] vdpa/mlx5: Introduce per vq and device resume Dragos Tatulea
2023-12-05 10:46 ` [PATCH vhost v2 4/8] vdpa/mlx5: Mark vq addrs for modification in hw vq Dragos Tatulea
2023-12-12 19:21   ` Eugenio Perez Martin
2023-12-12 19:44     ` Dragos Tatulea
2023-12-12 23:44     ` Si-Wei Liu
2023-12-14 13:39       ` Dragos Tatulea
2023-12-14 13:45         ` Michael S. Tsirkin
2023-12-14 15:51           ` Dragos Tatulea
2023-12-14 18:30             ` Eugenio Perez Martin
2023-12-15 12:35               ` Dragos Tatulea
2023-12-15 14:13                 ` Dragos Tatulea
2023-12-15 17:56                   ` Eugenio Perez Martin
2023-12-16 11:03                     ` Dragos Tatulea
2023-12-18 10:16                       ` Eugenio Perez Martin
2023-12-18 10:52                         ` Dragos Tatulea
2023-12-18 12:06                           ` Eugenio Perez Martin
2023-12-18 13:58                             ` Dragos Tatulea
2023-12-19  7:24                               ` Eugenio Perez Martin
2023-12-19 11:16                                 ` Dragos Tatulea
2023-12-19 14:02                                   ` Eugenio Perez Martin
2023-12-19 15:11                                     ` Dragos Tatulea
2023-12-05 10:46 ` [PATCH vhost v2 5/8] vdpa/mlx5: Mark vq state " Dragos Tatulea
2023-12-05 10:46 ` [PATCH vhost v2 6/8] vdpa/mlx5: Use vq suspend/resume during .set_map Dragos Tatulea
2023-12-12 19:22   ` Eugenio Perez Martin
2023-12-05 10:46 ` [PATCH vhost v2 7/8] vdpa/mlx5: Introduce reference counting to mrs Dragos Tatulea
2023-12-12 18:26   ` Eugenio Perez Martin
2023-12-05 10:46 ` [PATCH vhost v2 8/8] vdpa/mlx5: Add mkey leak detection Dragos Tatulea
2023-12-12 18:32   ` Eugenio Perez Martin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.