All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH vfio 0/2] Migration few enhancements
@ 2022-06-06  8:56 Yishai Hadas
  2022-06-06  8:56 ` [PATCH vfio 1/2] vfio/mlx5: Protect mlx5vf_disable_fds() upon close device Yishai Hadas
  2022-06-06  8:56 ` [PATCH vfio 2/2] vfio: Split migration ops from main device ops Yishai Hadas
  0 siblings, 2 replies; 18+ messages in thread
From: Yishai Hadas @ 2022-06-06  8:56 UTC (permalink / raw)
  To: alex.williamson, jgg, kvm
  Cc: yishaih, maorg, cohuck, kevin.tian, shameerali.kolothum.thodi,
	liulongfang

This series includes few enhancements in the migration area and some
fixes in mlx5 driver as of below.

It splits migration ops from the main device ops, this enables a driver
to safely set its migration's ops only when migration is supported and
leave the other code around (e.g., core, driver) clean.

Registering different structs based on the device capabilities might
start to hit combinatorial explosion when we'll introduce ops for dirty
logging that may be optional too.

As part of that, adapt mlx5 to this scheme and fix some issues around
its migration capable usage.

Yishai Hadas (2):
  vfio/mlx5: Protect mlx5vf_disable_fds() upon close device
  vfio: Split migration ops from main device ops

 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    | 11 +++++---
 drivers/vfio/pci/mlx5/cmd.c                   | 14 +++++++++-
 drivers/vfio/pci/mlx5/cmd.h                   |  4 ++-
 drivers/vfio/pci/mlx5/main.c                  | 11 +++++---
 drivers/vfio/vfio.c                           | 13 +++++-----
 include/linux/vfio.h                          | 26 ++++++++++++-------
 6 files changed, 54 insertions(+), 25 deletions(-)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH vfio 1/2] vfio/mlx5: Protect mlx5vf_disable_fds() upon close device
  2022-06-06  8:56 [PATCH vfio 0/2] Migration few enhancements Yishai Hadas
@ 2022-06-06  8:56 ` Yishai Hadas
  2022-06-10  3:25   ` Tian, Kevin
  2022-06-06  8:56 ` [PATCH vfio 2/2] vfio: Split migration ops from main device ops Yishai Hadas
  1 sibling, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2022-06-06  8:56 UTC (permalink / raw)
  To: alex.williamson, jgg, kvm
  Cc: yishaih, maorg, cohuck, kevin.tian, shameerali.kolothum.thodi,
	liulongfang

Protect mlx5vf_disable_fds() upon close device to be called under the
state mutex as done in all other places.

This will prevent a race with any other flow which calls
mlx5vf_disable_fds() as of health/recovery upon
MLX5_PF_NOTIFY_DISABLE_VF event.

Encapsulate this functionality in a separate function named
mlx5vf_cmd_close_migratable() to consider migration caps and for further
usage upon close device.

Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5 devices")
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/mlx5/cmd.c  | 10 ++++++++++
 drivers/vfio/pci/mlx5/cmd.h  |  1 +
 drivers/vfio/pci/mlx5/main.c |  2 +-
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
index 9b9f33ca270a..cdd0c667dc77 100644
--- a/drivers/vfio/pci/mlx5/cmd.c
+++ b/drivers/vfio/pci/mlx5/cmd.c
@@ -88,6 +88,16 @@ static int mlx5fv_vf_event(struct notifier_block *nb,
 	return 0;
 }
 
+void mlx5vf_cmd_close_migratable(struct mlx5vf_pci_core_device *mvdev)
+{
+	if (!mvdev->migrate_cap)
+		return;
+
+	mutex_lock(&mvdev->state_mutex);
+	mlx5vf_disable_fds(mvdev);
+	mlx5vf_state_mutex_unlock(mvdev);
+}
+
 void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device *mvdev)
 {
 	if (!mvdev->migrate_cap)
diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h
index 6c3112fdd8b1..aa692d9ce656 100644
--- a/drivers/vfio/pci/mlx5/cmd.h
+++ b/drivers/vfio/pci/mlx5/cmd.h
@@ -64,6 +64,7 @@ int mlx5vf_cmd_query_vhca_migration_state(struct mlx5vf_pci_core_device *mvdev,
 					  size_t *state_size);
 void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev);
 void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device *mvdev);
+void mlx5vf_cmd_close_migratable(struct mlx5vf_pci_core_device *mvdev);
 int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev,
 			       struct mlx5_vf_migration_file *migf);
 int mlx5vf_cmd_load_vhca_state(struct mlx5vf_pci_core_device *mvdev,
diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
index 0558d0649ddb..d754990f0662 100644
--- a/drivers/vfio/pci/mlx5/main.c
+++ b/drivers/vfio/pci/mlx5/main.c
@@ -570,7 +570,7 @@ static void mlx5vf_pci_close_device(struct vfio_device *core_vdev)
 	struct mlx5vf_pci_core_device *mvdev = container_of(
 		core_vdev, struct mlx5vf_pci_core_device, core_device.vdev);
 
-	mlx5vf_disable_fds(mvdev);
+	mlx5vf_cmd_close_migratable(mvdev);
 	vfio_pci_core_close_device(core_vdev);
 }
 
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-06  8:56 [PATCH vfio 0/2] Migration few enhancements Yishai Hadas
  2022-06-06  8:56 ` [PATCH vfio 1/2] vfio/mlx5: Protect mlx5vf_disable_fds() upon close device Yishai Hadas
@ 2022-06-06  8:56 ` Yishai Hadas
  2022-06-10  3:32   ` Tian, Kevin
  1 sibling, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2022-06-06  8:56 UTC (permalink / raw)
  To: alex.williamson, jgg, kvm
  Cc: yishaih, maorg, cohuck, kevin.tian, shameerali.kolothum.thodi,
	liulongfang

vfio core checks whether the driver sets some migration op (e.g.
set_state/get_state) and accordingly calls its op.

However, currently mlx5 driver sets the above ops without regards to its
migration caps.

This might lead to unexpected usage/Oops if user space may call to the
above ops even if the driver doesn't support migration. As for example,
the migration state_mutex is not initialized in that case.

The cleanest way to manage that seems to split the migration ops from
the main device ops, this will let the driver setting them separately
from the main ops when it's applicable.

As part of that, changed HISI driver to match this scheme.

This scheme may enable down the road to come with some extra group of
ops (e.g. DMA log) that can be set without regards to the other options
based on driver caps.

Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5 devices")
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    | 11 +++++---
 drivers/vfio/pci/mlx5/cmd.c                   |  4 ++-
 drivers/vfio/pci/mlx5/cmd.h                   |  3 ++-
 drivers/vfio/pci/mlx5/main.c                  |  9 ++++---
 drivers/vfio/vfio.c                           | 13 +++++-----
 include/linux/vfio.h                          | 26 ++++++++++++-------
 6 files changed, 42 insertions(+), 24 deletions(-)

diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index 4def43f5f7b6..ea762e28c1cc 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -1185,7 +1185,7 @@ static int hisi_acc_vfio_pci_open_device(struct vfio_device *core_vdev)
 	if (ret)
 		return ret;
 
-	if (core_vdev->ops->migration_set_state) {
+	if (core_vdev->mig_ops) {
 		ret = hisi_acc_vf_qm_init(hisi_acc_vdev);
 		if (ret) {
 			vfio_pci_core_disable(vdev);
@@ -1208,6 +1208,11 @@ static void hisi_acc_vfio_pci_close_device(struct vfio_device *core_vdev)
 	vfio_pci_core_close_device(core_vdev);
 }
 
+static const struct vfio_migration_ops hisi_acc_vfio_pci_migrn_state_ops = {
+	.migration_set_state = hisi_acc_vfio_pci_set_device_state,
+	.migration_get_state = hisi_acc_vfio_pci_get_device_state,
+};
+
 static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {
 	.name = "hisi-acc-vfio-pci-migration",
 	.open_device = hisi_acc_vfio_pci_open_device,
@@ -1219,8 +1224,6 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {
 	.mmap = hisi_acc_vfio_pci_mmap,
 	.request = vfio_pci_core_request,
 	.match = vfio_pci_core_match,
-	.migration_set_state = hisi_acc_vfio_pci_set_device_state,
-	.migration_get_state = hisi_acc_vfio_pci_get_device_state,
 };
 
 static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
@@ -1272,6 +1275,8 @@ static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device
 		if (!ret) {
 			vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev,
 						  &hisi_acc_vfio_pci_migrn_ops);
+			hisi_acc_vdev->core_device.vdev.mig_ops =
+					&hisi_acc_vfio_pci_migrn_state_ops;
 		} else {
 			pci_warn(pdev, "migration support failed, continue with generic interface\n");
 			vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev,
diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
index cdd0c667dc77..dd5d7bfe0a49 100644
--- a/drivers/vfio/pci/mlx5/cmd.c
+++ b/drivers/vfio/pci/mlx5/cmd.c
@@ -108,7 +108,8 @@ void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device *mvdev)
 	destroy_workqueue(mvdev->cb_wq);
 }
 
-void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev)
+void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev,
+			       const struct vfio_migration_ops *mig_ops)
 {
 	struct pci_dev *pdev = mvdev->core_device.pdev;
 	int ret;
@@ -149,6 +150,7 @@ void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev)
 	mvdev->core_device.vdev.migration_flags =
 		VFIO_MIGRATION_STOP_COPY |
 		VFIO_MIGRATION_P2P;
+	mvdev->core_device.vdev.mig_ops = mig_ops;
 
 end:
 	mlx5_vf_put_core_dev(mvdev->mdev);
diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h
index aa692d9ce656..8208f4701a90 100644
--- a/drivers/vfio/pci/mlx5/cmd.h
+++ b/drivers/vfio/pci/mlx5/cmd.h
@@ -62,7 +62,8 @@ int mlx5vf_cmd_suspend_vhca(struct mlx5vf_pci_core_device *mvdev, u16 op_mod);
 int mlx5vf_cmd_resume_vhca(struct mlx5vf_pci_core_device *mvdev, u16 op_mod);
 int mlx5vf_cmd_query_vhca_migration_state(struct mlx5vf_pci_core_device *mvdev,
 					  size_t *state_size);
-void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev);
+void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev,
+			       const struct vfio_migration_ops *mig_ops);
 void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device *mvdev);
 void mlx5vf_cmd_close_migratable(struct mlx5vf_pci_core_device *mvdev);
 int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev,
diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
index d754990f0662..a9b63d15c5d3 100644
--- a/drivers/vfio/pci/mlx5/main.c
+++ b/drivers/vfio/pci/mlx5/main.c
@@ -574,6 +574,11 @@ static void mlx5vf_pci_close_device(struct vfio_device *core_vdev)
 	vfio_pci_core_close_device(core_vdev);
 }
 
+static const struct vfio_migration_ops mlx5vf_pci_mig_ops = {
+	.migration_set_state = mlx5vf_pci_set_device_state,
+	.migration_get_state = mlx5vf_pci_get_device_state,
+};
+
 static const struct vfio_device_ops mlx5vf_pci_ops = {
 	.name = "mlx5-vfio-pci",
 	.open_device = mlx5vf_pci_open_device,
@@ -585,8 +590,6 @@ static const struct vfio_device_ops mlx5vf_pci_ops = {
 	.mmap = vfio_pci_core_mmap,
 	.request = vfio_pci_core_request,
 	.match = vfio_pci_core_match,
-	.migration_set_state = mlx5vf_pci_set_device_state,
-	.migration_get_state = mlx5vf_pci_get_device_state,
 };
 
 static int mlx5vf_pci_probe(struct pci_dev *pdev,
@@ -599,7 +602,7 @@ static int mlx5vf_pci_probe(struct pci_dev *pdev,
 	if (!mvdev)
 		return -ENOMEM;
 	vfio_pci_core_init_device(&mvdev->core_device, pdev, &mlx5vf_pci_ops);
-	mlx5vf_cmd_set_migratable(mvdev);
+	mlx5vf_cmd_set_migratable(mvdev, &mlx5vf_pci_mig_ops);
 	dev_set_drvdata(&pdev->dev, &mvdev->core_device);
 	ret = vfio_pci_core_register_device(&mvdev->core_device);
 	if (ret)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index e22be13e6771..453dc2464969 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1534,8 +1534,8 @@ vfio_ioctl_device_feature_mig_device_state(struct vfio_device *device,
 	struct file *filp = NULL;
 	int ret;
 
-	if (!device->ops->migration_set_state ||
-	    !device->ops->migration_get_state)
+	if (!device->mig_ops->migration_set_state ||
+	    !device->mig_ops->migration_get_state)
 		return -ENOTTY;
 
 	ret = vfio_check_feature(flags, argsz,
@@ -1551,7 +1551,8 @@ vfio_ioctl_device_feature_mig_device_state(struct vfio_device *device,
 	if (flags & VFIO_DEVICE_FEATURE_GET) {
 		enum vfio_device_mig_state curr_state;
 
-		ret = device->ops->migration_get_state(device, &curr_state);
+		ret = device->mig_ops->migration_get_state(device,
+							   &curr_state);
 		if (ret)
 			return ret;
 		mig.device_state = curr_state;
@@ -1559,7 +1560,7 @@ vfio_ioctl_device_feature_mig_device_state(struct vfio_device *device,
 	}
 
 	/* Handle the VFIO_DEVICE_FEATURE_SET */
-	filp = device->ops->migration_set_state(device, mig.device_state);
+	filp = device->mig_ops->migration_set_state(device, mig.device_state);
 	if (IS_ERR(filp) || !filp)
 		goto out_copy;
 
@@ -1582,8 +1583,8 @@ static int vfio_ioctl_device_feature_migration(struct vfio_device *device,
 	};
 	int ret;
 
-	if (!device->ops->migration_set_state ||
-	    !device->ops->migration_get_state)
+	if (!device->mig_ops->migration_set_state ||
+	    !device->mig_ops->migration_get_state)
 		return -ENOTTY;
 
 	ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_GET,
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index aa888cc51757..0453ca970221 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -32,6 +32,7 @@ struct vfio_device_set {
 struct vfio_device {
 	struct device *dev;
 	const struct vfio_device_ops *ops;
+	const struct vfio_migration_ops *mig_ops;
 	struct vfio_group *group;
 	struct vfio_device_set *dev_set;
 	struct list_head dev_set_list;
@@ -61,16 +62,6 @@ struct vfio_device {
  *         match, -errno for abort (ex. match with insufficient or incorrect
  *         additional args)
  * @device_feature: Optional, fill in the VFIO_DEVICE_FEATURE ioctl
- * @migration_set_state: Optional callback to change the migration state for
- *         devices that support migration. It's mandatory for
- *         VFIO_DEVICE_FEATURE_MIGRATION migration support.
- *         The returned FD is used for data transfer according to the FSM
- *         definition. The driver is responsible to ensure that FD reaches end
- *         of stream or error whenever the migration FSM leaves a data transfer
- *         state or before close_device() returns.
- * @migration_get_state: Optional callback to get the migration state for
- *         devices that support migration. It's mandatory for
- *         VFIO_DEVICE_FEATURE_MIGRATION migration support.
  */
 struct vfio_device_ops {
 	char	*name;
@@ -87,6 +78,21 @@ struct vfio_device_ops {
 	int	(*match)(struct vfio_device *vdev, char *buf);
 	int	(*device_feature)(struct vfio_device *device, u32 flags,
 				  void __user *arg, size_t argsz);
+};
+
+/**
+ * @migration_set_state: Optional callback to change the migration state for
+ *         devices that support migration. It's mandatory for
+ *         VFIO_DEVICE_FEATURE_MIGRATION migration support.
+ *         The returned FD is used for data transfer according to the FSM
+ *         definition. The driver is responsible to ensure that FD reaches end
+ *         of stream or error whenever the migration FSM leaves a data transfer
+ *         state or before close_device() returns.
+ * @migration_get_state: Optional callback to get the migration state for
+ *         devices that support migration. It's mandatory for
+ *         VFIO_DEVICE_FEATURE_MIGRATION migration support.
+ */
+struct vfio_migration_ops {
 	struct file *(*migration_set_state)(
 		struct vfio_device *device,
 		enum vfio_device_mig_state new_state);
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* RE: [PATCH vfio 1/2] vfio/mlx5: Protect mlx5vf_disable_fds() upon close device
  2022-06-06  8:56 ` [PATCH vfio 1/2] vfio/mlx5: Protect mlx5vf_disable_fds() upon close device Yishai Hadas
@ 2022-06-10  3:25   ` Tian, Kevin
  0 siblings, 0 replies; 18+ messages in thread
From: Tian, Kevin @ 2022-06-10  3:25 UTC (permalink / raw)
  To: Yishai Hadas, alex.williamson, jgg, kvm
  Cc: maorg, cohuck, shameerali.kolothum.thodi, liulongfang

> From: Yishai Hadas <yishaih@nvidia.com>
> Sent: Monday, June 6, 2022 4:56 PM
> 
> Protect mlx5vf_disable_fds() upon close device to be called under the
> state mutex as done in all other places.
> 
> This will prevent a race with any other flow which calls
> mlx5vf_disable_fds() as of health/recovery upon
> MLX5_PF_NOTIFY_DISABLE_VF event.
> 
> Encapsulate this functionality in a separate function named
> mlx5vf_cmd_close_migratable() to consider migration caps and for further
> usage upon close device.
> 
> Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5 devices")
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
>  drivers/vfio/pci/mlx5/cmd.c  | 10 ++++++++++
>  drivers/vfio/pci/mlx5/cmd.h  |  1 +
>  drivers/vfio/pci/mlx5/main.c |  2 +-
>  3 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
> index 9b9f33ca270a..cdd0c667dc77 100644
> --- a/drivers/vfio/pci/mlx5/cmd.c
> +++ b/drivers/vfio/pci/mlx5/cmd.c
> @@ -88,6 +88,16 @@ static int mlx5fv_vf_event(struct notifier_block *nb,
>  	return 0;
>  }
> 
> +void mlx5vf_cmd_close_migratable(struct mlx5vf_pci_core_device *mvdev)
> +{
> +	if (!mvdev->migrate_cap)
> +		return;
> +
> +	mutex_lock(&mvdev->state_mutex);
> +	mlx5vf_disable_fds(mvdev);
> +	mlx5vf_state_mutex_unlock(mvdev);
> +}
> +
>  void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device
> *mvdev)
>  {
>  	if (!mvdev->migrate_cap)
> diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h
> index 6c3112fdd8b1..aa692d9ce656 100644
> --- a/drivers/vfio/pci/mlx5/cmd.h
> +++ b/drivers/vfio/pci/mlx5/cmd.h
> @@ -64,6 +64,7 @@ int mlx5vf_cmd_query_vhca_migration_state(struct
> mlx5vf_pci_core_device *mvdev,
>  					  size_t *state_size);
>  void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev);
>  void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device
> *mvdev);
> +void mlx5vf_cmd_close_migratable(struct mlx5vf_pci_core_device *mvdev);
>  int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev,
>  			       struct mlx5_vf_migration_file *migf);
>  int mlx5vf_cmd_load_vhca_state(struct mlx5vf_pci_core_device *mvdev,
> diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
> index 0558d0649ddb..d754990f0662 100644
> --- a/drivers/vfio/pci/mlx5/main.c
> +++ b/drivers/vfio/pci/mlx5/main.c
> @@ -570,7 +570,7 @@ static void mlx5vf_pci_close_device(struct
> vfio_device *core_vdev)
>  	struct mlx5vf_pci_core_device *mvdev = container_of(
>  		core_vdev, struct mlx5vf_pci_core_device, core_device.vdev);
> 
> -	mlx5vf_disable_fds(mvdev);
> +	mlx5vf_cmd_close_migratable(mvdev);
>  	vfio_pci_core_close_device(core_vdev);
>  }
> 
> --
> 2.18.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-06  8:56 ` [PATCH vfio 2/2] vfio: Split migration ops from main device ops Yishai Hadas
@ 2022-06-10  3:32   ` Tian, Kevin
  2022-06-13  7:13     ` Yishai Hadas
  0 siblings, 1 reply; 18+ messages in thread
From: Tian, Kevin @ 2022-06-10  3:32 UTC (permalink / raw)
  To: Yishai Hadas, alex.williamson, jgg, kvm
  Cc: maorg, cohuck, shameerali.kolothum.thodi, liulongfang

> From: Yishai Hadas <yishaih@nvidia.com>
> Sent: Monday, June 6, 2022 4:56 PM
> 
> vfio core checks whether the driver sets some migration op (e.g.
> set_state/get_state) and accordingly calls its op.
> 
> However, currently mlx5 driver sets the above ops without regards to its
> migration caps.
> 
> This might lead to unexpected usage/Oops if user space may call to the
> above ops even if the driver doesn't support migration. As for example,
> the migration state_mutex is not initialized in that case.
> 
> The cleanest way to manage that seems to split the migration ops from
> the main device ops, this will let the driver setting them separately
> from the main ops when it's applicable.
> 
> As part of that, changed HISI driver to match this scheme.
> 
> This scheme may enable down the road to come with some extra group of
> ops (e.g. DMA log) that can be set without regards to the other options
> based on driver caps.
> 
> Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5 devices")
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with one nit:

> @@ -1534,8 +1534,8 @@ vfio_ioctl_device_feature_mig_device_state(struct
> vfio_device *device,
>  	struct file *filp = NULL;
>  	int ret;
> 
> -	if (!device->ops->migration_set_state ||
> -	    !device->ops->migration_get_state)
> +	if (!device->mig_ops->migration_set_state ||
> +	    !device->mig_ops->migration_get_state)
>  		return -ENOTTY;

...

> @@ -1582,8 +1583,8 @@ static int
> vfio_ioctl_device_feature_migration(struct vfio_device *device,
>  	};
>  	int ret;
> 
> -	if (!device->ops->migration_set_state ||
> -	    !device->ops->migration_get_state)
> +	if (!device->mig_ops->migration_set_state ||
> +	    !device->mig_ops->migration_get_state)
>  		return -ENOTTY;
> 

Above checks can be done once when the device is registered then
here replaced with a single check on device->mig_ops. 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-10  3:32   ` Tian, Kevin
@ 2022-06-13  7:13     ` Yishai Hadas
  2022-06-16 14:18       ` Yishai Hadas
  0 siblings, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2022-06-13  7:13 UTC (permalink / raw)
  To: Tian, Kevin, alex.williamson, jgg, kvm
  Cc: maorg, cohuck, shameerali.kolothum.thodi, liulongfang

On 10/06/2022 6:32, Tian, Kevin wrote:
>> From: Yishai Hadas <yishaih@nvidia.com>
>> Sent: Monday, June 6, 2022 4:56 PM
>>
>> vfio core checks whether the driver sets some migration op (e.g.
>> set_state/get_state) and accordingly calls its op.
>>
>> However, currently mlx5 driver sets the above ops without regards to its
>> migration caps.
>>
>> This might lead to unexpected usage/Oops if user space may call to the
>> above ops even if the driver doesn't support migration. As for example,
>> the migration state_mutex is not initialized in that case.
>>
>> The cleanest way to manage that seems to split the migration ops from
>> the main device ops, this will let the driver setting them separately
>> from the main ops when it's applicable.
>>
>> As part of that, changed HISI driver to match this scheme.
>>
>> This scheme may enable down the road to come with some extra group of
>> ops (e.g. DMA log) that can be set without regards to the other options
>> based on driver caps.
>>
>> Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5 devices")
>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with one nit:

Thanks Kevin, please see below.

>
>> @@ -1534,8 +1534,8 @@ vfio_ioctl_device_feature_mig_device_state(struct
>> vfio_device *device,
>>   	struct file *filp = NULL;
>>   	int ret;
>>
>> -	if (!device->ops->migration_set_state ||
>> -	    !device->ops->migration_get_state)
>> +	if (!device->mig_ops->migration_set_state ||
>> +	    !device->mig_ops->migration_get_state)
>>   		return -ENOTTY;
> ...
>
>> @@ -1582,8 +1583,8 @@ static int
>> vfio_ioctl_device_feature_migration(struct vfio_device *device,
>>   	};
>>   	int ret;
>>
>> -	if (!device->ops->migration_set_state ||
>> -	    !device->ops->migration_get_state)
>> +	if (!device->mig_ops->migration_set_state ||
>> +	    !device->mig_ops->migration_get_state)
>>   		return -ENOTTY;
>>
> Above checks can be done once when the device is registered then
> here replaced with a single check on device->mig_ops.
>
I agree, it may look as of below.

Theoretically, this could be done even before this patch upon device 
registration.

We could check that both 'ops' were set and *not* only one of and later 
check for the specific 'op' upon the feature request.

Alex,

Do you prefer to switch to the below as part of V2 or stay as of current 
submission and I'll just add Kevin as Reviewed-by ?

diff --git a/drivers/vfio/pci/vfio_pci_core.c 
b/drivers/vfio/pci/vfio_pci_core.c
index a0d69ddaf90d..f42102a03851 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1855,6 +1855,11 @@ int vfio_pci_core_register_device(struct 
vfio_pci_core_device *vdev)
         if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
                 return -EINVAL;

+       if (vdev->vdev.mig_ops &&
+          !(vdev->vdev.mig_ops->migration_get_state &&
+            vdev->vdev.mig_ops->migration_get_state))
+               return -EINVAL;
+

Yishai


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-13  7:13     ` Yishai Hadas
@ 2022-06-16 14:18       ` Yishai Hadas
  2022-06-16 23:01         ` Alex Williamson
  0 siblings, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2022-06-16 14:18 UTC (permalink / raw)
  To: Tian, Kevin, alex.williamson
  Cc: maorg, cohuck, shameerali.kolothum.thodi, liulongfang, kvm, jgg

On 13/06/2022 10:13, Yishai Hadas wrote:
> On 10/06/2022 6:32, Tian, Kevin wrote:
>>> From: Yishai Hadas <yishaih@nvidia.com>
>>> Sent: Monday, June 6, 2022 4:56 PM
>>>
>>> vfio core checks whether the driver sets some migration op (e.g.
>>> set_state/get_state) and accordingly calls its op.
>>>
>>> However, currently mlx5 driver sets the above ops without regards to 
>>> its
>>> migration caps.
>>>
>>> This might lead to unexpected usage/Oops if user space may call to the
>>> above ops even if the driver doesn't support migration. As for example,
>>> the migration state_mutex is not initialized in that case.
>>>
>>> The cleanest way to manage that seems to split the migration ops from
>>> the main device ops, this will let the driver setting them separately
>>> from the main ops when it's applicable.
>>>
>>> As part of that, changed HISI driver to match this scheme.
>>>
>>> This scheme may enable down the road to come with some extra group of
>>> ops (e.g. DMA log) that can be set without regards to the other options
>>> based on driver caps.
>>>
>>> Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5 
>>> devices")
>>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with one nit:
>
> Thanks Kevin, please see below.
>
>>
>>> @@ -1534,8 +1534,8 @@ vfio_ioctl_device_feature_mig_device_state(struct
>>> vfio_device *device,
>>>       struct file *filp = NULL;
>>>       int ret;
>>>
>>> -    if (!device->ops->migration_set_state ||
>>> -        !device->ops->migration_get_state)
>>> +    if (!device->mig_ops->migration_set_state ||
>>> +        !device->mig_ops->migration_get_state)
>>>           return -ENOTTY;
>> ...
>>
>>> @@ -1582,8 +1583,8 @@ static int
>>> vfio_ioctl_device_feature_migration(struct vfio_device *device,
>>>       };
>>>       int ret;
>>>
>>> -    if (!device->ops->migration_set_state ||
>>> -        !device->ops->migration_get_state)
>>> +    if (!device->mig_ops->migration_set_state ||
>>> +        !device->mig_ops->migration_get_state)
>>>           return -ENOTTY;
>>>
>> Above checks can be done once when the device is registered then
>> here replaced with a single check on device->mig_ops.
>>
> I agree, it may look as of below.
>
> Theoretically, this could be done even before this patch upon device 
> registration.
>
> We could check that both 'ops' were set and *not* only one of and 
> later check for the specific 'op' upon the feature request.
>
> Alex,
>
> Do you prefer to switch to the below as part of V2 or stay as of 
> current submission and I'll just add Kevin as Reviewed-by ?
>
> diff --git a/drivers/vfio/pci/vfio_pci_core.c 
> b/drivers/vfio/pci/vfio_pci_core.c
> index a0d69ddaf90d..f42102a03851 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1855,6 +1855,11 @@ int vfio_pci_core_register_device(struct 
> vfio_pci_core_device *vdev)
>         if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
>                 return -EINVAL;
>
> +       if (vdev->vdev.mig_ops &&
> +          !(vdev->vdev.mig_ops->migration_get_state &&
> +            vdev->vdev.mig_ops->migration_get_state))
> +               return -EINVAL;
> +
>
> Yishai
>
Hi Alex,

Did you have the chance to review the above note ?

I would like to send V2 with Kevin's Reviewed-by tag for both patches, 
just wonder about the nit.

Yishai


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-16 14:18       ` Yishai Hadas
@ 2022-06-16 23:01         ` Alex Williamson
  2022-06-17  3:58           ` Tian, Kevin
                             ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Alex Williamson @ 2022-06-16 23:01 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Tian, Kevin, maorg, cohuck, shameerali.kolothum.thodi,
	liulongfang, kvm, jgg

On Thu, 16 Jun 2022 17:18:16 +0300
Yishai Hadas <yishaih@nvidia.com> wrote:

> On 13/06/2022 10:13, Yishai Hadas wrote:
> > On 10/06/2022 6:32, Tian, Kevin wrote:  
> >>> From: Yishai Hadas <yishaih@nvidia.com>
> >>> Sent: Monday, June 6, 2022 4:56 PM
> >>>
> >>> vfio core checks whether the driver sets some migration op (e.g.
> >>> set_state/get_state) and accordingly calls its op.
> >>>
> >>> However, currently mlx5 driver sets the above ops without regards to 
> >>> its
> >>> migration caps.
> >>>
> >>> This might lead to unexpected usage/Oops if user space may call to the
> >>> above ops even if the driver doesn't support migration. As for example,
> >>> the migration state_mutex is not initialized in that case.
> >>>
> >>> The cleanest way to manage that seems to split the migration ops from
> >>> the main device ops, this will let the driver setting them separately
> >>> from the main ops when it's applicable.
> >>>
> >>> As part of that, changed HISI driver to match this scheme.
> >>>
> >>> This scheme may enable down the road to come with some extra group of
> >>> ops (e.g. DMA log) that can be set without regards to the other options
> >>> based on driver caps.
> >>>
> >>> Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5 
> >>> devices")
> >>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>  
> >> Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with one nit:  
> >
> > Thanks Kevin, please see below.
> >  
> >>  
> >>> @@ -1534,8 +1534,8 @@ vfio_ioctl_device_feature_mig_device_state(struct
> >>> vfio_device *device,
> >>>       struct file *filp = NULL;
> >>>       int ret;
> >>>
> >>> -    if (!device->ops->migration_set_state ||
> >>> -        !device->ops->migration_get_state)
> >>> +    if (!device->mig_ops->migration_set_state ||
> >>> +        !device->mig_ops->migration_get_state)
> >>>           return -ENOTTY;  
> >> ...
> >>  
> >>> @@ -1582,8 +1583,8 @@ static int
> >>> vfio_ioctl_device_feature_migration(struct vfio_device *device,
> >>>       };
> >>>       int ret;
> >>>
> >>> -    if (!device->ops->migration_set_state ||
> >>> -        !device->ops->migration_get_state)
> >>> +    if (!device->mig_ops->migration_set_state ||
> >>> +        !device->mig_ops->migration_get_state)
> >>>           return -ENOTTY;
> >>>  
> >> Above checks can be done once when the device is registered then
> >> here replaced with a single check on device->mig_ops.
> >>  
> > I agree, it may look as of below.
> >
> > Theoretically, this could be done even before this patch upon device 
> > registration.
> >
> > We could check that both 'ops' were set and *not* only one of and 
> > later check for the specific 'op' upon the feature request.
> >
> > Alex,
> >
> > Do you prefer to switch to the below as part of V2 or stay as of 
> > current submission and I'll just add Kevin as Reviewed-by ?
> >
> > diff --git a/drivers/vfio/pci/vfio_pci_core.c 
> > b/drivers/vfio/pci/vfio_pci_core.c
> > index a0d69ddaf90d..f42102a03851 100644
> > --- a/drivers/vfio/pci/vfio_pci_core.c
> > +++ b/drivers/vfio/pci/vfio_pci_core.c
> > @@ -1855,6 +1855,11 @@ int vfio_pci_core_register_device(struct 
> > vfio_pci_core_device *vdev)
> >         if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
> >                 return -EINVAL;
> >
> > +       if (vdev->vdev.mig_ops &&
> > +          !(vdev->vdev.mig_ops->migration_get_state &&
> > +            vdev->vdev.mig_ops->migration_get_state))
> > +               return -EINVAL;
> > +
> >
> > Yishai
> >  
> Hi Alex,
> 
> Did you have the chance to review the above note ?
> 
> I would like to send V2 with Kevin's Reviewed-by tag for both patches, 
> just wonder about the nit.

The whole mig_ops handling seems rather ad-hoc to me.  What tells a
developer when mig_ops can be set?  Can they be unset?  Does this
enable a device feature callout to dynamically enable migration?  Why do
we init the device with vfio_device_ops, but then open code per driver
setting vfio_migration_ops?

For the hisi_acc changes, they're specifically defining two device_ops,
one for migration and one without migration.  This patch oddly makes
them identical other than the name and doesn't simplify the setup path,
which should now only require a single call point for init-device with
the proposed API rather than the existing three.

If we remove the migration callbacks from vfio_device_ops, there's also
an opportunity that we can fix the vfio_device_ops.name handling in
hisi_acc, where the only current user of this is to set the driver
override for spawned VFs from a vfio owned PF.  This is probably
irrelevant for this driver, but clearly we have an expectation there
that name is the name of the module providing the ops and we have no
module named hisi-acc-vfio-pci-migration.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-16 23:01         ` Alex Williamson
@ 2022-06-17  3:58           ` Tian, Kevin
  2022-06-17  4:04           ` Tian, Kevin
  2022-06-17  8:50           ` Shameerali Kolothum Thodi
  2 siblings, 0 replies; 18+ messages in thread
From: Tian, Kevin @ 2022-06-17  3:58 UTC (permalink / raw)
  To: Alex Williamson, Yishai Hadas
  Cc: maorg, cohuck, shameerali.kolothum.thodi, liulongfang, kvm, jgg

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Friday, June 17, 2022 7:01 AM
> 
> The whole mig_ops handling seems rather ad-hoc to me.  What tells a
> developer when mig_ops can be set?  Can they be unset?  Does this
> enable a device feature callout to dynamically enable migration?  Why do
> we init the device with vfio_device_ops, but then open code per driver
> setting vfio_migration_ops?
> 

A simpler fix could be treating device->migration_flags==0 as indicator
of no support of migration. w/o additional flags only RUNNING and
ERROR states are supported which cannot support migration. Failing
the related device feature ops in such case sounds clearer to me. 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-16 23:01         ` Alex Williamson
  2022-06-17  3:58           ` Tian, Kevin
@ 2022-06-17  4:04           ` Tian, Kevin
  2022-06-17  8:50           ` Shameerali Kolothum Thodi
  2 siblings, 0 replies; 18+ messages in thread
From: Tian, Kevin @ 2022-06-17  4:04 UTC (permalink / raw)
  To: Alex Williamson, Yishai Hadas
  Cc: maorg, cohuck, shameerali.kolothum.thodi, liulongfang, kvm, jgg

> From: Tian, Kevin
> Sent: Friday, June 17, 2022 11:59 AM
> 
> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Friday, June 17, 2022 7:01 AM
> >
> > The whole mig_ops handling seems rather ad-hoc to me.  What tells a
> > developer when mig_ops can be set?  Can they be unset?  Does this
> > enable a device feature callout to dynamically enable migration?  Why do
> > we init the device with vfio_device_ops, but then open code per driver
> > setting vfio_migration_ops?
> >
> 
> A simpler fix could be treating device->migration_flags==0 as indicator
> of no support of migration. w/o additional flags only RUNNING and
> ERROR states are supported which cannot support migration. Failing
> the related device feature ops in such case sounds clearer to me.

Actually this check is anyway required given the uAPI requires STOP_COPY
**must** be supported:

/*
 * Indicates the device can support the migration API through
 * VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE. If this GET succeeds, the RUNNING and
 * ERROR states are always supported. Support for additional states is
 * indicated via the flags field; at least VFIO_MIGRATION_STOP_COPY must be
 * set.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-16 23:01         ` Alex Williamson
  2022-06-17  3:58           ` Tian, Kevin
  2022-06-17  4:04           ` Tian, Kevin
@ 2022-06-17  8:50           ` Shameerali Kolothum Thodi
  2022-06-17 14:47             ` Alex Williamson
  2 siblings, 1 reply; 18+ messages in thread
From: Shameerali Kolothum Thodi @ 2022-06-17  8:50 UTC (permalink / raw)
  To: Alex Williamson, Yishai Hadas
  Cc: Tian, Kevin, maorg, cohuck, liulongfang, kvm, jgg



> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: 17 June 2022 00:01
> To: Yishai Hadas <yishaih@nvidia.com>
> Cc: Tian, Kevin <kevin.tian@intel.com>; maorg@nvidia.com;
> cohuck@redhat.com; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; liulongfang
> <liulongfang@huawei.com>; kvm@vger.kernel.org; jgg@nvidia.com
> Subject: Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
> 
> On Thu, 16 Jun 2022 17:18:16 +0300
> Yishai Hadas <yishaih@nvidia.com> wrote:
> 
> > On 13/06/2022 10:13, Yishai Hadas wrote:
> > > On 10/06/2022 6:32, Tian, Kevin wrote:
> > >>> From: Yishai Hadas <yishaih@nvidia.com>
> > >>> Sent: Monday, June 6, 2022 4:56 PM
> > >>>
> > >>> vfio core checks whether the driver sets some migration op (e.g.
> > >>> set_state/get_state) and accordingly calls its op.
> > >>>
> > >>> However, currently mlx5 driver sets the above ops without regards to
> > >>> its
> > >>> migration caps.
> > >>>
> > >>> This might lead to unexpected usage/Oops if user space may call to the
> > >>> above ops even if the driver doesn't support migration. As for example,
> > >>> the migration state_mutex is not initialized in that case.
> > >>>
> > >>> The cleanest way to manage that seems to split the migration ops from
> > >>> the main device ops, this will let the driver setting them separately
> > >>> from the main ops when it's applicable.
> > >>>
> > >>> As part of that, changed HISI driver to match this scheme.
> > >>>
> > >>> This scheme may enable down the road to come with some extra
> group of
> > >>> ops (e.g. DMA log) that can be set without regards to the other options
> > >>> based on driver caps.
> > >>>
> > >>> Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5
> > >>> devices")
> > >>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> > >> Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with one nit:
> > >
> > > Thanks Kevin, please see below.
> > >
> > >>
> > >>> @@ -1534,8 +1534,8 @@
> vfio_ioctl_device_feature_mig_device_state(struct
> > >>> vfio_device *device,
> > >>>       struct file *filp = NULL;
> > >>>       int ret;
> > >>>
> > >>> -    if (!device->ops->migration_set_state ||
> > >>> -        !device->ops->migration_get_state)
> > >>> +    if (!device->mig_ops->migration_set_state ||
> > >>> +        !device->mig_ops->migration_get_state)
> > >>>           return -ENOTTY;
> > >> ...
> > >>
> > >>> @@ -1582,8 +1583,8 @@ static int
> > >>> vfio_ioctl_device_feature_migration(struct vfio_device *device,
> > >>>       };
> > >>>       int ret;
> > >>>
> > >>> -    if (!device->ops->migration_set_state ||
> > >>> -        !device->ops->migration_get_state)
> > >>> +    if (!device->mig_ops->migration_set_state ||
> > >>> +        !device->mig_ops->migration_get_state)
> > >>>           return -ENOTTY;
> > >>>
> > >> Above checks can be done once when the device is registered then
> > >> here replaced with a single check on device->mig_ops.
> > >>
> > > I agree, it may look as of below.
> > >
> > > Theoretically, this could be done even before this patch upon device
> > > registration.
> > >
> > > We could check that both 'ops' were set and *not* only one of and
> > > later check for the specific 'op' upon the feature request.
> > >
> > > Alex,
> > >
> > > Do you prefer to switch to the below as part of V2 or stay as of
> > > current submission and I'll just add Kevin as Reviewed-by ?
> > >
> > > diff --git a/drivers/vfio/pci/vfio_pci_core.c
> > > b/drivers/vfio/pci/vfio_pci_core.c
> > > index a0d69ddaf90d..f42102a03851 100644
> > > --- a/drivers/vfio/pci/vfio_pci_core.c
> > > +++ b/drivers/vfio/pci/vfio_pci_core.c
> > > @@ -1855,6 +1855,11 @@ int vfio_pci_core_register_device(struct
> > > vfio_pci_core_device *vdev)
> > >         if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
> > >                 return -EINVAL;
> > >
> > > +       if (vdev->vdev.mig_ops &&
> > > +          !(vdev->vdev.mig_ops->migration_get_state &&
> > > +            vdev->vdev.mig_ops->migration_get_state))
> > > +               return -EINVAL;
> > > +
> > >
> > > Yishai
> > >
> > Hi Alex,
> >
> > Did you have the chance to review the above note ?
> >
> > I would like to send V2 with Kevin's Reviewed-by tag for both patches,
> > just wonder about the nit.
> 
> The whole mig_ops handling seems rather ad-hoc to me.  What tells a
> developer when mig_ops can be set?  Can they be unset?  Does this
> enable a device feature callout to dynamically enable migration?  Why do
> we init the device with vfio_device_ops, but then open code per driver
> setting vfio_migration_ops?
> 
> For the hisi_acc changes, they're specifically defining two device_ops,
> one for migration and one without migration.  This patch oddly makes
> them identical other than the name and doesn't simplify the setup path,
> which should now only require a single call point for init-device with
> the proposed API rather than the existing three.

I think one of the reasons we ended up using two diff ops were to restrict
exposing the migration BAR regions to Guest(and for that we only expose
half of the BAR region now). And for nested assignments this may result
in invalid BAR sizes if we do it for no migration cases. Hence we have
overrides for read/write/mmap/ioctl functions in hisi_acc_vfio_pci_migrn_ops .

If we have to simplify the setup path, I guess we retain only the _migrn_ops and
add explicit checks for migration support in the above override functions.  

> 
> If we remove the migration callbacks from vfio_device_ops, there's also
> an opportunity that we can fix the vfio_device_ops.name handling in
> hisi_acc, where the only current user of this is to set the driver
> override for spawned VFs from a vfio owned PF.  This is probably
> irrelevant for this driver, but clearly we have an expectation there
> that name is the name of the module providing the ops and we have no
> module named hisi-acc-vfio-pci-migration.  Thanks,
> 

Since the driver is mainly to provide migration support, I am just thinking
whether it make sense to return failure in probe() if migration cannot be
supported and then if user wishes can bind to the generic vfio-pci driver
instead.

Thanks,
Shameer

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-17  8:50           ` Shameerali Kolothum Thodi
@ 2022-06-17 14:47             ` Alex Williamson
  2022-06-19  9:25               ` Yishai Hadas
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2022-06-17 14:47 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: Yishai Hadas, Tian, Kevin, maorg, cohuck, liulongfang, kvm, jgg

On Fri, 17 Jun 2022 08:50:00 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:

> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > Sent: 17 June 2022 00:01
> > To: Yishai Hadas <yishaih@nvidia.com>
> > Cc: Tian, Kevin <kevin.tian@intel.com>; maorg@nvidia.com;
> > cohuck@redhat.com; Shameerali Kolothum Thodi
> > <shameerali.kolothum.thodi@huawei.com>; liulongfang
> > <liulongfang@huawei.com>; kvm@vger.kernel.org; jgg@nvidia.com
> > Subject: Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
> > 
> > On Thu, 16 Jun 2022 17:18:16 +0300
> > Yishai Hadas <yishaih@nvidia.com> wrote:
> >   
> > > On 13/06/2022 10:13, Yishai Hadas wrote:  
> > > > On 10/06/2022 6:32, Tian, Kevin wrote:  
> > > >>> From: Yishai Hadas <yishaih@nvidia.com>
> > > >>> Sent: Monday, June 6, 2022 4:56 PM
> > > >>>
> > > >>> vfio core checks whether the driver sets some migration op (e.g.
> > > >>> set_state/get_state) and accordingly calls its op.
> > > >>>
> > > >>> However, currently mlx5 driver sets the above ops without regards to
> > > >>> its
> > > >>> migration caps.
> > > >>>
> > > >>> This might lead to unexpected usage/Oops if user space may call to the
> > > >>> above ops even if the driver doesn't support migration. As for example,
> > > >>> the migration state_mutex is not initialized in that case.
> > > >>>
> > > >>> The cleanest way to manage that seems to split the migration ops from
> > > >>> the main device ops, this will let the driver setting them separately
> > > >>> from the main ops when it's applicable.
> > > >>>
> > > >>> As part of that, changed HISI driver to match this scheme.
> > > >>>
> > > >>> This scheme may enable down the road to come with some extra  
> > group of  
> > > >>> ops (e.g. DMA log) that can be set without regards to the other options
> > > >>> based on driver caps.
> > > >>>
> > > >>> Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5
> > > >>> devices")
> > > >>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>  
> > > >> Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with one nit:  
> > > >
> > > > Thanks Kevin, please see below.
> > > >  
> > > >>  
> > > >>> @@ -1534,8 +1534,8 @@  
> > vfio_ioctl_device_feature_mig_device_state(struct  
> > > >>> vfio_device *device,
> > > >>>       struct file *filp = NULL;
> > > >>>       int ret;
> > > >>>
> > > >>> -    if (!device->ops->migration_set_state ||
> > > >>> -        !device->ops->migration_get_state)
> > > >>> +    if (!device->mig_ops->migration_set_state ||
> > > >>> +        !device->mig_ops->migration_get_state)
> > > >>>           return -ENOTTY;  
> > > >> ...
> > > >>  
> > > >>> @@ -1582,8 +1583,8 @@ static int
> > > >>> vfio_ioctl_device_feature_migration(struct vfio_device *device,
> > > >>>       };
> > > >>>       int ret;
> > > >>>
> > > >>> -    if (!device->ops->migration_set_state ||
> > > >>> -        !device->ops->migration_get_state)
> > > >>> +    if (!device->mig_ops->migration_set_state ||
> > > >>> +        !device->mig_ops->migration_get_state)
> > > >>>           return -ENOTTY;
> > > >>>  
> > > >> Above checks can be done once when the device is registered then
> > > >> here replaced with a single check on device->mig_ops.
> > > >>  
> > > > I agree, it may look as of below.
> > > >
> > > > Theoretically, this could be done even before this patch upon device
> > > > registration.
> > > >
> > > > We could check that both 'ops' were set and *not* only one of and
> > > > later check for the specific 'op' upon the feature request.
> > > >
> > > > Alex,
> > > >
> > > > Do you prefer to switch to the below as part of V2 or stay as of
> > > > current submission and I'll just add Kevin as Reviewed-by ?
> > > >
> > > > diff --git a/drivers/vfio/pci/vfio_pci_core.c
> > > > b/drivers/vfio/pci/vfio_pci_core.c
> > > > index a0d69ddaf90d..f42102a03851 100644
> > > > --- a/drivers/vfio/pci/vfio_pci_core.c
> > > > +++ b/drivers/vfio/pci/vfio_pci_core.c
> > > > @@ -1855,6 +1855,11 @@ int vfio_pci_core_register_device(struct
> > > > vfio_pci_core_device *vdev)
> > > >         if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
> > > >                 return -EINVAL;
> > > >
> > > > +       if (vdev->vdev.mig_ops &&
> > > > +          !(vdev->vdev.mig_ops->migration_get_state &&
> > > > +            vdev->vdev.mig_ops->migration_get_state))
> > > > +               return -EINVAL;
> > > > +
> > > >
> > > > Yishai
> > > >  
> > > Hi Alex,
> > >
> > > Did you have the chance to review the above note ?
> > >
> > > I would like to send V2 with Kevin's Reviewed-by tag for both patches,
> > > just wonder about the nit.  
> > 
> > The whole mig_ops handling seems rather ad-hoc to me.  What tells a
> > developer when mig_ops can be set?  Can they be unset?  Does this
> > enable a device feature callout to dynamically enable migration?  Why do
> > we init the device with vfio_device_ops, but then open code per driver
> > setting vfio_migration_ops?
> > 
> > For the hisi_acc changes, they're specifically defining two device_ops,
> > one for migration and one without migration.  This patch oddly makes
> > them identical other than the name and doesn't simplify the setup path,
> > which should now only require a single call point for init-device with
> > the proposed API rather than the existing three.  
> 
> I think one of the reasons we ended up using two diff ops were to restrict
> exposing the migration BAR regions to Guest(and for that we only expose
> half of the BAR region now). And for nested assignments this may result
> in invalid BAR sizes if we do it for no migration cases. Hence we have
> overrides for read/write/mmap/ioctl functions in hisi_acc_vfio_pci_migrn_ops .

Darn, I always forget about that and skimmed too quickly, the close,
read, write, and mmap callbacks are all different.  Sorry about that.

> If we have to simplify the setup path, I guess we retain only the _migrn_ops and
> add explicit checks for migration support in the above override functions.  

Yes, that's an option and none of those callbacks should be terribly
performance sensitive to an inline test, it's just a matter of where we
want to simplify the code.  Given the additional differences in
callbacks I'm not necessarily suggesting this is something we need to
change.

> > If we remove the migration callbacks from vfio_device_ops, there's also
> > an opportunity that we can fix the vfio_device_ops.name handling in
> > hisi_acc, where the only current user of this is to set the driver
> > override for spawned VFs from a vfio owned PF.  This is probably
> > irrelevant for this driver, but clearly we have an expectation there
> > that name is the name of the module providing the ops and we have no
> > module named hisi-acc-vfio-pci-migration.  Thanks,
> >   
> 
> Since the driver is mainly to provide migration support, I am just thinking
> whether it make sense to return failure in probe() if migration cannot be
> supported and then if user wishes can bind to the generic vfio-pci driver
> instead.

That might be asking too much for userspace.  We're conveying to
userspace via the modinfo alias information which vfio-pci variant to
use for the device, but then to have the driver reject the device and
expect userspace to move to the next match seems like a lot to ask.  It
doesn't seem wrong for a driver to have multiple vfio_device_ops, but
we might need to think more about the purpose of the name field, if the
use of it for capturing VFs is valid, or if we'd rather have some sort
of explicit capture driver provided by the ops.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-17 14:47             ` Alex Williamson
@ 2022-06-19  9:25               ` Yishai Hadas
  2022-06-20  3:49                 ` Jason Gunthorpe
  0 siblings, 1 reply; 18+ messages in thread
From: Yishai Hadas @ 2022-06-19  9:25 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Tian, Kevin, maorg, cohuck, liulongfang, kvm, jgg,
	Shameerali Kolothum Thodi

On 17/06/2022 17:47, Alex Williamson wrote:
> On Fri, 17 Jun 2022 08:50:00 +0000
> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:
>
>>> -----Original Message-----
>>> From: Alex Williamson [mailto:alex.williamson@redhat.com]
>>> Sent: 17 June 2022 00:01
>>> To: Yishai Hadas <yishaih@nvidia.com>
>>> Cc: Tian, Kevin <kevin.tian@intel.com>; maorg@nvidia.com;
>>> cohuck@redhat.com; Shameerali Kolothum Thodi
>>> <shameerali.kolothum.thodi@huawei.com>; liulongfang
>>> <liulongfang@huawei.com>; kvm@vger.kernel.org; jgg@nvidia.com
>>> Subject: Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
>>>
>>> On Thu, 16 Jun 2022 17:18:16 +0300
>>> Yishai Hadas <yishaih@nvidia.com> wrote:
>>>    
>>>> On 13/06/2022 10:13, Yishai Hadas wrote:
>>>>> On 10/06/2022 6:32, Tian, Kevin wrote:
>>>>>>> From: Yishai Hadas <yishaih@nvidia.com>
>>>>>>> Sent: Monday, June 6, 2022 4:56 PM
>>>>>>>
>>>>>>> vfio core checks whether the driver sets some migration op (e.g.
>>>>>>> set_state/get_state) and accordingly calls its op.
>>>>>>>
>>>>>>> However, currently mlx5 driver sets the above ops without regards to
>>>>>>> its
>>>>>>> migration caps.
>>>>>>>
>>>>>>> This might lead to unexpected usage/Oops if user space may call to the
>>>>>>> above ops even if the driver doesn't support migration. As for example,
>>>>>>> the migration state_mutex is not initialized in that case.
>>>>>>>
>>>>>>> The cleanest way to manage that seems to split the migration ops from
>>>>>>> the main device ops, this will let the driver setting them separately
>>>>>>> from the main ops when it's applicable.
>>>>>>>
>>>>>>> As part of that, changed HISI driver to match this scheme.
>>>>>>>
>>>>>>> This scheme may enable down the road to come with some extra
>>> group of
>>>>>>> ops (e.g. DMA log) that can be set without regards to the other options
>>>>>>> based on driver caps.
>>>>>>>
>>>>>>> Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5
>>>>>>> devices")
>>>>>>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
>>>>>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with one nit:
>>>>> Thanks Kevin, please see below.
>>>>>   
>>>>>>   
>>>>>>> @@ -1534,8 +1534,8 @@
>>> vfio_ioctl_device_feature_mig_device_state(struct
>>>>>>> vfio_device *device,
>>>>>>>        struct file *filp = NULL;
>>>>>>>        int ret;
>>>>>>>
>>>>>>> -    if (!device->ops->migration_set_state ||
>>>>>>> -        !device->ops->migration_get_state)
>>>>>>> +    if (!device->mig_ops->migration_set_state ||
>>>>>>> +        !device->mig_ops->migration_get_state)
>>>>>>>            return -ENOTTY;
>>>>>> ...
>>>>>>   
>>>>>>> @@ -1582,8 +1583,8 @@ static int
>>>>>>> vfio_ioctl_device_feature_migration(struct vfio_device *device,
>>>>>>>        };
>>>>>>>        int ret;
>>>>>>>
>>>>>>> -    if (!device->ops->migration_set_state ||
>>>>>>> -        !device->ops->migration_get_state)
>>>>>>> +    if (!device->mig_ops->migration_set_state ||
>>>>>>> +        !device->mig_ops->migration_get_state)
>>>>>>>            return -ENOTTY;
>>>>>>>   
>>>>>> Above checks can be done once when the device is registered then
>>>>>> here replaced with a single check on device->mig_ops.
>>>>>>   
>>>>> I agree, it may look as of below.
>>>>>
>>>>> Theoretically, this could be done even before this patch upon device
>>>>> registration.
>>>>>
>>>>> We could check that both 'ops' were set and *not* only one of and
>>>>> later check for the specific 'op' upon the feature request.
>>>>>
>>>>> Alex,
>>>>>
>>>>> Do you prefer to switch to the below as part of V2 or stay as of
>>>>> current submission and I'll just add Kevin as Reviewed-by ?
>>>>>
>>>>> diff --git a/drivers/vfio/pci/vfio_pci_core.c
>>>>> b/drivers/vfio/pci/vfio_pci_core.c
>>>>> index a0d69ddaf90d..f42102a03851 100644
>>>>> --- a/drivers/vfio/pci/vfio_pci_core.c
>>>>> +++ b/drivers/vfio/pci/vfio_pci_core.c
>>>>> @@ -1855,6 +1855,11 @@ int vfio_pci_core_register_device(struct
>>>>> vfio_pci_core_device *vdev)
>>>>>          if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
>>>>>                  return -EINVAL;
>>>>>
>>>>> +       if (vdev->vdev.mig_ops &&
>>>>> +          !(vdev->vdev.mig_ops->migration_get_state &&
>>>>> +            vdev->vdev.mig_ops->migration_get_state))
>>>>> +               return -EINVAL;
>>>>> +
>>>>>
>>>>> Yishai
>>>>>   
>>>> Hi Alex,
>>>>
>>>> Did you have the chance to review the above note ?
>>>>
>>>> I would like to send V2 with Kevin's Reviewed-by tag for both patches,
>>>> just wonder about the nit.
>>> The whole mig_ops handling seems rather ad-hoc to me.  What tells a
>>> developer when mig_ops can be set?  Can they be unset?  Does this
>>> enable a device feature callout to dynamically enable migration?  Why do
>>> we init the device with vfio_device_ops, but then open code per driver
>>> setting vfio_migration_ops?
>>>
>>> For the hisi_acc changes, they're specifically defining two device_ops,
>>> one for migration and one without migration.  This patch oddly makes
>>> them identical other than the name and doesn't simplify the setup path,
>>> which should now only require a single call point for init-device with
>>> the proposed API rather than the existing three.
>> I think one of the reasons we ended up using two diff ops were to restrict
>> exposing the migration BAR regions to Guest(and for that we only expose
>> half of the BAR region now). And for nested assignments this may result
>> in invalid BAR sizes if we do it for no migration cases. Hence we have
>> overrides for read/write/mmap/ioctl functions in hisi_acc_vfio_pci_migrn_ops .
> Darn, I always forget about that and skimmed too quickly, the close,
> read, write, and mmap callbacks are all different.  Sorry about that.
>
>> If we have to simplify the setup path, I guess we retain only the _migrn_ops and
>> add explicit checks for migration support in the above override functions.
> Yes, that's an option and none of those callbacks should be terribly
> performance sensitive to an inline test, it's just a matter of where we
> want to simplify the code.  Given the additional differences in
> callbacks I'm not necessarily suggesting this is something we need to
> change.


Alex,

So, do you suggest to go via this approach for mlx5 as well ?

Means, staying with a single device_ops but just inline a check whether 
migration is really supported inside the migration get/set state 
callbacks and let the core call it unconditionally.

Following this approach may introduce extra 'if's for the coming dirty 
tracking 'ops' that may face the same issue with the need to maintain an 
extra cap bit inside the driver for 'dirty tracking' support.

Assuming that an extra 'if' is not performance issue even not on the 
'read and clear' flow (is it ?) this can work.

However, the alternative that was suggested by this patch could be 
cleaner I assume at least for mlx5 and may drop all those extra 'if's, etc.

Let's converge and I'll send V2 accordingly.

Yishai


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-19  9:25               ` Yishai Hadas
@ 2022-06-20  3:49                 ` Jason Gunthorpe
  2022-06-21 16:41                   ` Alex Williamson
  0 siblings, 1 reply; 18+ messages in thread
From: Jason Gunthorpe @ 2022-06-20  3:49 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Alex Williamson, Tian, Kevin, maorg, cohuck, liulongfang, kvm,
	Shameerali Kolothum Thodi

On Sun, Jun 19, 2022 at 12:25:50PM +0300, Yishai Hadas wrote:

> Means, staying with a single device_ops but just inline a check whether
> migration is really supported inside the migration get/set state callbacks
> and let the core call it unconditionally.

I find it much cleaner to have op == NULL means unsupported.

As soon as you start linking supported/unsupported to other flags it
can get very complicated fairly fast. I have this experiance from RDMA
where we've spent a long time now ripping out hundreds of flag tests
and replacing them with NULL op checks. Many bugs were fixed doing
this as drivers never fully understood what the flags mean and ended
up with flags set that their driver doesn't properly implement.

The mistake we made in RDMA was not splitting the ops, instead the ops
were left mutable so the driver could load the right combination based
on HW ability.

Jason

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-20  3:49                 ` Jason Gunthorpe
@ 2022-06-21 16:41                   ` Alex Williamson
  2022-06-24 14:12                     ` Jason Gunthorpe
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2022-06-21 16:41 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Yishai Hadas, Tian, Kevin, maorg, cohuck, liulongfang, kvm,
	Shameerali Kolothum Thodi

On Mon, 20 Jun 2022 00:49:09 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Sun, Jun 19, 2022 at 12:25:50PM +0300, Yishai Hadas wrote:
> 
> > Means, staying with a single device_ops but just inline a check whether
> > migration is really supported inside the migration get/set state callbacks
> > and let the core call it unconditionally.  
> 
> I find it much cleaner to have op == NULL means unsupported.
> 
> As soon as you start linking supported/unsupported to other flags it
> can get very complicated fairly fast. I have this experiance from RDMA
> where we've spent a long time now ripping out hundreds of flag tests
> and replacing them with NULL op checks. Many bugs were fixed doing
> this as drivers never fully understood what the flags mean and ended
> up with flags set that their driver doesn't properly implement.
> 
> The mistake we made in RDMA was not splitting the ops, instead the ops
> were left mutable so the driver could load the right combination based
> on HW ability.

I don't really have an issue with splitting the ops, but what
techniques have you learned from RDMA to make drivers setting ops less
ad-hoc than proposed here?  Are drivers expected to set ops before a
formally defined registration point?  Can ops be dynamically modified?
Is there an API for setting ops or is it open coded per driver?

We probably don't need this series to propose a solution to the
hisi-acc name field usage vs vfio-pci-core SR-IOV driver_override
usage, but let's put it on a to-do list somewhere.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-21 16:41                   ` Alex Williamson
@ 2022-06-24 14:12                     ` Jason Gunthorpe
  2022-06-24 14:41                       ` Alex Williamson
  0 siblings, 1 reply; 18+ messages in thread
From: Jason Gunthorpe @ 2022-06-24 14:12 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Yishai Hadas, Tian, Kevin, maorg, cohuck, liulongfang, kvm,
	Shameerali Kolothum Thodi

On Tue, Jun 21, 2022 at 10:41:46AM -0600, Alex Williamson wrote:
> On Mon, 20 Jun 2022 00:49:09 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Sun, Jun 19, 2022 at 12:25:50PM +0300, Yishai Hadas wrote:
> > 
> > > Means, staying with a single device_ops but just inline a check whether
> > > migration is really supported inside the migration get/set state callbacks
> > > and let the core call it unconditionally.  
> > 
> > I find it much cleaner to have op == NULL means unsupported.
> > 
> > As soon as you start linking supported/unsupported to other flags it
> > can get very complicated fairly fast. I have this experiance from RDMA
> > where we've spent a long time now ripping out hundreds of flag tests
> > and replacing them with NULL op checks. Many bugs were fixed doing
> > this as drivers never fully understood what the flags mean and ended
> > up with flags set that their driver doesn't properly implement.
> > 
> > The mistake we made in RDMA was not splitting the ops, instead the ops
> > were left mutable so the driver could load the right combination based
> > on HW ability.
> 
> I don't really have an issue with splitting the ops, but what
> techniques have you learned from RDMA to make drivers setting ops less
> ad-hoc than proposed here?  Are drivers expected to set ops before a
> formally defined registration point?  

Yes, the flow is the same three step process as in VFIO:

 vfio_init_group_dev()
 [driver continues to prepare the vfio_device]
 vfio_register_group_dev()

I included the 'ops' as an argument to vfio_init_group_dev() as a code
reduction not a statement that the ops must be fully set before
vfio_init_group_dev() returns.

The entire point of the init step is to allow the core and driver to
co-operate and fully initialize the object before moving to register.

So I don't view it as ad-hoc that the object needs further setup after
vfio_init_group_dev().

> Is there an API for setting ops or is it open coded per driver?

RDMA uses a function ib_set_device_ops() but that is only because
there is alot of code in that function to do the copying of ops
pointers. Splitting avoids the copying so we don't really need a
function.

Jason

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-24 14:12                     ` Jason Gunthorpe
@ 2022-06-24 14:41                       ` Alex Williamson
  2022-06-24 14:53                         ` Jason Gunthorpe
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2022-06-24 14:41 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Yishai Hadas, Tian, Kevin, maorg, cohuck, liulongfang, kvm,
	Shameerali Kolothum Thodi

On Fri, 24 Jun 2022 11:12:37 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Jun 21, 2022 at 10:41:46AM -0600, Alex Williamson wrote:
> > On Mon, 20 Jun 2022 00:49:09 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Sun, Jun 19, 2022 at 12:25:50PM +0300, Yishai Hadas wrote:
> > >   
> > > > Means, staying with a single device_ops but just inline a check whether
> > > > migration is really supported inside the migration get/set state callbacks
> > > > and let the core call it unconditionally.    
> > > 
> > > I find it much cleaner to have op == NULL means unsupported.
> > > 
> > > As soon as you start linking supported/unsupported to other flags it
> > > can get very complicated fairly fast. I have this experiance from RDMA
> > > where we've spent a long time now ripping out hundreds of flag tests
> > > and replacing them with NULL op checks. Many bugs were fixed doing
> > > this as drivers never fully understood what the flags mean and ended
> > > up with flags set that their driver doesn't properly implement.
> > > 
> > > The mistake we made in RDMA was not splitting the ops, instead the ops
> > > were left mutable so the driver could load the right combination based
> > > on HW ability.  
> > 
> > I don't really have an issue with splitting the ops, but what
> > techniques have you learned from RDMA to make drivers setting ops less
> > ad-hoc than proposed here?  Are drivers expected to set ops before a
> > formally defined registration point?    
> 
> Yes, the flow is the same three step process as in VFIO:
> 
>  vfio_init_group_dev()
>  [driver continues to prepare the vfio_device]
>  vfio_register_group_dev()
> 
> I included the 'ops' as an argument to vfio_init_group_dev() as a code
> reduction not a statement that the ops must be fully set before
> vfio_init_group_dev() returns.
> 
> The entire point of the init step is to allow the core and driver to
> co-operate and fully initialize the object before moving to register.
> 
> So I don't view it as ad-hoc that the object needs further setup after
> vfio_init_group_dev().
> 
> > Is there an API for setting ops or is it open coded per driver?  
> 
> RDMA uses a function ib_set_device_ops() but that is only because
> there is alot of code in that function to do the copying of ops
> pointers. Splitting avoids the copying so we don't really need a
> function.

So maybe we just need a comment in the definition of the
vfio_migration_ops that vfio_device.mig_ops is a static property of the
vfio_device which must be set prior to registering the vfio_device.
Thanks,

Alex


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
  2022-06-24 14:41                       ` Alex Williamson
@ 2022-06-24 14:53                         ` Jason Gunthorpe
  0 siblings, 0 replies; 18+ messages in thread
From: Jason Gunthorpe @ 2022-06-24 14:53 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Yishai Hadas, Tian, Kevin, maorg, cohuck, liulongfang, kvm,
	Shameerali Kolothum Thodi

On Fri, Jun 24, 2022 at 08:41:25AM -0600, Alex Williamson wrote:

> > RDMA uses a function ib_set_device_ops() but that is only because
> > there is alot of code in that function to do the copying of ops
> > pointers. Splitting avoids the copying so we don't really need a
> > function.
> 
> So maybe we just need a comment in the definition of the
> vfio_migration_ops that vfio_device.mig_ops is a static property of the
> vfio_device which must be set prior to registering the vfio_device.

Sure

Jason

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-06-24 14:59 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-06  8:56 [PATCH vfio 0/2] Migration few enhancements Yishai Hadas
2022-06-06  8:56 ` [PATCH vfio 1/2] vfio/mlx5: Protect mlx5vf_disable_fds() upon close device Yishai Hadas
2022-06-10  3:25   ` Tian, Kevin
2022-06-06  8:56 ` [PATCH vfio 2/2] vfio: Split migration ops from main device ops Yishai Hadas
2022-06-10  3:32   ` Tian, Kevin
2022-06-13  7:13     ` Yishai Hadas
2022-06-16 14:18       ` Yishai Hadas
2022-06-16 23:01         ` Alex Williamson
2022-06-17  3:58           ` Tian, Kevin
2022-06-17  4:04           ` Tian, Kevin
2022-06-17  8:50           ` Shameerali Kolothum Thodi
2022-06-17 14:47             ` Alex Williamson
2022-06-19  9:25               ` Yishai Hadas
2022-06-20  3:49                 ` Jason Gunthorpe
2022-06-21 16:41                   ` Alex Williamson
2022-06-24 14:12                     ` Jason Gunthorpe
2022-06-24 14:41                       ` Alex Williamson
2022-06-24 14:53                         ` Jason Gunthorpe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.