kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* cleanup vfio iommu_group creation v5
@ 2021-09-13  7:15 Christoph Hellwig
  2021-09-13  7:15 ` [PATCH 01/14] vfio: Move vfio_iommu_group_get() to vfio_register_group_dev() Christoph Hellwig
                   ` (14 more replies)
  0 siblings, 15 replies; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm

Hi Alex,

this series cleans up how iommu group are created in VFIO as well as
various other lose ends around that.

Terrence: I did not pick up your Tested-by as there were quite a few
changes and a major rebase.

Changes since v4:
 - clean up an intermediate version of vfio_group_find_or_alloc to avoid
   reviewer confusion
 - replace an incorrect NULL check with IS_ERR
 - rebased ontop of the vfio_ap_ops changes in Linux 5.15-rc
 - improve a commit message

Changes since v3:
 - restore the attribution to Jason for patch 1, which git-rebase lost
 - fix a vfio vs iommu group counting issue that I added over Jasons
   original patch
 - add comments describing the VFIO_EMULATED_IOMMU and VFIO_NO_IOMMU
   flags
 - use the emulated iommu naming consistently in comments
 - a spelling fix

Changes since v2:
 - cosmetic changes to the code flow in vfio_group_find_or_alloc
 - replace "mediated" with "emulated iommu"
 - add a comment describing vfio_register_emulated_iommu_dev
 - rebased on top of the "Introduce vfio_pci_core subsystem" series

Changes since v1:
 - only taint if a noiommu group was successfully created

Diffstat:
 drivers/vfio/fsl-mc/vfio_fsl_mc.c            |   17 -
 drivers/vfio/mdev/mdev_driver.c              |   46 ----
 drivers/vfio/mdev/vfio_mdev.c                |    2 
 drivers/vfio/pci/vfio_pci_core.c             |   13 -
 drivers/vfio/platform/vfio_platform_common.c |   13 -
 drivers/vfio/vfio.c                          |  306 ++++++++++++---------------
 drivers/vfio/vfio.h                          |   63 +++++
 drivers/vfio/vfio_iommu_spapr_tce.c          |    6 
 drivers/vfio/vfio_iommu_type1.c              |  255 ++++++----------------
 include/linux/mdev.h                         |   20 -
 include/linux/vfio.h                         |   53 ----
 samples/vfio-mdev/mbochs.c                   |    2 
 samples/vfio-mdev/mdpy.c                     |    2 
 samples/vfio-mdev/mtty.c                     |    2 
 14 files changed, 292 insertions(+), 508 deletions(-)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 01/14] vfio: Move vfio_iommu_group_get() to vfio_register_group_dev()
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
@ 2021-09-13  7:15 ` Christoph Hellwig
  2021-09-14  1:57   ` Tian, Kevin
  2021-09-13  7:15 ` [PATCH 02/14] vfio: factor out a vfio_iommu_driver_allowed helper Christoph Hellwig
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe

From: Jason Gunthorpe <jgg@nvidia.com>

We don't need to hold a reference to the group in the driver as well as
obtain a reference to the same group as the first thing
vfio_register_group_dev() does.

Since the drivers never use the group move this all into the core code.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/vfio/fsl-mc/vfio_fsl_mc.c            | 17 ++-------
 drivers/vfio/pci/vfio_pci_core.c             | 13 ++-----
 drivers/vfio/platform/vfio_platform_common.c | 13 +------
 drivers/vfio/vfio.c                          | 36 ++++++++------------
 include/linux/vfio.h                         |  3 --
 5 files changed, 19 insertions(+), 63 deletions(-)

diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 0ead91bfa83867..9e838fed560339 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -505,22 +505,13 @@ static void vfio_fsl_uninit_device(struct vfio_fsl_mc_device *vdev)
 
 static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
 {
-	struct iommu_group *group;
 	struct vfio_fsl_mc_device *vdev;
 	struct device *dev = &mc_dev->dev;
 	int ret;
 
-	group = vfio_iommu_group_get(dev);
-	if (!group) {
-		dev_err(dev, "VFIO_FSL_MC: No IOMMU group\n");
-		return -EINVAL;
-	}
-
 	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
-	if (!vdev) {
-		ret = -ENOMEM;
-		goto out_group_put;
-	}
+	if (!vdev)
+		return -ENOMEM;
 
 	vfio_init_group_dev(&vdev->vdev, dev, &vfio_fsl_mc_ops);
 	vdev->mc_dev = mc_dev;
@@ -556,8 +547,6 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
 out_uninit:
 	vfio_uninit_group_dev(&vdev->vdev);
 	kfree(vdev);
-out_group_put:
-	vfio_iommu_group_put(group, dev);
 	return ret;
 }
 
@@ -574,8 +563,6 @@ static int vfio_fsl_mc_remove(struct fsl_mc_device *mc_dev)
 
 	vfio_uninit_group_dev(&vdev->vdev);
 	kfree(vdev);
-	vfio_iommu_group_put(mc_dev->dev.iommu_group, dev);
-
 	return 0;
 }
 
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 68198e0f2a6310..43bab0ca3e682d 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1806,7 +1806,6 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_uninit_device);
 int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
 {
 	struct pci_dev *pdev = vdev->pdev;
-	struct iommu_group *group;
 	int ret;
 
 	if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
@@ -1825,10 +1824,6 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
 		return -EBUSY;
 	}
 
-	group = vfio_iommu_group_get(&pdev->dev);
-	if (!group)
-		return -EINVAL;
-
 	if (pci_is_root_bus(pdev->bus)) {
 		ret = vfio_assign_device_set(&vdev->vdev, vdev);
 	} else if (!pci_probe_reset_slot(pdev->slot)) {
@@ -1842,10 +1837,10 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
 	}
 
 	if (ret)
-		goto out_group_put;
+		return ret;
 	ret = vfio_pci_vf_init(vdev);
 	if (ret)
-		goto out_group_put;
+		return ret;
 	ret = vfio_pci_vga_init(vdev);
 	if (ret)
 		goto out_vf;
@@ -1876,8 +1871,6 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
 		vfio_pci_set_power_state(vdev, PCI_D0);
 out_vf:
 	vfio_pci_vf_uninit(vdev);
-out_group_put:
-	vfio_iommu_group_put(group, &pdev->dev);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(vfio_pci_core_register_device);
@@ -1893,8 +1886,6 @@ void vfio_pci_core_unregister_device(struct vfio_pci_core_device *vdev)
 	vfio_pci_vf_uninit(vdev);
 	vfio_pci_vga_uninit(vdev);
 
-	vfio_iommu_group_put(pdev->dev.iommu_group, &pdev->dev);
-
 	if (!disable_idle_d3)
 		vfio_pci_set_power_state(vdev, PCI_D0);
 }
diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index 6af7ce7d619c25..256f55b84e70a0 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -642,7 +642,6 @@ static int vfio_platform_of_probe(struct vfio_platform_device *vdev,
 int vfio_platform_probe_common(struct vfio_platform_device *vdev,
 			       struct device *dev)
 {
-	struct iommu_group *group;
 	int ret;
 
 	vfio_init_group_dev(&vdev->vdev, dev, &vfio_platform_ops);
@@ -663,24 +662,15 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
 		goto out_uninit;
 	}
 
-	group = vfio_iommu_group_get(dev);
-	if (!group) {
-		dev_err(dev, "No IOMMU group for device %s\n", vdev->name);
-		ret = -EINVAL;
-		goto put_reset;
-	}
-
 	ret = vfio_register_group_dev(&vdev->vdev);
 	if (ret)
-		goto put_iommu;
+		goto put_reset;
 
 	mutex_init(&vdev->igate);
 
 	pm_runtime_enable(dev);
 	return 0;
 
-put_iommu:
-	vfio_iommu_group_put(group, dev);
 put_reset:
 	vfio_platform_put_reset(vdev);
 out_uninit:
@@ -696,7 +686,6 @@ void vfio_platform_remove_common(struct vfio_platform_device *vdev)
 	pm_runtime_disable(vdev->device);
 	vfio_platform_put_reset(vdev);
 	vfio_uninit_group_dev(&vdev->vdev);
-	vfio_iommu_group_put(vdev->vdev.dev->iommu_group, vdev->vdev.dev);
 }
 EXPORT_SYMBOL_GPL(vfio_platform_remove_common);
 
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 3c034fe14ccb03..b39da9b90c95bc 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -169,15 +169,7 @@ static void vfio_release_device_set(struct vfio_device *device)
 	xa_unlock(&vfio_device_set_xa);
 }
 
-/*
- * vfio_iommu_group_{get,put} are only intended for VFIO bus driver probe
- * and remove functions, any use cases other than acquiring the first
- * reference for the purpose of calling vfio_register_group_dev() or removing
- * that symmetric reference after vfio_unregister_group_dev() should use the raw
- * iommu_group_{get,put} functions.  In particular, vfio_iommu_group_put()
- * removes the device from the dummy group and cannot be nested.
- */
-struct iommu_group *vfio_iommu_group_get(struct device *dev)
+static struct iommu_group *vfio_iommu_group_get(struct device *dev)
 {
 	struct iommu_group *group;
 	int __maybe_unused ret;
@@ -220,18 +212,6 @@ struct iommu_group *vfio_iommu_group_get(struct device *dev)
 
 	return group;
 }
-EXPORT_SYMBOL_GPL(vfio_iommu_group_get);
-
-void vfio_iommu_group_put(struct iommu_group *group, struct device *dev)
-{
-#ifdef CONFIG_VFIO_NOIOMMU
-	if (iommu_group_get_iommudata(group) == &noiommu)
-		iommu_group_remove_device(dev);
-#endif
-
-	iommu_group_put(group);
-}
-EXPORT_SYMBOL_GPL(vfio_iommu_group_put);
 
 #ifdef CONFIG_VFIO_NOIOMMU
 static void *vfio_noiommu_open(unsigned long arg)
@@ -841,7 +821,7 @@ int vfio_register_group_dev(struct vfio_device *device)
 	if (!device->dev_set)
 		vfio_assign_device_set(device, device);
 
-	iommu_group = iommu_group_get(device->dev);
+	iommu_group = vfio_iommu_group_get(device->dev);
 	if (!iommu_group)
 		return -EINVAL;
 
@@ -849,6 +829,10 @@ int vfio_register_group_dev(struct vfio_device *device)
 	if (!group) {
 		group = vfio_create_group(iommu_group);
 		if (IS_ERR(group)) {
+#ifdef CONFIG_VFIO_NOIOMMU
+			if (iommu_group_get_iommudata(iommu_group) == &noiommu)
+				iommu_group_remove_device(device->dev);
+#endif
 			iommu_group_put(iommu_group);
 			return PTR_ERR(group);
 		}
@@ -865,6 +849,10 @@ int vfio_register_group_dev(struct vfio_device *device)
 		dev_WARN(device->dev, "Device already exists on group %d\n",
 			 iommu_group_id(iommu_group));
 		vfio_device_put(existing_device);
+#ifdef CONFIG_VFIO_NOIOMMU
+		if (iommu_group_get_iommudata(iommu_group) == &noiommu)
+			iommu_group_remove_device(device->dev);
+#endif
 		vfio_group_put(group);
 		return -EBUSY;
 	}
@@ -1010,6 +998,10 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 	if (list_empty(&group->device_list))
 		wait_event(group->container_q, !group->container);
 
+#ifdef CONFIG_VFIO_NOIOMMU
+	if (iommu_group_get_iommudata(group) == &noiommu)
+		iommu_group_remove_device(dev);
+#endif
 	/* Matches the get in vfio_register_group_dev() */
 	vfio_group_put(group);
 }
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index b53a9557884ada..f7083c2fd0d099 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -71,9 +71,6 @@ struct vfio_device_ops {
 	int	(*match)(struct vfio_device *vdev, char *buf);
 };
 
-extern struct iommu_group *vfio_iommu_group_get(struct device *dev);
-extern void vfio_iommu_group_put(struct iommu_group *group, struct device *dev);
-
 void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
 			 const struct vfio_device_ops *ops);
 void vfio_uninit_group_dev(struct vfio_device *device);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 02/14] vfio: factor out a vfio_iommu_driver_allowed helper
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
  2021-09-13  7:15 ` [PATCH 01/14] vfio: Move vfio_iommu_group_get() to vfio_register_group_dev() Christoph Hellwig
@ 2021-09-13  7:15 ` Christoph Hellwig
  2021-09-23 22:59   ` Alex Williamson
  2021-09-13  7:15 ` [PATCH 03/14] vfio: remove the iommudata check in vfio_noiommu_attach_group Christoph Hellwig
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian

Factor out a little helper to make the checks for the noiommu driver less
ugly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio.c | 37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index b39da9b90c95bc..faf8e0d637bb94 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -257,8 +257,23 @@ static const struct vfio_iommu_driver_ops vfio_noiommu_ops = {
 	.attach_group = vfio_noiommu_attach_group,
 	.detach_group = vfio_noiommu_detach_group,
 };
-#endif
 
+/*
+ * Only noiommu containers can use vfio-noiommu and noiommu containers can only
+ * use vfio-noiommu.
+ */
+static inline bool vfio_iommu_driver_allowed(struct vfio_container *container,
+		const struct vfio_iommu_driver *driver)
+{
+	return container->noiommu == (driver->ops == &vfio_noiommu_ops);
+}
+#else
+static inline bool vfio_iommu_driver_allowed(struct vfio_container *container,
+		const struct vfio_iommu_driver *driver)
+{
+	return true;
+}
+#endif /* CONFIG_VFIO_NOIOMMU */
 
 /**
  * IOMMU driver registration
@@ -999,8 +1014,8 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 		wait_event(group->container_q, !group->container);
 
 #ifdef CONFIG_VFIO_NOIOMMU
-	if (iommu_group_get_iommudata(group) == &noiommu)
-		iommu_group_remove_device(dev);
+	if (iommu_group_get_iommudata(group->iommu_group) == &noiommu)
+		iommu_group_remove_device(device->dev);
 #endif
 	/* Matches the get in vfio_register_group_dev() */
 	vfio_group_put(group);
@@ -1034,13 +1049,10 @@ static long vfio_ioctl_check_extension(struct vfio_container *container,
 			list_for_each_entry(driver, &vfio.iommu_drivers_list,
 					    vfio_next) {
 
-#ifdef CONFIG_VFIO_NOIOMMU
 				if (!list_empty(&container->group_list) &&
-				    (container->noiommu !=
-				     (driver->ops == &vfio_noiommu_ops)))
+				    !vfio_iommu_driver_allowed(container,
+							       driver))
 					continue;
-#endif
-
 				if (!try_module_get(driver->ops->owner))
 					continue;
 
@@ -1112,15 +1124,8 @@ static long vfio_ioctl_set_iommu(struct vfio_container *container,
 	list_for_each_entry(driver, &vfio.iommu_drivers_list, vfio_next) {
 		void *data;
 
-#ifdef CONFIG_VFIO_NOIOMMU
-		/*
-		 * Only noiommu containers can use vfio-noiommu and noiommu
-		 * containers can only use vfio-noiommu.
-		 */
-		if (container->noiommu != (driver->ops == &vfio_noiommu_ops))
+		if (!vfio_iommu_driver_allowed(container, driver))
 			continue;
-#endif
-
 		if (!try_module_get(driver->ops->owner))
 			continue;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 03/14] vfio: remove the iommudata check in vfio_noiommu_attach_group
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
  2021-09-13  7:15 ` [PATCH 01/14] vfio: Move vfio_iommu_group_get() to vfio_register_group_dev() Christoph Hellwig
  2021-09-13  7:15 ` [PATCH 02/14] vfio: factor out a vfio_iommu_driver_allowed helper Christoph Hellwig
@ 2021-09-13  7:15 ` Christoph Hellwig
  2021-09-13  7:15 ` [PATCH 04/14] vfio: factor out a vfio_group_find_or_alloc helper Christoph Hellwig
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian

vfio_noiommu_attach_group has two callers:

 1) __vfio_container_attach_groups is called by vfio_ioctl_set_iommu,
    which just called vfio_iommu_driver_allowed
 2) vfio_group_set_container requires already checks ->noiommu on the
    vfio_group, which is propagated from the iommudata in
    vfio_create_group

so this check is entirely superflous and can be removed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index faf8e0d637bb94..8bd4b0b96b94a3 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -240,7 +240,7 @@ static long vfio_noiommu_ioctl(void *iommu_data,
 static int vfio_noiommu_attach_group(void *iommu_data,
 				     struct iommu_group *iommu_group)
 {
-	return iommu_group_get_iommudata(iommu_group) == &noiommu ? 0 : -EINVAL;
+	return 0;
 }
 
 static void vfio_noiommu_detach_group(void *iommu_data,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 04/14] vfio: factor out a vfio_group_find_or_alloc helper
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2021-09-13  7:15 ` [PATCH 03/14] vfio: remove the iommudata check in vfio_noiommu_attach_group Christoph Hellwig
@ 2021-09-13  7:15 ` Christoph Hellwig
  2021-09-14  2:00   ` Tian, Kevin
  2021-09-13  7:15 ` [PATCH 05/14] vfio: refactor noiommu group creation Christoph Hellwig
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe

Factor out a helper to find or allocate the vfio_group to reduce the
spagetthi code in vfio_register_group_dev a little.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/vfio.c | 60 ++++++++++++++++++++++++++-------------------
 1 file changed, 35 insertions(+), 25 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 8bd4b0b96b94a3..2b2679c7126fdf 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -823,10 +823,39 @@ void vfio_uninit_group_dev(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
 
+struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
+{
+	struct iommu_group *iommu_group;
+	struct vfio_group *group;
+
+	iommu_group = vfio_iommu_group_get(dev);
+	if (!iommu_group)
+		return ERR_PTR(-EINVAL);
+
+	/* a found vfio_group already holds a reference to the iommu_group */
+	group = vfio_group_get_from_iommu(iommu_group);
+	if (group)
+		goto out_put;
+
+	/* a newly created vfio_group keeps the reference. */
+	group = vfio_create_group(iommu_group);
+	if (IS_ERR(group))
+		goto out_remove;
+	return group;
+
+out_remove:
+#ifdef CONFIG_VFIO_NOIOMMU
+	if (iommu_group_get_iommudata(iommu_group) == &noiommu)
+		iommu_group_remove_device(dev);
+#endif
+out_put:
+	iommu_group_put(iommu_group);
+	return group;
+}
+
 int vfio_register_group_dev(struct vfio_device *device)
 {
 	struct vfio_device *existing_device;
-	struct iommu_group *iommu_group;
 	struct vfio_group *group;
 
 	/*
@@ -836,36 +865,17 @@ int vfio_register_group_dev(struct vfio_device *device)
 	if (!device->dev_set)
 		vfio_assign_device_set(device, device);
 
-	iommu_group = vfio_iommu_group_get(device->dev);
-	if (!iommu_group)
-		return -EINVAL;
-
-	group = vfio_group_get_from_iommu(iommu_group);
-	if (!group) {
-		group = vfio_create_group(iommu_group);
-		if (IS_ERR(group)) {
-#ifdef CONFIG_VFIO_NOIOMMU
-			if (iommu_group_get_iommudata(iommu_group) == &noiommu)
-				iommu_group_remove_device(device->dev);
-#endif
-			iommu_group_put(iommu_group);
-			return PTR_ERR(group);
-		}
-	} else {
-		/*
-		 * A found vfio_group already holds a reference to the
-		 * iommu_group.  A created vfio_group keeps the reference.
-		 */
-		iommu_group_put(iommu_group);
-	}
+	group = vfio_group_find_or_alloc(device->dev);
+	if (IS_ERR(group))
+		return PTR_ERR(group);
 
 	existing_device = vfio_group_get_device(group, device->dev);
 	if (existing_device) {
 		dev_WARN(device->dev, "Device already exists on group %d\n",
-			 iommu_group_id(iommu_group));
+			 iommu_group_id(group->iommu_group));
 		vfio_device_put(existing_device);
 #ifdef CONFIG_VFIO_NOIOMMU
-		if (iommu_group_get_iommudata(iommu_group) == &noiommu)
+		if (iommu_group_get_iommudata(group->iommu_group) == &noiommu)
 			iommu_group_remove_device(device->dev);
 #endif
 		vfio_group_put(group);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 05/14] vfio: refactor noiommu group creation
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2021-09-13  7:15 ` [PATCH 04/14] vfio: factor out a vfio_group_find_or_alloc helper Christoph Hellwig
@ 2021-09-13  7:15 ` Christoph Hellwig
  2021-09-14  2:07   ` Tian, Kevin
  2021-09-13  7:15 ` [PATCH 06/14] vfio: remove the iommudata hack for noiommu groups Christoph Hellwig
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe

Split the actual noiommu group creation from vfio_iommu_group_get into a
new helper, and open code the rest of vfio_iommu_group_get in its only
caller.  This creates an entirely separate and clear code path for the
noiommu group creation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/vfio.c | 104 ++++++++++++++++++++++----------------------
 1 file changed, 52 insertions(+), 52 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 2b2679c7126fdf..b1ed156adccd04 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -169,50 +169,6 @@ static void vfio_release_device_set(struct vfio_device *device)
 	xa_unlock(&vfio_device_set_xa);
 }
 
-static struct iommu_group *vfio_iommu_group_get(struct device *dev)
-{
-	struct iommu_group *group;
-	int __maybe_unused ret;
-
-	group = iommu_group_get(dev);
-
-#ifdef CONFIG_VFIO_NOIOMMU
-	/*
-	 * With noiommu enabled, an IOMMU group will be created for a device
-	 * that doesn't already have one and doesn't have an iommu_ops on their
-	 * bus.  We set iommudata simply to be able to identify these groups
-	 * as special use and for reclamation later.
-	 */
-	if (group || !noiommu || iommu_present(dev->bus))
-		return group;
-
-	group = iommu_group_alloc();
-	if (IS_ERR(group))
-		return NULL;
-
-	iommu_group_set_name(group, "vfio-noiommu");
-	iommu_group_set_iommudata(group, &noiommu, NULL);
-	ret = iommu_group_add_device(group, dev);
-	if (ret) {
-		iommu_group_put(group);
-		return NULL;
-	}
-
-	/*
-	 * Where to taint?  At this point we've added an IOMMU group for a
-	 * device that is not backed by iommu_ops, therefore any iommu_
-	 * callback using iommu_ops can legitimately Oops.  So, while we may
-	 * be about to give a DMA capable device to a user without IOMMU
-	 * protection, which is clearly taint-worthy, let's go ahead and do
-	 * it here.
-	 */
-	add_taint(TAINT_USER, LOCKDEP_STILL_OK);
-	dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n");
-#endif
-
-	return group;
-}
-
 #ifdef CONFIG_VFIO_NOIOMMU
 static void *vfio_noiommu_open(unsigned long arg)
 {
@@ -823,12 +779,61 @@ void vfio_uninit_group_dev(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
 
-struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
+#ifdef CONFIG_VFIO_NOIOMMU
+static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 {
 	struct iommu_group *iommu_group;
 	struct vfio_group *group;
+	int ret;
+
+	iommu_group = iommu_group_alloc();
+	if (IS_ERR(iommu_group))
+		return ERR_CAST(iommu_group);
 
-	iommu_group = vfio_iommu_group_get(dev);
+	iommu_group_set_name(iommu_group, "vfio-noiommu");
+	iommu_group_set_iommudata(iommu_group, &noiommu, NULL);
+	ret = iommu_group_add_device(iommu_group, dev);
+	if (ret)
+		goto out_put_group;
+
+	group = vfio_create_group(iommu_group);
+	if (IS_ERR(group)) {
+		ret = PTR_ERR(group);
+		goto out_remove_device;
+	}
+
+	return group;
+
+out_remove_device:
+	iommu_group_remove_device(dev);
+out_put_group:
+	iommu_group_put(iommu_group);
+	return ERR_PTR(ret);
+}
+#endif
+
+static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
+{
+	struct iommu_group *iommu_group;
+	struct vfio_group *group;
+
+	iommu_group = iommu_group_get(dev);
+#ifdef CONFIG_VFIO_NOIOMMU
+	if (!iommu_group && noiommu && !iommu_present(dev->bus)) {
+		/*
+		 * With noiommu enabled, create an IOMMU group for devices that
+		 * don't already have one and don't have an iommu_ops on their
+		 * bus.  Taint the kernel because we're about to give a DMA
+		 * capable device to a user without IOMMU protection.
+		 */
+		group = vfio_noiommu_group_alloc(dev);
+		if (!IS_ERR(group)) {
+			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
+			dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n");
+		}
+		return group;
+	}
+#endif
 	if (!iommu_group)
 		return ERR_PTR(-EINVAL);
 
@@ -840,14 +845,9 @@ struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 	/* a newly created vfio_group keeps the reference. */
 	group = vfio_create_group(iommu_group);
 	if (IS_ERR(group))
-		goto out_remove;
+		goto out_put;
 	return group;
 
-out_remove:
-#ifdef CONFIG_VFIO_NOIOMMU
-	if (iommu_group_get_iommudata(iommu_group) == &noiommu)
-		iommu_group_remove_device(dev);
-#endif
 out_put:
 	iommu_group_put(iommu_group);
 	return group;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 06/14] vfio: remove the iommudata hack for noiommu groups
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2021-09-13  7:15 ` [PATCH 05/14] vfio: refactor noiommu group creation Christoph Hellwig
@ 2021-09-13  7:15 ` Christoph Hellwig
  2021-09-15 16:44   ` Jason Gunthorpe
  2021-09-13  7:15 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian

Just pass a noiommu argument to vfio_create_group and set up the
->noiommu flag directly, and remove the now superflous
vfio_iommu_group_put helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio.c | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index b1ed156adccd04..23eaebd2e28cd9 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -335,7 +335,8 @@ static void vfio_group_unlock_and_free(struct vfio_group *group)
 /**
  * Group objects - create, release, get, put, search
  */
-static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group)
+static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
+		bool noiommu)
 {
 	struct vfio_group *group, *tmp;
 	struct device *dev;
@@ -354,9 +355,7 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group)
 	atomic_set(&group->opened, 0);
 	init_waitqueue_head(&group->container_q);
 	group->iommu_group = iommu_group;
-#ifdef CONFIG_VFIO_NOIOMMU
-	group->noiommu = (iommu_group_get_iommudata(iommu_group) == &noiommu);
-#endif
+	group->noiommu = noiommu;
 	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
 
 	group->nb.notifier_call = vfio_iommu_group_notifier;
@@ -791,12 +790,11 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 		return ERR_CAST(iommu_group);
 
 	iommu_group_set_name(iommu_group, "vfio-noiommu");
-	iommu_group_set_iommudata(iommu_group, &noiommu, NULL);
 	ret = iommu_group_add_device(iommu_group, dev);
 	if (ret)
 		goto out_put_group;
 
-	group = vfio_create_group(iommu_group);
+	group = vfio_create_group(iommu_group, true);
 	if (IS_ERR(group)) {
 		ret = PTR_ERR(group);
 		goto out_remove_device;
@@ -843,7 +841,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 		goto out_put;
 
 	/* a newly created vfio_group keeps the reference. */
-	group = vfio_create_group(iommu_group);
+	group = vfio_create_group(iommu_group, false);
 	if (IS_ERR(group))
 		goto out_put;
 	return group;
@@ -874,10 +872,8 @@ int vfio_register_group_dev(struct vfio_device *device)
 		dev_WARN(device->dev, "Device already exists on group %d\n",
 			 iommu_group_id(group->iommu_group));
 		vfio_device_put(existing_device);
-#ifdef CONFIG_VFIO_NOIOMMU
-		if (iommu_group_get_iommudata(group->iommu_group) == &noiommu)
+		if (group->noiommu)
 			iommu_group_remove_device(device->dev);
-#endif
 		vfio_group_put(group);
 		return -EBUSY;
 	}
@@ -1023,10 +1019,9 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 	if (list_empty(&group->device_list))
 		wait_event(group->container_q, !group->container);
 
-#ifdef CONFIG_VFIO_NOIOMMU
-	if (iommu_group_get_iommudata(group->iommu_group) == &noiommu)
+	if (group->noiommu)
 		iommu_group_remove_device(device->dev);
-#endif
+
 	/* Matches the get in vfio_register_group_dev() */
 	vfio_group_put(group);
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2021-09-13  7:15 ` [PATCH 06/14] vfio: remove the iommudata hack for noiommu groups Christoph Hellwig
@ 2021-09-13  7:15 ` Christoph Hellwig
  2021-09-14  2:23   ` Tian, Kevin
  2021-09-13  7:16 ` [PATCH 08/14] vfio: remove unused method from vfio_iommu_driver_ops Christoph Hellwig
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm

Reuse the logic in vfio_noiommu_group_alloc to allocate a fake
single-device iommu group for mediated devices by factoring out a common
function, and replacing the noiommu boolean field in struct vfio_group
with an enum to distinguish the three different kinds of groups.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/s390/crypto/vfio_ap_ops.c |  2 +-
 drivers/vfio/mdev/mdev_driver.c   | 45 ++-------------
 drivers/vfio/mdev/vfio_mdev.c     |  2 +-
 drivers/vfio/vfio.c               | 92 ++++++++++++++++++++++---------
 include/linux/vfio.h              |  1 +
 samples/vfio-mdev/mbochs.c        |  2 +-
 samples/vfio-mdev/mdpy.c          |  2 +-
 samples/vfio-mdev/mtty.c          |  2 +-
 8 files changed, 76 insertions(+), 72 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 118939a7729a1e..24755d1aedd5b0 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -351,7 +351,7 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev)
 	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
 	mutex_unlock(&matrix_dev->lock);
 
-	ret = vfio_register_group_dev(&matrix_mdev->vdev);
+	ret = vfio_register_emulated_iommu_dev(&matrix_mdev->vdev);
 	if (ret)
 		goto err_list;
 	dev_set_drvdata(&mdev->dev, matrix_mdev);
diff --git a/drivers/vfio/mdev/mdev_driver.c b/drivers/vfio/mdev/mdev_driver.c
index e2cb1ff56f6c9b..7927ed4f1711f2 100644
--- a/drivers/vfio/mdev/mdev_driver.c
+++ b/drivers/vfio/mdev/mdev_driver.c
@@ -13,60 +13,23 @@
 
 #include "mdev_private.h"
 
-static int mdev_attach_iommu(struct mdev_device *mdev)
-{
-	int ret;
-	struct iommu_group *group;
-
-	group = iommu_group_alloc();
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
-	ret = iommu_group_add_device(group, &mdev->dev);
-	if (!ret)
-		dev_info(&mdev->dev, "MDEV: group_id = %d\n",
-			 iommu_group_id(group));
-
-	iommu_group_put(group);
-	return ret;
-}
-
-static void mdev_detach_iommu(struct mdev_device *mdev)
-{
-	iommu_group_remove_device(&mdev->dev);
-	dev_info(&mdev->dev, "MDEV: detaching iommu\n");
-}
-
 static int mdev_probe(struct device *dev)
 {
 	struct mdev_driver *drv =
 		container_of(dev->driver, struct mdev_driver, driver);
-	struct mdev_device *mdev = to_mdev_device(dev);
-	int ret;
 
-	ret = mdev_attach_iommu(mdev);
-	if (ret)
-		return ret;
-
-	if (drv->probe) {
-		ret = drv->probe(mdev);
-		if (ret)
-			mdev_detach_iommu(mdev);
-	}
-
-	return ret;
+	if (!drv->probe)
+		return 0;
+	return drv->probe(to_mdev_device(dev));
 }
 
 static void mdev_remove(struct device *dev)
 {
 	struct mdev_driver *drv =
 		container_of(dev->driver, struct mdev_driver, driver);
-	struct mdev_device *mdev = to_mdev_device(dev);
 
 	if (drv->remove)
-		drv->remove(mdev);
-
-	mdev_detach_iommu(mdev);
+		drv->remove(to_mdev_device(dev));
 }
 
 static int mdev_match(struct device *dev, struct device_driver *drv)
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 7a9883048216e7..a90e24b0c851d3 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -119,7 +119,7 @@ static int vfio_mdev_probe(struct mdev_device *mdev)
 		return -ENOMEM;
 
 	vfio_init_group_dev(vdev, &mdev->dev, &vfio_mdev_dev_ops);
-	ret = vfio_register_group_dev(vdev);
+	ret = vfio_register_emulated_iommu_dev(vdev);
 	if (ret)
 		goto out_uninit;
 
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 23eaebd2e28cd9..2508c8c3984091 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -67,6 +67,30 @@ struct vfio_unbound_dev {
 	struct list_head		unbound_next;
 };
 
+enum vfio_group_type {
+	/*
+	 * Physical device with IOMMU backing.
+	 */
+	VFIO_IOMMU,
+
+	/*
+	 * Virtual device without IOMMU backing. The VFIO core fakes up an
+	 * iommu_group as the iommu_group sysfs interface is part of the
+	 * userspace ABI.  The user of these devices must not be able to
+	 * directly trigger unmediated DMA.
+	 */
+	VFIO_EMULATED_IOMMU,
+
+	/*
+	 * Physical device without IOMMU backing. The VFIO core fakes up an
+	 * iommu_group as the iommu_group sysfs interface is part of the
+	 * userspace ABI.  Users can trigger unmediated DMA by the device,
+	 * usage is highly dangerous, requires an explicit opt-in and will
+	 * taint the kernel.
+	 */
+	VFIO_NO_IOMMU,
+};
+
 struct vfio_group {
 	struct kref			kref;
 	int				minor;
@@ -83,7 +107,7 @@ struct vfio_group {
 	struct mutex			unbound_lock;
 	atomic_t			opened;
 	wait_queue_head_t		container_q;
-	bool				noiommu;
+	enum vfio_group_type		type;
 	unsigned int			dev_counter;
 	struct kvm			*kvm;
 	struct blocking_notifier_head	notifier;
@@ -336,7 +360,7 @@ static void vfio_group_unlock_and_free(struct vfio_group *group)
  * Group objects - create, release, get, put, search
  */
 static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
-		bool noiommu)
+		enum vfio_group_type type)
 {
 	struct vfio_group *group, *tmp;
 	struct device *dev;
@@ -355,7 +379,7 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 	atomic_set(&group->opened, 0);
 	init_waitqueue_head(&group->container_q);
 	group->iommu_group = iommu_group;
-	group->noiommu = noiommu;
+	group->type = type;
 	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
 
 	group->nb.notifier_call = vfio_iommu_group_notifier;
@@ -391,8 +415,8 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 	}
 
 	dev = device_create(vfio.class, NULL,
-			    MKDEV(MAJOR(vfio.group_devt), minor),
-			    group, "%s%d", group->noiommu ? "noiommu-" : "",
+			    MKDEV(MAJOR(vfio.group_devt), minor), group, "%s%d",
+			    group->type == VFIO_NO_IOMMU ? "noiommu-" : "",
 			    iommu_group_id(iommu_group));
 	if (IS_ERR(dev)) {
 		vfio_free_group_minor(minor);
@@ -778,8 +802,8 @@ void vfio_uninit_group_dev(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
 
-#ifdef CONFIG_VFIO_NOIOMMU
-static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
+static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
+		enum vfio_group_type type)
 {
 	struct iommu_group *iommu_group;
 	struct vfio_group *group;
@@ -794,7 +818,7 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 	if (ret)
 		goto out_put_group;
 
-	group = vfio_create_group(iommu_group, true);
+	group = vfio_create_group(iommu_group, type);
 	if (IS_ERR(group)) {
 		ret = PTR_ERR(group);
 		goto out_remove_device;
@@ -808,7 +832,6 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 	iommu_group_put(iommu_group);
 	return ERR_PTR(ret);
 }
-#endif
 
 static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 {
@@ -824,7 +847,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 		 * bus.  Taint the kernel because we're about to give a DMA
 		 * capable device to a user without IOMMU protection.
 		 */
-		group = vfio_noiommu_group_alloc(dev);
+		group = vfio_noiommu_group_alloc(dev, VFIO_NO_IOMMU);
 		if (!IS_ERR(group)) {
 			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
 			dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n");
@@ -841,7 +864,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 		goto out_put;
 
 	/* a newly created vfio_group keeps the reference. */
-	group = vfio_create_group(iommu_group, false);
+	group = vfio_create_group(iommu_group, VFIO_IOMMU);
 	if (IS_ERR(group))
 		goto out_put;
 	return group;
@@ -851,10 +874,13 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 	return group;
 }
 
-int vfio_register_group_dev(struct vfio_device *device)
+static int __vfio_register_dev(struct vfio_device *device,
+		struct vfio_group *group)
 {
 	struct vfio_device *existing_device;
-	struct vfio_group *group;
+
+	if (IS_ERR(group))
+		return PTR_ERR(group);
 
 	/*
 	 * If the driver doesn't specify a set then the device is added to a
@@ -863,16 +889,13 @@ int vfio_register_group_dev(struct vfio_device *device)
 	if (!device->dev_set)
 		vfio_assign_device_set(device, device);
 
-	group = vfio_group_find_or_alloc(device->dev);
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
 	existing_device = vfio_group_get_device(group, device->dev);
 	if (existing_device) {
 		dev_WARN(device->dev, "Device already exists on group %d\n",
 			 iommu_group_id(group->iommu_group));
 		vfio_device_put(existing_device);
-		if (group->noiommu)
+		if (group->type == VFIO_NO_IOMMU ||
+		    group->type == VFIO_EMULATED_IOMMU)
 			iommu_group_remove_device(device->dev);
 		vfio_group_put(group);
 		return -EBUSY;
@@ -891,8 +914,25 @@ int vfio_register_group_dev(struct vfio_device *device)
 
 	return 0;
 }
+
+int vfio_register_group_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device,
+		vfio_group_find_or_alloc(device->dev));
+}
 EXPORT_SYMBOL_GPL(vfio_register_group_dev);
 
+/*
+ * Register a virtual device without IOMMU backing.  The user of this
+ * device must not be able to directly trigger unmediated DMA.
+ */
+int vfio_register_emulated_iommu_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device,
+		vfio_noiommu_group_alloc(device->dev, VFIO_EMULATED_IOMMU));
+}
+EXPORT_SYMBOL_GPL(vfio_register_emulated_iommu_dev);
+
 /**
  * Get a reference to the vfio_device for a device.  Even if the
  * caller thinks they own the device, they could be racing with a
@@ -1019,7 +1059,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 	if (list_empty(&group->device_list))
 		wait_event(group->container_q, !group->container);
 
-	if (group->noiommu)
+	if (group->type == VFIO_NO_IOMMU || group->type == VFIO_EMULATED_IOMMU)
 		iommu_group_remove_device(device->dev);
 
 	/* Matches the get in vfio_register_group_dev() */
@@ -1368,7 +1408,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	if (atomic_read(&group->container_users))
 		return -EINVAL;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO))
+	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
 
 	f = fdget(container_fd);
@@ -1388,7 +1428,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 
 	/* Real groups and fake groups cannot mix */
 	if (!list_empty(&container->group_list) &&
-	    container->noiommu != group->noiommu) {
+	    container->noiommu != (group->type == VFIO_NO_IOMMU)) {
 		ret = -EPERM;
 		goto unlock_out;
 	}
@@ -1402,7 +1442,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	}
 
 	group->container = container;
-	container->noiommu = group->noiommu;
+	container->noiommu = (group->type == VFIO_NO_IOMMU);
 	list_add(&group->container_next, &container->group_list);
 
 	/* Get a reference on the container and mark a user within the group */
@@ -1426,7 +1466,7 @@ static int vfio_group_add_container_user(struct vfio_group *group)
 	if (!atomic_inc_not_zero(&group->container_users))
 		return -EINVAL;
 
-	if (group->noiommu) {
+	if (group->type == VFIO_NO_IOMMU) {
 		atomic_dec(&group->container_users);
 		return -EPERM;
 	}
@@ -1451,7 +1491,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	    !group->container->iommu_driver || !vfio_group_viable(group))
 		return -EINVAL;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO))
+	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
 
 	device = vfio_device_get_from_name(group, buf);
@@ -1498,7 +1538,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 
 	fd_install(fdno, filep);
 
-	if (group->noiommu)
+	if (group->type == VFIO_NO_IOMMU)
 		dev_warn(device->dev, "vfio-noiommu device opened by user "
 			 "(%s:%d)\n", current->comm, task_pid_nr(current));
 	return fdno;
@@ -1594,7 +1634,7 @@ static int vfio_group_fops_open(struct inode *inode, struct file *filep)
 	if (!group)
 		return -ENODEV;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO)) {
+	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO)) {
 		vfio_group_put(group);
 		return -EPERM;
 	}
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index f7083c2fd0d099..bbe29300862649 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -75,6 +75,7 @@ void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
 			 const struct vfio_device_ops *ops);
 void vfio_uninit_group_dev(struct vfio_device *device);
 int vfio_register_group_dev(struct vfio_device *device);
+int vfio_register_emulated_iommu_dev(struct vfio_device *device);
 void vfio_unregister_group_dev(struct vfio_device *device);
 extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
 extern void vfio_device_put(struct vfio_device *device);
diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index c313ab4d1f4e4e..cd41bec5fdeb39 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -553,7 +553,7 @@ static int mbochs_probe(struct mdev_device *mdev)
 	mbochs_create_config_space(mdev_state);
 	mbochs_reset(mdev_state);
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
 	dev_set_drvdata(&mdev->dev, mdev_state);
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index 8d1a80a0722aa9..fe5d43e797b6d3 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -258,7 +258,7 @@ static int mdpy_probe(struct mdev_device *mdev)
 
 	mdpy_count++;
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
 	dev_set_drvdata(&mdev->dev, mdev_state);
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 5983cdb16e3d1d..a0e1a469bd47af 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -741,7 +741,7 @@ static int mtty_probe(struct mdev_device *mdev)
 
 	mtty_create_config_space(mdev_state);
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_vconfig;
 	dev_set_drvdata(&mdev->dev, mdev_state);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 08/14] vfio: remove unused method from vfio_iommu_driver_ops
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2021-09-13  7:15 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
@ 2021-09-13  7:16 ` Christoph Hellwig
  2021-09-13  7:16 ` [PATCH 09/14] vfio: move the vfio_iommu_driver_ops interface out of <linux/vfio.h> Christoph Hellwig
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:16 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian

The read, write and mmap methods are never implemented, so remove them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio.c  | 50 --------------------------------------------
 include/linux/vfio.h |  5 -----
 2 files changed, 55 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 2508c8c3984091..2c1c7316aa192c 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1276,62 +1276,12 @@ static int vfio_fops_release(struct inode *inode, struct file *filep)
 	return 0;
 }
 
-/*
- * Once an iommu driver is set, we optionally pass read/write/mmap
- * on to the driver, allowing management interfaces beyond ioctl.
- */
-static ssize_t vfio_fops_read(struct file *filep, char __user *buf,
-			      size_t count, loff_t *ppos)
-{
-	struct vfio_container *container = filep->private_data;
-	struct vfio_iommu_driver *driver;
-	ssize_t ret = -EINVAL;
-
-	driver = container->iommu_driver;
-	if (likely(driver && driver->ops->read))
-		ret = driver->ops->read(container->iommu_data,
-					buf, count, ppos);
-
-	return ret;
-}
-
-static ssize_t vfio_fops_write(struct file *filep, const char __user *buf,
-			       size_t count, loff_t *ppos)
-{
-	struct vfio_container *container = filep->private_data;
-	struct vfio_iommu_driver *driver;
-	ssize_t ret = -EINVAL;
-
-	driver = container->iommu_driver;
-	if (likely(driver && driver->ops->write))
-		ret = driver->ops->write(container->iommu_data,
-					 buf, count, ppos);
-
-	return ret;
-}
-
-static int vfio_fops_mmap(struct file *filep, struct vm_area_struct *vma)
-{
-	struct vfio_container *container = filep->private_data;
-	struct vfio_iommu_driver *driver;
-	int ret = -EINVAL;
-
-	driver = container->iommu_driver;
-	if (likely(driver && driver->ops->mmap))
-		ret = driver->ops->mmap(container->iommu_data, vma);
-
-	return ret;
-}
-
 static const struct file_operations vfio_fops = {
 	.owner		= THIS_MODULE,
 	.open		= vfio_fops_open,
 	.release	= vfio_fops_release,
-	.read		= vfio_fops_read,
-	.write		= vfio_fops_write,
 	.unlocked_ioctl	= vfio_fops_unl_ioctl,
 	.compat_ioctl	= compat_ptr_ioctl,
-	.mmap		= vfio_fops_mmap,
 };
 
 /**
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index bbe29300862649..7a57a0077f9637 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -95,13 +95,8 @@ struct vfio_iommu_driver_ops {
 	struct module	*owner;
 	void		*(*open)(unsigned long arg);
 	void		(*release)(void *iommu_data);
-	ssize_t		(*read)(void *iommu_data, char __user *buf,
-				size_t count, loff_t *ppos);
-	ssize_t		(*write)(void *iommu_data, const char __user *buf,
-				 size_t count, loff_t *size);
 	long		(*ioctl)(void *iommu_data, unsigned int cmd,
 				 unsigned long arg);
-	int		(*mmap)(void *iommu_data, struct vm_area_struct *vma);
 	int		(*attach_group)(void *iommu_data,
 					struct iommu_group *group);
 	void		(*detach_group)(void *iommu_data,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 09/14] vfio: move the vfio_iommu_driver_ops interface out of <linux/vfio.h>
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2021-09-13  7:16 ` [PATCH 08/14] vfio: remove unused method from vfio_iommu_driver_ops Christoph Hellwig
@ 2021-09-13  7:16 ` Christoph Hellwig
  2021-09-13  7:16 ` [PATCH 10/14] vfio: remove the unused mdev iommu hook Christoph Hellwig
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:16 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian

Create a new private drivers/vfio/vfio.h header for the interface between
the VFIO core and the iommu drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio.c                 |  1 +
 drivers/vfio/vfio.h                 | 47 +++++++++++++++++++++++++++++
 drivers/vfio/vfio_iommu_spapr_tce.c |  1 +
 drivers/vfio/vfio_iommu_type1.c     |  1 +
 include/linux/vfio.h                | 44 ---------------------------
 5 files changed, 50 insertions(+), 44 deletions(-)
 create mode 100644 drivers/vfio/vfio.h

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 2c1c7316aa192c..6589e296ef348c 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -32,6 +32,7 @@
 #include <linux/vfio.h>
 #include <linux/wait.h>
 #include <linux/sched/signal.h>
+#include "vfio.h"
 
 #define DRIVER_VERSION	"0.3"
 #define DRIVER_AUTHOR	"Alex Williamson <alex.williamson@redhat.com>"
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
new file mode 100644
index 00000000000000..a78de649eb2f16
--- /dev/null
+++ b/drivers/vfio/vfio.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
+ *     Author: Alex Williamson <alex.williamson@redhat.com>
+ */
+
+/* events for the backend driver notify callback */
+enum vfio_iommu_notify_type {
+	VFIO_IOMMU_CONTAINER_CLOSE = 0,
+};
+
+/**
+ * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
+ */
+struct vfio_iommu_driver_ops {
+	char		*name;
+	struct module	*owner;
+	void		*(*open)(unsigned long arg);
+	void		(*release)(void *iommu_data);
+	long		(*ioctl)(void *iommu_data, unsigned int cmd,
+				 unsigned long arg);
+	int		(*attach_group)(void *iommu_data,
+					struct iommu_group *group);
+	void		(*detach_group)(void *iommu_data,
+					struct iommu_group *group);
+	int		(*pin_pages)(void *iommu_data,
+				     struct iommu_group *group,
+				     unsigned long *user_pfn,
+				     int npage, int prot,
+				     unsigned long *phys_pfn);
+	int		(*unpin_pages)(void *iommu_data,
+				       unsigned long *user_pfn, int npage);
+	int		(*register_notifier)(void *iommu_data,
+					     unsigned long *events,
+					     struct notifier_block *nb);
+	int		(*unregister_notifier)(void *iommu_data,
+					       struct notifier_block *nb);
+	int		(*dma_rw)(void *iommu_data, dma_addr_t user_iova,
+				  void *data, size_t count, bool write);
+	struct iommu_domain *(*group_iommu_domain)(void *iommu_data,
+						   struct iommu_group *group);
+	void		(*notify)(void *iommu_data,
+				  enum vfio_iommu_notify_type event);
+};
+
+int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
+void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops);
diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index fe888b5dcc0062..3efd09faeca4a8 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -20,6 +20,7 @@
 #include <linux/sched/mm.h>
 #include <linux/sched/signal.h>
 #include <linux/mm.h>
+#include "vfio.h"
 
 #include <asm/iommu.h>
 #include <asm/tce.h>
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 0e9217687f5c3e..2e51e4390c1531 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -40,6 +40,7 @@
 #include <linux/notifier.h>
 #include <linux/dma-iommu.h>
 #include <linux/irqdomain.h>
+#include "vfio.h"
 
 #define DRIVER_VERSION  "0.2"
 #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 7a57a0077f9637..76191d7abed185 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -82,50 +82,6 @@ extern void vfio_device_put(struct vfio_device *device);
 
 int vfio_assign_device_set(struct vfio_device *device, void *set_id);
 
-/* events for the backend driver notify callback */
-enum vfio_iommu_notify_type {
-	VFIO_IOMMU_CONTAINER_CLOSE = 0,
-};
-
-/**
- * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
- */
-struct vfio_iommu_driver_ops {
-	char		*name;
-	struct module	*owner;
-	void		*(*open)(unsigned long arg);
-	void		(*release)(void *iommu_data);
-	long		(*ioctl)(void *iommu_data, unsigned int cmd,
-				 unsigned long arg);
-	int		(*attach_group)(void *iommu_data,
-					struct iommu_group *group);
-	void		(*detach_group)(void *iommu_data,
-					struct iommu_group *group);
-	int		(*pin_pages)(void *iommu_data,
-				     struct iommu_group *group,
-				     unsigned long *user_pfn,
-				     int npage, int prot,
-				     unsigned long *phys_pfn);
-	int		(*unpin_pages)(void *iommu_data,
-				       unsigned long *user_pfn, int npage);
-	int		(*register_notifier)(void *iommu_data,
-					     unsigned long *events,
-					     struct notifier_block *nb);
-	int		(*unregister_notifier)(void *iommu_data,
-					       struct notifier_block *nb);
-	int		(*dma_rw)(void *iommu_data, dma_addr_t user_iova,
-				  void *data, size_t count, bool write);
-	struct iommu_domain *(*group_iommu_domain)(void *iommu_data,
-						   struct iommu_group *group);
-	void		(*notify)(void *iommu_data,
-				  enum vfio_iommu_notify_type event);
-};
-
-extern int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
-
-extern void vfio_unregister_iommu_driver(
-				const struct vfio_iommu_driver_ops *ops);
-
 /*
  * External user API
  */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 10/14] vfio: remove the unused mdev iommu hook
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2021-09-13  7:16 ` [PATCH 09/14] vfio: move the vfio_iommu_driver_ops interface out of <linux/vfio.h> Christoph Hellwig
@ 2021-09-13  7:16 ` Christoph Hellwig
  2021-09-13  7:16 ` [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1 Christoph Hellwig
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:16 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian

The iommu_device field in struct mdev_device has never been used
since it was added more than 2 years ago.

This is a manual revert of commit 7bd50f0cd2
("vfio/type1: Add domain at(de)taching group helpers").

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio_iommu_type1.c | 133 +++++++-------------------------
 include/linux/mdev.h            |  20 -----
 2 files changed, 26 insertions(+), 127 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 2e51e4390c1531..42a6be1fb7265e 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -114,7 +114,6 @@ struct vfio_batch {
 struct vfio_iommu_group {
 	struct iommu_group	*iommu_group;
 	struct list_head	next;
-	bool			mdev_group;	/* An mdev group */
 	bool			pinned_page_dirty_scope;
 };
 
@@ -1935,61 +1934,6 @@ static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
 	return ret;
 }
 
-static int vfio_mdev_attach_domain(struct device *dev, void *data)
-{
-	struct mdev_device *mdev = to_mdev_device(dev);
-	struct iommu_domain *domain = data;
-	struct device *iommu_device;
-
-	iommu_device = mdev_get_iommu_device(mdev);
-	if (iommu_device) {
-		if (iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
-			return iommu_aux_attach_device(domain, iommu_device);
-		else
-			return iommu_attach_device(domain, iommu_device);
-	}
-
-	return -EINVAL;
-}
-
-static int vfio_mdev_detach_domain(struct device *dev, void *data)
-{
-	struct mdev_device *mdev = to_mdev_device(dev);
-	struct iommu_domain *domain = data;
-	struct device *iommu_device;
-
-	iommu_device = mdev_get_iommu_device(mdev);
-	if (iommu_device) {
-		if (iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
-			iommu_aux_detach_device(domain, iommu_device);
-		else
-			iommu_detach_device(domain, iommu_device);
-	}
-
-	return 0;
-}
-
-static int vfio_iommu_attach_group(struct vfio_domain *domain,
-				   struct vfio_iommu_group *group)
-{
-	if (group->mdev_group)
-		return iommu_group_for_each_dev(group->iommu_group,
-						domain->domain,
-						vfio_mdev_attach_domain);
-	else
-		return iommu_attach_group(domain->domain, group->iommu_group);
-}
-
-static void vfio_iommu_detach_group(struct vfio_domain *domain,
-				    struct vfio_iommu_group *group)
-{
-	if (group->mdev_group)
-		iommu_group_for_each_dev(group->iommu_group, domain->domain,
-					 vfio_mdev_detach_domain);
-	else
-		iommu_detach_group(domain->domain, group->iommu_group);
-}
-
 static bool vfio_bus_is_mdev(struct bus_type *bus)
 {
 	struct bus_type *mdev_bus;
@@ -2004,20 +1948,6 @@ static bool vfio_bus_is_mdev(struct bus_type *bus)
 	return ret;
 }
 
-static int vfio_mdev_iommu_device(struct device *dev, void *data)
-{
-	struct mdev_device *mdev = to_mdev_device(dev);
-	struct device **old = data, *new;
-
-	new = mdev_get_iommu_device(mdev);
-	if (!new || (*old && *old != new))
-		return -EINVAL;
-
-	*old = new;
-
-	return 0;
-}
-
 /*
  * This is a helper function to insert an address range to iova list.
  * The list is initially created with a single entry corresponding to
@@ -2278,38 +2208,25 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_free;
 
 	if (vfio_bus_is_mdev(bus)) {
-		struct device *iommu_device = NULL;
-
-		group->mdev_group = true;
-
-		/* Determine the isolation type */
-		ret = iommu_group_for_each_dev(iommu_group, &iommu_device,
-					       vfio_mdev_iommu_device);
-		if (ret || !iommu_device) {
-			if (!iommu->external_domain) {
-				INIT_LIST_HEAD(&domain->group_list);
-				iommu->external_domain = domain;
-				vfio_update_pgsize_bitmap(iommu);
-			} else {
-				kfree(domain);
-			}
-
-			list_add(&group->next,
-				 &iommu->external_domain->group_list);
-			/*
-			 * Non-iommu backed group cannot dirty memory directly,
-			 * it can only use interfaces that provide dirty
-			 * tracking.
-			 * The iommu scope can only be promoted with the
-			 * addition of a dirty tracking group.
-			 */
-			group->pinned_page_dirty_scope = true;
-			mutex_unlock(&iommu->lock);
-
-			return 0;
+		if (!iommu->external_domain) {
+			INIT_LIST_HEAD(&domain->group_list);
+			iommu->external_domain = domain;
+			vfio_update_pgsize_bitmap(iommu);
+		} else {
+			kfree(domain);
 		}
 
-		bus = iommu_device->bus;
+		list_add(&group->next, &iommu->external_domain->group_list);
+		/*
+		 * Non-iommu backed group cannot dirty memory directly, it can
+		 * only use interfaces that provide dirty tracking.
+		 * The iommu scope can only be promoted with the addition of a
+		 * dirty tracking group.
+		 */
+		group->pinned_page_dirty_scope = true;
+		mutex_unlock(&iommu->lock);
+
+		return 0;
 	}
 
 	domain->domain = iommu_domain_alloc(bus);
@@ -2324,7 +2241,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 			goto out_domain;
 	}
 
-	ret = vfio_iommu_attach_group(domain, group);
+	ret = iommu_attach_group(domain->domain, group->iommu_group);
 	if (ret)
 		goto out_domain;
 
@@ -2391,15 +2308,17 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	list_for_each_entry(d, &iommu->domain_list, next) {
 		if (d->domain->ops == domain->domain->ops &&
 		    d->prot == domain->prot) {
-			vfio_iommu_detach_group(domain, group);
-			if (!vfio_iommu_attach_group(d, group)) {
+			iommu_detach_group(domain->domain, group->iommu_group);
+			if (!iommu_attach_group(d->domain,
+						group->iommu_group)) {
 				list_add(&group->next, &d->group_list);
 				iommu_domain_free(domain->domain);
 				kfree(domain);
 				goto done;
 			}
 
-			ret = vfio_iommu_attach_group(domain, group);
+			ret = iommu_attach_group(domain->domain,
+						 group->iommu_group);
 			if (ret)
 				goto out_domain;
 		}
@@ -2436,7 +2355,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	return 0;
 
 out_detach:
-	vfio_iommu_detach_group(domain, group);
+	iommu_detach_group(domain->domain, group->iommu_group);
 out_domain:
 	iommu_domain_free(domain->domain);
 	vfio_iommu_iova_free(&iova_copy);
@@ -2601,7 +2520,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 		if (!group)
 			continue;
 
-		vfio_iommu_detach_group(domain, group);
+		iommu_detach_group(domain->domain, group->iommu_group);
 		update_dirty_scope = !group->pinned_page_dirty_scope;
 		list_del(&group->next);
 		kfree(group);
@@ -2689,7 +2608,7 @@ static void vfio_release_domain(struct vfio_domain *domain, bool external)
 	list_for_each_entry_safe(group, group_tmp,
 				 &domain->group_list, next) {
 		if (!external)
-			vfio_iommu_detach_group(domain, group);
+			iommu_detach_group(domain->domain, group->iommu_group);
 		list_del(&group->next);
 		kfree(group);
 	}
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index 68427e8fadebd6..15d03f6532d073 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -18,7 +18,6 @@ struct mdev_device {
 	void *driver_data;
 	struct list_head next;
 	struct mdev_type *type;
-	struct device *iommu_device;
 	bool active;
 };
 
@@ -27,25 +26,6 @@ static inline struct mdev_device *to_mdev_device(struct device *dev)
 	return container_of(dev, struct mdev_device, dev);
 }
 
-/*
- * Called by the parent device driver to set the device which represents
- * this mdev in iommu protection scope. By default, the iommu device is
- * NULL, that indicates using vendor defined isolation.
- *
- * @dev: the mediated device that iommu will isolate.
- * @iommu_device: a pci device which represents the iommu for @dev.
- */
-static inline void mdev_set_iommu_device(struct mdev_device *mdev,
-					 struct device *iommu_device)
-{
-	mdev->iommu_device = iommu_device;
-}
-
-static inline struct device *mdev_get_iommu_device(struct mdev_device *mdev)
-{
-	return mdev->iommu_device;
-}
-
 unsigned int mdev_get_type_group_id(struct mdev_device *mdev);
 unsigned int mtype_get_type_group_id(struct mdev_type *mtype);
 struct device *mtype_get_parent_dev(struct mdev_type *mtype);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
                   ` (9 preceding siblings ...)
  2021-09-13  7:16 ` [PATCH 10/14] vfio: remove the unused mdev iommu hook Christoph Hellwig
@ 2021-09-13  7:16 ` Christoph Hellwig
  2021-09-16 19:55   ` Kirti Wankhede
  2021-09-13  7:16 ` [PATCH 12/14] vfio/spapr_tce: reject mediated devices Christoph Hellwig
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:16 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian

Pass the group flags to ->attach_group and remove the messy check for
the bus type.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio.c                 | 32 +++++------------------------
 drivers/vfio/vfio.h                 | 27 +++++++++++++++++++++++-
 drivers/vfio/vfio_iommu_spapr_tce.c |  2 +-
 drivers/vfio/vfio_iommu_type1.c     | 19 ++---------------
 4 files changed, 34 insertions(+), 46 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 6589e296ef348c..08b27b64f0f935 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -68,30 +68,6 @@ struct vfio_unbound_dev {
 	struct list_head		unbound_next;
 };
 
-enum vfio_group_type {
-	/*
-	 * Physical device with IOMMU backing.
-	 */
-	VFIO_IOMMU,
-
-	/*
-	 * Virtual device without IOMMU backing. The VFIO core fakes up an
-	 * iommu_group as the iommu_group sysfs interface is part of the
-	 * userspace ABI.  The user of these devices must not be able to
-	 * directly trigger unmediated DMA.
-	 */
-	VFIO_EMULATED_IOMMU,
-
-	/*
-	 * Physical device without IOMMU backing. The VFIO core fakes up an
-	 * iommu_group as the iommu_group sysfs interface is part of the
-	 * userspace ABI.  Users can trigger unmediated DMA by the device,
-	 * usage is highly dangerous, requires an explicit opt-in and will
-	 * taint the kernel.
-	 */
-	VFIO_NO_IOMMU,
-};
-
 struct vfio_group {
 	struct kref			kref;
 	int				minor;
@@ -219,7 +195,7 @@ static long vfio_noiommu_ioctl(void *iommu_data,
 }
 
 static int vfio_noiommu_attach_group(void *iommu_data,
-				     struct iommu_group *iommu_group)
+		struct iommu_group *iommu_group, enum vfio_group_type type)
 {
 	return 0;
 }
@@ -1129,7 +1105,8 @@ static int __vfio_container_attach_groups(struct vfio_container *container,
 	int ret = -ENODEV;
 
 	list_for_each_entry(group, &container->group_list, container_next) {
-		ret = driver->ops->attach_group(data, group->iommu_group);
+		ret = driver->ops->attach_group(data, group->iommu_group,
+						group->type);
 		if (ret)
 			goto unwind;
 	}
@@ -1387,7 +1364,8 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	driver = container->iommu_driver;
 	if (driver) {
 		ret = driver->ops->attach_group(container->iommu_data,
-						group->iommu_group);
+						group->iommu_group,
+						group->type);
 		if (ret)
 			goto unlock_out;
 	}
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index a78de649eb2f16..a6713022115155 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -4,6 +4,30 @@
  *     Author: Alex Williamson <alex.williamson@redhat.com>
  */
 
+enum vfio_group_type {
+	/*
+	 * Physical device with IOMMU backing.
+	 */
+	VFIO_IOMMU,
+
+	/*
+	 * Virtual device without IOMMU backing. The VFIO core fakes up an
+	 * iommu_group as the iommu_group sysfs interface is part of the
+	 * userspace ABI.  The user of these devices must not be able to
+	 * directly trigger unmediated DMA.
+	 */
+	VFIO_EMULATED_IOMMU,
+
+	/*
+	 * Physical device without IOMMU backing. The VFIO core fakes up an
+	 * iommu_group as the iommu_group sysfs interface is part of the
+	 * userspace ABI.  Users can trigger unmediated DMA by the device,
+	 * usage is highly dangerous, requires an explicit opt-in and will
+	 * taint the kernel.
+	 */
+	VFIO_NO_IOMMU,
+};
+
 /* events for the backend driver notify callback */
 enum vfio_iommu_notify_type {
 	VFIO_IOMMU_CONTAINER_CLOSE = 0,
@@ -20,7 +44,8 @@ struct vfio_iommu_driver_ops {
 	long		(*ioctl)(void *iommu_data, unsigned int cmd,
 				 unsigned long arg);
 	int		(*attach_group)(void *iommu_data,
-					struct iommu_group *group);
+					struct iommu_group *group,
+					enum vfio_group_type);
 	void		(*detach_group)(void *iommu_data,
 					struct iommu_group *group);
 	int		(*pin_pages)(void *iommu_data,
diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index 3efd09faeca4a8..936a26b13c0b01 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -1239,7 +1239,7 @@ static long tce_iommu_take_ownership_ddw(struct tce_container *container,
 }
 
 static int tce_iommu_attach_group(void *iommu_data,
-		struct iommu_group *iommu_group)
+		struct iommu_group *iommu_group, enum vfio_group_type type)
 {
 	int ret = 0;
 	struct tce_container *container = iommu_data;
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 42a6be1fb7265e..a48e9f597cb213 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -36,7 +36,6 @@
 #include <linux/uaccess.h>
 #include <linux/vfio.h>
 #include <linux/workqueue.h>
-#include <linux/mdev.h>
 #include <linux/notifier.h>
 #include <linux/dma-iommu.h>
 #include <linux/irqdomain.h>
@@ -1934,20 +1933,6 @@ static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
 	return ret;
 }
 
-static bool vfio_bus_is_mdev(struct bus_type *bus)
-{
-	struct bus_type *mdev_bus;
-	bool ret = false;
-
-	mdev_bus = symbol_get(mdev_bus_type);
-	if (mdev_bus) {
-		ret = (bus == mdev_bus);
-		symbol_put(mdev_bus_type);
-	}
-
-	return ret;
-}
-
 /*
  * This is a helper function to insert an address range to iova list.
  * The list is initially created with a single entry corresponding to
@@ -2172,7 +2157,7 @@ static void vfio_iommu_iova_insert_copy(struct vfio_iommu *iommu,
 }
 
 static int vfio_iommu_type1_attach_group(void *iommu_data,
-					 struct iommu_group *iommu_group)
+		struct iommu_group *iommu_group, enum vfio_group_type type)
 {
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_iommu_group *group;
@@ -2207,7 +2192,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	if (ret)
 		goto out_free;
 
-	if (vfio_bus_is_mdev(bus)) {
+	if (type == VFIO_EMULATED_IOMMU) {
 		if (!iommu->external_domain) {
 			INIT_LIST_HEAD(&domain->group_list);
 			iommu->external_domain = domain;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 12/14] vfio/spapr_tce: reject mediated devices
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
                   ` (10 preceding siblings ...)
  2021-09-13  7:16 ` [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1 Christoph Hellwig
@ 2021-09-13  7:16 ` Christoph Hellwig
  2021-09-13  7:16 ` [PATCH 13/14] vfio/iommu_type1: remove the "external" domain Christoph Hellwig
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:16 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian

Unlike the the type1 IOMMU backend, the SPAPR one does not contain any
support for the magic non-IOMMU backed iommu_group used by mediated
devices, so reject them in ->attach_group.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio_iommu_spapr_tce.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index 936a26b13c0b01..708a95e618310c 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -1246,6 +1246,9 @@ static int tce_iommu_attach_group(void *iommu_data,
 	struct iommu_table_group *table_group;
 	struct tce_iommu_group *tcegrp = NULL;
 
+	if (type == VFIO_EMULATED_IOMMU)
+		return -EINVAL;
+
 	mutex_lock(&container->lock);
 
 	/* pr_debug("tce_vfio: Attaching group #%u to iommu %p\n",
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 13/14] vfio/iommu_type1: remove the "external" domain
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
                   ` (11 preceding siblings ...)
  2021-09-13  7:16 ` [PATCH 12/14] vfio/spapr_tce: reject mediated devices Christoph Hellwig
@ 2021-09-13  7:16 ` Christoph Hellwig
  2021-09-23 22:59   ` Alex Williamson
  2021-09-13  7:16 ` [PATCH 14/14] vfio/iommu_type1: remove IS_IOMMU_CAP_DOMAIN_IN_CONTAINER Christoph Hellwig
  2021-09-15 18:07 ` cleanup vfio iommu_group creation v5 Jason Gunthorpe
  14 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:16 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian

The external_domain concept rather misleading and not actually needed.
Replace it with a list of emulated groups in struct vfio_iommu.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio_iommu_type1.c | 121 ++++++++++++++------------------
 1 file changed, 54 insertions(+), 67 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index a48e9f597cb213..d2db62cb2aaa39 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -65,7 +65,6 @@ MODULE_PARM_DESC(dma_entry_limit,
 struct vfio_iommu {
 	struct list_head	domain_list;
 	struct list_head	iova_list;
-	struct vfio_domain	*external_domain; /* domain for external user */
 	struct mutex		lock;
 	struct rb_root		dma_list;
 	struct blocking_notifier_head notifier;
@@ -78,6 +77,7 @@ struct vfio_iommu {
 	bool			nesting;
 	bool			dirty_page_tracking;
 	bool			container_open;
+	struct list_head	emulated_iommu_groups;
 };
 
 struct vfio_domain {
@@ -1892,8 +1892,8 @@ static struct vfio_iommu_group*
 vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 			    struct iommu_group *iommu_group)
 {
+	struct vfio_iommu_group *group;
 	struct vfio_domain *domain;
-	struct vfio_iommu_group *group = NULL;
 
 	list_for_each_entry(domain, &iommu->domain_list, next) {
 		group = find_iommu_group(domain, iommu_group);
@@ -1901,10 +1901,10 @@ vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 			return group;
 	}
 
-	if (iommu->external_domain)
-		group = find_iommu_group(iommu->external_domain, iommu_group);
-
-	return group;
+	list_for_each_entry(group, &iommu->emulated_iommu_groups, next)
+		if (group->iommu_group == iommu_group)
+			return group;
+	return NULL;
 }
 
 static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
@@ -2163,62 +2163,52 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	struct vfio_iommu_group *group;
 	struct vfio_domain *domain, *d;
 	struct bus_type *bus = NULL;
-	int ret;
 	bool resv_msi, msi_remap;
 	phys_addr_t resv_msi_base = 0;
 	struct iommu_domain_geometry *geo;
 	LIST_HEAD(iova_copy);
 	LIST_HEAD(group_resv_regions);
+	int ret = -EINVAL;
 
 	mutex_lock(&iommu->lock);
 
 	/* Check for duplicates */
-	if (vfio_iommu_find_iommu_group(iommu, iommu_group)) {
-		mutex_unlock(&iommu->lock);
-		return -EINVAL;
-	}
+	if (vfio_iommu_find_iommu_group(iommu, iommu_group))
+		goto out_unlock;
 
+	ret = -ENOMEM;
 	group = kzalloc(sizeof(*group), GFP_KERNEL);
-	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
-	if (!group || !domain) {
-		ret = -ENOMEM;
-		goto out_free;
-	}
-
+	if (!group)
+		goto out_unlock;
 	group->iommu_group = iommu_group;
 
-	/* Determine bus_type in order to allocate a domain */
-	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
-	if (ret)
-		goto out_free;
-
 	if (type == VFIO_EMULATED_IOMMU) {
-		if (!iommu->external_domain) {
-			INIT_LIST_HEAD(&domain->group_list);
-			iommu->external_domain = domain;
-			vfio_update_pgsize_bitmap(iommu);
-		} else {
-			kfree(domain);
-		}
-
-		list_add(&group->next, &iommu->external_domain->group_list);
+		list_add(&group->next, &iommu->emulated_iommu_groups);
 		/*
-		 * Non-iommu backed group cannot dirty memory directly, it can
+		 * An emulated IOMMU group cannot dirty memory directly, it can
 		 * only use interfaces that provide dirty tracking.
 		 * The iommu scope can only be promoted with the addition of a
 		 * dirty tracking group.
 		 */
 		group->pinned_page_dirty_scope = true;
-		mutex_unlock(&iommu->lock);
-
-		return 0;
+		ret = 0;
+		goto out_unlock;
 	}
 
+	/* Determine bus_type in order to allocate a domain */
+	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
+	if (ret)
+		goto out_free_group;
+
+	ret = -ENOMEM;
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (!domain)
+		goto out_free_group;
+
+	ret = -EIO;
 	domain->domain = iommu_domain_alloc(bus);
-	if (!domain->domain) {
-		ret = -EIO;
-		goto out_free;
-	}
+	if (!domain->domain)
+		goto out_free_domain;
 
 	if (iommu->nesting) {
 		ret = iommu_enable_nesting(domain->domain);
@@ -2345,9 +2335,11 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	iommu_domain_free(domain->domain);
 	vfio_iommu_iova_free(&iova_copy);
 	vfio_iommu_resv_free(&group_resv_regions);
-out_free:
+out_free_domain:
 	kfree(domain);
+out_free_group:
 	kfree(group);
+out_unlock:
 	mutex_unlock(&iommu->lock);
 	return ret;
 }
@@ -2472,25 +2464,19 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	LIST_HEAD(iova_copy);
 
 	mutex_lock(&iommu->lock);
+	list_for_each_entry(group, &iommu->emulated_iommu_groups, next) {
+		if (group->iommu_group != iommu_group)
+			continue;
+		update_dirty_scope = !group->pinned_page_dirty_scope;
+		list_del(&group->next);
+		kfree(group);
 
-	if (iommu->external_domain) {
-		group = find_iommu_group(iommu->external_domain, iommu_group);
-		if (group) {
-			update_dirty_scope = !group->pinned_page_dirty_scope;
-			list_del(&group->next);
-			kfree(group);
-
-			if (list_empty(&iommu->external_domain->group_list)) {
-				if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
-					WARN_ON(iommu->notifier.head);
-					vfio_iommu_unmap_unpin_all(iommu);
-				}
-
-				kfree(iommu->external_domain);
-				iommu->external_domain = NULL;
-			}
-			goto detach_group_done;
+		if (list_empty(&iommu->emulated_iommu_groups) &&
+		    !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
+			WARN_ON(iommu->notifier.head);
+			vfio_iommu_unmap_unpin_all(iommu);
 		}
+		goto detach_group_done;
 	}
 
 	/*
@@ -2518,7 +2504,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 		 */
 		if (list_empty(&domain->group_list)) {
 			if (list_is_singular(&iommu->domain_list)) {
-				if (!iommu->external_domain) {
+				if (list_empty(&iommu->emulated_iommu_groups)) {
 					WARN_ON(iommu->notifier.head);
 					vfio_iommu_unmap_unpin_all(iommu);
 				} else {
@@ -2582,41 +2568,42 @@ static void *vfio_iommu_type1_open(unsigned long arg)
 	mutex_init(&iommu->lock);
 	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
 	init_waitqueue_head(&iommu->vaddr_wait);
+	INIT_LIST_HEAD(&iommu->emulated_iommu_groups);
 
 	return iommu;
 }
 
-static void vfio_release_domain(struct vfio_domain *domain, bool external)
+static void vfio_release_domain(struct vfio_domain *domain)
 {
 	struct vfio_iommu_group *group, *group_tmp;
 
 	list_for_each_entry_safe(group, group_tmp,
 				 &domain->group_list, next) {
-		if (!external)
-			iommu_detach_group(domain->domain, group->iommu_group);
+		iommu_detach_group(domain->domain, group->iommu_group);
 		list_del(&group->next);
 		kfree(group);
 	}
 
-	if (!external)
-		iommu_domain_free(domain->domain);
+	iommu_domain_free(domain->domain);
 }
 
 static void vfio_iommu_type1_release(void *iommu_data)
 {
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_domain *domain, *domain_tmp;
+	struct vfio_iommu_group *group, *next_group;
 
-	if (iommu->external_domain) {
-		vfio_release_domain(iommu->external_domain, true);
-		kfree(iommu->external_domain);
+	list_for_each_entry_safe(group, next_group,
+			&iommu->emulated_iommu_groups, next) {
+		list_del(&group->next);
+		kfree(group);
 	}
 
 	vfio_iommu_unmap_unpin_all(iommu);
 
 	list_for_each_entry_safe(domain, domain_tmp,
 				 &iommu->domain_list, next) {
-		vfio_release_domain(domain, false);
+		vfio_release_domain(domain);
 		list_del(&domain->next);
 		kfree(domain);
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 14/14] vfio/iommu_type1: remove IS_IOMMU_CAP_DOMAIN_IN_CONTAINER
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
                   ` (12 preceding siblings ...)
  2021-09-13  7:16 ` [PATCH 13/14] vfio/iommu_type1: remove the "external" domain Christoph Hellwig
@ 2021-09-13  7:16 ` Christoph Hellwig
  2021-09-15 18:07 ` cleanup vfio iommu_group creation v5 Jason Gunthorpe
  14 siblings, 0 replies; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:16 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian

IS_IOMMU_CAP_DOMAIN_IN_CONTAINER just obsfucated the checks being
performed, so open code it in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio_iommu_type1.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d2db62cb2aaa39..7ad56afb11750e 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -139,9 +139,6 @@ struct vfio_regions {
 	size_t len;
 };
 
-#define IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)	\
-					(!list_empty(&iommu->domain_list))
-
 #define DIRTY_BITMAP_BYTES(n)	(ALIGN(n, BITS_PER_TYPE(u64)) / BITS_PER_BYTE)
 
 /*
@@ -879,7 +876,7 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 	 * already pinned and accounted. Accounting should be done if there is no
 	 * iommu capable domain in the container.
 	 */
-	do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu);
+	do_accounting = list_empty(&iommu->domain_list);
 
 	for (i = 0; i < npage; i++) {
 		struct vfio_pfn *vpfn;
@@ -968,7 +965,7 @@ static int vfio_iommu_type1_unpin_pages(void *iommu_data,
 
 	mutex_lock(&iommu->lock);
 
-	do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu);
+	do_accounting = list_empty(&iommu->domain_list);
 	for (i = 0; i < npage; i++) {
 		struct vfio_dma *dma;
 		dma_addr_t iova;
@@ -1089,7 +1086,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 	if (!dma->size)
 		return 0;
 
-	if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
+	if (list_empty(&iommu->domain_list))
 		return 0;
 
 	/*
@@ -1666,7 +1663,7 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 	vfio_link_dma(iommu, dma);
 
 	/* Don't pin and map if container doesn't contain IOMMU capable domain*/
-	if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
+	if (list_empty(&iommu->domain_list))
 		dma->size = size;
 	else
 		ret = vfio_pin_map_dma(iommu, dma, size);
@@ -2472,7 +2469,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 		kfree(group);
 
 		if (list_empty(&iommu->emulated_iommu_groups) &&
-		    !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
+		    list_empty(&iommu->domain_list)) {
 			WARN_ON(iommu->notifier.head);
 			vfio_iommu_unmap_unpin_all(iommu);
 		}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* RE: [PATCH 01/14] vfio: Move vfio_iommu_group_get() to vfio_register_group_dev()
  2021-09-13  7:15 ` [PATCH 01/14] vfio: Move vfio_iommu_group_get() to vfio_register_group_dev() Christoph Hellwig
@ 2021-09-14  1:57   ` Tian, Kevin
  2021-09-14  5:46     ` Christoph Hellwig
  0 siblings, 1 reply; 43+ messages in thread
From: Tian, Kevin @ 2021-09-14  1:57 UTC (permalink / raw)
  To: Christoph Hellwig, Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Xu, Terrence, kvm, Jason Gunthorpe

> From: Christoph Hellwig <hch@lst.de>
> Sent: Monday, September 13, 2021 3:16 PM
> 
> From: Jason Gunthorpe <jgg@nvidia.com>
> 
> We don't need to hold a reference to the group in the driver as well as
> obtain a reference to the same group as the first thing
> vfio_register_group_dev() does.
> 
> Since the drivers never use the group move this all into the core code.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

I gave my reviewed-by in last version. Looks it's lost here.

> ---
>  drivers/vfio/fsl-mc/vfio_fsl_mc.c            | 17 ++-------
>  drivers/vfio/pci/vfio_pci_core.c             | 13 ++-----
>  drivers/vfio/platform/vfio_platform_common.c | 13 +------
>  drivers/vfio/vfio.c                          | 36 ++++++++------------
>  include/linux/vfio.h                         |  3 --
>  5 files changed, 19 insertions(+), 63 deletions(-)
> 
> diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-
> mc/vfio_fsl_mc.c
> index 0ead91bfa83867..9e838fed560339 100644
> --- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
> +++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
> @@ -505,22 +505,13 @@ static void vfio_fsl_uninit_device(struct
> vfio_fsl_mc_device *vdev)
> 
>  static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
>  {
> -	struct iommu_group *group;
>  	struct vfio_fsl_mc_device *vdev;
>  	struct device *dev = &mc_dev->dev;
>  	int ret;
> 
> -	group = vfio_iommu_group_get(dev);
> -	if (!group) {
> -		dev_err(dev, "VFIO_FSL_MC: No IOMMU group\n");
> -		return -EINVAL;
> -	}
> -
>  	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
> -	if (!vdev) {
> -		ret = -ENOMEM;
> -		goto out_group_put;
> -	}
> +	if (!vdev)
> +		return -ENOMEM;
> 
>  	vfio_init_group_dev(&vdev->vdev, dev, &vfio_fsl_mc_ops);
>  	vdev->mc_dev = mc_dev;
> @@ -556,8 +547,6 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device
> *mc_dev)
>  out_uninit:
>  	vfio_uninit_group_dev(&vdev->vdev);
>  	kfree(vdev);
> -out_group_put:
> -	vfio_iommu_group_put(group, dev);
>  	return ret;
>  }
> 
> @@ -574,8 +563,6 @@ static int vfio_fsl_mc_remove(struct fsl_mc_device
> *mc_dev)
> 
>  	vfio_uninit_group_dev(&vdev->vdev);
>  	kfree(vdev);
> -	vfio_iommu_group_put(mc_dev->dev.iommu_group, dev);
> -
>  	return 0;
>  }
> 
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index 68198e0f2a6310..43bab0ca3e682d 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1806,7 +1806,6 @@
> EXPORT_SYMBOL_GPL(vfio_pci_core_uninit_device);
>  int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
>  {
>  	struct pci_dev *pdev = vdev->pdev;
> -	struct iommu_group *group;
>  	int ret;
> 
>  	if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
> @@ -1825,10 +1824,6 @@ int vfio_pci_core_register_device(struct
> vfio_pci_core_device *vdev)
>  		return -EBUSY;
>  	}
> 
> -	group = vfio_iommu_group_get(&pdev->dev);
> -	if (!group)
> -		return -EINVAL;
> -
>  	if (pci_is_root_bus(pdev->bus)) {
>  		ret = vfio_assign_device_set(&vdev->vdev, vdev);
>  	} else if (!pci_probe_reset_slot(pdev->slot)) {
> @@ -1842,10 +1837,10 @@ int vfio_pci_core_register_device(struct
> vfio_pci_core_device *vdev)
>  	}
> 
>  	if (ret)
> -		goto out_group_put;
> +		return ret;
>  	ret = vfio_pci_vf_init(vdev);
>  	if (ret)
> -		goto out_group_put;
> +		return ret;
>  	ret = vfio_pci_vga_init(vdev);
>  	if (ret)
>  		goto out_vf;
> @@ -1876,8 +1871,6 @@ int vfio_pci_core_register_device(struct
> vfio_pci_core_device *vdev)
>  		vfio_pci_set_power_state(vdev, PCI_D0);
>  out_vf:
>  	vfio_pci_vf_uninit(vdev);
> -out_group_put:
> -	vfio_iommu_group_put(group, &pdev->dev);
>  	return ret;
>  }
>  EXPORT_SYMBOL_GPL(vfio_pci_core_register_device);
> @@ -1893,8 +1886,6 @@ void vfio_pci_core_unregister_device(struct
> vfio_pci_core_device *vdev)
>  	vfio_pci_vf_uninit(vdev);
>  	vfio_pci_vga_uninit(vdev);
> 
> -	vfio_iommu_group_put(pdev->dev.iommu_group, &pdev->dev);
> -
>  	if (!disable_idle_d3)
>  		vfio_pci_set_power_state(vdev, PCI_D0);
>  }
> diff --git a/drivers/vfio/platform/vfio_platform_common.c
> b/drivers/vfio/platform/vfio_platform_common.c
> index 6af7ce7d619c25..256f55b84e70a0 100644
> --- a/drivers/vfio/platform/vfio_platform_common.c
> +++ b/drivers/vfio/platform/vfio_platform_common.c
> @@ -642,7 +642,6 @@ static int vfio_platform_of_probe(struct
> vfio_platform_device *vdev,
>  int vfio_platform_probe_common(struct vfio_platform_device *vdev,
>  			       struct device *dev)
>  {
> -	struct iommu_group *group;
>  	int ret;
> 
>  	vfio_init_group_dev(&vdev->vdev, dev, &vfio_platform_ops);
> @@ -663,24 +662,15 @@ int vfio_platform_probe_common(struct
> vfio_platform_device *vdev,
>  		goto out_uninit;
>  	}
> 
> -	group = vfio_iommu_group_get(dev);
> -	if (!group) {
> -		dev_err(dev, "No IOMMU group for device %s\n", vdev-
> >name);
> -		ret = -EINVAL;
> -		goto put_reset;
> -	}
> -
>  	ret = vfio_register_group_dev(&vdev->vdev);
>  	if (ret)
> -		goto put_iommu;
> +		goto put_reset;
> 
>  	mutex_init(&vdev->igate);
> 
>  	pm_runtime_enable(dev);
>  	return 0;
> 
> -put_iommu:
> -	vfio_iommu_group_put(group, dev);
>  put_reset:
>  	vfio_platform_put_reset(vdev);
>  out_uninit:
> @@ -696,7 +686,6 @@ void vfio_platform_remove_common(struct
> vfio_platform_device *vdev)
>  	pm_runtime_disable(vdev->device);
>  	vfio_platform_put_reset(vdev);
>  	vfio_uninit_group_dev(&vdev->vdev);
> -	vfio_iommu_group_put(vdev->vdev.dev->iommu_group, vdev-
> >vdev.dev);
>  }
>  EXPORT_SYMBOL_GPL(vfio_platform_remove_common);
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 3c034fe14ccb03..b39da9b90c95bc 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -169,15 +169,7 @@ static void vfio_release_device_set(struct
> vfio_device *device)
>  	xa_unlock(&vfio_device_set_xa);
>  }
> 
> -/*
> - * vfio_iommu_group_{get,put} are only intended for VFIO bus driver probe
> - * and remove functions, any use cases other than acquiring the first
> - * reference for the purpose of calling vfio_register_group_dev() or
> removing
> - * that symmetric reference after vfio_unregister_group_dev() should use
> the raw
> - * iommu_group_{get,put} functions.  In particular, vfio_iommu_group_put()
> - * removes the device from the dummy group and cannot be nested.
> - */
> -struct iommu_group *vfio_iommu_group_get(struct device *dev)
> +static struct iommu_group *vfio_iommu_group_get(struct device *dev)
>  {
>  	struct iommu_group *group;
>  	int __maybe_unused ret;
> @@ -220,18 +212,6 @@ struct iommu_group *vfio_iommu_group_get(struct
> device *dev)
> 
>  	return group;
>  }
> -EXPORT_SYMBOL_GPL(vfio_iommu_group_get);
> -
> -void vfio_iommu_group_put(struct iommu_group *group, struct device
> *dev)
> -{
> -#ifdef CONFIG_VFIO_NOIOMMU
> -	if (iommu_group_get_iommudata(group) == &noiommu)
> -		iommu_group_remove_device(dev);
> -#endif
> -
> -	iommu_group_put(group);
> -}
> -EXPORT_SYMBOL_GPL(vfio_iommu_group_put);
> 
>  #ifdef CONFIG_VFIO_NOIOMMU
>  static void *vfio_noiommu_open(unsigned long arg)
> @@ -841,7 +821,7 @@ int vfio_register_group_dev(struct vfio_device
> *device)
>  	if (!device->dev_set)
>  		vfio_assign_device_set(device, device);
> 
> -	iommu_group = iommu_group_get(device->dev);
> +	iommu_group = vfio_iommu_group_get(device->dev);
>  	if (!iommu_group)
>  		return -EINVAL;
> 
> @@ -849,6 +829,10 @@ int vfio_register_group_dev(struct vfio_device
> *device)
>  	if (!group) {
>  		group = vfio_create_group(iommu_group);
>  		if (IS_ERR(group)) {
> +#ifdef CONFIG_VFIO_NOIOMMU
> +			if (iommu_group_get_iommudata(iommu_group) ==
> &noiommu)
> +				iommu_group_remove_device(device->dev);
> +#endif
>  			iommu_group_put(iommu_group);
>  			return PTR_ERR(group);
>  		}
> @@ -865,6 +849,10 @@ int vfio_register_group_dev(struct vfio_device
> *device)
>  		dev_WARN(device->dev, "Device already exists on
> group %d\n",
>  			 iommu_group_id(iommu_group));
>  		vfio_device_put(existing_device);
> +#ifdef CONFIG_VFIO_NOIOMMU
> +		if (iommu_group_get_iommudata(iommu_group) ==
> &noiommu)
> +			iommu_group_remove_device(device->dev);
> +#endif
>  		vfio_group_put(group);
>  		return -EBUSY;
>  	}
> @@ -1010,6 +998,10 @@ void vfio_unregister_group_dev(struct vfio_device
> *device)
>  	if (list_empty(&group->device_list))
>  		wait_event(group->container_q, !group->container);
> 
> +#ifdef CONFIG_VFIO_NOIOMMU
> +	if (iommu_group_get_iommudata(group) == &noiommu)
> +		iommu_group_remove_device(dev);
> +#endif
>  	/* Matches the get in vfio_register_group_dev() */
>  	vfio_group_put(group);
>  }
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index b53a9557884ada..f7083c2fd0d099 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -71,9 +71,6 @@ struct vfio_device_ops {
>  	int	(*match)(struct vfio_device *vdev, char *buf);
>  };
> 
> -extern struct iommu_group *vfio_iommu_group_get(struct device *dev);
> -extern void vfio_iommu_group_put(struct iommu_group *group, struct
> device *dev);
> -
>  void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
>  			 const struct vfio_device_ops *ops);
>  void vfio_uninit_group_dev(struct vfio_device *device);
> --
> 2.30.2


^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH 04/14] vfio: factor out a vfio_group_find_or_alloc helper
  2021-09-13  7:15 ` [PATCH 04/14] vfio: factor out a vfio_group_find_or_alloc helper Christoph Hellwig
@ 2021-09-14  2:00   ` Tian, Kevin
  0 siblings, 0 replies; 43+ messages in thread
From: Tian, Kevin @ 2021-09-14  2:00 UTC (permalink / raw)
  To: Christoph Hellwig, Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Xu, Terrence, kvm, Jason Gunthorpe

> From: Christoph Hellwig <hch@lst.de>
> Sent: Monday, September 13, 2021 3:16 PM
> 
> Factor out a helper to find or allocate the vfio_group to reduce the
> spagetthi code in vfio_register_group_dev a little.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
>  drivers/vfio/vfio.c | 60 ++++++++++++++++++++++++++-------------------
>  1 file changed, 35 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 8bd4b0b96b94a3..2b2679c7126fdf 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -823,10 +823,39 @@ void vfio_uninit_group_dev(struct vfio_device
> *device)
>  }
>  EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
> 
> +struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
> +{
> +	struct iommu_group *iommu_group;
> +	struct vfio_group *group;
> +
> +	iommu_group = vfio_iommu_group_get(dev);
> +	if (!iommu_group)
> +		return ERR_PTR(-EINVAL);
> +
> +	/* a found vfio_group already holds a reference to the iommu_group
> */
> +	group = vfio_group_get_from_iommu(iommu_group);
> +	if (group)
> +		goto out_put;
> +
> +	/* a newly created vfio_group keeps the reference. */
> +	group = vfio_create_group(iommu_group);
> +	if (IS_ERR(group))
> +		goto out_remove;
> +	return group;
> +
> +out_remove:
> +#ifdef CONFIG_VFIO_NOIOMMU
> +	if (iommu_group_get_iommudata(iommu_group) == &noiommu)
> +		iommu_group_remove_device(dev);
> +#endif
> +out_put:
> +	iommu_group_put(iommu_group);
> +	return group;
> +}
> +
>  int vfio_register_group_dev(struct vfio_device *device)
>  {
>  	struct vfio_device *existing_device;
> -	struct iommu_group *iommu_group;
>  	struct vfio_group *group;
> 
>  	/*
> @@ -836,36 +865,17 @@ int vfio_register_group_dev(struct vfio_device
> *device)
>  	if (!device->dev_set)
>  		vfio_assign_device_set(device, device);
> 
> -	iommu_group = vfio_iommu_group_get(device->dev);
> -	if (!iommu_group)
> -		return -EINVAL;
> -
> -	group = vfio_group_get_from_iommu(iommu_group);
> -	if (!group) {
> -		group = vfio_create_group(iommu_group);
> -		if (IS_ERR(group)) {
> -#ifdef CONFIG_VFIO_NOIOMMU
> -			if (iommu_group_get_iommudata(iommu_group) ==
> &noiommu)
> -				iommu_group_remove_device(device->dev);
> -#endif
> -			iommu_group_put(iommu_group);
> -			return PTR_ERR(group);
> -		}
> -	} else {
> -		/*
> -		 * A found vfio_group already holds a reference to the
> -		 * iommu_group.  A created vfio_group keeps the reference.
> -		 */
> -		iommu_group_put(iommu_group);
> -	}
> +	group = vfio_group_find_or_alloc(device->dev);
> +	if (IS_ERR(group))
> +		return PTR_ERR(group);
> 
>  	existing_device = vfio_group_get_device(group, device->dev);
>  	if (existing_device) {
>  		dev_WARN(device->dev, "Device already exists on
> group %d\n",
> -			 iommu_group_id(iommu_group));
> +			 iommu_group_id(group->iommu_group));
>  		vfio_device_put(existing_device);
>  #ifdef CONFIG_VFIO_NOIOMMU
> -		if (iommu_group_get_iommudata(iommu_group) ==
> &noiommu)
> +		if (iommu_group_get_iommudata(group->iommu_group) ==
> &noiommu)
>  			iommu_group_remove_device(device->dev);
>  #endif
>  		vfio_group_put(group);
> --
> 2.30.2


^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH 05/14] vfio: refactor noiommu group creation
  2021-09-13  7:15 ` [PATCH 05/14] vfio: refactor noiommu group creation Christoph Hellwig
@ 2021-09-14  2:07   ` Tian, Kevin
  0 siblings, 0 replies; 43+ messages in thread
From: Tian, Kevin @ 2021-09-14  2:07 UTC (permalink / raw)
  To: Christoph Hellwig, Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Xu, Terrence, kvm, Jason Gunthorpe

> From: Christoph Hellwig <hch@lst.de>
> Sent: Monday, September 13, 2021 3:16 PM
> 
> Split the actual noiommu group creation from vfio_iommu_group_get into a
> new helper, and open code the rest of vfio_iommu_group_get in its only
> caller.  This creates an entirely separate and clear code path for the
> noiommu group creation.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
>  drivers/vfio/vfio.c | 104 ++++++++++++++++++++++----------------------
>  1 file changed, 52 insertions(+), 52 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 2b2679c7126fdf..b1ed156adccd04 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -169,50 +169,6 @@ static void vfio_release_device_set(struct
> vfio_device *device)
>  	xa_unlock(&vfio_device_set_xa);
>  }
> 
> -static struct iommu_group *vfio_iommu_group_get(struct device *dev)
> -{
> -	struct iommu_group *group;
> -	int __maybe_unused ret;
> -
> -	group = iommu_group_get(dev);
> -
> -#ifdef CONFIG_VFIO_NOIOMMU
> -	/*
> -	 * With noiommu enabled, an IOMMU group will be created for a
> device
> -	 * that doesn't already have one and doesn't have an iommu_ops on
> their
> -	 * bus.  We set iommudata simply to be able to identify these groups
> -	 * as special use and for reclamation later.
> -	 */
> -	if (group || !noiommu || iommu_present(dev->bus))
> -		return group;
> -
> -	group = iommu_group_alloc();
> -	if (IS_ERR(group))
> -		return NULL;
> -
> -	iommu_group_set_name(group, "vfio-noiommu");
> -	iommu_group_set_iommudata(group, &noiommu, NULL);
> -	ret = iommu_group_add_device(group, dev);
> -	if (ret) {
> -		iommu_group_put(group);
> -		return NULL;
> -	}
> -
> -	/*
> -	 * Where to taint?  At this point we've added an IOMMU group for a
> -	 * device that is not backed by iommu_ops, therefore any iommu_
> -	 * callback using iommu_ops can legitimately Oops.  So, while we
> may
> -	 * be about to give a DMA capable device to a user without IOMMU
> -	 * protection, which is clearly taint-worthy, let's go ahead and do
> -	 * it here.
> -	 */
> -	add_taint(TAINT_USER, LOCKDEP_STILL_OK);
> -	dev_warn(dev, "Adding kernel taint for vfio-noiommu group on
> device\n");
> -#endif
> -
> -	return group;
> -}
> -
>  #ifdef CONFIG_VFIO_NOIOMMU
>  static void *vfio_noiommu_open(unsigned long arg)
>  {
> @@ -823,12 +779,61 @@ void vfio_uninit_group_dev(struct vfio_device
> *device)
>  }
>  EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
> 
> -struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
> +#ifdef CONFIG_VFIO_NOIOMMU
> +static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
>  {
>  	struct iommu_group *iommu_group;
>  	struct vfio_group *group;
> +	int ret;
> +
> +	iommu_group = iommu_group_alloc();
> +	if (IS_ERR(iommu_group))
> +		return ERR_CAST(iommu_group);
> 
> -	iommu_group = vfio_iommu_group_get(dev);
> +	iommu_group_set_name(iommu_group, "vfio-noiommu");
> +	iommu_group_set_iommudata(iommu_group, &noiommu, NULL);
> +	ret = iommu_group_add_device(iommu_group, dev);
> +	if (ret)
> +		goto out_put_group;
> +
> +	group = vfio_create_group(iommu_group);
> +	if (IS_ERR(group)) {
> +		ret = PTR_ERR(group);
> +		goto out_remove_device;
> +	}
> +
> +	return group;
> +
> +out_remove_device:
> +	iommu_group_remove_device(dev);
> +out_put_group:
> +	iommu_group_put(iommu_group);
> +	return ERR_PTR(ret);
> +}
> +#endif
> +
> +static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
> +{
> +	struct iommu_group *iommu_group;
> +	struct vfio_group *group;
> +
> +	iommu_group = iommu_group_get(dev);
> +#ifdef CONFIG_VFIO_NOIOMMU
> +	if (!iommu_group && noiommu && !iommu_present(dev->bus)) {
> +		/*
> +		 * With noiommu enabled, create an IOMMU group for
> devices that
> +		 * don't already have one and don't have an iommu_ops on
> their
> +		 * bus.  Taint the kernel because we're about to give a DMA
> +		 * capable device to a user without IOMMU protection.
> +		 */
> +		group = vfio_noiommu_group_alloc(dev);
> +		if (!IS_ERR(group)) {
> +			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
> +			dev_warn(dev, "Adding kernel taint for vfio-
> noiommu group on device\n");
> +		}
> +		return group;
> +	}
> +#endif
>  	if (!iommu_group)
>  		return ERR_PTR(-EINVAL);
> 
> @@ -840,14 +845,9 @@ struct vfio_group *vfio_group_find_or_alloc(struct
> device *dev)
>  	/* a newly created vfio_group keeps the reference. */
>  	group = vfio_create_group(iommu_group);
>  	if (IS_ERR(group))
> -		goto out_remove;
> +		goto out_put;
>  	return group;
> 
> -out_remove:
> -#ifdef CONFIG_VFIO_NOIOMMU
> -	if (iommu_group_get_iommudata(iommu_group) == &noiommu)
> -		iommu_group_remove_device(dev);
> -#endif
>  out_put:
>  	iommu_group_put(iommu_group);
>  	return group;
> --
> 2.30.2


^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-09-13  7:15 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
@ 2021-09-14  2:23   ` Tian, Kevin
  0 siblings, 0 replies; 43+ messages in thread
From: Tian, Kevin @ 2021-09-14  2:23 UTC (permalink / raw)
  To: Christoph Hellwig, Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Xu, Terrence, kvm

> From: Christoph Hellwig <hch@lst.de>
> Sent: Monday, September 13, 2021 3:16 PM
> 
> Reuse the logic in vfio_noiommu_group_alloc to allocate a fake
> single-device iommu group for mediated devices by factoring out a common
> function, and replacing the noiommu boolean field in struct vfio_group
> with an enum to distinguish the three different kinds of groups.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
>  drivers/s390/crypto/vfio_ap_ops.c |  2 +-
>  drivers/vfio/mdev/mdev_driver.c   | 45 ++-------------
>  drivers/vfio/mdev/vfio_mdev.c     |  2 +-
>  drivers/vfio/vfio.c               | 92 ++++++++++++++++++++++---------
>  include/linux/vfio.h              |  1 +
>  samples/vfio-mdev/mbochs.c        |  2 +-
>  samples/vfio-mdev/mdpy.c          |  2 +-
>  samples/vfio-mdev/mtty.c          |  2 +-
>  8 files changed, 76 insertions(+), 72 deletions(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
> b/drivers/s390/crypto/vfio_ap_ops.c
> index 118939a7729a1e..24755d1aedd5b0 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -351,7 +351,7 @@ static int vfio_ap_mdev_probe(struct mdev_device
> *mdev)
>  	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
>  	mutex_unlock(&matrix_dev->lock);
> 
> -	ret = vfio_register_group_dev(&matrix_mdev->vdev);
> +	ret = vfio_register_emulated_iommu_dev(&matrix_mdev->vdev);
>  	if (ret)
>  		goto err_list;
>  	dev_set_drvdata(&mdev->dev, matrix_mdev);
> diff --git a/drivers/vfio/mdev/mdev_driver.c
> b/drivers/vfio/mdev/mdev_driver.c
> index e2cb1ff56f6c9b..7927ed4f1711f2 100644
> --- a/drivers/vfio/mdev/mdev_driver.c
> +++ b/drivers/vfio/mdev/mdev_driver.c
> @@ -13,60 +13,23 @@
> 
>  #include "mdev_private.h"
> 
> -static int mdev_attach_iommu(struct mdev_device *mdev)
> -{
> -	int ret;
> -	struct iommu_group *group;
> -
> -	group = iommu_group_alloc();
> -	if (IS_ERR(group))
> -		return PTR_ERR(group);
> -
> -	ret = iommu_group_add_device(group, &mdev->dev);
> -	if (!ret)
> -		dev_info(&mdev->dev, "MDEV: group_id = %d\n",
> -			 iommu_group_id(group));
> -
> -	iommu_group_put(group);
> -	return ret;
> -}
> -
> -static void mdev_detach_iommu(struct mdev_device *mdev)
> -{
> -	iommu_group_remove_device(&mdev->dev);
> -	dev_info(&mdev->dev, "MDEV: detaching iommu\n");
> -}
> -
>  static int mdev_probe(struct device *dev)
>  {
>  	struct mdev_driver *drv =
>  		container_of(dev->driver, struct mdev_driver, driver);
> -	struct mdev_device *mdev = to_mdev_device(dev);
> -	int ret;
> 
> -	ret = mdev_attach_iommu(mdev);
> -	if (ret)
> -		return ret;
> -
> -	if (drv->probe) {
> -		ret = drv->probe(mdev);
> -		if (ret)
> -			mdev_detach_iommu(mdev);
> -	}
> -
> -	return ret;
> +	if (!drv->probe)
> +		return 0;
> +	return drv->probe(to_mdev_device(dev));
>  }
> 
>  static void mdev_remove(struct device *dev)
>  {
>  	struct mdev_driver *drv =
>  		container_of(dev->driver, struct mdev_driver, driver);
> -	struct mdev_device *mdev = to_mdev_device(dev);
> 
>  	if (drv->remove)
> -		drv->remove(mdev);
> -
> -	mdev_detach_iommu(mdev);
> +		drv->remove(to_mdev_device(dev));
>  }
> 
>  static int mdev_match(struct device *dev, struct device_driver *drv)
> diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
> index 7a9883048216e7..a90e24b0c851d3 100644
> --- a/drivers/vfio/mdev/vfio_mdev.c
> +++ b/drivers/vfio/mdev/vfio_mdev.c
> @@ -119,7 +119,7 @@ static int vfio_mdev_probe(struct mdev_device
> *mdev)
>  		return -ENOMEM;
> 
>  	vfio_init_group_dev(vdev, &mdev->dev, &vfio_mdev_dev_ops);
> -	ret = vfio_register_group_dev(vdev);
> +	ret = vfio_register_emulated_iommu_dev(vdev);
>  	if (ret)
>  		goto out_uninit;
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 23eaebd2e28cd9..2508c8c3984091 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -67,6 +67,30 @@ struct vfio_unbound_dev {
>  	struct list_head		unbound_next;
>  };
> 
> +enum vfio_group_type {
> +	/*
> +	 * Physical device with IOMMU backing.
> +	 */
> +	VFIO_IOMMU,
> +
> +	/*
> +	 * Virtual device without IOMMU backing. The VFIO core fakes up an
> +	 * iommu_group as the iommu_group sysfs interface is part of the
> +	 * userspace ABI.  The user of these devices must not be able to
> +	 * directly trigger unmediated DMA.
> +	 */
> +	VFIO_EMULATED_IOMMU,
> +
> +	/*
> +	 * Physical device without IOMMU backing. The VFIO core fakes up an
> +	 * iommu_group as the iommu_group sysfs interface is part of the
> +	 * userspace ABI.  Users can trigger unmediated DMA by the device,
> +	 * usage is highly dangerous, requires an explicit opt-in and will
> +	 * taint the kernel.
> +	 */
> +	VFIO_NO_IOMMU,
> +};
> +
>  struct vfio_group {
>  	struct kref			kref;
>  	int				minor;
> @@ -83,7 +107,7 @@ struct vfio_group {
>  	struct mutex			unbound_lock;
>  	atomic_t			opened;
>  	wait_queue_head_t		container_q;
> -	bool				noiommu;
> +	enum vfio_group_type		type;
>  	unsigned int			dev_counter;
>  	struct kvm			*kvm;
>  	struct blocking_notifier_head	notifier;
> @@ -336,7 +360,7 @@ static void vfio_group_unlock_and_free(struct
> vfio_group *group)
>   * Group objects - create, release, get, put, search
>   */
>  static struct vfio_group *vfio_create_group(struct iommu_group
> *iommu_group,
> -		bool noiommu)
> +		enum vfio_group_type type)
>  {
>  	struct vfio_group *group, *tmp;
>  	struct device *dev;
> @@ -355,7 +379,7 @@ static struct vfio_group *vfio_create_group(struct
> iommu_group *iommu_group,
>  	atomic_set(&group->opened, 0);
>  	init_waitqueue_head(&group->container_q);
>  	group->iommu_group = iommu_group;
> -	group->noiommu = noiommu;
> +	group->type = type;
>  	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
> 
>  	group->nb.notifier_call = vfio_iommu_group_notifier;
> @@ -391,8 +415,8 @@ static struct vfio_group *vfio_create_group(struct
> iommu_group *iommu_group,
>  	}
> 
>  	dev = device_create(vfio.class, NULL,
> -			    MKDEV(MAJOR(vfio.group_devt), minor),
> -			    group, "%s%d", group->noiommu ? "noiommu-" :
> "",
> +			    MKDEV(MAJOR(vfio.group_devt), minor), group,
> "%s%d",
> +			    group->type == VFIO_NO_IOMMU ? "noiommu-" :
> "",
>  			    iommu_group_id(iommu_group));
>  	if (IS_ERR(dev)) {
>  		vfio_free_group_minor(minor);
> @@ -778,8 +802,8 @@ void vfio_uninit_group_dev(struct vfio_device
> *device)
>  }
>  EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
> 
> -#ifdef CONFIG_VFIO_NOIOMMU
> -static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
> +static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
> +		enum vfio_group_type type)
>  {
>  	struct iommu_group *iommu_group;
>  	struct vfio_group *group;
> @@ -794,7 +818,7 @@ static struct vfio_group
> *vfio_noiommu_group_alloc(struct device *dev)
>  	if (ret)
>  		goto out_put_group;
> 
> -	group = vfio_create_group(iommu_group, true);
> +	group = vfio_create_group(iommu_group, type);
>  	if (IS_ERR(group)) {
>  		ret = PTR_ERR(group);
>  		goto out_remove_device;
> @@ -808,7 +832,6 @@ static struct vfio_group
> *vfio_noiommu_group_alloc(struct device *dev)
>  	iommu_group_put(iommu_group);
>  	return ERR_PTR(ret);
>  }
> -#endif
> 
>  static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
>  {
> @@ -824,7 +847,7 @@ static struct vfio_group
> *vfio_group_find_or_alloc(struct device *dev)
>  		 * bus.  Taint the kernel because we're about to give a DMA
>  		 * capable device to a user without IOMMU protection.
>  		 */
> -		group = vfio_noiommu_group_alloc(dev);
> +		group = vfio_noiommu_group_alloc(dev, VFIO_NO_IOMMU);
>  		if (!IS_ERR(group)) {
>  			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
>  			dev_warn(dev, "Adding kernel taint for vfio-
> noiommu group on device\n");
> @@ -841,7 +864,7 @@ static struct vfio_group
> *vfio_group_find_or_alloc(struct device *dev)
>  		goto out_put;
> 
>  	/* a newly created vfio_group keeps the reference. */
> -	group = vfio_create_group(iommu_group, false);
> +	group = vfio_create_group(iommu_group, VFIO_IOMMU);
>  	if (IS_ERR(group))
>  		goto out_put;
>  	return group;
> @@ -851,10 +874,13 @@ static struct vfio_group
> *vfio_group_find_or_alloc(struct device *dev)
>  	return group;
>  }
> 
> -int vfio_register_group_dev(struct vfio_device *device)
> +static int __vfio_register_dev(struct vfio_device *device,
> +		struct vfio_group *group)
>  {
>  	struct vfio_device *existing_device;
> -	struct vfio_group *group;
> +
> +	if (IS_ERR(group))
> +		return PTR_ERR(group);
> 
>  	/*
>  	 * If the driver doesn't specify a set then the device is added to a
> @@ -863,16 +889,13 @@ int vfio_register_group_dev(struct vfio_device
> *device)
>  	if (!device->dev_set)
>  		vfio_assign_device_set(device, device);
> 
> -	group = vfio_group_find_or_alloc(device->dev);
> -	if (IS_ERR(group))
> -		return PTR_ERR(group);
> -
>  	existing_device = vfio_group_get_device(group, device->dev);
>  	if (existing_device) {
>  		dev_WARN(device->dev, "Device already exists on
> group %d\n",
>  			 iommu_group_id(group->iommu_group));
>  		vfio_device_put(existing_device);
> -		if (group->noiommu)
> +		if (group->type == VFIO_NO_IOMMU ||
> +		    group->type == VFIO_EMULATED_IOMMU)
>  			iommu_group_remove_device(device->dev);
>  		vfio_group_put(group);
>  		return -EBUSY;
> @@ -891,8 +914,25 @@ int vfio_register_group_dev(struct vfio_device
> *device)
> 
>  	return 0;
>  }
> +
> +int vfio_register_group_dev(struct vfio_device *device)
> +{
> +	return __vfio_register_dev(device,
> +		vfio_group_find_or_alloc(device->dev));
> +}
>  EXPORT_SYMBOL_GPL(vfio_register_group_dev);
> 
> +/*
> + * Register a virtual device without IOMMU backing.  The user of this
> + * device must not be able to directly trigger unmediated DMA.
> + */
> +int vfio_register_emulated_iommu_dev(struct vfio_device *device)
> +{
> +	return __vfio_register_dev(device,
> +		vfio_noiommu_group_alloc(device->dev,
> VFIO_EMULATED_IOMMU));
> +}
> +EXPORT_SYMBOL_GPL(vfio_register_emulated_iommu_dev);
> +
>  /**
>   * Get a reference to the vfio_device for a device.  Even if the
>   * caller thinks they own the device, they could be racing with a
> @@ -1019,7 +1059,7 @@ void vfio_unregister_group_dev(struct vfio_device
> *device)
>  	if (list_empty(&group->device_list))
>  		wait_event(group->container_q, !group->container);
> 
> -	if (group->noiommu)
> +	if (group->type == VFIO_NO_IOMMU || group->type ==
> VFIO_EMULATED_IOMMU)
>  		iommu_group_remove_device(device->dev);
> 
>  	/* Matches the get in vfio_register_group_dev() */
> @@ -1368,7 +1408,7 @@ static int vfio_group_set_container(struct
> vfio_group *group, int container_fd)
>  	if (atomic_read(&group->container_users))
>  		return -EINVAL;
> 
> -	if (group->noiommu && !capable(CAP_SYS_RAWIO))
> +	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
>  		return -EPERM;
> 
>  	f = fdget(container_fd);
> @@ -1388,7 +1428,7 @@ static int vfio_group_set_container(struct
> vfio_group *group, int container_fd)
> 
>  	/* Real groups and fake groups cannot mix */
>  	if (!list_empty(&container->group_list) &&
> -	    container->noiommu != group->noiommu) {
> +	    container->noiommu != (group->type == VFIO_NO_IOMMU)) {
>  		ret = -EPERM;
>  		goto unlock_out;
>  	}
> @@ -1402,7 +1442,7 @@ static int vfio_group_set_container(struct
> vfio_group *group, int container_fd)
>  	}
> 
>  	group->container = container;
> -	container->noiommu = group->noiommu;
> +	container->noiommu = (group->type == VFIO_NO_IOMMU);
>  	list_add(&group->container_next, &container->group_list);
> 
>  	/* Get a reference on the container and mark a user within the group
> */
> @@ -1426,7 +1466,7 @@ static int vfio_group_add_container_user(struct
> vfio_group *group)
>  	if (!atomic_inc_not_zero(&group->container_users))
>  		return -EINVAL;
> 
> -	if (group->noiommu) {
> +	if (group->type == VFIO_NO_IOMMU) {
>  		atomic_dec(&group->container_users);
>  		return -EPERM;
>  	}
> @@ -1451,7 +1491,7 @@ static int vfio_group_get_device_fd(struct
> vfio_group *group, char *buf)
>  	    !group->container->iommu_driver || !vfio_group_viable(group))
>  		return -EINVAL;
> 
> -	if (group->noiommu && !capable(CAP_SYS_RAWIO))
> +	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
>  		return -EPERM;
> 
>  	device = vfio_device_get_from_name(group, buf);
> @@ -1498,7 +1538,7 @@ static int vfio_group_get_device_fd(struct
> vfio_group *group, char *buf)
> 
>  	fd_install(fdno, filep);
> 
> -	if (group->noiommu)
> +	if (group->type == VFIO_NO_IOMMU)
>  		dev_warn(device->dev, "vfio-noiommu device opened by
> user "
>  			 "(%s:%d)\n", current->comm, task_pid_nr(current));
>  	return fdno;
> @@ -1594,7 +1634,7 @@ static int vfio_group_fops_open(struct inode
> *inode, struct file *filep)
>  	if (!group)
>  		return -ENODEV;
> 
> -	if (group->noiommu && !capable(CAP_SYS_RAWIO)) {
> +	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
> {
>  		vfio_group_put(group);
>  		return -EPERM;
>  	}
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index f7083c2fd0d099..bbe29300862649 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -75,6 +75,7 @@ void vfio_init_group_dev(struct vfio_device *device,
> struct device *dev,
>  			 const struct vfio_device_ops *ops);
>  void vfio_uninit_group_dev(struct vfio_device *device);
>  int vfio_register_group_dev(struct vfio_device *device);
> +int vfio_register_emulated_iommu_dev(struct vfio_device *device);
>  void vfio_unregister_group_dev(struct vfio_device *device);
>  extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
>  extern void vfio_device_put(struct vfio_device *device);
> diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> index c313ab4d1f4e4e..cd41bec5fdeb39 100644
> --- a/samples/vfio-mdev/mbochs.c
> +++ b/samples/vfio-mdev/mbochs.c
> @@ -553,7 +553,7 @@ static int mbochs_probe(struct mdev_device *mdev)
>  	mbochs_create_config_space(mdev_state);
>  	mbochs_reset(mdev_state);
> 
> -	ret = vfio_register_group_dev(&mdev_state->vdev);
> +	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
>  	if (ret)
>  		goto err_mem;
>  	dev_set_drvdata(&mdev->dev, mdev_state);
> diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> index 8d1a80a0722aa9..fe5d43e797b6d3 100644
> --- a/samples/vfio-mdev/mdpy.c
> +++ b/samples/vfio-mdev/mdpy.c
> @@ -258,7 +258,7 @@ static int mdpy_probe(struct mdev_device *mdev)
> 
>  	mdpy_count++;
> 
> -	ret = vfio_register_group_dev(&mdev_state->vdev);
> +	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
>  	if (ret)
>  		goto err_mem;
>  	dev_set_drvdata(&mdev->dev, mdev_state);
> diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> index 5983cdb16e3d1d..a0e1a469bd47af 100644
> --- a/samples/vfio-mdev/mtty.c
> +++ b/samples/vfio-mdev/mtty.c
> @@ -741,7 +741,7 @@ static int mtty_probe(struct mdev_device *mdev)
> 
>  	mtty_create_config_space(mdev_state);
> 
> -	ret = vfio_register_group_dev(&mdev_state->vdev);
> +	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
>  	if (ret)
>  		goto err_vconfig;
>  	dev_set_drvdata(&mdev->dev, mdev_state);
> --
> 2.30.2


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/14] vfio: Move vfio_iommu_group_get() to vfio_register_group_dev()
  2021-09-14  1:57   ` Tian, Kevin
@ 2021-09-14  5:46     ` Christoph Hellwig
  2021-09-14  6:11       ` Tian, Kevin
  0 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-14  5:46 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Christoph Hellwig, Alex Williamson, Diana Craciun, Cornelia Huck,
	Kirti Wankhede, Eric Auger, Jason Gunthorpe, Xu, Terrence, kvm,
	Jason Gunthorpe

On Tue, Sep 14, 2021 at 01:57:55AM +0000, Tian, Kevin wrote:
> > Since the drivers never use the group move this all into the core code.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> I gave my reviewed-by in last version. Looks it's lost here.

There were a few changes, so I'd prefer everyone to re-review it
carefully.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH 01/14] vfio: Move vfio_iommu_group_get() to vfio_register_group_dev()
  2021-09-14  5:46     ` Christoph Hellwig
@ 2021-09-14  6:11       ` Tian, Kevin
  0 siblings, 0 replies; 43+ messages in thread
From: Tian, Kevin @ 2021-09-14  6:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Diana Craciun, Cornelia Huck, Kirti Wankhede,
	Eric Auger, Jason Gunthorpe, Xu, Terrence, kvm, Jason Gunthorpe

> From: Christoph Hellwig <hch@lst.de>
> Sent: Tuesday, September 14, 2021 1:46 PM
> 
> On Tue, Sep 14, 2021 at 01:57:55AM +0000, Tian, Kevin wrote:
> > > Since the drivers never use the group move this all into the core code.
> > >
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> >
> > I gave my reviewed-by in last version. Looks it's lost here.
> 
> There were a few changes, so I'd prefer everyone to re-review it
> carefully.

I did a quick comparison between two versions. The only difference
is on a few line offsets for applied changes. No actual code change
if I didn't overlook anything here...

Thanks
Kevin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/14] vfio: remove the iommudata hack for noiommu groups
  2021-09-13  7:15 ` [PATCH 06/14] vfio: remove the iommudata hack for noiommu groups Christoph Hellwig
@ 2021-09-15 16:44   ` Jason Gunthorpe
  0 siblings, 0 replies; 43+ messages in thread
From: Jason Gunthorpe @ 2021-09-15 16:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Diana Craciun, Cornelia Huck, Kirti Wankhede,
	Eric Auger, Terrence Xu, kvm, Kevin Tian

On Mon, Sep 13, 2021 at 09:15:58AM +0200, Christoph Hellwig wrote:
> Just pass a noiommu argument to vfio_create_group and set up the
> ->noiommu flag directly, and remove the now superflous
> vfio_iommu_group_put helper.

Nothing wrong with the patch, but vfio_iommu_group_put() is now
removed in patch 1

Jason

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: cleanup vfio iommu_group creation v5
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
                   ` (13 preceding siblings ...)
  2021-09-13  7:16 ` [PATCH 14/14] vfio/iommu_type1: remove IS_IOMMU_CAP_DOMAIN_IN_CONTAINER Christoph Hellwig
@ 2021-09-15 18:07 ` Jason Gunthorpe
  14 siblings, 0 replies; 43+ messages in thread
From: Jason Gunthorpe @ 2021-09-15 18:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Diana Craciun, Cornelia Huck, Kirti Wankhede,
	Eric Auger, Terrence Xu, kvm

On Mon, Sep 13, 2021 at 09:15:52AM +0200, Christoph Hellwig wrote:
> Hi Alex,
> 
> this series cleans up how iommu group are created in VFIO as well as
> various other lose ends around that.
> 
> Terrence: I did not pick up your Tested-by as there were quite a few
> changes and a major rebase.
> 
> Changes since v4:
>  - clean up an intermediate version of vfio_group_find_or_alloc to avoid
>    reviewer confusion
>  - replace an incorrect NULL check with IS_ERR
>  - rebased ontop of the vfio_ap_ops changes in Linux 5.15-rc
>  - improve a commit message

This all still looks good to me, can we get it into your tree in the
early party of the rc cycle Alex? Due to the
vfio_register_emulated_iommu_dev() it conflicts with the vfio_ccw and
i915 mdev conversions patches - this should go first..

Thanks,
Jason

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1
  2021-09-13  7:16 ` [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1 Christoph Hellwig
@ 2021-09-16 19:55   ` Kirti Wankhede
  2021-09-16 22:18     ` Jason Gunthorpe
  0 siblings, 1 reply; 43+ messages in thread
From: Kirti Wankhede @ 2021-09-16 19:55 UTC (permalink / raw)
  To: Christoph Hellwig, Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Eric Auger, Jason Gunthorpe,
	Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian



On 9/13/2021 12:46 PM, Christoph Hellwig wrote:
> Pass the group flags to ->attach_group and remove the messy check for
> the bus type.
> 

I like the way vfio_group_type is used in this patch, that removes messy 
way to call symbol_get(mdev_bus_type).

Any thoughts on how VFIO_IOMMU, i.e. IOMMU backed mdev, can be implemented?

For IOMMU backed mdev, mdev->dev->iommu_group should be same as 
mdev->type->parent->dev->iommu_group or in other words, parent device 
would be DMA alias for the mdev device with the restriction - single 
mdev device can be created for the physical device. Is it possible to 
link iommu_group of these two devices some way?

Thanks,
Kirti

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> ---
>   drivers/vfio/vfio.c                 | 32 +++++------------------------
>   drivers/vfio/vfio.h                 | 27 +++++++++++++++++++++++-
>   drivers/vfio/vfio_iommu_spapr_tce.c |  2 +-
>   drivers/vfio/vfio_iommu_type1.c     | 19 ++---------------
>   4 files changed, 34 insertions(+), 46 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 6589e296ef348c..08b27b64f0f935 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -68,30 +68,6 @@ struct vfio_unbound_dev {
>   	struct list_head		unbound_next;
>   };
>   
> -enum vfio_group_type {
> -	/*
> -	 * Physical device with IOMMU backing.
> -	 */
> -	VFIO_IOMMU,
> -
> -	/*
> -	 * Virtual device without IOMMU backing. The VFIO core fakes up an
> -	 * iommu_group as the iommu_group sysfs interface is part of the
> -	 * userspace ABI.  The user of these devices must not be able to
> -	 * directly trigger unmediated DMA.
> -	 */
> -	VFIO_EMULATED_IOMMU,
> -
> -	/*
> -	 * Physical device without IOMMU backing. The VFIO core fakes up an
> -	 * iommu_group as the iommu_group sysfs interface is part of the
> -	 * userspace ABI.  Users can trigger unmediated DMA by the device,
> -	 * usage is highly dangerous, requires an explicit opt-in and will
> -	 * taint the kernel.
> -	 */
> -	VFIO_NO_IOMMU,
> -};
> -
>   struct vfio_group {
>   	struct kref			kref;
>   	int				minor;
> @@ -219,7 +195,7 @@ static long vfio_noiommu_ioctl(void *iommu_data,
>   }
>   
>   static int vfio_noiommu_attach_group(void *iommu_data,
> -				     struct iommu_group *iommu_group)
> +		struct iommu_group *iommu_group, enum vfio_group_type type)
>   {
>   	return 0;
>   }
> @@ -1129,7 +1105,8 @@ static int __vfio_container_attach_groups(struct vfio_container *container,
>   	int ret = -ENODEV;
>   
>   	list_for_each_entry(group, &container->group_list, container_next) {
> -		ret = driver->ops->attach_group(data, group->iommu_group);
> +		ret = driver->ops->attach_group(data, group->iommu_group,
> +						group->type);
>   		if (ret)
>   			goto unwind;
>   	}
> @@ -1387,7 +1364,8 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
>   	driver = container->iommu_driver;
>   	if (driver) {
>   		ret = driver->ops->attach_group(container->iommu_data,
> -						group->iommu_group);
> +						group->iommu_group,
> +						group->type);
>   		if (ret)
>   			goto unlock_out;
>   	}
> diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
> index a78de649eb2f16..a6713022115155 100644
> --- a/drivers/vfio/vfio.h
> +++ b/drivers/vfio/vfio.h
> @@ -4,6 +4,30 @@
>    *     Author: Alex Williamson <alex.williamson@redhat.com>
>    */
>   
> +enum vfio_group_type {
> +	/*
> +	 * Physical device with IOMMU backing.
> +	 */
> +	VFIO_IOMMU,
> +
> +	/*
> +	 * Virtual device without IOMMU backing. The VFIO core fakes up an
> +	 * iommu_group as the iommu_group sysfs interface is part of the
> +	 * userspace ABI.  The user of these devices must not be able to
> +	 * directly trigger unmediated DMA.
> +	 */
> +	VFIO_EMULATED_IOMMU,
> +
> +	/*
> +	 * Physical device without IOMMU backing. The VFIO core fakes up an
> +	 * iommu_group as the iommu_group sysfs interface is part of the
> +	 * userspace ABI.  Users can trigger unmediated DMA by the device,
> +	 * usage is highly dangerous, requires an explicit opt-in and will
> +	 * taint the kernel.
> +	 */
> +	VFIO_NO_IOMMU,
> +};
> +
>   /* events for the backend driver notify callback */
>   enum vfio_iommu_notify_type {
>   	VFIO_IOMMU_CONTAINER_CLOSE = 0,
> @@ -20,7 +44,8 @@ struct vfio_iommu_driver_ops {
>   	long		(*ioctl)(void *iommu_data, unsigned int cmd,
>   				 unsigned long arg);
>   	int		(*attach_group)(void *iommu_data,
> -					struct iommu_group *group);
> +					struct iommu_group *group,
> +					enum vfio_group_type);
>   	void		(*detach_group)(void *iommu_data,
>   					struct iommu_group *group);
>   	int		(*pin_pages)(void *iommu_data,
> diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
> index 3efd09faeca4a8..936a26b13c0b01 100644
> --- a/drivers/vfio/vfio_iommu_spapr_tce.c
> +++ b/drivers/vfio/vfio_iommu_spapr_tce.c
> @@ -1239,7 +1239,7 @@ static long tce_iommu_take_ownership_ddw(struct tce_container *container,
>   }
>   
>   static int tce_iommu_attach_group(void *iommu_data,
> -		struct iommu_group *iommu_group)
> +		struct iommu_group *iommu_group, enum vfio_group_type type)
>   {
>   	int ret = 0;
>   	struct tce_container *container = iommu_data;
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 42a6be1fb7265e..a48e9f597cb213 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -36,7 +36,6 @@
>   #include <linux/uaccess.h>
>   #include <linux/vfio.h>
>   #include <linux/workqueue.h>
> -#include <linux/mdev.h>
>   #include <linux/notifier.h>
>   #include <linux/dma-iommu.h>
>   #include <linux/irqdomain.h>
> @@ -1934,20 +1933,6 @@ static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
>   	return ret;
>   }
>   
> -static bool vfio_bus_is_mdev(struct bus_type *bus)
> -{
> -	struct bus_type *mdev_bus;
> -	bool ret = false;
> -
> -	mdev_bus = symbol_get(mdev_bus_type);
> -	if (mdev_bus) {
> -		ret = (bus == mdev_bus);
> -		symbol_put(mdev_bus_type);
> -	}
> -
> -	return ret;
> -}
> -
>   /*
>    * This is a helper function to insert an address range to iova list.
>    * The list is initially created with a single entry corresponding to
> @@ -2172,7 +2157,7 @@ static void vfio_iommu_iova_insert_copy(struct vfio_iommu *iommu,
>   }
>   
>   static int vfio_iommu_type1_attach_group(void *iommu_data,
> -					 struct iommu_group *iommu_group)
> +		struct iommu_group *iommu_group, enum vfio_group_type type)
>   {
>   	struct vfio_iommu *iommu = iommu_data;
>   	struct vfio_iommu_group *group;
> @@ -2207,7 +2192,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>   	if (ret)
>   		goto out_free;
>   
> -	if (vfio_bus_is_mdev(bus)) {
> +	if (type == VFIO_EMULATED_IOMMU) {
>   		if (!iommu->external_domain) {
>   			INIT_LIST_HEAD(&domain->group_list);
>   			iommu->external_domain = domain;
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1
  2021-09-16 19:55   ` Kirti Wankhede
@ 2021-09-16 22:18     ` Jason Gunthorpe
  2021-09-17  4:49       ` Tian, Kevin
  0 siblings, 1 reply; 43+ messages in thread
From: Jason Gunthorpe @ 2021-09-16 22:18 UTC (permalink / raw)
  To: Kirti Wankhede
  Cc: Christoph Hellwig, Alex Williamson, Diana Craciun, Cornelia Huck,
	Eric Auger, Terrence Xu, kvm, Kevin Tian

On Fri, Sep 17, 2021 at 01:25:59AM +0530, Kirti Wankhede wrote:
> 
> 
> On 9/13/2021 12:46 PM, Christoph Hellwig wrote:
> > Pass the group flags to ->attach_group and remove the messy check for
> > the bus type.
> > 
> 
> I like the way vfio_group_type is used in this patch, that removes messy way
> to call symbol_get(mdev_bus_type).
> 
> Any thoughts on how VFIO_IOMMU, i.e. IOMMU backed mdev, can be implemented?
> 
> For IOMMU backed mdev, mdev->dev->iommu_group should be same as
> mdev->type->parent->dev->iommu_group or in other words, parent device would
> be DMA alias for the mdev device with the restriction - single mdev device
> can be created for the physical device. Is it possible to link iommu_group
> of these two devices some way?

You just use the new style mdev API and directly call
vfio_register_group_dev and it will pick up the
parent->dev->iommu_group naturally like everything else using physical
iommu groups.

Jason

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1
  2021-09-16 22:18     ` Jason Gunthorpe
@ 2021-09-17  4:49       ` Tian, Kevin
  2021-09-17  5:05         ` Christoph Hellwig
  0 siblings, 1 reply; 43+ messages in thread
From: Tian, Kevin @ 2021-09-17  4:49 UTC (permalink / raw)
  To: Jason Gunthorpe, Kirti Wankhede
  Cc: Christoph Hellwig, Alex Williamson, Diana Craciun, Cornelia Huck,
	Eric Auger, Xu, Terrence, kvm

> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Friday, September 17, 2021 6:19 AM
> 
> On Fri, Sep 17, 2021 at 01:25:59AM +0530, Kirti Wankhede wrote:
> >
> >
> > On 9/13/2021 12:46 PM, Christoph Hellwig wrote:
> > > Pass the group flags to ->attach_group and remove the messy check for
> > > the bus type.
> > >
> >
> > I like the way vfio_group_type is used in this patch, that removes messy
> way
> > to call symbol_get(mdev_bus_type).
> >
> > Any thoughts on how VFIO_IOMMU, i.e. IOMMU backed mdev, can be
> implemented?
> >
> > For IOMMU backed mdev, mdev->dev->iommu_group should be same as
> > mdev->type->parent->dev->iommu_group or in other words, parent device
> would
> > be DMA alias for the mdev device with the restriction - single mdev device
> > can be created for the physical device. Is it possible to link iommu_group
> > of these two devices some way?
> 
> You just use the new style mdev API and directly call
> vfio_register_group_dev and it will pick up the
> parent->dev->iommu_group naturally like everything else using physical
> iommu groups.
> 

For above usage (wrap pdev into mdev), isn't the right way to directly add 
vendor vfio-pci driver since vfio-pci-core has been split out now? It's not 
necessary to fake a mdev just for adding some mediation in the r/w path...

Another type of IOMMU-backed mdev is with pasid support. But for this
case we discussed earlier that it doesn't have group and will follow the
new /dev/iommu proposal instead.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1
  2021-09-17  4:49       ` Tian, Kevin
@ 2021-09-17  5:05         ` Christoph Hellwig
  2021-09-17  6:51           ` Kirti Wankhede
  0 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-09-17  5:05 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Jason Gunthorpe, Kirti Wankhede, Christoph Hellwig,
	Alex Williamson, Diana Craciun, Cornelia Huck, Eric Auger, Xu,
	Terrence, kvm

On Fri, Sep 17, 2021 at 04:49:41AM +0000, Tian, Kevin wrote:
> > You just use the new style mdev API and directly call
> > vfio_register_group_dev and it will pick up the
> > parent->dev->iommu_group naturally like everything else using physical
> > iommu groups.
> > 
> 
> For above usage (wrap pdev into mdev), isn't the right way to directly add 
> vendor vfio-pci driver since vfio-pci-core has been split out now? It's not 
> necessary to fake a mdev just for adding some mediation in the r/w path...

Exactly.

> Another type of IOMMU-backed mdev is with pasid support. But for this
> case we discussed earlier that it doesn't have group and will follow the
> new /dev/iommu proposal instead.

Indeed.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1
  2021-09-17  5:05         ` Christoph Hellwig
@ 2021-09-17  6:51           ` Kirti Wankhede
  2021-09-17 12:53             ` Jason Gunthorpe
  0 siblings, 1 reply; 43+ messages in thread
From: Kirti Wankhede @ 2021-09-17  6:51 UTC (permalink / raw)
  To: Christoph Hellwig, Tian, Kevin
  Cc: Jason Gunthorpe, Alex Williamson, Diana Craciun, Cornelia Huck,
	Eric Auger, Xu, Terrence, kvm



On 9/17/2021 10:35 AM, Christoph Hellwig wrote:
> On Fri, Sep 17, 2021 at 04:49:41AM +0000, Tian, Kevin wrote:
>>> You just use the new style mdev API and directly call
>>> vfio_register_group_dev and it will pick up the
>>> parent->dev->iommu_group naturally like everything else using physical
>>> iommu groups.
>>>
>>

Directly calling vfio_register_group_dev() doesn't work without linking 
mdev->dev->iommu_group to mdev->type->parent->dev->iommu_group.

When mdev device is created, mdev->dev->iommu_group is NULL. Then if 
called vfio_register_group_dev with mdev->dev as vfio device, it fails 
because mdev->dev->iommu_group is NULL. So create vfio_device with mdev 
parent's dev as below:

if (IOMMU backed mdev)
     vfio_init_group_dev(&vfio_dev, &mdev->type->parent->dev, &fops);
else
     vfio_init_group_dev(&vfio_dev, &mdev->dev, &fops);

if (IOMMU backed mdev)
     vfio_register_group_dev(&vfio_dev);
else
     vfio_register_emulated_iommu_dev(&vfio_dev);

But still mdev->dev->iommu_group is NULL and 
/sys/bus/mdev/devices/<mdev_uuid>/iommu_group is not present.
For QEMU, input parameter is mdev device's UUID. QEMU checks 
/sys/bus/mdev/devices/<mdev_uuid>/iommu_group path and fails with error 
"no iommu_group found".

There has to be symlink /mdev/devices/<mdev_uuid>/iommu_group to it's 
parent device's iommu_group.

iommu_group_add_device(parent_iommu_group, mdev->dev) fails because mdev 
device is not pci device or ACPI device. Can it be allowed to add mdev 
device to its parent's iommu group here?

>> For above usage (wrap pdev into mdev), isn't the right way to directly add
>> vendor vfio-pci driver since vfio-pci-core has been split out now? It's not
>> necessary to fake a mdev just for adding some mediation in the r/w path...
> 
> Exactly.

vfio-pci doesn't provide way to configure the device as mdev framework 
provide using mdev_types.

Thanks,
Kirti


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1
  2021-09-17  6:51           ` Kirti Wankhede
@ 2021-09-17 12:53             ` Jason Gunthorpe
  2021-09-30 16:46               ` Alex Williamson
  0 siblings, 1 reply; 43+ messages in thread
From: Jason Gunthorpe @ 2021-09-17 12:53 UTC (permalink / raw)
  To: Kirti Wankhede
  Cc: Christoph Hellwig, Tian, Kevin, Alex Williamson, Diana Craciun,
	Cornelia Huck, Eric Auger, Xu, Terrence, kvm

On Fri, Sep 17, 2021 at 12:21:07PM +0530, Kirti Wankhede wrote:
> 
> 
> On 9/17/2021 10:35 AM, Christoph Hellwig wrote:
> > On Fri, Sep 17, 2021 at 04:49:41AM +0000, Tian, Kevin wrote:
> > > > You just use the new style mdev API and directly call
> > > > vfio_register_group_dev and it will pick up the
> > > > parent->dev->iommu_group naturally like everything else using physical
> > > > iommu groups.
> > > > 
> > > 
> 
> Directly calling vfio_register_group_dev() doesn't work without linking
> mdev->dev->iommu_group to mdev->type->parent->dev->iommu_group.

You pass the PCI device, not the mdev to vfio_register_group_dev().

> > > For above usage (wrap pdev into mdev), isn't the right way to directly add
> > > vendor vfio-pci driver since vfio-pci-core has been split out now? It's not
> > > necessary to fake a mdev just for adding some mediation in the r/w path...
> > 
> > Exactly.
> 
> vfio-pci doesn't provide way to configure the device as mdev framework
> provide using mdev_types.

The linux standard is for a PCI PF driver to configure it's VFs, not a
mdev or vfio.

Jason

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/14] vfio: factor out a vfio_iommu_driver_allowed helper
  2021-09-13  7:15 ` [PATCH 02/14] vfio: factor out a vfio_iommu_driver_allowed helper Christoph Hellwig
@ 2021-09-23 22:59   ` Alex Williamson
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2021-09-23 22:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian

On Mon, 13 Sep 2021 09:15:54 +0200
Christoph Hellwig <hch@lst.de> wrote:

> Factor out a little helper to make the checks for the noiommu driver less
> ugly.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> ---
>  drivers/vfio/vfio.c | 37 +++++++++++++++++++++----------------
>  1 file changed, 21 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index b39da9b90c95bc..faf8e0d637bb94 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
...
> @@ -999,8 +1014,8 @@ void vfio_unregister_group_dev(struct vfio_device *device)
>  		wait_event(group->container_q, !group->container);
>  
>  #ifdef CONFIG_VFIO_NOIOMMU
> -	if (iommu_group_get_iommudata(group) == &noiommu)
> -		iommu_group_remove_device(dev);
> +	if (iommu_group_get_iommudata(group->iommu_group) == &noiommu)
> +		iommu_group_remove_device(device->dev);
>  #endif
>  	/* Matches the get in vfio_register_group_dev() */
>  	vfio_group_put(group);

This is a build fix to the previous patch in the series, it should be
rolled in there.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 13/14] vfio/iommu_type1: remove the "external" domain
  2021-09-13  7:16 ` [PATCH 13/14] vfio/iommu_type1: remove the "external" domain Christoph Hellwig
@ 2021-09-23 22:59   ` Alex Williamson
  2021-09-23 23:06     ` Jason Gunthorpe
  0 siblings, 1 reply; 43+ messages in thread
From: Alex Williamson @ 2021-09-23 22:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm, Jason Gunthorpe, Kevin Tian

On Mon, 13 Sep 2021 09:16:05 +0200
Christoph Hellwig <hch@lst.de> wrote:

> The external_domain concept rather misleading and not actually needed.
> Replace it with a list of emulated groups in struct vfio_iommu.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 121 ++++++++++++++------------------
>  1 file changed, 54 insertions(+), 67 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index a48e9f597cb213..d2db62cb2aaa39 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
...
> @@ -2163,62 +2163,52 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>  	struct vfio_iommu_group *group;
>  	struct vfio_domain *domain, *d;
>  	struct bus_type *bus = NULL;
> -	int ret;
>  	bool resv_msi, msi_remap;
>  	phys_addr_t resv_msi_base = 0;
>  	struct iommu_domain_geometry *geo;
>  	LIST_HEAD(iova_copy);
>  	LIST_HEAD(group_resv_regions);
> +	int ret = -EINVAL;
>  
>  	mutex_lock(&iommu->lock);
>  
>  	/* Check for duplicates */
> -	if (vfio_iommu_find_iommu_group(iommu, iommu_group)) {
> -		mutex_unlock(&iommu->lock);
> -		return -EINVAL;
> -	}
> +	if (vfio_iommu_find_iommu_group(iommu, iommu_group))
> +		goto out_unlock;
>  
> +	ret = -ENOMEM;
>  	group = kzalloc(sizeof(*group), GFP_KERNEL);
> -	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
> -	if (!group || !domain) {
> -		ret = -ENOMEM;
> -		goto out_free;
> -	}
> -
> +	if (!group)
> +		goto out_unlock;
>  	group->iommu_group = iommu_group;
>  
> -	/* Determine bus_type in order to allocate a domain */
> -	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
> -	if (ret)
> -		goto out_free;
> -
>  	if (type == VFIO_EMULATED_IOMMU) {
> -		if (!iommu->external_domain) {
> -			INIT_LIST_HEAD(&domain->group_list);
> -			iommu->external_domain = domain;
> -			vfio_update_pgsize_bitmap(iommu);
> -		} else {
> -			kfree(domain);
> -		}
> -
> -		list_add(&group->next, &iommu->external_domain->group_list);
> +		list_add(&group->next, &iommu->emulated_iommu_groups);
>  		/*
> -		 * Non-iommu backed group cannot dirty memory directly, it can
> +		 * An emulated IOMMU group cannot dirty memory directly, it can
>  		 * only use interfaces that provide dirty tracking.
>  		 * The iommu scope can only be promoted with the addition of a
>  		 * dirty tracking group.
>  		 */
>  		group->pinned_page_dirty_scope = true;
> -		mutex_unlock(&iommu->lock);
> -
> -		return 0;
> +		ret = 0;
> +		goto out_unlock;
>  	}

I think this dropped the call to vfio_update_pgsize_bitmap(), which
would leave iommu->pgsize_bitmap = 0 for a container hosting only mdev
devices, which leads us to undefined behavior when we're using ffs on
it to validate maps, unmaps, dirty bitmap support, etc.   Did I miss
this getting moved elsewhere?  Thanks,

Alex


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 13/14] vfio/iommu_type1: remove the "external" domain
  2021-09-23 22:59   ` Alex Williamson
@ 2021-09-23 23:06     ` Jason Gunthorpe
  0 siblings, 0 replies; 43+ messages in thread
From: Jason Gunthorpe @ 2021-09-23 23:06 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Christoph Hellwig, Diana Craciun, Cornelia Huck, Kirti Wankhede,
	Eric Auger, Terrence Xu, kvm, Kevin Tian

On Thu, Sep 23, 2021 at 04:59:24PM -0600, Alex Williamson wrote:

> I think this dropped the call to vfio_update_pgsize_bitmap(), which
> would leave iommu->pgsize_bitmap = 0 for a container hosting only mdev
> devices, which leads us to undefined behavior when we're using ffs on
> it to validate maps, unmaps, dirty bitmap support, etc.   Did I miss
> this getting moved elsewhere?  Thanks,

I think you are right, but I'd suggest to add a call in
vfio_iommu_type1_open() so that the pgsize_bitmap is never invalid in
the first place.

Calling vfio_update_pgsize_bitmap() in places that don't change the
domain_list is pretty confusing.

Jason


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1
  2021-09-17 12:53             ` Jason Gunthorpe
@ 2021-09-30 16:46               ` Alex Williamson
  2021-09-30 16:57                 ` Jason Gunthorpe
  0 siblings, 1 reply; 43+ messages in thread
From: Alex Williamson @ 2021-09-30 16:46 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Kirti Wankhede, Christoph Hellwig, Tian, Kevin, Diana Craciun,
	Cornelia Huck, Eric Auger, Xu, Terrence, kvm

On Fri, 17 Sep 2021 09:53:01 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:

> On Fri, Sep 17, 2021 at 12:21:07PM +0530, Kirti Wankhede wrote:
> > 
> > 
> > On 9/17/2021 10:35 AM, Christoph Hellwig wrote:  
> > > On Fri, Sep 17, 2021 at 04:49:41AM +0000, Tian, Kevin wrote:  
> > > > > You just use the new style mdev API and directly call
> > > > > vfio_register_group_dev and it will pick up the
> > > > > parent->dev->iommu_group naturally like everything else using physical
> > > > > iommu groups.
> > > > >   
> > > >   
> > 
> > Directly calling vfio_register_group_dev() doesn't work without linking
> > mdev->dev->iommu_group to mdev->type->parent->dev->iommu_group.  
> 
> You pass the PCI device, not the mdev to vfio_register_group_dev().
> 
> > > > For above usage (wrap pdev into mdev), isn't the right way to directly add
> > > > vendor vfio-pci driver since vfio-pci-core has been split out now? It's not
> > > > necessary to fake a mdev just for adding some mediation in the r/w path...  
> > > 
> > > Exactly.  
> > 
> > vfio-pci doesn't provide way to configure the device as mdev framework
> > provide using mdev_types.  
> 
> The linux standard is for a PCI PF driver to configure it's VFs, not a
> mdev or vfio.

Hi Jason,

I'm only aware that the PF driver enables basic SR-IOV configuration of
VFs, ie. the number of enabled VFs.  The mdev interface enables not
only management of the number of child devices, but the flavor of each
child, for example the non-homogeneous slice of resources allocated per
child device.

I think that's the sort of configuration Kirti is asking about here and
I'm not aware of any standard mechanism for a PF driver to apply a
configuration per VF.  A mediated path to a physical device isn't
exclusive to providing features like migration, it can also be used to
create these sorts device flavors.  For example, we might expose NIC VFs
and administrative configuration should restrict VF1 to a 1Gbit
interface while VF2 gets 10Gbit.

Are we left to driver specific sysfs attributes to achieve this or can
we create some form of standardization like mdev provides?  Thanks,

Alex


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1
  2021-09-30 16:46               ` Alex Williamson
@ 2021-09-30 16:57                 ` Jason Gunthorpe
  2021-10-01  3:20                   ` Christoph Hellwig
  0 siblings, 1 reply; 43+ messages in thread
From: Jason Gunthorpe @ 2021-09-30 16:57 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Kirti Wankhede, Christoph Hellwig, Tian, Kevin, Diana Craciun,
	Cornelia Huck, Eric Auger, Xu, Terrence, kvm

On Thu, Sep 30, 2021 at 10:46:20AM -0600, Alex Williamson wrote:
> I'm only aware that the PF driver enables basic SR-IOV configuration of
> VFs, ie. the number of enabled VFs. 

This is quite common in the netdev world, for instance you use the PF
driver to set the MAC addresses, QOS and other details on the VF
devices.

> only management of the number of child devices, but the flavor of each
> child, for example the non-homogeneous slice of resources allocated per
> child device.

Since the devices are PCI VFs they should be able to be used, with
configuration, any place a PCI VF is usable. EG vfio-pci, a Kernel
driver, etc.

This is why the PF needs to provide the configuration to support all
the use cases.
 
> I'm not aware of any standard mechanism for a PF driver to apply a
> configuration per VF.  

devlink is the standard way these days. It can model the PCI devices
and puts a control point in the PF driver.

I'd defer to Jiri, Leon and others to explain the details of this
though :)

> create these sorts device flavors.  For example, we might expose NIC VFs
> and administrative configuration should restrict VF1 to a 1Gbit
> interface while VF2 gets 10Gbit.

This is all being done already today through the PF driver using
either netink or devlink interfaces

Jason

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1
  2021-09-30 16:57                 ` Jason Gunthorpe
@ 2021-10-01  3:20                   ` Christoph Hellwig
  0 siblings, 0 replies; 43+ messages in thread
From: Christoph Hellwig @ 2021-10-01  3:20 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Kirti Wankhede, Christoph Hellwig, Tian, Kevin,
	Diana Craciun, Cornelia Huck, Eric Auger, Xu, Terrence, kvm

On Thu, Sep 30, 2021 at 01:57:07PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 30, 2021 at 10:46:20AM -0600, Alex Williamson wrote:
> > I'm only aware that the PF driver enables basic SR-IOV configuration of
> > VFs, ie. the number of enabled VFs. 
> 
> This is quite common in the netdev world, for instance you use the PF
> driver to set the MAC addresses, QOS and other details on the VF
> devices.

The NVMe spec also support it using the Virtualization extensions,
although I'm not aware of any device that actually implements it so far.

> 
> > only management of the number of child devices, but the flavor of each
> > child, for example the non-homogeneous slice of resources allocated per
> > child device.
> 
> Since the devices are PCI VFs they should be able to be used, with
> configuration, any place a PCI VF is usable. EG vfio-pci, a Kernel
> driver, etc.
> 
> This is why the PF needs to provide the configuration to support all
> the use cases.

Exactly.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 13/14] vfio/iommu_type1: remove the "external" domain
  2021-08-26 13:34 ` [PATCH 13/14] vfio/iommu_type1: remove the "external" domain Christoph Hellwig
@ 2021-08-26 23:08   ` Alex Williamson
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2021-08-26 23:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm, Jason Gunthorpe, Kevin Tian

On Thu, 26 Aug 2021 15:34:23 +0200
Christoph Hellwig <hch@lst.de> wrote:

> The external_domain concept rather misleading and not actually needed.
> Replace it with a list of emulated groups in struct vfio_iommu and
> document the purpose.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 121 ++++++++++++++------------------
>  1 file changed, 54 insertions(+), 67 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index ca3c995c84166f..1ef98d4e2abecf 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -65,7 +65,6 @@ MODULE_PARM_DESC(dma_entry_limit,
>  struct vfio_iommu {
>  	struct list_head	domain_list;
>  	struct list_head	iova_list;
> -	struct vfio_domain	*external_domain; /* domain for external user */
>  	struct mutex		lock;
>  	struct rb_root		dma_list;
>  	struct blocking_notifier_head notifier;
> @@ -78,6 +77,7 @@ struct vfio_iommu {
>  	bool			nesting;
>  	bool			dirty_page_tracking;
>  	bool			container_open;
> +	struct list_head	emulated_iommu_groups;
>  };
>  
>  struct vfio_domain {
> @@ -1892,8 +1892,8 @@ static struct vfio_iommu_group*
>  vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
>  			    struct iommu_group *iommu_group)
>  {
> +	struct vfio_iommu_group *group;
>  	struct vfio_domain *domain;
> -	struct vfio_iommu_group *group = NULL;
>  
>  	list_for_each_entry(domain, &iommu->domain_list, next) {
>  		group = find_iommu_group(domain, iommu_group);
> @@ -1901,10 +1901,10 @@ vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
>  			return group;
>  	}
>  
> -	if (iommu->external_domain)
> -		group = find_iommu_group(iommu->external_domain, iommu_group);
> -
> -	return group;
> +	list_for_each_entry(group, &iommu->emulated_iommu_groups, next)
> +		if (group->iommu_group == iommu_group)
> +			return group;
> +	return NULL;
>  }
>  
>  static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
> @@ -2163,62 +2163,52 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>  	struct vfio_iommu_group *group;
>  	struct vfio_domain *domain, *d;
>  	struct bus_type *bus = NULL;
> -	int ret;
>  	bool resv_msi, msi_remap;
>  	phys_addr_t resv_msi_base = 0;
>  	struct iommu_domain_geometry *geo;
>  	LIST_HEAD(iova_copy);
>  	LIST_HEAD(group_resv_regions);
> +	int ret = -EINVAL;
>  
>  	mutex_lock(&iommu->lock);
>  
>  	/* Check for duplicates */
> -	if (vfio_iommu_find_iommu_group(iommu, iommu_group)) {
> -		mutex_unlock(&iommu->lock);
> -		return -EINVAL;
> -	}
> +	if (vfio_iommu_find_iommu_group(iommu, iommu_group))
> +		goto out_unlock;
>  
> +	ret = -ENOMEM;
>  	group = kzalloc(sizeof(*group), GFP_KERNEL);
> -	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
> -	if (!group || !domain) {
> -		ret = -ENOMEM;
> -		goto out_free;
> -	}
> -
> +	if (!group)
> +		goto out_unlock;
>  	group->iommu_group = iommu_group;
>  
> -	/* Determine bus_type in order to allocate a domain */
> -	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
> -	if (ret)
> -		goto out_free;
> -
>  	if (flags & VFIO_EMULATED_IOMMU) {
> -		if (!iommu->external_domain) {
> -			INIT_LIST_HEAD(&domain->group_list);
> -			iommu->external_domain = domain;
> -			vfio_update_pgsize_bitmap(iommu);
> -		} else {
> -			kfree(domain);
> -		}
> -
> -		list_add(&group->next, &iommu->external_domain->group_list);
> +		list_add(&group->next, &iommu->emulated_iommu_groups);
>  		/*
> -		 * Non-iommu backed group cannot dirty memory directly, it can
> +		 * An emulated IOMMU group cannot dirty memory directly, it can
>  		 * only use interfaces that provide dirty tracking.
>  		 * The iommu scope can only be promoted with the addition of a
>  		 * dirty tracking group.
>  		 */


I didn't actually spot the additional documentation noted in the commit
log, is this it?  Thanks,

Alex


>  		group->pinned_page_dirty_scope = true;
> -		mutex_unlock(&iommu->lock);
> -
> -		return 0;
> +		ret = 0;
> +		goto out_unlock;
>  	}
>  
> +	/* Determine bus_type in order to allocate a domain */
> +	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
> +	if (ret)
> +		goto out_free_group;
> +
> +	ret = -ENOMEM;
> +	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
> +	if (!domain)
> +		goto out_free_group;
> +
> +	ret = -EIO;
>  	domain->domain = iommu_domain_alloc(bus);
> -	if (!domain->domain) {
> -		ret = -EIO;
> -		goto out_free;
> -	}
> +	if (!domain->domain)
> +		goto out_free_domain;
>  
>  	if (iommu->nesting) {
>  		ret = iommu_enable_nesting(domain->domain);
> @@ -2345,9 +2335,11 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>  	iommu_domain_free(domain->domain);
>  	vfio_iommu_iova_free(&iova_copy);
>  	vfio_iommu_resv_free(&group_resv_regions);
> -out_free:
> +out_free_domain:
>  	kfree(domain);
> +out_free_group:
>  	kfree(group);
> +out_unlock:
>  	mutex_unlock(&iommu->lock);
>  	return ret;
>  }
> @@ -2472,25 +2464,19 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
>  	LIST_HEAD(iova_copy);
>  
>  	mutex_lock(&iommu->lock);
> +	list_for_each_entry(group, &iommu->emulated_iommu_groups, next) {
> +		if (group->iommu_group != iommu_group)
> +			continue;
> +		update_dirty_scope = !group->pinned_page_dirty_scope;
> +		list_del(&group->next);
> +		kfree(group);
>  
> -	if (iommu->external_domain) {
> -		group = find_iommu_group(iommu->external_domain, iommu_group);
> -		if (group) {
> -			update_dirty_scope = !group->pinned_page_dirty_scope;
> -			list_del(&group->next);
> -			kfree(group);
> -
> -			if (list_empty(&iommu->external_domain->group_list)) {
> -				if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
> -					WARN_ON(iommu->notifier.head);
> -					vfio_iommu_unmap_unpin_all(iommu);
> -				}
> -
> -				kfree(iommu->external_domain);
> -				iommu->external_domain = NULL;
> -			}
> -			goto detach_group_done;
> +		if (list_empty(&iommu->emulated_iommu_groups) &&
> +		    !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
> +			WARN_ON(iommu->notifier.head);
> +			vfio_iommu_unmap_unpin_all(iommu);
>  		}
> +		goto detach_group_done;
>  	}
>  
>  	/*
> @@ -2518,7 +2504,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
>  		 */
>  		if (list_empty(&domain->group_list)) {
>  			if (list_is_singular(&iommu->domain_list)) {
> -				if (!iommu->external_domain) {
> +				if (list_empty(&iommu->emulated_iommu_groups)) {
>  					WARN_ON(iommu->notifier.head);
>  					vfio_iommu_unmap_unpin_all(iommu);
>  				} else {
> @@ -2582,41 +2568,42 @@ static void *vfio_iommu_type1_open(unsigned long arg)
>  	mutex_init(&iommu->lock);
>  	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
>  	init_waitqueue_head(&iommu->vaddr_wait);
> +	INIT_LIST_HEAD(&iommu->emulated_iommu_groups);
>  
>  	return iommu;
>  }
>  
> -static void vfio_release_domain(struct vfio_domain *domain, bool external)
> +static void vfio_release_domain(struct vfio_domain *domain)
>  {
>  	struct vfio_iommu_group *group, *group_tmp;
>  
>  	list_for_each_entry_safe(group, group_tmp,
>  				 &domain->group_list, next) {
> -		if (!external)
> -			iommu_detach_group(domain->domain, group->iommu_group);
> +		iommu_detach_group(domain->domain, group->iommu_group);
>  		list_del(&group->next);
>  		kfree(group);
>  	}
>  
> -	if (!external)
> -		iommu_domain_free(domain->domain);
> +	iommu_domain_free(domain->domain);
>  }
>  
>  static void vfio_iommu_type1_release(void *iommu_data)
>  {
>  	struct vfio_iommu *iommu = iommu_data;
>  	struct vfio_domain *domain, *domain_tmp;
> +	struct vfio_iommu_group *group, *next_group;
>  
> -	if (iommu->external_domain) {
> -		vfio_release_domain(iommu->external_domain, true);
> -		kfree(iommu->external_domain);
> +	list_for_each_entry_safe(group, next_group,
> +			&iommu->emulated_iommu_groups, next) {
> +		list_del(&group->next);
> +		kfree(group);
>  	}
>  
>  	vfio_iommu_unmap_unpin_all(iommu);
>  
>  	list_for_each_entry_safe(domain, domain_tmp,
>  				 &iommu->domain_list, next) {
> -		vfio_release_domain(domain, false);
> +		vfio_release_domain(domain);
>  		list_del(&domain->next);
>  		kfree(domain);
>  	}


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 13/14] vfio/iommu_type1: remove the "external" domain
  2021-08-26 13:34 cleanup vfio iommu_group creation v4 Christoph Hellwig
@ 2021-08-26 13:34 ` Christoph Hellwig
  2021-08-26 23:08   ` Alex Williamson
  0 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-08-26 13:34 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm, Jason Gunthorpe, Kevin Tian

The external_domain concept rather misleading and not actually needed.
Replace it with a list of emulated groups in struct vfio_iommu and
document the purpose.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio_iommu_type1.c | 121 ++++++++++++++------------------
 1 file changed, 54 insertions(+), 67 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index ca3c995c84166f..1ef98d4e2abecf 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -65,7 +65,6 @@ MODULE_PARM_DESC(dma_entry_limit,
 struct vfio_iommu {
 	struct list_head	domain_list;
 	struct list_head	iova_list;
-	struct vfio_domain	*external_domain; /* domain for external user */
 	struct mutex		lock;
 	struct rb_root		dma_list;
 	struct blocking_notifier_head notifier;
@@ -78,6 +77,7 @@ struct vfio_iommu {
 	bool			nesting;
 	bool			dirty_page_tracking;
 	bool			container_open;
+	struct list_head	emulated_iommu_groups;
 };
 
 struct vfio_domain {
@@ -1892,8 +1892,8 @@ static struct vfio_iommu_group*
 vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 			    struct iommu_group *iommu_group)
 {
+	struct vfio_iommu_group *group;
 	struct vfio_domain *domain;
-	struct vfio_iommu_group *group = NULL;
 
 	list_for_each_entry(domain, &iommu->domain_list, next) {
 		group = find_iommu_group(domain, iommu_group);
@@ -1901,10 +1901,10 @@ vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 			return group;
 	}
 
-	if (iommu->external_domain)
-		group = find_iommu_group(iommu->external_domain, iommu_group);
-
-	return group;
+	list_for_each_entry(group, &iommu->emulated_iommu_groups, next)
+		if (group->iommu_group == iommu_group)
+			return group;
+	return NULL;
 }
 
 static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
@@ -2163,62 +2163,52 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	struct vfio_iommu_group *group;
 	struct vfio_domain *domain, *d;
 	struct bus_type *bus = NULL;
-	int ret;
 	bool resv_msi, msi_remap;
 	phys_addr_t resv_msi_base = 0;
 	struct iommu_domain_geometry *geo;
 	LIST_HEAD(iova_copy);
 	LIST_HEAD(group_resv_regions);
+	int ret = -EINVAL;
 
 	mutex_lock(&iommu->lock);
 
 	/* Check for duplicates */
-	if (vfio_iommu_find_iommu_group(iommu, iommu_group)) {
-		mutex_unlock(&iommu->lock);
-		return -EINVAL;
-	}
+	if (vfio_iommu_find_iommu_group(iommu, iommu_group))
+		goto out_unlock;
 
+	ret = -ENOMEM;
 	group = kzalloc(sizeof(*group), GFP_KERNEL);
-	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
-	if (!group || !domain) {
-		ret = -ENOMEM;
-		goto out_free;
-	}
-
+	if (!group)
+		goto out_unlock;
 	group->iommu_group = iommu_group;
 
-	/* Determine bus_type in order to allocate a domain */
-	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
-	if (ret)
-		goto out_free;
-
 	if (flags & VFIO_EMULATED_IOMMU) {
-		if (!iommu->external_domain) {
-			INIT_LIST_HEAD(&domain->group_list);
-			iommu->external_domain = domain;
-			vfio_update_pgsize_bitmap(iommu);
-		} else {
-			kfree(domain);
-		}
-
-		list_add(&group->next, &iommu->external_domain->group_list);
+		list_add(&group->next, &iommu->emulated_iommu_groups);
 		/*
-		 * Non-iommu backed group cannot dirty memory directly, it can
+		 * An emulated IOMMU group cannot dirty memory directly, it can
 		 * only use interfaces that provide dirty tracking.
 		 * The iommu scope can only be promoted with the addition of a
 		 * dirty tracking group.
 		 */
 		group->pinned_page_dirty_scope = true;
-		mutex_unlock(&iommu->lock);
-
-		return 0;
+		ret = 0;
+		goto out_unlock;
 	}
 
+	/* Determine bus_type in order to allocate a domain */
+	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
+	if (ret)
+		goto out_free_group;
+
+	ret = -ENOMEM;
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (!domain)
+		goto out_free_group;
+
+	ret = -EIO;
 	domain->domain = iommu_domain_alloc(bus);
-	if (!domain->domain) {
-		ret = -EIO;
-		goto out_free;
-	}
+	if (!domain->domain)
+		goto out_free_domain;
 
 	if (iommu->nesting) {
 		ret = iommu_enable_nesting(domain->domain);
@@ -2345,9 +2335,11 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	iommu_domain_free(domain->domain);
 	vfio_iommu_iova_free(&iova_copy);
 	vfio_iommu_resv_free(&group_resv_regions);
-out_free:
+out_free_domain:
 	kfree(domain);
+out_free_group:
 	kfree(group);
+out_unlock:
 	mutex_unlock(&iommu->lock);
 	return ret;
 }
@@ -2472,25 +2464,19 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	LIST_HEAD(iova_copy);
 
 	mutex_lock(&iommu->lock);
+	list_for_each_entry(group, &iommu->emulated_iommu_groups, next) {
+		if (group->iommu_group != iommu_group)
+			continue;
+		update_dirty_scope = !group->pinned_page_dirty_scope;
+		list_del(&group->next);
+		kfree(group);
 
-	if (iommu->external_domain) {
-		group = find_iommu_group(iommu->external_domain, iommu_group);
-		if (group) {
-			update_dirty_scope = !group->pinned_page_dirty_scope;
-			list_del(&group->next);
-			kfree(group);
-
-			if (list_empty(&iommu->external_domain->group_list)) {
-				if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
-					WARN_ON(iommu->notifier.head);
-					vfio_iommu_unmap_unpin_all(iommu);
-				}
-
-				kfree(iommu->external_domain);
-				iommu->external_domain = NULL;
-			}
-			goto detach_group_done;
+		if (list_empty(&iommu->emulated_iommu_groups) &&
+		    !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
+			WARN_ON(iommu->notifier.head);
+			vfio_iommu_unmap_unpin_all(iommu);
 		}
+		goto detach_group_done;
 	}
 
 	/*
@@ -2518,7 +2504,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 		 */
 		if (list_empty(&domain->group_list)) {
 			if (list_is_singular(&iommu->domain_list)) {
-				if (!iommu->external_domain) {
+				if (list_empty(&iommu->emulated_iommu_groups)) {
 					WARN_ON(iommu->notifier.head);
 					vfio_iommu_unmap_unpin_all(iommu);
 				} else {
@@ -2582,41 +2568,42 @@ static void *vfio_iommu_type1_open(unsigned long arg)
 	mutex_init(&iommu->lock);
 	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
 	init_waitqueue_head(&iommu->vaddr_wait);
+	INIT_LIST_HEAD(&iommu->emulated_iommu_groups);
 
 	return iommu;
 }
 
-static void vfio_release_domain(struct vfio_domain *domain, bool external)
+static void vfio_release_domain(struct vfio_domain *domain)
 {
 	struct vfio_iommu_group *group, *group_tmp;
 
 	list_for_each_entry_safe(group, group_tmp,
 				 &domain->group_list, next) {
-		if (!external)
-			iommu_detach_group(domain->domain, group->iommu_group);
+		iommu_detach_group(domain->domain, group->iommu_group);
 		list_del(&group->next);
 		kfree(group);
 	}
 
-	if (!external)
-		iommu_domain_free(domain->domain);
+	iommu_domain_free(domain->domain);
 }
 
 static void vfio_iommu_type1_release(void *iommu_data)
 {
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_domain *domain, *domain_tmp;
+	struct vfio_iommu_group *group, *next_group;
 
-	if (iommu->external_domain) {
-		vfio_release_domain(iommu->external_domain, true);
-		kfree(iommu->external_domain);
+	list_for_each_entry_safe(group, next_group,
+			&iommu->emulated_iommu_groups, next) {
+		list_del(&group->next);
+		kfree(group);
 	}
 
 	vfio_iommu_unmap_unpin_all(iommu);
 
 	list_for_each_entry_safe(domain, domain_tmp,
 				 &iommu->domain_list, next) {
-		vfio_release_domain(domain, false);
+		vfio_release_domain(domain);
 		list_del(&domain->next);
 		kfree(domain);
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* RE: [PATCH 13/14] vfio/iommu_type1: remove the "external" domain
  2021-08-25 16:19 ` [PATCH 13/14] vfio/iommu_type1: remove the "external" domain Christoph Hellwig
@ 2021-08-26  4:31   ` Tian, Kevin
  0 siblings, 0 replies; 43+ messages in thread
From: Tian, Kevin @ 2021-08-26  4:31 UTC (permalink / raw)
  To: Christoph Hellwig, Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm, Jason Gunthorpe

> From: Christoph Hellwig <hch@lst.de>
> Sent: Thursday, August 26, 2021 12:19 AM
> 
[...]
> @@ -2163,45 +2164,27 @@ static int vfio_iommu_type1_attach_group(void
> *iommu_data,
>  	struct vfio_iommu_group *group;
>  	struct vfio_domain *domain, *d;
>  	struct bus_type *bus = NULL;
> -	int ret;
>  	bool resv_msi, msi_remap;
>  	phys_addr_t resv_msi_base = 0;
>  	struct iommu_domain_geometry *geo;
>  	LIST_HEAD(iova_copy);
>  	LIST_HEAD(group_resv_regions);
> +	int ret = -EINVAL;
> 
>  	mutex_lock(&iommu->lock);
> 
>  	/* Check for duplicates */
> -	if (vfio_iommu_find_iommu_group(iommu, iommu_group)) {
> -		mutex_unlock(&iommu->lock);
> -		return -EINVAL;
> -	}
> +	if (vfio_iommu_find_iommu_group(iommu, iommu_group))
> +		goto out_unlock;
> 
> +	ret = -ENOMEM;
>  	group = kzalloc(sizeof(*group), GFP_KERNEL);
> -	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
> -	if (!group || !domain) {
> -		ret = -ENOMEM;
> -		goto out_free;
> -	}
> -
> +	if (!group)
> +		goto out_unlock;
>  	group->iommu_group = iommu_group;
> 
> -	/* Determine bus_type in order to allocate a domain */
> -	ret = iommu_group_for_each_dev(iommu_group, &bus,
> vfio_bus_type);
> -	if (ret)
> -		goto out_free;
> -
>  	if (flags & VFIO_EMULATED_IOMMU) {
> -		if (!iommu->external_domain) {
> -			INIT_LIST_HEAD(&domain->group_list);
> -			iommu->external_domain = domain;
> -			vfio_update_pgsize_bitmap(iommu);
> -		} else {
> -			kfree(domain);
> -		}
> -
> -		list_add(&group->next, &iommu->external_domain-
> >group_list);
> +		list_add(&group->next, &iommu->emulated_iommu_groups);
>  		/*
>  		 * Non-iommu backed group cannot dirty memory directly, it
> can

unify the naming e.g. "group with emulated iommu cannot dirty ..."

>  		 * only use interfaces that provide dirty tracking.
> @@ -2209,16 +2192,24 @@ static int vfio_iommu_type1_attach_group(void
> *iommu_data,
>  		 * dirty tracking group.
>  		 */
>  		group->pinned_page_dirty_scope = true;
> -		mutex_unlock(&iommu->lock);
> -
> -		return 0;
> +		ret = 0;
> +		goto out_unlock;
>  	}
> 
> +	/* Determine bus_type in order to allocate a domain */
> +	ret = iommu_group_for_each_dev(iommu_group, &bus,
> vfio_bus_type);
> +	if (ret)
> +		goto out_free_group;
> +
> +	ret = -ENOMEM;
> +	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
> +	if (!domain)
> +		goto out_free_group;
> +
> +	ret = -EIO;
>  	domain->domain = iommu_domain_alloc(bus);
> -	if (!domain->domain) {
> -		ret = -EIO;
> -		goto out_free;
> -	}

-ENOMEM?

> +	if (!domain->domain)
> +		goto out_free_domain;
> 
>  	if (iommu->nesting) {
>  		ret = iommu_enable_nesting(domain->domain);

looks your change of errno handling is incomplete. there are several other
places down this function which set the errno right before goto. better to 
have a consistent style in one function. 😊

Thanks
Kevin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 13/14] vfio/iommu_type1: remove the "external" domain
  2021-08-25 16:19 cleanup vfio iommu_group creation v3 Christoph Hellwig
@ 2021-08-25 16:19 ` Christoph Hellwig
  2021-08-26  4:31   ` Tian, Kevin
  0 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-08-25 16:19 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm, Jason Gunthorpe

The external_domain concept rather misleading and not actually needed.
Replace it with a list of emulated groups in struct vfio_iommu and
document the purpose.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 120 ++++++++++++++------------------
 1 file changed, 54 insertions(+), 66 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index ca3c995c84166f..871cd2867999cb 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -65,7 +65,6 @@ MODULE_PARM_DESC(dma_entry_limit,
 struct vfio_iommu {
 	struct list_head	domain_list;
 	struct list_head	iova_list;
-	struct vfio_domain	*external_domain; /* domain for external user */
 	struct mutex		lock;
 	struct rb_root		dma_list;
 	struct blocking_notifier_head notifier;
@@ -78,6 +77,8 @@ struct vfio_iommu {
 	bool			nesting;
 	bool			dirty_page_tracking;
 	bool			container_open;
+	/* fake iommu groups to support devices with an emulated IOMMU: */
+	struct list_head	emulated_iommu_groups;
 };
 
 struct vfio_domain {
@@ -1892,8 +1893,8 @@ static struct vfio_iommu_group*
 vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 			    struct iommu_group *iommu_group)
 {
+	struct vfio_iommu_group *group;
 	struct vfio_domain *domain;
-	struct vfio_iommu_group *group = NULL;
 
 	list_for_each_entry(domain, &iommu->domain_list, next) {
 		group = find_iommu_group(domain, iommu_group);
@@ -1901,10 +1902,10 @@ vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 			return group;
 	}
 
-	if (iommu->external_domain)
-		group = find_iommu_group(iommu->external_domain, iommu_group);
-
-	return group;
+	list_for_each_entry(group, &iommu->emulated_iommu_groups, next)
+		if (group->iommu_group == iommu_group)
+			return group;
+	return NULL;
 }
 
 static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
@@ -2163,45 +2164,27 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	struct vfio_iommu_group *group;
 	struct vfio_domain *domain, *d;
 	struct bus_type *bus = NULL;
-	int ret;
 	bool resv_msi, msi_remap;
 	phys_addr_t resv_msi_base = 0;
 	struct iommu_domain_geometry *geo;
 	LIST_HEAD(iova_copy);
 	LIST_HEAD(group_resv_regions);
+	int ret = -EINVAL;
 
 	mutex_lock(&iommu->lock);
 
 	/* Check for duplicates */
-	if (vfio_iommu_find_iommu_group(iommu, iommu_group)) {
-		mutex_unlock(&iommu->lock);
-		return -EINVAL;
-	}
+	if (vfio_iommu_find_iommu_group(iommu, iommu_group))
+		goto out_unlock;
 
+	ret = -ENOMEM;
 	group = kzalloc(sizeof(*group), GFP_KERNEL);
-	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
-	if (!group || !domain) {
-		ret = -ENOMEM;
-		goto out_free;
-	}
-
+	if (!group)
+		goto out_unlock;
 	group->iommu_group = iommu_group;
 
-	/* Determine bus_type in order to allocate a domain */
-	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
-	if (ret)
-		goto out_free;
-
 	if (flags & VFIO_EMULATED_IOMMU) {
-		if (!iommu->external_domain) {
-			INIT_LIST_HEAD(&domain->group_list);
-			iommu->external_domain = domain;
-			vfio_update_pgsize_bitmap(iommu);
-		} else {
-			kfree(domain);
-		}
-
-		list_add(&group->next, &iommu->external_domain->group_list);
+		list_add(&group->next, &iommu->emulated_iommu_groups);
 		/*
 		 * Non-iommu backed group cannot dirty memory directly, it can
 		 * only use interfaces that provide dirty tracking.
@@ -2209,16 +2192,24 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		 * dirty tracking group.
 		 */
 		group->pinned_page_dirty_scope = true;
-		mutex_unlock(&iommu->lock);
-
-		return 0;
+		ret = 0;
+		goto out_unlock;
 	}
 
+	/* Determine bus_type in order to allocate a domain */
+	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
+	if (ret)
+		goto out_free_group;
+
+	ret = -ENOMEM;
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (!domain)
+		goto out_free_group;
+
+	ret = -EIO;
 	domain->domain = iommu_domain_alloc(bus);
-	if (!domain->domain) {
-		ret = -EIO;
-		goto out_free;
-	}
+	if (!domain->domain)
+		goto out_free_domain;
 
 	if (iommu->nesting) {
 		ret = iommu_enable_nesting(domain->domain);
@@ -2345,9 +2336,11 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	iommu_domain_free(domain->domain);
 	vfio_iommu_iova_free(&iova_copy);
 	vfio_iommu_resv_free(&group_resv_regions);
-out_free:
+out_free_domain:
 	kfree(domain);
+out_free_group:
 	kfree(group);
+out_unlock:
 	mutex_unlock(&iommu->lock);
 	return ret;
 }
@@ -2472,25 +2465,19 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	LIST_HEAD(iova_copy);
 
 	mutex_lock(&iommu->lock);
+	list_for_each_entry(group, &iommu->emulated_iommu_groups, next) {
+		if (group->iommu_group != iommu_group)
+			continue;
+		update_dirty_scope = !group->pinned_page_dirty_scope;
+		list_del(&group->next);
+		kfree(group);
 
-	if (iommu->external_domain) {
-		group = find_iommu_group(iommu->external_domain, iommu_group);
-		if (group) {
-			update_dirty_scope = !group->pinned_page_dirty_scope;
-			list_del(&group->next);
-			kfree(group);
-
-			if (list_empty(&iommu->external_domain->group_list)) {
-				if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
-					WARN_ON(iommu->notifier.head);
-					vfio_iommu_unmap_unpin_all(iommu);
-				}
-
-				kfree(iommu->external_domain);
-				iommu->external_domain = NULL;
-			}
-			goto detach_group_done;
+		if (list_empty(&iommu->emulated_iommu_groups) &&
+		    !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
+			WARN_ON(iommu->notifier.head);
+			vfio_iommu_unmap_unpin_all(iommu);
 		}
+		goto detach_group_done;
 	}
 
 	/*
@@ -2518,7 +2505,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 		 */
 		if (list_empty(&domain->group_list)) {
 			if (list_is_singular(&iommu->domain_list)) {
-				if (!iommu->external_domain) {
+				if (list_empty(&iommu->emulated_iommu_groups)) {
 					WARN_ON(iommu->notifier.head);
 					vfio_iommu_unmap_unpin_all(iommu);
 				} else {
@@ -2582,41 +2569,42 @@ static void *vfio_iommu_type1_open(unsigned long arg)
 	mutex_init(&iommu->lock);
 	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
 	init_waitqueue_head(&iommu->vaddr_wait);
+	INIT_LIST_HEAD(&iommu->emulated_iommu_groups);
 
 	return iommu;
 }
 
-static void vfio_release_domain(struct vfio_domain *domain, bool external)
+static void vfio_release_domain(struct vfio_domain *domain)
 {
 	struct vfio_iommu_group *group, *group_tmp;
 
 	list_for_each_entry_safe(group, group_tmp,
 				 &domain->group_list, next) {
-		if (!external)
-			iommu_detach_group(domain->domain, group->iommu_group);
+		iommu_detach_group(domain->domain, group->iommu_group);
 		list_del(&group->next);
 		kfree(group);
 	}
 
-	if (!external)
-		iommu_domain_free(domain->domain);
+	iommu_domain_free(domain->domain);
 }
 
 static void vfio_iommu_type1_release(void *iommu_data)
 {
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_domain *domain, *domain_tmp;
+	struct vfio_iommu_group *group, *next_group;
 
-	if (iommu->external_domain) {
-		vfio_release_domain(iommu->external_domain, true);
-		kfree(iommu->external_domain);
+	list_for_each_entry_safe(group, next_group,
+			&iommu->emulated_iommu_groups, next) {
+		list_del(&group->next);
+		kfree(group);
 	}
 
 	vfio_iommu_unmap_unpin_all(iommu);
 
 	list_for_each_entry_safe(domain, domain_tmp,
 				 &iommu->domain_list, next) {
-		vfio_release_domain(domain, false);
+		vfio_release_domain(domain);
 		list_del(&domain->next);
 		kfree(domain);
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 13/14] vfio/iommu_type1: remove the "external" domain
  2021-08-25  0:36   ` Jason Gunthorpe
@ 2021-08-25  5:35     ` Christoph Hellwig
  0 siblings, 0 replies; 43+ messages in thread
From: Christoph Hellwig @ 2021-08-25  5:35 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Diana Craciun, Cornelia Huck,
	Kirti Wankhede, Eric Auger, kvm

On Tue, Aug 24, 2021 at 09:36:29PM -0300, Jason Gunthorpe wrote:
> > +	/*
> > +	 * Tracks the fake iommu groups created by vfio to support mediated
> > +	 * devices.  These are not backed by an actual IOMMU.
> > +	 */
> > +	struct list_head	mediated_groups;
> 
> Though again I'd prefer another name to mediated here.. These are
> domainless groups?

I think they should use whatever named we use for/instead of the mediated
groups as these are the same concept.  Using an unrelated name like
external did causes major confusion.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 13/14] vfio/iommu_type1: remove the "external" domain
  2021-08-24 14:46 ` [PATCH 13/14] vfio/iommu_type1: remove the "external" domain Christoph Hellwig
@ 2021-08-25  0:36   ` Jason Gunthorpe
  2021-08-25  5:35     ` Christoph Hellwig
  0 siblings, 1 reply; 43+ messages in thread
From: Jason Gunthorpe @ 2021-08-25  0:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Diana Craciun, Cornelia Huck, Kirti Wankhede,
	Eric Auger, kvm

On Tue, Aug 24, 2021 at 04:46:48PM +0200, Christoph Hellwig wrote:
> The external_domain concept rather misleading and not actually needed.
> Replace it with a list of mediated groups in struct vfio_iommu and
> document the purpose.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 123 +++++++++++++++-----------------
>  1 file changed, 57 insertions(+), 66 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

> +	/*
> +	 * Tracks the fake iommu groups created by vfio to support mediated
> +	 * devices.  These are not backed by an actual IOMMU.
> +	 */
> +	struct list_head	mediated_groups;

Though again I'd prefer another name to mediated here.. These are
domainless groups?

Jason

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 13/14] vfio/iommu_type1: remove the "external" domain
  2021-08-24 14:46 cleanup vfio iommu_group creation v2 Christoph Hellwig
@ 2021-08-24 14:46 ` Christoph Hellwig
  2021-08-25  0:36   ` Jason Gunthorpe
  0 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2021-08-24 14:46 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm

The external_domain concept rather misleading and not actually needed.
Replace it with a list of mediated groups in struct vfio_iommu and
document the purpose.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/vfio/vfio_iommu_type1.c | 123 +++++++++++++++-----------------
 1 file changed, 57 insertions(+), 66 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 44a3abdca580a0..205f13c05b236e 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -65,7 +65,6 @@ MODULE_PARM_DESC(dma_entry_limit,
 struct vfio_iommu {
 	struct list_head	domain_list;
 	struct list_head	iova_list;
-	struct vfio_domain	*external_domain; /* domain for external user */
 	struct mutex		lock;
 	struct rb_root		dma_list;
 	struct blocking_notifier_head notifier;
@@ -78,6 +77,12 @@ struct vfio_iommu {
 	bool			nesting;
 	bool			dirty_page_tracking;
 	bool			container_open;
+
+	/*
+	 * Tracks the fake iommu groups created by vfio to support mediated
+	 * devices.  These are not backed by an actual IOMMU.
+	 */
+	struct list_head	mediated_groups;
 };
 
 struct vfio_domain {
@@ -1892,8 +1897,8 @@ static struct vfio_iommu_group*
 vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 			    struct iommu_group *iommu_group)
 {
+	struct vfio_iommu_group *group;
 	struct vfio_domain *domain;
-	struct vfio_iommu_group *group = NULL;
 
 	list_for_each_entry(domain, &iommu->domain_list, next) {
 		group = find_iommu_group(domain, iommu_group);
@@ -1901,10 +1906,10 @@ vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 			return group;
 	}
 
-	if (iommu->external_domain)
-		group = find_iommu_group(iommu->external_domain, iommu_group);
-
-	return group;
+	list_for_each_entry(group, &iommu->mediated_groups, next)
+		if (group->iommu_group == iommu_group)
+			return group;
+	return NULL;
 }
 
 static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
@@ -2163,45 +2168,27 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	struct vfio_iommu_group *group;
 	struct vfio_domain *domain, *d;
 	struct bus_type *bus = NULL;
-	int ret;
 	bool resv_msi, msi_remap;
 	phys_addr_t resv_msi_base = 0;
 	struct iommu_domain_geometry *geo;
 	LIST_HEAD(iova_copy);
 	LIST_HEAD(group_resv_regions);
+	int ret = -EINVAL;
 
 	mutex_lock(&iommu->lock);
 
 	/* Check for duplicates */
-	if (vfio_iommu_find_iommu_group(iommu, iommu_group)) {
-		mutex_unlock(&iommu->lock);
-		return -EINVAL;
-	}
+	if (vfio_iommu_find_iommu_group(iommu, iommu_group))
+		goto out_unlock;
 
+	ret = -ENOMEM;
 	group = kzalloc(sizeof(*group), GFP_KERNEL);
-	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
-	if (!group || !domain) {
-		ret = -ENOMEM;
-		goto out_free;
-	}
-
+	if (!group)
+		goto out_unlock;
 	group->iommu_group = iommu_group;
 
-	/* Determine bus_type in order to allocate a domain */
-	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
-	if (ret)
-		goto out_free;
-
 	if (flags & VFIO_MEDIATED) {
-		if (!iommu->external_domain) {
-			INIT_LIST_HEAD(&domain->group_list);
-			iommu->external_domain = domain;
-			vfio_update_pgsize_bitmap(iommu);
-		} else {
-			kfree(domain);
-		}
-
-		list_add(&group->next, &iommu->external_domain->group_list);
+		list_add(&group->next, &iommu->mediated_groups);
 		/*
 		 * Non-iommu backed group cannot dirty memory directly, it can
 		 * only use interfaces that provide dirty tracking.
@@ -2209,16 +2196,24 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		 * dirty tracking group.
 		 */
 		group->pinned_page_dirty_scope = true;
-		mutex_unlock(&iommu->lock);
-
-		return 0;
+		ret = 0;
+		goto out_unlock;
 	}
 
+	/* Determine bus_type in order to allocate a domain */
+	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
+	if (ret)
+		goto out_free_group;
+
+	ret = -ENOMEM;
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (!domain)
+		goto out_free_group;
+
+	ret = -EIO;
 	domain->domain = iommu_domain_alloc(bus);
-	if (!domain->domain) {
-		ret = -EIO;
-		goto out_free;
-	}
+	if (!domain->domain)
+		goto out_free_domain;
 
 	if (iommu->nesting) {
 		ret = iommu_enable_nesting(domain->domain);
@@ -2345,9 +2340,11 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	iommu_domain_free(domain->domain);
 	vfio_iommu_iova_free(&iova_copy);
 	vfio_iommu_resv_free(&group_resv_regions);
-out_free:
+out_free_domain:
 	kfree(domain);
+out_free_group:
 	kfree(group);
+out_unlock:
 	mutex_unlock(&iommu->lock);
 	return ret;
 }
@@ -2472,25 +2469,19 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	LIST_HEAD(iova_copy);
 
 	mutex_lock(&iommu->lock);
+	list_for_each_entry(group, &iommu->mediated_groups, next) {
+		if (group->iommu_group != iommu_group)
+			continue;
+		update_dirty_scope = !group->pinned_page_dirty_scope;
+		list_del(&group->next);
+		kfree(group);
 
-	if (iommu->external_domain) {
-		group = find_iommu_group(iommu->external_domain, iommu_group);
-		if (group) {
-			update_dirty_scope = !group->pinned_page_dirty_scope;
-			list_del(&group->next);
-			kfree(group);
-
-			if (list_empty(&iommu->external_domain->group_list)) {
-				if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
-					WARN_ON(iommu->notifier.head);
-					vfio_iommu_unmap_unpin_all(iommu);
-				}
-
-				kfree(iommu->external_domain);
-				iommu->external_domain = NULL;
-			}
-			goto detach_group_done;
+		if (list_empty(&iommu->mediated_groups) &&
+		    !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
+			WARN_ON(iommu->notifier.head);
+			vfio_iommu_unmap_unpin_all(iommu);
 		}
+		goto detach_group_done;
 	}
 
 	/*
@@ -2518,7 +2509,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 		 */
 		if (list_empty(&domain->group_list)) {
 			if (list_is_singular(&iommu->domain_list)) {
-				if (!iommu->external_domain) {
+				if (list_empty(&iommu->mediated_groups)) {
 					WARN_ON(iommu->notifier.head);
 					vfio_iommu_unmap_unpin_all(iommu);
 				} else {
@@ -2582,41 +2573,41 @@ static void *vfio_iommu_type1_open(unsigned long arg)
 	mutex_init(&iommu->lock);
 	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
 	init_waitqueue_head(&iommu->vaddr_wait);
+	INIT_LIST_HEAD(&iommu->mediated_groups);
 
 	return iommu;
 }
 
-static void vfio_release_domain(struct vfio_domain *domain, bool external)
+static void vfio_release_domain(struct vfio_domain *domain)
 {
 	struct vfio_iommu_group *group, *group_tmp;
 
 	list_for_each_entry_safe(group, group_tmp,
 				 &domain->group_list, next) {
-		if (!external)
-			iommu_detach_group(domain->domain, group->iommu_group);
+		iommu_detach_group(domain->domain, group->iommu_group);
 		list_del(&group->next);
 		kfree(group);
 	}
 
-	if (!external)
-		iommu_domain_free(domain->domain);
+	iommu_domain_free(domain->domain);
 }
 
 static void vfio_iommu_type1_release(void *iommu_data)
 {
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_domain *domain, *domain_tmp;
+	struct vfio_iommu_group *group, *n;
 
-	if (iommu->external_domain) {
-		vfio_release_domain(iommu->external_domain, true);
-		kfree(iommu->external_domain);
+	list_for_each_entry_safe(group, n, &iommu->mediated_groups, next) {
+		list_del(&group->next);
+		kfree(group);
 	}
 
 	vfio_iommu_unmap_unpin_all(iommu);
 
 	list_for_each_entry_safe(domain, domain_tmp,
 				 &iommu->domain_list, next) {
-		vfio_release_domain(domain, false);
+		vfio_release_domain(domain);
 		list_del(&domain->next);
 		kfree(domain);
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 13/14] vfio/iommu_type1: remove the "external" domain
  2021-08-11 15:14 cleanup vfio iommu_group creation Christoph Hellwig
@ 2021-08-11 15:14 ` Christoph Hellwig
  0 siblings, 0 replies; 43+ messages in thread
From: Christoph Hellwig @ 2021-08-11 15:14 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm

The external_domain concept rather misleading and not actually needed.
Replace it with a list of mediated groups in struct vfio_iommu and
document the purpose.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/vfio/vfio_iommu_type1.c | 123 +++++++++++++++-----------------
 1 file changed, 57 insertions(+), 66 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 44a3abdca580a0..205f13c05b236e 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -65,7 +65,6 @@ MODULE_PARM_DESC(dma_entry_limit,
 struct vfio_iommu {
 	struct list_head	domain_list;
 	struct list_head	iova_list;
-	struct vfio_domain	*external_domain; /* domain for external user */
 	struct mutex		lock;
 	struct rb_root		dma_list;
 	struct blocking_notifier_head notifier;
@@ -78,6 +77,12 @@ struct vfio_iommu {
 	bool			nesting;
 	bool			dirty_page_tracking;
 	bool			container_open;
+
+	/*
+	 * Tracks the fake iommu groups created by vfio to support mediated
+	 * devices.  These are not backed by an actual IOMMU.
+	 */
+	struct list_head	mediated_groups;
 };
 
 struct vfio_domain {
@@ -1892,8 +1897,8 @@ static struct vfio_iommu_group*
 vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 			    struct iommu_group *iommu_group)
 {
+	struct vfio_iommu_group *group;
 	struct vfio_domain *domain;
-	struct vfio_iommu_group *group = NULL;
 
 	list_for_each_entry(domain, &iommu->domain_list, next) {
 		group = find_iommu_group(domain, iommu_group);
@@ -1901,10 +1906,10 @@ vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 			return group;
 	}
 
-	if (iommu->external_domain)
-		group = find_iommu_group(iommu->external_domain, iommu_group);
-
-	return group;
+	list_for_each_entry(group, &iommu->mediated_groups, next)
+		if (group->iommu_group == iommu_group)
+			return group;
+	return NULL;
 }
 
 static bool vfio_iommu_has_sw_msi(struct list_head *group_resv_regions,
@@ -2163,45 +2168,27 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	struct vfio_iommu_group *group;
 	struct vfio_domain *domain, *d;
 	struct bus_type *bus = NULL;
-	int ret;
 	bool resv_msi, msi_remap;
 	phys_addr_t resv_msi_base = 0;
 	struct iommu_domain_geometry *geo;
 	LIST_HEAD(iova_copy);
 	LIST_HEAD(group_resv_regions);
+	int ret = -EINVAL;
 
 	mutex_lock(&iommu->lock);
 
 	/* Check for duplicates */
-	if (vfio_iommu_find_iommu_group(iommu, iommu_group)) {
-		mutex_unlock(&iommu->lock);
-		return -EINVAL;
-	}
+	if (vfio_iommu_find_iommu_group(iommu, iommu_group))
+		goto out_unlock;
 
+	ret = -ENOMEM;
 	group = kzalloc(sizeof(*group), GFP_KERNEL);
-	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
-	if (!group || !domain) {
-		ret = -ENOMEM;
-		goto out_free;
-	}
-
+	if (!group)
+		goto out_unlock;
 	group->iommu_group = iommu_group;
 
-	/* Determine bus_type in order to allocate a domain */
-	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
-	if (ret)
-		goto out_free;
-
 	if (flags & VFIO_MEDIATED) {
-		if (!iommu->external_domain) {
-			INIT_LIST_HEAD(&domain->group_list);
-			iommu->external_domain = domain;
-			vfio_update_pgsize_bitmap(iommu);
-		} else {
-			kfree(domain);
-		}
-
-		list_add(&group->next, &iommu->external_domain->group_list);
+		list_add(&group->next, &iommu->mediated_groups);
 		/*
 		 * Non-iommu backed group cannot dirty memory directly, it can
 		 * only use interfaces that provide dirty tracking.
@@ -2209,16 +2196,24 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		 * dirty tracking group.
 		 */
 		group->pinned_page_dirty_scope = true;
-		mutex_unlock(&iommu->lock);
-
-		return 0;
+		ret = 0;
+		goto out_unlock;
 	}
 
+	/* Determine bus_type in order to allocate a domain */
+	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
+	if (ret)
+		goto out_free_group;
+
+	ret = -ENOMEM;
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (!domain)
+		goto out_free_group;
+
+	ret = -EIO;
 	domain->domain = iommu_domain_alloc(bus);
-	if (!domain->domain) {
-		ret = -EIO;
-		goto out_free;
-	}
+	if (!domain->domain)
+		goto out_free_domain;
 
 	if (iommu->nesting) {
 		ret = iommu_enable_nesting(domain->domain);
@@ -2345,9 +2340,11 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	iommu_domain_free(domain->domain);
 	vfio_iommu_iova_free(&iova_copy);
 	vfio_iommu_resv_free(&group_resv_regions);
-out_free:
+out_free_domain:
 	kfree(domain);
+out_free_group:
 	kfree(group);
+out_unlock:
 	mutex_unlock(&iommu->lock);
 	return ret;
 }
@@ -2472,25 +2469,19 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	LIST_HEAD(iova_copy);
 
 	mutex_lock(&iommu->lock);
+	list_for_each_entry(group, &iommu->mediated_groups, next) {
+		if (group->iommu_group != iommu_group)
+			continue;
+		update_dirty_scope = !group->pinned_page_dirty_scope;
+		list_del(&group->next);
+		kfree(group);
 
-	if (iommu->external_domain) {
-		group = find_iommu_group(iommu->external_domain, iommu_group);
-		if (group) {
-			update_dirty_scope = !group->pinned_page_dirty_scope;
-			list_del(&group->next);
-			kfree(group);
-
-			if (list_empty(&iommu->external_domain->group_list)) {
-				if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
-					WARN_ON(iommu->notifier.head);
-					vfio_iommu_unmap_unpin_all(iommu);
-				}
-
-				kfree(iommu->external_domain);
-				iommu->external_domain = NULL;
-			}
-			goto detach_group_done;
+		if (list_empty(&iommu->mediated_groups) &&
+		    !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
+			WARN_ON(iommu->notifier.head);
+			vfio_iommu_unmap_unpin_all(iommu);
 		}
+		goto detach_group_done;
 	}
 
 	/*
@@ -2518,7 +2509,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 		 */
 		if (list_empty(&domain->group_list)) {
 			if (list_is_singular(&iommu->domain_list)) {
-				if (!iommu->external_domain) {
+				if (list_empty(&iommu->mediated_groups)) {
 					WARN_ON(iommu->notifier.head);
 					vfio_iommu_unmap_unpin_all(iommu);
 				} else {
@@ -2582,41 +2573,41 @@ static void *vfio_iommu_type1_open(unsigned long arg)
 	mutex_init(&iommu->lock);
 	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
 	init_waitqueue_head(&iommu->vaddr_wait);
+	INIT_LIST_HEAD(&iommu->mediated_groups);
 
 	return iommu;
 }
 
-static void vfio_release_domain(struct vfio_domain *domain, bool external)
+static void vfio_release_domain(struct vfio_domain *domain)
 {
 	struct vfio_iommu_group *group, *group_tmp;
 
 	list_for_each_entry_safe(group, group_tmp,
 				 &domain->group_list, next) {
-		if (!external)
-			iommu_detach_group(domain->domain, group->iommu_group);
+		iommu_detach_group(domain->domain, group->iommu_group);
 		list_del(&group->next);
 		kfree(group);
 	}
 
-	if (!external)
-		iommu_domain_free(domain->domain);
+	iommu_domain_free(domain->domain);
 }
 
 static void vfio_iommu_type1_release(void *iommu_data)
 {
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_domain *domain, *domain_tmp;
+	struct vfio_iommu_group *group, *n;
 
-	if (iommu->external_domain) {
-		vfio_release_domain(iommu->external_domain, true);
-		kfree(iommu->external_domain);
+	list_for_each_entry_safe(group, n, &iommu->mediated_groups, next) {
+		list_del(&group->next);
+		kfree(group);
 	}
 
 	vfio_iommu_unmap_unpin_all(iommu);
 
 	list_for_each_entry_safe(domain, domain_tmp,
 				 &iommu->domain_list, next) {
-		vfio_release_domain(domain, false);
+		vfio_release_domain(domain);
 		list_del(&domain->next);
 		kfree(domain);
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2021-10-01  3:20 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
2021-09-13  7:15 ` [PATCH 01/14] vfio: Move vfio_iommu_group_get() to vfio_register_group_dev() Christoph Hellwig
2021-09-14  1:57   ` Tian, Kevin
2021-09-14  5:46     ` Christoph Hellwig
2021-09-14  6:11       ` Tian, Kevin
2021-09-13  7:15 ` [PATCH 02/14] vfio: factor out a vfio_iommu_driver_allowed helper Christoph Hellwig
2021-09-23 22:59   ` Alex Williamson
2021-09-13  7:15 ` [PATCH 03/14] vfio: remove the iommudata check in vfio_noiommu_attach_group Christoph Hellwig
2021-09-13  7:15 ` [PATCH 04/14] vfio: factor out a vfio_group_find_or_alloc helper Christoph Hellwig
2021-09-14  2:00   ` Tian, Kevin
2021-09-13  7:15 ` [PATCH 05/14] vfio: refactor noiommu group creation Christoph Hellwig
2021-09-14  2:07   ` Tian, Kevin
2021-09-13  7:15 ` [PATCH 06/14] vfio: remove the iommudata hack for noiommu groups Christoph Hellwig
2021-09-15 16:44   ` Jason Gunthorpe
2021-09-13  7:15 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
2021-09-14  2:23   ` Tian, Kevin
2021-09-13  7:16 ` [PATCH 08/14] vfio: remove unused method from vfio_iommu_driver_ops Christoph Hellwig
2021-09-13  7:16 ` [PATCH 09/14] vfio: move the vfio_iommu_driver_ops interface out of <linux/vfio.h> Christoph Hellwig
2021-09-13  7:16 ` [PATCH 10/14] vfio: remove the unused mdev iommu hook Christoph Hellwig
2021-09-13  7:16 ` [PATCH 11/14] vfio: clean up the check for mediated device in vfio_iommu_type1 Christoph Hellwig
2021-09-16 19:55   ` Kirti Wankhede
2021-09-16 22:18     ` Jason Gunthorpe
2021-09-17  4:49       ` Tian, Kevin
2021-09-17  5:05         ` Christoph Hellwig
2021-09-17  6:51           ` Kirti Wankhede
2021-09-17 12:53             ` Jason Gunthorpe
2021-09-30 16:46               ` Alex Williamson
2021-09-30 16:57                 ` Jason Gunthorpe
2021-10-01  3:20                   ` Christoph Hellwig
2021-09-13  7:16 ` [PATCH 12/14] vfio/spapr_tce: reject mediated devices Christoph Hellwig
2021-09-13  7:16 ` [PATCH 13/14] vfio/iommu_type1: remove the "external" domain Christoph Hellwig
2021-09-23 22:59   ` Alex Williamson
2021-09-23 23:06     ` Jason Gunthorpe
2021-09-13  7:16 ` [PATCH 14/14] vfio/iommu_type1: remove IS_IOMMU_CAP_DOMAIN_IN_CONTAINER Christoph Hellwig
2021-09-15 18:07 ` cleanup vfio iommu_group creation v5 Jason Gunthorpe
  -- strict thread matches above, loose matches on Subject: below --
2021-08-26 13:34 cleanup vfio iommu_group creation v4 Christoph Hellwig
2021-08-26 13:34 ` [PATCH 13/14] vfio/iommu_type1: remove the "external" domain Christoph Hellwig
2021-08-26 23:08   ` Alex Williamson
2021-08-25 16:19 cleanup vfio iommu_group creation v3 Christoph Hellwig
2021-08-25 16:19 ` [PATCH 13/14] vfio/iommu_type1: remove the "external" domain Christoph Hellwig
2021-08-26  4:31   ` Tian, Kevin
2021-08-24 14:46 cleanup vfio iommu_group creation v2 Christoph Hellwig
2021-08-24 14:46 ` [PATCH 13/14] vfio/iommu_type1: remove the "external" domain Christoph Hellwig
2021-08-25  0:36   ` Jason Gunthorpe
2021-08-25  5:35     ` Christoph Hellwig
2021-08-11 15:14 cleanup vfio iommu_group creation Christoph Hellwig
2021-08-11 15:14 ` [PATCH 13/14] vfio/iommu_type1: remove the "external" domain Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).