linux-arm-msm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/5] Simplify vfio_iommu_type1 attach/detach routine
@ 2022-06-30 20:36 Nicolin Chen
  2022-06-30 20:36 ` [PATCH v4 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group Nicolin Chen
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Nicolin Chen @ 2022-06-30 20:36 UTC (permalink / raw)
  To: joro, will, marcan, sven, robin.murphy, robdclark, baolu.lu,
	orsonzhai, baolin.wang7, zhang.lyra, jean-philippe,
	alex.williamson, jgg, kevin.tian
  Cc: suravee.suthikulpanit, alyssa, dwmw2, mjrosato, gerald.schaefer,
	thierry.reding, vdumpa, jonathanh, cohuck, thunder.leizhen,
	christophe.jaillet, chenxiang66, john.garry, yangyingliang,
	iommu, iommu, linux-kernel, linux-arm-kernel, linux-arm-msm,
	linux-s390, linux-tegra, virtualization, kvm

This is a preparatory series for IOMMUFD v2 patches. It enforces error
code -EMEDIUMTYPE in iommu_attach_device() and iommu_attach_group() when
an IOMMU domain and a device/group are incompatible. It also drops the
useless domain->ops check since it won't fail in current environment.

These allow VFIO iommu code to simplify its group attachment routine, by
avoiding the extra IOMMU domain allocations and attach/detach sequences
of the old code.

Worths mentioning the exact match for enforce_cache_coherency is removed
with this series, since there's very less value in doing that as KVM will
not be able to take advantage of it -- this just wastes domain memory.
Instead, we rely on Intel IOMMU driver taking care of that internally.

This is on github:
https://github.com/nicolinc/iommufd/commits/vfio_iommu_attach

Changelog
v4:
 * Dropped EMEDIUMTYPE change in mtk_v1 driver per Robin's input
 * Added Baolu's and Kevin's Reviewed-by lines
v3: https://lore.kernel.org/kvm/20220623200029.26007-1-nicolinc@nvidia.com/
 * Dropped all dev_err since -EMEDIUMTYPE clearly indicates what error.
 * Updated commit message of enforce_cache_coherency removing patch.
 * Updated commit message of domain->ops removing patch.
 * Replaced "goto out_unlock" with simply mutex_unlock() and return.
 * Added a line of comments for -EMEDIUMTYPE return check.
 * Moved iommu_get_msi_cookie() into alloc_attach_domain() as a cookie
   should be logically tied to the lifetime of a domain itself.
 * Added Kevin's "Reviewed-by".
v2: https://lore.kernel.org/kvm/20220616000304.23890-1-nicolinc@nvidia.com/
 * Added -EMEDIUMTYPE to more IOMMU drivers that fit the category.
 * Changed dev_err to dev_dbg for -EMEDIUMTYPE to avoid kernel log spam.
 * Dropped iommu_ops patch, and removed domain->ops in VFIO directly,
   since there's no mixed-driver use case that would fail the sanity.
 * Updated commit log of the patch removing enforce_cache_coherency.
 * Fixed a misplace of "num_non_pinned_groups--" in detach_group patch.
 * Moved "num_non_pinned_groups++" in PATCH-5 to the common path between
   domain-reusing and new-domain pathways, like the code previously did.
 * Fixed a typo in EMEDIUMTYPE patch.
v1: https://lore.kernel.org/kvm/20220606061927.26049-1-nicolinc@nvidia.com/

Jason Gunthorpe (1):
  vfio/iommu_type1: Prefer to reuse domains vs match enforced cache
    coherency

Nicolin Chen (4):
  iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  vfio/iommu_type1: Remove the domain->ops comparison
  vfio/iommu_type1: Clean up update_dirty_scope in detach_group()
  vfio/iommu_type1: Simplify group attachment

 drivers/iommu/amd/iommu.c                   |   2 +-
 drivers/iommu/apple-dart.c                  |   4 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  15 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |   6 +-
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     |   9 +-
 drivers/iommu/intel/iommu.c                 |  10 +-
 drivers/iommu/iommu.c                       |  28 ++
 drivers/iommu/ipmmu-vmsa.c                  |   4 +-
 drivers/iommu/omap-iommu.c                  |   3 +-
 drivers/iommu/s390-iommu.c                  |   2 +-
 drivers/iommu/sprd-iommu.c                  |   6 +-
 drivers/iommu/tegra-gart.c                  |   2 +-
 drivers/iommu/virtio-iommu.c                |   3 +-
 drivers/vfio/vfio_iommu_type1.c             | 340 ++++++++++----------
 14 files changed, 224 insertions(+), 210 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v4 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-06-30 20:36 [PATCH v4 0/5] Simplify vfio_iommu_type1 attach/detach routine Nicolin Chen
@ 2022-06-30 20:36 ` Nicolin Chen
  2022-07-01  6:53   ` Baolu Lu
  2022-07-01 10:21   ` Robin Murphy
  2022-06-30 20:36 ` [PATCH v4 2/5] vfio/iommu_type1: Prefer to reuse domains vs match enforced cache coherency Nicolin Chen
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 11+ messages in thread
From: Nicolin Chen @ 2022-06-30 20:36 UTC (permalink / raw)
  To: joro, will, marcan, sven, robin.murphy, robdclark, baolu.lu,
	orsonzhai, baolin.wang7, zhang.lyra, jean-philippe,
	alex.williamson, jgg, kevin.tian
  Cc: suravee.suthikulpanit, alyssa, dwmw2, mjrosato, gerald.schaefer,
	thierry.reding, vdumpa, jonathanh, cohuck, thunder.leizhen,
	christophe.jaillet, chenxiang66, john.garry, yangyingliang,
	iommu, iommu, linux-kernel, linux-arm-kernel, linux-arm-msm,
	linux-s390, linux-tegra, virtualization, kvm

Cases like VFIO wish to attach a device to an existing domain that was
not allocated specifically from the device. This raises a condition
where the IOMMU driver can fail the domain attach because the domain and
device are incompatible with each other.

This is a soft failure that can be resolved by using a different domain.

Provide a dedicated errno from the IOMMU driver during attach that the
reason attached failed is because of domain incompatability. EMEDIUMTYPE
is chosen because it is never used within the iommu subsystem today and
evokes a sense that the 'medium' aka the domain is incompatible.

VFIO can use this to know attach is a soft failure and it should continue
searching. Otherwise the attach will be a hard failure and VFIO will
return the code to userspace.

Update all drivers to return EMEDIUMTYPE in their failure paths that are
related to domain incompatability. Also remove adjacent error prints for
these soft failures, to prevent a kernel log spam, since -EMEDIUMTYPE is
clear enough to indicate an incompatability error.

Add kdocs describing this behavior.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/amd/iommu.c                   |  2 +-
 drivers/iommu/apple-dart.c                  |  4 +--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++--------
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |  6 ++---
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     |  9 ++-----
 drivers/iommu/intel/iommu.c                 | 10 +++-----
 drivers/iommu/iommu.c                       | 28 +++++++++++++++++++++
 drivers/iommu/ipmmu-vmsa.c                  |  4 +--
 drivers/iommu/omap-iommu.c                  |  3 +--
 drivers/iommu/s390-iommu.c                  |  2 +-
 drivers/iommu/sprd-iommu.c                  |  6 ++---
 drivers/iommu/tegra-gart.c                  |  2 +-
 drivers/iommu/virtio-iommu.c                |  3 +--
 13 files changed, 48 insertions(+), 46 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a56a9ad3273e..e851c3e91145 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1743,7 +1743,7 @@ static int attach_device(struct device *dev,
 	if (domain->flags & PD_IOMMUV2_MASK) {
 		struct iommu_domain *def_domain = iommu_get_dma_domain(dev);
 
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 		if (def_domain->type != IOMMU_DOMAIN_IDENTITY)
 			goto out;
 
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index 8af0242a90d9..e58dc310afd7 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -495,10 +495,10 @@ static int apple_dart_attach_dev(struct iommu_domain *domain,
 
 	if (cfg->stream_maps[0].dart->force_bypass &&
 	    domain->type != IOMMU_DOMAIN_IDENTITY)
-		return -EINVAL;
+		return -EMEDIUMTYPE;
 	if (!cfg->stream_maps[0].dart->supports_bypass &&
 	    domain->type == IOMMU_DOMAIN_IDENTITY)
-		return -EINVAL;
+		return -EMEDIUMTYPE;
 
 	ret = apple_dart_finalize_domain(domain, cfg);
 	if (ret)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 88817a3376ef..5b64138f549d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2420,24 +2420,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 			goto out_unlock;
 		}
 	} else if (smmu_domain->smmu != smmu) {
-		dev_err(dev,
-			"cannot attach to SMMU %s (upstream of %s)\n",
-			dev_name(smmu_domain->smmu->dev),
-			dev_name(smmu->dev));
-		ret = -ENXIO;
+		ret = -EMEDIUMTYPE;
 		goto out_unlock;
 	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
 		   master->ssid_bits != smmu_domain->s1_cfg.s1cdmax) {
-		dev_err(dev,
-			"cannot attach to incompatible domain (%u SSID bits != %u)\n",
-			smmu_domain->s1_cfg.s1cdmax, master->ssid_bits);
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 		goto out_unlock;
 	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
 		   smmu_domain->stall_enabled != master->stall_enabled) {
-		dev_err(dev, "cannot attach to stall-%s domain\n",
-			smmu_domain->stall_enabled ? "enabled" : "disabled");
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 		goto out_unlock;
 	}
 
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 2ed3594f384e..072cac5ab5a4 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1135,10 +1135,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	struct arm_smmu_device *smmu;
 	int ret;
 
-	if (!fwspec || fwspec->ops != &arm_smmu_ops) {
-		dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n");
-		return -ENXIO;
-	}
+	if (!fwspec || fwspec->ops != &arm_smmu_ops)
+		return -EMEDIUMTYPE;
 
 	/*
 	 * FIXME: The arch/arm DMA API code tries to attach devices to its own
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 4c077c38fbd6..8372f985c14a 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -381,13 +381,8 @@ static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev
 	 * Sanity check the domain. We don't support domains across
 	 * different IOMMUs.
 	 */
-	if (qcom_domain->iommu != qcom_iommu) {
-		dev_err(dev, "cannot attach to IOMMU %s while already "
-			"attached to domain on IOMMU %s\n",
-			dev_name(qcom_domain->iommu->dev),
-			dev_name(qcom_iommu->dev));
-		return -EINVAL;
-	}
+	if (qcom_domain->iommu != qcom_iommu)
+		return -EMEDIUMTYPE;
 
 	return 0;
 }
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 44016594831d..db5fb799e350 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4323,19 +4323,15 @@ static int prepare_domain_attach_device(struct iommu_domain *domain,
 		return -ENODEV;
 
 	if (dmar_domain->force_snooping && !ecap_sc_support(iommu->ecap))
-		return -EOPNOTSUPP;
+		return -EMEDIUMTYPE;
 
 	/* check if this iommu agaw is sufficient for max mapped address */
 	addr_width = agaw_to_width(iommu->agaw);
 	if (addr_width > cap_mgaw(iommu->cap))
 		addr_width = cap_mgaw(iommu->cap);
 
-	if (dmar_domain->max_addr > (1LL << addr_width)) {
-		dev_err(dev, "%s: iommu width (%d) is not "
-		        "sufficient for the mapped address (%llx)\n",
-		        __func__, addr_width, dmar_domain->max_addr);
-		return -EFAULT;
-	}
+	if (dmar_domain->max_addr > (1LL << addr_width))
+		return -EMEDIUMTYPE;
 	dmar_domain->gaw = addr_width;
 
 	/*
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index cdc86c39954e..15e7a2914b5a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1972,6 +1972,20 @@ static int __iommu_attach_device(struct iommu_domain *domain,
 	return ret;
 }
 
+/**
+ * iommu_attach_device - Attach a device to an IOMMU domain
+ * @domain: IOMMU domain to attach
+ * @dev: Device that will be attached
+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Specifically, -EMEDIUMTYPE is returned as a soft failure if the domain and
+ * the device are incompatible in some way. This indicates that a caller should
+ * try another existing IOMMU domain or allocate a new one. And note that it's
+ * recommended to keep kernel print free when reporting -EMEDIUMTYPE error, as
+ * this function can be called to test compatibility with domains that will fail
+ * the test, which will result in a kernel log spam.
+ */
 int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
 {
 	struct iommu_group *group;
@@ -2098,6 +2112,20 @@ static int __iommu_attach_group(struct iommu_domain *domain,
 	return ret;
 }
 
+/**
+ * iommu_attach_group - Attach an IOMMU group to an IOMMU domain
+ * @domain: IOMMU domain to attach
+ * @group: IOMMU group that will be attached
+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Specifically, -EMEDIUMTYPE is returned as a soft failure if the domain and
+ * the device are incompatible in some way. This indicates that a caller should
+ * try another existing IOMMU domain or allocate a new one. And note that it's
+ * recommended to keep kernel print free when reporting -EMEDIUMTYPE error, as
+ * this function can be called to test compatibility with domains that will fail
+ * the test, which will result in a kernel log spam.
+ */
 int iommu_attach_group(struct iommu_domain *domain, struct iommu_group *group)
 {
 	int ret;
diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 1d42084d0276..0103480648cb 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -628,9 +628,7 @@ static int ipmmu_attach_device(struct iommu_domain *io_domain,
 		 * Something is wrong, we can't attach two devices using
 		 * different IOMMUs to the same domain.
 		 */
-		dev_err(dev, "Can't attach IPMMU %s to domain on IPMMU %s\n",
-			dev_name(mmu->dev), dev_name(domain->mmu->dev));
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 	} else
 		dev_info(dev, "Reusing IPMMU context %u\n", domain->context_id);
 
diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index d9cf2820c02e..6bc8925726bf 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1471,8 +1471,7 @@ omap_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
 
 	/* only a single client device can be attached to a domain */
 	if (omap_domain->dev) {
-		dev_err(dev, "iommu domain is already attached\n");
-		ret = -EBUSY;
+		ret = -EMEDIUMTYPE;
 		goto out;
 	}
 
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index c898bcbbce11..ddcb78b284bb 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -127,7 +127,7 @@ static int s390_iommu_attach_device(struct iommu_domain *domain,
 	/* Allow only devices with identical DMA range limits */
 	} else if (domain->geometry.aperture_start != zdev->start_dma ||
 		   domain->geometry.aperture_end != zdev->end_dma) {
-		rc = -EINVAL;
+		rc = -EMEDIUMTYPE;
 		spin_unlock_irqrestore(&s390_domain->list_lock, flags);
 		goto out_restore;
 	}
diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
index bd409bab6286..f6ae230ca1cd 100644
--- a/drivers/iommu/sprd-iommu.c
+++ b/drivers/iommu/sprd-iommu.c
@@ -237,10 +237,8 @@ static int sprd_iommu_attach_device(struct iommu_domain *domain,
 	struct sprd_iommu_domain *dom = to_sprd_domain(domain);
 	size_t pgt_size = sprd_iommu_pgt_size(domain);
 
-	if (dom->sdev) {
-		pr_err("There's already a device attached to this domain.\n");
-		return -EINVAL;
-	}
+	if (dom->sdev)
+		return -EMEDIUMTYPE;
 
 	dom->pgt_va = dma_alloc_coherent(sdev->dev, pgt_size, &dom->pgt_pa, GFP_KERNEL);
 	if (!dom->pgt_va)
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index a6700a40a6f8..011c33e6ae31 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -112,7 +112,7 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
 	spin_lock(&gart->dom_lock);
 
 	if (gart->active_domain && gart->active_domain != domain) {
-		ret = -EBUSY;
+		ret = -EMEDIUMTYPE;
 	} else if (dev_iommu_priv_get(dev) != domain) {
 		dev_iommu_priv_set(dev, domain);
 		gart->active_domain = domain;
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 25be4b822aa0..a41a62dccb4d 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -733,8 +733,7 @@ static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		 */
 		ret = viommu_domain_finalise(vdev, domain);
 	} else if (vdomain->viommu != vdev->viommu) {
-		dev_err(dev, "cannot attach to foreign vIOMMU\n");
-		ret = -EXDEV;
+		ret = -EMEDIUMTYPE;
 	}
 	mutex_unlock(&vdomain->mutex);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 2/5] vfio/iommu_type1: Prefer to reuse domains vs match enforced cache coherency
  2022-06-30 20:36 [PATCH v4 0/5] Simplify vfio_iommu_type1 attach/detach routine Nicolin Chen
  2022-06-30 20:36 ` [PATCH v4 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group Nicolin Chen
@ 2022-06-30 20:36 ` Nicolin Chen
  2022-06-30 20:36 ` [PATCH v4 3/5] vfio/iommu_type1: Remove the domain->ops comparison Nicolin Chen
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Nicolin Chen @ 2022-06-30 20:36 UTC (permalink / raw)
  To: joro, will, marcan, sven, robin.murphy, robdclark, baolu.lu,
	orsonzhai, baolin.wang7, zhang.lyra, jean-philippe,
	alex.williamson, jgg, kevin.tian
  Cc: suravee.suthikulpanit, alyssa, dwmw2, mjrosato, gerald.schaefer,
	thierry.reding, vdumpa, jonathanh, cohuck, thunder.leizhen,
	christophe.jaillet, chenxiang66, john.garry, yangyingliang,
	iommu, iommu, linux-kernel, linux-arm-kernel, linux-arm-msm,
	linux-s390, linux-tegra, virtualization, kvm

From: Jason Gunthorpe <jgg@nvidia.com>

The KVM mechanism for controlling wbinvd is based on OR of the coherency
property of all devices attached to a guest, no matter whether those
devices are attached to a single domain or multiple domains.

On the other hand, the benefit to using separate domains was that those
devices attached to domains supporting enforced cache coherency always
mapped with the attributes necessary to provide that feature, therefore
if a non-enforced domain was dropped, the associated group removal would
re-trigger an evaluation by KVM.

In practice however, the only known cases of such mixed domains included
an Intel IGD device behind an IOMMU lacking snoop control, where such
devices do not support hotplug, therefore this scenario lacks testing and
is not considered sufficiently relevant to support.

After all, KVM won't take advantage of trying to push a device that could
do enforced cache coherency to a dedicated domain vs re-using an existing
domain, which is non-coherent.

Simplify this code and eliminate the test. This removes the only logic
that needed to have a dummy domain attached prior to searching for a
matching domain and simplifies the next patches.

It's unclear whether we want to further optimize the Intel driver to
update the domain coherency after a device is detached from it, at
least not before KVM can be verified to handle such dynamics in related
emulation paths (wbinvd, vcpu load, write_cr0, ept, etc.). In reality
we don't see an usage requiring such optimization as the only device
which imposes such non-coherency is Intel GPU which even doesn't
support hotplug/hot remove.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index c13b9290e357..f4e3b423a453 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2285,9 +2285,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	 * testing if they're on the same bus_type.
 	 */
 	list_for_each_entry(d, &iommu->domain_list, next) {
-		if (d->domain->ops == domain->domain->ops &&
-		    d->enforce_cache_coherency ==
-			    domain->enforce_cache_coherency) {
+		if (d->domain->ops == domain->domain->ops) {
 			iommu_detach_group(domain->domain, group->iommu_group);
 			if (!iommu_attach_group(d->domain,
 						group->iommu_group)) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 3/5] vfio/iommu_type1: Remove the domain->ops comparison
  2022-06-30 20:36 [PATCH v4 0/5] Simplify vfio_iommu_type1 attach/detach routine Nicolin Chen
  2022-06-30 20:36 ` [PATCH v4 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group Nicolin Chen
  2022-06-30 20:36 ` [PATCH v4 2/5] vfio/iommu_type1: Prefer to reuse domains vs match enforced cache coherency Nicolin Chen
@ 2022-06-30 20:36 ` Nicolin Chen
  2022-06-30 20:36 ` [PATCH v4 4/5] vfio/iommu_type1: Clean up update_dirty_scope in detach_group() Nicolin Chen
  2022-06-30 20:36 ` [PATCH v4 5/5] vfio/iommu_type1: Simplify group attachment Nicolin Chen
  4 siblings, 0 replies; 11+ messages in thread
From: Nicolin Chen @ 2022-06-30 20:36 UTC (permalink / raw)
  To: joro, will, marcan, sven, robin.murphy, robdclark, baolu.lu,
	orsonzhai, baolin.wang7, zhang.lyra, jean-philippe,
	alex.williamson, jgg, kevin.tian
  Cc: suravee.suthikulpanit, alyssa, dwmw2, mjrosato, gerald.schaefer,
	thierry.reding, vdumpa, jonathanh, cohuck, thunder.leizhen,
	christophe.jaillet, chenxiang66, john.garry, yangyingliang,
	iommu, iommu, linux-kernel, linux-arm-kernel, linux-arm-msm,
	linux-s390, linux-tegra, virtualization, kvm

The domain->ops validation was added, as a precaution, for mixed-driver
systems.

Per Robin's remarks,
* While bus_set_iommu() still exists, the core code prevents multiple
  drivers from registering, so we can't really run into a situation of
  having a mixed-driver system:
  https://lore.kernel.org/kvm/6e1280c5-4b22-ebb3-3912-6c72bc169982@arm.com/

* And there's plenty more significant problems than this to fix; in future
  when many can be permitted, we will rely on the IOMMU core code to check
  the domain->ops:
  https://lore.kernel.org/kvm/6575de6d-94ba-c427-5b1e-967750ddff23@arm.com/

So remove the check in VFIO for simplicity.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index f4e3b423a453..11be5f95580b 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2277,29 +2277,19 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 			domain->domain->ops->enforce_cache_coherency(
 				domain->domain);
 
-	/*
-	 * Try to match an existing compatible domain.  We don't want to
-	 * preclude an IOMMU driver supporting multiple bus_types and being
-	 * able to include different bus_types in the same IOMMU domain, so
-	 * we test whether the domains use the same iommu_ops rather than
-	 * testing if they're on the same bus_type.
-	 */
+	/* Try to match an existing compatible domain */
 	list_for_each_entry(d, &iommu->domain_list, next) {
-		if (d->domain->ops == domain->domain->ops) {
-			iommu_detach_group(domain->domain, group->iommu_group);
-			if (!iommu_attach_group(d->domain,
-						group->iommu_group)) {
-				list_add(&group->next, &d->group_list);
-				iommu_domain_free(domain->domain);
-				kfree(domain);
-				goto done;
-			}
-
-			ret = iommu_attach_group(domain->domain,
-						 group->iommu_group);
-			if (ret)
-				goto out_domain;
+		iommu_detach_group(domain->domain, group->iommu_group);
+		if (!iommu_attach_group(d->domain, group->iommu_group)) {
+			list_add(&group->next, &d->group_list);
+			iommu_domain_free(domain->domain);
+			kfree(domain);
+			goto done;
 		}
+
+		ret = iommu_attach_group(domain->domain,  group->iommu_group);
+		if (ret)
+			goto out_domain;
 	}
 
 	vfio_test_domain_fgsp(domain);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 4/5] vfio/iommu_type1: Clean up update_dirty_scope in detach_group()
  2022-06-30 20:36 [PATCH v4 0/5] Simplify vfio_iommu_type1 attach/detach routine Nicolin Chen
                   ` (2 preceding siblings ...)
  2022-06-30 20:36 ` [PATCH v4 3/5] vfio/iommu_type1: Remove the domain->ops comparison Nicolin Chen
@ 2022-06-30 20:36 ` Nicolin Chen
  2022-06-30 20:36 ` [PATCH v4 5/5] vfio/iommu_type1: Simplify group attachment Nicolin Chen
  4 siblings, 0 replies; 11+ messages in thread
From: Nicolin Chen @ 2022-06-30 20:36 UTC (permalink / raw)
  To: joro, will, marcan, sven, robin.murphy, robdclark, baolu.lu,
	orsonzhai, baolin.wang7, zhang.lyra, jean-philippe,
	alex.williamson, jgg, kevin.tian
  Cc: suravee.suthikulpanit, alyssa, dwmw2, mjrosato, gerald.schaefer,
	thierry.reding, vdumpa, jonathanh, cohuck, thunder.leizhen,
	christophe.jaillet, chenxiang66, john.garry, yangyingliang,
	iommu, iommu, linux-kernel, linux-arm-kernel, linux-arm-msm,
	linux-s390, linux-tegra, virtualization, kvm

All devices in emulated_iommu_groups have pinned_page_dirty_scope
set, so the update_dirty_scope in the first list_for_each_entry
is always false. Clean it up, and move the "if update_dirty_scope"
part from the detach_group_done routine to the domain_list part.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 11be5f95580b..b9ccb3cfac5d 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2453,14 +2453,12 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_domain *domain;
 	struct vfio_iommu_group *group;
-	bool update_dirty_scope = false;
 	LIST_HEAD(iova_copy);
 
 	mutex_lock(&iommu->lock);
 	list_for_each_entry(group, &iommu->emulated_iommu_groups, next) {
 		if (group->iommu_group != iommu_group)
 			continue;
-		update_dirty_scope = !group->pinned_page_dirty_scope;
 		list_del(&group->next);
 		kfree(group);
 
@@ -2469,7 +2467,8 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 			WARN_ON(iommu->notifier.head);
 			vfio_iommu_unmap_unpin_all(iommu);
 		}
-		goto detach_group_done;
+		mutex_unlock(&iommu->lock);
+		return;
 	}
 
 	/*
@@ -2485,9 +2484,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 			continue;
 
 		iommu_detach_group(domain->domain, group->iommu_group);
-		update_dirty_scope = !group->pinned_page_dirty_scope;
 		list_del(&group->next);
-		kfree(group);
 		/*
 		 * Group ownership provides privilege, if the group list is
 		 * empty, the domain goes away. If it's the last domain with
@@ -2510,6 +2507,16 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 			vfio_iommu_aper_expand(iommu, &iova_copy);
 			vfio_update_pgsize_bitmap(iommu);
 		}
+		/*
+		 * Removal of a group without dirty tracking may allow
+		 * the iommu scope to be promoted.
+		 */
+		if (!group->pinned_page_dirty_scope) {
+			iommu->num_non_pinned_groups--;
+			if (iommu->dirty_page_tracking)
+				vfio_iommu_populate_bitmap_full(iommu);
+		}
+		kfree(group);
 		break;
 	}
 
@@ -2518,16 +2525,6 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	else
 		vfio_iommu_iova_free(&iova_copy);
 
-detach_group_done:
-	/*
-	 * Removal of a group without dirty tracking may allow the iommu scope
-	 * to be promoted.
-	 */
-	if (update_dirty_scope) {
-		iommu->num_non_pinned_groups--;
-		if (iommu->dirty_page_tracking)
-			vfio_iommu_populate_bitmap_full(iommu);
-	}
 	mutex_unlock(&iommu->lock);
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 5/5] vfio/iommu_type1: Simplify group attachment
  2022-06-30 20:36 [PATCH v4 0/5] Simplify vfio_iommu_type1 attach/detach routine Nicolin Chen
                   ` (3 preceding siblings ...)
  2022-06-30 20:36 ` [PATCH v4 4/5] vfio/iommu_type1: Clean up update_dirty_scope in detach_group() Nicolin Chen
@ 2022-06-30 20:36 ` Nicolin Chen
  4 siblings, 0 replies; 11+ messages in thread
From: Nicolin Chen @ 2022-06-30 20:36 UTC (permalink / raw)
  To: joro, will, marcan, sven, robin.murphy, robdclark, baolu.lu,
	orsonzhai, baolin.wang7, zhang.lyra, jean-philippe,
	alex.williamson, jgg, kevin.tian
  Cc: suravee.suthikulpanit, alyssa, dwmw2, mjrosato, gerald.schaefer,
	thierry.reding, vdumpa, jonathanh, cohuck, thunder.leizhen,
	christophe.jaillet, chenxiang66, john.garry, yangyingliang,
	iommu, iommu, linux-kernel, linux-arm-kernel, linux-arm-msm,
	linux-s390, linux-tegra, virtualization, kvm

Un-inline the domain specific logic from the attach/detach_group ops into
two paired functions vfio_iommu_alloc_attach_domain() and
vfio_iommu_detach_destroy_domain() that strictly deal with creating and
destroying struct vfio_domains.

Add the logic to check for EMEDIUMTYPE return code of iommu_attach_group()
and avoid the extra domain allocations and attach/detach sequences of the
old code. This allows properly detecting an actual attach error, like
-ENOMEM, vs treating all attach errors as an incompatible domain.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 321 +++++++++++++++++---------------
 1 file changed, 174 insertions(+), 147 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index b9ccb3cfac5d..3ffa4e2d9d18 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2153,15 +2153,174 @@ static void vfio_iommu_iova_insert_copy(struct vfio_iommu *iommu,
 	list_splice_tail(iova_copy, iova);
 }
 
+static struct vfio_domain *
+vfio_iommu_alloc_attach_domain(struct bus_type *bus, struct vfio_iommu *iommu,
+			       struct vfio_iommu_group *group,
+			       struct list_head *group_resv_regions)
+{
+	struct iommu_domain *new_domain;
+	struct vfio_domain *domain;
+	phys_addr_t resv_msi_base;
+	int ret = 0;
+
+	/* Try to match an existing compatible domain */
+	list_for_each_entry (domain, &iommu->domain_list, next) {
+		ret = iommu_attach_group(domain->domain, group->iommu_group);
+		/* -EMEDIUMTYPE means an incompatible domain, so try next one */
+		if (ret == -EMEDIUMTYPE)
+			continue;
+		if (ret)
+			return ERR_PTR(ret);
+		goto done;
+	}
+
+	new_domain = iommu_domain_alloc(bus);
+	if (!new_domain)
+		return ERR_PTR(-EIO);
+
+	if (iommu->nesting) {
+		ret = iommu_enable_nesting(new_domain);
+		if (ret)
+			goto out_free_iommu_domain;
+	}
+
+	ret = iommu_attach_group(new_domain, group->iommu_group);
+	if (ret)
+		goto out_free_iommu_domain;
+
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (!domain) {
+		ret = -ENOMEM;
+		goto out_detach;
+	}
+
+	domain->domain = new_domain;
+	vfio_test_domain_fgsp(domain);
+
+	/*
+	 * If the IOMMU can block non-coherent operations (ie PCIe TLPs with
+	 * no-snoop set) then VFIO always turns this feature on because on Intel
+	 * platforms it optimizes KVM to disable wbinvd emulation.
+	 */
+	if (new_domain->ops->enforce_cache_coherency)
+		domain->enforce_cache_coherency =
+			new_domain->ops->enforce_cache_coherency(new_domain);
+
+	/* replay mappings on new domains */
+	ret = vfio_iommu_replay(iommu, domain);
+	if (ret)
+		goto out_free_domain;
+
+	if (vfio_iommu_has_sw_msi(group_resv_regions, &resv_msi_base)) {
+		ret = iommu_get_msi_cookie(domain->domain, resv_msi_base);
+		if (ret && ret != -ENODEV)
+			goto out_free_domain;
+	}
+
+	INIT_LIST_HEAD(&domain->group_list);
+	list_add(&domain->next, &iommu->domain_list);
+	vfio_update_pgsize_bitmap(iommu);
+
+done:
+	list_add(&group->next, &domain->group_list);
+
+	/*
+	 * An iommu backed group can dirty memory directly and therefore
+	 * demotes the iommu scope until it declares itself dirty tracking
+	 * capable via the page pinning interface.
+	 */
+	iommu->num_non_pinned_groups++;
+
+	return domain;
+
+out_free_domain:
+	kfree(domain);
+out_detach:
+	iommu_detach_group(new_domain, group->iommu_group);
+out_free_iommu_domain:
+	iommu_domain_free(new_domain);
+	return ERR_PTR(ret);
+}
+
+static void vfio_iommu_unmap_unpin_all(struct vfio_iommu *iommu)
+{
+	struct rb_node *node;
+
+	while ((node = rb_first(&iommu->dma_list)))
+		vfio_remove_dma(iommu, rb_entry(node, struct vfio_dma, node));
+}
+
+static void vfio_iommu_unmap_unpin_reaccount(struct vfio_iommu *iommu)
+{
+	struct rb_node *n, *p;
+
+	n = rb_first(&iommu->dma_list);
+	for (; n; n = rb_next(n)) {
+		struct vfio_dma *dma;
+		long locked = 0, unlocked = 0;
+
+		dma = rb_entry(n, struct vfio_dma, node);
+		unlocked += vfio_unmap_unpin(iommu, dma, false);
+		p = rb_first(&dma->pfn_list);
+		for (; p; p = rb_next(p)) {
+			struct vfio_pfn *vpfn = rb_entry(p, struct vfio_pfn,
+							 node);
+
+			if (!is_invalid_reserved_pfn(vpfn->pfn))
+				locked++;
+		}
+		vfio_lock_acct(dma, locked - unlocked, true);
+	}
+}
+
+static void vfio_iommu_detach_destroy_domain(struct vfio_domain *domain,
+					     struct vfio_iommu *iommu,
+					     struct vfio_iommu_group *group)
+{
+	iommu_detach_group(domain->domain, group->iommu_group);
+	list_del(&group->next);
+	if (!list_empty(&domain->group_list))
+		goto out_dirty;
+
+	/*
+	 * Group ownership provides privilege, if the group list is empty, the
+	 * domain goes away. If it's the last domain with iommu and external
+	 * domain doesn't exist, then all the mappings go away too. If it's the
+	 * last domain with iommu and external domain exist, update accounting
+	 */
+	if (list_is_singular(&iommu->domain_list)) {
+		if (list_empty(&iommu->emulated_iommu_groups)) {
+			WARN_ON(iommu->notifier.head);
+			vfio_iommu_unmap_unpin_all(iommu);
+		} else {
+			vfio_iommu_unmap_unpin_reaccount(iommu);
+		}
+	}
+	iommu_domain_free(domain->domain);
+	list_del(&domain->next);
+	kfree(domain);
+	vfio_update_pgsize_bitmap(iommu);
+
+out_dirty:
+	/*
+	 * Removal of a group without dirty tracking may allow the iommu scope
+	 * to be promoted.
+	 */
+	if (!group->pinned_page_dirty_scope) {
+		iommu->num_non_pinned_groups--;
+		if (iommu->dirty_page_tracking)
+			vfio_iommu_populate_bitmap_full(iommu);
+	}
+}
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 		struct iommu_group *iommu_group, enum vfio_group_type type)
 {
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_iommu_group *group;
-	struct vfio_domain *domain, *d;
+	struct vfio_domain *domain;
 	struct bus_type *bus = NULL;
-	bool resv_msi, msi_remap;
-	phys_addr_t resv_msi_base = 0;
+	bool msi_remap;
 	struct iommu_domain_geometry *geo;
 	LIST_HEAD(iova_copy);
 	LIST_HEAD(group_resv_regions);
@@ -2197,26 +2356,17 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	if (ret)
 		goto out_free_group;
 
-	ret = -ENOMEM;
-	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
-	if (!domain)
+	ret = iommu_get_group_resv_regions(iommu_group, &group_resv_regions);
+	if (ret)
 		goto out_free_group;
 
-	ret = -EIO;
-	domain->domain = iommu_domain_alloc(bus);
-	if (!domain->domain)
-		goto out_free_domain;
-
-	if (iommu->nesting) {
-		ret = iommu_enable_nesting(domain->domain);
-		if (ret)
-			goto out_domain;
+	domain = vfio_iommu_alloc_attach_domain(bus, iommu, group,
+						&group_resv_regions);
+	if (IS_ERR(domain)) {
+		ret = PTR_ERR(domain);
+		goto out_free_group;
 	}
 
-	ret = iommu_attach_group(domain->domain, group->iommu_group);
-	if (ret)
-		goto out_domain;
-
 	/* Get aperture info */
 	geo = &domain->domain->geometry;
 	if (vfio_iommu_aper_conflict(iommu, geo->aperture_start,
@@ -2225,10 +2375,6 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_detach;
 	}
 
-	ret = iommu_get_group_resv_regions(iommu_group, &group_resv_regions);
-	if (ret)
-		goto out_detach;
-
 	if (vfio_iommu_resv_conflict(iommu, &group_resv_regions)) {
 		ret = -EINVAL;
 		goto out_detach;
@@ -2252,11 +2398,6 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	if (ret)
 		goto out_detach;
 
-	resv_msi = vfio_iommu_has_sw_msi(&group_resv_regions, &resv_msi_base);
-
-	INIT_LIST_HEAD(&domain->group_list);
-	list_add(&group->next, &domain->group_list);
-
 	msi_remap = irq_domain_check_msi_remap() ||
 		    iommu_capable(bus, IOMMU_CAP_INTR_REMAP);
 
@@ -2267,107 +2408,25 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_detach;
 	}
 
-	/*
-	 * If the IOMMU can block non-coherent operations (ie PCIe TLPs with
-	 * no-snoop set) then VFIO always turns this feature on because on Intel
-	 * platforms it optimizes KVM to disable wbinvd emulation.
-	 */
-	if (domain->domain->ops->enforce_cache_coherency)
-		domain->enforce_cache_coherency =
-			domain->domain->ops->enforce_cache_coherency(
-				domain->domain);
-
-	/* Try to match an existing compatible domain */
-	list_for_each_entry(d, &iommu->domain_list, next) {
-		iommu_detach_group(domain->domain, group->iommu_group);
-		if (!iommu_attach_group(d->domain, group->iommu_group)) {
-			list_add(&group->next, &d->group_list);
-			iommu_domain_free(domain->domain);
-			kfree(domain);
-			goto done;
-		}
-
-		ret = iommu_attach_group(domain->domain,  group->iommu_group);
-		if (ret)
-			goto out_domain;
-	}
-
-	vfio_test_domain_fgsp(domain);
-
-	/* replay mappings on new domains */
-	ret = vfio_iommu_replay(iommu, domain);
-	if (ret)
-		goto out_detach;
-
-	if (resv_msi) {
-		ret = iommu_get_msi_cookie(domain->domain, resv_msi_base);
-		if (ret && ret != -ENODEV)
-			goto out_detach;
-	}
-
-	list_add(&domain->next, &iommu->domain_list);
-	vfio_update_pgsize_bitmap(iommu);
-done:
 	/* Delete the old one and insert new iova list */
 	vfio_iommu_iova_insert_copy(iommu, &iova_copy);
 
-	/*
-	 * An iommu backed group can dirty memory directly and therefore
-	 * demotes the iommu scope until it declares itself dirty tracking
-	 * capable via the page pinning interface.
-	 */
-	iommu->num_non_pinned_groups++;
 	mutex_unlock(&iommu->lock);
 	vfio_iommu_resv_free(&group_resv_regions);
 
 	return 0;
 
 out_detach:
-	iommu_detach_group(domain->domain, group->iommu_group);
-out_domain:
-	iommu_domain_free(domain->domain);
-	vfio_iommu_iova_free(&iova_copy);
-	vfio_iommu_resv_free(&group_resv_regions);
-out_free_domain:
-	kfree(domain);
+	vfio_iommu_detach_destroy_domain(domain, iommu, group);
 out_free_group:
 	kfree(group);
 out_unlock:
 	mutex_unlock(&iommu->lock);
+	vfio_iommu_iova_free(&iova_copy);
+	vfio_iommu_resv_free(&group_resv_regions);
 	return ret;
 }
 
-static void vfio_iommu_unmap_unpin_all(struct vfio_iommu *iommu)
-{
-	struct rb_node *node;
-
-	while ((node = rb_first(&iommu->dma_list)))
-		vfio_remove_dma(iommu, rb_entry(node, struct vfio_dma, node));
-}
-
-static void vfio_iommu_unmap_unpin_reaccount(struct vfio_iommu *iommu)
-{
-	struct rb_node *n, *p;
-
-	n = rb_first(&iommu->dma_list);
-	for (; n; n = rb_next(n)) {
-		struct vfio_dma *dma;
-		long locked = 0, unlocked = 0;
-
-		dma = rb_entry(n, struct vfio_dma, node);
-		unlocked += vfio_unmap_unpin(iommu, dma, false);
-		p = rb_first(&dma->pfn_list);
-		for (; p; p = rb_next(p)) {
-			struct vfio_pfn *vpfn = rb_entry(p, struct vfio_pfn,
-							 node);
-
-			if (!is_invalid_reserved_pfn(vpfn->pfn))
-				locked++;
-		}
-		vfio_lock_acct(dma, locked - unlocked, true);
-	}
-}
-
 /*
  * Called when a domain is removed in detach. It is possible that
  * the removed domain decided the iova aperture window. Modify the
@@ -2482,44 +2541,12 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 		group = find_iommu_group(domain, iommu_group);
 		if (!group)
 			continue;
-
-		iommu_detach_group(domain->domain, group->iommu_group);
-		list_del(&group->next);
-		/*
-		 * Group ownership provides privilege, if the group list is
-		 * empty, the domain goes away. If it's the last domain with
-		 * iommu and external domain doesn't exist, then all the
-		 * mappings go away too. If it's the last domain with iommu and
-		 * external domain exist, update accounting
-		 */
-		if (list_empty(&domain->group_list)) {
-			if (list_is_singular(&iommu->domain_list)) {
-				if (list_empty(&iommu->emulated_iommu_groups)) {
-					WARN_ON(iommu->notifier.head);
-					vfio_iommu_unmap_unpin_all(iommu);
-				} else {
-					vfio_iommu_unmap_unpin_reaccount(iommu);
-				}
-			}
-			iommu_domain_free(domain->domain);
-			list_del(&domain->next);
-			kfree(domain);
-			vfio_iommu_aper_expand(iommu, &iova_copy);
-			vfio_update_pgsize_bitmap(iommu);
-		}
-		/*
-		 * Removal of a group without dirty tracking may allow
-		 * the iommu scope to be promoted.
-		 */
-		if (!group->pinned_page_dirty_scope) {
-			iommu->num_non_pinned_groups--;
-			if (iommu->dirty_page_tracking)
-				vfio_iommu_populate_bitmap_full(iommu);
-		}
+		vfio_iommu_detach_destroy_domain(domain, iommu, group);
 		kfree(group);
 		break;
 	}
 
+	vfio_iommu_aper_expand(iommu, &iova_copy);
 	if (!vfio_iommu_resv_refresh(iommu, &iova_copy))
 		vfio_iommu_iova_insert_copy(iommu, &iova_copy);
 	else
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-06-30 20:36 ` [PATCH v4 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group Nicolin Chen
@ 2022-07-01  6:53   ` Baolu Lu
  2022-07-01 10:21   ` Robin Murphy
  1 sibling, 0 replies; 11+ messages in thread
From: Baolu Lu @ 2022-07-01  6:53 UTC (permalink / raw)
  To: Nicolin Chen, joro, will, marcan, sven, robin.murphy, robdclark,
	orsonzhai, baolin.wang7, zhang.lyra, jean-philippe,
	alex.williamson, jgg, kevin.tian
  Cc: baolu.lu, suravee.suthikulpanit, alyssa, dwmw2, mjrosato,
	gerald.schaefer, thierry.reding, vdumpa, jonathanh, cohuck,
	thunder.leizhen, christophe.jaillet, chenxiang66, john.garry,
	yangyingliang, iommu, iommu, linux-kernel, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm

On 2022/7/1 04:36, Nicolin Chen wrote:
> Cases like VFIO wish to attach a device to an existing domain that was
> not allocated specifically from the device. This raises a condition
> where the IOMMU driver can fail the domain attach because the domain and
> device are incompatible with each other.
> 
> This is a soft failure that can be resolved by using a different domain.
> 
> Provide a dedicated errno from the IOMMU driver during attach that the
> reason attached failed is because of domain incompatability. EMEDIUMTYPE
> is chosen because it is never used within the iommu subsystem today and
> evokes a sense that the 'medium' aka the domain is incompatible.
> 
> VFIO can use this to know attach is a soft failure and it should continue
> searching. Otherwise the attach will be a hard failure and VFIO will
> return the code to userspace.
> 
> Update all drivers to return EMEDIUMTYPE in their failure paths that are
> related to domain incompatability. Also remove adjacent error prints for
> these soft failures, to prevent a kernel log spam, since -EMEDIUMTYPE is
> clear enough to indicate an incompatability error.
> 
> Add kdocs describing this behavior.
> 
> Suggested-by: Jason Gunthorpe<jgg@nvidia.com>
> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
> Signed-off-by: Nicolin Chen<nicolinc@nvidia.com>

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>

Best regards,
baolu

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-06-30 20:36 ` [PATCH v4 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group Nicolin Chen
  2022-07-01  6:53   ` Baolu Lu
@ 2022-07-01 10:21   ` Robin Murphy
  2022-07-01 16:43     ` Nicolin Chen
  1 sibling, 1 reply; 11+ messages in thread
From: Robin Murphy @ 2022-07-01 10:21 UTC (permalink / raw)
  To: Nicolin Chen, joro, will, marcan, sven, robdclark, baolu.lu,
	orsonzhai, baolin.wang7, zhang.lyra, jean-philippe,
	alex.williamson, jgg, kevin.tian
  Cc: suravee.suthikulpanit, alyssa, dwmw2, mjrosato, gerald.schaefer,
	thierry.reding, vdumpa, jonathanh, cohuck, thunder.leizhen,
	christophe.jaillet, chenxiang66, john.garry, yangyingliang,
	iommu, iommu, linux-kernel, linux-arm-kernel, linux-arm-msm,
	linux-s390, linux-tegra, virtualization, kvm

On 2022-06-30 21:36, Nicolin Chen wrote:
> Cases like VFIO wish to attach a device to an existing domain that was
> not allocated specifically from the device. This raises a condition
> where the IOMMU driver can fail the domain attach because the domain and
> device are incompatible with each other.
> 
> This is a soft failure that can be resolved by using a different domain.
> 
> Provide a dedicated errno from the IOMMU driver during attach that the
> reason attached failed is because of domain incompatability. EMEDIUMTYPE
> is chosen because it is never used within the iommu subsystem today and
> evokes a sense that the 'medium' aka the domain is incompatible.
> 
> VFIO can use this to know attach is a soft failure and it should continue
> searching. Otherwise the attach will be a hard failure and VFIO will
> return the code to userspace.
> 
> Update all drivers to return EMEDIUMTYPE in their failure paths that are
> related to domain incompatability. Also remove adjacent error prints for
> these soft failures, to prevent a kernel log spam, since -EMEDIUMTYPE is
> clear enough to indicate an incompatability error.
> 
> Add kdocs describing this behavior.
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
[...]
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 2ed3594f384e..072cac5ab5a4 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -1135,10 +1135,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>   	struct arm_smmu_device *smmu;
>   	int ret;
>   
> -	if (!fwspec || fwspec->ops != &arm_smmu_ops) {
> -		dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n");
> -		return -ENXIO;
> -	}
> +	if (!fwspec || fwspec->ops != &arm_smmu_ops)
> +		return -EMEDIUMTYPE;

This is the wrong check, you want the "if (smmu_domain->smmu != smmu)" 
condition further down. If this one fails it's effectively because the 
device doesn't have an IOMMU at all, and similar to patch #3 it will be 
removed once the core code takes over properly (I even have both those 
patches written now!)

Thanks,
Robin.

>   	/*
>   	 * FIXME: The arch/arm DMA API code tries to attach devices to its own

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-07-01 10:21   ` Robin Murphy
@ 2022-07-01 16:43     ` Nicolin Chen
  2022-07-01 18:17       ` Robin Murphy
  0 siblings, 1 reply; 11+ messages in thread
From: Nicolin Chen @ 2022-07-01 16:43 UTC (permalink / raw)
  To: Robin Murphy
  Cc: joro, will, marcan, sven, robdclark, baolu.lu, orsonzhai,
	baolin.wang7, zhang.lyra, jean-philippe, alex.williamson, jgg,
	kevin.tian, suravee.suthikulpanit, alyssa, dwmw2, mjrosato,
	gerald.schaefer, thierry.reding, vdumpa, jonathanh, cohuck,
	thunder.leizhen, christophe.jaillet, chenxiang66, john.garry,
	yangyingliang, iommu, iommu, linux-kernel, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm

On Fri, Jul 01, 2022 at 11:21:48AM +0100, Robin Murphy wrote:

> > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > index 2ed3594f384e..072cac5ab5a4 100644
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > @@ -1135,10 +1135,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> >       struct arm_smmu_device *smmu;
> >       int ret;
> > 
> > -     if (!fwspec || fwspec->ops != &arm_smmu_ops) {
> > -             dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n");
> > -             return -ENXIO;
> > -     }
> > +     if (!fwspec || fwspec->ops != &arm_smmu_ops)
> > +             return -EMEDIUMTYPE;
> 
> This is the wrong check, you want the "if (smmu_domain->smmu != smmu)"
> condition further down. If this one fails it's effectively because the
> device doesn't have an IOMMU at all, and similar to patch #3 it will be

Thanks for the review! I will fix that. The "on the same bus" is
quite eye-catching.

> removed once the core code takes over properly (I even have both those
> patches written now!)

Actually in my v1 the proposal for ops check returned -EMEDIUMTYPE
also upon an ops mismatch, treating that too as an incompatibility.
Do you mean that we should have fine-grained it further?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-07-01 16:43     ` Nicolin Chen
@ 2022-07-01 18:17       ` Robin Murphy
  2022-07-01 19:08         ` Nicolin Chen
  0 siblings, 1 reply; 11+ messages in thread
From: Robin Murphy @ 2022-07-01 18:17 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: joro, will, marcan, sven, robdclark, baolu.lu, orsonzhai,
	baolin.wang7, zhang.lyra, jean-philippe, alex.williamson, jgg,
	kevin.tian, suravee.suthikulpanit, alyssa, dwmw2, mjrosato,
	gerald.schaefer, thierry.reding, vdumpa, jonathanh, cohuck,
	thunder.leizhen, christophe.jaillet, chenxiang66, john.garry,
	yangyingliang, iommu, iommu, linux-kernel, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm

On 01/07/2022 5:43 pm, Nicolin Chen wrote:
> On Fri, Jul 01, 2022 at 11:21:48AM +0100, Robin Murphy wrote:
> 
>>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>>> index 2ed3594f384e..072cac5ab5a4 100644
>>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
>>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>>> @@ -1135,10 +1135,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>>>        struct arm_smmu_device *smmu;
>>>        int ret;
>>>
>>> -     if (!fwspec || fwspec->ops != &arm_smmu_ops) {
>>> -             dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n");
>>> -             return -ENXIO;
>>> -     }
>>> +     if (!fwspec || fwspec->ops != &arm_smmu_ops)
>>> +             return -EMEDIUMTYPE;
>>
>> This is the wrong check, you want the "if (smmu_domain->smmu != smmu)"
>> condition further down. If this one fails it's effectively because the
>> device doesn't have an IOMMU at all, and similar to patch #3 it will be
> 
> Thanks for the review! I will fix that. The "on the same bus" is
> quite eye-catching.
> 
>> removed once the core code takes over properly (I even have both those
>> patches written now!)
> 
> Actually in my v1 the proposal for ops check returned -EMEDIUMTYPE
> also upon an ops mismatch, treating that too as an incompatibility.
> Do you mean that we should have fine-grained it further?

On second look, I think this particular check was already entirely 
redundant by the time I made the fwspec conversion to it, oh well. Since 
it remains harmless for the time being, let's just ignore it entirely 
until we can confidently say goodbye to the whole lot[1].

I don't think there's any need to differentiate an instance mismatch 
from a driver mismatch, once the latter becomes realistically possible, 
mostly due to iommu_domain_alloc() also having to become device-aware to 
know which driver to allocate from. Thus as far as a user is concerned, 
if attaching a device to an existing domain fails with -EMEDIUMTYPE, 
allocating a new domain using the given device, and attaching to that, 
can be expected to succeed, regardless of why the original attempt was 
rejected. In fact even in the theoretical different-driver-per-bus model 
the same principle still holds up.

Thanks,
Robin.

[1] 
https://gitlab.arm.com/linux-arm/linux-rm/-/commit/aa4accfa4a10e92daad0d51095918e8a89014393

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-07-01 18:17       ` Robin Murphy
@ 2022-07-01 19:08         ` Nicolin Chen
  0 siblings, 0 replies; 11+ messages in thread
From: Nicolin Chen @ 2022-07-01 19:08 UTC (permalink / raw)
  To: Robin Murphy
  Cc: joro, will, marcan, sven, robdclark, baolu.lu, orsonzhai,
	baolin.wang7, zhang.lyra, jean-philippe, alex.williamson, jgg,
	kevin.tian, suravee.suthikulpanit, alyssa, dwmw2, mjrosato,
	gerald.schaefer, thierry.reding, vdumpa, jonathanh, cohuck,
	thunder.leizhen, christophe.jaillet, chenxiang66, john.garry,
	yangyingliang, iommu, iommu, linux-kernel, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm

On Fri, Jul 01, 2022 at 07:17:38PM +0100, Robin Murphy wrote:
> External email: Use caution opening links or attachments
> 
> 
> On 01/07/2022 5:43 pm, Nicolin Chen wrote:
> > On Fri, Jul 01, 2022 at 11:21:48AM +0100, Robin Murphy wrote:
> > 
> > > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > > index 2ed3594f384e..072cac5ab5a4 100644
> > > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > > @@ -1135,10 +1135,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> > > >        struct arm_smmu_device *smmu;
> > > >        int ret;
> > > > 
> > > > -     if (!fwspec || fwspec->ops != &arm_smmu_ops) {
> > > > -             dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n");
> > > > -             return -ENXIO;
> > > > -     }
> > > > +     if (!fwspec || fwspec->ops != &arm_smmu_ops)
> > > > +             return -EMEDIUMTYPE;
> > > 
> > > This is the wrong check, you want the "if (smmu_domain->smmu != smmu)"
> > > condition further down. If this one fails it's effectively because the
> > > device doesn't have an IOMMU at all, and similar to patch #3 it will be
> > 
> > Thanks for the review! I will fix that. The "on the same bus" is
> > quite eye-catching.
> > 
> > > removed once the core code takes over properly (I even have both those
> > > patches written now!)
> > 
> > Actually in my v1 the proposal for ops check returned -EMEDIUMTYPE
> > also upon an ops mismatch, treating that too as an incompatibility.
> > Do you mean that we should have fine-grained it further?
> 
> On second look, I think this particular check was already entirely
> redundant by the time I made the fwspec conversion to it, oh well. Since
> it remains harmless for the time being, let's just ignore it entirely
> until we can confidently say goodbye to the whole lot[1].

That looks cleaner!

> I don't think there's any need to differentiate an instance mismatch
> from a driver mismatch, once the latter becomes realistically possible,
> mostly due to iommu_domain_alloc() also having to become device-aware to
> know which driver to allocate from. Thus as far as a user is concerned,
> if attaching a device to an existing domain fails with -EMEDIUMTYPE,
> allocating a new domain using the given device, and attaching to that,
> can be expected to succeed, regardless of why the original attempt was
> rejected. In fact even in the theoretical different-driver-per-bus model
> the same principle still holds up.

I see. Thanks for the explanation. 

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-07-01 19:08 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-30 20:36 [PATCH v4 0/5] Simplify vfio_iommu_type1 attach/detach routine Nicolin Chen
2022-06-30 20:36 ` [PATCH v4 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group Nicolin Chen
2022-07-01  6:53   ` Baolu Lu
2022-07-01 10:21   ` Robin Murphy
2022-07-01 16:43     ` Nicolin Chen
2022-07-01 18:17       ` Robin Murphy
2022-07-01 19:08         ` Nicolin Chen
2022-06-30 20:36 ` [PATCH v4 2/5] vfio/iommu_type1: Prefer to reuse domains vs match enforced cache coherency Nicolin Chen
2022-06-30 20:36 ` [PATCH v4 3/5] vfio/iommu_type1: Remove the domain->ops comparison Nicolin Chen
2022-06-30 20:36 ` [PATCH v4 4/5] vfio/iommu_type1: Clean up update_dirty_scope in detach_group() Nicolin Chen
2022-06-30 20:36 ` [PATCH v4 5/5] vfio/iommu_type1: Simplify group attachment Nicolin Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).