All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/5] Simplify vfio_iommu_type1 attach/detach routine
@ 2022-08-15 18:14 ` Nicolin Chen
  0 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-08-15 18:14 UTC (permalink / raw)
  To: joro, will, robin.murphy, alex.williamson
  Cc: suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, jgg, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

This is a preparatory series for IOMMUFD v2 patches. It enforces error
code -EMEDIUMTYPE in iommu_attach_device() and iommu_attach_group() when
an IOMMU domain and a device/group are incompatible. It also drops the
useless domain->ops check since it won't fail in current environment.

These allow VFIO iommu code to simplify its group attachment routine, by
avoiding the extra IOMMU domain allocations and attach/detach sequences
of the old code.

Worths mentioning the exact match for enforce_cache_coherency is removed
with this series, since there's very less value in doing that as KVM will
not be able to take advantage of it -- this just wastes domain memory.
Instead, we rely on Intel IOMMU driver taking care of that internally.

This is on github:
https://github.com/nicolinc/iommufd/commits/vfio_iommu_attach-v6

Changelog
v6:
 * Rebased on top of v6.0-rc1
 * Added "Reviewed-by" from Jason
 * Added "Cc" to Joerg/Will/Robin for the IOMMU patch
v5: https://lore.kernel.org/kvm/20220701214455.14992-1-nicolinc@nvidia.com/
 * Rebased on top of Robin's "Simplify bus_type determination".
 * Fixed a wrong change returning -EMEDIUMTYPE in arm-smmu driver.
 * Added Baolu's "Reviewed-by".
v4: https://lore.kernel.org/kvm/20220630203635.33200-1-nicolinc@nvidia.com/
 * Dropped -EMEDIUMTYPE change in mtk_v1 driver per Robin's input
 * Added Baolu's and Kevin's Reviewed-by lines
v3: https://lore.kernel.org/kvm/20220623200029.26007-1-nicolinc@nvidia.com/
 * Dropped all dev_err since -EMEDIUMTYPE clearly indicates what error.
 * Updated commit message of enforce_cache_coherency removing patch.
 * Updated commit message of domain->ops removing patch.
 * Replaced "goto out_unlock" with simply mutex_unlock() and return.
 * Added a line of comments for -EMEDIUMTYPE return check.
 * Moved iommu_get_msi_cookie() into alloc_attach_domain() as a cookie
   should be logically tied to the lifetime of a domain itself.
 * Added Kevin's "Reviewed-by".
v2: https://lore.kernel.org/kvm/20220616000304.23890-1-nicolinc@nvidia.com/
 * Added -EMEDIUMTYPE to more IOMMU drivers that fit the category.
 * Changed dev_err to dev_dbg for -EMEDIUMTYPE to avoid kernel log spam.
 * Dropped iommu_ops patch, and removed domain->ops in VFIO directly,
   since there's no mixed-driver use case that would fail the sanity.
 * Updated commit log of the patch removing enforce_cache_coherency.
 * Fixed a misplace of "num_non_pinned_groups--" in detach_group patch.
 * Moved "num_non_pinned_groups++" in PATCH-5 to the common path between
   domain-reusing and new-domain pathways, like the code previously did.
 * Fixed a typo in EMEDIUMTYPE patch.
v1: https://lore.kernel.org/kvm/20220606061927.26049-1-nicolinc@nvidia.com/

Jason Gunthorpe (1):
  vfio/iommu_type1: Prefer to reuse domains vs match enforced cache
    coherency

Nicolin Chen (4):
  iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  vfio/iommu_type1: Remove the domain->ops comparison
  vfio/iommu_type1: Clean up update_dirty_scope in detach_group()
  vfio/iommu_type1: Simplify group attachment

 drivers/iommu/amd/iommu.c                   |   2 +-
 drivers/iommu/apple-dart.c                  |   4 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  15 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |   5 +-
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     |   9 +-
 drivers/iommu/intel/iommu.c                 |  10 +-
 drivers/iommu/iommu.c                       |  28 ++
 drivers/iommu/ipmmu-vmsa.c                  |   4 +-
 drivers/iommu/omap-iommu.c                  |   3 +-
 drivers/iommu/s390-iommu.c                  |   2 +-
 drivers/iommu/sprd-iommu.c                  |   6 +-
 drivers/iommu/tegra-gart.c                  |   2 +-
 drivers/iommu/virtio-iommu.c                |   3 +-
 drivers/vfio/vfio_iommu_type1.c             | 353 ++++++++++----------
 14 files changed, 229 insertions(+), 217 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v6 0/5] Simplify vfio_iommu_type1 attach/detach routine
@ 2022-08-15 18:14 ` Nicolin Chen
  0 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-08-15 18:14 UTC (permalink / raw)
  To: joro, will, robin.murphy, alex.williamson
  Cc: marcan, mjrosato, virtualization, thierry.reding, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu, jgg,
	yangyingliang, orsonzhai, gerald.schaefer, kevin.tian, sven,
	linux-arm-msm, christophe.jaillet, baolin.wang, thunder.leizhen,
	linux-tegra, tglx, linux-arm-kernel, linux-s390, cohuck,
	linux-kernel, shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, dwmw2, baolu.lu

This is a preparatory series for IOMMUFD v2 patches. It enforces error
code -EMEDIUMTYPE in iommu_attach_device() and iommu_attach_group() when
an IOMMU domain and a device/group are incompatible. It also drops the
useless domain->ops check since it won't fail in current environment.

These allow VFIO iommu code to simplify its group attachment routine, by
avoiding the extra IOMMU domain allocations and attach/detach sequences
of the old code.

Worths mentioning the exact match for enforce_cache_coherency is removed
with this series, since there's very less value in doing that as KVM will
not be able to take advantage of it -- this just wastes domain memory.
Instead, we rely on Intel IOMMU driver taking care of that internally.

This is on github:
https://github.com/nicolinc/iommufd/commits/vfio_iommu_attach-v6

Changelog
v6:
 * Rebased on top of v6.0-rc1
 * Added "Reviewed-by" from Jason
 * Added "Cc" to Joerg/Will/Robin for the IOMMU patch
v5: https://lore.kernel.org/kvm/20220701214455.14992-1-nicolinc@nvidia.com/
 * Rebased on top of Robin's "Simplify bus_type determination".
 * Fixed a wrong change returning -EMEDIUMTYPE in arm-smmu driver.
 * Added Baolu's "Reviewed-by".
v4: https://lore.kernel.org/kvm/20220630203635.33200-1-nicolinc@nvidia.com/
 * Dropped -EMEDIUMTYPE change in mtk_v1 driver per Robin's input
 * Added Baolu's and Kevin's Reviewed-by lines
v3: https://lore.kernel.org/kvm/20220623200029.26007-1-nicolinc@nvidia.com/
 * Dropped all dev_err since -EMEDIUMTYPE clearly indicates what error.
 * Updated commit message of enforce_cache_coherency removing patch.
 * Updated commit message of domain->ops removing patch.
 * Replaced "goto out_unlock" with simply mutex_unlock() and return.
 * Added a line of comments for -EMEDIUMTYPE return check.
 * Moved iommu_get_msi_cookie() into alloc_attach_domain() as a cookie
   should be logically tied to the lifetime of a domain itself.
 * Added Kevin's "Reviewed-by".
v2: https://lore.kernel.org/kvm/20220616000304.23890-1-nicolinc@nvidia.com/
 * Added -EMEDIUMTYPE to more IOMMU drivers that fit the category.
 * Changed dev_err to dev_dbg for -EMEDIUMTYPE to avoid kernel log spam.
 * Dropped iommu_ops patch, and removed domain->ops in VFIO directly,
   since there's no mixed-driver use case that would fail the sanity.
 * Updated commit log of the patch removing enforce_cache_coherency.
 * Fixed a misplace of "num_non_pinned_groups--" in detach_group patch.
 * Moved "num_non_pinned_groups++" in PATCH-5 to the common path between
   domain-reusing and new-domain pathways, like the code previously did.
 * Fixed a typo in EMEDIUMTYPE patch.
v1: https://lore.kernel.org/kvm/20220606061927.26049-1-nicolinc@nvidia.com/

Jason Gunthorpe (1):
  vfio/iommu_type1: Prefer to reuse domains vs match enforced cache
    coherency

Nicolin Chen (4):
  iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  vfio/iommu_type1: Remove the domain->ops comparison
  vfio/iommu_type1: Clean up update_dirty_scope in detach_group()
  vfio/iommu_type1: Simplify group attachment

 drivers/iommu/amd/iommu.c                   |   2 +-
 drivers/iommu/apple-dart.c                  |   4 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  15 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |   5 +-
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     |   9 +-
 drivers/iommu/intel/iommu.c                 |  10 +-
 drivers/iommu/iommu.c                       |  28 ++
 drivers/iommu/ipmmu-vmsa.c                  |   4 +-
 drivers/iommu/omap-iommu.c                  |   3 +-
 drivers/iommu/s390-iommu.c                  |   2 +-
 drivers/iommu/sprd-iommu.c                  |   6 +-
 drivers/iommu/tegra-gart.c                  |   2 +-
 drivers/iommu/virtio-iommu.c                |   3 +-
 drivers/vfio/vfio_iommu_type1.c             | 353 ++++++++++----------
 14 files changed, 229 insertions(+), 217 deletions(-)

-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-08-15 18:14 ` Nicolin Chen
@ 2022-08-15 18:14   ` Nicolin Chen
  -1 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-08-15 18:14 UTC (permalink / raw)
  To: joro, will, robin.murphy, alex.williamson
  Cc: suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, jgg, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

Cases like VFIO wish to attach a device to an existing domain that was
not allocated specifically from the device. This raises a condition
where the IOMMU driver can fail the domain attach because the domain and
device are incompatible with each other.

This is a soft failure that can be resolved by using a different domain.

Provide a dedicated errno from the IOMMU driver during attach that the
reason attached failed is because of domain incompatability. EMEDIUMTYPE
is chosen because it is never used within the iommu subsystem today and
evokes a sense that the 'medium' aka the domain is incompatible.

VFIO can use this to know attach is a soft failure and it should continue
searching. Otherwise the attach will be a hard failure and VFIO will
return the code to userspace.

Update all drivers to return EMEDIUMTYPE in their failure paths that are
related to domain incompatability. Also remove adjacent error prints for
these soft failures, to prevent a kernel log spam, since -EMEDIUMTYPE is
clear enough to indicate an incompatability error.

Add kdocs describing this behavior.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/amd/iommu.c                   |  2 +-
 drivers/iommu/apple-dart.c                  |  4 +--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++--------
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |  5 +---
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     |  9 ++-----
 drivers/iommu/intel/iommu.c                 | 10 +++-----
 drivers/iommu/iommu.c                       | 28 +++++++++++++++++++++
 drivers/iommu/ipmmu-vmsa.c                  |  4 +--
 drivers/iommu/omap-iommu.c                  |  3 +--
 drivers/iommu/s390-iommu.c                  |  2 +-
 drivers/iommu/sprd-iommu.c                  |  6 ++---
 drivers/iommu/tegra-gart.c                  |  2 +-
 drivers/iommu/virtio-iommu.c                |  3 +--
 13 files changed, 47 insertions(+), 46 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 65b8e4fd8217..6a5bd0cfc06a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1755,7 +1755,7 @@ static int attach_device(struct device *dev,
 	if (domain->flags & PD_IOMMUV2_MASK) {
 		struct iommu_domain *def_domain = iommu_get_dma_domain(dev);
 
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 		if (def_domain->type != IOMMU_DOMAIN_IDENTITY)
 			goto out;
 
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index 1b1725759262..87f1829d24ea 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -495,10 +495,10 @@ static int apple_dart_attach_dev(struct iommu_domain *domain,
 
 	if (cfg->stream_maps[0].dart->force_bypass &&
 	    domain->type != IOMMU_DOMAIN_IDENTITY)
-		return -EINVAL;
+		return -EMEDIUMTYPE;
 	if (!cfg->stream_maps[0].dart->supports_bypass &&
 	    domain->type == IOMMU_DOMAIN_IDENTITY)
-		return -EINVAL;
+		return -EMEDIUMTYPE;
 
 	ret = apple_dart_finalize_domain(domain, cfg);
 	if (ret)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index d32b02336411..a3717961d248 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2429,24 +2429,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 			goto out_unlock;
 		}
 	} else if (smmu_domain->smmu != smmu) {
-		dev_err(dev,
-			"cannot attach to SMMU %s (upstream of %s)\n",
-			dev_name(smmu_domain->smmu->dev),
-			dev_name(smmu->dev));
-		ret = -ENXIO;
+		ret = -EMEDIUMTYPE;
 		goto out_unlock;
 	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
 		   master->ssid_bits != smmu_domain->s1_cfg.s1cdmax) {
-		dev_err(dev,
-			"cannot attach to incompatible domain (%u SSID bits != %u)\n",
-			smmu_domain->s1_cfg.s1cdmax, master->ssid_bits);
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 		goto out_unlock;
 	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
 		   smmu_domain->stall_enabled != master->stall_enabled) {
-		dev_err(dev, "cannot attach to stall-%s domain\n",
-			smmu_domain->stall_enabled ? "enabled" : "disabled");
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 		goto out_unlock;
 	}
 
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index dfa82df00342..26af6d923321 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1167,10 +1167,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	 * different SMMUs.
 	 */
 	if (smmu_domain->smmu != smmu) {
-		dev_err(dev,
-			"cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n",
-			dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev));
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 		goto rpm_put;
 	}
 
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 17235116d3bb..0daee6d8d993 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -381,13 +381,8 @@ static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev
 	 * Sanity check the domain. We don't support domains across
 	 * different IOMMUs.
 	 */
-	if (qcom_domain->iommu != qcom_iommu) {
-		dev_err(dev, "cannot attach to IOMMU %s while already "
-			"attached to domain on IOMMU %s\n",
-			dev_name(qcom_domain->iommu->dev),
-			dev_name(qcom_iommu->dev));
-		return -EINVAL;
-	}
+	if (qcom_domain->iommu != qcom_iommu)
+		return -EMEDIUMTYPE;
 
 	return 0;
 }
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7cca030a508e..560d2c3e9304 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4174,19 +4174,15 @@ static int prepare_domain_attach_device(struct iommu_domain *domain,
 		return -ENODEV;
 
 	if (dmar_domain->force_snooping && !ecap_sc_support(iommu->ecap))
-		return -EOPNOTSUPP;
+		return -EMEDIUMTYPE;
 
 	/* check if this iommu agaw is sufficient for max mapped address */
 	addr_width = agaw_to_width(iommu->agaw);
 	if (addr_width > cap_mgaw(iommu->cap))
 		addr_width = cap_mgaw(iommu->cap);
 
-	if (dmar_domain->max_addr > (1LL << addr_width)) {
-		dev_err(dev, "%s: iommu width (%d) is not "
-		        "sufficient for the mapped address (%llx)\n",
-		        __func__, addr_width, dmar_domain->max_addr);
-		return -EFAULT;
-	}
+	if (dmar_domain->max_addr > (1LL << addr_width))
+		return -EMEDIUMTYPE;
 	dmar_domain->gaw = addr_width;
 
 	/*
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 780fb7071577..d7416f6d9bab 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1975,6 +1975,20 @@ static int __iommu_attach_device(struct iommu_domain *domain,
 	return ret;
 }
 
+/**
+ * iommu_attach_device - Attach a device to an IOMMU domain
+ * @domain: IOMMU domain to attach
+ * @dev: Device that will be attached
+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Specifically, -EMEDIUMTYPE is returned as a soft failure if the domain and
+ * the device are incompatible in some way. This indicates that a caller should
+ * try another existing IOMMU domain or allocate a new one. And note that it's
+ * recommended to keep kernel print free when reporting -EMEDIUMTYPE error, as
+ * this function can be called to test compatibility with domains that will fail
+ * the test, which will result in a kernel log spam.
+ */
 int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
 {
 	struct iommu_group *group;
@@ -2101,6 +2115,20 @@ static int __iommu_attach_group(struct iommu_domain *domain,
 	return ret;
 }
 
+/**
+ * iommu_attach_group - Attach an IOMMU group to an IOMMU domain
+ * @domain: IOMMU domain to attach
+ * @group: IOMMU group that will be attached
+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Specifically, -EMEDIUMTYPE is returned as a soft failure if the domain and
+ * the device are incompatible in some way. This indicates that a caller should
+ * try another existing IOMMU domain or allocate a new one. And note that it's
+ * recommended to keep kernel print free when reporting -EMEDIUMTYPE error, as
+ * this function can be called to test compatibility with domains that will fail
+ * the test, which will result in a kernel log spam.
+ */
 int iommu_attach_group(struct iommu_domain *domain, struct iommu_group *group)
 {
 	int ret;
diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 1d42084d0276..0103480648cb 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -628,9 +628,7 @@ static int ipmmu_attach_device(struct iommu_domain *io_domain,
 		 * Something is wrong, we can't attach two devices using
 		 * different IOMMUs to the same domain.
 		 */
-		dev_err(dev, "Can't attach IPMMU %s to domain on IPMMU %s\n",
-			dev_name(mmu->dev), dev_name(domain->mmu->dev));
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 	} else
 		dev_info(dev, "Reusing IPMMU context %u\n", domain->context_id);
 
diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index d9cf2820c02e..6bc8925726bf 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1471,8 +1471,7 @@ omap_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
 
 	/* only a single client device can be attached to a domain */
 	if (omap_domain->dev) {
-		dev_err(dev, "iommu domain is already attached\n");
-		ret = -EBUSY;
+		ret = -EMEDIUMTYPE;
 		goto out;
 	}
 
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index c898bcbbce11..ddcb78b284bb 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -127,7 +127,7 @@ static int s390_iommu_attach_device(struct iommu_domain *domain,
 	/* Allow only devices with identical DMA range limits */
 	} else if (domain->geometry.aperture_start != zdev->start_dma ||
 		   domain->geometry.aperture_end != zdev->end_dma) {
-		rc = -EINVAL;
+		rc = -EMEDIUMTYPE;
 		spin_unlock_irqrestore(&s390_domain->list_lock, flags);
 		goto out_restore;
 	}
diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
index 511959c8a14d..2bc1d34981cc 100644
--- a/drivers/iommu/sprd-iommu.c
+++ b/drivers/iommu/sprd-iommu.c
@@ -237,10 +237,8 @@ static int sprd_iommu_attach_device(struct iommu_domain *domain,
 	struct sprd_iommu_domain *dom = to_sprd_domain(domain);
 	size_t pgt_size = sprd_iommu_pgt_size(domain);
 
-	if (dom->sdev) {
-		pr_err("There's already a device attached to this domain.\n");
-		return -EINVAL;
-	}
+	if (dom->sdev)
+		return -EMEDIUMTYPE;
 
 	dom->pgt_va = dma_alloc_coherent(sdev->dev, pgt_size, &dom->pgt_pa, GFP_KERNEL);
 	if (!dom->pgt_va)
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index e5ca3cf1a949..9d81cc467651 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -112,7 +112,7 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
 	spin_lock(&gart->dom_lock);
 
 	if (gart->active_domain && gart->active_domain != domain) {
-		ret = -EBUSY;
+		ret = -EMEDIUMTYPE;
 	} else if (dev_iommu_priv_get(dev) != domain) {
 		dev_iommu_priv_set(dev, domain);
 		gart->active_domain = domain;
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 08eeafc9529f..6172763c69b8 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -733,8 +733,7 @@ static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		 */
 		ret = viommu_domain_finalise(vdev, domain);
 	} else if (vdomain->viommu != vdev->viommu) {
-		dev_err(dev, "cannot attach to foreign vIOMMU\n");
-		ret = -EXDEV;
+		ret = -EMEDIUMTYPE;
 	}
 	mutex_unlock(&vdomain->mutex);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-08-15 18:14   ` Nicolin Chen
  0 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-08-15 18:14 UTC (permalink / raw)
  To: joro, will, robin.murphy, alex.williamson
  Cc: marcan, mjrosato, virtualization, thierry.reding, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu, jgg,
	yangyingliang, orsonzhai, gerald.schaefer, kevin.tian, sven,
	linux-arm-msm, christophe.jaillet, baolin.wang, thunder.leizhen,
	linux-tegra, tglx, linux-arm-kernel, linux-s390, cohuck,
	linux-kernel, shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, dwmw2, baolu.lu

Cases like VFIO wish to attach a device to an existing domain that was
not allocated specifically from the device. This raises a condition
where the IOMMU driver can fail the domain attach because the domain and
device are incompatible with each other.

This is a soft failure that can be resolved by using a different domain.

Provide a dedicated errno from the IOMMU driver during attach that the
reason attached failed is because of domain incompatability. EMEDIUMTYPE
is chosen because it is never used within the iommu subsystem today and
evokes a sense that the 'medium' aka the domain is incompatible.

VFIO can use this to know attach is a soft failure and it should continue
searching. Otherwise the attach will be a hard failure and VFIO will
return the code to userspace.

Update all drivers to return EMEDIUMTYPE in their failure paths that are
related to domain incompatability. Also remove adjacent error prints for
these soft failures, to prevent a kernel log spam, since -EMEDIUMTYPE is
clear enough to indicate an incompatability error.

Add kdocs describing this behavior.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/amd/iommu.c                   |  2 +-
 drivers/iommu/apple-dart.c                  |  4 +--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++--------
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |  5 +---
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     |  9 ++-----
 drivers/iommu/intel/iommu.c                 | 10 +++-----
 drivers/iommu/iommu.c                       | 28 +++++++++++++++++++++
 drivers/iommu/ipmmu-vmsa.c                  |  4 +--
 drivers/iommu/omap-iommu.c                  |  3 +--
 drivers/iommu/s390-iommu.c                  |  2 +-
 drivers/iommu/sprd-iommu.c                  |  6 ++---
 drivers/iommu/tegra-gart.c                  |  2 +-
 drivers/iommu/virtio-iommu.c                |  3 +--
 13 files changed, 47 insertions(+), 46 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 65b8e4fd8217..6a5bd0cfc06a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1755,7 +1755,7 @@ static int attach_device(struct device *dev,
 	if (domain->flags & PD_IOMMUV2_MASK) {
 		struct iommu_domain *def_domain = iommu_get_dma_domain(dev);
 
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 		if (def_domain->type != IOMMU_DOMAIN_IDENTITY)
 			goto out;
 
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index 1b1725759262..87f1829d24ea 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -495,10 +495,10 @@ static int apple_dart_attach_dev(struct iommu_domain *domain,
 
 	if (cfg->stream_maps[0].dart->force_bypass &&
 	    domain->type != IOMMU_DOMAIN_IDENTITY)
-		return -EINVAL;
+		return -EMEDIUMTYPE;
 	if (!cfg->stream_maps[0].dart->supports_bypass &&
 	    domain->type == IOMMU_DOMAIN_IDENTITY)
-		return -EINVAL;
+		return -EMEDIUMTYPE;
 
 	ret = apple_dart_finalize_domain(domain, cfg);
 	if (ret)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index d32b02336411..a3717961d248 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2429,24 +2429,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 			goto out_unlock;
 		}
 	} else if (smmu_domain->smmu != smmu) {
-		dev_err(dev,
-			"cannot attach to SMMU %s (upstream of %s)\n",
-			dev_name(smmu_domain->smmu->dev),
-			dev_name(smmu->dev));
-		ret = -ENXIO;
+		ret = -EMEDIUMTYPE;
 		goto out_unlock;
 	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
 		   master->ssid_bits != smmu_domain->s1_cfg.s1cdmax) {
-		dev_err(dev,
-			"cannot attach to incompatible domain (%u SSID bits != %u)\n",
-			smmu_domain->s1_cfg.s1cdmax, master->ssid_bits);
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 		goto out_unlock;
 	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
 		   smmu_domain->stall_enabled != master->stall_enabled) {
-		dev_err(dev, "cannot attach to stall-%s domain\n",
-			smmu_domain->stall_enabled ? "enabled" : "disabled");
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 		goto out_unlock;
 	}
 
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index dfa82df00342..26af6d923321 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1167,10 +1167,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	 * different SMMUs.
 	 */
 	if (smmu_domain->smmu != smmu) {
-		dev_err(dev,
-			"cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n",
-			dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev));
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 		goto rpm_put;
 	}
 
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 17235116d3bb..0daee6d8d993 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -381,13 +381,8 @@ static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev
 	 * Sanity check the domain. We don't support domains across
 	 * different IOMMUs.
 	 */
-	if (qcom_domain->iommu != qcom_iommu) {
-		dev_err(dev, "cannot attach to IOMMU %s while already "
-			"attached to domain on IOMMU %s\n",
-			dev_name(qcom_domain->iommu->dev),
-			dev_name(qcom_iommu->dev));
-		return -EINVAL;
-	}
+	if (qcom_domain->iommu != qcom_iommu)
+		return -EMEDIUMTYPE;
 
 	return 0;
 }
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7cca030a508e..560d2c3e9304 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4174,19 +4174,15 @@ static int prepare_domain_attach_device(struct iommu_domain *domain,
 		return -ENODEV;
 
 	if (dmar_domain->force_snooping && !ecap_sc_support(iommu->ecap))
-		return -EOPNOTSUPP;
+		return -EMEDIUMTYPE;
 
 	/* check if this iommu agaw is sufficient for max mapped address */
 	addr_width = agaw_to_width(iommu->agaw);
 	if (addr_width > cap_mgaw(iommu->cap))
 		addr_width = cap_mgaw(iommu->cap);
 
-	if (dmar_domain->max_addr > (1LL << addr_width)) {
-		dev_err(dev, "%s: iommu width (%d) is not "
-		        "sufficient for the mapped address (%llx)\n",
-		        __func__, addr_width, dmar_domain->max_addr);
-		return -EFAULT;
-	}
+	if (dmar_domain->max_addr > (1LL << addr_width))
+		return -EMEDIUMTYPE;
 	dmar_domain->gaw = addr_width;
 
 	/*
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 780fb7071577..d7416f6d9bab 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1975,6 +1975,20 @@ static int __iommu_attach_device(struct iommu_domain *domain,
 	return ret;
 }
 
+/**
+ * iommu_attach_device - Attach a device to an IOMMU domain
+ * @domain: IOMMU domain to attach
+ * @dev: Device that will be attached
+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Specifically, -EMEDIUMTYPE is returned as a soft failure if the domain and
+ * the device are incompatible in some way. This indicates that a caller should
+ * try another existing IOMMU domain or allocate a new one. And note that it's
+ * recommended to keep kernel print free when reporting -EMEDIUMTYPE error, as
+ * this function can be called to test compatibility with domains that will fail
+ * the test, which will result in a kernel log spam.
+ */
 int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
 {
 	struct iommu_group *group;
@@ -2101,6 +2115,20 @@ static int __iommu_attach_group(struct iommu_domain *domain,
 	return ret;
 }
 
+/**
+ * iommu_attach_group - Attach an IOMMU group to an IOMMU domain
+ * @domain: IOMMU domain to attach
+ * @group: IOMMU group that will be attached
+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Specifically, -EMEDIUMTYPE is returned as a soft failure if the domain and
+ * the device are incompatible in some way. This indicates that a caller should
+ * try another existing IOMMU domain or allocate a new one. And note that it's
+ * recommended to keep kernel print free when reporting -EMEDIUMTYPE error, as
+ * this function can be called to test compatibility with domains that will fail
+ * the test, which will result in a kernel log spam.
+ */
 int iommu_attach_group(struct iommu_domain *domain, struct iommu_group *group)
 {
 	int ret;
diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 1d42084d0276..0103480648cb 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -628,9 +628,7 @@ static int ipmmu_attach_device(struct iommu_domain *io_domain,
 		 * Something is wrong, we can't attach two devices using
 		 * different IOMMUs to the same domain.
 		 */
-		dev_err(dev, "Can't attach IPMMU %s to domain on IPMMU %s\n",
-			dev_name(mmu->dev), dev_name(domain->mmu->dev));
-		ret = -EINVAL;
+		ret = -EMEDIUMTYPE;
 	} else
 		dev_info(dev, "Reusing IPMMU context %u\n", domain->context_id);
 
diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index d9cf2820c02e..6bc8925726bf 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1471,8 +1471,7 @@ omap_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
 
 	/* only a single client device can be attached to a domain */
 	if (omap_domain->dev) {
-		dev_err(dev, "iommu domain is already attached\n");
-		ret = -EBUSY;
+		ret = -EMEDIUMTYPE;
 		goto out;
 	}
 
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index c898bcbbce11..ddcb78b284bb 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -127,7 +127,7 @@ static int s390_iommu_attach_device(struct iommu_domain *domain,
 	/* Allow only devices with identical DMA range limits */
 	} else if (domain->geometry.aperture_start != zdev->start_dma ||
 		   domain->geometry.aperture_end != zdev->end_dma) {
-		rc = -EINVAL;
+		rc = -EMEDIUMTYPE;
 		spin_unlock_irqrestore(&s390_domain->list_lock, flags);
 		goto out_restore;
 	}
diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
index 511959c8a14d..2bc1d34981cc 100644
--- a/drivers/iommu/sprd-iommu.c
+++ b/drivers/iommu/sprd-iommu.c
@@ -237,10 +237,8 @@ static int sprd_iommu_attach_device(struct iommu_domain *domain,
 	struct sprd_iommu_domain *dom = to_sprd_domain(domain);
 	size_t pgt_size = sprd_iommu_pgt_size(domain);
 
-	if (dom->sdev) {
-		pr_err("There's already a device attached to this domain.\n");
-		return -EINVAL;
-	}
+	if (dom->sdev)
+		return -EMEDIUMTYPE;
 
 	dom->pgt_va = dma_alloc_coherent(sdev->dev, pgt_size, &dom->pgt_pa, GFP_KERNEL);
 	if (!dom->pgt_va)
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index e5ca3cf1a949..9d81cc467651 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -112,7 +112,7 @@ static int gart_iommu_attach_dev(struct iommu_domain *domain,
 	spin_lock(&gart->dom_lock);
 
 	if (gart->active_domain && gart->active_domain != domain) {
-		ret = -EBUSY;
+		ret = -EMEDIUMTYPE;
 	} else if (dev_iommu_priv_get(dev) != domain) {
 		dev_iommu_priv_set(dev, domain);
 		gart->active_domain = domain;
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 08eeafc9529f..6172763c69b8 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -733,8 +733,7 @@ static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		 */
 		ret = viommu_domain_finalise(vdev, domain);
 	} else if (vdomain->viommu != vdev->viommu) {
-		dev_err(dev, "cannot attach to foreign vIOMMU\n");
-		ret = -EXDEV;
+		ret = -EMEDIUMTYPE;
 	}
 	mutex_unlock(&vdomain->mutex);
 
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v6 2/5] vfio/iommu_type1: Prefer to reuse domains vs match enforced cache coherency
  2022-08-15 18:14 ` Nicolin Chen
@ 2022-08-15 18:14   ` Nicolin Chen
  -1 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-08-15 18:14 UTC (permalink / raw)
  To: joro, will, robin.murphy, alex.williamson
  Cc: suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, jgg, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

From: Jason Gunthorpe <jgg@nvidia.com>

The KVM mechanism for controlling wbinvd is based on OR of the coherency
property of all devices attached to a guest, no matter whether those
devices are attached to a single domain or multiple domains.

On the other hand, the benefit to using separate domains was that those
devices attached to domains supporting enforced cache coherency always
mapped with the attributes necessary to provide that feature, therefore
if a non-enforced domain was dropped, the associated group removal would
re-trigger an evaluation by KVM.

In practice however, the only known cases of such mixed domains included
an Intel IGD device behind an IOMMU lacking snoop control, where such
devices do not support hotplug, therefore this scenario lacks testing and
is not considered sufficiently relevant to support.

After all, KVM won't take advantage of trying to push a device that could
do enforced cache coherency to a dedicated domain vs re-using an existing
domain, which is non-coherent.

Simplify this code and eliminate the test. This removes the only logic
that needed to have a dummy domain attached prior to searching for a
matching domain and simplifies the next patches.

It's unclear whether we want to further optimize the Intel driver to
update the domain coherency after a device is detached from it, at
least not before KVM can be verified to handle such dynamics in related
emulation paths (wbinvd, vcpu load, write_cr0, ept, etc.). In reality
we don't see an usage requiring such optimization as the only device
which imposes such non-coherency is Intel GPU which even doesn't
support hotplug/hot remove.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index db516c90a977..88ee6aaf1c88 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2296,9 +2296,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	 * testing if they're on the same bus_type.
 	 */
 	list_for_each_entry(d, &iommu->domain_list, next) {
-		if (d->domain->ops == domain->domain->ops &&
-		    d->enforce_cache_coherency ==
-			    domain->enforce_cache_coherency) {
+		if (d->domain->ops == domain->domain->ops) {
 			iommu_detach_group(domain->domain, group->iommu_group);
 			if (!iommu_attach_group(d->domain,
 						group->iommu_group)) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v6 2/5] vfio/iommu_type1: Prefer to reuse domains vs match enforced cache coherency
@ 2022-08-15 18:14   ` Nicolin Chen
  0 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-08-15 18:14 UTC (permalink / raw)
  To: joro, will, robin.murphy, alex.williamson
  Cc: marcan, mjrosato, virtualization, thierry.reding, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu, jgg,
	yangyingliang, orsonzhai, gerald.schaefer, kevin.tian, sven,
	linux-arm-msm, christophe.jaillet, baolin.wang, thunder.leizhen,
	linux-tegra, tglx, linux-arm-kernel, linux-s390, cohuck,
	linux-kernel, shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, dwmw2, baolu.lu

From: Jason Gunthorpe <jgg@nvidia.com>

The KVM mechanism for controlling wbinvd is based on OR of the coherency
property of all devices attached to a guest, no matter whether those
devices are attached to a single domain or multiple domains.

On the other hand, the benefit to using separate domains was that those
devices attached to domains supporting enforced cache coherency always
mapped with the attributes necessary to provide that feature, therefore
if a non-enforced domain was dropped, the associated group removal would
re-trigger an evaluation by KVM.

In practice however, the only known cases of such mixed domains included
an Intel IGD device behind an IOMMU lacking snoop control, where such
devices do not support hotplug, therefore this scenario lacks testing and
is not considered sufficiently relevant to support.

After all, KVM won't take advantage of trying to push a device that could
do enforced cache coherency to a dedicated domain vs re-using an existing
domain, which is non-coherent.

Simplify this code and eliminate the test. This removes the only logic
that needed to have a dummy domain attached prior to searching for a
matching domain and simplifies the next patches.

It's unclear whether we want to further optimize the Intel driver to
update the domain coherency after a device is detached from it, at
least not before KVM can be verified to handle such dynamics in related
emulation paths (wbinvd, vcpu load, write_cr0, ept, etc.). In reality
we don't see an usage requiring such optimization as the only device
which imposes such non-coherency is Intel GPU which even doesn't
support hotplug/hot remove.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index db516c90a977..88ee6aaf1c88 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2296,9 +2296,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	 * testing if they're on the same bus_type.
 	 */
 	list_for_each_entry(d, &iommu->domain_list, next) {
-		if (d->domain->ops == domain->domain->ops &&
-		    d->enforce_cache_coherency ==
-			    domain->enforce_cache_coherency) {
+		if (d->domain->ops == domain->domain->ops) {
 			iommu_detach_group(domain->domain, group->iommu_group);
 			if (!iommu_attach_group(d->domain,
 						group->iommu_group)) {
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v6 3/5] vfio/iommu_type1: Remove the domain->ops comparison
  2022-08-15 18:14 ` Nicolin Chen
@ 2022-08-15 18:14   ` Nicolin Chen
  -1 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-08-15 18:14 UTC (permalink / raw)
  To: joro, will, robin.murphy, alex.williamson
  Cc: suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, jgg, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

The domain->ops validation was added, as a precaution, for mixed-driver
systems.

Per Robin's remarks,
* While bus_set_iommu() still exists, the core code prevents multiple
  drivers from registering, so we can't really run into a situation of
  having a mixed-driver system:
  https://lore.kernel.org/kvm/6e1280c5-4b22-ebb3-3912-6c72bc169982@arm.com/

* And there's plenty more significant problems than this to fix; in future
  when many can be permitted, we will rely on the IOMMU core code to check
  the domain->ops:
  https://lore.kernel.org/kvm/6575de6d-94ba-c427-5b1e-967750ddff23@arm.com/

So remove the check in VFIO for simplicity.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 88ee6aaf1c88..523927d61aac 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2288,29 +2288,19 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 			domain->domain->ops->enforce_cache_coherency(
 				domain->domain);
 
-	/*
-	 * Try to match an existing compatible domain.  We don't want to
-	 * preclude an IOMMU driver supporting multiple bus_types and being
-	 * able to include different bus_types in the same IOMMU domain, so
-	 * we test whether the domains use the same iommu_ops rather than
-	 * testing if they're on the same bus_type.
-	 */
+	/* Try to match an existing compatible domain */
 	list_for_each_entry(d, &iommu->domain_list, next) {
-		if (d->domain->ops == domain->domain->ops) {
-			iommu_detach_group(domain->domain, group->iommu_group);
-			if (!iommu_attach_group(d->domain,
-						group->iommu_group)) {
-				list_add(&group->next, &d->group_list);
-				iommu_domain_free(domain->domain);
-				kfree(domain);
-				goto done;
-			}
-
-			ret = iommu_attach_group(domain->domain,
-						 group->iommu_group);
-			if (ret)
-				goto out_domain;
+		iommu_detach_group(domain->domain, group->iommu_group);
+		if (!iommu_attach_group(d->domain, group->iommu_group)) {
+			list_add(&group->next, &d->group_list);
+			iommu_domain_free(domain->domain);
+			kfree(domain);
+			goto done;
 		}
+
+		ret = iommu_attach_group(domain->domain,  group->iommu_group);
+		if (ret)
+			goto out_domain;
 	}
 
 	vfio_test_domain_fgsp(domain);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v6 3/5] vfio/iommu_type1: Remove the domain->ops comparison
@ 2022-08-15 18:14   ` Nicolin Chen
  0 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-08-15 18:14 UTC (permalink / raw)
  To: joro, will, robin.murphy, alex.williamson
  Cc: marcan, mjrosato, virtualization, thierry.reding, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu, jgg,
	yangyingliang, orsonzhai, gerald.schaefer, kevin.tian, sven,
	linux-arm-msm, christophe.jaillet, baolin.wang, thunder.leizhen,
	linux-tegra, tglx, linux-arm-kernel, linux-s390, cohuck,
	linux-kernel, shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, dwmw2, baolu.lu

The domain->ops validation was added, as a precaution, for mixed-driver
systems.

Per Robin's remarks,
* While bus_set_iommu() still exists, the core code prevents multiple
  drivers from registering, so we can't really run into a situation of
  having a mixed-driver system:
  https://lore.kernel.org/kvm/6e1280c5-4b22-ebb3-3912-6c72bc169982@arm.com/

* And there's plenty more significant problems than this to fix; in future
  when many can be permitted, we will rely on the IOMMU core code to check
  the domain->ops:
  https://lore.kernel.org/kvm/6575de6d-94ba-c427-5b1e-967750ddff23@arm.com/

So remove the check in VFIO for simplicity.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 88ee6aaf1c88..523927d61aac 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2288,29 +2288,19 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 			domain->domain->ops->enforce_cache_coherency(
 				domain->domain);
 
-	/*
-	 * Try to match an existing compatible domain.  We don't want to
-	 * preclude an IOMMU driver supporting multiple bus_types and being
-	 * able to include different bus_types in the same IOMMU domain, so
-	 * we test whether the domains use the same iommu_ops rather than
-	 * testing if they're on the same bus_type.
-	 */
+	/* Try to match an existing compatible domain */
 	list_for_each_entry(d, &iommu->domain_list, next) {
-		if (d->domain->ops == domain->domain->ops) {
-			iommu_detach_group(domain->domain, group->iommu_group);
-			if (!iommu_attach_group(d->domain,
-						group->iommu_group)) {
-				list_add(&group->next, &d->group_list);
-				iommu_domain_free(domain->domain);
-				kfree(domain);
-				goto done;
-			}
-
-			ret = iommu_attach_group(domain->domain,
-						 group->iommu_group);
-			if (ret)
-				goto out_domain;
+		iommu_detach_group(domain->domain, group->iommu_group);
+		if (!iommu_attach_group(d->domain, group->iommu_group)) {
+			list_add(&group->next, &d->group_list);
+			iommu_domain_free(domain->domain);
+			kfree(domain);
+			goto done;
 		}
+
+		ret = iommu_attach_group(domain->domain,  group->iommu_group);
+		if (ret)
+			goto out_domain;
 	}
 
 	vfio_test_domain_fgsp(domain);
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v6 4/5] vfio/iommu_type1: Clean up update_dirty_scope in detach_group()
  2022-08-15 18:14 ` Nicolin Chen
@ 2022-08-15 18:14   ` Nicolin Chen
  -1 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-08-15 18:14 UTC (permalink / raw)
  To: joro, will, robin.murphy, alex.williamson
  Cc: suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, jgg, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

All devices in emulated_iommu_groups have pinned_page_dirty_scope
set, so the update_dirty_scope in the first list_for_each_entry
is always false. Clean it up, and move the "if update_dirty_scope"
part from the detach_group_done routine to the domain_list part.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 523927d61aac..3b63a5a237c9 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2464,14 +2464,12 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_domain *domain;
 	struct vfio_iommu_group *group;
-	bool update_dirty_scope = false;
 	LIST_HEAD(iova_copy);
 
 	mutex_lock(&iommu->lock);
 	list_for_each_entry(group, &iommu->emulated_iommu_groups, next) {
 		if (group->iommu_group != iommu_group)
 			continue;
-		update_dirty_scope = !group->pinned_page_dirty_scope;
 		list_del(&group->next);
 		kfree(group);
 
@@ -2480,7 +2478,8 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 			WARN_ON(!list_empty(&iommu->device_list));
 			vfio_iommu_unmap_unpin_all(iommu);
 		}
-		goto detach_group_done;
+		mutex_unlock(&iommu->lock);
+		return;
 	}
 
 	/*
@@ -2496,9 +2495,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 			continue;
 
 		iommu_detach_group(domain->domain, group->iommu_group);
-		update_dirty_scope = !group->pinned_page_dirty_scope;
 		list_del(&group->next);
-		kfree(group);
 		/*
 		 * Group ownership provides privilege, if the group list is
 		 * empty, the domain goes away. If it's the last domain with
@@ -2522,6 +2519,16 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 			vfio_iommu_aper_expand(iommu, &iova_copy);
 			vfio_update_pgsize_bitmap(iommu);
 		}
+		/*
+		 * Removal of a group without dirty tracking may allow
+		 * the iommu scope to be promoted.
+		 */
+		if (!group->pinned_page_dirty_scope) {
+			iommu->num_non_pinned_groups--;
+			if (iommu->dirty_page_tracking)
+				vfio_iommu_populate_bitmap_full(iommu);
+		}
+		kfree(group);
 		break;
 	}
 
@@ -2530,16 +2537,6 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	else
 		vfio_iommu_iova_free(&iova_copy);
 
-detach_group_done:
-	/*
-	 * Removal of a group without dirty tracking may allow the iommu scope
-	 * to be promoted.
-	 */
-	if (update_dirty_scope) {
-		iommu->num_non_pinned_groups--;
-		if (iommu->dirty_page_tracking)
-			vfio_iommu_populate_bitmap_full(iommu);
-	}
 	mutex_unlock(&iommu->lock);
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v6 4/5] vfio/iommu_type1: Clean up update_dirty_scope in detach_group()
@ 2022-08-15 18:14   ` Nicolin Chen
  0 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-08-15 18:14 UTC (permalink / raw)
  To: joro, will, robin.murphy, alex.williamson
  Cc: marcan, mjrosato, virtualization, thierry.reding, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu, jgg,
	yangyingliang, orsonzhai, gerald.schaefer, kevin.tian, sven,
	linux-arm-msm, christophe.jaillet, baolin.wang, thunder.leizhen,
	linux-tegra, tglx, linux-arm-kernel, linux-s390, cohuck,
	linux-kernel, shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, dwmw2, baolu.lu

All devices in emulated_iommu_groups have pinned_page_dirty_scope
set, so the update_dirty_scope in the first list_for_each_entry
is always false. Clean it up, and move the "if update_dirty_scope"
part from the detach_group_done routine to the domain_list part.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 523927d61aac..3b63a5a237c9 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2464,14 +2464,12 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_domain *domain;
 	struct vfio_iommu_group *group;
-	bool update_dirty_scope = false;
 	LIST_HEAD(iova_copy);
 
 	mutex_lock(&iommu->lock);
 	list_for_each_entry(group, &iommu->emulated_iommu_groups, next) {
 		if (group->iommu_group != iommu_group)
 			continue;
-		update_dirty_scope = !group->pinned_page_dirty_scope;
 		list_del(&group->next);
 		kfree(group);
 
@@ -2480,7 +2478,8 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 			WARN_ON(!list_empty(&iommu->device_list));
 			vfio_iommu_unmap_unpin_all(iommu);
 		}
-		goto detach_group_done;
+		mutex_unlock(&iommu->lock);
+		return;
 	}
 
 	/*
@@ -2496,9 +2495,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 			continue;
 
 		iommu_detach_group(domain->domain, group->iommu_group);
-		update_dirty_scope = !group->pinned_page_dirty_scope;
 		list_del(&group->next);
-		kfree(group);
 		/*
 		 * Group ownership provides privilege, if the group list is
 		 * empty, the domain goes away. If it's the last domain with
@@ -2522,6 +2519,16 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 			vfio_iommu_aper_expand(iommu, &iova_copy);
 			vfio_update_pgsize_bitmap(iommu);
 		}
+		/*
+		 * Removal of a group without dirty tracking may allow
+		 * the iommu scope to be promoted.
+		 */
+		if (!group->pinned_page_dirty_scope) {
+			iommu->num_non_pinned_groups--;
+			if (iommu->dirty_page_tracking)
+				vfio_iommu_populate_bitmap_full(iommu);
+		}
+		kfree(group);
 		break;
 	}
 
@@ -2530,16 +2537,6 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	else
 		vfio_iommu_iova_free(&iova_copy);
 
-detach_group_done:
-	/*
-	 * Removal of a group without dirty tracking may allow the iommu scope
-	 * to be promoted.
-	 */
-	if (update_dirty_scope) {
-		iommu->num_non_pinned_groups--;
-		if (iommu->dirty_page_tracking)
-			vfio_iommu_populate_bitmap_full(iommu);
-	}
 	mutex_unlock(&iommu->lock);
 }
 
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v6 5/5] vfio/iommu_type1: Simplify group attachment
  2022-08-15 18:14 ` Nicolin Chen
@ 2022-08-15 18:14   ` Nicolin Chen
  -1 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-08-15 18:14 UTC (permalink / raw)
  To: joro, will, robin.murphy, alex.williamson
  Cc: suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, jgg, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

Un-inline the domain specific logic from the attach/detach_group ops into
two paired functions vfio_iommu_alloc_attach_domain() and
vfio_iommu_detach_destroy_domain() that strictly deal with creating and
destroying struct vfio_domains.

Add the logic to check for EMEDIUMTYPE return code of iommu_attach_group()
and avoid the extra domain allocations and attach/detach sequences of the
old code. This allows properly detecting an actual attach error, like
-ENOMEM, vs treating all attach errors as an incompatible domain.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 334 +++++++++++++++++---------------
 1 file changed, 180 insertions(+), 154 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 3b63a5a237c9..51d29b8780cd 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2163,14 +2163,179 @@ static int vfio_iommu_domain_alloc(struct device *dev, void *data)
 	return 1; /* Don't iterate */
 }
 
+static struct vfio_domain *
+vfio_iommu_alloc_attach_domain(struct vfio_iommu *iommu,
+			       struct vfio_iommu_group *group,
+			       struct list_head *group_resv_regions)
+{
+	struct iommu_domain *new_domain;
+	struct vfio_domain *domain;
+	phys_addr_t resv_msi_base;
+	int ret = 0;
+
+	/* Try to match an existing compatible domain */
+	list_for_each_entry (domain, &iommu->domain_list, next) {
+		ret = iommu_attach_group(domain->domain, group->iommu_group);
+		/* -EMEDIUMTYPE means an incompatible domain, so try next one */
+		if (ret == -EMEDIUMTYPE)
+			continue;
+		if (ret)
+			return ERR_PTR(ret);
+		goto done;
+	}
+
+	/*
+	 * Going via the iommu_group iterator avoids races, and trivially gives
+	 * us a representative device for the IOMMU API call. We don't actually
+	 * want to iterate beyond the first device (if any).
+	 */
+	iommu_group_for_each_dev(group->iommu_group, &new_domain,
+				 vfio_iommu_domain_alloc);
+	if (!new_domain)
+		return ERR_PTR(-EIO);
+
+	if (iommu->nesting) {
+		ret = iommu_enable_nesting(new_domain);
+		if (ret)
+			goto out_free_iommu_domain;
+	}
+
+	ret = iommu_attach_group(new_domain, group->iommu_group);
+	if (ret)
+		goto out_free_iommu_domain;
+
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (!domain) {
+		ret = -ENOMEM;
+		goto out_detach;
+	}
+
+	domain->domain = new_domain;
+	vfio_test_domain_fgsp(domain);
+
+	/*
+	 * If the IOMMU can block non-coherent operations (ie PCIe TLPs with
+	 * no-snoop set) then VFIO always turns this feature on because on Intel
+	 * platforms it optimizes KVM to disable wbinvd emulation.
+	 */
+	if (new_domain->ops->enforce_cache_coherency)
+		domain->enforce_cache_coherency =
+			new_domain->ops->enforce_cache_coherency(new_domain);
+
+	/* replay mappings on new domains */
+	ret = vfio_iommu_replay(iommu, domain);
+	if (ret)
+		goto out_free_domain;
+
+	if (vfio_iommu_has_sw_msi(group_resv_regions, &resv_msi_base)) {
+		ret = iommu_get_msi_cookie(domain->domain, resv_msi_base);
+		if (ret && ret != -ENODEV)
+			goto out_free_domain;
+	}
+
+	INIT_LIST_HEAD(&domain->group_list);
+	list_add(&domain->next, &iommu->domain_list);
+	vfio_update_pgsize_bitmap(iommu);
+
+done:
+	list_add(&group->next, &domain->group_list);
+
+	/*
+	 * An iommu backed group can dirty memory directly and therefore
+	 * demotes the iommu scope until it declares itself dirty tracking
+	 * capable via the page pinning interface.
+	 */
+	iommu->num_non_pinned_groups++;
+
+	return domain;
+
+out_free_domain:
+	kfree(domain);
+out_detach:
+	iommu_detach_group(new_domain, group->iommu_group);
+out_free_iommu_domain:
+	iommu_domain_free(new_domain);
+	return ERR_PTR(ret);
+}
+
+static void vfio_iommu_unmap_unpin_all(struct vfio_iommu *iommu)
+{
+	struct rb_node *node;
+
+	while ((node = rb_first(&iommu->dma_list)))
+		vfio_remove_dma(iommu, rb_entry(node, struct vfio_dma, node));
+}
+
+static void vfio_iommu_unmap_unpin_reaccount(struct vfio_iommu *iommu)
+{
+	struct rb_node *n, *p;
+
+	n = rb_first(&iommu->dma_list);
+	for (; n; n = rb_next(n)) {
+		struct vfio_dma *dma;
+		long locked = 0, unlocked = 0;
+
+		dma = rb_entry(n, struct vfio_dma, node);
+		unlocked += vfio_unmap_unpin(iommu, dma, false);
+		p = rb_first(&dma->pfn_list);
+		for (; p; p = rb_next(p)) {
+			struct vfio_pfn *vpfn = rb_entry(p, struct vfio_pfn,
+							 node);
+
+			if (!is_invalid_reserved_pfn(vpfn->pfn))
+				locked++;
+		}
+		vfio_lock_acct(dma, locked - unlocked, true);
+	}
+}
+
+static void vfio_iommu_detach_destroy_domain(struct vfio_domain *domain,
+					     struct vfio_iommu *iommu,
+					     struct vfio_iommu_group *group)
+{
+	iommu_detach_group(domain->domain, group->iommu_group);
+	list_del(&group->next);
+	if (!list_empty(&domain->group_list))
+		goto out_dirty;
+
+	/*
+	 * Group ownership provides privilege, if the group list is empty, the
+	 * domain goes away. If it's the last domain with iommu and external
+	 * domain doesn't exist, then all the mappings go away too. If it's the
+	 * last domain with iommu and external domain exist, update accounting
+	 */
+	if (list_is_singular(&iommu->domain_list)) {
+		if (list_empty(&iommu->emulated_iommu_groups)) {
+			WARN_ON(!list_empty(&iommu->device_list));
+			vfio_iommu_unmap_unpin_all(iommu);
+		} else {
+			vfio_iommu_unmap_unpin_reaccount(iommu);
+		}
+	}
+	iommu_domain_free(domain->domain);
+	list_del(&domain->next);
+	kfree(domain);
+	vfio_update_pgsize_bitmap(iommu);
+
+out_dirty:
+	/*
+	 * Removal of a group without dirty tracking may allow the iommu scope
+	 * to be promoted.
+	 */
+	if (!group->pinned_page_dirty_scope) {
+		iommu->num_non_pinned_groups--;
+		if (iommu->dirty_page_tracking)
+			vfio_iommu_populate_bitmap_full(iommu);
+	}
+}
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 		struct iommu_group *iommu_group, enum vfio_group_type type)
 {
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_iommu_group *group;
-	struct vfio_domain *domain, *d;
-	bool resv_msi, msi_remap;
-	phys_addr_t resv_msi_base = 0;
+	struct vfio_domain *domain;
+	bool msi_remap;
 	struct iommu_domain_geometry *geo;
 	LIST_HEAD(iova_copy);
 	LIST_HEAD(group_resv_regions);
@@ -2201,32 +2366,17 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_unlock;
 	}
 
-	ret = -ENOMEM;
-	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
-	if (!domain)
+	ret = iommu_get_group_resv_regions(iommu_group, &group_resv_regions);
+	if (ret)
 		goto out_free_group;
 
-	/*
-	 * Going via the iommu_group iterator avoids races, and trivially gives
-	 * us a representative device for the IOMMU API call. We don't actually
-	 * want to iterate beyond the first device (if any).
-	 */
-	ret = -EIO;
-	iommu_group_for_each_dev(iommu_group, &domain->domain,
-				 vfio_iommu_domain_alloc);
-	if (!domain->domain)
-		goto out_free_domain;
-
-	if (iommu->nesting) {
-		ret = iommu_enable_nesting(domain->domain);
-		if (ret)
-			goto out_domain;
+	domain = vfio_iommu_alloc_attach_domain(iommu, group,
+						&group_resv_regions);
+	if (IS_ERR(domain)) {
+		ret = PTR_ERR(domain);
+		goto out_free_group;
 	}
 
-	ret = iommu_attach_group(domain->domain, group->iommu_group);
-	if (ret)
-		goto out_domain;
-
 	/* Get aperture info */
 	geo = &domain->domain->geometry;
 	if (vfio_iommu_aper_conflict(iommu, geo->aperture_start,
@@ -2235,10 +2385,6 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_detach;
 	}
 
-	ret = iommu_get_group_resv_regions(iommu_group, &group_resv_regions);
-	if (ret)
-		goto out_detach;
-
 	if (vfio_iommu_resv_conflict(iommu, &group_resv_regions)) {
 		ret = -EINVAL;
 		goto out_detach;
@@ -2262,11 +2408,6 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	if (ret)
 		goto out_detach;
 
-	resv_msi = vfio_iommu_has_sw_msi(&group_resv_regions, &resv_msi_base);
-
-	INIT_LIST_HEAD(&domain->group_list);
-	list_add(&group->next, &domain->group_list);
-
 	msi_remap = irq_domain_check_msi_remap() ||
 		    iommu_group_for_each_dev(iommu_group, (void *)IOMMU_CAP_INTR_REMAP,
 					     vfio_iommu_device_capable);
@@ -2278,107 +2419,25 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_detach;
 	}
 
-	/*
-	 * If the IOMMU can block non-coherent operations (ie PCIe TLPs with
-	 * no-snoop set) then VFIO always turns this feature on because on Intel
-	 * platforms it optimizes KVM to disable wbinvd emulation.
-	 */
-	if (domain->domain->ops->enforce_cache_coherency)
-		domain->enforce_cache_coherency =
-			domain->domain->ops->enforce_cache_coherency(
-				domain->domain);
-
-	/* Try to match an existing compatible domain */
-	list_for_each_entry(d, &iommu->domain_list, next) {
-		iommu_detach_group(domain->domain, group->iommu_group);
-		if (!iommu_attach_group(d->domain, group->iommu_group)) {
-			list_add(&group->next, &d->group_list);
-			iommu_domain_free(domain->domain);
-			kfree(domain);
-			goto done;
-		}
-
-		ret = iommu_attach_group(domain->domain,  group->iommu_group);
-		if (ret)
-			goto out_domain;
-	}
-
-	vfio_test_domain_fgsp(domain);
-
-	/* replay mappings on new domains */
-	ret = vfio_iommu_replay(iommu, domain);
-	if (ret)
-		goto out_detach;
-
-	if (resv_msi) {
-		ret = iommu_get_msi_cookie(domain->domain, resv_msi_base);
-		if (ret && ret != -ENODEV)
-			goto out_detach;
-	}
-
-	list_add(&domain->next, &iommu->domain_list);
-	vfio_update_pgsize_bitmap(iommu);
-done:
 	/* Delete the old one and insert new iova list */
 	vfio_iommu_iova_insert_copy(iommu, &iova_copy);
 
-	/*
-	 * An iommu backed group can dirty memory directly and therefore
-	 * demotes the iommu scope until it declares itself dirty tracking
-	 * capable via the page pinning interface.
-	 */
-	iommu->num_non_pinned_groups++;
 	mutex_unlock(&iommu->lock);
 	vfio_iommu_resv_free(&group_resv_regions);
 
 	return 0;
 
 out_detach:
-	iommu_detach_group(domain->domain, group->iommu_group);
-out_domain:
-	iommu_domain_free(domain->domain);
-	vfio_iommu_iova_free(&iova_copy);
-	vfio_iommu_resv_free(&group_resv_regions);
-out_free_domain:
-	kfree(domain);
+	vfio_iommu_detach_destroy_domain(domain, iommu, group);
 out_free_group:
 	kfree(group);
 out_unlock:
 	mutex_unlock(&iommu->lock);
+	vfio_iommu_iova_free(&iova_copy);
+	vfio_iommu_resv_free(&group_resv_regions);
 	return ret;
 }
 
-static void vfio_iommu_unmap_unpin_all(struct vfio_iommu *iommu)
-{
-	struct rb_node *node;
-
-	while ((node = rb_first(&iommu->dma_list)))
-		vfio_remove_dma(iommu, rb_entry(node, struct vfio_dma, node));
-}
-
-static void vfio_iommu_unmap_unpin_reaccount(struct vfio_iommu *iommu)
-{
-	struct rb_node *n, *p;
-
-	n = rb_first(&iommu->dma_list);
-	for (; n; n = rb_next(n)) {
-		struct vfio_dma *dma;
-		long locked = 0, unlocked = 0;
-
-		dma = rb_entry(n, struct vfio_dma, node);
-		unlocked += vfio_unmap_unpin(iommu, dma, false);
-		p = rb_first(&dma->pfn_list);
-		for (; p; p = rb_next(p)) {
-			struct vfio_pfn *vpfn = rb_entry(p, struct vfio_pfn,
-							 node);
-
-			if (!is_invalid_reserved_pfn(vpfn->pfn))
-				locked++;
-		}
-		vfio_lock_acct(dma, locked - unlocked, true);
-	}
-}
-
 /*
  * Called when a domain is removed in detach. It is possible that
  * the removed domain decided the iova aperture window. Modify the
@@ -2493,45 +2552,12 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 		group = find_iommu_group(domain, iommu_group);
 		if (!group)
 			continue;
-
-		iommu_detach_group(domain->domain, group->iommu_group);
-		list_del(&group->next);
-		/*
-		 * Group ownership provides privilege, if the group list is
-		 * empty, the domain goes away. If it's the last domain with
-		 * iommu and external domain doesn't exist, then all the
-		 * mappings go away too. If it's the last domain with iommu and
-		 * external domain exist, update accounting
-		 */
-		if (list_empty(&domain->group_list)) {
-			if (list_is_singular(&iommu->domain_list)) {
-				if (list_empty(&iommu->emulated_iommu_groups)) {
-					WARN_ON(!list_empty(
-						&iommu->device_list));
-					vfio_iommu_unmap_unpin_all(iommu);
-				} else {
-					vfio_iommu_unmap_unpin_reaccount(iommu);
-				}
-			}
-			iommu_domain_free(domain->domain);
-			list_del(&domain->next);
-			kfree(domain);
-			vfio_iommu_aper_expand(iommu, &iova_copy);
-			vfio_update_pgsize_bitmap(iommu);
-		}
-		/*
-		 * Removal of a group without dirty tracking may allow
-		 * the iommu scope to be promoted.
-		 */
-		if (!group->pinned_page_dirty_scope) {
-			iommu->num_non_pinned_groups--;
-			if (iommu->dirty_page_tracking)
-				vfio_iommu_populate_bitmap_full(iommu);
-		}
+		vfio_iommu_detach_destroy_domain(domain, iommu, group);
 		kfree(group);
 		break;
 	}
 
+	vfio_iommu_aper_expand(iommu, &iova_copy);
 	if (!vfio_iommu_resv_refresh(iommu, &iova_copy))
 		vfio_iommu_iova_insert_copy(iommu, &iova_copy);
 	else
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v6 5/5] vfio/iommu_type1: Simplify group attachment
@ 2022-08-15 18:14   ` Nicolin Chen
  0 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-08-15 18:14 UTC (permalink / raw)
  To: joro, will, robin.murphy, alex.williamson
  Cc: marcan, mjrosato, virtualization, thierry.reding, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu, jgg,
	yangyingliang, orsonzhai, gerald.schaefer, kevin.tian, sven,
	linux-arm-msm, christophe.jaillet, baolin.wang, thunder.leizhen,
	linux-tegra, tglx, linux-arm-kernel, linux-s390, cohuck,
	linux-kernel, shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, dwmw2, baolu.lu

Un-inline the domain specific logic from the attach/detach_group ops into
two paired functions vfio_iommu_alloc_attach_domain() and
vfio_iommu_detach_destroy_domain() that strictly deal with creating and
destroying struct vfio_domains.

Add the logic to check for EMEDIUMTYPE return code of iommu_attach_group()
and avoid the extra domain allocations and attach/detach sequences of the
old code. This allows properly detecting an actual attach error, like
-ENOMEM, vs treating all attach errors as an incompatible domain.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 334 +++++++++++++++++---------------
 1 file changed, 180 insertions(+), 154 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 3b63a5a237c9..51d29b8780cd 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2163,14 +2163,179 @@ static int vfio_iommu_domain_alloc(struct device *dev, void *data)
 	return 1; /* Don't iterate */
 }
 
+static struct vfio_domain *
+vfio_iommu_alloc_attach_domain(struct vfio_iommu *iommu,
+			       struct vfio_iommu_group *group,
+			       struct list_head *group_resv_regions)
+{
+	struct iommu_domain *new_domain;
+	struct vfio_domain *domain;
+	phys_addr_t resv_msi_base;
+	int ret = 0;
+
+	/* Try to match an existing compatible domain */
+	list_for_each_entry (domain, &iommu->domain_list, next) {
+		ret = iommu_attach_group(domain->domain, group->iommu_group);
+		/* -EMEDIUMTYPE means an incompatible domain, so try next one */
+		if (ret == -EMEDIUMTYPE)
+			continue;
+		if (ret)
+			return ERR_PTR(ret);
+		goto done;
+	}
+
+	/*
+	 * Going via the iommu_group iterator avoids races, and trivially gives
+	 * us a representative device for the IOMMU API call. We don't actually
+	 * want to iterate beyond the first device (if any).
+	 */
+	iommu_group_for_each_dev(group->iommu_group, &new_domain,
+				 vfio_iommu_domain_alloc);
+	if (!new_domain)
+		return ERR_PTR(-EIO);
+
+	if (iommu->nesting) {
+		ret = iommu_enable_nesting(new_domain);
+		if (ret)
+			goto out_free_iommu_domain;
+	}
+
+	ret = iommu_attach_group(new_domain, group->iommu_group);
+	if (ret)
+		goto out_free_iommu_domain;
+
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (!domain) {
+		ret = -ENOMEM;
+		goto out_detach;
+	}
+
+	domain->domain = new_domain;
+	vfio_test_domain_fgsp(domain);
+
+	/*
+	 * If the IOMMU can block non-coherent operations (ie PCIe TLPs with
+	 * no-snoop set) then VFIO always turns this feature on because on Intel
+	 * platforms it optimizes KVM to disable wbinvd emulation.
+	 */
+	if (new_domain->ops->enforce_cache_coherency)
+		domain->enforce_cache_coherency =
+			new_domain->ops->enforce_cache_coherency(new_domain);
+
+	/* replay mappings on new domains */
+	ret = vfio_iommu_replay(iommu, domain);
+	if (ret)
+		goto out_free_domain;
+
+	if (vfio_iommu_has_sw_msi(group_resv_regions, &resv_msi_base)) {
+		ret = iommu_get_msi_cookie(domain->domain, resv_msi_base);
+		if (ret && ret != -ENODEV)
+			goto out_free_domain;
+	}
+
+	INIT_LIST_HEAD(&domain->group_list);
+	list_add(&domain->next, &iommu->domain_list);
+	vfio_update_pgsize_bitmap(iommu);
+
+done:
+	list_add(&group->next, &domain->group_list);
+
+	/*
+	 * An iommu backed group can dirty memory directly and therefore
+	 * demotes the iommu scope until it declares itself dirty tracking
+	 * capable via the page pinning interface.
+	 */
+	iommu->num_non_pinned_groups++;
+
+	return domain;
+
+out_free_domain:
+	kfree(domain);
+out_detach:
+	iommu_detach_group(new_domain, group->iommu_group);
+out_free_iommu_domain:
+	iommu_domain_free(new_domain);
+	return ERR_PTR(ret);
+}
+
+static void vfio_iommu_unmap_unpin_all(struct vfio_iommu *iommu)
+{
+	struct rb_node *node;
+
+	while ((node = rb_first(&iommu->dma_list)))
+		vfio_remove_dma(iommu, rb_entry(node, struct vfio_dma, node));
+}
+
+static void vfio_iommu_unmap_unpin_reaccount(struct vfio_iommu *iommu)
+{
+	struct rb_node *n, *p;
+
+	n = rb_first(&iommu->dma_list);
+	for (; n; n = rb_next(n)) {
+		struct vfio_dma *dma;
+		long locked = 0, unlocked = 0;
+
+		dma = rb_entry(n, struct vfio_dma, node);
+		unlocked += vfio_unmap_unpin(iommu, dma, false);
+		p = rb_first(&dma->pfn_list);
+		for (; p; p = rb_next(p)) {
+			struct vfio_pfn *vpfn = rb_entry(p, struct vfio_pfn,
+							 node);
+
+			if (!is_invalid_reserved_pfn(vpfn->pfn))
+				locked++;
+		}
+		vfio_lock_acct(dma, locked - unlocked, true);
+	}
+}
+
+static void vfio_iommu_detach_destroy_domain(struct vfio_domain *domain,
+					     struct vfio_iommu *iommu,
+					     struct vfio_iommu_group *group)
+{
+	iommu_detach_group(domain->domain, group->iommu_group);
+	list_del(&group->next);
+	if (!list_empty(&domain->group_list))
+		goto out_dirty;
+
+	/*
+	 * Group ownership provides privilege, if the group list is empty, the
+	 * domain goes away. If it's the last domain with iommu and external
+	 * domain doesn't exist, then all the mappings go away too. If it's the
+	 * last domain with iommu and external domain exist, update accounting
+	 */
+	if (list_is_singular(&iommu->domain_list)) {
+		if (list_empty(&iommu->emulated_iommu_groups)) {
+			WARN_ON(!list_empty(&iommu->device_list));
+			vfio_iommu_unmap_unpin_all(iommu);
+		} else {
+			vfio_iommu_unmap_unpin_reaccount(iommu);
+		}
+	}
+	iommu_domain_free(domain->domain);
+	list_del(&domain->next);
+	kfree(domain);
+	vfio_update_pgsize_bitmap(iommu);
+
+out_dirty:
+	/*
+	 * Removal of a group without dirty tracking may allow the iommu scope
+	 * to be promoted.
+	 */
+	if (!group->pinned_page_dirty_scope) {
+		iommu->num_non_pinned_groups--;
+		if (iommu->dirty_page_tracking)
+			vfio_iommu_populate_bitmap_full(iommu);
+	}
+}
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 		struct iommu_group *iommu_group, enum vfio_group_type type)
 {
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_iommu_group *group;
-	struct vfio_domain *domain, *d;
-	bool resv_msi, msi_remap;
-	phys_addr_t resv_msi_base = 0;
+	struct vfio_domain *domain;
+	bool msi_remap;
 	struct iommu_domain_geometry *geo;
 	LIST_HEAD(iova_copy);
 	LIST_HEAD(group_resv_regions);
@@ -2201,32 +2366,17 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_unlock;
 	}
 
-	ret = -ENOMEM;
-	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
-	if (!domain)
+	ret = iommu_get_group_resv_regions(iommu_group, &group_resv_regions);
+	if (ret)
 		goto out_free_group;
 
-	/*
-	 * Going via the iommu_group iterator avoids races, and trivially gives
-	 * us a representative device for the IOMMU API call. We don't actually
-	 * want to iterate beyond the first device (if any).
-	 */
-	ret = -EIO;
-	iommu_group_for_each_dev(iommu_group, &domain->domain,
-				 vfio_iommu_domain_alloc);
-	if (!domain->domain)
-		goto out_free_domain;
-
-	if (iommu->nesting) {
-		ret = iommu_enable_nesting(domain->domain);
-		if (ret)
-			goto out_domain;
+	domain = vfio_iommu_alloc_attach_domain(iommu, group,
+						&group_resv_regions);
+	if (IS_ERR(domain)) {
+		ret = PTR_ERR(domain);
+		goto out_free_group;
 	}
 
-	ret = iommu_attach_group(domain->domain, group->iommu_group);
-	if (ret)
-		goto out_domain;
-
 	/* Get aperture info */
 	geo = &domain->domain->geometry;
 	if (vfio_iommu_aper_conflict(iommu, geo->aperture_start,
@@ -2235,10 +2385,6 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_detach;
 	}
 
-	ret = iommu_get_group_resv_regions(iommu_group, &group_resv_regions);
-	if (ret)
-		goto out_detach;
-
 	if (vfio_iommu_resv_conflict(iommu, &group_resv_regions)) {
 		ret = -EINVAL;
 		goto out_detach;
@@ -2262,11 +2408,6 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	if (ret)
 		goto out_detach;
 
-	resv_msi = vfio_iommu_has_sw_msi(&group_resv_regions, &resv_msi_base);
-
-	INIT_LIST_HEAD(&domain->group_list);
-	list_add(&group->next, &domain->group_list);
-
 	msi_remap = irq_domain_check_msi_remap() ||
 		    iommu_group_for_each_dev(iommu_group, (void *)IOMMU_CAP_INTR_REMAP,
 					     vfio_iommu_device_capable);
@@ -2278,107 +2419,25 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_detach;
 	}
 
-	/*
-	 * If the IOMMU can block non-coherent operations (ie PCIe TLPs with
-	 * no-snoop set) then VFIO always turns this feature on because on Intel
-	 * platforms it optimizes KVM to disable wbinvd emulation.
-	 */
-	if (domain->domain->ops->enforce_cache_coherency)
-		domain->enforce_cache_coherency =
-			domain->domain->ops->enforce_cache_coherency(
-				domain->domain);
-
-	/* Try to match an existing compatible domain */
-	list_for_each_entry(d, &iommu->domain_list, next) {
-		iommu_detach_group(domain->domain, group->iommu_group);
-		if (!iommu_attach_group(d->domain, group->iommu_group)) {
-			list_add(&group->next, &d->group_list);
-			iommu_domain_free(domain->domain);
-			kfree(domain);
-			goto done;
-		}
-
-		ret = iommu_attach_group(domain->domain,  group->iommu_group);
-		if (ret)
-			goto out_domain;
-	}
-
-	vfio_test_domain_fgsp(domain);
-
-	/* replay mappings on new domains */
-	ret = vfio_iommu_replay(iommu, domain);
-	if (ret)
-		goto out_detach;
-
-	if (resv_msi) {
-		ret = iommu_get_msi_cookie(domain->domain, resv_msi_base);
-		if (ret && ret != -ENODEV)
-			goto out_detach;
-	}
-
-	list_add(&domain->next, &iommu->domain_list);
-	vfio_update_pgsize_bitmap(iommu);
-done:
 	/* Delete the old one and insert new iova list */
 	vfio_iommu_iova_insert_copy(iommu, &iova_copy);
 
-	/*
-	 * An iommu backed group can dirty memory directly and therefore
-	 * demotes the iommu scope until it declares itself dirty tracking
-	 * capable via the page pinning interface.
-	 */
-	iommu->num_non_pinned_groups++;
 	mutex_unlock(&iommu->lock);
 	vfio_iommu_resv_free(&group_resv_regions);
 
 	return 0;
 
 out_detach:
-	iommu_detach_group(domain->domain, group->iommu_group);
-out_domain:
-	iommu_domain_free(domain->domain);
-	vfio_iommu_iova_free(&iova_copy);
-	vfio_iommu_resv_free(&group_resv_regions);
-out_free_domain:
-	kfree(domain);
+	vfio_iommu_detach_destroy_domain(domain, iommu, group);
 out_free_group:
 	kfree(group);
 out_unlock:
 	mutex_unlock(&iommu->lock);
+	vfio_iommu_iova_free(&iova_copy);
+	vfio_iommu_resv_free(&group_resv_regions);
 	return ret;
 }
 
-static void vfio_iommu_unmap_unpin_all(struct vfio_iommu *iommu)
-{
-	struct rb_node *node;
-
-	while ((node = rb_first(&iommu->dma_list)))
-		vfio_remove_dma(iommu, rb_entry(node, struct vfio_dma, node));
-}
-
-static void vfio_iommu_unmap_unpin_reaccount(struct vfio_iommu *iommu)
-{
-	struct rb_node *n, *p;
-
-	n = rb_first(&iommu->dma_list);
-	for (; n; n = rb_next(n)) {
-		struct vfio_dma *dma;
-		long locked = 0, unlocked = 0;
-
-		dma = rb_entry(n, struct vfio_dma, node);
-		unlocked += vfio_unmap_unpin(iommu, dma, false);
-		p = rb_first(&dma->pfn_list);
-		for (; p; p = rb_next(p)) {
-			struct vfio_pfn *vpfn = rb_entry(p, struct vfio_pfn,
-							 node);
-
-			if (!is_invalid_reserved_pfn(vpfn->pfn))
-				locked++;
-		}
-		vfio_lock_acct(dma, locked - unlocked, true);
-	}
-}
-
 /*
  * Called when a domain is removed in detach. It is possible that
  * the removed domain decided the iova aperture window. Modify the
@@ -2493,45 +2552,12 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 		group = find_iommu_group(domain, iommu_group);
 		if (!group)
 			continue;
-
-		iommu_detach_group(domain->domain, group->iommu_group);
-		list_del(&group->next);
-		/*
-		 * Group ownership provides privilege, if the group list is
-		 * empty, the domain goes away. If it's the last domain with
-		 * iommu and external domain doesn't exist, then all the
-		 * mappings go away too. If it's the last domain with iommu and
-		 * external domain exist, update accounting
-		 */
-		if (list_empty(&domain->group_list)) {
-			if (list_is_singular(&iommu->domain_list)) {
-				if (list_empty(&iommu->emulated_iommu_groups)) {
-					WARN_ON(!list_empty(
-						&iommu->device_list));
-					vfio_iommu_unmap_unpin_all(iommu);
-				} else {
-					vfio_iommu_unmap_unpin_reaccount(iommu);
-				}
-			}
-			iommu_domain_free(domain->domain);
-			list_del(&domain->next);
-			kfree(domain);
-			vfio_iommu_aper_expand(iommu, &iova_copy);
-			vfio_update_pgsize_bitmap(iommu);
-		}
-		/*
-		 * Removal of a group without dirty tracking may allow
-		 * the iommu scope to be promoted.
-		 */
-		if (!group->pinned_page_dirty_scope) {
-			iommu->num_non_pinned_groups--;
-			if (iommu->dirty_page_tracking)
-				vfio_iommu_populate_bitmap_full(iommu);
-		}
+		vfio_iommu_detach_destroy_domain(domain, iommu, group);
 		kfree(group);
 		break;
 	}
 
+	vfio_iommu_aper_expand(iommu, &iova_copy);
 	if (!vfio_iommu_resv_refresh(iommu, &iova_copy))
 		vfio_iommu_iova_insert_copy(iommu, &iova_copy);
 	else
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-08-15 18:14   ` Nicolin Chen
  (?)
@ 2022-09-07 12:41     ` Joerg Roedel
  -1 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-09-07 12:41 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: will, robin.murphy, alex.williamson, suravee.suthikulpanit,
	marcan, sven, alyssa, robdclark, dwmw2, baolu.lu, mjrosato,
	gerald.schaefer, orsonzhai, baolin.wang, zhang.lyra,
	thierry.reding, vdumpa, jonathanh, jean-philippe, cohuck, jgg,
	tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
> Provide a dedicated errno from the IOMMU driver during attach that the
> reason attached failed is because of domain incompatability. EMEDIUMTYPE
> is chosen because it is never used within the iommu subsystem today and
> evokes a sense that the 'medium' aka the domain is incompatible.

I am not a fan of re-using EMEDIUMTYPE or any other special value. What
is needed here in EINVAL, but with a way to tell the caller which of the
function parameters is actually invalid.

For that I prefer adding an additional pointer parameter to the attach
functions in which the reason for the failure can be communicated up the
chain.

For the top-level iommu_attach_device() function I am okay with having a
special version which has this additional paremter.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-07 12:41     ` Joerg Roedel
  0 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-09-07 12:41 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	jgg, yangyingliang, orsonzhai, gerald.schaefer, sven,
	linux-arm-msm, christophe.jaillet, baolin.wang, thunder.leizhen,
	linux-tegra, tglx, virtualization, linux-arm-kernel, dwmw2,
	cohuck, vdumpa, shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, robin.murphy, baolu.lu

On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
> Provide a dedicated errno from the IOMMU driver during attach that the
> reason attached failed is because of domain incompatability. EMEDIUMTYPE
> is chosen because it is never used within the iommu subsystem today and
> evokes a sense that the 'medium' aka the domain is incompatible.

I am not a fan of re-using EMEDIUMTYPE or any other special value. What
is needed here in EINVAL, but with a way to tell the caller which of the
function parameters is actually invalid.

For that I prefer adding an additional pointer parameter to the attach
functions in which the reason for the failure can be communicated up the
chain.

For the top-level iommu_attach_device() function I am okay with having a
special version which has this additional paremter.

Regards,

	Joerg
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-07 12:41     ` Joerg Roedel
  0 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-09-07 12:41 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	jgg, yangyingliang, orsonzhai, gerald.schaefer, kevin.tian, sven,
	linux-arm-msm, alex.williamson, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, robin.murphy, baolu.lu

On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
> Provide a dedicated errno from the IOMMU driver during attach that the
> reason attached failed is because of domain incompatability. EMEDIUMTYPE
> is chosen because it is never used within the iommu subsystem today and
> evokes a sense that the 'medium' aka the domain is incompatible.

I am not a fan of re-using EMEDIUMTYPE or any other special value. What
is needed here in EINVAL, but with a way to tell the caller which of the
function parameters is actually invalid.

For that I prefer adding an additional pointer parameter to the attach
functions in which the reason for the failure can be communicated up the
chain.

For the top-level iommu_attach_device() function I am okay with having a
special version which has this additional paremter.

Regards,

	Joerg

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-07 12:41     ` Joerg Roedel
@ 2022-09-07 13:47       ` Jason Gunthorpe
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-07 13:47 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Nicolin Chen, will, robin.murphy, alex.williamson,
	suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

On Wed, Sep 07, 2022 at 02:41:54PM +0200, Joerg Roedel wrote:
> On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
> > Provide a dedicated errno from the IOMMU driver during attach that the
> > reason attached failed is because of domain incompatability. EMEDIUMTYPE
> > is chosen because it is never used within the iommu subsystem today and
> > evokes a sense that the 'medium' aka the domain is incompatible.
> 
> I am not a fan of re-using EMEDIUMTYPE or any other special value. What
> is needed here in EINVAL, but with a way to tell the caller which of the
> function parameters is actually invalid.

Using errnos to indicate the nature of failure is a well established
unix practice, it is why we have hundreds of error codes and don't
just return -EINVAL for everything.

What don't you like about it?

Would you be happier if we wrote it like

 #define IOMMU_EINCOMPATIBLE_DEVICE xx

Which tells "which of the function parameters is actually invalid" ?

> For that I prefer adding an additional pointer parameter to the attach
> functions in which the reason for the failure can be communicated up the
> chain.

That sounds like OS/2 :\

Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-07 13:47       ` Jason Gunthorpe
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-07 13:47 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer,
	kevin.tian, sven, linux-arm-msm, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, robin.murphy, baolu.lu

On Wed, Sep 07, 2022 at 02:41:54PM +0200, Joerg Roedel wrote:
> On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
> > Provide a dedicated errno from the IOMMU driver during attach that the
> > reason attached failed is because of domain incompatability. EMEDIUMTYPE
> > is chosen because it is never used within the iommu subsystem today and
> > evokes a sense that the 'medium' aka the domain is incompatible.
> 
> I am not a fan of re-using EMEDIUMTYPE or any other special value. What
> is needed here in EINVAL, but with a way to tell the caller which of the
> function parameters is actually invalid.

Using errnos to indicate the nature of failure is a well established
unix practice, it is why we have hundreds of error codes and don't
just return -EINVAL for everything.

What don't you like about it?

Would you be happier if we wrote it like

 #define IOMMU_EINCOMPATIBLE_DEVICE xx

Which tells "which of the function parameters is actually invalid" ?

> For that I prefer adding an additional pointer parameter to the attach
> functions in which the reason for the failure can be communicated up the
> chain.

That sounds like OS/2 :\

Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-07 13:47       ` Jason Gunthorpe
  (?)
@ 2022-09-07 14:06         ` Joerg Roedel
  -1 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-09-07 14:06 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nicolin Chen, will, robin.murphy, alex.williamson,
	suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

On Wed, Sep 07, 2022 at 10:47:39AM -0300, Jason Gunthorpe wrote:
> Would you be happier if we wrote it like
> 
>  #define IOMMU_EINCOMPATIBLE_DEVICE xx
> 
> Which tells "which of the function parameters is actually invalid" ?

Having done some Rust hacking in the last months, I have to say I like
to concept of error handling with Result<> there. Ideally we have a way
to emulate that in our C code without having to change all callers.

What I am proposing is a way this could be emulated here, but I am open
to other suggestions. Still better than re-using random error codes for
special purposes.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-07 14:06         ` Joerg Roedel
  0 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-09-07 14:06 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer, sven,
	linux-arm-msm, vdumpa, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, robin.murphy, baolu.lu

On Wed, Sep 07, 2022 at 10:47:39AM -0300, Jason Gunthorpe wrote:
> Would you be happier if we wrote it like
> 
>  #define IOMMU_EINCOMPATIBLE_DEVICE xx
> 
> Which tells "which of the function parameters is actually invalid" ?

Having done some Rust hacking in the last months, I have to say I like
to concept of error handling with Result<> there. Ideally we have a way
to emulate that in our C code without having to change all callers.

What I am proposing is a way this could be emulated here, but I am open
to other suggestions. Still better than re-using random error codes for
special purposes.

Regards,

	Joerg
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-07 14:06         ` Joerg Roedel
  0 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-09-07 14:06 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer,
	kevin.tian, sven, linux-arm-msm, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, robin.murphy, baolu.lu

On Wed, Sep 07, 2022 at 10:47:39AM -0300, Jason Gunthorpe wrote:
> Would you be happier if we wrote it like
> 
>  #define IOMMU_EINCOMPATIBLE_DEVICE xx
> 
> Which tells "which of the function parameters is actually invalid" ?

Having done some Rust hacking in the last months, I have to say I like
to concept of error handling with Result<> there. Ideally we have a way
to emulate that in our C code without having to change all callers.

What I am proposing is a way this could be emulated here, but I am open
to other suggestions. Still better than re-using random error codes for
special purposes.

Regards,

	Joerg

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-07 13:47       ` Jason Gunthorpe
  (?)
@ 2022-09-07 14:23         ` Robin Murphy
  -1 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-09-07 14:23 UTC (permalink / raw)
  To: Jason Gunthorpe, Joerg Roedel
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer, sven,
	linux-arm-msm, vdumpa, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, linux-s390, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, dwmw2, baolu.lu

On 2022-09-07 14:47, Jason Gunthorpe wrote:
> On Wed, Sep 07, 2022 at 02:41:54PM +0200, Joerg Roedel wrote:
>> On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
>>> Provide a dedicated errno from the IOMMU driver during attach that the
>>> reason attached failed is because of domain incompatability. EMEDIUMTYPE
>>> is chosen because it is never used within the iommu subsystem today and
>>> evokes a sense that the 'medium' aka the domain is incompatible.
>>
>> I am not a fan of re-using EMEDIUMTYPE or any other special value. What
>> is needed here in EINVAL, but with a way to tell the caller which of the
>> function parameters is actually invalid.
> 
> Using errnos to indicate the nature of failure is a well established
> unix practice, it is why we have hundreds of error codes and don't
> just return -EINVAL for everything.
> 
> What don't you like about it?
> 
> Would you be happier if we wrote it like
> 
>   #define IOMMU_EINCOMPATIBLE_DEVICE xx
> 
> Which tells "which of the function parameters is actually invalid" ?

FWIW, we're now very close to being able to validate dev->iommu against 
where the domain came from in core code, and so short-circuit 
->attach_dev entirely if they don't match. At that point -EINVAL at the 
driver callback level could be assumed to refer to the domain argument, 
while anything else could be taken as something going unexpectedly wrong 
when the attach may otherwise have worked. I've forgotten if we actually 
had a valid case anywhere for "this is my device but even if you retry 
with a different domain it's still never going to work", but I think we 
wouldn't actually need that anyway - it should be clear enough to a 
caller that if attaching to an existing domain fails, then allocating a 
fresh domain and attaching also fails, that's the point to give up.

Robin.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-07 14:23         ` Robin Murphy
  0 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-09-07 14:23 UTC (permalink / raw)
  To: Jason Gunthorpe, Joerg Roedel
  Cc: Nicolin Chen, will, alex.williamson, suravee.suthikulpanit,
	marcan, sven, alyssa, robdclark, dwmw2, baolu.lu, mjrosato,
	gerald.schaefer, orsonzhai, baolin.wang, zhang.lyra,
	thierry.reding, vdumpa, jonathanh, jean-philippe, cohuck, tglx,
	shameerali.kolothum.thodi, thunder.leizhen, christophe.jaillet,
	yangyingliang, jon, iommu, linux-kernel, asahi, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm,
	kevin.tian

On 2022-09-07 14:47, Jason Gunthorpe wrote:
> On Wed, Sep 07, 2022 at 02:41:54PM +0200, Joerg Roedel wrote:
>> On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
>>> Provide a dedicated errno from the IOMMU driver during attach that the
>>> reason attached failed is because of domain incompatability. EMEDIUMTYPE
>>> is chosen because it is never used within the iommu subsystem today and
>>> evokes a sense that the 'medium' aka the domain is incompatible.
>>
>> I am not a fan of re-using EMEDIUMTYPE or any other special value. What
>> is needed here in EINVAL, but with a way to tell the caller which of the
>> function parameters is actually invalid.
> 
> Using errnos to indicate the nature of failure is a well established
> unix practice, it is why we have hundreds of error codes and don't
> just return -EINVAL for everything.
> 
> What don't you like about it?
> 
> Would you be happier if we wrote it like
> 
>   #define IOMMU_EINCOMPATIBLE_DEVICE xx
> 
> Which tells "which of the function parameters is actually invalid" ?

FWIW, we're now very close to being able to validate dev->iommu against 
where the domain came from in core code, and so short-circuit 
->attach_dev entirely if they don't match. At that point -EINVAL at the 
driver callback level could be assumed to refer to the domain argument, 
while anything else could be taken as something going unexpectedly wrong 
when the attach may otherwise have worked. I've forgotten if we actually 
had a valid case anywhere for "this is my device but even if you retry 
with a different domain it's still never going to work", but I think we 
wouldn't actually need that anyway - it should be clear enough to a 
caller that if attaching to an existing domain fails, then allocating a 
fresh domain and attaching also fails, that's the point to give up.

Robin.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-07 14:23         ` Robin Murphy
  0 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-09-07 14:23 UTC (permalink / raw)
  To: Jason Gunthorpe, Joerg Roedel
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer,
	kevin.tian, sven, linux-arm-msm, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, linux-s390, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, dwmw2, baolu.lu

On 2022-09-07 14:47, Jason Gunthorpe wrote:
> On Wed, Sep 07, 2022 at 02:41:54PM +0200, Joerg Roedel wrote:
>> On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
>>> Provide a dedicated errno from the IOMMU driver during attach that the
>>> reason attached failed is because of domain incompatability. EMEDIUMTYPE
>>> is chosen because it is never used within the iommu subsystem today and
>>> evokes a sense that the 'medium' aka the domain is incompatible.
>>
>> I am not a fan of re-using EMEDIUMTYPE or any other special value. What
>> is needed here in EINVAL, but with a way to tell the caller which of the
>> function parameters is actually invalid.
> 
> Using errnos to indicate the nature of failure is a well established
> unix practice, it is why we have hundreds of error codes and don't
> just return -EINVAL for everything.
> 
> What don't you like about it?
> 
> Would you be happier if we wrote it like
> 
>   #define IOMMU_EINCOMPATIBLE_DEVICE xx
> 
> Which tells "which of the function parameters is actually invalid" ?

FWIW, we're now very close to being able to validate dev->iommu against 
where the domain came from in core code, and so short-circuit 
->attach_dev entirely if they don't match. At that point -EINVAL at the 
driver callback level could be assumed to refer to the domain argument, 
while anything else could be taken as something going unexpectedly wrong 
when the attach may otherwise have worked. I've forgotten if we actually 
had a valid case anywhere for "this is my device but even if you retry 
with a different domain it's still never going to work", but I think we 
wouldn't actually need that anyway - it should be clear enough to a 
caller that if attaching to an existing domain fails, then allocating a 
fresh domain and attaching also fails, that's the point to give up.

Robin.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-07 14:23         ` Robin Murphy
@ 2022-09-07 17:00           ` Jason Gunthorpe
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-07 17:00 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Nicolin Chen, will, alex.williamson,
	suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

On Wed, Sep 07, 2022 at 03:23:09PM +0100, Robin Murphy wrote:
> On 2022-09-07 14:47, Jason Gunthorpe wrote:
> > On Wed, Sep 07, 2022 at 02:41:54PM +0200, Joerg Roedel wrote:
> > > On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
> > > > Provide a dedicated errno from the IOMMU driver during attach that the
> > > > reason attached failed is because of domain incompatability. EMEDIUMTYPE
> > > > is chosen because it is never used within the iommu subsystem today and
> > > > evokes a sense that the 'medium' aka the domain is incompatible.
> > > 
> > > I am not a fan of re-using EMEDIUMTYPE or any other special value. What
> > > is needed here in EINVAL, but with a way to tell the caller which of the
> > > function parameters is actually invalid.
> > 
> > Using errnos to indicate the nature of failure is a well established
> > unix practice, it is why we have hundreds of error codes and don't
> > just return -EINVAL for everything.
> > 
> > What don't you like about it?
> > 
> > Would you be happier if we wrote it like
> > 
> >   #define IOMMU_EINCOMPATIBLE_DEVICE xx
> > 
> > Which tells "which of the function parameters is actually invalid" ?
> 
> FWIW, we're now very close to being able to validate dev->iommu against
> where the domain came from in core code, and so short-circuit ->attach_dev
> entirely if they don't match. 

I don't think this is a long term direction. We have systems now with
a number of SMMU blocks and we really are going to see a need that
they share the iommu_domains so we don't have unncessary overheads
from duplicated io page table memory.

So ultimately I'd expect to pass the iommu_domain to the driver and
the driver will decide if the page table memory it represents is
compatible or not. Restricting to only the same iommu instance isn't
good..

> At that point -EINVAL at the driver callback level could be assumed
> to refer to the domain argument, while anything else could be taken
> as something going unexpectedly wrong when the attach may otherwise
> have worked. I've forgotten if we actually had a valid case anywhere
> for "this is my device but even if you retry with a different domain
> it's still never going to work", but I think we wouldn't actually
> need that anyway - it should be clear enough to a caller that if
> attaching to an existing domain fails, then allocating a fresh
> domain and attaching also fails, that's the point to give up.

The point was to have clear error handling, we either have permenent
errors or 'this domain will never work with this device error'.

If we treat all error as temporary and just retry randomly it can
create a mess. For instance we might fail to attach to a perfectly
compatible domain due to ENOMEM or something and then go on to
successfully a create a new 2nd domain, just due to races.

We can certainly code the try everything then allocate scheme, it is
just much more fragile than having definitive error codes.

Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-07 17:00           ` Jason Gunthorpe
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-07 17:00 UTC (permalink / raw)
  To: Robin Murphy
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, Joerg Roedel, jon, jonathanh,
	iommu, Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer,
	kevin.tian, sven, linux-arm-msm, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, linux-s390, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, dwmw2, baolu.lu

On Wed, Sep 07, 2022 at 03:23:09PM +0100, Robin Murphy wrote:
> On 2022-09-07 14:47, Jason Gunthorpe wrote:
> > On Wed, Sep 07, 2022 at 02:41:54PM +0200, Joerg Roedel wrote:
> > > On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
> > > > Provide a dedicated errno from the IOMMU driver during attach that the
> > > > reason attached failed is because of domain incompatability. EMEDIUMTYPE
> > > > is chosen because it is never used within the iommu subsystem today and
> > > > evokes a sense that the 'medium' aka the domain is incompatible.
> > > 
> > > I am not a fan of re-using EMEDIUMTYPE or any other special value. What
> > > is needed here in EINVAL, but with a way to tell the caller which of the
> > > function parameters is actually invalid.
> > 
> > Using errnos to indicate the nature of failure is a well established
> > unix practice, it is why we have hundreds of error codes and don't
> > just return -EINVAL for everything.
> > 
> > What don't you like about it?
> > 
> > Would you be happier if we wrote it like
> > 
> >   #define IOMMU_EINCOMPATIBLE_DEVICE xx
> > 
> > Which tells "which of the function parameters is actually invalid" ?
> 
> FWIW, we're now very close to being able to validate dev->iommu against
> where the domain came from in core code, and so short-circuit ->attach_dev
> entirely if they don't match. 

I don't think this is a long term direction. We have systems now with
a number of SMMU blocks and we really are going to see a need that
they share the iommu_domains so we don't have unncessary overheads
from duplicated io page table memory.

So ultimately I'd expect to pass the iommu_domain to the driver and
the driver will decide if the page table memory it represents is
compatible or not. Restricting to only the same iommu instance isn't
good..

> At that point -EINVAL at the driver callback level could be assumed
> to refer to the domain argument, while anything else could be taken
> as something going unexpectedly wrong when the attach may otherwise
> have worked. I've forgotten if we actually had a valid case anywhere
> for "this is my device but even if you retry with a different domain
> it's still never going to work", but I think we wouldn't actually
> need that anyway - it should be clear enough to a caller that if
> attaching to an existing domain fails, then allocating a fresh
> domain and attaching also fails, that's the point to give up.

The point was to have clear error handling, we either have permenent
errors or 'this domain will never work with this device error'.

If we treat all error as temporary and just retry randomly it can
create a mess. For instance we might fail to attach to a perfectly
compatible domain due to ENOMEM or something and then go on to
successfully a create a new 2nd domain, just due to races.

We can certainly code the try everything then allocate scheme, it is
just much more fragile than having definitive error codes.

Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-07 14:06         ` Joerg Roedel
@ 2022-09-07 17:10           ` Jason Gunthorpe
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-07 17:10 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Nicolin Chen, will, robin.murphy, alex.williamson,
	suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

On Wed, Sep 07, 2022 at 04:06:29PM +0200, Joerg Roedel wrote:
> On Wed, Sep 07, 2022 at 10:47:39AM -0300, Jason Gunthorpe wrote:
> > Would you be happier if we wrote it like
> > 
> >  #define IOMMU_EINCOMPATIBLE_DEVICE xx
> > 
> > Which tells "which of the function parameters is actually invalid" ?
> 
> Having done some Rust hacking in the last months, I have to say I like
> to concept of error handling with Result<> there. Ideally we have a way
> to emulate that in our C code without having to change all callers.

Sure, rust has all sorts of nice things. But the kernel doesn't follow
rust idioms, and I don't think this is a great place to start
experimenting with them.

The unix/linux idiom is return an errno, and define what the errnos
mean for your function. We have a long history of creatively applying
the existing errnos, and sometimes we create new ones.

We rarely return an errno and an additional error code because it
doesn't fit the overall model, I can't return something like that
through a system call, for instance.

> What I am proposing is a way this could be emulated here, but I am open
> to other suggestions. Still better than re-using random error codes for
> special purposes.

I think, in context of Linux as a project, it very much is worse to
make up some rust-inspired error handing and discard the typical
design patterns.

Linux works because, for the most part, people follow similar design
sensibilities throughout the tree.

It has been 3 months since EMEDIUMTYPE was first proposed and 6
iterations of the series, don't you think it is a bit late in the game
to try to experiment with rust error handling idioms?

So, again, would you be happy with a simple 

 #define IOMMU_EINCOMPATIBLE_DEVICE xx

to make it less "re-using random error codes"?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-07 17:10           ` Jason Gunthorpe
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-07 17:10 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer,
	kevin.tian, sven, linux-arm-msm, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, robin.murphy, baolu.lu

On Wed, Sep 07, 2022 at 04:06:29PM +0200, Joerg Roedel wrote:
> On Wed, Sep 07, 2022 at 10:47:39AM -0300, Jason Gunthorpe wrote:
> > Would you be happier if we wrote it like
> > 
> >  #define IOMMU_EINCOMPATIBLE_DEVICE xx
> > 
> > Which tells "which of the function parameters is actually invalid" ?
> 
> Having done some Rust hacking in the last months, I have to say I like
> to concept of error handling with Result<> there. Ideally we have a way
> to emulate that in our C code without having to change all callers.

Sure, rust has all sorts of nice things. But the kernel doesn't follow
rust idioms, and I don't think this is a great place to start
experimenting with them.

The unix/linux idiom is return an errno, and define what the errnos
mean for your function. We have a long history of creatively applying
the existing errnos, and sometimes we create new ones.

We rarely return an errno and an additional error code because it
doesn't fit the overall model, I can't return something like that
through a system call, for instance.

> What I am proposing is a way this could be emulated here, but I am open
> to other suggestions. Still better than re-using random error codes for
> special purposes.

I think, in context of Linux as a project, it very much is worse to
make up some rust-inspired error handing and discard the typical
design patterns.

Linux works because, for the most part, people follow similar design
sensibilities throughout the tree.

It has been 3 months since EMEDIUMTYPE was first proposed and 6
iterations of the series, don't you think it is a bit late in the game
to try to experiment with rust error handling idioms?

So, again, would you be happy with a simple 

 #define IOMMU_EINCOMPATIBLE_DEVICE xx

to make it less "re-using random error codes"?

Thanks,
Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-07 17:00           ` Jason Gunthorpe
  (?)
@ 2022-09-07 19:41             ` Robin Murphy
  -1 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-09-07 19:41 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Joerg Roedel, Nicolin Chen, will, alex.williamson,
	suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

On 2022-09-07 18:00, Jason Gunthorpe wrote:
> On Wed, Sep 07, 2022 at 03:23:09PM +0100, Robin Murphy wrote:
>> On 2022-09-07 14:47, Jason Gunthorpe wrote:
>>> On Wed, Sep 07, 2022 at 02:41:54PM +0200, Joerg Roedel wrote:
>>>> On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
>>>>> Provide a dedicated errno from the IOMMU driver during attach that the
>>>>> reason attached failed is because of domain incompatability. EMEDIUMTYPE
>>>>> is chosen because it is never used within the iommu subsystem today and
>>>>> evokes a sense that the 'medium' aka the domain is incompatible.
>>>>
>>>> I am not a fan of re-using EMEDIUMTYPE or any other special value. What
>>>> is needed here in EINVAL, but with a way to tell the caller which of the
>>>> function parameters is actually invalid.
>>>
>>> Using errnos to indicate the nature of failure is a well established
>>> unix practice, it is why we have hundreds of error codes and don't
>>> just return -EINVAL for everything.
>>>
>>> What don't you like about it?
>>>
>>> Would you be happier if we wrote it like
>>>
>>>    #define IOMMU_EINCOMPATIBLE_DEVICE xx
>>>
>>> Which tells "which of the function parameters is actually invalid" ?
>>
>> FWIW, we're now very close to being able to validate dev->iommu against
>> where the domain came from in core code, and so short-circuit ->attach_dev
>> entirely if they don't match.
> 
> I don't think this is a long term direction. We have systems now with
> a number of SMMU blocks and we really are going to see a need that
> they share the iommu_domains so we don't have unncessary overheads
> from duplicated io page table memory.
> 
> So ultimately I'd expect to pass the iommu_domain to the driver and
> the driver will decide if the page table memory it represents is
> compatible or not. Restricting to only the same iommu instance isn't
> good..

Who said IOMMU instance? As a reminder, the patch I currently have[1] is 
matching the driver (via the device ops), which happens to be entirely 
compatible with drivers supporting cross-instance domains. Mostly 
because we already have drivers that support cross-instance domains and 
callers that use them.

>> At that point -EINVAL at the driver callback level could be assumed
>> to refer to the domain argument, while anything else could be taken
>> as something going unexpectedly wrong when the attach may otherwise
>> have worked. I've forgotten if we actually had a valid case anywhere
>> for "this is my device but even if you retry with a different domain
>> it's still never going to work", but I think we wouldn't actually
>> need that anyway - it should be clear enough to a caller that if
>> attaching to an existing domain fails, then allocating a fresh
>> domain and attaching also fails, that's the point to give up.
> 
> The point was to have clear error handling, we either have permenent
> errors or 'this domain will never work with this device error'.
> 
> If we treat all error as temporary and just retry randomly it can
> create a mess. For instance we might fail to attach to a perfectly
> compatible domain due to ENOMEM or something and then go on to
> successfully a create a new 2nd domain, just due to races.
> 
> We can certainly code the try everything then allocate scheme, it is
> just much more fragile than having definitive error codes.

Again, not what I was suggesting. In fact the nature of 
iommu_attach_group() already rules out bogus devices getting this far, 
so all a driver currently has to worry about is compatibility of a 
device that it definitely probed with a domain that it definitely 
allocated. Therefore, from a caller's point of view, if attaching to an 
existing domain returns -EINVAL, try another domain; multiple different 
existing domains can be tried, and may also return -EINVAL for the same 
or different reasons; the final attempt is to allocate a fresh domain 
and attach to that, which should always be nominally valid and *never* 
return -EINVAL. If any attempt returns any other error, bail out down 
the usual "this should have worked but something went wrong" path. Even 
if any driver did have a nonsensical "nothing went wrong, I just can't 
attach my device to any of my domains" case, I don't think it would 
really need distinguishing from any other general error anyway.

Once multiple drivers are in play, the only addition is that the 
"gatekeeper" check inside iommu_attach_group() may also return -EINVAL 
if the device is managed by a different driver, since that still fits 
the same "try again with a different domain" message to the caller.

It's actually quite neat - basically the exact same thing we've tried to 
do with -EMEDIUMTYPE here, but more self-explanatory, since the fact is 
that a domain itself should never be invalid for attaching to via its 
own ops, and a group should never be inherently invalid for attaching to 
a suitable domain, it is only ever a particular combination of group (or 
device at the internal level) and domain that may not be valid together. 
Thus as long as we can maintain that basic guarantee that attaching a 
group to a newly allocated domain can only ever fail for resource 
allocation reasons and not some spurious "incompatibility", then we 
don't need any obscure trickery, and a single, clear, error code is in 
fact enough to say all that needs to be said.

Whether iommu_attach_device() should also join the party and start 
rejecting non-singleton-group devices with a different error, or 
maintain its current behaviour since its legacy users already have their 
expectations set, is another matter in its own right.

Cheers,
Robin.

[1] 
https://gitlab.arm.com/linux-arm/linux-rm/-/commit/683cdff1b2d4ae11f56e38d93b37e66e8c939fc9

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-07 19:41             ` Robin Murphy
  0 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-09-07 19:41 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, Joerg Roedel, jon, jonathanh,
	iommu, Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer,
	sven, linux-arm-msm, vdumpa, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, linux-s390, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, dwmw2, baolu.lu

On 2022-09-07 18:00, Jason Gunthorpe wrote:
> On Wed, Sep 07, 2022 at 03:23:09PM +0100, Robin Murphy wrote:
>> On 2022-09-07 14:47, Jason Gunthorpe wrote:
>>> On Wed, Sep 07, 2022 at 02:41:54PM +0200, Joerg Roedel wrote:
>>>> On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
>>>>> Provide a dedicated errno from the IOMMU driver during attach that the
>>>>> reason attached failed is because of domain incompatability. EMEDIUMTYPE
>>>>> is chosen because it is never used within the iommu subsystem today and
>>>>> evokes a sense that the 'medium' aka the domain is incompatible.
>>>>
>>>> I am not a fan of re-using EMEDIUMTYPE or any other special value. What
>>>> is needed here in EINVAL, but with a way to tell the caller which of the
>>>> function parameters is actually invalid.
>>>
>>> Using errnos to indicate the nature of failure is a well established
>>> unix practice, it is why we have hundreds of error codes and don't
>>> just return -EINVAL for everything.
>>>
>>> What don't you like about it?
>>>
>>> Would you be happier if we wrote it like
>>>
>>>    #define IOMMU_EINCOMPATIBLE_DEVICE xx
>>>
>>> Which tells "which of the function parameters is actually invalid" ?
>>
>> FWIW, we're now very close to being able to validate dev->iommu against
>> where the domain came from in core code, and so short-circuit ->attach_dev
>> entirely if they don't match.
> 
> I don't think this is a long term direction. We have systems now with
> a number of SMMU blocks and we really are going to see a need that
> they share the iommu_domains so we don't have unncessary overheads
> from duplicated io page table memory.
> 
> So ultimately I'd expect to pass the iommu_domain to the driver and
> the driver will decide if the page table memory it represents is
> compatible or not. Restricting to only the same iommu instance isn't
> good..

Who said IOMMU instance? As a reminder, the patch I currently have[1] is 
matching the driver (via the device ops), which happens to be entirely 
compatible with drivers supporting cross-instance domains. Mostly 
because we already have drivers that support cross-instance domains and 
callers that use them.

>> At that point -EINVAL at the driver callback level could be assumed
>> to refer to the domain argument, while anything else could be taken
>> as something going unexpectedly wrong when the attach may otherwise
>> have worked. I've forgotten if we actually had a valid case anywhere
>> for "this is my device but even if you retry with a different domain
>> it's still never going to work", but I think we wouldn't actually
>> need that anyway - it should be clear enough to a caller that if
>> attaching to an existing domain fails, then allocating a fresh
>> domain and attaching also fails, that's the point to give up.
> 
> The point was to have clear error handling, we either have permenent
> errors or 'this domain will never work with this device error'.
> 
> If we treat all error as temporary and just retry randomly it can
> create a mess. For instance we might fail to attach to a perfectly
> compatible domain due to ENOMEM or something and then go on to
> successfully a create a new 2nd domain, just due to races.
> 
> We can certainly code the try everything then allocate scheme, it is
> just much more fragile than having definitive error codes.

Again, not what I was suggesting. In fact the nature of 
iommu_attach_group() already rules out bogus devices getting this far, 
so all a driver currently has to worry about is compatibility of a 
device that it definitely probed with a domain that it definitely 
allocated. Therefore, from a caller's point of view, if attaching to an 
existing domain returns -EINVAL, try another domain; multiple different 
existing domains can be tried, and may also return -EINVAL for the same 
or different reasons; the final attempt is to allocate a fresh domain 
and attach to that, which should always be nominally valid and *never* 
return -EINVAL. If any attempt returns any other error, bail out down 
the usual "this should have worked but something went wrong" path. Even 
if any driver did have a nonsensical "nothing went wrong, I just can't 
attach my device to any of my domains" case, I don't think it would 
really need distinguishing from any other general error anyway.

Once multiple drivers are in play, the only addition is that the 
"gatekeeper" check inside iommu_attach_group() may also return -EINVAL 
if the device is managed by a different driver, since that still fits 
the same "try again with a different domain" message to the caller.

It's actually quite neat - basically the exact same thing we've tried to 
do with -EMEDIUMTYPE here, but more self-explanatory, since the fact is 
that a domain itself should never be invalid for attaching to via its 
own ops, and a group should never be inherently invalid for attaching to 
a suitable domain, it is only ever a particular combination of group (or 
device at the internal level) and domain that may not be valid together. 
Thus as long as we can maintain that basic guarantee that attaching a 
group to a newly allocated domain can only ever fail for resource 
allocation reasons and not some spurious "incompatibility", then we 
don't need any obscure trickery, and a single, clear, error code is in 
fact enough to say all that needs to be said.

Whether iommu_attach_device() should also join the party and start 
rejecting non-singleton-group devices with a different error, or 
maintain its current behaviour since its legacy users already have their 
expectations set, is another matter in its own right.

Cheers,
Robin.

[1] 
https://gitlab.arm.com/linux-arm/linux-rm/-/commit/683cdff1b2d4ae11f56e38d93b37e66e8c939fc9
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-07 19:41             ` Robin Murphy
  0 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-09-07 19:41 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, Joerg Roedel, jon, jonathanh,
	iommu, Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer,
	kevin.tian, sven, linux-arm-msm, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, linux-s390, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, dwmw2, baolu.lu

On 2022-09-07 18:00, Jason Gunthorpe wrote:
> On Wed, Sep 07, 2022 at 03:23:09PM +0100, Robin Murphy wrote:
>> On 2022-09-07 14:47, Jason Gunthorpe wrote:
>>> On Wed, Sep 07, 2022 at 02:41:54PM +0200, Joerg Roedel wrote:
>>>> On Mon, Aug 15, 2022 at 11:14:33AM -0700, Nicolin Chen wrote:
>>>>> Provide a dedicated errno from the IOMMU driver during attach that the
>>>>> reason attached failed is because of domain incompatability. EMEDIUMTYPE
>>>>> is chosen because it is never used within the iommu subsystem today and
>>>>> evokes a sense that the 'medium' aka the domain is incompatible.
>>>>
>>>> I am not a fan of re-using EMEDIUMTYPE or any other special value. What
>>>> is needed here in EINVAL, but with a way to tell the caller which of the
>>>> function parameters is actually invalid.
>>>
>>> Using errnos to indicate the nature of failure is a well established
>>> unix practice, it is why we have hundreds of error codes and don't
>>> just return -EINVAL for everything.
>>>
>>> What don't you like about it?
>>>
>>> Would you be happier if we wrote it like
>>>
>>>    #define IOMMU_EINCOMPATIBLE_DEVICE xx
>>>
>>> Which tells "which of the function parameters is actually invalid" ?
>>
>> FWIW, we're now very close to being able to validate dev->iommu against
>> where the domain came from in core code, and so short-circuit ->attach_dev
>> entirely if they don't match.
> 
> I don't think this is a long term direction. We have systems now with
> a number of SMMU blocks and we really are going to see a need that
> they share the iommu_domains so we don't have unncessary overheads
> from duplicated io page table memory.
> 
> So ultimately I'd expect to pass the iommu_domain to the driver and
> the driver will decide if the page table memory it represents is
> compatible or not. Restricting to only the same iommu instance isn't
> good..

Who said IOMMU instance? As a reminder, the patch I currently have[1] is 
matching the driver (via the device ops), which happens to be entirely 
compatible with drivers supporting cross-instance domains. Mostly 
because we already have drivers that support cross-instance domains and 
callers that use them.

>> At that point -EINVAL at the driver callback level could be assumed
>> to refer to the domain argument, while anything else could be taken
>> as something going unexpectedly wrong when the attach may otherwise
>> have worked. I've forgotten if we actually had a valid case anywhere
>> for "this is my device but even if you retry with a different domain
>> it's still never going to work", but I think we wouldn't actually
>> need that anyway - it should be clear enough to a caller that if
>> attaching to an existing domain fails, then allocating a fresh
>> domain and attaching also fails, that's the point to give up.
> 
> The point was to have clear error handling, we either have permenent
> errors or 'this domain will never work with this device error'.
> 
> If we treat all error as temporary and just retry randomly it can
> create a mess. For instance we might fail to attach to a perfectly
> compatible domain due to ENOMEM or something and then go on to
> successfully a create a new 2nd domain, just due to races.
> 
> We can certainly code the try everything then allocate scheme, it is
> just much more fragile than having definitive error codes.

Again, not what I was suggesting. In fact the nature of 
iommu_attach_group() already rules out bogus devices getting this far, 
so all a driver currently has to worry about is compatibility of a 
device that it definitely probed with a domain that it definitely 
allocated. Therefore, from a caller's point of view, if attaching to an 
existing domain returns -EINVAL, try another domain; multiple different 
existing domains can be tried, and may also return -EINVAL for the same 
or different reasons; the final attempt is to allocate a fresh domain 
and attach to that, which should always be nominally valid and *never* 
return -EINVAL. If any attempt returns any other error, bail out down 
the usual "this should have worked but something went wrong" path. Even 
if any driver did have a nonsensical "nothing went wrong, I just can't 
attach my device to any of my domains" case, I don't think it would 
really need distinguishing from any other general error anyway.

Once multiple drivers are in play, the only addition is that the 
"gatekeeper" check inside iommu_attach_group() may also return -EINVAL 
if the device is managed by a different driver, since that still fits 
the same "try again with a different domain" message to the caller.

It's actually quite neat - basically the exact same thing we've tried to 
do with -EMEDIUMTYPE here, but more self-explanatory, since the fact is 
that a domain itself should never be invalid for attaching to via its 
own ops, and a group should never be inherently invalid for attaching to 
a suitable domain, it is only ever a particular combination of group (or 
device at the internal level) and domain that may not be valid together. 
Thus as long as we can maintain that basic guarantee that attaching a 
group to a newly allocated domain can only ever fail for resource 
allocation reasons and not some spurious "incompatibility", then we 
don't need any obscure trickery, and a single, clear, error code is in 
fact enough to say all that needs to be said.

Whether iommu_attach_device() should also join the party and start 
rejecting non-singleton-group devices with a different error, or 
maintain its current behaviour since its legacy users already have their 
expectations set, is another matter in its own right.

Cheers,
Robin.

[1] 
https://gitlab.arm.com/linux-arm/linux-rm/-/commit/683cdff1b2d4ae11f56e38d93b37e66e8c939fc9

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-07 19:41             ` Robin Murphy
@ 2022-09-08  0:43               ` Jason Gunthorpe
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-08  0:43 UTC (permalink / raw)
  To: Robin Murphy, Joerg Roedel
  Cc: Nicolin Chen, will, alex.williamson, suravee.suthikulpanit,
	marcan, sven, alyssa, robdclark, dwmw2, baolu.lu, mjrosato,
	gerald.schaefer, orsonzhai, baolin.wang, zhang.lyra,
	thierry.reding, vdumpa, jonathanh, jean-philippe, cohuck, tglx,
	shameerali.kolothum.thodi, thunder.leizhen, christophe.jaillet,
	yangyingliang, jon, iommu, linux-kernel, asahi, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm,
	kevin.tian

On Wed, Sep 07, 2022 at 08:41:13PM +0100, Robin Murphy wrote:

> > > FWIW, we're now very close to being able to validate dev->iommu against
> > > where the domain came from in core code, and so short-circuit ->attach_dev
> > > entirely if they don't match.
> > 
> > I don't think this is a long term direction. We have systems now with
> > a number of SMMU blocks and we really are going to see a need that
> > they share the iommu_domains so we don't have unncessary overheads
> > from duplicated io page table memory.
> > 
> > So ultimately I'd expect to pass the iommu_domain to the driver and
> > the driver will decide if the page table memory it represents is
> > compatible or not. Restricting to only the same iommu instance isn't
> > good..
> 
> Who said IOMMU instance?

Ah, I completely misunderstood what 'dev->iommu' was referring too, OK
I see.

> Again, not what I was suggesting. In fact the nature of iommu_attach_group()
> already rules out bogus devices getting this far, so all a driver currently
> has to worry about is compatibility of a device that it definitely probed
> with a domain that it definitely allocated. Therefore, from a caller's point
> of view, if attaching to an existing domain returns -EINVAL, try another
> domain; multiple different existing domains can be tried, and may also
> return -EINVAL for the same or different reasons; the final attempt is to
> allocate a fresh domain and attach to that, which should always be nominally
> valid and *never* return -EINVAL. If any attempt returns any other error,
> bail out down the usual "this should have worked but something went wrong"
> path. Even if any driver did have a nonsensical "nothing went wrong, I just
> can't attach my device to any of my domains" case, I don't think it would
> really need distinguishing from any other general error anyway.

The algorithm you described is exactly what this series does, it just
used EMEDIUMTYPE instead of EINVAL. Changing it to EINVAL is not a
fundamental problem, just a bit more work.

Looking at Nicolin's series there is a bunch of existing errnos that
would still need converting, ie EXDEV, EBUSY, EOPNOTSUPP, EFAULT, and
ENXIO are all returned as codes for 'domain incompatible with device'
in various drivers. So the patch would still look much the same, just
changing them to EINVAL instead of EMEDIUMTYPE.

That leaves the question of the remaining EINVAL's that Nicolin did
not convert to EMEDIUMTYPE.

eg in the AMD driver:

	if (!check_device(dev))
		return -EINVAL;

	iommu = rlookup_amd_iommu(dev);
	if (!iommu)
		return -EINVAL;

These are all cases of 'something is really wrong with the device or
iommu, everything will fail'. Other drivers are using ENODEV for this
already, so we'd probably have an additional patch changing various
places like that to ENODEV.

This mixture of error codes is the basic reason why a new code was
used, because none of the existing codes are used with any
consistency.

But OK, I'm on board, lets use more common errnos with specific
meaning, that can be documented in a comment someplace:
 ENOMEM - out of memory
 ENODEV - no domain can attach, device or iommu is messed up
 EINVAL - the domain is incompatible with the device
 <others> - Same behavior as ENODEV, use is discouraged.

I think achieving consistency of error codes is a generally desirable
goal, it makes the error code actually useful.

Joerg this is a good bit of work, will you be OK with it?

> Thus as long as we can maintain that basic guarantee that attaching
> a group to a newly allocated domain can only ever fail for resource
> allocation reasons and not some spurious "incompatibility", then we
> don't need any obscure trickery, and a single, clear, error code is
> in fact enough to say all that needs to be said.

As above, this is not the case, drivers do seem to have error paths
that are unconditional on the domain. Perhaps they are just protective
assertions and never happen.

Regardless, it doesn't matter. If they return ENODEV or EINVAL the
VFIO side algorithm will continue to work fine, it just does alot more
work if EINVAL is permanently returned.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-08  0:43               ` Jason Gunthorpe
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-08  0:43 UTC (permalink / raw)
  To: Robin Murphy, Joerg Roedel
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer,
	kevin.tian, sven, linux-arm-msm, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, linux-s390, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, dwmw2, baolu.lu

On Wed, Sep 07, 2022 at 08:41:13PM +0100, Robin Murphy wrote:

> > > FWIW, we're now very close to being able to validate dev->iommu against
> > > where the domain came from in core code, and so short-circuit ->attach_dev
> > > entirely if they don't match.
> > 
> > I don't think this is a long term direction. We have systems now with
> > a number of SMMU blocks and we really are going to see a need that
> > they share the iommu_domains so we don't have unncessary overheads
> > from duplicated io page table memory.
> > 
> > So ultimately I'd expect to pass the iommu_domain to the driver and
> > the driver will decide if the page table memory it represents is
> > compatible or not. Restricting to only the same iommu instance isn't
> > good..
> 
> Who said IOMMU instance?

Ah, I completely misunderstood what 'dev->iommu' was referring too, OK
I see.

> Again, not what I was suggesting. In fact the nature of iommu_attach_group()
> already rules out bogus devices getting this far, so all a driver currently
> has to worry about is compatibility of a device that it definitely probed
> with a domain that it definitely allocated. Therefore, from a caller's point
> of view, if attaching to an existing domain returns -EINVAL, try another
> domain; multiple different existing domains can be tried, and may also
> return -EINVAL for the same or different reasons; the final attempt is to
> allocate a fresh domain and attach to that, which should always be nominally
> valid and *never* return -EINVAL. If any attempt returns any other error,
> bail out down the usual "this should have worked but something went wrong"
> path. Even if any driver did have a nonsensical "nothing went wrong, I just
> can't attach my device to any of my domains" case, I don't think it would
> really need distinguishing from any other general error anyway.

The algorithm you described is exactly what this series does, it just
used EMEDIUMTYPE instead of EINVAL. Changing it to EINVAL is not a
fundamental problem, just a bit more work.

Looking at Nicolin's series there is a bunch of existing errnos that
would still need converting, ie EXDEV, EBUSY, EOPNOTSUPP, EFAULT, and
ENXIO are all returned as codes for 'domain incompatible with device'
in various drivers. So the patch would still look much the same, just
changing them to EINVAL instead of EMEDIUMTYPE.

That leaves the question of the remaining EINVAL's that Nicolin did
not convert to EMEDIUMTYPE.

eg in the AMD driver:

	if (!check_device(dev))
		return -EINVAL;

	iommu = rlookup_amd_iommu(dev);
	if (!iommu)
		return -EINVAL;

These are all cases of 'something is really wrong with the device or
iommu, everything will fail'. Other drivers are using ENODEV for this
already, so we'd probably have an additional patch changing various
places like that to ENODEV.

This mixture of error codes is the basic reason why a new code was
used, because none of the existing codes are used with any
consistency.

But OK, I'm on board, lets use more common errnos with specific
meaning, that can be documented in a comment someplace:
 ENOMEM - out of memory
 ENODEV - no domain can attach, device or iommu is messed up
 EINVAL - the domain is incompatible with the device
 <others> - Same behavior as ENODEV, use is discouraged.

I think achieving consistency of error codes is a generally desirable
goal, it makes the error code actually useful.

Joerg this is a good bit of work, will you be OK with it?

> Thus as long as we can maintain that basic guarantee that attaching
> a group to a newly allocated domain can only ever fail for resource
> allocation reasons and not some spurious "incompatibility", then we
> don't need any obscure trickery, and a single, clear, error code is
> in fact enough to say all that needs to be said.

As above, this is not the case, drivers do seem to have error paths
that are unconditional on the domain. Perhaps they are just protective
assertions and never happen.

Regardless, it doesn't matter. If they return ENODEV or EINVAL the
VFIO side algorithm will continue to work fine, it just does alot more
work if EINVAL is permanently returned.

Thanks,
Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-08  0:43               ` Jason Gunthorpe
  (?)
@ 2022-09-08  9:30                 ` Tian, Kevin
  -1 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-08  9:30 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy, Joerg Roedel
  Cc: Nicolin Chen, will, alex.williamson, suravee.suthikulpanit,
	marcan, sven, alyssa, robdclark, dwmw2, baolu.lu, mjrosato,
	gerald.schaefer, orsonzhai, baolin.wang, zhang.lyra,
	thierry.reding, vdumpa, jonathanh, jean-philippe, cohuck, tglx,
	shameerali.kolothum.thodi, thunder.leizhen, christophe.jaillet,
	yangyingliang, jon, iommu, linux-kernel, asahi, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, September 8, 2022 8:43 AM
> 
> On Wed, Sep 07, 2022 at 08:41:13PM +0100, Robin Murphy wrote:
> 
> > Again, not what I was suggesting. In fact the nature of
> iommu_attach_group()
> > already rules out bogus devices getting this far, so all a driver currently
> > has to worry about is compatibility of a device that it definitely probed
> > with a domain that it definitely allocated. Therefore, from a caller's point
> > of view, if attaching to an existing domain returns -EINVAL, try another
> > domain; multiple different existing domains can be tried, and may also
> > return -EINVAL for the same or different reasons; the final attempt is to
> > allocate a fresh domain and attach to that, which should always be
> nominally
> > valid and *never* return -EINVAL. If any attempt returns any other error,
> > bail out down the usual "this should have worked but something went
> wrong"
> > path. Even if any driver did have a nonsensical "nothing went wrong, I just
> > can't attach my device to any of my domains" case, I don't think it would
> > really need distinguishing from any other general error anyway.
> 
> The algorithm you described is exactly what this series does, it just
> used EMEDIUMTYPE instead of EINVAL. Changing it to EINVAL is not a
> fundamental problem, just a bit more work.
> 
> Looking at Nicolin's series there is a bunch of existing errnos that
> would still need converting, ie EXDEV, EBUSY, EOPNOTSUPP, EFAULT, and
> ENXIO are all returned as codes for 'domain incompatible with device'
> in various drivers. So the patch would still look much the same, just
> changing them to EINVAL instead of EMEDIUMTYPE.
> 
> That leaves the question of the remaining EINVAL's that Nicolin did
> not convert to EMEDIUMTYPE.
> 
> eg in the AMD driver:
> 
> 	if (!check_device(dev))
> 		return -EINVAL;
> 
> 	iommu = rlookup_amd_iommu(dev);
> 	if (!iommu)
> 		return -EINVAL;
> 
> These are all cases of 'something is really wrong with the device or
> iommu, everything will fail'. Other drivers are using ENODEV for this
> already, so we'd probably have an additional patch changing various
> places like that to ENODEV.
> 
> This mixture of error codes is the basic reason why a new code was
> used, because none of the existing codes are used with any
> consistency.

btw I saw the policy for -EBUSY is also not consistent in this series.

while it's correct to change -EBUSY to -EMEDIUMTYPE for omap, assuming
that retrying another fresh domain for the said device should work:

	if (omap_domain->dev) {
-		dev_err(dev, "iommu domain is already attached\n");
-		ret = -EBUSY;
+		ret = -EMEDIUMTYPE;
 		goto out;
 	}

the change in tegra-gart doesn't sound correct:

	if (gart->active_domain && gart->active_domain != domain) {
-		ret = -EBUSY;
+		ret = -EMEDIUMTYPE;

one device cannot be attached to two domains. This fact doesn't change
no matter how many domains are tried. In concept this check is
redundant and should have been done by iommu core, but obviously we
didn't pay attention to what -EBUSY actually represents in this path.

> 
> But OK, I'm on board, lets use more common errnos with specific
> meaning, that can be documented in a comment someplace:
>  ENOMEM - out of memory
>  ENODEV - no domain can attach, device or iommu is messed up
>  EINVAL - the domain is incompatible with the device
>  <others> - Same behavior as ENODEV, use is discouraged.

There are also cases where common kAPIs are called in the attach
path which may return -EINVAL and random errno, e.g.:

omap_iommu_attach_dev()
  omap_iommu_attach()
    iommu_enable()
      pm_runtime_get_sync()
        __pm_runtime_resume()
          rpm_resume()
	if (dev->power.runtime_error) {
		retval = -EINVAL;
            
viommu_attach_dev()
  viommu_domain_finalise()
    ida_alloc_range()
	if ((int)min < 0)
		return -ENOSPC;

> 
> I think achieving consistency of error codes is a generally desirable
> goal, it makes the error code actually useful.
> 
> Joerg this is a good bit of work, will you be OK with it?
> 
> > Thus as long as we can maintain that basic guarantee that attaching
> > a group to a newly allocated domain can only ever fail for resource
> > allocation reasons and not some spurious "incompatibility", then we
> > don't need any obscure trickery, and a single, clear, error code is
> > in fact enough to say all that needs to be said.
> 
> As above, this is not the case, drivers do seem to have error paths
> that are unconditional on the domain. Perhaps they are just protective
> assertions and never happen.
> 
> Regardless, it doesn't matter. If they return ENODEV or EINVAL the
> VFIO side algorithm will continue to work fine, it just does alot more
> work if EINVAL is permanently returned.
> 

I don't see an elegant option here.

If we care about accuracy of reporting incompatibility -EMEDIUMTYPE
is obviously a better option.

If we think attach_dev is a slow path and having unnecessary retries
doesn't hurt then -EINVAL sounds a simpler option. We probably can
just go using -EINVAL as retry indicator in vfio even w/o changing
iommu drivers at this point. Then improve them to use consistent
errno gradually and in a separate effort.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-08  9:30                 ` Tian, Kevin
  0 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-08  9:30 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy, Joerg Roedel
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer, sven,
	linux-arm-msm, vdumpa, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, linux-s390, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, dwmw2, baolu.lu

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, September 8, 2022 8:43 AM
> 
> On Wed, Sep 07, 2022 at 08:41:13PM +0100, Robin Murphy wrote:
> 
> > Again, not what I was suggesting. In fact the nature of
> iommu_attach_group()
> > already rules out bogus devices getting this far, so all a driver currently
> > has to worry about is compatibility of a device that it definitely probed
> > with a domain that it definitely allocated. Therefore, from a caller's point
> > of view, if attaching to an existing domain returns -EINVAL, try another
> > domain; multiple different existing domains can be tried, and may also
> > return -EINVAL for the same or different reasons; the final attempt is to
> > allocate a fresh domain and attach to that, which should always be
> nominally
> > valid and *never* return -EINVAL. If any attempt returns any other error,
> > bail out down the usual "this should have worked but something went
> wrong"
> > path. Even if any driver did have a nonsensical "nothing went wrong, I just
> > can't attach my device to any of my domains" case, I don't think it would
> > really need distinguishing from any other general error anyway.
> 
> The algorithm you described is exactly what this series does, it just
> used EMEDIUMTYPE instead of EINVAL. Changing it to EINVAL is not a
> fundamental problem, just a bit more work.
> 
> Looking at Nicolin's series there is a bunch of existing errnos that
> would still need converting, ie EXDEV, EBUSY, EOPNOTSUPP, EFAULT, and
> ENXIO are all returned as codes for 'domain incompatible with device'
> in various drivers. So the patch would still look much the same, just
> changing them to EINVAL instead of EMEDIUMTYPE.
> 
> That leaves the question of the remaining EINVAL's that Nicolin did
> not convert to EMEDIUMTYPE.
> 
> eg in the AMD driver:
> 
> 	if (!check_device(dev))
> 		return -EINVAL;
> 
> 	iommu = rlookup_amd_iommu(dev);
> 	if (!iommu)
> 		return -EINVAL;
> 
> These are all cases of 'something is really wrong with the device or
> iommu, everything will fail'. Other drivers are using ENODEV for this
> already, so we'd probably have an additional patch changing various
> places like that to ENODEV.
> 
> This mixture of error codes is the basic reason why a new code was
> used, because none of the existing codes are used with any
> consistency.

btw I saw the policy for -EBUSY is also not consistent in this series.

while it's correct to change -EBUSY to -EMEDIUMTYPE for omap, assuming
that retrying another fresh domain for the said device should work:

	if (omap_domain->dev) {
-		dev_err(dev, "iommu domain is already attached\n");
-		ret = -EBUSY;
+		ret = -EMEDIUMTYPE;
 		goto out;
 	}

the change in tegra-gart doesn't sound correct:

	if (gart->active_domain && gart->active_domain != domain) {
-		ret = -EBUSY;
+		ret = -EMEDIUMTYPE;

one device cannot be attached to two domains. This fact doesn't change
no matter how many domains are tried. In concept this check is
redundant and should have been done by iommu core, but obviously we
didn't pay attention to what -EBUSY actually represents in this path.

> 
> But OK, I'm on board, lets use more common errnos with specific
> meaning, that can be documented in a comment someplace:
>  ENOMEM - out of memory
>  ENODEV - no domain can attach, device or iommu is messed up
>  EINVAL - the domain is incompatible with the device
>  <others> - Same behavior as ENODEV, use is discouraged.

There are also cases where common kAPIs are called in the attach
path which may return -EINVAL and random errno, e.g.:

omap_iommu_attach_dev()
  omap_iommu_attach()
    iommu_enable()
      pm_runtime_get_sync()
        __pm_runtime_resume()
          rpm_resume()
	if (dev->power.runtime_error) {
		retval = -EINVAL;
            
viommu_attach_dev()
  viommu_domain_finalise()
    ida_alloc_range()
	if ((int)min < 0)
		return -ENOSPC;

> 
> I think achieving consistency of error codes is a generally desirable
> goal, it makes the error code actually useful.
> 
> Joerg this is a good bit of work, will you be OK with it?
> 
> > Thus as long as we can maintain that basic guarantee that attaching
> > a group to a newly allocated domain can only ever fail for resource
> > allocation reasons and not some spurious "incompatibility", then we
> > don't need any obscure trickery, and a single, clear, error code is
> > in fact enough to say all that needs to be said.
> 
> As above, this is not the case, drivers do seem to have error paths
> that are unconditional on the domain. Perhaps they are just protective
> assertions and never happen.
> 
> Regardless, it doesn't matter. If they return ENODEV or EINVAL the
> VFIO side algorithm will continue to work fine, it just does alot more
> work if EINVAL is permanently returned.
> 

I don't see an elegant option here.

If we care about accuracy of reporting incompatibility -EMEDIUMTYPE
is obviously a better option.

If we think attach_dev is a slow path and having unnecessary retries
doesn't hurt then -EINVAL sounds a simpler option. We probably can
just go using -EINVAL as retry indicator in vfio even w/o changing
iommu drivers at this point. Then improve them to use consistent
errno gradually and in a separate effort.

Thanks
Kevin
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-08  9:30                 ` Tian, Kevin
  0 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-08  9:30 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy, Joerg Roedel
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer, sven,
	linux-arm-msm, christophe.jaillet, baolin.wang, thunder.leizhen,
	linux-tegra, tglx, virtualization, linux-arm-kernel, linux-s390,
	cohuck, alex.williamson, shameerali.kolothum.thodi, robdclark,
	asahi, suravee.suthikulpanit, dwmw2, baolu.lu

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, September 8, 2022 8:43 AM
> 
> On Wed, Sep 07, 2022 at 08:41:13PM +0100, Robin Murphy wrote:
> 
> > Again, not what I was suggesting. In fact the nature of
> iommu_attach_group()
> > already rules out bogus devices getting this far, so all a driver currently
> > has to worry about is compatibility of a device that it definitely probed
> > with a domain that it definitely allocated. Therefore, from a caller's point
> > of view, if attaching to an existing domain returns -EINVAL, try another
> > domain; multiple different existing domains can be tried, and may also
> > return -EINVAL for the same or different reasons; the final attempt is to
> > allocate a fresh domain and attach to that, which should always be
> nominally
> > valid and *never* return -EINVAL. If any attempt returns any other error,
> > bail out down the usual "this should have worked but something went
> wrong"
> > path. Even if any driver did have a nonsensical "nothing went wrong, I just
> > can't attach my device to any of my domains" case, I don't think it would
> > really need distinguishing from any other general error anyway.
> 
> The algorithm you described is exactly what this series does, it just
> used EMEDIUMTYPE instead of EINVAL. Changing it to EINVAL is not a
> fundamental problem, just a bit more work.
> 
> Looking at Nicolin's series there is a bunch of existing errnos that
> would still need converting, ie EXDEV, EBUSY, EOPNOTSUPP, EFAULT, and
> ENXIO are all returned as codes for 'domain incompatible with device'
> in various drivers. So the patch would still look much the same, just
> changing them to EINVAL instead of EMEDIUMTYPE.
> 
> That leaves the question of the remaining EINVAL's that Nicolin did
> not convert to EMEDIUMTYPE.
> 
> eg in the AMD driver:
> 
> 	if (!check_device(dev))
> 		return -EINVAL;
> 
> 	iommu = rlookup_amd_iommu(dev);
> 	if (!iommu)
> 		return -EINVAL;
> 
> These are all cases of 'something is really wrong with the device or
> iommu, everything will fail'. Other drivers are using ENODEV for this
> already, so we'd probably have an additional patch changing various
> places like that to ENODEV.
> 
> This mixture of error codes is the basic reason why a new code was
> used, because none of the existing codes are used with any
> consistency.

btw I saw the policy for -EBUSY is also not consistent in this series.

while it's correct to change -EBUSY to -EMEDIUMTYPE for omap, assuming
that retrying another fresh domain for the said device should work:

	if (omap_domain->dev) {
-		dev_err(dev, "iommu domain is already attached\n");
-		ret = -EBUSY;
+		ret = -EMEDIUMTYPE;
 		goto out;
 	}

the change in tegra-gart doesn't sound correct:

	if (gart->active_domain && gart->active_domain != domain) {
-		ret = -EBUSY;
+		ret = -EMEDIUMTYPE;

one device cannot be attached to two domains. This fact doesn't change
no matter how many domains are tried. In concept this check is
redundant and should have been done by iommu core, but obviously we
didn't pay attention to what -EBUSY actually represents in this path.

> 
> But OK, I'm on board, lets use more common errnos with specific
> meaning, that can be documented in a comment someplace:
>  ENOMEM - out of memory
>  ENODEV - no domain can attach, device or iommu is messed up
>  EINVAL - the domain is incompatible with the device
>  <others> - Same behavior as ENODEV, use is discouraged.

There are also cases where common kAPIs are called in the attach
path which may return -EINVAL and random errno, e.g.:

omap_iommu_attach_dev()
  omap_iommu_attach()
    iommu_enable()
      pm_runtime_get_sync()
        __pm_runtime_resume()
          rpm_resume()
	if (dev->power.runtime_error) {
		retval = -EINVAL;
            
viommu_attach_dev()
  viommu_domain_finalise()
    ida_alloc_range()
	if ((int)min < 0)
		return -ENOSPC;

> 
> I think achieving consistency of error codes is a generally desirable
> goal, it makes the error code actually useful.
> 
> Joerg this is a good bit of work, will you be OK with it?
> 
> > Thus as long as we can maintain that basic guarantee that attaching
> > a group to a newly allocated domain can only ever fail for resource
> > allocation reasons and not some spurious "incompatibility", then we
> > don't need any obscure trickery, and a single, clear, error code is
> > in fact enough to say all that needs to be said.
> 
> As above, this is not the case, drivers do seem to have error paths
> that are unconditional on the domain. Perhaps they are just protective
> assertions and never happen.
> 
> Regardless, it doesn't matter. If they return ENODEV or EINVAL the
> VFIO side algorithm will continue to work fine, it just does alot more
> work if EINVAL is permanently returned.
> 

I don't see an elegant option here.

If we care about accuracy of reporting incompatibility -EMEDIUMTYPE
is obviously a better option.

If we think attach_dev is a slow path and having unnecessary retries
doesn't hurt then -EINVAL sounds a simpler option. We probably can
just go using -EINVAL as retry indicator in vfio even w/o changing
iommu drivers at this point. Then improve them to use consistent
errno gradually and in a separate effort.

Thanks
Kevin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-08  0:43               ` Jason Gunthorpe
  (?)
@ 2022-09-08  9:54                 ` Tian, Kevin
  -1 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-08  9:54 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy, Joerg Roedel
  Cc: Nicolin Chen, will, alex.williamson, suravee.suthikulpanit,
	marcan, sven, alyssa, robdclark, dwmw2, baolu.lu, mjrosato,
	gerald.schaefer, orsonzhai, baolin.wang, zhang.lyra,
	thierry.reding, vdumpa, jonathanh, jean-philippe, cohuck, tglx,
	shameerali.kolothum.thodi, thunder.leizhen, christophe.jaillet,
	yangyingliang, jon, iommu, linux-kernel, asahi, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm

> From: Tian, Kevin
> Sent: Thursday, September 8, 2022 5:31 PM
> > This mixture of error codes is the basic reason why a new code was
> > used, because none of the existing codes are used with any
> > consistency.
> 
> btw I saw the policy for -EBUSY is also not consistent in this series.
> 
> while it's correct to change -EBUSY to -EMEDIUMTYPE for omap, assuming
> that retrying another fresh domain for the said device should work:
> 
> 	if (omap_domain->dev) {
> -		dev_err(dev, "iommu domain is already attached\n");
> -		ret = -EBUSY;
> +		ret = -EMEDIUMTYPE;
>  		goto out;
>  	}
> 
> the change in tegra-gart doesn't sound correct:
> 
> 	if (gart->active_domain && gart->active_domain != domain) {
> -		ret = -EBUSY;
> +		ret = -EMEDIUMTYPE;
> 
> one device cannot be attached to two domains. This fact doesn't change
> no matter how many domains are tried. In concept this check is
> redundant and should have been done by iommu core, but obviously we
> didn't pay attention to what -EBUSY actually represents in this path.
> 

oops. Above is actually a right retry condition. gart is iommu instead of
device. So in concept retrying gart->active_domain for the device could
work.

So please ignore this comment.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-08  9:54                 ` Tian, Kevin
  0 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-08  9:54 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy, Joerg Roedel
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer, sven,
	linux-arm-msm, vdumpa, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, linux-s390, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, dwmw2, baolu.lu

> From: Tian, Kevin
> Sent: Thursday, September 8, 2022 5:31 PM
> > This mixture of error codes is the basic reason why a new code was
> > used, because none of the existing codes are used with any
> > consistency.
> 
> btw I saw the policy for -EBUSY is also not consistent in this series.
> 
> while it's correct to change -EBUSY to -EMEDIUMTYPE for omap, assuming
> that retrying another fresh domain for the said device should work:
> 
> 	if (omap_domain->dev) {
> -		dev_err(dev, "iommu domain is already attached\n");
> -		ret = -EBUSY;
> +		ret = -EMEDIUMTYPE;
>  		goto out;
>  	}
> 
> the change in tegra-gart doesn't sound correct:
> 
> 	if (gart->active_domain && gart->active_domain != domain) {
> -		ret = -EBUSY;
> +		ret = -EMEDIUMTYPE;
> 
> one device cannot be attached to two domains. This fact doesn't change
> no matter how many domains are tried. In concept this check is
> redundant and should have been done by iommu core, but obviously we
> didn't pay attention to what -EBUSY actually represents in this path.
> 

oops. Above is actually a right retry condition. gart is iommu instead of
device. So in concept retrying gart->active_domain for the device could
work.

So please ignore this comment.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-08  9:54                 ` Tian, Kevin
  0 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-08  9:54 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy, Joerg Roedel
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer, sven,
	linux-arm-msm, christophe.jaillet, baolin.wang, thunder.leizhen,
	linux-tegra, tglx, virtualization, linux-arm-kernel, linux-s390,
	cohuck, alex.williamson, shameerali.kolothum.thodi, robdclark,
	asahi, suravee.suthikulpanit, dwmw2, baolu.lu

> From: Tian, Kevin
> Sent: Thursday, September 8, 2022 5:31 PM
> > This mixture of error codes is the basic reason why a new code was
> > used, because none of the existing codes are used with any
> > consistency.
> 
> btw I saw the policy for -EBUSY is also not consistent in this series.
> 
> while it's correct to change -EBUSY to -EMEDIUMTYPE for omap, assuming
> that retrying another fresh domain for the said device should work:
> 
> 	if (omap_domain->dev) {
> -		dev_err(dev, "iommu domain is already attached\n");
> -		ret = -EBUSY;
> +		ret = -EMEDIUMTYPE;
>  		goto out;
>  	}
> 
> the change in tegra-gart doesn't sound correct:
> 
> 	if (gart->active_domain && gart->active_domain != domain) {
> -		ret = -EBUSY;
> +		ret = -EMEDIUMTYPE;
> 
> one device cannot be attached to two domains. This fact doesn't change
> no matter how many domains are tried. In concept this check is
> redundant and should have been done by iommu core, but obviously we
> didn't pay attention to what -EBUSY actually represents in this path.
> 

oops. Above is actually a right retry condition. gart is iommu instead of
device. So in concept retrying gart->active_domain for the device could
work.

So please ignore this comment.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-08  0:43               ` Jason Gunthorpe
  (?)
@ 2022-09-08 10:25                 ` Robin Murphy
  -1 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-09-08 10:25 UTC (permalink / raw)
  To: Jason Gunthorpe, Joerg Roedel
  Cc: Nicolin Chen, will, alex.williamson, suravee.suthikulpanit,
	marcan, sven, alyssa, robdclark, dwmw2, baolu.lu, mjrosato,
	gerald.schaefer, orsonzhai, baolin.wang, zhang.lyra,
	thierry.reding, vdumpa, jonathanh, jean-philippe, cohuck, tglx,
	shameerali.kolothum.thodi, thunder.leizhen, christophe.jaillet,
	yangyingliang, jon, iommu, linux-kernel, asahi, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm,
	kevin.tian

On 2022-09-08 01:43, Jason Gunthorpe wrote:
> On Wed, Sep 07, 2022 at 08:41:13PM +0100, Robin Murphy wrote:
> 
>>>> FWIW, we're now very close to being able to validate dev->iommu against
>>>> where the domain came from in core code, and so short-circuit ->attach_dev
>>>> entirely if they don't match.
>>>
>>> I don't think this is a long term direction. We have systems now with
>>> a number of SMMU blocks and we really are going to see a need that
>>> they share the iommu_domains so we don't have unncessary overheads
>>> from duplicated io page table memory.
>>>
>>> So ultimately I'd expect to pass the iommu_domain to the driver and
>>> the driver will decide if the page table memory it represents is
>>> compatible or not. Restricting to only the same iommu instance isn't
>>> good..
>>
>> Who said IOMMU instance?
> 
> Ah, I completely misunderstood what 'dev->iommu' was referring too, OK
> I see.
> 
>> Again, not what I was suggesting. In fact the nature of iommu_attach_group()
>> already rules out bogus devices getting this far, so all a driver currently
>> has to worry about is compatibility of a device that it definitely probed
>> with a domain that it definitely allocated. Therefore, from a caller's point
>> of view, if attaching to an existing domain returns -EINVAL, try another
>> domain; multiple different existing domains can be tried, and may also
>> return -EINVAL for the same or different reasons; the final attempt is to
>> allocate a fresh domain and attach to that, which should always be nominally
>> valid and *never* return -EINVAL. If any attempt returns any other error,
>> bail out down the usual "this should have worked but something went wrong"
>> path. Even if any driver did have a nonsensical "nothing went wrong, I just
>> can't attach my device to any of my domains" case, I don't think it would
>> really need distinguishing from any other general error anyway.
> 
> The algorithm you described is exactly what this series does, it just
> used EMEDIUMTYPE instead of EINVAL. Changing it to EINVAL is not a
> fundamental problem, just a bit more work.
> 
> Looking at Nicolin's series there is a bunch of existing errnos that
> would still need converting, ie EXDEV, EBUSY, EOPNOTSUPP, EFAULT, and
> ENXIO are all returned as codes for 'domain incompatible with device'
> in various drivers. So the patch would still look much the same, just
> changing them to EINVAL instead of EMEDIUMTYPE.
> 
> That leaves the question of the remaining EINVAL's that Nicolin did
> not convert to EMEDIUMTYPE.
> 
> eg in the AMD driver:
> 
> 	if (!check_device(dev))
> 		return -EINVAL;
> 
> 	iommu = rlookup_amd_iommu(dev);
> 	if (!iommu)
> 		return -EINVAL;
> 
> These are all cases of 'something is really wrong with the device or
> iommu, everything will fail'. Other drivers are using ENODEV for this
> already, so we'd probably have an additional patch changing various
> places like that to ENODEV.
> 
> This mixture of error codes is the basic reason why a new code was
> used, because none of the existing codes are used with any
> consistency.
> 
> But OK, I'm on board, lets use more common errnos with specific
> meaning, that can be documented in a comment someplace:
>   ENOMEM - out of memory
>   ENODEV - no domain can attach, device or iommu is messed up
>   EINVAL - the domain is incompatible with the device
>   <others> - Same behavior as ENODEV, use is discouraged.
> 
> I think achieving consistency of error codes is a generally desirable
> goal, it makes the error code actually useful.
> 
> Joerg this is a good bit of work, will you be OK with it?
> 
>> Thus as long as we can maintain that basic guarantee that attaching
>> a group to a newly allocated domain can only ever fail for resource
>> allocation reasons and not some spurious "incompatibility", then we
>> don't need any obscure trickery, and a single, clear, error code is
>> in fact enough to say all that needs to be said.
> 
> As above, this is not the case, drivers do seem to have error paths
> that are unconditional on the domain. Perhaps they are just protective
> assertions and never happen.

Right, that's the gist of what I was getting at - I think it's worth 
putting in the effort to audit and fix the drivers so that that *can* be 
the case, then we can have a meaningful error API with standard codes 
effectively for free, rather than just sighing at the existing mess and 
building a slightly esoteric special case on top.

Case in point, the AMD checks quoted above are pointless, since it 
checks the same things in ->probe_device, and if that fails then the 
device won't get a group so there's no way for it to even reach 
->attach_dev any more. I'm sure there's a *lot* of cruft that can be 
cleared out now that per-device and per-domain ops give us this kind of 
inherent robustness.

Cheers,
Robin.

> Regardless, it doesn't matter. If they return ENODEV or EINVAL the
> VFIO side algorithm will continue to work fine, it just does alot more
> work if EINVAL is permanently returned.
> 
> Thanks,
> Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-08 10:25                 ` Robin Murphy
  0 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-09-08 10:25 UTC (permalink / raw)
  To: Jason Gunthorpe, Joerg Roedel
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer, sven,
	linux-arm-msm, vdumpa, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, linux-s390, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, dwmw2, baolu.lu

On 2022-09-08 01:43, Jason Gunthorpe wrote:
> On Wed, Sep 07, 2022 at 08:41:13PM +0100, Robin Murphy wrote:
> 
>>>> FWIW, we're now very close to being able to validate dev->iommu against
>>>> where the domain came from in core code, and so short-circuit ->attach_dev
>>>> entirely if they don't match.
>>>
>>> I don't think this is a long term direction. We have systems now with
>>> a number of SMMU blocks and we really are going to see a need that
>>> they share the iommu_domains so we don't have unncessary overheads
>>> from duplicated io page table memory.
>>>
>>> So ultimately I'd expect to pass the iommu_domain to the driver and
>>> the driver will decide if the page table memory it represents is
>>> compatible or not. Restricting to only the same iommu instance isn't
>>> good..
>>
>> Who said IOMMU instance?
> 
> Ah, I completely misunderstood what 'dev->iommu' was referring too, OK
> I see.
> 
>> Again, not what I was suggesting. In fact the nature of iommu_attach_group()
>> already rules out bogus devices getting this far, so all a driver currently
>> has to worry about is compatibility of a device that it definitely probed
>> with a domain that it definitely allocated. Therefore, from a caller's point
>> of view, if attaching to an existing domain returns -EINVAL, try another
>> domain; multiple different existing domains can be tried, and may also
>> return -EINVAL for the same or different reasons; the final attempt is to
>> allocate a fresh domain and attach to that, which should always be nominally
>> valid and *never* return -EINVAL. If any attempt returns any other error,
>> bail out down the usual "this should have worked but something went wrong"
>> path. Even if any driver did have a nonsensical "nothing went wrong, I just
>> can't attach my device to any of my domains" case, I don't think it would
>> really need distinguishing from any other general error anyway.
> 
> The algorithm you described is exactly what this series does, it just
> used EMEDIUMTYPE instead of EINVAL. Changing it to EINVAL is not a
> fundamental problem, just a bit more work.
> 
> Looking at Nicolin's series there is a bunch of existing errnos that
> would still need converting, ie EXDEV, EBUSY, EOPNOTSUPP, EFAULT, and
> ENXIO are all returned as codes for 'domain incompatible with device'
> in various drivers. So the patch would still look much the same, just
> changing them to EINVAL instead of EMEDIUMTYPE.
> 
> That leaves the question of the remaining EINVAL's that Nicolin did
> not convert to EMEDIUMTYPE.
> 
> eg in the AMD driver:
> 
> 	if (!check_device(dev))
> 		return -EINVAL;
> 
> 	iommu = rlookup_amd_iommu(dev);
> 	if (!iommu)
> 		return -EINVAL;
> 
> These are all cases of 'something is really wrong with the device or
> iommu, everything will fail'. Other drivers are using ENODEV for this
> already, so we'd probably have an additional patch changing various
> places like that to ENODEV.
> 
> This mixture of error codes is the basic reason why a new code was
> used, because none of the existing codes are used with any
> consistency.
> 
> But OK, I'm on board, lets use more common errnos with specific
> meaning, that can be documented in a comment someplace:
>   ENOMEM - out of memory
>   ENODEV - no domain can attach, device or iommu is messed up
>   EINVAL - the domain is incompatible with the device
>   <others> - Same behavior as ENODEV, use is discouraged.
> 
> I think achieving consistency of error codes is a generally desirable
> goal, it makes the error code actually useful.
> 
> Joerg this is a good bit of work, will you be OK with it?
> 
>> Thus as long as we can maintain that basic guarantee that attaching
>> a group to a newly allocated domain can only ever fail for resource
>> allocation reasons and not some spurious "incompatibility", then we
>> don't need any obscure trickery, and a single, clear, error code is
>> in fact enough to say all that needs to be said.
> 
> As above, this is not the case, drivers do seem to have error paths
> that are unconditional on the domain. Perhaps they are just protective
> assertions and never happen.

Right, that's the gist of what I was getting at - I think it's worth 
putting in the effort to audit and fix the drivers so that that *can* be 
the case, then we can have a meaningful error API with standard codes 
effectively for free, rather than just sighing at the existing mess and 
building a slightly esoteric special case on top.

Case in point, the AMD checks quoted above are pointless, since it 
checks the same things in ->probe_device, and if that fails then the 
device won't get a group so there's no way for it to even reach 
->attach_dev any more. I'm sure there's a *lot* of cruft that can be 
cleared out now that per-device and per-domain ops give us this kind of 
inherent robustness.

Cheers,
Robin.

> Regardless, it doesn't matter. If they return ENODEV or EINVAL the
> VFIO side algorithm will continue to work fine, it just does alot more
> work if EINVAL is permanently returned.
> 
> Thanks,
> Jason
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-08 10:25                 ` Robin Murphy
  0 siblings, 0 replies; 67+ messages in thread
From: Robin Murphy @ 2022-09-08 10:25 UTC (permalink / raw)
  To: Jason Gunthorpe, Joerg Roedel
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer,
	kevin.tian, sven, linux-arm-msm, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, linux-s390, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, dwmw2, baolu.lu

On 2022-09-08 01:43, Jason Gunthorpe wrote:
> On Wed, Sep 07, 2022 at 08:41:13PM +0100, Robin Murphy wrote:
> 
>>>> FWIW, we're now very close to being able to validate dev->iommu against
>>>> where the domain came from in core code, and so short-circuit ->attach_dev
>>>> entirely if they don't match.
>>>
>>> I don't think this is a long term direction. We have systems now with
>>> a number of SMMU blocks and we really are going to see a need that
>>> they share the iommu_domains so we don't have unncessary overheads
>>> from duplicated io page table memory.
>>>
>>> So ultimately I'd expect to pass the iommu_domain to the driver and
>>> the driver will decide if the page table memory it represents is
>>> compatible or not. Restricting to only the same iommu instance isn't
>>> good..
>>
>> Who said IOMMU instance?
> 
> Ah, I completely misunderstood what 'dev->iommu' was referring too, OK
> I see.
> 
>> Again, not what I was suggesting. In fact the nature of iommu_attach_group()
>> already rules out bogus devices getting this far, so all a driver currently
>> has to worry about is compatibility of a device that it definitely probed
>> with a domain that it definitely allocated. Therefore, from a caller's point
>> of view, if attaching to an existing domain returns -EINVAL, try another
>> domain; multiple different existing domains can be tried, and may also
>> return -EINVAL for the same or different reasons; the final attempt is to
>> allocate a fresh domain and attach to that, which should always be nominally
>> valid and *never* return -EINVAL. If any attempt returns any other error,
>> bail out down the usual "this should have worked but something went wrong"
>> path. Even if any driver did have a nonsensical "nothing went wrong, I just
>> can't attach my device to any of my domains" case, I don't think it would
>> really need distinguishing from any other general error anyway.
> 
> The algorithm you described is exactly what this series does, it just
> used EMEDIUMTYPE instead of EINVAL. Changing it to EINVAL is not a
> fundamental problem, just a bit more work.
> 
> Looking at Nicolin's series there is a bunch of existing errnos that
> would still need converting, ie EXDEV, EBUSY, EOPNOTSUPP, EFAULT, and
> ENXIO are all returned as codes for 'domain incompatible with device'
> in various drivers. So the patch would still look much the same, just
> changing them to EINVAL instead of EMEDIUMTYPE.
> 
> That leaves the question of the remaining EINVAL's that Nicolin did
> not convert to EMEDIUMTYPE.
> 
> eg in the AMD driver:
> 
> 	if (!check_device(dev))
> 		return -EINVAL;
> 
> 	iommu = rlookup_amd_iommu(dev);
> 	if (!iommu)
> 		return -EINVAL;
> 
> These are all cases of 'something is really wrong with the device or
> iommu, everything will fail'. Other drivers are using ENODEV for this
> already, so we'd probably have an additional patch changing various
> places like that to ENODEV.
> 
> This mixture of error codes is the basic reason why a new code was
> used, because none of the existing codes are used with any
> consistency.
> 
> But OK, I'm on board, lets use more common errnos with specific
> meaning, that can be documented in a comment someplace:
>   ENOMEM - out of memory
>   ENODEV - no domain can attach, device or iommu is messed up
>   EINVAL - the domain is incompatible with the device
>   <others> - Same behavior as ENODEV, use is discouraged.
> 
> I think achieving consistency of error codes is a generally desirable
> goal, it makes the error code actually useful.
> 
> Joerg this is a good bit of work, will you be OK with it?
> 
>> Thus as long as we can maintain that basic guarantee that attaching
>> a group to a newly allocated domain can only ever fail for resource
>> allocation reasons and not some spurious "incompatibility", then we
>> don't need any obscure trickery, and a single, clear, error code is
>> in fact enough to say all that needs to be said.
> 
> As above, this is not the case, drivers do seem to have error paths
> that are unconditional on the domain. Perhaps they are just protective
> assertions and never happen.

Right, that's the gist of what I was getting at - I think it's worth 
putting in the effort to audit and fix the drivers so that that *can* be 
the case, then we can have a meaningful error API with standard codes 
effectively for free, rather than just sighing at the existing mess and 
building a slightly esoteric special case on top.

Case in point, the AMD checks quoted above are pointless, since it 
checks the same things in ->probe_device, and if that fails then the 
device won't get a group so there's no way for it to even reach 
->attach_dev any more. I'm sure there's a *lot* of cruft that can be 
cleared out now that per-device and per-domain ops give us this kind of 
inherent robustness.

Cheers,
Robin.

> Regardless, it doesn't matter. If they return ENODEV or EINVAL the
> VFIO side algorithm will continue to work fine, it just does alot more
> work if EINVAL is permanently returned.
> 
> Thanks,
> Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-08  9:30                 ` Tian, Kevin
@ 2022-09-08 12:08                   ` Jason Gunthorpe
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-08 12:08 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Robin Murphy, Joerg Roedel, Nicolin Chen, will, alex.williamson,
	suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm

On Thu, Sep 08, 2022 at 09:30:57AM +0000, Tian, Kevin wrote:
> > But OK, I'm on board, lets use more common errnos with specific
> > meaning, that can be documented in a comment someplace:
> >  ENOMEM - out of memory
> >  ENODEV - no domain can attach, device or iommu is messed up
> >  EINVAL - the domain is incompatible with the device
> >  <others> - Same behavior as ENODEV, use is discouraged.
> 
> There are also cases where common kAPIs are called in the attach
> path which may return -EINVAL and random errno, e.g.:
> 
> omap_iommu_attach_dev()
>   omap_iommu_attach()
>     iommu_enable()
>       pm_runtime_get_sync()
>         __pm_runtime_resume()
>           rpm_resume()
> 	if (dev->power.runtime_error) {
> 		retval = -EINVAL;
>             
> viommu_attach_dev()
>   viommu_domain_finalise()
>     ida_alloc_range()
> 	if ((int)min < 0)
> 		return -ENOSPC;

Yes, this is was also on my mind with choosing an unpopular return
code, it has a higher chance of not coming out of some other kernel
API

> If we think attach_dev is a slow path and having unnecessary retries
> doesn't hurt then -EINVAL sounds a simpler option. We probably can
> just go using -EINVAL as retry indicator in vfio even w/o changing
> iommu drivers at this point. Then improve them to use consistent
> errno gradually and in a separate effort.

Given Joerg's objection I think we will do EINVAL and just live with
the imperfection.

It is not just slow path, but being inaccurate can mean extra domains
are created when they were not needed. But I think we are getting into
sufficiently unlikely territory that issue can be ignored to make
progress.

Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-08 12:08                   ` Jason Gunthorpe
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-08 12:08 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, Joerg Roedel, jon,
	jonathanh, iommu, Nicolin Chen, yangyingliang, orsonzhai,
	gerald.schaefer, sven, linux-arm-msm, christophe.jaillet,
	baolin.wang, thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, Robin Murphy, baolu.lu

On Thu, Sep 08, 2022 at 09:30:57AM +0000, Tian, Kevin wrote:
> > But OK, I'm on board, lets use more common errnos with specific
> > meaning, that can be documented in a comment someplace:
> >  ENOMEM - out of memory
> >  ENODEV - no domain can attach, device or iommu is messed up
> >  EINVAL - the domain is incompatible with the device
> >  <others> - Same behavior as ENODEV, use is discouraged.
> 
> There are also cases where common kAPIs are called in the attach
> path which may return -EINVAL and random errno, e.g.:
> 
> omap_iommu_attach_dev()
>   omap_iommu_attach()
>     iommu_enable()
>       pm_runtime_get_sync()
>         __pm_runtime_resume()
>           rpm_resume()
> 	if (dev->power.runtime_error) {
> 		retval = -EINVAL;
>             
> viommu_attach_dev()
>   viommu_domain_finalise()
>     ida_alloc_range()
> 	if ((int)min < 0)
> 		return -ENOSPC;

Yes, this is was also on my mind with choosing an unpopular return
code, it has a higher chance of not coming out of some other kernel
API

> If we think attach_dev is a slow path and having unnecessary retries
> doesn't hurt then -EINVAL sounds a simpler option. We probably can
> just go using -EINVAL as retry indicator in vfio even w/o changing
> iommu drivers at this point. Then improve them to use consistent
> errno gradually and in a separate effort.

Given Joerg's objection I think we will do EINVAL and just live with
the imperfection.

It is not just slow path, but being inaccurate can mean extra domains
are created when they were not needed. But I think we are getting into
sufficiently unlikely territory that issue can be ignored to make
progress.

Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-07 17:10           ` Jason Gunthorpe
  (?)
@ 2022-09-08 13:28             ` Joerg Roedel
  -1 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-09-08 13:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nicolin Chen, will, robin.murphy, alex.williamson,
	suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

On Wed, Sep 07, 2022 at 02:10:33PM -0300, Jason Gunthorpe wrote:
> Sure, rust has all sorts of nice things. But the kernel doesn't follow
> rust idioms, and I don't think this is a great place to start
> experimenting with them.

It is actually a great place to start experimenting. The IOMMU
interfaces are rather domain specific and if we get something wrong the
damage is limited to a few callers. There are APIs much more exposed in
the kernel which would be worse for that.

But anyway, I am not insisting on it.

> It has been 3 months since EMEDIUMTYPE was first proposed and 6
> iterations of the series, don't you think it is a bit late in the game
> to try to experiment with rust error handling idioms?

If I am not mistaken, I am the person who gets blamed when crappy IOMMU
code is sent upstream. So it is also up to me to decide in which state
and how close to merging a given patch series is an whether it is
already 'late in the game'.

> So, again, would you be happy with a simple 
> 
>  #define IOMMU_EINCOMPATIBLE_DEVICE xx
> 
> to make it less "re-using random error codes"?

I am wondering if this can be solved by better defining what the return
codes mean and adjust the call-back functions to match the definition.
Something like:

	-ENODEV : Device not mapped my an IOMMU
	-EBUSY  : Device attached and domain can not be changed
	-EINVAL : Device and domain are incompatible
	...

That would be much more intuitive than using something obscure like
EMEDIUMTYPE.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-08 13:28             ` Joerg Roedel
  0 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-09-08 13:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer, sven,
	linux-arm-msm, vdumpa, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, robin.murphy, baolu.lu

On Wed, Sep 07, 2022 at 02:10:33PM -0300, Jason Gunthorpe wrote:
> Sure, rust has all sorts of nice things. But the kernel doesn't follow
> rust idioms, and I don't think this is a great place to start
> experimenting with them.

It is actually a great place to start experimenting. The IOMMU
interfaces are rather domain specific and if we get something wrong the
damage is limited to a few callers. There are APIs much more exposed in
the kernel which would be worse for that.

But anyway, I am not insisting on it.

> It has been 3 months since EMEDIUMTYPE was first proposed and 6
> iterations of the series, don't you think it is a bit late in the game
> to try to experiment with rust error handling idioms?

If I am not mistaken, I am the person who gets blamed when crappy IOMMU
code is sent upstream. So it is also up to me to decide in which state
and how close to merging a given patch series is an whether it is
already 'late in the game'.

> So, again, would you be happy with a simple 
> 
>  #define IOMMU_EINCOMPATIBLE_DEVICE xx
> 
> to make it less "re-using random error codes"?

I am wondering if this can be solved by better defining what the return
codes mean and adjust the call-back functions to match the definition.
Something like:

	-ENODEV : Device not mapped my an IOMMU
	-EBUSY  : Device attached and domain can not be changed
	-EINVAL : Device and domain are incompatible
	...

That would be much more intuitive than using something obscure like
EMEDIUMTYPE.

Regards,

	Joerg
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-08 13:28             ` Joerg Roedel
  0 siblings, 0 replies; 67+ messages in thread
From: Joerg Roedel @ 2022-09-08 13:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer,
	kevin.tian, sven, linux-arm-msm, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, robin.murphy, baolu.lu

On Wed, Sep 07, 2022 at 02:10:33PM -0300, Jason Gunthorpe wrote:
> Sure, rust has all sorts of nice things. But the kernel doesn't follow
> rust idioms, and I don't think this is a great place to start
> experimenting with them.

It is actually a great place to start experimenting. The IOMMU
interfaces are rather domain specific and if we get something wrong the
damage is limited to a few callers. There are APIs much more exposed in
the kernel which would be worse for that.

But anyway, I am not insisting on it.

> It has been 3 months since EMEDIUMTYPE was first proposed and 6
> iterations of the series, don't you think it is a bit late in the game
> to try to experiment with rust error handling idioms?

If I am not mistaken, I am the person who gets blamed when crappy IOMMU
code is sent upstream. So it is also up to me to decide in which state
and how close to merging a given patch series is an whether it is
already 'late in the game'.

> So, again, would you be happy with a simple 
> 
>  #define IOMMU_EINCOMPATIBLE_DEVICE xx
> 
> to make it less "re-using random error codes"?

I am wondering if this can be solved by better defining what the return
codes mean and adjust the call-back functions to match the definition.
Something like:

	-ENODEV : Device not mapped my an IOMMU
	-EBUSY  : Device attached and domain can not be changed
	-EINVAL : Device and domain are incompatible
	...

That would be much more intuitive than using something obscure like
EMEDIUMTYPE.

Regards,

	Joerg

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-08 13:28             ` Joerg Roedel
@ 2022-09-08 16:14               ` Jason Gunthorpe
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-08 16:14 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Nicolin Chen, will, robin.murphy, alex.williamson,
	suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm, kevin.tian

On Thu, Sep 08, 2022 at 03:28:22PM +0200, Joerg Roedel wrote:
> > It has been 3 months since EMEDIUMTYPE was first proposed and 6
> > iterations of the series, don't you think it is a bit late in the game
> > to try to experiment with rust error handling idioms?
> 
> If I am not mistaken, I am the person who gets blamed when crappy IOMMU
> code is sent upstream. So it is also up to me to decide in which state
> and how close to merging a given patch series is an whether it is
> already 'late in the game'.

I don't think the maintainer is the one who gets blamed. The community
is responsible as a collective group for it's decisions. The
maintainer is the leader of the community, responsible to foster it,
and contributes their guidance, but doesn't bare an unlimited
responsibility for what is merged.

In a case like this I am the advocate, Nicolin wrote the patches,
Kevin reviewed, Alex ack'd them - we as a group are ultimately
responsible to repair, defend, or whatever is needed.

> I am wondering if this can be solved by better defining what the return
> codes mean and adjust the call-back functions to match the definition.
> Something like:
> 
> 	-ENODEV : Device not mapped my an IOMMU
> 	-EBUSY  : Device attached and domain can not be changed
> 	-EINVAL : Device and domain are incompatible
> 	...

Yes, this was gone over in a side thread the pros/cons, so lets do
it. Nicolin will come with something along these lines.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-08 16:14               ` Jason Gunthorpe
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-08 16:14 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	Nicolin Chen, yangyingliang, orsonzhai, gerald.schaefer,
	kevin.tian, sven, linux-arm-msm, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, robin.murphy, baolu.lu

On Thu, Sep 08, 2022 at 03:28:22PM +0200, Joerg Roedel wrote:
> > It has been 3 months since EMEDIUMTYPE was first proposed and 6
> > iterations of the series, don't you think it is a bit late in the game
> > to try to experiment with rust error handling idioms?
> 
> If I am not mistaken, I am the person who gets blamed when crappy IOMMU
> code is sent upstream. So it is also up to me to decide in which state
> and how close to merging a given patch series is an whether it is
> already 'late in the game'.

I don't think the maintainer is the one who gets blamed. The community
is responsible as a collective group for it's decisions. The
maintainer is the leader of the community, responsible to foster it,
and contributes their guidance, but doesn't bare an unlimited
responsibility for what is merged.

In a case like this I am the advocate, Nicolin wrote the patches,
Kevin reviewed, Alex ack'd them - we as a group are ultimately
responsible to repair, defend, or whatever is needed.

> I am wondering if this can be solved by better defining what the return
> codes mean and adjust the call-back functions to match the definition.
> Something like:
> 
> 	-ENODEV : Device not mapped my an IOMMU
> 	-EBUSY  : Device attached and domain can not be changed
> 	-EINVAL : Device and domain are incompatible
> 	...

Yes, this was gone over in a side thread the pros/cons, so lets do
it. Nicolin will come with something along these lines.

Thanks,
Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-08 16:14               ` Jason Gunthorpe
@ 2022-09-09  3:17                 ` Nicolin Chen
  -1 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-09-09  3:17 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe
  Cc: will, robin.murphy, alex.williamson, suravee.suthikulpanit,
	marcan, sven, alyssa, robdclark, dwmw2, baolu.lu, mjrosato,
	gerald.schaefer, orsonzhai, baolin.wang, zhang.lyra,
	thierry.reding, vdumpa, jonathanh, jean-philippe, cohuck, tglx,
	shameerali.kolothum.thodi, thunder.leizhen, christophe.jaillet,
	yangyingliang, jon, iommu, linux-kernel, asahi, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm,
	kevin.tian

On Thu, Sep 08, 2022 at 01:14:42PM -0300, Jason Gunthorpe wrote:

> > I am wondering if this can be solved by better defining what the return
> > codes mean and adjust the call-back functions to match the definition.
> > Something like:
> > 
> > 	-ENODEV : Device not mapped my an IOMMU
> > 	-EBUSY  : Device attached and domain can not be changed
> > 	-EINVAL : Device and domain are incompatible
> > 	...
> 
> Yes, this was gone over in a side thread the pros/cons, so lets do
> it. Nicolin will come with something along these lines.

I have started this effort by combining this list and the one from
the side thread:

@@ -266,6 +266,13 @@ struct iommu_ops {
 /**
  * struct iommu_domain_ops - domain specific operations
  * @attach_dev: attach an iommu domain to a device
+ *              Rules of its return errno:
+ *               ENOMEM  - Out of memory
+ *               EINVAL  - Device and domain are incompatible
+ *               EBUSY   - Device is attached to a domain and cannot be changed
+ *               ENODEV  - Device or domain is messed up: device is not mapped
+ *                         to an IOMMU, no domain can attach, and etc.
+ *              <others> - Same behavior as ENODEV, use is discouraged
  * @detach_dev: detach an iommu domain from a device
  * @map: map a physically contiguous memory region to an iommu domain
  * @map_pages: map a physically contiguous set of pages of the same size to

I am now going through every single return value of ->attach_dev to
make sure the list above applies. And I will also incorporate things
like Robin's comments at the AMD IOMMU driver.

And if the change occurs to be bigger, I guess that separating it to
be an IOMMU series from this VFIO one might be better.

Thanks
Nic

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-09  3:17                 ` Nicolin Chen
  0 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-09-09  3:17 UTC (permalink / raw)
  To: Joerg Roedel, Jason Gunthorpe
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	yangyingliang, orsonzhai, gerald.schaefer, kevin.tian, sven,
	linux-arm-msm, alex.williamson, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, robin.murphy, baolu.lu

On Thu, Sep 08, 2022 at 01:14:42PM -0300, Jason Gunthorpe wrote:

> > I am wondering if this can be solved by better defining what the return
> > codes mean and adjust the call-back functions to match the definition.
> > Something like:
> > 
> > 	-ENODEV : Device not mapped my an IOMMU
> > 	-EBUSY  : Device attached and domain can not be changed
> > 	-EINVAL : Device and domain are incompatible
> > 	...
> 
> Yes, this was gone over in a side thread the pros/cons, so lets do
> it. Nicolin will come with something along these lines.

I have started this effort by combining this list and the one from
the side thread:

@@ -266,6 +266,13 @@ struct iommu_ops {
 /**
  * struct iommu_domain_ops - domain specific operations
  * @attach_dev: attach an iommu domain to a device
+ *              Rules of its return errno:
+ *               ENOMEM  - Out of memory
+ *               EINVAL  - Device and domain are incompatible
+ *               EBUSY   - Device is attached to a domain and cannot be changed
+ *               ENODEV  - Device or domain is messed up: device is not mapped
+ *                         to an IOMMU, no domain can attach, and etc.
+ *              <others> - Same behavior as ENODEV, use is discouraged
  * @detach_dev: detach an iommu domain from a device
  * @map: map a physically contiguous memory region to an iommu domain
  * @map_pages: map a physically contiguous set of pages of the same size to

I am now going through every single return value of ->attach_dev to
make sure the list above applies. And I will also incorporate things
like Robin's comments at the AMD IOMMU driver.

And if the change occurs to be bigger, I guess that separating it to
be an IOMMU series from this VFIO one might be better.

Thanks
Nic

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-09  3:17                 ` Nicolin Chen
  (?)
@ 2022-09-09  5:00                   ` Tian, Kevin
  -1 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-09  5:00 UTC (permalink / raw)
  To: Nicolin Chen, Joerg Roedel, Jason Gunthorpe
  Cc: will, robin.murphy, alex.williamson, suravee.suthikulpanit,
	marcan, sven, alyssa, robdclark, dwmw2, baolu.lu, mjrosato,
	gerald.schaefer, orsonzhai, baolin.wang, zhang.lyra,
	thierry.reding, vdumpa, jonathanh, jean-philippe, cohuck, tglx,
	shameerali.kolothum.thodi, thunder.leizhen, christophe.jaillet,
	yangyingliang, jon, iommu, linux-kernel, asahi, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Friday, September 9, 2022 11:17 AM
> 
> On Thu, Sep 08, 2022 at 01:14:42PM -0300, Jason Gunthorpe wrote:
> 
> > > I am wondering if this can be solved by better defining what the return
> > > codes mean and adjust the call-back functions to match the definition.
> > > Something like:
> > >
> > > 	-ENODEV : Device not mapped my an IOMMU
> > > 	-EBUSY  : Device attached and domain can not be changed
> > > 	-EINVAL : Device and domain are incompatible
> > > 	...
> >
> > Yes, this was gone over in a side thread the pros/cons, so lets do
> > it. Nicolin will come with something along these lines.
> 
> I have started this effort by combining this list and the one from
> the side thread:
> 
> @@ -266,6 +266,13 @@ struct iommu_ops {
>  /**
>   * struct iommu_domain_ops - domain specific operations
>   * @attach_dev: attach an iommu domain to a device
> + *              Rules of its return errno:
> + *               ENOMEM  - Out of memory
> + *               EINVAL  - Device and domain are incompatible
> + *               EBUSY   - Device is attached to a domain and cannot be changed

With this definition then probably @attach_dev should not return -EBUSY
at all given it's already checked in the start of __iommu_attach_group():

	if (group->domain && group->domain != group->default_domain &&
	    group->domain != group->blocking_domain)
		return -EBUSY;

> + *               ENODEV  - Device or domain is messed up: device is not mapped
> + *                         to an IOMMU, no domain can attach, and etc.

if domain is messed up then should return -EINVAL given using another domain
might just work. IMHO here -ENODEV should only cover device specific problems
preventing this device from being attached to by any domain.

> + *              <others> - Same behavior as ENODEV, use is discouraged

didn't get the "Same behavior" part. Does it suggest all other errnos should
be converted to ENODEV?

btw what about -ENOSPC? It's sane to allocate some resource in the attach
path while the resource might be not available, e.g.:

intel_iommu_attach_device()
  domain_add_dev_info()
    domain_attach_iommu():

	int num, ret = -ENOSPC;
	...
	ndomains = cap_ndoms(iommu->cap);
	num = find_first_zero_bit(iommu->domain_ids, ndomains);
	if (num >= ndomains) {
		pr_err("%s: No free domain ids\n", iommu->name);
		goto err_unlock;
	}

As discussed in a side thread a note might be added to exempt calling
kAPI outside of the iommu driver. 

>   * @detach_dev: detach an iommu domain from a device
>   * @map: map a physically contiguous memory region to an iommu domain
>   * @map_pages: map a physically contiguous set of pages of the same size
> to
> 
> I am now going through every single return value of ->attach_dev to
> make sure the list above applies. And I will also incorporate things
> like Robin's comments at the AMD IOMMU driver.
> 
> And if the change occurs to be bigger, I guess that separating it to
> be an IOMMU series from this VFIO one might be better.
> 
> Thanks
> Nic

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-09  5:00                   ` Tian, Kevin
  0 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-09  5:00 UTC (permalink / raw)
  To: Nicolin Chen, Joerg Roedel, Jason Gunthorpe
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	yangyingliang, orsonzhai, gerald.schaefer, sven, linux-arm-msm,
	christophe.jaillet, baolin.wang, thunder.leizhen, linux-tegra,
	tglx, virtualization, linux-arm-kernel, dwmw2, cohuck, vdumpa,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, robin.murphy, baolu.lu

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Friday, September 9, 2022 11:17 AM
> 
> On Thu, Sep 08, 2022 at 01:14:42PM -0300, Jason Gunthorpe wrote:
> 
> > > I am wondering if this can be solved by better defining what the return
> > > codes mean and adjust the call-back functions to match the definition.
> > > Something like:
> > >
> > > 	-ENODEV : Device not mapped my an IOMMU
> > > 	-EBUSY  : Device attached and domain can not be changed
> > > 	-EINVAL : Device and domain are incompatible
> > > 	...
> >
> > Yes, this was gone over in a side thread the pros/cons, so lets do
> > it. Nicolin will come with something along these lines.
> 
> I have started this effort by combining this list and the one from
> the side thread:
> 
> @@ -266,6 +266,13 @@ struct iommu_ops {
>  /**
>   * struct iommu_domain_ops - domain specific operations
>   * @attach_dev: attach an iommu domain to a device
> + *              Rules of its return errno:
> + *               ENOMEM  - Out of memory
> + *               EINVAL  - Device and domain are incompatible
> + *               EBUSY   - Device is attached to a domain and cannot be changed

With this definition then probably @attach_dev should not return -EBUSY
at all given it's already checked in the start of __iommu_attach_group():

	if (group->domain && group->domain != group->default_domain &&
	    group->domain != group->blocking_domain)
		return -EBUSY;

> + *               ENODEV  - Device or domain is messed up: device is not mapped
> + *                         to an IOMMU, no domain can attach, and etc.

if domain is messed up then should return -EINVAL given using another domain
might just work. IMHO here -ENODEV should only cover device specific problems
preventing this device from being attached to by any domain.

> + *              <others> - Same behavior as ENODEV, use is discouraged

didn't get the "Same behavior" part. Does it suggest all other errnos should
be converted to ENODEV?

btw what about -ENOSPC? It's sane to allocate some resource in the attach
path while the resource might be not available, e.g.:

intel_iommu_attach_device()
  domain_add_dev_info()
    domain_attach_iommu():

	int num, ret = -ENOSPC;
	...
	ndomains = cap_ndoms(iommu->cap);
	num = find_first_zero_bit(iommu->domain_ids, ndomains);
	if (num >= ndomains) {
		pr_err("%s: No free domain ids\n", iommu->name);
		goto err_unlock;
	}

As discussed in a side thread a note might be added to exempt calling
kAPI outside of the iommu driver. 

>   * @detach_dev: detach an iommu domain from a device
>   * @map: map a physically contiguous memory region to an iommu domain
>   * @map_pages: map a physically contiguous set of pages of the same size
> to
> 
> I am now going through every single return value of ->attach_dev to
> make sure the list above applies. And I will also incorporate things
> like Robin's comments at the AMD IOMMU driver.
> 
> And if the change occurs to be bigger, I guess that separating it to
> be an IOMMU series from this VFIO one might be better.
> 
> Thanks
> Nic
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-09  5:00                   ` Tian, Kevin
  0 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-09  5:00 UTC (permalink / raw)
  To: Nicolin Chen, Joerg Roedel, Jason Gunthorpe
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	yangyingliang, orsonzhai, gerald.schaefer, sven, linux-arm-msm,
	alex.williamson, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, robin.murphy, baolu.lu

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Friday, September 9, 2022 11:17 AM
> 
> On Thu, Sep 08, 2022 at 01:14:42PM -0300, Jason Gunthorpe wrote:
> 
> > > I am wondering if this can be solved by better defining what the return
> > > codes mean and adjust the call-back functions to match the definition.
> > > Something like:
> > >
> > > 	-ENODEV : Device not mapped my an IOMMU
> > > 	-EBUSY  : Device attached and domain can not be changed
> > > 	-EINVAL : Device and domain are incompatible
> > > 	...
> >
> > Yes, this was gone over in a side thread the pros/cons, so lets do
> > it. Nicolin will come with something along these lines.
> 
> I have started this effort by combining this list and the one from
> the side thread:
> 
> @@ -266,6 +266,13 @@ struct iommu_ops {
>  /**
>   * struct iommu_domain_ops - domain specific operations
>   * @attach_dev: attach an iommu domain to a device
> + *              Rules of its return errno:
> + *               ENOMEM  - Out of memory
> + *               EINVAL  - Device and domain are incompatible
> + *               EBUSY   - Device is attached to a domain and cannot be changed

With this definition then probably @attach_dev should not return -EBUSY
at all given it's already checked in the start of __iommu_attach_group():

	if (group->domain && group->domain != group->default_domain &&
	    group->domain != group->blocking_domain)
		return -EBUSY;

> + *               ENODEV  - Device or domain is messed up: device is not mapped
> + *                         to an IOMMU, no domain can attach, and etc.

if domain is messed up then should return -EINVAL given using another domain
might just work. IMHO here -ENODEV should only cover device specific problems
preventing this device from being attached to by any domain.

> + *              <others> - Same behavior as ENODEV, use is discouraged

didn't get the "Same behavior" part. Does it suggest all other errnos should
be converted to ENODEV?

btw what about -ENOSPC? It's sane to allocate some resource in the attach
path while the resource might be not available, e.g.:

intel_iommu_attach_device()
  domain_add_dev_info()
    domain_attach_iommu():

	int num, ret = -ENOSPC;
	...
	ndomains = cap_ndoms(iommu->cap);
	num = find_first_zero_bit(iommu->domain_ids, ndomains);
	if (num >= ndomains) {
		pr_err("%s: No free domain ids\n", iommu->name);
		goto err_unlock;
	}

As discussed in a side thread a note might be added to exempt calling
kAPI outside of the iommu driver. 

>   * @detach_dev: detach an iommu domain from a device
>   * @map: map a physically contiguous memory region to an iommu domain
>   * @map_pages: map a physically contiguous set of pages of the same size
> to
> 
> I am now going through every single return value of ->attach_dev to
> make sure the list above applies. And I will also incorporate things
> like Robin's comments at the AMD IOMMU driver.
> 
> And if the change occurs to be bigger, I guess that separating it to
> be an IOMMU series from this VFIO one might be better.
> 
> Thanks
> Nic

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-09  5:00                   ` Tian, Kevin
@ 2022-09-09 12:07                     ` Jason Gunthorpe
  -1 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-09 12:07 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Nicolin Chen, Joerg Roedel, will, robin.murphy, alex.williamson,
	suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm

On Fri, Sep 09, 2022 at 05:00:16AM +0000, Tian, Kevin wrote:

> > I have started this effort by combining this list and the one from
> > the side thread:
> > 
> > @@ -266,6 +266,13 @@ struct iommu_ops {
> >  /**
> >   * struct iommu_domain_ops - domain specific operations
> >   * @attach_dev: attach an iommu domain to a device
> > + *              Rules of its return errno:
> > + *               ENOMEM  - Out of memory
> > + *               EINVAL  - Device and domain are incompatible
> > + *               EBUSY   - Device is attached to a domain and cannot be changed
> 
> With this definition then probably @attach_dev should not return -EBUSY
> at all given it's already checked in the start of __iommu_attach_group():

I think the EBUSY would be only for non-conforming drivers. The API
semantic is you can always attach a new domain and replace an existing
domain.

So things like AMD's "can't do anything but idenitity on RID when
PASID enabled" would be -EBUSY.

Seems right that it should be rare though.

> > + *               ENODEV  - Device or domain is messed up: device is not mapped
> > + *                         to an IOMMU, no domain can attach, and etc.
> 
> if domain is messed up then should return -EINVAL given using another domain
> might just work. IMHO here -ENODEV should only cover device specific problems
> preventing this device from being attached to by any domain.

Agree
 
> > + *              <others> - Same behavior as ENODEV, use is discouraged
> 
> didn't get the "Same behavior" part. Does it suggest all other errnos should
> be converted to ENODEV?

It says all other errnos should be treated as ENODEV by the caller but
forwarded to userspace for further detail.

> btw what about -ENOSPC? It's sane to allocate some resource in the attach
> path while the resource might be not available, e.g.:

Seems resaonable that it is similar to ENOMEM

> As discussed in a side thread a note might be added to exempt calling
> kAPI outside of the iommu driver. 

Sadly, not really.. The driver is responsible to santize this if it is
relevant. It is the main downside of this approach.

Jason

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-09 12:07                     ` Jason Gunthorpe
  0 siblings, 0 replies; 67+ messages in thread
From: Jason Gunthorpe @ 2022-09-09 12:07 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, Joerg Roedel, jon,
	jonathanh, iommu, Nicolin Chen, yangyingliang, orsonzhai,
	gerald.schaefer, sven, linux-arm-msm, christophe.jaillet,
	baolin.wang, thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, robin.murphy, baolu.lu

On Fri, Sep 09, 2022 at 05:00:16AM +0000, Tian, Kevin wrote:

> > I have started this effort by combining this list and the one from
> > the side thread:
> > 
> > @@ -266,6 +266,13 @@ struct iommu_ops {
> >  /**
> >   * struct iommu_domain_ops - domain specific operations
> >   * @attach_dev: attach an iommu domain to a device
> > + *              Rules of its return errno:
> > + *               ENOMEM  - Out of memory
> > + *               EINVAL  - Device and domain are incompatible
> > + *               EBUSY   - Device is attached to a domain and cannot be changed
> 
> With this definition then probably @attach_dev should not return -EBUSY
> at all given it's already checked in the start of __iommu_attach_group():

I think the EBUSY would be only for non-conforming drivers. The API
semantic is you can always attach a new domain and replace an existing
domain.

So things like AMD's "can't do anything but idenitity on RID when
PASID enabled" would be -EBUSY.

Seems right that it should be rare though.

> > + *               ENODEV  - Device or domain is messed up: device is not mapped
> > + *                         to an IOMMU, no domain can attach, and etc.
> 
> if domain is messed up then should return -EINVAL given using another domain
> might just work. IMHO here -ENODEV should only cover device specific problems
> preventing this device from being attached to by any domain.

Agree
 
> > + *              <others> - Same behavior as ENODEV, use is discouraged
> 
> didn't get the "Same behavior" part. Does it suggest all other errnos should
> be converted to ENODEV?

It says all other errnos should be treated as ENODEV by the caller but
forwarded to userspace for further detail.

> btw what about -ENOSPC? It's sane to allocate some resource in the attach
> path while the resource might be not available, e.g.:

Seems resaonable that it is similar to ENOMEM

> As discussed in a side thread a note might be added to exempt calling
> kAPI outside of the iommu driver. 

Sadly, not really.. The driver is responsible to santize this if it is
relevant. It is the main downside of this approach.

Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-08 12:08                   ` Jason Gunthorpe
@ 2022-09-10 23:35                     ` Nicolin Chen
  -1 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-09-10 23:35 UTC (permalink / raw)
  To: Robin Murphy, Joerg Roedel, Tian, Kevin, Jason Gunthorpe
  Cc: will, alex.williamson, suravee.suthikulpanit, marcan, sven,
	alyssa, robdclark, dwmw2, baolu.lu, mjrosato, gerald.schaefer,
	orsonzhai, baolin.wang, zhang.lyra, thierry.reding, vdumpa,
	jonathanh, jean-philippe, cohuck, tglx,
	shameerali.kolothum.thodi, thunder.leizhen, christophe.jaillet,
	yangyingliang, jon, iommu, linux-kernel, asahi, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm

On Thu, Sep 08, 2022 at 09:08:38AM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 08, 2022 at 09:30:57AM +0000, Tian, Kevin wrote:

> > There are also cases where common kAPIs are called in the attach
> > path which may return -EINVAL and random errno, e.g.:
> > 
> > omap_iommu_attach_dev()
> >   omap_iommu_attach()
> >     iommu_enable()
> >       pm_runtime_get_sync()
> >         __pm_runtime_resume()
> >           rpm_resume()
> > 	if (dev->power.runtime_error) {
> > 		retval = -EINVAL;
 
> Yes, this is was also on my mind with choosing an unpopular return
> code, it has a higher chance of not coming out of some other kernel
> API

I wonder if it would be safe to just treat a pm_runtime_get_sync()
failure as -ENODEV, since a PM resume() mostly occurs to the IOMMU
that an IOMMU client/master device is behind, while an iommu_domain
rarely intervenes.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-10 23:35                     ` Nicolin Chen
  0 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-09-10 23:35 UTC (permalink / raw)
  To: Robin Murphy, Joerg Roedel, Tian, Kevin, Jason Gunthorpe
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	yangyingliang, orsonzhai, gerald.schaefer, sven, linux-arm-msm,
	alex.williamson, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, linux-s390, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, dwmw2, baolu.lu

On Thu, Sep 08, 2022 at 09:08:38AM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 08, 2022 at 09:30:57AM +0000, Tian, Kevin wrote:

> > There are also cases where common kAPIs are called in the attach
> > path which may return -EINVAL and random errno, e.g.:
> > 
> > omap_iommu_attach_dev()
> >   omap_iommu_attach()
> >     iommu_enable()
> >       pm_runtime_get_sync()
> >         __pm_runtime_resume()
> >           rpm_resume()
> > 	if (dev->power.runtime_error) {
> > 		retval = -EINVAL;
 
> Yes, this is was also on my mind with choosing an unpopular return
> code, it has a higher chance of not coming out of some other kernel
> API

I wonder if it would be safe to just treat a pm_runtime_get_sync()
failure as -ENODEV, since a PM resume() mostly occurs to the IOMMU
that an IOMMU client/master device is behind, while an iommu_domain
rarely intervenes.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-09 12:07                     ` Jason Gunthorpe
  (?)
@ 2022-09-13  2:22                       ` Tian, Kevin
  -1 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-13  2:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nicolin Chen, Joerg Roedel, will, robin.murphy, alex.williamson,
	suravee.suthikulpanit, marcan, sven, alyssa, robdclark, dwmw2,
	baolu.lu, mjrosato, gerald.schaefer, orsonzhai, baolin.wang,
	zhang.lyra, thierry.reding, vdumpa, jonathanh, jean-philippe,
	cohuck, tglx, shameerali.kolothum.thodi, thunder.leizhen,
	christophe.jaillet, yangyingliang, jon, iommu, linux-kernel,
	asahi, linux-arm-kernel, linux-arm-msm, linux-s390, linux-tegra,
	virtualization, kvm

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, September 9, 2022 8:08 PM
> 
> 
> > As discussed in a side thread a note might be added to exempt calling
> > kAPI outside of the iommu driver.
> 
> Sadly, not really.. The driver is responsible to santize this if it is
> relevant. It is the main downside of this approach.
> 

Better provide a clarification on what sanitization means.

e.g. I don't think we should change errno in those kAPIs to match the
definition in iommu subsystem since e.g. -EINVAL really means different
things in different context.

So the sanitization in iommu driver is probably that:

  - If an external kAPI returns -EINVAL, convert it to -ENODEV given iommu
    domain is iommu internal object hence unlikely for external kAPIs to
    capture incompatibility issue between domain/device;
  - Otherwise just pass whatever returned to the caller, following the definition
    of "Same behavior as -ENODEV" above

Thanks
Kevin

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-13  2:22                       ` Tian, Kevin
  0 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-13  2:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, Joerg Roedel, jon,
	jonathanh, iommu, Nicolin Chen, yangyingliang, orsonzhai,
	gerald.schaefer, sven, linux-arm-msm, vdumpa, christophe.jaillet,
	baolin.wang, thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, robin.murphy, baolu.lu

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, September 9, 2022 8:08 PM
> 
> 
> > As discussed in a side thread a note might be added to exempt calling
> > kAPI outside of the iommu driver.
> 
> Sadly, not really.. The driver is responsible to santize this if it is
> relevant. It is the main downside of this approach.
> 

Better provide a clarification on what sanitization means.

e.g. I don't think we should change errno in those kAPIs to match the
definition in iommu subsystem since e.g. -EINVAL really means different
things in different context.

So the sanitization in iommu driver is probably that:

  - If an external kAPI returns -EINVAL, convert it to -ENODEV given iommu
    domain is iommu internal object hence unlikely for external kAPIs to
    capture incompatibility issue between domain/device;
  - Otherwise just pass whatever returned to the caller, following the definition
    of "Same behavior as -ENODEV" above

Thanks
Kevin
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-13  2:22                       ` Tian, Kevin
  0 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-13  2:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, Joerg Roedel, jon,
	jonathanh, iommu, Nicolin Chen, yangyingliang, orsonzhai,
	gerald.schaefer, sven, linux-arm-msm, christophe.jaillet,
	baolin.wang, thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, robin.murphy, baolu.lu

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, September 9, 2022 8:08 PM
> 
> 
> > As discussed in a side thread a note might be added to exempt calling
> > kAPI outside of the iommu driver.
> 
> Sadly, not really.. The driver is responsible to santize this if it is
> relevant. It is the main downside of this approach.
> 

Better provide a clarification on what sanitization means.

e.g. I don't think we should change errno in those kAPIs to match the
definition in iommu subsystem since e.g. -EINVAL really means different
things in different context.

So the sanitization in iommu driver is probably that:

  - If an external kAPI returns -EINVAL, convert it to -ENODEV given iommu
    domain is iommu internal object hence unlikely for external kAPIs to
    capture incompatibility issue between domain/device;
  - Otherwise just pass whatever returned to the caller, following the definition
    of "Same behavior as -ENODEV" above

Thanks
Kevin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-10 23:35                     ` Nicolin Chen
  (?)
@ 2022-09-13  2:24                       ` Tian, Kevin
  -1 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-13  2:24 UTC (permalink / raw)
  To: Nicolin Chen, Robin Murphy, Joerg Roedel, Jason Gunthorpe
  Cc: will, alex.williamson, suravee.suthikulpanit, marcan, sven,
	alyssa, robdclark, dwmw2, baolu.lu, mjrosato, gerald.schaefer,
	orsonzhai, baolin.wang, zhang.lyra, thierry.reding, vdumpa,
	jonathanh, jean-philippe, cohuck, tglx,
	shameerali.kolothum.thodi, thunder.leizhen, christophe.jaillet,
	yangyingliang, jon, iommu, linux-kernel, asahi, linux-arm-kernel,
	linux-arm-msm, linux-s390, linux-tegra, virtualization, kvm

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Sunday, September 11, 2022 7:36 AM
> 
> On Thu, Sep 08, 2022 at 09:08:38AM -0300, Jason Gunthorpe wrote:
> > On Thu, Sep 08, 2022 at 09:30:57AM +0000, Tian, Kevin wrote:
> 
> > > There are also cases where common kAPIs are called in the attach
> > > path which may return -EINVAL and random errno, e.g.:
> > >
> > > omap_iommu_attach_dev()
> > >   omap_iommu_attach()
> > >     iommu_enable()
> > >       pm_runtime_get_sync()
> > >         __pm_runtime_resume()
> > >           rpm_resume()
> > > 	if (dev->power.runtime_error) {
> > > 		retval = -EINVAL;
> 
> > Yes, this is was also on my mind with choosing an unpopular return
> > code, it has a higher chance of not coming out of some other kernel
> > API
> 
> I wonder if it would be safe to just treat a pm_runtime_get_sync()
> failure as -ENODEV, since a PM resume() mostly occurs to the IOMMU
> that an IOMMU client/master device is behind, while an iommu_domain
> rarely intervenes.

Yes, this is a condition preventing the device from being attached by
a domain hence converting -EINVAL to -ENODEV probably makes sense.
But as replied in another we might want to keep other errno's as is.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-13  2:24                       ` Tian, Kevin
  0 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-13  2:24 UTC (permalink / raw)
  To: Nicolin Chen, Robin Murphy, Joerg Roedel, Jason Gunthorpe
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	yangyingliang, orsonzhai, gerald.schaefer, sven, linux-arm-msm,
	christophe.jaillet, baolin.wang, thunder.leizhen, linux-tegra,
	tglx, virtualization, linux-arm-kernel, linux-s390, cohuck,
	vdumpa, shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, dwmw2, baolu.lu

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Sunday, September 11, 2022 7:36 AM
> 
> On Thu, Sep 08, 2022 at 09:08:38AM -0300, Jason Gunthorpe wrote:
> > On Thu, Sep 08, 2022 at 09:30:57AM +0000, Tian, Kevin wrote:
> 
> > > There are also cases where common kAPIs are called in the attach
> > > path which may return -EINVAL and random errno, e.g.:
> > >
> > > omap_iommu_attach_dev()
> > >   omap_iommu_attach()
> > >     iommu_enable()
> > >       pm_runtime_get_sync()
> > >         __pm_runtime_resume()
> > >           rpm_resume()
> > > 	if (dev->power.runtime_error) {
> > > 		retval = -EINVAL;
> 
> > Yes, this is was also on my mind with choosing an unpopular return
> > code, it has a higher chance of not coming out of some other kernel
> > API
> 
> I wonder if it would be safe to just treat a pm_runtime_get_sync()
> failure as -ENODEV, since a PM resume() mostly occurs to the IOMMU
> that an IOMMU client/master device is behind, while an iommu_domain
> rarely intervenes.

Yes, this is a condition preventing the device from being attached by
a domain hence converting -EINVAL to -ENODEV probably makes sense.
But as replied in another we might want to keep other errno's as is.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-13  2:24                       ` Tian, Kevin
  0 siblings, 0 replies; 67+ messages in thread
From: Tian, Kevin @ 2022-09-13  2:24 UTC (permalink / raw)
  To: Nicolin Chen, Robin Murphy, Joerg Roedel, Jason Gunthorpe
  Cc: marcan, mjrosato, linux-kernel, thierry.reding, will, alyssa,
	jean-philippe, kvm, zhang.lyra, jon, jonathanh, iommu,
	yangyingliang, orsonzhai, gerald.schaefer, sven, linux-arm-msm,
	alex.williamson, christophe.jaillet, baolin.wang,
	thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, linux-s390, cohuck, shameerali.kolothum.thodi,
	robdclark, asahi, suravee.suthikulpanit, dwmw2, baolu.lu

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Sunday, September 11, 2022 7:36 AM
> 
> On Thu, Sep 08, 2022 at 09:08:38AM -0300, Jason Gunthorpe wrote:
> > On Thu, Sep 08, 2022 at 09:30:57AM +0000, Tian, Kevin wrote:
> 
> > > There are also cases where common kAPIs are called in the attach
> > > path which may return -EINVAL and random errno, e.g.:
> > >
> > > omap_iommu_attach_dev()
> > >   omap_iommu_attach()
> > >     iommu_enable()
> > >       pm_runtime_get_sync()
> > >         __pm_runtime_resume()
> > >           rpm_resume()
> > > 	if (dev->power.runtime_error) {
> > > 		retval = -EINVAL;
> 
> > Yes, this is was also on my mind with choosing an unpopular return
> > code, it has a higher chance of not coming out of some other kernel
> > API
> 
> I wonder if it would be safe to just treat a pm_runtime_get_sync()
> failure as -ENODEV, since a PM resume() mostly occurs to the IOMMU
> that an IOMMU client/master device is behind, while an iommu_domain
> rarely intervenes.

Yes, this is a condition preventing the device from being attached by
a domain hence converting -EINVAL to -ENODEV probably makes sense.
But as replied in another we might want to keep other errno's as is.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-13  2:22                       ` Tian, Kevin
@ 2022-09-13  5:07                         ` Nicolin Chen
  -1 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-09-13  5:07 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Jason Gunthorpe, Joerg Roedel, will, robin.murphy,
	alex.williamson, suravee.suthikulpanit, marcan, sven, alyssa,
	robdclark, dwmw2, baolu.lu, mjrosato, gerald.schaefer, orsonzhai,
	baolin.wang, zhang.lyra, thierry.reding, vdumpa, jonathanh,
	jean-philippe, cohuck, tglx, shameerali.kolothum.thodi,
	thunder.leizhen, christophe.jaillet, yangyingliang, jon, iommu,
	linux-kernel, asahi, linux-arm-kernel, linux-arm-msm, linux-s390,
	linux-tegra, virtualization, kvm

On Tue, Sep 13, 2022 at 02:22:17AM +0000, Tian, Kevin wrote:
> External email: Use caution opening links or attachments
> 
> 
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Friday, September 9, 2022 8:08 PM
> >
> >
> > > As discussed in a side thread a note might be added to exempt calling
> > > kAPI outside of the iommu driver.
> >
> > Sadly, not really.. The driver is responsible to santize this if it is
> > relevant. It is the main downside of this approach.
> >
> 
> Better provide a clarification on what sanitization means.
> 
> e.g. I don't think we should change errno in those kAPIs to match the
> definition in iommu subsystem since e.g. -EINVAL really means different
> things in different context.
> 
> So the sanitization in iommu driver is probably that:
> 
>   - If an external kAPI returns -EINVAL, convert it to -ENODEV given iommu
>     domain is iommu internal object hence unlikely for external kAPIs to
>     capture incompatibility issue between domain/device;
>   - Otherwise just pass whatever returned to the caller, following the definition
>     of "Same behavior as -ENODEV" above

I added something similar. Thanks!

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ea30f00dc145..190647d018f9 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -266,6 +266,17 @@ struct iommu_ops {
 /**
  * struct iommu_domain_ops - domain specific operations
  * @attach_dev: attach an iommu domain to a device
+ *              Rules of its return errno:
+ *               EINVAL  - Exclusively, device and domain are incompatible. Must
+ *                         avoid kernel prints along with this errno. An EINVAL
+ *                         returned from a kAPI must be coverted to ENODEV if it
+ *                         is device specific, or to some other reasonable errno
+ *                         being listed below
+ *               ENOMEM  - Out of memory
+ *               ENOSPC  - No space left on device
+ *               EBUSY   - Device is attached to a domain and cannot be changed
+ *               ENODEV  - Device specific errors, not able to be attached
+ *              <others> - Treated as ENODEV by the caller. Use is discouraged
  * @detach_dev: detach an iommu domain from a device
  * @map: map a physically contiguous memory region to an iommu domain
  * @map_pages: map a physically contiguous set of pages of the same size to

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-13  5:07                         ` Nicolin Chen
  0 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-09-13  5:07 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, Joerg Roedel, jon,
	jonathanh, iommu, Jason Gunthorpe, yangyingliang, orsonzhai,
	gerald.schaefer, sven, linux-arm-msm, christophe.jaillet,
	baolin.wang, thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, robin.murphy, baolu.lu

On Tue, Sep 13, 2022 at 02:22:17AM +0000, Tian, Kevin wrote:
> External email: Use caution opening links or attachments
> 
> 
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Friday, September 9, 2022 8:08 PM
> >
> >
> > > As discussed in a side thread a note might be added to exempt calling
> > > kAPI outside of the iommu driver.
> >
> > Sadly, not really.. The driver is responsible to santize this if it is
> > relevant. It is the main downside of this approach.
> >
> 
> Better provide a clarification on what sanitization means.
> 
> e.g. I don't think we should change errno in those kAPIs to match the
> definition in iommu subsystem since e.g. -EINVAL really means different
> things in different context.
> 
> So the sanitization in iommu driver is probably that:
> 
>   - If an external kAPI returns -EINVAL, convert it to -ENODEV given iommu
>     domain is iommu internal object hence unlikely for external kAPIs to
>     capture incompatibility issue between domain/device;
>   - Otherwise just pass whatever returned to the caller, following the definition
>     of "Same behavior as -ENODEV" above

I added something similar. Thanks!

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ea30f00dc145..190647d018f9 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -266,6 +266,17 @@ struct iommu_ops {
 /**
  * struct iommu_domain_ops - domain specific operations
  * @attach_dev: attach an iommu domain to a device
+ *              Rules of its return errno:
+ *               EINVAL  - Exclusively, device and domain are incompatible. Must
+ *                         avoid kernel prints along with this errno. An EINVAL
+ *                         returned from a kAPI must be coverted to ENODEV if it
+ *                         is device specific, or to some other reasonable errno
+ *                         being listed below
+ *               ENOMEM  - Out of memory
+ *               ENOSPC  - No space left on device
+ *               EBUSY   - Device is attached to a domain and cannot be changed
+ *               ENODEV  - Device specific errors, not able to be attached
+ *              <others> - Treated as ENODEV by the caller. Use is discouraged
  * @detach_dev: detach an iommu domain from a device
  * @map: map a physically contiguous memory region to an iommu domain
  * @map_pages: map a physically contiguous set of pages of the same size to

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  2022-09-13  2:24                       ` Tian, Kevin
@ 2022-09-13  8:36                         ` Nicolin Chen
  -1 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-09-13  8:36 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Robin Murphy, Joerg Roedel, Jason Gunthorpe, will,
	alex.williamson, suravee.suthikulpanit, marcan, sven, alyssa,
	robdclark, dwmw2, baolu.lu, mjrosato, gerald.schaefer, orsonzhai,
	baolin.wang, zhang.lyra, thierry.reding, vdumpa, jonathanh,
	jean-philippe, cohuck, tglx, shameerali.kolothum.thodi,
	thunder.leizhen, christophe.jaillet, yangyingliang, jon, iommu,
	linux-kernel, asahi, linux-arm-kernel, linux-arm-msm, linux-s390,
	linux-tegra, virtualization, kvm

Hi Kevin,

On Tue, Sep 13, 2022 at 02:24:27AM +0000, Tian, Kevin wrote:
> > I wonder if it would be safe to just treat a pm_runtime_get_sync()
> > failure as -ENODEV, since a PM resume() mostly occurs to the IOMMU
> > that an IOMMU client/master device is behind, while an iommu_domain
> > rarely intervenes.
> 
> Yes, this is a condition preventing the device from being attached by
> a domain hence converting -EINVAL to -ENODEV probably makes sense.
> But as replied in another we might want to keep other errno's as is.

Thanks for the reply and helps. I've sent a new IOMMU series:
https://lore.kernel.org/linux-iommu/20220913082448.31120-1-nicolinc@nvidia.com/

Sorry I forgot to put you in CC, for I plainly copied the send
list from the get_maintainer.pl script :(

Hope you can still find these emails.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
@ 2022-09-13  8:36                         ` Nicolin Chen
  0 siblings, 0 replies; 67+ messages in thread
From: Nicolin Chen @ 2022-09-13  8:36 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: linux-s390, marcan, mjrosato, linux-kernel, thierry.reding, will,
	alyssa, jean-philippe, kvm, zhang.lyra, Joerg Roedel, jon,
	jonathanh, iommu, Jason Gunthorpe, yangyingliang, orsonzhai,
	gerald.schaefer, sven, linux-arm-msm, christophe.jaillet,
	baolin.wang, thunder.leizhen, linux-tegra, tglx, virtualization,
	linux-arm-kernel, dwmw2, cohuck, alex.williamson,
	shameerali.kolothum.thodi, robdclark, asahi,
	suravee.suthikulpanit, Robin Murphy, baolu.lu

Hi Kevin,

On Tue, Sep 13, 2022 at 02:24:27AM +0000, Tian, Kevin wrote:
> > I wonder if it would be safe to just treat a pm_runtime_get_sync()
> > failure as -ENODEV, since a PM resume() mostly occurs to the IOMMU
> > that an IOMMU client/master device is behind, while an iommu_domain
> > rarely intervenes.
> 
> Yes, this is a condition preventing the device from being attached by
> a domain hence converting -EINVAL to -ENODEV probably makes sense.
> But as replied in another we might want to keep other errno's as is.

Thanks for the reply and helps. I've sent a new IOMMU series:
https://lore.kernel.org/linux-iommu/20220913082448.31120-1-nicolinc@nvidia.com/

Sorry I forgot to put you in CC, for I plainly copied the send
list from the get_maintainer.pl script :(

Hope you can still find these emails.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 67+ messages in thread

end of thread, other threads:[~2022-09-13  8:37 UTC | newest]

Thread overview: 67+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-15 18:14 [PATCH v6 0/5] Simplify vfio_iommu_type1 attach/detach routine Nicolin Chen
2022-08-15 18:14 ` Nicolin Chen
2022-08-15 18:14 ` [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain and device/group Nicolin Chen
2022-08-15 18:14   ` Nicolin Chen
2022-09-07 12:41   ` Joerg Roedel
2022-09-07 12:41     ` Joerg Roedel
2022-09-07 12:41     ` Joerg Roedel
2022-09-07 13:47     ` Jason Gunthorpe
2022-09-07 13:47       ` Jason Gunthorpe
2022-09-07 14:06       ` Joerg Roedel
2022-09-07 14:06         ` Joerg Roedel
2022-09-07 14:06         ` Joerg Roedel
2022-09-07 17:10         ` Jason Gunthorpe
2022-09-07 17:10           ` Jason Gunthorpe
2022-09-08 13:28           ` Joerg Roedel
2022-09-08 13:28             ` Joerg Roedel
2022-09-08 13:28             ` Joerg Roedel
2022-09-08 16:14             ` Jason Gunthorpe
2022-09-08 16:14               ` Jason Gunthorpe
2022-09-09  3:17               ` Nicolin Chen
2022-09-09  3:17                 ` Nicolin Chen
2022-09-09  5:00                 ` Tian, Kevin
2022-09-09  5:00                   ` Tian, Kevin
2022-09-09  5:00                   ` Tian, Kevin
2022-09-09 12:07                   ` Jason Gunthorpe
2022-09-09 12:07                     ` Jason Gunthorpe
2022-09-13  2:22                     ` Tian, Kevin
2022-09-13  2:22                       ` Tian, Kevin
2022-09-13  2:22                       ` Tian, Kevin
2022-09-13  5:07                       ` Nicolin Chen
2022-09-13  5:07                         ` Nicolin Chen
2022-09-07 14:23       ` Robin Murphy
2022-09-07 14:23         ` Robin Murphy
2022-09-07 14:23         ` Robin Murphy
2022-09-07 17:00         ` Jason Gunthorpe
2022-09-07 17:00           ` Jason Gunthorpe
2022-09-07 19:41           ` Robin Murphy
2022-09-07 19:41             ` Robin Murphy
2022-09-07 19:41             ` Robin Murphy
2022-09-08  0:43             ` Jason Gunthorpe
2022-09-08  0:43               ` Jason Gunthorpe
2022-09-08  9:30               ` Tian, Kevin
2022-09-08  9:30                 ` Tian, Kevin
2022-09-08  9:30                 ` Tian, Kevin
2022-09-08 12:08                 ` Jason Gunthorpe
2022-09-08 12:08                   ` Jason Gunthorpe
2022-09-10 23:35                   ` Nicolin Chen
2022-09-10 23:35                     ` Nicolin Chen
2022-09-13  2:24                     ` Tian, Kevin
2022-09-13  2:24                       ` Tian, Kevin
2022-09-13  2:24                       ` Tian, Kevin
2022-09-13  8:36                       ` Nicolin Chen
2022-09-13  8:36                         ` Nicolin Chen
2022-09-08  9:54               ` Tian, Kevin
2022-09-08  9:54                 ` Tian, Kevin
2022-09-08  9:54                 ` Tian, Kevin
2022-09-08 10:25               ` Robin Murphy
2022-09-08 10:25                 ` Robin Murphy
2022-09-08 10:25                 ` Robin Murphy
2022-08-15 18:14 ` [PATCH v6 2/5] vfio/iommu_type1: Prefer to reuse domains vs match enforced cache coherency Nicolin Chen
2022-08-15 18:14   ` Nicolin Chen
2022-08-15 18:14 ` [PATCH v6 3/5] vfio/iommu_type1: Remove the domain->ops comparison Nicolin Chen
2022-08-15 18:14   ` Nicolin Chen
2022-08-15 18:14 ` [PATCH v6 4/5] vfio/iommu_type1: Clean up update_dirty_scope in detach_group() Nicolin Chen
2022-08-15 18:14   ` Nicolin Chen
2022-08-15 18:14 ` [PATCH v6 5/5] vfio/iommu_type1: Simplify group attachment Nicolin Chen
2022-08-15 18:14   ` Nicolin Chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.