All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] iommu: The early demise of bus ops
@ 2023-01-19 19:18 Robin Murphy
  2023-01-19 19:18 ` [PATCH 1/8] iommu: Decouple iommu_present() from " Robin Murphy
                   ` (9 more replies)
  0 siblings, 10 replies; 32+ messages in thread
From: Robin Murphy @ 2023-01-19 19:18 UTC (permalink / raw)
  To: joro, will; +Cc: hch, jgg, iommu, linux-kernel, rafael.j.wysocki, gregkh

Hi all,

[ Christoph, Greg, Rafael; feel free to ignore all the IOMMU details,
 just a heads-up for some pretty trivial header motion in patch #6 ]

This is sort of an RFC, in the sense that the patches are functionally
ready but I don't expect that we necessarily want to merge all them
right away; at this point it's more for the sake of visibility and
checking if anyone strongly objects to the direction I'm taking. As such
I've based these patches on 6.2-rc3 and made no effort to integrate them
with the IOMMUFD-related work going on in parallel and/or already
queued, even though there is some functional intersection and almost
certain conflicts. If we reach a consensus that we would like any of
this for 6.3 I'll rebase as appropriate.

Patches #1-6 here work up to what I originally expected to be the
triumphant finale of the whole mission, but as it turns out is actually
feasible and convenient to get out of the way *before* getting into the
really fiddly bits of refactoring the DT/ACPI of_xlate stuff, ARM DMA
ops, and the iommu_domain_alloc() interface. Patch #8 is included here
as the precursor to another cleanup series for drivers that currently
have an awkward domain_finalise step in their .attach_dev op, but I'm
unlikely to start writing those patches for a while yet. Patch #7 is
also here nominally, but in fact I think it could already go anywhere
since the last rework of iommu_device_register().

Thanks,
Robin.


Robin Murphy (8):
  iommu: Decouple iommu_present() from bus ops
  iommu: Validate that devices match domains
  iommu: Factor out a "first device in group" helper
  iommu: Switch __iommu_domain_alloc() to device ops
  iommu/arm-smmu: Don't register fwnode for legacy binding
  iommu: Retire bus ops
  iommu: Clean up open-coded ownership checks
  iommu: Pass device through ops->domain_alloc

 drivers/iommu/amd/iommu.c                   |   3 +-
 drivers/iommu/apple-dart.c                  |   3 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |   6 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |  15 +--
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     |  19 +--
 drivers/iommu/exynos-iommu.c                |   3 +-
 drivers/iommu/fsl_pamu_domain.c             |   3 +-
 drivers/iommu/intel/iommu.c                 |   3 +-
 drivers/iommu/iommu.c                       | 130 +++++++++++++-------
 drivers/iommu/ipmmu-vmsa.c                  |   3 +-
 drivers/iommu/msm_iommu.c                   |   3 +-
 drivers/iommu/mtk_iommu.c                   |  10 +-
 drivers/iommu/mtk_iommu_v1.c                |   6 +-
 drivers/iommu/omap-iommu.c                  |   3 +-
 drivers/iommu/rockchip-iommu.c              |   3 +-
 drivers/iommu/s390-iommu.c                  |   3 +-
 drivers/iommu/sprd-iommu.c                  |  11 +-
 drivers/iommu/sun50i-iommu.c                |   3 +-
 drivers/iommu/tegra-gart.c                  |   3 +-
 drivers/iommu/tegra-smmu.c                  |   3 +-
 drivers/iommu/virtio-iommu.c                |   6 +-
 include/acpi/acpi_bus.h                     |   2 +
 include/linux/device.h                      |   1 -
 include/linux/device/bus.h                  |   5 -
 include/linux/dma-map-ops.h                 |   1 +
 include/linux/iommu.h                       |   4 +-
 26 files changed, 139 insertions(+), 116 deletions(-)

-- 
2.36.1.dirty


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 1/8] iommu: Decouple iommu_present() from bus ops
  2023-01-19 19:18 [PATCH 0/8] iommu: The early demise of bus ops Robin Murphy
@ 2023-01-19 19:18 ` Robin Murphy
  2023-01-26 13:13   ` Baolu Lu
  2023-01-19 19:18 ` [PATCH 2/8] iommu: Validate that devices match domains Robin Murphy
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Robin Murphy @ 2023-01-19 19:18 UTC (permalink / raw)
  To: joro, will; +Cc: hch, jgg, iommu, linux-kernel

Much as I'd like to remove iommu_present(), the final remaining users
are proving stubbornly difficult to clean up, so kick that can down
the road and just rework it to preserve the current behaviour without
depending on bus ops.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/iommu.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b189ed345057..a77d58e1b976 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1871,9 +1871,24 @@ int bus_iommu_probe(struct bus_type *bus)
 	return ret;
 }
 
+static int __iommu_present(struct device *dev, void *unused)
+{
+	return device_iommu_mapped(dev);
+}
+
+/**
+ * iommu_present() - make platform-specific assumptions about an IOMMU
+ * @bus: bus to check
+ *
+ * Do not use this function. You want device_iommu_mapped() instead.
+ *
+ * Return: true if some IOMMU is present for some device on the given bus. In
+ * general it may not be the only IOMMU, and it may not be for the device you
+ * are ultimately interested in.
+ */
 bool iommu_present(struct bus_type *bus)
 {
-	return bus->iommu_ops != NULL;
+	return bus_for_each_dev(bus, NULL, NULL, __iommu_present) > 0;
 }
 EXPORT_SYMBOL_GPL(iommu_present);
 
-- 
2.36.1.dirty


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 2/8] iommu: Validate that devices match domains
  2023-01-19 19:18 [PATCH 0/8] iommu: The early demise of bus ops Robin Murphy
  2023-01-19 19:18 ` [PATCH 1/8] iommu: Decouple iommu_present() from " Robin Murphy
@ 2023-01-19 19:18 ` Robin Murphy
  2023-01-19 19:18 ` [PATCH 3/8] iommu: Factor out a "first device in group" helper Robin Murphy
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: Robin Murphy @ 2023-01-19 19:18 UTC (permalink / raw)
  To: joro, will; +Cc: hch, jgg, iommu, linux-kernel

Before we can allow drivers to coexist, we need to make sure that one
driver's domain ops can't misinterpret another driver's dev_iommu_priv
data. To that end, add a token to the domain so we can remember how it
was allocated - for now this may as well be the device ops. We can trust
ourselves for internal default domain attachment, so add the check where
it covers both the external attach interfaces.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/iommu.c | 13 +++++++++----
 include/linux/iommu.h |  1 +
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index a77d58e1b976..bc53ffbba4de 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1941,20 +1941,22 @@ EXPORT_SYMBOL_GPL(iommu_set_fault_handler);
 static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
 						 unsigned type)
 {
+	const struct iommu_ops *ops = bus ? bus->iommu_ops : NULL;
 	struct iommu_domain *domain;
 
-	if (bus == NULL || bus->iommu_ops == NULL)
+	if (!ops)
 		return NULL;
 
-	domain = bus->iommu_ops->domain_alloc(type);
+	domain = ops->domain_alloc(type);
 	if (!domain)
 		return NULL;
 
 	domain->type = type;
+	domain->owner = ops;
 	/* Assume all sizes by default; the driver may override this later */
-	domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;
+	domain->pgsize_bitmap = ops->pgsize_bitmap;
 	if (!domain->ops)
-		domain->ops = bus->iommu_ops->default_domain_ops;
+		domain->ops = ops->default_domain_ops;
 
 	if (iommu_is_dma_domain(domain) && iommu_get_dma_cookie(domain)) {
 		iommu_domain_free(domain);
@@ -2128,6 +2130,9 @@ static int iommu_group_do_attach_device(struct device *dev, void *data)
 {
 	struct iommu_domain *domain = data;
 
+	if (dev_iommu_ops(dev) != domain->owner)
+		return -EINVAL;
+
 	return __iommu_attach_device(domain, dev);
 }
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d37bf28faf82..35af9d4e3969 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -95,6 +95,7 @@ struct iommu_domain_geometry {
 struct iommu_domain {
 	unsigned type;
 	const struct iommu_domain_ops *ops;
+	const struct iommu_ops *owner; /* Whose domain_alloc we came from */
 	unsigned long pgsize_bitmap;	/* Bitmap of page sizes in use */
 	struct iommu_domain_geometry geometry;
 	struct iommu_dma_cookie *iova_cookie;
-- 
2.36.1.dirty


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 3/8] iommu: Factor out a "first device in group" helper
  2023-01-19 19:18 [PATCH 0/8] iommu: The early demise of bus ops Robin Murphy
  2023-01-19 19:18 ` [PATCH 1/8] iommu: Decouple iommu_present() from " Robin Murphy
  2023-01-19 19:18 ` [PATCH 2/8] iommu: Validate that devices match domains Robin Murphy
@ 2023-01-19 19:18 ` Robin Murphy
  2023-01-19 19:23   ` Jason Gunthorpe
  2023-01-19 19:18 ` [PATCH 4/8] iommu: Switch __iommu_domain_alloc() to device ops Robin Murphy
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Robin Murphy @ 2023-01-19 19:18 UTC (permalink / raw)
  To: joro, will; +Cc: hch, jgg, iommu, linux-kernel

This pattern for picking the first device out of the group list is
repeated a few times now, so it's clearly worth factoring out.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/iommu.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index bc53ffbba4de..5b37766a09e2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1084,6 +1084,11 @@ void iommu_group_remove_device(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(iommu_group_remove_device);
 
+static struct device *iommu_group_first_dev(struct iommu_group *group)
+{
+	return list_first_entry(&group->devices, struct group_device, list)->dev;
+}
+
 static int iommu_group_device_count(struct iommu_group *group)
 {
 	struct group_device *entry;
@@ -2835,7 +2840,6 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
 				       struct device *prev_dev, int type)
 {
 	struct iommu_domain *prev_dom;
-	struct group_device *grp_dev;
 	int ret, dev_def_dom;
 	struct device *dev;
 
@@ -2867,8 +2871,7 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
 	}
 
 	/* Since group has only one device */
-	grp_dev = list_first_entry(&group->devices, struct group_device, list);
-	dev = grp_dev->dev;
+	dev = iommu_group_first_dev(group);
 
 	if (prev_dev != dev) {
 		dev_err_ratelimited(prev_dev, "Cannot change default domain: Device has been changed\n");
@@ -2965,7 +2968,6 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
 static ssize_t iommu_group_store_type(struct iommu_group *group,
 				      const char *buf, size_t count)
 {
-	struct group_device *grp_dev;
 	struct device *dev;
 	int ret, req_type;
 
@@ -3000,8 +3002,7 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
 	}
 
 	/* Since group has only one device */
-	grp_dev = list_first_entry(&group->devices, struct group_device, list);
-	dev = grp_dev->dev;
+	dev = iommu_group_first_dev(group);
 	get_device(dev);
 
 	/*
@@ -3126,21 +3127,18 @@ void iommu_device_unuse_default_domain(struct device *dev)
 
 static int __iommu_group_alloc_blocking_domain(struct iommu_group *group)
 {
-	struct group_device *dev =
-		list_first_entry(&group->devices, struct group_device, list);
+	struct device *dev = iommu_group_first_dev(group);
 
 	if (group->blocking_domain)
 		return 0;
 
-	group->blocking_domain =
-		__iommu_domain_alloc(dev->dev->bus, IOMMU_DOMAIN_BLOCKED);
+	group->blocking_domain = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_BLOCKED);
 	if (!group->blocking_domain) {
 		/*
 		 * For drivers that do not yet understand IOMMU_DOMAIN_BLOCKED
 		 * create an empty domain instead.
 		 */
-		group->blocking_domain = __iommu_domain_alloc(
-			dev->dev->bus, IOMMU_DOMAIN_UNMANAGED);
+		group->blocking_domain = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_UNMANAGED);
 		if (!group->blocking_domain)
 			return -EINVAL;
 	}
-- 
2.36.1.dirty


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 4/8] iommu: Switch __iommu_domain_alloc() to device ops
  2023-01-19 19:18 [PATCH 0/8] iommu: The early demise of bus ops Robin Murphy
                   ` (2 preceding siblings ...)
  2023-01-19 19:18 ` [PATCH 3/8] iommu: Factor out a "first device in group" helper Robin Murphy
@ 2023-01-19 19:18 ` Robin Murphy
  2023-01-19 19:26   ` Jason Gunthorpe
  2023-01-19 19:18 ` [PATCH 5/8] iommu/arm-smmu: Don't register fwnode for legacy binding Robin Murphy
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Robin Murphy @ 2023-01-19 19:18 UTC (permalink / raw)
  To: joro, will; +Cc: hch, jgg, iommu, linux-kernel

In all the places we allocate default domains, we have (or can easily
get hold of) a device from which to resolve the right IOMMU ops; only
the public iommu_domain_alloc() interface actually depends on bus ops.
Reworking the public API is a big enough mission in its own right, but
in the meantime we can still decouple it from bus ops internally to move
forward.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/iommu.c | 57 ++++++++++++++++++++++++++++---------------
 1 file changed, 37 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 5b37766a09e2..1a31d94adff5 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -88,7 +88,7 @@ static int iommu_bus_notifier(struct notifier_block *nb,
 			      unsigned long action, void *data);
 static int iommu_alloc_default_domain(struct iommu_group *group,
 				      struct device *dev);
-static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
+static struct iommu_domain *__iommu_domain_alloc(struct device *dev,
 						 unsigned type);
 static int __iommu_attach_device(struct iommu_domain *domain,
 				 struct device *dev);
@@ -1620,15 +1620,15 @@ static int iommu_get_def_domain_type(struct device *dev)
 	return 0;
 }
 
-static int iommu_group_alloc_default_domain(struct bus_type *bus,
-					    struct iommu_group *group,
+static int iommu_group_alloc_default_domain(struct iommu_group *group,
+					    struct device *dev,
 					    unsigned int type)
 {
 	struct iommu_domain *dom;
 
-	dom = __iommu_domain_alloc(bus, type);
+	dom = __iommu_domain_alloc(dev, type);
 	if (!dom && type != IOMMU_DOMAIN_DMA) {
-		dom = __iommu_domain_alloc(bus, IOMMU_DOMAIN_DMA);
+		dom = __iommu_domain_alloc(dev, IOMMU_DOMAIN_DMA);
 		if (dom)
 			pr_warn("Failed to allocate default IOMMU domain of type %u for group %s - Falling back to IOMMU_DOMAIN_DMA",
 				type, group->name);
@@ -1653,7 +1653,7 @@ static int iommu_alloc_default_domain(struct iommu_group *group,
 
 	type = iommu_get_def_domain_type(dev) ? : iommu_def_domain_type;
 
-	return iommu_group_alloc_default_domain(dev->bus, group, type);
+	return iommu_group_alloc_default_domain(group, dev, type);
 }
 
 /**
@@ -1766,8 +1766,7 @@ static int probe_get_default_domain_type(struct device *dev, void *data)
 	return 0;
 }
 
-static void probe_alloc_default_domain(struct bus_type *bus,
-				       struct iommu_group *group)
+static void probe_alloc_default_domain(struct iommu_group *group)
 {
 	struct __group_domain_type gtype;
 
@@ -1777,10 +1776,12 @@ static void probe_alloc_default_domain(struct bus_type *bus,
 	__iommu_group_for_each_dev(group, &gtype,
 				   probe_get_default_domain_type);
 
-	if (!gtype.type)
+	if (!gtype.type) {
 		gtype.type = iommu_def_domain_type;
+		gtype.dev = iommu_group_first_dev(group);
+	}
 
-	iommu_group_alloc_default_domain(bus, group, gtype.type);
+	iommu_group_alloc_default_domain(group, gtype.dev, gtype.type);
 
 }
 
@@ -1854,7 +1855,7 @@ int bus_iommu_probe(struct bus_type *bus)
 		list_del_init(&group->entry);
 
 		/* Try to allocate default domain */
-		probe_alloc_default_domain(bus, group);
+		probe_alloc_default_domain(group);
 
 		if (!group->default_domain) {
 			mutex_unlock(&group->mutex);
@@ -1943,15 +1944,12 @@ void iommu_set_fault_handler(struct iommu_domain *domain,
 }
 EXPORT_SYMBOL_GPL(iommu_set_fault_handler);
 
-static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
+static struct iommu_domain *__iommu_domain_alloc(struct device *dev,
 						 unsigned type)
 {
-	const struct iommu_ops *ops = bus ? bus->iommu_ops : NULL;
+	const struct iommu_ops *ops = dev_iommu_ops(dev);
 	struct iommu_domain *domain;
 
-	if (!ops)
-		return NULL;
-
 	domain = ops->domain_alloc(type);
 	if (!domain)
 		return NULL;
@@ -1970,9 +1968,28 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
 	return domain;
 }
 
+static int __iommu_domain_alloc_dev(struct device *dev, void *data)
+{
+	struct device **alloc_dev = data;
+
+	if (!device_iommu_mapped(dev))
+		return 0;
+
+	WARN_ONCE(*alloc_dev && dev_iommu_ops(dev) != dev_iommu_ops(*alloc_dev),
+		"Multiple IOMMU drivers present, which the public IOMMU API can't fully support yet. This may not work as expected, sorry!\n");
+
+	*alloc_dev = dev;
+	return 0;
+}
+
 struct iommu_domain *iommu_domain_alloc(struct bus_type *bus)
 {
-	return __iommu_domain_alloc(bus, IOMMU_DOMAIN_UNMANAGED);
+	struct device *dev = NULL;
+
+	if (bus_for_each_dev(bus, NULL, &dev, __iommu_domain_alloc_dev))
+		return NULL;
+
+	return __iommu_domain_alloc(dev, IOMMU_DOMAIN_UNMANAGED);
 }
 EXPORT_SYMBOL_GPL(iommu_domain_alloc);
 
@@ -2918,7 +2935,7 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
 	}
 
 	/* Sets group->default_domain to the newly allocated domain */
-	ret = iommu_group_alloc_default_domain(dev->bus, group, type);
+	ret = iommu_group_alloc_default_domain(group, dev, type);
 	if (ret)
 		goto out;
 
@@ -3132,13 +3149,13 @@ static int __iommu_group_alloc_blocking_domain(struct iommu_group *group)
 	if (group->blocking_domain)
 		return 0;
 
-	group->blocking_domain = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_BLOCKED);
+	group->blocking_domain = __iommu_domain_alloc(dev, IOMMU_DOMAIN_BLOCKED);
 	if (!group->blocking_domain) {
 		/*
 		 * For drivers that do not yet understand IOMMU_DOMAIN_BLOCKED
 		 * create an empty domain instead.
 		 */
-		group->blocking_domain = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_UNMANAGED);
+		group->blocking_domain = __iommu_domain_alloc(dev, IOMMU_DOMAIN_UNMANAGED);
 		if (!group->blocking_domain)
 			return -EINVAL;
 	}
-- 
2.36.1.dirty


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 5/8] iommu/arm-smmu: Don't register fwnode for legacy binding
  2023-01-19 19:18 [PATCH 0/8] iommu: The early demise of bus ops Robin Murphy
                   ` (3 preceding siblings ...)
  2023-01-19 19:18 ` [PATCH 4/8] iommu: Switch __iommu_domain_alloc() to device ops Robin Murphy
@ 2023-01-19 19:18 ` Robin Murphy
  2023-01-19 19:18 ` [PATCH 6/8] iommu: Retire bus ops Robin Murphy
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: Robin Murphy @ 2023-01-19 19:18 UTC (permalink / raw)
  To: joro, will; +Cc: hch, jgg, iommu, linux-kernel

When using the legacy binding we bypass the of_xlate mechanism, so avoid
registering the instance fwnodes which act as keys for that. This will
help __iommu_probe_device() to retrieve the registered ops the same way
as for x86 etc. when no fwspec has previously been set up by of_xlate.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 719fbca1fe52..607f06af01b6 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -2156,7 +2156,8 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 		return err;
 	}
 
-	err = iommu_device_register(&smmu->iommu, &arm_smmu_ops, dev);
+	err = iommu_device_register(&smmu->iommu, &arm_smmu_ops,
+				    using_legacy_binding ? NULL : dev);
 	if (err) {
 		dev_err(dev, "Failed to register iommu\n");
 		iommu_device_sysfs_remove(&smmu->iommu);
-- 
2.36.1.dirty


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 6/8] iommu: Retire bus ops
  2023-01-19 19:18 [PATCH 0/8] iommu: The early demise of bus ops Robin Murphy
                   ` (4 preceding siblings ...)
  2023-01-19 19:18 ` [PATCH 5/8] iommu/arm-smmu: Don't register fwnode for legacy binding Robin Murphy
@ 2023-01-19 19:18 ` Robin Murphy
  2023-01-20  0:27   ` Baolu Lu
  2023-01-20 10:23   ` Greg Kroah-Hartman
  2023-01-19 19:18 ` [PATCH 7/8] iommu: Clean up open-coded ownership checks Robin Murphy
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 32+ messages in thread
From: Robin Murphy @ 2023-01-19 19:18 UTC (permalink / raw)
  To: joro, will
  Cc: hch, jgg, iommu, linux-kernel, Rafael J . Wysocki, Greg Kroah-Hartman

With the rest of the API internals converted, it's time to finally
tackle probe_device and how we bootstrap the per-device ops association
to begin with. This ends up being disappointingly straightforward, since
fwspec users are already doing it in order to find their of_xlate
callback, and it works out that we can easily do the equivalent for
other drivers too. Then shuffle the remaining awareness of iommu_ops
into the couple of core headers that still need it, and breathe a sigh
of relief...

Ding dong the bus ops are gone!

CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: Christoph Hellwig <hch@lst.de>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/iommu.c       | 27 ++++++++++++++++-----------
 include/acpi/acpi_bus.h     |  2 ++
 include/linux/device.h      |  1 -
 include/linux/device/bus.h  |  5 -----
 include/linux/dma-map-ops.h |  1 +
 5 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1a31d94adff5..8997b8f2e79f 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -218,13 +218,6 @@ int iommu_device_register(struct iommu_device *iommu,
 	/* We need to be able to take module references appropriately */
 	if (WARN_ON(is_module_address((unsigned long)ops) && !ops->owner))
 		return -EINVAL;
-	/*
-	 * Temporarily enforce global restriction to a single driver. This was
-	 * already the de-facto behaviour, since any possible combination of
-	 * existing drivers would compete for at least the PCI or platform bus.
-	 */
-	if (iommu_buses[0]->iommu_ops && iommu_buses[0]->iommu_ops != ops)
-		return -EBUSY;
 
 	iommu->ops = ops;
 	if (hwdev)
@@ -234,10 +227,8 @@ int iommu_device_register(struct iommu_device *iommu,
 	list_add_tail(&iommu->list, &iommu_device_list);
 	spin_unlock(&iommu_device_lock);
 
-	for (int i = 0; i < ARRAY_SIZE(iommu_buses) && !err; i++) {
-		iommu_buses[i]->iommu_ops = ops;
+	for (int i = 0; i < ARRAY_SIZE(iommu_buses) && !err; i++)
 		err = bus_iommu_probe(iommu_buses[i]);
-	}
 	if (err)
 		iommu_device_unregister(iommu);
 	return err;
@@ -303,12 +294,26 @@ static u32 dev_iommu_get_max_pasids(struct device *dev)
 
 static int __iommu_probe_device(struct device *dev, struct list_head *group_list)
 {
-	const struct iommu_ops *ops = dev->bus->iommu_ops;
+	const struct iommu_ops *ops;
 	struct iommu_device *iommu_dev;
+	struct iommu_fwspec *fwspec;
 	struct iommu_group *group;
 	static DEFINE_MUTEX(iommu_probe_device_lock);
 	int ret;
 
+	/*
+	 * For FDT-based systems and ACPI IORT/VIOT, drivers register IOMMU
+	 * instances with non-NULL fwnodes, and client devices should have been
+	 * identified with a fwspec by this point. For Intel/AMD/s390/PAMU we
+	 * can assume a single active driver with global ops, and so grab those
+	 * from any registered instance, cheekily co-opting the same mechanism.
+	 */
+	fwspec = dev_iommu_fwspec_get(dev);
+	if (fwspec && fwspec->ops)
+		ops = fwspec->ops;
+	else
+		ops = iommu_ops_from_fwnode(NULL);
+
 	if (!ops)
 		return -ENODEV;
 	/*
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index cd3b75e08ec3..067dde9291c9 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -614,6 +614,8 @@ struct acpi_pci_root {
 
 /* helper */
 
+struct iommu_ops;
+
 bool acpi_dma_supported(const struct acpi_device *adev);
 enum dev_dma_attr acpi_get_dma_attr(struct acpi_device *adev);
 int acpi_iommu_fwspec_init(struct device *dev, u32 id,
diff --git a/include/linux/device.h b/include/linux/device.h
index 44e3acae7b36..f7a7ecafedd3 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -41,7 +41,6 @@ struct class;
 struct subsys_private;
 struct device_node;
 struct fwnode_handle;
-struct iommu_ops;
 struct iommu_group;
 struct dev_pin_info;
 struct dev_iommu;
diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
index d8b29ccd07e5..4ece3470112f 100644
--- a/include/linux/device/bus.h
+++ b/include/linux/device/bus.h
@@ -63,9 +63,6 @@ struct fwnode_handle;
  *			this bus.
  * @pm:		Power management operations of this bus, callback the specific
  *		device driver's pm-ops.
- * @iommu_ops:  IOMMU specific operations for this bus, used to attach IOMMU
- *              driver implementations to a bus and allow the driver to do
- *              bus-specific setup
  * @p:		The private data of the driver core, only the driver core can
  *		touch this.
  * @lock_key:	Lock class key for use by the lock validator
@@ -109,8 +106,6 @@ struct bus_type {
 
 	const struct dev_pm_ops *pm;
 
-	const struct iommu_ops *iommu_ops;
-
 	struct subsys_private *p;
 	struct lock_class_key lock_key;
 
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index d678afeb8a13..e8ebf0bf611b 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -10,6 +10,7 @@
 #include <linux/pgtable.h>
 
 struct cma;
+struct iommu_ops;
 
 /*
  * Values for struct dma_map_ops.flags:
-- 
2.36.1.dirty


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 7/8] iommu: Clean up open-coded ownership checks
  2023-01-19 19:18 [PATCH 0/8] iommu: The early demise of bus ops Robin Murphy
                   ` (5 preceding siblings ...)
  2023-01-19 19:18 ` [PATCH 6/8] iommu: Retire bus ops Robin Murphy
@ 2023-01-19 19:18 ` Robin Murphy
  2023-01-19 19:31   ` Jason Gunthorpe
  2023-01-19 19:18 ` [PATCH 8/8] iommu: Pass device through ops->domain_alloc Robin Murphy
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Robin Murphy @ 2023-01-19 19:18 UTC (permalink / raw)
  To: joro, will; +Cc: hch, jgg, iommu, linux-kernel

Some drivers already implement their own defence against the possibility
of being given someone else's device. Since this is now taken care of by
the core code (and via a slightly different path from the original
fwspec-based idea), let's clean them up.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  3 ---
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |  9 +--------
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     | 16 +++-------------
 drivers/iommu/mtk_iommu.c                   |  7 +------
 drivers/iommu/mtk_iommu_v1.c                |  3 ---
 drivers/iommu/sprd-iommu.c                  |  8 +-------
 drivers/iommu/virtio-iommu.c                |  3 ---
 7 files changed, 6 insertions(+), 43 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index ab160198edd6..cb05d9771192 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2642,9 +2642,6 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 	struct arm_smmu_master *master;
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 
-	if (!fwspec || fwspec->ops != &arm_smmu_ops)
-		return ERR_PTR(-ENODEV);
-
 	if (WARN_ON_ONCE(dev_iommu_priv_get(dev)))
 		return ERR_PTR(-EBUSY);
 
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 607f06af01b6..235550db0d59 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1118,11 +1118,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	struct arm_smmu_device *smmu;
 	int ret;
 
-	if (!fwspec || fwspec->ops != &arm_smmu_ops) {
-		dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n");
-		return -ENXIO;
-	}
-
 	/*
 	 * FIXME: The arch/arm DMA API code tries to attach devices to its own
 	 * domains between of_xlate() and probe_device() - we have no way to cope
@@ -1352,10 +1347,8 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 		fwspec = dev_iommu_fwspec_get(dev);
 		if (ret)
 			goto out_free;
-	} else if (fwspec && fwspec->ops == &arm_smmu_ops) {
-		smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode);
 	} else {
-		return ERR_PTR(-ENODEV);
+		smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode);
 	}
 
 	ret = -EINVAL;
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 270c3d9128ba..b2d3d309be1e 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -79,16 +79,6 @@ static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
 
 static const struct iommu_ops qcom_iommu_ops;
 
-static struct qcom_iommu_dev * to_iommu(struct device *dev)
-{
-	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-
-	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
-		return NULL;
-
-	return dev_iommu_priv_get(dev);
-}
-
 static struct qcom_iommu_ctx * to_ctx(struct qcom_iommu_domain *d, unsigned asid)
 {
 	struct qcom_iommu_dev *qcom_iommu = d->iommu;
@@ -361,7 +351,7 @@ static void qcom_iommu_domain_free(struct iommu_domain *domain)
 
 static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
-	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev);
+	struct qcom_iommu_dev *qcom_iommu = dev_iommu_priv_get(dev);
 	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
 	int ret;
 
@@ -391,7 +381,7 @@ static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *de
 {
 	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev);
+	struct qcom_iommu_dev *qcom_iommu = dev_iommu_priv_get(dev);
 	unsigned i;
 
 	if (WARN_ON(!qcom_domain->iommu))
@@ -508,7 +498,7 @@ static bool qcom_iommu_capable(struct device *dev, enum iommu_cap cap)
 
 static struct iommu_device *qcom_iommu_probe_device(struct device *dev)
 {
-	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev);
+	struct qcom_iommu_dev *qcom_iommu = dev_iommu_priv_get(dev);
 	struct device_link *link;
 
 	if (!qcom_iommu)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 2badd6acfb23..005136a4cc36 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -784,16 +784,11 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
 static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
 {
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-	struct mtk_iommu_data *data;
+	struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
 	struct device_link *link;
 	struct device *larbdev;
 	unsigned int larbid, larbidx, i;
 
-	if (!fwspec || fwspec->ops != &mtk_iommu_ops)
-		return ERR_PTR(-ENODEV); /* Not a iommu client device */
-
-	data = dev_iommu_priv_get(dev);
-
 	if (!MTK_IOMMU_IS_TYPE(data->plat_data, MTK_IOMMU_TYPE_MM))
 		return &data->iommu;
 
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 69682ee068d2..dff8ea0af884 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -478,9 +478,6 @@ static struct iommu_device *mtk_iommu_v1_probe_device(struct device *dev)
 		idx++;
 	}
 
-	if (!fwspec || fwspec->ops != &mtk_iommu_v1_ops)
-		return ERR_PTR(-ENODEV); /* Not a iommu client device */
-
 	data = dev_iommu_priv_get(dev);
 
 	/* Link the consumer device with the smi-larb device(supplier) */
diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
index 219bfa11f7f4..4cebccb6fc8b 100644
--- a/drivers/iommu/sprd-iommu.c
+++ b/drivers/iommu/sprd-iommu.c
@@ -373,13 +373,7 @@ static phys_addr_t sprd_iommu_iova_to_phys(struct iommu_domain *domain,
 
 static struct iommu_device *sprd_iommu_probe_device(struct device *dev)
 {
-	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-	struct sprd_iommu_device *sdev;
-
-	if (!fwspec || fwspec->ops != &sprd_iommu_ops)
-		return ERR_PTR(-ENODEV);
-
-	sdev = dev_iommu_priv_get(dev);
+	struct sprd_iommu_device *sdev = dev_iommu_priv_get(dev);
 
 	return &sdev->iommu;
 }
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 5b8fe9bfa9a5..59f1abd6ee53 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -945,9 +945,6 @@ static struct iommu_device *viommu_probe_device(struct device *dev)
 	struct viommu_dev *viommu = NULL;
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 
-	if (!fwspec || fwspec->ops != &viommu_ops)
-		return ERR_PTR(-ENODEV);
-
 	viommu = viommu_get_by_fwnode(fwspec->iommu_fwnode);
 	if (!viommu)
 		return ERR_PTR(-ENODEV);
-- 
2.36.1.dirty


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 8/8] iommu: Pass device through ops->domain_alloc
  2023-01-19 19:18 [PATCH 0/8] iommu: The early demise of bus ops Robin Murphy
                   ` (6 preceding siblings ...)
  2023-01-19 19:18 ` [PATCH 7/8] iommu: Clean up open-coded ownership checks Robin Murphy
@ 2023-01-19 19:18 ` Robin Murphy
  2023-01-19 19:34 ` [PATCH 0/8] iommu: The early demise of bus ops Jason Gunthorpe
  2023-01-20 12:33 ` Joerg Roedel
  9 siblings, 0 replies; 32+ messages in thread
From: Robin Murphy @ 2023-01-19 19:18 UTC (permalink / raw)
  To: joro, will; +Cc: hch, jgg, iommu, linux-kernel

With __iommu_domain_alloc() now aware of a device, we can propagate that
through to the drivers so that they can also pick up their instance data
without having to wait until they see the device attach.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/amd/iommu.c                   | 3 ++-
 drivers/iommu/apple-dart.c                  | 3 ++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 ++-
 drivers/iommu/arm/arm-smmu/arm-smmu.c       | 3 ++-
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     | 3 ++-
 drivers/iommu/exynos-iommu.c                | 3 ++-
 drivers/iommu/fsl_pamu_domain.c             | 3 ++-
 drivers/iommu/intel/iommu.c                 | 3 ++-
 drivers/iommu/iommu.c                       | 4 ++--
 drivers/iommu/ipmmu-vmsa.c                  | 3 ++-
 drivers/iommu/msm_iommu.c                   | 3 ++-
 drivers/iommu/mtk_iommu.c                   | 3 ++-
 drivers/iommu/mtk_iommu_v1.c                | 3 ++-
 drivers/iommu/omap-iommu.c                  | 3 ++-
 drivers/iommu/rockchip-iommu.c              | 3 ++-
 drivers/iommu/s390-iommu.c                  | 3 ++-
 drivers/iommu/sprd-iommu.c                  | 3 ++-
 drivers/iommu/sun50i-iommu.c                | 3 ++-
 drivers/iommu/tegra-gart.c                  | 3 ++-
 drivers/iommu/tegra-smmu.c                  | 3 ++-
 drivers/iommu/virtio-iommu.c                | 3 ++-
 include/linux/iommu.h                       | 3 ++-
 22 files changed, 44 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index cbeaab55c0db..f5bc61a4c3d4 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2084,7 +2084,8 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
 	return NULL;
 }
 
-static struct iommu_domain *amd_iommu_domain_alloc(unsigned type)
+static struct iommu_domain *amd_iommu_domain_alloc(struct device *dev,
+						   unsigned type)
 {
 	struct protection_domain *domain;
 
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index 4f4a323be0d0..95f3aa323a99 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -576,7 +576,8 @@ static void apple_dart_release_device(struct device *dev)
 	kfree(cfg);
 }
 
-static struct iommu_domain *apple_dart_domain_alloc(unsigned int type)
+static struct iommu_domain *apple_dart_domain_alloc(struct device *dev,
+						    unsigned int type)
 {
 	struct apple_dart_domain *dart_domain;
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index cb05d9771192..453e9dbf4920 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2005,7 +2005,8 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
 	}
 }
 
-static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
+static struct iommu_domain *arm_smmu_domain_alloc(struct device *dev,
+						  unsigned type)
 {
 	struct arm_smmu_domain *smmu_domain;
 
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 235550db0d59..86c9c6df18ae 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -851,7 +851,8 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 	arm_smmu_rpm_put(smmu);
 }
 
-static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
+static struct iommu_domain *arm_smmu_domain_alloc(struct device *dev,
+						  unsigned type)
 {
 	struct arm_smmu_domain *smmu_domain;
 
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index b2d3d309be1e..7b9daacf1697 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -309,7 +309,8 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
 	return ret;
 }
 
-static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
+static struct iommu_domain *qcom_iommu_domain_alloc(struct device *dev,
+						    unsigned type)
 {
 	struct qcom_iommu_domain *qcom_domain;
 
diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index b0cde2211987..54f9d10a9fb1 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -816,7 +816,8 @@ static inline void exynos_iommu_set_pte(sysmmu_pte_t *ent, sysmmu_pte_t val)
 				   DMA_TO_DEVICE);
 }
 
-static struct iommu_domain *exynos_iommu_domain_alloc(unsigned type)
+static struct iommu_domain *exynos_iommu_domain_alloc(struct device *dev,
+						      unsigned type)
 {
 	struct exynos_iommu_domain *domain;
 	dma_addr_t handle;
diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
index 4408ac3c49b6..69578e574af0 100644
--- a/drivers/iommu/fsl_pamu_domain.c
+++ b/drivers/iommu/fsl_pamu_domain.c
@@ -192,7 +192,8 @@ static void fsl_pamu_domain_free(struct iommu_domain *domain)
 	kmem_cache_free(fsl_pamu_domain_cache, dma_domain);
 }
 
-static struct iommu_domain *fsl_pamu_domain_alloc(unsigned type)
+static struct iommu_domain *fsl_pamu_domain_alloc(struct device *dev,
+						  unsigned type)
 {
 	struct fsl_dma_domain *dma_domain;
 
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 59df7e42fd53..590594a0d6e1 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4157,7 +4157,8 @@ static struct iommu_domain blocking_domain = {
 	}
 };
 
-static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
+static struct iommu_domain *intel_iommu_domain_alloc(struct device *dev,
+						     unsigned type)
 {
 	struct dmar_domain *dmar_domain;
 	struct iommu_domain *domain;
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8997b8f2e79f..521f6f780294 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1955,7 +1955,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct device *dev,
 	const struct iommu_ops *ops = dev_iommu_ops(dev);
 	struct iommu_domain *domain;
 
-	domain = ops->domain_alloc(type);
+	domain = ops->domain_alloc(dev, type);
 	if (!domain)
 		return NULL;
 
@@ -3456,7 +3456,7 @@ struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
 	const struct iommu_ops *ops = dev_iommu_ops(dev);
 	struct iommu_domain *domain;
 
-	domain = ops->domain_alloc(IOMMU_DOMAIN_SVA);
+	domain = ops->domain_alloc(dev, IOMMU_DOMAIN_SVA);
 	if (!domain)
 		return NULL;
 
diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 76e51d8e3732..7bb6c2fd3214 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -567,7 +567,8 @@ static irqreturn_t ipmmu_irq(int irq, void *dev)
  * IOMMU Operations
  */
 
-static struct iommu_domain *ipmmu_domain_alloc(unsigned type)
+static struct iommu_domain *ipmmu_domain_alloc(struct device *dev,
+					       unsigned type)
 {
 	struct ipmmu_vmsa_domain *domain;
 
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index c60624910872..1d9704e7c75f 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -302,7 +302,8 @@ static void __program_context(void __iomem *base, int ctx,
 	SET_M(base, ctx, 1);
 }
 
-static struct iommu_domain *msm_iommu_domain_alloc(unsigned type)
+static struct iommu_domain *msm_iommu_domain_alloc(struct device *dev,
+						   unsigned type)
 {
 	struct msm_priv *priv;
 
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 005136a4cc36..711716c8dcdf 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -632,7 +632,8 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom,
 	return 0;
 }
 
-static struct iommu_domain *mtk_iommu_domain_alloc(unsigned type)
+static struct iommu_domain *mtk_iommu_domain_alloc(struct device *dev,
+						   unsigned type)
 {
 	struct mtk_iommu_domain *dom;
 
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index dff8ea0af884..3d1e0635f9b3 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -270,7 +270,8 @@ static int mtk_iommu_v1_domain_finalise(struct mtk_iommu_v1_data *data)
 	return 0;
 }
 
-static struct iommu_domain *mtk_iommu_v1_domain_alloc(unsigned type)
+static struct iommu_domain *mtk_iommu_v1_domain_alloc(struct device *dev,
+						      unsigned type)
 {
 	struct mtk_iommu_v1_domain *dom;
 
diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index 2fd7702c6709..19a22ba1860d 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1566,7 +1566,8 @@ static void omap_iommu_detach_dev(struct iommu_domain *domain,
 	spin_unlock(&omap_domain->lock);
 }
 
-static struct iommu_domain *omap_iommu_domain_alloc(unsigned type)
+static struct iommu_domain *omap_iommu_domain_alloc(struct device *dev,
+						    unsigned type)
 {
 	struct omap_iommu_domain *omap_domain;
 
diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index a68eadd64f38..50b17a096ea5 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -1057,7 +1057,8 @@ static int rk_iommu_attach_device(struct iommu_domain *domain,
 	return ret;
 }
 
-static struct iommu_domain *rk_iommu_domain_alloc(unsigned type)
+static struct iommu_domain *rk_iommu_domain_alloc(struct device *dev,
+						  unsigned type)
 {
 	struct rk_iommu_domain *rk_domain;
 
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index ed33c6cce083..a56e4e282469 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -41,7 +41,8 @@ static bool s390_iommu_capable(struct device *dev, enum iommu_cap cap)
 	}
 }
 
-static struct iommu_domain *s390_domain_alloc(unsigned domain_type)
+static struct iommu_domain *s390_domain_alloc(struct device *dev,
+					      unsigned domain_type)
 {
 	struct s390_domain *s390_domain;
 
diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
index 4cebccb6fc8b..32992cb07c4f 100644
--- a/drivers/iommu/sprd-iommu.c
+++ b/drivers/iommu/sprd-iommu.c
@@ -132,7 +132,8 @@ sprd_iommu_pgt_size(struct iommu_domain *domain)
 		SPRD_IOMMU_PAGE_SHIFT) * sizeof(u32);
 }
 
-static struct iommu_domain *sprd_iommu_domain_alloc(unsigned int domain_type)
+static struct iommu_domain *sprd_iommu_domain_alloc(struct device *dev,
+						    unsigned int domain_type)
 {
 	struct sprd_iommu_domain *dom;
 
diff --git a/drivers/iommu/sun50i-iommu.c b/drivers/iommu/sun50i-iommu.c
index 5b585eace3d4..10c7eb9c5f11 100644
--- a/drivers/iommu/sun50i-iommu.c
+++ b/drivers/iommu/sun50i-iommu.c
@@ -667,7 +667,8 @@ static phys_addr_t sun50i_iommu_iova_to_phys(struct iommu_domain *domain,
 		sun50i_iova_get_page_offset(iova);
 }
 
-static struct iommu_domain *sun50i_iommu_domain_alloc(unsigned type)
+static struct iommu_domain *sun50i_iommu_domain_alloc(struct device *dev,
+						      unsigned type)
 {
 	struct sun50i_iommu_domain *sun50i_domain;
 
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index ed53279d1106..75bcf191b091 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -141,7 +141,8 @@ static void gart_iommu_detach_dev(struct iommu_domain *domain,
 	spin_unlock(&gart->dom_lock);
 }
 
-static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
+static struct iommu_domain *gart_iommu_domain_alloc(struct device *dev,
+						    unsigned type)
 {
 	struct iommu_domain *domain;
 
diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 5b1af40221ec..c7c34d9f60ec 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -272,7 +272,8 @@ static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
 	clear_bit(id, smmu->asids);
 }
 
-static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type)
+static struct iommu_domain *tegra_smmu_domain_alloc(struct device *dev,
+						    unsigned type)
 {
 	struct tegra_smmu_as *as;
 
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 59f1abd6ee53..7f397767a8e2 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -637,7 +637,8 @@ static void viommu_event_handler(struct virtqueue *vq)
 
 /* IOMMU API */
 
-static struct iommu_domain *viommu_domain_alloc(unsigned type)
+static struct iommu_domain *viommu_domain_alloc(struct device *dev,
+						unsigned type)
 {
 	struct viommu_domain *vdomain;
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 35af9d4e3969..3a8930d36046 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -252,7 +252,8 @@ struct iommu_ops {
 	bool (*capable)(struct device *dev, enum iommu_cap);
 
 	/* Domain allocation and freeing by the iommu driver */
-	struct iommu_domain *(*domain_alloc)(unsigned iommu_domain_type);
+	struct iommu_domain *(*domain_alloc)(struct device *dev,
+					     unsigned iommu_domain_type);
 
 	struct iommu_device *(*probe_device)(struct device *dev);
 	void (*release_device)(struct device *dev);
-- 
2.36.1.dirty


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/8] iommu: Factor out a "first device in group" helper
  2023-01-19 19:18 ` [PATCH 3/8] iommu: Factor out a "first device in group" helper Robin Murphy
@ 2023-01-19 19:23   ` Jason Gunthorpe
  2023-01-19 19:36     ` Robin Murphy
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Gunthorpe @ 2023-01-19 19:23 UTC (permalink / raw)
  To: Robin Murphy; +Cc: joro, will, hch, iommu, linux-kernel

On Thu, Jan 19, 2023 at 07:18:21PM +0000, Robin Murphy wrote:
> This pattern for picking the first device out of the group list is
> repeated a few times now, so it's clearly worth factoring out.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/iommu.c | 22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index bc53ffbba4de..5b37766a09e2 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1084,6 +1084,11 @@ void iommu_group_remove_device(struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(iommu_group_remove_device);
>  
> +static struct device *iommu_group_first_dev(struct iommu_group *group)
> +{

Add a
 lockdep_assert_held(&group->lock);

?

> -	group->blocking_domain =
> -		__iommu_domain_alloc(dev->dev->bus, IOMMU_DOMAIN_BLOCKED);
> +	group->blocking_domain = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_BLOCKED);
>  	if (!group->blocking_domain) {
>  		/*
>  		 * For drivers that do not yet understand IOMMU_DOMAIN_BLOCKED
>  		 * create an empty domain instead.
>  		 */
> -		group->blocking_domain = __iommu_domain_alloc(
> -			dev->dev->bus, IOMMU_DOMAIN_UNMANAGED);
> +		group->blocking_domain = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_UNMANAGED);
>  		if (!group->blocking_domain)
>  			return -EINVAL;

These are extra hunks?

Jason

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/8] iommu: Switch __iommu_domain_alloc() to device ops
  2023-01-19 19:18 ` [PATCH 4/8] iommu: Switch __iommu_domain_alloc() to device ops Robin Murphy
@ 2023-01-19 19:26   ` Jason Gunthorpe
  2023-01-19 20:12     ` Robin Murphy
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Gunthorpe @ 2023-01-19 19:26 UTC (permalink / raw)
  To: Robin Murphy; +Cc: joro, will, hch, iommu, linux-kernel

On Thu, Jan 19, 2023 at 07:18:22PM +0000, Robin Murphy wrote:

> -static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
> +static struct iommu_domain *__iommu_domain_alloc(struct device *dev,
>  						 unsigned type)
>  {
> -	const struct iommu_ops *ops = bus ? bus->iommu_ops : NULL;
> +	const struct iommu_ops *ops = dev_iommu_ops(dev);
>  	struct iommu_domain *domain;
>  
> -	if (!ops)
> -		return NULL;
> -
>  	domain = ops->domain_alloc(type);
>  	if (!domain)
>  		return NULL;
> @@ -1970,9 +1968,28 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>  	return domain;
>  }
>  
> +static int __iommu_domain_alloc_dev(struct device *dev, void *data)
> +{
> +	struct device **alloc_dev = data;
> +
> +	if (!device_iommu_mapped(dev))
> +		return 0;

Is 0 the right thing? see below

> +
> +	WARN_ONCE(*alloc_dev && dev_iommu_ops(dev) != dev_iommu_ops(*alloc_dev),
> +		"Multiple IOMMU drivers present, which the public IOMMU API can't fully support yet. This may not work as expected, sorry!\n");

if (WARN_ONCE(..))
   return -EINVAL

So that iommu_domain_alloc fails?

> +	*alloc_dev = dev;
> +	return 0;
> +}
> +
>  struct iommu_domain *iommu_domain_alloc(struct bus_type *bus)
>  {
> -	return __iommu_domain_alloc(bus, IOMMU_DOMAIN_UNMANAGED);
> +	struct device *dev = NULL;
> +
> +	if (bus_for_each_dev(bus, NULL, &dev, __iommu_domain_alloc_dev))
> +		return NULL;

eg shouldn't iommu_domain_alloc() return NULL if any devices are
!device_iommu_mapped ?

Jason

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 7/8] iommu: Clean up open-coded ownership checks
  2023-01-19 19:18 ` [PATCH 7/8] iommu: Clean up open-coded ownership checks Robin Murphy
@ 2023-01-19 19:31   ` Jason Gunthorpe
  2023-01-19 20:52     ` Robin Murphy
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Gunthorpe @ 2023-01-19 19:31 UTC (permalink / raw)
  To: Robin Murphy; +Cc: joro, will, hch, iommu, linux-kernel

On Thu, Jan 19, 2023 at 07:18:25PM +0000, Robin Murphy wrote:
> diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
> index 270c3d9128ba..b2d3d309be1e 100644
> --- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
> +++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
> @@ -79,16 +79,6 @@ static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
>  
>  static const struct iommu_ops qcom_iommu_ops;
>  
> -static struct qcom_iommu_dev * to_iommu(struct device *dev)
> -{
> -	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> -
> -	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
> -		return NULL;
> -
> -	return dev_iommu_priv_get(dev);
> -}
> -
>  static struct qcom_iommu_ctx * to_ctx(struct qcom_iommu_domain *d, unsigned asid)
>  {
>  	struct qcom_iommu_dev *qcom_iommu = d->iommu;
> @@ -361,7 +351,7 @@ static void qcom_iommu_domain_free(struct iommu_domain *domain)
>  
>  static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  {
> -	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev);
> +	struct qcom_iommu_dev *qcom_iommu = dev_iommu_priv_get(dev);
>  	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>  	int ret;

The confusing checks for null qcom_iommu following this should go away
too, right?

It should not be possible to have an ops set on the device but have an
invalid priv..

Jason

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 0/8] iommu: The early demise of bus ops
  2023-01-19 19:18 [PATCH 0/8] iommu: The early demise of bus ops Robin Murphy
                   ` (7 preceding siblings ...)
  2023-01-19 19:18 ` [PATCH 8/8] iommu: Pass device through ops->domain_alloc Robin Murphy
@ 2023-01-19 19:34 ` Jason Gunthorpe
  2023-01-20 12:33 ` Joerg Roedel
  9 siblings, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2023-01-19 19:34 UTC (permalink / raw)
  To: Robin Murphy
  Cc: joro, will, hch, iommu, linux-kernel, rafael.j.wysocki, gregkh

On Thu, Jan 19, 2023 at 07:18:18PM +0000, Robin Murphy wrote:
> Hi all,
> 
> [ Christoph, Greg, Rafael; feel free to ignore all the IOMMU details,
>  just a heads-up for some pretty trivial header motion in patch #6 ]
> 
> This is sort of an RFC, in the sense that the patches are functionally
> ready but I don't expect that we necessarily want to merge all them
> right away; at this point it's more for the sake of visibility and
> checking if anyone strongly objects to the direction I'm taking.

Looks good to me

Is there anything beyond some typing preventing the removal of the bus
from the out-of-iommu callers?

> I've based these patches on 6.2-rc3 and made no effort to integrate them
> with the IOMMUFD-related work going on in parallel and/or already
> queued, even though there is some functional intersection and almost
> certain conflicts. If we reach a consensus that we would like any of
> this for 6.3 I'll rebase as appropriate.

I don't see a reason to wait..

Jason

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/8] iommu: Factor out a "first device in group" helper
  2023-01-19 19:23   ` Jason Gunthorpe
@ 2023-01-19 19:36     ` Robin Murphy
  0 siblings, 0 replies; 32+ messages in thread
From: Robin Murphy @ 2023-01-19 19:36 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: joro, will, hch, iommu, linux-kernel

On 19/01/2023 7:23 pm, Jason Gunthorpe wrote:
> On Thu, Jan 19, 2023 at 07:18:21PM +0000, Robin Murphy wrote:
>> This pattern for picking the first device out of the group list is
>> repeated a few times now, so it's clearly worth factoring out.
>>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>   drivers/iommu/iommu.c | 22 ++++++++++------------
>>   1 file changed, 10 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index bc53ffbba4de..5b37766a09e2 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -1084,6 +1084,11 @@ void iommu_group_remove_device(struct device *dev)
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_group_remove_device);
>>   
>> +static struct device *iommu_group_first_dev(struct iommu_group *group)
>> +{
> 
> Add a
>   lockdep_assert_held(&group->lock);
> 
> ?

Sure, could do. I guess I didn't consider it since 
iommu_group_device_count() and __iommu_group_for_each_dev() don't assert 
it either, but that's not to say they couldn't change too.

>> -	group->blocking_domain =
>> -		__iommu_domain_alloc(dev->dev->bus, IOMMU_DOMAIN_BLOCKED);
>> +	group->blocking_domain = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_BLOCKED);
>>   	if (!group->blocking_domain) {
>>   		/*
>>   		 * For drivers that do not yet understand IOMMU_DOMAIN_BLOCKED
>>   		 * create an empty domain instead.
>>   		 */
>> -		group->blocking_domain = __iommu_domain_alloc(
>> -			dev->dev->bus, IOMMU_DOMAIN_UNMANAGED);
>> +		group->blocking_domain = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_UNMANAGED);
>>   		if (!group->blocking_domain)
>>   			return -EINVAL;
> 
> These are extra hunks?

The annoyingly-subtle difference between "dev->dev->bus" and "dev->bus" 
is precisely one of the reasons I think hiding the group_device behind a 
helper is worthwhile ;)

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/8] iommu: Switch __iommu_domain_alloc() to device ops
  2023-01-19 19:26   ` Jason Gunthorpe
@ 2023-01-19 20:12     ` Robin Murphy
  0 siblings, 0 replies; 32+ messages in thread
From: Robin Murphy @ 2023-01-19 20:12 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: joro, will, hch, iommu, linux-kernel

On 19/01/2023 7:26 pm, Jason Gunthorpe wrote:
> On Thu, Jan 19, 2023 at 07:18:22PM +0000, Robin Murphy wrote:
> 
>> -static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>> +static struct iommu_domain *__iommu_domain_alloc(struct device *dev,
>>   						 unsigned type)
>>   {
>> -	const struct iommu_ops *ops = bus ? bus->iommu_ops : NULL;
>> +	const struct iommu_ops *ops = dev_iommu_ops(dev);
>>   	struct iommu_domain *domain;
>>   
>> -	if (!ops)
>> -		return NULL;
>> -
>>   	domain = ops->domain_alloc(type);
>>   	if (!domain)
>>   		return NULL;
>> @@ -1970,9 +1968,28 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>>   	return domain;
>>   }
>>   
>> +static int __iommu_domain_alloc_dev(struct device *dev, void *data)
>> +{
>> +	struct device **alloc_dev = data;
>> +
>> +	if (!device_iommu_mapped(dev))
>> +		return 0;
> 
> Is 0 the right thing? see below

Yes, the idea here is to always double-check the whole bus.

>> +
>> +	WARN_ONCE(*alloc_dev && dev_iommu_ops(dev) != dev_iommu_ops(*alloc_dev),
>> +		"Multiple IOMMU drivers present, which the public IOMMU API can't fully support yet. This may not work as expected, sorry!\n");
> 
> if (WARN_ONCE(..))
>     return -EINVAL
> 
> So that iommu_domain_alloc fails?

The current behaviour is that if you have multiple different IOMMUs 
present, then only one driver succeeds in registering, effectively at 
random depending on probe order. To get predictable behaviour where 
iommu_domain_alloc() (and indeed the whole IOMMU API) works for a 
specific device, you have to manage your kernel config or modules to 
only load the driver for the correct IOMMU.

After patch #4, we allow all the drivers to register, but the net effect 
on the public API is still the same - it only works successfully for one 
driver, effectively at random - and the same solution - don't load the 
other drivers, or at least load them in an appropriate order relative to 
the client drivers - still applies. On those grounds it seems a fair 
compromise until we can sort iommu_domain_alloc() itself. As far as I'm 
aware there are still no immediate real-world users for this - upstream 
support for Rockchip RK3588 is still in very early days, and a long way 
off being complete enough for users to get interested in trying to play 
with the Arm SMMUs in there (leading to disappointment that VFIO won't 
work since they're non-coherent...)

>> +	*alloc_dev = dev;
>> +	return 0;
>> +}
>> +
>>   struct iommu_domain *iommu_domain_alloc(struct bus_type *bus)
>>   {
>> -	return __iommu_domain_alloc(bus, IOMMU_DOMAIN_UNMANAGED);
>> +	struct device *dev = NULL;
>> +
>> +	if (bus_for_each_dev(bus, NULL, &dev, __iommu_domain_alloc_dev))
>> +		return NULL;
> 
> eg shouldn't iommu_domain_alloc() return NULL if any devices are
> !device_iommu_mapped ?

No, that would even break the normal single-driver case, because it's 
always been the case that not all devices on e.g. the platform bus are 
iommu_mapped. That's largely why bus ops are a rubbish abstraction.

Even with multiple drivers, we can still allocate a domain here which 
will work fine with *some* devices, and safely fail to work with others, 
so I don't see that we'd gain much from being unnecessarily restrictive.

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 7/8] iommu: Clean up open-coded ownership checks
  2023-01-19 19:31   ` Jason Gunthorpe
@ 2023-01-19 20:52     ` Robin Murphy
  0 siblings, 0 replies; 32+ messages in thread
From: Robin Murphy @ 2023-01-19 20:52 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: joro, will, hch, iommu, linux-kernel

On 19/01/2023 7:31 pm, Jason Gunthorpe wrote:
> On Thu, Jan 19, 2023 at 07:18:25PM +0000, Robin Murphy wrote:
>> diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
>> index 270c3d9128ba..b2d3d309be1e 100644
>> --- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
>> +++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
>> @@ -79,16 +79,6 @@ static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
>>   
>>   static const struct iommu_ops qcom_iommu_ops;
>>   
>> -static struct qcom_iommu_dev * to_iommu(struct device *dev)
>> -{
>> -	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>> -
>> -	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
>> -		return NULL;
>> -
>> -	return dev_iommu_priv_get(dev);
>> -}
>> -
>>   static struct qcom_iommu_ctx * to_ctx(struct qcom_iommu_domain *d, unsigned asid)
>>   {
>>   	struct qcom_iommu_dev *qcom_iommu = d->iommu;
>> @@ -361,7 +351,7 @@ static void qcom_iommu_domain_free(struct iommu_domain *domain)
>>   
>>   static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
>>   {
>> -	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev);
>> +	struct qcom_iommu_dev *qcom_iommu = dev_iommu_priv_get(dev);
>>   	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>>   	int ret;
> 
> The confusing checks for null qcom_iommu following this should go away
> too, right?
> 
> It should not be possible to have an ops set on the device but have an
> invalid priv..

Yeah, this was just an audit of already-redundant fwspec->ops checks, 
but another one for redundant dev_iommu_priv checks is probably 
worthwhile too.

Cheers,
Robin.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 6/8] iommu: Retire bus ops
  2023-01-19 19:18 ` [PATCH 6/8] iommu: Retire bus ops Robin Murphy
@ 2023-01-20  0:27   ` Baolu Lu
  2023-01-20 12:31     ` Robin Murphy
  2023-01-20 10:23   ` Greg Kroah-Hartman
  1 sibling, 1 reply; 32+ messages in thread
From: Baolu Lu @ 2023-01-20  0:27 UTC (permalink / raw)
  To: Robin Murphy, joro, will
  Cc: baolu.lu, hch, jgg, iommu, linux-kernel, Rafael J . Wysocki,
	Greg Kroah-Hartman

On 2023/1/20 3:18, Robin Murphy wrote:
> +	/*
> +	 * For FDT-based systems and ACPI IORT/VIOT, drivers register IOMMU
> +	 * instances with non-NULL fwnodes, and client devices should have been
> +	 * identified with a fwspec by this point. For Intel/AMD/s390/PAMU we
> +	 * can assume a single active driver with global ops, and so grab those
> +	 * from any registered instance, cheekily co-opting the same mechanism.
> +	 */
> +	fwspec = dev_iommu_fwspec_get(dev);
> +	if (fwspec && fwspec->ops)
> +		ops = fwspec->ops;
> +	else
> +		ops = iommu_ops_from_fwnode(NULL);

I'm imagining if Intel/AMD/s390 drivers need to give up global ops.
Is there any way to allow them to make such conversion? I am just
thinking about whether this is a hard limitation for these drivers.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 6/8] iommu: Retire bus ops
  2023-01-19 19:18 ` [PATCH 6/8] iommu: Retire bus ops Robin Murphy
  2023-01-20  0:27   ` Baolu Lu
@ 2023-01-20 10:23   ` Greg Kroah-Hartman
  1 sibling, 0 replies; 32+ messages in thread
From: Greg Kroah-Hartman @ 2023-01-20 10:23 UTC (permalink / raw)
  To: Robin Murphy
  Cc: joro, will, hch, jgg, iommu, linux-kernel, Rafael J . Wysocki

On Thu, Jan 19, 2023 at 07:18:24PM +0000, Robin Murphy wrote:
> With the rest of the API internals converted, it's time to finally
> tackle probe_device and how we bootstrap the per-device ops association
> to begin with. This ends up being disappointingly straightforward, since
> fwspec users are already doing it in order to find their of_xlate
> callback, and it works out that we can easily do the equivalent for
> other drivers too. Then shuffle the remaining awareness of iommu_ops
> into the couple of core headers that still need it, and breathe a sigh
> of relief...
> 
> Ding dong the bus ops are gone!
> 
> CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> CC: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Nice work!

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 6/8] iommu: Retire bus ops
  2023-01-20  0:27   ` Baolu Lu
@ 2023-01-20 12:31     ` Robin Murphy
  2023-01-26 12:37       ` Baolu Lu
  0 siblings, 1 reply; 32+ messages in thread
From: Robin Murphy @ 2023-01-20 12:31 UTC (permalink / raw)
  To: Baolu Lu, joro, will
  Cc: hch, jgg, iommu, linux-kernel, Rafael J . Wysocki, Greg Kroah-Hartman

On 2023-01-20 00:27, Baolu Lu wrote:
> On 2023/1/20 3:18, Robin Murphy wrote:
>> +    /*
>> +     * For FDT-based systems and ACPI IORT/VIOT, drivers register IOMMU
>> +     * instances with non-NULL fwnodes, and client devices should 
>> have been
>> +     * identified with a fwspec by this point. For 
>> Intel/AMD/s390/PAMU we
>> +     * can assume a single active driver with global ops, and so grab 
>> those
>> +     * from any registered instance, cheekily co-opting the same 
>> mechanism.
>> +     */
>> +    fwspec = dev_iommu_fwspec_get(dev);
>> +    if (fwspec && fwspec->ops)
>> +        ops = fwspec->ops;
>> +    else
>> +        ops = iommu_ops_from_fwnode(NULL);
> 
> I'm imagining if Intel/AMD/s390 drivers need to give up global ops.
> Is there any way to allow them to make such conversion? I am just
> thinking about whether this is a hard limitation for these drivers.

Yes, they could perhaps bodge into the existing fwnode mechanism, or we 
could make bigger changes to adapt and generalise the whole 
instance-registration-token-lookup concept, or if the driver can resolve 
the correct instance for a device internally, then it could suffice to 
just have all its device ops share a single common .probe_device 
implementation that does the right thing.

The comment is merely noting the fact that we can get away without 
having to worry about those changes just yet, since all the drivers 
*are* currently still built around the hard constraint of a single set 
of device ops per bus.

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 0/8] iommu: The early demise of bus ops
  2023-01-19 19:18 [PATCH 0/8] iommu: The early demise of bus ops Robin Murphy
                   ` (8 preceding siblings ...)
  2023-01-19 19:34 ` [PATCH 0/8] iommu: The early demise of bus ops Jason Gunthorpe
@ 2023-01-20 12:33 ` Joerg Roedel
  9 siblings, 0 replies; 32+ messages in thread
From: Joerg Roedel @ 2023-01-20 12:33 UTC (permalink / raw)
  To: Robin Murphy
  Cc: will, hch, jgg, iommu, linux-kernel, rafael.j.wysocki, gregkh

Hi Robin,

On Thu, Jan 19, 2023 at 07:18:18PM +0000, Robin Murphy wrote:
> This is sort of an RFC, in the sense that the patches are functionally
> ready but I don't expect that we necessarily want to merge all them
> right away; at this point it's more for the sake of visibility and
> checking if anyone strongly objects to the direction I'm taking. As such
> I've based these patches on 6.2-rc3 and made no effort to integrate them
> with the IOMMUFD-related work going on in parallel and/or already
> queued, even though there is some functional intersection and almost
> certain conflicts. If we reach a consensus that we would like any of
> this for 6.3 I'll rebase as appropriate.

Thanks for doing this, I like the direction this is taking! I see no
strict reason to wait with this, but also no strict reason to hurry it
in.

I would say we have another week for this to get ready to be merged into
6.3, which means having a version based on iommu/core with the reviews
addressed and enough relevant Reviewed-by's.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 6/8] iommu: Retire bus ops
  2023-01-20 12:31     ` Robin Murphy
@ 2023-01-26 12:37       ` Baolu Lu
  0 siblings, 0 replies; 32+ messages in thread
From: Baolu Lu @ 2023-01-26 12:37 UTC (permalink / raw)
  To: Robin Murphy, joro, will
  Cc: baolu.lu, hch, jgg, iommu, linux-kernel, Rafael J . Wysocki,
	Greg Kroah-Hartman

On 2023/1/20 20:31, Robin Murphy wrote:
> On 2023-01-20 00:27, Baolu Lu wrote:
>> On 2023/1/20 3:18, Robin Murphy wrote:
>>> +    /*
>>> +     * For FDT-based systems and ACPI IORT/VIOT, drivers register IOMMU
>>> +     * instances with non-NULL fwnodes, and client devices should 
>>> have been
>>> +     * identified with a fwspec by this point. For 
>>> Intel/AMD/s390/PAMU we
>>> +     * can assume a single active driver with global ops, and so 
>>> grab those
>>> +     * from any registered instance, cheekily co-opting the same 
>>> mechanism.
>>> +     */
>>> +    fwspec = dev_iommu_fwspec_get(dev);
>>> +    if (fwspec && fwspec->ops)
>>> +        ops = fwspec->ops;
>>> +    else
>>> +        ops = iommu_ops_from_fwnode(NULL);
>>
>> I'm imagining if Intel/AMD/s390 drivers need to give up global ops.
>> Is there any way to allow them to make such conversion? I am just
>> thinking about whether this is a hard limitation for these drivers.
> 
> Yes, they could perhaps bodge into the existing fwnode mechanism, or we 
> could make bigger changes to adapt and generalise the whole 
> instance-registration-token-lookup concept, or if the driver can resolve 
> the correct instance for a device internally, then it could suffice to 
> just have all its device ops share a single common .probe_device 
> implementation that does the right thing.

Yes. Sharing a common .probe_device entry and let the IOMMU driver
connect the device and its iommu ops is a feasible solution. Thanks!

> 
> The comment is merely noting the fact that we can get away without 
> having to worry about those changes just yet, since all the drivers 
> *are* currently still built around the hard constraint of a single set 
> of device ops per bus.

Yes. Fair enough.

--
Best regards,
baolu


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/8] iommu: Decouple iommu_present() from bus ops
  2023-01-19 19:18 ` [PATCH 1/8] iommu: Decouple iommu_present() from " Robin Murphy
@ 2023-01-26 13:13   ` Baolu Lu
  2023-01-26 14:21     ` Robin Murphy
  0 siblings, 1 reply; 32+ messages in thread
From: Baolu Lu @ 2023-01-26 13:13 UTC (permalink / raw)
  To: Robin Murphy, joro, will; +Cc: baolu.lu, hch, jgg, iommu, linux-kernel

On 2023/1/20 3:18, Robin Murphy wrote:
> Much as I'd like to remove iommu_present(), the final remaining users
> are proving stubbornly difficult to clean up, so kick that can down
> the road and just rework it to preserve the current behaviour without
> depending on bus ops.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>   drivers/iommu/iommu.c | 17 ++++++++++++++++-
>   1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index b189ed345057..a77d58e1b976 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1871,9 +1871,24 @@ int bus_iommu_probe(struct bus_type *bus)
>   	return ret;
>   }
>   
> +static int __iommu_present(struct device *dev, void *unused)
> +{
> +	return device_iommu_mapped(dev);
> +}

/**
  * device_iommu_mapped - Returns true when the device DMA is translated
  *                       by an IOMMU
  * @dev: Device to perform the check on
  */
static inline bool device_iommu_mapped(struct device *dev)
{
         return (dev->iommu_group != NULL);
}

Perhaps device_iommu_mapped() should be improved. In some cases, the
device has an iommu_group filled is not enough to indicate that the
device has IOMMU hardware for DMA translation.

For example, VFIO could allocate an iommu_group and add a device into
the iommu_group even there's no IOMMU hardware in
vfio_noiommu_group_alloc().

Basically iommu_group_add_device() doesn't check the presence of an
IOMMU.

> +
> +/**
> + * iommu_present() - make platform-specific assumptions about an IOMMU
> + * @bus: bus to check
> + *
> + * Do not use this function. You want device_iommu_mapped() instead.
> + *
> + * Return: true if some IOMMU is present for some device on the given bus. In
> + * general it may not be the only IOMMU, and it may not be for the device you
> + * are ultimately interested in.
> + */
>   bool iommu_present(struct bus_type *bus)
>   {
> -	return bus->iommu_ops != NULL;
> +	return bus_for_each_dev(bus, NULL, NULL, __iommu_present) > 0;
>   }
>   EXPORT_SYMBOL_GPL(iommu_present);
>   

--
Best regards,
baolu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/8] iommu: Decouple iommu_present() from bus ops
  2023-01-26 13:13   ` Baolu Lu
@ 2023-01-26 14:21     ` Robin Murphy
  2023-01-26 14:41       ` Jason Gunthorpe
  0 siblings, 1 reply; 32+ messages in thread
From: Robin Murphy @ 2023-01-26 14:21 UTC (permalink / raw)
  To: Baolu Lu, joro, will; +Cc: hch, jgg, iommu, linux-kernel

On 2023-01-26 13:13, Baolu Lu wrote:
> On 2023/1/20 3:18, Robin Murphy wrote:
>> Much as I'd like to remove iommu_present(), the final remaining users
>> are proving stubbornly difficult to clean up, so kick that can down
>> the road and just rework it to preserve the current behaviour without
>> depending on bus ops.
>>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>   drivers/iommu/iommu.c | 17 ++++++++++++++++-
>>   1 file changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index b189ed345057..a77d58e1b976 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -1871,9 +1871,24 @@ int bus_iommu_probe(struct bus_type *bus)
>>       return ret;
>>   }
>> +static int __iommu_present(struct device *dev, void *unused)
>> +{
>> +    return device_iommu_mapped(dev);
>> +}
> 
> /**
>   * device_iommu_mapped - Returns true when the device DMA is translated
>   *                       by an IOMMU
>   * @dev: Device to perform the check on
>   */
> static inline bool device_iommu_mapped(struct device *dev)
> {
>          return (dev->iommu_group != NULL);
> }
> 
> Perhaps device_iommu_mapped() should be improved. In some cases, the
> device has an iommu_group filled is not enough to indicate that the
> device has IOMMU hardware for DMA translation.
> 
> For example, VFIO could allocate an iommu_group and add a device into
> the iommu_group even there's no IOMMU hardware in
> vfio_noiommu_group_alloc().
> 
> Basically iommu_group_add_device() doesn't check the presence of an
> IOMMU.

/**
  * iommu_group_add_device [...]
  *
  * This function is called by an iommu driver [...]
  */

The "check" is inherent in the fact that it's been called at all. VFIO 
noiommu *is* an IOMMU driver in the sense that it provides a bare 
minimum of IOMMU API functionality (i.e. creating groups), sufficient to 
support (careful) usage by VFIO drivers. There would not seem to be a 
legitimate reason for some *other* driver to be specifically querying a 
device while it is already bound to a VFIO driver (and thus may have a 
noiommu group).

In terms of this patch, I'm confident that nobody is using VFIO noiommu 
on old Tegra SoCs; I'm even more confident that they wouldn't be doing 
it with platform devices; and I'm supremely confident that they're not 
loading the GPU drivers while already in the middle of using noiommu 
vfio_platform. Basically "not using VFIO noiommu" is one of the inherent 
platform-specific assumptions. If anyone else now ignores the first 
sentence of the documentation and tries to use iommu_present() somewhere 
that assumption might not hold, returning a meaningless wrong answer is 
the documented behaviour :)

Cheers,
Robin.

> 
>> +
>> +/**
>> + * iommu_present() - make platform-specific assumptions about an IOMMU
>> + * @bus: bus to check
>> + *
>> + * Do not use this function. You want device_iommu_mapped() instead.
>> + *
>> + * Return: true if some IOMMU is present for some device on the given 
>> bus. In
>> + * general it may not be the only IOMMU, and it may not be for the 
>> device you
>> + * are ultimately interested in.
>> + */
>>   bool iommu_present(struct bus_type *bus)
>>   {
>> -    return bus->iommu_ops != NULL;
>> +    return bus_for_each_dev(bus, NULL, NULL, __iommu_present) > 0;
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_present);
> 
> -- 
> Best regards,
> baolu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/8] iommu: Decouple iommu_present() from bus ops
  2023-01-26 14:21     ` Robin Murphy
@ 2023-01-26 14:41       ` Jason Gunthorpe
  2023-01-27 13:50         ` Baolu Lu
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Gunthorpe @ 2023-01-26 14:41 UTC (permalink / raw)
  To: Robin Murphy; +Cc: Baolu Lu, joro, will, hch, iommu, linux-kernel

On Thu, Jan 26, 2023 at 02:21:29PM +0000, Robin Murphy wrote:

> The "check" is inherent in the fact that it's been called at all. VFIO
> noiommu *is* an IOMMU driver in the sense that it provides a bare minimum of
> IOMMU API functionality (i.e. creating groups), sufficient to support
> (careful) usage by VFIO drivers. There would not seem to be a legitimate
> reason for some *other* driver to be specifically querying a device while it
> is already bound to a VFIO driver (and thus may have a noiommu group).

Yes, the devices that VFIO assigns to its internal groups never leak
outside VFIO's control during their assignment - ie they are
continuously bound to VFIO never another driver.

So no other driver can ever see the internal groups unless it is
messing around with devices it is not bound to :)

Jason

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/8] iommu: Decouple iommu_present() from bus ops
  2023-01-26 14:41       ` Jason Gunthorpe
@ 2023-01-27 13:50         ` Baolu Lu
  2023-01-27 13:59           ` Jason Gunthorpe
  2023-01-27 15:19           ` Robin Murphy
  0 siblings, 2 replies; 32+ messages in thread
From: Baolu Lu @ 2023-01-27 13:50 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy
  Cc: baolu.lu, joro, will, hch, iommu, linux-kernel

On 2023/1/26 22:41, Jason Gunthorpe wrote:
> On Thu, Jan 26, 2023 at 02:21:29PM +0000, Robin Murphy wrote:
> 
>> The "check" is inherent in the fact that it's been called at all. VFIO
>> noiommu*is*  an IOMMU driver in the sense that it provides a bare minimum of
>> IOMMU API functionality (i.e. creating groups), sufficient to support
>> (careful) usage by VFIO drivers. There would not seem to be a legitimate
>> reason for some*other*  driver to be specifically querying a device while it
>> is already bound to a VFIO driver (and thus may have a noiommu group).
> Yes, the devices that VFIO assigns to its internal groups never leak
> outside VFIO's control during their assignment - ie they are
> continuously bound to VFIO never another driver.
> 
> So no other driver can ever see the internal groups unless it is
> messing around with devices it is not bound to 😄

Fair enough. I was thinking that probably we could make it like below:

/**
  * device_iommu_mapped - Returns true when the device DMA is translated
  *                       by an IOMMU
  * @dev: Device to perform the check on
  */
static inline bool device_iommu_mapped(struct device *dev)
{
         return (dev->iommu && dev->iommu->iommu_dev);
}

The iommu probe device code guarantees that dev->iommu->iommu_dev is
valid only after the IOMMU driver's .probe_device returned successfully.

Any thoughts?

Best regards,
baolu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/8] iommu: Decouple iommu_present() from bus ops
  2023-01-27 13:50         ` Baolu Lu
@ 2023-01-27 13:59           ` Jason Gunthorpe
  2023-01-27 15:19           ` Robin Murphy
  1 sibling, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2023-01-27 13:59 UTC (permalink / raw)
  To: Baolu Lu; +Cc: Robin Murphy, joro, will, hch, iommu, linux-kernel

On Fri, Jan 27, 2023 at 09:50:29PM +0800, Baolu Lu wrote:
> On 2023/1/26 22:41, Jason Gunthorpe wrote:
> > On Thu, Jan 26, 2023 at 02:21:29PM +0000, Robin Murphy wrote:
> > 
> > > The "check" is inherent in the fact that it's been called at all. VFIO
> > > noiommu*is*  an IOMMU driver in the sense that it provides a bare minimum of
> > > IOMMU API functionality (i.e. creating groups), sufficient to support
> > > (careful) usage by VFIO drivers. There would not seem to be a legitimate
> > > reason for some*other*  driver to be specifically querying a device while it
> > > is already bound to a VFIO driver (and thus may have a noiommu group).
> > Yes, the devices that VFIO assigns to its internal groups never leak
> > outside VFIO's control during their assignment - ie they are
> > continuously bound to VFIO never another driver.
> > 
> > So no other driver can ever see the internal groups unless it is
> > messing around with devices it is not bound to 😄
> 
> Fair enough. I was thinking that probably we could make it like below:
> 
> /**
>  * device_iommu_mapped - Returns true when the device DMA is translated
>  *                       by an IOMMU
>  * @dev: Device to perform the check on
>  */
> static inline bool device_iommu_mapped(struct device *dev)
> {
>         return (dev->iommu && dev->iommu->iommu_dev);
> }
> 
> The iommu probe device code guarantees that dev->iommu->iommu_dev is
> valid only after the IOMMU driver's .probe_device returned successfully.
> 
> Any thoughts?

I find the above much clearer if it can work

Jason

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/8] iommu: Decouple iommu_present() from bus ops
  2023-01-27 13:50         ` Baolu Lu
  2023-01-27 13:59           ` Jason Gunthorpe
@ 2023-01-27 15:19           ` Robin Murphy
  2023-01-27 15:43             ` Jason Gunthorpe
  1 sibling, 1 reply; 32+ messages in thread
From: Robin Murphy @ 2023-01-27 15:19 UTC (permalink / raw)
  To: Baolu Lu, Jason Gunthorpe; +Cc: joro, will, hch, iommu, linux-kernel

On 2023-01-27 13:50, Baolu Lu wrote:
> On 2023/1/26 22:41, Jason Gunthorpe wrote:
>> On Thu, Jan 26, 2023 at 02:21:29PM +0000, Robin Murphy wrote:
>>
>>> The "check" is inherent in the fact that it's been called at all. VFIO
>>> noiommu*is*  an IOMMU driver in the sense that it provides a bare 
>>> minimum of
>>> IOMMU API functionality (i.e. creating groups), sufficient to support
>>> (careful) usage by VFIO drivers. There would not seem to be a legitimate
>>> reason for some*other*  driver to be specifically querying a device 
>>> while it
>>> is already bound to a VFIO driver (and thus may have a noiommu group).
>> Yes, the devices that VFIO assigns to its internal groups never leak
>> outside VFIO's control during their assignment - ie they are
>> continuously bound to VFIO never another driver.
>>
>> So no other driver can ever see the internal groups unless it is
>> messing around with devices it is not bound to 😄
> 
> Fair enough. I was thinking that probably we could make it like below:
> 
> /**
>   * device_iommu_mapped - Returns true when the device DMA is translated
>   *                       by an IOMMU
>   * @dev: Device to perform the check on
>   */
> static inline bool device_iommu_mapped(struct device *dev)
> {
>          return (dev->iommu && dev->iommu->iommu_dev);
> }
> 
> The iommu probe device code guarantees that dev->iommu->iommu_dev is
> valid only after the IOMMU driver's .probe_device returned successfully.
> 
> Any thoughts?

Heh, I actually wrote that helper yesterday for v2, but as an internal 
check for valid ops :)

The current implementation of device_iommu_mapped() just dates back to 
when dev->iommu_group was the only per-device thing we had, so in 
principle I don't have any conceptual objection to redefining it in 
terms of "device has ops" rather than "device has a group", but as 
things stand you'd still have to do something about PPC first (I know 
Jason had been pushing on that, but I've not kept track of where it got to).

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/8] iommu: Decouple iommu_present() from bus ops
  2023-01-27 15:19           ` Robin Murphy
@ 2023-01-27 15:43             ` Jason Gunthorpe
  2023-01-28  8:49               ` Baolu Lu
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Gunthorpe @ 2023-01-27 15:43 UTC (permalink / raw)
  To: Robin Murphy; +Cc: Baolu Lu, joro, will, hch, iommu, linux-kernel

On Fri, Jan 27, 2023 at 03:19:55PM +0000, Robin Murphy wrote:

> The current implementation of device_iommu_mapped() just dates back to when
> dev->iommu_group was the only per-device thing we had, so in principle I
> don't have any conceptual objection to redefining it in terms of "device has
> ops" rather than "device has a group", but as things stand you'd still have
> to do something about PPC first (I know Jason had been pushing on that, but
> I've not kept track of where it got to).

PPC hasn't moved at all, AFAICT. In a few more months I'm going to
suggest we delete the special VFIO support due to it being broken,
distros already having turned it off and nobody caring enough to fix
it..

What does device_iommu_mapped() even really mean?

Looking at usages..

These are fixing SOC HW bugs/issues - the desire seems to be "is the SOC's
IOMMU enabled"

drivers/char/agp/intel-gtt.c:           device_iommu_mapped(&intel_private.pcidev->dev));
drivers/dma/sh/rcar-dmac.c:     if (device_iommu_mapped(&pdev->dev))
drivers/gpu/drm/i915/i915_utils.c:      if (device_iommu_mapped(i915->drm.dev))
?
drivers/usb/dwc3/dwc3-xilinx.c: if (of_dma_is_coherent(dev->of_node) || device_iommu_mapped(dev)) {
drivers/usb/host/xhci.c:        if (!(xhci->quirks & XHCI_ZERO_64B_REGS) || !device_iommu_mapped(dev))
drivers/crypto/qat/qat_common/adf_sriov.c:      if (!device_iommu_mapped(&pdev->dev))
?

These seem to be trying to decide if iommu_domain's can be used (and
they can't be on power):

drivers/gpu/drm/msm/msm_drv.c:  if (device_iommu_mapped(mdp_dev))
drivers/gpu/drm/msm/msm_drv.c:          device_iommu_mapped(dev->dev) ||
drivers/gpu/drm/msm/msm_drv.c:          device_iommu_mapped(dev->dev->parent);
drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c:     if (device_iommu_mapped(dev)) {
drivers/gpu/drm/rockchip/rockchip_drm_drv.c:    if (!device_iommu_mapped(dev))
drivers/gpu/drm/tegra/uapi.c:   if (device_iommu_mapped(client->base.dev) && client->ops->can_use_memory_ctx) {
drivers/gpu/host1x/context.c:           if (!fwspec || !device_iommu_mapped(&ctx->dev)) {
drivers/infiniband/hw/usnic/usnic_ib_main.c:    if (!device_iommu_mapped(&pdev->dev)) {

Yikes, trying to map DMA addresses programmed into devices back to CPU addresses:

drivers/misc/habanalabs/common/debugfs.c: if (!user_address || device_iommu_mapped(&hdev->pdev->dev)) {
drivers/misc/habanalabs/gaudi2/gaudi2.c:                if (!device_iommu_mapped(&hdev->pdev->dev))

And then sequencing the call to iommu_probe_device() which doesn't
apply to power:

drivers/acpi/scan.c:    if (!err && dev->bus && !device_iommu_mapped(dev))
drivers/iommu/of_iommu.c:       if (!err && dev->bus && !device_iommu_mapped(dev))

Leaving these:

arch/powerpc/kernel/eeh.c:      if (device_iommu_mapped(dev)) {

This is only used to support eeh_iommu_group_to_pe which is only
caleld by vfio_iommu_spapr_tce.c. Since power vfio doesn't work right
now this is uncallable, and when power is fixed this will work
properly.

arch/powerpc/kernel/iommu.c:    if (device_iommu_mapped(dev)) {
arch/powerpc/kernel/iommu.c:    if (!device_iommu_mapped(dev)) {

These should both be replaced with some kind of 'device has iommu group', since
it is really driving ppc unique group logic.

So, I'd say Baolu's approach is the right thing, just replace the
above two in ppc with something else.

Jason

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/8] iommu: Decouple iommu_present() from bus ops
  2023-01-27 15:43             ` Jason Gunthorpe
@ 2023-01-28  8:49               ` Baolu Lu
  2023-01-30 13:49                 ` Robin Murphy
  0 siblings, 1 reply; 32+ messages in thread
From: Baolu Lu @ 2023-01-28  8:49 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy
  Cc: baolu.lu, joro, will, hch, iommu, linux-kernel

On 2023/1/27 23:43, Jason Gunthorpe wrote:
> On Fri, Jan 27, 2023 at 03:19:55PM +0000, Robin Murphy wrote:
> 
>> The current implementation of device_iommu_mapped() just dates back to when
>> dev->iommu_group was the only per-device thing we had, so in principle I
>> don't have any conceptual objection to redefining it in terms of "device has
>> ops" rather than "device has a group", but as things stand you'd still have
>> to do something about PPC first (I know Jason had been pushing on that, but
>> I've not kept track of where it got to).
> PPC hasn't moved at all, AFAICT. In a few more months I'm going to
> suggest we delete the special VFIO support due to it being broken,
> distros already having turned it off and nobody caring enough to fix
> it..
> 
> What does device_iommu_mapped() even really mean?
> 
> Looking at usages..
> 
> These are fixing SOC HW bugs/issues - the desire seems to be "is the SOC's
> IOMMU enabled"
> 
> drivers/char/agp/intel-gtt.c:           device_iommu_mapped(&intel_private.pcidev->dev));
> drivers/dma/sh/rcar-dmac.c:     if (device_iommu_mapped(&pdev->dev))
> drivers/gpu/drm/i915/i915_utils.c:      if (device_iommu_mapped(i915->drm.dev))
> ?
> drivers/usb/dwc3/dwc3-xilinx.c: if (of_dma_is_coherent(dev->of_node) || device_iommu_mapped(dev)) {
> drivers/usb/host/xhci.c:        if (!(xhci->quirks & XHCI_ZERO_64B_REGS) || !device_iommu_mapped(dev))
> drivers/crypto/qat/qat_common/adf_sriov.c:      if (!device_iommu_mapped(&pdev->dev))
> ?
> 
> These seem to be trying to decide if iommu_domain's can be used (and
> they can't be on power):
> 
> drivers/gpu/drm/msm/msm_drv.c:  if (device_iommu_mapped(mdp_dev))
> drivers/gpu/drm/msm/msm_drv.c:          device_iommu_mapped(dev->dev) ||
> drivers/gpu/drm/msm/msm_drv.c:          device_iommu_mapped(dev->dev->parent);
> drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c:     if (device_iommu_mapped(dev)) {
> drivers/gpu/drm/rockchip/rockchip_drm_drv.c:    if (!device_iommu_mapped(dev))
> drivers/gpu/drm/tegra/uapi.c:   if (device_iommu_mapped(client->base.dev) && client->ops->can_use_memory_ctx) {
> drivers/gpu/host1x/context.c:           if (!fwspec || !device_iommu_mapped(&ctx->dev)) {
> drivers/infiniband/hw/usnic/usnic_ib_main.c:    if (!device_iommu_mapped(&pdev->dev)) {
> 
> Yikes, trying to map DMA addresses programmed into devices back to CPU addresses:
> 
> drivers/misc/habanalabs/common/debugfs.c: if (!user_address || device_iommu_mapped(&hdev->pdev->dev)) {
> drivers/misc/habanalabs/gaudi2/gaudi2.c:                if (!device_iommu_mapped(&hdev->pdev->dev))
> 
> And then sequencing the call to iommu_probe_device() which doesn't
> apply to power:
> 
> drivers/acpi/scan.c:    if (!err && dev->bus && !device_iommu_mapped(dev))
> drivers/iommu/of_iommu.c:       if (!err && dev->bus && !device_iommu_mapped(dev))
> 
> Leaving these:
> 
> arch/powerpc/kernel/eeh.c:      if (device_iommu_mapped(dev)) {
> 
> This is only used to support eeh_iommu_group_to_pe which is only
> caleld by vfio_iommu_spapr_tce.c. Since power vfio doesn't work right
> now this is uncallable, and when power is fixed this will work
> properly.
> 
> arch/powerpc/kernel/iommu.c:    if (device_iommu_mapped(dev)) {
> arch/powerpc/kernel/iommu.c:    if (!device_iommu_mapped(dev)) {
> 
> These should both be replaced with some kind of 'device has iommu group', since
> it is really driving ppc unique group logic.
> 
> So, I'd say Baolu's approach is the right thing, just replace the
> above two in ppc with something else.

Thank you both. I will follow up a series later.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/8] iommu: Decouple iommu_present() from bus ops
  2023-01-28  8:49               ` Baolu Lu
@ 2023-01-30 13:49                 ` Robin Murphy
  2023-01-30 13:53                   ` Jason Gunthorpe
  0 siblings, 1 reply; 32+ messages in thread
From: Robin Murphy @ 2023-01-30 13:49 UTC (permalink / raw)
  To: Baolu Lu, Jason Gunthorpe; +Cc: joro, will, hch, iommu, linux-kernel

On 2023-01-28 08:49, Baolu Lu wrote:
> On 2023/1/27 23:43, Jason Gunthorpe wrote:
>> On Fri, Jan 27, 2023 at 03:19:55PM +0000, Robin Murphy wrote:
>>
>>> The current implementation of device_iommu_mapped() just dates back 
>>> to when
>>> dev->iommu_group was the only per-device thing we had, so in principle I
>>> don't have any conceptual objection to redefining it in terms of 
>>> "device has
>>> ops" rather than "device has a group", but as things stand you'd 
>>> still have
>>> to do something about PPC first (I know Jason had been pushing on 
>>> that, but
>>> I've not kept track of where it got to).
>> PPC hasn't moved at all, AFAICT. In a few more months I'm going to
>> suggest we delete the special VFIO support due to it being broken,
>> distros already having turned it off and nobody caring enough to fix
>> it..
>>
>> What does device_iommu_mapped() even really mean?
>>
>> Looking at usages..
>>
>> These are fixing SOC HW bugs/issues - the desire seems to be "is the 
>> SOC's
>> IOMMU enabled"
>>
>> drivers/char/agp/intel-gtt.c:           
>> device_iommu_mapped(&intel_private.pcidev->dev));
>> drivers/dma/sh/rcar-dmac.c:     if (device_iommu_mapped(&pdev->dev))
>> drivers/gpu/drm/i915/i915_utils.c:      if 
>> (device_iommu_mapped(i915->drm.dev))
>> ?
>> drivers/usb/dwc3/dwc3-xilinx.c: if (of_dma_is_coherent(dev->of_node) 
>> || device_iommu_mapped(dev)) {
>> drivers/usb/host/xhci.c:        if (!(xhci->quirks & 
>> XHCI_ZERO_64B_REGS) || !device_iommu_mapped(dev))
>> drivers/crypto/qat/qat_common/adf_sriov.c:      if 
>> (!device_iommu_mapped(&pdev->dev))
>> ?
>>
>> These seem to be trying to decide if iommu_domain's can be used (and
>> they can't be on power):
>>
>> drivers/gpu/drm/msm/msm_drv.c:  if (device_iommu_mapped(mdp_dev))
>> drivers/gpu/drm/msm/msm_drv.c:          device_iommu_mapped(dev->dev) ||
>> drivers/gpu/drm/msm/msm_drv.c:          
>> device_iommu_mapped(dev->dev->parent);
>> drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c:     if 
>> (device_iommu_mapped(dev)) {
>> drivers/gpu/drm/rockchip/rockchip_drm_drv.c:    if 
>> (!device_iommu_mapped(dev))
>> drivers/gpu/drm/tegra/uapi.c:   if 
>> (device_iommu_mapped(client->base.dev) && 
>> client->ops->can_use_memory_ctx) {
>> drivers/gpu/host1x/context.c:           if (!fwspec || 
>> !device_iommu_mapped(&ctx->dev)) {
>> drivers/infiniband/hw/usnic/usnic_ib_main.c:    if 
>> (!device_iommu_mapped(&pdev->dev)) {
>>
>> Yikes, trying to map DMA addresses programmed into devices back to CPU 
>> addresses:
>>
>> drivers/misc/habanalabs/common/debugfs.c: if (!user_address || 
>> device_iommu_mapped(&hdev->pdev->dev)) {
>> drivers/misc/habanalabs/gaudi2/gaudi2.c:                if 
>> (!device_iommu_mapped(&hdev->pdev->dev))
>>
>> And then sequencing the call to iommu_probe_device() which doesn't
>> apply to power:
>>
>> drivers/acpi/scan.c:    if (!err && dev->bus && 
>> !device_iommu_mapped(dev))
>> drivers/iommu/of_iommu.c:       if (!err && dev->bus && 
>> !device_iommu_mapped(dev))
>>
>> Leaving these:
>>
>> arch/powerpc/kernel/eeh.c:      if (device_iommu_mapped(dev)) {
>>
>> This is only used to support eeh_iommu_group_to_pe which is only
>> caleld by vfio_iommu_spapr_tce.c. Since power vfio doesn't work right
>> now this is uncallable, and when power is fixed this will work
>> properly.

Oh wow, I should have looked at more context... Even better, this one is 
already just an elaborate "if (true)" - it has been impossible for 
dev_has_iommu_table() to return 0 since at least 2015 :D
>> arch/powerpc/kernel/iommu.c:    if (device_iommu_mapped(dev)) {
>> arch/powerpc/kernel/iommu.c:    if (!device_iommu_mapped(dev)) {
>>
>> These should both be replaced with some kind of 'device has iommu 
>> group', since
>> it is really driving ppc unique group logic.

And in fact those appear to mostly serve for printing debug messages; if 
we made iommu_group_add_device() return -EBUSY for duplicate calls (not 
necessarily a bad idea anyway vs. relying on the noisy failure of 
sysfs_create_link()), they could arguably just go.

All in all, it's only actually the habanalabs ones that I'm slightly 
wary of, since they're effectively (mis)using device_iommu_mapped() to 
infer the DMA ops implementation, which could potentially go wrong (or 
at least *more* wrong) on POWER with this change. I guess the saving 
grace is that although they are available on PCIe-interfaced modules, 
the userspace driver stack seems to be implicitly x86_64-only - as far 
as I could tell from a quick poke around their site and documentation, 
which doesn't appear to acknowledge the concept of CPU architectures at 
all - so the chances of anyone actually trying to use the kernel drivers 
in anger on POWER seem minimal.

Thanks,
Robin.

>> So, I'd say Baolu's approach is the right thing, just replace the
>> above two in ppc with something else.
> 
> Thank you both. I will follow up a series later.
> 
> Best regards,
> baolu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/8] iommu: Decouple iommu_present() from bus ops
  2023-01-30 13:49                 ` Robin Murphy
@ 2023-01-30 13:53                   ` Jason Gunthorpe
  2023-01-30 14:22                     ` Oded Gabbay
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Gunthorpe @ 2023-01-30 13:53 UTC (permalink / raw)
  To: Robin Murphy; +Cc: Baolu Lu, joro, will, hch, iommu, linux-kernel

On Mon, Jan 30, 2023 at 01:49:20PM +0000, Robin Murphy wrote:

> All in all, it's only actually the habanalabs ones that I'm slightly wary
> of, since they're effectively (mis)using device_iommu_mapped() to infer the
> DMA ops implementation, which could potentially go wrong (or at least *more*
> wrong) on POWER with this change. I guess the saving grace is that
> although

IMHO habana is not using the DMA API abstraction properly. If it
doesn't work on some archs is their bugs to deal with - we don't need
to complexify the core code to tiptoe around around such an abuse in
an obscure driver.

Jason

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/8] iommu: Decouple iommu_present() from bus ops
  2023-01-30 13:53                   ` Jason Gunthorpe
@ 2023-01-30 14:22                     ` Oded Gabbay
  0 siblings, 0 replies; 32+ messages in thread
From: Oded Gabbay @ 2023-01-30 14:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Robin Murphy, Baolu Lu, joro, will, hch, iommu, linux-kernel

On Mon, Jan 30, 2023 at 3:53 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Mon, Jan 30, 2023 at 01:49:20PM +0000, Robin Murphy wrote:
>
> > All in all, it's only actually the habanalabs ones that I'm slightly wary
> > of, since they're effectively (mis)using device_iommu_mapped() to infer the
> > DMA ops implementation, which could potentially go wrong (or at least *more*
> > wrong) on POWER with this change. I guess the saving grace is that
> > although
>
> IMHO habana is not using the DMA API abstraction properly. If it
> doesn't work on some archs is their bugs to deal with - we don't need
> to complexify the core code to tiptoe around around such an abuse in
> an obscure driver.
>
> Jason
Agreed, feel free to change the kapi as you see fit. Do the right
thing for the kernel.
In any case, we limit ourselves to x86-64 arch in the 6.3 merge cycle.

Oded

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2023-01-30 14:25 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-19 19:18 [PATCH 0/8] iommu: The early demise of bus ops Robin Murphy
2023-01-19 19:18 ` [PATCH 1/8] iommu: Decouple iommu_present() from " Robin Murphy
2023-01-26 13:13   ` Baolu Lu
2023-01-26 14:21     ` Robin Murphy
2023-01-26 14:41       ` Jason Gunthorpe
2023-01-27 13:50         ` Baolu Lu
2023-01-27 13:59           ` Jason Gunthorpe
2023-01-27 15:19           ` Robin Murphy
2023-01-27 15:43             ` Jason Gunthorpe
2023-01-28  8:49               ` Baolu Lu
2023-01-30 13:49                 ` Robin Murphy
2023-01-30 13:53                   ` Jason Gunthorpe
2023-01-30 14:22                     ` Oded Gabbay
2023-01-19 19:18 ` [PATCH 2/8] iommu: Validate that devices match domains Robin Murphy
2023-01-19 19:18 ` [PATCH 3/8] iommu: Factor out a "first device in group" helper Robin Murphy
2023-01-19 19:23   ` Jason Gunthorpe
2023-01-19 19:36     ` Robin Murphy
2023-01-19 19:18 ` [PATCH 4/8] iommu: Switch __iommu_domain_alloc() to device ops Robin Murphy
2023-01-19 19:26   ` Jason Gunthorpe
2023-01-19 20:12     ` Robin Murphy
2023-01-19 19:18 ` [PATCH 5/8] iommu/arm-smmu: Don't register fwnode for legacy binding Robin Murphy
2023-01-19 19:18 ` [PATCH 6/8] iommu: Retire bus ops Robin Murphy
2023-01-20  0:27   ` Baolu Lu
2023-01-20 12:31     ` Robin Murphy
2023-01-26 12:37       ` Baolu Lu
2023-01-20 10:23   ` Greg Kroah-Hartman
2023-01-19 19:18 ` [PATCH 7/8] iommu: Clean up open-coded ownership checks Robin Murphy
2023-01-19 19:31   ` Jason Gunthorpe
2023-01-19 20:52     ` Robin Murphy
2023-01-19 19:18 ` [PATCH 8/8] iommu: Pass device through ops->domain_alloc Robin Murphy
2023-01-19 19:34 ` [PATCH 0/8] iommu: The early demise of bus ops Jason Gunthorpe
2023-01-20 12:33 ` Joerg Roedel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.