linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Use right iommu_ops for mock device
@ 2024-01-26 10:53 Lu Baolu
  2024-01-26 10:53 ` [PATCH v2 1/2] iommu: Use mutex instead of spinlock for iommu_device_list Lu Baolu
  2024-01-26 10:53 ` [PATCH v2 2/2] iommu: Probe right iommu_ops for device Lu Baolu
  0 siblings, 2 replies; 8+ messages in thread
From: Lu Baolu @ 2024-01-26 10:53 UTC (permalink / raw)
  To: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy
  Cc: iommu, linux-kernel, Lu Baolu

In the iommu probe device path, __iommu_probe_device() gets the iommu_ops
for the device from dev->iommu->fwspec if this field has been initialized
before probing. Otherwise, it will lookup the global iommu device list
and use the iommu_ops of the first iommu device which has no
dev->iommu->fwspec. This causes the wrong iommu_ops to be used for the mock
device on x86 platforms where dev->iommu->fwspec is not used.

Probe the right iommu_ops for device by iterating over all the global
drivers and calling their probe function to find a match.

Change log:
v2:
 - Iterate over all the global iommu drivers and calling their probe
   function to find a match.

v1: https://lore.kernel.org/linux-iommu/20240111073213.180020-1-baolu.lu@linux.intel.com/

Lu Baolu (2):
  iommu: Use mutex instead of spinlock for iommu_device_list
  iommu: Probe right iommu_ops for device

 drivers/iommu/iommu.c | 98 ++++++++++++++++++++++++++-----------------
 1 file changed, 59 insertions(+), 39 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/2] iommu: Use mutex instead of spinlock for iommu_device_list
  2024-01-26 10:53 [PATCH v2 0/2] Use right iommu_ops for mock device Lu Baolu
@ 2024-01-26 10:53 ` Lu Baolu
  2024-01-29  8:04   ` Tian, Kevin
  2024-01-26 10:53 ` [PATCH v2 2/2] iommu: Probe right iommu_ops for device Lu Baolu
  1 sibling, 1 reply; 8+ messages in thread
From: Lu Baolu @ 2024-01-26 10:53 UTC (permalink / raw)
  To: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy
  Cc: iommu, linux-kernel, Lu Baolu

The iommu_device_lock spinlock was used to protect the iommu device
list. However, there is no requirement to access the iommu device
list in interrupt context. Therefore, a mutex is sufficient.

This also prepares for the next change, which will iterate the iommu
device list and call the probe callback within the locking area.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/iommu.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 68e648b55767..0af0b5544072 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -146,7 +146,7 @@ struct iommu_group_attribute iommu_group_attr_##_name =		\
 	container_of(_kobj, struct iommu_group, kobj)
 
 static LIST_HEAD(iommu_device_list);
-static DEFINE_SPINLOCK(iommu_device_lock);
+static DEFINE_MUTEX(iommu_device_lock);
 
 static const struct bus_type * const iommu_buses[] = {
 	&platform_bus_type,
@@ -262,9 +262,9 @@ int iommu_device_register(struct iommu_device *iommu,
 	if (hwdev)
 		iommu->fwnode = dev_fwnode(hwdev);
 
-	spin_lock(&iommu_device_lock);
+	mutex_lock(&iommu_device_lock);
 	list_add_tail(&iommu->list, &iommu_device_list);
-	spin_unlock(&iommu_device_lock);
+	mutex_unlock(&iommu_device_lock);
 
 	for (int i = 0; i < ARRAY_SIZE(iommu_buses) && !err; i++)
 		err = bus_iommu_probe(iommu_buses[i]);
@@ -279,9 +279,9 @@ void iommu_device_unregister(struct iommu_device *iommu)
 	for (int i = 0; i < ARRAY_SIZE(iommu_buses); i++)
 		bus_for_each_dev(iommu_buses[i], NULL, iommu, remove_iommu_group);
 
-	spin_lock(&iommu_device_lock);
+	mutex_lock(&iommu_device_lock);
 	list_del(&iommu->list);
-	spin_unlock(&iommu_device_lock);
+	mutex_unlock(&iommu_device_lock);
 
 	/* Pairs with the alloc in generic_single_device_group() */
 	iommu_group_put(iommu->singleton_group);
@@ -316,9 +316,9 @@ int iommu_device_register_bus(struct iommu_device *iommu,
 	if (err)
 		return err;
 
-	spin_lock(&iommu_device_lock);
+	mutex_lock(&iommu_device_lock);
 	list_add_tail(&iommu->list, &iommu_device_list);
-	spin_unlock(&iommu_device_lock);
+	mutex_unlock(&iommu_device_lock);
 
 	err = bus_iommu_probe(bus);
 	if (err) {
@@ -2033,9 +2033,9 @@ bool iommu_present(const struct bus_type *bus)
 
 	for (int i = 0; i < ARRAY_SIZE(iommu_buses); i++) {
 		if (iommu_buses[i] == bus) {
-			spin_lock(&iommu_device_lock);
+			mutex_lock(&iommu_device_lock);
 			ret = !list_empty(&iommu_device_list);
-			spin_unlock(&iommu_device_lock);
+			mutex_unlock(&iommu_device_lock);
 		}
 	}
 	return ret;
@@ -2983,13 +2983,13 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
 	const struct iommu_ops *ops = NULL;
 	struct iommu_device *iommu;
 
-	spin_lock(&iommu_device_lock);
+	mutex_lock(&iommu_device_lock);
 	list_for_each_entry(iommu, &iommu_device_list, list)
 		if (iommu->fwnode == fwnode) {
 			ops = iommu->ops;
 			break;
 		}
-	spin_unlock(&iommu_device_lock);
+	mutex_unlock(&iommu_device_lock);
 	return ops;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/2] iommu: Probe right iommu_ops for device
  2024-01-26 10:53 [PATCH v2 0/2] Use right iommu_ops for mock device Lu Baolu
  2024-01-26 10:53 ` [PATCH v2 1/2] iommu: Use mutex instead of spinlock for iommu_device_list Lu Baolu
@ 2024-01-26 10:53 ` Lu Baolu
  2024-01-29  8:07   ` Tian, Kevin
                     ` (2 more replies)
  1 sibling, 3 replies; 8+ messages in thread
From: Lu Baolu @ 2024-01-26 10:53 UTC (permalink / raw)
  To: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy
  Cc: iommu, linux-kernel, Lu Baolu

Previously, in the iommu probe device path, __iommu_probe_device() gets
the iommu_ops for the device from dev->iommu->fwspec if this field has
been initialized before probing. Otherwise, it is assumed that only one
of Intel, AMD, s390, PAMU or legacy SMMUv2 can be present, hence it's
feasible to lookup the global iommu device list and use the iommu_ops of
the first iommu device which has no dev->iommu->fwspec.

The assumption mentioned above is no longer correct with the introduction
of the IOMMUFD mock device on x86 platforms. There might be multiple
instances of iommu drivers, none of which have the dev->iommu->fwspec
field populated.

Probe the right iommu_ops for device by iterating over all the global
drivers and call their probe function to find a match.

Fixes: 17de3f5fdd35 ("iommu: Retire bus ops")
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/iommu.c | 76 +++++++++++++++++++++++++++----------------
 1 file changed, 48 insertions(+), 28 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0af0b5544072..2cdb01e411fa 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -396,30 +396,69 @@ void dev_iommu_priv_set(struct device *dev, void *priv)
 }
 EXPORT_SYMBOL_GPL(dev_iommu_priv_set);
 
+static struct iommu_device *
+__iommu_do_probe_device(struct device *dev, const struct iommu_ops *ops)
+{
+	struct iommu_device *iommu_dev;
+
+	if (!try_module_get(ops->owner))
+		return ERR_PTR(-EINVAL);
+
+	iommu_dev = ops->probe_device(dev);
+	if (IS_ERR(iommu_dev))
+		module_put(ops->owner);
+
+	return iommu_dev;
+}
+
+static struct iommu_device *iommu_probe_device_ops(struct device *dev)
+{
+	struct iommu_device *iter, *iommu_dev = ERR_PTR(-ENODEV);
+	struct iommu_fwspec *fwspec;
+
+	/*
+	 * For FDT-based systems and ACPI IORT/VIOT, drivers register IOMMU
+	 * instances with non-NULL fwnodes, and client devices should have been
+	 * identified with a fwspec by this point. Otherwise, we will iterate
+	 * over all the global drivers and call their probe function to find a
+	 * match.
+	 */
+	fwspec = dev_iommu_fwspec_get(dev);
+	if (fwspec && fwspec->ops)
+		return __iommu_do_probe_device(dev, fwspec->ops);
+
+	mutex_lock(&iommu_device_lock);
+	list_for_each_entry(iter, &iommu_device_list, list) {
+		iommu_dev = __iommu_do_probe_device(dev, iter->ops);
+		if (!IS_ERR(iommu_dev))
+			break;
+	}
+	mutex_unlock(&iommu_device_lock);
+
+	return iommu_dev;
+}
+
 /*
  * Init the dev->iommu and dev->iommu_group in the struct device and get the
  * driver probed
  */
-static int iommu_init_device(struct device *dev, const struct iommu_ops *ops)
+static int iommu_init_device(struct device *dev)
 {
 	struct iommu_device *iommu_dev;
+	const struct iommu_ops *ops;
 	struct iommu_group *group;
 	int ret;
 
 	if (!dev_iommu_get(dev))
 		return -ENOMEM;
 
-	if (!try_module_get(ops->owner)) {
-		ret = -EINVAL;
-		goto err_free;
-	}
-
-	iommu_dev = ops->probe_device(dev);
+	iommu_dev = iommu_probe_device_ops(dev);
 	if (IS_ERR(iommu_dev)) {
 		ret = PTR_ERR(iommu_dev);
-		goto err_module_put;
+		goto err_free;
 	}
 	dev->iommu->iommu_dev = iommu_dev;
+	ops = dev_iommu_ops(dev);
 
 	ret = iommu_device_link(iommu_dev, dev);
 	if (ret)
@@ -444,7 +483,6 @@ static int iommu_init_device(struct device *dev, const struct iommu_ops *ops)
 err_release:
 	if (ops->release_device)
 		ops->release_device(dev);
-err_module_put:
 	module_put(ops->owner);
 err_free:
 	dev->iommu->iommu_dev = NULL;
@@ -499,28 +537,10 @@ DEFINE_MUTEX(iommu_probe_device_lock);
 
 static int __iommu_probe_device(struct device *dev, struct list_head *group_list)
 {
-	const struct iommu_ops *ops;
-	struct iommu_fwspec *fwspec;
 	struct iommu_group *group;
 	struct group_device *gdev;
 	int ret;
 
-	/*
-	 * For FDT-based systems and ACPI IORT/VIOT, drivers register IOMMU
-	 * instances with non-NULL fwnodes, and client devices should have been
-	 * identified with a fwspec by this point. Otherwise, we can currently
-	 * assume that only one of Intel, AMD, s390, PAMU or legacy SMMUv2 can
-	 * be present, and that any of their registered instances has suitable
-	 * ops for probing, and thus cheekily co-opt the same mechanism.
-	 */
-	fwspec = dev_iommu_fwspec_get(dev);
-	if (fwspec && fwspec->ops)
-		ops = fwspec->ops;
-	else
-		ops = iommu_ops_from_fwnode(NULL);
-
-	if (!ops)
-		return -ENODEV;
 	/*
 	 * Serialise to avoid races between IOMMU drivers registering in
 	 * parallel and/or the "replay" calls from ACPI/OF code via client
@@ -534,7 +554,7 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
 	if (dev->iommu_group)
 		return 0;
 
-	ret = iommu_init_device(dev, ops);
+	ret = iommu_init_device(dev);
 	if (ret)
 		return ret;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* RE: [PATCH v2 1/2] iommu: Use mutex instead of spinlock for iommu_device_list
  2024-01-26 10:53 ` [PATCH v2 1/2] iommu: Use mutex instead of spinlock for iommu_device_list Lu Baolu
@ 2024-01-29  8:04   ` Tian, Kevin
  2024-01-29 14:57     ` Jason Gunthorpe
  0 siblings, 1 reply; 8+ messages in thread
From: Tian, Kevin @ 2024-01-29  8:04 UTC (permalink / raw)
  To: Lu Baolu, Jason Gunthorpe, Joerg Roedel, Will Deacon, Robin Murphy
  Cc: iommu, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Friday, January 26, 2024 6:54 PM
> 
> The iommu_device_lock spinlock was used to protect the iommu device
> list. However, there is no requirement to access the iommu device
> list in interrupt context. Therefore, a mutex is sufficient.

I don't think that interrupt is the reason for spinlock otherwise
spin_lock_irqsave() should be used instead.

> 
> This also prepares for the next change, which will iterate the iommu
> device list and call the probe callback within the locking area.
> 

Given the touched paths are all slow paths:

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH v2 2/2] iommu: Probe right iommu_ops for device
  2024-01-26 10:53 ` [PATCH v2 2/2] iommu: Probe right iommu_ops for device Lu Baolu
@ 2024-01-29  8:07   ` Tian, Kevin
  2024-01-29 14:58   ` Robin Murphy
  2024-01-29 15:37   ` Jason Gunthorpe
  2 siblings, 0 replies; 8+ messages in thread
From: Tian, Kevin @ 2024-01-29  8:07 UTC (permalink / raw)
  To: Lu Baolu, Jason Gunthorpe, Joerg Roedel, Will Deacon, Robin Murphy
  Cc: iommu, linux-kernel

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Friday, January 26, 2024 6:54 PM
> 
> Previously, in the iommu probe device path, __iommu_probe_device() gets
> the iommu_ops for the device from dev->iommu->fwspec if this field has
> been initialized before probing. Otherwise, it is assumed that only one
> of Intel, AMD, s390, PAMU or legacy SMMUv2 can be present, hence it's
> feasible to lookup the global iommu device list and use the iommu_ops of
> the first iommu device which has no dev->iommu->fwspec.
> 
> The assumption mentioned above is no longer correct with the introduction
> of the IOMMUFD mock device on x86 platforms. There might be multiple
> instances of iommu drivers, none of which have the dev->iommu->fwspec
> field populated.
> 
> Probe the right iommu_ops for device by iterating over all the global
> drivers and call their probe function to find a match.
> 
> Fixes: 17de3f5fdd35 ("iommu: Retire bus ops")
> Cc: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with a nit:

> +
> +	mutex_lock(&iommu_device_lock);
> +	list_for_each_entry(iter, &iommu_device_list, list) {
> +		iommu_dev = __iommu_do_probe_device(dev, iter->ops);
> +		if (!IS_ERR(iommu_dev))
> +			break;
> +	}

here could skip iommu with a valid fwspec.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] iommu: Use mutex instead of spinlock for iommu_device_list
  2024-01-29  8:04   ` Tian, Kevin
@ 2024-01-29 14:57     ` Jason Gunthorpe
  0 siblings, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2024-01-29 14:57 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Lu Baolu, Joerg Roedel, Will Deacon, Robin Murphy, iommu, linux-kernel

On Mon, Jan 29, 2024 at 08:04:35AM +0000, Tian, Kevin wrote:
> > From: Lu Baolu <baolu.lu@linux.intel.com>
> > Sent: Friday, January 26, 2024 6:54 PM
> > 
> > The iommu_device_lock spinlock was used to protect the iommu device
> > list. However, there is no requirement to access the iommu device
> > list in interrupt context. Therefore, a mutex is sufficient.
> 
> I don't think that interrupt is the reason for spinlock otherwise
> spin_lock_irqsave() should be used instead.

Right, there is no touch of this from an interrupt

I suspect it is following the the general kernel wisdom that a
spinlock is better if the critical sections are very small.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] iommu: Probe right iommu_ops for device
  2024-01-26 10:53 ` [PATCH v2 2/2] iommu: Probe right iommu_ops for device Lu Baolu
  2024-01-29  8:07   ` Tian, Kevin
@ 2024-01-29 14:58   ` Robin Murphy
  2024-01-29 15:37   ` Jason Gunthorpe
  2 siblings, 0 replies; 8+ messages in thread
From: Robin Murphy @ 2024-01-29 14:58 UTC (permalink / raw)
  To: Lu Baolu, Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon
  Cc: iommu, linux-kernel

On 2024-01-26 10:53 am, Lu Baolu wrote:
> Previously, in the iommu probe device path, __iommu_probe_device() gets
> the iommu_ops for the device from dev->iommu->fwspec if this field has
> been initialized before probing. Otherwise, it is assumed that only one
> of Intel, AMD, s390, PAMU or legacy SMMUv2 can be present, hence it's
> feasible to lookup the global iommu device list and use the iommu_ops of
> the first iommu device which has no dev->iommu->fwspec.
> 
> The assumption mentioned above is no longer correct with the introduction
> of the IOMMUFD mock device on x86 platforms. There might be multiple
> instances of iommu drivers, none of which have the dev->iommu->fwspec
> field populated.
> 
> Probe the right iommu_ops for device by iterating over all the global
> drivers and call their probe function to find a match.

This will now break several drivers which are no longer expecting to see 
a ->probe_device call without having seen the corresponding ->of_xlate 
call first.

Can we please just do the trivial fix to iommufd itself? At this point I 
don't mind if it's your v1, the even simpler one I proposed 2 months 
ago[1], or anything else similarly self-contained.

Thanks,
Robin.

[1] 
https://lore.kernel.org/linux-iommu/e365c08b21a8d0b60e6f5d1411be6701c1a06a53.1701165201.git.robin.murphy@arm.com/

> Fixes: 17de3f5fdd35 ("iommu: Retire bus ops")
> Cc: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>   drivers/iommu/iommu.c | 76 +++++++++++++++++++++++++++----------------
>   1 file changed, 48 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 0af0b5544072..2cdb01e411fa 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -396,30 +396,69 @@ void dev_iommu_priv_set(struct device *dev, void *priv)
>   }
>   EXPORT_SYMBOL_GPL(dev_iommu_priv_set);
>   
> +static struct iommu_device *
> +__iommu_do_probe_device(struct device *dev, const struct iommu_ops *ops)
> +{
> +	struct iommu_device *iommu_dev;
> +
> +	if (!try_module_get(ops->owner))
> +		return ERR_PTR(-EINVAL);
> +
> +	iommu_dev = ops->probe_device(dev);
> +	if (IS_ERR(iommu_dev))
> +		module_put(ops->owner);
> +
> +	return iommu_dev;
> +}
> +
> +static struct iommu_device *iommu_probe_device_ops(struct device *dev)
> +{
> +	struct iommu_device *iter, *iommu_dev = ERR_PTR(-ENODEV);
> +	struct iommu_fwspec *fwspec;
> +
> +	/*
> +	 * For FDT-based systems and ACPI IORT/VIOT, drivers register IOMMU
> +	 * instances with non-NULL fwnodes, and client devices should have been
> +	 * identified with a fwspec by this point. Otherwise, we will iterate
> +	 * over all the global drivers and call their probe function to find a
> +	 * match.
> +	 */
> +	fwspec = dev_iommu_fwspec_get(dev);
> +	if (fwspec && fwspec->ops)
> +		return __iommu_do_probe_device(dev, fwspec->ops);
> +
> +	mutex_lock(&iommu_device_lock);
> +	list_for_each_entry(iter, &iommu_device_list, list) {
> +		iommu_dev = __iommu_do_probe_device(dev, iter->ops);
> +		if (!IS_ERR(iommu_dev))
> +			break;
> +	}
> +	mutex_unlock(&iommu_device_lock);
> +
> +	return iommu_dev;
> +}
> +
>   /*
>    * Init the dev->iommu and dev->iommu_group in the struct device and get the
>    * driver probed
>    */
> -static int iommu_init_device(struct device *dev, const struct iommu_ops *ops)
> +static int iommu_init_device(struct device *dev)
>   {
>   	struct iommu_device *iommu_dev;
> +	const struct iommu_ops *ops;
>   	struct iommu_group *group;
>   	int ret;
>   
>   	if (!dev_iommu_get(dev))
>   		return -ENOMEM;
>   
> -	if (!try_module_get(ops->owner)) {
> -		ret = -EINVAL;
> -		goto err_free;
> -	}
> -
> -	iommu_dev = ops->probe_device(dev);
> +	iommu_dev = iommu_probe_device_ops(dev);
>   	if (IS_ERR(iommu_dev)) {
>   		ret = PTR_ERR(iommu_dev);
> -		goto err_module_put;
> +		goto err_free;
>   	}
>   	dev->iommu->iommu_dev = iommu_dev;
> +	ops = dev_iommu_ops(dev);
>   
>   	ret = iommu_device_link(iommu_dev, dev);
>   	if (ret)
> @@ -444,7 +483,6 @@ static int iommu_init_device(struct device *dev, const struct iommu_ops *ops)
>   err_release:
>   	if (ops->release_device)
>   		ops->release_device(dev);
> -err_module_put:
>   	module_put(ops->owner);
>   err_free:
>   	dev->iommu->iommu_dev = NULL;
> @@ -499,28 +537,10 @@ DEFINE_MUTEX(iommu_probe_device_lock);
>   
>   static int __iommu_probe_device(struct device *dev, struct list_head *group_list)
>   {
> -	const struct iommu_ops *ops;
> -	struct iommu_fwspec *fwspec;
>   	struct iommu_group *group;
>   	struct group_device *gdev;
>   	int ret;
>   
> -	/*
> -	 * For FDT-based systems and ACPI IORT/VIOT, drivers register IOMMU
> -	 * instances with non-NULL fwnodes, and client devices should have been
> -	 * identified with a fwspec by this point. Otherwise, we can currently
> -	 * assume that only one of Intel, AMD, s390, PAMU or legacy SMMUv2 can
> -	 * be present, and that any of their registered instances has suitable
> -	 * ops for probing, and thus cheekily co-opt the same mechanism.
> -	 */
> -	fwspec = dev_iommu_fwspec_get(dev);
> -	if (fwspec && fwspec->ops)
> -		ops = fwspec->ops;
> -	else
> -		ops = iommu_ops_from_fwnode(NULL);
> -
> -	if (!ops)
> -		return -ENODEV;
>   	/*
>   	 * Serialise to avoid races between IOMMU drivers registering in
>   	 * parallel and/or the "replay" calls from ACPI/OF code via client
> @@ -534,7 +554,7 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
>   	if (dev->iommu_group)
>   		return 0;
>   
> -	ret = iommu_init_device(dev, ops);
> +	ret = iommu_init_device(dev);
>   	if (ret)
>   		return ret;
>   

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] iommu: Probe right iommu_ops for device
  2024-01-26 10:53 ` [PATCH v2 2/2] iommu: Probe right iommu_ops for device Lu Baolu
  2024-01-29  8:07   ` Tian, Kevin
  2024-01-29 14:58   ` Robin Murphy
@ 2024-01-29 15:37   ` Jason Gunthorpe
  2 siblings, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2024-01-29 15:37 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy, iommu, linux-kernel

On Fri, Jan 26, 2024 at 06:53:41PM +0800, Lu Baolu wrote:
> Previously, in the iommu probe device path, __iommu_probe_device() gets
> the iommu_ops for the device from dev->iommu->fwspec if this field has
> been initialized before probing. Otherwise, it is assumed that only one
> of Intel, AMD, s390, PAMU or legacy SMMUv2 can be present, hence it's
> feasible to lookup the global iommu device list and use the iommu_ops of
> the first iommu device which has no dev->iommu->fwspec.
> 
> The assumption mentioned above is no longer correct with the introduction
> of the IOMMUFD mock device on x86 platforms. There might be multiple
> instances of iommu drivers, none of which have the dev->iommu->fwspec
> field populated.
> 
> Probe the right iommu_ops for device by iterating over all the global
> drivers and call their probe function to find a match.
> 
> Fixes: 17de3f5fdd35 ("iommu: Retire bus ops")
> Cc: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/iommu.c | 76 +++++++++++++++++++++++++++----------------
>  1 file changed, 48 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 0af0b5544072..2cdb01e411fa 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -396,30 +396,69 @@ void dev_iommu_priv_set(struct device *dev, void *priv)
>  }
>  EXPORT_SYMBOL_GPL(dev_iommu_priv_set);
>  
> +static struct iommu_device *
> +__iommu_do_probe_device(struct device *dev, const struct iommu_ops *ops)
> +{
> +	struct iommu_device *iommu_dev;
> +
> +	if (!try_module_get(ops->owner))
> +		return ERR_PTR(-EINVAL);
> +
> +	iommu_dev = ops->probe_device(dev);
> +	if (IS_ERR(iommu_dev))
> +		module_put(ops->owner);

This doesn't really do enough to protect against races, to do that
fully we need to have iommu_device_unregister() do some better
locking, and fix fwspec->ops lifecycle somehow.. Remember that module
ref counts don't prevent iommu_device_unregister_bus() concurrency.

So, it would be simpler to leave the try_module_get in the
iommu_init_device(), just move it after the probe_device call.

> +static struct iommu_device *iommu_probe_device_ops(struct device *dev)
> +{
> +	struct iommu_device *iter, *iommu_dev = ERR_PTR(-ENODEV);
> +	struct iommu_fwspec *fwspec;
> +
> +	/*
> +	 * For FDT-based systems and ACPI IORT/VIOT, drivers register IOMMU
> +	 * instances with non-NULL fwnodes, and client devices should have been
> +	 * identified with a fwspec by this point. Otherwise, we will iterate
> +	 * over all the global drivers and call their probe function to find a
> +	 * match.
> +	 */
> +	fwspec = dev_iommu_fwspec_get(dev);
> +	if (fwspec && fwspec->ops)
> +		return __iommu_do_probe_device(dev, fwspec->ops);
> +
> +	mutex_lock(&iommu_device_lock);
> +	list_for_each_entry(iter, &iommu_device_list, list) {
> +		iommu_dev = __iommu_do_probe_device(dev, iter->ops);
> +		if (!IS_ERR(iommu_dev))
> +			break;

This does need to skip duplicate ops though (see my version), we don't
want to call the same driver many times. And Kevin and Robin are
right, since we recently removed a bunch of fwpsec checks from drivers
the core code must now never call a fwspec expecting driver without a
fwspec. (check for !iommu->fwnode)

Otherwise this looks fine for me, thanks

Jason

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-01-29 15:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-26 10:53 [PATCH v2 0/2] Use right iommu_ops for mock device Lu Baolu
2024-01-26 10:53 ` [PATCH v2 1/2] iommu: Use mutex instead of spinlock for iommu_device_list Lu Baolu
2024-01-29  8:04   ` Tian, Kevin
2024-01-29 14:57     ` Jason Gunthorpe
2024-01-26 10:53 ` [PATCH v2 2/2] iommu: Probe right iommu_ops for device Lu Baolu
2024-01-29  8:07   ` Tian, Kevin
2024-01-29 14:58   ` Robin Murphy
2024-01-29 15:37   ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).