* [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
@ 2022-06-22 12:04 Robin Murphy
  2022-06-22 12:04 ` [PATCH v2 2/2] vfio: Use device_iommu_capable() Robin Murphy
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Robin Murphy @ 2022-06-22 12:04 UTC
  To: alex.williamson, cohuck; +Cc: kvm, iommu, iommu, linux-kernel, jgg

Since IOMMU groups are mandatory for drivers to support, it stands to
reason that any device which has been successfully be added to a group
must be on a bus supported by that IOMMU driver, and therefore a domain
viable for any device in the group must be viable for all devices in
the group. This already has to be the case for the IOMMU API's internal
default domain, for instance. Thus even if the group contains devices on
different buses, that can only mean that the IOMMU driver actually
supports such an odd topology, and so without loss of generality we can
expect the bus type of any device in a group to be suitable for IOMMU
API calls.

Replace vfio_bus_type() with a simple call to resolve an appropriate
member device from which to then derive a bus type. This is also a step
towards removing the vague bus-based interfaces from the IOMMU API, when
we can subsequently switch to using this device directly.

Furthermore, scrutiny reveals a lack of protection for the bus being
removed while vfio_iommu_type1_attach_group() is using it; the reference
that VFIO holds on the iommu_group ensures that data remains valid, but
does not prevent the group's membership changing underfoot. Holding the
vfio_device for as long as we need here also neatly solves this.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---

After sleeping on it, I decided to type up the helper function approach
to see how it looked in practice, and in doing so realised that with one
more tweak it could also subsume the locking out of the common paths as
well, so end up being a self-contained way for type1 to take care of its
own concern, which I rather like.

 drivers/vfio/vfio.c             | 18 +++++++++++++++++-
 drivers/vfio/vfio.h             |  3 +++
 drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++-------------------
 3 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 61e71c1154be..73bab04880d0 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -448,7 +448,7 @@ static void vfio_group_get(struct vfio_group *group)
  * Device objects - create, release, get, put, search
  */
 /* Device reference always implies a group reference */
-static void vfio_device_put(struct vfio_device *device)
+void vfio_device_put(struct vfio_device *device)
 {
 	if (refcount_dec_and_test(&device->refcount))
 		complete(&device->comp);
@@ -475,6 +475,22 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
 	return NULL;
 }
 
+struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group)
+{
+	struct vfio_group *group = vfio_group_get_from_iommu(iommu_group);
+	struct vfio_device *device;
+
+	mutex_lock(&group->device_lock);
+	list_for_each_entry(device, &group->device_list, group_next) {
+		if (vfio_device_try_get(device)) {
+			mutex_unlock(&group->device_lock);
+			return device;
+		}
+	}
+	mutex_unlock(&group->device_lock);
+	return NULL;
+}
+
 /*
  * VFIO driver API
  */
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index a67130221151..e8f21e64541b 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -70,3 +70,6 @@ struct vfio_iommu_driver_ops {
 
 int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
 void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops);
+
+struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group);
+void vfio_device_put(struct vfio_device *device);
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index c13b9290e357..e38b8bfde677 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1679,18 +1679,6 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 	return ret;
 }
 
-static int vfio_bus_type(struct device *dev, void *data)
-{
-	struct bus_type **bus = data;
-
-	if (*bus && *bus != dev->bus)
-		return -EINVAL;
-
-	*bus = dev->bus;
-
-	return 0;
-}
-
 static int vfio_iommu_replay(struct vfio_iommu *iommu,
 			     struct vfio_domain *domain)
 {
@@ -2159,7 +2147,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_iommu_group *group;
 	struct vfio_domain *domain, *d;
-	struct bus_type *bus = NULL;
+	struct vfio_device *iommu_api_dev;
 	bool resv_msi, msi_remap;
 	phys_addr_t resv_msi_base = 0;
 	struct iommu_domain_geometry *geo;
@@ -2192,18 +2180,19 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_unlock;
 	}
 
-	/* Determine bus_type in order to allocate a domain */
-	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
-	if (ret)
+	/* Resolve the group back to a member device for IOMMU API ops */
+	ret = -ENODEV;
+	iommu_api_dev = vfio_device_get_from_iommu(iommu_group);
+	if (!iommu_api_dev)
 		goto out_free_group;
 
 	ret = -ENOMEM;
 	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
 	if (!domain)
-		goto out_free_group;
+		goto out_put_dev;
 
 	ret = -EIO;
-	domain->domain = iommu_domain_alloc(bus);
+	domain->domain = iommu_domain_alloc(iommu_api_dev->dev->bus);
 	if (!domain->domain)
 		goto out_free_domain;
 
@@ -2258,7 +2247,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	list_add(&group->next, &domain->group_list);
 
 	msi_remap = irq_domain_check_msi_remap() ||
-		    iommu_capable(bus, IOMMU_CAP_INTR_REMAP);
+		    iommu_capable(iommu_api_dev->dev->bus, IOMMU_CAP_INTR_REMAP);
 
 	if (!allow_unsafe_interrupts && !msi_remap) {
 		pr_warn("%s: No interrupt remapping support.  Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
@@ -2331,6 +2320,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	iommu->num_non_pinned_groups++;
 	mutex_unlock(&iommu->lock);
 	vfio_iommu_resv_free(&group_resv_regions);
+	vfio_device_put(iommu_api_dev);
 
 	return 0;
 
@@ -2342,6 +2332,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	vfio_iommu_resv_free(&group_resv_regions);
 out_free_domain:
 	kfree(domain);
+out_put_dev:
+	vfio_device_put(iommu_api_dev);
 out_free_group:
 	kfree(group);
 out_unlock:
-- 
2.36.1.dirty
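
The change in how the bus type is found can be modelled outside the
kernel. What follows is a minimal user-space sketch, not kernel code:
the two structs are cut-down stand-ins, and bus_if_uniform() and
bus_of_any_member() are hypothetical names for the removed
vfio_bus_type() walk and for the "take any member device" approach
respectively. Locking and reference counting are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (assumption). */
struct bus_type { const char *name; };
struct device { const struct bus_type *bus; };

/*
 * Old behaviour: walk every member and fail (NULL here, -EINVAL in the
 * kernel) as soon as two members sit on different buses.
 */
static const struct bus_type *bus_if_uniform(const struct device *devs, size_t n)
{
	const struct bus_type *bus = NULL;

	assert(devs || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (bus && bus != devs[i].bus)
			return NULL;
		bus = devs[i].bus;
	}
	return bus;
}

/*
 * New behaviour: any member is assumed to be representative, so the
 * first one found is enough.
 */
static const struct bus_type *bus_of_any_member(const struct device *devs, size_t n)
{
	assert(devs || n == 0);
	return n ? devs[0].bus : NULL;
}
```

For a group whose members are on two different buses the first helper
gives up, while the second still returns a usable bus, which is the
behavioural difference the commit message argues is safe.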



* [PATCH v2 2/2] vfio: Use device_iommu_capable()
  2022-06-22 12:04 [PATCH v2 1/2] vfio/type1: Simplify bus_type determination Robin Murphy
@ 2022-06-22 12:04 ` Robin Murphy
  2022-06-23  1:47   ` Baolu Lu
  2022-06-22 22:17 ` [PATCH v2 1/2] vfio/type1: Simplify bus_type determination Alex Williamson
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Robin Murphy @ 2022-06-22 12:04 UTC
  To: alex.williamson, cohuck; +Cc: kvm, iommu, iommu, linux-kernel, jgg

Use the new interface to check the capabilities for our device
specifically.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/vfio/vfio.c             | 2 +-
 drivers/vfio/vfio_iommu_type1.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 73bab04880d0..765d68192c88 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -621,7 +621,7 @@ int vfio_register_group_dev(struct vfio_device *device)
 	 * VFIO always sets IOMMU_CACHE because we offer no way for userspace to
 	 * restore cache coherency.
 	 */
-	if (!iommu_capable(device->dev->bus, IOMMU_CAP_CACHE_COHERENCY))
+	if (!device_iommu_capable(device->dev, IOMMU_CAP_CACHE_COHERENCY))
 		return -EINVAL;
 
 	return __vfio_register_dev(device,
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index e38b8bfde677..2107e95eb743 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2247,7 +2247,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	list_add(&group->next, &domain->group_list);
 
 	msi_remap = irq_domain_check_msi_remap() ||
-		    iommu_capable(iommu_api_dev->dev->bus, IOMMU_CAP_INTR_REMAP);
+		    device_iommu_capable(iommu_api_dev->dev, IOMMU_CAP_INTR_REMAP);
 
 	if (!allow_unsafe_interrupts && !msi_remap) {
 		pr_warn("%s: No interrupt remapping support.  Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
-- 
2.36.1.dirty
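
The difference between the bus-based and the device-based capability
query can be illustrated with a simplified user-space model. This is a
sketch under assumptions: the structs below are not the kernel
layouts (in particular, a per-device iommu_ops pointer is a shortcut
for however the kernel reaches the IOMMU instance serving a device),
and bus_capable(), dev_capable(), cap_yes() and cap_no() are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum cap { CAP_INTR_REMAP };

/* Simplified model: one capability callback per IOMMU instance. */
struct iommu_ops { bool (*capable)(enum cap cap); };
struct bus_type { const struct iommu_ops *iommu_ops; };
struct device {
	const struct bus_type *bus;
	const struct iommu_ops *iommu_ops;	/* instance serving this device */
};

/* Sample callbacks for two IOMMU instances with different abilities. */
static bool cap_yes(enum cap cap) { (void)cap; return true; }
static bool cap_no(enum cap cap) { (void)cap; return false; }

/* Bus-based query: one answer for everything on the bus. */
static bool bus_capable(const struct bus_type *bus, enum cap cap)
{
	return bus && bus->iommu_ops && bus->iommu_ops->capable(cap);
}

/* Device-based query: ask the instance that actually serves the device. */
static bool dev_capable(const struct device *dev, enum cap cap)
{
	assert(dev);
	return dev->iommu_ops && dev->iommu_ops->capable(cap);
}
```

In this model a device can sit on a bus whose bus-wide answer is "yes"
while the instance behind that particular device says "no", and only
the device-based query reports that.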



* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-22 12:04 [PATCH v2 1/2] vfio/type1: Simplify bus_type determination Robin Murphy
  2022-06-22 12:04 ` [PATCH v2 2/2] vfio: Use device_iommu_capable() Robin Murphy
@ 2022-06-22 22:17 ` Alex Williamson
  2022-06-23  8:46   ` Tian, Kevin
  2022-06-23 12:23   ` Robin Murphy
  2022-06-23  1:46 ` Baolu Lu
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 19+ messages in thread
From: Alex Williamson @ 2022-06-22 22:17 UTC
  To: Robin Murphy; +Cc: cohuck, jgg, iommu, iommu, kvm, linux-kernel

On Wed, 22 Jun 2022 13:04:11 +0100
Robin Murphy <robin.murphy@arm.com> wrote:

> Since IOMMU groups are mandatory for drivers to support, it stands to
> reason that any device which has been successfully be added to a group

s/be //

> must be on a bus supported by that IOMMU driver, and therefore a domain
> viable for any device in the group must be viable for all devices in
> the group. This already has to be the case for the IOMMU API's internal
> default domain, for instance. Thus even if the group contains devices on
> different buses, that can only mean that the IOMMU driver actually
> supports such an odd topology, and so without loss of generality we can
> expect the bus type of any device in a group to be suitable for IOMMU
> API calls.
> 
> Replace vfio_bus_type() with a simple call to resolve an appropriate
> member device from which to then derive a bus type. This is also a step
> towards removing the vague bus-based interfaces from the IOMMU API, when
> we can subsequently switch to using this device directly.
> 
> Furthermore, scrutiny reveals a lack of protection for the bus being
> removed while vfio_iommu_type1_attach_group() is using it; the reference
> that VFIO holds on the iommu_group ensures that data remains valid, but
> does not prevent the group's membership changing underfoot. Holding the
> vfio_device for as long as we need here also neatly solves this.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
> 
> After sleeping on it, I decided to type up the helper function approach
> to see how it looked in practice, and in doing so realised that with one
> more tweak it could also subsume the locking out of the common paths as
> well, so end up being a self-contained way for type1 to take care of its
> own concern, which I rather like.
> 
>  drivers/vfio/vfio.c             | 18 +++++++++++++++++-
>  drivers/vfio/vfio.h             |  3 +++
>  drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++-------------------
>  3 files changed, 31 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 61e71c1154be..73bab04880d0 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -448,7 +448,7 @@ static void vfio_group_get(struct vfio_group *group)
>   * Device objects - create, release, get, put, search
>   */
>  /* Device reference always implies a group reference */
> -static void vfio_device_put(struct vfio_device *device)
> +void vfio_device_put(struct vfio_device *device)
>  {
>  	if (refcount_dec_and_test(&device->refcount))
>  		complete(&device->comp);
> @@ -475,6 +475,22 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
>  	return NULL;
>  }
>  
> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group)
> +{
> +	struct vfio_group *group = vfio_group_get_from_iommu(iommu_group);
> +	struct vfio_device *device;

Check group for NULL.

> +
> +	mutex_lock(&group->device_lock);
> +	list_for_each_entry(device, &group->device_list, group_next) {
> +		if (vfio_device_try_get(device)) {
> +			mutex_unlock(&group->device_lock);
> +			return device;
> +		}
> +	}
> +	mutex_unlock(&group->device_lock);
> +	return NULL;

No vfio_group_put() on either path.

> +}
> +
>  /*
>   * VFIO driver API
>   */
> diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
> index a67130221151..e8f21e64541b 100644
> --- a/drivers/vfio/vfio.h
> +++ b/drivers/vfio/vfio.h
> @@ -70,3 +70,6 @@ struct vfio_iommu_driver_ops {
>  
>  int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
>  void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops);
> +
> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group);
> +void vfio_device_put(struct vfio_device *device);
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index c13b9290e357..e38b8bfde677 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1679,18 +1679,6 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>  	return ret;
>  }
>  
> -static int vfio_bus_type(struct device *dev, void *data)
> -{
> -	struct bus_type **bus = data;
> -
> -	if (*bus && *bus != dev->bus)
> -		return -EINVAL;
> -
> -	*bus = dev->bus;
> -
> -	return 0;
> -}
> -
>  static int vfio_iommu_replay(struct vfio_iommu *iommu,
>  			     struct vfio_domain *domain)
>  {
> @@ -2159,7 +2147,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>  	struct vfio_iommu *iommu = iommu_data;
>  	struct vfio_iommu_group *group;
>  	struct vfio_domain *domain, *d;
> -	struct bus_type *bus = NULL;
> +	struct vfio_device *iommu_api_dev;
>  	bool resv_msi, msi_remap;
>  	phys_addr_t resv_msi_base = 0;
>  	struct iommu_domain_geometry *geo;
> @@ -2192,18 +2180,19 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>  		goto out_unlock;
>  	}
>  
> -	/* Determine bus_type in order to allocate a domain */
> -	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
> -	if (ret)
> +	/* Resolve the group back to a member device for IOMMU API ops */
> +	ret = -ENODEV;
> +	iommu_api_dev = vfio_device_get_from_iommu(iommu_group);
> +	if (!iommu_api_dev)
>  		goto out_free_group;
>  
>  	ret = -ENOMEM;
>  	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
>  	if (!domain)
> -		goto out_free_group;
> +		goto out_put_dev;
>  
>  	ret = -EIO;
> -	domain->domain = iommu_domain_alloc(bus);
> +	domain->domain = iommu_domain_alloc(iommu_api_dev->dev->bus);

It makes sense to move away from a bus centric interface to iommu ops
and I can see that having a device interface when we have device level
address-ability within a group makes sense, but does it make sense to
only have that device level interface?  For example, if an iommu_group
is going to remain an aspect of the iommu subsystem, shouldn't we be
able to allocate a domain and test capabilities based on the group and
the iommu driver should have enough embedded information reachable from
the struct iommu_group to do those things?  This "perform group level
operations based on an arbitrary device in the group" is pretty klunky.
Thanks,

Alex

>  	if (!domain->domain)
>  		goto out_free_domain;
>  
> @@ -2258,7 +2247,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>  	list_add(&group->next, &domain->group_list);
>  
>  	msi_remap = irq_domain_check_msi_remap() ||
> -		    iommu_capable(bus, IOMMU_CAP_INTR_REMAP);
> +		    iommu_capable(iommu_api_dev->dev->bus, IOMMU_CAP_INTR_REMAP);
>  
>  	if (!allow_unsafe_interrupts && !msi_remap) {
>  		pr_warn("%s: No interrupt remapping support.  Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
> @@ -2331,6 +2320,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>  	iommu->num_non_pinned_groups++;
>  	mutex_unlock(&iommu->lock);
>  	vfio_iommu_resv_free(&group_resv_regions);
> +	vfio_device_put(iommu_api_dev);
>  
>  	return 0;
>  
> @@ -2342,6 +2332,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>  	vfio_iommu_resv_free(&group_resv_regions);
>  out_free_domain:
>  	kfree(domain);
> +out_put_dev:
> +	vfio_device_put(iommu_api_dev);
>  out_free_group:
>  	kfree(group);
>  out_unlock:
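
The two review comments on vfio_device_get_from_iommu() (check the
group for NULL, and drop the group reference on both exit paths) can
be sketched in user space as follows. This is a hypothetical model,
not the revised patch: the structs are cut down, group_get(),
group_put() and device_try_get() stand in for
vfio_group_get_from_iommu(), vfio_group_put() and
vfio_device_try_get(), and the device_lock mutex is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-ins; only the fields this sketch needs (assumption). */
struct vfio_device { int refs; bool going_away; struct vfio_device *next; };
struct vfio_group { int refs; struct vfio_device *devices; };

/* Lookup may fail; on success it takes a group reference. */
static struct vfio_group *group_get(struct vfio_group *group)
{
	if (group)
		group->refs++;
	return group;
}

static void group_put(struct vfio_group *group)
{
	assert(group->refs > 0);
	group->refs--;
}

/* Refuses devices that are already being torn down. */
static bool device_try_get(struct vfio_device *device)
{
	if (device->going_away)
		return false;
	device->refs++;
	return true;
}

static struct vfio_device *device_get_from_group(struct vfio_group *maybe_group)
{
	struct vfio_group *group = group_get(maybe_group);
	struct vfio_device *device;

	if (!group)		/* first comment: the lookup can return NULL */
		return NULL;

	for (device = group->devices; device; device = device->next)
		if (device_try_get(device))
			break;

	group_put(group);	/* second comment: balance the lookup on every path */
	return device;		/* NULL if no member could be pinned */
}
```

Using a single exit after the loop means the group reference is
dropped whether or not a device was found, so neither path can leak it.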



* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-22 12:04 [PATCH v2 1/2] vfio/type1: Simplify bus_type determination Robin Murphy
  2022-06-22 12:04 ` [PATCH v2 2/2] vfio: Use device_iommu_capable() Robin Murphy
  2022-06-22 22:17 ` [PATCH v2 1/2] vfio/type1: Simplify bus_type determination Alex Williamson
@ 2022-06-23  1:46 ` Baolu Lu
  2022-06-23  4:32 ` kernel test robot
  2022-06-24  1:52 ` Jason Gunthorpe
  4 siblings, 0 replies; 19+ messages in thread
From: Baolu Lu @ 2022-06-23  1:46 UTC
  To: Robin Murphy, alex.williamson, cohuck
  Cc: baolu.lu, kvm, iommu, iommu, linux-kernel, jgg

On 2022/6/22 20:04, Robin Murphy wrote:
> Since IOMMU groups are mandatory for drivers to support, it stands to
> reason that any device which has been successfully be added to a group
> must be on a bus supported by that IOMMU driver, and therefore a domain
> viable for any device in the group must be viable for all devices in
> the group. This already has to be the case for the IOMMU API's internal
> default domain, for instance. Thus even if the group contains devices on
> different buses, that can only mean that the IOMMU driver actually
> supports such an odd topology, and so without loss of generality we can
> expect the bus type of any device in a group to be suitable for IOMMU
> API calls.

Ideally we could remove bus->iommu_ops and have all IOMMU APIs go
through dev_iommu_ops().

> 
> Replace vfio_bus_type() with a simple call to resolve an appropriate
> member device from which to then derive a bus type. This is also a step
> towards removing the vague bus-based interfaces from the IOMMU API, when
> we can subsequently switch to using this device directly.
> 
> Furthermore, scrutiny reveals a lack of protection for the bus being
> removed while vfio_iommu_type1_attach_group() is using it; the reference
> that VFIO holds on the iommu_group ensures that data remains valid, but
> does not prevent the group's membership changing underfoot. Holding the
> vfio_device for as long as we need here also neatly solves this.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>

Best regards,
baolu




* Re: [PATCH v2 2/2] vfio: Use device_iommu_capable()
  2022-06-22 12:04 ` [PATCH v2 2/2] vfio: Use device_iommu_capable() Robin Murphy
@ 2022-06-23  1:47   ` Baolu Lu
  0 siblings, 0 replies; 19+ messages in thread
From: Baolu Lu @ 2022-06-23  1:47 UTC
  To: Robin Murphy, alex.williamson, cohuck
  Cc: baolu.lu, jgg, iommu, iommu, kvm, linux-kernel

On 2022/6/22 20:04, Robin Murphy wrote:
> Use the new interface to check the capabilities for our device
> specifically.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>   drivers/vfio/vfio.c             | 2 +-
>   drivers/vfio/vfio_iommu_type1.c | 2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 73bab04880d0..765d68192c88 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -621,7 +621,7 @@ int vfio_register_group_dev(struct vfio_device *device)
>   	 * VFIO always sets IOMMU_CACHE because we offer no way for userspace to
>   	 * restore cache coherency.
>   	 */
> -	if (!iommu_capable(device->dev->bus, IOMMU_CAP_CACHE_COHERENCY))
> +	if (!device_iommu_capable(device->dev, IOMMU_CAP_CACHE_COHERENCY))
>   		return -EINVAL;
>   
>   	return __vfio_register_dev(device,
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index e38b8bfde677..2107e95eb743 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -2247,7 +2247,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>   	list_add(&group->next, &domain->group_list);
>   
>   	msi_remap = irq_domain_check_msi_remap() ||
> -		    iommu_capable(iommu_api_dev->dev->bus, IOMMU_CAP_INTR_REMAP);
> +		    device_iommu_capable(iommu_api_dev->dev, IOMMU_CAP_INTR_REMAP);
>   
>   	if (!allow_unsafe_interrupts && !msi_remap) {
>   		pr_warn("%s: No interrupt remapping support.  Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>

Best regards,
baolu


* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-22 12:04 [PATCH v2 1/2] vfio/type1: Simplify bus_type determination Robin Murphy
                   ` (2 preceding siblings ...)
  2022-06-23  1:46 ` Baolu Lu
@ 2022-06-23  4:32 ` kernel test robot
  2022-06-24  1:52 ` Jason Gunthorpe
  4 siblings, 0 replies; 19+ messages in thread
From: kernel test robot @ 2022-06-23  4:32 UTC
  To: Robin Murphy, alex.williamson, cohuck
  Cc: kbuild-all, kvm, iommu, iommu, linux-kernel, jgg

Hi Robin,

I love your patch! Yet something to improve:

[auto build test ERROR on v5.19-rc3]
[also build test ERROR on linus/master next-20220622]
[cannot apply to awilliam-vfio/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Robin-Murphy/vfio-type1-Simplify-bus_type-determination/20220622-200503
base:    a111daf0c53ae91e71fd2bfe7497862d14132e3e
config: x86_64-rhel-8.3-kselftests (https://download.01.org/0day-ci/archive/20220623/202206231208.PGASmlUW-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-3) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/7a6e1ddc765bde40f879995137a2ff20cb0eda47
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Robin-Murphy/vfio-type1-Simplify-bus_type-determination/20220622-200503
        git checkout 7a6e1ddc765bde40f879995137a2ff20cb0eda47
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "vfio_device_get_from_iommu" [drivers/vfio/vfio_iommu_type1.ko] undefined!
>> ERROR: modpost: "vfio_device_put" [drivers/vfio/vfio_iommu_type1.ko] undefined!

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp


* RE: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-22 22:17 ` [PATCH v2 1/2] vfio/type1: Simplify bus_type determination Alex Williamson
@ 2022-06-23  8:46   ` Tian, Kevin
  2022-06-23 20:35     ` Alex Williamson
  2022-06-23 12:23   ` Robin Murphy
  1 sibling, 1 reply; 19+ messages in thread
From: Tian, Kevin @ 2022-06-23  8:46 UTC
  To: Alex Williamson, Robin Murphy
  Cc: cohuck, jgg, iommu, iommu, kvm, linux-kernel

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Thursday, June 23, 2022 6:17 AM
> 
> >
> >  	ret = -EIO;
> > -	domain->domain = iommu_domain_alloc(bus);
> > +	domain->domain = iommu_domain_alloc(iommu_api_dev->dev->bus);
> 
> It makes sense to move away from a bus centric interface to iommu ops
> and I can see that having a device interface when we have device level
> address-ability within a group makes sense, but does it make sense to
> only have that device level interface?  For example, if an iommu_group
> is going to remain an aspect of the iommu subsystem, shouldn't we be
> able to allocate a domain and test capabilities based on the group and
> the iommu driver should have enough embedded information reachable from
> the struct iommu_group to do those things?  This "perform group level
> operations based on an arbitrary device in the group" is pretty klunky.
> Thanks,
> 

This sounds like the right thing to do.

By the way, another alternative I'm thinking of is whether vfio_group
could record the bus info when the first device is added to it in
__vfio_register_dev(). Then we wouldn't need a group interface from
iommu for this, if vfio is the only user with such a requirement.
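
The alternative described above (remembering the bus when the first
device joins the group) might look roughly like this in a user-space
model. It is a sketch under assumptions: the "bus" and "dev_count"
fields are hypothetical and are not members of the real struct
vfio_group, and group_register_device() only stands in for the
relevant step of the registration path.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (assumption). */
struct bus_type { const char *name; };
struct device { const struct bus_type *bus; };

struct vfio_group {
	const struct bus_type *bus;	/* hypothetical cached bus */
	unsigned int dev_count;
};

/* Called once per device as it is added to the group. */
static void group_register_device(struct vfio_group *group,
				  const struct device *dev)
{
	assert(group && dev && dev->bus);
	if (group->dev_count++ == 0)
		group->bus = dev->bus;	/* remember the first member's bus */
}
```

The attach path could then read group->bus directly instead of
resolving a member device. What should happen to the cached value when
that first device is later unregistered is left open in this sketch.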


* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-22 22:17 ` [PATCH v2 1/2] vfio/type1: Simplify bus_type determination Alex Williamson
  2022-06-23  8:46   ` Tian, Kevin
@ 2022-06-23 12:23   ` Robin Murphy
  2022-06-23 20:50     ` Jason Gunthorpe
  2022-06-23 23:00     ` Alex Williamson
  1 sibling, 2 replies; 19+ messages in thread
From: Robin Murphy @ 2022-06-23 12:23 UTC
  To: Alex Williamson; +Cc: cohuck, jgg, iommu, iommu, kvm, linux-kernel

On 2022-06-22 23:17, Alex Williamson wrote:
> On Wed, 22 Jun 2022 13:04:11 +0100
> Robin Murphy <robin.murphy@arm.com> wrote:
> 
>> Since IOMMU groups are mandatory for drivers to support, it stands to
>> reason that any device which has been successfully be added to a group
> 
> s/be //

Oops.

>> must be on a bus supported by that IOMMU driver, and therefore a domain
>> viable for any device in the group must be viable for all devices in
>> the group. This already has to be the case for the IOMMU API's internal
>> default domain, for instance. Thus even if the group contains devices on
>> different buses, that can only mean that the IOMMU driver actually
>> supports such an odd topology, and so without loss of generality we can
>> expect the bus type of any device in a group to be suitable for IOMMU
>> API calls.
>>
>> Replace vfio_bus_type() with a simple call to resolve an appropriate
>> member device from which to then derive a bus type. This is also a step
>> towards removing the vague bus-based interfaces from the IOMMU API, when
>> we can subsequently switch to using this device directly.
>>
>> Furthermore, scrutiny reveals a lack of protection for the bus being
>> removed while vfio_iommu_type1_attach_group() is using it; the reference
>> that VFIO holds on the iommu_group ensures that data remains valid, but
>> does not prevent the group's membership changing underfoot. Holding the
>> vfio_device for as long as we need here also neatly solves this.
>>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>
>> After sleeping on it, I decided to type up the helper function approach
>> to see how it looked in practice, and in doing so realised that with one
>> more tweak it could also subsume the locking out of the common paths as
>> well, so end up being a self-contained way for type1 to take care of its
>> own concern, which I rather like.
>>
>>   drivers/vfio/vfio.c             | 18 +++++++++++++++++-
>>   drivers/vfio/vfio.h             |  3 +++
>>   drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++-------------------
>>   3 files changed, 31 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
>> index 61e71c1154be..73bab04880d0 100644
>> --- a/drivers/vfio/vfio.c
>> +++ b/drivers/vfio/vfio.c
>> @@ -448,7 +448,7 @@ static void vfio_group_get(struct vfio_group *group)
>>    * Device objects - create, release, get, put, search
>>    */
>>   /* Device reference always implies a group reference */
>> -static void vfio_device_put(struct vfio_device *device)
>> +void vfio_device_put(struct vfio_device *device)
>>   {
>>   	if (refcount_dec_and_test(&device->refcount))
>>   		complete(&device->comp);
>> @@ -475,6 +475,22 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
>>   	return NULL;
>>   }
>>   
>> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group)
>> +{
>> +	struct vfio_group *group = vfio_group_get_from_iommu(iommu_group);
>> +	struct vfio_device *device;
> 
> Check group for NULL.

OK - FWIW in context this should only ever make sense to call with an 
iommu_group which has already been derived from a vfio_group, and I did 
initially consider a check with a WARN_ON(), but then decided that the 
unguarded dereference would be a sufficiently strong message. No problem 
with bringing that back to make it more defensive if that's what you prefer.

>> +
>> +	mutex_lock(&group->device_lock);
>> +	list_for_each_entry(device, &group->device_list, group_next) {
>> +		if (vfio_device_try_get(device)) {
>> +			mutex_unlock(&group->device_lock);
>> +			return device;
>> +		}
>> +	}
>> +	mutex_unlock(&group->device_lock);
>> +	return NULL;
> 
> No vfio_group_put() on either path.

Oops indeed.

>> +}
>> +
>>   /*
>>    * VFIO driver API
>>    */
>> diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
>> index a67130221151..e8f21e64541b 100644
>> --- a/drivers/vfio/vfio.h
>> +++ b/drivers/vfio/vfio.h
>> @@ -70,3 +70,6 @@ struct vfio_iommu_driver_ops {
>>   
>>   int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
>>   void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops);
>> +
>> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group);
>> +void vfio_device_put(struct vfio_device *device);
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index c13b9290e357..e38b8bfde677 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -1679,18 +1679,6 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>   	return ret;
>>   }
>>   
>> -static int vfio_bus_type(struct device *dev, void *data)
>> -{
>> -	struct bus_type **bus = data;
>> -
>> -	if (*bus && *bus != dev->bus)
>> -		return -EINVAL;
>> -
>> -	*bus = dev->bus;
>> -
>> -	return 0;
>> -}
>> -
>>   static int vfio_iommu_replay(struct vfio_iommu *iommu,
>>   			     struct vfio_domain *domain)
>>   {
>> @@ -2159,7 +2147,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>>   	struct vfio_iommu *iommu = iommu_data;
>>   	struct vfio_iommu_group *group;
>>   	struct vfio_domain *domain, *d;
>> -	struct bus_type *bus = NULL;
>> +	struct vfio_device *iommu_api_dev;
>>   	bool resv_msi, msi_remap;
>>   	phys_addr_t resv_msi_base = 0;
>>   	struct iommu_domain_geometry *geo;
>> @@ -2192,18 +2180,19 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>>   		goto out_unlock;
>>   	}
>>   
>> -	/* Determine bus_type in order to allocate a domain */
>> -	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
>> -	if (ret)
>> +	/* Resolve the group back to a member device for IOMMU API ops */
>> +	ret = -ENODEV;
>> +	iommu_api_dev = vfio_device_get_from_iommu(iommu_group);
>> +	if (!iommu_api_dev)
>>   		goto out_free_group;
>>   
>>   	ret = -ENOMEM;
>>   	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
>>   	if (!domain)
>> -		goto out_free_group;
>> +		goto out_put_dev;
>>   
>>   	ret = -EIO;
>> -	domain->domain = iommu_domain_alloc(bus);
>> +	domain->domain = iommu_domain_alloc(iommu_api_dev->dev->bus);
> 
> It makes sense to move away from a bus centric interface to iommu ops
> and I can see that having a device interface when we have device level
> address-ability within a group makes sense, but does it make sense to
> only have that device level interface?  For example, if an iommu_group
> is going to remain an aspect of the iommu subsystem, shouldn't we be
> able to allocate a domain and test capabilities based on the group and
> the iommu driver should have enough embedded information reachable from
> the struct iommu_group to do those things?  This "perform group level
> operations based on an arbitrary device in the group" is pretty klunky.

The fact* is that devices (and domains) are the fundamental units of the 
IOMMU API internals, due to what's most practical within the Linux 
driver model, while groups remain more of a mid-level abstraction - 
IOMMU drivers themselves are only aware of groups at all in terms of 
whether they can physically distinguish a given device from others. The 
client-driver-facing API is already moving back to being device-centric, 
because that's what fits everyone else's usage models, and we concluded 
that exposing the complexity of groups everywhere was more trouble than 
it's worth.

So yes, technically we could implement an iommu_group_capable() and an 
iommu_group_domain_alloc(), which would still just internally resolve 
the IOMMU ops and instance data from a member device to perform the 
driver-level call, but once again it would be for the benefit of 
precisely one user. And I really have minimal enthusiasm for diverging 
any further into one IOMMU API for everyone else plus a separate special 
IOMMU API for VFIO type1, when type1 is supposed to be the 
VFIO-to-IOMMU-API translation layer anyway! To look at it another way, 
if most of the complexity of groups is for VFIO's benefit, then why 
*shouldn't* VFIO take responsibility for some of the fiddly details that 
don't matter to anyone else?

Thanks,
Robin.


* with some inescapable degree of subjective opinion, of course
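
For reference, the two review points above (check the group for NULL,
and drop the group reference on both return paths) can be modelled in
userspace roughly as below. This is only a sketch under stated
assumptions, not the kernel code: the mock_* types and helpers are
hypothetical stand-ins, plain ints replace refcount_t, an array replaces
group->device_list, and the device_lock is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel structs. */
struct mock_device {
	int refcount;			/* 0 models a device already going away */
};

struct mock_group {
	int refcount;
	struct mock_device *devices;	/* stands in for group->device_list */
	size_t ndevices;
};

/* Models vfio_device_try_get(): succeeds only while the device is live. */
static int mock_device_try_get(struct mock_device *dev)
{
	if (dev->refcount == 0)
		return 0;
	dev->refcount++;
	return 1;
}

/* Models vfio_group_get_from_iommu(): may fail, takes a ref on success. */
static struct mock_group *mock_group_get(struct mock_group *group)
{
	if (!group)
		return NULL;
	group->refcount++;
	return group;
}

static void mock_group_put(struct mock_group *group)
{
	assert(group->refcount > 0);
	group->refcount--;
}

/* Return the first device we can take a reference on, or NULL. */
struct mock_device *mock_device_get_from_group(struct mock_group *maybe_group)
{
	struct mock_group *group = mock_group_get(maybe_group);
	struct mock_device *found = NULL;
	size_t i;

	if (!group)			/* the requested NULL check */
		return NULL;

	for (i = 0; i < group->ndevices; i++) {
		if (mock_device_try_get(&group->devices[i])) {
			found = &group->devices[i];
			break;
		}
	}

	/* Single exit, so the temporary group ref is dropped on both paths. */
	mock_group_put(group);
	return found;
}
```

Funnelling both outcomes through one exit is just one way to keep the
get/put pair balanced; an early return with its own put would do equally
well.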

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-23  8:46   ` Tian, Kevin
@ 2022-06-23 20:35     ` Alex Williamson
  0 siblings, 0 replies; 19+ messages in thread
From: Alex Williamson @ 2022-06-23 20:35 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: Robin Murphy, cohuck, jgg, iommu, iommu, kvm, linux-kernel

On Thu, 23 Jun 2022 08:46:45 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Thursday, June 23, 2022 6:17 AM
> >   
> > >
> > >  	ret = -EIO;
> > > -	domain->domain = iommu_domain_alloc(bus);
> > > +	domain->domain = iommu_domain_alloc(iommu_api_dev->dev-
> > >bus);  
> > 
> > It makes sense to move away from a bus centric interface to iommu ops
> > and I can see that having a device interface when we have device level
> > address-ability within a group makes sense, but does it make sense to
> > only have that device level interface?  For example, if an iommu_group
> > is going to remain an aspect of the iommu subsystem, shouldn't we be
> > able to allocate a domain and test capabilities based on the group and
> > > the iommu driver should have enough embedded information reachable from
> > > the struct iommu_group to do those things?  This "perform group level
> > operations based on an arbitrary device in the group" is pretty klunky.
> > Thanks,
> >   
> 
> This sounds like the right thing to do.
> 
> btw another alternative which I'm thinking of is whether vfio_group
> can record the bus info when the first device is added to it in
> __vfio_register_dev(). Then we don't need a group interface from
> iommu to test if vfio is the only user having such requirement.

That might be simpler, but it's just another variation on vfio
picking an arbitrary device from a group to satisfy the iommu interface
rather than operating on an iommu subsystem provided object.  Thanks,

Alex
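
To make the alternative concrete, recording the bus when the first
device joins the group could look roughly like the userspace model
below. It is a sketch under assumptions only: the mock_* names are
hypothetical, mock_group_add_device() merely stands in for the group
bookkeeping done from __vfio_register_dev(), and it does not address
what should happen to the recorded bus if that first device later
leaves the group.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel structs. */
struct mock_bus {
	const char *name;
};

struct mock_device {
	const struct mock_bus *bus;
};

struct mock_group {
	const struct mock_bus *bus;	/* NULL until the first device registers */
	unsigned int ndevices;
};

/* Models the group-side bookkeeping when a device is registered. */
void mock_group_add_device(struct mock_group *group,
			   const struct mock_device *dev)
{
	assert(group && dev && dev->bus);

	/* Only the first device's bus is recorded; later ones are trusted. */
	if (!group->bus)
		group->bus = dev->bus;
	group->ndevices++;
}
```

Note that the recorded bus is still whichever device happened to
register first, which is the "arbitrary device" concern in another
form.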


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-23 12:23   ` Robin Murphy
@ 2022-06-23 20:50     ` Jason Gunthorpe
  2022-06-23 23:00     ` Alex Williamson
  1 sibling, 0 replies; 19+ messages in thread
From: Jason Gunthorpe @ 2022-06-23 20:50 UTC (permalink / raw)
  To: Robin Murphy; +Cc: Alex Williamson, cohuck, iommu, iommu, kvm, linux-kernel

On Thu, Jun 23, 2022 at 01:23:05PM +0100, Robin Murphy wrote:

> So yes, technically we could implement an iommu_group_capable() and an
> iommu_group_domain_alloc(), which would still just internally resolve the
> IOMMU ops and instance data from a member device to perform the driver-level
> call, but once again it would be for the benefit of precisely one
> user. 

It would benefit one user and come with a fairly complex locking
situation to boot.

Alex, I'd rather think about moving the type 1 code so that the iommu
attach happens during device FD creation (then we have a concrete
non-fake device), not during group FD opening.

That is the model we need for iommufd anyhow.

Jason

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-23 12:23   ` Robin Murphy
  2022-06-23 20:50     ` Jason Gunthorpe
@ 2022-06-23 23:00     ` Alex Williamson
  2022-06-24  1:50       ` Jason Gunthorpe
  1 sibling, 1 reply; 19+ messages in thread
From: Alex Williamson @ 2022-06-23 23:00 UTC (permalink / raw)
  To: Robin Murphy; +Cc: cohuck, jgg, iommu, iommu, kvm, linux-kernel

On Thu, 23 Jun 2022 13:23:05 +0100
Robin Murphy <robin.murphy@arm.com> wrote:

> On 2022-06-22 23:17, Alex Williamson wrote:
> > On Wed, 22 Jun 2022 13:04:11 +0100
> > Robin Murphy <robin.murphy@arm.com> wrote:
> >   
> >> Since IOMMU groups are mandatory for drivers to support, it stands to
> >> reason that any device which has been successfully be added to a group  
> > 
> > s/be //  
> 
> Oops.
> 
> >> must be on a bus supported by that IOMMU driver, and therefore a domain
> >> viable for any device in the group must be viable for all devices in
> >> the group. This already has to be the case for the IOMMU API's internal
> >> default domain, for instance. Thus even if the group contains devices on
> >> different buses, that can only mean that the IOMMU driver actually
> >> supports such an odd topology, and so without loss of generality we can
> >> expect the bus type of any device in a group to be suitable for IOMMU
> >> API calls.
> >>
> >> Replace vfio_bus_type() with a simple call to resolve an appropriate
> >> member device from which to then derive a bus type. This is also a step
> >> towards removing the vague bus-based interfaces from the IOMMU API, when
> >> we can subsequently switch to using this device directly.
> >>
> >> Furthermore, scrutiny reveals a lack of protection for the bus being
> >> removed while vfio_iommu_type1_attach_group() is using it; the reference
> >> that VFIO holds on the iommu_group ensures that data remains valid, but
> >> does not prevent the group's membership changing underfoot. Holding the
> >> vfio_device for as long as we need here also neatly solves this.
> >>
> >> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> >> ---
> >>
> >> After sleeping on it, I decided to type up the helper function approach
> >> to see how it looked in practice, and in doing so realised that with one
> >> more tweak it could also subsume the locking out of the common paths as
> >> well, so end up being a self-contained way for type1 to take care of its
> >> own concern, which I rather like.
> >>
> >>   drivers/vfio/vfio.c             | 18 +++++++++++++++++-
> >>   drivers/vfio/vfio.h             |  3 +++
> >>   drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++-------------------
> >>   3 files changed, 31 insertions(+), 20 deletions(-)
> >>
> >> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> >> index 61e71c1154be..73bab04880d0 100644
> >> --- a/drivers/vfio/vfio.c
> >> +++ b/drivers/vfio/vfio.c
> >> @@ -448,7 +448,7 @@ static void vfio_group_get(struct vfio_group *group)
> >>    * Device objects - create, release, get, put, search
> >>    */
> >>   /* Device reference always implies a group reference */
> >> -static void vfio_device_put(struct vfio_device *device)
> >> +void vfio_device_put(struct vfio_device *device)
> >>   {
> >>   	if (refcount_dec_and_test(&device->refcount))
> >>   		complete(&device->comp);
> >> @@ -475,6 +475,22 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
> >>   	return NULL;
> >>   }
> >>   
> >> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group)
> >> +{
> >> +	struct vfio_group *group = vfio_group_get_from_iommu(iommu_group);
> >> +	struct vfio_device *device;  
> > 
> > Check group for NULL.  
> 
> OK - FWIW in context this should only ever make sense to call with an 
> iommu_group which has already been derived from a vfio_group, and I did 
> initially consider a check with a WARN_ON(), but then decided that the 
> unguarded dereference would be a sufficiently strong message. No problem 
> with bringing that back to make it more defensive if that's what you prefer.

A while down the road, that's a bit too much implicit knowledge of the
intent and single purpose of this function just to simply avoid a test.

> >> +
> >> +	mutex_lock(&group->device_lock);
> >> +	list_for_each_entry(device, &group->device_list, group_next) {
> >> +		if (vfio_device_try_get(device)) {
> >> +			mutex_unlock(&group->device_lock);
> >> +			return device;
> >> +		}
> >> +	}
> >> +	mutex_unlock(&group->device_lock);
> >> +	return NULL;  
> > 
> > No vfio_group_put() on either path.  
> 
> Oops indeed.
> 
> >> +}
> >> +
> >>   /*
> >>    * VFIO driver API
> >>    */
> >> diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
> >> index a67130221151..e8f21e64541b 100644
> >> --- a/drivers/vfio/vfio.h
> >> +++ b/drivers/vfio/vfio.h
> >> @@ -70,3 +70,6 @@ struct vfio_iommu_driver_ops {
> >>   
> >>   int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
> >>   void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops);
> >> +
> >> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group);
> >> +void vfio_device_put(struct vfio_device *device);
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >> index c13b9290e357..e38b8bfde677 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -1679,18 +1679,6 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
> >>   	return ret;
> >>   }
> >>   
> >> -static int vfio_bus_type(struct device *dev, void *data)
> >> -{
> >> -	struct bus_type **bus = data;
> >> -
> >> -	if (*bus && *bus != dev->bus)
> >> -		return -EINVAL;
> >> -
> >> -	*bus = dev->bus;
> >> -
> >> -	return 0;
> >> -}
> >> -
> >>   static int vfio_iommu_replay(struct vfio_iommu *iommu,
> >>   			     struct vfio_domain *domain)
> >>   {
> >> @@ -2159,7 +2147,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
> >>   	struct vfio_iommu *iommu = iommu_data;
> >>   	struct vfio_iommu_group *group;
> >>   	struct vfio_domain *domain, *d;
> >> -	struct bus_type *bus = NULL;
> >> +	struct vfio_device *iommu_api_dev;
> >>   	bool resv_msi, msi_remap;
> >>   	phys_addr_t resv_msi_base = 0;
> >>   	struct iommu_domain_geometry *geo;
> >> @@ -2192,18 +2180,19 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
> >>   		goto out_unlock;
> >>   	}
> >>   
> >> -	/* Determine bus_type in order to allocate a domain */
> >> -	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
> >> -	if (ret)
> >> +	/* Resolve the group back to a member device for IOMMU API ops */
> >> +	ret = -ENODEV;
> >> +	iommu_api_dev = vfio_device_get_from_iommu(iommu_group);
> >> +	if (!iommu_api_dev)
> >>   		goto out_free_group;
> >>   
> >>   	ret = -ENOMEM;
> >>   	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
> >>   	if (!domain)
> >> -		goto out_free_group;
> >> +		goto out_put_dev;
> >>   
> >>   	ret = -EIO;
> >> -	domain->domain = iommu_domain_alloc(bus);
> >> +	domain->domain = iommu_domain_alloc(iommu_api_dev->dev->bus);  
> > 
> > It makes sense to move away from a bus centric interface to iommu ops
> > and I can see that having a device interface when we have device level
> > address-ability within a group makes sense, but does it make sense to
> > only have that device level interface?  For example, if an iommu_group
> > is going to remain an aspect of the iommu subsystem, shouldn't we be
> > able to allocate a domain and test capabilities based on the group and
> > the iommu driver should have enough embedded information reachable from
> > the struct iommu_group to do those things?  This "perform group level
> > operations based on an arbitrary device in the group" is pretty klunky.  
> 
> The fact* is that devices (and domains) are the fundamental units of the 
> IOMMU API internals, due to what's most practical within the Linux 
> driver model, while groups remain more of a mid-level abstraction - 
> IOMMU drivers themselves are only aware of groups at all in terms of 
> whether they can physically distinguish a given device from others. The 
> client-driver-facing API is already moving back to being device-centric, 
> because that's what fits everyone else's usage models, and we concluded 
> that exposing the complexity of groups everywhere was more trouble than 
> it's worth.
> 
> So yes, technically we could implement an iommu_group_capable() and an 
> iommu_group_domain_alloc(), which would still just internally resolve 
> the IOMMU ops and instance data from a member device to perform the 
> driver-level call, but once again it would be for the benefit of 
> precisely one user. And I really have minimal enthusiasm for diverging 
> any further into one IOMMU API for everyone else plus a separate special 
> IOMMU API for VFIO type1, when type1 is supposed to be the 
> VFIO-to-IOMMU-API translation layer anyway! To look at it another way, 
> if most of the complexity of groups is for VFIO's benefit, then why 
> *shouldn't* VFIO take responsibility for some of the fiddly details that 
> don't matter to anyone else?


Hmm, I agree, but I can't get past vfio-core exporting a function to
select an arbitrary vfio_device from an iommu_group.  What if type1 was
passed an opaque vfio_group pointer and vfio exported a function to
iterate each vfio_device in that vfio_group?  We'd need to export
vfio_device_try_get() and vfio_device_put() and type1 itself would stop
on the first vfio_device object it can get.  Sort of the v1 approach,
but with a vfio iterator rather than iommu.

I'd lean towards Kevin's idea that we could store bus_type on the
vfio_group and pass that to type1, with the same assumptions we're
making in the commit log that it's consistent, but that doesn't get us
closer to the long term plan of dropping the bus_type interfaces AIUI.
Thanks,

Alex
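
A rough userspace model of the iterator idea, where the core only walks
the devices and the type1 side decides to stop at the first one it can
take a reference on, might look like this. Everything here is an
assumption for illustration: the mock_* types, the callback convention
(non-zero return stops the walk) and the array in place of the device
list are hypothetical, and locking is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel structs. */
struct mock_device {
	int refcount;			/* 0 models a device already going away */
};

struct mock_group {
	struct mock_device *devices;	/* stands in for group->device_list */
	size_t ndevices;
};

/* Models vfio_device_try_get(): succeeds only while the device is live. */
static int mock_device_try_get(struct mock_device *dev)
{
	if (dev->refcount == 0)
		return 0;
	dev->refcount++;
	return 1;
}

/*
 * Core side: call fn on each device in turn. A non-zero return from fn
 * stops the walk and is passed back to the caller.
 */
int mock_group_for_each_device(struct mock_group *group, void *data,
			       int (*fn)(struct mock_device *dev, void *data))
{
	size_t i;
	int ret = 0;

	assert(group && fn);
	for (i = 0; i < group->ndevices && !ret; i++)
		ret = fn(&group->devices[i], data);
	return ret;
}

/* type1 side: keep the first device we manage to get a reference on. */
int mock_pick_first_live(struct mock_device *dev, void *data)
{
	struct mock_device **out = data;

	if (!mock_device_try_get(dev))
		return 0;		/* not usable, keep walking */
	*out = dev;
	return 1;			/* stop, caller now holds a reference */
}
```

The caller would later drop the reference it was handed, which is why
both the try_get and put helpers would need exporting in this scheme.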


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-23 23:00     ` Alex Williamson
@ 2022-06-24  1:50       ` Jason Gunthorpe
  2022-06-24 14:11         ` Alex Williamson
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2022-06-24  1:50 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Robin Murphy, cohuck, iommu, iommu, kvm, linux-kernel

On Thu, Jun 23, 2022 at 05:00:44PM -0600, Alex Williamson wrote:

> > >> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group)
> > >> +{
> > >> +	struct vfio_group *group = vfio_group_get_from_iommu(iommu_group);
> > >> +	struct vfio_device *device;  
> > > 
> > > Check group for NULL.  
> > 
> > OK - FWIW in context this should only ever make sense to call with an 
> > iommu_group which has already been derived from a vfio_group, and I did 
> > initially consider a check with a WARN_ON(), but then decided that the 
> > unguarded dereference would be a sufficiently strong message. No problem 
> > with bringing that back to make it more defensive if that's what you prefer.
> 
> A while down the road, that's a bit too much implicit knowledge of the
> intent and single purpose of this function just to simply avoid a test.

I think we should just pass the 'struct vfio_group *' into the
attach_group op and have this API take that type in and forget the
vfio_group_get_from_iommu().

At this point there is little justification for
vfio_group_get_from_iommu() existing at all, it should be folded into
the one use in vfio_group_find_or_alloc() and the locking widened so
we don't have the unlock/alloc/lock race that requires it to be called
twice.

> I'd lean towards Kevin's idea that we could store bus_type on the
> vfio_group and pass that to type1, with the same assumptions we're
> making in the commit log that it's consistent, but that doesn't get us
> closer to the long term plan of dropping the bus_type interfaces
> AIUI.

Right, the point is to get a representative struct device here to use.

Jason

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-22 12:04 [PATCH v2 1/2] vfio/type1: Simplify bus_type determination Robin Murphy
                   ` (3 preceding siblings ...)
  2022-06-23  4:32 ` kernel test robot
@ 2022-06-24  1:52 ` Jason Gunthorpe
  4 siblings, 0 replies; 19+ messages in thread
From: Jason Gunthorpe @ 2022-06-24  1:52 UTC (permalink / raw)
  To: Robin Murphy; +Cc: alex.williamson, cohuck, kvm, iommu, iommu, linux-kernel

On Wed, Jun 22, 2022 at 01:04:11PM +0100, Robin Murphy wrote:

> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group)
> +{
> +	struct vfio_group *group = vfio_group_get_from_iommu(iommu_group);
> +	struct vfio_device *device;
> +
> +	mutex_lock(&group->device_lock);
> +	list_for_each_entry(device, &group->device_list, group_next) {
> +		if (vfio_device_try_get(device)) {
> +			mutex_unlock(&group->device_lock);
> +			return device;
> +		}
> +	}
> +	mutex_unlock(&group->device_lock);
> +	return NULL;
> +}

FWIW, I have no objection to this general approach, and I don't think
we should make any broader API just for this.

Though I might call it something like
'vfio_get_group_representor_device()' which more strongly suggests
what it is only used for.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-24  1:50       ` Jason Gunthorpe
@ 2022-06-24 14:11         ` Alex Williamson
  2022-06-24 14:18           ` Jason Gunthorpe
  0 siblings, 1 reply; 19+ messages in thread
From: Alex Williamson @ 2022-06-24 14:11 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Robin Murphy, cohuck, iommu, iommu, kvm, linux-kernel

On Thu, 23 Jun 2022 22:50:30 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Jun 23, 2022 at 05:00:44PM -0600, Alex Williamson wrote:
> 
> > > >> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group)
> > > >> +{
> > > >> +	struct vfio_group *group = vfio_group_get_from_iommu(iommu_group);
> > > >> +	struct vfio_device *device;    
> > > > 
> > > > Check group for NULL.    
> > > 
> > > OK - FWIW in context this should only ever make sense to call with an 
> > > iommu_group which has already been derived from a vfio_group, and I did 
> > > initially consider a check with a WARN_ON(), but then decided that the 
> > > unguarded dereference would be a sufficiently strong message. No problem 
> > > with bringing that back to make it more defensive if that's what you prefer.  
> > 
> > A while down the road, that's a bit too much implicit knowledge of the
> > intent and single purpose of this function just to simply avoid a test.  
> 
> I think we should just pass the 'struct vfio_group *' into the
> attach_group op and have this API take that type in and forget the
> vfio_group_get_from_iommu().

That's essentially what I'm suggesting, the vfio_group is passed as an
opaque pointer which type1 can use for a
vfio_group_for_each_vfio_device() type call.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-24 14:11         ` Alex Williamson
@ 2022-06-24 14:18           ` Jason Gunthorpe
  2022-06-24 14:28             ` Alex Williamson
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2022-06-24 14:18 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Robin Murphy, cohuck, iommu, iommu, kvm, linux-kernel

On Fri, Jun 24, 2022 at 08:11:59AM -0600, Alex Williamson wrote:
> On Thu, 23 Jun 2022 22:50:30 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Thu, Jun 23, 2022 at 05:00:44PM -0600, Alex Williamson wrote:
> > 
> > > > >> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group)
> > > > >> +{
> > > > >> +	struct vfio_group *group = vfio_group_get_from_iommu(iommu_group);
> > > > >> +	struct vfio_device *device;    
> > > > > 
> > > > > Check group for NULL.    
> > > > 
> > > > OK - FWIW in context this should only ever make sense to call with an 
> > > > iommu_group which has already been derived from a vfio_group, and I did 
> > > > initially consider a check with a WARN_ON(), but then decided that the 
> > > > unguarded dereference would be a sufficiently strong message. No problem 
> > > > with bringing that back to make it more defensive if that's what you prefer.  
> > > 
> > > A while down the road, that's a bit too much implicit knowledge of the
> > > intent and single purpose of this function just to simply avoid a test.  
> > 
> > I think we should just pass the 'struct vfio_group *' into the
> > attach_group op and have this API take that type in and forget the
> > vfio_group_get_from_iommu().
> 
> That's essentially what I'm suggesting, the vfio_group is passed as an
> opaque pointer which type1 can use for a
> vfio_group_for_each_vfio_device() type call.  Thanks,

I don't want to add a whole vfio_group_for_each_vfio_device()
machinery that isn't actually needed by anything. This is all
internal, we don't need to design more than exactly what is needed.

At this point if we change the signature of the attach then we may as
well just pass in the representative vfio_device, that is probably
less LOC overall.

Jason

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-24 14:18           ` Jason Gunthorpe
@ 2022-06-24 14:28             ` Alex Williamson
  2022-06-24 14:56               ` Jason Gunthorpe
  2022-06-24 15:12               ` Robin Murphy
  0 siblings, 2 replies; 19+ messages in thread
From: Alex Williamson @ 2022-06-24 14:28 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Robin Murphy, cohuck, iommu, iommu, kvm, linux-kernel

On Fri, 24 Jun 2022 11:18:36 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Fri, Jun 24, 2022 at 08:11:59AM -0600, Alex Williamson wrote:
> > On Thu, 23 Jun 2022 22:50:30 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Thu, Jun 23, 2022 at 05:00:44PM -0600, Alex Williamson wrote:
> > >   
> > > > > >> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group)
> > > > > >> +{
> > > > > >> +	struct vfio_group *group = vfio_group_get_from_iommu(iommu_group);
> > > > > >> +	struct vfio_device *device;      
> > > > > > 
> > > > > > Check group for NULL.      
> > > > > 
> > > > > OK - FWIW in context this should only ever make sense to call with an 
> > > > > iommu_group which has already been derived from a vfio_group, and I did 
> > > > > initially consider a check with a WARN_ON(), but then decided that the 
> > > > > unguarded dereference would be a sufficiently strong message. No problem 
> > > > > with bringing that back to make it more defensive if that's what you prefer.    
> > > > 
> > > > A while down the road, that's a bit too much implicit knowledge of the
> > > > intent and single purpose of this function just to simply avoid a test.    
> > > 
> > > I think we should just pass the 'struct vfio_group *' into the
> > > attach_group op and have this API take that type in and forget the
> > > vfio_group_get_from_iommu().  
> > 
> > That's essentially what I'm suggesting, the vfio_group is passed as an
> > opaque pointer which type1 can use for a
> > vfio_group_for_each_vfio_device() type call.  Thanks,  
> 
> I don't want to add a whole vfio_group_for_each_vfio_device()
> machinery that isn't actually needed by anything. This is all
> internal, we don't need to design more than exactly what is needed.
> 
> At this point if we change the signature of the attach then we may as
> well just pass in the representative vfio_device, that is probably
> less LOC overall.

That means that vfio core still needs to pick an arbitrary
representative device, which I find in fundamental conflict to the
nature of groups.  Type1 is the interface to the IOMMU API, if through
the IOMMU API we can make an assumption that all devices within the
group are equivalent for a given operation, that should be done in type1
code, not in vfio core.  A for-each interface is commonplace and not
significantly more code or design than already proposed.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-24 14:28             ` Alex Williamson
@ 2022-06-24 14:56               ` Jason Gunthorpe
  2022-06-24 15:12               ` Robin Murphy
  1 sibling, 0 replies; 19+ messages in thread
From: Jason Gunthorpe @ 2022-06-24 14:56 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Robin Murphy, cohuck, iommu, iommu, kvm, linux-kernel

On Fri, Jun 24, 2022 at 08:28:31AM -0600, Alex Williamson wrote:

> > > That's essentially what I'm suggesting, the vfio_group is passed as an
> > > opaque pointer which type1 can use for a
> > > vfio_group_for_each_vfio_device() type call.  Thanks,  
> > 
> > I don't want to add a whole vfio_group_for_each_vfio_device()
> > machinery that isn't actually needed by anything. This is all
> > internal, we don't need to design more than exactly what is needed.
> > 
> > At this point if we change the signature of the attach then we may as
> > well just pass in the representative vfio_device, that is probably
> > less LOC overall.
> 
> That means that vfio core still needs to pick an arbitrary
> representative device, which I find in fundamental conflict to the
> nature of groups.

Well, this is where iommu is going, I think Robin has explained this
view well enough.

Ideally we'd move VFIO away from trying to attach groups and attach
when the device FD is opened, I view this as a micro step in that
direction.

> Type1 is the interface to the IOMMU API, if through the IOMMU API we
> can make an assumption that all devices within the group are
> equivalent for a given operation, that should be done in type1 code,
> not in vfio core.

iommu_group is part of the core code. If the representative device
assumption stems from the iommu_group, then the core code can safely
make it.

> A for-each interface is commonplace and not significantly more code
> or design than already proposed.

Except that someone else might get the idea to use it for something
completely inappropriate.

Jason


* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-24 14:28             ` Alex Williamson
  2022-06-24 14:56               ` Jason Gunthorpe
@ 2022-06-24 15:12               ` Robin Murphy
  2022-06-24 16:04                 ` Alex Williamson
  1 sibling, 1 reply; 19+ messages in thread
From: Robin Murphy @ 2022-06-24 15:12 UTC (permalink / raw)
  To: Alex Williamson, Jason Gunthorpe; +Cc: cohuck, iommu, iommu, kvm, linux-kernel

On 2022-06-24 15:28, Alex Williamson wrote:
> On Fri, 24 Jun 2022 11:18:36 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
>> On Fri, Jun 24, 2022 at 08:11:59AM -0600, Alex Williamson wrote:
>>> On Thu, 23 Jun 2022 22:50:30 -0300
>>> Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>    
>>>> On Thu, Jun 23, 2022 at 05:00:44PM -0600, Alex Williamson wrote:
>>>>    
>>>>>>>> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group)
>>>>>>>> +{
>>>>>>>> +	struct vfio_group *group = vfio_group_get_from_iommu(iommu_group);
>>>>>>>> +	struct vfio_device *device;
>>>>>>>
>>>>>>> Check group for NULL.
>>>>>>
>>>>>> OK - FWIW in context this should only ever make sense to call with an
>>>>>> iommu_group which has already been derived from a vfio_group, and I did
>>>>>> initially consider a check with a WARN_ON(), but then decided that the
>>>>>> unguarded dereference would be a sufficiently strong message. No problem
>>>>>> with bringing that back to make it more defensive if that's what you prefer.
>>>>>
>>>>> A while down the road, that's a bit too much implicit knowledge of the
>>>>> intent and single purpose of this function just to simply avoid a test.
>>>>
>>>> I think we should just pass the 'struct vfio_group *' into the
>>>> attach_group op and have this API take that type in and forget the
>>>> vfio_group_get_from_iommu().
>>>
>>> That's essentially what I'm suggesting, the vfio_group is passed as an
>>> opaque pointer which type1 can use for a
>>> vfio_group_for_each_vfio_device() type call.  Thanks,
>>
>> I don't want to add a whole vfio_group_for_each_vfio_device()
>> machinery that isn't actually needed by anything.. This is all
>> internal, we don't need to design more than exactly what is needed.
>>
>> At this point if we change the signature of the attach then we may as
>> well just pass in the representative vfio_device, that is probably
>> less LOC overall.
> 
> That means that vfio core still needs to pick an arbitrary
> representative device, which I find in fundamental conflict to the
> nature of groups.  Type1 is the interface to the IOMMU API, if through
> the IOMMU API we can make an assumption that all devices within the
> group are equivalent for a given operation, that should be done in type1
> code, not in vfio core.  A for-each interface is commonplace and not
> significantly more code or design than already proposed.  Thanks,

It also occurred to me this morning that there's another middle-ground 
option staring out from the call-wrapping notion I mentioned yesterday - 
while I'm not keen to provide it from the IOMMU API, there's absolutely 
no reason that VFIO couldn't just use the building blocks by itself, and 
in fact it works out almost absurdly simple:

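/* iommu_group_for_each_dev() takes an int-returning callback and
 * stops iterating at the first non-zero return value. */
static int vfio_device_capable(struct device *dev, void *data)
{
	return device_iommu_capable(dev, (enum iommu_cap)data);
}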

bool vfio_group_capable(struct iommu_group *group, enum iommu_cap cap)
{
	return iommu_group_for_each_dev(group, (void *)cap, vfio_device_capable);
}

and much the same for iommu_domain_alloc() once I get that far. The 
locking concern neatly disappears because we're no longer holding any 
bus or device pointer that can go stale. How does that seem as a 
compromise for now, looking forward to Jason's longer-term view of 
rearranging the attach_group process such that a vfio_device falls 
naturally to hand?

Cheers,
Robin.
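
For readers following along outside a kernel tree, the wrapper above can be
tried as a minimal userspace sketch. Everything here other than the names
vfio_device_capable(), vfio_group_capable(), device_iommu_capable() and
iommu_group_for_each_dev() is an assumption: struct device, struct iommu_group
and the capability bitmask are hypothetical stand-ins, and the two kernel
functions are mocked only to the extent of their calling contract (the
iterator calls the callback per device and stops at the first non-zero
return). The round-trip through uintptr_t is added to keep the enum/pointer
casts warning-free on 64-bit builds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum iommu_cap { IOMMU_CAP_CACHE_COHERENCY, IOMMU_CAP_INTR_REMAP };

struct device {
	const char *name;
	unsigned int caps;	/* bitmask indexed by enum iommu_cap (assumption) */
};

struct iommu_group {
	struct device *devs;
	int ndevs;
};

/* Mock: the real function asks the IOMMU driver behind the device. */
static bool device_iommu_capable(struct device *dev, enum iommu_cap cap)
{
	return (dev->caps & (1u << cap)) != 0;
}

/*
 * Mock of the iteration contract: call fn() for each device in the group,
 * stop at the first non-zero return value and hand that value back.
 */
static int iommu_group_for_each_dev(struct iommu_group *group, void *data,
				    int (*fn)(struct device *, void *))
{
	int i, ret = 0;

	assert(group && fn);
	for (i = 0; i < group->ndevs && !ret; i++)
		ret = fn(&group->devs[i], data);
	return ret;
}

/* Callback: returns 1 for a capable device, which ends the iteration. */
static int vfio_device_capable(struct device *dev, void *data)
{
	return device_iommu_capable(dev, (enum iommu_cap)(uintptr_t)data);
}

/* True if at least one device in the group reports the capability. */
static bool vfio_group_capable(struct iommu_group *group, enum iommu_cap cap)
{
	return iommu_group_for_each_dev(group, (void *)(uintptr_t)cap,
					vfio_device_capable) != 0;
}
```

Because the iterator stops at the first non-zero return, a group holding one
capable and one incapable device is reported as capable; the result answers
"is any device capable?" rather than "are all devices capable?".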


* Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
  2022-06-24 15:12               ` Robin Murphy
@ 2022-06-24 16:04                 ` Alex Williamson
  0 siblings, 0 replies; 19+ messages in thread
From: Alex Williamson @ 2022-06-24 16:04 UTC (permalink / raw)
  To: Robin Murphy; +Cc: Jason Gunthorpe, cohuck, iommu, iommu, kvm, linux-kernel

On Fri, 24 Jun 2022 16:12:55 +0100
Robin Murphy <robin.murphy@arm.com> wrote:

> On 2022-06-24 15:28, Alex Williamson wrote:
> > On Fri, 24 Jun 2022 11:18:36 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> >> On Fri, Jun 24, 2022 at 08:11:59AM -0600, Alex Williamson wrote:  
> >>> On Thu, 23 Jun 2022 22:50:30 -0300
> >>> Jason Gunthorpe <jgg@nvidia.com> wrote:
> >>>      
> >>>> On Thu, Jun 23, 2022 at 05:00:44PM -0600, Alex Williamson wrote:
> >>>>      
> >>>>>>>> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group)
> >>>>>>>> +{
> >>>>>>>> +	struct vfio_group *group = vfio_group_get_from_iommu(iommu_group);
> >>>>>>>> +	struct vfio_device *device;  
> >>>>>>>
> >>>>>>> Check group for NULL.  
> >>>>>>
> >>>>>> OK - FWIW in context this should only ever make sense to call with an
> >>>>>> iommu_group which has already been derived from a vfio_group, and I did
> >>>>>> initially consider a check with a WARN_ON(), but then decided that the
> >>>>>> unguarded dereference would be a sufficiently strong message. No problem
> >>>>>> with bringing that back to make it more defensive if that's what you prefer.  
> >>>>>
> >>>>> A while down the road, that's a bit too much implicit knowledge of the
> >>>>> intent and single purpose of this function just to simply avoid a test.  
> >>>>
> >>>> I think we should just pass the 'struct vfio_group *' into the
> >>>> attach_group op and have this API take that type in and forget the
> >>>> vfio_group_get_from_iommu().  
> >>>
> >>> That's essentially what I'm suggesting, the vfio_group is passed as an
> >>> opaque pointer which type1 can use for a
> >>> vfio_group_for_each_vfio_device() type call.  Thanks,  
> >>
> >> I don't want to add a whole vfio_group_for_each_vfio_device()
> >> machinery that isn't actually needed by anything.. This is all
> >> internal, we don't need to design more than exactly what is needed.
> >>
> >> At this point if we change the signature of the attach then we may as
> >> well just pass in the representative vfio_device, that is probably
> >> less LOC overall.  
> > 
> > That means that vfio core still needs to pick an arbitrary
> > representative device, which I find in fundamental conflict to the
> > nature of groups.  Type1 is the interface to the IOMMU API, if through
> > the IOMMU API we can make an assumption that all devices within the
> > group are equivalent for a given operation, that should be done in type1
> > code, not in vfio core.  A for-each interface is commonplace and not
> > significantly more code or design than already proposed.  Thanks,  
> 
> It also occurred to me this morning that there's another middle-ground 
> option staring out from the call-wrapping notion I mentioned yesterday - 
> while I'm not keen to provide it from the IOMMU API, there's absolutely 
> no reason that VFIO couldn't just use the building blocks by itself, and 
> in fact it works out almost absurdly simple:
> 
> static bool vfio_device_capable(struct device *dev, void *data)
> {
> 	return device_iommu_capable(dev, (enum iommu_cap)data);
> }
> 
> bool vfio_group_capable(struct iommu_group *group, enum iommu_cap cap)
> {
> 	return iommu_group_for_each_dev(group, (void *)cap, vfio_device_capable);
> }
> 
> and much the same for iommu_domain_alloc() once I get that far. The 
> locking concern neatly disappears because we're no longer holding any 
> bus or device pointer that can go stale. How does that seem as a 
> compromise for now, looking forward to Jason's longer-term view of 
> rearranging the attach_group process such that a vfio_device falls 
> naturally to hand?

Yup, that seems like another way to do it: a slight iteration on the
current bus_type flow that also avoids passing any sort of arbitrary
representative device around as an API.

To make clear the principle that all devices within the group should
have the same capabilities, we could go further and follow the existing
bus_type code by doing a sanity test here at the same time, or perhaps
simply stop after the first device to avoid the
if-any-device-is-capable semantics implied above.  Thanks,

Alex
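
The two alternatives mentioned above (stop after the first device, or sanity
test that every device agrees) might look roughly like the following userspace
sketch. This is not code from the thread: struct cap_query and the helper
names vfio_group_capable_first() and vfio_group_capable_checked() are
hypothetical, the device and group types are stand-ins, and
device_iommu_capable() and iommu_group_for_each_dev() are mocked only as far
as their calling contract.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum iommu_cap { IOMMU_CAP_CACHE_COHERENCY, IOMMU_CAP_INTR_REMAP };
struct device { unsigned int caps; };	/* bitmask indexed by enum iommu_cap */
struct iommu_group { struct device *devs; int ndevs; };

/* Mock: the real function asks the IOMMU driver behind the device. */
static bool device_iommu_capable(struct device *dev, enum iommu_cap cap)
{
	return (dev->caps & (1u << cap)) != 0;
}

/* Mock of the contract: stop at the first non-zero return and pass it on. */
static int iommu_group_for_each_dev(struct iommu_group *group, void *data,
				    int (*fn)(struct device *, void *))
{
	int i, ret = 0;

	assert(group && fn);
	for (i = 0; i < group->ndevs && !ret; i++)
		ret = fn(&group->devs[i], data);
	return ret;
}

/* Hypothetical per-query state shared with the callbacks. */
struct cap_query {
	enum iommu_cap cap;
	bool seen;	/* at least one device has been visited */
	bool capable;	/* answer recorded from the first device */
};

/* Variant 1: record the first device's answer, then stop iterating. */
static int cap_first_device(struct device *dev, void *data)
{
	struct cap_query *q = data;

	q->capable = device_iommu_capable(dev, q->cap);
	q->seen = true;
	return 1;	/* non-zero ends the iteration */
}

static bool vfio_group_capable_first(struct iommu_group *group,
				     enum iommu_cap cap)
{
	struct cap_query q = { .cap = cap };

	iommu_group_for_each_dev(group, &q, cap_first_device);
	return q.seen && q.capable;
}

/* Variant 2: visit every device and fail if any disagrees with the first. */
static int cap_check_consistent(struct device *dev, void *data)
{
	struct cap_query *q = data;
	bool capable = device_iommu_capable(dev, q->cap);

	if (q->seen && capable != q->capable)
		return -EINVAL;
	q->seen = true;
	q->capable = capable;
	return 0;
}

/* Returns 1 if capable, 0 if not (or group empty), -EINVAL on disagreement. */
static int vfio_group_capable_checked(struct iommu_group *group,
				      enum iommu_cap cap)
{
	struct cap_query q = { .cap = cap };
	int ret = iommu_group_for_each_dev(group, &q, cap_check_consistent);

	return ret ? ret : (q.seen && q.capable);
}
```

The first variant is cheaper but silently trusts the first device; the second
turns a mixed group into an explicit error, much as the existing bus_type
lookup rejects devices on differing buses.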



