* [PATCH v2 0/2] iommu/virtio: Enable IOMMU_CAP_DEFERRED_FLUSH
@ 2023-09-18 11:51 Niklas Schnelle
  2023-09-18 11:51 ` [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map Niklas Schnelle
  2023-09-18 11:51 ` [PATCH v2 2/2] iommu/virtio: Add ops->flush_iotlb_all and enable deferred flush Niklas Schnelle
  0 siblings, 2 replies; 39+ messages in thread
From: Niklas Schnelle @ 2023-09-18 11:51 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Joerg Roedel, Will Deacon, Robin Murphy
  Cc: virtualization, iommu, linux-kernel, Niklas Schnelle

Hi All,

Previously I used virtio-iommu as a non-s390x test vehicle[0] for the
single queue flushing scheme introduced by my s390x DMA API conversion
series[1]. For this I modified virtio-iommu to a) use .iotlb_sync_map
and b) enable IOMMU_CAP_DEFERRED_FLUSH. It turned out that deferred
flush, and even just the introduction of ops->iotlb_sync_map, yields a
performance uplift[2] even with per-CPU queues. So here is a small
series containing just these two changes. It still applies on top of my
series[1], whose first patch, "iommu: Allow .iotlb_sync_map to fail and
handle s390's -ENOMEM return", enables ops->iotlb_sync_map to return
errors, which this series needs since virtio-iommu's sync can fail.
This also makes sure there is no merge conflict with that series.
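
For reference, after that prerequisite patch the callback prototype in
struct iommu_domain_ops looks like this (simplified sketch):

	int (*iotlb_sync_map)(struct iommu_domain *domain,
			      unsigned long iova, size_t size);

A non-zero return value propagates a failed sync back to the caller of
iommu_map().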

The code is also available on the b4/viommu-deferred-flush branch of my
kernel.org git repository[3].

Thanks,
Niklas

[0] https://lore.kernel.org/lkml/20230726111433.1105665-1-schnelle@linux.ibm.com/
[1] https://lore.kernel.org/lkml/20230825-dma_iommu-v12-0-4134455994a7@linux.ibm.com/
[2] https://lore.kernel.org/lkml/20230802123612.GA6142@myrica/
[3] https://git.kernel.org/pub/scm/linux/kernel/git/niks/linux.git/log/?h=b4/viommu-deferred-flush

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
Changes in v2:
- Check for viommu == NULL in viommu_sync_req() instead of for
  0 endpoints in ops (Jean-Philippe)
- Added comment where viommu can be NULL (me)
- Link to v1: https://lore.kernel.org/r/20230825-viommu-sync-map-v1-0-56bdcfaa29ec@linux.ibm.com

---
Niklas Schnelle (2):
      iommu/virtio: Make use of ops->iotlb_sync_map
      iommu/virtio: Add ops->flush_iotlb_all and enable deferred flush

 drivers/iommu/virtio-iommu.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)
---
base-commit: e165388f6d32dc8a49f49ef6e80584ad3def3d78
change-id: 20230825-viommu-sync-map-1bf0cc4fdc15

Best regards,
-- 
Niklas Schnelle
Linux on Z Development

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-18 11:51 [PATCH v2 0/2] iommu/virtio: Enable IOMMU_CAP_DEFERRED_FLUSH Niklas Schnelle
@ 2023-09-18 11:51 ` Niklas Schnelle
  2023-09-18 15:58     ` Jean-Philippe Brucker
  2023-09-18 16:37     ` Robin Murphy
  2023-09-18 11:51 ` [PATCH v2 2/2] iommu/virtio: Add ops->flush_iotlb_all and enable deferred flush Niklas Schnelle
  1 sibling, 2 replies; 39+ messages in thread
From: Niklas Schnelle @ 2023-09-18 11:51 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Joerg Roedel, Will Deacon, Robin Murphy
  Cc: virtualization, iommu, linux-kernel, Niklas Schnelle

Pull out the sync operation from viommu_map_pages() by implementing
ops->iotlb_sync_map. This allows the common IOMMU code to map multiple
elements of an sg with a single sync (see iommu_map_sg()). Furthermore,
it is also a requirement for IOMMU_CAP_DEFERRED_FLUSH.
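
For context, the batching in the common code looks roughly like the
following simplified sketch of iommu_map_sg() (not verbatim kernel
source; contiguous-element coalescing and some error handling omitted):

	ssize_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
			     struct scatterlist *sg, unsigned int nents,
			     int prot, gfp_t gfp)
	{
		const struct iommu_domain_ops *ops = domain->ops;
		size_t mapped = 0;
		unsigned int i;
		int ret;

		for (i = 0; i < nents; i++, sg = sg_next(sg)) {
			/* __iommu_map() maps without an intermediate sync */
			ret = __iommu_map(domain, iova + mapped, sg_phys(sg),
					  sg->length, prot, gfp);
			if (ret)
				goto out_err;
			mapped += sg->length;
		}

		/* one sync request covers the whole scatterlist */
		if (ops->iotlb_sync_map) {
			ret = ops->iotlb_sync_map(domain, iova, mapped);
			if (ret)
				goto out_err;
		}
		return mapped;

	out_err:
		iommu_unmap(domain, iova, mapped);
		return ret;
	}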

Link: https://lore.kernel.org/lkml/20230726111433.1105665-1-schnelle@linux.ibm.com/
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
 drivers/iommu/virtio-iommu.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 17dcd826f5c2..3649586f0e5c 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev *viommu)
 	int ret;
 	unsigned long flags;
 
+	/*
+	 * .iotlb_sync_map and .flush_iotlb_all may be called before the viommu
+	 * is initialized e.g. via iommu_create_device_direct_mappings()
+	 */
+	if (!viommu)
+		return 0;
 	spin_lock_irqsave(&viommu->request_lock, flags);
 	ret = __viommu_sync_req(viommu);
 	if (ret)
@@ -843,7 +849,7 @@ static int viommu_map_pages(struct iommu_domain *domain, unsigned long iova,
 			.flags		= cpu_to_le32(flags),
 		};
 
-		ret = viommu_send_req_sync(vdomain->viommu, &map, sizeof(map));
+		ret = viommu_add_req(vdomain->viommu, &map, sizeof(map));
 		if (ret) {
 			viommu_del_mappings(vdomain, iova, end);
 			return ret;
@@ -912,6 +918,14 @@ static void viommu_iotlb_sync(struct iommu_domain *domain,
 	viommu_sync_req(vdomain->viommu);
 }
 
+static int viommu_iotlb_sync_map(struct iommu_domain *domain,
+				 unsigned long iova, size_t size)
+{
+	struct viommu_domain *vdomain = to_viommu_domain(domain);
+
+	return viommu_sync_req(vdomain->viommu);
+}
+
 static void viommu_get_resv_regions(struct device *dev, struct list_head *head)
 {
 	struct iommu_resv_region *entry, *new_entry, *msi = NULL;
@@ -1058,6 +1072,7 @@ static struct iommu_ops viommu_ops = {
 		.unmap_pages		= viommu_unmap_pages,
 		.iova_to_phys		= viommu_iova_to_phys,
 		.iotlb_sync		= viommu_iotlb_sync,
+		.iotlb_sync_map		= viommu_iotlb_sync_map,
 		.free			= viommu_domain_free,
 	}
 };

-- 
2.39.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v2 2/2] iommu/virtio: Add ops->flush_iotlb_all and enable deferred flush
  2023-09-18 11:51 [PATCH v2 0/2] iommu/virtio: Enable IOMMU_CAP_DEFERRED_FLUSH Niklas Schnelle
  2023-09-18 11:51 ` [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map Niklas Schnelle
@ 2023-09-18 11:51 ` Niklas Schnelle
  2023-09-18 15:59     ` Jean-Philippe Brucker
  1 sibling, 1 reply; 39+ messages in thread
From: Niklas Schnelle @ 2023-09-18 11:51 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Joerg Roedel, Will Deacon, Robin Murphy
  Cc: virtualization, iommu, linux-kernel, Niklas Schnelle

Add ops->flush_iotlb_all operation to enable virtio-iommu for the
dma-iommu deferred flush scheme. This results in a significant increase
in performance in exchange for a window in which devices can still
access previously IOMMU mapped memory when running with
CONFIG_IOMMU_DEFAULT_DMA_LAZY. The previous strict behavior can be
achieved with iommu.strict=1 on the kernel command line or
CONFIG_IOMMU_DEFAULT_DMA_STRICT.
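
For reference, the dma-iommu flush-queue path that this enables looks
roughly like this (simplified from drivers/iommu/dma-iommu.c, not
verbatim):

	/*
	 * In lazy mode, unmapped IOVAs sit in a flush queue instead of
	 * being released immediately. When the queue fills up or its
	 * timer fires, a single flush_iotlb_all invalidates everything
	 * and only then do the queued IOVAs become reusable.
	 */
	static void fq_flush_iotlb(struct iommu_dma_cookie *cookie)
	{
		atomic64_inc(&cookie->fq_flush_start_cnt);
		cookie->fq_domain->ops->flush_iotlb_all(cookie->fq_domain);
		atomic64_inc(&cookie->fq_flush_finish_cnt);
	}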

Link: https://lore.kernel.org/lkml/20230802123612.GA6142@myrica/
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
 drivers/iommu/virtio-iommu.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 3649586f0e5c..4dd330fbcbdd 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -926,6 +926,13 @@ static int viommu_iotlb_sync_map(struct iommu_domain *domain,
 	return viommu_sync_req(vdomain->viommu);
 }
 
+static void viommu_flush_iotlb_all(struct iommu_domain *domain)
+{
+	struct viommu_domain *vdomain = to_viommu_domain(domain);
+
+	viommu_sync_req(vdomain->viommu);
+}
+
 static void viommu_get_resv_regions(struct device *dev, struct list_head *head)
 {
 	struct iommu_resv_region *entry, *new_entry, *msi = NULL;
@@ -1051,6 +1058,8 @@ static bool viommu_capable(struct device *dev, enum iommu_cap cap)
 	switch (cap) {
 	case IOMMU_CAP_CACHE_COHERENCY:
 		return true;
+	case IOMMU_CAP_DEFERRED_FLUSH:
+		return true;
 	default:
 		return false;
 	}
@@ -1071,6 +1080,7 @@ static struct iommu_ops viommu_ops = {
 		.map_pages		= viommu_map_pages,
 		.unmap_pages		= viommu_unmap_pages,
 		.iova_to_phys		= viommu_iova_to_phys,
+		.flush_iotlb_all	= viommu_flush_iotlb_all,
 		.iotlb_sync		= viommu_iotlb_sync,
 		.iotlb_sync_map		= viommu_iotlb_sync_map,
 		.free			= viommu_domain_free,

-- 
2.39.2


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-18 11:51 ` [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map Niklas Schnelle
@ 2023-09-18 15:58     ` Jean-Philippe Brucker
  2023-09-18 16:37     ` Robin Murphy
  1 sibling, 0 replies; 39+ messages in thread
From: Jean-Philippe Brucker @ 2023-09-18 15:58 UTC (permalink / raw)
  To: Niklas Schnelle
  Cc: Will Deacon, Joerg Roedel, linux-kernel, virtualization, iommu,
	Robin Murphy

On Mon, Sep 18, 2023 at 01:51:43PM +0200, Niklas Schnelle wrote:
> Pull out the sync operation from viommu_map_pages() by implementing
> ops->iotlb_sync_map. This allows the common IOMMU code to map multiple
> elements of an sg with a single sync (see iommu_map_sg()). Furthermore,
> it is also a requirement for IOMMU_CAP_DEFERRED_FLUSH.
> 
> Link: https://lore.kernel.org/lkml/20230726111433.1105665-1-schnelle@linux.ibm.com/
> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>

Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

This must be merged after "iommu/dma: s390 DMA API conversion and
optimized IOTLB flushing" because of the updated iotlb_sync_map()
prototype.

Thanks,
Jean

> ---
>  drivers/iommu/virtio-iommu.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index 17dcd826f5c2..3649586f0e5c 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev *viommu)
>  	int ret;
>  	unsigned long flags;
>  
> +	/*
> +	 * .iotlb_sync_map and .flush_iotlb_all may be called before the viommu
> +	 * is initialized e.g. via iommu_create_device_direct_mappings()
> +	 */
> +	if (!viommu)
> +		return 0;
>  	spin_lock_irqsave(&viommu->request_lock, flags);
>  	ret = __viommu_sync_req(viommu);
>  	if (ret)
> @@ -843,7 +849,7 @@ static int viommu_map_pages(struct iommu_domain *domain, unsigned long iova,
>  			.flags		= cpu_to_le32(flags),
>  		};
>  
> -		ret = viommu_send_req_sync(vdomain->viommu, &map, sizeof(map));
> +		ret = viommu_add_req(vdomain->viommu, &map, sizeof(map));
>  		if (ret) {
>  			viommu_del_mappings(vdomain, iova, end);
>  			return ret;
> @@ -912,6 +918,14 @@ static void viommu_iotlb_sync(struct iommu_domain *domain,
>  	viommu_sync_req(vdomain->viommu);
>  }
>  
> +static int viommu_iotlb_sync_map(struct iommu_domain *domain,
> +				 unsigned long iova, size_t size)
> +{
> +	struct viommu_domain *vdomain = to_viommu_domain(domain);
> +
> +	return viommu_sync_req(vdomain->viommu);
> +}
> +
>  static void viommu_get_resv_regions(struct device *dev, struct list_head *head)
>  {
>  	struct iommu_resv_region *entry, *new_entry, *msi = NULL;
> @@ -1058,6 +1072,7 @@ static struct iommu_ops viommu_ops = {
>  		.unmap_pages		= viommu_unmap_pages,
>  		.iova_to_phys		= viommu_iova_to_phys,
>  		.iotlb_sync		= viommu_iotlb_sync,
> +		.iotlb_sync_map		= viommu_iotlb_sync_map,
>  		.free			= viommu_domain_free,
>  	}
>  };
> 
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 2/2] iommu/virtio: Add ops->flush_iotlb_all and enable deferred flush
  2023-09-18 11:51 ` [PATCH v2 2/2] iommu/virtio: Add ops->flush_iotlb_all and enable deferred flush Niklas Schnelle
@ 2023-09-18 15:59     ` Jean-Philippe Brucker
  0 siblings, 0 replies; 39+ messages in thread
From: Jean-Philippe Brucker @ 2023-09-18 15:59 UTC (permalink / raw)
  To: Niklas Schnelle
  Cc: Will Deacon, Joerg Roedel, linux-kernel, virtualization, iommu,
	Robin Murphy

On Mon, Sep 18, 2023 at 01:51:44PM +0200, Niklas Schnelle wrote:
> Add ops->flush_iotlb_all operation to enable virtio-iommu for the
> dma-iommu deferred flush scheme. This results in a significant increase
> in performance in exchange for a window in which devices can still
> access previously IOMMU mapped memory when running with
> CONFIG_IOMMU_DEFAULT_DMA_LAZY. The previous strict behavior can be
> achieved with iommu.strict=1 on the kernel command line or
> CONFIG_IOMMU_DEFAULT_DMA_STRICT.
> 
> Link: https://lore.kernel.org/lkml/20230802123612.GA6142@myrica/
> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>

Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

> ---
>  drivers/iommu/virtio-iommu.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index 3649586f0e5c..4dd330fbcbdd 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -926,6 +926,13 @@ static int viommu_iotlb_sync_map(struct iommu_domain *domain,
>  	return viommu_sync_req(vdomain->viommu);
>  }
>  
> +static void viommu_flush_iotlb_all(struct iommu_domain *domain)
> +{
> +	struct viommu_domain *vdomain = to_viommu_domain(domain);
> +
> +	viommu_sync_req(vdomain->viommu);
> +}
> +
>  static void viommu_get_resv_regions(struct device *dev, struct list_head *head)
>  {
>  	struct iommu_resv_region *entry, *new_entry, *msi = NULL;
> @@ -1051,6 +1058,8 @@ static bool viommu_capable(struct device *dev, enum iommu_cap cap)
>  	switch (cap) {
>  	case IOMMU_CAP_CACHE_COHERENCY:
>  		return true;
> +	case IOMMU_CAP_DEFERRED_FLUSH:
> +		return true;
>  	default:
>  		return false;
>  	}
> @@ -1071,6 +1080,7 @@ static struct iommu_ops viommu_ops = {
>  		.map_pages		= viommu_map_pages,
>  		.unmap_pages		= viommu_unmap_pages,
>  		.iova_to_phys		= viommu_iova_to_phys,
> +		.flush_iotlb_all	= viommu_flush_iotlb_all,
>  		.iotlb_sync		= viommu_iotlb_sync,
>  		.iotlb_sync_map		= viommu_iotlb_sync_map,
>  		.free			= viommu_domain_free,
> 
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-18 11:51 ` [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map Niklas Schnelle
@ 2023-09-18 16:37     ` Robin Murphy
  2023-09-18 16:37     ` Robin Murphy
  1 sibling, 0 replies; 39+ messages in thread
From: Robin Murphy @ 2023-09-18 16:37 UTC (permalink / raw)
  To: Niklas Schnelle, Jean-Philippe Brucker, Joerg Roedel, Will Deacon
  Cc: virtualization, iommu, linux-kernel

On 2023-09-18 12:51, Niklas Schnelle wrote:
> Pull out the sync operation from viommu_map_pages() by implementing
> ops->iotlb_sync_map. This allows the common IOMMU code to map multiple
> elements of an sg with a single sync (see iommu_map_sg()). Furthermore,
> it is also a requirement for IOMMU_CAP_DEFERRED_FLUSH.

Is it really a requirement? Deferred flush only deals with unmapping. Or 
are you just trying to say that it's not too worthwhile to try doing 
more for unmapping performance while obvious mapping performance is 
still left on the table?

> Link: https://lore.kernel.org/lkml/20230726111433.1105665-1-schnelle@linux.ibm.com/
> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
> ---
>   drivers/iommu/virtio-iommu.c | 17 ++++++++++++++++-
>   1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index 17dcd826f5c2..3649586f0e5c 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev *viommu)
>   	int ret;
>   	unsigned long flags;
>   
> +	/*
> +	 * .iotlb_sync_map and .flush_iotlb_all may be called before the viommu
> +	 * is initialized e.g. via iommu_create_device_direct_mappings()
> +	 */
> +	if (!viommu)
> +		return 0;

Minor nit: I'd be inclined to make that check explicitly in the places 
where it definitely is expected, rather than allowing *any* sync to 
silently do nothing if called incorrectly. Plus then they could use 
vdomain->nr_endpoints for consistency with the equivalent checks 
elsewhere (it did take me a moment to figure out how we could get to 
.iotlb_sync_map with a NULL viommu without viommu_map_pages() blowing up 
first...)
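
I.e. something like this untested sketch:

	static int viommu_iotlb_sync_map(struct iommu_domain *domain,
					 unsigned long iova, size_t size)
	{
		struct viommu_domain *vdomain = to_viommu_domain(domain);

		/* A domain with no endpoints attached has nothing to sync */
		if (!vdomain->nr_endpoints)
			return 0;
		return viommu_sync_req(vdomain->viommu);
	}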

Thanks,
Robin.

>   	spin_lock_irqsave(&viommu->request_lock, flags);
>   	ret = __viommu_sync_req(viommu);
>   	if (ret)
> @@ -843,7 +849,7 @@ static int viommu_map_pages(struct iommu_domain *domain, unsigned long iova,
>   			.flags		= cpu_to_le32(flags),
>   		};
>   
> -		ret = viommu_send_req_sync(vdomain->viommu, &map, sizeof(map));
> +		ret = viommu_add_req(vdomain->viommu, &map, sizeof(map));
>   		if (ret) {
>   			viommu_del_mappings(vdomain, iova, end);
>   			return ret;
> @@ -912,6 +918,14 @@ static void viommu_iotlb_sync(struct iommu_domain *domain,
>   	viommu_sync_req(vdomain->viommu);
>   }
>   
> +static int viommu_iotlb_sync_map(struct iommu_domain *domain,
> +				 unsigned long iova, size_t size)
> +{
> +	struct viommu_domain *vdomain = to_viommu_domain(domain);
> +
> +	return viommu_sync_req(vdomain->viommu);
> +}
> +
>   static void viommu_get_resv_regions(struct device *dev, struct list_head *head)
>   {
>   	struct iommu_resv_region *entry, *new_entry, *msi = NULL;
> @@ -1058,6 +1072,7 @@ static struct iommu_ops viommu_ops = {
>   		.unmap_pages		= viommu_unmap_pages,
>   		.iova_to_phys		= viommu_iova_to_phys,
>   		.iotlb_sync		= viommu_iotlb_sync,
> +		.iotlb_sync_map		= viommu_iotlb_sync_map,
>   		.free			= viommu_domain_free,
>   	}
>   };
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-18 16:37     ` Robin Murphy
  (?)
@ 2023-09-19  8:00     ` Niklas Schnelle
  -1 siblings, 0 replies; 39+ messages in thread
From: Niklas Schnelle @ 2023-09-19  8:00 UTC (permalink / raw)
  To: Robin Murphy, Jean-Philippe Brucker, Joerg Roedel, Will Deacon
  Cc: virtualization, iommu, linux-kernel

On Mon, 2023-09-18 at 17:37 +0100, Robin Murphy wrote:
> On 2023-09-18 12:51, Niklas Schnelle wrote:
> > Pull out the sync operation from viommu_map_pages() by implementing
> > ops->iotlb_sync_map. This allows the common IOMMU code to map multiple
> > elements of an sg with a single sync (see iommu_map_sg()). Furthermore,
> > it is also a requirement for IOMMU_CAP_DEFERRED_FLUSH.
> 
> Is it really a requirement? Deferred flush only deals with unmapping. Or 
> are you just trying to say that it's not too worthwhile to try doing 
> more for unmapping performance while obvious mapping performance is 
> still left on the table?
> 

You're right, there is no hard requirement. I somehow thought that
iommu_create_device_direct_mappings() relied on it because it does
flush_iotlb_all() and iommu_map(), but that doesn't seem to be true. If
you want I can resend with the last sentence removed.

> > Link: https://lore.kernel.org/lkml/20230726111433.1105665-1-schnelle@linux.ibm.com/
> > Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
> > ---
> >   drivers/iommu/virtio-iommu.c | 17 ++++++++++++++++-
> >   1 file changed, 16 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> > index 17dcd826f5c2..3649586f0e5c 100644
> > --- a/drivers/iommu/virtio-iommu.c
> > +++ b/drivers/iommu/virtio-iommu.c
> > @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev *viommu)
> >   	int ret;
> >   	unsigned long flags;
> >   
> > +	/*
> > +	 * .iotlb_sync_map and .flush_iotlb_all may be called before the viommu
> > +	 * is initialized e.g. via iommu_create_device_direct_mappings()
> > +	 */
> > +	if (!viommu)
> > +		return 0;
> 
> Minor nit: I'd be inclined to make that check explicitly in the places 
> where it definitely is expected, rather than allowing *any* sync to 
> silently do nothing if called incorrectly. Plus then they could use 
> vdomain->nr_endpoints for consistency with the equivalent checks 
> elsewhere (it did take me a moment to figure out how we could get to 
> .iotlb_sync_map with a NULL viommu without viommu_map_pages() blowing up 
> first...)
> 
> Thanks,
> Robin.

That's what I had in v1. I think this is a matter of taste; Jean-
Philippe pointed me to moving the check into viommu_sync_req(), and I
added a comment because it really is not entirely obvious.

> 
> >   	spin_lock_irqsave(&viommu->request_lock, flags);
> >   	ret = __viommu_sync_req(viommu);
> >   	if (ret)
> > @@ -843,7 +849,7 @@ static int viommu_map_pages(struct iommu_domain *domain, unsigned long iova,
> >   			.flags		= cpu_to_le32(flags),
> >   		};
> >   
> > -		ret = viommu_send_req_sync(vdomain->viommu, &map, sizeof(map));
> > +		ret = viommu_add_req(vdomain->viommu, &map, sizeof(map));
> >   		if (ret) {
> >   			viommu_del_mappings(vdomain, iova, end);
> >   			return ret;
> > @@ -912,6 +918,14 @@ static void viommu_iotlb_sync(struct iommu_domain *domain,
> >   	viommu_sync_req(vdomain->viommu);
> >   }
> >   
> > +static int viommu_iotlb_sync_map(struct iommu_domain *domain,
> > +				 unsigned long iova, size_t size)
> > +{
> > +	struct viommu_domain *vdomain = to_viommu_domain(domain);
> > +
> > +	return viommu_sync_req(vdomain->viommu);
> > +}
> > +
> >   static void viommu_get_resv_regions(struct device *dev, struct list_head *head)
> >   {
> >   	struct iommu_resv_region *entry, *new_entry, *msi = NULL;
> > @@ -1058,6 +1072,7 @@ static struct iommu_ops viommu_ops = {
> >   		.unmap_pages		= viommu_unmap_pages,
> >   		.iova_to_phys		= viommu_iova_to_phys,
> >   		.iotlb_sync		= viommu_iotlb_sync,
> > +		.iotlb_sync_map		= viommu_iotlb_sync_map,
> >   		.free			= viommu_domain_free,
> >   	}
> >   };
> > 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-18 16:37     ` Robin Murphy
@ 2023-09-19  8:15       ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 39+ messages in thread
From: Jean-Philippe Brucker @ 2023-09-19  8:15 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Niklas Schnelle, Joerg Roedel, Will Deacon, virtualization,
	iommu, linux-kernel

On Mon, Sep 18, 2023 at 05:37:47PM +0100, Robin Murphy wrote:
> > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> > index 17dcd826f5c2..3649586f0e5c 100644
> > --- a/drivers/iommu/virtio-iommu.c
> > +++ b/drivers/iommu/virtio-iommu.c
> > @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev *viommu)
> >   	int ret;
> >   	unsigned long flags;
> > +	/*
> > +	 * .iotlb_sync_map and .flush_iotlb_all may be called before the viommu
> > +	 * is initialized e.g. via iommu_create_device_direct_mappings()
> > +	 */
> > +	if (!viommu)
> > +		return 0;
> 
> Minor nit: I'd be inclined to make that check explicitly in the places where
> it definitely is expected, rather than allowing *any* sync to silently do
> nothing if called incorrectly. Plus then they could use
> vdomain->nr_endpoints for consistency with the equivalent checks elsewhere
> (it did take me a moment to figure out how we could get to .iotlb_sync_map
> with a NULL viommu without viommu_map_pages() blowing up first...)

They're not strictly equivalent: this check works around a temporary issue
with the IOMMU core, which calls map/unmap before the domain is finalized.
Once we merge domain_alloc() and finalize(), then this check disappears,
but we still need to test nr_endpoints in map/unmap to handle detached
domains (and we still need to fix the synchronization of nr_endpoints
against attach/detach). That's why I preferred doing this on viommu and
keeping it in one place.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-19  8:15       ` Jean-Philippe Brucker
@ 2023-09-19  8:28         ` Robin Murphy
  -1 siblings, 0 replies; 39+ messages in thread
From: Robin Murphy @ 2023-09-19  8:28 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Niklas Schnelle, Joerg Roedel, Will Deacon, virtualization,
	iommu, linux-kernel

On 2023-09-19 09:15, Jean-Philippe Brucker wrote:
> On Mon, Sep 18, 2023 at 05:37:47PM +0100, Robin Murphy wrote:
>>> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
>>> index 17dcd826f5c2..3649586f0e5c 100644
>>> --- a/drivers/iommu/virtio-iommu.c
>>> +++ b/drivers/iommu/virtio-iommu.c
>>> @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev *viommu)
>>>    	int ret;
>>>    	unsigned long flags;
>>> +	/*
>>> +	 * .iotlb_sync_map and .flush_iotlb_all may be called before the viommu
>>> +	 * is initialized e.g. via iommu_create_device_direct_mappings()
>>> +	 */
>>> +	if (!viommu)
>>> +		return 0;
>>
>> Minor nit: I'd be inclined to make that check explicitly in the places where
>> it definitely is expected, rather than allowing *any* sync to silently do
>> nothing if called incorrectly. Plus then they could use
>> vdomain->nr_endpoints for consistency with the equivalent checks elsewhere
>> (it did take me a moment to figure out how we could get to .iotlb_sync_map
>> with a NULL viommu without viommu_map_pages() blowing up first...)
> 
> They're not strictly equivalent: this check works around a temporary issue
> with the IOMMU core, which calls map/unmap before the domain is finalized.
> Once we merge domain_alloc() and finalize(), then this check disappears,
> but we still need to test nr_endpoints in map/unmap to handle detached
> domains (and we still need to fix the synchronization of nr_endpoints
> against attach/detach). That's why I preferred doing this on viommu and
> keeping it in one place.

Fair enough - it just seems to me that in both cases it's a detached 
domain, so its previous history of whether it's ever been otherwise or 
not shouldn't matter. Even once viommu is initialised, does it really 
make sense to send sync commands for a mapping on a detached domain 
where we haven't actually sent any map/unmap commands?

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-19  8:15       ` Jean-Philippe Brucker
@ 2023-09-19 14:46         ` Jason Gunthorpe
  -1 siblings, 0 replies; 39+ messages in thread
From: Jason Gunthorpe @ 2023-09-19 14:46 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Robin Murphy, Niklas Schnelle, Joerg Roedel, Will Deacon,
	virtualization, iommu, linux-kernel

On Tue, Sep 19, 2023 at 09:15:19AM +0100, Jean-Philippe Brucker wrote:
> On Mon, Sep 18, 2023 at 05:37:47PM +0100, Robin Murphy wrote:
> > > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> > > index 17dcd826f5c2..3649586f0e5c 100644
> > > --- a/drivers/iommu/virtio-iommu.c
> > > +++ b/drivers/iommu/virtio-iommu.c
> > > @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev *viommu)
> > >   	int ret;
> > >   	unsigned long flags;
> > > +	/*
> > > +	 * .iotlb_sync_map and .flush_iotlb_all may be called before the viommu
> > > +	 * is initialized e.g. via iommu_create_device_direct_mappings()
> > > +	 */
> > > +	if (!viommu)
> > > +		return 0;
> > 
> > Minor nit: I'd be inclined to make that check explicitly in the places where
> > it definitely is expected, rather than allowing *any* sync to silently do
> > nothing if called incorrectly. Plus then they could use
> > vdomain->nr_endpoints for consistency with the equivalent checks elsewhere
> > (it did take me a moment to figure out how we could get to .iotlb_sync_map
> > with a NULL viommu without viommu_map_pages() blowing up first...)

This makes more sense to me

Ultimately this driver should reach a point where every iommu_domain
always has a non-null domain->viommu because it will be set during
alloc.
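
E.g. a hypothetical dev-aware allocation path (sketch only, assuming a
future domain_alloc_paging()-style conversion):

	static struct iommu_domain *viommu_domain_alloc_paging(struct device *dev)
	{
		struct viommu_endpoint *vdev = dev_iommu_priv_get(dev);
		struct viommu_domain *vdomain;

		vdomain = kzalloc(sizeof(*vdomain), GFP_KERNEL);
		if (!vdomain)
			return NULL;

		/* viommu is known at alloc time and can never be NULL here */
		vdomain->viommu = vdev->viommu;
		return &vdomain->domain;
	}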

But it can still have nr_endpoints == 0, doesn't it make sense to
avoid sync in this case?

(btw this driver is missing locking around vdomain->nr_endpoints)

> They're not strictly equivalent: this check works around a temporary issue
> with the IOMMU core, which calls map/unmap before the domain is
> finalized.

Where? The above points to iommu_create_device_direct_mappings() but
it doesn't because the pgsize_bitmap == 0:

static int iommu_create_device_direct_mappings(struct iommu_domain *domain,
					       struct device *dev)
{
	struct iommu_resv_region *entry;
	struct list_head mappings;
	unsigned long pg_size;
	int ret = 0;

	pg_size = domain->pgsize_bitmap ? 1UL << __ffs(domain->pgsize_bitmap) : 0;
	INIT_LIST_HEAD(&mappings);

	if (WARN_ON_ONCE(iommu_is_dma_domain(domain) && !pg_size))

Indeed, the driver should be failing all maps until the domain is
finalized because it has no way to check the IOVA matches the eventual
aperture.

Jason

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-19  8:28         ` Robin Murphy
@ 2023-09-22  7:52           ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 39+ messages in thread
From: Jean-Philippe Brucker @ 2023-09-22  7:52 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Niklas Schnelle, Joerg Roedel, Will Deacon, virtualization,
	iommu, linux-kernel

On Tue, Sep 19, 2023 at 09:28:08AM +0100, Robin Murphy wrote:
> On 2023-09-19 09:15, Jean-Philippe Brucker wrote:
> > On Mon, Sep 18, 2023 at 05:37:47PM +0100, Robin Murphy wrote:
> > > > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> > > > index 17dcd826f5c2..3649586f0e5c 100644
> > > > --- a/drivers/iommu/virtio-iommu.c
> > > > +++ b/drivers/iommu/virtio-iommu.c
> > > > @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev *viommu)
> > > >    	int ret;
> > > >    	unsigned long flags;
> > > > +	/*
> > > > +	 * .iotlb_sync_map and .flush_iotlb_all may be called before the viommu
> > > > +	 * is initialized e.g. via iommu_create_device_direct_mappings()
> > > > +	 */
> > > > +	if (!viommu)
> > > > +		return 0;
> > > 
> > > Minor nit: I'd be inclined to make that check explicitly in the places where
> > > it definitely is expected, rather than allowing *any* sync to silently do
> > > nothing if called incorrectly. Plus then they could use
> > > vdomain->nr_endpoints for consistency with the equivalent checks elsewhere
> > > (it did take me a moment to figure out how we could get to .iotlb_sync_map
> > > with a NULL viommu without viommu_map_pages() blowing up first...)
> > 
> > They're not strictly equivalent: this check works around a temporary issue
> > with the IOMMU core, which calls map/unmap before the domain is finalized.
> > Once we merge domain_alloc() and finalize(), then this check disappears,
> > but we still need to test nr_endpoints in map/unmap to handle detached
> > domains (and we still need to fix the synchronization of nr_endpoints
> > against attach/detach). That's why I preferred doing this on viommu and
> > keeping it in one place.
> 
> Fair enough - it just seems to me that in both cases it's a detached domain,
> so its previous history of whether it's ever been otherwise or not shouldn't
> matter. Even once viommu is initialised, does it really make sense to send
> sync commands for a mapping on a detached domain where we haven't actually
> sent any map/unmap commands?

If no requests were added by map/unmap, then viommu_sync_req() is
essentially a nop because virtio doesn't use sync commands (and
virtqueue_kick() only kicks the host when the queue's not empty, if I
remember correctly). It still does a bit of work so is less efficient than
a preliminary check on nr_endpoints, but it feels nicer to streamline the
case where the domain is attached, since map/unmap on detached domains
happens rarely, if ever.

Either is fine by me. An extra test won't make much difference performance
wise, and I guess will look less confusing. Niklas, do you mind resending
the version with nr_endpoints check (and updated commit messages)?

Thanks,
Jean

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-19 14:46         ` Jason Gunthorpe
@ 2023-09-22  7:57           ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 39+ messages in thread
From: Jean-Philippe Brucker @ 2023-09-22  7:57 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Robin Murphy, Niklas Schnelle, Joerg Roedel, Will Deacon,
	virtualization, iommu, linux-kernel

On Tue, Sep 19, 2023 at 11:46:49AM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 19, 2023 at 09:15:19AM +0100, Jean-Philippe Brucker wrote:
> > On Mon, Sep 18, 2023 at 05:37:47PM +0100, Robin Murphy wrote:
> > > > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> > > > index 17dcd826f5c2..3649586f0e5c 100644
> > > > --- a/drivers/iommu/virtio-iommu.c
> > > > +++ b/drivers/iommu/virtio-iommu.c
> > > > @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev *viommu)
> > > >   	int ret;
> > > >   	unsigned long flags;
> > > > +	/*
> > > > +	 * .iotlb_sync_map and .flush_iotlb_all may be called before the viommu
> > > > +	 * is initialized e.g. via iommu_create_device_direct_mappings()
> > > > +	 */
> > > > +	if (!viommu)
> > > > +		return 0;
> > > 
> > > Minor nit: I'd be inclined to make that check explicitly in the places where
> > > it definitely is expected, rather than allowing *any* sync to silently do
> > > nothing if called incorrectly. Plus then they could use
> > > vdomain->nr_endpoints for consistency with the equivalent checks elsewhere
> > > (it did take me a moment to figure out how we could get to .iotlb_sync_map
> > > with a NULL viommu without viommu_map_pages() blowing up first...)
> 
> This makes more sense to me
> 
> Ultimately this driver should reach a point where every iommu_domain
> always has a non-null domain->viommu because it will be set during
> alloc.
> 
> But it can still have nr_endpoints == 0, doesn't it make sense to
> avoid sync in this case?
> 
> (btw this driver is missing locking around vdomain->nr_endpoints)

Yes, that's on my list

> 
> > They're not strictly equivalent: this check works around a temporary issue
> > with the IOMMU core, which calls map/unmap before the domain is
> > finalized.
> 
> Where? The above points to iommu_create_device_direct_mappings() but
> it doesn't because the pgsize_bitmap == 0:

__iommu_domain_alloc() sets pgsize_bitmap in this case:

        /*
         * If not already set, assume all sizes by default; the driver
         * may override this later
         */
        if (!domain->pgsize_bitmap)
                domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;

Thanks,
Jean

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-22  7:57           ` Jean-Philippe Brucker
@ 2023-09-22 12:41             ` Jason Gunthorpe
  -1 siblings, 0 replies; 39+ messages in thread
From: Jason Gunthorpe @ 2023-09-22 12:41 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Robin Murphy, Niklas Schnelle, Joerg Roedel, Will Deacon,
	virtualization, iommu, linux-kernel

On Fri, Sep 22, 2023 at 08:57:19AM +0100, Jean-Philippe Brucker wrote:
> > > They're not strictly equivalent: this check works around a temporary issue
> > > with the IOMMU core, which calls map/unmap before the domain is
> > > finalized.
> > 
> > Where? The above points to iommu_create_device_direct_mappings() but
> > it doesn't because the pgsize_bitmap == 0:
> 
> __iommu_domain_alloc() sets pgsize_bitmap in this case:
> 
>         /*
>          * If not already set, assume all sizes by default; the driver
>          * may override this later
>          */
>         if (!domain->pgsize_bitmap)
>                 domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;

Drivers shouldn't do that.

The core code was fixed to try again with mapping reserved regions to
support these kinds of drivers.

Jason

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-22 12:41             ` Jason Gunthorpe
@ 2023-09-22 13:13               ` Robin Murphy
  -1 siblings, 0 replies; 39+ messages in thread
From: Robin Murphy @ 2023-09-22 13:13 UTC (permalink / raw)
  To: Jason Gunthorpe, Jean-Philippe Brucker
  Cc: Niklas Schnelle, Joerg Roedel, Will Deacon, virtualization,
	iommu, linux-kernel

On 22/09/2023 1:41 pm, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 08:57:19AM +0100, Jean-Philippe Brucker wrote:
>>>> They're not strictly equivalent: this check works around a temporary issue
>>>> with the IOMMU core, which calls map/unmap before the domain is
>>>> finalized.
>>>
>>> Where? The above points to iommu_create_device_direct_mappings() but
>>> it doesn't because the pgsize_bitmap == 0:
>>
>> __iommu_domain_alloc() sets pgsize_bitmap in this case:
>>
>>          /*
>>           * If not already set, assume all sizes by default; the driver
>>           * may override this later
>>           */
>>          if (!domain->pgsize_bitmap)
>>                  domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;
> 
> Drivers shouldn't do that.
> 
> The core code was fixed to try again with mapping reserved regions to
> support these kinds of drivers.

This is still the "normal" code path, really; I think it's only AMD that 
started initialising the domain bitmap "early" and warranted making it 
conditional. However we *do* ultimately want all the drivers to do the 
same, so we can get rid of ops->pgsize_bitmap, because it's already 
pretty redundant and meaningless in the face of per-domain pagetable 
formats.
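
(Indeed the core map path only ever consults the per-domain copy; from
memory, __iommu_map() has something like:

	unsigned int min_pagesz = 1 << __ffs(domain->pgsize_bitmap);

so the ops-level bitmap is really just an alloc-time default.)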

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-22 13:13               ` Robin Murphy
@ 2023-09-22 16:27                 ` Jason Gunthorpe
  -1 siblings, 0 replies; 39+ messages in thread
From: Jason Gunthorpe @ 2023-09-22 16:27 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jean-Philippe Brucker, Niklas Schnelle, Joerg Roedel,
	Will Deacon, virtualization, iommu, linux-kernel

On Fri, Sep 22, 2023 at 02:13:18PM +0100, Robin Murphy wrote:
> On 22/09/2023 1:41 pm, Jason Gunthorpe wrote:
> > On Fri, Sep 22, 2023 at 08:57:19AM +0100, Jean-Philippe Brucker wrote:
> > > > > They're not strictly equivalent: this check works around a temporary issue
> > > > > with the IOMMU core, which calls map/unmap before the domain is
> > > > > finalized.
> > > > 
> > > > Where? The above points to iommu_create_device_direct_mappings() but
> > > > it doesn't because the pgsize_bitmap == 0:
> > > 
> > > __iommu_domain_alloc() sets pgsize_bitmap in this case:
> > > 
> > >          /*
> > >           * If not already set, assume all sizes by default; the driver
> > >           * may override this later
> > >           */
> > >          if (!domain->pgsize_bitmap)
> > >                  domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;
> > 
> > Drivers shouldn't do that.
> > 
> > The core code was fixed to try again with mapping reserved regions to
> > support these kinds of drivers.
> 
> This is still the "normal" code path, really; I think it's only AMD that
> started initialising the domain bitmap "early" and warranted making it
> conditional.

My main point was that iommu_create_device_direct_mappings() should
fail for unfinalized domains; setting pgsize_bitmap to allow it to
succeed is not a nice hack, and is not necessary now.

What do you think about something like this to replace
iommu_create_device_direct_mappings(), that does enforce things
properly?


static int resv_cmp(void *priv, const struct list_head *llhs,
		    const struct list_head *lrhs)
{
	struct iommu_resv_region *lhs = list_entry(llhs, struct iommu_resv_region, list);
	struct iommu_resv_region *rhs = list_entry(lrhs, struct iommu_resv_region, list);

	if (lhs->start == rhs->start)
		return 0;
	if (lhs->start < rhs->start)
		return -1;
	return 1;
}

static int iommu_create_device_direct_mappings(struct iommu_domain *domain,
					       struct device *dev)
{
	struct iommu_resv_region *entry;
	struct iommu_resv_region *tmp;
	struct list_head mappings;
	struct list_head direct;
	phys_addr_t cur = 0;
	int ret = 0;

	INIT_LIST_HEAD(&mappings);
	INIT_LIST_HEAD(&direct);

	iommu_get_resv_regions(dev, &mappings);

	list_for_each_entry_safe(entry, tmp, &mappings, list) {
		if (entry->type == IOMMU_RESV_DIRECT)
			dev->iommu->require_direct = 1;

		if ((domain->type & __IOMMU_DOMAIN_PAGING) &&
		    (entry->type == IOMMU_RESV_DIRECT ||
		     entry->type == IOMMU_RESV_DIRECT_RELAXABLE)) {
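			/* The region must lie entirely inside the domain's aperture */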
			if (domain->geometry.aperture_start > entry->start ||
			    domain->geometry.aperture_end == 0 ||
			    (domain->geometry.aperture_end - 1) <
				    (entry->start + entry->length - 1)) {
				ret = -EINVAL;
				goto out;
			}
			list_move(&entry->list, &direct);
		}
	}

	if (list_empty(&direct))
		goto out;

	/*
	 * FW can have overlapping ranges, sort the list by start address
	 * and map any duplicated IOVA only once.
	 */
	list_sort(NULL, &direct, resv_cmp);
	list_for_each_entry(entry, &direct, list) {
		phys_addr_t start_pfn = entry->start / PAGE_SIZE;
		phys_addr_t last_pfn =
			(entry->length - 1 + entry->start) / PAGE_SIZE;

		if (start_pfn < cur)
			start_pfn = cur;

		if (start_pfn <= last_pfn) {
			ret = iommu_map(domain, start_pfn * PAGE_SIZE,
					start_pfn * PAGE_SIZE,
					(last_pfn - start_pfn + 1) * PAGE_SIZE,
					entry->prot, GFP_KERNEL);
			if (ret)
				goto out;
			cur = last_pfn + 1;
		}
	}

out:
	list_splice(&direct, &mappings);
	iommu_put_resv_regions(dev, &mappings);

	return ret;
}

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-22 16:27                 ` Jason Gunthorpe
@ 2023-09-22 18:07                   ` Robin Murphy
  -1 siblings, 0 replies; 39+ messages in thread
From: Robin Murphy @ 2023-09-22 18:07 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Jean-Philippe Brucker, Niklas Schnelle, Joerg Roedel,
	Will Deacon, virtualization, iommu, linux-kernel

On 22/09/2023 5:27 pm, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 02:13:18PM +0100, Robin Murphy wrote:
>> On 22/09/2023 1:41 pm, Jason Gunthorpe wrote:
>>> On Fri, Sep 22, 2023 at 08:57:19AM +0100, Jean-Philippe Brucker wrote:
>>>>>> They're not strictly equivalent: this check works around a temporary issue
>>>>>> with the IOMMU core, which calls map/unmap before the domain is
>>>>>> finalized.
>>>>>
>>>>> Where? The above points to iommu_create_device_direct_mappings() but
>>>>> it doesn't because the pgsize_bitmap == 0:
>>>>
>>>> __iommu_domain_alloc() sets pgsize_bitmap in this case:
>>>>
>>>>           /*
>>>>            * If not already set, assume all sizes by default; the driver
>>>>            * may override this later
>>>>            */
>>>>           if (!domain->pgsize_bitmap)
>>>>                   domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;
>>>
>>> Drivers shouldn't do that.
>>>
>>> The core code was fixed to try again with mapping reserved regions to
>>> support these kinds of drivers.
>>
>> This is still the "normal" code path, really; I think it's only AMD that
>> started initialising the domain bitmap "early" and warranted making it
>> conditional.
> 
> My main point was that iommu_create_device_direct_mappings() should
> fail for unfinalized domains; setting pgsize_bitmap to allow it to
> succeed is not a nice hack, and is not necessary now.

Sure, but it's the whole "unfinalised domains" and rewriting 
domain->pgsize_bitmap after attach thing that is itself the massive 
hack. AMD doesn't do that, and doesn't need to; it knows the appropriate 
format at allocation time and can quite happily return a fully working 
domain which allows map before attach, but the old ops->pgsize_bitmap 
mechanism fundamentally doesn't work for multiple formats with different 
page sizes. The only thing I'd accuse it of doing wrong is the weird 
half-and-half thing of having one format as a default via one mechanism, 
and the other as an override through the other, rather than setting both 
explicitly.

virtio isn't setting ops->pgsize_bitmap for the sake of direct mappings 
either; it sets it once it's discovered any instance, since apparently 
it's assuming that all instances must support identical page sizes, and 
thus once it's seen one it can work "normally" per the core code's 
assumptions. It's also I think the only driver which has a "finalise" 
bodge but *can* still properly support map-before-attach, by virtue of 
having to replay mappings to every new endpoint anyway.

> What do you think about something like this to replace
> iommu_create_device_direct_mappings(), that does enforce things
> properly?

I fail to see how that would make any practical difference. Either the 
mappings can be correctly set up in a pagetable *before* the relevant 
device is attached to that pagetable, or they can't (if the driver 
doesn't have enough information to be able to do so) and we just have to 
really hope nothing blows up in the race window between attaching the 
device to an empty pagetable and having a second try at 
iommu_create_device_direct_mappings(). That's a driver-level issue and 
has nothing to do with pgsize_bitmap either way.

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-22 18:07                   ` Robin Murphy
@ 2023-09-22 23:33                     ` Jason Gunthorpe
  -1 siblings, 0 replies; 39+ messages in thread
From: Jason Gunthorpe @ 2023-09-22 23:33 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jean-Philippe Brucker, Niklas Schnelle, Joerg Roedel,
	Will Deacon, virtualization, iommu, linux-kernel

On Fri, Sep 22, 2023 at 07:07:40PM +0100, Robin Murphy wrote:

> virtio isn't setting ops->pgsize_bitmap for the sake of direct mappings
> either; it sets it once it's discovered any instance, since apparently it's
> assuming that all instances must support identical page sizes, and thus once
> it's seen one it can work "normally" per the core code's assumptions. It's
> also I think the only driver which has a "finalise" bodge but *can* still
> properly support map-before-attach, by virtue of having to replay mappings
> to every new endpoint anyway.

Well it can't quite do that since it doesn't know the geometry - it
all is sort of guessing and hoping it doesn't explode on replay. If it
knows the geometry it wouldn't need finalize...

> > What do you think about something like this to replace
> > iommu_create_device_direct_mappings(), that does enforce things
> > properly?
> 
> I fail to see how that would make any practical difference. Either the
> mappings can be correctly set up in a pagetable *before* the relevant device
> is attached to that pagetable, or they can't (if the driver doesn't have
> enough information to be able to do so) and we just have to really hope
> nothing blows up in the race window between attaching the device to an empty
> pagetable and having a second try at iommu_create_device_direct_mappings().
> That's a driver-level issue and has nothing to do with pgsize_bitmap either
> way.

Except we don't detect this in the core code correctly, that is my
point. We should detect the aperture conflict, not use pgsize_bitmap to
check if it is the first or second try.

Jason

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-22 23:33                     ` Jason Gunthorpe
  (?)
@ 2023-09-25  2:48                     ` Baolu Lu
  2023-09-25 12:40                         ` Jason Gunthorpe
  -1 siblings, 1 reply; 39+ messages in thread
From: Baolu Lu @ 2023-09-25  2:48 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Murphy
  Cc: baolu.lu, Jean-Philippe Brucker, Niklas Schnelle, Joerg Roedel,
	Will Deacon, virtualization, iommu, linux-kernel

On 9/23/23 7:33 AM, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 07:07:40PM +0100, Robin Murphy wrote:
> 
>> virtio isn't setting ops->pgsize_bitmap for the sake of direct mappings
>> either; it sets it once it's discovered any instance, since apparently it's
>> assuming that all instances must support identical page sizes, and thus once
>> it's seen one it can work "normally" per the core code's assumptions. It's
>> also I think the only driver which has a "finalise" bodge but *can* still
>> properly support map-before-attach, by virtue of having to replay mappings
>> to every new endpoint anyway.
> Well it can't quite do that since it doesn't know the geometry - it
> all is sort of guessing and hoping it doesn't explode on replay. If it
> knows the geometry it wouldn't need finalize...

The ultimate solution to this problem seems to be to add a device pointer
to the parameters of ops->domain_alloc. So once the domain is allocated,
it is fully initialized. Attaching this domain to a device that is not
compatible will return -EINVAL, and the caller then has to allocate a new
domain for this device.
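
Something like this in iommu_ops, as a sketch:

	struct iommu_domain *(*domain_alloc_paging)(struct device *dev);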

I feel that this is not an AMD-specific problem; other iommu drivers
will also encounter similar problems sooner or later.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-25  2:48                     ` Baolu Lu
@ 2023-09-25 12:40                         ` Jason Gunthorpe
  0 siblings, 0 replies; 39+ messages in thread
From: Jason Gunthorpe @ 2023-09-25 12:40 UTC (permalink / raw)
  To: Baolu Lu
  Cc: Robin Murphy, Jean-Philippe Brucker, Niklas Schnelle,
	Joerg Roedel, Will Deacon, virtualization, iommu, linux-kernel

On Mon, Sep 25, 2023 at 10:48:21AM +0800, Baolu Lu wrote:
> On 9/23/23 7:33 AM, Jason Gunthorpe wrote:
> > On Fri, Sep 22, 2023 at 07:07:40PM +0100, Robin Murphy wrote:
> > 
> > > virtio isn't setting ops->pgsize_bitmap for the sake of direct mappings
> > > either; it sets it once it's discovered any instance, since apparently it's
> > > assuming that all instances must support identical page sizes, and thus once
> > > it's seen one it can work "normally" per the core code's assumptions. It's
> > > also I think the only driver which has a "finalise" bodge but *can* still
> > > properly support map-before-attach, by virtue of having to replay mappings
> > > to every new endpoint anyway.
> > Well it can't quite do that since it doesn't know the geometry - it
> > all is sort of guessing and hoping it doesn't explode on replay. If it
> > knows the geometry it wouldn't need finalize...
> 
> The ultimate solution to this problem seems to be to add a device pointer
> to the parameters of ops->domain_alloc. So once the domain is allocated,
> it is fully initialized. Attaching this domain to a device that is not
> compatible will return -EINVAL, and the caller then has to allocate a new
> domain for this device.

Sure, it looks like this, and it works already for default domain
setup.

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 8599394c9157d7..1697f5a2370785 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -637,15 +637,10 @@ static void viommu_event_handler(struct virtqueue *vq)
 
 /* IOMMU API */
 
-static struct iommu_domain *viommu_domain_alloc(unsigned type)
+static struct viommu_domain *__viommu_domain_alloc(void)
 {
 	struct viommu_domain *vdomain;
 
-	if (type != IOMMU_DOMAIN_UNMANAGED &&
-	    type != IOMMU_DOMAIN_DMA &&
-	    type != IOMMU_DOMAIN_IDENTITY)
-		return NULL;
-
 	vdomain = kzalloc(sizeof(*vdomain), GFP_KERNEL);
 	if (!vdomain)
 		return NULL;
@@ -654,16 +649,32 @@ static struct iommu_domain *viommu_domain_alloc(unsigned type)
 	spin_lock_init(&vdomain->mappings_lock);
 	vdomain->mappings = RB_ROOT_CACHED;
 
+	return vdomain;
+}
+
+static struct iommu_domain *viommu_domain_alloc(unsigned type)
+{
+	struct viommu_domain *vdomain;
+
+	/*
+	 * viommu sometimes creates an identity domain out of a
+	 * paging domain, that can't use the global static.
+	 * During attach it will fill in an identity page table.
+	 */
+	if (type != IOMMU_DOMAIN_IDENTITY)
+		return NULL;
+	vdomain = __viommu_domain_alloc();
+	if (!vdomain)
+		return NULL;
 	return &vdomain->domain;
 }
 
 static int viommu_domain_finalise(struct viommu_endpoint *vdev,
-				  struct iommu_domain *domain)
+				  struct viommu_domain *vdomain)
 {
 	int ret;
 	unsigned long viommu_page_size;
 	struct viommu_dev *viommu = vdev->viommu;
-	struct viommu_domain *vdomain = to_viommu_domain(domain);
 
 	viommu_page_size = 1UL << __ffs(viommu->pgsize_bitmap);
 	if (viommu_page_size > PAGE_SIZE) {
@@ -680,13 +691,13 @@ static int viommu_domain_finalise(struct viommu_endpoint *vdev,
 
 	vdomain->id		= (unsigned int)ret;
 
-	domain->pgsize_bitmap	= viommu->pgsize_bitmap;
-	domain->geometry	= viommu->geometry;
+	vdomain->domain.pgsize_bitmap = viommu->pgsize_bitmap;
+	vdomain->domain.geometry = viommu->geometry;
 
 	vdomain->map_flags	= viommu->map_flags;
 	vdomain->viommu		= viommu;
 
-	if (domain->type == IOMMU_DOMAIN_IDENTITY) {
+	if (vdomain->domain.type == IOMMU_DOMAIN_IDENTITY) {
 		if (virtio_has_feature(viommu->vdev,
 				       VIRTIO_IOMMU_F_BYPASS_CONFIG)) {
 			vdomain->bypass = true;
@@ -717,6 +728,24 @@ static void viommu_domain_free(struct iommu_domain *domain)
 	kfree(vdomain);
 }
 
+static struct iommu_domain *viommu_domain_alloc_paging(struct device *dev)
+{
+	struct viommu_domain *vdomain;
+	vdomain = __viommu_domain_alloc();
+	if (!vdomain)
+		return NULL;
+
+	if (dev) {
+		struct viommu_endpoint *vdev = dev_iommu_priv_get(dev);
+
+		if (viommu_domain_finalise(vdev, vdomain)) {
+			viommu_domain_free(&vdomain->domain);
+			return NULL;
+		}
+	}
+	return &vdomain->domain;
+}
+
 static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int i;
@@ -732,7 +761,7 @@ static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		 * Properly initialize the domain now that we know which viommu
 		 * owns it.
 		 */
-		ret = viommu_domain_finalise(vdev, domain);
+		ret = viommu_domain_finalise(vdev, vdomain);
 	} else if (vdomain->viommu != vdev->viommu) {
 		ret = -EINVAL;
 	}
@@ -1045,6 +1074,7 @@ static bool viommu_capable(struct device *dev, enum iommu_cap cap)
 static struct iommu_ops viommu_ops = {
 	.capable		= viommu_capable,
 	.domain_alloc		= viommu_domain_alloc,
+	.domain_alloc_paging	= viommu_domain_alloc_paging,
 	.probe_device		= viommu_probe_device,
 	.probe_finalize		= viommu_probe_finalize,
 	.release_device		= viommu_release_device,

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-22 23:33                     ` Jason Gunthorpe
@ 2023-09-25 13:07                       ` Robin Murphy
  -1 siblings, 0 replies; 39+ messages in thread
From: Robin Murphy @ 2023-09-25 13:07 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Jean-Philippe Brucker, Niklas Schnelle, Joerg Roedel,
	Will Deacon, virtualization, iommu, linux-kernel

On 2023-09-23 00:33, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 07:07:40PM +0100, Robin Murphy wrote:
> 
>> virtio isn't setting ops->pgsize_bitmap for the sake of direct mappings
>> either; it sets it once it's discovered any instance, since apparently it's
>> assuming that all instances must support identical page sizes, and thus once
>> it's seen one it can work "normally" per the core code's assumptions. It's
>> also I think the only driver which has a "finalise" bodge but *can* still
>> properly support map-before-attach, by virtue of having to replay mappings
>> to every new endpoint anyway.
> 
> Well it can't quite do that since it doesn't know the geometry - it
> all is sort of guessing and hoping it doesn't explode on replay. If it
> knows the geometry it wouldn't need finalize...

I think it's entirely reasonable to assume that any direct mappings 
specified for a device are valid for that device and its IOMMU. However, 
in the particular case of virtio, it really shouldn't ever have direct 
mappings anyway, since even if the underlying hardware did have any, the 
host can enforce the actual direct-mapping aspect itself, and just 
present them as unusable regions to the guest.

>>> What do you think about something like this to replace
>>> iommu_create_device_direct_mappings(), that does enforce things
>>> properly?
>>
>> I fail to see how that would make any practical difference. Either the
>> mappings can be correctly set up in a pagetable *before* the relevant device
>> is attached to that pagetable, or they can't (if the driver doesn't have
>> enough information to be able to do so) and we just have to really hope
>> nothing blows up in the race window between attaching the device to an empty
>> pagetable and having a second try at iommu_create_device_direct_mappings().
>> That's a driver-level issue and has nothing to do with pgsize_bitmap either
>> way.
> 
> Except we don't detect this in the core code correctly, that is my
> point. We should detect the aperture conflict, not use pgsize_bitmap to
> check if it is the first or second try.

Again, that's irrelevant. It can only be about whether the actual 
->map_pages call succeeds or not. A driver could well know up-front that 
all instances support the same pgsize_bitmap and aperture, and set both 
at ->domain_alloc time, yet still be unable to handle an actual mapping 
without knowing which instance(s) that needs to interact with (e.g. 
omap-iommu).

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-25 13:07                       ` Robin Murphy
@ 2023-09-25 13:29                         ` Jason Gunthorpe
  -1 siblings, 0 replies; 39+ messages in thread
From: Jason Gunthorpe @ 2023-09-25 13:29 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jean-Philippe Brucker, Niklas Schnelle, Joerg Roedel,
	Will Deacon, virtualization, iommu, linux-kernel

On Mon, Sep 25, 2023 at 02:07:50PM +0100, Robin Murphy wrote:
> On 2023-09-23 00:33, Jason Gunthorpe wrote:
> > On Fri, Sep 22, 2023 at 07:07:40PM +0100, Robin Murphy wrote:
> > 
> > > virtio isn't setting ops->pgsize_bitmap for the sake of direct mappings
> > > either; it sets it once it's discovered any instance, since apparently it's
> > > assuming that all instances must support identical page sizes, and thus once
> > > it's seen one it can work "normally" per the core code's assumptions. It's
> > > also I think the only driver which has a "finalise" bodge but *can* still
> > > properly support map-before-attach, by virtue of having to replay mappings
> > > to every new endpoint anyway.
> > 
> > Well it can't quite do that since it doesn't know the geometry - it
> > all is sort of guessing and hoping it doesn't explode on replay. If it
> > knows the geometry it wouldn't need finalize...
> 
> I think it's entirely reasonable to assume that any direct mappings
> specified for a device are valid for that device and its IOMMU. However, in
> the particular case of virtio, it really shouldn't ever have direct mappings
> anyway, since even if the underlying hardware did have any, the host can
> enforce the actual direct-mapping aspect itself, and just present them as
> unusable regions to the guest.

I assume this machinery is for the ARM GIC ITS page....

> Again, that's irrelevant. It can only be about whether the actual
> ->map_pages call succeeds or not. A driver could well know up-front that all
> instances support the same pgsize_bitmap and aperture, and set both at
> ->domain_alloc time, yet still be unable to handle an actual mapping without
> knowing which instance(s) that needs to interact with (e.g. omap-iommu).

I think this is a different issue. The domain is supposed to represent
the actual io pte storage, and the storage is supposed to exist even
when the domain is not attached to anything.

As we said with tegra-gart, it is a bug in the driver if all the
mappings disappear when the last device is detached from the domain.
Driver bugs like this turn into significant issues with vfio/iommufd
as this will result in warn_on's and memory leaking.

So, I disagree that this is something we should be allowing in the API
design. map_pages should succeed (memory allocation failures aside) if
an IOVA within the aperture and valid flags are presented, regardless
of the attachment status. Calling map_pages with an IOVA outside the
aperture should be a caller bug.

It looks like omap is just mis-designed to store the pgd in the omap_iommu,
not the omap_iommu_domain :( pgd is clearly a per-domain object in our
API. And why does every instance need its own copy of the identical
pgd?
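
Roughly, from memory (not the complete structs):

	struct omap_iommu {			/* per IOMMU instance */
		...
		u32 *iopgd;			/* the io pgd lives here today */
	};

	struct omap_iommu_domain {		/* per domain */
		u32 num_iommus;
		struct omap_iommu_device *iommus;
		...				/* but no pgd of its own */
	};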

Jason

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
  2023-09-25 13:29                         ` Jason Gunthorpe
@ 2023-09-25 17:23                           ` Robin Murphy
  -1 siblings, 0 replies; 39+ messages in thread
From: Robin Murphy @ 2023-09-25 17:23 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Jean-Philippe Brucker, Niklas Schnelle, Joerg Roedel,
	Will Deacon, virtualization, iommu, linux-kernel

On 2023-09-25 14:29, Jason Gunthorpe wrote:
> On Mon, Sep 25, 2023 at 02:07:50PM +0100, Robin Murphy wrote:
>> On 2023-09-23 00:33, Jason Gunthorpe wrote:
>>> On Fri, Sep 22, 2023 at 07:07:40PM +0100, Robin Murphy wrote:
>>>
>>>> virtio isn't setting ops->pgsize_bitmap for the sake of direct mappings
>>>> either; it sets it once it's discovered any instance, since apparently it's
>>>> assuming that all instances must support identical page sizes, and thus once
>>>> it's seen one it can work "normally" per the core code's assumptions. It's
>>>> also I think the only driver which has a "finalise" bodge but *can* still
>>>> properly support map-before-attach, by virtue of having to replay mappings
>>>> to every new endpoint anyway.
>>>
>>> Well it can't quite do that since it doesn't know the geometry - it
>>> all is sort of guessing and hoping it doesn't explode on replay. If it
>>> knows the geometry it wouldn't need finalize...
>>
>> I think it's entirely reasonable to assume that any direct mappings
>> specified for a device are valid for that device and its IOMMU. However, in
>> the particular case of virtio, it really shouldn't ever have direct mappings
>> anyway, since even if the underlying hardware did have any, the host can
>> enforce the actual direct-mapping aspect itself, and just present them as
>> unusable regions to the guest.
> 
> I assume this machinery is for the ARM GIC ITS page....
> 
>> Again, that's irrelevant. It can only be about whether the actual
>> ->map_pages call succeeds or not. A driver could well know up-front that all
>> instances support the same pgsize_bitmap and aperture, and set both at
>> ->domain_alloc time, yet still be unable to handle an actual mapping without
>> knowing which instance(s) that needs to interact with (e.g. omap-iommu).
> 
> I think this is a different issue. The domain is supposed to represent
> the actual io pte storage, and the storage is supposed to exist even
> when the domain is not attached to anything.
> 
> As we said with tegra-gart, it is a bug in the driver if all the
> mappings disappear when the last device is detached from the domain.
> Driver bugs like this turn into significant issues with vfio/iommufd
> as this will result in warn_on's and memory leaking.
> 
> So, I disagree that this is something we should be allowing in the API
> design. map_pages should succeed (memory allocation failures aside) if
> an IOVA within the aperture and valid flags are presented, regardless
> of the attachment status. Calling map_pages with an IOVA outside the
> aperture should be a caller bug.
> 
> It looks like omap is just mis-designed to store the pgd in the omap_iommu,
> not the omap_iommu_domain :( pgd is clearly a per-domain object in our
> API. And why does every instance need its own copy of the identical
> pgd?

The point wasn't that it was necessarily a good and justifiable example, 
just that it is one that exists, to demonstrate that in general we have 
no reasonable heuristic for guessing whether ->map_pages is going to 
succeed or not other than by calling it and seeing if it succeeds or 
not. And IMO it's a complete waste of time thinking about ways to make 
such a heuristic possible instead of just getting on with fixing 
iommu_domain_alloc() to make the problem disappear altogether. Once 
Joerg pushes out the current queue I'll rebase and resend v4 of the bus 
ops removal, then hopefully get back to despairing at the hideous pile 
of WIP iommu_domain_alloc() patches I currently have on top of it...

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2023-09-25 17:24 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-18 11:51 [PATCH v2 0/2] iommu/virtio: Enable IOMMU_CAP_DERRED_FLUSH Niklas Schnelle
2023-09-18 11:51 ` [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map Niklas Schnelle
2023-09-18 15:58   ` Jean-Philippe Brucker
2023-09-18 16:37   ` Robin Murphy
2023-09-19  8:00     ` Niklas Schnelle
2023-09-19  8:15     ` Jean-Philippe Brucker
2023-09-19  8:28       ` Robin Murphy
2023-09-22  7:52         ` Jean-Philippe Brucker
2023-09-19 14:46       ` Jason Gunthorpe
2023-09-22  7:57         ` Jean-Philippe Brucker
2023-09-22 12:41           ` Jason Gunthorpe
2023-09-22 13:13             ` Robin Murphy
2023-09-22 16:27               ` Jason Gunthorpe
2023-09-22 18:07                 ` Robin Murphy
2023-09-22 23:33                   ` Jason Gunthorpe
2023-09-25  2:48                     ` Baolu Lu
2023-09-25 12:40                       ` Jason Gunthorpe
2023-09-25 13:07                     ` Robin Murphy
2023-09-25 13:29                       ` Jason Gunthorpe
2023-09-25 17:23                         ` Robin Murphy
2023-09-18 11:51 ` [PATCH v2 2/2] iommu/virtio: Add ops->flush_iotlb_all and enable deferred flush Niklas Schnelle
2023-09-18 15:59   ` Jean-Philippe Brucker
