iommu.lists.linux-foundation.org archive mirror
* [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
@ 2022-05-16 13:06 John Garry via iommu
  2022-05-17  8:38 ` Christoph Hellwig
  2022-05-17 10:40 ` Robin Murphy
  0 siblings, 2 replies; 12+ messages in thread
From: John Garry via iommu @ 2022-05-16 13:06 UTC (permalink / raw)
  To: joro, will, hch, robin.murphy, m.szyprowski
  Cc: liyihang6, linux-kernel, iommu

For streaming DMA mappings involving an IOMMU, where the IOVA length
regularly exceeds the IOVA rcache upper limit (meaning that those IOVAs
are not cached), performance can be reduced.

Add the IOMMU callback for the DMA mapping API dma_max_mapping_size(),
which allows drivers to know the mapping limit and thus bound the IOVA
lengths they request.

This resolves the performance issue originally reported in [0] for a SCSI
HBA driver which was regularly mapping SGLs which required IOVAs in
excess of the IOVA caching limit. In this case the block layer limits the
max sectors per request - as configured in __scsi_init_queue() - which
will limit the total SGL length the driver tries to map and in turn limits
IOVA lengths requested.

[0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/

Signed-off-by: John Garry <john.garry@huawei.com>
---
Sending as an RFC as iommu_dma_max_mapping_size() is a soft limit, and not
a hard limit, which I expect is the semantics of dma_map_ops.max_mapping_size.

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 09f6e1c0f9c0..e2d5205cde37 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1442,6 +1442,21 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
 	return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
 }
 
+static size_t iommu_dma_max_mapping_size(struct device *dev)
+{
+	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+	struct iommu_dma_cookie *cookie;
+
+	if (!domain)
+		return 0;
+
+	cookie = domain->iova_cookie;
+	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
+		return 0;
+
+	return iova_rcache_range();
+}
+
 static const struct dma_map_ops iommu_dma_ops = {
 	.alloc			= iommu_dma_alloc,
 	.free			= iommu_dma_free,
@@ -1462,6 +1477,7 @@ static const struct dma_map_ops iommu_dma_ops = {
 	.map_resource		= iommu_dma_map_resource,
 	.unmap_resource		= iommu_dma_unmap_resource,
 	.get_merge_boundary	= iommu_dma_get_merge_boundary,
+	.max_mapping_size	= iommu_dma_max_mapping_size,
 };
 
 /*
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index db77aa675145..9f00b58d546e 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
 static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
 static void free_iova_rcaches(struct iova_domain *iovad);
 
+unsigned long iova_rcache_range(void)
+{
+	return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
+}
+
 static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
 {
 	struct iova_domain *iovad;
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 320a70e40233..ae3e18d77e6c 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova)
 int iova_cache_get(void);
 void iova_cache_put(void);
 
+unsigned long iova_rcache_range(void);
+
 void free_iova(struct iova_domain *iovad, unsigned long pfn);
 void __free_iova(struct iova_domain *iovad, struct iova *iova);
 struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
@@ -105,6 +107,11 @@ static inline void iova_cache_put(void)
 {
 }
 
+static inline unsigned long iova_rcache_range(void)
+{
+	return 0;
+}
+
 static inline void free_iova(struct iova_domain *iovad, unsigned long pfn)
 {
 }
-- 
2.26.2

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


* Re: [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
  2022-05-16 13:06 [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size() John Garry via iommu
@ 2022-05-17  8:38 ` Christoph Hellwig
  2022-05-17  9:02   ` John Garry via iommu
  2022-05-17 10:40 ` Robin Murphy
  1 sibling, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2022-05-17  8:38 UTC (permalink / raw)
  To: John Garry; +Cc: will, linux-kernel, iommu, robin.murphy, hch, liyihang6

On Mon, May 16, 2022 at 09:06:01PM +0800, John Garry wrote:
> For streaming DMA mappings involving an IOMMU and whose IOVA len regularly
> exceeds the IOVA rcache upper limit (meaning that they are not cached),
> performance can be reduced.
> 
> Add the IOMMU callback for DMA mapping API dma_max_mapping_size(), which
> allows the drivers to know the mapping limit and thus limit the requested 
> IOVA lengths.
> 
> This resolves the performance issue originally reported in [0] for a SCSI
> HBA driver which was regularly mapping SGLs which required IOVAs in
> excess of the IOVA caching limit. In this case the block layer limits the
> max sectors per request - as configured in __scsi_init_queue() - which
> will limit the total SGL length the driver tries to map and in turn limits
> IOVA lengths requested.
> 
> [0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
> Sending as an RFC as iommu_dma_max_mapping_size() is a soft limit, and not
> a hard limit which I expect is the semantics of dma_map_ops.max_mapping_size
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 09f6e1c0f9c0..e2d5205cde37 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1442,6 +1442,21 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
>  	return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
>  }
>  

> +	if (!domain)
> +		return 0;
> +
> +	cookie = domain->iova_cookie;
> +	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
> +		return 0;

Can these conditions even be true here?

> +static inline unsigned long iova_rcache_range(void)
> +{
> +	return 0;
> +}

Given that IOMMU_DMA selects IOMMU_IOVA, there is no need for this stub.

Otherwise this looks sensible to me.

* Re: [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
  2022-05-17  8:38 ` Christoph Hellwig
@ 2022-05-17  9:02   ` John Garry via iommu
  2022-05-17  9:11     ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: John Garry via iommu @ 2022-05-17  9:02 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: will, linux-kernel, iommu, robin.murphy, liyihang6

On 17/05/2022 09:38, Christoph Hellwig wrote:
> On Mon, May 16, 2022 at 09:06:01PM +0800, John Garry wrote:
>> For streaming DMA mappings involving an IOMMU and whose IOVA len regularly
>> exceeds the IOVA rcache upper limit (meaning that they are not cached),
>> performance can be reduced.
>>
>> Add the IOMMU callback for DMA mapping API dma_max_mapping_size(), which
>> allows the drivers to know the mapping limit and thus limit the requested
>> IOVA lengths.
>>
>> This resolves the performance issue originally reported in [0] for a SCSI
>> HBA driver which was regularly mapping SGLs which required IOVAs in
>> excess of the IOVA caching limit. In this case the block layer limits the
>> max sectors per request - as configured in __scsi_init_queue() - which
>> will limit the total SGL length the driver tries to map and in turn limits
>> IOVA lengths requested.

BTW, on a separate topic, I noticed that even with this change my ATA 
devices have max_hw_sectors_kb of 32767, as opposed to 128 for SAS 
devices. It seems that libata-scsi - specifically ata_scsi_dev_config() 
- doesn't honour the shost max_sectors limit. I guess that is not 
intentional.

>>
>> [0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/
>>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> ---
>> Sending as an RFC as iommu_dma_max_mapping_size() is a soft limit, and not
>> a hard limit which I expect is the semantics of dma_map_ops.max_mapping_size
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 09f6e1c0f9c0..e2d5205cde37 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -1442,6 +1442,21 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
>>   	return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
>>   }
>>   
> 
>> +	if (!domain)
>> +		return 0;
>> +
>> +	cookie = domain->iova_cookie;
>> +	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
>> +		return 0;
> 
> Can these conditions even be true here?

I don't think so. Paranoia on my part.

> 
>> +static inline unsigned long iova_rcache_range(void)
>> +{
>> +	return 0;
>> +}
> 
> Given that IOMMU_DMA select IOMMU_IOVA there is no need for this stub.

Hmm, ok. The policy was to stub everything, but I think that has changed.

> 
> Otherwise this looks sensible to me.
> 
> .

Great, thanks.

* Re: [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
  2022-05-17  9:02   ` John Garry via iommu
@ 2022-05-17  9:11     ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2022-05-17  9:11 UTC (permalink / raw)
  To: John Garry
  Cc: will, linux-kernel, iommu, robin.murphy, Christoph Hellwig, liyihang6

On Tue, May 17, 2022 at 10:02:00AM +0100, John Garry wrote:
> BTW, on a separate topic, I noticed that even with this change my ATA 
> devices have max_hw_sectors_kb of 32767, as opposed to 128 for SAS devices. 
> It seems that libata-scsi - specifically ata_scsi_dev_config() - doesn't 
> honour the shost max_sectors limit. I guess that is not intentional.

I don't think it is.  The libsas/libata integration is a bit messy
sometimes..

* Re: [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
  2022-05-16 13:06 [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size() John Garry via iommu
  2022-05-17  8:38 ` Christoph Hellwig
@ 2022-05-17 10:40 ` Robin Murphy
  2022-05-17 11:26   ` John Garry via iommu
  2022-05-18 13:12   ` Christoph Hellwig
  1 sibling, 2 replies; 12+ messages in thread
From: Robin Murphy @ 2022-05-17 10:40 UTC (permalink / raw)
  To: John Garry, joro, will, hch, m.szyprowski; +Cc: liyihang6, iommu, linux-kernel

On 2022-05-16 14:06, John Garry wrote:
> For streaming DMA mappings involving an IOMMU and whose IOVA len regularly
> exceeds the IOVA rcache upper limit (meaning that they are not cached),
> performance can be reduced.
> 
> Add the IOMMU callback for DMA mapping API dma_max_mapping_size(), which
> allows the drivers to know the mapping limit and thus limit the requested
> IOVA lengths.
> 
> This resolves the performance issue originally reported in [0] for a SCSI
> HBA driver which was regularly mapping SGLs which required IOVAs in
> excess of the IOVA caching limit. In this case the block layer limits the
> max sectors per request - as configured in __scsi_init_queue() - which
> will limit the total SGL length the driver tries to map and in turn limits
> IOVA lengths requested.
> 
> [0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
> Sending as an RFC as iommu_dma_max_mapping_size() is a soft limit, and not
> a hard limit which I expect is the semantics of dma_map_ops.max_mapping_size

Indeed, sorry but NAK for this being nonsense. As I've said at least 
once before, if the unnecessary SAC address allocation attempt slows 
down your workload, make it not do that in the first place. If you don't 
like the existing command-line parameter then fine, there are plenty of 
other options, it just needs to be done in a way that doesn't break x86 
systems with dodgy firmware, as my first attempt turned out to.

Furthermore, if a particular SCSI driver doesn't benefit from mappings 
larger than 256KB, then that driver is also free to limit its own 
mapping size. There are other folks out there with use-cases for mapping 
*gigabytes* at once; you don't get to cripple the API and say that 
that's suddenly not allowed just because it happens to make your thing 
go faster, that's absurd.

Thanks,
Robin.

> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 09f6e1c0f9c0..e2d5205cde37 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1442,6 +1442,21 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
>   	return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
>   }
>   
> +static size_t iommu_dma_max_mapping_size(struct device *dev)
> +{
> +	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> +	struct iommu_dma_cookie *cookie;
> +
> +	if (!domain)
> +		return 0;
> +
> +	cookie = domain->iova_cookie;
> +	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
> +		return 0;
> +
> +	return iova_rcache_range();
> +}
> +
>   static const struct dma_map_ops iommu_dma_ops = {
>   	.alloc			= iommu_dma_alloc,
>   	.free			= iommu_dma_free,
> @@ -1462,6 +1477,7 @@ static const struct dma_map_ops iommu_dma_ops = {
>   	.map_resource		= iommu_dma_map_resource,
>   	.unmap_resource		= iommu_dma_unmap_resource,
>   	.get_merge_boundary	= iommu_dma_get_merge_boundary,
> +	.max_mapping_size	= iommu_dma_max_mapping_size,
>   };
>   
>   /*
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index db77aa675145..9f00b58d546e 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
>   static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
>   static void free_iova_rcaches(struct iova_domain *iovad);
>   
> +unsigned long iova_rcache_range(void)
> +{
> +	return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
> +}
> +
>   static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
>   {
>   	struct iova_domain *iovad;
> diff --git a/include/linux/iova.h b/include/linux/iova.h
> index 320a70e40233..ae3e18d77e6c 100644
> --- a/include/linux/iova.h
> +++ b/include/linux/iova.h
> @@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova)
>   int iova_cache_get(void);
>   void iova_cache_put(void);
>   
> +unsigned long iova_rcache_range(void);
> +
>   void free_iova(struct iova_domain *iovad, unsigned long pfn);
>   void __free_iova(struct iova_domain *iovad, struct iova *iova);
>   struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
> @@ -105,6 +107,11 @@ static inline void iova_cache_put(void)
>   {
>   }
>   
> +static inline unsigned long iova_rcache_range(void)
> +{
> +	return 0;
> +}
> +
>   static inline void free_iova(struct iova_domain *iovad, unsigned long pfn)
>   {
>   }

* Re: [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
  2022-05-17 10:40 ` Robin Murphy
@ 2022-05-17 11:26   ` John Garry via iommu
  2022-05-17 12:02     ` Robin Murphy
  2022-05-18 13:12   ` Christoph Hellwig
  1 sibling, 1 reply; 12+ messages in thread
From: John Garry via iommu @ 2022-05-17 11:26 UTC (permalink / raw)
  To: Robin Murphy, joro, will, hch, m.szyprowski
  Cc: liyihang6, iommu, linux-kernel

On 17/05/2022 11:40, Robin Murphy wrote:
> On 2022-05-16 14:06, John Garry wrote:
>> For streaming DMA mappings involving an IOMMU and whose IOVA len 
>> regularly
>> exceeds the IOVA rcache upper limit (meaning that they are not cached),
>> performance can be reduced.
>>
>> Add the IOMMU callback for DMA mapping API dma_max_mapping_size(), which
>> allows the drivers to know the mapping limit and thus limit the requested
>> IOVA lengths.
>>
>> This resolves the performance issue originally reported in [0] for a SCSI
>> HBA driver which was regularly mapping SGLs which required IOVAs in
>> excess of the IOVA caching limit. In this case the block layer limits the
>> max sectors per request - as configured in __scsi_init_queue() - which
>> will limit the total SGL length the driver tries to map and in turn 
>> limits
>> IOVA lengths requested.
>>
>> [0] 
>> https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/ 
>>
>>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> ---
>> Sending as an RFC as iommu_dma_max_mapping_size() is a soft limit, and 
>> not
>> a hard limit which I expect is the semantics of 
>> dma_map_ops.max_mapping_size
> 
> Indeed, sorry but NAK for this being nonsense. As I've said at least 
> once before, if the unnecessary SAC address allocation attempt slows 
> down your workload, make it not do that in the first place. If you don't 
> like the existing command-line parameter then fine, there are plenty of
> other options, it just needs to be done in a way that doesn't break x86 
> systems with dodgy firmware, as my first attempt turned out to.

Sorry, but I am not interested in this. It was discussed in Jan last 
year without any viable solution.

Anyway we still have the long-term IOVA aging issue, and requesting 
non-cached IOVAs is involved in that. So I would rather keep the SCSI 
driver to requesting cached IOVAs all the time.

I did try to do it the other way around - configuring the IOVA caching 
range according to the drivers requirement but that got nowhere.

> 
> Furthermore, if a particular SCSI driver doesn't benefit from mappings 
> larger than 256KB, then that driver is also free to limit its own 
> mapping size. There are other folks out there with use-cases for mapping 
> *gigabytes* at once; you don't get to cripple the API and say that 
> that's suddenly not allowed just because it happens to make your thing 
> go faster, that's absurd.

I'd say less catastrophically slow, not faster.

So how to inform the SCSI driver of this caching limit then so that it 
may limit the SGL length?

Thanks,
John

* Re: [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
  2022-05-17 11:26   ` John Garry via iommu
@ 2022-05-17 12:02     ` Robin Murphy
  2022-05-17 13:50       ` John Garry via iommu
  2022-05-18 13:13       ` Christoph Hellwig
  0 siblings, 2 replies; 12+ messages in thread
From: Robin Murphy @ 2022-05-17 12:02 UTC (permalink / raw)
  To: John Garry, joro, will, hch, m.szyprowski; +Cc: liyihang6, iommu, linux-kernel

On 2022-05-17 12:26, John Garry wrote:
> On 17/05/2022 11:40, Robin Murphy wrote:
>> On 2022-05-16 14:06, John Garry wrote:
>>> For streaming DMA mappings involving an IOMMU and whose IOVA len 
>>> regularly
>>> exceeds the IOVA rcache upper limit (meaning that they are not cached),
>>> performance can be reduced.
>>>
>>> Add the IOMMU callback for DMA mapping API dma_max_mapping_size(), which
>>> allows the drivers to know the mapping limit and thus limit the 
>>> requested
>>> IOVA lengths.
>>>
>>> This resolves the performance issue originally reported in [0] for a 
>>> SCSI
>>> HBA driver which was regularly mapping SGLs which required IOVAs in
>>> excess of the IOVA caching limit. In this case the block layer limits 
>>> the
>>> max sectors per request - as configured in __scsi_init_queue() - which
>>> will limit the total SGL length the driver tries to map and in turn 
>>> limits
>>> IOVA lengths requested.
>>>
>>> [0] 
>>> https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/ 
>>>
>>>
>>> Signed-off-by: John Garry <john.garry@huawei.com>
>>> ---
>>> Sending as an RFC as iommu_dma_max_mapping_size() is a soft limit, 
>>> and not
>>> a hard limit which I expect is the semantics of 
>>> dma_map_ops.max_mapping_size
>>
>> Indeed, sorry but NAK for this being nonsense. As I've said at least 
>> once before, if the unnecessary SAC address allocation attempt slows 
>> down your workload, make it not do that in the first place. If you 
>> don't like the existing command-line parameter then fine, there are 
>> plenty of
>> other options, it just needs to be done in a way that doesn't break 
>> x86 systems with dodgy firmware, as my first attempt turned out to.
> 
> Sorry, but I am not interested in this. It was discussed in Jan last 
> year without any viable solution.

Er, OK, if you're not interested in solving that problem I don't see why 
you'd bring it up, but hey ho. *I* still think it's important, so I 
guess I'll revive my old patch with a CONFIG_X86 bodge and have another 
go at some point.

> Anyway we still have the long-term IOVA aging issue, and requesting 
> non-cached IOVAs is involved in that. So I would rather keep the SCSI 
> driver to requesting cached IOVAs all the time.
> 
> I did try to do it the other way around - configuring the IOVA caching 
> range according to the drivers requirement but that got nowhere.

FWIW I thought that all looked OK, it just kept getting drowned out by 
more critical things in my inbox so I hoped someone else might comment. 
If it turns out that I've become the de-facto IOVA maintainer in 
everyone else's minds now and they're all waiting for my word then fair 
enough, I just need to know and reset my expectations accordingly. Joerg?

>> Furthermore, if a particular SCSI driver doesn't benefit from mappings 
>> larger than 256KB, then that driver is also free to limit its own 
>> mapping size. There are other folks out there with use-cases for 
>> mapping *gigabytes* at once; you don't get to cripple the API and say 
>> that that's suddenly not allowed just because it happens to make your 
>> thing go faster, that's absurd.
> 
> I'd say less catastrophically slow, not faster.
> 
> So how to inform the SCSI driver of this caching limit then so that it 
> may limit the SGL length?

Driver-specific mechanism; block-layer-specific mechanism; redefine this 
whole API to something like dma_opt_mapping_size(), as a limit above 
which mappings might become less efficient or start to fail (callback to 
my thoughts on [1] as well, I suppose); many options. Just not imposing 
a ridiculously low *maximum* on everyone wherein mapping calls "should 
not be larger than the returned value" when that's clearly bollocks.

Cheers,
Robin.

[1] 
https://lore.kernel.org/linux-iommu/20220510142109.777738-1-ltykernel@gmail.com/

* Re: [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
  2022-05-17 12:02     ` Robin Murphy
@ 2022-05-17 13:50       ` John Garry via iommu
  2022-05-18 13:13       ` Christoph Hellwig
  1 sibling, 0 replies; 12+ messages in thread
From: John Garry via iommu @ 2022-05-17 13:50 UTC (permalink / raw)
  To: Robin Murphy, joro, will, hch, m.szyprowski
  Cc: liyihang6, iommu, linux-kernel

On 17/05/2022 13:02, Robin Murphy wrote:
>>>
>>> Indeed, sorry but NAK for this being nonsense. As I've said at least 
>>> once before, if the unnecessary SAC address allocation attempt slows 
>>> down your workload, make it not do that in the first place. If you 
>>> don't like the existing command-line parameter then fine, there are 
>>> plenty of
>>> other options, it just needs to be done in a way that doesn't break 
>>> x86 systems with dodgy firmware, as my first attempt turned out to.
>>
>> Sorry, but I am not interested in this. It was discussed in Jan last 
>> year without any viable solution.
> 
> Er, OK, if you're not interested in solving that problem I don't see why 
> you'd bring it up, but hey ho. *I* still think it's important, so I 
> guess I'll revive my old patch with a CONFIG_X86 bodge and have another 
> go at some point.

Let me rephrase: I would be happy to help fix that problem if we really 
can get it fixed; however, for my problem it's better first to try to get 
the SCSI driver to stop requesting uncached IOVAs.

> 
>> Anyway we still have the long-term IOVA aging issue, and requesting 
>> non-cached IOVAs is involved in that. So I would rather keep the SCSI 
>> driver to requesting cached IOVAs all the time.
>>
>> I did try to do it the other way around - configuring the IOVA caching 
>> range according to the drivers requirement but that got nowhere.

Note that this is still not a final solution as it's not always viable 
to ask a user to unbind + bind the driver.

> 
> FWIW I thought that all looked OK, it just kept getting drowned out by 
> more critical things in my inbox so I hoped someone else might comment. 
> If it turns out that I've become the de-facto IOVA maintainer in 
> everyone else's minds now and they're all waiting for my word then fair 
> enough, I just need to know and reset my expectations accordingly. Joerg?

It would be great to see an improvement here...

> 
>>> Furthermore, if a particular SCSI driver doesn't benefit from 
>>> mappings larger than 256KB, then that driver is also free to limit 
>>> its own mapping size. There are other folks out there with use-cases 
>>> for mapping *gigabytes* at once; you don't get to cripple the API and 
>>> say that that's suddenly not allowed just because it happens to make 
>>> your thing go faster, that's absurd.
>>
>> I'd say less catastrophically slow, not faster.
>>
>> So how to inform the SCSI driver of this caching limit then so that it 
>> may limit the SGL length?
> 
> Driver-specific mechanism; block-layer-specific mechanism; redefine this 
> whole API to something like dma_opt_mapping_size(), as a limit above 
> which mappings might become less efficient or start to fail (callback to 
> my thoughts on [1] as well, I suppose); many options.

ok, fine.

> Just not imposing 
> a ridiculously low *maximum* on everyone wherein mapping calls "should 
> not be larger than the returned value" when that's clearly bollocks.

I agree that this change is in violation as the documentation clearly 
implies a hard limit.

However, FWIW, from looking at the users of dma_max_mapping_size(), they 
seem to use it in a similar way to the SCSI/block layer core, i.e. to 
limit the max total SGL length per IO command, and not as a method to 
learn the max DMA-consistent mapping size for ring buffers, etc.

Anyway, I'll look at dma_opt_mapping_size(), but I am not sure how keen 
Christoph will be on this... it is strange to introduce that API due to 
a peculiarity of the IOVA allocator.

> 
> 
> [1] 
> https://lore.kernel.org/linux-iommu/20220510142109.777738-1-ltykernel@gmail.com/

Thanks,
John

* Re: [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
  2022-05-17 10:40 ` Robin Murphy
  2022-05-17 11:26   ` John Garry via iommu
@ 2022-05-18 13:12   ` Christoph Hellwig
  2022-05-18 13:45     ` Robin Murphy
  1 sibling, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2022-05-18 13:12 UTC (permalink / raw)
  To: Robin Murphy; +Cc: liyihang6, linux-kernel, iommu, will, hch

On Tue, May 17, 2022 at 11:40:52AM +0100, Robin Murphy wrote:
> Indeed, sorry but NAK for this being nonsense. As I've said at least once 
> before, if the unnecessary SAC address allocation attempt slows down your 
> workload, make it not do that in the first place. If you don't like the 
> existing command-line parameter then fine, there are plenty of other 
> options, it just needs to be done in a way that doesn't break x86 systems 
> with dodgy firmware, as my first attempt turned out to.

What broke x86?

* Re: [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
  2022-05-17 12:02     ` Robin Murphy
  2022-05-17 13:50       ` John Garry via iommu
@ 2022-05-18 13:13       ` Christoph Hellwig
  2022-05-20  8:51         ` Joerg Roedel
  1 sibling, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2022-05-18 13:13 UTC (permalink / raw)
  To: Robin Murphy; +Cc: liyihang6, linux-kernel, iommu, will, hch

On Tue, May 17, 2022 at 01:02:00PM +0100, Robin Murphy wrote:
>> So how to inform the SCSI driver of this caching limit then so that it may 
>> limit the SGL length?
>
> Driver-specific mechanism; block-layer-specific mechanism; redefine this 
> whole API to something like dma_opt_mapping_size(), as a limit above which 
> mappings might become less efficient or start to fail (callback to my 
> thoughts on [1] as well, I suppose); many options. Just not imposing a 
> ridiculously low *maximum* on everyone wherein mapping calls "should not be 
> larger than the returned value" when that's clearly bollocks.

Well, for swiotlb it is a hard limit.  So if we want to go down that
route we need two APIs, one for the optimal size and one for the
hard limit.

* Re: [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
  2022-05-18 13:12   ` Christoph Hellwig
@ 2022-05-18 13:45     ` Robin Murphy
  0 siblings, 0 replies; 12+ messages in thread
From: Robin Murphy @ 2022-05-18 13:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: liyihang6, linux-kernel, iommu, will

On 2022-05-18 14:12, Christoph Hellwig wrote:
> On Tue, May 17, 2022 at 11:40:52AM +0100, Robin Murphy wrote:
>> Indeed, sorry but NAK for this being nonsense. As I've said at least once
>> before, if the unnecessary SAC address allocation attempt slows down your
>> workload, make it not do that in the first place. If you don't like the
>> existing command-line parameter then fine, there are plenty of other
>> options, it just needs to be done in a way that doesn't break x86 systems
>> with dodgy firmware, as my first attempt turned out to.
> 
> What broke x86?

See the thread at [1] (and in case of curiosity the other IVRS patches I 
refer to therein were at [2]). Basically, undescribed limitations lead 
to DMA address truncation once iommu-dma starts allocating from what it 
thinks is the full usable IOVA range. Your typical desktop PC is 
unlikely to have enough concurrent DMA-mapped memory to overflow the 
32-bit IOVA space naturally, so this has probably been hiding an untold 
multitude of sins over the years.

Robin.

[1] 
https://lore.kernel.org/linux-iommu/e583fc6dd1fb4ffc90310ff4372ee776f9cc7a3c.1594207679.git.robin.murphy@arm.com/
[2] 
https://lore.kernel.org/linux-iommu/20200605145655.13639-1-sebott@amazon.de/

* Re: [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size()
  2022-05-18 13:13       ` Christoph Hellwig
@ 2022-05-20  8:51         ` Joerg Roedel
  0 siblings, 0 replies; 12+ messages in thread
From: Joerg Roedel @ 2022-05-20  8:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Robin Murphy, linux-kernel, iommu, will, liyihang6

On Wed, May 18, 2022 at 03:13:53PM +0200, Christoph Hellwig wrote:
> On Tue, May 17, 2022 at 01:02:00PM +0100, Robin Murphy wrote:
> >> So how to inform the SCSI driver of this caching limit then so that it may 
> >> limit the SGL length?
> >
> > Driver-specific mechanism; block-layer-specific mechanism; redefine this 
> > whole API to something like dma_opt_mapping_size(), as a limit above which 
> > mappings might become less efficient or start to fail (callback to my 
> > thoughts on [1] as well, I suppose); many options. Just not imposing a 
> > ridiculously low *maximum* on everyone wherein mapping calls "should not be 
> > larger than the returned value" when that's clearly bollocks.
> 
> Well, for swiotlb it is a hard limit.  So if we want to go down that
> route we need two APIs, one for the optimal size and one for the
> hard limit.

I agree with Robin, and if it really helps some drivers I am all for
doing a dma_opt_mapping_size() instead. Limiting DMA mapping sizes to
make drivers perform better gets a clear NAK from my side.

Regards,

	Joerg


Thread overview: 12+ messages
2022-05-16 13:06 [RFC PATCH] dma-iommu: Add iommu_dma_max_mapping_size() John Garry via iommu
2022-05-17  8:38 ` Christoph Hellwig
2022-05-17  9:02   ` John Garry via iommu
2022-05-17  9:11     ` Christoph Hellwig
2022-05-17 10:40 ` Robin Murphy
2022-05-17 11:26   ` John Garry via iommu
2022-05-17 12:02     ` Robin Murphy
2022-05-17 13:50       ` John Garry via iommu
2022-05-18 13:13       ` Christoph Hellwig
2022-05-20  8:51         ` Joerg Roedel
2022-05-18 13:12   ` Christoph Hellwig
2022-05-18 13:45     ` Robin Murphy
