linux-scsi.vger.kernel.org archive mirror
* [PATCH v3 0/4] DMA mapping changes for SCSI core
@ 2022-06-06  9:30 John Garry
  2022-06-06  9:30 ` [PATCH v3 1/4] dma-mapping: Add dma_opt_mapping_size() John Garry
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: John Garry @ 2022-06-06  9:30 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen, John Garry

As reported in [0], DMA mappings whose size exceeds the IOMMU IOVA caching
limit may see a big performance hit.

This series introduces a new DMA mapping API, dma_opt_mapping_size(), so
that drivers may know this limit when performance is a factor in the
mapping.

Robin didn't like using dma_max_mapping_size() for this [1].

The SCSI core code is modified to use this limit.

I also added a patch for libata-scsi as it does not currently honour the
shost max_sectors limit.

Note: Christoph has previously kindly offered to take this series via the
      dma-mapping tree, so I think that we just need an ack from the
      IOMMU guys now. 

[0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/
[1] https://lore.kernel.org/linux-iommu/f5b78c9c-312e-70ab-ecbb-f14623a4b6e3@arm.com/

Changes since v2:
- Rebase on v5.19-rc1
- Add Damien's tag to 2/4 (thanks)

Changes since v1:
- Relocate scsi_add_host_with_dma() dma_dev check (Reported by Dan)
- Add tags from Damien and Martin (thanks)
  - note: I only added Martin's tag to the SCSI patch

John Garry (4):
  dma-mapping: Add dma_opt_mapping_size()
  dma-iommu: Add iommu_dma_opt_mapping_size()
  scsi: core: Cap shost max_sectors according to DMA optimum mapping
    limits
  libata-scsi: Cap ata_device->max_sectors according to
    shost->max_sectors

 Documentation/core-api/dma-api.rst |  9 +++++++++
 drivers/ata/libata-scsi.c          |  1 +
 drivers/iommu/dma-iommu.c          |  6 ++++++
 drivers/iommu/iova.c               |  5 +++++
 drivers/scsi/hosts.c               |  5 +++++
 drivers/scsi/scsi_lib.c            |  4 ----
 include/linux/dma-map-ops.h        |  1 +
 include/linux/dma-mapping.h        |  5 +++++
 include/linux/iova.h               |  2 ++
 kernel/dma/mapping.c               | 12 ++++++++++++
 10 files changed, 46 insertions(+), 4 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 1/4] dma-mapping: Add dma_opt_mapping_size()
  2022-06-06  9:30 [PATCH v3 0/4] DMA mapping changes for SCSI core John Garry
@ 2022-06-06  9:30 ` John Garry
  2022-06-08 17:27   ` Bart Van Assche
  2022-06-06  9:30 ` [PATCH v3 2/4] dma-iommu: Add iommu_dma_opt_mapping_size() John Garry
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2022-06-06  9:30 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen, John Garry

Streaming DMA mappings involving an IOMMU may be much slower for larger
total mapping sizes. This is because every IOMMU DMA mapping requires an
IOVA to be allocated and freed. IOVA sizes above a certain limit are not
cached, which can have a big impact on DMA mapping performance.

Provide an API for device drivers to know this "optimal" limit, such that
they may try to produce mappings which don't exceed it.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 Documentation/core-api/dma-api.rst |  9 +++++++++
 include/linux/dma-map-ops.h        |  1 +
 include/linux/dma-mapping.h        |  5 +++++
 kernel/dma/mapping.c               | 12 ++++++++++++
 4 files changed, 27 insertions(+)

diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
index 6d6d0edd2d27..b3cd9763d28b 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -204,6 +204,15 @@ Returns the maximum size of a mapping for the device. The size parameter
 of the mapping functions like dma_map_single(), dma_map_page() and
 others should not be larger than the returned value.
 
+::
+
+	size_t
+	dma_opt_mapping_size(struct device *dev);
+
+Returns the maximum optimal size of a mapping for the device. Mapping large
+buffers may take longer so device drivers are advised to limit total DMA
+streaming mappings length to the returned value.
+
 ::
 
 	bool
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 0d5b06b3a4a6..98ceba6fa848 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -69,6 +69,7 @@ struct dma_map_ops {
 	int (*dma_supported)(struct device *dev, u64 mask);
 	u64 (*get_required_mask)(struct device *dev);
 	size_t (*max_mapping_size)(struct device *dev);
+	size_t (*opt_mapping_size)(void);
 	unsigned long (*get_merge_boundary)(struct device *dev);
 };
 
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index dca2b1355bb1..fe3849434b2a 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -144,6 +144,7 @@ int dma_set_mask(struct device *dev, u64 mask);
 int dma_set_coherent_mask(struct device *dev, u64 mask);
 u64 dma_get_required_mask(struct device *dev);
 size_t dma_max_mapping_size(struct device *dev);
+size_t dma_opt_mapping_size(struct device *dev);
 bool dma_need_sync(struct device *dev, dma_addr_t dma_addr);
 unsigned long dma_get_merge_boundary(struct device *dev);
 struct sg_table *dma_alloc_noncontiguous(struct device *dev, size_t size,
@@ -266,6 +267,10 @@ static inline size_t dma_max_mapping_size(struct device *dev)
 {
 	return 0;
 }
+static inline size_t dma_opt_mapping_size(struct device *dev)
+{
+	return 0;
+}
 static inline bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
 {
 	return false;
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index db7244291b74..1bfe11b1edb6 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -773,6 +773,18 @@ size_t dma_max_mapping_size(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dma_max_mapping_size);
 
+size_t dma_opt_mapping_size(struct device *dev)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+	size_t size = SIZE_MAX;
+
+	if (ops && ops->opt_mapping_size)
+		size = ops->opt_mapping_size();
+
+	return min(dma_max_mapping_size(dev), size);
+}
+EXPORT_SYMBOL_GPL(dma_opt_mapping_size);
+
 bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v3 2/4] dma-iommu: Add iommu_dma_opt_mapping_size()
  2022-06-06  9:30 [PATCH v3 0/4] DMA mapping changes for SCSI core John Garry
  2022-06-06  9:30 ` [PATCH v3 1/4] dma-mapping: Add dma_opt_mapping_size() John Garry
@ 2022-06-06  9:30 ` John Garry
  2022-06-08 17:26   ` Bart Van Assche
  2022-06-14 13:12   ` John Garry
  2022-06-06  9:30 ` [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits John Garry
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 21+ messages in thread
From: John Garry @ 2022-06-06  9:30 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen, John Garry

Add the IOMMU callback for DMA mapping API dma_opt_mapping_size(), which
allows the drivers to know the optimal mapping limit and thus limit the
requested IOVA lengths.

This value is based on the IOVA rcache range limit, as IOVAs allocated
above this limit must always be newly allocated, which may be quite slow.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/iommu/dma-iommu.c | 6 ++++++
 drivers/iommu/iova.c      | 5 +++++
 include/linux/iova.h      | 2 ++
 3 files changed, 13 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f90251572a5d..9e1586447ee8 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1459,6 +1459,11 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
 	return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
 }
 
+static size_t iommu_dma_opt_mapping_size(void)
+{
+	return iova_rcache_range();
+}
+
 static const struct dma_map_ops iommu_dma_ops = {
 	.alloc			= iommu_dma_alloc,
 	.free			= iommu_dma_free,
@@ -1479,6 +1484,7 @@ static const struct dma_map_ops iommu_dma_ops = {
 	.map_resource		= iommu_dma_map_resource,
 	.unmap_resource		= iommu_dma_unmap_resource,
 	.get_merge_boundary	= iommu_dma_get_merge_boundary,
+	.opt_mapping_size	= iommu_dma_opt_mapping_size,
 };
 
 /*
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index db77aa675145..9f00b58d546e 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
 static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
 static void free_iova_rcaches(struct iova_domain *iovad);
 
+unsigned long iova_rcache_range(void)
+{
+	return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
+}
+
 static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
 {
 	struct iova_domain *iovad;
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 320a70e40233..c6ba6d95d79c 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova)
 int iova_cache_get(void);
 void iova_cache_put(void);
 
+unsigned long iova_rcache_range(void);
+
 void free_iova(struct iova_domain *iovad, unsigned long pfn);
 void __free_iova(struct iova_domain *iovad, struct iova *iova);
 struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
  2022-06-06  9:30 [PATCH v3 0/4] DMA mapping changes for SCSI core John Garry
  2022-06-06  9:30 ` [PATCH v3 1/4] dma-mapping: Add dma_opt_mapping_size() John Garry
  2022-06-06  9:30 ` [PATCH v3 2/4] dma-iommu: Add iommu_dma_opt_mapping_size() John Garry
@ 2022-06-06  9:30 ` John Garry
  2022-06-08 17:33   ` Bart Van Assche
  2022-06-06  9:30 ` [PATCH v3 4/4] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors John Garry
  2022-06-07 22:43 ` [PATCH v3 0/4] DMA mapping changes for SCSI core Bart Van Assche
  4 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2022-06-06  9:30 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen, John Garry

Streaming DMA mappings may be considerably slower when they go through
an IOMMU and the total mapping length is somewhat long. This is because
the IOMMU IOVA code allocates and frees an IOVA for each mapping, which
may affect performance.

For performance reasons set the request_queue max_sectors from
dma_opt_mapping_size(), which knows this mapping limit.

In addition, the shost->max_sectors is repeatedly set for each sdev in
__scsi_init_queue(). This is unnecessary, so set once when adding the
host.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/hosts.c    | 5 +++++
 drivers/scsi/scsi_lib.c | 4 ----
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 8352f90d997d..ea1a207634d1 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -236,6 +236,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 
 	shost->dma_dev = dma_dev;
 
+	if (dma_dev->dma_mask) {
+		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
+				dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
+	}
+
 	error = scsi_mq_setup_tags(shost);
 	if (error)
 		goto fail;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6ffc9e4258a8..6ce8acea322a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1884,10 +1884,6 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 		blk_queue_max_integrity_segments(q, shost->sg_prot_tablesize);
 	}
 
-	if (dev->dma_mask) {
-		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
-				dma_max_mapping_size(dev) >> SECTOR_SHIFT);
-	}
 	blk_queue_max_hw_sectors(q, shost->max_sectors);
 	blk_queue_segment_boundary(q, shost->dma_boundary);
 	dma_set_seg_boundary(dev, shost->dma_boundary);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v3 4/4] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors
  2022-06-06  9:30 [PATCH v3 0/4] DMA mapping changes for SCSI core John Garry
                   ` (2 preceding siblings ...)
  2022-06-06  9:30 ` [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits John Garry
@ 2022-06-06  9:30 ` John Garry
  2022-06-07 22:43 ` [PATCH v3 0/4] DMA mapping changes for SCSI core Bart Van Assche
  4 siblings, 0 replies; 21+ messages in thread
From: John Garry @ 2022-06-06  9:30 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen, John Garry

ATA devices (struct ata_device) have a max_sectors field which is
configured internally in libata. This is then used to (re)configure the
associated sdev request queue max_sectors value, overriding what was
earlier set in __scsi_init_queue(). In __scsi_init_queue() the
max_sectors value is set according to shost limits, which include host
DMA mapping limits.

Cap the ata_device max_sectors according to shost->max_sectors to respect
this shost limit.

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/ata/libata-scsi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 42cecf95a4e5..8b4b318f378d 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1060,6 +1060,7 @@ int ata_scsi_dev_config(struct scsi_device *sdev, struct ata_device *dev)
 		dev->flags |= ATA_DFLAG_NO_UNLOAD;
 
 	/* configure max sectors */
+	dev->max_sectors = min(dev->max_sectors, sdev->host->max_sectors);
 	blk_queue_max_hw_sectors(q, dev->max_sectors);
 
 	if (dev->class == ATA_DEV_ATAPI) {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 0/4] DMA mapping changes for SCSI core
  2022-06-06  9:30 [PATCH v3 0/4] DMA mapping changes for SCSI core John Garry
                   ` (3 preceding siblings ...)
  2022-06-06  9:30 ` [PATCH v3 4/4] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors John Garry
@ 2022-06-07 22:43 ` Bart Van Assche
  2022-06-08 10:14   ` John Garry
  4 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2022-06-07 22:43 UTC (permalink / raw)
  To: John Garry, damien.lemoal, joro, will, jejb, martin.petersen,
	hch, m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen

On 6/6/22 02:30, John Garry wrote:
> As reported in [0], DMA mappings whose size exceeds the IOMMU IOVA caching
> limit may see a big performance hit.
> 
> This series introduces a new DMA mapping API, dma_opt_mapping_size(), so
> that drivers may know this limit when performance is a factor in the
> mapping.
> 
> Robin didn't like using dma_max_mapping_size() for this [1].
> 
> The SCSI core code is modified to use this limit.
> 
> I also added a patch for libata-scsi as it does not currently honour the
> shost max_sectors limit.
> 
> Note: Christoph has previously kindly offered to take this series via the
>        dma-mapping tree, so I think that we just need an ack from the
>        IOMMU guys now.
> 
> [0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/
> [1] https://lore.kernel.org/linux-iommu/f5b78c9c-312e-70ab-ecbb-f14623a4b6e3@arm.com/

Regarding [0], that patch reverts commit 4e89dce72521 ("iommu/iova: 
Retry from last rb tree node if iova search fails"). Reading the 
description of that patch, it seems to me that the iova allocator can be 
improved. Shouldn't the iova allocator be improved such that we don't 
need this patch series? There are algorithms that handle fragmentation 
much better than the current iova allocator algorithm, e.g. the 
https://en.wikipedia.org/wiki/Buddy_memory_allocation algorithm.
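
For illustration only: the defining property of a buddy allocator is
that a block's buddy is found by flipping a single address bit, which
makes splitting and re-merging freed neighbours cheap. A minimal
standalone sketch of that property (not a proposal for the actual iova
code) follows:

#include <stdio.h>

int main(void)
{
	unsigned long addr = 0x40000;	/* block start, size is 1 << order */
	unsigned int order = 17;	/* 128 KiB blocks */
	unsigned long buddy = addr ^ (1UL << order);

	/* 0x40000 and 0x60000 are buddies: once both are free they can be
	 * merged back into a single 256 KiB block at 0x40000. */
	printf("buddy of 0x%lx at order %u is 0x%lx\n", addr, order, buddy);
	return 0;
}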

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 0/4] DMA mapping changes for SCSI core
  2022-06-07 22:43 ` [PATCH v3 0/4] DMA mapping changes for SCSI core Bart Van Assche
@ 2022-06-08 10:14   ` John Garry
  0 siblings, 0 replies; 21+ messages in thread
From: John Garry @ 2022-06-08 10:14 UTC (permalink / raw)
  To: Bart Van Assche, damien.lemoal, joro, will, jejb,
	martin.petersen, hch, m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen

On 07/06/2022 23:43, Bart Van Assche wrote:
> On 6/6/22 02:30, John Garry wrote:
>> As reported in [0], DMA mappings whose size exceeds the IOMMU IOVA 
>> caching
>> limit may see a big performance hit.
>>
>> This series introduces a new DMA mapping API, dma_opt_mapping_size(), so
>> that drivers may know this limit when performance is a factor in the
>> mapping.
>>
>> Robin didn't like using dma_max_mapping_size() for this [1].
>>
>> The SCSI core code is modified to use this limit.
>>
>> I also added a patch for libata-scsi as it does not currently honour the
>> shost max_sectors limit.
>>
>> Note: Christoph has previously kindly offered to take this series via the
>>        dma-mapping tree, so I think that we just need an ack from the
>>        IOMMU guys now.
>>
>> [0] 
>> https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/ 
>>
>> [1] 
>> https://lore.kernel.org/linux-iommu/f5b78c9c-312e-70ab-ecbb-f14623a4b6e3@arm.com/ 
>>
> 
> Regarding [0], that patch reverts commit 4e89dce72521 ("iommu/iova: 
> Retry from last rb tree node if iova search fails"). Reading the 
> description of that patch, it seems to me that the iova allocator can be 
> improved. Shouldn't the iova allocator be improved such that we don't 
> need this patch series? There are algorithms that handle fragmentation 
> much better than the current iova allocator algorithm, e.g. the 
> https://en.wikipedia.org/wiki/Buddy_memory_allocation algorithm.

Regardless of whether the IOVA allocator can be improved - which it
probably can be - this series is still useful. That is due to the IOVA
rcache, which is a cache of pre-allocated IOVAs that can be reused
quickly when creating a DMA mapping. The rcache contains IOVAs up to a
certain fixed size. In this series we limit the DMA mapping length to
the rcache size upper limit so that we always bypass the allocator
(when we have a cached IOVA available) - see alloc_iova_fast().
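
To put rough numbers on that cap, here is a minimal standalone sketch of
the arithmetic (the 2048-sector starting limit is just an example, and
128K assumes a 4K page size with IOVA_RANGE_CACHE_MAX_SIZE == 6):

#include <stdio.h>

#define SECTOR_SHIFT	9

int main(void)
{
	unsigned int max_sectors = 2048;	/* example shost limit: 1 MiB */
	unsigned long opt = 128UL << 10;	/* illustrative return value of
						 * dma_opt_mapping_size() */

	if ((opt >> SECTOR_SHIFT) < max_sectors)
		max_sectors = opt >> SECTOR_SHIFT;

	printf("capped max_sectors = %u (%lu KiB per request)\n",
	       max_sectors,
	       ((unsigned long)max_sectors << SECTOR_SHIFT) >> 10);
	return 0;
}

With those inputs the cap works out to 256 sectors, i.e. 128 KiB per
request, which is the same arithmetic that patch 3/4 applies to the shost.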

Even if the IOVA allocator were greatly optimised for speed, there would
still be an overhead in allocating and freeing those larger IOVAs, which
would outweigh the advantage of having larger DMA mappings. But is there
even an advantage in very large streaming DMA mappings? Maybe for IOTLB
efficiency. But some say it's better to have the DMA engine start
processing the data ASAP and not wait for larger lists to be built.

Thanks,
John


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 2/4] dma-iommu: Add iommu_dma_opt_mapping_size()
  2022-06-06  9:30 ` [PATCH v3 2/4] dma-iommu: Add iommu_dma_opt_mapping_size() John Garry
@ 2022-06-08 17:26   ` Bart Van Assche
  2022-06-08 17:39     ` John Garry
  2022-06-14 13:12   ` John Garry
  1 sibling, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2022-06-08 17:26 UTC (permalink / raw)
  To: John Garry, damien.lemoal, joro, will, jejb, martin.petersen,
	hch, m.szyprowski, robin.murphy
  Cc: linux-scsi, linux-doc, liyihang6, linux-kernel, linux-ide, iommu

On 6/6/22 02:30, John Garry via iommu wrote:
> +unsigned long iova_rcache_range(void)
> +{
> +	return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
> +}

My understanding is that iova cache entries may be smaller than
IOVA_RANGE_CACHE_MAX_SIZE and hence that, even if code that uses the DMA
mapping API respects this limit, a cache miss can still happen.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 1/4] dma-mapping: Add dma_opt_mapping_size()
  2022-06-06  9:30 ` [PATCH v3 1/4] dma-mapping: Add dma_opt_mapping_size() John Garry
@ 2022-06-08 17:27   ` Bart Van Assche
  0 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2022-06-08 17:27 UTC (permalink / raw)
  To: John Garry, damien.lemoal, joro, will, jejb, martin.petersen,
	hch, m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen

On 6/6/22 02:30, John Garry wrote:
> +::
> +
> +	size_t
> +	dma_opt_mapping_size(struct device *dev);
> +
> +Returns the maximum optimal size of a mapping for the device. Mapping large
> +buffers may take longer so device drivers are advised to limit total DMA
> +streaming mappings length to the returned value.

"Maximum optimal" sounds weird to me. Is there a single optimal value or 
are there multiple optimal values? In the former case I think that the 
word "maximum" should be left out.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
  2022-06-06  9:30 ` [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits John Garry
@ 2022-06-08 17:33   ` Bart Van Assche
  2022-06-08 17:50     ` John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2022-06-08 17:33 UTC (permalink / raw)
  To: John Garry, damien.lemoal, joro, will, jejb, martin.petersen,
	hch, m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen

On 6/6/22 02:30, John Garry wrote:
> +	if (dma_dev->dma_mask) {
> +		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
> +				dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
> +	}

Since IOVA_RANGE_CACHE_MAX_SIZE = 6 this limits max_sectors to 2**6 * 
PAGE_SIZE or 256 KiB if the page size is 4 KiB. I think that's too 
small. Some (SRP) storage arrays require much larger transfers to 
achieve optimal performance.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 2/4] dma-iommu: Add iommu_dma_opt_mapping_size()
  2022-06-08 17:26   ` Bart Van Assche
@ 2022-06-08 17:39     ` John Garry
  0 siblings, 0 replies; 21+ messages in thread
From: John Garry @ 2022-06-08 17:39 UTC (permalink / raw)
  To: Bart Van Assche, damien.lemoal, joro, will, jejb,
	martin.petersen, hch, m.szyprowski, robin.murphy
  Cc: linux-scsi, linux-doc, liyihang6, linux-kernel, linux-ide, iommu

On 08/06/2022 18:26, Bart Van Assche wrote:
> On 6/6/22 02:30, John Garry via iommu wrote:
>> +unsigned long iova_rcache_range(void)
>> +{
>> +    return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
>> +}
> 
> My understanding is that iova cache entries may be smaller than 
> IOVA_RANGE_CACHE_MAX_SIZE and hence that, even if code that uses the DMA 
> mapping API respects this limit, a cache miss can still happen.

Sure, a cache miss may still happen - however once we have stressed the
system for a while the rcaches fill up and misses become rare, or at
least rare enough not to be noticeable compared to having no cached
IOVAs at all.

Thanks,
john

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
  2022-06-08 17:33   ` Bart Van Assche
@ 2022-06-08 17:50     ` John Garry
  2022-06-08 21:07       ` Bart Van Assche
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2022-06-08 17:50 UTC (permalink / raw)
  To: Bart Van Assche, damien.lemoal, joro, will, jejb,
	martin.petersen, hch, m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen

On 08/06/2022 18:33, Bart Van Assche wrote:
> On 6/6/22 02:30, John Garry wrote:
>> +    if (dma_dev->dma_mask) {
>> +        shost->max_sectors = min_t(unsigned int, shost->max_sectors,
>> +                dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
>> +    }
> 
> Since IOVA_RANGE_CACHE_MAX_SIZE = 6 this limits max_sectors to 2**6 * 
> PAGE_SIZE or 256 KiB if the page size is 4 KiB.

It's actually 128K for a 4K page size, as any IOVA size is rounded up
to a power of 2 when testing whether we may cache it, which means
anything >128K would round up to 256K and cannot be cached.
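
To illustrate, a minimal standalone sketch of that check, assuming a 4K
page size and IOVA_RANGE_CACHE_MAX_SIZE == 6 (the helper name here is
made up; the real logic lives in alloc_iova_fast() and iova_rcache_get()):

#include <stdbool.h>
#include <stdio.h>

#define PAGE_SIZE			4096UL
#define IOVA_RANGE_CACHE_MAX_SIZE	6

/* Hypothetical helper: can an IOVA of this length come from the rcache? */
static bool iova_len_is_cacheable(unsigned long len)
{
	unsigned long pages = (len + PAGE_SIZE - 1) / PAGE_SIZE;
	unsigned int order = 0;

	while ((1UL << order) < pages)	/* round up to a power-of-2 order */
		order++;

	return order < IOVA_RANGE_CACHE_MAX_SIZE;	/* orders 0..5 cached */
}

int main(void)
{
	printf("128K cacheable: %d\n", iova_len_is_cacheable(128UL << 10)); /* 1 */
	printf("132K cacheable: %d\n", iova_len_is_cacheable(132UL << 10)); /* 0 */
	return 0;
}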

> I think that's too 
> small. Some (SRP) storage arrays require much larger transfers to 
> achieve optimal performance.

Have you tried to achieve this optimal performance with an IOMMU enabled?

Please note that this limit only applies if we have an IOMMU enabled for 
the scsi host dma device. Otherwise we are limited by dma direct or 
swiotlb max mapping size, as before.

Thanks,
John

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
  2022-06-08 17:50     ` John Garry
@ 2022-06-08 21:07       ` Bart Van Assche
  2022-06-09  8:00         ` John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2022-06-08 21:07 UTC (permalink / raw)
  To: John Garry, damien.lemoal, joro, will, jejb, martin.petersen,
	hch, m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen

On 6/8/22 10:50, John Garry wrote:
> Please note that this limit only applies if we have an IOMMU enabled for 
> the scsi host dma device. Otherwise we are limited by dma direct or 
> swiotlb max mapping size, as before.

SCSI host bus adapters that support 64-bit DMA may support much larger 
transfer sizes than 128 KiB.

Thanks,

Bart.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
  2022-06-08 21:07       ` Bart Van Assche
@ 2022-06-09  8:00         ` John Garry
  2022-06-09 17:18           ` Bart Van Assche
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2022-06-09  8:00 UTC (permalink / raw)
  To: Bart Van Assche, damien.lemoal, joro, will, jejb,
	martin.petersen, hch, m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen

On 08/06/2022 22:07, Bart Van Assche wrote:
> On 6/8/22 10:50, John Garry wrote:
>> Please note that this limit only applies if we have an IOMMU enabled 
>> for the scsi host dma device. Otherwise we are limited by dma direct 
>> or swiotlb max mapping size, as before.
> 
> SCSI host bus adapters that support 64-bit DMA may support much larger 
> transfer sizes than 128 KiB.

Indeed, and that is my problem today, as my storage controller is
generating DMA mapping lengths which exceed 128K and they slow
everything down.

If you say that SRP enjoys the best performance with larger transfers
then can you please test this with an IOMMU enabled (iommu group type
DMA or DMA-FQ)?

Thanks,
John

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
  2022-06-09  8:00         ` John Garry
@ 2022-06-09 17:18           ` Bart Van Assche
  2022-06-09 17:54             ` John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2022-06-09 17:18 UTC (permalink / raw)
  To: John Garry, damien.lemoal, joro, will, jejb, martin.petersen,
	hch, m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen

On 6/9/22 01:00, John Garry wrote:
> On 08/06/2022 22:07, Bart Van Assche wrote:
>> On 6/8/22 10:50, John Garry wrote:
>>> Please note that this limit only applies if we have an IOMMU enabled 
>>> for the scsi host dma device. Otherwise we are limited by dma direct 
>>> or swiotlb max mapping size, as before.
>>
>> SCSI host bus adapters that support 64-bit DMA may support much larger 
>> transfer sizes than 128 KiB.
> 
> Indeed, and that is my problem today, as my storage controller is 
> generating DMA mapping lengths which exceed 128K and they slow 
> everything down.
> 
> If you say that SRP enjoys the best performance with larger transfers 
> then can you please test this with an IOMMU enabled (iommu group type 
> DMA or DMA-FQ)?

Hmm ... what exactly do you want me to test? Do you perhaps want me to 
measure how much performance drops with an IOMMU enabled? I don't have 
access anymore to the SRP setup I referred to in my previous email. But 
I do have access to devices that boot from UFS storage. For these 
devices we need to transfer 2 MiB per request to achieve full bandwidth.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
  2022-06-09 17:18           ` Bart Van Assche
@ 2022-06-09 17:54             ` John Garry
  2022-06-09 20:34               ` Bart Van Assche
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2022-06-09 17:54 UTC (permalink / raw)
  To: Bart Van Assche, damien.lemoal, joro, will, jejb,
	martin.petersen, hch, m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen

On 09/06/2022 18:18, Bart Van Assche wrote:
>>>
>>> SCSI host bus adapters that support 64-bit DMA may support much 
>>> larger transfer sizes than 128 KiB.
>>
>> Indeed, and that is my problem today, as my storage controller is 
>> generating DMA mapping lengths which exceed 128K and they slow 
>> everything down.
>>
>> If you say that SRP enjoys the best performance with larger transfers 
>> then can you please test this with an IOMMU enabled (iommu group type 
>> DMA or DMA-FQ)?
> 
> Hmm ... what exactly do you want me to test? Do you perhaps want me to 
> measure how much performance drops with an IOMMU enabled? 

Yes, I would like to know of any performance change with an IOMMU
enabled, and then with an IOMMU enabled plus my series applied.

> I don't have 
> access anymore to the SRP setup I referred to in my previous email. But 
> I do have access to devices that boot from UFS storage. For these 
> devices we need to transfer 2 MiB per request to achieve full bandwidth.

ok, but do you have a system where the UFS host controller is behind an 
IOMMU? I had the impression that UFS controllers would be mostly found 
in embedded systems and IOMMUs are not as common on there.

Thanks,
John


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
  2022-06-09 17:54             ` John Garry
@ 2022-06-09 20:34               ` Bart Van Assche
  2022-06-10 15:37                 ` John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2022-06-09 20:34 UTC (permalink / raw)
  To: John Garry, damien.lemoal, joro, will, jejb, martin.petersen,
	hch, m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen

On 6/9/22 10:54, John Garry wrote:
> ok, but do you have a system where the UFS host controller is behind an 
> IOMMU? I had the impression that UFS controllers would be mostly found 
> in embedded systems and IOMMUs are not as common on there.

Modern phones have an IOMMU. Below one can find an example from a Pixel
6 phone. The UFS storage controller is not controlled by the IOMMU as
far as I can see, but I wouldn't be surprised if the security team
would ask us one day to enable the IOMMU for the UFS controller.

# (cd /sys/class/iommu && ls */devices)
1a090000.sysmmu/devices:
19000000.aoc

1a510000.sysmmu/devices:
1a440000.lwis_csi

1a540000.sysmmu/devices:
1aa40000.lwis_pdp

1a880000.sysmmu/devices:
1a840000.lwis_g3aa

1ad00000.sysmmu/devices:
1ac40000.lwis_ipp  1ac80000.lwis_gtnr_align

1b080000.sysmmu/devices:
1b450000.lwis_itp

1b780000.sysmmu/devices:

1b7b0000.sysmmu/devices:
1b760000.lwis_mcsc

1b7e0000.sysmmu/devices:

1baa0000.sysmmu/devices:
1a4e0000.lwis_votf  1ba40000.lwis_gdc

1bad0000.sysmmu/devices:
1ba60000.lwis_gdc

1bb00000.sysmmu/devices:
1ba80000.lwis_scsc

1bc70000.sysmmu/devices:
1bc40000.lwis_gtnr_merge

1bca0000.sysmmu/devices:

1bcd0000.sysmmu/devices:

1bd00000.sysmmu/devices:

1bd30000.sysmmu/devices:

1c100000.sysmmu/devices:
1c300000.drmdecon  1c302000.drmdecon

1c110000.sysmmu/devices:

1c120000.sysmmu/devices:

1c660000.sysmmu/devices:
1c640000.g2d

1c690000.sysmmu/devices:

1c710000.sysmmu/devices:
1c700000.smfc

1c870000.sysmmu/devices:
1c8d0000.MFC-0  mfc

1c8a0000.sysmmu/devices:

1ca40000.sysmmu/devices:
1cb00000.bigocean

1cc40000.sysmmu/devices:
1ce00000.abrolhos

Bart.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
  2022-06-09 20:34               ` Bart Van Assche
@ 2022-06-10 15:37                 ` John Garry
  2022-06-23  8:36                   ` John Garry
  0 siblings, 1 reply; 21+ messages in thread
From: John Garry @ 2022-06-10 15:37 UTC (permalink / raw)
  To: Bart Van Assche, damien.lemoal, joro, will, jejb,
	martin.petersen, hch, m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen

On 09/06/2022 21:34, Bart Van Assche wrote:
> On 6/9/22 10:54, John Garry wrote:
>> ok, but do you have a system where the UFS host controller is behind 
>> an IOMMU? I had the impression that UFS controllers would be mostly 
>> found in embedded systems and IOMMUs are not as common on there.
> 
> Modern phones have an IOMMU. Below one can find an example from a Pixel 
> 6 phone. The UFS storage controller is not controlled by the IOMMU as 
> far as I can see, but I wouldn't be surprised if the security team 
> would ask us one day to enable the IOMMU for the UFS controller.

OK, then unfortunately it seems that you have no method to test. I
might be able to test USB MSC, but I am not sure if I can even get DMA
mappings whose length exceeds the IOVA rcache limit there.

Thanks,
John

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 2/4] dma-iommu: Add iommu_dma_opt_mapping_size()
  2022-06-06  9:30 ` [PATCH v3 2/4] dma-iommu: Add iommu_dma_opt_mapping_size() John Garry
  2022-06-08 17:26   ` Bart Van Assche
@ 2022-06-14 13:12   ` John Garry
  2022-06-23  8:38     ` John Garry
  1 sibling, 1 reply; 21+ messages in thread
From: John Garry @ 2022-06-14 13:12 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen

On 06/06/2022 10:30, John Garry wrote:
> Add the IOMMU callback for DMA mapping API dma_opt_mapping_size(), which
> allows the drivers to know the optimal mapping limit and thus limit the
> requested IOVA lengths.
> 
> This value is based on the IOVA rcache range limit, as IOVAs allocated
> above this limit must always be newly allocated, which may be quite slow.
> 

Can I please get some sort of ack from the IOMMU people on this one?

Thanks,
John

EOM

> Signed-off-by: John Garry <john.garry@huawei.com>
> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> ---
>   drivers/iommu/dma-iommu.c | 6 ++++++
>   drivers/iommu/iova.c      | 5 +++++
>   include/linux/iova.h      | 2 ++
>   3 files changed, 13 insertions(+)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index f90251572a5d..9e1586447ee8 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1459,6 +1459,11 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
>   	return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
>   }
>   
> +static size_t iommu_dma_opt_mapping_size(void)
> +{
> +	return iova_rcache_range();
> +}
> +
>   static const struct dma_map_ops iommu_dma_ops = {
>   	.alloc			= iommu_dma_alloc,
>   	.free			= iommu_dma_free,
> @@ -1479,6 +1484,7 @@ static const struct dma_map_ops iommu_dma_ops = {
>   	.map_resource		= iommu_dma_map_resource,
>   	.unmap_resource		= iommu_dma_unmap_resource,
>   	.get_merge_boundary	= iommu_dma_get_merge_boundary,
> +	.opt_mapping_size	= iommu_dma_opt_mapping_size,
>   };
>   
>   /*
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index db77aa675145..9f00b58d546e 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
>   static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
>   static void free_iova_rcaches(struct iova_domain *iovad);
>   
> +unsigned long iova_rcache_range(void)
> +{
> +	return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
> +}
> +
>   static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
>   {
>   	struct iova_domain *iovad;
> diff --git a/include/linux/iova.h b/include/linux/iova.h
> index 320a70e40233..c6ba6d95d79c 100644
> --- a/include/linux/iova.h
> +++ b/include/linux/iova.h
> @@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova)
>   int iova_cache_get(void);
>   void iova_cache_put(void);
>   
> +unsigned long iova_rcache_range(void);
> +
>   void free_iova(struct iova_domain *iovad, unsigned long pfn);
>   void __free_iova(struct iova_domain *iovad, struct iova *iova);
>   struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
  2022-06-10 15:37                 ` John Garry
@ 2022-06-23  8:36                   ` John Garry
  0 siblings, 0 replies; 21+ messages in thread
From: John Garry @ 2022-06-23  8:36 UTC (permalink / raw)
  To: Bart Van Assche, damien.lemoal, joro, will, jejb,
	martin.petersen, hch, m.szyprowski, robin.murphy
  Cc: linux-scsi, linux-doc, liyihang6, linux-kernel, linux-ide, iommu

On 10/06/2022 16:37, John Garry via iommu wrote:
> 
>> On 6/9/22 10:54, John Garry wrote:
>>> ok, but do you have a system where the UFS host controller is behind 
>>> an IOMMU? I had the impression that UFS controllers would be mostly 
>>> found in embedded systems and IOMMUs are not as common on there.
>>
>> Modern phones have an IOMMU. Below one can find an example from a 
>> Pixel 6 phone. The UFS storage controller is not controlled by the 
>> IOMMU as far as I can see, but I wouldn't be surprised if the security 
>> team would ask us one day to enable the IOMMU for the UFS controller.
> 
> OK, then unfortunately it seems that you have no method to test. I 
> might be able to test USB MSC, but I am not sure if I can even get DMA 
> mappings whose length exceeds the IOVA rcache limit there.

I was able to do some testing on USB MSC for an XHCI controller. The 
result is that limiting the max HW sectors there does not affect 
performance in normal conditions.

However, if I hack the USB driver and fiddle with the request queue
settings then it can make a difference:
- lift max_sectors limit in usb_stor_host_template 120KB -> 256KB
- lift request queue read_ahead_kb 128KB -> 256KB

In this scenario I can get 42.5MB/s read throughput, as opposed to
39.5MB/s in normal conditions. Since .can_queue=1 for that host it
would not fall foul of the IOVA allocator performance issues I see
elsewhere, so limiting max_sectors would not be required for that
reason.

So this is an artificial test, but it may be worth considering applying
this optimal DMA mapping max_sectors limit only to SAS controllers,
which I know can benefit.

Christoph, any opinion?

thanks,
John

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 2/4] dma-iommu: Add iommu_dma_opt_mapping_size()
  2022-06-14 13:12   ` John Garry
@ 2022-06-23  8:38     ` John Garry
  0 siblings, 0 replies; 21+ messages in thread
From: John Garry @ 2022-06-23  8:38 UTC (permalink / raw)
  To: joro, will, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, linux-scsi, liyihang6,
	chenxiang66, thunder.leizhen, damien.lemoal, m.szyprowski,
	martin.petersen, jejb, hch

On 14/06/2022 14:12, John Garry wrote:
> On 06/06/2022 10:30, John Garry wrote:
>> Add the IOMMU callback for DMA mapping API dma_opt_mapping_size(), which
>> allows the drivers to know the optimal mapping limit and thus limit the
>> requested IOVA lengths.
>>
>> This value is based on the IOVA rcache range limit, as IOVAs allocated
>> above this limit must always be newly allocated, which may be quite slow.
>>
> 
> Can I please get some sort of ack from the IOMMU people on this one?
> 

Another request for an ack please.

Thanks,
john

> 
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
>> ---
>>   drivers/iommu/dma-iommu.c | 6 ++++++
>>   drivers/iommu/iova.c      | 5 +++++
>>   include/linux/iova.h      | 2 ++
>>   3 files changed, 13 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index f90251572a5d..9e1586447ee8 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -1459,6 +1459,11 @@ static unsigned long 
>> iommu_dma_get_merge_boundary(struct device *dev)
>>       return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
>>   }
>> +static size_t iommu_dma_opt_mapping_size(void)
>> +{
>> +    return iova_rcache_range();
>> +}
>> +
>>   static const struct dma_map_ops iommu_dma_ops = {
>>       .alloc            = iommu_dma_alloc,
>>       .free            = iommu_dma_free,
>> @@ -1479,6 +1484,7 @@ static const struct dma_map_ops iommu_dma_ops = {
>>       .map_resource        = iommu_dma_map_resource,
>>       .unmap_resource        = iommu_dma_unmap_resource,
>>       .get_merge_boundary    = iommu_dma_get_merge_boundary,
>> +    .opt_mapping_size    = iommu_dma_opt_mapping_size,
>>   };
>>   /*
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index db77aa675145..9f00b58d546e 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct 
>> iova_domain *iovad,
>>   static void free_cpu_cached_iovas(unsigned int cpu, struct 
>> iova_domain *iovad);
>>   static void free_iova_rcaches(struct iova_domain *iovad);
>> +unsigned long iova_rcache_range(void)
>> +{
>> +    return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
>> +}
>> +
>>   static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
>>   {
>>       struct iova_domain *iovad;
>> diff --git a/include/linux/iova.h b/include/linux/iova.h
>> index 320a70e40233..c6ba6d95d79c 100644
>> --- a/include/linux/iova.h
>> +++ b/include/linux/iova.h
>> @@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct 
>> iova_domain *iovad, dma_addr_t iova)
>>   int iova_cache_get(void);
>>   void iova_cache_put(void);
>> +unsigned long iova_rcache_range(void);
>> +
>>   void free_iova(struct iova_domain *iovad, unsigned long pfn);
>>   void __free_iova(struct iova_domain *iovad, struct iova *iova);
>>   struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread
