linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/5] DMA mapping changes for SCSI core
@ 2022-06-30 12:08 John Garry
  2022-06-30 12:08 ` [PATCH v5 1/5] dma-mapping: Add dma_opt_mapping_size() John Garry
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: John Garry @ 2022-06-30 12:08 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, iommu, linux-scsi,
	linuxarm, John Garry

As reported in [0], DMA mappings whose size exceeds the IOMMU IOVA caching
limit may see a big performance hit.

This series introduces a new DMA mapping API, dma_opt_mapping_size(), so
that drivers may know this limit when performance is a factor in the
mapping.

The SCSI SAS transport code is modified only to use this limit. For now I
did not want to touch other hosts as I have a concern that this change
could cause a performance regression.

I also added a patch for libata-scsi as it does not currently honour the
shost max_sectors limit.

[0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/

Changes since v4:
- tweak libata and other patch titles
- Add Robin's tag (thanks!)
- Clarify description of new DMA mapping API

Changes since v3:
- Apply max DMA optimial limit to SAS hosts only
  Note: Even though "scsi: core: Cap shost max_sectors only once when
  adding" is a subset of a previous patch I did not transfer the RB tags
- Rebase on v5.19-rc4

John Garry (5):
  dma-mapping: Add dma_opt_mapping_size()
  dma-iommu: Add iommu_dma_opt_mapping_size()
  scsi: core: Cap shost max_sectors according to DMA limits only once
  scsi: scsi_transport_sas: Cap shost max_sectors according to DMA
    optimal limit
  ata: libata-scsi: Cap ata_device->max_sectors according to
    shost->max_sectors

 Documentation/core-api/dma-api.rst | 14 ++++++++++++++
 drivers/ata/libata-scsi.c          |  1 +
 drivers/iommu/dma-iommu.c          |  6 ++++++
 drivers/iommu/iova.c               |  5 +++++
 drivers/scsi/hosts.c               |  5 +++++
 drivers/scsi/scsi_lib.c            |  4 ----
 drivers/scsi/scsi_transport_sas.c  |  6 ++++++
 include/linux/dma-map-ops.h        |  1 +
 include/linux/dma-mapping.h        |  5 +++++
 include/linux/iova.h               |  2 ++
 kernel/dma/mapping.c               | 12 ++++++++++++
 11 files changed, 57 insertions(+), 4 deletions(-)

-- 
2.35.3


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v5 1/5] dma-mapping: Add dma_opt_mapping_size()
  2022-06-30 12:08 [PATCH v5 0/5] DMA mapping changes for SCSI core John Garry
@ 2022-06-30 12:08 ` John Garry
  2022-06-30 12:08 ` [PATCH v5 2/5] dma-iommu: Add iommu_dma_opt_mapping_size() John Garry
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: John Garry @ 2022-06-30 12:08 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, iommu, linux-scsi,
	linuxarm, John Garry

Streaming DMA mapping involving an IOMMU may be much slower for larger
total mapping size. This is because every IOMMU DMA mapping requires an
IOVA to be allocated and freed. IOVA sizes above a certain limit are not
cached, which can have a big impact on DMA mapping performance.

Provide an API for device drivers to know this "optimal" limit, such that
they may try to produce mapping which don't exceed it.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 Documentation/core-api/dma-api.rst | 14 ++++++++++++++
 include/linux/dma-map-ops.h        |  1 +
 include/linux/dma-mapping.h        |  5 +++++
 kernel/dma/mapping.c               | 12 ++++++++++++
 4 files changed, 32 insertions(+)

diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
index 6d6d0edd2d27..829f20a193ca 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -204,6 +204,20 @@ Returns the maximum size of a mapping for the device. The size parameter
 of the mapping functions like dma_map_single(), dma_map_page() and
 others should not be larger than the returned value.
 
+::
+
+	size_t
+	dma_opt_mapping_size(struct device *dev);
+
+Returns the maximum optimal size of a mapping for the device.
+
+Mapping larger buffers may take much longer in certain scenarios. In
+addition, for high-rate short-lived streaming mappings, the upfront time
+spent on the mapping may account for an appreciable part of the total
+request lifetime. As such, if splitting larger requests incurs no
+significant performance penalty, then device drivers are advised to
+limit total DMA streaming mappings length to the returned value.
+
 ::
 
 	bool
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 0d5b06b3a4a6..98ceba6fa848 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -69,6 +69,7 @@ struct dma_map_ops {
 	int (*dma_supported)(struct device *dev, u64 mask);
 	u64 (*get_required_mask)(struct device *dev);
 	size_t (*max_mapping_size)(struct device *dev);
+	size_t (*opt_mapping_size)(void);
 	unsigned long (*get_merge_boundary)(struct device *dev);
 };
 
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index dca2b1355bb1..fe3849434b2a 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -144,6 +144,7 @@ int dma_set_mask(struct device *dev, u64 mask);
 int dma_set_coherent_mask(struct device *dev, u64 mask);
 u64 dma_get_required_mask(struct device *dev);
 size_t dma_max_mapping_size(struct device *dev);
+size_t dma_opt_mapping_size(struct device *dev);
 bool dma_need_sync(struct device *dev, dma_addr_t dma_addr);
 unsigned long dma_get_merge_boundary(struct device *dev);
 struct sg_table *dma_alloc_noncontiguous(struct device *dev, size_t size,
@@ -266,6 +267,10 @@ static inline size_t dma_max_mapping_size(struct device *dev)
 {
 	return 0;
 }
+static inline size_t dma_opt_mapping_size(struct device *dev)
+{
+	return 0;
+}
 static inline bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
 {
 	return false;
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index db7244291b74..1bfe11b1edb6 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -773,6 +773,18 @@ size_t dma_max_mapping_size(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dma_max_mapping_size);
 
+size_t dma_opt_mapping_size(struct device *dev)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+	size_t size = SIZE_MAX;
+
+	if (ops && ops->opt_mapping_size)
+		size = ops->opt_mapping_size();
+
+	return min(dma_max_mapping_size(dev), size);
+}
+EXPORT_SYMBOL_GPL(dma_opt_mapping_size);
+
 bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v5 2/5] dma-iommu: Add iommu_dma_opt_mapping_size()
  2022-06-30 12:08 [PATCH v5 0/5] DMA mapping changes for SCSI core John Garry
  2022-06-30 12:08 ` [PATCH v5 1/5] dma-mapping: Add dma_opt_mapping_size() John Garry
@ 2022-06-30 12:08 ` John Garry
  2022-06-30 12:08 ` [PATCH v5 3/5] scsi: core: Cap shost max_sectors according to DMA limits only once John Garry
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: John Garry @ 2022-06-30 12:08 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, iommu, linux-scsi,
	linuxarm, John Garry

Add the IOMMU callback for DMA mapping API dma_opt_mapping_size(), which
allows the drivers to know the optimal mapping limit and thus limit the
requested IOVA lengths.

This value is based on the IOVA rcache range limit, as IOVAs allocated
above this limit must always be newly allocated, which may be quite slow.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c | 6 ++++++
 drivers/iommu/iova.c      | 5 +++++
 include/linux/iova.h      | 2 ++
 3 files changed, 13 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f90251572a5d..9e1586447ee8 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1459,6 +1459,11 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
 	return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
 }
 
+static size_t iommu_dma_opt_mapping_size(void)
+{
+	return iova_rcache_range();
+}
+
 static const struct dma_map_ops iommu_dma_ops = {
 	.alloc			= iommu_dma_alloc,
 	.free			= iommu_dma_free,
@@ -1479,6 +1484,7 @@ static const struct dma_map_ops iommu_dma_ops = {
 	.map_resource		= iommu_dma_map_resource,
 	.unmap_resource		= iommu_dma_unmap_resource,
 	.get_merge_boundary	= iommu_dma_get_merge_boundary,
+	.opt_mapping_size	= iommu_dma_opt_mapping_size,
 };
 
 /*
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index db77aa675145..9f00b58d546e 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
 static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
 static void free_iova_rcaches(struct iova_domain *iovad);
 
+unsigned long iova_rcache_range(void)
+{
+	return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
+}
+
 static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
 {
 	struct iova_domain *iovad;
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 320a70e40233..c6ba6d95d79c 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova)
 int iova_cache_get(void);
 void iova_cache_put(void);
 
+unsigned long iova_rcache_range(void);
+
 void free_iova(struct iova_domain *iovad, unsigned long pfn);
 void __free_iova(struct iova_domain *iovad, struct iova *iova);
 struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v5 3/5] scsi: core: Cap shost max_sectors according to DMA limits only once
  2022-06-30 12:08 [PATCH v5 0/5] DMA mapping changes for SCSI core John Garry
  2022-06-30 12:08 ` [PATCH v5 1/5] dma-mapping: Add dma_opt_mapping_size() John Garry
  2022-06-30 12:08 ` [PATCH v5 2/5] dma-iommu: Add iommu_dma_opt_mapping_size() John Garry
@ 2022-06-30 12:08 ` John Garry
  2022-06-30 23:41   ` Damien Le Moal
  2022-06-30 12:08 ` [PATCH v5 4/5] scsi: scsi_transport_sas: Cap shost max_sectors according to DMA optimal limit John Garry
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: John Garry @ 2022-06-30 12:08 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, iommu, linux-scsi,
	linuxarm, John Garry

The shost->max_sectors is repeatedly capped according to the host DMA
mapping limit for each sdev in __scsi_init_queue(). This is unnecessary, so
set only once when adding the host.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hosts.c    | 5 +++++
 drivers/scsi/scsi_lib.c | 4 ----
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 8352f90d997d..d04bd2c7c9f1 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -236,6 +236,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 
 	shost->dma_dev = dma_dev;
 
+	if (dma_dev->dma_mask) {
+		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
+				dma_max_mapping_size(dma_dev) >> SECTOR_SHIFT);
+	}
+
 	error = scsi_mq_setup_tags(shost);
 	if (error)
 		goto fail;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6ffc9e4258a8..6ce8acea322a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1884,10 +1884,6 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 		blk_queue_max_integrity_segments(q, shost->sg_prot_tablesize);
 	}
 
-	if (dev->dma_mask) {
-		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
-				dma_max_mapping_size(dev) >> SECTOR_SHIFT);
-	}
 	blk_queue_max_hw_sectors(q, shost->max_sectors);
 	blk_queue_segment_boundary(q, shost->dma_boundary);
 	dma_set_seg_boundary(dev, shost->dma_boundary);
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v5 4/5] scsi: scsi_transport_sas: Cap shost max_sectors according to DMA optimal limit
  2022-06-30 12:08 [PATCH v5 0/5] DMA mapping changes for SCSI core John Garry
                   ` (2 preceding siblings ...)
  2022-06-30 12:08 ` [PATCH v5 3/5] scsi: core: Cap shost max_sectors according to DMA limits only once John Garry
@ 2022-06-30 12:08 ` John Garry
  2022-06-30 23:49   ` Damien Le Moal
  2022-06-30 12:08 ` [PATCH v5 5/5] ata: libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors John Garry
  2022-07-06 13:40 ` [PATCH v5 0/5] DMA mapping changes for SCSI core John Garry
  5 siblings, 1 reply; 20+ messages in thread
From: John Garry @ 2022-06-30 12:08 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, iommu, linux-scsi,
	linuxarm, John Garry

Streaming DMA mappings may be considerably slower when mappings go through
an IOMMU and the total mapping length is somewhat long. This is because the
IOMMU IOVA code allocates and free an IOVA for each mapping, which may
affect performance.

For performance reasons set the request queue max_sectors from
dma_opt_mapping_size(), which knows this mapping limit.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/scsi_transport_sas.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index 12bff64dade6..1b45248748e0 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -225,6 +225,7 @@ static int sas_host_setup(struct transport_container *tc, struct device *dev,
 {
 	struct Scsi_Host *shost = dev_to_shost(dev);
 	struct sas_host_attrs *sas_host = to_sas_host_attrs(shost);
+	struct device *dma_dev = shost->dma_dev;
 
 	INIT_LIST_HEAD(&sas_host->rphy_list);
 	mutex_init(&sas_host->lock);
@@ -236,6 +237,11 @@ static int sas_host_setup(struct transport_container *tc, struct device *dev,
 		dev_printk(KERN_ERR, dev, "fail to a bsg device %d\n",
 			   shost->host_no);
 
+	if (dma_dev) {
+		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
+				dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
+	}
+
 	return 0;
 }
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v5 5/5] ata: libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors
  2022-06-30 12:08 [PATCH v5 0/5] DMA mapping changes for SCSI core John Garry
                   ` (3 preceding siblings ...)
  2022-06-30 12:08 ` [PATCH v5 4/5] scsi: scsi_transport_sas: Cap shost max_sectors according to DMA optimal limit John Garry
@ 2022-06-30 12:08 ` John Garry
  2022-07-06 13:40 ` [PATCH v5 0/5] DMA mapping changes for SCSI core John Garry
  5 siblings, 0 replies; 20+ messages in thread
From: John Garry @ 2022-06-30 12:08 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, iommu, linux-scsi,
	linuxarm, John Garry

ATA devices (struct ata_device) have a max_sectors field which is
configured internally in libata. This is then used to (re)configure the
associated sdev request queue max_sectors value from how it is earlier set
in __scsi_init_queue(). In __scsi_init_queue() the max_sectors value is set
according to shost limits, which includes host DMA mapping limits.

Cap the ata_device max_sectors according to shost->max_sectors to respect
this shost limit.

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/ata/libata-scsi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 86dbb1cdfabd..24a43d540d9f 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1060,6 +1060,7 @@ int ata_scsi_dev_config(struct scsi_device *sdev, struct ata_device *dev)
 		dev->flags |= ATA_DFLAG_NO_UNLOAD;
 
 	/* configure max sectors */
+	dev->max_sectors = min(dev->max_sectors, sdev->host->max_sectors);
 	blk_queue_max_hw_sectors(q, dev->max_sectors);
 
 	if (dev->class == ATA_DEV_ATAPI) {
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 3/5] scsi: core: Cap shost max_sectors according to DMA limits only once
  2022-06-30 12:08 ` [PATCH v5 3/5] scsi: core: Cap shost max_sectors according to DMA limits only once John Garry
@ 2022-06-30 23:41   ` Damien Le Moal
  2022-07-01  8:02     ` John Garry
  0 siblings, 1 reply; 20+ messages in thread
From: Damien Le Moal @ 2022-06-30 23:41 UTC (permalink / raw)
  To: John Garry, joro, will, jejb, martin.petersen, hch, m.szyprowski,
	robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, iommu, linux-scsi, linuxarm

On 6/30/22 21:08, John Garry wrote:
> The shost->max_sectors is repeatedly capped according to the host DMA
> mapping limit for each sdev in __scsi_init_queue(). This is unnecessary, so
> set only once when adding the host.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  drivers/scsi/hosts.c    | 5 +++++
>  drivers/scsi/scsi_lib.c | 4 ----
>  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 8352f90d997d..d04bd2c7c9f1 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -236,6 +236,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
>  
>  	shost->dma_dev = dma_dev;
>  
> +	if (dma_dev->dma_mask) {
> +		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
> +				dma_max_mapping_size(dma_dev) >> SECTOR_SHIFT);
> +	}

Nit: you could remove the curly brackets... But it being a multi-line
statement, having them is OK too I think.

> +
>  	error = scsi_mq_setup_tags(shost);
>  	if (error)
>  		goto fail;
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 6ffc9e4258a8..6ce8acea322a 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1884,10 +1884,6 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
>  		blk_queue_max_integrity_segments(q, shost->sg_prot_tablesize);
>  	}
>  
> -	if (dev->dma_mask) {
> -		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
> -				dma_max_mapping_size(dev) >> SECTOR_SHIFT);
> -	}
>  	blk_queue_max_hw_sectors(q, shost->max_sectors);
>  	blk_queue_segment_boundary(q, shost->dma_boundary);
>  	dma_set_seg_boundary(dev, shost->dma_boundary);

Looks good.

Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 4/5] scsi: scsi_transport_sas: Cap shost max_sectors according to DMA optimal limit
  2022-06-30 12:08 ` [PATCH v5 4/5] scsi: scsi_transport_sas: Cap shost max_sectors according to DMA optimal limit John Garry
@ 2022-06-30 23:49   ` Damien Le Moal
  2022-07-01  8:46     ` John Garry
  0 siblings, 1 reply; 20+ messages in thread
From: Damien Le Moal @ 2022-06-30 23:49 UTC (permalink / raw)
  To: John Garry, joro, will, jejb, martin.petersen, hch, m.szyprowski,
	robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, iommu, linux-scsi, linuxarm

On 6/30/22 21:08, John Garry wrote:
> Streaming DMA mappings may be considerably slower when mappings go through
> an IOMMU and the total mapping length is somewhat long. This is because the
> IOMMU IOVA code allocates and free an IOVA for each mapping, which may
> affect performance.
> 
> For performance reasons set the request queue max_sectors from
> dma_opt_mapping_size(), which knows this mapping limit.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  drivers/scsi/scsi_transport_sas.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
> index 12bff64dade6..1b45248748e0 100644
> --- a/drivers/scsi/scsi_transport_sas.c
> +++ b/drivers/scsi/scsi_transport_sas.c
> @@ -225,6 +225,7 @@ static int sas_host_setup(struct transport_container *tc, struct device *dev,
>  {
>  	struct Scsi_Host *shost = dev_to_shost(dev);
>  	struct sas_host_attrs *sas_host = to_sas_host_attrs(shost);
> +	struct device *dma_dev = shost->dma_dev;
>  
>  	INIT_LIST_HEAD(&sas_host->rphy_list);
>  	mutex_init(&sas_host->lock);
> @@ -236,6 +237,11 @@ static int sas_host_setup(struct transport_container *tc, struct device *dev,
>  		dev_printk(KERN_ERR, dev, "fail to a bsg device %d\n",
>  			   shost->host_no);
>  
> +	if (dma_dev) {
> +		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
> +				dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
> +	}

Hmm... shost->max_sectors becomes the max_hw_sectors limit for the block
dev. So using dma_max_mapping_size(dma_dev) for that limit makes sense.
Shouldn't dma_opt_mapping_size(dma_dev) be used to limit only the default
"soft" limit (queue max_sectors limit) instead of the hard limit ?

> +
>  	return 0;
>  }
>  
-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 3/5] scsi: core: Cap shost max_sectors according to DMA limits only once
  2022-06-30 23:41   ` Damien Le Moal
@ 2022-07-01  8:02     ` John Garry
  0 siblings, 0 replies; 20+ messages in thread
From: John Garry @ 2022-07-01  8:02 UTC (permalink / raw)
  To: Damien Le Moal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, iommu, linux-scsi, linuxarm

On 01/07/2022 00:41, Damien Le Moal wrote:
>>   
>>   	shost->dma_dev = dma_dev;
>>   
>> +	if (dma_dev->dma_mask) {
>> +		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
>> +				dma_max_mapping_size(dma_dev) >> SECTOR_SHIFT);
>> +	}
> Nit: you could remove the curly brackets... But it being a multi-line
> statement, having them is OK too I think.
> 

tglx seems to think that they are ok, and I generally agree (now):

https://lore.kernel.org/linux-arm-kernel/877djwdorz.ffs@nanos.tec.linutronix.de/

AFAICT coding-style.rst is ok with them in this scenario too

Cheers,
John

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 4/5] scsi: scsi_transport_sas: Cap shost max_sectors according to DMA optimal limit
  2022-06-30 23:49   ` Damien Le Moal
@ 2022-07-01  8:46     ` John Garry
  0 siblings, 0 replies; 20+ messages in thread
From: John Garry @ 2022-07-01  8:46 UTC (permalink / raw)
  To: Damien Le Moal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, iommu, linux-scsi, linuxarm

On 01/07/2022 00:49, Damien Le Moal wrote:
>>   
>> +	if (dma_dev) {
>> +		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
>> +				dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
>> +	}

Hi Damien,

 > Hmm... shost->max_sectors becomes the max_hw_sectors limit for the block
 > dev. So using dma_max_mapping_size(dma_dev) for that limit makes sense.
 > Shouldn't dma_opt_mapping_size(dma_dev) be used to limit only the default
 > "soft" limit (queue max_sectors limit) instead of the hard limit ?
 >

Sure, it would sensible to use dma_opt_mapping_size() to limit the 
default queue max sectors limit, while dma_max_mapping_size() limits the 
host max sectors. But I didn't see in practice how limiting the shost 
max sectors to dma_opt_mapping_size() makes a difference:

- block queue max_hw_sectors_kb file is read-only, so we cannot change 
the queue max sectors from there

- And no SAS driver actually tries to modify upwards from the default.
I do note that USB storage driver as an example of a scsi driver which 
does (modify from shost max sectors): see scsiglue.c::slave_configure()

Finally there is no common method to limit the default request queue max 
sectors for those SAS drivers - I would need to add this limit in each 
of their slave_configure callbacks, and I didn't think that its worth it.

Thanks,
John


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 0/5] DMA mapping changes for SCSI core
  2022-06-30 12:08 [PATCH v5 0/5] DMA mapping changes for SCSI core John Garry
                   ` (4 preceding siblings ...)
  2022-06-30 12:08 ` [PATCH v5 5/5] ata: libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors John Garry
@ 2022-07-06 13:40 ` John Garry
  2022-07-06 13:44   ` Christoph Hellwig
  5 siblings, 1 reply; 20+ messages in thread
From: John Garry @ 2022-07-06 13:40 UTC (permalink / raw)
  To: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy
  Cc: linux-doc, linux-kernel, linux-ide, iommu, iommu, linux-scsi, linuxarm

On 30/06/2022 13:08, John Garry wrote:

Hi Christoph,

Can you please consider picking up this series? A few things to note 
beforehand:

- I changed to only apply the mapping limit to SAS hosts in this 
version. I would need a fresh ack from Martin for those SCSI parts, but 
wanted to make sure you were ok with it.
- Damien had some doubt on updating the shost max_sectors as opposed to 
the per-request queue default, but I think he's ok with it - see patch 4/5

Thanks,
John


> As reported in [0], DMA mappings whose size exceeds the IOMMU IOVA caching
> limit may see a big performance hit.
> 
> This series introduces a new DMA mapping API, dma_opt_mapping_size(), so
> that drivers may know this limit when performance is a factor in the
> mapping.
> 
> The SCSI SAS transport code is modified only to use this limit. For now I
> did not want to touch other hosts as I have a concern that this change
> could cause a performance regression.
> 
> I also added a patch for libata-scsi as it does not currently honour the
> shost max_sectors limit.
> 
> [0]https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/
> 
> Changes since v4:
> - tweak libata and other patch titles
> - Add Robin's tag (thanks!)
> - Clarify description of new DMA mapping API


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 0/5] DMA mapping changes for SCSI core
  2022-07-06 13:40 ` [PATCH v5 0/5] DMA mapping changes for SCSI core John Garry
@ 2022-07-06 13:44   ` Christoph Hellwig
  2022-07-07 20:35     ` Martin K. Petersen
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2022-07-06 13:44 UTC (permalink / raw)
  To: John Garry
  Cc: damien.lemoal, joro, will, jejb, martin.petersen, hch,
	m.szyprowski, robin.murphy, linux-doc, linux-kernel, linux-ide,
	iommu, iommu, linux-scsi, linuxarm

On Wed, Jul 06, 2022 at 02:40:44PM +0100, John Garry wrote:
> On 30/06/2022 13:08, John Garry wrote:
>
> Hi Christoph,
>
> Can you please consider picking up this series? A few things to note 
> beforehand:
>
> - I changed to only apply the mapping limit to SAS hosts in this version. I 
> would need a fresh ack from Martin for those SCSI parts, but wanted to make 
> sure you were ok with it.

Yes, I've mostly been waiting for an ACK from Martin.

> - Damien had some doubt on updating the shost max_sectors as opposed to the 
> per-request queue default, but I think he's ok with it - see patch 4/5

I'm fine either way.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 0/5] DMA mapping changes for SCSI core
  2022-07-06 13:44   ` Christoph Hellwig
@ 2022-07-07 20:35     ` Martin K. Petersen
  2022-07-08 16:17       ` John Garry
  0 siblings, 1 reply; 20+ messages in thread
From: Martin K. Petersen @ 2022-07-07 20:35 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: John Garry, damien.lemoal, joro, will, jejb, martin.petersen,
	m.szyprowski, robin.murphy, linux-doc, linux-kernel, linux-ide,
	iommu, iommu, linux-scsi, linuxarm


Christoph,

> Yes, I've mostly been waiting for an ACK from Martin.

Sorry, I'm on vacation this week. The series looks OK to me although I
do agree that it would be great if the max was reflected in the queue's
hard limit and opt in the soft limit.

Acked-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 0/5] DMA mapping changes for SCSI core
  2022-07-07 20:35     ` Martin K. Petersen
@ 2022-07-08 16:17       ` John Garry
  2022-07-10 23:08         ` Damien Le Moal
  2022-07-14  3:10         ` Martin K. Petersen
  0 siblings, 2 replies; 20+ messages in thread
From: John Garry @ 2022-07-08 16:17 UTC (permalink / raw)
  To: Martin K. Petersen, Christoph Hellwig
  Cc: damien.lemoal, joro, will, jejb, m.szyprowski, robin.murphy,
	linux-doc, linux-kernel, linux-ide, iommu, iommu, linux-scsi,
	linuxarm

On 07/07/2022 21:35, Martin K. Petersen wrote:
> Christoph,
> 
>> Yes, I've mostly been waiting for an ACK from Martin.
> Sorry, I'm on vacation this week. The series looks OK to me although I
> do agree that it would be great if the max was reflected in the queue's
> hard limit and opt in the soft limit.

Ah, I think that I misunderstood Damien's question. I thought he was 
asking why not keep shost max_sectors at dma_max_mapping_size() and then 
init each sdev request queue max hw sectors at dma_opt_mapping_size().

But he seems that you want to know why not have the request queue max 
sectors at dma_opt_mapping_size(). The answer is related to meaning of 
dma_opt_mapping_size(). If we get any mappings which exceed this size 
then it can have a big dma mapping performance hit. So I set max hw 
sectors at this ‘opt’ mapping size to ensure that we get no mappings 
which exceed this size. Indeed, I think max sectors is 128Kb today for 
my host, which would be same as dma_opt_mapping_size() value with an 
IOMMU enabled. And I find that only a small % of request size may exceed 
this 128kb size, but it still has a big performance impact.

> 
> Acked-by: Martin K. Petersen<martin.petersen@oracle.com>

Thanks,
John

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 0/5] DMA mapping changes for SCSI core
  2022-07-08 16:17       ` John Garry
@ 2022-07-10 23:08         ` Damien Le Moal
  2022-07-11  7:36           ` John Garry
  2022-07-14  3:10         ` Martin K. Petersen
  1 sibling, 1 reply; 20+ messages in thread
From: Damien Le Moal @ 2022-07-10 23:08 UTC (permalink / raw)
  To: John Garry, Martin K. Petersen, Christoph Hellwig
  Cc: joro, will, jejb, m.szyprowski, robin.murphy, linux-doc,
	linux-kernel, linux-ide, iommu, iommu, linux-scsi, linuxarm

On 7/9/22 01:17, John Garry wrote:
> On 07/07/2022 21:35, Martin K. Petersen wrote:
>> Christoph,
>>
>>> Yes, I've mostly been waiting for an ACK from Martin.
>> Sorry, I'm on vacation this week. The series looks OK to me although I
>> do agree that it would be great if the max was reflected in the queue's
>> hard limit and opt in the soft limit.
> 
> Ah, I think that I misunderstood Damien's question. I thought he was 
> asking why not keep shost max_sectors at dma_max_mapping_size() and then 
> init each sdev request queue max hw sectors at dma_opt_mapping_size().

I was suggesting the reverse :) Keep the device hard limit
(max_hw_sectors) to the max dma mapping and set the soft limit
(max_sectors) to the optimal dma mapping size.

> 
> But he seems that you want to know why not have the request queue max 
> sectors at dma_opt_mapping_size(). The answer is related to meaning of 
> dma_opt_mapping_size(). If we get any mappings which exceed this size 
> then it can have a big dma mapping performance hit. So I set max hw 
> sectors at this ‘opt’ mapping size to ensure that we get no mappings 
> which exceed this size. Indeed, I think max sectors is 128Kb today for 
> my host, which would be same as dma_opt_mapping_size() value with an 
> IOMMU enabled. And I find that only a small % of request size may exceed 
> this 128kb size, but it still has a big performance impact.
> 
>>
>> Acked-by: Martin K. Petersen<martin.petersen@oracle.com>
> 
> Thanks,
> John


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 0/5] DMA mapping changes for SCSI core
  2022-07-10 23:08         ` Damien Le Moal
@ 2022-07-11  7:36           ` John Garry
  2022-07-11 10:40             ` Damien Le Moal
  0 siblings, 1 reply; 20+ messages in thread
From: John Garry @ 2022-07-11  7:36 UTC (permalink / raw)
  To: Damien Le Moal, Martin K. Petersen, Christoph Hellwig
  Cc: joro, will, jejb, m.szyprowski, robin.murphy, linux-doc,
	linux-kernel, linux-ide, iommu, iommu, linux-scsi, linuxarm

On 11/07/2022 00:08, Damien Le Moal wrote:
>> Ah, I think that I misunderstood Damien's question. I thought he was
>> asking why not keep shost max_sectors at dma_max_mapping_size() and then
>> init each sdev request queue max hw sectors at dma_opt_mapping_size().
> I was suggesting the reverse:)  Keep the device hard limit
> (max_hw_sectors) to the max dma mapping and set the soft limit
> (max_sectors) to the optimal dma mapping size.

Sure, but as I mentioned below, I only see a small % of requests whose 
mapping size exceeds max_sectors but that still causes a big performance 
hit. So that is why I want to set the hard limit as the optimal dma 
mapping size.

Indeed, the IOMMU IOVA caching limit is already the same as default 
max_sectors for the disks in my system - 128Kb for 4k page size.

> 
>> But he seems that you want to know why not have the request queue max
>> sectors at dma_opt_mapping_size(). The answer is related to meaning of
>> dma_opt_mapping_size(). If we get any mappings which exceed this size
>> then it can have a big dma mapping performance hit. So I set max hw
>> sectors at this ‘opt’ mapping size to ensure that we get no mappings
>> which exceed this size. Indeed, I think max sectors is 128Kb today for
>> my host, which would be same as dma_opt_mapping_size() value with an
>> IOMMU enabled. And I find that only a small % of request size may exceed
>> this 128kb size, but it still has a big performance impact.
>>

Thanks,
John

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 0/5] DMA mapping changes for SCSI core
  2022-07-11  7:36           ` John Garry
@ 2022-07-11 10:40             ` Damien Le Moal
  2022-07-11 14:49               ` John Garry
  0 siblings, 1 reply; 20+ messages in thread
From: Damien Le Moal @ 2022-07-11 10:40 UTC (permalink / raw)
  To: John Garry, Martin K. Petersen, Christoph Hellwig
  Cc: joro, will, jejb, m.szyprowski, robin.murphy, linux-doc,
	linux-kernel, linux-ide, iommu, iommu, linux-scsi, linuxarm

On 7/11/22 16:36, John Garry wrote:
> On 11/07/2022 00:08, Damien Le Moal wrote:
>>> Ah, I think that I misunderstood Damien's question. I thought he was
>>> asking why not keep shost max_sectors at dma_max_mapping_size() and then
>>> init each sdev request queue max hw sectors at dma_opt_mapping_size().
>> I was suggesting the reverse:)  Keep the device hard limit
>> (max_hw_sectors) to the max dma mapping and set the soft limit
>> (max_sectors) to the optimal dma mapping size.
> 
> Sure, but as I mentioned below, I only see a small % of requests whose 
> mapping size exceeds max_sectors but that still causes a big performance 
> hit. So that is why I want to set the hard limit as the optimal dma 
> mapping size.

How can you possibly end-up with requests larger than max_sectors ? BIO
split is done using this limit, right ? Or is it that request merging is
allowed up to max_hw_sectors even if the resulting request size exceeds
max_sectors ?

> 
> Indeed, the IOMMU IOVA caching limit is already the same as default 
> max_sectors for the disks in my system - 128Kb for 4k page size.
> 
>>
>>> But he seems that you want to know why not have the request queue max
>>> sectors at dma_opt_mapping_size(). The answer is related to meaning of
>>> dma_opt_mapping_size(). If we get any mappings which exceed this size
>>> then it can have a big dma mapping performance hit. So I set max hw
>>> sectors at this ‘opt’ mapping size to ensure that we get no mappings
>>> which exceed this size. Indeed, I think max sectors is 128Kb today for
>>> my host, which would be same as dma_opt_mapping_size() value with an
>>> IOMMU enabled. And I find that only a small % of request size may exceed
>>> this 128kb size, but it still has a big performance impact.
>>>
> 
> Thanks,
> John


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 0/5] DMA mapping changes for SCSI core
  2022-07-11 10:40             ` Damien Le Moal
@ 2022-07-11 14:49               ` John Garry
  0 siblings, 0 replies; 20+ messages in thread
From: John Garry @ 2022-07-11 14:49 UTC (permalink / raw)
  To: Damien Le Moal, Martin K. Petersen, Christoph Hellwig
  Cc: joro, will, jejb, m.szyprowski, robin.murphy, linux-doc,
	linux-kernel, linux-ide, iommu, iommu, linux-scsi, linuxarm

On 11/07/2022 11:40, Damien Le Moal wrote:
> On 7/11/22 16:36, John Garry wrote:
>> On 11/07/2022 00:08, Damien Le Moal wrote:
>>>> Ah, I think that I misunderstood Damien's question. I thought he was
>>>> asking why not keep shost max_sectors at dma_max_mapping_size() and then
>>>> init each sdev request queue max hw sectors at dma_opt_mapping_size().
>>> I was suggesting the reverse:)  Keep the device hard limit
>>> (max_hw_sectors) to the max dma mapping and set the soft limit
>>> (max_sectors) to the optimal dma mapping size.
>>
>> Sure, but as I mentioned below, I only see a small % of requests whose
>> mapping size exceeds max_sectors but that still causes a big performance
>> hit. So that is why I want to set the hard limit as the optimal dma
>> mapping size.
> 
> How can you possibly end-up with requests larger than max_sectors ? BIO
> split is done using this limit, right ? Or is it that request merging is
> allowed up to max_hw_sectors even if the resulting request size exceeds
> max_sectors ?
> 

Ah, I see how I thought that I was seeing requests whose size exceeded 
max_sectors. Somebody must have changed a single disk in my system and 
this odd disk has a higher default max_sectors_kb -- 512 vs 128 for the 
rest.

So ignoring my nonesence that I was seeing oversize requests, as for the 
idea to set default max_sectors at dma_opt_mapping_size(), I see some 
issues:
- for SAS disks I have no common point to impose this limit. Maybe in 
the slave configure callback, but each SAS driver has its own 
implementation generally
- Even if we do config in slave_configure callback the max_sectors value 
is overwritten later in sd_revalidate_disk().

This following change could sort the issue though:

---8<----
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 895b56c8f25e..bb49bea3d161 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3214,6 +3214,8 @@ static int sd_revalidate_disk(struct gendisk *disk)
         sector_t old_capacity = sdkp->capacity;
         unsigned char *buffer;
         unsigned int dev_max, rw_max;
+       struct Scsi_Host *host = sdp->host;
+       struct device *dev = host->dma_dev;

         SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp,
                                       "sd_revalidate_disk\n"));
@@ -3296,8 +3298,13 @@ static int sd_revalidate_disk(struct gendisk *disk)
                                       (sector_t)BLK_DEF_MAX_SECTORS);
         }

-       /* Do not exceed controller limit */
-       rw_max = min(rw_max, queue_max_hw_sectors(q));
+       if (dev->dma_mask) {
+               /* Do not exceed dma optimal limit */
+               rw_max = min_t(unsigned int, rw_max,
+                               dma_opt_mapping_size(dev) >> SECTOR_SHIFT);
+       } else {
+               rw_max = min(rw_max, queue_max_hw_sectors(q));
+       }

         /*
          * Only update max_sectors if previously unset or if the 
current value

--->8---

Or I could go with the method in this series, which is not preferred. 
Let me know what you think.

Thanks,
John


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 0/5] DMA mapping changes for SCSI core
  2022-07-08 16:17       ` John Garry
  2022-07-10 23:08         ` Damien Le Moal
@ 2022-07-14  3:10         ` Martin K. Petersen
  2022-07-14  7:52           ` John Garry
  1 sibling, 1 reply; 20+ messages in thread
From: Martin K. Petersen @ 2022-07-14  3:10 UTC (permalink / raw)
  To: John Garry
  Cc: Martin K. Petersen, Christoph Hellwig, damien.lemoal, joro, will,
	jejb, m.szyprowski, robin.murphy, linux-doc, linux-kernel,
	linux-ide, iommu, iommu, linux-scsi, linuxarm


John,

> So I set max hw sectors at this ‘opt’ mapping size to ensure that we
> get no mappings which exceed this size. Indeed, I think max sectors is
> 128Kb today for my host, which would be same as dma_opt_mapping_size()
> value with an IOMMU enabled. And I find that only a small % of request
> size may exceed this 128kb size, but it still has a big performance
> impact.

The purpose of the soft limit is to pick the appropriate I/O size
(i.e. for best performance). The purpose of the hard limit is to ensure
we don't submit something the hardware can't handle or describe.

IOW, the hard limit is not about performance at all. The hard limit is
mainly relevant for things that are way bigger than anything we'd issue
as regular filesystem I/O such as multi-megabyte firmware images, etc.

It's perfectly fine for firmware download performance to be
"suboptimal". What is typically more important in that scenario is that
the firmware image makes it inside a single I/O.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 0/5] DMA mapping changes for SCSI core
  2022-07-14  3:10         ` Martin K. Petersen
@ 2022-07-14  7:52           ` John Garry
  0 siblings, 0 replies; 20+ messages in thread
From: John Garry @ 2022-07-14  7:52 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, damien.lemoal, joro, will, jejb, m.szyprowski,
	robin.murphy, linux-doc, linux-kernel, linux-ide, iommu, iommu,
	linux-scsi, linuxarm

On 14/07/2022 04:10, Martin K. Petersen wrote:

Hi Martin,

>> So I set max hw sectors at this ‘opt’ mapping size to ensure that we
>> get no mappings which exceed this size. Indeed, I think max sectors is
>> 128Kb today for my host, which would be same as dma_opt_mapping_size()
>> value with an IOMMU enabled. And I find that only a small % of request
>> size may exceed this 128kb size, but it still has a big performance
>> impact.
> The purpose of the soft limit is to pick the appropriate I/O size
> (i.e. for best performance). The purpose of the hard limit is to ensure
> we don't submit something the hardware can't handle or describe.
> 
> IOW, the hard limit is not about performance at all. The hard limit is
> mainly relevant for things that are way bigger than anything we'd issue
> as regular filesystem I/O such as multi-megabyte firmware images, etc.
> 
> It's perfectly fine for firmware download performance to be
> "suboptimal". What is typically more important in that scenario is that
> the firmware image makes it inside a single I/O.

OK, fine. I've improved the next version such that the DMA mapping opt 
limit only affects the max_sectors default.

Thanks,
John

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2022-07-14  9:12 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-30 12:08 [PATCH v5 0/5] DMA mapping changes for SCSI core John Garry
2022-06-30 12:08 ` [PATCH v5 1/5] dma-mapping: Add dma_opt_mapping_size() John Garry
2022-06-30 12:08 ` [PATCH v5 2/5] dma-iommu: Add iommu_dma_opt_mapping_size() John Garry
2022-06-30 12:08 ` [PATCH v5 3/5] scsi: core: Cap shost max_sectors according to DMA limits only once John Garry
2022-06-30 23:41   ` Damien Le Moal
2022-07-01  8:02     ` John Garry
2022-06-30 12:08 ` [PATCH v5 4/5] scsi: scsi_transport_sas: Cap shost max_sectors according to DMA optimal limit John Garry
2022-06-30 23:49   ` Damien Le Moal
2022-07-01  8:46     ` John Garry
2022-06-30 12:08 ` [PATCH v5 5/5] ata: libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors John Garry
2022-07-06 13:40 ` [PATCH v5 0/5] DMA mapping changes for SCSI core John Garry
2022-07-06 13:44   ` Christoph Hellwig
2022-07-07 20:35     ` Martin K. Petersen
2022-07-08 16:17       ` John Garry
2022-07-10 23:08         ` Damien Le Moal
2022-07-11  7:36           ` John Garry
2022-07-11 10:40             ` Damien Le Moal
2022-07-11 14:49               ` John Garry
2022-07-14  3:10         ` Martin K. Petersen
2022-07-14  7:52           ` John Garry

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).