iommu.lists.linux-foundation.org archive mirror
* [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance
@ 2019-06-20  8:50 Yoshihiro Shimoda
  2019-06-20  8:50 ` [RFC PATCH v7 1/5] dma: Introduce dma_get_merge_boundary() Yoshihiro Shimoda
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Yoshihiro Shimoda @ 2019-06-20  8:50 UTC
  To: ulf.hansson, hch, m.szyprowski, robin.murphy, joro, axboe
  Cc: linux-renesas-soc, linux-mmc, linux-block, wsa+renesas, iommu

This patch series is based on iommu.git / next branch.

Since the SDHI host's internal DMAC on R-Car Gen3 cannot handle two or
more segments, the performance (especially eMMC HS400 reading) is not
good. However, if the IOMMU is enabled for the DMAC, the IOMMU maps
multiple scatter-gather buffers as one contiguous iova, so the DMAC can
handle the whole iova range and the performance can improve. In fact, I
measured the performance using bonnie++, and the "Sequential Input -
block" rate improved on r8a7795.

To achieve this, this patch series first modifies the IOMMU and block
subsystems. Since I'd like feedback from each subsystem on whether this
approach is acceptable for upstream, I am submitting it treewide as an RFC.
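
The merging condition described above can be illustrated outside the
kernel. The following is a minimal user-space sketch, not code from this
series: struct seg and segs_mergeable() are hypothetical names, and the
rule it checks (every internal segment boundary must fall on the IOMMU
page boundary) is an assumption about when a list of buffers can appear
as one contiguous iova.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatter-gather entry (not struct scatterlist). */
struct seg {
	uint64_t addr;	/* start address of the buffer */
	uint64_t len;	/* length in bytes */
};

/*
 * Return true if all segments could be placed back to back in one iova
 * range when the IOMMU maps in units of (boundary_mask + 1) bytes.
 */
static bool segs_mergeable(const struct seg *segs, size_t n,
			   uint64_t boundary_mask)
{
	/* The mask must be a power of two minus one, e.g. 0xfff for 4 KiB. */
	assert((boundary_mask & (boundary_mask + 1)) == 0);

	for (size_t i = 0; i < n; i++) {
		/* Every segment except the first must start on a boundary. */
		if (i > 0 && (segs[i].addr & boundary_mask))
			return false;
		/* Every segment except the last must end on a boundary. */
		if (i + 1 < n && ((segs[i].addr + segs[i].len) & boundary_mask))
			return false;
	}
	return true;
}
```

With a 4 KiB boundary mask (0xfff), buffers whose internal ends and
starts sit on page boundaries pass the check, while a buffer that ends
in the middle of a page and is followed by another buffer does not.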

Changes from v6:
 - [1/5 for DMA MAP] A new patch.
 - [2/5 for IOMMU] A new patch.
 - [3/5 for BLOCK] Add Reviewed-by.
 - [4/5 for BLOCK] Use a new DMA MAP API instead of device_iommu_mapped().
 - [5/5 for MMC] Likewise, and some minor fix.
 - Remove patch 4/5 of v6 from this v7 patch series.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=131769

Changes from v5:
 - Almost all patches are new code.
 - [4/5 for MMC] This is a refactor patch so that I don't add any
   {Tested,Reviewed}-by tags.
 - [5/5 for MMC] Modify MMC subsystem to use bigger segments instead of
   the renesas_sdhi driver.
 - [5/5 for MMC] Use BLK_MAX_SEGMENTS (128) instead of local value
   SDHI_MAX_SEGS_IN_IOMMU (512). Even if we use BLK_MAX_SEGMENTS,
   the performance is still good.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=127511

Changes from v4:
 - [DMA MAPPING] Add a new device_dma_parameters for iova contiguous.
 - [IOMMU] Add a new capable for "merging" segments.
 - [IOMMU] Add a capable ops into the ipmmu-vmsa driver.
 - [MMC] Sort headers in renesas_sdhi_core.c.
 - [MMC] Remove the following code added in v3, which is now handled by
	 the DMA MAPPING and IOMMU subsystems:
 -- Check if R-Car Gen3 IPMMU is used or not on patch 3.
 -- Check if all multiple segment buffers are aligned to PAGE_SIZE on patch 3.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=125593

Changes from v3:
 - Use a helper function device_iommu_mapped on patch 1 and 3.
 - Check if R-Car Gen3 IPMMU is used or not on patch 3.
 - Check if all multiple segment buffers are aligned to PAGE_SIZE on patch 3.
 - Add Reviewed-by Wolfram-san on patch 1 and 2. Note that I also got his
   Reviewed-by on patch 3, but I changed it from v2. So, I didn't add
   his Reviewed-by at this time.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=120985

Changes from v2:
 - Add some conditions in the init_card().
 - Add a comment in the init_card().
 - Add definitions for some "MAX_SEGS".
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=116729

Changes from v1:
 - Remove adding init_card ops into struct tmio_mmc_dma_ops and
   tmio_mmc_host and just set init_card on renesas_sdhi_core.c.
 - Revise typos on "mmc: tmio: No memory size limitation if runs on IOMMU".
 - Add Simon-san's Reviewed-by on a tmio patch.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=110485

Yoshihiro Shimoda (5):
  dma: Introduce dma_get_merge_boundary()
  iommu/dma: Add a new dma_map_ops of get_merge_boundary()
  block: sort headers on blk-setting.c
  block: add a helper function to merge the segments
  mmc: queue: Use bigger segments if DMA MAP layer can merge the
    segments

 Documentation/DMA-API.txt   |  8 ++++++++
 block/blk-settings.c        | 34 ++++++++++++++++++++++++++++------
 drivers/iommu/dma-iommu.c   | 11 +++++++++++
 drivers/mmc/core/queue.c    | 35 ++++++++++++++++++++++++++++++++---
 include/linux/blkdev.h      |  2 ++
 include/linux/dma-mapping.h |  6 ++++++
 include/linux/mmc/host.h    |  1 +
 kernel/dma/mapping.c        | 11 +++++++++++
 8 files changed, 99 insertions(+), 9 deletions(-)

-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

* [RFC PATCH v7 1/5] dma: Introduce dma_get_merge_boundary()
  2019-06-20  8:50 [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance Yoshihiro Shimoda
@ 2019-06-20  8:50 ` Yoshihiro Shimoda
  2019-06-24  6:20   ` Christoph Hellwig
  2019-06-20  8:50 ` [RFC PATCH v7 2/5] iommu/dma: Add a new dma_map_ops of get_merge_boundary() Yoshihiro Shimoda
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Yoshihiro Shimoda @ 2019-06-20  8:50 UTC
  To: ulf.hansson, hch, m.szyprowski, robin.murphy, joro, axboe
  Cc: linux-renesas-soc, linux-mmc, linux-block, wsa+renesas, iommu

This patch adds a new DMA API "dma_get_merge_boundary". This function
returns the DMA merge boundary if the DMA layer can merge the segments.
This patch also adds the implementation for a new dma_map_ops pointer.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 Documentation/DMA-API.txt   |  8 ++++++++
 include/linux/dma-mapping.h |  6 ++++++
 kernel/dma/mapping.c        | 11 +++++++++++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
index 0076150..11a2647 100644
--- a/Documentation/DMA-API.txt
+++ b/Documentation/DMA-API.txt
@@ -204,6 +204,14 @@ Returns the maximum size of a mapping for the device. The size parameter
 of the mapping functions like dma_map_single(), dma_map_page() and
 others should not be larger than the returned value.
 
+::
+
+	unsigned long
+	dma_get_merge_boundary(struct device *dev);
+
+Returns the DMA merge boundary. If the device cannot merge any DMA address
+segments, the function returns 0.
+
 Part Id - Streaming DMA mappings
 --------------------------------
 
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 6309a72..e81e076 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -131,6 +131,7 @@ struct dma_map_ops {
 	int (*dma_supported)(struct device *dev, u64 mask);
 	u64 (*get_required_mask)(struct device *dev);
 	size_t (*max_mapping_size)(struct device *dev);
+	unsigned long (*get_merge_boundary)(struct device *dev);
 };
 
 #define DMA_MAPPING_ERROR		(~(dma_addr_t)0)
@@ -467,6 +468,7 @@ int dma_set_mask(struct device *dev, u64 mask);
 int dma_set_coherent_mask(struct device *dev, u64 mask);
 u64 dma_get_required_mask(struct device *dev);
 size_t dma_max_mapping_size(struct device *dev);
+unsigned long dma_get_merge_boundary(struct device *dev);
 #else /* CONFIG_HAS_DMA */
 static inline dma_addr_t dma_map_page_attrs(struct device *dev,
 		struct page *page, size_t offset, size_t size,
@@ -572,6 +574,10 @@ static inline size_t dma_max_mapping_size(struct device *dev)
 {
 	return 0;
 }
+static inline unsigned long dma_get_merge_boundary(struct device *dev)
+{
+	return 0;
+}
 #endif /* CONFIG_HAS_DMA */
 
 static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index f7afdad..8e262cf 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -367,3 +367,14 @@ size_t dma_max_mapping_size(struct device *dev)
 	return size;
 }
 EXPORT_SYMBOL_GPL(dma_max_mapping_size);
+
+unsigned long dma_get_merge_boundary(struct device *dev)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	if (!ops || !ops->get_merge_boundary)
+		return 0;	/* can't merge */
+
+	return ops->get_merge_boundary(dev);
+}
+EXPORT_SYMBOL_GPL(dma_get_merge_boundary);
-- 
2.7.4
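
The fallback in dma_get_merge_boundary() above can be exercised in user
space with a reduced ops table. This is only a sketch under stated
assumptions: struct dma_map_ops_sketch, get_merge_boundary_sketch() and
fixed_4k_boundary() are hypothetical stand-ins, and the assert on the
returned mask is an extra sanity check that the patch itself does not
perform.

```c
#include <assert.h>
#include <stddef.h>

struct device;	/* opaque in this sketch */

/* Reduced, hypothetical ops table holding only the callback discussed here. */
struct dma_map_ops_sketch {
	unsigned long (*get_merge_boundary)(struct device *dev);
};

/* Same fallback as the patch: no ops or no callback means "cannot merge". */
static unsigned long
get_merge_boundary_sketch(const struct dma_map_ops_sketch *ops,
			  struct device *dev)
{
	unsigned long mask;

	if (!ops || !ops->get_merge_boundary)
		return 0;

	mask = ops->get_merge_boundary(dev);
	/* Sketch-only check: a boundary mask is a power of two minus one. */
	assert((mask & (mask + 1)) == 0);
	return mask;
}

/* Example backend that always reports a 4 KiB boundary mask. */
static unsigned long fixed_4k_boundary(struct device *dev)
{
	(void)dev;
	return 0xfffUL;
}
```

A caller only needs to compare the result with 0: a backend that does
not set the callback behaves exactly like one that cannot merge.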

* [RFC PATCH v7 2/5] iommu/dma: Add a new dma_map_ops of get_merge_boundary()
  2019-06-20  8:50 [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance Yoshihiro Shimoda
  2019-06-20  8:50 ` [RFC PATCH v7 1/5] dma: Introduce dma_get_merge_boundary() Yoshihiro Shimoda
@ 2019-06-20  8:50 ` Yoshihiro Shimoda
  2019-06-21  7:59   ` Marek Szyprowski
  2019-06-20  8:50 ` [RFC PATCH v7 3/5] block: sort headers on blk-setting.c Yoshihiro Shimoda
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Yoshihiro Shimoda @ 2019-06-20  8:50 UTC
  To: ulf.hansson, hch, m.szyprowski, robin.murphy, joro, axboe
  Cc: linux-renesas-soc, linux-mmc, linux-block, wsa+renesas, iommu

This patch adds a new dma_map_ops of get_merge_boundary() to
expose the DMA merge boundary if the domain type is IOMMU_DOMAIN_DMA.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 drivers/iommu/dma-iommu.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 205d694..9950cb5 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1091,6 +1091,16 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
 	return ret;
 }
 
+static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+
+	if (domain->type != IOMMU_DOMAIN_DMA)
+		return 0;	/* can't merge */
+
+	return (1 << __ffs(domain->pgsize_bitmap)) - 1;
+}
+
 static const struct dma_map_ops iommu_dma_ops = {
 	.alloc			= iommu_dma_alloc,
 	.free			= iommu_dma_free,
@@ -1106,6 +1116,7 @@ static const struct dma_map_ops iommu_dma_ops = {
 	.sync_sg_for_device	= iommu_dma_sync_sg_for_device,
 	.map_resource		= iommu_dma_map_resource,
 	.unmap_resource		= iommu_dma_unmap_resource,
+	.get_merge_boundary	= iommu_dma_get_merge_boundary,
 };
 
 /*
-- 
2.7.4

* [RFC PATCH v7 3/5] block: sort headers on blk-setting.c
  2019-06-20  8:50 [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance Yoshihiro Shimoda
  2019-06-20  8:50 ` [RFC PATCH v7 1/5] dma: Introduce dma_get_merge_boundary() Yoshihiro Shimoda
  2019-06-20  8:50 ` [RFC PATCH v7 2/5] iommu/dma: Add a new dma_map_ops of get_merge_boundary() Yoshihiro Shimoda
@ 2019-06-20  8:50 ` Yoshihiro Shimoda
  2019-06-20  8:50 ` [RFC PATCH v7 4/5] block: add a helper function to merge the segments Yoshihiro Shimoda
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Yoshihiro Shimoda @ 2019-06-20  8:50 UTC
  To: ulf.hansson, hch, m.szyprowski, robin.murphy, joro, axboe
  Cc: linux-renesas-soc, linux-mmc, linux-block, wsa+renesas, iommu

This patch sorts the headers in alphabetic order to ease
the maintenance for this part.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---
 block/blk-settings.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 2ae348c..45f2c52 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -2,16 +2,16 @@
 /*
  * Functions related to setting various queue properties from drivers
  */
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/init.h>
 #include <linux/bio.h>
 #include <linux/blkdev.h>
-#include <linux/memblock.h>	/* for max_pfn/max_low_pfn */
 #include <linux/gcd.h>
-#include <linux/lcm.h>
-#include <linux/jiffies.h>
 #include <linux/gfp.h>
+#include <linux/init.h>
+#include <linux/jiffies.h>
+#include <linux/kernel.h>
+#include <linux/lcm.h>
+#include <linux/memblock.h>     /* for max_pfn/max_low_pfn */
+#include <linux/module.h>
 
 #include "blk.h"
 #include "blk-wbt.h"
-- 
2.7.4

* [RFC PATCH v7 4/5] block: add a helper function to merge the segments
  2019-06-20  8:50 [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance Yoshihiro Shimoda
                   ` (2 preceding siblings ...)
  2019-06-20  8:50 ` [RFC PATCH v7 3/5] block: sort headers on blk-setting.c Yoshihiro Shimoda
@ 2019-06-20  8:50 ` Yoshihiro Shimoda
  2019-06-24  6:22   ` Christoph Hellwig
  2019-06-20  8:50 ` [RFC PATCH v7 5/5] mmc: queue: Use bigger segments if DMA MAP layer can " Yoshihiro Shimoda
  2019-07-01  8:32 ` [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance Christoph Hellwig
  5 siblings, 1 reply; 15+ messages in thread
From: Yoshihiro Shimoda @ 2019-06-20  8:50 UTC
  To: ulf.hansson, hch, m.szyprowski, robin.murphy, joro, axboe
  Cc: linux-renesas-soc, linux-mmc, linux-block, wsa+renesas, iommu

This patch adds a helper function that checks whether the segments of
a queue can be merged by the DMA MAP layer (e.g. via IOMMU).

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 block/blk-settings.c   | 22 ++++++++++++++++++++++
 include/linux/blkdev.h |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 45f2c52..6a78ea0 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -4,6 +4,7 @@
  */
 #include <linux/bio.h>
 #include <linux/blkdev.h>
+#include <linux/dma-mapping.h>
 #include <linux/gcd.h>
 #include <linux/gfp.h>
 #include <linux/init.h>
@@ -831,6 +832,27 @@ void blk_queue_write_cache(struct request_queue *q, bool wc, bool fua)
 }
 EXPORT_SYMBOL_GPL(blk_queue_write_cache);
 
+/**
+ * blk_queue_can_use_dma_map_merging - configure queue for merging segments.
+ * @q:		the request queue for the device
+ * @dev:	the device pointer for dma
+ *
+ * Tell the block layer about merging the segments by dma map of @q.
+ */
+bool blk_queue_can_use_dma_map_merging(struct request_queue *q,
+				       struct device *dev)
+{
+	unsigned long boundary = dma_get_merge_boundary(dev);
+
+	if (!boundary)
+		return false;
+
+	/* No need to update max_segment_size. see blk_queue_virt_boundary() */
+	blk_queue_virt_boundary(q, boundary);
+
+	return true;
+}
+
 static int __init blk_settings_init(void)
 {
 	blk_max_low_pfn = max_low_pfn - 1;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 592669b..a7a839d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1091,6 +1091,8 @@ extern void blk_queue_dma_alignment(struct request_queue *, int);
 extern void blk_queue_update_dma_alignment(struct request_queue *, int);
 extern void blk_queue_rq_timeout(struct request_queue *, unsigned int);
 extern void blk_queue_write_cache(struct request_queue *q, bool enabled, bool fua);
+extern bool blk_queue_can_use_dma_map_merging(struct request_queue *q,
+					      struct device *dev);
 
 /*
  * Number of physical segments as sent to the device.
-- 
2.7.4

* [RFC PATCH v7 5/5] mmc: queue: Use bigger segments if DMA MAP layer can merge the segments
  2019-06-20  8:50 [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance Yoshihiro Shimoda
                   ` (3 preceding siblings ...)
  2019-06-20  8:50 ` [RFC PATCH v7 4/5] block: add a helper function to merge the segments Yoshihiro Shimoda
@ 2019-06-20  8:50 ` Yoshihiro Shimoda
  2019-06-24  6:24   ` Christoph Hellwig
  2019-07-01  8:32 ` [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance Christoph Hellwig
  5 siblings, 1 reply; 15+ messages in thread
From: Yoshihiro Shimoda @ 2019-06-20  8:50 UTC
  To: ulf.hansson, hch, m.szyprowski, robin.murphy, joro, axboe
  Cc: linux-renesas-soc, linux-mmc, linux-block, wsa+renesas, iommu

When the max_segs of a mmc host is smaller than 512, the mmc
subsystem tries to use 512 segments if DMA MAP layer can merge
the segments, and then the mmc subsystem exposes such information
to the block layer by using blk_queue_can_use_dma_map_merging().

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 drivers/mmc/core/queue.c | 35 ++++++++++++++++++++++++++++++++---
 include/linux/mmc/host.h |  1 +
 2 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 92900a0..ab0ecc6 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -24,6 +24,8 @@
 #include "card.h"
 #include "host.h"
 
+#define MMC_DMA_MAP_MERGE_SEGMENTS	512
+
 static inline bool mmc_cqe_dcmd_busy(struct mmc_queue *mq)
 {
 	/* Allow only 1 DCMD at a time */
@@ -196,6 +198,12 @@ static void mmc_queue_setup_discard(struct request_queue *q,
 		blk_queue_flag_set(QUEUE_FLAG_SECERASE, q);
 }
 
+static unsigned int mmc_get_max_segments(struct mmc_host *host)
+{
+	return host->can_dma_map_merge ? MMC_DMA_MAP_MERGE_SEGMENTS :
+					 host->max_segs;
+}
+
 /**
  * mmc_init_request() - initialize the MMC-specific per-request data
  * @q: the request queue
@@ -209,7 +217,7 @@ static int __mmc_init_request(struct mmc_queue *mq, struct request *req,
 	struct mmc_card *card = mq->card;
 	struct mmc_host *host = card->host;
 
-	mq_rq->sg = mmc_alloc_sg(host->max_segs, gfp);
+	mq_rq->sg = mmc_alloc_sg(mmc_get_max_segments(host), gfp);
 	if (!mq_rq->sg)
 		return -ENOMEM;
 
@@ -368,13 +376,23 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
 	blk_queue_bounce_limit(mq->queue, limit);
 	blk_queue_max_hw_sectors(mq->queue,
 		min(host->max_blk_count, host->max_req_size / 512));
-	blk_queue_max_segments(mq->queue, host->max_segs);
+	if (host->can_dma_map_merge)
+		WARN(!blk_queue_can_use_dma_map_merging(mq->queue,
+							mmc_dev(host)),
+		     "merging was advertised but not possible");
+	blk_queue_max_segments(mq->queue, mmc_get_max_segments(host));
 
 	if (mmc_card_mmc(card))
 		block_size = card->ext_csd.data_sector_size;
 
 	blk_queue_logical_block_size(mq->queue, block_size);
-	blk_queue_max_segment_size(mq->queue,
+	/*
+	 * After blk_queue_can_use_dma_map_merging() was called successfully,
+	 * since it calls blk_queue_virt_boundary(), the mmc should not also
+	 * call blk_queue_max_segment_size().
+	 */
+	if (!host->can_dma_map_merge)
+		blk_queue_max_segment_size(mq->queue,
 			round_down(host->max_seg_size, block_size));
 
 	dma_set_max_seg_size(mmc_dev(host), queue_max_segment_size(mq->queue));
@@ -424,6 +442,17 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card)
 	mq->tag_set.cmd_size = sizeof(struct mmc_queue_req);
 	mq->tag_set.driver_data = mq;
 
+	/*
+	 * Since blk_mq_alloc_tag_set() calls .init_request() of mmc_mq_ops,
+	 * the host->can_dma_map_merge should be set before to get max_segs
+	 * from mmc_get_max_segments().
+	 */
+	if (host->max_segs < MMC_DMA_MAP_MERGE_SEGMENTS &&
+	    dma_get_merge_boundary(mmc_dev(host)))
+		host->can_dma_map_merge = 1;
+	else
+		host->can_dma_map_merge = 0;
+
 	ret = blk_mq_alloc_tag_set(&mq->tag_set);
 	if (ret)
 		return ret;
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 43d0f0c..10c3719 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -398,6 +398,7 @@ struct mmc_host {
 	unsigned int		retune_now:1;	/* do re-tuning at next req */
 	unsigned int		retune_paused:1; /* re-tuning is temporarily disabled */
 	unsigned int		use_blk_mq:1;	/* use blk-mq */
+	unsigned int		can_dma_map_merge:1; /* merging can be used */
 
 	int			rescan_disable;	/* disable card detection */
 	int			rescan_entered;	/* used with nonremovable devices */
-- 
2.7.4
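
The decision this patch makes can be summarised in a few lines of
user-space C. This is a sketch, not the mmc code: struct host_sketch,
host_probe_merge(), host_max_segments() and MERGE_SEGMENTS are
hypothetical names, and the merge boundary is passed in as a plain
number instead of being queried from the DMA layer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MERGE_SEGMENTS 512u	/* same value as MMC_DMA_MAP_MERGE_SEGMENTS */

/* Hypothetical reduction of struct mmc_host to the two fields used here. */
struct host_sketch {
	unsigned int max_segs;
	bool can_dma_map_merge;
};

/* Merging is used only if the host limit is small and a boundary exists. */
static void host_probe_merge(struct host_sketch *h, unsigned long merge_boundary)
{
	assert(h != NULL);
	h->can_dma_map_merge = h->max_segs < MERGE_SEGMENTS &&
			       merge_boundary != 0;
}

/* Number of segments to allocate and to advertise to the block layer. */
static unsigned int host_max_segments(const struct host_sketch *h)
{
	if (h->can_dma_map_merge)
		return MERGE_SEGMENTS;
	return h->max_segs;
}
```

A host limited to one segment reports 512 segments when a non-zero
boundary is available and falls back to its own limit otherwise; a host
that already supports 512 or more segments is left unchanged.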

* Re: [RFC PATCH v7 2/5] iommu/dma: Add a new dma_map_ops of get_merge_boundary()
  2019-06-20  8:50 ` [RFC PATCH v7 2/5] iommu/dma: Add a new dma_map_ops of get_merge_boundary() Yoshihiro Shimoda
@ 2019-06-21  7:59   ` Marek Szyprowski
  2019-06-24  6:21     ` Christoph Hellwig
  0 siblings, 1 reply; 15+ messages in thread
From: Marek Szyprowski @ 2019-06-21  7:59 UTC
  To: Yoshihiro Shimoda, ulf.hansson, hch, robin.murphy, joro, axboe
  Cc: linux-block, wsa+renesas, iommu, linux-mmc, linux-renesas-soc

Hi,

On 2019-06-20 10:50, Yoshihiro Shimoda wrote:
> This patch adds a new dma_map_ops of get_merge_boundary() to
> expose the DMA merge boundary if the domain type is IOMMU_DOMAIN_DMA.
>
> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> ---
>   drivers/iommu/dma-iommu.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 205d694..9950cb5 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1091,6 +1091,16 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
>   	return ret;
>   }
>   
> +static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
> +{
> +	struct iommu_domain *domain = iommu_get_dma_domain(dev);
> +
> +	if (domain->type != IOMMU_DOMAIN_DMA)
> +		return 0;	/* can't merge */
> +
> +	return (1 << __ffs(domain->pgsize_bitmap)) - 1;
> +}

I really wonder if there is any IOMMU, which doesn't support 4KiB pages. 
Cannot you simply assume that the merge boundary is 4KiB and avoid 
adding this new API?

> +
>   static const struct dma_map_ops iommu_dma_ops = {
>   	.alloc			= iommu_dma_alloc,
>   	.free			= iommu_dma_free,
> @@ -1106,6 +1116,7 @@ static const struct dma_map_ops iommu_dma_ops = {
>   	.sync_sg_for_device	= iommu_dma_sync_sg_for_device,
>   	.map_resource		= iommu_dma_map_resource,
>   	.unmap_resource		= iommu_dma_unmap_resource,
> +	.get_merge_boundary	= iommu_dma_get_merge_boundary,
>   };
>   
>   /*

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

* Re: [RFC PATCH v7 1/5] dma: Introduce dma_get_merge_boundary()
  2019-06-20  8:50 ` [RFC PATCH v7 1/5] dma: Introduce dma_get_merge_boundary() Yoshihiro Shimoda
@ 2019-06-24  6:20   ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2019-06-24  6:20 UTC
  To: Yoshihiro Shimoda
  Cc: axboe, linux-renesas-soc, ulf.hansson, linux-mmc, linux-block,
	wsa+renesas, iommu, robin.murphy, hch

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [RFC PATCH v7 2/5] iommu/dma: Add a new dma_map_ops of get_merge_boundary()
  2019-06-21  7:59   ` Marek Szyprowski
@ 2019-06-24  6:21     ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2019-06-24  6:21 UTC
  To: Marek Szyprowski
  Cc: axboe, linux-renesas-soc, ulf.hansson, linux-mmc, hch,
	linux-block, wsa+renesas, iommu, robin.murphy

On Fri, Jun 21, 2019 at 09:59:21AM +0200, Marek Szyprowski wrote:
> Hi,
> 
> On 2019-06-20 10:50, Yoshihiro Shimoda wrote:
> > This patch adds a new dma_map_ops of get_merge_boundary() to
> > expose the DMA merge boundary if the domain type is IOMMU_DOMAIN_DMA.
> >
> > Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > ---
> >   drivers/iommu/dma-iommu.c | 11 +++++++++++
> >   1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index 205d694..9950cb5 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -1091,6 +1091,16 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
> >   	return ret;
> >   }
> >   
> > +static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
> > +{
> > +	struct iommu_domain *domain = iommu_get_dma_domain(dev);
> > +
> > +	if (domain->type != IOMMU_DOMAIN_DMA)
> > +		return 0;	/* can't merge */
> > +
> > +	return (1 << __ffs(domain->pgsize_bitmap)) - 1;
> > +}
> 
> I really wonder if there is any IOMMU, which doesn't support 4KiB pages. 
> Cannot you simply assume that the merge boundary is 4KiB and avoid 
> adding this new API?

No idea if we have one, but I would not be surprised if one shows
up on a system only built to run with 64k pages for example.

Either way the abstraction seems light and self-explanatory, so I see
no reason not to have it even if we assume it would always return
4k, especially as we'd also still need a flag at the dma_map_ops level
to indicate if segment merging is supported at all.
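
The value returned by iommu_dma_get_merge_boundary() depends only on the
smallest page size set in pgsize_bitmap, which is the point under
discussion here. A small user-space sketch, with
merge_boundary_from_pgsizes() as a hypothetical name, shows the result
for a 4 KiB and a 64 KiB minimum page size. It uses unsigned long
arithmetic throughout, which is a choice of the sketch rather than what
the quoted patch does.

```c
#include <assert.h>

/*
 * Boundary mask for the smallest page size set in pgsize_bitmap
 * (bit N set means pages of 2^N bytes are supported).
 */
static unsigned long merge_boundary_from_pgsizes(unsigned long pgsize_bitmap)
{
	unsigned long smallest;

	/* An empty bitmap has no smallest page size. */
	assert(pgsize_bitmap != 0);

	/* x & -x isolates the lowest set bit, i.e. 1UL << (index of that bit). */
	smallest = pgsize_bitmap & -pgsize_bitmap;
	return smallest - 1;
}
```

For a bitmap of 4 KiB, 2 MiB and 1 GiB pages the mask is 0xfff; for a
bitmap whose smallest page is 64 KiB it is 0xffff, so a hard-coded 4 KiB
boundary would be too small in that case.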

* Re: [RFC PATCH v7 4/5] block: add a helper function to merge the segments
  2019-06-20  8:50 ` [RFC PATCH v7 4/5] block: add a helper function to merge the segments Yoshihiro Shimoda
@ 2019-06-24  6:22   ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2019-06-24  6:22 UTC
  To: Yoshihiro Shimoda
  Cc: axboe, linux-renesas-soc, ulf.hansson, linux-mmc, linux-block,
	wsa+renesas, iommu, robin.murphy, hch

> +bool blk_queue_can_use_dma_map_merging(struct request_queue *q,
> +				       struct device *dev)
> +{
> +	unsigned long boundary = dma_get_merge_boundary(dev);
> +
> +	if (!boundary)
> +		return false;
> +
> +	/* No need to update max_segment_size. see blk_queue_virt_boundary() */
> +	blk_queue_virt_boundary(q, boundary);
> +
> +	return true;

I'd skip that empty line here, but that is way into nitpick territory.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [RFC PATCH v7 5/5] mmc: queue: Use bigger segments if DMA MAP layer can merge the segments
  2019-06-20  8:50 ` [RFC PATCH v7 5/5] mmc: queue: Use bigger segments if DMA MAP layer can " Yoshihiro Shimoda
@ 2019-06-24  6:24   ` Christoph Hellwig
  2019-07-08 11:45     ` Ulf Hansson
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2019-06-24  6:24 UTC
  To: Yoshihiro Shimoda
  Cc: axboe, linux-renesas-soc, ulf.hansson, linux-mmc, linux-block,
	wsa+renesas, iommu, robin.murphy, hch

On Thu, Jun 20, 2019 at 05:50:10PM +0900, Yoshihiro Shimoda wrote:
> When the max_segs of a mmc host is smaller than 512, the mmc
> subsystem tries to use 512 segments if DMA MAP layer can merge
> the segments, and then the mmc subsystem exposes such information
> to the block layer by using blk_queue_can_use_dma_map_merging().
> 
> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> ---
>  drivers/mmc/core/queue.c | 35 ++++++++++++++++++++++++++++++++---
>  include/linux/mmc/host.h |  1 +
>  2 files changed, 33 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> index 92900a0..ab0ecc6 100644
> --- a/drivers/mmc/core/queue.c
> +++ b/drivers/mmc/core/queue.c
> @@ -24,6 +24,8 @@
>  #include "card.h"
>  #include "host.h"
>  
> +#define MMC_DMA_MAP_MERGE_SEGMENTS	512
> +
>  static inline bool mmc_cqe_dcmd_busy(struct mmc_queue *mq)
>  {
>  	/* Allow only 1 DCMD at a time */
> @@ -196,6 +198,12 @@ static void mmc_queue_setup_discard(struct request_queue *q,
>  		blk_queue_flag_set(QUEUE_FLAG_SECERASE, q);
>  }
>  
> +static unsigned int mmc_get_max_segments(struct mmc_host *host)
> +{
> +	return host->can_dma_map_merge ? MMC_DMA_MAP_MERGE_SEGMENTS :
> +					 host->max_segs;

I personally don't like superfluous use of ? : if an if would be more
obvious:

	if (host->can_dma_map_merge)
		return MMC_DMA_MAP_MERGE_SEGMENTS;
	return host->max_segs;

but that is really just a nitpick and for the mmc maintainer to decide.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance
  2019-06-20  8:50 [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance Yoshihiro Shimoda
                   ` (4 preceding siblings ...)
  2019-06-20  8:50 ` [RFC PATCH v7 5/5] mmc: queue: Use bigger segments if DMA MAP layer can " Yoshihiro Shimoda
@ 2019-07-01  8:32 ` Christoph Hellwig
  2019-07-08 11:45   ` Ulf Hansson
  5 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2019-07-01  8:32 UTC
  To: Yoshihiro Shimoda
  Cc: axboe, linux-block, ulf.hansson, linux-mmc, linux-renesas-soc,
	wsa+renesas, iommu, robin.murphy, hch

Any comments from the block, iommu and mmc maintainers?  I'd be happy
to queue this up in the dma-mapping tree, but I'll need some ACKs
for that fast.  Alternatively I can just queue up the DMA API bits,
leaving the rest for the next merge window, but that would drag things
out far too long IMHO.

* Re: [RFC PATCH v7 5/5] mmc: queue: Use bigger segments if DMA MAP layer can merge the segments
  2019-06-24  6:24   ` Christoph Hellwig
@ 2019-07-08 11:45     ` Ulf Hansson
  0 siblings, 0 replies; 15+ messages in thread
From: Ulf Hansson @ 2019-07-08 11:45 UTC
  To: Christoph Hellwig
  Cc: Jens Axboe, Linux-Renesas, linux-mmc, linux-block, Wolfram Sang,
	open list:IOMMU DRIVERS <iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, Robin Murphy

On Mon, 24 Jun 2019 at 08:24, Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, Jun 20, 2019 at 05:50:10PM +0900, Yoshihiro Shimoda wrote:
> > When the max_segs of a mmc host is smaller than 512, the mmc
> > subsystem tries to use 512 segments if DMA MAP layer can merge
> > the segments, and then the mmc subsystem exposes such information
> > to the block layer by using blk_queue_can_use_dma_map_merging().
> >
> > Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > ---
> >  drivers/mmc/core/queue.c | 35 ++++++++++++++++++++++++++++++++---
> >  include/linux/mmc/host.h |  1 +
> >  2 files changed, 33 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> > index 92900a0..ab0ecc6 100644
> > --- a/drivers/mmc/core/queue.c
> > +++ b/drivers/mmc/core/queue.c
> > @@ -24,6 +24,8 @@
> >  #include "card.h"
> >  #include "host.h"
> >
> > +#define MMC_DMA_MAP_MERGE_SEGMENTS   512
> > +
> >  static inline bool mmc_cqe_dcmd_busy(struct mmc_queue *mq)
> >  {
> >       /* Allow only 1 DCMD at a time */
> > @@ -196,6 +198,12 @@ static void mmc_queue_setup_discard(struct request_queue *q,
> >               blk_queue_flag_set(QUEUE_FLAG_SECERASE, q);
> >  }
> >
> > +static unsigned int mmc_get_max_segments(struct mmc_host *host)
> > +{
> > +     return host->can_dma_map_merge ? MMC_DMA_MAP_MERGE_SEGMENTS :
> > +                                      host->max_segs;
>
> I personally don't like superfluous use of ? : if an if would be more
> obvious:
>
>         if (host->can_dma_map_merge)
>                 return MMC_DMA_MAP_MERGE_SEGMENTS;
>         return host->max_segs;
>
> but that is really just a nitpick and for the mmc maintainer to decide.

I have no strong opinion; both forms are used in the mmc code, so I am
fine with it as is.
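
For illustration only, here is a minimal standalone sketch (not the kernel
code) of the two forms being compared. It assumes a cut-down stand-in for
struct mmc_host: struct mmc_host_sketch and both function names are
hypothetical and introduced here; only MMC_DMA_MAP_MERGE_SEGMENTS,
can_dma_map_merge and max_segs come from the patch.

```c
#include <stdbool.h>

/* Value taken from the patch under review. */
#define MMC_DMA_MAP_MERGE_SEGMENTS 512

/* Hypothetical stand-in for struct mmc_host: only the two fields used here. */
struct mmc_host_sketch {
	bool can_dma_map_merge;	/* flag added by the patch */
	unsigned int max_segs;	/* segment limit reported by the host driver */
};

/* Form used in the patch: conditional operator. */
static unsigned int max_segments_ternary(const struct mmc_host_sketch *host)
{
	return host->can_dma_map_merge ? MMC_DMA_MAP_MERGE_SEGMENTS :
					 host->max_segs;
}

/* Form suggested in review: explicit if with an early return. */
static unsigned int max_segments_if(const struct mmc_host_sketch *host)
{
	if (host->can_dma_map_merge)
		return MMC_DMA_MAP_MERGE_SEGMENTS;
	return host->max_segs;
}
```

Both functions return MMC_DMA_MAP_MERGE_SEGMENTS when can_dma_map_merge is
set and max_segs otherwise, so the choice between them is purely stylistic.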

>
> Otherwise looks good:
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance
  2019-07-01  8:32 ` [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance Christoph Hellwig
@ 2019-07-08 11:45   ` Ulf Hansson
  2019-07-08 16:22     ` Christoph Hellwig
  0 siblings, 1 reply; 15+ messages in thread
From: Ulf Hansson @ 2019-07-08 11:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, linux-mmc, Linux-Renesas, Wolfram Sang,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	Robin Murphy

On Mon, 1 Jul 2019 at 10:32, Christoph Hellwig <hch@lst.de> wrote:
>
> Any comments from the block, iommu and mmc maintainers?  I'd be happy
> to queue this up in the dma-mapping tree, but I'll need some ACKs
> for that fast.  Alternatively I can just queue up the DMA API bits,
> leaving the rest for the next merge window, but that would drag things
> out far too long IMHO.

Apologies for the delay; the mmc parts look good to me. If it is not too
late, feel free to pick it up.

Otherwise, let's do it for the next cycle.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance
  2019-07-08 11:45   ` Ulf Hansson
@ 2019-07-08 16:22     ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2019-07-08 16:22 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Jens Axboe, linux-block, linux-mmc, Linux-Renesas, Wolfram Sang,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	Robin Murphy, Christoph Hellwig

On Mon, Jul 08, 2019 at 01:45:55PM +0200, Ulf Hansson wrote:
> On Mon, 1 Jul 2019 at 10:32, Christoph Hellwig <hch@lst.de> wrote:
> >
> > Any comments from the block, iommu and mmc maintainers?  I'd be happy
> > to queue this up in the dma-mapping tree, but I'll need some ACKs
> > for that fast.  Alternatively I can just queue up the DMA API bits,
> > leaving the rest for the next merge window, but that would drag things
> > out far too long IMHO.
> 
> Apologies for the delay; the mmc parts look good to me. If it is not too
> late, feel free to pick it up.
> 
> Otherwise, let's do it for the next cycle.

I was out for the last couple of days, so it has to be next cycle.  But it
would still make sense to get everything into a single tree.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2019-07-08 16:28 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-20  8:50 [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance Yoshihiro Shimoda
2019-06-20  8:50 ` [RFC PATCH v7 1/5] dma: Introduce dma_get_merge_boundary() Yoshihiro Shimoda
2019-06-24  6:20   ` Christoph Hellwig
2019-06-20  8:50 ` [RFC PATCH v7 2/5] iommu/dma: Add a new dma_map_ops of get_merge_boundary() Yoshihiro Shimoda
2019-06-21  7:59   ` Marek Szyprowski
2019-06-24  6:21     ` Christoph Hellwig
2019-06-20  8:50 ` [RFC PATCH v7 3/5] block: sort headers on blk-setting.c Yoshihiro Shimoda
2019-06-20  8:50 ` [RFC PATCH v7 4/5] block: add a helper function to merge the segments Yoshihiro Shimoda
2019-06-24  6:22   ` Christoph Hellwig
2019-06-20  8:50 ` [RFC PATCH v7 5/5] mmc: queue: Use bigger segments if DMA MAP layer can " Yoshihiro Shimoda
2019-06-24  6:24   ` Christoph Hellwig
2019-07-08 11:45     ` Ulf Hansson
2019-07-01  8:32 ` [RFC PATCH v7 0/5] treewide: improve R-Car SDHI performance Christoph Hellwig
2019-07-08 11:45   ` Ulf Hansson
2019-07-08 16:22     ` Christoph Hellwig
