linux-renesas-soc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/3] mmc: renesas_sdhi: improve performance by changing max_segs
@ 2019-05-22 10:18 Yoshihiro Shimoda
  2019-05-22 10:18 ` [PATCH v3 1/3] mmc: tmio: No memory size limitation if runs on IOMMU Yoshihiro Shimoda
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Yoshihiro Shimoda @ 2019-05-22 10:18 UTC (permalink / raw)
  To: ulf.hansson, wsa+renesas; +Cc: linux-mmc, linux-renesas-soc, Yoshihiro Shimoda

This patch set is based on renesas-drivers.git /
renesas-drivers-2019-05-21-v5.2-rc1 tag.

Since SDHI host internal DMAC of the R-Car Gen3 cannot handle two or more
segments, the performance rate (especially, eMMC HS400 reading) is not good.
However, if IOMMU is enabled on the DMAC, since IOMMU will map multiple
scatter gather buffers as one contignous iova, the DMAC can handle the iova
as well and then the performance rate is possible to improve. In fact,
I have measured the performance by using bonnie++, "Sequential Input - block"
rate was improved on r8a7795. Please refer to the end of this email about
the performance. (I beleive if the performance is improved, the CPU load
is also increased.)

However, in case of a sdio card (especiialy some WiFi cards/drivers),
scatter gather buffers are possible to be not contiguous iova because
each scatter gather buffer has only about 1500 bytes, the DMAC cannot
handle it. So, this patch set adds init_card() ops to detect the card
type, and then the driver changes the max_segs if the DMAC is under
IOMMU environment and an sd card/mmc is detected.

Remarks: I mentioned that Gen2 with the SYS-DMAC basis environment can
also improve the perfomance on the public ML [1], but there was
a measurement error so that I didn't make a patch for SYS-DMAC.

Changes from v2 [2]:
 - Add some conditions in the init_card().
 - Add a comment in the init_card().
 - Add definitions for some "MAX_SEGS".

Changes from v1 [3]:
 - Remove adding init_card ops into struct tmio_mmc_dma_ops and
   tmio_mmc_host and just set init_card on renesas_sdhi_core.c.
 - Revise typos on "mmc: tmio: No memory size limitation if runs on IOMMU".
 - Add Simon-san's Reviewed-by on a tmio patch.

[1]
https://patchwork.kernel.org/patch/10940225/#22635487

[2]
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=116729

[3]
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=110485

---
kernel v5.2-rc1 + local patches + HS400,,,,,,,,,,,,,,,,,,,,,,,,,,
Buildroot 2019.02.1,,,,,,,,,,,,,,,,,,,,,,,,,,
Bonnie++ 1.03e : bonnie\+\+ -d ./ -s 8192 -r 4096 -b -u root,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,
environment,Size,Sequential Output - per char (K/sec),<- (CPU %),Sequential Output - block (K/sec),<- (CPU %),Sequential Output - rewrite (K/sec),<- (CPU %),Sequential Input - per char (K/sec),<- (CPU %),Sequential Input ? block (K/sec),<- (CPU %),Random seeks,<- (CPU %),files,Sequential Create,<- (CPU %),Sequential Read,<- (CPU %),Sequential Delete,<- (CPU %),Random Create,<- (CPU %),Random Read,<- (CPU %),Random Delete,<- (CPU %)
H3_max_segs_1_on_IOMMU,8G,80858,96,121466,28,60602,16,71712,95,121427,15,4128.5,11,16,678,3,+++++,+++,593,2,616,3,+++++,+++,603,3
H3_max_segs_512_on_IOMMU,8G,76637,91,132994,31,81478,20,74522,98,197325,26,4526.5,12,16,864,4,+++++,+++,821,4,814,4,+++++,+++,786,4
---

Yoshihiro Shimoda (3):
  mmc: tmio: No memory size limitation if runs on IOMMU
  mmc: tmio: Add a definition for default max_segs
  mmc: renesas_sdhi: use multiple segments if possible

 drivers/mmc/host/renesas_sdhi_core.c | 25 +++++++++++++++++++++++++
 drivers/mmc/host/tmio_mmc.h          |  1 +
 drivers/mmc/host/tmio_mmc_core.c     |  7 ++++---
 3 files changed, 30 insertions(+), 3 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v3 1/3] mmc: tmio: No memory size limitation if runs on IOMMU
  2019-05-22 10:18 [PATCH v3 0/3] mmc: renesas_sdhi: improve performance by changing max_segs Yoshihiro Shimoda
@ 2019-05-22 10:18 ` Yoshihiro Shimoda
  2019-05-22 11:31   ` Wolfram Sang
  2019-05-22 10:18 ` [PATCH v3 2/3] mmc: tmio: Add a definition for default max_segs Yoshihiro Shimoda
  2019-05-22 10:18 ` [PATCH v3 3/3] mmc: renesas_sdhi: use multiple segments if possible Yoshihiro Shimoda
  2 siblings, 1 reply; 9+ messages in thread
From: Yoshihiro Shimoda @ 2019-05-22 10:18 UTC (permalink / raw)
  To: ulf.hansson, wsa+renesas; +Cc: linux-mmc, linux-renesas-soc, Yoshihiro Shimoda

This patch adds a condition to avoid a memory size limitation of
swiotlb if the driver runs on IOMMU.

Tested-by: Takeshi Saito <takeshi.saito.xv@renesas.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
---
 drivers/mmc/host/tmio_mmc_core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c
index 130b91c..7c76ab0 100644
--- a/drivers/mmc/host/tmio_mmc_core.c
+++ b/drivers/mmc/host/tmio_mmc_core.c
@@ -1194,9 +1194,10 @@ int tmio_mmc_host_probe(struct tmio_mmc_host *_host)
 	 * Since swiotlb has memory size limitation, this will calculate
 	 * the maximum size locally (because we don't have any APIs for it now)
 	 * and check the current max_req_size. And then, this will update
-	 * the max_req_size if needed as a workaround.
+	 * the max_req_size if needed as a workaround. However, if the driver
+	 * runs on IOMMU, this workaround isn't needed.
 	 */
-	if (swiotlb_max_segment()) {
+	if (swiotlb_max_segment() && !pdev->dev.iommu_group) {
 		unsigned int max_size = (1 << IO_TLB_SHIFT) * IO_TLB_SEGSIZE;
 
 		if (mmc->max_req_size > max_size)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 2/3] mmc: tmio: Add a definition for default max_segs
  2019-05-22 10:18 [PATCH v3 0/3] mmc: renesas_sdhi: improve performance by changing max_segs Yoshihiro Shimoda
  2019-05-22 10:18 ` [PATCH v3 1/3] mmc: tmio: No memory size limitation if runs on IOMMU Yoshihiro Shimoda
@ 2019-05-22 10:18 ` Yoshihiro Shimoda
  2019-05-22 11:33   ` Wolfram Sang
  2019-05-22 10:18 ` [PATCH v3 3/3] mmc: renesas_sdhi: use multiple segments if possible Yoshihiro Shimoda
  2 siblings, 1 reply; 9+ messages in thread
From: Yoshihiro Shimoda @ 2019-05-22 10:18 UTC (permalink / raw)
  To: ulf.hansson, wsa+renesas; +Cc: linux-mmc, linux-renesas-soc, Yoshihiro Shimoda

This patch adds a definition for default max_segs to be used by other
driver (renesas_sdhi) in the future.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 drivers/mmc/host/tmio_mmc.h      | 1 +
 drivers/mmc/host/tmio_mmc_core.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/tmio_mmc.h b/drivers/mmc/host/tmio_mmc.h
index c5ba13f..9e387be 100644
--- a/drivers/mmc/host/tmio_mmc.h
+++ b/drivers/mmc/host/tmio_mmc.h
@@ -106,6 +106,7 @@
 #define TMIO_MASK_IRQ     (TMIO_MASK_READOP | TMIO_MASK_WRITEOP | TMIO_MASK_CMD)
 
 #define TMIO_MAX_BLK_SIZE 512
+#define TMIO_DEFAULT_MAX_SEGS 32
 
 struct tmio_mmc_data;
 struct tmio_mmc_host;
diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c
index 7c76ab0..583cfc7 100644
--- a/drivers/mmc/host/tmio_mmc_core.c
+++ b/drivers/mmc/host/tmio_mmc_core.c
@@ -1185,7 +1185,7 @@ int tmio_mmc_host_probe(struct tmio_mmc_host *_host)
 
 	mmc->caps |= MMC_CAP_4_BIT_DATA | pdata->capabilities;
 	mmc->caps2 |= pdata->capabilities2;
-	mmc->max_segs = pdata->max_segs ? : 32;
+	mmc->max_segs = pdata->max_segs ? : TMIO_DEFAULT_MAX_SEGS;
 	mmc->max_blk_size = TMIO_MAX_BLK_SIZE;
 	mmc->max_blk_count = pdata->max_blk_count ? :
 		(PAGE_SIZE / mmc->max_blk_size) * mmc->max_segs;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 3/3] mmc: renesas_sdhi: use multiple segments if possible
  2019-05-22 10:18 [PATCH v3 0/3] mmc: renesas_sdhi: improve performance by changing max_segs Yoshihiro Shimoda
  2019-05-22 10:18 ` [PATCH v3 1/3] mmc: tmio: No memory size limitation if runs on IOMMU Yoshihiro Shimoda
  2019-05-22 10:18 ` [PATCH v3 2/3] mmc: tmio: Add a definition for default max_segs Yoshihiro Shimoda
@ 2019-05-22 10:18 ` Yoshihiro Shimoda
  2019-05-22 11:35   ` Wolfram Sang
  2019-05-22 12:29   ` Christoph Hellwig
  2 siblings, 2 replies; 9+ messages in thread
From: Yoshihiro Shimoda @ 2019-05-22 10:18 UTC (permalink / raw)
  To: ulf.hansson, wsa+renesas; +Cc: linux-mmc, linux-renesas-soc, Yoshihiro Shimoda

In IOMMU environment, since it's possible to merge scatter gather
buffers of memory requests to one iova, this patch changes the max_segs
value when init_card of mmc_host timing to improve the transfer
performance on renesas_sdhi_internal_dmac.

Notes that an sdio card may be possible to use scatter gather buffers
with non page aligned size, so that this driver will not use multiple
segments to avoid any trouble. Also, on renesas_sdhi_sys_dmac,
the max_segs value will change from 32 to 512, but the sys_dmac
can handle 512 segments, so that this init_card ops is added on
"TMIO_MMC_MIN_RCAR2" environment.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 drivers/mmc/host/renesas_sdhi_core.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index 5e9e36e..2f86975 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -46,6 +46,8 @@
 #define SDHI_VER_GEN3_SD	0xcc10
 #define SDHI_VER_GEN3_SDMMC	0xcd10
 
+#define SDHI_MAX_SEGS_IN_IOMMU	512
+
 struct renesas_sdhi_quirks {
 	bool hs400_disabled;
 	bool hs400_4taps;
@@ -203,6 +205,27 @@ static void renesas_sdhi_clk_disable(struct tmio_mmc_host *host)
 	clk_disable_unprepare(priv->clk_cd);
 }
 
+static void renesas_sdhi_init_card(struct mmc_host *mmc, struct mmc_card *card)
+{
+	struct tmio_mmc_host *host = mmc_priv(mmc);
+
+	/*
+	 * In IOMMU environment, it's possible to merge scatter gather buffers
+	 * of memory requests to one iova so that this code changes
+	 * the max_segs when init_card of mmc_host timing. Notes that an sdio
+	 * card may be possible to use scatter gather buffers with non page
+	 * aligned size, so that this driver will not use multiple segments
+	 * to avoid any trouble even if IOMMU environment.
+	 */
+	if (host->pdata->max_segs < SDHI_MAX_SEGS_IN_IOMMU &&
+	    host->pdev->dev.iommu_group &&
+	    (mmc_card_mmc(card) || mmc_card_sd(card)))
+		host->mmc->max_segs = SDHI_MAX_SEGS_IN_IOMMU;
+	else
+		host->mmc->max_segs = host->pdata->max_segs ? :
+				      TMIO_DEFAULT_MAX_SEGS;
+}
+
 static int renesas_sdhi_card_busy(struct mmc_host *mmc)
 {
 	struct tmio_mmc_host *host = mmc_priv(mmc);
@@ -726,6 +749,8 @@ int renesas_sdhi_probe(struct platform_device *pdev,
 
 	/* SDR speeds are only available on Gen2+ */
 	if (mmc_data->flags & TMIO_MMC_MIN_RCAR2) {
+		host->ops.init_card = renesas_sdhi_init_card;
+
 		/* card_busy caused issues on r8a73a4 (pre-Gen2) CD-less SDHI */
 		host->ops.card_busy = renesas_sdhi_card_busy;
 		host->ops.start_signal_voltage_switch =
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 1/3] mmc: tmio: No memory size limitation if runs on IOMMU
  2019-05-22 10:18 ` [PATCH v3 1/3] mmc: tmio: No memory size limitation if runs on IOMMU Yoshihiro Shimoda
@ 2019-05-22 11:31   ` Wolfram Sang
  0 siblings, 0 replies; 9+ messages in thread
From: Wolfram Sang @ 2019-05-22 11:31 UTC (permalink / raw)
  To: Yoshihiro Shimoda; +Cc: ulf.hansson, wsa+renesas, linux-mmc, linux-renesas-soc

[-- Attachment #1: Type: text/plain, Size: 434 bytes --]

On Wed, May 22, 2019 at 07:18:37PM +0900, Yoshihiro Shimoda wrote:
> This patch adds a condition to avoid a memory size limitation of
> swiotlb if the driver runs on IOMMU.
> 
> Tested-by: Takeshi Saito <takeshi.saito.xv@renesas.com>
> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Reviewed-by: Simon Horman <horms+renesas@verge.net.au>

Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 2/3] mmc: tmio: Add a definition for default max_segs
  2019-05-22 10:18 ` [PATCH v3 2/3] mmc: tmio: Add a definition for default max_segs Yoshihiro Shimoda
@ 2019-05-22 11:33   ` Wolfram Sang
  0 siblings, 0 replies; 9+ messages in thread
From: Wolfram Sang @ 2019-05-22 11:33 UTC (permalink / raw)
  To: Yoshihiro Shimoda; +Cc: ulf.hansson, wsa+renesas, linux-mmc, linux-renesas-soc

[-- Attachment #1: Type: text/plain, Size: 322 bytes --]

On Wed, May 22, 2019 at 07:18:38PM +0900, Yoshihiro Shimoda wrote:
> This patch adds a definition for default max_segs to be used by other
> driver (renesas_sdhi) in the future.
> 
> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>

Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 3/3] mmc: renesas_sdhi: use multiple segments if possible
  2019-05-22 10:18 ` [PATCH v3 3/3] mmc: renesas_sdhi: use multiple segments if possible Yoshihiro Shimoda
@ 2019-05-22 11:35   ` Wolfram Sang
  2019-05-22 12:29   ` Christoph Hellwig
  1 sibling, 0 replies; 9+ messages in thread
From: Wolfram Sang @ 2019-05-22 11:35 UTC (permalink / raw)
  To: Yoshihiro Shimoda; +Cc: ulf.hansson, wsa+renesas, linux-mmc, linux-renesas-soc

[-- Attachment #1: Type: text/plain, Size: 932 bytes --]

On Wed, May 22, 2019 at 07:18:39PM +0900, Yoshihiro Shimoda wrote:
> In IOMMU environment, since it's possible to merge scatter gather
> buffers of memory requests to one iova, this patch changes the max_segs
> value when init_card of mmc_host timing to improve the transfer
> performance on renesas_sdhi_internal_dmac.
> 
> Notes that an sdio card may be possible to use scatter gather buffers
> with non page aligned size, so that this driver will not use multiple
> segments to avoid any trouble. Also, on renesas_sdhi_sys_dmac,
> the max_segs value will change from 32 to 512, but the sys_dmac
> can handle 512 segments, so that this init_card ops is added on
> "TMIO_MMC_MIN_RCAR2" environment.
> 
> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>

Awesome, Shimoda-san. I think you nailed it, this is nicely readable
code!

Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 3/3] mmc: renesas_sdhi: use multiple segments if possible
  2019-05-22 10:18 ` [PATCH v3 3/3] mmc: renesas_sdhi: use multiple segments if possible Yoshihiro Shimoda
  2019-05-22 11:35   ` Wolfram Sang
@ 2019-05-22 12:29   ` Christoph Hellwig
  2019-05-23  4:15     ` Yoshihiro Shimoda
  1 sibling, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2019-05-22 12:29 UTC (permalink / raw)
  To: Yoshihiro Shimoda
  Cc: ulf.hansson, wsa+renesas, linux-mmc, linux-renesas-soc, iommu,
	linux-block

On Wed, May 22, 2019 at 07:18:39PM +0900, Yoshihiro Shimoda wrote:
> In IOMMU environment, since it's possible to merge scatter gather
> buffers of memory requests to one iova, this patch changes the max_segs
> value when init_card of mmc_host timing to improve the transfer
> performance on renesas_sdhi_internal_dmac.

Well, you can't merge everything with an IOMMU.  For one not every
IOMMU can merge multiple scatterlist segments, second even it can merge
segements the segments need to be aligned to the IOMMU page size.  And
then of course we might have an upper limit on the total mapping.

> +	if (host->pdata->max_segs < SDHI_MAX_SEGS_IN_IOMMU &&
> +	    host->pdev->dev.iommu_group &&
> +	    (mmc_card_mmc(card) || mmc_card_sd(card)))
> +		host->mmc->max_segs = SDHI_MAX_SEGS_IN_IOMMU;

This is way to magic.  We'll need a proper DMA layer API to expose
this information, and preferably a block layer helper to increase
max_segs instead of hacking that up in the driver.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH v3 3/3] mmc: renesas_sdhi: use multiple segments if possible
  2019-05-22 12:29   ` Christoph Hellwig
@ 2019-05-23  4:15     ` Yoshihiro Shimoda
  0 siblings, 0 replies; 9+ messages in thread
From: Yoshihiro Shimoda @ 2019-05-23  4:15 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: ulf.hansson, wsa+renesas, linux-mmc, linux-renesas-soc, iommu,
	linux-block

Hi Christoph,

Thank you for your review!

> From: Christoph Hellwig, Sent: Wednesday, May 22, 2019 9:29 PM
> 
> On Wed, May 22, 2019 at 07:18:39PM +0900, Yoshihiro Shimoda wrote:
> > In IOMMU environment, since it's possible to merge scatter gather
> > buffers of memory requests to one iova, this patch changes the max_segs
> > value when init_card of mmc_host timing to improve the transfer
> > performance on renesas_sdhi_internal_dmac.
> 
> Well, you can't merge everything with an IOMMU.  For one not every
> IOMMU can merge multiple scatterlist segments,

I didn't know such IOMMU exists. But, since R-Car Gen3 IOMMU device
(handled by ipmmu-vmsa.c) can merge multiple scatterlist segments,
should this mmc driver check whether the IOMMU device is used or not somehow?

> second even it can merge
> segements the segments need to be aligned to the IOMMU page size.

If this driver checks whether the segments are aligned to the IOMMU
page size before DMA API is called every time, is it acceptable?
If one of the segments is not aligned, this driver should not use
the DMAC.

>  And
> then of course we might have an upper limit on the total mapping.

IIUC, if such a case, DMA API will fail. What do you think?

> > +	if (host->pdata->max_segs < SDHI_MAX_SEGS_IN_IOMMU &&
> > +	    host->pdev->dev.iommu_group &&
> > +	    (mmc_card_mmc(card) || mmc_card_sd(card)))
> > +		host->mmc->max_segs = SDHI_MAX_SEGS_IN_IOMMU;
> 
> This is way to magic.  We'll need a proper DMA layer API to expose
> this information, and preferably a block layer helper to increase
> max_segs instead of hacking that up in the driver.

I think I should have described the detail somewhere. This can expose
this information to a block layer by using blk_queue_max_segments()
that mmc_setup_queue() calls. In other words, this init_card() ops
is called before a block device is created. Is this acceptable if
such a comment is described here?

Best regards,
Yoshihiro Shimoda


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-05-23  4:16 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-22 10:18 [PATCH v3 0/3] mmc: renesas_sdhi: improve performance by changing max_segs Yoshihiro Shimoda
2019-05-22 10:18 ` [PATCH v3 1/3] mmc: tmio: No memory size limitation if runs on IOMMU Yoshihiro Shimoda
2019-05-22 11:31   ` Wolfram Sang
2019-05-22 10:18 ` [PATCH v3 2/3] mmc: tmio: Add a definition for default max_segs Yoshihiro Shimoda
2019-05-22 11:33   ` Wolfram Sang
2019-05-22 10:18 ` [PATCH v3 3/3] mmc: renesas_sdhi: use multiple segments if possible Yoshihiro Shimoda
2019-05-22 11:35   ` Wolfram Sang
2019-05-22 12:29   ` Christoph Hellwig
2019-05-23  4:15     ` Yoshihiro Shimoda

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).