All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mmci: restrict DMA usage to large, even multiblock transfers
@ 2011-02-01 10:41 Linus Walleij
  2011-02-01 10:54 ` Russell King - ARM Linux
  0 siblings, 1 reply; 11+ messages in thread
From: Linus Walleij @ 2011-02-01 10:41 UTC (permalink / raw)
  To: linux-arm-kernel

This will restrict the use of DMA to only cover multi-block
transfers of blocks that are evenly divisible with the FIFO
halfsize (8 4-byte words, 32 bytes). This will be true for any
normal, large MMC file transfer, falling back to PIO mode for
any smallish reads.

Cc: Ulf Hansson <ulf.hansson@stericsson.com>
Cc: Sebastian Rasmussen <sebastian.rasmussen@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
With this and the previous DMA end sync code the latest MMCI
patches work fine on U8500. On external cards the first read
is an 8-byte single block, which would fail otherwise.

I believe the earlier 1-byte-burst code would circumvent this
restriction, this solution is more elegant. Whether it is also
more efficient to avoid DMA on small packets needs to be
confirmed by measurements but I suspect it is.
---
 drivers/mmc/host/mmci.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
index bac15a3..9ba24fd 100644
--- a/drivers/mmc/host/mmci.c
+++ b/drivers/mmc/host/mmci.c
@@ -484,10 +484,14 @@ static void mmci_start_data(struct mmci_host *host, struct mmc_data *data)
 		datactrl |= MCI_DPSM_DIRECTION;
 
 	/*
-	 * Attempt to use DMA operation mode, if this
-	 * should fail, fall back to PIO mode
+	 * If the transfer is more than one block, and the block size is
+	 * evenly dividable by the half FIFO so it can be chopped into
+	 * burst packets, then attempt to use DMA operation mode, if this
+	 * should fail, fall back to PIO mode.
 	 */
-	if (!mmci_dma_start_data(host, datactrl))
+	if (data->blocks > 1 &&
+	    (data->blksz % variant->fifohalfsize == 0) &&
+	    !mmci_dma_start_data(host, datactrl))
 		return;
 
 	/* IRQ mode, map the SG list for CPU reading/writing */
-- 
1.7.3.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH] mmci: restrict DMA usage to large, even multiblock transfers
  2011-02-01 10:41 [PATCH] mmci: restrict DMA usage to large, even multiblock transfers Linus Walleij
@ 2011-02-01 10:54 ` Russell King - ARM Linux
  2011-02-01 11:34   ` Ulf Hansson
  0 siblings, 1 reply; 11+ messages in thread
From: Russell King - ARM Linux @ 2011-02-01 10:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 01, 2011 at 11:41:13AM +0100, Linus Walleij wrote:
> This will restrict the use of DMA to only cover multi-block
> transfers of blocks that are evenly divisible with the FIFO
> halfsize (8 4-byte words, 32 bytes). This will be true for any
> normal, large MMC file transfer, falling back to PIO mode for
> any smallish reads.

That's a very bad idea.  If it's an ARM primecell, and we (by some magical
reason) are using DMA, we want to always use DMA for transfers larger than
the FIFO depth.  If we switch to PIO mode, we will see FIFO overruns/
underruns if the CPU can't keep up with the data rate.

So I think you actually want:

	if (host->size >= variant->fifosize &&
	    !mmci_dma_start_data(host, datactrl))
		return;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] mmci: restrict DMA usage to large, even multiblock transfers
  2011-02-01 10:54 ` Russell King - ARM Linux
@ 2011-02-01 11:34   ` Ulf Hansson
  2011-02-01 11:45     ` Russell King - ARM Linux
  0 siblings, 1 reply; 11+ messages in thread
From: Ulf Hansson @ 2011-02-01 11:34 UTC (permalink / raw)
  To: linux-arm-kernel

>> On Tue, Feb 01, 2011 at 11:41:13AM +0100, Linus Walleij wrote:
>> This will restrict the use of DMA to only cover multi-block
>> transfers of blocks that are evenly divisible with the FIFO
>> halfsize (8 4-byte words, 32 bytes). This will be true for any
>> normal, large MMC file transfer, falling back to PIO mode for
>> any smallish reads.


> Russell King - ARM Linux wrote:
> That's a very bad idea.  If it's an ARM primecell, and we (by some magical
> reason) are using DMA, we want to always use DMA for transfers larger than
> the FIFO depth.  If we switch to PIO mode, we will see FIFO overruns/
> underruns if the CPU can't keep up with the data rate.
> 
> So I think you actually want:
> 
> 	if (host->size >= variant->fifosize &&
> 	    !mmci_dma_start_data(host, datactrl))
> 		return;

Getting FIFO overrun/underruns is then as of today already the case when 
running PIO for ARM primecell. To prevent this; you probably now that 
already, hardware flow control is implemented in ST version of the ARM 
primecell.

For performance gain I think the threshold should probably not be less 
than a 512 bytes block, but of course performance test needs to verify this.

Maybe a better option would be to specify a dma_threshold value in the 
variant struct. Thus the value can differ between what version of the 
ARM primcell we are using.

Finally I would prefer to move this "threshold" check into the
mmci_dma_start_data function instead, since there anyway can be other 
reasons to why we want to fall back to PIO mode. Better to do that from 
one place.

/Ulf Hansson

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] mmci: restrict DMA usage to large, even multiblock transfers
  2011-02-01 11:34   ` Ulf Hansson
@ 2011-02-01 11:45     ` Russell King - ARM Linux
  2011-02-01 11:56       ` Ulf Hansson
  2011-02-01 11:58       ` Russell King - ARM Linux
  0 siblings, 2 replies; 11+ messages in thread
From: Russell King - ARM Linux @ 2011-02-01 11:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 01, 2011 at 12:34:14PM +0100, Ulf Hansson wrote:
> Getting FIFO overrun/underruns is then as of today already the case when  
> running PIO for ARM primecell. To prevent this; you probably now that  
> already, hardware flow control is implemented in ST version of the ARM  
> primecell.

That really doesn't matter.  This driver is not specific to the ST
version.  It has to work on other versions as well.

I would like to see some further discussion on the unresolved issue of
the burst size setting - the discussion between myself and Linus died
without any apparant conclusion.  Linus tried to get some knowledgable
people from ST involved but no one ever followed up.

What I gathered from the discussion was that the additional DMA enable
bit found in ST variants should not be set as this changes the behaviour
of the DMA requests to be incompatible with the DMA controller
requirements.  With that bit clear, I see no problem with leaving the
burst threshold set at the half FIFO size.

That then makes avoiding DMA for requests below a specific size an
optimization issue and nothing more.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] mmci: restrict DMA usage to large, even multiblock transfers
  2011-02-01 11:45     ` Russell King - ARM Linux
@ 2011-02-01 11:56       ` Ulf Hansson
  2011-02-01 11:58       ` Russell King - ARM Linux
  1 sibling, 0 replies; 11+ messages in thread
From: Ulf Hansson @ 2011-02-01 11:56 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux wrote:
> On Tue, Feb 01, 2011 at 12:34:14PM +0100, Ulf Hansson wrote:
>> Getting FIFO overrun/underruns is then as of today already the case when  
>> running PIO for ARM primecell. To prevent this; you probably now that  
>> already, hardware flow control is implemented in ST version of the ARM  
>> primecell.
> 
> That really doesn't matter.  This driver is not specific to the ST
> version.  It has to work on other versions as well.
> 
> I would like to see some further discussion on the unresolved issue of
> the burst size setting - the discussion between myself and Linus died
> without any apparant conclusion.  Linus tried to get some knowledgable
> people from ST involved but no one ever followed up.
> 
> What I gathered from the discussion was that the additional DMA enable
> bit found in ST variants should not be set as this changes the behaviour
> of the DMA requests to be incompatible with the DMA controller
> requirements.  With that bit clear, I see no problem with leaving the
> burst threshold set at the half FIFO size.
> 
> That then makes avoiding DMA for requests below a specific size an
> optimization issue and nothing more.
> 

OK, I see your point - I agree. This should then be done as a two step 
approach. One for making sure DMA transfers is setup on data transfers 
that is not too small to handle by any reasons. Second as an performance 
optimization, which most likely means an increased threshold value that 
could "override" the earlier one.

/Ulf Hansson

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] mmci: restrict DMA usage to large, even multiblock transfers
  2011-02-01 11:45     ` Russell King - ARM Linux
  2011-02-01 11:56       ` Ulf Hansson
@ 2011-02-01 11:58       ` Russell King - ARM Linux
  2011-02-01 14:07         ` Linus Walleij
  1 sibling, 1 reply; 11+ messages in thread
From: Russell King - ARM Linux @ 2011-02-01 11:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 01, 2011 at 11:45:22AM +0000, Russell King - ARM Linux wrote:
> On Tue, Feb 01, 2011 at 12:34:14PM +0100, Ulf Hansson wrote:
> > Getting FIFO overrun/underruns is then as of today already the case when  
> > running PIO for ARM primecell. To prevent this; you probably now that  
> > already, hardware flow control is implemented in ST version of the ARM  
> > primecell.
> 
> That really doesn't matter.  This driver is not specific to the ST
> version.  It has to work on other versions as well.
> 
> I would like to see some further discussion on the unresolved issue of
> the burst size setting - the discussion between myself and Linus died
> without any apparant conclusion.  Linus tried to get some knowledgable
> people from ST involved but no one ever followed up.
> 
> What I gathered from the discussion was that the additional DMA enable
> bit found in ST variants should not be set as this changes the behaviour
> of the DMA requests to be incompatible with the DMA controller
> requirements.  With that bit clear, I see no problem with leaving the
> burst threshold set at the half FIFO size.
> 
> That then makes avoiding DMA for requests below a specific size an
> optimization issue and nothing more.

So... I think this patch will work fine on ST variants, returning the
DMA request signals back to their classic meaning.  That being LSREQ
is only activated on the _final_ transfer rather than the last 8
transfers, which as Linus describes makes your DMA controller complete
on the 8th-to-last transfer.

diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
index 7052b75..0acc2ed 100644
--- a/drivers/mmc/host/mmci.c
+++ b/drivers/mmc/host/mmci.c
@@ -44,7 +44,6 @@ static unsigned int fmax = 515633;
  * struct variant_data - MMCI variant-specific quirks
  * @clkreg: default value for MCICLOCK register
  * @clkreg_enable: enable value for MMCICLOCK register
- * @dmareg_enable: enable value for MMCIDATACTRL register
  * @datalength_bits: number of bits in the MMCIDATALENGTH register
  * @fifosize: number of bytes that can be written when MMCI_TXFIFOEMPTY
  *	      is asserted (likewise for RX)
@@ -56,7 +55,6 @@ static unsigned int fmax = 515633;
 struct variant_data {
 	unsigned int		clkreg;
 	unsigned int		clkreg_enable;
-	unsigned int		dmareg_enable;
 	unsigned int		datalength_bits;
 	unsigned int		fifosize;
 	unsigned int		fifohalfsize;
@@ -83,7 +81,6 @@ static struct variant_data variant_ux500 = {
 	.fifohalfsize		= 8 * 4,
 	.clkreg			= MCI_CLK_ENABLE,
 	.clkreg_enable		= 1 << 14, /* HWFCEN */
-	.dmareg_enable		= 1 << 12, /* DMAREQCTRL */
 	.datalength_bits	= 24,
 	.sdio			= true,
 	.st_clkdiv		= true,
@@ -367,6 +364,10 @@ static int mmci_dma_start_data(struct mmci_host *host, unsigned int datactrl)
 	if (!chan)
 		return -EINVAL;
 
+	/* If less than or equal to the fifo size, don't bother with DMA */ 
+	if (host->size <= variant->fifosize)
+		return -EINVAL;
+
 	device = chan->device;
 	nr_sg = dma_map_sg(device->dev, data->sg, data->sg_len, conf.direction);
 	if (nr_sg == 0)
@@ -388,7 +389,6 @@ static int mmci_dma_start_data(struct mmci_host *host, unsigned int datactrl)
 	dma_async_issue_pending(chan);
 
 	datactrl |= MCI_DPSM_DMAENABLE;
-	datactrl |= variant->dmareg_enable;
 
 	/* Trigger the DMA transfer */
 	writel(datactrl, host->base + MMCIDATACTRL);

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH] mmci: restrict DMA usage to large, even multiblock transfers
  2011-02-01 11:58       ` Russell King - ARM Linux
@ 2011-02-01 14:07         ` Linus Walleij
  2011-02-01 14:44           ` Russell King - ARM Linux
  2011-02-01 15:05           ` Russell King - ARM Linux
  0 siblings, 2 replies; 11+ messages in thread
From: Linus Walleij @ 2011-02-01 14:07 UTC (permalink / raw)
  To: linux-arm-kernel

2011/2/1 Russell King - ARM Linux <linux@arm.linux.org.uk>:

> So... I think this patch will work fine on ST variants, returning the
> DMA request signals back to their classic meaning. ?That being LSREQ
> is only activated on the _final_ transfer rather than the last 8
> transfers, which as Linus describes makes your DMA controller complete
> on the 8th-to-last transfer.

I've tested this patch on MMC and SD on U8500 and it works
fine, so
Tested-by: Linus Walleij <linus.walleij@linaro.org>

However I think this weird bit turning off all single requests
is there for some reason and I still try to find out what the
usecase really is. I highly suspect that it's related to either
doing some SDIO usecase or pleasing some buggy DMA
controller.

Having it deactivated most likely works for all MMC/SD cards
and then we have to think about having (ST-Specific)
if (mmc_card_sdio()) clauses for SDIO usecases, which we
can defer until we know what's going on.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] mmci: restrict DMA usage to large, even multiblock transfers
  2011-02-01 14:07         ` Linus Walleij
@ 2011-02-01 14:44           ` Russell King - ARM Linux
  2011-02-02  9:39             ` Linus Walleij
  2011-02-01 15:05           ` Russell King - ARM Linux
  1 sibling, 1 reply; 11+ messages in thread
From: Russell King - ARM Linux @ 2011-02-01 14:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 01, 2011 at 03:07:12PM +0100, Linus Walleij wrote:
> 2011/2/1 Russell King - ARM Linux <linux@arm.linux.org.uk>:
> 
> > So... I think this patch will work fine on ST variants, returning the
> > DMA request signals back to their classic meaning. ?That being LSREQ
> > is only activated on the _final_ transfer rather than the last 8
> > transfers, which as Linus describes makes your DMA controller complete
> > on the 8th-to-last transfer.
> 
> I've tested this patch on MMC and SD on U8500 and it works
> fine, so
> Tested-by: Linus Walleij <linus.walleij@linaro.org>
> 
> However I think this weird bit turning off all single requests
> is there for some reason and I still try to find out what the
> usecase really is. I highly suspect that it's related to either
> doing some SDIO usecase or pleasing some buggy DMA
> controller.

We aren't turning off single requests - by clearing DMAREQCTL, we're
actually turning them back on.

What I think is going on is someone had an idea of using LBREQ to
transfer the last one to seven transfers as a burst, rather than one
to seven separate single transfers as an optimization.  However, this
can only work if the DMA controller obeys the transfer count and doesn't
try to transfer more than the remainder of the transfer in any burst.
IOW, the DMAC has to truncate bursts to the smaller of the burst size
and remainder.  If it can't, DMAREQCTL causes problems.

I assume that the U8500/U300 documentation isn't publically available...
A quick google doesn't seem to help.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] mmci: restrict DMA usage to large, even multiblock transfers
  2011-02-01 14:07         ` Linus Walleij
  2011-02-01 14:44           ` Russell King - ARM Linux
@ 2011-02-01 15:05           ` Russell King - ARM Linux
  2011-02-02  9:34             ` Linus Walleij
  1 sibling, 1 reply; 11+ messages in thread
From: Russell King - ARM Linux @ 2011-02-01 15:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 01, 2011 at 03:07:12PM +0100, Linus Walleij wrote:
> 2011/2/1 Russell King - ARM Linux <linux@arm.linux.org.uk>:
> 
> > So... I think this patch will work fine on ST variants, returning the
> > DMA request signals back to their classic meaning. ?That being LSREQ
> > is only activated on the _final_ transfer rather than the last 8
> > transfers, which as Linus describes makes your DMA controller complete
> > on the 8th-to-last transfer.
> 
> I've tested this patch on MMC and SD on U8500 and it works
> fine, so
> Tested-by: Linus Walleij <linus.walleij@linaro.org>

Thanks.  Can I apply that to the DMA patch too (and combine this with the
main MMCI DMA patch) ?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] mmci: restrict DMA usage to large, even multiblock transfers
  2011-02-01 15:05           ` Russell King - ARM Linux
@ 2011-02-02  9:34             ` Linus Walleij
  0 siblings, 0 replies; 11+ messages in thread
From: Linus Walleij @ 2011-02-02  9:34 UTC (permalink / raw)
  To: linux-arm-kernel

2011/2/1 Russell King - ARM Linux <linux@arm.linux.org.uk>:
> On Tue, Feb 01, 2011 at 03:07:12PM +0100, Linus Walleij wrote:
>> 2011/2/1 Russell King - ARM Linux <linux@arm.linux.org.uk>:
>>
>> > So... I think this patch will work fine on ST variants, returning the
>> > DMA request signals back to their classic meaning. ?That being LSREQ
>> > is only activated on the _final_ transfer rather than the last 8
>> > transfers, which as Linus describes makes your DMA controller complete
>> > on the 8th-to-last transfer.
>>
>> I've tested this patch on MMC and SD on U8500 and it works
>> fine, so
>> Tested-by: Linus Walleij <linus.walleij@linaro.org>
>
> Thanks. ?Can I apply that to the DMA patch too (and combine this with the
> main MMCI DMA patch) ?

Yes I've tested it and it works for MMC/SD cards with the above additions.
Tested-by: Linus Walleij <linus.walleij@linaro.org>

Thanks,
Linus Walleij

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] mmci: restrict DMA usage to large, even multiblock transfers
  2011-02-01 14:44           ` Russell King - ARM Linux
@ 2011-02-02  9:39             ` Linus Walleij
  0 siblings, 0 replies; 11+ messages in thread
From: Linus Walleij @ 2011-02-02  9:39 UTC (permalink / raw)
  To: linux-arm-kernel

2011/2/1 Russell King - ARM Linux <linux@arm.linux.org.uk>:
> On Tue, Feb 01, 2011 at 03:07:12PM +0100, Linus Walleij wrote:
>> However I think this weird bit turning off all single requests
>> is there for some reason and I still try to find out what the
>> usecase really is. I highly suspect that it's related to either
>> doing some SDIO usecase or pleasing some buggy DMA
>> controller.
>
> We aren't turning off single requests - by clearing DMAREQCTL, we're
> actually turning them back on.

Yes that's what I meant, sorry for the confusion.

> What I think is going on is someone had an idea of using LBREQ to
> transfer the last one to seven transfers as a burst, rather than one
> to seven separate single transfers as an optimization. ?However, this
> can only work if the DMA controller obeys the transfer count and doesn't
> try to transfer more than the remainder of the transfer in any burst.
> IOW, the DMAC has to truncate bursts to the smaller of the burst size
> and remainder. ?If it can't, DMAREQCTL causes problems.

Seems plausible. I'll try to look into it.

> I assume that the U8500/U300 documentation isn't publically available...
> A quick google doesn't seem to help.

In this case it doesn't help... U300 does not have this peculiar
feature BTW, it's a pure ux500 thing.

If it gives clear performance improvements (need measurements)
we can always patch it in later. For now much is already achieved
by this patch.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2011-02-02  9:39 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-01 10:41 [PATCH] mmci: restrict DMA usage to large, even multiblock transfers Linus Walleij
2011-02-01 10:54 ` Russell King - ARM Linux
2011-02-01 11:34   ` Ulf Hansson
2011-02-01 11:45     ` Russell King - ARM Linux
2011-02-01 11:56       ` Ulf Hansson
2011-02-01 11:58       ` Russell King - ARM Linux
2011-02-01 14:07         ` Linus Walleij
2011-02-01 14:44           ` Russell King - ARM Linux
2011-02-02  9:39             ` Linus Walleij
2011-02-01 15:05           ` Russell King - ARM Linux
2011-02-02  9:34             ` Linus Walleij

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.