linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] dmaengine: bcm2835: Cyclic DMA fixes
@ 2016-06-09 11:41 Matthias Reichl
  2016-06-09 11:41 ` [PATCH 1/2] dmaengine: bcm2835: Fix cyclic DMA period splitting Matthias Reichl
  2016-06-09 11:41 ` [PATCH 2/2] dmaengine: bcm2835: Avoid splitting periods into very small chunks Matthias Reichl
  0 siblings, 2 replies; 6+ messages in thread
From: Matthias Reichl @ 2016-06-09 11:41 UTC (permalink / raw)
  To: Vinod Koul, Stephen Warren, Lee Jones, Eric Anholt
  Cc: Martin Sperl, Clive Messer, dmaengine, linux-rpi-kernel,
	linux-arm-kernel, linux-kernel

In downstream Raspberry Pi kernel we noticed that audio didn't work
as expected, we got stuttering and overruns/underruns. Here's the
link to the original discussion on GitHub:
https://github.com/raspberrypi/linux/issues/1517

This issue is caused by a small bug in the period-splitting-code
and fixed by the first patch.

The second patch, avoiding very small chunks, is mainly a precaution.
While small chunks are not known to have caused any problems so far
they have the potentical to cause very hard to track down issues.
So better avoid such situations in the first place.

Matthias Reichl (2):
  dmaengine: bcm2835: Fix cyclic DMA period splitting
  dmaengine: bcm2835: Avoid splitting periods into very small chunks

 drivers/dma/bcm2835-dma.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

-- 
2.1.4

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] dmaengine: bcm2835: Fix cyclic DMA period splitting
  2016-06-09 11:41 [PATCH 0/2] dmaengine: bcm2835: Cyclic DMA fixes Matthias Reichl
@ 2016-06-09 11:41 ` Matthias Reichl
  2016-06-14  4:49   ` Eric Anholt
  2016-06-09 11:41 ` [PATCH 2/2] dmaengine: bcm2835: Avoid splitting periods into very small chunks Matthias Reichl
  1 sibling, 1 reply; 6+ messages in thread
From: Matthias Reichl @ 2016-06-09 11:41 UTC (permalink / raw)
  To: Vinod Koul, Stephen Warren, Lee Jones, Eric Anholt
  Cc: Martin Sperl, Clive Messer, dmaengine, linux-rpi-kernel,
	linux-arm-kernel, linux-kernel

The code responsible for splitting periods into chunks that
can be handled by the DMA controller missed to update total_len,
the number of bytes processed in the current period, when there
are more chunks to follow.

Therefore total_len was stuck at 0 and the code didn't work at all.
This resulted in a wrong control block layout and audio issues because
the cyclic DMA callback wasn't executing on period boundaries.

Fix this by adding the missing total_len update.

Signed-off-by: Matthias Reichl <hias@horus.com>
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Tested-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
---
 drivers/dma/bcm2835-dma.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
index 6149b27..344bcf92 100644
--- a/drivers/dma/bcm2835-dma.c
+++ b/drivers/dma/bcm2835-dma.c
@@ -251,8 +251,11 @@ static void bcm2835_dma_create_cb_set_length(
 	 */
 
 	/* have we filled in period_length yet? */
-	if (*total_len + control_block->length < period_len)
+	if (*total_len + control_block->length < period_len) {
+		/* update number of bytes in this period so far */
+		*total_len += control_block->length;
 		return;
+	}
 
 	/* calculate the length that remains to reach period_length */
 	control_block->length = period_len - *total_len;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] dmaengine: bcm2835: Avoid splitting periods into very small chunks
  2016-06-09 11:41 [PATCH 0/2] dmaengine: bcm2835: Cyclic DMA fixes Matthias Reichl
  2016-06-09 11:41 ` [PATCH 1/2] dmaengine: bcm2835: Fix cyclic DMA period splitting Matthias Reichl
@ 2016-06-09 11:41 ` Matthias Reichl
  2016-06-14  5:06   ` Eric Anholt
  1 sibling, 1 reply; 6+ messages in thread
From: Matthias Reichl @ 2016-06-09 11:41 UTC (permalink / raw)
  To: Vinod Koul, Stephen Warren, Lee Jones, Eric Anholt
  Cc: Martin Sperl, Clive Messer, dmaengine, linux-rpi-kernel,
	linux-arm-kernel, linux-kernel

The current cyclic DMA period splitting implementation can generate
very small chunks at the end of each period. For example a 65536 byte
period will be split into a 65532 byte chunk and a 4 byte chunk on
the "lite" DMA channels.

This increases pressure on the RAM controller as the DMA controller
needs to fetch two control blocks from RAM in quick succession and
could potentially cause latency issues if the RAM is tied up by other
devices.

We can easily avoid these situations by distributing the remaining
length evenly between the last-but-one and the last chunk, making
sure that split chunks will be at least half the maximum length the
DMA controller can handle.

This patch checks if the last chunk would be less than half of
the maximum DMA length and if yes distributes the max len+4...max_len*1.5
bytes evenly between the last 2 chunks. This results in chunk sizes
between max_len/2 and max_len*0.75 bytes.

Signed-off-by: Matthias Reichl <hias@horus.com>
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Tested-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
---
 drivers/dma/bcm2835-dma.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
index 344bcf92..36b998d 100644
--- a/drivers/dma/bcm2835-dma.c
+++ b/drivers/dma/bcm2835-dma.c
@@ -252,6 +252,20 @@ static void bcm2835_dma_create_cb_set_length(
 
 	/* have we filled in period_length yet? */
 	if (*total_len + control_block->length < period_len) {
+		/*
+		 * If the next control block is the last in the period
+		 * and it's length would be less than half of max_len
+		 * change it so that both control blocks are (almost)
+		 * equally long. This avoids generating very short
+		 * control blocks (worst case would be 4 bytes) which
+		 * might be problematic. We also have to make sure the
+		 * new length is a multiple of 4 bytes.
+		 */
+		if (*total_len + control_block->length + max_len / 2 >
+		    period_len) {
+			control_block->length =
+				DIV_ROUND_UP(period_len - *total_len, 8) * 4;
+		}
 		/* update number of bytes in this period so far */
 		*total_len += control_block->length;
 		return;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] dmaengine: bcm2835: Fix cyclic DMA period splitting
  2016-06-09 11:41 ` [PATCH 1/2] dmaengine: bcm2835: Fix cyclic DMA period splitting Matthias Reichl
@ 2016-06-14  4:49   ` Eric Anholt
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Anholt @ 2016-06-14  4:49 UTC (permalink / raw)
  To: Matthias Reichl, Vinod Koul, Stephen Warren, Lee Jones
  Cc: Martin Sperl, Clive Messer, dmaengine, linux-rpi-kernel,
	linux-arm-kernel, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 690 bytes --]

Matthias Reichl <hias@horus.com> writes:

> The code responsible for splitting periods into chunks that
> can be handled by the DMA controller missed to update total_len,
> the number of bytes processed in the current period, when there
> are more chunks to follow.
>
> Therefore total_len was stuck at 0 and the code didn't work at all.
> This resulted in a wrong control block layout and audio issues because
> the cyclic DMA callback wasn't executing on period boundaries.
>
> Fix this by adding the missing total_len update.

It looks like this issue has been around for a long time, and this fix
is pretty dependent on the recent refactors.

Reviewed-by: Eric Anholt <eric@anholt.net>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 818 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] dmaengine: bcm2835: Avoid splitting periods into very small chunks
  2016-06-09 11:41 ` [PATCH 2/2] dmaengine: bcm2835: Avoid splitting periods into very small chunks Matthias Reichl
@ 2016-06-14  5:06   ` Eric Anholt
  2016-06-19 10:39     ` Matthias Reichl
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Anholt @ 2016-06-14  5:06 UTC (permalink / raw)
  To: Matthias Reichl, Vinod Koul, Stephen Warren, Lee Jones
  Cc: Martin Sperl, Clive Messer, dmaengine, linux-rpi-kernel,
	linux-arm-kernel, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3168 bytes --]

Matthias Reichl <hias@horus.com> writes:

> The current cyclic DMA period splitting implementation can generate
> very small chunks at the end of each period. For example a 65536 byte
> period will be split into a 65532 byte chunk and a 4 byte chunk on
> the "lite" DMA channels.
>
> This increases pressure on the RAM controller as the DMA controller
> needs to fetch two control blocks from RAM in quick succession and
> could potentially cause latency issues if the RAM is tied up by other
> devices.
>
> We can easily avoid these situations by distributing the remaining
> length evenly between the last-but-one and the last chunk, making
> sure that split chunks will be at least half the maximum length the
> DMA controller can handle.
>
> This patch checks if the last chunk would be less than half of
> the maximum DMA length and if yes distributes the max len+4...max_len*1.5
> bytes evenly between the last 2 chunks. This results in chunk sizes
> between max_len/2 and max_len*0.75 bytes.
>
> Signed-off-by: Matthias Reichl <hias@horus.com>
> Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
> Tested-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
> ---
>  drivers/dma/bcm2835-dma.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
> index 344bcf92..36b998d 100644
> --- a/drivers/dma/bcm2835-dma.c
> +++ b/drivers/dma/bcm2835-dma.c
> @@ -252,6 +252,20 @@ static void bcm2835_dma_create_cb_set_length(
>  
>  	/* have we filled in period_length yet? */
>  	if (*total_len + control_block->length < period_len) {
> +		/*
> +		 * If the next control block is the last in the period
> +		 * and it's length would be less than half of max_len
> +		 * change it so that both control blocks are (almost)
> +		 * equally long. This avoids generating very short
> +		 * control blocks (worst case would be 4 bytes) which
> +		 * might be problematic. We also have to make sure the
> +		 * new length is a multiple of 4 bytes.
> +		 */
> +		if (*total_len + control_block->length + max_len / 2 >
> +		    period_len) {
> +			control_block->length =
> +				DIV_ROUND_UP(period_len - *total_len, 8) * 4;
> +		}
>  		/* update number of bytes in this period so far */
>  		*total_len += control_block->length;
>  		return;

It seems to me like this would all be a lot simpler if we always split
the last 2 control blocks evenly (other than 4-byte rounding):

u32 period_remaining = period_len - *total_len;

/* Early exit if we aren't finishing this period */
if (period_remaining >= max_len) {
	/*
	 * Split the length between the last 2 CBs, to help hide the
	 * latency of fetching the CBs.
	 */
	if (period_remaining < max_len * 2) {
		control_block->length =
                	DIV_ROUND_UP(period_remaining, 8) * 4;
        }
	/* update number of bytes in this period so far */
	*total_len += control_block->length;
}

I'm about to go semi-AFK for a couple weeks.  If there's a good reason
to only do this when the last block is very short, I'm fine with:

Acked-by: Eric Anholt <eric@anholt.net>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 818 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] dmaengine: bcm2835: Avoid splitting periods into very small chunks
  2016-06-14  5:06   ` Eric Anholt
@ 2016-06-19 10:39     ` Matthias Reichl
  0 siblings, 0 replies; 6+ messages in thread
From: Matthias Reichl @ 2016-06-19 10:39 UTC (permalink / raw)
  To: Eric Anholt
  Cc: Vinod Koul, Stephen Warren, Lee Jones, Martin Sperl,
	Clive Messer, dmaengine, linux-rpi-kernel, linux-arm-kernel,
	linux-kernel

On Mon, Jun 13, 2016 at 10:06:49PM -0700, Eric Anholt wrote:
> Matthias Reichl <hias@horus.com> writes:
> 
> > The current cyclic DMA period splitting implementation can generate
> > very small chunks at the end of each period. For example a 65536 byte
> > period will be split into a 65532 byte chunk and a 4 byte chunk on
> > the "lite" DMA channels.
> >
> > This increases pressure on the RAM controller as the DMA controller
> > needs to fetch two control blocks from RAM in quick succession and
> > could potentially cause latency issues if the RAM is tied up by other
> > devices.
> >
> > We can easily avoid these situations by distributing the remaining
> > length evenly between the last-but-one and the last chunk, making
> > sure that split chunks will be at least half the maximum length the
> > DMA controller can handle.
> >
> > This patch checks if the last chunk would be less than half of
> > the maximum DMA length and if yes distributes the max len+4...max_len*1.5
> > bytes evenly between the last 2 chunks. This results in chunk sizes
> > between max_len/2 and max_len*0.75 bytes.
> >
> > Signed-off-by: Matthias Reichl <hias@horus.com>
> > Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
> > Tested-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
> > ---
> >  drivers/dma/bcm2835-dma.c | 14 ++++++++++++++
> >  1 file changed, 14 insertions(+)
> >
> > diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
> > index 344bcf92..36b998d 100644
> > --- a/drivers/dma/bcm2835-dma.c
> > +++ b/drivers/dma/bcm2835-dma.c
> > @@ -252,6 +252,20 @@ static void bcm2835_dma_create_cb_set_length(
> >  
> >  	/* have we filled in period_length yet? */
> >  	if (*total_len + control_block->length < period_len) {
> > +		/*
> > +		 * If the next control block is the last in the period
> > +		 * and it's length would be less than half of max_len
> > +		 * change it so that both control blocks are (almost)
> > +		 * equally long. This avoids generating very short
> > +		 * control blocks (worst case would be 4 bytes) which
> > +		 * might be problematic. We also have to make sure the
> > +		 * new length is a multiple of 4 bytes.
> > +		 */
> > +		if (*total_len + control_block->length + max_len / 2 >
> > +		    period_len) {
> > +			control_block->length =
> > +				DIV_ROUND_UP(period_len - *total_len, 8) * 4;
> > +		}
> >  		/* update number of bytes in this period so far */
> >  		*total_len += control_block->length;
> >  		return;
> 
> It seems to me like this would all be a lot simpler if we always split
> the last 2 control blocks evenly (other than 4-byte rounding):

Agreed and thanks a lot for the feedback!

I'll do it that way and then send out a v2.

> u32 period_remaining = period_len - *total_len;
> 
> /* Early exit if we aren't finishing this period */
> if (period_remaining >= max_len) {

This has to be > max_len, but the rest seems fine. We want to split
if we have more than max_len but less than max_len*2 bytes.

> 	/*
> 	 * Split the length between the last 2 CBs, to help hide the
> 	 * latency of fetching the CBs.
> 	 */
> 	if (period_remaining < max_len * 2) {
> 		control_block->length =
>                 	DIV_ROUND_UP(period_remaining, 8) * 4;
>         }
> 	/* update number of bytes in this period so far */
> 	*total_len += control_block->length;
> }
> 
> I'm about to go semi-AFK for a couple weeks.  If there's a good reason
> to only do this when the last block is very short, I'm fine with:
> 
> Acked-by: Eric Anholt <eric@anholt.net>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-06-19 10:39 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-09 11:41 [PATCH 0/2] dmaengine: bcm2835: Cyclic DMA fixes Matthias Reichl
2016-06-09 11:41 ` [PATCH 1/2] dmaengine: bcm2835: Fix cyclic DMA period splitting Matthias Reichl
2016-06-14  4:49   ` Eric Anholt
2016-06-09 11:41 ` [PATCH 2/2] dmaengine: bcm2835: Avoid splitting periods into very small chunks Matthias Reichl
2016-06-14  5:06   ` Eric Anholt
2016-06-19 10:39     ` Matthias Reichl

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).