All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH - dmix v3 0/1] pcm: dmix: Align slave_hw_ptr to slave period boundary
@ 2018-11-06 12:48 Timo Wischer
  2018-11-06 12:52 ` [PATCH - dmix v3 1/1] " twischer
  2018-11-06 14:27 ` [PATCH - dmix v3 0/1] " Takashi Iwai
  0 siblings, 2 replies; 6+ messages in thread
From: Timo Wischer @ 2018-11-06 12:48 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Laxmi Devi, alsa-devel

[-- Attachment #1: Type: text/plain, Size: 1791 bytes --]

On 10/30/18 14:21, Takashi Iwai wrote:
 > Yes, and is this correct?  Suppose you start a stream at the position
 > one sample before the next period (psize * N + (psize-1)). Then
 > slave_hw_ptr is psize * N.  At the next moment, the dpcm->hw.ptr
 > reaches to psize * (N+1), and snd_pcm_dmix_sync_ptr() gets called.
 > Then slave_hw_ptr will be updated to psize * (N+1) although only one
 > sample has been processed?

I have created a spreadsheet to simulate the behavior of dmix (see 
attachment). I used this sheet to test different corner cases with 
different implementations.
Column B-D describes 3 corner cases with the original implementation. 
Column E-H describes the corner cases with patch v2 applied and column 
I-L describes the corner cases with patch v3.
With patch v3 I was not able to find a corner case which would fail. All 
our internal audio test are also passing fine.

The corner cases are described in the form
N*period+period-1
(N+1)period+1
(N+2)period+1

The first line describes the value of dmix->spcm->hw.ptr when 
snd_pcm_dmix_start() will be called.
The second line describes the hw_ptr when poll() returns for the first 
time and the third line describes the hw_ptr when poll() returns for the 
second time.

In column B is the issue case described when the patch is not applied. 
snd_pcm_avail() returns 2 frames only (B27) but the buffer only contains 
3 frames (B18).
Therefore up to 5 frames are available (a buffer size of 8 frames is 
choosen).

The issue case of patch v2 is shown in column H. snd_pcm_avail() returns 
7 frames (H46) but the buffer contains still 3 frames (H37). Therefore 
there are only 5 frames free to overwrite.

Do you see any corner cases which I missed or any other drawbacks?

Best regards and thanks for your time

Timo

[-- Attachment #2: dmix_corner_cases.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 19794 bytes --]

[-- Attachment #3: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH - dmix v3 1/1] pcm: dmix: Align slave_hw_ptr to slave period boundary
  2018-11-06 12:48 [PATCH - dmix v3 0/1] pcm: dmix: Align slave_hw_ptr to slave period boundary Timo Wischer
@ 2018-11-06 12:52 ` twischer
  2018-11-06 13:14   ` Takashi Iwai
  2018-11-06 14:27 ` [PATCH - dmix v3 0/1] " Takashi Iwai
  1 sibling, 1 reply; 6+ messages in thread
From: twischer @ 2018-11-06 12:52 UTC (permalink / raw)
  To: tiwai, patch; +Cc: Laxmi Devi, alsa-devel, Timo Wischer

From: Laxmi Devi <Laxmi.Devi@in.bosch.com>

These changes are required due to the kernel commit 07b7acb51d283d8469696c906b91f1882696a4d4
("ASoC: rsnd: update pointer more accurate")

Issue is that snd_pcm_wait() goes back to waiting because the hw_ptr
is not period aligned. Therefore snd_pcm_wait() will block for a longer
time as required.

With these rcar driver changes the exact position of the dma is returned.
During snd_pcm_start they read hw_ptr as reference, and this hw_ptr
is now not period aligned, and is a little ahead over the period while it
is read. Therefore when the avail is calculated during snd_pcm_wait(),
it is missing the avail_min by a few frames.
Consider the below example:

Considering the application is trying to write 0x120 frames and the
period_size = 0x60, avail_min = 0x120 and buffersize = 0x360 :

rsnd_pointer=0x12c -> dma pointer at the end of second write during
snd_pcm_dmix_start().
Since another 0x120 buffer is available, application writes 0x120 and goes
to snd_pcm_wait().
It is woken up after 3 snd_pcm_period_elapsed() to see rsnd_pointer=0x248.
So hw_ptr = new_slave_hw_ptr - reference_slave_hw_ptr = 0x248 - 0x12c = 0x11c.
It needs 4 more frames to be able to write. And so it goes back to waiting.

But since 3 snd_pcm_period_elapsed(), 3 periods should be available and it
should have been able to write.
If rsnd_pointer during the start was 0x120 which is 3 periods
then 0x248 - 0x120 =  128 it could go on with write.

Signed-off-by: Laxmi Devi <Laxmi.Devi@in.bosch.com>
Signed-off-by: Timo Wischer <twischer@de.adit-jv.com>
---

With this patch the slave_hw_ptr will always contain a period align value.

 src/pcm/pcm_dmix.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/pcm/pcm_dmix.c b/src/pcm/pcm_dmix.c
index 881154e..6c9e052 100644
--- a/src/pcm/pcm_dmix.c
+++ b/src/pcm/pcm_dmix.c
@@ -398,6 +398,8 @@ static int snd_pcm_dmix_sync_ptr0(snd_pcm_t *pcm, snd_pcm_uframes_t slave_hw_ptr
 	snd_pcm_sframes_t diff;
 	
 	old_slave_hw_ptr = dmix->slave_hw_ptr;
+	slave_hw_ptr = ((slave_hw_ptr / dmix->slave_period_size)
+				*  dmix->slave_period_size);
 	dmix->slave_hw_ptr = slave_hw_ptr;
 	diff = slave_hw_ptr - old_slave_hw_ptr;
 	if (diff == 0)		/* fast path */
@@ -560,6 +562,8 @@ static int snd_pcm_dmix_hwsync(snd_pcm_t *pcm)
 static void reset_slave_ptr(snd_pcm_t *pcm, snd_pcm_direct_t *dmix)
 {
 	dmix->slave_appl_ptr = dmix->slave_hw_ptr = *dmix->spcm->hw.ptr;
+	dmix->slave_hw_ptr = ((dmix->slave_hw_ptr / dmix->slave_period_size)
+				* dmix->slave_period_size);
 	if (pcm->buffer_size > pcm->period_size * 2)
 		return;
 	/* If we have too litte periods, better to align the start position
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH - dmix v3 1/1] pcm: dmix: Align slave_hw_ptr to slave period boundary
  2018-11-06 12:52 ` [PATCH - dmix v3 1/1] " twischer
@ 2018-11-06 13:14   ` Takashi Iwai
  0 siblings, 0 replies; 6+ messages in thread
From: Takashi Iwai @ 2018-11-06 13:14 UTC (permalink / raw)
  To: twischer; +Cc: Laxmi Devi, alsa-devel

On Tue, 06 Nov 2018 13:52:53 +0100,
<twischer@de.adit-jv.com> wrote:
> 
> From: Laxmi Devi <Laxmi.Devi@in.bosch.com>
> 
> These changes are required due to the kernel commit 07b7acb51d283d8469696c906b91f1882696a4d4
> ("ASoC: rsnd: update pointer more accurate")
> 
> Issue is that snd_pcm_wait() goes back to waiting because the hw_ptr
> is not period aligned. Therefore snd_pcm_wait() will block for a longer
> time as required.
> 
> With these rcar driver changes the exact position of the dma is returned.
> During snd_pcm_start they read hw_ptr as reference, and this hw_ptr
> is now not period aligned, and is a little ahead over the period while it
> is read. Therefore when the avail is calculated during snd_pcm_wait(),
> it is missing the avail_min by a few frames.
> Consider the below example:
> 
> Considering the application is trying to write 0x120 frames and the
> period_size = 0x60, avail_min = 0x120 and buffersize = 0x360 :
> 
> rsnd_pointer=0x12c -> dma pointer at the end of second write during
> snd_pcm_dmix_start().
> Since another 0x120 buffer is available, application writes 0x120 and goes
> to snd_pcm_wait().
> It is woken up after 3 snd_pcm_period_elapsed() to see rsnd_pointer=0x248.
> So hw_ptr = new_slave_hw_ptr - reference_slave_hw_ptr = 0x248 - 0x12c = 0x11c.
> It needs 4 more frames to be able to write. And so it goes back to waiting.
> 
> But since 3 snd_pcm_period_elapsed(), 3 periods should be available and it
> should have been able to write.
> If rsnd_pointer during the start was 0x120 which is 3 periods
> then 0x248 - 0x120 =  128 it could go on with write.
> 
> Signed-off-by: Laxmi Devi <Laxmi.Devi@in.bosch.com>
> Signed-off-by: Timo Wischer <twischer@de.adit-jv.com>
> ---
> 
> With this patch the slave_hw_ptr will always contain a period align value.

By enforcing that, application won't be able to get the fine-grained
hwptr update any longer, hence it is a clear regression.

That is, we can't apply this as is.  If any, such adjustment has to be
applied only conditionally for certain hardware setup.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH - dmix v3 0/1] pcm: dmix: Align slave_hw_ptr to slave period boundary
  2018-11-06 12:48 [PATCH - dmix v3 0/1] pcm: dmix: Align slave_hw_ptr to slave period boundary Timo Wischer
  2018-11-06 12:52 ` [PATCH - dmix v3 1/1] " twischer
@ 2018-11-06 14:27 ` Takashi Iwai
  2018-11-06 16:15   ` Timo Wischer
  1 sibling, 1 reply; 6+ messages in thread
From: Takashi Iwai @ 2018-11-06 14:27 UTC (permalink / raw)
  To: Timo Wischer; +Cc: Laxmi Devi, alsa-devel

On Tue, 06 Nov 2018 13:48:08 +0100,
Timo Wischer wrote:
> 
> On 10/30/18 14:21, Takashi Iwai wrote:
> > Yes, and is this correct?  Suppose you start a stream at the position
> > one sample before the next period (psize * N + (psize-1)). Then
> > slave_hw_ptr is psize * N.  At the next moment, the dpcm->hw.ptr
> > reaches to psize * (N+1), and snd_pcm_dmix_sync_ptr() gets called.
> > Then slave_hw_ptr will be updated to psize * (N+1) although only one
> > sample has been processed?
> 
> I have created a spreadsheet to simulate the behavior of dmix (see
> attachment). I used this sheet to test different corner cases with
> different implementations.
> Column B-D describes 3 corner cases with the original
> implementation. Column E-H describes the corner cases with patch v2
> applied and column I-L describes the corner cases with patch v3.
> With patch v3 I was not able to find a corner case which would
> fail. All our internal audio test are also passing fine.
> 
> The corner cases are described in the form
> N*period+period-1
> (N+1)period+1
> (N+2)period+1
> 
> The first line describes the value of dmix->spcm->hw.ptr when
> snd_pcm_dmix_start() will be called.
> The second line describes the hw_ptr when poll() returns for the first
> time and the third line describes the hw_ptr when poll() returns for
> the second time.
> 
> In column B is the issue case described when the patch is not
> applied. snd_pcm_avail() returns 2 frames only (B27) but the buffer
> only contains 3 frames (B18).
> Therefore up to 5 frames are available (a buffer size of 8 frames is
> choosen).
> 
> The issue case of patch v2 is shown in column H. snd_pcm_avail()
> returns 7 frames (H46) but the buffer contains still 3 frames
> (H37). Therefore there are only 5 frames free to overwrite.

Hrm, sorry, it's still not clear to me what you've tested and what
these items represent.

> Do you see any corner cases which I missed or any other drawbacks?

I guess that the biggest issue is the understanding of PCM period
wakeup; let me cite the description of a part of your patch:

 "But since 3 snd_pcm_period_elapsed(), 3 periods should be available
  and it should have been able to write.
  If rsnd_pointer during the start was 0x120 which is 3 periods
  then 0x248 - 0x120 =  128 it could go on with write."

This assumption can't be applied.

However, the current implementation of dmix is designed to achieve the
lowest latency by putting the data at the exact position being played
back now.  This, of course, doesn't guarantee the period wakeup = one
period free to fill like the above.  So it's the design choice.

The current code has a workaround for the case of nperiods <= 2, by
setting the initial slave_appl_ptr to the next period start.  This
guarantees that the period wakeup = one period to be filled.  But its
cost is the start latency; the playback doesn't start immediately but
wait until the next period start.


Takashi
_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH - dmix v3 0/1] pcm: dmix: Align slave_hw_ptr to slave period boundary
  2018-11-06 14:27 ` [PATCH - dmix v3 0/1] " Takashi Iwai
@ 2018-11-06 16:15   ` Timo Wischer
  2018-11-06 16:32     ` Takashi Iwai
  0 siblings, 1 reply; 6+ messages in thread
From: Timo Wischer @ 2018-11-06 16:15 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Laxmi Devi, alsa-devel

On 11/6/18 15:27, Takashi Iwai wrote:
> I guess that the biggest issue is the understanding of PCM period
> wakeup; let me cite the description of a part of your patch:
>
>   "But since 3 snd_pcm_period_elapsed(), 3 periods should be available
>    and it should have been able to write.
>    If rsnd_pointer during the start was 0x120 which is 3 periods
>    then 0x248 - 0x120 =  128 it could go on with write."
>
> This assumption can't be applied.
>
> However, the current implementation of dmix is designed to achieve the
> lowest latency by putting the data at the exact position being played
> back now.  This, of course, doesn't guarantee the period wakeup = one
> period free to fill like the above.  So it's the design choice.
>
> The current code has a workaround for the case of nperiods <= 2, by
> setting the initial slave_appl_ptr to the next period start.  This
> guarantees that the period wakeup = one period to be filled.  But its
> cost is the start latency; the playback doesn't start immediately but
> wait until the next period start.
>
>
> Takashi

Thanks for this explanation. That helps me to understand.

Now, I am thinking about a solution to align the slave_hw_ptr and 
slave_appl_ptr to the slave_period (round down) only on start up.
So we keep this low latency. (At start up it is the same behavior as 
known from old sound drivers)

As known from the past in worst case it will drop up to slave_period-1 
frames.
If the application uses big periods and the drop is not acceptable a 
small slave_period using the var_period feature could be configured.
But I think it is also not acceptable for an application which is using 
really big periods to wait sometimes one period longer.

There is always a higher probability of an under run whenever a poll() 
wakeup will not result in a write.
For example on systems with high load this could be an issue.

If you are anyway not happy with this solution may be we can think about 
an option to disable this feature
like "min_drop_on_start true"
But then we also should describe the drawbacks of this option like
* Delay of snd_pcm_wait() up to 2*periods
* Higher probability of Xruns


Best regards

Timo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH - dmix v3 0/1] pcm: dmix: Align slave_hw_ptr to slave period boundary
  2018-11-06 16:15   ` Timo Wischer
@ 2018-11-06 16:32     ` Takashi Iwai
  0 siblings, 0 replies; 6+ messages in thread
From: Takashi Iwai @ 2018-11-06 16:32 UTC (permalink / raw)
  To: Timo Wischer; +Cc: Laxmi Devi, alsa-devel

On Tue, 06 Nov 2018 17:15:31 +0100,
Timo Wischer wrote:
> 
> On 11/6/18 15:27, Takashi Iwai wrote:
> > I guess that the biggest issue is the understanding of PCM period
> > wakeup; let me cite the description of a part of your patch:
> >
> >   "But since 3 snd_pcm_period_elapsed(), 3 periods should be available
> >    and it should have been able to write.
> >    If rsnd_pointer during the start was 0x120 which is 3 periods
> >    then 0x248 - 0x120 =  128 it could go on with write."
> >
> > This assumption can't be applied.
> >
> > However, the current implementation of dmix is designed to achieve the
> > lowest latency by putting the data at the exact position being played
> > back now.  This, of course, doesn't guarantee the period wakeup = one
> > period free to fill like the above.  So it's the design choice.
> >
> > The current code has a workaround for the case of nperiods <= 2, by
> > setting the initial slave_appl_ptr to the next period start.  This
> > guarantees that the period wakeup = one period to be filled.  But its
> > cost is the start latency; the playback doesn't start immediately but
> > wait until the next period start.
> >
> >
> > Takashi
> 
> Thanks for this explanation. That helps me to understand.
> 
> Now, I am thinking about a solution to align the slave_hw_ptr and
> slave_appl_ptr to the slave_period (round down) only on start up.
> So we keep this low latency. (At start up it is the same behavior as
> known from old sound drivers)
> 
> As known from the past in worst case it will drop up to slave_period-1 
> frames.
> If the application uses big periods and the drop is not acceptable a
> small slave_period using the var_period feature could be configured.
> But I think it is also not acceptable for an application which is
> using really big periods to wait sometimes one period longer.

Well, which case is acceptable and which isn't -- we can't define it
in our side so easily.  This really depends on how the application is
written and how the hardware setup is.

For example, imagine to play a very short beep sound with dmix.
Dropping samples (at most period-1 size) may result in silence, and
that's not acceptable.

> There is always a higher probability of an under run whenever a poll()
> wakeup will not result in a write.
> For example on systems with high load this could be an issue.
> 
> If you are anyway not happy with this solution may be we can think
> about an option to disable this feature
> like "min_drop_on_start true"
> But then we also should describe the drawbacks of this option like
> * Delay of snd_pcm_wait() up to 2*periods
> * Higher probability of Xruns

This probably depends on how many periods are present on the buffer.
For the standard PC onboard sounds, 16 periods are used as default,
and that's enough for avoiding xruns for normal applications.

I guess we may provide an option for choosing one of the following
modes:

- current default mode;
  no latency, but wakeup may be delayed, and may require 2*periods

- appl_ptr round up to the next period;
  the current workaround for periods <= 2.
  This imposes the start latency.

- round down to the current period start;
  No latency, but will drop a few samples at start.

... and ideally, provide a mode "auto" for choosing the appropriate
one depending on the situation by a smart guess.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-11-06 16:32 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-06 12:48 [PATCH - dmix v3 0/1] pcm: dmix: Align slave_hw_ptr to slave period boundary Timo Wischer
2018-11-06 12:52 ` [PATCH - dmix v3 1/1] " twischer
2018-11-06 13:14   ` Takashi Iwai
2018-11-06 14:27 ` [PATCH - dmix v3 0/1] " Takashi Iwai
2018-11-06 16:15   ` Timo Wischer
2018-11-06 16:32     ` Takashi Iwai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.