All of lore.kernel.org
 help / color / mirror / Atom feed
* Timer instability
@ 2009-02-19  2:46 Lennart Poettering
  2009-02-19  6:52 ` Takashi Iwai
  0 siblings, 1 reply; 47+ messages in thread
From: Lennart Poettering @ 2009-02-19  2:46 UTC (permalink / raw)
  To: ALSA Development Mailing List

Heya!

As you might now PulseAudio schedules audio by system timers, not by
sound card interrupts. Unfortunately there are are couple of issues
with the reliability of the timing/buffering information we get from
ALSA. 

For example, on an ens1371 I have tracked down the following issue: I
use the full buffer that the sound card provides (64K, i.e. 371ms). I
sleep for about 350 ms, then fill up again, and sleep 350ms, and so
and so on. Everytime I wake up I check snd_pcm_avail() and write
exactly as much as this call tells me to. I also listen to POLLOUT
events on the fd, however I call snd_pcm_sw_params_set_period_event(,
0) for it to make sure I actually don't get any real events on that.

After a couple of minutes of running this loop I can reliably
reproduce that _avail() tells me I should fill the buffer up, I do so,
query it again and it shows the buffer is filled up. I then go to
sleep but immediately get POLLIN (although I shouldnt, given that I
called snd_pcm_sw_params_set_period_event()), wake up again, call
snd_pcm_avail() and according to it the buffer ran completely
empty. The poll() took less than 1 ms time, and the last thing I did
before going to sleep was checking that the buffer was full. So
certainly ALSA is lying to me here... The buffer couldn't have run
empty in less than 1 ms!

The same with real numbers as one example:

  1. ... we are woken up
  2. _avail() returns 15496 samples (i.e. 350ms to fill up)
  3. we gather and write 15496 samples
  4. _avail() returns 48 samples, which we consider "filled up enough" to go to sleep
  5. we call poll()
  6. after < 1ms we are woken up by a POLLIN (which we actually thought we had explicitly disabled)
  7. _avail() returns 16448 samples (i.e. 372ms to fill up -- larger than the actual buffer of 16384 samples!)  
  
This smells a lot like some kind of overflow to me, given that in every
case where I could reproduce this it was possible to 'fix' the issue
by substracting the buffer size from the _avail() after the
poll(). After that substraction the value started to make sense
again. (i.e. in the example above it then becomes 64 samples, which is
reasonable after sleep less than 1ms I guess).

The stop threshold is set to the boundary here btw.

If necessary I could extract a test case for this, but my hope that
you guys might have an idea what goes on without a test case, given
that this smells so "overflowy" ;-)

And of course, how come ALSA triggers POLLIN if I asked it not to?

This is alsa-libs 1.0.19 on a 2.6.27 kernel.

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-19  2:46 Timer instability Lennart Poettering
@ 2009-02-19  6:52 ` Takashi Iwai
  2009-02-20  1:22   ` Lennart Poettering
  0 siblings, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-02-19  6:52 UTC (permalink / raw)
  To: Lennart Poettering; +Cc: ALSA Development Mailing List

At Thu, 19 Feb 2009 03:46:11 +0100,
Lennart Poettering wrote:
> 
> Heya!
> 
> As you might now PulseAudio schedules audio by system timers, not by
> sound card interrupts. Unfortunately there are are couple of issues
> with the reliability of the timing/buffering information we get from
> ALSA. 
> 
> For example, on an ens1371 I have tracked down the following issue: I
> use the full buffer that the sound card provides (64K, i.e. 371ms). I
> sleep for about 350 ms, then fill up again, and sleep 350ms, and so
> and so on. Everytime I wake up I check snd_pcm_avail() and write
> exactly as much as this call tells me to. I also listen to POLLOUT
> events on the fd, however I call snd_pcm_sw_params_set_period_event(,
> 0) for it to make sure I actually don't get any real events on that.
> 
> After a couple of minutes of running this loop I can reliably
> reproduce that _avail() tells me I should fill the buffer up, I do so,
> query it again and it shows the buffer is filled up. I then go to
> sleep but immediately get POLLIN (although I shouldnt, given that I
> called snd_pcm_sw_params_set_period_event()), wake up again, call
> snd_pcm_avail() and according to it the buffer ran completely
> empty. The poll() took less than 1 ms time, and the last thing I did
> before going to sleep was checking that the buffer was full. So
> certainly ALSA is lying to me here... The buffer couldn't have run
> empty in less than 1 ms!
> 
> The same with real numbers as one example:
> 
>   1. ... we are woken up
>   2. _avail() returns 15496 samples (i.e. 350ms to fill up)
>   3. we gather and write 15496 samples
>   4. _avail() returns 48 samples, which we consider "filled up enough" to go to sleep
>   5. we call poll()
>   6. after < 1ms we are woken up by a POLLIN (which we actually thought we had explicitly disabled)
>   7. _avail() returns 16448 samples (i.e. 372ms to fill up -- larger than the actual buffer of 16384 samples!)  
>   
> This smells a lot like some kind of overflow to me, given that in every
> case where I could reproduce this it was possible to 'fix' the issue
> by substracting the buffer size from the _avail() after the
> poll(). After that substraction the value started to make sense
> again. (i.e. in the example above it then becomes 64 samples, which is
> reasonable after sleep less than 1ms I guess).
> 
> The stop threshold is set to the boundary here btw.
> 
> If necessary I could extract a test case for this, but my hope that
> you guys might have an idea what goes on without a test case, given
> that this smells so "overflowy" ;-)
> 
> And of course, how come ALSA triggers POLLIN if I asked it not to?

Are you using *_period_event() stuff?
Unless you use dmix, this is the only place where strange POLLIN may
come in.  There are many new hacks there...

Try to run without that.  Also make sure that you use the very latest
alsa-lib snapshot.


Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-19  6:52 ` Takashi Iwai
@ 2009-02-20  1:22   ` Lennart Poettering
  2009-02-20  1:26     ` Lennart Poettering
                       ` (2 more replies)
  0 siblings, 3 replies; 47+ messages in thread
From: Lennart Poettering @ 2009-02-20  1:22 UTC (permalink / raw)
  To: alsa-devel

On Thu, 19.02.09 07:52, Takashi Iwai (tiwai@suse.de) wrote:

> 
> At Thu, 19 Feb 2009 03:46:11 +0100,
> Lennart Poettering wrote:
> > 
> > Heya!
> > 
> > As you might now PulseAudio schedules audio by system timers, not by
> > sound card interrupts. Unfortunately there are are couple of issues
> > with the reliability of the timing/buffering information we get from
> > ALSA. 
> > 
> > For example, on an ens1371 I have tracked down the following issue: I
> > use the full buffer that the sound card provides (64K, i.e. 371ms). I
> > sleep for about 350 ms, then fill up again, and sleep 350ms, and so
> > and so on. Everytime I wake up I check snd_pcm_avail() and write
> > exactly as much as this call tells me to. I also listen to POLLOUT
> > events on the fd, however I call snd_pcm_sw_params_set_period_event(,
> > 0) for it to make sure I actually don't get any real events on that.
> > 
> > After a couple of minutes of running this loop I can reliably
> > reproduce that _avail() tells me I should fill the buffer up, I do so,
> > query it again and it shows the buffer is filled up. I then go to
> > sleep but immediately get POLLIN (although I shouldnt, given that I
> > called snd_pcm_sw_params_set_period_event()), wake up again, call
> > snd_pcm_avail() and according to it the buffer ran completely
> > empty. The poll() took less than 1 ms time, and the last thing I did
> > before going to sleep was checking that the buffer was full. So
> > certainly ALSA is lying to me here... The buffer couldn't have run
> > empty in less than 1 ms!
> > 
> > The same with real numbers as one example:
> > 
> >   1. ... we are woken up
> >   2. _avail() returns 15496 samples (i.e. 350ms to fill up)
> >   3. we gather and write 15496 samples
> >   4. _avail() returns 48 samples, which we consider "filled up enough" to go to sleep
> >   5. we call poll()
> >   6. after < 1ms we are woken up by a POLLIN (which we actually thought we had explicitly disabled)
> >   7. _avail() returns 16448 samples (i.e. 372ms to fill up -- larger than the actual buffer of 16384 samples!)  
> >   
> > This smells a lot like some kind of overflow to me, given that in every
> > case where I could reproduce this it was possible to 'fix' the issue
> > by substracting the buffer size from the _avail() after the
> > poll(). After that substraction the value started to make sense
> > again. (i.e. in the example above it then becomes 64 samples, which is
> > reasonable after sleep less than 1ms I guess).
> > 
> > The stop threshold is set to the boundary here btw.
> > 
> > If necessary I could extract a test case for this, but my hope that
> > you guys might have an idea what goes on without a test case, given
> > that this smells so "overflowy" ;-)
> > 
> > And of course, how come ALSA triggers POLLIN if I asked it not to?
> 
> Are you using *_period_event() stuff?

Yes, I am. But ignore that part for now. I have now commented the use
of that call. Now I certainly get more POLLOUTs as expected, but the
real problem stays: after a few minutes _avail() will suddenly jump
from next to zero to more then the hwbuf size in less than 1ms without
any further inteference and with a buffer size of 350ms! There is
something really wrong with the behaviour of _avail().

I can reproduce this only on ens1371 for now.

> Unless you use dmix, this is the only place where strange POLLIN may
> come in.  There are many new hacks there...

There is no plug in used but "hw".

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-20  1:22   ` Lennart Poettering
@ 2009-02-20  1:26     ` Lennart Poettering
  2009-02-20  1:50     ` Lennart Poettering
  2009-02-20  7:26     ` Takashi Iwai
  2 siblings, 0 replies; 47+ messages in thread
From: Lennart Poettering @ 2009-02-20  1:26 UTC (permalink / raw)
  To: alsa-devel

On Fri, 20.02.09 02:22, Lennart Poettering (mznyfn@0pointer.de) wrote:

> 
> On Thu, 19.02.09 07:52, Takashi Iwai (tiwai@suse.de) wrote:
> 
> > 
> > At Thu, 19 Feb 2009 03:46:11 +0100,
> > Lennart Poettering wrote:
> > > 
> > > Heya!
> > > 
> > > As you might now PulseAudio schedules audio by system timers, not by
> > > sound card interrupts. Unfortunately there are are couple of issues
> > > with the reliability of the timing/buffering information we get from
> > > ALSA. 
> > > 
> > > For example, on an ens1371 I have tracked down the following issue: I
> > > use the full buffer that the sound card provides (64K, i.e. 371ms). I
> > > sleep for about 350 ms, then fill up again, and sleep 350ms, and so
> > > and so on. Everytime I wake up I check snd_pcm_avail() and write
> > > exactly as much as this call tells me to. I also listen to POLLOUT
> > > events on the fd, however I call snd_pcm_sw_params_set_period_event(,
> > > 0) for it to make sure I actually don't get any real events on that.
> > > 
> > > After a couple of minutes of running this loop I can reliably
> > > reproduce that _avail() tells me I should fill the buffer up, I do so,
> > > query it again and it shows the buffer is filled up. I then go to
> > > sleep but immediately get POLLIN (although I shouldnt, given that I
> > > called snd_pcm_sw_params_set_period_event()), wake up again, call
> > > snd_pcm_avail() and according to it the buffer ran completely
> > > empty. The poll() took less than 1 ms time, and the last thing I did
> > > before going to sleep was checking that the buffer was full. So
> > > certainly ALSA is lying to me here... The buffer couldn't have run
> > > empty in less than 1 ms!
> > > 
> > > The same with real numbers as one example:
> > > 
> > >   1. ... we are woken up
> > >   2. _avail() returns 15496 samples (i.e. 350ms to fill up)
> > >   3. we gather and write 15496 samples
> > >   4. _avail() returns 48 samples, which we consider "filled up enough" to go to sleep
> > >   5. we call poll()
> > >   6. after < 1ms we are woken up by a POLLIN (which we actually thought we had explicitly disabled)
> > >   7. _avail() returns 16448 samples (i.e. 372ms to fill up -- larger than the actual buffer of 16384 samples!)  
> > >   
> > > This smells a lot like some kind of overflow to me, given that in every
> > > case where I could reproduce this it was possible to 'fix' the issue
> > > by substracting the buffer size from the _avail() after the
> > > poll(). After that substraction the value started to make sense
> > > again. (i.e. in the example above it then becomes 64 samples, which is
> > > reasonable after sleep less than 1ms I guess).
> > > 
> > > The stop threshold is set to the boundary here btw.
> > > 
> > > If necessary I could extract a test case for this, but my hope that
> > > you guys might have an idea what goes on without a test case, given
> > > that this smells so "overflowy" ;-)
> > > 
> > > And of course, how come ALSA triggers POLLIN if I asked it not to?
> > 
> > Are you using *_period_event() stuff?
> 
> Yes, I am. But ignore that part for now. I have now commented the use
> of that call. Now I certainly get more POLLOUTs as expected, but the
> real problem stays: after a few minutes _avail() will suddenly jump
> from next to zero to more then the hwbuf size in less than 1ms without
> any further inteference and with a buffer size of 350ms! There is
> something really wrong with the behaviour of _avail().
> 
> I can reproduce this only on ens1371 for now.

Oh, and upgrading to 2.6.29-rc5 had no effect either. Problem stays.

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-20  1:22   ` Lennart Poettering
  2009-02-20  1:26     ` Lennart Poettering
@ 2009-02-20  1:50     ` Lennart Poettering
  2009-02-20  7:26     ` Takashi Iwai
  2 siblings, 0 replies; 47+ messages in thread
From: Lennart Poettering @ 2009-02-20  1:50 UTC (permalink / raw)
  To: alsa-devel

On Fri, 20.02.09 02:22, Lennart Poettering (mznyfn@0pointer.de) wrote:

> > > 
> > > If necessary I could extract a test case for this, but my hope that
> > > you guys might have an idea what goes on without a test case, given
> > > that this smells so "overflowy" ;-)
> > > 
> > > And of course, how come ALSA triggers POLLIN if I asked it not to?
> > 
> > Are you using *_period_event() stuff?
> 
> Yes, I am. But ignore that part for now. I have now commented the use
> of that call. Now I certainly get more POLLOUTs as expected, but the
> real problem stays: after a few minutes _avail() will suddenly jump
> from next to zero to more then the hwbuf size in less than 1ms without
> any further inteference and with a buffer size of 350ms! There is
> something really wrong with the behaviour of _avail().
> 
> I can reproduce this only on ens1371 for now.

Actually it is worse than just the issue mentioned above. I also can
reproduce that _avail() tells me the device is almost completely full
(i.e. 370ms of data), and when i then go to sleep for 350ms then
afterwards _avail() will tell me that about 700ms are missing. From
the context the first _avail() was just wrong.

In summary: _avail() is completely unreliable on ens1371. It sometimes
returns values that are too high by once the buffer size, and
sometimes too low by once the buffer size.

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-20  1:22   ` Lennart Poettering
  2009-02-20  1:26     ` Lennart Poettering
  2009-02-20  1:50     ` Lennart Poettering
@ 2009-02-20  7:26     ` Takashi Iwai
  2009-02-20 20:34       ` Lennart Poettering
  2 siblings, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-02-20  7:26 UTC (permalink / raw)
  To: Lennart Poettering; +Cc: alsa-devel

At Fri, 20 Feb 2009 02:22:20 +0100,
Lennart Poettering wrote:
> 
> On Thu, 19.02.09 07:52, Takashi Iwai (tiwai@suse.de) wrote:
> 
> > 
> > At Thu, 19 Feb 2009 03:46:11 +0100,
> > Lennart Poettering wrote:
> > > 
> > > Heya!
> > > 
> > > As you might now PulseAudio schedules audio by system timers, not by
> > > sound card interrupts. Unfortunately there are are couple of issues
> > > with the reliability of the timing/buffering information we get from
> > > ALSA. 
> > > 
> > > For example, on an ens1371 I have tracked down the following issue: I
> > > use the full buffer that the sound card provides (64K, i.e. 371ms). I
> > > sleep for about 350 ms, then fill up again, and sleep 350ms, and so
> > > and so on. Everytime I wake up I check snd_pcm_avail() and write
> > > exactly as much as this call tells me to. I also listen to POLLOUT
> > > events on the fd, however I call snd_pcm_sw_params_set_period_event(,
> > > 0) for it to make sure I actually don't get any real events on that.
> > > 
> > > After a couple of minutes of running this loop I can reliably
> > > reproduce that _avail() tells me I should fill the buffer up, I do so,
> > > query it again and it shows the buffer is filled up. I then go to
> > > sleep but immediately get POLLIN (although I shouldnt, given that I
> > > called snd_pcm_sw_params_set_period_event()), wake up again, call
> > > snd_pcm_avail() and according to it the buffer ran completely
> > > empty. The poll() took less than 1 ms time, and the last thing I did
> > > before going to sleep was checking that the buffer was full. So
> > > certainly ALSA is lying to me here... The buffer couldn't have run
> > > empty in less than 1 ms!
> > > 
> > > The same with real numbers as one example:
> > > 
> > >   1. ... we are woken up
> > >   2. _avail() returns 15496 samples (i.e. 350ms to fill up)
> > >   3. we gather and write 15496 samples
> > >   4. _avail() returns 48 samples, which we consider "filled up enough" to go to sleep
> > >   5. we call poll()
> > >   6. after < 1ms we are woken up by a POLLIN (which we actually thought we had explicitly disabled)
> > >   7. _avail() returns 16448 samples (i.e. 372ms to fill up -- larger than the actual buffer of 16384 samples!)  
> > >   
> > > This smells a lot like some kind of overflow to me, given that in every
> > > case where I could reproduce this it was possible to 'fix' the issue
> > > by substracting the buffer size from the _avail() after the
> > > poll(). After that substraction the value started to make sense
> > > again. (i.e. in the example above it then becomes 64 samples, which is
> > > reasonable after sleep less than 1ms I guess).
> > > 
> > > The stop threshold is set to the boundary here btw.
> > > 
> > > If necessary I could extract a test case for this, but my hope that
> > > you guys might have an idea what goes on without a test case, given
> > > that this smells so "overflowy" ;-)
> > > 
> > > And of course, how come ALSA triggers POLLIN if I asked it not to?
> > 
> > Are you using *_period_event() stuff?
> 
> Yes, I am. But ignore that part for now. I have now commented the use
> of that call. Now I certainly get more POLLOUTs as expected, but the
> real problem stays: after a few minutes _avail() will suddenly jump
> from next to zero to more then the hwbuf size in less than 1ms without
> any further inteference and with a buffer size of 350ms! There is
> something really wrong with the behaviour of _avail().
> 
> I can reproduce this only on ens1371 for now.

The ens1371 driver itself is damn simple.  The pointer callback just
returns the read value.  So, it implies that it's basically a hardware
issue.

Does the patch below have any influence?

> > Unless you use dmix, this is the only place where strange POLLIN may
> > come in.  There are many new hacks there...
> 
> There is no plug in used but "hw".

Yes but *_period_event() is implemented in "hw".


thanks,

Takashi

---
diff --git a/sound/pci/ens1370.c b/sound/pci/ens1370.c
index 18f4d1e..3a7739b 100644
--- a/sound/pci/ens1370.c
+++ b/sound/pci/ens1370.c
@@ -991,15 +991,36 @@ static int snd_ensoniq_capture_prepare(struct snd_pcm_substream *substream)
 	return 0;
 }
 
+#define MAX_PTR_TIMEOUT		50
+
+static unsigned int get_cur_ptr(struct ensoniq *ensoniq, unsigned int map,
+				unsigned int reg)
+{
+	unsigned int optr, ptr;
+	int t;
+
+	outl(map, ES_REG(ensoniq, MEM_PAGE));
+	inl(ES_REG(ensoniq, MEM_PAGE));
+	optr = inl(reg);
+	for (t = 0; t < MAX_PTR_TIMEOUT; t++) {
+		ptr = inl(reg);
+		if (ptr == optr)
+			return ptr;
+		optr = ptr;
+	}
+	return ptr;
+}
+
 static snd_pcm_uframes_t snd_ensoniq_playback1_pointer(struct snd_pcm_substream *substream)
 {
 	struct ensoniq *ensoniq = snd_pcm_substream_chip(substream);
 	size_t ptr;
 
 	spin_lock(&ensoniq->reg_lock);
-	if (inl(ES_REG(ensoniq, CONTROL)) & ES_DAC1_EN) {
-		outl(ES_MEM_PAGEO(ES_PAGE_DAC), ES_REG(ensoniq, MEM_PAGE));
-		ptr = ES_REG_FCURR_COUNTI(inl(ES_REG(ensoniq, DAC1_SIZE)));
+	if (ensoniq->ctrl & ES_DAC1_EN) {
+		ptr = get_cur_ptr(ensoniq, ES_MEM_PAGEO(ES_PAGE_DAC),
+				  ES_REG(ensoniq, DAC1_SIZE));
+		ptr = ES_REG_FCURR_COUNTI(ptr);
 		ptr = bytes_to_frames(substream->runtime, ptr);
 	} else {
 		ptr = 0;
@@ -1014,9 +1035,10 @@ static snd_pcm_uframes_t snd_ensoniq_playback2_pointer(struct snd_pcm_substream
 	size_t ptr;
 
 	spin_lock(&ensoniq->reg_lock);
-	if (inl(ES_REG(ensoniq, CONTROL)) & ES_DAC2_EN) {
-		outl(ES_MEM_PAGEO(ES_PAGE_DAC), ES_REG(ensoniq, MEM_PAGE));
-		ptr = ES_REG_FCURR_COUNTI(inl(ES_REG(ensoniq, DAC2_SIZE)));
+	if (ensoniq->ctrl & ES_DAC2_EN) {
+		ptr = get_cur_ptr(ensoniq, ES_MEM_PAGEO(ES_PAGE_DAC),
+				  ES_REG(ensoniq, DAC2_SIZE));
+		ptr = ES_REG_FCURR_COUNTI(ptr);
 		ptr = bytes_to_frames(substream->runtime, ptr);
 	} else {
 		ptr = 0;
@@ -1031,9 +1053,10 @@ static snd_pcm_uframes_t snd_ensoniq_capture_pointer(struct snd_pcm_substream *s
 	size_t ptr;
 
 	spin_lock(&ensoniq->reg_lock);
-	if (inl(ES_REG(ensoniq, CONTROL)) & ES_ADC_EN) {
-		outl(ES_MEM_PAGEO(ES_PAGE_ADC), ES_REG(ensoniq, MEM_PAGE));
-		ptr = ES_REG_FCURR_COUNTI(inl(ES_REG(ensoniq, ADC_SIZE)));
+	if (ensoniq->ctrl & ES_ADC_EN) {
+		ptr = get_cur_ptr(ensoniq, ES_MEM_PAGEO(ES_PAGE_ADC),
+				  ES_REG(ensoniq, ADC_SIZE));
+		ptr = ES_REG_FCURR_COUNTI(ptr);
 		ptr = bytes_to_frames(substream->runtime, ptr);
 	} else {
 		ptr = 0;

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-20  7:26     ` Takashi Iwai
@ 2009-02-20 20:34       ` Lennart Poettering
  2009-02-21 16:36         ` Takashi Iwai
  0 siblings, 1 reply; 47+ messages in thread
From: Lennart Poettering @ 2009-02-20 20:34 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

On Fri, 20.02.09 08:26, Takashi Iwai (tiwai@suse.de) wrote:

> > Yes, I am. But ignore that part for now. I have now commented the use
> > of that call. Now I certainly get more POLLOUTs as expected, but the
> > real problem stays: after a few minutes _avail() will suddenly jump
> > from next to zero to more then the hwbuf size in less than 1ms without
> > any further inteference and with a buffer size of 350ms! There is
> > something really wrong with the behaviour of _avail().
> > 
> > I can reproduce this only on ens1371 for now.
> 
> The ens1371 driver itself is damn simple.  The pointer callback just
> returns the read value.  So, it implies that it's basically a hardware
> issue.
> 
> Does the patch below have any influence?

No. It doesn't appear to have any effect whatsoever. Still, from time
to time the value returned by _avail() is off by once the buffer size,
sometimes upwards, sometimes downwards.

Moreover I can now reproduce the same issue on snd-intel8x0, albeit it
takes more than a couple of minutes to make snd_pcm_avail() return
bogus data.

Somehow I get the feeling the problem is not so much the inaccuracy of
the pointer the kernel reports but in what alsa-libs does ith it.

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-20 20:34       ` Lennart Poettering
@ 2009-02-21 16:36         ` Takashi Iwai
  2009-02-22  3:14           ` Lennart Poettering
  0 siblings, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-02-21 16:36 UTC (permalink / raw)
  To: Lennart Poettering; +Cc: alsa-devel

At Fri, 20 Feb 2009 21:34:17 +0100,
Lennart Poettering wrote:
> 
> On Fri, 20.02.09 08:26, Takashi Iwai (tiwai@suse.de) wrote:
> 
> > > Yes, I am. But ignore that part for now. I have now commented the use
> > > of that call. Now I certainly get more POLLOUTs as expected, but the
> > > real problem stays: after a few minutes _avail() will suddenly jump
> > > from next to zero to more then the hwbuf size in less than 1ms without
> > > any further inteference and with a buffer size of 350ms! There is
> > > something really wrong with the behaviour of _avail().
> > > 
> > > I can reproduce this only on ens1371 for now.
> > 
> > The ens1371 driver itself is damn simple.  The pointer callback just
> > returns the read value.  So, it implies that it's basically a hardware
> > issue.
> > 
> > Does the patch below have any influence?
> 
> No. It doesn't appear to have any effect whatsoever. Still, from time
> to time the value returned by _avail() is off by once the buffer size,
> sometimes upwards, sometimes downwards.
> 
> Moreover I can now reproduce the same issue on snd-intel8x0, albeit it
> takes more than a couple of minutes to make snd_pcm_avail() return
> bogus data.
> 
> Somehow I get the feeling the problem is not so much the inaccuracy of
> the pointer the kernel reports but in what alsa-libs does ith it.

Hrm... both cases the pointer calculation is relatively simple,
and for "hw" PCM, the avail calculation is also very straightforward.
So, my first suspect is about the pointer calculation.  But, of course
we should check all possibilities.

Can you give a small testcase?


thanks,

Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-21 16:36         ` Takashi Iwai
@ 2009-02-22  3:14           ` Lennart Poettering
  2009-02-23  2:42             ` Lennart Poettering
                               ` (2 more replies)
  0 siblings, 3 replies; 47+ messages in thread
From: Lennart Poettering @ 2009-02-22  3:14 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

On Sat, 21.02.09 17:36, Takashi Iwai (tiwai@suse.de) wrote:

> 
> At Fri, 20 Feb 2009 21:34:17 +0100,
> Lennart Poettering wrote:
> > 
> > On Fri, 20.02.09 08:26, Takashi Iwai (tiwai@suse.de) wrote:
> > 
> > > > Yes, I am. But ignore that part for now. I have now commented the use
> > > > of that call. Now I certainly get more POLLOUTs as expected, but the
> > > > real problem stays: after a few minutes _avail() will suddenly jump
> > > > from next to zero to more then the hwbuf size in less than 1ms without
> > > > any further inteference and with a buffer size of 350ms! There is
> > > > something really wrong with the behaviour of _avail().
> > > > 
> > > > I can reproduce this only on ens1371 for now.
> > > 
> > > The ens1371 driver itself is damn simple.  The pointer callback just
> > > returns the read value.  So, it implies that it's basically a hardware
> > > issue.
> > > 
> > > Does the patch below have any influence?
> > 
> > No. It doesn't appear to have any effect whatsoever. Still, from time
> > to time the value returned by _avail() is off by once the buffer size,
> > sometimes upwards, sometimes downwards.
> > 
> > Moreover I can now reproduce the same issue on snd-intel8x0, albeit it
> > takes more than a couple of minutes to make snd_pcm_avail() return
> > bogus data.
> > 
> > Somehow I get the feeling the problem is not so much the inaccuracy of
> > the pointer the kernel reports but in what alsa-libs does ith it.
> 
> Hrm... both cases the pointer calculation is relatively simple,
> and for "hw" PCM, the avail calculation is also very straightforward.
> So, my first suspect is about the pointer calculation.  But, of course
> we should check all possibilities.
> 
> Can you give a small testcase?

Ok, here's a little test case:

http://git.0pointer.de/?p=pulseaudio.git;a=blob_plain;f=src/tests/alsa-time-test.c

I wrote this little test to track down all kinds of timer issues: it
opens and sets up a sound device. Then it will constantly query
_avail(), _delay() and _htstamp() in a busy loop writing a single
sample at a time. The three values are then printed along with a
system timestamp. The data generated like this can be opened in
gnumeric and a nice graph be drawn.

So much about the original purpose of this tool.

The issue in question can be triggered by simply running this tool and
passing the device name as argv[0]. i.e. "alsa-time-test hw:AudioPCI".

The assert at the end of the file (line 189) is triggered when
_avail() starts to become unreliable and jumps around. 

For me to reproduce this issue all I have to do is run:

  $ alsa-time-test hw:AudioPCI > log

and then wait for a minute or two. The assert is hit the tool aborted.

When that happens one can check the end of 'log' and will see something like this:

<snip>
...
45974071        45974067        45937596        3882    528     4       1       3
45974088        45974084        45937596        3881    529     4       1       3
45974106        45974101        45937596        3880    530     4       1       3
45974122        45974118        45937596        3879    531     4       1       3
45974140        45974136        45937777        3886    524     4       1       3
46025249        45974154        45937777        3885    525     4       1       3
46025279        46025274        45988752        6132    -1722   4       1       3
</snip>

The first column is clock_gettime(CLOCK_MONOTONIC) in us. Second
column in the timestamp stored in the snd_pcm_state_t. The third is
the calculated playback time (i.e. write count - delay).  The fourth
column is avail, the fifth is delay.

In the last two lines (i.e. right when the assert aborted us) it
becomes interesting. Suddenly in 30 us, the buffer jumped from an
avail of 3885 to 6132 (buffer size is 4410). 

I debugged a little bit through alsa-lib. Apparently in contrast to
what I guessed before this issue seems indeed to be caused by a kernel
bug: the hw_ptr right after the SNDRV_PCM_IOCTL_HWSYNC is already
bogus.

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-22  3:14           ` Lennart Poettering
@ 2009-02-23  2:42             ` Lennart Poettering
  2009-02-23  2:56               ` Lennart Poettering
  2009-02-23  7:47             ` Takashi Iwai
  2009-02-24 16:27             ` Takashi Iwai
  2 siblings, 1 reply; 47+ messages in thread
From: Lennart Poettering @ 2009-02-23  2:42 UTC (permalink / raw)
  To: Takashi Iwai, alsa-devel

On Sun, 22.02.09 04:14, Lennart Poettering (mznyfn@0pointer.de) wrote:

> I wrote this little test to track down all kinds of timer issues: it
> opens and sets up a sound device. Then it will constantly query
> _avail(), _delay() and _htstamp() in a busy loop writing a single
> sample at a time. The three values are then printed along with a
> system timestamp. The data generated like this can be opened in
> gnumeric and a nice graph be drawn.

Hmm, did some more testing with this tool on other cards:

On es1969 snd_pcm_avail() sometimes returns values like
1073728596 samples. This smells a lot like an overflow given that this
times four (i.e. the sample size in bytes) is near to 2^32.

Here's an output of this tool for an emu10k1 card:

http://www.nopaste.pl/9vi

Check out line 4417: This is when the devcie starts playback. avail
goes rightfully to 0 (i.e column 4 -- don't get confused by the line
numbers the nopaste adds here), but the playback time suddenly jumps
from 100113 us down to 702 us (column 4 which is the written sample
counter minus the delay transformed to us). This is because in
contrast to most other drivers for this one writing a sample has no
direct effect on snd_pcm_delay while the device is stopped.

Also this shows that the data from _avail() and _status() together is
not atomic: the columns for avail delay change in
different rows when we don't get scheduled often enough.

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-23  2:42             ` Lennart Poettering
@ 2009-02-23  2:56               ` Lennart Poettering
  2009-02-23 19:20                 ` Lennart Poettering
  0 siblings, 1 reply; 47+ messages in thread
From: Lennart Poettering @ 2009-02-23  2:56 UTC (permalink / raw)
  To: alsa-devel; +Cc: Takashi Iwai

On Mon, 23.02.09 03:42, Lennart Poettering (mznyfn@0pointer.de) wrote:

> 
> On Sun, 22.02.09 04:14, Lennart Poettering (mznyfn@0pointer.de) wrote:
> 
> > I wrote this little test to track down all kinds of timer issues: it
> > opens and sets up a sound device. Then it will constantly query
> > _avail(), _delay() and _htstamp() in a busy loop writing a single
> > sample at a time. The three values are then printed along with a
> > system timestamp. The data generated like this can be opened in
> > gnumeric and a nice graph be drawn.
> 
> Hmm, did some more testing with this tool on other cards:
> 
> On es1969 snd_pcm_avail() sometimes returns values like
> 1073728596 samples. This smells a lot like an overflow given that this
> times four (i.e. the sample size in bytes) is near to 2^32.

[...]

> Here's an output of this tool for an emu10k1 card:

[...]

This is the end of the tool's output for HDA AD1989B:

http://fpaste.org/paste/4240

The device is completely filled up (i.e. avail is 0, delay is 4416)
and then after 2us suddenly the avail jumps to 4971973988617027465 and
the delay to -4971973988617023049.

Smells like an integer overflow to me ...

Hmm, for now my little testing experiment I've now found es1969,
ens1371, intel8x0, snd-emu10k1 and some intel-hda to have unreliable
_avail() or _delay().

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-22  3:14           ` Lennart Poettering
  2009-02-23  2:42             ` Lennart Poettering
@ 2009-02-23  7:47             ` Takashi Iwai
  2009-02-24 16:27             ` Takashi Iwai
  2 siblings, 0 replies; 47+ messages in thread
From: Takashi Iwai @ 2009-02-23  7:47 UTC (permalink / raw)
  To: Lennart Poettering; +Cc: alsa-devel

At Sun, 22 Feb 2009 04:14:53 +0100,
Lennart Poettering wrote:
> 
> On Sat, 21.02.09 17:36, Takashi Iwai (tiwai@suse.de) wrote:
> 
> > 
> > At Fri, 20 Feb 2009 21:34:17 +0100,
> > Lennart Poettering wrote:
> > > 
> > > On Fri, 20.02.09 08:26, Takashi Iwai (tiwai@suse.de) wrote:
> > > 
> > > > > Yes, I am. But ignore that part for now. I have now commented the use
> > > > > of that call. Now I certainly get more POLLOUTs as expected, but the
> > > > > real problem stays: after a few minutes _avail() will suddenly jump
> > > > > from next to zero to more then the hwbuf size in less than 1ms without
> > > > > any further inteference and with a buffer size of 350ms! There is
> > > > > something really wrong with the behaviour of _avail().
> > > > > 
> > > > > I can reproduce this only on ens1371 for now.
> > > > 
> > > > The ens1371 driver itself is damn simple.  The pointer callback just
> > > > returns the read value.  So, it implies that it's basically a hardware
> > > > issue.
> > > > 
> > > > Does the patch below have any influence?
> > > 
> > > No. It doesn't appear to have any effect whatsoever. Still, from time
> > > to time the value returned by _avail() is off by once the buffer size,
> > > sometimes upwards, sometimes downwards.
> > > 
> > > Moreover I can now reproduce the same issue on snd-intel8x0, albeit it
> > > takes more than a couple of minutes to make snd_pcm_avail() return
> > > bogus data.
> > > 
> > > Somehow I get the feeling the problem is not so much the inaccuracy of
> > > the pointer the kernel reports but in what alsa-libs does ith it.
> > 
> > Hrm... both cases the pointer calculation is relatively simple,
> > and for "hw" PCM, the avail calculation is also very straightforward.
> > So, my first suspect is about the pointer calculation.  But, of course
> > we should check all possibilities.
> > 
> > Can you give a small testcase?
> 
> Ok, here's a little test case:
> 
> http://git.0pointer.de/?p=pulseaudio.git;a=blob_plain;f=src/tests/alsa-time-test.c

Thanks.  Will check it later.


Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-23  2:56               ` Lennart Poettering
@ 2009-02-23 19:20                 ` Lennart Poettering
  2009-02-23 19:24                   ` Colin Guthrie
  2009-02-24  3:21                   ` Lennart Poettering
  0 siblings, 2 replies; 47+ messages in thread
From: Lennart Poettering @ 2009-02-23 19:20 UTC (permalink / raw)
  To: alsa-devel, Takashi Iwai

On Mon, 23.02.09 03:56, Lennart Poettering (mznyfn@0pointer.de) wrote:

> Hmm, for now my little testing experiment I've now found es1969,
> ens1371, intel8x0, snd-emu10k1 and some intel-hda to have unreliable
> _avail() or _delay().

And here's the output for HDA STAC92xx:

http://pastebin.mandriva.com/7149

The first dump is the interesting one: when audio starts to play the
buffer abruptly runs empty in just 20us after the device started. (The
rightmost column is the state btw, 3 == playing).

A quick overview of cards where _delay() and/or _avail() are unreliable:

 es1969
 ens1371
 intel8x0
 emu10k1
 intel-hda on STAC92xx
 intel-hda on AD1989B

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-23 19:20                 ` Lennart Poettering
@ 2009-02-23 19:24                   ` Colin Guthrie
  2009-02-24  3:21                   ` Lennart Poettering
  1 sibling, 0 replies; 47+ messages in thread
From: Colin Guthrie @ 2009-02-23 19:24 UTC (permalink / raw)
  To: alsa-devel

'Twas brillig, and Lennart Poettering at 23/02/09 19:20 did gyre and gimble:
> On Mon, 23.02.09 03:56, Lennart Poettering (mznyfn@0pointer.de) wrote:
> 
>> Hmm, for now my little testing experiment I've now found es1969,
>> ens1371, intel8x0, snd-emu10k1 and some intel-hda to have unreliable
>> _avail() or _delay().
> 
> And here's the output for HDA STAC92xx:
> 
> http://pastebin.mandriva.com/7149

I realised that post will timeout after a month, so here is a longer 
term one :)

http://pastebin.mandriva.com/7238


> The first dump is the interesting one: when audio starts to play the
> buffer abruptly runs empty in just 20us after the device started. (The
> rightmost column is the state btw, 3 == playing).
> 
> A quick overview of cards where _delay() and/or _avail() are unreliable:
> 
>  es1969
>  ens1371
>  intel8x0
>  emu10k1
>  intel-hda on STAC92xx
>  intel-hda on AD1989B
> 
> Lennart
> 


-- 

Colin Guthrie
gmane(at)colin.guthr.ie
http://colin.guthr.ie/

Day Job:
   Tribalogic Limited [http://www.tribalogic.net/]
Open Source:
   Mandriva Linux Contributor [http://www.mandriva.com/]
   PulseAudio Hacker [http://www.pulseaudio.org/]
   Trac Hacker [http://trac.edgewall.org/]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-23 19:20                 ` Lennart Poettering
  2009-02-23 19:24                   ` Colin Guthrie
@ 2009-02-24  3:21                   ` Lennart Poettering
  1 sibling, 0 replies; 47+ messages in thread
From: Lennart Poettering @ 2009-02-24  3:21 UTC (permalink / raw)
  To: alsa-devel, Takashi Iwai

On Mon, 23.02.09 20:20, Lennart Poettering (mznyfn@0pointer.de) wrote:

> On Mon, 23.02.09 03:56, Lennart Poettering (mznyfn@0pointer.de) wrote:
> 
> > Hmm, for now my little testing experiment I've now found es1969,
> > ens1371, intel8x0, snd-emu10k1 and some intel-hda to have unreliable
> > _avail() or _delay().
> 
> And here's the output for HDA STAC92xx:
> 
> http://pastebin.mandriva.com/7149
> 
> The first dump is the interesting one: when audio starts to play the
> buffer abruptly runs empty in just 20us after the device started. (The
> rightmost column is the state btw, 3 == playing).
> 
> A quick overview of cards where _delay() and/or _avail() are unreliable:
> 
>  es1969
>  ens1371
>  intel8x0
>  emu10k1
>  intel-hda on STAC92xx
>  intel-hda on AD1989B

And here's another one, for intel-hda on ALC883:

Sometimes snd_pcm_avail() returns values like 4611686018427387098.

Smells like another overflow to me. 

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-22  3:14           ` Lennart Poettering
  2009-02-23  2:42             ` Lennart Poettering
  2009-02-23  7:47             ` Takashi Iwai
@ 2009-02-24 16:27             ` Takashi Iwai
  2009-02-24 18:46               ` Lennart Poettering
  2 siblings, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-02-24 16:27 UTC (permalink / raw)
  To: Lennart Poettering; +Cc: alsa-devel

At Sun, 22 Feb 2009 04:14:53 +0100,
Lennart Poettering wrote:
> 
> On Sat, 21.02.09 17:36, Takashi Iwai (tiwai@suse.de) wrote:
> 
> > 
> > At Fri, 20 Feb 2009 21:34:17 +0100,
> > Lennart Poettering wrote:
> > > 
> > > On Fri, 20.02.09 08:26, Takashi Iwai (tiwai@suse.de) wrote:
> > > 
> > > > > Yes, I am. But ignore that part for now. I have now commented the use
> > > > > of that call. Now I certainly get more POLLOUTs as expected, but the
> > > > > real problem stays: after a few minutes _avail() will suddenly jump
> > > > > from next to zero to more then the hwbuf size in less than 1ms without
> > > > > any further inteference and with a buffer size of 350ms! There is
> > > > > something really wrong with the behaviour of _avail().
> > > > > 
> > > > > I can reproduce this only on ens1371 for now.
> > > > 
> > > > The ens1371 driver itself is damn simple.  The pointer callback just
> > > > returns the read value.  So, it implies that it's basically a hardware
> > > > issue.
> > > > 
> > > > Does the patch below have any influence?
> > > 
> > > No. It doesn't appear to have any effect whatsoever. Still, from time
> > > to time the value returned by _avail() is off by once the buffer size,
> > > sometimes upwards, sometimes downwards.
> > > 
> > > Moreover I can now reproduce the same issue on snd-intel8x0, albeit it
> > > takes more than a couple of minutes to make snd_pcm_avail() return
> > > bogus data.
> > > 
> > > Somehow I get the feeling the problem is not so much the inaccuracy of
> > > the pointer the kernel reports but in what alsa-libs does ith it.
> > 
> > Hrm... both cases the pointer calculation is relatively simple,
> > and for "hw" PCM, the avail calculation is also very straightforward.
> > So, my first suspect is about the pointer calculation.  But, of course
> > we should check all possibilities.
> > 
> > Can you give a small testcase?
> 
> Ok, here's a little test case:
> 
> http://git.0pointer.de/?p=pulseaudio.git;a=blob_plain;f=src/tests/alsa-time-test.c

Finally have time to go back to this issue again now.

With your test case, indeed I get abort(), too.  However, checking
through the code, you set stop_threshold to the boundary size like:

    r = snd_pcm_sw_params_get_boundary(swparams, &boundary);
    r = snd_pcm_sw_params_set_stop_threshold(pcm, swparams, boundary);

This means essentially you are disabling the XRUN detection in the
driver and keep it free-running.  So, no wonder avail gets over
buffer_size.  It's actually an XRUN condition, but just ignored due to
this setup.  (Usually, stop_threshould == buffer_size.)

Or do I misunderstand the intention of your testcase?


thanks,

Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-24 16:27             ` Takashi Iwai
@ 2009-02-24 18:46               ` Lennart Poettering
  2009-02-24 18:59                 ` Takashi Iwai
  0 siblings, 1 reply; 47+ messages in thread
From: Lennart Poettering @ 2009-02-24 18:46 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

On Tue, 24.02.09 17:27, Takashi Iwai (tiwai@suse.de) wrote:

> > > > Moreover I can now reproduce the same issue on snd-intel8x0, albeit it
> > > > takes more than a couple of minutes to make snd_pcm_avail() return
> > > > bogus data.
> > > > 
> > > > Somehow I get the feeling the problem is not so much the inaccuracy of
> > > > the pointer the kernel reports but in what alsa-libs does ith it.
> > > 
> > > Hrm... both cases the pointer calculation is relatively simple,
> > > and for "hw" PCM, the avail calculation is also very straightforward.
> > > So, my first suspect is about the pointer calculation.  But, of course
> > > we should check all possibilities.
> > > 
> > > Can you give a small testcase?
> > 
> > Ok, here's a little test case:
> > 
> > http://git.0pointer.de/?p=pulseaudio.git;a=blob_plain;f=src/tests/alsa-time-test.c
> 
> Finally have time to go back to this issue again now.
> 
> With your test case, indeed I get abort(), too.  However, checking
> through the code, you set stop_threshold to the boundary size like:
> 
>     r = snd_pcm_sw_params_get_boundary(swparams, &boundary);
>     r = snd_pcm_sw_params_set_stop_threshold(pcm, swparams, boundary);
> 
> This means essentially you are disabling the XRUN detection in the
> driver and keep it free-running.  So, no wonder avail gets over
> buffer_size.  It's actually an XRUN condition, but just ignored due to
> this setup.  (Usually, stop_threshould == buffer_size.)

Yes, threshold is set to boundary. But that's intended. Remember that
the original purpose of the tool is to graph _avail() and _delay()
against the time. For that I want to make sure the that timing stays
stable no matter what.

In reality this setting should not matter at all, because most machines
should be fast enough to keep the buffer filled even if we write one
sample at a time as it is done in the example.

The problems exposed in all the test data posted here for the various
sound cards are not normal normal buffer underruns.

In summary: just ignore the setting of the stop threshold. It is not
related to the problems exposed here.

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-24 18:46               ` Lennart Poettering
@ 2009-02-24 18:59                 ` Takashi Iwai
  2009-02-24 19:04                   ` Jaroslav Kysela
  2009-02-24 19:26                   ` Lennart Poettering
  0 siblings, 2 replies; 47+ messages in thread
From: Takashi Iwai @ 2009-02-24 18:59 UTC (permalink / raw)
  To: Lennart Poettering; +Cc: alsa-devel

At Tue, 24 Feb 2009 19:46:05 +0100,
Lennart Poettering wrote:
> 
> On Tue, 24.02.09 17:27, Takashi Iwai (tiwai@suse.de) wrote:
> 
> > > > > Moreover I can now reproduce the same issue on snd-intel8x0, albeit it
> > > > > takes more than a couple of minutes to make snd_pcm_avail() return
> > > > > bogus data.
> > > > > 
> > > > > Somehow I get the feeling the problem is not so much the inaccuracy of
> > > > > the pointer the kernel reports but in what alsa-libs does ith it.
> > > > 
> > > > Hrm... both cases the pointer calculation is relatively simple,
> > > > and for "hw" PCM, the avail calculation is also very straightforward.
> > > > So, my first suspect is about the pointer calculation.  But, of course
> > > > we should check all possibilities.
> > > > 
> > > > Can you give a small testcase?
> > > 
> > > Ok, here's a little test case:
> > > 
> > > http://git.0pointer.de/?p=pulseaudio.git;a=blob_plain;f=src/tests/alsa-time-test.c
> > 
> > Finally have time to go back to this issue again now.
> > 
> > With your test case, indeed I get abort(), too.  However, checking
> > through the code, you set stop_threshold to the boundary size like:
> > 
> >     r = snd_pcm_sw_params_get_boundary(swparams, &boundary);
> >     r = snd_pcm_sw_params_set_stop_threshold(pcm, swparams, boundary);
> > 
> > This means essentially you are disabling the XRUN detection in the
> > driver and keep it free-running.  So, no wonder avail gets over
> > buffer_size.  It's actually an XRUN condition, but just ignored due to
> > this setup.  (Usually, stop_threshould == buffer_size.)
> 
> Yes, threshold is set to boundary. But that's intended. Remember that
> the original purpose of the tool is to graph _avail() and _delay()
> against the time. For that I want to make sure the that timing stays
> stable no matter what.
> 
> In reality this setting should not matter at all, because most machines
> should be fast enough to keep the buffer filled even if we write one
> sample at a time as it is done in the example.

Not really.  If you feed the output to a terminal, it's damn slow.
That's why I got XRUN.

> The problems exposed in all the test data posted here for the various
> sound cards are not normal normal buffer underruns.

Well, basically "avail >= buffer_size" means the underrun per
definition.  The state isn't changed and kept as RUNNING just because
of the stop_threshold value.

> In summary: just ignore the setting of the stop threshold. It is not
> related to the problems exposed here.

FWIW, when I run the program and feed the output to /dev/null, I don't
get abort() after minutes.  It happened soon only when it runs on a
terminal.  So, the condition appears very likely as an XRUN.

So, try to set stop_threshold to buffer_size once, and check whether
you get any different result.  Let's check another possibility if it
really isn't the case.


Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-24 18:59                 ` Takashi Iwai
@ 2009-02-24 19:04                   ` Jaroslav Kysela
  2009-02-24 19:26                   ` Lennart Poettering
  1 sibling, 0 replies; 47+ messages in thread
From: Jaroslav Kysela @ 2009-02-24 19:04 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Lennart Poettering

On Tue, 24 Feb 2009, Takashi Iwai wrote:

> > In reality this setting should not matter at all, because most machines
> > should be fast enough to keep the buffer filled even if we write one
> > sample at a time as it is done in the example.
> 
> Not really.  If you feed the output to a terminal, it's damn slow.
> That's why I got XRUN.

I figured this few seconds ago, too. It's the terminal output which 
makes XRUN.

					Jaroslav

-----
Jaroslav Kysela <perex@perex.cz>
Linux Kernel Sound Maintainer
ALSA Project, Red Hat, Inc.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-24 18:59                 ` Takashi Iwai
  2009-02-24 19:04                   ` Jaroslav Kysela
@ 2009-02-24 19:26                   ` Lennart Poettering
  2009-02-24 20:37                     ` Takashi Iwai
  2009-02-24 21:16                     ` Lennart Poettering
  1 sibling, 2 replies; 47+ messages in thread
From: Lennart Poettering @ 2009-02-24 19:26 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

On Tue, 24.02.09 19:59, Takashi Iwai (tiwai@suse.de) wrote:

> > Yes, threshold is set to boundary. But that's intended. Remember that
> > the original purpose of the tool is to graph _avail() and _delay()
> > against the time. For that I want to make sure the that timing stays
> > stable no matter what.
> > 
> > In reality this setting should not matter at all, because most machines
> > should be fast enough to keep the buffer filled even if we write one
> > sample at a time as it is done in the example.
> 
> Not really.  If you feed the output to a terminal, it's damn slow.
> That's why I got XRUN.

Oh, of course, I should have mentioned that.

The tool generates a *lot* of data, given that it queries _avail() and
_delay() in a busy loop.

To run it either redirect the output to disk (which of course might
create gaps in the output if we write faster than the disk can take it
but this is easily detected by looking on the first column). Or
pipe it directly to something like "tail -n 50".

> > In summary: just ignore the setting of the stop threshold. It is not
> > related to the problems exposed here.
> 
> FWIW, when I run the program and feed the output to /dev/null, I don't
> get abort() after minutes.  It happened soon only when it runs on a
> terminal.  So, the condition appears very likely as an XRUN.

The outputs I posted earlier are *not* due to normal underruns. Please
have a look at the actual data!

> So, try to set stop_threshold to buffer_size once, and check whether
> you get any different result.  Let's check another possibility if it
> really isn't the case.

In those test cases no normal XRUN ever happens unless you pipe the
output to a terminal: please, just actaully look at the data
generated!

The effective difference for this tool that it makes to set
stop_threshold to boundary or not is simply this:

    if stop_threshold is set to boundary underruns will be caught by the
    mentioned assert.

    if stop_threshold is not set to boundary underruns result in an EPIPE
    which is then caught by a different assert in the code.

So again: how stop_threshold is set is *irrelevant* for the test case!
It simply would cause a different assert() to be triggered! That is
all. And again, unless you machine is very slow or otherwise busy
_avail() should never come near the buffer size except very early
afetr startup.

stop_threshold is completely irrelevant to the problems discussed
here.

Oh, and one thing I didn't actually notice earlier: Most drivers return
a negative snd_pcm_delay() if a real underrun happens. According to
the definition of snd_pcm_delay() that we agreed on a couple of
months ago and that is now docuemented in doxygen this makes no
sense. The definition of snd_pcm_delay() goes like this:

"For playback the delay is defined as the time that a frame that is
written to the PCM stream shortly after this call will take to be
actually audible. It is as such the overall latency from the write
call to the final DAC." (from the doxygen docs)

I.e. because on the this world it is impossible to hear a sample that
hasn't even been written yet, _delay() should only return positive
values. However many drivers do return negative values on underrun.

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-24 19:26                   ` Lennart Poettering
@ 2009-02-24 20:37                     ` Takashi Iwai
  2009-02-25 10:08                       ` Takashi Iwai
  2009-02-24 21:16                     ` Lennart Poettering
  1 sibling, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-02-24 20:37 UTC (permalink / raw)
  To: Lennart Poettering; +Cc: alsa-devel

At Tue, 24 Feb 2009 20:26:11 +0100,
Lennart Poettering wrote:
> 
> On Tue, 24.02.09 19:59, Takashi Iwai (tiwai@suse.de) wrote:
> 
> > > Yes, threshold is set to boundary. But that's intended. Remember that
> > > the original purpose of the tool is to graph _avail() and _delay()
> > > against the time. For that I want to make sure the that timing stays
> > > stable no matter what.
> > > 
> > > In reality this setting should not matter at all, because most machines
> > > should be fast enough to keep the buffer filled even if we write one
> > > sample at a time as it is done in the example.
> > 
> > Not really.  If you feed the output to a terminal, it's damn slow.
> > That's why I got XRUN.
> 
> Oh, of course, I should have mentioned that.
> 
> The tool generates a *lot* of data, given that it queries _avail() and
> _delay() in a busy loop.
> 
> To run it either redirect the output to disk (which of course might
> create gaps in the output if we write faster than the disk can take it
> but this is easily detected by looking on the first column). Or
> pipe it directly to something like "tail -n 50".

Damn, you should have mentioned it before.  I lost too long time
just for that!

> > > In summary: just ignore the setting of the stop threshold. It is not
> > > related to the problems exposed here.
> > 
> > FWIW, when I run the program and feed the output to /dev/null, I don't
> > get abort() after minutes.  It happened soon only when it runs on a
> > terminal.  So, the condition appears very likely as an XRUN.
> 
> The outputs I posted earlier are *not* due to normal underruns. Please
> have a look at the actual data!

I didn't get any "actual data" because of xrun.

> > So, try to set stop_threshold to buffer_size once, and check whether
> > you get any different result.  Let's check another possibility if it
> > really isn't the case.
> 
> In those test cases no normal XRUN ever happens unless you pipe the
> output to a terminal: please, just actaully look at the data
> generated!

The data isn't generated if XRUN occurs because it stops with
assert() at that point.

> The effective difference for this tool that it makes to set
> stop_threshold to boundary or not is simply this:
> 
>     if stop_threshold is set to boundary underruns will be caught by the
>     mentioned assert.
> 
>     if stop_threshold is not set to boundary underruns result in an EPIPE
>     which is then caught by a different assert in the code.
> 
> So again: how stop_threshold is set is *irrelevant* for the test case!

But the assert caught exaclty the XRUN.  Not *always* irrelevant.

> It simply would cause a different assert() to be triggered! That is
> all. And again, unless you machine is very slow or otherwise busy
> _avail() should never come near the buffer size except very early
> afetr startup.

It came.  Maybe it's slow.

> stop_threshold is completely irrelevant to the problems discussed
> here.

OK, but let me continue that tomorrow or later.

> Oh, and one thing I didn't actually notice earlier: Most drivers return
> a negative snd_pcm_delay() if a real underrun happens. According to
> the definition of snd_pcm_delay() that we agreed on a couple of
> months ago and that is now docuemented in doxygen this makes no
> sense. The definition of snd_pcm_delay() goes like this:
> 
> "For playback the delay is defined as the time that a frame that is
> written to the PCM stream shortly after this call will take to be
> actually audible. It is as such the overall latency from the write
> call to the final DAC." (from the doxygen docs)
> 
> I.e. because on the this world it is impossible to hear a sample that
> hasn't even been written yet, _delay() should only return positive
> values. However many drivers do return negative values on underrun.

Maybe you are walking faster than the light speed.

I don't mind to add a negative check in the delay value, but who would
actually care?  It's underrun, so the behavior of the hardware isn't
really defined...


Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-24 19:26                   ` Lennart Poettering
  2009-02-24 20:37                     ` Takashi Iwai
@ 2009-02-24 21:16                     ` Lennart Poettering
  2009-02-25 10:48                       ` Takashi Iwai
  1 sibling, 1 reply; 47+ messages in thread
From: Lennart Poettering @ 2009-02-24 21:16 UTC (permalink / raw)
  To: Takashi Iwai, alsa-devel

On Tue, 24.02.09 20:26, Lennart Poettering (mznyfn@0pointer.de) wrote:

> Oh, and one thing I didn't actually notice earlier: Most drivers return
> a negative snd_pcm_delay() if a real underrun happens. According to
> the definition of snd_pcm_delay() that we agreed on a couple of
> months ago and that is now docuemented in doxygen this makes no
> sense. The definition of snd_pcm_delay() goes like this:
> 
> "For playback the delay is defined as the time that a frame that is
> written to the PCM stream shortly after this call will take to be
> actually audible. It is as such the overall latency from the write
> call to the final DAC." (from the doxygen docs)
> 
> I.e. because on the this world it is impossible to hear a sample that
> hasn't even been written yet, _delay() should only return positive
> values. However many drivers do return negative values on underrun.

I take this back.

I think the correct way to handle an underrun when stop_threshold is
set to boundary is that if a write happens after an underrun the
appropriate amount of data is simply dropped. I think enabling this
mode is primarily useful to guarantee timer stability even in case of
underrun. That means snd_pcm_delay() should very well return negative
values meaning "what you write next is past the playback pointer, it
will not be heard".

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-24 20:37                     ` Takashi Iwai
@ 2009-02-25 10:08                       ` Takashi Iwai
  2009-02-25 10:22                         ` Takashi Iwai
  2009-02-25 10:44                         ` Lennart Poettering
  0 siblings, 2 replies; 47+ messages in thread
From: Takashi Iwai @ 2009-02-25 10:08 UTC (permalink / raw)
  To: Lennart Poettering; +Cc: alsa-devel

At Tue, 24 Feb 2009 21:37:20 +0100,
I wrote:
> 
> > stop_threshold is completely irrelevant to the problems discussed
> > here.
> 
> OK, but let me continue that tomorrow or later.

Running it with five sound cards (two HD-audio, emu10k1, CMI8738, and
envy24HT) overnight with tail -500, but nothing appears on my
machines.

All outputs look OK: the first three columns are almost same values,
avail is a few (< 10), delay is about the buffer size.

How can you trigger the problem so easily?


Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 10:08                       ` Takashi Iwai
@ 2009-02-25 10:22                         ` Takashi Iwai
  2009-02-25 12:34                           ` Clive Messer
  2009-02-25 10:44                         ` Lennart Poettering
  1 sibling, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-02-25 10:22 UTC (permalink / raw)
  To: Lennart Poettering; +Cc: alsa-devel

At Wed, 25 Feb 2009 11:08:50 +0100,
I wrote:
> 
> At Tue, 24 Feb 2009 21:37:20 +0100,
> I wrote:
> > 
> > > stop_threshold is completely irrelevant to the problems discussed
> > > here.
> > 
> > OK, but let me continue that tomorrow or later.
> 
> Running it with five sound cards (two HD-audio, emu10k1, CMI8738, and
> envy24HT) overnight with tail -500, but nothing appears on my
> machines.
> 
> All outputs look OK: the first three columns are almost same values,
> avail is a few (< 10), delay is about the buffer size.
> 
> How can you trigger the problem so easily?

Could you check the raw hwptr and applptr values surrounding the error
time, e.g. with the patch to alsa-lib below?
If it's really hwsync that makes values unstable, we have to see the
value skips there...


Takashi

---
diff --git a/src/pcm/pcm_hw.c b/src/pcm/pcm_hw.c
index e9ce092..6dd1633 100644
--- a/src/pcm/pcm_hw.c
+++ b/src/pcm/pcm_hw.c
@@ -545,11 +545,20 @@ static int snd_pcm_hw_hwsync(snd_pcm_t *pcm)
 			if (err < 0)
 				return err;
 		} else {
+			long old_hwptr, old_applptr;
+			long new_hwptr, new_applptr;
+			old_hwptr = *pcm->hw.ptr;
+			old_applptr = *pcm->appl.ptr;
 			if (ioctl(fd, SNDRV_PCM_IOCTL_HWSYNC) < 0) {
 				err = -errno;
 				SYSMSG("SNDRV_PCM_IOCTL_HWSYNC failed");
 				return err;
 			}
+			new_hwptr = *pcm->hw.ptr;
+			new_applptr = *pcm->appl.ptr;
+			printf("hwsync: %ld(%ld), %ld(%ld)\n",
+			       new_hwptr, new_hwptr - old_hwptr,
+			       new_applptr, new_applptr - old_applptr);
 		}
 	} else {
 		snd_pcm_sframes_t delay;

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 10:08                       ` Takashi Iwai
  2009-02-25 10:22                         ` Takashi Iwai
@ 2009-02-25 10:44                         ` Lennart Poettering
  1 sibling, 0 replies; 47+ messages in thread
From: Lennart Poettering @ 2009-02-25 10:44 UTC (permalink / raw)
  To: alsa-devel

On Wed, 25.02.09 11:08, Takashi Iwai (tiwai@suse.de) wrote:

> 
> At Tue, 24 Feb 2009 21:37:20 +0100,
> I wrote:
> > 
> > > stop_threshold is completely irrelevant to the problems discussed
> > > here.
> > 
> > OK, but let me continue that tomorrow or later.
> 
> Running it with five sound cards (two HD-audio, emu10k1, CMI8738, and
> envy24HT) overnight with tail -500, but nothing appears on my
> machines.
> 
> All outputs look OK: the first three columns are almost same values,
> avail is a few (< 10), delay is about the buffer size.
> 
> How can you trigger the problem so easily?

Hmm, that's a good question. It's not just me btw who can reproduce
this that easily. I have put up a page (see bottom of
http://pulseaudio.org/wiki/BrokenSoundDrivers) for this tool during
the WE and got about 20 reports back (some of this I actually
mentioned on the ML already), for different cards on different distros
(from all of Fedora, OpenSUSE, Mandriva, Ubuntu and Debian).

Most of this issues can be reproduced in 5 mins or so. For the
intel8x0 issues it was a little bit harder, took 30min to be triggered.

Most of my own tests was one SMP btw, not sure if that matters. The
excpetion is the intl8x0 stuff which I tested in a machine with a
single CPU.

Since a few PA versions back I write into syslog when snd_pcm_avail()
returns apparently overflown values. Our bugzilla is now full of
reports of messages like this. Here are a two of them as examples:

https://bugzilla.redhat.com/show_bug.cgi?id=485734
https://bugzilla.redhat.com/show_bug.cgi?id=483559

(Look for the snd_pcm_avail_update() log messages in those reports)

BTW, any update on the issue that calling poll() on an ALSA device
returns POLLOUT however a subsequent snd_pcm_avail() returns 0 or some
other value < min_avail? I reported that a while back on the ML. And a
*lot* of drivers are suffering by this. This causes the PA IO loop
to spin quite often for no reason.

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-24 21:16                     ` Lennart Poettering
@ 2009-02-25 10:48                       ` Takashi Iwai
  2009-02-25 10:57                         ` Jaroslav Kysela
  0 siblings, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-02-25 10:48 UTC (permalink / raw)
  To: Lennart Poettering; +Cc: alsa-devel

At Tue, 24 Feb 2009 22:16:21 +0100,
Lennart Poettering wrote:
> 
> On Tue, 24.02.09 20:26, Lennart Poettering (mznyfn@0pointer.de) wrote:
> 
> > Oh, and one thing I didn't actually notice earlier: Most drivers return
> > a negative snd_pcm_delay() if a real underrun happens. According to
> > the definition of snd_pcm_delay() that we agreed on a couple of
> > months ago and that is now docuemented in doxygen this makes no
> > sense. The definition of snd_pcm_delay() goes like this:
> > 
> > "For playback the delay is defined as the time that a frame that is
> > written to the PCM stream shortly after this call will take to be
> > actually audible. It is as such the overall latency from the write
> > call to the final DAC." (from the doxygen docs)
> > 
> > I.e. because on the this world it is impossible to hear a sample that
> > hasn't even been written yet, _delay() should only return positive
> > values. However many drivers do return negative values on underrun.
> 
> I take this back.
> 
> I think the correct way to handle an underrun when stop_threshold is
> set to boundary is that if a write happens after an underrun the
> appropriate amount of data is simply dropped. I think enabling this
> mode is primarily useful to guarantee timer stability even in case of
> underrun. That means snd_pcm_delay() should very well return negative
> values meaning "what you write next is past the playback pointer, it
> will not be heard".

Well, I'm afraid that it's undesired behavior.

In a free-wheel mode, there is no real underrun (except that the
driver pointer callback explicitly returns the error).  It basically
means "I do care any XRUN errors by myself, so don't bother me no
matter what crazy value is set."  PCM core shouldn't react so much
differently depending on the avail/delay value in this mode.

Anyway, the definitions of delay and avail in a free-wheel mode are
ambiguous and rather confusing indeed...


Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 10:48                       ` Takashi Iwai
@ 2009-02-25 10:57                         ` Jaroslav Kysela
  2009-02-25 11:04                           ` Takashi Iwai
  0 siblings, 1 reply; 47+ messages in thread
From: Jaroslav Kysela @ 2009-02-25 10:57 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Lennart Poettering

On Wed, 25 Feb 2009, Takashi Iwai wrote:

> Anyway, the definitions of delay and avail in a free-wheel mode are
> ambiguous and rather confusing indeed...

I don't agree. We manage appl_ptr also in no-XRUN check mode for ALSA 
apps, so the negative values makes sense and applications can quickly 
skip missed samples (snd_pcm_forward).

					Jaroslav

-----
Jaroslav Kysela <perex@perex.cz>
Linux Kernel Sound Maintainer
ALSA Project, Red Hat, Inc.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 10:57                         ` Jaroslav Kysela
@ 2009-02-25 11:04                           ` Takashi Iwai
  2009-02-25 11:10                             ` Jaroslav Kysela
  0 siblings, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-02-25 11:04 UTC (permalink / raw)
  To: Jaroslav Kysela; +Cc: alsa-devel, Lennart Poettering

At Wed, 25 Feb 2009 11:57:32 +0100 (CET),
Jaroslav Kysela wrote:
> 
> On Wed, 25 Feb 2009, Takashi Iwai wrote:
> 
> > Anyway, the definitions of delay and avail in a free-wheel mode are
> > ambiguous and rather confusing indeed...
> 
> I don't agree. We manage appl_ptr also in no-XRUN check mode for ALSA 
> apps, so the negative values makes sense and applications can quickly 
> skip missed samples (snd_pcm_forward).

Well, if we follow the definition of snd_pcm_delay(), applptr - delay
doesn't always point to the position to write for the next.
(Right now it works so, though).


Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 11:04                           ` Takashi Iwai
@ 2009-02-25 11:10                             ` Jaroslav Kysela
  2009-02-25 11:17                               ` Takashi Iwai
  0 siblings, 1 reply; 47+ messages in thread
From: Jaroslav Kysela @ 2009-02-25 11:10 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Lennart Poettering

On Wed, 25 Feb 2009, Takashi Iwai wrote:

> At Wed, 25 Feb 2009 11:57:32 +0100 (CET),
> Jaroslav Kysela wrote:
> > 
> > On Wed, 25 Feb 2009, Takashi Iwai wrote:
> > 
> > > Anyway, the definitions of delay and avail in a free-wheel mode are
> > > ambiguous and rather confusing indeed...
> > 
> > I don't agree. We manage appl_ptr also in no-XRUN check mode for ALSA 
> > apps, so the negative values makes sense and applications can quickly 
> > skip missed samples (snd_pcm_forward).
> 
> Well, if we follow the definition of snd_pcm_delay(), applptr - delay
> doesn't always point to the position to write for the next.
> (Right now it works so, though).

Sure, but negative value indicates how many samples is application behind 
the output from hardware. The delay value cannot be used for buffer 
filling, of course (also in standard - no-XRUN case - avail functions 
should be used).

						Jaroslav

-----
Jaroslav Kysela <perex@perex.cz>
Linux Kernel Sound Maintainer
ALSA Project, Red Hat, Inc.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 11:10                             ` Jaroslav Kysela
@ 2009-02-25 11:17                               ` Takashi Iwai
  0 siblings, 0 replies; 47+ messages in thread
From: Takashi Iwai @ 2009-02-25 11:17 UTC (permalink / raw)
  To: Jaroslav Kysela; +Cc: alsa-devel, Lennart Poettering

At Wed, 25 Feb 2009 12:10:24 +0100 (CET),
Jaroslav Kysela wrote:
> 
> On Wed, 25 Feb 2009, Takashi Iwai wrote:
> 
> > At Wed, 25 Feb 2009 11:57:32 +0100 (CET),
> > Jaroslav Kysela wrote:
> > > 
> > > On Wed, 25 Feb 2009, Takashi Iwai wrote:
> > > 
> > > > Anyway, the definitions of delay and avail in a free-wheel mode are
> > > > ambiguous and rather confusing indeed...
> > > 
> > > I don't agree. We manage appl_ptr also in no-XRUN check mode for ALSA 
> > > apps, so the negative values makes sense and applications can quickly 
> > > skip missed samples (snd_pcm_forward).
> > 
> > Well, if we follow the definition of snd_pcm_delay(), applptr - delay
> > doesn't always point to the position to write for the next.
> > (Right now it works so, though).
> 
> Sure, but negative value indicates how many samples is application behind 
> the output from hardware. The delay value cannot be used for buffer 
> filling, of course (also in standard - no-XRUN case - avail functions 
> should be used).

The problem I foresee is that a hardware with pre-fetch style doesn't
work reliably with forwarding based on the delay value.  The
free-wheel support on such a hardware is anyway tricky, so it's not
only about avail and delay, though.


Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 10:22                         ` Takashi Iwai
@ 2009-02-25 12:34                           ` Clive Messer
  2009-02-25 13:36                             ` Takashi Iwai
  2009-02-25 15:04                             ` Takashi Iwai
  0 siblings, 2 replies; 47+ messages in thread
From: Clive Messer @ 2009-02-25 12:34 UTC (permalink / raw)
  To: alsa-devel

[-- Attachment #1: Type: text/plain, Size: 633 bytes --]

On Wednesday 25 Feb 2009 10:22:26 Takashi Iwai wrote:

> Could you check the raw hwptr and applptr values surrounding the error
> time, e.g. with the patch to alsa-lib below?
> If it's really hwsync that makes values unstable, we have to see the
> value skips there...

Does this help? Log attached. (head -n 100, tail -n 100)

Hardware is  .....

00:1b.0 Audio device: Intel Corporation 82801JI (ICH10 Family) HD Audio 
Controller

module: snd_hda_intel
pcm: AD198x Analog
kernel: kernel-2.6.29-0.43.rc6.fc10.x86_64
alsa: alsa-lib-1.0.19-2.fc10.x86_64 (plus printf patch)

Regards

Clive
-- 
Clive Messer <clive@vacuumtube.org.uk>


[-- Attachment #2: hda.log --]
[-- Type: text/x-log, Size: 6091 bytes --]

hwsync: 0(0), 0(0)
28238	28236	0	4416	0	4	1	2
hwsync: 0(0), 1(0)
28247	28246	22	4415	0	4	1	2
hwsync: 0(0), 2(0)
28251	28251	45	4414	0	4	1	2
hwsync: 0(0), 3(0)
28253	28253	68	4413	0	4	1	2
hwsync: 0(0), 4(0)
28258	28258	90	4412	0	4	1	2
hwsync: 0(0), 5(0)
28260	28260	113	4411	0	4	1	2
hwsync: 0(0), 6(0)
28262	28262	136	4410	0	4	1	2
hwsync: 0(0), 7(0)
28264	28264	158	4409	0	4	1	2
hwsync: 0(0), 8(0)
28267	28266	181	4408	0	4	1	2
hwsync: 0(0), 9(0)
28269	28268	204	4407	0	4	1	2
hwsync: 0(0), 10(0)
28271	28271	226	4406	0	4	1	2
hwsync: 0(0), 11(0)
28273	28273	249	4405	0	4	1	2
hwsync: 0(0), 12(0)
28275	28275	272	4404	0	4	1	2
hwsync: 0(0), 13(0)
28277	28277	294	4403	0	4	1	2
hwsync: 0(0), 14(0)
28279	28279	317	4402	0	4	1	2
hwsync: 0(0), 15(0)
28281	28281	340	4401	0	4	1	2
hwsync: 0(0), 16(0)
28283	28283	362	4400	0	4	1	2
hwsync: 0(0), 17(0)
28285	28285	385	4399	0	4	1	2
hwsync: 0(0), 18(0)
28287	28287	408	4398	0	4	1	2
hwsync: 0(0), 19(0)
28289	28289	430	4397	0	4	1	2
hwsync: 0(0), 20(0)
28291	28291	453	4396	0	4	1	2
hwsync: 0(0), 21(0)
28293	28293	476	4395	0	4	1	2
hwsync: 0(0), 22(0)
28295	28295	498	4394	0	4	1	2
hwsync: 0(0), 23(0)
28298	28297	521	4393	0	4	1	2
hwsync: 0(0), 24(0)
28300	28299	544	4392	0	4	1	2
hwsync: 0(0), 25(0)
28302	28301	566	4391	0	4	1	2
hwsync: 0(0), 26(0)
28304	28304	589	4390	0	4	1	2
hwsync: 0(0), 27(0)
28306	28306	612	4389	0	4	1	2
hwsync: 0(0), 28(0)
28308	28308	634	4388	0	4	1	2
hwsync: 0(0), 29(0)
28310	28310	657	4387	0	4	1	2
hwsync: 0(0), 30(0)
28312	28312	680	4386	0	4	1	2
hwsync: 0(0), 31(0)
28314	28314	702	4385	0	4	1	2
hwsync: 0(0), 32(0)
28316	28316	725	4384	0	4	1	2
hwsync: 0(0), 33(0)
28318	28318	748	4383	0	4	1	2
hwsync: 0(0), 34(0)
28320	28320	770	4382	0	4	1	2
hwsync: 0(0), 35(0)
28322	28322	793	4381	0	4	1	2
hwsync: 0(0), 36(0)
28324	28324	816	4380	0	4	1	2
hwsync: 0(0), 37(0)
28326	28326	839	4379	0	4	1	2
hwsync: 0(0), 38(0)
28328	28328	861	4378	0	4	1	2
hwsync: 0(0), 39(0)
28331	28330	884	4377	0	4	1	2
hwsync: 0(0), 40(0)
28333	28332	907	4376	0	4	1	2
hwsync: 0(0), 41(0)
28335	28334	929	4375	0	4	1	2
hwsync: 0(0), 42(0)
28337	28337	952	4374	0	4	1	2
hwsync: 0(0), 43(0)
28339	28339	975	4373	0	4	1	2
hwsync: 0(0), 44(0)
28341	28341	997	4372	0	4	1	2
hwsync: 0(0), 45(0)
28343	28343	1020	4371	0	4	1	2
hwsync: 0(0), 46(0)
28345	28345	1043	4370	0	4	1	2
hwsync: 0(0), 47(0)
28347	28347	1065	4369	0	4	1	2
hwsync: 0(0), 48(0)
28349	28349	1088	4368	0	4	1	2
hwsync: 0(0), 49(0)
28351	28351	1111	4367	0	4	1	2
hwsync: 1108800(0), 1113216(0)
25178862	25178861	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178863	25178863	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178865	25178865	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178867	25178867	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178869	25178869	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178870	25178870	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178872	25178872	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178874	25178874	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178876	25178876	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178878	25178877	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178879	25178879	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178881	25178881	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178883	25178883	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178885	25178885	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178887	25178886	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178888	25178888	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178890	25178890	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178892	25178892	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178894	25178894	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178895	25178895	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178897	25178897	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178899	25178899	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178901	25178901	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178903	25178902	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178904	25178904	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178906	25178906	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178908	25178908	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178910	25178910	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178912	25178911	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178913	25178913	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178915	25178915	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178917	25178917	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178919	25178919	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178921	25178920	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178922	25178922	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178924	25178924	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178926	25178926	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178928	25178927	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178929	25178929	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178931	25178931	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178933	25178933	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178935	25178935	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178937	25178936	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178938	25178938	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178940	25178940	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178942	25178942	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178944	25178944	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178946	25178945	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25178947	25178947	25142834	0	4416	0	0	3
hwsync: 1108800(0), 1113216(0)
25693169	25693168	25657142	21697	-18264	0	1	3
alsa-time-test: alsa-time-test.c:189: main: Assertion `(unsigned) avail <= buffer_size' failed.

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 12:34                           ` Clive Messer
@ 2009-02-25 13:36                             ` Takashi Iwai
  2009-02-25 15:11                               ` Clive Messer
  2009-02-25 15:04                             ` Takashi Iwai
  1 sibling, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-02-25 13:36 UTC (permalink / raw)
  To: Clive Messer; +Cc: alsa-devel

At Wed, 25 Feb 2009 12:34:37 +0000,
Clive Messer wrote:
> 
> On Wednesday 25 Feb 2009 10:22:26 Takashi Iwai wrote:
> 
> > Could you check the raw hwptr and applptr values surrounding the error
> > time, e.g. with the patch to alsa-lib below?
> > If it's really hwsync that makes values unstable, we have to see the
> > value skips there...
> 
> Does this help? Log attached. (head -n 100, tail -n 100)

Thanks.  At least it shows the values in hwsync ioctl are sane.
So something wrong in elsewhere.

Could you check whether this error happens with sync_ptr_ioctl flag
in PCM config?  For example, define the below in ~/.asoundrc,

	pcm.mytest {
		type hw
		card 0
		sync_ptr_ioctl true
	}

Then run with PCM "mytest".


Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 12:34                           ` Clive Messer
  2009-02-25 13:36                             ` Takashi Iwai
@ 2009-02-25 15:04                             ` Takashi Iwai
  1 sibling, 0 replies; 47+ messages in thread
From: Takashi Iwai @ 2009-02-25 15:04 UTC (permalink / raw)
  To: Clive Messer; +Cc: alsa-devel

Wait a minute...

At Wed, 25 Feb 2009 12:34:37 +0000,
Clive Messer wrote:
> 
> hwsync: 1108800(0), 1113216(0)
> 25178947	25178947	25142834	0	4416	0	0	3
> hwsync: 1108800(0), 1113216(0)
> 25693169	25693168	25657142	21697	-18264	0	1	3

The time leaps indeed here about 0.5 sec.  Assuming this time-keeping
is correct, the avail and delay values there are acceptable.

In this case, apparently a buffer underrun occurs due to a sudden
system load or so.  It's no bug.


Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 13:36                             ` Takashi Iwai
@ 2009-02-25 15:11                               ` Clive Messer
  2009-02-25 15:51                                 ` Takashi Iwai
  0 siblings, 1 reply; 47+ messages in thread
From: Clive Messer @ 2009-02-25 15:11 UTC (permalink / raw)
  To: alsa-devel

[-- Attachment #1: Type: text/plain, Size: 427 bytes --]

On Wednesday 25 Feb 2009 13:36:23 Takashi Iwai wrote:

> Could you check whether this error happens with sync_ptr_ioctl flag
> in PCM config?  For example, define the below in ~/.asoundrc,
>
> 	pcm.mytest {
> 		type hw
> 		card 0
> 		sync_ptr_ioctl true
> 	}
>
> Then run with PCM "mytest".

./alsa-time-test mytest > pcm_mytest.log

head & tail 50 lines - attached.

Regards

Clive
-- 
Clive Messer <clive@vacuumtube.org.uk>


[-- Attachment #2: mytest.log --]
[-- Type: text/x-log, Size: 2921 bytes --]

20988	20986	0	4416	0	4	1	2
21002	21002	22	4415	0	4	1	2
21006	21006	45	4414	0	4	1	2
21010	21009	68	4413	0	4	1	2
21014	21013	90	4412	0	4	1	2
21024	21022	113	4411	0	4	1	2
21028	21027	136	4410	0	4	1	2
21033	21032	158	4409	0	4	1	2
21036	21035	181	4408	0	4	1	2
21040	21040	204	4407	0	4	1	2
21044	21043	226	4406	0	4	1	2
21047	21046	249	4405	0	4	1	2
21050	21049	272	4404	0	4	1	2
21053	21053	294	4403	0	4	1	2
21061	21061	317	4402	0	4	1	2
21065	21064	340	4401	0	4	1	2
21068	21067	362	4400	0	4	1	2
21071	21070	385	4399	0	4	1	2
21074	21074	408	4398	0	4	1	2
21078	21077	430	4397	0	4	1	2
21081	21080	453	4396	0	4	1	2
21084	21083	476	4395	0	4	1	2
21087	21087	498	4394	0	4	1	2
21091	21090	521	4393	0	4	1	2
21094	21093	544	4392	0	4	1	2
21097	21096	566	4391	0	4	1	2
21100	21100	589	4390	0	4	1	2
21104	21103	612	4389	0	4	1	2
21107	21106	634	4388	0	4	1	2
21110	21109	657	4387	0	4	1	2
21113	21113	680	4386	0	4	1	2
21116	21116	702	4385	0	4	1	2
21120	21119	725	4384	0	4	1	2
21123	21122	748	4383	0	4	1	2
21126	21125	770	4382	0	4	1	2
21129	21129	793	4381	0	4	1	2
21133	21132	816	4380	0	4	1	2
21136	21135	839	4379	0	4	1	2
21139	21138	861	4378	0	4	1	2
21142	21142	884	4377	0	4	1	2
21145	21145	907	4376	0	4	1	2
21149	21148	929	4375	0	4	1	2
21152	21151	952	4374	0	4	1	2
21155	21154	975	4373	0	4	1	2
21158	21158	997	4372	0	4	1	2
21162	21161	1020	4371	0	4	1	2
21165	21164	1043	4370	0	4	1	2
21168	21167	1065	4369	0	4	1	2
21171	21171	1088	4368	0	4	1	2
21175	21174	1111	4367	0	4	1	2
36046	36045	99115	45	0	0	1	2
36049	36049	99138	44	0	0	1	2
36052	36052	99160	43	0	0	1	2
36060	36059	99183	42	0	0	1	2
36063	36062	99206	41	0	0	1	2
36066	36065	99229	40	0	0	1	2
36069	36069	99251	39	0	0	1	2
36072	36072	99274	38	0	0	1	2
36076	36075	99297	37	0	0	1	2
36079	36078	99319	36	0	0	1	2
36082	36081	99342	35	0	0	1	2
36085	36085	99365	34	0	0	1	2
36088	36088	99387	33	0	0	1	2
36092	36091	99410	32	0	0	1	2
36095	36094	99433	31	0	0	1	2
36098	36098	99455	30	0	0	1	2
36101	36101	99478	29	0	0	1	2
36105	36104	99501	28	0	0	1	2
36108	36107	99523	27	0	0	1	2
36111	36110	99546	26	0	0	1	2
36114	36114	99569	25	0	0	1	2
36117	36117	99591	24	0	0	1	2
36121	36120	99614	23	0	0	1	2
36124	36123	99637	22	0	0	1	2
36127	36126	99659	21	0	0	1	2
36130	36130	99682	20	0	0	1	2
36133	36133	99705	19	0	0	1	2
36137	36136	99727	18	0	0	1	2
36140	36139	99750	17	0	0	1	2
36143	36142	99773	16	0	0	1	2
36146	36146	99795	15	0	0	1	2
36150	36149	99818	14	0	0	1	2
36153	36152	99841	13	0	0	1	2
36156	36155	99863	12	0	0	1	2
36159	36159	99886	11	0	0	1	2
36162	36162	99909	10	0	0	1	2
36166	36165	99931	9	0	0	1	2
36169	36168	99954	8	0	0	1	2
36172	36171	99977	7	0	0	1	2
36175	36175	100000	6	0	0	1	2
36179	36178	100022	5	0	0	1	2
36188	36188	100045	4	0	0	1	2
36192	36191	100068	3	0	0	1	2
36195	36194	100090	2	0	0	1	2
36198	36198	100113	1	0	0	1	2
36207	36206	63854	2816	1600	0	1	3
36212	36211	63854	2815	1601	4	1	3
36216	36216	63854	2814	1602	4	1	3
36221	36220	63854	2813	1603	4	1	3
36225	36225	101224	4460	-44	4	1	3

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 15:11                               ` Clive Messer
@ 2009-02-25 15:51                                 ` Takashi Iwai
  2009-02-25 16:24                                   ` Clive Messer
  0 siblings, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-02-25 15:51 UTC (permalink / raw)
  To: Clive Messer; +Cc: alsa-devel

At Wed, 25 Feb 2009 15:11:44 +0000,
Clive Messer wrote:
> 36192	36191	100068	3	0	0	1	2
> 36195	36194	100090	2	0	0	1	2
> 36198	36198	100113	1	0	0	1	2
> 36207	36206	63854	2816	1600	0	1	3
> 36212	36211	63854	2815	1601	4	1	3
> 36216	36216	63854	2814	1602	4	1	3
> 36221	36220	63854	2813	1603	4	1	3
> 36225	36225	101224	4460	-44	4	1	3

This is pretty weird.  Is this behavior always reproducible?
What shows with the patch below to alsa-lib?

Also, any change if you play with position_fix parameter, such as,
position_fix=1?


Takashi

---
diff --git a/src/pcm/pcm_hw.c b/src/pcm/pcm_hw.c
index e9ce092..9b91a13 100644
--- a/src/pcm/pcm_hw.c
+++ b/src/pcm/pcm_hw.c
@@ -539,6 +539,10 @@ static int snd_pcm_hw_hwsync(snd_pcm_t *pcm)
 {
 	snd_pcm_hw_t *hw = pcm->private_data;
 	int fd = hw->fd, err;
+	long old_hwptr, new_hwptr;
+	long old_applptr, new_applptr;
+	old_hwptr = *pcm->hw.ptr;
+	old_applptr = *pcm->appl.ptr;
 	if (SNDRV_PROTOCOL_VERSION(2, 0, 3) <= hw->version) {
 		if (hw->sync_ptr) {
 			err = sync_ptr1(hw, SNDRV_PCM_SYNC_PTR_HWSYNC);
@@ -564,6 +568,11 @@ static int snd_pcm_hw_hwsync(snd_pcm_t *pcm)
 			}
 		}
 	}
+	new_hwptr = *pcm->hw.ptr;
+	new_applptr = *pcm->appl.ptr;
+	printf("hwsync: %ld(%ld), %ld(%ld)\n",
+	       new_hwptr, new_hwptr - old_hwptr,
+	       new_applptr, new_applptr - old_applptr);
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 15:51                                 ` Takashi Iwai
@ 2009-02-25 16:24                                   ` Clive Messer
  2009-02-25 16:39                                     ` Takashi Iwai
  0 siblings, 1 reply; 47+ messages in thread
From: Clive Messer @ 2009-02-25 16:24 UTC (permalink / raw)
  To: Takashi Iwai, alsa-devel

[-- Attachment #1: Type: text/plain, Size: 839 bytes --]

On Wednesday 25 Feb 2009 15:51:33 you wrote:
> At Wed, 25 Feb 2009 15:11:44 +0000,
>
> Clive Messer wrote:
> > 36192	36191	100068	3	0	0	1	2
> > 36195	36194	100090	2	0	0	1	2
> > 36198	36198	100113	1	0	0	1	2
> > 36207	36206	63854	2816	1600	0	1	3
> > 36212	36211	63854	2815	1601	4	1	3
> > 36216	36216	63854	2814	1602	4	1	3
> > 36221	36220	63854	2813	1603	4	1	3
> > 36225	36225	101224	4460	-44	4	1	3
>
> This is pretty weird.  Is this behavior always reproducible?

Nope. I get different results every time I run it.

> What shows with the patch below to alsa-lib?

Here's one (attached) I just did with the new patch that shows the crazy high 
numbers again.

> Also, any change if you play with position_fix parameter, such as,
> position_fix=1?

I'll try that module option next.

Regards

Clive
-- 
Clive Messer <clive@vacuumtube.org.uk>


[-- Attachment #2: run2.log --]
[-- Type: text/x-log, Size: 2653 bytes --]

hwsync: 0(0), 0(0)
29049	29047	0	4416	0	4	1	2
hwsync: 0(0), 1(0)
29058	29058	22	4415	0	4	1	2
hwsync: 0(0), 2(0)
29061	29061	45	4414	0	4	1	2
hwsync: 0(0), 3(0)
29065	29064	68	4413	0	4	1	2
hwsync: 0(0), 4(0)
29068	29067	90	4412	0	4	1	2
hwsync: 0(0), 5(0)
29070	29070	113	4411	0	4	1	2
hwsync: 0(0), 6(0)
29074	29073	136	4410	0	4	1	2
hwsync: 0(0), 7(0)
29076	29076	158	4409	0	4	1	2
hwsync: 0(0), 8(0)
29079	29079	181	4408	0	4	1	2
hwsync: 0(0), 9(0)
29082	29082	204	4407	0	4	1	2
hwsync: 0(0), 10(0)
29085	29085	226	4406	0	4	1	2
hwsync: 0(0), 11(0)
29088	29088	249	4405	0	4	1	2
hwsync: 0(0), 12(0)
29091	29091	272	4404	0	4	1	2
hwsync: 0(0), 13(0)
29094	29094	294	4403	0	4	1	2
hwsync: 0(0), 14(0)
29099	29099	317	4402	0	4	1	2
hwsync: 0(0), 15(0)
29101	29101	340	4401	0	4	1	2
hwsync: 0(0), 16(0)
29103	29103	362	4400	0	4	1	2
hwsync: 0(0), 17(0)
29105	29105	385	4399	0	4	1	2
hwsync: 0(0), 18(0)
29107	29107	408	4398	0	4	1	2
hwsync: 0(0), 19(0)
29109	29109	430	4397	0	4	1	2
hwsync: 0(0), 20(0)
29111	29111	453	4396	0	4	1	2
hwsync: 0(0), 21(0)
29113	29113	476	4395	0	4	1	2
hwsync: 0(0), 22(0)
29115	29115	498	4394	0	4	1	2
hwsync: 0(0), 23(0)
29117	29117	521	4393	0	4	1	2
hwsync: 0(0), 24(0)
29159	29119	544	4392	0	4	1	2
hwsync: 2080(0), 6496(0)
70717	70716	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70719	70718	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70720	70720	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70722	70722	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70724	70724	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70726	70725	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70727	70727	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70729	70729	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70731	70731	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70733	70733	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70735	70735	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70737	70736	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70738	70738	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70740	70740	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70742	70742	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70744	70743	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70745	70745	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70747	70747	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70749	70749	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70751	70751	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70753	70752	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70754	70754	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70756	70756	47142	0	4416	0	0	3
hwsync: 2080(0), 6496(0)
70758	70758	47142	0	4416	0	0	3
hwsync: 1472(0), 6496(0)
70763	70762	104573379136040	4971973988617026976	-4971973988617022560	4	1	3

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 16:24                                   ` Clive Messer
@ 2009-02-25 16:39                                     ` Takashi Iwai
       [not found]                                       ` <200902251656.47813.clive@vacuumtube.org.uk>
  0 siblings, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-02-25 16:39 UTC (permalink / raw)
  To: Clive Messer; +Cc: alsa-devel

At Wed, 25 Feb 2009 16:24:18 +0000,
Clive Messer wrote:
> 
> On Wednesday 25 Feb 2009 15:51:33 you wrote:
> > At Wed, 25 Feb 2009 15:11:44 +0000,
> >
> > Clive Messer wrote:
> > > 36192	36191	100068	3	0	0	1	2
> > > 36195	36194	100090	2	0	0	1	2
> > > 36198	36198	100113	1	0	0	1	2
> > > 36207	36206	63854	2816	1600	0	1	3
> > > 36212	36211	63854	2815	1601	4	1	3
> > > 36216	36216	63854	2814	1602	4	1	3
> > > 36221	36220	63854	2813	1603	4	1	3
> > > 36225	36225	101224	4460	-44	4	1	3
> >
> > This is pretty weird.  Is this behavior always reproducible?
> 
> Nope. I get different results every time I run it.
> 
> > What shows with the patch below to alsa-lib?
> 
> Here's one (attached) I just did with the new patch that shows the crazy high 
> numbers again.
> hwsync: 2080(0), 6496(0)
> 70758	70758	47142	0	4416	0	0	3
> hwsync: 1472(0), 6496(0)
> 70763	70762	104573379136040	4971973988617026976	-4971973988617022560	4	1	3

The problem is that hwptr goes backward here.  But it didn't happen in
this hwsync call but sometime before that.

Could you try the patch below instead?  This will track all sync_ptr calls.


Takashi

---
diff --git a/src/pcm/pcm_hw.c b/src/pcm/pcm_hw.c
index e9ce092..ad8d4a5 100644
--- a/src/pcm/pcm_hw.c
+++ b/src/pcm/pcm_hw.c
@@ -131,6 +131,10 @@ struct timespec snd_pcm_hw_fast_tstamp(snd_pcm_t *pcm)
 static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int flags)
 {
 	int err;
+	long old_hwptr, new_hwptr;
+	long old_applptr, new_applptr;
+	old_hwptr = *pcm->hw.ptr;
+	old_applptr = *pcm->appl.ptr;
 	hw->sync_ptr->flags = flags;
 	err = ioctl((hw)->fd, SNDRV_PCM_IOCTL_SYNC_PTR, (hw)->sync_ptr);
 	if (err < 0) {
@@ -138,6 +142,11 @@ static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int flags)
 		SYSMSG("SNDRV_PCM_IOCTL_SYNC_PTR failed");
 		return err;
 	}
+	new_hwptr = *pcm->hw.ptr;
+	new_applptr = *pcm->appl.ptr;
+	printf("sync_ptr1: %ld(%ld), %ld(%ld)\n",
+	       new_hwptr, new_hwptr - old_hwptr,
+	       new_applptr, new_applptr - old_applptr);
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: Timer instability
       [not found]                                       ` <200902251656.47813.clive@vacuumtube.org.uk>
@ 2009-02-25 16:59                                         ` Takashi Iwai
  2009-02-25 17:36                                           ` Clive Messer
  0 siblings, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-02-25 16:59 UTC (permalink / raw)
  To: Clive Messer; +Cc: alsa-devel

At Wed, 25 Feb 2009 16:56:47 +0000,
Clive Messer wrote:
> 
> On Wednesday 25 Feb 2009 16:39:28 you wrote:
> > At Wed, 25 Feb 2009 16:24:18 +0000,
> >
> > Clive Messer wrote:
> > > On Wednesday 25 Feb 2009 15:51:33 you wrote:
> > > > At Wed, 25 Feb 2009 15:11:44 +0000,
> > > >
> > > > Clive Messer wrote:
> > > > > 36192	36191	100068	3	0	0	1	2
> > > > > 36195	36194	100090	2	0	0	1	2
> > > > > 36198	36198	100113	1	0	0	1	2
> > > > > 36207	36206	63854	2816	1600	0	1	3
> > > > > 36212	36211	63854	2815	1601	4	1	3
> > > > > 36216	36216	63854	2814	1602	4	1	3
> > > > > 36221	36220	63854	2813	1603	4	1	3
> > > > > 36225	36225	101224	4460	-44	4	1	3
> > > >
> > > > This is pretty weird.  Is this behavior always reproducible?
> > >
> > > Nope. I get different results every time I run it.
> > >
> > > > What shows with the patch below to alsa-lib?
> > >
> > > Here's one (attached) I just did with the new patch that shows the crazy
> > > high numbers again.
> > > hwsync: 2080(0), 6496(0)
> > > 70758	70758	47142	0	4416	0	0	3
> > > hwsync: 1472(0), 6496(0)
> > > 70763	70762	104573379136040	4971973988617026976	-4971973988617022560	4	1	
> > >3
> >
> > The problem is that hwptr goes backward here.  But it didn't happen in
> > this hwsync call but sometime before that.
> >
> > Could you try the patch below instead?  This will track all sync_ptr calls.
> >
> >
> > Takashi
> >
> > ---
> > diff --git a/src/pcm/pcm_hw.c b/src/pcm/pcm_hw.c
> > index e9ce092..ad8d4a5 100644
> > --- a/src/pcm/pcm_hw.c
> > +++ b/src/pcm/pcm_hw.c
> > @@ -131,6 +131,10 @@ struct timespec snd_pcm_hw_fast_tstamp(snd_pcm_t *pcm)
> >  static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int flags)
> >  {
> >  	int err;
> > +	long old_hwptr, new_hwptr;
> > +	long old_applptr, new_applptr;
> > +	old_hwptr = *pcm->hw.ptr;
> > +	old_applptr = *pcm->appl.ptr;
> >  	hw->sync_ptr->flags = flags;
> >  	err = ioctl((hw)->fd, SNDRV_PCM_IOCTL_SYNC_PTR, (hw)->sync_ptr);
> >  	if (err < 0) {
> > @@ -138,6 +142,11 @@ static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int
> > flags) SYSMSG("SNDRV_PCM_IOCTL_SYNC_PTR failed");
> >  		return err;
> >  	}
> > +	new_hwptr = *pcm->hw.ptr;
> > +	new_applptr = *pcm->appl.ptr;
> > +	printf("sync_ptr1: %ld(%ld), %ld(%ld)\n",
> > +	       new_hwptr, new_hwptr - old_hwptr,
> > +	       new_applptr, new_applptr - old_applptr);
> >  	return 0;
> >  }
> 
> Sorry Takashi,
> 
> I'm not a coder so I only know that it wont compile because there isn't a pcm 
> variable passed into that function - it's snd_pcm_hw_t *hw.

Oops, sorry, pasted a wrong version.
Here it is.


thanks,

Takashi
---
diff --git a/src/pcm/pcm_hw.c b/src/pcm/pcm_hw.c
index e9ce092..5211d1c 100644
--- a/src/pcm/pcm_hw.c
+++ b/src/pcm/pcm_hw.c
@@ -131,6 +131,10 @@ struct timespec snd_pcm_hw_fast_tstamp(snd_pcm_t *pcm)
 static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int flags)
 {
 	int err;
+	long old_hwptr, new_hwptr;
+	long old_applptr, new_applptr;
+	old_hwptr = hw->sync_ptr->s.status.hw_ptr;
+	old_applptr = hw->sync_ptr->c.control.appl_ptr;
 	hw->sync_ptr->flags = flags;
 	err = ioctl((hw)->fd, SNDRV_PCM_IOCTL_SYNC_PTR, (hw)->sync_ptr);
 	if (err < 0) {
@@ -138,6 +142,11 @@ static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int flags)
 		SYSMSG("SNDRV_PCM_IOCTL_SYNC_PTR failed");
 		return err;
 	}
+	new_hwptr = hw->sync_ptr->s.status.hw_ptr;
+	new_applptr = hw->sync_ptr->c.control.appl_ptr;
+	printf("sync_ptr1: %ld(%ld), %ld(%ld)\n",
+	       new_hwptr, new_hwptr - old_hwptr,
+	       new_applptr, new_applptr - old_applptr);
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 16:59                                         ` Takashi Iwai
@ 2009-02-25 17:36                                           ` Clive Messer
  2009-02-25 18:13                                             ` Takashi Iwai
  0 siblings, 1 reply; 47+ messages in thread
From: Clive Messer @ 2009-02-25 17:36 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

On Wednesday 25 Feb 2009 16:59:48 Takashi Iwai wrote:
> Oops, sorry, pasted a wrong version.
> Here it is.
>
>
> thanks,
>
> Takashi
> ---
> diff --git a/src/pcm/pcm_hw.c b/src/pcm/pcm_hw.c
> index e9ce092..5211d1c 100644
> --- a/src/pcm/pcm_hw.c
> +++ b/src/pcm/pcm_hw.c
> @@ -131,6 +131,10 @@ struct timespec snd_pcm_hw_fast_tstamp(snd_pcm_t *pcm)
>  static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int flags)
>  {
>         int err;
> +       long old_hwptr, new_hwptr;
> +       long old_applptr, new_applptr;
> +       old_hwptr = hw->sync_ptr->s.status.hw_ptr;
> +       old_applptr = hw->sync_ptr->c.control.appl_ptr;
>         hw->sync_ptr->flags = flags;
>         err = ioctl((hw)->fd, SNDRV_PCM_IOCTL_SYNC_PTR, (hw)->sync_ptr);
>         if (err < 0) {
> @@ -138,6 +142,11 @@ static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int
> flags) SYSMSG("SNDRV_PCM_IOCTL_SYNC_PTR failed");
>                 return err;
>         }
> +       new_hwptr = hw->sync_ptr->s.status.hw_ptr;
> +       new_applptr = hw->sync_ptr->c.control.appl_ptr;
> +       printf("sync_ptr1: %ld(%ld), %ld(%ld)\n",
> +              new_hwptr, new_hwptr - old_hwptr,
> +              new_applptr, new_applptr - old_applptr);
>         return 0;
>  }
>  

Takashi,

Applied, but I'm not getting any "sync_ptr1: .... " logging on stdout. Any 
ideas?

Regards

Clive
-- 
Clive Messer <clive@vacuumtube.org.uk>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 17:36                                           ` Clive Messer
@ 2009-02-25 18:13                                             ` Takashi Iwai
  2009-02-25 18:16                                               ` Clive Messer
  2009-02-25 18:34                                               ` Clive Messer
  0 siblings, 2 replies; 47+ messages in thread
From: Takashi Iwai @ 2009-02-25 18:13 UTC (permalink / raw)
  To: Clive Messer; +Cc: alsa-devel

At Wed, 25 Feb 2009 17:36:24 +0000,
Clive Messer wrote:
> 
> On Wednesday 25 Feb 2009 16:59:48 Takashi Iwai wrote:
> > Oops, sorry, pasted a wrong version.
> > Here it is.
> >
> >
> > thanks,
> >
> > Takashi
> > ---
> > diff --git a/src/pcm/pcm_hw.c b/src/pcm/pcm_hw.c
> > index e9ce092..5211d1c 100644
> > --- a/src/pcm/pcm_hw.c
> > +++ b/src/pcm/pcm_hw.c
> > @@ -131,6 +131,10 @@ struct timespec snd_pcm_hw_fast_tstamp(snd_pcm_t *pcm)
> >  static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int flags)
> >  {
> >         int err;
> > +       long old_hwptr, new_hwptr;
> > +       long old_applptr, new_applptr;
> > +       old_hwptr = hw->sync_ptr->s.status.hw_ptr;
> > +       old_applptr = hw->sync_ptr->c.control.appl_ptr;
> >         hw->sync_ptr->flags = flags;
> >         err = ioctl((hw)->fd, SNDRV_PCM_IOCTL_SYNC_PTR, (hw)->sync_ptr);
> >         if (err < 0) {
> > @@ -138,6 +142,11 @@ static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int
> > flags) SYSMSG("SNDRV_PCM_IOCTL_SYNC_PTR failed");
> >                 return err;
> >         }
> > +       new_hwptr = hw->sync_ptr->s.status.hw_ptr;
> > +       new_applptr = hw->sync_ptr->c.control.appl_ptr;
> > +       printf("sync_ptr1: %ld(%ld), %ld(%ld)\n",
> > +              new_hwptr, new_hwptr - old_hwptr,
> > +              new_applptr, new_applptr - old_applptr);
> >         return 0;
> >  }
> >  
> 
> Takashi,
> 
> Applied, but I'm not getting any "sync_ptr1: .... " logging on stdout. Any 
> ideas?

Did you run against the config with sync_ptr?


Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 18:13                                             ` Takashi Iwai
@ 2009-02-25 18:16                                               ` Clive Messer
  2009-02-25 18:34                                               ` Clive Messer
  1 sibling, 0 replies; 47+ messages in thread
From: Clive Messer @ 2009-02-25 18:16 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

On Wednesday 25 Feb 2009 18:13:47 you wrote:
> At Wed, 25 Feb 2009 17:36:24 +0000,

> > Applied, but I'm not getting any "sync_ptr1: .... " logging on stdout.
> > Any ideas?
>
> Did you run against the config with sync_ptr?

No, that'll explain it. DOH!!! ;-)

Regards

Clive
-- 
Clive Messer <clive@vacuumtube.org.uk>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 18:13                                             ` Takashi Iwai
  2009-02-25 18:16                                               ` Clive Messer
@ 2009-02-25 18:34                                               ` Clive Messer
  2009-03-03 16:05                                                 ` Takashi Iwai
  1 sibling, 1 reply; 47+ messages in thread
From: Clive Messer @ 2009-02-25 18:34 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

[-- Attachment #1: Type: text/plain, Size: 1441 bytes --]

On Wednesday 25 Feb 2009 18:13:47 Takashi Iwai wrote:

> > > diff --git a/src/pcm/pcm_hw.c b/src/pcm/pcm_hw.c
> > > index e9ce092..5211d1c 100644
> > > --- a/src/pcm/pcm_hw.c
> > > +++ b/src/pcm/pcm_hw.c
> > > @@ -131,6 +131,10 @@ struct timespec snd_pcm_hw_fast_tstamp(snd_pcm_t
> > > *pcm) static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int flags)
> > >  {
> > >         int err;
> > > +       long old_hwptr, new_hwptr;
> > > +       long old_applptr, new_applptr;
> > > +       old_hwptr = hw->sync_ptr->s.status.hw_ptr;
> > > +       old_applptr = hw->sync_ptr->c.control.appl_ptr;
> > >         hw->sync_ptr->flags = flags;
> > >         err = ioctl((hw)->fd, SNDRV_PCM_IOCTL_SYNC_PTR,
> > > (hw)->sync_ptr); if (err < 0) {
> > > @@ -138,6 +142,11 @@ static int sync_ptr1(snd_pcm_hw_t *hw, unsigned
> > > int flags) SYSMSG("SNDRV_PCM_IOCTL_SYNC_PTR failed");
> > >                 return err;
> > >         }
> > > +       new_hwptr = hw->sync_ptr->s.status.hw_ptr;
> > > +       new_applptr = hw->sync_ptr->c.control.appl_ptr;
> > > +       printf("sync_ptr1: %ld(%ld), %ld(%ld)\n",
> > > +              new_hwptr, new_hwptr - old_hwptr,
> > > +              new_applptr, new_applptr - old_applptr);
> > >         return 0;
> > >  }
> > >  

With sync_ptr I cannot reproduce the very large numbers. Here's a typical log. 
(Attached).

Regards

Clive
-- 
Clive Messer <clive@vacuumtube.org.uk>


[-- Attachment #2: sync.log --]
[-- Type: text/x-log, Size: 5088 bytes --]

sync_ptr1: 0(0), 0(0)
sync_ptr1: 0(0), 0(0)
sync_ptr1: 0(0), 0(0)
sync_ptr1: 0(0), 0(0)
sync_ptr1: 0(0), 0(0)
sync_ptr1: 0(0), 1(1)
29091	29089	0	4416	0	4	1	2
sync_ptr1: 0(0), 1(0)
sync_ptr1: 0(0), 1(0)
sync_ptr1: 0(0), 2(1)
29102	29102	22	4415	0	4	1	2
sync_ptr1: 0(0), 2(0)
sync_ptr1: 0(0), 2(0)
sync_ptr1: 0(0), 3(1)
29109	29107	45	4414	0	4	1	2
sync_ptr1: 0(0), 3(0)
sync_ptr1: 0(0), 3(0)
sync_ptr1: 0(0), 4(1)
29113	29112	68	4413	0	4	1	2
sync_ptr1: 0(0), 4(0)
sync_ptr1: 0(0), 4(0)
sync_ptr1: 0(0), 5(1)
29116	29116	90	4412	0	4	1	2
sync_ptr1: 0(0), 5(0)
sync_ptr1: 0(0), 5(0)
sync_ptr1: 0(0), 6(1)
29119	29119	113	4411	0	4	1	2
sync_ptr1: 0(0), 6(0)
sync_ptr1: 0(0), 6(0)
sync_ptr1: 0(0), 7(1)
29122	29122	136	4410	0	4	1	2
sync_ptr1: 0(0), 7(0)
sync_ptr1: 0(0), 7(0)
sync_ptr1: 0(0), 8(1)
29125	29125	158	4409	0	4	1	2
sync_ptr1: 0(0), 8(0)
sync_ptr1: 0(0), 8(0)
sync_ptr1: 0(0), 9(1)
29129	29129	181	4408	0	4	1	2
sync_ptr1: 0(0), 9(0)
sync_ptr1: 0(0), 9(0)
sync_ptr1: 0(0), 10(1)
29132	29132	204	4407	0	4	1	2
sync_ptr1: 0(0), 10(0)
sync_ptr1: 0(0), 10(0)
sync_ptr1: 0(0), 11(1)
29135	29135	226	4406	0	4	1	2
sync_ptr1: 0(0), 11(0)
sync_ptr1: 0(0), 11(0)
sync_ptr1: 0(0), 12(1)
29138	29138	249	4405	0	4	1	2
sync_ptr1: 0(0), 12(0)
sync_ptr1: 0(0), 12(0)
sync_ptr1: 0(0), 13(1)
29141	29141	272	4404	0	4	1	2
sync_ptr1: 0(0), 13(0)
sync_ptr1: 0(0), 13(0)
sync_ptr1: 0(0), 14(1)
29145	29144	294	4403	0	4	1	2
sync_ptr1: 0(0), 14(0)
sync_ptr1: 0(0), 14(0)
sync_ptr1: 0(0), 15(1)
29148	29148	317	4402	0	4	1	2
sync_ptr1: 0(0), 15(0)
sync_ptr1: 0(0), 15(0)
sync_ptr1: 0(0), 16(1)
29151	29151	340	4401	0	4	1	2
sync_ptr1: 0(0), 16(0)
sync_ptr1: 0(0), 16(0)
sync_ptr1: 0(0), 17(1)
29154	29154	362	4400	0	4	1	2
sync_ptr1: 0(0), 17(0)
sync_ptr1: 0(0), 17(0)
sync_ptr1: 0(0), 18(1)
29157	29157	385	4399	0	4	1	2
sync_ptr1: 0(0), 18(0)
sync_ptr1: 0(0), 18(0)
sync_ptr1: 0(0), 19(1)
29160	29160	408	4398	0	4	1	2
sync_ptr1: 0(0), 19(0)
sync_ptr1: 0(0), 19(0)
sync_ptr1: 0(0), 20(1)
29163	29163	430	4397	0	4	1	2
sync_ptr1: 0(0), 20(0)
sync_ptr1: 0(0), 20(0)
sync_ptr1: 0(0), 21(1)
29167	29166	453	4396	0	4	1	2
sync_ptr1: 0(0), 21(0)
sync_ptr1: 0(0), 21(0)
sync_ptr1: 0(0), 22(1)
29170	29170	476	4395	0	4	1	2
sync_ptr1: 0(0), 22(0)
sync_ptr1: 0(0), 22(0)
sync_ptr1: 0(0), 23(1)
29173	29173	498	4394	0	4	1	2
sync_ptr1: 0(0), 23(0)
sync_ptr1: 0(0), 23(0)
sync_ptr1: 0(0), 24(1)
29176	29176	521	4393	0	4	1	2
sync_ptr1: 0(0), 24(0)
sync_ptr1: 0(0), 4397(0)
sync_ptr1: 0(0), 4397(0)
sync_ptr1: 0(0), 4398(1)
44834	44833	99705	19	0	0	1	2
sync_ptr1: 0(0), 4398(0)
sync_ptr1: 0(0), 4398(0)
sync_ptr1: 0(0), 4399(1)
44837	44837	99727	18	0	0	1	2
sync_ptr1: 0(0), 4399(0)
sync_ptr1: 0(0), 4399(0)
sync_ptr1: 0(0), 4400(1)
44840	44840	99750	17	0	0	1	2
sync_ptr1: 0(0), 4400(0)
sync_ptr1: 0(0), 4400(0)
sync_ptr1: 0(0), 4401(1)
44843	44843	99773	16	0	0	1	2
sync_ptr1: 0(0), 4401(0)
sync_ptr1: 0(0), 4401(0)
sync_ptr1: 0(0), 4402(1)
44847	44846	99795	15	0	0	1	2
sync_ptr1: 0(0), 4402(0)
sync_ptr1: 0(0), 4402(0)
sync_ptr1: 0(0), 4403(1)
44850	44850	99818	14	0	0	1	2
sync_ptr1: 0(0), 4403(0)
sync_ptr1: 0(0), 4403(0)
sync_ptr1: 0(0), 4404(1)
44853	44853	99841	13	0	0	1	2
sync_ptr1: 0(0), 4404(0)
sync_ptr1: 0(0), 4404(0)
sync_ptr1: 0(0), 4405(1)
44856	44856	99863	12	0	0	1	2
sync_ptr1: 0(0), 4405(0)
sync_ptr1: 0(0), 4405(0)
sync_ptr1: 0(0), 4406(1)
44860	44860	99886	11	0	0	1	2
sync_ptr1: 0(0), 4406(0)
sync_ptr1: 0(0), 4406(0)
sync_ptr1: 0(0), 4407(1)
44863	44863	99909	10	0	0	1	2
sync_ptr1: 0(0), 4407(0)
sync_ptr1: 0(0), 4407(0)
sync_ptr1: 0(0), 4408(1)
44866	44866	99931	9	0	0	1	2
sync_ptr1: 0(0), 4408(0)
sync_ptr1: 0(0), 4408(0)
sync_ptr1: 0(0), 4409(1)
44869	44869	99954	8	0	0	1	2
sync_ptr1: 0(0), 4409(0)
sync_ptr1: 0(0), 4409(0)
sync_ptr1: 0(0), 4410(1)
44873	44872	99977	7	0	0	1	2
sync_ptr1: 0(0), 4410(0)
sync_ptr1: 0(0), 4410(0)
sync_ptr1: 0(0), 4411(1)
44876	44876	100000	6	0	0	1	2
sync_ptr1: 0(0), 4411(0)
sync_ptr1: 0(0), 4411(0)
sync_ptr1: 0(0), 4412(1)
44879	44879	100022	5	0	0	1	2
sync_ptr1: 0(0), 4412(0)
sync_ptr1: 0(0), 4412(0)
sync_ptr1: 0(0), 4413(1)
44888	44888	100045	4	0	0	1	2
sync_ptr1: 0(0), 4413(0)
sync_ptr1: 0(0), 4413(0)
sync_ptr1: 0(0), 4414(1)
44892	44892	100068	3	0	0	1	2
sync_ptr1: 0(0), 4414(0)
sync_ptr1: 0(0), 4414(0)
sync_ptr1: 0(0), 4415(1)
44895	44895	100090	2	0	0	1	2
sync_ptr1: 0(0), 4415(0)
sync_ptr1: 0(0), 4415(0)
sync_ptr1: 0(0), 4416(1)
44898	44898	100113	1	0	0	1	2
sync_ptr1: 3632(3632), 4416(0)
sync_ptr1: 3632(0), 4416(0)
sync_ptr1: 3632(0), 4417(1)
44905	44904	82358	3632	784	0	1	3
sync_ptr1: 3632(0), 4417(0)
sync_ptr1: 3632(0), 4417(0)
sync_ptr1: 3632(0), 4418(1)
44908	44908	82358	3631	785	4	1	3
sync_ptr1: 3632(0), 4418(0)
sync_ptr1: 3632(0), 4418(0)
sync_ptr1: 3632(0), 4419(1)
44912	44912	82358	3630	786	4	1	3
sync_ptr1: 3632(0), 4419(0)
sync_ptr1: 3632(0), 4419(0)
sync_ptr1: 3632(0), 4420(1)
44915	44915	82358	3629	787	4	1	3
sync_ptr1: 3632(0), 4420(0)
sync_ptr1: 3632(0), 4420(0)
sync_ptr1: 4464(832), 4421(1)
44919	44919	101224	3628	-44	4	1	3
sync_ptr1: 4464(0), 4421(0)
sync_ptr1: 4464(0), 4421(0)
sync_ptr1: 4464(0), 4422(1)
44923	44922	101224	4459	-43	4	1	3

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-02-25 18:34                                               ` Clive Messer
@ 2009-03-03 16:05                                                 ` Takashi Iwai
  2009-03-03 17:41                                                   ` Takashi Iwai
  0 siblings, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-03-03 16:05 UTC (permalink / raw)
  To: Clive Messer; +Cc: alsa-devel

At Wed, 25 Feb 2009 18:34:31 +0000,
Clive Messer wrote:
> 
> On Wednesday 25 Feb 2009 18:13:47 Takashi Iwai wrote:
> 
> > > > diff --git a/src/pcm/pcm_hw.c b/src/pcm/pcm_hw.c
> > > > index e9ce092..5211d1c 100644
> > > > --- a/src/pcm/pcm_hw.c
> > > > +++ b/src/pcm/pcm_hw.c
> > > > @@ -131,6 +131,10 @@ struct timespec snd_pcm_hw_fast_tstamp(snd_pcm_t
> > > > *pcm) static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int flags)
> > > >  {
> > > >         int err;
> > > > +       long old_hwptr, new_hwptr;
> > > > +       long old_applptr, new_applptr;
> > > > +       old_hwptr = hw->sync_ptr->s.status.hw_ptr;
> > > > +       old_applptr = hw->sync_ptr->c.control.appl_ptr;
> > > >         hw->sync_ptr->flags = flags;
> > > >         err = ioctl((hw)->fd, SNDRV_PCM_IOCTL_SYNC_PTR,
> > > > (hw)->sync_ptr); if (err < 0) {
> > > > @@ -138,6 +142,11 @@ static int sync_ptr1(snd_pcm_hw_t *hw, unsigned
> > > > int flags) SYSMSG("SNDRV_PCM_IOCTL_SYNC_PTR failed");
> > > >                 return err;
> > > >         }
> > > > +       new_hwptr = hw->sync_ptr->s.status.hw_ptr;
> > > > +       new_applptr = hw->sync_ptr->c.control.appl_ptr;
> > > > +       printf("sync_ptr1: %ld(%ld), %ld(%ld)\n",
> > > > +              new_hwptr, new_hwptr - old_hwptr,
> > > > +              new_applptr, new_applptr - old_applptr);
> > > >         return 0;
> > > >  }
> > > >  
> 
> With sync_ptr I cannot reproduce the very large numbers. Here's a typical log. 
> (Attached).

Hrm, but still an unexpected jump is found.

If you build the driver with CONFIG_SND_PCM_XRUN_DEBUG=y, you must
have /proc/asound/card0/pcm0p/xrun_debug.  Echo 1 to this (as root)
	echo 1 > /proc/asound/card0/pcm0p/xrun_debug
and try the program, see whether any debug message appears.
If any message appears, it means basically an unstable hardware.

Also, the below is a patch I tried to clean up and improve the
handling.  Give it a try (and do echo 1 above for testing).


thanks,

Takashi
---
>From 6568cc56af28ff0371509e5354dcce43ecd9b4fd Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai@suse.de>
Date: Tue, 3 Mar 2009 17:00:15 +0100
Subject: [PATCH] ALSA: Rewrite hw_ptr updaters

Clean up and improve snd_pcm_update_hw_ptr*() functions.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/core/pcm_lib.c |  125 +++++++++++++++++++++++++++++++++-----------------
 1 files changed, 82 insertions(+), 43 deletions(-)

diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c
index 9216910..7d13eee 100644
--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -125,19 +125,23 @@ void snd_pcm_playback_silence(struct snd_pcm_substream *substream, snd_pcm_ufram
 	}
 }
 
+#ifdef CONFIG_SND_PCM_XRUN_DEBUG
+#define xrun_debug(substream)	((substream)->pstr->xrun_debug)
+#else
+#define xrun_debug(substream)	0
+#endif
+
 static void xrun(struct snd_pcm_substream *substream)
 {
 	snd_pcm_stop(substream, SNDRV_PCM_STATE_XRUN);
-#ifdef CONFIG_SND_PCM_XRUN_DEBUG
-	if (substream->pstr->xrun_debug) {
+	if (xrun_debug(substream)) {
 		snd_printd(KERN_DEBUG "XRUN: pcmC%dD%d%c\n",
 			   substream->pcm->card->number,
 			   substream->pcm->device,
 			   substream->stream ? 'c' : 'p');
-		if (substream->pstr->xrun_debug > 1)
+		if (xrun_debug(substream) > 1)
 			dump_stack();
 	}
-#endif
 }
 
 static inline snd_pcm_uframes_t snd_pcm_update_hw_ptr_pos(struct snd_pcm_substream *substream,
@@ -182,11 +186,26 @@ static inline int snd_pcm_update_hw_ptr_post(struct snd_pcm_substream *substream
 	return 0;
 }
 
+static void hw_ptr_error(struct snd_pcm_substream *substream,
+			 const char *fmt, ...)
+{
+	if (xrun_debug(substream)) {
+		if (printk_ratelimit()) {
+			va_list args;
+			va_start(args, fmt);
+			vprintk(fmt, args);
+			va_end(args);
+			if (xrun_debug(substream) > 1)
+				dump_stack();
+		}
+	}
+}
+
 static inline int snd_pcm_update_hw_ptr_interrupt(struct snd_pcm_substream *substream)
 {
 	struct snd_pcm_runtime *runtime = substream->runtime;
 	snd_pcm_uframes_t pos;
-	snd_pcm_uframes_t new_hw_ptr, hw_ptr_interrupt;
+	snd_pcm_uframes_t new_hw_ptr, hw_ptr_interrupt, hw_base;
 	snd_pcm_sframes_t delta;
 
 	pos = snd_pcm_update_hw_ptr_pos(substream, runtime);
@@ -194,36 +213,47 @@ static inline int snd_pcm_update_hw_ptr_interrupt(struct snd_pcm_substream *subs
 		xrun(substream);
 		return -EPIPE;
 	}
-	if (runtime->period_size == runtime->buffer_size)
-		goto __next_buf;
-	new_hw_ptr = runtime->hw_ptr_base + pos;
+	hw_base = runtime->hw_ptr_base;
+	new_hw_ptr = hw_base + pos;
 	hw_ptr_interrupt = runtime->hw_ptr_interrupt + runtime->period_size;
-
-	delta = hw_ptr_interrupt - new_hw_ptr;
-	if (delta > 0) {
-		if ((snd_pcm_uframes_t)delta < runtime->buffer_size / 2) {
-#ifdef CONFIG_SND_PCM_XRUN_DEBUG
-			if (runtime->periods > 1 && substream->pstr->xrun_debug) {
-				snd_printd(KERN_ERR "Unexpected hw_pointer value [1] (stream = %i, delta: -%ld, max jitter = %ld): wrong interrupt acknowledge?\n", substream->stream, (long) delta, runtime->buffer_size / 2);
-				if (substream->pstr->xrun_debug > 1)
-					dump_stack();
-			}
-#endif
-			return 0;
+	delta = new_hw_ptr - hw_ptr_interrupt;
+	if (hw_ptr_interrupt == runtime->boundary)
+		hw_ptr_interrupt = 0;
+	if (delta < 0) {
+		delta += runtime->buffer_size;
+		if (delta < 0) {
+			hw_ptr_error(substream, 
+				     "Unexpected hw_pointer value "
+				     "(stream=%i, pos=%ld, intr_ptr=%ld)\n",
+				     substream->stream, (long)pos,
+				     (long)hw_ptr_interrupt);
+			/* rebase to interrupt position */
+			hw_base = new_hw_ptr = hw_ptr_interrupt;
+			delta = 0;
+		} else {
+			hw_base += runtime->buffer_size;
+			if (hw_base == runtime->boundary)
+				hw_base = 0;
+			new_hw_ptr = hw_base + pos;
 		}
-	      __next_buf:
-		runtime->hw_ptr_base += runtime->buffer_size;
-		if (runtime->hw_ptr_base == runtime->boundary)
-			runtime->hw_ptr_base = 0;
-		new_hw_ptr = runtime->hw_ptr_base + pos;
 	}
-
+	if (delta > runtime->period_size) {
+		hw_ptr_error(substream,
+			     "Lost interrupts? "
+			     "(stream=%i, delta=%ld, intr_ptr=%ld)\n",
+			     substream->stream, (long)delta,
+			     (long)hw_ptr_interrupt);
+		/* rebase hw_ptr_interrupt */
+		hw_ptr_interrupt =
+			new_hw_ptr - new_hw_ptr % runtime->period_size;
+	}
 	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK &&
 	    runtime->silence_size > 0)
 		snd_pcm_playback_silence(substream, new_hw_ptr);
 
+	runtime->hw_ptr_base = hw_base;
 	runtime->status->hw_ptr = new_hw_ptr;
-	runtime->hw_ptr_interrupt = new_hw_ptr - new_hw_ptr % runtime->period_size;
+	runtime->hw_ptr_interrupt = hw_ptr_interrupt;
 
 	return snd_pcm_update_hw_ptr_post(substream, runtime);
 }
@@ -233,7 +263,7 @@ int snd_pcm_update_hw_ptr(struct snd_pcm_substream *substream)
 {
 	struct snd_pcm_runtime *runtime = substream->runtime;
 	snd_pcm_uframes_t pos;
-	snd_pcm_uframes_t old_hw_ptr, new_hw_ptr;
+	snd_pcm_uframes_t old_hw_ptr, new_hw_ptr, hw_base;
 	snd_pcm_sframes_t delta;
 
 	old_hw_ptr = runtime->status->hw_ptr;
@@ -242,29 +272,38 @@ int snd_pcm_update_hw_ptr(struct snd_pcm_substream *substream)
 		xrun(substream);
 		return -EPIPE;
 	}
-	new_hw_ptr = runtime->hw_ptr_base + pos;
-
-	delta = old_hw_ptr - new_hw_ptr;
-	if (delta > 0) {
-		if ((snd_pcm_uframes_t)delta < runtime->buffer_size / 2) {
-#ifdef CONFIG_SND_PCM_XRUN_DEBUG
-			if (runtime->periods > 2 && substream->pstr->xrun_debug) {
-				snd_printd(KERN_ERR "Unexpected hw_pointer value [2] (stream = %i, delta: -%ld, max jitter = %ld): wrong interrupt acknowledge?\n", substream->stream, (long) delta, runtime->buffer_size / 2);
-				if (substream->pstr->xrun_debug > 1)
-					dump_stack();
-			}
-#endif
+	hw_base = runtime->hw_ptr_base;
+	new_hw_ptr = hw_base + pos;
+
+	delta = new_hw_ptr - old_hw_ptr;
+	if (delta < 0) {
+		delta += runtime->buffer_size;
+		if (delta < 0) {
+			hw_ptr_error(substream, 
+				     "Unexpected hw_pointer value [2] "
+				     "(stream=%i, pos=%ld, old_ptr=%ld)\n",
+				     substream->stream, (long)pos,
+				     (long)old_hw_ptr);
 			return 0;
 		}
-		runtime->hw_ptr_base += runtime->buffer_size;
-		if (runtime->hw_ptr_base == runtime->boundary)
-			runtime->hw_ptr_base = 0;
-		new_hw_ptr = runtime->hw_ptr_base + pos;
+		hw_base += runtime->buffer_size;
+		if (hw_base == runtime->boundary)
+			hw_base = 0;
+		new_hw_ptr = hw_base + pos;
+	}
+	if (delta > runtime->period_size && runtime->periods > 1) {
+		hw_ptr_error(substream,
+			     "hw_ptr skipping! "
+			     "(pos=%ld, delta=%ld, period=%ld)\n",
+			     (long)pos, (long)delta,
+			     (long)runtime->period_size);
+		return 0;
 	}
 	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK &&
 	    runtime->silence_size > 0)
 		snd_pcm_playback_silence(substream, new_hw_ptr);
 
+	runtime->hw_ptr_base = hw_base;
 	runtime->status->hw_ptr = new_hw_ptr;
 
 	return snd_pcm_update_hw_ptr_post(substream, runtime);
-- 
1.6.1.3

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-03-03 16:05                                                 ` Takashi Iwai
@ 2009-03-03 17:41                                                   ` Takashi Iwai
  2009-03-09 11:13                                                     ` Clive Messer
  0 siblings, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-03-03 17:41 UTC (permalink / raw)
  To: Clive Messer; +Cc: alsa-devel, Lennart Poettering

[-- Attachment #1: Type: text/plain, Size: 3000 bytes --]

At Tue, 03 Mar 2009 17:05:31 +0100,
I wrote:
> 
> At Wed, 25 Feb 2009 18:34:31 +0000,
> Clive Messer wrote:
> > 
> > On Wednesday 25 Feb 2009 18:13:47 Takashi Iwai wrote:
> > 
> > > > > diff --git a/src/pcm/pcm_hw.c b/src/pcm/pcm_hw.c
> > > > > index e9ce092..5211d1c 100644
> > > > > --- a/src/pcm/pcm_hw.c
> > > > > +++ b/src/pcm/pcm_hw.c
> > > > > @@ -131,6 +131,10 @@ struct timespec snd_pcm_hw_fast_tstamp(snd_pcm_t
> > > > > *pcm) static int sync_ptr1(snd_pcm_hw_t *hw, unsigned int flags)
> > > > >  {
> > > > >         int err;
> > > > > +       long old_hwptr, new_hwptr;
> > > > > +       long old_applptr, new_applptr;
> > > > > +       old_hwptr = hw->sync_ptr->s.status.hw_ptr;
> > > > > +       old_applptr = hw->sync_ptr->c.control.appl_ptr;
> > > > >         hw->sync_ptr->flags = flags;
> > > > >         err = ioctl((hw)->fd, SNDRV_PCM_IOCTL_SYNC_PTR,
> > > > > (hw)->sync_ptr); if (err < 0) {
> > > > > @@ -138,6 +142,11 @@ static int sync_ptr1(snd_pcm_hw_t *hw, unsigned
> > > > > int flags) SYSMSG("SNDRV_PCM_IOCTL_SYNC_PTR failed");
> > > > >                 return err;
> > > > >         }
> > > > > +       new_hwptr = hw->sync_ptr->s.status.hw_ptr;
> > > > > +       new_applptr = hw->sync_ptr->c.control.appl_ptr;
> > > > > +       printf("sync_ptr1: %ld(%ld), %ld(%ld)\n",
> > > > > +              new_hwptr, new_hwptr - old_hwptr,
> > > > > +              new_applptr, new_applptr - old_applptr);
> > > > >         return 0;
> > > > >  }
> > > > >  
> > 
> > With sync_ptr I cannot reproduce the very large numbers. Here's a typical log. 
> > (Attached).
> 
> Hrm, but still an unexpected jump is found.
> 
> If you build the driver with CONFIG_SND_PCM_XRUN_DEBUG=y, you must
> have /proc/asound/card0/pcm0p/xrun_debug.  Echo 1 to this (as root)
> 	echo 1 > /proc/asound/card0/pcm0p/xrun_debug
> and try the program, see whether any debug message appears.
> If any message appears, it means basically an unstable hardware.
> 
> Also, the below is a patch I tried to clean up and improve the
> handling.  Give it a try (and do echo 1 above for testing).

BTW, the original test program is very hard to see what's wrong
because it spews out way too many lines.  The below is a filtered-out
version.

It prints only unexpected jumps of avail values (the threshold is
set 100 blindly).  The output is like below:

	% ./alsa-time-test hw
	21720872	(4987)	delta 216	avail 216	delay 4200
	21943118	(65)	delta 208	avail 208	delay 4208
	23752310	(3)	delta 232	avail 232	delay 4184
	23761847	(5972)	delta 264	avail 264	delay 4152

The first column is the usec from the program start, the next value in
parentheses is the time-step from the last time of status changes,
the delta is the increase of avail, and the rest are raw values.

If a too large delta appears in a short time-step, something is wrong.
If it appears in a large time-step, it's simply a wrong responsiveness
(aka system latency).


Takashi

[-- Attachment #2: alsa-time-test.c --]
[-- Type: application/octet-stream, Size: 5884 bytes --]

#include <assert.h>
#include <inttypes.h>
#include <time.h>

#include <alsa/asoundlib.h>

#define JUMP_DETECT	100	/* 100 frames */

static uint64_t timespec_us(const struct timespec *ts) {
    return
        ts->tv_sec * 1000000LLU +
        ts->tv_nsec / 1000LLU;
}

int main(int argc, char *argv[]) {
    const char *dev;
    int r;
    snd_pcm_hw_params_t *hwparams;
    snd_pcm_sw_params_t *swparams;
    snd_pcm_status_t *status;
    snd_pcm_t *pcm;
    unsigned rate = 44100;
    unsigned periods = 2;
    snd_pcm_uframes_t boundary, buffer_size = 44100/10; /* 100s */
    snd_pcm_uframes_t period_size;
    int dir = 1;
    struct timespec start, last_timestamp = { 0, 0 };
    uint64_t start_us;
    snd_pcm_sframes_t last_avail, last_delay;
    struct pollfd *pollfds;
    int n_pollfd;
    int64_t sample_count = 0;

    snd_pcm_hw_params_alloca(&hwparams);
    snd_pcm_sw_params_alloca(&swparams);
    snd_pcm_status_alloca(&status);

    r = clock_gettime(CLOCK_MONOTONIC, &start);
    assert(r == 0);

    start_us = timespec_us(&start);

    dev = argc > 1 ? argv[1] : "front:AudioPCI";

    r = snd_pcm_open(&pcm, dev, SND_PCM_STREAM_PLAYBACK, 0);
    assert(r == 0);

    r = snd_pcm_hw_params_any(pcm, hwparams);
    assert(r == 0);

    r = snd_pcm_hw_params_set_rate_resample(pcm, hwparams, 0);
    assert(r == 0);

    r = snd_pcm_hw_params_set_access(pcm, hwparams, SND_PCM_ACCESS_RW_INTERLEAVED);
    assert(r == 0);

    r = snd_pcm_hw_params_set_format(pcm, hwparams, SND_PCM_FORMAT_S16_LE);
    assert(r == 0);

    r = snd_pcm_hw_params_set_rate_near(pcm, hwparams, &rate, NULL);
    assert(r == 0);

    r = snd_pcm_hw_params_set_channels(pcm, hwparams, 2);
    assert(r == 0);

    r = snd_pcm_hw_params_set_periods_integer(pcm, hwparams);
    assert(r == 0);

    r = snd_pcm_hw_params_set_periods_near(pcm, hwparams, &periods, &dir);
    assert(r == 0);

    r = snd_pcm_hw_params_set_buffer_size_near(pcm, hwparams, &buffer_size);
    assert(r == 0);

    r = snd_pcm_hw_params(pcm, hwparams);
    assert(r == 0);

    r = snd_pcm_hw_params_current(pcm, hwparams);
    assert(r == 0);

    r = snd_pcm_sw_params_current(pcm, swparams);
    assert(r == 0);

    r = snd_pcm_sw_params_set_avail_min(pcm, swparams, 1);
    assert(r == 0);

    r = snd_pcm_sw_params_set_period_event(pcm, swparams, 0);
    assert(r == 0);

    r = snd_pcm_hw_params_get_buffer_size(hwparams, &buffer_size);
    assert(r == 0);
    r = snd_pcm_sw_params_set_start_threshold(pcm, swparams, buffer_size);
    assert(r == 0);
    period_size = buffer_size / periods;

    r = snd_pcm_sw_params_get_boundary(swparams, &boundary);
    assert(r == 0);
    r = snd_pcm_sw_params_set_stop_threshold(pcm, swparams, boundary);
    assert(r == 0);

    r = snd_pcm_sw_params_set_tstamp_mode(pcm, swparams, SND_PCM_TSTAMP_ENABLE);
    assert(r == 0);

    r = snd_pcm_sw_params(pcm, swparams);
    assert(r == 0);

    r = snd_pcm_prepare(pcm);
    assert(r == 0);

    r = snd_pcm_sw_params_current(pcm, swparams);
    assert(r == 0);

/*     assert(snd_pcm_hw_params_is_monotonic(hwparams) > 0); */

    n_pollfd = snd_pcm_poll_descriptors_count(pcm);
    assert(n_pollfd > 0);

    pollfds = malloc(sizeof(struct pollfd) * n_pollfd);
    assert(pollfds);

    r = snd_pcm_poll_descriptors(pcm, pollfds, n_pollfd);
    assert(r == n_pollfd);

    for (;;) {
	snd_pcm_sframes_t avail, delay, delta;
        struct timespec now, timestamp;
        unsigned short revents;
        int written = 0;
        uint64_t now_us, timestamp_us;
        snd_pcm_state_t state;

        r = poll(pollfds, n_pollfd, 0);
        assert(r >= 0);

        r = snd_pcm_poll_descriptors_revents(pcm, pollfds, n_pollfd, &revents);
        assert(r == 0);

        assert((revents & ~POLLOUT) == 0);

        avail = snd_pcm_avail(pcm);
        assert(avail >= 0);

        r = snd_pcm_status(pcm, status);
        assert(r == 0);

        /* This assertion fails from time to time. ALSA seems to be broken */
/*         assert(avail == (snd_pcm_sframes_t) snd_pcm_status_get_avail(status)); */
/*         printf("%lu %lu\n", (unsigned long) avail, (unsigned long) snd_pcm_status_get_avail(status)); */

        snd_pcm_status_get_htstamp(status, &timestamp);
        delay = snd_pcm_status_get_delay(status);
        state = snd_pcm_status_get_state(status);

        r = clock_gettime(CLOCK_MONOTONIC, &now);
        assert(r == 0);

        assert(!revents || avail > 0);

        if (avail) {
            snd_pcm_sframes_t sframes;
            static const uint16_t samples[2] = { 0, 0 };

            sframes = snd_pcm_writei(pcm, samples, 1);
            assert(sframes == 1);

            written = 1;
            sample_count++;
        }

        if (avail == last_avail && delay == last_delay) {
            /* This is boring */
            last_timestamp = timestamp;
            continue;
        }

        now_us = timespec_us(&now);
        timestamp_us = timespec_us(&timestamp);

	delta = avail - last_avail;
	if (state == SND_PCM_STATE_RUNNING &&
	    (long)delta >= JUMP_DETECT) { /* jump */
		printf("%llu\t(%llu)\tdelta %li\tavail %li\tdelay %li\n",
		       (unsigned long long) (timestamp_us ? timestamp_us - start_us : 0),
		       (unsigned long long ) (timestamp_us - timespec_us(&last_timestamp)),
		       (signed long) delta,
		       (signed long) avail,
		       (signed long) delay);
	}

        /** When this assert is hit, most likely something bad
         * happened, i.e. the avail jumped suddenly. */
        if ((unsigned) avail > buffer_size) {
		printf("XXX avail (%ld) > buffer_size (%ld)\n",
		       (long)avail, (long)buffer_size);
		printf("XXX last step = %ld\n",
		       (long)(timestamp_us - timespec_us(&last_timestamp)));
		return 1;
	}

        last_avail = avail;
        last_delay = delay;
        last_timestamp = timestamp;
    }

    return 0;
}

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-03-03 17:41                                                   ` Takashi Iwai
@ 2009-03-09 11:13                                                     ` Clive Messer
  2009-03-09 11:20                                                       ` Takashi Iwai
  0 siblings, 1 reply; 47+ messages in thread
From: Clive Messer @ 2009-03-09 11:13 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Lennart Poettering

On Tuesday 03 Mar 2009 17:41:22 Takashi Iwai wrote:

> > Hrm, but still an unexpected jump is found.
> >
> > If you build the driver with CONFIG_SND_PCM_XRUN_DEBUG=y, you must
> > have /proc/asound/card0/pcm0p/xrun_debug.  Echo 1 to this (as root)
> > 	echo 1 > /proc/asound/card0/pcm0p/xrun_debug
> > and try the program, see whether any debug message appears.
> > If any message appears, it means basically an unstable hardware.
> >
> > Also, the below is a patch I tried to clean up and improve the
> > handling.  Give it a try (and do echo 1 above for testing).
>
> BTW, the original test program is very hard to see what's wrong
> because it spews out way too many lines.  The below is a filtered-out
> version.
>
> It prints only unexpected jumps of avail values (the threshold is
> set 100 blindly).  The output is like below:
>
> 	% ./alsa-time-test hw
> 	21720872	(4987)	delta 216	avail 216	delay 4200
> 	21943118	(65)	delta 208	avail 208	delay 4208
> 	23752310	(3)	delta 232	avail 232	delay 4184
> 	23761847	(5972)	delta 264	avail 264	delay 4152
>
> The first column is the usec from the program start, the next value in
> parentheses is the time-step from the last time of status changes,
> the delta is the increase of avail, and the rest are raw values.
>
> If a too large delta appears in a short time-step, something is wrong.
> If it appears in a large time-step, it's simply a wrong responsiveness
> (aka system latency).

Hello Takashi/Lennart,

Sorry, I have been busy for a few days.
I patched my kernel (kernel-2.6.29-0.53.rc7.fc10.x86_64) with your pcm_lib.c 
patch and 'echo 1 > /proc/asound/card0/pcm0p/xrun_debug'.

I have done several runs with the new alsa-time-test.c code. Typically I get a 
bunch of kernel dmesg when I start the alsa-time-test program but not 
afterwards. Just a reminder - I'm running this on fast hardware - X58 / Core 
i7 920 - the machine is not loaded at all, so latency should be very low. 
snd_hda_intel - ' 00-00: AD198x Analog : AD198x Analog : playback 1 : capture 
3'.

Here is an example run .....

hw_ptr skipping! (pos=2096, delta=2096, period=1472)
hw_ptr skipping! (pos=2096, delta=2096, period=1472)
hw_ptr skipping! (pos=3600, delta=3600, period=1472)
hw_ptr skipping! (pos=3600, delta=3600, period=1472)
hw_ptr skipping! (pos=3600, delta=3600, period=1472)
hw_ptr skipping! (pos=3600, delta=3600, period=1472)
hw_ptr skipping! (pos=3600, delta=3600, period=1472)
hw_ptr skipping! (pos=3600, delta=3600, period=1472)

9901323          (2632)  delta 111       avail 117       delay 4299
19901290        (2599)  delta 112       avail 112       delay 4304
29901285        (3)     delta 112       avail 112       delay 4304
39901315        (2624)  delta 120       avail 120       delay 4296
49901317        (2626)  delta 112       avail 112       delay 4304
59901317        (2625)  delta 112       avail 112       delay 4304
69901323        (2632)  delta 118       avail 118       delay 4298
79901391        (3)     delta 120       avail 120       delay 4296
89901327        (2634)  delta 112       avail 112       delay 4304
99901283        (2591)  delta 120       avail 120       delay 4296
109901334       (2642)  delta 120       avail 120       delay 4296
119901320       (2627)  delta 112       avail 112       delay 4304

Regards

Clive
-- 
Clive Messer <clive@vacuumtube.org.uk>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-03-09 11:13                                                     ` Clive Messer
@ 2009-03-09 11:20                                                       ` Takashi Iwai
  2009-03-09 13:09                                                         ` Takashi Iwai
  0 siblings, 1 reply; 47+ messages in thread
From: Takashi Iwai @ 2009-03-09 11:20 UTC (permalink / raw)
  To: Clive Messer; +Cc: alsa-devel, Lennart Poettering

At Mon, 9 Mar 2009 11:13:21 +0000,
Clive Messer wrote:
> 
> On Tuesday 03 Mar 2009 17:41:22 Takashi Iwai wrote:
> 
> > > Hrm, but still an unexpected jump is found.
> > >
> > > If you build the driver with CONFIG_SND_PCM_XRUN_DEBUG=y, you must
> > > have /proc/asound/card0/pcm0p/xrun_debug.  Echo 1 to this (as root)
> > > 	echo 1 > /proc/asound/card0/pcm0p/xrun_debug
> > > and try the program, see whether any debug message appears.
> > > If any message appears, it means basically an unstable hardware.
> > >
> > > Also, the below is a patch I tried to clean up and improve the
> > > handling.  Give it a try (and do echo 1 above for testing).
> >
> > BTW, the original test program is very hard to see what's wrong
> > because it spews out way too many lines.  The below is a filtered-out
> > version.
> >
> > It prints only unexpected jumps of avail values (the threshold is
> > set 100 blindly).  The output is like below:
> >
> > 	% ./alsa-time-test hw
> > 	21720872	(4987)	delta 216	avail 216	delay 4200
> > 	21943118	(65)	delta 208	avail 208	delay 4208
> > 	23752310	(3)	delta 232	avail 232	delay 4184
> > 	23761847	(5972)	delta 264	avail 264	delay 4152
> >
> > The first column is the usec from the program start, the next value in
> > parentheses is the time-step from the last time of status changes,
> > the delta is the increase of avail, and the rest are raw values.
> >
> > If a too large delta appears in a short time-step, something is wrong.
> > If it appears in a large time-step, it's simply a wrong responsiveness
> > (aka system latency).
> 
> Hello Takashi/Lennart,
> 
> Sorry, I have been busy for a few days.
> I patched my kernel (kernel-2.6.29-0.53.rc7.fc10.x86_64) with your pcm_lib.c 
> patch and 'echo 1 > /proc/asound/card0/pcm0p/xrun_debug'.
> 
> I have done several runs with the new alsa-time-test.c code. Typically I get a 
> bunch of kernel dmesg when I start the alsa-time-test program but not 
> afterwards. Just a reminder - I'm running this on fast hardware - X58 / Core 
> i7 920 - the machine is not loaded at all, so latency should be very low. 
> snd_hda_intel - ' 00-00: AD198x Analog : AD198x Analog : playback 1 : capture 
> 3'.
> 
> Here is an example run .....
> 
> hw_ptr skipping! (pos=2096, delta=2096, period=1472)
> hw_ptr skipping! (pos=2096, delta=2096, period=1472)
> hw_ptr skipping! (pos=3600, delta=3600, period=1472)
> hw_ptr skipping! (pos=3600, delta=3600, period=1472)
> hw_ptr skipping! (pos=3600, delta=3600, period=1472)
> hw_ptr skipping! (pos=3600, delta=3600, period=1472)
> hw_ptr skipping! (pos=3600, delta=3600, period=1472)
> hw_ptr skipping! (pos=3600, delta=3600, period=1472)

Thanks.  This implies that the position pointer once reads zero
then the next value gets screwed up.  It shows the DMA pointer
read is pretty unreliable on your hardware.  It's obviously bad for
PA because the DMA pointer is the only trusted information for
glitch-free.  For the traditional method, this isn't usually so
critical, though.

> 9901323          (2632)  delta 111       avail 117       delay 4299
> 19901290        (2599)  delta 112       avail 112       delay 4304
> 29901285        (3)     delta 112       avail 112       delay 4304
> 39901315        (2624)  delta 120       avail 120       delay 4296
> 49901317        (2626)  delta 112       avail 112       delay 4304
> 59901317        (2625)  delta 112       avail 112       delay 4304
> 69901323        (2632)  delta 118       avail 118       delay 4298
> 79901391        (3)     delta 120       avail 120       delay 4296
> 89901327        (2634)  delta 112       avail 112       delay 4304
> 99901283        (2591)  delta 120       avail 120       delay 4296
> 109901334       (2642)  delta 120       avail 120       delay 4296
> 119901320       (2627)  delta 112       avail 112       delay 4304

These outputs look fairly OK.  The latency is about 2.5 ms and it's
about 110 samples.  That us, as long as the outputs look like above,
your system is working fine with my patch.

Let me know if any other cases are found.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Timer instability
  2009-03-09 11:20                                                       ` Takashi Iwai
@ 2009-03-09 13:09                                                         ` Takashi Iwai
  0 siblings, 0 replies; 47+ messages in thread
From: Takashi Iwai @ 2009-03-09 13:09 UTC (permalink / raw)
  To: Clive Messer; +Cc: alsa-devel, Lennart Poettering

At Mon, 09 Mar 2009 12:20:51 +0100,
I wrote:
> 
> At Mon, 9 Mar 2009 11:13:21 +0000,
> Clive Messer wrote:
> > 
> > On Tuesday 03 Mar 2009 17:41:22 Takashi Iwai wrote:
> > 
> > > > Hrm, but still an unexpected jump is found.
> > > >
> > > > If you build the driver with CONFIG_SND_PCM_XRUN_DEBUG=y, you must
> > > > have /proc/asound/card0/pcm0p/xrun_debug.  Echo 1 to this (as root)
> > > > 	echo 1 > /proc/asound/card0/pcm0p/xrun_debug
> > > > and try the program, see whether any debug message appears.
> > > > If any message appears, it means basically an unstable hardware.
> > > >
> > > > Also, the below is a patch I tried to clean up and improve the
> > > > handling.  Give it a try (and do echo 1 above for testing).
> > >
> > > BTW, the original test program is very hard to see what's wrong
> > > because it spews out way too many lines.  The below is a filtered-out
> > > version.
> > >
> > > It prints only unexpected jumps of avail values (the threshold is
> > > set 100 blindly).  The output is like below:
> > >
> > > 	% ./alsa-time-test hw
> > > 	21720872	(4987)	delta 216	avail 216	delay 4200
> > > 	21943118	(65)	delta 208	avail 208	delay 4208
> > > 	23752310	(3)	delta 232	avail 232	delay 4184
> > > 	23761847	(5972)	delta 264	avail 264	delay 4152
> > >
> > > The first column is the usec from the program start, the next value in
> > > parentheses is the time-step from the last time of status changes,
> > > the delta is the increase of avail, and the rest are raw values.
> > >
> > > If a too large delta appears in a short time-step, something is wrong.
> > > If it appears in a large time-step, it's simply a wrong responsiveness
> > > (aka system latency).
> > 
> > Hello Takashi/Lennart,
> > 
> > Sorry, I have been busy for a few days.
> > I patched my kernel (kernel-2.6.29-0.53.rc7.fc10.x86_64) with your pcm_lib.c 
> > patch and 'echo 1 > /proc/asound/card0/pcm0p/xrun_debug'.
> > 
> > I have done several runs with the new alsa-time-test.c code. Typically I get a 
> > bunch of kernel dmesg when I start the alsa-time-test program but not 
> > afterwards. Just a reminder - I'm running this on fast hardware - X58 / Core 
> > i7 920 - the machine is not loaded at all, so latency should be very low. 
> > snd_hda_intel - ' 00-00: AD198x Analog : AD198x Analog : playback 1 : capture 
> > 3'.
> > 
> > Here is an example run .....
> > 
> > hw_ptr skipping! (pos=2096, delta=2096, period=1472)
> > hw_ptr skipping! (pos=2096, delta=2096, period=1472)
> > hw_ptr skipping! (pos=3600, delta=3600, period=1472)
> > hw_ptr skipping! (pos=3600, delta=3600, period=1472)
> > hw_ptr skipping! (pos=3600, delta=3600, period=1472)
> > hw_ptr skipping! (pos=3600, delta=3600, period=1472)
> > hw_ptr skipping! (pos=3600, delta=3600, period=1472)
> > hw_ptr skipping! (pos=3600, delta=3600, period=1472)
> 
> Thanks.  This implies that the position pointer once reads zero
> then the next value gets screwed up.  It shows the DMA pointer
> read is pretty unreliable on your hardware.  It's obviously bad for
> PA because the DMA pointer is the only trusted information for
> glitch-free.  For the traditional method, this isn't usually so
> critical, though.
> 
> > 9901323          (2632)  delta 111       avail 117       delay 4299
> > 19901290        (2599)  delta 112       avail 112       delay 4304
> > 29901285        (3)     delta 112       avail 112       delay 4304
> > 39901315        (2624)  delta 120       avail 120       delay 4296
> > 49901317        (2626)  delta 112       avail 112       delay 4304
> > 59901317        (2625)  delta 112       avail 112       delay 4304
> > 69901323        (2632)  delta 118       avail 118       delay 4298
> > 79901391        (3)     delta 120       avail 120       delay 4296
> > 89901327        (2634)  delta 112       avail 112       delay 4304
> > 99901283        (2591)  delta 120       avail 120       delay 4296
> > 109901334       (2642)  delta 120       avail 120       delay 4296
> > 119901320       (2627)  delta 112       avail 112       delay 4304
> 
> These outputs look fairly OK.  The latency is about 2.5 ms and it's
> about 110 samples.  That us, as long as the outputs look like above,
> your system is working fine with my patch.

As the patch seems working with other tests, too, I merged it
(with a slight clean-up) into sound git tree now.


Takashi

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2009-03-09 13:09 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-19  2:46 Timer instability Lennart Poettering
2009-02-19  6:52 ` Takashi Iwai
2009-02-20  1:22   ` Lennart Poettering
2009-02-20  1:26     ` Lennart Poettering
2009-02-20  1:50     ` Lennart Poettering
2009-02-20  7:26     ` Takashi Iwai
2009-02-20 20:34       ` Lennart Poettering
2009-02-21 16:36         ` Takashi Iwai
2009-02-22  3:14           ` Lennart Poettering
2009-02-23  2:42             ` Lennart Poettering
2009-02-23  2:56               ` Lennart Poettering
2009-02-23 19:20                 ` Lennart Poettering
2009-02-23 19:24                   ` Colin Guthrie
2009-02-24  3:21                   ` Lennart Poettering
2009-02-23  7:47             ` Takashi Iwai
2009-02-24 16:27             ` Takashi Iwai
2009-02-24 18:46               ` Lennart Poettering
2009-02-24 18:59                 ` Takashi Iwai
2009-02-24 19:04                   ` Jaroslav Kysela
2009-02-24 19:26                   ` Lennart Poettering
2009-02-24 20:37                     ` Takashi Iwai
2009-02-25 10:08                       ` Takashi Iwai
2009-02-25 10:22                         ` Takashi Iwai
2009-02-25 12:34                           ` Clive Messer
2009-02-25 13:36                             ` Takashi Iwai
2009-02-25 15:11                               ` Clive Messer
2009-02-25 15:51                                 ` Takashi Iwai
2009-02-25 16:24                                   ` Clive Messer
2009-02-25 16:39                                     ` Takashi Iwai
     [not found]                                       ` <200902251656.47813.clive@vacuumtube.org.uk>
2009-02-25 16:59                                         ` Takashi Iwai
2009-02-25 17:36                                           ` Clive Messer
2009-02-25 18:13                                             ` Takashi Iwai
2009-02-25 18:16                                               ` Clive Messer
2009-02-25 18:34                                               ` Clive Messer
2009-03-03 16:05                                                 ` Takashi Iwai
2009-03-03 17:41                                                   ` Takashi Iwai
2009-03-09 11:13                                                     ` Clive Messer
2009-03-09 11:20                                                       ` Takashi Iwai
2009-03-09 13:09                                                         ` Takashi Iwai
2009-02-25 15:04                             ` Takashi Iwai
2009-02-25 10:44                         ` Lennart Poettering
2009-02-24 21:16                     ` Lennart Poettering
2009-02-25 10:48                       ` Takashi Iwai
2009-02-25 10:57                         ` Jaroslav Kysela
2009-02-25 11:04                           ` Takashi Iwai
2009-02-25 11:10                             ` Jaroslav Kysela
2009-02-25 11:17                               ` Takashi Iwai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.