alsa-devel.alsa-project.org archive mirror
 help / color / mirror / Atom feed
* pcm_meter.c issue at s16_update
@ 2020-07-26 18:20 Pavel Hofman
  2020-07-28 16:46 ` Pavel Hofman
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Hofman @ 2020-07-26 18:20 UTC (permalink / raw)
  To: alsa-devel, Takashi Iwai

Hi,

I am debugging the following problem with the PCM METER API:

An application with meter/scope configured in .asoundrc hits a high CPU
load sometimes when a new playback stream is opened. E.g. in MPD when
opening a new radio stream. The 100% CPU core load takes tens of seconds.

Debugging the issue revealed this:


TL;DR: at decrease of [status.appl_ptr - status.delay] between
consequent runs the size of buffer to convert by the built-in s16 scope
is set from usual 100s to huge value of pcm->boundary (1.5G), causing a
very long processing at 100% core load.

Details:

Every scope setup uses a default s16 scope defined in alsa-lib, which
converts samples to s16 so that the user-provided scopes can access data
via snd_pcm_scope_s16_get_channel_buffer buffers with converted samples.

The s16 scope converts samples in a loop, in frames chunks, decrementing
size variable by frames until zero. The value of size is hundreds
samples max at each s16_update call, calculated in
https://github.com/alsa-project/alsa-lib/blob/master/src/pcm/pcm_meter.c#L1100
. A debug added after that line shows this:

@@ -1098,14 +1103,18 @@ static void s16_update(snd_pcm_scope_t *scope)
        snd_pcm_sframes_t size;
        snd_pcm_uframes_t offset;
        size = meter->now - s16->old;
+       fprintf(stderr, "s16_update 1: meter->now %ld, s16->old %ld,
size %ld\n", meter->now, s16->old, size);
        if (size < 0)
                size += spcm->boundary;
        offset = s16->old % meter->buf_size;




s16_update 1: meter->now 2154673, s16->old 2153771, size 902
s16_update 1: meter->now 2155579, s16->old 2154673, size 906
s16_update 1: meter->now 2156487, s16->old 2155579, size 908
s16_update 1: meter->now 2157391, s16->old 2156487, size 904
s16_update 1: meter->now 2158298, s16->old 2157391, size 907
s16_update 1: meter->now 2159203, s16->old 2158298, size 905
s16_update 1: meter->now 2160091, s16->old 2159203, size 888
s16_update 1: meter->now 2160976, s16->old 2160091, size 885
s16_update 1: meter->now 2161924, s16->old 2160976, size 948
s16_update 1: meter->now 2162829, s16->old 2161924, size 905
s16_update 1: meter->now 2163733, s16->old 2162829, size 904
s16_update 1: meter->now 2164594, s16->old 2163733, size 861
s16_update 1: meter->now 2165542, s16->old 2164594, size 948
s16_update 1: meter->now 2166448, s16->old 2165542, size 906
s16_update 1: meter->now 2167352, s16->old 2166448, size 904
s16_update 1: meter->now 2168259, s16->old 2167352, size 907
s16_update 1: meter->now 2169165, s16->old 2168259, size 906
s16_update 1: meter->now 2170050, s16->old 2169165, size 885
s16_update 1: meter->now 2170936, s16->old 2170050, size 886
s16_update 1: meter->now 2171877, s16->old 2170936, size 941
s16_update 1: meter->now 2172783, s16->old 2171877, size 906
s16_update 1: meter->now 2173690, s16->old 2172783, size 907
s16_update 1: meter->now 2174596, s16->old 2173690, size 906
s16_update 1: meter->now 2175502, s16->old 2174596, size 906
s16_update 1: meter->now 2176407, s16->old 2175502, size 905
s16_update 1: meter->now 2177311, s16->old 2176407, size 904
s16_update 1: meter->now 2178215, s16->old 2177311, size 904


However, sometimes meter->now < s16->old, causing size < 0, for which
this line applies
https://github.com/alsa-project/alsa-lib/blob/master/src/pcm/pcm_meter.c#L1102

size += spcm->boundary

and size becomes a huge number, because hw:0 pcm -> boundary is
1445068800. The loop at
https://github.com/alsa-project/alsa-lib/blob/master/src/pcm/pcm_meter.c#L1104
takes tens of seconds, untill the huge buffer is converted to s16.

Usually the negative difference is quite small:

s16_update 1: meter->now 988, s16->old 21139, size -20151
s16_update 1: meter->now 870, s16->old 18865, size -17995
s16_update 1: meter->now 1065, s16->old 20786, size -19721
s16_update 1: meter->now 701, s16->old 24839, size -24138
s16_update 1: meter->now 782, s16->old 869, size -87
s16_update 1: meter->now 763, s16->old 841, size -78
s16_update 1: meter->now 963, s16->old 22525, size -21562
s16_update 1: meter->now 665, s16->old 832, size -167
s16_update 1: meter->now 745, s16->old 865, size -120
s16_update 1: meter->now 859, s16->old 1002, size -143
s16_update 1: meter->now 935, s16->old 117452, size -116517


Yet in all these cases the size variable is incremented by 1445068800,
which clearly does not make sense.

Now the question is how it happens that meter->now < s16->old, when
meter->now is calculated by now = status.appl_ptr - status.delay
https://github.com/alsa-project/alsa-lib/blob/master/src/pcm/pcm_meter.c#L219
and s16->old is set to meter->now at the end of each s16_update call
https://github.com/alsa-project/alsa-lib/blob/master/src/pcm/pcm_meter.c#L1142
.

The debug around the event looks like this, suddenly meter->now drops down:

s16_update 1: meter->now 2567498, s16->old 2566593, size 905
s16_update 1: meter->now 2568401, s16->old 2567498, size 903
s16_update 1: meter->now 20786, s16->old 20786, size 0
s16_update 1: meter->now 1065, s16->old 20786, size -19721
s16_update 1: meter->now 24839, s16->old 24838, size 1
s16_update 1: meter->now 701, s16->old 24839, size -24138
s16_update 1: meter->now 1253162, s16->old 701, size 1252461
s16_update 1: meter->now 1255148, s16->old 1253162, size 1986

..........

s16_update 1: meter->now 11136, s16->old 10261, size 875
s16_update 1: meter->now 22525, s16->old 22524, size 1
s16_update 1: meter->now 963, s16->old 22525, size -21562
s16_update 1: meter->now 1270914, s16->old 963, size 1269951
s16_update 1: meter->now 1272917, s16->old 1270914, size 2003


Could it perhaps be an XRUN, causing the difference (status.appl_ptr -
status.delay)  to actually decrease compared to the previous run? I
really do not know how/when such decrease can happen. But if it is a
standard situation, than IMO the size of to-be converted buffer by the
s16 scope  should not be set to the huge pcm->boundary.

Thanks a lot for help.

Best regards,

Pavel.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-07-26 18:20 pcm_meter.c issue at s16_update Pavel Hofman
@ 2020-07-28 16:46 ` Pavel Hofman
  2020-07-28 17:04   ` Takashi Iwai
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Hofman @ 2020-07-28 16:46 UTC (permalink / raw)
  To: alsa-devel, Takashi Iwai


Dne 26. 07. 20 v 20:20 Pavel Hofman napsal(a):
> Hi,
> 
> I am debugging the following problem with the PCM METER API:
> 
> An application with meter/scope configured in .asoundrc hits a high CPU
> load sometimes when a new playback stream is opened. E.g. in MPD when
> opening a new radio stream. The 100% CPU core load takes tens of seconds.
> ...........
> 
> TL;DR: at decrease of [status.appl_ptr - status.delay] between
> consequent runs the size of buffer to convert by the built-in s16 scope
> is set from usual 100s to huge value of pcm->boundary (1.5G), causing a
> very long processing at 100% core load.
> 
> The debug around the event looks like this, suddenly meter->now drops down:
> 
> s16_update 1: meter->now 2567498, s16->old 2566593, size 905
> s16_update 1: meter->now 2568401, s16->old 2567498, size 903
> s16_update 1: meter->now 20786, s16->old 20786, size 0
> s16_update 1: meter->now 1065, s16->old 20786, size -19721
> s16_update 1: meter->now 24839, s16->old 24838, size 1
> s16_update 1: meter->now 701, s16->old 24839, size -24138
> s16_update 1: meter->now 1253162, s16->old 701, size 1252461
> s16_update 1: meter->now 1255148, s16->old 1253162, size 1986
> 
> ..........
> 
> s16_update 1: meter->now 11136, s16->old 10261, size 875
> s16_update 1: meter->now 22525, s16->old 22524, size 1
> s16_update 1: meter->now 963, s16->old 22525, size -21562
> s16_update 1: meter->now 1270914, s16->old 963, size 1269951
> s16_update 1: meter->now 1272917, s16->old 1270914, size 2003
> 

I think the problem is that s16->old is not reset when status.appl_ptr
is zeroed and starts running again. There is a call

static void s16_reset(snd_pcm_scope_t *scope)
{
	snd_pcm_scope_s16_t *s16 = scope->private_data;
	snd_pcm_meter_t *meter = s16->pcm->private_data;
	s16->old = meter->now;
}

but I do not know when this method is called and whether the meter->now
is already assigned to the newly zeroed status.appl_ptr.

Please at which method should I reset s16->old = 0?
* s16_reset
* s16_enable
* s16_start
* snd_pcm_scope_s16_open


Thanks a lot for any help,

Pavel.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-07-28 16:46 ` Pavel Hofman
@ 2020-07-28 17:04   ` Takashi Iwai
  2020-07-28 18:04     ` Pavel Hofman
  0 siblings, 1 reply; 16+ messages in thread
From: Takashi Iwai @ 2020-07-28 17:04 UTC (permalink / raw)
  To: Pavel Hofman; +Cc: alsa-devel

On Tue, 28 Jul 2020 18:46:00 +0200,
Pavel Hofman wrote:
> 
> 
> Dne 26. 07. 20 v 20:20 Pavel Hofman napsal(a):
> > Hi,
> > 
> > I am debugging the following problem with the PCM METER API:
> > 
> > An application with meter/scope configured in .asoundrc hits a high CPU
> > load sometimes when a new playback stream is opened. E.g. in MPD when
> > opening a new radio stream. The 100% CPU core load takes tens of seconds.
> > ...........
> > 
> > TL;DR: at decrease of [status.appl_ptr - status.delay] between
> > consequent runs the size of buffer to convert by the built-in s16 scope
> > is set from usual 100s to huge value of pcm->boundary (1.5G), causing a
> > very long processing at 100% core load.
> > 
> > The debug around the event looks like this, suddenly meter->now drops down:
> > 
> > s16_update 1: meter->now 2567498, s16->old 2566593, size 905
> > s16_update 1: meter->now 2568401, s16->old 2567498, size 903
> > s16_update 1: meter->now 20786, s16->old 20786, size 0
> > s16_update 1: meter->now 1065, s16->old 20786, size -19721
> > s16_update 1: meter->now 24839, s16->old 24838, size 1
> > s16_update 1: meter->now 701, s16->old 24839, size -24138
> > s16_update 1: meter->now 1253162, s16->old 701, size 1252461
> > s16_update 1: meter->now 1255148, s16->old 1253162, size 1986
> > 
> > ..........
> > 
> > s16_update 1: meter->now 11136, s16->old 10261, size 875
> > s16_update 1: meter->now 22525, s16->old 22524, size 1
> > s16_update 1: meter->now 963, s16->old 22525, size -21562
> > s16_update 1: meter->now 1270914, s16->old 963, size 1269951
> > s16_update 1: meter->now 1272917, s16->old 1270914, size 2003
> > 
> 
> I think the problem is that s16->old is not reset when status.appl_ptr
> is zeroed and starts running again. There is a call
> 
> static void s16_reset(snd_pcm_scope_t *scope)
> {
> 	snd_pcm_scope_s16_t *s16 = scope->private_data;
> 	snd_pcm_meter_t *meter = s16->pcm->private_data;
> 	s16->old = meter->now;
> }
> 
> but I do not know when this method is called and whether the meter->now
> is already assigned to the newly zeroed status.appl_ptr.

Would adding atomic_add(&meter->reset, 1) in snd_pcm_meter_reset()
help?


Takashi


> 
> Please at which method should I reset s16->old = 0?
> * s16_reset
> * s16_enable
> * s16_start
> * snd_pcm_scope_s16_open
> 
> 
> Thanks a lot for any help,
> 
> Pavel.
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-07-28 17:04   ` Takashi Iwai
@ 2020-07-28 18:04     ` Pavel Hofman
  2020-07-28 18:54       ` Pavel Hofman
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Hofman @ 2020-07-28 18:04 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

Dne 28. 07. 20 v 19:04 Takashi Iwai napsal(a):
> On Tue, 28 Jul 2020 18:46:00 +0200,
> Pavel Hofman wrote:
>>
>>
>>>
>>> The debug around the event looks like this, suddenly meter->now drops down:
>>>
>>> s16_update 1: meter->now 2567498, s16->old 2566593, size 905
>>> s16_update 1: meter->now 2568401, s16->old 2567498, size 903
>>> s16_update 1: meter->now 20786, s16->old 20786, size 0
>>> s16_update 1: meter->now 1065, s16->old 20786, size -19721
>>> s16_update 1: meter->now 24839, s16->old 24838, size 1
>>> s16_update 1: meter->now 701, s16->old 24839, size -24138
>>> s16_update 1: meter->now 1253162, s16->old 701, size 1252461
>>> s16_update 1: meter->now 1255148, s16->old 1253162, size 1986
>>>
>>> ..........
>>>
>>> s16_update 1: meter->now 11136, s16->old 10261, size 875
>>> s16_update 1: meter->now 22525, s16->old 22524, size 1
>>> s16_update 1: meter->now 963, s16->old 22525, size -21562
>>> s16_update 1: meter->now 1270914, s16->old 963, size 1269951
>>> s16_update 1: meter->now 1272917, s16->old 1270914, size 2003
>>>
>>
>> I think the problem is that s16->old is not reset when status.appl_ptr
>> is zeroed and starts running again. There is a call
>>
>> static void s16_reset(snd_pcm_scope_t *scope)
>> {
>> 	snd_pcm_scope_s16_t *s16 = scope->private_data;
>> 	snd_pcm_meter_t *meter = s16->pcm->private_data;
>> 	s16->old = meter->now;
>> }
>>
>> but I do not know when this method is called and whether the meter->now
>> is already assigned to the newly zeroed status.appl_ptr.
> 
> Would adding atomic_add(&meter->reset, 1) in snd_pcm_meter_reset()
> help?
> 
Unfortunately not.

s16_reset is called correctly, setting s16->old = meter->now;  But at
that time meter->now is still 22751, setting s16->old to the same value.

s16_update 1: meter->now 22751, s16->old 22751, size 0

However, in the next update call meter->now comes from the freshly
started application pointer:

s16_update 1: meter->now 839, s16->old 22751, size -21912


Of course this helps:

-       if (size < 0)
-               size += spcm->boundary;
+       if (size < 0) {
+               size = meter->now;
+               s16->old = 0;
+       }

But I understand this is not a solution because:

* it will not work at reaching spcm->boundary (after thousands of hours?)
* it will cause the same problem when the stream is rewound (which is
the problem now too) - size will equal to large meter->now (length from
the beginning of the stream minus the rewound = large number).


BTW, please why is snd_pcm_meter_update_scope called only in capture stream?

Thanks,

Pavel.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-07-28 18:04     ` Pavel Hofman
@ 2020-07-28 18:54       ` Pavel Hofman
  2020-08-02 17:50         ` Pavel Hofman
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Hofman @ 2020-07-28 18:54 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel


Dne 28. 07. 20 v 20:04 Pavel Hofman napsal(a):
> Dne 28. 07. 20 v 19:04 Takashi Iwai napsal(a):
>> On Tue, 28 Jul 2020 18:46:00 +0200,
>> Pavel Hofman wrote:
>>>
>>>
>>>>
>>>> The debug around the event looks like this, suddenly meter->now drops down:
>>>>
>>>> s16_update 1: meter->now 2567498, s16->old 2566593, size 905
>>>> s16_update 1: meter->now 2568401, s16->old 2567498, size 903
>>>> s16_update 1: meter->now 20786, s16->old 20786, size 0
>>>> s16_update 1: meter->now 1065, s16->old 20786, size -19721
>>>> s16_update 1: meter->now 24839, s16->old 24838, size 1
>>>> s16_update 1: meter->now 701, s16->old 24839, size -24138
>>>> s16_update 1: meter->now 1253162, s16->old 701, size 1252461
>>>> s16_update 1: meter->now 1255148, s16->old 1253162, size 1986
>>>>
>>>> ..........
>>>>
>>>> s16_update 1: meter->now 11136, s16->old 10261, size 875
>>>> s16_update 1: meter->now 22525, s16->old 22524, size 1
>>>> s16_update 1: meter->now 963, s16->old 22525, size -21562
>>>> s16_update 1: meter->now 1270914, s16->old 963, size 1269951
>>>> s16_update 1: meter->now 1272917, s16->old 1270914, size 2003
>>>>
>>>
>>> I think the problem is that s16->old is not reset when status.appl_ptr
>>> is zeroed and starts running again. There is a call
>>>
>>> static void s16_reset(snd_pcm_scope_t *scope)
>>> {
>>> 	snd_pcm_scope_s16_t *s16 = scope->private_data;
>>> 	snd_pcm_meter_t *meter = s16->pcm->private_data;
>>> 	s16->old = meter->now;
>>> }
>>>
>>> but I do not know when this method is called and whether the meter->now
>>> is already assigned to the newly zeroed status.appl_ptr.
>>
>> Would adding atomic_add(&meter->reset, 1) in snd_pcm_meter_reset()
>> help?
>>
> Unfortunately not.
> 
> s16_reset is called correctly, setting s16->old = meter->now;  But at
> that time meter->now is still 22751, setting s16->old to the same value.
> 
> s16_update 1: meter->now 22751, s16->old 22751, size 0
> 
> However, in the next update call meter->now comes from the freshly
> started application pointer:
> 
> s16_update 1: meter->now 839, s16->old 22751, size -21912
> 
> 
> Of course this helps:
> 
> -       if (size < 0)
> -               size += spcm->boundary;
> +       if (size < 0) {
> +               size = meter->now;
> +               s16->old = 0;
> +       }
> 
> But I understand this is not a solution because:
> 
> * it will not work at reaching spcm->boundary (after thousands of hours?)
> * it will cause the same problem when the stream is rewound (which is
> the problem now too) - size will equal to large meter->now (length from
> the beginning of the stream minus the rewound = large number).
> 

IMHO there are two cases of the [application pointer + delay] drop 
compared to the previous run:

* stream start, rewinding => s16->old = meter->now; size =0, i.e. 
skipping the samples to show
* wrapping at spcm->boundary => size += spcm->boundary, i.e. showing the 
wrapped samples

Optionally the second case could be handled just like the first case by 
resetting s16->old, assuming the boundary wrap occurs very infrequently.

Pavel.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-07-28 18:54       ` Pavel Hofman
@ 2020-08-02 17:50         ` Pavel Hofman
  2020-08-03  6:17           ` Takashi Iwai
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Hofman @ 2020-08-02 17:50 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel


Dne 28. 07. 20 v 20:54 Pavel Hofman napsal(a):
> 
> Dne 28. 07. 20 v 20:04 Pavel Hofman napsal(a):
>> Dne 28. 07. 20 v 19:04 Takashi Iwai napsal(a):
>>> Would adding atomic_add(&meter->reset, 1) in snd_pcm_meter_reset()
>>> help?
>>>
>> Unfortunately not.
>>
>> s16_reset is called correctly, setting s16->old = meter->now;  But at
>> that time meter->now is still 22751, setting s16->old to the same value.
>>
>> s16_update 1: meter->now 22751, s16->old 22751, size 0
>>
>> However, in the next update call meter->now comes from the freshly
>> started application pointer:
>>
>> s16_update 1: meter->now 839, s16->old 22751, size -21912
>>
>>
>> Of course this helps:
>>
>> -       if (size < 0)
>> -               size += spcm->boundary;
>> +       if (size < 0) {
>> +               size = meter->now;
>> +               s16->old = 0;
>> +       }
>>
>> But I understand this is not a solution because:
>>
>> * it will not work at reaching spcm->boundary (after thousands of hours?)
>> * it will cause the same problem when the stream is rewound (which is
>> the problem now too) - size will equal to large meter->now (length from
>> the beginning of the stream minus the rewound = large number).
>>
> 
> IMHO there are two cases of the [application pointer + delay] drop
> compared to the previous run:
> 
> * stream start, rewinding => s16->old = meter->now; size =0, i.e.
> skipping the samples to show
> * wrapping at spcm->boundary => size += spcm->boundary, i.e. showing the
> wrapped samples
> 
> Optionally the second case could be handled just like the first case by
> resetting s16->old, assuming the boundary wrap occurs very infrequently.

The following patch is tested to work OK, no CPU peaks and no meter
output glitches when the size < 0 condition occurs:

diff --git a/src/pcm/pcm_meter.c b/src/pcm/pcm_meter.c
index 20b41876..48df5945 100644
--- a/src/pcm/pcm_meter.c
+++ b/src/pcm/pcm_meter.c
@@ -1098,8 +1098,15 @@ static void s16_update(snd_pcm_scope_t *scope)
        snd_pcm_sframes_t size;
        snd_pcm_uframes_t offset;
        size = meter->now - s16->old;
-       if (size < 0)
-               size += spcm->boundary;
+       if (size < 0) {
+               /**
+                * Application pointer adjusted for delay (meter->now)
has dropped compared
+                * to the previous update cycle. Either spcm->boundary
wraparound, pcm rewinding,
+                * or pcm restart without s16->old properly reset.
+                * In any case the safest solution is skipping this
conversion cycle.
+                */
+               size = 0;
+       }
        offset = s16->old % meter->buf_size;
        while (size > 0) {
                snd_pcm_uframes_t frames = size;



Please will you accept this (workaround) bugfix? If so, I would send a
proper patch.

Thanks a lot,

Pavel.

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-08-02 17:50         ` Pavel Hofman
@ 2020-08-03  6:17           ` Takashi Iwai
  2020-08-03  7:22             ` Jaroslav Kysela
  0 siblings, 1 reply; 16+ messages in thread
From: Takashi Iwai @ 2020-08-03  6:17 UTC (permalink / raw)
  To: Pavel Hofman; +Cc: alsa-devel

On Sun, 02 Aug 2020 19:50:44 +0200,
Pavel Hofman wrote:
> 
> 
> Dne 28. 07. 20 v 20:54 Pavel Hofman napsal(a):
> > 
> > Dne 28. 07. 20 v 20:04 Pavel Hofman napsal(a):
> >> Dne 28. 07. 20 v 19:04 Takashi Iwai napsal(a):
> >>> Would adding atomic_add(&meter->reset, 1) in snd_pcm_meter_reset()
> >>> help?
> >>>
> >> Unfortunately not.
> >>
> >> s16_reset is called correctly, setting s16->old = meter->now;  But at
> >> that time meter->now is still 22751, setting s16->old to the same value.
> >>
> >> s16_update 1: meter->now 22751, s16->old 22751, size 0
> >>
> >> However, in the next update call meter->now comes from the freshly
> >> started application pointer:
> >>
> >> s16_update 1: meter->now 839, s16->old 22751, size -21912
> >>
> >>
> >> Of course this helps:
> >>
> >> -       if (size < 0)
> >> -               size += spcm->boundary;
> >> +       if (size < 0) {
> >> +               size = meter->now;
> >> +               s16->old = 0;
> >> +       }
> >>
> >> But I understand this is not a solution because:
> >>
> >> * it will not work at reaching spcm->boundary (after thousands of hours?)
> >> * it will cause the same problem when the stream is rewound (which is
> >> the problem now too) - size will equal to large meter->now (length from
> >> the beginning of the stream minus the rewound = large number).
> >>
> > 
> > IMHO there are two cases of the [application pointer + delay] drop
> > compared to the previous run:
> > 
> > * stream start, rewinding => s16->old = meter->now; size =0, i.e.
> > skipping the samples to show
> > * wrapping at spcm->boundary => size += spcm->boundary, i.e. showing the
> > wrapped samples
> > 
> > Optionally the second case could be handled just like the first case by
> > resetting s16->old, assuming the boundary wrap occurs very infrequently.
> 
> The following patch is tested to work OK, no CPU peaks and no meter
> output glitches when the size < 0 condition occurs:
> 
> diff --git a/src/pcm/pcm_meter.c b/src/pcm/pcm_meter.c
> index 20b41876..48df5945 100644
> --- a/src/pcm/pcm_meter.c
> +++ b/src/pcm/pcm_meter.c
> @@ -1098,8 +1098,15 @@ static void s16_update(snd_pcm_scope_t *scope)
>         snd_pcm_sframes_t size;
>         snd_pcm_uframes_t offset;
>         size = meter->now - s16->old;
> -       if (size < 0)
> -               size += spcm->boundary;
> +       if (size < 0) {
> +               /**
> +                * Application pointer adjusted for delay (meter->now)
> has dropped compared
> +                * to the previous update cycle. Either spcm->boundary
> wraparound, pcm rewinding,
> +                * or pcm restart without s16->old properly reset.
> +                * In any case the safest solution is skipping this
> conversion cycle.
> +                */
> +               size = 0;
> +       }
>         offset = s16->old % meter->buf_size;
>         while (size > 0) {
>                 snd_pcm_uframes_t frames = size;
> 
> 
> 
> Please will you accept this (workaround) bugfix? If so, I would send a
> proper patch.

It looks OK, at least this must be safe.
So yes, I'll happily apply if you submit a proper patch.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-08-03  6:17           ` Takashi Iwai
@ 2020-08-03  7:22             ` Jaroslav Kysela
  2020-08-03 10:48               ` Pavel Hofman
  0 siblings, 1 reply; 16+ messages in thread
From: Jaroslav Kysela @ 2020-08-03  7:22 UTC (permalink / raw)
  To: Takashi Iwai, Pavel Hofman; +Cc: alsa-devel

Dne 03. 08. 20 v 8:17 Takashi Iwai napsal(a):
> On Sun, 02 Aug 2020 19:50:44 +0200,
> Pavel Hofman wrote:
>>
>>
>> Dne 28. 07. 20 v 20:54 Pavel Hofman napsal(a):
>>>
>>> Dne 28. 07. 20 v 20:04 Pavel Hofman napsal(a):
>>>> Dne 28. 07. 20 v 19:04 Takashi Iwai napsal(a):
>>>>> Would adding atomic_add(&meter->reset, 1) in snd_pcm_meter_reset()
>>>>> help?
>>>>>
>>>> Unfortunately not.
>>>>
>>>> s16_reset is called correctly, setting s16->old = meter->now;  But at
>>>> that time meter->now is still 22751, setting s16->old to the same value.
>>>>
>>>> s16_update 1: meter->now 22751, s16->old 22751, size 0
>>>>
>>>> However, in the next update call meter->now comes from the freshly
>>>> started application pointer:
>>>>
>>>> s16_update 1: meter->now 839, s16->old 22751, size -21912
>>>>
>>>>
>>>> Of course this helps:
>>>>
>>>> -       if (size < 0)
>>>> -               size += spcm->boundary;
>>>> +       if (size < 0) {
>>>> +               size = meter->now;
>>>> +               s16->old = 0;
>>>> +       }
>>>>
>>>> But I understand this is not a solution because:
>>>>
>>>> * it will not work at reaching spcm->boundary (after thousands of hours?)
>>>> * it will cause the same problem when the stream is rewound (which is
>>>> the problem now too) - size will equal to large meter->now (length from
>>>> the beginning of the stream minus the rewound = large number).
>>>>
>>>
>>> IMHO there are two cases of the [application pointer + delay] drop
>>> compared to the previous run:
>>>
>>> * stream start, rewinding => s16->old = meter->now; size =0, i.e.
>>> skipping the samples to show
>>> * wrapping at spcm->boundary => size += spcm->boundary, i.e. showing the
>>> wrapped samples
>>>
>>> Optionally the second case could be handled just like the first case by
>>> resetting s16->old, assuming the boundary wrap occurs very infrequently.
>>
>> The following patch is tested to work OK, no CPU peaks and no meter
>> output glitches when the size < 0 condition occurs:
>>
>> diff --git a/src/pcm/pcm_meter.c b/src/pcm/pcm_meter.c
>> index 20b41876..48df5945 100644
>> --- a/src/pcm/pcm_meter.c
>> +++ b/src/pcm/pcm_meter.c
>> @@ -1098,8 +1098,15 @@ static void s16_update(snd_pcm_scope_t *scope)
>>         snd_pcm_sframes_t size;
>>         snd_pcm_uframes_t offset;
>>         size = meter->now - s16->old;
>> -       if (size < 0)
>> -               size += spcm->boundary;
>> +       if (size < 0) {
>> +               /**
>> +                * Application pointer adjusted for delay (meter->now)
>> has dropped compared
>> +                * to the previous update cycle. Either spcm->boundary
>> wraparound, pcm rewinding,
>> +                * or pcm restart without s16->old properly reset.
>> +                * In any case the safest solution is skipping this
>> conversion cycle.
>> +                */
>> +               size = 0;
>> +       }
>>         offset = s16->old % meter->buf_size;
>>         while (size > 0) {
>>                 snd_pcm_uframes_t frames = size;
>>
>>
>>
>> Please will you accept this (workaround) bugfix? If so, I would send a
>> proper patch.
> 
> It looks OK, at least this must be safe.
> So yes, I'll happily apply if you submit a proper patch.

It would be probably better to check against the boundary / 2 value to check
correctly the boundary wrap instead to drop all negative size values:

  if (size < 0) {
     if (size < -(spcm->boundary / 2))
        size += spcm->boundary;
     else
        size = 0;
  }

The "hidden" pcm restart referred in the comment should not occur, otherwise
it's another bug somewhere.

					Jaroslav

-- 
Jaroslav Kysela <perex@perex.cz>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-08-03  7:22             ` Jaroslav Kysela
@ 2020-08-03 10:48               ` Pavel Hofman
  2020-08-09  7:05                 ` Pavel Hofman
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Hofman @ 2020-08-03 10:48 UTC (permalink / raw)
  To: Jaroslav Kysela, Takashi Iwai; +Cc: alsa-devel



Dne 03. 08. 20 v 9:22 Jaroslav Kysela napsal(a):
> Dne 03. 08. 20 v 8:17 Takashi Iwai napsal(a):
>> On Sun, 02 Aug 2020 19:50:44 +0200,
>> Pavel Hofman wrote:
>>>
>>>
>>> Dne 28. 07. 20 v 20:54 Pavel Hofman napsal(a):
>>>>
>>>> Dne 28. 07. 20 v 20:04 Pavel Hofman napsal(a):
>>>>> Dne 28. 07. 20 v 19:04 Takashi Iwai napsal(a):
>>>>>> Would adding atomic_add(&meter->reset, 1) in snd_pcm_meter_reset()
>>>>>> help?
>>>>>>
>>>>> Unfortunately not.
>>>>>
>>>>> s16_reset is called correctly, setting s16->old = meter->now;  But at
>>>>> that time meter->now is still 22751, setting s16->old to the same value.
>>>>>
>>>>> s16_update 1: meter->now 22751, s16->old 22751, size 0
>>>>>
>>>>> However, in the next update call meter->now comes from the freshly
>>>>> started application pointer:
>>>>>
>>>>> s16_update 1: meter->now 839, s16->old 22751, size -21912
>>>>>
>>>>>
>>>>> Of course this helps:
>>>>>
>>>>> -       if (size < 0)
>>>>> -               size += spcm->boundary;
>>>>> +       if (size < 0) {
>>>>> +               size = meter->now;
>>>>> +               s16->old = 0;
>>>>> +       }
>>>>>
>>>>> But I understand this is not a solution because:
>>>>>
>>>>> * it will not work at reaching spcm->boundary (after thousands of hours?)
>>>>> * it will cause the same problem when the stream is rewound (which is
>>>>> the problem now too) - size will equal to large meter->now (length from
>>>>> the beginning of the stream minus the rewound = large number).
>>>>>
>>>>
>>>> IMHO there are two cases of the [application pointer + delay] drop
>>>> compared to the previous run:
>>>>
>>>> * stream start, rewinding => s16->old = meter->now; size =0, i.e.
>>>> skipping the samples to show
>>>> * wrapping at spcm->boundary => size += spcm->boundary, i.e. showing the
>>>> wrapped samples
>>>>
>>>> Optionally the second case could be handled just like the first case by
>>>> resetting s16->old, assuming the boundary wrap occurs very infrequently.
>>>
>>> The following patch is tested to work OK, no CPU peaks and no meter
>>> output glitches when the size < 0 condition occurs:
>>>
>>> diff --git a/src/pcm/pcm_meter.c b/src/pcm/pcm_meter.c
>>> index 20b41876..48df5945 100644
>>> --- a/src/pcm/pcm_meter.c
>>> +++ b/src/pcm/pcm_meter.c
>>> @@ -1098,8 +1098,15 @@ static void s16_update(snd_pcm_scope_t *scope)
>>>          snd_pcm_sframes_t size;
>>>          snd_pcm_uframes_t offset;
>>>          size = meter->now - s16->old;
>>> -       if (size < 0)
>>> -               size += spcm->boundary;
>>> +       if (size < 0) {
>>> +               /**
>>> +                * Application pointer adjusted for delay (meter->now)
>>> has dropped compared
>>> +                * to the previous update cycle. Either spcm->boundary
>>> wraparound, pcm rewinding,
>>> +                * or pcm restart without s16->old properly reset.
>>> +                * In any case the safest solution is skipping this
>>> conversion cycle.
>>> +                */
>>> +               size = 0;
>>> +       }
>>>          offset = s16->old % meter->buf_size;
>>>          while (size > 0) {
>>>                  snd_pcm_uframes_t frames = size;
>>>
>>>
>>>
>>> Please will you accept this (workaround) bugfix? If so, I would send a
>>> proper patch.
>>
>> It looks OK, at least this must be safe.
>> So yes, I'll happily apply if you submit a proper patch.
> 
> It would be probably better to check against the boundary / 2 value to check
> correctly the boundary wrap instead to drop all negative size values:
> 
>    if (size < 0) {
>       if (size < -(spcm->boundary / 2))
>          size += spcm->boundary;
>       else
>          size = 0;
>    }

Is there a reliable way to detect the boundary wraparound, at best using 
some dedicated API? I could find any, IMO the wraparound does not create 
any notification. The check is OK for a rewind, half of boundary is 
usually a very large number too. I am not sure what would happen at 
reset when application pointer was already past the boundary half - see 
below.

> 
> The "hidden" pcm restart referred in the comment should not occur, otherwise
> it's another bug somewhere.

I do not know the exact moments when plugin API methods are called. The 
fact is Takashi's suggestion to call s16 reset explicitely in 
snd_pcm_meter_reset created this order:

snd_pcm_meter_reset -> s16->reset
s16_update: meter->now 22751, s16->old 22751, size 0
s16_update: meter->now 839, s16->old 22751, size -21912

I.e. AFTER resetting meter/s16 the variable meter->now was still at the 
original large 22751 (with s16->old equal to its value due to 
s16->reset). The value of meter->now was reset to 839 (= app pointer - 
delay) only in the next call of s16_update (when s16->old was still the 
previous old value => size < 0 => huge size => high CPU load).  From 
this I kind of conclude that the reset is buggy. Maybe the reset code 
should re-calculate meter->now = appl.pointer - delay before aligning 
s16->old = meter->now.

Nevertheless all this (except for the boundary wraparound) would result 
in the same size = 0, thus skipping samples from the last cycle, just 
like what the proposed patch does.


Pavel.





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-08-03 10:48               ` Pavel Hofman
@ 2020-08-09  7:05                 ` Pavel Hofman
  2020-08-09 20:29                   ` Jaroslav Kysela
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Hofman @ 2020-08-09  7:05 UTC (permalink / raw)
  To: Jaroslav Kysela, Takashi Iwai; +Cc: alsa-devel

Dne 03. 08. 20 v 12:48 Pavel Hofman napsal(a):
> 
> 
> Dne 03. 08. 20 v 9:22 Jaroslav Kysela napsal(a):
>> Dne 03. 08. 20 v 8:17 Takashi Iwai napsal(a):
>>> On Sun, 02 Aug 2020 19:50:44 +0200,
>>>>>
>>>>> Optionally the second case could be handled just like the first
>>>>> case by
>>>>> resetting s16->old, assuming the boundary wrap occurs very
>>>>> infrequently.
>>>>
>>>> The following patch is tested to work OK, no CPU peaks and no meter
>>>> output glitches when the size < 0 condition occurs:
>>>>
>>>> diff --git a/src/pcm/pcm_meter.c b/src/pcm/pcm_meter.c
>>>> index 20b41876..48df5945 100644
>>>> --- a/src/pcm/pcm_meter.c
>>>> +++ b/src/pcm/pcm_meter.c
>>>> @@ -1098,8 +1098,15 @@ static void s16_update(snd_pcm_scope_t *scope)
>>>>          snd_pcm_sframes_t size;
>>>>          snd_pcm_uframes_t offset;
>>>>          size = meter->now - s16->old;
>>>> -       if (size < 0)
>>>> -               size += spcm->boundary;
>>>> +       if (size < 0) {
>>>> +               /**
>>>> +                * Application pointer adjusted for delay (meter->now)
>>>> has dropped compared
>>>> +                * to the previous update cycle. Either spcm->boundary
>>>> wraparound, pcm rewinding,
>>>> +                * or pcm restart without s16->old properly reset.
>>>> +                * In any case the safest solution is skipping this
>>>> conversion cycle.
>>>> +                */
>>>> +               size = 0;
>>>> +       }
>>>>          offset = s16->old % meter->buf_size;
>>>>          while (size > 0) {
>>>>                  snd_pcm_uframes_t frames = size;
>>>>
>>>>
>>>>
>>>> Please will you accept this (workaround) bugfix? If so, I would send a
>>>> proper patch.
>>>
>>> It looks OK, at least this must be safe.
>>> So yes, I'll happily apply if you submit a proper patch.
>>
>> It would be probably better to check against the boundary / 2 value to
>> check
>> correctly the boundary wrap instead to drop all negative size values:
>>
>>    if (size < 0) {
>>       if (size < -(spcm->boundary / 2))
>>          size += spcm->boundary;
>>       else
>>          size = 0;
>>    }
> 
> Is there a reliable way to detect the boundary wraparound, at best using
> some dedicated API? I could find any, IMO the wraparound does not create
> any notification. The check is OK for a rewind, half of boundary is
> usually a very large number too. I am not sure what would happen at
> reset when application pointer was already past the boundary half - see
> below.
> 
>>
>> The "hidden" pcm restart referred in the comment should not occur,
>> otherwise
>> it's another bug somewhere.
> 
> I do not know the exact moments when plugin API methods are called. The
> fact is Takashi's suggestion to call s16 reset explicitely in
> snd_pcm_meter_reset created this order:
> 
> snd_pcm_meter_reset -> s16->reset
> s16_update: meter->now 22751, s16->old 22751, size 0
> s16_update: meter->now 839, s16->old 22751, size -21912
> 
> I.e. AFTER resetting meter/s16 the variable meter->now was still at the
> original large 22751 (with s16->old equal to its value due to
> s16->reset). The value of meter->now was reset to 839 (= app pointer -
> delay) only in the next call of s16_update (when s16->old was still the
> previous old value => size < 0 => huge size => high CPU load).  From
> this I kind of conclude that the reset is buggy. Maybe the reset code
> should re-calculate meter->now = appl.pointer - delay before aligning
> s16->old = meter->now.
> 
> Nevertheless all this (except for the boundary wraparound) would result
> in the same size = 0, thus skipping samples from the last cycle, just
> like what the proposed patch does.
> 
> 

Please can we reach a decision and close the problem so that affected
use cases do not have to be patched with the next the alsa-lib version?

Thanks a lot in advance,

Pavel.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-08-09  7:05                 ` Pavel Hofman
@ 2020-08-09 20:29                   ` Jaroslav Kysela
  2020-08-09 21:05                     ` Pavel Hofman
  0 siblings, 1 reply; 16+ messages in thread
From: Jaroslav Kysela @ 2020-08-09 20:29 UTC (permalink / raw)
  To: Pavel Hofman, Takashi Iwai; +Cc: alsa-devel

Dne 09. 08. 20 v 9:05 Pavel Hofman napsal(a):
> Dne 03. 08. 20 v 12:48 Pavel Hofman napsal(a):
>>
>>
>> Dne 03. 08. 20 v 9:22 Jaroslav Kysela napsal(a):
>>> Dne 03. 08. 20 v 8:17 Takashi Iwai napsal(a):
>>>> On Sun, 02 Aug 2020 19:50:44 +0200,
>>>>>>
>>>>>> Optionally the second case could be handled just like the first
>>>>>> case by
>>>>>> resetting s16->old, assuming the boundary wrap occurs very
>>>>>> infrequently.
>>>>>
>>>>> The following patch is tested to work OK, no CPU peaks and no meter
>>>>> output glitches when the size < 0 condition occurs:
>>>>>
>>>>> diff --git a/src/pcm/pcm_meter.c b/src/pcm/pcm_meter.c
>>>>> index 20b41876..48df5945 100644
>>>>> --- a/src/pcm/pcm_meter.c
>>>>> +++ b/src/pcm/pcm_meter.c
>>>>> @@ -1098,8 +1098,15 @@ static void s16_update(snd_pcm_scope_t *scope)
>>>>>          snd_pcm_sframes_t size;
>>>>>          snd_pcm_uframes_t offset;
>>>>>          size = meter->now - s16->old;
>>>>> -       if (size < 0)
>>>>> -               size += spcm->boundary;
>>>>> +       if (size < 0) {
>>>>> +               /**
>>>>> +                * Application pointer adjusted for delay (meter->now)
>>>>> has dropped compared
>>>>> +                * to the previous update cycle. Either spcm->boundary
>>>>> wraparound, pcm rewinding,
>>>>> +                * or pcm restart without s16->old properly reset.
>>>>> +                * In any case the safest solution is skipping this
>>>>> conversion cycle.
>>>>> +                */
>>>>> +               size = 0;
>>>>> +       }
>>>>>          offset = s16->old % meter->buf_size;
>>>>>          while (size > 0) {
>>>>>                  snd_pcm_uframes_t frames = size;
>>>>>
>>>>>
>>>>>
>>>>> Please will you accept this (workaround) bugfix? If so, I would send a
>>>>> proper patch.
>>>>
>>>> It looks OK, at least this must be safe.
>>>> So yes, I'll happily apply if you submit a proper patch.
>>>
>>> It would be probably better to check against the boundary / 2 value to
>>> check
>>> correctly the boundary wrap instead to drop all negative size values:
>>>
>>>    if (size < 0) {
>>>       if (size < -(spcm->boundary / 2))
>>>          size += spcm->boundary;
>>>       else
>>>          size = 0;
>>>    }
>>
>> Is there a reliable way to detect the boundary wraparound, at best using
>> some dedicated API? I could find any, IMO the wraparound does not create
>> any notification. The check is OK for a rewind, half of boundary is
>> usually a very large number too. I am not sure what would happen at
>> reset when application pointer was already past the boundary half - see
>> below.

Yes, it's a good argument. In this case, the s16->old value is not properly
synced during the reset operation, otherwise the boundary / 2 threshold
(change limit) is sufficient to detect the boundary wrap.

>>> The "hidden" pcm restart referred in the comment should not occur,
>>> otherwise
>>> it's another bug somewhere.
>>
>> I do not know the exact moments when plugin API methods are called. The
>> fact is Takashi's suggestion to call s16 reset explicitely in
>> snd_pcm_meter_reset created this order:
>>
>> snd_pcm_meter_reset -> s16->reset
>> s16_update: meter->now 22751, s16->old 22751, size 0
>> s16_update: meter->now 839, s16->old 22751, size -21912
>>
>> I.e. AFTER resetting meter/s16 the variable meter->now was still at the
>> original large 22751 (with s16->old equal to its value due to
>> s16->reset). The value of meter->now was reset to 839 (= app pointer -
>> delay) only in the next call of s16_update (when s16->old was still the
>> previous old value => size < 0 => huge size => high CPU load).  From
>> this I kind of conclude that the reset is buggy. Maybe the reset code
>> should re-calculate meter->now = appl.pointer - delay before aligning
>> s16->old = meter->now.
>>
>> Nevertheless all this (except for the boundary wraparound) would result
>> in the same size = 0, thus skipping samples from the last cycle, just
>> like what the proposed patch does.
>>
>>
> 
> Please can we reach a decision and close the problem so that affected
> use cases do not have to be patched with the next the alsa-lib version?

I think that this problem should be fixed for reset and rewind separately. The
meter->reset should be set in snd_pcm_meter_reset() inside the running_mutex
lock to serialize correctly the update operations in the
snd_pcm_meter_thread(). And perhaps, we can follow this logic for the rewind.

I mean, we should ensure to call the s16->reset at the proper time to avoid
broken old/now combinations inside the scope "clients".

Your proposed solution is just a workaround.

						Jaroslav

> 
> Thanks a lot in advance,
> 
> Pavel.
> 


-- 
Jaroslav Kysela <perex@perex.cz>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-08-09 20:29                   ` Jaroslav Kysela
@ 2020-08-09 21:05                     ` Pavel Hofman
  0 siblings, 0 replies; 16+ messages in thread
From: Pavel Hofman @ 2020-08-09 21:05 UTC (permalink / raw)
  To: Jaroslav Kysela, Takashi Iwai; +Cc: alsa-devel

Dne 09. 08. 20 v 22:29 Jaroslav Kysela napsal(a):
> Dne 09. 08. 20 v 9:05 Pavel Hofman napsal(a):
>> Dne 03. 08. 20 v 12:48 Pavel Hofman napsal(a):
>>>
>>>
>>> Dne 03. 08. 20 v 9:22 Jaroslav Kysela napsal(a):
>>>> Dne 03. 08. 20 v 8:17 Takashi Iwai napsal(a):
>>>>> On Sun, 02 Aug 2020 19:50:44 +0200,
>>>>>>>
>>>>>>> Optionally the second case could be handled just like the first
>>>>>>> case by
>>>>>>> resetting s16->old, assuming the boundary wrap occurs very
>>>>>>> infrequently.
>>>>>>
>>>>>> The following patch is tested to work OK, no CPU peaks and no meter
>>>>>> output glitches when the size < 0 condition occurs:
>>>>>>
>>>>>> diff --git a/src/pcm/pcm_meter.c b/src/pcm/pcm_meter.c
>>>>>> index 20b41876..48df5945 100644
>>>>>> --- a/src/pcm/pcm_meter.c
>>>>>> +++ b/src/pcm/pcm_meter.c
>>>>>> @@ -1098,8 +1098,15 @@ static void s16_update(snd_pcm_scope_t *scope)
>>>>>>           snd_pcm_sframes_t size;
>>>>>>           snd_pcm_uframes_t offset;
>>>>>>           size = meter->now - s16->old;
>>>>>> -       if (size < 0)
>>>>>> -               size += spcm->boundary;
>>>>>> +       if (size < 0) {
>>>>>> +               /**
>>>>>> +                * Application pointer adjusted for delay (meter->now)
>>>>>> has dropped compared
>>>>>> +                * to the previous update cycle. Either spcm->boundary
>>>>>> wraparound, pcm rewinding,
>>>>>> +                * or pcm restart without s16->old properly reset.
>>>>>> +                * In any case the safest solution is skipping this
>>>>>> conversion cycle.
>>>>>> +                */
>>>>>> +               size = 0;
>>>>>> +       }
>>>>>>           offset = s16->old % meter->buf_size;
>>>>>>           while (size > 0) {
>>>>>>                   snd_pcm_uframes_t frames = size;
>>>>>>
>>>>>>
>>>>>>
>>>>>> Please will you accept this (workaround) bugfix? If so, I would send a
>>>>>> proper patch.
>>>>>
>>>>> It looks OK, at least this must be safe.
>>>>> So yes, I'll happily apply if you submit a proper patch.
>>>>
>>>> It would be probably better to check against the boundary / 2 value to
>>>> check
>>>> correctly the boundary wrap instead to drop all negative size values:
>>>>
>>>>     if (size < 0) {
>>>>        if (size < -(spcm->boundary / 2))
>>>>           size += spcm->boundary;
>>>>        else
>>>>           size = 0;
>>>>     }
>>>
>>> Is there a reliable way to detect the boundary wraparound, at best using
>>> some dedicated API? I could find any, IMO the wraparound does not create
>>> any notification. The check is OK for a rewind, half of boundary is
>>> usually a very large number too. I am not sure what would happen at
>>> reset when application pointer was already past the boundary half - see
>>> below.
> 
> Yes, it's a good argument. In this case, the s16->old value is not properly
> synced during the reset operation, otherwise the boundary / 2 threshold
> (change limit) is sufficient to detect the boundary wrap.
> 
>>>> The "hidden" pcm restart referred in the comment should not occur,
>>>> otherwise
>>>> it's another bug somewhere.
>>>
>>> I do not know the exact moments when plugin API methods are called. The
>>> fact is Takashi's suggestion to call s16 reset explicitely in
>>> snd_pcm_meter_reset created this order:
>>>
>>> snd_pcm_meter_reset -> s16->reset
>>> s16_update: meter->now 22751, s16->old 22751, size 0
>>> s16_update: meter->now 839, s16->old 22751, size -21912
>>>
>>> I.e. AFTER resetting meter/s16 the variable meter->now was still at the
>>> original large 22751 (with s16->old equal to its value due to
>>> s16->reset). The value of meter->now was reset to 839 (= app pointer -
>>> delay) only in the next call of s16_update (when s16->old was still the
>>> previous old value => size < 0 => huge size => high CPU load).  From
>>> this I kind of conclude that the reset is buggy. Maybe the reset code
>>> should re-calculate meter->now = appl.pointer - delay before aligning
>>> s16->old = meter->now.
>>>
>>> Nevertheless all this (except for the boundary wraparound) would result
>>> in the same size = 0, thus skipping samples from the last cycle, just
>>> like what the proposed patch does.
>>>
>>>
>>
>> Please can we reach a decision and close the problem so that affected
>> use cases do not have to be patched with the next the alsa-lib version?
> 
> I think that this problem should be fixed for reset and rewind separately. The
> meter->reset should be set in snd_pcm_meter_reset() inside the running_mutex
> lock to serialize correctly the update operations in the
> snd_pcm_meter_thread(). And perhaps, we can follow this logic for the rewind.
> 
> I mean, we should ensure to call the s16->reset at the proper time to avoid
> broken old/now combinations inside the scope "clients".
> 
> Your proposed solution is just a workaround.

I am well aware of that. The main cause of the problem is that the 
existing code assumes that a drop in the meter->now value is caused by 
the pcm->boundary wraparound. Only for that particular case the existing 
  size += spcm->boundary code is correct, for all the other cases it is 
grossly wrong, locking the thread for many tens of seconds and jamming 
CPU. If there was a callback or some other way to signal the boundary 
wraparound that "dangerous" code would be called only for that special 
case (which is extremely rare in usual setups).

I do not know all cases when the meter->now can drop. Reset, rewind, any 
other (xrun)? If a single case is omitted, the same problem prevails.

No problem with resetting where appropriate, but still I would suggest 
to not keep size += spcm->boundary in the s16_update as is now.

Regards,

Pavel.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-10-13 17:35   ` Jaroslav Kysela
@ 2020-10-15  3:59     ` Go Peppy
  0 siblings, 0 replies; 16+ messages in thread
From: Go Peppy @ 2020-10-15  3:59 UTC (permalink / raw)
  To: Jaroslav Kysela; +Cc: Pavel Hofman, alsa-devel

Hi Jaroslav,

I've tested the patch. It works fine. I didn't experience the original
issue with high CPU utilization after applying the patch.
When is the next planned release?

Thanks!
PPP


On Tue, Oct 13, 2020 at 10:35 AM Jaroslav Kysela <perex@perex.cz> wrote:

> Dne 17. 09. 20 v 21:13 Pavel Hofman napsal(a):
> > Hi,
> >
> > Dne 15. 09. 20 v 5:40 Go Peppy napsal(a):
> >>
> >> Just to remind what the issue is - while switching from one web radio
> >> station to another there is a high CPU usage for about 20 seconds.
> Because
> >> of that high CPU consumption almost all other processes are blocked.
> >
> > 20 seconds is only due to 32bit kernel on RPi. On 64bit machines the
> > boundary value is usually many orders of magnitude larger, basically
> > making the method never finish.
>
> Could you try this patch?
>
>
> https://github.com/alsa-project/alsa-lib/commit/a6c8ac0c85ca1b16684a687c7000c73aa38b7776
>
> It should prevent the big copies which are definitely nonsense and this
> change
> does not hide the real problem.
>
>                                         Jaroslav
>
> >
> > With regards,
> >
> > Pavel.
> >
>
>
> --
> Jaroslav Kysela <perex@perex.cz>
> Linux Sound Maintainer; ALSA Project; Red Hat, Inc.
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-09-17 19:13 ` Pavel Hofman
@ 2020-10-13 17:35   ` Jaroslav Kysela
  2020-10-15  3:59     ` Go Peppy
  0 siblings, 1 reply; 16+ messages in thread
From: Jaroslav Kysela @ 2020-10-13 17:35 UTC (permalink / raw)
  To: Pavel Hofman, Go Peppy, alsa-devel

Dne 17. 09. 20 v 21:13 Pavel Hofman napsal(a):
> Hi,
> 
> Dne 15. 09. 20 v 5:40 Go Peppy napsal(a):
>>
>> Just to remind what the issue is - while switching from one web radio
>> station to another there is a high CPU usage for about 20 seconds. Because
>> of that high CPU consumption almost all other processes are blocked.
> 
> 20 seconds is only due to 32bit kernel on RPi. On 64bit machines the 
> boundary value is usually many orders of magnitude larger, basically 
> making the method never finish.

Could you try this patch?

https://github.com/alsa-project/alsa-lib/commit/a6c8ac0c85ca1b16684a687c7000c73aa38b7776

It should prevent the big copies which are definitely nonsense and this change
does not hide the real problem.

					Jaroslav

> 
> With regards,
> 
> Pavel.
> 


-- 
Jaroslav Kysela <perex@perex.cz>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pcm_meter.c issue at s16_update
  2020-09-15  3:40 Go Peppy
@ 2020-09-17 19:13 ` Pavel Hofman
  2020-10-13 17:35   ` Jaroslav Kysela
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Hofman @ 2020-09-17 19:13 UTC (permalink / raw)
  To: Go Peppy, alsa-devel

Hi,

Dne 15. 09. 20 v 5:40 Go Peppy napsal(a):
> 
> Just to remind what the issue is - while switching from one web radio
> station to another there is a high CPU usage for about 20 seconds. Because
> of that high CPU consumption almost all other processes are blocked.

20 seconds is only due to 32bit kernel on RPi. On 64bit machines the 
boundary value is usually many orders of magnitude larger, basically 
making the method never finish.

With regards,

Pavel.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* pcm_meter.c issue at s16_update
@ 2020-09-15  3:40 Go Peppy
  2020-09-17 19:13 ` Pavel Hofman
  0 siblings, 1 reply; 16+ messages in thread
From: Go Peppy @ 2020-09-15  3:40 UTC (permalink / raw)
  To: alsa-devel

Hi,

It would be great to have a fix/patch from Pavel in the main codebase. I've
tested his fix and it worked fine.

Just to remind what the issue is - while switching from one web radio
station to another there is a high CPU usage for about 20 seconds. Because
of that high CPU consumption almost all other processes are blocked. The
fix from Pavel helps to avoid this situation.

Without the ALSA patch I need to make a custom ALSA build every time I
release my software. This is very annoying.

Thanks in advance!

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-10-15  4:00 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-26 18:20 pcm_meter.c issue at s16_update Pavel Hofman
2020-07-28 16:46 ` Pavel Hofman
2020-07-28 17:04   ` Takashi Iwai
2020-07-28 18:04     ` Pavel Hofman
2020-07-28 18:54       ` Pavel Hofman
2020-08-02 17:50         ` Pavel Hofman
2020-08-03  6:17           ` Takashi Iwai
2020-08-03  7:22             ` Jaroslav Kysela
2020-08-03 10:48               ` Pavel Hofman
2020-08-09  7:05                 ` Pavel Hofman
2020-08-09 20:29                   ` Jaroslav Kysela
2020-08-09 21:05                     ` Pavel Hofman
2020-09-15  3:40 Go Peppy
2020-09-17 19:13 ` Pavel Hofman
2020-10-13 17:35   ` Jaroslav Kysela
2020-10-15  3:59     ` Go Peppy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).