* twl4030 latency update
@ 2014-03-20 11:13 Leonardo Gabrielli
  2014-03-20 13:35 ` Peter Ujfalusi
  0 siblings, 1 reply; 13+ messages in thread
From: Leonardo Gabrielli @ 2014-03-20 11:13 UTC (permalink / raw)
  To: peter.ujfalusi; +Cc: Edgar Berdahl, alsa-devel

Dear Peter,
I was investigating the high playback latency of the TWL4030 and stumbled
upon an old thread started by Edgar
http://mailman.alsa-project.org/pipermail/alsa-devel/2011-October/045173.html
where I read that this is related to the McBSP2 buffer length.
Recent kernels seem to have the same behavior (I have a Debian
BeagleBoard-xM with 3.13.3-armv7-x10).
Did you manage to find a fix for this problem? Would it be possible?

Regards

Leonardo

-- 

Dr. Leonardo Gabrielli, PhD student
A3Lab - Dept. Information Engineering
Università Politecnica delle Marche
via Brecce Bianche 12, 60131, Ancona, Italy
Skype: leonardo.gabrielli
Web: a3lab.dii.univpm.it/people/leonardo-gabrielli 
<http://a3lab.dii.univpm.it/people/leonardo-gabrielli>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: twl4030 latency update
  2014-03-20 11:13 twl4030 latency update Leonardo Gabrielli
@ 2014-03-20 13:35 ` Peter Ujfalusi
  2014-03-20 14:31   ` Leonardo Gabrielli
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Ujfalusi @ 2014-03-20 13:35 UTC (permalink / raw)
  To: Leonardo Gabrielli; +Cc: Edgar Berdahl, alsa-devel

Hi Leonardo,

On 03/20/2014 01:13 PM, Leonardo Gabrielli wrote:
> Dear Peter,
> I was investigating on TWL4030 high playback latency and stumbled in an old
> thread started by Edgar
> http://mailman.alsa-project.org/pipermail/alsa-devel/2011-October/045173.html
> where I read this is related to McBSP2 buffer length
> Recent kernels seems to have the same behavior (I have a debian beagleboardxM
> with 3.13.3-armv7-x10)
> Did you manage to get a fix to this problem? Would it be possible?

The 'misusing/configuring the McBSP, and sDMA' approach did not work :(
However, the mcbsp code has gone through quite a bit of change since then
concerning the McBSP FIFO/sDMA configuration.

If we have FIFO the sDMA is always in packet mode.
The default is to transfer one sample with sDMA per DMA request.
You can switch the McBSP to 'threshold' mode and set the maximum FIFO
threshold you want to use. The code will figure out the optimal FIFO/burst
size based on the period size and the max threshold you have set.
This is done via a sysfs file under the mcbsp, the file is dma_op_mode if I
recall correctly.
Playing with the max tx/rx threshold you might be able to get better latency.

-- 
Péter

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: twl4030 latency update
  2014-03-20 13:35 ` Peter Ujfalusi
@ 2014-03-20 14:31   ` Leonardo Gabrielli
  2014-03-21  7:08     ` Peter Ujfalusi
  0 siblings, 1 reply; 13+ messages in thread
From: Leonardo Gabrielli @ 2014-03-20 14:31 UTC (permalink / raw)
  To: Peter Ujfalusi; +Cc: Edgar Berdahl, alsa-devel

Dear Peter,
thanks, I'm not sure I understand all the details but after a fast find 
in my beagleboard /sys I found

./sys/devices/68000000.ocp/49026000.mcbsp/dma_op_mode
./sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode
./sys/devices/68000000.ocp/49024000.mcbsp/dma_op_mode
./sys/devices/68000000.ocp/48096000.mcbsp/dma_op_mode
./sys/devices/68000000.ocp/48074000.mcbsp/dma_op_mode

All of these are already set as threshold:
cat sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode
[element] threshold

It looks like the FIFOs may already have been shortened to reduce latency:
all of the thresholds are 112 except for one of the devices, which has:
cat sys/devices/68000000.ocp/49022000.mcbsp/max_rx_thres
1264
for both tx and rx. Maybe that's the ALSA playback samples queue.

In fact I find:
cat /proc/device-tree/ocp/mcbsp\@49022000/ti\,hwmods
mcbsp2mcbsp2_sidetone

So it's the McBSP2 which you mentioned.

Now, I'm not sure how to change the threshold, I guess I have to patch 
some kernel module and rebuild?


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: twl4030 latency update
  2014-03-20 14:31   ` Leonardo Gabrielli
@ 2014-03-21  7:08     ` Peter Ujfalusi
  2014-03-25 18:50       ` Leonardo Gabrielli
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Ujfalusi @ 2014-03-21  7:08 UTC (permalink / raw)
  To: Leonardo Gabrielli; +Cc: Edgar Berdahl, alsa-devel

On 03/20/2014 04:31 PM, Leonardo Gabrielli wrote:
> Dear Peter,
> thanks, I'm not sure I understand all the details but after a fast find in my
> beagleboard /sys I found
> 
> ./sys/devices/68000000.ocp/49026000.mcbsp/dma_op_mode
> ./sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode
> ./sys/devices/68000000.ocp/49024000.mcbsp/dma_op_mode
> ./sys/devices/68000000.ocp/48096000.mcbsp/dma_op_mode
> ./sys/devices/68000000.ocp/48074000.mcbsp/dma_op_mode
> 
> All of these are already set as threshold:
> cat sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode
> [element] threshold

It is in 'element' mode; the [] marks the selected mode.
You can change it:
echo threshold > /sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode
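
A minimal sketch for reading the active mode back, assuming the file always
marks the selected mode with square brackets as in the output quoted above.
selected_mode is a hypothetical helper name, not part of any existing tool:

```shell
#!/bin/sh
# Print the word enclosed in square brackets,
# e.g. "[element] threshold" -> "element".
# selected_mode is a hypothetical helper, introduced only for illustration.
selected_mode() {
    # $1: the contents of a dma_op_mode file
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Example with the string quoted in this thread:
selected_mode "[element] threshold"
```

On the board it could be called as
selected_mode "$(cat /sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode)".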

> Probably I found the FIFOs to be shortened in order to reduce latency: all of
> the thresholds are 112 besides one of the devices which has:
> cat sys/devices/68000000.ocp/49022000.mcbsp/max_rx_thres
> 1264
> for both tx and rx. Maybe that's the ALSA playback samples queue.

This is for the threshold mode. With this value you can set the maximum slots
you want to use in the McBSP FIFO.

To change it:
echo 320 > /sys/devices/68000000.ocp/49022000.mcbsp/max_tx_thres
echo 320 > /sys/devices/68000000.ocp/49022000.mcbsp/max_rx_thres

for example.

In the end, this value lets you limit the sDMA burst sizes and the McBSP
FIFO level. The threshold means: generate a DMA request when threshold number
of slots are free in the FIFO (playback), or when threshold amount of data is
available in the FIFO (capture).

> 
> In fact I find:
> cat /proc/device-tree/ocp/mcbsp\@49022000/ti\,hwmods
> mcbsp2mcbsp2_sidetone
> 
> So it's the McBSP2 which you mentioned.
> 
> Now, I'm not sure how to change the threshold, I guess I have to patch some
> kernel module and rebuild?

No, you do not need to recompile anything. See my previous comment.

-- 
Péter

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: twl4030 latency update
  2014-03-21  7:08     ` Peter Ujfalusi
@ 2014-03-25 18:50       ` Leonardo Gabrielli
  2014-03-26  8:26         ` Peter Ujfalusi
  0 siblings, 1 reply; 13+ messages in thread
From: Leonardo Gabrielli @ 2014-03-25 18:50 UTC (permalink / raw)
  To: Peter Ujfalusi; +Cc: alsa-devel

Thanks Peter,
I was able to test your suggestions today.
Unfortunately I didn't get any improvement in latency, but reducing the sDMA
FIFO threshold improved audio integrity (with some period + sample rate
combinations I get corrupted audio, maybe scrambled or empty frames).

TESTS:
1- with FIFO threshold 1264 for McBSP2
  running jackd -P62 -dalsa -dhw:0 -r $SRATE -p $PERIOD -n2 -s -S -i2 -o2
with the following combinations of $SRATE - $PERIOD:
22050 - 512 - AU=ko lat=/
22050 - 256 - AU=ok lat=54ms
22050 - 128 - AU=ko lat=/
22050 - 64 - AU=ok lat=34ms
32000 - 64 - AU=ok lat=23ms
44100 - 64 - AU=ok lat=17ms

2- with FIFO threshold 320 for McBSP2
the rest as above
22050 - 512 - AU=ko lat=/
22050 - 256 - AU=ok lat=54ms
22050 - 128 - AU=ok lat=38ms
22050 - 64 - AU=ok lat=40-60ms (changing for each invocation of jackd)
32000 - 64 - AU=ok lat=23ms
44100 - 64 - AU=ok lat=17ms

Outcome: maybe I got it wrong, I thought this would reduce the number of 
periods allocated by jack (they didn't change between the two tests) 
hence reduce latency.
The CPU is not overwhelmed even in the 64sample tests (good).

Also: after a reboot the threshold and dma_op_mode go back to their
defaults. Can I make the settings persistent, or do I need an upstart job to
echo the proper values into sysfs each time?

Cheers and thanks

Leonardo


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: twl4030 latency update
  2014-03-25 18:50       ` Leonardo Gabrielli
@ 2014-03-26  8:26         ` Peter Ujfalusi
  2014-03-26  8:41           ` Peter Ujfalusi
                             ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Peter Ujfalusi @ 2014-03-26  8:26 UTC (permalink / raw)
  To: Leonardo Gabrielli; +Cc: alsa-devel

Hi Leonardo,

On 03/25/2014 08:50 PM, Leonardo Gabrielli wrote:
> Thanks Peter,
> I've been able today to test following your suggestions.
> Unfortunately I didn't get any improvement on latency, but reducing sDMA FIFO
> threshold improved on audio integrity (with some period+samplerate
> combinations I have corrupted audio, maybe scrambled or empty frames).

Can you elaborate on the corrupted/scrambled audio? I just don't see how it
can happen. Can you get the /proc/asound/card0/pcm0p/sub0/hw_params when you
have the audio quality issue?

I can not run jackd on the board anymore (with linux-next at least):
FATAL: cannot locate cpu MHz in /proc/cpuinfo

but playback with aplay -v --period-size=512 (or 64) 2ch-left-since-22050.wav
seems to be fine for me (assuming AU=ko means that it is the corrupted one)

> TESTS:
> 1- with FIFO threshold 1264 for McBSP2
>  running jackd -P62 -dalsa -dhw:0 -r $SRATE -p $PERIOD -n2 -s -S -i2 -o2
> with the following combinations of $SRATE - $PERIOD:
> 22050 - 512 - AU=ko lat=/
> 22050 - 256 - AU=ok lat=54ms
> 22050 - 128 - AU=ko lat=/
> 22050 - 64 - AU=ok lat=34ms
> 32000 - 64 - AU=ok lat=23ms
> 44100 - 64 - AU=ok lat=17ms
> 
> 2- with FIFO threshold 320 for McBSP2
> the rest as above
> 22050 - 512 - AU=ko lat=/
> 22050 - 256 - AU=ok lat=54ms
> 22050 - 128 - AU=ok lat=38ms
> 22050 - 64 - AU=ok lat=40-60ms (changing for each invocation of jackd)
> 32000 - 64 - AU=ok lat=23ms
> 44100 - 64 - AU=ok lat=17ms
> 
> Outcome: maybe I got it wrong, I thought this would reduce the number of
> periods allocated by jack (they didn't change between the two tests) hence
> reduce latency.

The McBSP2 FIFO will always be there. Nothing can be done about that. The
size on McBSP2 is 1280 words -> 640 stereo samples, i.e. ~29ms at 22050,
14.5ms at 44100.

If you stay in element mode, this means it is guaranteed that the sample at
the DMA pointer will go out on the I2S line after about the times mentioned
above. This is the delay caused by the FIFO itself. Where the rest comes from
I'm not really sure.
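
The FIFO delay figures above can be reproduced with a little arithmetic. This
is only a sketch: fifo_delay_ms is a hypothetical helper name, and it assumes
the delay is simply (FIFO words / channels) / sample rate:

```shell
#!/bin/sh
# fifo_delay_ms WORDS CHANNELS RATE
# Time (in ms) a sample spends travelling through a full FIFO.
# Hypothetical helper; assumes one word per channel per sample.
fifo_delay_ms() {
    awk -v w="$1" -v c="$2" -v r="$3" \
        'BEGIN { printf "%.1f\n", (w / c) / r * 1000 }'
}

fifo_delay_ms 1280 2 22050   # prints 29.0
fifo_delay_ms 1280 2 44100   # prints 14.5
```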

Now if you are in threshold mode this changes a bit, but the FIFO will still
be there. At start the FIFO is going to be filled up with threshold-long
bursts. From there you will have a DMA burst of about threshold length every
time the FIFO has that amount of free slots in it.
In case of tx_threshold 1264 (632 samples) and 512 period size:
0. The actual threshold level will be 512 samples.
1. Copy of 512 samples (1 period) to the FIFO;
   ~128 (640 - 512) free slots are left in the FIFO.
2. Nothing happens until the FIFO level drops to 127 (we will have free space
for 512 samples).
3. Next 512-sample burst to the FIFO.
4. The FIFO will be full or close to full.
5. goto 2

When the period size is bigger than the desired threshold (set via sysfs) then
the code will figure out the best configuration for the actual threshold/DMA
burst.
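
One plausible reading of "the best configuration" is: pick the largest burst
that divides the period evenly and still fits under the configured maximum.
The sketch below implements only that assumption; it is not taken from the
driver source, and packet_words is a hypothetical name. Sizes are in words
(2 words per stereo sample):

```shell
#!/bin/sh
# packet_words PERIOD_WORDS MAX_THRES_WORDS
# Largest burst size (in words) that divides the period evenly and does
# not exceed the maximum threshold. Assumed selection rule, see lead-in.
packet_words() {
    period=$1
    max=$2
    if [ "$period" -le 0 ] || [ "$max" -le 0 ]; then
        echo "period and max must be positive" >&2
        return 1
    fi
    # Smallest divider that brings the burst down to <= max words.
    divider=$(( (period + max - 1) / max ))
    # Grow the divider until it splits the period without remainder.
    while [ $(( period % divider )) -ne 0 ]; do
        divider=$(( divider + 1 ))
    done
    echo $(( period / divider ))
}

packet_words 1024 1264   # 512-sample period, max 1264: prints 1024
packet_words 1024 320    # 512-sample period, max 320:  prints 256
```

The first call matches the example above (512 stereo samples = 1024 words);
the second result is only what this sketch yields, not a value confirmed in
this thread.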

The same principle applies to element mode, where the DMA bursts are set to
one sample. This means that at start you will have ~640 quick DMA bursts to
fill the FIFO up, and after that the next burst comes once per sample period
(1/rate).

You see, the FIFO is there to add delay in both cases; however, in threshold
mode you are not going to stress the system with constant DMA activity, you
only have bursts to fill the FIFO up.

> The CPU is not overwhelmed even in the 64sample tests (good).
> 
> Also: after a reboot the threshold and dma_op_mode get back to their defaults.
> Can I make it stable or do I need an upstart job to echo the proper values
> into the sysfs each time?

You need to change these after every boot, yes. The default is element mode.
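
A sketch of a small helper that could be run from a boot script (as root) to
reapply the settings. apply_mcbsp_settings is a hypothetical name; the sysfs
directory and the value 320 are the ones used earlier in this thread and may
differ on other boards or kernels:

```shell
#!/bin/sh
# apply_mcbsp_settings DIR TX_THRES RX_THRES
# Switch one McBSP instance to threshold mode and set its maximum
# tx/rx thresholds. Hypothetical helper, introduced for illustration.
apply_mcbsp_settings() {
    dir=$1
    if [ ! -w "$dir/dma_op_mode" ]; then
        echo "cannot write $dir/dma_op_mode" >&2
        return 1
    fi
    echo threshold > "$dir/dma_op_mode" &&
    echo "$2" > "$dir/max_tx_thres" &&
    echo "$3" > "$dir/max_rx_thres"
}

# Example call for McBSP2 on the board discussed here:
# apply_mcbsp_settings /sys/devices/68000000.ocp/49022000.mcbsp 320 320
```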

-- 
Péter

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: twl4030 latency update
  2014-03-26  8:26         ` Peter Ujfalusi
@ 2014-03-26  8:41           ` Peter Ujfalusi
  2014-03-26  9:35           ` Leonardo Gabrielli
  2014-03-26  9:45           ` Leonardo Gabrielli
  2 siblings, 0 replies; 13+ messages in thread
From: Peter Ujfalusi @ 2014-03-26  8:41 UTC (permalink / raw)
  To: Leonardo Gabrielli; +Cc: alsa-devel

On 03/26/2014 10:26 AM, Peter Ujfalusi wrote:
> Hi Leonardo,
> 
> On 03/25/2014 08:50 PM, Leonardo Gabrielli wrote:
>> Thanks Peter,
>> I've been able today to test following your suggestions.
>> Unfortunately I didn't get any improvement on latency, but reducing sDMA FIFO
>> threshold improved on audio integrity (with some period+samplerate
>> combinations I have corrupted audio, maybe scrambled or empty frames).
> 
> Can you elaborate on the corrupted/scrambled audio? I just don't see how it
> can happen. Can you get the /proc/asound/card0/pcm0p/sub0/hw_params when you
> have the audio quality issue?
> 
> I can not run jackd on the board anymore (with linux-next at least):
> FATAL: cannot locate cpu MHz in /proc/cpuinfo
> 
> but with aplay -v --period-size=512 or 64 2ch-left-since-22050.wav
> seams to be fine for me (assuming AU=ko means that it is the corrupted one)

Yeah, this is not correct: it means 512 or 64 periods and not the period sizes.
I would need the hw_params for the actual playback to be able to reproduce the
issue.

-- 
Péter

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: twl4030 latency update
  2014-03-26  8:26         ` Peter Ujfalusi
  2014-03-26  8:41           ` Peter Ujfalusi
@ 2014-03-26  9:35           ` Leonardo Gabrielli
  2014-03-26 12:28             ` Peter Ujfalusi
  2014-03-26  9:45           ` Leonardo Gabrielli
  2 siblings, 1 reply; 13+ messages in thread
From: Leonardo Gabrielli @ 2014-03-26  9:35 UTC (permalink / raw)
  To: Peter Ujfalusi; +Cc: alsa-devel


On 26/03/2014 09:26, Peter Ujfalusi wrote:
> Can you elaborate on the corrupted/scrambled audio? I just don't see how it
> can happen. Can you get the /proc/asound/card0/pcm0p/sub0/hw_params when you
> have the audio quality issue?
Hello,
Here you are:

cat /proc/asound/card0/pcm0p/sub0/hw_params
access: MMAP_INTERLEAVED
format: S16_LE
subformat: STD
channels: 2
rate: 22050 (22050/1)
period_size: 512
buffer_size: 1024

And this is jack output:

jackd -P62 -t2000 -dalsa -dhw:0 -r22050 -p512 -n2 -s -S -i2 -o2 &

jackd 0.124.1
Copyright 2001-2009 Paul Davis, Stephane Letz, Jack O'Quinn, Torben Hohn 
and others.
jackd comes with ABSOLUTELY NO WARRANTY
This is free software, and you are welcome to redistribute it
under certain conditions; see the file COPYING for details

JACK compiled with System V SHM support.
loading driver ..
apparent rate = 22050
creating alsa driver ... 
hw:0|hw:0|512|2|22050|2|2|nomon|swmeter|soft-mode|16bit
configuring for 22050Hz, period = 512 frames (23.2 ms), buffer = 2 periods
ALSA: final selected sample format for capture: 16bit little-endian
ALSA: use 2 periods for capture
ALSA: final selected sample format for playback: 16bit little-endian
ALSA: use 2 periods for playback

I can send you something to make it clearer. I recorded a 10Hz sine
wave with jaaa. The wave is totally scrambled (probably buffers are not
read in order). But it may well be an issue with jack.

When the output sounds correct (256 period) hw_params is:
cat /proc/asound/card0/pcm0p/sub0/hw_params
access: MMAP_INTERLEAVED
format: S16_LE
subformat: STD
channels: 2
rate: 22050 (22050/1)
period_size: 256
buffer_size: 768


> I can not run jackd on the board anymore (with linux-next at least):
> FATAL: cannot locate cpu MHz in /proc/cpuinfo

Yes, there's been a recent fix for that (you can check out the latest
jackd from the git repos); see this thread:
http://lists.jackaudio.org/private.cgi/jack-devel-jackaudio.org/2014-March/012167.html

Or maybe you can just start jackd specifying a different clock with the 
-c switch i.e.
jackd -P62 -t2000 -c s -dalsa -dhw:0 -r22050 -p512 -n2 -s -S -i2 -o2 &

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: twl4030 latency update
  2014-03-26  8:26         ` Peter Ujfalusi
  2014-03-26  8:41           ` Peter Ujfalusi
  2014-03-26  9:35           ` Leonardo Gabrielli
@ 2014-03-26  9:45           ` Leonardo Gabrielli
  2014-03-26 12:51             ` Peter Ujfalusi
  2 siblings, 1 reply; 13+ messages in thread
From: Leonardo Gabrielli @ 2014-03-26  9:45 UTC (permalink / raw)
  To: Peter Ujfalusi; +Cc: alsa-devel


On 26/03/2014 09:26, Peter Ujfalusi wrote:
> The McBSP2 FIFO will be always there. There's nothing can be done on that. The
> size on McBSP2 is 1280 words -> 640 stereo samples, ie ~29ms with 22050,
> 14.5ms with 44100.
>
> If you are staying in element mode this means that it is granted that the
> sample at the DMA pointer will out on the i2s line about the mentioned times.
> This is the delay caused by the FIFO itself. From where the rest is coming I'm
> not really sure.
BTW: I forgot to mention: the latency listed in my previous email is 
input+output (i.e. I record pulses from the beagleboard input jack and 
the delayed version to the beagleboard output jack). The twl4030 analog
and digital loopback features have of course been disabled, in order to
measure the total latency from A/D to D/A.

So, just to confirm that I understood the McBSP mechanism correctly: even
though I can transfer samples to/from DMA in bursts of <threshold>
length, each sample will always "travel along" the whole FIFO buffer
length (as if in a delay line), and thus will always see a
640-sample delay?

Would it be possible to workaround this, e.g. by putting 4-channel audio 
frames instead of stereo frames in the FIFO (with 2 channels unused), in 
order to fill up the FIFO more quickly and have less latency? Or is that
just crazy?

Cheers and thank you

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: twl4030 latency update
  2014-03-26  9:35           ` Leonardo Gabrielli
@ 2014-03-26 12:28             ` Peter Ujfalusi
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Ujfalusi @ 2014-03-26 12:28 UTC (permalink / raw)
  To: Leonardo Gabrielli; +Cc: alsa-devel

On 03/26/2014 11:35 AM, Leonardo Gabrielli wrote:
> 
> On 26/03/2014 09:26, Peter Ujfalusi wrote:
>> Can you elaborate on the corrupted/scrambled audio? I just don't see how it
>> can happen. Can you get the /proc/asound/card0/pcm0p/sub0/hw_params when you
>> have the audio quality issue?
> Hello,
> Here you are:
> 
> cat /proc/asound/card0/pcm0p/sub0/hw_params
> access: MMAP_INTERLEAVED
> format: S16_LE
> subformat: STD
> channels: 2
> rate: 22050 (22050/1)
> period_size: 512
> buffer_size: 1024
> 
> And this is jack output:
> 
> jackd -P62 -t2000 -dalsa -dhw:0 -r22050 -p512 -n2 -s -S -i2 -o2 &

arecord -r 22050 -f S16_LE --period-size=512 --buffer-size=1024 -v | aplay -r
22050 -f S16_LE  --period-size=512 --buffer-size=1024 -v

and no issue on the headphone from Beagle.

> I can send you saomething to get it clearer. I recorded  a 10Hz sine wave with
> jaaa. The wave is totally scrambled (probably buffers are not read in order).
> But it may well be an issue with jack.
> 
> When the output sounds correct (256 period) hw_param is:
> cat /proc/asound/card0/pcm0p/sub0/hw_params
> access: MMAP_INTERLEAVED
> format: S16_LE
> subformat: STD
> channels: 2
> rate: 22050 (22050/1)
> period_size: 256
> buffer_size: 768

arecord -r 22050 -f S16_LE -v --period-size=256 --buffer-size=768 | aplay -r
22050 -f S16_LE  --period-size=256 --buffer-size=768 -v

again, audio is clear with this one as well

>> I can not run jackd on the board anymore (with linux-next at least):
>> FATAL: cannot locate cpu MHz in /proc/cpuinfo
> 
> Yes, there's been a recent fix to that (you can checkout the latest jackd from
> git repos, see this thread:
> http://lists.jackaudio.org/private.cgi/jack-devel-jackaudio.org/2014-March/012167.html

I also found this, but I am too lazy to update my jack...

> Or maybe you can just start jackd specifying a different clock with the -c
> switch i.e.
> jackd -P62 -t2000 -c s -dalsa -dhw:0 -r22050 -p512 -n2 -s -S -i2 -o2 &

This does not work.

-- 
Péter

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: twl4030 latency update
  2014-03-26  9:45           ` Leonardo Gabrielli
@ 2014-03-26 12:51             ` Peter Ujfalusi
  2014-03-26 15:40               ` Leonardo Gabrielli
       [not found]               ` <533A89D2.5050906@univpm.it>
  0 siblings, 2 replies; 13+ messages in thread
From: Peter Ujfalusi @ 2014-03-26 12:51 UTC (permalink / raw)
  To: Leonardo Gabrielli; +Cc: alsa-devel

On 03/26/2014 11:45 AM, Leonardo Gabrielli wrote:
> 
> On 26/03/2014 09:26, Peter Ujfalusi wrote:
>> The McBSP2 FIFO will be always there. There's nothing can be done on that. The
>> size on McBSP2 is 1280 words -> 640 stereo samples, ie ~29ms with 22050,
>> 14.5ms with 44100.
>>
>> If you are staying in element mode this means that it is granted that the
>> sample at the DMA pointer will out on the i2s line about the mentioned times.
>> This is the delay caused by the FIFO itself. From where the rest is coming I'm
>> not really sure.

> BTW: I forgot to mention: the latency listed in my previous email is
> input+output (i.e. I record pulses from the beagleboard input jack and the
> delayed version to the beagleboard output jack). The twl4030 analog and
> digital loopback features have been of course disabled, in order to get the
> total latency due from A/D to D/A.

This means that the McBSP latency in the worst case is 1280 + the selected rx
threshold in words (so /2 in case of stereo). If you lower the rx threshold
you decrease the latency on the capture side. On the playback side
nothing can be done.
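
As a worked example of that worst-case figure, here is a sketch that converts
(1280 + rx threshold) words into milliseconds. mcbsp_worst_case_ms is a
hypothetical helper name; 1280 is the McBSP2 FIFO size quoted earlier in the
thread, and the result is an upper bound for the McBSP part only, not a
measured latency:

```shell
#!/bin/sh
# mcbsp_worst_case_ms RX_THRES_WORDS CHANNELS RATE
# Worst-case McBSP latency in ms: full playback FIFO (1280 words on
# McBSP2) plus the capture threshold, divided by the channel count.
# Hypothetical helper, introduced for illustration.
mcbsp_worst_case_ms() {
    awk -v t="$1" -v c="$2" -v r="$3" \
        'BEGIN { printf "%.1f\n", ((1280 + t) / c) / r * 1000 }'
}

mcbsp_worst_case_ms 320 2 44100   # prints 18.1
```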

> So just to get confirm I understood the McBSP mechanism well: even though I
> can transfer to/from DMA samples in bursts of <threshold> length, each sample
> will always "travel along" the whole FIFO buffer length, (as if in a delay
> line) and thus they will always have 640samples delay?

On the playback side this is pretty much true. On capture side the threshold
means that DMA will read from FIFO when threshold amount is available in it.

> Would it be possible to workaround this, e.g. by putting 4-channel audio
> frames instead of stereo frames in the FIFO (with 2 channels unused), in order
> to fill up the FIFO more quickly and have less latency? Or is it pure craze?

From the FIFO, McBSP takes data word by word. If you play stereo, you need to
have stereo data in the FIFO. You can not skip two words with McBSP.

The thing I tried for playback, which did not work AFAIR:
In general the idea was to configure DMA to send threshold/channel on every
request while configuring the McBSP threshold register to be 1280 - threshold.
In case of threshold 80 (40 stereo samples) it would play out like this:
- transfer 40 samples to the FIFO per DMA request
- assert the DMA request when we have space for 1260 words (630 samples). The
  number is just a guess, keeping 10 samples in the FIFO sounds safe enough
This would keep the FIFO fill between 10 and 50 samples.
But this does not work. I think McBSP is also counting the received words and
deasserts the DMA request based on this count and not on the FIFO level.
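
To make the intended numbers concrete, here is a small Python sketch of the
fill band this scheme was aiming for. It uses the 1260-word / 10-sample floor
figures from the example above; the function name is hypothetical. Note that
it only reproduces the arithmetic of the idea, which is reported above as not
working on the actual hardware.

```python
FIFO_WORDS = 1280  # McBSP2 FIFO size in words, as stated in this thread


def intended_fill_band(burst_frames, floor_frames, channels=2):
    """Return (free_space_words, min_fill_frames, max_fill_frames).

    free_space_words is the amount of free FIFO space at which the DMA
    request would be asserted so that floor_frames are still queued.
    """
    floor_words = floor_frames * channels
    free_space_words = FIFO_WORDS - floor_words
    return free_space_words, floor_frames, floor_frames + burst_frames


if __name__ == "__main__":
    free_words, lo, hi = intended_fill_band(burst_frames=40, floor_frames=10)
    print("assert DMA request at", free_words, "free words")
    print("FIFO fill between", lo, "and", hi, "stereo samples")
    # Upper bound of the band expressed as time at 44100 Hz
    print(round(1000.0 * hi / 44100, 2), "ms")
```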

Another thing which would be even more complicated is to play with the McBSP
threshold at runtime. With the same 40 samples:
DMA is to transfer 40 samples per DMA request.
start
1. McBSP threshold to 80
2. in dma interrupt callback McBSP threshold to 1260
3. in McBSP warning interrupt (that we will be reaching the threshold soon)
   back to 80
4. goto 2

If we could do the step between 3 and 4 within one sample time this might work
but as soon as you are late the thing will fail.
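
For a sense of how tight "one sample time" is, the following Python sketch
computes the deadline for the common rates. The helper names are hypothetical,
and the 1 GHz CPU clock is an assumption used only to express the deadline in
CPU cycles.

```python
def sample_period_us(rate_hz):
    """Duration of one sample (frame) in microseconds."""
    return 1e6 / rate_hz


def cycles_per_sample(rate_hz, cpu_hz=1_000_000_000):
    """CPU cycles available within one sample time (cpu_hz is an assumption)."""
    return cpu_hz / rate_hz


if __name__ == "__main__":
    for rate in (22050, 44100, 48000):
        print(rate, "Hz:", round(sample_period_us(rate), 2), "us,",
              int(cycles_per_sample(rate)), "cycles at 1 GHz")
```

At 44100 Hz the window is roughly 22.7 microseconds, which is the budget for
reprogramming the threshold between steps 3 and 4.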

I know this is working in realtime systems like in DSPs and non linux systems...

-- 
Péter

* Re: twl4030 latency update
  2014-03-26 12:51             ` Peter Ujfalusi
@ 2014-03-26 15:40               ` Leonardo Gabrielli
       [not found]               ` <533A89D2.5050906@univpm.it>
  1 sibling, 0 replies; 13+ messages in thread
From: Leonardo Gabrielli @ 2014-03-26 15:40 UTC (permalink / raw)
  To: Peter Ujfalusi; +Cc: Edgar Berdahl, alsa-devel

Peter,
thank you for your suggestions.
For the moment I will stick with the following settings, which provide 
satisfactory latency and CPU usage:

-r 44100, -p64, with sDMA threshold=320 as per your suggestions, and CPU 
governor set to performance (1GHz).
This way I get 16ms in-out latency with 15% CPU and no glitches or 
XRUNs (9% in jack, and 6% for the connection of the 
system:{capture|playback} ports).

uname -a
Linux debian-BB3 3.13.3-armv7-x10 #1 SMP Sat Feb 15 01:03:40 UTC 2014 
armv7l GNU/Linux

In case I need a lower latency I will try the USB soundcard 
suggested by Edgar.

Leonardo

On 26/03/2014 13:51, Peter Ujfalusi wrote:
> I know this is working in realtime systems like in DSPs and non linux systems...

* Re: twl4030 latency update
       [not found]               ` <533A89D2.5050906@univpm.it>
@ 2014-04-01 13:31                 ` Peter Ujfalusi
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Ujfalusi @ 2014-04-01 13:31 UTC (permalink / raw)
  To: Leonardo Gabrielli; +Cc: alsa-devel

Hi Leonardo,

On 04/01/2014 12:41 PM, Leonardo Gabrielli wrote:
> Dear Peter,
> I actually managed to nearly halve the latency with McBSP2 and jackd with a
> little trick: requesting 4-channel audio. Of course two channels will be zero,
> but the FIFO will fill up more quickly.

Yes, this is expected. The FIFO is slot (or word) based. If you have mono
audio, it is 1280 samples long; in stereo it is 640 samples; with 4 channels
it can hold up to 320 samples.

> Outcome:
> 4 channels, 44100, 64 input --> jack --> output takes 10.1ms latency, i.e.:
> - 2 periods for input: 128 L/R frames
> - 1280 FIFO words / 4 channels = 320 frames
> Without the trick the latency was 17.4ms :)
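
The breakdown quoted above can be checked with a short Python sketch. It
assumes, as in the quoted list, that the in-out latency is simply the input
periods plus a full playback FIFO; the function name is hypothetical.

```python
FIFO_WORDS = 1280  # McBSP2 FIFO size in words, as stated in this thread


def in_out_latency_ms(period_frames, input_periods, channels, rate_hz):
    """Input periods plus a full playback FIFO, converted to milliseconds."""
    fifo_frames = FIFO_WORDS // channels  # one word per channel per frame
    total_frames = period_frames * input_periods + fifo_frames
    return 1000.0 * total_frames / rate_hz


if __name__ == "__main__":
    four_ch = in_out_latency_ms(64, 2, channels=4, rate_hz=44100)
    stereo = in_out_latency_ms(64, 2, channels=2, rate_hz=44100)
    print("4 channels:", round(four_ch, 2), "ms")
    print("2 channels:", round(stereo, 2), "ms")
```

This gives about 10.16 ms for 4 channels and about 17.41 ms for stereo, in
line with the measured 10.1 ms and 17.4 ms.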

What you can also try is to reduce the max_rx_thres to let's say 10 samples
(20 or 40 depending on the number of channels you are using). Keep the
max_tx_thres as you have been using.
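
A quick Python sketch of the conversion mentioned above, from a threshold in
samples (frames) to FIFO words, together with the time the capture FIFO needs
to collect that many words. The helper names are hypothetical; how the value
is written to max_rx_thres is not shown here.

```python
def rx_threshold_words(frames, channels):
    """FIFO words for a threshold given in frames (one word per channel)."""
    return frames * channels


def capture_wait_ms(threshold_words, channels, rate_hz):
    """Time needed for the capture FIFO to accumulate threshold_words."""
    return 1000.0 * (threshold_words / channels) / rate_hz


if __name__ == "__main__":
    for ch in (2, 4):
        words = rx_threshold_words(10, ch)
        print(ch, "channels:", words, "words,",
              round(capture_wait_ms(words, ch, 44100), 3), "ms at 44100 Hz")
```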

> 
> CPU load has increased slightly in jackd, but of course any other jack client
> will stay the same, assuming that the 2 fake extra channels are unused.
> 
> In this experiment sample size = 16-bit. If I default to 32-bit nothing
> changes (so I assume the FIFO words are 32-bit wide and can contain either
> 16-bit zero-padded samples or 32-bit ones, which I remember from the AM37xx
> tech guide but I'm too lazy to check again ;) )

Yes it is like that.

> Glitches may happen when some regular user tasks (scp, curl, ...) request some
> cpu. I may try the new 3.14 kernel with SCHED_DEADLINE to see if jackd and
> clients can really be not preempted by user tasks...
> 
> Best
> 
> Leonardo
> 
> 
> 


-- 
Péter

end of thread, other threads:[~2014-04-01 13:31 UTC | newest]

Thread overview: 13+ messages
2014-03-20 11:13 twl4030 latency update Leonardo Gabrielli
2014-03-20 13:35 ` Peter Ujfalusi
2014-03-20 14:31   ` Leonardo Gabrielli
2014-03-21  7:08     ` Peter Ujfalusi
2014-03-25 18:50       ` Leonardo Gabrielli
2014-03-26  8:26         ` Peter Ujfalusi
2014-03-26  8:41           ` Peter Ujfalusi
2014-03-26  9:35           ` Leonardo Gabrielli
2014-03-26 12:28             ` Peter Ujfalusi
2014-03-26  9:45           ` Leonardo Gabrielli
2014-03-26 12:51             ` Peter Ujfalusi
2014-03-26 15:40               ` Leonardo Gabrielli
     [not found]               ` <533A89D2.5050906@univpm.it>
2014-04-01 13:31                 ` Peter Ujfalusi
