* How to pair Wine with ALSA? part2: buffer&period size / blocking / mmap (long)
@ 2011-08-12 12:24 Joerg-Cyril.Hoehle
  2011-08-14 12:23 ` Clemens Ladisch
  0 siblings, 1 reply; 4+ messages in thread
From: Joerg-Cyril.Hoehle @ 2011-08-12 12:24 UTC (permalink / raw)
  To: alsa-devel

Dear ALSA developers,

[Please refer to part1:intro & underruns for the introductory text]
http://mailman.alsa-project.org/pipermail/alsa-devel/2011-August/042746.html

Topic T2 period and buffer size and time

Wine apps using the old winmm API cannot tell at waveOutOpen time
"give me a large buffer" or "give me fast reaction to explosions".

R6 A good compromise is needed for when Wine opens ALSA on behalf of winmm.

Years later, the mmdevapi introduced with Vista provides "period"
and duration parameters.  It also implements sort of a
dmix device: it appears to mix at a fixed rate (48000 or 44100
samples/sec, user settable) and to mix data in packets of 10ms.
Incidentally, that's why MS claims 10ms latency.
For compatibility, Wine should match that rate -- at least when
accessing the "default" device.

Does that really translate to set_period_time?  I doubt it.
I wonder why Wine up to now insists on particular ALSA buffer and
period sizes.  I tend to consider it's none of its business.

It might make sense for Wine to use a periodic timer that calls
snd_pcm_write (see T3 below).  *That* timer should be set to
mmdevapi's 10ms.
Doesn't that translate to set_period_time_*max*(10ms)?
However if ALSA wants/requires 50ms, why not?

Furthermore, if the app wants to use large buffers, why insist on a
tiny period on the ALSA side?  (MS appears to stick to
10ms packets regardless of the app's requested buffer size).

I expect that setting Wine's timer period to at least ALSA's allows it
to actually find room for new data each turn.

So here's what makes sense to me:
if (shared_mode && device=="default")
   set_period_time_near(10ms);
else if (exclusive_mode)
   /* mmdevapi:Initialize receives period from the app */
   set_period_time_near(clamp(3ms, app's wish in mmdevapi:Initialize, 100ms));
   /* 100ms is arbitrary, why not 1s? */
snd_pcm_hw_params()
wine_set_timer_rate(clamp(3ms, snd_pcm_get_period_time(), 0.5s))
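
A minimal C sketch of the selection logic above, for illustration only.
The helper names (clamp_us, requested_period_us, wine_timer_period_us)
are hypothetical, and the sketch assumes a stream that is not shared is
exclusive.  The value it returns would be passed to something like
snd_pcm_hw_params_set_period_time_near before snd_pcm_hw_params(); the
timer period is derived from whatever ALSA actually granted.

```c
#include <assert.h>

/* Clamp v into [lo, hi]; all values are in microseconds. */
static unsigned int clamp_us(unsigned int lo, unsigned int v, unsigned int hi)
{
    assert(lo <= hi);
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/*
 * Period to request from ALSA before snd_pcm_hw_params().
 * Returns 0 when no particular period should be requested
 * (shared mode on a device other than "default").
 * Assumption: a stream that is not shared is exclusive.
 */
static unsigned int requested_period_us(int shared_mode, int is_default_device,
                                        unsigned int app_period_us)
{
    if (shared_mode)
        return is_default_device ? 10000u : 0u;   /* mmdevapi's 10ms */
    /* exclusive mode: honour the app's wish within 3ms..100ms */
    return clamp_us(3000u, app_period_us, 100000u);
}

/* Timer period derived from the period ALSA actually granted. */
static unsigned int wine_timer_period_us(unsigned int granted_period_us)
{
    return clamp_us(3000u, granted_period_us, 500000u);
}
```

For example, an exclusive-mode app asking for 1ms would end up with a
3ms request, and a granted period of 50ms would be used unchanged as
the timer period.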


Topic T2.b Duration / buffer size

mmdevapi's Initialize method receives a duration parameter as a hint
towards either small latency or large buffering.  One would think that
it makes perfect sense to forward that to snd_pcm_set_buffer_time.

*However*, mmdevapi also requires handing out a pointer to a buffer
that large (GetBuffer).  Thus Wine must maintain a buffer that large
(possibly even two of them, for reasons not relevant here).
Now should ALSA really keep yet another buffer that large?
Isn't that precisely why people have gripes with PA's 2s buffer?

I'm wondering whether Wine should solely rely on its periodic timer to
regularly submit data and e.g. ask ALSA to use a buffer 3 times the
period?
30ms (3x10ms) seems to play a role in MS systems.
Dmix seems to prefer 4 to 6 times period size.

Unfortunately, snd_pcm_get_period_time may be known only after
invoking snd_pcm_hw_params(), so I can't express:
set_period_time_near(10ms);
set_buffer_time_near(3 x actual_period);

Prefer
set_buffer_time_near(3 x period_above); or
simply not call set_buffer at all?


Finally, there's that snd_pcm_sw_params_set_avail_min
whose purpose I cannot figure out.  Should Wine call
snd_pcm_sw_params_set_avail_min(1); or
snd_pcm_sw_params_set_avail_min(0);
or not at all?


Topic T3 blocking or not

Wine has traditionally used ALSA in non-blocking mode, which ALSA
people recommended against (still?).  Now suppose every write is
preceded by avail_update, for reasons I gave in part1: underruns.

Remember my example from part 1, slightly refined:
if (snd_pcm_avail_update(&avail) > buffer_size)
   snd_pcm_reset() /* skip over late samples */
   /* should be equivalent to snd_pcm_forward(avail) */
written = snd_pcm_write(min(avail,frames));

My understanding of the semantics is that write(<avail) will not block,
thus Wine could as well dispose of SND_PCM_NONBLOCK.

In the NONBLOCK case, this could be simplified to
written = snd_pcm_write(frames);
since I discovered that ALSA can write a little more than what avail
returns, and NONBLOCK implies it'll not wait in an attempt to write a
2s data buffer, but return written < frames instead.
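
The bounded write from the part 1 example can be sketched in plain C as
below.  This only models the arithmetic and makes no ALSA calls;
frames_to_write is a hypothetical helper, and the sketch assumes that
after the reset the whole buffer counts as free.

```c
#include <assert.h>

/*
 * Decide how many frames to hand to the write call this turn.
 *   avail        free frames, as reported by snd_pcm_avail_update()
 *   buffer_size  size of the ALSA buffer in frames
 *   pending      frames Wine has queued for playback
 * *late is set to 1 when avail exceeds buffer_size, the condition the
 * example treats as "samples are late" and answers with a reset.
 */
static unsigned long frames_to_write(unsigned long avail,
                                     unsigned long buffer_size,
                                     unsigned long pending,
                                     int *late)
{
    assert(late != 0);
    *late = avail > buffer_size;
    if (*late)
        avail = buffer_size;   /* assumption: the buffer is empty after the reset */
    return pending < avail ? pending : avail;   /* min(avail, frames) */
}
```

Writing no more than this many frames is what keeps a blocking write
from actually blocking, under the semantics assumed above.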

Actually, I've a slightly more elaborate sequence in mind.  I've read
that snd_pcm_open() may be delayed because of networking issues, which
I want to avoid in the audio thread.  Therefore I'm considering using:

snd_pcm_open(SND_PCM_NONBLOCK);
... setup hw&sw_params
snd_pcm_prepare()
snd_pcm_nonblock(0); /* or should it be called before prepare? */

I.e. allow blocking while playing, which does not actually happen
thanks to avail_update() and a push model based on periodic timers to
feed data with pcm_write().

Note that so far, I've not questioned whether a timer-based push model
actually makes sense for Wine with ALSA...


Topic T4 mmap

After I re-read the mmdevapi documentation, I believe that mmdevapi's
GetBuffer/ReleaseBuffer rendering protocol may be compatible with
ALSA's snd_pcm_mmap_begin+commit after all.  It all depends on whether
ALSA grants requests for sizes as large as buffer_size, with
buffer_size up to 2 seconds * samples/sec.

Hence, as an optimization, one could imagine a driver using mmap, like
Winealsa used to have prior to the 2011 rewrite.  Yet it always had
the fallback without mmap, so let's concentrate on the non-mmap case
initially and get that right.

Thank you for your help and for reading up to this point,
      Jörg Höhle


* Re: How to pair Wine with ALSA? part2: buffer&period size / blocking / mmap (long)
  2011-08-12 12:24 How to pair Wine with ALSA? part2: buffer&period size / blocking / mmap (long) Joerg-Cyril.Hoehle
@ 2011-08-14 12:23 ` Clemens Ladisch
  2011-08-15 15:57   ` Joerg-Cyril.Hoehle
  0 siblings, 1 reply; 4+ messages in thread
From: Clemens Ladisch @ 2011-08-14 12:23 UTC (permalink / raw)
  To: Joerg-Cyril.Hoehle; +Cc: alsa-devel

Joerg-Cyril.Hoehle@t-systems.com wrote:
 > Topic T2 period and buffer size and time
 >
 > Wine apps using the old winmm API cannot tell at waveOutOpen time
 > "give me a large buffer" or "give me fast reaction to explosions".

AFAIK the WinMM API allows varying the number and size of submitted
buffers arbitrarily even while the stream is running.  This was designed
for hardware that is reprogrammed after each buffer anyway (ISA DMA) or
that allows dynamic buffers (e.g. ICH AC'97, which was designed for
WinMM).

 > R6 A good compromise is needed for when Wine opens ALSA on behalf of winmm.

Let's assume that ALSA is configured for a certain buffer size.  If the
application does not submit as much data, the ALSA buffer is never full.
(This is not a problem, except that the time until an xrun happens is,
of course, shorter.  PulseAudio does exactly this if it wants to
decrease latency dynamically.)  OTOH, if the application submits more
data than fits into the buffer, Wine must write the remaining data when
some space has become available.

This suggests using a buffer as big as possible (for the hardware).

 > Years later, the mmdevapi introduced with Vista provides "period"
 > and duration parameters.  It also implements sort of a
 > dmix device: it appears to mix at a fixed rate (48000 or 44100
 > samples/sec, user settable) and to mix data in packets of 10ms.
 > Incidentally, that's why MS claims 10ms latency.
 > For compatibility, Wine should match that rate -- at least when
 > accessing the "default" device.
 >
 > Does that really translate to set_period_time?  I doubt it.
 > [...]
 > I expect that setting Wine's timer period to at least ALSA's allows it
 > to actually find room for new data each turn.

The meaning of ALSA's periods is as follows:
1) The hardware is configured to generate an interrupt every period_size
    samples.  (Please note that the buffer size is not necessarily
    an integer multiple of that.)
2) When ALSA is blocked (in snd_pcm_write* or in poll), it checks
    whether to wake up the application only when an interrupt arrives.

 > [...] Finally, there's that snd_pcm_sw_params_set_avail_min
 > whose purpose I cannot figure out.  Should Wine call
 > snd_pcm_sw_params_set_avail_min(1); or
 > snd_pcm_sw_params_set_avail_min(0);
 > or not at all?

This is an additional restriction on when to wake up.

The device is considered 'ready' (i.e., the application is to be woken
up so that new data can be written) if the number of available (free)
samples in the buffer is at least avail_min.  avail_min=0 does not make
sense.
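
As an illustration of the two rules above (wake-ups are considered only
at period interrupts, and only once avail_min frames are free), here is
a small model in plain C.  It is a sketch under simplifying
assumptions: playback is steady, each interrupt frees exactly
period_size frames, and the function names are hypothetical.

```c
#include <assert.h>

/* The device counts as ready once at least avail_min frames are free. */
static int pcm_ready(unsigned long avail, unsigned long avail_min)
{
    return avail >= avail_min;
}

/*
 * Number of period interrupts a blocked writer sleeps through before it
 * is woken, assuming each interrupt frees period_size frames and that
 * readiness is re-checked only when an interrupt arrives.
 */
static unsigned long interrupts_until_wakeup(unsigned long avail,
                                             unsigned long avail_min,
                                             unsigned long period_size)
{
    unsigned long n = 0;
    assert(period_size > 0);
    while (!pcm_ready(avail, avail_min)) {
        avail += period_size;   /* one more period has been played */
        n++;
    }
    return n;
}
```

With 480-frame periods, 100 free frames and avail_min = 1000, the
model wakes the writer at the second interrupt.  Note that pcm_ready
is trivially true for avail_min = 0, so that value adds no restriction
in this model.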

 > Topic T2.b Duration / buffer size
 >
 > mmdevapi's Initialize method receives a duration parameter as a hint
 > towards either small latency or large buffering.  One would think that
 > it makes perfect sense to forward that to snd_pcm_set_buffer_time.
 >
 > *However*, mmdevapi also requires handing out a pointer to a buffer
 > that large (GetBuffer).  Thus Wine must maintain a buffer that large
 > (possibly even two of them, for reasons not relevant here).
 > Now should ALSA really keep yet another buffer that large?

Can't you hand out a pointer to ALSA's buffer?

 > I'm wondering whether Wine should solely rely on its periodic timer to
 > regularly submit data and e.g. ask ALSA to use a buffer 3 times the
 > period?
 > 30ms (3x10ms) seems to play a role in MS systems.
 > Dmix seems to prefer 4 to 6 times period size.

Now you are trying to do what PulseAudio does.
Why not simply use PA instead of ALSA?

 > Unfortunately, snd_pcm_get_period_time may be known only after
 > invoking snd_pcm_hw_params(),

Indeed.

 > so I can't express:
 > set_period_time_near(10ms);
 > set_buffer_time_near(3 x actual_period);

Yes you can:
   set_period_time_near(10ms);
   set_periods_near(3);
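
To put numbers on this, the following plain C sketch converts a period
time and a period count into frames.  It only does the arithmetic; the
values ALSA actually grants may differ from the request, and as noted
earlier the buffer size is not necessarily an integer multiple of the
period.  The function names are hypothetical.

```c
#include <assert.h>

/* Frames in one period of period_us microseconds at rate_hz. */
static unsigned long period_frames(unsigned int rate_hz, unsigned int period_us)
{
    assert(rate_hz > 0);
    /* widen before multiplying so the product cannot overflow 32 bits */
    return (unsigned long)(((unsigned long long)rate_hz * period_us) / 1000000ull);
}

/* Buffer size in frames for a given number of whole periods. */
static unsigned long buffer_frames(unsigned int rate_hz, unsigned int period_us,
                                   unsigned int periods)
{
    assert(periods > 0);
    return period_frames(rate_hz, period_us) * periods;
}
```

At 48000 samples/sec a 10ms period is 480 frames and three periods
are 1440 frames (30ms); at 44100 samples/sec the figures are 441 and
1323 frames.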

 > Topic T3 blocking or not
 >
 > Wine has traditionally used ALSA in non-blocking mode, which ALSA
 > people recommended against (still?).

Non-blocking mode is perfectly fine if you're using poll() to wait for
other events at the same time.

 > I've read that snd_pcm_open() may be delayed because of networking
 > issues, which I want to avoid in the audio thread.

This has nothing to do with networking; snd_pcm_open without NONBLOCK
just waits for the device to be closed.  This behaviour is there for
historical reasons; in practice, you always want NONBLOCK.

 > Therefore I'm considering using:
 >
 > snd_pcm_open(SND_PCM_NONBLOCK);
 > ... setup hw&sw_params
 > snd_pcm_prepare()
 > snd_pcm_nonblock(0); /* or should it be called before prepare? */

You can call nonblock(0) immediately after open.
But if your code never actually blocks, why bother to set it?

 > Topic T4 mmap
 > [...]
 > Hence, as an optimization, one could imagine a driver using mmap,

What exactly gets optimized with mmap?  Please note that snd_pcm_write*
copies the data from the supplied buffer into ALSA's buffer; if your
code does the same, it is not the slightest bit faster.


Regards,
Clemens


* How to pair Wine with ALSA? part2: buffer&period size / blocking / mmap (long)
  2011-08-14 12:23 ` Clemens Ladisch
@ 2011-08-15 15:57   ` Joerg-Cyril.Hoehle
  2011-08-16  7:16     ` Clemens Ladisch
  0 siblings, 1 reply; 4+ messages in thread
From: Joerg-Cyril.Hoehle @ 2011-08-15 15:57 UTC (permalink / raw)
  To: alsa-devel; +Cc: clemens

Hi,

[I've reordered some paragraphs]

Clemens Ladisch wrote:
>AFAIK the WinMM API allows varying the number and size of submitted
>buffers arbitrarily even while the stream is running.
The API is simply like pcm_write: "here are N frames at address X".
The app allocates and owns the buffer memory at address X.

>This was designed for hardware that is reprogrammed after each buffer
>anyway (ISA DMA) or that allows dynamic buffers (e.g. ICH AC'97, which
>was designed for WinMM).
You mean that HW can be told: "play N1 frames at address X1, to be
followed without glitch with N2 frames at address X2"?  And
afterwards, "after you'll be done with X2, play N3 at address X3"?

This is very interesting.  All I knew about were circular buffers.
Hence in my mind, every API where the app supplies the data pointer
would require copying from the app buffers into the circular HW one.

>Can't you hand out a pointer to ALSA's buffer?
I am not aware of any way outside snd_pcm_mmap_begin to obtain a
pointer from ALSA.  Perhaps that's the base of my misunderstandings?

>What exactly gets optimized with mmap?  Please note that snd_pcm_write*
>copies the data from the supplied buffer into ALSA's buffer; if your
>code does the same, it is not the slightest bit faster.
I'm not sure I understand what you mean.

The mmdevapi is unlike WinMM: GetBuffer yields a buffer pointer from
the OS and has some timing restrictions (unclear to me, MSDN talks
about "buffer-processing periods").  Not unlike snd_pcm_mmap_begin.
This pointer could be the audio HW's write_ptr.

Hence in theory, as the app asks mmdevapi for a buffer and fills it,
it can be played by the OS/HW (following ReleaseBuffer, sort of
snd_pcm_mmap_commit) without additional copying.  Thus, mmdevapi is
not in the way of such optimizations with either HW ring-buffer or
the dynamic buffers you mention, whereas WinMM cannot avoid copying
with ring-buffers.

What's expected to happen in Wine without mmap is:
1. app copies/writes data into Wine-managed GetBuffer pool
1.b because of ring-buffer management, there's even a little
    more copying in case of wrap-around.
2. ALSA pcm_write* copies from Wine's pool into HW buffer.

The optimization is:
0. app gets pointer from Wine's GetBuffer, which
   in turn gets it from ALSA's snd_pcm_mmap_begin.
1. app copies/writes data into ALSA's HW buffer.

The current state is even worse w.r.t. WinMM:
1. app copies/writes data into app buffers for use by WinMM.
2. Wine's WinMM copies data into Wine's mmdevapi buffer via GetBuffer.
3. pcm_write* copies data into ALSA's HW buffer.

Well, that's a lot of text about a situation that's becoming less and
less likely.  Desktop reality is: all apps (incl. video) use a mixer
(dmix/PA/upmix) and data still needs to be mixed: no app writes into
the audio HW ringbuffer.  I'm not sure PA or dmix support mmap.


>   set_periods_near(3) / avail_min / periods explanations;
Thank you very much.  The explanation is very welcome as the ALSA doc
did not make it all clear to me.


>This suggests using a buffer as big as possible (for the hardware).

I'm a little reluctant about that since I've myself experienced some
apps in Wine that exhibited extreme loss of sync between audio and
video when PA was in the queue instead of dmix.  I don't know what the
reason is but for sure I will test those again as I make progress.

I have no idea what one of those apps used for synchronisation.
mmdevapi did not exist at the time the app was written.
 - Did it use dsound?  I know next to nothing about that other API.
 - Did it use WinMM:waveOutGetPosition?
 - How is waveOutGetPosition defined in the presence of huge
   latencies, i.e. is it implicitly based on a ~0 latency assumption?
 - When are buffers submitted to waveOutWrite returned to the app?
   a) After the front-end processed them (sent them to the next stage)?
   b) After the back-end (speaker) played the last sample in it?
   The difference matters only as non-zero latencies are introduced into
   the audio chain, i.e. with networking, USB or simply PA's 2s buffers.
That's about WinMM.
mmdevapi defines its API much more precisely -- lesson learned.

Note that I don't think that b) can be retro-fitted into a system with
huge buffering (e.g. network, USB or PA buffers), because numerous
apps simply work like this: allocate 3 buffers of 1/3 second worth of
samples each and play them in turn.  With 5s latency that breaks.
Hence Wine may need to implement a combination of both:
   c) After the front end processed them *and* they would have been
      played by a zero-latency system, e.g. HW ring-buffers.
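
Policy c) can be sketched as "the later of two times".  The plain C
below is only an illustration under strong assumptions: all buffers
have the same duration, they are played back to back from stream
start, and times are in milliseconds.  The function names are
hypothetical.

```c
#include <assert.h>

/*
 * Time (ms since stream start) at which buffer number idx would have
 * finished playing on an ideal zero-latency device, for equally sized
 * buffers of buf_ms milliseconds each, played back to back.
 */
static unsigned long ideal_end_ms(unsigned int idx, unsigned long buf_ms)
{
    assert(buf_ms > 0);
    return ((unsigned long)idx + 1) * buf_ms;
}

/* Policy c): return the buffer only once both conditions hold. */
static unsigned long return_time_ms(unsigned long front_end_done_ms,
                                    unsigned int idx, unsigned long buf_ms)
{
    unsigned long ideal = ideal_end_ms(idx, buf_ms);
    return front_end_done_ms > ideal ? front_end_done_ms : ideal;
}
```

For the typical app with three buffers of 1/3 second each, a first
buffer that the front end consumed after 5ms would still be held back
until about 333ms, as if a HW ring-buffer had played it.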


>You can call nonblock(0) immediately after open.
>But if your code never actually blocks, why bother to set it?

Somebody reported and I've verified that snd_pcm_drain would always
fail (with -11, i.e. -EAGAIN, IIRC) in non-blocking mode.  That is
understandable in hindsight, since the API says "wait for ...", which
contradicts non-blocking mode.  Only snd_pcm_drop works in non-blocking mode.


>Now you are trying to do what PulseAudio does.
No surprise. Every audio framework does something similar.

>Why not simply use PA instead of ALSA?
It is my understanding that the Wine project will maintain at least 4
drivers eventually, including PA.  However, for the time being, it
starts with 3: ALSA/OSS/MacOS' CoreAudio.  Only when these work
well enough (and the dynamic constraints of mmdevapi are
understood well enough) will another one be added, AFAIK.
We are not there yet.

Thank you very much for your help and explanations,
	Jörg Höhle


* Re: How to pair Wine with ALSA? part2: buffer&period size / blocking / mmap (long)
  2011-08-15 15:57   ` Joerg-Cyril.Hoehle
@ 2011-08-16  7:16     ` Clemens Ladisch
  0 siblings, 0 replies; 4+ messages in thread
From: Clemens Ladisch @ 2011-08-16  7:16 UTC (permalink / raw)
  To: Joerg-Cyril.Hoehle; +Cc: alsa-devel

Joerg-Cyril.Hoehle@t-systems.com wrote:
> Clemens Ladisch wrote:
> >This was designed for hardware that is reprogrammed after each buffer
> >anyway (ISA DMA) or that allows dynamic buffers (e.g. ICH AC'97, which
> >was designed for WinMM).
> 
> You mean that HW can be told: "play N1 frames at address X1, to be
> followed without glitch with N2 frames at address X2"?  And
> afterwards, "after you'll be done with X2, play N3 at address X3"?

Yes.

> >Can't you hand out a pointer to ALSA's buffer?
> 
> I am not aware of any way outside snd_pcm_mmap_begin to obtain a
> pointer from ALSA.

Well, you could hand out that pointer.

>>What exactly gets optimized with mmap?  Please note that snd_pcm_write*
>>copies the data from the supplied buffer into ALSA's buffer; if your
>>code does the same, it is not the slightest bit faster.
> 
> I'm not sure I understand what you mean.

In practice, almost every program ends up with a function like this:

my_pcm_write(buffer, count)
{
	snd_pcm_mmap_begin();
	memcpy(mmap_buffer, buffer, count);
	/* and handle wraparound */
	snd_pcm_mmap_commit();
}

This would be no optimization over snd_pcm_writei().
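
The "handle wraparound" step in such a function can be sketched in
plain C as below.  This models a generic byte ring buffer and makes no
ALSA calls; ring_write is a hypothetical helper.  With the real API the
mapped area handed out by snd_pcm_mmap_begin is contiguous, so the
wrapped part would normally be covered by a second begin/commit pair
instead of a second memcpy into the same mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy count bytes from src into a ring of ring_size bytes, starting at
 * *pos and wrapping to the start when the end is reached.  Advances *pos.
 * The caller must ensure count <= ring_size and that the space is free.
 */
static void ring_write(unsigned char *ring, size_t ring_size, size_t *pos,
                       const unsigned char *src, size_t count)
{
    size_t first;
    assert(*pos < ring_size && count <= ring_size);
    first = ring_size - *pos;                   /* contiguous room before the end */
    if (first > count)
        first = count;
    memcpy(ring + *pos, src, first);            /* part up to the end of the ring */
    memcpy(ring, src + first, count - first);   /* wrapped remainder, may be 0 bytes */
    *pos = (*pos + count) % ring_size;
}
```

Either way the data is copied once from the caller's buffer into the
ring, which is why this pattern gains nothing over snd_pcm_writei().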

> What's expected to happen in Wine without mmap is:
> 1. app copies/writes data into Wine-managed GetBuffer pool
> 1.b because of ring-buffer management, there's even a little
>     more copying in case of wrap-around.
> 2. ALSA pcm_write* copies from Wine's pool into HW buffer.
> 
> The optimization is:
> 0. app gets pointer from Wine's GetBuffer, which
>    in turn gets it from ALSA's snd_pcm_mmap_begin.
> 1. app copies/writes data into ALSA's HW buffer.

That would indeed be an optimization.

> I'm not sure PA or dmix support mmap.

There are devices that don't have 'real' memory.  However, the default
device then uses the "plug" plugin that supports mmap emulation (with
a timer that then writes data with the normal write call).

>  - How is waveOutGetPosition defined in the presence of huge
>    latencies, i.e. is it implicitly based on a ~0 latency assumption?
>  - When are buffers submitted to waveOutWrite returned to the app?
>    a) After the front-end processed them (sent them to the next stage)?
>    b) After the back-end (speaker) played the last sample in it?
>    The difference matters only as non-zero latencies are introduced into
>    the audio chain, i.e. with networking, USB or simply PA's 2s buffers.

In the bad old times, there was no software processing, buffers were
returned immediately after the hardware was finished with them, and
there was no significant latency.

> >You can call nonblock(0) immediately after open.
> >But if your code never actually blocks, why bother to set it?
> 
> Somebody reported and I've verified that snd_pcm_drain would always
> fail (with -11 IIRC) in non-blocking mode.

Well, if you do want to block, you indeed need blocking mode.  :)


Regards,
Clemens
