* [RFC PATCH v2 0/5] ASoC: soc-pcm: fix trigger race conditions with shared BE
@ 2021-10-04 22:54 Pierre-Louis Bossart
From: Pierre-Louis Bossart @ 2021-10-04 22:54 UTC
  To: alsa-devel
  Cc: Kuninori Morimoto, tiwai, Sameer Pujar, Pierre-Louis Bossart,
	vkoul, broonie, Gyeongtaek Lee, Peter Ujfalusi

We've been adding a 'deep buffer' PCM device to several SOF topologies
in order to reduce power consumption. The typical use-case would be
music playback over a headset: this additional PCM device provides
more buffering and longer latencies, letting the rest of the system
sleep for longer periods. Notifications and 'regular' low-latency
audio playback would still use the 'normal' PCM device and be mixed
with the 'deep buffer' before rendering on the headphone endpoint. The
tentative direction would be to expose this alternate device to
PulseAudio/PipeWire/CRAS via the UCM SectionModifier definitions.

That seemed a straightforward topology change until our automated
validation stress tests started reporting issues on SoundWire
platforms: e.g. two START triggers might be sent and, conversely,
the STOP trigger might never be sent. The SoundWire stream state management
flagged inconsistent states when the two 'normal' and 'deep buffer'
devices are used concurrently with rapid play/stop/pause monkey
testing.

Looking at the soc-pcm.c code, it seems that the BE state
management needs a lot of love.

a) there is no consistent protection for the BE state. In some parts
of the code, the state updates are protected by a spinlock but in the
trigger they are not. When we open/play/close the two PCM devices in
stress tests, we end up testing a state that is being modified. That
can't be good.

b) there is a conceptual deadlock: on stop we check the FE states to
see if a shared BE can be stopped, but since we trigger the BE first
the FE states have not been modified yet, so the TRIGGER_STOP is never
sent.
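
To make b) more concrete, here is a toy userspace model of the ordering
problem. It is only a sketch under assumptions: every name in it
(model_be, model_trigger_be_stop(), ...) is hypothetical and none of it
is taken from soc-pcm.c. It assumes the 'can the BE be stopped' check
skips the FE doing the stop and only looks at the other FEs, and that
the FE state is updated only after the BE has been triggered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model: two FEs sharing one BE. */
enum model_fe_state { MODEL_FE_RUNNING, MODEL_FE_STOPPED };

struct model_be {
	enum model_fe_state fe[2];	/* states of the two connected FEs */
	int stop_sent;			/* number of STOP triggers the BE received */
};

/* Assumption: the BE may stop only if no *other* FE is still running. */
static bool model_be_can_stop(const struct model_be *be, size_t self)
{
	for (size_t i = 0; i < 2; i++)
		if (i != self && be->fe[i] == MODEL_FE_RUNNING)
			return false;
	return true;
}

/* Step 1 of the stop path: the BE is triggered first. */
static void model_trigger_be_stop(struct model_be *be, size_t self)
{
	if (model_be_can_stop(be, self))
		be->stop_sent++;
}

/* Step 2 of the stop path: the FE state is updated afterwards. */
static void model_update_fe_state(struct model_be *be, size_t self)
{
	be->fe[self] = MODEL_FE_STOPPED;
}

/* The two FEs stop one after the other: the last one stops the BE. */
static int model_sequential_stop(void)
{
	struct model_be be = { { MODEL_FE_RUNNING, MODEL_FE_RUNNING }, 0 };

	model_trigger_be_stop(&be, 0);
	model_update_fe_state(&be, 0);
	model_trigger_be_stop(&be, 1);
	model_update_fe_state(&be, 1);
	return be.stop_sent;
}

/* Both FEs run step 1 before either runs step 2. */
static int model_interleaved_stop(void)
{
	struct model_be be = { { MODEL_FE_RUNNING, MODEL_FE_RUNNING }, 0 };

	model_trigger_be_stop(&be, 0);	/* FE1 still reads RUNNING */
	model_trigger_be_stop(&be, 1);	/* FE0 not updated yet, reads RUNNING */
	model_update_fe_state(&be, 0);
	model_update_fe_state(&be, 1);

	/* Both FEs end up stopped, yet the BE never saw a STOP. */
	assert(be.fe[0] == MODEL_FE_STOPPED && be.fe[1] == MODEL_FE_STOPPED);
	return be.stop_sent;
}
```

In this model model_sequential_stop() returns 1 while
model_interleaved_stop() returns 0: each FE defers to the other one and
nobody sends the STOP.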

This patchset suggests a transition from a spinlock to a mutex, an
extended protection when walking through the BE list, and the use of a
refcount to decide when to trigger the BE. With these patches I am
able to run our entire validation suite without any issues with this
new 'deep buffer' topology, and no regressions on existing solutions [1].
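
For readers who have not looked at the patches, the refcount idea can be
sketched in userspace as below. This is not the code from the series:
the struct, the field names and the model_* helpers are all made up for
illustration, and a pthread mutex merely stands in for the proposed
dpcm_mutex. The point is only that the decision to trigger the BE is
based on a count updated under the same lock, rather than on FE states
that may not have been updated yet.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical model of a BE shared by several FEs. */
struct model_be {
	pthread_mutex_t lock;	/* stands in for the proposed dpcm_mutex */
	int users;		/* FEs that currently have this BE started */
	int start_sent;		/* START triggers actually sent to the BE */
	int stop_sent;		/* STOP triggers actually sent to the BE */
};

/* Only the first user actually starts the BE. */
static void model_be_trigger_start(struct model_be *be)
{
	pthread_mutex_lock(&be->lock);
	if (be->users++ == 0)
		be->start_sent++;
	pthread_mutex_unlock(&be->lock);
}

/* Only the last user actually stops the BE. */
static void model_be_trigger_stop(struct model_be *be)
{
	pthread_mutex_lock(&be->lock);
	assert(be->users > 0);	/* a stop without a matching start is a bug */
	if (--be->users == 0)
		be->stop_sent++;
	pthread_mutex_unlock(&be->lock);
}

/*
 * Two FEs ('normal' and 'deep buffer') start and stop the shared BE.
 * Returns 1 if the BE saw exactly one START and one STOP.
 */
static int model_two_fe_cycle(void)
{
	struct model_be be = { .users = 0, .start_sent = 0, .stop_sent = 0 };
	int ok;

	pthread_mutex_init(&be.lock, NULL);

	model_be_trigger_start(&be);	/* 'normal' FE starts */
	model_be_trigger_start(&be);	/* 'deep buffer' FE starts, no 2nd START */
	model_be_trigger_stop(&be);	/* first FE stops, BE keeps running */
	model_be_trigger_stop(&be);	/* last FE stops, STOP is sent */

	ok = be.start_sent == 1 && be.stop_sent == 1 && be.users == 0;
	pthread_mutex_destroy(&be.lock);
	return ok;
}
```

Because the count is read and modified in one critical section, the
outcome does not depend on the order in which the two FEs reach the
trigger, which is what the FE-state-based check could not guarantee.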

One might ask 'how come we didn't see this earlier'? The answer is
probably that the .trigger callbacks in most implementations seem to
perform DAPM operations, and sending the triggers multiple times is
not an issue. In the case of SoundWire, we do use the .trigger
callback to reconfigure the bus using the 'bank switch' mechanism. It
could be acceptable to tolerate a trigger multiple times, but the
deadlock on stop cannot be fixed at the SoundWire layer alone.

I chose to send this patchset as an RFCv2 to gather more feedback and
make sure others know about DPCM issues. We're going to spend more time
on this but if others can provide feedback/test results it would be
greatly appreciated.

Opens:

1) Is this the right solution? The DPCM code is far from simple and has
notions such as SND_SOC_DPCM_UPDATE_NO and 'trigger_pending' that I
have no background on.

2) There are other reports of kernel oopses [2] that seem related to
the lack of protection. It'd be good to confirm if this patchset solves
these problems as well.

[1] https://github.com/thesofproject/linux/pull/3146
[2] https://lore.kernel.org/alsa-devel/002f01d7b4f5$c030f4a0$4092dde0$@samsung.com/

changes since RFC v1:
Removed unused function
Removed exported symbols only used in soc-pcm.c, used static instead
Use a mutex instead of a spinlock
Protect all for_each_dpcm_be() loops
Fix bugs introduced in the refcount

Pierre-Louis Bossart (5):
  ASoC: soc-pcm: remove snd_soc_dpcm_fe_can_update()
  ASoC: soc-pcm: don't export local functions, use static
  ASoC: soc-pcm: replace dpcm_lock with dpcm_mutex
  ASoC: soc-pcm: protect for_each_dpcm_be() loops with dpcm_mutex
  ASoC: soc-pcm: test refcount before triggering

 include/sound/soc-dpcm.h |  17 +----
 include/sound/soc.h      |   2 +-
 sound/soc/soc-core.c     |   2 +-
 sound/soc/soc-pcm.c      | 153 ++++++++++++++++++++++++++-------------
 4 files changed, 108 insertions(+), 66 deletions(-)

-- 
2.25.1


Thread overview: 33+ messages
2021-10-04 22:54 [RFC PATCH v2 0/5] ASoC: soc-pcm: fix trigger race conditions with shared BE Pierre-Louis Bossart
2021-10-04 22:54 ` [RFC PATCH v2 1/5] ASoC: soc-pcm: remove snd_soc_dpcm_fe_can_update() Pierre-Louis Bossart
2021-10-04 22:54   ` Pierre-Louis Bossart
2021-10-04 22:54 ` [RFC PATCH v2 2/5] ASoC: soc-pcm: don't export local functions, use static Pierre-Louis Bossart
2021-10-04 22:54   ` Pierre-Louis Bossart
2021-10-04 22:54 ` [RFC PATCH v2 3/5] ASoC: soc-pcm: replace dpcm_lock with dpcm_mutex Pierre-Louis Bossart
2021-10-04 22:54   ` Pierre-Louis Bossart
2021-10-04 22:54 ` [RFC PATCH v2 4/5] ASoC: soc-pcm: protect for_each_dpcm_be() loops " Pierre-Louis Bossart
2021-10-04 22:54   ` Pierre-Louis Bossart
2021-10-04 22:54 ` [RFC PATCH v2 5/5] ASoC: soc-pcm: test refcount before triggering Pierre-Louis Bossart
2021-10-04 22:54   ` Pierre-Louis Bossart
2021-10-05  6:36 ` [RFC PATCH v2 0/5] ASoC: soc-pcm: fix trigger race conditions with shared BE Sameer Pujar
2021-10-05 13:17   ` Pierre-Louis Bossart
2021-10-06 14:22     ` Sameer Pujar
2021-10-06 19:47       ` Pierre-Louis Bossart
2021-10-07 11:06         ` Takashi Iwai
2021-10-07 13:31           ` Pierre-Louis Bossart
2021-10-07 14:59             ` Takashi Iwai
2021-10-07 15:24               ` Pierre-Louis Bossart
2021-10-07 15:44                 ` Takashi Iwai
2021-10-07 18:13                   ` Pierre-Louis Bossart
2021-10-07 21:11                     ` Takashi Iwai
2021-10-07 21:27                       ` Pierre-Louis Bossart
2021-10-08  6:13                         ` Takashi Iwai
2021-10-08 14:41                           ` Pierre-Louis Bossart
2021-10-08 14:51                             ` Takashi Iwai
2021-10-08 15:41                               ` Pierre-Louis Bossart
2021-10-08 19:09                                 ` Pierre-Louis Bossart
2021-10-11 20:06                                   ` Pierre-Louis Bossart
2021-10-12  6:34                                     ` Takashi Iwai
2021-10-12 10:42                                       ` Takashi Iwai
2021-10-12 13:45                                         ` Pierre-Louis Bossart
2021-10-12 15:07                                           ` Takashi Iwai
