All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
To: Takashi Iwai <tiwai@suse.de>
Cc: alsa-devel@alsa-project.org,
	Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>,
	Sameer Pujar <spujar@nvidia.com>,
	vkoul@kernel.org, broonie@kernel.org,
	Gyeongtaek Lee <gt82.lee@samsung.com>,
	Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Subject: Re: [RFC PATCH v2 0/5] ASoC: soc-pcm: fix trigger race conditions with shared BE
Date: Mon, 11 Oct 2021 15:06:51 -0500	[thread overview]
Message-ID: <8c1353f0-e897-e7b0-c7b9-5712b05ac91f@linux.intel.com> (raw)
In-Reply-To: <29397354-dc5b-7837-c71b-df4bde707df2@linux.intel.com>


>>> Basically we need to protect two things:
>>> - The BE links
>>> - The concurrent accesses to BEs
>>> The former belongs to each FE that holds the links, and the FE stream
>>> lock would cover.  The latter is rather a per-BE business.
>>>
>>> An oft-seen risk of multiple locks is deadlocking, but in this case,
>>> as long as we keep the lock order FE->BE, nothing wrong can happen.
>>
>> famous last words  "nothing wrong can happen." :-)
>>
>> I already added a helper to do this FE lock, I can easily replace the
>> implementation to remove the spin_lock and use the FE PCM lock.
>> we might even add the lock in the definition of for_each_dpcm_be() to
>> avoid misses.
>>
>> Let me try this out today, thanks for the suggestions.
> 
> well, it's not successful at all...
> 
> When I replace the existing dpcm_lock with the FE PCM lock as you
> suggested, without any additional changes, speaker-test produces valid
> audio on the endpoints, but if I try a Ctrl-C or limit the number of
> loops with the '-l' option, I hear an endless loop on the same buffer
> and I have to power cycle my test device to stop the sound.
> 
> See 2 patches attached, the first patch with the introduction of the
> helper works fine, the second with the use of the FE PCM lock doesn't.
> In hindsight I am glad I worked on minimal patches, one after the other,
> to identify problems.
> 
> And when I add the BE lock, then nothing happens. Device stuck and no
> audio...
> 
> There must be something we're missing related to the locking...

And indeed there's a deadlock!

snd_pcm_period_elapsed() takes the FE pcm stream lock, and will call
snd_pcm_trigger. So if we also take the pcm stream lock in the BE
trigger, there's a conceptual deadlock: we painted ourselves in a corner
by using the same lock twice.

Takashi, are you really sure we should protect these for_each_dpcm_be()
loops with the same pcm lock? it seems like asking for trouble to
revisit the ALSA core just to walking through a list of BEs? Would you
object to changing dpcm_lock as I suggested, but not interfering with
stream handling?

See the traces below based on the hack in
https://github.com/plbossart/sound/tree/test/dpcm-lock-loop-on-stop

[   67.892082] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_sof_pcm_period_elapsed_work: plb taking lock
[   67.892088] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_sof_pcm_period_elapsed_work: plb taking lock - done
[   67.892092] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_period_elapsed_under_stream_lock: start
[   67.892096] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_period_elapsed_under_stream_lock: check running
[   67.892099] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_period_elapsed_under_stream_lock: check update_hw_ptr0
[   67.892103] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_update_hw_ptr0: start
[   67.892110] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_update_hw_ptr0: delta
[   67.892113] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_update_hw_ptr0: check1
[   67.892116] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_update_hw_ptr0: no_delta_check
[   67.892119] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_update_hw_ptr0: playback silence
[   67.892123] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_update_hw_ptr0: checks 3
[   67.892126] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_update_hw_ptr0: done
[   67.892129] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_update_state: start
[   67.892132] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_update_state: before draining
[   67.892136] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_update_state: before draining2
[   67.892139] skl_hda_dsp_generic skl_hda_dsp_generic:
snd_pcm_update_state: before snd_pcm_drain_done
[   67.892143] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_do_stop:
start
[   67.892147] skl_hda_dsp_generic skl_hda_dsp_generic: snd_pcm_do_stop:
before TRIGGER_STOP

<<< we never reach the end of this TRIGGER_STOP

[   67.892153] sof-audio-pci-intel-cnl 0000:00:1f.3: pcm: trigger stream
0 dir 0 cmd 0
[   67.892166] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x60050000:
GLB_STREAM_MSG: TRIG_STOP
[   67.892396] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded:
0x60050000: GLB_STREAM_MSG: TRIG_STOP
[   67.892408] sof-audio-pci-intel-cnl 0000:00:1f.3: FW Poll Status:
reg[0x160]=0x140000 successful
[   67.892418] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x60030000:
GLB_STREAM_MSG: PCM_FREE
[   67.892564] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded:
0x60030000: GLB_STREAM_MSG: PCM_FREE
[   67.892571]  HDA Analog: dpcm_be_dai_trigger: BE TRIGGER_STOP start
[   67.892575]  HDA Analog: dpcm_be_dai_trigger: BE TRIGGER_STOP check
can_be_free_stop
[   67.892579]  HDA Analog: snd_soc_dpcm_check_state: plb: taking fe
lock_irqsave from snd_soc_dpcm_check_state

<<< checkmate at this point, we're trying to take the same lock twice.


  reply	other threads:[~2021-10-11 20:08 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-04 22:54 [RFC PATCH v2 0/5] ASoC: soc-pcm: fix trigger race conditions with shared BE Pierre-Louis Bossart
2021-10-04 22:54 ` [RFC PATCH v2 1/5] ASoC: soc-pcm: remove snd_soc_dpcm_fe_can_update() Pierre-Louis Bossart
2021-10-04 22:54   ` Pierre-Louis Bossart
2021-10-04 22:54 ` [RFC PATCH v2 2/5] ASoC: soc-pcm: don't export local functions, use static Pierre-Louis Bossart
2021-10-04 22:54   ` Pierre-Louis Bossart
2021-10-04 22:54 ` [RFC PATCH v2 3/5] ASoC: soc-pcm: replace dpcm_lock with dpcm_mutex Pierre-Louis Bossart
2021-10-04 22:54   ` Pierre-Louis Bossart
2021-10-04 22:54 ` [RFC PATCH v2 4/5] ASoC: soc-pcm: protect for_each_dpcm_be() loops " Pierre-Louis Bossart
2021-10-04 22:54   ` Pierre-Louis Bossart
2021-10-04 22:54 ` [RFC PATCH v2 5/5] ASoC: soc-pcm: test refcount before triggering Pierre-Louis Bossart
2021-10-04 22:54   ` Pierre-Louis Bossart
2021-10-05  6:36 ` [RFC PATCH v2 0/5] ASoC: soc-pcm: fix trigger race conditions with shared BE Sameer Pujar
2021-10-05 13:17   ` Pierre-Louis Bossart
2021-10-06 14:22     ` Sameer Pujar
2021-10-06 19:47       ` Pierre-Louis Bossart
2021-10-07 11:06         ` Takashi Iwai
2021-10-07 13:31           ` Pierre-Louis Bossart
2021-10-07 14:59             ` Takashi Iwai
2021-10-07 15:24               ` Pierre-Louis Bossart
2021-10-07 15:44                 ` Takashi Iwai
2021-10-07 18:13                   ` Pierre-Louis Bossart
2021-10-07 21:11                     ` Takashi Iwai
2021-10-07 21:27                       ` Pierre-Louis Bossart
2021-10-08  6:13                         ` Takashi Iwai
2021-10-08 14:41                           ` Pierre-Louis Bossart
2021-10-08 14:51                             ` Takashi Iwai
2021-10-08 15:41                               ` Pierre-Louis Bossart
2021-10-08 19:09                                 ` Pierre-Louis Bossart
2021-10-11 20:06                                   ` Pierre-Louis Bossart [this message]
2021-10-12  6:34                                     ` Takashi Iwai
2021-10-12 10:42                                       ` Takashi Iwai
2021-10-12 13:45                                         ` Pierre-Louis Bossart
2021-10-12 15:07                                           ` Takashi Iwai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8c1353f0-e897-e7b0-c7b9-5712b05ac91f@linux.intel.com \
    --to=pierre-louis.bossart@linux.intel.com \
    --cc=alsa-devel@alsa-project.org \
    --cc=broonie@kernel.org \
    --cc=gt82.lee@samsung.com \
    --cc=kuninori.morimoto.gx@renesas.com \
    --cc=peter.ujfalusi@linux.intel.com \
    --cc=spujar@nvidia.com \
    --cc=tiwai@suse.de \
    --cc=vkoul@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.