alsa-devel.alsa-project.org archive mirror
 help / color / mirror / Atom feed
From: Marek Szyprowski <m.szyprowski@samsung.com>
To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>,
	alsa-devel@alsa-project.org
Cc: 'Linux Samsung SOC' <linux-samsung-soc@vger.kernel.org>,
	Kai Vehmanen <kai.vehmanen@linux.intel.com>,
	Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>,
	Liam Girdwood <lgirdwood@gmail.com>,
	tiwai@suse.de, Bard Liao <yung-chuan.liao@linux.intel.com>,
	Sameer Pujar <spujar@nvidia.com>, Takashi Iwai <tiwai@suse.com>,
	linux-kernel@vger.kernel.org,
	Krzysztof Kozlowski <krzk@kernel.org>,
	Ranjani Sridharan <ranjani.sridharan@linux.intel.com>,
	vkoul@kernel.org, broonie@kernel.org,
	Gyeongtaek Lee <gt82.lee@samsung.com>,
	Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Subject: Re: [PATCH 4/6] ASoC: soc-pcm: serialize BE triggers
Date: Fri, 17 Dec 2021 12:57:42 +0100	[thread overview]
Message-ID: <17b6b4fc-ecbe-1b36-5843-d9394baaf669@samsung.com> (raw)
In-Reply-To: <660cd57a-d445-63ba-b0bf-6bf6168ffee0@samsung.com>

Dear All,

On 17.12.2021 11:11, Marek Szyprowski wrote:
> On 07.12.2021 18:37, Pierre-Louis Bossart wrote:
>> From: Takashi Iwai <tiwai@suse.de>
>>
>> When more than one FE is connected to a BE, e.g. in a mixing use case,
>> the BE can be triggered multiple times when the FE are opened/started
>> concurrently. This race condition is problematic in the case of
>> SoundWire BE dailinks, and this is not desirable in a general
>> case.
>>
>> This patch relies on the existing BE PCM lock, which takes atomicity 
>> into
>> account. The locking model assumes that all interactions start with
>> the FE, so that there is no deadlock between FE and BE locks.
>>
>> Signed-off-by: Takashi Iwai <tiwai@suse.de>
>> [test, checkpatch fix and clarification of commit message by plbossart]
>> Signed-off-by: Pierre-Louis Bossart 
>> <pierre-louis.bossart@linux.intel.com>
>> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
>> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
>> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
>
> This patch recently landed in linux-next (next-20211215) as commit 
> b2ae80663008 ("ASoC: soc-pcm: serialize BE triggers").

It turned out that the issue is caused by the previous patch "[PATCH 
3/6] ASoC: soc-pcm: Fix and cleanup DPCM locking 
<https://lore.kernel.org/all/20211207173745.15850-4-pierre-louis.bossart@linux.intel.com/#r>", 
merged as commit b7898396f4bb ("ASoC: soc-pcm: Fix and cleanup DPCM 
locking"), for some reasons 'git bisect' pointed me initially to the 
wrong patch. I will re-report it under the proper patch. Reverting it 
and the next patches on top of linux-next fixes the issue.


> I found that it triggers a warning on my test boards. This is the one 
> from Exynos4412-based Odroid U3 board:
>
> # speaker-test -l1
>
> speaker-test 1.1.8
>
> Playback device is default
> Stream parameters are 48000Hz, S16_LE, 1 channels
> Using 16 octaves of pink noise
> Rate set to 48000Hz (requested 48000Hz)
> Buffer size range from 128 to 131072
> Period size range from 64 to 65536
> Using max buffer size 131072
> Periods = 4
> was set period_size = 32768
> was set buffer_size = 131072
>  0 - Front Left
>
> ============================================
> WARNING: possible recursive locking detected
> 5.16.0-rc1-00270-gb2ae80663008 #11109 Not tainted
> --------------------------------------------
> speaker-test/1312 is trying to acquire lock:
> c1d78ca4 (&group->lock){....}-{2:2}, at: dpcm_be_dai_trigger+0x80/0x300
>
> but task is already holding lock:
> c1d788a4 (&group->lock){....}-{2:2}, at: 
> snd_pcm_action_lock_irq+0x68/0x7c
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(&group->lock);
>   lock(&group->lock);
>
>  *** DEADLOCK ***
>
>  May be due to missing lock nesting notation
>
> 1 lock held by speaker-test/1312:
>  #0: c1d788a4 (&group->lock){....}-{2:2}, at: 
> snd_pcm_action_lock_irq+0x68/0x7c
>
> stack backtrace:
> CPU: 0 PID: 1312 Comm: speaker-test Not tainted 
> 5.16.0-rc1-00270-gb2ae80663008 #11109
> Hardware name: Samsung Exynos (Flattened Device Tree)
> [<c0110b30>] (unwind_backtrace) from [<c010c618>] (show_stack+0x10/0x14)
> [<c010c618>] (show_stack) from [<c0b65d24>] (dump_stack_lvl+0x58/0x70)
> [<c0b65d24>] (dump_stack_lvl) from [<c0193740>] 
> (__lock_acquire+0x15ac/0x319c)
> [<c0193740>] (__lock_acquire) from [<c0195dd8>] 
> (lock_acquire+0x14c/0x424)
> [<c0195dd8>] (lock_acquire) from [<c0b745b8>] 
> (_raw_spin_lock_irqsave+0x44/0x60)
> [<c0b745b8>] (_raw_spin_lock_irqsave) from [<c0926b6c>] 
> (dpcm_be_dai_trigger+0x80/0x300)
> [<c0926b6c>] (dpcm_be_dai_trigger) from [<c0927004>] 
> (dpcm_fe_dai_do_trigger+0x124/0x1e4)
> [<c0927004>] (dpcm_fe_dai_do_trigger) from [<c090728c>] 
> (snd_pcm_action+0x74/0xb0)
> [<c090728c>] (snd_pcm_action) from [<c0907eac>] 
> (snd_pcm_action_lock_irq+0x3c/0x7c)
> [<c0907eac>] (snd_pcm_action_lock_irq) from [<c02f13a0>] 
> (sys_ioctl+0x568/0xd44)
> [<c02f13a0>] (sys_ioctl) from [<c0100060>] (ret_fast_syscall+0x0/0x1c)
> Exception stack(0xc4777fa8 to 0xc4777ff0)
> 7fa0:                   004f5210 b6e27394 00000004 00004142 004f5398 
> 004f5398
> 7fc0: 004f5210 b6e27394 00020000 00000036 00000000 00000000 bee588e8 
> 00008000
> 7fe0: b6e277c4 bee58874 b6d8e888 b6c751dc
> Time per period = 0.253397
> max98090 1-0010: PLL unlocked
> BUG: sleeping function called from invalid context at 
> kernel/locking/rwsem.c:1526
> in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1312, name: 
> speaker-test
> preempt_count: 1, expected: 0
> RCU nest depth: 0, expected: 0
> INFO: lockdep is turned off.
> irq event stamp: 8158
> hardirqs last  enabled at (8157): [<c0b747d0>] 
> _raw_spin_unlock_irqrestore+0x5c/0x60
> hardirqs last disabled at (8158): [<c0b74570>] 
> _raw_spin_lock_irq+0x58/0x5c
> softirqs last  enabled at (7854): [<c0101578>] __do_softirq+0x348/0x610
> softirqs last disabled at (7849): [<c012e7a4>] __irq_exit_rcu+0x144/0x1ec
> Preemption disabled at:
> [<00000000>] 0x0
> CPU: 0 PID: 1312 Comm: speaker-test Not tainted 
> 5.16.0-rc1-00270-gb2ae80663008 #11109
> Hardware name: Samsung Exynos (Flattened Device Tree)
> [<c0110b30>] (unwind_backtrace) from [<c010c618>] (show_stack+0x10/0x14)
> [<c010c618>] (show_stack) from [<c0b65d24>] (dump_stack_lvl+0x58/0x70)
> [<c0b65d24>] (dump_stack_lvl) from [<c0158b04>] 
> (__might_resched+0x1c0/0x288)
> [<c0158b04>] (__might_resched) from [<c0b71898>] (down_write+0x24/0x8c)
> [<c0b71898>] (down_write) from [<c030ed64>] 
> (simple_recursive_removal+0x6c/0x370)
> [<c030ed64>] (simple_recursive_removal) from [<c04d07a4>] 
> (debugfs_remove+0x38/0x4c)
> [<c04d07a4>] (debugfs_remove) from [<c0928784>] 
> (dpcm_be_disconnect+0x160/0x2c4)
> [<c0928784>] (dpcm_be_disconnect) from [<c092895c>] 
> (dpcm_fe_dai_cleanup+0x74/0xb0)
> [<c092895c>] (dpcm_fe_dai_cleanup) from [<c0928d90>] 
> (dpcm_fe_dai_close+0xe8/0x14c)
> [<c0928d90>] (dpcm_fe_dai_close) from [<c090977c>] 
> (snd_pcm_release_substream.part.0+0x3c/0xcc)
> [<c090977c>] (snd_pcm_release_substream.part.0) from [<c0909878>] 
> (snd_pcm_release+0x54/0xa4)
> [<c0909878>] (snd_pcm_release) from [<c02dc400>] (__fput+0x88/0x258)
> [<c02dc400>] (__fput) from [<c014cd44>] (task_work_run+0x8c/0xc8)
> [<c014cd44>] (task_work_run) from [<c010c08c>] 
> (do_work_pending+0x4a4/0x598)
> [<c010c08c>] (do_work_pending) from [<c0100088>] 
> (slow_work_pending+0xc/0x20)
> Exception stack(0xc4777fb0 to 0xc4777ff8)
> 7fa0:                                     00000000 004f5260 004eaa9c 
> 00000000
> 7fc0: 004f5260 004f536c 004f5210 00000006 004fb700 004e6e8c 004d6120 
> bee58cc4
> 7fe0: b6e27e64 bee58928 b6d8eda4 b6d09ac0 60000050 00000004
>
> Let me know how I can help debugging this issue.
>
>> ---
>>   sound/soc/soc-pcm.c | 46 ++++++++++++++++++++++++++++-----------------
>>   1 file changed, 29 insertions(+), 17 deletions(-)
>>
>> diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
>> index 2e282c42bac2..7043857e30b1 100644
>> --- a/sound/soc/soc-pcm.c
>> +++ b/sound/soc/soc-pcm.c
>> @@ -46,12 +46,18 @@ static inline void 
>> snd_soc_dpcm_stream_lock_irq(struct snd_soc_pcm_runtime *rtd,
>>       snd_pcm_stream_lock_irq(snd_soc_dpcm_get_substream(rtd, stream));
>>   }
>>   +#define snd_soc_dpcm_stream_lock_irqsave(rtd, stream, flags) \
>> +    snd_pcm_stream_lock_irqsave(snd_soc_dpcm_get_substream(rtd, 
>> stream), flags)
>> +
>>   static inline void snd_soc_dpcm_stream_unlock_irq(struct 
>> snd_soc_pcm_runtime *rtd,
>>                             int stream)
>>   {
>>       snd_pcm_stream_unlock_irq(snd_soc_dpcm_get_substream(rtd, 
>> stream));
>>   }
>>   +#define snd_soc_dpcm_stream_unlock_irqrestore(rtd, stream, flags) \
>> + snd_pcm_stream_unlock_irqrestore(snd_soc_dpcm_get_substream(rtd, 
>> stream), flags)
>> +
>>   #define DPCM_MAX_BE_USERS    8
>>     static inline const char *soc_cpu_dai_name(struct 
>> snd_soc_pcm_runtime *rtd)
>> @@ -2079,6 +2085,7 @@ int dpcm_be_dai_trigger(struct 
>> snd_soc_pcm_runtime *fe, int stream,
>>   {
>>       struct snd_soc_pcm_runtime *be;
>>       struct snd_soc_dpcm *dpcm;
>> +    unsigned long flags;
>>       int ret = 0;
>>         for_each_dpcm_be(fe, stream, dpcm) {
>> @@ -2087,9 +2094,11 @@ int dpcm_be_dai_trigger(struct 
>> snd_soc_pcm_runtime *fe, int stream,
>>           be = dpcm->be;
>>           be_substream = snd_soc_dpcm_get_substream(be, stream);
>>   +        snd_soc_dpcm_stream_lock_irqsave(be, stream, flags);
>> +
>>           /* is this op for this BE ? */
>>           if (!snd_soc_dpcm_be_can_update(fe, be, stream))
>> -            continue;
>> +            goto next;
>>             dev_dbg(be->dev, "ASoC: trigger BE %s cmd %d\n",
>>               be->dai_link->name, cmd);
>> @@ -2099,77 +2108,80 @@ int dpcm_be_dai_trigger(struct 
>> snd_soc_pcm_runtime *fe, int stream,
>>               if ((be->dpcm[stream].state != 
>> SND_SOC_DPCM_STATE_PREPARE) &&
>>                   (be->dpcm[stream].state != SND_SOC_DPCM_STATE_STOP) &&
>>                   (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
>> -                continue;
>> +                goto next;
>>                 ret = soc_pcm_trigger(be_substream, cmd);
>>               if (ret)
>> -                goto end;
>> +                goto next;
>>                 be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
>>               break;
>>           case SNDRV_PCM_TRIGGER_RESUME:
>>               if ((be->dpcm[stream].state != 
>> SND_SOC_DPCM_STATE_SUSPEND))
>> -                continue;
>> +                goto next;
>>                 ret = soc_pcm_trigger(be_substream, cmd);
>>               if (ret)
>> -                goto end;
>> +                goto next;
>>                 be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
>>               break;
>>           case SNDRV_PCM_TRIGGER_PAUSE_RELEASE:
>>               if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
>> -                continue;
>> +                goto next;
>>                 ret = soc_pcm_trigger(be_substream, cmd);
>>               if (ret)
>> -                goto end;
>> +                goto next;
>>                 be->dpcm[stream].state = SND_SOC_DPCM_STATE_START;
>>               break;
>>           case SNDRV_PCM_TRIGGER_STOP:
>>               if ((be->dpcm[stream].state != 
>> SND_SOC_DPCM_STATE_START) &&
>>                   (be->dpcm[stream].state != SND_SOC_DPCM_STATE_PAUSED))
>> -                continue;
>> +                goto next;
>>                 if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
>> -                continue;
>> +                goto next;
>>                 ret = soc_pcm_trigger(be_substream, cmd);
>>               if (ret)
>> -                goto end;
>> +                goto next;
>>                 be->dpcm[stream].state = SND_SOC_DPCM_STATE_STOP;
>>               break;
>>           case SNDRV_PCM_TRIGGER_SUSPEND:
>>               if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
>> -                continue;
>> +                goto next;
>>                 if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
>> -                continue;
>> +                goto next;
>>                 ret = soc_pcm_trigger(be_substream, cmd);
>>               if (ret)
>> -                goto end;
>> +                goto next;
>>                 be->dpcm[stream].state = SND_SOC_DPCM_STATE_SUSPEND;
>>               break;
>>           case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
>>               if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
>> -                continue;
>> +                goto next;
>>                 if (!snd_soc_dpcm_can_be_free_stop(fe, be, stream))
>> -                continue;
>> +                goto next;
>>                 ret = soc_pcm_trigger(be_substream, cmd);
>>               if (ret)
>> -                goto end;
>> +                goto next;
>>                 be->dpcm[stream].state = SND_SOC_DPCM_STATE_PAUSED;
>>               break;
>>           }
>> +next:
>> +        snd_soc_dpcm_stream_unlock_irqrestore(be, stream, flags);
>> +        if (ret)
>> +            break;
>>       }
>> -end:
>>       if (ret < 0)
>>           dev_err(fe->dev, "ASoC: %s() failed at %s (%d)\n",
>>               __func__, be->dai_link->name, ret);
>
> Best regards

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


  parent reply	other threads:[~2021-12-17 12:02 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-07 17:37 [PATCH 0/6] ASoC : soc-pcm: fix trigger race conditions with shared BE Pierre-Louis Bossart
2021-12-07 17:37 ` [PATCH 1/6] ASoC: soc-pcm: use GFP_ATOMIC for dpcm structure Pierre-Louis Bossart
2021-12-07 17:37 ` [PATCH 2/6] ASoC: soc-pcm: align BE 'atomicity' with that of the FE Pierre-Louis Bossart
2021-12-07 17:37 ` [PATCH 3/6] ASoC: soc-pcm: Fix and cleanup DPCM locking Pierre-Louis Bossart
     [not found]   ` <CGME20211217141541eucas1p1bdcb8b91e8a772229391b525d6adbf3b@eucas1p1.samsung.com>
2021-12-17 14:15     ` Marek Szyprowski
2021-12-17 16:54       ` Pierre-Louis Bossart
2021-12-07 17:37 ` [PATCH 4/6] ASoC: soc-pcm: serialize BE triggers Pierre-Louis Bossart
     [not found]   ` <CGME20211217101152eucas1p14ea09622edde5118ddd832c3ebaebdde@eucas1p1.samsung.com>
2021-12-17 10:11     ` Marek Szyprowski
     [not found]       ` <CGME20211217115743eucas1p10e721164cd845176f9dc5c4c2ad86838@eucas1p1.samsung.com>
2021-12-17 11:57         ` Marek Szyprowski [this message]
2021-12-07 17:37 ` [PATCH 5/6] ASoC: soc-pcm: test refcount before triggering Pierre-Louis Bossart
2021-12-07 17:37 ` [PATCH 6/6] ASoC: soc-pcm: fix BE handling of PAUSE_RELEASE Pierre-Louis Bossart
2021-12-15  2:04 ` [PATCH 0/6] ASoC : soc-pcm: fix trigger race conditions with shared BE Mark Brown
2023-05-11 12:08 [PATCH 0/6] Fix soc compress playback deadlock in 5.15 yixuanjiang
2023-05-11 12:08 ` [PATCH 4/6] ASoC: soc-pcm: serialize BE triggers yixuanjiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=17b6b4fc-ecbe-1b36-5843-d9394baaf669@samsung.com \
    --to=m.szyprowski@samsung.com \
    --cc=alsa-devel@alsa-project.org \
    --cc=broonie@kernel.org \
    --cc=gt82.lee@samsung.com \
    --cc=kai.vehmanen@linux.intel.com \
    --cc=krzk@kernel.org \
    --cc=kuninori.morimoto.gx@renesas.com \
    --cc=lgirdwood@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-samsung-soc@vger.kernel.org \
    --cc=peter.ujfalusi@linux.intel.com \
    --cc=pierre-louis.bossart@linux.intel.com \
    --cc=ranjani.sridharan@linux.intel.com \
    --cc=spujar@nvidia.com \
    --cc=tiwai@suse.com \
    --cc=tiwai@suse.de \
    --cc=vkoul@kernel.org \
    --cc=yung-chuan.liao@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).