alsa-devel.alsa-project.org archive mirror
 help / color / mirror / Atom feed
* Context expectations in ALSA
@ 2020-10-22  9:50 Maxime Ripard
  2020-10-22 10:03 ` Jaroslav Kysela
  2020-10-22 13:50 ` Mark Brown
  0 siblings, 2 replies; 8+ messages in thread
From: Maxime Ripard @ 2020-10-22  9:50 UTC (permalink / raw)
  To: Jaroslav Kysela, Takashi Iwai, Liam Girdwood, Mark Brown
  Cc: alsa-devel, Dave Stevenson, linux-kernel, Nicolas Saenz Julienne,
	Dom Cobley

[-- Attachment #1: Type: text/plain, Size: 9247 bytes --]

Hi!

We currently have an issue reported by lockdep on the RaspberryPi and
its HDMI audio output where, at startup, we end up scheduling in atomic
context.

This is caused by the HDMI driver polling some status bit that reports
that the infoframes have been properly sent, and calling usleep_range
between each iteration[1], and that is done in our trigger callback that
seems to be run with a spinlock taken and the interrupt disabled
(snd_pcm_action_lock_irq) as part of snd_pcm_start_lock_irq. This is the
entire stack trace:

# aplay --channels=2 -D 'iec958:CARD=vc4hdmi0,DEV=0' /root/test-tone.wav
Playing WAVE '/root/test-tone.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Stereo
[   14.732730] BUG: sleeping function called from invalid context at drivers/gpu/drm/vc4/vc4_hdmi.c:276
[   14.742005] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 140, name: aplay
[   14.749955] CPU: 0 PID: 140 Comm: aplay Not tainted 5.9.0-rc5-v7l+ #66
[   14.756578] Hardware name: BCM2711
[   14.760026] Backtrace: 
[   14.762524] [<c020d250>] (dump_backtrace) from [<c020d5f4>] (show_stack+0x20/0x24)
[   14.770209]  r7:c167c9e4 r6:600e0093 r5:00000000 r4:c167c9e4
[   14.775960] [<c020d5d4>] (show_stack) from [<c0703164>] (dump_stack+0xb8/0xd8)
[   14.783297] [<c07030ac>] (dump_stack) from [<c025108c>] (___might_sleep+0x138/0x17c)
[   14.791159]  r10:00000084 r9:f0979b00 r8:c10ffd50 r7:00000010 r6:00000000 r5:00000114
[   14.799105]  r4:c10fe000 r3:600e0093
[   14.802736] [<c0250f54>] (___might_sleep) from [<c0251140>] (__might_sleep+0x70/0xb0)
[   14.810680]  r4:c0e8f5f4
[   14.813256] [<c02510d0>] (__might_sleep) from [<c0840724>] (vc4_hdmi_stop_packet+0x120/0x330)
[   14.821907]  r6:ef21f280 r5:00000003 r4:73e746f9
[   14.826598] [<c0840604>] (vc4_hdmi_stop_packet) from [<c084422c>] (vc4_hdmi_write_infoframe+0x140/0x5d0)
[   14.836220]  r9:f0979b00 r8:c10ffd50 r7:ef219900 r6:ef248840 r5:00000000 r4:ef21f280
[   14.844084] [<c08440ec>] (vc4_hdmi_write_infoframe) from [<c0844718>] (vc4_hdmi_set_audio_infoframe+0x5c/0x80)
[   14.854236]  r10:efac7738 r9:ef3ea240 r8:00000001 r7:ef219900 r6:ef248840 r5:ef21f040
[   14.862180]  r4:ef21f280
[   14.864756] [<c08446bc>] (vc4_hdmi_set_audio_infoframe) from [<c0844774>] (vc4_hdmi_audio_trigger+0x38/0x238)
[   14.874815]  r4:00000001
[   14.877392] [<c084473c>] (vc4_hdmi_audio_trigger) from [<c0a155f0>] (snd_soc_pcm_dai_trigger+0x64/0xcc)
[   14.886923]  r5:efac5a00 r4:00000000
[   14.890555] [<c0a1558c>] (snd_soc_pcm_dai_trigger) from [<c0a18c70>] (soc_pcm_trigger+0x70/0xac)
[   14.899472]  r9:ef3ea240 r8:c1133540 r7:00000003 r6:ef219900 r5:ef219900 r4:00000001
[   14.907336] [<c0a18c00>] (soc_pcm_trigger) from [<c09fbd84>] (snd_pcm_do_start+0x3c/0x40)
[   14.915633]  r5:c0c749dc r4:00000000
[   14.919264] [<c09fbd48>] (snd_pcm_do_start) from [<c09fa390>] (snd_pcm_action_single+0x48/0x88)
[   14.928094] [<c09fa348>] (snd_pcm_action_single) from [<c09fc830>] (snd_pcm_action+0x6c/0x78)
[   14.936746]  r7:00000003 r6:c0c749dc r5:00000000 r4:ef219900
[   14.942493] [<c09fc7c4>] (snd_pcm_action) from [<c09fd7ec>] (snd_pcm_action_lock_irq+0x38/0x50)
[   14.951320]  r7:00000030 r6:00000003 r5:c0c749dc r4:ef219900
[   14.957069] [<c09fd7b4>] (snd_pcm_action_lock_irq) from [<c09ffd2c>] (snd_pcm_ioctl+0x930/0x14e4)
[   14.966073]  r7:00000030 r6:00000000 r5:0085ee60 r4:ef219900
[   14.971823] [<c09ff3fc>] (snd_pcm_ioctl) from [<c043accc>] (sys_ioctl+0xe4/0x9e4)
[   14.979420]  r10:efac7738 r9:00000004 r8:c1133540 r7:0085ee60 r6:00005624 r5:c1133540
[   14.987364]  r4:00004142
[   14.989940] [<c043abe8>] (sys_ioctl) from [<c0200040>] (ret_fast_syscall+0x0/0x28)
[   14.997621] Exception stack(0xc10fffa8 to 0xc10ffff0)
[   15.002749] ffa0:                   00869260 b6fa8000 00000004 00004142 0085ee60 00000001
[   15.011050] ffc0: 00869260 b6fa8000 00005624 00000036 00000000 00000000 00001589 00001589
[   15.019350] ffe0: b6fa8504 bedb59fc b6f2ff68 b6da515c
[   15.024479]  r10:00000036 r9:c10fe000 r8:c0200204 r7:00000036 r6:00005624 r5:b6fa8000
[   15.032423]  r4:00869260

We could switch to a udelay instead, but the idea of busy-waiting for up
to a 100ms while having the interrupt disabled doesn't sound ideal
either.

It looks like the snd_soc_dai_link structure has a nonatomic flag that
seems to be made to address more or less that issue, taking a mutex
instead of a spinlock. However setting that flag results in another
lockdep issue, since the dmaengine controller doing the DMA transfer
would call snd_pcm_period_elapsed on completion, in a tasklet, this time
taking a mutex in an atomic context which is just as bad as the initial
issue. This is the stacktrace this time:

# aplay --channels=2 -D 'iec958:CARD=vc4hdmi0,DEV=0' /root/test-tone.wav
Playing WAVE '/root/test-tone.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Stereo
[   43.078900] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281
[   43.087485] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/0
[   43.095452] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc5-v7l+ #67
[   43.102252] Hardware name: BCM2711
[   43.105701] Backtrace: 
[   43.108199] [<c020d250>] (dump_backtrace) from [<c020d5f4>] (show_stack+0x20/0x24)
[   43.115884]  r7:c167c9e4 r6:600e0113 r5:00000000 r4:c167c9e4
[   43.121636] [<c020d5d4>] (show_stack) from [<c0703164>] (dump_stack+0xb8/0xd8)
[   43.128972] [<c07030ac>] (dump_stack) from [<c025108c>] (___might_sleep+0x138/0x17c)
[   43.136833]  r10:c1600000 r9:00000006 r8:00000000 r7:c1601d70 r6:00000000 r5:00000119
[   43.144777]  r4:c1600000 r3:600e0113
[   43.148409] [<c0250f54>] (___might_sleep) from [<c0251140>] (__might_sleep+0x70/0xb0)
[   43.156353]  r4:c0e32994
[   43.158930] [<c02510d0>] (__might_sleep) from [<c0baf508>] (mutex_lock+0x2c/0x74)
[   43.166524]  r6:efbfaa84 r5:ef26f548 r4:ef219998
[   43.171216] [<c0baf4dc>] (mutex_lock) from [<c09fa82c>] (_snd_pcm_stream_lock_irqsave+0x2c/0x40)
[   43.180130]  r5:ef26f548 r4:ef219900
[   43.183765] [<c09fa800>] (_snd_pcm_stream_lock_irqsave) from [<c0a03848>] (snd_pcm_period_elapsed+0x20/0xa4)
[   43.193743] [<c0a03828>] (snd_pcm_period_elapsed) from [<c0a067d4>] (dmaengine_pcm_dma_complete+0x54/0x58)
[   43.203539]  r7:c1601d70 r6:efbfaa84 r5:ef26f548 r4:00000000
[   43.209290] [<c0a06780>] (dmaengine_pcm_dma_complete) from [<c07885e8>] (vchan_complete+0x238/0x240)
[   43.218567] [<c07883b0>] (vchan_complete) from [<c0228f9c>] (tasklet_action_common.constprop.0+0x84/0x12c)
[   43.228365]  r10:c1600000 r9:00000006 r8:00000000 r7:dac802e0 r6:efbfaa84 r5:00000000
[   43.236309]  r4:efbfaa80
[   43.238886] [<c0228f18>] (tasklet_action_common.constprop.0) from [<c022906c>] (tasklet_action+0x28/0x30)
[   43.248596]  r9:00000040 r8:ef80b800 r7:00000101 r6:00000006 r5:00000007 r4:c1603098
[   43.256459] [<c0229044>] (tasklet_action) from [<c0201284>] (__do_softirq+0x18c/0x47c)
[   43.264498] [<c02010f8>] (__do_softirq) from [<c0228de4>] (irq_exit+0xd0/0xf8)
[   43.271829]  r10:c1051f40 r9:c1600000 r8:ef80b800 r7:00000001 r6:00000000 r5:00000000
[   43.279774]  r4:ffffe000
[   43.282349] [<c0228d14>] (irq_exit) from [<c0287c54>] (__handle_domain_irq+0x70/0xc0)
[   43.290294]  r5:00000000 r4:c148ee80
[   43.293927] [<c0287be4>] (__handle_domain_irq) from [<c071d8b4>] (gic_handle_irq+0x4c/0x88)
[   43.302403]  r9:c1600000 r8:c1601ed0 r7:f0803000 r6:f0802000 r5:f080200c r4:c16058d8
[   43.310266] [<c071d868>] (gic_handle_irq) from [<c0200abc>] (__irq_svc+0x5c/0x7c)
[   43.317856] Exception stack(0xc1601ed0 to 0xc1601f18)
[   43.322983] 1ec0:                                     c0208b84 00000000 19bf8000 600e0093
[   43.331283] 1ee0: c1600000 00000000 c1604e34 c1604e7c c16d20e5 c0e30cac c1051f40 c1601f2c
[   43.339582] 1f00: ffffe000 c1601f20 c1600000 c0208b88 600e0013 ffffffff
[   43.346297]  r9:c1600000 r8:c16d20e5 r7:c1601f04 r6:ffffffff r5:600e0013 r4:c0208b88
[   43.354164] [<c0208b54>] (arch_cpu_idle) from [<c0bb23c8>] (default_idle_call+0x3c/0x184)
[   43.362469] [<c0bb238c>] (default_idle_call) from [<c025b5f8>] (do_idle+0x110/0x170)
[   43.370327]  r5:00000000 r4:c1600000
[   43.373958] [<c025b4e8>] (do_idle) from [<c025b950>] (cpu_startup_entry+0x28/0x2c)
[   43.381641]  r9:00000193 r8:effff900 r7:c1604e00 r6:00000000 r5:c160ec50 r4:000000d7
[   43.389503] [<c025b928>] (cpu_startup_entry) from [<c0baa65c>] (rest_init+0x108/0x110)
[   43.397542] [<c0baa554>] (rest_init) from [<c1000cc8>] (arch_call_rest_init+0x18/0x1c)
[   43.405575]  r5:c1051f40 r4:c16e7078
[   43.409205] [<c1000cb0>] (arch_call_rest_init) from [<c1001538>] (start_kernel+0x7f4/0x834)
[   43.417682] [<c1000d44>] (start_kernel) from [<00000000>] (0x0)


So, I'm not really sure what I'm supposed to do here. The drivers
involved don't appear to be doing anything extraordinary, but the issues
lockdep report are definitely valid too. What are the expectations in
terms of context from ALSA when running the callbacks, and how can we
fix it?

Thanks!
Maxime

1: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/vc4/vc4_hdmi.c#n234

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Context expectations in ALSA
  2020-10-22  9:50 Context expectations in ALSA Maxime Ripard
@ 2020-10-22 10:03 ` Jaroslav Kysela
  2020-10-22 12:57   ` Maxime Ripard
  2020-10-22 13:50 ` Mark Brown
  1 sibling, 1 reply; 8+ messages in thread
From: Jaroslav Kysela @ 2020-10-22 10:03 UTC (permalink / raw)
  To: Maxime Ripard, Takashi Iwai, Liam Girdwood, Mark Brown
  Cc: alsa-devel, Dave Stevenson, linux-kernel, Nicolas Saenz Julienne,
	Dom Cobley

Dne 22. 10. 20 v 11:50 Maxime Ripard napsal(a):

> So, I'm not really sure what I'm supposed to do here. The drivers
> involved don't appear to be doing anything extraordinary, but the issues
> lockdep report are definitely valid too. What are the expectations in
> terms of context from ALSA when running the callbacks, and how can we
> fix it?

I think that you should set the non-atomic flag and wake up the workqueue or
so from interrupt handler in this case. Call snd_pcm_period_elapsed() from the
workqueue not the interrupt handler context.

						Jaroslav

-- 
Jaroslav Kysela <perex@perex.cz>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Context expectations in ALSA
  2020-10-22 10:03 ` Jaroslav Kysela
@ 2020-10-22 12:57   ` Maxime Ripard
  2020-10-22 13:20     ` Takashi Iwai
  0 siblings, 1 reply; 8+ messages in thread
From: Maxime Ripard @ 2020-10-22 12:57 UTC (permalink / raw)
  To: Jaroslav Kysela
  Cc: alsa-devel, Dom Cobley, Liam Girdwood, Dave Stevenson,
	linux-kernel, Takashi Iwai, Mark Brown, Nicolas Saenz Julienne

[-- Attachment #1: Type: text/plain, Size: 864 bytes --]

On Thu, Oct 22, 2020 at 12:03:19PM +0200, Jaroslav Kysela wrote:
> Dne 22. 10. 20 v 11:50 Maxime Ripard napsal(a):
> 
> > So, I'm not really sure what I'm supposed to do here. The drivers
> > involved don't appear to be doing anything extraordinary, but the issues
> > lockdep report are definitely valid too. What are the expectations in
> > terms of context from ALSA when running the callbacks, and how can we
> > fix it?
> 
> I think that you should set the non-atomic flag and wake up the workqueue or
> so from interrupt handler in this case. Call snd_pcm_period_elapsed() from the
> workqueue not the interrupt handler context.

Yeah, that was my first guess too. However, the DMA driver uses some
kind of generic helpers using a tasklet, so getting rid of it would take
some work and would very likely not be eligible for stable.

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Context expectations in ALSA
  2020-10-22 12:57   ` Maxime Ripard
@ 2020-10-22 13:20     ` Takashi Iwai
  2020-10-22 13:24       ` Maxime Ripard
  0 siblings, 1 reply; 8+ messages in thread
From: Takashi Iwai @ 2020-10-22 13:20 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: alsa-devel, Dom Cobley, Dave Stevenson, linux-kernel,
	Liam Girdwood, Takashi Iwai, Mark Brown, Nicolas Saenz Julienne

On Thu, 22 Oct 2020 14:57:41 +0200,
Maxime Ripard wrote:
> 
> On Thu, Oct 22, 2020 at 12:03:19PM +0200, Jaroslav Kysela wrote:
> > Dne 22. 10. 20 v 11:50 Maxime Ripard napsal(a):
> > 
> > > So, I'm not really sure what I'm supposed to do here. The drivers
> > > involved don't appear to be doing anything extraordinary, but the issues
> > > lockdep report are definitely valid too. What are the expectations in
> > > terms of context from ALSA when running the callbacks, and how can we
> > > fix it?
> > 
> > I think that you should set the non-atomic flag and wake up the workqueue or
> > so from interrupt handler in this case. Call snd_pcm_period_elapsed() from the
> > workqueue not the interrupt handler context.
> 
> Yeah, that was my first guess too. However, the DMA driver uses some
> kind of generic helpers using a tasklet, so getting rid of it would take
> some work and would very likely not be eligible for stable.

Who sets the nonatomic flag for vc4?  I couldn't find the relevant
code in the latest upstream.

Ideally dmaengine PCM helper should support the nonatomic mode, but
until then, the other side needs to drop the nonatomic flag, I
suppose.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Context expectations in ALSA
  2020-10-22 13:20     ` Takashi Iwai
@ 2020-10-22 13:24       ` Maxime Ripard
  2020-10-22 13:31         ` Takashi Iwai
  0 siblings, 1 reply; 8+ messages in thread
From: Maxime Ripard @ 2020-10-22 13:24 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: alsa-devel, Dom Cobley, Dave Stevenson, linux-kernel,
	Liam Girdwood, Takashi Iwai, Mark Brown, Nicolas Saenz Julienne

[-- Attachment #1: Type: text/plain, Size: 1873 bytes --]

Hi Takashi,

On Thu, Oct 22, 2020 at 03:20:49PM +0200, Takashi Iwai wrote:
> On Thu, 22 Oct 2020 14:57:41 +0200,
> Maxime Ripard wrote:
> > 
> > On Thu, Oct 22, 2020 at 12:03:19PM +0200, Jaroslav Kysela wrote:
> > > Dne 22. 10. 20 v 11:50 Maxime Ripard napsal(a):
> > > 
> > > > So, I'm not really sure what I'm supposed to do here. The drivers
> > > > involved don't appear to be doing anything extraordinary, but the issues
> > > > lockdep report are definitely valid too. What are the expectations in
> > > > terms of context from ALSA when running the callbacks, and how can we
> > > > fix it?
> > > 
> > > I think that you should set the non-atomic flag and wake up the workqueue or
> > > so from interrupt handler in this case. Call snd_pcm_period_elapsed() from the
> > > workqueue not the interrupt handler context.
> > 
> > Yeah, that was my first guess too. However, the DMA driver uses some
> > kind of generic helpers using a tasklet, so getting rid of it would take
> > some work and would very likely not be eligible for stable.
> 
> Who sets the nonatomic flag for vc4?  I couldn't find the relevant
> code in the latest upstream.

Sorry if this wasn't clear enough, it's not there at the moment, ALSA
takes a spinlock and lockdep complains that we're sleeping in an atomic
context.

I tried to add the nonatomic flag in my tree to see if it was fixing the
issue, but ran into another lockdep complain now with ALSA taking a
mutex in a tasklet.

> Ideally dmaengine PCM helper should support the nonatomic mode, but
> until then, the other side needs to drop the nonatomic flag, I
> suppose.

In this case, I'm not sure the blame is in the PCM helper but if there's
any blame, I guess it's the virt-chan layer inside dmaengine (so for
providers) that use a tasklet instead of something that allows sleeping

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Context expectations in ALSA
  2020-10-22 13:24       ` Maxime Ripard
@ 2020-10-22 13:31         ` Takashi Iwai
  0 siblings, 0 replies; 8+ messages in thread
From: Takashi Iwai @ 2020-10-22 13:31 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: alsa-devel, Dom Cobley, Dave Stevenson, linux-kernel,
	Liam Girdwood, Takashi Iwai, Mark Brown, Nicolas Saenz Julienne

On Thu, 22 Oct 2020 15:24:12 +0200,
Maxime Ripard wrote:
> 
> Hi Takashi,
> 
> On Thu, Oct 22, 2020 at 03:20:49PM +0200, Takashi Iwai wrote:
> > On Thu, 22 Oct 2020 14:57:41 +0200,
> > Maxime Ripard wrote:
> > > 
> > > On Thu, Oct 22, 2020 at 12:03:19PM +0200, Jaroslav Kysela wrote:
> > > > Dne 22. 10. 20 v 11:50 Maxime Ripard napsal(a):
> > > > 
> > > > > So, I'm not really sure what I'm supposed to do here. The drivers
> > > > > involved don't appear to be doing anything extraordinary, but the issues
> > > > > lockdep report are definitely valid too. What are the expectations in
> > > > > terms of context from ALSA when running the callbacks, and how can we
> > > > > fix it?
> > > > 
> > > > I think that you should set the non-atomic flag and wake up the workqueue or
> > > > so from interrupt handler in this case. Call snd_pcm_period_elapsed() from the
> > > > workqueue not the interrupt handler context.
> > > 
> > > Yeah, that was my first guess too. However, the DMA driver uses some
> > > kind of generic helpers using a tasklet, so getting rid of it would take
> > > some work and would very likely not be eligible for stable.
> > 
> > Who sets the nonatomic flag for vc4?  I couldn't find the relevant
> > code in the latest upstream.
> 
> Sorry if this wasn't clear enough, it's not there at the moment, ALSA
> takes a spinlock and lockdep complains that we're sleeping in an atomic
> context.
> 
> I tried to add the nonatomic flag in my tree to see if it was fixing the
> issue, but ran into another lockdep complain now with ALSA taking a
> mutex in a tasklet.

I see, thanks for clarification.

> > Ideally dmaengine PCM helper should support the nonatomic mode, but
> > until then, the other side needs to drop the nonatomic flag, I
> > suppose.
> 
> In this case, I'm not sure the blame is in the PCM helper but if there's
> any blame, I guess it's the virt-chan layer inside dmaengine (so for
> providers) that use a tasklet instead of something that allows sleeping

Well, we have to align either to atomic or non-atomic operation.
If we want to solve solely in vc4, the fix would be to make the trigger
action into some own work.


Takashi

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Context expectations in ALSA
  2020-10-22  9:50 Context expectations in ALSA Maxime Ripard
  2020-10-22 10:03 ` Jaroslav Kysela
@ 2020-10-22 13:50 ` Mark Brown
  2020-10-22 15:39   ` Maxime Ripard
  1 sibling, 1 reply; 8+ messages in thread
From: Mark Brown @ 2020-10-22 13:50 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: alsa-devel, Dom Cobley, Dave Stevenson, linux-kernel,
	Liam Girdwood, Takashi Iwai, Nicolas Saenz Julienne

[-- Attachment #1: Type: text/plain, Size: 1844 bytes --]

On Thu, Oct 22, 2020 at 11:50:41AM +0200, Maxime Ripard wrote:

> This is caused by the HDMI driver polling some status bit that reports
> that the infoframes have been properly sent, and calling usleep_range
> between each iteration[1], and that is done in our trigger callback that
> seems to be run with a spinlock taken and the interrupt disabled
> (snd_pcm_action_lock_irq) as part of snd_pcm_start_lock_irq. This is the
> entire stack trace:

That doesn't sound like something I would expect you do be doing in the
trigger callback TBH - it feels like if this is something that could
block then the setup should have been done during parameter
configuration or something rather than in trigger.

> It looks like the snd_soc_dai_link structure has a nonatomic flag that
> seems to be made to address more or less that issue, taking a mutex
> instead of a spinlock. However setting that flag results in another
> lockdep issue, since the dmaengine controller doing the DMA transfer
> would call snd_pcm_period_elapsed on completion, in a tasklet, this time
> taking a mutex in an atomic context which is just as bad as the initial
> issue. This is the stacktrace this time:

Like Jaroslav says you could punt to a workqueue here.  I'd be more
inclined to move the sleeping stuff out of the trigger operations but
that'd avoid the issue too.  There are some drivers doing this already
IIRC.

> So, I'm not really sure what I'm supposed to do here. The drivers
> involved don't appear to be doing anything extraordinary, but the issues
> lockdep report are definitely valid too. What are the expectations in
> terms of context from ALSA when running the callbacks, and how can we
> fix it?

To me having something in the trigger that needs waiting for is the bit
that feels the most awkward fit here, trigger is supposed to run very
quickly.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Context expectations in ALSA
  2020-10-22 13:50 ` Mark Brown
@ 2020-10-22 15:39   ` Maxime Ripard
  0 siblings, 0 replies; 8+ messages in thread
From: Maxime Ripard @ 2020-10-22 15:39 UTC (permalink / raw)
  To: Mark Brown
  Cc: alsa-devel, Dom Cobley, Dave Stevenson, linux-kernel,
	Liam Girdwood, Takashi Iwai, Nicolas Saenz Julienne

[-- Attachment #1: Type: text/plain, Size: 2197 bytes --]

Hi Mark

On Thu, Oct 22, 2020 at 02:50:53PM +0100, Mark Brown wrote:
> On Thu, Oct 22, 2020 at 11:50:41AM +0200, Maxime Ripard wrote:
> 
> > This is caused by the HDMI driver polling some status bit that reports
> > that the infoframes have been properly sent, and calling usleep_range
> > between each iteration[1], and that is done in our trigger callback that
> > seems to be run with a spinlock taken and the interrupt disabled
> > (snd_pcm_action_lock_irq) as part of snd_pcm_start_lock_irq. This is the
> > entire stack trace:
> 
> That doesn't sound like something I would expect you do be doing in the
> trigger callback TBH - it feels like if this is something that could
> block then the setup should have been done during parameter
> configuration or something rather than in trigger.
> 
> > It looks like the snd_soc_dai_link structure has a nonatomic flag that
> > seems to be made to address more or less that issue, taking a mutex
> > instead of a spinlock. However setting that flag results in another
> > lockdep issue, since the dmaengine controller doing the DMA transfer
> > would call snd_pcm_period_elapsed on completion, in a tasklet, this time
> > taking a mutex in an atomic context which is just as bad as the initial
> > issue. This is the stacktrace this time:
> 
> Like Jaroslav says you could punt to a workqueue here.  I'd be more
> inclined to move the sleeping stuff out of the trigger operations but
> that'd avoid the issue too.  There are some drivers doing this already
> IIRC.
> 
> > So, I'm not really sure what I'm supposed to do here. The drivers
> > involved don't appear to be doing anything extraordinary, but the issues
> > lockdep report are definitely valid too. What are the expectations in
> > terms of context from ALSA when running the callbacks, and how can we
> > fix it?
> 
> To me having something in the trigger that needs waiting for is the bit
> that feels the most awkward fit here, trigger is supposed to run very
> quickly.

Indeed, other DRM devices seem to send the infoframes as part of
hw_params, and it solves our issue there too. I'll send a patch

Thanks for the suggestion!
Maxime


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-10-22 15:40 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-22  9:50 Context expectations in ALSA Maxime Ripard
2020-10-22 10:03 ` Jaroslav Kysela
2020-10-22 12:57   ` Maxime Ripard
2020-10-22 13:20     ` Takashi Iwai
2020-10-22 13:24       ` Maxime Ripard
2020-10-22 13:31         ` Takashi Iwai
2020-10-22 13:50 ` Mark Brown
2020-10-22 15:39   ` Maxime Ripard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).