On Thu, Oct 22, 2020 at 11:50:41AM +0200, Maxime Ripard wrote:

> This is caused by the HDMI driver polling some status bit that reports
> that the infoframes have been properly sent, and calling usleep_range
> between each iteration[1], and that is done in our trigger callback that
> seems to be run with a spinlock taken and the interrupt disabled
> (snd_pcm_action_lock_irq) as part of snd_pcm_start_lock_irq. This is the
> entire stack trace:

That doesn't sound like something I would expect you to be doing in the
trigger callback TBH - it feels like if this is something that could
block then the setup should have been done during parameter
configuration or something rather than in trigger.

> It looks like the snd_soc_dai_link structure has a nonatomic flag that
> seems to be made to address more or less that issue, taking a mutex
> instead of a spinlock. However setting that flag results in another
> lockdep issue, since the dmaengine controller doing the DMA transfer
> would call snd_pcm_period_elapsed on completion, in a tasklet, this time
> taking a mutex in an atomic context which is just as bad as the initial
> issue. This is the stacktrace this time:

Like Jaroslav says, you could punt to a workqueue here. I'd be more
inclined to move the sleeping stuff out of the trigger operations but
that'd avoid the issue too. There are some drivers doing this already
IIRC.

> So, I'm not really sure what I'm supposed to do here. The drivers
> involved don't appear to be doing anything extraordinary, but the issues
> lockdep report are definitely valid too. What are the expectations in
> terms of context from ALSA when running the callbacks, and how can we
> fix it?

To me having something in the trigger that needs waiting for is the bit
that feels the most awkward fit here; trigger is supposed to run very
quickly.
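
As a rough illustration of the two options discussed above, here is a
minimal userspace C model. It is not kernel code and not taken from any
driver: every name in it (model_configure, model_trigger_start,
model_queue_work and so on) is hypothetical, the "atomic context" is just
a flag, and the "work queue" is a small array drained by hand. It only
shows the shape of the idea: the function that may sleep is either run
before trigger, or queued by trigger and run later outside the atomic
section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; every name here is hypothetical, not an ALSA API. */

static bool in_atomic;        /* models "spinlock held, interrupts off" */
static bool infoframes_sent;  /* models the hardware status bit */
static bool stream_running;   /* models the state that trigger flips */

typedef void (*work_fn)(void);

#define MAX_PENDING 4
static work_fn pending[MAX_PENDING];
static size_t npending;

/* Stands in for the usleep_range() polling loop: it may sleep, so it
 * must never be reached while in_atomic is set. */
static void wait_for_infoframes(void)
{
	assert(!in_atomic);
	infoframes_sent = true;
}

/* Option 1: do the slow setup in a step that runs in process context
 * (modelled on parameter configuration), before trigger is called. */
static void model_configure(void)
{
	wait_for_infoframes();
}

/* Queueing never sleeps, so it is safe to call from atomic context. */
static bool model_queue_work(work_fn fn)
{
	if (npending == MAX_PENDING)
		return false;
	pending[npending++] = fn;
	return true;
}

/* Option 2: trigger only flips state and defers the slow part if it
 * has not already been done. */
static int model_trigger_start(void)
{
	int ret = 0;

	in_atomic = true;
	stream_running = true;
	if (!infoframes_sent && !model_queue_work(wait_for_infoframes))
		ret = -1;
	in_atomic = false;
	return ret;
}

/* Runs later in process context, where sleeping is allowed. */
static void model_run_pending_work(void)
{
	for (size_t i = 0; i < npending; i++)
		pending[i]();
	npending = 0;
}
```

In this model, calling model_configure() before model_trigger_start()
means trigger has nothing left to queue. Without it, trigger queues the
wait and model_run_pending_work() completes it afterwards. In neither
case does wait_for_infoframes() run while in_atomic is set.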