Hi,

On Thu, Oct 14, 2021 at 03:15:36PM +0200, Daniel Vetter wrote:
> On Wed, Oct 13, 2021 at 05:01:03PM +0200, Maxime Ripard wrote:
> > On Thu, Sep 30, 2021 at 11:19:59AM +0200, Daniel Vetter wrote:
> > > On Tue, Sep 28, 2021 at 10:34:46AM +0200, Maxime Ripard wrote:
> > > > Hi Daniel,
> > > >
> > > > On Sat, Sep 25, 2021 at 12:50:17AM +0200, Daniel Vetter wrote:
> > > > > On Fri, Sep 24, 2021 at 3:30 PM Maxime Ripard wrote:
> > > > > >
> > > > > > On Wed, Sep 22, 2021 at 01:25:21PM -0700, Linus Torvalds wrote:
> > > > > > > On Wed, Sep 22, 2021 at 1:19 PM Sudip Mukherjee wrote:
> > > > > > > >
> > > > > > > > I added some debugs to print the addresses, and I am getting:
> > > > > > > > [ 38.813809] sudip crtc 0000000000000000
> > > > > > > >
> > > > > > > > This is from struct drm_crtc *crtc = connector->state->crtc;
> > > > > > >
> > > > > > > Yeah, that was my personal suspicion, because while the line number
> > > > > > > implied "crtc->state" being NULL, the drm data structure documentation
> > > > > > > and other drivers both imply that "crtc" was the more likely one.
> > > > > > >
> > > > > > > I suspect a simple
> > > > > > >
> > > > > > >         if (!crtc)
> > > > > > >                 return;
> > > > > > >
> > > > > > > in vc4_hdmi_set_n_cts() is at least part of the fix for this all, but
> > > > > > > I didn't check if there is possibly something else that needs to be
> > > > > > > done too.
> > > > > >
> > > > > > Thanks for the decode_stacktrace.sh and the follow-up
> > > > > >
> > > > > > Yeah, it looks like we have several things wrong here:
> > > > > >
> > > > > >   * we only check that connector->state is set, and not
> > > > > >     connector->state->crtc indeed.
> > > > > >
> > > > > >   * We also check only in startup(), so at open() and not later on when
> > > > > >     the sound streaming actually starts. This has been there for a while,
> > > > > >     so I guess it's never really been causing a practical issue before.
> > > > >
> > > > > You also have no locking
> > > >
> > > > Indeed. Do we just need locking to prevent a concurrent audio setup and
> > > > modeset, or do you have another corner case in mind?
> > > >
> > > > Also, generally, what locks should we make sure we have locked when
> > > > accessing the connector and CRTC state? drm_mode_config.connection_mutex
> > > > and drm_mode_config.mutex, respectively?
> > > >
> > > > > plus looking at ->state objects outside of atomic commit machinery
> > > > > makes no sense because you're not actually in sync with the hw state.
> > > > > Relevant bits need to be copied over at commit time, protected by some
> > > > > spinlock (and that spinlock also needs to be held over whatever other
> > > > > stuff you're setting to make sure we don't get a funny out-of-sync
> > > > > state anywhere).
> > > >
> > > > If we already have a lock protecting against having both an ASoC and KMS
> > > > function running, it's not clear to me what the spinlock would prevent
> > > > here?
> > >
> > > Replicating the irc chat here. With
> > >
> > > commit 6c5ed5ae353cdf156f9ac4db17e15db56b4de880
> > > Author: Maarten Lankhorst
> > > Date:   Thu Apr 6 20:55:20 2017 +0200
> > >
> > >     drm/atomic: Acquire connection_mutex lock in drm_helper_probe_single_connector_modes, v4.
> > >
> > > this is already taken care of for drivers and should be all good from a
> > > locking pov.
> >
> > So, if I understand this properly, this supersedes your comment on the
> > spinlock for the hw state, but not the comment that we need some locking
> > to synchronize between the audio and KMS path (and CEC?). Right?
>
> Other way round. There's 3 things involved here:
> 1. kms output probe code
> 2. kms atomic commit code
> 3. calls from asoc side
>
> The above referenced commit makes sure 1&2 are synchronized. The problem
> is that 2&3 are not synchronized, and from 3, no matter how much locking
> you have, you cannot look at kms state. I.e.
> not allowed to look at crtc->state for example, irrespective of whether
> you're holding drm_modeset_lock or not. This is because the atomic
> nonblocking commit is done without holding any locks, protection is purely
> down to ownership rules of state structures and ordering (through
> drm_crtc_commit) of in-flight nonblocking atomic commits.
>
> That's why you need a separate lock _and_ copy state, so that 2&3 stay in
> sync.
>
> In practice you only care about modeset changes from 2 vs anything from 3,
> and most userspace does modeset atomic commits as blocking commits, which
> means you won't notice that your locking has gaps.
>
> btw same problem exists between atomic and (vblank) irq handler. There you
> need an irqsafe spinlock and you also have to copy (because the irq handler
> just cannot access ->state in any safe way, because it doesn't own that
> structure).
>
> This is maybe a bit the confusing thing with atomic commit: ->state isn't
> protected by locks, but through ownership rules. Only for atomic check is
> ->state protected by locks, but once we're committed we switch over to
> ownership rules for protection. swap_states() is that point of no return.

Thanks for the clarifications, I just posted a series that should be
implementing this here:
https://lore.kernel.org/dri-devel/20211025141113.702757-1-maxime@cerno.tech/

Maxime
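
For reference, the "separate lock _and_ copy state" pattern quoted above can
be sketched in plain userspace C. This is only an illustration under stated
assumptions, not the code from the posted series: struct fake_hdmi, struct
audio_snapshot and the fake_hdmi_*() helpers are hypothetical names, a
pthread mutex stands in for whatever lock a driver would actually pick, and
the pixel clock is just an example of a value the audio path might need.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical copy of the bits of KMS state the audio path needs. */
struct audio_snapshot {
	bool output_enabled;          /* false when no CRTC is bound */
	unsigned int pixel_clock_khz; /* only meaningful when enabled */
};

/* Hypothetical driver-private structure; not the real vc4 one. */
struct fake_hdmi {
	pthread_mutex_t lock;         /* assumption: stands in for the driver lock */
	struct audio_snapshot snap;   /* protected by lock */
};

static void fake_hdmi_init(struct fake_hdmi *hdmi)
{
	pthread_mutex_init(&hdmi->lock, NULL);
	hdmi->snap.output_enabled = false;
	hdmi->snap.pixel_clock_khz = 0;
}

/*
 * "Commit" side (thing 2 in the list above): the only place that is
 * allowed to look at the new state. It copies the relevant bits into
 * the snapshot while holding the lock.
 */
static void fake_hdmi_commit(struct fake_hdmi *hdmi, bool enabled,
			     unsigned int pixel_clock_khz)
{
	pthread_mutex_lock(&hdmi->lock);
	hdmi->snap.output_enabled = enabled;
	hdmi->snap.pixel_clock_khz = enabled ? pixel_clock_khz : 0;
	pthread_mutex_unlock(&hdmi->lock);
}

/*
 * "Audio" side (thing 3): never dereferences any ->state pointer, only
 * reads the snapshot under the same lock. Returns false when there is
 * no active output, which replaces the bare NULL check on crtc.
 */
static bool fake_hdmi_audio_prepare(struct fake_hdmi *hdmi,
				    unsigned int *pixel_clock_khz)
{
	bool enabled;

	assert(pixel_clock_khz != NULL);

	pthread_mutex_lock(&hdmi->lock);
	enabled = hdmi->snap.output_enabled;
	if (enabled)
		*pixel_clock_khz = hdmi->snap.pixel_clock_khz;
	pthread_mutex_unlock(&hdmi->lock);

	return enabled;
}
```

The point of the sketch is that the audio side fails cleanly when the output
is disabled, and never sees a half-updated mix of old and new values, because
both sides go through the same lock and the same copied structure.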