From: Daniel Vetter <daniel@ffwll.ch>
To: Stephen Boyd <swboyd@chromium.org>
Cc: Rob Clark <robdclark@gmail.com>, linux-arm-msm@vger.kernel.org,
	freedreno@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org,
	Krishna Manikandan <mkrishn@codeaurora.org>
Subject: Re: [PATCH] drm/msm/kms: Make a lock_class_key for each crtc mutex
Date: Tue, 2 Feb 2021 16:46:56 +0100
Message-ID: <YBlz8Go2DseRWuOa@phenom.ffwll.local>
In-Reply-To: <20210125234901.2730699-1-swboyd@chromium.org>

On Mon, Jan 25, 2021 at 03:49:01PM -0800, Stephen Boyd wrote:
> Lockdep complains about an AA deadlock when rebooting the device.
>
>  ============================================
>  WARNING: possible recursive locking detected
>  5.4.91 #1 Not tainted
>  --------------------------------------------
>  reboot/5213 is trying to acquire lock:
>  ffffff80d13391b0 (&kms->commit_lock[i]){+.+.}, at: lock_crtcs+0x60/0xa4
>
>  but task is already holding lock:
>  ffffff80d1339110 (&kms->commit_lock[i]){+.+.}, at: lock_crtcs+0x60/0xa4
>
>  other info that might help us debug this:
>   Possible unsafe locking scenario:
>
>         CPU0
>         ----
>    lock(&kms->commit_lock[i]);
>    lock(&kms->commit_lock[i]);
>
>   *** DEADLOCK ***
>
>   May be due to missing lock nesting notation
>
>  6 locks held by reboot/5213:
>   __arm64_sys_reboot+0x148/0x2a0
>   device_shutdown+0x10c/0x2c4
>   drm_atomic_helper_shutdown+0x48/0xfc
>   modeset_lock+0x120/0x24c
>   lock_crtcs+0x60/0xa4
>
>  stack backtrace:
>  CPU: 4 PID: 5213 Comm: reboot Not tainted 5.4.91 #1
>  Hardware name: Google Pompom (rev1) with LTE (DT)
>  Call trace:
>   dump_backtrace+0x0/0x1dc
>   show_stack+0x24/0x30
>   dump_stack+0xfc/0x1a8
>   __lock_acquire+0xcd0/0x22b8
>   lock_acquire+0x1ec/0x240
>   __mutex_lock_common+0xe0/0xc84
>   mutex_lock_nested+0x48/0x58
>   lock_crtcs+0x60/0xa4
>   msm_atomic_commit_tail+0x348/0x570
>   commit_tail+0xdc/0x178
>   drm_atomic_helper_commit+0x160/0x168
>   drm_atomic_commit+0x68/0x80
>
> This is because lockdep thinks all the locks taken in lock_crtcs() are
> the same lock, when they actually aren't. That's because we call
> mutex_init() in msm_kms_init() and that assigns one static key for every
> lock initialized in this loop. Let's allocate a dynamic number of
> lock_class_keys and assign them to each lock so that lockdep can figure
> out an AA deadlock isn't possible here.
>
> Fixes: b3d91800d9ac ("drm/msm: Fix race condition in msm driver with async layer updates")
> Cc: Krishna Manikandan <mkrishn@codeaurora.org>
> Signed-off-by: Stephen Boyd <swboyd@chromium.org>

This smells like throwing more bad after initial bad code ...

First a rant: https://blog.ffwll.ch/2020/08/lockdep-false-positives.html

Yes I know the locking you're doing here is correct, but that goes to the
second issue: Why is this needed? atomic_async_update helpers are supposed
to take care of ordering fun like this, if they're not, we need to address
things there. The problem that

commit b3d91800d9ac35014e0349292273a6fa7938d402
Author: Krishna Manikandan <mkrishn@codeaurora.org>
Date:   Fri Oct 16 19:40:43 2020 +0530

    drm/msm: Fix race condition in msm driver with async layer updates

is _the_ reason we have drm_crtc_commit to track stuff, and Maxime has
recently rolled out a pile of changes to vc4 to use these things
correctly. Hacking some glorious hand-rolled locking for synchronization
of updates really should be the exception for kms drivers, not the rule.
And this one here doesn't look like an exception by far (the one legit I
know of is the locking issues amdgpu has between atomic_commit_tail and
gpu reset, and that one is really nasty, so not going to get fixed in
helpers, ever).

Cheers, Daniel

> ---
>  drivers/gpu/drm/msm/msm_kms.h | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
> index d8151a89e163..4735251a394d 100644
> --- a/drivers/gpu/drm/msm/msm_kms.h
> +++ b/drivers/gpu/drm/msm/msm_kms.h
> @@ -157,6 +157,7 @@ struct msm_kms {
>  	 * from the crtc's pending_timer close to end of the frame:
>  	 */
>  	struct mutex commit_lock[MAX_CRTCS];
> +	struct lock_class_key commit_lock_keys[MAX_CRTCS];
>  	unsigned pending_crtc_mask;
>  	struct msm_pending_timer pending_timers[MAX_CRTCS];
>  };
> @@ -166,8 +167,11 @@ static inline int msm_kms_init(struct msm_kms *kms,
>  {
>  	unsigned i, ret;
>
> -	for (i = 0; i < ARRAY_SIZE(kms->commit_lock); i++)
> -		mutex_init(&kms->commit_lock[i]);
> +	for (i = 0; i < ARRAY_SIZE(kms->commit_lock); i++) {
> +		lockdep_register_key(&kms->commit_lock_keys[i]);
> +		__mutex_init(&kms->commit_lock[i], "&kms->commit_lock[i]",
> +			     &kms->commit_lock_keys[i]);
> +	}
>
>  	kms->funcs = funcs;
>
>
> base-commit: 19c329f6808995b142b3966301f217c831e7cf31
> --
> https://chromeos.dev
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
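
The quoted patch description hinges on one detail: a mutex_init()-style
macro declares a single static class key at its expansion site, so every
lock initialized by the same line in a loop ends up in the same lockdep
class. The following is a minimal userspace sketch of that effect, not
kernel code. All names in it (toy_mutex, toy_mutex_init, init_shared,
init_per_lock, all_same_class) are hypothetical stand-ins, and the toy
lock_class_key is only assumed to behave like the kernel's in that its
address identifies the class.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's lock_class_key: only its address matters. */
struct lock_class_key { char dummy; };

/* Toy mutex that merely records which class key it was initialized with. */
struct toy_mutex { const struct lock_class_key *key; };

static void toy_mutex_init_key(struct toy_mutex *m,
			       const struct lock_class_key *key)
{
	m->key = key;
}

/*
 * Modeled on the shape of mutex_init(): each expansion site of the macro
 * gets exactly one static key, no matter how often that site executes.
 */
#define toy_mutex_init(m)					\
	do {							\
		static struct lock_class_key site_key;		\
		toy_mutex_init_key((m), &site_key);		\
	} while (0)

/* One expansion site inside a loop: every lock shares one class key. */
static void init_shared(struct toy_mutex *locks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_mutex_init(&locks[i]);
}

/* One key per lock, which is the approach the quoted patch takes. */
static void init_per_lock(struct toy_mutex *locks,
			  const struct lock_class_key *keys, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_mutex_init_key(&locks[i], &keys[i]);
}

/* Returns 1 if all n locks carry the same class key, 0 otherwise. */
static int all_same_class(const struct toy_mutex *locks, size_t n)
{
	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if (locks[i].key != locks[0].key)
			return 0;
	return 1;
}
```

After init_shared() the locks are indistinguishable by class, which is
the situation in which lockdep reports the nested acquisition in
lock_crtcs() as possible recursion. After init_per_lock() each lock has
its own class. This only illustrates the mechanism; it does not address
the objection above that the hand-rolled locking should not be needed.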