dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: Melissa Wen <melissa.srw@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>,
	Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>,
	David Airlie <airlied@linux.ie>,
	Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	kernel-usp@googlegroups.com
Subject: Re: [PATCH] drm/vkms: add missing drm_crtc_vblank_put to the get/put pair on flush
Date: Wed, 22 Jul 2020 17:17:05 +0200	[thread overview]
Message-ID: <CAKMK7uHWCnJ+3YnP2FwVGH6cEDkmPnH9ALjY_1R51QVs0HPG0Q@mail.gmail.com> (raw)
In-Reply-To: <20200722140604.27dfzfnzug5vb75r@smtp.gmail.com>

On Wed, Jul 22, 2020 at 4:06 PM Melissa Wen <melissa.srw@gmail.com> wrote:
>
> On 07/22, daniel@ffwll.ch wrote:
> > On Wed, Jul 22, 2020 at 08:04:11AM -0300, Melissa Wen wrote:
> > > This patch adds a missing drm_crtc_vblank_put op to the pair
> > > drm_crtc_vblank_get/put (inc/decrement counter to guarantee vblanks).
> > >
> > > It clears the execution of the following kms_cursor_crc subtests:
> > > 1. pipe-A-cursor-[size,alpha-opaque, NxN-(on-screen, off-screen, sliding,
> > >    random, fast-moving])] - successful when running individually.
> > > 2. pipe-A-cursor-dpms passes again
> > > 3. pipe-A-cursor-suspend also passes
> > >
> > > The issue was initially tracked in the sequential execution of IGT
> > > kms_cursor_crc subtest: when running the test sequence or one of its
> > > subtests twice, the odd execs complete and the pairs get stuck in an
> > > endless wait. In the IGT code, calling a wait_for_vblank before the start
> > > of CRC capture prevented the busy-wait. But the problem persisted in the
> > > pipe-A-cursor-dpms and -suspend subtests.
> > >
> > > Checking the history, the pipe-A-cursor-dpms subtest was successful when,
> > > in vkms_atomic_commit_tail, instead of using the flip_done op, it used
> > > wait_for_vblanks. Another way to prevent blocking was wait_one_vblank when
> > > enabling crtc. However, in both cases, pipe-A-cursor-suspend persisted
> > > blocking in the 2nd start of CRC capture, which may indicate that
> > > something got stuck in the step of CRC setup. Indeed, wait_one_vblank in
> > > the crc setup was able to sync things and free all kms_cursor_crc
> > > subtests.
> > >
> > > Tracing and comparing a clean run with a blocked one:
> > > - in a clean one, vkms_crtc_atomic_flush enables vblanks;
> > > - when blocked, only in next op, vkms_crtc_atomic_enable, the vblanks
> > > started. Moreover, a series of vkms_vblank_simulate flow out until
> > > disabling vblanks.
> > > Also watching the steps of vkms_crtc_atomic_flush, when the very first
> > > drm_crtc_vblank_get returned an error, the subtest crashed. On the other
> > > hand, when vblank_get succeeded, the subtest completed. Finally, checking
> > > the flush steps: it increases counter to hold a vblank reference (get),
> > > but there isn't a op to decreased it and release vblanks (put).
> > >
> > > Cc: Daniel Vetter <daniel@ffwll.ch>
> > > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> > > Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> > > Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
> > > ---
> > >  drivers/gpu/drm/vkms/vkms_crtc.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > index ac85e17428f8..a99d6b4a92dd 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > @@ -246,6 +246,7 @@ static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
> > >
> > >             spin_unlock(&crtc->dev->event_lock);
> > >
> > > +           drm_crtc_vblank_put(crtc);
> >
> > Uh so I reviewed this a bit more carefully now, and I dont think this is
> > the correct bugfix. From the kerneldoc of drm_crtc_arm_vblank_event():
> >
> >  * Caller must hold a vblank reference for the event @e acquired by a
> >  * drm_crtc_vblank_get(), which will be dropped when the next vblank arrives.
> >
> > So when we call drm_crtc_arm_vblank_event then the vblank_put gets called
> > for us. And that's the only case where we successfully acquired a vblank
> > interrupt reference since on failure of drm_crtc_vblank_get (0 indicates
> > success for that function, failure negative error number) we directly send
> > out the event.
> >
> > So something else fishy is going on, and now I'm totally confused why this
> > even happens.
> >
> > We also have a pile of WARN_ON checks in drm_crtc_vblank_put to make sure
> > we don't underflow the refcount, so it's also not that I think (except if
> > this patch creates more WARNING backtraces).
> >
> > But clearly it changes behaviour somehow ... can you try to figure out
> > what changes? Maybe print out the vblank->refcount at various points in
> > the driver, and maybe also trace when exactly the fake vkms vblank hrtimer
> > is enabled/disabled ...
>
> :(
>
> I can check these, but I also have other suspicions. When I place the
> drm_crct_vblank_put out of the if (at the end of flush), it not only solve
> the issue of blocking on kms_cursor_crc, but also the WARN_ON on kms_flip
> doesn't appear anymore (a total cleanup). Just after:
>
> vkms_output->composer_state = to_vkms_crtc_state(crtc->state);
>
> looks like there is something stuck around here.

Hm do you have the full WARNING for this? Maybe this gives me an idea
what's going wrong.

> Besides, there is a lock at atomic_begin:
>
>   /* This lock is held across the atomic commit to block vblank timer
>    * from scheduling vkms_composer_worker until the composer is updated
>    */
>   spin_lock_irq(&vkms_output->lock);
>
> that seems to be released on atomic_flush and make me suspect something
> missing on the composer update.

atomic_begin/atomic_flush are symmetric functions an always called
around all the plane updates. So having the spin_lock in _begin and
the spin_unlock in _flush should be symmetric and correct.

If you want to make sure, recompile with CONFIG_PROVE_LOCKING, which
should immmediately give you a huge splat in dmesg if there's anything
unbalanced with locking.

> I'll check all these things and come back with news (hope) :)

Have fun chasing stuff :-)

Cheers, Daniel


>
> Thanks,
>
> Melissa
> >
> > I'm totally confused about what's going on here now.
> > -Daniel
> >
> > >             crtc->state->event = NULL;
> > >     }
> > >
> > > --
> > > 2.27.0
> > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  reply	other threads:[~2020-07-22 15:17 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-22 11:04 [PATCH] drm/vkms: add missing drm_crtc_vblank_put to the get/put pair on flush Melissa Wen
2020-07-22 12:05 ` daniel
2020-07-22 14:06   ` Melissa Wen
2020-07-22 15:17     ` Daniel Vetter [this message]
2020-07-25  1:17       ` Sidong Yang
2020-07-25 15:57         ` Daniel Vetter
2020-07-25 17:45           ` Melissa Wen
2020-07-25 18:12             ` Daniel Vetter
2020-07-25 18:49               ` Melissa Wen
2020-07-25 19:19                 ` Melissa Wen
2020-07-25 19:29                   ` Melissa Wen
2020-07-26 10:26                     ` Daniel Vetter
2020-07-28 16:16                       ` Sidong Yang
2020-07-28 21:55                         ` daniel
2020-07-29 19:09               ` Melissa Wen
2020-07-29 21:48                 ` Daniel Vetter
2020-07-30 10:09                   ` Melissa Wen
2020-07-31  9:08                     ` daniel
2020-07-31 16:13                       ` Sidong Yang
2020-07-31 16:47                         ` Melissa Wen
2020-07-31 17:29                           ` Leandro Ribeiro
2020-07-31 17:36                           ` Leandro Ribeiro
2020-07-31 18:10                           ` Leandro Ribeiro
2020-07-31 18:33                           ` Daniel Vetter
2020-07-31 18:39                             ` Leandro Ribeiro
2020-08-01 16:06                             ` Sidong Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKMK7uHWCnJ+3YnP2FwVGH6cEDkmPnH9ALjY_1R51QVs0HPG0Q@mail.gmail.com \
    --to=daniel@ffwll.ch \
    --cc=Rodrigo.Siqueira@amd.com \
    --cc=airlied@linux.ie \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hamohammed.sa@gmail.com \
    --cc=kernel-usp@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=melissa.srw@gmail.com \
    --cc=rodrigosiqueiramelo@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).