All of lore.kernel.org
 help / color / mirror / Atom feed
* drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-05  0:01 ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-05  0:01 UTC (permalink / raw)
  To: Rob Clark; +Cc: Sean Paul, freedreno, linux-kernel, linux-arm-msm, dri-devel

Hey Rob,

Since commit 2d99ced787e3 ("drm/msm: async commit support"), the frame
buffer console on my Nexus 5 began throwing these errors:

msm fd900000.mdss: pp done time out, lm=0

The display still works.

I see that mdp5_flush_commit() was introduced in commit 9f6b65642bd2
("drm/msm: add kms->flush_commit()") with a TODO comment and the commit
description mentions flushing registers. I assume that this is the
proper fix. If so, can you point me to where these registers are
defined and I can work on the mdp5 implementation.

Thanks,

Brian

^ permalink raw reply	[flat|nested] 44+ messages in thread

* drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-05  0:01 ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-05  0:01 UTC (permalink / raw)
  To: Rob Clark; +Cc: freedreno, Sean Paul, linux-kernel, dri-devel, linux-arm-msm

Hey Rob,

Since commit 2d99ced787e3 ("drm/msm: async commit support"), the frame
buffer console on my Nexus 5 began throwing these errors:

msm fd900000.mdss: pp done time out, lm=0

The display still works.

I see that mdp5_flush_commit() was introduced in commit 9f6b65642bd2
("drm/msm: add kms->flush_commit()") with a TODO comment and the commit
description mentions flushing registers. I assume that this is the
proper fix. If so, can you point me to where these registers are
defined and I can work on the mdp5 implementation.

Thanks,

Brian
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-05  0:19   ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-05  0:19 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, freedreno, Sean Paul, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm

On Mon, Nov 4, 2019 at 4:01 PM Brian Masney <masneyb@onstation.org> wrote:
>
> Hey Rob,
>
> Since commit 2d99ced787e3 ("drm/msm: async commit support"), the frame
> buffer console on my Nexus 5 began throwing these errors:
>
> msm fd900000.mdss: pp done time out, lm=0
>
> The display still works.
>
> I see that mdp5_flush_commit() was introduced in commit 9f6b65642bd2
> ("drm/msm: add kms->flush_commit()") with a TODO comment and the commit
> description mentions flushing registers. I assume that this is the
> proper fix. If so, can you point me to where these registers are
> defined and I can work on the mdp5 implementation.

See mdp5_ctl_commit(), which writes the CTL_FLUSH registers.. the idea
would be to defer writing CTL_FLUSH[ctl_id] = flush_mask until
kms->flush() (which happens from a timer shortly before vblank).

But I think the async flush case should not come up with fbcon?  It
was really added to cope with hwcursor updates (and userspace that
assumes it can do an unlimited # of cursor updates per frame).. the
intention was that nothing should change in the sequence for mdp5 (but
I guess that was not the case).

BR,
-R

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-05  0:19   ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-05  0:19 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, freedreno, Linux Kernel Mailing List, dri-devel,
	linux-arm-msm, Sean Paul

On Mon, Nov 4, 2019 at 4:01 PM Brian Masney <masneyb@onstation.org> wrote:
>
> Hey Rob,
>
> Since commit 2d99ced787e3 ("drm/msm: async commit support"), the frame
> buffer console on my Nexus 5 began throwing these errors:
>
> msm fd900000.mdss: pp done time out, lm=0
>
> The display still works.
>
> I see that mdp5_flush_commit() was introduced in commit 9f6b65642bd2
> ("drm/msm: add kms->flush_commit()") with a TODO comment and the commit
> description mentions flushing registers. I assume that this is the
> proper fix. If so, can you point me to where these registers are
> defined and I can work on the mdp5 implementation.

See mdp5_ctl_commit(), which writes the CTL_FLUSH registers.. the idea
would be to defer writing CTL_FLUSH[ctl_id] = flush_mask until
kms->flush() (which happens from a timer shortly before vblank).

But I think the async flush case should not come up with fbcon?  It
was really added to cope with hwcursor updates (and userspace that
assumes it can do an unlimited # of cursor updates per frame).. the
intention was that nothing should change in the sequence for mdp5 (but
I guess that was not the case).

BR,
-R
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-05 10:08     ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-05 10:08 UTC (permalink / raw)
  To: Rob Clark
  Cc: Rob Clark, freedreno, Sean Paul, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm

On Mon, Nov 04, 2019 at 04:19:07PM -0800, Rob Clark wrote:
> On Mon, Nov 4, 2019 at 4:01 PM Brian Masney <masneyb@onstation.org> wrote:
> >
> > Hey Rob,
> >
> > Since commit 2d99ced787e3 ("drm/msm: async commit support"), the frame
> > buffer console on my Nexus 5 began throwing these errors:
> >
> > msm fd900000.mdss: pp done time out, lm=0
> >
> > The display still works.
> >
> > I see that mdp5_flush_commit() was introduced in commit 9f6b65642bd2
> > ("drm/msm: add kms->flush_commit()") with a TODO comment and the commit
> > description mentions flushing registers. I assume that this is the
> > proper fix. If so, can you point me to where these registers are
> > defined and I can work on the mdp5 implementation.
> 
> See mdp5_ctl_commit(), which writes the CTL_FLUSH registers.. the idea
> would be to defer writing CTL_FLUSH[ctl_id] = flush_mask until
> kms->flush() (which happens from a timer shortly before vblank).
> 
> But I think the async flush case should not come up with fbcon?  It
> was really added to cope with hwcursor updates (and userspace that
> assumes it can do an unlimited # of cursor updates per frame).. the
> intention was that nothing should change in the sequence for mdp5 (but
> I guess that was not the case).

The 'pp done time out' errors go away if I revert the following three
commits:

cd6d923167b1 ("drm/msm/dpu: async commit support")
d934a712c5e6 ("drm/msm: add atomic traces")
2d99ced787e3 ("drm/msm: async commit support")

I reverted the first one to fix a compiler error, and the second one so
that the last patch can be reverted without any merge conflicts.

I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
buffer dance around the screen like its out of sync. I renamed
crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
declaration. Here's the relevant part of what I tried:

--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
@@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
 
 static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
 {
-       /* TODO */
+       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
+       struct drm_crtc *crtc;
+
+       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
+               if (!crtc->state->active)
+                       continue;
+
+               mdp5_crtc_flush_all(crtc);
+       }
 }

Any tips would be appreciated.

Brian

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-05 10:08     ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-05 10:08 UTC (permalink / raw)
  To: Rob Clark
  Cc: Rob Clark, freedreno, Linux Kernel Mailing List, dri-devel,
	linux-arm-msm, Sean Paul

On Mon, Nov 04, 2019 at 04:19:07PM -0800, Rob Clark wrote:
> On Mon, Nov 4, 2019 at 4:01 PM Brian Masney <masneyb@onstation.org> wrote:
> >
> > Hey Rob,
> >
> > Since commit 2d99ced787e3 ("drm/msm: async commit support"), the frame
> > buffer console on my Nexus 5 began throwing these errors:
> >
> > msm fd900000.mdss: pp done time out, lm=0
> >
> > The display still works.
> >
> > I see that mdp5_flush_commit() was introduced in commit 9f6b65642bd2
> > ("drm/msm: add kms->flush_commit()") with a TODO comment and the commit
> > description mentions flushing registers. I assume that this is the
> > proper fix. If so, can you point me to where these registers are
> > defined and I can work on the mdp5 implementation.
> 
> See mdp5_ctl_commit(), which writes the CTL_FLUSH registers.. the idea
> would be to defer writing CTL_FLUSH[ctl_id] = flush_mask until
> kms->flush() (which happens from a timer shortly before vblank).
> 
> But I think the async flush case should not come up with fbcon?  It
> was really added to cope with hwcursor updates (and userspace that
> assumes it can do an unlimited # of cursor updates per frame).. the
> intention was that nothing should change in the sequence for mdp5 (but
> I guess that was not the case).

The 'pp done time out' errors go away if I revert the following three
commits:

cd6d923167b1 ("drm/msm/dpu: async commit support")
d934a712c5e6 ("drm/msm: add atomic traces")
2d99ced787e3 ("drm/msm: async commit support")

I reverted the first one to fix a compiler error, and the second one so
that the last patch can be reverted without any merge conflicts.

I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
buffer dance around the screen like its out of sync. I renamed
crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
declaration. Here's the relevant part of what I tried:

--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
@@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
 
 static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
 {
-       /* TODO */
+       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
+       struct drm_crtc *crtc;
+
+       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
+               if (!crtc->state->active)
+                       continue;
+
+               mdp5_crtc_flush_all(crtc);
+       }
 }

Any tips would be appreciated.

Brian
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
  2019-11-05 10:08     ` Brian Masney
@ 2019-11-05 16:23       ` Rob Clark
  -1 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-05 16:23 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, freedreno, Sean Paul, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm

On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
>
> On Mon, Nov 04, 2019 at 04:19:07PM -0800, Rob Clark wrote:
> > On Mon, Nov 4, 2019 at 4:01 PM Brian Masney <masneyb@onstation.org> wrote:
> > >
> > > Hey Rob,
> > >
> > > Since commit 2d99ced787e3 ("drm/msm: async commit support"), the frame
> > > buffer console on my Nexus 5 began throwing these errors:
> > >
> > > msm fd900000.mdss: pp done time out, lm=0
> > >
> > > The display still works.
> > >
> > > I see that mdp5_flush_commit() was introduced in commit 9f6b65642bd2
> > > ("drm/msm: add kms->flush_commit()") with a TODO comment and the commit
> > > description mentions flushing registers. I assume that this is the
> > > proper fix. If so, can you point me to where these registers are
> > > defined and I can work on the mdp5 implementation.
> >
> > See mdp5_ctl_commit(), which writes the CTL_FLUSH registers.. the idea
> > would be to defer writing CTL_FLUSH[ctl_id] = flush_mask until
> > kms->flush() (which happens from a timer shortly before vblank).
> >
> > But I think the async flush case should not come up with fbcon?  It
> > was really added to cope with hwcursor updates (and userspace that
> > assumes it can do an unlimited # of cursor updates per frame).. the
> > intention was that nothing should change in the sequence for mdp5 (but
> > I guess that was not the case).
>
> The 'pp done time out' errors go away if I revert the following three
> commits:
>
> cd6d923167b1 ("drm/msm/dpu: async commit support")
> d934a712c5e6 ("drm/msm: add atomic traces")
> 2d99ced787e3 ("drm/msm: async commit support")
>
> I reverted the first one to fix a compiler error, and the second one so
> that the last patch can be reverted without any merge conflicts.
>
> I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> buffer dance around the screen like its out of sync. I renamed
> crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> declaration. Here's the relevant part of what I tried:
>
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
>
>  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
>  {
> -       /* TODO */
> +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> +       struct drm_crtc *crtc;
> +
> +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> +               if (!crtc->state->active)
> +                       continue;
> +
> +               mdp5_crtc_flush_all(crtc);
> +       }
>  }
>
> Any tips would be appreciated.


I think this is along the lines of what we need to enable async commit
for mdp5 (but also removing the flush from the atomic-commit path)..
the principle behind the async commit is to do all the atomic state
commit normally, but defer writing the flush bits.  This way, if you
get another async update before the next vblank, you just apply it
immediately instead of waiting for vblank.

But I guess you are on a command mode panel, if I remember?  Which is
a case I didn't have a way to test.  And I'm not entirely about how
kms_funcs->vsync_time() should be implemented for cmd mode panels.

That all said, I think we should first fix what is broken, before
worrying about extending async commit support to mdp5.. which
shouldn't hit the async==true path, due to not implementing
kms_funcs->vsync_time().

What I think is going on is that, in the cmd mode case,
mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
which waits for a pp-done irq regardless of whether there is a flush
in progress.  Since there is no flush pending, the irq never comes.
But the expectation is that kms_funcs->wait_flush() returns
immediately if there is nothing to wait for.

BR,
-R

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-05 16:23       ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-05 16:23 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, freedreno, Linux Kernel Mailing List, dri-devel,
	linux-arm-msm, Sean Paul

On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
>
> On Mon, Nov 04, 2019 at 04:19:07PM -0800, Rob Clark wrote:
> > On Mon, Nov 4, 2019 at 4:01 PM Brian Masney <masneyb@onstation.org> wrote:
> > >
> > > Hey Rob,
> > >
> > > Since commit 2d99ced787e3 ("drm/msm: async commit support"), the frame
> > > buffer console on my Nexus 5 began throwing these errors:
> > >
> > > msm fd900000.mdss: pp done time out, lm=0
> > >
> > > The display still works.
> > >
> > > I see that mdp5_flush_commit() was introduced in commit 9f6b65642bd2
> > > ("drm/msm: add kms->flush_commit()") with a TODO comment and the commit
> > > description mentions flushing registers. I assume that this is the
> > > proper fix. If so, can you point me to where these registers are
> > > defined and I can work on the mdp5 implementation.
> >
> > See mdp5_ctl_commit(), which writes the CTL_FLUSH registers.. the idea
> > would be to defer writing CTL_FLUSH[ctl_id] = flush_mask until
> > kms->flush() (which happens from a timer shortly before vblank).
> >
> > But I think the async flush case should not come up with fbcon?  It
> > was really added to cope with hwcursor updates (and userspace that
> > assumes it can do an unlimited # of cursor updates per frame).. the
> > intention was that nothing should change in the sequence for mdp5 (but
> > I guess that was not the case).
>
> The 'pp done time out' errors go away if I revert the following three
> commits:
>
> cd6d923167b1 ("drm/msm/dpu: async commit support")
> d934a712c5e6 ("drm/msm: add atomic traces")
> 2d99ced787e3 ("drm/msm: async commit support")
>
> I reverted the first one to fix a compiler error, and the second one so
> that the last patch can be reverted without any merge conflicts.
>
> I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> buffer dance around the screen like its out of sync. I renamed
> crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> declaration. Here's the relevant part of what I tried:
>
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
>
>  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
>  {
> -       /* TODO */
> +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> +       struct drm_crtc *crtc;
> +
> +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> +               if (!crtc->state->active)
> +                       continue;
> +
> +               mdp5_crtc_flush_all(crtc);
> +       }
>  }
>
> Any tips would be appreciated.


I think this is along the lines of what we need to enable async commit
for mdp5 (but also removing the flush from the atomic-commit path)..
the principle behind the async commit is to do all the atomic state
commit normally, but defer writing the flush bits.  This way, if you
get another async update before the next vblank, you just apply it
immediately instead of waiting for vblank.

But I guess you are on a command mode panel, if I remember?  Which is
a case I didn't have a way to test.  And I'm not entirely about how
kms_funcs->vsync_time() should be implemented for cmd mode panels.

That all said, I think we should first fix what is broken, before
worrying about extending async commit support to mdp5.. which
shouldn't hit the async==true path, due to not implementing
kms_funcs->vsync_time().

What I think is going on is that, in the cmd mode case,
mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
which waits for a pp-done irq regardless of whether there is a flush
in progress.  Since there is no flush pending, the irq never comes.
But the expectation is that kms_funcs->wait_flush() returns
immediately if there is nothing to wait for.

BR,
-R
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-06  9:13         ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-06  9:13 UTC (permalink / raw)
  To: Rob Clark
  Cc: Rob Clark, freedreno, Sean Paul, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm

On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > The 'pp done time out' errors go away if I revert the following three
> > commits:
> >
> > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > d934a712c5e6 ("drm/msm: add atomic traces")
> > 2d99ced787e3 ("drm/msm: async commit support")
> >
> > I reverted the first one to fix a compiler error, and the second one so
> > that the last patch can be reverted without any merge conflicts.
> >
> > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > buffer dance around the screen like its out of sync. I renamed
> > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > declaration. Here's the relevant part of what I tried:
> >
> > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> >
> >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> >  {
> > -       /* TODO */
> > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > +       struct drm_crtc *crtc;
> > +
> > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > +               if (!crtc->state->active)
> > +                       continue;
> > +
> > +               mdp5_crtc_flush_all(crtc);
> > +       }
> >  }
> >
> > Any tips would be appreciated.
> 
> 
> I think this is along the lines of what we need to enable async commit
> for mdp5 (but also removing the flush from the atomic-commit path)..
> the principle behind the async commit is to do all the atomic state
> commit normally, but defer writing the flush bits.  This way, if you
> get another async update before the next vblank, you just apply it
> immediately instead of waiting for vblank.
> 
> But I guess you are on a command mode panel, if I remember?  Which is
> a case I didn't have a way to test.  And I'm not entirely about how
> kms_funcs->vsync_time() should be implemented for cmd mode panels.

Yes, this is a command-mode panel and there's no hardware frame counter
available. The key to getting the display working on this phone was this
patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf

> That all said, I think we should first fix what is broken, before
> worrying about extending async commit support to mdp5.. which
> shouldn't hit the async==true path, due to not implementing
> kms_funcs->vsync_time().
> 
> What I think is going on is that, in the cmd mode case,
> mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> which waits for a pp-done irq regardless of whether there is a flush
> in progress.  Since there is no flush pending, the irq never comes.
> But the expectation is that kms_funcs->wait_flush() returns
> immediately if there is nothing to wait for.

I don't think that's happening in this case. I added some pr_info()
statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
Here's the first two sets of messages that appear in dmesg:

[   14.018907] msm fd900000.mdss: pp done time out, lm=0
[   14.018993] request_pp_done_pending: HERE
[   14.074208] mdp5_crtc_pp_done_irq: HERE
[   14.074368] Console: switching to colour frame buffer device 135x120
[   14.138938] msm fd900000.mdss: pp done time out, lm=0
[   14.139021] request_pp_done_pending: HERE
[   14.158097] mdp5_crtc_pp_done_irq: HERE

The messages go on like this with the same pattern.

I tried two different changes:

1) I moved the request_pp_done_pending() and corresponding if statement
   from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().

2) I increased the timeout in wait_for_completion_timeout() by several
   increments; all the way to 5 seconds.

I haven't dug into the new code anymore.

Brian

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-06  9:13         ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-06  9:13 UTC (permalink / raw)
  To: Rob Clark
  Cc: Rob Clark, freedreno, Linux Kernel Mailing List, dri-devel,
	linux-arm-msm, Sean Paul

On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > The 'pp done time out' errors go away if I revert the following three
> > commits:
> >
> > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > d934a712c5e6 ("drm/msm: add atomic traces")
> > 2d99ced787e3 ("drm/msm: async commit support")
> >
> > I reverted the first one to fix a compiler error, and the second one so
> > that the last patch can be reverted without any merge conflicts.
> >
> > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > buffer dance around the screen like its out of sync. I renamed
> > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > declaration. Here's the relevant part of what I tried:
> >
> > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> >
> >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> >  {
> > -       /* TODO */
> > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > +       struct drm_crtc *crtc;
> > +
> > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > +               if (!crtc->state->active)
> > +                       continue;
> > +
> > +               mdp5_crtc_flush_all(crtc);
> > +       }
> >  }
> >
> > Any tips would be appreciated.
> 
> 
> I think this is along the lines of what we need to enable async commit
> for mdp5 (but also removing the flush from the atomic-commit path)..
> the principle behind the async commit is to do all the atomic state
> commit normally, but defer writing the flush bits.  This way, if you
> get another async update before the next vblank, you just apply it
> immediately instead of waiting for vblank.
> 
> But I guess you are on a command mode panel, if I remember?  Which is
> a case I didn't have a way to test.  And I'm not entirely about how
> kms_funcs->vsync_time() should be implemented for cmd mode panels.

Yes, this is a command-mode panel and there's no hardware frame counter
available. The key to getting the display working on this phone was this
patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf

> That all said, I think we should first fix what is broken, before
> worrying about extending async commit support to mdp5.. which
> shouldn't hit the async==true path, due to not implementing
> kms_funcs->vsync_time().
> 
> What I think is going on is that, in the cmd mode case,
> mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> which waits for a pp-done irq regardless of whether there is a flush
> in progress.  Since there is no flush pending, the irq never comes.
> But the expectation is that kms_funcs->wait_flush() returns
> immediately if there is nothing to wait for.

I don't think that's happening in this case. I added some pr_info()
statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
Here's the first two sets of messages that appear in dmesg:

[   14.018907] msm fd900000.mdss: pp done time out, lm=0
[   14.018993] request_pp_done_pending: HERE
[   14.074208] mdp5_crtc_pp_done_irq: HERE
[   14.074368] Console: switching to colour frame buffer device 135x120
[   14.138938] msm fd900000.mdss: pp done time out, lm=0
[   14.139021] request_pp_done_pending: HERE
[   14.158097] mdp5_crtc_pp_done_irq: HERE

The messages go on like this with the same pattern.

I tried two different changes:

1) I moved the request_pp_done_pending() and corresponding if statement
   from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().

2) I increased the timeout in wait_for_completion_timeout() by several
   increments; all the way to 5 seconds.

I haven't dug into the new code anymore.

Brian
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-06 16:29           ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-06 16:29 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, freedreno, Sean Paul, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm

On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
>
> On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > The 'pp done time out' errors go away if I revert the following three
> > > commits:
> > >
> > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > 2d99ced787e3 ("drm/msm: async commit support")
> > >
> > > I reverted the first one to fix a compiler error, and the second one so
> > > that the last patch can be reverted without any merge conflicts.
> > >
> > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > buffer dance around the screen like its out of sync. I renamed
> > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > declaration. Here's the relevant part of what I tried:
> > >
> > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > >
> > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > >  {
> > > -       /* TODO */
> > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > +       struct drm_crtc *crtc;
> > > +
> > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > +               if (!crtc->state->active)
> > > +                       continue;
> > > +
> > > +               mdp5_crtc_flush_all(crtc);
> > > +       }
> > >  }
> > >
> > > Any tips would be appreciated.
> >
> >
> > I think this is along the lines of what we need to enable async commit
> > for mdp5 (but also removing the flush from the atomic-commit path)..
> > the principle behind the async commit is to do all the atomic state
> > commit normally, but defer writing the flush bits.  This way, if you
> > get another async update before the next vblank, you just apply it
> > immediately instead of waiting for vblank.
> >
> > But I guess you are on a command mode panel, if I remember?  Which is
> > a case I didn't have a way to test.  And I'm not entirely about how
> > kms_funcs->vsync_time() should be implemented for cmd mode panels.
>
> Yes, this is a command-mode panel and there's no hardware frame counter
> available. The key to getting the display working on this phone was this
> patch:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
>
> > That all said, I think we should first fix what is broken, before
> > worrying about extending async commit support to mdp5.. which
> > shouldn't hit the async==true path, due to not implementing
> > kms_funcs->vsync_time().
> >
> > What I think is going on is that, in the cmd mode case,
> > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > which waits for a pp-done irq regardless of whether there is a flush
> > in progress.  Since there is no flush pending, the irq never comes.
> > But the expectation is that kms_funcs->wait_flush() returns
> > immediately if there is nothing to wait for.
>
> I don't think that's happening in this case. I added some pr_info()
> statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> Here's the first two sets of messages that appear in dmesg:
>
> [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> [   14.018993] request_pp_done_pending: HERE
> [   14.074208] mdp5_crtc_pp_done_irq: HERE
> [   14.074368] Console: switching to colour frame buffer device 135x120
> [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> [   14.139021] request_pp_done_pending: HERE
> [   14.158097] mdp5_crtc_pp_done_irq: HERE
>
> The messages go on like this with the same pattern.
>
> I tried two different changes:
>
> 1) I moved the request_pp_done_pending() and corresponding if statement
>    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
>
> 2) I increased the timeout in wait_for_completion_timeout() by several
>    increments; all the way to 5 seconds.

increasing the timeout won't help, because the pp-done irq has already
come at the point where we wait for it..

maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
before requesting, and false when we get the irq.. and then
mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..

BR,
-R

> I haven't dug into the new code anymore.
>
> Brian

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-06 16:29           ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-06 16:29 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, freedreno, Linux Kernel Mailing List, dri-devel,
	linux-arm-msm, Sean Paul

On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
>
> On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > The 'pp done time out' errors go away if I revert the following three
> > > commits:
> > >
> > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > 2d99ced787e3 ("drm/msm: async commit support")
> > >
> > > I reverted the first one to fix a compiler error, and the second one so
> > > that the last patch can be reverted without any merge conflicts.
> > >
> > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > buffer dance around the screen like its out of sync. I renamed
> > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > declaration. Here's the relevant part of what I tried:
> > >
> > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > >
> > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > >  {
> > > -       /* TODO */
> > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > +       struct drm_crtc *crtc;
> > > +
> > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > +               if (!crtc->state->active)
> > > +                       continue;
> > > +
> > > +               mdp5_crtc_flush_all(crtc);
> > > +       }
> > >  }
> > >
> > > Any tips would be appreciated.
> >
> >
> > I think this is along the lines of what we need to enable async commit
> > for mdp5 (but also removing the flush from the atomic-commit path)..
> > the principle behind the async commit is to do all the atomic state
> > commit normally, but defer writing the flush bits.  This way, if you
> > get another async update before the next vblank, you just apply it
> > immediately instead of waiting for vblank.
> >
> > But I guess you are on a command mode panel, if I remember?  Which is
> > a case I didn't have a way to test.  And I'm not entirely about how
> > kms_funcs->vsync_time() should be implemented for cmd mode panels.
>
> Yes, this is a command-mode panel and there's no hardware frame counter
> available. The key to getting the display working on this phone was this
> patch:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
>
> > That all said, I think we should first fix what is broken, before
> > worrying about extending async commit support to mdp5.. which
> > shouldn't hit the async==true path, due to not implementing
> > kms_funcs->vsync_time().
> >
> > What I think is going on is that, in the cmd mode case,
> > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > which waits for a pp-done irq regardless of whether there is a flush
> > in progress.  Since there is no flush pending, the irq never comes.
> > But the expectation is that kms_funcs->wait_flush() returns
> > immediately if there is nothing to wait for.
>
> I don't think that's happening in this case. I added some pr_info()
> statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> Here's the first two sets of messages that appear in dmesg:
>
> [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> [   14.018993] request_pp_done_pending: HERE
> [   14.074208] mdp5_crtc_pp_done_irq: HERE
> [   14.074368] Console: switching to colour frame buffer device 135x120
> [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> [   14.139021] request_pp_done_pending: HERE
> [   14.158097] mdp5_crtc_pp_done_irq: HERE
>
> The messages go on like this with the same pattern.
>
> I tried two different changes:
>
> 1) I moved the request_pp_done_pending() and corresponding if statement
>    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
>
> 2) I increased the timeout in wait_for_completion_timeout() by several
>    increments; all the way to 5 seconds.

increasing the timeout won't help, because the pp-done irq has already
come at the point where we wait for it..

maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
before requesting, and false when we get the irq.. and then
mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..

BR,
-R

> I haven't dug into the new code anymore.
>
> Brian
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-06 16:29           ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-06 16:29 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, freedreno, Linux Kernel Mailing List, dri-devel,
	linux-arm-msm, Sean Paul

On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
>
> On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > The 'pp done time out' errors go away if I revert the following three
> > > commits:
> > >
> > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > 2d99ced787e3 ("drm/msm: async commit support")
> > >
> > > I reverted the first one to fix a compiler error, and the second one so
> > > that the last patch can be reverted without any merge conflicts.
> > >
> > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > buffer dance around the screen like its out of sync. I renamed
> > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > declaration. Here's the relevant part of what I tried:
> > >
> > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > >
> > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > >  {
> > > -       /* TODO */
> > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > +       struct drm_crtc *crtc;
> > > +
> > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > +               if (!crtc->state->active)
> > > +                       continue;
> > > +
> > > +               mdp5_crtc_flush_all(crtc);
> > > +       }
> > >  }
> > >
> > > Any tips would be appreciated.
> >
> >
> > I think this is along the lines of what we need to enable async commit
> > for mdp5 (but also removing the flush from the atomic-commit path)..
> > the principle behind the async commit is to do all the atomic state
> > commit normally, but defer writing the flush bits.  This way, if you
> > get another async update before the next vblank, you just apply it
> > immediately instead of waiting for vblank.
> >
> > But I guess you are on a command mode panel, if I remember?  Which is
> > a case I didn't have a way to test.  And I'm not entirely about how
> > kms_funcs->vsync_time() should be implemented for cmd mode panels.
>
> Yes, this is a command-mode panel and there's no hardware frame counter
> available. The key to getting the display working on this phone was this
> patch:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
>
> > That all said, I think we should first fix what is broken, before
> > worrying about extending async commit support to mdp5.. which
> > shouldn't hit the async==true path, due to not implementing
> > kms_funcs->vsync_time().
> >
> > What I think is going on is that, in the cmd mode case,
> > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > which waits for a pp-done irq regardless of whether there is a flush
> > in progress.  Since there is no flush pending, the irq never comes.
> > But the expectation is that kms_funcs->wait_flush() returns
> > immediately if there is nothing to wait for.
>
> I don't think that's happening in this case. I added some pr_info()
> statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> Here's the first two sets of messages that appear in dmesg:
>
> [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> [   14.018993] request_pp_done_pending: HERE
> [   14.074208] mdp5_crtc_pp_done_irq: HERE
> [   14.074368] Console: switching to colour frame buffer device 135x120
> [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> [   14.139021] request_pp_done_pending: HERE
> [   14.158097] mdp5_crtc_pp_done_irq: HERE
>
> The messages go on like this with the same pattern.
>
> I tried two different changes:
>
> 1) I moved the request_pp_done_pending() and corresponding if statement
>    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
>
> 2) I increased the timeout in wait_for_completion_timeout() by several
>    increments; all the way to 5 seconds.

increasing the timeout won't help, because the pp-done irq has already
come at the point where we wait for it..

maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
before requesting, and false when we get the irq.. and then
mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..

BR,
-R

> I haven't dug into the new code anymore.
>
> Brian
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-06 16:47             ` Jeffrey Hugo
  0 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Hugo @ 2019-11-06 16:47 UTC (permalink / raw)
  To: Rob Clark
  Cc: Brian Masney, Rob Clark, freedreno, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm, Sean Paul

On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
>
> On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> >
> > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > The 'pp done time out' errors go away if I revert the following three
> > > > commits:
> > > >
> > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > >
> > > > I reverted the first one to fix a compiler error, and the second one so
> > > > that the last patch can be reverted without any merge conflicts.
> > > >
> > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > buffer dance around the screen like its out of sync. I renamed
> > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > declaration. Here's the relevant part of what I tried:
> > > >
> > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > >
> > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > >  {
> > > > -       /* TODO */
> > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > +       struct drm_crtc *crtc;
> > > > +
> > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > +               if (!crtc->state->active)
> > > > +                       continue;
> > > > +
> > > > +               mdp5_crtc_flush_all(crtc);
> > > > +       }
> > > >  }
> > > >
> > > > Any tips would be appreciated.
> > >
> > >
> > > I think this is along the lines of what we need to enable async commit
> > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > the principle behind the async commit is to do all the atomic state
> > > commit normally, but defer writing the flush bits.  This way, if you
> > > get another async update before the next vblank, you just apply it
> > > immediately instead of waiting for vblank.
> > >
> > > But I guess you are on a command mode panel, if I remember?  Which is
> > > a case I didn't have a way to test.  And I'm not entirely about how
> > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> >
> > Yes, this is a command-mode panel and there's no hardware frame counter
> > available. The key to getting the display working on this phone was this
> > patch:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> >
> > > That all said, I think we should first fix what is broken, before
> > > worrying about extending async commit support to mdp5.. which
> > > shouldn't hit the async==true path, due to not implementing
> > > kms_funcs->vsync_time().
> > >
> > > What I think is going on is that, in the cmd mode case,
> > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > which waits for a pp-done irq regardless of whether there is a flush
> > > in progress.  Since there is no flush pending, the irq never comes.
> > > But the expectation is that kms_funcs->wait_flush() returns
> > > immediately if there is nothing to wait for.
> >
> > I don't think that's happening in this case. I added some pr_info()
> > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > Here's the first two sets of messages that appear in dmesg:
> >
> > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > [   14.018993] request_pp_done_pending: HERE
> > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > [   14.074368] Console: switching to colour frame buffer device 135x120
> > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > [   14.139021] request_pp_done_pending: HERE
> > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> >
> > The messages go on like this with the same pattern.
> >
> > I tried two different changes:
> >
> > 1) I moved the request_pp_done_pending() and corresponding if statement
> >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> >
> > 2) I increased the timeout in wait_for_completion_timeout() by several
> >    increments; all the way to 5 seconds.
>
> increasing the timeout won't help, because the pp-done irq has already
> come at the point where we wait for it..
>
> maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> before requesting, and false when we get the irq.. and then
> mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..

On the otherhand, what about trying to make command mode panels
resemble video mode panels slightly?  Video mode panels have a vsync
counter in hardware, which is missing from command mode - however it
seems like the driver/drm framework would prefer such a counter.
Would it be a reasonable idea to make a software counter, and just
increment it every time the pp_done irq is triggered?

I'm just thinking that we'll avoid issues long term by trying to make
the code common, rather than diverging it for the two modes.

>
> BR,
> -R
>
> > I haven't dug into the new code anymore.
> >
> > Brian
> _______________________________________________
> Freedreno mailing list
> Freedreno@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-06 16:47             ` Jeffrey Hugo
  0 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Hugo @ 2019-11-06 16:47 UTC (permalink / raw)
  To: Rob Clark
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, freedreno, Brian Masney

On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
>
> On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> >
> > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > The 'pp done time out' errors go away if I revert the following three
> > > > commits:
> > > >
> > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > >
> > > > I reverted the first one to fix a compiler error, and the second one so
> > > > that the last patch can be reverted without any merge conflicts.
> > > >
> > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > buffer dance around the screen like its out of sync. I renamed
> > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > declaration. Here's the relevant part of what I tried:
> > > >
> > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > >
> > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > >  {
> > > > -       /* TODO */
> > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > +       struct drm_crtc *crtc;
> > > > +
> > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > +               if (!crtc->state->active)
> > > > +                       continue;
> > > > +
> > > > +               mdp5_crtc_flush_all(crtc);
> > > > +       }
> > > >  }
> > > >
> > > > Any tips would be appreciated.
> > >
> > >
> > > I think this is along the lines of what we need to enable async commit
> > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > the principle behind the async commit is to do all the atomic state
> > > commit normally, but defer writing the flush bits.  This way, if you
> > > get another async update before the next vblank, you just apply it
> > > immediately instead of waiting for vblank.
> > >
> > > But I guess you are on a command mode panel, if I remember?  Which is
> > > a case I didn't have a way to test.  And I'm not entirely about how
> > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> >
> > Yes, this is a command-mode panel and there's no hardware frame counter
> > available. The key to getting the display working on this phone was this
> > patch:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> >
> > > That all said, I think we should first fix what is broken, before
> > > worrying about extending async commit support to mdp5.. which
> > > shouldn't hit the async==true path, due to not implementing
> > > kms_funcs->vsync_time().
> > >
> > > What I think is going on is that, in the cmd mode case,
> > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > which waits for a pp-done irq regardless of whether there is a flush
> > > in progress.  Since there is no flush pending, the irq never comes.
> > > But the expectation is that kms_funcs->wait_flush() returns
> > > immediately if there is nothing to wait for.
> >
> > I don't think that's happening in this case. I added some pr_info()
> > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > Here's the first two sets of messages that appear in dmesg:
> >
> > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > [   14.018993] request_pp_done_pending: HERE
> > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > [   14.074368] Console: switching to colour frame buffer device 135x120
> > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > [   14.139021] request_pp_done_pending: HERE
> > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> >
> > The messages go on like this with the same pattern.
> >
> > I tried two different changes:
> >
> > 1) I moved the request_pp_done_pending() and corresponding if statement
> >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> >
> > 2) I increased the timeout in wait_for_completion_timeout() by several
> >    increments; all the way to 5 seconds.
>
> increasing the timeout won't help, because the pp-done irq has already
> come at the point where we wait for it..
>
> maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> before requesting, and false when we get the irq.. and then
> mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..

On the otherhand, what about trying to make command mode panels
resemble video mode panels slightly?  Video mode panels have a vsync
counter in hardware, which is missing from command mode - however it
seems like the driver/drm framework would prefer such a counter.
Would it be a reasonable idea to make a software counter, and just
increment it every time the pp_done irq is triggered?

I'm just thinking that we'll avoid issues long term by trying to make
the code common, rather than diverging it for the two modes.

>
> BR,
> -R
>
> > I haven't dug into the new code anymore.
> >
> > Brian
> _______________________________________________
> Freedreno mailing list
> Freedreno@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedreno
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-06 16:58               ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-06 16:58 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Brian Masney, Rob Clark, freedreno, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm, Sean Paul

On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
>
> On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > >
> > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > commits:
> > > > >
> > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > >
> > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > that the last patch can be reverted without any merge conflicts.
> > > > >
> > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > declaration. Here's the relevant part of what I tried:
> > > > >
> > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > >
> > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > >  {
> > > > > -       /* TODO */
> > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > +       struct drm_crtc *crtc;
> > > > > +
> > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > +               if (!crtc->state->active)
> > > > > +                       continue;
> > > > > +
> > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > +       }
> > > > >  }
> > > > >
> > > > > Any tips would be appreciated.
> > > >
> > > >
> > > > I think this is along the lines of what we need to enable async commit
> > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > the principle behind the async commit is to do all the atomic state
> > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > get another async update before the next vblank, you just apply it
> > > > immediately instead of waiting for vblank.
> > > >
> > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > >
> > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > available. The key to getting the display working on this phone was this
> > > patch:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > >
> > > > That all said, I think we should first fix what is broken, before
> > > > worrying about extending async commit support to mdp5.. which
> > > > shouldn't hit the async==true path, due to not implementing
> > > > kms_funcs->vsync_time().
> > > >
> > > > What I think is going on is that, in the cmd mode case,
> > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > immediately if there is nothing to wait for.
> > >
> > > I don't think that's happening in this case. I added some pr_info()
> > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > Here's the first two sets of messages that appear in dmesg:
> > >
> > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > [   14.018993] request_pp_done_pending: HERE
> > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > [   14.139021] request_pp_done_pending: HERE
> > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > >
> > > The messages go on like this with the same pattern.
> > >
> > > I tried two different changes:
> > >
> > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > >
> > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > >    increments; all the way to 5 seconds.
> >
> > increasing the timeout won't help, because the pp-done irq has already
> > come at the point where we wait for it..
> >
> > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > before requesting, and false when we get the irq.. and then
> > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
>
> On the otherhand, what about trying to make command mode panels
> resemble video mode panels slightly?  Video mode panels have a vsync
> counter in hardware, which is missing from command mode - however it
> seems like the driver/drm framework would prefer such a counter.
> Would it be a reasonable idea to make a software counter, and just
> increment it every time the pp_done irq is triggered?
>
> I'm just thinking that we'll avoid issues long term by trying to make
> the code common, rather than diverging it for the two modes.
>

*possibly*, but I think we want to account somehow periods where
display is not updated.

fwiw, it isn't that uncommon for userspace to use vblanks to "keep
time" (drive animations for desktop switch, window
maximize/unmaximize, etc).. it could be a surprise when "vblank" is
not periodic.

BR,
-R

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-06 16:58               ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-06 16:58 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, freedreno, Brian Masney

On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
>
> On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > >
> > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > commits:
> > > > >
> > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > >
> > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > that the last patch can be reverted without any merge conflicts.
> > > > >
> > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > declaration. Here's the relevant part of what I tried:
> > > > >
> > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > >
> > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > >  {
> > > > > -       /* TODO */
> > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > +       struct drm_crtc *crtc;
> > > > > +
> > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > +               if (!crtc->state->active)
> > > > > +                       continue;
> > > > > +
> > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > +       }
> > > > >  }
> > > > >
> > > > > Any tips would be appreciated.
> > > >
> > > >
> > > > I think this is along the lines of what we need to enable async commit
> > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > the principle behind the async commit is to do all the atomic state
> > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > get another async update before the next vblank, you just apply it
> > > > immediately instead of waiting for vblank.
> > > >
> > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > >
> > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > available. The key to getting the display working on this phone was this
> > > patch:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > >
> > > > That all said, I think we should first fix what is broken, before
> > > > worrying about extending async commit support to mdp5.. which
> > > > shouldn't hit the async==true path, due to not implementing
> > > > kms_funcs->vsync_time().
> > > >
> > > > What I think is going on is that, in the cmd mode case,
> > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > immediately if there is nothing to wait for.
> > >
> > > I don't think that's happening in this case. I added some pr_info()
> > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > Here's the first two sets of messages that appear in dmesg:
> > >
> > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > [   14.018993] request_pp_done_pending: HERE
> > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > [   14.139021] request_pp_done_pending: HERE
> > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > >
> > > The messages go on like this with the same pattern.
> > >
> > > I tried two different changes:
> > >
> > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > >
> > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > >    increments; all the way to 5 seconds.
> >
> > increasing the timeout won't help, because the pp-done irq has already
> > come at the point where we wait for it..
> >
> > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > before requesting, and false when we get the irq.. and then
> > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
>
> On the otherhand, what about trying to make command mode panels
> resemble video mode panels slightly?  Video mode panels have a vsync
> counter in hardware, which is missing from command mode - however it
> seems like the driver/drm framework would prefer such a counter.
> Would it be a reasonable idea to make a software counter, and just
> increment it every time the pp_done irq is triggered?
>
> I'm just thinking that we'll avoid issues long term by trying to make
> the code common, rather than diverging it for the two modes.
>

*possibly*, but I think we want to account somehow periods where
display is not updated.

fwiw, it isn't that uncommon for userspace to use vblanks to "keep
time" (drive animations for desktop switch, window
maximize/unmaximize, etc).. it could be a surprise when "vblank" is
not periodic.

BR,
-R
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-06 16:58               ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-06 16:58 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, freedreno, Brian Masney

On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
>
> On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > >
> > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > commits:
> > > > >
> > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > >
> > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > that the last patch can be reverted without any merge conflicts.
> > > > >
> > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > declaration. Here's the relevant part of what I tried:
> > > > >
> > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > >
> > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > >  {
> > > > > -       /* TODO */
> > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > +       struct drm_crtc *crtc;
> > > > > +
> > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > +               if (!crtc->state->active)
> > > > > +                       continue;
> > > > > +
> > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > +       }
> > > > >  }
> > > > >
> > > > > Any tips would be appreciated.
> > > >
> > > >
> > > > I think this is along the lines of what we need to enable async commit
> > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > the principle behind the async commit is to do all the atomic state
> > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > get another async update before the next vblank, you just apply it
> > > > immediately instead of waiting for vblank.
> > > >
> > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > >
> > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > available. The key to getting the display working on this phone was this
> > > patch:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > >
> > > > That all said, I think we should first fix what is broken, before
> > > > worrying about extending async commit support to mdp5.. which
> > > > shouldn't hit the async==true path, due to not implementing
> > > > kms_funcs->vsync_time().
> > > >
> > > > What I think is going on is that, in the cmd mode case,
> > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > immediately if there is nothing to wait for.
> > >
> > > I don't think that's happening in this case. I added some pr_info()
> > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > Here's the first two sets of messages that appear in dmesg:
> > >
> > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > [   14.018993] request_pp_done_pending: HERE
> > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > [   14.139021] request_pp_done_pending: HERE
> > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > >
> > > The messages go on like this with the same pattern.
> > >
> > > I tried two different changes:
> > >
> > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > >
> > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > >    increments; all the way to 5 seconds.
> >
> > increasing the timeout won't help, because the pp-done irq has already
> > come at the point where we wait for it..
> >
> > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > before requesting, and false when we get the irq.. and then
> > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
>
> On the otherhand, what about trying to make command mode panels
> resemble video mode panels slightly?  Video mode panels have a vsync
> counter in hardware, which is missing from command mode - however it
> seems like the driver/drm framework would prefer such a counter.
> Would it be a reasonable idea to make a software counter, and just
> increment it every time the pp_done irq is triggered?
>
> I'm just thinking that we'll avoid issues long term by trying to make
> the code common, rather than diverging it for the two modes.
>

*possibly*, but I think we want to account somehow periods where
display is not updated.

fwiw, it isn't that uncommon for userspace to use vblanks to "keep
time" (drive animations for desktop switch, window
maximize/unmaximize, etc).. it could be a surprise when "vblank" is
not periodic.

BR,
-R
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-07 11:10                 ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-07 11:10 UTC (permalink / raw)
  To: Rob Clark
  Cc: Jeffrey Hugo, Rob Clark, freedreno, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm, Sean Paul

On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> >
> > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > >
> > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > >
> > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > commits:
> > > > > >
> > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > >
> > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > >
> > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > declaration. Here's the relevant part of what I tried:
> > > > > >
> > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > >
> > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > >  {
> > > > > > -       /* TODO */
> > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > +       struct drm_crtc *crtc;
> > > > > > +
> > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > +               if (!crtc->state->active)
> > > > > > +                       continue;
> > > > > > +
> > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > +       }
> > > > > >  }
> > > > > >
> > > > > > Any tips would be appreciated.
> > > > >
> > > > >
> > > > > I think this is along the lines of what we need to enable async commit
> > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > the principle behind the async commit is to do all the atomic state
> > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > get another async update before the next vblank, you just apply it
> > > > > immediately instead of waiting for vblank.
> > > > >
> > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > >
> > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > available. The key to getting the display working on this phone was this
> > > > patch:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > >
> > > > > That all said, I think we should first fix what is broken, before
> > > > > worrying about extending async commit support to mdp5.. which
> > > > > shouldn't hit the async==true path, due to not implementing
> > > > > kms_funcs->vsync_time().
> > > > >
> > > > > What I think is going on is that, in the cmd mode case,
> > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > immediately if there is nothing to wait for.
> > > >
> > > > I don't think that's happening in this case. I added some pr_info()
> > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > Here's the first two sets of messages that appear in dmesg:
> > > >
> > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > [   14.018993] request_pp_done_pending: HERE
> > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > [   14.139021] request_pp_done_pending: HERE
> > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > >
> > > > The messages go on like this with the same pattern.
> > > >
> > > > I tried two different changes:
> > > >
> > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > >
> > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > >    increments; all the way to 5 seconds.
> > >
> > > increasing the timeout won't help, because the pp-done irq has already
> > > come at the point where we wait for it..
> > >
> > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > before requesting, and false when we get the irq.. and then
> > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> >
> > On the otherhand, what about trying to make command mode panels
> > resemble video mode panels slightly?  Video mode panels have a vsync
> > counter in hardware, which is missing from command mode - however it
> > seems like the driver/drm framework would prefer such a counter.
> > Would it be a reasonable idea to make a software counter, and just
> > increment it every time the pp_done irq is triggered?
> >
> > I'm just thinking that we'll avoid issues long term by trying to make
> > the code common, rather than diverging it for the two modes.
> >
> 
> *possibly*, but I think we want to account somehow periods where
> display is not updated.
> 
> fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> time" (drive animations for desktop switch, window
> maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> not periodic.

What do you think about using some variation of the current value of
jiffies in the kernel + the number of pp_done IRQs as the software
counter for command-mode panels?

Brian

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-07 11:10                 ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-07 11:10 UTC (permalink / raw)
  To: Rob Clark
  Cc: Rob Clark, Jeffrey Hugo, linux-arm-msm,
	Linux Kernel Mailing List, dri-devel, Sean Paul, freedreno

On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> >
> > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > >
> > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > >
> > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > commits:
> > > > > >
> > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > >
> > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > >
> > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > declaration. Here's the relevant part of what I tried:
> > > > > >
> > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > >
> > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > >  {
> > > > > > -       /* TODO */
> > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > +       struct drm_crtc *crtc;
> > > > > > +
> > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > +               if (!crtc->state->active)
> > > > > > +                       continue;
> > > > > > +
> > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > +       }
> > > > > >  }
> > > > > >
> > > > > > Any tips would be appreciated.
> > > > >
> > > > >
> > > > > I think this is along the lines of what we need to enable async commit
> > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > the principle behind the async commit is to do all the atomic state
> > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > get another async update before the next vblank, you just apply it
> > > > > immediately instead of waiting for vblank.
> > > > >
> > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > >
> > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > available. The key to getting the display working on this phone was this
> > > > patch:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > >
> > > > > That all said, I think we should first fix what is broken, before
> > > > > worrying about extending async commit support to mdp5.. which
> > > > > shouldn't hit the async==true path, due to not implementing
> > > > > kms_funcs->vsync_time().
> > > > >
> > > > > What I think is going on is that, in the cmd mode case,
> > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > immediately if there is nothing to wait for.
> > > >
> > > > I don't think that's happening in this case. I added some pr_info()
> > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > Here's the first two sets of messages that appear in dmesg:
> > > >
> > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > [   14.018993] request_pp_done_pending: HERE
> > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > [   14.139021] request_pp_done_pending: HERE
> > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > >
> > > > The messages go on like this with the same pattern.
> > > >
> > > > I tried two different changes:
> > > >
> > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > >
> > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > >    increments; all the way to 5 seconds.
> > >
> > > increasing the timeout won't help, because the pp-done irq has already
> > > come at the point where we wait for it..
> > >
> > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > before requesting, and false when we get the irq.. and then
> > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> >
> > On the otherhand, what about trying to make command mode panels
> > resemble video mode panels slightly?  Video mode panels have a vsync
> > counter in hardware, which is missing from command mode - however it
> > seems like the driver/drm framework would prefer such a counter.
> > Would it be a reasonable idea to make a software counter, and just
> > increment it every time the pp_done irq is triggered?
> >
> > I'm just thinking that we'll avoid issues long term by trying to make
> > the code common, rather than diverging it for the two modes.
> >
> 
> *possibly*, but I think we want to account somehow periods where
> display is not updated.
> 
> fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> time" (drive animations for desktop switch, window
> maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> not periodic.

What do you think about using some variation of the current value of
jiffies in the kernel + the number of pp_done IRQs as the software
counter for command-mode panels?

Brian
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-07 16:16                   ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-07 16:16 UTC (permalink / raw)
  To: Brian Masney
  Cc: Jeffrey Hugo, Rob Clark, freedreno, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm, Sean Paul

On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
>
> On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > >
> > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > >
> > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > commits:
> > > > > > >
> > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > >
> > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > >
> > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > >
> > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > >
> > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > >  {
> > > > > > > -       /* TODO */
> > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > +       struct drm_crtc *crtc;
> > > > > > > +
> > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > +               if (!crtc->state->active)
> > > > > > > +                       continue;
> > > > > > > +
> > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > +       }
> > > > > > >  }
> > > > > > >
> > > > > > > Any tips would be appreciated.
> > > > > >
> > > > > >
> > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > get another async update before the next vblank, you just apply it
> > > > > > immediately instead of waiting for vblank.
> > > > > >
> > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > >
> > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > available. The key to getting the display working on this phone was this
> > > > > patch:
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > >
> > > > > > That all said, I think we should first fix what is broken, before
> > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > kms_funcs->vsync_time().
> > > > > >
> > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > immediately if there is nothing to wait for.
> > > > >
> > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > Here's the first two sets of messages that appear in dmesg:
> > > > >
> > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > >
> > > > > The messages go on like this with the same pattern.
> > > > >
> > > > > I tried two different changes:
> > > > >
> > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > >
> > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > >    increments; all the way to 5 seconds.
> > > >
> > > > increasing the timeout won't help, because the pp-done irq has already
> > > > come at the point where we wait for it..
> > > >
> > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > before requesting, and false when we get the irq.. and then
> > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > >
> > > On the otherhand, what about trying to make command mode panels
> > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > counter in hardware, which is missing from command mode - however it
> > > seems like the driver/drm framework would prefer such a counter.
> > > Would it be a reasonable idea to make a software counter, and just
> > > increment it every time the pp_done irq is triggered?
> > >
> > > I'm just thinking that we'll avoid issues long term by trying to make
> > > the code common, rather than diverging it for the two modes.
> > >
> >
> > *possibly*, but I think we want to account somehow periods where
> > display is not updated.
> >
> > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > time" (drive animations for desktop switch, window
> > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > not periodic.
>
> What do you think about using some variation of the current value of
> jiffies in the kernel + the number of pp_done IRQs as the software
> counter for command-mode panels?
>

jiffies is probably too coarse.. but we could use monotonic clock, I guess.

But I suppose even a cmd mode panel has a "vblank", it is just
internal the panel.  Do we get the TE interrupt at regular intervals?
AFAIU this would be tied to the panel's internal vblank.

BR,
-R

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-07 16:16                   ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-07 16:16 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, Jeffrey Hugo, linux-arm-msm,
	Linux Kernel Mailing List, dri-devel, Sean Paul, freedreno

On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
>
> On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > >
> > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > >
> > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > commits:
> > > > > > >
> > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > >
> > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > >
> > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > >
> > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > >
> > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > >  {
> > > > > > > -       /* TODO */
> > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > +       struct drm_crtc *crtc;
> > > > > > > +
> > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > +               if (!crtc->state->active)
> > > > > > > +                       continue;
> > > > > > > +
> > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > +       }
> > > > > > >  }
> > > > > > >
> > > > > > > Any tips would be appreciated.
> > > > > >
> > > > > >
> > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > get another async update before the next vblank, you just apply it
> > > > > > immediately instead of waiting for vblank.
> > > > > >
> > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > >
> > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > available. The key to getting the display working on this phone was this
> > > > > patch:
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > >
> > > > > > That all said, I think we should first fix what is broken, before
> > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > kms_funcs->vsync_time().
> > > > > >
> > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > immediately if there is nothing to wait for.
> > > > >
> > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > Here's the first two sets of messages that appear in dmesg:
> > > > >
> > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > >
> > > > > The messages go on like this with the same pattern.
> > > > >
> > > > > I tried two different changes:
> > > > >
> > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > >
> > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > >    increments; all the way to 5 seconds.
> > > >
> > > > increasing the timeout won't help, because the pp-done irq has already
> > > > come at the point where we wait for it..
> > > >
> > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > before requesting, and false when we get the irq.. and then
> > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > >
> > > On the otherhand, what about trying to make command mode panels
> > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > counter in hardware, which is missing from command mode - however it
> > > seems like the driver/drm framework would prefer such a counter.
> > > Would it be a reasonable idea to make a software counter, and just
> > > increment it every time the pp_done irq is triggered?
> > >
> > > I'm just thinking that we'll avoid issues long term by trying to make
> > > the code common, rather than diverging it for the two modes.
> > >
> >
> > *possibly*, but I think we want to account somehow periods where
> > display is not updated.
> >
> > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > time" (drive animations for desktop switch, window
> > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > not periodic.
>
> What do you think about using some variation of the current value of
> jiffies in the kernel + the number of pp_done IRQs as the software
> counter for command-mode panels?
>

jiffies is probably too coarse.. but we could use monotonic clock, I guess.

But I suppose even a cmd mode panel has a "vblank", it is just
internal the panel.  Do we get the TE interrupt at regular intervals?
AFAIU this would be tied to the panel's internal vblank.

BR,
-R
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-07 17:40                     ` Jeffrey Hugo
  0 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Hugo @ 2019-11-07 17:40 UTC (permalink / raw)
  To: Rob Clark
  Cc: Brian Masney, Rob Clark, freedreno, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm, Sean Paul

On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
>
> On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> >
> > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > >
> > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > >
> > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > commits:
> > > > > > > >
> > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > >
> > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > >
> > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > >
> > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > >
> > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > >  {
> > > > > > > > -       /* TODO */
> > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > +
> > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > +               if (!crtc->state->active)
> > > > > > > > +                       continue;
> > > > > > > > +
> > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > +       }
> > > > > > > >  }
> > > > > > > >
> > > > > > > > Any tips would be appreciated.
> > > > > > >
> > > > > > >
> > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > immediately instead of waiting for vblank.
> > > > > > >
> > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > >
> > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > available. The key to getting the display working on this phone was this
> > > > > > patch:
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > >
> > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > kms_funcs->vsync_time().
> > > > > > >
> > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > immediately if there is nothing to wait for.
> > > > > >
> > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > >
> > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > >
> > > > > > The messages go on like this with the same pattern.
> > > > > >
> > > > > > I tried two different changes:
> > > > > >
> > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > >
> > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > >    increments; all the way to 5 seconds.
> > > > >
> > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > come at the point where we wait for it..
> > > > >
> > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > before requesting, and false when we get the irq.. and then
> > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > >
> > > > On the otherhand, what about trying to make command mode panels
> > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > counter in hardware, which is missing from command mode - however it
> > > > seems like the driver/drm framework would prefer such a counter.
> > > > Would it be a reasonable idea to make a software counter, and just
> > > > increment it every time the pp_done irq is triggered?
> > > >
> > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > the code common, rather than diverging it for the two modes.
> > > >
> > >
> > > *possibly*, but I think we want to account somehow periods where
> > > display is not updated.
> > >
> > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > time" (drive animations for desktop switch, window
> > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > not periodic.
> >
> > What do you think about using some variation of the current value of
> > jiffies in the kernel + the number of pp_done IRQs as the software
> > counter for command-mode panels?
> >
>
> jiffies is probably too coarse.. but we could use monotonic clock, I guess.
>
> But I suppose even a cmd mode panel has a "vblank", it is just
> internal the panel.  Do we get the TE interrupt at regular intervals?
> AFAIU this would be tied to the panel's internal vblank.

The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
predates that.
You can get it from the WR_PTR interrupt, but you have to understand
details about your panel to configure that correctly.

>
> BR,
> -R

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-07 17:40                     ` Jeffrey Hugo
  0 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Hugo @ 2019-11-07 17:40 UTC (permalink / raw)
  To: Rob Clark
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, freedreno, Brian Masney

On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
>
> On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> >
> > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > >
> > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > >
> > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > commits:
> > > > > > > >
> > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > >
> > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > >
> > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > >
> > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > >
> > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > >  {
> > > > > > > > -       /* TODO */
> > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > +
> > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > +               if (!crtc->state->active)
> > > > > > > > +                       continue;
> > > > > > > > +
> > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > +       }
> > > > > > > >  }
> > > > > > > >
> > > > > > > > Any tips would be appreciated.
> > > > > > >
> > > > > > >
> > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > immediately instead of waiting for vblank.
> > > > > > >
> > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > >
> > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > available. The key to getting the display working on this phone was this
> > > > > > patch:
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > >
> > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > kms_funcs->vsync_time().
> > > > > > >
> > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > immediately if there is nothing to wait for.
> > > > > >
> > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > >
> > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > >
> > > > > > The messages go on like this with the same pattern.
> > > > > >
> > > > > > I tried two different changes:
> > > > > >
> > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > >
> > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > >    increments; all the way to 5 seconds.
> > > > >
> > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > come at the point where we wait for it..
> > > > >
> > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > before requesting, and false when we get the irq.. and then
> > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > >
> > > > On the otherhand, what about trying to make command mode panels
> > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > counter in hardware, which is missing from command mode - however it
> > > > seems like the driver/drm framework would prefer such a counter.
> > > > Would it be a reasonable idea to make a software counter, and just
> > > > increment it every time the pp_done irq is triggered?
> > > >
> > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > the code common, rather than diverging it for the two modes.
> > > >
> > >
> > > *possibly*, but I think we want to account somehow periods where
> > > display is not updated.
> > >
> > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > time" (drive animations for desktop switch, window
> > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > not periodic.
> >
> > What do you think about using some variation of the current value of
> > jiffies in the kernel + the number of pp_done IRQs as the software
> > counter for command-mode panels?
> >
>
> jiffies is probably too coarse.. but we could use monotonic clock, I guess.
>
> But I suppose even a cmd mode panel has a "vblank", it is just
> internal the panel.  Do we get the TE interrupt at regular intervals?
> AFAIU this would be tied to the panel's internal vblank.

The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
predates that.
You can get it from the WR_PTR interrupt, but you have to understand
details about your panel to configure that correctly.

>
> BR,
> -R
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-08  2:03                       ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-08  2:03 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Rob Clark, Brian Masney, freedreno, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm, Sean Paul

On Thu, Nov 7, 2019 at 9:40 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
>
> On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> > >
> > > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > > >
> > > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > >
> > > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > > commits:
> > > > > > > > >
> > > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > > >
> > > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > > >
> > > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > > >
> > > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > > >
> > > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > > >  {
> > > > > > > > > -       /* TODO */
> > > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > > +
> > > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > > +               if (!crtc->state->active)
> > > > > > > > > +                       continue;
> > > > > > > > > +
> > > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > > +       }
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > Any tips would be appreciated.
> > > > > > > >
> > > > > > > >
> > > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > > immediately instead of waiting for vblank.
> > > > > > > >
> > > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > > >
> > > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > > available. The key to getting the display working on this phone was this
> > > > > > > patch:
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > > >
> > > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > > kms_funcs->vsync_time().
> > > > > > > >
> > > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > > immediately if there is nothing to wait for.
> > > > > > >
> > > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > > >
> > > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > > >
> > > > > > > The messages go on like this with the same pattern.
> > > > > > >
> > > > > > > I tried two different changes:
> > > > > > >
> > > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > > >
> > > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > > >    increments; all the way to 5 seconds.
> > > > > >
> > > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > > come at the point where we wait for it..
> > > > > >
> > > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > > before requesting, and false when we get the irq.. and then
> > > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > > >
> > > > > On the otherhand, what about trying to make command mode panels
> > > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > > counter in hardware, which is missing from command mode - however it
> > > > > seems like the driver/drm framework would prefer such a counter.
> > > > > Would it be a reasonable idea to make a software counter, and just
> > > > > increment it every time the pp_done irq is triggered?
> > > > >
> > > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > > the code common, rather than diverging it for the two modes.
> > > > >
> > > >
> > > > *possibly*, but I think we want to account somehow periods where
> > > > display is not updated.
> > > >
> > > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > > time" (drive animations for desktop switch, window
> > > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > > not periodic.
> > >
> > > What do you think about using some variation of the current value of
> > > jiffies in the kernel + the number of pp_done IRQs as the software
> > > counter for command-mode panels?
> > >
> >
> > jiffies is probably too coarse.. but we could use monotonic clock, I guess.
> >
> > But I suppose even a cmd mode panel has a "vblank", it is just
> > internal the panel.  Do we get the TE interrupt at regular intervals?
> > AFAIU this would be tied to the panel's internal vblank.
>
> The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
> predates that.
> You can get it from the WR_PTR interrupt, but you have to understand
> details about your panel to configure that correctly.

oh, sad.. I kinda assumed it was a pretty common DSI irq since
forever.. I guess the hw is just managing the flow control to prevent
tearing?

Well, anyways, I guess we could just use a free-running timer based on
refresh rate of the panel?

BR,
-R

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-08  2:03                       ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-08  2:03 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Sean Paul, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Rob Clark, freedreno, Brian Masney

On Thu, Nov 7, 2019 at 9:40 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
>
> On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> > >
> > > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > > >
> > > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > >
> > > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > > commits:
> > > > > > > > >
> > > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > > >
> > > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > > >
> > > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > > >
> > > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > > >
> > > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > > >  {
> > > > > > > > > -       /* TODO */
> > > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > > +
> > > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > > +               if (!crtc->state->active)
> > > > > > > > > +                       continue;
> > > > > > > > > +
> > > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > > +       }
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > Any tips would be appreciated.
> > > > > > > >
> > > > > > > >
> > > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > > immediately instead of waiting for vblank.
> > > > > > > >
> > > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > > >
> > > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > > available. The key to getting the display working on this phone was this
> > > > > > > patch:
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > > >
> > > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > > kms_funcs->vsync_time().
> > > > > > > >
> > > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > > immediately if there is nothing to wait for.
> > > > > > >
> > > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > > >
> > > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > > >
> > > > > > > The messages go on like this with the same pattern.
> > > > > > >
> > > > > > > I tried two different changes:
> > > > > > >
> > > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > > >
> > > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > > >    increments; all the way to 5 seconds.
> > > > > >
> > > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > > come at the point where we wait for it..
> > > > > >
> > > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > > before requesting, and false when we get the irq.. and then
> > > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > > >
> > > > > On the otherhand, what about trying to make command mode panels
> > > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > > counter in hardware, which is missing from command mode - however it
> > > > > seems like the driver/drm framework would prefer such a counter.
> > > > > Would it be a reasonable idea to make a software counter, and just
> > > > > increment it every time the pp_done irq is triggered?
> > > > >
> > > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > > the code common, rather than diverging it for the two modes.
> > > > >
> > > >
> > > > *possibly*, but I think we want to account somehow periods where
> > > > display is not updated.
> > > >
> > > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > > time" (drive animations for desktop switch, window
> > > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > > not periodic.
> > >
> > > What do you think about using some variation of the current value of
> > > jiffies in the kernel + the number of pp_done IRQs as the software
> > > counter for command-mode panels?
> > >
> >
> > jiffies is probably too coarse.. but we could use monotonic clock, I guess.
> >
> > But I suppose even a cmd mode panel has a "vblank", it is just
> > internal the panel.  Do we get the TE interrupt at regular intervals?
> > AFAIU this would be tied to the panel's internal vblank.
>
> The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
> predates that.
> You can get it from the WR_PTR interrupt, but you have to understand
> details about your panel to configure that correctly.

oh, sad.. I kinda assumed it was a pretty common DSI irq since
forever.. I guess the hw is just managing the flow control to prevent
tearing?

Well, anyways, I guess we could just use a free-running timer based on
refresh rate of the panel?

BR,
-R
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-08  2:03                       ` Rob Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Rob Clark @ 2019-11-08  2:03 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Sean Paul, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	freedreno, Brian Masney

On Thu, Nov 7, 2019 at 9:40 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
>
> On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> > >
> > > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > > >
> > > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > >
> > > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > > commits:
> > > > > > > > >
> > > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > > >
> > > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > > >
> > > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > > >
> > > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > > >
> > > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > > >  {
> > > > > > > > > -       /* TODO */
> > > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > > +
> > > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > > +               if (!crtc->state->active)
> > > > > > > > > +                       continue;
> > > > > > > > > +
> > > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > > +       }
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > Any tips would be appreciated.
> > > > > > > >
> > > > > > > >
> > > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > > immediately instead of waiting for vblank.
> > > > > > > >
> > > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > > >
> > > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > > available. The key to getting the display working on this phone was this
> > > > > > > patch:
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > > >
> > > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > > kms_funcs->vsync_time().
> > > > > > > >
> > > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > > immediately if there is nothing to wait for.
> > > > > > >
> > > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > > >
> > > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > > >
> > > > > > > The messages go on like this with the same pattern.
> > > > > > >
> > > > > > > I tried two different changes:
> > > > > > >
> > > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > > >
> > > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > > >    increments; all the way to 5 seconds.
> > > > > >
> > > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > > come at the point where we wait for it..
> > > > > >
> > > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > > before requesting, and false when we get the irq.. and then
> > > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > > >
> > > > > On the otherhand, what about trying to make command mode panels
> > > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > > counter in hardware, which is missing from command mode - however it
> > > > > seems like the driver/drm framework would prefer such a counter.
> > > > > Would it be a reasonable idea to make a software counter, and just
> > > > > increment it every time the pp_done irq is triggered?
> > > > >
> > > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > > the code common, rather than diverging it for the two modes.
> > > > >
> > > >
> > > > *possibly*, but I think we want to account somehow periods where
> > > > display is not updated.
> > > >
> > > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > > time" (drive animations for desktop switch, window
> > > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > > not periodic.
> > >
> > > What do you think about using some variation of the current value of
> > > jiffies in the kernel + the number of pp_done IRQs as the software
> > > counter for command-mode panels?
> > >
> >
> > jiffies is probably too coarse.. but we could use monotonic clock, I guess.
> >
> > But I suppose even a cmd mode panel has a "vblank", it is just
> > internal the panel.  Do we get the TE interrupt at regular intervals?
> > AFAIU this would be tied to the panel's internal vblank.
>
> The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
> predates that.
> You can get it from the WR_PTR interrupt, but you have to understand
> details about your panel to configure that correctly.

oh, sad.. I kinda assumed it was a pretty common DSI irq since
forever.. I guess the hw is just managing the flow control to prevent
tearing?

Well, anyways, I guess we could just use a free-running timer based on
refresh rate of the panel?

BR,
-R
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-08 14:56                         ` Jeffrey Hugo
  0 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Hugo @ 2019-11-08 14:56 UTC (permalink / raw)
  To: Rob Clark
  Cc: Rob Clark, Brian Masney, freedreno, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm, Sean Paul

On Thu, Nov 7, 2019 at 7:03 PM Rob Clark <robdclark@chromium.org> wrote:
>
> On Thu, Nov 7, 2019 at 9:40 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> >
> > On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
> > >
> > > On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> > > >
> > > > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > >
> > > > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > >
> > > > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > > > commits:
> > > > > > > > > >
> > > > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > > > >
> > > > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > > > >
> > > > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > > > >
> > > > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > > > >
> > > > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > > > >  {
> > > > > > > > > > -       /* TODO */
> > > > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > > > +
> > > > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > > > +               if (!crtc->state->active)
> > > > > > > > > > +                       continue;
> > > > > > > > > > +
> > > > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > > > +       }
> > > > > > > > > >  }
> > > > > > > > > >
> > > > > > > > > > Any tips would be appreciated.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > > > immediately instead of waiting for vblank.
> > > > > > > > >
> > > > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > > > >
> > > > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > > > available. The key to getting the display working on this phone was this
> > > > > > > > patch:
> > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > > > >
> > > > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > > > kms_funcs->vsync_time().
> > > > > > > > >
> > > > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > > > immediately if there is nothing to wait for.
> > > > > > > >
> > > > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > > > >
> > > > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > > > >
> > > > > > > > The messages go on like this with the same pattern.
> > > > > > > >
> > > > > > > > I tried two different changes:
> > > > > > > >
> > > > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > > > >
> > > > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > > > >    increments; all the way to 5 seconds.
> > > > > > >
> > > > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > > > come at the point where we wait for it..
> > > > > > >
> > > > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > > > before requesting, and false when we get the irq.. and then
> > > > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > > > >
> > > > > > On the otherhand, what about trying to make command mode panels
> > > > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > > > counter in hardware, which is missing from command mode - however it
> > > > > > seems like the driver/drm framework would prefer such a counter.
> > > > > > Would it be a reasonable idea to make a software counter, and just
> > > > > > increment it every time the pp_done irq is triggered?
> > > > > >
> > > > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > > > the code common, rather than diverging it for the two modes.
> > > > > >
> > > > >
> > > > > *possibly*, but I think we want to account somehow periods where
> > > > > display is not updated.
> > > > >
> > > > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > > > time" (drive animations for desktop switch, window
> > > > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > > > not periodic.
> > > >
> > > > What do you think about using some variation of the current value of
> > > > jiffies in the kernel + the number of pp_done IRQs as the software
> > > > counter for command-mode panels?
> > > >
> > >
> > > jiffies is probably too coarse.. but we could use monotonic clock, I guess.
> > >
> > > But I suppose even a cmd mode panel has a "vblank", it is just
> > > internal the panel.  Do we get the TE interrupt at regular intervals?
> > > AFAIU this would be tied to the panel's internal vblank.
> >
> > The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
> > predates that.
> > You can get it from the WR_PTR interrupt, but you have to understand
> > details about your panel to configure that correctly.
>
> oh, sad.. I kinda assumed it was a pretty common DSI irq since
> forever.. I guess the hw is just managing the flow control to prevent
> tearing?
>
> Well, anyways, I guess we could just use a free-running timer based on
> refresh rate of the panel?

That would work.  One more alternative (just want to make sure we've
evaluated all options) is to use the autorefresh feature.  How I would
put it simply, is that autorefresh turns a command mode panel into a
video mode panel.  If autorefresh is enabled, the MDP will
automatically send a frame to the panel every time the panel invokes
the TE signal.  I'm pretty sure this will automatically trigger the
PP_DONE irq, so essentially PP_DONE is now analogous to vsync.  The
downside is that the START trigger and autorefresh are basically
mutually exclusive.  I see the autorefresh config register in MDP 1.0,
so it would be applicable to all platforms supported by the mdp5
driver.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-08 14:56                         ` Jeffrey Hugo
  0 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Hugo @ 2019-11-08 14:56 UTC (permalink / raw)
  To: Rob Clark
  Cc: Sean Paul, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Rob Clark, freedreno, Brian Masney

On Thu, Nov 7, 2019 at 7:03 PM Rob Clark <robdclark@chromium.org> wrote:
>
> On Thu, Nov 7, 2019 at 9:40 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> >
> > On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
> > >
> > > On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> > > >
> > > > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > >
> > > > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > >
> > > > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > > > commits:
> > > > > > > > > >
> > > > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > > > >
> > > > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > > > >
> > > > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > > > >
> > > > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > > > >
> > > > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > > > >  {
> > > > > > > > > > -       /* TODO */
> > > > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > > > +
> > > > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > > > +               if (!crtc->state->active)
> > > > > > > > > > +                       continue;
> > > > > > > > > > +
> > > > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > > > +       }
> > > > > > > > > >  }
> > > > > > > > > >
> > > > > > > > > > Any tips would be appreciated.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > > > immediately instead of waiting for vblank.
> > > > > > > > >
> > > > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > > > >
> > > > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > > > available. The key to getting the display working on this phone was this
> > > > > > > > patch:
> > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > > > >
> > > > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > > > kms_funcs->vsync_time().
> > > > > > > > >
> > > > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > > > immediately if there is nothing to wait for.
> > > > > > > >
> > > > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > > > >
> > > > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > > > >
> > > > > > > > The messages go on like this with the same pattern.
> > > > > > > >
> > > > > > > > I tried two different changes:
> > > > > > > >
> > > > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > > > >
> > > > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > > > >    increments; all the way to 5 seconds.
> > > > > > >
> > > > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > > > come at the point where we wait for it..
> > > > > > >
> > > > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > > > before requesting, and false when we get the irq.. and then
> > > > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > > > >
> > > > > > On the otherhand, what about trying to make command mode panels
> > > > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > > > counter in hardware, which is missing from command mode - however it
> > > > > > seems like the driver/drm framework would prefer such a counter.
> > > > > > Would it be a reasonable idea to make a software counter, and just
> > > > > > increment it every time the pp_done irq is triggered?
> > > > > >
> > > > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > > > the code common, rather than diverging it for the two modes.
> > > > > >
> > > > >
> > > > > *possibly*, but I think we want to account somehow periods where
> > > > > display is not updated.
> > > > >
> > > > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > > > time" (drive animations for desktop switch, window
> > > > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > > > not periodic.
> > > >
> > > > What do you think about using some variation of the current value of
> > > > jiffies in the kernel + the number of pp_done IRQs as the software
> > > > counter for command-mode panels?
> > > >
> > >
> > > jiffies is probably too coarse.. but we could use monotonic clock, I guess.
> > >
> > > But I suppose even a cmd mode panel has a "vblank", it is just
> > > internal the panel.  Do we get the TE interrupt at regular intervals?
> > > AFAIU this would be tied to the panel's internal vblank.
> >
> > The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
> > predates that.
> > You can get it from the WR_PTR interrupt, but you have to understand
> > details about your panel to configure that correctly.
>
> oh, sad.. I kinda assumed it was a pretty common DSI irq since
> forever.. I guess the hw is just managing the flow control to prevent
> tearing?
>
> Well, anyways, I guess we could just use a free-running timer based on
> refresh rate of the panel?

That would work.  One more alternative (just want to make sure we've
evaluated all options) is to use the autorefresh feature.  How I would
put it simply, is that autorefresh turns a command mode panel into a
video mode panel.  If autorefresh is enabled, the MDP will
automatically send a frame to the panel every time the panel invokes
the TE signal.  I'm pretty sure this will automatically trigger the
PP_DONE irq, so essentially PP_DONE is now analogous to vsync.  The
downside is that the START trigger and autorefresh are basically
mutually exclusive.  I see the autorefresh config register in MDP 1.0,
so it would be applicable to all platforms supported by the mdp5
driver.
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-08 14:56                         ` Jeffrey Hugo
  0 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Hugo @ 2019-11-08 14:56 UTC (permalink / raw)
  To: Rob Clark
  Cc: Sean Paul, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	freedreno, Brian Masney

On Thu, Nov 7, 2019 at 7:03 PM Rob Clark <robdclark@chromium.org> wrote:
>
> On Thu, Nov 7, 2019 at 9:40 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> >
> > On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
> > >
> > > On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> > > >
> > > > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > >
> > > > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > >
> > > > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > > > commits:
> > > > > > > > > >
> > > > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > > > >
> > > > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > > > >
> > > > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > > > >
> > > > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > > > >
> > > > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > > > >  {
> > > > > > > > > > -       /* TODO */
> > > > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > > > +
> > > > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > > > +               if (!crtc->state->active)
> > > > > > > > > > +                       continue;
> > > > > > > > > > +
> > > > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > > > +       }
> > > > > > > > > >  }
> > > > > > > > > >
> > > > > > > > > > Any tips would be appreciated.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > > > immediately instead of waiting for vblank.
> > > > > > > > >
> > > > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > > > >
> > > > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > > > available. The key to getting the display working on this phone was this
> > > > > > > > patch:
> > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > > > >
> > > > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > > > kms_funcs->vsync_time().
> > > > > > > > >
> > > > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > > > immediately if there is nothing to wait for.
> > > > > > > >
> > > > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > > > >
> > > > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > > > >
> > > > > > > > The messages go on like this with the same pattern.
> > > > > > > >
> > > > > > > > I tried two different changes:
> > > > > > > >
> > > > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > > > >
> > > > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > > > >    increments; all the way to 5 seconds.
> > > > > > >
> > > > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > > > come at the point where we wait for it..
> > > > > > >
> > > > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > > > before requesting, and false when we get the irq.. and then
> > > > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > > > >
> > > > > > On the otherhand, what about trying to make command mode panels
> > > > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > > > counter in hardware, which is missing from command mode - however it
> > > > > > seems like the driver/drm framework would prefer such a counter.
> > > > > > Would it be a reasonable idea to make a software counter, and just
> > > > > > increment it every time the pp_done irq is triggered?
> > > > > >
> > > > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > > > the code common, rather than diverging it for the two modes.
> > > > > >
> > > > >
> > > > > *possibly*, but I think we want to account somehow periods where
> > > > > display is not updated.
> > > > >
> > > > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > > > time" (drive animations for desktop switch, window
> > > > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > > > not periodic.
> > > >
> > > > What do you think about using some variation of the current value of
> > > > jiffies in the kernel + the number of pp_done IRQs as the software
> > > > counter for command-mode panels?
> > > >
> > >
> > > jiffies is probably too coarse.. but we could use monotonic clock, I guess.
> > >
> > > But I suppose even a cmd mode panel has a "vblank", it is just
> > > internal the panel.  Do we get the TE interrupt at regular intervals?
> > > AFAIU this would be tied to the panel's internal vblank.
> >
> > The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
> > predates that.
> > You can get it from the WR_PTR interrupt, but you have to understand
> > details about your panel to configure that correctly.
>
> oh, sad.. I kinda assumed it was a pretty common DSI irq since
> forever.. I guess the hw is just managing the flow control to prevent
> tearing?
>
> Well, anyways, I guess we could just use a free-running timer based on
> refresh rate of the panel?

That would work.  One more alternative (just want to make sure we've
evaluated all options) is to use the autorefresh feature.  How I would
put it simply, is that autorefresh turns a command mode panel into a
video mode panel.  If autorefresh is enabled, the MDP will
automatically send a frame to the panel every time the panel invokes
the TE signal.  I'm pretty sure this will automatically trigger the
PP_DONE irq, so essentially PP_DONE is now analogous to vsync.  The
downside is that the START trigger and autorefresh are basically
mutually exclusive.  I see the autorefresh config register in MDP 1.0,
so it would be applicable to all platforms supported by the mdp5
driver.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-10 13:53                           ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-10 13:53 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Rob Clark, Rob Clark, freedreno, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm, Sean Paul

On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> On Thu, Nov 7, 2019 at 7:03 PM Rob Clark <robdclark@chromium.org> wrote:
> >
> > On Thu, Nov 7, 2019 at 9:40 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > >
> > > On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
> > > >
> > > > On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > >
> > > > > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > > > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > > > > >
> > > > > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > > > > commits:
> > > > > > > > > > >
> > > > > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > > > > >
> > > > > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > > > > >
> > > > > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > > > > >
> > > > > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > > > > >
> > > > > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > > > > >  {
> > > > > > > > > > > -       /* TODO */
> > > > > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > > > > +
> > > > > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > > > > +               if (!crtc->state->active)
> > > > > > > > > > > +                       continue;
> > > > > > > > > > > +
> > > > > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > > > > +       }
> > > > > > > > > > >  }
> > > > > > > > > > >
> > > > > > > > > > > Any tips would be appreciated.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > > > > immediately instead of waiting for vblank.
> > > > > > > > > >
> > > > > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > > > > >
> > > > > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > > > > available. The key to getting the display working on this phone was this
> > > > > > > > > patch:
> > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > > > > >
> > > > > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > > > > kms_funcs->vsync_time().
> > > > > > > > > >
> > > > > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > > > > immediately if there is nothing to wait for.
> > > > > > > > >
> > > > > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > > > > >
> > > > > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > >
> > > > > > > > > The messages go on like this with the same pattern.
> > > > > > > > >
> > > > > > > > > I tried two different changes:
> > > > > > > > >
> > > > > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > > > > >
> > > > > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > > > > >    increments; all the way to 5 seconds.
> > > > > > > >
> > > > > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > > > > come at the point where we wait for it..
> > > > > > > >
> > > > > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > > > > before requesting, and false when we get the irq.. and then
> > > > > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > > > > >
> > > > > > > On the otherhand, what about trying to make command mode panels
> > > > > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > > > > counter in hardware, which is missing from command mode - however it
> > > > > > > seems like the driver/drm framework would prefer such a counter.
> > > > > > > Would it be a reasonable idea to make a software counter, and just
> > > > > > > increment it every time the pp_done irq is triggered?
> > > > > > >
> > > > > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > > > > the code common, rather than diverging it for the two modes.
> > > > > > >
> > > > > >
> > > > > > *possibly*, but I think we want to account somehow periods where
> > > > > > display is not updated.
> > > > > >
> > > > > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > > > > time" (drive animations for desktop switch, window
> > > > > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > > > > not periodic.
> > > > >
> > > > > What do you think about using some variation of the current value of
> > > > > jiffies in the kernel + the number of pp_done IRQs as the software
> > > > > counter for command-mode panels?
> > > > >
> > > >
> > > > jiffies is probably too coarse.. but we could use monotonic clock, I guess.
> > > >
> > > > But I suppose even a cmd mode panel has a "vblank", it is just
> > > > internal the panel.  Do we get the TE interrupt at regular intervals?
> > > > AFAIU this would be tied to the panel's internal vblank.
> > >
> > > The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
> > > predates that.
> > > You can get it from the WR_PTR interrupt, but you have to understand
> > > details about your panel to configure that correctly.
> >
> > oh, sad.. I kinda assumed it was a pretty common DSI irq since
> > forever.. I guess the hw is just managing the flow control to prevent
> > tearing?
> >
> > Well, anyways, I guess we could just use a free-running timer based on
> > refresh rate of the panel?
> 
> That would work.  One more alternative (just want to make sure we've
> evaluated all options) is to use the autorefresh feature.  How I would
> put it simply, is that autorefresh turns a command mode panel into a
> video mode panel.  If autorefresh is enabled, the MDP will
> automatically send a frame to the panel every time the panel invokes
> the TE signal.  I'm pretty sure this will automatically trigger the
> PP_DONE irq, so essentially PP_DONE is now analogous to vsync.  The
> downside is that the START trigger and autorefresh are basically
> mutually exclusive.  I see the autorefresh config register in MDP 1.0,
> so it would be applicable to all platforms supported by the mdp5
> driver.

There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383

I'm not sure what to put in that register but I tried configuring it
with a 1 this way and still have the same issue.

diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
index eeef41fcd4e1..6b9acf68fd2c 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
@@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
        mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
                        MDP5_PP_SYNC_THRESH_START(4) |
                        MDP5_PP_SYNC_THRESH_CONTINUE(4));
+       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
 
        return 0;
 }

Brian

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-10 13:53                           ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-10 13:53 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, Rob Clark, freedreno

On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> On Thu, Nov 7, 2019 at 7:03 PM Rob Clark <robdclark@chromium.org> wrote:
> >
> > On Thu, Nov 7, 2019 at 9:40 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > >
> > > On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
> > > >
> > > > On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > >
> > > > > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > > > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > > > > >
> > > > > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > > > > commits:
> > > > > > > > > > >
> > > > > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > > > > >
> > > > > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > > > > >
> > > > > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > > > > >
> > > > > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > > > > >
> > > > > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > > > > >  {
> > > > > > > > > > > -       /* TODO */
> > > > > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > > > > +
> > > > > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > > > > +               if (!crtc->state->active)
> > > > > > > > > > > +                       continue;
> > > > > > > > > > > +
> > > > > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > > > > +       }
> > > > > > > > > > >  }
> > > > > > > > > > >
> > > > > > > > > > > Any tips would be appreciated.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > > > > immediately instead of waiting for vblank.
> > > > > > > > > >
> > > > > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > > > > >
> > > > > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > > > > available. The key to getting the display working on this phone was this
> > > > > > > > > patch:
> > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > > > > >
> > > > > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > > > > kms_funcs->vsync_time().
> > > > > > > > > >
> > > > > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > > > > immediately if there is nothing to wait for.
> > > > > > > > >
> > > > > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > > > > >
> > > > > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > >
> > > > > > > > > The messages go on like this with the same pattern.
> > > > > > > > >
> > > > > > > > > I tried two different changes:
> > > > > > > > >
> > > > > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > > > > >
> > > > > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > > > > >    increments; all the way to 5 seconds.
> > > > > > > >
> > > > > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > > > > come at the point where we wait for it..
> > > > > > > >
> > > > > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > > > > before requesting, and false when we get the irq.. and then
> > > > > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > > > > >
> > > > > > > On the otherhand, what about trying to make command mode panels
> > > > > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > > > > counter in hardware, which is missing from command mode - however it
> > > > > > > seems like the driver/drm framework would prefer such a counter.
> > > > > > > Would it be a reasonable idea to make a software counter, and just
> > > > > > > increment it every time the pp_done irq is triggered?
> > > > > > >
> > > > > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > > > > the code common, rather than diverging it for the two modes.
> > > > > > >
> > > > > >
> > > > > > *possibly*, but I think we want to account somehow periods where
> > > > > > display is not updated.
> > > > > >
> > > > > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > > > > time" (drive animations for desktop switch, window
> > > > > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > > > > not periodic.
> > > > >
> > > > > What do you think about using some variation of the current value of
> > > > > jiffies in the kernel + the number of pp_done IRQs as the software
> > > > > counter for command-mode panels?
> > > > >
> > > >
> > > > jiffies is probably too coarse.. but we could use monotonic clock, I guess.
> > > >
> > > > But I suppose even a cmd mode panel has a "vblank", it is just
> > > > internal the panel.  Do we get the TE interrupt at regular intervals?
> > > > AFAIU this would be tied to the panel's internal vblank.
> > >
> > > The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
> > > predates that.
> > > You can get it from the WR_PTR interrupt, but you have to understand
> > > details about your panel to configure that correctly.
> >
> > oh, sad.. I kinda assumed it was a pretty common DSI irq since
> > forever.. I guess the hw is just managing the flow control to prevent
> > tearing?
> >
> > Well, anyways, I guess we could just use a free-running timer based on
> > refresh rate of the panel?
> 
> That would work.  One more alternative (just want to make sure we've
> evaluated all options) is to use the autorefresh feature.  How I would
> put it simply, is that autorefresh turns a command mode panel into a
> video mode panel.  If autorefresh is enabled, the MDP will
> automatically send a frame to the panel every time the panel invokes
> the TE signal.  I'm pretty sure this will automatically trigger the
> PP_DONE irq, so essentially PP_DONE is now analogous to vsync.  The
> downside is that the START trigger and autorefresh are basically
> mutually exclusive.  I see the autorefresh config register in MDP 1.0,
> so it would be applicable to all platforms supported by the mdp5
> driver.

There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383

I'm not sure what to put in that register but I tried configuring it
with a 1 this way and still have the same issue.

diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
index eeef41fcd4e1..6b9acf68fd2c 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
@@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
        mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
                        MDP5_PP_SYNC_THRESH_START(4) |
                        MDP5_PP_SYNC_THRESH_CONTINUE(4));
+       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
 
        return 0;
 }

Brian
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-10 13:53                           ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-10 13:53 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, freedreno

On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> On Thu, Nov 7, 2019 at 7:03 PM Rob Clark <robdclark@chromium.org> wrote:
> >
> > On Thu, Nov 7, 2019 at 9:40 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > >
> > > On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
> > > >
> > > > On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > >
> > > > > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > > > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > > > > >
> > > > > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > > > > commits:
> > > > > > > > > > >
> > > > > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > > > > >
> > > > > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > > > > >
> > > > > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > > > > >
> > > > > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > > > > >
> > > > > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > > > > >  {
> > > > > > > > > > > -       /* TODO */
> > > > > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > > > > +
> > > > > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > > > > +               if (!crtc->state->active)
> > > > > > > > > > > +                       continue;
> > > > > > > > > > > +
> > > > > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > > > > +       }
> > > > > > > > > > >  }
> > > > > > > > > > >
> > > > > > > > > > > Any tips would be appreciated.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > > > > immediately instead of waiting for vblank.
> > > > > > > > > >
> > > > > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > > > > >
> > > > > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > > > > available. The key to getting the display working on this phone was this
> > > > > > > > > patch:
> > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > > > > >
> > > > > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > > > > kms_funcs->vsync_time().
> > > > > > > > > >
> > > > > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > > > > immediately if there is nothing to wait for.
> > > > > > > > >
> > > > > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > > > > >
> > > > > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > >
> > > > > > > > > The messages go on like this with the same pattern.
> > > > > > > > >
> > > > > > > > > I tried two different changes:
> > > > > > > > >
> > > > > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > > > > >
> > > > > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > > > > >    increments; all the way to 5 seconds.
> > > > > > > >
> > > > > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > > > > come at the point where we wait for it..
> > > > > > > >
> > > > > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > > > > before requesting, and false when we get the irq.. and then
> > > > > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > > > > >
> > > > > > > On the otherhand, what about trying to make command mode panels
> > > > > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > > > > counter in hardware, which is missing from command mode - however it
> > > > > > > seems like the driver/drm framework would prefer such a counter.
> > > > > > > Would it be a reasonable idea to make a software counter, and just
> > > > > > > increment it every time the pp_done irq is triggered?
> > > > > > >
> > > > > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > > > > the code common, rather than diverging it for the two modes.
> > > > > > >
> > > > > >
> > > > > > *possibly*, but I think we want to account somehow periods where
> > > > > > display is not updated.
> > > > > >
> > > > > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > > > > time" (drive animations for desktop switch, window
> > > > > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > > > > not periodic.
> > > > >
> > > > > What do you think about using some variation of the current value of
> > > > > jiffies in the kernel + the number of pp_done IRQs as the software
> > > > > counter for command-mode panels?
> > > > >
> > > >
> > > > jiffies is probably too coarse.. but we could use monotonic clock, I guess.
> > > >
> > > > But I suppose even a cmd mode panel has a "vblank", it is just
> > > > internal the panel.  Do we get the TE interrupt at regular intervals?
> > > > AFAIU this would be tied to the panel's internal vblank.
> > >
> > > The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
> > > predates that.
> > > You can get it from the WR_PTR interrupt, but you have to understand
> > > details about your panel to configure that correctly.
> >
> > oh, sad.. I kinda assumed it was a pretty common DSI irq since
> > forever.. I guess the hw is just managing the flow control to prevent
> > tearing?
> >
> > Well, anyways, I guess we could just use a free-running timer based on
> > refresh rate of the panel?
> 
> That would work.  One more alternative (just want to make sure we've
> evaluated all options) is to use the autorefresh feature.  How I would
> put it simply, is that autorefresh turns a command mode panel into a
> video mode panel.  If autorefresh is enabled, the MDP will
> automatically send a frame to the panel every time the panel invokes
> the TE signal.  I'm pretty sure this will automatically trigger the
> PP_DONE irq, so essentially PP_DONE is now analogous to vsync.  The
> downside is that the START trigger and autorefresh are basically
> mutually exclusive.  I see the autorefresh config register in MDP 1.0,
> so it would be applicable to all platforms supported by the mdp5
> driver.

There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383

I'm not sure what to put in that register but I tried configuring it
with a 1 this way and still have the same issue.

diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
index eeef41fcd4e1..6b9acf68fd2c 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
@@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
        mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
                        MDP5_PP_SYNC_THRESH_START(4) |
                        MDP5_PP_SYNC_THRESH_CONTINUE(4));
+       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
 
        return 0;
 }

Brian
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-10 17:37                             ` Jeffrey Hugo
  0 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Hugo @ 2019-11-10 17:37 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, Rob Clark, freedreno, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm, Sean Paul

On Sun, Nov 10, 2019 at 6:53 AM Brian Masney <masneyb@onstation.org> wrote:
>
> On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> > On Thu, Nov 7, 2019 at 7:03 PM Rob Clark <robdclark@chromium.org> wrote:
> > >
> > > On Thu, Nov 7, 2019 at 9:40 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > >
> > > > On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > >
> > > > > On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > >
> > > > > > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > > > > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > > > > > commits:
> > > > > > > > > > > >
> > > > > > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > > > > > >
> > > > > > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > > > > > >
> > > > > > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > > > > > >
> > > > > > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > > > > > >
> > > > > > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > > > > > >  {
> > > > > > > > > > > > -       /* TODO */
> > > > > > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > > > > > +
> > > > > > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > > > > > +               if (!crtc->state->active)
> > > > > > > > > > > > +                       continue;
> > > > > > > > > > > > +
> > > > > > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > > > > > +       }
> > > > > > > > > > > >  }
> > > > > > > > > > > >
> > > > > > > > > > > > Any tips would be appreciated.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > > > > > immediately instead of waiting for vblank.
> > > > > > > > > > >
> > > > > > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > > > > > >
> > > > > > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > > > > > available. The key to getting the display working on this phone was this
> > > > > > > > > > patch:
> > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > > > > > >
> > > > > > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > > > > > kms_funcs->vsync_time().
> > > > > > > > > > >
> > > > > > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > > > > > immediately if there is nothing to wait for.
> > > > > > > > > >
> > > > > > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > > > > > >
> > > > > > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > > >
> > > > > > > > > > The messages go on like this with the same pattern.
> > > > > > > > > >
> > > > > > > > > > I tried two different changes:
> > > > > > > > > >
> > > > > > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > > > > > >
> > > > > > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > > > > > >    increments; all the way to 5 seconds.
> > > > > > > > >
> > > > > > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > > > > > come at the point where we wait for it..
> > > > > > > > >
> > > > > > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > > > > > before requesting, and false when we get the irq.. and then
> > > > > > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > > > > > >
> > > > > > > > On the otherhand, what about trying to make command mode panels
> > > > > > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > > > > > counter in hardware, which is missing from command mode - however it
> > > > > > > > seems like the driver/drm framework would prefer such a counter.
> > > > > > > > Would it be a reasonable idea to make a software counter, and just
> > > > > > > > increment it every time the pp_done irq is triggered?
> > > > > > > >
> > > > > > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > > > > > the code common, rather than diverging it for the two modes.
> > > > > > > >
> > > > > > >
> > > > > > > *possibly*, but I think we want to account somehow periods where
> > > > > > > display is not updated.
> > > > > > >
> > > > > > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > > > > > time" (drive animations for desktop switch, window
> > > > > > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > > > > > not periodic.
> > > > > >
> > > > > > What do you think about using some variation of the current value of
> > > > > > jiffies in the kernel + the number of pp_done IRQs as the software
> > > > > > counter for command-mode panels?
> > > > > >
> > > > >
> > > > > jiffies is probably too coarse.. but we could use monotonic clock, I guess.
> > > > >
> > > > > But I suppose even a cmd mode panel has a "vblank", it is just
> > > > > internal the panel.  Do we get the TE interrupt at regular intervals?
> > > > > AFAIU this would be tied to the panel's internal vblank.
> > > >
> > > > The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
> > > > predates that.
> > > > You can get it from the WR_PTR interrupt, but you have to understand
> > > > details about your panel to configure that correctly.
> > >
> > > oh, sad.. I kinda assumed it was a pretty common DSI irq since
> > > forever.. I guess the hw is just managing the flow control to prevent
> > > tearing?
> > >
> > > Well, anyways, I guess we could just use a free-running timer based on
> > > refresh rate of the panel?
> >
> > That would work.  One more alternative (just want to make sure we've
> > evaluated all options) is to use the autorefresh feature.  How I would
> > put it simply, is that autorefresh turns a command mode panel into a
> > video mode panel.  If autorefresh is enabled, the MDP will
> > automatically send a frame to the panel every time the panel invokes
> > the TE signal.  I'm pretty sure this will automatically trigger the
> > PP_DONE irq, so essentially PP_DONE is now analogous to vsync.  The
> > downside is that the START trigger and autorefresh are basically
> > mutually exclusive.  I see the autorefresh config register in MDP 1.0,
> > so it would be applicable to all platforms supported by the mdp5
> > driver.
>
> There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383
>
> I'm not sure what to put in that register but I tried configuring it
> with a 1 this way and still have the same issue.
>
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> index eeef41fcd4e1..6b9acf68fd2c 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> @@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
>         mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
>                         MDP5_PP_SYNC_THRESH_START(4) |
>                         MDP5_PP_SYNC_THRESH_CONTINUE(4));
> +       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
>
>         return 0;
>  }

bit 31 is the enable bit (set that to 1), bits 15:0 are the
frame_count (how many te events before the MDP sends a frame, I'd
recommend set to 1).  Then after its programmed, you'll have to flush
the config, and probably use a _START to make sure the flush takes
effect.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-10 17:37                             ` Jeffrey Hugo
  0 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Hugo @ 2019-11-10 17:37 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, Rob Clark, freedreno

On Sun, Nov 10, 2019 at 6:53 AM Brian Masney <masneyb@onstation.org> wrote:
>
> On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> > On Thu, Nov 7, 2019 at 7:03 PM Rob Clark <robdclark@chromium.org> wrote:
> > >
> > > On Thu, Nov 7, 2019 at 9:40 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > >
> > > > On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > >
> > > > > On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > >
> > > > > > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > > > > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > > > > > commits:
> > > > > > > > > > > >
> > > > > > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > > > > > >
> > > > > > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > > > > > >
> > > > > > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > > > > > >
> > > > > > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > > > > > >
> > > > > > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > > > > > >  {
> > > > > > > > > > > > -       /* TODO */
> > > > > > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > > > > > +
> > > > > > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > > > > > +               if (!crtc->state->active)
> > > > > > > > > > > > +                       continue;
> > > > > > > > > > > > +
> > > > > > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > > > > > +       }
> > > > > > > > > > > >  }
> > > > > > > > > > > >
> > > > > > > > > > > > Any tips would be appreciated.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > > > > > immediately instead of waiting for vblank.
> > > > > > > > > > >
> > > > > > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > > > > > >
> > > > > > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > > > > > available. The key to getting the display working on this phone was this
> > > > > > > > > > patch:
> > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > > > > > >
> > > > > > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > > > > > kms_funcs->vsync_time().
> > > > > > > > > > >
> > > > > > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > > > > > immediately if there is nothing to wait for.
> > > > > > > > > >
> > > > > > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > > > > > >
> > > > > > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > > >
> > > > > > > > > > The messages go on like this with the same pattern.
> > > > > > > > > >
> > > > > > > > > > I tried two different changes:
> > > > > > > > > >
> > > > > > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > > > > > >
> > > > > > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > > > > > >    increments; all the way to 5 seconds.
> > > > > > > > >
> > > > > > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > > > > > come at the point where we wait for it..
> > > > > > > > >
> > > > > > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > > > > > before requesting, and false when we get the irq.. and then
> > > > > > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > > > > > >
> > > > > > > > On the otherhand, what about trying to make command mode panels
> > > > > > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > > > > > counter in hardware, which is missing from command mode - however it
> > > > > > > > seems like the driver/drm framework would prefer such a counter.
> > > > > > > > Would it be a reasonable idea to make a software counter, and just
> > > > > > > > increment it every time the pp_done irq is triggered?
> > > > > > > >
> > > > > > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > > > > > the code common, rather than diverging it for the two modes.
> > > > > > > >
> > > > > > >
> > > > > > > *possibly*, but I think we want to account somehow periods where
> > > > > > > display is not updated.
> > > > > > >
> > > > > > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > > > > > time" (drive animations for desktop switch, window
> > > > > > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > > > > > not periodic.
> > > > > >
> > > > > > What do you think about using some variation of the current value of
> > > > > > jiffies in the kernel + the number of pp_done IRQs as the software
> > > > > > counter for command-mode panels?
> > > > > >
> > > > >
> > > > > jiffies is probably too coarse.. but we could use monotonic clock, I guess.
> > > > >
> > > > > But I suppose even a cmd mode panel has a "vblank", it is just
> > > > > internal the panel.  Do we get the TE interrupt at regular intervals?
> > > > > AFAIU this would be tied to the panel's internal vblank.
> > > >
> > > > The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
> > > > predates that.
> > > > You can get it from the WR_PTR interrupt, but you have to understand
> > > > details about your panel to configure that correctly.
> > >
> > > oh, sad.. I kinda assumed it was a pretty common DSI irq since
> > > forever.. I guess the hw is just managing the flow control to prevent
> > > tearing?
> > >
> > > Well, anyways, I guess we could just use a free-running timer based on
> > > refresh rate of the panel?
> >
> > That would work.  One more alternative (just want to make sure we've
> > evaluated all options) is to use the autorefresh feature.  How I would
> > put it simply, is that autorefresh turns a command mode panel into a
> > video mode panel.  If autorefresh is enabled, the MDP will
> > automatically send a frame to the panel every time the panel invokes
> > the TE signal.  I'm pretty sure this will automatically trigger the
> > PP_DONE irq, so essentially PP_DONE is now analogous to vsync.  The
> > downside is that the START trigger and autorefresh are basically
> > mutually exclusive.  I see the autorefresh config register in MDP 1.0,
> > so it would be applicable to all platforms supported by the mdp5
> > driver.
>
> There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383
>
> I'm not sure what to put in that register but I tried configuring it
> with a 1 this way and still have the same issue.
>
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> index eeef41fcd4e1..6b9acf68fd2c 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> @@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
>         mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
>                         MDP5_PP_SYNC_THRESH_START(4) |
>                         MDP5_PP_SYNC_THRESH_CONTINUE(4));
> +       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
>
>         return 0;
>  }

bit 31 is the enable bit (set that to 1), bits 15:0 are the
frame_count (how many te events before the MDP sends a frame, I'd
recommend set to 1).  Then after its programmed, you'll have to flush
the config, and probably use a _START to make sure the flush takes
effect.
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-10 17:37                             ` Jeffrey Hugo
  0 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Hugo @ 2019-11-10 17:37 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, freedreno

On Sun, Nov 10, 2019 at 6:53 AM Brian Masney <masneyb@onstation.org> wrote:
>
> On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> > On Thu, Nov 7, 2019 at 7:03 PM Rob Clark <robdclark@chromium.org> wrote:
> > >
> > > On Thu, Nov 7, 2019 at 9:40 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > >
> > > > On Thu, Nov 7, 2019 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > >
> > > > > On Thu, Nov 7, 2019 at 3:10 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > >
> > > > > > On Wed, Nov 06, 2019 at 08:58:59AM -0800, Rob Clark wrote:
> > > > > > > On Wed, Nov 6, 2019 at 8:47 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Nov 6, 2019 at 9:30 AM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Nov 6, 2019 at 1:13 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Tue, Nov 05, 2019 at 08:23:27AM -0800, Rob Clark wrote:
> > > > > > > > > > > On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@onstation.org> wrote:
> > > > > > > > > > > > The 'pp done time out' errors go away if I revert the following three
> > > > > > > > > > > > commits:
> > > > > > > > > > > >
> > > > > > > > > > > > cd6d923167b1 ("drm/msm/dpu: async commit support")
> > > > > > > > > > > > d934a712c5e6 ("drm/msm: add atomic traces")
> > > > > > > > > > > > 2d99ced787e3 ("drm/msm: async commit support")
> > > > > > > > > > > >
> > > > > > > > > > > > I reverted the first one to fix a compiler error, and the second one so
> > > > > > > > > > > > that the last patch can be reverted without any merge conflicts.
> > > > > > > > > > > >
> > > > > > > > > > > > I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> > > > > > > > > > > > crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> > > > > > > > > > > > buffer dance around the screen like its out of sync. I renamed
> > > > > > > > > > > > crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> > > > > > > > > > > > declaration. Here's the relevant part of what I tried:
> > > > > > > > > > > >
> > > > > > > > > > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> > > > > > > > > > > > @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
> > > > > > > > > > > >
> > > > > > > > > > > >  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
> > > > > > > > > > > >  {
> > > > > > > > > > > > -       /* TODO */
> > > > > > > > > > > > +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> > > > > > > > > > > > +       struct drm_crtc *crtc;
> > > > > > > > > > > > +
> > > > > > > > > > > > +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> > > > > > > > > > > > +               if (!crtc->state->active)
> > > > > > > > > > > > +                       continue;
> > > > > > > > > > > > +
> > > > > > > > > > > > +               mdp5_crtc_flush_all(crtc);
> > > > > > > > > > > > +       }
> > > > > > > > > > > >  }
> > > > > > > > > > > >
> > > > > > > > > > > > Any tips would be appreciated.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I think this is along the lines of what we need to enable async commit
> > > > > > > > > > > for mdp5 (but also removing the flush from the atomic-commit path)..
> > > > > > > > > > > the principle behind the async commit is to do all the atomic state
> > > > > > > > > > > commit normally, but defer writing the flush bits.  This way, if you
> > > > > > > > > > > get another async update before the next vblank, you just apply it
> > > > > > > > > > > immediately instead of waiting for vblank.
> > > > > > > > > > >
> > > > > > > > > > > But I guess you are on a command mode panel, if I remember?  Which is
> > > > > > > > > > > a case I didn't have a way to test.  And I'm not entirely about how
> > > > > > > > > > > kms_funcs->vsync_time() should be implemented for cmd mode panels.
> > > > > > > > > >
> > > > > > > > > > Yes, this is a command-mode panel and there's no hardware frame counter
> > > > > > > > > > available. The key to getting the display working on this phone was this
> > > > > > > > > > patch:
> > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2bab52af6fe68c43b327a57e5ce5fc10eefdfadf
> > > > > > > > > >
> > > > > > > > > > > That all said, I think we should first fix what is broken, before
> > > > > > > > > > > worrying about extending async commit support to mdp5.. which
> > > > > > > > > > > shouldn't hit the async==true path, due to not implementing
> > > > > > > > > > > kms_funcs->vsync_time().
> > > > > > > > > > >
> > > > > > > > > > > What I think is going on is that, in the cmd mode case,
> > > > > > > > > > > mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
> > > > > > > > > > > which waits for a pp-done irq regardless of whether there is a flush
> > > > > > > > > > > in progress.  Since there is no flush pending, the irq never comes.
> > > > > > > > > > > But the expectation is that kms_funcs->wait_flush() returns
> > > > > > > > > > > immediately if there is nothing to wait for.
> > > > > > > > > >
> > > > > > > > > > I don't think that's happening in this case. I added some pr_info()
> > > > > > > > > > statements to request_pp_done_pending() and mdp5_crtc_pp_done_irq().
> > > > > > > > > > Here's the first two sets of messages that appear in dmesg:
> > > > > > > > > >
> > > > > > > > > > [   14.018907] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > > > [   14.018993] request_pp_done_pending: HERE
> > > > > > > > > > [   14.074208] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > > > [   14.074368] Console: switching to colour frame buffer device 135x120
> > > > > > > > > > [   14.138938] msm fd900000.mdss: pp done time out, lm=0
> > > > > > > > > > [   14.139021] request_pp_done_pending: HERE
> > > > > > > > > > [   14.158097] mdp5_crtc_pp_done_irq: HERE
> > > > > > > > > >
> > > > > > > > > > The messages go on like this with the same pattern.
> > > > > > > > > >
> > > > > > > > > > I tried two different changes:
> > > > > > > > > >
> > > > > > > > > > 1) I moved the request_pp_done_pending() and corresponding if statement
> > > > > > > > > >    from mdp5_crtc_atomic_flush() and into mdp5_crtc_atomic_begin().
> > > > > > > > > >
> > > > > > > > > > 2) I increased the timeout in wait_for_completion_timeout() by several
> > > > > > > > > >    increments; all the way to 5 seconds.
> > > > > > > > >
> > > > > > > > > increasing the timeout won't help, because the pp-done irq has already
> > > > > > > > > come at the point where we wait for it..
> > > > > > > > >
> > > > > > > > > maybe the easy thing is just add mdp5_crtc->needs_pp, set to true
> > > > > > > > > before requesting, and false when we get the irq.. and then
> > > > > > > > > mdp5_crtc_wait_for_pp_done() just returns if needs_pp==false..
> > > > > > > >
> > > > > > > > On the otherhand, what about trying to make command mode panels
> > > > > > > > resemble video mode panels slightly?  Video mode panels have a vsync
> > > > > > > > counter in hardware, which is missing from command mode - however it
> > > > > > > > seems like the driver/drm framework would prefer such a counter.
> > > > > > > > Would it be a reasonable idea to make a software counter, and just
> > > > > > > > increment it every time the pp_done irq is triggered?
> > > > > > > >
> > > > > > > > I'm just thinking that we'll avoid issues long term by trying to make
> > > > > > > > the code common, rather than diverging it for the two modes.
> > > > > > > >
> > > > > > >
> > > > > > > *possibly*, but I think we want to account somehow periods where
> > > > > > > display is not updated.
> > > > > > >
> > > > > > > fwiw, it isn't that uncommon for userspace to use vblanks to "keep
> > > > > > > time" (drive animations for desktop switch, window
> > > > > > > maximize/unmaximize, etc).. it could be a surprise when "vblank" is
> > > > > > > not periodic.
> > > > > >
> > > > > > What do you think about using some variation of the current value of
> > > > > > jiffies in the kernel + the number of pp_done IRQs as the software
> > > > > > counter for command-mode panels?
> > > > > >
> > > > >
> > > > > jiffies is probably too coarse.. but we could use monotonic clock, I guess.
> > > > >
> > > > > But I suppose even a cmd mode panel has a "vblank", it is just
> > > > > internal the panel.  Do we get the TE interrupt at regular intervals?
> > > > > AFAIU this would be tied to the panel's internal vblank.
> > > >
> > > > The TE interrupt was first implemented in MDP 1.7.0 (msm8996).  8974
> > > > predates that.
> > > > You can get it from the WR_PTR interrupt, but you have to understand
> > > > details about your panel to configure that correctly.
> > >
> > > oh, sad.. I kinda assumed it was a pretty common DSI irq since
> > > forever.. I guess the hw is just managing the flow control to prevent
> > > tearing?
> > >
> > > Well, anyways, I guess we could just use a free-running timer based on
> > > refresh rate of the panel?
> >
> > That would work.  One more alternative (just want to make sure we've
> > evaluated all options) is to use the autorefresh feature.  How I would
> > put it simply, is that autorefresh turns a command mode panel into a
> > video mode panel.  If autorefresh is enabled, the MDP will
> > automatically send a frame to the panel every time the panel invokes
> > the TE signal.  I'm pretty sure this will automatically trigger the
> > PP_DONE irq, so essentially PP_DONE is now analogous to vsync.  The
> > downside is that the START trigger and autorefresh are basically
> > mutually exclusive.  I see the autorefresh config register in MDP 1.0,
> > so it would be applicable to all platforms supported by the mdp5
> > driver.
>
> There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383
>
> I'm not sure what to put in that register but I tried configuring it
> with a 1 this way and still have the same issue.
>
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> index eeef41fcd4e1..6b9acf68fd2c 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> @@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
>         mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
>                         MDP5_PP_SYNC_THRESH_START(4) |
>                         MDP5_PP_SYNC_THRESH_CONTINUE(4));
> +       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
>
>         return 0;
>  }

bit 31 is the enable bit (set that to 1), bits 15:0 are the
frame_count (how many te events before the MDP sends a frame, I'd
recommend set to 1).  Then after its programmed, you'll have to flush
the config, and probably use a _START to make sure the flush takes
effect.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-11 11:38                               ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-11 11:38 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Rob Clark, Rob Clark, freedreno, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm, Sean Paul

On Sun, Nov 10, 2019 at 10:37:33AM -0700, Jeffrey Hugo wrote:
> On Sun, Nov 10, 2019 at 6:53 AM Brian Masney <masneyb@onstation.org> wrote:
> >
> > On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> > There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383
> >
> > I'm not sure what to put in that register but I tried configuring it
> > with a 1 this way and still have the same issue.
> >
> > diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > index eeef41fcd4e1..6b9acf68fd2c 100644
> > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > @@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
> >         mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
> >                         MDP5_PP_SYNC_THRESH_START(4) |
> >                         MDP5_PP_SYNC_THRESH_CONTINUE(4));
> > +       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
> >
> >         return 0;
> >  }
> 
> bit 31 is the enable bit (set that to 1), bits 15:0 are the
> frame_count (how many te events before the MDP sends a frame, I'd
> recommend set to 1).  Then after its programmed, you'll have to flush
> the config, and probably use a _START to make sure the flush takes
> effect.

I think that I initially get autorefresh enabled based on your
description above since the ping pong IRQs occur much more frequently.
However pretty quickly the error 'dsi_err_worker: status=c' is shown,
the contents on the screen shift to the right, and the screen no longer
updates after that. That error decodes to
DSI_ERR_STATE_DLN0_PHY | DSI_ERR_STATE_FIFO according to dsi_host.c.

Here's the relevant code that I have so far:

diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
index eeef41fcd4e1..85a5cfe54ce8 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
@@ -157,6 +157,7 @@ void mdp5_cmd_encoder_enable(struct drm_encoder *encoder)
        struct mdp5_ctl *ctl = mdp5_cmd_enc->ctl;
        struct mdp5_interface *intf = mdp5_cmd_enc->intf;
        struct mdp5_pipeline *pipeline = mdp5_crtc_get_pipeline(encoder->crtc);
+       struct mdp5_kms *mdp5_kms = get_kms(encoder);;
 
        if (WARN_ON(mdp5_cmd_enc->enabled))
                return;
@@ -167,6 +168,14 @@ void mdp5_cmd_encoder_enable(struct drm_encoder *encoder)
 
        mdp5_ctl_commit(ctl, pipeline, mdp_ctl_flush_mask_encoder(intf), true);
 
+       if (intf->type == INTF_DSI &&
+           intf->mode == MDP5_INTF_DSI_MODE_COMMAND) {
+               mdp5_write(mdp5_kms,
+                          REG_MDP5_PP_AUTOREFRESH_CONFIG(pipeline->mixer->pp),
+                          BIT(31) | BIT(0));
+               mdp5_crtc_flush_all(encoder->crtc);
+       }
+
        mdp5_ctl_set_encoder_state(ctl, pipeline, true);
 
        mdp5_cmd_enc->enabled = true;
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
index 05cc04f729d6..369746ebbc42 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
@@ -103,7 +104,7 @@ static u32 crtc_flush(struct drm_crtc *crtc, u32 flush_mask)
  * so that we can safely queue unref to current fb (ie. next
  * vblank we know hw is done w/ previous scanout_fb).
  */
-static u32 crtc_flush_all(struct drm_crtc *crtc)
+u32 mdp5_crtc_flush_all(struct drm_crtc *crtc)
 {
        struct mdp5_crtc_state *mdp5_cstate = to_mdp5_crtc_state(crtc->state);
        struct mdp5_hw_mixer *mixer, *r_mixer;
@@ -734,7 +735,7 @@ static void mdp5_crtc_atomic_flush(struct drm_crtc *crtc,
        if (mdp5_cstate->cmd_mode)
                request_pp_done_pending(crtc);
 
-       mdp5_crtc->flushed_mask = crtc_flush_all(crtc);
+       mdp5_crtc->flushed_mask = mdp5_crtc_flush_all(crtc);
 
        /* XXX are we leaking out state here? */
        mdp5_crtc->vblank.irqmask = mdp5_cstate->vblank_irqmask;
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
index 128866742593..3490328ab63e 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
@@ -278,6 +278,7 @@ enum mdp5_pipe mdp5_plane_right_pipe(struct drm_plane *plane);
 struct drm_plane *mdp5_plane_init(struct drm_device *dev,
                                  enum drm_plane_type type);
 
+u32 mdp5_crtc_flush_all(struct drm_crtc *crtc);
 struct mdp5_ctl *mdp5_crtc_get_ctl(struct drm_crtc *crtc);
 uint32_t mdp5_crtc_vblank(struct drm_crtc *crtc);
 

Note that mdp5_ctl_set_encoder_state() will call send_start_signal()
for a command-mode panel.

I put a HERE log statement in request_pp_done_pending() and
mdp5_crtc_pp_done_irq() and here's the relevant part of dmesg:

[   13.832596] msm fd900000.mdss: pp done time out, lm=0
[   13.832690] request_pp_done_pending: HERE
[   13.899890] mdp5_crtc_pp_done_irq: HERE
[   13.899981] Console: switching to colour frame buffer device 135x120
[   13.916662] mdp5_crtc_pp_done_irq: HERE
[   13.916813] request_pp_done_pending: HERE
[   13.933439] mdp5_crtc_pp_done_irq: HERE
[   13.950217] mdp5_crtc_pp_done_irq: HERE
[   13.950295] request_pp_done_pending: HERE
[   13.959973] msm fd900000.mdss: fb0: msmdrmfb frame buffer device
[   13.964469] i2c i2c-4: Added multiplexed i2c bus 5
[   13.966998] mdp5_crtc_pp_done_irq: HERE
[   13.983780] mdp5_crtc_pp_done_irq: HERE
[   13.983932] request_pp_done_pending: HERE
[   14.000617] mdp5_crtc_pp_done_irq: HERE
[   14.017393] mdp5_crtc_pp_done_irq: HERE
[   14.017539] request_pp_done_pending: HERE
[   14.034173] mdp5_crtc_pp_done_irq: HERE
[   14.050956] mdp5_crtc_pp_done_irq: HERE
[   14.067738] mdp5_crtc_pp_done_irq: HERE
[   14.084521] mdp5_crtc_pp_done_irq: HERE
[   14.101305] mdp5_crtc_pp_done_irq: HERE
[   14.118085] mdp5_crtc_pp_done_irq: HERE
[   14.134866] mdp5_crtc_pp_done_irq: HERE
[   14.151646] mdp5_crtc_pp_done_irq: HERE
[   14.168425] mdp5_crtc_pp_done_irq: HERE
[   14.185204] mdp5_crtc_pp_done_irq: HERE
[   14.192790] request_pp_done_pending: HERE
[   14.192967] dsi_err_worker: status=c
[   14.241759] dsi_err_worker: status=c
[   14.252650] msm fd900000.mdss: pp done time out, lm=0
[   14.462645] msm fd900000.mdss: pp done time out, lm=0
[   14.462704] request_pp_done_pending: HERE
[   14.522644] msm fd900000.mdss: pp done time out, lm=0
[   14.672643] msm fd900000.mdss: pp done time out, lm=0
[   14.672702] request_pp_done_pending: HERE
[   14.732643] msm fd900000.mdss: pp done time out, lm=0
[   14.882644] msm fd900000.mdss: pp done time out, lm=0

Brian

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-11 11:38                               ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-11 11:38 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, Rob Clark, freedreno

On Sun, Nov 10, 2019 at 10:37:33AM -0700, Jeffrey Hugo wrote:
> On Sun, Nov 10, 2019 at 6:53 AM Brian Masney <masneyb@onstation.org> wrote:
> >
> > On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> > There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383
> >
> > I'm not sure what to put in that register but I tried configuring it
> > with a 1 this way and still have the same issue.
> >
> > diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > index eeef41fcd4e1..6b9acf68fd2c 100644
> > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > @@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
> >         mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
> >                         MDP5_PP_SYNC_THRESH_START(4) |
> >                         MDP5_PP_SYNC_THRESH_CONTINUE(4));
> > +       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
> >
> >         return 0;
> >  }
> 
> bit 31 is the enable bit (set that to 1), bits 15:0 are the
> frame_count (how many te events before the MDP sends a frame, I'd
> recommend set to 1).  Then after its programmed, you'll have to flush
> the config, and probably use a _START to make sure the flush takes
> effect.

I think that I initially get autorefresh enabled based on your
description above since the ping pong IRQs occur much more frequently.
However pretty quickly the error 'dsi_err_worker: status=c' is shown,
the contents on the screen shift to the right, and the screen no longer
updates after that. That error decodes to
DSI_ERR_STATE_DLN0_PHY | DSI_ERR_STATE_FIFO according to dsi_host.c.

Here's the relevant code that I have so far:

diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
index eeef41fcd4e1..85a5cfe54ce8 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
@@ -157,6 +157,7 @@ void mdp5_cmd_encoder_enable(struct drm_encoder *encoder)
        struct mdp5_ctl *ctl = mdp5_cmd_enc->ctl;
        struct mdp5_interface *intf = mdp5_cmd_enc->intf;
        struct mdp5_pipeline *pipeline = mdp5_crtc_get_pipeline(encoder->crtc);
+       struct mdp5_kms *mdp5_kms = get_kms(encoder);;
 
        if (WARN_ON(mdp5_cmd_enc->enabled))
                return;
@@ -167,6 +168,14 @@ void mdp5_cmd_encoder_enable(struct drm_encoder *encoder)
 
        mdp5_ctl_commit(ctl, pipeline, mdp_ctl_flush_mask_encoder(intf), true);
 
+       if (intf->type == INTF_DSI &&
+           intf->mode == MDP5_INTF_DSI_MODE_COMMAND) {
+               mdp5_write(mdp5_kms,
+                          REG_MDP5_PP_AUTOREFRESH_CONFIG(pipeline->mixer->pp),
+                          BIT(31) | BIT(0));
+               mdp5_crtc_flush_all(encoder->crtc);
+       }
+
        mdp5_ctl_set_encoder_state(ctl, pipeline, true);
 
        mdp5_cmd_enc->enabled = true;
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
index 05cc04f729d6..369746ebbc42 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
@@ -103,7 +104,7 @@ static u32 crtc_flush(struct drm_crtc *crtc, u32 flush_mask)
  * so that we can safely queue unref to current fb (ie. next
  * vblank we know hw is done w/ previous scanout_fb).
  */
-static u32 crtc_flush_all(struct drm_crtc *crtc)
+u32 mdp5_crtc_flush_all(struct drm_crtc *crtc)
 {
        struct mdp5_crtc_state *mdp5_cstate = to_mdp5_crtc_state(crtc->state);
        struct mdp5_hw_mixer *mixer, *r_mixer;
@@ -734,7 +735,7 @@ static void mdp5_crtc_atomic_flush(struct drm_crtc *crtc,
        if (mdp5_cstate->cmd_mode)
                request_pp_done_pending(crtc);
 
-       mdp5_crtc->flushed_mask = crtc_flush_all(crtc);
+       mdp5_crtc->flushed_mask = mdp5_crtc_flush_all(crtc);
 
        /* XXX are we leaking out state here? */
        mdp5_crtc->vblank.irqmask = mdp5_cstate->vblank_irqmask;
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
index 128866742593..3490328ab63e 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
@@ -278,6 +278,7 @@ enum mdp5_pipe mdp5_plane_right_pipe(struct drm_plane *plane);
 struct drm_plane *mdp5_plane_init(struct drm_device *dev,
                                  enum drm_plane_type type);
 
+u32 mdp5_crtc_flush_all(struct drm_crtc *crtc);
 struct mdp5_ctl *mdp5_crtc_get_ctl(struct drm_crtc *crtc);
 uint32_t mdp5_crtc_vblank(struct drm_crtc *crtc);
 

Note that mdp5_ctl_set_encoder_state() will call send_start_signal()
for a command-mode panel.

I put a HERE log statement in request_pp_done_pending() and
mdp5_crtc_pp_done_irq() and here's the relevant part of dmesg:

[   13.832596] msm fd900000.mdss: pp done time out, lm=0
[   13.832690] request_pp_done_pending: HERE
[   13.899890] mdp5_crtc_pp_done_irq: HERE
[   13.899981] Console: switching to colour frame buffer device 135x120
[   13.916662] mdp5_crtc_pp_done_irq: HERE
[   13.916813] request_pp_done_pending: HERE
[   13.933439] mdp5_crtc_pp_done_irq: HERE
[   13.950217] mdp5_crtc_pp_done_irq: HERE
[   13.950295] request_pp_done_pending: HERE
[   13.959973] msm fd900000.mdss: fb0: msmdrmfb frame buffer device
[   13.964469] i2c i2c-4: Added multiplexed i2c bus 5
[   13.966998] mdp5_crtc_pp_done_irq: HERE
[   13.983780] mdp5_crtc_pp_done_irq: HERE
[   13.983932] request_pp_done_pending: HERE
[   14.000617] mdp5_crtc_pp_done_irq: HERE
[   14.017393] mdp5_crtc_pp_done_irq: HERE
[   14.017539] request_pp_done_pending: HERE
[   14.034173] mdp5_crtc_pp_done_irq: HERE
[   14.050956] mdp5_crtc_pp_done_irq: HERE
[   14.067738] mdp5_crtc_pp_done_irq: HERE
[   14.084521] mdp5_crtc_pp_done_irq: HERE
[   14.101305] mdp5_crtc_pp_done_irq: HERE
[   14.118085] mdp5_crtc_pp_done_irq: HERE
[   14.134866] mdp5_crtc_pp_done_irq: HERE
[   14.151646] mdp5_crtc_pp_done_irq: HERE
[   14.168425] mdp5_crtc_pp_done_irq: HERE
[   14.185204] mdp5_crtc_pp_done_irq: HERE
[   14.192790] request_pp_done_pending: HERE
[   14.192967] dsi_err_worker: status=c
[   14.241759] dsi_err_worker: status=c
[   14.252650] msm fd900000.mdss: pp done time out, lm=0
[   14.462645] msm fd900000.mdss: pp done time out, lm=0
[   14.462704] request_pp_done_pending: HERE
[   14.522644] msm fd900000.mdss: pp done time out, lm=0
[   14.672643] msm fd900000.mdss: pp done time out, lm=0
[   14.672702] request_pp_done_pending: HERE
[   14.732643] msm fd900000.mdss: pp done time out, lm=0
[   14.882644] msm fd900000.mdss: pp done time out, lm=0

Brian
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-11 11:38                               ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-11 11:38 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, freedreno

On Sun, Nov 10, 2019 at 10:37:33AM -0700, Jeffrey Hugo wrote:
> On Sun, Nov 10, 2019 at 6:53 AM Brian Masney <masneyb@onstation.org> wrote:
> >
> > On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> > There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383
> >
> > I'm not sure what to put in that register but I tried configuring it
> > with a 1 this way and still have the same issue.
> >
> > diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > index eeef41fcd4e1..6b9acf68fd2c 100644
> > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > @@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
> >         mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
> >                         MDP5_PP_SYNC_THRESH_START(4) |
> >                         MDP5_PP_SYNC_THRESH_CONTINUE(4));
> > +       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
> >
> >         return 0;
> >  }
> 
> bit 31 is the enable bit (set that to 1), bits 15:0 are the
> frame_count (how many te events before the MDP sends a frame, I'd
> recommend set to 1).  Then after its programmed, you'll have to flush
> the config, and probably use a _START to make sure the flush takes
> effect.

I think that I initially get autorefresh enabled based on your
description above since the ping pong IRQs occur much more frequently.
However pretty quickly the error 'dsi_err_worker: status=c' is shown,
the contents on the screen shift to the right, and the screen no longer
updates after that. That error decodes to
DSI_ERR_STATE_DLN0_PHY | DSI_ERR_STATE_FIFO according to dsi_host.c.

Here's the relevant code that I have so far:

diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
index eeef41fcd4e1..85a5cfe54ce8 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
@@ -157,6 +157,7 @@ void mdp5_cmd_encoder_enable(struct drm_encoder *encoder)
        struct mdp5_ctl *ctl = mdp5_cmd_enc->ctl;
        struct mdp5_interface *intf = mdp5_cmd_enc->intf;
        struct mdp5_pipeline *pipeline = mdp5_crtc_get_pipeline(encoder->crtc);
+       struct mdp5_kms *mdp5_kms = get_kms(encoder);;
 
        if (WARN_ON(mdp5_cmd_enc->enabled))
                return;
@@ -167,6 +168,14 @@ void mdp5_cmd_encoder_enable(struct drm_encoder *encoder)
 
        mdp5_ctl_commit(ctl, pipeline, mdp_ctl_flush_mask_encoder(intf), true);
 
+       if (intf->type == INTF_DSI &&
+           intf->mode == MDP5_INTF_DSI_MODE_COMMAND) {
+               mdp5_write(mdp5_kms,
+                          REG_MDP5_PP_AUTOREFRESH_CONFIG(pipeline->mixer->pp),
+                          BIT(31) | BIT(0));
+               mdp5_crtc_flush_all(encoder->crtc);
+       }
+
        mdp5_ctl_set_encoder_state(ctl, pipeline, true);
 
        mdp5_cmd_enc->enabled = true;
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
index 05cc04f729d6..369746ebbc42 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
@@ -103,7 +104,7 @@ static u32 crtc_flush(struct drm_crtc *crtc, u32 flush_mask)
  * so that we can safely queue unref to current fb (ie. next
  * vblank we know hw is done w/ previous scanout_fb).
  */
-static u32 crtc_flush_all(struct drm_crtc *crtc)
+u32 mdp5_crtc_flush_all(struct drm_crtc *crtc)
 {
        struct mdp5_crtc_state *mdp5_cstate = to_mdp5_crtc_state(crtc->state);
        struct mdp5_hw_mixer *mixer, *r_mixer;
@@ -734,7 +735,7 @@ static void mdp5_crtc_atomic_flush(struct drm_crtc *crtc,
        if (mdp5_cstate->cmd_mode)
                request_pp_done_pending(crtc);
 
-       mdp5_crtc->flushed_mask = crtc_flush_all(crtc);
+       mdp5_crtc->flushed_mask = mdp5_crtc_flush_all(crtc);
 
        /* XXX are we leaking out state here? */
        mdp5_crtc->vblank.irqmask = mdp5_cstate->vblank_irqmask;
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
index 128866742593..3490328ab63e 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
@@ -278,6 +278,7 @@ enum mdp5_pipe mdp5_plane_right_pipe(struct drm_plane *plane);
 struct drm_plane *mdp5_plane_init(struct drm_device *dev,
                                  enum drm_plane_type type);
 
+u32 mdp5_crtc_flush_all(struct drm_crtc *crtc);
 struct mdp5_ctl *mdp5_crtc_get_ctl(struct drm_crtc *crtc);
 uint32_t mdp5_crtc_vblank(struct drm_crtc *crtc);
 

Note that mdp5_ctl_set_encoder_state() will call send_start_signal()
for a command-mode panel.

I put a HERE log statement in request_pp_done_pending() and
mdp5_crtc_pp_done_irq() and here's the relevant part of dmesg:

[   13.832596] msm fd900000.mdss: pp done time out, lm=0
[   13.832690] request_pp_done_pending: HERE
[   13.899890] mdp5_crtc_pp_done_irq: HERE
[   13.899981] Console: switching to colour frame buffer device 135x120
[   13.916662] mdp5_crtc_pp_done_irq: HERE
[   13.916813] request_pp_done_pending: HERE
[   13.933439] mdp5_crtc_pp_done_irq: HERE
[   13.950217] mdp5_crtc_pp_done_irq: HERE
[   13.950295] request_pp_done_pending: HERE
[   13.959973] msm fd900000.mdss: fb0: msmdrmfb frame buffer device
[   13.964469] i2c i2c-4: Added multiplexed i2c bus 5
[   13.966998] mdp5_crtc_pp_done_irq: HERE
[   13.983780] mdp5_crtc_pp_done_irq: HERE
[   13.983932] request_pp_done_pending: HERE
[   14.000617] mdp5_crtc_pp_done_irq: HERE
[   14.017393] mdp5_crtc_pp_done_irq: HERE
[   14.017539] request_pp_done_pending: HERE
[   14.034173] mdp5_crtc_pp_done_irq: HERE
[   14.050956] mdp5_crtc_pp_done_irq: HERE
[   14.067738] mdp5_crtc_pp_done_irq: HERE
[   14.084521] mdp5_crtc_pp_done_irq: HERE
[   14.101305] mdp5_crtc_pp_done_irq: HERE
[   14.118085] mdp5_crtc_pp_done_irq: HERE
[   14.134866] mdp5_crtc_pp_done_irq: HERE
[   14.151646] mdp5_crtc_pp_done_irq: HERE
[   14.168425] mdp5_crtc_pp_done_irq: HERE
[   14.185204] mdp5_crtc_pp_done_irq: HERE
[   14.192790] request_pp_done_pending: HERE
[   14.192967] dsi_err_worker: status=c
[   14.241759] dsi_err_worker: status=c
[   14.252650] msm fd900000.mdss: pp done time out, lm=0
[   14.462645] msm fd900000.mdss: pp done time out, lm=0
[   14.462704] request_pp_done_pending: HERE
[   14.522644] msm fd900000.mdss: pp done time out, lm=0
[   14.672643] msm fd900000.mdss: pp done time out, lm=0
[   14.672702] request_pp_done_pending: HERE
[   14.732643] msm fd900000.mdss: pp done time out, lm=0
[   14.882644] msm fd900000.mdss: pp done time out, lm=0

Brian
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-11 14:51                                 ` Jeffrey Hugo
  0 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Hugo @ 2019-11-11 14:51 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, Rob Clark, freedreno, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm, Sean Paul

On Mon, Nov 11, 2019 at 4:38 AM Brian Masney <masneyb@onstation.org> wrote:
>
> On Sun, Nov 10, 2019 at 10:37:33AM -0700, Jeffrey Hugo wrote:
> > On Sun, Nov 10, 2019 at 6:53 AM Brian Masney <masneyb@onstation.org> wrote:
> > >
> > > On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> > > There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383
> > >
> > > I'm not sure what to put in that register but I tried configuring it
> > > with a 1 this way and still have the same issue.
> > >
> > > diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > index eeef41fcd4e1..6b9acf68fd2c 100644
> > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > @@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
> > >         mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
> > >                         MDP5_PP_SYNC_THRESH_START(4) |
> > >                         MDP5_PP_SYNC_THRESH_CONTINUE(4));
> > > +       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
> > >
> > >         return 0;
> > >  }
> >
> > bit 31 is the enable bit (set that to 1), bits 15:0 are the
> > frame_count (how many te events before the MDP sends a frame, I'd
> > recommend set to 1).  Then after its programmed, you'll have to flush
> > the config, and probably use a _START to make sure the flush takes
> > effect.
>
> I think that I initially get autorefresh enabled based on your
> description above since the ping pong IRQs occur much more frequently.
> However pretty quickly the error 'dsi_err_worker: status=c' is shown,
> the contents on the screen shift to the right, and the screen no longer
> updates after that. That error decodes to
> DSI_ERR_STATE_DLN0_PHY | DSI_ERR_STATE_FIFO according to dsi_host.c.
>
> Here's the relevant code that I have so far:

So, Unless I missed it, you haven't disabled using _start when
autorefresh is enabled.  If you are using both at the same time,
you'll overload the DSI and get those kinds of errors.

>
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> index eeef41fcd4e1..85a5cfe54ce8 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> @@ -157,6 +157,7 @@ void mdp5_cmd_encoder_enable(struct drm_encoder *encoder)
>         struct mdp5_ctl *ctl = mdp5_cmd_enc->ctl;
>         struct mdp5_interface *intf = mdp5_cmd_enc->intf;
>         struct mdp5_pipeline *pipeline = mdp5_crtc_get_pipeline(encoder->crtc);
> +       struct mdp5_kms *mdp5_kms = get_kms(encoder);;
>
>         if (WARN_ON(mdp5_cmd_enc->enabled))
>                 return;
> @@ -167,6 +168,14 @@ void mdp5_cmd_encoder_enable(struct drm_encoder *encoder)
>
>         mdp5_ctl_commit(ctl, pipeline, mdp_ctl_flush_mask_encoder(intf), true);
>
> +       if (intf->type == INTF_DSI &&
> +           intf->mode == MDP5_INTF_DSI_MODE_COMMAND) {
> +               mdp5_write(mdp5_kms,
> +                          REG_MDP5_PP_AUTOREFRESH_CONFIG(pipeline->mixer->pp),
> +                          BIT(31) | BIT(0));
> +               mdp5_crtc_flush_all(encoder->crtc);
> +       }
> +
>         mdp5_ctl_set_encoder_state(ctl, pipeline, true);
>
>         mdp5_cmd_enc->enabled = true;
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
> index 05cc04f729d6..369746ebbc42 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
> @@ -103,7 +104,7 @@ static u32 crtc_flush(struct drm_crtc *crtc, u32 flush_mask)
>   * so that we can safely queue unref to current fb (ie. next
>   * vblank we know hw is done w/ previous scanout_fb).
>   */
> -static u32 crtc_flush_all(struct drm_crtc *crtc)
> +u32 mdp5_crtc_flush_all(struct drm_crtc *crtc)
>  {
>         struct mdp5_crtc_state *mdp5_cstate = to_mdp5_crtc_state(crtc->state);
>         struct mdp5_hw_mixer *mixer, *r_mixer;
> @@ -734,7 +735,7 @@ static void mdp5_crtc_atomic_flush(struct drm_crtc *crtc,
>         if (mdp5_cstate->cmd_mode)
>                 request_pp_done_pending(crtc);
>
> -       mdp5_crtc->flushed_mask = crtc_flush_all(crtc);
> +       mdp5_crtc->flushed_mask = mdp5_crtc_flush_all(crtc);
>
>         /* XXX are we leaking out state here? */
>         mdp5_crtc->vblank.irqmask = mdp5_cstate->vblank_irqmask;
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
> index 128866742593..3490328ab63e 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
> @@ -278,6 +278,7 @@ enum mdp5_pipe mdp5_plane_right_pipe(struct drm_plane *plane);
>  struct drm_plane *mdp5_plane_init(struct drm_device *dev,
>                                   enum drm_plane_type type);
>
> +u32 mdp5_crtc_flush_all(struct drm_crtc *crtc);
>  struct mdp5_ctl *mdp5_crtc_get_ctl(struct drm_crtc *crtc);
>  uint32_t mdp5_crtc_vblank(struct drm_crtc *crtc);
>
>
> Note that mdp5_ctl_set_encoder_state() will call send_start_signal()
> for a command-mode panel.
>
> I put a HERE log statement in request_pp_done_pending() and
> mdp5_crtc_pp_done_irq() and here's the relevant part of dmesg:
>
> [   13.832596] msm fd900000.mdss: pp done time out, lm=0
> [   13.832690] request_pp_done_pending: HERE
> [   13.899890] mdp5_crtc_pp_done_irq: HERE
> [   13.899981] Console: switching to colour frame buffer device 135x120
> [   13.916662] mdp5_crtc_pp_done_irq: HERE
> [   13.916813] request_pp_done_pending: HERE
> [   13.933439] mdp5_crtc_pp_done_irq: HERE
> [   13.950217] mdp5_crtc_pp_done_irq: HERE
> [   13.950295] request_pp_done_pending: HERE
> [   13.959973] msm fd900000.mdss: fb0: msmdrmfb frame buffer device
> [   13.964469] i2c i2c-4: Added multiplexed i2c bus 5
> [   13.966998] mdp5_crtc_pp_done_irq: HERE
> [   13.983780] mdp5_crtc_pp_done_irq: HERE
> [   13.983932] request_pp_done_pending: HERE
> [   14.000617] mdp5_crtc_pp_done_irq: HERE
> [   14.017393] mdp5_crtc_pp_done_irq: HERE
> [   14.017539] request_pp_done_pending: HERE
> [   14.034173] mdp5_crtc_pp_done_irq: HERE
> [   14.050956] mdp5_crtc_pp_done_irq: HERE
> [   14.067738] mdp5_crtc_pp_done_irq: HERE
> [   14.084521] mdp5_crtc_pp_done_irq: HERE
> [   14.101305] mdp5_crtc_pp_done_irq: HERE
> [   14.118085] mdp5_crtc_pp_done_irq: HERE
> [   14.134866] mdp5_crtc_pp_done_irq: HERE
> [   14.151646] mdp5_crtc_pp_done_irq: HERE
> [   14.168425] mdp5_crtc_pp_done_irq: HERE
> [   14.185204] mdp5_crtc_pp_done_irq: HERE
> [   14.192790] request_pp_done_pending: HERE
> [   14.192967] dsi_err_worker: status=c
> [   14.241759] dsi_err_worker: status=c
> [   14.252650] msm fd900000.mdss: pp done time out, lm=0
> [   14.462645] msm fd900000.mdss: pp done time out, lm=0
> [   14.462704] request_pp_done_pending: HERE
> [   14.522644] msm fd900000.mdss: pp done time out, lm=0
> [   14.672643] msm fd900000.mdss: pp done time out, lm=0
> [   14.672702] request_pp_done_pending: HERE
> [   14.732643] msm fd900000.mdss: pp done time out, lm=0
> [   14.882644] msm fd900000.mdss: pp done time out, lm=0
>
> Brian

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-11 14:51                                 ` Jeffrey Hugo
  0 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Hugo @ 2019-11-11 14:51 UTC (permalink / raw)
  To: Brian Masney
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, freedreno

On Mon, Nov 11, 2019 at 4:38 AM Brian Masney <masneyb@onstation.org> wrote:
>
> On Sun, Nov 10, 2019 at 10:37:33AM -0700, Jeffrey Hugo wrote:
> > On Sun, Nov 10, 2019 at 6:53 AM Brian Masney <masneyb@onstation.org> wrote:
> > >
> > > On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> > > There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383
> > >
> > > I'm not sure what to put in that register but I tried configuring it
> > > with a 1 this way and still have the same issue.
> > >
> > > diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > index eeef41fcd4e1..6b9acf68fd2c 100644
> > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > @@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
> > >         mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
> > >                         MDP5_PP_SYNC_THRESH_START(4) |
> > >                         MDP5_PP_SYNC_THRESH_CONTINUE(4));
> > > +       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
> > >
> > >         return 0;
> > >  }
> >
> > bit 31 is the enable bit (set that to 1), bits 15:0 are the
> > frame_count (how many te events before the MDP sends a frame, I'd
> > recommend set to 1).  Then after its programmed, you'll have to flush
> > the config, and probably use a _START to make sure the flush takes
> > effect.
>
> I think that I initially get autorefresh enabled based on your
> description above since the ping pong IRQs occur much more frequently.
> However pretty quickly the error 'dsi_err_worker: status=c' is shown,
> the contents on the screen shift to the right, and the screen no longer
> updates after that. That error decodes to
> DSI_ERR_STATE_DLN0_PHY | DSI_ERR_STATE_FIFO according to dsi_host.c.
>
> Here's the relevant code that I have so far:

So, Unless I missed it, you haven't disabled using _start when
autorefresh is enabled.  If you are using both at the same time,
you'll overload the DSI and get those kinds of errors.

>
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> index eeef41fcd4e1..85a5cfe54ce8 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> @@ -157,6 +157,7 @@ void mdp5_cmd_encoder_enable(struct drm_encoder *encoder)
>         struct mdp5_ctl *ctl = mdp5_cmd_enc->ctl;
>         struct mdp5_interface *intf = mdp5_cmd_enc->intf;
>         struct mdp5_pipeline *pipeline = mdp5_crtc_get_pipeline(encoder->crtc);
> +       struct mdp5_kms *mdp5_kms = get_kms(encoder);;
>
>         if (WARN_ON(mdp5_cmd_enc->enabled))
>                 return;
> @@ -167,6 +168,14 @@ void mdp5_cmd_encoder_enable(struct drm_encoder *encoder)
>
>         mdp5_ctl_commit(ctl, pipeline, mdp_ctl_flush_mask_encoder(intf), true);
>
> +       if (intf->type == INTF_DSI &&
> +           intf->mode == MDP5_INTF_DSI_MODE_COMMAND) {
> +               mdp5_write(mdp5_kms,
> +                          REG_MDP5_PP_AUTOREFRESH_CONFIG(pipeline->mixer->pp),
> +                          BIT(31) | BIT(0));
> +               mdp5_crtc_flush_all(encoder->crtc);
> +       }
> +
>         mdp5_ctl_set_encoder_state(ctl, pipeline, true);
>
>         mdp5_cmd_enc->enabled = true;
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
> index 05cc04f729d6..369746ebbc42 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
> @@ -103,7 +104,7 @@ static u32 crtc_flush(struct drm_crtc *crtc, u32 flush_mask)
>   * so that we can safely queue unref to current fb (ie. next
>   * vblank we know hw is done w/ previous scanout_fb).
>   */
> -static u32 crtc_flush_all(struct drm_crtc *crtc)
> +u32 mdp5_crtc_flush_all(struct drm_crtc *crtc)
>  {
>         struct mdp5_crtc_state *mdp5_cstate = to_mdp5_crtc_state(crtc->state);
>         struct mdp5_hw_mixer *mixer, *r_mixer;
> @@ -734,7 +735,7 @@ static void mdp5_crtc_atomic_flush(struct drm_crtc *crtc,
>         if (mdp5_cstate->cmd_mode)
>                 request_pp_done_pending(crtc);
>
> -       mdp5_crtc->flushed_mask = crtc_flush_all(crtc);
> +       mdp5_crtc->flushed_mask = mdp5_crtc_flush_all(crtc);
>
>         /* XXX are we leaking out state here? */
>         mdp5_crtc->vblank.irqmask = mdp5_cstate->vblank_irqmask;
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
> index 128866742593..3490328ab63e 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.h
> @@ -278,6 +278,7 @@ enum mdp5_pipe mdp5_plane_right_pipe(struct drm_plane *plane);
>  struct drm_plane *mdp5_plane_init(struct drm_device *dev,
>                                   enum drm_plane_type type);
>
> +u32 mdp5_crtc_flush_all(struct drm_crtc *crtc);
>  struct mdp5_ctl *mdp5_crtc_get_ctl(struct drm_crtc *crtc);
>  uint32_t mdp5_crtc_vblank(struct drm_crtc *crtc);
>
>
> Note that mdp5_ctl_set_encoder_state() will call send_start_signal()
> for a command-mode panel.
>
> I put a HERE log statement in request_pp_done_pending() and
> mdp5_crtc_pp_done_irq() and here's the relevant part of dmesg:
>
> [   13.832596] msm fd900000.mdss: pp done time out, lm=0
> [   13.832690] request_pp_done_pending: HERE
> [   13.899890] mdp5_crtc_pp_done_irq: HERE
> [   13.899981] Console: switching to colour frame buffer device 135x120
> [   13.916662] mdp5_crtc_pp_done_irq: HERE
> [   13.916813] request_pp_done_pending: HERE
> [   13.933439] mdp5_crtc_pp_done_irq: HERE
> [   13.950217] mdp5_crtc_pp_done_irq: HERE
> [   13.950295] request_pp_done_pending: HERE
> [   13.959973] msm fd900000.mdss: fb0: msmdrmfb frame buffer device
> [   13.964469] i2c i2c-4: Added multiplexed i2c bus 5
> [   13.966998] mdp5_crtc_pp_done_irq: HERE
> [   13.983780] mdp5_crtc_pp_done_irq: HERE
> [   13.983932] request_pp_done_pending: HERE
> [   14.000617] mdp5_crtc_pp_done_irq: HERE
> [   14.017393] mdp5_crtc_pp_done_irq: HERE
> [   14.017539] request_pp_done_pending: HERE
> [   14.034173] mdp5_crtc_pp_done_irq: HERE
> [   14.050956] mdp5_crtc_pp_done_irq: HERE
> [   14.067738] mdp5_crtc_pp_done_irq: HERE
> [   14.084521] mdp5_crtc_pp_done_irq: HERE
> [   14.101305] mdp5_crtc_pp_done_irq: HERE
> [   14.118085] mdp5_crtc_pp_done_irq: HERE
> [   14.134866] mdp5_crtc_pp_done_irq: HERE
> [   14.151646] mdp5_crtc_pp_done_irq: HERE
> [   14.168425] mdp5_crtc_pp_done_irq: HERE
> [   14.185204] mdp5_crtc_pp_done_irq: HERE
> [   14.192790] request_pp_done_pending: HERE
> [   14.192967] dsi_err_worker: status=c
> [   14.241759] dsi_err_worker: status=c
> [   14.252650] msm fd900000.mdss: pp done time out, lm=0
> [   14.462645] msm fd900000.mdss: pp done time out, lm=0
> [   14.462704] request_pp_done_pending: HERE
> [   14.522644] msm fd900000.mdss: pp done time out, lm=0
> [   14.672643] msm fd900000.mdss: pp done time out, lm=0
> [   14.672702] request_pp_done_pending: HERE
> [   14.732643] msm fd900000.mdss: pp done time out, lm=0
> [   14.882644] msm fd900000.mdss: pp done time out, lm=0
>
> Brian
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-12 10:54                                   ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-12 10:54 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Rob Clark, Rob Clark, freedreno, Linux Kernel Mailing List,
	dri-devel, linux-arm-msm, Sean Paul

On Mon, Nov 11, 2019 at 07:51:22AM -0700, Jeffrey Hugo wrote:
> On Mon, Nov 11, 2019 at 4:38 AM Brian Masney <masneyb@onstation.org> wrote:
> >
> > On Sun, Nov 10, 2019 at 10:37:33AM -0700, Jeffrey Hugo wrote:
> > > On Sun, Nov 10, 2019 at 6:53 AM Brian Masney <masneyb@onstation.org> wrote:
> > > >
> > > > On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> > > > There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383
> > > >
> > > > I'm not sure what to put in that register but I tried configuring it
> > > > with a 1 this way and still have the same issue.
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > > index eeef41fcd4e1..6b9acf68fd2c 100644
> > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > > @@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
> > > >         mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
> > > >                         MDP5_PP_SYNC_THRESH_START(4) |
> > > >                         MDP5_PP_SYNC_THRESH_CONTINUE(4));
> > > > +       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
> > > >
> > > >         return 0;
> > > >  }
> > >
> > > bit 31 is the enable bit (set that to 1), bits 15:0 are the
> > > frame_count (how many te events before the MDP sends a frame, I'd
> > > recommend set to 1).  Then after its programmed, you'll have to flush
> > > the config, and probably use a _START to make sure the flush takes
> > > effect.
> >
> > I think that I initially get autorefresh enabled based on your
> > description above since the ping pong IRQs occur much more frequently.
> > However pretty quickly the error 'dsi_err_worker: status=c' is shown,
> > the contents on the screen shift to the right, and the screen no longer
> > updates after that. That error decodes to
> > DSI_ERR_STATE_DLN0_PHY | DSI_ERR_STATE_FIFO according to dsi_host.c.
> >
> > Here's the relevant code that I have so far:
> 
> So, Unless I missed it, you haven't disabled using _start when
> autorefresh is enabled.  If you are using both at the same time,
> you'll overload the DSI and get those kinds of errors.

That fixed the issue. Just to close out this thread, I submitted a
patch with what I have here:
https://lore.kernel.org/lkml/20191112104854.20850-1-masneyb@onstation.org/T/#u

I'll work on async commit support for the MDP5.

Thanks Jeff and Rob!

Brian

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-12 10:54                                   ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-12 10:54 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, Rob Clark, freedreno

On Mon, Nov 11, 2019 at 07:51:22AM -0700, Jeffrey Hugo wrote:
> On Mon, Nov 11, 2019 at 4:38 AM Brian Masney <masneyb@onstation.org> wrote:
> >
> > On Sun, Nov 10, 2019 at 10:37:33AM -0700, Jeffrey Hugo wrote:
> > > On Sun, Nov 10, 2019 at 6:53 AM Brian Masney <masneyb@onstation.org> wrote:
> > > >
> > > > On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> > > > There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383
> > > >
> > > > I'm not sure what to put in that register but I tried configuring it
> > > > with a 1 this way and still have the same issue.
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > > index eeef41fcd4e1..6b9acf68fd2c 100644
> > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > > @@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
> > > >         mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
> > > >                         MDP5_PP_SYNC_THRESH_START(4) |
> > > >                         MDP5_PP_SYNC_THRESH_CONTINUE(4));
> > > > +       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
> > > >
> > > >         return 0;
> > > >  }
> > >
> > > bit 31 is the enable bit (set that to 1), bits 15:0 are the
> > > frame_count (how many te events before the MDP sends a frame, I'd
> > > recommend set to 1).  Then after its programmed, you'll have to flush
> > > the config, and probably use a _START to make sure the flush takes
> > > effect.
> >
> > I think that I initially get autorefresh enabled based on your
> > description above since the ping pong IRQs occur much more frequently.
> > However pretty quickly the error 'dsi_err_worker: status=c' is shown,
> > the contents on the screen shift to the right, and the screen no longer
> > updates after that. That error decodes to
> > DSI_ERR_STATE_DLN0_PHY | DSI_ERR_STATE_FIFO according to dsi_host.c.
> >
> > Here's the relevant code that I have so far:
> 
> So, Unless I missed it, you haven't disabled using _start when
> autorefresh is enabled.  If you are using both at the same time,
> you'll overload the DSI and get those kinds of errors.

That fixed the issue. Just to close out this thread, I submitted a
patch with what I have here:
https://lore.kernel.org/lkml/20191112104854.20850-1-masneyb@onstation.org/T/#u

I'll work on async commit support for the MDP5.

Thanks Jeff and Rob!

Brian
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes
@ 2019-11-12 10:54                                   ` Brian Masney
  0 siblings, 0 replies; 44+ messages in thread
From: Brian Masney @ 2019-11-12 10:54 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Rob Clark, linux-arm-msm, Linux Kernel Mailing List, dri-devel,
	Sean Paul, freedreno

On Mon, Nov 11, 2019 at 07:51:22AM -0700, Jeffrey Hugo wrote:
> On Mon, Nov 11, 2019 at 4:38 AM Brian Masney <masneyb@onstation.org> wrote:
> >
> > On Sun, Nov 10, 2019 at 10:37:33AM -0700, Jeffrey Hugo wrote:
> > > On Sun, Nov 10, 2019 at 6:53 AM Brian Masney <masneyb@onstation.org> wrote:
> > > >
> > > > On Fri, Nov 08, 2019 at 07:56:25AM -0700, Jeffrey Hugo wrote:
> > > > There's a REG_MDP5_PP_AUTOREFRESH_CONFIG() macro upstream here:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h#n1383
> > > >
> > > > I'm not sure what to put in that register but I tried configuring it
> > > > with a 1 this way and still have the same issue.
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > > index eeef41fcd4e1..6b9acf68fd2c 100644
> > > > --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > > +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
> > > > @@ -80,6 +80,7 @@ static int pingpong_tearcheck_setup(struct drm_encoder *encoder,
> > > >         mdp5_write(mdp5_kms, REG_MDP5_PP_SYNC_THRESH(pp_id),
> > > >                         MDP5_PP_SYNC_THRESH_START(4) |
> > > >                         MDP5_PP_SYNC_THRESH_CONTINUE(4));
> > > > +       mdp5_write(mdp5_kms, REG_MDP5_PP_AUTOREFRESH_CONFIG(pp_id), 1);
> > > >
> > > >         return 0;
> > > >  }
> > >
> > > bit 31 is the enable bit (set that to 1), bits 15:0 are the
> > > frame_count (how many te events before the MDP sends a frame, I'd
> > > recommend set to 1).  Then after its programmed, you'll have to flush
> > > the config, and probably use a _START to make sure the flush takes
> > > effect.
> >
> > I think that I initially get autorefresh enabled based on your
> > description above since the ping pong IRQs occur much more frequently.
> > However pretty quickly the error 'dsi_err_worker: status=c' is shown,
> > the contents on the screen shift to the right, and the screen no longer
> > updates after that. That error decodes to
> > DSI_ERR_STATE_DLN0_PHY | DSI_ERR_STATE_FIFO according to dsi_host.c.
> >
> > Here's the relevant code that I have so far:
> 
> So, Unless I missed it, you haven't disabled using _start when
> autorefresh is enabled.  If you are using both at the same time,
> you'll overload the DSI and get those kinds of errors.

That fixed the issue. Just to close out this thread, I submitted a
patch with what I have here:
https://lore.kernel.org/lkml/20191112104854.20850-1-masneyb@onstation.org/T/#u

I'll work on async commit support for the MDP5.

Thanks Jeff and Rob!

Brian
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2019-11-13  7:55 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-05  0:01 drm/msm: 'pp done time out' errors after async commit changes Brian Masney
2019-11-05  0:01 ` Brian Masney
2019-11-05  0:19 ` [Freedreno] " Rob Clark
2019-11-05  0:19   ` Rob Clark
2019-11-05 10:08   ` Brian Masney
2019-11-05 10:08     ` Brian Masney
2019-11-05 16:23     ` Rob Clark
2019-11-05 16:23       ` Rob Clark
2019-11-06  9:13       ` Brian Masney
2019-11-06  9:13         ` Brian Masney
2019-11-06 16:29         ` Rob Clark
2019-11-06 16:29           ` Rob Clark
2019-11-06 16:29           ` Rob Clark
2019-11-06 16:47           ` [Freedreno] " Jeffrey Hugo
2019-11-06 16:47             ` Jeffrey Hugo
2019-11-06 16:58             ` Rob Clark
2019-11-06 16:58               ` Rob Clark
2019-11-06 16:58               ` Rob Clark
2019-11-07 11:10               ` [Freedreno] " Brian Masney
2019-11-07 11:10                 ` Brian Masney
2019-11-07 16:16                 ` Rob Clark
2019-11-07 16:16                   ` Rob Clark
2019-11-07 17:40                   ` Jeffrey Hugo
2019-11-07 17:40                     ` Jeffrey Hugo
2019-11-08  2:03                     ` Rob Clark
2019-11-08  2:03                       ` Rob Clark
2019-11-08  2:03                       ` Rob Clark
2019-11-08 14:56                       ` [Freedreno] " Jeffrey Hugo
2019-11-08 14:56                         ` Jeffrey Hugo
2019-11-08 14:56                         ` Jeffrey Hugo
2019-11-10 13:53                         ` [Freedreno] " Brian Masney
2019-11-10 13:53                           ` Brian Masney
2019-11-10 13:53                           ` Brian Masney
2019-11-10 17:37                           ` [Freedreno] " Jeffrey Hugo
2019-11-10 17:37                             ` Jeffrey Hugo
2019-11-10 17:37                             ` Jeffrey Hugo
2019-11-11 11:38                             ` [Freedreno] " Brian Masney
2019-11-11 11:38                               ` Brian Masney
2019-11-11 11:38                               ` Brian Masney
2019-11-11 14:51                               ` [Freedreno] " Jeffrey Hugo
2019-11-11 14:51                                 ` Jeffrey Hugo
2019-11-12 10:54                                 ` Brian Masney
2019-11-12 10:54                                   ` Brian Masney
2019-11-12 10:54                                   ` Brian Masney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.