All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] drm/i915: More page flip vs. reset improvements
@ 2013-02-15 15:07 ville.syrjala
  2013-02-15 15:07 ` [PATCH v2 1/3] drm/i915: Wake up pending_flip_queue as part of reset handling ville.syrjala
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: ville.syrjala @ 2013-02-15 15:07 UTC (permalink / raw)
  To: intel-gfx

Here are some more flip vs. reset handling fixes.

I think this series fixes all the usual hangs in the flip vs. reset handling.
It even sends out the page flip events to user space.

I can now run a GL game, stop the rings, wait for the reset to kick in, and
then go on playing as if nothing happened.

i-g-t test will follow ASAP.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 1/3] drm/i915: Wake up pending_flip_queue as part of reset handling
  2013-02-15 15:07 [PATCH 0/3] drm/i915: More page flip vs. reset improvements ville.syrjala
@ 2013-02-15 15:07 ` ville.syrjala
  2013-02-15 23:53   ` Chris Wilson
  2013-02-15 15:07 ` [PATCH v2 2/3] drm/i915: Really wait for pending flips when panning ville.syrjala
  2013-02-15 15:07 ` [PATCH 3/3] drm/i915: Update primary planes after a GPU reset ville.syrjala
  2 siblings, 1 reply; 16+ messages in thread
From: ville.syrjala @ 2013-02-15 15:07 UTC (permalink / raw)
  To: intel-gfx

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Someone may be waiting for a flip that will never complete due to a GPU
reset. Wake up all such waiters after the GPU reset processing has
finished.

v2: Dropped the wake_up_all() from i915_handle_error() since
    we no longer wait for pending flips with struct_mutex held.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2cd97d1..672816a 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -915,6 +915,8 @@ static void i915_error_work_func(struct work_struct *work)
 		for_each_ring(ring, dev_priv, i)
 			wake_up_all(&ring->irq_queue);
 
+		wake_up_all(&dev_priv->pending_flip_queue);
+
 		wake_up_all(&dev_priv->gpu_error.reset_queue);
 	}
 }
-- 
1.7.12.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/3] drm/i915: Really wait for pending flips when panning
  2013-02-15 15:07 [PATCH 0/3] drm/i915: More page flip vs. reset improvements ville.syrjala
  2013-02-15 15:07 ` [PATCH v2 1/3] drm/i915: Wake up pending_flip_queue as part of reset handling ville.syrjala
@ 2013-02-15 15:07 ` ville.syrjala
  2013-02-15 15:07 ` [PATCH 3/3] drm/i915: Update primary planes after a GPU reset ville.syrjala
  2 siblings, 0 replies; 16+ messages in thread
From: ville.syrjala @ 2013-02-15 15:07 UTC (permalink / raw)
  To: intel-gfx

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Since obj->pending_flips was never set, intel_pipe_set_base() never
actually waited for pending page flips to complete.

We really do want to wait for the pending flips, because otherwise the
mmio surface base address update could overtake the flip, and you
could end up with an old frame on the screen once the flip really
completes.

Just call intel_crtc_wait_pending_flips() prior to calling
intel_pipe_set_base() instead of calling just intel_finish_fb()
from intel_pipe_set_base(). Moving the call outside of
intel_pipe_set_base() avoids calling it twice from the full
modeset path.

v2: Wait for pending flips w/o holding struct_mutex

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 11e72de..f0c6416 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2301,9 +2301,6 @@ intel_pipe_set_base(struct drm_crtc *crtc, int x, int y,
 		return ret;
 	}
 
-	if (crtc->fb)
-		intel_finish_fb(crtc->fb);
-
 	ret = dev_priv->display.update_plane(crtc, fb, x, y);
 	if (ret) {
 		intel_unpin_fb_obj(to_intel_framebuffer(fb)->obj);
@@ -8042,6 +8039,8 @@ static int intel_crtc_set_config(struct drm_mode_set *set)
 			goto fail;
 		}
 	} else if (config->fb_changed) {
+		intel_crtc_wait_for_pending_flips(set->crtc);
+
 		ret = intel_pipe_set_base(set->crtc,
 					  set->x, set->y, set->fb);
 	}
-- 
1.7.12.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/3] drm/i915: Update primary planes after a GPU reset
  2013-02-15 15:07 [PATCH 0/3] drm/i915: More page flip vs. reset improvements ville.syrjala
  2013-02-15 15:07 ` [PATCH v2 1/3] drm/i915: Wake up pending_flip_queue as part of reset handling ville.syrjala
  2013-02-15 15:07 ` [PATCH v2 2/3] drm/i915: Really wait for pending flips when panning ville.syrjala
@ 2013-02-15 15:07 ` ville.syrjala
  2013-02-15 15:28   ` Chris Wilson
  2 siblings, 1 reply; 16+ messages in thread
From: ville.syrjala @ 2013-02-15 15:07 UTC (permalink / raw)
  To: intel-gfx

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

GPU reset will drop all flips that are still in the ring. So after the
reset, call update_plane() for all CRTCs to make sure the primary
planes are scanning out from the correct buffer.

The base address update will also generate a FLIP_DONE interrupt, which
will complete any pending flips. That means user space will get its
page flip events and won't get stuck waiting for them.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c      |  2 ++
 drivers/gpu/drm/i915/intel_display.c | 26 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h     |  2 ++
 3 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 672816a..873552d 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -917,6 +917,8 @@ static void i915_error_work_func(struct work_struct *work)
 
 		wake_up_all(&dev_priv->pending_flip_queue);
 
+		intel_display_handle_reset(dev);
+
 		wake_up_all(&dev_priv->gpu_error.reset_queue);
 	}
 }
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index f0c6416..9838da1 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2218,6 +2218,32 @@ intel_pipe_set_base_atomic(struct drm_crtc *crtc, struct drm_framebuffer *fb,
 	return dev_priv->display.update_plane(crtc, fb, x, y);
 }
 
+void intel_display_handle_reset(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_crtc *crtc;
+
+	/*
+	 * Flips in the rings have been nuked by the reset,
+	 * so just update the base address of all primary
+	 * planes to the the last fb to make sure we're
+	 * showing the correct fb after a reset.
+	 *
+	 * As a nice side effect this also generates a FLIP_DONE
+	 * interrupt which will complete pending page flips, and
+	 * thus user space will get its events and not get stuck.
+	 */
+
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+		mutex_lock(&crtc->mutex);
+
+		if (to_intel_crtc(crtc)->active)
+			dev_priv->display.update_plane(crtc, crtc->fb, crtc->x, crtc->y);
+
+		mutex_unlock(&crtc->mutex);
+	}
+}
+
 static int
 intel_finish_fb(struct drm_framebuffer *old_fb)
 {
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index d282052..0dc14c2 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -694,4 +694,6 @@ extern bool
 intel_ddi_connector_get_hw_state(struct intel_connector *intel_connector);
 extern void intel_ddi_fdi_disable(struct drm_crtc *crtc);
 
+extern void intel_display_handle_reset(struct drm_device *dev);
+
 #endif /* __INTEL_DRV_H__ */
-- 
1.7.12.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] drm/i915: Update primary planes after a GPU reset
  2013-02-15 15:07 ` [PATCH 3/3] drm/i915: Update primary planes after a GPU reset ville.syrjala
@ 2013-02-15 15:28   ` Chris Wilson
  2013-02-15 16:56     ` Ville Syrjälä
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2013-02-15 15:28 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx

On Fri, Feb 15, 2013 at 05:07:46PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> GPU reset will drop all flips that are still in the ring. So after the
> reset, call update_plane() for all CRTCs to make sure the primary
> planes are scanning out from the correct buffer.
> 
> The base address update will also generate a FLIP_DONE interrupt, which
> will complete any pending flips. That means user space will get its
> page flip events and won't get stuck waiting for them.

Not for all generations. There's no harm in explicitly finishing the
pageflip, and then the FLIP_DONE is just a normal spurious interrupt.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] drm/i915: Update primary planes after a GPU reset
  2013-02-15 15:28   ` Chris Wilson
@ 2013-02-15 16:56     ` Ville Syrjälä
  2013-02-15 23:47       ` Chris Wilson
  0 siblings, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2013-02-15 16:56 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On Fri, Feb 15, 2013 at 03:28:33PM +0000, Chris Wilson wrote:
> On Fri, Feb 15, 2013 at 05:07:46PM +0200, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > GPU reset will drop all flips that are still in the ring. So after the
> > reset, call update_plane() for all CRTCs to make sure the primary
> > planes are scanning out from the correct buffer.
> > 
> > The base address update will also generate a FLIP_DONE interrupt, which
> > will complete any pending flips. That means user space will get its
> > page flip events and won't get stuck waiting for them.
> 
> Not for all generations.

Hmm OK those seem to be the ones with the pending flip status bit
(Gen4 and older?).

But can someone explain why on those platforms we also check the
vblank interrupt status bit before handling the page flip interrupt?

Also wtf is i965_irq_handler doing? Based on the code it seems to be
expecting the pending flip interrupt to happen when the CS executes the
instruction, and then it'll finish the page flip from the following
vblank irq. BSpec doesn't agree, and still maintains that the page flip
interrupt will be generated when the flip really completes. Even if the
hardware would work the way the code suggests, the code is racy if the
ISR is delayed a bit and ends up handling the previous vblank interrupt
and the following flip pending interrupt at the same time. We'd really
need to check the live flip pending status from the ISR to make it
work right.

> There's no harm in explicitly finishing the
> pageflip, and then the FLIP_DONE is just a normal spurious interrupt.

All right. I suppose I'll need to make that change.

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] drm/i915: Update primary planes after a GPU reset
  2013-02-15 16:56     ` Ville Syrjälä
@ 2013-02-15 23:47       ` Chris Wilson
  2013-02-18 10:15         ` Ville Syrjälä
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2013-02-15 23:47 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Fri, Feb 15, 2013 at 06:56:03PM +0200, Ville Syrjälä wrote:
> On Fri, Feb 15, 2013 at 03:28:33PM +0000, Chris Wilson wrote:
> > On Fri, Feb 15, 2013 at 05:07:46PM +0200, ville.syrjala@linux.intel.com wrote:
> > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > 
> > > GPU reset will drop all flips that are still in the ring. So after the
> > > reset, call update_plane() for all CRTCs to make sure the primary
> > > planes are scanning out from the correct buffer.
> > > 
> > > The base address update will also generate a FLIP_DONE interrupt, which
> > > will complete any pending flips. That means user space will get its
> > > page flip events and won't get stuck waiting for them.
> > 
> > Not for all generations.
> 
> Hmm OK those seem to be the ones with the pending flip status bit
> (Gen4 and older?).

Yes. Also notably the ones unlikely to survive the GPU reset :)

> But can someone explain why on those platforms we also check the
> vblank interrupt status bit before handling the page flip interrupt?

For those generations, we are meant to detect the transition of pending
flip status from 1 -> 0. That transition of course doesn't generate an
interrupt (rather it stops generating one), so the nearest I could come
up with was in anticipating the vblank after we saw the pending flip
status was close enough. In the case of a pending vblank raising an
interrupt just as the pending flip status is asserted, shouldn't MSI
prevent the pending flip from being asserted as we process the IIR for
the vblank. Of course not all of those chipsets even have MSI. I'm not
even sure if we can close that race.

> Also wtf is i965_irq_handler doing? Based on the code it seems to be
> expecting the pending flip interrupt to happen when the CS executes the
> instruction, and then it'll finish the page flip from the following
> vblank irq. BSpec doesn't agree, and still maintains that the page flip
> interrupt will be generated when the flip really completes. Even if the
> hardware would work the way the code suggests, the code is racy if the
> ISR is delayed a bit and ends up handling the previous vblank interrupt
> and the following flip pending interrupt at the same time. We'd really
> need to check the live flip pending status from the ISR to make it
> work right.

Just be sure that you are reading the bspec for the original gen4 that
has not been clobbered by later refinements. If it stands that we can
trust the flip done interrupt for gen4, then that simplifies that
routine greatly. And could even close the mystery pageflipping is broken
on gen4 bugs.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] drm/i915: Wake up pending_flip_queue as part of reset handling
  2013-02-15 15:07 ` [PATCH v2 1/3] drm/i915: Wake up pending_flip_queue as part of reset handling ville.syrjala
@ 2013-02-15 23:53   ` Chris Wilson
  2013-02-18  9:58     ` Ville Syrjälä
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2013-02-15 23:53 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx

On Fri, Feb 15, 2013 at 05:07:44PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Someone may be waiting for a flip that will never complete due to a GPU
> reset. Wake up all such waiters after the GPU reset processing has
> finished.
> 
> v2: Dropped the wake_up_all() from i915_handle_error() since
>     we no longer wait for pending flips with struct_mutex held.

Isn't the wake_up(pending_flip_queue) superseded by performing the
explicit do_intel_finish_page_flip() in patch 3?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] drm/i915: Wake up pending_flip_queue as part of reset handling
  2013-02-15 23:53   ` Chris Wilson
@ 2013-02-18  9:58     ` Ville Syrjälä
  2013-02-18 12:41       ` Ville Syrjälä
  0 siblings, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2013-02-18  9:58 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On Fri, Feb 15, 2013 at 11:53:14PM +0000, Chris Wilson wrote:
> On Fri, Feb 15, 2013 at 05:07:44PM +0200, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > Someone may be waiting for a flip that will never complete due to a GPU
> > reset. Wake up all such waiters after the GPU reset processing has
> > finished.
> > 
> > v2: Dropped the wake_up_all() from i915_handle_error() since
> >     we no longer wait for pending flips with struct_mutex held.
> 
> Isn't the wake_up(pending_flip_queue) superseded by performing the
> explicit do_intel_finish_page_flip() in patch 3?

Yes that's correct. But I actually forgot to remove the wake_up patch
from my tree when I tested this. I'll run a few more tests just to make
sure it still works.

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] drm/i915: Update primary planes after a GPU reset
  2013-02-15 23:47       ` Chris Wilson
@ 2013-02-18 10:15         ` Ville Syrjälä
  2013-02-18 10:47           ` Chris Wilson
  0 siblings, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2013-02-18 10:15 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On Fri, Feb 15, 2013 at 11:47:52PM +0000, Chris Wilson wrote:
> On Fri, Feb 15, 2013 at 06:56:03PM +0200, Ville Syrjälä wrote:
> > On Fri, Feb 15, 2013 at 03:28:33PM +0000, Chris Wilson wrote:
> > > On Fri, Feb 15, 2013 at 05:07:46PM +0200, ville.syrjala@linux.intel.com wrote:
> > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > 
> > > > GPU reset will drop all flips that are still in the ring. So after the
> > > > reset, call update_plane() for all CRTCs to make sure the primary
> > > > planes are scanning out from the correct buffer.
> > > > 
> > > > The base address update will also generate a FLIP_DONE interrupt, which
> > > > will complete any pending flips. That means user space will get its
> > > > page flip events and won't get stuck waiting for them.
> > > 
> > > Not for all generations.
> > 
> > Hmm OK those seem to be the ones with the pending flip status bit
> > (Gen4 and older?).
> 
> Yes. Also notably the ones unlikely to survive the GPU reset :)
> 
> > But can someone explain why on those platforms we also check the
> > vblank interrupt status bit before handling the page flip interrupt?
> 
> For those generations, we are meant to detect the transition of pending
> flip status from 1 -> 0. That transition of course doesn't generate an
> interrupt (rather it stops generating one), so the nearest I could come
> up with was in anticipating the vblank after we saw the pending flip
> status was close enough. In the case of a pending vblank raising an
> interrupt just as the pending flip status is asserted, shouldn't MSI
> prevent the pending flip from being asserted as we process the IIR for
> the vblank. Of course not all of those chipsets even have MSI. I'm not
> even sure if we can close that race.

This is what I *think* gen2-gen4 Bspec are telling me:

1. CS executes MI_DISPLAY_FLIP
   -> ISR:flip bit 0 -> 1
2. Flip completes
   -> ISR:flip bit 1 -> 0
   -> IIR:flip bit 0 -> 1
   -> interrupt generated

But unfortuntely I have no hardware to test that.


But if I'm reading the BSpec incorrectly (or if it's inaccurate), then
I think we can close the race with this code:

i965_irq_handler()
{
...
	if (iir & I915_DISPLAY_PLANE_A_FLIP_PENDING_INTERRUPT)
		intel_prepare_page_flip();

	if (iir & PIPE_START_VBLANK_INTERRUPT_STATUS &&
	    (I915_READ(ISR) & I915_DISPLAY_PLANE_A_FLIP_PENDING_INTERRUPT) == 0)
		intel_finish_page_flip();
...
}

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] drm/i915: Update primary planes after a GPU reset
  2013-02-18 10:15         ` Ville Syrjälä
@ 2013-02-18 10:47           ` Chris Wilson
  2013-02-18 11:50             ` Ville Syrjälä
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2013-02-18 10:47 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Mon, Feb 18, 2013 at 12:15:34PM +0200, Ville Syrjälä wrote:
> On Fri, Feb 15, 2013 at 11:47:52PM +0000, Chris Wilson wrote:
> > On Fri, Feb 15, 2013 at 06:56:03PM +0200, Ville Syrjälä wrote:
> > > On Fri, Feb 15, 2013 at 03:28:33PM +0000, Chris Wilson wrote:
> > > > On Fri, Feb 15, 2013 at 05:07:46PM +0200, ville.syrjala@linux.intel.com wrote:
> > > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > > 
> > > > > GPU reset will drop all flips that are still in the ring. So after the
> > > > > reset, call update_plane() for all CRTCs to make sure the primary
> > > > > planes are scanning out from the correct buffer.
> > > > > 
> > > > > The base address update will also generate a FLIP_DONE interrupt, which
> > > > > will complete any pending flips. That means user space will get its
> > > > > page flip events and won't get stuck waiting for them.
> > > > 
> > > > Not for all generations.
> > > 
> > > Hmm OK those seem to be the ones with the pending flip status bit
> > > (Gen4 and older?).
> > 
> > Yes. Also notably the ones unlikely to survive the GPU reset :)
> > 
> > > But can someone explain why on those platforms we also check the
> > > vblank interrupt status bit before handling the page flip interrupt?
> > 
> > For those generations, we are meant to detect the transition of pending
> > flip status from 1 -> 0. That transition of course doesn't generate an
> > interrupt (rather it stops generating one), so the nearest I could come
> > up with was in anticipating the vblank after we saw the pending flip
> > status was close enough. In the case of a pending vblank raising an
> > interrupt just as the pending flip status is asserted, shouldn't MSI
> > prevent the pending flip from being asserted as we process the IIR for
> > the vblank. Of course not all of those chipsets even have MSI. I'm not
> > even sure if we can close that race.
> 
> This is what I *think* gen2-gen4 Bspec are telling me:
> 
> 1. CS executes MI_DISPLAY_FLIP
>    -> ISR:flip bit 0 -> 1
> 2. Flip completes
>    -> ISR:flip bit 1 -> 0
>    -> IIR:flip bit 0 -> 1
>    -> interrupt generated

Ah, no. The IIR mirrors the ISR and it really does continually generate
the PendingFlip interrupt on gen2/3.

It wasn't until midway through the gen3 cycle, that they introduced the
behaviour you describe (which is what the "pending flip is actually flip
done" bit in ECOSKPD is about). But since we already have to try
and handle gen2/3, I didn't think it was worth a second gen3 code path
with the revised behaviour.

Can you send that as a patch for gen4? As I have a few people hitting
pageflip system hangs, but only on gen4 afaict.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] drm/i915: Update primary planes after a GPU reset
  2013-02-18 10:47           ` Chris Wilson
@ 2013-02-18 11:50             ` Ville Syrjälä
  2013-02-18 12:08               ` Chris Wilson
  0 siblings, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2013-02-18 11:50 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On Mon, Feb 18, 2013 at 10:47:37AM +0000, Chris Wilson wrote:
> On Mon, Feb 18, 2013 at 12:15:34PM +0200, Ville Syrjälä wrote:
> > On Fri, Feb 15, 2013 at 11:47:52PM +0000, Chris Wilson wrote:
> > > On Fri, Feb 15, 2013 at 06:56:03PM +0200, Ville Syrjälä wrote:
> > > > On Fri, Feb 15, 2013 at 03:28:33PM +0000, Chris Wilson wrote:
> > > > > On Fri, Feb 15, 2013 at 05:07:46PM +0200, ville.syrjala@linux.intel.com wrote:
> > > > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > > > 
> > > > > > GPU reset will drop all flips that are still in the ring. So after the
> > > > > > reset, call update_plane() for all CRTCs to make sure the primary
> > > > > > planes are scanning out from the correct buffer.
> > > > > > 
> > > > > > The base address update will also generate a FLIP_DONE interrupt, which
> > > > > > will complete any pending flips. That means user space will get its
> > > > > > page flip events and won't get stuck waiting for them.
> > > > > 
> > > > > Not for all generations.
> > > > 
> > > > Hmm OK those seem to be the ones with the pending flip status bit
> > > > (Gen4 and older?).
> > > 
> > > Yes. Also notably the ones unlikely to survive the GPU reset :)
> > > 
> > > > But can someone explain why on those platforms we also check the
> > > > vblank interrupt status bit before handling the page flip interrupt?
> > > 
> > > For those generations, we are meant to detect the transition of pending
> > > flip status from 1 -> 0. That transition of course doesn't generate an
> > > interrupt (rather it stops generating one), so the nearest I could come
> > > up with was in anticipating the vblank after we saw the pending flip
> > > status was close enough. In the case of a pending vblank raising an
> > > interrupt just as the pending flip status is asserted, shouldn't MSI
> > > prevent the pending flip from being asserted as we process the IIR for
> > > the vblank. Of course not all of those chipsets even have MSI. I'm not
> > > even sure if we can close that race.
> > 
> > This is what I *think* gen2-gen4 Bspec are telling me:
> > 
> > 1. CS executes MI_DISPLAY_FLIP
> >    -> ISR:flip bit 0 -> 1
> > 2. Flip completes
> >    -> ISR:flip bit 1 -> 0
> >    -> IIR:flip bit 0 -> 1
> >    -> interrupt generated
> 
> Ah, no. The IIR mirrors the ISR and it really does continually generate
> the PendingFlip interrupt on gen2/3.

Continually? Oh that's nasty, but makes sense if it's just a level
interrupt.

> It wasn't until midway through the gen3 cycle, that they introduced the
> behaviour you describe (which is what the "pending flip is actually flip
> done" bit in ECOSKPD is about). But since we already have to try
> and handle gen2/3, I didn't think it was worth a second gen3 code path
> with the revised behaviour.

Ok. I see we never actually enable the pending flip interrupt on gen2/3
due to this. The same race exists there as well then. I can try to make
a patch for that too.

And we turn off the bit in ECOSKPD on gen3 to make sure we always behave
the same way. Makes sense.

> Can you send that as a patch for gen4? As I have a few people hitting
> pageflip system hangs, but only on gen4 afaict.

I must admit that I'm confused now. Which model does Gen4 actually
follow, pending flip or flip done?

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] drm/i915: Update primary planes after a GPU reset
  2013-02-18 11:50             ` Ville Syrjälä
@ 2013-02-18 12:08               ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2013-02-18 12:08 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Mon, Feb 18, 2013 at 01:50:26PM +0200, Ville Syrjälä wrote:
> > Can you send that as a patch for gen4? As I have a few people hitting
> > pageflip system hangs, but only on gen4 afaict.
> 
> I must admit that I'm confused now. Which model does Gen4 actually
> follow, pending flip or flip done?

The original bspec for gen4 doesn't mention FlipDone and the description
of PendingFlip reads very similar to the earlier gen2/gen3
implementation.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] drm/i915: Wake up pending_flip_queue as part of reset handling
  2013-02-18  9:58     ` Ville Syrjälä
@ 2013-02-18 12:41       ` Ville Syrjälä
  2013-02-18 15:31         ` Ville Syrjälä
  0 siblings, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2013-02-18 12:41 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On Mon, Feb 18, 2013 at 11:58:23AM +0200, Ville Syrjälä wrote:
> On Fri, Feb 15, 2013 at 11:53:14PM +0000, Chris Wilson wrote:
> > On Fri, Feb 15, 2013 at 05:07:44PM +0200, ville.syrjala@linux.intel.com wrote:
> > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > 
> > > Someone may be waiting for a flip that will never complete due to a GPU
> > > reset. Wake up all such waiters after the GPU reset processing has
> > > finished.
> > > 
> > > v2: Dropped the wake_up_all() from i915_handle_error() since
> > >     we no longer wait for pending flips with struct_mutex held.
> > 
> > Isn't the wake_up(pending_flip_queue) superseded by performing the
> > explicit do_intel_finish_page_flip() in patch 3?
> 
> Yes that's correct. But I actually forgot to remove the wake_up patch
> from my tree when I tested this. I'll run a few more tests just to make
> sure it still works.

I just tried it w/o the wake_up_all() and unfortunately it hung :(

Need to think about it a bit more I suppose.

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] drm/i915: Wake up pending_flip_queue as part of reset handling
  2013-02-18 12:41       ` Ville Syrjälä
@ 2013-02-18 15:31         ` Ville Syrjälä
  2013-02-18 16:39           ` Daniel Vetter
  0 siblings, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2013-02-18 15:31 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On Mon, Feb 18, 2013 at 02:41:00PM +0200, Ville Syrjälä wrote:
> On Mon, Feb 18, 2013 at 11:58:23AM +0200, Ville Syrjälä wrote:
> > On Fri, Feb 15, 2013 at 11:53:14PM +0000, Chris Wilson wrote:
> > > On Fri, Feb 15, 2013 at 05:07:44PM +0200, ville.syrjala@linux.intel.com wrote:
> > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > 
> > > > Someone may be waiting for a flip that will never complete due to a GPU
> > > > reset. Wake up all such waiters after the GPU reset processing has
> > > > finished.
> > > > 
> > > > v2: Dropped the wake_up_all() from i915_handle_error() since
> > > >     we no longer wait for pending flips with struct_mutex held.
> > > 
> > > Isn't the wake_up(pending_flip_queue) superseded by performing the
> > > explicit do_intel_finish_page_flip() in patch 3?
> > 
> > Yes that's correct. But I actually forgot to remove the wake_up patch
> > from my tree when I tested this. I'll run a few more tests just to make
> > sure it still works.
> 
> I just tried it w/o the wake_up_all() and unfortunately it hung :(
> 
> Need to think about it a bit more I suppose.

Well after a bit more though I think I know what happened:

1. page flip submitted on crtc which is not the first crtc in the list
2. mode set operation on any crtc (will grab all crtc locks)
3. reset handling loops over crtcs
 3.1 first crtc doesn't have a pending flip, so pending_flip_queue is not woken up
 3.2 code tries to grab first crtc mutex to update the base address
  -> deadlock

So either I need to keep the wake_up_all() call in the reset handling,
or I need to do two loops over the crtcs, first loop would complete all
page flips, and second loop would update the base address registers.
I don't really care which way we go. Anyone else have a preference?

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] drm/i915: Wake up pending_flip_queue as part of reset handling
  2013-02-18 15:31         ` Ville Syrjälä
@ 2013-02-18 16:39           ` Daniel Vetter
  0 siblings, 0 replies; 16+ messages in thread
From: Daniel Vetter @ 2013-02-18 16:39 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Mon, Feb 18, 2013 at 05:31:14PM +0200, Ville Syrjälä wrote:
> On Mon, Feb 18, 2013 at 02:41:00PM +0200, Ville Syrjälä wrote:
> > On Mon, Feb 18, 2013 at 11:58:23AM +0200, Ville Syrjälä wrote:
> > > On Fri, Feb 15, 2013 at 11:53:14PM +0000, Chris Wilson wrote:
> > > > On Fri, Feb 15, 2013 at 05:07:44PM +0200, ville.syrjala@linux.intel.com wrote:
> > > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > > 
> > > > > Someone may be waiting for a flip that will never complete due to a GPU
> > > > > reset. Wake up all such waiters after the GPU reset processing has
> > > > > finished.
> > > > > 
> > > > > v2: Dropped the wake_up_all() from i915_handle_error() since
> > > > >     we no longer wait for pending flips with struct_mutex held.
> > > > 
> > > > Isn't the wake_up(pending_flip_queue) superseded by performing the
> > > > explicit do_intel_finish_page_flip() in patch 3?
> > > 
> > > Yes that's correct. But I actually forgot to remove the wake_up patch
> > > from my tree when I tested this. I'll run a few more tests just to make
> > > sure it still works.
> > 
> > I just tried it w/o the wake_up_all() and unfortunately it hung :(
> > 
> > Need to think about it a bit more I suppose.
> 
> Well after a bit more though I think I know what happened:
> 
> 1. page flip submitted on crtc which is not the first crtc in the list
> 2. mode set operation on any crtc (will grab all crtc locks)
> 3. reset handling loops over crtcs
>  3.1 first crtc doesn't have a pending flip, so pending_flip_queue is not woken up
>  3.2 code tries to grab first crtc mutex to update the base address
>   -> deadlock
> 
> So either I need to keep the wake_up_all() call in the reset handling,
> or I need to do two loops over the crtcs, first loop would complete all
> page flips, and second loop would update the base address registers.
> I don't really care which way we go. Anyone else have a preference?

I'd vote for the second approach of first waking everyone up, and then
doing the modeset updates. I guess you could even use
drm_modeset_lock_all, not really worth optimizing things here imo. Also
please add a comment to explain this ordering constraint in the reset
code.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2013-02-18 16:37 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-15 15:07 [PATCH 0/3] drm/i915: More page flip vs. reset improvements ville.syrjala
2013-02-15 15:07 ` [PATCH v2 1/3] drm/i915: Wake up pending_flip_queue as part of reset handling ville.syrjala
2013-02-15 23:53   ` Chris Wilson
2013-02-18  9:58     ` Ville Syrjälä
2013-02-18 12:41       ` Ville Syrjälä
2013-02-18 15:31         ` Ville Syrjälä
2013-02-18 16:39           ` Daniel Vetter
2013-02-15 15:07 ` [PATCH v2 2/3] drm/i915: Really wait for pending flips when panning ville.syrjala
2013-02-15 15:07 ` [PATCH 3/3] drm/i915: Update primary planes after a GPU reset ville.syrjala
2013-02-15 15:28   ` Chris Wilson
2013-02-15 16:56     ` Ville Syrjälä
2013-02-15 23:47       ` Chris Wilson
2013-02-18 10:15         ` Ville Syrjälä
2013-02-18 10:47           ` Chris Wilson
2013-02-18 11:50             ` Ville Syrjälä
2013-02-18 12:08               ` Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.