All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] drm/i915/guc: Fix request re-submission after reset
  2017-03-09 10:05 [PATCH] drm/i915/guc: Fix request re-submission after reset Tvrtko Ursulin
@ 2017-03-09  8:42 ` Oscar Mateo
  2017-03-09 16:50   ` Tvrtko Ursulin
  2017-03-09 10:19 ` Chris Wilson
  2017-03-09 10:47 ` ✓ Fi.CI.BAT: success for " Patchwork
  2 siblings, 1 reply; 11+ messages in thread
From: Oscar Mateo @ 2017-03-09  8:42 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx



On 03/09/2017 02:05 AM, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> In order to ensure no missed interrupts we must first re-direct
> the interrupts to GuC, and only then re-submit the requests to
> be replayed after a GPU reset. Otherwise context switch can fire
> before GuC has been set up to receive it triggering more hangs.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 58 +++++++++++++++++++++++++++---
>   drivers/gpu/drm/i915/intel_guc_loader.c    | 47 ------------------------
>   2 files changed, 54 insertions(+), 51 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index beb38e30d0e9..5b8ec0fab563 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -936,6 +936,52 @@ static void guc_reset_wq(struct i915_guc_client *client)
>   	client->wq_tail = 0;
>   }
>   
<SNIP>
>   int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>   {
>   	struct intel_guc *guc = &dev_priv->guc;
> @@ -953,13 +999,17 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>   
>   	/* Take over from manual control of ELSP (execlists) */
>   	for_each_engine(engine, dev_priv, id) {
> -		const int wqi_size = sizeof(struct guc_wq_item);
> -		struct drm_i915_gem_request *rq;
> -
>   		engine->submit_request = i915_guc_submit;
>   		engine->schedule = NULL;
> +	}
> +
> +	guc_interrupts_capture(dev_priv);
> +
> +	/* Replay the current set of previously submitted requests */
> +	for_each_engine(engine, dev_priv, id) {
> +		const int wqi_size = sizeof(struct guc_wq_item);
> +		struct drm_i915_gem_request *rq;
>   
Don't you want to move the guc_interrupts_release to 
i915_guc_submission_disable as well?

With that, Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] drm/i915/guc: Fix request re-submission after reset
  2017-03-09 16:50   ` Tvrtko Ursulin
@ 2017-03-09  8:55     ` Oscar Mateo
  2017-03-09 17:02       ` Tvrtko Ursulin
  0 siblings, 1 reply; 11+ messages in thread
From: Oscar Mateo @ 2017-03-09  8:55 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx



On 03/09/2017 08:50 AM, Tvrtko Ursulin wrote:
>
> On 09/03/2017 08:42, Oscar Mateo wrote:
>> On 03/09/2017 02:05 AM, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>> In order to ensure no missed interrupts we must first re-direct
>>> the interrupts to GuC, and only then re-submit the requests to
>>> be replayed after a GPU reset. Otherwise context switch can fire
>>> before GuC has been set up to receive it triggering more hangs.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>> Cc: Oscar Mateo <oscar.mateo@intel.com>
>>> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_guc_submission.c | 58
>>> +++++++++++++++++++++++++++---
>>>   drivers/gpu/drm/i915/intel_guc_loader.c    | 47
>>> ------------------------
>>>   2 files changed, 54 insertions(+), 51 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> index beb38e30d0e9..5b8ec0fab563 100644
>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> @@ -936,6 +936,52 @@ static void guc_reset_wq(struct i915_guc_client
>>> *client)
>>>       client->wq_tail = 0;
>>>   }
>>>
>> <SNIP>
>>>   int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>>>   {
>>>       struct intel_guc *guc = &dev_priv->guc;
>>> @@ -953,13 +999,17 @@ int i915_guc_submission_enable(struct
>>> drm_i915_private *dev_priv)
>>>         /* Take over from manual control of ELSP (execlists) */
>>>       for_each_engine(engine, dev_priv, id) {
>>> -        const int wqi_size = sizeof(struct guc_wq_item);
>>> -        struct drm_i915_gem_request *rq;
>>> -
>>>           engine->submit_request = i915_guc_submit;
>>>           engine->schedule = NULL;
>>> +    }
>>> +
>>> +    guc_interrupts_capture(dev_priv);
>>> +
>>> +    /* Replay the current set of previously submitted requests */
>>> +    for_each_engine(engine, dev_priv, id) {
>>> +        const int wqi_size = sizeof(struct guc_wq_item);
>>> +        struct drm_i915_gem_request *rq;
>>>
>> Don't you want to move the guc_interrupts_release to
>> i915_guc_submission_disable as well?
>
> I can't spot anything broken in that path. We never go in that 
> direction with the live submission so why do you think it is needed?
>
> Regards,
>
> Tvrtko
Just code symmetry: if we are leaving i915_guc_submission_enable to 
capture the interrupts, why doesn't the disable also releases them? 
Maybe it's not very important now, but it makes a lot more sense with my 
series to do proper unwinding of the whole path (I can tackle it there 
if you prefer).
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] drm/i915/guc: Fix request re-submission after reset
@ 2017-03-09 10:05 Tvrtko Ursulin
  2017-03-09  8:42 ` Oscar Mateo
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Tvrtko Ursulin @ 2017-03-09 10:05 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

In order to ensure no missed interrupts we must first re-direct
the interrupts to GuC, and only then re-submit the requests to
be replayed after a GPU reset. Otherwise context switch can fire
before GuC has been set up to receive it triggering more hangs.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 58 +++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/intel_guc_loader.c    | 47 ------------------------
 2 files changed, 54 insertions(+), 51 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index beb38e30d0e9..5b8ec0fab563 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -936,6 +936,52 @@ static void guc_reset_wq(struct i915_guc_client *client)
 	client->wq_tail = 0;
 }
 
+static void guc_interrupts_capture(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int irqs;
+	u32 tmp;
+
+	/* tell all command streamers to forward interrupts (but not vblank) to GuC */
+	irqs = _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
+	for_each_engine(engine, dev_priv, id)
+		I915_WRITE(RING_MODE_GEN7(engine), irqs);
+
+	/* route USER_INTERRUPT to Host, all others are sent to GuC. */
+	irqs = GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
+	       GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
+	/* These three registers have the same bit definitions */
+	I915_WRITE(GUC_BCS_RCS_IER, ~irqs);
+	I915_WRITE(GUC_VCS2_VCS1_IER, ~irqs);
+	I915_WRITE(GUC_WD_VECS_IER, ~irqs);
+
+	/*
+	 * The REDIRECT_TO_GUC bit of the PMINTRMSK register directs all
+	 * (unmasked) PM interrupts to the GuC. All other bits of this
+	 * register *disable* generation of a specific interrupt.
+	 *
+	 * 'pm_intr_keep' indicates bits that are NOT to be set when
+	 * writing to the PM interrupt mask register, i.e. interrupts
+	 * that must not be disabled.
+	 *
+	 * If the GuC is handling these interrupts, then we must not let
+	 * the PM code disable ANY interrupt that the GuC is expecting.
+	 * So for each ENABLED (0) bit in this register, we must SET the
+	 * bit in pm_intr_keep so that it's left enabled for the GuC.
+	 *
+	 * OTOH the REDIRECT_TO_GUC bit is initially SET in pm_intr_keep
+	 * (so interrupts go to the DISPLAY unit at first); but here we
+	 * need to CLEAR that bit, which will result in the register bit
+	 * being left SET!
+	 */
+	tmp = I915_READ(GEN6_PMINTRMSK);
+	if (tmp & GEN8_PMINTR_REDIRECT_TO_GUC) {
+		dev_priv->rps.pm_intr_keep |= ~tmp;
+		dev_priv->rps.pm_intr_keep &= ~GEN8_PMINTR_REDIRECT_TO_GUC;
+	}
+}
+
 int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 {
 	struct intel_guc *guc = &dev_priv->guc;
@@ -953,13 +999,17 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
 
 	/* Take over from manual control of ELSP (execlists) */
 	for_each_engine(engine, dev_priv, id) {
-		const int wqi_size = sizeof(struct guc_wq_item);
-		struct drm_i915_gem_request *rq;
-
 		engine->submit_request = i915_guc_submit;
 		engine->schedule = NULL;
+	}
+
+	guc_interrupts_capture(dev_priv);
+
+	/* Replay the current set of previously submitted requests */
+	for_each_engine(engine, dev_priv, id) {
+		const int wqi_size = sizeof(struct guc_wq_item);
+		struct drm_i915_gem_request *rq;
 
-		/* Replay the current set of previously submitted requests */
 		spin_lock_irq(&engine->timeline->lock);
 		list_for_each_entry(rq, &engine->timeline->requests, link) {
 			guc_client_update_wq_rsvd(client, wqi_size);
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 9885f760f2ef..2e24712cf3ee 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -109,52 +109,6 @@ static void guc_interrupts_release(struct drm_i915_private *dev_priv)
 	I915_WRITE(GUC_WD_VECS_IER, 0);
 }
 
-static void guc_interrupts_capture(struct drm_i915_private *dev_priv)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int irqs;
-	u32 tmp;
-
-	/* tell all command streamers to forward interrupts (but not vblank) to GuC */
-	irqs = _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
-	for_each_engine(engine, dev_priv, id)
-		I915_WRITE(RING_MODE_GEN7(engine), irqs);
-
-	/* route USER_INTERRUPT to Host, all others are sent to GuC. */
-	irqs = GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
-	       GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
-	/* These three registers have the same bit definitions */
-	I915_WRITE(GUC_BCS_RCS_IER, ~irqs);
-	I915_WRITE(GUC_VCS2_VCS1_IER, ~irqs);
-	I915_WRITE(GUC_WD_VECS_IER, ~irqs);
-
-	/*
-	 * The REDIRECT_TO_GUC bit of the PMINTRMSK register directs all
-	 * (unmasked) PM interrupts to the GuC. All other bits of this
-	 * register *disable* generation of a specific interrupt.
-	 *
-	 * 'pm_intr_keep' indicates bits that are NOT to be set when
-	 * writing to the PM interrupt mask register, i.e. interrupts
-	 * that must not be disabled.
-	 *
-	 * If the GuC is handling these interrupts, then we must not let
-	 * the PM code disable ANY interrupt that the GuC is expecting.
-	 * So for each ENABLED (0) bit in this register, we must SET the
-	 * bit in pm_intr_keep so that it's left enabled for the GuC.
-	 *
-	 * OTOH the REDIRECT_TO_GUC bit is initially SET in pm_intr_keep
-	 * (so interrupts go to the DISPLAY unit at first); but here we
-	 * need to CLEAR that bit, which will result in the register bit
-	 * being left SET!
-	 */
-	tmp = I915_READ(GEN6_PMINTRMSK);
-	if (tmp & GEN8_PMINTR_REDIRECT_TO_GUC) {
-		dev_priv->rps.pm_intr_keep |= ~tmp;
-		dev_priv->rps.pm_intr_keep &= ~GEN8_PMINTR_REDIRECT_TO_GUC;
-	}
-}
-
 static u32 get_gttype(struct drm_i915_private *dev_priv)
 {
 	/* XXX: GT type based on PCI device ID? field seems unused by fw */
@@ -529,7 +483,6 @@ int intel_guc_setup(struct drm_i915_private *dev_priv)
 		err = i915_guc_submission_enable(dev_priv);
 		if (err)
 			goto fail;
-		guc_interrupts_capture(dev_priv);
 	}
 
 	DRM_INFO("GuC %s (firmware %s [version %u.%u])\n",
-- 
2.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] drm/i915/guc: Fix request re-submission after reset
  2017-03-09 10:05 [PATCH] drm/i915/guc: Fix request re-submission after reset Tvrtko Ursulin
  2017-03-09  8:42 ` Oscar Mateo
@ 2017-03-09 10:19 ` Chris Wilson
  2017-03-09 12:06   ` Chris Wilson
  2017-03-09 10:47 ` ✓ Fi.CI.BAT: success for " Patchwork
  2 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2017-03-09 10:19 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Thu, Mar 09, 2017 at 10:05:21AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> In order to ensure no missed interrupts we must first re-direct
> the interrupts to GuC, and only then re-submit the requests to
> be replayed after a GPU reset. Otherwise context switch can fire
> before GuC has been set up to receive it triggering more hangs.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 58 +++++++++++++++++++++++++++---
>  drivers/gpu/drm/i915/intel_guc_loader.c    | 47 ------------------------
>  2 files changed, 54 insertions(+), 51 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index beb38e30d0e9..5b8ec0fab563 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -936,6 +936,52 @@ static void guc_reset_wq(struct i915_guc_client *client)
>  	client->wq_tail = 0;
>  }
>  
> +static void guc_interrupts_capture(struct drm_i915_private *dev_priv)
> +{
> +	struct intel_engine_cs *engine;
> +	enum intel_engine_id id;
> +	int irqs;
> +	u32 tmp;
> +
> +	/* tell all command streamers to forward interrupts (but not vblank) to GuC */
> +	irqs = _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
> +	for_each_engine(engine, dev_priv, id)
> +		I915_WRITE(RING_MODE_GEN7(engine), irqs);
> +
> +	/* route USER_INTERRUPT to Host, all others are sent to GuC. */
> +	irqs = GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
> +	       GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
> +	/* These three registers have the same bit definitions */
> +	I915_WRITE(GUC_BCS_RCS_IER, ~irqs);
> +	I915_WRITE(GUC_VCS2_VCS1_IER, ~irqs);
> +	I915_WRITE(GUC_WD_VECS_IER, ~irqs);
> +
> +	/*
> +	 * The REDIRECT_TO_GUC bit of the PMINTRMSK register directs all
> +	 * (unmasked) PM interrupts to the GuC. All other bits of this
> +	 * register *disable* generation of a specific interrupt.
> +	 *
> +	 * 'pm_intr_keep' indicates bits that are NOT to be set when
> +	 * writing to the PM interrupt mask register, i.e. interrupts
> +	 * that must not be disabled.
> +	 *
> +	 * If the GuC is handling these interrupts, then we must not let
> +	 * the PM code disable ANY interrupt that the GuC is expecting.
> +	 * So for each ENABLED (0) bit in this register, we must SET the
> +	 * bit in pm_intr_keep so that it's left enabled for the GuC.
> +	 *
> +	 * OTOH the REDIRECT_TO_GUC bit is initially SET in pm_intr_keep
> +	 * (so interrupts go to the DISPLAY unit at first); but here we
> +	 * need to CLEAR that bit, which will result in the register bit
> +	 * being left SET!
> +	 */
> +	tmp = I915_READ(GEN6_PMINTRMSK);
> +	if (tmp & GEN8_PMINTR_REDIRECT_TO_GUC) {
> +		dev_priv->rps.pm_intr_keep |= ~tmp;
> +		dev_priv->rps.pm_intr_keep &= ~GEN8_PMINTR_REDIRECT_TO_GUC;
> +	}

Let me just remove this code first (drm/i915: Initialize pm_intr_keep
during intel_irq_init for GuC) depending on the outcome of
https://patchwork.freedesktop.org/series/20980/.

> +}
> +
>  int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>  {
>  	struct intel_guc *guc = &dev_priv->guc;
> @@ -953,13 +999,17 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>  
>  	/* Take over from manual control of ELSP (execlists) */
>  	for_each_engine(engine, dev_priv, id) {
> -		const int wqi_size = sizeof(struct guc_wq_item);
> -		struct drm_i915_gem_request *rq;
> -
>  		engine->submit_request = i915_guc_submit;
>  		engine->schedule = NULL;
> +	}
> +
> +	guc_interrupts_capture(dev_priv);
> +
> +	/* Replay the current set of previously submitted requests */
> +	for_each_engine(engine, dev_priv, id) {
> +		const int wqi_size = sizeof(struct guc_wq_item);
> +		struct drm_i915_gem_request *rq;
>  
> -		/* Replay the current set of previously submitted requests */
>  		spin_lock_irq(&engine->timeline->lock);
>  		list_for_each_entry(rq, &engine->timeline->requests, link) {
>  			guc_client_update_wq_rsvd(client, wqi_size);

>  static u32 get_gttype(struct drm_i915_private *dev_priv)
>  {
>  	/* XXX: GT type based on PCI device ID? field seems unused by fw */
> @@ -529,7 +483,6 @@ int intel_guc_setup(struct drm_i915_private *dev_priv)
>  		err = i915_guc_submission_enable(dev_priv);
>  		if (err)
>  			goto fail;
> -		guc_interrupts_capture(dev_priv);
>  	}

Looks good to me.
R-b for v2 after Sagar :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915/guc: Fix request re-submission after reset
  2017-03-09 10:05 [PATCH] drm/i915/guc: Fix request re-submission after reset Tvrtko Ursulin
  2017-03-09  8:42 ` Oscar Mateo
  2017-03-09 10:19 ` Chris Wilson
@ 2017-03-09 10:47 ` Patchwork
  2 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2017-03-09 10:47 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/guc: Fix request re-submission after reset
URL   : https://patchwork.freedesktop.org/series/20979/
State : success

== Summary ==

Series 20979v1 drm/i915/guc: Fix request re-submission after reset
https://patchwork.freedesktop.org/api/1.0/series/20979/revisions/1/mbox/

Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-uc:
                fail       -> PASS       (fi-snb-2600) fdo#100007
        Subgroup basic-uc-ro-default:
                incomplete -> PASS       (fi-skl-6770hq) fdo#100130
                incomplete -> PASS       (fi-skl-6700hq) fdo#100130
Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                dmesg-warn -> PASS       (fi-bxt-t5700) fdo#100125

fdo#100007 https://bugs.freedesktop.org/show_bug.cgi?id=100007
fdo#100130 https://bugs.freedesktop.org/show_bug.cgi?id=100130
fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time: 462s
fi-bsw-n3050     total:278  pass:239  dwarn:0   dfail:0   fail:0   skip:39  time: 608s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time: 536s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time: 609s
fi-byt-j1900     total:278  pass:251  dwarn:0   dfail:0   fail:0   skip:27  time: 503s
fi-byt-n2820     total:278  pass:247  dwarn:0   dfail:0   fail:0   skip:31  time: 498s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time: 435s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time: 432s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time: 439s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time: 509s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time: 477s
fi-kbl-7500u     total:278  pass:259  dwarn:1   dfail:0   fail:0   skip:18  time: 481s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time: 506s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time: 592s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time: 501s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time: 549s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time: 561s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time: 425s

8a00a7aeee24329099e5fd23d115e794708e34c3 drm-tip: 2017y-03m-09d-09h-10m-04s UTC integration manifest
e48fc3f drm/i915/guc: Fix request re-submission after reset

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4109/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] drm/i915/guc: Fix request re-submission after reset
  2017-03-09 10:19 ` Chris Wilson
@ 2017-03-09 12:06   ` Chris Wilson
  2017-03-09 12:38     ` Chris Wilson
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2017-03-09 12:06 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx, Tvrtko Ursulin, Michal Wajdeczko,
	Mika Kuoppala, Oscar Mateo, Sagar Arun Kamble, Arkadiusz Hiler

On Thu, Mar 09, 2017 at 10:19:10AM +0000, Chris Wilson wrote:
> On Thu, Mar 09, 2017 at 10:05:21AM +0000, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > 
> > In order to ensure no missed interrupts we must first re-direct
> > the interrupts to GuC, and only then re-submit the requests to
> > be replayed after a GPU reset. Otherwise context switch can fire
> > before GuC has been set up to receive it triggering more hangs.
> > 
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Oscar Mateo <oscar.mateo@intel.com>
> > Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> > Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_guc_submission.c | 58 +++++++++++++++++++++++++++---
> >  drivers/gpu/drm/i915/intel_guc_loader.c    | 47 ------------------------
> >  2 files changed, 54 insertions(+), 51 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> > index beb38e30d0e9..5b8ec0fab563 100644
> > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > @@ -936,6 +936,52 @@ static void guc_reset_wq(struct i915_guc_client *client)
> >  	client->wq_tail = 0;
> >  }
> >  
> > +static void guc_interrupts_capture(struct drm_i915_private *dev_priv)
> > +{
> > +	struct intel_engine_cs *engine;
> > +	enum intel_engine_id id;
> > +	int irqs;
> > +	u32 tmp;
> > +
> > +	/* tell all command streamers to forward interrupts (but not vblank) to GuC */
> > +	irqs = _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
> > +	for_each_engine(engine, dev_priv, id)
> > +		I915_WRITE(RING_MODE_GEN7(engine), irqs);
> > +
> > +	/* route USER_INTERRUPT to Host, all others are sent to GuC. */
> > +	irqs = GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
> > +	       GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
> > +	/* These three registers have the same bit definitions */
> > +	I915_WRITE(GUC_BCS_RCS_IER, ~irqs);
> > +	I915_WRITE(GUC_VCS2_VCS1_IER, ~irqs);
> > +	I915_WRITE(GUC_WD_VECS_IER, ~irqs);
> > +
> > +	/*
> > +	 * The REDIRECT_TO_GUC bit of the PMINTRMSK register directs all
> > +	 * (unmasked) PM interrupts to the GuC. All other bits of this
> > +	 * register *disable* generation of a specific interrupt.
> > +	 *
> > +	 * 'pm_intr_keep' indicates bits that are NOT to be set when
> > +	 * writing to the PM interrupt mask register, i.e. interrupts
> > +	 * that must not be disabled.
> > +	 *
> > +	 * If the GuC is handling these interrupts, then we must not let
> > +	 * the PM code disable ANY interrupt that the GuC is expecting.
> > +	 * So for each ENABLED (0) bit in this register, we must SET the
> > +	 * bit in pm_intr_keep so that it's left enabled for the GuC.
> > +	 *
> > +	 * OTOH the REDIRECT_TO_GUC bit is initially SET in pm_intr_keep
> > +	 * (so interrupts go to the DISPLAY unit at first); but here we
> > +	 * need to CLEAR that bit, which will result in the register bit
> > +	 * being left SET!
> > +	 */
> > +	tmp = I915_READ(GEN6_PMINTRMSK);
> > +	if (tmp & GEN8_PMINTR_REDIRECT_TO_GUC) {
> > +		dev_priv->rps.pm_intr_keep |= ~tmp;
> > +		dev_priv->rps.pm_intr_keep &= ~GEN8_PMINTR_REDIRECT_TO_GUC;
> > +	}
> 
> Let me just remove this code first (drm/i915: Initialize pm_intr_keep
> during intel_irq_init for GuC) depending on the outcome of
> https://patchwork.freedesktop.org/series/20980/.

That failed, so blocker removed.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] drm/i915/guc: Fix request re-submission after reset
  2017-03-09 12:06   ` Chris Wilson
@ 2017-03-09 12:38     ` Chris Wilson
  0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2017-03-09 12:38 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx, Tvrtko Ursulin, Michal Wajdeczko,
	Mika Kuoppala, Oscar Mateo, Sagar Arun Kamble, Arkadiusz Hiler

On Thu, Mar 09, 2017 at 12:06:37PM +0000, Chris Wilson wrote:
> On Thu, Mar 09, 2017 at 10:19:10AM +0000, Chris Wilson wrote:
> > On Thu, Mar 09, 2017 at 10:05:21AM +0000, Tvrtko Ursulin wrote:
> > > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > 
> > > In order to ensure no missed interrupts we must first re-direct
> > > the interrupts to GuC, and only then re-submit the requests to
> > > be replayed after a GPU reset. Otherwise context switch can fire
> > > before GuC has been set up to receive it triggering more hangs.
> > > 
> > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > > Cc: Oscar Mateo <oscar.mateo@intel.com>
> > > Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> > > Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_guc_submission.c | 58 +++++++++++++++++++++++++++---
> > >  drivers/gpu/drm/i915/intel_guc_loader.c    | 47 ------------------------
> > >  2 files changed, 54 insertions(+), 51 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > index beb38e30d0e9..5b8ec0fab563 100644
> > > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > @@ -936,6 +936,52 @@ static void guc_reset_wq(struct i915_guc_client *client)
> > >  	client->wq_tail = 0;
> > >  }
> > >  
> > > +static void guc_interrupts_capture(struct drm_i915_private *dev_priv)
> > > +{
> > > +	struct intel_engine_cs *engine;
> > > +	enum intel_engine_id id;
> > > +	int irqs;
> > > +	u32 tmp;
> > > +
> > > +	/* tell all command streamers to forward interrupts (but not vblank) to GuC */
> > > +	irqs = _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
> > > +	for_each_engine(engine, dev_priv, id)
> > > +		I915_WRITE(RING_MODE_GEN7(engine), irqs);
> > > +
> > > +	/* route USER_INTERRUPT to Host, all others are sent to GuC. */
> > > +	irqs = GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
> > > +	       GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
> > > +	/* These three registers have the same bit definitions */
> > > +	I915_WRITE(GUC_BCS_RCS_IER, ~irqs);
> > > +	I915_WRITE(GUC_VCS2_VCS1_IER, ~irqs);
> > > +	I915_WRITE(GUC_WD_VECS_IER, ~irqs);
> > > +
> > > +	/*
> > > +	 * The REDIRECT_TO_GUC bit of the PMINTRMSK register directs all
> > > +	 * (unmasked) PM interrupts to the GuC. All other bits of this
> > > +	 * register *disable* generation of a specific interrupt.
> > > +	 *
> > > +	 * 'pm_intr_keep' indicates bits that are NOT to be set when
> > > +	 * writing to the PM interrupt mask register, i.e. interrupts
> > > +	 * that must not be disabled.
> > > +	 *
> > > +	 * If the GuC is handling these interrupts, then we must not let
> > > +	 * the PM code disable ANY interrupt that the GuC is expecting.
> > > +	 * So for each ENABLED (0) bit in this register, we must SET the
> > > +	 * bit in pm_intr_keep so that it's left enabled for the GuC.
> > > +	 *
> > > +	 * OTOH the REDIRECT_TO_GUC bit is initially SET in pm_intr_keep
> > > +	 * (so interrupts go to the DISPLAY unit at first); but here we
> > > +	 * need to CLEAR that bit, which will result in the register bit
> > > +	 * being left SET!
> > > +	 */
> > > +	tmp = I915_READ(GEN6_PMINTRMSK);
> > > +	if (tmp & GEN8_PMINTR_REDIRECT_TO_GUC) {
> > > +		dev_priv->rps.pm_intr_keep |= ~tmp;
> > > +		dev_priv->rps.pm_intr_keep &= ~GEN8_PMINTR_REDIRECT_TO_GUC;
> > > +	}
> > 
> > Let me just remove this code first (drm/i915: Initialize pm_intr_keep
> > during intel_irq_init for GuC) depending on the outcome of
> > https://patchwork.freedesktop.org/series/20980/.
> 
> That failed, so blocker removed.

And just to be really confusing, baseline had identical failure, so
pushed :-p

Removes the majority of lines from this patch :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] drm/i915/guc: Fix request re-submission after reset
  2017-03-09 20:54         ` Chris Wilson
@ 2017-03-09 14:20           ` Oscar Mateo
  0 siblings, 0 replies; 11+ messages in thread
From: Oscar Mateo @ 2017-03-09 14:20 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx



On 03/09/2017 12:54 PM, Chris Wilson wrote:
> On Thu, Mar 09, 2017 at 05:02:16PM +0000, Tvrtko Ursulin wrote:
>> On 09/03/2017 08:55, Oscar Mateo wrote:
>>> On 03/09/2017 08:50 AM, Tvrtko Ursulin wrote:
>>>> On 09/03/2017 08:42, Oscar Mateo wrote:
>>>>> On 03/09/2017 02:05 AM, Tvrtko Ursulin wrote:
>>>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>>
>>>>>> In order to ensure no missed interrupts we must first re-direct
>>>>>> the interrupts to GuC, and only then re-submit the requests to
>>>>>> be replayed after a GPU reset. Otherwise context switch can fire
>>>>>> before GuC has been set up to receive it triggering more hangs.
>>>>>>
>>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>>>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>>>>> Cc: Oscar Mateo <oscar.mateo@intel.com>
>>>>>> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>>>>> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/i915/i915_guc_submission.c | 58
>>>>>> +++++++++++++++++++++++++++---
>>>>>>   drivers/gpu/drm/i915/intel_guc_loader.c    | 47
>>>>>> ------------------------
>>>>>>   2 files changed, 54 insertions(+), 51 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>>>> index beb38e30d0e9..5b8ec0fab563 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>>>> @@ -936,6 +936,52 @@ static void guc_reset_wq(struct i915_guc_client
>>>>>> *client)
>>>>>>       client->wq_tail = 0;
>>>>>>   }
>>>>>>
>>>>> <SNIP>
>>>>>>   int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>>>>>>   {
>>>>>>       struct intel_guc *guc = &dev_priv->guc;
>>>>>> @@ -953,13 +999,17 @@ int i915_guc_submission_enable(struct
>>>>>> drm_i915_private *dev_priv)
>>>>>>         /* Take over from manual control of ELSP (execlists) */
>>>>>>       for_each_engine(engine, dev_priv, id) {
>>>>>> -        const int wqi_size = sizeof(struct guc_wq_item);
>>>>>> -        struct drm_i915_gem_request *rq;
>>>>>> -
>>>>>>           engine->submit_request = i915_guc_submit;
>>>>>>           engine->schedule = NULL;
>>>>>> +    }
>>>>>> +
>>>>>> +    guc_interrupts_capture(dev_priv);
>>>>>> +
>>>>>> +    /* Replay the current set of previously submitted requests */
>>>>>> +    for_each_engine(engine, dev_priv, id) {
>>>>>> +        const int wqi_size = sizeof(struct guc_wq_item);
>>>>>> +        struct drm_i915_gem_request *rq;
>>>>>>
>>>>> Don't you want to move the guc_interrupts_release to
>>>>> i915_guc_submission_disable as well?
>>>> I can't spot anything broken in that path. We never go in that
>>>> direction with the live submission so why do you think it is needed?
>>>>
>>>> Regards,
>>>>
>>>> Tvrtko
>>> Just code symmetry: if we are leaving i915_guc_submission_enable to
>>> capture the interrupts, why doesn't the disable also releases them?
>>> Maybe it's not very important now, but it makes a lot more sense with my
>>> series to do proper unwinding of the whole path (I can tackle it there
>>> if you prefer).
>> I think so. There is a multitude of people trying to refactor the
>> GuC code at the moment so I would prefer just to fix the reset fail
>> quickly and not interfere with that wider refactoring too much.
>> Because I think it is not just a quick job of moving the interrupt
>> release to get to full symmetry. Ack to merge then?
> Exactly, I don't disagree with the desire to make/keep the code
> symmetrical, but I also think push the fix and wait for the dust to
> settle to fix the otherside, or volunteer somebody...
>
> Just so long as we remember to do it in the short term and not leave
> nits to build up.
> -Chris
>
Ok, push the button then. I'll make it symmetrical later.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] drm/i915/guc: Fix request re-submission after reset
  2017-03-09  8:42 ` Oscar Mateo
@ 2017-03-09 16:50   ` Tvrtko Ursulin
  2017-03-09  8:55     ` Oscar Mateo
  0 siblings, 1 reply; 11+ messages in thread
From: Tvrtko Ursulin @ 2017-03-09 16:50 UTC (permalink / raw)
  To: Oscar Mateo, Tvrtko Ursulin, Intel-gfx


On 09/03/2017 08:42, Oscar Mateo wrote:
> On 03/09/2017 02:05 AM, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> In order to ensure no missed interrupts we must first re-direct
>> the interrupts to GuC, and only then re-submit the requests to
>> be replayed after a GPU reset. Otherwise context switch can fire
>> before GuC has been set up to receive it triggering more hangs.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> Cc: Oscar Mateo <oscar.mateo@intel.com>
>> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_guc_submission.c | 58
>> +++++++++++++++++++++++++++---
>>   drivers/gpu/drm/i915/intel_guc_loader.c    | 47
>> ------------------------
>>   2 files changed, 54 insertions(+), 51 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index beb38e30d0e9..5b8ec0fab563 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -936,6 +936,52 @@ static void guc_reset_wq(struct i915_guc_client
>> *client)
>>       client->wq_tail = 0;
>>   }
>>
> <SNIP>
>>   int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>>   {
>>       struct intel_guc *guc = &dev_priv->guc;
>> @@ -953,13 +999,17 @@ int i915_guc_submission_enable(struct
>> drm_i915_private *dev_priv)
>>         /* Take over from manual control of ELSP (execlists) */
>>       for_each_engine(engine, dev_priv, id) {
>> -        const int wqi_size = sizeof(struct guc_wq_item);
>> -        struct drm_i915_gem_request *rq;
>> -
>>           engine->submit_request = i915_guc_submit;
>>           engine->schedule = NULL;
>> +    }
>> +
>> +    guc_interrupts_capture(dev_priv);
>> +
>> +    /* Replay the current set of previously submitted requests */
>> +    for_each_engine(engine, dev_priv, id) {
>> +        const int wqi_size = sizeof(struct guc_wq_item);
>> +        struct drm_i915_gem_request *rq;
>>
> Don't you want to move the guc_interrupts_release to
> i915_guc_submission_disable as well?

I can't spot anything broken in that path. We never go in that direction 
with the live submission so why do you think it is needed?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] drm/i915/guc: Fix request re-submission after reset
  2017-03-09  8:55     ` Oscar Mateo
@ 2017-03-09 17:02       ` Tvrtko Ursulin
  2017-03-09 20:54         ` Chris Wilson
  0 siblings, 1 reply; 11+ messages in thread
From: Tvrtko Ursulin @ 2017-03-09 17:02 UTC (permalink / raw)
  To: Oscar Mateo; +Cc: intel-gfx


On 09/03/2017 08:55, Oscar Mateo wrote:
> On 03/09/2017 08:50 AM, Tvrtko Ursulin wrote:
>>
>> On 09/03/2017 08:42, Oscar Mateo wrote:
>>> On 03/09/2017 02:05 AM, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> In order to ensure no missed interrupts we must first re-direct
>>>> the interrupts to GuC, and only then re-submit the requests to
>>>> be replayed after a GPU reset. Otherwise context switch can fire
>>>> before GuC has been set up to receive it triggering more hangs.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>>> Cc: Oscar Mateo <oscar.mateo@intel.com>
>>>> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>>>> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/i915_guc_submission.c | 58
>>>> +++++++++++++++++++++++++++---
>>>>   drivers/gpu/drm/i915/intel_guc_loader.c    | 47
>>>> ------------------------
>>>>   2 files changed, 54 insertions(+), 51 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> index beb38e30d0e9..5b8ec0fab563 100644
>>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> @@ -936,6 +936,52 @@ static void guc_reset_wq(struct i915_guc_client
>>>> *client)
>>>>       client->wq_tail = 0;
>>>>   }
>>>>
>>> <SNIP>
>>>>   int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>>>>   {
>>>>       struct intel_guc *guc = &dev_priv->guc;
>>>> @@ -953,13 +999,17 @@ int i915_guc_submission_enable(struct
>>>> drm_i915_private *dev_priv)
>>>>         /* Take over from manual control of ELSP (execlists) */
>>>>       for_each_engine(engine, dev_priv, id) {
>>>> -        const int wqi_size = sizeof(struct guc_wq_item);
>>>> -        struct drm_i915_gem_request *rq;
>>>> -
>>>>           engine->submit_request = i915_guc_submit;
>>>>           engine->schedule = NULL;
>>>> +    }
>>>> +
>>>> +    guc_interrupts_capture(dev_priv);
>>>> +
>>>> +    /* Replay the current set of previously submitted requests */
>>>> +    for_each_engine(engine, dev_priv, id) {
>>>> +        const int wqi_size = sizeof(struct guc_wq_item);
>>>> +        struct drm_i915_gem_request *rq;
>>>>
>>> Don't you want to move the guc_interrupts_release to
>>> i915_guc_submission_disable as well?
>>
>> I can't spot anything broken in that path. We never go in that
>> direction with the live submission so why do you think it is needed?
>>
>> Regards,
>>
>> Tvrtko
> Just code symmetry: if we are leaving i915_guc_submission_enable to
> capture the interrupts, why doesn't the disable also releases them?
> Maybe it's not very important now, but it makes a lot more sense with my
> series to do proper unwinding of the whole path (I can tackle it there
> if you prefer).

I think so. There is a multitude of people trying to refactor the GuC 
code at the moment so I would prefer just to fix the reset fail quickly 
and not interfere with that wider refactoring too much. Because I think 
it is not just a quick job of moving the interrupt release to get to 
full symmetry. Ack to merge then?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] drm/i915/guc: Fix request re-submission after reset
  2017-03-09 17:02       ` Tvrtko Ursulin
@ 2017-03-09 20:54         ` Chris Wilson
  2017-03-09 14:20           ` Oscar Mateo
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2017-03-09 20:54 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Thu, Mar 09, 2017 at 05:02:16PM +0000, Tvrtko Ursulin wrote:
> 
> On 09/03/2017 08:55, Oscar Mateo wrote:
> >On 03/09/2017 08:50 AM, Tvrtko Ursulin wrote:
> >>
> >>On 09/03/2017 08:42, Oscar Mateo wrote:
> >>>On 03/09/2017 02:05 AM, Tvrtko Ursulin wrote:
> >>>>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>>
> >>>>In order to ensure no missed interrupts we must first re-direct
> >>>>the interrupts to GuC, and only then re-submit the requests to
> >>>>be replayed after a GPU reset. Otherwise context switch can fire
> >>>>before GuC has been set up to receive it triggering more hangs.
> >>>>
> >>>>Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>>Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >>>>Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> >>>>Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >>>>Cc: Oscar Mateo <oscar.mateo@intel.com>
> >>>>Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> >>>>Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> >>>>---
> >>>>  drivers/gpu/drm/i915/i915_guc_submission.c | 58
> >>>>+++++++++++++++++++++++++++---
> >>>>  drivers/gpu/drm/i915/intel_guc_loader.c    | 47
> >>>>------------------------
> >>>>  2 files changed, 54 insertions(+), 51 deletions(-)
> >>>>
> >>>>diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> >>>>b/drivers/gpu/drm/i915/i915_guc_submission.c
> >>>>index beb38e30d0e9..5b8ec0fab563 100644
> >>>>--- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >>>>+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >>>>@@ -936,6 +936,52 @@ static void guc_reset_wq(struct i915_guc_client
> >>>>*client)
> >>>>      client->wq_tail = 0;
> >>>>  }
> >>>>
> >>><SNIP>
> >>>>  int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
> >>>>  {
> >>>>      struct intel_guc *guc = &dev_priv->guc;
> >>>>@@ -953,13 +999,17 @@ int i915_guc_submission_enable(struct
> >>>>drm_i915_private *dev_priv)
> >>>>        /* Take over from manual control of ELSP (execlists) */
> >>>>      for_each_engine(engine, dev_priv, id) {
> >>>>-        const int wqi_size = sizeof(struct guc_wq_item);
> >>>>-        struct drm_i915_gem_request *rq;
> >>>>-
> >>>>          engine->submit_request = i915_guc_submit;
> >>>>          engine->schedule = NULL;
> >>>>+    }
> >>>>+
> >>>>+    guc_interrupts_capture(dev_priv);
> >>>>+
> >>>>+    /* Replay the current set of previously submitted requests */
> >>>>+    for_each_engine(engine, dev_priv, id) {
> >>>>+        const int wqi_size = sizeof(struct guc_wq_item);
> >>>>+        struct drm_i915_gem_request *rq;
> >>>>
> >>>Don't you want to move the guc_interrupts_release to
> >>>i915_guc_submission_disable as well?
> >>
> >>I can't spot anything broken in that path. We never go in that
> >>direction with the live submission so why do you think it is needed?
> >>
> >>Regards,
> >>
> >>Tvrtko
> >Just code symmetry: if we are leaving i915_guc_submission_enable to
> >capture the interrupts, why doesn't the disable also releases them?
> >Maybe it's not very important now, but it makes a lot more sense with my
> >series to do proper unwinding of the whole path (I can tackle it there
> >if you prefer).
> 
> I think so. There is a multitude of people trying to refactor the
> GuC code at the moment so I would prefer just to fix the reset fail
> quickly and not interfere with that wider refactoring too much.
> Because I think it is not just a quick job of moving the interrupt
> release to get to full symmetry. Ack to merge then?

Exactly, I don't disagree with the desire to make/keep the code
symmetrical, but I also think push the fix and wait for the dust to
settle to fix the otherside, or volunteer somebody...

Just so long as we remember to do it in the short term and not leave
nits to build up.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-03-09 22:20 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-09 10:05 [PATCH] drm/i915/guc: Fix request re-submission after reset Tvrtko Ursulin
2017-03-09  8:42 ` Oscar Mateo
2017-03-09 16:50   ` Tvrtko Ursulin
2017-03-09  8:55     ` Oscar Mateo
2017-03-09 17:02       ` Tvrtko Ursulin
2017-03-09 20:54         ` Chris Wilson
2017-03-09 14:20           ` Oscar Mateo
2017-03-09 10:19 ` Chris Wilson
2017-03-09 12:06   ` Chris Wilson
2017-03-09 12:38     ` Chris Wilson
2017-03-09 10:47 ` ✓ Fi.CI.BAT: success for " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.