* [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext
@ 2020-01-13 13:26 Chris Wilson
  2020-01-13 13:29 ` Chris Wilson
                   ` (11 more replies)
  0 siblings, 12 replies; 16+ messages in thread
From: Chris Wilson @ 2020-01-13 13:26 UTC (permalink / raw)
  To: intel-gfx

As a final paranoid step (we _should_ have reset the GPU on suspending
the device prior to unload), reset the GPU once more before removing the
powercontext and other related power saving paraphernalia.

A clue that this may not be the case is

<7> [313.203721] __intel_gt_set_wedged rcs'0
<7> [313.203746] __intel_gt_set_wedged 	Awake? 3
<7> [313.203751] __intel_gt_set_wedged 	Barriers?: no
<7> [313.203756] __intel_gt_set_wedged 	Latency: 0us
<7> [313.203762] __intel_gt_set_wedged 	Reset count: 0 (global 0)
<7> [313.203766] __intel_gt_set_wedged 	Requests:
<7> [313.203785] __intel_gt_set_wedged 	MMIO base:  0x00002000
<7> [313.203819] __intel_gt_set_wedged 	RING_START: 0x00000000
<7> [313.203826] __intel_gt_set_wedged 	RING_HEAD:  0x00000000
<7> [313.203833] __intel_gt_set_wedged 	RING_TAIL:  0x00000000
<7> [313.203844] __intel_gt_set_wedged 	RING_CTL:   0x00000000
<7> [313.203854] __intel_gt_set_wedged 	RING_MODE:  0x00000000
<7> [313.203861] __intel_gt_set_wedged 	RING_IMR: fffffefe
<7> [313.203875] __intel_gt_set_wedged 	ACTHD:  0x00000000_00000000
<7> [313.203888] __intel_gt_set_wedged 	BBADDR: 0x00000000_00000000
<7> [313.203901] __intel_gt_set_wedged 	DMA_FADDR: 0x00000000_00000000
<7> [313.203909] __intel_gt_set_wedged 	IPEIR: 0x00000000
<7> [313.203916] __intel_gt_set_wedged 	IPEHR: 0xcccccccc
<7> [313.203921] __intel_gt_set_wedged 	Execlist tasklet queued? no (enabled), preempt? inactive, timeslice? inactive
<7> [313.203932] __intel_gt_set_wedged 	Execlist status: 0x00044032 00000020; CSB read:5, write:0, entries:6
<7> [313.203937] __intel_gt_set_wedged 	Execlist CSB[0]: 0x00000001, context: 0
<7> [313.203952] __intel_gt_set_wedged 		Pending[0] ring:{start:000c4000, hwsp:fedfc000, seqno:00000000}, rq:  402e:2-  prio=2147483647 @ 207ms: [i915]
<7> [313.203983] __intel_gt_set_wedged 		E  402e:2-  prio=2147483647 @ 207ms: [i915]
<7> [313.204006] __intel_gt_set_wedged 		Queue priority hint: 3

during rapid fault-injection reloads. 0xcc is POISON_FREE_INIT which
suggests that the system cleared the pages on initialisation as they are
still being used from the previous module load.

Despite that we also have a couple of GPU resets prior to this...
I have a sneaky suspicion that may be a GuC artifact.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gt: Lift clearing GT wedged out of gt_sanitize

We only want to try and reset a wedged device on resume, not before
suspend, so lift the recovery out of the common gt_sanitize().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_pm.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index d1c2f034296a..c039185c4bd2 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -138,17 +138,6 @@ static void gt_sanitize(struct intel_gt *gt, bool force)
 	wakeref = intel_runtime_pm_get(gt->uncore->rpm);
 	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
 
-	/*
-	 * As we have just resumed the machine and woken the device up from
-	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
-	 * back to defaults, recovering from whatever wedged state we left it
-	 * in and so worth trying to use the device once more.
-	 */
-	if (intel_gt_is_wedged(gt))
-		intel_gt_unset_wedged(gt);
-
-	intel_uc_sanitize(&gt->uc);
-
 	for_each_engine(engine, gt, id)
 		if (engine->reset.prepare)
 			engine->reset.prepare(engine);
@@ -170,6 +159,7 @@ static void gt_sanitize(struct intel_gt *gt, bool force)
 
 void intel_gt_pm_fini(struct intel_gt *gt)
 {
+	intel_gt_set_wedged(gt);
 	intel_rc6_fini(&gt->rc6);
 }
 
@@ -194,7 +184,19 @@ int intel_gt_resume(struct intel_gt *gt)
 	intel_gt_pm_get(gt);
 
 	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
+
 	intel_rc6_sanitize(&gt->rc6);
+	intel_uc_sanitize(&gt->uc);
+
+	/*
+	 * As we have just resumed the machine and woken the device up from
+	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
+	 * back to defaults, recovering from whatever wedged state we left it
+	 * in and so worth trying to use the device once more.
+	 */
+	if (intel_gt_is_wedged(gt))
+		intel_gt_unset_wedged(gt);
+
 	gt_sanitize(gt, true);
 	if (intel_gt_is_wedged(gt)) {
 		err = -EIO;
@@ -308,8 +310,7 @@ void intel_gt_suspend_late(struct intel_gt *gt)
 		intel_llc_disable(&gt->llc);
 	}
 
-	gt_sanitize(gt, false);
-
+	intel_gt_set_wedged(gt);
 	GT_TRACE(gt, "\n");
 }
 
-- 
2.25.0.rc2
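
The poison observation in the commit message above (IPEHR reading back as 0xcccccccc while the neighbouring registers read zero) can be sketched as a small check. This is only an illustration, not code from the patch: `POISON_BYTE` and `reg_is_poisoned()` are hypothetical names, and the only fact taken from the thread is that 0xcc is the poison value the author recognised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; the commit message refers to 0xcc as POISON_FREE_INIT. */
#define POISON_BYTE 0xccu

/*
 * True when every byte of a 32-bit register dump equals the poison byte.
 * Multiplying by 0x01010101 replicates the byte into all four lanes.
 */
static bool reg_is_poisoned(uint32_t value, uint8_t poison)
{
	return value == (uint32_t)poison * 0x01010101u;
}

/* Values copied from the engine dump quoted in the commit message. */
static void check_dump_values(void)
{
	assert(reg_is_poisoned(0xccccccccu, POISON_BYTE));   /* IPEHR */
	assert(!reg_is_poisoned(0x00000000u, POISON_BYTE));  /* IPEIR */
}
```

A register that matches the pattern in all four bytes is unlikely to be a real instruction header, which is why the dump reads as a hint that stale, poisoned pages were still in use.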

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext
  2020-01-13 13:26 [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext Chris Wilson
@ 2020-01-13 13:29 ` Chris Wilson
  2020-01-13 14:09   ` Ville Syrjälä
  2020-01-13 14:26 ` [Intel-gfx] [PATCH v3] " Chris Wilson
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2020-01-13 13:29 UTC (permalink / raw)
  To: intel-gfx

As a final paranoid step (we _should_ have reset the GPU on suspending
the device prior to unload), reset the GPU once more before removing the
powercontext and other related power saving paraphernalia.

A clue that this may not be the case is

<7> [313.203721] __intel_gt_set_wedged rcs'0
<7> [313.203746] __intel_gt_set_wedged 	Awake? 3
<7> [313.203751] __intel_gt_set_wedged 	Barriers?: no
<7> [313.203756] __intel_gt_set_wedged 	Latency: 0us
<7> [313.203762] __intel_gt_set_wedged 	Reset count: 0 (global 0)
<7> [313.203766] __intel_gt_set_wedged 	Requests:
<7> [313.203785] __intel_gt_set_wedged 	MMIO base:  0x00002000
<7> [313.203819] __intel_gt_set_wedged 	RING_START: 0x00000000
<7> [313.203826] __intel_gt_set_wedged 	RING_HEAD:  0x00000000
<7> [313.203833] __intel_gt_set_wedged 	RING_TAIL:  0x00000000
<7> [313.203844] __intel_gt_set_wedged 	RING_CTL:   0x00000000
<7> [313.203854] __intel_gt_set_wedged 	RING_MODE:  0x00000000
<7> [313.203861] __intel_gt_set_wedged 	RING_IMR: fffffefe
<7> [313.203875] __intel_gt_set_wedged 	ACTHD:  0x00000000_00000000
<7> [313.203888] __intel_gt_set_wedged 	BBADDR: 0x00000000_00000000
<7> [313.203901] __intel_gt_set_wedged 	DMA_FADDR: 0x00000000_00000000
<7> [313.203909] __intel_gt_set_wedged 	IPEIR: 0x00000000
<7> [313.203916] __intel_gt_set_wedged 	IPEHR: 0xcccccccc
<7> [313.203921] __intel_gt_set_wedged 	Execlist tasklet queued? no (enabled), preempt? inactive, timeslice? inactive
<7> [313.203932] __intel_gt_set_wedged 	Execlist status: 0x00044032 00000020; CSB read:5, write:0, entries:6
<7> [313.203937] __intel_gt_set_wedged 	Execlist CSB[0]: 0x00000001, context: 0
<7> [313.203952] __intel_gt_set_wedged 		Pending[0] ring:{start:000c4000, hwsp:fedfc000, seqno:00000000}, rq:  402e:2-  prio=2147483647 @ 207ms: [i915]
<7> [313.203983] __intel_gt_set_wedged 		E  402e:2-  prio=2147483647 @ 207ms: [i915]
<7> [313.204006] __intel_gt_set_wedged 		Queue priority hint: 3

during rapid fault-injection reloads. 0xcc is POISON_FREE_INIT which
suggests that the system cleared the pages on initialisation as they are
still being used from the previous module load.

Despite that we also have a couple of GPU resets prior to this...
I have a sneaky suspicion that may be a GuC artifact.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gt: Lift clearing GT wedged out of gt_sanitize

We only want to try and reset a wedged device on resume, not before
suspend, so lift the recovery out of the common gt_sanitize().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_pm.c | 56 +++++++++++----------------
 1 file changed, 22 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index d1c2f034296a..09a78d767e24 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -118,36 +118,16 @@ void intel_gt_pm_init(struct intel_gt *gt)
 	intel_rps_init(&gt->rps);
 }
 
-static bool reset_engines(struct intel_gt *gt)
+static void reset_engines(struct intel_gt *gt)
 {
 	if (INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
-		return false;
-
-	return __intel_gt_reset(gt, ALL_ENGINES) == 0;
+		__intel_gt_reset(gt, ALL_ENGINES);
 }
 
-static void gt_sanitize(struct intel_gt *gt, bool force)
+static void gt_sanitize(struct intel_gt *gt)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
-
-	GT_TRACE(gt, "force:%s", yesno(force));
-
-	/* Use a raw wakeref to avoid calling intel_display_power_get early */
-	wakeref = intel_runtime_pm_get(gt->uncore->rpm);
-	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
-
-	/*
-	 * As we have just resumed the machine and woken the device up from
-	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
-	 * back to defaults, recovering from whatever wedged state we left it
-	 * in and so worth trying to use the device once more.
-	 */
-	if (intel_gt_is_wedged(gt))
-		intel_gt_unset_wedged(gt);
-
-	intel_uc_sanitize(&gt->uc);
 
 	for_each_engine(engine, gt, id)
 		if (engine->reset.prepare)
@@ -155,21 +135,18 @@ static void gt_sanitize(struct intel_gt *gt, bool force)
 
 	intel_uc_reset_prepare(&gt->uc);
 
-	if (reset_engines(gt) || force) {
-		for_each_engine(engine, gt, id)
-			__intel_engine_reset(engine, false);
-	}
+	reset_engines(gt);
+	for_each_engine(engine, gt, id)
+		__intel_engine_reset(engine, false);
 
 	for_each_engine(engine, gt, id)
 		if (engine->reset.finish)
 			engine->reset.finish(engine);
-
-	intel_uncore_forcewake_put(gt->uncore, FORCEWAKE_ALL);
-	intel_runtime_pm_put(gt->uncore->rpm, wakeref);
 }
 
 void intel_gt_pm_fini(struct intel_gt *gt)
 {
+	intel_gt_set_wedged(gt);
 	intel_rc6_fini(&gt->rc6);
 }
 
@@ -194,13 +171,25 @@ int intel_gt_resume(struct intel_gt *gt)
 	intel_gt_pm_get(gt);
 
 	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
+
 	intel_rc6_sanitize(&gt->rc6);
-	gt_sanitize(gt, true);
-	if (intel_gt_is_wedged(gt)) {
+	intel_uc_sanitize(&gt->uc);
+
+	/*
+	 * As we have just resumed the machine and woken the device up from
+	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
+	 * back to defaults, recovering from whatever wedged state we left it
+	 * in and so worth trying to use the device once more.
+	 */
+	if (intel_gt_is_wedged(gt))
+		intel_gt_unset_wedged(gt);
+	if (unlikely(intel_gt_is_wedged(gt))) {
 		err = -EIO;
 		goto out_fw;
 	}
 
+	gt_sanitize(gt);
+
 	/* Only when the HW is re-initialised, can we replay the requests */
 	err = intel_gt_init_hw(gt);
 	if (err) {
@@ -308,8 +297,7 @@ void intel_gt_suspend_late(struct intel_gt *gt)
 		intel_llc_disable(&gt->llc);
 	}
 
-	gt_sanitize(gt, false);
-
+	intel_gt_set_wedged(gt);
 	GT_TRACE(gt, "\n");
 }
 
-- 
2.25.0.rc2


* Re: [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext
  2020-01-13 13:29 ` Chris Wilson
@ 2020-01-13 14:09   ` Ville Syrjälä
  2020-01-13 14:20     ` Chris Wilson
  0 siblings, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2020-01-13 14:09 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Mon, Jan 13, 2020 at 01:29:56PM +0000, Chris Wilson wrote:
> As a final paranoid step (we _should_ have reset the GPU on suspending
> the device prior to unload), reset the GPU once more before removing the
> powercontext and other related power saving paraphernalia.
> 
> A clue that this may not be the case is
> 
> <7> [313.203721] __intel_gt_set_wedged rcs'0
> <7> [313.203746] __intel_gt_set_wedged 	Awake? 3
> <7> [313.203751] __intel_gt_set_wedged 	Barriers?: no
> <7> [313.203756] __intel_gt_set_wedged 	Latency: 0us
> <7> [313.203762] __intel_gt_set_wedged 	Reset count: 0 (global 0)
> <7> [313.203766] __intel_gt_set_wedged 	Requests:
> <7> [313.203785] __intel_gt_set_wedged 	MMIO base:  0x00002000
> <7> [313.203819] __intel_gt_set_wedged 	RING_START: 0x00000000
> <7> [313.203826] __intel_gt_set_wedged 	RING_HEAD:  0x00000000
> <7> [313.203833] __intel_gt_set_wedged 	RING_TAIL:  0x00000000
> <7> [313.203844] __intel_gt_set_wedged 	RING_CTL:   0x00000000
> <7> [313.203854] __intel_gt_set_wedged 	RING_MODE:  0x00000000
> <7> [313.203861] __intel_gt_set_wedged 	RING_IMR: fffffefe
> <7> [313.203875] __intel_gt_set_wedged 	ACTHD:  0x00000000_00000000
> <7> [313.203888] __intel_gt_set_wedged 	BBADDR: 0x00000000_00000000
> <7> [313.203901] __intel_gt_set_wedged 	DMA_FADDR: 0x00000000_00000000
> <7> [313.203909] __intel_gt_set_wedged 	IPEIR: 0x00000000
> <7> [313.203916] __intel_gt_set_wedged 	IPEHR: 0xcccccccc
> <7> [313.203921] __intel_gt_set_wedged 	Execlist tasklet queued? no (enabled), preempt? inactive, timeslice? inactive
> <7> [313.203932] __intel_gt_set_wedged 	Execlist status: 0x00044032 00000020; CSB read:5, write:0, entries:6
> <7> [313.203937] __intel_gt_set_wedged 	Execlist CSB[0]: 0x00000001, context: 0
> <7> [313.203952] __intel_gt_set_wedged 		Pending[0] ring:{start:000c4000, hwsp:fedfc000, seqno:00000000}, rq:  402e:2-  prio=2147483647 @ 207ms: [i915]
> <7> [313.203983] __intel_gt_set_wedged 		E  402e:2-  prio=2147483647 @ 207ms: [i915]
> <7> [313.204006] __intel_gt_set_wedged 		Queue priority hint: 3
> 
> during rapid fault-injection reloads. 0xcc is POISON_FREE_INIT which
> suggests that the system cleared the pages on initialisation as they are
> still being used from the previous module load.
> 
> Despite that we also have a couple of GPU resets prior to this...
> I have a sneaky suspicion that may be a GuC artifact.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Andi Shyti <andi.shyti@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> 
> drm/i915/gt: Lift clearing GT wedged out of gt_sanitize
> 
> We only want to try and reset a wedged device on resume, not before
> suspend, so lift the recovery out of the common gt_sanitize().
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Andi Shyti <andi.shyti@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_gt_pm.c | 56 +++++++++++----------------
>  1 file changed, 22 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> index d1c2f034296a..09a78d767e24 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> @@ -118,36 +118,16 @@ void intel_gt_pm_init(struct intel_gt *gt)
>  	intel_rps_init(&gt->rps);
>  }
>  
> -static bool reset_engines(struct intel_gt *gt)
> +static void reset_engines(struct intel_gt *gt)
>  {
>  	if (INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)

Should that be a !gpu_reset_clobbers_display now?

> -		return false;
> -
> -	return __intel_gt_reset(gt, ALL_ENGINES) == 0;
> +		__intel_gt_reset(gt, ALL_ENGINES);
>  }
>  
> -static void gt_sanitize(struct intel_gt *gt, bool force)
> +static void gt_sanitize(struct intel_gt *gt)
>  {
>  	struct intel_engine_cs *engine;
>  	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
> -
> -	GT_TRACE(gt, "force:%s", yesno(force));
> -
> -	/* Use a raw wakeref to avoid calling intel_display_power_get early */
> -	wakeref = intel_runtime_pm_get(gt->uncore->rpm);
> -	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
> -
> -	/*
> -	 * As we have just resumed the machine and woken the device up from
> -	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
> -	 * back to defaults, recovering from whatever wedged state we left it
> -	 * in and so worth trying to use the device once more.
> -	 */
> -	if (intel_gt_is_wedged(gt))
> -		intel_gt_unset_wedged(gt);
> -
> -	intel_uc_sanitize(&gt->uc);
>  
>  	for_each_engine(engine, gt, id)
>  		if (engine->reset.prepare)
> @@ -155,21 +135,18 @@ static void gt_sanitize(struct intel_gt *gt, bool force)
>  
>  	intel_uc_reset_prepare(&gt->uc);
>  
> -	if (reset_engines(gt) || force) {
> -		for_each_engine(engine, gt, id)
> -			__intel_engine_reset(engine, false);
> -	}
> +	reset_engines(gt);
> +	for_each_engine(engine, gt, id)
> +		__intel_engine_reset(engine, false);
>  
>  	for_each_engine(engine, gt, id)
>  		if (engine->reset.finish)
>  			engine->reset.finish(engine);
> -
> -	intel_uncore_forcewake_put(gt->uncore, FORCEWAKE_ALL);
> -	intel_runtime_pm_put(gt->uncore->rpm, wakeref);
>  }
>  
>  void intel_gt_pm_fini(struct intel_gt *gt)
>  {
> +	intel_gt_set_wedged(gt);
>  	intel_rc6_fini(&gt->rc6);
>  }
>  
> @@ -194,13 +171,25 @@ int intel_gt_resume(struct intel_gt *gt)
>  	intel_gt_pm_get(gt);
>  
>  	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
> +
>  	intel_rc6_sanitize(&gt->rc6);
> -	gt_sanitize(gt, true);
> -	if (intel_gt_is_wedged(gt)) {
> +	intel_uc_sanitize(&gt->uc);
> +
> +	/*
> +	 * As we have just resumed the machine and woken the device up from
> +	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
> +	 * back to defaults, recovering from whatever wedged state we left it
> +	 * in and so worth trying to use the device once more.
> +	 */
> +	if (intel_gt_is_wedged(gt))
> +		intel_gt_unset_wedged(gt);
> +	if (unlikely(intel_gt_is_wedged(gt))) {
>  		err = -EIO;
>  		goto out_fw;
>  	}
>  
> +	gt_sanitize(gt);
> +
>  	/* Only when the HW is re-initialised, can we replay the requests */
>  	err = intel_gt_init_hw(gt);
>  	if (err) {
> @@ -308,8 +297,7 @@ void intel_gt_suspend_late(struct intel_gt *gt)
>  		intel_llc_disable(&gt->llc);
>  	}
>  
> -	gt_sanitize(gt, false);
> -
> +	intel_gt_set_wedged(gt);
>  	GT_TRACE(gt, "\n");
>  }
>  
> -- 
> 2.25.0.rc2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel

* Re: [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext
  2020-01-13 14:09   ` Ville Syrjälä
@ 2020-01-13 14:20     ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2020-01-13 14:20 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

Quoting Ville Syrjälä (2020-01-13 14:09:23)
> On Mon, Jan 13, 2020 at 01:29:56PM +0000, Chris Wilson wrote:
> > As a final paranoid step (we _should_ have reset the GPU on suspending
> > the device prior to unload), reset the GPU once more before removing the
> > powercontext and other related power saving paraphernalia.
> > 
> > A clue that this may not be the case is
> > 
> > <7> [313.203721] __intel_gt_set_wedged rcs'0
> > <7> [313.203746] __intel_gt_set_wedged        Awake? 3
> > <7> [313.203751] __intel_gt_set_wedged        Barriers?: no
> > <7> [313.203756] __intel_gt_set_wedged        Latency: 0us
> > <7> [313.203762] __intel_gt_set_wedged        Reset count: 0 (global 0)
> > <7> [313.203766] __intel_gt_set_wedged        Requests:
> > <7> [313.203785] __intel_gt_set_wedged        MMIO base:  0x00002000
> > <7> [313.203819] __intel_gt_set_wedged        RING_START: 0x00000000
> > <7> [313.203826] __intel_gt_set_wedged        RING_HEAD:  0x00000000
> > <7> [313.203833] __intel_gt_set_wedged        RING_TAIL:  0x00000000
> > <7> [313.203844] __intel_gt_set_wedged        RING_CTL:   0x00000000
> > <7> [313.203854] __intel_gt_set_wedged        RING_MODE:  0x00000000
> > <7> [313.203861] __intel_gt_set_wedged        RING_IMR: fffffefe
> > <7> [313.203875] __intel_gt_set_wedged        ACTHD:  0x00000000_00000000
> > <7> [313.203888] __intel_gt_set_wedged        BBADDR: 0x00000000_00000000
> > <7> [313.203901] __intel_gt_set_wedged        DMA_FADDR: 0x00000000_00000000
> > <7> [313.203909] __intel_gt_set_wedged        IPEIR: 0x00000000
> > <7> [313.203916] __intel_gt_set_wedged        IPEHR: 0xcccccccc
> > <7> [313.203921] __intel_gt_set_wedged        Execlist tasklet queued? no (enabled), preempt? inactive, timeslice? inactive
> > <7> [313.203932] __intel_gt_set_wedged        Execlist status: 0x00044032 00000020; CSB read:5, write:0, entries:6
> > <7> [313.203937] __intel_gt_set_wedged        Execlist CSB[0]: 0x00000001, context: 0
> > <7> [313.203952] __intel_gt_set_wedged                Pending[0] ring:{start:000c4000, hwsp:fedfc000, seqno:00000000}, rq:  402e:2-  prio=2147483647 @ 207ms: [i915]
> > <7> [313.203983] __intel_gt_set_wedged                E  402e:2-  prio=2147483647 @ 207ms: [i915]
> > <7> [313.204006] __intel_gt_set_wedged                Queue priority hint: 3
> > 
> > during rapid fault-injection reloads. 0xcc is POISON_FREE_INIT which
> > suggests that the system cleared the pages on initialisation as they are
> > still being used from the previous module load.
> > 
> > Despite that we also have a couple of GPU resets prior to this...
> > I have a sneaky suspicion that may be a GuC artifact.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Andi Shyti <andi.shyti@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > 
> > drm/i915/gt: Lift clearing GT wedged out of gt_sanitize
> > 
> > We only want to try and reset a wedged device on resume, not before
> > suspend, so lift the recovery out of the common gt_sanitize().
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Andi Shyti <andi.shyti@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_gt_pm.c | 56 +++++++++++----------------
> >  1 file changed, 22 insertions(+), 34 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> > index d1c2f034296a..09a78d767e24 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> > @@ -118,36 +118,16 @@ void intel_gt_pm_init(struct intel_gt *gt)
> >       intel_rps_init(&gt->rps);
> >  }
> >  
> > -static bool reset_engines(struct intel_gt *gt)
> > +static void reset_engines(struct intel_gt *gt)
> >  {
> >       if (INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
> 
> Should that be a !gpu_reset_clobbers_display now?

Heh. Yes. Far too many mistakes today.
-Chris
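
The inverted test that Ville points out can be modelled in isolation. The sketch below is a hypothetical stand-in, not the driver code: `struct fake_gt` and its `resets` counter replace `struct intel_gt` and the call to `__intel_gt_reset()`, and the three functions mirror the pre-patch helper, the v2 hunk as posted, and the `!gpu_reset_clobbers_display` form suggested in the review.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device info flag plus a reset counter. */
struct fake_gt {
	bool gpu_reset_clobbers_display;
	int resets;
};

/* Pre-patch behaviour: skip the reset when it would clobber the display. */
static bool reset_engines_orig(struct fake_gt *gt)
{
	if (gt->gpu_reset_clobbers_display)
		return false;

	gt->resets++;		/* models __intel_gt_reset(gt, ALL_ENGINES) */
	return true;		/* models the "== 0" success check */
}

/* v2 as posted: the early return is gone but the test kept its sense. */
static void reset_engines_v2(struct fake_gt *gt)
{
	if (gt->gpu_reset_clobbers_display)
		gt->resets++;
}

/* With the negation suggested in review, the original intent returns. */
static void reset_engines_fixed(struct fake_gt *gt)
{
	if (!gt->gpu_reset_clobbers_display)
		gt->resets++;
}

/* Runs one variant on a fresh device and reports how many resets it did. */
static int resets_for(void (*fn)(struct fake_gt *), bool clobbers)
{
	struct fake_gt gt = { .gpu_reset_clobbers_display = clobbers };

	fn(&gt);
	return gt.resets;
}
```

In this model the v2 variant resets exactly the devices the original code avoided resetting and skips all the others, which is the opposite of what the early `return false` used to guarantee.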

* [Intel-gfx] [PATCH v3] drm/i915/gt: Sanitize and reset GPU before removing powercontext
  2020-01-13 13:26 [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext Chris Wilson
  2020-01-13 13:29 ` Chris Wilson
@ 2020-01-13 14:26 ` Chris Wilson
  2020-01-13 16:24   ` [Intel-gfx] [PATCH v4] " Chris Wilson
  2020-01-13 15:44 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev4) Patchwork
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2020-01-13 14:26 UTC (permalink / raw)
  To: intel-gfx

As a final paranoid step (we _should_ have reset the GPU on suspending
the device prior to unload), reset the GPU once more before removing the
powercontext and other related power saving paraphernalia.

A clue that this may not be the case is

<7> [313.203721] __intel_gt_set_wedged rcs'0
<7> [313.203746] __intel_gt_set_wedged 	Awake? 3
<7> [313.203751] __intel_gt_set_wedged 	Barriers?: no
<7> [313.203756] __intel_gt_set_wedged 	Latency: 0us
<7> [313.203762] __intel_gt_set_wedged 	Reset count: 0 (global 0)
<7> [313.203766] __intel_gt_set_wedged 	Requests:
<7> [313.203785] __intel_gt_set_wedged 	MMIO base:  0x00002000
<7> [313.203819] __intel_gt_set_wedged 	RING_START: 0x00000000
<7> [313.203826] __intel_gt_set_wedged 	RING_HEAD:  0x00000000
<7> [313.203833] __intel_gt_set_wedged 	RING_TAIL:  0x00000000
<7> [313.203844] __intel_gt_set_wedged 	RING_CTL:   0x00000000
<7> [313.203854] __intel_gt_set_wedged 	RING_MODE:  0x00000000
<7> [313.203861] __intel_gt_set_wedged 	RING_IMR: fffffefe
<7> [313.203875] __intel_gt_set_wedged 	ACTHD:  0x00000000_00000000
<7> [313.203888] __intel_gt_set_wedged 	BBADDR: 0x00000000_00000000
<7> [313.203901] __intel_gt_set_wedged 	DMA_FADDR: 0x00000000_00000000
<7> [313.203909] __intel_gt_set_wedged 	IPEIR: 0x00000000
<7> [313.203916] __intel_gt_set_wedged 	IPEHR: 0xcccccccc
<7> [313.203921] __intel_gt_set_wedged 	Execlist tasklet queued? no (enabled), preempt? inactive, timeslice? inactive
<7> [313.203932] __intel_gt_set_wedged 	Execlist status: 0x00044032 00000020; CSB read:5, write:0, entries:6
<7> [313.203937] __intel_gt_set_wedged 	Execlist CSB[0]: 0x00000001, context: 0
<7> [313.203952] __intel_gt_set_wedged 		Pending[0] ring:{start:000c4000, hwsp:fedfc000, seqno:00000000}, rq:  402e:2-  prio=2147483647 @ 207ms: [i915]
<7> [313.203983] __intel_gt_set_wedged 		E  402e:2-  prio=2147483647 @ 207ms: [i915]
<7> [313.204006] __intel_gt_set_wedged 		Queue priority hint: 3

during rapid fault-injection reloads. 0xcc is POISON_FREE_INIT which
suggests that the system cleared the pages on initialisation as they are
still being used from the previous module load.

Despite that we also have a couple of GPU resets prior to this...
I have a sneaky suspicion that may be a GuC artifact.

v2: Just set the device as wedged (which includes a reset) on
suspend/unload, and leave the sanitization to load/resume.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_pm.c | 59 +++++++++++----------------
 1 file changed, 23 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index d1c2f034296a..6c0b662b91b8 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -118,36 +118,16 @@ void intel_gt_pm_init(struct intel_gt *gt)
 	intel_rps_init(&gt->rps);
 }
 
-static bool reset_engines(struct intel_gt *gt)
+static void reset_engines(struct intel_gt *gt)
 {
-	if (INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
-		return false;
-
-	return __intel_gt_reset(gt, ALL_ENGINES) == 0;
+	if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
+		__intel_gt_reset(gt, ALL_ENGINES);
 }
 
-static void gt_sanitize(struct intel_gt *gt, bool force)
+static void gt_sanitize(struct intel_gt *gt)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
-
-	GT_TRACE(gt, "force:%s", yesno(force));
-
-	/* Use a raw wakeref to avoid calling intel_display_power_get early */
-	wakeref = intel_runtime_pm_get(gt->uncore->rpm);
-	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
-
-	/*
-	 * As we have just resumed the machine and woken the device up from
-	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
-	 * back to defaults, recovering from whatever wedged state we left it
-	 * in and so worth trying to use the device once more.
-	 */
-	if (intel_gt_is_wedged(gt))
-		intel_gt_unset_wedged(gt);
-
-	intel_uc_sanitize(&gt->uc);
 
 	for_each_engine(engine, gt, id)
 		if (engine->reset.prepare)
@@ -155,21 +135,18 @@ static void gt_sanitize(struct intel_gt *gt, bool force)
 
 	intel_uc_reset_prepare(&gt->uc);
 
-	if (reset_engines(gt) || force) {
-		for_each_engine(engine, gt, id)
-			__intel_engine_reset(engine, false);
-	}
+	reset_engines(gt);
+	for_each_engine(engine, gt, id)
+		__intel_engine_reset(engine, false);
 
 	for_each_engine(engine, gt, id)
 		if (engine->reset.finish)
 			engine->reset.finish(engine);
-
-	intel_uncore_forcewake_put(gt->uncore, FORCEWAKE_ALL);
-	intel_runtime_pm_put(gt->uncore->rpm, wakeref);
 }
 
 void intel_gt_pm_fini(struct intel_gt *gt)
 {
+	intel_gt_set_wedged(gt);
 	intel_rc6_fini(&gt->rc6);
 }
 
@@ -192,15 +169,26 @@ int intel_gt_resume(struct intel_gt *gt)
 	 * allowing us to fixup the user contexts on their first pin.
 	 */
 	intel_gt_pm_get(gt);
-
 	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
+
 	intel_rc6_sanitize(&gt->rc6);
-	gt_sanitize(gt, true);
-	if (intel_gt_is_wedged(gt)) {
+	intel_uc_sanitize(&gt->uc);
+
+	/*
+	 * As we have just resumed the machine and woken the device up from
+	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
+	 * back to defaults, recovering from whatever wedged state we left it
+	 * in and so worth trying to use the device once more.
+	 */
+	if (intel_gt_is_wedged(gt))
+		intel_gt_unset_wedged(gt);
+	if (unlikely(intel_gt_is_wedged(gt))) {
 		err = -EIO;
 		goto out_fw;
 	}
 
+	gt_sanitize(gt);
+
 	/* Only when the HW is re-initialised, can we replay the requests */
 	err = intel_gt_init_hw(gt);
 	if (err) {
@@ -308,8 +296,7 @@ void intel_gt_suspend_late(struct intel_gt *gt)
 		intel_llc_disable(&gt->llc);
 	}
 
-	gt_sanitize(gt, false);
-
+	intel_gt_set_wedged(gt);
 	GT_TRACE(gt, "\n");
 }
 
-- 
2.25.0.rc2
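
The ordering this revision sets up (mark the GT wedged on suspend or unload, try to clear it on resume, and only sanitize once it is no longer wedged) can be sketched as a toy state model. Everything here is an assumption made for illustration: `struct fake_gt`, the `hw_recoverable` flag standing in for whether `intel_gt_unset_wedged()` succeeds, and the `sanitized` counter are not part of the driver, and the later `intel_gt_init_hw()` step is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the GT state touched by the patch. */
struct fake_gt {
	bool wedged;
	bool hw_recoverable;	/* assumption: can the wedge be cleared? */
	int sanitized;		/* counts calls to the sanitize step */
};

/* Models intel_gt_suspend_late() / intel_gt_pm_fini(): just wedge. */
static void fake_suspend_late(struct fake_gt *gt)
{
	gt->wedged = true;
}

/* Models the start of intel_gt_resume() as reordered by the patch. */
static int fake_resume(struct fake_gt *gt)
{
	/* Try to recover first, as the lifted comment block describes. */
	if (gt->wedged && gt->hw_recoverable)
		gt->wedged = false;

	/* Still wedged: give up before touching the engines. */
	if (gt->wedged)
		return -EIO;

	gt->sanitized++;	/* models gt_sanitize(gt) */
	return 0;
}
```

In this model a suspend followed by a resume on recoverable hardware ends unwedged with one sanitize pass, while unrecoverable hardware returns `-EIO` without sanitizing.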


* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev4)
  2020-01-13 13:26 [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext Chris Wilson
  2020-01-13 13:29 ` Chris Wilson
  2020-01-13 14:26 ` [Intel-gfx] [PATCH v3] " Chris Wilson
@ 2020-01-13 15:44 ` Patchwork
  2020-01-13 16:18 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2020-01-13 15:44 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev4)
URL   : https://patchwork.freedesktop.org/series/71952/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
aa9a347e9533 drm/i915/gt: Sanitize and reset GPU before removing powercontext
-:31: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#31: 
<7> [313.203921] __intel_gt_set_wedged 	Execlist tasklet queued? no (enabled), preempt? inactive, timeslice? inactive

total: 0 errors, 1 warnings, 0 checks, 103 lines checked


* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev4)
  2020-01-13 13:26 [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext Chris Wilson
                   ` (2 preceding siblings ...)
  2020-01-13 15:44 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev4) Patchwork
@ 2020-01-13 16:18 ` Patchwork
  2020-01-13 16:19 ` [Intel-gfx] ✗ Fi.CI.BUILD: warning " Patchwork
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2020-01-13 16:18 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev4)
URL   : https://patchwork.freedesktop.org/series/71952/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7733 -> Patchwork_16073
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/index.html

Known issues
------------

  Here are the changes found in Patchwork_16073 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3:
    - fi-cfl-guc:         [PASS][1] -> [INCOMPLETE][2] ([i915#163] / [i915#184])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-cfl-guc/igt@gem_exec_suspend@basic-s3.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-cfl-guc/igt@gem_exec_suspend@basic-s3.html
    - fi-skl-guc:         [PASS][3] -> [INCOMPLETE][4] ([i915#146] / [i915#184] / [i915#69])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-skl-guc/igt@gem_exec_suspend@basic-s3.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-skl-guc/igt@gem_exec_suspend@basic-s3.html
    - fi-kbl-guc:         [PASS][5] -> [INCOMPLETE][6] ([i915#184])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-kbl-guc/igt@gem_exec_suspend@basic-s3.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-kbl-guc/igt@gem_exec_suspend@basic-s3.html
    - fi-apl-guc:         [PASS][7] -> [INCOMPLETE][8] ([fdo#103927] / [i915#184])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-apl-guc/igt@gem_exec_suspend@basic-s3.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-apl-guc/igt@gem_exec_suspend@basic-s3.html

  * igt@gem_mmap@basic:
    - fi-icl-dsi:         [PASS][9] -> [DMESG-WARN][10] ([i915#109])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-icl-dsi/igt@gem_mmap@basic.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-icl-dsi/igt@gem_mmap@basic.html

  * igt@i915_selftest@live_gem_contexts:
    - fi-byt-j1900:       [PASS][11] -> [DMESG-FAIL][12] ([i915#722])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-byt-j1900/igt@i915_selftest@live_gem_contexts.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-byt-j1900/igt@i915_selftest@live_gem_contexts.html

  * igt@i915_selftest@live_gt_pm:
    - fi-icl-guc:         [PASS][13] -> [INCOMPLETE][14] ([i915#140])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-icl-guc/igt@i915_selftest@live_gt_pm.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-icl-guc/igt@i915_selftest@live_gt_pm.html

  * igt@kms_frontbuffer_tracking@basic:
    - fi-hsw-peppy:       [PASS][15] -> [DMESG-WARN][16] ([i915#44])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html

  
#### Possible fixes ####

  * igt@i915_module_load@reload-with-fault-injection:
    - fi-skl-6770hq:      [INCOMPLETE][17] ([i915#671]) -> [PASS][18]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-skl-6770hq/igt@i915_module_load@reload-with-fault-injection.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-skl-6770hq/igt@i915_module_load@reload-with-fault-injection.html
    - fi-kbl-x1275:       [INCOMPLETE][19] ([i915#879]) -> [PASS][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-kbl-x1275/igt@i915_module_load@reload-with-fault-injection.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-kbl-x1275/igt@i915_module_load@reload-with-fault-injection.html

  * igt@i915_selftest@live_blt:
    - fi-hsw-4770r:       [DMESG-FAIL][21] ([i915#563]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-hsw-4770r/igt@i915_selftest@live_blt.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-hsw-4770r/igt@i915_selftest@live_blt.html

  * igt@kms_chamelium@dp-edid-read:
    - fi-icl-u2:          [FAIL][23] ([fdo#109635] / [i915#217]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-icl-u2/igt@kms_chamelium@dp-edid-read.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-icl-u2/igt@kms_chamelium@dp-edid-read.html

  
#### Warnings ####

  * igt@i915_selftest@live_blt:
    - fi-hsw-4770:        [DMESG-FAIL][25] ([i915#725]) -> [DMESG-FAIL][26] ([i915#770])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-hsw-4770/igt@i915_selftest@live_blt.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/fi-hsw-4770/igt@i915_selftest@live_blt.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#109635]: https://bugs.freedesktop.org/show_bug.cgi?id=109635
  [i915#109]: https://gitlab.freedesktop.org/drm/intel/issues/109
  [i915#140]: https://gitlab.freedesktop.org/drm/intel/issues/140
  [i915#146]: https://gitlab.freedesktop.org/drm/intel/issues/146
  [i915#163]: https://gitlab.freedesktop.org/drm/intel/issues/163
  [i915#184]: https://gitlab.freedesktop.org/drm/intel/issues/184
  [i915#217]: https://gitlab.freedesktop.org/drm/intel/issues/217
  [i915#44]: https://gitlab.freedesktop.org/drm/intel/issues/44
  [i915#563]: https://gitlab.freedesktop.org/drm/intel/issues/563
  [i915#671]: https://gitlab.freedesktop.org/drm/intel/issues/671
  [i915#69]: https://gitlab.freedesktop.org/drm/intel/issues/69
  [i915#722]: https://gitlab.freedesktop.org/drm/intel/issues/722
  [i915#725]: https://gitlab.freedesktop.org/drm/intel/issues/725
  [i915#770]: https://gitlab.freedesktop.org/drm/intel/issues/770
  [i915#879]: https://gitlab.freedesktop.org/drm/intel/issues/879
  [i915#937]: https://gitlab.freedesktop.org/drm/intel/issues/937


Participating hosts (43 -> 45)
------------------------------

  Additional (7): fi-bwr-2160 fi-ilk-650 fi-snb-2520m fi-gdg-551 fi-ivb-3770 fi-bsw-kefka fi-skl-lmem 
  Missing    (5): fi-kbl-soraka fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7733 -> Patchwork_16073

  CI-20190529: 20190529
  CI_DRM_7733: 379e3dc4d5c95f4c3bcb244fd9527986a23b3e74 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5364: b7cb6ffdb65cbd233f5ddee2f2dabf97b34fa640 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_16073: aa9a347e9533d028263b019b339fed83f9b48ea8 @ git://anongit.freedesktop.org/gfx-ci/linux


== Kernel 32bit build ==

Warning: Kernel 32bit buildtest failed:
https://intel-gfx-ci.01.org/Patchwork_16073/build_32bit.log

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  CHK     include/generated/compile.h
Kernel: arch/x86/boot/bzImage is ready  (#1)
  Building modules, stage 2.
  MODPOST 122 modules
ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
scripts/Makefile.modpost:93: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1282: recipe for target 'modules' failed
make: *** [modules] Error 2


== Linux commits ==

aa9a347e9533 drm/i915/gt: Sanitize and reset GPU before removing powercontext

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/index.html

* [Intel-gfx] ✗ Fi.CI.BUILD: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev4)
  2020-01-13 13:26 [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext Chris Wilson
                   ` (3 preceding siblings ...)
  2020-01-13 16:18 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2020-01-13 16:19 ` Patchwork
  2020-01-13 16:39 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev5) Patchwork
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2020-01-13 16:19 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev4)
URL   : https://patchwork.freedesktop.org/series/71952/
State : warning

== Summary ==

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  CHK     include/generated/compile.h
Kernel: arch/x86/boot/bzImage is ready  (#1)
  Building modules, stage 2.
  MODPOST 122 modules
ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
scripts/Makefile.modpost:93: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1282: recipe for target 'modules' failed
make: *** [modules] Error 2
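
The undefined __udivdi3 symbol is the libgcc helper that GCC emits for
64-bit unsigned division on 32-bit targets. The kernel does not link
libgcc, and the failing module here is amdgpu.ko rather than i915, so
the warning looks unrelated to this series. Kernel code normally goes
through helpers such as div_u64() instead of a bare '/'. A userspace
sketch of what such a helper has to do, assuming a simple
shift-and-subtract algorithm (div_u64_sketch() is a hypothetical name
and not the kernel's implementation):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Divide a 64-bit value by a 32-bit divisor one bit at a time, so that
 * no 64-bit '/' or '%' operator (and hence no __udivdi3 call) is
 * needed. Returns the quotient and stores the remainder in *rem.
 */
uint64_t div_u64_sketch(uint64_t n, uint32_t d, uint32_t *rem)
{
	uint64_t q = 0;
	uint64_t r = 0;
	int i;

	assert(d != 0);
	assert(rem != 0);

	for (i = 63; i >= 0; i--) {
		/* Bring down the next bit of the dividend. */
		r = (r << 1) | ((n >> i) & 1);

		/* Subtract the divisor when it fits and set the quotient bit. */
		if (r >= d) {
			r -= d;
			q |= (uint64_t)1 << i;
		}
	}

	*rem = (uint32_t)r;
	return q;
}
```

For example, div_u64_sketch(100, 7, &rem) yields 14 with rem == 2.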

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16073/build_32bit.log

* [Intel-gfx] [PATCH v4] drm/i915/gt: Sanitize and reset GPU before removing powercontext
  2020-01-13 14:26 ` [Intel-gfx] [PATCH v3] " Chris Wilson
@ 2020-01-13 16:24   ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2020-01-13 16:24 UTC (permalink / raw)
  To: intel-gfx

As a final paranoid step (we _should_ have reset the GPU on suspending
the device prior to unload), reset the GPU once more before removing the
powercontext and other related power saving paraphernalia.

A clue that this may not be the case is

<7> [313.203721] __intel_gt_set_wedged rcs'0
<7> [313.203746] __intel_gt_set_wedged 	Awake? 3
<7> [313.203751] __intel_gt_set_wedged 	Barriers?: no
<7> [313.203756] __intel_gt_set_wedged 	Latency: 0us
<7> [313.203762] __intel_gt_set_wedged 	Reset count: 0 (global 0)
<7> [313.203766] __intel_gt_set_wedged 	Requests:
<7> [313.203785] __intel_gt_set_wedged 	MMIO base:  0x00002000
<7> [313.203819] __intel_gt_set_wedged 	RING_START: 0x00000000
<7> [313.203826] __intel_gt_set_wedged 	RING_HEAD:  0x00000000
<7> [313.203833] __intel_gt_set_wedged 	RING_TAIL:  0x00000000
<7> [313.203844] __intel_gt_set_wedged 	RING_CTL:   0x00000000
<7> [313.203854] __intel_gt_set_wedged 	RING_MODE:  0x00000000
<7> [313.203861] __intel_gt_set_wedged 	RING_IMR: fffffefe
<7> [313.203875] __intel_gt_set_wedged 	ACTHD:  0x00000000_00000000
<7> [313.203888] __intel_gt_set_wedged 	BBADDR: 0x00000000_00000000
<7> [313.203901] __intel_gt_set_wedged 	DMA_FADDR: 0x00000000_00000000
<7> [313.203909] __intel_gt_set_wedged 	IPEIR: 0x00000000
<7> [313.203916] __intel_gt_set_wedged 	IPEHR: 0xcccccccc
<7> [313.203921] __intel_gt_set_wedged 	Execlist tasklet queued? no (enabled), preempt? inactive, timeslice? inactive
<7> [313.203932] __intel_gt_set_wedged 	Execlist status: 0x00044032 00000020; CSB read:5, write:0, entries:6
<7> [313.203937] __intel_gt_set_wedged 	Execlist CSB[0]: 0x00000001, context: 0
<7> [313.203952] __intel_gt_set_wedged 		Pending[0] ring:{start:000c4000, hwsp:fedfc000, seqno:00000000}, rq:  402e:2-  prio=2147483647 @ 207ms: [i915]
<7> [313.203983] __intel_gt_set_wedged 		E  402e:2-  prio=2147483647 @ 207ms: [i915]
<7> [313.204006] __intel_gt_set_wedged 		Queue priority hint: 3

during rapid fault-injection reloads. 0xcc is POISON_FREE_INITMEM, which
suggests that the system cleared the pages on initialisation while they
were still being used from the previous module load.

Despite that, we also have a couple of GPU resets prior to this...
I have a sneaky suspicion that this may be a GuC artifact.

v2: Just set the device as wedged (which includes a reset) on
suspend/unload, and leave the sanitization to load/resume.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c    |  3 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c | 60 ++++++++++-----------------
 2 files changed, 24 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index da2b6e2ae692..700ee4c37487 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -588,7 +588,7 @@ int intel_gt_init(struct intel_gt *gt)
 
 	err = intel_gt_resume(gt);
 	if (err)
-		goto err_uc_init;
+		goto err_gt;
 
 	err = __engines_record_defaults(gt);
 	if (err)
@@ -606,7 +606,6 @@ int intel_gt_init(struct intel_gt *gt)
 err_gt:
 	__intel_gt_disable(gt);
 	intel_uc_fini_hw(&gt->uc);
-err_uc_init:
 	intel_uc_fini(&gt->uc);
 err_engines:
 	intel_engines_release(gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index d1c2f034296a..681cd986324f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -118,36 +118,16 @@ void intel_gt_pm_init(struct intel_gt *gt)
 	intel_rps_init(&gt->rps);
 }
 
-static bool reset_engines(struct intel_gt *gt)
+static void reset_engines(struct intel_gt *gt)
 {
-	if (INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
-		return false;
-
-	return __intel_gt_reset(gt, ALL_ENGINES) == 0;
+	if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
+		__intel_gt_reset(gt, ALL_ENGINES);
 }
 
-static void gt_sanitize(struct intel_gt *gt, bool force)
+static void gt_sanitize(struct intel_gt *gt)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
-
-	GT_TRACE(gt, "force:%s", yesno(force));
-
-	/* Use a raw wakeref to avoid calling intel_display_power_get early */
-	wakeref = intel_runtime_pm_get(gt->uncore->rpm);
-	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
-
-	/*
-	 * As we have just resumed the machine and woken the device up from
-	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
-	 * back to defaults, recovering from whatever wedged state we left it
-	 * in and so worth trying to use the device once more.
-	 */
-	if (intel_gt_is_wedged(gt))
-		intel_gt_unset_wedged(gt);
-
-	intel_uc_sanitize(&gt->uc);
 
 	for_each_engine(engine, gt, id)
 		if (engine->reset.prepare)
@@ -155,21 +135,18 @@ static void gt_sanitize(struct intel_gt *gt, bool force)
 
 	intel_uc_reset_prepare(&gt->uc);
 
-	if (reset_engines(gt) || force) {
-		for_each_engine(engine, gt, id)
-			__intel_engine_reset(engine, false);
-	}
+	reset_engines(gt);
+	for_each_engine(engine, gt, id)
+		__intel_engine_reset(engine, false);
 
 	for_each_engine(engine, gt, id)
 		if (engine->reset.finish)
 			engine->reset.finish(engine);
-
-	intel_uncore_forcewake_put(gt->uncore, FORCEWAKE_ALL);
-	intel_runtime_pm_put(gt->uncore->rpm, wakeref);
 }
 
 void intel_gt_pm_fini(struct intel_gt *gt)
 {
+	intel_gt_set_wedged(gt);
 	intel_rc6_fini(&gt->rc6);
 }
 
@@ -192,15 +169,25 @@ int intel_gt_resume(struct intel_gt *gt)
 	 * allowing us to fixup the user contexts on their first pin.
 	 */
 	intel_gt_pm_get(gt);
-
 	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
-	intel_rc6_sanitize(&gt->rc6);
-	gt_sanitize(gt, true);
-	if (intel_gt_is_wedged(gt)) {
+
+	/*
+	 * As we have just resumed the machine and woken the device up from
+	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
+	 * back to defaults, recovering from whatever wedged state we left it
+	 * in and so worth trying to use the device once more.
+	 */
+	if (intel_gt_is_wedged(gt))
+		intel_gt_unset_wedged(gt);
+	if (unlikely(intel_gt_is_wedged(gt))) {
 		err = -EIO;
 		goto out_fw;
 	}
 
+	intel_rc6_sanitize(&gt->rc6);
+	intel_uc_sanitize(&gt->uc);
+	gt_sanitize(gt);
+
 	/* Only when the HW is re-initialised, can we replay the requests */
 	err = intel_gt_init_hw(gt);
 	if (err) {
@@ -308,8 +295,7 @@ void intel_gt_suspend_late(struct intel_gt *gt)
 		intel_llc_disable(&gt->llc);
 	}
 
-	gt_sanitize(gt, false);
-
+	intel_gt_set_wedged(gt);
 	GT_TRACE(gt, "\n");
 }
 
-- 
2.25.0.rc2
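
The control flow this revision aims for can be modelled outside the
driver: suspend/unload only marks the GT as wedged, and resume first
tries to clear that state, bails out with -EIO if it cannot, and only
then sanitizes. The following is a toy sketch only, assuming a GT
reduced to a couple of flags; struct toy_gt and the toy_*() functions
are hypothetical stand-ins and not the i915 API:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for struct intel_gt. */
struct toy_gt {
	bool wedged;          /* set on suspend/unload */
	bool hw_recoverable;  /* whether un-wedging can succeed */
	int sanitize_count;   /* how often the sanitize step ran */
};

void toy_set_wedged(struct toy_gt *gt)
{
	assert(gt != 0);
	gt->wedged = true;
}

void toy_unset_wedged(struct toy_gt *gt)
{
	/* Only a recoverable device leaves the wedged state. */
	if (gt->hw_recoverable)
		gt->wedged = false;
}

/* Suspend path: just wedge, leave sanitization to resume. */
void toy_suspend_late(struct toy_gt *gt)
{
	toy_set_wedged(gt);
}

/* Resume path: un-wedge first, then sanitize, else report -EIO. */
int toy_resume(struct toy_gt *gt)
{
	if (gt->wedged)
		toy_unset_wedged(gt);
	if (gt->wedged)
		return -EIO;

	gt->sanitize_count++;
	return 0;
}
```

In this model a recoverable device comes back from toy_suspend_late()
followed by toy_resume() with wedged cleared and one sanitize pass,
while an unrecoverable one returns -EIO without sanitizing.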

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev5)
  2020-01-13 13:26 [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext Chris Wilson
                   ` (4 preceding siblings ...)
  2020-01-13 16:19 ` [Intel-gfx] ✗ Fi.CI.BUILD: warning " Patchwork
@ 2020-01-13 16:39 ` Patchwork
  2020-01-13 17:05 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2020-01-13 16:39 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev5)
URL   : https://patchwork.freedesktop.org/series/71952/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
d73a5619f429 drm/i915/gt: Sanitize and reset GPU before removing powercontext
-:31: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#31: 
<7> [313.203921] __intel_gt_set_wedged 	Execlist tasklet queued? no (enabled), preempt? inactive, timeslice? inactive

total: 0 errors, 1 warnings, 0 checks, 118 lines checked

* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev5)
  2020-01-13 13:26 [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext Chris Wilson
                   ` (5 preceding siblings ...)
  2020-01-13 16:39 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev5) Patchwork
@ 2020-01-13 17:05 ` Patchwork
  2020-01-13 17:05 ` [Intel-gfx] ✗ Fi.CI.BUILD: warning " Patchwork
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2020-01-13 17:05 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev5)
URL   : https://patchwork.freedesktop.org/series/71952/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7733 -> Patchwork_16075
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/index.html

Known issues
------------

  Here are the changes found in Patchwork_16075 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_close_race@basic-threads:
    - fi-byt-j1900:       [PASS][1] -> [TIMEOUT][2] ([fdo#112271] / [i915#816])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-byt-j1900/igt@gem_close_race@basic-threads.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-byt-j1900/igt@gem_close_race@basic-threads.html

  * igt@gem_exec_suspend@basic-s3:
    - fi-kbl-guc:         [PASS][3] -> [INCOMPLETE][4] ([i915#184])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-kbl-guc/igt@gem_exec_suspend@basic-s3.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-kbl-guc/igt@gem_exec_suspend@basic-s3.html
    - fi-cfl-guc:         [PASS][5] -> [INCOMPLETE][6] ([i915#163] / [i915#184])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-cfl-guc/igt@gem_exec_suspend@basic-s3.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-cfl-guc/igt@gem_exec_suspend@basic-s3.html
    - fi-skl-guc:         [PASS][7] -> [INCOMPLETE][8] ([i915#184] / [i915#69])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-skl-guc/igt@gem_exec_suspend@basic-s3.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-skl-guc/igt@gem_exec_suspend@basic-s3.html
    - fi-apl-guc:         [PASS][9] -> [INCOMPLETE][10] ([fdo#103927] / [i915#184])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-apl-guc/igt@gem_exec_suspend@basic-s3.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-apl-guc/igt@gem_exec_suspend@basic-s3.html

  * igt@i915_selftest@live_gem_contexts:
    - fi-cfl-8700k:       [PASS][11] -> [DMESG-FAIL][12] ([i915#623])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-cfl-8700k/igt@i915_selftest@live_gem_contexts.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-cfl-8700k/igt@i915_selftest@live_gem_contexts.html

  * igt@i915_selftest@live_gt_pm:
    - fi-icl-guc:         [PASS][13] -> [INCOMPLETE][14] ([i915#140])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-icl-guc/igt@i915_selftest@live_gt_pm.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-icl-guc/igt@i915_selftest@live_gt_pm.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [PASS][15] -> [FAIL][16] ([fdo#111096] / [i915#323])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  
#### Possible fixes ####

  * igt@gem_exec_suspend@basic-s3:
    - {fi-ehl-1}:         [INCOMPLETE][17] ([i915#937]) -> [PASS][18]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-ehl-1/igt@gem_exec_suspend@basic-s3.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-ehl-1/igt@gem_exec_suspend@basic-s3.html

  * igt@i915_module_load@reload-with-fault-injection:
    - fi-kbl-x1275:       [INCOMPLETE][19] ([i915#879]) -> [PASS][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-kbl-x1275/igt@i915_module_load@reload-with-fault-injection.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-kbl-x1275/igt@i915_module_load@reload-with-fault-injection.html

  * igt@i915_selftest@live_blt:
    - fi-hsw-4770:        [DMESG-FAIL][21] ([i915#725]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-hsw-4770/igt@i915_selftest@live_blt.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-hsw-4770/igt@i915_selftest@live_blt.html

  * igt@i915_selftest@live_execlists:
    - fi-kbl-soraka:      [DMESG-FAIL][23] ([i915#656]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-kbl-soraka/igt@i915_selftest@live_execlists.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-kbl-soraka/igt@i915_selftest@live_execlists.html

  * igt@kms_chamelium@dp-edid-read:
    - fi-icl-u2:          [FAIL][25] ([fdo#109635] / [i915#217]) -> [PASS][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7733/fi-icl-u2/igt@kms_chamelium@dp-edid-read.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/fi-icl-u2/igt@kms_chamelium@dp-edid-read.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#109635]: https://bugs.freedesktop.org/show_bug.cgi?id=109635
  [fdo#111096]: https://bugs.freedesktop.org/show_bug.cgi?id=111096
  [fdo#112271]: https://bugs.freedesktop.org/show_bug.cgi?id=112271
  [i915#140]: https://gitlab.freedesktop.org/drm/intel/issues/140
  [i915#163]: https://gitlab.freedesktop.org/drm/intel/issues/163
  [i915#184]: https://gitlab.freedesktop.org/drm/intel/issues/184
  [i915#217]: https://gitlab.freedesktop.org/drm/intel/issues/217
  [i915#323]: https://gitlab.freedesktop.org/drm/intel/issues/323
  [i915#623]: https://gitlab.freedesktop.org/drm/intel/issues/623
  [i915#656]: https://gitlab.freedesktop.org/drm/intel/issues/656
  [i915#69]: https://gitlab.freedesktop.org/drm/intel/issues/69
  [i915#725]: https://gitlab.freedesktop.org/drm/intel/issues/725
  [i915#816]: https://gitlab.freedesktop.org/drm/intel/issues/816
  [i915#879]: https://gitlab.freedesktop.org/drm/intel/issues/879
  [i915#937]: https://gitlab.freedesktop.org/drm/intel/issues/937


Participating hosts (43 -> 41)
------------------------------

  Additional (6): fi-bwr-2160 fi-ilk-650 fi-snb-2520m fi-gdg-551 fi-ivb-3770 fi-skl-lmem 
  Missing    (8): fi-hsw-4770r fi-hsw-4200u fi-skl-6770hq fi-byt-squawks fi-bsw-cyan fi-whl-u fi-byt-n2820 fi-byt-clapper 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7733 -> Patchwork_16075

  CI-20190529: 20190529
  CI_DRM_7733: 379e3dc4d5c95f4c3bcb244fd9527986a23b3e74 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5364: b7cb6ffdb65cbd233f5ddee2f2dabf97b34fa640 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_16075: d73a5619f4291d5bae984c4ab92fe40c5b0ec760 @ git://anongit.freedesktop.org/gfx-ci/linux


== Kernel 32bit build ==

Warning: Kernel 32bit buildtest failed:
https://intel-gfx-ci.01.org/Patchwork_16075/build_32bit.log

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  CHK     include/generated/compile.h
Kernel: arch/x86/boot/bzImage is ready  (#1)
  Building modules, stage 2.
  MODPOST 122 modules
ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
scripts/Makefile.modpost:93: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1282: recipe for target 'modules' failed
make: *** [modules] Error 2


== Linux commits ==

d73a5619f429 drm/i915/gt: Sanitize and reset GPU before removing powercontext

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/index.html

* [Intel-gfx] ✗ Fi.CI.BUILD: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev5)
  2020-01-13 13:26 [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext Chris Wilson
                   ` (6 preceding siblings ...)
  2020-01-13 17:05 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2020-01-13 17:05 ` Patchwork
  2020-01-13 17:17 ` [Intel-gfx] [PATCH v5] drm/i915/gt: Sanitize and reset GPU before removing powercontext Chris Wilson
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2020-01-13 17:05 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev5)
URL   : https://patchwork.freedesktop.org/series/71952/
State : warning

== Summary ==

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  CHK     include/generated/compile.h
Kernel: arch/x86/boot/bzImage is ready  (#1)
  Building modules, stage 2.
  MODPOST 122 modules
ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
scripts/Makefile.modpost:93: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1282: recipe for target 'modules' failed
make: *** [modules] Error 2

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16075/build_32bit.log

* [Intel-gfx] [PATCH v5] drm/i915/gt: Sanitize and reset GPU before removing powercontext
  2020-01-13 13:26 [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext Chris Wilson
                   ` (7 preceding siblings ...)
  2020-01-13 17:05 ` [Intel-gfx] ✗ Fi.CI.BUILD: warning " Patchwork
@ 2020-01-13 17:17 ` Chris Wilson
  2020-01-13 18:06 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev6) Patchwork
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2020-01-13 17:17 UTC (permalink / raw)
  To: intel-gfx

As a final paranoid step (we _should_ have reset the GPU on suspending
the device prior to unload), reset the GPU once more before removing the
powercontext and other related power saving paraphernalia.

A clue that this may not be the case is

<7> [313.203721] __intel_gt_set_wedged rcs'0
<7> [313.203746] __intel_gt_set_wedged 	Awake? 3
<7> [313.203751] __intel_gt_set_wedged 	Barriers?: no
<7> [313.203756] __intel_gt_set_wedged 	Latency: 0us
<7> [313.203762] __intel_gt_set_wedged 	Reset count: 0 (global 0)
<7> [313.203766] __intel_gt_set_wedged 	Requests:
<7> [313.203785] __intel_gt_set_wedged 	MMIO base:  0x00002000
<7> [313.203819] __intel_gt_set_wedged 	RING_START: 0x00000000
<7> [313.203826] __intel_gt_set_wedged 	RING_HEAD:  0x00000000
<7> [313.203833] __intel_gt_set_wedged 	RING_TAIL:  0x00000000
<7> [313.203844] __intel_gt_set_wedged 	RING_CTL:   0x00000000
<7> [313.203854] __intel_gt_set_wedged 	RING_MODE:  0x00000000
<7> [313.203861] __intel_gt_set_wedged 	RING_IMR: fffffefe
<7> [313.203875] __intel_gt_set_wedged 	ACTHD:  0x00000000_00000000
<7> [313.203888] __intel_gt_set_wedged 	BBADDR: 0x00000000_00000000
<7> [313.203901] __intel_gt_set_wedged 	DMA_FADDR: 0x00000000_00000000
<7> [313.203909] __intel_gt_set_wedged 	IPEIR: 0x00000000
<7> [313.203916] __intel_gt_set_wedged 	IPEHR: 0xcccccccc
<7> [313.203921] __intel_gt_set_wedged 	Execlist tasklet queued? no (enabled), preempt? inactive, timeslice? inactive
<7> [313.203932] __intel_gt_set_wedged 	Execlist status: 0x00044032 00000020; CSB read:5, write:0, entries:6
<7> [313.203937] __intel_gt_set_wedged 	Execlist CSB[0]: 0x00000001, context: 0
<7> [313.203952] __intel_gt_set_wedged 		Pending[0] ring:{start:000c4000, hwsp:fedfc000, seqno:00000000}, rq:  402e:2-  prio=2147483647 @ 207ms: [i915]
<7> [313.203983] __intel_gt_set_wedged 		E  402e:2-  prio=2147483647 @ 207ms: [i915]
<7> [313.204006] __intel_gt_set_wedged 		Queue priority hint: 3

during rapid fault-injection reloads. 0xcc is POISON_FREE_INITMEM, which
suggests that the system cleared the pages on initialisation while they
were still being used from the previous module load.

Despite that, we also have a couple of GPU resets prior to this...
I have a sneaky suspicion that this may be a GuC artifact.

v2: Just set the device as wedged (which includes a reset) on
suspend/unload, and leave the sanitization to load/resume.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c    |  3 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c | 60 ++++++++++-----------------
 drivers/gpu/drm/i915/gt/intel_reset.c |  2 +
 3 files changed, 26 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index da2b6e2ae692..700ee4c37487 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -588,7 +588,7 @@ int intel_gt_init(struct intel_gt *gt)
 
 	err = intel_gt_resume(gt);
 	if (err)
-		goto err_uc_init;
+		goto err_gt;
 
 	err = __engines_record_defaults(gt);
 	if (err)
@@ -606,7 +606,6 @@ int intel_gt_init(struct intel_gt *gt)
 err_gt:
 	__intel_gt_disable(gt);
 	intel_uc_fini_hw(&gt->uc);
-err_uc_init:
 	intel_uc_fini(&gt->uc);
 err_engines:
 	intel_engines_release(gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index d1c2f034296a..681cd986324f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -118,36 +118,16 @@ void intel_gt_pm_init(struct intel_gt *gt)
 	intel_rps_init(&gt->rps);
 }
 
-static bool reset_engines(struct intel_gt *gt)
+static void reset_engines(struct intel_gt *gt)
 {
-	if (INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
-		return false;
-
-	return __intel_gt_reset(gt, ALL_ENGINES) == 0;
+	if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
+		__intel_gt_reset(gt, ALL_ENGINES);
 }
 
-static void gt_sanitize(struct intel_gt *gt, bool force)
+static void gt_sanitize(struct intel_gt *gt)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
-
-	GT_TRACE(gt, "force:%s", yesno(force));
-
-	/* Use a raw wakeref to avoid calling intel_display_power_get early */
-	wakeref = intel_runtime_pm_get(gt->uncore->rpm);
-	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
-
-	/*
-	 * As we have just resumed the machine and woken the device up from
-	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
-	 * back to defaults, recovering from whatever wedged state we left it
-	 * in and so worth trying to use the device once more.
-	 */
-	if (intel_gt_is_wedged(gt))
-		intel_gt_unset_wedged(gt);
-
-	intel_uc_sanitize(&gt->uc);
 
 	for_each_engine(engine, gt, id)
 		if (engine->reset.prepare)
@@ -155,21 +135,18 @@ static void gt_sanitize(struct intel_gt *gt, bool force)
 
 	intel_uc_reset_prepare(&gt->uc);
 
-	if (reset_engines(gt) || force) {
-		for_each_engine(engine, gt, id)
-			__intel_engine_reset(engine, false);
-	}
+	reset_engines(gt);
+	for_each_engine(engine, gt, id)
+		__intel_engine_reset(engine, false);
 
 	for_each_engine(engine, gt, id)
 		if (engine->reset.finish)
 			engine->reset.finish(engine);
-
-	intel_uncore_forcewake_put(gt->uncore, FORCEWAKE_ALL);
-	intel_runtime_pm_put(gt->uncore->rpm, wakeref);
 }
 
 void intel_gt_pm_fini(struct intel_gt *gt)
 {
+	intel_gt_set_wedged(gt);
 	intel_rc6_fini(&gt->rc6);
 }
 
@@ -192,15 +169,25 @@ int intel_gt_resume(struct intel_gt *gt)
 	 * allowing us to fixup the user contexts on their first pin.
 	 */
 	intel_gt_pm_get(gt);
-
 	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
-	intel_rc6_sanitize(&gt->rc6);
-	gt_sanitize(gt, true);
-	if (intel_gt_is_wedged(gt)) {
+
+	/*
+	 * As we have just resumed the machine and woken the device up from
+	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
+	 * back to defaults, recovering from whatever wedged state we left it
+	 * in and so worth trying to use the device once more.
+	 */
+	if (intel_gt_is_wedged(gt))
+		intel_gt_unset_wedged(gt);
+	if (unlikely(intel_gt_is_wedged(gt))) {
 		err = -EIO;
 		goto out_fw;
 	}
 
+	intel_rc6_sanitize(&gt->rc6);
+	intel_uc_sanitize(&gt->uc);
+	gt_sanitize(gt);
+
 	/* Only when the HW is re-initialised, can we replay the requests */
 	err = intel_gt_init_hw(gt);
 	if (err) {
@@ -308,8 +295,7 @@ void intel_gt_suspend_late(struct intel_gt *gt)
 		intel_llc_disable(&gt->llc);
 	}
 
-	gt_sanitize(gt, false);
-
+	intel_gt_set_wedged(gt);
 	GT_TRACE(gt, "\n");
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index beee0cf89bce..234663faf4c2 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -768,6 +768,8 @@ static void reset_finish(struct intel_gt *gt, intel_engine_mask_t awake)
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 
+	intel_uc_sanitize(&gt->uc);
+
 	for_each_engine(engine, gt, id) {
 		reset_finish_engine(engine);
 		if (awake & engine->mask)
-- 
2.25.0.rc2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev6)
From: Patchwork @ 2020-01-13 18:06 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev6)
URL   : https://patchwork.freedesktop.org/series/71952/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
4e50429aa2ae drm/i915/gt: Sanitize and reset GPU before removing powercontext
-:31: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#31: 
<7> [313.203921] __intel_gt_set_wedged 	Execlist tasklet queued? no (enabled), preempt? inactive, timeslice? inactive

total: 0 errors, 1 warnings, 0 checks, 126 lines checked


* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev6)
From: Patchwork @ 2020-01-13 18:30 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev6)
URL   : https://patchwork.freedesktop.org/series/71952/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_7734 -> Patchwork_16077
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_16077 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_16077, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_16077:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live_gem_contexts:
    - fi-kbl-x1275:       [PASS][1] -> [DMESG-FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-kbl-x1275/igt@i915_selftest@live_gem_contexts.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-kbl-x1275/igt@i915_selftest@live_gem_contexts.html

  * igt@kms_busy@basic-flip-pipe-b:
    - fi-icl-guc:         [PASS][3] -> [DMESG-WARN][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-icl-guc/igt@kms_busy@basic-flip-pipe-b.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-icl-guc/igt@kms_busy@basic-flip-pipe-b.html

  
Known issues
------------

  Here are the changes found in Patchwork_16077 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_close_race@basic-threads:
    - fi-byt-n2820:       [PASS][5] -> [TIMEOUT][6] ([fdo#112271] / [i915#816])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-byt-n2820/igt@gem_close_race@basic-threads.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-byt-n2820/igt@gem_close_race@basic-threads.html

  * igt@gem_exec_suspend@basic-s3:
    - fi-cfl-guc:         [PASS][7] -> [INCOMPLETE][8] ([i915#163] / [i915#184])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-cfl-guc/igt@gem_exec_suspend@basic-s3.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-cfl-guc/igt@gem_exec_suspend@basic-s3.html
    - fi-skl-guc:         [PASS][9] -> [INCOMPLETE][10] ([i915#146] / [i915#184] / [i915#69])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-skl-guc/igt@gem_exec_suspend@basic-s3.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-skl-guc/igt@gem_exec_suspend@basic-s3.html
    - fi-kbl-guc:         [PASS][11] -> [INCOMPLETE][12] ([i915#184])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-kbl-guc/igt@gem_exec_suspend@basic-s3.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-kbl-guc/igt@gem_exec_suspend@basic-s3.html
    - fi-apl-guc:         [PASS][13] -> [INCOMPLETE][14] ([fdo#103927] / [i915#184])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-apl-guc/igt@gem_exec_suspend@basic-s3.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-apl-guc/igt@gem_exec_suspend@basic-s3.html

  * igt@i915_module_load@reload-with-fault-injection:
    - fi-skl-6700k2:      [PASS][15] -> [INCOMPLETE][16] ([i915#671])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-skl-6700k2/igt@i915_module_load@reload-with-fault-injection.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-skl-6700k2/igt@i915_module_load@reload-with-fault-injection.html

  * igt@i915_selftest@live_execlists:
    - fi-kbl-soraka:      [PASS][17] -> [DMESG-FAIL][18] ([i915#656])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-kbl-soraka/igt@i915_selftest@live_execlists.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-kbl-soraka/igt@i915_selftest@live_execlists.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [PASS][19] -> [FAIL][20] ([fdo#111407])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  
#### Possible fixes ####

  * igt@gem_ctx_switch@rcs0:
    - {fi-ehl-1}:         [INCOMPLETE][21] ([i915#937]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-ehl-1/igt@gem_ctx_switch@rcs0.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-ehl-1/igt@gem_ctx_switch@rcs0.html

  * igt@i915_module_load@reload-with-fault-injection:
    - fi-skl-6770hq:      [DMESG-WARN][23] ([i915#889]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-skl-6770hq/igt@i915_module_load@reload-with-fault-injection.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-skl-6770hq/igt@i915_module_load@reload-with-fault-injection.html

  * igt@i915_pm_rpm@module-reload:
    - fi-skl-6770hq:      [FAIL][25] ([i915#178]) -> [PASS][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html

  * igt@i915_selftest@live_gem_contexts:
    - fi-hsw-peppy:       [DMESG-FAIL][27] ([i915#722]) -> [PASS][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-hsw-peppy/igt@i915_selftest@live_gem_contexts.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-hsw-peppy/igt@i915_selftest@live_gem_contexts.html

  * igt@i915_selftest@live_hangcheck:
    - fi-icl-u2:          [DMESG-FAIL][29] ([fdo#108569] / [i915#419]) -> [PASS][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7734/fi-icl-u2/igt@i915_selftest@live_hangcheck.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/fi-icl-u2/igt@i915_selftest@live_hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569
  [fdo#111407]: https://bugs.freedesktop.org/show_bug.cgi?id=111407
  [fdo#112271]: https://bugs.freedesktop.org/show_bug.cgi?id=112271
  [i915#146]: https://gitlab.freedesktop.org/drm/intel/issues/146
  [i915#163]: https://gitlab.freedesktop.org/drm/intel/issues/163
  [i915#178]: https://gitlab.freedesktop.org/drm/intel/issues/178
  [i915#184]: https://gitlab.freedesktop.org/drm/intel/issues/184
  [i915#419]: https://gitlab.freedesktop.org/drm/intel/issues/419
  [i915#656]: https://gitlab.freedesktop.org/drm/intel/issues/656
  [i915#671]: https://gitlab.freedesktop.org/drm/intel/issues/671
  [i915#69]: https://gitlab.freedesktop.org/drm/intel/issues/69
  [i915#722]: https://gitlab.freedesktop.org/drm/intel/issues/722
  [i915#816]: https://gitlab.freedesktop.org/drm/intel/issues/816
  [i915#889]: https://gitlab.freedesktop.org/drm/intel/issues/889
  [i915#937]: https://gitlab.freedesktop.org/drm/intel/issues/937


Participating hosts (50 -> 46)
------------------------------

  Additional (1): fi-hsw-4770r 
  Missing    (5): fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7734 -> Patchwork_16077

  CI-20190529: 20190529
  CI_DRM_7734: 1bcb28176359295f43a23bd07997de822a3cce43 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5364: b7cb6ffdb65cbd233f5ddee2f2dabf97b34fa640 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_16077: 4e50429aa2ae772acb7cd790ed45168c3e58ff65 @ git://anongit.freedesktop.org/gfx-ci/linux


== Kernel 32bit build ==

Warning: Kernel 32bit buildtest failed:
https://intel-gfx-ci.01.org/Patchwork_16077/build_32bit.log

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  CHK     include/generated/compile.h
Kernel: arch/x86/boot/bzImage is ready  (#1)
  Building modules, stage 2.
  MODPOST 122 modules
ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
scripts/Makefile.modpost:93: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1282: recipe for target 'modules' failed
make: *** [modules] Error 2


== Linux commits ==

4e50429aa2ae drm/i915/gt: Sanitize and reset GPU before removing powercontext

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/index.html

* [Intel-gfx] ✗ Fi.CI.BUILD: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev6)
From: Patchwork @ 2020-01-13 18:30 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev6)
URL   : https://patchwork.freedesktop.org/series/71952/
State : warning

== Summary ==

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  CHK     include/generated/compile.h
Kernel: arch/x86/boot/bzImage is ready  (#1)
  Building modules, stage 2.
  MODPOST 122 modules
ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
scripts/Makefile.modpost:93: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1282: recipe for target 'modules' failed
make: *** [modules] Error 2

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16077/build_32bit.log

Thread overview: 16+ messages
2020-01-13 13:26 [Intel-gfx] [PATCH v2] drm/i915/gt: Sanitize and reset GPU before removing powercontext Chris Wilson
2020-01-13 13:29 ` Chris Wilson
2020-01-13 14:09   ` Ville Syrjälä
2020-01-13 14:20     ` Chris Wilson
2020-01-13 14:26 ` [Intel-gfx] [PATCH v3] " Chris Wilson
2020-01-13 16:24   ` [Intel-gfx] [PATCH v4] " Chris Wilson
2020-01-13 15:44 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev4) Patchwork
2020-01-13 16:18 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-01-13 16:19 ` [Intel-gfx] ✗ Fi.CI.BUILD: warning " Patchwork
2020-01-13 16:39 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev5) Patchwork
2020-01-13 17:05 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-01-13 17:05 ` [Intel-gfx] ✗ Fi.CI.BUILD: warning " Patchwork
2020-01-13 17:17 ` [Intel-gfx] [PATCH v5] drm/i915/gt: Sanitize and reset GPU before removing powercontext Chris Wilson
2020-01-13 18:06 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gt: Sanitize and reset GPU before removing powercontext (rev6) Patchwork
2020-01-13 18:30 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2020-01-13 18:30 ` [Intel-gfx] ✗ Fi.CI.BUILD: warning " Patchwork
