* [Intel-gfx] [PATCH] drm/i915/gt: Reset execlists registers before HWSP
@ 2020-05-13  8:59 Chris Wilson
  2020-05-13  9:32 ` Mika Kuoppala
  2020-05-13  9:34 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for " Patchwork
  0 siblings, 2 replies; 4+ messages in thread
From: Chris Wilson @ 2020-05-13  8:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Upon gt resume, we first poison then sanitize the engine. However, our
testing shows that gen9 will very rarely retain the poisoned value from
the HWSP mappings of the execlists status registers. This suggests that
it is reading back from the HWSP, so rejig the register reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3d0e0894c015..a7d644a21f14 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -3924,6 +3924,14 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
 
 	ring_set_paused(engine, 0);
 
+	/*
+	 * Sometimes Icelake forgets to reset its pointers on a GPU reset.
+	 * Bludgeon them with a mmio update to be sure.
+	 */
+	ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
+		     reset_value << 8 | reset_value);
+	ENGINE_POSTING_READ(engine, RING_CONTEXT_STATUS_PTR);
+
 	/*
 	 * After a reset, the HW starts writing into CSB entry [0]. We
 	 * therefore have to set our HEAD pointer back one entry so that
@@ -3937,16 +3945,15 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
 	WRITE_ONCE(*execlists->csb_write, reset_value);
 	wmb(); /* Make sure this is visible to HW (paranoia?) */
 
-	/*
-	 * Sometimes Icelake forgets to reset its pointers on a GPU reset.
-	 * Bludgeon them with a mmio update to be sure.
-	 */
+	invalidate_csb_entries(&execlists->csb_status[0],
+			       &execlists->csb_status[reset_value]);
+
+	/* Once more for luck and our trusty paranoia */
 	ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
 		     reset_value << 8 | reset_value);
 	ENGINE_POSTING_READ(engine, RING_CONTEXT_STATUS_PTR);
 
-	invalidate_csb_entries(&execlists->csb_status[0],
-			       &execlists->csb_status[reset_value]);
+	GEM_BUG_ON(READ_ONCE(*execlists->csb_write) != reset_value);
 }
 
 static void execlists_sanitize(struct intel_engine_cs *engine)
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
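
The ordering the patch moves to can be sketched in plain userspace C. This is only an illustrative model, not driver code: `struct fake_engine`, `fake_poison()`, `fake_reset_csb_pointers()` and `pack_status_ptr()` are hypothetical names, the CSB size of 6 is assumed for gen9, and 0x5a is assumed as the poison byte. Ordinary memory stands in for both the mmio register and the HWSP, so it shows the sequence of steps and the `reset_value << 8 | reset_value` packing, not the hardware behaviour under discussion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CSB_ENTRIES 6   /* assumption: number of CSB entries on gen9 */
#define POISON_BYTE 0x5a /* assumption: byte used to poison the HWSP */

/* Hypothetical stand-in for the state touched by reset_csb_pointers(). */
struct fake_engine {
	uint32_t status_ptr; /* models RING_CONTEXT_STATUS_PTR (mmio) */
	uint32_t csb_write;  /* models the write pointer stored in the HWSP */
};

/* Same packing as the patch: reset_value << 8 | reset_value. */
static uint32_t pack_status_ptr(uint32_t reset_value)
{
	return reset_value << 8 | reset_value;
}

/* Models the poisoning done before the engine is sanitized. */
static void fake_poison(struct fake_engine *e)
{
	memset(e, POISON_BYTE, sizeof(*e));
}

static void fake_reset_csb_pointers(struct fake_engine *e)
{
	const uint32_t reset_value = CSB_ENTRIES - 1;

	/* 1. Reset the register first, before touching the HWSP. */
	e->status_ptr = pack_status_ptr(reset_value);

	/* 2. Then reset the write pointer held in the HWSP. */
	e->csb_write = reset_value;

	/* 3. The driver invalidates the CSB cachelines here; plain memory
	 *    has nothing to model for that step. */

	/* 4. Write the register once more, as the patch does. */
	e->status_ptr = pack_status_ptr(reset_value);

	/* 5. Mirrors the GEM_BUG_ON() added at the end of the function. */
	assert(e->csb_write == reset_value);
}
```

Calling `fake_poison()` and then `fake_reset_csb_pointers()` on the same structure leaves both fields holding values derived from `reset_value`, which is the state the final `GEM_BUG_ON()` in the patch checks for on the HWSP side.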

* Re: [Intel-gfx] [PATCH] drm/i915/gt: Reset execlists registers before HWSP
  2020-05-13  8:59 [Intel-gfx] [PATCH] drm/i915/gt: Reset execlists registers before HWSP Chris Wilson
@ 2020-05-13  9:32 ` Mika Kuoppala
  2020-05-13  9:46   ` Chris Wilson
  2020-05-13  9:34 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for " Patchwork
  1 sibling, 1 reply; 4+ messages in thread
From: Mika Kuoppala @ 2020-05-13  9:32 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Upon gt resume, we first poison then sanitize the engine. However, our
> testing shows that gen9 will very rarely retain the poisoned value from
> the HWSP mappings of the execlists status registers. This suggests that
> it is reading back from the HWSP, so rejig the register reset.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>


* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/gt: Reset execlists registers before HWSP
  2020-05-13  8:59 [Intel-gfx] [PATCH] drm/i915/gt: Reset execlists registers before HWSP Chris Wilson
  2020-05-13  9:32 ` Mika Kuoppala
@ 2020-05-13  9:34 ` Patchwork
  1 sibling, 0 replies; 4+ messages in thread
From: Patchwork @ 2020-05-13  9:34 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gt: Reset execlists registers before HWSP
URL   : https://patchwork.freedesktop.org/series/77207/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8471 -> Patchwork_17640
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_17640 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_17640, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17640/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_17640:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@gt_pm:
    - fi-cml-u2:          [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8471/fi-cml-u2/igt@i915_selftest@live@gt_pm.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17640/fi-cml-u2/igt@i915_selftest@live@gt_pm.html

  
Known issues
------------

  Here are the changes found in Patchwork_17640 that come from known issues:

### IGT changes ###

#### Possible fixes ####

  * igt@debugfs_test@read_all_entries:
    - fi-icl-u2:          [{ABORT}][3] ([i915#1814]) -> [PASS][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8471/fi-icl-u2/igt@debugfs_test@read_all_entries.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17640/fi-icl-u2/igt@debugfs_test@read_all_entries.html

  * igt@i915_selftest@live@objects:
    - fi-bwr-2160:        [INCOMPLETE][5] ([i915#489]) -> [PASS][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8471/fi-bwr-2160/igt@i915_selftest@live@objects.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17640/fi-bwr-2160/igt@i915_selftest@live@objects.html

  
  [i915#1814]: https://gitlab.freedesktop.org/drm/intel/issues/1814
  [i915#489]: https://gitlab.freedesktop.org/drm/intel/issues/489


Participating hosts (49 -> 41)
------------------------------

  Missing    (8): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-cfl-8700k fi-tgl-y fi-byt-clapper fi-kbl-r 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_8471 -> Patchwork_17640

  CI-20190529: 20190529
  CI_DRM_8471: 3c84a88ed50e99b200fac400a9b817a23d399c01 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5651: e54e2642f1967ca3c488db32264607df670d1dfb @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_17640: 731b4b42bdf40d4ff4f61362ea6e460321918da0 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

731b4b42bdf4 drm/i915/gt: Reset execlists registers before HWSP

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17640/index.html

* Re: [Intel-gfx] [PATCH] drm/i915/gt: Reset execlists registers before HWSP
  2020-05-13  9:32 ` Mika Kuoppala
@ 2020-05-13  9:46   ` Chris Wilson
  0 siblings, 0 replies; 4+ messages in thread
From: Chris Wilson @ 2020-05-13  9:46 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2020-05-13 10:32:37)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Upon gt resume, we first poison then sanitize the engine. However, our
> > testing shows that gen9 will very rarely retain the poisoned value from
> > the HWSP mappings of the execlists status registers. This suggests that
> > it is reading back from the HWSP, so rejig the register reset.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

It failed in exactly the same way: it got past the
	GEM_BUG_ON(*csb_write != reset_value)
and still ended up with
	*csb_write == 0x5a [90]
in process_csb.

How it's able to see 0x5a at all is a mystery.

We poison, we sanitize, we reset the GPU. The value comes back out of
nowhere.
-Chris
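
For readers following along, the 0x5a [90] value can be reproduced in a few lines of userspace C. This is a sketch under assumptions, not the driver's code path: it assumes the HWSP is poisoned byte-wise with 0x5a, that the write pointer is read as a 32-bit value and narrowed to 8 bits, and that the CSB has 6 entries on gen9. `poisoned_tail()` and `tail_in_range()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define POISON_BYTE 0x5a /* assumption: byte used to poison the HWSP */
#define CSB_ENTRIES 6    /* assumption: number of CSB entries on gen9 */

/* Poison a 32-bit write pointer byte-wise, then narrow it to 8 bits,
 * as a reader that only keeps the low byte would. */
static uint8_t poisoned_tail(void)
{
	uint32_t csb_write;

	memset(&csb_write, POISON_BYTE, sizeof(csb_write));
	return (uint8_t)csb_write;
}

/* A valid tail must index one of the CSB entries. */
static int tail_in_range(uint8_t tail)
{
	return tail < CSB_ENTRIES;
}
```

Under these assumptions `poisoned_tail()` yields 0x5a, which is 90 in decimal and far outside the valid range of 0 to 5, so a poisoned write pointer is easy to tell apart from any legitimate value.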