* [PATCH 1/3] drm/i915/selftests: Be loud if we run out of time
@ 2018-08-08  9:10 Chris Wilson
  2018-08-08  9:10 ` [PATCH 2/3] drm/i915: Unmask user interrupts writes into HWSP on snb/ivb/vlv/hsw Chris Wilson
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Chris Wilson @ 2018-08-08  9:10 UTC
  To: intel-gfx

On flushing the tests, we do so with a timeout to prevent waiting
indefinitely. However, if we miss an interrupt, the timeout provides a
safety net that still completes successfully (as we check the completion
condition after reaching the timeout and see all is well). This safety
net can unfortunately mask some bugs, so let's add a warning here so we
don't just ignore it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
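Reader's note: a minimal userspace sketch of the situation the commit message
describes, where a timed wait succeeds only because the completion condition is
re-checked at the deadline. The three-way `wait_outcome` enum, `timed_wait()`
and `flush_and_report()` are hypothetical names used for illustration; they are
not the driver's API and say nothing about what i915_gem_wait_for_idle()
actually returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical outcome of a timed wait; not an i915 type. */
enum wait_outcome {
	WAIT_WOKEN,         /* wakeup arrived before the deadline */
	WAIT_LATE_COMPLETE, /* no wakeup, but the work was done at the deadline */
	WAIT_TIMED_OUT,     /* no wakeup and the work is still pending */
};

/*
 * Model a waiter that sleeps for up to `timeout` ticks. `done_at` is the tick
 * at which the work completes; `irq_delivered` says whether the completion
 * wakeup actually reaches the waiter.
 */
static enum wait_outcome timed_wait(int done_at, bool irq_delivered, int timeout)
{
	assert(timeout > 0);

	if (irq_delivered && done_at <= timeout)
		return WAIT_WOKEN;

	/* Safety net: re-check the completion condition at the deadline. */
	if (done_at <= timeout)
		return WAIT_LATE_COMPLETE;

	return WAIT_TIMED_OUT;
}

/* Returns 0 or -ETIME, but is loud about the case the safety net would hide. */
static int flush_and_report(int done_at, bool irq_delivered, int timeout)
{
	switch (timed_wait(done_at, irq_delivered, timeout)) {
	case WAIT_TIMED_OUT:
		fprintf(stderr, "timed out, cancelling all further testing\n");
		return -ETIME;
	case WAIT_LATE_COMPLETE:
		fprintf(stderr, "completed, but only noticed at the timeout\n");
		return 0;
	case WAIT_WOKEN:
	default:
		return 0;
	}
}
```

In this model both the woken and the late-completion cases return 0 to the
caller; only the warning distinguishes them, which is the point of being loud.
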
 drivers/gpu/drm/i915/selftests/igt_flush_test.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/selftests/igt_flush_test.c b/drivers/gpu/drm/i915/selftests/igt_flush_test.c
index af66e3d4e23a..da620ed7cc36 100644
--- a/drivers/gpu/drm/i915/selftests/igt_flush_test.c
+++ b/drivers/gpu/drm/i915/selftests/igt_flush_test.c
@@ -19,7 +19,8 @@ int igt_flush_test(struct drm_i915_private *i915, unsigned int flags)
 		i915_gem_set_wedged(i915);
 	}
 
-	if (i915_gem_wait_for_idle(i915, flags, HZ / 5) == -ETIME) {
+	switch (i915_gem_wait_for_idle(i915, flags, HZ / 5)) {
+	case -ETIME:
 		pr_err("%pS timed out, cancelling all further testing.\n",
 		       __builtin_return_address(0));
 
@@ -27,6 +28,12 @@ int igt_flush_test(struct drm_i915_private *i915, unsigned int flags)
 		GEM_TRACE_DUMP();
 
 		i915_gem_set_wedged(i915);
+		break;
+
+	case 0:
+		pr_err("%pS missed idle-completion interrupt\n",
+		       __builtin_return_address(0));
+		break;
 	}
 
 	return i915_terminally_wedged(&i915->gpu_error) ? -EIO : 0;
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH 2/3] drm/i915: Unmask user interrupts writes into HWSP on snb/ivb/vlv/hsw
  2018-08-08  9:10 [PATCH 1/3] drm/i915/selftests: Be loud if we run out of time Chris Wilson
@ 2018-08-08  9:10 ` Chris Wilson
  2018-08-08  9:17   ` Mika Kuoppala
  2018-08-08  9:10 ` [PATCH 3/3] drm/i915: Remove extra waiter kick on legacy resets Chris Wilson
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 6+ messages in thread
From: Chris Wilson @ 2018-08-08  9:10 UTC
  To: intel-gfx

An oddity occurs on Sandybridge, Ivybridge and Haswell (and presumably
Valleyview) in that for the period following the GPU restart after a
reset, there are no GT interrupts received. From Ville's notes, bit 0 in
the HWSTAM corresponds to the render interrupt, and if we unmask it we
do see immediate resumption of GT interrupt delivery (via the master irq
handler) after the reset.

v2: Limit the w/a to the render interrupt from rcs

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107500
Fixes: c5498089463b ("drm/i915: Mask everything in ring HWSTAM on gen6+ in ringbuffer mode")
References: d420a50c21ef ("drm/i915: Clean up the HWSTAM mess")
Testcase: igt/gem_eio/reset-stress
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
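Reader's note: a small standalone sketch of the mask computation this patch
performs, assuming (as the commit message states) that bit 0 of HWSTAM is the
render interrupt and that a set bit masks the source. The `engine_id` enum,
`HWSTAM_RENDER_IRQ` and `hwstam_mask_for()` are illustrative names, not the
driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Hypothetical engine ids for illustration; the driver has its own enum. */
enum engine_id { ENGINE_RCS, ENGINE_BCS, ENGINE_VCS };

/* Bit 0 is the render interrupt according to the commit message. */
#define HWSTAM_RENDER_IRQ BIT(0)

/* Compute the value that would be written to the engine's HWSTAM register. */
static uint32_t hwstam_mask_for(enum engine_id id)
{
	uint32_t mask = UINT32_MAX;	/* start with every source masked */

	if (id == ENGINE_RCS)
		mask &= ~HWSTAM_RENDER_IRQ;	/* unmask only bit 0 */

	/* Every source other than bit 0 must remain masked. */
	assert((mask | HWSTAM_RENDER_IRQ) == UINT32_MAX);

	return mask;
}
```

With these assumptions the render engine gets 0xfffffffe and every other engine
keeps the previous all-ones value.
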
 drivers/gpu/drm/i915/intel_ringbuffer.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 8003cef767ba..5a2601a4d1aa 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -387,8 +387,18 @@ static void intel_ring_setup_status_page(struct intel_engine_cs *engine)
 		mmio = RING_HWS_PGA(engine->mmio_base);
 	}
 
-	if (INTEL_GEN(dev_priv) >= 6)
-		I915_WRITE(RING_HWSTAM(engine->mmio_base), 0xffffffff);
+	if (INTEL_GEN(dev_priv) >= 6) {
+		u32 mask = ~0u;
+
+		/*
+		 * Keep the render interrupt unmasked as this papaers over
+		 * lost interrupts following a reset.
+		 */
+		if (engine->id == RCS)
+			mask &= ~BIT(0);
+
+		I915_WRITE(RING_HWSTAM(engine->mmio_base), mask);
+	}
 
 	I915_WRITE(mmio, engine->status_page.ggtt_offset);
 	POSTING_READ(mmio);
-- 
2.18.0

* [PATCH 3/3] drm/i915: Remove extra waiter kick on legacy resets
  2018-08-08  9:10 [PATCH 1/3] drm/i915/selftests: Be loud if we run out of time Chris Wilson
  2018-08-08  9:10 ` [PATCH 2/3] drm/i915: Unmask user interrupts writes into HWSP on snb/ivb/vlv/hsw Chris Wilson
@ 2018-08-08  9:10 ` Chris Wilson
  2018-08-08  9:52 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/3] drm/i915/selftests: Be loud if we run out of time Patchwork
  2018-08-08 10:08 ` ✗ Fi.CI.BAT: failure " Patchwork
  3 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2018-08-08  9:10 UTC
  To: intel-gfx; +Cc: Matthew Auld

Now with a more efficacious workaround for the lost interrupts after
reset, we can remove the hack of kicking the waiters after reset. The
issue was that the kick only worked for the immediate window after the
reset (those seqno that would complete in the time it took for the
waiter thread to perform its check) but missed any seqno that lacked an
interrupt afterwards.

References: 39f3be162c46 ("drm/i915: Kick waiters on resetting legacy rings")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
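Reader's note: a toy model of why a one-shot kick only covers the immediate
window after the reset. It assumes a waiter inspects the seqno only when it is
woken, either by the single kick or by a per-completion interrupt. The function
and parameter names are hypothetical and the ticks are abstract; nothing here
is driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Count how many completions a sleeping waiter would observe.
 * complete_at[i] is the tick at which seqno i completes after the reset.
 * The waiter looks at the seqno when the one-shot kick makes it check
 * (at tick `kick_check_at`) or when an interrupt arrives for that completion.
 */
static size_t completions_seen(const int *complete_at, size_t count,
			       int kick_check_at, bool irqs_delivered)
{
	size_t seen = 0;

	assert(complete_at != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		/* Without interrupts, only work finished by the kick is seen. */
		if (irqs_delivered || complete_at[i] <= kick_check_at)
			seen++;
	}

	return seen;
}
```

In this model, with interrupts lost the kick catches only the completions that
land before the waiter's check; restoring interrupt delivery catches all of
them, which is why the kick becomes redundant.
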
 drivers/gpu/drm/i915/intel_ringbuffer.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 5a2601a4d1aa..adfb22f9d04f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -537,8 +537,6 @@ static int init_ring_common(struct intel_engine_cs *engine)
 	if (INTEL_GEN(dev_priv) > 2)
 		I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING));
 
-	/* Papering over lost _interrupts_ immediately following the restart */
-	intel_engine_wakeup(engine);
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
-- 
2.18.0

* Re: [PATCH 2/3] drm/i915: Unmask user interrupts writes into HWSP on snb/ivb/vlv/hsw
  2018-08-08  9:10 ` [PATCH 2/3] drm/i915: Unmask user interrupts writes into HWSP on snb/ivb/vlv/hsw Chris Wilson
@ 2018-08-08  9:17   ` Mika Kuoppala
  0 siblings, 0 replies; 6+ messages in thread
From: Mika Kuoppala @ 2018-08-08  9:17 UTC
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> An oddity occurs on Sandybridge, Ivybridge and Haswell (and presumably
> Valleyview) in that for the period following the GPU restart after a
> reset, there are no GT interrupts received. From Ville's notes, bit 0 in
> the HWSTAM corresponds to the render interrupt, and if we unmask it we
> do see immediate resumption of GT interrupt delivery (via the master irq
> handler) after the reset.
>
> v2: Limit the w/a to the render interrupt from rcs
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107500
> Fixes: c5498089463b ("drm/i915: Mask everything in ring HWSTAM on gen6+ in ringbuffer mode")
> References: d420a50c21ef ("drm/i915: Clean up the HWSTAM mess")
> Testcase: igt/gem_eio/reset-stress
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 8003cef767ba..5a2601a4d1aa 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -387,8 +387,18 @@ static void intel_ring_setup_status_page(struct intel_engine_cs *engine)
>  		mmio = RING_HWS_PGA(engine->mmio_base);
>  	}
>  
> -	if (INTEL_GEN(dev_priv) >= 6)
> -		I915_WRITE(RING_HWSTAM(engine->mmio_base), 0xffffffff);
> +	if (INTEL_GEN(dev_priv) >= 6) {
> +		u32 mask = ~0u;
> +
> +		/*
> +		 * Keep the render interrupt unmasked as this papaers over

papers, tho papaers sounds like it needs a grown-up to walk it across
the reset.

> +		 * lost interrupts following a reset.
> +		 */
> +		if (engine->id == RCS)
> +			mask &= ~BIT(0);
> +
> +		I915_WRITE(RING_HWSTAM(engine->mmio_base), mask);

This is fine too as the improved test pushes it with all engines.

Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> +	}
>  
>  	I915_WRITE(mmio, engine->status_page.ggtt_offset);
>  	POSTING_READ(mmio);
> -- 
> 2.18.0

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/3] drm/i915/selftests: Be loud if we run out of time
  2018-08-08  9:10 [PATCH 1/3] drm/i915/selftests: Be loud if we run out of time Chris Wilson
  2018-08-08  9:10 ` [PATCH 2/3] drm/i915: Unmask user interrupts writes into HWSP on snb/ivb/vlv/hsw Chris Wilson
  2018-08-08  9:10 ` [PATCH 3/3] drm/i915: Remove extra waiter kick on legacy resets Chris Wilson
@ 2018-08-08  9:52 ` Patchwork
  2018-08-08 10:08 ` ✗ Fi.CI.BAT: failure " Patchwork
  3 siblings, 0 replies; 6+ messages in thread
From: Patchwork @ 2018-08-08  9:52 UTC
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915/selftests: Be loud if we run out of time
URL   : https://patchwork.freedesktop.org/series/47866/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
f9e0780297ab drm/i915/selftests: Be loud if we run out of time
1bd99b6f9bc5 drm/i915: Unmask user interrupts writes into HWSP on snb/ivb/vlv/hsw
-:21: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit d420a50c21ef ("drm/i915: Clean up the HWSTAM mess")'
#21: 
References: d420a50c21ef ("drm/i915: Clean up the HWSTAM mess")

total: 1 errors, 0 warnings, 0 checks, 20 lines checked
c7d7b1fb8466 drm/i915: Remove extra waiter kick on legacy resets
-:16: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#16: 
References: 39f3be162c46 ("drm/i915: Kick waiters on resetting legacy rings")

-:16: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 39f3be162c46 ("drm/i915: Kick waiters on resetting legacy rings")'
#16: 
References: 39f3be162c46 ("drm/i915: Kick waiters on resetting legacy rings")

total: 1 errors, 1 warnings, 0 checks, 8 lines checked

* ✗ Fi.CI.BAT: failure for series starting with [1/3] drm/i915/selftests: Be loud if we run out of time
  2018-08-08  9:10 [PATCH 1/3] drm/i915/selftests: Be loud if we run out of time Chris Wilson
                   ` (2 preceding siblings ...)
  2018-08-08  9:52 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/3] drm/i915/selftests: Be loud if we run out of time Patchwork
@ 2018-08-08 10:08 ` Patchwork
  3 siblings, 0 replies; 6+ messages in thread
From: Patchwork @ 2018-08-08 10:08 UTC
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915/selftests: Be loud if we run out of time
URL   : https://patchwork.freedesktop.org/series/47866/
State : failure

== Summary ==

= CI Bug Log - changes from CI_DRM_4632 -> Patchwork_9880 =

== Summary - FAILURE ==

  Serious unknown changes coming with Patchwork_9880 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_9880, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/47866/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_9880:

  === IGT changes ===

    ==== Possible regressions ====

    igt@drv_selftest@live_hangcheck:
      fi-snb-2520m:       PASS -> DMESG-WARN
      {fi-bdw-samus}:     PASS -> DMESG-WARN
      {fi-kbl-8809g}:     PASS -> DMESG-WARN
      fi-hsw-peppy:       PASS -> DMESG-WARN
      fi-cnl-psr:         PASS -> DMESG-WARN
      fi-kbl-7500u:       PASS -> DMESG-WARN
      fi-hsw-4770r:       PASS -> DMESG-WARN
      fi-kbl-7560u:       PASS -> DMESG-WARN
      fi-bdw-5557u:       PASS -> DMESG-WARN
      fi-skl-6700hq:      PASS -> DMESG-WARN
      fi-skl-gvtdvm:      PASS -> DMESG-WARN
      fi-skl-6700k2:      PASS -> DMESG-WARN
      fi-elk-e7500:       PASS -> DMESG-WARN
      fi-byt-j1900:       PASS -> DMESG-WARN
      fi-blb-e6850:       PASS -> DMESG-WARN
      fi-cfl-guc:         PASS -> DMESG-WARN
      fi-skl-6600u:       PASS -> DMESG-WARN
      fi-pnv-d510:        PASS -> DMESG-WARN
      {fi-bsw-kefka}:     PASS -> DMESG-WARN
      fi-cfl-8700k:       PASS -> DMESG-WARN
      fi-kbl-r:           PASS -> DMESG-WARN
      fi-byt-n2820:       PASS -> DMESG-WARN
      {fi-byt-clapper}:   PASS -> DMESG-WARN
      {fi-cfl-8109u}:     PASS -> DMESG-WARN
      fi-kbl-guc:         PASS -> DMESG-WARN
      fi-cfl-s3:          PASS -> DMESG-WARN
      fi-gdg-551:         PASS -> DMESG-WARN
      fi-bwr-2160:        PASS -> DMESG-WARN
      fi-snb-2600:        PASS -> DMESG-WARN
      fi-skl-6770hq:      PASS -> DMESG-WARN
      fi-whl-u:           PASS -> DMESG-WARN
      fi-ivb-3520m:       PASS -> DMESG-WARN
      fi-hsw-4770:        PASS -> DMESG-WARN
      fi-bxt-dsi:         PASS -> DMESG-WARN
      fi-bxt-j4205:       PASS -> DMESG-WARN
      fi-skl-6260u:       PASS -> DMESG-WARN
      {fi-skl-iommu}:     PASS -> DMESG-WARN
      fi-glk-j4005:       PASS -> DMESG-WARN
      fi-ivb-3770:        PASS -> DMESG-WARN
      fi-ilk-650:         PASS -> DMESG-WARN
      fi-bsw-n3050:       PASS -> DMESG-WARN
      fi-bdw-gvtdvm:      PASS -> DMESG-WARN
      fi-kbl-x1275:       PASS -> DMESG-WARN
      fi-kbl-7567u:       PASS -> DMESG-WARN
      fi-glk-dsi:         PASS -> DMESG-WARN

    
    ==== Warnings ====

    igt@drv_selftest@live_hangcheck:
      fi-skl-guc:         DMESG-FAIL (fdo#107174) -> DMESG-WARN

    
== Known issues ==

  Here are the changes found in Patchwork_9880 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@drv_selftest@live_workarounds:
      fi-skl-6700k2:      PASS -> DMESG-FAIL (fdo#107292)

    igt@kms_frontbuffer_tracking@basic:
      fi-hsw-peppy:       PASS -> DMESG-FAIL (fdo#106103, fdo#102614)
      {fi-byt-clapper}:   PASS -> FAIL (fdo#103167)

    igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a:
      {fi-byt-clapper}:   PASS -> FAIL (fdo#107362)

    igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
      fi-skl-6700k2:      PASS -> FAIL (fdo#103191)

    
    ==== Possible fixes ====

    igt@drv_selftest@live_workarounds:
      fi-bdw-5557u:       DMESG-FAIL (fdo#107292) -> PASS

    
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  fdo#102614 https://bugs.freedesktop.org/show_bug.cgi?id=102614
  fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
  fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
  fdo#106103 https://bugs.freedesktop.org/show_bug.cgi?id=106103
  fdo#107174 https://bugs.freedesktop.org/show_bug.cgi?id=107174
  fdo#107292 https://bugs.freedesktop.org/show_bug.cgi?id=107292
  fdo#107362 https://bugs.freedesktop.org/show_bug.cgi?id=107362


== Participating hosts (52 -> 47) ==

  Missing    (5): fi-ctg-p8600 fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-hsw-4200u 


== Build changes ==

    * Linux: CI_DRM_4632 -> Patchwork_9880

  CI_DRM_4632: 648e2ff1094eabf43613f41d4d719c1a1f555dbb @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4587: 5d78c73d871525ec9caecd88ad7d9abe36637314 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9880: c7d7b1fb8466447a3b284bb993438c15aa496c72 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

c7d7b1fb8466 drm/i915: Remove extra waiter kick on legacy resets
1bd99b6f9bc5 drm/i915: Unmask user interrupts writes into HWSP on snb/ivb/vlv/hsw
f9e0780297ab drm/i915/selftests: Be loud if we run out of time

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9880/issues.html