[PATCH 1/2] drm/i915: Keep the rings stopped until they have been re-initialized

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/2] drm/i915: Keep the rings stopped until they have been re-initialized
@ 2017-10-13 13:12 Chris Wilson
  2017-10-13 13:12 ` [PATCH 2/2] drm/i915: Always stop the rings before a missing GPU reset Chris Wilson
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Chris Wilson @ 2017-10-13 13:12 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Before modifying the ring register (RING_START, HEAD, TAIL, CTL) we
first make sure it is stopped (or else the hw may not resample the
registers). However, we do not need to let the hw restart until after we
have reprogrammed all the rings. This should help prevent situations
where pending operations on the ring may resume (because we are trying
to re-initialize following an unsuccessful GPU hang, i.e. from
i915_gem_unset_wedged).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103260
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4285f09ff8b8..b2a6cb09c6e7 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -484,11 +484,6 @@ static bool stop_ring(struct intel_engine_cs *engine)
 	I915_WRITE_HEAD(engine, 0);
 	I915_WRITE_TAIL(engine, 0);
 
-	if (INTEL_GEN(dev_priv) > 2) {
-		(void)I915_READ_CTL(engine);
-		I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING));
-	}
-
 	return (I915_READ_HEAD(engine) & HEAD_ADDR) == 0;
 }
 
@@ -570,6 +565,9 @@ static int init_ring_common(struct intel_engine_cs *engine)
 
 	intel_engine_init_hangcheck(engine);
 
+	if (INTEL_GEN(dev_priv) > 2)
+		I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING));
+
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
-- 
2.15.0.rc0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] drm/i915: Always stop the rings before a missing GPU reset
  2017-10-13 13:12 [PATCH 1/2] drm/i915: Keep the rings stopped until they have been re-initialized Chris Wilson
@ 2017-10-13 13:12 ` Chris Wilson
  2017-10-13 13:55   ` Mika Kuoppala
  2017-10-13 13:49 ` [PATCH 1/2] drm/i915: Keep the rings stopped until they have been re-initialized Mika Kuoppala
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Chris Wilson @ 2017-10-13 13:12 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Always try to stop the rings, even if the GPU reset itself has been
disabled (via modparam i915.reset). This should at least stop the hw
from spinning in the background consuming resources (e.g. power and
memory bandwidth) letting the system rest-in-peace.

References: https://bugs.freedesktop.org/show_bug.cgi?id=103260
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 983617b5b338..20e3c65c0999 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1403,6 +1403,9 @@ static void i915_stop_engines(struct drm_i915_private *dev_priv,
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 
+	if (INTEL_GEN(dev_priv) < 3)
+		return;
+
 	for_each_engine_masked(engine, dev_priv, engine_mask, id)
 		gen3_stop_engine(engine);
 }
@@ -1742,16 +1745,12 @@ static reset_func intel_get_gpu_reset(struct drm_i915_private *dev_priv)
 
 int intel_gpu_reset(struct drm_i915_private *dev_priv, unsigned engine_mask)
 {
-	reset_func reset;
+	reset_func reset = intel_get_gpu_reset(dev_priv);
 	int retry;
 	int ret;
 
 	might_sleep();
 
-	reset = intel_get_gpu_reset(dev_priv);
-	if (reset == NULL)
-		return -ENODEV;
-
 	/* If the power well sleeps during the reset, the reset
 	 * request may be dropped and never completes (causing -EIO).
 	 */
@@ -1771,7 +1770,9 @@ int intel_gpu_reset(struct drm_i915_private *dev_priv, unsigned engine_mask)
 		 */
 		i915_stop_engines(dev_priv, engine_mask);
 
-		ret = reset(dev_priv, engine_mask);
+		ret = -ENODEV;
+		if (reset)
+			ret = reset(dev_priv, engine_mask);
 		if (ret != -ETIMEDOUT)
 			break;
 
-- 
2.15.0.rc0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] drm/i915: Keep the rings stopped until they have been re-initialized
  2017-10-13 13:12 [PATCH 1/2] drm/i915: Keep the rings stopped until they have been re-initialized Chris Wilson
  2017-10-13 13:12 ` [PATCH 2/2] drm/i915: Always stop the rings before a missing GPU reset Chris Wilson
@ 2017-10-13 13:49 ` Mika Kuoppala
  2017-10-13 14:57 ` ✓ Fi.CI.BAT: success for series starting with [1/2] " Patchwork
  2017-10-13 19:53 ` ✗ Fi.CI.IGT: failure " Patchwork
  3 siblings, 0 replies; 7+ messages in thread
From: Mika Kuoppala @ 2017-10-13 13:49 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Before modifying the ring register (RING_START, HEAD, TAIL, CTL) we
> first make sure it is stopped (or else the hw may not resample the
> registers). However, we do not need to let the hw restart until after we
> have reprogrammed all the rings. This should help prevent situations
> where pending operations on the ring may resume (because we are trying
> to re-initialize following an unsuccessful GPU hang, i.e. from
> i915_gem_unset_wedged).
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103260
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 4285f09ff8b8..b2a6cb09c6e7 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -484,11 +484,6 @@ static bool stop_ring(struct intel_engine_cs *engine)
>  	I915_WRITE_HEAD(engine, 0);
>  	I915_WRITE_TAIL(engine, 0);
>  
> -	if (INTEL_GEN(dev_priv) > 2) {
> -		(void)I915_READ_CTL(engine);
> -		I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING));
> -	}
> -
>  	return (I915_READ_HEAD(engine) & HEAD_ADDR) == 0;
>  }
>  
> @@ -570,6 +565,9 @@ static int init_ring_common(struct intel_engine_cs *engine)
>  
>  	intel_engine_init_hangcheck(engine);
>  
> +	if (INTEL_GEN(dev_priv) > 2)
> +		I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING));
> +

Make sense. Further work could be to introduce start_ring() helper.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

>  out:
>  	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>  
> -- 
> 2.15.0.rc0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] drm/i915: Always stop the rings before a missing GPU reset
  2017-10-13 13:12 ` [PATCH 2/2] drm/i915: Always stop the rings before a missing GPU reset Chris Wilson
@ 2017-10-13 13:55   ` Mika Kuoppala
  0 siblings, 0 replies; 7+ messages in thread
From: Mika Kuoppala @ 2017-10-13 13:55 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Always try to stop the rings, even if the GPU reset itself has been
> disabled (via modparam i915.reset). This should at least stop the hw
> from spinning in the background consuming resources (e.g. power and
> memory bandwidth) letting the system rest-in-peace.
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=103260
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 983617b5b338..20e3c65c0999 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -1403,6 +1403,9 @@ static void i915_stop_engines(struct drm_i915_private *dev_priv,
>  	struct intel_engine_cs *engine;
>  	enum intel_engine_id id;
>  
> +	if (INTEL_GEN(dev_priv) < 3)
> +		return;
> +
>  	for_each_engine_masked(engine, dev_priv, engine_mask, id)
>  		gen3_stop_engine(engine);
>  }
> @@ -1742,16 +1745,12 @@ static reset_func intel_get_gpu_reset(struct drm_i915_private *dev_priv)
>  
>  int intel_gpu_reset(struct drm_i915_private *dev_priv, unsigned engine_mask)
>  {
> -	reset_func reset;
> +	reset_func reset = intel_get_gpu_reset(dev_priv);
>  	int retry;
>  	int ret;
>  
>  	might_sleep();
>  
> -	reset = intel_get_gpu_reset(dev_priv);
> -	if (reset == NULL)
> -		return -ENODEV;
> -
>  	/* If the power well sleeps during the reset, the reset
>  	 * request may be dropped and never completes (causing -EIO).
>  	 */
> @@ -1771,7 +1770,9 @@ int intel_gpu_reset(struct drm_i915_private *dev_priv, unsigned engine_mask)
>  		 */
>  		i915_stop_engines(dev_priv, engine_mask);
>  
> -		ret = reset(dev_priv, engine_mask);
> +		ret = -ENODEV;
> +		if (reset)
> +			ret = reset(dev_priv, engine_mask);
>  		if (ret != -ETIMEDOUT)
>  			break;
>  
> -- 
> 2.15.0.rc0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Keep the rings stopped until they have been re-initialized
  2017-10-13 13:12 [PATCH 1/2] drm/i915: Keep the rings stopped until they have been re-initialized Chris Wilson
  2017-10-13 13:12 ` [PATCH 2/2] drm/i915: Always stop the rings before a missing GPU reset Chris Wilson
  2017-10-13 13:49 ` [PATCH 1/2] drm/i915: Keep the rings stopped until they have been re-initialized Mika Kuoppala
@ 2017-10-13 14:57 ` Patchwork
  2017-10-13 19:53 ` ✗ Fi.CI.IGT: failure " Patchwork
  3 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2017-10-13 14:57 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915: Keep the rings stopped until they have been re-initialized
URL   : https://patchwork.freedesktop.org/series/31915/
State : success

== Summary ==

Series 31915v1 series starting with [1/2] drm/i915: Keep the rings stopped until they have been re-initialized
https://patchwork.freedesktop.org/api/1.0/series/31915/revisions/1/mbox/

Test gem_exec_reloc:
        Subgroup basic-gtt-active:
                pass       -> FAIL       (fi-gdg-551) fdo#102582 +1

fdo#102582 https://bugs.freedesktop.org/show_bug.cgi?id=102582

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:451s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:467s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:385s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:560s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:285s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:531s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:517s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:531s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:512s
fi-cfl-s         total:289  pass:253  dwarn:4   dfail:0   fail:0   skip:32  time:564s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:434s
fi-gdg-551       total:289  pass:176  dwarn:1   dfail:0   fail:3   skip:109 time:269s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:602s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:440s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:459s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:512s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:478s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:503s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:487s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:594s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:650s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:473s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:660s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:535s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:569s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:476s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:582s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:427s

07966d6a0782f18849de796094708a42334c9946 drm-tip: 2017y-10m-13d-14h-11m-45s UTC integration manifest
704d14b0fa48 drm/i915: Always stop the rings before a missing GPU reset
d1780b40b32f drm/i915: Keep the rings stopped until they have been re-initialized

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6023/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* ✗ Fi.CI.IGT: failure for series starting with [1/2] drm/i915: Keep the rings stopped until they have been re-initialized
  2017-10-13 13:12 [PATCH 1/2] drm/i915: Keep the rings stopped until they have been re-initialized Chris Wilson
                   ` (2 preceding siblings ...)
  2017-10-13 14:57 ` ✓ Fi.CI.BAT: success for series starting with [1/2] " Patchwork
@ 2017-10-13 19:53 ` Patchwork
  2017-10-13 19:58   ` Chris Wilson
  3 siblings, 1 reply; 7+ messages in thread
From: Patchwork @ 2017-10-13 19:53 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915: Keep the rings stopped until they have been re-initialized
URL   : https://patchwork.freedesktop.org/series/31915/
State : failure

== Summary ==

Test gem_eio:
        Subgroup in-flight-suspend:
                dmesg-warn -> PASS       (shard-hsw) fdo#103260
Test kms_setmode:
        Subgroup basic:
                pass       -> FAIL       (shard-hsw) fdo#99912
Test kms_chv_cursor_fail:
        Subgroup pipe-B-128x128-right-edge:
                pass       -> DMESG-FAIL (shard-hsw)

fdo#103260 https://bugs.freedesktop.org/show_bug.cgi?id=103260
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-hsw        total:2553 pass:1440 dwarn:0   dfail:1   fail:9   skip:1103 time:9660s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6023/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ✗ Fi.CI.IGT: failure for series starting with [1/2] drm/i915: Keep the rings stopped until they have been re-initialized
  2017-10-13 19:53 ` ✗ Fi.CI.IGT: failure " Patchwork
@ 2017-10-13 19:58   ` Chris Wilson
  0 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2017-10-13 19:58 UTC (permalink / raw)
  To: Patchwork; +Cc: intel-gfx

Quoting Patchwork (2017-10-13 20:53:49)
> == Series Details ==
> 
> Series: series starting with [1/2] drm/i915: Keep the rings stopped until they have been re-initialized
> URL   : https://patchwork.freedesktop.org/series/31915/
> State : failure
> 
> == Summary ==
> 
> Test gem_eio:
>         Subgroup in-flight-suspend:
>                 dmesg-warn -> PASS       (shard-hsw) fdo#103260

And fingers crossed that wasn't a fluke as that was the goal of
this series! Thanks for the review,
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-10-13 19:58 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-13 13:12 [PATCH 1/2] drm/i915: Keep the rings stopped until they have been re-initialized Chris Wilson
2017-10-13 13:12 ` [PATCH 2/2] drm/i915: Always stop the rings before a missing GPU reset Chris Wilson
2017-10-13 13:55   ` Mika Kuoppala
2017-10-13 13:49 ` [PATCH 1/2] drm/i915: Keep the rings stopped until they have been re-initialized Mika Kuoppala
2017-10-13 14:57 ` ✓ Fi.CI.BAT: success for series starting with [1/2] " Patchwork
2017-10-13 19:53 ` ✗ Fi.CI.IGT: failure " Patchwork
2017-10-13 19:58   ` Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.