* [PATCH 1/3] drm/i915: Always sanity check engine state upon idling
@ 2017-08-26 11:09 Chris Wilson
2017-08-26 11:09 ` [PATCH 2/3] drm/i915: Clear wedged status upon resume Chris Wilson
` (6 more replies)
0 siblings, 7 replies; 12+ messages in thread
From: Chris Wilson @ 2017-08-26 11:09 UTC (permalink / raw)
To: intel-gfx; +Cc: Matthias Kaehlcke
When we do a locked idle we know that afterwards all requests have been
completed and the engines have been cleared of tasks. For whatever
reason, this doesn't always happen and we may go into a suspend with
ELSP still full, and this causes an issue upon resume as we get very,
very confused.
If the engines refuse to idle, mark the device as wedged. In the process
we get rid of the maybe unused open-coded version of wait_for_engines
reported by Nick Desaulniers and Matthias Kaehlcke.
References: https://bugs.freedesktop.org/show_bug.cgi?id=101891
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
---
drivers/gpu/drm/i915/i915_gem.c | 20 ++++----------------
1 file changed, 4 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ac02785fdaff..c1520c0d2084 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3371,24 +3371,12 @@ static int wait_for_timeline(struct i915_gem_timeline *tl, unsigned int flags)
return 0;
}
-static int wait_for_engine(struct intel_engine_cs *engine, int timeout_ms)
-{
- return wait_for(intel_engine_is_idle(engine), timeout_ms);
-}
-
static int wait_for_engines(struct drm_i915_private *i915)
{
- struct intel_engine_cs *engine;
- enum intel_engine_id id;
-
- for_each_engine(engine, i915, id) {
- if (GEM_WARN_ON(wait_for_engine(engine, 50))) {
- i915_gem_set_wedged(i915);
- return -EIO;
- }
-
- GEM_BUG_ON(intel_engine_get_seqno(engine) !=
- intel_engine_last_submit(engine));
+ if (wait_for(intel_engines_are_idle(i915), 50)) {
+ DRM_ERROR("Failed to idle engines, declaring wedged!\n");
+ i915_gem_set_wedged(i915);
+ return -EIO;
}
return 0;
--
2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/3] drm/i915: Clear wedged status upon resume
2017-08-26 11:09 [PATCH 1/3] drm/i915: Always sanity check engine state upon idling Chris Wilson
@ 2017-08-26 11:09 ` Chris Wilson
2017-08-29 13:49 ` Mika Kuoppala
2017-08-26 11:09 ` [PATCH 3/3] drm/i915: Discard the request queue if we fail to sleep before suspend Chris Wilson
` (5 subsequent siblings)
6 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2017-08-26 11:09 UTC (permalink / raw)
To: intel-gfx
When we wait up from suspend, the device has been powered down and
should come back afresh. We should be able to safely remove the wedged
status from the previous session and start afresh.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c1520c0d2084..9dc24b915aa7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4518,6 +4518,12 @@ static void assert_kernel_context_is_current(struct drm_i915_private *dev_priv)
void i915_gem_sanitize(struct drm_i915_private *i915)
{
+ if (i915_terminally_wedged(&i915->gpu_error)) {
+ mutex_lock(&i915->drm.struct_mutex);
+ i915_gem_unset_wedged(i915);
+ mutex_unlock(&i915->drm.struct_mutex);
+ }
+
/*
* If we inherit context state from the BIOS or earlier occupants
* of the GPU, the GPU may be in an inconsistent state when we
--
2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/3] drm/i915: Discard the request queue if we fail to sleep before suspend
2017-08-26 11:09 [PATCH 1/3] drm/i915: Always sanity check engine state upon idling Chris Wilson
2017-08-26 11:09 ` [PATCH 2/3] drm/i915: Clear wedged status upon resume Chris Wilson
@ 2017-08-26 11:09 ` Chris Wilson
2017-08-29 13:54 ` Mika Kuoppala
2017-08-26 11:26 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Always sanity check engine state upon idling Patchwork
` (4 subsequent siblings)
6 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2017-08-26 11:09 UTC (permalink / raw)
To: intel-gfx
If we fail to clear the outstanding request queue before suspending,
mark those requests as lost.
References: https://bugs.freedesktop.org/show_bug.cgi?id=102037
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9dc24b915aa7..37fbc64d9ffe 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4585,7 +4585,8 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
* reset the GPU back to its idle, low power state.
*/
WARN_ON(dev_priv->gt.awake);
- WARN_ON(!intel_engines_are_idle(dev_priv));
+ if (WARN_ON(!intel_engines_are_idle(dev_priv)))
+ i915_gem_set_wedged(dev_priv); /* no hope, so reset everthing */
/*
* Neither the BIOS, ourselves or any other kernel
--
2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 12+ messages in thread
* ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Always sanity check engine state upon idling
2017-08-26 11:09 [PATCH 1/3] drm/i915: Always sanity check engine state upon idling Chris Wilson
2017-08-26 11:09 ` [PATCH 2/3] drm/i915: Clear wedged status upon resume Chris Wilson
2017-08-26 11:09 ` [PATCH 3/3] drm/i915: Discard the request queue if we fail to sleep before suspend Chris Wilson
@ 2017-08-26 11:26 ` Patchwork
2017-08-26 12:21 ` ✗ Fi.CI.IGT: warning " Patchwork
` (3 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2017-08-26 11:26 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/3] drm/i915: Always sanity check engine state upon idling
URL : https://patchwork.freedesktop.org/series/29387/
State : success
== Summary ==
Series 29387v1 series starting with [1/3] drm/i915: Always sanity check engine state upon idling
https://patchwork.freedesktop.org/api/1.0/series/29387/revisions/1/mbox/
Test kms_cursor_legacy:
Subgroup basic-busy-flip-before-cursor-atomic:
pass -> FAIL (fi-snb-2600) fdo#100215 +1
Test kms_frontbuffer_tracking:
Subgroup basic:
dmesg-warn -> PASS (fi-bdw-5557u) fdo#102410
fdo#100215 https://bugs.freedesktop.org/show_bug.cgi?id=100215
fdo#102410 https://bugs.freedesktop.org/show_bug.cgi?id=102410
fi-bdw-5557u total:279 pass:268 dwarn:0 dfail:0 fail:0 skip:11 time:452s
fi-bdw-gvtdvm total:279 pass:265 dwarn:0 dfail:0 fail:0 skip:14 time:438s
fi-blb-e6850 total:279 pass:224 dwarn:1 dfail:0 fail:0 skip:54 time:366s
fi-bsw-n3050 total:279 pass:243 dwarn:0 dfail:0 fail:0 skip:36 time:550s
fi-bwr-2160 total:279 pass:184 dwarn:0 dfail:0 fail:0 skip:95 time:252s
fi-bxt-j4205 total:279 pass:260 dwarn:0 dfail:0 fail:0 skip:19 time:524s
fi-byt-j1900 total:279 pass:254 dwarn:1 dfail:0 fail:0 skip:24 time:526s
fi-byt-n2820 total:279 pass:250 dwarn:1 dfail:0 fail:0 skip:28 time:516s
fi-elk-e7500 total:279 pass:230 dwarn:0 dfail:0 fail:0 skip:49 time:439s
fi-glk-2a total:279 pass:260 dwarn:0 dfail:0 fail:0 skip:19 time:609s
fi-hsw-4770 total:279 pass:263 dwarn:0 dfail:0 fail:0 skip:16 time:450s
fi-hsw-4770r total:279 pass:263 dwarn:0 dfail:0 fail:0 skip:16 time:425s
fi-ilk-650 total:279 pass:229 dwarn:0 dfail:0 fail:0 skip:50 time:425s
fi-ivb-3520m total:279 pass:261 dwarn:0 dfail:0 fail:0 skip:18 time:495s
fi-ivb-3770 total:279 pass:261 dwarn:0 dfail:0 fail:0 skip:18 time:471s
fi-kbl-7500u total:279 pass:261 dwarn:0 dfail:0 fail:0 skip:18 time:478s
fi-kbl-7560u total:279 pass:269 dwarn:0 dfail:0 fail:0 skip:10 time:595s
fi-kbl-r total:279 pass:261 dwarn:0 dfail:0 fail:0 skip:18 time:601s
fi-pnv-d510 total:279 pass:223 dwarn:1 dfail:0 fail:0 skip:55 time:516s
fi-skl-6260u total:279 pass:269 dwarn:0 dfail:0 fail:0 skip:10 time:470s
fi-skl-6700k total:279 pass:261 dwarn:0 dfail:0 fail:0 skip:18 time:475s
fi-skl-6770hq total:279 pass:269 dwarn:0 dfail:0 fail:0 skip:10 time:489s
fi-skl-gvtdvm total:279 pass:266 dwarn:0 dfail:0 fail:0 skip:13 time:443s
fi-skl-x1585l total:279 pass:268 dwarn:0 dfail:0 fail:0 skip:11 time:482s
fi-snb-2520m total:279 pass:251 dwarn:0 dfail:0 fail:0 skip:28 time:548s
fi-snb-2600 total:279 pass:249 dwarn:0 dfail:0 fail:1 skip:29 time:406s
8afd435a127714eca60bcd86e1856aa14d1eea9e drm-tip: 2017y-08m-26d-10h-07m-05s UTC integration manifest
bc9cf04a1e60 drm/i915: Discard the request queue if we fail to sleep before suspend
336e612acb85 drm/i915: Clear wedged status upon resume
628b21e06ed9 drm/i915: Always sanity check engine state upon idling
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5499/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* ✗ Fi.CI.IGT: warning for series starting with [1/3] drm/i915: Always sanity check engine state upon idling
2017-08-26 11:09 [PATCH 1/3] drm/i915: Always sanity check engine state upon idling Chris Wilson
` (2 preceding siblings ...)
2017-08-26 11:26 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Always sanity check engine state upon idling Patchwork
@ 2017-08-26 12:21 ` Patchwork
2017-08-29 12:25 ` [PATCH 1/3] " Chris Wilson
` (2 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2017-08-26 12:21 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/3] drm/i915: Always sanity check engine state upon idling
URL : https://patchwork.freedesktop.org/series/29387/
State : warning
== Summary ==
Test perf:
Subgroup blocking:
fail -> PASS (shard-hsw) fdo#102252
Test kms_plane_multiple:
Subgroup atomic-pipe-A-tiling-x:
pass -> SKIP (shard-hsw)
fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252
shard-hsw total:2185 pass:1210 dwarn:0 dfail:0 fail:16 skip:959 time:9348s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5499/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] drm/i915: Always sanity check engine state upon idling
2017-08-26 11:09 [PATCH 1/3] drm/i915: Always sanity check engine state upon idling Chris Wilson
` (3 preceding siblings ...)
2017-08-26 12:21 ` ✗ Fi.CI.IGT: warning " Patchwork
@ 2017-08-29 12:25 ` Chris Wilson
2017-08-29 13:07 ` Joonas Lahtinen
2017-08-29 13:36 ` Mika Kuoppala
6 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-08-29 12:25 UTC (permalink / raw)
To: intel-gfx; +Cc: Matthias Kaehlcke
Quoting Chris Wilson (2017-08-26 12:09:33)
> When we do a locked idle we know that afterwards all requests have been
> completed and the engines have been cleared of tasks. For whatever
> reason, this doesn't always happen and we may go into a suspend with
> ELSP still full, and this causes an issue upon resume as we get very,
> very confused.
>
> If the engines refuse to idle, mark the device as wedged. In the process
> we get rid of the maybe unused open-coded version of wait_for_engines
> reported by Nick Desaulniers and Matthias Kaehlcke.
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=101891
References: https://bugs.freedesktop.org/show_bug.cgi?id=102456
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Matthias Kaehlcke <mka@chromium.org>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] drm/i915: Always sanity check engine state upon idling
2017-08-26 11:09 [PATCH 1/3] drm/i915: Always sanity check engine state upon idling Chris Wilson
` (4 preceding siblings ...)
2017-08-29 12:25 ` [PATCH 1/3] " Chris Wilson
@ 2017-08-29 13:07 ` Joonas Lahtinen
2017-08-29 13:19 ` Chris Wilson
2017-08-29 13:36 ` Mika Kuoppala
6 siblings, 1 reply; 12+ messages in thread
From: Joonas Lahtinen @ 2017-08-29 13:07 UTC (permalink / raw)
To: Chris Wilson, intel-gfx; +Cc: Matthias Kaehlcke
On Sat, 2017-08-26 at 12:09 +0100, Chris Wilson wrote:
> When we do a locked idle we know that afterwards all requests have been
> completed and the engines have been cleared of tasks. For whatever
> reason, this doesn't always happen and we may go into a suspend with
> ELSP still full, and this causes an issue upon resume as we get very,
> very confused.
>
> If the engines refuse to idle, mark the device as wedged. In the process
> we get rid of the maybe unused open-coded version of wait_for_engines
> reported by Nick Desaulniers and Matthias Kaehlcke.
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=101891
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Matthias Kaehlcke <mka@chromium.org>
I assume GEM_WARN_ON -> DRM_ERROR was intentional.
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] drm/i915: Always sanity check engine state upon idling
2017-08-29 13:07 ` Joonas Lahtinen
@ 2017-08-29 13:19 ` Chris Wilson
0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-08-29 13:19 UTC (permalink / raw)
To: Joonas Lahtinen, intel-gfx; +Cc: Matthias Kaehlcke
Quoting Joonas Lahtinen (2017-08-29 14:07:40)
> On Sat, 2017-08-26 at 12:09 +0100, Chris Wilson wrote:
> > When we do a locked idle we know that afterwards all requests have been
> > completed and the engines have been cleared of tasks. For whatever
> > reason, this doesn't always happen and we may go into a suspend with
> > ELSP still full, and this causes an issue upon resume as we get very,
> > very confused.
> >
> > If the engines refuse to idle, mark the device as wedged. In the process
> > we get rid of the maybe unused open-coded version of wait_for_engines
> > reported by Nick Desaulniers and Matthias Kaehlcke.
> >
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=101891
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Matthias Kaehlcke <mka@chromium.org>
>
> I assume GEM_WARN_ON -> DRM_ERROR was intentional.
Yes. The first time the unused function was reported, the thread drifted
off in the direction of "that we probably want to always do the test"
rather than only in CI. Now that we have seen glk actually fail in this
way, we need to code defensively here (as it is no longer a theoretical
programming error). As it still is a hw issue we want the warning
(especially as this will cause the suspend to fail, we want a reason
why) and as a general rule all wedging should indicate an error (because
it is a last resort around driver bugs).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] drm/i915: Always sanity check engine state upon idling
2017-08-26 11:09 [PATCH 1/3] drm/i915: Always sanity check engine state upon idling Chris Wilson
` (5 preceding siblings ...)
2017-08-29 13:07 ` Joonas Lahtinen
@ 2017-08-29 13:36 ` Mika Kuoppala
2017-08-29 13:55 ` Chris Wilson
6 siblings, 1 reply; 12+ messages in thread
From: Mika Kuoppala @ 2017-08-29 13:36 UTC (permalink / raw)
To: Chris Wilson, intel-gfx; +Cc: Matthias Kaehlcke
Chris Wilson <chris@chris-wilson.co.uk> writes:
> When we do a locked idle we know that afterwards all requests have been
> completed and the engines have been cleared of tasks. For whatever
> reason, this doesn't always happen and we may go into a suspend with
> ELSP still full, and this causes an issue upon resume as we get very,
> very confused.
>
> If the engines refuse to idle, mark the device as wedged. In the process
> we get rid of the maybe unused open-coded version of wait_for_engines
> reported by Nick Desaulniers and Matthias Kaehlcke.
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=101891
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Matthias Kaehlcke <mka@chromium.org>
I noticed that when actually do switch to kernel context, it's
async. And then we always do wait for idle.
So as all our usage is sync, why don't we just wait the req in
i915_gem_switch_to_kernel_context(i915) to pinpoint the request
uncompletion. And in addition have this as a further harderning.
But for the unconditional wedge and warn,
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
-Mika
> ---
> drivers/gpu/drm/i915/i915_gem.c | 20 ++++----------------
> 1 file changed, 4 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index ac02785fdaff..c1520c0d2084 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3371,24 +3371,12 @@ static int wait_for_timeline(struct i915_gem_timeline *tl, unsigned int flags)
> return 0;
> }
>
> -static int wait_for_engine(struct intel_engine_cs *engine, int timeout_ms)
> -{
> - return wait_for(intel_engine_is_idle(engine), timeout_ms);
> -}
> -
> static int wait_for_engines(struct drm_i915_private *i915)
> {
> - struct intel_engine_cs *engine;
> - enum intel_engine_id id;
> -
> - for_each_engine(engine, i915, id) {
> - if (GEM_WARN_ON(wait_for_engine(engine, 50))) {
> - i915_gem_set_wedged(i915);
> - return -EIO;
> - }
> -
> - GEM_BUG_ON(intel_engine_get_seqno(engine) !=
> - intel_engine_last_submit(engine));
> + if (wait_for(intel_engines_are_idle(i915), 50)) {
> + DRM_ERROR("Failed to idle engines, declaring wedged!\n");
> + i915_gem_set_wedged(i915);
> + return -EIO;
> }
>
> return 0;
> --
> 2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] drm/i915: Clear wedged status upon resume
2017-08-26 11:09 ` [PATCH 2/3] drm/i915: Clear wedged status upon resume Chris Wilson
@ 2017-08-29 13:49 ` Mika Kuoppala
0 siblings, 0 replies; 12+ messages in thread
From: Mika Kuoppala @ 2017-08-29 13:49 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> When we wait up from suspend, the device has been powered down and
> should come back afresh. We should be able to safely remove the wedged
> status from the previous session and start afresh.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c1520c0d2084..9dc24b915aa7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4518,6 +4518,12 @@ static void assert_kernel_context_is_current(struct drm_i915_private *dev_priv)
>
> void i915_gem_sanitize(struct drm_i915_private *i915)
> {
> + if (i915_terminally_wedged(&i915->gpu_error)) {
> + mutex_lock(&i915->drm.struct_mutex);
> + i915_gem_unset_wedged(i915);
> + mutex_unlock(&i915->drm.struct_mutex);
> + }
> +
> /*
> * If we inherit context state from the BIOS or earlier occupants
> * of the GPU, the GPU may be in an inconsistent state when we
> --
> 2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] drm/i915: Discard the request queue if we fail to sleep before suspend
2017-08-26 11:09 ` [PATCH 3/3] drm/i915: Discard the request queue if we fail to sleep before suspend Chris Wilson
@ 2017-08-29 13:54 ` Mika Kuoppala
0 siblings, 0 replies; 12+ messages in thread
From: Mika Kuoppala @ 2017-08-29 13:54 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> If we fail to clear the outstanding request queue before suspending,
> mark those requests as lost.
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=102037
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 9dc24b915aa7..37fbc64d9ffe 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4585,7 +4585,8 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
> * reset the GPU back to its idle, low power state.
> */
> WARN_ON(dev_priv->gt.awake);
> - WARN_ON(!intel_engines_are_idle(dev_priv));
> + if (WARN_ON(!intel_engines_are_idle(dev_priv)))
> + i915_gem_set_wedged(dev_priv); /* no hope, so reset everthing */
s/ever/every
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
>
> /*
> * Neither the BIOS, ourselves or any other kernel
> --
> 2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] drm/i915: Always sanity check engine state upon idling
2017-08-29 13:36 ` Mika Kuoppala
@ 2017-08-29 13:55 ` Chris Wilson
0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-08-29 13:55 UTC (permalink / raw)
To: Mika Kuoppala, intel-gfx; +Cc: Matthias Kaehlcke
Quoting Mika Kuoppala (2017-08-29 14:36:57)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > When we do a locked idle we know that afterwards all requests have been
> > completed and the engines have been cleared of tasks. For whatever
> > reason, this doesn't always happen and we may go into a suspend with
> > ELSP still full, and this causes an issue upon resume as we get very,
> > very confused.
> >
> > If the engines refuse to idle, mark the device as wedged. In the process
> > we get rid of the maybe unused open-coded version of wait_for_engines
> > reported by Nick Desaulniers and Matthias Kaehlcke.
> >
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=101891
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Matthias Kaehlcke <mka@chromium.org>
>
> I noticed that when actually do switch to kernel context, it's
> async. And then we always do wait for idle.
>
> So as all our usage is sync, why don't we just wait the req in
> i915_gem_switch_to_kernel_context(i915) to pinpoint the request
> uncompletion. And in addition have this as a further harderning.
They are separate for historical reasons, i.e. they have been used
independently. Note that the switch to kernel context may be between 0
and one request per engine to wait upon, and yet we still want to wait.
However, we can move the wait-for-idle into switch-to-kernel-context as
that is common across all callers at present.
* spots an open coded switch to kernel context.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2017-08-29 13:56 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-26 11:09 [PATCH 1/3] drm/i915: Always sanity check engine state upon idling Chris Wilson
2017-08-26 11:09 ` [PATCH 2/3] drm/i915: Clear wedged status upon resume Chris Wilson
2017-08-29 13:49 ` Mika Kuoppala
2017-08-26 11:09 ` [PATCH 3/3] drm/i915: Discard the request queue if we fail to sleep before suspend Chris Wilson
2017-08-29 13:54 ` Mika Kuoppala
2017-08-26 11:26 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Always sanity check engine state upon idling Patchwork
2017-08-26 12:21 ` ✗ Fi.CI.IGT: warning " Patchwork
2017-08-29 12:25 ` [PATCH 1/3] " Chris Wilson
2017-08-29 13:07 ` Joonas Lahtinen
2017-08-29 13:19 ` Chris Wilson
2017-08-29 13:36 ` Mika Kuoppala
2017-08-29 13:55 ` Chris Wilson
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.