* [PATCH] drm/i915: Stop ring before doing readiness check
From: Mika Kuoppala @ 2017-09-13 14:01 UTC (permalink / raw)
To: intel-gfx
Evidence indicates that even if the hardware happily
tells us to proceed with reset, it really isn't ready.
Resetting a freely running batchbuffer after we have
got ack for readiness, still can cause a system hang.
Attempt to stop ring before proceeding for ready check
and reset to avoid losing the machine.
Testcase: igt/prime_busy/hang-* # kbl
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
drivers/gpu/drm/i915/intel_uncore.c | 54 ++++++++++++++++++++++---------------
1 file changed, 32 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 1b38eb94d461..f9ef1931516c 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1361,33 +1361,38 @@ int i915_reg_read_ioctl(struct drm_device *dev,
 	return ret;
 }
 
+static void gen3_stop_ring(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	const u32 base = engine->mmio_base;
+	const i915_reg_t mode = RING_MI_MODE(base);
+
+	I915_WRITE_FW(mode, _MASKED_BIT_ENABLE(STOP_RING));
+	if (intel_wait_for_register_fw(dev_priv,
+				       mode,
+				       MODE_IDLE,
+				       MODE_IDLE,
+				       500))
+		DRM_DEBUG_DRIVER("%s: timed out on STOP_RING\n",
+				 engine->name);
+
+	I915_WRITE_FW(RING_CTL(base), 0);
+	I915_WRITE_FW(RING_HEAD(base), 0);
+	I915_WRITE_FW(RING_TAIL(base), 0);
+
+	/* Check acts as a post */
+	if (I915_READ_FW(RING_HEAD(base)) != 0)
+		DRM_DEBUG_DRIVER("%s: ring head not parked\n",
+				 engine->name);
+}
+
 static void gen3_stop_rings(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 
-	for_each_engine(engine, dev_priv, id) {
-		const u32 base = engine->mmio_base;
-		const i915_reg_t mode = RING_MI_MODE(base);
-
-		I915_WRITE_FW(mode, _MASKED_BIT_ENABLE(STOP_RING));
-		if (intel_wait_for_register_fw(dev_priv,
-					       mode,
-					       MODE_IDLE,
-					       MODE_IDLE,
-					       500))
-			DRM_DEBUG_DRIVER("%s: timed out on STOP_RING\n",
-					 engine->name);
-
-		I915_WRITE_FW(RING_CTL(base), 0);
-		I915_WRITE_FW(RING_HEAD(base), 0);
-		I915_WRITE_FW(RING_TAIL(base), 0);
-
-		/* Check acts as a post */
-		if (I915_READ_FW(RING_HEAD(base)) != 0)
-			DRM_DEBUG_DRIVER("%s: ring head not parked\n",
-					 engine->name);
-	}
+	for_each_engine(engine, dev_priv, id)
+		gen3_stop_ring(engine);
 }
 
 static bool i915_reset_complete(struct pci_dev *pdev)
@@ -1668,6 +1673,11 @@ static int gen8_reset_engine_start(struct intel_engine_cs *engine)
 	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
+	/* If the bb is still running at this stage, forcing a
+	 * reset risks a system hang.
+	 */
+	gen3_stop_ring(engine);
+
 	I915_WRITE_FW(RING_RESET_CTL(engine->mmio_base),
 		      _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET));
 
--
2.11.0
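
For readers who want to poke at the ordering outside the driver, here is a rough userspace sketch of what the patch does: set STOP_RING with a masked write, poll for MODE_IDLE, park the ring registers, and only then request the reset. Everything in it is an assumption made for illustration: the `fake_engine` register file, the register indices, the bit positions, and the helper names are hypothetical stand-ins, not the i915 macros or real MMIO offsets, and the poll budget counts reads rather than milliseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register indices for one simulated engine (not real offsets). */
enum { REG_MI_MODE, REG_CTL, REG_HEAD, REG_TAIL, REG_RESET_CTL, REG_COUNT };

/* Illustrative bit positions, assumed for this sketch only. */
#define STOP_RING     (1u << 8)
#define MODE_IDLE     (1u << 9)
#define REQUEST_RESET (1u << 0)

struct fake_engine {
	uint32_t regs[REG_COUNT];
	int idle_after_polls; /* reads until the fake hardware goes idle; < 0 means never */
};

/* Masked-register convention: the high 16 bits select which low bits are written. */
static uint32_t masked_bit_enable(uint32_t bit)
{
	assert((bit & 0xffff0000u) == 0);
	return (bit << 16) | bit;
}

static void masked_write(struct fake_engine *e, int reg, uint32_t val)
{
	uint32_t mask = val >> 16;

	e->regs[reg] = (e->regs[reg] & ~mask) | (val & mask);
}

/* Reading MI_MODE lets the fake hardware advance towards idle once stopped. */
static uint32_t read_reg(struct fake_engine *e, int reg)
{
	if (reg == REG_MI_MODE && (e->regs[reg] & STOP_RING)) {
		if (e->idle_after_polls == 0)
			e->regs[reg] |= MODE_IDLE;
		else if (e->idle_after_polls > 0)
			e->idle_after_polls--;
	}
	return e->regs[reg];
}

/* Poll until (reg & mask) == value or the poll budget runs out. */
static bool wait_for_reg(struct fake_engine *e, int reg, uint32_t mask,
			 uint32_t value, int max_polls)
{
	for (int i = 0; i < max_polls; i++)
		if ((read_reg(e, reg) & mask) == value)
			return true;
	return false;
}

/* Stop the ring and park it; returns false if idle was never reported. */
static bool stop_ring(struct fake_engine *e)
{
	bool idle;

	masked_write(e, REG_MI_MODE, masked_bit_enable(STOP_RING));
	idle = wait_for_reg(e, REG_MI_MODE, MODE_IDLE, MODE_IDLE, 500);

	/* As in the patch, park the ring even when the wait timed out. */
	e->regs[REG_CTL] = 0;
	e->regs[REG_HEAD] = 0;
	e->regs[REG_TAIL] = 0;
	return idle;
}

/* The ordering under discussion: stop first, then ask for the reset. */
static bool reset_engine_start(struct fake_engine *e)
{
	bool idle = stop_ring(e);

	masked_write(e, REG_RESET_CTL, masked_bit_enable(REQUEST_RESET));
	return idle;
}
```

Like gen3_stop_ring() in the patch, the sketch carries on to the reset request when the idle wait times out; it only reports the timeout to the caller instead of logging it.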
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [PATCH] drm/i915: Stop ring before doing readiness check
From: Chris Wilson @ 2017-09-13 14:08 UTC (permalink / raw)
To: Mika Kuoppala, intel-gfx
Quoting Mika Kuoppala (2017-09-13 15:01:17)
> Evidence indicates that even if the hardware happily
> tells us to proceed with reset, it really isn't ready.
> Resetting a freely running batchbuffer after we have
> got ack for readiness, still can cause a system hang.
Hmm, so we see it on early gen and late gen. I suggest we do it
universally (except gen2 which is lacking the mechanism). It's unlikely
that the requirement disappeared just for a couple of gen, more likely
that we simply haven't triggered the pathological behaviour.
Other than,
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
for the find.
-Chris
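
One way to picture the "do it universally" suggestion is a small planner that always puts the ring stop first whenever a reset is possible at all. This is only a sketch under assumptions: `plan_reset`, the `reset_step` enum, and the gen thresholds are hypothetical and not taken from the driver. In particular, treating the readiness handshake as gen8+ is inferred from the gen8_reset_engine_start() name in the patch, and returning no steps for gen2 follows the remarks in this thread that gen2 lacks the mechanism and has no GPU reset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step names for this sketch, not driver identifiers. */
enum reset_step { STEP_STOP_RINGS, STEP_READY_CHECK, STEP_RESET };

/*
 * Fill 'steps' with the reset sequence for a given gen and return how many
 * steps were written. Assumed policy: gen2 gets nothing, every later gen
 * stops its rings first, and gen8+ additionally does a readiness check.
 */
static size_t plan_reset(int gen, enum reset_step *steps, size_t cap)
{
	size_t n = 0;

	assert(steps != NULL || cap == 0);

	if (gen < 3)
		return 0; /* no STOP_RING and no GPU reset on gen2, per the thread */

	if (n < cap)
		steps[n++] = STEP_STOP_RINGS;
	if (gen >= 8 && n < cap)
		steps[n++] = STEP_READY_CHECK;
	if (n < cap)
		steps[n++] = STEP_RESET;
	return n;
}
```

Under these assumptions a gen6 plan is stop rings then reset, and a gen9 plan is stop rings, readiness check, reset.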
* Re: [PATCH] drm/i915: Stop ring before doing readiness check
From: Ville Syrjälä @ 2017-09-13 14:13 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Wed, Sep 13, 2017 at 03:08:06PM +0100, Chris Wilson wrote:
> Quoting Mika Kuoppala (2017-09-13 15:01:17)
> > Evidence indicates that even if the hardware happily
> > tells us to proceed with reset, it really isn't ready.
> > Resetting a freely running batchbuffer after we have
> > got ack for readiness, still can cause a system hang.
>
> Hmm, so we see it on early gen and late gen. I suggest we do it
> universally (except gen2 which is lacking the mechanism). It's unlikely
> that the requirement disappeared just for a couple of gen, more likely
> that we simply haven't triggered the pathological behaviour.
Could just try setting ring enable=false on gen2 maybe? But we don't have
GPU reset for gen2 anyway so I guess it doesn't matter.
--
Ville Syrjälä
Intel OTC
* Re: [PATCH] drm/i915: Stop ring before doing readiness check
From: Mika Kuoppala @ 2017-09-13 14:15 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> Quoting Mika Kuoppala (2017-09-13 15:01:17)
>> Evidence indicates that even if the hardware happily
>> tells us to proceed with reset, it really isn't ready.
>> Resetting a freely running batchbuffer after we have
>> got ack for readiness, still can cause a system hang.
>
> Hmm, so we see it on early gen and late gen. I suggest we do it
> universally (except gen2 which is lacking the mechanism). It's unlikely
> that the requirement disappeared just for a couple of gen, more likely
> that we simply haven't triggered the pathological behaviour.
>
Agreed that we should take a blanket approach. I was in a hurry
to post a proposed fix as I heard the prime_* tests are not yet
blacklisted on shards. So let's hope this helps.
> Other than,
> Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
> for the find.
Ta.
-Mika
* ✓ Fi.CI.BAT: success for drm/i915: Stop ring before doing readiness check
From: Patchwork @ 2017-09-13 14:20 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Stop ring before doing readiness check
URL : https://patchwork.freedesktop.org/series/30298/
State : success
== Summary ==
Series 30298v1 drm/i915: Stop ring before doing readiness check
https://patchwork.freedesktop.org/api/1.0/series/30298/revisions/1/mbox/
Test kms_cursor_legacy:
        Subgroup basic-busy-flip-before-cursor-atomic:
                pass       -> FAIL       (fi-snb-2600) fdo#100215 +1
        Subgroup basic-flip-before-cursor-atomic:
                incomplete -> PASS       (fi-bxt-j4205) fdo#102705
fdo#100215 https://bugs.freedesktop.org/show_bug.cgi?id=100215
fdo#102705 https://bugs.freedesktop.org/show_bug.cgi?id=102705
fi-bdw-5557u total:289 pass:268 dwarn:0 dfail:0 fail:0 skip:21 time:438s
fi-bdw-gvtdvm total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:458s
fi-blb-e6850 total:289 pass:224 dwarn:1 dfail:0 fail:0 skip:64 time:379s
fi-bsw-n3050 total:289 pass:243 dwarn:0 dfail:0 fail:0 skip:46 time:533s
fi-bwr-2160 total:289 pass:184 dwarn:0 dfail:0 fail:0 skip:105 time:269s
fi-bxt-j4205 total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:495s
fi-byt-j1900 total:289 pass:254 dwarn:1 dfail:0 fail:0 skip:34 time:506s
fi-byt-n2820 total:289 pass:250 dwarn:1 dfail:0 fail:0 skip:38 time:496s
fi-cfl-s total:289 pass:223 dwarn:34 dfail:0 fail:0 skip:32 time:553s
fi-elk-e7500 total:289 pass:230 dwarn:0 dfail:0 fail:0 skip:59 time:452s
fi-glk-2a total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:594s
fi-hsw-4770 total:289 pass:263 dwarn:0 dfail:0 fail:0 skip:26 time:427s
fi-hsw-4770r total:289 pass:263 dwarn:0 dfail:0 fail:0 skip:26 time:414s
fi-ilk-650 total:289 pass:229 dwarn:0 dfail:0 fail:0 skip:60 time:436s
fi-ivb-3520m total:289 pass:261 dwarn:0 dfail:0 fail:0 skip:28 time:483s
fi-ivb-3770 total:289 pass:261 dwarn:0 dfail:0 fail:0 skip:28 time:463s
fi-kbl-7500u total:289 pass:264 dwarn:1 dfail:0 fail:0 skip:24 time:491s
fi-kbl-7560u total:289 pass:270 dwarn:0 dfail:0 fail:0 skip:19 time:576s
fi-kbl-r total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:587s
fi-pnv-d510 total:289 pass:223 dwarn:1 dfail:0 fail:0 skip:65 time:555s
fi-skl-6260u total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:466s
fi-skl-6700k total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:522s
fi-skl-6770hq total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:502s
fi-skl-gvtdvm total:289 pass:266 dwarn:0 dfail:0 fail:0 skip:23 time:458s
fi-skl-x1585l total:289 pass:268 dwarn:0 dfail:0 fail:0 skip:21 time:476s
fi-snb-2520m total:289 pass:251 dwarn:0 dfail:0 fail:0 skip:38 time:566s
fi-snb-2600 total:289 pass:248 dwarn:0 dfail:0 fail:2 skip:39 time:421s
76f9b11f445f4381eff873a62138ed0b00d08e80 drm-tip: 2017y-09m-13d-12h-28m-54s UTC integration manifest
da99a913e5dd drm/i915: Stop ring before doing readiness check
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5685/
* ✗ Fi.CI.IGT: failure for drm/i915: Stop ring before doing readiness check
From: Patchwork @ 2017-09-14 0:07 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Stop ring before doing readiness check
URL : https://patchwork.freedesktop.org/series/30298/
State : failure
== Summary ==
Test kms_cursor_legacy:
        Subgroup cursorA-vs-flipA-atomic-transitions:
                pass       -> FAIL       (shard-hsw)
Test drv_missed_irq:
                pass       -> FAIL       (shard-hsw)
Test kms_setmode:
        Subgroup basic:
                pass       -> FAIL       (shard-hsw) fdo#99912
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
shard-hsw total:2313 pass:1242 dwarn:0 dfail:0 fail:16 skip:1055 time:9618s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5685/shards.html
* Re: ✗ Fi.CI.IGT: failure for drm/i915: Stop ring before doing readiness check
From: Chris Wilson @ 2017-09-14 10:33 UTC (permalink / raw)
To: Patchwork, Mika Kuoppala; +Cc: intel-gfx
Quoting Patchwork (2017-09-14 01:07:40)
> == Series Details ==
>
> Series: drm/i915: Stop ring before doing readiness check
> URL : https://patchwork.freedesktop.org/series/30298/
> State : failure
>
> == Summary ==
>
> Test kms_cursor_legacy:
> Subgroup cursorA-vs-flipA-atomic-transitions:
> pass -> FAIL (shard-hsw)
> Test drv_missed_irq:
> pass -> FAIL (shard-hsw)
> Test kms_setmode:
> Subgroup basic:
> pass -> FAIL (shard-hsw) fdo#99912
>
> fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
>
> shard-hsw total:2313 pass:1242 dwarn:0 dfail:0 fail:16 skip:1055 time:9618s
>
> == Logs ==
>
> For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5685/shards.html
Of course, it decided not to run the prime_busy hang tests!!!
-Chris