* [PATCH 1/2] drm/i915/selftests: Wait longer for the old active request
@ 2018-05-17 15:47 Chris Wilson
2018-05-17 15:47 ` [PATCH 2/2] drm/i915: Flush the RING stop bit after clearing RING_HEAD in reset Chris Wilson
` (6 more replies)
0 siblings, 7 replies; 11+ messages in thread
From: Chris Wilson @ 2018-05-17 15:47 UTC (permalink / raw)
To: intel-gfx
When testing reset, we wait for 1s on the main thread for the hang to
start. Meanwhile, we continue submitting requests on all the background
threads, and we may have more threads than cores and so potentially
starve the waiter from being woken within the timeout. As the hang
timeout and the active timeouts are the same, it is hard to distinguish
which caused the timeout. Bump the active thread timeouts to 5s,
compared to the 1s timeout for the hang, so that we preferentially
report the hang timing out, while hopefully ensuring that we do at least
wake up the hang thread first before declaring the background active
timeout.
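
The timeout reasoning above can be modelled in userspace. The sketch below is illustrative only: `fake_request`, `fake_wait`, `fake_request_put` and `drain_requests` are hypothetical stand-ins, not i915 functions, and a request is reduced to the time it needs to complete. It assumes, as in the patch, that a timed-out wait is reported as -EIO and that the cleanup loop keeps the first error it sees.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a request: only its completion time matters. */
struct fake_request {
	int completes_after_ms;
};

/* Model of a bounded wait: 0 on completion, -ETIME if the timeout expires. */
static int fake_wait(const struct fake_request *rq, int timeout_ms)
{
	assert(timeout_ms >= 0);
	return rq->completes_after_ms <= timeout_ms ? 0 : -ETIME;
}

/* Mirrors the shape of active_request_put(): NULL is fine, a timeout is -EIO. */
static int fake_request_put(const struct fake_request *rq, int timeout_ms)
{
	if (!rq)
		return 0;

	return fake_wait(rq, timeout_ms) < 0 ? -EIO : 0;
}

/* Wait on every outstanding request, keeping only the first error. */
static int drain_requests(const struct fake_request *const *rq, size_t count,
			  int timeout_ms)
{
	int err = 0;

	for (size_t i = 0; i < count; i++) {
		int err__ = fake_request_put(rq[i], timeout_ms);

		if (!err)
			err = err__;
	}

	return err;
}
```

With a request that needs 3000ms, a 1000ms budget reports -EIO while a 5000ms budget succeeds, which is the gap the patch relies on so that the 1s hang wait is the one that fires first.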
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
.../gpu/drm/i915/selftests/intel_hangcheck.c | 48 +++++++++++++------
1 file changed, 34 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index 438e0b045a2c..f1dc42a171c8 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -560,6 +560,30 @@ struct active_engine {
 #define TEST_SELF BIT(2)
 #define TEST_PRIORITY BIT(3)
 
+static int active_request_put(struct i915_request *rq)
+{
+	int err = 0;
+
+	if (!rq)
+		return 0;
+
+	if (i915_request_wait(rq, 0, 5 * HZ) < 0) {
+		GEM_TRACE("%s timed out waiting for completion of fence %llx:%d, seqno %d.\n",
+			  rq->engine->name,
+			  rq->fence.context,
+			  rq->fence.seqno,
+			  i915_request_global_seqno(rq));
+		GEM_TRACE_DUMP();
+
+		i915_gem_set_wedged(rq->i915);
+		err = -EIO;
+	}
+
+	i915_request_put(rq);
+
+	return err;
+}
+
 static int active_engine(void *data)
 {
 	I915_RND_STATE(prng);
@@ -608,24 +632,20 @@ static int active_engine(void *data)
 		i915_request_add(new);
 		mutex_unlock(&engine->i915->drm.struct_mutex);
 
-		if (old) {
-			if (i915_request_wait(old, 0, HZ) < 0) {
-				GEM_TRACE("%s timed out.\n", engine->name);
-				GEM_TRACE_DUMP();
-
-				i915_gem_set_wedged(engine->i915);
-				i915_request_put(old);
-				err = -EIO;
-				break;
-			}
-			i915_request_put(old);
-		}
+		err = active_request_put(old);
+		if (err)
+			break;
 
 		cond_resched();
 	}
 
-	for (count = 0; count < ARRAY_SIZE(rq); count++)
-		i915_request_put(rq[count]);
+	for (count = 0; count < ARRAY_SIZE(rq); count++) {
+		int err__ = active_request_put(rq[count]);
+
+		/* Keep the first error */
+		if (!err)
+			err = err__;
+	}
 
 err_file:
 	mock_file_free(engine->i915, file);
--
2.17.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [PATCH 2/2] drm/i915: Flush the RING stop bit after clearing RING_HEAD in reset
2018-05-17 15:47 [PATCH 1/2] drm/i915/selftests: Wait longer for the old active request Chris Wilson
@ 2018-05-17 15:47 ` Chris Wilson
2018-05-18 7:29 ` Chris Wilson
2018-05-17 16:46 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915/selftests: Wait longer for the old active request Patchwork
` (5 subsequent siblings)
6 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2018-05-17 15:47 UTC (permalink / raw)
To: intel-gfx
Inside the live_hangcheck (reset) selftests, we occasionally see
failures like
<7>[ 239.094840] i915_gem_set_wedged rcs0
<7>[ 239.094843] i915_gem_set_wedged current seqno 19a98, last 19a9a, hangcheck 0 [5158 ms]
<7>[ 239.094846] i915_gem_set_wedged Reset count: 6239 (global 1)
<7>[ 239.094848] i915_gem_set_wedged Requests:
<7>[ 239.095052] i915_gem_set_wedged first 19a99 [e8c:5f] prio=1024 @ 5159ms: (null)
<7>[ 239.095056] i915_gem_set_wedged last 19a9a [e81:1a] prio=139 @ 5159ms: igt/rcs0[5977]/1
<7>[ 239.095059] i915_gem_set_wedged active 19a99 [e8c:5f] prio=1024 @ 5159ms: (null)
<7>[ 239.095062] i915_gem_set_wedged [head 0220, postfix 0280, tail 02a8, batch 0xffffffff_ffffffff]
<7>[ 239.100050] i915_gem_set_wedged ring->start: 0x00283000
<7>[ 239.100053] i915_gem_set_wedged ring->head: 0x000001f8
<7>[ 239.100055] i915_gem_set_wedged ring->tail: 0x000002a8
<7>[ 239.100057] i915_gem_set_wedged ring->emit: 0x000002a8
<7>[ 239.100059] i915_gem_set_wedged ring->space: 0x00000f10
<7>[ 239.100085] i915_gem_set_wedged RING_START: 0x00283000
<7>[ 239.100088] i915_gem_set_wedged RING_HEAD: 0x00000260
<7>[ 239.100091] i915_gem_set_wedged RING_TAIL: 0x000002a8
<7>[ 239.100094] i915_gem_set_wedged RING_CTL: 0x00000001
<7>[ 239.100097] i915_gem_set_wedged RING_MODE: 0x00000300 [idle]
<7>[ 239.100100] i915_gem_set_wedged RING_IMR: fffffefe
<7>[ 239.100104] i915_gem_set_wedged ACTHD: 0x00000000_0000609c
<7>[ 239.100108] i915_gem_set_wedged BBADDR: 0x00000000_0000609d
<7>[ 239.100111] i915_gem_set_wedged DMA_FADDR: 0x00000000_00283260
<7>[ 239.100114] i915_gem_set_wedged IPEIR: 0x00000000
<7>[ 239.100117] i915_gem_set_wedged IPEHR: 0x02800000
<7>[ 239.100120] i915_gem_set_wedged Execlist status: 0x00044052 00000002
<7>[ 239.100124] i915_gem_set_wedged Execlist CSB read 5 [5 cached], write 5 [5 from hws], interrupt posted? no, tasklet queued? no (enabled)
<7>[ 239.100128] i915_gem_set_wedged ELSP[0] count=1, ring->start=00283000, rq: 19a99 [e8c:5f] prio=1024 @ 5164ms: (null)
<7>[ 239.100132] i915_gem_set_wedged ELSP[1] count=1, ring->start=00257000, rq: 19a9a [e81:1a] prio=139 @ 5164ms: igt/rcs0[5977]/1
<7>[ 239.100135] i915_gem_set_wedged HW active? 0x5
<7>[ 239.100250] i915_gem_set_wedged E 19a99 [e8c:5f] prio=1024 @ 5164ms: (null)
<7>[ 239.100338] i915_gem_set_wedged E 19a9a [e81:1a] prio=139 @ 5164ms: igt/rcs0[5977]/1
<7>[ 239.100340] i915_gem_set_wedged Queue priority: 139
<7>[ 239.100343] i915_gem_set_wedged Q 0 [e98:19] prio=132 @ 5164ms: igt/rcs0[5977]/8
<7>[ 239.100346] i915_gem_set_wedged Q 0 [e84:19] prio=121 @ 5165ms: igt/rcs0[5977]/2
<7>[ 239.100349] i915_gem_set_wedged Q 0 [e87:19] prio=82 @ 5165ms: igt/rcs0[5977]/3
<7>[ 239.100352] i915_gem_set_wedged Q 0 [e84:1a] prio=44 @ 5164ms: igt/rcs0[5977]/2
<7>[ 239.100356] i915_gem_set_wedged Q 0 [e8b:19] prio=20 @ 5165ms: igt/rcs0[5977]/4
<7>[ 239.100362] i915_gem_set_wedged drv_selftest [5894] waiting for 19a99
where the GPU saw an arbitration point and idled; AND HAS NOT BEEN RESET!
The RING_MODE indicates that it is idle and has the STOP_RING bit set, so
try clearing it.
v2: Only clear the bit on restarting the ring, as we want to be sure the
STOP_RING bit is kept if reset fails on wedging.
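
To make the register reasoning above concrete, here is a small userspace model of a masked-bit write. It is a sketch under assumptions, not driver code: the helper names are hypothetical, the bit positions (STOP_RING as bit 8, MODE_IDLE as bit 9) are inferred from the 0x00000300 value in the dump, and the write semantics (high 16 bits select which low bits are updated) are assumed to match what `_MASKED_BIT_DISABLE()` produces.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, inferred from RING_MODE: 0x00000300 [idle]. */
#define STOP_RING (1u << 8)
#define MODE_IDLE (1u << 9)

/* Value to write in order to set a bit: mask in the high half, bit in the low half. */
static uint32_t masked_bit_enable(uint32_t bit)
{
	assert(bit <= 0xffffu);
	return (bit << 16) | bit;
}

/* Value to write in order to clear a bit: mask in the high half, zero in the low half. */
static uint32_t masked_bit_disable(uint32_t bit)
{
	assert(bit <= 0xffffu);
	return bit << 16;
}

/* Model of the hardware side: only the low bits selected by the mask change. */
static uint32_t apply_masked_write(uint32_t reg, uint32_t write)
{
	uint32_t mask = write >> 16;
	uint32_t value = write & 0xffffu;

	return (reg & ~mask) | (value & mask);
}
```

Under these assumptions, applying `masked_bit_disable(STOP_RING)` to the dumped value 0x300 leaves 0x200: the stop bit is cleared and the idle bit is untouched, without needing a read-modify-write.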
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/intel_lrc.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 646ecf267411..211585187d2f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1773,6 +1773,9 @@ static void enable_execlists(struct intel_engine_cs *engine)
 	I915_WRITE(RING_MODE_GEN7(engine),
 		   _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
 
+	I915_WRITE(RING_MI_MODE(engine->mmio_base),
+		   _MASKED_BIT_DISABLE(STOP_RING));
+
 	I915_WRITE(RING_HWS_PGA(engine->mmio_base),
 		   engine->status_page.ggtt_offset);
 	POSTING_READ(RING_HWS_PGA(engine->mmio_base));
--
2.17.0
* ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915/selftests: Wait longer for the old active request
2018-05-17 15:47 [PATCH 1/2] drm/i915/selftests: Wait longer for the old active request Chris Wilson
2018-05-17 15:47 ` [PATCH 2/2] drm/i915: Flush the RING stop bit after clearing RING_HEAD in reset Chris Wilson
@ 2018-05-17 16:46 ` Patchwork
2018-05-17 17:01 ` ✓ Fi.CI.BAT: success " Patchwork
` (4 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-05-17 16:46 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915/selftests: Wait longer for the old active request
URL : https://patchwork.freedesktop.org/series/43344/
State : warning
== Summary ==
$ dim checkpatch origin/drm-tip
81caf1d82050 drm/i915/selftests: Wait longer for the old active request
272940a885c0 drm/i915: Flush the RING stop bit after clearing RING_HEAD in reset
-:11: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#11:
<7>[ 239.094843] i915_gem_set_wedged current seqno 19a98, last 19a9a, hangcheck 0 [5158 ms]
total: 0 errors, 1 warnings, 0 checks, 9 lines checked
* ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915/selftests: Wait longer for the old active request
2018-05-17 15:47 [PATCH 1/2] drm/i915/selftests: Wait longer for the old active request Chris Wilson
2018-05-17 15:47 ` [PATCH 2/2] drm/i915: Flush the RING stop bit after clearing RING_HEAD in reset Chris Wilson
2018-05-17 16:46 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915/selftests: Wait longer for the old active request Patchwork
@ 2018-05-17 17:01 ` Patchwork
2018-05-17 20:22 ` ✓ Fi.CI.IGT: " Patchwork
` (3 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-05-17 17:01 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915/selftests: Wait longer for the old active request
URL : https://patchwork.freedesktop.org/series/43344/
State : success
== Summary ==
= CI Bug Log - changes from CI_DRM_4198 -> Patchwork_9033 =
== Summary - WARNING ==
Minor unknown changes coming with Patchwork_9033 need to be verified
manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_9033, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://patchwork.freedesktop.org/api/1.0/series/43344/revisions/1/mbox/
== Possible new issues ==
Here are the unknown changes that may have been introduced in Patchwork_9033:
=== IGT changes ===
==== Warnings ====
igt@gem_exec_gttfill@basic:
fi-pnv-d510: PASS -> SKIP
== Known issues ==
Here are the changes found in Patchwork_9033 that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@kms_flip@basic-flip-vs-wf_vblank:
fi-cnl-psr: PASS -> FAIL (fdo#100368)
==== Possible fixes ====
igt@kms_flip@basic-flip-vs-dpms:
fi-cnl-y3: INCOMPLETE (fdo#105086) -> PASS
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#105086 https://bugs.freedesktop.org/show_bug.cgi?id=105086
== Participating hosts (43 -> 39) ==
Missing (4): fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-skl-6700hq
== Build changes ==
* Linux: CI_DRM_4198 -> Patchwork_9033
CI_DRM_4198: 9f0af9e6938d975b744e3533410bf6398f3ce2d3 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4487: eccae1360d6d01e73c6af2bd97122cef708207ef @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_9033: 272940a885c0c7bfe68e7be018bc5a3a18d806d6 @ git://anongit.freedesktop.org/gfx-ci/linux
piglit_4487: 6ab75f7eb5e1dccbb773e1739beeb2d7cbd6ad0d @ git://anongit.freedesktop.org/piglit
== Linux commits ==
272940a885c0 drm/i915: Flush the RING stop bit after clearing RING_HEAD in reset
81caf1d82050 drm/i915/selftests: Wait longer for the old active request
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9033/issues.html
* ✓ Fi.CI.IGT: success for series starting with [1/2] drm/i915/selftests: Wait longer for the old active request
2018-05-17 15:47 [PATCH 1/2] drm/i915/selftests: Wait longer for the old active request Chris Wilson
` (2 preceding siblings ...)
2018-05-17 17:01 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-05-17 20:22 ` Patchwork
2018-05-17 20:56 ` ✗ Fi.CI.CHECKPATCH: warning " Patchwork
` (2 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-05-17 20:22 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915/selftests: Wait longer for the old active request
URL : https://patchwork.freedesktop.org/series/43344/
State : success
== Summary ==
= CI Bug Log - changes from CI_DRM_4198_full -> Patchwork_9033_full =
== Summary - WARNING ==
Minor unknown changes coming with Patchwork_9033_full need to be verified
manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_9033_full, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://patchwork.freedesktop.org/api/1.0/series/43344/revisions/1/mbox/
== Possible new issues ==
Here are the unknown changes that may have been introduced in Patchwork_9033_full:
=== IGT changes ===
==== Warnings ====
igt@gem_exec_schedule@deep-vebox:
shard-kbl: SKIP -> PASS +1
== Known issues ==
Here are the changes found in Patchwork_9033_full that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@gem_exec_create@basic:
shard-kbl: PASS -> DMESG-WARN (fdo#103558, fdo#105602)
igt@kms_atomic_transition@1x-modeset-transitions-nonblocking:
shard-glk: PASS -> FAIL (fdo#105703)
igt@kms_cursor_crc@cursor-128x128-suspend:
shard-kbl: PASS -> FAIL (fdo#103232, fdo#103191, fdo#104724)
igt@kms_cursor_legacy@2x-long-flip-vs-cursor-atomic:
shard-hsw: PASS -> FAIL (fdo#104873)
igt@kms_cursor_legacy@flip-vs-cursor-varying-size:
shard-hsw: PASS -> FAIL (fdo#102670)
igt@kms_flip@2x-flip-vs-blocking-wf-vblank:
shard-hsw: PASS -> FAIL (fdo#103928)
igt@kms_flip@dpms-vs-vblank-race-interruptible:
shard-hsw: PASS -> FAIL (fdo#103060)
igt@kms_flip@flip-vs-expired-vblank-interruptible:
shard-glk: PASS -> FAIL (fdo#105363, fdo#102887)
igt@kms_flip@plain-flip-fb-recreate-interruptible:
shard-glk: PASS -> FAIL (fdo#100368) +1
igt@kms_flip_tiling@flip-to-x-tiled:
shard-glk: PASS -> FAIL (fdo#104724, fdo#103822)
igt@kms_flip_tiling@flip-to-y-tiled:
shard-glk: PASS -> FAIL (fdo#104724) +1
igt@kms_rotation_crc@sprite-rotation-90:
shard-apl: PASS -> FAIL (fdo#104724, fdo#103925)
igt@kms_setmode@basic:
shard-kbl: PASS -> FAIL (fdo#99912)
==== Possible fixes ====
igt@drv_selftest@live_hangcheck:
shard-apl: DMESG-FAIL -> PASS
igt@kms_atomic_transition@1x-modeset-transitions-nonblocking-fencing:
shard-glk: FAIL (fdo#105703) -> PASS
igt@kms_flip@2x-modeset-vs-vblank-race-interruptible:
shard-hsw: FAIL (fdo#103060) -> PASS
igt@kms_flip@2x-plain-flip-fb-recreate-interruptible:
shard-glk: FAIL (fdo#100368) -> PASS
igt@kms_flip@plain-flip-fb-recreate:
shard-hsw: FAIL (fdo#100368) -> PASS
igt@kms_flip_tiling@flip-x-tiled:
shard-glk: FAIL (fdo#104724) -> PASS
igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-shrfb-draw-mmap-wc:
shard-glk: FAIL (fdo#103167, fdo#104724) -> PASS
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#102670 https://bugs.freedesktop.org/show_bug.cgi?id=102670
fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
fdo#103232 https://bugs.freedesktop.org/show_bug.cgi?id=103232
fdo#103558 https://bugs.freedesktop.org/show_bug.cgi?id=103558
fdo#103822 https://bugs.freedesktop.org/show_bug.cgi?id=103822
fdo#103925 https://bugs.freedesktop.org/show_bug.cgi?id=103925
fdo#103928 https://bugs.freedesktop.org/show_bug.cgi?id=103928
fdo#104724 https://bugs.freedesktop.org/show_bug.cgi?id=104724
fdo#104873 https://bugs.freedesktop.org/show_bug.cgi?id=104873
fdo#105363 https://bugs.freedesktop.org/show_bug.cgi?id=105363
fdo#105602 https://bugs.freedesktop.org/show_bug.cgi?id=105602
fdo#105703 https://bugs.freedesktop.org/show_bug.cgi?id=105703
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
== Participating hosts (9 -> 9) ==
No changes in participating hosts
== Build changes ==
* Linux: CI_DRM_4198 -> Patchwork_9033
CI_DRM_4198: 9f0af9e6938d975b744e3533410bf6398f3ce2d3 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4487: eccae1360d6d01e73c6af2bd97122cef708207ef @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_9033: 272940a885c0c7bfe68e7be018bc5a3a18d806d6 @ git://anongit.freedesktop.org/gfx-ci/linux
piglit_4487: 6ab75f7eb5e1dccbb773e1739beeb2d7cbd6ad0d @ git://anongit.freedesktop.org/piglit
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9033/shards.html
* ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915/selftests: Wait longer for the old active request
2018-05-17 15:47 [PATCH 1/2] drm/i915/selftests: Wait longer for the old active request Chris Wilson
` (3 preceding siblings ...)
2018-05-17 20:22 ` ✓ Fi.CI.IGT: " Patchwork
@ 2018-05-17 20:56 ` Patchwork
2018-05-17 21:12 ` ✓ Fi.CI.BAT: success " Patchwork
2018-05-18 1:03 ` ✓ Fi.CI.IGT: " Patchwork
6 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-05-17 20:56 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915/selftests: Wait longer for the old active request
URL : https://patchwork.freedesktop.org/series/43344/
State : warning
== Summary ==
$ dim checkpatch origin/drm-tip
5bad7132a89f drm/i915/selftests: Wait longer for the old active request
afd1ce5db428 drm/i915: Flush the RING stop bit after clearing RING_HEAD in reset
-:11: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#11:
<7>[ 239.094843] i915_gem_set_wedged current seqno 19a98, last 19a9a, hangcheck 0 [5158 ms]
total: 0 errors, 1 warnings, 0 checks, 9 lines checked
* ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915/selftests: Wait longer for the old active request
2018-05-17 15:47 [PATCH 1/2] drm/i915/selftests: Wait longer for the old active request Chris Wilson
` (4 preceding siblings ...)
2018-05-17 20:56 ` ✗ Fi.CI.CHECKPATCH: warning " Patchwork
@ 2018-05-17 21:12 ` Patchwork
2018-05-18 1:03 ` ✓ Fi.CI.IGT: " Patchwork
6 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-05-17 21:12 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915/selftests: Wait longer for the old active request
URL : https://patchwork.freedesktop.org/series/43344/
State : success
== Summary ==
= CI Bug Log - changes from CI_DRM_4200 -> Patchwork_9037 =
== Summary - SUCCESS ==
No regressions found.
External URL: https://patchwork.freedesktop.org/api/1.0/series/43344/revisions/1/mbox/
== Known issues ==
Here are the changes found in Patchwork_9037 that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@kms_flip@basic-flip-vs-wf_vblank:
fi-cfl-8700k: PASS -> FAIL (fdo#103928)
==== Possible fixes ====
igt@kms_frontbuffer_tracking@basic:
fi-hsw-peppy: DMESG-WARN -> PASS
igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
fi-kbl-7567u: FAIL (fdo#103191, fdo#104724) -> PASS
fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
fdo#103928 https://bugs.freedesktop.org/show_bug.cgi?id=103928
fdo#104724 https://bugs.freedesktop.org/show_bug.cgi?id=104724
== Participating hosts (43 -> 39) ==
Missing (4): fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-skl-6700hq
== Build changes ==
* Linux: CI_DRM_4200 -> Patchwork_9037
CI_DRM_4200: d7b67c87685f05c99d52db49184552f810bf1729 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4487: eccae1360d6d01e73c6af2bd97122cef708207ef @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_9037: afd1ce5db4288f5db1492a593829c91d635a5486 @ git://anongit.freedesktop.org/gfx-ci/linux
piglit_4487: 6ab75f7eb5e1dccbb773e1739beeb2d7cbd6ad0d @ git://anongit.freedesktop.org/piglit
== Linux commits ==
afd1ce5db428 drm/i915: Flush the RING stop bit after clearing RING_HEAD in reset
5bad7132a89f drm/i915/selftests: Wait longer for the old active request
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9037/issues.html
* ✓ Fi.CI.IGT: success for series starting with [1/2] drm/i915/selftests: Wait longer for the old active request
2018-05-17 15:47 [PATCH 1/2] drm/i915/selftests: Wait longer for the old active request Chris Wilson
` (5 preceding siblings ...)
2018-05-17 21:12 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-05-18 1:03 ` Patchwork
6 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-05-18 1:03 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915/selftests: Wait longer for the old active request
URL : https://patchwork.freedesktop.org/series/43344/
State : success
== Summary ==
= CI Bug Log - changes from CI_DRM_4200_full -> Patchwork_9037_full =
== Summary - WARNING ==
Minor unknown changes coming with Patchwork_9037_full need to be verified
manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_9037_full, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://patchwork.freedesktop.org/api/1.0/series/43344/revisions/1/mbox/
== Possible new issues ==
Here are the unknown changes that may have been introduced in Patchwork_9037_full:
=== IGT changes ===
==== Warnings ====
igt@gem_mocs_settings@mocs-rc6-ctx-dirty-render:
shard-kbl: PASS -> SKIP
igt@kms_plane_lowres@pipe-c-tiling-x:
shard-apl: PASS -> SKIP
== Known issues ==
Here are the changes found in Patchwork_9037_full that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@kms_cursor_legacy@2x-nonblocking-modeset-vs-cursor-atomic:
shard-glk: PASS -> FAIL (fdo#105454, fdo#106509)
igt@kms_cursor_legacy@flip-vs-cursor-legacy:
shard-hsw: PASS -> FAIL (fdo#102670)
igt@kms_flip@plain-flip-ts-check-interruptible:
shard-glk: PASS -> FAIL (fdo#100368) +1
igt@kms_flip_tiling@flip-x-tiled:
shard-glk: PASS -> FAIL (fdo#103822, fdo#104724) +1
igt@kms_flip_tiling@flip-y-tiled:
shard-glk: PASS -> FAIL (fdo#104724)
igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-shrfb-draw-mmap-gtt:
shard-kbl: PASS -> DMESG-WARN (fdo#106247)
shard-apl: PASS -> DMESG-FAIL (fdo#105602, fdo#103558)
igt@kms_pipe_crc_basic@hang-read-crc-pipe-c:
shard-apl: PASS -> DMESG-WARN (fdo#105602, fdo#103558) +10
igt@kms_setmode@basic:
shard-apl: PASS -> FAIL (fdo#99912)
==== Possible fixes ====
igt@drv_selftest@live_hangcheck:
shard-kbl: DMESG-FAIL -> PASS
igt@kms_flip@2x-flip-vs-expired-vblank-interruptible:
shard-glk: FAIL (fdo#105707) -> PASS
igt@kms_flip@basic-flip-vs-dpms:
shard-hsw: DMESG-WARN (fdo#102614) -> PASS
igt@kms_flip@flip-vs-expired-vblank:
shard-hsw: FAIL (fdo#102887) -> PASS
shard-glk: FAIL (fdo#105363) -> PASS
igt@kms_flip@plain-flip-fb-recreate-interruptible:
shard-glk: FAIL (fdo#100368) -> PASS
igt@kms_setmode@basic:
shard-kbl: FAIL (fdo#99912) -> PASS
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#102614 https://bugs.freedesktop.org/show_bug.cgi?id=102614
fdo#102670 https://bugs.freedesktop.org/show_bug.cgi?id=102670
fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#103558 https://bugs.freedesktop.org/show_bug.cgi?id=103558
fdo#103822 https://bugs.freedesktop.org/show_bug.cgi?id=103822
fdo#104724 https://bugs.freedesktop.org/show_bug.cgi?id=104724
fdo#105363 https://bugs.freedesktop.org/show_bug.cgi?id=105363
fdo#105454 https://bugs.freedesktop.org/show_bug.cgi?id=105454
fdo#105602 https://bugs.freedesktop.org/show_bug.cgi?id=105602
fdo#105707 https://bugs.freedesktop.org/show_bug.cgi?id=105707
fdo#106247 https://bugs.freedesktop.org/show_bug.cgi?id=106247
fdo#106509 https://bugs.freedesktop.org/show_bug.cgi?id=106509
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
== Participating hosts (9 -> 9) ==
No changes in participating hosts
== Build changes ==
* Linux: CI_DRM_4200 -> Patchwork_9037
CI_DRM_4200: d7b67c87685f05c99d52db49184552f810bf1729 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4487: eccae1360d6d01e73c6af2bd97122cef708207ef @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_9037: afd1ce5db4288f5db1492a593829c91d635a5486 @ git://anongit.freedesktop.org/gfx-ci/linux
piglit_4487: 6ab75f7eb5e1dccbb773e1739beeb2d7cbd6ad0d @ git://anongit.freedesktop.org/piglit
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9037/shards.html
* Re: [PATCH 2/2] drm/i915: Flush the RING stop bit after clearing RING_HEAD in reset
2018-05-17 15:47 ` [PATCH 2/2] drm/i915: Flush the RING stop bit after clearing RING_HEAD in reset Chris Wilson
@ 2018-05-18 7:29 ` Chris Wilson
0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2018-05-18 7:29 UTC (permalink / raw)
To: intel-gfx
Quoting Chris Wilson (2018-05-17 16:47:26)
> Inside the live_hangcheck (reset) selftests, we occasionally see
> failures like
>
> <7>[ 239.094840] i915_gem_set_wedged rcs0
> <7>[ 239.094843] i915_gem_set_wedged current seqno 19a98, last 19a9a, hangcheck 0 [5158 ms]
> <7>[ 239.094846] i915_gem_set_wedged Reset count: 6239 (global 1)
> <7>[ 239.094848] i915_gem_set_wedged Requests:
> <7>[ 239.095052] i915_gem_set_wedged first 19a99 [e8c:5f] prio=1024 @ 5159ms: (null)
> <7>[ 239.095056] i915_gem_set_wedged last 19a9a [e81:1a] prio=139 @ 5159ms: igt/rcs0[5977]/1
> <7>[ 239.095059] i915_gem_set_wedged active 19a99 [e8c:5f] prio=1024 @ 5159ms: (null)
> <7>[ 239.095062] i915_gem_set_wedged [head 0220, postfix 0280, tail 02a8, batch 0xffffffff_ffffffff]
> <7>[ 239.100050] i915_gem_set_wedged ring->start: 0x00283000
> <7>[ 239.100053] i915_gem_set_wedged ring->head: 0x000001f8
> <7>[ 239.100055] i915_gem_set_wedged ring->tail: 0x000002a8
> <7>[ 239.100057] i915_gem_set_wedged ring->emit: 0x000002a8
> <7>[ 239.100059] i915_gem_set_wedged ring->space: 0x00000f10
> <7>[ 239.100085] i915_gem_set_wedged RING_START: 0x00283000
> <7>[ 239.100088] i915_gem_set_wedged RING_HEAD: 0x00000260
> <7>[ 239.100091] i915_gem_set_wedged RING_TAIL: 0x000002a8
> <7>[ 239.100094] i915_gem_set_wedged RING_CTL: 0x00000001
> <7>[ 239.100097] i915_gem_set_wedged RING_MODE: 0x00000300 [idle]
> <7>[ 239.100100] i915_gem_set_wedged RING_IMR: fffffefe
> <7>[ 239.100104] i915_gem_set_wedged ACTHD: 0x00000000_0000609c
> <7>[ 239.100108] i915_gem_set_wedged BBADDR: 0x00000000_0000609d
> <7>[ 239.100111] i915_gem_set_wedged DMA_FADDR: 0x00000000_00283260
> <7>[ 239.100114] i915_gem_set_wedged IPEIR: 0x00000000
> <7>[ 239.100117] i915_gem_set_wedged IPEHR: 0x02800000
> <7>[ 239.100120] i915_gem_set_wedged Execlist status: 0x00044052 00000002
> <7>[ 239.100124] i915_gem_set_wedged Execlist CSB read 5 [5 cached], write 5 [5 from hws], interrupt posted? no, tasklet queued? no (enabled)
> <7>[ 239.100128] i915_gem_set_wedged ELSP[0] count=1, ring->start=00283000, rq: 19a99 [e8c:5f] prio=1024 @ 5164ms: (null)
> <7>[ 239.100132] i915_gem_set_wedged ELSP[1] count=1, ring->start=00257000, rq: 19a9a [e81:1a] prio=139 @ 5164ms: igt/rcs0[5977]/1
> <7>[ 239.100135] i915_gem_set_wedged HW active? 0x5
> <7>[ 239.100250] i915_gem_set_wedged E 19a99 [e8c:5f] prio=1024 @ 5164ms: (null)
> <7>[ 239.100338] i915_gem_set_wedged E 19a9a [e81:1a] prio=139 @ 5164ms: igt/rcs0[5977]/1
> <7>[ 239.100340] i915_gem_set_wedged Queue priority: 139
> <7>[ 239.100343] i915_gem_set_wedged Q 0 [e98:19] prio=132 @ 5164ms: igt/rcs0[5977]/8
> <7>[ 239.100346] i915_gem_set_wedged Q 0 [e84:19] prio=121 @ 5165ms: igt/rcs0[5977]/2
> <7>[ 239.100349] i915_gem_set_wedged Q 0 [e87:19] prio=82 @ 5165ms: igt/rcs0[5977]/3
> <7>[ 239.100352] i915_gem_set_wedged Q 0 [e84:1a] prio=44 @ 5164ms: igt/rcs0[5977]/2
> <7>[ 239.100356] i915_gem_set_wedged Q 0 [e8b:19] prio=20 @ 5165ms: igt/rcs0[5977]/4
> <7>[ 239.100362] i915_gem_set_wedged drv_selftest [5894] waiting for 19a99
>
> where the GPU saw an arbitration point and idled; AND HAS NOT BEEN RESET!
> The RING_MODE indicates that it is idle and has the STOP_RING bit set, so
> try clearing it.
>
> v2: Only clear the bit on restarting the ring, as we want to be sure the
> STOP_RING bit is kept if reset fails on wedging.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2/2 passes; it might not just be a coincidence! Please kindly review,
-Chris
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 646ecf267411..211585187d2f 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1773,6 +1773,9 @@ static void enable_execlists(struct intel_engine_cs *engine)
>  	I915_WRITE(RING_MODE_GEN7(engine),
>  		   _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
>  
> +	I915_WRITE(RING_MI_MODE(engine->mmio_base),
> +		   _MASKED_BIT_DISABLE(STOP_RING));
> +
>  	I915_WRITE(RING_HWS_PGA(engine->mmio_base),
>  		   engine->status_page.ggtt_offset);
>  	POSTING_READ(RING_HWS_PGA(engine->mmio_base));
> --
> 2.17.0
>
* Re: [PATCH 1/2] drm/i915/selftests: Wait longer for the old active request
2018-05-17 14:24 [PATCH 1/2] " Chris Wilson
@ 2018-05-18 9:22 ` Tvrtko Ursulin
0 siblings, 0 replies; 11+ messages in thread
From: Tvrtko Ursulin @ 2018-05-18 9:22 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 17/05/2018 15:24, Chris Wilson wrote:
> When testing reset, we wait for 1s on the main thread for the hang to
> start. Meanwhile, we continue submitting requests on all the background
> threads, and we may have more threads than cores and so potentially
> starve the waiter from being woken within the timeout. As the hang
> timeout and the active timeouts are the same, it is hard to distinguish
> which caused the timeout. Bump the active thread timeouts to 5s,
> compared to the 1s timeout for the hang, so that we preferentially
> report the hang timing out, while hopefully ensuring that we do at least
> wake up the hang thread first before declaring the background active
> timeout.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> .../gpu/drm/i915/selftests/intel_hangcheck.c | 48 +++++++++++++------
> 1 file changed, 34 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> index 438e0b045a2c..f1dc42a171c8 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> @@ -560,6 +560,30 @@ struct active_engine {
>  #define TEST_SELF BIT(2)
>  #define TEST_PRIORITY BIT(3)
>
> +static int active_request_put(struct i915_request *rq)
> +{
> +	int err = 0;
> +
> +	if (!rq)
> +		return 0;
> +
> +	if (i915_request_wait(rq, 0, 5 * HZ) < 0) {
> +		GEM_TRACE("%s timed out waiting for completion of fence %llx:%d, seqno %d.\n",
> +			  rq->engine->name,
> +			  rq->fence.context,
> +			  rq->fence.seqno,
> +			  i915_request_global_seqno(rq));
> +		GEM_TRACE_DUMP();
> +
> +		i915_gem_set_wedged(rq->i915);
> +		err = -EIO;
> +	}
> +
> +	i915_request_put(rq);
> +
> +	return err;
> +}
> +
>  static int active_engine(void *data)
>  {
>  	I915_RND_STATE(prng);
> @@ -608,24 +632,20 @@ static int active_engine(void *data)
>  		i915_request_add(new);
>  		mutex_unlock(&engine->i915->drm.struct_mutex);
>
> -		if (old) {
> -			if (i915_request_wait(old, 0, HZ) < 0) {
> -				GEM_TRACE("%s timed out.\n", engine->name);
> -				GEM_TRACE_DUMP();
> -
> -				i915_gem_set_wedged(engine->i915);
> -				i915_request_put(old);
> -				err = -EIO;
> -				break;
> -			}
> -			i915_request_put(old);
> -		}
> +		err = active_request_put(old);
> +		if (err)
> +			break;
>
>  		cond_resched();
>  	}
>
> -	for (count = 0; count < ARRAY_SIZE(rq); count++)
> -		i915_request_put(rq[count]);
> +	for (count = 0; count < ARRAY_SIZE(rq); count++) {
> +		int err__ = active_request_put(rq[count]);
> +
> +		/* Keep the first error */
> +		if (!err)
> +			err = err__;
> +	}
>
>  err_file:
>  	mock_file_free(engine->i915, file);
>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
* [PATCH 1/2] drm/i915/selftests: Wait longer for the old active request
@ 2018-05-17 14:24 Chris Wilson
2018-05-18 9:22 ` Tvrtko Ursulin
0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2018-05-17 14:24 UTC (permalink / raw)
To: intel-gfx
When testing reset, we wait for 1s on the main thread for the hang to
start. Meanwhile, we continue submitting requests on all the background
threads, and we may have more threads than cores and so potentially
starve the waiter from being woken within the timeout. As the hang
timeout and the active timeouts are the same, it is hard to distinguish
which caused the timeout. Bump the active thread timeouts to 5s,
compared to the 1s timeout for the hang, so that we preferentially
report the hang timing out, while hopefully ensuring that we do at least
wake up the hang thread first before declaring the background active
timeout.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
.../gpu/drm/i915/selftests/intel_hangcheck.c | 48 +++++++++++++------
1 file changed, 34 insertions(+), 14 deletions(-)
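As a rough illustration of the reasoning in the commit message, the following userspace sketch models the two timeouts and the "keep the first error" drain loop. It is not the selftest itself: every mock_* name and MOCK_HZ are hypothetical stand-ins, and the wait is reduced to comparing a completion time against the timeout.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical tick rate for this model; the kernel's HZ is config-dependent. */
#define MOCK_HZ 1000L

#define HANG_TIMEOUT   (1 * MOCK_HZ) /* main thread waiting for the hang to start */
#define ACTIVE_TIMEOUT (5 * MOCK_HZ) /* background threads waiting on old requests */

/* Hypothetical stand-in for struct i915_request. */
struct mock_request {
	long completes_at; /* ticks until the request completes */
	int refcount;
};

/* Models only the timeout outcome of a wait: negative errno on timeout. */
static long mock_request_wait(const struct mock_request *rq, long timeout)
{
	return rq->completes_at > timeout ? -ETIME : 0;
}

/* Same shape as active_request_put(): wait, then always drop the reference. */
static int mock_active_request_put(struct mock_request *rq)
{
	int err = 0;

	if (!rq)
		return 0;

	if (mock_request_wait(rq, ACTIVE_TIMEOUT) < 0)
		err = -EIO;

	rq->refcount--;
	return err;
}

/* Drain every slot so no reference leaks, but report only the first error. */
static int mock_drain(struct mock_request **rqs, size_t n)
{
	int err = 0;
	size_t i;

	assert(rqs || n == 0);

	for (i = 0; i < n; i++) {
		int err__ = mock_active_request_put(rqs[i]);

		if (!err)
			err = err__;
	}

	return err;
}
```

In this model a request that is stalled for 2 * MOCK_HZ fails a wait with HANG_TIMEOUT but passes one with ACTIVE_TIMEOUT, which is the ordering the patch is after: the hang waiter should be the first to report a timeout.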
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index 438e0b045a2c..f1dc42a171c8 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -560,6 +560,30 @@ struct active_engine {
 #define TEST_SELF BIT(2)
 #define TEST_PRIORITY BIT(3)

+static int active_request_put(struct i915_request *rq)
+{
+	int err = 0;
+
+	if (!rq)
+		return 0;
+
+	if (i915_request_wait(rq, 0, 5 * HZ) < 0) {
+		GEM_TRACE("%s timed out waiting for completion of fence %llx:%d, seqno %d.\n",
+			  rq->engine->name,
+			  rq->fence.context,
+			  rq->fence.seqno,
+			  i915_request_global_seqno(rq));
+		GEM_TRACE_DUMP();
+
+		i915_gem_set_wedged(rq->i915);
+		err = -EIO;
+	}
+
+	i915_request_put(rq);
+
+	return err;
+}
+
 static int active_engine(void *data)
 {
 	I915_RND_STATE(prng);
@@ -608,24 +632,20 @@ static int active_engine(void *data)
 		i915_request_add(new);
 		mutex_unlock(&engine->i915->drm.struct_mutex);

-		if (old) {
-			if (i915_request_wait(old, 0, HZ) < 0) {
-				GEM_TRACE("%s timed out.\n", engine->name);
-				GEM_TRACE_DUMP();
-
-				i915_gem_set_wedged(engine->i915);
-				i915_request_put(old);
-				err = -EIO;
-				break;
-			}
-			i915_request_put(old);
-		}
+		err = active_request_put(old);
+		if (err)
+			break;

 		cond_resched();
 	}

-	for (count = 0; count < ARRAY_SIZE(rq); count++)
-		i915_request_put(rq[count]);
+	for (count = 0; count < ARRAY_SIZE(rq); count++) {
+		int err__ = active_request_put(rq[count]);
+
+		/* Keep the first error */
+		if (!err)
+			err = err__;
+	}

 err_file:
 	mock_file_free(engine->i915, file);
--
2.17.0
Thread overview: 11+ messages
2018-05-17 15:47 [PATCH 1/2] drm/i915/selftests: Wait longer for the old active request Chris Wilson
2018-05-17 15:47 ` [PATCH 2/2] drm/i915: Flush the RING stop bit after clearing RING_HEAD in reset Chris Wilson
2018-05-18 7:29 ` Chris Wilson
2018-05-17 16:46 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915/selftests: Wait longer for the old active request Patchwork
2018-05-17 17:01 ` ✓ Fi.CI.BAT: success " Patchwork
2018-05-17 20:22 ` ✓ Fi.CI.IGT: " Patchwork
2018-05-17 20:56 ` ✗ Fi.CI.CHECKPATCH: warning " Patchwork
2018-05-17 21:12 ` ✓ Fi.CI.BAT: success " Patchwork
2018-05-18 1:03 ` ✓ Fi.CI.IGT: " Patchwork
-- strict thread matches above, loose matches on Subject: below --
2018-05-17 14:24 [PATCH 1/2] " Chris Wilson
2018-05-18 9:22 ` Tvrtko Ursulin