* [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context
@ 2018-03-30 13:18 Chris Wilson
2018-03-30 13:30 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
` (7 more replies)
0 siblings, 8 replies; 11+ messages in thread
From: Chris Wilson @ 2018-03-30 13:18 UTC (permalink / raw)
To: intel-gfx
We don't handle resetting the kernel context very well, or presumably any
context executing its breadcrumb commands in the ring as opposed to the
batchbuffer and flush. If we trigger a device reset twice in quick
succession while the kernel context is executing, we may end up skipping
the breadcrumb. This is really only a problem for the selftest as
normally there is a large interlude between resets (hangcheck), or we
focus on resetting just one engine and so avoid repeatedly resetting
innocents.
Something to try would be a preempt-to-idle to quiesce the engine before
reset, so that innocent contexts would be spared the reset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
CC: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_drv.c | 3 ++
drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 48 ++++++++++++++++++++++--
2 files changed, 47 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index d354627882e3..684060ed8db6 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1886,6 +1886,8 @@ void i915_reset(struct drm_i915_private *i915)
int ret;
int i;
+ GEM_TRACE("flags=%lx\n", error->flags);
+
might_sleep();
lockdep_assert_held(&i915->drm.struct_mutex);
GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &error->flags));
@@ -2016,6 +2018,7 @@ int i915_reset_engine(struct intel_engine_cs *engine, const char *msg)
struct i915_request *active_request;
int ret;
+ GEM_TRACE("%s flags=%lx\n", engine->name, error->flags);
GEM_BUG_ON(!test_bit(I915_RESET_ENGINE + engine->id, &error->flags));
active_request = i915_gem_reset_prepare_engine(engine);
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index 9e4e0ad62724..122a32e0a69d 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -979,6 +979,23 @@ static int igt_wait_reset(void *arg)
return err;
}
+static int wait_for_others(struct drm_i915_private *i915,
+ struct intel_engine_cs *exclude)
+{
+ struct intel_engine_cs *engine;
+ enum intel_engine_id id;
+
+ for_each_engine(engine, i915, id) {
+ if (engine == exclude)
+ continue;
+
+ if (wait_for(intel_engine_is_idle(engine), 10))
+ return -EIO;
+ }
+
+ return 0;
+}
+
static int igt_reset_queue(void *arg)
{
struct drm_i915_private *i915 = arg;
@@ -1027,13 +1044,36 @@ static int igt_reset_queue(void *arg)
i915_request_get(rq);
__i915_request_add(rq, true);
+ /*
+ * XXX We don't handle resetting the kernel context
+ * very well. If we trigger a device reset twice in
+ * quick succession while the kernel context is
+ * executing, we may end up skipping the breadcrumb.
+ * This is really only a problem for the selftest as
+ * normally there is a large interlude between resets
+ * (hangcheck), or we focus on resetting just one
+ * engine and so avoid repeatedly resetting innocents.
+ */
+ err = wait_for_others(i915, engine);
+ if (err) {
+ pr_err("%s(%s): Failed to idle other inactive engines after device reset\n",
+ __func__, engine->name);
+ i915_request_put(rq);
+ i915_request_put(prev);
+
+ GEM_TRACE_DUMP();
+ i915_gem_set_wedged(i915);
+ goto fini;
+ }
+
if (!wait_for_hang(&h, prev)) {
struct drm_printer p = drm_info_printer(i915->drm.dev);
- pr_err("%s: Failed to start request %x, at %x\n",
- __func__, prev->fence.seqno, hws_seqno(&h, prev));
- intel_engine_dump(prev->engine, &p,
- "%s\n", prev->engine->name);
+ pr_err("%s(%s): Failed to start request %x, at %x\n",
+ __func__, engine->name,
+ prev->fence.seqno, hws_seqno(&h, prev));
+ intel_engine_dump(engine, &p,
+ "%s\n", engine->name);
i915_request_put(rq);
i915_request_put(prev);
--
2.16.3
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 11+ messages in thread
* ✗ Fi.CI.CHECKPATCH: warning for drm/i915/selftests: Avoid repeatedly harming the same innocent context
2018-03-30 13:18 [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context Chris Wilson
@ 2018-03-30 13:30 ` Patchwork
2018-03-30 13:47 ` ✓ Fi.CI.BAT: success " Patchwork
` (6 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-03-30 13:30 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915/selftests: Avoid repeatedly harming the same innocent context
URL : https://patchwork.freedesktop.org/series/40941/
State : warning
== Summary ==
$ dim checkpatch origin/drm-tip
477c0fa0c0b5 drm/i915/selftests: Avoid repeatedly harming the same innocent context
-:57: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#57: FILE: drivers/gpu/drm/i915/selftests/intel_hangcheck.c:983:
+static int wait_for_others(struct drm_i915_private *i915,
+ struct intel_engine_cs *exclude)
total: 0 errors, 0 warnings, 1 checks, 78 lines checked
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* ✓ Fi.CI.BAT: success for drm/i915/selftests: Avoid repeatedly harming the same innocent context
2018-03-30 13:18 [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context Chris Wilson
2018-03-30 13:30 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
@ 2018-03-30 13:47 ` Patchwork
2018-03-30 14:33 ` ✗ Fi.CI.IGT: failure " Patchwork
` (5 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-03-30 13:47 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915/selftests: Avoid repeatedly harming the same innocent context
URL : https://patchwork.freedesktop.org/series/40941/
State : success
== Summary ==
Series 40941v1 drm/i915/selftests: Avoid repeatedly harming the same innocent context
https://patchwork.freedesktop.org/api/1.0/series/40941/revisions/1/mbox/
---- Known issues:
Test prime_vgem:
Subgroup basic-fence-flip:
pass -> FAIL (fi-ilk-650) fdo#104008
fdo#104008 https://bugs.freedesktop.org/show_bug.cgi?id=104008
fi-bdw-5557u total:285 pass:264 dwarn:0 dfail:0 fail:0 skip:21 time:435s
fi-bdw-gvtdvm total:285 pass:261 dwarn:0 dfail:0 fail:0 skip:24 time:441s
fi-blb-e6850 total:285 pass:220 dwarn:1 dfail:0 fail:0 skip:64 time:379s
fi-bsw-n3050 total:285 pass:239 dwarn:0 dfail:0 fail:0 skip:46 time:531s
fi-bwr-2160 total:285 pass:180 dwarn:0 dfail:0 fail:0 skip:105 time:298s
fi-bxt-dsi total:285 pass:255 dwarn:0 dfail:0 fail:0 skip:30 time:512s
fi-bxt-j4205 total:285 pass:256 dwarn:0 dfail:0 fail:0 skip:29 time:513s
fi-byt-j1900 total:285 pass:250 dwarn:0 dfail:0 fail:0 skip:35 time:519s
fi-byt-n2820 total:285 pass:246 dwarn:0 dfail:0 fail:0 skip:39 time:506s
fi-cfl-8700k total:285 pass:257 dwarn:0 dfail:0 fail:0 skip:28 time:412s
fi-cfl-s3 total:285 pass:259 dwarn:0 dfail:0 fail:0 skip:26 time:560s
fi-cfl-u total:285 pass:259 dwarn:0 dfail:0 fail:0 skip:26 time:509s
fi-cnl-y3 total:285 pass:259 dwarn:0 dfail:0 fail:0 skip:26 time:589s
fi-elk-e7500 total:285 pass:225 dwarn:1 dfail:0 fail:0 skip:59 time:426s
fi-gdg-551 total:285 pass:176 dwarn:0 dfail:0 fail:1 skip:108 time:316s
fi-glk-1 total:285 pass:257 dwarn:0 dfail:0 fail:0 skip:28 time:533s
fi-hsw-4770 total:285 pass:258 dwarn:0 dfail:0 fail:0 skip:27 time:402s
fi-ilk-650 total:285 pass:224 dwarn:0 dfail:0 fail:1 skip:60 time:421s
fi-ivb-3520m total:285 pass:256 dwarn:0 dfail:0 fail:0 skip:29 time:476s
fi-ivb-3770 total:285 pass:252 dwarn:0 dfail:0 fail:0 skip:33 time:427s
fi-kbl-7500u total:285 pass:260 dwarn:1 dfail:0 fail:0 skip:24 time:472s
fi-kbl-7567u total:285 pass:265 dwarn:0 dfail:0 fail:0 skip:20 time:461s
fi-kbl-r total:285 pass:258 dwarn:0 dfail:0 fail:0 skip:27 time:508s
fi-pnv-d510 total:285 pass:219 dwarn:1 dfail:0 fail:0 skip:65 time:658s
fi-skl-6260u total:285 pass:265 dwarn:0 dfail:0 fail:0 skip:20 time:439s
fi-skl-6600u total:285 pass:258 dwarn:0 dfail:0 fail:0 skip:27 time:533s
fi-skl-6700k2 total:285 pass:261 dwarn:0 dfail:0 fail:0 skip:24 time:506s
fi-skl-6770hq total:285 pass:265 dwarn:0 dfail:0 fail:0 skip:20 time:514s
fi-skl-guc total:285 pass:257 dwarn:0 dfail:0 fail:0 skip:28 time:430s
fi-skl-gvtdvm total:285 pass:262 dwarn:0 dfail:0 fail:0 skip:23 time:447s
fi-snb-2520m total:285 pass:245 dwarn:0 dfail:0 fail:0 skip:40 time:581s
fi-snb-2600 total:285 pass:245 dwarn:0 dfail:0 fail:0 skip:40 time:400s
Blacklisted hosts:
fi-cnl-psr total:285 pass:256 dwarn:3 dfail:0 fail:0 skip:26 time:532s
fi-glk-j4005 total:285 pass:256 dwarn:0 dfail:0 fail:0 skip:29 time:486s
335ef9849310af26d65c54f7d2d2e9dcbce238b9 drm-tip: 2018y-03m-30d-09h-08m-40s UTC integration manifest
477c0fa0c0b5 drm/i915/selftests: Avoid repeatedly harming the same innocent context
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8548/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* ✗ Fi.CI.IGT: failure for drm/i915/selftests: Avoid repeatedly harming the same innocent context
2018-03-30 13:18 [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context Chris Wilson
2018-03-30 13:30 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2018-03-30 13:47 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-03-30 14:33 ` Patchwork
2018-03-31 20:24 ` [PATCH] " kbuild test robot
` (4 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-03-30 14:33 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915/selftests: Avoid repeatedly harming the same innocent context
URL : https://patchwork.freedesktop.org/series/40941/
State : failure
== Summary ==
---- Possible new issues:
Test kms_cursor_legacy:
Subgroup cursor-vs-flip-atomic-transitions:
pass -> FAIL (shard-snb)
Test kms_frontbuffer_tracking:
Subgroup fbcpsr-2p-scndscrn-spr-indfb-move:
skip -> FAIL (shard-snb)
Subgroup fbcpsr-rgb101010-draw-render:
skip -> FAIL (shard-snb)
Subgroup psr-2p-primscrn-spr-indfb-onoff:
skip -> FAIL (shard-snb)
Test kms_plane_lowres:
Subgroup pipe-a-tiling-none:
pass -> FAIL (shard-snb)
Test kms_vblank:
Subgroup pipe-a-query-idle:
pass -> FAIL (shard-snb)
---- Known issues:
Test gem_softpin:
Subgroup noreloc-s3:
pass -> INCOMPLETE (shard-snb) fdo#103375
Test kms_flip:
Subgroup 2x-dpms-vs-vblank-race:
fail -> PASS (shard-hsw) fdo#103060
Subgroup 2x-flip-vs-expired-vblank-interruptible:
fail -> PASS (shard-hsw) fdo#102887
Subgroup flip-vs-blocking-wf-vblank:
fail -> PASS (shard-hsw) fdo#100368
Test kms_setmode:
Subgroup basic:
fail -> PASS (shard-apl) fdo#99912
fdo#103375 https://bugs.freedesktop.org/show_bug.cgi?id=103375
fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
shard-apl total:3495 pass:1833 dwarn:1 dfail:0 fail:6 skip:1655 time:12857s
shard-hsw total:3495 pass:1783 dwarn:1 dfail:0 fail:1 skip:1709 time:11682s
shard-snb total:3398 pass:1320 dwarn:1 dfail:0 fail:10 skip:2066 time:6563s
Blacklisted hosts:
shard-kbl total:3495 pass:1958 dwarn:1 dfail:0 fail:8 skip:1528 time:9256s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8548/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context
2018-03-30 13:18 [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context Chris Wilson
` (2 preceding siblings ...)
2018-03-30 14:33 ` ✗ Fi.CI.IGT: failure " Patchwork
@ 2018-03-31 20:24 ` kbuild test robot
2018-03-31 20:41 ` kbuild test robot
` (3 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: kbuild test robot @ 2018-03-31 20:24 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx, kbuild-all
[-- Attachment #1: Type: text/plain, Size: 5845 bytes --]
Hi Chris,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.16-rc7 next-20180329]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-selftests-Avoid-repeatedly-harming-the-same-innocent-context/20180401-022503
base: git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-b0-04010137 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
In file included from drivers/gpu/drm/i915/intel_hangcheck.c:465:0:
drivers/gpu/drm/i915/selftests/intel_hangcheck.c: In function 'igt_reset_queue':
>> drivers/gpu/drm/i915/selftests/intel_hangcheck.c:988:5: error: implicit declaration of function 'GEM_TRACE_DUMP' [-Werror=implicit-function-declaration]
GEM_TRACE_DUMP();
^
cc1: all warnings being treated as errors
vim +/GEM_TRACE_DUMP +988 drivers/gpu/drm/i915/selftests/intel_hangcheck.c
922
923 static int igt_reset_queue(void *arg)
924 {
925 struct drm_i915_private *i915 = arg;
926 struct intel_engine_cs *engine;
927 enum intel_engine_id id;
928 struct hang h;
929 int err;
930
931 /* Check that we replay pending requests following a hang */
932
933 global_reset_lock(i915);
934
935 mutex_lock(&i915->drm.struct_mutex);
936 err = hang_init(&h, i915);
937 if (err)
938 goto unlock;
939
940 for_each_engine(engine, i915, id) {
941 struct i915_request *prev;
942 IGT_TIMEOUT(end_time);
943 unsigned int count;
944
945 if (!intel_engine_can_store_dword(engine))
946 continue;
947
948 prev = hang_create_request(&h, engine);
949 if (IS_ERR(prev)) {
950 err = PTR_ERR(prev);
951 goto fini;
952 }
953
954 i915_request_get(prev);
955 __i915_request_add(prev, true);
956
957 count = 0;
958 do {
959 struct i915_request *rq;
960 unsigned int reset_count;
961
962 rq = hang_create_request(&h, engine);
963 if (IS_ERR(rq)) {
964 err = PTR_ERR(rq);
965 goto fini;
966 }
967
968 i915_request_get(rq);
969 __i915_request_add(rq, true);
970
971 /*
972 * XXX We don't handle resetting the kernel context
973 * very well. If we trigger a device reset twice in
974 * quick succession while the kernel context is
975 * executing, we may end up skipping the breadcrumb.
976 * This is really only a problem for the selftest as
977 * normally there is a large interlude between resets
978 * (hangcheck), or we focus on resetting just one
979 * engine and so avoid repeatedly resetting innocents.
980 */
981 err = wait_for_others(i915, engine);
982 if (err) {
983 pr_err("%s(%s): Failed to idle other inactive engines after device reset\n",
984 __func__, engine->name);
985 i915_request_put(rq);
986 i915_request_put(prev);
987
> 988 GEM_TRACE_DUMP();
989 i915_gem_set_wedged(i915);
990 goto fini;
991 }
992
993 if (!wait_for_hang(&h, prev)) {
994 struct drm_printer p = drm_info_printer(i915->drm.dev);
995
996 pr_err("%s(%s): Failed to start request %x, at %x\n",
997 __func__, engine->name,
998 prev->fence.seqno, hws_seqno(&h, prev));
999 intel_engine_dump(engine, &p,
1000 "%s\n", engine->name);
1001
1002 i915_request_put(rq);
1003 i915_request_put(prev);
1004
1005 i915_reset(i915, 0);
1006 i915_gem_set_wedged(i915);
1007
1008 err = -EIO;
1009 goto fini;
1010 }
1011
1012 reset_count = fake_hangcheck(prev);
1013
1014 i915_reset(i915, I915_RESET_QUIET);
1015
1016 GEM_BUG_ON(test_bit(I915_RESET_HANDOFF,
1017 &i915->gpu_error.flags));
1018
1019 if (prev->fence.error != -EIO) {
1020 pr_err("GPU reset not recorded on hanging request [fence.error=%d]!\n",
1021 prev->fence.error);
1022 i915_request_put(rq);
1023 i915_request_put(prev);
1024 err = -EINVAL;
1025 goto fini;
1026 }
1027
1028 if (rq->fence.error) {
1029 pr_err("Fence error status not zero [%d] after unrelated reset\n",
1030 rq->fence.error);
1031 i915_request_put(rq);
1032 i915_request_put(prev);
1033 err = -EINVAL;
1034 goto fini;
1035 }
1036
1037 if (i915_reset_count(&i915->gpu_error) == reset_count) {
1038 pr_err("No GPU reset recorded!\n");
1039 i915_request_put(rq);
1040 i915_request_put(prev);
1041 err = -EINVAL;
1042 goto fini;
1043 }
1044
1045 i915_request_put(prev);
1046 prev = rq;
1047 count++;
1048 } while (time_before(jiffies, end_time));
1049 pr_info("%s: Completed %d resets\n", engine->name, count);
1050
1051 *h.batch = MI_BATCH_BUFFER_END;
1052 i915_gem_chipset_flush(i915);
1053
1054 i915_request_put(prev);
1055
1056 err = flush_test(i915, I915_WAIT_LOCKED);
1057 if (err)
1058 break;
1059 }
1060
1061 fini:
1062 hang_fini(&h);
1063 unlock:
1064 mutex_unlock(&i915->drm.struct_mutex);
1065 global_reset_unlock(i915);
1066
1067 if (i915_terminally_wedged(&i915->gpu_error))
1068 return -EIO;
1069
1070 return err;
1071 }
1072
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28573 bytes --]
[-- Attachment #3: Type: text/plain, Size: 160 bytes --]
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context
2018-03-30 13:18 [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context Chris Wilson
` (3 preceding siblings ...)
2018-03-31 20:24 ` [PATCH] " kbuild test robot
@ 2018-03-31 20:41 ` kbuild test robot
2018-04-01 2:45 ` kbuild test robot
` (2 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: kbuild test robot @ 2018-03-31 20:41 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx, kbuild-all
[-- Attachment #1: Type: text/plain, Size: 5889 bytes --]
Hi Chris,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.16-rc7 next-20180329]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-selftests-Avoid-repeatedly-harming-the-same-innocent-context/20180401-022503
base: git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-x011-201813 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All errors (new ones prefixed by >>):
In file included from drivers/gpu/drm/i915/intel_hangcheck.c:465:0:
drivers/gpu/drm/i915/selftests/intel_hangcheck.c: In function 'igt_reset_queue':
>> drivers/gpu/drm/i915/selftests/intel_hangcheck.c:988:5: error: implicit declaration of function 'GEM_TRACE_DUMP'; did you mean 'GEM_TRACE'? [-Werror=implicit-function-declaration]
GEM_TRACE_DUMP();
^~~~~~~~~~~~~~
GEM_TRACE
cc1: some warnings being treated as errors
vim +988 drivers/gpu/drm/i915/selftests/intel_hangcheck.c
922
923 static int igt_reset_queue(void *arg)
924 {
925 struct drm_i915_private *i915 = arg;
926 struct intel_engine_cs *engine;
927 enum intel_engine_id id;
928 struct hang h;
929 int err;
930
931 /* Check that we replay pending requests following a hang */
932
933 global_reset_lock(i915);
934
935 mutex_lock(&i915->drm.struct_mutex);
936 err = hang_init(&h, i915);
937 if (err)
938 goto unlock;
939
940 for_each_engine(engine, i915, id) {
941 struct i915_request *prev;
942 IGT_TIMEOUT(end_time);
943 unsigned int count;
944
945 if (!intel_engine_can_store_dword(engine))
946 continue;
947
948 prev = hang_create_request(&h, engine);
949 if (IS_ERR(prev)) {
950 err = PTR_ERR(prev);
951 goto fini;
952 }
953
954 i915_request_get(prev);
955 __i915_request_add(prev, true);
956
957 count = 0;
958 do {
959 struct i915_request *rq;
960 unsigned int reset_count;
961
962 rq = hang_create_request(&h, engine);
963 if (IS_ERR(rq)) {
964 err = PTR_ERR(rq);
965 goto fini;
966 }
967
968 i915_request_get(rq);
969 __i915_request_add(rq, true);
970
971 /*
972 * XXX We don't handle resetting the kernel context
973 * very well. If we trigger a device reset twice in
974 * quick succession while the kernel context is
975 * executing, we may end up skipping the breadcrumb.
976 * This is really only a problem for the selftest as
977 * normally there is a large interlude between resets
978 * (hangcheck), or we focus on resetting just one
979 * engine and so avoid repeatedly resetting innocents.
980 */
981 err = wait_for_others(i915, engine);
982 if (err) {
983 pr_err("%s(%s): Failed to idle other inactive engines after device reset\n",
984 __func__, engine->name);
985 i915_request_put(rq);
986 i915_request_put(prev);
987
> 988 GEM_TRACE_DUMP();
989 i915_gem_set_wedged(i915);
990 goto fini;
991 }
992
993 if (!wait_for_hang(&h, prev)) {
994 struct drm_printer p = drm_info_printer(i915->drm.dev);
995
996 pr_err("%s(%s): Failed to start request %x, at %x\n",
997 __func__, engine->name,
998 prev->fence.seqno, hws_seqno(&h, prev));
999 intel_engine_dump(engine, &p,
1000 "%s\n", engine->name);
1001
1002 i915_request_put(rq);
1003 i915_request_put(prev);
1004
1005 i915_reset(i915, 0);
1006 i915_gem_set_wedged(i915);
1007
1008 err = -EIO;
1009 goto fini;
1010 }
1011
1012 reset_count = fake_hangcheck(prev);
1013
1014 i915_reset(i915, I915_RESET_QUIET);
1015
1016 GEM_BUG_ON(test_bit(I915_RESET_HANDOFF,
1017 &i915->gpu_error.flags));
1018
1019 if (prev->fence.error != -EIO) {
1020 pr_err("GPU reset not recorded on hanging request [fence.error=%d]!\n",
1021 prev->fence.error);
1022 i915_request_put(rq);
1023 i915_request_put(prev);
1024 err = -EINVAL;
1025 goto fini;
1026 }
1027
1028 if (rq->fence.error) {
1029 pr_err("Fence error status not zero [%d] after unrelated reset\n",
1030 rq->fence.error);
1031 i915_request_put(rq);
1032 i915_request_put(prev);
1033 err = -EINVAL;
1034 goto fini;
1035 }
1036
1037 if (i915_reset_count(&i915->gpu_error) == reset_count) {
1038 pr_err("No GPU reset recorded!\n");
1039 i915_request_put(rq);
1040 i915_request_put(prev);
1041 err = -EINVAL;
1042 goto fini;
1043 }
1044
1045 i915_request_put(prev);
1046 prev = rq;
1047 count++;
1048 } while (time_before(jiffies, end_time));
1049 pr_info("%s: Completed %d resets\n", engine->name, count);
1050
1051 *h.batch = MI_BATCH_BUFFER_END;
1052 i915_gem_chipset_flush(i915);
1053
1054 i915_request_put(prev);
1055
1056 err = flush_test(i915, I915_WAIT_LOCKED);
1057 if (err)
1058 break;
1059 }
1060
1061 fini:
1062 hang_fini(&h);
1063 unlock:
1064 mutex_unlock(&i915->drm.struct_mutex);
1065 global_reset_unlock(i915);
1066
1067 if (i915_terminally_wedged(&i915->gpu_error))
1068 return -EIO;
1069
1070 return err;
1071 }
1072
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28470 bytes --]
[-- Attachment #3: Type: text/plain, Size: 160 bytes --]
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context
2018-03-30 13:18 [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context Chris Wilson
` (4 preceding siblings ...)
2018-03-31 20:41 ` kbuild test robot
@ 2018-04-01 2:45 ` kbuild test robot
2018-04-05 19:29 ` Chris Wilson
2018-04-06 9:11 ` Tvrtko Ursulin
7 siblings, 0 replies; 11+ messages in thread
From: kbuild test robot @ 2018-04-01 2:45 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx, kbuild-all
Hi Chris,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on v4.16-rc7 next-20180329]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-selftests-Avoid-repeatedly-harming-the-same-innocent-context/20180401-022503
base: git://anongit.freedesktop.org/drm-intel for-linux-next
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__
sparse warnings: (new ones prefixed by >>)
drivers/gpu/drm/i915/selftests/intel_hangcheck.c:988:33: sparse: undefined identifier 'GEM_TRACE_DUMP'
>> drivers/gpu/drm/i915/selftests/intel_hangcheck.c:988:47: sparse: call with no type!
In file included from drivers/gpu/drm/i915/intel_hangcheck.c:465:0:
drivers/gpu/drm/i915/selftests/intel_hangcheck.c: In function 'igt_reset_queue':
drivers/gpu/drm/i915/selftests/intel_hangcheck.c:988:5: error: implicit declaration of function 'GEM_TRACE_DUMP'; did you mean 'GEM_TRACE'? [-Werror=implicit-function-declaration]
GEM_TRACE_DUMP();
^~~~~~~~~~~~~~
GEM_TRACE
cc1: some warnings being treated as errors
vim +988 drivers/gpu/drm/i915/selftests/intel_hangcheck.c
922
923 static int igt_reset_queue(void *arg)
924 {
925 struct drm_i915_private *i915 = arg;
926 struct intel_engine_cs *engine;
927 enum intel_engine_id id;
928 struct hang h;
929 int err;
930
931 /* Check that we replay pending requests following a hang */
932
933 global_reset_lock(i915);
934
935 mutex_lock(&i915->drm.struct_mutex);
936 err = hang_init(&h, i915);
937 if (err)
938 goto unlock;
939
940 for_each_engine(engine, i915, id) {
941 struct i915_request *prev;
942 IGT_TIMEOUT(end_time);
943 unsigned int count;
944
945 if (!intel_engine_can_store_dword(engine))
946 continue;
947
948 prev = hang_create_request(&h, engine);
949 if (IS_ERR(prev)) {
950 err = PTR_ERR(prev);
951 goto fini;
952 }
953
954 i915_request_get(prev);
955 __i915_request_add(prev, true);
956
957 count = 0;
958 do {
959 struct i915_request *rq;
960 unsigned int reset_count;
961
962 rq = hang_create_request(&h, engine);
963 if (IS_ERR(rq)) {
964 err = PTR_ERR(rq);
965 goto fini;
966 }
967
968 i915_request_get(rq);
969 __i915_request_add(rq, true);
970
971 /*
972 * XXX We don't handle resetting the kernel context
973 * very well. If we trigger a device reset twice in
974 * quick succession while the kernel context is
975 * executing, we may end up skipping the breadcrumb.
976 * This is really only a problem for the selftest as
977 * normally there is a large interlude between resets
978 * (hangcheck), or we focus on resetting just one
979 * engine and so avoid repeatedly resetting innocents.
980 */
981 err = wait_for_others(i915, engine);
982 if (err) {
983 pr_err("%s(%s): Failed to idle other inactive engines after device reset\n",
984 __func__, engine->name);
985 i915_request_put(rq);
986 i915_request_put(prev);
987
> 988 GEM_TRACE_DUMP();
989 i915_gem_set_wedged(i915);
990 goto fini;
991 }
992
993 if (!wait_for_hang(&h, prev)) {
994 struct drm_printer p = drm_info_printer(i915->drm.dev);
995
996 pr_err("%s(%s): Failed to start request %x, at %x\n",
997 __func__, engine->name,
998 prev->fence.seqno, hws_seqno(&h, prev));
999 intel_engine_dump(engine, &p,
1000 "%s\n", engine->name);
1001
1002 i915_request_put(rq);
1003 i915_request_put(prev);
1004
1005 i915_reset(i915, 0);
1006 i915_gem_set_wedged(i915);
1007
1008 err = -EIO;
1009 goto fini;
1010 }
1011
1012 reset_count = fake_hangcheck(prev);
1013
1014 i915_reset(i915, I915_RESET_QUIET);
1015
1016 GEM_BUG_ON(test_bit(I915_RESET_HANDOFF,
1017 &i915->gpu_error.flags));
1018
1019 if (prev->fence.error != -EIO) {
1020 pr_err("GPU reset not recorded on hanging request [fence.error=%d]!\n",
1021 prev->fence.error);
1022 i915_request_put(rq);
1023 i915_request_put(prev);
1024 err = -EINVAL;
1025 goto fini;
1026 }
1027
1028 if (rq->fence.error) {
1029 pr_err("Fence error status not zero [%d] after unrelated reset\n",
1030 rq->fence.error);
1031 i915_request_put(rq);
1032 i915_request_put(prev);
1033 err = -EINVAL;
1034 goto fini;
1035 }
1036
1037 if (i915_reset_count(&i915->gpu_error) == reset_count) {
1038 pr_err("No GPU reset recorded!\n");
1039 i915_request_put(rq);
1040 i915_request_put(prev);
1041 err = -EINVAL;
1042 goto fini;
1043 }
1044
1045 i915_request_put(prev);
1046 prev = rq;
1047 count++;
1048 } while (time_before(jiffies, end_time));
1049 pr_info("%s: Completed %d resets\n", engine->name, count);
1050
1051 *h.batch = MI_BATCH_BUFFER_END;
1052 i915_gem_chipset_flush(i915);
1053
1054 i915_request_put(prev);
1055
1056 err = flush_test(i915, I915_WAIT_LOCKED);
1057 if (err)
1058 break;
1059 }
1060
1061 fini:
1062 hang_fini(&h);
1063 unlock:
1064 mutex_unlock(&i915->drm.struct_mutex);
1065 global_reset_unlock(i915);
1066
1067 if (i915_terminally_wedged(&i915->gpu_error))
1068 return -EIO;
1069
1070 return err;
1071 }
1072
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context
2018-03-30 13:18 [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context Chris Wilson
` (5 preceding siblings ...)
2018-04-01 2:45 ` kbuild test robot
@ 2018-04-05 19:29 ` Chris Wilson
2018-04-06 9:11 ` Tvrtko Ursulin
7 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2018-04-05 19:29 UTC (permalink / raw)
To: intel-gfx; +Cc: Mika
Quoting Chris Wilson (2018-03-30 14:18:01)
> We don't handle resetting the kernel context very well, or presumably any
> context executing its breadcrumb commands in the ring as opposed to the
> batchbuffer and flush. If we trigger a device reset twice in quick
> succession while the kernel context is executing, we may end up skipping
> the breadcrumb. This is really only a problem for the selftest as
> normally there is a large interlude between resets (hangcheck), or we
> focus on resetting just one engine and so avoid repeatedly resetting
> innocents.
>
> Something to try would be a preempt-to-idle to quiesce the engine before
> reset, so that innocent contexts would be spared the reset.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> CC: Michel Thierry <michel.thierry@intel.com>
Anyone?
> ---
> drivers/gpu/drm/i915/i915_drv.c | 3 ++
> drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 48 ++++++++++++++++++++++--
> 2 files changed, 47 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index d354627882e3..684060ed8db6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1886,6 +1886,8 @@ void i915_reset(struct drm_i915_private *i915)
> int ret;
> int i;
>
> + GEM_TRACE("flags=%lx\n", error->flags);
> +
> might_sleep();
> lockdep_assert_held(&i915->drm.struct_mutex);
> GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &error->flags));
> @@ -2016,6 +2018,7 @@ int i915_reset_engine(struct intel_engine_cs *engine, const char *msg)
> struct i915_request *active_request;
> int ret;
>
> + GEM_TRACE("%s flags=%lx\n", engine->name, error->flags);
> GEM_BUG_ON(!test_bit(I915_RESET_ENGINE + engine->id, &error->flags));
>
> active_request = i915_gem_reset_prepare_engine(engine);
> diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> index 9e4e0ad62724..122a32e0a69d 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> @@ -979,6 +979,23 @@ static int igt_wait_reset(void *arg)
> return err;
> }
>
> +static int wait_for_others(struct drm_i915_private *i915,
> + struct intel_engine_cs *exclude)
> +{
> + struct intel_engine_cs *engine;
> + enum intel_engine_id id;
> +
> + for_each_engine(engine, i915, id) {
> + if (engine == exclude)
> + continue;
> +
> + if (wait_for(intel_engine_is_idle(engine), 10))
> + return -EIO;
> + }
> +
> + return 0;
> +}
> +
> static int igt_reset_queue(void *arg)
> {
> struct drm_i915_private *i915 = arg;
> @@ -1027,13 +1044,36 @@ static int igt_reset_queue(void *arg)
> i915_request_get(rq);
> __i915_request_add(rq, true);
>
> + /*
> + * XXX We don't handle resetting the kernel context
> + * very well. If we trigger a device reset twice in
> + * quick succession while the kernel context is
> + * executing, we may end up skipping the breadcrumb.
> + * This is really only a problem for the selftest as
> + * normally there is a large interlude between resets
> + * (hangcheck), or we focus on resetting just one
> + * engine and so avoid repeatedly resetting innocents.
> + */
> + err = wait_for_others(i915, engine);
> + if (err) {
> + pr_err("%s(%s): Failed to idle other inactive engines after device reset\n",
> + __func__, engine->name);
> + i915_request_put(rq);
> + i915_request_put(prev);
> +
> + GEM_TRACE_DUMP();
> + i915_gem_set_wedged(i915);
> + goto fini;
> + }
> +
> if (!wait_for_hang(&h, prev)) {
> struct drm_printer p = drm_info_printer(i915->drm.dev);
>
> - pr_err("%s: Failed to start request %x, at %x\n",
> - __func__, prev->fence.seqno, hws_seqno(&h, prev));
> - intel_engine_dump(prev->engine, &p,
> - "%s\n", prev->engine->name);
> + pr_err("%s(%s): Failed to start request %x, at %x\n",
> + __func__, engine->name,
> + prev->fence.seqno, hws_seqno(&h, prev));
> + intel_engine_dump(engine, &p,
> + "%s\n", engine->name);
>
> i915_request_put(rq);
> i915_request_put(prev);
> --
> 2.16.3
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context
2018-03-30 13:18 [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context Chris Wilson
` (6 preceding siblings ...)
2018-04-05 19:29 ` Chris Wilson
@ 2018-04-06 9:11 ` Tvrtko Ursulin
2018-04-06 9:40 ` Chris Wilson
7 siblings, 1 reply; 11+ messages in thread
From: Tvrtko Ursulin @ 2018-04-06 9:11 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 30/03/2018 14:18, Chris Wilson wrote:
> We don't handle resetting the kernel context very well, or presumably any
> context executing its breadcrumb commands in the ring as opposed to the
> batchbuffer and flush. If we trigger a device reset twice in quick
> succession while the kernel context is executing, we may end up skipping
> the breadcrumb. This is really only a problem for the selftest as
> normally there is a large interlude between resets (hangcheck), or we
> focus on resetting just one engine and so avoid repeatedly resetting
> innocents.
>
> Something to try would be a preempt-to-idle to quiesce the engine before
> reset, so that innocent contexts would be spared the reset.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> CC: Michel Thierry <michel.thierry@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.c | 3 ++
> drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 48 ++++++++++++++++++++++--
> 2 files changed, 47 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index d354627882e3..684060ed8db6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1886,6 +1886,8 @@ void i915_reset(struct drm_i915_private *i915)
> int ret;
> int i;
>
> + GEM_TRACE("flags=%lx\n", error->flags);
> +
> might_sleep();
> lockdep_assert_held(&i915->drm.struct_mutex);
> GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &error->flags));
> @@ -2016,6 +2018,7 @@ int i915_reset_engine(struct intel_engine_cs *engine, const char *msg)
> struct i915_request *active_request;
> int ret;
>
> + GEM_TRACE("%s flags=%lx\n", engine->name, error->flags);
> GEM_BUG_ON(!test_bit(I915_RESET_ENGINE + engine->id, &error->flags));
>
> active_request = i915_gem_reset_prepare_engine(engine);
> diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> index 9e4e0ad62724..122a32e0a69d 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> @@ -979,6 +979,23 @@ static int igt_wait_reset(void *arg)
> return err;
> }
>
> +static int wait_for_others(struct drm_i915_private *i915,
> + struct intel_engine_cs *exclude)
> +{
> + struct intel_engine_cs *engine;
> + enum intel_engine_id id;
> +
> + for_each_engine(engine, i915, id) { > + if (engine == exclude)> + continue;
Bike-shedder in me says:
for_each_engine_masked(engine, i915, ~BIT(exclude->id), id)
:)
Although for_each_engine_masked failed to enclose its arguments in
parentheses so never mind.
> +
> + if (wait_for(intel_engine_is_idle(engine), 10))
> + return -EIO;
> + }
> +
> + return 0;
> +}
> +
> static int igt_reset_queue(void *arg)
> {
> struct drm_i915_private *i915 = arg;
> @@ -1027,13 +1044,36 @@ static int igt_reset_queue(void *arg)
> i915_request_get(rq);
> __i915_request_add(rq, true);
>
> + /*
> + * XXX We don't handle resetting the kernel context
> + * very well. If we trigger a device reset twice in
> + * quick succession while the kernel context is
> + * executing, we may end up skipping the breadcrumb.
> + * This is really only a problem for the selftest as
> + * normally there is a large interlude between resets
> + * (hangcheck), or we focus on resetting just one
> + * engine and so avoid repeatedly resetting innocents.
> + */
> + err = wait_for_others(i915, engine);
> + if (err) {
> + pr_err("%s(%s): Failed to idle other inactive engines after device reset\n",
> + __func__, engine->name);
> + i915_request_put(rq);
> + i915_request_put(prev);
> +
> + GEM_TRACE_DUMP();
> + i915_gem_set_wedged(i915);
> + goto fini;
> + }
> +
> if (!wait_for_hang(&h, prev)) {
This was confusing me a period since I was assuming the seqno check is
against the breadcrumb, but it is actually about the same number written
to a different place. So it actually means
wait_for_request_to_start_executing.
> struct drm_printer p = drm_info_printer(i915->drm.dev);
>
> - pr_err("%s: Failed to start request %x, at %x\n",
> - __func__, prev->fence.seqno, hws_seqno(&h, prev));
> - intel_engine_dump(prev->engine, &p,
> - "%s\n", prev->engine->name);
> + pr_err("%s(%s): Failed to start request %x, at %x\n",
> + __func__, engine->name,
> + prev->fence.seqno, hws_seqno(&h, prev));
> + intel_engine_dump(engine, &p,
> + "%s\n", engine->name);
>
> i915_request_put(rq);
> i915_request_put(prev);
>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context
2018-04-06 9:11 ` Tvrtko Ursulin
@ 2018-04-06 9:40 ` Chris Wilson
2018-04-06 9:59 ` Tvrtko Ursulin
0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2018-04-06 9:40 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
Quoting Tvrtko Ursulin (2018-04-06 10:11:57)
>
> On 30/03/2018 14:18, Chris Wilson wrote:
> > if (!wait_for_hang(&h, prev)) {
>
> This was confusing me a period since I was assuming the seqno check is
> against the breadcrumb, but it is actually about the same number written
> to a different place. So it actually means
> wait_for_request_to_start_executing.
Bonus points for a better name than struct hang.
struct spinner;
spinner_wait_until_active() ?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context
2018-04-06 9:40 ` Chris Wilson
@ 2018-04-06 9:59 ` Tvrtko Ursulin
0 siblings, 0 replies; 11+ messages in thread
From: Tvrtko Ursulin @ 2018-04-06 9:59 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 06/04/2018 10:40, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-04-06 10:11:57)
>>
>> On 30/03/2018 14:18, Chris Wilson wrote:
>>> if (!wait_for_hang(&h, prev)) {
>>
>> This was confusing me a period since I was assuming the seqno check is
>> against the breadcrumb, but it is actually about the same number written
>> to a different place. So it actually means
>> wait_for_request_to_start_executing.
>
> Bonus points for a better name than struct hang.
>
> struct spinner;
>
> spinner_wait_until_active() ?
Maybe just rename the helper to wait_until_running if you feel like it.
To keep things simple and scope in check.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2018-04-06 9:59 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-30 13:18 [PATCH] drm/i915/selftests: Avoid repeatedly harming the same innocent context Chris Wilson
2018-03-30 13:30 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2018-03-30 13:47 ` ✓ Fi.CI.BAT: success " Patchwork
2018-03-30 14:33 ` ✗ Fi.CI.IGT: failure " Patchwork
2018-03-31 20:24 ` [PATCH] " kbuild test robot
2018-03-31 20:41 ` kbuild test robot
2018-04-01 2:45 ` kbuild test robot
2018-04-05 19:29 ` Chris Wilson
2018-04-06 9:11 ` Tvrtko Ursulin
2018-04-06 9:40 ` Chris Wilson
2018-04-06 9:59 ` Tvrtko Ursulin
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.