* [PATCH 1/3] drm/i915: Re-enable GGTT earlier after GPU reset
@ 2017-12-17 13:28 Chris Wilson
2017-12-17 13:28 ` [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump Chris Wilson
` (6 more replies)
0 siblings, 7 replies; 22+ messages in thread
From: Chris Wilson @ 2017-12-17 13:28 UTC (permalink / raw)
To: intel-gfx
Inside i915_gem_reset(), we start touching the HW and so require the
low-level HW to be re-enabled, in particular the PCI BARs.
Fixes: 7b6da818d86f ("drm/i915: Restore the kernel context after a GPU reset on an idle engine")
Testcase: igt/drv_hangman # i915g/i915gm
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_drv.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6d39fdf2b604..72bea281edb7 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1924,9 +1924,6 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
goto taint;
}
- i915_gem_reset(i915);
- intel_overlay_reset(i915);
-
/* Ok, now get things going again... */
/*
@@ -1939,6 +1936,9 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
goto error;
}
+ i915_gem_reset(i915);
+ intel_overlay_reset(i915);
+
/*
* Next we need to restore the context, but we don't use those
* yet either...
--
2.15.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump
2017-12-17 13:28 [PATCH 1/3] drm/i915: Re-enable GGTT earlier after GPU reset Chris Wilson
@ 2017-12-17 13:28 ` Chris Wilson
2017-12-18 11:14 ` Tvrtko Ursulin
` (4 more replies)
2017-12-17 13:28 ` [PATCH 3/3] drm/i915/selftests: Fix up igt_reset_engine Chris Wilson
` (5 subsequent siblings)
6 siblings, 5 replies; 22+ messages in thread
From: Chris Wilson @ 2017-12-17 13:28 UTC (permalink / raw)
To: intel-gfx
Useful bits of information for inspecting GPU stalls from
intel_engine_dump() are the error registers, IPEIR and IPEHR.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/intel_engine_cs.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 510e0bc3a377..05bd9e17452c 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1757,6 +1757,13 @@ void intel_engine_dump(struct intel_engine_cs *engine,
addr = intel_engine_get_last_batch_head(engine);
drm_printf(m, "\tBBADDR: 0x%08x_%08x\n",
upper_32_bits(addr), lower_32_bits(addr));
+ addr = I915_READ(RING_DMA_FADD(engine->mmio_base));
+ drm_printf(m, "\tDMA_FADDR: 0x%08x_%08x\n",
+ upper_32_bits(addr), lower_32_bits(addr));
+ drm_printf(m, "\tIPEIR: 0x%08x\n",
+ I915_READ(RING_IPEIR(engine->mmio_base)));
+ drm_printf(m, "\tIPEHR: 0x%08x\n",
+ I915_READ(RING_IPEHR(engine->mmio_base)));
if (HAS_EXECLISTS(dev_priv)) {
const u32 *hws = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
--
2.15.1
* [PATCH 3/3] drm/i915/selftests: Fix up igt_reset_engine
2017-12-17 13:28 [PATCH 1/3] drm/i915: Re-enable GGTT earlier after GPU reset Chris Wilson
2017-12-17 13:28 ` [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump Chris Wilson
@ 2017-12-17 13:28 ` Chris Wilson
2017-12-18 21:50 ` Michel Thierry
2017-12-17 14:07 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset Patchwork
` (4 subsequent siblings)
6 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2017-12-17 13:28 UTC (permalink / raw)
To: intel-gfx
Now that we skip a per-engine reset on an idle engine, we need to update
the selftest to take that into account. In the process, we find that we
were not stressing the per-engine reset very hard, so add those missing
active resets.
v2: Actually test i915_reset_engine() by loading it with requests.
Fixes: f6ba181ada55 ("drm/i915: Skip an engine reset if it recovered before our preparations")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 314 ++++++++++++++++++-----
1 file changed, 250 insertions(+), 64 deletions(-)
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index f98546b8a7fa..c8a756e2139f 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -132,6 +132,12 @@ static int emit_recurse_batch(struct hang *h,
*batch++ = lower_32_bits(hws_address(hws, rq));
*batch++ = upper_32_bits(hws_address(hws, rq));
*batch++ = rq->fence.seqno;
+ *batch++ = MI_ARB_CHECK;
+
+ memset(batch, 0, 1024);
+ batch += 1024 / sizeof(*batch);
+
+ *batch++ = MI_ARB_CHECK;
*batch++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
*batch++ = lower_32_bits(vma->node.start);
*batch++ = upper_32_bits(vma->node.start);
@@ -140,6 +146,12 @@ static int emit_recurse_batch(struct hang *h,
*batch++ = 0;
*batch++ = lower_32_bits(hws_address(hws, rq));
*batch++ = rq->fence.seqno;
+ *batch++ = MI_ARB_CHECK;
+
+ memset(batch, 0, 1024);
+ batch += 1024 / sizeof(*batch);
+
+ *batch++ = MI_ARB_CHECK;
*batch++ = MI_BATCH_BUFFER_START | 1 << 8;
*batch++ = lower_32_bits(vma->node.start);
} else if (INTEL_GEN(i915) >= 4) {
@@ -147,12 +159,24 @@ static int emit_recurse_batch(struct hang *h,
*batch++ = 0;
*batch++ = lower_32_bits(hws_address(hws, rq));
*batch++ = rq->fence.seqno;
+ *batch++ = MI_ARB_CHECK;
+
+ memset(batch, 0, 1024);
+ batch += 1024 / sizeof(*batch);
+
+ *batch++ = MI_ARB_CHECK;
*batch++ = MI_BATCH_BUFFER_START | 2 << 6;
*batch++ = lower_32_bits(vma->node.start);
} else {
*batch++ = MI_STORE_DWORD_IMM;
*batch++ = lower_32_bits(hws_address(hws, rq));
*batch++ = rq->fence.seqno;
+ *batch++ = MI_ARB_CHECK;
+
+ memset(batch, 0, 1024);
+ batch += 1024 / sizeof(*batch);
+
+ *batch++ = MI_ARB_CHECK;
*batch++ = MI_BATCH_BUFFER_START | 2 << 6 | 1;
*batch++ = lower_32_bits(vma->node.start);
}
@@ -234,6 +258,16 @@ static void hang_fini(struct hang *h)
i915_gem_wait_for_idle(h->i915, I915_WAIT_LOCKED);
}
+static bool wait_for_hang(struct hang *h, struct drm_i915_gem_request *rq)
+{
+ return !(wait_for_us(i915_seqno_passed(hws_seqno(h, rq),
+ rq->fence.seqno),
+ 10) &&
+ wait_for(i915_seqno_passed(hws_seqno(h, rq),
+ rq->fence.seqno),
+ 1000));
+}
+
static int igt_hang_sanitycheck(void *arg)
{
struct drm_i915_private *i915 = arg;
@@ -296,6 +330,9 @@ static void global_reset_lock(struct drm_i915_private *i915)
struct intel_engine_cs *engine;
enum intel_engine_id id;
+ pr_debug("%s: current gpu_error=%08lx\n",
+ __func__, i915->gpu_error.flags);
+
while (test_and_set_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags))
wait_event(i915->gpu_error.reset_queue,
!test_bit(I915_RESET_BACKOFF,
@@ -353,54 +390,127 @@ static int igt_global_reset(void *arg)
return err;
}
-static int igt_reset_engine(void *arg)
+static int __igt_reset_engine(struct drm_i915_private *i915, bool active)
{
- struct drm_i915_private *i915 = arg;
struct intel_engine_cs *engine;
enum intel_engine_id id;
- unsigned int reset_count, reset_engine_count;
+ struct hang h;
int err = 0;
- /* Check that we can issue a global GPU and engine reset */
+ /* Check that we can issue an engine reset on an idle engine (no-op) */
if (!intel_has_reset_engine(i915))
return 0;
+ if (active) {
+ mutex_lock(&i915->drm.struct_mutex);
+ err = hang_init(&h, i915);
+ mutex_unlock(&i915->drm.struct_mutex);
+ if (err)
+ return err;
+ }
+
for_each_engine(engine, i915, id) {
- set_bit(I915_RESET_ENGINE + engine->id, &i915->gpu_error.flags);
+ unsigned int reset_count, reset_engine_count;
+ IGT_TIMEOUT(end_time);
+
+ if (active && !intel_engine_can_store_dword(engine))
+ continue;
+
reset_count = i915_reset_count(&i915->gpu_error);
reset_engine_count = i915_reset_engine_count(&i915->gpu_error,
engine);
- err = i915_reset_engine(engine, I915_RESET_QUIET);
- if (err) {
- pr_err("i915_reset_engine failed\n");
- break;
- }
+ set_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
+ do {
+ if (active) {
+ struct drm_i915_gem_request *rq;
+
+ mutex_lock(&i915->drm.struct_mutex);
+ rq = hang_create_request(&h, engine,
+ i915->kernel_context);
+ if (IS_ERR(rq)) {
+ err = PTR_ERR(rq);
+ break;
+ }
+
+ i915_gem_request_get(rq);
+ __i915_add_request(rq, true);
+ mutex_unlock(&i915->drm.struct_mutex);
+
+ if (!wait_for_hang(&h, rq)) {
+ struct drm_printer p = drm_info_printer(i915->drm.dev);
+
+ pr_err("%s: Failed to start request %x, at %x\n",
+ __func__, rq->fence.seqno, hws_seqno(&h, rq));
+ intel_engine_dump(engine, &p,
+ "%s\n", engine->name);
+
+ i915_gem_request_put(rq);
+ err = -EIO;
+ break;
+ }
- if (i915_reset_count(&i915->gpu_error) != reset_count) {
- pr_err("Full GPU reset recorded! (engine reset expected)\n");
- err = -EINVAL;
- break;
- }
+ i915_gem_request_put(rq);
+ }
+
+ engine->hangcheck.stalled = true;
+ engine->hangcheck.seqno =
+ intel_engine_get_seqno(engine);
+
+ err = i915_reset_engine(engine, I915_RESET_QUIET);
+ if (err) {
+ pr_err("i915_reset_engine failed\n");
+ break;
+ }
+
+ if (i915_reset_count(&i915->gpu_error) != reset_count) {
+ pr_err("Full GPU reset recorded! (engine reset expected)\n");
+ err = -EINVAL;
+ break;
+ }
+
+ reset_engine_count += active;
+ if (i915_reset_engine_count(&i915->gpu_error, engine) !=
+ reset_engine_count) {
+ pr_err("%s engine reset %srecorded!\n",
+ engine->name, active ? "not " : "");
+ err = -EINVAL;
+ break;
+ }
+
+ engine->hangcheck.stalled = false;
+ } while (time_before(jiffies, end_time));
+ clear_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
- if (i915_reset_engine_count(&i915->gpu_error, engine) ==
- reset_engine_count) {
- pr_err("No %s engine reset recorded!\n", engine->name);
- err = -EINVAL;
+ if (err)
break;
- }
- clear_bit(I915_RESET_ENGINE + engine->id,
- &i915->gpu_error.flags);
+ cond_resched();
}
if (i915_terminally_wedged(&i915->gpu_error))
err = -EIO;
+ if (active) {
+ mutex_lock(&i915->drm.struct_mutex);
+ hang_fini(&h);
+ mutex_unlock(&i915->drm.struct_mutex);
+ }
+
return err;
}
+static int igt_reset_idle_engine(void *arg)
+{
+ return __igt_reset_engine(arg, false);
+}
+
+static int igt_reset_active_engine(void *arg)
+{
+ return __igt_reset_engine(arg, true);
+}
+
static int active_engine(void *data)
{
struct intel_engine_cs *engine = data;
@@ -462,11 +572,12 @@ static int active_engine(void *data)
return err;
}
-static int igt_reset_active_engines(void *arg)
+static int __igt_reset_engine_others(struct drm_i915_private *i915,
+ bool active)
{
- struct drm_i915_private *i915 = arg;
- struct intel_engine_cs *engine, *active;
+ struct intel_engine_cs *engine, *other;
enum intel_engine_id id, tmp;
+ struct hang h;
int err = 0;
/* Check that issuing a reset on one engine does not interfere
@@ -476,24 +587,36 @@ static int igt_reset_active_engines(void *arg)
if (!intel_has_reset_engine(i915))
return 0;
+ if (active) {
+ mutex_lock(&i915->drm.struct_mutex);
+ err = hang_init(&h, i915);
+ mutex_unlock(&i915->drm.struct_mutex);
+ if (err)
+ return err;
+ }
+
for_each_engine(engine, i915, id) {
- struct task_struct *threads[I915_NUM_ENGINES];
+ struct task_struct *threads[I915_NUM_ENGINES] = {};
unsigned long resets[I915_NUM_ENGINES];
unsigned long global = i915_reset_count(&i915->gpu_error);
+ unsigned long count = 0;
IGT_TIMEOUT(end_time);
+ if (active && !intel_engine_can_store_dword(engine))
+ continue;
+
memset(threads, 0, sizeof(threads));
- for_each_engine(active, i915, tmp) {
+ for_each_engine(other, i915, tmp) {
struct task_struct *tsk;
- if (active == engine)
- continue;
-
resets[tmp] = i915_reset_engine_count(&i915->gpu_error,
- active);
+ other);
- tsk = kthread_run(active_engine, active,
- "igt/%s", active->name);
+ if (other == engine)
+ continue;
+
+ tsk = kthread_run(active_engine, other,
+ "igt/%s", other->name);
if (IS_ERR(tsk)) {
err = PTR_ERR(tsk);
goto unwind;
@@ -503,20 +626,70 @@ static int igt_reset_active_engines(void *arg)
get_task_struct(tsk);
}
- set_bit(I915_RESET_ENGINE + engine->id, &i915->gpu_error.flags);
+ set_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
do {
+ if (active) {
+ struct drm_i915_gem_request *rq;
+
+ mutex_lock(&i915->drm.struct_mutex);
+ rq = hang_create_request(&h, engine,
+ i915->kernel_context);
+ if (IS_ERR(rq)) {
+ err = PTR_ERR(rq);
+ mutex_unlock(&i915->drm.struct_mutex);
+ break;
+ }
+
+ i915_gem_request_get(rq);
+ __i915_add_request(rq, true);
+ mutex_unlock(&i915->drm.struct_mutex);
+
+ if (!wait_for_hang(&h, rq)) {
+ struct drm_printer p = drm_info_printer(i915->drm.dev);
+
+ pr_err("%s: Failed to start request %x, at %x\n",
+ __func__, rq->fence.seqno, hws_seqno(&h, rq));
+ intel_engine_dump(engine, &p,
+ "%s\n", engine->name);
+
+ i915_gem_request_put(rq);
+ err = -EIO;
+ break;
+ }
+
+ i915_gem_request_put(rq);
+ }
+
+ engine->hangcheck.stalled = true;
+ engine->hangcheck.seqno =
+ intel_engine_get_seqno(engine);
+
err = i915_reset_engine(engine, I915_RESET_QUIET);
if (err) {
- pr_err("i915_reset_engine(%s) failed, err=%d\n",
- engine->name, err);
+ pr_err("i915_reset_engine(%s:%s) failed, err=%d\n",
+ engine->name, active ? "active" : "idle", err);
break;
}
+
+ engine->hangcheck.stalled = false;
+ count++;
} while (time_before(jiffies, end_time));
- clear_bit(I915_RESET_ENGINE + engine->id,
- &i915->gpu_error.flags);
+ clear_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
+ pr_info("i915_reset_engine(%s:%s): %lu resets\n",
+ engine->name, active ? "active" : "idle", count);
+
+ if (i915_reset_engine_count(&i915->gpu_error, engine) -
+ resets[engine->id] != (active ? count : 0)) {
+ pr_err("i915_reset_engine(%s:%s): reset %lu times, but reported %lu\n",
+ engine->name, active ? "active" : "idle", count,
+ i915_reset_engine_count(&i915->gpu_error,
+ engine) - resets[engine->id]);
+ if (!err)
+ err = -EINVAL;
+ }
unwind:
- for_each_engine(active, i915, tmp) {
+ for_each_engine(other, i915, tmp) {
int ret;
if (!threads[tmp])
@@ -524,27 +697,29 @@ static int igt_reset_active_engines(void *arg)
ret = kthread_stop(threads[tmp]);
if (ret) {
- pr_err("kthread for active engine %s failed, err=%d\n",
- active->name, ret);
+ pr_err("kthread for other engine %s failed, err=%d\n",
+ other->name, ret);
if (!err)
err = ret;
}
put_task_struct(threads[tmp]);
if (resets[tmp] != i915_reset_engine_count(&i915->gpu_error,
- active)) {
+ other)) {
pr_err("Innocent engine %s was reset (count=%ld)\n",
- active->name,
+ other->name,
i915_reset_engine_count(&i915->gpu_error,
- active) - resets[tmp]);
- err = -EIO;
+ other) - resets[tmp]);
+ if (!err)
+ err = -EINVAL;
}
}
if (global != i915_reset_count(&i915->gpu_error)) {
pr_err("Global reset (count=%ld)!\n",
i915_reset_count(&i915->gpu_error) - global);
- err = -EIO;
+ if (!err)
+ err = -EINVAL;
}
if (err)
@@ -556,9 +731,25 @@ static int igt_reset_active_engines(void *arg)
if (i915_terminally_wedged(&i915->gpu_error))
err = -EIO;
+ if (active) {
+ mutex_lock(&i915->drm.struct_mutex);
+ hang_fini(&h);
+ mutex_unlock(&i915->drm.struct_mutex);
+ }
+
return err;
}
+static int igt_reset_idle_engine_others(void *arg)
+{
+ return __igt_reset_engine_others(arg, false);
+}
+
+static int igt_reset_active_engine_others(void *arg)
+{
+ return __igt_reset_engine_others(arg, true);
+}
+
static u32 fake_hangcheck(struct drm_i915_gem_request *rq)
{
u32 reset_count;
@@ -574,16 +765,6 @@ static u32 fake_hangcheck(struct drm_i915_gem_request *rq)
return reset_count;
}
-static bool wait_for_hang(struct hang *h, struct drm_i915_gem_request *rq)
-{
- return !(wait_for_us(i915_seqno_passed(hws_seqno(h, rq),
- rq->fence.seqno),
- 10) &&
- wait_for(i915_seqno_passed(hws_seqno(h, rq),
- rq->fence.seqno),
- 1000));
-}
-
static int igt_wait_reset(void *arg)
{
struct drm_i915_private *i915 = arg;
@@ -617,8 +798,8 @@ static int igt_wait_reset(void *arg)
if (!wait_for_hang(&h, rq)) {
struct drm_printer p = drm_info_printer(i915->drm.dev);
- pr_err("Failed to start request %x, at %x\n",
- rq->fence.seqno, hws_seqno(&h, rq));
+ pr_err("%s: Failed to start request %x, at %x\n",
+ __func__, rq->fence.seqno, hws_seqno(&h, rq));
intel_engine_dump(rq->engine, &p, "%s\n", rq->engine->name);
i915_reset(i915, 0);
@@ -712,8 +893,8 @@ static int igt_reset_queue(void *arg)
if (!wait_for_hang(&h, prev)) {
struct drm_printer p = drm_info_printer(i915->drm.dev);
- pr_err("Failed to start request %x, at %x\n",
- prev->fence.seqno, hws_seqno(&h, prev));
+ pr_err("%s: Failed to start request %x, at %x\n",
+ __func__, prev->fence.seqno, hws_seqno(&h, prev));
intel_engine_dump(prev->engine, &p,
"%s\n", prev->engine->name);
@@ -819,8 +1000,8 @@ static int igt_handle_error(void *arg)
if (!wait_for_hang(&h, rq)) {
struct drm_printer p = drm_info_printer(i915->drm.dev);
- pr_err("Failed to start request %x, at %x\n",
- rq->fence.seqno, hws_seqno(&h, rq));
+ pr_err("%s: Failed to start request %x, at %x\n",
+ __func__, rq->fence.seqno, hws_seqno(&h, rq));
intel_engine_dump(rq->engine, &p, "%s\n", rq->engine->name);
i915_reset(i915, 0);
@@ -864,21 +1045,26 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915)
static const struct i915_subtest tests[] = {
SUBTEST(igt_global_reset), /* attempt to recover GPU first */
SUBTEST(igt_hang_sanitycheck),
- SUBTEST(igt_reset_engine),
- SUBTEST(igt_reset_active_engines),
+ SUBTEST(igt_reset_idle_engine),
+ SUBTEST(igt_reset_active_engine),
+ SUBTEST(igt_reset_idle_engine_others),
+ SUBTEST(igt_reset_active_engine_others),
SUBTEST(igt_wait_reset),
SUBTEST(igt_reset_queue),
SUBTEST(igt_handle_error),
};
+ bool saved_hangcheck;
int err;
if (!intel_has_gpu_reset(i915))
return 0;
intel_runtime_pm_get(i915);
+ saved_hangcheck = fetch_and_zero(&i915_modparams.enable_hangcheck);
err = i915_subtests(tests, i915);
+ i915_modparams.enable_hangcheck = saved_hangcheck;
intel_runtime_pm_put(i915);
return err;
--
2.15.1
* ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset
2017-12-17 13:28 [PATCH 1/3] drm/i915: Re-enable GGTT earlier after GPU reset Chris Wilson
2017-12-17 13:28 ` [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump Chris Wilson
2017-12-17 13:28 ` [PATCH 3/3] drm/i915/selftests: Fix up igt_reset_engine Chris Wilson
@ 2017-12-17 14:07 ` Patchwork
2017-12-17 15:36 ` ✗ Fi.CI.IGT: warning " Patchwork
` (3 subsequent siblings)
6 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2017-12-17 14:07 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset
URL : https://patchwork.freedesktop.org/series/35471/
State : success
== Summary ==
Series 35471v1 series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset
https://patchwork.freedesktop.org/api/1.0/series/35471/revisions/1/mbox/
Test debugfs_test:
Subgroup read_all_entries:
pass -> DMESG-WARN (fi-elk-e7500) fdo#103989 +1
Test drv_hangman:
Subgroup error-state-basic:
dmesg-warn -> PASS (fi-gdg-551)
Test gem_basic:
Subgroup bad-close:
incomplete -> PASS (fi-gdg-551)
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
dmesg-warn -> PASS (fi-kbl-r) fdo#104172 +1
Test kms_psr_sink_crc:
Subgroup psr_basic:
pass -> DMESG-WARN (fi-skl-6700hq) fdo#101144
fdo#103989 https://bugs.freedesktop.org/show_bug.cgi?id=103989
fdo#104172 https://bugs.freedesktop.org/show_bug.cgi?id=104172
fdo#101144 https://bugs.freedesktop.org/show_bug.cgi?id=101144
fi-bdw-5557u total:288 pass:267 dwarn:0 dfail:0 fail:0 skip:21 time:434s
fi-bdw-gvtdvm total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:445s
fi-blb-e6850 total:288 pass:223 dwarn:1 dfail:0 fail:0 skip:64 time:381s
fi-bsw-n3050 total:288 pass:242 dwarn:0 dfail:0 fail:0 skip:46 time:504s
fi-bwr-2160 total:288 pass:183 dwarn:0 dfail:0 fail:0 skip:105 time:277s
fi-bxt-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:497s
fi-bxt-j4205 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:495s
fi-byt-j1900 total:288 pass:253 dwarn:0 dfail:0 fail:0 skip:35 time:475s
fi-byt-n2820 total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:470s
fi-elk-e7500 total:224 pass:163 dwarn:15 dfail:0 fail:0 skip:45
fi-gdg-551 total:288 pass:179 dwarn:1 dfail:0 fail:0 skip:108 time:265s
fi-glk-1 total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:537s
fi-hsw-4770 total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:405s
fi-hsw-4770r total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:413s
fi-ilk-650 total:288 pass:228 dwarn:0 dfail:0 fail:0 skip:60 time:383s
fi-ivb-3520m total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:468s
fi-ivb-3770 total:288 pass:255 dwarn:0 dfail:0 fail:0 skip:33 time:427s
fi-kbl-7500u total:288 pass:263 dwarn:1 dfail:0 fail:0 skip:24 time:476s
fi-kbl-7560u total:288 pass:268 dwarn:1 dfail:0 fail:0 skip:19 time:517s
fi-kbl-7567u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:466s
fi-kbl-r total:288 pass:260 dwarn:1 dfail:0 fail:0 skip:27 time:521s
fi-pnv-d510 total:288 pass:222 dwarn:1 dfail:0 fail:0 skip:65 time:606s
fi-skl-6260u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:438s
fi-skl-6600u total:288 pass:260 dwarn:1 dfail:0 fail:0 skip:27 time:531s
fi-skl-6700hq total:288 pass:261 dwarn:1 dfail:0 fail:0 skip:26 time:555s
fi-skl-6770hq total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:510s
fi-skl-gvtdvm total:288 pass:265 dwarn:0 dfail:0 fail:0 skip:23 time:442s
fi-snb-2520m total:245 pass:211 dwarn:0 dfail:0 fail:0 skip:33
fi-snb-2600 total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:407s
Blacklisted hosts:
fi-cfl-s2 total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:590s
fi-cnl-y total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:612s
fi-glk-dsi total:94 pass:45 dwarn:0 dfail:1 fail:0 skip:47
fi-skl-6700k2 total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:503s
aceab3849b32f367b3a38fe6852c5118b1c95839 drm-tip: 2017y-12m-17d-12h-52m-17s UTC integration manifest
ed3ef1790ee7 drm/i915/selftests: Fix up igt_reset_engine
0895dbc71bb0 drm/i915: Show IPEIR and IPEHR in the engine dump
1e42e4c592f9 drm/i915: Re-enable GGTT earlier after GPU reset
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7520/issues.html
* ✗ Fi.CI.IGT: warning for series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset
2017-12-17 13:28 [PATCH 1/3] drm/i915: Re-enable GGTT earlier after GPU reset Chris Wilson
` (2 preceding siblings ...)
2017-12-17 14:07 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset Patchwork
@ 2017-12-17 15:36 ` Patchwork
2017-12-17 18:19 ` [PATCH 1/3] " Chris Wilson
` (2 subsequent siblings)
6 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2017-12-17 15:36 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset
URL : https://patchwork.freedesktop.org/series/35471/
State : warning
== Summary ==
Test drv_suspend:
Subgroup fence-restore-untiled:
incomplete -> PASS (shard-hsw)
Test gem_exec_suspend:
Subgroup basic-s3-devices:
incomplete -> PASS (shard-hsw) fdo#103990
Test kms_draw_crc:
Subgroup draw-method-rgb565-pwrite-untiled:
pass -> SKIP (shard-snb)
Test drv_selftest:
Subgroup live_hangcheck:
incomplete -> PASS (shard-snb) fdo#103880
Test kms_frontbuffer_tracking:
Subgroup fbc-1p-offscren-pri-shrfb-draw-render:
pass -> FAIL (shard-snb) fdo#101623 +1
Subgroup psr-1p-primscrn-spr-indfb-draw-pwrite:
incomplete -> SKIP (shard-hsw)
Test gem_tiled_swapping:
Subgroup non-threaded:
dmesg-warn -> PASS (shard-hsw) fdo#104218
Test drv_module_reload:
Subgroup basic-reload:
dmesg-warn -> PASS (shard-snb) fdo#102848
pass -> DMESG-WARN (shard-hsw) fdo#102707
Test kms_setmode:
Subgroup basic:
fail -> PASS (shard-hsw) fdo#99912
fdo#103990 https://bugs.freedesktop.org/show_bug.cgi?id=103990
fdo#103880 https://bugs.freedesktop.org/show_bug.cgi?id=103880
fdo#101623 https://bugs.freedesktop.org/show_bug.cgi?id=101623
fdo#104218 https://bugs.freedesktop.org/show_bug.cgi?id=104218
fdo#102848 https://bugs.freedesktop.org/show_bug.cgi?id=102848
fdo#102707 https://bugs.freedesktop.org/show_bug.cgi?id=102707
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
shard-hsw total:2712 pass:1537 dwarn:2 dfail:0 fail:9 skip:1164 time:9416s
shard-snb total:2712 pass:1306 dwarn:1 dfail:0 fail:13 skip:1392 time:8023s
Blacklisted hosts:
shard-apl total:2712 pass:1686 dwarn:1 dfail:0 fail:24 skip:1001 time:14096s
shard-kbl total:2694 pass:1788 dwarn:2 dfail:0 fail:25 skip:878 time:10879s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7520/shards.html
* Re: [PATCH 1/3] drm/i915: Re-enable GGTT earlier after GPU reset
2017-12-17 13:28 [PATCH 1/3] drm/i915: Re-enable GGTT earlier after GPU reset Chris Wilson
` (3 preceding siblings ...)
2017-12-17 15:36 ` ✗ Fi.CI.IGT: warning " Patchwork
@ 2017-12-17 18:19 ` Chris Wilson
2017-12-18 11:11 ` Tvrtko Ursulin
2017-12-18 13:13 ` ✗ Fi.CI.BAT: failure for series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset (rev4) Patchwork
6 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2017-12-17 18:19 UTC (permalink / raw)
To: intel-gfx
Quoting Chris Wilson (2017-12-17 13:28:50)
> Inside i915_gem_reset(), we start touching the HW and so require the
> low-level HW to be re-enabled, in particular the PCI BARs.
>
> Fixes: 7b6da818d86f ("drm/i915: Restore the kernel context after a GPU reset on an idle engine")
References: 0db8c9612091 ("drm/i915: Re-enable GTT following a device reset")
Maybe fixes?
-Chris
* Re: [PATCH 1/3] drm/i915: Re-enable GGTT earlier after GPU reset
2017-12-17 13:28 [PATCH 1/3] drm/i915: Re-enable GGTT earlier after GPU reset Chris Wilson
` (4 preceding siblings ...)
2017-12-17 18:19 ` [PATCH 1/3] " Chris Wilson
@ 2017-12-18 11:11 ` Tvrtko Ursulin
2017-12-18 11:19 ` Chris Wilson
2017-12-18 13:13 ` ✗ Fi.CI.BAT: failure for series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset (rev4) Patchwork
6 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2017-12-18 11:11 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 17/12/2017 13:28, Chris Wilson wrote:
> Inside i915_gem_reset(), we start touching the HW and so require the
> low-level HW to be re-enabled, in particular the PCI BARs.
>
> Fixes: 7b6da818d86f ("drm/i915: Restore the kernel context after a GPU reset on an idle engine")
> Testcase: igt/drv_hangman # i915g/i915gm
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 6d39fdf2b604..72bea281edb7 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1924,9 +1924,6 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
> goto taint;
> }
>
> - i915_gem_reset(i915);
> - intel_overlay_reset(i915);
> -
> /* Ok, now get things going again... */
>
> /*
> @@ -1939,6 +1936,9 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
> goto error;
> }
>
> + i915_gem_reset(i915);
> + intel_overlay_reset(i915);
> +
> /*
> * Next we need to restore the context, but we don't use those
> * yet either...
>
Looks fine to me.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
* Re: [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump
2017-12-17 13:28 ` [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump Chris Wilson
@ 2017-12-18 11:14 ` Tvrtko Ursulin
2017-12-18 11:18 ` Chris Wilson
2017-12-18 11:14 ` Chris Wilson
` (3 subsequent siblings)
4 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2017-12-18 11:14 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 17/12/2017 13:28, Chris Wilson wrote:
> Useful bits of information for inspecting GPU stalls from
> intel_engine_dump() are the error registers, IPEIR and IPEHR.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
> drivers/gpu/drm/i915/intel_engine_cs.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 510e0bc3a377..05bd9e17452c 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1757,6 +1757,13 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> addr = intel_engine_get_last_batch_head(engine);
> drm_printf(m, "\tBBADDR: 0x%08x_%08x\n",
> upper_32_bits(addr), lower_32_bits(addr));
> + addr = I915_READ(RING_DMA_FADD(engine->mmio_base));
> + drm_printf(m, "\tDMA_FADDR: 0x%08x_%08x\n",
> + upper_32_bits(addr), lower_32_bits(addr));
Error capture handles this register a bit differently.
> + drm_printf(m, "\tIPEIR: 0x%08x\n",
> + I915_READ(RING_IPEIR(engine->mmio_base)));
> + drm_printf(m, "\tIPEHR: 0x%08x\n",
> + I915_READ(RING_IPEHR(engine->mmio_base)));
This one also has two code paths depending on the gen.
Regards,
Tvrtko
>
> if (HAS_EXECLISTS(dev_priv)) {
> const u32 *hws = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
>
* Re: [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump
2017-12-17 13:28 ` [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump Chris Wilson
2017-12-18 11:14 ` Tvrtko Ursulin
@ 2017-12-18 11:14 ` Chris Wilson
2017-12-18 11:26 ` [PATCH v2] " Chris Wilson
` (2 subsequent siblings)
4 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2017-12-18 11:14 UTC (permalink / raw)
To: intel-gfx
Quoting Chris Wilson (2017-12-17 13:28:51)
> Useful bits of information for inspecting GPU stalls from
> intel_engine_dump() are the error registers, IPEIR and IPEHR.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
> drivers/gpu/drm/i915/intel_engine_cs.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 510e0bc3a377..05bd9e17452c 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1757,6 +1757,13 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> addr = intel_engine_get_last_batch_head(engine);
> drm_printf(m, " BBADDR: 0x%08x_%08x\n",
> upper_32_bits(addr), lower_32_bits(addr));
> + addr = I915_READ(RING_DMA_FADD(engine->mmio_base));
if (INTEL_GEN(dev_priv) >= 8)
        addr |= (u64)I915_READ(RING_DMA_FADD_UDW(engine->mmio_base)) << 32;
-Chris
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump
2017-12-18 11:14 ` Tvrtko Ursulin
@ 2017-12-18 11:18 ` Chris Wilson
0 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2017-12-18 11:18 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
Quoting Tvrtko Ursulin (2017-12-18 11:14:19)
>
> On 17/12/2017 13:28, Chris Wilson wrote:
> > A useful bit of information for inspecting GPU stalls from
> > intel_engine_dump() are the error registers, IPEIR and IPEHR.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/intel_engine_cs.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> > index 510e0bc3a377..05bd9e17452c 100644
> > --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> > @@ -1757,6 +1757,13 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> > addr = intel_engine_get_last_batch_head(engine);
> > drm_printf(m, " BBADDR: 0x%08x_%08x\n",
> > upper_32_bits(addr), lower_32_bits(addr));
> > + addr = I915_READ(RING_DMA_FADD(engine->mmio_base));
> > + drm_printf(m, " DMA_FADDR: 0x%08x_%08x\n",
> > + upper_32_bits(addr), lower_32_bits(addr));
>
> Error capture handles this register a bit differently.
>
> > + drm_printf(m, " IPEIR: 0x%08x\n",
> > + I915_READ(RING_IPEIR(engine->mmio_base)));
> > + drm_printf(m, " IPEHR: 0x%08x\n",
> > + I915_READ(RING_IPEHR(engine->mmio_base)));
>
> This one as well has two code paths depending on the gen.
My bad for assuming it was the same location, just per-engine-ified.
-Chris
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/3] drm/i915: Re-enable GGTT earlier after GPU reset
2017-12-18 11:11 ` Tvrtko Ursulin
@ 2017-12-18 11:19 ` Chris Wilson
0 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2017-12-18 11:19 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
Quoting Tvrtko Ursulin (2017-12-18 11:11:49)
>
> On 17/12/2017 13:28, Chris Wilson wrote:
> > Inside i915_gem_reset(), we start touching the HW and so require the
> > low-level HW to be re-enabled, in particular the PCI BARs.
> >
> > Fixes: 7b6da818d86f ("drm/i915: Restore the kernel context after a GPU reset on an idle engine")
> > Testcase: igt/drv_hangman # i915g/i915gm
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Michel Thierry <michel.thierry@intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_drv.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 6d39fdf2b604..72bea281edb7 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1924,9 +1924,6 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
> > goto taint;
> > }
> >
> > - i915_gem_reset(i915);
> > - intel_overlay_reset(i915);
> > -
> > /* Ok, now get things going again... */
> >
> > /*
> > @@ -1939,6 +1936,9 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
> > goto error;
> > }
> >
> > + i915_gem_reset(i915);
> > + intel_overlay_reset(i915);
> > +
> > /*
> > * Next we need to restore the context, but we don't use those
> > * yet either...
> >
>
> Looks fine to me.
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Ta, pushed this one by itself so we can bring gdg back online.
-Chris
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v2] drm/i915: Show IPEIR and IPEHR in the engine dump
2017-12-17 13:28 ` [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump Chris Wilson
2017-12-18 11:14 ` Tvrtko Ursulin
2017-12-18 11:14 ` Chris Wilson
@ 2017-12-18 11:26 ` Chris Wilson
2017-12-18 12:08 ` Tvrtko Ursulin
2017-12-18 12:17 ` [PATCH v3] " Chris Wilson
2017-12-18 12:39 ` [PATCH v4] " Chris Wilson
4 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2017-12-18 11:26 UTC (permalink / raw)
To: intel-gfx
A useful bit of information for inspecting GPU stalls from
intel_engine_dump() are the error registers, IPEIR and IPEHR.
v2: Fixup gen changes in register offsets (Tvrtko)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/intel_engine_cs.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 510e0bc3a377..92b9e0dd6378 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1757,6 +1757,20 @@ void intel_engine_dump(struct intel_engine_cs *engine,
addr = intel_engine_get_last_batch_head(engine);
drm_printf(m, "\tBBADDR: 0x%08x_%08x\n",
upper_32_bits(addr), lower_32_bits(addr));
+ addr = I915_READ(RING_DMA_FADD(engine->mmio_base));
+ if (INTEL_GEN(dev_priv) >= 8)
+ addr |= (u64)I915_READ(RING_DMA_FADD_UDW(engine->mmio_base)) << 32;
+ drm_printf(m, "\tDMA_FADDR: 0x%08x_%08x\n",
+ upper_32_bits(addr), lower_32_bits(addr));
+ if (INTEL_GEN(dev_priv) >= 4) {
+ drm_printf(m, "\tIPEIR: 0x%08x\n",
+ I915_READ(RING_IPEIR(engine->mmio_base)));
+ drm_printf(m, "\tIPEHR: 0x%08x\n",
+ I915_READ(RING_IPEHR(engine->mmio_base)));
+ } else {
+ drm_printf(m, "\tIPEIR: 0x%08x\n", I915_READ(IPEIR));
+ drm_printf(m, "\tIPEHR: 0x%08x\n", I915_READ(IPEHR));
+ }
if (HAS_EXECLISTS(dev_priv)) {
const u32 *hws = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
--
2.15.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH v2] drm/i915: Show IPEIR and IPEHR in the engine dump
2017-12-18 11:26 ` [PATCH v2] " Chris Wilson
@ 2017-12-18 12:08 ` Tvrtko Ursulin
0 siblings, 0 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2017-12-18 12:08 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 18/12/2017 11:26, Chris Wilson wrote:
> A useful bit of information for inspecting GPU stalls from
> intel_engine_dump() are the error registers, IPEIR and IPEHR.
>
> v2: Fixup gen changes in register offsets (Tvrtko)
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> drivers/gpu/drm/i915/intel_engine_cs.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 510e0bc3a377..92b9e0dd6378 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1757,6 +1757,20 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> addr = intel_engine_get_last_batch_head(engine);
> drm_printf(m, "\tBBADDR: 0x%08x_%08x\n",
> upper_32_bits(addr), lower_32_bits(addr));
> + addr = I915_READ(RING_DMA_FADD(engine->mmio_base));
> + if (INTEL_GEN(dev_priv) >= 8)
> + addr |= (u64)I915_READ(RING_DMA_FADD_UDW(engine->mmio_base)) << 32;
> + drm_printf(m, "\tDMA_FADDR: 0x%08x_%08x\n",
> + upper_32_bits(addr), lower_32_bits(addr));
< gen4 case not interesting here? Error capture reads DMA_FADD_I8XX in
that case. Doesn't look like the same offset to me.
Regards,
Tvrtko
> + if (INTEL_GEN(dev_priv) >= 4) {
> + drm_printf(m, "\tIPEIR: 0x%08x\n",
> + I915_READ(RING_IPEIR(engine->mmio_base)));
> + drm_printf(m, "\tIPEHR: 0x%08x\n",
> + I915_READ(RING_IPEHR(engine->mmio_base)));
> + } else {
> + drm_printf(m, "\tIPEIR: 0x%08x\n", I915_READ(IPEIR));
> + drm_printf(m, "\tIPEHR: 0x%08x\n", I915_READ(IPEHR));
> + }
>
> if (HAS_EXECLISTS(dev_priv)) {
> const u32 *hws = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v3] drm/i915: Show IPEIR and IPEHR in the engine dump
2017-12-17 13:28 ` [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump Chris Wilson
` (2 preceding siblings ...)
2017-12-18 11:26 ` [PATCH v2] " Chris Wilson
@ 2017-12-18 12:17 ` Chris Wilson
2017-12-18 12:32 ` Tvrtko Ursulin
2017-12-18 12:39 ` [PATCH v4] " Chris Wilson
4 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2017-12-18 12:17 UTC (permalink / raw)
To: intel-gfx
A useful bit of information for inspecting GPU stalls from
intel_engine_dump() are the error registers, IPEIR and IPEHR.
v2: Fixup gen changes in register offsets (Tvrtko)
v3: Old FADDR location as well
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/intel_engine_cs.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 510e0bc3a377..257b03a67e1c 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1757,6 +1757,26 @@ void intel_engine_dump(struct intel_engine_cs *engine,
addr = intel_engine_get_last_batch_head(engine);
drm_printf(m, "\tBBADDR: 0x%08x_%08x\n",
upper_32_bits(addr), lower_32_bits(addr));
+ if (INTEL_GEN(dev_priv) >= 4) {
+ if (INTEL_GEN(dev_priv) >= 8) {
+ addr = I915_READ(RING_DMA_FADD_UDW(engine->mmio_base));
+ addr <<= 32;
+ }
+ addr = I915_READ(RING_DMA_FADD(engine->mmio_base));
+ } else {
+ addr = I915_READ(DMA_FADD_I8XX);
+ }
+ drm_printf(m, "\tDMA_FADDR: 0x%08x_%08x\n",
+ upper_32_bits(addr), lower_32_bits(addr));
+ if (INTEL_GEN(dev_priv) >= 4) {
+ drm_printf(m, "\tIPEIR: 0x%08x\n",
+ I915_READ(RING_IPEIR(engine->mmio_base)));
+ drm_printf(m, "\tIPEHR: 0x%08x\n",
+ I915_READ(RING_IPEHR(engine->mmio_base)));
+ } else {
+ drm_printf(m, "\tIPEIR: 0x%08x\n", I915_READ(IPEIR));
+ drm_printf(m, "\tIPEHR: 0x%08x\n", I915_READ(IPEHR));
+ }
if (HAS_EXECLISTS(dev_priv)) {
const u32 *hws = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
--
2.15.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH v3] drm/i915: Show IPEIR and IPEHR in the engine dump
2017-12-18 12:17 ` [PATCH v3] " Chris Wilson
@ 2017-12-18 12:32 ` Tvrtko Ursulin
2017-12-18 12:35 ` Chris Wilson
0 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2017-12-18 12:32 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 18/12/2017 12:17, Chris Wilson wrote:
> A useful bit of information for inspecting GPU stalls from
> intel_engine_dump() are the error registers, IPEIR and IPEHR.
>
> v2: Fixup gen changes in register offsets (Tvrtko)
> v3: Old FADDR location as well
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> drivers/gpu/drm/i915/intel_engine_cs.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 510e0bc3a377..257b03a67e1c 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1757,6 +1757,26 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> addr = intel_engine_get_last_batch_head(engine);
> drm_printf(m, "\tBBADDR: 0x%08x_%08x\n",
> upper_32_bits(addr), lower_32_bits(addr));
> + if (INTEL_GEN(dev_priv) >= 4) {
> + if (INTEL_GEN(dev_priv) >= 8) {
> + addr = I915_READ(RING_DMA_FADD_UDW(engine->mmio_base));
> + addr <<= 32;
> + }
> + addr = I915_READ(RING_DMA_FADD(engine->mmio_base));
|=, or better reverse order to avoid having to init addr.
Regards,
Tvrtko
> + } else {
> + addr = I915_READ(DMA_FADD_I8XX);
> + }
> + drm_printf(m, "\tDMA_FADDR: 0x%08x_%08x\n",
> + upper_32_bits(addr), lower_32_bits(addr));
> + if (INTEL_GEN(dev_priv) >= 4) {
> + drm_printf(m, "\tIPEIR: 0x%08x\n",
> + I915_READ(RING_IPEIR(engine->mmio_base)));
> + drm_printf(m, "\tIPEHR: 0x%08x\n",
> + I915_READ(RING_IPEHR(engine->mmio_base)));
> + } else {
> + drm_printf(m, "\tIPEIR: 0x%08x\n", I915_READ(IPEIR));
> + drm_printf(m, "\tIPEHR: 0x%08x\n", I915_READ(IPEHR));
> + }
>
> if (HAS_EXECLISTS(dev_priv)) {
> const u32 *hws = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v3] drm/i915: Show IPEIR and IPEHR in the engine dump
2017-12-18 12:32 ` Tvrtko Ursulin
@ 2017-12-18 12:35 ` Chris Wilson
0 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2017-12-18 12:35 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
Quoting Tvrtko Ursulin (2017-12-18 12:32:37)
>
> On 18/12/2017 12:17, Chris Wilson wrote:
> > A useful bit of information for inspecting GPU stalls from
> > intel_engine_dump() are the error registers, IPEIR and IPEHR.
> >
> > v2: Fixup gen changes in register offsets (Tvrtko)
> > v3: Old FADDR location as well
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> > drivers/gpu/drm/i915/intel_engine_cs.c | 20 ++++++++++++++++++++
> > 1 file changed, 20 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> > index 510e0bc3a377..257b03a67e1c 100644
> > --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> > @@ -1757,6 +1757,26 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> > addr = intel_engine_get_last_batch_head(engine);
> > drm_printf(m, " BBADDR: 0x%08x_%08x\n",
> > upper_32_bits(addr), lower_32_bits(addr));
> > + if (INTEL_GEN(dev_priv) >= 4) {
> > + if (INTEL_GEN(dev_priv) >= 8) {
> > + addr = I915_READ(RING_DMA_FADD_UDW(engine->mmio_base));
> > + addr <<= 32;
> > + }
> > + addr = I915_READ(RING_DMA_FADD(engine->mmio_base));
>
> |=, or better reverse order to avoid having to init addr.
|= otherwise it's back to the ugly (u64) << 32;
Pick your poison. Or maybe if I started paying attention we wouldn't
need to be going round in so many circles.
-Chris
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v4] drm/i915: Show IPEIR and IPEHR in the engine dump
2017-12-17 13:28 ` [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump Chris Wilson
` (3 preceding siblings ...)
2017-12-18 12:17 ` [PATCH v3] " Chris Wilson
@ 2017-12-18 12:39 ` Chris Wilson
2017-12-18 12:58 ` Tvrtko Ursulin
4 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2017-12-18 12:39 UTC (permalink / raw)
To: intel-gfx
A useful bit of information for inspecting GPU stalls from
intel_engine_dump() are the error registers, IPEIR and IPEHR.
v2: Fixup gen changes in register offsets (Tvrtko)
v3: Old FADDR location as well
v4: Use I915_READ64_2x32
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/intel_engine_cs.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 510e0bc3a377..b4807497e92d 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1757,6 +1757,24 @@ void intel_engine_dump(struct intel_engine_cs *engine,
addr = intel_engine_get_last_batch_head(engine);
drm_printf(m, "\tBBADDR: 0x%08x_%08x\n",
upper_32_bits(addr), lower_32_bits(addr));
+ if (INTEL_GEN(dev_priv) >= 8)
+ addr = I915_READ64_2x32(RING_DMA_FADD(engine->mmio_base),
+ RING_DMA_FADD_UDW(engine->mmio_base));
+ else if (INTEL_GEN(dev_priv) >= 4)
+ addr = I915_READ(RING_DMA_FADD(engine->mmio_base));
+ else
+ addr = I915_READ(DMA_FADD_I8XX);
+ drm_printf(m, "\tDMA_FADDR: 0x%08x_%08x\n",
+ upper_32_bits(addr), lower_32_bits(addr));
+ if (INTEL_GEN(dev_priv) >= 4) {
+ drm_printf(m, "\tIPEIR: 0x%08x\n",
+ I915_READ(RING_IPEIR(engine->mmio_base)));
+ drm_printf(m, "\tIPEHR: 0x%08x\n",
+ I915_READ(RING_IPEHR(engine->mmio_base)));
+ } else {
+ drm_printf(m, "\tIPEIR: 0x%08x\n", I915_READ(IPEIR));
+ drm_printf(m, "\tIPEHR: 0x%08x\n", I915_READ(IPEHR));
+ }
if (HAS_EXECLISTS(dev_priv)) {
const u32 *hws = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
--
2.15.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH v4] drm/i915: Show IPEIR and IPEHR in the engine dump
2017-12-18 12:39 ` [PATCH v4] " Chris Wilson
@ 2017-12-18 12:58 ` Tvrtko Ursulin
2017-12-18 13:27 ` Chris Wilson
0 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2017-12-18 12:58 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 18/12/2017 12:39, Chris Wilson wrote:
> A useful bit of information for inspecting GPU stalls from
> intel_engine_dump() are the error registers, IPEIR and IPEHR.
>
> v2: Fixup gen changes in register offsets (Tvrtko)
> v3: Old FADDR location as well
> v4: Use I915_READ64_2x32
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> drivers/gpu/drm/i915/intel_engine_cs.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 510e0bc3a377..b4807497e92d 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1757,6 +1757,24 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> addr = intel_engine_get_last_batch_head(engine);
> drm_printf(m, "\tBBADDR: 0x%08x_%08x\n",
> upper_32_bits(addr), lower_32_bits(addr));
> + if (INTEL_GEN(dev_priv) >= 8)
> + addr = I915_READ64_2x32(RING_DMA_FADD(engine->mmio_base),
> + RING_DMA_FADD_UDW(engine->mmio_base));
> + else if (INTEL_GEN(dev_priv) >= 4)
> + addr = I915_READ(RING_DMA_FADD(engine->mmio_base));
> + else
> + addr = I915_READ(DMA_FADD_I8XX);
> + drm_printf(m, "\tDMA_FADDR: 0x%08x_%08x\n",
> + upper_32_bits(addr), lower_32_bits(addr));
> + if (INTEL_GEN(dev_priv) >= 4) {
> + drm_printf(m, "\tIPEIR: 0x%08x\n",
> + I915_READ(RING_IPEIR(engine->mmio_base)));
> + drm_printf(m, "\tIPEHR: 0x%08x\n",
> + I915_READ(RING_IPEHR(engine->mmio_base)));
> + } else {
> + drm_printf(m, "\tIPEIR: 0x%08x\n", I915_READ(IPEIR));
> + drm_printf(m, "\tIPEHR: 0x%08x\n", I915_READ(IPEHR));
> + }
>
> if (HAS_EXECLISTS(dev_priv)) {
> const u32 *hws = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
^ permalink raw reply [flat|nested] 22+ messages in thread
* ✗ Fi.CI.BAT: failure for series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset (rev4)
2017-12-17 13:28 [PATCH 1/3] drm/i915: Re-enable GGTT earlier after GPU reset Chris Wilson
` (5 preceding siblings ...)
2017-12-18 11:11 ` Tvrtko Ursulin
@ 2017-12-18 13:13 ` Patchwork
6 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2017-12-18 13:13 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset (rev4)
URL : https://patchwork.freedesktop.org/series/35471/
State : failure
== Summary ==
Series 35471v4 series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset
https://patchwork.freedesktop.org/api/1.0/series/35471/revisions/4/mbox/
Test kms_pipe_crc_basic:
Subgroup read-crc-pipe-b:
pass -> FAIL (fi-skl-6700k2)
Subgroup suspend-read-crc-pipe-a:
dmesg-warn -> PASS (fi-kbl-r) fdo#104172 +1
Subgroup suspend-read-crc-pipe-b:
pass -> INCOMPLETE (fi-snb-2520m) fdo#103713
Test kms_psr_sink_crc:
Subgroup psr_basic:
dmesg-warn -> PASS (fi-skl-6700hq) fdo#101144
fdo#104172 https://bugs.freedesktop.org/show_bug.cgi?id=104172
fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713
fdo#101144 https://bugs.freedesktop.org/show_bug.cgi?id=101144
fi-bdw-5557u total:288 pass:267 dwarn:0 dfail:0 fail:0 skip:21 time:433s
fi-bdw-gvtdvm total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:439s
fi-blb-e6850 total:288 pass:223 dwarn:1 dfail:0 fail:0 skip:64 time:381s
fi-bsw-n3050 total:288 pass:242 dwarn:0 dfail:0 fail:0 skip:46 time:489s
fi-bwr-2160 total:288 pass:183 dwarn:0 dfail:0 fail:0 skip:105 time:276s
fi-bxt-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:496s
fi-bxt-j4205 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:498s
fi-byt-j1900 total:288 pass:253 dwarn:0 dfail:0 fail:0 skip:35 time:475s
fi-byt-n2820 total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:460s
fi-elk-e7500 total:224 pass:163 dwarn:15 dfail:0 fail:0 skip:45
fi-gdg-551 total:288 pass:179 dwarn:1 dfail:0 fail:0 skip:108 time:262s
fi-glk-1 total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:529s
fi-hsw-4770 total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:405s
fi-hsw-4770r total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:411s
fi-ilk-650 total:288 pass:228 dwarn:0 dfail:0 fail:0 skip:60 time:390s
fi-ivb-3520m total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:469s
fi-ivb-3770 total:288 pass:255 dwarn:0 dfail:0 fail:0 skip:33 time:427s
fi-kbl-7500u total:288 pass:263 dwarn:1 dfail:0 fail:0 skip:24 time:480s
fi-kbl-7560u total:288 pass:268 dwarn:1 dfail:0 fail:0 skip:19 time:520s
fi-kbl-7567u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:471s
fi-kbl-r total:288 pass:260 dwarn:1 dfail:0 fail:0 skip:27 time:519s
fi-pnv-d510 total:288 pass:222 dwarn:1 dfail:0 fail:0 skip:65 time:595s
fi-skl-6260u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:443s
fi-skl-6600u total:288 pass:260 dwarn:1 dfail:0 fail:0 skip:27 time:525s
fi-skl-6700hq total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:558s
fi-skl-6700k2 total:288 pass:263 dwarn:0 dfail:0 fail:1 skip:24 time:505s
fi-skl-6770hq total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:495s
fi-skl-gvtdvm total:288 pass:265 dwarn:0 dfail:0 fail:0 skip:23 time:441s
fi-snb-2520m total:245 pass:211 dwarn:0 dfail:0 fail:0 skip:33
fi-snb-2600 total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:415s
Blacklisted hosts:
fi-cfl-s2 total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:591s
fi-glk-dsi total:288 pass:256 dwarn:0 dfail:0 fail:2 skip:30 time:496s
bf5cdf9e055a88559a6fc707b6e89e88077a2124 drm-tip: 2017y-12m-18d-11h-53m-39s UTC integration manifest
d60eefb6e585 drm/i915/selftests: Fix up igt_reset_engine
46fc9d5cbe36 drm/i915: Show IPEIR and IPEHR in the engine dump
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7525/issues.html
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4] drm/i915: Show IPEIR and IPEHR in the engine dump
2017-12-18 12:58 ` Tvrtko Ursulin
@ 2017-12-18 13:27 ` Chris Wilson
0 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2017-12-18 13:27 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
Quoting Tvrtko Ursulin (2017-12-18 12:58:51)
>
> On 18/12/2017 12:39, Chris Wilson wrote:
> > A useful bit of information for inspecting GPU stalls from
> > intel_engine_dump() are the error registers, IPEIR and IPEHR.
> >
> > v2: Fixup gen changes in register offsets (Tvrtko)
> > v3: Old FADDR location as well
> > v4: Use I915_READ64_2x32
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> > drivers/gpu/drm/i915/intel_engine_cs.c | 18 ++++++++++++++++++
> > 1 file changed, 18 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> > index 510e0bc3a377..b4807497e92d 100644
> > --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> > @@ -1757,6 +1757,24 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> > addr = intel_engine_get_last_batch_head(engine);
> > drm_printf(m, " BBADDR: 0x%08x_%08x\n",
> > upper_32_bits(addr), lower_32_bits(addr));
> > + if (INTEL_GEN(dev_priv) >= 8)
> > + addr = I915_READ64_2x32(RING_DMA_FADD(engine->mmio_base),
> > + RING_DMA_FADD_UDW(engine->mmio_base));
> > + else if (INTEL_GEN(dev_priv) >= 4)
> > + addr = I915_READ(RING_DMA_FADD(engine->mmio_base));
> > + else
> > + addr = I915_READ(DMA_FADD_I8XX);
> > + drm_printf(m, " DMA_FADDR: 0x%08x_%08x\n",
> > + upper_32_bits(addr), lower_32_bits(addr));
> > + if (INTEL_GEN(dev_priv) >= 4) {
> > + drm_printf(m, " IPEIR: 0x%08x\n",
> > + I915_READ(RING_IPEIR(engine->mmio_base)));
> > + drm_printf(m, " IPEHR: 0x%08x\n",
> > + I915_READ(RING_IPEHR(engine->mmio_base)));
> > + } else {
> > + drm_printf(m, " IPEIR: 0x%08x\n", I915_READ(IPEIR));
> > + drm_printf(m, " IPEHR: 0x%08x\n", I915_READ(IPEHR));
> > + }
> >
> > if (HAS_EXECLISTS(dev_priv)) {
> > const u32 *hws = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
> >
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Thanks for the review, lots of fixes in such a small patch.
Pushed, so just the selftest for per-engine resets remaining, which I
hope Michel will pick up later.
-Chris
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/3] drm/i915/selftests: Fix up igt_reset_engine
2017-12-17 13:28 ` [PATCH 3/3] drm/i915/selftests: Fix up igt_reset_engine Chris Wilson
@ 2017-12-18 21:50 ` Michel Thierry
2017-12-18 21:54 ` Chris Wilson
0 siblings, 1 reply; 22+ messages in thread
From: Michel Thierry @ 2017-12-18 21:50 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 17/12/17 05:28, Chris Wilson wrote:
> Now that we skip a per-engine reset on an idle engine, we need to update
> the selftest to take that into account. In the process, we find that we
> were not stressing the per-engine reset very hard, so add those missing
> active resets.
>
> v2: Actually test i915_reset_engine() by loading it with requests.
>
> Fixes: f6ba181ada55 ("drm/i915: Skip an engine reset if it recovered before our preparations")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
And all these subtests passed with and without GuC in SKL.
> ---
> drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 314 ++++++++++++++++++-----
> 1 file changed, 250 insertions(+), 64 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> index f98546b8a7fa..c8a756e2139f 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> @@ -132,6 +132,12 @@ static int emit_recurse_batch(struct hang *h,
> *batch++ = lower_32_bits(hws_address(hws, rq));
> *batch++ = upper_32_bits(hws_address(hws, rq));
> *batch++ = rq->fence.seqno;
> + *batch++ = MI_ARB_CHECK;
> +
> + memset(batch, 0, 1024);
> + batch += 1024 / sizeof(*batch);
> +
> + *batch++ = MI_ARB_CHECK;
> *batch++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
> *batch++ = lower_32_bits(vma->node.start);
> *batch++ = upper_32_bits(vma->node.start);
> @@ -140,6 +146,12 @@ static int emit_recurse_batch(struct hang *h,
> *batch++ = 0;
> *batch++ = lower_32_bits(hws_address(hws, rq));
> *batch++ = rq->fence.seqno;
> + *batch++ = MI_ARB_CHECK;
> +
> + memset(batch, 0, 1024);
> + batch += 1024 / sizeof(*batch);
> +
> + *batch++ = MI_ARB_CHECK;
> *batch++ = MI_BATCH_BUFFER_START | 1 << 8;
> *batch++ = lower_32_bits(vma->node.start);
> } else if (INTEL_GEN(i915) >= 4) {
> @@ -147,12 +159,24 @@ static int emit_recurse_batch(struct hang *h,
> *batch++ = 0;
> *batch++ = lower_32_bits(hws_address(hws, rq));
> *batch++ = rq->fence.seqno;
> + *batch++ = MI_ARB_CHECK;
> +
> + memset(batch, 0, 1024);
> + batch += 1024 / sizeof(*batch);
> +
> + *batch++ = MI_ARB_CHECK;
> *batch++ = MI_BATCH_BUFFER_START | 2 << 6;
> *batch++ = lower_32_bits(vma->node.start);
> } else {
> *batch++ = MI_STORE_DWORD_IMM;
> *batch++ = lower_32_bits(hws_address(hws, rq));
> *batch++ = rq->fence.seqno;
> + *batch++ = MI_ARB_CHECK;
> +
> + memset(batch, 0, 1024);
> + batch += 1024 / sizeof(*batch);
> +
> + *batch++ = MI_ARB_CHECK;
> *batch++ = MI_BATCH_BUFFER_START | 2 << 6 | 1;
> *batch++ = lower_32_bits(vma->node.start);
> }
> @@ -234,6 +258,16 @@ static void hang_fini(struct hang *h)
> i915_gem_wait_for_idle(h->i915, I915_WAIT_LOCKED);
> }
>
> +static bool wait_for_hang(struct hang *h, struct drm_i915_gem_request *rq)
> +{
> + return !(wait_for_us(i915_seqno_passed(hws_seqno(h, rq),
> + rq->fence.seqno),
> + 10) &&
> + wait_for(i915_seqno_passed(hws_seqno(h, rq),
> + rq->fence.seqno),
> + 1000));
> +}
> +
> static int igt_hang_sanitycheck(void *arg)
> {
> struct drm_i915_private *i915 = arg;
> @@ -296,6 +330,9 @@ static void global_reset_lock(struct drm_i915_private *i915)
> struct intel_engine_cs *engine;
> enum intel_engine_id id;
>
> + pr_debug("%s: current gpu_error=%08lx\n",
> + __func__, i915->gpu_error.flags);
> +
> while (test_and_set_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags))
> wait_event(i915->gpu_error.reset_queue,
> !test_bit(I915_RESET_BACKOFF,
> @@ -353,54 +390,127 @@ static int igt_global_reset(void *arg)
> return err;
> }
>
> -static int igt_reset_engine(void *arg)
> +static int __igt_reset_engine(struct drm_i915_private *i915, bool active)
> {
> - struct drm_i915_private *i915 = arg;
> struct intel_engine_cs *engine;
> enum intel_engine_id id;
> - unsigned int reset_count, reset_engine_count;
> + struct hang h;
> int err = 0;
>
> - /* Check that we can issue a global GPU and engine reset */
> + /* Check that we can issue an engine reset on an idle engine (no-op) */
>
> if (!intel_has_reset_engine(i915))
> return 0;
>
> + if (active) {
> + mutex_lock(&i915->drm.struct_mutex);
> + err = hang_init(&h, i915);
> + mutex_unlock(&i915->drm.struct_mutex);
> + if (err)
> + return err;
> + }
> +
> for_each_engine(engine, i915, id) {
> - set_bit(I915_RESET_ENGINE + engine->id, &i915->gpu_error.flags);
> + unsigned int reset_count, reset_engine_count;
> + IGT_TIMEOUT(end_time);
> +
> + if (active && !intel_engine_can_store_dword(engine))
> + continue;
> +
> reset_count = i915_reset_count(&i915->gpu_error);
> reset_engine_count = i915_reset_engine_count(&i915->gpu_error,
> engine);
>
> - err = i915_reset_engine(engine, I915_RESET_QUIET);
> - if (err) {
> - pr_err("i915_reset_engine failed\n");
> - break;
> - }
> + set_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
> + do {
> + if (active) {
> + struct drm_i915_gem_request *rq;
> +
> + mutex_lock(&i915->drm.struct_mutex);
> + rq = hang_create_request(&h, engine,
> + i915->kernel_context);
> + if (IS_ERR(rq)) {
> + err = PTR_ERR(rq);
> + break;
> + }
> +
> + i915_gem_request_get(rq);
> + __i915_add_request(rq, true);
> + mutex_unlock(&i915->drm.struct_mutex);
> +
> + if (!wait_for_hang(&h, rq)) {
> + struct drm_printer p = drm_info_printer(i915->drm.dev);
> +
> + pr_err("%s: Failed to start request %x, at %x\n",
> + __func__, rq->fence.seqno, hws_seqno(&h, rq));
> + intel_engine_dump(engine, &p,
> + "%s\n", engine->name);
> +
> + i915_gem_request_put(rq);
> + err = -EIO;
> + break;
> + }
>
> - if (i915_reset_count(&i915->gpu_error) != reset_count) {
> - pr_err("Full GPU reset recorded! (engine reset expected)\n");
> - err = -EINVAL;
> - break;
> - }
> + i915_gem_request_put(rq);
> + }
> +
> + engine->hangcheck.stalled = true;
> + engine->hangcheck.seqno =
> + intel_engine_get_seqno(engine);
> +
> + err = i915_reset_engine(engine, I915_RESET_QUIET);
> + if (err) {
> + pr_err("i915_reset_engine failed\n");
> + break;
> + }
> +
> + if (i915_reset_count(&i915->gpu_error) != reset_count) {
> + pr_err("Full GPU reset recorded! (engine reset expected)\n");
> + err = -EINVAL;
> + break;
> + }
> +
> + reset_engine_count += active;
> + if (i915_reset_engine_count(&i915->gpu_error, engine) !=
> + reset_engine_count) {
> + pr_err("%s engine reset %srecorded!\n",
> + engine->name, active ? "not " : "");
> + err = -EINVAL;
> + break;
> + }
> +
> + engine->hangcheck.stalled = false;
> + } while (time_before(jiffies, end_time));
> + clear_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
>
> - if (i915_reset_engine_count(&i915->gpu_error, engine) ==
> - reset_engine_count) {
> - pr_err("No %s engine reset recorded!\n", engine->name);
> - err = -EINVAL;
> + if (err)
> break;
> - }
>
> - clear_bit(I915_RESET_ENGINE + engine->id,
> - &i915->gpu_error.flags);
> + cond_resched();
> }
>
> if (i915_terminally_wedged(&i915->gpu_error))
> err = -EIO;
>
> + if (active) {
> + mutex_lock(&i915->drm.struct_mutex);
> + hang_fini(&h);
> + mutex_unlock(&i915->drm.struct_mutex);
> + }
> +
> return err;
> }
>
> +static int igt_reset_idle_engine(void *arg)
> +{
> + return __igt_reset_engine(arg, false);
> +}
> +
> +static int igt_reset_active_engine(void *arg)
> +{
> + return __igt_reset_engine(arg, true);
> +}
> +
> static int active_engine(void *data)
> {
> struct intel_engine_cs *engine = data;
> @@ -462,11 +572,12 @@ static int active_engine(void *data)
> return err;
> }
>
> -static int igt_reset_active_engines(void *arg)
> +static int __igt_reset_engine_others(struct drm_i915_private *i915,
> + bool active)
> {
> - struct drm_i915_private *i915 = arg;
> - struct intel_engine_cs *engine, *active;
> + struct intel_engine_cs *engine, *other;
> enum intel_engine_id id, tmp;
> + struct hang h;
> int err = 0;
>
> /* Check that issuing a reset on one engine does not interfere
> @@ -476,24 +587,36 @@ static int igt_reset_active_engines(void *arg)
> if (!intel_has_reset_engine(i915))
> return 0;
>
> + if (active) {
> + mutex_lock(&i915->drm.struct_mutex);
> + err = hang_init(&h, i915);
> + mutex_unlock(&i915->drm.struct_mutex);
> + if (err)
> + return err;
> + }
> +
> for_each_engine(engine, i915, id) {
> - struct task_struct *threads[I915_NUM_ENGINES];
> + struct task_struct *threads[I915_NUM_ENGINES] = {};
> unsigned long resets[I915_NUM_ENGINES];
> unsigned long global = i915_reset_count(&i915->gpu_error);
> + unsigned long count = 0;
> IGT_TIMEOUT(end_time);
>
> + if (active && !intel_engine_can_store_dword(engine))
> + continue;
> +
> memset(threads, 0, sizeof(threads));
> - for_each_engine(active, i915, tmp) {
> + for_each_engine(other, i915, tmp) {
> struct task_struct *tsk;
>
> - if (active == engine)
> - continue;
> -
> resets[tmp] = i915_reset_engine_count(&i915->gpu_error,
> - active);
> + other);
>
> - tsk = kthread_run(active_engine, active,
> - "igt/%s", active->name);
> + if (other == engine)
> + continue;
> +
> + tsk = kthread_run(active_engine, other,
> + "igt/%s", other->name);
> if (IS_ERR(tsk)) {
> err = PTR_ERR(tsk);
> goto unwind;
> @@ -503,20 +626,70 @@ static int igt_reset_active_engines(void *arg)
> get_task_struct(tsk);
> }
>
> - set_bit(I915_RESET_ENGINE + engine->id, &i915->gpu_error.flags);
> + set_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
> do {
> + if (active) {
> + struct drm_i915_gem_request *rq;
> +
> + mutex_lock(&i915->drm.struct_mutex);
> + rq = hang_create_request(&h, engine,
> + i915->kernel_context);
> + if (IS_ERR(rq)) {
> + err = PTR_ERR(rq);
> + mutex_unlock(&i915->drm.struct_mutex);
> + break;
> + }
> +
> + i915_gem_request_get(rq);
> + __i915_add_request(rq, true);
> + mutex_unlock(&i915->drm.struct_mutex);
> +
> + if (!wait_for_hang(&h, rq)) {
> + struct drm_printer p = drm_info_printer(i915->drm.dev);
> +
> + pr_err("%s: Failed to start request %x, at %x\n",
> + __func__, rq->fence.seqno, hws_seqno(&h, rq));
> + intel_engine_dump(engine, &p,
> + "%s\n", engine->name);
> +
> + i915_gem_request_put(rq);
> + err = -EIO;
> + break;
> + }
> +
> + i915_gem_request_put(rq);
> + }
> +
> + engine->hangcheck.stalled = true;
> + engine->hangcheck.seqno =
> + intel_engine_get_seqno(engine);
> +
> err = i915_reset_engine(engine, I915_RESET_QUIET);
> if (err) {
> - pr_err("i915_reset_engine(%s) failed, err=%d\n",
> - engine->name, err);
> + pr_err("i915_reset_engine(%s:%s) failed, err=%d\n",
> + engine->name, active ? "active" : "idle", err);
> break;
> }
> +
> + engine->hangcheck.stalled = false;
> + count++;
> } while (time_before(jiffies, end_time));
> - clear_bit(I915_RESET_ENGINE + engine->id,
> - &i915->gpu_error.flags);
> + clear_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
> + pr_info("i915_reset_engine(%s:%s): %lu resets\n",
> + engine->name, active ? "active" : "idle", count);
> +
> + if (i915_reset_engine_count(&i915->gpu_error, engine) -
> + resets[engine->id] != (active ? count : 0)) {
> + pr_err("i915_reset_engine(%s:%s): reset %lu times, but reported %lu\n",
> + engine->name, active ? "active" : "idle", count,
> + i915_reset_engine_count(&i915->gpu_error,
> + engine) - resets[engine->id]);
> + if (!err)
> + err = -EINVAL;
> + }
>
> unwind:
> - for_each_engine(active, i915, tmp) {
> + for_each_engine(other, i915, tmp) {
> int ret;
>
> if (!threads[tmp])
> @@ -524,27 +697,29 @@ static int igt_reset_active_engines(void *arg)
>
> ret = kthread_stop(threads[tmp]);
> if (ret) {
> - pr_err("kthread for active engine %s failed, err=%d\n",
> - active->name, ret);
> + pr_err("kthread for other engine %s failed, err=%d\n",
> + other->name, ret);
> if (!err)
> err = ret;
> }
> put_task_struct(threads[tmp]);
>
> if (resets[tmp] != i915_reset_engine_count(&i915->gpu_error,
> - active)) {
> + other)) {
> pr_err("Innocent engine %s was reset (count=%ld)\n",
> - active->name,
> + other->name,
> i915_reset_engine_count(&i915->gpu_error,
> - active) - resets[tmp]);
> - err = -EIO;
> + other) - resets[tmp]);
> + if (!err)
> + err = -EINVAL;
> }
> }
>
> if (global != i915_reset_count(&i915->gpu_error)) {
> pr_err("Global reset (count=%ld)!\n",
> i915_reset_count(&i915->gpu_error) - global);
> - err = -EIO;
> + if (!err)
> + err = -EINVAL;
> }
>
> if (err)
> @@ -556,9 +731,25 @@ static int igt_reset_active_engines(void *arg)
> if (i915_terminally_wedged(&i915->gpu_error))
> err = -EIO;
>
> + if (active) {
> + mutex_lock(&i915->drm.struct_mutex);
> + hang_fini(&h);
> + mutex_unlock(&i915->drm.struct_mutex);
> + }
> +
> return err;
> }
>
> +static int igt_reset_idle_engine_others(void *arg)
> +{
> + return __igt_reset_engine_others(arg, false);
> +}
> +
> +static int igt_reset_active_engine_others(void *arg)
> +{
> + return __igt_reset_engine_others(arg, true);
> +}
> +
> static u32 fake_hangcheck(struct drm_i915_gem_request *rq)
> {
> u32 reset_count;
> @@ -574,16 +765,6 @@ static u32 fake_hangcheck(struct drm_i915_gem_request *rq)
> return reset_count;
> }
>
> -static bool wait_for_hang(struct hang *h, struct drm_i915_gem_request *rq)
> -{
> - return !(wait_for_us(i915_seqno_passed(hws_seqno(h, rq),
> - rq->fence.seqno),
> - 10) &&
> - wait_for(i915_seqno_passed(hws_seqno(h, rq),
> - rq->fence.seqno),
> - 1000));
> -}
> -
> static int igt_wait_reset(void *arg)
> {
> struct drm_i915_private *i915 = arg;
> @@ -617,8 +798,8 @@ static int igt_wait_reset(void *arg)
> if (!wait_for_hang(&h, rq)) {
> struct drm_printer p = drm_info_printer(i915->drm.dev);
>
> - pr_err("Failed to start request %x, at %x\n",
> - rq->fence.seqno, hws_seqno(&h, rq));
> + pr_err("%s: Failed to start request %x, at %x\n",
> + __func__, rq->fence.seqno, hws_seqno(&h, rq));
> intel_engine_dump(rq->engine, &p, "%s\n", rq->engine->name);
>
> i915_reset(i915, 0);
> @@ -712,8 +893,8 @@ static int igt_reset_queue(void *arg)
> if (!wait_for_hang(&h, prev)) {
> struct drm_printer p = drm_info_printer(i915->drm.dev);
>
> - pr_err("Failed to start request %x, at %x\n",
> - prev->fence.seqno, hws_seqno(&h, prev));
> + pr_err("%s: Failed to start request %x, at %x\n",
> + __func__, prev->fence.seqno, hws_seqno(&h, prev));
> intel_engine_dump(prev->engine, &p,
> "%s\n", prev->engine->name);
>
> @@ -819,8 +1000,8 @@ static int igt_handle_error(void *arg)
> if (!wait_for_hang(&h, rq)) {
> struct drm_printer p = drm_info_printer(i915->drm.dev);
>
> - pr_err("Failed to start request %x, at %x\n",
> - rq->fence.seqno, hws_seqno(&h, rq));
> + pr_err("%s: Failed to start request %x, at %x\n",
> + __func__, rq->fence.seqno, hws_seqno(&h, rq));
> intel_engine_dump(rq->engine, &p, "%s\n", rq->engine->name);
>
> i915_reset(i915, 0);
> @@ -864,21 +1045,26 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915)
> static const struct i915_subtest tests[] = {
> SUBTEST(igt_global_reset), /* attempt to recover GPU first */
> SUBTEST(igt_hang_sanitycheck),
> - SUBTEST(igt_reset_engine),
> - SUBTEST(igt_reset_active_engines),
> + SUBTEST(igt_reset_idle_engine),
> + SUBTEST(igt_reset_active_engine),
> + SUBTEST(igt_reset_idle_engine_others),
> + SUBTEST(igt_reset_active_engine_others),
> SUBTEST(igt_wait_reset),
> SUBTEST(igt_reset_queue),
> SUBTEST(igt_handle_error),
> };
> + bool saved_hangcheck;
> int err;
>
> if (!intel_has_gpu_reset(i915))
> return 0;
>
> intel_runtime_pm_get(i915);
> + saved_hangcheck = fetch_and_zero(&i915_modparams.enable_hangcheck);
>
> err = i915_subtests(tests, i915);
>
> + i915_modparams.enable_hangcheck = saved_hangcheck;
> intel_runtime_pm_put(i915);
>
> return err;
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [PATCH 3/3] drm/i915/selftests: Fix up igt_reset_engine
2017-12-18 21:50 ` Michel Thierry
@ 2017-12-18 21:54 ` Chris Wilson
0 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2017-12-18 21:54 UTC (permalink / raw)
To: Michel Thierry, intel-gfx
Quoting Michel Thierry (2017-12-18 21:50:17)
> On 17/12/17 05:28, Chris Wilson wrote:
> > Now that we skip a per-engine reset on an idle engine, we need to update
> > the selftest to take that into account. In the process, we find that we
> > were not stressing the per-engine reset very hard, so add those missing
> > active resets.
> >
> > v2: Actually test i915_reset_engine() by loading it with requests.
> >
> > Fixes: f6ba181ada55 ("drm/i915: Skip an engine reset if it recovered before our preparations")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Michel Thierry <michel.thierry@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>
>
> Reviewed-by: Michel Thierry <michel.thierry@intel.com>
I could have put more effort into making it one function with a couple
of parameters (idle/active engine-reset; idle/active other engines), but
honestly I was just happy to put something together that worked!
> And all these subtests passed with and without GuC in SKL.
Happy and sad, I was hoping to break something! :)
Thanks, pushed for a quieter CI.
-Chris
Thread overview: 22+ messages
2017-12-17 13:28 [PATCH 1/3] drm/i915: Re-enable GGTT earlier after GPU reset Chris Wilson
2017-12-17 13:28 ` [PATCH 2/3] drm/i915: Show IPEIR and IPEHR in the engine dump Chris Wilson
2017-12-18 11:14 ` Tvrtko Ursulin
2017-12-18 11:18 ` Chris Wilson
2017-12-18 11:14 ` Chris Wilson
2017-12-18 11:26 ` [PATCH v2] " Chris Wilson
2017-12-18 12:08 ` Tvrtko Ursulin
2017-12-18 12:17 ` [PATCH v3] " Chris Wilson
2017-12-18 12:32 ` Tvrtko Ursulin
2017-12-18 12:35 ` Chris Wilson
2017-12-18 12:39 ` [PATCH v4] " Chris Wilson
2017-12-18 12:58 ` Tvrtko Ursulin
2017-12-18 13:27 ` Chris Wilson
2017-12-17 13:28 ` [PATCH 3/3] drm/i915/selftests: Fix up igt_reset_engine Chris Wilson
2017-12-18 21:50 ` Michel Thierry
2017-12-18 21:54 ` Chris Wilson
2017-12-17 14:07 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset Patchwork
2017-12-17 15:36 ` ✗ Fi.CI.IGT: warning " Patchwork
2017-12-17 18:19 ` [PATCH 1/3] " Chris Wilson
2017-12-18 11:11 ` Tvrtko Ursulin
2017-12-18 11:19 ` Chris Wilson
2017-12-18 13:13 ` ✗ Fi.CI.BAT: failure for series starting with [1/3] drm/i915: Re-enable GGTT earlier after GPU reset (rev4) Patchwork