All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] lib/gt: Omit illegal instruction on hang injection with gen 8+
@ 2016-06-08 13:07 Mika Kuoppala
  2016-06-08 13:07 ` [PATCH 2/2] igt/gem_reset_stats: Add time constraints on hang detection Mika Kuoppala
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Mika Kuoppala @ 2016-06-08 13:07 UTC (permalink / raw)
  To: intel-gfx

0xffffffff as an illegal command confuses the command execution
on gen9 so that the next BB start will get ignored, causing a runaway
head past the bb end. This delays the hang detection substantially
as hangcheck then observes only a non progressing seqno.

Omit the bad instruction on gen8+ and rely on the chained batch
loop to cause a deterministic hang. Make the chained bb start
to jump straight into bb start, omitting the MI_NOOP or the bad
instruction on subsequent passes. This makes the acthd sampling
to hit more reliably to the same value, as the loop is smaller,
making the head appear to be 'more stuck'.

References: https://bugs.freedesktop.org/show_bug.cgi?id=92715
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 lib/igt_gt.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/igt_gt.c b/lib/igt_gt.c
index 95d74a0cfeae..adb4b95bb913 100644
--- a/lib/igt_gt.c
+++ b/lib/igt_gt.c
@@ -181,16 +181,22 @@ igt_hang_ring_t igt_hang_ctx(int fd,
 	exec.relocs_ptr = (uintptr_t)&reloc;
 
 	memset(b, 0xc5, sizeof(b));
-	b[0] = 0xffffffff;
+
 	len = 2;
-	if (intel_gen(intel_get_drm_devid(fd)) >= 8)
+	if (intel_gen(intel_get_drm_devid(fd)) >= 8) {
+		b[0] = MI_NOOP;
 		len++;
+	} else {
+		b[0] = 0xffffffff;
+	}
+
 	b[1] = MI_BATCH_BUFFER_START | (len - 2);
 	b[1+len] = MI_BATCH_BUFFER_END;
 	b[2+len] = MI_NOOP;
 	gem_write(fd, exec.handle, 0, b, sizeof(b));
 
 	reloc.offset = 8;
+	reloc.delta = 4;
 	reloc.target_handle = exec.handle;
 	reloc.read_domains = I915_GEM_DOMAIN_COMMAND;
 
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] igt/gem_reset_stats: Add time constraints on hang detection
  2016-06-08 13:07 [PATCH 1/2] lib/gt: Omit illegal instruction on hang injection with gen 8+ Mika Kuoppala
@ 2016-06-08 13:07 ` Mika Kuoppala
  2016-06-08 13:59   ` Chris Wilson
  2016-06-08 13:54 ` [PATCH 1/2] lib/gt: Omit illegal instruction on hang injection with gen 8+ Chris Wilson
  2016-06-23  8:49 ` ✗ Ro.CI.BAT: failure for series starting with [1/2] " Patchwork
  2 siblings, 1 reply; 5+ messages in thread
From: Mika Kuoppala @ 2016-06-08 13:07 UTC (permalink / raw)
  To: intel-gfx

Make sure that injected hang is detected below time threshold.
This ensures that we fail if hang was of no-progress type instead
of a stuck engine.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 tests/gem_reset_stats.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/tests/gem_reset_stats.c b/tests/gem_reset_stats.c
index 27bc6c9cfb59..74c102d3fd7d 100644
--- a/tests/gem_reset_stats.c
+++ b/tests/gem_reset_stats.c
@@ -148,6 +148,23 @@ static int gem_reset_status(int fd, int ctx_id)
 	return RS_NO_ERROR;
 }
 
+static struct timespec ts_injected;
+
+static double elapsed(const struct timespec *start, const struct timespec *end)
+{
+	return ((end->tv_sec - start->tv_sec) +
+		(end->tv_nsec - start->tv_nsec)*1e-9);
+}
+
+static double elapsed_from_hang_injection(void)
+{
+	struct timespec now;
+
+	clock_gettime(CLOCK_MONOTONIC, &now);
+
+	return elapsed(&ts_injected, &now);
+}
+
 #define BAN HANG_ALLOW_BAN
 #define ASYNC 2
 static void inject_hang(int fd, uint32_t ctx,
@@ -156,6 +173,8 @@ static void inject_hang(int fd, uint32_t ctx,
 {
 	igt_hang_ring_t hang;
 
+	clock_gettime(CLOCK_MONOTONIC, &ts_injected);
+
 	hang = igt_hang_ctx(fd, ctx, e->exec_id | e->flags, flags & BAN, NULL);
 	if ((flags & ASYNC) == 0)
 		igt_post_hang_ring(fd, hang);
@@ -238,6 +257,8 @@ static void test_rs(const struct intel_execution_engine *e,
 			assert_reset_status(i, fd[i], 0, RS_BATCH_PENDING);
 	}
 
+	igt_assert(elapsed_from_hang_injection() < 31.0);
+
 	for (i = 0; i < num_fds; i++)
 		close(fd[i]);
 }
@@ -290,6 +311,8 @@ static void test_rs_ctx(const struct intel_execution_engine *e,
 	}
 	sync_gpu();
 
+	igt_assert(elapsed_from_hang_injection() < 31.0);
+
 	for (i = 0; i < num_fds; i++)
 		assert_reset_status(i, fd[i], 0, RS_NO_ERROR);
 
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] lib/gt: Omit illegal instruction on hang injection with gen 8+
  2016-06-08 13:07 [PATCH 1/2] lib/gt: Omit illegal instruction on hang injection with gen 8+ Mika Kuoppala
  2016-06-08 13:07 ` [PATCH 2/2] igt/gem_reset_stats: Add time constraints on hang detection Mika Kuoppala
@ 2016-06-08 13:54 ` Chris Wilson
  2016-06-23  8:49 ` ✗ Ro.CI.BAT: failure for series starting with [1/2] " Patchwork
  2 siblings, 0 replies; 5+ messages in thread
From: Chris Wilson @ 2016-06-08 13:54 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Wed, Jun 08, 2016 at 04:07:17PM +0300, Mika Kuoppala wrote:
> 0xffffffff as an illegal command confuses the command execution
> on gen9 so that the next BB start will get ignored, causing a runaway
> head past the bb end. This delays the hang detection substantially
> as hangcheck then observes only a non progressing seqno.
> 
> Omit the bad instruction on gen8+ and rely on the chained batch
> loop to cause a deterministic hang. Make the chained bb start
> to jump straight into bb start, omitting the MI_NOOP or the bad
> instruction on subsequent passes. This makes the acthd sampling
> to hit more reliably to the same value, as the loop is smaller,
> making the head appear to be 'more stuck'.

You could note that BB(addr | 4) is also illegal. gen2+ requires 8 byte
alignment, ilk requires 64 byte.
 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=92715
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] igt/gem_reset_stats: Add time constraints on hang detection
  2016-06-08 13:07 ` [PATCH 2/2] igt/gem_reset_stats: Add time constraints on hang detection Mika Kuoppala
@ 2016-06-08 13:59   ` Chris Wilson
  0 siblings, 0 replies; 5+ messages in thread
From: Chris Wilson @ 2016-06-08 13:59 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Wed, Jun 08, 2016 at 04:07:18PM +0300, Mika Kuoppala wrote:
> Make sure that injected hang is detected below time threshold.
> This ensures that we fail if hang was of no-progress type instead
> of a stuck engine.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  tests/gem_reset_stats.c | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/tests/gem_reset_stats.c b/tests/gem_reset_stats.c
> index 27bc6c9cfb59..74c102d3fd7d 100644
> --- a/tests/gem_reset_stats.c
> +++ b/tests/gem_reset_stats.c
> @@ -148,6 +148,23 @@ static int gem_reset_status(int fd, int ctx_id)
>  	return RS_NO_ERROR;
>  }
>  
> +static struct timespec ts_injected;
> +
>  #define BAN HANG_ALLOW_BAN
>  #define ASYNC 2
>  static void inject_hang(int fd, uint32_t ctx,
> @@ -156,6 +173,8 @@ static void inject_hang(int fd, uint32_t ctx,
>  {
>  	igt_hang_ring_t hang;
>  
> +	clock_gettime(CLOCK_MONOTONIC, &ts_injected);
> +
>  	hang = igt_hang_ctx(fd, ctx, e->exec_id | e->flags, flags & BAN, NULL);
>  	if ((flags & ASYNC) == 0)
>  		igt_post_hang_ring(fd, hang);
> @@ -238,6 +257,8 @@ static void test_rs(const struct intel_execution_engine *e,
>  			assert_reset_status(i, fd[i], 0, RS_BATCH_PENDING);
>  	}
>  
> +	igt_assert(elapsed_from_hang_injection() < 31.0);
igt_assert(igt_seconds_elapsed(&ts_injected) <= 30);
> +
>  	for (i = 0; i < num_fds; i++)
>  		close(fd[i]);
>  }
> @@ -290,6 +311,8 @@ static void test_rs_ctx(const struct intel_execution_engine *e,
>  	}
>  	sync_gpu();
>  

> +	igt_assert(elapsed_from_hang_injection() < 31.0);
igt_assert(igt_seconds_elapsed(&ts_injected) <= 30);
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 5+ messages in thread

* ✗ Ro.CI.BAT: failure for series starting with [1/2] lib/gt: Omit illegal instruction on hang injection with gen 8+
  2016-06-08 13:07 [PATCH 1/2] lib/gt: Omit illegal instruction on hang injection with gen 8+ Mika Kuoppala
  2016-06-08 13:07 ` [PATCH 2/2] igt/gem_reset_stats: Add time constraints on hang detection Mika Kuoppala
  2016-06-08 13:54 ` [PATCH 1/2] lib/gt: Omit illegal instruction on hang injection with gen 8+ Chris Wilson
@ 2016-06-23  8:49 ` Patchwork
  2 siblings, 0 replies; 5+ messages in thread
From: Patchwork @ 2016-06-23  8:49 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] lib/gt: Omit illegal instruction on hang injection with gen 8+
URL   : https://patchwork.freedesktop.org/series/8452/
State : failure

== Summary ==

Applying: lib/gt: Omit illegal instruction on hang injection with gen 8+
fatal: sha1 information is lacking or useless (lib/igt_gt.c).
error: could not build fake ancestor
Patch failed at 0001 lib/gt: Omit illegal instruction on hang injection with gen 8+
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-06-23  8:49 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-08 13:07 [PATCH 1/2] lib/gt: Omit illegal instruction on hang injection with gen 8+ Mika Kuoppala
2016-06-08 13:07 ` [PATCH 2/2] igt/gem_reset_stats: Add time constraints on hang detection Mika Kuoppala
2016-06-08 13:59   ` Chris Wilson
2016-06-08 13:54 ` [PATCH 1/2] lib/gt: Omit illegal instruction on hang injection with gen 8+ Chris Wilson
2016-06-23  8:49 ` ✗ Ro.CI.BAT: failure for series starting with [1/2] " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.