* [PATCH i-g-t 01/17] lib: Report file cache as available system memory
@ 2018-07-02  9:07 ` Chris Wilson
  0 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

sysinfo() doesn't include all reclaimable memory. In particular, it
excludes the majority of global_node_page_state(NR_FILE_PAGES),
reclaimable pages that are a copy of on-disk files. It seems the only
way to obtain this counter is by parsing /proc/meminfo. For comparison,
see vm_enough_memory(), which includes NR_FILE_PAGES as available
(sadly there's no way to call vm_enough_memory() directly either!)
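
For illustration, a minimal standalone sketch of the same approach --
summing MemAvailable plus the page-cache counters from /proc/meminfo
(values are reported in kB). The helper name and buffer size here are
hypothetical; only the meminfo tags match the patch:

  #include <inttypes.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  static uint64_t meminfo_bytes(const char *buf, const char *tag)
  {
      const char *str = strstr(buf, tag);
      unsigned long val;

      if (str && sscanf(str + strlen(tag), " %lu", &val) == 1)
          return (uint64_t)val << 10; /* kB -> bytes */

      return 0;
  }

  int main(void)
  {
      char buf[8192];
      FILE *file = fopen("/proc/meminfo", "r");
      uint64_t avail;

      if (!file)
          return 1;

      /* slurp the whole file and NUL-terminate it for strstr() */
      buf[fread(buf, 1, sizeof(buf) - 1, file)] = '\0';
      fclose(file);

      avail  = meminfo_bytes(buf, "MemAvailable:");
      avail += meminfo_bytes(buf, "Buffers:");
      avail += meminfo_bytes(buf, "Cached:");
      avail += meminfo_bytes(buf, "SwapCached:");

      printf("estimated available: %" PRIu64 " MiB\n", avail >> 20);
      return 0;
  }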

v2: Pay attention to what one writes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/intel_os.c | 36 +++++++++++++++++++++++++++++++-----
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/lib/intel_os.c b/lib/intel_os.c
index 885ffdcec..5bd18fc52 100644
--- a/lib/intel_os.c
+++ b/lib/intel_os.c
@@ -84,6 +84,18 @@ intel_get_total_ram_mb(void)
 	return retval / (1024*1024);
 }
 
+static uint64_t get_meminfo(const char *info, const char *tag)
+{
+	const char *str;
+	unsigned long val;
+
+	str = strstr(info, tag);
+	if (str && sscanf(str + strlen(tag), " %lu", &val) == 1)
+		return (uint64_t)val << 10;
+
+	return 0;
+}
+
 /**
  * intel_get_avail_ram_mb:
  *
@@ -96,17 +108,31 @@ intel_get_avail_ram_mb(void)
 	uint64_t retval;
 
 #ifdef HAVE_STRUCT_SYSINFO_TOTALRAM /* Linux */
-	struct sysinfo sysinf;
+	char *info;
 	int fd;
 
 	fd = drm_open_driver(DRIVER_INTEL);
 	intel_purge_vm_caches(fd);
 	close(fd);
 
-	igt_assert(sysinfo(&sysinf) == 0);
-	retval = sysinf.freeram;
-	retval += min(sysinf.freeswap, sysinf.bufferram);
-	retval *= sysinf.mem_unit;
+	fd = open("/proc", O_RDONLY);
+	info = igt_sysfs_get(fd, "meminfo");
+	close(fd);
+
+	if (info) {
+		retval  = get_meminfo(info, "MemAvailable: ");
+		retval += get_meminfo(info, "Buffers: ");
+		retval += get_meminfo(info, "Cached: ");
+		retval += get_meminfo(info, "SwapCached: ");
+		free(info);
+	} else {
+		struct sysinfo sysinf;
+
+		igt_assert(sysinfo(&sysinf) == 0);
+		retval = sysinf.freeram;
+		retval += min(sysinf.freeswap, sysinf.bufferram);
+		retval *= sysinf.mem_unit;
+	}
 #elif defined(_SC_PAGESIZE) && defined(_SC_AVPHYS_PAGES) /* Solaris */
 	long pagesize, npages;
 
-- 
2.18.0

* [PATCH i-g-t 02/17] igt/gem_tiled_partial_pwrite_pread: Check for known swizzling
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

As we want to compare a templated tiling pattern against the target_bo,
we need to know that the swizzling is compatible. Otherwise, the two
tiling patterns may differ due to the underlying page addresses, which
we cannot know, and so the test may sporadically fail.
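
For reference, a sketch of why the pattern depends on the page address:
with bit-6 swizzling, bit 6 of each offset is XORed with higher address
bits, e.g. for I915_BIT_6_SWIZZLE_9_10 (the function name here is made
up for illustration):

  static uint32_t swizzle_bit6_9_10(uint32_t offset)
  {
      /* bit 6 is flipped by the parity of address bits 9 and 10 */
      uint32_t bit6 = ((offset >> 9) ^ (offset >> 10)) & 1;

      return offset ^ (bit6 << 6);
  }

The bit-17 swizzle variants additionally XOR in bit 17 of the physical
page address, which userspace cannot observe; hence the test below only
proceeds when phys_swizzle_mode == swizzle_mode, i.e. when no such
hidden component is in play.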

References: https://bugs.freedesktop.org/show_bug.cgi?id=102575
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/gem_tiled_partial_pwrite_pread.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/tests/gem_tiled_partial_pwrite_pread.c b/tests/gem_tiled_partial_pwrite_pread.c
index fe573c37c..83c57c07d 100644
--- a/tests/gem_tiled_partial_pwrite_pread.c
+++ b/tests/gem_tiled_partial_pwrite_pread.c
@@ -249,6 +249,24 @@ static void test_partial_read_writes(void)
 	}
 }
 
+static bool known_swizzling(uint32_t handle)
+{
+	struct drm_i915_gem_get_tiling2 {
+		uint32_t handle;
+		uint32_t tiling_mode;
+		uint32_t swizzle_mode;
+		uint32_t phys_swizzle_mode;
+	} arg = {
+		.handle = handle,
+	};
+#define DRM_IOCTL_I915_GEM_GET_TILING2	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_GET_TILING, struct drm_i915_gem_get_tiling2)
+
+	if (igt_ioctl(fd, DRM_IOCTL_I915_GEM_GET_TILING2, &arg))
+		return false;
+
+	return arg.phys_swizzle_mode == arg.swizzle_mode;
+}
+
 igt_main
 {
 	uint32_t tiling_mode = I915_TILING_X;
@@ -271,6 +289,12 @@ igt_main
 						      &tiling_mode, &scratch_pitch, 0);
 		igt_assert(tiling_mode == I915_TILING_X);
 		igt_assert(scratch_pitch == 4096);
+
+		/*
+		 * As we want to compare our template tiled pattern against
+		 * the target bo, we need consistent swizzling on both.
+		 */
+		igt_require(known_swizzling(scratch_bo->handle));
 		staging_bo = drm_intel_bo_alloc(bufmgr, "staging bo", BO_SIZE, 4096);
 		tiled_staging_bo = drm_intel_bo_alloc_tiled(bufmgr, "scratch bo", 1024,
 							    BO_SIZE/4096, 4,
-- 
2.18.0

* [PATCH i-g-t 03/17] igt/gem_set_tiling_vs_pwrite: Show the erroneous value
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/gem_set_tiling_vs_pwrite.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/gem_set_tiling_vs_pwrite.c b/tests/gem_set_tiling_vs_pwrite.c
index 006edfe4e..f0126b648 100644
--- a/tests/gem_set_tiling_vs_pwrite.c
+++ b/tests/gem_set_tiling_vs_pwrite.c
@@ -75,7 +75,7 @@ igt_simple_main
 	memset(data, 0, OBJECT_SIZE);
 	gem_read(fd, handle, 0, data, OBJECT_SIZE);
 	for (i = 0; i < OBJECT_SIZE/4; i++)
-		igt_assert(i == data[i]);
+		igt_assert_eq_u32(data[i], i);
 
 	/* touch it before changing the tiling, so that the fence sticks around */
 	gem_set_domain(fd, handle, I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
@@ -88,7 +88,7 @@ igt_simple_main
 	memset(data, 0, OBJECT_SIZE);
 	gem_read(fd, handle, 0, data, OBJECT_SIZE);
 	for (i = 0; i < OBJECT_SIZE/4; i++)
-		igt_assert(i == data[i]);
+		igt_assert_eq_u32(data[i], i);
 
 	munmap(ptr, OBJECT_SIZE);
 
-- 
2.18.0

* [PATCH i-g-t 04/17] lib: Convert spin batch constructor to a factory
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

In order to make adding more options easier, expose the full set of
options to the caller.
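
The igt_spin_batch_new() wrappers below rely on a compound literal with
designated initializers, so callers name only the options they need and
the rest default to zero. A minimal sketch of the pattern, with
hypothetical names (the zero-argument form, like the real macro, leans
on the GNU empty-initializer extension):

  #include <stdint.h>
  #include <stdio.h>

  struct spin_opts {
      uint32_t ctx;
      uint32_t dependency;
      unsigned int engine;
      unsigned int flags;
  };

  static void spin_factory(int fd, const struct spin_opts *opts)
  {
      printf("fd=%d ctx=%u engine=%u dep=%u flags=%#x\n",
             fd, opts->ctx, opts->engine, opts->dependency, opts->flags);
  }

  /* Unnamed fields of the compound literal are zero-initialized */
  #define spin_new(fd, ...) \
      spin_factory(fd, &((struct spin_opts){ __VA_ARGS__ }))

  int main(void)
  {
      spin_new(1);                             /* all defaults */
      spin_new(1, .engine = 2);                /* name just one option */
      spin_new(1, .ctx = 3, .flags = 1u << 0); /* any subset, by name */
      return 0;
  }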

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 lib/igt_dummyload.c            | 147 +++++++++------------------------
 lib/igt_dummyload.h            |  42 +++++-----
 tests/drv_missed_irq.c         |   2 +-
 tests/gem_busy.c               |  17 ++--
 tests/gem_ctx_isolation.c      |  26 +++---
 tests/gem_eio.c                |  13 ++-
 tests/gem_exec_fence.c         |  16 ++--
 tests/gem_exec_latency.c       |  18 +++-
 tests/gem_exec_nop.c           |   4 +-
 tests/gem_exec_reloc.c         |  10 ++-
 tests/gem_exec_schedule.c      |  27 ++++--
 tests/gem_exec_suspend.c       |   2 +-
 tests/gem_fenced_exec_thrash.c |   2 +-
 tests/gem_shrink.c             |   4 +-
 tests/gem_spin_batch.c         |   4 +-
 tests/gem_sync.c               |   5 +-
 tests/gem_wait.c               |   4 +-
 tests/kms_busy.c               |  10 ++-
 tests/kms_cursor_legacy.c      |   7 +-
 tests/perf_pmu.c               |  33 +++++---
 tests/pm_rps.c                 |   9 +-
 21 files changed, 189 insertions(+), 213 deletions(-)

diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
index 3809b4e61..94efdf745 100644
--- a/lib/igt_dummyload.c
+++ b/lib/igt_dummyload.c
@@ -75,12 +75,9 @@ fill_reloc(struct drm_i915_gem_relocation_entry *reloc,
 	reloc->write_domain = write_domains;
 }
 
-#define OUT_FENCE	(1 << 0)
-#define POLL_RUN	(1 << 1)
-
 static int
-emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
-		     uint32_t dep, unsigned int flags)
+emit_recursive_batch(igt_spin_t *spin,
+		     int fd, const struct igt_spin_factory *opts)
 {
 #define SCRATCH 0
 #define BATCH 1
@@ -95,21 +92,18 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
 	int i;
 
 	nengine = 0;
-	if (engine == ALL_ENGINES) {
-		for_each_engine(fd, engine) {
-			if (engine) {
-			if (flags & POLL_RUN)
-				igt_require(!(flags & POLL_RUN) ||
-					    gem_can_store_dword(fd, engine));
-
-				engines[nengine++] = engine;
-			}
+	if (opts->engine == ALL_ENGINES) {
+		unsigned int engine;
+
+		for_each_physical_engine(fd, engine) {
+			if (opts->flags & IGT_SPIN_POLL_RUN &&
+			    !gem_can_store_dword(fd, engine))
+				continue;
+
+			engines[nengine++] = engine;
 		}
 	} else {
-		gem_require_ring(fd, engine);
-		igt_require(!(flags & POLL_RUN) ||
-			    gem_can_store_dword(fd, engine));
-		engines[nengine++] = engine;
+		engines[nengine++] = opts->engine;
 	}
 	igt_require(nengine);
 
@@ -130,20 +124,20 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
 	execbuf->buffer_count++;
 	batch_start = batch;
 
-	if (dep) {
-		igt_assert(!(flags & POLL_RUN));
+	if (opts->dependency) {
+		igt_assert(!(opts->flags & IGT_SPIN_POLL_RUN));
 
 		/* dummy write to dependency */
-		obj[SCRATCH].handle = dep;
+		obj[SCRATCH].handle = opts->dependency;
 		fill_reloc(&relocs[obj[BATCH].relocation_count++],
-			   dep, 1020,
+			   opts->dependency, 1020,
 			   I915_GEM_DOMAIN_RENDER,
 			   I915_GEM_DOMAIN_RENDER);
 		execbuf->buffer_count++;
-	} else if (flags & POLL_RUN) {
+	} else if (opts->flags & IGT_SPIN_POLL_RUN) {
 		unsigned int offset;
 
-		igt_assert(!dep);
+		igt_assert(!opts->dependency);
 
 		if (gen == 4 || gen == 5) {
 			execbuf->flags |= I915_EXEC_SECURE;
@@ -231,9 +225,9 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
 
 	execbuf->buffers_ptr = to_user_pointer(obj +
 					       (2 - execbuf->buffer_count));
-	execbuf->rsvd1 = ctx;
+	execbuf->rsvd1 = opts->ctx;
 
-	if (flags & OUT_FENCE)
+	if (opts->flags & IGT_SPIN_FENCE_OUT)
 		execbuf->flags |= I915_EXEC_FENCE_OUT;
 
 	for (i = 0; i < nengine; i++) {
@@ -242,7 +236,7 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
 
 		gem_execbuf_wr(fd, execbuf);
 
-		if (flags & OUT_FENCE) {
+		if (opts->flags & IGT_SPIN_FENCE_OUT) {
 			int _fd = execbuf->rsvd2 >> 32;
 
 			igt_assert(_fd >= 0);
@@ -271,16 +265,14 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
 }
 
 static igt_spin_t *
-___igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep,
-		      unsigned int flags)
+spin_batch_create(int fd, const struct igt_spin_factory *opts)
 {
 	igt_spin_t *spin;
 
 	spin = calloc(1, sizeof(struct igt_spin));
 	igt_assert(spin);
 
-	spin->out_fence = emit_recursive_batch(spin, fd, ctx, engine, dep,
-					       flags);
+	spin->out_fence = emit_recursive_batch(spin, fd, opts);
 
 	pthread_mutex_lock(&list_lock);
 	igt_list_add(&spin->link, &spin_list);
@@ -290,18 +282,15 @@ ___igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep,
 }
 
 igt_spin_t *
-__igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep)
+__igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts)
 {
-	return ___igt_spin_batch_new(fd, ctx, engine, dep, 0);
+	return spin_batch_create(fd, opts);
 }
 
 /**
- * igt_spin_batch_new:
+ * igt_spin_batch_factory:
  * @fd: open i915 drm file descriptor
- * @engine: Ring to execute batch OR'd with execbuf flags. If value is less
- *          than 0, execute on all available rings.
- * @dep: handle to a buffer object dependency. If greater than 0, add a
- *              relocation entry to this buffer within the batch.
+ * @opts: controlling options such as context, engine, dependencies etc
  *
  * Start a recursive batch on a ring. Immediately returns a #igt_spin_t that
  * contains the batch's handle that can be waited upon. The returned structure
@@ -311,86 +300,26 @@ __igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep)
  * Structure with helper internal state for igt_spin_batch_free().
  */
 igt_spin_t *
-igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep)
+igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts)
 {
 	igt_spin_t *spin;
 
 	igt_require_gem(fd);
 
-	spin = __igt_spin_batch_new(fd, ctx, engine, dep);
-	igt_assert(gem_bo_busy(fd, spin->handle));
-
-	return spin;
-}
-
-igt_spin_t *
-__igt_spin_batch_new_fence(int fd, uint32_t ctx, unsigned engine)
-{
-	return ___igt_spin_batch_new(fd, ctx, engine, 0, OUT_FENCE);
-}
+	if (opts->engine != ALL_ENGINES) {
+		gem_require_ring(fd, opts->engine);
+		if (opts->flags & IGT_SPIN_POLL_RUN)
+			igt_require(gem_can_store_dword(fd, opts->engine));
+	}
 
-/**
- * igt_spin_batch_new_fence:
- * @fd: open i915 drm file descriptor
- * @engine: Ring to execute batch OR'd with execbuf flags. If value is less
- *          than 0, execute on all available rings.
- *
- * Start a recursive batch on a ring. Immediately returns a #igt_spin_t that
- * contains the batch's handle that can be waited upon. The returned structure
- * must be passed to igt_spin_batch_free() for post-processing.
- *
- * igt_spin_t will contain an output fence associtated with this batch.
- *
- * Returns:
- * Structure with helper internal state for igt_spin_batch_free().
- */
-igt_spin_t *
-igt_spin_batch_new_fence(int fd, uint32_t ctx, unsigned engine)
-{
-	igt_spin_t *spin;
+	spin = spin_batch_create(fd, opts);
 
-	igt_require_gem(fd);
-	igt_require(gem_has_exec_fence(fd));
-
-	spin = __igt_spin_batch_new_fence(fd, ctx, engine);
 	igt_assert(gem_bo_busy(fd, spin->handle));
-	igt_assert(poll(&(struct pollfd){spin->out_fence, POLLIN}, 1, 0) == 0);
-
-	return spin;
-}
-
-igt_spin_t *
-__igt_spin_batch_new_poll(int fd, uint32_t ctx, unsigned engine)
-{
-	return ___igt_spin_batch_new(fd, ctx, engine, 0, POLL_RUN);
-}
+	if (opts->flags & IGT_SPIN_FENCE_OUT) {
+		struct pollfd pfd = { spin->out_fence, POLLIN };
 
-/**
- * igt_spin_batch_new_poll:
- * @fd: open i915 drm file descriptor
- * @engine: Ring to execute batch OR'd with execbuf flags. If value is less
- *          than 0, execute on all available rings.
- *
- * Start a recursive batch on a ring. Immediately returns a #igt_spin_t that
- * contains the batch's handle that can be waited upon. The returned structure
- * must be passed to igt_spin_batch_free() for post-processing.
- *
- * igt_spin_t->running will containt a pointer which target will change from
- * zero to one once the spinner actually starts executing on the GPU.
- *
- * Returns:
- * Structure with helper internal state for igt_spin_batch_free().
- */
-igt_spin_t *
-igt_spin_batch_new_poll(int fd, uint32_t ctx, unsigned engine)
-{
-	igt_spin_t *spin;
-
-	igt_require_gem(fd);
-	igt_require(gem_mmap__has_wc(fd));
-
-	spin = __igt_spin_batch_new_poll(fd, ctx, engine);
-	igt_assert(gem_bo_busy(fd, spin->handle));
+		igt_assert(poll(&pfd, 1, 0) == 0);
+	}
 
 	return spin;
 }
diff --git a/lib/igt_dummyload.h b/lib/igt_dummyload.h
index c6ccc2936..c794f2544 100644
--- a/lib/igt_dummyload.h
+++ b/lib/igt_dummyload.h
@@ -43,29 +43,25 @@ typedef struct igt_spin {
 	bool *running;
 } igt_spin_t;
 
-igt_spin_t *__igt_spin_batch_new(int fd,
-				 uint32_t ctx,
-				 unsigned engine,
-				 uint32_t  dep);
-igt_spin_t *igt_spin_batch_new(int fd,
-			       uint32_t ctx,
-			       unsigned engine,
-			       uint32_t  dep);
-
-igt_spin_t *__igt_spin_batch_new_fence(int fd,
-				       uint32_t ctx,
-				       unsigned engine);
-
-igt_spin_t *igt_spin_batch_new_fence(int fd,
-				     uint32_t ctx,
-				     unsigned engine);
-
-igt_spin_t *__igt_spin_batch_new_poll(int fd,
-				       uint32_t ctx,
-				       unsigned engine);
-igt_spin_t *igt_spin_batch_new_poll(int fd,
-				    uint32_t ctx,
-				    unsigned engine);
+struct igt_spin_factory {
+	uint32_t ctx;
+	uint32_t dependency;
+	unsigned int engine;
+	unsigned int flags;
+};
+
+#define IGT_SPIN_FENCE_OUT (1 << 0)
+#define IGT_SPIN_POLL_RUN  (1 << 1)
+
+igt_spin_t *
+__igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts);
+igt_spin_t *
+igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts);
+
+#define __igt_spin_batch_new(fd, ...) \
+	__igt_spin_batch_factory(fd, &((struct igt_spin_factory){__VA_ARGS__}))
+#define igt_spin_batch_new(fd, ...) \
+	igt_spin_batch_factory(fd, &((struct igt_spin_factory){__VA_ARGS__}))
 
 void igt_spin_batch_set_timeout(igt_spin_t *spin, int64_t ns);
 void igt_spin_batch_end(igt_spin_t *spin);
diff --git a/tests/drv_missed_irq.c b/tests/drv_missed_irq.c
index 791ee51fb..78690c36a 100644
--- a/tests/drv_missed_irq.c
+++ b/tests/drv_missed_irq.c
@@ -33,7 +33,7 @@ IGT_TEST_DESCRIPTION("Inject missed interrupts and make sure they are caught");
 
 static void trigger_missed_interrupt(int fd, unsigned ring)
 {
-	igt_spin_t *spin = __igt_spin_batch_new(fd, 0, ring, 0);
+	igt_spin_t *spin = __igt_spin_batch_new(fd, .engine = ring);
 	uint32_t go;
 	int link[2];
 
diff --git a/tests/gem_busy.c b/tests/gem_busy.c
index f564651ba..76b44a5d4 100644
--- a/tests/gem_busy.c
+++ b/tests/gem_busy.c
@@ -114,7 +114,9 @@ static void semaphore(int fd, unsigned ring, uint32_t flags)
 
 	/* Create a long running batch which we can use to hog the GPU */
 	handle[BUSY] = gem_create(fd, 4096);
-	spin = igt_spin_batch_new(fd, 0, ring, handle[BUSY]);
+	spin = igt_spin_batch_new(fd,
+				  .engine = ring,
+				  .dependency = handle[BUSY]);
 
 	/* Queue a batch after the busy, it should block and remain "busy" */
 	igt_assert(exec_noop(fd, handle, ring | flags, false));
@@ -363,17 +365,16 @@ static void close_race(int fd)
 		igt_assert(sched_setscheduler(getpid(), SCHED_RR, &rt) == 0);
 
 		for (i = 0; i < nhandles; i++) {
-			spin[i] = __igt_spin_batch_new(fd, 0,
-						       engines[rand() % nengine], 0);
+			spin[i] = __igt_spin_batch_new(fd,
+						       .engine = engines[rand() % nengine]);
 			handles[i] = spin[i]->handle;
 		}
 
 		igt_until_timeout(20) {
 			for (i = 0; i < nhandles; i++) {
 				igt_spin_batch_free(fd, spin[i]);
-				spin[i] = __igt_spin_batch_new(fd, 0,
-							       engines[rand() % nengine],
-							       0);
+				spin[i] = __igt_spin_batch_new(fd,
+							       .engine = engines[rand() % nengine]);
 				handles[i] = spin[i]->handle;
 				__sync_synchronize();
 			}
@@ -415,7 +416,7 @@ static bool has_semaphores(int fd)
 
 static bool has_extended_busy_ioctl(int fd)
 {
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, I915_EXEC_RENDER, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd, .engine = I915_EXEC_RENDER);
 	uint32_t read, write;
 
 	__gem_busy(fd, spin->handle, &read, &write);
@@ -426,7 +427,7 @@ static bool has_extended_busy_ioctl(int fd)
 
 static void basic(int fd, unsigned ring, unsigned flags)
 {
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, ring, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd, .engine = ring);
 	struct timespec tv;
 	int timeout;
 	bool busy;
diff --git a/tests/gem_ctx_isolation.c b/tests/gem_ctx_isolation.c
index fe7d3490c..2e19e8c03 100644
--- a/tests/gem_ctx_isolation.c
+++ b/tests/gem_ctx_isolation.c
@@ -502,7 +502,7 @@ static void isolation(int fd,
 		ctx[0] = gem_context_create(fd);
 		regs[0] = read_regs(fd, ctx[0], e, flags);
 
-		spin = igt_spin_batch_new(fd, ctx[0], engine, 0);
+		spin = igt_spin_batch_new(fd, .ctx = ctx[0], .engine = engine);
 
 		if (flags & DIRTY1) {
 			igt_debug("%s[%d]: Setting all registers of ctx 0 to 0x%08x\n",
@@ -557,8 +557,11 @@ static void isolation(int fd,
 
 static void inject_reset_context(int fd, unsigned int engine)
 {
+	struct igt_spin_factory opts = {
+		.ctx = gem_context_create(fd),
+		.engine = engine,
+	};
 	igt_spin_t *spin;
-	uint32_t ctx;
 
 	/*
 	 * Force a context switch before triggering the reset, or else
@@ -566,19 +569,20 @@ static void inject_reset_context(int fd, unsigned int engine)
 	 * HW for screwing up if the context was already broken.
 	 */
 
-	ctx = gem_context_create(fd);
-	if (gem_can_store_dword(fd, engine)) {
-		spin = __igt_spin_batch_new_poll(fd, ctx, engine);
+	if (gem_can_store_dword(fd, engine))
+		opts.flags |= IGT_SPIN_POLL_RUN;
+
+	spin = __igt_spin_batch_factory(fd, &opts);
+
+	if (spin->running)
 		igt_spin_busywait_until_running(spin);
-	} else {
-		spin = __igt_spin_batch_new(fd, ctx, engine, 0);
+	else
 		usleep(1000); /* better than nothing */
-	}
 
 	igt_force_gpu_reset(fd);
 
 	igt_spin_batch_free(fd, spin);
-	gem_context_destroy(fd, ctx);
+	gem_context_destroy(fd, opts.ctx);
 }
 
 static void preservation(int fd,
@@ -604,7 +608,7 @@ static void preservation(int fd,
 	gem_quiescent_gpu(fd);
 
 	ctx[num_values] = gem_context_create(fd);
-	spin = igt_spin_batch_new(fd, ctx[num_values], engine, 0);
+	spin = igt_spin_batch_new(fd, .ctx = ctx[num_values], .engine = engine);
 	regs[num_values][0] = read_regs(fd, ctx[num_values], e, flags);
 	for (int v = 0; v < num_values; v++) {
 		ctx[v] = gem_context_create(fd);
@@ -644,7 +648,7 @@ static void preservation(int fd,
 		break;
 	}
 
-	spin = igt_spin_batch_new(fd, ctx[num_values], engine, 0);
+	spin = igt_spin_batch_new(fd, .ctx = ctx[num_values], .engine = engine);
 	for (int v = 0; v < num_values; v++)
 		regs[v][1] = read_regs(fd, ctx[v], e, flags);
 	regs[num_values][1] = read_regs(fd, ctx[num_values], e, flags);
diff --git a/tests/gem_eio.c b/tests/gem_eio.c
index 5faf7502b..0ec1aaec9 100644
--- a/tests/gem_eio.c
+++ b/tests/gem_eio.c
@@ -157,10 +157,15 @@ static int __gem_wait(int fd, uint32_t handle, int64_t timeout)
 
 static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
 {
-	if (gem_can_store_dword(fd, flags))
-		return __igt_spin_batch_new_poll(fd, ctx, flags);
-	else
-		return __igt_spin_batch_new(fd, ctx, flags, 0);
+	struct igt_spin_factory opts = {
+		.ctx = ctx,
+		.engine = flags,
+	};
+
+	if (gem_can_store_dword(fd, opts.engine))
+		opts.flags |= IGT_SPIN_POLL_RUN;
+
+	return __igt_spin_batch_factory(fd, &opts);
 }
 
 static void __spin_wait(int fd, igt_spin_t *spin)
diff --git a/tests/gem_exec_fence.c b/tests/gem_exec_fence.c
index eb93308d1..ba46595d3 100644
--- a/tests/gem_exec_fence.c
+++ b/tests/gem_exec_fence.c
@@ -468,7 +468,7 @@ static void test_parallel(int fd, unsigned int master)
 	/* Fill the queue with many requests so that the next one has to
 	 * wait before it can be executed by the hardware.
 	 */
-	spin = igt_spin_batch_new(fd, 0, master, plug);
+	spin = igt_spin_batch_new(fd, .engine = master, .dependency = plug);
 	resubmit(fd, spin->handle, master, 16);
 
 	/* Now queue the master request and its secondaries */
@@ -651,7 +651,7 @@ static void test_keep_in_fence(int fd, unsigned int engine, unsigned int flags)
 	igt_spin_t *spin;
 	int fence;
 
-	spin = igt_spin_batch_new(fd, 0, engine, 0);
+	spin = igt_spin_batch_new(fd, .engine = engine);
 
 	gem_execbuf_wr(fd, &execbuf);
 	fence = upper_32_bits(execbuf.rsvd2);
@@ -1070,7 +1070,7 @@ static void test_syncobj_unused_fence(int fd)
 	struct local_gem_exec_fence fence = {
 		.handle = syncobj_create(fd),
 	};
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd);
 
 	/* sanity check our syncobj_to_sync_file interface */
 	igt_assert_eq(__syncobj_to_sync_file(fd, 0), -ENOENT);
@@ -1162,7 +1162,7 @@ static void test_syncobj_signal(int fd)
 	struct local_gem_exec_fence fence = {
 		.handle = syncobj_create(fd),
 	};
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd);
 
 	/* Check that the syncobj is signaled only when our request/fence is */
 
@@ -1212,7 +1212,7 @@ static void test_syncobj_wait(int fd)
 
 	gem_quiescent_gpu(fd);
 
-	spin = igt_spin_batch_new(fd, 0, 0, 0);
+	spin = igt_spin_batch_new(fd);
 
 	memset(&execbuf, 0, sizeof(execbuf));
 	execbuf.buffers_ptr = to_user_pointer(&obj);
@@ -1282,7 +1282,7 @@ static void test_syncobj_export(int fd)
 		.handle = syncobj_create(fd),
 	};
 	int export[2];
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd);
 
 	/* Check that if we export the syncobj prior to use it picks up
 	 * the later fence. This allows a syncobj to establish a channel
@@ -1340,7 +1340,7 @@ static void test_syncobj_repeat(int fd)
 	struct drm_i915_gem_execbuffer2 execbuf;
 	struct local_gem_exec_fence *fence;
 	int export;
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd);
 
 	/* Check that we can wait on the same fence multiple times */
 	fence = calloc(nfences, sizeof(*fence));
@@ -1395,7 +1395,7 @@ static void test_syncobj_import(int fd)
 	const uint32_t bbe = MI_BATCH_BUFFER_END;
 	struct drm_i915_gem_exec_object2 obj;
 	struct drm_i915_gem_execbuffer2 execbuf;
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd);
 	uint32_t sync = syncobj_create(fd);
 	int fence;
 
diff --git a/tests/gem_exec_latency.c b/tests/gem_exec_latency.c
index ea2e4c681..75811f325 100644
--- a/tests/gem_exec_latency.c
+++ b/tests/gem_exec_latency.c
@@ -63,6 +63,10 @@ static unsigned int ring_size;
 static void
 poll_ring(int fd, unsigned ring, const char *name)
 {
+	const struct igt_spin_factory opts = {
+		.engine = ring,
+		.flags = IGT_SPIN_POLL_RUN,
+	};
 	struct timespec tv = {};
 	unsigned long cycles;
 	igt_spin_t *spin[2];
@@ -72,11 +76,11 @@ poll_ring(int fd, unsigned ring, const char *name)
 	gem_require_ring(fd, ring);
 	igt_require(gem_can_store_dword(fd, ring));
 
-	spin[0] = __igt_spin_batch_new_poll(fd, 0, ring);
+	spin[0] = __igt_spin_batch_factory(fd, &opts);
 	igt_assert(spin[0]->running);
 	cmd = *spin[0]->batch;
 
-	spin[1] = __igt_spin_batch_new_poll(fd, 0, ring);
+	spin[1] = __igt_spin_batch_factory(fd, &opts);
 	igt_assert(spin[1]->running);
 	igt_assert(cmd == *spin[1]->batch);
 
@@ -312,7 +316,9 @@ static void latency_from_ring(int fd,
 			       I915_GEM_DOMAIN_GTT);
 
 		if (flags & PREEMPT)
-			spin = __igt_spin_batch_new(fd, ctx[0], ring, 0);
+			spin = __igt_spin_batch_new(fd,
+						    .ctx = ctx[0],
+						    .engine = ring);
 
 		if (flags & CORK) {
 			obj[0].handle = igt_cork_plug(&c, fd);
@@ -456,6 +462,10 @@ rthog_latency_on_ring(int fd, unsigned int engine, const char *name, unsigned in
 	};
 #define NPASS ARRAY_SIZE(passname)
 #define MMAP_SZ (64 << 10)
+	const struct igt_spin_factory opts = {
+		.engine = engine,
+		.flags = IGT_SPIN_POLL_RUN,
+	};
 	struct rt_pkt *results;
 	unsigned int engines[16];
 	const char *names[16];
@@ -513,7 +523,7 @@ rthog_latency_on_ring(int fd, unsigned int engine, const char *name, unsigned in
 
 			usleep(250);
 
-			spin = __igt_spin_batch_new_poll(fd, 0, engine);
+			spin = __igt_spin_batch_factory(fd, &opts);
 			if (!spin) {
 				igt_warn("Failed to create spinner! (%s)\n",
 					 passname[pass]);
diff --git a/tests/gem_exec_nop.c b/tests/gem_exec_nop.c
index 0523b1c02..74d27522d 100644
--- a/tests/gem_exec_nop.c
+++ b/tests/gem_exec_nop.c
@@ -709,7 +709,9 @@ static void preempt(int fd, uint32_t handle,
 	clock_gettime(CLOCK_MONOTONIC, &start);
 	do {
 		igt_spin_t *spin =
-			__igt_spin_batch_new(fd, ctx[0], ring_id, 0);
+			__igt_spin_batch_new(fd,
+					     .ctx = ctx[0],
+					     .engine = ring_id);
 
 		for (int loop = 0; loop < 1024; loop++)
 			gem_execbuf(fd, &execbuf);
diff --git a/tests/gem_exec_reloc.c b/tests/gem_exec_reloc.c
index 91c6691af..837f60a6c 100644
--- a/tests/gem_exec_reloc.c
+++ b/tests/gem_exec_reloc.c
@@ -388,7 +388,9 @@ static void basic_reloc(int fd, unsigned before, unsigned after, unsigned flags)
 		}
 
 		if (flags & ACTIVE) {
-			spin = igt_spin_batch_new(fd, 0, I915_EXEC_DEFAULT, obj.handle);
+			spin = igt_spin_batch_new(fd,
+						  .engine = I915_EXEC_DEFAULT,
+						  .dependency = obj.handle);
 			if (!(flags & HANG))
 				igt_spin_batch_set_timeout(spin, NSEC_PER_SEC/100);
 			igt_assert(gem_bo_busy(fd, obj.handle));
@@ -454,7 +456,9 @@ static void basic_reloc(int fd, unsigned before, unsigned after, unsigned flags)
 		}
 
 		if (flags & ACTIVE) {
-			spin = igt_spin_batch_new(fd, 0, I915_EXEC_DEFAULT, obj.handle);
+			spin = igt_spin_batch_new(fd,
+						  .engine = I915_EXEC_DEFAULT,
+						  .dependency = obj.handle);
 			if (!(flags & HANG))
 				igt_spin_batch_set_timeout(spin, NSEC_PER_SEC/100);
 			igt_assert(gem_bo_busy(fd, obj.handle));
@@ -581,7 +585,7 @@ static void basic_range(int fd, unsigned flags)
 	execbuf.buffer_count = n + 1;
 
 	if (flags & ACTIVE) {
-		spin = igt_spin_batch_new(fd, 0, 0, obj[n].handle);
+		spin = igt_spin_batch_new(fd, .dependency = obj[n].handle);
 		if (!(flags & HANG))
 			igt_spin_batch_set_timeout(spin, NSEC_PER_SEC/100);
 		igt_assert(gem_bo_busy(fd, obj[n].handle));
diff --git a/tests/gem_exec_schedule.c b/tests/gem_exec_schedule.c
index 1f43147f7..35a44ab10 100644
--- a/tests/gem_exec_schedule.c
+++ b/tests/gem_exec_schedule.c
@@ -132,9 +132,12 @@ static void unplug_show_queue(int fd, struct igt_cork *c, unsigned int engine)
 	igt_spin_t *spin[MAX_ELSP_QLEN];
 
 	for (int n = 0; n < ARRAY_SIZE(spin); n++) {
-		uint32_t ctx = create_highest_priority(fd);
-		spin[n] = __igt_spin_batch_new(fd, ctx, engine, 0);
-		gem_context_destroy(fd, ctx);
+		const struct igt_spin_factory opts = {
+			.ctx = create_highest_priority(fd),
+			.engine = engine,
+		};
+		spin[n] = __igt_spin_batch_factory(fd, &opts);
+		gem_context_destroy(fd, opts.ctx);
 	}
 
 	igt_cork_unplug(c); /* batches will now be queued on the engine */
@@ -196,7 +199,7 @@ static void independent(int fd, unsigned int engine)
 			continue;
 
 		if (spin == NULL) {
-			spin = __igt_spin_batch_new(fd, 0, other, 0);
+			spin = __igt_spin_batch_new(fd, .engine = other);
 		} else {
 			struct drm_i915_gem_exec_object2 obj = {
 				.handle = spin->handle,
@@ -428,7 +431,9 @@ static void preempt(int fd, unsigned ring, unsigned flags)
 			ctx[LO] = gem_context_create(fd);
 			gem_context_set_priority(fd, ctx[LO], MIN_PRIO);
 		}
-		spin[n] = __igt_spin_batch_new(fd, ctx[LO], ring, 0);
+		spin[n] = __igt_spin_batch_new(fd,
+					       .ctx = ctx[LO],
+					       .engine = ring);
 		igt_debug("spin[%d].handle=%d\n", n, spin[n]->handle);
 
 		store_dword(fd, ctx[HI], ring, result, 0, n + 1, 0, I915_GEM_DOMAIN_RENDER);
@@ -462,7 +467,9 @@ static igt_spin_t *__noise(int fd, uint32_t ctx, int prio, igt_spin_t *spin)
 
 	for_each_physical_engine(fd, other) {
 		if (spin == NULL) {
-			spin = __igt_spin_batch_new(fd, ctx, other, 0);
+			spin = __igt_spin_batch_new(fd,
+						    .ctx = ctx,
+						    .engine = other);
 		} else {
 			struct drm_i915_gem_exec_object2 obj = {
 				.handle = spin->handle,
@@ -672,7 +679,9 @@ static void preempt_self(int fd, unsigned ring)
 	n = 0;
 	gem_context_set_priority(fd, ctx[HI], MIN_PRIO);
 	for_each_physical_engine(fd, other) {
-		spin[n] = __igt_spin_batch_new(fd, ctx[NOISE], other, 0);
+		spin[n] = __igt_spin_batch_new(fd,
+					       .ctx = ctx[NOISE],
+					       .engine = other);
 		store_dword(fd, ctx[HI], other,
 			    result, (n + 1)*sizeof(uint32_t), n + 1,
 			    0, I915_GEM_DOMAIN_RENDER);
@@ -714,7 +723,9 @@ static void preemptive_hang(int fd, unsigned ring)
 		ctx[LO] = gem_context_create(fd);
 		gem_context_set_priority(fd, ctx[LO], MIN_PRIO);
 
-		spin[n] = __igt_spin_batch_new(fd, ctx[LO], ring, 0);
+		spin[n] = __igt_spin_batch_new(fd,
+					       .ctx = ctx[LO],
+					       .engine = ring);
 
 		gem_context_destroy(fd, ctx[LO]);
 	}
diff --git a/tests/gem_exec_suspend.c b/tests/gem_exec_suspend.c
index db2bca262..43c52d105 100644
--- a/tests/gem_exec_suspend.c
+++ b/tests/gem_exec_suspend.c
@@ -189,7 +189,7 @@ static void run_test(int fd, unsigned engine, unsigned flags)
 	}
 
 	if (flags & HANG)
-		spin = igt_spin_batch_new(fd, 0, engine, 0);
+		spin = igt_spin_batch_new(fd, .engine = engine);
 
 	switch (mode(flags)) {
 	case NOSLEEP:
diff --git a/tests/gem_fenced_exec_thrash.c b/tests/gem_fenced_exec_thrash.c
index 385790ada..7248d310d 100644
--- a/tests/gem_fenced_exec_thrash.c
+++ b/tests/gem_fenced_exec_thrash.c
@@ -132,7 +132,7 @@ static void run_test(int fd, int num_fences, int expected_errno,
 			igt_spin_t *spin = NULL;
 
 			if (flags & BUSY_LOAD)
-				spin = __igt_spin_batch_new(fd, 0, 0, 0);
+				spin = __igt_spin_batch_new(fd);
 
 			igt_while_interruptible(flags & INTERRUPTIBLE) {
 				igt_assert_eq(__gem_execbuf(fd, &execbuf[i]),
diff --git a/tests/gem_shrink.c b/tests/gem_shrink.c
index 3d33453aa..929e0426a 100644
--- a/tests/gem_shrink.c
+++ b/tests/gem_shrink.c
@@ -346,9 +346,9 @@ static void reclaim(unsigned engine, int timeout)
 		} while (!*shared);
 	}
 
-	spin = igt_spin_batch_new(fd, 0, engine, 0);
+	spin = igt_spin_batch_new(fd, .engine = engine);
 	igt_until_timeout(timeout) {
-		igt_spin_t *next = __igt_spin_batch_new(fd, 0, engine, 0);
+		igt_spin_t *next = __igt_spin_batch_new(fd, .engine = engine);
 
 		igt_spin_batch_set_timeout(spin, timeout_100ms);
 		gem_sync(fd, spin->handle);
diff --git a/tests/gem_spin_batch.c b/tests/gem_spin_batch.c
index cffeb6d71..52410010b 100644
--- a/tests/gem_spin_batch.c
+++ b/tests/gem_spin_batch.c
@@ -41,9 +41,9 @@ static void spin(int fd, unsigned int engine, unsigned int timeout_sec)
 	struct timespec itv = { };
 	uint64_t elapsed;
 
-	spin = __igt_spin_batch_new(fd, 0, engine, 0);
+	spin = __igt_spin_batch_new(fd, .engine = engine);
 	while ((elapsed = igt_nsec_elapsed(&tv)) >> 30 < timeout_sec) {
-		igt_spin_t *next = __igt_spin_batch_new(fd, 0, engine, 0);
+		igt_spin_t *next = __igt_spin_batch_new(fd, .engine = engine);
 
 		igt_spin_batch_set_timeout(spin,
 					   timeout_100ms - igt_nsec_elapsed(&itv));
diff --git a/tests/gem_sync.c b/tests/gem_sync.c
index 1e2e089a1..2fcb9aa01 100644
--- a/tests/gem_sync.c
+++ b/tests/gem_sync.c
@@ -715,9 +715,8 @@ preempt(int fd, unsigned ring, int num_children, int timeout)
 		do {
 			igt_spin_t *spin =
 				__igt_spin_batch_new(fd,
-						     ctx[0],
-						     execbuf.flags,
-						     0);
+						     .ctx = ctx[0],
+						     .engine = execbuf.flags);
 
 			do {
 				gem_execbuf(fd, &execbuf);
diff --git a/tests/gem_wait.c b/tests/gem_wait.c
index 61d8a4059..7914c9365 100644
--- a/tests/gem_wait.c
+++ b/tests/gem_wait.c
@@ -74,7 +74,9 @@ static void basic(int fd, unsigned engine, unsigned flags)
 	IGT_CORK_HANDLE(cork);
 	uint32_t plug =
 		flags & (WRITE | AWAIT) ? igt_cork_plug(&cork, fd) : 0;
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, engine, plug);
+	igt_spin_t *spin = igt_spin_batch_new(fd,
+					      .engine = engine,
+					      .dependency = plug);
 	struct drm_i915_gem_wait wait = {
 		flags & WRITE ? plug : spin->handle
 	};
diff --git a/tests/kms_busy.c b/tests/kms_busy.c
index 4a4e0e156..abf39828b 100644
--- a/tests/kms_busy.c
+++ b/tests/kms_busy.c
@@ -84,7 +84,8 @@ static void flip_to_fb(igt_display_t *dpy, int pipe,
 	struct drm_event_vblank ev;
 
 	igt_spin_t *t = igt_spin_batch_new(dpy->drm_fd,
-					   0, ring, fb->gem_handle);
+					   .engine = ring,
+					   .dependency = fb->gem_handle);
 
 	if (modeset) {
 		/*
@@ -200,7 +201,8 @@ static void test_atomic_commit_hang(igt_display_t *dpy, igt_plane_t *primary,
 				    struct igt_fb *busy_fb, unsigned ring)
 {
 	igt_spin_t *t = igt_spin_batch_new(dpy->drm_fd,
-					   0, ring, busy_fb->gem_handle);
+					   .engine = ring,
+					   .dependency = busy_fb->gem_handle);
 	struct pollfd pfd = { .fd = dpy->drm_fd, .events = POLLIN };
 	unsigned flags = 0;
 	struct drm_event_vblank ev;
@@ -287,7 +289,9 @@ static void test_pageflip_modeset_hang(igt_display_t *dpy,
 
 	igt_display_commit2(dpy, dpy->is_atomic ? COMMIT_ATOMIC : COMMIT_LEGACY);
 
-	t = igt_spin_batch_new(dpy->drm_fd, 0, ring, fb.gem_handle);
+	t = igt_spin_batch_new(dpy->drm_fd,
+			       .engine = ring,
+			       .dependency = fb.gem_handle);
 
 	do_or_die(drmModePageFlip(dpy->drm_fd, dpy->pipes[pipe].crtc_id, fb.fb_id, DRM_MODE_PAGE_FLIP_EVENT, &fb));
 
diff --git a/tests/kms_cursor_legacy.c b/tests/kms_cursor_legacy.c
index d0a28b3c4..85340d43e 100644
--- a/tests/kms_cursor_legacy.c
+++ b/tests/kms_cursor_legacy.c
@@ -532,7 +532,8 @@ static void basic_flip_cursor(igt_display_t *display,
 
 		spin = NULL;
 		if (flags & BASIC_BUSY)
-			spin = igt_spin_batch_new(display->drm_fd, 0, 0, fb_info.gem_handle);
+			spin = igt_spin_batch_new(display->drm_fd,
+						  .dependency = fb_info.gem_handle);
 
 		/* Start with a synchronous query to align with the vblank */
 		vblank_start = get_vblank(display->drm_fd, pipe, DRM_VBLANK_NEXTONMISS);
@@ -1323,8 +1324,8 @@ static void flip_vs_cursor_busy_crc(igt_display_t *display, bool atomic)
 	for (int i = 1; i >= 0; i--) {
 		igt_spin_t *spin;
 
-		spin = igt_spin_batch_new(display->drm_fd, 0, 0,
-					  fb_info[1].gem_handle);
+		spin = igt_spin_batch_new(display->drm_fd,
+					  .dependency = fb_info[1].gem_handle);
 
 		vblank_start = get_vblank(display->drm_fd, pipe, DRM_VBLANK_NEXTONMISS);
 
diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index 4570f926d..a1d36ac4f 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -172,10 +172,15 @@ static unsigned int e2ring(int gem_fd, const struct intel_execution_engine2 *e)
 
 static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
 {
+	struct igt_spin_factory opts = {
+		.ctx = ctx,
+		.engine = flags,
+	};
+
 	if (gem_can_store_dword(fd, flags))
-		return __igt_spin_batch_new_poll(fd, ctx, flags);
-	else
-		return __igt_spin_batch_new(fd, ctx, flags, 0);
+		opts.flags |= IGT_SPIN_POLL_RUN;
+
+	return __igt_spin_batch_factory(fd, &opts);
 }
 
 static unsigned long __spin_wait(int fd, igt_spin_t *spin)
@@ -356,7 +361,9 @@ busy_double_start(int gem_fd, const struct intel_execution_engine2 *e)
 	 */
 	spin[0] = __spin_sync(gem_fd, 0, e2ring(gem_fd, e));
 	usleep(500e3);
-	spin[1] = __igt_spin_batch_new(gem_fd, ctx, e2ring(gem_fd, e), 0);
+	spin[1] = __igt_spin_batch_new(gem_fd,
+				       .ctx = ctx,
+				       .engine = e2ring(gem_fd, e));
 
 	/*
 	 * Open PMU as fast as possible after the second spin batch in attempt
@@ -1045,8 +1052,8 @@ static void cpu_hotplug(int gem_fd)
 	 * Create two spinners so test can ensure shorter gaps in engine
 	 * busyness as it is terminating one and re-starting the other.
 	 */
-	spin[0] = igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER, 0);
-	spin[1] = __igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER, 0);
+	spin[0] = igt_spin_batch_new(gem_fd, .engine = I915_EXEC_RENDER);
+	spin[1] = __igt_spin_batch_new(gem_fd, .engine = I915_EXEC_RENDER);
 
 	val = __pmu_read_single(fd, &ts[0]);
 
@@ -1129,8 +1136,8 @@ static void cpu_hotplug(int gem_fd)
 			break;
 
 		igt_spin_batch_free(gem_fd, spin[cur]);
-		spin[cur] = __igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER,
-						 0);
+		spin[cur] = __igt_spin_batch_new(gem_fd,
+						 .engine = I915_EXEC_RENDER);
 		cur ^= 1;
 	}
 
@@ -1167,8 +1174,9 @@ test_interrupts(int gem_fd)
 
 	/* Queue spinning batches. */
 	for (int i = 0; i < target; i++) {
-		spin[i] = __igt_spin_batch_new_fence(gem_fd,
-						     0, I915_EXEC_RENDER);
+		spin[i] = __igt_spin_batch_new(gem_fd,
+					       .engine = I915_EXEC_RENDER,
+					       .flags = IGT_SPIN_FENCE_OUT);
 		if (i == 0) {
 			fence_fd = spin[i]->out_fence;
 		} else {
@@ -1229,7 +1237,8 @@ test_interrupts_sync(int gem_fd)
 
 	/* Queue spinning batches. */
 	for (int i = 0; i < target; i++)
-		spin[i] = __igt_spin_batch_new_fence(gem_fd, 0, 0);
+		spin[i] = __igt_spin_batch_new(gem_fd,
+					       .flags = IGT_SPIN_FENCE_OUT);
 
 	/* Wait for idle state. */
 	idle = pmu_read_single(fd);
@@ -1550,7 +1559,7 @@ accuracy(int gem_fd, const struct intel_execution_engine2 *e,
 		igt_spin_t *spin;
 
 		/* Allocate our spin batch and idle it. */
-		spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
+		spin = igt_spin_batch_new(gem_fd, .engine = e2ring(gem_fd, e));
 		igt_spin_batch_end(spin);
 		gem_sync(gem_fd, spin->handle);
 
diff --git a/tests/pm_rps.c b/tests/pm_rps.c
index 006d084b8..202132b1c 100644
--- a/tests/pm_rps.c
+++ b/tests/pm_rps.c
@@ -235,9 +235,9 @@ static void load_helper_run(enum load load)
 
 		igt_debug("Applying %s load...\n", lh.load ? "high" : "low");
 
-		spin[0] = __igt_spin_batch_new(drm_fd, 0, 0, 0);
+		spin[0] = __igt_spin_batch_new(drm_fd);
 		if (lh.load == HIGH)
-			spin[1] = __igt_spin_batch_new(drm_fd, 0, 0, 0);
+			spin[1] = __igt_spin_batch_new(drm_fd);
 		while (!lh.exit) {
 			handle = spin[0]->handle;
 			igt_spin_batch_end(spin[0]);
@@ -248,8 +248,7 @@ static void load_helper_run(enum load load)
 			usleep(100);
 
 			spin[0] = spin[1];
-			spin[lh.load == HIGH] =
-				__igt_spin_batch_new(drm_fd, 0, 0, 0);
+			spin[lh.load == HIGH] = __igt_spin_batch_new(drm_fd);
 		}
 
 		handle = spin[0]->handle;
@@ -510,7 +509,7 @@ static void boost_freq(int fd, int *boost_freqs)
 	int64_t timeout = 1;
 	igt_spin_t *load;
 
-	load = igt_spin_batch_new(fd, 0, 0, 0);
+	load = igt_spin_batch_new(fd);
 	resubmit_batch(fd, load->handle, 16);
 
 	/* Waiting will grant us a boost to maximum */
-- 
2.18.0

 				__sync_synchronize();
 			}
@@ -415,7 +416,7 @@ static bool has_semaphores(int fd)
 
 static bool has_extended_busy_ioctl(int fd)
 {
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, I915_EXEC_RENDER, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd, .engine = I915_EXEC_RENDER);
 	uint32_t read, write;
 
 	__gem_busy(fd, spin->handle, &read, &write);
@@ -426,7 +427,7 @@ static bool has_extended_busy_ioctl(int fd)
 
 static void basic(int fd, unsigned ring, unsigned flags)
 {
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, ring, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd, .engine = ring);
 	struct timespec tv;
 	int timeout;
 	bool busy;
diff --git a/tests/gem_ctx_isolation.c b/tests/gem_ctx_isolation.c
index fe7d3490c..2e19e8c03 100644
--- a/tests/gem_ctx_isolation.c
+++ b/tests/gem_ctx_isolation.c
@@ -502,7 +502,7 @@ static void isolation(int fd,
 		ctx[0] = gem_context_create(fd);
 		regs[0] = read_regs(fd, ctx[0], e, flags);
 
-		spin = igt_spin_batch_new(fd, ctx[0], engine, 0);
+		spin = igt_spin_batch_new(fd, .ctx = ctx[0], .engine = engine);
 
 		if (flags & DIRTY1) {
 			igt_debug("%s[%d]: Setting all registers of ctx 0 to 0x%08x\n",
@@ -557,8 +557,11 @@ static void isolation(int fd,
 
 static void inject_reset_context(int fd, unsigned int engine)
 {
+	struct igt_spin_factory opts = {
+		.ctx = gem_context_create(fd),
+		.engine = engine,
+	};
 	igt_spin_t *spin;
-	uint32_t ctx;
 
 	/*
 	 * Force a context switch before triggering the reset, or else
@@ -566,19 +569,20 @@ static void inject_reset_context(int fd, unsigned int engine)
 	 * HW for screwing up if the context was already broken.
 	 */
 
-	ctx = gem_context_create(fd);
-	if (gem_can_store_dword(fd, engine)) {
-		spin = __igt_spin_batch_new_poll(fd, ctx, engine);
+	if (gem_can_store_dword(fd, engine))
+		opts.flags |= IGT_SPIN_POLL_RUN;
+
+	spin = __igt_spin_batch_factory(fd, &opts);
+
+	if (spin->running)
 		igt_spin_busywait_until_running(spin);
-	} else {
-		spin = __igt_spin_batch_new(fd, ctx, engine, 0);
+	else
 		usleep(1000); /* better than nothing */
-	}
 
 	igt_force_gpu_reset(fd);
 
 	igt_spin_batch_free(fd, spin);
-	gem_context_destroy(fd, ctx);
+	gem_context_destroy(fd, opts.ctx);
 }
 
 static void preservation(int fd,
@@ -604,7 +608,7 @@ static void preservation(int fd,
 	gem_quiescent_gpu(fd);
 
 	ctx[num_values] = gem_context_create(fd);
-	spin = igt_spin_batch_new(fd, ctx[num_values], engine, 0);
+	spin = igt_spin_batch_new(fd, .ctx = ctx[num_values], .engine = engine);
 	regs[num_values][0] = read_regs(fd, ctx[num_values], e, flags);
 	for (int v = 0; v < num_values; v++) {
 		ctx[v] = gem_context_create(fd);
@@ -644,7 +648,7 @@ static void preservation(int fd,
 		break;
 	}
 
-	spin = igt_spin_batch_new(fd, ctx[num_values], engine, 0);
+	spin = igt_spin_batch_new(fd, .ctx = ctx[num_values], .engine = engine);
 	for (int v = 0; v < num_values; v++)
 		regs[v][1] = read_regs(fd, ctx[v], e, flags);
 	regs[num_values][1] = read_regs(fd, ctx[num_values], e, flags);
diff --git a/tests/gem_eio.c b/tests/gem_eio.c
index 5faf7502b..0ec1aaec9 100644
--- a/tests/gem_eio.c
+++ b/tests/gem_eio.c
@@ -157,10 +157,15 @@ static int __gem_wait(int fd, uint32_t handle, int64_t timeout)
 
 static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
 {
-	if (gem_can_store_dword(fd, flags))
-		return __igt_spin_batch_new_poll(fd, ctx, flags);
-	else
-		return __igt_spin_batch_new(fd, ctx, flags, 0);
+	struct igt_spin_factory opts = {
+		.ctx = ctx,
+		.engine = flags,
+	};
+
+	if (gem_can_store_dword(fd, opts.engine))
+		opts.flags |= IGT_SPIN_POLL_RUN;
+
+	return __igt_spin_batch_factory(fd, &opts);
 }
 
 static void __spin_wait(int fd, igt_spin_t *spin)
diff --git a/tests/gem_exec_fence.c b/tests/gem_exec_fence.c
index eb93308d1..ba46595d3 100644
--- a/tests/gem_exec_fence.c
+++ b/tests/gem_exec_fence.c
@@ -468,7 +468,7 @@ static void test_parallel(int fd, unsigned int master)
 	/* Fill the queue with many requests so that the next one has to
 	 * wait before it can be executed by the hardware.
 	 */
-	spin = igt_spin_batch_new(fd, 0, master, plug);
+	spin = igt_spin_batch_new(fd, .engine = master, .dependency = plug);
 	resubmit(fd, spin->handle, master, 16);
 
 	/* Now queue the master request and its secondaries */
@@ -651,7 +651,7 @@ static void test_keep_in_fence(int fd, unsigned int engine, unsigned int flags)
 	igt_spin_t *spin;
 	int fence;
 
-	spin = igt_spin_batch_new(fd, 0, engine, 0);
+	spin = igt_spin_batch_new(fd, .engine = engine);
 
 	gem_execbuf_wr(fd, &execbuf);
 	fence = upper_32_bits(execbuf.rsvd2);
@@ -1070,7 +1070,7 @@ static void test_syncobj_unused_fence(int fd)
 	struct local_gem_exec_fence fence = {
 		.handle = syncobj_create(fd),
 	};
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd);
 
 	/* sanity check our syncobj_to_sync_file interface */
 	igt_assert_eq(__syncobj_to_sync_file(fd, 0), -ENOENT);
@@ -1162,7 +1162,7 @@ static void test_syncobj_signal(int fd)
 	struct local_gem_exec_fence fence = {
 		.handle = syncobj_create(fd),
 	};
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd);
 
 	/* Check that the syncobj is signaled only when our request/fence is */
 
@@ -1212,7 +1212,7 @@ static void test_syncobj_wait(int fd)
 
 	gem_quiescent_gpu(fd);
 
-	spin = igt_spin_batch_new(fd, 0, 0, 0);
+	spin = igt_spin_batch_new(fd);
 
 	memset(&execbuf, 0, sizeof(execbuf));
 	execbuf.buffers_ptr = to_user_pointer(&obj);
@@ -1282,7 +1282,7 @@ static void test_syncobj_export(int fd)
 		.handle = syncobj_create(fd),
 	};
 	int export[2];
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd);
 
 	/* Check that if we export the syncobj prior to use it picks up
 	 * the later fence. This allows a syncobj to establish a channel
@@ -1340,7 +1340,7 @@ static void test_syncobj_repeat(int fd)
 	struct drm_i915_gem_execbuffer2 execbuf;
 	struct local_gem_exec_fence *fence;
 	int export;
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd);
 
 	/* Check that we can wait on the same fence multiple times */
 	fence = calloc(nfences, sizeof(*fence));
@@ -1395,7 +1395,7 @@ static void test_syncobj_import(int fd)
 	const uint32_t bbe = MI_BATCH_BUFFER_END;
 	struct drm_i915_gem_exec_object2 obj;
 	struct drm_i915_gem_execbuffer2 execbuf;
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
+	igt_spin_t *spin = igt_spin_batch_new(fd);
 	uint32_t sync = syncobj_create(fd);
 	int fence;
 
diff --git a/tests/gem_exec_latency.c b/tests/gem_exec_latency.c
index ea2e4c681..75811f325 100644
--- a/tests/gem_exec_latency.c
+++ b/tests/gem_exec_latency.c
@@ -63,6 +63,10 @@ static unsigned int ring_size;
 static void
 poll_ring(int fd, unsigned ring, const char *name)
 {
+	const struct igt_spin_factory opts = {
+		.engine = ring,
+		.flags = IGT_SPIN_POLL_RUN,
+	};
 	struct timespec tv = {};
 	unsigned long cycles;
 	igt_spin_t *spin[2];
@@ -72,11 +76,11 @@ poll_ring(int fd, unsigned ring, const char *name)
 	gem_require_ring(fd, ring);
 	igt_require(gem_can_store_dword(fd, ring));
 
-	spin[0] = __igt_spin_batch_new_poll(fd, 0, ring);
+	spin[0] = __igt_spin_batch_factory(fd, &opts);
 	igt_assert(spin[0]->running);
 	cmd = *spin[0]->batch;
 
-	spin[1] = __igt_spin_batch_new_poll(fd, 0, ring);
+	spin[1] = __igt_spin_batch_factory(fd, &opts);
 	igt_assert(spin[1]->running);
 	igt_assert(cmd == *spin[1]->batch);
 
@@ -312,7 +316,9 @@ static void latency_from_ring(int fd,
 			       I915_GEM_DOMAIN_GTT);
 
 		if (flags & PREEMPT)
-			spin = __igt_spin_batch_new(fd, ctx[0], ring, 0);
+			spin = __igt_spin_batch_new(fd,
+						    .ctx = ctx[0],
+						    .engine = ring);
 
 		if (flags & CORK) {
 			obj[0].handle = igt_cork_plug(&c, fd);
@@ -456,6 +462,10 @@ rthog_latency_on_ring(int fd, unsigned int engine, const char *name, unsigned in
 	};
 #define NPASS ARRAY_SIZE(passname)
 #define MMAP_SZ (64 << 10)
+	const struct igt_spin_factory opts = {
+		.engine = engine,
+		.flags = IGT_SPIN_POLL_RUN,
+	};
 	struct rt_pkt *results;
 	unsigned int engines[16];
 	const char *names[16];
@@ -513,7 +523,7 @@ rthog_latency_on_ring(int fd, unsigned int engine, const char *name, unsigned in
 
 			usleep(250);
 
-			spin = __igt_spin_batch_new_poll(fd, 0, engine);
+			spin = __igt_spin_batch_factory(fd, &opts);
 			if (!spin) {
 				igt_warn("Failed to create spinner! (%s)\n",
 					 passname[pass]);
diff --git a/tests/gem_exec_nop.c b/tests/gem_exec_nop.c
index 0523b1c02..74d27522d 100644
--- a/tests/gem_exec_nop.c
+++ b/tests/gem_exec_nop.c
@@ -709,7 +709,9 @@ static void preempt(int fd, uint32_t handle,
 	clock_gettime(CLOCK_MONOTONIC, &start);
 	do {
 		igt_spin_t *spin =
-			__igt_spin_batch_new(fd, ctx[0], ring_id, 0);
+			__igt_spin_batch_new(fd,
+					     .ctx = ctx[0],
+					     .engine = ring_id);
 
 		for (int loop = 0; loop < 1024; loop++)
 			gem_execbuf(fd, &execbuf);
diff --git a/tests/gem_exec_reloc.c b/tests/gem_exec_reloc.c
index 91c6691af..837f60a6c 100644
--- a/tests/gem_exec_reloc.c
+++ b/tests/gem_exec_reloc.c
@@ -388,7 +388,9 @@ static void basic_reloc(int fd, unsigned before, unsigned after, unsigned flags)
 		}
 
 		if (flags & ACTIVE) {
-			spin = igt_spin_batch_new(fd, 0, I915_EXEC_DEFAULT, obj.handle);
+			spin = igt_spin_batch_new(fd,
+						  .engine = I915_EXEC_DEFAULT,
+						  .dependency = obj.handle);
 			if (!(flags & HANG))
 				igt_spin_batch_set_timeout(spin, NSEC_PER_SEC/100);
 			igt_assert(gem_bo_busy(fd, obj.handle));
@@ -454,7 +456,9 @@ static void basic_reloc(int fd, unsigned before, unsigned after, unsigned flags)
 		}
 
 		if (flags & ACTIVE) {
-			spin = igt_spin_batch_new(fd, 0, I915_EXEC_DEFAULT, obj.handle);
+			spin = igt_spin_batch_new(fd,
+						  .engine = I915_EXEC_DEFAULT,
+						  .dependency = obj.handle);
 			if (!(flags & HANG))
 				igt_spin_batch_set_timeout(spin, NSEC_PER_SEC/100);
 			igt_assert(gem_bo_busy(fd, obj.handle));
@@ -581,7 +585,7 @@ static void basic_range(int fd, unsigned flags)
 	execbuf.buffer_count = n + 1;
 
 	if (flags & ACTIVE) {
-		spin = igt_spin_batch_new(fd, 0, 0, obj[n].handle);
+		spin = igt_spin_batch_new(fd, .dependency = obj[n].handle);
 		if (!(flags & HANG))
 			igt_spin_batch_set_timeout(spin, NSEC_PER_SEC/100);
 		igt_assert(gem_bo_busy(fd, obj[n].handle));
diff --git a/tests/gem_exec_schedule.c b/tests/gem_exec_schedule.c
index 1f43147f7..35a44ab10 100644
--- a/tests/gem_exec_schedule.c
+++ b/tests/gem_exec_schedule.c
@@ -132,9 +132,12 @@ static void unplug_show_queue(int fd, struct igt_cork *c, unsigned int engine)
 	igt_spin_t *spin[MAX_ELSP_QLEN];
 
 	for (int n = 0; n < ARRAY_SIZE(spin); n++) {
-		uint32_t ctx = create_highest_priority(fd);
-		spin[n] = __igt_spin_batch_new(fd, ctx, engine, 0);
-		gem_context_destroy(fd, ctx);
+		const struct igt_spin_factory opts = {
+			.ctx = create_highest_priority(fd),
+			.engine = engine,
+		};
+		spin[n] = __igt_spin_batch_factory(fd, &opts);
+		gem_context_destroy(fd, opts.ctx);
 	}
 
 	igt_cork_unplug(c); /* batches will now be queued on the engine */
@@ -196,7 +199,7 @@ static void independent(int fd, unsigned int engine)
 			continue;
 
 		if (spin == NULL) {
-			spin = __igt_spin_batch_new(fd, 0, other, 0);
+			spin = __igt_spin_batch_new(fd, .engine = other);
 		} else {
 			struct drm_i915_gem_exec_object2 obj = {
 				.handle = spin->handle,
@@ -428,7 +431,9 @@ static void preempt(int fd, unsigned ring, unsigned flags)
 			ctx[LO] = gem_context_create(fd);
 			gem_context_set_priority(fd, ctx[LO], MIN_PRIO);
 		}
-		spin[n] = __igt_spin_batch_new(fd, ctx[LO], ring, 0);
+		spin[n] = __igt_spin_batch_new(fd,
+					       .ctx = ctx[LO],
+					       .engine = ring);
 		igt_debug("spin[%d].handle=%d\n", n, spin[n]->handle);
 
 		store_dword(fd, ctx[HI], ring, result, 0, n + 1, 0, I915_GEM_DOMAIN_RENDER);
@@ -462,7 +467,9 @@ static igt_spin_t *__noise(int fd, uint32_t ctx, int prio, igt_spin_t *spin)
 
 	for_each_physical_engine(fd, other) {
 		if (spin == NULL) {
-			spin = __igt_spin_batch_new(fd, ctx, other, 0);
+			spin = __igt_spin_batch_new(fd,
+						    .ctx = ctx,
+						    .engine = other);
 		} else {
 			struct drm_i915_gem_exec_object2 obj = {
 				.handle = spin->handle,
@@ -672,7 +679,9 @@ static void preempt_self(int fd, unsigned ring)
 	n = 0;
 	gem_context_set_priority(fd, ctx[HI], MIN_PRIO);
 	for_each_physical_engine(fd, other) {
-		spin[n] = __igt_spin_batch_new(fd, ctx[NOISE], other, 0);
+		spin[n] = __igt_spin_batch_new(fd,
+					       .ctx = ctx[NOISE],
+					       .engine = other);
 		store_dword(fd, ctx[HI], other,
 			    result, (n + 1)*sizeof(uint32_t), n + 1,
 			    0, I915_GEM_DOMAIN_RENDER);
@@ -714,7 +723,9 @@ static void preemptive_hang(int fd, unsigned ring)
 		ctx[LO] = gem_context_create(fd);
 		gem_context_set_priority(fd, ctx[LO], MIN_PRIO);
 
-		spin[n] = __igt_spin_batch_new(fd, ctx[LO], ring, 0);
+		spin[n] = __igt_spin_batch_new(fd,
+					       .ctx = ctx[LO],
+					       .engine = ring);
 
 		gem_context_destroy(fd, ctx[LO]);
 	}
diff --git a/tests/gem_exec_suspend.c b/tests/gem_exec_suspend.c
index db2bca262..43c52d105 100644
--- a/tests/gem_exec_suspend.c
+++ b/tests/gem_exec_suspend.c
@@ -189,7 +189,7 @@ static void run_test(int fd, unsigned engine, unsigned flags)
 	}
 
 	if (flags & HANG)
-		spin = igt_spin_batch_new(fd, 0, engine, 0);
+		spin = igt_spin_batch_new(fd, .engine = engine);
 
 	switch (mode(flags)) {
 	case NOSLEEP:
diff --git a/tests/gem_fenced_exec_thrash.c b/tests/gem_fenced_exec_thrash.c
index 385790ada..7248d310d 100644
--- a/tests/gem_fenced_exec_thrash.c
+++ b/tests/gem_fenced_exec_thrash.c
@@ -132,7 +132,7 @@ static void run_test(int fd, int num_fences, int expected_errno,
 			igt_spin_t *spin = NULL;
 
 			if (flags & BUSY_LOAD)
-				spin = __igt_spin_batch_new(fd, 0, 0, 0);
+				spin = __igt_spin_batch_new(fd);
 
 			igt_while_interruptible(flags & INTERRUPTIBLE) {
 				igt_assert_eq(__gem_execbuf(fd, &execbuf[i]),
diff --git a/tests/gem_shrink.c b/tests/gem_shrink.c
index 3d33453aa..929e0426a 100644
--- a/tests/gem_shrink.c
+++ b/tests/gem_shrink.c
@@ -346,9 +346,9 @@ static void reclaim(unsigned engine, int timeout)
 		} while (!*shared);
 	}
 
-	spin = igt_spin_batch_new(fd, 0, engine, 0);
+	spin = igt_spin_batch_new(fd, .engine = engine);
 	igt_until_timeout(timeout) {
-		igt_spin_t *next = __igt_spin_batch_new(fd, 0, engine, 0);
+		igt_spin_t *next = __igt_spin_batch_new(fd, .engine = engine);
 
 		igt_spin_batch_set_timeout(spin, timeout_100ms);
 		gem_sync(fd, spin->handle);
diff --git a/tests/gem_spin_batch.c b/tests/gem_spin_batch.c
index cffeb6d71..52410010b 100644
--- a/tests/gem_spin_batch.c
+++ b/tests/gem_spin_batch.c
@@ -41,9 +41,9 @@ static void spin(int fd, unsigned int engine, unsigned int timeout_sec)
 	struct timespec itv = { };
 	uint64_t elapsed;
 
-	spin = __igt_spin_batch_new(fd, 0, engine, 0);
+	spin = __igt_spin_batch_new(fd, .engine = engine);
 	while ((elapsed = igt_nsec_elapsed(&tv)) >> 30 < timeout_sec) {
-		igt_spin_t *next = __igt_spin_batch_new(fd, 0, engine, 0);
+		igt_spin_t *next = __igt_spin_batch_new(fd, .engine = engine);
 
 		igt_spin_batch_set_timeout(spin,
 					   timeout_100ms - igt_nsec_elapsed(&itv));
diff --git a/tests/gem_sync.c b/tests/gem_sync.c
index 1e2e089a1..2fcb9aa01 100644
--- a/tests/gem_sync.c
+++ b/tests/gem_sync.c
@@ -715,9 +715,8 @@ preempt(int fd, unsigned ring, int num_children, int timeout)
 		do {
 			igt_spin_t *spin =
 				__igt_spin_batch_new(fd,
-						     ctx[0],
-						     execbuf.flags,
-						     0);
+						     .ctx = ctx[0],
+						     .engine = execbuf.flags);
 
 			do {
 				gem_execbuf(fd, &execbuf);
diff --git a/tests/gem_wait.c b/tests/gem_wait.c
index 61d8a4059..7914c9365 100644
--- a/tests/gem_wait.c
+++ b/tests/gem_wait.c
@@ -74,7 +74,9 @@ static void basic(int fd, unsigned engine, unsigned flags)
 	IGT_CORK_HANDLE(cork);
 	uint32_t plug =
 		flags & (WRITE | AWAIT) ? igt_cork_plug(&cork, fd) : 0;
-	igt_spin_t *spin = igt_spin_batch_new(fd, 0, engine, plug);
+	igt_spin_t *spin = igt_spin_batch_new(fd,
+					      .engine = engine,
+					      .dependency = plug);
 	struct drm_i915_gem_wait wait = {
 		flags & WRITE ? plug : spin->handle
 	};
diff --git a/tests/kms_busy.c b/tests/kms_busy.c
index 4a4e0e156..abf39828b 100644
--- a/tests/kms_busy.c
+++ b/tests/kms_busy.c
@@ -84,7 +84,8 @@ static void flip_to_fb(igt_display_t *dpy, int pipe,
 	struct drm_event_vblank ev;
 
 	igt_spin_t *t = igt_spin_batch_new(dpy->drm_fd,
-					   0, ring, fb->gem_handle);
+					   .engine = ring,
+					   .dependency = fb->gem_handle);
 
 	if (modeset) {
 		/*
@@ -200,7 +201,8 @@ static void test_atomic_commit_hang(igt_display_t *dpy, igt_plane_t *primary,
 				    struct igt_fb *busy_fb, unsigned ring)
 {
 	igt_spin_t *t = igt_spin_batch_new(dpy->drm_fd,
-					   0, ring, busy_fb->gem_handle);
+					   .engine = ring,
+					   .dependency = busy_fb->gem_handle);
 	struct pollfd pfd = { .fd = dpy->drm_fd, .events = POLLIN };
 	unsigned flags = 0;
 	struct drm_event_vblank ev;
@@ -287,7 +289,9 @@ static void test_pageflip_modeset_hang(igt_display_t *dpy,
 
 	igt_display_commit2(dpy, dpy->is_atomic ? COMMIT_ATOMIC : COMMIT_LEGACY);
 
-	t = igt_spin_batch_new(dpy->drm_fd, 0, ring, fb.gem_handle);
+	t = igt_spin_batch_new(dpy->drm_fd,
+			       .engine = ring,
+			       .dependency = fb.gem_handle);
 
 	do_or_die(drmModePageFlip(dpy->drm_fd, dpy->pipes[pipe].crtc_id, fb.fb_id, DRM_MODE_PAGE_FLIP_EVENT, &fb));
 
diff --git a/tests/kms_cursor_legacy.c b/tests/kms_cursor_legacy.c
index d0a28b3c4..85340d43e 100644
--- a/tests/kms_cursor_legacy.c
+++ b/tests/kms_cursor_legacy.c
@@ -532,7 +532,8 @@ static void basic_flip_cursor(igt_display_t *display,
 
 		spin = NULL;
 		if (flags & BASIC_BUSY)
-			spin = igt_spin_batch_new(display->drm_fd, 0, 0, fb_info.gem_handle);
+			spin = igt_spin_batch_new(display->drm_fd,
+						  .dependency = fb_info.gem_handle);
 
 		/* Start with a synchronous query to align with the vblank */
 		vblank_start = get_vblank(display->drm_fd, pipe, DRM_VBLANK_NEXTONMISS);
@@ -1323,8 +1324,8 @@ static void flip_vs_cursor_busy_crc(igt_display_t *display, bool atomic)
 	for (int i = 1; i >= 0; i--) {
 		igt_spin_t *spin;
 
-		spin = igt_spin_batch_new(display->drm_fd, 0, 0,
-					  fb_info[1].gem_handle);
+		spin = igt_spin_batch_new(display->drm_fd,
+					  .dependency = fb_info[1].gem_handle);
 
 		vblank_start = get_vblank(display->drm_fd, pipe, DRM_VBLANK_NEXTONMISS);
 
diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index 4570f926d..a1d36ac4f 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -172,10 +172,15 @@ static unsigned int e2ring(int gem_fd, const struct intel_execution_engine2 *e)
 
 static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
 {
+	struct igt_spin_factory opts = {
+		.ctx = ctx,
+		.engine = flags,
+	};
+
 	if (gem_can_store_dword(fd, flags))
-		return __igt_spin_batch_new_poll(fd, ctx, flags);
-	else
-		return __igt_spin_batch_new(fd, ctx, flags, 0);
+		opts.flags |= IGT_SPIN_POLL_RUN;
+
+	return __igt_spin_batch_factory(fd, &opts);
 }
 
 static unsigned long __spin_wait(int fd, igt_spin_t *spin)
@@ -356,7 +361,9 @@ busy_double_start(int gem_fd, const struct intel_execution_engine2 *e)
 	 */
 	spin[0] = __spin_sync(gem_fd, 0, e2ring(gem_fd, e));
 	usleep(500e3);
-	spin[1] = __igt_spin_batch_new(gem_fd, ctx, e2ring(gem_fd, e), 0);
+	spin[1] = __igt_spin_batch_new(gem_fd,
+				       .ctx = ctx,
+				       .engine = e2ring(gem_fd, e));
 
 	/*
 	 * Open PMU as fast as possible after the second spin batch in attempt
@@ -1045,8 +1052,8 @@ static void cpu_hotplug(int gem_fd)
 	 * Create two spinners so test can ensure shorter gaps in engine
 	 * busyness as it is terminating one and re-starting the other.
 	 */
-	spin[0] = igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER, 0);
-	spin[1] = __igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER, 0);
+	spin[0] = igt_spin_batch_new(gem_fd, .engine = I915_EXEC_RENDER);
+	spin[1] = __igt_spin_batch_new(gem_fd, .engine = I915_EXEC_RENDER);
 
 	val = __pmu_read_single(fd, &ts[0]);
 
@@ -1129,8 +1136,8 @@ static void cpu_hotplug(int gem_fd)
 			break;
 
 		igt_spin_batch_free(gem_fd, spin[cur]);
-		spin[cur] = __igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER,
-						 0);
+		spin[cur] = __igt_spin_batch_new(gem_fd,
+						 .engine = I915_EXEC_RENDER);
 		cur ^= 1;
 	}
 
@@ -1167,8 +1174,9 @@ test_interrupts(int gem_fd)
 
 	/* Queue spinning batches. */
 	for (int i = 0; i < target; i++) {
-		spin[i] = __igt_spin_batch_new_fence(gem_fd,
-						     0, I915_EXEC_RENDER);
+		spin[i] = __igt_spin_batch_new(gem_fd,
+					       .engine = I915_EXEC_RENDER,
+					       .flags = IGT_SPIN_FENCE_OUT);
 		if (i == 0) {
 			fence_fd = spin[i]->out_fence;
 		} else {
@@ -1229,7 +1237,8 @@ test_interrupts_sync(int gem_fd)
 
 	/* Queue spinning batches. */
 	for (int i = 0; i < target; i++)
-		spin[i] = __igt_spin_batch_new_fence(gem_fd, 0, 0);
+		spin[i] = __igt_spin_batch_new(gem_fd,
+					       .flags = IGT_SPIN_FENCE_OUT);
 
 	/* Wait for idle state. */
 	idle = pmu_read_single(fd);
@@ -1550,7 +1559,7 @@ accuracy(int gem_fd, const struct intel_execution_engine2 *e,
 		igt_spin_t *spin;
 
 		/* Allocate our spin batch and idle it. */
-		spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
+		spin = igt_spin_batch_new(gem_fd, .engine = e2ring(gem_fd, e));
 		igt_spin_batch_end(spin);
 		gem_sync(gem_fd, spin->handle);
 
diff --git a/tests/pm_rps.c b/tests/pm_rps.c
index 006d084b8..202132b1c 100644
--- a/tests/pm_rps.c
+++ b/tests/pm_rps.c
@@ -235,9 +235,9 @@ static void load_helper_run(enum load load)
 
 		igt_debug("Applying %s load...\n", lh.load ? "high" : "low");
 
-		spin[0] = __igt_spin_batch_new(drm_fd, 0, 0, 0);
+		spin[0] = __igt_spin_batch_new(drm_fd);
 		if (lh.load == HIGH)
-			spin[1] = __igt_spin_batch_new(drm_fd, 0, 0, 0);
+			spin[1] = __igt_spin_batch_new(drm_fd);
 		while (!lh.exit) {
 			handle = spin[0]->handle;
 			igt_spin_batch_end(spin[0]);
@@ -248,8 +248,7 @@ static void load_helper_run(enum load load)
 			usleep(100);
 
 			spin[0] = spin[1];
-			spin[lh.load == HIGH] =
-				__igt_spin_batch_new(drm_fd, 0, 0, 0);
+			spin[lh.load == HIGH] = __igt_spin_batch_new(drm_fd);
 		}
 
 		handle = spin[0]->handle;
@@ -510,7 +509,7 @@ static void boost_freq(int fd, int *boost_freqs)
 	int64_t timeout = 1;
 	igt_spin_t *load;
 
-	load = igt_spin_batch_new(fd, 0, 0, 0);
+	load = igt_spin_batch_new(fd);
 	resubmit_batch(fd, load->handle, 16);
 
 	/* Waiting will grant us a boost to maximum */
-- 
2.18.0

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 68+ messages in thread
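
The net effect of the conversion above is that spinner construction now
takes designated initialisers, so callers name only the options they
need. A minimal sketch of the resulting API (assuming an open i915 fd,
and using only names introduced by this patch):

	igt_spin_t *spin;

	/* defaults: ctx 0, default engine, no dependency, no flags */
	spin = igt_spin_batch_new(fd);
	igt_spin_batch_free(fd, spin);

	/* named options, in any order and combination */
	spin = igt_spin_batch_new(fd,
				  .engine = I915_EXEC_RENDER,
				  .flags = IGT_SPIN_FENCE_OUT);
	igt_spin_batch_free(fd, spin);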

* [PATCH i-g-t 05/17] lib: Spin fast, retire early
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

When using the pollable spinner, we often want to use it as a means of
ensuring the task is running on the GPU before switching to something
else. In that case we don't want to add extra delay inside the spinner,
but the current 1000 NOPs add on the order of 5us, which is often
larger than the target latency.

v2: Don't change perf_pmu as that is sensitive to the extra CPU latency
from a tight GPU spinner.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com> #v1
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
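A usage sketch (not part of the change itself, and assuming an fd and
ring from the surrounding test): a latency-sensitive caller opts out of
the NOP padding by adding the new flag to the factory options, e.g.

	spin = __igt_spin_batch_new(fd,
				    .engine = ring,
				    .flags = IGT_SPIN_POLL_RUN |
					     IGT_SPIN_FAST);
	igt_spin_busywait_until_running(spin);
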
 lib/igt_dummyload.c       | 3 ++-
 lib/igt_dummyload.h       | 1 +
 tests/gem_ctx_isolation.c | 1 +
 tests/gem_eio.c           | 1 +
 tests/gem_exec_latency.c  | 4 ++--
 5 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
index 94efdf745..7beb66244 100644
--- a/lib/igt_dummyload.c
+++ b/lib/igt_dummyload.c
@@ -199,7 +199,8 @@ emit_recursive_batch(igt_spin_t *spin,
 	 * between function calls, that appears enough to keep SNB out of
 	 * trouble. See https://bugs.freedesktop.org/show_bug.cgi?id=102262
 	 */
-	batch += 1000;
+	if (!(opts->flags & IGT_SPIN_FAST))
+		batch += 1000;
 
 	/* recurse */
 	r = &relocs[obj[BATCH].relocation_count++];
diff --git a/lib/igt_dummyload.h b/lib/igt_dummyload.h
index c794f2544..e80a12451 100644
--- a/lib/igt_dummyload.h
+++ b/lib/igt_dummyload.h
@@ -52,6 +52,7 @@ struct igt_spin_factory {
 
 #define IGT_SPIN_FENCE_OUT (1 << 0)
 #define IGT_SPIN_POLL_RUN  (1 << 1)
+#define IGT_SPIN_FAST      (1 << 2)
 
 igt_spin_t *
 __igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts);
diff --git a/tests/gem_ctx_isolation.c b/tests/gem_ctx_isolation.c
index 2e19e8c03..4325e1c28 100644
--- a/tests/gem_ctx_isolation.c
+++ b/tests/gem_ctx_isolation.c
@@ -560,6 +560,7 @@ static void inject_reset_context(int fd, unsigned int engine)
 	struct igt_spin_factory opts = {
 		.ctx = gem_context_create(fd),
 		.engine = engine,
+		.flags = IGT_SPIN_FAST,
 	};
 	igt_spin_t *spin;
 
diff --git a/tests/gem_eio.c b/tests/gem_eio.c
index 0ec1aaec9..3162a3170 100644
--- a/tests/gem_eio.c
+++ b/tests/gem_eio.c
@@ -160,6 +160,7 @@ static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
 	struct igt_spin_factory opts = {
 		.ctx = ctx,
 		.engine = flags,
+		.flags = IGT_SPIN_FAST,
 	};
 
 	if (gem_can_store_dword(fd, opts.engine))
diff --git a/tests/gem_exec_latency.c b/tests/gem_exec_latency.c
index 75811f325..de16322a6 100644
--- a/tests/gem_exec_latency.c
+++ b/tests/gem_exec_latency.c
@@ -65,7 +65,7 @@ poll_ring(int fd, unsigned ring, const char *name)
 {
 	const struct igt_spin_factory opts = {
 		.engine = ring,
-		.flags = IGT_SPIN_POLL_RUN,
+		.flags = IGT_SPIN_POLL_RUN | IGT_SPIN_FAST,
 	};
 	struct timespec tv = {};
 	unsigned long cycles;
@@ -464,7 +464,7 @@ rthog_latency_on_ring(int fd, unsigned int engine, const char *name, unsigned in
 #define MMAP_SZ (64 << 10)
 	const struct igt_spin_factory opts = {
 		.engine = engine,
-		.flags = IGT_SPIN_POLL_RUN,
+		.flags = IGT_SPIN_POLL_RUN | IGT_SPIN_FAST,
 	};
 	struct rt_pkt *results;
 	unsigned int engines[16];
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH i-g-t 06/17] igt/gem_sync: Alternate stress for nop+sync
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

Apply a different sort of stress by timing how long it takes to sync a
second nop batch in the pipeline. We first start a spinner on the
engine; then, once we know the GPU is active, we submit the second nop
and start timing as we release the spinner and wait for the nop to
complete.

As with every other gem_sync test, it serves two roles. The first is
that it checks that we do not miss a wakeup under common stressful
conditions (the more conditions we check, the happier we will be that
missed wakeups do not occur in practice). The second role it fulfils is
that it provides a very crude estimate of how long it takes for a nop
to execute from a running start (we already have a complementary
estimate for an idle start).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
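In outline, each measurement cycle in the new wakeup_ring() below does
the following (condensed from the diff; names as in the test):

	*spin->batch = cmd;			/* re-arm the spinner */
	*spin->running = 0;
	gem_execbuf(fd, &spin->execbuf);
	while (!READ_ONCE(*spin->running))
		;				/* GPU is now running */
	gem_execbuf(fd, &execbuf);		/* queue the nop behind it */
	this = gettime();
	igt_spin_batch_end(spin);		/* release the spinner */
	gem_sync(fd, object.handle);		/* wakeup under test */
	elapsed += gettime() - this;
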
 tests/gem_sync.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/tests/gem_sync.c b/tests/gem_sync.c
index 2fcb9aa01..747412c5d 100644
--- a/tests/gem_sync.c
+++ b/tests/gem_sync.c
@@ -177,6 +177,95 @@ idle_ring(int fd, unsigned ring, int timeout)
 	gem_close(fd, object.handle);
 }
 
+static void
+wakeup_ring(int fd, unsigned ring, int timeout)
+{
+	unsigned engines[16];
+	const char *names[16];
+	int num_engines = 0;
+
+	if (ring == ALL_ENGINES) {
+		for_each_physical_engine(fd, ring) {
+			if (!gem_can_store_dword(fd, ring))
+				continue;
+
+			names[num_engines] = e__->name;
+			engines[num_engines++] = ring;
+			if (num_engines == ARRAY_SIZE(engines))
+				break;
+		}
+		igt_require(num_engines);
+	} else {
+		gem_require_ring(fd, ring);
+		igt_require(gem_can_store_dword(fd, ring));
+		names[num_engines] = NULL;
+		engines[num_engines++] = ring;
+	}
+
+	intel_detect_and_clear_missed_interrupts(fd);
+	igt_fork(child, num_engines) {
+		const uint32_t bbe = MI_BATCH_BUFFER_END;
+		struct drm_i915_gem_exec_object2 object;
+		struct drm_i915_gem_execbuffer2 execbuf;
+		double end, this, elapsed, now;
+		unsigned long cycles;
+		uint32_t cmd;
+		igt_spin_t *spin;
+
+		memset(&object, 0, sizeof(object));
+		object.handle = gem_create(fd, 4096);
+		gem_write(fd, object.handle, 0, &bbe, sizeof(bbe));
+
+		memset(&execbuf, 0, sizeof(execbuf));
+		execbuf.buffers_ptr = to_user_pointer(&object);
+		execbuf.buffer_count = 1;
+		execbuf.flags = engines[child % num_engines];
+
+		spin = __igt_spin_batch_new(fd,
+					    .engine = execbuf.flags,
+					    .flags = (IGT_SPIN_POLL_RUN |
+						      IGT_SPIN_FAST));
+		igt_assert(spin->running);
+		cmd = *spin->batch;
+
+		gem_execbuf(fd, &execbuf);
+
+		igt_spin_batch_end(spin);
+		gem_sync(fd, object.handle);
+
+		end = gettime() + timeout;
+		elapsed = 0;
+		cycles = 0;
+		do {
+			*spin->batch = cmd;
+			*spin->running = 0;
+			gem_execbuf(fd, &spin->execbuf);
+			while (!READ_ONCE(*spin->running))
+				;
+
+			gem_execbuf(fd, &execbuf);
+
+			this = gettime();
+			igt_spin_batch_end(spin);
+			gem_sync(fd, object.handle);
+			now = gettime();
+
+			elapsed += now - this;
+			cycles++;
+		} while (now < end);
+
+		igt_info("%s%sompleted %ld cycles: %.3f us\n",
+			 names[child % num_engines] ?: "",
+			 names[child % num_engines] ? " c" : "C",
+			 cycles, elapsed*1e6/cycles);
+
+		igt_spin_batch_free(fd, spin);
+		gem_close(fd, object.handle);
+	}
+	igt_waitchildren_timeout(timeout+10, NULL);
+	igt_assert_eq(intel_detect_and_clear_missed_interrupts(fd), 0);
+}
+
 static void
 store_ring(int fd, unsigned ring, int num_children, int timeout)
 {
@@ -761,6 +850,8 @@ igt_main
 			sync_ring(fd, e->exec_id | e->flags, 1, 150);
 		igt_subtest_f("idle-%s", e->name)
 			idle_ring(fd, e->exec_id | e->flags, 150);
+		igt_subtest_f("wakeup-%s", e->name)
+			wakeup_ring(fd, e->exec_id | e->flags, 150);
 		igt_subtest_f("store-%s", e->name)
 			store_ring(fd, e->exec_id | e->flags, 1, 150);
 		igt_subtest_f("many-%s", e->name)
@@ -781,6 +872,8 @@ igt_main
 		sync_ring(fd, ALL_ENGINES, ncpus, 150);
 	igt_subtest("forked-store-each")
 		store_ring(fd, ALL_ENGINES, ncpus, 150);
+	igt_subtest("wakeup-each")
+		wakeup_ring(fd, ALL_ENGINES, 150);
 
 	igt_subtest("basic-all")
 		sync_all(fd, 1, 5);
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH i-g-t 07/17] igt/gem_sync: Double the wakeups, twice the pain
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

To further defeat any contemplated spin-optimisations that avoid the
irq latency for synchronous wakeups, increase the queue length.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
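The existing wakeup subtests keep their behaviour via wlen=1; the new
double-wakeup variants pass wlen=2. In outline, each cycle now queues
wlen nops behind the spinner before timing the sync on the last one:

	for (int n = 0; n < wlen; n++)
		gem_execbuf(fd, &execbuf);
	this = gettime();
	igt_spin_batch_end(spin);
	gem_sync(fd, object.handle);	/* sync on the deeper queue */
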
 tests/gem_sync.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/tests/gem_sync.c b/tests/gem_sync.c
index 747412c5d..60d61a02e 100644
--- a/tests/gem_sync.c
+++ b/tests/gem_sync.c
@@ -178,7 +178,7 @@ idle_ring(int fd, unsigned ring, int timeout)
 }
 
 static void
-wakeup_ring(int fd, unsigned ring, int timeout)
+wakeup_ring(int fd, unsigned ring, int timeout, int wlen)
 {
 	unsigned engines[16];
 	const char *names[16];
@@ -243,7 +243,8 @@ wakeup_ring(int fd, unsigned ring, int timeout)
 			while (!READ_ONCE(*spin->running))
 				;
 
-			gem_execbuf(fd, &execbuf);
+			for (int n = 0; n < wlen; n++)
+				gem_execbuf(fd, &execbuf);
 
 			this = gettime();
 			igt_spin_batch_end(spin);
@@ -851,7 +852,9 @@ igt_main
 		igt_subtest_f("idle-%s", e->name)
 			idle_ring(fd, e->exec_id | e->flags, 150);
 		igt_subtest_f("wakeup-%s", e->name)
-			wakeup_ring(fd, e->exec_id | e->flags, 150);
+			wakeup_ring(fd, e->exec_id | e->flags, 150, 1);
+		igt_subtest_f("double-wakeup-%s", e->name)
+			wakeup_ring(fd, e->exec_id | e->flags, 150, 2);
 		igt_subtest_f("store-%s", e->name)
 			store_ring(fd, e->exec_id | e->flags, 1, 150);
 		igt_subtest_f("many-%s", e->name)
@@ -873,7 +876,9 @@ igt_main
 	igt_subtest("forked-store-each")
 		store_ring(fd, ALL_ENGINES, ncpus, 150);
 	igt_subtest("wakeup-each")
-		wakeup_ring(fd, ALL_ENGINES, 150);
+		wakeup_ring(fd, ALL_ENGINES, 150, 1);
+	igt_subtest("double-wakeup-each")
+		wakeup_ring(fd, ALL_ENGINES, 150, 2);
 
 	igt_subtest("basic-all")
 		sync_all(fd, 1, 5);
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH i-g-t 08/17] igt/gem_sync: Show the baseline poll latency for wakeups
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

Distinguish the latency required to switch away from the pollable
spinner into the target nops from the client wakeup latency of
synchronising on the last nop.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
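The reported figure becomes the mean wakeup cost with the baseline
(spinner-only end+sync latency, measured in a warm-up pass) subtracted
out. In the variables of the diff, the two printed values work out to

	baseline_us = 1e6 * baseline;
	wakeup_us   = 1e6 * (elapsed / cycles - baseline);

where baseline_us and wakeup_us are illustrative names for the first
and second figures in the completion line.
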
 tests/gem_sync.c | 35 +++++++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/tests/gem_sync.c b/tests/gem_sync.c
index 60d61a02e..493ae61df 100644
--- a/tests/gem_sync.c
+++ b/tests/gem_sync.c
@@ -207,7 +207,7 @@ wakeup_ring(int fd, unsigned ring, int timeout, int wlen)
 		const uint32_t bbe = MI_BATCH_BUFFER_END;
 		struct drm_i915_gem_exec_object2 object;
 		struct drm_i915_gem_execbuffer2 execbuf;
-		double end, this, elapsed, now;
+		double end, this, elapsed, now, baseline;
 		unsigned long cycles;
 		uint32_t cmd;
 		igt_spin_t *spin;
@@ -233,6 +233,32 @@ wakeup_ring(int fd, unsigned ring, int timeout, int wlen)
 		igt_spin_batch_end(spin);
 		gem_sync(fd, object.handle);
 
+		for (int warmup = 0; warmup <= 1; warmup++) {
+			end = gettime() + timeout/10.;
+			elapsed = 0;
+			cycles = 0;
+			do {
+				*spin->batch = cmd;
+				*spin->running = 0;
+				gem_execbuf(fd, &spin->execbuf);
+				while (!READ_ONCE(*spin->running))
+					;
+
+				this = gettime();
+				igt_spin_batch_end(spin);
+				gem_sync(fd, spin->handle);
+				now = gettime();
+
+				elapsed += now - this;
+				cycles++;
+			} while (now < end);
+			baseline = elapsed / cycles;
+		}
+		igt_info("%s%saseline %ld cycles: %.3f us\n",
+			 names[child % num_engines] ?: "",
+			 names[child % num_engines] ? " b" : "B",
+			 cycles, elapsed*1e6/cycles);
+
 		end = gettime() + timeout;
 		elapsed = 0;
 		cycles = 0;
@@ -254,16 +280,17 @@ wakeup_ring(int fd, unsigned ring, int timeout, int wlen)
 			elapsed += now - this;
 			cycles++;
 		} while (now < end);
+		elapsed -= cycles * baseline;
 
-		igt_info("%s%sompleted %ld cycles: %.3f us\n",
+		igt_info("%s%sompleted %ld cycles: %.3f + %.3f us\n",
 			 names[child % num_engines] ?: "",
 			 names[child % num_engines] ? " c" : "C",
-			 cycles, elapsed*1e6/cycles);
+			 cycles, 1e6*baseline, elapsed*1e6/cycles);
 
 		igt_spin_batch_free(fd, spin);
 		gem_close(fd, object.handle);
 	}
-	igt_waitchildren_timeout(timeout+10, NULL);
+	igt_waitchildren_timeout(2*timeout, NULL);
 	igt_assert_eq(intel_detect_and_clear_missed_interrupts(fd), 0);
 }
 
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH i-g-t 09/17] igt/gem_userptr: Check read-only mappings
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

Set up a userptr object that has only a read-only mapping back to a
file store (memfd). Then attempt to write into that mapping using the
GPU and assert that those writes do not land (while also writing via a
writable userptr mapping into the same memfd to verify that the GPU is
working!).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
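In outline (a hedged sketch, not lifted verbatim from the diff; sz and
the local names are illustrative), the read-only path is built as:

	int memfd = memfd_create("pages", 0);
	uint32_t handle;
	void *ptr;

	igt_assert(memfd != -1);
	igt_assert(ftruncate(memfd, sz) == 0);

	/* a CPU mapping without PROT_WRITE backs the userptr */
	ptr = mmap(NULL, sz, PROT_READ, MAP_SHARED, memfd, 0);
	igt_assert(ptr != MAP_FAILED);

	/* read_only=true: GPU writes through this handle must not land */
	gem_userptr(fd, ptr, sz, true, 0, &handle);
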
 configure.ac              |   1 +
 lib/ioctl_wrappers.c      |   4 +-
 lib/ioctl_wrappers.h      |   4 +-
 lib/meson.build           |   1 +
 meson.build               |   1 +
 tests/Makefile.am         |   4 +-
 tests/gem_userptr_blits.c | 372 +++++++++++++++++++++++++++++++++++++-
 7 files changed, 377 insertions(+), 10 deletions(-)

diff --git a/configure.ac b/configure.ac
index 1ee4e90e9..195963d4f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -125,6 +125,7 @@ PKG_CHECK_MODULES(PCIACCESS, [pciaccess >= 0.10])
 PKG_CHECK_MODULES(KMOD, [libkmod])
 PKG_CHECK_MODULES(PROCPS, [libprocps])
 PKG_CHECK_MODULES(LIBUNWIND, [libunwind])
+PKG_CHECK_MODULES(SSL, [openssl])
 PKG_CHECK_MODULES(VALGRIND, [valgrind], [have_valgrind=yes], [have_valgrind=no])
 
 if test x$have_valgrind = xyes; then
diff --git a/lib/ioctl_wrappers.c b/lib/ioctl_wrappers.c
index 79db44a8c..d5d2a4e4c 100644
--- a/lib/ioctl_wrappers.c
+++ b/lib/ioctl_wrappers.c
@@ -869,7 +869,7 @@ int gem_madvise(int fd, uint32_t handle, int state)
 	return madv.retained;
 }
 
-int __gem_userptr(int fd, void *ptr, int size, int read_only, uint32_t flags, uint32_t *handle)
+int __gem_userptr(int fd, void *ptr, uint64_t size, int read_only, uint32_t flags, uint32_t *handle)
 {
 	struct drm_i915_gem_userptr userptr;
 
@@ -898,7 +898,7 @@ int __gem_userptr(int fd, void *ptr, int size, int read_only, uint32_t flags, ui
  *
  * Returns userptr handle for the GEM object.
  */
-void gem_userptr(int fd, void *ptr, int size, int read_only, uint32_t flags, uint32_t *handle)
+void gem_userptr(int fd, void *ptr, uint64_t size, int read_only, uint32_t flags, uint32_t *handle)
 {
 	igt_assert_eq(__gem_userptr(fd, ptr, size, read_only, flags, handle), 0);
 }
diff --git a/lib/ioctl_wrappers.h b/lib/ioctl_wrappers.h
index b966f72c9..8e2cd380b 100644
--- a/lib/ioctl_wrappers.h
+++ b/lib/ioctl_wrappers.h
@@ -133,8 +133,8 @@ struct local_i915_gem_userptr {
 #define LOCAL_I915_USERPTR_UNSYNCHRONIZED (1<<31)
 	uint32_t handle;
 };
-void gem_userptr(int fd, void *ptr, int size, int read_only, uint32_t flags, uint32_t *handle);
-int __gem_userptr(int fd, void *ptr, int size, int read_only, uint32_t flags, uint32_t *handle);
+void gem_userptr(int fd, void *ptr, uint64_t size, int read_only, uint32_t flags, uint32_t *handle);
+int __gem_userptr(int fd, void *ptr, uint64_t size, int read_only, uint32_t flags, uint32_t *handle);
 
 void gem_sw_finish(int fd, uint32_t handle);
 
diff --git a/lib/meson.build b/lib/meson.build
index 1a355414e..939167f91 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -62,6 +62,7 @@ lib_deps = [
 	pthreads,
 	math,
 	realtime,
+	ssl,
 ]
 
 if libdrm_intel.found()
diff --git a/meson.build b/meson.build
index 4d15d6238..638c01066 100644
--- a/meson.build
+++ b/meson.build
@@ -98,6 +98,7 @@ pciaccess = dependency('pciaccess', version : '>=0.10')
 libkmod = dependency('libkmod')
 libprocps = dependency('libprocps', required : true)
 libunwind = dependency('libunwind', required : true)
+ssl = dependency('openssl', required : true)
 
 valgrind = null_dep
 valgrindinfo = 'No'
diff --git a/tests/Makefile.am b/tests/Makefile.am
index f41ad5096..ba307b220 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -126,8 +126,8 @@ gem_tiled_swapping_CFLAGS = $(AM_CFLAGS) $(THREAD_CFLAGS)
 gem_tiled_swapping_LDADD = $(LDADD) -lpthread
 prime_self_import_CFLAGS = $(AM_CFLAGS) $(THREAD_CFLAGS)
 prime_self_import_LDADD = $(LDADD) -lpthread
-gem_userptr_blits_CFLAGS = $(AM_CFLAGS) $(THREAD_CFLAGS)
-gem_userptr_blits_LDADD = $(LDADD) -lpthread
+gem_userptr_blits_CFLAGS = $(AM_CFLAGS) $(THREAD_CFLAGS) $(SSL_CFLAGS)
+gem_userptr_blits_LDADD = $(LDADD) $(SSL_LIBS) -lpthread
 perf_pmu_LDADD = $(LDADD) $(top_builddir)/lib/libigt_perf.la
 
 gem_eio_LDADD = $(LDADD) -lrt
diff --git a/tests/gem_userptr_blits.c b/tests/gem_userptr_blits.c
index 7e3b6ef38..0fb7eba04 100644
--- a/tests/gem_userptr_blits.c
+++ b/tests/gem_userptr_blits.c
@@ -43,13 +43,17 @@
 #include <fcntl.h>
 #include <inttypes.h>
 #include <errno.h>
+#include <setjmp.h>
 #include <sys/stat.h>
 #include <sys/time.h>
 #include <sys/mman.h>
+#include <openssl/sha.h>
 #include <signal.h>
 #include <pthread.h>
 #include <time.h>
 
+#include <linux/memfd.h>
+
 #include "drm.h"
 #include "i915_drm.h"
 
@@ -238,6 +242,57 @@ blit(int fd, uint32_t dst, uint32_t src, uint32_t *all_bo, int n_bo)
 	return ret;
 }
 
+static void store_dword(int fd, uint32_t target,
+			uint32_t offset, uint32_t value)
+{
+	const int gen = intel_gen(intel_get_drm_devid(fd));
+	struct drm_i915_gem_exec_object2 obj[2];
+	struct drm_i915_gem_relocation_entry reloc;
+	struct drm_i915_gem_execbuffer2 execbuf;
+	uint32_t batch[16];
+	int i;
+
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.buffer_count = ARRAY_SIZE(obj);
+	execbuf.flags = 0;
+	if (gen < 6)
+		execbuf.flags |= I915_EXEC_SECURE;
+
+	memset(obj, 0, sizeof(obj));
+	obj[0].handle = target;
+	obj[1].handle = gem_create(fd, 4096);
+
+	memset(&reloc, 0, sizeof(reloc));
+	reloc.target_handle = obj[0].handle;
+	reloc.presumed_offset = 0;
+	reloc.offset = sizeof(uint32_t);
+	reloc.delta = offset;
+	reloc.read_domains = I915_GEM_DOMAIN_RENDER;
+	reloc.write_domain = I915_GEM_DOMAIN_RENDER;
+	obj[1].relocs_ptr = to_user_pointer(&reloc);
+	obj[1].relocation_count = 1;
+
+	i = 0;
+	batch[i] = MI_STORE_DWORD_IMM | (gen < 6 ? 1 << 22 : 0);
+	if (gen >= 8) {
+		batch[++i] = offset;
+		batch[++i] = 0;
+	} else if (gen >= 4) {
+		batch[++i] = 0;
+		batch[++i] = offset;
+		reloc.offset += sizeof(uint32_t);
+	} else {
+		batch[i]--;
+		batch[++i] = offset;
+	}
+	batch[++i] = value;
+	batch[++i] = MI_BATCH_BUFFER_END;
+	gem_write(fd, obj[1].handle, 0, batch, sizeof(batch));
+	gem_execbuf(fd, &execbuf);
+	gem_close(fd, obj[1].handle);
+}
+
 static uint32_t
 create_userptr(int fd, uint32_t val, uint32_t *ptr)
 {
@@ -941,6 +996,310 @@ static int test_dmabuf(void)
 	return 0;
 }
 
+static void test_readonly(int i915)
+{
+	unsigned char orig[SHA_DIGEST_LENGTH];
+	uint64_t aperture_size;
+	uint32_t whandle, rhandle;
+	size_t sz, total;
+	void *pages, *space;
+	int memfd;
+
+	/*
+	 * A small batch of pages; small enough to cheaply check for stray
+	 * writes but large enough that we don't create too many VMAs pointing
+	 * back to this set from the large arena. The limit on the total
+	 * number of VMAs for a process is 65,536 (at least on this kernel).
+	 *
+	 * We then write from the GPU through the large arena into the smaller
+	 * backing storage, which we can cheaply check to see if those writes
+	 * have landed (using a SHA1sum). The same random GPU writes are then
+	 * repeated through a read-only handle to confirm that this time the
+	 * writes are discarded and the backing store is left unchanged.
+	 */
+	sz = 16 << 12;
+	memfd = memfd_create("pages", 0);
+	igt_require(memfd != -1);
+	igt_require(ftruncate(memfd, sz) == 0);
+
+	pages = mmap(NULL, sz, PROT_WRITE, MAP_SHARED, memfd, 0);
+	igt_assert(pages != MAP_FAILED);
+
+	igt_require(__gem_userptr(i915, pages, sz, true, userptr_flags, &rhandle) == 0);
+	gem_close(i915, rhandle);
+
+	gem_userptr(i915, pages, sz, false, userptr_flags, &whandle);
+
+	total = 2048ull << 20;
+	aperture_size = gem_aperture_size(i915) / 2;
+	if (aperture_size < total)
+		total = aperture_size;
+	total = total / sz * sz;
+	igt_info("Using a %'zuB (%'zu pages) arena onto %zu pages\n",
+		 total, total >> 12, sz >> 12);
+
+	/* Create an arena all pointing to the same set of pages */
+	space = mmap(NULL, total, PROT_READ, MAP_ANON | MAP_SHARED, -1, 0);
+	igt_require(space != MAP_FAILED);
+	for (size_t offset = 0; offset < total; offset += sz) {
+		igt_assert(mmap(space + offset, sz,
+				PROT_WRITE, MAP_SHARED | MAP_FIXED,
+				memfd, 0) != MAP_FAILED);
+		*(uint32_t *)(space + offset) = offset;
+	}
+	igt_assert_eq_u32(*(uint32_t *)pages, (uint32_t)(total - sz));
+	igt_assert(mlock(space, total) == 0);
+	close(memfd);
+
+	/* Check we can create a normal userptr bo wrapping the wrapper */
+	gem_userptr(i915, space, total, false, userptr_flags, &rhandle);
+	gem_set_domain(i915, rhandle, I915_GEM_DOMAIN_CPU, 0);
+	for (size_t offset = 0; offset < total; offset += sz)
+		store_dword(i915, rhandle, offset + 4, offset / sz);
+	gem_sync(i915, rhandle);
+	igt_assert_eq_u32(*(uint32_t *)(pages + 0), (uint32_t)(total - sz));
+	igt_assert_eq_u32(*(uint32_t *)(pages + 4), (uint32_t)(total / sz - 1));
+	gem_close(i915, rhandle);
+
+	/* Now enforce read-only henceforth */
+	igt_assert(mprotect(space, total, PROT_READ) == 0);
+
+	SHA1(pages, sz, orig);
+	igt_fork(child, 1) {
+		const int gen = intel_gen(intel_get_drm_devid(i915));
+		const int nreloc = 1024;
+		struct drm_i915_gem_relocation_entry *reloc;
+		struct drm_i915_gem_exec_object2 obj[2];
+		struct drm_i915_gem_execbuffer2 exec;
+		unsigned char ref[SHA_DIGEST_LENGTH], result[SHA_DIGEST_LENGTH];
+		uint32_t *batch;
+		int i;
+
+		reloc = calloc(sizeof(*reloc), nreloc);
+		gem_userptr(i915, space, total, true, userptr_flags, &rhandle);
+
+		memset(obj, 0, sizeof(obj));
+		obj[0].flags = LOCAL_EXEC_OBJECT_SUPPORTS_48B;
+		obj[1].handle = gem_create(i915, sz);
+		obj[1].relocation_count = nreloc;
+		obj[1].relocs_ptr = to_user_pointer(reloc);
+
+		batch = gem_mmap__wc(i915, obj[1].handle, 0, sz, PROT_WRITE);
+
+		memset(&exec, 0, sizeof(exec));
+		exec.buffer_count = 2;
+		exec.buffers_ptr = to_user_pointer(obj);
+		if (gen < 6)
+			exec.flags |= I915_EXEC_SECURE;
+
+		for_each_engine(i915, exec.flags) {
+			/* First tweak the backing store through the write */
+			i = 0;
+			obj[0].handle = whandle;
+			for (int n = 0; n < nreloc; n++) {
+				uint64_t offset;
+
+				reloc[n].target_handle = obj[0].handle;
+				reloc[n].delta = rand() % (sz / 4) * 4;
+				reloc[n].offset = (i + 1) * sizeof(uint32_t);
+				reloc[n].presumed_offset = obj[0].offset;
+				reloc[n].read_domains = I915_GEM_DOMAIN_RENDER;
+				reloc[n].write_domain = I915_GEM_DOMAIN_RENDER;
+
+				offset = reloc[n].presumed_offset + reloc[n].delta;
+
+				batch[i] = MI_STORE_DWORD_IMM | (gen < 6 ? 1 << 22 : 0);
+				if (gen >= 8) {
+					batch[++i] = offset;
+					batch[++i] = offset >> 32;
+				} else if (gen >= 4) {
+					batch[++i] = 0;
+					batch[++i] = offset;
+					reloc[n].offset += sizeof(uint32_t);
+				} else {
+					batch[i]--;
+					batch[++i] = offset;
+				}
+				batch[++i] = rand();
+				i++;
+			}
+			batch[i] = MI_BATCH_BUFFER_END;
+			igt_assert(i * sizeof(uint32_t) < sz);
+
+			gem_execbuf(i915, &exec);
+			gem_sync(i915, obj[0].handle);
+			SHA1(pages, sz, ref);
+
+			igt_assert(memcmp(ref, orig, sizeof(ref)));
+			memcpy(orig, ref, sizeof(orig));
+
+			/* Now try the same through the read-only handle */
+			i = 0;
+			obj[0].handle = rhandle;
+			for (int n = 0; n < nreloc; n++) {
+				uint64_t offset;
+
+				reloc[n].target_handle = obj[0].handle;
+				reloc[n].delta = rand() % (total / 4) * 4;
+				reloc[n].offset = (i + 1) * sizeof(uint32_t);
+				reloc[n].presumed_offset = obj[0].offset;
+				reloc[n].read_domains = I915_GEM_DOMAIN_RENDER;
+				reloc[n].write_domain = I915_GEM_DOMAIN_RENDER;
+
+				offset = reloc[n].presumed_offset + reloc[n].delta;
+
+				batch[i] = MI_STORE_DWORD_IMM | (gen < 6 ? 1 << 22 : 0);
+				if (gen >= 8) {
+					batch[++i] = offset;
+					batch[++i] = offset >> 32;
+				} else if (gen >= 4) {
+					batch[++i] = 0;
+					batch[++i] = offset;
+					reloc[n].offset += sizeof(uint32_t);
+				} else {
+					batch[i]--;
+					batch[++i] = offset;
+				}
+				batch[++i] = rand();
+				i++;
+			}
+			batch[i] = MI_BATCH_BUFFER_END;
+
+			gem_execbuf(i915, &exec);
+			gem_sync(i915, obj[0].handle);
+			SHA1(pages, sz, result);
+
+			/*
+			 * As the writes into the read-only GPU bo should fail,
+			 * the SHA1 hash of the backing store should be
+			 * unaffected.
+			 */
+			igt_assert(memcmp(ref, result, SHA_DIGEST_LENGTH) == 0);
+		}
+
+		munmap(batch, sz);
+		gem_close(i915, obj[1].handle);
+		gem_close(i915, rhandle);
+	}
+	igt_waitchildren();
+
+	munmap(space, total);
+	munmap(pages, sz);
+}
+
+static jmp_buf sigjmp;
+static void sigjmp_handler(int sig)
+{
+	siglongjmp(sigjmp, sig);
+}
+
+static void test_readonly_mmap(int i915)
+{
+	unsigned char original[SHA_DIGEST_LENGTH];
+	unsigned char result[SHA_DIGEST_LENGTH];
+	uint32_t handle;
+	uint32_t sz;
+	void *pages;
+	void *ptr;
+	int sig;
+
+	/*
+	 * A quick check to ensure that we cannot circumvent the
+	 * read-only nature of our memory by creating a GTT mmap into
+	 * the pages. Imagine receiving a readonly SHM segment from
+	 * another process, or a readonly file mmap; it must remain readonly
+	 * on the GPU as well.
+	 */
+
+	igt_require(igt_setup_clflush());
+
+	sz = 16 << 12;
+	pages = mmap(NULL, sz, PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
+	igt_assert(pages != MAP_FAILED);
+
+	igt_require(__gem_userptr(i915, pages, sz, true, userptr_flags, &handle) == 0);
+	gem_set_caching(i915, handle, 0);
+
+	memset(pages, 0xa5, sz);
+	igt_clflush_range(pages, sz);
+	SHA1(pages, sz, original);
+
+	ptr = __gem_mmap__gtt(i915, handle, sz, PROT_WRITE);
+	igt_assert(ptr == NULL);
+
+	ptr = gem_mmap__gtt(i915, handle, sz, PROT_READ);
+	gem_close(i915, handle);
+
+	/* Check that a write into the GTT readonly map fails */
+	if (!(sig = sigsetjmp(sigjmp, 1))) {
+		signal(SIGBUS, sigjmp_handler);
+		signal(SIGSEGV, sigjmp_handler);
+		memset(ptr, 0x5a, sz);
+		igt_assert(0);
+	}
+	igt_assert_eq(sig, SIGSEGV);
+
+	/* Check that we disallow removing the readonly protection */
+	igt_assert(mprotect(ptr, sz, PROT_WRITE));
+	if (!(sig = sigsetjmp(sigjmp, 1))) {
+		signal(SIGBUS, sigjmp_handler);
+		signal(SIGSEGV, sigjmp_handler);
+		memset(ptr, 0x5a, sz);
+		igt_assert(0);
+	}
+	igt_assert_eq(sig, SIGSEGV);
+
+	/* A single read from the GTT pointer to prove that works */
+	igt_assert_eq_u32(*(uint8_t *)ptr, 0xa5);
+	munmap(ptr, sz);
+
+	/* Double check that the kernel did indeed not let any writes through */
+	igt_clflush_range(pages, sz);
+	SHA1(pages, sz, result);
+	igt_assert(!memcmp(original, result, sizeof(original)));
+
+	munmap(pages, sz);
+}
+
+static void test_readonly_pwrite(int i915)
+{
+	unsigned char original[SHA_DIGEST_LENGTH];
+	unsigned char result[SHA_DIGEST_LENGTH];
+	uint32_t handle;
+	uint32_t sz;
+	void *pages;
+
+	/*
+	 * Same as for GTT mmaps, we must not allow ourselves to
+	 * circumvent readonly protection on a piece of memory via the
+	 * pwrite ioctl.
+	 */
+
+	igt_require(igt_setup_clflush());
+
+	sz = 16 << 12;
+	pages = mmap(NULL, sz, PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
+	igt_assert(pages != MAP_FAILED);
+
+	igt_require(__gem_userptr(i915, pages, sz, true, userptr_flags, &handle) == 0);
+	memset(pages, 0xa5, sz);
+	SHA1(pages, sz, original);
+
+	for (int page = 0; page < 16; page++) {
+		char data[4096];
+
+		memset(data, page, sizeof(data));
+		igt_assert_eq(__gem_write(i915, handle, page << 12, data, sizeof(data)), -EINVAL);
+	}
+
+	gem_close(i915, handle);
+
+	SHA1(pages, sz, result);
+	igt_assert(!memcmp(original, result, sizeof(original)));
+
+	munmap(pages, sz);
+}
+
 static int test_usage_restrictions(int fd)
 {
 	void *ptr;
@@ -961,10 +1320,6 @@ static int test_usage_restrictions(int fd)
 	ret = __gem_userptr(fd, (char *)ptr + 1, PAGE_SIZE - 1, 0, userptr_flags, &handle);
 	igt_assert_neq(ret, 0);
 
-	/* Read-only not supported. */
-	ret = __gem_userptr(fd, (char *)ptr, PAGE_SIZE, 1, userptr_flags, &handle);
-	igt_assert_neq(ret, 0);
-
 	free(ptr);
 
 	return 0;
@@ -1502,6 +1857,15 @@ int main(int argc, char **argv)
 		igt_subtest("dmabuf-unsync")
 			test_dmabuf();
 
+		igt_subtest("readonly-unsync")
+			test_readonly(fd);
+
+		igt_subtest("readonly-mmap-unsync")
+			test_readonly_mmap(fd);
+
+		igt_subtest("readonly-pwrite-unsync")
+			test_readonly_pwrite(fd);
+
 		for (unsigned flags = 0; flags < ALL_FORKING_EVICTIONS + 1; flags++) {
 			igt_subtest_f("forked-unsync%s%s%s-%s",
 					flags & FORKING_EVICTIONS_SWAPPING ? "-swapping" : "",
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH i-g-t 10/17] igt: Exercise creating context with shared GTT
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

v2: Test that each shared context is its own timeline and allows request
reordering between shared contexts.
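
In short, sharing is requested at context creation; a minimal usage
sketch, assuming the wrappers this patch adds to lib/i915/gem_context.c:

	uint32_t parent, child, queue;

	parent = gem_context_create(i915);

	/* child executes in the same address space (GTT) as parent */
	child = gem_context_create_shared(i915, parent,
					  I915_GEM_CONTEXT_SHARE_GTT);

	/* a "queue": shared GTT plus a single shared timeline */
	queue = gem_queue_create(i915);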

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
 lib/i915/gem_context.c   |  66 +++
 lib/i915/gem_context.h   |  13 +
 tests/Makefile.sources   |   1 +
 tests/gem_ctx_shared.c   | 896 +++++++++++++++++++++++++++++++++++++++
 tests/gem_exec_whisper.c |  32 +-
 5 files changed, 999 insertions(+), 9 deletions(-)
 create mode 100644 tests/gem_ctx_shared.c

diff --git a/lib/i915/gem_context.c b/lib/i915/gem_context.c
index 669bd318c..02cf2ccf3 100644
--- a/lib/i915/gem_context.c
+++ b/lib/i915/gem_context.c
@@ -273,3 +273,69 @@ void gem_context_set_priority(int fd, uint32_t ctx_id, int prio)
 {
 	igt_assert(__gem_context_set_priority(fd, ctx_id, prio) == 0);
 }
+
+struct local_i915_gem_context_create_v2 {
+	uint32_t ctx_id;
+	uint32_t flags;
+	uint32_t share_ctx;
+	uint32_t pad;
+};
+
+#define LOCAL_IOCTL_I915_GEM_CONTEXT_CREATE       DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct local_i915_gem_context_create_v2)
+
+int
+__gem_context_create_shared(int i915, uint32_t share, unsigned int flags,
+			    uint32_t *out)
+{
+	struct local_i915_gem_context_create_v2 arg = {
+		.flags = flags,
+		.share_ctx = share,
+	};
+	int err = 0;
+
+	if (igt_ioctl(i915, LOCAL_IOCTL_I915_GEM_CONTEXT_CREATE, &arg))
+		err = -errno;
+
+	*out = arg.ctx_id;
+
+	errno = 0;
+	return err;
+}
+
+static bool __gem_context_has(int i915, uint32_t flags)
+{
+	uint32_t ctx;
+
+	__gem_context_create_shared(i915, 0, flags, &ctx);
+	if (ctx)
+		gem_context_destroy(i915, ctx);
+
+	errno = 0;
+	return ctx;
+}
+
+bool gem_contexts_has_shared_gtt(int i915)
+{
+	return __gem_context_has(i915, I915_GEM_CONTEXT_SHARE_GTT);
+}
+
+bool gem_has_queues(int i915)
+{
+	return __gem_context_has(i915, I915_GEM_CONTEXT_SHARE_GTT | I915_GEM_CONTEXT_SINGLE_TIMELINE);
+}
+
+uint32_t gem_context_create_shared(int i915, uint32_t share, unsigned int flags)
+{
+	uint32_t ctx;
+
+	igt_assert_eq(__gem_context_create_shared(i915, share, flags, &ctx), 0);
+
+	return ctx;
+}
+
+uint32_t gem_queue_create(int i915)
+{
+	return gem_context_create_shared(i915, 0,
+					 I915_GEM_CONTEXT_SHARE_GTT |
+					 I915_GEM_CONTEXT_SINGLE_TIMELINE);
+}
diff --git a/lib/i915/gem_context.h b/lib/i915/gem_context.h
index aef68dda6..5ed992a70 100644
--- a/lib/i915/gem_context.h
+++ b/lib/i915/gem_context.h
@@ -29,6 +29,19 @@ int __gem_context_create(int fd, uint32_t *ctx_id);
 void gem_context_destroy(int fd, uint32_t ctx_id);
 int __gem_context_destroy(int fd, uint32_t ctx_id);
 
+#define I915_GEM_CONTEXT_SHARE_GTT 0x1
+#define I915_GEM_CONTEXT_SINGLE_TIMELINE 0x2
+
+int __gem_context_create_shared(int i915,
+				uint32_t share, unsigned int flags,
+				uint32_t *out);
+uint32_t gem_context_create_shared(int i915,
+				   uint32_t share, unsigned int flags);
+uint32_t gem_queue_create(int i915);
+
+bool gem_contexts_has_shared_gtt(int i915);
+bool gem_has_queues(int i915);
+
 bool gem_has_contexts(int fd);
 void gem_require_contexts(int fd);
 void gem_context_require_bannable(int fd);
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 54b4a3c21..08a1caf9d 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -56,6 +56,7 @@ TESTS_progs = \
 	gem_ctx_exec \
 	gem_ctx_isolation \
 	gem_ctx_param \
+	gem_ctx_shared \
 	gem_ctx_switch \
 	gem_ctx_thrash \
 	gem_double_irq_loop \
diff --git a/tests/gem_ctx_shared.c b/tests/gem_ctx_shared.c
new file mode 100644
index 000000000..9d3239f69
--- /dev/null
+++ b/tests/gem_ctx_shared.c
@@ -0,0 +1,896 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "igt.h"
+
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <errno.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+
+#include <drm.h>
+
+#include "igt_rand.h"
+#include "igt_vgem.h"
+#include "sync_file.h"
+
+#define LO 0
+#define HI 1
+#define NOISE 2
+
+#define MAX_PRIO LOCAL_I915_CONTEXT_MAX_USER_PRIORITY
+#define MIN_PRIO LOCAL_I915_CONTEXT_MIN_USER_PRIORITY
+
+static int priorities[] = {
+	[LO] = MIN_PRIO / 2,
+	[HI] = MAX_PRIO / 2,
+};
+
+#define BUSY_QLEN 8
+#define MAX_ELSP_QLEN 16
+
+IGT_TEST_DESCRIPTION("Test shared contexts.");
+
+static void create_shared_gtt(int i915, unsigned int flags)
+#define DETACHED 0x1
+{
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	struct drm_i915_gem_exec_object2 obj = {
+		.handle = gem_create(i915, 4096),
+	};
+	struct drm_i915_gem_execbuffer2 execbuf = {
+		.buffers_ptr = to_user_pointer(&obj),
+		.buffer_count = 1,
+	};
+	uint32_t parent, child;
+
+	gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
+	gem_execbuf(i915, &execbuf);
+	gem_sync(i915, obj.handle);
+
+	child = flags & DETACHED ? gem_context_create(i915) : 0;
+	igt_until_timeout(2) {
+		parent = flags & DETACHED ? child : 0;
+		child = gem_context_create_shared(i915, parent,
+						  I915_GEM_CONTEXT_SHARE_GTT);
+		execbuf.rsvd1 = child;
+		gem_execbuf(i915, &execbuf);
+
+		if (flags & DETACHED) {
+			gem_context_destroy(i915, parent);
+			gem_execbuf(i915, &execbuf);
+		} else {
+			parent = child;
+			gem_context_destroy(i915, parent);
+		}
+
+		execbuf.rsvd1 = parent;
+		igt_assert_eq(__gem_execbuf(i915, &execbuf), -ENOENT);
+		igt_assert_eq(__gem_context_create_shared(i915, parent,
+							  I915_GEM_CONTEXT_SHARE_GTT,
+							  &parent), -ENOENT);
+	}
+	if (flags & DETACHED)
+		gem_context_destroy(i915, child);
+
+	gem_sync(i915, obj.handle);
+	gem_close(i915, obj.handle);
+}
+
+static void disjoint_timelines(int i915)
+{
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	struct drm_i915_gem_exec_object2 obj = {
+		.handle = gem_create(i915, 4096),
+	};
+	struct drm_i915_gem_execbuffer2 execbuf = {
+		.buffers_ptr = to_user_pointer(&obj),
+		.buffer_count = 1,
+	};
+	struct sync_fence_info parent, child;
+	struct sync_file_info sync_file_info = {
+		.num_fences = 1,
+	};
+
+	/*
+	 * Each context, although sharing a vm, is expected to be a
+	 * distinct timeline. A request queued to one context should be
+	 * independent of any shared contexts.
+	 *
+	 * This information is exposed via the sync_file->name which
+	 * includes the fence.context.
+	 */
+
+	gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
+	gem_execbuf(i915, &execbuf);
+	gem_sync(i915, obj.handle);
+
+	execbuf.flags = I915_EXEC_FENCE_OUT;
+	gem_execbuf_wr(i915, &execbuf);
+	sync_file_info.sync_fence_info = to_user_pointer(&parent);
+	do_ioctl(execbuf.rsvd2 >> 32, SYNC_IOC_FILE_INFO, &sync_file_info);
+	close(execbuf.rsvd2 >> 32);
+
+	execbuf.rsvd1 =
+		gem_context_create_shared(i915, 0, I915_GEM_CONTEXT_SHARE_GTT);
+	gem_execbuf_wr(i915, &execbuf);
+	sync_file_info.sync_fence_info = to_user_pointer(&child);
+	do_ioctl(execbuf.rsvd2 >> 32, SYNC_IOC_FILE_INFO, &sync_file_info);
+	close(execbuf.rsvd2 >> 32);
+	gem_context_destroy(i915, execbuf.rsvd1);
+
+	igt_info("Parent fence: %s %s\n", parent.driver_name, parent.obj_name);
+	igt_info("Child fence: %s %s\n", child.driver_name, child.obj_name);
+
+	/* Driver should be the same, but on different timelines */
+	igt_assert(strcmp(parent.driver_name, child.driver_name) == 0);
+	igt_assert(strcmp(parent.obj_name, child.obj_name) != 0);
+
+	gem_sync(i915, obj.handle);
+	gem_close(i915, obj.handle);
+}
+
+static int reopen_driver(int fd)
+{
+	char path[256];
+
+	snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
+	fd = open(path, O_RDWR);
+	igt_assert_lte(0, fd);
+
+	return fd;
+}
+
+static void exhaust_shared_gtt(int i915, unsigned int flags)
+#define EXHAUST_LRC 0x1
+{
+	i915 = reopen_driver(i915);
+
+	igt_fork(pid, 1) {
+		const uint32_t bbe = MI_BATCH_BUFFER_END;
+		struct drm_i915_gem_exec_object2 obj = {
+			.handle = gem_create(i915, 4096)
+		};
+		struct drm_i915_gem_execbuffer2 execbuf = {
+			.buffers_ptr = to_user_pointer(&obj),
+			.buffer_count = 1,
+		};
+		uint32_t parent, child;
+		unsigned long count = 0;
+		int err;
+
+		gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
+
+		child = 0;
+		for (;;) {
+			parent = child;
+			err = __gem_context_create_shared(i915, parent,
+							  I915_GEM_CONTEXT_SHARE_GTT,
+							  &child);
+			if (err)
+				break;
+
+			if (flags & EXHAUST_LRC) {
+				execbuf.rsvd1 = child;
+				err = __gem_execbuf(i915, &execbuf);
+				if (err)
+					break;
+			}
+
+			count++;
+		}
+		gem_sync(i915, obj.handle);
+
+		igt_info("Created %lu shared contexts, before %d (%s)\n",
+			 count, err, strerror(-err));
+	}
+	close(i915);
+	igt_waitchildren();
+}
+
+static void exec_shared_gtt(int i915, unsigned int ring)
+{
+	const int gen = intel_gen(intel_get_drm_devid(i915));
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	struct drm_i915_gem_exec_object2 obj = {
+		.handle = gem_create(i915, 4096)
+	};
+	struct drm_i915_gem_execbuffer2 execbuf = {
+		.buffers_ptr = to_user_pointer(&obj),
+		.buffer_count = 1,
+		.flags = ring,
+	};
+	uint32_t scratch = obj.handle;
+	uint32_t batch[16];
+	int i;
+
+	gem_require_ring(i915, ring);
+	igt_require(gem_can_store_dword(i915, ring));
+
+	/* Load object into place in the GTT */
+	gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
+	gem_execbuf(i915, &execbuf);
+
+	/* Presume nothing causes an eviction in the meantime */
+
+	obj.handle = gem_create(i915, 4096);
+
+	i = 0;
+	batch[i] = MI_STORE_DWORD_IMM | (gen < 6 ? 1 << 22 : 0);
+	if (gen >= 8) {
+		batch[++i] = obj.offset;
+		batch[++i] = 0;
+	} else if (gen >= 4) {
+		batch[++i] = 0;
+		batch[++i] = obj.offset;
+	} else {
+		batch[i]--;
+		batch[++i] = obj.offset;
+	}
+	batch[++i] = 0xc0ffee;
+	batch[++i] = MI_BATCH_BUFFER_END;
+	gem_write(i915, obj.handle, 0, batch, sizeof(batch));
+
+	obj.offset += 4096; /* make sure we don't cause an eviction! */
+	obj.flags |= EXEC_OBJECT_PINNED;
+	execbuf.rsvd1 = gem_context_create_shared(i915, 0,
+						  I915_GEM_CONTEXT_SHARE_GTT);
+	if (gen > 3 && gen < 6)
+		execbuf.flags |= I915_EXEC_SECURE;
+
+	gem_execbuf(i915, &execbuf);
+	gem_context_destroy(i915, execbuf.rsvd1);
+	gem_sync(i915, obj.handle); /* write hazard lies */
+	gem_close(i915, obj.handle);
+
+	gem_read(i915, scratch, 0, batch, sizeof(uint32_t));
+	gem_close(i915, scratch);
+
+	igt_assert_eq_u32(*batch, 0xc0ffee);
+}
+
+static int nop_sync(int i915, uint32_t ctx, unsigned int ring, int64_t timeout)
+{
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	struct drm_i915_gem_exec_object2 obj = {
+		.handle = gem_create(i915, 4096),
+	};
+	struct drm_i915_gem_execbuffer2 execbuf = {
+		.buffers_ptr = to_user_pointer(&obj),
+		.buffer_count = 1,
+		.flags = ring,
+		.rsvd1 = ctx,
+	};
+	int err;
+
+	gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
+	gem_execbuf(i915, &execbuf);
+	err = gem_wait(i915, obj.handle, &timeout);
+	gem_close(i915, obj.handle);
+
+	return err;
+}
+
+static bool has_single_timeline(int i915)
+{
+	uint32_t ctx;
+
+	__gem_context_create_shared(i915, 0,
+				    I915_GEM_CONTEXT_SINGLE_TIMELINE, &ctx);
+	if (ctx)
+		gem_context_destroy(i915, ctx);
+
+	return ctx != 0;
+}
+
+static bool ignore_engine(unsigned engine)
+{
+	if (engine == 0)
+		return true;
+
+	if (engine == I915_EXEC_BSD)
+		return true;
+
+	return false;
+}
+
+static void single_timeline(int i915)
+{
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	struct drm_i915_gem_exec_object2 obj = {
+		.handle = gem_create(i915, 4096),
+	};
+	struct drm_i915_gem_execbuffer2 execbuf = {
+		.buffers_ptr = to_user_pointer(&obj),
+		.buffer_count = 1,
+	};
+	struct sync_fence_info rings[16];
+	struct sync_file_info sync_file_info = {
+		.num_fences = 1,
+	};
+	unsigned int engine;
+	int n;
+
+	igt_require(has_single_timeline(i915));
+
+	gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
+	gem_execbuf(i915, &execbuf);
+	gem_sync(i915, obj.handle);
+
+	/*
+	 * For a "single timeline" context, each ring is on the common
+	 * timeline, unlike a normal context where each ring has an
+	 * independent timeline. That is, no matter which engine we submit
+	 * to, it reports the same timeline name and fence context. However,
+	 * the fence context is not reported through the sync_fence_info.
+	 */
+	execbuf.rsvd1 =
+		gem_context_create_shared(i915, 0,
+					  I915_GEM_CONTEXT_SINGLE_TIMELINE);
+	execbuf.flags = I915_EXEC_FENCE_OUT;
+	n = 0;
+	for_each_engine(i915, engine) {
+		gem_execbuf_wr(i915, &execbuf);
+		sync_file_info.sync_fence_info = to_user_pointer(&rings[n]);
+		do_ioctl(execbuf.rsvd2 >> 32, SYNC_IOC_FILE_INFO, &sync_file_info);
+		close(execbuf.rsvd2 >> 32);
+
+		igt_info("ring[%d] fence: %s %s\n",
+			 n, rings[n].driver_name, rings[n].obj_name);
+		n++;
+	}
+	gem_sync(i915, obj.handle);
+	gem_close(i915, obj.handle);
+
+	for (int i = 1; i < n; i++) {
+		igt_assert(!strcmp(rings[0].driver_name, rings[i].driver_name));
+		igt_assert(!strcmp(rings[0].obj_name, rings[i].obj_name));
+	}
+}
+
+static void exec_single_timeline(int i915, unsigned int ring)
+{
+	unsigned int other;
+	igt_spin_t *spin;
+	uint32_t ctx;
+
+	gem_require_ring(i915, ring);
+	igt_require(has_single_timeline(i915));
+
+	/*
+	 * On an ordinary context, a blockage on one ring doesn't prevent
+	 * execution on another.
+	 */
+	ctx = 0;
+	spin = NULL;
+	for_each_engine(i915, other) {
+		if (other == ring || ignore_engine(other))
+			continue;
+
+		if (spin == NULL) {
+			spin = __igt_spin_batch_new(i915, ctx, other, 0);
+		} else {
+			struct drm_i915_gem_exec_object2 obj = {
+				.handle = spin->handle,
+			};
+			struct drm_i915_gem_execbuffer2 execbuf = {
+				.buffers_ptr = to_user_pointer(&obj),
+				.buffer_count = 1,
+				.flags = other,
+				.rsvd1 = ctx,
+			};
+			gem_execbuf(i915, &execbuf);
+		}
+	}
+	igt_require(spin);
+	igt_assert_eq(nop_sync(i915, ctx, ring, NSEC_PER_SEC), 0);
+	igt_spin_batch_free(i915, spin);
+
+	/*
+	 * But if we create a context with just a single shared timeline,
+	 * then it will block waiting for the earlier requests on the
+	 * other engines.
+	 */
+	ctx = gem_context_create_shared(i915, 0,
+					I915_GEM_CONTEXT_SINGLE_TIMELINE);
+	spin = NULL;
+	for_each_engine(i915, other) {
+		if (other == ring || ignore_engine(other))
+			continue;
+
+		if (spin == NULL) {
+			spin = __igt_spin_batch_new(i915, ctx, other, 0);
+		} else {
+			struct drm_i915_gem_exec_object2 obj = {
+				.handle = spin->handle,
+			};
+			struct drm_i915_gem_execbuffer2 execbuf = {
+				.buffers_ptr = to_user_pointer(&obj),
+				.buffer_count = 1,
+				.flags = other,
+				.rsvd1 = ctx,
+			};
+			gem_execbuf(i915, &execbuf);
+		}
+	}
+	igt_assert(spin);
+	igt_assert_eq(nop_sync(i915, ctx, ring, NSEC_PER_SEC), -ETIME);
+	igt_spin_batch_free(i915, spin);
+}
+
+static void store_dword(int i915, uint32_t ctx, unsigned ring,
+			uint32_t target, uint32_t offset, uint32_t value,
+			uint32_t cork, unsigned write_domain)
+{
+	const int gen = intel_gen(intel_get_drm_devid(i915));
+	struct drm_i915_gem_exec_object2 obj[3];
+	struct drm_i915_gem_relocation_entry reloc;
+	struct drm_i915_gem_execbuffer2 execbuf;
+	uint32_t batch[16];
+	int i;
+
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = to_user_pointer(obj + !cork);
+	execbuf.buffer_count = 2 + !!cork;
+	execbuf.flags = ring;
+	if (gen < 6)
+		execbuf.flags |= I915_EXEC_SECURE;
+	execbuf.rsvd1 = ctx;
+
+	memset(obj, 0, sizeof(obj));
+	obj[0].handle = cork;
+	obj[1].handle = target;
+	obj[2].handle = gem_create(i915, 4096);
+
+	memset(&reloc, 0, sizeof(reloc));
+	reloc.target_handle = obj[1].handle;
+	reloc.presumed_offset = 0;
+	reloc.offset = sizeof(uint32_t);
+	reloc.delta = offset;
+	reloc.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
+	reloc.write_domain = write_domain;
+	obj[2].relocs_ptr = to_user_pointer(&reloc);
+	obj[2].relocation_count = 1;
+
+	i = 0;
+	batch[i] = MI_STORE_DWORD_IMM | (gen < 6 ? 1 << 22 : 0);
+	if (gen >= 8) {
+		batch[++i] = offset;
+		batch[++i] = 0;
+	} else if (gen >= 4) {
+		batch[++i] = 0;
+		batch[++i] = offset;
+		reloc.offset += sizeof(uint32_t);
+	} else {
+		batch[i]--;
+		batch[++i] = offset;
+	}
+	batch[++i] = value;
+	batch[++i] = MI_BATCH_BUFFER_END;
+	gem_write(i915, obj[2].handle, 0, batch, sizeof(batch));
+	gem_execbuf(i915, &execbuf);
+	gem_close(i915, obj[2].handle);
+}
+
+struct cork {
+	int device;
+	uint32_t handle;
+	uint32_t fence;
+};
+
+static void plug(int i915, struct cork *c)
+{
+	struct vgem_bo bo;
+	int dmabuf;
+
+	c->device = drm_open_driver(DRIVER_VGEM);
+
+	bo.width = bo.height = 1;
+	bo.bpp = 4;
+	vgem_create(c->device, &bo);
+	c->fence = vgem_fence_attach(c->device, &bo, VGEM_FENCE_WRITE);
+
+	dmabuf = prime_handle_to_fd(c->device, bo.handle);
+	c->handle = prime_fd_to_handle(i915, dmabuf);
+	close(dmabuf);
+}
+
+static void unplug(struct cork *c)
+{
+	vgem_fence_signal(c->device, c->fence);
+	close(c->device);
+}
+
+static uint32_t create_highest_priority(int i915)
+{
+	uint32_t ctx = gem_context_create(i915);
+
+	/*
+	 * If there is no priority support, all contexts will have equal
+	 * priority (and therefore the max user priority), so no context
+	 * can overtake us, and we can effectively form a plug.
+	 */
+	__gem_context_set_priority(i915, ctx, MAX_PRIO);
+
+	return ctx;
+}
+
+static void unplug_show_queue(int i915, struct cork *c, unsigned int engine)
+{
+	igt_spin_t *spin[BUSY_QLEN];
+
+	for (int n = 0; n < ARRAY_SIZE(spin); n++) {
+		uint32_t ctx = create_highest_priority(i915);
+		spin[n] = __igt_spin_batch_new(i915, ctx, engine, 0);
+		gem_context_destroy(i915, ctx);
+	}
+
+	unplug(c); /* batches will now be queued on the engine */
+	igt_debugfs_dump(i915, "i915_engine_info");
+
+	for (int n = 0; n < ARRAY_SIZE(spin); n++)
+		igt_spin_batch_free(i915, spin[n]);
+}
+
+static uint32_t store_timestamp(int i915, uint32_t ctx, unsigned ring)
+{
+	const bool r64b = intel_gen(intel_get_drm_devid(i915)) >= 8;
+	struct drm_i915_gem_exec_object2 obj = {
+		.handle = gem_create(i915, 4096),
+		.relocation_count = 1,
+	};
+	struct drm_i915_gem_relocation_entry reloc = {
+		.target_handle = obj.handle,
+		.offset = 2 * sizeof(uint32_t),
+		.delta = 4092,
+		.read_domains = I915_GEM_DOMAIN_INSTRUCTION,
+	};
+	struct drm_i915_gem_execbuffer2 execbuf = {
+		.buffers_ptr = to_user_pointer(&obj),
+		.buffer_count = 1,
+		.flags = ring,
+		.rsvd1 = ctx,
+	};
+	uint32_t batch[] = {
+		0x24 << 23 | (1 + r64b), /* SRM */
+		0x2358,
+		4092,
+		0,
+		MI_BATCH_BUFFER_END
+	};
+
+	igt_require(intel_gen(intel_get_drm_devid(i915)) >= 7);
+
+	gem_write(i915, obj.handle, 0, batch, sizeof(batch));
+	obj.relocs_ptr = to_user_pointer(&reloc);
+
+	gem_execbuf(i915, &execbuf);
+
+	return obj.handle;
+}
+
+static void independent(int i915, unsigned ring, unsigned flags)
+{
+	uint32_t handle[2];
+	igt_spin_t *spin;
+
+	spin = igt_spin_batch_new(i915, 0, ring, 0);
+	for (int i = 0; i < 16; i++) {
+		struct drm_i915_gem_exec_object2 obj = {
+			.handle = spin->handle
+		};
+		struct drm_i915_gem_execbuffer2 eb = {
+			.buffer_count = 1,
+			.buffers_ptr = to_user_pointer(&obj),
+			.flags = ring,
+			.rsvd1 = gem_context_create(i915),
+		};
+		gem_context_set_priority(i915, eb.rsvd1, MAX_PRIO);
+		gem_execbuf(i915, &eb);
+		gem_context_destroy(i915, eb.rsvd1);
+	}
+
+	for (int i = LO; i <= HI; i++) {
+		uint32_t ctx = gem_queue_create(i915);
+		gem_context_set_priority(i915, ctx, priorities[i]);
+		handle[i] = store_timestamp(i915, ctx, ring);
+		gem_context_destroy(i915, ctx);
+	}
+
+	igt_spin_batch_free(i915, spin);
+
+	for (int i = LO; i <= HI; i++) {
+		uint32_t *ptr;
+
+		ptr = gem_mmap__gtt(i915, handle[i], 4096, PROT_READ);
+		gem_set_domain(i915, handle[i], /* no write hazard lies! */
+			       I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
+		gem_close(i915, handle[i]);
+
+		handle[i] = ptr[1023];
+		munmap(ptr, 4096);
+
+		igt_debug("ctx[%d] .prio=%d, timestamp=%u\n",
+			  i, priorities[i], handle[i]);
+	}
+
+	igt_assert((int32_t)(handle[HI] - handle[LO]) < 0);
+}
+
+static void reorder(int i915, unsigned ring, unsigned flags)
+#define EQUAL 1
+{
+	struct cork cork;
+	uint32_t scratch;
+	uint32_t *ptr;
+	uint32_t ctx[2];
+
+	ctx[LO] = gem_queue_create(i915);
+	gem_context_set_priority(i915, ctx[LO], MIN_PRIO);
+
+	ctx[HI] = gem_queue_create(i915);
+	gem_context_set_priority(i915, ctx[HI], flags & EQUAL ? MIN_PRIO : 0);
+
+	scratch = gem_create(i915, 4096);
+	plug(i915, &cork);
+
+	/* We expect the high priority context to be executed first, and
+	 * so the final result will be the value from the low priority context.
+	 */
+	store_dword(i915, ctx[LO], ring, scratch, 0, ctx[LO], cork.handle, 0);
+	store_dword(i915, ctx[HI], ring, scratch, 0, ctx[HI], cork.handle, 0);
+
+	unplug_show_queue(i915, &cork, ring);
+
+	gem_context_destroy(i915, ctx[LO]);
+	gem_context_destroy(i915, ctx[HI]);
+
+	ptr = gem_mmap__gtt(i915, scratch, 4096, PROT_READ);
+	gem_set_domain(i915, scratch, /* no write hazard lies! */
+		       I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
+	gem_close(i915, scratch);
+
+	if (flags & EQUAL) /* equal priority, result will be fifo */
+		igt_assert_eq_u32(ptr[0], ctx[HI]);
+	else
+		igt_assert_eq_u32(ptr[0], ctx[LO]);
+	munmap(ptr, 4096);
+}
+
+static void promotion(int i915, unsigned ring)
+{
+	struct cork cork;
+	uint32_t result, dep;
+	uint32_t *ptr;
+	uint32_t ctx[3];
+
+	ctx[LO] = gem_queue_create(i915);
+	gem_context_set_priority(i915, ctx[LO], MIN_PRIO);
+
+	ctx[HI] = gem_queue_create(i915);
+	gem_context_set_priority(i915, ctx[HI], 0);
+
+	ctx[NOISE] = gem_queue_create(i915);
+	gem_context_set_priority(i915, ctx[NOISE], MIN_PRIO/2);
+
+	result = gem_create(i915, 4096);
+	dep = gem_create(i915, 4096);
+
+	plug(i915, &cork);
+
+	/* Expect that HI promotes LO, so the order will be LO, HI, NOISE.
+	 *
+	 * fifo would be NOISE, LO, HI.
+	 * strict priority would be HI, NOISE, LO.
+	 */
+	store_dword(i915, ctx[NOISE], ring, result, 0, ctx[NOISE], cork.handle, 0);
+	store_dword(i915, ctx[LO], ring, result, 0, ctx[LO], cork.handle, 0);
+
+	/* link LO <-> HI via a dependency on another buffer */
+	store_dword(i915, ctx[LO], ring, dep, 0, ctx[LO], 0, I915_GEM_DOMAIN_INSTRUCTION);
+	store_dword(i915, ctx[HI], ring, dep, 0, ctx[HI], 0, 0);
+
+	store_dword(i915, ctx[HI], ring, result, 0, ctx[HI], 0, 0);
+
+	unplug_show_queue(i915, &cork, ring);
+
+	gem_context_destroy(i915, ctx[NOISE]);
+	gem_context_destroy(i915, ctx[LO]);
+	gem_context_destroy(i915, ctx[HI]);
+
+	ptr = gem_mmap__gtt(i915, dep, 4096, PROT_READ);
+	gem_set_domain(i915, dep, /* no write hazard lies! */
+			I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
+	gem_close(i915, dep);
+
+	igt_assert_eq_u32(ptr[0], ctx[HI]);
+	munmap(ptr, 4096);
+
+	ptr = gem_mmap__gtt(i915, result, 4096, PROT_READ);
+	gem_set_domain(i915, result, /* no write hazard lies! */
+			I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
+	gem_close(i915, result);
+
+	igt_assert_eq_u32(ptr[0], ctx[NOISE]);
+	munmap(ptr, 4096);
+}
+
+static void smoketest(int i915, unsigned ring, unsigned timeout)
+{
+	const int ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+	unsigned engines[16];
+	unsigned nengine;
+	unsigned engine;
+	uint32_t scratch;
+	uint32_t *ptr;
+
+	nengine = 0;
+	for_each_engine(i915, engine) {
+		if (ignore_engine(engine))
+			continue;
+
+		engines[nengine++] = engine;
+	}
+	igt_require(nengine);
+
+	scratch = gem_create(i915, 4096);
+	igt_fork(child, ncpus) {
+		unsigned long count = 0;
+		uint32_t ctx;
+
+		hars_petruska_f54_1_random_perturb(child);
+
+		ctx = gem_queue_create(i915);
+		igt_until_timeout(timeout) {
+			int prio;
+
+			prio = hars_petruska_f54_1_random_unsafe_max(MAX_PRIO - MIN_PRIO) + MIN_PRIO;
+			gem_context_set_priority(i915, ctx, prio);
+
+			engine = engines[hars_petruska_f54_1_random_unsafe_max(nengine)];
+			store_dword(i915, ctx, engine, scratch,
+				    8*child + 0, ~child,
+				    0, 0);
+			for (unsigned int step = 0; step < 8; step++)
+				store_dword(i915, ctx, engine, scratch,
+					    8*child + 4, count++,
+					    0, 0);
+		}
+		gem_context_destroy(i915, ctx);
+	}
+	igt_waitchildren();
+
+	ptr = gem_mmap__gtt(i915, scratch, 4096, PROT_READ);
+	gem_set_domain(i915, scratch, /* no write hazard lies! */
+			I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
+	gem_close(i915, scratch);
+
+	for (unsigned n = 0; n < ncpus; n++) {
+		igt_assert_eq_u32(ptr[2*n], ~n);
+		/*
+		 * Note this count is approximate due to unconstrained
+		 * ordering of the dword writes between engines.
+		 *
+		 * Take the result with a pinch of salt.
+		 */
+		igt_info("Child[%d] completed %u cycles\n",  n, ptr[2*n+1]);
+	}
+	munmap(ptr, 4096);
+}
+
+igt_main
+{
+	const struct intel_execution_engine *e;
+	int i915 = -1;
+
+	igt_fixture {
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+	}
+
+	igt_subtest_group {
+		igt_fixture {
+			igt_require(gem_contexts_has_shared_gtt(i915));
+			igt_fork_hang_detector(i915);
+		}
+
+		igt_subtest("create-shared-gtt")
+			create_shared_gtt(i915, 0);
+
+		igt_subtest("detached-shared-gtt")
+			create_shared_gtt(i915, DETACHED);
+
+		igt_subtest("disjoint-timelines")
+			disjoint_timelines(i915);
+
+		igt_subtest("single-timeline")
+			single_timeline(i915);
+
+		igt_subtest("exhaust-shared-gtt")
+			exhaust_shared_gtt(i915, 0);
+
+		igt_subtest("exhaust-shared-gtt-lrc")
+			exhaust_shared_gtt(i915, EXHAUST_LRC);
+
+		for (e = intel_execution_engines; e->name; e++) {
+			igt_subtest_f("exec-shared-gtt-%s", e->name)
+				exec_shared_gtt(i915, e->exec_id | e->flags);
+
+			if (!ignore_engine(e->exec_id | e->flags)) {
+				igt_subtest_f("exec-single-timeline-%s",
+					      e->name)
+					exec_single_timeline(i915,
+							     e->exec_id | e->flags);
+			}
+
+			/*
+			 * Check that the shared contexts operate independently,
+			 * that is requests on one ("queue") can be scheduled
+			 * around another queue. We only check the basics here,
+			 * enough to reduce the queue into just another context,
+			 * and so rely on gem_exec_schedule to prove the rest.
+			 */
+			igt_subtest_group {
+				igt_fixture {
+					gem_require_ring(i915, e->exec_id | e->flags);
+					igt_require(gem_can_store_dword(i915, e->exec_id | e->flags));
+					igt_require(gem_scheduler_enabled(i915));
+					igt_require(gem_scheduler_has_ctx_priority(i915));
+				}
+
+				igt_subtest_f("Q-independent-%s", e->name)
+					independent(i915, e->exec_id | e->flags, 0);
+
+				igt_subtest_f("Q-in-order-%s", e->name)
+					reorder(i915, e->exec_id | e->flags, EQUAL);
+
+				igt_subtest_f("Q-out-order-%s", e->name)
+					reorder(i915, e->exec_id | e->flags, 0);
+
+				igt_subtest_f("Q-promotion-%s", e->name)
+					promotion(i915, e->exec_id | e->flags);
+
+				igt_subtest_f("Q-smoketest-%s", e->name)
+					smoketest(i915, e->exec_id | e->flags, 5);
+			}
+		}
+
+		igt_subtest("Q-smoketest-all") {
+			igt_require(gem_scheduler_enabled(i915));
+			igt_require(gem_scheduler_has_ctx_priority(i915));
+			smoketest(i915, -1, 30);
+		}
+
+		igt_fixture {
+			igt_stop_hang_detector();
+		}
+	}
+}
diff --git a/tests/gem_exec_whisper.c b/tests/gem_exec_whisper.c
index 81303f847..a41614a7b 100644
--- a/tests/gem_exec_whisper.c
+++ b/tests/gem_exec_whisper.c
@@ -87,6 +87,7 @@ static void verify_reloc(int fd, uint32_t handle,
 #define HANG 0x20
 #define SYNC 0x40
 #define PRIORITY 0x80
+#define QUEUES 0x100
 
 struct hang {
 	struct drm_i915_gem_exec_object2 obj;
@@ -171,7 +172,7 @@ static void ctx_set_random_priority(int fd, uint32_t ctx)
 {
 	int prio = hars_petruska_f54_1_random_unsafe_max(1024) - 512;
 	gem_context_set_priority(fd, ctx, prio);
-};
+}
 
 static void whisper(int fd, unsigned engine, unsigned flags)
 {
@@ -223,6 +224,9 @@ static void whisper(int fd, unsigned engine, unsigned flags)
 	if (flags & CONTEXTS)
 		gem_require_contexts(fd);
 
+	if (flags & QUEUES)
+		igt_require(gem_has_queues(fd));
+
 	if (flags & HANG)
 		init_hang(&hang);
 
@@ -284,6 +288,10 @@ static void whisper(int fd, unsigned engine, unsigned flags)
 			for (n = 0; n < 64; n++)
 				contexts[n] = gem_context_create(fd);
 		}
+		if (flags & QUEUES) {
+			for (n = 0; n < 64; n++)
+				contexts[n] = gem_queue_create(fd);
+		}
 		if (flags & FDS) {
 			for (n = 0; n < 64; n++)
 				fds[n] = drm_open_driver(DRIVER_INTEL);
@@ -396,7 +404,7 @@ static void whisper(int fd, unsigned engine, unsigned flags)
 						execbuf.flags &= ~ENGINE_MASK;
 						execbuf.flags |= engines[rand() % nengine];
 					}
-					if (flags & CONTEXTS) {
+					if (flags & (CONTEXTS | QUEUES)) {
 						execbuf.rsvd1 = contexts[rand() % 64];
 						if (flags & PRIORITY)
 							ctx_set_random_priority(this_fd, execbuf.rsvd1);
@@ -475,7 +483,7 @@ static void whisper(int fd, unsigned engine, unsigned flags)
 			for (n = 0; n < 64; n++)
 				close(fds[n]);
 		}
-		if (flags & CONTEXTS) {
+		if (flags & (CONTEXTS | QUEUES)) {
 			for (n = 0; n < 64; n++)
 				gem_context_destroy(fd, contexts[n]);
 		}
@@ -507,18 +515,24 @@ igt_main
 		{ "chain-forked", CHAIN | FORKED },
 		{ "chain-interruptible", CHAIN | INTERRUPTIBLE },
 		{ "chain-sync", CHAIN | SYNC },
-		{ "contexts", CONTEXTS },
-		{ "contexts-interruptible", CONTEXTS | INTERRUPTIBLE},
-		{ "contexts-forked", CONTEXTS | FORKED},
-		{ "contexts-priority", CONTEXTS | FORKED | PRIORITY },
-		{ "contexts-chain", CONTEXTS | CHAIN },
-		{ "contexts-sync", CONTEXTS | SYNC },
 		{ "fds", FDS },
 		{ "fds-interruptible", FDS | INTERRUPTIBLE},
 		{ "fds-forked", FDS | FORKED},
 		{ "fds-priority", FDS | FORKED | PRIORITY },
 		{ "fds-chain", FDS | CHAIN},
 		{ "fds-sync", FDS | SYNC},
+		{ "contexts", CONTEXTS },
+		{ "contexts-interruptible", CONTEXTS | INTERRUPTIBLE},
+		{ "contexts-forked", CONTEXTS | FORKED},
+		{ "contexts-priority", CONTEXTS | FORKED | PRIORITY },
+		{ "contexts-chain", CONTEXTS | CHAIN },
+		{ "contexts-sync", CONTEXTS | SYNC },
+		{ "queues", QUEUES },
+		{ "queues-interruptible", QUEUES | INTERRUPTIBLE},
+		{ "queues-forked", QUEUES | FORKED},
+		{ "queues-priority", QUEUES | FORKED | PRIORITY },
+		{ "queues-chain", QUEUES | CHAIN },
+		{ "queues-sync", QUEUES | SYNC },
 		{ NULL }
 	};
 	int fd;
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH i-g-t 11/17] igt/gem_ctx_switch: Exercise queues
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

Queues are a form of context that shares a VM and enforces a single
timeline across all engines. Test switching between them, just like
ordinary contexts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
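A minimal sketch of the semantic being switched across, assuming the
gem_queue_create() helper from earlier in this series (illustrative
only; error handling elided):

	const uint32_t bbe = MI_BATCH_BUFFER_END;
	struct drm_i915_gem_exec_object2 obj = {
		.handle = gem_create(fd, 4096),
	};
	struct drm_i915_gem_execbuffer2 execbuf = {
		.buffers_ptr = to_user_pointer(&obj),
		.buffer_count = 1,
		.rsvd1 = gem_queue_create(fd), /* shared VM, one timeline */
	};

	gem_write(fd, obj.handle, 0, &bbe, sizeof(bbe));

	execbuf.flags = I915_EXEC_RENDER;
	gem_execbuf(fd, &execbuf); /* request 1 */

	execbuf.flags = I915_EXEC_BLT;
	gem_execbuf(fd, &execbuf); /* on a queue, request 2 runs after
				    * request 1; on an ordinary context
				    * the two engines are independent */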
 tests/gem_ctx_switch.c | 75 +++++++++++++++++++++++++++++++-----------
 1 file changed, 55 insertions(+), 20 deletions(-)

diff --git a/tests/gem_ctx_switch.c b/tests/gem_ctx_switch.c
index 766ff9ae9..6770e001f 100644
--- a/tests/gem_ctx_switch.c
+++ b/tests/gem_ctx_switch.c
@@ -43,7 +43,8 @@
 #define LOCAL_I915_EXEC_NO_RELOC (1<<11)
 #define LOCAL_I915_EXEC_HANDLE_LUT (1<<12)
 
-#define INTERRUPTIBLE 1
+#define INTERRUPTIBLE 0x1
+#define QUEUE 0x2
 
 static double elapsed(const struct timespec *start, const struct timespec *end)
 {
@@ -97,8 +98,12 @@ static void single(int fd, uint32_t handle,
 
 	gem_require_ring(fd, e->exec_id | e->flags);
 
-	for (n = 0; n < 64; n++)
-		contexts[n] = gem_context_create(fd);
+	for (n = 0; n < 64; n++) {
+		if (flags & QUEUE)
+			contexts[n] = gem_queue_create(fd);
+		else
+			contexts[n] = gem_context_create(fd);
+	}
 
 	memset(&obj, 0, sizeof(obj));
 	obj.handle = handle;
@@ -183,8 +188,12 @@ static void all(int fd, uint32_t handle, unsigned flags, int timeout)
 	}
 	igt_require(nengine);
 
-	for (n = 0; n < ARRAY_SIZE(contexts); n++)
-		contexts[n] = gem_context_create(fd);
+	for (n = 0; n < ARRAY_SIZE(contexts); n++) {
+		if (flags & QUEUE)
+			contexts[n] = gem_queue_create(fd);
+		else
+			contexts[n] = gem_context_create(fd);
+	}
 
 	memset(obj, 0, sizeof(obj));
 	obj[1].handle = handle;
@@ -248,6 +257,17 @@ igt_main
 {
 	const int ncpus = sysconf(_SC_NPROCESSORS_ONLN);
 	const struct intel_execution_engine *e;
+	static const struct {
+		const char *name;
+		unsigned int flags;
+		bool (*require)(int fd);
+	} phases[] = {
+		{ "", 0, NULL },
+		{ "-interruptible", INTERRUPTIBLE, NULL },
+		{ "-queue", QUEUE, gem_has_queues },
+		{ "-queue-interruptible", QUEUE | INTERRUPTIBLE, gem_has_queues },
+		{ }
+	};
 	uint32_t light = 0, heavy;
 	int fd = -1;
 
@@ -269,21 +289,26 @@ igt_main
 	}
 
 	for (e = intel_execution_engines; e->name; e++) {
-		igt_subtest_f("%s%s", e->exec_id == 0 ? "basic-" : "", e->name)
-			single(fd, light, e, 0, 1, 5);
-
-		igt_skip_on_simulation();
-
-		igt_subtest_f("%s%s-heavy", e->exec_id == 0 ? "basic-" : "", e->name)
-			single(fd, heavy, e, 0, 1, 5);
-		igt_subtest_f("%s-interruptible", e->name)
-			single(fd, light, e, INTERRUPTIBLE, 1, 150);
-		igt_subtest_f("forked-%s", e->name)
-			single(fd, light, e, 0, ncpus, 150);
-		igt_subtest_f("forked-%s-heavy", e->name)
-			single(fd, heavy, e, 0, ncpus, 150);
-		igt_subtest_f("forked-%s-interruptible", e->name)
-			single(fd, light, e, INTERRUPTIBLE, ncpus, 150);
+		for (typeof(*phases) *p = phases; p->name; p++) {
+			igt_subtest_group {
+				igt_fixture {
+					if (p->require)
+						igt_require(p->require(fd));
+				}
+
+				igt_subtest_f("%s%s%s", e->exec_id == 0 ? "basic-" : "", e->name, p->name)
+					single(fd, light, e, p->flags, 1, 5);
+
+				igt_skip_on_simulation();
+
+				igt_subtest_f("%s%s-heavy%s", e->exec_id == 0 ? "basic-" : "", e->name, p->name)
+					single(fd, heavy, e, p->flags, 1, 5);
+				igt_subtest_f("forked-%s%s", e->name, p->name)
+					single(fd, light, e, p->flags, ncpus, 150);
+				igt_subtest_f("forked-%s-heavy%s", e->name, p->name)
+					single(fd, heavy, e, p->flags, ncpus, 150);
+			}
+		}
 	}
 
 	igt_subtest("basic-all-light")
@@ -291,6 +316,16 @@ igt_main
 	igt_subtest("basic-all-heavy")
 		all(fd, heavy, 0, 5);
 
+	igt_subtest_group {
+		igt_fixture {
+			igt_require(gem_has_queues(fd));
+		}
+		igt_subtest("basic-queue-light")
+			all(fd, light, QUEUE, 5);
+		igt_subtest("basic-queue-heavy")
+			all(fd, heavy, QUEUE, 5);
+	}
+
 	igt_fixture {
 		igt_stop_hang_detector();
 		gem_close(fd, heavy);
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH i-g-t 12/17] igt/gem_exec_whisper: Fork all-engine tests one-per-engine
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

Add a new mode for some more stress: submit the all-engines tests
simultaneously, a stream per engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
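A condensed sketch of the fan-out this adds (see the hunks below for
the full context):

	nchild = sysconf(_SC_NPROCESSORS_ONLN) * nengine;
	igt_fork(child, nchild) {
		/* restrict each child to one engine of the original set */
		engines[0] = engines[child % nengine];
		nengine = 1;
	}
	igt_waitchildren();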
 tests/gem_exec_whisper.c | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/tests/gem_exec_whisper.c b/tests/gem_exec_whisper.c
index a41614a7b..d35807e30 100644
--- a/tests/gem_exec_whisper.c
+++ b/tests/gem_exec_whisper.c
@@ -88,6 +88,7 @@ static void verify_reloc(int fd, uint32_t handle,
 #define SYNC 0x40
 #define PRIORITY 0x80
 #define QUEUES 0x100
+#define ALL 0x200
 
 struct hang {
 	struct drm_i915_gem_exec_object2 obj;
@@ -197,6 +198,7 @@ static void whisper(int fd, unsigned engine, unsigned flags)
 	unsigned int eb_migrations = 0;
 	uint64_t old_offset;
 	int debugfs;
+	int nchild;
 
 	if (flags & PRIORITY) {
 		igt_require(gem_scheduler_enabled(fd));
@@ -212,6 +214,7 @@ static void whisper(int fd, unsigned engine, unsigned flags)
 				engines[nengine++] = engine;
 		}
 	} else {
+		igt_assert(!(flags & ALL));
 		igt_require(gem_has_ring(fd, engine));
 		igt_require(gem_can_store_dword(fd, engine));
 		engines[nengine++] = engine;
@@ -230,8 +233,19 @@ static void whisper(int fd, unsigned engine, unsigned flags)
 	if (flags & HANG)
 		init_hang(&hang);
 
+	nchild = 1;
+	if (flags & FORKED)
+		nchild *= sysconf(_SC_NPROCESSORS_ONLN);
+	if (flags & ALL)
+		nchild *= nengine;
+
 	intel_detect_and_clear_missed_interrupts(fd);
-	igt_fork(child, flags & FORKED ? sysconf(_SC_NPROCESSORS_ONLN) : 1)  {
+	igt_fork(child, nchild) {
+		if (flags & ALL) {
+			engines[0] = engines[child % nengine];
+			nengine = 1;
+		}
+
 		memset(&scratch, 0, sizeof(scratch));
 		scratch.handle = gem_create(fd, 4096);
 		scratch.flags = EXEC_OBJECT_WRITE;
@@ -334,7 +348,7 @@ static void whisper(int fd, unsigned engine, unsigned flags)
 			for (pass = 0; pass < 1024; pass++) {
 				uint64_t offset;
 
-				if (!(flags & FORKED))
+				if (nchild == 1)
 					write_seqno(debugfs, pass);
 
 				if (flags & HANG)
@@ -375,8 +389,8 @@ static void whisper(int fd, unsigned engine, unsigned flags)
 
 				gem_write(fd, batches[1023].handle, loc, &pass, sizeof(pass));
 				for (n = 1024; --n >= 1; ) {
+					uint32_t handle[2] = {};
 					int this_fd = fd;
-					uint32_t handle[2];
 
 					execbuf.buffers_ptr = to_user_pointer(&batches[n-1]);
 					reloc_migrations += batches[n-1].offset != inter[n].presumed_offset;
@@ -535,7 +549,7 @@ igt_main
 		{ "queues-sync", QUEUES | SYNC },
 		{ NULL }
 	};
-	int fd;
+	int fd = -1;
 
 	igt_fixture {
 		fd = drm_open_driver_master(DRIVER_INTEL);
@@ -546,9 +560,12 @@ igt_main
 		igt_fork_hang_detector(fd);
 	}
 
-	for (const struct mode *m = modes; m->name; m++)
+	for (const struct mode *m = modes; m->name; m++) {
 		igt_subtest_f("%s", m->name)
 			whisper(fd, ALL_ENGINES, m->flags);
+		igt_subtest_f("%s-all", m->name)
+			whisper(fd, ALL_ENGINES, m->flags | ALL);
+	}
 
 	for (const struct intel_execution_engine *e = intel_execution_engines;
 	     e->name; e++) {
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH i-g-t 13/17] igt: Add gem_ctx_engines
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

To exercise the new I915_CONTEXT_PARAM_ENGINES and its interaction
with gem_execbuf().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
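The core usage, condensed from execute_one() below (struct layout per
the proposed uAPI; class 0 is I915_ENGINE_CLASS_RENDER, and once a map
is installed the execbuf engine selector indexes into it):

	struct {
		uint64_t extensions;
		uint32_t class, instance;
	} engines = { .class = 0, .instance = 0 }; /* rcs0 only */
	struct drm_i915_gem_context_param param = {
		.ctx_id = ctx,
		.param = I915_CONTEXT_PARAM_ENGINES,
		.size = sizeof(engines),
		.value = to_user_pointer(&engines),
	};

	gem_context_set_param(i915, &param); /* install the engine map */
	execbuf.flags = 0; /* index 0 -> rcs0 */
	gem_execbuf(i915, &execbuf);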
 tests/Makefile.sources  |   1 +
 tests/gem_ctx_engines.c | 236 ++++++++++++++++++++++++++++++++++++++++
 tests/meson.build       |   1 +
 3 files changed, 238 insertions(+)
 create mode 100644 tests/gem_ctx_engines.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 08a1caf9d..d2245f265 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -53,6 +53,7 @@ TESTS_progs = \
 	gem_ctx_bad_destroy \
 	gem_ctx_bad_exec \
 	gem_ctx_create \
+	gem_ctx_engines \
 	gem_ctx_exec \
 	gem_ctx_isolation \
 	gem_ctx_param \
diff --git a/tests/gem_ctx_engines.c b/tests/gem_ctx_engines.c
new file mode 100644
index 000000000..5dcbc3c90
--- /dev/null
+++ b/tests/gem_ctx_engines.c
@@ -0,0 +1,236 @@
+/*
+ * Copyright © 2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "igt.h"
+
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <errno.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+
+#include <drm.h>
+
+#include "i915/gem_context.h"
+
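+/*
+ * Local copies of the proposed uAPI; the kernel series adding
+ * I915_CONTEXT_PARAM_ENGINES has not landed, so the installed
+ * headers do not yet carry these definitions.
+ */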
+struct i915_user_extension {
+	uint64_t next_extension;
+	uint64_t name;
+};
+
+struct i915_context_param_engines {
+	uint64_t extensions;
+
+	struct {
+		uint32_t class;
+		uint32_t instance;
+	} class_instance[0];
+};
+#define I915_CONTEXT_PARAM_ENGINES 0x7
+
+static bool has_context_engines(int i915)
+{
+	struct drm_i915_gem_context_param param = {
+		.ctx_id = 0,
+		.param = I915_CONTEXT_PARAM_ENGINES,
+	};
+	return __gem_context_set_param(i915, &param) == 0;
+}
+
+static void invalid_engines(int i915)
+{
+	struct i915_context_param_engines stack = {}, *engines;
+	struct drm_i915_gem_context_param param = {
+		.ctx_id = gem_context_create(i915),
+		.param = I915_CONTEXT_PARAM_ENGINES,
+		.value = to_user_pointer(&stack),
+	};
+	void *ptr;
+
+	param.size = 0;
+	igt_assert_eq(__gem_context_set_param(i915, &param), 0);
+
+	param.size = 1;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EINVAL);
+
+	param.size = sizeof(stack) - 1;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EINVAL);
+
+	param.size = sizeof(stack) + 1;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EINVAL);
+
+	param.size = 0;
+	igt_assert_eq(__gem_context_set_param(i915, &param), 0);
+
+	/* Create a single page surrounded by inaccessible nothingness */
+	ptr = mmap(NULL, 3 * 4096, PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
+	igt_assert(ptr != MAP_FAILED);
+
+	munmap(ptr, 4096);
+	engines = ptr + 4096;
+	munmap(ptr + 2 * 4096, 4096);
+
+	param.size = sizeof(*engines) + sizeof(*engines->class_instance);
+	param.value = to_user_pointer(engines);
+
+	engines->class_instance[0].class = -1;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -ENOENT);
+
+	mprotect(engines, 4096, PROT_READ);
+	igt_assert_eq(__gem_context_set_param(i915, &param), -ENOENT);
+
+	mprotect(engines, 4096, PROT_WRITE);
+	engines->class_instance[0].class = 0;
+	if (__gem_context_set_param(i915, &param)) /* XXX needs RCS */
+		goto out;
+
+	engines->extensions = to_user_pointer(ptr);
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+	engines->extensions = 0;
+	igt_assert_eq(__gem_context_set_param(i915, &param), 0);
+
+	param.value = to_user_pointer(engines - 1);
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+	param.value = to_user_pointer(engines) - 1;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+	param.value = to_user_pointer(engines) - param.size + 1;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+	param.value = to_user_pointer(engines) + 4096;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+	param.value = to_user_pointer(engines) - param.size + 4096;
+	igt_assert_eq(__gem_context_set_param(i915, &param), 0);
+
+	param.value = to_user_pointer(engines) - param.size + 4096 + 1;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+	param.value = to_user_pointer(engines) + 4096;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+	param.value = to_user_pointer(engines) + 4096 - 1;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+	param.value = to_user_pointer(engines) - 1;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+	param.value = to_user_pointer(engines - 1);
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+	param.value = to_user_pointer(engines - 1) + 4096;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+	param.value = to_user_pointer(engines - 1) + 4096 - sizeof(*engines->class_instance) / 2;
+	igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+out:
+	munmap(engines, 4096);
+	gem_context_destroy(i915, param.ctx_id);
+}
+
+static void execute_one(int i915)
+{
+	struct one_engine {
+		uint64_t extensions;
+		uint32_t class;
+		uint32_t instance;
+	} engines;
+	struct drm_i915_gem_context_param param = {
+		.ctx_id = gem_context_create(i915),
+		.param = I915_CONTEXT_PARAM_ENGINES,
+		.size = sizeof(engines),
+		.value = to_user_pointer(&engines),
+	};
+	struct drm_i915_gem_exec_object2 obj = {
+		.handle = gem_create(i915, 4096),
+	};
+	struct drm_i915_gem_execbuffer2 execbuf = {
+		.buffers_ptr = to_user_pointer(&obj),
+		.buffer_count = 1,
+		.rsvd1 = param.ctx_id,
+	};
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	const struct intel_execution_engine2 *e;
+
+	gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
+	memset(&engines, 0, sizeof(engines));
+
+	/* Unadulterated I915_EXEC_DEFAULT should work */
+	execbuf.flags = 0;
+	igt_assert_eq(__gem_execbuf(i915, &execbuf), 0);
+
+	for_each_engine_class_instance(i915, e) {
+		engines.class = e->class;
+		engines.instance = e->instance;
+		gem_context_set_param(i915, &param);
+
+		execbuf.flags = 0;
+		igt_assert_eq(__gem_execbuf(i915, &execbuf), -EINVAL);
+
+		execbuf.flags = 1;
+		igt_assert_eq(__gem_execbuf(i915, &execbuf), 0);
+
+		execbuf.flags = 2;
+		igt_assert_eq(__gem_execbuf(i915, &execbuf), -EINVAL);
+
+		/* XXX prove we used the right engine! */
+	}
+
+	/* Restore the defaults and check I915_EXEC_DEFAULT works again. */
+	param.size = 0;
+	gem_context_set_param(i915, &param);
+	execbuf.flags = 0;
+	igt_assert_eq(__gem_execbuf(i915, &execbuf), 0);
+
+	gem_close(i915, obj.handle);
+	gem_context_destroy(i915, param.ctx_id);
+}
+
+igt_main
+{
+	int i915 = -1;
+
+	igt_fixture {
+		i915 = drm_open_driver_render(DRIVER_INTEL);
+		igt_require_gem(i915);
+
+		gem_require_contexts(i915);
+		igt_require(has_context_engines(i915));
+	}
+
+	igt_subtest("invalid-engines")
+		invalid_engines(i915);
+
+	igt_subtest("execute-one")
+		execute_one(i915);
+}
diff --git a/tests/meson.build b/tests/meson.build
index 029b80434..c099e29a3 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -30,6 +30,7 @@ test_progs = [
 	'gem_ctx_bad_destroy',
 	'gem_ctx_bad_exec',
 	'gem_ctx_create',
+	'gem_ctx_engines',
 	'gem_ctx_exec',
 	'gem_ctx_isolation',
 	'gem_ctx_param',
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH i-g-t 14/17] igt: Add gem_exec_balancer
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

Exercise the in-kernel load balancer, checking that we can distribute
batches across the set of ctx->engines to balance the load.
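
The balancer is opted into by chaining an extension off the engine
map; a minimal sketch using the proposed structures exactly as this
test defines them (ctx as returned by gem_queue_create()):

	struct balancer {
		uint64_t next_extension;
		uint64_t name;
		uint64_t flags;
		uint64_t mask;
		uint64_t mbz[4];
	} balancer = {
		.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
		.mask = ~0ull, /* balance over every slot in the map */
	};
	struct {
		uint64_t extension;
		uint64_t class_instance[2]; /* two engines to balance */
	} engines = { .extension = to_user_pointer(&balancer) };
	struct drm_i915_gem_context_param p = {
		.ctx_id = ctx,
		.param = I915_CONTEXT_PARAM_ENGINES,
		.size = sizeof(engines),
		.value = to_user_pointer(&engines),
	};

With the extension in place, I915_EXEC_DEFAULT no longer names a
fixed engine but lets the kernel pick the least loaded member of the
map, which is what the full() subtests measure via the PMU.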

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/Makefile.am         |   1 +
 tests/Makefile.sources    |   1 +
 tests/gem_exec_balancer.c | 465 ++++++++++++++++++++++++++++++++++++++
 tests/meson.build         |   7 +
 4 files changed, 474 insertions(+)
 create mode 100644 tests/gem_exec_balancer.c

diff --git a/tests/Makefile.am b/tests/Makefile.am
index ba307b220..61da999ba 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -106,6 +106,7 @@ gem_close_race_CFLAGS = $(AM_CFLAGS) $(THREAD_CFLAGS)
 gem_close_race_LDADD = $(LDADD) -lpthread
 gem_ctx_thrash_CFLAGS = $(AM_CFLAGS) $(THREAD_CFLAGS)
 gem_ctx_thrash_LDADD = $(LDADD) -lpthread
+gem_exec_balancer_LDADD = $(LDADD) $(top_builddir)/lib/libigt_perf.la
 gem_exec_parallel_CFLAGS = $(AM_CFLAGS) $(THREAD_CFLAGS)
 gem_exec_parallel_LDADD = $(LDADD) -lpthread
 gem_fence_thrash_CFLAGS = $(AM_CFLAGS) $(THREAD_CFLAGS)
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index d2245f265..c4fba38dc 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -68,6 +68,7 @@ TESTS_progs = \
 	gem_exec_async \
 	gem_exec_await \
 	gem_exec_bad_domains \
+	gem_exec_balancer \
 	gem_exec_basic \
 	gem_exec_big \
 	gem_exec_blt \
diff --git a/tests/gem_exec_balancer.c b/tests/gem_exec_balancer.c
new file mode 100644
index 000000000..125d32304
--- /dev/null
+++ b/tests/gem_exec_balancer.c
@@ -0,0 +1,465 @@
+/*
+ * Copyright © 2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <sched.h>
+
+#include "igt.h"
+#include "igt_perf.h"
+#include "i915/gem_ring.h"
+#include "sw_sync.h"
+
+IGT_TEST_DESCRIPTION("Exercise in-kernel load-balancing");
+
+#define I915_CONTEXT_PARAM_ENGINES 0x7
+
+struct class_instance {
+	uint32_t class;
+	uint32_t instance;
+};
+
+static bool has_class_instance(int i915, uint32_t class, uint32_t instance)
+{
+	int fd;
+
+	fd = perf_i915_open(I915_PMU_ENGINE_BUSY(class, instance));
+	if (fd != -1) {
+		close(fd);
+		return true;
+	}
+
+	return false;
+}
+
+static struct class_instance *
+list_engines(int i915, uint32_t class_mask, unsigned int *out)
+{
+	unsigned int count = 0, size = 64;
+	struct class_instance *engines;
+
+	engines = malloc(size * sizeof(*engines));
+	if (!engines) {
+		*out = 0;
+		return NULL;
+	}
+
+	for (enum drm_i915_gem_engine_class class = I915_ENGINE_CLASS_RENDER;
+	     class_mask;
+	     class++, class_mask >>= 1) {
+		if (!(class_mask & 1))
+			continue;
+
+		for (unsigned int instance = 0;
+		     has_class_instance(i915, class, instance);
+		     instance++) {
+			if (count == size) {
+				struct class_instance *e;
+
+				size *= 2;
+				e = realloc(engines, size*sizeof(*engines));
+				if (!e) {
+					*out = count;
+					return engines;
+				}
+
+				engines = e;
+			}
+
+			engines[count++] = (struct class_instance){
+				.class = class,
+				.instance = instance,
+			};
+		}
+	}
+
+	if (!count) {
+		free(engines);
+		engines = NULL;
+	}
+
+	*out = count;
+	return engines;
+}
+
+static int __set_load_balancer(int i915, uint32_t ctx,
+			       const struct class_instance *ci,
+			       unsigned int count)
+{
+#define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0
+	struct balancer {
+		uint64_t next_extension;
+		uint64_t name;
+
+		uint64_t flags;
+		uint64_t mask;
+
+		uint64_t mbz[4];
+	} balancer = {
+		.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
+		.mask = ~0ull,
+	};
+	struct engines {
+		uint64_t extension;
+		uint64_t class_instance[count];
+	} engines;
+	struct drm_i915_gem_context_param p = {
+		.ctx_id = ctx,
+		.param = I915_CONTEXT_PARAM_ENGINES,
+		.size = sizeof(engines),
+		.value = to_user_pointer(&engines)
+	};
+
+	engines.extension = to_user_pointer(&balancer);
+	memcpy(engines.class_instance, ci, sizeof(engines.class_instance));
+
+	return __gem_context_set_param(i915, &p);
+}
+
+static void set_load_balancer(int i915, uint32_t ctx,
+			      const struct class_instance *ci,
+			      unsigned int count)
+{
+	igt_assert_eq(__set_load_balancer(i915, ctx, ci, count), 0);
+}
+
+static uint32_t load_balancer_create(int i915,
+				     const struct class_instance *ci,
+				     unsigned int count)
+{
+	uint32_t ctx;
+
+	ctx = gem_queue_create(i915);
+	set_load_balancer(i915, ctx, ci, count);
+
+	return ctx;
+}
+
+static void kick_kthreads(int period_us)
+{
+	sched_yield();
+	usleep(period_us);
+}
+
+static double measure_load(int pmu, int period_us)
+{
+	uint64_t data[2];
+	uint64_t d_t, d_v;
+
+	kick_kthreads(period_us);
+
+	igt_assert_eq(read(pmu, data, sizeof(data)), sizeof(data));
+	d_v = -data[0];
+	d_t = -data[1];
+
+	usleep(period_us);
+
+	igt_assert_eq(read(pmu, data, sizeof(data)), sizeof(data));
+	d_v += data[0];
+	d_t += data[1];
+
+	return d_v / (double)d_t;
+}
+
+static double measure_min_load(int pmu, unsigned int num, int period_us)
+{
+	uint64_t data[2 + num];
+	uint64_t d_t, d_v[num];
+	uint64_t min = -1, max = 0;
+
+	kick_kthreads(period_us);
+
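+	/*
+	 * Assumed group read layout: { nr_counters, time_enabled,
+	 * value[0], ..., value[num-1] }, so data[1] is the elapsed
+	 * time and data[2 + n] the busy time of engine n.
+	 */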
+	igt_assert_eq(read(pmu, data, sizeof(data)), sizeof(data));
+	for (unsigned int n = 0; n < num; n++)
+		d_v[n] = -data[2 + n];
+	d_t = -data[1];
+
+	usleep(period_us);
+
+	igt_assert_eq(read(pmu, data, sizeof(data)), sizeof(data));
+
+	d_t += data[1];
+	for (unsigned int n = 0; n < num; n++) {
+		d_v[n] += data[2 + n];
+		igt_debug("engine[%d]: %.1f%%\n",
+			  n, d_v[n] / (double)d_t * 100);
+		if (d_v[n] < min)
+			min = d_v[n];
+		if (d_v[n] > max)
+			max = d_v[n];
+	}
+
+	igt_debug("elapsed: %"PRIu64"ns, load [%.1f, %.1f]%%\n",
+		  d_t, min / (double)d_t * 100,  max / (double)d_t * 100);
+
+	return min / (double)d_t;
+}
+
+static void check_individual_engine(int i915,
+				    uint32_t ctx,
+				    const struct class_instance *ci,
+				    int idx)
+{
+	igt_spin_t *spin;
+	double load;
+	int pmu;
+
+	pmu = perf_i915_open(I915_PMU_ENGINE_BUSY(ci[idx].class,
+						  ci[idx].instance));
+
+	spin = igt_spin_batch_new(i915, ctx, idx + 1, 0);
+	load = measure_load(pmu, 10000);
+	igt_spin_batch_free(i915, spin);
+
+	close(pmu);
+
+	igt_assert_f(load > 0.90,
+		     "engine %d (class:instance %d:%d) was found to be only %.1f%% busy\n",
+		     idx, ci[idx].class, ci[idx].instance, load*100);
+}
+
+static void individual(int i915)
+{
+	uint32_t ctx;
+
+	/*
+	 * I915_CONTEXT_PARAM_ENGINES allows us to index into the user
+	 * supplied array from gem_execbuf(). Our check is to build
+	 * ctx->engine[] with various different engine classes, feed in
+	 * a spinner and then ask the PMU to confirm that the expected
+	 * engine was busy.
+	 */
+
+	ctx = gem_queue_create(i915);
+
+	for (int mask = 0; mask < 32; mask++) {
+		struct class_instance *ci;
+		unsigned int count;
+
+		ci = list_engines(i915, 1u << mask, &count);
+		if (!ci)
+			continue;
+
+		igt_debug("Found %d engines of class %d\n", count, mask);
+
+		for (int pass = 0; pass < count; pass++) { /* approx. count! */
+			igt_permute_array(ci, count, igt_exchange_int64);
+			set_load_balancer(i915, ctx, ci, count);
+			for (unsigned int n = 0; n < count; n++)
+				check_individual_engine(i915, ctx, ci, n);
+		}
+
+		free(ci);
+	}
+
+	gem_context_destroy(i915, ctx);
+}
+
+static int add_pmu(int pmu, const struct class_instance *ci)
+{
+	return perf_i915_open_group(I915_PMU_ENGINE_BUSY(ci->class,
+							 ci->instance),
+				    pmu);
+}
+
+static uint32_t batch_create(int i915)
+{
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	uint32_t handle;
+
+	handle = gem_create(i915, 4096);
+	gem_write(i915, handle, 0, &bbe, sizeof(bbe));
+
+	return handle;
+}
+
+static void full(int i915, unsigned int flags)
+#define PULSE 0x1
+#define LATE 0x2
+{
+	struct drm_i915_gem_exec_object2 batch = {
+		.handle = batch_create(i915),
+	};
+
+	if (flags & LATE)
+		igt_require_sw_sync();
+
+	/*
+	 * I915_CONTEXT_PARAM_ENGINES changes the meaning of I915_EXEC_DEFAULT
+	 * to provide an automatic selection from the ctx->engine[]. It
+	 * employs load-balancing to evenly distribute the workload across
+	 * the array. If we submit N spinners, we expect them to be running
+	 * simultaneously across N engines and use the PMU to confirm that
+	 * the entire set of engines is busy.
+	 *
+	 * We complicate matters by interspersing short-lived tasks to
+	 * challenge the kernel to find space in which to insert new batches.
+	 */
+
+	for (int mask = 0; mask < 32; mask++) {
+		struct class_instance *ci;
+		igt_spin_t *spin = NULL;
+		unsigned int count;
+		IGT_CORK_FENCE(cork);
+		double load;
+		int fence = -1;
+		int *pmu;
+
+		ci = list_engines(i915, 1u << mask, &count);
+		if (!ci)
+			continue;
+
+		igt_debug("Found %d engines of class %d\n", count, mask);
+
+		pmu = malloc(sizeof(*pmu) * count);
+		igt_assert(pmu);
+
+		if (flags & LATE)
+			fence = igt_cork_plug(&cork, i915);
+
+		pmu[0] = -1;
+		for (unsigned int n = 0; n < count; n++) {
+			uint32_t ctx;
+
+			pmu[n] = add_pmu(pmu[0], &ci[n]);
+
+			if (flags & PULSE) {
+				struct drm_i915_gem_execbuffer2 eb = {
+					.buffers_ptr = to_user_pointer(&batch),
+					.buffer_count = 1,
+					.rsvd2 = fence,
+					.flags = flags & LATE ? I915_EXEC_FENCE_IN : 0,
+				};
+
+				gem_execbuf(i915, &eb);
+			}
+
+			/*
+			 * Each spinner needs to be on a new timeline,
+			 * otherwise they will just sit in the single queue
+			 * and not run concurrently.
+			 */
+			ctx = load_balancer_create(i915, ci, count);
+
+			if (spin == NULL) {
+				spin = __igt_spin_batch_new(i915, ctx, 0, 0);
+			} else {
+				struct drm_i915_gem_exec_object2 obj = {
+					.handle = spin->handle,
+				};
+				struct drm_i915_gem_execbuffer2 eb = {
+					.buffers_ptr = to_user_pointer(&obj),
+					.buffer_count = 1,
+					.rsvd1 = ctx,
+					.rsvd2 = fence,
+					.flags = flags & LATE ? I915_EXEC_FENCE_IN : 0,
+				};
+
+				gem_execbuf(i915, &eb);
+			}
+
+			gem_context_destroy(i915, ctx);
+		}
+
+		if (flags & LATE) {
+			igt_cork_unplug(&cork);
+			close(fence);
+		}
+
+		load = measure_min_load(pmu[0], count, 10000);
+		igt_spin_batch_free(i915, spin);
+
+		close(pmu[0]);
+		free(pmu);
+
+		free(ci);
+
+		igt_assert_f(load > 0.90,
+			     "minimum load for %d x class:%d was found to be only %.1f%% busy\n",
+			     count, mask, load*100);
+	}
+
+	gem_close(i915, batch.handle);
+}
+
+static bool has_context_engines(int i915)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_ENGINES,
+	};
+
+	return __gem_context_set_param(i915, &p) == 0;
+}
+
+static bool has_load_balancer(int i915)
+{
+	struct class_instance ci = {};
+	uint32_t ctx;
+	int err;
+
+	ctx = gem_queue_create(i915);
+	err = __set_load_balancer(i915, ctx, &ci, 1);
+	gem_context_destroy(i915, ctx);
+
+	return err == 0;
+}
+
+igt_main
+{
+	int i915 = -1;
+
+	igt_skip_on_simulation();
+
+	igt_fixture {
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+
+		gem_require_contexts(i915);
+		igt_require(has_context_engines(i915));
+		igt_require(has_load_balancer(i915));
+
+		igt_fork_hang_detector(i915);
+	}
+
+	igt_subtest("individual")
+		individual(i915);
+
+	igt_subtest_group {
+		static const struct {
+			const char *name;
+			unsigned int flags;
+		} phases[] = {
+			{ "", 0 },
+			{ "-pulse", PULSE },
+			{ "-late", LATE },
+			{ "-late-pulse", PULSE | LATE },
+			{ }
+		};
+		for (typeof(*phases) *p = phases; p->name; p++)
+			igt_subtest_f("full%s", p->name)
+				full(i915, p->flags);
+	}
+
+	igt_fixture {
+		igt_stop_hang_detector();
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index c099e29a3..a3e9ee9f1 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -282,6 +282,13 @@ test_executables += executable('gem_eio', 'gem_eio.c',
 	   install : true)
 test_progs += 'gem_eio'
 
+test_executables += executable('gem_exec_balancer', 'gem_exec_balancer.c',
+	   dependencies : test_deps + [ lib_igt_perf ],
+	   install_dir : libexecdir,
+	   install_rpath : rpathdir,
+	   install : true)
+test_progs += 'gem_exec_balancer'
+
 test_executables += executable('perf_pmu', 'perf_pmu.c',
 	   dependencies : test_deps + [ lib_igt_perf ],
 	   install_dir : libexecdir,
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH i-g-t 15/17] benchmarks/wsim: Simulate and interpret .wsim
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

A little tool I've been meaning to write for a while... Convert the
.wsim workloads into their DAGs, find the longest chains, and evaluate
them on a simulated machine.
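
For reference, a hypothetical three-step workload in the dotted
format parsed by add_task() below
(context.engine.duration[-max].deps.sync):

	1.RCS.1000.0.0
	1.VCS.2800-3100.-1.0
	1.RCS.500.-1.1

Step 2 depends on step 1, and step 3 depends on step 2 and, with its
sync flag set, becomes a barrier for any later steps; the parser also
adds an implicit dependency along each context+engine timeline.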

v2: Implement barriers to handle sync commands

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/Makefile.sources |   1 +
 benchmarks/sim_wsim.c       | 419 ++++++++++++++++++++++++++++++++++++
 2 files changed, 420 insertions(+)
 create mode 100644 benchmarks/sim_wsim.c

diff --git a/benchmarks/Makefile.sources b/benchmarks/Makefile.sources
index 86928df5c..633623527 100644
--- a/benchmarks/Makefile.sources
+++ b/benchmarks/Makefile.sources
@@ -17,6 +17,7 @@ benchmarks_prog_list =			\
 	gem_wsim			\
 	kms_vblank			\
 	prime_lookup			\
+	sim_wsim			\
 	vgem_mmap			\
 	$(NULL)
 
diff --git a/benchmarks/sim_wsim.c b/benchmarks/sim_wsim.c
new file mode 100644
index 000000000..5f3d56045
--- /dev/null
+++ b/benchmarks/sim_wsim.c
@@ -0,0 +1,419 @@
+#include <ctype.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "igt_aux.h"
+
+#if 0
+#define DBG(...) printf(__VA_ARGS__)
+#else
+#define DBG(...) do { } while (0)
+#endif
+
+struct dependency {
+	struct task *signal;
+	struct igt_list link;
+};
+
+struct task {
+	int step;
+	int ctx;
+	int class;
+	int instance;
+
+	unsigned long min, max;
+	unsigned long duration;
+	unsigned long deadline;
+
+	struct igt_list link;
+	struct igt_list sched;
+	struct igt_list signals;
+
+	bool visited;
+};
+
+struct work {
+	struct task **tasks;
+	unsigned int count, size;
+
+	struct igt_list ends;
+	struct task *barrier;
+};
+
+static void add_dependency(struct task *task, struct task *signal)
+{
+	struct dependency *dep;
+
+	DBG("%s %d (context %d, engine %d) -> %d (context %d, engine %d)\n",
+	    __func__,
+	    signal->step, signal->ctx, signal->class,
+	    task->step, task->ctx, task->class);
+
+	dep = malloc(sizeof(*dep));
+	dep->signal = signal;
+	igt_list_add(&dep->link, &task->signals);
+	if (!igt_list_empty(&signal->link)) {
+		igt_list_del(&signal->link);
+		igt_list_init(&signal->link);
+	}
+}
+
+enum class {
+	RCS,
+	BCS,
+	VCS,
+	VECS,
+};
+
+static void add_task(struct work *work, char *line)
+{
+#define TSEP "."
+
+	static const char *engines[] = {
+		[RCS]  = "rcs",
+		[BCS]  = "bcs",
+		[VCS]  = "vcs",
+		[VECS] = "vecs",
+	};
+	struct task *task;
+	char *token;
+	int i;
+
+	DBG("line='%s'\n", line);
+
+	token = strtok(line, TSEP);
+
+	if (!strcasecmp(token, "s")) { /* sync point */
+		int sync = atoi(strtok(NULL, TSEP));
+
+		DBG("syncpt %d\n", sync);
+
+		work->barrier = work->tasks[work->count + sync];
+		return;
+	}
+
+	if (!isdigit(*token)) {
+		fprintf(stderr, "Ignoring step '%s' at your peril!\n", token);
+		return;
+	}
+
+	/*
+	 * 1.RCS.2800-3100.-1.0
+	 * - context
+	 * - engine
+	 * - delay
+	 * - dependencies
+	 * - sync
+	 */
+	task = malloc(sizeof(*task));
+
+	igt_list_init(&task->signals);
+	task->step = work->count;
+	task->visited = false;
+
+	/* context */
+	DBG("context='%s'\n", token);
+	task->ctx = atoi(token);
+
+	/* engine */
+	token = strtok(NULL, TSEP);
+	DBG("engine='%s'\n", token);
+	task->class = -1;
+	for (i = 0; i < sizeof(engines)/sizeof(*engines); i++) {
+		const int len = strlen(engines[i]);
+		if (!strncasecmp(token, engines[i], len)) {
+			task->class = i;
+			if (token[len])
+				task->instance = atoi(token + len);
+			else
+				task->instance = -1;
+			break;
+		}
+	}
+
+	/* delay */
+	token = strtok(NULL, TSEP);
+	DBG("delay='%s'\n", token);
+	task->min = strtol(token, &token, 0);
+	if (*token)
+		task->max = strtol(token + 1, NULL, 0);
+	else
+		task->max = task->min;
+	task->duration = (task->min + task->max) / 2;
+	DBG("min=%lu, max=%lu; duration=%lu\n", task->min, task->max, task->duration);
+
+	/* dependencies */
+	token = strtok(NULL, TSEP);
+	DBG("deps='%s'\n", token);
+	while ((i = strtol(token, &token, 0))) {
+		add_dependency(task, work->tasks[work->count + i]);
+		if (*token)
+			token++;
+	}
+
+	/* add a dependency for the context+engine timeline */
+	for (i = work->count; --i >= 0; ) {
+		if (work->tasks[i]->ctx == task->ctx &&
+		    work->tasks[i]->class == task->class) {
+			add_dependency(task, work->tasks[i]);
+			break;
+		}
+	}
+
+	if (work->barrier)
+		add_dependency(task, work->barrier);
+
+	/* sync -- we become the barrier */
+	if (atoi(strtok(NULL, TSEP))) {
+		DBG("marking as a sync point\n");
+		work->barrier = task;
+	}
+
+	igt_list_add(&task->link, &work->ends);
+	work->tasks[work->count++] = task;
+
+#undef TSEP
+}
+
+static struct work *parse_work(FILE *file)
+{
+	struct work *work;
+	char *line = NULL;
+	size_t len = 0;
+
+	work = malloc(sizeof(*work));
+	igt_list_init(&work->ends);
+	work->barrier = NULL;
+
+	work->size = 64;
+	work->count = 0;
+	work->tasks = malloc(sizeof(*work->tasks) * work->size);
+
+	while (getline(&line, &len, file) != -1) {
+		if (work->count == work->size) {
+			work->size *= 2;
+			work->tasks = realloc(work->tasks,
+					      sizeof(*work->tasks) * work->size);
+		}
+		add_task(work, line);
+	}
+
+	free(line);
+
+	DBG("%d tasks\n", work->count);
+	return work;
+}
+
+static unsigned long sum_durations(struct task *task)
+{
+	unsigned long max_duration = 0;
+	struct task *signal = NULL;
+	struct dependency *dep;
+
+	igt_list_for_each(dep, &task->signals, link) {
+		if (dep->signal->duration > max_duration) {
+			signal = dep->signal;
+			max_duration = signal->duration;
+		}
+	}
+
+	return task->duration + (signal ? sum_durations(signal) : 0);
+}
+
+static void ideal_depth(struct work *work)
+{
+	unsigned long total_duration;
+	unsigned long max_duration;
+	struct task *task;
+	int i;
+
+	/*
+	 * The ideal depth is the longest chain of dependencies as the
+	 * dependency chain requires sequential task execution. Each
+	 * chain is assumed to be run in parallel on an infinite set of
+	 * engines, so the ratelimiting step is its longest path.
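+	 *
+	 * For example (hypothetical numbers): chains A(100us) -> B(100us)
+	 * and C(150us) run in parallel, so the ideal duration is 200us
+	 * against a total duration of 350us.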
+	 */
+	max_duration = 0;
+	igt_list_for_each(task, &work->ends, link) {
+		unsigned long duration = sum_durations(task);
+		if (duration > max_duration)
+			max_duration = duration;
+	}
+
+	total_duration = 0;
+	for (i = 0; i < work->count; i++)
+		total_duration += work->tasks[i]->duration;
+
+	printf("Single client\n");
+	printf("   total duration %luus; %.2f wps\n", total_duration, 1e6/total_duration);
+	printf("   ideal duration %luus; %.2f wps\n", max_duration, 1e6/max_duration);
+}
+
+struct sim_class {
+	int ninstance;
+	unsigned long deadline[4];
+	unsigned long busy[4];
+};
+
+static void simulate_client(struct work *work)
+{
+	struct sim_class sim[] = {
+		[RCS]  = { 1 },
+		[BCS]  = { 1 },
+		[VCS]  = { 2 },
+		[VECS] = { 1 },
+	}, *class;
+	IGT_LIST(sched);
+	struct task *task;
+	unsigned long max;
+	int i, j;
+
+	printf("Simulated clients:\n");
+
+	for (i = 0; i < work->count; i++)
+		igt_list_init(&work->tasks[i]->sched);
+
+	igt_list_for_each(task, &work->ends, link)
+		igt_list_add_tail(&task->sched, &sched);
+
+	igt_list_for_each(task, &sched, sched) {
+		struct dependency *dep;
+
+		igt_list_for_each(dep, &task->signals, link)
+			igt_list_move_tail(&dep->signal->sched, &sched);
+	}
+
+	igt_list_for_each_reverse(task, &sched, sched) {
+		struct dependency *dep;
+		int instance;
+
+		class = &sim[task->class];
+		instance = task->instance;
+		if (instance < 0) {
+			/* pick the least loaded instance in the class */
+			instance = 0;
+			for (i = 1; i < class->ninstance; i++) {
+				if (class->deadline[i] <
+				    class->deadline[instance])
+					instance = i;
+			}
+		}
+
+		max = class->deadline[instance];
+
+		/*
+		 * Greedy (first available), not true optimal scheduling.
+		 *
+		 * For optimal, we do have to compute the global optimal
+		 * ordering by checking every permutation...
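+		 *
+		 * e.g. (hypothetical numbers) three tasks of 3, 3 and 4
+		 * units on two instances: greedy packs {3+4, 3} for a
+		 * makespan of 7, whereas the optimal {3+3, 4} gives 6.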
+		 */
+		igt_list_for_each(dep, &task->signals, link) {
+			if (dep->signal->deadline > max)
+				max = dep->signal->deadline;
+		}
+
+		DBG("task %d: engine %d, instance %d; finish %lu\n",
+		    task->step, task->class, instance, max);
+
+		task->deadline = max + task->duration;
+		class->deadline[instance] = task->deadline;
+		class->busy[instance] += task->duration;
+	}
+
+	max = 0;
+	for (i = 0; i < sizeof(sim)/sizeof(sim[0]); i++) {
+		class = &sim[i];
+		for (j = 0; j < class->ninstance; j++) {
+			if (class->deadline[j] > max)
+				max = class->deadline[j];
+		}
+	}
+	printf("   single duration %luus; %.2f wps\n", max, 1e6/max);
+
+	/*
+	 * Compute the maximum duration required on any engine.
+	 *
+	 * With sufficient clients forcing maximum occupancy under their weight,
+	 * the ratelimiting step becomes a single engine and how many clients
+	 * it takes to fill.
+	 */
+	max = 0;
+	for (i = 0; i < sizeof(sim)/sizeof(sim[0]); i++) {
+		class = &sim[i];
+		for (j = 0; j < class->ninstance; j++) {
+			if (class->busy[j] > max)
+				max = class->busy[j];
+		}
+	}
+	printf("   packed duration %luus; %.2f wps\n", max, 1e6/max);
+}
+
+static void graphviz(struct work *work)
+{
+#if 0
+	int i, j;
+
+	printf("digraph {\n");
+	printf("  rankdir=LR;\n");
+	printf("  splines=line;\n");
+	printf("\n");
+
+	for (i = 0; i < work->count; i++) {
+		struct task *task = work->tasks[i];
+
+		if (task->visited)
+			goto skip;
+
+		printf("  subgraph cluster_%d {\n", task->ctx);
+		printf("    label=\"Context %d\"\n", task->ctx);
+		for (j = i; j < work->count; j++) {
+			if (work->tasks[j]->ctx == task->ctx) {
+				printf("    task_%03d;\n", j);
+				work->tasks[j]->visited = true;
+			}
+		}
+		printf("  }\n\n");
+
+skip:
+		task->visited = false;
+	}
+
+	for (i = 0; i < work->count; i++) {
+		struct task *task = work->tasks[i];
+		struct dependency *dep;
+
+		igt_list_for_each(dep, &task->signals, link) {
+			printf("  task_%03d -> task_%03d;\n",
+			       dep->signal->step, task->step);
+		}
+	}
+
+	printf("}\n");
+#endif
+}
+
+int main(int argc, char **argv)
+{
+	int i;
+
+	for (i = 1; i < argc; i++) {
+		FILE *file = fopen(argv[i], "r");
+		struct work *work;
+
+		if (!file) {
+			perror(argv[i]);
+			return 1;
+		}
+
+		work = parse_work(file);
+		fclose(file);
+
+		graphviz(work);
+
+		ideal_depth(work);
+		simulate_client(work);
+	}
+
+	return 0;
+}
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [igt-dev] [PATCH i-g-t 15/17] benchmarks/wsim: Simulate and interpret .wsim
@ 2018-07-02  9:07   ` Chris Wilson
  0 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx, Tvrtko Ursulin

A little tool I've been meaning to write for a while... Convert the
.wsim into their dag and find the longest chains and evaluate them on an
simulated machine.

v2: Implement barriers to handle sync commands

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/Makefile.sources |   1 +
 benchmarks/sim_wsim.c       | 419 ++++++++++++++++++++++++++++++++++++
 2 files changed, 420 insertions(+)
 create mode 100644 benchmarks/sim_wsim.c

diff --git a/benchmarks/Makefile.sources b/benchmarks/Makefile.sources
index 86928df5c..633623527 100644
--- a/benchmarks/Makefile.sources
+++ b/benchmarks/Makefile.sources
@@ -17,6 +17,7 @@ benchmarks_prog_list =			\
 	gem_wsim			\
 	kms_vblank			\
 	prime_lookup			\
+	sim_wsim			\
 	vgem_mmap			\
 	$(NULL)
 
diff --git a/benchmarks/sim_wsim.c b/benchmarks/sim_wsim.c
new file mode 100644
index 000000000..5f3d56045
--- /dev/null
+++ b/benchmarks/sim_wsim.c
@@ -0,0 +1,419 @@
+#include <ctype.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "igt_aux.h"
+
+#if 0
+#define DBG(...) printf(__VA_ARGS__)
+#else
+#define DBG(...) do { } while(0)
+#endif
+
+struct dependency {
+	struct task *signal;
+	struct igt_list link;
+};
+
+struct task {
+	int step;
+	int ctx;
+	int class;
+	int instance;
+
+	unsigned long min, max;
+	unsigned long duration;
+	unsigned long deadline;
+
+	struct igt_list link;
+	struct igt_list sched;
+	struct igt_list signals;
+
+	bool visited;
+};
+
+struct work {
+	struct task **tasks;
+	unsigned int count, size;
+
+	struct igt_list ends;
+	struct task *barrier;
+};
+
+static void add_dependency(struct task *task, struct task *signal)
+{
+	struct dependency *dep;
+
+	DBG("%s %d (context %d, engine %d) -> %d (context %d, engine %d)\n",
+	    __func__,
+	    signal->step, signal->ctx, signal->class,
+	    task->step, task->ctx, task->class);
+
+	dep = malloc(sizeof(*dep));
+	dep->signal = signal;
+	igt_list_add(&dep->link, &task->signals);
+	if (!igt_list_empty(&signal->link)) {
+		igt_list_del(&signal->link);
+		igt_list_init(&signal->link);
+	}
+}
+
+enum class {
+	RCS,
+	BCS,
+	VCS,
+	VECS,
+};
+
+static void add_task(struct work *work, char *line)
+{
+#define TSEP "."
+
+	static const char *engines[] = {
+		[RCS]  = "rcs",
+		[BCS]  = "bcs",
+		[VCS]  = "vcs",
+		[VECS] = "vecs",
+	};
+	struct task *task;
+	char *token;
+	int i;
+
+	DBG("line='%s'\n", line);
+
+	token = strtok(line, TSEP);
+
+	if (!strcasecmp(token, "s")) { /* sync point */
+		int sync = atoi(strtok(NULL, TSEP));
+
+		DBG("syncpt %d\n", sync);
+
+		work->barrier = work->tasks[work->count + sync];
+		return;
+	}
+
+	if (!isdigit(*token)) {
+		fprintf(stderr, "Ignoring step '%s' at your peril!\n", token);
+		return;
+	}
+
+	/*
+	 * 1.RCS.2800-3100.-1.0
+	 * - context
+	 * - engine
+	 * - delay
+	 * - dependencies
+	 * - sync
+	 */
+	task = malloc(sizeof(*task));
+
+	igt_list_init(&task->signals);
+	task->step = work->count;
+	task->visited = false;
+
+	/* context */
+	DBG("context='%s'\n", token);
+	task->ctx = atoi(token);
+
+	/* engine */
+	token = strtok(NULL, TSEP);
+	DBG("engine='%s'\n", token);
+	task->class = -1;
+	for (i = 0; i < sizeof(engines)/sizeof(*engines); i++) {
+		const int len = strlen(engines[i]);
+		if (!strncasecmp(token, engines[i], len)) {
+			task->class = i;
+			if (token[len])
+				task->instance = atoi(token + len);
+			else
+				task->instance = -1;
+			break;
+		}
+	}
+
+	/* delay */
+	token = strtok(NULL, TSEP);
+	DBG("delay='%s'\n", token);
+	task->min = strtol(token, &token, 0);
+	if (*token)
+		task->max = strtol(token + 1, NULL, 0);
+	else
+		task->max = task->min;
+	task->duration = (task->min + task->max) / 2;
+	DBG("min=%lu, max=%lu; duration=%lu\n", task->min, task->max, task->duration);
+
+	/* dependencies */
+	token = strtok(NULL, TSEP);
+	DBG("deps='%s'\n", token);
+	while ((i = strtol(token, &token, 0))) {
+		add_dependency(task, work->tasks[work->count + i]);
+		if (*token)
+			token++;
+	}
+
+	/* add a dependency for the context+engine timeline */
+	for (i = work->count; --i >= 0; ) {
+		if (work->tasks[i]->ctx == task->ctx &&
+		    work->tasks[i]->class == task->class) {
+			add_dependency(task, work->tasks[i]);
+			break;
+		}
+	}
+
+	if (work->barrier)
+		add_dependency(task, work->barrier);
+
+	/* sync -- we become the barrier */
+	if (atoi(strtok(NULL, TSEP))) {
+		DBG("marking as a sync point\n");
+		work->barrier = task;
+	}
+
+	igt_list_add(&task->link, &work->ends);
+	work->tasks[work->count++] = task;
+
+#undef TSEP
+}
+
+static struct work *parse_work(FILE *file)
+{
+	struct work *work;
+	char *line = NULL;
+	size_t len = 0;
+
+	work = malloc(sizeof(*work));
+	igt_list_init(&work->ends);
+	work->barrier = NULL;
+
+	work->size = 64;
+	work->count = 0;
+	work->tasks = malloc(sizeof(*work->tasks) * work->size);
+
+	while (getline(&line, &len, file) != -1) {
+		if (work->count == work->size) {
+			work->size *= 2;
+			work->tasks = realloc(work->tasks,
+					      sizeof(*work->tasks) * work->size);
+		}
+		add_task(work, line);
+	}
+
+	free(line);
+
+	DBG("%d tasks\n", work->count);
+	return work;
+}
+
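+/*
+ * Follow the heaviest immediate dependency at each step; a greedy
+ * approximation of the critical path rather than an exhaustive search.
+ */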
+static unsigned long sum_durations(struct task *task)
+{
+	unsigned long max_duration = 0;
+	struct task *signal = NULL;
+	struct dependency *dep;
+
+	igt_list_for_each(dep, &task->signals, link) {
+		if (dep->signal->duration > max_duration) {
+			signal = dep->signal;
+			max_duration = signal->duration;
+		}
+	}
+
+	return task->duration + (signal ? sum_durations(signal) : 0);
+}
+
+static void ideal_depth(struct work *work)
+{
+	unsigned long total_duration;
+	unsigned long max_duration;
+	struct task *task;
+	int i;
+
+	/*
+	 * The ideal depth is the longest chain of dependencies as the
+	 * dependency chain requires sequential task execution. Each
+	 * chain is assumed to be run in parallel on an infinite set of
+	 * engines, so the ratelimiting step is its longest path.
+	 */
+	max_duration = 0;
+	igt_list_for_each(task, &work->ends, link) {
+		unsigned long duration = sum_durations(task);
+		if (duration > max_duration)
+			max_duration = duration;
+	}
+
+	total_duration = 0;
+	for (i = 0; i < work->count; i++)
+		total_duration += work->tasks[i]->duration;
+
+	printf("Single client\n");
+	printf("   total duration %luus; %.2f wps\n", total_duration, 1e6/total_duration);
+	printf("   ideal duration %luus; %.2f wps\n", max_duration, 1e6/max_duration);
+}
+
+struct sim_class {
+	int ninstance;
+	unsigned long deadline[4];
+	unsigned long busy[4];
+};
+
+static void simulate_client(struct work *work)
+{
+	struct sim_class sim[] = {
+		[RCS]  = { 1 },
+		[BCS]  = { 1 },
+		[VCS]  = { 2 },
+		[VECS] = { 1 },
+	}, *class;
+	IGT_LIST(sched);
+	struct task *task;
+	unsigned long max;
+	int i, j;
+
+	printf("Simulated clients:\n");
+
+	for (i = 0; i < work->count; i++)
+		igt_list_init(&work->tasks[i]->sched);
+
+	igt_list_for_each(task, &work->ends, link)
+		igt_list_add_tail(&task->sched, &sched);
+
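+	/*
+	 * Walk the list while moving each task's signalers to the tail;
+	 * every consumer then precedes its producers, so the reverse
+	 * walk below visits tasks in dependency order.
+	 */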
+	igt_list_for_each(task, &sched, sched) {
+		struct dependency *dep;
+
+		igt_list_for_each(dep, &task->signals, link)
+			igt_list_move_tail(&dep->signal->sched, &sched);
+	}
+
+	igt_list_for_each_reverse(task, &sched, sched) {
+		struct dependency *dep;
+		int instance;
+
+		class = &sim[task->class];
+		max = class->deadline[0];
+
+		instance = task->instance;
+		if (instance < 0) {
+			instance = 0;
+			for (i = 1; i < class->ninstance; i++) {
+				if (class->deadline[i] < max) {
+					max = class->deadline[i];
+					instance = i;
+				}
+			}
+		}
+
+		/*
+		 * Greedy (first available), not true optimal scheduling.
+		 *
+		 * For optimal, we do have to compute the global optimal
+		 * ordering by checking every permutation...
+		 */
+		igt_list_for_each(dep, &task->signals, link) {
+			if (dep->signal->deadline > max)
+				max = dep->signal->deadline;
+		}
+
+		DBG("task %d: engine %d, instance %d; finish %lu\n",
+		    task->step, task->class, instance, max);
+
+		task->deadline = max + task->duration;
+		class->deadline[instance] = task->deadline;
+		class->busy[instance] += task->duration;
+	}
+
+	max = 0;
+	for (i = 0; i < sizeof(sim)/sizeof(sim[0]); i++) {
+		class = &sim[i];
+		for (j = 0; j < class->ninstance; j++) {
+			if (class->deadline[j] > max)
+				max = class->deadline[j];
+		}
+	}
+	printf("   single duration %luus; %.2f wps\n", max, 1e6/max);
+
+	/*
+	 * Compute the maximum duration required on any engine.
+	 *
+	 * With sufficient clients forcing maximum occupancy under their weight,
+	 * the ratelimiting step becomes a single engine and how many clients
+	 * it takes to fill.
+	 */
+	max = 0;
+	for (i = 0; i < sizeof(sim)/sizeof(sim[0]); i++) {
+		class = &sim[i];
+		for (j = 0; j < class->ninstance; j++) {
+			if (class->busy[j] > max)
+				max = class->busy[j];
+		}
+	}
+	printf("   packed duration %luus; %.2f wps\n", max, 1e6/max);
+}
+
+static void graphviz(struct work *work)
+{
+#if 0
+	int i, j;
+
+	printf("digraph {\n");
+	printf("  rankdir=LR;\n");
+	printf("  splines=line;\n");
+	printf("\n");
+
+	for (i = 0; i < work->count; i++) {
+		struct task *task = work->tasks[i];
+
+		if (task->visited)
+			goto skip;
+
+		printf("  subgraph cluster_%d {\n", task->ctx);
+		printf("    label=\"Context %d\"\n", task->ctx);
+		for (j = i; j < work->count; j++) {
+			if (work->tasks[j]->ctx == task->ctx) {
+				printf("    task_%03d;\n", j);
+				work->tasks[j]->visited = true;
+			}
+		}
+		printf("  }\n\n");
+
+skip:
+		task->visited = false;
+	}
+
+	for (i = 0; i < work->count; i++) {
+		struct task *task = work->tasks[i];
+		struct dependency *dep;
+
+		igt_list_for_each(dep, &task->signals, link) {
+			printf("  task_%03d -> task_%03d;\n",
+			       dep->signal->step, task->step);
+		}
+	}
+
+	printf("}\n");
+#endif
+}
+
+int main(int argc, char **argv)
+{
+	int i;
+
+	for (i = 1; i < argc; i++) {
+		FILE *file = fopen(argv[i], "r");
+		struct work *work;
+
+		if (!file) {
+			perror(argv[i]);
+			return 1;
+		}
+
+		work = parse_work(file);
+		fclose(file);
+
+		graphviz(work);
+
+		ideal_depth(work);
+		simulate_client(work);
+	}
+
+	return 0;
+}
-- 
2.18.0

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH i-g-t 16/17] tools: capture execution pathways
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

igt_kcov edges -o edges.yaml command-to-trace args
...
igt_kcov sort -o sorted.yaml edges.yaml
igt_kcov list --total=30 --single=0.5 sorted.yaml

Requires CONFIG_KCOV

Use LD_PRELOAD to wrap calls to DRM ioctls and capture the execution
trace of all basic blocks invoked directly from the syscall. (Key
limitation!) From the chain of basic blocks, the "edges" are computed
and the number of times each edge is hit is stored in a vector (modulo a
large hash). The goal then is to select a series of tests that execute
the most unique pathways in the shortest amount of time. To do this we
compare the similarity of the coverage vectors between all commands, and
try to preferentially pick those that are dissimilar from the rest.

* Tracing only the direct execution pathways from the syscall is a huge
limitation for evaluating general coverage of driver pathways, but is not
so huge when myopically focusing on the ABI as used by, say, mesa.

** Caveat lector. Hastily thrown together. (Still)
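
For reference, the ranking treats each command's (quantised) edge table
as a vector and scores a pair of commands by the squared cosine
similarity of their vectors,

    sim(a, b) = (a . b)^2 / ((a . a) * (b . b))

which is what edges_similarity() below computes; 1 - sim(a, b) is the
pairwise dissimilarity that the greedy ranking tries to maximise. Each
record in edges.yaml then looks like (hypothetical values):

---
cmd: |
 'tests/gem_exec_nop'
elapsed: 152000000 # 152.0ms
edges: !!ascii85.gz |
  <ascii85-encoded, deflate-compressed 64KiB edge table>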
---
 meson.build            |    1 +
 tools/Makefile.am      |   11 +-
 tools/igt_kcov.c       | 1218 ++++++++++++++++++++++++++++++++++++++++
 tools/igt_kcov_edges.c |  145 +++++
 tools/meson.build      |   13 +
 5 files changed, 1387 insertions(+), 1 deletion(-)
 create mode 100644 tools/igt_kcov.c
 create mode 100644 tools/igt_kcov_edges.c

diff --git a/meson.build b/meson.build
index 638c01066..213745954 100644
--- a/meson.build
+++ b/meson.build
@@ -173,6 +173,7 @@ math = cc.find_library('m')
 realtime = cc.find_library('rt')
 dlsym = cc.find_library('dl')
 zlib = cc.find_library('z')
+yaml = cc.find_library('yaml', required: false)
 
 if cc.has_header('linux/kd.h')
 	config.set('HAVE_LINUX_KD_H', 1)
diff --git a/tools/Makefile.am b/tools/Makefile.am
index a0b016ddd..ae70f5a76 100644
--- a/tools/Makefile.am
+++ b/tools/Makefile.am
@@ -1,6 +1,7 @@
 include Makefile.sources
 
-bin_PROGRAMS = $(tools_prog_lists)
+bin_PROGRAMS = $(tools_prog_lists) igt_kcov
+lib_LTLIBRARIES = igt_kcov_edges.la
 
 if HAVE_LIBDRM_INTEL
 bin_PROGRAMS += $(LIBDRM_INTEL_BIN)
@@ -30,6 +31,14 @@ intel_aubdump_la_LIBADD = $(top_builddir)/lib/libintel_tools.la -ldl
 
 intel_gpu_top_LDADD = $(top_builddir)/lib/libigt_perf.la
 
+# XXX FIXME
+igt_kcov_SOURCES = igt_kcov.c
+igt_kcov_LDADD = -lyaml -lz -lrt -lm
+
+igt_kcov_edges_la_SOURCES = igt_kcov_edges.c
+igt_kcov_edges_la_LDFLAGS = -module -no-undefined -avoid-version
+igt_kcov_edges_la_LIBADD = -ldl
+
 bin_SCRIPTS = intel_aubdump
 CLEANFILES = $(bin_SCRIPTS)
 
diff --git a/tools/igt_kcov.c b/tools/igt_kcov.c
new file mode 100644
index 000000000..e9179a4b3
--- /dev/null
+++ b/tools/igt_kcov.c
@@ -0,0 +1,1218 @@
+#include <sys/types.h>
+#include <sys/file.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/wait.h>
+
+#include <ctype.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <getopt.h>
+#include <inttypes.h>
+#include <math.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <unistd.h>
+
+#include <yaml.h>
+#include <zlib.h>
+
+#define SZ_8 (1 << 16)
+#define SZ_32 (sizeof(uint32_t) * SZ_8)
+
+static pid_t child = 0;
+
+static void sighandler(int sig)
+{
+	kill(child, sig);
+}
+
+static uint64_t gettime(void)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_MONOTONIC, &ts);
+
+	return (uint64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
+}
+
+struct exec {
+	const char *preload;
+	int uid;
+	int gid;
+};
+
+static int edges_exec(const struct exec *args, char **argv, uint64_t *elapsed)
+{
+	char buf[10];
+	int status;
+	int kcov;
+	int shm;
+	int ret;
+
+	kcov = open("/sys/kernel/debug/kcov", O_RDWR);
+	if (kcov < 0)
+		return -errno;
+
+	shm = memfd_create("igt_kcov", 0);
+	if (shm == -1) {
+		close(kcov);
+		return -errno;
+	}
+
+	ftruncate(shm, SZ_32);
+
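+	/*
+	 * Both fds stay open across fork/exec; their numbers are
+	 * published via IGT_SHM_FD/IGT_KCOV_FD so that the LD_PRELOADed
+	 * shim in the child can map the shared edge table and drive
+	 * kcov itself.
+	 */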
+	switch ((child = fork())) {
+	case 0: /* child */
+		if (args->gid)
+			setgid(args->gid);
+		if (args->uid)
+			setuid(args->uid);
+
+		sprintf(buf, "%d", shm);
+		setenv("IGT_SHM_FD", buf, 1);
+
+		sprintf(buf, "%d", kcov);
+		setenv("IGT_KCOV_FD", buf, 1);
+
+		setenv("LD_PRELOAD", args->preload, 1);
+		exit(execvp(argv[0], argv));
+		break;
+
+	case -1:
+		ret = -errno;
+		break;
+
+	default:
+		signal(SIGINT, sighandler);
+
+		*elapsed = -gettime();
+		do {
+			ret = waitpid(child, &status, 0);
+			if (ret == -1)
+				ret = -errno;
+		} while (ret == -EINTR);
+		*elapsed += gettime();
+
+		signal(SIGINT, SIG_DFL);
+		child = 0;
+	}
+
+	close(kcov);
+	if (ret < 0) {
+		close(shm);
+		shm = ret;
+	}
+
+	return shm;
+}
+
+static inline unsigned long __fls(unsigned long word)
+{
+#if defined(__GNUC__) && (defined(__i386__) || defined(__x86__) || defined(__x86_64__))
+	asm("bsr %1,%0"
+	    : "=r" (word)
+	    : "rm" (word));
+	return word;
+#else
+	unsigned int v = 0;
+
+	while (word >>= 1)
+		v++;
+
+	return v;
+#endif
+}
+
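+/*
+ * Squash a 32-bit hit count into 8 bits: exact below 16, then an
+ * approximately logarithmic bucketing (four buckets per power of two),
+ * so that a hot edge cannot swamp the similarity comparison.
+ */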
+static uint8_t lower_u32(uint32_t x)
+{
+	if (x < 16)
+		return x;
+	else
+		return (x >> (__fls(x) - 2)) + ((__fls(x) - 1) << 2);
+}
+
+static uint8_t *lower_edges(int shm)
+{
+	uint8_t *lower = NULL;
+	uint32_t *tbl;
+	unsigned int n;
+
+	tbl = mmap(NULL, SZ_32, PROT_READ, MAP_SHARED, shm, 0);
+	if (tbl == MAP_FAILED)
+		return NULL;
+
+	if (tbl[0] == 0) /* empty */
+		goto out;
+
+	lower = malloc(SZ_8);
+	if (!lower)
+		goto out;
+
+	for (n = 0; n < 1 << 16; n++)
+		lower[n] = lower_u32(tbl[n]);
+
+out:
+	munmap(tbl, SZ_32);
+	return lower;
+}
+
+static bool ascii85_encode(uint32_t in, char *out)
+{
+	int i;
+
+	if (in == 0)
+		return false;
+
+	for (i = 5; i--; ) {
+		out[i] = '!' + in % 85;
+		in /= 85;
+	}
+
+	return true;
+}
+
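+/*
+ * Deflate the 64KiB edge table and wrap it in ascii85 so that it can
+ * be embedded as a YAML block scalar; an all-zero word is emitted as
+ * the single-character 'z' shorthand.
+ */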
+static char *edges_to_ascii85(uint8_t *tbl)
+{
+	z_stream zs;
+	uint32_t *p;
+	void *out;
+	char *str, *s;
+	int sz;
+
+	sz = SZ_8 * 3 / 2;
+	out = malloc(sz);
+	if (!out)
+		return NULL;
+
+	memset(&zs, 0, sizeof(zs));
+	if (deflateInit(&zs, 9)) {
+		free(out);
+		return NULL;
+	}
+
+	zs.next_in = tbl;
+	zs.avail_in = SZ_8;
+	zs.total_in = 0;
+	zs.avail_out = sz;
+	zs.total_out = 0;
+	zs.next_out = out;
+
+	deflate(&zs, Z_FINISH);
+	deflateEnd(&zs);
+
+	if (zs.total_out & 3)
+		memset((char *)out + zs.total_out, 0, 4 - (zs.total_out & 3));
+	zs.total_out = (zs.total_out + 3) / 4;
+
+	str = malloc(zs.total_out * 5 + 1);
+	if (!str) {
+		free(out);
+		return NULL;
+	}
+
+	p = out;
+	s = str;
+	for (int i = 0; i < zs.total_out; i++) {
+		if (ascii85_encode(*p++, s))
+			s += 5;
+		else
+			*s++ = 'z';
+	}
+	*s++ = '\0';
+	free(out);
+
+	return str;
+}
+
+static void edges(int argc, char **argv)
+{
+	static const struct option longopts[] = {
+		{"output", required_argument, 0, 'o'},
+		{"preload", required_argument, 0, 'p'},
+		{"user", required_argument, 0, 'u'},
+		{"group", required_argument, 0, 'g'},
+		{ NULL, 0, NULL, 0 }
+	};
+	struct exec args = {
+		.preload = "/tmp/igt_kcov_edges.so",
+	};
+	FILE *out = stdout;
+	uint64_t elapsed;
+	uint8_t *tbl;
+	char *str, *s;
+	int shm, i;
+
+	while ((i = getopt_long(argc, argv,
+				"+o:p:u:g:",
+				longopts, NULL)) != -1) {
+		switch (i) {
+		case 'o':
+			if (strcmp(optarg, "-"))
+				out = fopen(optarg, "a");
+			if (!out) {
+				fprintf(stderr,
+					"Unable to open output file '%s'\n",
+					optarg);
+				exit(1);
+			}
+			break;
+		case 'p':
+			args.preload = optarg;
+			break;
+		case 'u':
+			args.uid = atoi(optarg);
+			break;
+		case 'g':
+			args.gid = atoi(optarg);
+			break;
+		}
+	}
+	argc -= optind;
+	argv += optind;
+
+	if (argc < 1) {
+		fprintf(stderr,
+			"usage: igt_kcov edges [options] program-to-trace args...\n");
+		exit(1);
+	}
+
+	shm = edges_exec(&args, argv, &elapsed);
+	if (shm < 0) {
+		fprintf(stderr,
+			"Execution of %s failed: err=%d\n",
+			argv[0], shm);
+		exit(1);
+	}
+
+	tbl = lower_edges(shm);
+	close(shm);
+	if (!tbl)
+		exit(1);
+
+	str = edges_to_ascii85(tbl);
+
+	flock(fileno(out), LOCK_EX);
+	fprintf(out, "---\n");
+
+	fprintf(out, "cmd: |  \n");
+	for (i = 0; i < argc; i++) {
+		fprintf(out, " '");
+		for (s = argv[i]; *s; s++) {
+			if (*s == '\'')
+				fprintf(out, "\\\'");
+			else
+				fprintf(out, "%c", *s);
+		}
+		fprintf(out, "'");
+	}
+	fprintf(out, "\n");
+
+	fprintf(out, "elapsed: %"PRIu64" # %.1fms\n", elapsed, 1e-6 * elapsed);
+
+	fprintf(out, "edges: !!ascii85.gz |\n");
+	i = strlen(str);
+	s = str;
+	while (i) {
+		int len = i > 70 ? 70 : i;
+		char tmp;
+
+		tmp = s[len];
+		s[len] = '\0';
+		fprintf(out, "  %s\n", s);
+		s[len] = tmp;
+
+		s += len;
+		i -= len;
+	}
+
+	fflush(out);
+	flock(fileno(out), LOCK_UN);
+
+	free(str);
+}
+
+static unsigned long zlib_inflate(void *in, unsigned long len,
+				  void *ptr, unsigned long max)
+{
+	struct z_stream_s zstream;
+
+	memset(&zstream, 0, sizeof(zstream));
+
+	zstream.next_in = in;
+	zstream.avail_in = len;
+
+	if (inflateInit(&zstream) != Z_OK)
+		return 0;
+
+	zstream.next_out = ptr;
+	zstream.avail_out = max;
+
+	switch (inflate(&zstream, Z_SYNC_FLUSH)) {
+	case Z_STREAM_END:
+	case Z_OK:
+		break;
+	default:
+		zstream.total_out = 0;
+		break;
+	}
+
+	inflateEnd(&zstream);
+	return zstream.total_out;
+}
+
+static unsigned long ascii85_decode(const char *in,
+				    void *ptr, unsigned long max)
+{
+	unsigned long sz = max / sizeof(uint32_t);
+	unsigned long len = 0;
+	uint32_t *out;
+
+	out = malloc(sz * sizeof(uint32_t));
+	if (out == NULL)
+		return 0;
+
+	while (*in) {
+		uint32_t v = 0;
+
+		if (isspace(*in)) {
+			in++;
+			continue;
+		}
+
+		if (*in < '!' || *in > 'z') {
+			fprintf(stderr, "Invalid value in ascii85 block\n");
+			free(out);
+			return 0;
+		}
+
+		if (len == sz) {
+			uint32_t *tmp;
+
+			sz *= 2;
+			tmp = realloc(out, sz * sizeof(uint32_t));
+			if (tmp == NULL) {
+				free(out);
+				return 0;
+			}
+			out = tmp;
+		}
+
+		if (*in == 'z') {
+			in++;
+		} else {
+			v += in[0] - 33; v *= 85;
+			v += in[1] - 33; v *= 85;
+			v += in[2] - 33; v *= 85;
+			v += in[3] - 33; v *= 85;
+			v += in[4] - 33;
+			in += 5;
+		}
+		out[len++] = v;
+	}
+
+	len = zlib_inflate(out, len * sizeof(*out), ptr, max);
+	free(out);
+
+	return len;
+}
+
+static void yaml_print_parser_error(yaml_parser_t *parser, FILE *stream)
+{
+	switch (parser->error) {
+	case YAML_MEMORY_ERROR:
+		fprintf(stream, "Memory error: Not enough memory for parsing\n");
+		break;
+
+	case YAML_READER_ERROR:
+		if (parser->problem_value != -1) {
+			fprintf(stream, "Reader error: %s: #%X at %zd\n", parser->problem,
+				parser->problem_value, parser->problem_offset);
+		} else {
+			fprintf(stream, "Reader error: %s at %zd\n", parser->problem,
+				parser->problem_offset);
+		}
+		break;
+
+	case YAML_SCANNER_ERROR:
+		if (parser->context) {
+			fprintf(stream, "Scanner error: %s at line %lu, column %lu\n"
+				"%s at line %lu, column %lu\n", parser->context,
+				parser->context_mark.line+1, parser->context_mark.column+1,
+				parser->problem, parser->problem_mark.line+1,
+				parser->problem_mark.column+1);
+		} else {
+			fprintf(stream, "Scanner error: %s at line %lu, column %lu\n",
+				parser->problem, parser->problem_mark.line+1,
+				parser->problem_mark.column+1);
+		}
+		break;
+
+	case YAML_PARSER_ERROR:
+		if (parser->context) {
+			fprintf(stream, "Parser error: %s at line %lu, column %lu\n"
+				"%s at line %lu, column %lu\n", parser->context,
+				parser->context_mark.line+1, parser->context_mark.column+1,
+				parser->problem, parser->problem_mark.line+1,
+				parser->problem_mark.column+1);
+		} else {
+			fprintf(stream, "Parser error: %s at line %lu, column %lu\n",
+				parser->problem, parser->problem_mark.line+1,
+				parser->problem_mark.column+1);
+		}
+		break;
+
+	default:
+		/* Couldn't happen. */
+		fprintf(stream, "Internal error\n");
+		break;
+	}
+}
+
+struct edges {
+	struct edges *next;
+
+	char *command;
+	uint64_t elapsed;
+
+	unsigned int weight;
+
+	uint8_t tbl[SZ_8];
+};
+
+struct sort {
+	struct edges *edges;
+	unsigned int count;
+
+	unsigned int max_weight;
+	struct edges *best;
+};
+
+static bool sort_parse_command(struct sort *sort,
+			       struct edges *e,
+			       yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+	const char *s;
+	int len;
+
+	if (!yaml_parser_parse(parser, &ev))
+		return false;
+
+	switch (ev.type) {
+	case YAML_SCALAR_EVENT:
+		break;
+
+	default:
+		return false;
+	}
+
+	s = (const char *)ev.data.scalar.value;
+	len = strlen(s);
+	while (len > 0 && s[len - 1] == '\n')
+		len--;
+	e->command = malloc(len + 1);
+	if (e->command) {
+		memcpy(e->command, s, len);
+		e->command[len] = '\0';
+	}
+	yaml_event_delete(&ev);
+
+	return true;
+}
+
+static bool sort_parse_elapsed(struct sort *sort,
+			       struct edges *e,
+			       yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+	const char *s;
+
+	if (!yaml_parser_parse(parser, &ev))
+		return false;
+
+	switch (ev.type) {
+	case YAML_SCALAR_EVENT:
+		break;
+
+	default:
+		return false;
+	}
+
+	s = (const char *)ev.data.scalar.value;
+	e->elapsed = strtoull(s, NULL, 0);
+	yaml_event_delete(&ev);
+
+	return true;
+}
+
+static unsigned int bitmap_weight(const void *bitmap, unsigned int bits)
+{
+	const uint32_t *b = bitmap;
+	unsigned int k, lim = bits / 32;
+	unsigned int w = 0;
+
+	for (k = 0; k < lim; k++)
+		w += __builtin_popcount(b[k]);
+
+	if (bits % 32)
+		w += __builtin_popcount(b[k] << (32 - bits % 32));
+
+	return w;
+}
+
+static bool sort_parse_edges(struct sort *sort,
+			     struct edges *e,
+			     yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+	const char *s;
+
+	if (!yaml_parser_parse(parser, &ev))
+		return false;
+
+	switch (ev.type) {
+	case YAML_SCALAR_EVENT:
+		break;
+
+	default:
+		return false;
+	}
+
+	s = (const char *)ev.data.scalar.value;
+	if (ascii85_decode(s, e->tbl, sizeof(e->tbl)))
+		e->weight = bitmap_weight(e->tbl, sizeof(e->tbl) * 8);
+	yaml_event_delete(&ev);
+
+	return true;
+}
+
+static bool edges_valid(const struct edges *e)
+{
+	if (!e->command)
+		return false;
+
+	if (!e->weight)
+		return false;
+
+	if (!e->elapsed)
+		return false;
+
+	return true; /* good enough at least */
+}
+
+static bool sort_add_edges(struct sort *sort, struct edges *e)
+{
+	if (!edges_valid(e))
+		return false;
+
+	e->next = sort->edges;
+	sort->edges = e;
+
+	if (e->weight > sort->max_weight) {
+		sort->max_weight = e->weight;
+		sort->best = e;
+	}
+
+	sort->count++;
+
+	return true;
+}
+
+static bool sort_parse_node(struct sort *sort, yaml_parser_t *parser)
+{
+	struct edges *e;
+	yaml_event_t ev;
+	char *s;
+
+	e = malloc(sizeof(*e));
+	if (!e)
+		return false;
+
+	e->weight = 0;
+
+	do {
+		if (!yaml_parser_parse(parser, &ev))
+			goto err;
+
+		switch (ev.type) {
+		case YAML_MAPPING_END_EVENT:
+			if (!sort_add_edges(sort, e))
+				goto err;
+
+			return true;
+
+		case YAML_SCALAR_EVENT:
+			break;
+
+		default:
+			goto err;
+		}
+
+		s = (char *)ev.data.scalar.value;
+		if (!strcmp(s, "cmd")) {
+			sort_parse_command(sort, e, parser);
+		} else if (!strcmp(s, "elapsed")) {
+			sort_parse_elapsed(sort, e, parser);
+		} else if (!strcmp(s, "edges")) {
+			sort_parse_edges(sort, e, parser);
+		} else {
+			fprintf(stderr,
+				"Unknown element in edges file: %s\n",
+				s);
+		}
+
+		yaml_event_delete(&ev);
+	} while (1);
+
+err:
+	free(e);
+	return false;
+}
+
+static bool sort_parse_doc(struct sort *sort, yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+
+	do {
+		if (!yaml_parser_parse(parser, &ev))
+			return false;
+
+		switch (ev.type) {
+		case YAML_DOCUMENT_END_EVENT:
+			return true;
+
+		case YAML_MAPPING_START_EVENT:
+			break;
+
+		default:
+			return false;
+		}
+		yaml_event_delete(&ev);
+
+		if (!sort_parse_node(sort, parser))
+			return false;
+	} while(1);
+}
+
+static bool sort_add(struct sort *sort, FILE *file)
+{
+	yaml_parser_t parser;
+	bool done = false;
+	yaml_event_t ev;
+
+	yaml_parser_initialize(&parser);
+	yaml_parser_set_input_file(&parser, file);
+
+	memset(&ev, 0, sizeof(ev));
+	yaml_parser_parse(&parser, &ev);
+	if (ev.type != YAML_STREAM_START_EVENT) {
+		fprintf(stderr, "Parser setup failed\n");
+		return false;
+	}
+	yaml_event_delete(&ev);
+
+	do {
+		if (!yaml_parser_parse(&parser, &ev)) {
+			yaml_print_parser_error(&parser, stderr);
+			return false;
+		}
+
+		switch (ev.type) {
+		case YAML_DOCUMENT_START_EVENT:
+			break;
+
+		case YAML_STREAM_END_EVENT:
+			done = true;
+			break;
+
+		default:
+			return false;
+		}
+		yaml_event_delete(&ev);
+
+		sort_parse_doc(sort, &parser);
+	} while (!done);
+
+	yaml_parser_delete(&parser);
+	return true;
+}
+
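+/*
+ * Squared cosine similarity of the two quantised edge vectors:
+ * (a.b)^2 / ((a.a) * (b.b)); 1 for identical coverage, tending to 0
+ * for disjoint coverage.
+ */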
+static double edges_similarity(const struct edges *ta, const struct edges *tb)
+{
+	uint64_t sab, saa, sbb;
+	const uint8_t *a, *b;
+
+	if (ta == tb)
+		return 1;
+
+	a = ta->tbl;
+	b = tb->tbl;
+	sab = 0;
+	saa = 0;
+	sbb = 0;
+
+	for (unsigned int i = 0; i < SZ_8; i++) {
+		sab += (uint32_t)a[i] * b[i];
+		saa += (uint32_t)a[i] * a[i];
+		sbb += (uint32_t)b[i] * b[i];
+	}
+
+	return ((long double)sab * sab) / ((long double)saa * sbb);
+}
+
+static void rank_by_dissimilarity(struct sort *sort, FILE *out)
+{
+	const unsigned int count = sort->count;
+	double *M, *dis, *sim;
+	struct edges **edges, **t;
+	bool *used;
+	int last = -1;
+
+	t = edges = malloc(count * sizeof(*edges));
+	for (struct edges *e = sort->edges; e; e = e->next)
+		*t++ = e;
+
+	M = malloc(sizeof(double) * count * (count + 2));
+	dis = M + count * count;
+	sim = dis + count;
+	for (int i = 0; i < count; i++) {
+		dis[i] = 0.;
+		sim[i] = 1.;
+		for (int j = 0; j < i; j++)
+			dis[i] += M[j*count + i];
+		for (int j = i + 1; j < count; j++) {
+			double d = 1. - edges_similarity(edges[i], edges[j]);
+			M[i*count + j] = d;
+			dis[i] += d;
+		}
+	}
+
+	fprintf(out, "---\n");
+
+	used = calloc(count, sizeof(bool));
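+	/*
+	 * Greedily emit the command with the largest total dissimilarity,
+	 * weighted by how dissimilar it is to everything already picked,
+	 * so near-duplicates of earlier picks sink towards the bottom.
+	 */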
+	for (int rank = 0; rank < count; rank++) {
+		struct edges *e;
+		double best = -HUGE_VAL;
+		int this = -1;
+
+		for (int i = 0; i < count; i++) {
+			double d;
+
+			if (used[i])
+				continue;
+
+			d = dis[i];
+			if (last != -1) {
+				double s;
+
+				if (last < i)
+					s = M[last * count + i];
+				else
+					s = M[i * count + last];
+
+				s *= sim[i];
+				sim[i] = s;
+
+				d *= sqrt(s);
+			}
+			if (d > best) {
+				best = d;
+				this = i;
+			}
+		}
+
+		if (this < 0)
+			break;
+
+		e = edges[this];
+		used[this] = true;
+		last = this;
+
+		fprintf(out, "- cmd: |\n    %s\n", e->command);
+		fprintf(out, "  elapsed: %"PRIu64" # %.1fms\n",
+			e->elapsed, 1e-6 * e->elapsed);
+		fprintf(out, "  dissimilarity: %.4g\n",
+			sqrt(best / (count - rank)));
+	}
+
+	free(used);
+	free(M);
+	free(edges);
+}
+
+static void sort(int argc, char **argv)
+{
+	static const struct option longopts[] = {
+		{"output", required_argument, 0, 'o'},
+		{ NULL, 0, NULL, 0 }
+	};
+	FILE *out = stdout;
+	struct sort sort;
+	int i;
+
+	while ((i = getopt_long(argc, argv,
+				"+o:",
+				longopts, NULL)) != -1) {
+		switch (i) {
+		case 'o':
+			if (strcmp(optarg, "-"))
+				out = fopen(optarg, "a");
+			if (!out) {
+				fprintf(stderr,
+					"Unable to open output file '%s'\n",
+					optarg);
+				exit(1);
+			}
+			break;
+		}
+	}
+	argc -= optind;
+	argv += optind;
+
+	memset(&sort, 0, sizeof(sort));
+
+	if (argc == 0) {
+		sort_add(&sort, stdin);
+	} else {
+		for (i = 0; i < argc; i++) {
+			FILE *in;
+
+			in = fopen(argv[i], "r");
+			if (!in) {
+				fprintf(stderr, "unable to open input '%s'\n",
+					argv[i]);
+				exit(1);
+			}
+
+			sort_add(&sort, in);
+
+			fclose(in);
+		}
+	}
+
+	if (!sort.count)
+		return;
+
+	rank_by_dissimilarity(&sort, out);
+}
+
+struct list {
+	struct {
+		double single;
+		double total;
+	} limit;
+	double total;
+	FILE *out;
+};
+
+struct list_node {
+	char *cmd;
+	uint64_t elapsed;
+};
+
+static bool list_parse_command(struct list *list,
+			       struct list_node *node,
+			       yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+	const char *s;
+	int len;
+
+	if (!yaml_parser_parse(parser, &ev))
+		return false;
+
+	switch (ev.type) {
+	case YAML_SCALAR_EVENT:
+		break;
+
+	default:
+		return false;
+	}
+
+	s = (const char *)ev.data.scalar.value;
+	len = strlen(s);
+	while (len > 0 && isspace(s[len - 1]))
+		len--;
+
+	node->cmd = strndup(s, len);
+	yaml_event_delete(&ev);
+
+	return true;
+}
+
+static bool list_parse_elapsed(struct list *list,
+			       struct list_node *node,
+			       yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+	const char *s;
+
+	if (!yaml_parser_parse(parser, &ev))
+		return false;
+
+	switch (ev.type) {
+	case YAML_SCALAR_EVENT:
+		break;
+
+	default:
+		return false;
+	}
+
+	s = (const char *)ev.data.scalar.value;
+	node->elapsed = strtoull(s, NULL, 0);
+	yaml_event_delete(&ev);
+
+	return true;
+}
+
+static bool list_parse_node(struct list *list, yaml_parser_t *parser)
+{
+	struct list_node node = {};
+	yaml_event_t ev;
+	char *s;
+
+	do {
+		if (!yaml_parser_parse(parser, &ev))
+			return false;
+
+		switch (ev.type) {
+		case YAML_MAPPING_END_EVENT:
+			if (list->limit.single &&
+			    1e-9 * node.elapsed > list->limit.single) {
+				free(node.cmd);
+				return true;
+			}
+
+			if (list->limit.total &&
+			    list->total + 1e-9 * node.elapsed > list->limit.total) {
+				free(node.cmd);
+				return true;
+			}
+
+			if (node.cmd) {
+				list->total += 1e-9 * node.elapsed;
+				fprintf(list->out,
+					"%s # %.3fms, total %.1fms\n",
+					node.cmd,
+					1e-6 * node.elapsed,
+					1e3 * list->total);
+				free(node.cmd);
+			}
+			return true;
+
+		case YAML_SCALAR_EVENT:
+			break;
+
+		default:
+			return false;
+		}
+
+		s = (char *)ev.data.scalar.value;
+		if (!strcmp(s, "cmd")) {
+			list_parse_command(list, &node, parser);
+		} else if (!strcmp(s, "elapsed")) {
+			if (!list_parse_elapsed(list, &node,parser))
+				return false;
+		}
+		yaml_event_delete(&ev);
+	} while (1);
+}
+
+static bool list_parse_sequence(struct list *list, yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+
+	do {
+		if (!yaml_parser_parse(parser, &ev))
+			return false;
+
+		switch (ev.type) {
+		case YAML_SEQUENCE_END_EVENT:
+			return true;
+
+		case YAML_MAPPING_START_EVENT:
+			if (!list_parse_node(list, parser))
+				return false;
+			break;
+
+		default:
+			return false;
+		}
+		yaml_event_delete(&ev);
+	} while(1);
+}
+
+static bool list_parse_doc(struct list *list, yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+
+	do {
+		if (!yaml_parser_parse(parser, &ev))
+			return false;
+
+		switch (ev.type) {
+		case YAML_DOCUMENT_END_EVENT:
+			return true;
+
+		case YAML_MAPPING_START_EVENT:
+			if (!list_parse_node(list, parser))
+				return false;
+			break;
+
+		case YAML_SEQUENCE_START_EVENT:
+			if (!list_parse_sequence(list, parser))
+				return false;
+			break;
+
+		default:
+			return false;
+		}
+		yaml_event_delete(&ev);
+	} while(1);
+}
+
+static bool list_show(struct list *list, FILE *file)
+{
+	yaml_parser_t parser;
+	bool done = false;
+	yaml_event_t ev;
+
+	yaml_parser_initialize(&parser);
+	yaml_parser_set_input_file(&parser, file);
+
+	memset(&ev, 0, sizeof(ev));
+	yaml_parser_parse(&parser, &ev);
+	if (ev.type != YAML_STREAM_START_EVENT) {
+		fprintf(stderr, "Parser setup failed\n");
+		return false;
+	}
+	yaml_event_delete(&ev);
+
+	do {
+		if (!yaml_parser_parse(&parser, &ev)) {
+			yaml_print_parser_error(&parser, stderr);
+			return false;
+		}
+
+		switch (ev.type) {
+		case YAML_DOCUMENT_START_EVENT:
+			break;
+
+		case YAML_STREAM_END_EVENT:
+			done = true;
+			break;
+
+		default:
+			return false;
+		}
+		yaml_event_delete(&ev);
+
+		list_parse_doc(list, &parser);
+	} while (!done);
+
+	yaml_parser_delete(&parser);
+	return true;
+}
+
+static void list(int argc, char **argv)
+{
+	static const struct option longopts[] = {
+		{"output", required_argument, 0, 'o'},
+		{"total", required_argument, 0, 't'},
+		{"single", required_argument, 0, 's'},
+		{ NULL, 0, NULL, 0 }
+	};
+	struct list list;
+	int i;
+
+	memset(&list, 0, sizeof(list));
+	list.out = stdout;
+
+	while ((i = getopt_long(argc, argv,
+				"+o:t:s:",
+				longopts, NULL)) != -1) {
+		switch (i) {
+		case 'o':
+			if (strcmp(optarg, "-"))
+				list.out = fopen(optarg, "a");
+			if (!list.out) {
+				fprintf(stderr,
+					"Unable to open output file '%s'\n",
+					optarg);
+				exit(1);
+			}
+			break;
+		case 's':
+			list.limit.single = atof(optarg);
+			break;
+		case 't':
+			list.limit.total = atof(optarg);
+			break;
+		}
+	}
+	argc -= optind;
+	argv += optind;
+
+	if (argc == 0) {
+		list_show(&list, stdin);
+	} else {
+		for (i = 0; i < argc; i++) {
+			FILE *in;
+
+			in = fopen(argv[i], "r");
+			if (!in) {
+				fprintf(stderr, "unable to open input '%s'\n",
+					argv[i]);
+				exit(1);
+			}
+
+			if (!list_show(&list, in))
+				i = argc;
+
+			fclose(in);
+		}
+	}
+}
+
+int main(int argc, char **argv)
+{
+	static const struct option longopts[] = {
+		{"verbose", no_argument, 0, 'v'},
+		{ NULL, 0, NULL, 0 }
+	};
+	int o;
+
+	while ((o = getopt_long(argc, argv,
+				"+v",
+				longopts, NULL)) != -1) {
+		switch (o) {
+		case 'v':
+			break;
+		default:
+			break;
+		}
+	}
+
+	if (optind == argc) {
+		fprintf(stderr, "no subcommand specified\n");
+		exit(1);
+	}
+
+	argc -= optind;
+	argv += optind;
+	optind = 1;
+
+	if (!strcmp(argv[0], "edges")) {
+		edges(argc, argv);
+	} else if (!strcmp(argv[0], "sort")) {
+		sort(argc, argv);
+	} else if (!strcmp(argv[0], "list")) {
+		list(argc, argv);
+	} else {
+		fprintf(stderr, "Unknown command '%s'\n", argv[0]);
+		exit(1);
+	}
+
+	return 0;
+}
diff --git a/tools/igt_kcov_edges.c b/tools/igt_kcov_edges.c
new file mode 100644
index 000000000..750d4f8b5
--- /dev/null
+++ b/tools/igt_kcov_edges.c
@@ -0,0 +1,145 @@
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+
+#include <dlfcn.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <pthread.h>
+
+#include <linux/types.h>
+
+static int (*libc_ioctl)(int fd, unsigned long request, void *argp);
+
+static struct kcov {
+	unsigned long *trace;
+	uint32_t *table;
+	int fd;
+} kcov;
+
+#define KCOV_INIT_TRACE                 _IOR('c', 1, unsigned long)
+#define KCOV_ENABLE                     _IO('c', 100)
+#define   KCOV_TRACE_PC  0
+#define   KCOV_TRACE_CMP 1
+#define KCOV_DISABLE                    _IO('c', 101)
+
+#define DRM_IOCTL_BASE 'd'
+
+#define GOLDEN_RATIO_32 0x61C88647
+#define GOLDEN_RATIO_64 0x61C8864680B583EBull
+
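+/*
+ * Fibonacci hashing, as in the kernel's <linux/hash.h>: multiply by
+ * 2^bits divided by the golden ratio and keep the top bits.
+ */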
+static inline uint32_t hash_32(uint32_t val, unsigned int bits)
+{
+        return val * GOLDEN_RATIO_32 >> (32 - bits);
+}
+
+static inline uint32_t hash_64(uint64_t val, unsigned int bits)
+{
+        return val * GOLDEN_RATIO_64 >> (64 - bits);
+}
+
+#define hash_long(x, y) hash_64(x, y)
+
+static bool kcov_open(struct kcov *kc, unsigned long count)
+{
+	const char *env;
+	int shm;
+
+	env = getenv("IGT_SHM_FD");
+	if (!env)
+		return false;
+
+	shm = atoi(env);
+	kc->table = mmap(NULL, sizeof(uint32_t) << 16,
+			 PROT_WRITE, MAP_SHARED, shm, 0);
+	close(shm);
+	if (kc->table == (uint32_t *)MAP_FAILED)
+		return false;
+
+	env = getenv("IGT_KCOV_FD");
+	if (!env)
+		goto err_shm;
+
+	kc->fd = atoi(env);
+	if (libc_ioctl(kc->fd, KCOV_INIT_TRACE, (void *)count))
+		goto err_close;
+
+	kc->trace = mmap(NULL, count * sizeof(unsigned long),
+			 PROT_WRITE, MAP_SHARED, kc->fd, 0);
+	if (kc->trace == MAP_FAILED)
+		goto err_close;
+
+	return true;
+
+err_close:
+	close(kc->fd);
+err_shm:
+	munmap(kc->table, sizeof(uint32_t) << 16);
+	return false;
+}
+
+static void kcov_enable(struct kcov *kc)
+{
+	libc_ioctl(kc->fd, KCOV_ENABLE, (void *)KCOV_TRACE_PC);
+	__atomic_store_n(&kc->trace[0], 0, __ATOMIC_RELAXED);
+}
+
+static unsigned long kcov_disable(struct kcov *kc)
+{
+	unsigned long depth;
+
+	depth = __atomic_load_n(&kc->trace[0], __ATOMIC_RELAXED);
+	if (libc_ioctl(kc->fd, KCOV_DISABLE, 0))
+		depth = 0;
+
+	return depth;
+}
+
+int ioctl(int fd, unsigned long request, ...)
+{
+	static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
+	unsigned long count, n, prev;
+	va_list args;
+	void *argp;
+	int res;
+
+	va_start(args, request);
+	argp = va_arg(args, void *);
+	va_end(args);
+
+	if (kcov.fd < 0 || _IOC_TYPE(request) != DRM_IOCTL_BASE)
+		return libc_ioctl(fd, request, argp);
+
+	pthread_mutex_lock(&mutex);
+	kcov_enable(&kcov);
+
+	res = libc_ioctl(fd, request, argp);
+
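+	/*
+	 * trace[0] holds the number of PCs recorded; trace[1..count] are
+	 * the PCs themselves. Hash consecutive pairs into a 16-bit edge
+	 * id (AFL-style, with prev shifted so that A->B and B->A differ)
+	 * and bump its hit count in the shared table.
+	 */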
+	count = kcov_disable(&kcov);
+	prev = hash_long(kcov.trace[1], 16);
+	for (n = 2; n <= count; n++) {
+		unsigned long loc = hash_long(kcov.trace[n], 16);
+
+		kcov.table[prev ^ loc]++;
+		prev = loc >> 1;
+	}
+	kcov.table[0] |= 1;
+	pthread_mutex_unlock(&mutex);
+
+	return res;
+}
+
+__attribute__((constructor))
+static void init(void)
+{
+	libc_ioctl = dlsym(RTLD_NEXT, "ioctl");
+
+	if (!kcov_open(&kcov, 64 << 10))
+		kcov.fd = -1;
+}
diff --git a/tools/meson.build b/tools/meson.build
index 8ed1ccf48..d0d9c9c91 100644
--- a/tools/meson.build
+++ b/tools/meson.build
@@ -104,6 +104,19 @@ executable('intel_reg', sources : intel_reg_src,
 	     '-DIGT_DATADIR="@0@"'.format(join_paths(prefix, datadir)),
 	   ])
 
+if yaml.found()
+	executable('igt_kcov',
+		   sources: [ 'igt_kcov.c' ],
+		   dependencies: [ zlib, realtime, math, yaml ],
+		   install: true) # setuid me!
+
+	shared_module('igt_kcov_edges',
+		      sources: [ 'igt_kcov_edges.c' ],
+		      name_prefix: '',
+		      dependencies: [ dlsym ],
+		      install: true) # -version, -Dlibdir
+endif
+
 install_data('intel_gpu_abrt', install_dir : bindir)
 
 install_subdir('registers', install_dir : datadir,
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [igt-dev] [PATCH i-g-t 16/17] tools: capture execution pathways
@ 2018-07-02  9:07   ` Chris Wilson
  0 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

igt_kcov edges -o edges.yaml command-to-trace args
...
igt_kcov sort -o sorted.yaml edges.yaml
igt_kcov list --total=30 --single=0.5 sorted.yaml

Requires CONFIG_KCOV

Use LD_PRELOAD to wrap calls to DRM ioctls and capture the execution
trace of all basic blocks invoked directly from the syscall. (Key
limitation!) From the chain of basic blocks, the "edges" are computed
and the number of times each edge is hit is stored in a vector (modulo a
large hash). The goal then is to select a series of tests that execute
the most unique pathways in the sortest amount of time. To do this we
compare the similarity of the coverage vectors between all commands, and
try to preferentially pick those that are dissimilar from the rest.

* Direct execution pathways from syscall is a huge limitation for
evaluating general testing of driver pathways, but is not so huge when
myopically focusing on the ABI as say used by mesa.

** Caveat lector. Hastily thrown together. (Still)
---
 meson.build            |    1 +
 tools/Makefile.am      |   11 +-
 tools/igt_kcov.c       | 1218 ++++++++++++++++++++++++++++++++++++++++
 tools/igt_kcov_edges.c |  145 +++++
 tools/meson.build      |   13 +
 5 files changed, 1387 insertions(+), 1 deletion(-)
 create mode 100644 tools/igt_kcov.c
 create mode 100644 tools/igt_kcov_edges.c

diff --git a/meson.build b/meson.build
index 638c01066..213745954 100644
--- a/meson.build
+++ b/meson.build
@@ -173,6 +173,7 @@ math = cc.find_library('m')
 realtime = cc.find_library('rt')
 dlsym = cc.find_library('dl')
 zlib = cc.find_library('z')
+yaml = cc.find_library('yaml', required: false)
 
 if cc.has_header('linux/kd.h')
 	config.set('HAVE_LINUX_KD_H', 1)
diff --git a/tools/Makefile.am b/tools/Makefile.am
index a0b016ddd..ae70f5a76 100644
--- a/tools/Makefile.am
+++ b/tools/Makefile.am
@@ -1,6 +1,7 @@
 include Makefile.sources
 
-bin_PROGRAMS = $(tools_prog_lists)
+bin_PROGRAMS = $(tools_prog_lists) igt_kcov
+lib_LTLIBRARIES = igt_kcov_edges.la
 
 if HAVE_LIBDRM_INTEL
 bin_PROGRAMS += $(LIBDRM_INTEL_BIN)
@@ -30,6 +31,14 @@ intel_aubdump_la_LIBADD = $(top_builddir)/lib/libintel_tools.la -ldl
 
 intel_gpu_top_LDADD = $(top_builddir)/lib/libigt_perf.la
 
+# XXX FIXME
+igt_kcov_SOURCES = igt_kcov.c
+igt_kcov_LDADD = -lyaml -lz -lrt -lm
+
+igt_kcov_edges_la_SOURCES = igt_kcov_edges.c
+igt_kcov_edges_la_LDFLAGS = -module -no-undefined -avoid-version
+igt_kcov_edges_la_LIBADD = -ldl
+
 bin_SCRIPTS = intel_aubdump
 CLEANFILES = $(bin_SCRIPTS)
 
diff --git a/tools/igt_kcov.c b/tools/igt_kcov.c
new file mode 100644
index 000000000..e9179a4b3
--- /dev/null
+++ b/tools/igt_kcov.c
@@ -0,0 +1,1218 @@
+#include <sys/types.h>
+#include <sys/file.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/wait.h>
+
+#include <ctype.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <getopt.h>
+#include <inttypes.h>
+#include <math.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <unistd.h>
+
+#include <yaml.h>
+#include <zlib.h>
+
+#define SZ_8 (1 << 16)
+#define SZ_32 (sizeof(uint32_t) * SZ_8)
+
+static pid_t child = 0;
+
+static void sighandler(int sig)
+{
+	kill(sig, child);
+}
+
+static uint64_t gettime(void)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_MONOTONIC, &ts);
+
+	return ts.tv_sec * 1000000000 + ts.tv_nsec;
+}
+
+struct exec {
+	const char *preload;
+	int uid;
+	int gid;
+};
+
+static int edges_exec(const struct exec *args, char **argv, uint64_t *elapsed)
+{
+	char buf[10];
+	int status;
+	int kcov;
+	int shm;
+	int ret;
+
+	kcov = open("/sys/kernel/debug/kcov", O_RDWR);
+	if (kcov < 0)
+		return -errno;
+
+	shm = memfd_create("igt_kcov", 0);
+	if (shm == -1) {
+		close(kcov);
+		return -errno;
+	}
+
+	ftruncate(shm, SZ_32);
+
+	switch ((child = fork())) {
+	case 0: /* child */
+		if (args->gid)
+			setgid(args->gid);
+		if (args->uid)
+			setuid(args->uid);
+
+		sprintf(buf, "%d", shm);
+		setenv("IGT_SHM_FD", buf, 1);
+
+		sprintf(buf, "%d", kcov);
+		setenv("IGT_KCOV_FD", buf, 1);
+
+		setenv("LD_PRELOAD", args->preload, 1);
+		exit(execvp(argv[0], argv));
+		break;
+
+	case -1:
+		ret = -errno;
+		break;
+
+	default:
+		signal(SIGINT, sighandler);
+
+		*elapsed = -gettime();
+		do {
+			ret = waitpid(child, &status, 0);
+			if (ret == -1)
+				ret = -errno;
+		} while (ret == -EINTR);
+		*elapsed += gettime();
+
+		signal(SIGINT, SIG_DFL);
+		child = 0;
+	}
+
+	close(kcov);
+	if (ret < 0) {
+		close(shm);
+		shm = ret;
+	}
+
+	return shm;
+}
+
+static inline unsigned long __fls(unsigned long word)
+{
+#if defined(__GNUC__) && (defined(__i386__) || defined(__x86__) || defined(__x86_64__))
+	asm("bsr %1,%0"
+	    : "=r" (word)
+	    : "rm" (word));
+	return word;
+#else
+	unsigned int v = 0;
+
+	while (word >>= 1)
+		v++;
+
+	return v;
+#endif
+}
+
+static uint8_t lower_u32(uint32_t x)
+{
+	if (x < 16)
+		return x;
+	else
+		return (x >> (__fls(x) - 2)) + ((__fls(x) - 1) << 2);
+}
+
+static uint8_t *lower_edges(int shm)
+{
+	uint8_t *lower;
+	uint32_t *tbl;
+	unsigned int n;
+
+	tbl = mmap(NULL, SZ_32, PROT_READ, MAP_SHARED, shm, 0);
+	if (tbl == MAP_FAILED)
+		return NULL;
+
+	if (tbl[0] == 0) /* empty */
+		goto out;
+
+	lower = malloc(SZ_8);
+	if (!lower)
+		goto out;
+
+	for (n = 0; n < 1 << 16; n++)
+		lower[n] = lower_u32(tbl[n]);
+
+out:
+	munmap(tbl, SZ_32);
+	return lower;
+}
+
+static bool ascii85_encode(uint32_t in, char *out)
+{
+	int i;
+
+	if (in == 0)
+		return false;
+
+	for (i = 5; i--; ) {
+		out[i] = '!' + in % 85;
+		in /= 85;
+	}
+
+	return true;
+}
+
+static char *edges_to_ascii85(uint8_t *tbl)
+{
+	z_stream zs;
+	uint32_t *p;
+	void *out;
+	char *str, *s;
+	int sz;
+
+	sz = SZ_8 * 3 /2;
+	out = malloc(sz);
+	if (!out)
+		return NULL;
+
+	memset(&zs, 0, sizeof(zs));
+	if (deflateInit(&zs, 9)) {
+		free(out);
+		return NULL;
+	}
+
+	zs.next_in = tbl;
+	zs.avail_in = SZ_8;
+	zs.total_in = 0;
+	zs.avail_out = sz;
+	zs.total_out = 0;
+	zs.next_out = out;
+
+	deflate(&zs, Z_FINISH);
+	deflateEnd(&zs);
+
+	if (zs.total_out & 3)
+		memset((char *)out + zs.total_out, 0, 4 - (zs.total_out & 3));
+	zs.total_out = (zs.total_out + 3) / 4;
+
+	str = malloc(zs.total_out * 5 + 1);
+	if (!str) {
+		free(out);
+		return NULL;
+	}
+
+	p = out;
+	s = str;
+	for (int i = 0; i < zs.total_out; i++) {
+		if (ascii85_encode(*p++, s))
+			s += 5;
+		else
+			*s++ = 'z';
+	}
+	*s++ = '\0';
+	free(out);
+
+	return str;
+}
+
+static void edges(int argc, char **argv)
+{
+	static const struct option longopts[] = {
+		{"output", required_argument, 0, 'o'},
+		{"preload", required_argument, 0, 'p'},
+		{"user", required_argument, 0, 'u'},
+		{"group", required_argument, 0, 'g'},
+		{ NULL, 0, NULL, 0 }
+	};
+	struct exec args = {
+		.preload = "/tmp/igt_kcov_edges.so",
+	};
+	FILE *out = stdout;
+	uint64_t elapsed;
+	uint8_t *tbl;
+	char *str, *s;
+	int shm, i;
+
+	while ((i = getopt_long(argc, argv,
+				"+o:g:u:",
+				longopts, NULL)) != -1) {
+		switch (i) {
+		case 'o':
+			if (strcmp(optarg, "-"))
+				out = fopen(optarg, "a");
+			if (!out) {
+				fprintf(stderr,
+					"Unable to open output file '%s'\n",
+					optarg);
+				exit(1);
+			}
+			break;
+		case 'p':
+			args.preload = optarg;
+			break;
+		case 'u':
+			args.uid = atoi(optarg);
+			break;
+		case 'g':
+			args.gid = atoi(optarg);
+			break;
+		}
+	}
+	argc -= optind;
+	argv += optind;
+
+	if (argc < 1) {
+		fprintf(stderr,
+			"usage: igt_kcov edges [options] program-to-trace args...\n");
+		exit(1);
+	}
+
+	shm = edges_exec(&args, argv, &elapsed);
+	if (shm < 0) {
+		fprintf(stderr,
+			"Execution of %s failed: err=%d\n",
+			argv[0], shm);
+		exit(1);
+	}
+
+	tbl = lower_edges(shm);
+	close(shm);
+	if (!tbl)
+		exit(1);
+
+	str = edges_to_ascii85(tbl);
+
+	flock(fileno(out), LOCK_EX);
+	fprintf(out, "---\n");
+
+	fprintf(out, "cmd: |  \n");
+	for (i = 0; i < argc; i++) {
+		fprintf(out, " '");
+		for (s = argv[i]; *s; s++) {
+			if (*s == '\'')
+				fprintf(out, "\\\'");
+			else
+				fprintf(out, "%c", *s);
+		}
+		fprintf(out, "'");
+	}
+	fprintf(out, "\n");
+
+	fprintf(out, "elapsed: %"PRIu64" # %.1fms\n", elapsed, 1e-6 * elapsed);
+
+	fprintf(out, "edges: !!ascii85.gz |\n");
+	i = strlen(str);
+	s = str;
+	while (i) {
+		int len = i > 70 ? 70 : i;
+		char tmp;
+
+		tmp = s[len];
+		s[len] = '\0';
+		fprintf(out, "  %s\n", s);
+		s[len] = tmp;
+
+		s += len;
+		i -= len;
+	}
+
+	fflush(out);
+	flock(fileno(out), LOCK_UN);
+
+	free(str);
+}
+
+static unsigned long zlib_inflate(void *in, unsigned long len,
+				  void *ptr, unsigned long max)
+{
+	struct z_stream_s zstream;
+
+	memset(&zstream, 0, sizeof(zstream));
+
+	zstream.next_in = in;
+	zstream.avail_in = len;
+
+	if (inflateInit(&zstream) != Z_OK)
+		return 0;
+
+	zstream.next_out = ptr;
+	zstream.avail_out = max;
+
+	switch (inflate(&zstream, Z_SYNC_FLUSH)) {
+	case Z_STREAM_END:
+	case Z_OK:
+		break;
+	default:
+		zstream.total_out = 0;
+		break;
+	}
+
+	inflateEnd(&zstream);
+	return zstream.total_out;
+}
+
+static unsigned long ascii85_decode(const char *in,
+				    void *ptr, unsigned long max)
+{
+	unsigned long sz = max / sizeof(uint32_t);
+	unsigned long len = 0;
+	uint32_t *out;
+
+	out = malloc(sz * sizeof(uint32_t));
+	if (out == NULL)
+		return 0;
+
+	while (*in) {
+		uint32_t v = 0;
+
+		if (isspace(*in)) {
+			in++;
+			continue;
+		}
+
+		if (*in < '!' || *in > 'z') {
+			fprintf(stderr, "Invalid value in ascii85 block\n");
+			free(out);
+			return 0;
+		}
+
+		if (len == sz) {
+			sz *= 2;
+			out = realloc(out, sz * sizeof(uint32_t));
+			if (out == NULL)
+				return 0;
+		}
+
+		if (*in == 'z') {
+			in++;
+		} else {
+			v += in[0] - 33; v *= 85;
+			v += in[1] - 33; v *= 85;
+			v += in[2] - 33; v *= 85;
+			v += in[3] - 33; v *= 85;
+			v += in[4] - 33;
+			in += 5;
+		}
+		out[len++] = v;
+	}
+
+	len = zlib_inflate(out, len * sizeof(*out), ptr, max);
+	free(out);
+
+	return len;
+}
+
+static void yaml_print_parser_error(yaml_parser_t *parser, FILE *stream)
+{
+	switch (parser->error) {
+	case YAML_MEMORY_ERROR:
+		fprintf(stderr, "Memory error: Not enough memory for parsing\n");
+		break;
+
+	case YAML_READER_ERROR:
+		if (parser->problem_value != -1) {
+			fprintf(stderr, "Reader error: %s: #%X at %zd\n", parser->problem,
+				parser->problem_value, parser->problem_offset);
+		} else {
+			fprintf(stderr, "Reader error: %s at %zd\n", parser->problem,
+				parser->problem_offset);
+		}
+		break;
+
+	case YAML_SCANNER_ERROR:
+		if (parser->context) {
+			fprintf(stderr, "Scanner error: %s at line %lu, column %lu\n"
+				"%s at line %lu, column %lu\n", parser->context,
+				parser->context_mark.line+1, parser->context_mark.column+1,
+				parser->problem, parser->problem_mark.line+1,
+				parser->problem_mark.column+1);
+		} else {
+			fprintf(stderr, "Scanner error: %s at line %lu, column %lu\n",
+				parser->problem, parser->problem_mark.line+1,
+				parser->problem_mark.column+1);
+		}
+		break;
+
+	case YAML_PARSER_ERROR:
+		if (parser->context) {
+			fprintf(stderr, "Parser error: %s at line %lu, column %lu\n"
+				"%s at line %lu, column %lu\n", parser->context,
+				parser->context_mark.line+1, parser->context_mark.column+1,
+				parser->problem, parser->problem_mark.line+1,
+				parser->problem_mark.column+1);
+		} else {
+			fprintf(stderr, "Parser error: %s at line %lu, column %lu\n",
+				parser->problem, parser->problem_mark.line+1,
+				parser->problem_mark.column+1);
+		}
+		break;
+
+	default:
+		/* Couldn't happen. */
+		fprintf(stderr, "Internal error\n");
+		break;
+	}
+}
+
+struct edges {
+	struct edges *next;
+
+	char *command;
+	uint64_t elapsed;
+
+	unsigned int weight;
+
+	uint8_t tbl[SZ_8];
+};
+
+struct sort {
+	struct edges *edges;
+	unsigned int count;
+
+	unsigned int max_weight;
+	struct edges *best;
+};
+
+static bool sort_parse_command(struct sort *sort,
+			       struct edges *e,
+			       yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+	const char *s;
+	int len;
+
+	if (!yaml_parser_parse(parser, &ev))
+		return false;
+
+	switch (ev.type) {
+	case YAML_SCALAR_EVENT:
+		break;
+
+	default:
+		return false;
+	}
+
+
+	s = (const char *)ev.data.scalar.value;
+	len = strlen(s);
+	while (s[len - 1] == '\n')
+		len--;
+	e->command = malloc(len + 1);
+	if (e->command) {
+		memcpy(e->command, s, len);
+		e->command[len] = '\0';
+	}
+	yaml_event_delete(&ev);
+
+	return true;
+}
+
+static bool sort_parse_elapsed(struct sort *sort,
+			       struct edges *e,
+			       yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+	const char *s;
+
+	if (!yaml_parser_parse(parser, &ev))
+		return false;
+
+	switch (ev.type) {
+	case YAML_SCALAR_EVENT:
+		break;
+
+	default:
+		return false;
+	}
+
+	s = (const char *)ev.data.scalar.value;
+	e->elapsed = strtoull(s, NULL, 0);
+	yaml_event_delete(&ev);
+
+	return true;
+}
+
+static unsigned int bitmap_weight(const void *bitmap, unsigned int bits)
+{
+	const uint32_t *b = bitmap;
+	unsigned int k, lim = bits / 32;
+	unsigned int w = 0;
+
+	for (k = 0; k < lim; k++)
+		w += __builtin_popcount(b[k]);
+
+	if (bits % 32)
+		w += __builtin_popcount(b[k] << (32 - bits % 32));
+
+	return w;
+}
+
+static bool sort_parse_edges(struct sort *sort,
+			     struct edges *e,
+			     yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+	const char *s;
+
+	if (!yaml_parser_parse(parser, &ev))
+		return false;
+
+	switch (ev.type) {
+	case YAML_SCALAR_EVENT:
+		break;
+
+	default:
+		return false;
+	}
+
+	s = (const char *)ev.data.scalar.value;
+	if (ascii85_decode(s, e->tbl, sizeof(e->tbl)))
+		e->weight = bitmap_weight(e->tbl, sizeof(e->tbl) * 8);
+	yaml_event_delete(&ev);
+
+	return true;
+}
+
+static bool edges_valid(const struct edges *e)
+{
+	if (!e->command)
+		return false;
+
+	if (!e->weight)
+		return false;
+
+	if (!e->elapsed)
+		return false;
+
+	return true; /* good enough at least */
+}
+
+static bool sort_add_edges(struct sort *sort, struct edges *e)
+{
+	if (!edges_valid(e))
+		return false;
+
+	e->next = sort->edges;
+	sort->edges = e;
+
+	if (e->weight > sort->max_weight) {
+		sort->max_weight = e->weight;
+		sort->best = e;
+	}
+
+	sort->count++;
+
+	return true;
+}
+
+static bool sort_parse_node(struct sort *sort, yaml_parser_t *parser)
+{
+	struct edges *e;
+	yaml_event_t ev;
+	char *s;
+
+	e = malloc(sizeof(*e));
+	if (!e)
+		return false;
+
+	e->weight = 0;
+
+	do {
+		if (!yaml_parser_parse(parser, &ev))
+			goto err;
+
+		switch (ev.type) {
+		case YAML_MAPPING_END_EVENT:
+			if (!sort_add_edges(sort, e))
+				goto err;
+
+			return true;
+
+		case YAML_SCALAR_EVENT:
+			break;
+
+		default:
+			goto err;
+		}
+
+		s = (char *)ev.data.scalar.value;
+		if (!strcmp(s, "cmd")) {
+			sort_parse_command(sort, e, parser);
+		} else if (!strcmp(s, "elapsed")) {
+			sort_parse_elapsed(sort, e, parser);
+		} else if (!strcmp(s, "edges")) {
+			sort_parse_edges(sort, e, parser);
+		} else {
+			fprintf(stderr,
+				"Unknown element in edges file: %s\n",
+				s);
+		}
+
+		yaml_event_delete(&ev);
+	} while (1);
+
+err:
+	free(e);
+	return false;
+}
+
+static bool sort_parse_doc(struct sort *sort, yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+
+	do {
+		if (!yaml_parser_parse(parser, &ev))
+			return false;
+
+		switch (ev.type) {
+		case YAML_DOCUMENT_END_EVENT:
+			return true;
+
+		case YAML_MAPPING_START_EVENT:
+			break;
+
+		default:
+			return false;
+		}
+		yaml_event_delete(&ev);
+
+		if (!sort_parse_node(sort, parser))
+			return false;
+	} while(1);
+}
+
+static bool sort_add(struct sort *sort, FILE *file)
+{
+	yaml_parser_t parser;
+	bool done = false;
+	yaml_event_t ev;
+
+	yaml_parser_initialize(&parser);
+	yaml_parser_set_input_file(&parser, file);
+
+	memset(&ev, 0, sizeof(ev));
+	yaml_parser_parse(&parser, &ev);
+	if (ev.type != YAML_STREAM_START_EVENT) {
+		fprintf(stderr, "Parser setup failed\n");
+		return false;
+	}
+	yaml_event_delete(&ev);
+
+	do {
+		if (!yaml_parser_parse(&parser, &ev)) {
+			yaml_print_parser_error(&parser, stderr);
+			return false;
+		}
+
+		switch (ev.type) {
+		case YAML_DOCUMENT_START_EVENT:
+			break;
+
+		case YAML_STREAM_END_EVENT:
+			done = true;
+			break;
+
+		default:
+			return false;
+		}
+		yaml_event_delete(&ev);
+
+		sort_parse_doc(sort, &parser);
+	} while (!done);
+
+	yaml_parser_delete(&parser);
+	return true;
+}
+
+static double edges_similarity(const struct edges *ta, const struct edges *tb)
+{
+	uint64_t sab, saa, sbb;
+	const uint8_t *a, *b;
+
+	if (ta == tb)
+		return 1;
+
+	a = ta->tbl;
+	b = tb->tbl;
+	sab = 0;
+	saa = 0;
+	sbb = 0;
+
+	for (unsigned int i = 0; i < SZ_8; i++) {
+		sab += (uint32_t)a[i] * b[i];
+		saa += (uint32_t)a[i] * a[i];
+		sbb += (uint32_t)b[i] * b[i];
+	}
+
+	return ((long double)sab * sab) / ((long double)saa * sbb);
+}
+
+static void rank_by_dissimilarity(struct sort *sort, FILE *out)
+{
+	const unsigned int count = sort->count;
+	double *M, *dis, *sim;
+	struct edges **edges, **t;
+	bool *used;
+	int last = -1;
+
+	t = edges = malloc(count * sizeof(*edges));
+	for (struct edges *e = sort->edges; e; e = e->next)
+		*t++ = e;
+
+	M = malloc(sizeof(double) * count * (count + 2));
+	dis = M + count * count;
+	sim = dis + count;
+	for (int i = 0; i < count; i++) {
+		dis[i] = 0.;
+		sim[i] = 1.;
+		for (int j = 0; j < i; j++)
+			dis[i] += M[j*count + i];
+		for (int j = i + 1; j < count; j++) {
+			double d = 1. - edges_similarity(edges[i], edges[j]);
+			M[i*count + j] = d;
+			dis[i] += d;
+		}
+	}
+
+	fprintf(out, "---\n");
+
+	used = calloc(count, sizeof(bool));
+	for (int rank = 0; rank < count; rank++) {
+		struct edges *e;
+		double best = -HUGE_VAL;
+		int this = -1;
+
+		for (int i = 0; i < count; i++) {
+			double d;
+
+			if (used[i])
+				continue;
+
+			d = dis[i];
+			if (last != -1) {
+				double s;
+
+				if (last < i)
+					s = M[last * count + i];
+				else
+					s = M[i * count + last];
+
+				s *= sim[i];
+				sim[i] = s;
+
+				d *= sqrt(s);
+			}
+			if (d > best) {
+				best = d;
+				this = i;
+			}
+		}
+
+		if (this < 0)
+			break;
+
+		e = edges[this];
+		used[this] = true;
+		last = this;
+
+		fprintf(out, "- cmd: |\n    %s\n", e->command);
+		fprintf(out, "  elapsed: %"PRIu64" # %.1fms\n",
+			e->elapsed, 1e-6 * e->elapsed);
+		fprintf(out, "  dissimilarity: %.4g\n",
+			sqrt(best / (count - rank)));
+	}
+
+	free(M);
+	free(edges);
+}
+
+static void sort(int argc, char **argv)
+{
+	static const struct option longopts[] = {
+		{"output", required_argument, 0, 'o'},
+		{ NULL, 0, NULL, 0 }
+	};
+	FILE *out = stdout;
+	struct sort sort;
+	int i;
+
+	while ((i = getopt_long(argc, argv,
+				"+o:",
+				longopts, NULL)) != -1) {
+		switch (i) {
+		case 'o':
+			if (strcmp(optarg, "-"))
+				out = fopen(optarg, "a");
+			if (!out) {
+				fprintf(stderr,
+					"Unable to open output file '%s'\n",
+					optarg);
+				exit(1);
+			}
+			break;
+		}
+	}
+	argc -= optind;
+	argv += optind;
+
+	memset(&sort, 0, sizeof(sort));
+
+	if (argc == 0) {
+		sort_add(&sort, stdin);
+	} else {
+		for (i = 0; i < argc; i++) {
+			FILE *in;
+
+			in = fopen(argv[i], "r");
+			if (!in) {
+				fprintf(stderr, "unable to open input '%s'\n",
+					argv[i]);
+				exit(1);
+			}
+
+			sort_add(&sort, in);
+
+			fclose(in);
+		}
+	}
+
+	if (!sort.count)
+		return;
+
+	rank_by_dissimilarity(&sort, out);
+}
+
+struct list {
+	struct {
+		double single;
+		double total;
+	} limit;
+	double total;
+	FILE *out;
+};
+
+struct list_node {
+	char *cmd;
+	uint64_t elapsed;
+};
+
+static bool list_parse_command(struct list *list,
+			       struct list_node *node,
+			       yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+	const char *s;
+	int len;
+
+	if (!yaml_parser_parse(parser, &ev))
+		return false;
+
+	switch (ev.type) {
+	case YAML_SCALAR_EVENT:
+		break;
+
+	default:
+		return false;
+	}
+
+
+	s = (const char *)ev.data.scalar.value;
+	len = strlen(s);
+	while (len > 0 && isspace(s[len - 1]))
+		len--;
+
+	node->cmd = strndup(s, len);
+	yaml_event_delete(&ev);
+
+	return true;
+}
+
+static bool list_parse_elapsed(struct list *list,
+			       struct list_node *node,
+			       yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+	const char *s;
+
+	if (!yaml_parser_parse(parser, &ev))
+		return false;
+
+	switch (ev.type) {
+	case YAML_SCALAR_EVENT:
+		break;
+
+	default:
+		return false;
+	}
+
+	s = (const char *)ev.data.scalar.value;
+	node->elapsed = strtoull(s, NULL, 0);
+	yaml_event_delete(&ev);
+
+	return true;
+}
+
+static bool list_parse_node(struct list *list, yaml_parser_t *parser)
+{
+	struct list_node node = {};
+	yaml_event_t ev;
+	char *s;
+
+	do {
+		if (!yaml_parser_parse(parser, &ev))
+			return false;
+
+		switch (ev.type) {
+		case YAML_MAPPING_END_EVENT:
+			if (list->limit.single &&
+			    1e-9 * node.elapsed > list->limit.single) {
+				free(node.cmd);
+				return true;
+			}
+
+			if (list->limit.total &&
+			    list->total + 1e-9 * node.elapsed > list->limit.total) {
+				free(node.cmd);
+				return true;
+			}
+
+			if (node.cmd) {
+				list->total += 1e-9 * node.elapsed;
+				fprintf(list->out,
+					"%s # %.3fms, total %.1fms\n",
+					node.cmd,
+					1e-6 * node.elapsed,
+					1e3 * list->total);
+				free(node.cmd);
+			}
+			return true;
+
+		case YAML_SCALAR_EVENT:
+			break;
+
+		default:
+			return false;
+		}
+
+		s = (char *)ev.data.scalar.value;
+		if (!strcmp(s, "cmd")) {
+			if (!list_parse_command(list, &node, parser))
+				return false;
+		} else if (!strcmp(s, "elapsed")) {
+			if (!list_parse_elapsed(list, &node, parser))
+				return false;
+		}
+		yaml_event_delete(&ev);
+	} while (1);
+}
+
+static bool list_parse_sequence(struct list *list, yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+
+	do {
+		if (!yaml_parser_parse(parser, &ev))
+			return false;
+
+		switch (ev.type) {
+		case YAML_SEQUENCE_END_EVENT:
+			return true;
+
+		case YAML_MAPPING_START_EVENT:
+			if (!list_parse_node(list, parser))
+				return false;
+			break;
+
+		default:
+			return false;
+		}
+		yaml_event_delete(&ev);
+	} while (1);
+}
+
+static bool list_parse_doc(struct list *list, yaml_parser_t *parser)
+{
+	yaml_event_t ev;
+
+	do {
+		if (!yaml_parser_parse(parser, &ev))
+			return false;
+
+		switch (ev.type) {
+		case YAML_DOCUMENT_END_EVENT:
+			return true;
+
+		case YAML_MAPPING_START_EVENT:
+			if (!list_parse_node(list, parser))
+				return false;
+			break;
+
+		case YAML_SEQUENCE_START_EVENT:
+			if (!list_parse_sequence(list, parser))
+				return false;
+			break;
+
+		default:
+			return false;
+		}
+		yaml_event_delete(&ev);
+	} while (1);
+}
+
+static bool list_show(struct list *list, FILE *file)
+{
+	yaml_parser_t parser;
+	bool done = false;
+	yaml_event_t ev;
+
+	yaml_parser_initialize(&parser);
+	yaml_parser_set_input_file(&parser, file);
+
+	memset(&ev, 0, sizeof(ev));
+	yaml_parser_parse(&parser, &ev);
+	if (ev.type != YAML_STREAM_START_EVENT) {
+		fprintf(stderr, "Parser setup failed\n");
+		yaml_parser_delete(&parser);
+		return false;
+	}
+	yaml_event_delete(&ev);
+
+	do {
+		if (!yaml_parser_parse(&parser, &ev)) {
+			yaml_print_parser_error(&parser, stderr);
+			return false;
+		}
+
+		switch (ev.type) {
+		case YAML_DOCUMENT_START_EVENT:
+			break;
+
+		case YAML_STREAM_END_EVENT:
+			done = true;
+			break;
+
+		default:
+			return false;
+		}
+		yaml_event_delete(&ev);
+
+		if (!done)
+			list_parse_doc(list, &parser);
+	} while (!done);
+
+	yaml_parser_delete(&parser);
+	return true;
+}
+
+static void list(int argc, char **argv)
+{
+	static const struct option longopts[] = {
+		{"output", required_argument, 0, 'o'},
+		{"total", required_argument, 0, 't'},
+		{"single", required_argument, 0, 's'},
+		{ NULL, 0, NULL, 0 }
+	};
+	struct list list;
+	int i;
+
+	memset(&list, 0, sizeof(list));
+	list.out = stdout;
+
+	while ((i = getopt_long(argc, argv,
+				"+o:t:",
+				longopts, NULL)) != -1) {
+		switch (i) {
+		case 'o':
+			if (strcmp(optarg, "-"))
+				list.out = fopen(optarg, "a");
+			if (!list.out) {
+				fprintf(stderr,
+					"Unable to open output file '%s'\n",
+					optarg);
+				exit(1);
+			}
+			break;
+		case 's':
+			list.limit.single = atof(optarg);
+			break;
+		case 't':
+			list.limit.total = atof(optarg);
+			break;
+		}
+	}
+	argc -= optind;
+	argv += optind;
+
+	if (argc == 0) {
+		list_show(&list, stdin);
+	} else {
+		for (i = 0; i < argc; i++) {
+			FILE *in;
+
+			in = fopen(argv[i], "r");
+			if (!in) {
+				fprintf(stderr, "unable to open input '%s'\n",
+					argv[i]);
+				exit(1);
+			}
+
+			if (!list_show(&list, in))
+				i = argc;
+
+			fclose(in);
+		}
+	}
+}
+
+int main(int argc, char **argv)
+{
+	static const struct option longopts[] = {
+		{"verbose", no_argument, 0, 'v'},
+		{ NULL, 0, NULL, 0 }
+	};
+	int o;
+
+	while ((o = getopt_long(argc, argv,
+				"+v",
+				longopts, NULL)) != -1) {
+		switch (o) {
+		case 'v':
+			break;
+		default:
+			break;
+		}
+	}
+
+	if (optind == argc) {
+		fprintf(stderr, "no subcommand specified\n");
+		exit(1);
+	}
+
+	argc -= optind;
+	argv += optind;
+	optind = 1;
+
+	if (!strcmp(argv[0], "edges")) {
+		edges(argc, argv);
+	} else if (!strcmp(argv[0], "sort")) {
+		sort(argc, argv);
+	} else if (!strcmp(argv[0], "list")) {
+		list(argc, argv);
+	} else {
+		fprintf(stderr, "Unknown command '%s'\n", argv[0]);
+		exit(1);
+	}
+
+	return 0;
+}
diff --git a/tools/igt_kcov_edges.c b/tools/igt_kcov_edges.c
new file mode 100644
index 000000000..750d4f8b5
--- /dev/null
+++ b/tools/igt_kcov_edges.c
@@ -0,0 +1,145 @@
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+
+#include <dlfcn.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <pthread.h>
+
+#include <linux/types.h>
+
+static int (*libc_ioctl)(int fd, unsigned long request, void *argp);
+
+static struct kcov {
+	unsigned long *trace;
+	uint32_t *table;
+	int fd;
+} kcov;
+
+#define KCOV_INIT_TRACE                 _IOR('c', 1, unsigned long)
+#define KCOV_ENABLE                     _IO('c', 100)
+#define   KCOV_TRACE_PC  0
+#define   KCOV_TRACE_CMP 1
+#define KCOV_DISABLE                    _IO('c', 101)
+
+#define DRM_IOCTL_BASE 'd'
+
+#define GOLDEN_RATIO_32 0x61C88647
+#define GOLDEN_RATIO_64 0x61C8864680B583EBull
+
+static inline uint32_t hash_32(uint32_t val, unsigned int bits)
+{
+	return val * GOLDEN_RATIO_32 >> (32 - bits);
+}
+
+static inline uint32_t hash_64(uint64_t val, unsigned int bits)
+{
+	return val * GOLDEN_RATIO_64 >> (64 - bits);
+}
+
+#define hash_long(x, y) hash_64(x, y)
+
+static bool kcov_open(struct kcov *kc, unsigned long count)
+{
+	const char *env;
+	int shm;
+
+	env = getenv("IGT_SHM_FD");
+	if (!env)
+		return false;
+
+	shm = atoi(env);
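+	/* one 32b counter per possible 16b edge hash, see ioctl() below */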
+	kc->table = mmap(NULL, sizeof(uint32_t) << 16,
+			 PROT_WRITE, MAP_SHARED, shm, 0);
+	close(shm);
+	if (kc->table == (uint32_t *)MAP_FAILED)
+		return false;
+
+	env = getenv("IGT_KCOV_FD");
+	if (!env)
+		goto err_shm;
+
+	kc->fd = atoi(env);
+	if (libc_ioctl(kc->fd, KCOV_INIT_TRACE, (void *)count))
+		goto err_close;
+
+	kc->trace = mmap(NULL, count * sizeof(unsigned long),
+			 PROT_WRITE, MAP_SHARED, kc->fd, 0);
+	if (kc->trace == MAP_FAILED)
+		goto err_close;
+
+	return true;
+
+err_close:
+	close(kc->fd);
+err_shm:
+	munmap(kc->table, sizeof(uint32_t) << 16);
+	return false;
+}
+
+static void kcov_enable(struct kcov *kc)
+{
+	libc_ioctl(kc->fd, KCOV_ENABLE, KCOV_TRACE_PC);
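+	/* trace[0] holds the number of PCs recorded; reset it for this run */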
+	__atomic_store_n(&kc->trace[0], 0, __ATOMIC_RELAXED);
+}
+
+static unsigned long kcov_disable(struct kcov *kc)
+{
+	unsigned long depth;
+
+	depth = __atomic_load_n(&kc->trace[0], __ATOMIC_RELAXED);
+	if (libc_ioctl(kc->fd, KCOV_DISABLE, 0))
+		depth = 0;
+
+	return depth;
+}
+
+int ioctl(int fd, unsigned long request, ...)
+{
+	static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
+	unsigned long count, n, prev;
+	va_list args;
+	void *argp;
+	int res;
+
+	va_start(args, request);
+	argp = va_arg(args, void *);
+	va_end(args);
+
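+	/* pass straight through unless kcov is set up and this is a DRM ioctl */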
+	if (kcov.fd < 0 || _IOC_TYPE(request) != DRM_IOCTL_BASE)
+		return libc_ioctl(fd, request, argp);
+
+	pthread_mutex_lock(&mutex);
+	kcov_enable(&kcov);
+
+	res = libc_ioctl(fd, request, argp);
+
+	count = kcov_disable(&kcov);
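+
+	/*
+	 * AFL-style edge accounting: hash each recorded PC into a 16b
+	 * bucket and count the transitions between consecutive buckets.
+	 */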
+	prev = hash_long(kcov.trace[1], 16);
+	for (n = 2; n <= count; n++) {
+		unsigned long loc = hash_long(kcov.trace[n], 16);
+
+		kcov.table[prev ^ loc]++;
+		prev = loc >> 1;
+	}
+	kcov.table[0] |= 1;
+	pthread_mutex_unlock(&mutex);
+
+	return res;
+}
+
+__attribute__((constructor))
+static void init(void)
+{
+	libc_ioctl = dlsym(RTLD_NEXT, "ioctl");
+
+	if (!kcov_open(&kcov, 64 << 10))
+		kcov.fd = -1;
+}
diff --git a/tools/meson.build b/tools/meson.build
index 8ed1ccf48..d0d9c9c91 100644
--- a/tools/meson.build
+++ b/tools/meson.build
@@ -104,6 +104,19 @@ executable('intel_reg', sources : intel_reg_src,
 	     '-DIGT_DATADIR="@0@"'.format(join_paths(prefix, datadir)),
 	   ])
 
+if yaml.found()
+	executable('igt_kcov',
+		   sources: [ 'igt_kcov.c' ],
+		   dependencies: [ zlib, realtime, math, yaml ],
+		   install: true) # setuid me!
+
+	shared_module('igt_kcov_edges',
+		      sources: [ 'igt_kcov_edges.c' ],
+		      name_prefix: '',
+		      dependencies: [ dlsym ],
+		      install: true) # -version, -Dlibdir
+endif
+
 install_data('intel_gpu_abrt', install_dir : bindir)
 
 install_subdir('registers', install_dir : datadir,
-- 
2.18.0

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH i-g-t 17/17] igt/gem_exec_latency: Robustify measurements
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02  9:07   ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02  9:07 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

Repeat the latency measurements and present the median over many
repeats so that the results are more reliable.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/gem_exec_latency.c | 165 ++++++++++++++++++---------------------
 1 file changed, 75 insertions(+), 90 deletions(-)

diff --git a/tests/gem_exec_latency.c b/tests/gem_exec_latency.c
index de16322a6..c2fdf7ee2 100644
--- a/tests/gem_exec_latency.c
+++ b/tests/gem_exec_latency.c
@@ -258,13 +258,12 @@ static void latency_from_ring(int fd,
 	const int gen = intel_gen(intel_get_drm_devid(fd));
 	const int has_64bit_reloc = gen >= 8;
 	struct drm_i915_gem_exec_object2 obj[3];
-	struct drm_i915_gem_relocation_entry reloc;
 	struct drm_i915_gem_execbuffer2 execbuf;
-	const unsigned int repeats = ring_size / 2;
+	uint64_t presumed_offset;
 	unsigned int other;
 	uint32_t *map, *results;
 	uint32_t ctx[2] = {};
-	int i, j;
+	int j;
 
 	if (flags & PREEMPT) {
 		ctx[0] = gem_context_create(fd);
@@ -294,102 +293,83 @@ static void latency_from_ring(int fd,
 	map[0] = MI_BATCH_BUFFER_END;
 	gem_execbuf(fd, &execbuf);
 
-	memset(&reloc,0, sizeof(reloc));
-	obj[2].relocation_count = 1;
-	obj[2].relocs_ptr = to_user_pointer(&reloc);
-
 	gem_set_domain(fd, obj[2].handle,
 		       I915_GEM_DOMAIN_GTT,
 		       I915_GEM_DOMAIN_GTT);
+	presumed_offset = obj[1].offset;
+	for (j = 0; j < 1024; j++) {
+		uint64_t offset = sizeof(uint32_t) * j + presumed_offset;
+		int i = 16 * j;
 
-	reloc.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-	reloc.write_domain = I915_GEM_DOMAIN_INSTRUCTION;
-	reloc.presumed_offset = obj[1].offset;
-	reloc.target_handle = flags & CORK ? 1 : 0;
+		/* MI_STORE_REG_MEM */
+		map[i++] = 0x24 << 23 | 1;
+		if (has_64bit_reloc)
+			map[i-1]++;
+		map[i++] = RCS_TIMESTAMP; /* ring local! */
+		map[i++] = offset;
+		if (has_64bit_reloc)
+			map[i++] = offset >> 32;
+
+		map[i++] = MI_BATCH_BUFFER_END;
+	}
+	igt_assert(ring_size <= 1024);
 
 	for_each_physical_engine(fd, other) {
-		igt_spin_t *spin = NULL;
-		IGT_CORK_HANDLE(c);
-
-		gem_set_domain(fd, obj[2].handle,
-			       I915_GEM_DOMAIN_GTT,
-			       I915_GEM_DOMAIN_GTT);
-
-		if (flags & PREEMPT)
-			spin = __igt_spin_batch_new(fd,
-						    .ctx = ctx[0],
-						    .engine = ring);
-
-		if (flags & CORK) {
-			obj[0].handle = igt_cork_plug(&c, fd);
-			execbuf.buffers_ptr = to_user_pointer(&obj[0]);
-			execbuf.buffer_count = 3;
-		}
+		if (flags & PREEMPT && other == ring)
+			continue;
+
+		for (int qlen = 1; qlen < ring_size/2; qlen *= 2) {
+			unsigned int count = 32 * ring_size / qlen;
+			igt_stats_t stats;
+
+			igt_stats_init_with_size(&stats, count);
+			for (unsigned int rep = 0; rep < count; rep++) {
+				igt_spin_t *spin = NULL;
+				IGT_CORK_HANDLE(c);
+
+				if (flags & PREEMPT)
+					spin = __igt_spin_batch_new(fd,
+								    .ctx = ctx[0],
+								    .engine = ring);
+
+				if (flags & CORK) {
+					obj[0].handle = igt_cork_plug(&c, fd);
+					execbuf.buffers_ptr = to_user_pointer(&obj[0]);
+					execbuf.buffer_count = 3;
+				}
 
-		for (j = 0; j < repeats; j++) {
-			uint64_t offset;
-
-			execbuf.flags &= ~ENGINE_FLAGS;
-			execbuf.flags |= ring;
-
-			execbuf.batch_start_offset = 64 * j;
-			reloc.offset =
-				execbuf.batch_start_offset + sizeof(uint32_t);
-			reloc.delta = sizeof(uint32_t) * j;
-
-			reloc.presumed_offset = obj[1].offset;
-			offset = reloc.presumed_offset;
-			offset += reloc.delta;
-
-			i = 16 * j;
-			/* MI_STORE_REG_MEM */
-			map[i++] = 0x24 << 23 | 1;
-			if (has_64bit_reloc)
-				map[i-1]++;
-			map[i++] = RCS_TIMESTAMP; /* ring local! */
-			map[i++] = offset;
-			if (has_64bit_reloc)
-				map[i++] = offset >> 32;
-			map[i++] = MI_BATCH_BUFFER_END;
-
-			gem_execbuf(fd, &execbuf);
-
-			execbuf.flags &= ~ENGINE_FLAGS;
-			execbuf.flags |= other;
-
-			execbuf.batch_start_offset = 64 * (j + repeats);
-			reloc.offset =
-				execbuf.batch_start_offset + sizeof(uint32_t);
-			reloc.delta = sizeof(uint32_t) * (j + repeats);
-
-			reloc.presumed_offset = obj[1].offset;
-			offset = reloc.presumed_offset;
-			offset += reloc.delta;
-
-			i = 16 * (j + repeats);
-			/* MI_STORE_REG_MEM */
-			map[i++] = 0x24 << 23 | 1;
-			if (has_64bit_reloc)
-				map[i-1]++;
-			map[i++] = RCS_TIMESTAMP; /* ring local! */
-			map[i++] = offset;
-			if (has_64bit_reloc)
-				map[i++] = offset >> 32;
-			map[i++] = MI_BATCH_BUFFER_END;
-
-			gem_execbuf(fd, &execbuf);
-		}
+				for (j = 0; j < qlen; j++) {
+					execbuf.flags &= ~ENGINE_FLAGS;
+					execbuf.flags |= ring;
+					execbuf.batch_start_offset = 64 * j;
+
+					gem_execbuf(fd, &execbuf);
 
-		if (flags & CORK)
-			igt_cork_unplug(&c);
-		gem_set_domain(fd, obj[1].handle,
-			       I915_GEM_DOMAIN_GTT,
-			       I915_GEM_DOMAIN_GTT);
-		igt_spin_batch_free(fd, spin);
+					execbuf.flags &= ~ENGINE_FLAGS;
+					execbuf.flags |= other;
+					execbuf.batch_start_offset = 64 * (j + qlen);
 
-		igt_info("%s-%s delay: %.2f\n",
-			 name, e__->name,
-			 (results[2*repeats-1] - results[0]) / (double)repeats);
+					gem_execbuf(fd, &execbuf);
+				}
+
+				if (flags & CORK)
+					igt_cork_unplug(&c);
+				gem_set_domain(fd, obj[1].handle,
+					       I915_GEM_DOMAIN_GTT,
+					       I915_GEM_DOMAIN_GTT);
+				igt_spin_batch_free(fd, spin);
+
+				igt_assert_eq_u64(obj[1].offset,
+						  presumed_offset);
+				igt_stats_push(&stats,
+					       ((uint64_t)(results[2*qlen-1] - results[0]) << 32) / qlen);
+			}
+
+			igt_info("%s-%s-%d delay: %.2f\n",
+				 name, e__->name, qlen,
+				 igt_stats_get_median(&stats) / (1ull << 32));
+			igt_stats_fini(&stats);
+		}
 	}
 
 	munmap(map, 64*1024);
@@ -710,6 +690,11 @@ igt_main
 						latency_from_ring(device,
 								  e->exec_id | e->flags,
 								  e->name, PREEMPT);
+
+					igt_subtest_f("%s-preemption-queued", e->name)
+						latency_from_ring(device,
+								  e->exec_id | e->flags,
+								  e->name, PREEMPT | CORK);
 				}
 			}
 		}
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 01/17] lib: Report file cache as available system memory
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
@ 2018-07-02 11:54   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-02 11:54 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx


On 02/07/2018 10:07, Chris Wilson wrote:
> sysinfo() doesn't include all reclaimable memory. In particular it
> excludes the majority of global_node_page_state(NR_FILE_PAGES),
> reclaimable pages that are a copy of on-disk files It seems the only way
> to obtain this counter is by parsing /proc/meminfo. For comparison,
> check vm_enough_memory() which includes NR_FILE_PAGES as available
> (sadly there's no way to call vm_enough_memory() directly either!)
> 
> v2: Pay attention to what one writes.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   lib/intel_os.c | 36 +++++++++++++++++++++++++++++++-----
>   1 file changed, 31 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/intel_os.c b/lib/intel_os.c
> index 885ffdcec..5bd18fc52 100644
> --- a/lib/intel_os.c
> +++ b/lib/intel_os.c
> @@ -84,6 +84,18 @@ intel_get_total_ram_mb(void)
>   	return retval / (1024*1024);
>   }
>   
> +static uint64_t get_meminfo(const char *info, const char *tag)
> +{
> +	const char *str;
> +	unsigned long val;
> +
> +	str = strstr(info, tag);
> +	if (str && sscanf(str + strlen(tag), " %lu", &val) == 1)
> +		return (uint64_t)val << 10;
> +
> +	return 0;

Might be safer to assert that the format is as expected and the
keywords we expect are there, rather than silently returning zero.
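Something like this, as a rough sketch (assuming igt_assert_f() is
usable from lib/intel_os.c):

	str = strstr(info, tag);
	igt_assert_f(str, "'%s' missing from /proc/meminfo\n", tag);
	igt_assert_f(sscanf(str + strlen(tag), " %lu", &val) == 1,
		     "couldn't parse '%s' in /proc/meminfo\n", tag);

	return (uint64_t)val << 10;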

> +}
> +
>   /**
>    * intel_get_avail_ram_mb:
>    *
> @@ -96,17 +108,31 @@ intel_get_avail_ram_mb(void)
>   	uint64_t retval;
>   
>   #ifdef HAVE_STRUCT_SYSINFO_TOTALRAM /* Linux */
> -	struct sysinfo sysinf;
> +	char *info;
>   	int fd;
>   
>   	fd = drm_open_driver(DRIVER_INTEL);
>   	intel_purge_vm_caches(fd);
>   	close(fd);
>   
> -	igt_assert(sysinfo(&sysinf) == 0);
> -	retval = sysinf.freeram;
> -	retval += min(sysinf.freeswap, sysinf.bufferram);
> -	retval *= sysinf.mem_unit;
> +	fd = open("/proc", O_RDONLY);
> +	info = igt_sysfs_get(fd, "meminfo");
> +	close(fd);
> +
> +	if (info) {
> +		retval  = get_meminfo(info, "MemAvailable: ");
> +		retval += get_meminfo(info, "Buffers: ");
> +		retval += get_meminfo(info, "Cached: ");
> +		retval += get_meminfo(info, "SwapCached: ");

I think it would be more robust to have no trailing space in tag strings.
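The " %lu" conversion already skips leading whitespace, so the tags
should work unchanged without it, e.g. (sketch):

	retval  = get_meminfo(info, "MemAvailable:");
	retval += get_meminfo(info, "Buffers:");
	retval += get_meminfo(info, "Cached:");
	retval += get_meminfo(info, "SwapCached:");

(strstr() on "Cached:" does still rely on it appearing before
"SwapCached:", as it does in today's /proc/meminfo.)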

> +		free(info);
> +	} else {
> +		struct sysinfo sysinf;
> +
> +		igt_assert(sysinfo(&sysinf) == 0);
> +		retval = sysinf.freeram;
> +		retval += min(sysinf.freeswap, sysinf.bufferram);
> +		retval *= sysinf.mem_unit;
> +	}

Not sure it is worth keeping this path - will we ever not have 
/proc/meminfo?

>   #elif defined(_SC_PAGESIZE) && defined(_SC_AVPHYS_PAGES) /* Solaris */
>   	long pagesize, npages;
>   
> 

Google agrees with you that sysinfo indeed has this limitation. So in 
general no complaints.

One tiny detail might be that this would now return too large a value - 
it doesn't account for the fact that the test itself should not be 
swapped out when thinking about free memory. But I don't think that is 
for this level of API to concern itself with - it is definitely way more 
correct to report the page cache.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 02/17] igt/gem_tiled_partial_pwrite_pread: Check for known swizzling
  2018-07-02  9:07   ` [igt-dev] " Chris Wilson
@ 2018-07-02 12:00     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-02 12:00 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx


On 02/07/2018 10:07, Chris Wilson wrote:
> As we want to compare a templated tiling pattern against the target_bo,
> we need to know that the swizzling is compatible. Or else the two
> tiling pattern may differ due to underlying page address that we cannot
> know, and so the test may sporadically fail.
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=102575
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   tests/gem_tiled_partial_pwrite_pread.c | 24 ++++++++++++++++++++++++
>   1 file changed, 24 insertions(+)
> 
> diff --git a/tests/gem_tiled_partial_pwrite_pread.c b/tests/gem_tiled_partial_pwrite_pread.c
> index fe573c37c..83c57c07d 100644
> --- a/tests/gem_tiled_partial_pwrite_pread.c
> +++ b/tests/gem_tiled_partial_pwrite_pread.c
> @@ -249,6 +249,24 @@ static void test_partial_read_writes(void)
>   	}
>   }
>   
> +static bool known_swizzling(uint32_t handle)
> +{
> +	struct drm_i915_gem_get_tiling2 {
> +		uint32_t handle;
> +		uint32_t tiling_mode;
> +		uint32_t swizzle_mode;
> +		uint32_t phys_swizzle_mode;
> +	} arg = {
> +		.handle = handle,
> +	};
> +#define DRM_IOCTL_I915_GEM_GET_TILING2	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_GET_TILING, struct drm_i915_gem_get_tiling2)

Can't we rely on this being in system headers by now?
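If the installed i915_drm.h is new enough to carry phys_swizzle_mode in
struct drm_i915_gem_get_tiling, the local redefinition could go away
entirely. A sketch, assuming new-enough headers:

	struct drm_i915_gem_get_tiling arg = {
		.handle = handle,
	};

	if (igt_ioctl(fd, DRM_IOCTL_I915_GEM_GET_TILING, &arg))
		return false;

	return arg.phys_swizzle_mode == arg.swizzle_mode;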
> +
> +	if (igt_ioctl(fd, DRM_IOCTL_I915_GEM_GET_TILING2, &arg))
> +		return false;
> +
> +	return arg.phys_swizzle_mode == arg.swizzle_mode;
> +}
> +
>   igt_main
>   {
>   	uint32_t tiling_mode = I915_TILING_X;
> @@ -271,6 +289,12 @@ igt_main
>   						      &tiling_mode, &scratch_pitch, 0);
>   		igt_assert(tiling_mode == I915_TILING_X);
>   		igt_assert(scratch_pitch == 4096);
> +
> +		/*
> +		 * As we want to compare our template tiled pattern against
> +		 * the target bo, we need consistent swizzling on both.
> +		 */
> +		igt_require(known_swizzling(scratch_bo->handle));
>   		staging_bo = drm_intel_bo_alloc(bufmgr, "staging bo", BO_SIZE, 4096);
>   		tiled_staging_bo = drm_intel_bo_alloc_tiled(bufmgr, "scratch bo", 1024,
>   							    BO_SIZE/4096, 4,
> 

Another option could be to keep allocating until we find one in a 
memory area with compatible swizzling? As it stands there may be some 
noise in the test pass<->skip transitions.
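
Rough sketch of that idea (untested; it holds on to the rejects so the
bo cache cannot hand the same pages straight back):

	drm_intel_bo *rejects[64];
	int n = 0;

	do {
		scratch_bo = drm_intel_bo_alloc_tiled(bufmgr, "scratch bo",
						      1024, BO_SIZE/4096, 4,
						      &tiling_mode,
						      &scratch_pitch, 0);
		if (known_swizzling(scratch_bo->handle))
			break;

		rejects[n++] = scratch_bo;
		scratch_bo = NULL;
	} while (n < ARRAY_SIZE(rejects));

	while (n--)
		drm_intel_bo_unreference(rejects[n]);

	igt_require(scratch_bo);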

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 03/17] igt/gem_set_tiling_vs_pwrite: Show the erroneous value
  2018-07-02  9:07   ` [Intel-gfx] " Chris Wilson
@ 2018-07-02 12:00     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-02 12:00 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx


On 02/07/2018 10:07, Chris Wilson wrote:
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   tests/gem_set_tiling_vs_pwrite.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/gem_set_tiling_vs_pwrite.c b/tests/gem_set_tiling_vs_pwrite.c
> index 006edfe4e..f0126b648 100644
> --- a/tests/gem_set_tiling_vs_pwrite.c
> +++ b/tests/gem_set_tiling_vs_pwrite.c
> @@ -75,7 +75,7 @@ igt_simple_main
>   	memset(data, 0, OBJECT_SIZE);
>   	gem_read(fd, handle, 0, data, OBJECT_SIZE);
>   	for (i = 0; i < OBJECT_SIZE/4; i++)
> -		igt_assert(i == data[i]);
> +		igt_assert_eq_u32(data[i], i);
>   
>   	/* touch it before changing the tiling, so that the fence sticks around */
>   	gem_set_domain(fd, handle, I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
> @@ -88,7 +88,7 @@ igt_simple_main
>   	memset(data, 0, OBJECT_SIZE);
>   	gem_read(fd, handle, 0, data, OBJECT_SIZE);
>   	for (i = 0; i < OBJECT_SIZE/4; i++)
> -		igt_assert(i == data[i]);
> +		igt_assert_eq_u32(data[i], i);
>   
>   	munmap(ptr, OBJECT_SIZE);
>   
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 01/17] lib: Report file cache as available system memory
  2018-07-02 11:54   ` Tvrtko Ursulin
@ 2018-07-02 12:08     ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-02 12:08 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: intel-gfx

Quoting Tvrtko Ursulin (2018-07-02 12:54:18)
> 
> On 02/07/2018 10:07, Chris Wilson wrote:
> > sysinfo() doesn't include all reclaimable memory. In particular it
> > excludes the majority of global_node_page_state(NR_FILE_PAGES),
> > reclaimable pages that are a copy of on-disk files It seems the only way
> > to obtain this counter is by parsing /proc/meminfo. For comparison,
> > check vm_enough_memory() which includes NR_FILE_PAGES as available
> > (sadly there's no way to call vm_enough_memory() directly either!)
> > 
> > v2: Pay attention to what one writes.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   lib/intel_os.c | 36 +++++++++++++++++++++++++++++++-----
> >   1 file changed, 31 insertions(+), 5 deletions(-)
> > 
> > diff --git a/lib/intel_os.c b/lib/intel_os.c
> > index 885ffdcec..5bd18fc52 100644
> > --- a/lib/intel_os.c
> > +++ b/lib/intel_os.c
> > @@ -84,6 +84,18 @@ intel_get_total_ram_mb(void)
> >       return retval / (1024*1024);
> >   }
> >   
> > +static uint64_t get_meminfo(const char *info, const char *tag)
> > +{
> > +     const char *str;
> > +     unsigned long val;
> > +
> > +     str = strstr(info, tag);
> > +     if (str && sscanf(str + strlen(tag), " %lu", &val) == 1)
> > +             return (uint64_t)val << 10;
> > +
> > +     return 0;
> 
> Might be safer to assert format was as expected and keywords we expect 
> are there, rather than silently return zero.

Definitely prefer not to assert here if we can help it.
igt_warn should be obvious enough to catch change in tags, but not if
the information changes.
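i.e. something like (sketch):

	str = strstr(info, tag);
	if (str && sscanf(str + strlen(tag), " %lu", &val) == 1)
		return (uint64_t)val << 10;

	igt_warn("no '%s' found in /proc/meminfo\n", tag);
	return 0;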
 
> > +}
> > +
> >   /**
> >    * intel_get_avail_ram_mb:
> >    *
> > @@ -96,17 +108,31 @@ intel_get_avail_ram_mb(void)
> >       uint64_t retval;
> >   
> >   #ifdef HAVE_STRUCT_SYSINFO_TOTALRAM /* Linux */
> > -     struct sysinfo sysinf;
> > +     char *info;
> >       int fd;
> >   
> >       fd = drm_open_driver(DRIVER_INTEL);
> >       intel_purge_vm_caches(fd);
> >       close(fd);
> >   
> > -     igt_assert(sysinfo(&sysinf) == 0);
> > -     retval = sysinf.freeram;
> > -     retval += min(sysinf.freeswap, sysinf.bufferram);
> > -     retval *= sysinf.mem_unit;
> > +     fd = open("/proc", O_RDONLY);
> > +     info = igt_sysfs_get(fd, "meminfo");
> > +     close(fd);
> > +
> > +     if (info) {
> > +             retval  = get_meminfo(info, "MemAvailable: ");
> > +             retval += get_meminfo(info, "Buffers: ");
> > +             retval += get_meminfo(info, "Cached: ");
> > +             retval += get_meminfo(info, "SwapCached: ");
> 
> I think it would be more robust to have no trailing space in tag strings.

There was a reason iirc, but I think it was just for debug convenience.

> > +             free(info);
> > +     } else {
> > +             struct sysinfo sysinf;
> > +
> > +             igt_assert(sysinfo(&sysinf) == 0);
> > +             retval = sysinf.freeram;
> > +             retval += min(sysinf.freeswap, sysinf.bufferram);
> > +             retval *= sysinf.mem_unit;
> > +     }
> 
> Not sure it is worth keeping this path - will we ever not have 
> /proc/meminfo?

I expect someone will eventually deem it to be too dangerous information
and remove it. I felt since we had the code here, we might as well try.
Technically /proc doesn't have to be mounted. :|

> >   #elif defined(_SC_PAGESIZE) && defined(_SC_AVPHYS_PAGES) /* Solaris */
> >       long pagesize, npages;
> >   
> > 
> 
> Google agrees with you that sysinfo indeed has this limitation. So in 
> general no complaints.
> 
> One tiny detail might be that this would now return a too large value - 
> doesn't count that it should not swap out itself when thinking about 
> free memory. But I don't think that is for this level of API to concern 
> with - it is definitely way more correct to report page cache.

The question we ask is whether or not the test itself will be swapped
out; I don't mind if we swap others out in order to run the test. We do
after all try and purge memory to make it all available to ourselves,
and part of that was in the belief that it would make the filecache
available to us.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [igt-dev] ✓ Fi.CI.BAT: success for series starting with [i-g-t,01/17] lib: Report file cache as available system memory
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
                   ` (17 preceding siblings ...)
  (?)
@ 2018-07-02 13:09 ` Patchwork
  -1 siblings, 0 replies; 68+ messages in thread
From: Patchwork @ 2018-07-02 13:09 UTC (permalink / raw)
  To: Chris Wilson; +Cc: igt-dev

== Series Details ==

Series: series starting with [i-g-t,01/17] lib: Report file cache as available system memory
URL   : https://patchwork.freedesktop.org/series/45759/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4404 -> IGTPW_1519 =

== Summary - SUCCESS ==

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/45759/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in IGTPW_1519:

  === IGT changes ===

    ==== Warnings ====

    igt@kms_pipe_crc_basic@read-crc-pipe-b:
      {fi-cfl-8109u}:     PASS -> SKIP +36

    
== Known issues ==

  Here are the changes found in IGTPW_1519 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@prime_vgem@basic-fence-flip:
      fi-ilk-650:         PASS -> FAIL (fdo#104008)

    
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  fdo#104008 https://bugs.freedesktop.org/show_bug.cgi?id=104008


== Participating hosts (45 -> 39) ==

  Missing    (6): fi-ilk-m540 fi-hsw-4200u fi-byt-j1900 fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 


== Build changes ==

    * IGT: IGT_4531 -> IGTPW_1519

  CI_DRM_4404: ceaab659002c938f1788b7458d5081fadc3c1ddc @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_1519: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1519/
  IGT_4531: a14bc8b4d69eaca189665de505e6b10cbfbb7730 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools



== Testlist changes ==

+++ 177 lines
--- 1 lines

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1519/issues.html
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [igt-dev] ✓ Fi.CI.IGT: success for series starting with [i-g-t,01/17] lib: Report file cache as available system memory
  2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
                   ` (18 preceding siblings ...)
  (?)
@ 2018-07-02 14:16 ` Patchwork
  -1 siblings, 0 replies; 68+ messages in thread
From: Patchwork @ 2018-07-02 14:16 UTC (permalink / raw)
  To: Chris Wilson; +Cc: igt-dev

== Series Details ==

Series: series starting with [i-g-t,01/17] lib: Report file cache as available system memory
URL   : https://patchwork.freedesktop.org/series/45759/
State : success

== Summary ==

= CI Bug Log - changes from IGT_4531_full -> IGTPW_1519_full =

== Summary - WARNING ==

  Minor unknown changes coming with IGTPW_1519_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in IGTPW_1519_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/45759/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in IGTPW_1519_full:

  === IGT changes ===

    ==== Possible regressions ====

    {igt@gem_ctx_switch@basic-default-heavy-queue}:
      shard-glk:          NOTRUN -> FAIL +3

    {igt@gem_ctx_switch@basic-default-heavy-queue-interruptible}:
      shard-hsw:          NOTRUN -> FAIL +4

    {igt@gem_ctx_switch@basic-default-queue}:
      shard-kbl:          NOTRUN -> FAIL +1

    {igt@gem_ctx_switch@basic-default-queue-interruptible}:
      shard-snb:          NOTRUN -> FAIL +3
      shard-apl:          NOTRUN -> FAIL +3

    
    ==== Warnings ====

    igt@gem_mocs_settings@mocs-rc6-blt:
      shard-kbl:          PASS -> SKIP

    igt@kms_cursor_crc@cursor-64x64-dpms:
      shard-snb:          SKIP -> PASS

    
== Known issues ==

  Here are the changes found in IGTPW_1519_full that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@drv_selftest@live_gtt:
      shard-glk:          PASS -> INCOMPLETE (k.org#198133, fdo#103359)

    igt@gem_exec_await@wide-contexts:
      shard-glk:          PASS -> FAIL (fdo#105900)

    igt@gem_exec_schedule@pi-ringfull-bsd1:
      shard-kbl:          NOTRUN -> FAIL (fdo#103158) +1

    igt@kms_available_modes_crc@available_mode_test_crc:
      shard-kbl:          NOTRUN -> FAIL (fdo#106641)

    igt@kms_draw_crc@draw-method-rgb565-blt-xtiled:
      shard-snb:          PASS -> INCOMPLETE (fdo#105411)

    igt@kms_flip@2x-plain-flip-fb-recreate-interruptible:
      shard-glk:          PASS -> FAIL (fdo#100368) +2

    igt@kms_plane_multiple@atomic-pipe-a-tiling-x:
      shard-snb:          PASS -> FAIL (fdo#103166, fdo#104724)

    igt@kms_rotation_crc@primary-rotation-180:
      shard-apl:          PASS -> FAIL (fdo#104724, fdo#103925)

    igt@kms_setmode@basic:
      shard-kbl:          PASS -> FAIL (fdo#99912)

    igt@kms_sysfs_edid_timing:
      shard-kbl:          NOTRUN -> FAIL (fdo#100047)

    igt@perf@polling:
      shard-hsw:          PASS -> FAIL (fdo#102252)

    
    ==== Possible fixes ====

    igt@drv_selftest@live_hangcheck:
      shard-apl:          DMESG-FAIL (fdo#106560, fdo#106947) -> PASS

    igt@drv_selftest@live_hugepages:
      shard-kbl:          INCOMPLETE (fdo#103665) -> PASS

    igt@gem_persistent_relocs@forked-faulting-reloc:
      shard-snb:          INCOMPLETE (fdo#105411) -> PASS

    igt@kms_flip@2x-flip-vs-expired-vblank:
      shard-glk:          FAIL (fdo#105363) -> PASS

    igt@kms_frontbuffer_tracking@fbc-1p-pri-indfb-multidraw:
      shard-snb:          FAIL (fdo#103167, fdo#104724) -> PASS

    igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-shrfb-draw-mmap-wc:
      shard-glk:          FAIL (fdo#103167, fdo#104724) -> PASS

    
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  fdo#100047 https://bugs.freedesktop.org/show_bug.cgi?id=100047
  fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
  fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252
  fdo#103158 https://bugs.freedesktop.org/show_bug.cgi?id=103158
  fdo#103166 https://bugs.freedesktop.org/show_bug.cgi?id=103166
  fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
  fdo#103359 https://bugs.freedesktop.org/show_bug.cgi?id=103359
  fdo#103665 https://bugs.freedesktop.org/show_bug.cgi?id=103665
  fdo#103925 https://bugs.freedesktop.org/show_bug.cgi?id=103925
  fdo#104724 https://bugs.freedesktop.org/show_bug.cgi?id=104724
  fdo#105363 https://bugs.freedesktop.org/show_bug.cgi?id=105363
  fdo#105411 https://bugs.freedesktop.org/show_bug.cgi?id=105411
  fdo#105900 https://bugs.freedesktop.org/show_bug.cgi?id=105900
  fdo#106560 https://bugs.freedesktop.org/show_bug.cgi?id=106560
  fdo#106641 https://bugs.freedesktop.org/show_bug.cgi?id=106641
  fdo#106947 https://bugs.freedesktop.org/show_bug.cgi?id=106947
  fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
  k.org#198133 https://bugzilla.kernel.org/show_bug.cgi?id=198133


== Participating hosts (5 -> 5) ==

  No changes in participating hosts


== Build changes ==

    * IGT: IGT_4531 -> IGTPW_1519
    * Linux: CI_DRM_4401 -> CI_DRM_4404

  CI_DRM_4401: 4fe59a304a9a855a1c0e9a576c94d4cca239b427 @ git://anongit.freedesktop.org/gfx-ci/linux
  CI_DRM_4404: ceaab659002c938f1788b7458d5081fadc3c1ddc @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_1519: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1519/
  IGT_4531: a14bc8b4d69eaca189665de505e6b10cbfbb7730 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1519/shards.html
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 04/17] lib: Convert spin batch constructor to a factory
  2018-07-02  9:07   ` [igt-dev] " Chris Wilson
@ 2018-07-02 15:34     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-02 15:34 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx


On 02/07/2018 10:07, Chris Wilson wrote:
> In order to make adding more options easier, expose the full set of
> options to the caller.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   lib/igt_dummyload.c            | 147 +++++++++------------------------
>   lib/igt_dummyload.h            |  42 +++++-----
>   tests/drv_missed_irq.c         |   2 +-
>   tests/gem_busy.c               |  17 ++--
>   tests/gem_ctx_isolation.c      |  26 +++---
>   tests/gem_eio.c                |  13 ++-
>   tests/gem_exec_fence.c         |  16 ++--
>   tests/gem_exec_latency.c       |  18 +++-
>   tests/gem_exec_nop.c           |   4 +-
>   tests/gem_exec_reloc.c         |  10 ++-
>   tests/gem_exec_schedule.c      |  27 ++++--
>   tests/gem_exec_suspend.c       |   2 +-
>   tests/gem_fenced_exec_thrash.c |   2 +-
>   tests/gem_shrink.c             |   4 +-
>   tests/gem_spin_batch.c         |   4 +-
>   tests/gem_sync.c               |   5 +-
>   tests/gem_wait.c               |   4 +-
>   tests/kms_busy.c               |  10 ++-
>   tests/kms_cursor_legacy.c      |   7 +-
>   tests/perf_pmu.c               |  33 +++++---
>   tests/pm_rps.c                 |   9 +-
>   21 files changed, 189 insertions(+), 213 deletions(-)
> 
> diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
> index 3809b4e61..94efdf745 100644
> --- a/lib/igt_dummyload.c
> +++ b/lib/igt_dummyload.c
> @@ -75,12 +75,9 @@ fill_reloc(struct drm_i915_gem_relocation_entry *reloc,
>   	reloc->write_domain = write_domains;
>   }
>   
> -#define OUT_FENCE	(1 << 0)
> -#define POLL_RUN	(1 << 1)
> -
>   static int
> -emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
> -		     uint32_t dep, unsigned int flags)
> +emit_recursive_batch(igt_spin_t *spin,
> +		     int fd, const struct igt_spin_factory *opts)
>   {
>   #define SCRATCH 0
>   #define BATCH 1
> @@ -95,21 +92,18 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
>   	int i;
>   
>   	nengine = 0;
> -	if (engine == ALL_ENGINES) {
> -		for_each_engine(fd, engine) {
> -			if (engine) {
> -			if (flags & POLL_RUN)
> -				igt_require(!(flags & POLL_RUN) ||
> -					    gem_can_store_dword(fd, engine));
> -
> -				engines[nengine++] = engine;
> -			}
> +	if (opts->engine == ALL_ENGINES) {
> +		unsigned int engine;
> +
> +		for_each_physical_engine(fd, engine) {
> +			if (opts->flags & IGT_SPIN_POLL_RUN &&
> +			    !gem_can_store_dword(fd, engine))
> +				continue;
> +
> +			engines[nengine++] = engine;
>   		}
>   	} else {
> -		gem_require_ring(fd, engine);
> -		igt_require(!(flags & POLL_RUN) ||
> -			    gem_can_store_dword(fd, engine));
> -		engines[nengine++] = engine;
> +		engines[nengine++] = opts->engine;
>   	}
>   	igt_require(nengine);
>   
> @@ -130,20 +124,20 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
>   	execbuf->buffer_count++;
>   	batch_start = batch;
>   
> -	if (dep) {
> -		igt_assert(!(flags & POLL_RUN));
> +	if (opts->dependency) {
> +		igt_assert(!(opts->flags & IGT_SPIN_POLL_RUN));
>   
>   		/* dummy write to dependency */
> -		obj[SCRATCH].handle = dep;
> +		obj[SCRATCH].handle = opts->dependency;
>   		fill_reloc(&relocs[obj[BATCH].relocation_count++],
> -			   dep, 1020,
> +			   opts->dependency, 1020,
>   			   I915_GEM_DOMAIN_RENDER,
>   			   I915_GEM_DOMAIN_RENDER);
>   		execbuf->buffer_count++;
> -	} else if (flags & POLL_RUN) {
> +	} else if (opts->flags & IGT_SPIN_POLL_RUN) {
>   		unsigned int offset;
>   
> -		igt_assert(!dep);
> +		igt_assert(!opts->dependency);
>   
>   		if (gen == 4 || gen == 5) {
>   			execbuf->flags |= I915_EXEC_SECURE;
> @@ -231,9 +225,9 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
>   
>   	execbuf->buffers_ptr = to_user_pointer(obj +
>   					       (2 - execbuf->buffer_count));
> -	execbuf->rsvd1 = ctx;
> +	execbuf->rsvd1 = opts->ctx;
>   
> -	if (flags & OUT_FENCE)
> +	if (opts->flags & IGT_SPIN_FENCE_OUT)
>   		execbuf->flags |= I915_EXEC_FENCE_OUT;
>   
>   	for (i = 0; i < nengine; i++) {
> @@ -242,7 +236,7 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
>   
>   		gem_execbuf_wr(fd, execbuf);
>   
> -		if (flags & OUT_FENCE) {
> +		if (opts->flags & IGT_SPIN_FENCE_OUT) {
>   			int _fd = execbuf->rsvd2 >> 32;
>   
>   			igt_assert(_fd >= 0);
> @@ -271,16 +265,14 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
>   }
>   
>   static igt_spin_t *
> -___igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep,
> -		      unsigned int flags)
> +spin_batch_create(int fd, const struct igt_spin_factory *opts)
>   {
>   	igt_spin_t *spin;
>   
>   	spin = calloc(1, sizeof(struct igt_spin));
>   	igt_assert(spin);
>   
> -	spin->out_fence = emit_recursive_batch(spin, fd, ctx, engine, dep,
> -					       flags);
> +	spin->out_fence = emit_recursive_batch(spin, fd, opts);
>   
>   	pthread_mutex_lock(&list_lock);
>   	igt_list_add(&spin->link, &spin_list);
> @@ -290,18 +282,15 @@ ___igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep,
>   }
>   
>   igt_spin_t *
> -__igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep)
> +__igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts)
>   {
> -	return ___igt_spin_batch_new(fd, ctx, engine, dep, 0);
> +	return spin_batch_create(fd, opts);
>   }
>   
>   /**
> - * igt_spin_batch_new:
> + * igt_spin_batch_factory:
>    * @fd: open i915 drm file descriptor
> - * @engine: Ring to execute batch OR'd with execbuf flags. If value is less
> - *          than 0, execute on all available rings.
> - * @dep: handle to a buffer object dependency. If greater than 0, add a
> - *              relocation entry to this buffer within the batch.
> + * @opts: controlling options such as context, engine, dependencies etc
>    *
>    * Start a recursive batch on a ring. Immediately returns a #igt_spin_t that
>    * contains the batch's handle that can be waited upon. The returned structure
> @@ -311,86 +300,26 @@ __igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep)
>    * Structure with helper internal state for igt_spin_batch_free().
>    */
>   igt_spin_t *
> -igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep)
> +igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts)
>   {
>   	igt_spin_t *spin;
>   
>   	igt_require_gem(fd);
>   
> -	spin = __igt_spin_batch_new(fd, ctx, engine, dep);
> -	igt_assert(gem_bo_busy(fd, spin->handle));
> -
> -	return spin;
> -}
> -
> -igt_spin_t *
> -__igt_spin_batch_new_fence(int fd, uint32_t ctx, unsigned engine)
> -{
> -	return ___igt_spin_batch_new(fd, ctx, engine, 0, OUT_FENCE);
> -}
> +	if (opts->engine != ALL_ENGINES) {
> +		gem_require_ring(fd, opts->engine);
> +		if (opts->flags & IGT_SPIN_POLL_RUN)
> +			igt_require(gem_can_store_dword(fd, opts->engine));
> +	}
>   
> -/**
> - * igt_spin_batch_new_fence:
> - * @fd: open i915 drm file descriptor
> - * @engine: Ring to execute batch OR'd with execbuf flags. If value is less
> - *          than 0, execute on all available rings.
> - *
> - * Start a recursive batch on a ring. Immediately returns a #igt_spin_t that
> - * contains the batch's handle that can be waited upon. The returned structure
> - * must be passed to igt_spin_batch_free() for post-processing.
> - *
> - * igt_spin_t will contain an output fence associtated with this batch.
> - *
> - * Returns:
> - * Structure with helper internal state for igt_spin_batch_free().
> - */
> -igt_spin_t *
> -igt_spin_batch_new_fence(int fd, uint32_t ctx, unsigned engine)
> -{
> -	igt_spin_t *spin;
> +	spin = spin_batch_create(fd, opts);
>   
> -	igt_require_gem(fd);
> -	igt_require(gem_has_exec_fence(fd));
> -
> -	spin = __igt_spin_batch_new_fence(fd, ctx, engine);
>   	igt_assert(gem_bo_busy(fd, spin->handle));
> -	igt_assert(poll(&(struct pollfd){spin->out_fence, POLLIN}, 1, 0) == 0);
> -
> -	return spin;
> -}
> -
> -igt_spin_t *
> -__igt_spin_batch_new_poll(int fd, uint32_t ctx, unsigned engine)
> -{
> -	return ___igt_spin_batch_new(fd, ctx, engine, 0, POLL_RUN);
> -}
> +	if (opts->flags & IGT_SPIN_FENCE_OUT) {
> +		struct pollfd pfd = { spin->out_fence, POLLIN };
>   
> -/**
> - * igt_spin_batch_new_poll:
> - * @fd: open i915 drm file descriptor
> - * @engine: Ring to execute batch OR'd with execbuf flags. If value is less
> - *          than 0, execute on all available rings.
> - *
> - * Start a recursive batch on a ring. Immediately returns a #igt_spin_t that
> - * contains the batch's handle that can be waited upon. The returned structure
> - * must be passed to igt_spin_batch_free() for post-processing.
> - *
> - * igt_spin_t->running will containt a pointer which target will change from
> - * zero to one once the spinner actually starts executing on the GPU.
> - *
> - * Returns:
> - * Structure with helper internal state for igt_spin_batch_free().
> - */
> -igt_spin_t *
> -igt_spin_batch_new_poll(int fd, uint32_t ctx, unsigned engine)
> -{
> -	igt_spin_t *spin;
> -
> -	igt_require_gem(fd);
> -	igt_require(gem_mmap__has_wc(fd));
> -
> -	spin = __igt_spin_batch_new_poll(fd, ctx, engine);
> -	igt_assert(gem_bo_busy(fd, spin->handle));
> +		igt_assert(poll(&pfd, 1, 0) == 0);
> +	}
>   
>   	return spin;
>   }
> diff --git a/lib/igt_dummyload.h b/lib/igt_dummyload.h
> index c6ccc2936..c794f2544 100644
> --- a/lib/igt_dummyload.h
> +++ b/lib/igt_dummyload.h
> @@ -43,29 +43,25 @@ typedef struct igt_spin {
>   	bool *running;
>   } igt_spin_t;
>   
> -igt_spin_t *__igt_spin_batch_new(int fd,
> -				 uint32_t ctx,
> -				 unsigned engine,
> -				 uint32_t  dep);
> -igt_spin_t *igt_spin_batch_new(int fd,
> -			       uint32_t ctx,
> -			       unsigned engine,
> -			       uint32_t  dep);
> -
> -igt_spin_t *__igt_spin_batch_new_fence(int fd,
> -				       uint32_t ctx,
> -				       unsigned engine);
> -
> -igt_spin_t *igt_spin_batch_new_fence(int fd,
> -				     uint32_t ctx,
> -				     unsigned engine);
> -
> -igt_spin_t *__igt_spin_batch_new_poll(int fd,
> -				       uint32_t ctx,
> -				       unsigned engine);
> -igt_spin_t *igt_spin_batch_new_poll(int fd,
> -				    uint32_t ctx,
> -				    unsigned engine);
> +struct igt_spin_factory {
> +	uint32_t ctx;
> +	uint32_t dependency;
> +	unsigned int engine;
> +	unsigned int flags;
> +};
> +
> +#define IGT_SPIN_FENCE_OUT (1 << 0)
> +#define IGT_SPIN_POLL_RUN  (1 << 1)
> +
> +igt_spin_t *
> +__igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts);
> +igt_spin_t *
> +igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts);
> +
> +#define __igt_spin_batch_new(fd, ...) \
> +	__igt_spin_batch_factory(fd, &((struct igt_spin_factory){__VA_ARGS__}))
> +#define igt_spin_batch_new(fd, ...) \
> +	igt_spin_batch_factory(fd, &((struct igt_spin_factory){__VA_ARGS__}))
>   
>   void igt_spin_batch_set_timeout(igt_spin_t *spin, int64_t ns);
>   void igt_spin_batch_end(igt_spin_t *spin);
> diff --git a/tests/drv_missed_irq.c b/tests/drv_missed_irq.c
> index 791ee51fb..78690c36a 100644
> --- a/tests/drv_missed_irq.c
> +++ b/tests/drv_missed_irq.c
> @@ -33,7 +33,7 @@ IGT_TEST_DESCRIPTION("Inject missed interrupts and make sure they are caught");
>   
>   static void trigger_missed_interrupt(int fd, unsigned ring)
>   {
> -	igt_spin_t *spin = __igt_spin_batch_new(fd, 0, ring, 0);
> +	igt_spin_t *spin = __igt_spin_batch_new(fd, .engine = ring);
>   	uint32_t go;
>   	int link[2];
>   
> diff --git a/tests/gem_busy.c b/tests/gem_busy.c
> index f564651ba..76b44a5d4 100644
> --- a/tests/gem_busy.c
> +++ b/tests/gem_busy.c
> @@ -114,7 +114,9 @@ static void semaphore(int fd, unsigned ring, uint32_t flags)
>   
>   	/* Create a long running batch which we can use to hog the GPU */
>   	handle[BUSY] = gem_create(fd, 4096);
> -	spin = igt_spin_batch_new(fd, 0, ring, handle[BUSY]);
> +	spin = igt_spin_batch_new(fd,
> +				  .engine = ring,
> +				  .dependency = handle[BUSY]);
>   
>   	/* Queue a batch after the busy, it should block and remain "busy" */
>   	igt_assert(exec_noop(fd, handle, ring | flags, false));
> @@ -363,17 +365,16 @@ static void close_race(int fd)
>   		igt_assert(sched_setscheduler(getpid(), SCHED_RR, &rt) == 0);
>   
>   		for (i = 0; i < nhandles; i++) {
> -			spin[i] = __igt_spin_batch_new(fd, 0,
> -						       engines[rand() % nengine], 0);
> +			spin[i] = __igt_spin_batch_new(fd,
> +						       .engine = engines[rand() % nengine]);
>   			handles[i] = spin[i]->handle;
>   		}
>   
>   		igt_until_timeout(20) {
>   			for (i = 0; i < nhandles; i++) {
>   				igt_spin_batch_free(fd, spin[i]);
> -				spin[i] = __igt_spin_batch_new(fd, 0,
> -							       engines[rand() % nengine],
> -							       0);
> +				spin[i] = __igt_spin_batch_new(fd,
> +							       .engine = engines[rand() % nengine]);
>   				handles[i] = spin[i]->handle;
>   				__sync_synchronize();
>   			}
> @@ -415,7 +416,7 @@ static bool has_semaphores(int fd)
>   
>   static bool has_extended_busy_ioctl(int fd)
>   {
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, I915_EXEC_RENDER, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd, .engine = I915_EXEC_RENDER);
>   	uint32_t read, write;
>   
>   	__gem_busy(fd, spin->handle, &read, &write);
> @@ -426,7 +427,7 @@ static bool has_extended_busy_ioctl(int fd)
>   
>   static void basic(int fd, unsigned ring, unsigned flags)
>   {
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, ring, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd, .engine = ring);
>   	struct timespec tv;
>   	int timeout;
>   	bool busy;
> diff --git a/tests/gem_ctx_isolation.c b/tests/gem_ctx_isolation.c
> index fe7d3490c..2e19e8c03 100644
> --- a/tests/gem_ctx_isolation.c
> +++ b/tests/gem_ctx_isolation.c
> @@ -502,7 +502,7 @@ static void isolation(int fd,
>   		ctx[0] = gem_context_create(fd);
>   		regs[0] = read_regs(fd, ctx[0], e, flags);
>   
> -		spin = igt_spin_batch_new(fd, ctx[0], engine, 0);
> +		spin = igt_spin_batch_new(fd, .ctx = ctx[0], .engine = engine);
>   
>   		if (flags & DIRTY1) {
>   			igt_debug("%s[%d]: Setting all registers of ctx 0 to 0x%08x\n",
> @@ -557,8 +557,11 @@ static void isolation(int fd,
>   
>   static void inject_reset_context(int fd, unsigned int engine)
>   {
> +	struct igt_spin_factory opts = {
> +		.ctx = gem_context_create(fd),
> +		.engine = engine,
> +	};
>   	igt_spin_t *spin;
> -	uint32_t ctx;
>   
>   	/*
>   	 * Force a context switch before triggering the reset, or else
> @@ -566,19 +569,20 @@ static void inject_reset_context(int fd, unsigned int engine)
>   	 * HW for screwing up if the context was already broken.
>   	 */
>   
> -	ctx = gem_context_create(fd);
> -	if (gem_can_store_dword(fd, engine)) {
> -		spin = __igt_spin_batch_new_poll(fd, ctx, engine);
> +	if (gem_can_store_dword(fd, engine))
> +		opts.flags |= IGT_SPIN_POLL_RUN;
> +
> +	spin = __igt_spin_batch_factory(fd, &opts);
> +
> +	if (spin->running)
>   		igt_spin_busywait_until_running(spin);
> -	} else {
> -		spin = __igt_spin_batch_new(fd, ctx, engine, 0);
> +	else
>   		usleep(1000); /* better than nothing */
> -	}
>   
>   	igt_force_gpu_reset(fd);
>   
>   	igt_spin_batch_free(fd, spin);
> -	gem_context_destroy(fd, ctx);
> +	gem_context_destroy(fd, opts.ctx);
>   }
>   
>   static void preservation(int fd,
> @@ -604,7 +608,7 @@ static void preservation(int fd,
>   	gem_quiescent_gpu(fd);
>   
>   	ctx[num_values] = gem_context_create(fd);
> -	spin = igt_spin_batch_new(fd, ctx[num_values], engine, 0);
> +	spin = igt_spin_batch_new(fd, .ctx = ctx[num_values], .engine = engine);
>   	regs[num_values][0] = read_regs(fd, ctx[num_values], e, flags);
>   	for (int v = 0; v < num_values; v++) {
>   		ctx[v] = gem_context_create(fd);
> @@ -644,7 +648,7 @@ static void preservation(int fd,
>   		break;
>   	}
>   
> -	spin = igt_spin_batch_new(fd, ctx[num_values], engine, 0);
> +	spin = igt_spin_batch_new(fd, .ctx = ctx[num_values], .engine = engine);
>   	for (int v = 0; v < num_values; v++)
>   		regs[v][1] = read_regs(fd, ctx[v], e, flags);
>   	regs[num_values][1] = read_regs(fd, ctx[num_values], e, flags);
> diff --git a/tests/gem_eio.c b/tests/gem_eio.c
> index 5faf7502b..0ec1aaec9 100644
> --- a/tests/gem_eio.c
> +++ b/tests/gem_eio.c
> @@ -157,10 +157,15 @@ static int __gem_wait(int fd, uint32_t handle, int64_t timeout)
>   
>   static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
>   {
> -	if (gem_can_store_dword(fd, flags))
> -		return __igt_spin_batch_new_poll(fd, ctx, flags);
> -	else
> -		return __igt_spin_batch_new(fd, ctx, flags, 0);
> +	struct igt_spin_factory opts = {
> +		.ctx = ctx,
> +		.engine = flags,
> +	};
> +
> +	if (gem_can_store_dword(fd, opts.engine))
> +		opts.flags |= IGT_SPIN_POLL_RUN;
> +
> +	return __igt_spin_batch_factory(fd, &opts);
>   }
>   
>   static void __spin_wait(int fd, igt_spin_t *spin)
> diff --git a/tests/gem_exec_fence.c b/tests/gem_exec_fence.c
> index eb93308d1..ba46595d3 100644
> --- a/tests/gem_exec_fence.c
> +++ b/tests/gem_exec_fence.c
> @@ -468,7 +468,7 @@ static void test_parallel(int fd, unsigned int master)
>   	/* Fill the queue with many requests so that the next one has to
>   	 * wait before it can be executed by the hardware.
>   	 */
> -	spin = igt_spin_batch_new(fd, 0, master, plug);
> +	spin = igt_spin_batch_new(fd, .engine = master, .dependency = plug);
>   	resubmit(fd, spin->handle, master, 16);
>   
>   	/* Now queue the master request and its secondaries */
> @@ -651,7 +651,7 @@ static void test_keep_in_fence(int fd, unsigned int engine, unsigned int flags)
>   	igt_spin_t *spin;
>   	int fence;
>   
> -	spin = igt_spin_batch_new(fd, 0, engine, 0);
> +	spin = igt_spin_batch_new(fd, .engine = engine);
>   
>   	gem_execbuf_wr(fd, &execbuf);
>   	fence = upper_32_bits(execbuf.rsvd2);
> @@ -1070,7 +1070,7 @@ static void test_syncobj_unused_fence(int fd)
>   	struct local_gem_exec_fence fence = {
>   		.handle = syncobj_create(fd),
>   	};
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd);
>   
>   	/* sanity check our syncobj_to_sync_file interface */
>   	igt_assert_eq(__syncobj_to_sync_file(fd, 0), -ENOENT);
> @@ -1162,7 +1162,7 @@ static void test_syncobj_signal(int fd)
>   	struct local_gem_exec_fence fence = {
>   		.handle = syncobj_create(fd),
>   	};
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd);
>   
>   	/* Check that the syncobj is signaled only when our request/fence is */
>   
> @@ -1212,7 +1212,7 @@ static void test_syncobj_wait(int fd)
>   
>   	gem_quiescent_gpu(fd);
>   
> -	spin = igt_spin_batch_new(fd, 0, 0, 0);
> +	spin = igt_spin_batch_new(fd);
>   
>   	memset(&execbuf, 0, sizeof(execbuf));
>   	execbuf.buffers_ptr = to_user_pointer(&obj);
> @@ -1282,7 +1282,7 @@ static void test_syncobj_export(int fd)
>   		.handle = syncobj_create(fd),
>   	};
>   	int export[2];
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd);
>   
>   	/* Check that if we export the syncobj prior to use it picks up
>   	 * the later fence. This allows a syncobj to establish a channel
> @@ -1340,7 +1340,7 @@ static void test_syncobj_repeat(int fd)
>   	struct drm_i915_gem_execbuffer2 execbuf;
>   	struct local_gem_exec_fence *fence;
>   	int export;
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd);
>   
>   	/* Check that we can wait on the same fence multiple times */
>   	fence = calloc(nfences, sizeof(*fence));
> @@ -1395,7 +1395,7 @@ static void test_syncobj_import(int fd)
>   	const uint32_t bbe = MI_BATCH_BUFFER_END;
>   	struct drm_i915_gem_exec_object2 obj;
>   	struct drm_i915_gem_execbuffer2 execbuf;
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd);
>   	uint32_t sync = syncobj_create(fd);
>   	int fence;
>   
> diff --git a/tests/gem_exec_latency.c b/tests/gem_exec_latency.c
> index ea2e4c681..75811f325 100644
> --- a/tests/gem_exec_latency.c
> +++ b/tests/gem_exec_latency.c
> @@ -63,6 +63,10 @@ static unsigned int ring_size;
>   static void
>   poll_ring(int fd, unsigned ring, const char *name)
>   {
> +	const struct igt_spin_factory opts = {
> +		.engine = ring,
> +		.flags = IGT_SPIN_POLL_RUN,
> +	};
>   	struct timespec tv = {};
>   	unsigned long cycles;
>   	igt_spin_t *spin[2];
> @@ -72,11 +76,11 @@ poll_ring(int fd, unsigned ring, const char *name)
>   	gem_require_ring(fd, ring);
>   	igt_require(gem_can_store_dword(fd, ring));
>   
> -	spin[0] = __igt_spin_batch_new_poll(fd, 0, ring);
> +	spin[0] = __igt_spin_batch_factory(fd, &opts);
>   	igt_assert(spin[0]->running);
>   	cmd = *spin[0]->batch;
>   
> -	spin[1] = __igt_spin_batch_new_poll(fd, 0, ring);
> +	spin[1] = __igt_spin_batch_factory(fd, &opts);
>   	igt_assert(spin[1]->running);
>   	igt_assert(cmd == *spin[1]->batch);
>   
> @@ -312,7 +316,9 @@ static void latency_from_ring(int fd,
>   			       I915_GEM_DOMAIN_GTT);
>   
>   		if (flags & PREEMPT)
> -			spin = __igt_spin_batch_new(fd, ctx[0], ring, 0);
> +			spin = __igt_spin_batch_new(fd,
> +						    .ctx = ctx[0],
> +						    .engine = ring);
>   
>   		if (flags & CORK) {
>   			obj[0].handle = igt_cork_plug(&c, fd);
> @@ -456,6 +462,10 @@ rthog_latency_on_ring(int fd, unsigned int engine, const char *name, unsigned in
>   	};
>   #define NPASS ARRAY_SIZE(passname)
>   #define MMAP_SZ (64 << 10)
> +	const struct igt_spin_factory opts = {
> +		.engine = engine,
> +		.flags = IGT_SPIN_POLL_RUN,
> +	};
>   	struct rt_pkt *results;
>   	unsigned int engines[16];
>   	const char *names[16];
> @@ -513,7 +523,7 @@ rthog_latency_on_ring(int fd, unsigned int engine, const char *name, unsigned in
>   
>   			usleep(250);
>   
> -			spin = __igt_spin_batch_new_poll(fd, 0, engine);
> +			spin = __igt_spin_batch_factory(fd, &opts);
>   			if (!spin) {
>   				igt_warn("Failed to create spinner! (%s)\n",
>   					 passname[pass]);
> diff --git a/tests/gem_exec_nop.c b/tests/gem_exec_nop.c
> index 0523b1c02..74d27522d 100644
> --- a/tests/gem_exec_nop.c
> +++ b/tests/gem_exec_nop.c
> @@ -709,7 +709,9 @@ static void preempt(int fd, uint32_t handle,
>   	clock_gettime(CLOCK_MONOTONIC, &start);
>   	do {
>   		igt_spin_t *spin =
> -			__igt_spin_batch_new(fd, ctx[0], ring_id, 0);
> +			__igt_spin_batch_new(fd,
> +					     .ctx = ctx[0],
> +					     .engine = ring_id);
>   
>   		for (int loop = 0; loop < 1024; loop++)
>   			gem_execbuf(fd, &execbuf);
> diff --git a/tests/gem_exec_reloc.c b/tests/gem_exec_reloc.c
> index 91c6691af..837f60a6c 100644
> --- a/tests/gem_exec_reloc.c
> +++ b/tests/gem_exec_reloc.c
> @@ -388,7 +388,9 @@ static void basic_reloc(int fd, unsigned before, unsigned after, unsigned flags)
>   		}
>   
>   		if (flags & ACTIVE) {
> -			spin = igt_spin_batch_new(fd, 0, I915_EXEC_DEFAULT, obj.handle);
> +			spin = igt_spin_batch_new(fd,
> +						  .engine = I915_EXEC_DEFAULT,
> +						  .dependency = obj.handle);
>   			if (!(flags & HANG))
>   				igt_spin_batch_set_timeout(spin, NSEC_PER_SEC/100);
>   			igt_assert(gem_bo_busy(fd, obj.handle));
> @@ -454,7 +456,9 @@ static void basic_reloc(int fd, unsigned before, unsigned after, unsigned flags)
>   		}
>   
>   		if (flags & ACTIVE) {
> -			spin = igt_spin_batch_new(fd, 0, I915_EXEC_DEFAULT, obj.handle);
> +			spin = igt_spin_batch_new(fd,
> +						  .engine = I915_EXEC_DEFAULT,
> +						  .dependency = obj.handle);
>   			if (!(flags & HANG))
>   				igt_spin_batch_set_timeout(spin, NSEC_PER_SEC/100);
>   			igt_assert(gem_bo_busy(fd, obj.handle));
> @@ -581,7 +585,7 @@ static void basic_range(int fd, unsigned flags)
>   	execbuf.buffer_count = n + 1;
>   
>   	if (flags & ACTIVE) {
> -		spin = igt_spin_batch_new(fd, 0, 0, obj[n].handle);
> +		spin = igt_spin_batch_new(fd, .dependency = obj[n].handle);
>   		if (!(flags & HANG))
>   			igt_spin_batch_set_timeout(spin, NSEC_PER_SEC/100);
>   		igt_assert(gem_bo_busy(fd, obj[n].handle));
> diff --git a/tests/gem_exec_schedule.c b/tests/gem_exec_schedule.c
> index 1f43147f7..35a44ab10 100644
> --- a/tests/gem_exec_schedule.c
> +++ b/tests/gem_exec_schedule.c
> @@ -132,9 +132,12 @@ static void unplug_show_queue(int fd, struct igt_cork *c, unsigned int engine)
>   	igt_spin_t *spin[MAX_ELSP_QLEN];
>   
>   	for (int n = 0; n < ARRAY_SIZE(spin); n++) {
> -		uint32_t ctx = create_highest_priority(fd);
> -		spin[n] = __igt_spin_batch_new(fd, ctx, engine, 0);
> -		gem_context_destroy(fd, ctx);
> +		const struct igt_spin_factory opts = {
> +			.ctx = create_highest_priority(fd),
> +			.engine = engine,
> +		};
> +		spin[n] = __igt_spin_batch_factory(fd, &opts);
> +		gem_context_destroy(fd, opts.ctx);
>   	}
>   
>   	igt_cork_unplug(c); /* batches will now be queued on the engine */
> @@ -196,7 +199,7 @@ static void independent(int fd, unsigned int engine)
>   			continue;
>   
>   		if (spin == NULL) {
> -			spin = __igt_spin_batch_new(fd, 0, other, 0);
> +			spin = __igt_spin_batch_new(fd, .engine = other);
>   		} else {
>   			struct drm_i915_gem_exec_object2 obj = {
>   				.handle = spin->handle,
> @@ -428,7 +431,9 @@ static void preempt(int fd, unsigned ring, unsigned flags)
>   			ctx[LO] = gem_context_create(fd);
>   			gem_context_set_priority(fd, ctx[LO], MIN_PRIO);
>   		}
> -		spin[n] = __igt_spin_batch_new(fd, ctx[LO], ring, 0);
> +		spin[n] = __igt_spin_batch_new(fd,
> +					       .ctx = ctx[LO],
> +					       .engine = ring);
>   		igt_debug("spin[%d].handle=%d\n", n, spin[n]->handle);
>   
>   		store_dword(fd, ctx[HI], ring, result, 0, n + 1, 0, I915_GEM_DOMAIN_RENDER);
> @@ -462,7 +467,9 @@ static igt_spin_t *__noise(int fd, uint32_t ctx, int prio, igt_spin_t *spin)
>   
>   	for_each_physical_engine(fd, other) {
>   		if (spin == NULL) {
> -			spin = __igt_spin_batch_new(fd, ctx, other, 0);
> +			spin = __igt_spin_batch_new(fd,
> +						    .ctx = ctx,
> +						    .engine = other);
>   		} else {
>   			struct drm_i915_gem_exec_object2 obj = {
>   				.handle = spin->handle,
> @@ -672,7 +679,9 @@ static void preempt_self(int fd, unsigned ring)
>   	n = 0;
>   	gem_context_set_priority(fd, ctx[HI], MIN_PRIO);
>   	for_each_physical_engine(fd, other) {
> -		spin[n] = __igt_spin_batch_new(fd, ctx[NOISE], other, 0);
> +		spin[n] = __igt_spin_batch_new(fd,
> +					       .ctx = ctx[NOISE],
> +					       .engine = other);
>   		store_dword(fd, ctx[HI], other,
>   			    result, (n + 1)*sizeof(uint32_t), n + 1,
>   			    0, I915_GEM_DOMAIN_RENDER);
> @@ -714,7 +723,9 @@ static void preemptive_hang(int fd, unsigned ring)
>   		ctx[LO] = gem_context_create(fd);
>   		gem_context_set_priority(fd, ctx[LO], MIN_PRIO);
>   
> -		spin[n] = __igt_spin_batch_new(fd, ctx[LO], ring, 0);
> +		spin[n] = __igt_spin_batch_new(fd,
> +					       .ctx = ctx[LO],
> +					       .engine = ring);
>   
>   		gem_context_destroy(fd, ctx[LO]);
>   	}
> diff --git a/tests/gem_exec_suspend.c b/tests/gem_exec_suspend.c
> index db2bca262..43c52d105 100644
> --- a/tests/gem_exec_suspend.c
> +++ b/tests/gem_exec_suspend.c
> @@ -189,7 +189,7 @@ static void run_test(int fd, unsigned engine, unsigned flags)
>   	}
>   
>   	if (flags & HANG)
> -		spin = igt_spin_batch_new(fd, 0, engine, 0);
> +		spin = igt_spin_batch_new(fd, .engine = engine);
>   
>   	switch (mode(flags)) {
>   	case NOSLEEP:
> diff --git a/tests/gem_fenced_exec_thrash.c b/tests/gem_fenced_exec_thrash.c
> index 385790ada..7248d310d 100644
> --- a/tests/gem_fenced_exec_thrash.c
> +++ b/tests/gem_fenced_exec_thrash.c
> @@ -132,7 +132,7 @@ static void run_test(int fd, int num_fences, int expected_errno,
>   			igt_spin_t *spin = NULL;
>   
>   			if (flags & BUSY_LOAD)
> -				spin = __igt_spin_batch_new(fd, 0, 0, 0);
> +				spin = __igt_spin_batch_new(fd);
>   
>   			igt_while_interruptible(flags & INTERRUPTIBLE) {
>   				igt_assert_eq(__gem_execbuf(fd, &execbuf[i]),
> diff --git a/tests/gem_shrink.c b/tests/gem_shrink.c
> index 3d33453aa..929e0426a 100644
> --- a/tests/gem_shrink.c
> +++ b/tests/gem_shrink.c
> @@ -346,9 +346,9 @@ static void reclaim(unsigned engine, int timeout)
>   		} while (!*shared);
>   	}
>   
> -	spin = igt_spin_batch_new(fd, 0, engine, 0);
> +	spin = igt_spin_batch_new(fd, .engine = engine);
>   	igt_until_timeout(timeout) {
> -		igt_spin_t *next = __igt_spin_batch_new(fd, 0, engine, 0);
> +		igt_spin_t *next = __igt_spin_batch_new(fd, .engine = engine);
>   
>   		igt_spin_batch_set_timeout(spin, timeout_100ms);
>   		gem_sync(fd, spin->handle);
> diff --git a/tests/gem_spin_batch.c b/tests/gem_spin_batch.c
> index cffeb6d71..52410010b 100644
> --- a/tests/gem_spin_batch.c
> +++ b/tests/gem_spin_batch.c
> @@ -41,9 +41,9 @@ static void spin(int fd, unsigned int engine, unsigned int timeout_sec)
>   	struct timespec itv = { };
>   	uint64_t elapsed;
>   
> -	spin = __igt_spin_batch_new(fd, 0, engine, 0);
> +	spin = __igt_spin_batch_new(fd, .engine = engine);
>   	while ((elapsed = igt_nsec_elapsed(&tv)) >> 30 < timeout_sec) {
> -		igt_spin_t *next = __igt_spin_batch_new(fd, 0, engine, 0);
> +		igt_spin_t *next = __igt_spin_batch_new(fd, .engine = engine);
>   
>   		igt_spin_batch_set_timeout(spin,
>   					   timeout_100ms - igt_nsec_elapsed(&itv));
> diff --git a/tests/gem_sync.c b/tests/gem_sync.c
> index 1e2e089a1..2fcb9aa01 100644
> --- a/tests/gem_sync.c
> +++ b/tests/gem_sync.c
> @@ -715,9 +715,8 @@ preempt(int fd, unsigned ring, int num_children, int timeout)
>   		do {
>   			igt_spin_t *spin =
>   				__igt_spin_batch_new(fd,
> -						     ctx[0],
> -						     execbuf.flags,
> -						     0);
> +						     .ctx = ctx[0],
> +						     .engine = execbuf.flags);
>   
>   			do {
>   				gem_execbuf(fd, &execbuf);
> diff --git a/tests/gem_wait.c b/tests/gem_wait.c
> index 61d8a4059..7914c9365 100644
> --- a/tests/gem_wait.c
> +++ b/tests/gem_wait.c
> @@ -74,7 +74,9 @@ static void basic(int fd, unsigned engine, unsigned flags)
>   	IGT_CORK_HANDLE(cork);
>   	uint32_t plug =
>   		flags & (WRITE | AWAIT) ? igt_cork_plug(&cork, fd) : 0;
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, engine, plug);
> +	igt_spin_t *spin = igt_spin_batch_new(fd,
> +					      .engine = engine,
> +					      .dependency = plug);
>   	struct drm_i915_gem_wait wait = {
>   		flags & WRITE ? plug : spin->handle
>   	};
> diff --git a/tests/kms_busy.c b/tests/kms_busy.c
> index 4a4e0e156..abf39828b 100644
> --- a/tests/kms_busy.c
> +++ b/tests/kms_busy.c
> @@ -84,7 +84,8 @@ static void flip_to_fb(igt_display_t *dpy, int pipe,
>   	struct drm_event_vblank ev;
>   
>   	igt_spin_t *t = igt_spin_batch_new(dpy->drm_fd,
> -					   0, ring, fb->gem_handle);
> +					   .engine = ring,
> +					   .dependency = fb->gem_handle);
>   
>   	if (modeset) {
>   		/*
> @@ -200,7 +201,8 @@ static void test_atomic_commit_hang(igt_display_t *dpy, igt_plane_t *primary,
>   				    struct igt_fb *busy_fb, unsigned ring)
>   {
>   	igt_spin_t *t = igt_spin_batch_new(dpy->drm_fd,
> -					   0, ring, busy_fb->gem_handle);
> +					   .engine = ring,
> +					   .dependency = busy_fb->gem_handle);
>   	struct pollfd pfd = { .fd = dpy->drm_fd, .events = POLLIN };
>   	unsigned flags = 0;
>   	struct drm_event_vblank ev;
> @@ -287,7 +289,9 @@ static void test_pageflip_modeset_hang(igt_display_t *dpy,
>   
>   	igt_display_commit2(dpy, dpy->is_atomic ? COMMIT_ATOMIC : COMMIT_LEGACY);
>   
> -	t = igt_spin_batch_new(dpy->drm_fd, 0, ring, fb.gem_handle);
> +	t = igt_spin_batch_new(dpy->drm_fd,
> +			       .engine = ring,
> +			       .dependency = fb.gem_handle);
>   
>   	do_or_die(drmModePageFlip(dpy->drm_fd, dpy->pipes[pipe].crtc_id, fb.fb_id, DRM_MODE_PAGE_FLIP_EVENT, &fb));
>   
> diff --git a/tests/kms_cursor_legacy.c b/tests/kms_cursor_legacy.c
> index d0a28b3c4..85340d43e 100644
> --- a/tests/kms_cursor_legacy.c
> +++ b/tests/kms_cursor_legacy.c
> @@ -532,7 +532,8 @@ static void basic_flip_cursor(igt_display_t *display,
>   
>   		spin = NULL;
>   		if (flags & BASIC_BUSY)
> -			spin = igt_spin_batch_new(display->drm_fd, 0, 0, fb_info.gem_handle);
> +			spin = igt_spin_batch_new(display->drm_fd,
> +						  .dependency = fb_info.gem_handle);
>   
>   		/* Start with a synchronous query to align with the vblank */
>   		vblank_start = get_vblank(display->drm_fd, pipe, DRM_VBLANK_NEXTONMISS);
> @@ -1323,8 +1324,8 @@ static void flip_vs_cursor_busy_crc(igt_display_t *display, bool atomic)
>   	for (int i = 1; i >= 0; i--) {
>   		igt_spin_t *spin;
>   
> -		spin = igt_spin_batch_new(display->drm_fd, 0, 0,
> -					  fb_info[1].gem_handle);
> +		spin = igt_spin_batch_new(display->drm_fd,
> +					  .dependency = fb_info[1].gem_handle);
>   
>   		vblank_start = get_vblank(display->drm_fd, pipe, DRM_VBLANK_NEXTONMISS);
>   
> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> index 4570f926d..a1d36ac4f 100644
> --- a/tests/perf_pmu.c
> +++ b/tests/perf_pmu.c
> @@ -172,10 +172,15 @@ static unsigned int e2ring(int gem_fd, const struct intel_execution_engine2 *e)
>   
>   static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
>   {
> +	struct igt_spin_factory opts = {
> +		.ctx = ctx,
> +		.engine = flags,
> +	};
> +
>   	if (gem_can_store_dword(fd, flags))
> -		return __igt_spin_batch_new_poll(fd, ctx, flags);
> -	else
> -		return __igt_spin_batch_new(fd, ctx, flags, 0);
> +		opts.flags |= IGT_SPIN_POLL_RUN;
> +
> +	return __igt_spin_batch_factory(fd, &opts);
>   }
>   
>   static unsigned long __spin_wait(int fd, igt_spin_t *spin)
> @@ -356,7 +361,9 @@ busy_double_start(int gem_fd, const struct intel_execution_engine2 *e)
>   	 */
>   	spin[0] = __spin_sync(gem_fd, 0, e2ring(gem_fd, e));
>   	usleep(500e3);
> -	spin[1] = __igt_spin_batch_new(gem_fd, ctx, e2ring(gem_fd, e), 0);
> +	spin[1] = __igt_spin_batch_new(gem_fd,
> +				       .ctx = ctx,
> +				       .engine = e2ring(gem_fd, e));
>   
>   	/*
>   	 * Open PMU as fast as possible after the second spin batch in attempt
> @@ -1045,8 +1052,8 @@ static void cpu_hotplug(int gem_fd)
>   	 * Create two spinners so test can ensure shorter gaps in engine
>   	 * busyness as it is terminating one and re-starting the other.
>   	 */
> -	spin[0] = igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER, 0);
> -	spin[1] = __igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER, 0);
> +	spin[0] = igt_spin_batch_new(gem_fd, .engine = I915_EXEC_RENDER);
> +	spin[1] = __igt_spin_batch_new(gem_fd, .engine = I915_EXEC_RENDER);
>   
>   	val = __pmu_read_single(fd, &ts[0]);
>   
> @@ -1129,8 +1136,8 @@ static void cpu_hotplug(int gem_fd)
>   			break;
>   
>   		igt_spin_batch_free(gem_fd, spin[cur]);
> -		spin[cur] = __igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER,
> -						 0);
> +		spin[cur] = __igt_spin_batch_new(gem_fd,
> +						 .engine = I915_EXEC_RENDER);
>   		cur ^= 1;
>   	}
>   
> @@ -1167,8 +1174,9 @@ test_interrupts(int gem_fd)
>   
>   	/* Queue spinning batches. */
>   	for (int i = 0; i < target; i++) {
> -		spin[i] = __igt_spin_batch_new_fence(gem_fd,
> -						     0, I915_EXEC_RENDER);
> +		spin[i] = __igt_spin_batch_new(gem_fd,
> +					       .engine = I915_EXEC_RENDER,
> +					       .flags = IGT_SPIN_FENCE_OUT);
>   		if (i == 0) {
>   			fence_fd = spin[i]->out_fence;
>   		} else {
> @@ -1229,7 +1237,8 @@ test_interrupts_sync(int gem_fd)
>   
>   	/* Queue spinning batches. */
>   	for (int i = 0; i < target; i++)
> -		spin[i] = __igt_spin_batch_new_fence(gem_fd, 0, 0);
> +		spin[i] = __igt_spin_batch_new(gem_fd,
> +					       .flags = IGT_SPIN_FENCE_OUT);
>   
>   	/* Wait for idle state. */
>   	idle = pmu_read_single(fd);
> @@ -1550,7 +1559,7 @@ accuracy(int gem_fd, const struct intel_execution_engine2 *e,
>   		igt_spin_t *spin;
>   
>   		/* Allocate our spin batch and idle it. */
> -		spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
> +		spin = igt_spin_batch_new(gem_fd, .engine = e2ring(gem_fd, e));
>   		igt_spin_batch_end(spin);
>   		gem_sync(gem_fd, spin->handle);
>   
> diff --git a/tests/pm_rps.c b/tests/pm_rps.c
> index 006d084b8..202132b1c 100644
> --- a/tests/pm_rps.c
> +++ b/tests/pm_rps.c
> @@ -235,9 +235,9 @@ static void load_helper_run(enum load load)
>   
>   		igt_debug("Applying %s load...\n", lh.load ? "high" : "low");
>   
> -		spin[0] = __igt_spin_batch_new(drm_fd, 0, 0, 0);
> +		spin[0] = __igt_spin_batch_new(drm_fd);
>   		if (lh.load == HIGH)
> -			spin[1] = __igt_spin_batch_new(drm_fd, 0, 0, 0);
> +			spin[1] = __igt_spin_batch_new(drm_fd);
>   		while (!lh.exit) {
>   			handle = spin[0]->handle;
>   			igt_spin_batch_end(spin[0]);
> @@ -248,8 +248,7 @@ static void load_helper_run(enum load load)
>   			usleep(100);
>   
>   			spin[0] = spin[1];
> -			spin[lh.load == HIGH] =
> -				__igt_spin_batch_new(drm_fd, 0, 0, 0);
> +			spin[lh.load == HIGH] = __igt_spin_batch_new(drm_fd);
>   		}
>   
>   		handle = spin[0]->handle;
> @@ -510,7 +509,7 @@ static void boost_freq(int fd, int *boost_freqs)
>   	int64_t timeout = 1;
>   	igt_spin_t *load;
>   
> -	load = igt_spin_batch_new(fd, 0, 0, 0);
> +	load = igt_spin_batch_new(fd);
>   	resubmit_batch(fd, load->handle, 16);
>   
>   	/* Waiting will grant us a boost to maximum */
> 

Neat trick! :)

Conversion looks correct.

Two nitpicks are:

1. It's not really a factory, but I guess we can give it an unusual name 
in our world so it sticks out as unusual.

2. We could simplify by avoiding macro tricks by just passing in { 
.field = value } by value.

I don't actually mind the trick, but since it is a novel concept in IGT 
codebase (AFAIR), then please run it past Petri at least.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 04/17] lib: Convert spin batch constructor to a factory
@ 2018-07-02 15:34     ` Tvrtko Ursulin
  0 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-02 15:34 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx, Tvrtko Ursulin


On 02/07/2018 10:07, Chris Wilson wrote:
> In order to make adding more options easier, expose the full set of
> options to the caller.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   lib/igt_dummyload.c            | 147 +++++++++------------------------
>   lib/igt_dummyload.h            |  42 +++++-----
>   tests/drv_missed_irq.c         |   2 +-
>   tests/gem_busy.c               |  17 ++--
>   tests/gem_ctx_isolation.c      |  26 +++---
>   tests/gem_eio.c                |  13 ++-
>   tests/gem_exec_fence.c         |  16 ++--
>   tests/gem_exec_latency.c       |  18 +++-
>   tests/gem_exec_nop.c           |   4 +-
>   tests/gem_exec_reloc.c         |  10 ++-
>   tests/gem_exec_schedule.c      |  27 ++++--
>   tests/gem_exec_suspend.c       |   2 +-
>   tests/gem_fenced_exec_thrash.c |   2 +-
>   tests/gem_shrink.c             |   4 +-
>   tests/gem_spin_batch.c         |   4 +-
>   tests/gem_sync.c               |   5 +-
>   tests/gem_wait.c               |   4 +-
>   tests/kms_busy.c               |  10 ++-
>   tests/kms_cursor_legacy.c      |   7 +-
>   tests/perf_pmu.c               |  33 +++++---
>   tests/pm_rps.c                 |   9 +-
>   21 files changed, 189 insertions(+), 213 deletions(-)
> 
> diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
> index 3809b4e61..94efdf745 100644
> --- a/lib/igt_dummyload.c
> +++ b/lib/igt_dummyload.c
> @@ -75,12 +75,9 @@ fill_reloc(struct drm_i915_gem_relocation_entry *reloc,
>   	reloc->write_domain = write_domains;
>   }
>   
> -#define OUT_FENCE	(1 << 0)
> -#define POLL_RUN	(1 << 1)
> -
>   static int
> -emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
> -		     uint32_t dep, unsigned int flags)
> +emit_recursive_batch(igt_spin_t *spin,
> +		     int fd, const struct igt_spin_factory *opts)
>   {
>   #define SCRATCH 0
>   #define BATCH 1
> @@ -95,21 +92,18 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
>   	int i;
>   
>   	nengine = 0;
> -	if (engine == ALL_ENGINES) {
> -		for_each_engine(fd, engine) {
> -			if (engine) {
> -			if (flags & POLL_RUN)
> -				igt_require(!(flags & POLL_RUN) ||
> -					    gem_can_store_dword(fd, engine));
> -
> -				engines[nengine++] = engine;
> -			}
> +	if (opts->engine == ALL_ENGINES) {
> +		unsigned int engine;
> +
> +		for_each_physical_engine(fd, engine) {
> +			if (opts->flags & IGT_SPIN_POLL_RUN &&
> +			    !gem_can_store_dword(fd, engine))
> +				continue;
> +
> +			engines[nengine++] = engine;
>   		}
>   	} else {
> -		gem_require_ring(fd, engine);
> -		igt_require(!(flags & POLL_RUN) ||
> -			    gem_can_store_dword(fd, engine));
> -		engines[nengine++] = engine;
> +		engines[nengine++] = opts->engine;
>   	}
>   	igt_require(nengine);
>   
> @@ -130,20 +124,20 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
>   	execbuf->buffer_count++;
>   	batch_start = batch;
>   
> -	if (dep) {
> -		igt_assert(!(flags & POLL_RUN));
> +	if (opts->dependency) {
> +		igt_assert(!(opts->flags & IGT_SPIN_POLL_RUN));
>   
>   		/* dummy write to dependency */
> -		obj[SCRATCH].handle = dep;
> +		obj[SCRATCH].handle = opts->dependency;
>   		fill_reloc(&relocs[obj[BATCH].relocation_count++],
> -			   dep, 1020,
> +			   opts->dependency, 1020,
>   			   I915_GEM_DOMAIN_RENDER,
>   			   I915_GEM_DOMAIN_RENDER);
>   		execbuf->buffer_count++;
> -	} else if (flags & POLL_RUN) {
> +	} else if (opts->flags & IGT_SPIN_POLL_RUN) {
>   		unsigned int offset;
>   
> -		igt_assert(!dep);
> +		igt_assert(!opts->dependency);
>   
>   		if (gen == 4 || gen == 5) {
>   			execbuf->flags |= I915_EXEC_SECURE;
> @@ -231,9 +225,9 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
>   
>   	execbuf->buffers_ptr = to_user_pointer(obj +
>   					       (2 - execbuf->buffer_count));
> -	execbuf->rsvd1 = ctx;
> +	execbuf->rsvd1 = opts->ctx;
>   
> -	if (flags & OUT_FENCE)
> +	if (opts->flags & IGT_SPIN_FENCE_OUT)
>   		execbuf->flags |= I915_EXEC_FENCE_OUT;
>   
>   	for (i = 0; i < nengine; i++) {
> @@ -242,7 +236,7 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
>   
>   		gem_execbuf_wr(fd, execbuf);
>   
> -		if (flags & OUT_FENCE) {
> +		if (opts->flags & IGT_SPIN_FENCE_OUT) {
>   			int _fd = execbuf->rsvd2 >> 32;
>   
>   			igt_assert(_fd >= 0);
> @@ -271,16 +265,14 @@ emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
>   }
>   
>   static igt_spin_t *
> -___igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep,
> -		      unsigned int flags)
> +spin_batch_create(int fd, const struct igt_spin_factory *opts)
>   {
>   	igt_spin_t *spin;
>   
>   	spin = calloc(1, sizeof(struct igt_spin));
>   	igt_assert(spin);
>   
> -	spin->out_fence = emit_recursive_batch(spin, fd, ctx, engine, dep,
> -					       flags);
> +	spin->out_fence = emit_recursive_batch(spin, fd, opts);
>   
>   	pthread_mutex_lock(&list_lock);
>   	igt_list_add(&spin->link, &spin_list);
> @@ -290,18 +282,15 @@ ___igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep,
>   }
>   
>   igt_spin_t *
> -__igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep)
> +__igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts)
>   {
> -	return ___igt_spin_batch_new(fd, ctx, engine, dep, 0);
> +	return spin_batch_create(fd, opts);
>   }
>   
>   /**
> - * igt_spin_batch_new:
> + * igt_spin_batch_factory:
>    * @fd: open i915 drm file descriptor
> - * @engine: Ring to execute batch OR'd with execbuf flags. If value is less
> - *          than 0, execute on all available rings.
> - * @dep: handle to a buffer object dependency. If greater than 0, add a
> - *              relocation entry to this buffer within the batch.
> + * @opts: controlling options such as context, engine, dependencies etc
>    *
>    * Start a recursive batch on a ring. Immediately returns a #igt_spin_t that
>    * contains the batch's handle that can be waited upon. The returned structure
> @@ -311,86 +300,26 @@ __igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep)
>    * Structure with helper internal state for igt_spin_batch_free().
>    */
>   igt_spin_t *
> -igt_spin_batch_new(int fd, uint32_t ctx, unsigned engine, uint32_t dep)
> +igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts)
>   {
>   	igt_spin_t *spin;
>   
>   	igt_require_gem(fd);
>   
> -	spin = __igt_spin_batch_new(fd, ctx, engine, dep);
> -	igt_assert(gem_bo_busy(fd, spin->handle));
> -
> -	return spin;
> -}
> -
> -igt_spin_t *
> -__igt_spin_batch_new_fence(int fd, uint32_t ctx, unsigned engine)
> -{
> -	return ___igt_spin_batch_new(fd, ctx, engine, 0, OUT_FENCE);
> -}
> +	if (opts->engine != ALL_ENGINES) {
> +		gem_require_ring(fd, opts->engine);
> +		if (opts->flags & IGT_SPIN_POLL_RUN)
> +			igt_require(gem_can_store_dword(fd, opts->engine));
> +	}
>   
> -/**
> - * igt_spin_batch_new_fence:
> - * @fd: open i915 drm file descriptor
> - * @engine: Ring to execute batch OR'd with execbuf flags. If value is less
> - *          than 0, execute on all available rings.
> - *
> - * Start a recursive batch on a ring. Immediately returns a #igt_spin_t that
> - * contains the batch's handle that can be waited upon. The returned structure
> - * must be passed to igt_spin_batch_free() for post-processing.
> - *
> - * igt_spin_t will contain an output fence associtated with this batch.
> - *
> - * Returns:
> - * Structure with helper internal state for igt_spin_batch_free().
> - */
> -igt_spin_t *
> -igt_spin_batch_new_fence(int fd, uint32_t ctx, unsigned engine)
> -{
> -	igt_spin_t *spin;
> +	spin = spin_batch_create(fd, opts);
>   
> -	igt_require_gem(fd);
> -	igt_require(gem_has_exec_fence(fd));
> -
> -	spin = __igt_spin_batch_new_fence(fd, ctx, engine);
>   	igt_assert(gem_bo_busy(fd, spin->handle));
> -	igt_assert(poll(&(struct pollfd){spin->out_fence, POLLIN}, 1, 0) == 0);
> -
> -	return spin;
> -}
> -
> -igt_spin_t *
> -__igt_spin_batch_new_poll(int fd, uint32_t ctx, unsigned engine)
> -{
> -	return ___igt_spin_batch_new(fd, ctx, engine, 0, POLL_RUN);
> -}
> +	if (opts->flags & IGT_SPIN_FENCE_OUT) {
> +		struct pollfd pfd = { spin->out_fence, POLLIN };
>   
> -/**
> - * igt_spin_batch_new_poll:
> - * @fd: open i915 drm file descriptor
> - * @engine: Ring to execute batch OR'd with execbuf flags. If value is less
> - *          than 0, execute on all available rings.
> - *
> - * Start a recursive batch on a ring. Immediately returns a #igt_spin_t that
> - * contains the batch's handle that can be waited upon. The returned structure
> - * must be passed to igt_spin_batch_free() for post-processing.
> - *
> - * igt_spin_t->running will containt a pointer which target will change from
> - * zero to one once the spinner actually starts executing on the GPU.
> - *
> - * Returns:
> - * Structure with helper internal state for igt_spin_batch_free().
> - */
> -igt_spin_t *
> -igt_spin_batch_new_poll(int fd, uint32_t ctx, unsigned engine)
> -{
> -	igt_spin_t *spin;
> -
> -	igt_require_gem(fd);
> -	igt_require(gem_mmap__has_wc(fd));
> -
> -	spin = __igt_spin_batch_new_poll(fd, ctx, engine);
> -	igt_assert(gem_bo_busy(fd, spin->handle));
> +		igt_assert(poll(&pfd, 1, 0) == 0);
> +	}
>   
>   	return spin;
>   }
> diff --git a/lib/igt_dummyload.h b/lib/igt_dummyload.h
> index c6ccc2936..c794f2544 100644
> --- a/lib/igt_dummyload.h
> +++ b/lib/igt_dummyload.h
> @@ -43,29 +43,25 @@ typedef struct igt_spin {
>   	bool *running;
>   } igt_spin_t;
>   
> -igt_spin_t *__igt_spin_batch_new(int fd,
> -				 uint32_t ctx,
> -				 unsigned engine,
> -				 uint32_t  dep);
> -igt_spin_t *igt_spin_batch_new(int fd,
> -			       uint32_t ctx,
> -			       unsigned engine,
> -			       uint32_t  dep);
> -
> -igt_spin_t *__igt_spin_batch_new_fence(int fd,
> -				       uint32_t ctx,
> -				       unsigned engine);
> -
> -igt_spin_t *igt_spin_batch_new_fence(int fd,
> -				     uint32_t ctx,
> -				     unsigned engine);
> -
> -igt_spin_t *__igt_spin_batch_new_poll(int fd,
> -				       uint32_t ctx,
> -				       unsigned engine);
> -igt_spin_t *igt_spin_batch_new_poll(int fd,
> -				    uint32_t ctx,
> -				    unsigned engine);
> +struct igt_spin_factory {
> +	uint32_t ctx;
> +	uint32_t dependency;
> +	unsigned int engine;
> +	unsigned int flags;
> +};
> +
> +#define IGT_SPIN_FENCE_OUT (1 << 0)
> +#define IGT_SPIN_POLL_RUN  (1 << 1)
> +
> +igt_spin_t *
> +__igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts);
> +igt_spin_t *
> +igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts);
> +
> +#define __igt_spin_batch_new(fd, ...) \
> +	__igt_spin_batch_factory(fd, &((struct igt_spin_factory){__VA_ARGS__}))
> +#define igt_spin_batch_new(fd, ...) \
> +	igt_spin_batch_factory(fd, &((struct igt_spin_factory){__VA_ARGS__}))
>   
>   void igt_spin_batch_set_timeout(igt_spin_t *spin, int64_t ns);
>   void igt_spin_batch_end(igt_spin_t *spin);
> diff --git a/tests/drv_missed_irq.c b/tests/drv_missed_irq.c
> index 791ee51fb..78690c36a 100644
> --- a/tests/drv_missed_irq.c
> +++ b/tests/drv_missed_irq.c
> @@ -33,7 +33,7 @@ IGT_TEST_DESCRIPTION("Inject missed interrupts and make sure they are caught");
>   
>   static void trigger_missed_interrupt(int fd, unsigned ring)
>   {
> -	igt_spin_t *spin = __igt_spin_batch_new(fd, 0, ring, 0);
> +	igt_spin_t *spin = __igt_spin_batch_new(fd, .engine = ring);
>   	uint32_t go;
>   	int link[2];
>   
> diff --git a/tests/gem_busy.c b/tests/gem_busy.c
> index f564651ba..76b44a5d4 100644
> --- a/tests/gem_busy.c
> +++ b/tests/gem_busy.c
> @@ -114,7 +114,9 @@ static void semaphore(int fd, unsigned ring, uint32_t flags)
>   
>   	/* Create a long running batch which we can use to hog the GPU */
>   	handle[BUSY] = gem_create(fd, 4096);
> -	spin = igt_spin_batch_new(fd, 0, ring, handle[BUSY]);
> +	spin = igt_spin_batch_new(fd,
> +				  .engine = ring,
> +				  .dependency = handle[BUSY]);
>   
>   	/* Queue a batch after the busy, it should block and remain "busy" */
>   	igt_assert(exec_noop(fd, handle, ring | flags, false));
> @@ -363,17 +365,16 @@ static void close_race(int fd)
>   		igt_assert(sched_setscheduler(getpid(), SCHED_RR, &rt) == 0);
>   
>   		for (i = 0; i < nhandles; i++) {
> -			spin[i] = __igt_spin_batch_new(fd, 0,
> -						       engines[rand() % nengine], 0);
> +			spin[i] = __igt_spin_batch_new(fd,
> +						       .engine = engines[rand() % nengine]);
>   			handles[i] = spin[i]->handle;
>   		}
>   
>   		igt_until_timeout(20) {
>   			for (i = 0; i < nhandles; i++) {
>   				igt_spin_batch_free(fd, spin[i]);
> -				spin[i] = __igt_spin_batch_new(fd, 0,
> -							       engines[rand() % nengine],
> -							       0);
> +				spin[i] = __igt_spin_batch_new(fd,
> +							       .engine = engines[rand() % nengine]);
>   				handles[i] = spin[i]->handle;
>   				__sync_synchronize();
>   			}
> @@ -415,7 +416,7 @@ static bool has_semaphores(int fd)
>   
>   static bool has_extended_busy_ioctl(int fd)
>   {
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, I915_EXEC_RENDER, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd, .engine = I915_EXEC_RENDER);
>   	uint32_t read, write;
>   
>   	__gem_busy(fd, spin->handle, &read, &write);
> @@ -426,7 +427,7 @@ static bool has_extended_busy_ioctl(int fd)
>   
>   static void basic(int fd, unsigned ring, unsigned flags)
>   {
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, ring, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd, .engine = ring);
>   	struct timespec tv;
>   	int timeout;
>   	bool busy;
> diff --git a/tests/gem_ctx_isolation.c b/tests/gem_ctx_isolation.c
> index fe7d3490c..2e19e8c03 100644
> --- a/tests/gem_ctx_isolation.c
> +++ b/tests/gem_ctx_isolation.c
> @@ -502,7 +502,7 @@ static void isolation(int fd,
>   		ctx[0] = gem_context_create(fd);
>   		regs[0] = read_regs(fd, ctx[0], e, flags);
>   
> -		spin = igt_spin_batch_new(fd, ctx[0], engine, 0);
> +		spin = igt_spin_batch_new(fd, .ctx = ctx[0], .engine = engine);
>   
>   		if (flags & DIRTY1) {
>   			igt_debug("%s[%d]: Setting all registers of ctx 0 to 0x%08x\n",
> @@ -557,8 +557,11 @@ static void isolation(int fd,
>   
>   static void inject_reset_context(int fd, unsigned int engine)
>   {
> +	struct igt_spin_factory opts = {
> +		.ctx = gem_context_create(fd),
> +		.engine = engine,
> +	};
>   	igt_spin_t *spin;
> -	uint32_t ctx;
>   
>   	/*
>   	 * Force a context switch before triggering the reset, or else
> @@ -566,19 +569,20 @@ static void inject_reset_context(int fd, unsigned int engine)
>   	 * HW for screwing up if the context was already broken.
>   	 */
>   
> -	ctx = gem_context_create(fd);
> -	if (gem_can_store_dword(fd, engine)) {
> -		spin = __igt_spin_batch_new_poll(fd, ctx, engine);
> +	if (gem_can_store_dword(fd, engine))
> +		opts.flags |= IGT_SPIN_POLL_RUN;
> +
> +	spin = __igt_spin_batch_factory(fd, &opts);
> +
> +	if (spin->running)
>   		igt_spin_busywait_until_running(spin);
> -	} else {
> -		spin = __igt_spin_batch_new(fd, ctx, engine, 0);
> +	else
>   		usleep(1000); /* better than nothing */
> -	}
>   
>   	igt_force_gpu_reset(fd);
>   
>   	igt_spin_batch_free(fd, spin);
> -	gem_context_destroy(fd, ctx);
> +	gem_context_destroy(fd, opts.ctx);
>   }
>   
>   static void preservation(int fd,
> @@ -604,7 +608,7 @@ static void preservation(int fd,
>   	gem_quiescent_gpu(fd);
>   
>   	ctx[num_values] = gem_context_create(fd);
> -	spin = igt_spin_batch_new(fd, ctx[num_values], engine, 0);
> +	spin = igt_spin_batch_new(fd, .ctx = ctx[num_values], .engine = engine);
>   	regs[num_values][0] = read_regs(fd, ctx[num_values], e, flags);
>   	for (int v = 0; v < num_values; v++) {
>   		ctx[v] = gem_context_create(fd);
> @@ -644,7 +648,7 @@ static void preservation(int fd,
>   		break;
>   	}
>   
> -	spin = igt_spin_batch_new(fd, ctx[num_values], engine, 0);
> +	spin = igt_spin_batch_new(fd, .ctx = ctx[num_values], .engine = engine);
>   	for (int v = 0; v < num_values; v++)
>   		regs[v][1] = read_regs(fd, ctx[v], e, flags);
>   	regs[num_values][1] = read_regs(fd, ctx[num_values], e, flags);
> diff --git a/tests/gem_eio.c b/tests/gem_eio.c
> index 5faf7502b..0ec1aaec9 100644
> --- a/tests/gem_eio.c
> +++ b/tests/gem_eio.c
> @@ -157,10 +157,15 @@ static int __gem_wait(int fd, uint32_t handle, int64_t timeout)
>   
>   static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
>   {
> -	if (gem_can_store_dword(fd, flags))
> -		return __igt_spin_batch_new_poll(fd, ctx, flags);
> -	else
> -		return __igt_spin_batch_new(fd, ctx, flags, 0);
> +	struct igt_spin_factory opts = {
> +		.ctx = ctx,
> +		.engine = flags,
> +	};
> +
> +	if (gem_can_store_dword(fd, opts.engine))
> +		opts.flags |= IGT_SPIN_POLL_RUN;
> +
> +	return __igt_spin_batch_factory(fd, &opts);
>   }
>   
>   static void __spin_wait(int fd, igt_spin_t *spin)
> diff --git a/tests/gem_exec_fence.c b/tests/gem_exec_fence.c
> index eb93308d1..ba46595d3 100644
> --- a/tests/gem_exec_fence.c
> +++ b/tests/gem_exec_fence.c
> @@ -468,7 +468,7 @@ static void test_parallel(int fd, unsigned int master)
>   	/* Fill the queue with many requests so that the next one has to
>   	 * wait before it can be executed by the hardware.
>   	 */
> -	spin = igt_spin_batch_new(fd, 0, master, plug);
> +	spin = igt_spin_batch_new(fd, .engine = master, .dependency = plug);
>   	resubmit(fd, spin->handle, master, 16);
>   
>   	/* Now queue the master request and its secondaries */
> @@ -651,7 +651,7 @@ static void test_keep_in_fence(int fd, unsigned int engine, unsigned int flags)
>   	igt_spin_t *spin;
>   	int fence;
>   
> -	spin = igt_spin_batch_new(fd, 0, engine, 0);
> +	spin = igt_spin_batch_new(fd, .engine = engine);
>   
>   	gem_execbuf_wr(fd, &execbuf);
>   	fence = upper_32_bits(execbuf.rsvd2);
> @@ -1070,7 +1070,7 @@ static void test_syncobj_unused_fence(int fd)
>   	struct local_gem_exec_fence fence = {
>   		.handle = syncobj_create(fd),
>   	};
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd);
>   
>   	/* sanity check our syncobj_to_sync_file interface */
>   	igt_assert_eq(__syncobj_to_sync_file(fd, 0), -ENOENT);
> @@ -1162,7 +1162,7 @@ static void test_syncobj_signal(int fd)
>   	struct local_gem_exec_fence fence = {
>   		.handle = syncobj_create(fd),
>   	};
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd);
>   
>   	/* Check that the syncobj is signaled only when our request/fence is */
>   
> @@ -1212,7 +1212,7 @@ static void test_syncobj_wait(int fd)
>   
>   	gem_quiescent_gpu(fd);
>   
> -	spin = igt_spin_batch_new(fd, 0, 0, 0);
> +	spin = igt_spin_batch_new(fd);
>   
>   	memset(&execbuf, 0, sizeof(execbuf));
>   	execbuf.buffers_ptr = to_user_pointer(&obj);
> @@ -1282,7 +1282,7 @@ static void test_syncobj_export(int fd)
>   		.handle = syncobj_create(fd),
>   	};
>   	int export[2];
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd);
>   
>   	/* Check that if we export the syncobj prior to use it picks up
>   	 * the later fence. This allows a syncobj to establish a channel
> @@ -1340,7 +1340,7 @@ static void test_syncobj_repeat(int fd)
>   	struct drm_i915_gem_execbuffer2 execbuf;
>   	struct local_gem_exec_fence *fence;
>   	int export;
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd);
>   
>   	/* Check that we can wait on the same fence multiple times */
>   	fence = calloc(nfences, sizeof(*fence));
> @@ -1395,7 +1395,7 @@ static void test_syncobj_import(int fd)
>   	const uint32_t bbe = MI_BATCH_BUFFER_END;
>   	struct drm_i915_gem_exec_object2 obj;
>   	struct drm_i915_gem_execbuffer2 execbuf;
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, 0, 0);
> +	igt_spin_t *spin = igt_spin_batch_new(fd);
>   	uint32_t sync = syncobj_create(fd);
>   	int fence;
>   
> diff --git a/tests/gem_exec_latency.c b/tests/gem_exec_latency.c
> index ea2e4c681..75811f325 100644
> --- a/tests/gem_exec_latency.c
> +++ b/tests/gem_exec_latency.c
> @@ -63,6 +63,10 @@ static unsigned int ring_size;
>   static void
>   poll_ring(int fd, unsigned ring, const char *name)
>   {
> +	const struct igt_spin_factory opts = {
> +		.engine = ring,
> +		.flags = IGT_SPIN_POLL_RUN,
> +	};
>   	struct timespec tv = {};
>   	unsigned long cycles;
>   	igt_spin_t *spin[2];
> @@ -72,11 +76,11 @@ poll_ring(int fd, unsigned ring, const char *name)
>   	gem_require_ring(fd, ring);
>   	igt_require(gem_can_store_dword(fd, ring));
>   
> -	spin[0] = __igt_spin_batch_new_poll(fd, 0, ring);
> +	spin[0] = __igt_spin_batch_factory(fd, &opts);
>   	igt_assert(spin[0]->running);
>   	cmd = *spin[0]->batch;
>   
> -	spin[1] = __igt_spin_batch_new_poll(fd, 0, ring);
> +	spin[1] = __igt_spin_batch_factory(fd, &opts);
>   	igt_assert(spin[1]->running);
>   	igt_assert(cmd == *spin[1]->batch);
>   
> @@ -312,7 +316,9 @@ static void latency_from_ring(int fd,
>   			       I915_GEM_DOMAIN_GTT);
>   
>   		if (flags & PREEMPT)
> -			spin = __igt_spin_batch_new(fd, ctx[0], ring, 0);
> +			spin = __igt_spin_batch_new(fd,
> +						    .ctx = ctx[0],
> +						    .engine = ring);
>   
>   		if (flags & CORK) {
>   			obj[0].handle = igt_cork_plug(&c, fd);
> @@ -456,6 +462,10 @@ rthog_latency_on_ring(int fd, unsigned int engine, const char *name, unsigned in
>   	};
>   #define NPASS ARRAY_SIZE(passname)
>   #define MMAP_SZ (64 << 10)
> +	const struct igt_spin_factory opts = {
> +		.engine = engine,
> +		.flags = IGT_SPIN_POLL_RUN,
> +	};
>   	struct rt_pkt *results;
>   	unsigned int engines[16];
>   	const char *names[16];
> @@ -513,7 +523,7 @@ rthog_latency_on_ring(int fd, unsigned int engine, const char *name, unsigned in
>   
>   			usleep(250);
>   
> -			spin = __igt_spin_batch_new_poll(fd, 0, engine);
> +			spin = __igt_spin_batch_factory(fd, &opts);
>   			if (!spin) {
>   				igt_warn("Failed to create spinner! (%s)\n",
>   					 passname[pass]);
> diff --git a/tests/gem_exec_nop.c b/tests/gem_exec_nop.c
> index 0523b1c02..74d27522d 100644
> --- a/tests/gem_exec_nop.c
> +++ b/tests/gem_exec_nop.c
> @@ -709,7 +709,9 @@ static void preempt(int fd, uint32_t handle,
>   	clock_gettime(CLOCK_MONOTONIC, &start);
>   	do {
>   		igt_spin_t *spin =
> -			__igt_spin_batch_new(fd, ctx[0], ring_id, 0);
> +			__igt_spin_batch_new(fd,
> +					     .ctx = ctx[0],
> +					     .engine = ring_id);
>   
>   		for (int loop = 0; loop < 1024; loop++)
>   			gem_execbuf(fd, &execbuf);
> diff --git a/tests/gem_exec_reloc.c b/tests/gem_exec_reloc.c
> index 91c6691af..837f60a6c 100644
> --- a/tests/gem_exec_reloc.c
> +++ b/tests/gem_exec_reloc.c
> @@ -388,7 +388,9 @@ static void basic_reloc(int fd, unsigned before, unsigned after, unsigned flags)
>   		}
>   
>   		if (flags & ACTIVE) {
> -			spin = igt_spin_batch_new(fd, 0, I915_EXEC_DEFAULT, obj.handle);
> +			spin = igt_spin_batch_new(fd,
> +						  .engine = I915_EXEC_DEFAULT,
> +						  .dependency = obj.handle);
>   			if (!(flags & HANG))
>   				igt_spin_batch_set_timeout(spin, NSEC_PER_SEC/100);
>   			igt_assert(gem_bo_busy(fd, obj.handle));
> @@ -454,7 +456,9 @@ static void basic_reloc(int fd, unsigned before, unsigned after, unsigned flags)
>   		}
>   
>   		if (flags & ACTIVE) {
> -			spin = igt_spin_batch_new(fd, 0, I915_EXEC_DEFAULT, obj.handle);
> +			spin = igt_spin_batch_new(fd,
> +						  .engine = I915_EXEC_DEFAULT,
> +						  .dependency = obj.handle);
>   			if (!(flags & HANG))
>   				igt_spin_batch_set_timeout(spin, NSEC_PER_SEC/100);
>   			igt_assert(gem_bo_busy(fd, obj.handle));
> @@ -581,7 +585,7 @@ static void basic_range(int fd, unsigned flags)
>   	execbuf.buffer_count = n + 1;
>   
>   	if (flags & ACTIVE) {
> -		spin = igt_spin_batch_new(fd, 0, 0, obj[n].handle);
> +		spin = igt_spin_batch_new(fd, .dependency = obj[n].handle);
>   		if (!(flags & HANG))
>   			igt_spin_batch_set_timeout(spin, NSEC_PER_SEC/100);
>   		igt_assert(gem_bo_busy(fd, obj[n].handle));
> diff --git a/tests/gem_exec_schedule.c b/tests/gem_exec_schedule.c
> index 1f43147f7..35a44ab10 100644
> --- a/tests/gem_exec_schedule.c
> +++ b/tests/gem_exec_schedule.c
> @@ -132,9 +132,12 @@ static void unplug_show_queue(int fd, struct igt_cork *c, unsigned int engine)
>   	igt_spin_t *spin[MAX_ELSP_QLEN];
>   
>   	for (int n = 0; n < ARRAY_SIZE(spin); n++) {
> -		uint32_t ctx = create_highest_priority(fd);
> -		spin[n] = __igt_spin_batch_new(fd, ctx, engine, 0);
> -		gem_context_destroy(fd, ctx);
> +		const struct igt_spin_factory opts = {
> +			.ctx = create_highest_priority(fd),
> +			.engine = engine,
> +		};
> +		spin[n] = __igt_spin_batch_factory(fd, &opts);
> +		gem_context_destroy(fd, opts.ctx);
>   	}
>   
>   	igt_cork_unplug(c); /* batches will now be queued on the engine */
> @@ -196,7 +199,7 @@ static void independent(int fd, unsigned int engine)
>   			continue;
>   
>   		if (spin == NULL) {
> -			spin = __igt_spin_batch_new(fd, 0, other, 0);
> +			spin = __igt_spin_batch_new(fd, .engine = other);
>   		} else {
>   			struct drm_i915_gem_exec_object2 obj = {
>   				.handle = spin->handle,
> @@ -428,7 +431,9 @@ static void preempt(int fd, unsigned ring, unsigned flags)
>   			ctx[LO] = gem_context_create(fd);
>   			gem_context_set_priority(fd, ctx[LO], MIN_PRIO);
>   		}
> -		spin[n] = __igt_spin_batch_new(fd, ctx[LO], ring, 0);
> +		spin[n] = __igt_spin_batch_new(fd,
> +					       .ctx = ctx[LO],
> +					       .engine = ring);
>   		igt_debug("spin[%d].handle=%d\n", n, spin[n]->handle);
>   
>   		store_dword(fd, ctx[HI], ring, result, 0, n + 1, 0, I915_GEM_DOMAIN_RENDER);
> @@ -462,7 +467,9 @@ static igt_spin_t *__noise(int fd, uint32_t ctx, int prio, igt_spin_t *spin)
>   
>   	for_each_physical_engine(fd, other) {
>   		if (spin == NULL) {
> -			spin = __igt_spin_batch_new(fd, ctx, other, 0);
> +			spin = __igt_spin_batch_new(fd,
> +						    .ctx = ctx,
> +						    .engine = other);
>   		} else {
>   			struct drm_i915_gem_exec_object2 obj = {
>   				.handle = spin->handle,
> @@ -672,7 +679,9 @@ static void preempt_self(int fd, unsigned ring)
>   	n = 0;
>   	gem_context_set_priority(fd, ctx[HI], MIN_PRIO);
>   	for_each_physical_engine(fd, other) {
> -		spin[n] = __igt_spin_batch_new(fd, ctx[NOISE], other, 0);
> +		spin[n] = __igt_spin_batch_new(fd,
> +					       .ctx = ctx[NOISE],
> +					       .engine = other);
>   		store_dword(fd, ctx[HI], other,
>   			    result, (n + 1)*sizeof(uint32_t), n + 1,
>   			    0, I915_GEM_DOMAIN_RENDER);
> @@ -714,7 +723,9 @@ static void preemptive_hang(int fd, unsigned ring)
>   		ctx[LO] = gem_context_create(fd);
>   		gem_context_set_priority(fd, ctx[LO], MIN_PRIO);
>   
> -		spin[n] = __igt_spin_batch_new(fd, ctx[LO], ring, 0);
> +		spin[n] = __igt_spin_batch_new(fd,
> +					       .ctx = ctx[LO],
> +					       .engine = ring);
>   
>   		gem_context_destroy(fd, ctx[LO]);
>   	}
> diff --git a/tests/gem_exec_suspend.c b/tests/gem_exec_suspend.c
> index db2bca262..43c52d105 100644
> --- a/tests/gem_exec_suspend.c
> +++ b/tests/gem_exec_suspend.c
> @@ -189,7 +189,7 @@ static void run_test(int fd, unsigned engine, unsigned flags)
>   	}
>   
>   	if (flags & HANG)
> -		spin = igt_spin_batch_new(fd, 0, engine, 0);
> +		spin = igt_spin_batch_new(fd, .engine = engine);
>   
>   	switch (mode(flags)) {
>   	case NOSLEEP:
> diff --git a/tests/gem_fenced_exec_thrash.c b/tests/gem_fenced_exec_thrash.c
> index 385790ada..7248d310d 100644
> --- a/tests/gem_fenced_exec_thrash.c
> +++ b/tests/gem_fenced_exec_thrash.c
> @@ -132,7 +132,7 @@ static void run_test(int fd, int num_fences, int expected_errno,
>   			igt_spin_t *spin = NULL;
>   
>   			if (flags & BUSY_LOAD)
> -				spin = __igt_spin_batch_new(fd, 0, 0, 0);
> +				spin = __igt_spin_batch_new(fd);
>   
>   			igt_while_interruptible(flags & INTERRUPTIBLE) {
>   				igt_assert_eq(__gem_execbuf(fd, &execbuf[i]),
> diff --git a/tests/gem_shrink.c b/tests/gem_shrink.c
> index 3d33453aa..929e0426a 100644
> --- a/tests/gem_shrink.c
> +++ b/tests/gem_shrink.c
> @@ -346,9 +346,9 @@ static void reclaim(unsigned engine, int timeout)
>   		} while (!*shared);
>   	}
>   
> -	spin = igt_spin_batch_new(fd, 0, engine, 0);
> +	spin = igt_spin_batch_new(fd, .engine = engine);
>   	igt_until_timeout(timeout) {
> -		igt_spin_t *next = __igt_spin_batch_new(fd, 0, engine, 0);
> +		igt_spin_t *next = __igt_spin_batch_new(fd, .engine = engine);
>   
>   		igt_spin_batch_set_timeout(spin, timeout_100ms);
>   		gem_sync(fd, spin->handle);
> diff --git a/tests/gem_spin_batch.c b/tests/gem_spin_batch.c
> index cffeb6d71..52410010b 100644
> --- a/tests/gem_spin_batch.c
> +++ b/tests/gem_spin_batch.c
> @@ -41,9 +41,9 @@ static void spin(int fd, unsigned int engine, unsigned int timeout_sec)
>   	struct timespec itv = { };
>   	uint64_t elapsed;
>   
> -	spin = __igt_spin_batch_new(fd, 0, engine, 0);
> +	spin = __igt_spin_batch_new(fd, .engine = engine);
>   	while ((elapsed = igt_nsec_elapsed(&tv)) >> 30 < timeout_sec) {
> -		igt_spin_t *next = __igt_spin_batch_new(fd, 0, engine, 0);
> +		igt_spin_t *next = __igt_spin_batch_new(fd, .engine = engine);
>   
>   		igt_spin_batch_set_timeout(spin,
>   					   timeout_100ms - igt_nsec_elapsed(&itv));
> diff --git a/tests/gem_sync.c b/tests/gem_sync.c
> index 1e2e089a1..2fcb9aa01 100644
> --- a/tests/gem_sync.c
> +++ b/tests/gem_sync.c
> @@ -715,9 +715,8 @@ preempt(int fd, unsigned ring, int num_children, int timeout)
>   		do {
>   			igt_spin_t *spin =
>   				__igt_spin_batch_new(fd,
> -						     ctx[0],
> -						     execbuf.flags,
> -						     0);
> +						     .ctx = ctx[0],
> +						     .engine = execbuf.flags);
>   
>   			do {
>   				gem_execbuf(fd, &execbuf);
> diff --git a/tests/gem_wait.c b/tests/gem_wait.c
> index 61d8a4059..7914c9365 100644
> --- a/tests/gem_wait.c
> +++ b/tests/gem_wait.c
> @@ -74,7 +74,9 @@ static void basic(int fd, unsigned engine, unsigned flags)
>   	IGT_CORK_HANDLE(cork);
>   	uint32_t plug =
>   		flags & (WRITE | AWAIT) ? igt_cork_plug(&cork, fd) : 0;
> -	igt_spin_t *spin = igt_spin_batch_new(fd, 0, engine, plug);
> +	igt_spin_t *spin = igt_spin_batch_new(fd,
> +					      .engine = engine,
> +					      .dependency = plug);
>   	struct drm_i915_gem_wait wait = {
>   		flags & WRITE ? plug : spin->handle
>   	};
> diff --git a/tests/kms_busy.c b/tests/kms_busy.c
> index 4a4e0e156..abf39828b 100644
> --- a/tests/kms_busy.c
> +++ b/tests/kms_busy.c
> @@ -84,7 +84,8 @@ static void flip_to_fb(igt_display_t *dpy, int pipe,
>   	struct drm_event_vblank ev;
>   
>   	igt_spin_t *t = igt_spin_batch_new(dpy->drm_fd,
> -					   0, ring, fb->gem_handle);
> +					   .engine = ring,
> +					   .dependency = fb->gem_handle);
>   
>   	if (modeset) {
>   		/*
> @@ -200,7 +201,8 @@ static void test_atomic_commit_hang(igt_display_t *dpy, igt_plane_t *primary,
>   				    struct igt_fb *busy_fb, unsigned ring)
>   {
>   	igt_spin_t *t = igt_spin_batch_new(dpy->drm_fd,
> -					   0, ring, busy_fb->gem_handle);
> +					   .engine = ring,
> +					   .dependency = busy_fb->gem_handle);
>   	struct pollfd pfd = { .fd = dpy->drm_fd, .events = POLLIN };
>   	unsigned flags = 0;
>   	struct drm_event_vblank ev;
> @@ -287,7 +289,9 @@ static void test_pageflip_modeset_hang(igt_display_t *dpy,
>   
>   	igt_display_commit2(dpy, dpy->is_atomic ? COMMIT_ATOMIC : COMMIT_LEGACY);
>   
> -	t = igt_spin_batch_new(dpy->drm_fd, 0, ring, fb.gem_handle);
> +	t = igt_spin_batch_new(dpy->drm_fd,
> +			       .engine = ring,
> +			       .dependency = fb.gem_handle);
>   
>   	do_or_die(drmModePageFlip(dpy->drm_fd, dpy->pipes[pipe].crtc_id, fb.fb_id, DRM_MODE_PAGE_FLIP_EVENT, &fb));
>   
> diff --git a/tests/kms_cursor_legacy.c b/tests/kms_cursor_legacy.c
> index d0a28b3c4..85340d43e 100644
> --- a/tests/kms_cursor_legacy.c
> +++ b/tests/kms_cursor_legacy.c
> @@ -532,7 +532,8 @@ static void basic_flip_cursor(igt_display_t *display,
>   
>   		spin = NULL;
>   		if (flags & BASIC_BUSY)
> -			spin = igt_spin_batch_new(display->drm_fd, 0, 0, fb_info.gem_handle);
> +			spin = igt_spin_batch_new(display->drm_fd,
> +						  .dependency = fb_info.gem_handle);
>   
>   		/* Start with a synchronous query to align with the vblank */
>   		vblank_start = get_vblank(display->drm_fd, pipe, DRM_VBLANK_NEXTONMISS);
> @@ -1323,8 +1324,8 @@ static void flip_vs_cursor_busy_crc(igt_display_t *display, bool atomic)
>   	for (int i = 1; i >= 0; i--) {
>   		igt_spin_t *spin;
>   
> -		spin = igt_spin_batch_new(display->drm_fd, 0, 0,
> -					  fb_info[1].gem_handle);
> +		spin = igt_spin_batch_new(display->drm_fd,
> +					  .dependency = fb_info[1].gem_handle);
>   
>   		vblank_start = get_vblank(display->drm_fd, pipe, DRM_VBLANK_NEXTONMISS);
>   
> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> index 4570f926d..a1d36ac4f 100644
> --- a/tests/perf_pmu.c
> +++ b/tests/perf_pmu.c
> @@ -172,10 +172,15 @@ static unsigned int e2ring(int gem_fd, const struct intel_execution_engine2 *e)
>   
>   static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
>   {
> +	struct igt_spin_factory opts = {
> +		.ctx = ctx,
> +		.engine = flags,
> +	};
> +
>   	if (gem_can_store_dword(fd, flags))
> -		return __igt_spin_batch_new_poll(fd, ctx, flags);
> -	else
> -		return __igt_spin_batch_new(fd, ctx, flags, 0);
> +		opts.flags |= IGT_SPIN_POLL_RUN;
> +
> +	return __igt_spin_batch_factory(fd, &opts);
>   }
>   
>   static unsigned long __spin_wait(int fd, igt_spin_t *spin)
> @@ -356,7 +361,9 @@ busy_double_start(int gem_fd, const struct intel_execution_engine2 *e)
>   	 */
>   	spin[0] = __spin_sync(gem_fd, 0, e2ring(gem_fd, e));
>   	usleep(500e3);
> -	spin[1] = __igt_spin_batch_new(gem_fd, ctx, e2ring(gem_fd, e), 0);
> +	spin[1] = __igt_spin_batch_new(gem_fd,
> +				       .ctx = ctx,
> +				       .engine = e2ring(gem_fd, e));
>   
>   	/*
>   	 * Open PMU as fast as possible after the second spin batch in attempt
> @@ -1045,8 +1052,8 @@ static void cpu_hotplug(int gem_fd)
>   	 * Create two spinners so test can ensure shorter gaps in engine
>   	 * busyness as it is terminating one and re-starting the other.
>   	 */
> -	spin[0] = igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER, 0);
> -	spin[1] = __igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER, 0);
> +	spin[0] = igt_spin_batch_new(gem_fd, .engine = I915_EXEC_RENDER);
> +	spin[1] = __igt_spin_batch_new(gem_fd, .engine = I915_EXEC_RENDER);
>   
>   	val = __pmu_read_single(fd, &ts[0]);
>   
> @@ -1129,8 +1136,8 @@ static void cpu_hotplug(int gem_fd)
>   			break;
>   
>   		igt_spin_batch_free(gem_fd, spin[cur]);
> -		spin[cur] = __igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER,
> -						 0);
> +		spin[cur] = __igt_spin_batch_new(gem_fd,
> +						 .engine = I915_EXEC_RENDER);
>   		cur ^= 1;
>   	}
>   
> @@ -1167,8 +1174,9 @@ test_interrupts(int gem_fd)
>   
>   	/* Queue spinning batches. */
>   	for (int i = 0; i < target; i++) {
> -		spin[i] = __igt_spin_batch_new_fence(gem_fd,
> -						     0, I915_EXEC_RENDER);
> +		spin[i] = __igt_spin_batch_new(gem_fd,
> +					       .engine = I915_EXEC_RENDER,
> +					       .flags = IGT_SPIN_FENCE_OUT);
>   		if (i == 0) {
>   			fence_fd = spin[i]->out_fence;
>   		} else {
> @@ -1229,7 +1237,8 @@ test_interrupts_sync(int gem_fd)
>   
>   	/* Queue spinning batches. */
>   	for (int i = 0; i < target; i++)
> -		spin[i] = __igt_spin_batch_new_fence(gem_fd, 0, 0);
> +		spin[i] = __igt_spin_batch_new(gem_fd,
> +					       .flags = IGT_SPIN_FENCE_OUT);
>   
>   	/* Wait for idle state. */
>   	idle = pmu_read_single(fd);
> @@ -1550,7 +1559,7 @@ accuracy(int gem_fd, const struct intel_execution_engine2 *e,
>   		igt_spin_t *spin;
>   
>   		/* Allocate our spin batch and idle it. */
> -		spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
> +		spin = igt_spin_batch_new(gem_fd, .engine = e2ring(gem_fd, e));
>   		igt_spin_batch_end(spin);
>   		gem_sync(gem_fd, spin->handle);
>   
> diff --git a/tests/pm_rps.c b/tests/pm_rps.c
> index 006d084b8..202132b1c 100644
> --- a/tests/pm_rps.c
> +++ b/tests/pm_rps.c
> @@ -235,9 +235,9 @@ static void load_helper_run(enum load load)
>   
>   		igt_debug("Applying %s load...\n", lh.load ? "high" : "low");
>   
> -		spin[0] = __igt_spin_batch_new(drm_fd, 0, 0, 0);
> +		spin[0] = __igt_spin_batch_new(drm_fd);
>   		if (lh.load == HIGH)
> -			spin[1] = __igt_spin_batch_new(drm_fd, 0, 0, 0);
> +			spin[1] = __igt_spin_batch_new(drm_fd);
>   		while (!lh.exit) {
>   			handle = spin[0]->handle;
>   			igt_spin_batch_end(spin[0]);
> @@ -248,8 +248,7 @@ static void load_helper_run(enum load load)
>   			usleep(100);
>   
>   			spin[0] = spin[1];
> -			spin[lh.load == HIGH] =
> -				__igt_spin_batch_new(drm_fd, 0, 0, 0);
> +			spin[lh.load == HIGH] = __igt_spin_batch_new(drm_fd);
>   		}
>   
>   		handle = spin[0]->handle;
> @@ -510,7 +509,7 @@ static void boost_freq(int fd, int *boost_freqs)
>   	int64_t timeout = 1;
>   	igt_spin_t *load;
>   
> -	load = igt_spin_batch_new(fd, 0, 0, 0);
> +	load = igt_spin_batch_new(fd);
>   	resubmit_batch(fd, load->handle, 16);
>   
>   	/* Waiting will grant us a boost to maximum */
> 

Neat trick! :)

Conversion looks correct.

Two nitpicks:

1. It's not really a factory, but I guess we can give it an unusual name 
in our world so that it sticks out as something unusual.

2. We could simplify by avoiding the macro tricks and just passing in a 
{ .field = value } struct by value.

I don't actually mind the trick, but since it is a novel concept in the 
IGT codebase (AFAIR), please run it past Petri at least.
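
For completeness, nitpick 2 would look roughly like this untested
sketch (names taken from the patch):

	static inline igt_spin_t *
	igt_spin_batch_new(int fd, struct igt_spin_factory opts)
	{
		return __igt_spin_batch_factory(fd, &opts);
	}

	/* callers would then write */
	spin = igt_spin_batch_new(fd,
				  (struct igt_spin_factory){ .engine = ring });

The no-options call then becomes igt_spin_batch_new(fd,
(struct igt_spin_factory){ 0 }), which is the brevity the macro buys.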

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH i-g-t 05/17] lib: Spin fast, retire early
  2018-07-02  9:07   ` [igt-dev] " Chris Wilson
@ 2018-07-02 15:36     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-02 15:36 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx


On 02/07/2018 10:07, Chris Wilson wrote:
> When using the pollable spinner, we often want to use it as a means of
> ensuring the task is running on the GPU before switching to something
> else. In which case we don't want to add extra delay inside the spinner,
> but the current 1000 NOPs add on order of 5us, which is often larger
> than the target latency.
> 
> v2: Don't change perf_pmu as that is sensitive to the extra CPU latency
> from a tight GPU spinner.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com> #v1
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   lib/igt_dummyload.c       | 3 ++-
>   lib/igt_dummyload.h       | 1 +
>   tests/gem_ctx_isolation.c | 1 +
>   tests/gem_eio.c           | 1 +
>   tests/gem_exec_latency.c  | 4 ++--
>   5 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
> index 94efdf745..7beb66244 100644
> --- a/lib/igt_dummyload.c
> +++ b/lib/igt_dummyload.c
> @@ -199,7 +199,8 @@ emit_recursive_batch(igt_spin_t *spin,
>   	 * between function calls, that appears enough to keep SNB out of
>   	 * trouble. See https://bugs.freedesktop.org/show_bug.cgi?id=102262
>   	 */
> -	batch += 1000;
> +	if (!(opts->flags & IGT_SPIN_FAST))
> +		batch += 1000;

igt_require(!snb) or something, given the comment whose last two lines 
can be seen in the diff above?
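
Concretely, e.g. an untested sketch inside emit_recursive_batch(),
assuming the usual intel_chipset.h helpers apply here:

	if (opts->flags & IGT_SPIN_FAST)
		igt_require(!IS_GEN6(intel_get_drm_devid(fd)));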

Regards,

Tvrtko

>   
>   	/* recurse */
>   	r = &relocs[obj[BATCH].relocation_count++];
> diff --git a/lib/igt_dummyload.h b/lib/igt_dummyload.h
> index c794f2544..e80a12451 100644
> --- a/lib/igt_dummyload.h
> +++ b/lib/igt_dummyload.h
> @@ -52,6 +52,7 @@ struct igt_spin_factory {
>   
>   #define IGT_SPIN_FENCE_OUT (1 << 0)
>   #define IGT_SPIN_POLL_RUN  (1 << 1)
> +#define IGT_SPIN_FAST      (1 << 2)
>   
>   igt_spin_t *
>   __igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts);
> diff --git a/tests/gem_ctx_isolation.c b/tests/gem_ctx_isolation.c
> index 2e19e8c03..4325e1c28 100644
> --- a/tests/gem_ctx_isolation.c
> +++ b/tests/gem_ctx_isolation.c
> @@ -560,6 +560,7 @@ static void inject_reset_context(int fd, unsigned int engine)
>   	struct igt_spin_factory opts = {
>   		.ctx = gem_context_create(fd),
>   		.engine = engine,
> +		.flags = IGT_SPIN_FAST,
>   	};
>   	igt_spin_t *spin;
>   
> diff --git a/tests/gem_eio.c b/tests/gem_eio.c
> index 0ec1aaec9..3162a3170 100644
> --- a/tests/gem_eio.c
> +++ b/tests/gem_eio.c
> @@ -160,6 +160,7 @@ static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
>   	struct igt_spin_factory opts = {
>   		.ctx = ctx,
>   		.engine = flags,
> +		.flags = IGT_SPIN_FAST,
>   	};
>   
>   	if (gem_can_store_dword(fd, opts.engine))
> diff --git a/tests/gem_exec_latency.c b/tests/gem_exec_latency.c
> index 75811f325..de16322a6 100644
> --- a/tests/gem_exec_latency.c
> +++ b/tests/gem_exec_latency.c
> @@ -65,7 +65,7 @@ poll_ring(int fd, unsigned ring, const char *name)
>   {
>   	const struct igt_spin_factory opts = {
>   		.engine = ring,
> -		.flags = IGT_SPIN_POLL_RUN,
> +		.flags = IGT_SPIN_POLL_RUN | IGT_SPIN_FAST,
>   	};
>   	struct timespec tv = {};
>   	unsigned long cycles;
> @@ -464,7 +464,7 @@ rthog_latency_on_ring(int fd, unsigned int engine, const char *name, unsigned in
>   #define MMAP_SZ (64 << 10)
>   	const struct igt_spin_factory opts = {
>   		.engine = engine,
> -		.flags = IGT_SPIN_POLL_RUN,
> +		.flags = IGT_SPIN_POLL_RUN | IGT_SPIN_FAST,
>   	};
>   	struct rt_pkt *results;
>   	unsigned int engines[16];
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Intel-gfx] [PATCH i-g-t 05/17] lib: Spin fast, retire early
@ 2018-07-02 15:36     ` Tvrtko Ursulin
  0 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-02 15:36 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx


On 02/07/2018 10:07, Chris Wilson wrote:
> When using the pollable spinner, we often want to use it as a means of
> ensuring the task is running on the GPU before switching to something
> else. In which case we don't want to add extra delay inside the spinner,
> but the current 1000 NOPs add on order of 5us, which is often larger
> than the target latency.
> 
> v2: Don't change perf_pmu as that is sensitive to the extra CPU latency
> from a tight GPU spinner.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com> #v1
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   lib/igt_dummyload.c       | 3 ++-
>   lib/igt_dummyload.h       | 1 +
>   tests/gem_ctx_isolation.c | 1 +
>   tests/gem_eio.c           | 1 +
>   tests/gem_exec_latency.c  | 4 ++--
>   5 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
> index 94efdf745..7beb66244 100644
> --- a/lib/igt_dummyload.c
> +++ b/lib/igt_dummyload.c
> @@ -199,7 +199,8 @@ emit_recursive_batch(igt_spin_t *spin,
>   	 * between function calls, that appears enough to keep SNB out of
>   	 * trouble. See https://bugs.freedesktop.org/show_bug.cgi?id=102262
>   	 */
> -	batch += 1000;
> +	if (!(opts->flags & IGT_SPIN_FAST))
> +		batch += 1000;

igt_require(!snb) or something, given the comment whose last two lines 
can be seen in the diff above?
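
Concretely, e.g. an untested sketch inside emit_recursive_batch(),
assuming the usual intel_chipset.h helpers apply here:

	if (opts->flags & IGT_SPIN_FAST)
		igt_require(!IS_GEN6(intel_get_drm_devid(fd)));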

Regards,

Tvrtko

>   
>   	/* recurse */
>   	r = &relocs[obj[BATCH].relocation_count++];
> diff --git a/lib/igt_dummyload.h b/lib/igt_dummyload.h
> index c794f2544..e80a12451 100644
> --- a/lib/igt_dummyload.h
> +++ b/lib/igt_dummyload.h
> @@ -52,6 +52,7 @@ struct igt_spin_factory {
>   
>   #define IGT_SPIN_FENCE_OUT (1 << 0)
>   #define IGT_SPIN_POLL_RUN  (1 << 1)
> +#define IGT_SPIN_FAST      (1 << 2)
>   
>   igt_spin_t *
>   __igt_spin_batch_factory(int fd, const struct igt_spin_factory *opts);
> diff --git a/tests/gem_ctx_isolation.c b/tests/gem_ctx_isolation.c
> index 2e19e8c03..4325e1c28 100644
> --- a/tests/gem_ctx_isolation.c
> +++ b/tests/gem_ctx_isolation.c
> @@ -560,6 +560,7 @@ static void inject_reset_context(int fd, unsigned int engine)
>   	struct igt_spin_factory opts = {
>   		.ctx = gem_context_create(fd),
>   		.engine = engine,
> +		.flags = IGT_SPIN_FAST,
>   	};
>   	igt_spin_t *spin;
>   
> diff --git a/tests/gem_eio.c b/tests/gem_eio.c
> index 0ec1aaec9..3162a3170 100644
> --- a/tests/gem_eio.c
> +++ b/tests/gem_eio.c
> @@ -160,6 +160,7 @@ static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
>   	struct igt_spin_factory opts = {
>   		.ctx = ctx,
>   		.engine = flags,
> +		.flags = IGT_SPIN_FAST,
>   	};
>   
>   	if (gem_can_store_dword(fd, opts.engine))
> diff --git a/tests/gem_exec_latency.c b/tests/gem_exec_latency.c
> index 75811f325..de16322a6 100644
> --- a/tests/gem_exec_latency.c
> +++ b/tests/gem_exec_latency.c
> @@ -65,7 +65,7 @@ poll_ring(int fd, unsigned ring, const char *name)
>   {
>   	const struct igt_spin_factory opts = {
>   		.engine = ring,
> -		.flags = IGT_SPIN_POLL_RUN,
> +		.flags = IGT_SPIN_POLL_RUN | IGT_SPIN_FAST,
>   	};
>   	struct timespec tv = {};
>   	unsigned long cycles;
> @@ -464,7 +464,7 @@ rthog_latency_on_ring(int fd, unsigned int engine, const char *name, unsigned in
>   #define MMAP_SZ (64 << 10)
>   	const struct igt_spin_factory opts = {
>   		.engine = engine,
> -		.flags = IGT_SPIN_POLL_RUN,
> +		.flags = IGT_SPIN_POLL_RUN | IGT_SPIN_FAST,
>   	};
>   	struct rt_pkt *results;
>   	unsigned int engines[16];
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 02/17] igt/gem_tiled_partial_pwrite_pread: Check for known swizzling
  2018-07-02 12:00     ` Tvrtko Ursulin
@ 2018-07-05 11:14       ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-05 11:14 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: intel-gfx

Quoting Tvrtko Ursulin (2018-07-02 13:00:07)
> 
> On 02/07/2018 10:07, Chris Wilson wrote:
> > As we want to compare a templated tiling pattern against the target_bo,
> > we need to know that the swizzling is compatible. Or else the two
> > tiling pattern may differ due to underlying page address that we cannot
> > know, and so the test may sporadically fail.
> > 
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=102575
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   tests/gem_tiled_partial_pwrite_pread.c | 24 ++++++++++++++++++++++++
> >   1 file changed, 24 insertions(+)
> > 
> > diff --git a/tests/gem_tiled_partial_pwrite_pread.c b/tests/gem_tiled_partial_pwrite_pread.c
> > index fe573c37c..83c57c07d 100644
> > --- a/tests/gem_tiled_partial_pwrite_pread.c
> > +++ b/tests/gem_tiled_partial_pwrite_pread.c
> > @@ -249,6 +249,24 @@ static void test_partial_read_writes(void)
> >       }
> >   }
> >   
> > +static bool known_swizzling(uint32_t handle)
> > +{
> > +     struct drm_i915_gem_get_tiling2 {
> > +             uint32_t handle;
> > +             uint32_t tiling_mode;
> > +             uint32_t swizzle_mode;
> > +             uint32_t phys_swizzle_mode;
> > +     } arg = {
> > +             .handle = handle,
> > +     };
> > +#define DRM_IOCTL_I915_GEM_GET_TILING2       DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_GET_TILING, struct 
> 
> Can't we rely on this being in system headers by now?
> 
> drm_i915_gem_get_tiling2)
> > +
> > +     if (igt_ioctl(fd, DRM_IOCTL_I915_GEM_GET_TILING2, &arg))
> > +             return false;
> > +
> > +     return arg.phys_swizzle_mode == arg.swizzle_mode;
> > +}
> > +
> >   igt_main
> >   {
> >       uint32_t tiling_mode = I915_TILING_X;
> > @@ -271,6 +289,12 @@ igt_main
> >                                                     &tiling_mode, &scratch_pitch, 0);
> >               igt_assert(tiling_mode == I915_TILING_X);
> >               igt_assert(scratch_pitch == 4096);
> > +
> > +             /*
> > +              * As we want to compare our template tiled pattern against
> > +              * the target bo, we need consistent swizzling on both.
> > +              */
> > +             igt_require(known_swizzling(scratch_bo->handle));
> >               staging_bo = drm_intel_bo_alloc(bufmgr, "staging bo", BO_SIZE, 4096);
> >               tiled_staging_bo = drm_intel_bo_alloc_tiled(bufmgr, "scratch bo", 1024,
> >                                                           BO_SIZE/4096, 4,
> > 
> 
>> Another option could be to keep allocating until we find one in a
>> memory area with compatible swizzling? As it is, there may be some noise
>> in the test pass<->skip transitions.

It depends on physical layout which the kernel keeps hidden (for
understandable reasons).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 02/17] igt/gem_tiled_partial_pwrite_pread: Check for known swizzling
@ 2018-07-05 11:14       ` Chris Wilson
  0 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-05 11:14 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: intel-gfx

Quoting Tvrtko Ursulin (2018-07-02 13:00:07)
> 
> On 02/07/2018 10:07, Chris Wilson wrote:
> > As we want to compare a templated tiling pattern against the target_bo,
> > we need to know that the swizzling is compatible. Or else the two
> > tiling pattern may differ due to underlying page address that we cannot
> > know, and so the test may sporadically fail.
> > 
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=102575
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   tests/gem_tiled_partial_pwrite_pread.c | 24 ++++++++++++++++++++++++
> >   1 file changed, 24 insertions(+)
> > 
> > diff --git a/tests/gem_tiled_partial_pwrite_pread.c b/tests/gem_tiled_partial_pwrite_pread.c
> > index fe573c37c..83c57c07d 100644
> > --- a/tests/gem_tiled_partial_pwrite_pread.c
> > +++ b/tests/gem_tiled_partial_pwrite_pread.c
> > @@ -249,6 +249,24 @@ static void test_partial_read_writes(void)
> >       }
> >   }
> >   
> > +static bool known_swizzling(uint32_t handle)
> > +{
> > +     struct drm_i915_gem_get_tiling2 {
> > +             uint32_t handle;
> > +             uint32_t tiling_mode;
> > +             uint32_t swizzle_mode;
> > +             uint32_t phys_swizzle_mode;
> > +     } arg = {
> > +             .handle = handle,
> > +     };
> > +#define DRM_IOCTL_I915_GEM_GET_TILING2       DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_GET_TILING, struct 
> 
> Can't we rely on this being in system headers by now?
> 
> drm_i915_gem_get_tiling2)
> > +
> > +     if (igt_ioctl(fd, DRM_IOCTL_I915_GEM_GET_TILING2, &arg))
> > +             return false;
> > +
> > +     return arg.phys_swizzle_mode == arg.swizzle_mode;
> > +}
> > +
> >   igt_main
> >   {
> >       uint32_t tiling_mode = I915_TILING_X;
> > @@ -271,6 +289,12 @@ igt_main
> >                                                     &tiling_mode, &scratch_pitch, 0);
> >               igt_assert(tiling_mode == I915_TILING_X);
> >               igt_assert(scratch_pitch == 4096);
> > +
> > +             /*
> > +              * As we want to compare our template tiled pattern against
> > +              * the target bo, we need consistent swizzling on both.
> > +              */
> > +             igt_require(known_swizzling(scratch_bo->handle));
> >               staging_bo = drm_intel_bo_alloc(bufmgr, "staging bo", BO_SIZE, 4096);
> >               tiled_staging_bo = drm_intel_bo_alloc_tiled(bufmgr, "scratch bo", 1024,
> >                                                           BO_SIZE/4096, 4,
> > 
> 
>> Another option could be to keep allocating until we find one in a
>> memory area with compatible swizzling? As it is, there may be some noise
>> in the test pass<->skip transitions.

It depends on physical layout which the kernel keeps hidden (for
understandable reasons).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH i-g-t 05/17] lib: Spin fast, retire early
  2018-07-02 15:36     ` [Intel-gfx] " Tvrtko Ursulin
@ 2018-07-05 11:23       ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-05 11:23 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: intel-gfx

Quoting Tvrtko Ursulin (2018-07-02 16:36:28)
> 
> On 02/07/2018 10:07, Chris Wilson wrote:
> > When using the pollable spinner, we often want to use it as a means of
> > ensuring the task is running on the GPU before switching to something
> > else. In which case we don't want to add extra delay inside the spinner,
> > but the current 1000 NOPs add on order of 5us, which is often larger
> > than the target latency.
> > 
> > v2: Don't change perf_pmu as that is sensitive to the extra CPU latency
> > from a tight GPU spinner.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com> #v1
> > Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   lib/igt_dummyload.c       | 3 ++-
> >   lib/igt_dummyload.h       | 1 +
> >   tests/gem_ctx_isolation.c | 1 +
> >   tests/gem_eio.c           | 1 +
> >   tests/gem_exec_latency.c  | 4 ++--
> >   5 files changed, 7 insertions(+), 3 deletions(-)
> > 
> > diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
> > index 94efdf745..7beb66244 100644
> > --- a/lib/igt_dummyload.c
> > +++ b/lib/igt_dummyload.c
> > @@ -199,7 +199,8 @@ emit_recursive_batch(igt_spin_t *spin,
> >        * between function calls, that appears enough to keep SNB out of
> >        * trouble. See https://bugs.freedesktop.org/show_bug.cgi?id=102262
> >        */
> > -     batch += 1000;
> > +     if (!(opts->flags & IGT_SPIN_FAST))
> > +             batch += 1000;
> 
> igt_require(!snb) or something, given the comment whose last two lines 
> can be seen in the diff above?

It's not a machine killer since we have the required fix in the kernel,
it just has interesting system latency. That latency is not specific to
snb, and whether we want to trade system latency vs gpu latency is the
reason we punt the decision to the caller.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [igt-dev] [Intel-gfx] [PATCH i-g-t 05/17] lib: Spin fast, retire early
@ 2018-07-05 11:23       ` Chris Wilson
  0 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-05 11:23 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: intel-gfx

Quoting Tvrtko Ursulin (2018-07-02 16:36:28)
> 
> On 02/07/2018 10:07, Chris Wilson wrote:
> > When using the pollable spinner, we often want to use it as a means of
> > ensuring the task is running on the GPU before switching to something
> > else. In which case we don't want to add extra delay inside the spinner,
> > but the current 1000 NOPs add on order of 5us, which is often larger
> > than the target latency.
> > 
> > v2: Don't change perf_pmu as that is sensitive to the extra CPU latency
> > from a tight GPU spinner.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com> #v1
> > Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   lib/igt_dummyload.c       | 3 ++-
> >   lib/igt_dummyload.h       | 1 +
> >   tests/gem_ctx_isolation.c | 1 +
> >   tests/gem_eio.c           | 1 +
> >   tests/gem_exec_latency.c  | 4 ++--
> >   5 files changed, 7 insertions(+), 3 deletions(-)
> > 
> > diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
> > index 94efdf745..7beb66244 100644
> > --- a/lib/igt_dummyload.c
> > +++ b/lib/igt_dummyload.c
> > @@ -199,7 +199,8 @@ emit_recursive_batch(igt_spin_t *spin,
> >        * between function calls, that appears enough to keep SNB out of
> >        * trouble. See https://bugs.freedesktop.org/show_bug.cgi?id=102262
> >        */
> > -     batch += 1000;
> > +     if (!(opts->flags & IGT_SPIN_FAST))
> > +             batch += 1000;
> 
> igt_require(!snb) or something, given the comment whose last two lines 
> can be seen in the diff above?

It's not a machine killer since we have the required fix in the kernel,
it just has interesting system latency. That latency is not specific to
snb, and whether we want to trade system latency vs gpu latency is the
reason we punt the decision to the caller.
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 02/17] igt/gem_tiled_partial_pwrite_pread: Check for known swizzling
  2018-07-05 11:14       ` [Intel-gfx] " Chris Wilson
@ 2018-07-05 12:30         ` Tvrtko Ursulin
  -1 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-05 12:30 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx


On 05/07/2018 12:14, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-07-02 13:00:07)
>>
>> On 02/07/2018 10:07, Chris Wilson wrote:
>>> As we want to compare a templated tiling pattern against the target_bo,
>>> we need to know that the swizzling is compatible. Or else the two
>>> tiling pattern may differ due to underlying page address that we cannot
>>> know, and so the test may sporadically fail.
>>>
>>> References: https://bugs.freedesktop.org/show_bug.cgi?id=102575
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    tests/gem_tiled_partial_pwrite_pread.c | 24 ++++++++++++++++++++++++
>>>    1 file changed, 24 insertions(+)
>>>
>>> diff --git a/tests/gem_tiled_partial_pwrite_pread.c b/tests/gem_tiled_partial_pwrite_pread.c
>>> index fe573c37c..83c57c07d 100644
>>> --- a/tests/gem_tiled_partial_pwrite_pread.c
>>> +++ b/tests/gem_tiled_partial_pwrite_pread.c
>>> @@ -249,6 +249,24 @@ static void test_partial_read_writes(void)
>>>        }
>>>    }
>>>    
>>> +static bool known_swizzling(uint32_t handle)
>>> +{
>>> +     struct drm_i915_gem_get_tiling2 {
>>> +             uint32_t handle;
>>> +             uint32_t tiling_mode;
>>> +             uint32_t swizzle_mode;
>>> +             uint32_t phys_swizzle_mode;
>>> +     } arg = {
>>> +             .handle = handle,
>>> +     };
>>> +#define DRM_IOCTL_I915_GEM_GET_TILING2       DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_GET_TILING, struct
>>
>> Can't we rely on this being in system headers by now?
>>
>> drm_i915_gem_get_tiling2)
>>> +
>>> +     if (igt_ioctl(fd, DRM_IOCTL_I915_GEM_GET_TILING2, &arg))
>>> +             return false;
>>> +
>>> +     return arg.phys_swizzle_mode == arg.swizzle_mode;
>>> +}
>>> +
>>>    igt_main
>>>    {
>>>        uint32_t tiling_mode = I915_TILING_X;
>>> @@ -271,6 +289,12 @@ igt_main
>>>                                                      &tiling_mode, &scratch_pitch, 0);
>>>                igt_assert(tiling_mode == I915_TILING_X);
>>>                igt_assert(scratch_pitch == 4096);
>>> +
>>> +             /*
>>> +              * As we want to compare our template tiled pattern against
>>> +              * the target bo, we need consistent swizzling on both.
>>> +              */
>>> +             igt_require(known_swizzling(scratch_bo->handle));
>>>                staging_bo = drm_intel_bo_alloc(bufmgr, "staging bo", BO_SIZE, 4096);
>>>                tiled_staging_bo = drm_intel_bo_alloc_tiled(bufmgr, "scratch bo", 1024,
>>>                                                            BO_SIZE/4096, 4,
>>>
>>
>> Another option could be to keep allocating until we found one in the
>> memory area with compatible swizzling? Like this it may be some noise in
>> the test pass<->skip transitions.
> 
> It depends on physical layout which the kernel keeps hidden (for
> understandable reasons).

Yeah, but we could allocate more and more until we end up in the area 
where args.phys_swizzle_mode == args.swizzle_mode. Might be too heavy 
an approach. But as it stands, the skip can be random depending on what 
physical memory gets allocated in each test run.
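
For reference, the retry idea could look roughly like this untested
sketch, keeping the rejects pinned so the allocator is pushed into
fresh pages (retry count picked arbitrarily):

	drm_intel_bo *hold[16] = {};	/* rejects kept pinned */
	unsigned int i;

	for (i = 0; i < ARRAY_SIZE(hold); i++) {
		hold[i] = drm_intel_bo_alloc_tiled(bufmgr, "scratch bo",
						   1024, BO_SIZE/4096, 4,
						   &tiling_mode,
						   &scratch_pitch, 0);
		if (known_swizzling(hold[i]->handle))
			break;	/* found a compatible area */
	}
	igt_require(i < ARRAY_SIZE(hold));
	scratch_bo = hold[i];
	while (i--)	/* release the incompatible ones */
		drm_intel_bo_unreference(hold[i]);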

Regards,

Tvrtko


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 02/17] igt/gem_tiled_partial_pwrite_pread: Check for known swizzling
@ 2018-07-05 12:30         ` Tvrtko Ursulin
  0 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-05 12:30 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx


On 05/07/2018 12:14, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-07-02 13:00:07)
>>
>> On 02/07/2018 10:07, Chris Wilson wrote:
>>> As we want to compare a templated tiling pattern against the target_bo,
>>> we need to know that the swizzling is compatible. Or else the two
>>> tiling pattern may differ due to underlying page address that we cannot
>>> know, and so the test may sporadically fail.
>>>
>>> References: https://bugs.freedesktop.org/show_bug.cgi?id=102575
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    tests/gem_tiled_partial_pwrite_pread.c | 24 ++++++++++++++++++++++++
>>>    1 file changed, 24 insertions(+)
>>>
>>> diff --git a/tests/gem_tiled_partial_pwrite_pread.c b/tests/gem_tiled_partial_pwrite_pread.c
>>> index fe573c37c..83c57c07d 100644
>>> --- a/tests/gem_tiled_partial_pwrite_pread.c
>>> +++ b/tests/gem_tiled_partial_pwrite_pread.c
>>> @@ -249,6 +249,24 @@ static void test_partial_read_writes(void)
>>>        }
>>>    }
>>>    
>>> +static bool known_swizzling(uint32_t handle)
>>> +{
>>> +     struct drm_i915_gem_get_tiling2 {
>>> +             uint32_t handle;
>>> +             uint32_t tiling_mode;
>>> +             uint32_t swizzle_mode;
>>> +             uint32_t phys_swizzle_mode;
>>> +     } arg = {
>>> +             .handle = handle,
>>> +     };
>>> +#define DRM_IOCTL_I915_GEM_GET_TILING2       DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_GET_TILING, struct
>>
>> Can't we rely on this being in system headers by now?
>>
>> drm_i915_gem_get_tiling2)
>>> +
>>> +     if (igt_ioctl(fd, DRM_IOCTL_I915_GEM_GET_TILING2, &arg))
>>> +             return false;
>>> +
>>> +     return arg.phys_swizzle_mode == arg.swizzle_mode;
>>> +}
>>> +
>>>    igt_main
>>>    {
>>>        uint32_t tiling_mode = I915_TILING_X;
>>> @@ -271,6 +289,12 @@ igt_main
>>>                                                      &tiling_mode, &scratch_pitch, 0);
>>>                igt_assert(tiling_mode == I915_TILING_X);
>>>                igt_assert(scratch_pitch == 4096);
>>> +
>>> +             /*
>>> +              * As we want to compare our template tiled pattern against
>>> +              * the target bo, we need consistent swizzling on both.
>>> +              */
>>> +             igt_require(known_swizzling(scratch_bo->handle));
>>>                staging_bo = drm_intel_bo_alloc(bufmgr, "staging bo", BO_SIZE, 4096);
>>>                tiled_staging_bo = drm_intel_bo_alloc_tiled(bufmgr, "scratch bo", 1024,
>>>                                                            BO_SIZE/4096, 4,
>>>
>>
>> Another option could be to keep allocating until we found one in the
>> memory area with compatible swizzling? Like this it may be some noise in
>> the test pass<->skip transitions.
> 
> It depends on physical layout which the kernel keeps hidden (for
> understandable reasons).

Yeah, but we could allocate more and more until we end up in the area 
where args.phys_swizzle_mode == args.swizzle_mode. Might be too heavy 
an approach. But as it stands, the skip can be random depending on what 
physical memory gets allocated in each test run.
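
For reference, the retry idea could look roughly like this untested
sketch, keeping the rejects pinned so the allocator is pushed into
fresh pages (retry count picked arbitrarily):

	drm_intel_bo *hold[16] = {};	/* rejects kept pinned */
	unsigned int i;

	for (i = 0; i < ARRAY_SIZE(hold); i++) {
		hold[i] = drm_intel_bo_alloc_tiled(bufmgr, "scratch bo",
						   1024, BO_SIZE/4096, 4,
						   &tiling_mode,
						   &scratch_pitch, 0);
		if (known_swizzling(hold[i]->handle))
			break;	/* found a compatible area */
	}
	igt_require(i < ARRAY_SIZE(hold));
	scratch_bo = hold[i];
	while (i--)	/* release the incompatible ones */
		drm_intel_bo_unreference(hold[i]);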

Regards,

Tvrtko


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH i-g-t 05/17] lib: Spin fast, retire early
  2018-07-05 11:23       ` [igt-dev] [Intel-gfx] " Chris Wilson
@ 2018-07-05 12:33         ` Tvrtko Ursulin
  -1 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-05 12:33 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx


On 05/07/2018 12:23, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-07-02 16:36:28)
>>
>> On 02/07/2018 10:07, Chris Wilson wrote:
>>> When using the pollable spinner, we often want to use it as a means of
>>> ensuring the task is running on the GPU before switching to something
>>> else. In which case we don't want to add extra delay inside the spinner,
>>> but the current 1000 NOPs add on order of 5us, which is often larger
>>> than the target latency.
>>>
>>> v2: Don't change perf_pmu as that is sensitive to the extra CPU latency
>>> from a tight GPU spinner.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com> #v1
>>> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    lib/igt_dummyload.c       | 3 ++-
>>>    lib/igt_dummyload.h       | 1 +
>>>    tests/gem_ctx_isolation.c | 1 +
>>>    tests/gem_eio.c           | 1 +
>>>    tests/gem_exec_latency.c  | 4 ++--
>>>    5 files changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
>>> index 94efdf745..7beb66244 100644
>>> --- a/lib/igt_dummyload.c
>>> +++ b/lib/igt_dummyload.c
>>> @@ -199,7 +199,8 @@ emit_recursive_batch(igt_spin_t *spin,
>>>         * between function calls, that appears enough to keep SNB out of
>>>         * trouble. See https://bugs.freedesktop.org/show_bug.cgi?id=102262
>>>         */
>>> -     batch += 1000;
>>> +     if (!(opts->flags & IGT_SPIN_FAST))
>>> +             batch += 1000;
>>
>> igt_require(!snb) or something, given the comment whose last two lines
>> can be seen in the diff above?
> 
> It's not a machine killer since we have the required fix in the kernel,
> it just has interesting system latency. That latency is not specific to
> snb, and whether we want to trade system latency vs gpu latency is the
> reason we punt the decision to the caller.

I am not sure if the comment and the linked BZ are then still relevant?

If they are, it is not nice to punt the responsibility to the caller. We 
are exporting a new API and it is hard to imagine callers will dig into 
the implementation to find this comment.

So an info/warn/skip when using the feature on a platform where it 
doesn't work well is, I think, a much friendlier approach.

If the comment is no longer relevant, or needs adjusting to reflect the 
changed reality (the bugfix you mention), then it should be either 
removed or adjusted.
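
For the warn flavour, an untested sketch next to that comment in
emit_recursive_batch() (assuming the IS_GEN6()/intel_get_drm_devid()
check is the right predicate here):

	if (!(opts->flags & IGT_SPIN_FAST))
		batch += 1000;
	else
		igt_warn_on_f(IS_GEN6(intel_get_drm_devid(fd)),
			      "IGT_SPIN_FAST may hit the SNB issue above\n");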

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Intel-gfx] [PATCH i-g-t 05/17] lib: Spin fast, retire early
@ 2018-07-05 12:33         ` Tvrtko Ursulin
  0 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-05 12:33 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx


On 05/07/2018 12:23, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-07-02 16:36:28)
>>
>> On 02/07/2018 10:07, Chris Wilson wrote:
>>> When using the pollable spinner, we often want to use it as a means of
>>> ensuring the task is running on the GPU before switching to something
>>> else. In which case we don't want to add extra delay inside the spinner,
>>> but the current 1000 NOPs add on order of 5us, which is often larger
>>> than the target latency.
>>>
>>> v2: Don't change perf_pmu as that is sensitive to the extra CPU latency
>>> from a tight GPU spinner.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com> #v1
>>> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    lib/igt_dummyload.c       | 3 ++-
>>>    lib/igt_dummyload.h       | 1 +
>>>    tests/gem_ctx_isolation.c | 1 +
>>>    tests/gem_eio.c           | 1 +
>>>    tests/gem_exec_latency.c  | 4 ++--
>>>    5 files changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
>>> index 94efdf745..7beb66244 100644
>>> --- a/lib/igt_dummyload.c
>>> +++ b/lib/igt_dummyload.c
>>> @@ -199,7 +199,8 @@ emit_recursive_batch(igt_spin_t *spin,
>>>         * between function calls, that appears enough to keep SNB out of
>>>         * trouble. See https://bugs.freedesktop.org/show_bug.cgi?id=102262
>>>         */
>>> -     batch += 1000;
>>> +     if (!(opts->flags & IGT_SPIN_FAST))
>>> +             batch += 1000;
>>
>> igt_require(!snb) or something, given the comment whose last two lines
>> can be seen in the diff above?
> 
> It's not a machine killer since we have the required fix in the kernel,
> it just has interesting system latency. That latency is not specific to
> snb, and whether we want to trade system latency vs gpu latency is the
> reason we punt the decision to the caller.

I am not sure if the comment and the linked BZ are then still relevant?

If they are, it is not nice to punt the responsibility to the caller. We 
are exporting a new API and it is hard to imagine callers will dig into 
the implementation to find this comment.

So an info/warn/skip when using the feature on a platform where it 
doesn't work well is, I think, a much friendlier approach.

If the comment is no longer relevant, or needs adjusting to reflect the 
changed reality (the bugfix you mention), then it should be either 
removed or adjusted.
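
For the warn flavour, an untested sketch next to that comment in
emit_recursive_batch() (assuming the IS_GEN6()/intel_get_drm_devid()
check is the right predicate here):

	if (!(opts->flags & IGT_SPIN_FAST))
		batch += 1000;
	else
		igt_warn_on_f(IS_GEN6(intel_get_drm_devid(fd)),
			      "IGT_SPIN_FAST may hit the SNB issue above\n");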

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 02/17] igt/gem_tiled_partial_pwrite_pread: Check for known swizzling
  2018-07-05 12:30         ` [Intel-gfx] " Tvrtko Ursulin
@ 2018-07-05 12:35           ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-05 12:35 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: intel-gfx

Quoting Tvrtko Ursulin (2018-07-05 13:30:57)
> 
> On 05/07/2018 12:14, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-07-02 13:00:07)
> >>
> >> On 02/07/2018 10:07, Chris Wilson wrote:
> >>> As we want to compare a templated tiling pattern against the target_bo,
> >>> we need to know that the swizzling is compatible. Or else the two
> >>> tiling pattern may differ due to underlying page address that we cannot
> >>> know, and so the test may sporadically fail.
> >>>
> >>> References: https://bugs.freedesktop.org/show_bug.cgi?id=102575
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> ---
> >>>    tests/gem_tiled_partial_pwrite_pread.c | 24 ++++++++++++++++++++++++
> >>>    1 file changed, 24 insertions(+)
> >>>
> >>> diff --git a/tests/gem_tiled_partial_pwrite_pread.c b/tests/gem_tiled_partial_pwrite_pread.c
> >>> index fe573c37c..83c57c07d 100644
> >>> --- a/tests/gem_tiled_partial_pwrite_pread.c
> >>> +++ b/tests/gem_tiled_partial_pwrite_pread.c
> >>> @@ -249,6 +249,24 @@ static void test_partial_read_writes(void)
> >>>        }
> >>>    }
> >>>    
> >>> +static bool known_swizzling(uint32_t handle)
> >>> +{
> >>> +     struct drm_i915_gem_get_tiling2 {
> >>> +             uint32_t handle;
> >>> +             uint32_t tiling_mode;
> >>> +             uint32_t swizzle_mode;
> >>> +             uint32_t phys_swizzle_mode;
> >>> +     } arg = {
> >>> +             .handle = handle,
> >>> +     };
> >>> +#define DRM_IOCTL_I915_GEM_GET_TILING2       DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_GET_TILING, struct
> >>
> >> Can't we rely on this being in system headers by now?
> >>
> >> drm_i915_gem_get_tiling2)
> >>> +
> >>> +     if (igt_ioctl(fd, DRM_IOCTL_I915_GEM_GET_TILING2, &arg))
> >>> +             return false;
> >>> +
> >>> +     return arg.phys_swizzle_mode == arg.swizzle_mode;
> >>> +}
> >>> +
> >>>    igt_main
> >>>    {
> >>>        uint32_t tiling_mode = I915_TILING_X;
> >>> @@ -271,6 +289,12 @@ igt_main
> >>>                                                      &tiling_mode, &scratch_pitch, 0);
> >>>                igt_assert(tiling_mode == I915_TILING_X);
> >>>                igt_assert(scratch_pitch == 4096);
> >>> +
> >>> +             /*
> >>> +              * As we want to compare our template tiled pattern against
> >>> +              * the target bo, we need consistent swizzling on both.
> >>> +              */
> >>> +             igt_require(known_swizzling(scratch_bo->handle));
> >>>                staging_bo = drm_intel_bo_alloc(bufmgr, "staging bo", BO_SIZE, 4096);
> >>>                tiled_staging_bo = drm_intel_bo_alloc_tiled(bufmgr, "scratch bo", 1024,
> >>>                                                            BO_SIZE/4096, 4,
> >>>
> >>
> >> Another option could be to keep allocating until we find one in a
> >> memory area with compatible swizzling? As it is, there may be some noise
> >> in the test pass<->skip transitions.
> > 
> > It depends on physical layout which the kernel keeps hidden (for
> > understandable reasons).
> 
> Yeah, but we could allocate more and more until we end up in the area 
> where args.phys_swizzle_mode == args.swizzle_mode. Might be too heavy 
> an approach. But as it stands, the skip can be random depending on what 
> physical memory gets allocated in each test run.

Ah, but phys_swizzle_mode and swizzle_mode are invariants determined
during boot for each fence-tiling type.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH i-g-t 05/17] lib: Spin fast, retire early
  2018-07-05 12:33         ` [Intel-gfx] " Tvrtko Ursulin
@ 2018-07-05 12:42           ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-05 12:42 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: intel-gfx

Quoting Tvrtko Ursulin (2018-07-05 13:33:55)
> 
> On 05/07/2018 12:23, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-07-02 16:36:28)
> >>
> >> On 02/07/2018 10:07, Chris Wilson wrote:
> >>> When using the pollable spinner, we often want to use it as a means of
> >>> ensuring the task is running on the GPU before switching to something
> >>> else. In which case we don't want to add extra delay inside the spinner,
> >>> but the current 1000 NOPs add on the order of 5us, which is often larger
> >>> than the target latency.
> >>>
> >>> v2: Don't change perf_pmu as that is sensitive to the extra CPU latency
> >>> from a tight GPU spinner.
> >>>
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com> #v1
> >>> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> ---
> >>>    lib/igt_dummyload.c       | 3 ++-
> >>>    lib/igt_dummyload.h       | 1 +
> >>>    tests/gem_ctx_isolation.c | 1 +
> >>>    tests/gem_eio.c           | 1 +
> >>>    tests/gem_exec_latency.c  | 4 ++--
> >>>    5 files changed, 7 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
> >>> index 94efdf745..7beb66244 100644
> >>> --- a/lib/igt_dummyload.c
> >>> +++ b/lib/igt_dummyload.c
> >>> @@ -199,7 +199,8 @@ emit_recursive_batch(igt_spin_t *spin,
> >>>         * between function calls, that appears enough to keep SNB out of
> >>>         * trouble. See https://bugs.freedesktop.org/show_bug.cgi?id=102262
> >>>         */
> >>> -     batch += 1000;
> >>> +     if (!(opts->flags & IGT_SPIN_FAST))
> >>> +             batch += 1000;
> >>
> >> igt_require(!snb) or something, given the comment whose last two lines
> >> can be seen in the diff above?
> > 
> > It's not a machine killer since we have the required fix in the kernel,
> > it just has interesting system latency. That latency is not specific to
> > snb, and whether we want to trade system latency vs gpu latency is the
> > reason we punt the decision to the caller.
> 
> I am not sure if the comment and linked BZ are then still relevant?

It is, it's just too specific, and it particularly affects modesetting as
it doesn't defend itself against CPU starvation / timer latency (imo).
 
> If it is, it is not nice to punt the responsibility to the caller. We
> are exporting a new API and it is hard to imagine callers will dig into
> the implementation to find it (this comment).

Why not? Is not the sordid history of our hw supposed to be our
specialist topic? :) For what has once happened before is sure to happen
again. (Corollary to George Santayana.)
 
> So an info/warn/skip when using the feature on a platform where it
> doesn't work well is, I think, a much friendlier approach.

But for those who want to use it, it is just noise.
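
For callers that do opt in, it is a one-liner at the callsite. A sketch
against the factory API from patch 04 (exact field and flag names
assumed from this series):

    igt_spin_t *spin;

    /* Pollable spinner without the 1000-NOP pad: trades CPU/timer
     * latency for lower GPU-visible latency. */
    spin = __igt_spin_batch_new(fd,
                                .engine = I915_EXEC_RENDER,
                                .flags = IGT_SPIN_POLL_RUN | IGT_SPIN_FAST);

    /* Busy-wait until the batch is really executing on the GPU. */
    while (!*(volatile bool *)spin->running)
        ;

    /* ... submit the latency-sensitive work here ... */

    igt_spin_batch_free(fd, spin);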
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 02/17] igt/gem_tiled_partial_pwrite_pread: Check for known swizzling
  2018-07-05 12:35           ` [Intel-gfx] " Chris Wilson
@ 2018-07-05 15:26             ` Tvrtko Ursulin
  -1 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-05 15:26 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx


On 05/07/2018 13:35, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-07-05 13:30:57)
>>
>> On 05/07/2018 12:14, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-07-02 13:00:07)
>>>>
>>>> On 02/07/2018 10:07, Chris Wilson wrote:
>>>>> As we want to compare a templated tiling pattern against the target_bo,
>>>>> we need to know that the swizzling is compatible. Otherwise the two
>>>>> tiling patterns may differ due to an underlying page address that we
>>>>> cannot know, and so the test may sporadically fail.
>>>>>
>>>>> References: https://bugs.freedesktop.org/show_bug.cgi?id=102575
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> ---
>>>>>     tests/gem_tiled_partial_pwrite_pread.c | 24 ++++++++++++++++++++++++
>>>>>     1 file changed, 24 insertions(+)
>>>>>
>>>>> diff --git a/tests/gem_tiled_partial_pwrite_pread.c b/tests/gem_tiled_partial_pwrite_pread.c
>>>>> index fe573c37c..83c57c07d 100644
>>>>> --- a/tests/gem_tiled_partial_pwrite_pread.c
>>>>> +++ b/tests/gem_tiled_partial_pwrite_pread.c
>>>>> @@ -249,6 +249,24 @@ static void test_partial_read_writes(void)
>>>>>         }
>>>>>     }
>>>>>     
>>>>> +static bool known_swizzling(uint32_t handle)
>>>>> +{
>>>>> +     struct drm_i915_gem_get_tiling2 {
>>>>> +             uint32_t handle;
>>>>> +             uint32_t tiling_mode;
>>>>> +             uint32_t swizzle_mode;
>>>>> +             uint32_t phys_swizzle_mode;
>>>>> +     } arg = {
>>>>> +             .handle = handle,
>>>>> +     };
>>>>> +#define DRM_IOCTL_I915_GEM_GET_TILING2       DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_GET_TILING, struct
>>>>
>>>> Can't we rely on this being in system headers by now?
>>>>
>>>> drm_i915_gem_get_tiling2)
>>>>> +
>>>>> +     if (igt_ioctl(fd, DRM_IOCTL_I915_GEM_GET_TILING2, &arg))
>>>>> +             return false;
>>>>> +
>>>>> +     return arg.phys_swizzle_mode == arg.swizzle_mode;
>>>>> +}
>>>>> +
>>>>>     igt_main
>>>>>     {
>>>>>         uint32_t tiling_mode = I915_TILING_X;
>>>>> @@ -271,6 +289,12 @@ igt_main
>>>>>                                                       &tiling_mode, &scratch_pitch, 0);
>>>>>                 igt_assert(tiling_mode == I915_TILING_X);
>>>>>                 igt_assert(scratch_pitch == 4096);
>>>>> +
>>>>> +             /*
>>>>> +              * As we want to compare our template tiled pattern against
>>>>> +              * the target bo, we need consistent swizzling on both.
>>>>> +              */
>>>>> +             igt_require(known_swizzling(scratch_bo->handle));
>>>>>                 staging_bo = drm_intel_bo_alloc(bufmgr, "staging bo", BO_SIZE, 4096);
>>>>>                 tiled_staging_bo = drm_intel_bo_alloc_tiled(bufmgr, "scratch bo", 1024,
>>>>>                                                             BO_SIZE/4096, 4,
>>>>>
>>>>
>>>> Another option could be to keep allocating until we find one in a
>>>> memory area with compatible swizzling? As it is, there may be some
>>>> noise in the test pass<->skip transitions.
>>>
>>> It depends on physical layout which the kernel keeps hidden (for
>>> understandable reasons).
>>
>> Yeah, but we could allocate more and more until we end up in the area
>> where args.phys_swizzle_mode == args.swizzle_mode. Might be too heavy an
>> approach. But then this skip can be random, depending on what physical
>> memory gets allocated in each test run.
> 
> Ah, but phys_swizzle_mode and swizzle_mode are invariants determined
> during boot for each fence-tiling type.

True, the get_tiling ioctl has nothing on that level. I don't know where
I was - I had this idea about multi-bank RAM with one bank single-channel,
another multi-channel, and the swizzling differing between the two.

Okay, then the question remains whether you just want to drop the local
struct and ioctl definition, since AFAICS the same definitions have been
in the kernel since 2012.
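
i.e. assuming the installed <drm/i915_drm.h> is new enough to carry
phys_swizzle_mode, the helper reduces to something like (a sketch):

    #include <drm/i915_drm.h>

    static bool known_swizzling(int fd, uint32_t handle)
    {
        /* The uapi struct already carries phys_swizzle_mode, so the
         * local struct and ioctl redefinition can go. */
        struct drm_i915_gem_get_tiling arg = {
            .handle = handle,
        };

        if (igt_ioctl(fd, DRM_IOCTL_I915_GEM_GET_TILING, &arg))
            return false;

        return arg.phys_swizzle_mode == arg.swizzle_mode;
    }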

Regards,

Tvrtko


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH i-g-t 05/17] lib: Spin fast, retire early
  2018-07-05 12:42           ` [igt-dev] [Intel-gfx] " Chris Wilson
@ 2018-07-05 15:29             ` Tvrtko Ursulin
  -1 siblings, 0 replies; 68+ messages in thread
From: Tvrtko Ursulin @ 2018-07-05 15:29 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: intel-gfx


On 05/07/2018 13:42, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-07-05 13:33:55)
>>
>> On 05/07/2018 12:23, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-07-02 16:36:28)
>>>>
>>>> On 02/07/2018 10:07, Chris Wilson wrote:
>>>>> When using the pollable spinner, we often want to use it as a means of
>>>>> ensuring the task is running on the GPU before switching to something
>>>>> else. In which case we don't want to add extra delay inside the spinner,
>>>>> but the current 1000 NOPs add on the order of 5us, which is often larger
>>>>> than the target latency.
>>>>>
>>>>> v2: Don't change perf_pmu as that is sensitive to the extra CPU latency
>>>>> from a tight GPU spinner.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com> #v1
>>>>> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> ---
>>>>>     lib/igt_dummyload.c       | 3 ++-
>>>>>     lib/igt_dummyload.h       | 1 +
>>>>>     tests/gem_ctx_isolation.c | 1 +
>>>>>     tests/gem_eio.c           | 1 +
>>>>>     tests/gem_exec_latency.c  | 4 ++--
>>>>>     5 files changed, 7 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
>>>>> index 94efdf745..7beb66244 100644
>>>>> --- a/lib/igt_dummyload.c
>>>>> +++ b/lib/igt_dummyload.c
>>>>> @@ -199,7 +199,8 @@ emit_recursive_batch(igt_spin_t *spin,
>>>>>          * between function calls, that appears enough to keep SNB out of
>>>>>          * trouble. See https://bugs.freedesktop.org/show_bug.cgi?id=102262
>>>>>          */
>>>>> -     batch += 1000;
>>>>> +     if (!(opts->flags & IGT_SPIN_FAST))
>>>>> +             batch += 1000;
>>>>
>>>> igt_require(!snb) or something, given the comment whose last two lines
>>>> can be seen in the diff above?
>>>
>>> It's not a machine killer since we have the required fix in the kernel,
>>> it just has interesting system latency. That latency is not specific to
>>> snb, and whether we want to trade system latency vs gpu latency is the
>>> reason we punt the decision to the caller.
>>
>> I am not sure if the comment and linked BZ are then still relevant?
> 
> It is, it's just too specific, and it particularly affects modesetting as
> it doesn't defend itself against CPU starvation / timer latency (imo).
>   
>> If it is, it is not nice to punt the responsibility to the caller. We
>> are exporting a new API and it is hard to imagine callers will dig into
>> the implementation to find it (this comment).
> 
> Why not? Is not the sordid history of our hw supposed to be our
> specialist topic? :) For what has once happened before is sure to happen
> again. (Corollary to George Santayana.)
>   
>> So an info/warn/skip when using the feature on a platform where it
>> doesn't work well is, I think, a much friendlier approach.
> 
> But for those who want to use it, it is just noise.

For current users yes, but for new users maybe not - maybe they need a
way to notice they might unknowingly be doing things suboptimally.
What would accomplish that goal? Is igt_info too much noise? Is
igt_debug too low-key? Blah..
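
For concreteness, the low-key option would be something along these
lines in emit_recursive_batch() - a sketch only, not a proposal:

    if (opts->flags & IGT_SPIN_FAST) {
        static bool once;

        /* Note the trade-off once per run rather than skipping
         * or warning loudly on every spinner. */
        if (!once) {
            igt_debug("IGT_SPIN_FAST: dropping the NOP pad; "
                      "beware extra CPU/timer latency "
                      "(see fdo#102262)\n");
            once = true;
        }
    } else {
        batch += 1000;
    }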

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH i-g-t 05/17] lib: Spin fast, retire early
  2018-07-05 15:29             ` [igt-dev] [Intel-gfx] " Tvrtko Ursulin
@ 2018-07-05 15:52               ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-05 15:52 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: intel-gfx

Quoting Tvrtko Ursulin (2018-07-05 16:29:32)
> 
> On 05/07/2018 13:42, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-07-05 13:33:55)
> >>
> >> On 05/07/2018 12:23, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2018-07-02 16:36:28)
> >>>>
> >>>> On 02/07/2018 10:07, Chris Wilson wrote:
> >>>>> When using the pollable spinner, we often want to use it as a means of
> >>>>> ensuring the task is running on the GPU before switching to something
> >>>>> else. In which case we don't want to add extra delay inside the spinner,
> >>>>> but the current 1000 NOPs add on the order of 5us, which is often larger
> >>>>> than the target latency.
> >>>>>
> >>>>> v2: Don't change perf_pmu as that is sensitive to the extra CPU latency
> >>>>> from a tight GPU spinner.
> >>>>>
> >>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>>>> Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com> #v1
> >>>>> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
> >>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>>> ---
> >>>>>     lib/igt_dummyload.c       | 3 ++-
> >>>>>     lib/igt_dummyload.h       | 1 +
> >>>>>     tests/gem_ctx_isolation.c | 1 +
> >>>>>     tests/gem_eio.c           | 1 +
> >>>>>     tests/gem_exec_latency.c  | 4 ++--
> >>>>>     5 files changed, 7 insertions(+), 3 deletions(-)
> >>>>>
> >>>>> diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
> >>>>> index 94efdf745..7beb66244 100644
> >>>>> --- a/lib/igt_dummyload.c
> >>>>> +++ b/lib/igt_dummyload.c
> >>>>> @@ -199,7 +199,8 @@ emit_recursive_batch(igt_spin_t *spin,
> >>>>>          * between function calls, that appears enough to keep SNB out of
> >>>>>          * trouble. See https://bugs.freedesktop.org/show_bug.cgi?id=102262
> >>>>>          */
> >>>>> -     batch += 1000;
> >>>>> +     if (!(opts->flags & IGT_SPIN_FAST))
> >>>>> +             batch += 1000;
> >>>>
> >>>> igt_require(!snb) or something, given the comment whose last two lines
> >>>> can be seen in the diff above?
> >>>
> >>> It's not a machine killer since we have the required fix in the kernel,
> >>> it just has interesting system latency. That latency is not specific to
> >>> snb, and whether we want to trade system latency vs gpu latency is the
> >>> reason we punt the decision to the caller.
> >>
> >> I am not sure if the comment and linked BZ are then still relevant?
> > 
> > It is, it's just too specific, and it particularly affects modesetting as
> > it doesn't defend itself against CPU starvation / timer latency (imo).
> >   
> >> If it is, it is not nice to punt the responsibility to the caller. We
> >> are exporting a new API and it is hard to imagine callers will dig into
> >> the implementation to find it (this comment).
> > 
> > Why not? Is not the sordid history of our hw supposed to be our
> > specialist topic? :) For what has once happened before is sure to happen
> > again. (Corollary to George Santayana.)
> >   
> >> So an info/warn/skip when using the feature on a platform where it
> >> doesn't work well is, I think, a much friendlier approach.
> > 
> > But for those who want to use it, it is just noise.
> 
> For current users yes, but for new users maybe not - maybe they need a
> way to notice they might unknowingly be doing things suboptimally.
> What would accomplish that goal? Is igt_info too much noise? Is
> igt_debug too low-key? Blah..

Rather, as I see it, unless it _impacts_ their test, ignorance is bliss.

Having opted into using SPIN_FAST and then seeing their test fail on the
farm on some machines, I don't expect it to be hard for a novice to debug.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 02/17] igt/gem_tiled_partial_pwrite_pread: Check for known swizzling
  2018-07-05 15:26             ` [Intel-gfx] " Tvrtko Ursulin
@ 2018-07-05 15:55               ` Chris Wilson
  -1 siblings, 0 replies; 68+ messages in thread
From: Chris Wilson @ 2018-07-05 15:55 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: intel-gfx

Quoting Tvrtko Ursulin (2018-07-05 16:26:51)
> 
> Okay, then the question remains whether you just want to drop the local
> struct and ioctl definition, since AFAICS the same definitions have been
> in the kernel since 2012.

I haven't commented on that as there's no argument here. I was lazy
and copied existing code, and will always be as lazy as I can get away
with.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2018-07-05 15:55 UTC | newest]

Thread overview: 68+ messages
2018-07-02  9:07 [PATCH i-g-t 01/17] lib: Report file cache as available system memory Chris Wilson
2018-07-02  9:07 ` [Intel-gfx] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 02/17] igt/gem_tiled_partial_pwrite_pread: Check for known swizzling Chris Wilson
2018-07-02  9:07   ` [igt-dev] " Chris Wilson
2018-07-02 12:00   ` Tvrtko Ursulin
2018-07-02 12:00     ` Tvrtko Ursulin
2018-07-05 11:14     ` Chris Wilson
2018-07-05 11:14       ` [Intel-gfx] " Chris Wilson
2018-07-05 12:30       ` Tvrtko Ursulin
2018-07-05 12:30         ` [Intel-gfx] " Tvrtko Ursulin
2018-07-05 12:35         ` Chris Wilson
2018-07-05 12:35           ` [Intel-gfx] " Chris Wilson
2018-07-05 15:26           ` Tvrtko Ursulin
2018-07-05 15:26             ` [Intel-gfx] " Tvrtko Ursulin
2018-07-05 15:55             ` Chris Wilson
2018-07-05 15:55               ` Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 03/17] igt/gem_set_tiling_vs_pwrite: Show the erroneous value Chris Wilson
2018-07-02  9:07   ` [Intel-gfx] " Chris Wilson
2018-07-02 12:00   ` [igt-dev] " Tvrtko Ursulin
2018-07-02 12:00     ` Tvrtko Ursulin
2018-07-02  9:07 ` [PATCH i-g-t 04/17] lib: Convert spin batch constructor to a factory Chris Wilson
2018-07-02  9:07   ` [igt-dev] " Chris Wilson
2018-07-02 15:34   ` Tvrtko Ursulin
2018-07-02 15:34     ` Tvrtko Ursulin
2018-07-02  9:07 ` [PATCH i-g-t 05/17] lib: Spin fast, retire early Chris Wilson
2018-07-02  9:07   ` [igt-dev] " Chris Wilson
2018-07-02 15:36   ` Tvrtko Ursulin
2018-07-02 15:36     ` [Intel-gfx] " Tvrtko Ursulin
2018-07-05 11:23     ` Chris Wilson
2018-07-05 11:23       ` [igt-dev] [Intel-gfx] " Chris Wilson
2018-07-05 12:33       ` Tvrtko Ursulin
2018-07-05 12:33         ` [Intel-gfx] " Tvrtko Ursulin
2018-07-05 12:42         ` Chris Wilson
2018-07-05 12:42           ` [igt-dev] [Intel-gfx] " Chris Wilson
2018-07-05 15:29           ` Tvrtko Ursulin
2018-07-05 15:29             ` [igt-dev] [Intel-gfx] " Tvrtko Ursulin
2018-07-05 15:52             ` Chris Wilson
2018-07-05 15:52               ` [igt-dev] [Intel-gfx] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 06/17] igt/gem_sync: Alternate stress for nop+sync Chris Wilson
2018-07-02  9:07   ` [Intel-gfx] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 07/17] igt/gem_sync: Double the wakeups, twice the pain Chris Wilson
2018-07-02  9:07   ` [igt-dev] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 08/17] igt/gem_sync: Show the baseline poll latency for wakeups Chris Wilson
2018-07-02  9:07   ` [igt-dev] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 09/17] igt/gem_userptr: Check read-only mappings Chris Wilson
2018-07-02  9:07   ` [igt-dev] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 10/17] igt: Exercise creating context with shared GTT Chris Wilson
2018-07-02  9:07   ` [Intel-gfx] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 11/17] igt/gem_ctx_switch: Exercise queues Chris Wilson
2018-07-02  9:07   ` [Intel-gfx] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 12/17] igt/gem_exec_whisper: Fork all-engine tests one-per-engine Chris Wilson
2018-07-02  9:07   ` [igt-dev] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 13/17] igt: Add gem_ctx_engines Chris Wilson
2018-07-02  9:07   ` [igt-dev] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 14/17] igt: Add gem_exec_balancer Chris Wilson
2018-07-02  9:07   ` [igt-dev] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 15/17] benchmarks/wsim: Simulate and interpret .wsim Chris Wilson
2018-07-02  9:07   ` [igt-dev] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 16/17] tools: capture execution pathways Chris Wilson
2018-07-02  9:07   ` [igt-dev] " Chris Wilson
2018-07-02  9:07 ` [PATCH i-g-t 17/17] igt/gem_exec_latency: Robustify measurements Chris Wilson
2018-07-02  9:07   ` [igt-dev] " Chris Wilson
2018-07-02 11:54 ` [igt-dev] [PATCH i-g-t 01/17] lib: Report file cache as available system memory Tvrtko Ursulin
2018-07-02 11:54   ` Tvrtko Ursulin
2018-07-02 12:08   ` Chris Wilson
2018-07-02 12:08     ` Chris Wilson
2018-07-02 13:09 ` [igt-dev] ✓ Fi.CI.BAT: success for series starting with [i-g-t,01/17] " Patchwork
2018-07-02 14:16 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
