* [PATCH i-g-t 1/9] i915/gem_exec_schedule: Split pi-ringfull into two tests
@ 2019-11-13 12:52 ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

pi-ringfull uses 2 contexts that share a buffer. The intent was that the
contexts were independent, but it was the effect of the global lock held
by the low priority client that prevented the high priority client from
executing. I began to add a second variant where there was a shared
resource which may induce a priority inversion, only to notice the
existing test already imposed a shared resource. Hence adding a second
test to rerun pi-ringfull in both unshared and shared resource modes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915/gem_exec_schedule.c | 38 +++++++++++++++++++++++++++++++---
 1 file changed, 35 insertions(+), 3 deletions(-)
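
The split described in the commit message comes down to one decision inside
the forked child: reuse the parent's batch, or replace it with a private one.
A minimal sketch of that decision follows. It assumes a hypothetical
fake_gem_create() standing in for gem_create(), and child_batch() is an
illustrative name that does not exist in the test.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(x) (1u << (x))
#define SHARED BIT(0)	/* both clients submit the same batch object */

/* Hypothetical stand-in for gem_create(): hands out fresh handles. */
static uint32_t fake_gem_create(void)
{
	static uint32_t next = 1000;

	return next++;
}

/* Pick the batch handle that the high priority child will submit. */
static uint32_t child_batch(uint32_t parent_batch, unsigned int flags)
{
	assert(parent_batch);		/* handle 0 is treated as invalid here */

	if (flags & SHARED)
		return parent_batch;	/* pi-common: read-only sharing */

	return fake_gem_create();	/* pi-ringfull: nothing in common */
}
```

Calling child_batch(handle, 0) models pi-ringfull and
child_batch(handle, SHARED) models pi-common; everything else in the test
body is identical between the two subtests.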

diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
index 5c15f1770..84581bffe 100644
--- a/tests/i915/gem_exec_schedule.c
+++ b/tests/i915/gem_exec_schedule.c
@@ -1468,7 +1468,8 @@ static void bind_to_cpu(int cpu)
 	igt_assert(sched_setaffinity(getpid(), sizeof(cpu_set_t), &allowed) == 0);
 }
 
-static void test_pi_ringfull(int fd, unsigned int engine)
+static void test_pi_ringfull(int fd, unsigned int engine, unsigned int flags)
+#define SHARED BIT(0)
 {
 	const uint32_t bbe = MI_BATCH_BUFFER_END;
 	struct sigaction sa = { .sa_handler = alarm_handler };
@@ -1480,6 +1481,24 @@ static void test_pi_ringfull(int fd, unsigned int engine)
 	uint32_t vip;
 	bool *result;
 
+	/*
+	 * We start simple. A low priority client should never prevent a high
+	 * priority client from submitting their work; even if the low priority
+	 * client exhausts their ringbuffer and so is throttled.
+	 *
+	 * SHARED: A variant on the above rule is that even if the 2 clients
+	 * share a read-only resource, the blocked low priority client should
+	 * not prevent the high priority client from executing. A buffer,
+	 * e.g. the batch buffer, that is shared only for reads (no write
+	 * hazard, so the reads can be executed in parallel or in any order)
+	 * should not cause priority inversion due to the resource conflict.
+	 *
+	 * First, we have the low priority context who fills their ring and so
+	 * blocks. As soon as that context blocks, we try to submit a high
+	 * priority batch, which should be executed immediately, before the low
+	 * priority context is unblocked.
+	 */
+
 	result = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
 	igt_assert(result != MAP_FAILED);
 
@@ -1545,6 +1564,12 @@ static void test_pi_ringfull(int fd, unsigned int engine)
 	igt_fork(child, 1) {
 		int err;
 
+		/* Replace our batch to avoid conflicts over shared resources? */
+		if (!(flags & SHARED)) {
+			obj[1].handle = gem_create(fd, 4096);
+			gem_write(fd, obj[1].handle, 0, &bbe, sizeof(bbe));
+		}
+
 		result[0] = vip != execbuf.rsvd1;
 
 		igt_debug("Waking parent\n");
@@ -1557,7 +1582,8 @@ static void test_pi_ringfull(int fd, unsigned int engine)
 		itv.it_value.tv_usec = 10000;
 		setitimer(ITIMER_REAL, &itv, NULL);
 
-		/* Since we are the high priority task, we expect to be
+		/*
+		 * Since we are the high priority task, we expect to be
 		 * able to add ourselves to *our* ring without interruption.
 		 */
 		igt_debug("HP child executing\n");
@@ -1569,6 +1595,9 @@ static void test_pi_ringfull(int fd, unsigned int engine)
 		setitimer(ITIMER_REAL, &itv, NULL);
 
 		result[2] = err == 0;
+
+		if (!(flags & SHARED))
+			gem_close(fd, obj[1].handle);
 	}
 
 	/* Relinquish CPU just to allow child to create a context */
@@ -1831,7 +1860,10 @@ igt_main
 				}
 
 				igt_subtest_f("pi-ringfull-%s", e->name)
-					test_pi_ringfull(fd, eb_ring(e));
+					test_pi_ringfull(fd, eb_ring(e), 0);
+
+				igt_subtest_f("pi-common-%s", e->name)
+					test_pi_ringfull(fd, eb_ring(e), SHARED);
 			}
 		}
 	}
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH i-g-t 2/9] i915/gem_exec_schedule: Exercise priority inversion from resource contention
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

One of the hardest priority inversion tasks to both handle and to
simulate in testing is inversion due to resource contention. The
challenge is that a high priority context should never be blocked by a
low priority context, even if both are starving for resources --
ideally, at least for some RT OSes, the higher priority context has
first pick of the meagre resources so that it can be executed with
minimum latency.

userfaultfd allows us to handle a page fault in userspace, and so
arbitrarily impose a delay on the fault handler, creating a situation
where a low priority context is blocked waiting for the fault. This
blocked context should not prevent a high priority context from being
executed. While the userfault tries to emulate a slow fault (e.g. from a
failing swap device), it is unfortunately limited to a single object
type: the userptr. Hopefully, we will find other ways to impose other
starvation conditions on global resources.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915/gem_exec_schedule.c | 155 +++++++++++++++++++++++++++++++++
 1 file changed, 155 insertions(+)
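
The userfaultfd mechanism described in the commit message can be tried
without a GPU. The sketch below is an illustration, not the test itself: a
plain reader thread stands in for the low priority context, the memory is
private anonymous rather than a userptr object, and the names reader_thread,
open_uffd and delayed_fault_demo are made up for this example. It also
assumes the kernel permits userfaultfd for the caller and reports -1 when it
does not.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

struct reader {
	volatile unsigned char *page;	/* page registered with userfaultfd */
	atomic_int done;		/* set once the read has returned */
	int value;			/* byte the reader finally saw */
};

/* Stand-in for the low priority client: it stalls inside the fault. */
static void *reader_thread(void *arg)
{
	struct reader *r = arg;

	r->value = r->page[0];
	atomic_store(&r->done, 1);
	return NULL;
}

static int open_uffd(void)
{
	int fd = -1;

#ifdef UFFD_USER_MODE_ONLY
	/* Newer kernels allow this mode without privileges. */
	fd = syscall(SYS_userfaultfd, O_CLOEXEC | UFFD_USER_MODE_ONLY);
#endif
	if (fd < 0)
		fd = syscall(SYS_userfaultfd, O_CLOEXEC);
	return fd;
}

/*
 * Returns the byte seen by the stalled reader, -1 if userfaultfd cannot
 * be set up here, or -2 if the reader was not held back by the fault.
 */
static int delayed_fault_demo(unsigned char fill)
{
	struct uffdio_api api = { .api = UFFD_API };
	struct uffdio_register reg = { .mode = UFFDIO_REGISTER_MODE_MISSING };
	struct uffdio_copy copy = { 0 };
	struct uffd_msg msg;
	struct reader r = { 0 };
	long sz = sysconf(_SC_PAGESIZE);
	unsigned char *dst, *src;
	pthread_t thread;
	int ufd, ret = -1;

	assert(sz > 0);
	ufd = open_uffd();
	if (ufd < 0)
		return -1;

	src = malloc(sz);
	dst = mmap(NULL, sz, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (!src || dst == MAP_FAILED)
		goto out;
	memset(src, fill, sz);

	/* Register our fault handler for the (still missing) page. */
	reg.range.start = (uintptr_t)dst;
	reg.range.len = sz;
	if (ioctl(ufd, UFFDIO_API, &api) || ioctl(ufd, UFFDIO_REGISTER, &reg))
		goto out;

	r.page = dst;
	if (pthread_create(&thread, NULL, reader_thread, &r))
		goto out;

	/* read() returns once the reader is parked on the missing page. */
	ret = (read(ufd, &msg, sizeof(msg)) == sizeof(msg) &&
	       msg.event == UFFD_EVENT_PAGEFAULT &&
	       !atomic_load(&r.done)) ? 0 : -2;

	/* Any "high priority" work would run here while the reader waits. */

	/* Service the fault, releasing the reader. */
	copy.dst = (uintptr_t)dst;
	copy.src = (uintptr_t)src;
	copy.len = sz;
	ioctl(ufd, UFFDIO_COPY, &copy);
	pthread_join(thread, NULL);
	if (ret == 0)
		ret = r.value;
out:
	if (dst != MAP_FAILED)
		munmap(dst, sz);
	free(src);
	close(ufd);
	return ret;
}
```

The delay is entirely under the control of whoever holds the userfaultfd:
the reader makes no progress until UFFDIO_COPY is issued, which is the
window in which the test submits its high priority batch.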

diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
index 84581bffe..d98434123 100644
--- a/tests/i915/gem_exec_schedule.c
+++ b/tests/i915/gem_exec_schedule.c
@@ -23,10 +23,16 @@
 
 #include "config.h"
 
+#include <linux/userfaultfd.h>
+
+#include <pthread.h>
 #include <sys/poll.h>
 #include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
 #include <sched.h>
 #include <signal.h>
+#include <unistd.h>
 
 #include "igt.h"
 #include "igt_rand.h"
@@ -1625,6 +1631,152 @@ static void test_pi_ringfull(int fd, unsigned int engine, unsigned int flags)
 	munmap(result, 4096);
 }
 
+static int userfaultfd(int flags)
+{
+	return syscall(SYS_userfaultfd, flags);
+}
+
+struct ufd_thread {
+	uint32_t batch;
+	uint32_t *page;
+	unsigned int engine;
+	int i915;
+};
+
+static uint32_t create_userptr(int i915, void *page)
+{
+	uint32_t handle;
+
+	gem_userptr(i915, page, 4096, 0, 0, &handle);
+	return handle;
+}
+
+static void *ufd_thread(void *arg)
+{
+	struct ufd_thread *t = arg;
+	struct drm_i915_gem_exec_object2 obj[2] = {
+		{ .handle = create_userptr(t->i915, t->page) },
+		{ .handle = t->batch },
+	};
+	struct drm_i915_gem_execbuffer2 eb = {
+		.buffers_ptr = to_user_pointer(obj),
+		.buffer_count = ARRAY_SIZE(obj),
+		.flags = t->engine,
+		.rsvd1 = gem_context_create(t->i915),
+	};
+	gem_context_set_priority(t->i915, eb.rsvd1, MIN_PRIO);
+
+	igt_debug("submitting fault\n");
+	gem_execbuf(t->i915, &eb);
+	gem_sync(t->i915, obj[0].handle);
+	gem_close(t->i915, obj[0].handle);
+
+	gem_context_destroy(t->i915, eb.rsvd1);
+
+	t->i915 = -1;
+	return NULL;
+}
+
+static void test_pi_userfault(int i915, unsigned int engine)
+{
+	struct uffdio_api api = { .api = UFFD_API };
+	struct uffdio_register reg;
+	struct uffdio_copy copy;
+	struct uffd_msg msg;
+	struct ufd_thread t;
+	pthread_t thread;
+	char poison[4096];
+	int ufd;
+
+	/*
+	 * Resource contention can easily lead to priority inversion problems
+	 * that we wish to avoid. Here, we simulate one simple form of resource
+	 * starvation by using an arbitrarily slow userspace fault handler to
+	 * cause the low priority context to block waiting for its resource.
+	 * While it is blocked, it should not prevent a higher priority context
+	 * from executing.
+	 *
+	 * This is only a very simple scenario; in more general tests we will
+	 * need to simulate contention on the shared resource such that both
+	 * low and high priority contexts are starving and must fight over
+	 * the meagre resources. One step at a time.
+	 */
+
+	ufd = userfaultfd(0);
+	igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
+	igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
+		      "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
+
+	t.i915 = i915;
+	t.engine = engine;
+
+	t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
+	igt_assert(t.page != MAP_FAILED);
+
+	t.batch = gem_create(i915, 4096);
+	memset(poison, 0xff, sizeof(poison));
+	gem_write(i915, t.batch, 0, poison, 4096);
+
+	/* Register our fault handler for t.page */
+	memset(&reg, 0, sizeof(reg));
+	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
+	reg.range.start = to_user_pointer(t.page);
+	reg.range.len = 4096;
+	do_ioctl(ufd, UFFDIO_REGISTER, &reg);
+	igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
+
+	/* Kick off the low priority submission */
+	igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
+
+	/* Wait until the low priority thread is blocked on a fault */
+	igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
+	igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
+	igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
+
+	/* While the low priority context is blocked, execute a vip */
+	if (1) {
+		const uint32_t bbe = MI_BATCH_BUFFER_END;
+		struct drm_i915_gem_exec_object2 obj = {
+			.handle = t.batch,
+		};
+		struct pollfd pfd;
+		struct drm_i915_gem_execbuffer2 eb = {
+			.buffers_ptr = to_user_pointer(&obj),
+			.buffer_count = 1,
+			.flags = engine | I915_EXEC_FENCE_OUT,
+			.rsvd1 = gem_context_create(i915),
+		};
+		gem_context_set_priority(i915, eb.rsvd1, MAX_PRIO);
+		gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
+		gem_execbuf_wr(i915, &eb);
+
+		memset(&pfd, 0, sizeof(pfd));
+		pfd.fd = eb.rsvd2 >> 32;
+		pfd.events = POLLIN;
+		poll(&pfd, 1, -1);
+		igt_assert_eq(sync_fence_status(pfd.fd), 1);
+		close(pfd.fd);
+
+		gem_context_destroy(i915, eb.rsvd1);
+	}
+
+	/* Confirm the low priority context is still waiting */
+	igt_assert_eq(t.i915, i915);
+
+	/* Service the fault; releasing the low priority context */
+	memset(&copy, 0, sizeof(copy));
+	copy.dst = msg.arg.pagefault.address;
+	copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
+	copy.len = 4096;
+	do_ioctl(ufd, UFFDIO_COPY, &copy);
+
+	pthread_join(thread, NULL);
+
+	gem_close(i915, t.batch);
+	munmap(t.page, 4096);
+	close(ufd);
+}
+
 static void measure_semaphore_power(int i915)
 {
 	struct rapl gpu, pkg;
@@ -1864,6 +2016,9 @@ igt_main
 
 				igt_subtest_f("pi-common-%s", e->name)
 					test_pi_ringfull(fd, eb_ring(e), SHARED);
+
+				igt_subtest_f("pi-userfault-%s", e->name)
+					test_pi_userfault(fd, eb_ring(e));
 			}
 		}
 	}
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH i-g-t 3/9] i915/gem_exec_schedule: Beware priority inversion from iova faults
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

Check that if two contexts (one high priority, one low) share the same
buffer, and that buffer has taken a page fault, we do not create an
implicit dependency between the two contexts for servicing that page
fault and binding the vma.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/i915/gem_exec_schedule.c | 166 +++++++++++++++++++++++++++++++++
 1 file changed, 166 insertions(+)

diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
index d98434123..f8b0ef5a8 100644
--- a/tests/i915/gem_exec_schedule.c
+++ b/tests/i915/gem_exec_schedule.c
@@ -1638,9 +1638,15 @@ static int userfaultfd(int flags)
 
 struct ufd_thread {
 	uint32_t batch;
+	uint32_t scratch;
 	uint32_t *page;
 	unsigned int engine;
+	unsigned int flags;
 	int i915;
+
+	pthread_mutex_t mutex;
+	pthread_cond_t cond;
+	int count;
 };
 
 static uint32_t create_userptr(int i915, void *page)
@@ -1777,6 +1783,160 @@ static void test_pi_userfault(int i915, unsigned int engine)
 	close(ufd);
 }
 
+static void *iova_thread(struct ufd_thread *t, int prio)
+{
+	uint32_t ctx =
+		gem_context_clone(t->i915, 0,
+				  t->flags & SHARED ? I915_CONTEXT_CLONE_VM : 0,
+				  0);
+
+	gem_context_set_priority(t->i915, ctx, prio);
+
+	store_dword_plug(t->i915, ctx, t->engine,
+			 t->scratch, 0, prio,
+			 t->batch, 0 /* no write hazard! */);
+
+	pthread_mutex_lock(&t->mutex);
+	if (!--t->count)
+		pthread_cond_signal(&t->cond);
+	pthread_mutex_unlock(&t->mutex);
+
+	gem_context_destroy(t->i915, ctx);
+	return NULL;
+}
+
+static void *iova_low(void *arg)
+{
+	return iova_thread(arg, MIN_PRIO);
+}
+
+static void *iova_high(void *arg)
+{
+	return iova_thread(arg, MAX_PRIO);
+}
+
+static void test_pi_iova(int i915, unsigned int engine, unsigned int flags)
+{
+	struct uffdio_api api = { .api = UFFD_API };
+	struct uffdio_register reg;
+	struct uffdio_copy copy;
+	struct uffd_msg msg;
+	struct ufd_thread t;
+	igt_spin_t *spin;
+	pthread_t hi, lo;
+	char poison[4096];
+	uint32_t result;
+	int ufd;
+
+	/*
+	 * In this scenario, we have a pair of contending contexts that
+	 * share the same resource. That resource is stuck behind a slow
+	 * page fault such that neither context has immediate access to it.
+	 * What is expected is that as soon as that resource becomes available,
+	 * the two contexts are queued with the high priority context taking
+	 * precedence. We need to check that we do not cross-contaminate
+	 * the two contexts with the page fault on the shared resource
+	 * initiated by the low priority context. (Consider that the low
+	 * priority context may install an exclusive fence for the page
+	 * fault, which is then used for strict ordering by the high priority
+	 * context, causing an unwanted implicit dependency between the two
+	 * and promoting the low priority context to high.)
+	 *
+	 * SHARED: the two contexts share a vm, but still have separate
+	 * timelines that should not mingle.
+	 */
+
+	ufd = userfaultfd(0);
+	igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
+	igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
+		      "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
+
+	t.i915 = i915;
+	t.engine = engine;
+	t.flags = flags;
+
+	t.count = 2;
+	pthread_cond_init(&t.cond, NULL);
+	pthread_mutex_init(&t.mutex, NULL);
+
+	t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
+	igt_assert(t.page != MAP_FAILED);
+	t.batch = create_userptr(i915, t.page);
+	t.scratch = gem_create(i915, 4096);
+
+	/* Register our fault handler for t.page */
+	memset(&reg, 0, sizeof(reg));
+	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
+	reg.range.start = to_user_pointer(t.page);
+	reg.range.len = 4096;
+	do_ioctl(ufd, UFFDIO_REGISTER, &reg);
+	igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
+
+	/*
+	 * Fill the engine with spinners; the store_dword() is too quick!
+	 *
+	 * It is not that it is too quick, it is that the order in which the
+	 * requests are signaled from the pagefault completion is loosely
+	 * defined (currently, it's in order of attachment so low context
+	 * wins), then submission into the execlists is immediate with the
+	 * low context filling the last slot in the ELSP. Preemption will
+	 * not take place until after the low priority context has had a
+	 * chance to run, and since the task is very short there is no
+	 * arbitration point inside the batch buffer so we only preempt
+	 * after the low priority context has completed.
+	 *
+	 * One way to prevent such opportunistic execution of the low priority
+	 * context would be to remove direct submission and wait until all
+	 * signals are delivered (as the signal delivery is under the irq lock,
+	 * the local tasklet will not run until after all signals have been
+	 * delivered... but another tasklet might).
+	 */
+	spin = igt_spin_new(i915, .engine = engine);
+	for (int i = 0; i < MAX_ELSP_QLEN; i++) {
+		spin->execbuf.rsvd1 = create_highest_priority(i915);
+		gem_execbuf(i915, &spin->execbuf);
+		gem_context_destroy(i915, spin->execbuf.rsvd1);
+	}
+
+	/* Kick off the submission threads */
+	igt_assert(pthread_create(&lo, NULL, iova_low, &t) == 0);
+
+	/* Wait until the low priority thread is blocked on the fault */
+	igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
+	igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
+	igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
+
+	/* Then release a very similar thread, but at high priority! */
+	igt_assert(pthread_create(&hi, NULL, iova_high, &t) == 0);
+
+	/* Service the fault; releasing both contexts */
+	memset(&copy, 0, sizeof(copy));
+	copy.dst = msg.arg.pagefault.address;
+	copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
+	copy.len = 4096;
+	do_ioctl(ufd, UFFDIO_COPY, &copy);
+
+	/* Wait until both threads have had a chance to submit */
+	pthread_mutex_lock(&t.mutex);
+	while (t.count)
+		pthread_cond_wait(&t.cond, &t.mutex);
+	pthread_mutex_unlock(&t.mutex);
+	igt_debugfs_dump(i915, "i915_engine_info");
+	igt_spin_free(i915, spin);
+
+	pthread_join(hi, NULL);
+	pthread_join(lo, NULL);
+	gem_close(i915, t.batch);
+
+	gem_sync(i915, t.scratch); /* write hazard lies */
+	gem_read(i915, t.scratch, 0, &result, sizeof(result));
+	igt_assert_eq(result, MIN_PRIO);
+	gem_close(i915, t.scratch);
+
+	munmap(t.page, 4096);
+	close(ufd);
+}
+
 static void measure_semaphore_power(int i915)
 {
 	struct rapl gpu, pkg;
@@ -2019,6 +2179,12 @@ igt_main
 
 				igt_subtest_f("pi-userfault-%s", e->name)
 					test_pi_userfault(fd, eb_ring(e));
+
+				igt_subtest_f("pi-distinct-iova-%s", e->name)
+					test_pi_iova(fd, eb_ring(e), 0);
+
+				igt_subtest_f("pi-shared-iova-%s", e->name)
+					test_pi_iova(fd, eb_ring(e), SHARED);
 			}
 		}
 	}
-- 
2.24.0

* [PATCH i-g-t 4/9] i915: Start putting the mmio_base to wider use
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

Several tests depend upon the implicit engine->mmio_base but have no
means of determining the physical layout. Since the kernel has started
providing this information, start putting it to use.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
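Note on the lookup order: gem_engine_mmio_base() below prefers the
sysfs mmio_base attribute and only consults a built-in table when the
attribute cannot be read. A standalone sketch of that order, assuming
the attribute holds a single hex value, is given here. The helpers
parse_hex_attr, fallback_mmio_base and resolve_mmio_base are
hypothetical names; the table values are copied from the patch. One
deliberate difference: this sketch also falls back when the attribute
is present but does not parse, whereas the patch only falls back when
the scanf helper returns a negative value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse one hex value from an already-open attribute. */
static int parse_hex_attr(FILE *file, unsigned int *value)
{
	assert(file && value);

	/* %x accepts an optional 0x prefix; success is exactly one conversion. */
	return fscanf(file, "%x", value) == 1 ? 0 : -1;
}

/* Fallback table, values as in the patch; only instance 0 is listed. */
static uint32_t fallback_mmio_base(const char *engine, int gen)
{
	if (!strcmp(engine, "rcs0"))
		return 0x2000;

	if (!strcmp(engine, "bcs0"))
		return 0x22000;

	if (!strcmp(engine, "vcs0")) {
		if (gen < 6)
			return 0x4000;
		if (gen < 11)
			return 0x12000;
		return 0x1c0000;
	}

	if (!strcmp(engine, "vecs0"))
		return gen < 11 ? 0x1a000 : 0x1c8000;

	return 0; /* unknown: callers are expected to skip the test */
}

/* Hypothetical: prefer the attribute (if any), else use the table. */
static uint32_t resolve_mmio_base(FILE *attr, const char *engine, int gen)
{
	unsigned int mmio;

	if (attr && parse_hex_attr(attr, &mmio) == 0)
		return mmio;

	return fallback_mmio_base(engine, gen);
}
```

A per-engine register is then formed as base plus offset, as the
gem_exec_latency changes do with mmio_base + TIMESTAMP (0x358); a zero
return means the base is unknown and the caller should igt_require()
its way out.
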
 lib/i915/gem_engine_topology.c | 84 ++++++++++++++++++++++++++++++++++
 lib/i915/gem_engine_topology.h |  5 ++
 tests/i915/gem_ctx_shared.c    | 38 +++++----------
 tests/i915/gem_exec_latency.c  | 17 ++++---
 4 files changed, 111 insertions(+), 33 deletions(-)

diff --git a/lib/i915/gem_engine_topology.c b/lib/i915/gem_engine_topology.c
index 790d455ff..bd200a4b9 100644
--- a/lib/i915/gem_engine_topology.c
+++ b/lib/i915/gem_engine_topology.c
@@ -21,7 +21,12 @@
  * IN THE SOFTWARE.
  */
 
+#include <fcntl.h>
+#include <unistd.h>
+
 #include "drmtest.h"
+#include "igt_sysfs.h"
+#include "intel_chipset.h"
 #include "ioctl_wrappers.h"
 
 #include "i915/gem_engine_topology.h"
@@ -337,3 +342,82 @@ bool gem_engine_is_equal(const struct intel_execution_engine2 *e1,
 {
 	return e1->class == e2->class && e1->instance == e2->instance;
 }
+
+static int descend(int dir, const char *path)
+{
+	int fd;
+
+	fd = openat(dir, path, O_RDONLY);
+	close(dir);
+
+	return fd;
+}
+
+int gem_engine_property_scanf(int i915, const char *engine, const char *attr,
+			      const char *fmt, ...)
+{
+	FILE *file;
+	va_list ap;
+	int ret;
+	int fd;
+
+	fd = igt_sysfs_open(i915);
+	if (fd < 0)
+		return fd;
+
+	fd = descend(fd, "engine");
+	if (fd < 0)
+		return fd;
+
+	fd = descend(fd, engine);
+	if (fd < 0)
+		return fd;
+
+	fd = descend(fd, attr);
+	if (fd < 0)
+		return fd;
+
+	file = fdopen(fd, "r");
+	if (!file) {
+		close(fd);
+		return -1;
+	}
+
+	va_start(ap, fmt);
+	ret = vfscanf(file, fmt, ap);
+	va_end(ap);
+
+	fclose(file);
+	return ret;
+}
+
+uint32_t gem_engine_mmio_base(int i915, const char *engine)
+{
+	unsigned int mmio = 0;
+
+	if (gem_engine_property_scanf(i915, engine, "mmio_base",
+				      "%x", &mmio) < 0) {
+		int gen = intel_gen(intel_get_drm_devid(i915));
+
+		/* The layout of xcs1+ is unreliable -- hence the property! */
+		if (!strcmp(engine, "rcs0")) {
+			mmio = 0x2000;
+		} else if (!strcmp(engine, "bcs0")) {
+			mmio = 0x22000;
+		} else if (!strcmp(engine, "vcs0")) {
+			if (gen < 6)
+				mmio = 0x4000;
+			else if (gen < 11)
+				mmio = 0x12000;
+			else
+				mmio = 0x1c0000;
+		} else if (!strcmp(engine, "vecs0")) {
+			if (gen < 11)
+				mmio = 0x1a000;
+			else
+				mmio = 0x1c8000;
+		}
+	}
+
+	return mmio;
+}
diff --git a/lib/i915/gem_engine_topology.h b/lib/i915/gem_engine_topology.h
index d98773e06..e728ebd93 100644
--- a/lib/i915/gem_engine_topology.h
+++ b/lib/i915/gem_engine_topology.h
@@ -74,4 +74,9 @@ struct intel_execution_engine2 gem_eb_flags_to_engine(unsigned int flags);
 	     ((e__) = intel_get_current_physical_engine(&i__)); \
 	     intel_next_engine(&i__))
 
+__attribute__((format(scanf, 4, 5)))
+int gem_engine_property_scanf(int i915, const char *engine, const char *attr,
+			      const char *fmt, ...);
+uint32_t gem_engine_mmio_base(int i915, const char *engine);
+
 #endif /* GEM_ENGINE_TOPOLOGY_H */
diff --git a/tests/i915/gem_ctx_shared.c b/tests/i915/gem_ctx_shared.c
index a6eee16dd..949e1f3d4 100644
--- a/tests/i915/gem_ctx_shared.c
+++ b/tests/i915/gem_ctx_shared.c
@@ -38,6 +38,7 @@
 
 #include <drm.h>
 
+#include "i915/gem_engine_topology.h"
 #include "igt_rand.h"
 #include "igt_vgem.h"
 #include "sync_file.h"
@@ -556,6 +557,14 @@ static uint32_t store_timestamp(int i915,
 	return obj.handle;
 }
 
+static uint32_t ring_base(int i915, unsigned ring)
+{
+	if (ring == I915_EXEC_DEFAULT)
+		ring = I915_EXEC_RENDER; /* XXX */
+
+	return gem_engine_mmio_base(i915, gem_eb_flags_to_engine(ring).name);
+}
+
 static void independent(int i915, unsigned ring, unsigned flags)
 {
 	const int TIMESTAMP = 1023;
@@ -563,33 +572,8 @@ static void independent(int i915, unsigned ring, unsigned flags)
 	igt_spin_t *spin[MAX_ELSP_QLEN];
 	unsigned int mmio_base;
 
-	/* XXX i915_query()! */
-	switch (ring) {
-	case I915_EXEC_DEFAULT:
-	case I915_EXEC_RENDER:
-		mmio_base = 0x2000;
-		break;
-#if 0
-	case I915_EXEC_BSD:
-		mmio_base = 0x12000;
-		break;
-#endif
-	case I915_EXEC_BLT:
-		mmio_base = 0x22000;
-		break;
-
-#define GEN11_VECS0_BASE 0x1c8000
-#define GEN11_VECS1_BASE 0x1d8000
-	case I915_EXEC_VEBOX:
-		if (intel_gen(intel_get_drm_devid(i915)) >= 11)
-			mmio_base = GEN11_VECS0_BASE;
-		else
-			mmio_base = 0x1a000;
-		break;
-
-	default:
-		igt_skip("mmio base not known\n");
-	}
+	mmio_base = ring_base(i915, ring);
+	igt_require_f(mmio_base, "mmio base not known\n");
 
 	for (int n = 0; n < ARRAY_SIZE(spin); n++) {
 		const struct igt_spin_factory opts = {
diff --git a/tests/i915/gem_exec_latency.c b/tests/i915/gem_exec_latency.c
index 3d99182a0..d2159f317 100644
--- a/tests/i915/gem_exec_latency.c
+++ b/tests/i915/gem_exec_latency.c
@@ -109,7 +109,7 @@ poll_ring(int fd, unsigned ring, const char *name)
 	igt_spin_free(fd, spin[0]);
 }
 
-#define RCS_TIMESTAMP (0x2000 + 0x358)
+#define TIMESTAMP (0x358)
 static void latency_on_ring(int fd,
 			    unsigned ring, const char *name,
 			    unsigned flags)
@@ -119,6 +119,7 @@ static void latency_on_ring(int fd,
 	struct drm_i915_gem_exec_object2 obj[3];
 	struct drm_i915_gem_relocation_entry reloc;
 	struct drm_i915_gem_execbuffer2 execbuf;
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, name);
 	igt_spin_t *spin = NULL;
 	IGT_CORK_HANDLE(c);
 	volatile uint32_t *reg;
@@ -128,7 +129,8 @@ static void latency_on_ring(int fd,
 	double gpu_latency;
 	int i, j;
 
-	reg = (volatile uint32_t *)((volatile char *)igt_global_mmio + RCS_TIMESTAMP);
+	igt_require(mmio_base);
+	reg = (volatile uint32_t *)((volatile char *)igt_global_mmio + mmio_base + TIMESTAMP);
 
 	memset(&execbuf, 0, sizeof(execbuf));
 	execbuf.buffers_ptr = to_user_pointer(&obj[1]);
@@ -176,7 +178,7 @@ static void latency_on_ring(int fd,
 		map[i++] = 0x24 << 23 | 1;
 		if (has_64bit_reloc)
 			map[i-1]++;
-		map[i++] = RCS_TIMESTAMP; /* ring local! */
+		map[i++] = mmio_base + TIMESTAMP;
 		map[i++] = offset;
 		if (has_64bit_reloc)
 			map[i++] = offset >> 32;
@@ -266,11 +268,14 @@ static void latency_from_ring(int fd,
 	struct drm_i915_gem_exec_object2 obj[3];
 	struct drm_i915_gem_relocation_entry reloc;
 	struct drm_i915_gem_execbuffer2 execbuf;
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, name);
 	const unsigned int repeats = ring_size / 2;
 	uint32_t *map, *results;
 	uint32_t ctx[2] = {};
 	int i, j;
 
+	igt_require(mmio_base);
+
 	if (flags & PREEMPT) {
 		ctx[0] = gem_context_create(fd);
 		gem_context_set_priority(fd, ctx[0], -1023);
@@ -351,7 +356,7 @@ static void latency_from_ring(int fd,
 			map[i++] = 0x24 << 23 | 1;
 			if (has_64bit_reloc)
 				map[i-1]++;
-			map[i++] = RCS_TIMESTAMP; /* ring local! */
+			map[i++] = mmio_base + TIMESTAMP;
 			map[i++] = offset;
 			if (has_64bit_reloc)
 				map[i++] = offset >> 32;
@@ -376,7 +381,7 @@ static void latency_from_ring(int fd,
 			map[i++] = 0x24 << 23 | 1;
 			if (has_64bit_reloc)
 				map[i-1]++;
-			map[i++] = RCS_TIMESTAMP; /* ring local! */
+			map[i++] = mmio_base + TIMESTAMP;
 			map[i++] = offset;
 			if (has_64bit_reloc)
 				map[i++] = offset >> 32;
@@ -669,7 +674,7 @@ igt_main
 			ring_size = 1024;
 
 		intel_register_access_init(&mmio_data, intel_get_pci_device(), false, device);
-		rcs_clock = clockrate(device, RCS_TIMESTAMP);
+		rcs_clock = clockrate(device, 0x2000 + TIMESTAMP);
 		igt_info("RCS timestamp clock: %.0fKHz, %.1fns\n",
 			 rcs_clock / 1e3, 1e9 / rcs_clock);
 		rcs_clock = 1e9 / rcs_clock;
-- 
2.24.0

* [Intel-gfx] [PATCH i-g-t 4/9] i915: Start putting the mmio_base to wider use
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

Several tests depend upon the implicit engine->mmio_base but have no
means of determining the physical layout. Since the kernel has started
providing this information, start putting it to use.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/i915/gem_engine_topology.c | 84 ++++++++++++++++++++++++++++++++++
 lib/i915/gem_engine_topology.h |  5 ++
 tests/i915/gem_ctx_shared.c    | 38 +++++----------
 tests/i915/gem_exec_latency.c  | 17 ++++---
 4 files changed, 111 insertions(+), 33 deletions(-)

diff --git a/lib/i915/gem_engine_topology.c b/lib/i915/gem_engine_topology.c
index 790d455ff..bd200a4b9 100644
--- a/lib/i915/gem_engine_topology.c
+++ b/lib/i915/gem_engine_topology.c
@@ -21,7 +21,12 @@
  * IN THE SOFTWARE.
  */
 
+#include <fcntl.h>
+#include <unistd.h>
+
 #include "drmtest.h"
+#include "igt_sysfs.h"
+#include "intel_chipset.h"
 #include "ioctl_wrappers.h"
 
 #include "i915/gem_engine_topology.h"
@@ -337,3 +342,82 @@ bool gem_engine_is_equal(const struct intel_execution_engine2 *e1,
 {
 	return e1->class == e2->class && e1->instance == e2->instance;
 }
+
+static int descend(int dir, const char *path)
+{
+	int fd;
+
+	fd = openat(dir, path, O_RDONLY);
+	close(dir);
+
+	return fd;
+}
+
+int gem_engine_property_scanf(int i915, const char *engine, const char *attr,
+			      const char *fmt, ...)
+{
+	FILE *file;
+	va_list ap;
+	int ret;
+	int fd;
+
+	fd = igt_sysfs_open(i915);
+	if (fd < 0)
+		return fd;
+
+	fd = descend(fd, "engine");
+	if (fd < 0)
+		return fd;
+
+	fd = descend(fd, engine);
+	if (fd < 0)
+		return fd;
+
+	fd = descend(fd, attr);
+	if (fd < 0)
+		return fd;
+
+	file = fdopen(fd, "r");
+	if (!file) {
+		close(fd);
+		return -1;
+	}
+
+	va_start(ap, fmt);
+	ret = vfscanf(file, fmt, ap);
+	va_end(ap);
+
+	fclose(file);
+	return ret;
+}
+
+uint32_t gem_engine_mmio_base(int i915, const char *engine)
+{
+	unsigned int mmio = 0;
+
+	if (gem_engine_property_scanf(i915, engine, "mmio_base",
+				      "%x", &mmio) < 0) {
+		int gen = intel_gen(intel_get_drm_devid(i915));
+
+		/* The layout of xcs1+ is unreliable -- hence the property! */
+		if (!strcmp(engine, "rcs0")) {
+			mmio = 0x2000;
+		} else if (!strcmp(engine, "bcs0")) {
+			mmio = 0x22000;
+		} else if (!strcmp(engine, "vcs0")) {
+			if (gen < 6)
+				mmio = 0x4000;
+			else if (gen < 11)
+				mmio = 0x12000;
+			else
+				mmio = 0x1c0000;
+		} else if (!strcmp(engine, "vecs0")) {
+			if (gen < 11)
+				mmio = 0x1a000;
+			else
+				mmio = 0x1c8000;
+		}
+	}
+
+	return mmio;
+}
diff --git a/lib/i915/gem_engine_topology.h b/lib/i915/gem_engine_topology.h
index d98773e06..e728ebd93 100644
--- a/lib/i915/gem_engine_topology.h
+++ b/lib/i915/gem_engine_topology.h
@@ -74,4 +74,9 @@ struct intel_execution_engine2 gem_eb_flags_to_engine(unsigned int flags);
 	     ((e__) = intel_get_current_physical_engine(&i__)); \
 	     intel_next_engine(&i__))
 
+__attribute__((format(scanf, 4, 5)))
+int gem_engine_property_scanf(int i915, const char *engine, const char *attr,
+			      const char *fmt, ...);
+uint32_t gem_engine_mmio_base(int i915, const char *engine);
+
 #endif /* GEM_ENGINE_TOPOLOGY_H */
diff --git a/tests/i915/gem_ctx_shared.c b/tests/i915/gem_ctx_shared.c
index a6eee16dd..949e1f3d4 100644
--- a/tests/i915/gem_ctx_shared.c
+++ b/tests/i915/gem_ctx_shared.c
@@ -38,6 +38,7 @@
 
 #include <drm.h>
 
+#include "i915/gem_engine_topology.h"
 #include "igt_rand.h"
 #include "igt_vgem.h"
 #include "sync_file.h"
@@ -556,6 +557,14 @@ static uint32_t store_timestamp(int i915,
 	return obj.handle;
 }
 
+static uint32_t ring_base(int i915, unsigned ring)
+{
+	if (ring == I915_EXEC_DEFAULT)
+		ring = I915_EXEC_RENDER; /* XXX */
+
+	return gem_engine_mmio_base(i915, gem_eb_flags_to_engine(ring).name);
+}
+
 static void independent(int i915, unsigned ring, unsigned flags)
 {
 	const int TIMESTAMP = 1023;
@@ -563,33 +572,8 @@ static void independent(int i915, unsigned ring, unsigned flags)
 	igt_spin_t *spin[MAX_ELSP_QLEN];
 	unsigned int mmio_base;
 
-	/* XXX i915_query()! */
-	switch (ring) {
-	case I915_EXEC_DEFAULT:
-	case I915_EXEC_RENDER:
-		mmio_base = 0x2000;
-		break;
-#if 0
-	case I915_EXEC_BSD:
-		mmio_base = 0x12000;
-		break;
-#endif
-	case I915_EXEC_BLT:
-		mmio_base = 0x22000;
-		break;
-
-#define GEN11_VECS0_BASE 0x1c8000
-#define GEN11_VECS1_BASE 0x1d8000
-	case I915_EXEC_VEBOX:
-		if (intel_gen(intel_get_drm_devid(i915)) >= 11)
-			mmio_base = GEN11_VECS0_BASE;
-		else
-			mmio_base = 0x1a000;
-		break;
-
-	default:
-		igt_skip("mmio base not known\n");
-	}
+	mmio_base = ring_base(i915, ring);
+	igt_require_f(mmio_base, "mmio base not known\n");
 
 	for (int n = 0; n < ARRAY_SIZE(spin); n++) {
 		const struct igt_spin_factory opts = {
diff --git a/tests/i915/gem_exec_latency.c b/tests/i915/gem_exec_latency.c
index 3d99182a0..d2159f317 100644
--- a/tests/i915/gem_exec_latency.c
+++ b/tests/i915/gem_exec_latency.c
@@ -109,7 +109,7 @@ poll_ring(int fd, unsigned ring, const char *name)
 	igt_spin_free(fd, spin[0]);
 }
 
-#define RCS_TIMESTAMP (0x2000 + 0x358)
+#define TIMESTAMP (0x358)
 static void latency_on_ring(int fd,
 			    unsigned ring, const char *name,
 			    unsigned flags)
@@ -119,6 +119,7 @@ static void latency_on_ring(int fd,
 	struct drm_i915_gem_exec_object2 obj[3];
 	struct drm_i915_gem_relocation_entry reloc;
 	struct drm_i915_gem_execbuffer2 execbuf;
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, name);
 	igt_spin_t *spin = NULL;
 	IGT_CORK_HANDLE(c);
 	volatile uint32_t *reg;
@@ -128,7 +129,8 @@ static void latency_on_ring(int fd,
 	double gpu_latency;
 	int i, j;
 
-	reg = (volatile uint32_t *)((volatile char *)igt_global_mmio + RCS_TIMESTAMP);
+	igt_require(mmio_base);
+	reg = (volatile uint32_t *)((volatile char *)igt_global_mmio + mmio_base + TIMESTAMP);
 
 	memset(&execbuf, 0, sizeof(execbuf));
 	execbuf.buffers_ptr = to_user_pointer(&obj[1]);
@@ -176,7 +178,7 @@ static void latency_on_ring(int fd,
 		map[i++] = 0x24 << 23 | 1;
 		if (has_64bit_reloc)
 			map[i-1]++;
-		map[i++] = RCS_TIMESTAMP; /* ring local! */
+		map[i++] = mmio_base + TIMESTAMP;
 		map[i++] = offset;
 		if (has_64bit_reloc)
 			map[i++] = offset >> 32;
@@ -266,11 +268,14 @@ static void latency_from_ring(int fd,
 	struct drm_i915_gem_exec_object2 obj[3];
 	struct drm_i915_gem_relocation_entry reloc;
 	struct drm_i915_gem_execbuffer2 execbuf;
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, name);
 	const unsigned int repeats = ring_size / 2;
 	uint32_t *map, *results;
 	uint32_t ctx[2] = {};
 	int i, j;
 
+	igt_require(mmio_base);
+
 	if (flags & PREEMPT) {
 		ctx[0] = gem_context_create(fd);
 		gem_context_set_priority(fd, ctx[0], -1023);
@@ -351,7 +356,7 @@ static void latency_from_ring(int fd,
 			map[i++] = 0x24 << 23 | 1;
 			if (has_64bit_reloc)
 				map[i-1]++;
-			map[i++] = RCS_TIMESTAMP; /* ring local! */
+			map[i++] = mmio_base + TIMESTAMP;
 			map[i++] = offset;
 			if (has_64bit_reloc)
 				map[i++] = offset >> 32;
@@ -376,7 +381,7 @@ static void latency_from_ring(int fd,
 			map[i++] = 0x24 << 23 | 1;
 			if (has_64bit_reloc)
 				map[i-1]++;
-			map[i++] = RCS_TIMESTAMP; /* ring local! */
+			map[i++] = mmio_base + TIMESTAMP;
 			map[i++] = offset;
 			if (has_64bit_reloc)
 				map[i++] = offset >> 32;
@@ -669,7 +674,7 @@ igt_main
 			ring_size = 1024;
 
 		intel_register_access_init(&mmio_data, intel_get_pci_device(), false, device);
-		rcs_clock = clockrate(device, RCS_TIMESTAMP);
+		rcs_clock = clockrate(device, 0x2000 + TIMESTAMP);
 		igt_info("RCS timestamp clock: %.0fKHz, %.1fns\n",
 			 rcs_clock / 1e3, 1e9 / rcs_clock);
 		rcs_clock = 1e9 / rcs_clock;
-- 
2.24.0
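
The lookup this patch adds in gem_engine_mmio_base() (read the kernel's
per-engine "mmio_base" property, otherwise fall back to a small table of
well-known bases) can be sketched outside of IGT as below. This is a minimal
sketch, not the library code: property_scanf(), fallback_mmio_base() and
engine_mmio_base() are hypothetical names, the sysfs root is passed in as a
path rather than obtained from igt_sysfs_open(), and the GPU generation is a
plain parameter. Unlike the patch, which only falls back when the scan returns
a negative value, the sketch also falls back when the file opens but does not
parse.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scanf() over "<root>/engine/<engine>/<attr>". */
static int property_scanf(const char *root, const char *engine,
			  const char *attr, const char *fmt, ...)
{
	char path[256];
	FILE *file;
	va_list ap;
	int ret;

	snprintf(path, sizeof(path), "%s/engine/%s/%s", root, engine, attr);
	file = fopen(path, "r");
	if (!file)
		return -1;

	va_start(ap, fmt);
	ret = vfscanf(file, fmt, ap); /* number of items converted, or EOF */
	va_end(ap);

	fclose(file);
	return ret;
}

/* Static bases copied from the patch; only the first instance is known. */
static uint32_t fallback_mmio_base(const char *engine, int gen)
{
	if (!strcmp(engine, "rcs0"))
		return 0x2000;
	if (!strcmp(engine, "bcs0"))
		return 0x22000;
	if (!strcmp(engine, "vcs0"))
		return gen < 6 ? 0x4000 : gen < 11 ? 0x12000 : 0x1c0000;
	if (!strcmp(engine, "vecs0"))
		return gen < 11 ? 0x1a000 : 0x1c8000;

	return 0; /* unknown layout: callers treat 0 as "skip" */
}

static uint32_t engine_mmio_base(const char *root, const char *engine, int gen)
{
	unsigned int mmio = 0;

	assert(root && engine);

	/* Prefer what the kernel reports; guess only if nothing was read. */
	if (property_scanf(root, engine, "mmio_base", "%x", &mmio) < 1)
		mmio = fallback_mmio_base(engine, gen);

	return mmio;
}
```

A caller would add a register offset to the result, as gem_exec_latency does
with mmio_base + TIMESTAMP, and skip the test when the result is 0.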


* [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine relative registers
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

Some of the non-privileged registers sit at the same offset from the mmio
base of each engine. We can improve our coverage of unknown HW layouts by
adding the reported engine->mmio_base to such relative offsets.
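
As an illustration of the relative-offset matching described above, the
following is a minimal sketch assuming a cut-down table; struct reg_desc,
reg_base(), find_reg() and example_table are hypothetical names and are not
part of the patch, which instead adds a .relative flag to the existing
struct named_register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-in for the test's register table entries. */
struct reg_desc {
	const char *name;
	uint32_t offset; /* absolute, or relative to the engine's mmio base */
	uint32_t count;  /* number of dwords; 0 is treated as 1 */
	bool relative;
};

/* Resolve an entry to the absolute offset it occupies on this engine. */
static uint32_t reg_base(const struct reg_desc *r, uint32_t mmio_base)
{
	return r->relative ? mmio_base + r->offset : r->offset;
}

/* Find the entry covering an absolute offset, or NULL if none matches. */
static const struct reg_desc *
find_reg(const struct reg_desc *table, uint32_t offset, uint32_t mmio_base)
{
	for (const struct reg_desc *r = table; r->name; r++) {
		uint32_t width = 4 * (r->count ? r->count : 1);
		uint32_t base = reg_base(r, mmio_base);

		if (offset >= base && offset < base + width)
			return r;
	}

	return NULL;
}

static const struct reg_desc example_table[] = {
	{ "NOPID", 0x2094, 0, false },  /* absolute, render engine only */
	{ "xCS_GPR", 0x600, 32, true }, /* 32 dwords on every engine */
	{ NULL, 0, 0, false },
};
```

With a blitter base of 0x22000, find_reg(example_table, 0x22600, 0x22000)
resolves to the single xCS_GPR entry, which is how one relative row stands in
for the per-engine rows the patch deletes. The patch also skips relative
entries when no mmio base is known, since base 0 would alias unrelated
registers.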

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/i915/gem_ctx_isolation.c | 164 ++++++++++++++++++++-------------
 1 file changed, 100 insertions(+), 64 deletions(-)

diff --git a/tests/i915/gem_ctx_isolation.c b/tests/i915/gem_ctx_isolation.c
index 6aa27133c..546ffac3a 100644
--- a/tests/i915/gem_ctx_isolation.c
+++ b/tests/i915/gem_ctx_isolation.c
@@ -70,6 +70,7 @@ static const struct named_register {
 	uint32_t ignore_bits;
 	uint32_t write_mask; /* some registers bits do not exist */
 	bool masked;
+	bool relative;
 } nonpriv_registers[] = {
 	{ "NOPID", NOCTX, RCS0, 0x2094 },
 	{ "MI_PREDICATE_RESULT_2", NOCTX, RCS0, 0x23bc },
@@ -109,7 +110,6 @@ static const struct named_register {
 	{ "PS_DEPTH_COUNT_1", GEN8, RCS0, 0x22f8, 2 },
 	{ "BB_OFFSET", GEN8, RCS0, 0x2158, .ignore_bits = 0x7 },
 	{ "MI_PREDICATE_RESULT_1", GEN8, RCS0, 0x241c },
-	{ "CS_GPR", GEN8, RCS0, 0x2600, 32 },
 	{ "OA_CTX_CONTROL", GEN8, RCS0, 0x2360 },
 	{ "OACTXID", GEN8, RCS0, 0x2364 },
 	{ "PS_INVOCATION_COUNT_2", GEN8, RCS0, 0x2448, 2, .write_mask = ~0x3 },
@@ -138,79 +138,56 @@ static const struct named_register {
 
 	{ "CTX_PREEMPT", NOCTX /* GEN10 */, RCS0, 0x2248 },
 	{ "CS_CHICKEN1", GEN11, RCS0, 0x2580, .masked = true },
-	{ "HDC_CHICKEN1", GEN_RANGE(10, 10), RCS0, 0x7304, .masked = true },
 
 	/* Privileged (enabled by w/a + FORCE_TO_NONPRIV) */
 	{ "CTX_PREEMPT", NOCTX /* GEN9 */, RCS0, 0x2248 },
 	{ "CS_CHICKEN1", GEN_RANGE(9, 10), RCS0, 0x2580, .masked = true },
 	{ "COMMON_SLICE_CHICKEN2", GEN_RANGE(9, 9), RCS0, 0x7014, .masked = true },
-	{ "HDC_CHICKEN1", GEN_RANGE(9, 9), RCS0, 0x7304, .masked = true },
+	{ "HDC_CHICKEN1", GEN_RANGE(9, 10), RCS0, 0x7304, .masked = true },
 	{ "SLICE_COMMON_ECO_CHICKEN1", GEN_RANGE(11, 11) /* + glk */, RCS0,  0x731c, .masked = true },
 	{ "L3SQREG4", NOCTX /* GEN9:skl,kbl */, RCS0, 0xb118, .write_mask = ~0x1ffff0 },
 	{ "HALF_SLICE_CHICKEN7", GEN_RANGE(11, 11), RCS0, 0xe194, .masked = true },
 	{ "SAMPLER_MODE", GEN_RANGE(11, 11), RCS0, 0xe18c, .masked = true },
 
-	{ "BCS_GPR", GEN9, BCS0, 0x22600, 32 },
 	{ "BCS_SWCTRL", GEN8, BCS0, 0x22200, .write_mask = 0x3, .masked = true },
 
 	{ "MFC_VDBOX1", NOCTX, VCS0, 0x12800, 64 },
 	{ "MFC_VDBOX2", NOCTX, VCS1, 0x1c800, 64 },
 
-	{ "VCS0_GPR", GEN_RANGE(9, 10), VCS0, 0x12600, 32 },
-	{ "VCS1_GPR", GEN_RANGE(9, 10), VCS1, 0x1c600, 32 },
-	{ "VECS_GPR", GEN_RANGE(9, 10), VECS0, 0x1a600, 32 },
-
-	{ "VCS0_GPR", GEN11, VCS0, 0x1c0600, 32 },
-	{ "VCS1_GPR", GEN11, VCS1, 0x1c4600, 32 },
-	{ "VCS2_GPR", GEN11, VCS2, 0x1d0600, 32 },
-	{ "VCS3_GPR", GEN11, VCS3, 0x1d4600, 32 },
-	{ "VECS_GPR", GEN11, VECS0, 0x1c8600, 32 },
+	{ "xCS_GPR", GEN9, ALL, 0x600, 32, .relative = true },
 
 	{}
 }, ignore_registers[] = {
 	{ "RCS timestamp", GEN6, ~0u, 0x2358 },
 	{ "BCS timestamp", GEN7, ~0u, 0x22358 },
 
-	{ "VCS0 timestamp", GEN_RANGE(7, 10), ~0u, 0x12358 },
-	{ "VCS1 timestamp", GEN_RANGE(7, 10), ~0u, 0x1c358 },
-	{ "VECS timestamp", GEN_RANGE(8, 10), ~0u, 0x1a358 },
-
-	{ "VCS0 timestamp", GEN11, ~0u, 0x1c0358 },
-	{ "VCS1 timestamp", GEN11, ~0u, 0x1c4358 },
-	{ "VCS2 timestamp", GEN11, ~0u, 0x1d0358 },
-	{ "VCS3 timestamp", GEN11, ~0u, 0x1d4358 },
-	{ "VECS timestamp", GEN11, ~0u, 0x1c8358 },
+	{ "xCS timestamp", GEN8, ALL, 0x358, .relative = true },
 
 	/* huc read only */
-	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x2000 },
-	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x2014 },
-	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x23b0 },
-
-	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x2000 },
-	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x2014 },
-	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x23b0 },
-
-	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x2000 },
-	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x2014 },
-	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x23b0 },
-
-	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x2000 },
-	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x2014 },
-	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x23b0 },
+	{ "BSD 0x2000", GEN11, ALL, 0x2000, .relative = true },
+	{ "BSD 0x2014", GEN11, ALL, 0x2014, .relative = true },
+	{ "BSD 0x23b0", GEN11, ALL, 0x23b0, .relative = true },
 
 	{}
 };
 
-static const char *register_name(uint32_t offset, char *buf, size_t len)
+static const char *
+register_name(uint32_t offset, uint32_t mmio_base, char *buf, size_t len)
 {
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
 		unsigned int width = r->count ? 4*r->count : 4;
-		if (offset >= r->offset && offset < r->offset + width) {
+		uint32_t base;
+
+		base = r->offset;
+		if (r->relative)
+			base += mmio_base;
+
+		if (offset >= base && offset < base + width) {
 			if (r->count <= 1)
 				return r->name;
 
 			snprintf(buf, len, "%s[%d]",
-				 r->name, (offset - r->offset)/4);
+				 r->name, (offset - base) / 4);
 			return buf;
 		}
 	}
@@ -218,22 +195,35 @@ static const char *register_name(uint32_t offset, char *buf, size_t len)
 	return "unknown";
 }
 
-static const struct named_register *lookup_register(uint32_t offset)
+static const struct named_register *
+lookup_register(uint32_t offset, uint32_t mmio_base)
 {
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
 		unsigned int width = r->count ? 4*r->count : 4;
-		if (offset >= r->offset && offset < r->offset + width)
+		uint32_t base;
+
+		base = r->offset;
+		if (r->relative)
+			base += mmio_base;
+
+		if (offset >= base && offset < base + width)
 			return r;
 	}
 
 	return NULL;
 }
 
-static bool ignore_register(uint32_t offset)
+static bool ignore_register(uint32_t offset, uint32_t mmio_base)
 {
 	for (const struct named_register *r = ignore_registers; r->name; r++) {
 		unsigned int width = r->count ? 4*r->count : 4;
-		if (offset >= r->offset && offset < r->offset + width)
+		uint32_t base;
+
+		base = r->offset;
+		if (r->relative)
+			base += mmio_base;
+
+		if (offset >= base && offset < base + width)
 			return true;
 	}
 
@@ -248,6 +238,7 @@ static void tmpl_regs(int fd,
 {
 	const unsigned int gen_bit = 1 << intel_gen(intel_get_drm_devid(fd));
 	const unsigned int engine_bit = ENGINE(e->class, e->instance);
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
 	unsigned int regs_size;
 	uint32_t *regs;
 
@@ -259,12 +250,20 @@ static void tmpl_regs(int fd,
 		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
 
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
+		uint32_t offset;
+
 		if (!(r->engine_mask & engine_bit))
 			continue;
 		if (!(r->gen_mask & gen_bit))
 			continue;
-		for (unsigned count = r->count ?: 1, offset = r->offset;
-		     count--; offset += 4) {
+		if (r->relative && !mmio_base)
+			continue;
+
+		offset = r->offset;
+		if (r->relative)
+			offset += mmio_base;
+
+		for (unsigned count = r->count ?: 1; count--; offset += 4) {
 			uint32_t x = value;
 			if (r->write_mask)
 				x &= r->write_mask;
@@ -284,6 +283,7 @@ static uint32_t read_regs(int fd,
 	const unsigned int gen = intel_gen(intel_get_drm_devid(fd));
 	const unsigned int gen_bit = 1 << gen;
 	const unsigned int engine_bit = ENGINE(e->class, e->instance);
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
 	const bool r64b = gen >= 8;
 	struct drm_i915_gem_exec_object2 obj[2];
 	struct drm_i915_gem_relocation_entry *reloc;
@@ -311,13 +311,20 @@ static uint32_t read_regs(int fd,
 
 	n = 0;
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
+		uint32_t offset;
+
 		if (!(r->engine_mask & engine_bit))
 			continue;
 		if (!(r->gen_mask & gen_bit))
 			continue;
+		if (r->relative && !mmio_base)
+			continue;
+
+		offset = r->offset;
+		if (r->relative)
+			offset += mmio_base;
 
-		for (unsigned count = r->count ?: 1, offset = r->offset;
-		     count--; offset += 4) {
+		for (unsigned count = r->count ?: 1; count--; offset += 4) {
 			*b++ = 0x24 << 23 | (1 + r64b); /* SRM */
 			*b++ = offset;
 			reloc[n].target_handle = obj[0].handle;
@@ -357,6 +364,7 @@ static void write_regs(int fd,
 {
 	const unsigned int gen_bit = 1 << intel_gen(intel_get_drm_devid(fd));
 	const unsigned int engine_bit = ENGINE(e->class, e->instance);
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
 	struct drm_i915_gem_exec_object2 obj;
 	struct drm_i915_gem_execbuffer2 execbuf;
 	unsigned int batch_size;
@@ -372,12 +380,20 @@ static void write_regs(int fd,
 	gem_set_domain(fd, obj.handle,
 		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
+		uint32_t offset;
+
 		if (!(r->engine_mask & engine_bit))
 			continue;
 		if (!(r->gen_mask & gen_bit))
 			continue;
-		for (unsigned count = r->count ?: 1, offset = r->offset;
-		     count--; offset += 4) {
+		if (r->relative && !mmio_base)
+			continue;
+
+		offset = r->offset;
+		if (r->relative)
+			offset += mmio_base;
+
+		for (unsigned count = r->count ?: 1; count--; offset += 4) {
 			uint32_t x = value;
 			if (r->write_mask)
 				x &= r->write_mask;
@@ -410,6 +426,7 @@ static void restore_regs(int fd,
 	const unsigned int gen = intel_gen(intel_get_drm_devid(fd));
 	const unsigned int gen_bit = 1 << gen;
 	const unsigned int engine_bit = ENGINE(e->class, e->instance);
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
 	const bool r64b = gen >= 8;
 	struct drm_i915_gem_exec_object2 obj[2];
 	struct drm_i915_gem_execbuffer2 execbuf;
@@ -437,13 +454,20 @@ static void restore_regs(int fd,
 
 	n = 0;
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
+		uint32_t offset;
+
 		if (!(r->engine_mask & engine_bit))
 			continue;
 		if (!(r->gen_mask & gen_bit))
 			continue;
+		if (r->relative && !mmio_base)
+			continue;
+
+		offset = r->offset;
+		if (r->relative)
+			offset += mmio_base;
 
-		for (unsigned count = r->count ?: 1, offset = r->offset;
-		     count--; offset += 4) {
+		for (unsigned count = r->count ?: 1; count--; offset += 4) {
 			*b++ = 0x29 << 23 | (1 + r64b); /* LRM */
 			*b++ = offset;
 			reloc[n].target_handle = obj[0].handle;
@@ -479,6 +503,7 @@ static void dump_regs(int fd,
 	const int gen = intel_gen(intel_get_drm_devid(fd));
 	const unsigned int gen_bit = 1 << gen;
 	const unsigned int engine_bit = ENGINE(e->class, e->instance);
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
 	unsigned int regs_size;
 	uint32_t *out;
 
@@ -489,26 +514,36 @@ static void dump_regs(int fd,
 	gem_set_domain(fd, regs, I915_GEM_DOMAIN_CPU, 0);
 
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
+		uint32_t offset;
+
 		if (!(r->engine_mask & engine_bit))
 			continue;
 		if (!(r->gen_mask & gen_bit))
 			continue;
+		if (r->relative && !mmio_base)
+			continue;
+
+		offset = r->offset;
+		if (r->relative)
+			offset += mmio_base;
 
 		if (r->count <= 1) {
 			igt_debug("0x%04x (%s): 0x%08x\n",
-				  r->offset, r->name, out[r->offset/4]);
+				  offset, r->name, out[offset / 4]);
 		} else {
 			for (unsigned x = 0; x < r->count; x++)
 				igt_debug("0x%04x (%s[%d]): 0x%08x\n",
-					  r->offset+4*x, r->name, x,
-					  out[r->offset/4 + x]);
+					  offset + 4 * x, r->name, x,
+					  out[offset / 4 + x]);
 		}
 	}
 	munmap(out, regs_size);
 }
 
-static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
+static void compare_regs(int fd, const struct intel_execution_engine2 *e,
+			 uint32_t A, uint32_t B, const char *who)
 {
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
 	unsigned int num_errors;
 	unsigned int regs_size;
 	uint32_t *a, *b;
@@ -532,11 +567,11 @@ static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
 		if (a[n] == b[n])
 			continue;
 
-		if (ignore_register(offset))
+		if (ignore_register(offset, mmio_base))
 			continue;
 
 		mask = ~0u;
-		r = lookup_register(offset);
+		r = lookup_register(offset, mmio_base);
 		if (r && r->masked)
 			mask >>= 16;
 		if (r && r->ignore_bits)
@@ -547,7 +582,7 @@ static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
 
 		igt_warn("Register 0x%04x (%s): A=%08x B=%08x\n",
 			 offset,
-			 register_name(offset, buf, sizeof(buf)),
+			 register_name(offset, mmio_base, buf, sizeof(buf)),
 			 a[n] & mask, b[n] & mask);
 		num_errors++;
 	}
@@ -638,7 +673,7 @@ static void nonpriv(int fd,
 
 		igt_spin_free(fd, spin);
 
-		compare_regs(fd, tmpl, regs[1], "nonpriv read/writes");
+		compare_regs(fd, e, tmpl, regs[1], "nonpriv read/writes");
 
 		for (int n = 0; n < ARRAY_SIZE(regs); n++)
 			gem_close(fd, regs[n]);
@@ -708,8 +743,9 @@ static void isolation(int fd,
 		igt_spin_free(fd, spin);
 
 		if (!(flags & DIRTY1))
-			compare_regs(fd, regs[0], tmp, "two reads of the same ctx");
-		compare_regs(fd, regs[0], regs[1], "two virgin contexts");
+			compare_regs(fd, e, regs[0], tmp,
+				     "two reads of the same ctx");
+		compare_regs(fd, e, regs[0], regs[1], "two virgin contexts");
 
 		for (int n = 0; n < ARRAY_SIZE(ctx); n++) {
 			gem_close(fd, regs[n]);
@@ -829,13 +865,13 @@ static void preservation(int fd,
 		char buf[80];
 
 		snprintf(buf, sizeof(buf), "dirty %x context\n", values[v]);
-		compare_regs(fd, regs[v][0], regs[v][1], buf);
+		compare_regs(fd, e, regs[v][0], regs[v][1], buf);
 
 		gem_close(fd, regs[v][0]);
 		gem_close(fd, regs[v][1]);
 		gem_context_destroy(fd, ctx[v]);
 	}
-	compare_regs(fd, regs[num_values][0], regs[num_values][1], "clean");
+	compare_regs(fd, e, regs[num_values][0], regs[num_values][1], "clean");
 	gem_context_destroy(fd, ctx[num_values]);
 }
 
-- 
2.24.0


* [Intel-gfx] [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine relative registers
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

Some of the non-privileged registers sit at the same offset from the mmio
base of each engine. We can improve our coverage of unknown HW layouts by
adding the reported engine->mmio_base to such relative offsets.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/i915/gem_ctx_isolation.c | 164 ++++++++++++++++++++-------------
 1 file changed, 100 insertions(+), 64 deletions(-)

diff --git a/tests/i915/gem_ctx_isolation.c b/tests/i915/gem_ctx_isolation.c
index 6aa27133c..546ffac3a 100644
--- a/tests/i915/gem_ctx_isolation.c
+++ b/tests/i915/gem_ctx_isolation.c
@@ -70,6 +70,7 @@ static const struct named_register {
 	uint32_t ignore_bits;
 	uint32_t write_mask; /* some registers bits do not exist */
 	bool masked;
+	bool relative;
 } nonpriv_registers[] = {
 	{ "NOPID", NOCTX, RCS0, 0x2094 },
 	{ "MI_PREDICATE_RESULT_2", NOCTX, RCS0, 0x23bc },
@@ -109,7 +110,6 @@ static const struct named_register {
 	{ "PS_DEPTH_COUNT_1", GEN8, RCS0, 0x22f8, 2 },
 	{ "BB_OFFSET", GEN8, RCS0, 0x2158, .ignore_bits = 0x7 },
 	{ "MI_PREDICATE_RESULT_1", GEN8, RCS0, 0x241c },
-	{ "CS_GPR", GEN8, RCS0, 0x2600, 32 },
 	{ "OA_CTX_CONTROL", GEN8, RCS0, 0x2360 },
 	{ "OACTXID", GEN8, RCS0, 0x2364 },
 	{ "PS_INVOCATION_COUNT_2", GEN8, RCS0, 0x2448, 2, .write_mask = ~0x3 },
@@ -138,79 +138,56 @@ static const struct named_register {
 
 	{ "CTX_PREEMPT", NOCTX /* GEN10 */, RCS0, 0x2248 },
 	{ "CS_CHICKEN1", GEN11, RCS0, 0x2580, .masked = true },
-	{ "HDC_CHICKEN1", GEN_RANGE(10, 10), RCS0, 0x7304, .masked = true },
 
 	/* Privileged (enabled by w/a + FORCE_TO_NONPRIV) */
 	{ "CTX_PREEMPT", NOCTX /* GEN9 */, RCS0, 0x2248 },
 	{ "CS_CHICKEN1", GEN_RANGE(9, 10), RCS0, 0x2580, .masked = true },
 	{ "COMMON_SLICE_CHICKEN2", GEN_RANGE(9, 9), RCS0, 0x7014, .masked = true },
-	{ "HDC_CHICKEN1", GEN_RANGE(9, 9), RCS0, 0x7304, .masked = true },
+	{ "HDC_CHICKEN1", GEN_RANGE(9, 10), RCS0, 0x7304, .masked = true },
 	{ "SLICE_COMMON_ECO_CHICKEN1", GEN_RANGE(11, 11) /* + glk */, RCS0,  0x731c, .masked = true },
 	{ "L3SQREG4", NOCTX /* GEN9:skl,kbl */, RCS0, 0xb118, .write_mask = ~0x1ffff0 },
 	{ "HALF_SLICE_CHICKEN7", GEN_RANGE(11, 11), RCS0, 0xe194, .masked = true },
 	{ "SAMPLER_MODE", GEN_RANGE(11, 11), RCS0, 0xe18c, .masked = true },
 
-	{ "BCS_GPR", GEN9, BCS0, 0x22600, 32 },
 	{ "BCS_SWCTRL", GEN8, BCS0, 0x22200, .write_mask = 0x3, .masked = true },
 
 	{ "MFC_VDBOX1", NOCTX, VCS0, 0x12800, 64 },
 	{ "MFC_VDBOX2", NOCTX, VCS1, 0x1c800, 64 },
 
-	{ "VCS0_GPR", GEN_RANGE(9, 10), VCS0, 0x12600, 32 },
-	{ "VCS1_GPR", GEN_RANGE(9, 10), VCS1, 0x1c600, 32 },
-	{ "VECS_GPR", GEN_RANGE(9, 10), VECS0, 0x1a600, 32 },
-
-	{ "VCS0_GPR", GEN11, VCS0, 0x1c0600, 32 },
-	{ "VCS1_GPR", GEN11, VCS1, 0x1c4600, 32 },
-	{ "VCS2_GPR", GEN11, VCS2, 0x1d0600, 32 },
-	{ "VCS3_GPR", GEN11, VCS3, 0x1d4600, 32 },
-	{ "VECS_GPR", GEN11, VECS0, 0x1c8600, 32 },
+	{ "xCS_GPR", GEN9, ALL, 0x600, 32, .relative = true },
 
 	{}
 }, ignore_registers[] = {
 	{ "RCS timestamp", GEN6, ~0u, 0x2358 },
 	{ "BCS timestamp", GEN7, ~0u, 0x22358 },
 
-	{ "VCS0 timestamp", GEN_RANGE(7, 10), ~0u, 0x12358 },
-	{ "VCS1 timestamp", GEN_RANGE(7, 10), ~0u, 0x1c358 },
-	{ "VECS timestamp", GEN_RANGE(8, 10), ~0u, 0x1a358 },
-
-	{ "VCS0 timestamp", GEN11, ~0u, 0x1c0358 },
-	{ "VCS1 timestamp", GEN11, ~0u, 0x1c4358 },
-	{ "VCS2 timestamp", GEN11, ~0u, 0x1d0358 },
-	{ "VCS3 timestamp", GEN11, ~0u, 0x1d4358 },
-	{ "VECS timestamp", GEN11, ~0u, 0x1c8358 },
+	{ "xCS timestamp", GEN8, ALL, 0x358, .relative = true },
 
 	/* huc read only */
-	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x2000 },
-	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x2014 },
-	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x23b0 },
-
-	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x2000 },
-	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x2014 },
-	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x23b0 },
-
-	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x2000 },
-	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x2014 },
-	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x23b0 },
-
-	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x2000 },
-	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x2014 },
-	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x23b0 },
+	{ "BSD 0x2000", GEN11, ALL, 0x2000, .relative = true },
+	{ "BSD 0x2014", GEN11, ALL, 0x2014, .relative = true },
+	{ "BSD 0x23b0", GEN11, ALL, 0x23b0, .relative = true },
 
 	{}
 };
 
-static const char *register_name(uint32_t offset, char *buf, size_t len)
+static const char *
+register_name(uint32_t offset, uint32_t mmio_base, char *buf, size_t len)
 {
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
 		unsigned int width = r->count ? 4*r->count : 4;
-		if (offset >= r->offset && offset < r->offset + width) {
+		uint32_t base;
+
+		base = r->offset;
+		if (r->relative)
+			base += mmio_base;
+
+		if (offset >= base && offset < base + width) {
 			if (r->count <= 1)
 				return r->name;
 
 			snprintf(buf, len, "%s[%d]",
-				 r->name, (offset - r->offset)/4);
+				 r->name, (offset - base) / 4);
 			return buf;
 		}
 	}
@@ -218,22 +195,35 @@ static const char *register_name(uint32_t offset, char *buf, size_t len)
 	return "unknown";
 }
 
-static const struct named_register *lookup_register(uint32_t offset)
+static const struct named_register *
+lookup_register(uint32_t offset, uint32_t mmio_base)
 {
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
 		unsigned int width = r->count ? 4*r->count : 4;
-		if (offset >= r->offset && offset < r->offset + width)
+		uint32_t base;
+
+		base = r->offset;
+		if (r->relative)
+			base += mmio_base;
+
+		if (offset >= base && offset < base + width)
 			return r;
 	}
 
 	return NULL;
 }
 
-static bool ignore_register(uint32_t offset)
+static bool ignore_register(uint32_t offset, uint32_t mmio_base)
 {
 	for (const struct named_register *r = ignore_registers; r->name; r++) {
 		unsigned int width = r->count ? 4*r->count : 4;
-		if (offset >= r->offset && offset < r->offset + width)
+		uint32_t base;
+
+		base = r->offset;
+		if (r->relative)
+			base += mmio_base;
+
+		if (offset >= base && offset < base + width)
 			return true;
 	}
 
@@ -248,6 +238,7 @@ static void tmpl_regs(int fd,
 {
 	const unsigned int gen_bit = 1 << intel_gen(intel_get_drm_devid(fd));
 	const unsigned int engine_bit = ENGINE(e->class, e->instance);
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
 	unsigned int regs_size;
 	uint32_t *regs;
 
@@ -259,12 +250,20 @@ static void tmpl_regs(int fd,
 		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
 
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
+		uint32_t offset;
+
 		if (!(r->engine_mask & engine_bit))
 			continue;
 		if (!(r->gen_mask & gen_bit))
 			continue;
-		for (unsigned count = r->count ?: 1, offset = r->offset;
-		     count--; offset += 4) {
+		if (r->relative && !mmio_base)
+			continue;
+
+		offset = r->offset;
+		if (r->relative)
+			offset += mmio_base;
+
+		for (unsigned count = r->count ?: 1; count--; offset += 4) {
 			uint32_t x = value;
 			if (r->write_mask)
 				x &= r->write_mask;
@@ -284,6 +283,7 @@ static uint32_t read_regs(int fd,
 	const unsigned int gen = intel_gen(intel_get_drm_devid(fd));
 	const unsigned int gen_bit = 1 << gen;
 	const unsigned int engine_bit = ENGINE(e->class, e->instance);
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
 	const bool r64b = gen >= 8;
 	struct drm_i915_gem_exec_object2 obj[2];
 	struct drm_i915_gem_relocation_entry *reloc;
@@ -311,13 +311,20 @@ static uint32_t read_regs(int fd,
 
 	n = 0;
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
+		uint32_t offset;
+
 		if (!(r->engine_mask & engine_bit))
 			continue;
 		if (!(r->gen_mask & gen_bit))
 			continue;
+		if (r->relative && !mmio_base)
+			continue;
+
+		offset = r->offset;
+		if (r->relative)
+			offset += mmio_base;
 
-		for (unsigned count = r->count ?: 1, offset = r->offset;
-		     count--; offset += 4) {
+		for (unsigned count = r->count ?: 1; count--; offset += 4) {
 			*b++ = 0x24 << 23 | (1 + r64b); /* SRM */
 			*b++ = offset;
 			reloc[n].target_handle = obj[0].handle;
@@ -357,6 +364,7 @@ static void write_regs(int fd,
 {
 	const unsigned int gen_bit = 1 << intel_gen(intel_get_drm_devid(fd));
 	const unsigned int engine_bit = ENGINE(e->class, e->instance);
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
 	struct drm_i915_gem_exec_object2 obj;
 	struct drm_i915_gem_execbuffer2 execbuf;
 	unsigned int batch_size;
@@ -372,12 +380,20 @@ static void write_regs(int fd,
 	gem_set_domain(fd, obj.handle,
 		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
+		uint32_t offset;
+
 		if (!(r->engine_mask & engine_bit))
 			continue;
 		if (!(r->gen_mask & gen_bit))
 			continue;
-		for (unsigned count = r->count ?: 1, offset = r->offset;
-		     count--; offset += 4) {
+		if (r->relative && !mmio_base)
+			continue;
+
+		offset = r->offset;
+		if (r->relative)
+			offset += mmio_base;
+
+		for (unsigned count = r->count ?: 1; count--; offset += 4) {
 			uint32_t x = value;
 			if (r->write_mask)
 				x &= r->write_mask;
@@ -410,6 +426,7 @@ static void restore_regs(int fd,
 	const unsigned int gen = intel_gen(intel_get_drm_devid(fd));
 	const unsigned int gen_bit = 1 << gen;
 	const unsigned int engine_bit = ENGINE(e->class, e->instance);
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
 	const bool r64b = gen >= 8;
 	struct drm_i915_gem_exec_object2 obj[2];
 	struct drm_i915_gem_execbuffer2 execbuf;
@@ -437,13 +454,20 @@ static void restore_regs(int fd,
 
 	n = 0;
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
+		uint32_t offset;
+
 		if (!(r->engine_mask & engine_bit))
 			continue;
 		if (!(r->gen_mask & gen_bit))
 			continue;
+		if (r->relative && !mmio_base)
+			continue;
+
+		offset = r->offset;
+		if (r->relative)
+			offset += mmio_base;
 
-		for (unsigned count = r->count ?: 1, offset = r->offset;
-		     count--; offset += 4) {
+		for (unsigned count = r->count ?: 1; count--; offset += 4) {
 			*b++ = 0x29 << 23 | (1 + r64b); /* LRM */
 			*b++ = offset;
 			reloc[n].target_handle = obj[0].handle;
@@ -479,6 +503,7 @@ static void dump_regs(int fd,
 	const int gen = intel_gen(intel_get_drm_devid(fd));
 	const unsigned int gen_bit = 1 << gen;
 	const unsigned int engine_bit = ENGINE(e->class, e->instance);
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
 	unsigned int regs_size;
 	uint32_t *out;
 
@@ -489,26 +514,36 @@ static void dump_regs(int fd,
 	gem_set_domain(fd, regs, I915_GEM_DOMAIN_CPU, 0);
 
 	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
+		uint32_t offset;
+
 		if (!(r->engine_mask & engine_bit))
 			continue;
 		if (!(r->gen_mask & gen_bit))
 			continue;
+		if (r->relative && !mmio_base)
+			continue;
+
+		offset = r->offset;
+		if (r->relative)
+			offset += mmio_base;
 
 		if (r->count <= 1) {
 			igt_debug("0x%04x (%s): 0x%08x\n",
-				  r->offset, r->name, out[r->offset/4]);
+				  offset, r->name, out[offset / 4]);
 		} else {
 			for (unsigned x = 0; x < r->count; x++)
 				igt_debug("0x%04x (%s[%d]): 0x%08x\n",
-					  r->offset+4*x, r->name, x,
-					  out[r->offset/4 + x]);
+					  offset + 4 * x, r->name, x,
+					  out[offset / 4 + x]);
 		}
 	}
 	munmap(out, regs_size);
 }
 
-static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
+static void compare_regs(int fd, const struct intel_execution_engine2 *e,
+			 uint32_t A, uint32_t B, const char *who)
 {
+	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
 	unsigned int num_errors;
 	unsigned int regs_size;
 	uint32_t *a, *b;
@@ -532,11 +567,11 @@ static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
 		if (a[n] == b[n])
 			continue;
 
-		if (ignore_register(offset))
+		if (ignore_register(offset, mmio_base))
 			continue;
 
 		mask = ~0u;
-		r = lookup_register(offset);
+		r = lookup_register(offset, mmio_base);
 		if (r && r->masked)
 			mask >>= 16;
 		if (r && r->ignore_bits)
@@ -547,7 +582,7 @@ static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
 
 		igt_warn("Register 0x%04x (%s): A=%08x B=%08x\n",
 			 offset,
-			 register_name(offset, buf, sizeof(buf)),
+			 register_name(offset, mmio_base, buf, sizeof(buf)),
 			 a[n] & mask, b[n] & mask);
 		num_errors++;
 	}
@@ -638,7 +673,7 @@ static void nonpriv(int fd,
 
 		igt_spin_free(fd, spin);
 
-		compare_regs(fd, tmpl, regs[1], "nonpriv read/writes");
+		compare_regs(fd, e, tmpl, regs[1], "nonpriv read/writes");
 
 		for (int n = 0; n < ARRAY_SIZE(regs); n++)
 			gem_close(fd, regs[n]);
@@ -708,8 +743,9 @@ static void isolation(int fd,
 		igt_spin_free(fd, spin);
 
 		if (!(flags & DIRTY1))
-			compare_regs(fd, regs[0], tmp, "two reads of the same ctx");
-		compare_regs(fd, regs[0], regs[1], "two virgin contexts");
+			compare_regs(fd, e, regs[0], tmp,
+				     "two reads of the same ctx");
+		compare_regs(fd, e, regs[0], regs[1], "two virgin contexts");
 
 		for (int n = 0; n < ARRAY_SIZE(ctx); n++) {
 			gem_close(fd, regs[n]);
@@ -829,13 +865,13 @@ static void preservation(int fd,
 		char buf[80];
 
 		snprintf(buf, sizeof(buf), "dirty %x context\n", values[v]);
-		compare_regs(fd, regs[v][0], regs[v][1], buf);
+		compare_regs(fd, e, regs[v][0], regs[v][1], buf);
 
 		gem_close(fd, regs[v][0]);
 		gem_close(fd, regs[v][1]);
 		gem_context_destroy(fd, ctx[v]);
 	}
-	compare_regs(fd, regs[num_values][0], regs[num_values][1], "clean");
+	compare_regs(fd, e, regs[num_values][0], regs[num_values][1], "clean");
 	gem_context_destroy(fd, ctx[num_values]);
 }
 
-- 
2.24.0
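
For readers skimming the hunks above, the relative-offset lookup can be
sketched in isolation. This is only an illustration under stated
assumptions: struct reg_entry, reg_base() and reg_lookup() are
hypothetical names that mirror the shape of named_register and
lookup_register() in the diff; they are not part of IGT.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct named_register. */
struct reg_entry {
	const char *name;	/* NULL terminates the table */
	uint32_t offset;	/* absolute, or relative to the engine base */
	unsigned int count;	/* number of dwords; 0 means a single dword */
	bool relative;		/* add the engine's mmio base if set */
};

/* Resolve the absolute start of an entry for a given engine mmio base. */
static uint32_t reg_base(const struct reg_entry *r, uint32_t mmio_base)
{
	return r->relative ? r->offset + mmio_base : r->offset;
}

/* Find the entry covering an absolute offset, or NULL if none matches. */
static const struct reg_entry *
reg_lookup(const struct reg_entry *table, uint32_t offset, uint32_t mmio_base)
{
	for (const struct reg_entry *r = table; r->name; r++) {
		uint32_t width = r->count ? 4 * r->count : 4;
		uint32_t base = reg_base(r, mmio_base);

		if (offset >= base && offset < base + width)
			return r;
	}

	return NULL;
}
```

With an engine base of 0x1d4000 (the value hard-coded in the removed
BSD3 entries), a relative entry at 0x2014 would match the absolute
offset 0x1d4000 + 0x2014 and nothing at plain 0x2014.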

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH i-g-t 6/9] i915: Exercise preemption timeout controls in sysfs
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

We [will] expose various per-engine scheduling controls. One of them,
'preempt_timeout_ms', defines how long we wait for a preemption request
to be honoured by the currently executing context. If that context fails
to relinquish the GPU within the timeout, the engine is reset and the
miscreant forcibly evicted.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
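The pass/fail criterion used by the timeout subtest can be sketched on
its own. This is a minimal illustration, assuming the elapsed time is
measured in nanoseconds and truncated to whole milliseconds as in
test_timeout(); preempt_delay_ok() is a hypothetical name, not an IGT
helper.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the measured delay stays strictly below
 * the programmed timeout plus the allowed slack, both in milliseconds.
 */
static bool preempt_delay_ok(uint64_t elapsed_ns, unsigned int timeout_ms,
			     unsigned int slack_ms)
{
	/* Truncating division, matching elapsed / 1000 / 1000 in the test. */
	uint64_t elapsed_ms = elapsed_ns / 1000 / 1000;

	return elapsed_ms < (uint64_t)timeout_ms + slack_ms;
}
```

With a 1ms timeout and 50ms of slack, any delay that truncates to 50ms
or less passes, and 51ms or more fails.
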
 lib/i915/gem_context.c             |  41 ++++
 lib/i915/gem_context.h             |   2 +
 tests/Makefile.sources             |   1 +
 tests/i915/sysfs_preempt_timeout.c | 355 +++++++++++++++++++++++++++++
 tests/meson.build                  |   1 +
 5 files changed, 400 insertions(+)
 create mode 100644 tests/i915/sysfs_preempt_timeout.c

diff --git a/lib/i915/gem_context.c b/lib/i915/gem_context.c
index 1fae5191f..aa083c2f0 100644
--- a/lib/i915/gem_context.c
+++ b/lib/i915/gem_context.c
@@ -403,3 +403,44 @@ bool gem_context_has_engine(int fd, uint32_t ctx, uint64_t engine)
 
 	return __gem_execbuf(fd, &execbuf) == -ENOENT;
 }
+
+static int create_ext_ioctl(int i915,
+			    struct drm_i915_gem_context_create_ext *arg)
+{
+	int err;
+
+	err = 0;
+	if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
+		err = -errno;
+		igt_assume(err);
+	}
+
+	errno = 0;
+	return err;
+}
+
+uint32_t gem_context_create_for_engine(int i915, unsigned int class, unsigned int inst)
+{
+	I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 1) = {
+		.engines = { { .engine_class = class, .engine_instance = inst } }
+	};
+	struct drm_i915_gem_context_create_ext_setparam p_engines = {
+		.base = {
+			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+			.next_extension = 0, /* end of chain */
+		},
+		.param = {
+			.param = I915_CONTEXT_PARAM_ENGINES,
+			.value = to_user_pointer(&engines),
+			.size = sizeof(engines),
+		},
+	};
+	struct drm_i915_gem_context_create_ext create = {
+		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+		.extensions = to_user_pointer(&p_engines),
+	};
+
+	igt_assert_eq(create_ext_ioctl(i915, &create), 0);
+	igt_assert_neq(create.ctx_id, 0);
+	return create.ctx_id;
+}
diff --git a/lib/i915/gem_context.h b/lib/i915/gem_context.h
index c0d4c9615..9e0a083f0 100644
--- a/lib/i915/gem_context.h
+++ b/lib/i915/gem_context.h
@@ -34,6 +34,8 @@ int __gem_context_create(int fd, uint32_t *ctx_id);
 void gem_context_destroy(int fd, uint32_t ctx_id);
 int __gem_context_destroy(int fd, uint32_t ctx_id);
 
+uint32_t gem_context_create_for_engine(int fd, unsigned int class, unsigned int inst);
+
 int __gem_context_clone(int i915,
 			uint32_t src, unsigned int share,
 			unsigned int flags,
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index abf1e2fc1..413952c7c 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -98,6 +98,7 @@ TESTS_progs = \
 	tools_test \
 	vgem_basic \
 	vgem_slow \
+	i915/sysfs_preempt_timeout \
 	$(NULL)
 
 TESTS_progs += gem_bad_reloc
diff --git a/tests/i915/sysfs_preempt_timeout.c b/tests/i915/sysfs_preempt_timeout.c
new file mode 100644
index 000000000..4edc69c51
--- /dev/null
+++ b/tests/i915/sysfs_preempt_timeout.c
@@ -0,0 +1,355 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "drmtest.h" /* gem_quiescent_gpu()! */
+#include "i915/gem_engine_topology.h"
+#include "igt_dummyload.h"
+#include "igt_sysfs.h"
+#include "ioctl_wrappers.h" /* igt_require_gem()! */
+#include "sw_sync.h"
+
+#include "igt_debugfs.h"
+
+static bool __enable_hangcheck(int dir, bool state)
+{
+	return igt_sysfs_set(dir, "enable_hangcheck", state ? "1" : "0");
+}
+
+static bool enable_hangcheck(int i915, bool state)
+{
+	bool success;
+	int dir;
+
+	dir = igt_sysfs_open_parameters(i915);
+	if (dir < 0) /* no parameters, must be default! */
+		return false;
+
+	success = __enable_hangcheck(dir, state);
+	close(dir);
+
+	return success;
+}
+
+static void set_preempt_timeout(int engine, unsigned int value)
+{
+	unsigned int delay;
+
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", value);
+	igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &delay);
+	igt_assert_eq(delay, value);
+}
+
+static void test_idempotent(int i915, int engine)
+{
+	unsigned int delays[] = { 0, 1, 1000, 1234, 654321 };
+	unsigned int saved;
+
+	/* Quick test that store/show reports the same values */
+
+	igt_assert(igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &saved) == 1);
+	igt_debug("Initial preempt_timeout_ms:%u\n", saved);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++)
+		set_preempt_timeout(engine, delays[i]);
+
+	set_preempt_timeout(engine, saved);
+}
+
+static void test_invalid(int i915, int engine)
+{
+	unsigned int saved, delay;
+
+	/* Quick test that values that are not representable are rejected */
+
+	igt_assert(igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &saved) == 1);
+	igt_debug("Initial preempt_timeout_ms:%u\n", saved);
+
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%" PRIu64, (uint64_t)-1);
+	igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%d", -1);
+	igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%" PRIu64, (uint64_t)40 << 32);
+	igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+}
+
+static void set_unbannable(int i915, uint32_t ctx)
+{
+	struct drm_i915_gem_context_param p = {
+		.ctx_id = ctx,
+		.param = I915_CONTEXT_PARAM_BANNABLE,
+	};
+
+	igt_assert_eq(__gem_context_set_param(i915, &p), 0);
+}
+
+static uint32_t create_context(int i915, unsigned int class, unsigned int inst, int prio)
+{
+	uint32_t ctx;
+
+	ctx = gem_context_create_for_engine(i915, class, inst);
+	set_unbannable(i915, ctx);
+	gem_context_set_priority(i915, ctx, prio);
+
+	return ctx;
+}
+
+static uint64_t __test_timeout(int i915, int engine, unsigned int timeout)
+{
+	unsigned int class, inst;
+	struct timespec ts = {};
+	igt_spin_t *spin[2];
+	uint64_t elapsed;
+	uint32_t ctx[2];
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_preempt_timeout(engine, timeout);
+
+	ctx[0] = create_context(i915, class, inst, -1023);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, 1023);
+	igt_nsec_elapsed(&ts);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+	igt_spin_busywait_until_started(spin[1]);
+	elapsed = igt_nsec_elapsed(&ts);
+
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+	gem_quiescent_gpu(i915);
+
+	return elapsed;
+}
+
+static void test_timeout(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * Send down some non-preemptable workloads and then request a
+	 * switch to a higher priority context. The HW will not be able to
+	 * respond, so the kernel will be forced to reset the hog. This
+	 * timeout should match our specification, and so we can measure
+	 * the delay from requesting the preemption to its completion.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &saved) == 1);
+	igt_debug("Initial preempt_timeout_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+	igt_require(enable_hangcheck(i915, false));
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_timeout(i915, engine, delays[i]);
+		igt_info("preempt_timeout_ms:%d, elapsed=%.3fms\n",
+			 delays[i], elapsed * 1e-6);
+
+		/*
+		 * We need to give a couple of jiffies slack for the scheduler timeouts
+		 * and then a little more slack for the overhead in submitting and
+		 * measuring. 50ms should cover all of our sins and be a useful
+		 * tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < delays[i] + 50,
+			     "Forced preemption timeout exceeded request!\n");
+	}
+
+	igt_assert(enable_hangcheck(i915, true));
+	gem_quiescent_gpu(i915);
+	set_preempt_timeout(engine, saved);
+}
+
+static void test_off(int i915, int engine)
+{
+	unsigned int class, inst;
+	igt_spin_t *spin[2];
+	unsigned int saved;
+	uint32_t ctx[2];
+
+	/*
+	 * We support setting the timeout to 0 to disable the reset on
+	 * preemption failure. Having established that we can do forced
+	 * preemption on demand, we use the same setup (non-preemptable hog
+	 * followed by a high priority context) and verify that the hog is
+	 * never reset. Never is a long time, so we settle for 150s.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &saved) == 1);
+	igt_debug("Initial preempt_timeout_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+	igt_require(enable_hangcheck(i915, false));
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_preempt_timeout(engine, 0);
+
+	ctx[0] = create_context(i915, class, inst, -1023);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, 1023);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+
+	for (int i = 0; i < 150; i++) {
+		igt_assert_eq(sync_fence_status(spin[0]->out_fence), 0);
+		sleep(1);
+	}
+
+	set_preempt_timeout(engine, 1);
+
+	igt_spin_busywait_until_started(spin[1]);
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+
+	igt_assert(enable_hangcheck(i915, true));
+	gem_quiescent_gpu(i915);
+
+	set_preempt_timeout(engine, saved);
+}
+
+#if 0
+static void each_engines(int fd)
+{
+	struct dirent *de;
+	DIR *dir;
+
+	dir = fdopendir(fd);
+	while (dir && (de = readdir(dir))) {
+		int engine = openat(engines, de->d_name, O_RDONLY);
+		char *name;
+
+		name = igt_sysfs_get(engine, "name");
+		if (!name)
+			continue;
+
+		igt_subtest_group {
+			igt_fixture {
+				igt_require(fstatat(engine,
+							"preempt_timeout_ms",
+							&st, 0) == 0);
+			}
+		}
+	}
+	closedir(dir);
+}
+#endif
+
+igt_main
+{
+	const struct intel_execution_engine2 *it;
+	int i915 = -1, engines = -1;
+
+	igt_fixture {
+		int sys;
+
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+		igt_allow_hang(i915, 0, 0);
+
+		sys = igt_sysfs_open(i915);
+		igt_require(sys != -1);
+
+		engines = openat(sys, "engine", O_RDONLY);
+		igt_require(engines != -1);
+
+		close(sys);
+	}
+
+	__for_each_static_engine(it) {
+		igt_subtest_group {
+			int engine = -1;
+			char *name = NULL;
+
+			igt_fixture {
+				struct stat st;
+
+				engine = openat(engines, it->name, O_RDONLY);
+				igt_require(fstatat(engine,
+						    "preempt_timeout_ms",
+						    &st, 0) == 0);
+
+				name = igt_sysfs_get(engine, "name");
+				igt_require(name);
+			}
+			if (!name)
+				name = strdup(it->name);
+
+			igt_subtest_f("%s-idempotent", name)
+				test_idempotent(i915, engine);
+			igt_subtest_f("%s-invalid", name)
+				test_invalid(i915, engine);
+			igt_subtest_f("%s-timeout", name)
+				test_timeout(i915, engine);
+			igt_subtest_f("%s-off", name)
+				test_off(i915, engine);
+
+			free(name);
+			close(engine);
+		}
+	}
+
+	igt_fixture {
+		close(engines);
+		close(i915);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index 98f2db555..338da2e95 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -239,6 +239,7 @@ i915_progs = [
 	'i915_query',
 	'i915_selftest',
 	'i915_suspend',
+	'sysfs_preempt_timeout',
 ]
 
 test_deps = [ igt_deps ]
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Intel-gfx] [PATCH i-g-t 6/9] i915: Exercise preemption timeout controls in sysfs
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

We [will] expose various per-engine scheduling controls. One of them,
'preempt_timeout_ms', defines how long we wait for a preemption request
to be honoured by the currently executing context. If that context fails
to relinquish the GPU within the timeout, the engine is reset and the
miscreant forcibly evicted.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/i915/gem_context.c             |  41 ++++
 lib/i915/gem_context.h             |   2 +
 tests/Makefile.sources             |   1 +
 tests/i915/sysfs_preempt_timeout.c | 355 +++++++++++++++++++++++++++++
 tests/meson.build                  |   1 +
 5 files changed, 400 insertions(+)
 create mode 100644 tests/i915/sysfs_preempt_timeout.c

diff --git a/lib/i915/gem_context.c b/lib/i915/gem_context.c
index 1fae5191f..aa083c2f0 100644
--- a/lib/i915/gem_context.c
+++ b/lib/i915/gem_context.c
@@ -403,3 +403,44 @@ bool gem_context_has_engine(int fd, uint32_t ctx, uint64_t engine)
 
 	return __gem_execbuf(fd, &execbuf) == -ENOENT;
 }
+
+static int create_ext_ioctl(int i915,
+			    struct drm_i915_gem_context_create_ext *arg)
+{
+	int err;
+
+	err = 0;
+	if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
+		err = -errno;
+		igt_assume(err);
+	}
+
+	errno = 0;
+	return err;
+}
+
+uint32_t gem_context_create_for_engine(int i915, unsigned int class, unsigned int inst)
+{
+	I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 1) = {
+		.engines = { { .engine_class = class, .engine_instance = inst } }
+	};
+	struct drm_i915_gem_context_create_ext_setparam p_engines = {
+		.base = {
+			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+			.next_extension = 0, /* end of chain */
+		},
+		.param = {
+			.param = I915_CONTEXT_PARAM_ENGINES,
+			.value = to_user_pointer(&engines),
+			.size = sizeof(engines),
+		},
+	};
+	struct drm_i915_gem_context_create_ext create = {
+		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+		.extensions = to_user_pointer(&p_engines),
+	};
+
+	igt_assert_eq(create_ext_ioctl(i915, &create), 0);
+	igt_assert_neq(create.ctx_id, 0);
+	return create.ctx_id;
+}
diff --git a/lib/i915/gem_context.h b/lib/i915/gem_context.h
index c0d4c9615..9e0a083f0 100644
--- a/lib/i915/gem_context.h
+++ b/lib/i915/gem_context.h
@@ -34,6 +34,8 @@ int __gem_context_create(int fd, uint32_t *ctx_id);
 void gem_context_destroy(int fd, uint32_t ctx_id);
 int __gem_context_destroy(int fd, uint32_t ctx_id);
 
+uint32_t gem_context_create_for_engine(int fd, unsigned int class, unsigned int inst);
+
 int __gem_context_clone(int i915,
 			uint32_t src, unsigned int share,
 			unsigned int flags,
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index abf1e2fc1..413952c7c 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -98,6 +98,7 @@ TESTS_progs = \
 	tools_test \
 	vgem_basic \
 	vgem_slow \
+	i915/sysfs_preempt_timeout \
 	$(NULL)
 
 TESTS_progs += gem_bad_reloc
diff --git a/tests/i915/sysfs_preempt_timeout.c b/tests/i915/sysfs_preempt_timeout.c
new file mode 100644
index 000000000..4edc69c51
--- /dev/null
+++ b/tests/i915/sysfs_preempt_timeout.c
@@ -0,0 +1,355 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "drmtest.h" /* gem_quiescent_gpu()! */
+#include "i915/gem_engine_topology.h"
+#include "igt_dummyload.h"
+#include "igt_sysfs.h"
+#include "ioctl_wrappers.h" /* igt_require_gem()! */
+#include "sw_sync.h"
+
+#include "igt_debugfs.h"
+
+static bool __enable_hangcheck(int dir, bool state)
+{
+	return igt_sysfs_set(dir, "enable_hangcheck", state ? "1" : "0");
+}
+
+static bool enable_hangcheck(int i915, bool state)
+{
+	bool success;
+	int dir;
+
+	dir = igt_sysfs_open_parameters(i915);
+	if (dir < 0) /* no parameters, must be default! */
+		return false;
+
+	success = __enable_hangcheck(dir, state);
+	close(dir);
+
+	return success;
+}
+
+static void set_preempt_timeout(int engine, unsigned int value)
+{
+	unsigned int delay;
+
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", value);
+	igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &delay);
+	igt_assert_eq(delay, value);
+}
+
+static void test_idempotent(int i915, int engine)
+{
+	unsigned int delays[] = { 0, 1, 1000, 1234, 654321 };
+	unsigned int saved;
+
+	/* Quick test that store/show reports the same values */
+
+	igt_assert(igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &saved) == 1);
+	igt_debug("Initial preempt_timeout_ms:%u\n", saved);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++)
+		set_preempt_timeout(engine, delays[i]);
+
+	set_preempt_timeout(engine, saved);
+}
+
+static void test_invalid(int i915, int engine)
+{
+	unsigned int saved, delay;
+
+	/* Quick test that values that are not representable are rejected */
+
+	igt_assert(igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &saved) == 1);
+	igt_debug("Initial preempt_timeout_ms:%u\n", saved);
+
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%" PRIu64, (uint64_t)-1);
+	igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%d", -1);
+	igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%" PRIu64, (uint64_t)40 << 32);
+	igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+}
+
+static void set_unbannable(int i915, uint32_t ctx)
+{
+	struct drm_i915_gem_context_param p = {
+		.ctx_id = ctx,
+		.param = I915_CONTEXT_PARAM_BANNABLE,
+	};
+
+	igt_assert_eq(__gem_context_set_param(i915, &p), 0);
+}
+
+static uint32_t create_context(int i915, unsigned int class, unsigned int inst, int prio)
+{
+	uint32_t ctx;
+
+	ctx = gem_context_create_for_engine(i915, class, inst);
+	set_unbannable(i915, ctx);
+	gem_context_set_priority(i915, ctx, prio);
+
+	return ctx;
+}
+
+static uint64_t __test_timeout(int i915, int engine, unsigned int timeout)
+{
+	unsigned int class, inst;
+	struct timespec ts = {};
+	igt_spin_t *spin[2];
+	uint64_t elapsed;
+	uint32_t ctx[2];
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_preempt_timeout(engine, timeout);
+
+	ctx[0] = create_context(i915, class, inst, -1023);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, 1023);
+	igt_nsec_elapsed(&ts);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+	igt_spin_busywait_until_started(spin[1]);
+	elapsed = igt_nsec_elapsed(&ts);
+
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+	gem_quiescent_gpu(i915);
+
+	return elapsed;
+}
+
+static void test_timeout(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * Send down some non-preemptable workloads and then request a
+	 * switch to a higher priority context. The HW will not be able to
+	 * respond, so the kernel will be forced to reset the hog. This
+	 * timeout should match our specification, and so we can measure
+	 * the delay from requesting the preemption to its completion.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &saved) == 1);
+	igt_debug("Initial preempt_timeout_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+	igt_require(enable_hangcheck(i915, false));
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_timeout(i915, engine, delays[i]);
+		igt_info("preempt_timeout_ms:%d, elapsed=%.3fms\n",
+			 delays[i], elapsed * 1e-6);
+
+		/*
+		 * We need to give a couple of jiffies slack for the scheduler timeouts
+		 * and then a little more slack for the overhead in submitting and
+		 * measuring. 50ms should cover all of our sins and be a useful
+		 * tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < delays[i] + 50,
+			     "Forced preemption timeout exceeded request!\n");
+	}
+
+	igt_assert(enable_hangcheck(i915, true));
+	gem_quiescent_gpu(i915);
+	set_preempt_timeout(engine, saved);
+}
+
+static void test_off(int i915, int engine)
+{
+	unsigned int class, inst;
+	igt_spin_t *spin[2];
+	unsigned int saved;
+	uint32_t ctx[2];
+
+	/*
+	 * We support setting the timeout to 0 to disable the reset on
+	 * preemption failure. Having established that we can do forced
+	 * preemption on demand, we use the same setup (non-preemptable hog
+	 * followed by a high priority context) and verify that the hog is
+	 * never reset. Never is a long time, so we settle for 150s.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "preempt_timeout_ms", "%u", &saved) == 1);
+	igt_debug("Initial preempt_timeout_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+	igt_require(enable_hangcheck(i915, false));
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_preempt_timeout(engine, 0);
+
+	ctx[0] = create_context(i915, class, inst, -1023);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, 1023);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+
+	for (int i = 0; i < 150; i++) {
+		igt_assert_eq(sync_fence_status(spin[0]->out_fence), 0);
+		sleep(1);
+	}
+
+	set_preempt_timeout(engine, 1);
+
+	igt_spin_busywait_until_started(spin[1]);
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+
+	igt_assert(enable_hangcheck(i915, true));
+	gem_quiescent_gpu(i915);
+
+	set_preempt_timeout(engine, saved);
+}
+
+#if 0
+static void each_engines(int fd)
+{
+	struct dirent *de;
+	DIR *dir;
+
+	dir = fdopendir(fd);
+	while (dir && (de = readdir(dir))) {
+		int engine = openat(engines, de->d_name, O_RDONLY);
+		char *name;
+
+		name = igt_sysfs_get(engine, "name");
+		if (!name)
+			continue;
+
+		igt_subtest_group {
+			igt_fixture {
+				igt_require(fstatat(engine,
+							"preempt_timeout_ms",
+							&st, 0) == 0);
+			}
+		}
+	}
+	closedir(dir);
+}
+#endif
+
+igt_main
+{
+	const struct intel_execution_engine2 *it;
+	int i915 = -1, engines = -1;
+
+	igt_fixture {
+		int sys;
+
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+		igt_allow_hang(i915, 0, 0);
+
+		sys = igt_sysfs_open(i915);
+		igt_require(sys != -1);
+
+		engines = openat(sys, "engine", O_RDONLY);
+		igt_require(engines != -1);
+
+		close(sys);
+	}
+
+	__for_each_static_engine(it) {
+		igt_subtest_group {
+			int engine = -1;
+			char *name = NULL;
+
+			igt_fixture {
+				struct stat st;
+
+				engine = openat(engines, it->name, O_RDONLY);
+				igt_require(fstatat(engine,
+						    "preempt_timeout_ms",
+						    &st, 0) == 0);
+
+				name = igt_sysfs_get(engine, "name");
+				igt_require(name);
+			}
+			if (!name)
+				name = strdup(it->name);
+
+			igt_subtest_f("%s-idempotent", name)
+				test_idempotent(i915, engine);
+			igt_subtest_f("%s-invalid", name)
+				test_invalid(i915, engine);
+			igt_subtest_f("%s-timeout", name)
+				test_timeout(i915, engine);
+			igt_subtest_f("%s-off", name)
+				test_off(i915, engine);
+
+			free(name);
+			close(engine);
+		}
+	}
+
+	igt_fixture {
+		close(engines);
+		close(i915);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index 98f2db555..338da2e95 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -239,6 +239,7 @@ i915_progs = [
 	'i915_query',
 	'i915_selftest',
 	'i915_suspend',
+	'sysfs_preempt_timeout',
 ]
 
 test_deps = [ igt_deps ]
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH i-g-t 7/9] i915: Exercise sysfs heartbeat controls
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

We [will] expose various per-engine scheduling controls. One of these,
'heartbeat_interval_ms', defines how often we send a heartbeat down the
engine to check upon the health of the engine. If a heartbeat does not
complete within the interval (or two), the engine is declared hung.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/Makefile.sources                |   1 +
 tests/i915/sysfs_heartbeat_interval.c | 478 ++++++++++++++++++++++++++
 tests/meson.build                     |   1 +
 3 files changed, 480 insertions(+)
 create mode 100644 tests/i915/sysfs_heartbeat_interval.c
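
As a rough illustration of the upper bounds asserted by test_precise() and
test_nopreempt() in the diff below, here is a minimal standalone sketch. The
helper names (hang_bound_ms, within_bound) are hypothetical and not part of
IGT; the constants simply mirror the assertions
"elapsed / 1000 / 1000 < N * delays[i] + 150" with N = 3 and N = 5.

```c
#include <stdint.h>

/*
 * Hypothetical helper: upper bound (in ms) on the time allowed to detect
 * and reset a hog, given the heartbeat interval, the number of heartbeat
 * periods we tolerate, and a fixed amount of scheduling slack.
 */
uint64_t hang_bound_ms(unsigned int interval_ms,
		       unsigned int beats,
		       unsigned int slack_ms)
{
	return (uint64_t)beats * interval_ms + slack_ms;
}

/*
 * Hypothetical helper: same comparison as the tests, i.e. the elapsed
 * time in nanoseconds is truncated to whole milliseconds and must be
 * strictly below the bound.
 */
int within_bound(uint64_t elapsed_ns,
		 unsigned int interval_ms,
		 unsigned int beats,
		 unsigned int slack_ms)
{
	return elapsed_ns / 1000 / 1000 <
		hang_bound_ms(interval_ms, beats, slack_ms);
}
```

For the intervals exercised by the tests (1, 50, 100 and 500ms) this gives
bounds of 153, 300, 450 and 1650ms with forced preemption, and 155, 400,
650 and 2650ms without it.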

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 413952c7c..13544133a 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -98,6 +98,7 @@ TESTS_progs = \
 	tools_test \
 	vgem_basic \
 	vgem_slow \
+	i915/sysfs_heartbeat_interval \
 	i915/sysfs_preempt_timeout \
 	$(NULL)
 
diff --git a/tests/i915/sysfs_heartbeat_interval.c b/tests/i915/sysfs_heartbeat_interval.c
new file mode 100644
index 000000000..ba3c523fb
--- /dev/null
+++ b/tests/i915/sysfs_heartbeat_interval.c
@@ -0,0 +1,478 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <signal.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#include "drmtest.h" /* gem_quiescent_gpu()! */
+#include "i915/gem_engine_topology.h"
+#include "igt_dummyload.h"
+#include "igt_sysfs.h"
+#include "ioctl_wrappers.h" /* igt_require_gem()! */
+#include "sw_sync.h"
+
+#include "igt_debugfs.h"
+
+static bool __enable_hangcheck(int dir, bool state)
+{
+	return igt_sysfs_set(dir, "enable_hangcheck", state ? "1" : "0");
+}
+
+static void enable_hangcheck(int i915, bool state)
+{
+	int dir;
+
+	dir = igt_sysfs_open_parameters(i915);
+	if (dir < 0) /* no parameters, must be default! */
+		return;
+
+	__enable_hangcheck(dir, state);
+	close(dir);
+}
+
+static void set_heartbeat(int engine, unsigned int value)
+{
+	unsigned int delay = ~value;
+
+	igt_sysfs_printf(engine, "heartbeat_interval_ms", "%u", value);
+	igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &delay);
+	igt_assert_eq(delay, value);
+}
+
+static void test_idempotent(int i915, int engine)
+{
+	unsigned int delays[] = { 1, 1000, 5000, 50000, 123456789 };
+	unsigned int saved;
+
+	/* Quick test that the property reports the values we set */
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++)
+		set_heartbeat(engine, delays[i]);
+
+	set_heartbeat(engine, saved);
+}
+
+static void test_invalid(int i915, int engine)
+{
+	unsigned int saved, delay;
+
+	/* Quick test that we reject any unrepresentable intervals */
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+
+	igt_sysfs_printf(engine, "heartbeat_interval_ms", "%" PRIu64, (uint64_t)-1);
+	igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "heartbeat_interval_ms", "%" PRIu64, (uint64_t)(10ull << 32));
+	igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+}
+
+static void set_unbannable(int i915, uint32_t ctx)
+{
+	struct drm_i915_gem_context_param p = {
+		.ctx_id = ctx,
+		.param = I915_CONTEXT_PARAM_BANNABLE,
+	};
+
+	igt_assert_eq(__gem_context_set_param(i915, &p), 0);
+}
+
+static uint32_t create_context(int i915, unsigned int class, unsigned int inst, int prio)
+{
+	uint32_t ctx;
+
+	ctx = gem_context_create_for_engine(i915, class, inst);
+	set_unbannable(i915, ctx);
+	gem_context_set_priority(i915, ctx, prio);
+
+	return ctx;
+}
+
+static uint64_t __test_timeout(int i915, int engine, unsigned int timeout)
+{
+	unsigned int class, inst;
+	struct timespec ts = {};
+	igt_spin_t *spin[2];
+	uint64_t elapsed;
+	uint32_t ctx[2];
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_heartbeat(engine, timeout);
+
+	ctx[0] = create_context(i915, class, inst, 1023);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, -1023);
+	igt_nsec_elapsed(&ts);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+	igt_spin_busywait_until_started(spin[1]);
+	elapsed = igt_nsec_elapsed(&ts);
+
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+	gem_quiescent_gpu(i915);
+
+	return elapsed;
+}
+
+static void test_precise(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * The heartbeat interval defines how long the kernel waits between
+	 * checking on the status of the engines. It first sends down a
+	 * heartbeat pulse, waits the interval and sees if the system managed
+	 * to complete the pulse. If not, it gives a priority bump to the pulse
+	 * and waits again. This is repeated until the priority cannot be bumped
+	 * any more, and the system is declared hung.
+	 *
+	 * If we combine the preemptive pulse with forced preemption, we instead
+	 * get a much faster hang detection. Thus in combination we can measure
+	 * the system response time to resetting a hog as a measure of the
+	 * heartbeat interval, and so confirm it matches our specification.
+	 */
+
+	igt_require(igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 1) == 1);
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+	gem_quiescent_gpu(i915);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_timeout(i915, engine, delays[i]);
+		igt_info("heartbeat_interval_ms:%d, elapsed=%.3fms[%d]\n",
+			 delays[i], elapsed * 1e-6,
+				(int)(elapsed / 1000 / 1000)
+			 );
+
+		/*
+		 * It takes a couple of missed heartbeats before we start
+		 * terminating hogs, and a little bit of jiffie slack for
+		 * scheduling at each step. 150ms should cover all of our
+		 * sins and be a useful tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < 3 * delays[i] + 150,
+			     "Heartbeat interval (and CPR) exceeded request!\n");
+	}
+
+	gem_quiescent_gpu(i915);
+	set_heartbeat(engine, saved);
+}
+
+static void test_nopreempt(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * The same principle as test_precise(), except that forced preemption
+	 * is disabled (or simply not supported by the platform). This time,
+	 * it waits until the system misses a few heartbeats before doing a
+	 * per-engine/full-gpu reset. As such it is less precise, but we
+	 * can still estimate an upper bound for our specified heartbeat
+	 * interval and verify the system conforms.
+	 */
+
+	/* Test heartbeats with forced preemption disabled */
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 0);
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+	gem_quiescent_gpu(i915);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_timeout(i915, engine, delays[i]);
+		igt_info("heartbeat_interval_ms:%d, elapsed=%.3fms[%d]\n",
+			 delays[i], elapsed * 1e-6,
+				(int)(elapsed / 1000 / 1000)
+			 );
+
+		/*
+		 * It takes several missed heartbeats before we start
+		 * terminating hogs, and a little bit of jiffie slack for
+		 * scheduling at each step. 250ms should cover all of our
+		 * sins and be a useful tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < 5 * delays[i] + 150,
+			     "Heartbeat interval (and CPR) exceeded request!\n");
+	}
+
+	gem_quiescent_gpu(i915);
+	set_heartbeat(engine, saved);
+}
+
+static void client(int i915, int engine, int *ctl, int duration, int expect)
+{
+	unsigned int class, inst;
+	unsigned long count = 0;
+	uint32_t ctx;
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	ctx = create_context(i915, class, inst, 0);
+
+	while (!READ_ONCE(*ctl)) {
+		igt_spin_t *spin;
+
+		spin = igt_spin_new(i915, ctx,
+				    .flags = (IGT_SPIN_NO_PREEMPTION |
+					      IGT_SPIN_POLL_RUN |
+					      IGT_SPIN_FENCE_OUT));
+		igt_spin_busywait_until_started(spin);
+
+		igt_spin_set_timeout(spin, (uint64_t)duration * 1000 * 1000);
+		sync_fence_wait(spin->out_fence, -1);
+
+		igt_assert_eq(sync_fence_status(spin->out_fence), expect);
+		count++;
+	}
+
+	gem_context_destroy(i915, ctx);
+	igt_info("%s client completed %lu spins\n",
+		 expect < 0 ? "Bad" : "Good", count);
+}
+
+static void sigign(int sig)
+{
+}
+
+static void wait_until(int duration)
+{
+	signal(SIGCHLD, sigign);
+	sleep(duration);
+	signal(SIGCHLD, SIG_IGN);
+}
+
+static void __test_mixed(int i915, int engine,
+			 int heartbeat,
+			 int good,
+			 int bad,
+			 int duration)
+{
+	unsigned int saved;
+	int *shared;
+
+	/*
+	 * Given two clients of which one is a hog, be sure we cleanly
+	 * terminate the hog leaving the good client to run.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+	gem_quiescent_gpu(i915);
+
+	shared = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
+	igt_assert(shared != MAP_FAILED);
+
+	set_heartbeat(engine, heartbeat);
+
+	igt_fork(child, 1) /* good client */
+		client(i915, engine, shared, good, 1);
+	igt_fork(child, 1) /* bad client */
+		client(i915, engine, shared, bad, -EIO);
+
+	wait_until(duration);
+
+	*shared = true;
+	igt_waitchildren();
+	munmap(shared, 4096);
+
+	gem_quiescent_gpu(i915);
+	set_heartbeat(engine, saved);
+}
+
+static void test_mixed(int i915, int engine)
+{
+	/*
+	 * Hogs rarely run alone. Our hang detection must carefully weed
+	 * out the hogs from the innocent clients. Thus we run a mixed workload
+	 * with non-preemptable hogs that exceed the heartbeat, and quicker
+	 * innocents. We inspect the fence status of each to verify that
+	 * only the hogs are reset.
+	 */
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 1);
+	__test_mixed(i915, engine, 10, 10, 100, 5);
+}
+
+static void test_long(int i915, int engine)
+{
+	/*
+	 * Some clients relish being hogs, and demand that the system
+	 * never do hangchecking. Never is hard to test, so instead we
+	 * run over a day and verify that only the super hogs are reset.
+	 */
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 0);
+	__test_mixed(i915, engine,
+		     60 * 1000, /* 60s */
+		     60 * 1000, /* 60s */
+		     300 * 1000, /* 5min */
+		     24 * 3600 /* 24hours */);
+}
+
+static void test_off(int i915, int engine)
+{
+	unsigned int class, inst;
+	unsigned int saved;
+	igt_spin_t *spin;
+	uint32_t ctx;
+
+	/*
+	 * Some other clients request that there is never any interruption
+	 * or jitter in their workload and so demand that the kernel never
+	 * sends a heartbeat to steal precious cycles from their workload.
+	 * Turn off the heartbeat and check that the workload is uninterrupted
+	 * for 150s.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+	gem_quiescent_gpu(i915);
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_heartbeat(engine, 0);
+
+	ctx = create_context(i915, class, inst, 0);
+
+	spin = igt_spin_new(i915, ctx,
+			    .flags = (IGT_SPIN_POLL_RUN |
+				      IGT_SPIN_NO_PREEMPTION |
+				      IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin);
+
+	for (int i = 0; i < 150; i++) {
+		igt_assert_eq(sync_fence_status(spin->out_fence), 0);
+		sleep(1);
+	}
+
+	set_heartbeat(engine, 1);
+
+	igt_assert_eq(sync_fence_wait(spin->out_fence, 250), 0);
+	igt_assert_eq(sync_fence_status(spin->out_fence), -EIO);
+
+	igt_spin_free(i915, spin);
+
+	gem_quiescent_gpu(i915);
+	set_heartbeat(engine, saved);
+}
+
+igt_main
+{
+	const struct intel_execution_engine2 *it;
+	int i915 = -1, engines = -1;
+
+	igt_fixture {
+		int sys;
+
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+		igt_allow_hang(i915, 0, 0);
+
+		sys = igt_sysfs_open(i915);
+		igt_require(sys != -1);
+
+		engines = openat(sys, "engine", O_RDONLY);
+		igt_require(engines != -1);
+		close(sys);
+
+		enable_hangcheck(i915, true);
+	}
+
+	__for_each_static_engine(it) {
+		igt_subtest_group {
+			int engine = -1;
+			char *name = NULL;
+
+			igt_fixture {
+				struct stat st;
+
+				engine = openat(engines, it->name, O_RDONLY);
+				igt_require(fstatat(engine,
+						    "heartbeat_interval_ms",
+						    &st, 0) == 0);
+				name = igt_sysfs_get(engine, "name");
+				igt_require(name);
+			}
+			if (!name)
+				name = strdup(it->name);
+
+			igt_subtest_f("%s-idempotent", name)
+				test_idempotent(i915, engine);
+			igt_subtest_f("%s-invalid", name)
+				test_invalid(i915, engine);
+
+			igt_subtest_f("%s-precise", name)
+				test_precise(i915, engine);
+			igt_subtest_f("%s-nopreempt", name)
+				test_nopreempt(i915, engine);
+			igt_subtest_f("%s-mixed", name)
+				test_mixed(i915, engine);
+			igt_subtest_f("%s-off", name)
+				test_off(i915, engine);
+			igt_subtest_f("%s-long", name)
+				test_long(i915, engine);
+
+			free(name);
+			close(engine);
+		}
+	}
+
+	igt_fixture {
+		close(engines);
+		close(i915);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index 338da2e95..8d0964fe0 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -239,6 +239,7 @@ i915_progs = [
 	'i915_query',
 	'i915_selftest',
 	'i915_suspend',
+	'sysfs_heartbeat_interval',
 	'sysfs_preempt_timeout',
 ]
 
-- 
2.24.0

* [Intel-gfx] [PATCH i-g-t 7/9] i915: Exercise sysfs heartbeat controls
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

We [will] expose various per-engine scheduling controls. One of these,
'heartbeat_interval_ms', defines how often we send a heartbeat down the
engine to check upon the health of the engine. If a heartbeat does not
complete within the interval (or two), the engine is declared hung.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/Makefile.sources                |   1 +
 tests/i915/sysfs_heartbeat_interval.c | 478 ++++++++++++++++++++++++++
 tests/meson.build                     |   1 +
 3 files changed, 480 insertions(+)
 create mode 100644 tests/i915/sysfs_heartbeat_interval.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 413952c7c..13544133a 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -98,6 +98,7 @@ TESTS_progs = \
 	tools_test \
 	vgem_basic \
 	vgem_slow \
+	i915/sysfs_heartbeat_interval \
 	i915/sysfs_preempt_timeout \
 	$(NULL)
 
diff --git a/tests/i915/sysfs_heartbeat_interval.c b/tests/i915/sysfs_heartbeat_interval.c
new file mode 100644
index 000000000..ba3c523fb
--- /dev/null
+++ b/tests/i915/sysfs_heartbeat_interval.c
@@ -0,0 +1,478 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <signal.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#include "drmtest.h" /* gem_quiescent_gpu()! */
+#include "i915/gem_engine_topology.h"
+#include "igt_dummyload.h"
+#include "igt_sysfs.h"
+#include "ioctl_wrappers.h" /* igt_require_gem()! */
+#include "sw_sync.h"
+
+#include "igt_debugfs.h"
+
+static bool __enable_hangcheck(int dir, bool state)
+{
+	return igt_sysfs_set(dir, "enable_hangcheck", state ? "1" : "0");
+}
+
+static void enable_hangcheck(int i915, bool state)
+{
+	int dir;
+
+	dir = igt_sysfs_open_parameters(i915);
+	if (dir < 0) /* no parameters, must be default! */
+		return;
+
+	__enable_hangcheck(dir, state);
+	close(dir);
+}
+
+static void set_heartbeat(int engine, unsigned int value)
+{
+	unsigned int delay = ~value;
+
+	igt_sysfs_printf(engine, "heartbeat_interval_ms", "%u", value);
+	igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &delay);
+	igt_assert_eq(delay, value);
+}
+
+static void test_idempotent(int i915, int engine)
+{
+	unsigned int delays[] = { 1, 1000, 5000, 50000, 123456789 };
+	unsigned int saved;
+
+	/* Quick test that the property reports the values we set */
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++)
+		set_heartbeat(engine, delays[i]);
+
+	set_heartbeat(engine, saved);
+}
+
+static void test_invalid(int i915, int engine)
+{
+	unsigned int saved, delay;
+
+	/* Quick test that we reject any unrepresentable intervals */
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+
+	igt_sysfs_printf(engine, "heartbeat_interval_ms", "%" PRIu64, (uint64_t)-1);
+	igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "heartbeat_interval_ms", "%" PRIu64, (uint64_t)(10ull << 32));
+	igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+}
+
+static void set_unbannable(int i915, uint32_t ctx)
+{
+	struct drm_i915_gem_context_param p = {
+		.ctx_id = ctx,
+		.param = I915_CONTEXT_PARAM_BANNABLE,
+	};
+
+	igt_assert_eq(__gem_context_set_param(i915, &p), 0);
+}
+
+static uint32_t create_context(int i915, unsigned int class, unsigned int inst, int prio)
+{
+	uint32_t ctx;
+
+	ctx = gem_context_create_for_engine(i915, class, inst);
+	set_unbannable(i915, ctx);
+	gem_context_set_priority(i915, ctx, prio);
+
+	return ctx;
+}
+
+static uint64_t __test_timeout(int i915, int engine, unsigned int timeout)
+{
+	unsigned int class, inst;
+	struct timespec ts = {};
+	igt_spin_t *spin[2];
+	uint64_t elapsed;
+	uint32_t ctx[2];
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_heartbeat(engine, timeout);
+
+	ctx[0] = create_context(i915, class, inst, 1023);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, -1023);
+	igt_nsec_elapsed(&ts);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+	igt_spin_busywait_until_started(spin[1]);
+	elapsed = igt_nsec_elapsed(&ts);
+
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+	gem_quiescent_gpu(i915);
+
+	return elapsed;
+}
+
+static void test_precise(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * The heartbeat interval defines how long the kernel waits between
+	 * checking on the status of the engines. It first sends down a
+	 * heartbeat pulse, waits the interval and sees if the system managed
+	 * to complete the pulse. If not, it gives a priority bump to the pulse
+	 * and waits again. This is repeated until the priority cannot be bumped
+	 * any more, and the system is declared hung.
+	 *
+	 * If we combine the preemptive pulse with forced preemption, we instead
+	 * get a much faster hang detection. Thus in combination we can measure
+	 * the system response time to resetting a hog as a measure of the
+	 * heartbeat interval, and so confirm it matches our specification.
+	 */
+
+	igt_require(igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 1) == 1);
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+	gem_quiescent_gpu(i915);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_timeout(i915, engine, delays[i]);
+		igt_info("heartbeat_interval_ms:%d, elapsed=%.3fms[%d]\n",
+			 delays[i], elapsed * 1e-6,
+				(int)(elapsed / 1000 / 1000)
+			 );
+
+		/*
+		 * It takes a couple of missed heartbeats before we start
+		 * terminating hogs, and a little bit of jiffie slack for
+		 * scheduling at each step. 150ms should cover all of our
+		 * sins and be a useful tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < 3 * delays[i] + 150,
+			     "Heartbeat interval (and CPR) exceeded request!\n");
+	}
+
+	gem_quiescent_gpu(i915);
+	set_heartbeat(engine, saved);
+}
+
+static void test_nopreempt(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * The same principle as test_precise(), except that forced preemption
+	 * is disabled (or simply not supported by the platform). This time,
+	 * it waits until the system misses a few heartbeats before doing a
+	 * per-engine/full-gpu reset. As such it is less precise, but we
+	 * can still estimate an upper bound for our specified heartbeat
+	 * interval and verify the system conforms.
+	 */
+
+	/* Test heartbeats with forced preemption disabled */
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 0);
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+	gem_quiescent_gpu(i915);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_timeout(i915, engine, delays[i]);
+		igt_info("heartbeat_interval_ms:%d, elapsed=%.3fms[%d]\n",
+			 delays[i], elapsed * 1e-6,
+				(int)(elapsed / 1000 / 1000)
+			 );
+
+		/*
+		 * It takes several missed heartbeats before we start
+		 * terminating hogs, and a little bit of jiffie slack for
+		 * scheduling at each step. 250ms should cover all of our
+		 * sins and be a useful tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < 5 * delays[i] + 150,
+			     "Heartbeat interval (and CPR) exceeded request!\n");
+	}
+
+	gem_quiescent_gpu(i915);
+	set_heartbeat(engine, saved);
+}
+
+static void client(int i915, int engine, int *ctl, int duration, int expect)
+{
+	unsigned int class, inst;
+	unsigned long count = 0;
+	uint32_t ctx;
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	ctx = create_context(i915, class, inst, 0);
+
+	while (!READ_ONCE(*ctl)) {
+		igt_spin_t *spin;
+
+		spin = igt_spin_new(i915, ctx,
+				    .flags = (IGT_SPIN_NO_PREEMPTION |
+					      IGT_SPIN_POLL_RUN |
+					      IGT_SPIN_FENCE_OUT));
+		igt_spin_busywait_until_started(spin);
+
+		igt_spin_set_timeout(spin, (uint64_t)duration * 1000 * 1000);
+		sync_fence_wait(spin->out_fence, -1);
+
+		igt_assert_eq(sync_fence_status(spin->out_fence), expect);
+		count++;
+	}
+
+	gem_context_destroy(i915, ctx);
+	igt_info("%s client completed %lu spins\n",
+		 expect < 0 ? "Bad" : "Good", count);
+}
+
+static void sigign(int sig)
+{
+}
+
+static void wait_until(int duration)
+{
+	signal(SIGCHLD, sigign);
+	sleep(duration);
+	signal(SIGCHLD, SIG_IGN);
+}
+
+static void __test_mixed(int i915, int engine,
+			 int heartbeat,
+			 int good,
+			 int bad,
+			 int duration)
+{
+	unsigned int saved;
+	int *shared;
+
+	/*
+	 * Given two clients of which one is a hog, be sure we cleanly
+	 * terminate the hog leaving the good client to run.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+	gem_quiescent_gpu(i915);
+
+	shared = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
+	igt_assert(shared != MAP_FAILED);
+
+	set_heartbeat(engine, heartbeat);
+
+	igt_fork(child, 1) /* good client */
+		client(i915, engine, shared, good, 1);
+	igt_fork(child, 1) /* bad client */
+		client(i915, engine, shared, bad, -EIO);
+
+	wait_until(duration);
+
+	*shared = true;
+	igt_waitchildren();
+	munmap(shared, 4096);
+
+	gem_quiescent_gpu(i915);
+	set_heartbeat(engine, saved);
+}
+
+static void test_mixed(int i915, int engine)
+{
+	/*
+	 * Hogs rarely run alone. Our hang detection must carefully weed
+	 * out the hogs from the innocent clients. Thus we run a mixed workload
+	 * with non-preemptable hogs that exceed the heartbeat, and quicker
+	 * innocents. We inspect the fence status of each to verify that
+	 * only the hogs are reset.
+	 */
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 1);
+	__test_mixed(i915, engine, 10, 10, 100, 5);
+}
+
+static void test_long(int i915, int engine)
+{
+	/*
+	 * Some clients relish being hogs, and demand that the system
+	 * never do hangchecking. Never is hard to test, so instead we
+	 * run over a day and verify that only the super hogs are reset.
+	 */
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 0);
+	__test_mixed(i915, engine,
+		     60 * 1000, /* 60s */
+		     60 * 1000, /* 60s */
+		     300 * 1000, /* 5min */
+		     24 * 3600 /* 24hours */);
+}
+
+static void test_off(int i915, int engine)
+{
+	unsigned int class, inst;
+	unsigned int saved;
+	igt_spin_t *spin;
+	uint32_t ctx;
+
+	/*
+	 * Some other clients request that there is never any interruption
+	 * or jitter in their workload and so demand that the kernel never
+	 * sends a heartbeat to steal precious cycles from their workload.
+	 * Turn off the heartbeat and check that the workload is uninterrupted
+	 * for 150s.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+	gem_quiescent_gpu(i915);
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_heartbeat(engine, 0);
+
+	ctx = create_context(i915, class, inst, 0);
+
+	spin = igt_spin_new(i915, ctx,
+			    .flags = (IGT_SPIN_POLL_RUN |
+				      IGT_SPIN_NO_PREEMPTION |
+				      IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin);
+
+	for (int i = 0; i < 150; i++) {
+		igt_assert_eq(sync_fence_status(spin->out_fence), 0);
+		sleep(1);
+	}
+
+	set_heartbeat(engine, 1);
+
+	igt_assert_eq(sync_fence_wait(spin->out_fence, 250), 0);
+	igt_assert_eq(sync_fence_status(spin->out_fence), -EIO);
+
+	igt_spin_free(i915, spin);
+
+	gem_quiescent_gpu(i915);
+	set_heartbeat(engine, saved);
+}
+
+igt_main
+{
+	const struct intel_execution_engine2 *it;
+	int i915 = -1, engines = -1;
+
+	igt_fixture {
+		int sys;
+
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+		igt_allow_hang(i915, 0, 0);
+
+		sys = igt_sysfs_open(i915);
+		igt_require(sys != -1);
+
+		engines = openat(sys, "engine", O_RDONLY);
+		igt_require(engines != -1);
+		close(sys);
+
+		enable_hangcheck(i915, true);
+	}
+
+	__for_each_static_engine(it) {
+		igt_subtest_group {
+			int engine = -1;
+			char *name = NULL;
+
+			igt_fixture {
+				struct stat st;
+
+				engine = openat(engines, it->name, O_RDONLY);
+				igt_require(fstatat(engine,
+						    "heartbeat_interval_ms",
+						    &st, 0) == 0);
+				name = igt_sysfs_get(engine, "name");
+				igt_require(name);
+			}
+			if (!name)
+				name = strdup(it->name);
+
+			igt_subtest_f("%s-idempotent", name)
+				test_idempotent(i915, engine);
+			igt_subtest_f("%s-invalid", name)
+				test_invalid(i915, engine);
+
+			igt_subtest_f("%s-precise", name)
+				test_precise(i915, engine);
+			igt_subtest_f("%s-nopreempt", name)
+				test_nopreempt(i915, engine);
+			igt_subtest_f("%s-mixed", name)
+				test_mixed(i915, engine);
+			igt_subtest_f("%s-off", name)
+				test_off(i915, engine);
+			igt_subtest_f("%s-long", name)
+				test_long(i915, engine);
+
+			free(name);
+			close(engine);
+		}
+	}
+
+	igt_fixture {
+		close(engines);
+		close(i915);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index 338da2e95..8d0964fe0 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -239,6 +239,7 @@ i915_progs = [
 	'i915_query',
 	'i915_selftest',
 	'i915_suspend',
+	'sysfs_heartbeat_interval',
 	'sysfs_preempt_timeout',
 ]
 
-- 
2.24.0

* [igt-dev] [PATCH i-g-t 7/9] i915: Exercise sysfs heartbeat controls
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

We [will] expose various per-engine scheduling controls. One of these,
'heartbeat_interval_ms', defines how often we send a heartbeat down the
engine to check upon the health of the engine. If a heartbeat does not
complete within the interval (or two), the engine is declared hung.
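
As a rough illustration of the pass criterion the timing subtests in the
patch below apply (three intervals plus 150ms of slack with forced
preemption, five intervals plus 150ms without), here is a minimal sketch.
The helper name reset_within_bound() and its parameters are hypothetical
and not part of IGT; only the arithmetic mirrors the assertions in the
test.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of IGT: returns true if a hog was reset
 * within the allowed window, i.e. within `beats` heartbeat intervals plus
 * a fixed slack, all expressed in milliseconds.
 */
static bool reset_within_bound(uint64_t elapsed_ns, unsigned int interval_ms,
			       unsigned int beats, unsigned int slack_ms)
{
	/* Truncating conversion from nanoseconds to milliseconds */
	uint64_t elapsed_ms = elapsed_ns / 1000 / 1000;
	uint64_t limit_ms = (uint64_t)beats * interval_ms + slack_ms;

	return elapsed_ms < limit_ms;
}
```

With a 50ms interval, three beats and 150ms of slack, anything measured
below 300ms passes; the bound is strict, so exactly 300ms fails.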

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/Makefile.sources                |   1 +
 tests/i915/sysfs_heartbeat_interval.c | 478 ++++++++++++++++++++++++++
 tests/meson.build                     |   1 +
 3 files changed, 480 insertions(+)
 create mode 100644 tests/i915/sysfs_heartbeat_interval.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 413952c7c..13544133a 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -98,6 +98,7 @@ TESTS_progs = \
 	tools_test \
 	vgem_basic \
 	vgem_slow \
+	i915/sysfs_heartbeat_interval \
 	i915/sysfs_preempt_timeout \
 	$(NULL)
 
diff --git a/tests/i915/sysfs_heartbeat_interval.c b/tests/i915/sysfs_heartbeat_interval.c
new file mode 100644
index 000000000..ba3c523fb
--- /dev/null
+++ b/tests/i915/sysfs_heartbeat_interval.c
@@ -0,0 +1,478 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <signal.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#include "drmtest.h" /* gem_quiescent_gpu()! */
+#include "i915/gem_engine_topology.h"
+#include "igt_dummyload.h"
+#include "igt_sysfs.h"
+#include "ioctl_wrappers.h" /* igt_require_gem()! */
+#include "sw_sync.h"
+
+#include "igt_debugfs.h"
+
+static bool __enable_hangcheck(int dir, bool state)
+{
+	return igt_sysfs_set(dir, "enable_hangcheck", state ? "1" : "0");
+}
+
+static void enable_hangcheck(int i915, bool state)
+{
+	int dir;
+
+	dir = igt_sysfs_open_parameters(i915);
+	if (dir < 0) /* no parameters, must be default! */
+		return;
+
+	__enable_hangcheck(dir, state);
+	close(dir);
+}
+
+static void set_heartbeat(int engine, unsigned int value)
+{
+	unsigned int delay = ~value;
+
+	igt_sysfs_printf(engine, "heartbeat_interval_ms", "%u", value);
+	igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &delay);
+	igt_assert_eq(delay, value);
+}
+
+static void test_idempotent(int i915, int engine)
+{
+	unsigned int delays[] = { 1, 1000, 5000, 50000, 123456789 };
+	unsigned int saved;
+
+	/* Quick test that the property reports the values we set */
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++)
+		set_heartbeat(engine, delays[i]);
+
+	set_heartbeat(engine, saved);
+}
+
+static void test_invalid(int i915, int engine)
+{
+	unsigned int saved, delay;
+
+	/* Quick test that we reject any unrepresentable intervals */
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+
+	igt_sysfs_printf(engine, "heartbeat_interval_ms", "%" PRIu64, (uint64_t)-1);
+	igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "heartbeat_interval_ms", "%" PRIu64, (uint64_t)10 << 32);
+	igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+}
+
+static void set_unbannable(int i915, uint32_t ctx)
+{
+	struct drm_i915_gem_context_param p = {
+		.ctx_id = ctx,
+		.param = I915_CONTEXT_PARAM_BANNABLE,
+	};
+
+	igt_assert_eq(__gem_context_set_param(i915, &p), 0);
+}
+
+static uint32_t create_context(int i915, unsigned int class, unsigned int inst, int prio)
+{
+	uint32_t ctx;
+
+	ctx = gem_context_create_for_engine(i915, class, inst);
+	set_unbannable(i915, ctx);
+	gem_context_set_priority(i915, ctx, prio);
+
+	return ctx;
+}
+
+static uint64_t __test_timeout(int i915, int engine, unsigned int timeout)
+{
+	unsigned int class, inst;
+	struct timespec ts = {};
+	igt_spin_t *spin[2];
+	uint64_t elapsed;
+	uint32_t ctx[2];
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_heartbeat(engine, timeout);
+
+	ctx[0] = create_context(i915, class, inst, 1023);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, -1023);
+	igt_nsec_elapsed(&ts);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+	igt_spin_busywait_until_started(spin[1]);
+	elapsed = igt_nsec_elapsed(&ts);
+
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+	gem_quiescent_gpu(i915);
+
+	return elapsed;
+}
+
+static void test_precise(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * The heartbeat interval defines how long the kernel waits between
+	 * checking on the status of the engines. It first sends down a
+	 * heartbeat pulse, waits the interval and sees if the system managed
+	 * to complete the pulse. If not, it gives a priority bump to the pulse
+	 * and waits again. This is repeated until the priority cannot be bumped
+	 * any more, and the system is declared hung.
+	 *
+	 * If we combine the preemptive pulse with forced preemption, we instead
+	 * get a much faster hang detection. Thus in combination we can measure
+	 * the system response time to resetting a hog as a measure of the
+	 * heartbeat interval, and so confirm it matches our specification.
+	 */
+
+	igt_require(igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 1) == 1);
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+	gem_quiescent_gpu(i915);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_timeout(i915, engine, delays[i]);
+		igt_info("heartbeat_interval_ms:%d, elapsed=%.3fms[%d]\n",
+			 delays[i], elapsed * 1e-6,
+				(int)(elapsed / 1000 / 1000)
+			 );
+
+		/*
+		 * It takes a couple of missed heartbeats before we start
+		 * terminating hogs, and a little bit of jiffie slack for
+		 * scheduling at each step. 150ms should cover all of our
+		 * sins and be useful tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < 3 * delays[i] + 150,
+			     "Heartbeat interval (and CPR) exceeded request!\n");
+	}
+
+	gem_quiescent_gpu(i915);
+	set_heartbeat(engine, saved);
+}
+
+static void test_nopreempt(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * The same principle as test_precise(), except that forced preemption
+	 * is disabled (or simply not supported by the platform). This time,
+	 * it waits until the system misses a few heartbeats before doing a
+	 * per-engine/full-gpu reset. As such it is less precise, but we
+	 * can still estimate an upper bound for our specified heartbeat
+	 * interval and verify the system conforms.
+	 */
+
+	/* Test heartbeats with forced preemption disabled */
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 0);
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+	gem_quiescent_gpu(i915);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_timeout(i915, engine, delays[i]);
+		igt_info("heartbeat_interval_ms:%d, elapsed=%.3fms[%d]\n",
+			 delays[i], elapsed * 1e-6,
+				(int)(elapsed / 1000 / 1000)
+			 );
+
+		/*
+		 * It takes several missed heartbeats before we start
+		 * terminating hogs, and a little bit of jiffie slack for
+		 * scheduling at each step. 150ms should cover all of our
+		 * sins and be useful tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < 5 * delays[i] + 150,
+			     "Heartbeat interval (and CPR) exceeded request!\n");
+	}
+
+	gem_quiescent_gpu(i915);
+	set_heartbeat(engine, saved);
+}
+
+static void client(int i915, int engine, int *ctl, int duration, int expect)
+{
+	unsigned int class, inst;
+	unsigned long count = 0;
+	uint32_t ctx;
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	ctx = create_context(i915, class, inst, 0);
+
+	while (!READ_ONCE(*ctl)) {
+		igt_spin_t *spin;
+
+		spin = igt_spin_new(i915, ctx,
+				    .flags = (IGT_SPIN_NO_PREEMPTION |
+					      IGT_SPIN_POLL_RUN |
+					      IGT_SPIN_FENCE_OUT));
+		igt_spin_busywait_until_started(spin);
+
+		igt_spin_set_timeout(spin, (uint64_t)duration * 1000 * 1000);
+		sync_fence_wait(spin->out_fence, -1);
+
+		igt_assert_eq(sync_fence_status(spin->out_fence), expect);
+		count++;
+	}
+
+	gem_context_destroy(i915, ctx);
+	igt_info("%s client completed %lu spins\n",
+		 expect < 0 ? "Bad" : "Good", count);
+}
+
+static void sigign(int sig)
+{
+}
+
+static void wait_until(int duration)
+{
+	signal(SIGCHLD, sigign);
+	sleep(duration);
+	signal(SIGCHLD, SIG_IGN);
+}
+
+static void __test_mixed(int i915, int engine,
+			 int heartbeat,
+			 int good,
+			 int bad,
+			 int duration)
+{
+	unsigned int saved;
+	int *shared;
+
+	/*
+	 * Given two clients of which one is a hog, be sure we cleanly
+	 * terminate the hog leaving the good client to run.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+	gem_quiescent_gpu(i915);
+
+	shared = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
+	igt_assert(shared != MAP_FAILED);
+
+	set_heartbeat(engine, heartbeat);
+
+	igt_fork(child, 1) /* good client */
+		client(i915, engine, shared, good, 1);
+	igt_fork(child, 1) /* bad client */
+		client(i915, engine, shared, bad, -EIO);
+
+	wait_until(duration);
+
+	*shared = true;
+	igt_waitchildren();
+	munmap(shared, 4096);
+
+	gem_quiescent_gpu(i915);
+	set_heartbeat(engine, saved);
+}
+
+static void test_mixed(int i915, int engine)
+{
+	/*
+	 * Hogs rarely run alone. Our hang detection must carefully weed
+	 * out the hogs from the innocent clients. Thus we run a mixed workload
+	 * with non-preemptable hogs that exceed the heartbeat, and quicker
+	 * innocents. We inspect the fence status of each to verify that
+	 * only the hogs are reset.
+	 */
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 1);
+	__test_mixed(i915, engine, 10, 10, 100, 5);
+}
+
+static void test_long(int i915, int engine)
+{
+	/*
+	 * Some clients relish being hogs, and demand that the system
+	 * never do hangchecking. Never is hard to test, so instead we
+	 * run over a day and verify that only the super hogs are reset.
+	 */
+	igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 0);
+	__test_mixed(i915, engine,
+		     60 * 1000, /* 60s */
+		     60 * 1000, /* 60s */
+		     300 * 1000, /* 5min */
+		     24 * 3600 /* 24hours */);
+}
+
+static void test_off(int i915, int engine)
+{
+	unsigned int class, inst;
+	unsigned int saved;
+	igt_spin_t *spin;
+	uint32_t ctx;
+
+	/*
+	 * Some other clients request that there is never any interruption
+	 * or jitter in their workload and so demand that the kernel never
+	 * sends a heartbeat to steal precious cycles from their workload.
+	 * Turn off the heartbeat and check that the workload is uninterrupted
+	 * for 150s.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "heartbeat_interval_ms", "%u", &saved) == 1);
+	igt_debug("Initial heartbeat_interval_ms:%u\n", saved);
+	gem_quiescent_gpu(i915);
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_heartbeat(engine, 0);
+
+	ctx = create_context(i915, class, inst, 0);
+
+	spin = igt_spin_new(i915, ctx,
+			    .flags = (IGT_SPIN_POLL_RUN |
+				      IGT_SPIN_NO_PREEMPTION |
+				      IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin);
+
+	for (int i = 0; i < 150; i++) {
+		igt_assert_eq(sync_fence_status(spin->out_fence), 0);
+		sleep(1);
+	}
+
+	set_heartbeat(engine, 1);
+
+	igt_assert_eq(sync_fence_wait(spin->out_fence, 250), 0);
+	igt_assert_eq(sync_fence_status(spin->out_fence), -EIO);
+
+	igt_spin_free(i915, spin);
+
+	gem_quiescent_gpu(i915);
+	set_heartbeat(engine, saved);
+}
+
+igt_main
+{
+	const struct intel_execution_engine2 *it;
+	int i915 = -1, engines = -1;
+
+	igt_fixture {
+		int sys;
+
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+		igt_allow_hang(i915, 0, 0);
+
+		sys = igt_sysfs_open(i915);
+		igt_require(sys != -1);
+
+		engines = openat(sys, "engine", O_RDONLY);
+		igt_require(engines != -1);
+		close(sys);
+
+		enable_hangcheck(i915, true);
+	}
+
+	__for_each_static_engine(it) {
+		igt_subtest_group {
+			int engine = -1;
+			char *name = NULL;
+
+			igt_fixture {
+				struct stat st;
+
+				engine = openat(engines, it->name, O_RDONLY);
+				igt_require(fstatat(engine,
+						    "heartbeat_interval_ms",
+						    &st, 0) == 0);
+				name = igt_sysfs_get(engine, "name");
+				igt_require(name);
+			}
+			if (!name)
+				name = strdup(it->name);
+
+			igt_subtest_f("%s-idempotent", name)
+				test_idempotent(i915, engine);
+			igt_subtest_f("%s-invalid", name)
+				test_invalid(i915, engine);
+
+			igt_subtest_f("%s-precise", name)
+				test_precise(i915, engine);
+			igt_subtest_f("%s-nopreempt", name)
+				test_nopreempt(i915, engine);
+			igt_subtest_f("%s-mixed", name)
+				test_mixed(i915, engine);
+			igt_subtest_f("%s-off", name)
+				test_off(i915, engine);
+			igt_subtest_f("%s-long", name)
+				test_long(i915, engine);
+
+			free(name);
+			close(engine);
+		}
+	}
+
+	igt_fixture {
+		close(engines);
+		close(i915);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index 338da2e95..8d0964fe0 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -239,6 +239,7 @@ i915_progs = [
 	'i915_query',
 	'i915_selftest',
 	'i915_suspend',
+	'sysfs_heartbeat_interval',
 	'sysfs_preempt_timeout',
 ]
 
-- 
2.24.0

* [PATCH i-g-t 8/9] i915: Exercise timeslice sysfs property
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

We [will] expose various per-engine scheduling controls. One of these,
'timeslice_duration_ms', defines the scheduling quantum. If a context
exhausts its timeslice, it will be preempted in favour of running one of
its compatriots.
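
To make the measurement in the duration subtest below easier to follow,
here is a minimal sketch of the post-processing step: take the deltas
between successive timestamp samples, pick the median, and convert ticks
to nanoseconds. The helper name median_timeslice_ns() and its signature
are hypothetical; unlike the patch it halves the median in floating
point and uses a comparison that cannot wrap, so treat it as an
illustration rather than the test's exact arithmetic.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Comparison for qsort() that avoids unsigned subtraction wrap-around */
static int cmp_u32(const void *_a, const void *_b)
{
	const uint32_t *a = _a, *b = _b;

	return (*a > *b) - (*a < *b);
}

/*
 * Hypothetical helper: estimate one timeslice in nanoseconds from `count`
 * timestamp samples. `ts` is overwritten with the sorted deltas, and
 * `tick_ns` is the timestamp period (1e9 / frequency in Hz).
 */
static double median_timeslice_ns(uint32_t *ts, size_t count, double tick_ns)
{
	size_t n;

	if (count < 2)
		return 0.0;

	n = count - 1;
	for (size_t i = 0; i < n; i++)
		ts[i] = ts[i + 1] - ts[i]; /* modular u32 math copes with wrap */
	qsort(ts, n, sizeof(*ts), cmp_u32);

	/* Assumes two timeslices elapse between consecutive samples */
	return tick_ns * ts[n / 2] / 2.0;
}
```

Using the median rather than the mean means a single outlier, such as
one delayed context switch, does not skew the estimate.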

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/Makefile.sources                |   1 +
 tests/i915/sysfs_timeslice_duration.c | 519 ++++++++++++++++++++++++++
 tests/meson.build                     |   1 +
 3 files changed, 521 insertions(+)
 create mode 100644 tests/i915/sysfs_timeslice_duration.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 13544133a..e17d43155 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -100,6 +100,7 @@ TESTS_progs = \
 	vgem_slow \
 	i915/sysfs_heartbeat_interval \
 	i915/sysfs_preempt_timeout \
+	i915/sysfs_timeslice_duration \
 	$(NULL)
 
 TESTS_progs += gem_bad_reloc
diff --git a/tests/i915/sysfs_timeslice_duration.c b/tests/i915/sysfs_timeslice_duration.c
new file mode 100644
index 000000000..02f1f31f8
--- /dev/null
+++ b/tests/i915/sysfs_timeslice_duration.c
@@ -0,0 +1,519 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "drmtest.h" /* gem_quiescent_gpu()! */
+#include "i915/gem_engine_topology.h"
+#include "i915/gem_mman.h"
+#include "igt_dummyload.h"
+#include "igt_sysfs.h"
+#include "ioctl_wrappers.h" /* igt_require_gem()! */
+#include "intel_chipset.h"
+#include "intel_reg.h"
+#include "sw_sync.h"
+
+#define MI_SEMAPHORE_WAIT		(0x1c << 23)
+#define   MI_SEMAPHORE_POLL             (1 << 15)
+#define   MI_SEMAPHORE_SAD_GT_SDD       (0 << 12)
+#define   MI_SEMAPHORE_SAD_GTE_SDD      (1 << 12)
+#define   MI_SEMAPHORE_SAD_LT_SDD       (2 << 12)
+#define   MI_SEMAPHORE_SAD_LTE_SDD      (3 << 12)
+#define   MI_SEMAPHORE_SAD_EQ_SDD       (4 << 12)
+#define   MI_SEMAPHORE_SAD_NEQ_SDD      (5 << 12)
+
+static bool __enable_hangcheck(int dir, bool state)
+{
+	return igt_sysfs_set(dir, "enable_hangcheck", state ? "1" : "0");
+}
+
+static bool enable_hangcheck(int i915, bool state)
+{
+	bool success;
+	int dir;
+
+	dir = igt_sysfs_open_parameters(i915);
+	if (dir < 0) /* no parameters, must be default! */
+		return false;
+
+	success = __enable_hangcheck(dir, state);
+	close(dir);
+
+	return success;
+}
+
+static void set_timeslice(int engine, unsigned int value)
+{
+	unsigned int delay;
+
+	igt_sysfs_printf(engine, "timeslice_duration_ms", "%u", value);
+	igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &delay);
+	igt_assert_eq(delay, value);
+}
+
+static void test_idempotent(int i915, int engine)
+{
+	const unsigned int delays[] = { 0, 1, 1234, 654321 };
+	unsigned int saved;
+
+	/* Quick test to verify the kernel reports the same values as we write */
+
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++)
+		set_timeslice(engine, delays[i]);
+
+	set_timeslice(engine, saved);
+}
+
+static void test_invalid(int i915, int engine)
+{
+	unsigned int saved, delay;
+
+	/* Quick test that non-representable delays are rejected */
+
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	igt_sysfs_printf(engine, "timeslice_duration_ms", "%" PRIu64, (uint64_t)-1);
+	igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "timeslice_duration_ms", "%d", -1);
+	igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "timeslice_duration_ms", "%" PRIu64, (uint64_t)123 << 32);
+	igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+}
+
+static void set_unbannable(int i915, uint32_t ctx)
+{
+	struct drm_i915_gem_context_param p = {
+		.ctx_id = ctx,
+		.param = I915_CONTEXT_PARAM_BANNABLE,
+	};
+
+	igt_assert_eq(__gem_context_set_param(i915, &p), 0);
+}
+
+static uint32_t create_context(int i915, unsigned int class, unsigned int inst, int prio)
+{
+	uint32_t ctx;
+
+	ctx = gem_context_create_for_engine(i915, class, inst);
+	set_unbannable(i915, ctx);
+	gem_context_set_priority(i915, ctx, prio);
+
+	return ctx;
+}
+
+static int cmp_u32(const void *_a, const void *_b)
+{
+	const uint32_t *a = _a, *b = _b;
+
+	return (*a > *b) - (*a < *b);
+}
+
+static double clockrate(int i915)
+{
+	int freq;
+	drm_i915_getparam_t gp = {
+		.value = &freq,
+		.param = I915_PARAM_CS_TIMESTAMP_FREQUENCY,
+	};
+
+	igt_require(igt_ioctl(i915, DRM_IOCTL_I915_GETPARAM, &gp) == 0);
+	return 1e9 / freq;
+}
+
+static uint64_t __test_duration(int i915, int engine, unsigned int timeout)
+{
+	struct drm_i915_gem_exec_object2 obj[3] = {
+		{
+			.handle = gem_create(i915, 4096),
+			.offset = 0,
+			.flags = EXEC_OBJECT_PINNED,
+		},
+		{
+			.handle = gem_create(i915, 4096),
+			.offset = 4096,
+			.flags = EXEC_OBJECT_PINNED,
+		},
+		{ gem_create(i915, 4096) }
+	};
+	struct drm_i915_gem_execbuffer2 eb = {
+		.buffer_count = ARRAY_SIZE(obj),
+		.buffers_ptr = to_user_pointer(obj),
+	};
+	const int gen = intel_gen(intel_get_drm_devid(i915));
+	double duration = clockrate(i915);
+	unsigned int class, inst, mmio;
+	uint32_t *cs, *map;
+	uint32_t ctx[2];
+	int start;
+	int i;
+
+	igt_require(gem_scheduler_has_preemption(i915));
+	igt_require(gen >= 8); /* MI_SEMAPHORE_WAIT */
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+	igt_require(igt_sysfs_scanf(engine, "mmio_base", "%x", &mmio) == 1);
+
+	set_timeslice(engine, timeout);
+
+	ctx[0] = create_context(i915, class, inst, 0);
+	ctx[1] = create_context(i915, class, inst, 0);
+
+	map = gem_mmap__cpu(i915, obj[2].handle, 0, 4096, PROT_WRITE);
+
+	cs = map;
+	for (i = 0; i < 10; i++) {
+		*cs++ = MI_SEMAPHORE_WAIT |
+			MI_SEMAPHORE_POLL |
+			MI_SEMAPHORE_SAD_NEQ_SDD |
+			(4 - 2 + (gen >= 12));
+		*cs++ = 0;
+		*cs++ = obj[0].offset + sizeof(uint32_t) * i;
+		*cs++ = 0;
+		if (gen >= 12)
+			*cs++ = 0;
+
+		*cs++ = 0x24 << 23 | 2; /* SRM */
+		*cs++ = mmio + 0x358;
+		*cs++ = obj[1].offset + sizeof(uint32_t) * i;
+		*cs++ = 0;
+
+		*cs++ = MI_STORE_DWORD_IMM;
+		*cs++ = obj[0].offset +
+			4096 - sizeof(uint32_t) * i - sizeof(uint32_t);
+		*cs++ = 0;
+		*cs++ = 1;
+	}
+	*cs++ = MI_BATCH_BUFFER_END;
+
+	cs += 16 - ((cs - map) & 15);
+	start = (cs - map) * sizeof(*cs);
+	for (i = 0; i < 10; i++) {
+		*cs++ = MI_STORE_DWORD_IMM;
+		*cs++ = obj[0].offset + sizeof(uint32_t) * i;
+		*cs++ = 0;
+		*cs++ = 1;
+
+		*cs++ = MI_SEMAPHORE_WAIT |
+			MI_SEMAPHORE_POLL |
+			MI_SEMAPHORE_SAD_NEQ_SDD |
+			(4 - 2 + (gen >= 12));
+		*cs++ = 0;
+		*cs++ = obj[0].offset +
+			4096 - sizeof(uint32_t) * i - sizeof(uint32_t);
+		*cs++ = 0;
+		if (gen >= 12)
+			*cs++ = 0;
+	}
+	*cs++ = MI_BATCH_BUFFER_END;
+	igt_assert(cs - map < 4096 / sizeof(*cs));
+	munmap(map, 4096);
+
+	eb.rsvd1 = ctx[0];
+	gem_execbuf(i915, &eb);
+
+	eb.rsvd1 = ctx[1];
+	eb.batch_start_offset = start;
+	gem_execbuf(i915, &eb);
+
+	gem_sync(i915, obj[2].handle);
+
+	gem_set_domain(i915, obj[1].handle,
+		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
+	map = gem_mmap__cpu(i915, obj[1].handle, 0, 4096, PROT_WRITE);
+	for (i = 0; i < 9; i++)
+		map[i] = map[i + 1] - map[i];
+	qsort(map, 9, sizeof(*map), cmp_u32);
+	duration *= map[4] / 2; /* 2 sema-waits between timestamp updates */
+	munmap(map, 4096);
+
+	for (i = 0; i < ARRAY_SIZE(ctx); i++)
+		gem_context_destroy(i915, ctx[i]);
+
+	for (i = 0; i < ARRAY_SIZE(obj); i++)
+		gem_close(i915, obj[i].handle);
+
+	return duration;
+}
+
+static void test_duration(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * Timeslicing at its very basic level is sharing the GPU by
+	 * running one context for an interval before running another. After
+	 * each interval the running context is swapped for another runnable
+	 * context.
+	 *
+	 * We can measure this directly by watching the xCS_TIMESTAMP and
+	 * recording its value every time we switch into the context, using
+	 * a couple of semaphores to busyspin for the timeslice.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_duration(i915, engine, delays[i]);
+		igt_info("timeslice_duration_ms:%d, elapsed=%.3fms\n",
+			 delays[i], elapsed * 1e-6);
+
+		/*
+		 * We need to give a couple of jiffies slack for the scheduler timeouts
+		 * and then a little more slack for the overhead in submitting and
+		 * measuring. 50ms should cover all of our sins and be useful
+		 * tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < delays[i] + 50,
+			     "Timeslice exceeded request!\n");
+	}
+
+	gem_quiescent_gpu(i915);
+	set_timeslice(engine, saved);
+}
+
+static uint64_t __test_timeout(int i915, int engine, unsigned int timeout)
+{
+	unsigned int class, inst;
+	struct timespec ts = {};
+	igt_spin_t *spin[2];
+	uint64_t elapsed;
+	uint32_t ctx[2];
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_timeslice(engine, timeout);
+
+	ctx[0] = create_context(i915, class, inst, 0);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, 0);
+	igt_nsec_elapsed(&ts);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+	igt_spin_busywait_until_started(spin[1]);
+	elapsed = igt_nsec_elapsed(&ts);
+
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+	gem_quiescent_gpu(i915);
+
+	return elapsed;
+}
+
+static void test_timeout(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * Timeslicing requires us to preempt the running context in order to
+	 * switch into its contemporary. If we couple an unpreemptable hog
+	 * with a fast forced reset, we can measure the timeslice by how long
+	 * it takes for the hog to be reset and the high priority context
+	 * to complete.
+	 */
+
+	igt_require(igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 1) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+	igt_require(enable_hangcheck(i915, false));
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_timeout(i915, engine, delays[i]);
+		igt_info("timeslice_duration_ms:%d, elapsed=%.3fms\n",
+			 delays[i], elapsed * 1e-6);
+
+		/*
+		 * We need to give a couple of jiffies slack for the scheduler timeouts
+		 * and then a little more slack for the overhead in submitting and
+		 * measuring. 50ms should cover all of our sins and be useful
+		 * tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < delays[i] + 50,
+			     "Timeslice exceeded request!\n");
+	}
+
+	igt_assert(enable_hangcheck(i915, true));
+	gem_quiescent_gpu(i915);
+	set_timeslice(engine, saved);
+}
+
+static void test_off(int i915, int engine)
+{
+	unsigned int class, inst;
+	unsigned int saved;
+	igt_spin_t *spin[2];
+	uint32_t ctx[2];
+
+	/*
+	 * As always, there are some who must run uninterrupted and simply do
+	 * not want to share the GPU even for a microsecond. Those greedy
+	 * clients can disable timeslicing entirely, and so set the timeslice
+	 * to 0. We test that a hog is not preempted within the 150s of
+	 * our boredom threshold.
+	 */
+
+	igt_require(igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 1) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+	igt_require(enable_hangcheck(i915, false));
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_timeslice(engine, 0);
+
+	ctx[0] = create_context(i915, class, inst, 0);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, 0);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+
+	for (int i = 0; i < 150; i++) {
+		igt_assert_eq(sync_fence_status(spin[0]->out_fence), 0);
+		sleep(1);
+	}
+
+	set_timeslice(engine, 1);
+
+	igt_spin_busywait_until_started(spin[1]);
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+
+	igt_assert(enable_hangcheck(i915, true));
+	gem_quiescent_gpu(i915);
+
+	set_timeslice(engine, saved);
+}
+
+igt_main
+{
+	const struct intel_execution_engine2 *it;
+	int i915 = -1, engines = -1;
+
+	igt_fixture {
+		int sys;
+
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+		igt_allow_hang(i915, 0, 0);
+
+		sys = igt_sysfs_open(i915);
+		igt_require(sys != -1);
+
+		engines = openat(sys, "engine", O_RDONLY);
+		igt_require(engines != -1);
+
+		close(sys);
+	}
+
+	__for_each_static_engine(it) {
+		igt_subtest_group {
+			int engine = -1;
+			char *name = NULL;
+
+			igt_fixture {
+				struct stat st;
+
+				engine = openat(engines, it->name, O_RDONLY);
+				igt_require(fstatat(engine,
+							"timeslice_duration_ms",
+							&st, 0) == 0);
+
+				name = igt_sysfs_get(engine, "name");
+				igt_require(name);
+			}
+			if (!name)
+				name = strdup(it->name);
+
+			igt_subtest_f("%s-idempotent", name)
+				test_idempotent(i915, engine);
+			igt_subtest_f("%s-invalid", name)
+				test_invalid(i915, engine);
+			igt_subtest_f("%s-duration", name)
+				test_duration(i915, engine);
+			igt_subtest_f("%s-timeout", name)
+				test_timeout(i915, engine);
+			igt_subtest_f("%s-off", name)
+				test_off(i915, engine);
+
+			free(name);
+			close(engine);
+		}
+	}
+
+	igt_fixture {
+		close(engines);
+		close(i915);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index 8d0964fe0..b0c567594 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -241,6 +241,7 @@ i915_progs = [
 	'i915_suspend',
 	'sysfs_heartbeat_interval',
 	'sysfs_preempt_timeout',
+	'sysfs_timeslice_duration',
 ]
 
 test_deps = [ igt_deps ]
-- 
2.24.0


* [Intel-gfx] [PATCH i-g-t 8/9] i915: Exercise timeslice sysfs property
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

We [will] expose various per-engine scheduling controls. One of them,
'timeslice_duration_ms', defines the scheduling quantum: if a context
exhausts its timeslice, it will be preempted in favour of running one of
its compatriots.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/Makefile.sources                |   1 +
 tests/i915/sysfs_timeslice_duration.c | 519 ++++++++++++++++++++++++++
 tests/meson.build                     |   1 +
 3 files changed, 521 insertions(+)
 create mode 100644 tests/i915/sysfs_timeslice_duration.c
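
Note for readers of the thread (not part of the patch): the post-processing
at the end of __test_duration() can be sketched stand-alone as below. This is
only an illustration under assumptions: median_timeslice_ns() and
cmp_u32_safe() are hypothetical names that do not appear in the patch, and the
comparator is swapped for an overflow-safe form instead of the plain
subtraction used by cmp_u32().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical comparator: yields -1, 0 or 1 without relying on a wrapped difference. */
static int cmp_u32_safe(const void *_a, const void *_b)
{
	const uint32_t *a = _a, *b = _b;

	return (*a > *b) - (*a < *b);
}

/*
 * ts:        'count' raw timestamp samples (overwritten with sorted deltas)
 * period_ns: nanoseconds per timestamp tick
 *
 * Returns the estimated timeslice in nanoseconds.
 */
static uint64_t median_timeslice_ns(uint32_t *ts, int count, double period_ns)
{
	int n = count - 1; /* number of deltas between consecutive samples */

	for (int i = 0; i < n; i++)
		ts[i] = ts[i + 1] - ts[i]; /* unsigned subtraction tolerates counter wrap */

	/* Take the median delta so that a single slow switch does not skew the result. */
	qsort(ts, n, sizeof(*ts), cmp_u32_safe);

	/* Halved, as the patch notes 2 sema-waits between timestamp updates. */
	return (uint64_t)(period_ns * (ts[n / 2] / 2));
}
```

With ten samples this takes the median of nine deltas, which corresponds to
map[4] in the patch.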

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 13544133a..e17d43155 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -100,6 +100,7 @@ TESTS_progs = \
 	vgem_slow \
 	i915/sysfs_heartbeat_interval \
 	i915/sysfs_preempt_timeout \
+	i915/sysfs_timeslice_duration \
 	$(NULL)
 
 TESTS_progs += gem_bad_reloc
diff --git a/tests/i915/sysfs_timeslice_duration.c b/tests/i915/sysfs_timeslice_duration.c
new file mode 100644
index 000000000..02f1f31f8
--- /dev/null
+++ b/tests/i915/sysfs_timeslice_duration.c
@@ -0,0 +1,519 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "drmtest.h" /* gem_quiescent_gpu()! */
+#include "i915/gem_engine_topology.h"
+#include "i915/gem_mman.h"
+#include "igt_dummyload.h"
+#include "igt_sysfs.h"
+#include "ioctl_wrappers.h" /* igt_require_gem()! */
+#include "intel_chipset.h"
+#include "intel_reg.h"
+#include "sw_sync.h"
+
+#define MI_SEMAPHORE_WAIT		(0x1c << 23)
+#define   MI_SEMAPHORE_POLL             (1 << 15)
+#define   MI_SEMAPHORE_SAD_GT_SDD       (0 << 12)
+#define   MI_SEMAPHORE_SAD_GTE_SDD      (1 << 12)
+#define   MI_SEMAPHORE_SAD_LT_SDD       (2 << 12)
+#define   MI_SEMAPHORE_SAD_LTE_SDD      (3 << 12)
+#define   MI_SEMAPHORE_SAD_EQ_SDD       (4 << 12)
+#define   MI_SEMAPHORE_SAD_NEQ_SDD      (5 << 12)
+
+static bool __enable_hangcheck(int dir, bool state)
+{
+	return igt_sysfs_set(dir, "enable_hangcheck", state ? "1" : "0");
+}
+
+static bool enable_hangcheck(int i915, bool state)
+{
+	bool success;
+	int dir;
+
+	dir = igt_sysfs_open_parameters(i915);
+	if (dir < 0) /* no parameters, must be default! */
+		return false;
+
+	success = __enable_hangcheck(dir, state);
+	close(dir);
+
+	return success;
+}
+
+static void set_timeslice(int engine, unsigned int value)
+{
+	unsigned int delay;
+
+	igt_sysfs_printf(engine, "timeslice_duration_ms", "%u", value);
+	igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &delay);
+	igt_assert_eq(delay, value);
+}
+
+static void test_idempotent(int i915, int engine)
+{
+	const unsigned int delays[] = { 0, 1, 1234, 654321 };
+	unsigned int saved;
+
+	/* Quick test to verify the kernel reports the same values as we write */
+
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++)
+		set_timeslice(engine, delays[i]);
+
+	set_timeslice(engine, saved);
+}
+
+static void test_invalid(int i915, int engine)
+{
+	unsigned int saved, delay;
+
+	/* Quick test that non-representable delays are rejected */
+
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	igt_sysfs_printf(engine, "timeslice_duration_ms", "%" PRIu64, (uint64_t)-1);
+	igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "timeslice_duration_ms", "%d", -1);
+	igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "timeslice_duration_ms", "%" PRIu64, (uint64_t)123 << 32);
+	igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+}
+
+static void set_unbannable(int i915, uint32_t ctx)
+{
+	struct drm_i915_gem_context_param p = {
+		.ctx_id = ctx,
+		.param = I915_CONTEXT_PARAM_BANNABLE,
+	};
+
+	igt_assert_eq(__gem_context_set_param(i915, &p), 0);
+}
+
+static uint32_t create_context(int i915, unsigned int class, unsigned int inst, int prio)
+{
+	uint32_t ctx;
+
+	ctx = gem_context_create_for_engine(i915, class, inst);
+	set_unbannable(i915, ctx);
+	gem_context_set_priority(i915, ctx, prio);
+
+	return ctx;
+}
+
+static int cmp_u32(const void *_a, const void *_b)
+{
+	const uint32_t *a = _a, *b = _b;
+
+	return *a - *b;
+}
+
+static double clockrate(int i915)
+{
+	int freq;
+	drm_i915_getparam_t gp = {
+		.value = &freq,
+		.param = I915_PARAM_CS_TIMESTAMP_FREQUENCY,
+	};
+
+	igt_require(igt_ioctl(i915, DRM_IOCTL_I915_GETPARAM, &gp) == 0);
+	return 1e9 / freq;
+}
+
+static uint64_t __test_duration(int i915, int engine, unsigned int timeout)
+{
+	struct drm_i915_gem_exec_object2 obj[3] = {
+		{
+			.handle = gem_create(i915, 4096),
+			.offset = 0,
+			.flags = EXEC_OBJECT_PINNED,
+		},
+		{
+			.handle = gem_create(i915, 4096),
+			.offset = 4096,
+			.flags = EXEC_OBJECT_PINNED,
+		},
+		{ gem_create(i915, 4096) }
+	};
+	struct drm_i915_gem_execbuffer2 eb = {
+		.buffer_count = ARRAY_SIZE(obj),
+		.buffers_ptr = to_user_pointer(obj),
+	};
+	const int gen = intel_gen(intel_get_drm_devid(i915));
+	double duration = clockrate(i915);
+	unsigned int class, inst, mmio;
+	uint32_t *cs, *map;
+	uint32_t ctx[2];
+	int start;
+	int i;
+
+	igt_require(gem_scheduler_has_preemption(i915));
+	igt_require(gen >= 8); /* MI_SEMAPHORE_WAIT */
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+	igt_require(igt_sysfs_scanf(engine, "mmio_base", "%x", &mmio) == 1);
+
+	set_timeslice(engine, timeout);
+
+	ctx[0] = create_context(i915, class, inst, 0);
+	ctx[1] = create_context(i915, class, inst, 0);
+
+	map = gem_mmap__cpu(i915, obj[2].handle, 0, 4096, PROT_WRITE);
+
+	cs = map;
+	for (i = 0; i < 10; i++) {
+		*cs++ = MI_SEMAPHORE_WAIT |
+			MI_SEMAPHORE_POLL |
+			MI_SEMAPHORE_SAD_NEQ_SDD |
+			(4 - 2 + (gen >= 12));
+		*cs++ = 0;
+		*cs++ = obj[0].offset + sizeof(uint32_t) * i;
+		*cs++ = 0;
+		if (gen >= 12)
+			*cs++ = 0;
+
+		*cs++ = 0x24 << 23 | 2; /* SRM */
+		*cs++ = mmio + 0x358;
+		*cs++ = obj[1].offset + sizeof(uint32_t) * i;
+		*cs++ = 0;
+
+		*cs++ = MI_STORE_DWORD_IMM;
+		*cs++ = obj[0].offset +
+			4096 - sizeof(uint32_t) * i - sizeof(uint32_t);
+		*cs++ = 0;
+		*cs++ = 1;
+	}
+	*cs++ = MI_BATCH_BUFFER_END;
+
+	cs += 16 - ((cs - map) & 15);
+	start = (cs - map) * sizeof(*cs);
+	for (i = 0; i < 10; i++) {
+		*cs++ = MI_STORE_DWORD_IMM;
+		*cs++ = obj[0].offset + sizeof(uint32_t) * i;
+		*cs++ = 0;
+		*cs++ = 1;
+
+		*cs++ = MI_SEMAPHORE_WAIT |
+			MI_SEMAPHORE_POLL |
+			MI_SEMAPHORE_SAD_NEQ_SDD |
+			(4 - 2 + (gen >= 12));
+		*cs++ = 0;
+		*cs++ = obj[0].offset +
+			4096 - sizeof(uint32_t) * i - sizeof(uint32_t);
+		*cs++ = 0;
+		if (gen >= 12)
+			*cs++ = 0;
+	}
+	*cs++ = MI_BATCH_BUFFER_END;
+	igt_assert(cs - map < 4096 / sizeof(*cs));
+	munmap(map, 4096);
+
+	eb.rsvd1 = ctx[0];
+	gem_execbuf(i915, &eb);
+
+	eb.rsvd1 = ctx[1];
+	eb.batch_start_offset = start;
+	gem_execbuf(i915, &eb);
+
+	gem_sync(i915, obj[2].handle);
+
+	gem_set_domain(i915, obj[1].handle,
+		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
+	map = gem_mmap__cpu(i915, obj[1].handle, 0, 4096, PROT_WRITE);
+	for (i = 0; i < 9; i++)
+		map[i] = map[i + 1] - map[i];
+	qsort(map, 9, sizeof(*map), cmp_u32);
+	duration *= map[4] / 2; /* 2 sema-waits between timestamp updates */
+	munmap(map, 4096);
+
+	for (i = 0; i < ARRAY_SIZE(ctx); i++)
+		gem_context_destroy(i915, ctx[i]);
+
+	for (i = 0; i < ARRAY_SIZE(obj); i++)
+		gem_close(i915, obj[i].handle);
+
+	return duration;
+}
+
+static void test_duration(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * Timeslicing at its very basic level is sharing the GPU by
+	 * running one context for an interval before running another. After
+	 * each interval the running context is swapped for another runnable
+	 * context.
+	 *
+	 * We can measure this directly by watching the xCS_TIMESTAMP and
+	 * recording its value every time we switch into the context, using
+	 * a couple of semaphores to busyspin for the timeslice.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_duration(i915, engine, delays[i]);
+		igt_info("timeslice_duration_ms:%d, elapsed=%.3fms\n",
+			 delays[i], elapsed * 1e-6);
+
+		/*
+		 * We need to give a couple of jiffies slack for the scheduler timeouts
+		 * and then a little more slack for the overhead in submitting and
+		 * measuring. 50ms should cover all of our sins and be a useful
+		 * tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < delays[i] + 50,
+			     "Timeslice exceeded request!\n");
+	}
+
+	gem_quiescent_gpu(i915);
+	set_timeslice(engine, saved);
+}
+
+static uint64_t __test_timeout(int i915, int engine, unsigned int timeout)
+{
+	unsigned int class, inst;
+	struct timespec ts = {};
+	igt_spin_t *spin[2];
+	uint64_t elapsed;
+	uint32_t ctx[2];
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_timeslice(engine, timeout);
+
+	ctx[0] = create_context(i915, class, inst, 0);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, 0);
+	igt_nsec_elapsed(&ts);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+	igt_spin_busywait_until_started(spin[1]);
+	elapsed = igt_nsec_elapsed(&ts);
+
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+	gem_quiescent_gpu(i915);
+
+	return elapsed;
+}
+
+static void test_timeout(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * Timeslicing requires us to preempt the running context in order to
+	 * switch into its contemporary. If we couple an unpreemptable hog
+	 * with a fast forced reset, we can measure the timeslice by how long
+	 * it takes for the hog to be reset and the high priority context
+	 * to complete.
+	 */
+
+	igt_require(igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 1) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+	igt_require(enable_hangcheck(i915, false));
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_timeout(i915, engine, delays[i]);
+		igt_info("timeslice_duration_ms:%d, elapsed=%.3fms\n",
+			 delays[i], elapsed * 1e-6);
+
+		/*
+		 * We need to give a couple of jiffies slack for the scheduler timeouts
+		 * and then a little more slack for the overhead in submitting and
+		 * measuring. 50ms should cover all of our sins and be a useful
+		 * tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < delays[i] + 50,
+			     "Timeslice exceeded request!\n");
+	}
+
+	igt_assert(enable_hangcheck(i915, true));
+	gem_quiescent_gpu(i915);
+	set_timeslice(engine, saved);
+}
+
+static void test_off(int i915, int engine)
+{
+	unsigned int class, inst;
+	unsigned int saved;
+	igt_spin_t *spin[2];
+	uint32_t ctx[2];
+
+	/*
+	 * As always, there are some who must run uninterrupted and simply do
+	 * not want to share the GPU even for a microsecond. Those greedy
+	 * clients can disable timeslicing entirely, and so set the timeslice
+	 * to 0. We test that a hog is not preempted within the 150s of
+	 * our boredom threshold.
+	 */
+
+	igt_require(igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 1) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+	igt_require(enable_hangcheck(i915, false));
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_timeslice(engine, 0);
+
+	ctx[0] = create_context(i915, class, inst, 0);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, 0);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+
+	for (int i = 0; i < 150; i++) {
+		igt_assert_eq(sync_fence_status(spin[0]->out_fence), 0);
+		sleep(1);
+	}
+
+	set_timeslice(engine, 1);
+
+	igt_spin_busywait_until_started(spin[1]);
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+
+	igt_assert(enable_hangcheck(i915, true));
+	gem_quiescent_gpu(i915);
+
+	set_timeslice(engine, saved);
+}
+
+igt_main
+{
+	const struct intel_execution_engine2 *it;
+	int i915 = -1, engines = -1;
+
+	igt_fixture {
+		int sys;
+
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+		igt_allow_hang(i915, 0, 0);
+
+		sys = igt_sysfs_open(i915);
+		igt_require(sys != -1);
+
+		engines = openat(sys, "engine", O_RDONLY);
+		igt_require(engines != -1);
+
+		close(sys);
+	}
+
+	__for_each_static_engine(it) {
+		igt_subtest_group {
+			int engine = -1;
+			char *name = NULL;
+
+			igt_fixture {
+				struct stat st;
+
+				engine = openat(engines, it->name, O_RDONLY);
+				igt_require(fstatat(engine,
+							"timeslice_duration_ms",
+							&st, 0) == 0);
+
+				name = igt_sysfs_get(engine, "name");
+				igt_require(name);
+			}
+			if (!name)
+				name = strdup(it->name);
+
+			igt_subtest_f("%s-idempotent", name)
+				test_idempotent(i915, engine);
+			igt_subtest_f("%s-invalid", name)
+				test_invalid(i915, engine);
+			igt_subtest_f("%s-duration", name)
+				test_duration(i915, engine);
+			igt_subtest_f("%s-timeout", name)
+				test_timeout(i915, engine);
+			igt_subtest_f("%s-off", name)
+				test_off(i915, engine);
+
+			free(name);
+			close(engine);
+		}
+	}
+
+	igt_fixture {
+		close(engines);
+		close(i915);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index 8d0964fe0..b0c567594 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -241,6 +241,7 @@ i915_progs = [
 	'i915_suspend',
 	'sysfs_heartbeat_interval',
 	'sysfs_preempt_timeout',
+	'sysfs_timeslice_duration',
 ]
 
 test_deps = [ igt_deps ]
-- 
2.24.0


* [igt-dev] [PATCH i-g-t 8/9] i915: Exercise timeslice sysfs property
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

We [will] expose various per-engine scheduling controls. One of them,
'timeslice_duration_ms', defines the scheduling quantum: if a context
exhausts its timeslice, it will be preempted in favour of running one of
its compatriots.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/Makefile.sources                |   1 +
 tests/i915/sysfs_timeslice_duration.c | 519 ++++++++++++++++++++++++++
 tests/meson.build                     |   1 +
 3 files changed, 521 insertions(+)
 create mode 100644 tests/i915/sysfs_timeslice_duration.c
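
Note for readers of the thread (not part of the patch): the pass criterion
shared by the duration and timeout subtests reduces to a single comparison.
A minimal sketch, assuming the hypothetical names within_timeslice() and
SLACK_MS, neither of which appears in the patch:

```c
#include <stdbool.h>
#include <stdint.h>

#define SLACK_MS 50 /* the same 50ms allowance the subtests use */

/*
 * elapsed_ns:   measured interval in nanoseconds
 * requested_ms: value written to timeslice_duration_ms
 *
 * Returns true when the measurement is within the requested timeslice
 * plus the slack, using the same truncating ns -> ms conversion as the
 * igt_assert_f() in the subtests.
 */
static bool within_timeslice(uint64_t elapsed_ns, unsigned int requested_ms)
{
	return elapsed_ns / 1000 / 1000 < (uint64_t)requested_ms + SLACK_MS;
}
```

Because the nanosecond count is truncated to whole milliseconds before the
comparison, a 1ms request passes for any measurement below 51ms.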

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 13544133a..e17d43155 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -100,6 +100,7 @@ TESTS_progs = \
 	vgem_slow \
 	i915/sysfs_heartbeat_interval \
 	i915/sysfs_preempt_timeout \
+	i915/sysfs_timeslice_duration \
 	$(NULL)
 
 TESTS_progs += gem_bad_reloc
diff --git a/tests/i915/sysfs_timeslice_duration.c b/tests/i915/sysfs_timeslice_duration.c
new file mode 100644
index 000000000..02f1f31f8
--- /dev/null
+++ b/tests/i915/sysfs_timeslice_duration.c
@@ -0,0 +1,519 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "drmtest.h" /* gem_quiescent_gpu()! */
+#include "i915/gem_engine_topology.h"
+#include "i915/gem_mman.h"
+#include "igt_dummyload.h"
+#include "igt_sysfs.h"
+#include "ioctl_wrappers.h" /* igt_require_gem()! */
+#include "intel_chipset.h"
+#include "intel_reg.h"
+#include "sw_sync.h"
+
+#define MI_SEMAPHORE_WAIT		(0x1c << 23)
+#define   MI_SEMAPHORE_POLL             (1 << 15)
+#define   MI_SEMAPHORE_SAD_GT_SDD       (0 << 12)
+#define   MI_SEMAPHORE_SAD_GTE_SDD      (1 << 12)
+#define   MI_SEMAPHORE_SAD_LT_SDD       (2 << 12)
+#define   MI_SEMAPHORE_SAD_LTE_SDD      (3 << 12)
+#define   MI_SEMAPHORE_SAD_EQ_SDD       (4 << 12)
+#define   MI_SEMAPHORE_SAD_NEQ_SDD      (5 << 12)
+
+static bool __enable_hangcheck(int dir, bool state)
+{
+	return igt_sysfs_set(dir, "enable_hangcheck", state ? "1" : "0");
+}
+
+static bool enable_hangcheck(int i915, bool state)
+{
+	bool success;
+	int dir;
+
+	dir = igt_sysfs_open_parameters(i915);
+	if (dir < 0) /* no parameters, must be default! */
+		return false;
+
+	success = __enable_hangcheck(dir, state);
+	close(dir);
+
+	return success;
+}
+
+static void set_timeslice(int engine, unsigned int value)
+{
+	unsigned int delay;
+
+	igt_sysfs_printf(engine, "timeslice_duration_ms", "%u", value);
+	igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &delay);
+	igt_assert_eq(delay, value);
+}
+
+static void test_idempotent(int i915, int engine)
+{
+	const unsigned int delays[] = { 0, 1, 1234, 654321 };
+	unsigned int saved;
+
+	/* Quick test to verify the kernel reports the same values as we write */
+
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++)
+		set_timeslice(engine, delays[i]);
+
+	set_timeslice(engine, saved);
+}
+
+static void test_invalid(int i915, int engine)
+{
+	unsigned int saved, delay;
+
+	/* Quick test that non-representable delays are rejected */
+
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	igt_sysfs_printf(engine, "timeslice_duration_ms", "%" PRIu64, (uint64_t)-1);
+	igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "timeslice_duration_ms", "%d", -1);
+	igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+
+	igt_sysfs_printf(engine, "timeslice_duration_ms", "%" PRIu64, (uint64_t)123 << 32);
+	igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &delay);
+	igt_assert_eq(delay, saved);
+}
+
+static void set_unbannable(int i915, uint32_t ctx)
+{
+	struct drm_i915_gem_context_param p = {
+		.ctx_id = ctx,
+		.param = I915_CONTEXT_PARAM_BANNABLE,
+	};
+
+	igt_assert_eq(__gem_context_set_param(i915, &p), 0);
+}
+
+static uint32_t create_context(int i915, unsigned int class, unsigned int inst, int prio)
+{
+	uint32_t ctx;
+
+	ctx = gem_context_create_for_engine(i915, class, inst);
+	set_unbannable(i915, ctx);
+	gem_context_set_priority(i915, ctx, prio);
+
+	return ctx;
+}
+
+static int cmp_u32(const void *_a, const void *_b)
+{
+	const uint32_t *a = _a, *b = _b;
+
+	return *a - *b;
+}
+
+static double clockrate(int i915)
+{
+	int freq;
+	drm_i915_getparam_t gp = {
+		.value = &freq,
+		.param = I915_PARAM_CS_TIMESTAMP_FREQUENCY,
+	};
+
+	igt_require(igt_ioctl(i915, DRM_IOCTL_I915_GETPARAM, &gp) == 0);
+	return 1e9 / freq;
+}
+
+static uint64_t __test_duration(int i915, int engine, unsigned int timeout)
+{
+	struct drm_i915_gem_exec_object2 obj[3] = {
+		{
+			.handle = gem_create(i915, 4096),
+			.offset = 0,
+			.flags = EXEC_OBJECT_PINNED,
+		},
+		{
+			.handle = gem_create(i915, 4096),
+			.offset = 4096,
+			.flags = EXEC_OBJECT_PINNED,
+		},
+		{ gem_create(i915, 4096) }
+	};
+	struct drm_i915_gem_execbuffer2 eb = {
+		.buffer_count = ARRAY_SIZE(obj),
+		.buffers_ptr = to_user_pointer(obj),
+	};
+	const int gen = intel_gen(intel_get_drm_devid(i915));
+	double duration = clockrate(i915);
+	unsigned int class, inst, mmio;
+	uint32_t *cs, *map;
+	uint32_t ctx[2];
+	int start;
+	int i;
+
+	igt_require(gem_scheduler_has_preemption(i915));
+	igt_require(gen >= 8); /* MI_SEMAPHORE_WAIT */
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+	igt_require(igt_sysfs_scanf(engine, "mmio_base", "%x", &mmio) == 1);
+
+	set_timeslice(engine, timeout);
+
+	ctx[0] = create_context(i915, class, inst, 0);
+	ctx[1] = create_context(i915, class, inst, 0);
+
+	map = gem_mmap__cpu(i915, obj[2].handle, 0, 4096, PROT_WRITE);
+
+	cs = map;
+	for (i = 0; i < 10; i++) {
+		*cs++ = MI_SEMAPHORE_WAIT |
+			MI_SEMAPHORE_POLL |
+			MI_SEMAPHORE_SAD_NEQ_SDD |
+			(4 - 2 + (gen >= 12));
+		*cs++ = 0;
+		*cs++ = obj[0].offset + sizeof(uint32_t) * i;
+		*cs++ = 0;
+		if (gen >= 12)
+			*cs++ = 0;
+
+		*cs++ = 0x24 << 23 | 2; /* SRM */
+		*cs++ = mmio + 0x358;
+		*cs++ = obj[1].offset + sizeof(uint32_t) * i;
+		*cs++ = 0;
+
+		*cs++ = MI_STORE_DWORD_IMM;
+		*cs++ = obj[0].offset +
+			4096 - sizeof(uint32_t) * i - sizeof(uint32_t);
+		*cs++ = 0;
+		*cs++ = 1;
+	}
+	*cs++ = MI_BATCH_BUFFER_END;
+
+	cs += 16 - ((cs - map) & 15);
+	start = (cs - map) * sizeof(*cs);
+	for (i = 0; i < 10; i++) {
+		*cs++ = MI_STORE_DWORD_IMM;
+		*cs++ = obj[0].offset + sizeof(uint32_t) * i;
+		*cs++ = 0;
+		*cs++ = 1;
+
+		*cs++ = MI_SEMAPHORE_WAIT |
+			MI_SEMAPHORE_POLL |
+			MI_SEMAPHORE_SAD_NEQ_SDD |
+			(4 - 2 + (gen >= 12));
+		*cs++ = 0;
+		*cs++ = obj[0].offset +
+			4096 - sizeof(uint32_t) * i - sizeof(uint32_t);
+		*cs++ = 0;
+		if (gen >= 12)
+			*cs++ = 0;
+	}
+	*cs++ = MI_BATCH_BUFFER_END;
+	igt_assert(cs - map < 4096 / sizeof(*cs));
+	munmap(map, 4096);
+
+	eb.rsvd1 = ctx[0];
+	gem_execbuf(i915, &eb);
+
+	eb.rsvd1 = ctx[1];
+	eb.batch_start_offset = start;
+	gem_execbuf(i915, &eb);
+
+	gem_sync(i915, obj[2].handle);
+
+	gem_set_domain(i915, obj[1].handle,
+		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
+	map = gem_mmap__cpu(i915, obj[1].handle, 0, 4096, PROT_WRITE);
+	for (i = 0; i < 9; i++)
+		map[i] = map[i + 1] - map[i];
+	qsort(map, 9, sizeof(*map), cmp_u32);
+	duration *= map[4] / 2; /* 2 sema-waits between timestamp updates */
+	munmap(map, 4096);
+
+	for (i = 0; i < ARRAY_SIZE(ctx); i++)
+		gem_context_destroy(i915, ctx[i]);
+
+	for (i = 0; i < ARRAY_SIZE(obj); i++)
+		gem_close(i915, obj[i].handle);
+
+	return duration;
+}
+
+static void test_duration(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * Timeslicing at its very basic level is sharing the GPU by
+	 * running one context for an interval before running another. After
+	 * each interval the running context is swapped for another runnable
+	 * context.
+	 *
+	 * We can measure this directly by watching the xCS_TIMESTAMP and
+	 * recording its value every time we switch into the context, using
+	 * a couple of semaphores to busyspin for the timeslice.
+	 */
+
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_duration(i915, engine, delays[i]);
+		igt_info("timeslice_duration_ms:%d, elapsed=%.3fms\n",
+			 delays[i], elapsed * 1e-6);
+
+		/*
+		 * We need to give a couple of jiffies slack for the scheduler timeouts
+		 * and then a little more slack for the overhead in submitting and
+		 * measuring. 50ms should cover all of our sins and be a useful
+		 * tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < delays[i] + 50,
+			     "Timeslice exceeded request!\n");
+	}
+
+	gem_quiescent_gpu(i915);
+	set_timeslice(engine, saved);
+}
+
+static uint64_t __test_timeout(int i915, int engine, unsigned int timeout)
+{
+	unsigned int class, inst;
+	struct timespec ts = {};
+	igt_spin_t *spin[2];
+	uint64_t elapsed;
+	uint32_t ctx[2];
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_timeslice(engine, timeout);
+
+	ctx[0] = create_context(i915, class, inst, 0);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, 0);
+	igt_nsec_elapsed(&ts);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+	igt_spin_busywait_until_started(spin[1]);
+	elapsed = igt_nsec_elapsed(&ts);
+
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+	gem_quiescent_gpu(i915);
+
+	return elapsed;
+}
+
+static void test_timeout(int i915, int engine)
+{
+	int delays[] = { 1, 50, 100, 500 };
+	unsigned int saved;
+
+	/*
+	 * Timeslicing requires us to preempt the running context in order to
+	 * switch into its contemporary. If we couple an unpreemptable hog
+	 * with a fast forced reset, we can measure the timeslice by how long
+	 * it takes for the hog to be reset and the high priority context
+	 * to complete.
+	 */
+
+	igt_require(igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 1) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+	igt_require(enable_hangcheck(i915, false));
+
+	for (int i = 0; i < ARRAY_SIZE(delays); i++) {
+		uint64_t elapsed;
+
+		elapsed = __test_timeout(i915, engine, delays[i]);
+		igt_info("timeslice_duration_ms:%d, elapsed=%.3fms\n",
+			 delays[i], elapsed * 1e-6);
+
+		/*
+		 * We need to give a couple of jiffies slack for the scheduler timeouts
+		 * and then a little more slack for the overhead in submitting and
+		 * measuring. 50ms should cover all of our sins and be a useful
+		 * tolerance.
+		 */
+		igt_assert_f(elapsed / 1000 / 1000 < delays[i] + 50,
+			     "Timeslice exceeded request!\n");
+	}
+
+	igt_assert(enable_hangcheck(i915, true));
+	gem_quiescent_gpu(i915);
+	set_timeslice(engine, saved);
+}
+
+static void test_off(int i915, int engine)
+{
+	unsigned int class, inst;
+	unsigned int saved;
+	igt_spin_t *spin[2];
+	uint32_t ctx[2];
+
+	/*
+	 * As always, there are some who must run uninterrupted and simply do
+	 * not want to share the GPU even for a microsecond. Those greedy
+	 * clients can disable timeslicing entirely, and so set the timeslice
+	 * to 0. We test that a hog is not preempted within the 150s of
+	 * our boredom threshold.
+	 */
+
+	igt_require(igt_sysfs_printf(engine, "preempt_timeout_ms", "%u", 1) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "timeslice_duration_ms", "%u", &saved) == 1);
+	igt_debug("Initial timeslice_duration_ms:%u\n", saved);
+
+	gem_quiescent_gpu(i915);
+	igt_require(enable_hangcheck(i915, false));
+
+	igt_assert(igt_sysfs_scanf(engine, "class", "%u", &class) == 1);
+	igt_assert(igt_sysfs_scanf(engine, "instance", "%u", &inst) == 1);
+
+	set_timeslice(engine, 0);
+
+	ctx[0] = create_context(i915, class, inst, 0);
+	spin[0] = igt_spin_new(i915, ctx[0],
+			       .flags = (IGT_SPIN_NO_PREEMPTION |
+					 IGT_SPIN_POLL_RUN |
+					 IGT_SPIN_FENCE_OUT));
+	igt_spin_busywait_until_started(spin[0]);
+
+	ctx[1] = create_context(i915, class, inst, 0);
+	spin[1] = igt_spin_new(i915, ctx[1], .flags = IGT_SPIN_POLL_RUN);
+
+	for (int i = 0; i < 150; i++) {
+		igt_assert_eq(sync_fence_status(spin[0]->out_fence), 0);
+		sleep(1);
+	}
+
+	set_timeslice(engine, 1);
+
+	igt_spin_busywait_until_started(spin[1]);
+	igt_spin_free(i915, spin[1]);
+
+	igt_assert_eq(sync_fence_wait(spin[0]->out_fence, 1), 0);
+	igt_assert_eq(sync_fence_status(spin[0]->out_fence), -EIO);
+
+	igt_spin_free(i915, spin[0]);
+
+	gem_context_destroy(i915, ctx[1]);
+	gem_context_destroy(i915, ctx[0]);
+
+	igt_assert(enable_hangcheck(i915, true));
+	gem_quiescent_gpu(i915);
+
+	set_timeslice(engine, saved);
+}
+
+igt_main
+{
+	const struct intel_execution_engine2 *it;
+	int i915 = -1, engines = -1;
+
+	igt_fixture {
+		int sys;
+
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+		igt_allow_hang(i915, 0, 0);
+
+		sys = igt_sysfs_open(i915);
+		igt_require(sys != -1);
+
+		engines = openat(sys, "engine", O_RDONLY);
+		igt_require(engines != -1);
+
+		close(sys);
+	}
+
+	__for_each_static_engine(it) {
+		igt_subtest_group {
+			int engine = -1;
+			char *name = NULL;
+
+			igt_fixture {
+				struct stat st;
+
+				engine = openat(engines, it->name, O_RDONLY);
+				igt_require(fstatat(engine,
+							"timeslice_duration_ms",
+							&st, 0) == 0);
+
+				name = igt_sysfs_get(engine, "name");
+				igt_require(name);
+			}
+			if (!name)
+				name = strdup(it->name);
+
+			igt_subtest_f("%s-idempotent", name)
+				test_idempotent(i915, engine);
+			igt_subtest_f("%s-invalid", name)
+				test_invalid(i915, engine);
+			igt_subtest_f("%s-duration", name)
+				test_duration(i915, engine);
+			igt_subtest_f("%s-timeout", name)
+				test_timeout(i915, engine);
+			igt_subtest_f("%s-off", name)
+				test_off(i915, engine);
+
+			free(name);
+			close(engine);
+		}
+	}
+
+	igt_fixture {
+		close(engines);
+		close(i915);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index 8d0964fe0..b0c567594 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -241,6 +241,7 @@ i915_progs = [
 	'i915_suspend',
 	'sysfs_heartbeat_interval',
 	'sysfs_preempt_timeout',
+	'sysfs_timeslice_duration',
 ]
 
 test_deps = [ igt_deps ]
-- 
2.24.0

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

I915_CONTEXT_PARAM_RINGSIZE specifies the size of the command
ringbuffer created for logical ring contexts. This directly affects the
number of batches userspace can submit before it blocks waiting for space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
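Note for readers: the accepted and rejected values used by the idempotent
and invalid subtests suggest a simple rule for the parameter. The sketch
below is only an illustration of that rule and is not taken from the
kernel or from this patch. The helper name ringsize_plausible() and the
RING_PAGE/RING_MAX macros are hypothetical. It assumes the size must be a
multiple of 4KiB between 4KiB and 512KiB. The tests only set powers of two,
so whether other page multiples are accepted is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed limits, inferred from the values the subtests accept and reject. */
#define RING_PAGE 4096u
#define RING_MAX  (128u * RING_PAGE)

/* The tests describe the current upper limit as 512KiB. */
static_assert(RING_MAX == (512u << 10), "upper bound should equal 512KiB");

/*
 * Hypothetical helper: returns true if size is a page multiple within
 * [RING_PAGE, RING_MAX]. This mirrors what the subtests probe, not
 * necessarily what the driver enforces.
 */
static inline bool ringsize_plausible(uint64_t size)
{
	if (size < RING_PAGE || size > RING_MAX)
		return false;

	return (size % RING_PAGE) == 0;
}
```

Under these assumptions, every entry of the invalid[] table in
test_invalid() is rejected by the helper, and every value that
test_idempotent() sets is accepted.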
 tests/Makefile.sources        |   3 +
 tests/i915/gem_ctx_ringsize.c | 296 ++++++++++++++++++++++++++++++++++
 tests/meson.build             |   1 +
 3 files changed, 300 insertions(+)
 create mode 100644 tests/i915/gem_ctx_ringsize.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index e17d43155..801fc52f3 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -163,6 +163,9 @@ gem_ctx_param_SOURCES = i915/gem_ctx_param.c
 TESTS_progs += gem_ctx_persistence
 gem_ctx_persistence_SOURCES = i915/gem_ctx_persistence.c
 
+TESTS_progs += gem_ctx_ringsize
+gem_ctx_ringsize_SOURCES = i915/gem_ctx_ringsize.c
+
 TESTS_progs += gem_ctx_shared
 gem_ctx_shared_SOURCES = i915/gem_ctx_shared.c
 
diff --git a/tests/i915/gem_ctx_ringsize.c b/tests/i915/gem_ctx_ringsize.c
new file mode 100644
index 000000000..1450e8f0d
--- /dev/null
+++ b/tests/i915/gem_ctx_ringsize.c
@@ -0,0 +1,296 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "drmtest.h" /* gem_quiescent_gpu()! */
+#include "i915/gem_context.h"
+#include "i915/gem_engine_topology.h"
+#include "ioctl_wrappers.h" /* gem_wait()! */
+#include "sw_sync.h"
+
+#define I915_CONTEXT_PARAM_RINGSIZE 0xc
+
+static bool has_ringsize(int i915)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_RINGSIZE,
+	};
+
+	return __gem_context_get_param(i915, &p) == 0;
+}
+
+static void test_idempotent(int i915)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_RINGSIZE,
+	};
+	uint32_t saved;
+
+	/*
+	 * Simple test to verify that we are able to read back the same
+	 * value as we set.
+	 */
+
+	gem_context_get_param(i915, &p);
+	saved = p.value;
+
+	for (uint32_t x = 1 << 12; x <= 128 << 12; x <<= 1) {
+		p.value = x;
+		gem_context_set_param(i915, &p);
+		gem_context_get_param(i915, &p);
+		igt_assert_eq_u32(p.value, x);
+	}
+
+	p.value = saved;
+	gem_context_set_param(i915, &p);
+}
+
+static void test_invalid(int i915)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_RINGSIZE,
+	};
+	uint64_t invalid[] = {
+		0, 1, 4095, 4097, 8191, 8193,
+		/* upper limit may be HW dependent, atm it is 512KiB */
+		(512 << 10) - 1, (512 << 10) + 1,
+		-1, -1u
+	};
+	uint32_t saved;
+
+	gem_context_get_param(i915, &p);
+	saved = p.value;
+
+	for (int i = 0; i < ARRAY_SIZE(invalid); i++) {
+		p.value = invalid[i];
+		igt_assert_eq(__gem_context_set_param(i915, &p), -EINVAL);
+		gem_context_get_param(i915, &p);
+		igt_assert_eq_u64(p.value, saved);
+	}
+}
+
+static int create_ext_ioctl(int i915,
+			    struct drm_i915_gem_context_create_ext *arg)
+{
+	int err;
+
+	err = 0;
+	if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
+		err = -errno;
+		igt_assume(err);
+	}
+
+	errno = 0;
+	return err;
+}
+
+static void test_create(int i915)
+{
+	struct drm_i915_gem_context_create_ext_setparam p = {
+		.base = {
+			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+			.next_extension = 0, /* end of chain */
+		},
+		.param = {
+			.param = I915_CONTEXT_PARAM_RINGSIZE,
+			.value = 512 << 10,
+		}
+	};
+	struct drm_i915_gem_context_create_ext create = {
+		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+		.extensions = to_user_pointer(&p),
+	};
+
+	igt_assert_eq(create_ext_ioctl(i915, &create),  0);
+
+	p.param.ctx_id = create.ctx_id;
+	p.param.value = 0;
+	gem_context_get_param(i915, &p.param);
+	igt_assert_eq(p.param.value, 512 << 10);
+
+	gem_context_destroy(i915, create.ctx_id);
+}
+
+static void test_clone(int i915)
+{
+	struct drm_i915_gem_context_create_ext_setparam p = {
+		.base = {
+			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+			.next_extension = 0, /* end of chain */
+		},
+		.param = {
+			.param = I915_CONTEXT_PARAM_RINGSIZE,
+			.value = 512 << 10,
+		}
+	};
+	struct drm_i915_gem_context_create_ext create = {
+		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+		.extensions = to_user_pointer(&p),
+	};
+
+	igt_assert_eq(create_ext_ioctl(i915, &create),  0);
+
+	p.param.ctx_id = gem_context_clone(i915, create.ctx_id,
+					   I915_CONTEXT_CLONE_ENGINES, 0);
+	igt_assert_neq(p.param.ctx_id, create.ctx_id);
+	gem_context_destroy(i915, create.ctx_id);
+
+	p.param.value = 0;
+	gem_context_get_param(i915, &p.param);
+	igt_assert_eq(p.param.value, 512 << 10);
+
+	gem_context_destroy(i915, p.param.ctx_id);
+}
+
+static int __execbuf(int i915, struct drm_i915_gem_execbuffer2 *execbuf)
+{
+	int err;
+
+	err = 0;
+	if (ioctl(i915, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
+		err = -errno;
+
+	errno = 0;
+	return err;
+}
+
+static uint32_t __batch_create(int i915, uint32_t offset)
+{
+	const uint32_t bbe = 0xa << 23;
+	uint32_t handle;
+
+	handle = gem_create(i915, ALIGN(offset + sizeof(bbe), 4096));
+	gem_write(i915, handle, offset, &bbe, sizeof(bbe));
+
+	return handle;
+}
+
+static uint32_t batch_create(int i915)
+{
+	return __batch_create(i915, 0);
+}
+
+static unsigned int measure_inflight(int i915, unsigned int engine)
+{
+	IGT_CORK_FENCE(cork);
+	struct drm_i915_gem_exec_object2 obj = {
+		.handle = batch_create(i915)
+	};
+	struct drm_i915_gem_execbuffer2 execbuf = {
+		.buffers_ptr = to_user_pointer(&obj),
+		.buffer_count = 1,
+		.flags = engine | I915_EXEC_FENCE_IN,
+		.rsvd2 = igt_cork_plug(&cork, i915),
+	};
+	unsigned int count;
+
+	fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) | O_NONBLOCK);
+
+	gem_execbuf(i915, &execbuf);
+	for (count = 1; __execbuf(i915, &execbuf) == 0; count++)
+		;
+	close(execbuf.rsvd2);
+
+	fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) & ~O_NONBLOCK);
+
+	igt_cork_unplug(&cork);
+	gem_close(i915, obj.handle);
+
+	return count;
+}
+
+static void test_resize(int i915,
+			const struct intel_execution_engine2 *e,
+			unsigned int flags)
+#define IDLE (1 << 0)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_RINGSIZE,
+	};
+	unsigned int prev[2] = {};
+	uint32_t saved;
+
+	gem_context_get_param(i915, &p);
+	saved = p.value;
+
+	gem_quiescent_gpu(i915);
+	for (p.value = 1 << 12; p.value <= 128 << 12; p.value <<= 1) {
+		unsigned int count;
+
+		gem_context_set_param(i915, &p);
+
+		count = measure_inflight(i915, e->flags);
+		igt_info("%s: %llx -> %u\n", e->name, p.value, count);
+		igt_assert(count > 3 * (prev[1] - prev[0]) / 4 + prev[1]);
+		if (flags & IDLE)
+			gem_quiescent_gpu(i915);
+
+		prev[0] = prev[1];
+		prev[1] = count;
+	}
+	gem_quiescent_gpu(i915);
+
+	p.value = saved;
+	gem_context_set_param(i915, &p);
+}
+
+igt_main
+{
+	const struct intel_execution_engine2 *e;
+	int i915;
+
+	igt_fixture {
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+
+		igt_require(has_ringsize(i915));
+	}
+
+	igt_subtest("idempotent")
+		test_idempotent(i915);
+
+	igt_subtest("invalid")
+		test_invalid(i915);
+
+	igt_subtest("create")
+		test_create(i915);
+	igt_subtest("clone")
+		test_clone(i915);
+
+	__for_each_physical_engine(i915, e) {
+		igt_subtest_f("%s-idle", e->name)
+			test_resize(i915, e, IDLE);
+		igt_subtest_f("%s-active", e->name)
+			test_resize(i915, e, 0);
+	}
+
+	igt_fixture {
+		close(i915);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index b0c567594..9b7ca2423 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -123,6 +123,7 @@ i915_progs = [
 	'gem_ctx_isolation',
 	'gem_ctx_param',
 	'gem_ctx_persistence',
+	'gem_ctx_ringsize',
 	'gem_ctx_shared',
 	'gem_ctx_switch',
 	'gem_ctx_thrash',
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Intel-gfx] [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

I915_CONTEXT_PARAM_RINGSIZE specifies how large to create the command
ringbuffer for logical ring contects. This directly affects the number
of batches userspace can submit before blocking waiting for space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/Makefile.sources        |   3 +
 tests/i915/gem_ctx_ringsize.c | 296 ++++++++++++++++++++++++++++++++++
 tests/meson.build             |   1 +
 3 files changed, 300 insertions(+)
 create mode 100644 tests/i915/gem_ctx_ringsize.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index e17d43155..801fc52f3 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -163,6 +163,9 @@ gem_ctx_param_SOURCES = i915/gem_ctx_param.c
 TESTS_progs += gem_ctx_persistence
 gem_ctx_persistence_SOURCES = i915/gem_ctx_persistence.c
 
+TESTS_progs += gem_ctx_ringsize
+gem_ctx_ringsize_SOURCES = i915/gem_ctx_ringsize.c
+
 TESTS_progs += gem_ctx_shared
 gem_ctx_shared_SOURCES = i915/gem_ctx_shared.c
 
diff --git a/tests/i915/gem_ctx_ringsize.c b/tests/i915/gem_ctx_ringsize.c
new file mode 100644
index 000000000..1450e8f0d
--- /dev/null
+++ b/tests/i915/gem_ctx_ringsize.c
@@ -0,0 +1,296 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "drmtest.h" /* gem_quiescent_gpu()! */
+#include "i915/gem_context.h"
+#include "i915/gem_engine_topology.h"
+#include "ioctl_wrappers.h" /* gem_wait()! */
+#include "sw_sync.h"
+
+#define I915_CONTEXT_PARAM_RINGSIZE 0xc
+
+static bool has_ringsize(int i915)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_RINGSIZE,
+	};
+
+	return __gem_context_get_param(i915, &p) == 0;
+}
+
+static void test_idempotent(int i915)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_RINGSIZE,
+	};
+	uint32_t saved;
+
+	/*
+	 * Simple test to verify that we are able to read back the same
+	 * value as we set.
+	 */
+
+	gem_context_get_param(i915, &p);
+	saved = p.value;
+
+	for (uint32_t x = 1 << 12; x <= 128 << 12; x <<= 1) {
+		p.value = x;
+		gem_context_set_param(i915, &p);
+		gem_context_get_param(i915, &p);
+		igt_assert_eq_u32(p.value, x);
+	}
+
+	p.value = saved;
+	gem_context_set_param(i915, &p);
+}
+
+static void test_invalid(int i915)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_RINGSIZE,
+	};
+	uint64_t invalid[] = {
+		0, 1, 4095, 4097, 8191, 8193,
+		/* upper limit may be HW dependent, atm it is 512KiB */
+		(512 << 10) - 1, (512 << 10) + 1,
+		-1, -1u
+	};
+	uint32_t saved;
+
+	gem_context_get_param(i915, &p);
+	saved = p.value;
+
+	for (int i = 0; i < ARRAY_SIZE(invalid); i++) {
+		p.value = invalid[i];
+		igt_assert_eq(__gem_context_set_param(i915, &p), -EINVAL);
+		gem_context_get_param(i915, &p);
+		igt_assert_eq_u64(p.value, saved);
+	}
+}
+
+static int create_ext_ioctl(int i915,
+			    struct drm_i915_gem_context_create_ext *arg)
+{
+	int err;
+
+	err = 0;
+	if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
+		err = -errno;
+		igt_assume(err);
+	}
+
+	errno = 0;
+	return err;
+}
+
+static void test_create(int i915)
+{
+	struct drm_i915_gem_context_create_ext_setparam p = {
+		.base = {
+			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+			.next_extension = 0, /* end of chain */
+		},
+		.param = {
+			.param = I915_CONTEXT_PARAM_RINGSIZE,
+			.value = 512 << 10,
+		}
+	};
+	struct drm_i915_gem_context_create_ext create = {
+		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+		.extensions = to_user_pointer(&p),
+	};
+
+	igt_assert_eq(create_ext_ioctl(i915, &create),  0);
+
+	p.param.ctx_id = create.ctx_id;
+	p.param.value = 0;
+	gem_context_get_param(i915, &p.param);
+	igt_assert_eq(p.param.value, 512 << 10);
+
+	gem_context_destroy(i915, create.ctx_id);
+}
+
+static void test_clone(int i915)
+{
+	struct drm_i915_gem_context_create_ext_setparam p = {
+		.base = {
+			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+			.next_extension = 0, /* end of chain */
+		},
+		.param = {
+			.param = I915_CONTEXT_PARAM_RINGSIZE,
+			.value = 512 << 10,
+		}
+	};
+	struct drm_i915_gem_context_create_ext create = {
+		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+		.extensions = to_user_pointer(&p),
+	};
+
+	igt_assert_eq(create_ext_ioctl(i915, &create),  0);
+
+	p.param.ctx_id = gem_context_clone(i915, create.ctx_id,
+					   I915_CONTEXT_CLONE_ENGINES, 0);
+	igt_assert_neq(p.param.ctx_id, create.ctx_id);
+	gem_context_destroy(i915, create.ctx_id);
+
+	p.param.value = 0;
+	gem_context_get_param(i915, &p.param);
+	igt_assert_eq(p.param.value, 512 << 10);
+
+	gem_context_destroy(i915, p.param.ctx_id);
+}
+
+static int __execbuf(int i915, struct drm_i915_gem_execbuffer2 *execbuf)
+{
+	int err;
+
+	err = 0;
+	if (ioctl(i915, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
+		err = -errno;
+
+	errno = 0;
+	return err;
+}
+
+static uint32_t __batch_create(int i915, uint32_t offset)
+{
+	const uint32_t bbe = 0xa << 23;
+	uint32_t handle;
+
+	handle = gem_create(i915, ALIGN(offset + sizeof(bbe), 4096));
+	gem_write(i915, handle, offset, &bbe, sizeof(bbe));
+
+	return handle;
+}
+
+static uint32_t batch_create(int i915)
+{
+	return __batch_create(i915, 0);
+}
+
+static unsigned int measure_inflight(int i915, unsigned int engine)
+{
+	IGT_CORK_FENCE(cork);
+	struct drm_i915_gem_exec_object2 obj = {
+		.handle = batch_create(i915)
+	};
+	struct drm_i915_gem_execbuffer2 execbuf = {
+		.buffers_ptr = to_user_pointer(&obj),
+		.buffer_count = 1,
+		.flags = engine | I915_EXEC_FENCE_IN,
+		.rsvd2 = igt_cork_plug(&cork, i915),
+	};
+	unsigned int count;
+
+	fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) | O_NONBLOCK);
+
+	gem_execbuf(i915, &execbuf);
+	for (count = 1; __execbuf(i915, &execbuf) == 0; count++)
+		;
+	close(execbuf.rsvd2);
+
+	fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) & ~O_NONBLOCK);
+
+	igt_cork_unplug(&cork);
+	gem_close(i915, obj.handle);
+
+	return count;
+}
+
+static void test_resize(int i915,
+			const struct intel_execution_engine2 *e,
+			unsigned int flags)
+#define IDLE (1 << 0)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_RINGSIZE,
+	};
+	unsigned int prev[2] = {};
+	uint32_t saved;
+
+	gem_context_get_param(i915, &p);
+	saved = p.value;
+
+	gem_quiescent_gpu(i915);
+	for (p.value = 1 << 12; p.value <= 128 << 12; p.value <<= 1) {
+		unsigned int count;
+
+		gem_context_set_param(i915, &p);
+
+		count = measure_inflight(i915, e->flags);
+		igt_info("%s: %llx -> %d\n", e->name, p.value, count);
+		igt_assert(count > 3 * (prev[1] - prev[0]) / 4 + prev[1]);
+		if (flags & IDLE)
+			gem_quiescent_gpu(i915);
+
+		prev[0] = prev[1];
+		prev[1] = count;
+	}
+	gem_quiescent_gpu(i915);
+
+	p.value = saved;
+	gem_context_set_param(i915, &p);
+}
+
+igt_main
+{
+	const struct intel_execution_engine2 *e;
+	int i915;
+
+	igt_fixture {
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+
+		igt_require(has_ringsize(i915));
+	}
+
+	igt_subtest("idempotent")
+		test_idempotent(i915);
+
+	igt_subtest("invalid")
+		test_invalid(i915);
+
+	igt_subtest("create")
+		test_create(i915);
+	igt_subtest("clone")
+		test_clone(i915);
+
+	__for_each_physical_engine(i915, e) {
+		igt_subtest_f("%s-idle", e->name)
+			test_resize(i915, e, IDLE);
+		igt_subtest_f("%s-active", e->name)
+			test_resize(i915, e, 0);
+	}
+
+	igt_fixture {
+		close(i915);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index b0c567594..9b7ca2423 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -123,6 +123,7 @@ i915_progs = [
 	'gem_ctx_isolation',
 	'gem_ctx_param',
 	'gem_ctx_persistence',
+	'gem_ctx_ringsize',
 	'gem_ctx_shared',
 	'gem_ctx_switch',
 	'gem_ctx_thrash',
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [igt-dev] [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE
@ 2019-11-13 12:52   ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-13 12:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

I915_CONTEXT_PARAM_RINGSIZE specifies how large to create the command
ringbuffer for logical ring contects. This directly affects the number
of batches userspace can submit before blocking waiting for space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/Makefile.sources        |   3 +
 tests/i915/gem_ctx_ringsize.c | 296 ++++++++++++++++++++++++++++++++++
 tests/meson.build             |   1 +
 3 files changed, 300 insertions(+)
 create mode 100644 tests/i915/gem_ctx_ringsize.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index e17d43155..801fc52f3 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -163,6 +163,9 @@ gem_ctx_param_SOURCES = i915/gem_ctx_param.c
 TESTS_progs += gem_ctx_persistence
 gem_ctx_persistence_SOURCES = i915/gem_ctx_persistence.c
 
+TESTS_progs += gem_ctx_ringsize
+gem_ctx_ringsize_SOURCES = i915/gem_ctx_ringsize.c
+
 TESTS_progs += gem_ctx_shared
 gem_ctx_shared_SOURCES = i915/gem_ctx_shared.c
 
diff --git a/tests/i915/gem_ctx_ringsize.c b/tests/i915/gem_ctx_ringsize.c
new file mode 100644
index 000000000..1450e8f0d
--- /dev/null
+++ b/tests/i915/gem_ctx_ringsize.c
@@ -0,0 +1,296 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "drmtest.h" /* gem_quiescent_gpu()! */
+#include "i915/gem_context.h"
+#include "i915/gem_engine_topology.h"
+#include "ioctl_wrappers.h" /* gem_wait()! */
+#include "sw_sync.h"
+
+#define I915_CONTEXT_PARAM_RINGSIZE 0xc
+
+static bool has_ringsize(int i915)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_RINGSIZE,
+	};
+
+	return __gem_context_get_param(i915, &p) == 0;
+}
+
+static void test_idempotent(int i915)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_RINGSIZE,
+	};
+	uint32_t saved;
+
+	/*
+	 * Simple test to verify that we are able to read back the same
+	 * value as we set.
+	 */
+
+	gem_context_get_param(i915, &p);
+	saved = p.value;
+
+	for (uint32_t x = 1 << 12; x <= 128 << 12; x <<= 1) {
+		p.value = x;
+		gem_context_set_param(i915, &p);
+		gem_context_get_param(i915, &p);
+		igt_assert_eq_u32(p.value, x);
+	}
+
+	p.value = saved;
+	gem_context_set_param(i915, &p);
+}
+
+static void test_invalid(int i915)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_RINGSIZE,
+	};
+	uint64_t invalid[] = {
+		0, 1, 4095, 4097, 8191, 8193,
+		/* upper limit may be HW dependent, atm it is 512KiB */
+		(512 << 10) - 1, (512 << 10) + 1,
+		-1, -1u
+	};
+	uint32_t saved;
+
+	gem_context_get_param(i915, &p);
+	saved = p.value;
+
+	for (int i = 0; i < ARRAY_SIZE(invalid); i++) {
+		p.value = invalid[i];
+		igt_assert_eq(__gem_context_set_param(i915, &p), -EINVAL);
+		gem_context_get_param(i915, &p);
+		igt_assert_eq_u64(p.value, saved);
+	}
+}
+
+static int create_ext_ioctl(int i915,
+			    struct drm_i915_gem_context_create_ext *arg)
+{
+	int err;
+
+	err = 0;
+	if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
+		err = -errno;
+		igt_assume(err);
+	}
+
+	errno = 0;
+	return err;
+}
+
+static void test_create(int i915)
+{
+	struct drm_i915_gem_context_create_ext_setparam p = {
+		.base = {
+			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+			.next_extension = 0, /* end of chain */
+		},
+		.param = {
+			.param = I915_CONTEXT_PARAM_RINGSIZE,
+			.value = 512 << 10,
+		}
+	};
+	struct drm_i915_gem_context_create_ext create = {
+		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+		.extensions = to_user_pointer(&p),
+	};
+
+	igt_assert_eq(create_ext_ioctl(i915, &create),  0);
+
+	p.param.ctx_id = create.ctx_id;
+	p.param.value = 0;
+	gem_context_get_param(i915, &p.param);
+	igt_assert_eq(p.param.value, 512 << 10);
+
+	gem_context_destroy(i915, create.ctx_id);
+}
+
+static void test_clone(int i915)
+{
+	struct drm_i915_gem_context_create_ext_setparam p = {
+		.base = {
+			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+			.next_extension = 0, /* end of chain */
+		},
+		.param = {
+			.param = I915_CONTEXT_PARAM_RINGSIZE,
+			.value = 512 << 10,
+		}
+	};
+	struct drm_i915_gem_context_create_ext create = {
+		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+		.extensions = to_user_pointer(&p),
+	};
+
+	igt_assert_eq(create_ext_ioctl(i915, &create),  0);
+
+	p.param.ctx_id = gem_context_clone(i915, create.ctx_id,
+					   I915_CONTEXT_CLONE_ENGINES, 0);
+	igt_assert_neq(p.param.ctx_id, create.ctx_id);
+	gem_context_destroy(i915, create.ctx_id);
+
+	p.param.value = 0;
+	gem_context_get_param(i915, &p.param);
+	igt_assert_eq(p.param.value, 512 << 10);
+
+	gem_context_destroy(i915, p.param.ctx_id);
+}
+
+static int __execbuf(int i915, struct drm_i915_gem_execbuffer2 *execbuf)
+{
+	int err;
+
+	err = 0;
+	if (ioctl(i915, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
+		err = -errno;
+
+	errno = 0;
+	return err;
+}
+
+static uint32_t __batch_create(int i915, uint32_t offset)
+{
+	const uint32_t bbe = 0xa << 23;
+	uint32_t handle;
+
+	handle = gem_create(i915, ALIGN(offset + sizeof(bbe), 4096));
+	gem_write(i915, handle, offset, &bbe, sizeof(bbe));
+
+	return handle;
+}
+
+static uint32_t batch_create(int i915)
+{
+	return __batch_create(i915, 0);
+}
+
+static unsigned int measure_inflight(int i915, unsigned int engine)
+{
+	IGT_CORK_FENCE(cork);
+	struct drm_i915_gem_exec_object2 obj = {
+		.handle = batch_create(i915)
+	};
+	struct drm_i915_gem_execbuffer2 execbuf = {
+		.buffers_ptr = to_user_pointer(&obj),
+		.buffer_count = 1,
+		.flags = engine | I915_EXEC_FENCE_IN,
+		.rsvd2 = igt_cork_plug(&cork, i915),
+	};
+	unsigned int count;
+
+	fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) | O_NONBLOCK);
+
+	gem_execbuf(i915, &execbuf);
+	for (count = 1; __execbuf(i915, &execbuf) == 0; count++)
+		;
+	close(execbuf.rsvd2);
+
+	fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) & ~O_NONBLOCK);
+
+	igt_cork_unplug(&cork);
+	gem_close(i915, obj.handle);
+
+	return count;
+}
+
+static void test_resize(int i915,
+			const struct intel_execution_engine2 *e,
+			unsigned int flags)
+#define IDLE (1 << 0)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_RINGSIZE,
+	};
+	unsigned int prev[2] = {};
+	uint32_t saved;
+
+	gem_context_get_param(i915, &p);
+	saved = p.value;
+
+	gem_quiescent_gpu(i915);
+	for (p.value = 1 << 12; p.value <= 128 << 12; p.value <<= 1) {
+		unsigned int count;
+
+		gem_context_set_param(i915, &p);
+
+		count = measure_inflight(i915, e->flags);
+		igt_info("%s: %llx -> %d\n", e->name, p.value, count);
+		igt_assert(count > 3 * (prev[1] - prev[0]) / 4 + prev[1]);
+		if (flags & IDLE)
+			gem_quiescent_gpu(i915);
+
+		prev[0] = prev[1];
+		prev[1] = count;
+	}
+	gem_quiescent_gpu(i915);
+
+	p.value = saved;
+	gem_context_set_param(i915, &p);
+}
+
+igt_main
+{
+	const struct intel_execution_engine2 *e;
+	int i915;
+
+	igt_fixture {
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+
+		igt_require(has_ringsize(i915));
+	}
+
+	igt_subtest("idempotent")
+		test_idempotent(i915);
+
+	igt_subtest("invalid")
+		test_invalid(i915);
+
+	igt_subtest("create")
+		test_create(i915);
+	igt_subtest("clone")
+		test_clone(i915);
+
+	__for_each_physical_engine(i915, e) {
+		igt_subtest_f("%s-idle", e->name)
+			test_resize(i915, e, IDLE);
+		igt_subtest_f("%s-active", e->name)
+			test_resize(i915, e, 0);
+	}
+
+	igt_fixture {
+		close(i915);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index b0c567594..9b7ca2423 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -123,6 +123,7 @@ i915_progs = [
 	'gem_ctx_isolation',
 	'gem_ctx_param',
 	'gem_ctx_persistence',
+	'gem_ctx_ringsize',
 	'gem_ctx_shared',
 	'gem_ctx_switch',
 	'gem_ctx_thrash',
-- 
2.24.0

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

* [igt-dev] ✓ Fi.CI.BAT: success for series starting with [i-g-t,1/9] i915/gem_exec_schedule: Split pi-ringfull into two tests
  2019-11-13 12:52 ` [Intel-gfx] " Chris Wilson
                   ` (8 preceding siblings ...)
  (?)
@ 2019-11-13 14:30 ` Patchwork
  -1 siblings, 0 replies; 57+ messages in thread
From: Patchwork @ 2019-11-13 14:30 UTC (permalink / raw)
  To: Chris Wilson; +Cc: igt-dev

== Series Details ==

Series: series starting with [i-g-t,1/9] i915/gem_exec_schedule: Split pi-ringfull into two tests
URL   : https://patchwork.freedesktop.org/series/69401/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7330 -> IGTPW_3690
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/index.html

Known issues
------------

  Here are the changes found in IGTPW_3690 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_pm_rpm@module-reload:
    - fi-skl-6770hq:      [PASS][1] -> [FAIL][2] ([fdo#108511])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html

  * igt@kms_busy@basic-flip-pipe-b:
    - fi-skl-6770hq:      [PASS][3] -> [DMESG-WARN][4] ([fdo#105541])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/fi-skl-6770hq/igt@kms_busy@basic-flip-pipe-b.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/fi-skl-6770hq/igt@kms_busy@basic-flip-pipe-b.html

  * igt@kms_frontbuffer_tracking@basic:
    - fi-hsw-peppy:       [PASS][5] -> [DMESG-WARN][6] ([fdo#102614])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html

  * igt@kms_setmode@basic-clone-single-crtc:
    - fi-skl-6770hq:      [PASS][7] -> [WARN][8] ([fdo#112252])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/fi-skl-6770hq/igt@kms_setmode@basic-clone-single-crtc.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/fi-skl-6770hq/igt@kms_setmode@basic-clone-single-crtc.html

  
#### Possible fixes ####

  * igt@i915_module_load@reload-with-fault-injection:
    - {fi-kbl-7560u}:     [INCOMPLETE][9] ([fdo#109964] / [fdo#110343]) -> [PASS][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/fi-kbl-7560u/igt@i915_module_load@reload-with-fault-injection.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/fi-kbl-7560u/igt@i915_module_load@reload-with-fault-injection.html

  * igt@i915_pm_rpm@module-reload:
    - fi-skl-lmem:        [DMESG-WARN][11] ([fdo#112261]) -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/fi-skl-lmem/igt@i915_pm_rpm@module-reload.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/fi-skl-lmem/igt@i915_pm_rpm@module-reload.html

  * igt@i915_selftest@live_blt:
    - fi-hsw-peppy:       [DMESG-FAIL][13] ([fdo#112147]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/fi-hsw-peppy/igt@i915_selftest@live_blt.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/fi-hsw-peppy/igt@i915_selftest@live_blt.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [FAIL][15] ([fdo#111045] / [fdo#111096]) -> [PASS][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#102614]: https://bugs.freedesktop.org/show_bug.cgi?id=102614
  [fdo#105541]: https://bugs.freedesktop.org/show_bug.cgi?id=105541
  [fdo#108511]: https://bugs.freedesktop.org/show_bug.cgi?id=108511
  [fdo#109964]: https://bugs.freedesktop.org/show_bug.cgi?id=109964
  [fdo#110343]: https://bugs.freedesktop.org/show_bug.cgi?id=110343
  [fdo#111045]: https://bugs.freedesktop.org/show_bug.cgi?id=111045
  [fdo#111096]: https://bugs.freedesktop.org/show_bug.cgi?id=111096
  [fdo#111736]: https://bugs.freedesktop.org/show_bug.cgi?id=111736
  [fdo#112147]: https://bugs.freedesktop.org/show_bug.cgi?id=112147
  [fdo#112252]: https://bugs.freedesktop.org/show_bug.cgi?id=112252
  [fdo#112261]: https://bugs.freedesktop.org/show_bug.cgi?id=112261


Participating hosts (53 -> 46)
------------------------------

  Missing    (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_5276 -> IGTPW_3690

  CI-20190529: 20190529
  CI_DRM_7330: 693c0e2adcc5a92272746951802d99bdf446b354 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_3690: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/index.html
  IGT_5276: 868d38c2bc075b6756ebed486db6e7152ed2c5be @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools



== Testlist changes ==

+++ 136 lines
--- 0 lines

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/index.html

* [igt-dev] ✗ GitLab.Pipeline: warning for series starting with [i-g-t,1/9] i915/gem_exec_schedule: Split pi-ringfull into two tests
  2019-11-13 12:52 ` [Intel-gfx] " Chris Wilson
                   ` (9 preceding siblings ...)
  (?)
@ 2019-11-13 14:40 ` Patchwork
  -1 siblings, 0 replies; 57+ messages in thread
From: Patchwork @ 2019-11-13 14:40 UTC (permalink / raw)
  To: Chris Wilson; +Cc: igt-dev

== Series Details ==

Series: series starting with [i-g-t,1/9] i915/gem_exec_schedule: Split pi-ringfull into two tests
URL   : https://patchwork.freedesktop.org/series/69401/
State : warning

== Summary ==

Did not get list of undocumented tests for this run, something is wrong!

Other than that, pipeline status: FAILED.

see https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/pipelines/78657 for the overview.

== Logs ==

For more details see: https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/pipelines/78657

* [igt-dev] ✓ Fi.CI.IGT: success for series starting with [i-g-t,1/9] i915/gem_exec_schedule: Split pi-ringfull into two tests
  2019-11-13 12:52 ` [Intel-gfx] " Chris Wilson
                   ` (10 preceding siblings ...)
  (?)
@ 2019-11-14  2:10 ` Patchwork
  -1 siblings, 0 replies; 57+ messages in thread
From: Patchwork @ 2019-11-14  2:10 UTC (permalink / raw)
  To: Chris Wilson; +Cc: igt-dev

== Series Details ==

Series: series starting with [i-g-t,1/9] i915/gem_exec_schedule: Split pi-ringfull into two tests
URL   : https://patchwork.freedesktop.org/series/69401/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7330_full -> IGTPW_3690_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_3690_full:

### IGT changes ###

#### Possible regressions ####

  * {igt@gem_ctx_ringsize@idempotent} (NEW):
    - shard-tglb:         NOTRUN -> [SKIP][1] +78 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-tglb4/igt@gem_ctx_ringsize@idempotent.html

  * {igt@sysfs_timeslice_duration@vecs0-timeout} (NEW):
    - shard-iclb:         NOTRUN -> [SKIP][2] +72 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-iclb5/igt@sysfs_timeslice_duration@vecs0-timeout.html

  
New tests
---------

  New tests have been introduced between CI_DRM_7330_full and IGTPW_3690_full:

### New IGT tests (136) ###

  * igt@gem_ctx_ringsize@bcs0-active:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@bcs0-idle:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@clone:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@create:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@idempotent:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@invalid:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@rcs0-active:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@rcs0-idle:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@vcs0-active:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@vcs0-idle:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@vcs1-active:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@vcs1-idle:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@vcs2-active:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@vcs2-idle:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@vecs0-active:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_ringsize@vecs0-idle:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_schedule@pi-common-blt:
    - Statuses : 4 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.08] s

  * igt@gem_exec_schedule@pi-common-bsd:
    - Statuses : 3 pass(s) 4 skip(s)
    - Exec time: [0.0, 0.08] s

  * igt@gem_exec_schedule@pi-common-bsd1:
    - Statuses : 2 pass(s) 5 skip(s)
    - Exec time: [0.0, 0.08] s

  * igt@gem_exec_schedule@pi-common-bsd2:
    - Statuses : 2 pass(s) 5 skip(s)
    - Exec time: [0.0, 0.08] s

  * igt@gem_exec_schedule@pi-common-render:
    - Statuses : 3 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.08] s

  * igt@gem_exec_schedule@pi-common-vebox:
    - Statuses : 4 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.08] s

  * igt@gem_exec_schedule@pi-distinct-iova-blt:
    - Statuses : 5 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.02] s

  * igt@gem_exec_schedule@pi-distinct-iova-bsd:
    - Statuses : 3 pass(s) 4 skip(s)
    - Exec time: [0.0, 0.02] s

  * igt@gem_exec_schedule@pi-distinct-iova-bsd1:
    - Statuses : 2 pass(s) 4 skip(s)
    - Exec time: [0.0, 0.01] s

  * igt@gem_exec_schedule@pi-distinct-iova-bsd2:
    - Statuses : 2 pass(s) 5 skip(s)
    - Exec time: [0.0, 0.01] s

  * igt@gem_exec_schedule@pi-distinct-iova-render:
    - Statuses : 5 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.03] s

  * igt@gem_exec_schedule@pi-distinct-iova-vebox:
    - Statuses : 4 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.02] s

  * igt@gem_exec_schedule@pi-shared-iova-blt:
    - Statuses : 5 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.03] s

  * igt@gem_exec_schedule@pi-shared-iova-bsd:
    - Statuses : 3 pass(s) 3 skip(s)
    - Exec time: [0.0, 0.02] s

  * igt@gem_exec_schedule@pi-shared-iova-bsd1:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 0.01] s

  * igt@gem_exec_schedule@pi-shared-iova-bsd2:
    - Statuses : 2 pass(s) 1 skip(s)
    - Exec time: [0.0, 0.01] s

  * igt@gem_exec_schedule@pi-shared-iova-render:
    - Statuses : 5 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.03] s

  * igt@gem_exec_schedule@pi-shared-iova-vebox:
    - Statuses : 5 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.02] s

  * igt@gem_exec_schedule@pi-userfault-blt:
    - Statuses : 4 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.01] s

  * igt@gem_exec_schedule@pi-userfault-bsd:
    - Statuses : 3 pass(s) 4 skip(s)
    - Exec time: [0.0, 0.01] s

  * igt@gem_exec_schedule@pi-userfault-bsd1:
    - Statuses : 3 pass(s) 3 skip(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_schedule@pi-userfault-bsd2:
    - Statuses : 2 pass(s) 5 skip(s)
    - Exec time: [0.0, 0.00] s

  * igt@gem_exec_schedule@pi-userfault-render:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_exec_schedule@pi-userfault-vebox:
    - Statuses : 5 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.01] s

  * igt@sysfs_heartbeat_interval@bcs0-idempotent:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@bcs0-invalid:
    - Statuses : 3 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@bcs0-long:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@bcs0-mixed:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@bcs0-nopreempt:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@bcs0-off:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@bcs0-precise:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@rcs0-idempotent:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@rcs0-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@rcs0-long:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@rcs0-mixed:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@rcs0-nopreempt:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@rcs0-off:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@rcs0-precise:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs0-idempotent:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs0-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs0-long:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs0-mixed:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs0-nopreempt:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs0-off:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs0-precise:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs1-idempotent:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs1-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs1-long:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs1-mixed:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs1-nopreempt:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs1-off:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs1-precise:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs2-idempotent:
    - Statuses : 3 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs2-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs2-long:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs2-mixed:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs2-nopreempt:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs2-off:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vcs2-precise:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vecs0-idempotent:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vecs0-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vecs0-long:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vecs0-mixed:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vecs0-nopreempt:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vecs0-off:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_heartbeat_interval@vecs0-precise:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@bcs0-idempotent:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@bcs0-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@bcs0-off:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@bcs0-timeout:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@rcs0-idempotent:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@rcs0-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@rcs0-off:
    - Statuses :
    - Exec time: [None] s

  * igt@sysfs_preempt_timeout@rcs0-timeout:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vcs0-idempotent:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vcs0-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vcs0-off:
    - Statuses :
    - Exec time: [None] s

  * igt@sysfs_preempt_timeout@vcs0-timeout:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vcs1-idempotent:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vcs1-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vcs1-off:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vcs1-timeout:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vcs2-idempotent:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vcs2-invalid:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vcs2-off:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vcs2-timeout:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vecs0-idempotent:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vecs0-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vecs0-off:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_preempt_timeout@vecs0-timeout:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@bcs0-duration:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@bcs0-idempotent:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@bcs0-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@bcs0-off:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@bcs0-timeout:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@rcs0-duration:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@rcs0-idempotent:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@rcs0-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@rcs0-off:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@rcs0-timeout:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs0-duration:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs0-idempotent:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs0-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs0-off:
    - Statuses : 3 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs0-timeout:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs1-duration:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs1-idempotent:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs1-invalid:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs1-off:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs1-timeout:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs2-duration:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs2-idempotent:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs2-invalid:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs2-off:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vcs2-timeout:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vecs0-duration:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vecs0-idempotent:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vecs0-invalid:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vecs0-off:
    - Statuses : 3 skip(s)
    - Exec time: [0.0] s

  * igt@sysfs_timeslice_duration@vecs0-timeout:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  

Known issues
------------

  Here are the changes found in IGTPW_3690_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@vcs1-s3:
    - shard-tglb:         [PASS][3] -> [INCOMPLETE][4] ([fdo#111832])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-tglb3/igt@gem_ctx_isolation@vcs1-s3.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-tglb7/igt@gem_ctx_isolation@vcs1-s3.html

  * igt@gem_ctx_persistence@vcs1-mixed-process:
    - shard-iclb:         [PASS][5] -> [SKIP][6] ([fdo#109276] / [fdo#112080]) +1 similar issue
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-iclb2/igt@gem_ctx_persistence@vcs1-mixed-process.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-iclb5/igt@gem_ctx_persistence@vcs1-mixed-process.html

  * igt@gem_exec_create@forked:
    - shard-tglb:         [PASS][7] -> [INCOMPLETE][8] ([fdo#108838] / [fdo#111747])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-tglb7/igt@gem_exec_create@forked.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-tglb6/igt@gem_exec_create@forked.html

  * igt@gem_exec_schedule@preempt-other-chain-bsd:
    - shard-iclb:         [PASS][9] -> [SKIP][10] ([fdo#112146]) +2 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-iclb3/igt@gem_exec_schedule@preempt-other-chain-bsd.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-iclb1/igt@gem_exec_schedule@preempt-other-chain-bsd.html

  * igt@gem_userptr_blits@map-fixed-invalidate-busy-gup:
    - shard-snb:          [PASS][11] -> [DMESG-WARN][12] ([fdo#111870]) +1 similar issue
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-snb1/igt@gem_userptr_blits@map-fixed-invalidate-busy-gup.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-snb1/igt@gem_userptr_blits@map-fixed-invalidate-busy-gup.html

  * igt@i915_suspend@fence-restore-untiled:
    - shard-tglb:         [PASS][13] -> [INCOMPLETE][14] ([fdo#111832] / [fdo#111850])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-tglb9/igt@i915_suspend@fence-restore-untiled.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-tglb5/igt@i915_suspend@fence-restore-untiled.html

  * igt@kms_flip@2x-plain-flip-fb-recreate:
    - shard-hsw:          [PASS][15] -> [INCOMPLETE][16] ([fdo#103540])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-hsw4/igt@kms_flip@2x-plain-flip-fb-recreate.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-hsw7/igt@kms_flip@2x-plain-flip-fb-recreate.html

  * igt@kms_flip@flip-vs-suspend:
    - shard-iclb:         [PASS][17] -> [DMESG-WARN][18] ([fdo#111764])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-iclb3/igt@kms_flip@flip-vs-suspend.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-iclb2/igt@kms_flip@flip-vs-suspend.html

  * igt@kms_flip@flip-vs-suspend-interruptible:
    - shard-kbl:          [PASS][19] -> [INCOMPLETE][20] ([fdo#103665])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-kbl6/igt@kms_flip@flip-vs-suspend-interruptible.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-kbl1/igt@kms_flip@flip-vs-suspend-interruptible.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-shrfb-plflip-blt:
    - shard-iclb:         [PASS][21] -> [FAIL][22] ([fdo#103167]) +5 similar issues
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-iclb6/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-shrfb-plflip-blt.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-iclb8/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-shrfb-plflip-blt.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - shard-kbl:          [PASS][23] -> [DMESG-WARN][24] ([fdo#108566]) +4 similar issues
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-kbl3/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-kbl1/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes:
    - shard-apl:          [PASS][25] -> [DMESG-WARN][26] ([fdo#108566])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-apl8/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-apl4/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes.html

  * igt@kms_psr@psr2_dpms:
    - shard-iclb:         [PASS][27] -> [SKIP][28] ([fdo#109441])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-iclb2/igt@kms_psr@psr2_dpms.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-iclb4/igt@kms_psr@psr2_dpms.html

  * igt@perf_pmu@busy-no-semaphores-vcs1:
    - shard-iclb:         [PASS][29] -> [SKIP][30] ([fdo#112080]) +11 similar issues
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7330/shard-iclb4/igt@perf_pmu@busy-no-semaphores-vcs1.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/shard-iclb6/igt@perf_pmu@busy-no-semaphores-vcs1.html

  * igt@prime_busy@hang-bsd2:
    - shard-iclb:

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3690/index.html

* Re: [igt-dev] [PATCH i-g-t 4/9] i915: Start putting the mmio_base to wider use
@ 2019-11-21 12:04     ` Lionel Landwerlin
  0 siblings, 0 replies; 57+ messages in thread
From: Lionel Landwerlin @ 2019-11-21 12:04 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev

On 13/11/2019 14:52, Chris Wilson wrote:
> Several tests depend upon the implicit engine->mmio_base but have no
> means of determining the physical layout. Since the kernel has started
> providing this information, start putting it to use.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   lib/i915/gem_engine_topology.c | 84 ++++++++++++++++++++++++++++++++++
>   lib/i915/gem_engine_topology.h |  5 ++
>   tests/i915/gem_ctx_shared.c    | 38 +++++----------
>   tests/i915/gem_exec_latency.c  | 17 ++++---
>   4 files changed, 111 insertions(+), 33 deletions(-)
>
> diff --git a/lib/i915/gem_engine_topology.c b/lib/i915/gem_engine_topology.c
> index 790d455ff..bd200a4b9 100644
> --- a/lib/i915/gem_engine_topology.c
> +++ b/lib/i915/gem_engine_topology.c
> @@ -21,7 +21,12 @@
>    * IN THE SOFTWARE.
>    */
>   
> +#include <fcntl.h>
> +#include <unistd.h>
> +
>   #include "drmtest.h"
> +#include "igt_sysfs.h"
> +#include "intel_chipset.h"
>   #include "ioctl_wrappers.h"
>   
>   #include "i915/gem_engine_topology.h"
> @@ -337,3 +342,82 @@ bool gem_engine_is_equal(const struct intel_execution_engine2 *e1,
>   {
>   	return e1->class == e2->class && e1->instance == e2->instance;
>   }
> +
> +static int descend(int dir, const char *path)
> +{
> +	int fd;
> +
> +	fd = openat(dir, path, O_RDONLY);
> +	close(dir);
> +
> +	return fd;
> +}
> +


I'm not sure which file the function below is supposed to parse.
Is it /sys/kernel/debug/dri/0/i915_engine_info?

It probably doesn't work on my system, as I get an mmio_base of 0 for vcs1.

-Lionel


> +int gem_engine_property_scanf(int i915, const char *engine, const char *attr,
> +			      const char *fmt, ...)
> +{
> +	FILE *file;
> +	va_list ap;
> +	int ret;
> +	int fd;
> +
> +	fd = igt_sysfs_open(i915);
> +	if (fd < 0)
> +		return fd;
> +
> +	fd = descend(fd, "engine");
> +	if (fd < 0)
> +		return fd;
> +
> +	fd = descend(fd, engine);
> +	if (fd < 0)
> +		return fd;
> +
> +	fd = descend(fd, attr);
> +	if (fd < 0)
> +		return fd;
> +
> +	file = fdopen(fd, "r");
> +	if (!file) {
> +		close(fd);
> +		return -1;
> +	}
> +
> +	va_start(ap, fmt);
> +	ret = vfscanf(file, fmt, ap);
> +	va_end(ap);
> +
> +	fclose(file);
> +	return ret;
> +}
> +
> +uint32_t gem_engine_mmio_base(int i915, const char *engine)
> +{
> +	unsigned int mmio = 0;
> +
> +	if (gem_engine_property_scanf(i915, engine, "mmio_base",
> +				      "%x", &mmio) < 0) {
> +		int gen = intel_gen(intel_get_drm_devid(i915));
> +
> +		/* The layout of xcs1+ is unreliable -- hence the property! */
> +		if (!strcmp(engine, "rcs0")) {
> +			mmio = 0x2000;
> +		} else if (!strcmp(engine, "bcs0")) {
> +			mmio = 0x22000;
> +		} else if (!strcmp(engine, "vcs0")) {
> +			if (gen < 6)
> +				mmio = 0x4000;
> +			else if (gen < 11)
> +				mmio = 0x12000;
> +			else
> +				mmio = 0x1c0000;
> +		} else if (!strcmp(engine, "vecs0")) {
> +			if (gen < 11)
> +				mmio = 0x1a000;
> +			else
> +				mmio = 0x1c8000;
> +		}
> +	}
> +
> +	return mmio;
> +}
> diff --git a/lib/i915/gem_engine_topology.h b/lib/i915/gem_engine_topology.h
> index d98773e06..e728ebd93 100644
> --- a/lib/i915/gem_engine_topology.h
> +++ b/lib/i915/gem_engine_topology.h
> @@ -74,4 +74,9 @@ struct intel_execution_engine2 gem_eb_flags_to_engine(unsigned int flags);
>   	     ((e__) = intel_get_current_physical_engine(&i__)); \
>   	     intel_next_engine(&i__))
>   
> +__attribute__((format(scanf, 4, 5)))
> +int gem_engine_property_scanf(int i915, const char *engine, const char *attr,
> +			      const char *fmt, ...);
> +uint32_t gem_engine_mmio_base(int i915, const char *engine);
> +
>   #endif /* GEM_ENGINE_TOPOLOGY_H */
> diff --git a/tests/i915/gem_ctx_shared.c b/tests/i915/gem_ctx_shared.c
> index a6eee16dd..949e1f3d4 100644
> --- a/tests/i915/gem_ctx_shared.c
> +++ b/tests/i915/gem_ctx_shared.c
> @@ -38,6 +38,7 @@
>   
>   #include <drm.h>
>   
> +#include "i915/gem_engine_topology.h"
>   #include "igt_rand.h"
>   #include "igt_vgem.h"
>   #include "sync_file.h"
> @@ -556,6 +557,14 @@ static uint32_t store_timestamp(int i915,
>   	return obj.handle;
>   }
>   
> +static uint32_t ring_base(int i915, unsigned ring)
> +{
> +	if (ring == I915_EXEC_DEFAULT)
> +		ring = I915_EXEC_RENDER; /* XXX */
> +
> +	return gem_engine_mmio_base(i915, gem_eb_flags_to_engine(ring).name);
> +}
> +
>   static void independent(int i915, unsigned ring, unsigned flags)
>   {
>   	const int TIMESTAMP = 1023;
> @@ -563,33 +572,8 @@ static void independent(int i915, unsigned ring, unsigned flags)
>   	igt_spin_t *spin[MAX_ELSP_QLEN];
>   	unsigned int mmio_base;
>   
> -	/* XXX i915_query()! */
> -	switch (ring) {
> -	case I915_EXEC_DEFAULT:
> -	case I915_EXEC_RENDER:
> -		mmio_base = 0x2000;
> -		break;
> -#if 0
> -	case I915_EXEC_BSD:
> -		mmio_base = 0x12000;
> -		break;
> -#endif
> -	case I915_EXEC_BLT:
> -		mmio_base = 0x22000;
> -		break;
> -
> -#define GEN11_VECS0_BASE 0x1c8000
> -#define GEN11_VECS1_BASE 0x1d8000
> -	case I915_EXEC_VEBOX:
> -		if (intel_gen(intel_get_drm_devid(i915)) >= 11)
> -			mmio_base = GEN11_VECS0_BASE;
> -		else
> -			mmio_base = 0x1a000;
> -		break;
> -
> -	default:
> -		igt_skip("mmio base not known\n");
> -	}
> +	mmio_base = ring_base(i915, ring);
> +	igt_require_f(mmio_base, "mmio base not known\n");
>   
>   	for (int n = 0; n < ARRAY_SIZE(spin); n++) {
>   		const struct igt_spin_factory opts = {
> diff --git a/tests/i915/gem_exec_latency.c b/tests/i915/gem_exec_latency.c
> index 3d99182a0..d2159f317 100644
> --- a/tests/i915/gem_exec_latency.c
> +++ b/tests/i915/gem_exec_latency.c
> @@ -109,7 +109,7 @@ poll_ring(int fd, unsigned ring, const char *name)
>   	igt_spin_free(fd, spin[0]);
>   }
>   
> -#define RCS_TIMESTAMP (0x2000 + 0x358)
> +#define TIMESTAMP (0x358)
>   static void latency_on_ring(int fd,
>   			    unsigned ring, const char *name,
>   			    unsigned flags)
> @@ -119,6 +119,7 @@ static void latency_on_ring(int fd,
>   	struct drm_i915_gem_exec_object2 obj[3];
>   	struct drm_i915_gem_relocation_entry reloc;
>   	struct drm_i915_gem_execbuffer2 execbuf;
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, name);
>   	igt_spin_t *spin = NULL;
>   	IGT_CORK_HANDLE(c);
>   	volatile uint32_t *reg;
> @@ -128,7 +129,8 @@ static void latency_on_ring(int fd,
>   	double gpu_latency;
>   	int i, j;
>   
> -	reg = (volatile uint32_t *)((volatile char *)igt_global_mmio + RCS_TIMESTAMP);
> +	igt_require(mmio_base);
> +	reg = (volatile uint32_t *)((volatile char *)igt_global_mmio + mmio_base + TIMESTAMP);
>   
>   	memset(&execbuf, 0, sizeof(execbuf));
>   	execbuf.buffers_ptr = to_user_pointer(&obj[1]);
> @@ -176,7 +178,7 @@ static void latency_on_ring(int fd,
>   		map[i++] = 0x24 << 23 | 1;
>   		if (has_64bit_reloc)
>   			map[i-1]++;
> -		map[i++] = RCS_TIMESTAMP; /* ring local! */
> +		map[i++] = mmio_base + TIMESTAMP;
>   		map[i++] = offset;
>   		if (has_64bit_reloc)
>   			map[i++] = offset >> 32;
> @@ -266,11 +268,14 @@ static void latency_from_ring(int fd,
>   	struct drm_i915_gem_exec_object2 obj[3];
>   	struct drm_i915_gem_relocation_entry reloc;
>   	struct drm_i915_gem_execbuffer2 execbuf;
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, name);
>   	const unsigned int repeats = ring_size / 2;
>   	uint32_t *map, *results;
>   	uint32_t ctx[2] = {};
>   	int i, j;
>   
> +	igt_require(mmio_base);
> +
>   	if (flags & PREEMPT) {
>   		ctx[0] = gem_context_create(fd);
>   		gem_context_set_priority(fd, ctx[0], -1023);
> @@ -351,7 +356,7 @@ static void latency_from_ring(int fd,
>   			map[i++] = 0x24 << 23 | 1;
>   			if (has_64bit_reloc)
>   				map[i-1]++;
> -			map[i++] = RCS_TIMESTAMP; /* ring local! */
> +			map[i++] = mmio_base + TIMESTAMP;
>   			map[i++] = offset;
>   			if (has_64bit_reloc)
>   				map[i++] = offset >> 32;
> @@ -376,7 +381,7 @@ static void latency_from_ring(int fd,
>   			map[i++] = 0x24 << 23 | 1;
>   			if (has_64bit_reloc)
>   				map[i-1]++;
> -			map[i++] = RCS_TIMESTAMP; /* ring local! */
> +			map[i++] = mmio_base + TIMESTAMP;
>   			map[i++] = offset;
>   			if (has_64bit_reloc)
>   				map[i++] = offset >> 32;
> @@ -669,7 +674,7 @@ igt_main
>   			ring_size = 1024;
>   
>   		intel_register_access_init(&mmio_data, intel_get_pci_device(), false, device);
> -		rcs_clock = clockrate(device, RCS_TIMESTAMP);
> +		rcs_clock = clockrate(device, 0x2000 + TIMESTAMP);
>   		igt_info("RCS timestamp clock: %.0fKHz, %.1fns\n",
>   			 rcs_clock / 1e3, 1e9 / rcs_clock);
>   		rcs_clock = 1e9 / rcs_clock;
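
The return convention of gem_engine_property_scanf() decides when the fallback table is used, so it may help to see it in isolation. Below is a minimal sketch, assuming a plain path instead of the igt_sysfs_open()/openat() walk in the patch; property_scanf and its path argument are hypothetical and not part of IGT.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: scan one attribute file with a scanf-style format.
 * Returns -1 if the file cannot be opened, otherwise the vfscanf() result
 * (number of successful conversions, or EOF if input ends first).
 */
__attribute__((format(scanf, 2, 3)))
static int property_scanf(const char *path, const char *fmt, ...);

static int property_scanf(const char *path, const char *fmt, ...)
{
	FILE *file;
	va_list ap;
	int ret;

	file = fopen(path, "r");
	if (!file)
		return -1;

	va_start(ap, fmt);
	ret = vfscanf(file, fmt, ap);
	va_end(ap);

	fclose(file);
	return ret;
}
```

A caller that tests only for a negative result, as the quoted gem_engine_mmio_base() does, falls back when the file is missing or empty. A file that opens but does not match the format returns 0 and leaves the output variable untouched.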


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 4/9] i915: Start putting the mmio_base to wider use
@ 2019-11-21 12:11       ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-21 12:11 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx; +Cc: igt-dev

Quoting Lionel Landwerlin (2019-11-21 12:04:42)
> On 13/11/2019 14:52, Chris Wilson wrote:
> > Several tests depend upon the implicit engine->mmio_base but have no
> > means of determining the physical layout. Since the kernel has started
> > providing this information, start putting it to use.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   lib/i915/gem_engine_topology.c | 84 ++++++++++++++++++++++++++++++++++
> >   lib/i915/gem_engine_topology.h |  5 ++
> >   tests/i915/gem_ctx_shared.c    | 38 +++++----------
> >   tests/i915/gem_exec_latency.c  | 17 ++++---
> >   4 files changed, 111 insertions(+), 33 deletions(-)
> >
> > diff --git a/lib/i915/gem_engine_topology.c b/lib/i915/gem_engine_topology.c
> > index 790d455ff..bd200a4b9 100644
> > --- a/lib/i915/gem_engine_topology.c
> > +++ b/lib/i915/gem_engine_topology.c
> > @@ -21,7 +21,12 @@
> >    * IN THE SOFTWARE.
> >    */
> >   
> > +#include <fcntl.h>
> > +#include <unistd.h>
> > +
> >   #include "drmtest.h"
> > +#include "igt_sysfs.h"
> > +#include "intel_chipset.h"
> >   #include "ioctl_wrappers.h"
> >   
> >   #include "i915/gem_engine_topology.h"
> > @@ -337,3 +342,82 @@ bool gem_engine_is_equal(const struct intel_execution_engine2 *e1,
> >   {
> >       return e1->class == e2->class && e1->instance == e2->instance;
> >   }
> > +
> > +static int descend(int dir, const char *path)
> > +{
> > +     int fd;
> > +
> > +     fd = openat(dir, path, O_RDONLY);
> > +     close(dir);
> > +
> > +     return fd;
> > +}
> > +
> 
> 
> Not sure I understand what file the function below is supposed to parse.
> 
> Is that /sys/kernel/debug/dri/0/i915_engine_info?

/sys/class/drm/card0/engine/*/mmio_base
-Chris
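
To check by hand whether a given kernel exposes that attribute, a small standalone reader is enough. This is only a sketch: dump_engine_mmio_bases is a hypothetical name, the glob pattern is passed in by the caller (card0 in the path above is whatever card the system uses), and the value is parsed with "%x" as in the patch.

```c
#include <glob.h>
#include <stdio.h>

/*
 * Print the mmio_base found in every file matching 'pattern', for example
 * the sysfs path quoted above with a wildcard for the engine name.
 * Returns the number of attributes parsed, or -1 if nothing matched.
 */
static int dump_engine_mmio_bases(const char *pattern)
{
	glob_t g;
	int count = 0;

	if (glob(pattern, 0, NULL, &g) != 0)
		return -1;

	for (size_t i = 0; i < g.gl_pathc; i++) {
		unsigned int mmio;
		FILE *file = fopen(g.gl_pathv[i], "r");

		if (!file)
			continue;

		if (fscanf(file, "%x", &mmio) == 1) {
			printf("%s: 0x%x\n", g.gl_pathv[i], mmio);
			count++;
		}
		fclose(file);
	}

	globfree(&g);
	return count;
}
```

A return of -1 for "/sys/class/drm/card0/engine/*/mmio_base" would suggest the running kernel does not provide the property, in which case the library falls back to its built-in table.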

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 4/9] i915: Start putting the mmio_base to wider use
@ 2019-11-21 13:11         ` Lionel Landwerlin
  0 siblings, 0 replies; 57+ messages in thread
From: Lionel Landwerlin @ 2019-11-21 13:11 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev

On 21/11/2019 14:11, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-11-21 12:04:42)
>> On 13/11/2019 14:52, Chris Wilson wrote:
>>> Several tests depend upon the implicit engine->mmio_base but have no
>>> means of determining the physical layout. Since the kernel has started
>>> providing this information, start putting it to use.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    lib/i915/gem_engine_topology.c | 84 ++++++++++++++++++++++++++++++++++
>>>    lib/i915/gem_engine_topology.h |  5 ++
>>>    tests/i915/gem_ctx_shared.c    | 38 +++++----------
>>>    tests/i915/gem_exec_latency.c  | 17 ++++---
>>>    4 files changed, 111 insertions(+), 33 deletions(-)
>>>
>>> diff --git a/lib/i915/gem_engine_topology.c b/lib/i915/gem_engine_topology.c
>>> index 790d455ff..bd200a4b9 100644
>>> --- a/lib/i915/gem_engine_topology.c
>>> +++ b/lib/i915/gem_engine_topology.c
>>> @@ -21,7 +21,12 @@
>>>     * IN THE SOFTWARE.
>>>     */
>>>    
>>> +#include <fcntl.h>
>>> +#include <unistd.h>
>>> +
>>>    #include "drmtest.h"
>>> +#include "igt_sysfs.h"
>>> +#include "intel_chipset.h"
>>>    #include "ioctl_wrappers.h"
>>>    
>>>    #include "i915/gem_engine_topology.h"
>>> @@ -337,3 +342,82 @@ bool gem_engine_is_equal(const struct intel_execution_engine2 *e1,
>>>    {
>>>        return e1->class == e2->class && e1->instance == e2->instance;
>>>    }
>>> +
>>> +static int descend(int dir, const char *path)
>>> +{
>>> +     int fd;
>>> +
>>> +     fd = openat(dir, path, O_RDONLY);
>>> +     close(dir);
>>> +
>>> +     return fd;
>>> +}
>>> +
>>
>> Not sure I understand what file the function below is supposed to parse.
>>
>> Is that /sys/kernel/debug/dri/0/i915_engine_info?
> /sys/class/drm/card0/engine/*/mmio_base
> -Chris

But that's not in drm-tip, right?

-Lionel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine relative registers
@ 2019-11-21 21:07     ` Tang, CQ
  0 siblings, 0 replies; 57+ messages in thread
From: Tang, CQ @ 2019-11-21 21:07 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev



> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> Chris Wilson
> Sent: Wednesday, November 13, 2019 4:53 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: igt-dev@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine
> relative registers
> 
> Some of the non-privileged registers are at the same offset on each engine.
> We can improve our coverage for unknown HW layout by using the reported
> engine->mmio_base for relative offsets.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  tests/i915/gem_ctx_isolation.c | 164 ++++++++++++++++++++-------------
>  1 file changed, 100 insertions(+), 64 deletions(-)
> 
> diff --git a/tests/i915/gem_ctx_isolation.c b/tests/i915/gem_ctx_isolation.c
> index 6aa27133c..546ffac3a 100644
> --- a/tests/i915/gem_ctx_isolation.c
> +++ b/tests/i915/gem_ctx_isolation.c
> @@ -70,6 +70,7 @@ static const struct named_register {
>  	uint32_t ignore_bits;
>  	uint32_t write_mask; /* some registers bits do not exist */
>  	bool masked;
> +	bool relative;
>  } nonpriv_registers[] = {
>  	{ "NOPID", NOCTX, RCS0, 0x2094 },
>  	{ "MI_PREDICATE_RESULT_2", NOCTX, RCS0, 0x23bc },
> @@ -109,7 +110,6 @@ static const struct named_register {
>  	{ "PS_DEPTH_COUNT_1", GEN8, RCS0, 0x22f8, 2 },
>  	{ "BB_OFFSET", GEN8, RCS0, 0x2158, .ignore_bits = 0x7 },
>  	{ "MI_PREDICATE_RESULT_1", GEN8, RCS0, 0x241c },
> -	{ "CS_GPR", GEN8, RCS0, 0x2600, 32 },
>  	{ "OA_CTX_CONTROL", GEN8, RCS0, 0x2360 },
>  	{ "OACTXID", GEN8, RCS0, 0x2364 },
>  	{ "PS_INVOCATION_COUNT_2", GEN8, RCS0, 0x2448, 2, .write_mask = ~0x3 },
> @@ -138,79 +138,56 @@ static const struct named_register {
> 
>  	{ "CTX_PREEMPT", NOCTX /* GEN10 */, RCS0, 0x2248 },
>  	{ "CS_CHICKEN1", GEN11, RCS0, 0x2580, .masked = true },
> -	{ "HDC_CHICKEN1", GEN_RANGE(10, 10), RCS0, 0x7304, .masked = true },
> 
>  	/* Privileged (enabled by w/a + FORCE_TO_NONPRIV) */
>  	{ "CTX_PREEMPT", NOCTX /* GEN9 */, RCS0, 0x2248 },
>  	{ "CS_CHICKEN1", GEN_RANGE(9, 10), RCS0, 0x2580, .masked = true },
>  	{ "COMMON_SLICE_CHICKEN2", GEN_RANGE(9, 9), RCS0, 0x7014, .masked = true },
> -	{ "HDC_CHICKEN1", GEN_RANGE(9, 9), RCS0, 0x7304, .masked = true },
> +	{ "HDC_CHICKEN1", GEN_RANGE(9, 10), RCS0, 0x7304, .masked = true },
>  	{ "SLICE_COMMON_ECO_CHICKEN1", GEN_RANGE(11, 11) /* + glk */, RCS0,  0x731c, .masked = true },
>  	{ "L3SQREG4", NOCTX /* GEN9:skl,kbl */, RCS0, 0xb118, .write_mask = ~0x1ffff0 },
>  	{ "HALF_SLICE_CHICKEN7", GEN_RANGE(11, 11), RCS0, 0xe194, .masked = true },
>  	{ "SAMPLER_MODE", GEN_RANGE(11, 11), RCS0, 0xe18c, .masked = true },
> 
> -	{ "BCS_GPR", GEN9, BCS0, 0x22600, 32 },
>  	{ "BCS_SWCTRL", GEN8, BCS0, 0x22200, .write_mask = 0x3, .masked = true },
> 
>  	{ "MFC_VDBOX1", NOCTX, VCS0, 0x12800, 64 },
>  	{ "MFC_VDBOX2", NOCTX, VCS1, 0x1c800, 64 },
> 
> -	{ "VCS0_GPR", GEN_RANGE(9, 10), VCS0, 0x12600, 32 },
> -	{ "VCS1_GPR", GEN_RANGE(9, 10), VCS1, 0x1c600, 32 },
> -	{ "VECS_GPR", GEN_RANGE(9, 10), VECS0, 0x1a600, 32 },
> -
> -	{ "VCS0_GPR", GEN11, VCS0, 0x1c0600, 32 },
> -	{ "VCS1_GPR", GEN11, VCS1, 0x1c4600, 32 },
> -	{ "VCS2_GPR", GEN11, VCS2, 0x1d0600, 32 },
> -	{ "VCS3_GPR", GEN11, VCS3, 0x1d4600, 32 },
> -	{ "VECS_GPR", GEN11, VECS0, 0x1c8600, 32 },
> +	{ "xCS_GPR", GEN9, ALL, 0x600, 32, .relative = true },
> 
>  	{}
>  }, ignore_registers[] = {
>  	{ "RCS timestamp", GEN6, ~0u, 0x2358 },
>  	{ "BCS timestamp", GEN7, ~0u, 0x22358 },
> 
> -	{ "VCS0 timestamp", GEN_RANGE(7, 10), ~0u, 0x12358 },
> -	{ "VCS1 timestamp", GEN_RANGE(7, 10), ~0u, 0x1c358 },
> -	{ "VECS timestamp", GEN_RANGE(8, 10), ~0u, 0x1a358 },
> -
> -	{ "VCS0 timestamp", GEN11, ~0u, 0x1c0358 },
> -	{ "VCS1 timestamp", GEN11, ~0u, 0x1c4358 },
> -	{ "VCS2 timestamp", GEN11, ~0u, 0x1d0358 },
> -	{ "VCS3 timestamp", GEN11, ~0u, 0x1d4358 },
> -	{ "VECS timestamp", GEN11, ~0u, 0x1c8358 },
> +	{ "xCS timestamp", GEN8, ALL, 0x358, .relative = true },
> 
>  	/* huc read only */
> -	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x2000 },
> -	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x2014 },
> -	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x23b0 },
> -
> -	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x2000 },
> -	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x2014 },
> -	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x23b0 },
> -
> -	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x2000 },
> -	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x2014 },
> -	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x23b0 },
> -
> -	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x2000 },
> -	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x2014 },
> -	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x23b0 },
> +	{ "BSD 0x2000", GEN11, ALL, 0x2000, .relative = true },
> +	{ "BSD 0x2014", GEN11, ALL, 0x2014, .relative = true },
> +	{ "BSD 0x23b0", GEN11, ALL, 0x23b0, .relative = true },
> 
>  	{}
>  };
> 
> -static const char *register_name(uint32_t offset, char *buf, size_t len)
> +static const char *
> +register_name(uint32_t offset, uint32_t mmio_base, char *buf, size_t len)
>  {
>  	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
>  		unsigned int width = r->count ? 4*r->count : 4;
> -		if (offset >= r->offset && offset < r->offset + width) {
> +		uint32_t base;
> +
> +		base = r->offset;
> +		if (r->relative)
> +			base += mmio_base;
> +
> +		if (offset >= base && offset < base + width) {
>  			if (r->count <= 1)
>  				return r->name;
> 
>  			snprintf(buf, len, "%s[%d]",
> -				 r->name, (offset - r->offset)/4);
> +				 r->name, (offset - base) / 4);
>  			return buf;
>  		}
>  	}
> @@ -218,22 +195,35 @@ static const char *register_name(uint32_t offset, char *buf, size_t len)
>  	return "unknown";
>  }
> 
> -static const struct named_register *lookup_register(uint32_t offset)
> +static const struct named_register *
> +lookup_register(uint32_t offset, uint32_t mmio_base)
>  {
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
>  		unsigned int width = r->count ? 4*r->count : 4;
> -		if (offset >= r->offset && offset < r->offset + width)
> +		uint32_t base;
> +
> +		base = r->offset;
> +		if (r->relative)
> +			base += mmio_base;
> +
> +		if (offset >= base && offset < base + width)
>  			return r;
>  	}
> 
>  	return NULL;
>  }
> 
> -static bool ignore_register(uint32_t offset)
> +static bool ignore_register(uint32_t offset, uint32_t mmio_base)
>  {
>  	for (const struct named_register *r = ignore_registers; r->name; r++)
> {
>  		unsigned int width = r->count ? 4*r->count : 4;
> -		if (offset >= r->offset && offset < r->offset + width)
> +		uint32_t base;
> +
> +		base = r->offset;
> +		if (r->relative)
> +			base += mmio_base;
> +
> +		if (offset >= base && offset < base + width)
>  			return true;
>  	}
> 
> @@ -248,6 +238,7 @@ static void tmpl_regs(int fd,
>  {
>  	const unsigned int gen_bit = 1 << intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);

Chris, I tried to test this patch, but "gem_engine_mmio_base()" above is not defined.
Can you check?

--CQ
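
For context, the relative lookup in the quoted hunks amounts to adding the engine's mmio_base to the table offset before the range check. A standalone sketch of that check, using a trimmed-down struct and a hypothetical reg_matches() helper (both are assumptions for illustration, not code from the patch):

```c
#include <stdbool.h>
#include <stdint.h>

/* Trimmed-down stand-in for the test's named_register table entry. */
struct reg_entry {
	const char *name;
	uint32_t offset;	/* absolute, or relative to mmio_base */
	uint32_t count;		/* consecutive dwords; 0 is treated as 1 */
	bool relative;
};

/* Hypothetical helper: does the absolute offset fall inside this entry? */
static bool reg_matches(const struct reg_entry *r,
			uint32_t offset, uint32_t mmio_base)
{
	uint32_t width = r->count ? 4 * r->count : 4;
	uint32_t base = r->offset;

	/* Relative entries are rebased onto the engine's register block. */
	if (r->relative)
		base += mmio_base;

	return offset >= base && offset < base + width;
}
```

With mmio_base = 0x1c0000, a relative entry at 0x600 with 32 dwords covers 0x1c0600 up to (but not including) 0x1c0680, which is the range the removed gen11 "VCS0_GPR" entry listed as an absolute offset.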


>  	unsigned int regs_size;
>  	uint32_t *regs;
> 
> @@ -259,12 +250,20 @@ static void tmpl_regs(int fd,
>  		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
> 
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> +
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			uint32_t x = value;
>  			if (r->write_mask)
>  				x &= r->write_mask;
> @@ -284,6 +283,7 @@ static uint32_t read_regs(int fd,
>  	const unsigned int gen = intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int gen_bit = 1 << gen;
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	const bool r64b = gen >= 8;
>  	struct drm_i915_gem_exec_object2 obj[2];
>  	struct drm_i915_gem_relocation_entry *reloc;
> @@ -311,13 +311,20 @@ static uint32_t read_regs(int fd,
> 
>  	n = 0;
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> 
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			*b++ = 0x24 << 23 | (1 + r64b); /* SRM */
>  			*b++ = offset;
>  			reloc[n].target_handle = obj[0].handle;
> @@ -357,6 +364,7 @@ static void write_regs(int fd,
>  {
>  	const unsigned int gen_bit = 1 << intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	struct drm_i915_gem_exec_object2 obj;
>  	struct drm_i915_gem_execbuffer2 execbuf;
>  	unsigned int batch_size;
> @@ -372,12 +380,20 @@ static void write_regs(int fd,
>  	gem_set_domain(fd, obj.handle,
>  		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> +
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			uint32_t x = value;
>  			if (r->write_mask)
>  				x &= r->write_mask;
> @@ -410,6 +426,7 @@ static void restore_regs(int fd,
>  	const unsigned int gen = intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int gen_bit = 1 << gen;
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	const bool r64b = gen >= 8;
>  	struct drm_i915_gem_exec_object2 obj[2];
>  	struct drm_i915_gem_execbuffer2 execbuf;
> @@ -437,13 +454,20 @@ static void restore_regs(int fd,
> 
>  	n = 0;
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> 
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			*b++ = 0x29 << 23 | (1 + r64b); /* LRM */
>  			*b++ = offset;
>  			reloc[n].target_handle = obj[0].handle;
> @@ -479,6 +503,7 @@ static void dump_regs(int fd,
>  	const int gen = intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int gen_bit = 1 << gen;
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	unsigned int regs_size;
>  	uint32_t *out;
> 
> @@ -489,26 +514,36 @@ static void dump_regs(int fd,
>  	gem_set_domain(fd, regs, I915_GEM_DOMAIN_CPU, 0);
> 
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> 
>  		if (r->count <= 1) {
>  			igt_debug("0x%04x (%s): 0x%08x\n",
> -				  r->offset, r->name, out[r->offset/4]);
> +				  offset, r->name, out[offset / 4]);
>  		} else {
>  			for (unsigned x = 0; x < r->count; x++)
>  				igt_debug("0x%04x (%s[%d]): 0x%08x\n",
> -					  r->offset+4*x, r->name, x,
> -					  out[r->offset/4 + x]);
> +					  offset + 4 * x, r->name, x,
> +					  out[offset / 4 + x]);
>  		}
>  	}
>  	munmap(out, regs_size);
>  }
> 
> -static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
> +static void compare_regs(int fd, const struct intel_execution_engine2 *e,
> +			 uint32_t A, uint32_t B, const char *who)
>  {
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	unsigned int num_errors;
>  	unsigned int regs_size;
>  	uint32_t *a, *b;
> @@ -532,11 +567,11 @@ static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
>  		if (a[n] == b[n])
>  			continue;
> 
> -		if (ignore_register(offset))
> +		if (ignore_register(offset, mmio_base))
>  			continue;
> 
>  		mask = ~0u;
> -		r = lookup_register(offset);
> +		r = lookup_register(offset, mmio_base);
>  		if (r && r->masked)
>  			mask >>= 16;
>  		if (r && r->ignore_bits)
> @@ -547,7 +582,7 @@ static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
> 
>  		igt_warn("Register 0x%04x (%s): A=%08x B=%08x\n",
>  			 offset,
> -			 register_name(offset, buf, sizeof(buf)),
> +			 register_name(offset, mmio_base, buf, sizeof(buf)),
>  			 a[n] & mask, b[n] & mask);
>  		num_errors++;
>  	}
> @@ -638,7 +673,7 @@ static void nonpriv(int fd,
> 
>  		igt_spin_free(fd, spin);
> 
> -		compare_regs(fd, tmpl, regs[1], "nonpriv read/writes");
> +		compare_regs(fd, e, tmpl, regs[1], "nonpriv read/writes");
> 
>  		for (int n = 0; n < ARRAY_SIZE(regs); n++)
>  			gem_close(fd, regs[n]);
> @@ -708,8 +743,9 @@ static void isolation(int fd,
>  		igt_spin_free(fd, spin);
> 
>  		if (!(flags & DIRTY1))
> -			compare_regs(fd, regs[0], tmp, "two reads of the same ctx");
> -		compare_regs(fd, regs[0], regs[1], "two virgin contexts");
> +			compare_regs(fd, e, regs[0], tmp,
> +				     "two reads of the same ctx");
> +		compare_regs(fd, e, regs[0], regs[1], "two virgin contexts");
> 
>  		for (int n = 0; n < ARRAY_SIZE(ctx); n++) {
>  			gem_close(fd, regs[n]);
> @@ -829,13 +865,13 @@ static void preservation(int fd,
>  		char buf[80];
> 
>  		snprintf(buf, sizeof(buf), "dirty %x context\n", values[v]);
> -		compare_regs(fd, regs[v][0], regs[v][1], buf);
> +		compare_regs(fd, e, regs[v][0], regs[v][1], buf);
> 
>  		gem_close(fd, regs[v][0]);
>  		gem_close(fd, regs[v][1]);
>  		gem_context_destroy(fd, ctx[v]);
>  	}
> -	compare_regs(fd, regs[num_values][0], regs[num_values][1], "clean");
> +	compare_regs(fd, e, regs[num_values][0], regs[num_values][1], "clean");
>  	gem_context_destroy(fd, ctx[num_values]);
>  }
> 
> --
> 2.24.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Intel-gfx] [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine relative registers
@ 2019-11-21 21:07     ` Tang, CQ
  0 siblings, 0 replies; 57+ messages in thread
From: Tang, CQ @ 2019-11-21 21:07 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev



> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> Chris Wilson
> Sent: Wednesday, November 13, 2019 4:53 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: igt-dev@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine
> relative registers
> 
> Some of the non-privileged registers are at the same offset on each engine.
> We can improve our coverage for unknown HW layout by using the reported
> engine->mmio_base for relative offsets.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  tests/i915/gem_ctx_isolation.c | 164 ++++++++++++++++++++-------------
>  1 file changed, 100 insertions(+), 64 deletions(-)
> 
> diff --git a/tests/i915/gem_ctx_isolation.c b/tests/i915/gem_ctx_isolation.c
> index 6aa27133c..546ffac3a 100644
> --- a/tests/i915/gem_ctx_isolation.c
> +++ b/tests/i915/gem_ctx_isolation.c
> @@ -70,6 +70,7 @@ static const struct named_register {
>  	uint32_t ignore_bits;
>  	uint32_t write_mask; /* some registers bits do not exist */
>  	bool masked;
> +	bool relative;
>  } nonpriv_registers[] = {
>  	{ "NOPID", NOCTX, RCS0, 0x2094 },
>  	{ "MI_PREDICATE_RESULT_2", NOCTX, RCS0, 0x23bc },
> @@ -109,7 +110,6 @@ static const struct named_register {
>  	{ "PS_DEPTH_COUNT_1", GEN8, RCS0, 0x22f8, 2 },
>  	{ "BB_OFFSET", GEN8, RCS0, 0x2158, .ignore_bits = 0x7 },
>  	{ "MI_PREDICATE_RESULT_1", GEN8, RCS0, 0x241c },
> -	{ "CS_GPR", GEN8, RCS0, 0x2600, 32 },
>  	{ "OA_CTX_CONTROL", GEN8, RCS0, 0x2360 },
>  	{ "OACTXID", GEN8, RCS0, 0x2364 },
>  	{ "PS_INVOCATION_COUNT_2", GEN8, RCS0, 0x2448, 2, .write_mask = ~0x3 },
> @@ -138,79 +138,56 @@ static const struct named_register {
> 
>  	{ "CTX_PREEMPT", NOCTX /* GEN10 */, RCS0, 0x2248 },
>  	{ "CS_CHICKEN1", GEN11, RCS0, 0x2580, .masked = true },
> -	{ "HDC_CHICKEN1", GEN_RANGE(10, 10), RCS0, 0x7304, .masked = true },
> 
>  	/* Privileged (enabled by w/a + FORCE_TO_NONPRIV) */
>  	{ "CTX_PREEMPT", NOCTX /* GEN9 */, RCS0, 0x2248 },
>  	{ "CS_CHICKEN1", GEN_RANGE(9, 10), RCS0, 0x2580, .masked = true },
>  	{ "COMMON_SLICE_CHICKEN2", GEN_RANGE(9, 9), RCS0, 0x7014, .masked = true },
> -	{ "HDC_CHICKEN1", GEN_RANGE(9, 9), RCS0, 0x7304, .masked = true },
> +	{ "HDC_CHICKEN1", GEN_RANGE(9, 10), RCS0, 0x7304, .masked = true },
>  	{ "SLICE_COMMON_ECO_CHICKEN1", GEN_RANGE(11, 11) /* + glk */, RCS0,  0x731c, .masked = true },
>  	{ "L3SQREG4", NOCTX /* GEN9:skl,kbl */, RCS0, 0xb118, .write_mask = ~0x1ffff0 },
>  	{ "HALF_SLICE_CHICKEN7", GEN_RANGE(11, 11), RCS0, 0xe194, .masked = true },
>  	{ "SAMPLER_MODE", GEN_RANGE(11, 11), RCS0, 0xe18c, .masked = true },
> 
> -	{ "BCS_GPR", GEN9, BCS0, 0x22600, 32 },
>  	{ "BCS_SWCTRL", GEN8, BCS0, 0x22200, .write_mask = 0x3, .masked = true },
> 
>  	{ "MFC_VDBOX1", NOCTX, VCS0, 0x12800, 64 },
>  	{ "MFC_VDBOX2", NOCTX, VCS1, 0x1c800, 64 },
> 
> -	{ "VCS0_GPR", GEN_RANGE(9, 10), VCS0, 0x12600, 32 },
> -	{ "VCS1_GPR", GEN_RANGE(9, 10), VCS1, 0x1c600, 32 },
> -	{ "VECS_GPR", GEN_RANGE(9, 10), VECS0, 0x1a600, 32 },
> -
> -	{ "VCS0_GPR", GEN11, VCS0, 0x1c0600, 32 },
> -	{ "VCS1_GPR", GEN11, VCS1, 0x1c4600, 32 },
> -	{ "VCS2_GPR", GEN11, VCS2, 0x1d0600, 32 },
> -	{ "VCS3_GPR", GEN11, VCS3, 0x1d4600, 32 },
> -	{ "VECS_GPR", GEN11, VECS0, 0x1c8600, 32 },
> +	{ "xCS_GPR", GEN9, ALL, 0x600, 32, .relative = true },
> 
>  	{}
>  }, ignore_registers[] = {
>  	{ "RCS timestamp", GEN6, ~0u, 0x2358 },
>  	{ "BCS timestamp", GEN7, ~0u, 0x22358 },
> 
> -	{ "VCS0 timestamp", GEN_RANGE(7, 10), ~0u, 0x12358 },
> -	{ "VCS1 timestamp", GEN_RANGE(7, 10), ~0u, 0x1c358 },
> -	{ "VECS timestamp", GEN_RANGE(8, 10), ~0u, 0x1a358 },
> -
> -	{ "VCS0 timestamp", GEN11, ~0u, 0x1c0358 },
> -	{ "VCS1 timestamp", GEN11, ~0u, 0x1c4358 },
> -	{ "VCS2 timestamp", GEN11, ~0u, 0x1d0358 },
> -	{ "VCS3 timestamp", GEN11, ~0u, 0x1d4358 },
> -	{ "VECS timestamp", GEN11, ~0u, 0x1c8358 },
> +	{ "xCS timestamp", GEN8, ALL, 0x358, .relative = true },
> 
>  	/* huc read only */
> -	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x2000 },
> -	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x2014 },
> -	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x23b0 },
> -
> -	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x2000 },
> -	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x2014 },
> -	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x23b0 },
> -
> -	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x2000 },
> -	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x2014 },
> -	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x23b0 },
> -
> -	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x2000 },
> -	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x2014 },
> -	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x23b0 },
> +	{ "BSD 0x2000", GEN11, ALL, 0x2000, .relative = true },
> +	{ "BSD 0x2014", GEN11, ALL, 0x2014, .relative = true },
> +	{ "BSD 0x23b0", GEN11, ALL, 0x23b0, .relative = true },
> 
>  	{}
>  };
> 
> -static const char *register_name(uint32_t offset, char *buf, size_t len)
> +static const char *
> +register_name(uint32_t offset, uint32_t mmio_base, char *buf, size_t len)
>  {
>  	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
>  		unsigned int width = r->count ? 4*r->count : 4;
> -		if (offset >= r->offset && offset < r->offset + width) {
> +		uint32_t base;
> +
> +		base = r->offset;
> +		if (r->relative)
> +			base += mmio_base;
> +
> +		if (offset >= base && offset < base + width) {
>  			if (r->count <= 1)
>  				return r->name;
> 
>  			snprintf(buf, len, "%s[%d]",
> -				 r->name, (offset - r->offset)/4);
> +				 r->name, (offset - base) / 4);
>  			return buf;
>  		}
>  	}
> @@ -218,22 +195,35 @@ static const char *register_name(uint32_t offset, char *buf, size_t len)
>  	return "unknown";
>  }
> 
> -static const struct named_register *lookup_register(uint32_t offset)
> +static const struct named_register *
> +lookup_register(uint32_t offset, uint32_t mmio_base)
>  {
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
>  		unsigned int width = r->count ? 4*r->count : 4;
> -		if (offset >= r->offset && offset < r->offset + width)
> +		uint32_t base;
> +
> +		base = r->offset;
> +		if (r->relative)
> +			base += mmio_base;
> +
> +		if (offset >= base && offset < base + width)
>  			return r;
>  	}
> 
>  	return NULL;
>  }
> 
> -static bool ignore_register(uint32_t offset)
> +static bool ignore_register(uint32_t offset, uint32_t mmio_base)
>  {
>  	for (const struct named_register *r = ignore_registers; r->name; r++)
> {
>  		unsigned int width = r->count ? 4*r->count : 4;
> -		if (offset >= r->offset && offset < r->offset + width)
> +		uint32_t base;
> +
> +		base = r->offset;
> +		if (r->relative)
> +			base += mmio_base;
> +
> +		if (offset >= base && offset < base + width)
>  			return true;
>  	}
> 
> @@ -248,6 +238,7 @@ static void tmpl_regs(int fd,
>  {
>  	const unsigned int gen_bit = 1 << intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);

Chris, I tried to test this patch, but "gem_engine_mmio_base()" above is not defined.
Can you check?

--CQ


>  	unsigned int regs_size;
>  	uint32_t *regs;
> 
> @@ -259,12 +250,20 @@ static void tmpl_regs(int fd,
>  		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
> 
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> +
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			uint32_t x = value;
>  			if (r->write_mask)
>  				x &= r->write_mask;
> @@ -284,6 +283,7 @@ static uint32_t read_regs(int fd,
>  	const unsigned int gen = intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int gen_bit = 1 << gen;
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	const bool r64b = gen >= 8;
>  	struct drm_i915_gem_exec_object2 obj[2];
>  	struct drm_i915_gem_relocation_entry *reloc;
> @@ -311,13 +311,20 @@ static uint32_t read_regs(int fd,
> 
>  	n = 0;
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> 
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			*b++ = 0x24 << 23 | (1 + r64b); /* SRM */
>  			*b++ = offset;
>  			reloc[n].target_handle = obj[0].handle;
> @@ -357,6 +364,7 @@ static void write_regs(int fd,
>  {
>  	const unsigned int gen_bit = 1 << intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	struct drm_i915_gem_exec_object2 obj;
>  	struct drm_i915_gem_execbuffer2 execbuf;
>  	unsigned int batch_size;
> @@ -372,12 +380,20 @@ static void write_regs(int fd,
>  	gem_set_domain(fd, obj.handle,
>  		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> +
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			uint32_t x = value;
>  			if (r->write_mask)
>  				x &= r->write_mask;
> @@ -410,6 +426,7 @@ static void restore_regs(int fd,
>  	const unsigned int gen = intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int gen_bit = 1 << gen;
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	const bool r64b = gen >= 8;
>  	struct drm_i915_gem_exec_object2 obj[2];
>  	struct drm_i915_gem_execbuffer2 execbuf;
> @@ -437,13 +454,20 @@ static void restore_regs(int fd,
> 
>  	n = 0;
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> 
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			*b++ = 0x29 << 23 | (1 + r64b); /* LRM */
>  			*b++ = offset;
>  			reloc[n].target_handle = obj[0].handle; @@ -479,6
> +503,7 @@ static void dump_regs(int fd,
>  	const int gen = intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int gen_bit = 1 << gen;
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	unsigned int regs_size;
>  	uint32_t *out;
> 
> @@ -489,26 +514,36 @@ static void dump_regs(int fd,
>  	gem_set_domain(fd, regs, I915_GEM_DOMAIN_CPU, 0);
> 
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> 
>  		if (r->count <= 1) {
>  			igt_debug("0x%04x (%s): 0x%08x\n",
> -				  r->offset, r->name, out[r->offset/4]);
> +				  offset, r->name, out[offset / 4]);
>  		} else {
>  			for (unsigned x = 0; x < r->count; x++)
>  				igt_debug("0x%04x (%s[%d]): 0x%08x\n",
> -					  r->offset+4*x, r->name, x,
> -					  out[r->offset/4 + x]);
> +					  offset + 4 * x, r->name, x,
> +					  out[offset / 4 + x]);
>  		}
>  	}
>  	munmap(out, regs_size);
>  }
> 
> -static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
> +static void compare_regs(int fd, const struct intel_execution_engine2 *e,
> +			 uint32_t A, uint32_t B, const char *who)
>  {
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	unsigned int num_errors;
>  	unsigned int regs_size;
>  	uint32_t *a, *b;
> @@ -532,11 +567,11 @@ static void compare_regs(int fd, uint32_t A,
> uint32_t B, const char *who)
>  		if (a[n] == b[n])
>  			continue;
> 
> -		if (ignore_register(offset))
> +		if (ignore_register(offset, mmio_base))
>  			continue;
> 
>  		mask = ~0u;
> -		r = lookup_register(offset);
> +		r = lookup_register(offset, mmio_base);
>  		if (r && r->masked)
>  			mask >>= 16;
>  		if (r && r->ignore_bits)
> @@ -547,7 +582,7 @@ static void compare_regs(int fd, uint32_t A, uint32_t B,
> const char *who)
> 
>  		igt_warn("Register 0x%04x (%s): A=%08x B=%08x\n",
>  			 offset,
> -			 register_name(offset, buf, sizeof(buf)),
> +			 register_name(offset, mmio_base, buf, sizeof(buf)),
>  			 a[n] & mask, b[n] & mask);
>  		num_errors++;
>  	}
> @@ -638,7 +673,7 @@ static void nonpriv(int fd,
> 
>  		igt_spin_free(fd, spin);
> 
> -		compare_regs(fd, tmpl, regs[1], "nonpriv read/writes");
> +		compare_regs(fd, e, tmpl, regs[1], "nonpriv read/writes");
> 
>  		for (int n = 0; n < ARRAY_SIZE(regs); n++)
>  			gem_close(fd, regs[n]);
> @@ -708,8 +743,9 @@ static void isolation(int fd,
>  		igt_spin_free(fd, spin);
> 
>  		if (!(flags & DIRTY1))
> -			compare_regs(fd, regs[0], tmp, "two reads of the
> same ctx");
> -		compare_regs(fd, regs[0], regs[1], "two virgin contexts");
> +			compare_regs(fd, e, regs[0], tmp,
> +				     "two reads of the same ctx");
> +		compare_regs(fd, e, regs[0], regs[1], "two virgin contexts");
> 
>  		for (int n = 0; n < ARRAY_SIZE(ctx); n++) {
>  			gem_close(fd, regs[n]);
> @@ -829,13 +865,13 @@ static void preservation(int fd,
>  		char buf[80];
> 
>  		snprintf(buf, sizeof(buf), "dirty %x context\n", values[v]);
> -		compare_regs(fd, regs[v][0], regs[v][1], buf);
> +		compare_regs(fd, e, regs[v][0], regs[v][1], buf);
> 
>  		gem_close(fd, regs[v][0]);
>  		gem_close(fd, regs[v][1]);
>  		gem_context_destroy(fd, ctx[v]);
>  	}
> -	compare_regs(fd, regs[num_values][0], regs[num_values][1],
> "clean");
> +	compare_regs(fd, e, regs[num_values][0], regs[num_values][1],
> +"clean");
>  	gem_context_destroy(fd, ctx[num_values]);  }
> 
> --
> 2.24.0
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [igt-dev] [Intel-gfx] [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine relative registers
@ 2019-11-21 21:07     ` Tang, CQ
  0 siblings, 0 replies; 57+ messages in thread
From: Tang, CQ @ 2019-11-21 21:07 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev



> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> Chris Wilson
> Sent: Wednesday, November 13, 2019 4:53 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: igt-dev@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine
> relative registers
> 
> Some of the non-privileged registers are at the same offset on each engine.
> We can improve our coverage for unknown HW layout by using the reported
> engine->mmio_base for relative offsets.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  tests/i915/gem_ctx_isolation.c | 164 ++++++++++++++++++++-------------
>  1 file changed, 100 insertions(+), 64 deletions(-)
> 
> diff --git a/tests/i915/gem_ctx_isolation.c b/tests/i915/gem_ctx_isolation.c
> index 6aa27133c..546ffac3a 100644
> --- a/tests/i915/gem_ctx_isolation.c
> +++ b/tests/i915/gem_ctx_isolation.c
> @@ -70,6 +70,7 @@ static const struct named_register {
>  	uint32_t ignore_bits;
>  	uint32_t write_mask; /* some registers bits do not exist */
>  	bool masked;
> +	bool relative;
>  } nonpriv_registers[] = {
>  	{ "NOPID", NOCTX, RCS0, 0x2094 },
>  	{ "MI_PREDICATE_RESULT_2", NOCTX, RCS0, 0x23bc }, @@ -109,7
> +110,6 @@ static const struct named_register {
>  	{ "PS_DEPTH_COUNT_1", GEN8, RCS0, 0x22f8, 2 },
>  	{ "BB_OFFSET", GEN8, RCS0, 0x2158, .ignore_bits = 0x7 },
>  	{ "MI_PREDICATE_RESULT_1", GEN8, RCS0, 0x241c },
> -	{ "CS_GPR", GEN8, RCS0, 0x2600, 32 },
>  	{ "OA_CTX_CONTROL", GEN8, RCS0, 0x2360 },
>  	{ "OACTXID", GEN8, RCS0, 0x2364 },
>  	{ "PS_INVOCATION_COUNT_2", GEN8, RCS0, 0x2448, 2, .write_mask
> = ~0x3 }, @@ -138,79 +138,56 @@ static const struct named_register {
> 
>  	{ "CTX_PREEMPT", NOCTX /* GEN10 */, RCS0, 0x2248 },
>  	{ "CS_CHICKEN1", GEN11, RCS0, 0x2580, .masked = true },
> -	{ "HDC_CHICKEN1", GEN_RANGE(10, 10), RCS0, 0x7304, .masked =
> true },
> 
>  	/* Privileged (enabled by w/a + FORCE_TO_NONPRIV) */
>  	{ "CTX_PREEMPT", NOCTX /* GEN9 */, RCS0, 0x2248 },
>  	{ "CS_CHICKEN1", GEN_RANGE(9, 10), RCS0, 0x2580, .masked = true },
>  	{ "COMMON_SLICE_CHICKEN2", GEN_RANGE(9, 9), RCS0,
> 0x7014, .masked = true },
> -	{ "HDC_CHICKEN1", GEN_RANGE(9, 9), RCS0, 0x7304, .masked =
> true },
> +	{ "HDC_CHICKEN1", GEN_RANGE(9, 10), RCS0, 0x7304, .masked =
> true },
>  	{ "SLICE_COMMON_ECO_CHICKEN1", GEN_RANGE(11, 11) /* + glk */,
> RCS0,  0x731c, .masked = true },
>  	{ "L3SQREG4", NOCTX /* GEN9:skl,kbl */, RCS0, 0xb118, .write_mask
> = ~0x1ffff0 },
>  	{ "HALF_SLICE_CHICKEN7", GEN_RANGE(11, 11), RCS0,
> 0xe194, .masked = true },
>  	{ "SAMPLER_MODE", GEN_RANGE(11, 11), RCS0, 0xe18c, .masked =
> true },
> 
> -	{ "BCS_GPR", GEN9, BCS0, 0x22600, 32 },
>  	{ "BCS_SWCTRL", GEN8, BCS0, 0x22200, .write_mask = 0x3, .masked =
> true },
> 
>  	{ "MFC_VDBOX1", NOCTX, VCS0, 0x12800, 64 },
>  	{ "MFC_VDBOX2", NOCTX, VCS1, 0x1c800, 64 },
> 
> -	{ "VCS0_GPR", GEN_RANGE(9, 10), VCS0, 0x12600, 32 },
> -	{ "VCS1_GPR", GEN_RANGE(9, 10), VCS1, 0x1c600, 32 },
> -	{ "VECS_GPR", GEN_RANGE(9, 10), VECS0, 0x1a600, 32 },
> -
> -	{ "VCS0_GPR", GEN11, VCS0, 0x1c0600, 32 },
> -	{ "VCS1_GPR", GEN11, VCS1, 0x1c4600, 32 },
> -	{ "VCS2_GPR", GEN11, VCS2, 0x1d0600, 32 },
> -	{ "VCS3_GPR", GEN11, VCS3, 0x1d4600, 32 },
> -	{ "VECS_GPR", GEN11, VECS0, 0x1c8600, 32 },
> +	{ "xCS_GPR", GEN9, ALL, 0x600, 32, .relative = true },
> 
>  	{}
>  }, ignore_registers[] = {
>  	{ "RCS timestamp", GEN6, ~0u, 0x2358 },
>  	{ "BCS timestamp", GEN7, ~0u, 0x22358 },
> 
> -	{ "VCS0 timestamp", GEN_RANGE(7, 10), ~0u, 0x12358 },
> -	{ "VCS1 timestamp", GEN_RANGE(7, 10), ~0u, 0x1c358 },
> -	{ "VECS timestamp", GEN_RANGE(8, 10), ~0u, 0x1a358 },
> -
> -	{ "VCS0 timestamp", GEN11, ~0u, 0x1c0358 },
> -	{ "VCS1 timestamp", GEN11, ~0u, 0x1c4358 },
> -	{ "VCS2 timestamp", GEN11, ~0u, 0x1d0358 },
> -	{ "VCS3 timestamp", GEN11, ~0u, 0x1d4358 },
> -	{ "VECS timestamp", GEN11, ~0u, 0x1c8358 },
> +	{ "xCS timestamp", GEN8, ALL, 0x358, .relative = true },
> 
>  	/* huc read only */
> -	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x2000 },
> -	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x2014 },
> -	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x23b0 },
> -
> -	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x2000 },
> -	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x2014 },
> -	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x23b0 },
> -
> -	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x2000 },
> -	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x2014 },
> -	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x23b0 },
> -
> -	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x2000 },
> -	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x2014 },
> -	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x23b0 },
> +	{ "BSD 0x2000", GEN11, ALL, 0x2000, .relative = true },
> +	{ "BSD 0x2014", GEN11, ALL, 0x2014, .relative = true },
> +	{ "BSD 0x23b0", GEN11, ALL, 0x23b0, .relative = true },
> 
>  	{}
>  };
> 
> -static const char *register_name(uint32_t offset, char *buf, size_t len)
> +static const char *
> +register_name(uint32_t offset, uint32_t mmio_base, char *buf, size_t
> +len)
>  {
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
>  		unsigned int width = r->count ? 4*r->count : 4;
> -		if (offset >= r->offset && offset < r->offset + width) {
> +		uint32_t base;
> +
> +		base = r->offset;
> +		if (r->relative)
> +			base += mmio_base;
> +
> +		if (offset >= base && offset < base + width) {
>  			if (r->count <= 1)
>  				return r->name;
> 
>  			snprintf(buf, len, "%s[%d]",
> -				 r->name, (offset - r->offset)/4);
> +				 r->name, (offset - base) / 4);
>  			return buf;
>  		}
>  	}
> @@ -218,22 +195,35 @@ static const char *register_name(uint32_t offset,
> char *buf, size_t len)
>  	return "unknown";
>  }
> 
> -static const struct named_register *lookup_register(uint32_t offset)
> +static const struct named_register *
> +lookup_register(uint32_t offset, uint32_t mmio_base)
>  {
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
>  		unsigned int width = r->count ? 4*r->count : 4;
> -		if (offset >= r->offset && offset < r->offset + width)
> +		uint32_t base;
> +
> +		base = r->offset;
> +		if (r->relative)
> +			base += mmio_base;
> +
> +		if (offset >= base && offset < base + width)
>  			return r;
>  	}
> 
>  	return NULL;
>  }
> 
> -static bool ignore_register(uint32_t offset)
> +static bool ignore_register(uint32_t offset, uint32_t mmio_base)
>  {
>  	for (const struct named_register *r = ignore_registers; r->name; r++)
> {
>  		unsigned int width = r->count ? 4*r->count : 4;
> -		if (offset >= r->offset && offset < r->offset + width)
> +		uint32_t base;
> +
> +		base = r->offset;
> +		if (r->relative)
> +			base += mmio_base;
> +
> +		if (offset >= base && offset < base + width)
>  			return true;
>  	}
> 
> @@ -248,6 +238,7 @@ static void tmpl_regs(int fd,
>  {
>  	const unsigned int gen_bit = 1 << intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);

Chris, I tried to test this patch, but "gem_engine_mmio_base()" above is not defined.
Can you check?

--CQ


>  	unsigned int regs_size;
>  	uint32_t *regs;
> 
> @@ -259,12 +250,20 @@ static void tmpl_regs(int fd,
>  		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
> 
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> +
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			uint32_t x = value;
>  			if (r->write_mask)
>  				x &= r->write_mask;
> @@ -284,6 +283,7 @@ static uint32_t read_regs(int fd,
>  	const unsigned int gen = intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int gen_bit = 1 << gen;
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	const bool r64b = gen >= 8;
>  	struct drm_i915_gem_exec_object2 obj[2];
>  	struct drm_i915_gem_relocation_entry *reloc; @@ -311,13 +311,20
> @@ static uint32_t read_regs(int fd,
> 
>  	n = 0;
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> 
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			*b++ = 0x24 << 23 | (1 + r64b); /* SRM */
>  			*b++ = offset;
>  			reloc[n].target_handle = obj[0].handle; @@ -357,6
> +364,7 @@ static void write_regs(int fd,  {
>  	const unsigned int gen_bit = 1 <<
> intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	struct drm_i915_gem_exec_object2 obj;
>  	struct drm_i915_gem_execbuffer2 execbuf;
>  	unsigned int batch_size;
> @@ -372,12 +380,20 @@ static void write_regs(int fd,
>  	gem_set_domain(fd, obj.handle,
>  		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> +
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			uint32_t x = value;
>  			if (r->write_mask)
>  				x &= r->write_mask;
> @@ -410,6 +426,7 @@ static void restore_regs(int fd,
>  	const unsigned int gen = intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int gen_bit = 1 << gen;
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	const bool r64b = gen >= 8;
>  	struct drm_i915_gem_exec_object2 obj[2];
>  	struct drm_i915_gem_execbuffer2 execbuf; @@ -437,13 +454,20
> @@ static void restore_regs(int fd,
> 
>  	n = 0;
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> 
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			*b++ = 0x29 << 23 | (1 + r64b); /* LRM */
>  			*b++ = offset;
>  			reloc[n].target_handle = obj[0].handle; @@ -479,6
> +503,7 @@ static void dump_regs(int fd,
>  	const int gen = intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int gen_bit = 1 << gen;
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	unsigned int regs_size;
>  	uint32_t *out;
> 
> @@ -489,26 +514,36 @@ static void dump_regs(int fd,
>  	gem_set_domain(fd, regs, I915_GEM_DOMAIN_CPU, 0);
> 
>  	for (const struct named_register *r = nonpriv_registers; r->name;
> r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> 
>  		if (r->count <= 1) {
>  			igt_debug("0x%04x (%s): 0x%08x\n",
> -				  r->offset, r->name, out[r->offset/4]);
> +				  offset, r->name, out[offset / 4]);
>  		} else {
>  			for (unsigned x = 0; x < r->count; x++)
>  				igt_debug("0x%04x (%s[%d]): 0x%08x\n",
> -					  r->offset+4*x, r->name, x,
> -					  out[r->offset/4 + x]);
> +					  offset + 4 * x, r->name, x,
> +					  out[offset / 4 + x]);
>  		}
>  	}
>  	munmap(out, regs_size);
>  }
> 
> -static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
> +static void compare_regs(int fd, const struct intel_execution_engine2 *e,
> +			 uint32_t A, uint32_t B, const char *who)
>  {
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	unsigned int num_errors;
>  	unsigned int regs_size;
>  	uint32_t *a, *b;
> @@ -532,11 +567,11 @@ static void compare_regs(int fd, uint32_t A,
> uint32_t B, const char *who)
>  		if (a[n] == b[n])
>  			continue;
> 
> -		if (ignore_register(offset))
> +		if (ignore_register(offset, mmio_base))
>  			continue;
> 
>  		mask = ~0u;
> -		r = lookup_register(offset);
> +		r = lookup_register(offset, mmio_base);
>  		if (r && r->masked)
>  			mask >>= 16;
>  		if (r && r->ignore_bits)
> @@ -547,7 +582,7 @@ static void compare_regs(int fd, uint32_t A, uint32_t B,
> const char *who)
> 
>  		igt_warn("Register 0x%04x (%s): A=%08x B=%08x\n",
>  			 offset,
> -			 register_name(offset, buf, sizeof(buf)),
> +			 register_name(offset, mmio_base, buf, sizeof(buf)),
>  			 a[n] & mask, b[n] & mask);
>  		num_errors++;
>  	}
> @@ -638,7 +673,7 @@ static void nonpriv(int fd,
> 
>  		igt_spin_free(fd, spin);
> 
> -		compare_regs(fd, tmpl, regs[1], "nonpriv read/writes");
> +		compare_regs(fd, e, tmpl, regs[1], "nonpriv read/writes");
> 
>  		for (int n = 0; n < ARRAY_SIZE(regs); n++)
>  			gem_close(fd, regs[n]);
> @@ -708,8 +743,9 @@ static void isolation(int fd,
>  		igt_spin_free(fd, spin);
> 
>  		if (!(flags & DIRTY1))
> -			compare_regs(fd, regs[0], tmp, "two reads of the
> same ctx");
> -		compare_regs(fd, regs[0], regs[1], "two virgin contexts");
> +			compare_regs(fd, e, regs[0], tmp,
> +				     "two reads of the same ctx");
> +		compare_regs(fd, e, regs[0], regs[1], "two virgin contexts");
> 
>  		for (int n = 0; n < ARRAY_SIZE(ctx); n++) {
>  			gem_close(fd, regs[n]);
> @@ -829,13 +865,13 @@ static void preservation(int fd,
>  		char buf[80];
> 
>  		snprintf(buf, sizeof(buf), "dirty %x context\n", values[v]);
> -		compare_regs(fd, regs[v][0], regs[v][1], buf);
> +		compare_regs(fd, e, regs[v][0], regs[v][1], buf);
> 
>  		gem_close(fd, regs[v][0]);
>  		gem_close(fd, regs[v][1]);
>  		gem_context_destroy(fd, ctx[v]);
>  	}
> -	compare_regs(fd, regs[num_values][0], regs[num_values][1],
> "clean");
> +	compare_regs(fd, e, regs[num_values][0], regs[num_values][1],
> +"clean");
>  	gem_context_destroy(fd, ctx[num_values]);  }
> 
> --
> 2.24.0
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine relative registers
@ 2019-11-21 23:44       ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-11-21 23:44 UTC (permalink / raw)
  To: Tang, CQ, intel-gfx; +Cc: igt-dev

Quoting Tang, CQ (2019-11-21 21:07:13)
> 
> 
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> > Chris Wilson
> > Sent: Wednesday, November 13, 2019 4:53 AM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: igt-dev@lists.freedesktop.org
> > Subject: [Intel-gfx] [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine
> > @@ -248,6 +238,7 @@ static void tmpl_regs(int fd,  {
> >       const unsigned int gen_bit = 1 <<
> > intel_gen(intel_get_drm_devid(fd));
> >       const unsigned int engine_bit = ENGINE(e->class, e->instance);
> > +     const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
> 
> Chris, I tried to test this patch, but "gem_engine_mmio_base()" above is not defined.
> Can you check?

Did you perchance look at patch 4?
-Chris

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine relative registers
@ 2019-11-21 23:56         ` Tang, CQ
  0 siblings, 0 replies; 57+ messages in thread
From: Tang, CQ @ 2019-11-21 23:56 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev



> -----Original Message-----
> From: Chris Wilson <chris@chris-wilson.co.uk>
> Sent: Thursday, November 21, 2019 3:45 PM
> To: Tang, CQ <cq.tang@intel.com>; intel-gfx@lists.freedesktop.org
> Cc: igt-dev@lists.freedesktop.org
> Subject: RE: [Intel-gfx] [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check
> engine relative registers
> 
> Quoting Tang, CQ (2019-11-21 21:07:13)
> >
> >
> > > -----Original Message-----
> > > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf
> > > Of Chris Wilson
> > > Sent: Wednesday, November 13, 2019 4:53 AM
> > > To: intel-gfx@lists.freedesktop.org
> > > Cc: igt-dev@lists.freedesktop.org
> > > Subject: [Intel-gfx] [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check
> > > engine @@ -248,6 +238,7 @@ static void tmpl_regs(int fd,  {
> > >       const unsigned int gen_bit = 1 <<
> > > intel_gen(intel_get_drm_devid(fd));
> > >       const unsigned int engine_bit = ENGINE(e->class, e->instance);
> > > +     const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
> >
> > Chris, I tried to test this patch, but "gem_engine_mmio_base()" above is
> not defined.
> > Can you check?
> 
> Did you perchance look at patch 4?

Thanks, I found this one:
[i-g-t,4/9] i915: Start putting the mmio_base to wider use

--CQ

> -Chris

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine relative registers
@ 2019-11-25 19:13     ` Tang, CQ
  0 siblings, 0 replies; 57+ messages in thread
From: Tang, CQ @ 2019-11-25 19:13 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev

Chris,
    I applied your patches and tested them on DG1 hardware. The new code works well, but there are still two issues with this test (gem_ctx_isolation.c):

1. The test loops over all possible engines (..., ccs0, ccs1, ccs2, ...). The legacy interface is used to check whether an engine is supported:
          gem_require_ring(fd, e->flags);

For unsupported engines, the IGT test system prints the following message:
Test requirement not met in function gem_require_ring, file ../lib/ioctl_wrappers.c:1288:
Test requirement: gem_has_ring(fd, ring)

I think this is a minor issue; I expect the IGT test code can be enhanced to handle it.

2. All of the xxx-nonpriv-switch subtests fail (where xxx is a supported engine). For example, on rcs0:

Starting subtest: rcs0-nonpriv-switch
(gem_ctx_isolation:16463) igt_aux-CRITICAL: Test assertion failure function sig_abort, file ../lib/igt_aux.c:502:
(gem_ctx_isolation:16463) igt_aux-CRITICAL: Failed assertion: !"GPU hung"
Stack trace:
  #0 ../lib/igt_core.c:1850 __igt_fail_assert()
  #1 ../lib/igt_aux.c:506 igt_fork_hang_detector()
  #2 [killpg+0x40]
  #3 [ioctl+0xb]
  #4 [drmIoctl+0x30]
  #5 ../lib/i915/gem_context.c:119 __gem_context_destroy()
  #6 ../lib/i915/gem_context.c:137 gem_context_destroy()
  #7 ../tests/i915/gem_ctx_isolation.c:666 nonpriv()
  #8 ../tests/i915/gem_ctx_isolation.c:947 __real_main893()
  #9 ../tests/i915/gem_ctx_isolation.c:893 main()
  #10 [__libc_start_main+0xf3]

I spent quite a while studying the source code, but I still don't understand what this subtest is doing. If you know, could you explain it?

--CQ
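
For readers following the thread, the engine-relative offset handling that the quoted patch introduces can be sketched roughly as below. This is only an illustration under stated assumptions, not the IGT code: the struct is a trimmed-down, hypothetical version of named_register, and resolve_offset(), lookup() and example_registers are names made up for this sketch. Skipping relative entries when the base is zero mirrors the `r->relative && !mmio_base` check in the patch's register loops; applying it inside the lookup as well is a choice made here for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down table entry (not the full IGT struct). */
struct named_register {
	const char *name;
	uint32_t offset;	/* absolute, or relative to the engine base */
	uint32_t count;		/* number of dwords; 0 is treated as 1 */
	bool relative;		/* true: add the engine's mmio_base */
};

/* Example table; the offsets are taken from the quoted patch. */
static const struct named_register example_registers[] = {
	{ "xCS_GPR", 0x600, 32, true },
	{ "NOPID", 0x2094, 0, false },
	{ NULL, 0, 0, false },
};

/* Resolve the absolute MMIO offset of one entry for a given engine. */
static uint32_t
resolve_offset(const struct named_register *r, uint32_t mmio_base)
{
	return r->relative ? mmio_base + r->offset : r->offset;
}

/* Return the entry covering an absolute offset, or NULL if none does. */
static const struct named_register *
lookup(const struct named_register *table, uint32_t offset, uint32_t mmio_base)
{
	for (const struct named_register *r = table; r->name; r++) {
		uint32_t width = 4 * (r->count ? r->count : 1);
		uint32_t base;

		/* A relative entry cannot be placed without a known base. */
		if (r->relative && !mmio_base)
			continue;

		base = resolve_offset(r, mmio_base);
		if (offset >= base && offset < base + width)
			return r;
	}

	return NULL;
}
```

With a base of 0x2000 the single xCS_GPR entry resolves to 0x2600, and with 0x1c0000 it resolves to 0x1c0600, which are the offsets of the per-engine CS_GPR and gen11 VCS0_GPR rows that the patch deletes from the table.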


> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> Chris Wilson
> Sent: Wednesday, November 13, 2019 4:53 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: igt-dev@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine
> relative registers
> 
> Some of the non-privileged registers are at the same offset on each engine.
> We can improve our coverage for unknown HW layout by using the reported
> engine->mmio_base for relative offsets.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  tests/i915/gem_ctx_isolation.c | 164 ++++++++++++++++++++-------------
>  1 file changed, 100 insertions(+), 64 deletions(-)
> 
> diff --git a/tests/i915/gem_ctx_isolation.c b/tests/i915/gem_ctx_isolation.c
> index 6aa27133c..546ffac3a 100644
> --- a/tests/i915/gem_ctx_isolation.c
> +++ b/tests/i915/gem_ctx_isolation.c
> @@ -70,6 +70,7 @@ static const struct named_register {
>  	uint32_t ignore_bits;
>  	uint32_t write_mask; /* some registers bits do not exist */
>  	bool masked;
> +	bool relative;
>  } nonpriv_registers[] = {
>  	{ "NOPID", NOCTX, RCS0, 0x2094 },
>  	{ "MI_PREDICATE_RESULT_2", NOCTX, RCS0, 0x23bc },
> @@ -109,7 +110,6 @@ static const struct named_register {
>  	{ "PS_DEPTH_COUNT_1", GEN8, RCS0, 0x22f8, 2 },
>  	{ "BB_OFFSET", GEN8, RCS0, 0x2158, .ignore_bits = 0x7 },
>  	{ "MI_PREDICATE_RESULT_1", GEN8, RCS0, 0x241c },
> -	{ "CS_GPR", GEN8, RCS0, 0x2600, 32 },
>  	{ "OA_CTX_CONTROL", GEN8, RCS0, 0x2360 },
>  	{ "OACTXID", GEN8, RCS0, 0x2364 },
>  	{ "PS_INVOCATION_COUNT_2", GEN8, RCS0, 0x2448, 2, .write_mask = ~0x3 },
> @@ -138,79 +138,56 @@ static const struct named_register {
> 
>  	{ "CTX_PREEMPT", NOCTX /* GEN10 */, RCS0, 0x2248 },
>  	{ "CS_CHICKEN1", GEN11, RCS0, 0x2580, .masked = true },
> -	{ "HDC_CHICKEN1", GEN_RANGE(10, 10), RCS0, 0x7304, .masked = true },
> 
>  	/* Privileged (enabled by w/a + FORCE_TO_NONPRIV) */
>  	{ "CTX_PREEMPT", NOCTX /* GEN9 */, RCS0, 0x2248 },
>  	{ "CS_CHICKEN1", GEN_RANGE(9, 10), RCS0, 0x2580, .masked = true },
>  	{ "COMMON_SLICE_CHICKEN2", GEN_RANGE(9, 9), RCS0, 0x7014, .masked = true },
> -	{ "HDC_CHICKEN1", GEN_RANGE(9, 9), RCS0, 0x7304, .masked = true },
> +	{ "HDC_CHICKEN1", GEN_RANGE(9, 10), RCS0, 0x7304, .masked = true },
>  	{ "SLICE_COMMON_ECO_CHICKEN1", GEN_RANGE(11, 11) /* + glk */, RCS0,  0x731c, .masked = true },
>  	{ "L3SQREG4", NOCTX /* GEN9:skl,kbl */, RCS0, 0xb118, .write_mask = ~0x1ffff0 },
>  	{ "HALF_SLICE_CHICKEN7", GEN_RANGE(11, 11), RCS0, 0xe194, .masked = true },
>  	{ "SAMPLER_MODE", GEN_RANGE(11, 11), RCS0, 0xe18c, .masked = true },
> 
> -	{ "BCS_GPR", GEN9, BCS0, 0x22600, 32 },
>  	{ "BCS_SWCTRL", GEN8, BCS0, 0x22200, .write_mask = 0x3, .masked = true },
> 
>  	{ "MFC_VDBOX1", NOCTX, VCS0, 0x12800, 64 },
>  	{ "MFC_VDBOX2", NOCTX, VCS1, 0x1c800, 64 },
> 
> -	{ "VCS0_GPR", GEN_RANGE(9, 10), VCS0, 0x12600, 32 },
> -	{ "VCS1_GPR", GEN_RANGE(9, 10), VCS1, 0x1c600, 32 },
> -	{ "VECS_GPR", GEN_RANGE(9, 10), VECS0, 0x1a600, 32 },
> -
> -	{ "VCS0_GPR", GEN11, VCS0, 0x1c0600, 32 },
> -	{ "VCS1_GPR", GEN11, VCS1, 0x1c4600, 32 },
> -	{ "VCS2_GPR", GEN11, VCS2, 0x1d0600, 32 },
> -	{ "VCS3_GPR", GEN11, VCS3, 0x1d4600, 32 },
> -	{ "VECS_GPR", GEN11, VECS0, 0x1c8600, 32 },
> +	{ "xCS_GPR", GEN9, ALL, 0x600, 32, .relative = true },
> 
>  	{}
>  }, ignore_registers[] = {
>  	{ "RCS timestamp", GEN6, ~0u, 0x2358 },
>  	{ "BCS timestamp", GEN7, ~0u, 0x22358 },
> 
> -	{ "VCS0 timestamp", GEN_RANGE(7, 10), ~0u, 0x12358 },
> -	{ "VCS1 timestamp", GEN_RANGE(7, 10), ~0u, 0x1c358 },
> -	{ "VECS timestamp", GEN_RANGE(8, 10), ~0u, 0x1a358 },
> -
> -	{ "VCS0 timestamp", GEN11, ~0u, 0x1c0358 },
> -	{ "VCS1 timestamp", GEN11, ~0u, 0x1c4358 },
> -	{ "VCS2 timestamp", GEN11, ~0u, 0x1d0358 },
> -	{ "VCS3 timestamp", GEN11, ~0u, 0x1d4358 },
> -	{ "VECS timestamp", GEN11, ~0u, 0x1c8358 },
> +	{ "xCS timestamp", GEN8, ALL, 0x358, .relative = true },
> 
>  	/* huc read only */
> -	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x2000 },
> -	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x2014 },
> -	{ "BSD0 0x2000", GEN11, ~0u, 0x1c0000 + 0x23b0 },
> -
> -	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x2000 },
> -	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x2014 },
> -	{ "BSD1 0x2000", GEN11, ~0u, 0x1c4000 + 0x23b0 },
> -
> -	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x2000 },
> -	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x2014 },
> -	{ "BSD2 0x2000", GEN11, ~0u, 0x1d0000 + 0x23b0 },
> -
> -	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x2000 },
> -	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x2014 },
> -	{ "BSD3 0x2000", GEN11, ~0u, 0x1d4000 + 0x23b0 },
> +	{ "BSD 0x2000", GEN11, ALL, 0x2000, .relative = true },
> +	{ "BSD 0x2014", GEN11, ALL, 0x2014, .relative = true },
> +	{ "BSD 0x23b0", GEN11, ALL, 0x23b0, .relative = true },
> 
>  	{}
>  };
> 
> -static const char *register_name(uint32_t offset, char *buf, size_t len)
> +static const char *
> +register_name(uint32_t offset, uint32_t mmio_base, char *buf, size_t
> +len)
>  {
>  	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
>  		unsigned int width = r->count ? 4*r->count : 4;
> -		if (offset >= r->offset && offset < r->offset + width) {
> +		uint32_t base;
> +
> +		base = r->offset;
> +		if (r->relative)
> +			base += mmio_base;
> +
> +		if (offset >= base && offset < base + width) {
>  			if (r->count <= 1)
>  				return r->name;
> 
>  			snprintf(buf, len, "%s[%d]",
> -				 r->name, (offset - r->offset)/4);
> +				 r->name, (offset - base) / 4);
>  			return buf;
>  		}
>  	}
> @@ -218,22 +195,35 @@ static const char *register_name(uint32_t offset, char *buf, size_t len)
>  	return "unknown";
>  }
> 
> -static const struct named_register *lookup_register(uint32_t offset)
> +static const struct named_register *
> +lookup_register(uint32_t offset, uint32_t mmio_base)
>  {
>  	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
>  		unsigned int width = r->count ? 4*r->count : 4;
> -		if (offset >= r->offset && offset < r->offset + width)
> +		uint32_t base;
> +
> +		base = r->offset;
> +		if (r->relative)
> +			base += mmio_base;
> +
> +		if (offset >= base && offset < base + width)
>  			return r;
>  	}
> 
>  	return NULL;
>  }
> 
> -static bool ignore_register(uint32_t offset)
> +static bool ignore_register(uint32_t offset, uint32_t mmio_base)
>  {
>  	for (const struct named_register *r = ignore_registers; r->name; r++) {
>  		unsigned int width = r->count ? 4*r->count : 4;
> -		if (offset >= r->offset && offset < r->offset + width)
> +		uint32_t base;
> +
> +		base = r->offset;
> +		if (r->relative)
> +			base += mmio_base;
> +
> +		if (offset >= base && offset < base + width)
>  			return true;
>  	}
> 
> @@ -248,6 +238,7 @@ static void tmpl_regs(int fd,
>  {
>  	const unsigned int gen_bit = 1 << intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	unsigned int regs_size;
>  	uint32_t *regs;
> 
> @@ -259,12 +250,20 @@ static void tmpl_regs(int fd,
>  		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
> 
>  	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> +
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			uint32_t x = value;
>  			if (r->write_mask)
>  				x &= r->write_mask;
> @@ -284,6 +283,7 @@ static uint32_t read_regs(int fd,
>  	const unsigned int gen = intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int gen_bit = 1 << gen;
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	const bool r64b = gen >= 8;
>  	struct drm_i915_gem_exec_object2 obj[2];
>  	struct drm_i915_gem_relocation_entry *reloc;
> @@ -311,13 +311,20 @@ static uint32_t read_regs(int fd,
> 
>  	n = 0;
>  	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> 
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			*b++ = 0x24 << 23 | (1 + r64b); /* SRM */
>  			*b++ = offset;
>  			reloc[n].target_handle = obj[0].handle;
> @@ -357,6 +364,7 @@ static void write_regs(int fd,
>  {
>  	const unsigned int gen_bit = 1 << intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	struct drm_i915_gem_exec_object2 obj;
>  	struct drm_i915_gem_execbuffer2 execbuf;
>  	unsigned int batch_size;
> @@ -372,12 +380,20 @@ static void write_regs(int fd,
>  	gem_set_domain(fd, obj.handle,
>  		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
>  	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> +
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			uint32_t x = value;
>  			if (r->write_mask)
>  				x &= r->write_mask;
> @@ -410,6 +426,7 @@ static void restore_regs(int fd,
>  	const unsigned int gen = intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int gen_bit = 1 << gen;
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	const bool r64b = gen >= 8;
>  	struct drm_i915_gem_exec_object2 obj[2];
>  	struct drm_i915_gem_execbuffer2 execbuf;
> @@ -437,13 +454,20 @@ static void restore_regs(int fd,
> 
>  	n = 0;
>  	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> 
> -		for (unsigned count = r->count ?: 1, offset = r->offset;
> -		     count--; offset += 4) {
> +		for (unsigned count = r->count ?: 1; count--; offset += 4) {
>  			*b++ = 0x29 << 23 | (1 + r64b); /* LRM */
>  			*b++ = offset;
>  			reloc[n].target_handle = obj[0].handle;
> @@ -479,6 +503,7 @@ static void dump_regs(int fd,
>  	const int gen = intel_gen(intel_get_drm_devid(fd));
>  	const unsigned int gen_bit = 1 << gen;
>  	const unsigned int engine_bit = ENGINE(e->class, e->instance);
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	unsigned int regs_size;
>  	uint32_t *out;
> 
> @@ -489,26 +514,36 @@ static void dump_regs(int fd,
>  	gem_set_domain(fd, regs, I915_GEM_DOMAIN_CPU, 0);
> 
>  	for (const struct named_register *r = nonpriv_registers; r->name; r++) {
> +		uint32_t offset;
> +
>  		if (!(r->engine_mask & engine_bit))
>  			continue;
>  		if (!(r->gen_mask & gen_bit))
>  			continue;
> +		if (r->relative && !mmio_base)
> +			continue;
> +
> +		offset = r->offset;
> +		if (r->relative)
> +			offset += mmio_base;
> 
>  		if (r->count <= 1) {
>  			igt_debug("0x%04x (%s): 0x%08x\n",
> -				  r->offset, r->name, out[r->offset/4]);
> +				  offset, r->name, out[offset / 4]);
>  		} else {
>  			for (unsigned x = 0; x < r->count; x++)
>  				igt_debug("0x%04x (%s[%d]): 0x%08x\n",
> -					  r->offset+4*x, r->name, x,
> -					  out[r->offset/4 + x]);
> +					  offset + 4 * x, r->name, x,
> +					  out[offset / 4 + x]);
>  		}
>  	}
>  	munmap(out, regs_size);
>  }
> 
> -static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
> +static void compare_regs(int fd, const struct intel_execution_engine2 *e,
> +			 uint32_t A, uint32_t B, const char *who)
>  {
> +	const uint32_t mmio_base = gem_engine_mmio_base(fd, e->name);
>  	unsigned int num_errors;
>  	unsigned int regs_size;
>  	uint32_t *a, *b;
> @@ -532,11 +567,11 @@ static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
>  		if (a[n] == b[n])
>  			continue;
> 
> -		if (ignore_register(offset))
> +		if (ignore_register(offset, mmio_base))
>  			continue;
> 
>  		mask = ~0u;
> -		r = lookup_register(offset);
> +		r = lookup_register(offset, mmio_base);
>  		if (r && r->masked)
>  			mask >>= 16;
>  		if (r && r->ignore_bits)
> @@ -547,7 +582,7 @@ static void compare_regs(int fd, uint32_t A, uint32_t B, const char *who)
> 
>  		igt_warn("Register 0x%04x (%s): A=%08x B=%08x\n",
>  			 offset,
> -			 register_name(offset, buf, sizeof(buf)),
> +			 register_name(offset, mmio_base, buf, sizeof(buf)),
>  			 a[n] & mask, b[n] & mask);
>  		num_errors++;
>  	}
> @@ -638,7 +673,7 @@ static void nonpriv(int fd,
> 
>  		igt_spin_free(fd, spin);
> 
> -		compare_regs(fd, tmpl, regs[1], "nonpriv read/writes");
> +		compare_regs(fd, e, tmpl, regs[1], "nonpriv read/writes");
> 
>  		for (int n = 0; n < ARRAY_SIZE(regs); n++)
>  			gem_close(fd, regs[n]);
> @@ -708,8 +743,9 @@ static void isolation(int fd,
>  		igt_spin_free(fd, spin);
> 
>  		if (!(flags & DIRTY1))
> -			compare_regs(fd, regs[0], tmp, "two reads of the same ctx");
> -		compare_regs(fd, regs[0], regs[1], "two virgin contexts");
> +			compare_regs(fd, e, regs[0], tmp,
> +				     "two reads of the same ctx");
> +		compare_regs(fd, e, regs[0], regs[1], "two virgin contexts");
> 
>  		for (int n = 0; n < ARRAY_SIZE(ctx); n++) {
>  			gem_close(fd, regs[n]);
> @@ -829,13 +865,13 @@ static void preservation(int fd,
>  		char buf[80];
> 
>  		snprintf(buf, sizeof(buf), "dirty %x context\n", values[v]);
> -		compare_regs(fd, regs[v][0], regs[v][1], buf);
> +		compare_regs(fd, e, regs[v][0], regs[v][1], buf);
> 
>  		gem_close(fd, regs[v][0]);
>  		gem_close(fd, regs[v][1]);
>  		gem_context_destroy(fd, ctx[v]);
>  	}
> -	compare_regs(fd, regs[num_values][0], regs[num_values][1], "clean");
> +	compare_regs(fd, e, regs[num_values][0], regs[num_values][1],
> +"clean");
>  	gem_context_destroy(fd, ctx[num_values]);
>  }
> 
> --
> 2.24.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 57+ messages in thread
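
The relative-offset matching that the quoted patch adds to register_name(), lookup_register() and ignore_register() can be sketched in isolation. This is only a minimal sketch, not the IGT code: struct reg_desc, the regs[] table and find_reg() are hypothetical names, the table is trimmed to two entries taken from the diff, and the mmio_base is passed in directly rather than queried with gem_engine_mmio_base().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the test's named_register. */
struct reg_desc {
	const char *name;
	uint32_t offset;    /* absolute, or relative to the engine's mmio base */
	unsigned int count; /* number of consecutive dwords; 0 means 1 */
	bool relative;
};

/* Two entries copied from the diff; the real table is much longer. */
static const struct reg_desc regs[] = {
	{ "BB_OFFSET", 0x2158, 0, false },
	{ "xCS_GPR", 0x600, 32, true },
	{ NULL }
};

/* Return the table entry covering 'offset' for an engine at 'mmio_base'. */
static const struct reg_desc *
find_reg(uint32_t offset, uint32_t mmio_base)
{
	for (const struct reg_desc *r = regs; r->name; r++) {
		uint32_t width = r->count ? 4 * r->count : 4;
		uint32_t base = r->offset;

		/* Relative entries are rebased onto the engine's mmio range. */
		if (r->relative)
			base += mmio_base;

		if (offset >= base && offset < base + width)
			return r;
	}

	return NULL;
}
```

Assuming a render base of 0x2000 and a blitter base of 0x22000 (consistent with the removed CS_GPR entry at 0x2600 and BCS_GPR entry at 0x22600), find_reg(0x2600, 0x2000) and find_reg(0x22600, 0x22000) both resolve to the single xCS_GPR entry, which is what lets one relative line replace the per-engine lines.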

* Re: [igt-dev] [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE
@ 2019-12-02 14:42     ` Janusz Krzysztofik
  0 siblings, 0 replies; 57+ messages in thread
From: Janusz Krzysztofik @ 2019-12-02 14:42 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-gfx

Hi Chris,

I have a few questions rather than comments.  I hope they are worth spending 
your time.

On Wednesday, November 13, 2019 1:52:40 PM CET Chris Wilson wrote:
> I915_CONTEXT_PARAM_RINGSIZE specifies how large to create the command
> ringbuffer for logical ring contects. This directly affects the number

s/contects/contexts/

> of batches userspace can submit before blocking waiting for space.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  tests/Makefile.sources        |   3 +
>  tests/i915/gem_ctx_ringsize.c | 296 ++++++++++++++++++++++++++++++++++
>  tests/meson.build             |   1 +
>  3 files changed, 300 insertions(+)
>  create mode 100644 tests/i915/gem_ctx_ringsize.c
> 
> diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> index e17d43155..801fc52f3 100644
> --- a/tests/Makefile.sources
> +++ b/tests/Makefile.sources
> @@ -163,6 +163,9 @@ gem_ctx_param_SOURCES = i915/gem_ctx_param.c
>  TESTS_progs += gem_ctx_persistence
>  gem_ctx_persistence_SOURCES = i915/gem_ctx_persistence.c
>  
> +TESTS_progs += gem_ctx_ringsize
> +gem_ctx_ringsize_SOURCES = i915/gem_ctx_ringsize.c
> +
>  TESTS_progs += gem_ctx_shared
>  gem_ctx_shared_SOURCES = i915/gem_ctx_shared.c
>  
> diff --git a/tests/i915/gem_ctx_ringsize.c b/tests/i915/gem_ctx_ringsize.c
> new file mode 100644
> index 000000000..1450e8f0d
> --- /dev/null
> +++ b/tests/i915/gem_ctx_ringsize.c
> @@ -0,0 +1,296 @@
> +/*
> + * Copyright © 2019 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <inttypes.h>
> +#include <sys/ioctl.h>
> +#include <sys/types.h>
> +#include <unistd.h>
> +
> +#include "drmtest.h" /* gem_quiescent_gpu()! */
> +#include "i915/gem_context.h"
> +#include "i915/gem_engine_topology.h"
> +#include "ioctl_wrappers.h" /* gem_wait()! */
> +#include "sw_sync.h"
> +
> +#define I915_CONTEXT_PARAM_RINGSIZE 0xc

How are we going to handle the symbol redefinition conflict that arises as 
soon as this symbol is also defined in the kernel headers (e.g. pulled in via 
"i915/gem_engine_topology.h")?
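
One common way to avoid such a clash is to guard the local definition so that
it only takes effect while the included headers do not provide the symbol.
A minimal sketch, assuming the kernel ends up exporting the same value (0xc)
the patch uses; local_ringsize_param() is a hypothetical helper added only to
make the sketch checkable:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fallback definition: used only when the headers included above do not
 * define the symbol yet. 0xc is the value used by the patch.
 */
#ifndef I915_CONTEXT_PARAM_RINGSIZE
#define I915_CONTEXT_PARAM_RINGSIZE 0xc
#endif

/* Fail the build if a header-provided value ever disagrees with ours. */
static_assert(I915_CONTEXT_PARAM_RINGSIZE == 0xc,
	      "local fallback out of sync with uapi header");

/* Hypothetical helper, for illustration only. */
static inline uint64_t local_ringsize_param(void)
{
	return I915_CONTEXT_PARAM_RINGSIZE;
}
```

With the guard in place the local define becomes a no-op once the uapi header
carries the symbol, and it can be dropped at that point.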

> +
> +static bool has_ringsize(int i915)
> +{
> +	struct drm_i915_gem_context_param p = {
> +		.param = I915_CONTEXT_PARAM_RINGSIZE,
> +	};
> +
> +	return __gem_context_get_param(i915, &p) == 0;
> +}
> +
> +static void test_idempotent(int i915)
> +{
> +	struct drm_i915_gem_context_param p = {
> +		.param = I915_CONTEXT_PARAM_RINGSIZE,
> +	};
> +	uint32_t saved;
> +
> +	/*
> +	 * Simple test to verify that we are able to read back the same
> +	 * value as we set.
> +	 */
> +
> +	gem_context_get_param(i915, &p);
> +	saved = p.value;
> +
> +	for (uint32_t x = 1 << 12; x <= 128 << 12; x <<= 1) {

I've noticed you are using two different notations for those minimum/maximum 
constants.  I think that may be confusing.  How about defining and using 
macros?  
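
To illustrate the suggestion: both notations describe the same limits, since
128 << 12 and 512 << 10 are both 512KiB. A minimal sketch with hypothetical
macro names (RINGSIZE_PAGE, RINGSIZE_MIN, RINGSIZE_MAX are not in the patch),
plus a small helper that counts the sizes the idempotent loop walks over:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values are taken from the loops in the patch. */
#define RINGSIZE_PAGE	4096u
#define RINGSIZE_MIN	(1u * RINGSIZE_PAGE)	/* 1 << 12, i.e. 4KiB */
#define RINGSIZE_MAX	(128u * RINGSIZE_PAGE)	/* 128 << 12, i.e. 512KiB */

/* The "512 << 10" spelling used in test_invalid() names the same limit. */
static_assert(RINGSIZE_MAX == (512u << 10), "notations must agree");

/* Count the power-of-two sizes a loop like test_idempotent() visits. */
static unsigned int count_tested_ringsizes(void)
{
	unsigned int n = 0;

	for (uint32_t x = RINGSIZE_MIN; x <= RINGSIZE_MAX; x <<= 1)
		n++;

	return n;
}
```

The invalid table could then be written as RINGSIZE_MAX - 1 and
RINGSIZE_MAX + 1, which makes the relation to the valid range explicit.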

> +		p.value = x;
> +		gem_context_set_param(i915, &p);
> +		gem_context_get_param(i915, &p);
> +		igt_assert_eq_u32(p.value, x);
> +	}
> +
> +	p.value = saved;
> +	gem_context_set_param(i915, &p);
> +}
> +
> +static void test_invalid(int i915)
> +{
> +	struct drm_i915_gem_context_param p = {
> +		.param = I915_CONTEXT_PARAM_RINGSIZE,
> +	};
> +	uint64_t invalid[] = {
> +		0, 1, 4095, 4097, 8191, 8193,
> +		/* upper limit may be HW dependent, atm it is 512KiB */
> +		(512 << 10) - 1, (512 << 10) + 1,

Here is an example of that different notation mentioned above.

> +		-1, -1u
> +	};
> +	uint32_t saved;
> +
> +	gem_context_get_param(i915, &p);
> +	saved = p.value;
> +
> +	for (int i = 0; i < ARRAY_SIZE(invalid); i++) {
> +		p.value = invalid[i];
> +		igt_assert_eq(__gem_context_set_param(i915, &p), -EINVAL);
> +		gem_context_get_param(i915, &p);
> +		igt_assert_eq_u64(p.value, saved);
> +	}
> +}
> +
> +static int create_ext_ioctl(int i915,
> +			    struct drm_i915_gem_context_create_ext *arg)
> +{
> +	int err;
> +
> +	err = 0;
> +	if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
> +		err = -errno;
> +		igt_assume(err);
> +	}
> +
> +	errno = 0;
> +	return err;
> +}

This helper looks pretty standard to me.  Why are there no library 
functions for such generic operations?

> +
> +static void test_create(int i915)
> +{
> +	struct drm_i915_gem_context_create_ext_setparam p = {
> +		.base = {
> +			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> +			.next_extension = 0, /* end of chain */
> +		},
> +		.param = {
> +			.param = I915_CONTEXT_PARAM_RINGSIZE,
> +			.value = 512 << 10,
> +		}
> +	};
> +	struct drm_i915_gem_context_create_ext create = {
> +		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> +		.extensions = to_user_pointer(&p),
> +	};
> +
> +	igt_assert_eq(create_ext_ioctl(i915, &create),  0);
> +
> +	p.param.ctx_id = create.ctx_id;
> +	p.param.value = 0;
> +	gem_context_get_param(i915, &p.param);
> +	igt_assert_eq(p.param.value, 512 << 10);
> +
> +	gem_context_destroy(i915, create.ctx_id);
> +}
> +
> +static void test_clone(int i915)
> +{
> +	struct drm_i915_gem_context_create_ext_setparam p = {
> +		.base = {
> +			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> +			.next_extension = 0, /* end of chain */
> +		},
> +		.param = {
> +			.param = I915_CONTEXT_PARAM_RINGSIZE,
> +			.value = 512 << 10,
> +		}
> +	};
> +	struct drm_i915_gem_context_create_ext create = {
> +		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> +		.extensions = to_user_pointer(&p),
> +	};
> +
> +	igt_assert_eq(create_ext_ioctl(i915, &create),  0);
> +
> +	p.param.ctx_id = gem_context_clone(i915, create.ctx_id,
> +					   I915_CONTEXT_CLONE_ENGINES, 0);
> +	igt_assert_neq(p.param.ctx_id, create.ctx_id);
> +	gem_context_destroy(i915, create.ctx_id);
> +
> +	p.param.value = 0;
> +	gem_context_get_param(i915, &p.param);
> +	igt_assert_eq(p.param.value, 512 << 10);
> +
> +	gem_context_destroy(i915, p.param.ctx_id);
> +}
> +
> +static int __execbuf(int i915, struct drm_i915_gem_execbuffer2 *execbuf)
> +{
> +	int err;
> +
> +	err = 0;
> +	if (ioctl(i915, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
> +		err = -errno;
> +
> +	errno = 0;
> +	return err;
> +}

The above helper looks pretty much the same as 
lib/ioctl_wrappers.c:__gem_execbuf().  Does the igt_assume(err) found in the 
latter matter so much that you use your own version?

> +
> +static uint32_t __batch_create(int i915, uint32_t offset)

This is always called with offset = 0, do we expect other values to be used 
later?

> +{
> +	const uint32_t bbe = 0xa << 23;
> +	uint32_t handle;
> +
> +	handle = gem_create(i915, ALIGN(offset + sizeof(bbe), 4096));

Why don't we rely on the driver to do the alignment for us?

> +	gem_write(i915, handle, offset, &bbe, sizeof(bbe));
> +
> +	return handle;
> +}
> +
> +static uint32_t batch_create(int i915)
> +{
> +	return __batch_create(i915, 0);
> +}
> +
> +static unsigned int measure_inflight(int i915, unsigned int engine)
> +{
> +	IGT_CORK_FENCE(cork);
> +	struct drm_i915_gem_exec_object2 obj = {
> +		.handle = batch_create(i915)
> +	};
> +	struct drm_i915_gem_execbuffer2 execbuf = {
> +		.buffers_ptr = to_user_pointer(&obj),
> +		.buffer_count = 1,
> +		.flags = engine | I915_EXEC_FENCE_IN,
> +		.rsvd2 = igt_cork_plug(&cork, i915),
> +	};
> +	unsigned int count;
> +
> +	fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) | O_NONBLOCK);
> +
> +	gem_execbuf(i915, &execbuf);
> +	for (count = 1; __execbuf(i915, &execbuf) == 0; count++)
> +		;

Shouldn't we check if the reason for the failure is what we expect, i.e., 
-EWOULDBLOCK (or -EINTR)?  And why don't we put a time constraint on that loop 
in case O_NONBLOCK handling is not supported (yet)?
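
A minimal sketch of what such a bounded loop could look like, kept independent
of i915 so it can be read in isolation. The callback type, fill_until_blocked()
and fake_submit() are hypothetical names, the callback stands in for
__execbuf(), and time() is used instead of a monotonic clock purely to keep the
sketch short:

```c
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical callback: returns 0 on success or -errno, like __execbuf(). */
typedef int (*submit_fn)(void *data);

/*
 * Submit until the callback fails or timeout_s seconds have passed.
 * Returns the number of successful submissions; *err receives the
 * terminating error, or -ETIME if the deadline expired first.
 */
static unsigned int fill_until_blocked(submit_fn submit, void *data,
				       int timeout_s, int *err)
{
	const time_t deadline = time(NULL) + timeout_s;
	unsigned int count = 0;

	assert(submit && err);

	for (;;) {
		*err = submit(data);
		if (*err)
			break;
		count++;

		if (time(NULL) >= deadline) {
			*err = -ETIME;
			break;
		}
	}

	return count;
}

/* Stand-in for a ring with room for *data more requests. */
static int fake_submit(void *data)
{
	int *space = data;

	if (*space == 0)
		return -EWOULDBLOCK;

	--*space;
	return 0;
}
```

The caller can then assert that the loop stopped with -EWOULDBLOCK and treat
-ETIME, or any other error, as a failure rather than as a full ring.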

> +	close(execbuf.rsvd2);
> +
> +	fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) & ~O_NONBLOCK);
> +
> +	igt_cork_unplug(&cork);
> +	gem_close(i915, obj.handle);
> +
> +	return count;
> +}
> +
> +static void test_resize(int i915,
> +			const struct intel_execution_engine2 *e,
> +			unsigned int flags)
> +#define IDLE (1 << 0)
> +{
> +	struct drm_i915_gem_context_param p = {
> +		.param = I915_CONTEXT_PARAM_RINGSIZE,
> +	};
> +	unsigned int prev[2] = {};
> +	uint32_t saved;
> +
> +	gem_context_get_param(i915, &p);
> +	saved = p.value;
> +
> +	gem_quiescent_gpu(i915);
> +	for (p.value = 1 << 12; p.value <= 128 << 12; p.value <<= 1) {
> +		unsigned int count;
> +
> +		gem_context_set_param(i915, &p);
> +
> +		count = measure_inflight(i915, e->flags);
> +		igt_info("%s: %llx -> %d\n", e->name, p.value, count);
> +		igt_assert(count > 3 * (prev[1] - prev[0]) / 4 + prev[1]);

Where does this formula come from?  Why not just count == 2 * prev[1] ?
What results should we expect in "active" vs. "idle" mode?
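
For reference, this is how I read the assertion: the new increase in count has
to exceed three quarters of the previous increase, which is weaker than strict
doubling. A minimal sketch with a hypothetical helper name, grows_enough(); the
guard on prev1 >= prev0 is my addition, since the unsigned subtraction would
otherwise wrap:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Transcription of the check in test_resize():
 *   count > 3 * (prev[1] - prev[0]) / 4 + prev[1]
 * i.e. the new increase must exceed 3/4 of the previous increase
 * (integer division, so the threshold is rounded down).
 */
static bool grows_enough(unsigned int prev0, unsigned int prev1,
			 unsigned int count)
{
	assert(prev1 >= prev0); /* added guard, not in the patch */

	return count > 3 * (prev1 - prev0) / 4 + prev1;
}
```

With made-up counts of 10 and 20 for the two previous sizes, the threshold is
27, so a measured count of 39 passes here although it would fail a strict
count == 2 * prev[1] comparison.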

Thanks,
Janusz


> +		if (flags & IDLE)
> +			gem_quiescent_gpu(i915);
> +
> +		prev[0] = prev[1];
> +		prev[1] = count;
> +	}
> +	gem_quiescent_gpu(i915);
> +
> +	p.value = saved;
> +	gem_context_set_param(i915, &p);
> +}
> +
> +igt_main
> +{
> +	const struct intel_execution_engine2 *e;
> +	int i915;
> +
> +	igt_fixture {
> +		i915 = drm_open_driver(DRIVER_INTEL);
> +		igt_require_gem(i915);
> +
> +		igt_require(has_ringsize(i915));
> +	}
> +
> +	igt_subtest("idempotent")
> +		test_idempotent(i915);
> +
> +	igt_subtest("invalid")
> +		test_invalid(i915);
> +
> +	igt_subtest("create")
> +		test_create(i915);
> +	igt_subtest("clone")
> +		test_clone(i915);
> +
> +	__for_each_physical_engine(i915, e) {
> +		igt_subtest_f("%s-idle", e->name)
> +			test_resize(i915, e, IDLE);
> +		igt_subtest_f("%s-active", e->name)
> +			test_resize(i915, e, 0);
> +	}
> +
> +	igt_fixture {
> +		close(i915);
> +	}
> +}
> diff --git a/tests/meson.build b/tests/meson.build
> index b0c567594..9b7ca2423 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -123,6 +123,7 @@ i915_progs = [
>  	'gem_ctx_isolation',
>  	'gem_ctx_param',
>  	'gem_ctx_persistence',
> +	'gem_ctx_ringsize',
>  	'gem_ctx_shared',
>  	'gem_ctx_switch',
>  	'gem_ctx_thrash',
> 





^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE
@ 2019-12-02 14:59       ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-12-02 14:59 UTC (permalink / raw)
  To: Janusz Krzysztofik, igt-dev; +Cc: intel-gfx

Quoting Janusz Krzysztofik (2019-12-02 14:42:58)
> Hi Chris,
> 
> I have a few questions rather than comments.  I hope they are worth spending 
> your time.
> 
> On Wednesday, November 13, 2019 1:52:40 PM CET Chris Wilson wrote:
> > I915_CONTEXT_PARAM_RINGSIZE specifies how large to create the command
> > ringbuffer for logical ring contects. This directly affects the number
> 
> s/contects/contexts/
> 
> > of batches userspace can submit before blocking waiting for space.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  tests/Makefile.sources        |   3 +
> >  tests/i915/gem_ctx_ringsize.c | 296 ++++++++++++++++++++++++++++++++++
> >  tests/meson.build             |   1 +
> >  3 files changed, 300 insertions(+)
> >  create mode 100644 tests/i915/gem_ctx_ringsize.c
> > 
> > diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> > index e17d43155..801fc52f3 100644
> > --- a/tests/Makefile.sources
> > +++ b/tests/Makefile.sources
> > @@ -163,6 +163,9 @@ gem_ctx_param_SOURCES = i915/gem_ctx_param.c
> >  TESTS_progs += gem_ctx_persistence
> >  gem_ctx_persistence_SOURCES = i915/gem_ctx_persistence.c
> >  
> > +TESTS_progs += gem_ctx_ringsize
> > +gem_ctx_ringsize_SOURCES = i915/gem_ctx_ringsize.c
> > +
> >  TESTS_progs += gem_ctx_shared
> >  gem_ctx_shared_SOURCES = i915/gem_ctx_shared.c
> >  
> > diff --git a/tests/i915/gem_ctx_ringsize.c b/tests/i915/gem_ctx_ringsize.c
> > new file mode 100644
> > index 000000000..1450e8f0d
> > --- /dev/null
> > +++ b/tests/i915/gem_ctx_ringsize.c
> > @@ -0,0 +1,296 @@
> > +/*
> > + * Copyright © 2019 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice (including the next
> > + * paragraph) shall be included in all copies or substantial portions of the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> > + * IN THE SOFTWARE.
> > + */
> > +
> > +#include <errno.h>
> > +#include <fcntl.h>
> > +#include <inttypes.h>
> > +#include <sys/ioctl.h>
> > +#include <sys/types.h>
> > +#include <unistd.h>
> > +
> > +#include "drmtest.h" /* gem_quiescent_gpu()! */
> > +#include "i915/gem_context.h"
> > +#include "i915/gem_engine_topology.h"
> > +#include "ioctl_wrappers.h" /* gem_wait()! */
> > +#include "sw_sync.h"
> > +
> > +#define I915_CONTEXT_PARAM_RINGSIZE 0xc
> 
> How are we going to handle symbol redefinition conflict which arises as soon 
> as this symbol is also included from kernel headers (e.g. via 
> "i915/gem_engine_topology.h")?

In the final version we copy the headers from the kernel. Conflicts remind us
when we forget.

> 
> > +
> > +static bool has_ringsize(int i915)
> > +{
> > +     struct drm_i915_gem_context_param p = {
> > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > +     };
> > +
> > +     return __gem_context_get_param(i915, &p) == 0;
> > +}
> > +
> > +static void test_idempotent(int i915)
> > +{
> > +     struct drm_i915_gem_context_param p = {
> > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > +     };
> > +     uint32_t saved;
> > +
> > +     /*
> > +      * Simple test to verify that we are able to read back the same
> > +      * value as we set.
> > +      */
> > +
> > +     gem_context_get_param(i915, &p);
> > +     saved = p.value;
> > +
> > +     for (uint32_t x = 1 << 12; x <= 128 << 12; x <<= 1) {
> 
> I've noticed you are using two different notations for those minimum/maximum 
> constants.  I think that may be confusing.  How about defining and using 
> macros?  

A range in pages...
 
> > +             p.value = x;
> > +             gem_context_set_param(i915, &p);
> > +             gem_context_get_param(i915, &p);
> > +             igt_assert_eq_u32(p.value, x);
> > +     }
> > +
> > +     p.value = saved;
> > +     gem_context_set_param(i915, &p);
> > +}
> > +
> > +static void test_invalid(int i915)
> > +{
> > +     struct drm_i915_gem_context_param p = {
> > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > +     };
> > +     uint64_t invalid[] = {
> > +             0, 1, 4095, 4097, 8191, 8193,
> > +             /* upper limit may be HW dependent, atm it is 512KiB */
> > +             (512 << 10) - 1, (512 << 10) + 1,
> 
> Here is an example of that different notation mentioned above.

And here written in KiB to match comments.

> 
> > +             -1, -1u
> > +     };
> > +     uint32_t saved;
> > +
> > +     gem_context_get_param(i915, &p);
> > +     saved = p.value;
> > +
> > +     for (int i = 0; i < ARRAY_SIZE(invalid); i++) {
> > +             p.value = invalid[i];
> > +             igt_assert_eq(__gem_context_set_param(i915, &p), -EINVAL);
> > +             gem_context_get_param(i915, &p);
> > +             igt_assert_eq_u64(p.value, saved);
> > +     }
> > +}
> > +
> > +static int create_ext_ioctl(int i915,
> > +                         struct drm_i915_gem_context_create_ext *arg)
> > +{
> > +     int err;
> > +
> > +     err = 0;
> > +     if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
> > +             err = -errno;
> > +             igt_assume(err);
> > +     }
> > +
> > +     errno = 0;
> > +     return err;
> > +}
> 
> This helper looks pretty standard to me.  Why are there no library
> functions for such generic operations?

Because no one has written that yet.

> 
> > +
> > +static void test_create(int i915)
> > +{
> > +     struct drm_i915_gem_context_create_ext_setparam p = {
> > +             .base = {
> > +                     .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > +                     .next_extension = 0, /* end of chain */
> > +             },
> > +             .param = {
> > +                     .param = I915_CONTEXT_PARAM_RINGSIZE,
> > +                     .value = 512 << 10,
> > +             }
> > +     };
> > +     struct drm_i915_gem_context_create_ext create = {
> > +             .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > +             .extensions = to_user_pointer(&p),
> > +     };
> > +
> > +     igt_assert_eq(create_ext_ioctl(i915, &create),  0);
> > +
> > +     p.param.ctx_id = create.ctx_id;
> > +     p.param.value = 0;
> > +     gem_context_get_param(i915, &p.param);
> > +     igt_assert_eq(p.param.value, 512 << 10);
> > +
> > +     gem_context_destroy(i915, create.ctx_id);
> > +}
> > +
> > +static void test_clone(int i915)
> > +{
> > +     struct drm_i915_gem_context_create_ext_setparam p = {
> > +             .base = {
> > +                     .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > +                     .next_extension = 0, /* end of chain */
> > +             },
> > +             .param = {
> > +                     .param = I915_CONTEXT_PARAM_RINGSIZE,
> > +                     .value = 512 << 10,
> > +             }
> > +     };
> > +     struct drm_i915_gem_context_create_ext create = {
> > +             .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > +             .extensions = to_user_pointer(&p),
> > +     };
> > +
> > +     igt_assert_eq(create_ext_ioctl(i915, &create),  0);
> > +
> > +     p.param.ctx_id = gem_context_clone(i915, create.ctx_id,
> > +                                        I915_CONTEXT_CLONE_ENGINES, 0);
> > +     igt_assert_neq(p.param.ctx_id, create.ctx_id);
> > +     gem_context_destroy(i915, create.ctx_id);
> > +
> > +     p.param.value = 0;
> > +     gem_context_get_param(i915, &p.param);
> > +     igt_assert_eq(p.param.value, 512 << 10);
> > +
> > +     gem_context_destroy(i915, p.param.ctx_id);
> > +}
> > +
> > +static int __execbuf(int i915, struct drm_i915_gem_execbuffer2 *execbuf)
> > +{
> > +     int err;
> > +
> > +     err = 0;
> > +     if (ioctl(i915, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
> > +             err = -errno;
> > +
> > +     errno = 0;
> > +     return err;
> > +}
> 
> The above helper looks pretty much the same as lib/ioctl_wrappers.c:__gem_execbuf().
> Does the igt_assume(err) found in the latter matter so much that you use your own
> version?

It's very, very different from that one.

> > +
> > +static uint32_t __batch_create(int i915, uint32_t offset)
> 
> This is always called with offset = 0, do we expect other values to be used 
> later?

Why not.
 
> > +{
> > +     const uint32_t bbe = 0xa << 23;
> > +     uint32_t handle;
> > +
> > +     handle = gem_create(i915, ALIGN(offset + sizeof(bbe), 4096));
> 
> Why don't we rely on the driver making the alignment for us?

I'm used to being inside the kernel where it's expected to be correct.
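
For reference, a minimal sketch of the rounding that ALIGN() performs on the
object size, assuming a power-of-two alignment. The names align_up() and
batch_size() are illustrative only and are not IGT helpers; the 4096-byte page
size is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to the next multiple of a; a must be a non-zero power of two. */
static inline uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Size of an object just large enough to hold one 32-bit command written
 * at the given byte offset, rounded up to whole 4KiB pages.
 */
static inline uint64_t batch_size(uint32_t offset)
{
	return align_up((uint64_t)offset + sizeof(uint32_t), 4096);
}
```

With offset = 0 this yields a single page; a command that would straddle the
end of a page pushes the size to the next page.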

> > +     gem_write(i915, handle, offset, &bbe, sizeof(bbe));
> > +
> > +     return handle;
> > +}
> > +
> > +static uint32_t batch_create(int i915)
> > +{
> > +     return __batch_create(i915, 0);
> > +}
> > +
> > +static unsigned int measure_inflight(int i915, unsigned int engine)
> > +{
> > +     IGT_CORK_FENCE(cork);
> > +     struct drm_i915_gem_exec_object2 obj = {
> > +             .handle = batch_create(i915)
> > +     };
> > +     struct drm_i915_gem_execbuffer2 execbuf = {
> > +             .buffers_ptr = to_user_pointer(&obj),
> > +             .buffer_count = 1,
> > +             .flags = engine | I915_EXEC_FENCE_IN,
> > +             .rsvd2 = igt_cork_plug(&cork, i915),
> > +     };
> > +     unsigned int count;
> > +
> > +     fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) | O_NONBLOCK);
> > +
> > +     gem_execbuf(i915, &execbuf);
> > +     for (count = 1; __execbuf(i915, &execbuf) == 0; count++)
> > +             ;
> 
> Shouldn't we check if the reason for the failure is what we expect, i.e., 
> -EWOULDBLOCK (or -EINTR)?  And why don't we put a time constraint on that loop 
> in case O_NONBLOCK handling is not supported (yet)?

Sure. The idea is that O_NONBLOCK is supported, otherwise we don't
have fast and precise feedback.
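
A hedged sketch of what checking the terminating error and bounding the loop
could look like. It uses a non-blocking pipe purely as a stand-in for the
context ring, so it runs without a GPU; set_nonblocking() and
fill_nonblocking() are hypothetical names, and the iteration limit is an
assumption standing in for the time constraint asked about above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch a file descriptor to non-blocking mode; returns 0 or -errno. */
static int set_nonblocking(int fd)
{
	int fl = fcntl(fd, F_GETFL);

	if (fl < 0)
		return -errno;

	return fcntl(fd, F_SETFL, fl | O_NONBLOCK) ? -errno : 0;
}

/*
 * Keep submitting until the descriptor refuses more work or the limit is
 * reached. Returns the number of successful submissions and stores the
 * reason for stopping in *err (0 if the limit was hit first).
 */
static unsigned int fill_nonblocking(int fd, unsigned int limit, int *err)
{
	/* 512 bytes is POSIX's minimum PIPE_BUF, so each write is all-or-nothing */
	const char chunk[512] = { 0 };
	unsigned int count = 0;

	assert(err != NULL);
	*err = 0;

	while (count < limit) {
		if (write(fd, chunk, sizeof(chunk)) < 0) {
			if (errno == EINTR)
				continue; /* interrupted, not full: retry */
			*err = -errno;
			break;
		}
		count++;
	}

	return count;
}
```

A caller would then assert that *err is -EWOULDBLOCK (the same value as
-EAGAIN on Linux) rather than treating any failure as "ring full".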

> > +static void test_resize(int i915,
> > +                     const struct intel_execution_engine2 *e,
> > +                     unsigned int flags)
> > +#define IDLE (1 << 0)
> > +{
> > +     struct drm_i915_gem_context_param p = {
> > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > +     };
> > +     unsigned int prev[2] = {};
> > +     uint32_t saved;
> > +
> > +     gem_context_get_param(i915, &p);
> > +     saved = p.value;
> > +
> > +     gem_quiescent_gpu(i915);
> > +     for (p.value = 1 << 12; p.value <= 128 << 12; p.value <<= 1) {
> > +             unsigned int count;
> > +
> > +             gem_context_set_param(i915, &p);
> > +
> > +             count = measure_inflight(i915, e->flags);
> > +             igt_info("%s: %llx -> %d\n", e->name, p.value, count);
> > +             igt_assert(count > 3 * (prev[1] - prev[0]) / 4 + prev[1]);
> 
> Where does this formula come from?  Why not just count == 2 * prev[1] ?
> What results should we expect in "active" vs. "idle" mode?

I've explained somewhere why it is not 2*prev... And there's a small
amount of imprecision (+-1 request). In test_resize is the comment:

        /*
         * The ringsize directly affects the number of batches we can have
         * inflight -- when we run out of room in the ring, the client is
         * blocked (or if O_NONBLOCK is specified, -EWOULDBLOCK is reported).
         * The kernel throttles the client when they enter the last 4KiB page,
         * so as we double the size of the ring, we nearly double the number
         * of requests we can fit as 2^n-1: i.e. 0, 1, 3, 7, 15, 31 pages.
         */
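
To make the arithmetic concrete, here is a small sketch that applies the bound
from test_resize() to a series of counts. The series used in the note below
is idealised, not measured data: it assumes counts of the form 2^n - 1
starting at 1, since measure_inflight() always counts its first successful
submission. The helper names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The growth bound asserted in test_resize(), for a single step. */
static bool grows_enough(unsigned int count,
			 unsigned int prev1, unsigned int prev0)
{
	return count > 3 * (prev1 - prev0) / 4 + prev1;
}

/* Apply the bound to a whole series of measurements, as the loop does. */
static bool series_passes(const unsigned int *counts, size_t n)
{
	unsigned int prev[2] = { 0, 0 };

	assert(n == 0 || counts != NULL);

	for (size_t i = 0; i < n; i++) {
		if (!grows_enough(counts[i], prev[1], prev[0]))
			return false;

		prev[0] = prev[1];
		prev[1] = counts[i];
	}

	return true;
}

/* The stricter alternative: every count is exactly twice the previous one. */
static bool series_doubles(const unsigned int *counts, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (counts[i] != 2 * counts[i - 1])
			return false;
	}

	return true;
}
```

For the idealised series 1, 3, 7, 15, 31, 63, 127, 255 the bound holds at
every step while exact doubling does not (3 != 2 * 1), and a series that
stops growing, such as 1, 1, fails the bound.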

-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE
@ 2019-12-02 14:59       ` Chris Wilson
  0 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2019-12-02 14:59 UTC (permalink / raw)
  To: Janusz Krzysztofik, igt-dev; +Cc: intel-gfx

Quoting Janusz Krzysztofik (2019-12-02 14:42:58)
> Hi Chris,
> 
> I have a few questions rather than comments.  I hope they are worth spending 
> your time.
> 
> On Wednesday, November 13, 2019 1:52:40 PM CET Chris Wilson wrote:
> > I915_CONTEXT_PARAM_RINGSIZE specifies how large to create the command
> > ringbuffer for logical ring contects. This directly affects the number
> 
> s/contects/contexts/
> 
> > of batches userspace can submit before blocking waiting for space.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  tests/Makefile.sources        |   3 +
> >  tests/i915/gem_ctx_ringsize.c | 296 ++++++++++++++++++++++++++++++++++
> >  tests/meson.build             |   1 +
> >  3 files changed, 300 insertions(+)
> >  create mode 100644 tests/i915/gem_ctx_ringsize.c
> > 
> > diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> > index e17d43155..801fc52f3 100644
> > --- a/tests/Makefile.sources
> > +++ b/tests/Makefile.sources
> > @@ -163,6 +163,9 @@ gem_ctx_param_SOURCES = i915/gem_ctx_param.c
> >  TESTS_progs += gem_ctx_persistence
> >  gem_ctx_persistence_SOURCES = i915/gem_ctx_persistence.c
> >  
> > +TESTS_progs += gem_ctx_ringsize
> > +gem_ctx_ringsize_SOURCES = i915/gem_ctx_ringsize.c
> > +
> >  TESTS_progs += gem_ctx_shared
> >  gem_ctx_shared_SOURCES = i915/gem_ctx_shared.c
> >  
> > diff --git a/tests/i915/gem_ctx_ringsize.c b/tests/i915/gem_ctx_ringsize.c
> > new file mode 100644
> > index 000000000..1450e8f0d
> > --- /dev/null
> > +++ b/tests/i915/gem_ctx_ringsize.c
> > @@ -0,0 +1,296 @@
> > +/*
> > + * Copyright © 2019 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice (including the next
> > + * paragraph) shall be included in all copies or substantial portions of the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> > + * IN THE SOFTWARE.
> > + */
> > +
> > +#include <errno.h>
> > +#include <fcntl.h>
> > +#include <inttypes.h>
> > +#include <sys/ioctl.h>
> > +#include <sys/types.h>
> > +#include <unistd.h>
> > +
> > +#include "drmtest.h" /* gem_quiescent_gpu()! */
> > +#include "i915/gem_context.h"
> > +#include "i915/gem_engine_topology.h"
> > +#include "ioctl_wrappers.h" /* gem_wait()! */
> > +#include "sw_sync.h"
> > +
> > +#define I915_CONTEXT_PARAM_RINGSIZE 0xc
> 
> How are we going to handle symbol redefinition conflict which arises as soon 
> as this symbol is also included from kernel headers (e.g. via 
> "i915/gem_engine_topology.h")?

In the final version we copy the headers from the kernel. Conflicts remind us
when we forget.

> 
> > +
> > +static bool has_ringsize(int i915)
> > +{
> > +     struct drm_i915_gem_context_param p = {
> > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > +     };
> > +
> > +     return __gem_context_get_param(i915, &p) == 0;
> > +}
> > +
> > +static void test_idempotent(int i915)
> > +{
> > +     struct drm_i915_gem_context_param p = {
> > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > +     };
> > +     uint32_t saved;
> > +
> > +     /*
> > +      * Simple test to verify that we are able to read back the same
> > +      * value as we set.
> > +      */
> > +
> > +     gem_context_get_param(i915, &p);
> > +     saved = p.value;
> > +
> > +     for (uint32_t x = 1 << 12; x <= 128 << 12; x <<= 1) {
> 
> I've noticed you are using two different notations for those minimum/maximum 
> constants.  I think that may be confusing.  How about defining and using 
> macros?  

A range in pages...
 
> > +             p.value = x;
> > +             gem_context_set_param(i915, &p);
> > +             gem_context_get_param(i915, &p);
> > +             igt_assert_eq_u32(p.value, x);
> > +     }
> > +
> > +     p.value = saved;
> > +     gem_context_set_param(i915, &p);
> > +}
> > +
> > +static void test_invalid(int i915)
> > +{
> > +     struct drm_i915_gem_context_param p = {
> > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > +     };
> > +     uint64_t invalid[] = {
> > +             0, 1, 4095, 4097, 8191, 8193,
> > +             /* upper limit may be HW dependent, atm it is 512KiB */
> > +             (512 << 10) - 1, (512 << 10) + 1,
> 
> Here is an example of that different notation mentioned above.

And here written in KiB to match comments.
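
As a hedged illustration of the two notations: 128 << 12 (128 pages of 4KiB)
and 512 << 10 (512KiB) are the same number, so the loop bound and the table of
invalid values describe one limit. The sketch below is not the kernel's
validation code. ringsize_is_valid(), check_patch_values() and the RINGSIZE_*
macros are hypothetical names, and the rule encoded here (a whole number of
4KiB pages between 4KiB and 512KiB) is only an assumption that is consistent
with the values the test treats as valid and invalid; the real check may be
stricter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the patch itself spells these limits inline. */
#define RINGSIZE_PAGE UINT64_C(4096)
#define RINGSIZE_MIN  (UINT64_C(1) << 12)   /* 1 page, as in the loops */
#define RINGSIZE_MAX  (UINT64_C(512) << 10) /* 512KiB, as in the comment */

/* The page notation and the KiB notation name the same upper bound. */
_Static_assert((UINT64_C(128) << 12) == (UINT64_C(512) << 10),
	       "128 pages of 4KiB is 512KiB");

/* Assumed rule: page-aligned and within [4KiB, 512KiB]. */
static bool ringsize_is_valid(uint64_t size)
{
	if (size < RINGSIZE_MIN || size > RINGSIZE_MAX)
		return false;

	return size % RINGSIZE_PAGE == 0;
}

/* Replays the values used by test_invalid() and test_idempotent(). */
static void check_patch_values(void)
{
	const uint64_t invalid[] = {
		0, 1, 4095, 4097, 8191, 8193,
		(512 << 10) - 1, (512 << 10) + 1,
		UINT64_MAX /* -1 */, UINT32_MAX /* -1u */
	};

	for (size_t i = 0; i < sizeof(invalid) / sizeof(invalid[0]); i++)
		assert(!ringsize_is_valid(invalid[i]));

	for (uint64_t x = 1 << 12; x <= 128 << 12; x <<= 1)
		assert(ringsize_is_valid(x));
}
```

Calling check_patch_values() exercises every value from the two quoted tests
against the assumed rule.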

> 
> > +             -1, -1u
> > +     };
> > +     uint32_t saved;
> > +
> > +     gem_context_get_param(i915, &p);
> > +     saved = p.value;
> > +
> > +     for (int i = 0; i < ARRAY_SIZE(invalid); i++) {
> > +             p.value = invalid[i];
> > +             igt_assert_eq(__gem_context_set_param(i915, &p), -EINVAL);
> > +             gem_context_get_param(i915, &p);
> > +             igt_assert_eq_u64(p.value, saved);
> > +     }
> > +}
> > +
> > +static int create_ext_ioctl(int i915,
> > +                         struct drm_i915_gem_context_create_ext *arg)
> > +{
> > +     int err;
> > +
> > +     err = 0;
> > +     if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
> > +             err = -errno;
> > +             igt_assume(err);
> > +     }
> > +
> > +     errno = 0;
> > +     return err;
> > +}
> 
> This helper looks pretty standard to me.  Why are there no library
> functions for such generic operations?

Because no one has written that yet.

> 
> > +
> > +static void test_create(int i915)
> > +{
> > +     struct drm_i915_gem_context_create_ext_setparam p = {
> > +             .base = {
> > +                     .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > +                     .next_extension = 0, /* end of chain */
> > +             },
> > +             .param = {
> > +                     .param = I915_CONTEXT_PARAM_RINGSIZE,
> > +                     .value = 512 << 10,
> > +             }
> > +     };
> > +     struct drm_i915_gem_context_create_ext create = {
> > +             .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > +             .extensions = to_user_pointer(&p),
> > +     };
> > +
> > +     igt_assert_eq(create_ext_ioctl(i915, &create),  0);
> > +
> > +     p.param.ctx_id = create.ctx_id;
> > +     p.param.value = 0;
> > +     gem_context_get_param(i915, &p.param);
> > +     igt_assert_eq(p.param.value, 512 << 10);
> > +
> > +     gem_context_destroy(i915, create.ctx_id);
> > +}
> > +
> > +static void test_clone(int i915)
> > +{
> > +     struct drm_i915_gem_context_create_ext_setparam p = {
> > +             .base = {
> > +                     .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > +                     .next_extension = 0, /* end of chain */
> > +             },
> > +             .param = {
> > +                     .param = I915_CONTEXT_PARAM_RINGSIZE,
> > +                     .value = 512 << 10,
> > +             }
> > +     };
> > +     struct drm_i915_gem_context_create_ext create = {
> > +             .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > +             .extensions = to_user_pointer(&p),
> > +     };
> > +
> > +     igt_assert_eq(create_ext_ioctl(i915, &create),  0);
> > +
> > +     p.param.ctx_id = gem_context_clone(i915, create.ctx_id,
> > +                                        I915_CONTEXT_CLONE_ENGINES, 0);
> > +     igt_assert_neq(p.param.ctx_id, create.ctx_id);
> > +     gem_context_destroy(i915, create.ctx_id);
> > +
> > +     p.param.value = 0;
> > +     gem_context_get_param(i915, &p.param);
> > +     igt_assert_eq(p.param.value, 512 << 10);
> > +
> > +     gem_context_destroy(i915, p.param.ctx_id);
> > +}
> > +
> > +static int __execbuf(int i915, struct drm_i915_gem_execbuffer2 *execbuf)
> > +{
> > +     int err;
> > +
> > +     err = 0;
> > +     if (ioctl(i915, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
> > +             err = -errno;
> > +
> > +     errno = 0;
> > +     return err;
> > +}
> 
> The above helper looks pretty much the same as lib/ioctl_wrappers.c:__gem_execbuf().
> Does the igt_assume(err) found in the latter matter so much that you use your own
> version?

It's very, very different from that one.

> > +
> > +static uint32_t __batch_create(int i915, uint32_t offset)
> 
> This is always called with offset = 0, do we expect other values to be used 
> later?

Why not.
 
> > +{
> > +     const uint32_t bbe = 0xa << 23;
> > +     uint32_t handle;
> > +
> > +     handle = gem_create(i915, ALIGN(offset + sizeof(bbe), 4096));
> 
> Why don't we rely on the driver making the alignment for us?

I'm used to being inside the kernel where it's expected to be correct.

> > +     gem_write(i915, handle, offset, &bbe, sizeof(bbe));
> > +
> > +     return handle;
> > +}
> > +
> > +static uint32_t batch_create(int i915)
> > +{
> > +     return __batch_create(i915, 0);
> > +}
> > +
> > +static unsigned int measure_inflight(int i915, unsigned int engine)
> > +{
> > +     IGT_CORK_FENCE(cork);
> > +     struct drm_i915_gem_exec_object2 obj = {
> > +             .handle = batch_create(i915)
> > +     };
> > +     struct drm_i915_gem_execbuffer2 execbuf = {
> > +             .buffers_ptr = to_user_pointer(&obj),
> > +             .buffer_count = 1,
> > +             .flags = engine | I915_EXEC_FENCE_IN,
> > +             .rsvd2 = igt_cork_plug(&cork, i915),
> > +     };
> > +     unsigned int count;
> > +
> > +     fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) | O_NONBLOCK);
> > +
> > +     gem_execbuf(i915, &execbuf);
> > +     for (count = 1; __execbuf(i915, &execbuf) == 0; count++)
> > +             ;
> 
> Shouldn't we check if the reason for the failure is what we expect, i.e., 
> -EWOULDBLOCK (or -EINTR)?  And why don't we put a time constraint on that loop 
> in case O_NONBLOCK handling is not supported (yet)?

Sure. The idea is that O_NONBLOCK is supported, otherwise we don't
have fast and precise feedback.
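
The two suggestions raised above (check why the loop stopped, and bound it in
time) can be sketched without a GPU. The example below is an analogy only: it
fills a non-blocking pipe instead of a ring, and set_nonblocking() and
fill_until_would_block() are hypothetical helpers, not IGT or i915 functions.
It assumes a POSIX system and uses a coarse one-second clock for the deadline.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode. */
static int set_nonblocking(int fd)
{
	int fl = fcntl(fd, F_GETFL);

	if (fl < 0)
		return -errno;
	if (fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0)
		return -errno;

	return 0;
}

/*
 * Hypothetical helper: submit 4KiB writes until one fails or the deadline
 * passes. Returns the number of successful writes; *err reports why the
 * loop stopped (-EAGAIN/-EWOULDBLOCK when full, -ETIMEDOUT on deadline).
 */
static unsigned int fill_until_would_block(int fd, unsigned int timeout_s,
					   int *err)
{
	char page[4096];
	time_t deadline = time(NULL) + timeout_s;
	unsigned int count = 0;

	memset(page, 0, sizeof(page));
	*err = -ETIMEDOUT;

	while (time(NULL) <= deadline) {
		if (write(fd, page, sizeof(page)) >= 0) {
			count++;
			continue;
		}
		if (errno == EINTR)
			continue; /* interrupted, try again */

		*err = -errno;
		break;
	}

	return count;
}
```

A caller would then assert that *err is -EAGAIN or -EWOULDBLOCK before
trusting the count, and treat -ETIMEDOUT as "non-blocking mode had no effect".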

> > +static void test_resize(int i915,
> > +                     const struct intel_execution_engine2 *e,
> > +                     unsigned int flags)
> > +#define IDLE (1 << 0)
> > +{
> > +     struct drm_i915_gem_context_param p = {
> > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > +     };
> > +     unsigned int prev[2] = {};
> > +     uint32_t saved;
> > +
> > +     gem_context_get_param(i915, &p);
> > +     saved = p.value;
> > +
> > +     gem_quiescent_gpu(i915);
> > +     for (p.value = 1 << 12; p.value <= 128 << 12; p.value <<= 1) {
> > +             unsigned int count;
> > +
> > +             gem_context_set_param(i915, &p);
> > +
> > +             count = measure_inflight(i915, e->flags);
> > +             igt_info("%s: %llx -> %d\n", e->name, p.value, count);
> > +             igt_assert(count > 3 * (prev[1] - prev[0]) / 4 + prev[1]);
> 
> Where does this formula come from?  Why not just count == 2 * prev[1] ?
> What results should we expect in "active" vs. "idle" mode?

I've explained somewhere why it is not 2*prev... And there's a small
amount of imprecision (+-1 request). In test_resize is the comment:

        /*
         * The ringsize directly affects the number of batches we can have
         * inflight -- when we run out of room in the ring, the client is
         * blocked (or if O_NONBLOCK is specified, -EWOULDBLOCK is reported).
         * The kernel throttles the client when they enter the last 4KiB page,
         * so as we double the size of the ring, we nearly double the number
         * of requests we can fit as 2^n-1: i.e 0, 1, 3, 7, 15, 31 pages.
         */
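
As a worked example of the comment above, here is a small idealised model. It
assumes the number of requests scales with the usable pages (ring pages minus
the last 4KiB page), and that prev[0] and prev[1] hold the two preceding
measurements; the quoted hunk does not show how prev[] is updated, so that part
is an assumption. usable_pages() and growth_ok() are hypothetical names; only
the inequality itself is taken from the quoted igt_assert().

```c
#include <assert.h>
#include <stdbool.h>

/* Idealised model: the last 4KiB page of the ring is never filled. */
static unsigned int usable_pages(unsigned int ring_pages)
{
	assert(ring_pages >= 1);
	return ring_pages - 1;
}

/* Same expression as the quoted igt_assert(), on plain integers. */
static bool growth_ok(unsigned int prev0, unsigned int prev1,
		      unsigned int count)
{
	return count > 3 * (prev1 - prev0) / 4 + prev1;
}
```

In this model rings of 2, 4 and 8 pages give 1, 3 and 7 usable pages. Doubling
the ring yields 2 * prev + 1 rather than 2 * prev, so an exact
count == 2 * prev[1] check would fail, while growth_ok(1, 3, 7) holds because
7 > 3 * (3 - 1) / 4 + 3 = 4 in integer arithmetic.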

-Chris

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE
  2019-12-02 14:59       ` [Intel-gfx] " Chris Wilson
@ 2020-02-20 15:57         ` Janusz Krzysztofik
  -1 siblings, 0 replies; 57+ messages in thread
From: Janusz Krzysztofik @ 2020-02-20 15:57 UTC (permalink / raw)
  To: Chris Wilson; +Cc: igt-dev, intel-gfx

Hi Chris,

On Monday, December 2, 2019 3:59:19 PM CET Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-12-02 14:42:58)
> > Hi Chris,
> > 
> > I have a few questions rather than comments.  I hope they are worth spending 
> > your time.
> > 
> > On Wednesday, November 13, 2019 1:52:40 PM CET Chris Wilson wrote:
> > > I915_CONTEXT_PARAM_RINGSIZE specifies how large to create the command
> > > ringbuffer for logical ring contects. This directly affects the number
> > 
> > s/contects/contexts/
> > 
> > > of batches userspace can submit before blocking waiting for space.
> > > 
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Do you still have this patch queued somewhere?  As the UMD team has accepted
the solution and is ready with changes on their side, I think we need to merge
it soon, along with the kernel side.

Thanks,
Janusz


> > > ---
> > >  tests/Makefile.sources        |   3 +
> > >  tests/i915/gem_ctx_ringsize.c | 296 ++++++++++++++++++++++++++++++++++
> > >  tests/meson.build             |   1 +
> > >  3 files changed, 300 insertions(+)
> > >  create mode 100644 tests/i915/gem_ctx_ringsize.c
> > > 
> > > diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> > > index e17d43155..801fc52f3 100644
> > > --- a/tests/Makefile.sources
> > > +++ b/tests/Makefile.sources
> > > @@ -163,6 +163,9 @@ gem_ctx_param_SOURCES = i915/gem_ctx_param.c
> > >  TESTS_progs += gem_ctx_persistence
> > >  gem_ctx_persistence_SOURCES = i915/gem_ctx_persistence.c
> > >  
> > > +TESTS_progs += gem_ctx_ringsize
> > > +gem_ctx_ringsize_SOURCES = i915/gem_ctx_ringsize.c
> > > +
> > >  TESTS_progs += gem_ctx_shared
> > >  gem_ctx_shared_SOURCES = i915/gem_ctx_shared.c
> > >  
> > > diff --git a/tests/i915/gem_ctx_ringsize.c b/tests/i915/gem_ctx_ringsize.c
> > > new file mode 100644
> > > index 000000000..1450e8f0d
> > > --- /dev/null
> > > +++ b/tests/i915/gem_ctx_ringsize.c
> > > @@ -0,0 +1,296 @@
> > > +/*
> > > + * Copyright © 2019 Intel Corporation
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person obtaining a
> > > + * copy of this software and associated documentation files (the "Software"),
> > > + * to deal in the Software without restriction, including without limitation
> > > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > > + * and/or sell copies of the Software, and to permit persons to whom the
> > > + * Software is furnished to do so, subject to the following conditions:
> > > + *
> > > + * The above copyright notice and this permission notice (including the next
> > > + * paragraph) shall be included in all copies or substantial portions of the
> > > + * Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> > > + * IN THE SOFTWARE.
> > > + */
> > > +
> > > +#include <errno.h>
> > > +#include <fcntl.h>
> > > +#include <inttypes.h>
> > > +#include <sys/ioctl.h>
> > > +#include <sys/types.h>
> > > +#include <unistd.h>
> > > +
> > > +#include "drmtest.h" /* gem_quiescent_gpu()! */
> > > +#include "i915/gem_context.h"
> > > +#include "i915/gem_engine_topology.h"
> > > +#include "ioctl_wrappers.h" /* gem_wait()! */
> > > +#include "sw_sync.h"
> > > +
> > > +#define I915_CONTEXT_PARAM_RINGSIZE 0xc
> > 
> > How are we going to handle symbol redefinition conflict which arises as soon 
> > as this symbol is also included from kernel headers (e.g. via 
> > "i915/gem_engine_topology.h")?
> 
> In the final version we copy the headers from the kernel. Conflicts remind us
> when we forget.
> 
> > 
> > > +
> > > +static bool has_ringsize(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_param p = {
> > > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +     };
> > > +
> > > +     return __gem_context_get_param(i915, &p) == 0;
> > > +}
> > > +
> > > +static void test_idempotent(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_param p = {
> > > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +     };
> > > +     uint32_t saved;
> > > +
> > > +     /*
> > > +      * Simple test to verify that we are able to read back the same
> > > +      * value as we set.
> > > +      */
> > > +
> > > +     gem_context_get_param(i915, &p);
> > > +     saved = p.value;
> > > +
> > > +     for (uint32_t x = 1 << 12; x <= 128 << 12; x <<= 1) {
> > 
> > I've noticed you are using two different notations for those minimum/maximum 
> > constants.  I think that may be confusing.  How about defining and using 
> > macros?  
> 
> A range in pages...
>  
> > > +             p.value = x;
> > > +             gem_context_set_param(i915, &p);
> > > +             gem_context_get_param(i915, &p);
> > > +             igt_assert_eq_u32(p.value, x);
> > > +     }
> > > +
> > > +     p.value = saved;
> > > +     gem_context_set_param(i915, &p);
> > > +}
> > > +
> > > +static void test_invalid(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_param p = {
> > > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +     };
> > > +     uint64_t invalid[] = {
> > > +             0, 1, 4095, 4097, 8191, 8193,
> > > +             /* upper limit may be HW dependent, atm it is 512KiB */
> > > +             (512 << 10) - 1, (512 << 10) + 1,
> > 
> > Here is an example of that different notation mentioned above.
> 
> And here written in KiB to match comments.
> 
> > 
> > > +             -1, -1u
> > > +     };
> > > +     uint32_t saved;
> > > +
> > > +     gem_context_get_param(i915, &p);
> > > +     saved = p.value;
> > > +
> > > +     for (int i = 0; i < ARRAY_SIZE(invalid); i++) {
> > > +             p.value = invalid[i];
> > > +             igt_assert_eq(__gem_context_set_param(i915, &p), -EINVAL);
> > > +             gem_context_get_param(i915, &p);
> > > +             igt_assert_eq_u64(p.value, saved);
> > > +     }
> > > +}
> > > +
> > > +static int create_ext_ioctl(int i915,
> > > +                         struct drm_i915_gem_context_create_ext *arg)
> > > +{
> > > +     int err;
> > > +
> > > +     err = 0;
> > > +     if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
> > > +             err = -errno;
> > > +             igt_assume(err);
> > > +     }
> > > +
> > > +     errno = 0;
> > > +     return err;
> > > +}
> > 
> > This helper looks pretty standard to me.  Why are there no library
> > functions for such generic operations?
> 
> Because no one has written that yet.
> 
> > 
> > > +
> > > +static void test_create(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_create_ext_setparam p = {
> > > +             .base = {
> > > +                     .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > > +                     .next_extension = 0, /* end of chain */
> > > +             },
> > > +             .param = {
> > > +                     .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +                     .value = 512 << 10,
> > > +             }
> > > +     };
> > > +     struct drm_i915_gem_context_create_ext create = {
> > > +             .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > > +             .extensions = to_user_pointer(&p),
> > > +     };
> > > +
> > > +     igt_assert_eq(create_ext_ioctl(i915, &create),  0);
> > > +
> > > +     p.param.ctx_id = create.ctx_id;
> > > +     p.param.value = 0;
> > > +     gem_context_get_param(i915, &p.param);
> > > +     igt_assert_eq(p.param.value, 512 << 10);
> > > +
> > > +     gem_context_destroy(i915, create.ctx_id);
> > > +}
> > > +
> > > +static void test_clone(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_create_ext_setparam p = {
> > > +             .base = {
> > > +                     .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > > +                     .next_extension = 0, /* end of chain */
> > > +             },
> > > +             .param = {
> > > +                     .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +                     .value = 512 << 10,
> > > +             }
> > > +     };
> > > +     struct drm_i915_gem_context_create_ext create = {
> > > +             .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > > +             .extensions = to_user_pointer(&p),
> > > +     };
> > > +
> > > +     igt_assert_eq(create_ext_ioctl(i915, &create),  0);
> > > +
> > > +     p.param.ctx_id = gem_context_clone(i915, create.ctx_id,
> > > +                                        I915_CONTEXT_CLONE_ENGINES, 0);
> > > +     igt_assert_neq(p.param.ctx_id, create.ctx_id);
> > > +     gem_context_destroy(i915, create.ctx_id);
> > > +
> > > +     p.param.value = 0;
> > > +     gem_context_get_param(i915, &p.param);
> > > +     igt_assert_eq(p.param.value, 512 << 10);
> > > +
> > > +     gem_context_destroy(i915, p.param.ctx_id);
> > > +}
> > > +
> > > +static int __execbuf(int i915, struct drm_i915_gem_execbuffer2 *execbuf)
> > > +{
> > > +     int err;
> > > +
> > > +     err = 0;
> > > +     if (ioctl(i915, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
> > > +             err = -errno;
> > > +
> > > +     errno = 0;
> > > +     return err;
> > > +}
> > 
> > The above helper looks pretty much the same as lib/ioctl_wrappers.c:__gem_execbuf().
> > Does the igt_assume(err) found in the latter matter so much that you use your own
> > version?
> 
> It's very, very different from that one.
> 
> > > +
> > > +static uint32_t __batch_create(int i915, uint32_t offset)
> > 
> > This is always called with offset = 0, do we expect other values to be used 
> > later?
> 
> Why not.
>  
> > > +{
> > > +     const uint32_t bbe = 0xa << 23;
> > > +     uint32_t handle;
> > > +
> > > +     handle = gem_create(i915, ALIGN(offset + sizeof(bbe), 4096));
> > 
> > Why don't we rely on the driver making the alignment for us?
> 
> I'm used to being inside the kernel where it's expected to be correct.
> 
> > > +     gem_write(i915, handle, offset, &bbe, sizeof(bbe));
> > > +
> > > +     return handle;
> > > +}
> > > +
> > > +static uint32_t batch_create(int i915)
> > > +{
> > > +     return __batch_create(i915, 0);
> > > +}
> > > +
> > > +static unsigned int measure_inflight(int i915, unsigned int engine)
> > > +{
> > > +     IGT_CORK_FENCE(cork);
> > > +     struct drm_i915_gem_exec_object2 obj = {
> > > +             .handle = batch_create(i915)
> > > +     };
> > > +     struct drm_i915_gem_execbuffer2 execbuf = {
> > > +             .buffers_ptr = to_user_pointer(&obj),
> > > +             .buffer_count = 1,
> > > +             .flags = engine | I915_EXEC_FENCE_IN,
> > > +             .rsvd2 = igt_cork_plug(&cork, i915),
> > > +     };
> > > +     unsigned int count;
> > > +
> > > +     fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) | O_NONBLOCK);
> > > +
> > > +     gem_execbuf(i915, &execbuf);
> > > +     for (count = 1; __execbuf(i915, &execbuf) == 0; count++)
> > > +             ;
> > 
> > Shouldn't we check if the reason for the failure is what we expect, i.e., 
> > -EWOULDBLOCK (or -EINTR)?  And why don't we put a time constraint on that loop 
> > in case O_NONBLOCK handling is not supported (yet)?
> 
> Sure. The idea is that O_NONBLOCK is supported, otherwise we don't
> have fast and precise feedback.
> 
> > > +static void test_resize(int i915,
> > > +                     const struct intel_execution_engine2 *e,
> > > +                     unsigned int flags)
> > > +#define IDLE (1 << 0)
> > > +{
> > > +     struct drm_i915_gem_context_param p = {
> > > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +     };
> > > +     unsigned int prev[2] = {};
> > > +     uint32_t saved;
> > > +
> > > +     gem_context_get_param(i915, &p);
> > > +     saved = p.value;
> > > +
> > > +     gem_quiescent_gpu(i915);
> > > +     for (p.value = 1 << 12; p.value <= 128 << 12; p.value <<= 1) {
> > > +             unsigned int count;
> > > +
> > > +             gem_context_set_param(i915, &p);
> > > +
> > > +             count = measure_inflight(i915, e->flags);
> > > +             igt_info("%s: %llx -> %d\n", e->name, p.value, count);
> > > +             igt_assert(count > 3 * (prev[1] - prev[0]) / 4 + prev[1]);
> > 
> > Where does this formula come from?  Why not just count == 2 * prev[1] ?
> > What results should we expect in "active" vs. "idle" mode?
> 
> I've explained somewhere why it is not 2*prev... And there's a small
> amount of imprecision (+-1 request). In test_resize is the comment:
> 
>         /*
>          * The ringsize directly affects the number of batches we can have
>          * inflight -- when we run out of room in the ring, the client is
>          * blocked (or if O_NONBLOCK is specified, -EWOULDBLOCK is reported).
>          * The kernel throttles the client when they enter the last 4KiB page,
>          * so as we double the size of the ring, we nearly double the number
>          * of requests we can fit as 2^n-1: i.e 0, 1, 3, 7, 15, 31 pages.
>          */
> 
> -Chris
> 





^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE
@ 2020-02-20 15:57         ` Janusz Krzysztofik
  0 siblings, 0 replies; 57+ messages in thread
From: Janusz Krzysztofik @ 2020-02-20 15:57 UTC (permalink / raw)
  To: Chris Wilson; +Cc: igt-dev, intel-gfx

Hi Chris,

On Monday, December 2, 2019 3:59:19 PM CET Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-12-02 14:42:58)
> > Hi Chris,
> > 
> > I have a few questions rather than comments.  I hope they are worth spending 
> > your time.
> > 
> > On Wednesday, November 13, 2019 1:52:40 PM CET Chris Wilson wrote:
> > > I915_CONTEXT_PARAM_RINGSIZE specifies how large to create the command
> > > ringbuffer for logical ring contects. This directly affects the number
> > 
> > s/contects/contexts/
> > 
> > > of batches userspace can submit before blocking waiting for space.
> > > 
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Do you still have this patch queued somewhere?  As the UMD team has accepted
the solution and is ready with changes on their side, I think we need to merge
it soon, along with the kernel side.

Thanks,
Janusz


> > > ---
> > >  tests/Makefile.sources        |   3 +
> > >  tests/i915/gem_ctx_ringsize.c | 296 ++++++++++++++++++++++++++++++++++
> > >  tests/meson.build             |   1 +
> > >  3 files changed, 300 insertions(+)
> > >  create mode 100644 tests/i915/gem_ctx_ringsize.c
> > > 
> > > diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> > > index e17d43155..801fc52f3 100644
> > > --- a/tests/Makefile.sources
> > > +++ b/tests/Makefile.sources
> > > @@ -163,6 +163,9 @@ gem_ctx_param_SOURCES = i915/gem_ctx_param.c
> > >  TESTS_progs += gem_ctx_persistence
> > >  gem_ctx_persistence_SOURCES = i915/gem_ctx_persistence.c
> > >  
> > > +TESTS_progs += gem_ctx_ringsize
> > > +gem_ctx_ringsize_SOURCES = i915/gem_ctx_ringsize.c
> > > +
> > >  TESTS_progs += gem_ctx_shared
> > >  gem_ctx_shared_SOURCES = i915/gem_ctx_shared.c
> > >  
> > > diff --git a/tests/i915/gem_ctx_ringsize.c b/tests/i915/gem_ctx_ringsize.c
> > > new file mode 100644
> > > index 000000000..1450e8f0d
> > > --- /dev/null
> > > +++ b/tests/i915/gem_ctx_ringsize.c
> > > @@ -0,0 +1,296 @@
> > > +/*
> > > + * Copyright © 2019 Intel Corporation
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person obtaining a
> > > + * copy of this software and associated documentation files (the "Software"),
> > > + * to deal in the Software without restriction, including without limitation
> > > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > > + * and/or sell copies of the Software, and to permit persons to whom the
> > > + * Software is furnished to do so, subject to the following conditions:
> > > + *
> > > + * The above copyright notice and this permission notice (including the next
> > > + * paragraph) shall be included in all copies or substantial portions of the
> > > + * Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> > > + * IN THE SOFTWARE.
> > > + */
> > > +
> > > +#include <errno.h>
> > > +#include <fcntl.h>
> > > +#include <inttypes.h>
> > > +#include <sys/ioctl.h>
> > > +#include <sys/types.h>
> > > +#include <unistd.h>
> > > +
> > > +#include "drmtest.h" /* gem_quiescent_gpu()! */
> > > +#include "i915/gem_context.h"
> > > +#include "i915/gem_engine_topology.h"
> > > +#include "ioctl_wrappers.h" /* gem_wait()! */
> > > +#include "sw_sync.h"
> > > +
> > > +#define I915_CONTEXT_PARAM_RINGSIZE 0xc
> > 
> > How are we going to handle symbol redefinition conflict which arises as soon 
> > as this symbol is also included from kernel headers (e.g. via 
> > "i915/gem_engine_topology.h")?
> 
> In the final version we copy the headers from the kernel. Conflicts remind us
> when we forget.
> 
> > 
> > > +
> > > +static bool has_ringsize(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_param p = {
> > > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +     };
> > > +
> > > +     return __gem_context_get_param(i915, &p) == 0;
> > > +}
> > > +
> > > +static void test_idempotent(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_param p = {
> > > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +     };
> > > +     uint32_t saved;
> > > +
> > > +     /*
> > > +      * Simple test to verify that we are able to read back the same
> > > +      * value as we set.
> > > +      */
> > > +
> > > +     gem_context_get_param(i915, &p);
> > > +     saved = p.value;
> > > +
> > > +     for (uint32_t x = 1 << 12; x <= 128 << 12; x <<= 1) {
> > 
> > I've noticed you are using two different notations for those minimum/maximum 
> > constants.  I think that may be confusing.  How about defining and using 
> > macros?  
> 
> A range in pages...
>  
> > > +             p.value = x;
> > > +             gem_context_set_param(i915, &p);
> > > +             gem_context_get_param(i915, &p);
> > > +             igt_assert_eq_u32(p.value, x);
> > > +     }
> > > +
> > > +     p.value = saved;
> > > +     gem_context_set_param(i915, &p);
> > > +}
> > > +
> > > +static void test_invalid(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_param p = {
> > > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +     };
> > > +     uint64_t invalid[] = {
> > > +             0, 1, 4095, 4097, 8191, 8193,
> > > +             /* upper limit may be HW dependent, atm it is 512KiB */
> > > +             (512 << 10) - 1, (512 << 10) + 1,
> > 
> > Here is an example of that different notation mentioned above.
> 
> And here written in KiB to match comments.
> 
> > 
> > > +             -1, -1u
> > > +     };
> > > +     uint32_t saved;
> > > +
> > > +     gem_context_get_param(i915, &p);
> > > +     saved = p.value;
> > > +
> > > +     for (int i = 0; i < ARRAY_SIZE(invalid); i++) {
> > > +             p.value = invalid[i];
> > > +             igt_assert_eq(__gem_context_set_param(i915, &p), -EINVAL);
> > > +             gem_context_get_param(i915, &p);
> > > +             igt_assert_eq_u64(p.value, saved);
> > > +     }
> > > +}
> > > +
> > > +static int create_ext_ioctl(int i915,
> > > +                         struct drm_i915_gem_context_create_ext *arg)
> > > +{
> > > +     int err;
> > > +
> > > +     err = 0;
> > > +     if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
> > > +             err = -errno;
> > > +             igt_assume(err);
> > > +     }
> > > +
> > > +     errno = 0;
> > > +     return err;
> > > +}
> > 
> > This helper looks pretty standard to me.  Why are there no library
> > functions for such generic operations?
> 
> Because no one has written that yet.
> 
> > 
> > > +
> > > +static void test_create(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_create_ext_setparam p = {
> > > +             .base = {
> > > +                     .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > > +                     .next_extension = 0, /* end of chain */
> > > +             },
> > > +             .param = {
> > > +                     .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +                     .value = 512 << 10,
> > > +             }
> > > +     };
> > > +     struct drm_i915_gem_context_create_ext create = {
> > > +             .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > > +             .extensions = to_user_pointer(&p),
> > > +     };
> > > +
> > > +     igt_assert_eq(create_ext_ioctl(i915, &create),  0);
> > > +
> > > +     p.param.ctx_id = create.ctx_id;
> > > +     p.param.value = 0;
> > > +     gem_context_get_param(i915, &p.param);
> > > +     igt_assert_eq(p.param.value, 512 << 10);
> > > +
> > > +     gem_context_destroy(i915, create.ctx_id);
> > > +}
> > > +
> > > +static void test_clone(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_create_ext_setparam p = {
> > > +             .base = {
> > > +                     .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > > +                     .next_extension = 0, /* end of chain */
> > > +             },
> > > +             .param = {
> > > +                     .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +                     .value = 512 << 10,
> > > +             }
> > > +     };
> > > +     struct drm_i915_gem_context_create_ext create = {
> > > +             .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > > +             .extensions = to_user_pointer(&p),
> > > +     };
> > > +
> > > +     igt_assert_eq(create_ext_ioctl(i915, &create),  0);
> > > +
> > > +     p.param.ctx_id = gem_context_clone(i915, create.ctx_id,
> > > +                                        I915_CONTEXT_CLONE_ENGINES, 0);
> > > +     igt_assert_neq(p.param.ctx_id, create.ctx_id);
> > > +     gem_context_destroy(i915, create.ctx_id);
> > > +
> > > +     p.param.value = 0;
> > > +     gem_context_get_param(i915, &p.param);
> > > +     igt_assert_eq(p.param.value, 512 << 10);
> > > +
> > > +     gem_context_destroy(i915, p.param.ctx_id);
> > > +}
> > > +
> > > +static int __execbuf(int i915, struct drm_i915_gem_execbuffer2 *execbuf)
> > > +{
> > > +     int err;
> > > +
> > > +     err = 0;
> > > +     if (ioctl(i915, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
> > > +             err = -errno;
> > > +
> > > +     errno = 0;
> > > +     return err;
> > > +}
> > 
> > The above helper looks pretty much the same as lib/ioctl_wrappers.c:__gem_execbuf().
> > Does the igt_assume(err) found in the latter matter so much that you use your own
> > version?
> 
> It's very, very different from that one.
> 
> > > +
> > > +static uint32_t __batch_create(int i915, uint32_t offset)
> > 
> > This is always called with offset = 0, do we expect other values to be used 
> > later?
> 
> Why not?
>  
> > > +{
> > > +     const uint32_t bbe = 0xa << 23;
> > > +     uint32_t handle;
> > > +
> > > +     handle = gem_create(i915, ALIGN(offset + sizeof(bbe), 4096));
> > 
> > Why don't we rely on the driver making the alignment for us?
> 
> I'm used to being inside the kernel where it's expected to be correct.
> 
> > > +     gem_write(i915, handle, offset, &bbe, sizeof(bbe));
> > > +
> > > +     return handle;
> > > +}
> > > +
> > > +static uint32_t batch_create(int i915)
> > > +{
> > > +     return __batch_create(i915, 0);
> > > +}
> > > +
> > > +static unsigned int measure_inflight(int i915, unsigned int engine)
> > > +{
> > > +     IGT_CORK_FENCE(cork);
> > > +     struct drm_i915_gem_exec_object2 obj = {
> > > +             .handle = batch_create(i915)
> > > +     };
> > > +     struct drm_i915_gem_execbuffer2 execbuf = {
> > > +             .buffers_ptr = to_user_pointer(&obj),
> > > +             .buffer_count = 1,
> > > +             .flags = engine | I915_EXEC_FENCE_IN,
> > > +             .rsvd2 = igt_cork_plug(&cork, i915),
> > > +     };
> > > +     unsigned int count;
> > > +
> > > +     fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) | O_NONBLOCK);
> > > +
> > > +     gem_execbuf(i915, &execbuf);
> > > +     for (count = 1; __execbuf(i915, &execbuf) == 0; count++)
> > > +             ;
> > 
> > Shouldn't we check if the reason for the failure is what we expect, i.e., 
> > -EWOULDBLOCK (or -EINTR)?  And why don't we put a time constraint on that loop 
> > in case O_NONBLOCK handling is not supported (yet)?
> 
> Sure. The idea is that O_NONBLOCK is supported, otherwise we don't
> have fast and precise feedback.
> 
> > > +static void test_resize(int i915,
> > > +                     const struct intel_execution_engine2 *e,
> > > +                     unsigned int flags)
> > > +#define IDLE (1 << 0)
> > > +{
> > > +     struct drm_i915_gem_context_param p = {
> > > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +     };
> > > +     unsigned int prev[2] = {};
> > > +     uint32_t saved;
> > > +
> > > +     gem_context_get_param(i915, &p);
> > > +     saved = p.value;
> > > +
> > > +     gem_quiescent_gpu(i915);
> > > +     for (p.value = 1 << 12; p.value <= 128 << 12; p.value <<= 1) {
> > > +             unsigned int count;
> > > +
> > > +             gem_context_set_param(i915, &p);
> > > +
> > > +             count = measure_inflight(i915, e->flags);
> > > +             igt_info("%s: %llx -> %u\n", e->name, p.value, count);
> > > +             igt_assert(count > 3 * (prev[1] - prev[0]) / 4 + prev[1]);
> > 
> > Where does this formula come from?  Why not just count == 2 * prev[1] ?
> > What results should we expect in "active" vs. "idle" mode?
> 
> I've explained somewhere why it is not 2*prev... And there's a small
> amount of imprecision (+-1 request). In test_resize is the comment:
> 
>         /*
>          * The ringsize directly affects the number of batches we can have
>          * inflight -- when we run out of room in the ring, the client is
>          * blocked (or if O_NONBLOCK is specified, -EWOULDBLOCK is reported).
>          * The kernel throttles the client when they enter the last 4KiB page,
>          * so as we double the size of the ring, we nearly double the number
>          * of requests we can fit as 2^n-1: i.e. 0, 1, 3, 7, 15, 31 pages.
>          */
> 
> -Chris
> 

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE
  2020-02-20 15:57         ` Janusz Krzysztofik
@ 2020-02-20 16:00           ` Chris Wilson
  -1 siblings, 0 replies; 57+ messages in thread
From: Chris Wilson @ 2020-02-20 16:00 UTC (permalink / raw)
  To: Janusz Krzysztofik; +Cc: igt-dev, intel-gfx

Quoting Janusz Krzysztofik (2020-02-20 15:57:24)
> Hi Chris,
> 
> On Monday, December 2, 2019 3:59:19 PM CET Chris Wilson wrote:
> > Quoting Janusz Krzysztofik (2019-12-02 14:42:58)
> > > Hi Chris,
> > > 
> > > I have a few questions rather than comments.  I hope they are worth spending 
> > > your time.
> > > 
> > > On Wednesday, November 13, 2019 1:52:40 PM CET Chris Wilson wrote:
> > > > I915_CONTEXT_PARAM_RINGSIZE specifies how large to create the command
> > > > ringbuffer for logical ring contects. This directly affects the number
> > > 
> > > s/contects/contexts/
> > > 
> > > > of batches userspace can submit before blocking waiting for space.
> > > > 
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Have you still got this patch queued somewhere?  As the UMD has accepted the
> solution and is ready with changes on its side, I think we need to merge it
> soon, and the kernel side as well.

Link? That's all I need to merge the kernel...
-Chris

end of thread, other threads:[~2020-02-20 16:00 UTC | newest]

Thread overview: 57+ messages
2019-11-13 12:52 [PATCH i-g-t 1/9] i915/gem_exec_schedule: Split pi-ringfull into two tests Chris Wilson
2019-11-13 12:52 ` [Intel-gfx] " Chris Wilson
2019-11-13 12:52 ` [PATCH i-g-t 2/9] i915/gem_exec_schedule: Exercise priority inversion from resource contention Chris Wilson
2019-11-13 12:52   ` [igt-dev] " Chris Wilson
2019-11-13 12:52   ` [Intel-gfx] " Chris Wilson
2019-11-13 12:52 ` [PATCH i-g-t 3/9] i915/gem_exec_schedule: Beware priority inversion from iova faults Chris Wilson
2019-11-13 12:52   ` [igt-dev] " Chris Wilson
2019-11-13 12:52   ` [Intel-gfx] " Chris Wilson
2019-11-13 12:52 ` [PATCH i-g-t 4/9] i915: Start putting the mmio_base to wider use Chris Wilson
2019-11-13 12:52   ` [igt-dev] " Chris Wilson
2019-11-13 12:52   ` [Intel-gfx] " Chris Wilson
2019-11-21 12:04   ` [igt-dev] " Lionel Landwerlin
2019-11-21 12:04     ` Lionel Landwerlin
2019-11-21 12:04     ` [Intel-gfx] " Lionel Landwerlin
2019-11-21 12:11     ` Chris Wilson
2019-11-21 12:11       ` Chris Wilson
2019-11-21 12:11       ` [Intel-gfx] " Chris Wilson
2019-11-21 13:11       ` Lionel Landwerlin
2019-11-21 13:11         ` Lionel Landwerlin
2019-11-21 13:11         ` [Intel-gfx] " Lionel Landwerlin
2019-11-13 12:52 ` [PATCH i-g-t 5/9] i915/gem_ctx_isolation: Check engine relative registers Chris Wilson
2019-11-13 12:52   ` [Intel-gfx] " Chris Wilson
2019-11-21 21:07   ` Tang, CQ
2019-11-21 21:07     ` [igt-dev] [Intel-gfx] " Tang, CQ
2019-11-21 21:07     ` Tang, CQ
2019-11-21 23:44     ` Chris Wilson
2019-11-21 23:44       ` [igt-dev] [Intel-gfx] " Chris Wilson
2019-11-21 23:44       ` Chris Wilson
2019-11-21 23:56       ` Tang, CQ
2019-11-21 23:56         ` [igt-dev] [Intel-gfx] " Tang, CQ
2019-11-21 23:56         ` Tang, CQ
2019-11-25 19:13   ` Tang, CQ
2019-11-25 19:13     ` [Intel-gfx] " Tang, CQ
2019-11-13 12:52 ` [PATCH i-g-t 6/9] i915: Exercise preemption timeout controls in sysfs Chris Wilson
2019-11-13 12:52   ` [Intel-gfx] " Chris Wilson
2019-11-13 12:52 ` [PATCH i-g-t 7/9] i915: Exercise sysfs heartbeat controls Chris Wilson
2019-11-13 12:52   ` [igt-dev] " Chris Wilson
2019-11-13 12:52   ` [Intel-gfx] " Chris Wilson
2019-11-13 12:52 ` [PATCH i-g-t 8/9] i915: Exercise timeslice sysfs property Chris Wilson
2019-11-13 12:52   ` [igt-dev] " Chris Wilson
2019-11-13 12:52   ` [Intel-gfx] " Chris Wilson
2019-11-13 12:52 ` [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE Chris Wilson
2019-11-13 12:52   ` [igt-dev] " Chris Wilson
2019-11-13 12:52   ` [Intel-gfx] " Chris Wilson
2019-12-02 14:42   ` [igt-dev] " Janusz Krzysztofik
2019-12-02 14:42     ` Janusz Krzysztofik
2019-12-02 14:42     ` [Intel-gfx] " Janusz Krzysztofik
2019-12-02 14:59     ` Chris Wilson
2019-12-02 14:59       ` Chris Wilson
2019-12-02 14:59       ` [Intel-gfx] " Chris Wilson
2020-02-20 15:57       ` Janusz Krzysztofik
2020-02-20 15:57         ` Janusz Krzysztofik
2020-02-20 16:00         ` [Intel-gfx] " Chris Wilson
2020-02-20 16:00           ` Chris Wilson
2019-11-13 14:30 ` [igt-dev] ✓ Fi.CI.BAT: success for series starting with [i-g-t,1/9] i915/gem_exec_schedule: Split pi-ringfull into two tests Patchwork
2019-11-13 14:40 ` [igt-dev] ✗ GitLab.Pipeline: warning " Patchwork
2019-11-14  2:10 ` [igt-dev] ✓ Fi.CI.IGT: success " Patchwork