All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH i-g-t 1/3] i915/gem_exec_schedule: Split pi-ringfull into two tests
@ 2019-11-08 20:49 ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-08 20:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

pi-ringfull uses 2 contexts that share a buffer. The intent was that the
contexts were independent, but it was the effect of the global lock held
by the low priority client that prevented the high priority client from
executing. I began to add a second variant where there was a shared
resource which may induce a priority inversion, only to notice the
existing test already imposed a shared resource. Hence adding a second
test to rerun pi-ringfull in both unshared and shared resource modes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915/gem_exec_schedule.c | 38 +++++++++++++++++++++++++++++++---
 1 file changed, 35 insertions(+), 3 deletions(-)

diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
index 5c15f1770..84581bffe 100644
--- a/tests/i915/gem_exec_schedule.c
+++ b/tests/i915/gem_exec_schedule.c
@@ -1468,7 +1468,8 @@ static void bind_to_cpu(int cpu)
 	igt_assert(sched_setaffinity(getpid(), sizeof(cpu_set_t), &allowed) == 0);
 }
 
-static void test_pi_ringfull(int fd, unsigned int engine)
+static void test_pi_ringfull(int fd, unsigned int engine, unsigned int flags)
+#define SHARED BIT(0)
 {
 	const uint32_t bbe = MI_BATCH_BUFFER_END;
 	struct sigaction sa = { .sa_handler = alarm_handler };
@@ -1480,6 +1481,24 @@ static void test_pi_ringfull(int fd, unsigned int engine)
 	uint32_t vip;
 	bool *result;
 
+	/*
+	 * We start simple. A low priority client should never prevent a high
+	 * priority client from submitting their work; even if the low priority
+	 * client exhausts their ringbuffer and so is throttled.
+	 *
+	 * SHARED: A variant on the above rule is that even is the 2 clients
+	 * share a read-only resource, the blocked low priority client should
+	 * not prevent the high priority client from executing. A buffer,
+	 * e.g. the batch buffer, that is shared only for reads (no write
+	 * hazard, so the reads can be executed in parallel or in any order),
+	 * so not cause priority inversion due to the resource conflict.
+	 *
+	 * First, we have the low priority context who fills their ring and so
+	 * blocks. As soon as that context blocks, we try to submit a high
+	 * priority batch, which should be executed immediately before the low
+	 * priority context is unblocked.
+	 */
+
 	result = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
 	igt_assert(result != MAP_FAILED);
 
@@ -1545,6 +1564,12 @@ static void test_pi_ringfull(int fd, unsigned int engine)
 	igt_fork(child, 1) {
 		int err;
 
+		/* Replace our batch to avoid conflicts over shared resources? */
+		if (!(flags & SHARED)) {
+			obj[1].handle = gem_create(fd, 4096);
+			gem_write(fd, obj[1].handle, 0, &bbe, sizeof(bbe));
+		}
+
 		result[0] = vip != execbuf.rsvd1;
 
 		igt_debug("Waking parent\n");
@@ -1557,7 +1582,8 @@ static void test_pi_ringfull(int fd, unsigned int engine)
 		itv.it_value.tv_usec = 10000;
 		setitimer(ITIMER_REAL, &itv, NULL);
 
-		/* Since we are the high priority task, we expect to be
+		/*
+		 * Since we are the high priority task, we expect to be
 		 * able to add ourselves to *our* ring without interruption.
 		 */
 		igt_debug("HP child executing\n");
@@ -1569,6 +1595,9 @@ static void test_pi_ringfull(int fd, unsigned int engine)
 		setitimer(ITIMER_REAL, &itv, NULL);
 
 		result[2] = err == 0;
+
+		if (!(flags & SHARED))
+			gem_close(fd, obj[1].handle);
 	}
 
 	/* Relinquish CPU just to allow child to create a context */
@@ -1831,7 +1860,10 @@ igt_main
 				}
 
 				igt_subtest_f("pi-ringfull-%s", e->name)
-					test_pi_ringfull(fd, eb_ring(e));
+					test_pi_ringfull(fd, eb_ring(e), 0);
+
+				igt_subtest_f("pi-common-%s", e->name)
+					test_pi_ringfull(fd, eb_ring(e), SHARED);
 			}
 		}
 	}
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Intel-gfx] [PATCH i-g-t 1/3] i915/gem_exec_schedule: Split pi-ringfull into two tests
@ 2019-11-08 20:49 ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-08 20:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

pi-ringfull uses 2 contexts that share a buffer. The intent was that the
contexts were independent, but it was the effect of the global lock held
by the low priority client that prevented the high priority client from
executing. I began to add a second variant where there was a shared
resource which may induce a priority inversion, only to notice the
existing test already imposed a shared resource. Hence adding a second
test to rerun pi-ringfull in both unshared and shared resource modes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915/gem_exec_schedule.c | 38 +++++++++++++++++++++++++++++++---
 1 file changed, 35 insertions(+), 3 deletions(-)

diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
index 5c15f1770..84581bffe 100644
--- a/tests/i915/gem_exec_schedule.c
+++ b/tests/i915/gem_exec_schedule.c
@@ -1468,7 +1468,8 @@ static void bind_to_cpu(int cpu)
 	igt_assert(sched_setaffinity(getpid(), sizeof(cpu_set_t), &allowed) == 0);
 }
 
-static void test_pi_ringfull(int fd, unsigned int engine)
+static void test_pi_ringfull(int fd, unsigned int engine, unsigned int flags)
+#define SHARED BIT(0)
 {
 	const uint32_t bbe = MI_BATCH_BUFFER_END;
 	struct sigaction sa = { .sa_handler = alarm_handler };
@@ -1480,6 +1481,24 @@ static void test_pi_ringfull(int fd, unsigned int engine)
 	uint32_t vip;
 	bool *result;
 
+	/*
+	 * We start simple. A low priority client should never prevent a high
+	 * priority client from submitting their work; even if the low priority
+	 * client exhausts their ringbuffer and so is throttled.
+	 *
+	 * SHARED: A variant on the above rule is that even is the 2 clients
+	 * share a read-only resource, the blocked low priority client should
+	 * not prevent the high priority client from executing. A buffer,
+	 * e.g. the batch buffer, that is shared only for reads (no write
+	 * hazard, so the reads can be executed in parallel or in any order),
+	 * so not cause priority inversion due to the resource conflict.
+	 *
+	 * First, we have the low priority context who fills their ring and so
+	 * blocks. As soon as that context blocks, we try to submit a high
+	 * priority batch, which should be executed immediately before the low
+	 * priority context is unblocked.
+	 */
+
 	result = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
 	igt_assert(result != MAP_FAILED);
 
@@ -1545,6 +1564,12 @@ static void test_pi_ringfull(int fd, unsigned int engine)
 	igt_fork(child, 1) {
 		int err;
 
+		/* Replace our batch to avoid conflicts over shared resources? */
+		if (!(flags & SHARED)) {
+			obj[1].handle = gem_create(fd, 4096);
+			gem_write(fd, obj[1].handle, 0, &bbe, sizeof(bbe));
+		}
+
 		result[0] = vip != execbuf.rsvd1;
 
 		igt_debug("Waking parent\n");
@@ -1557,7 +1582,8 @@ static void test_pi_ringfull(int fd, unsigned int engine)
 		itv.it_value.tv_usec = 10000;
 		setitimer(ITIMER_REAL, &itv, NULL);
 
-		/* Since we are the high priority task, we expect to be
+		/*
+		 * Since we are the high priority task, we expect to be
 		 * able to add ourselves to *our* ring without interruption.
 		 */
 		igt_debug("HP child executing\n");
@@ -1569,6 +1595,9 @@ static void test_pi_ringfull(int fd, unsigned int engine)
 		setitimer(ITIMER_REAL, &itv, NULL);
 
 		result[2] = err == 0;
+
+		if (!(flags & SHARED))
+			gem_close(fd, obj[1].handle);
 	}
 
 	/* Relinquish CPU just to allow child to create a context */
@@ -1831,7 +1860,10 @@ igt_main
 				}
 
 				igt_subtest_f("pi-ringfull-%s", e->name)
-					test_pi_ringfull(fd, eb_ring(e));
+					test_pi_ringfull(fd, eb_ring(e), 0);
+
+				igt_subtest_f("pi-common-%s", e->name)
+					test_pi_ringfull(fd, eb_ring(e), SHARED);
 			}
 		}
 	}
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-08 20:49   ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-08 20:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

Register a userspace fault handler for a memory region that we also pass
to the GPU via userptr, and make sure the pagefault is properly serviced
before we execute.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
 1 file changed, 118 insertions(+), 1 deletion(-)

diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
index 11d6f4a1c..774a9f92c 100644
--- a/tests/i915/gem_userptr_blits.c
+++ b/tests/i915/gem_userptr_blits.c
@@ -36,6 +36,8 @@
  * The goal is to simply ensure the basics work.
  */
 
+#include <linux/userfaultfd.h>
+
 #include "igt.h"
 #include <stdlib.h>
 #include <stdio.h>
@@ -44,9 +46,11 @@
 #include <inttypes.h>
 #include <errno.h>
 #include <setjmp.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
 #include <sys/stat.h>
+#include <sys/syscall.h>
 #include <sys/time.h>
-#include <sys/mman.h>
 #include <glib.h>
 #include <signal.h>
 #include <pthread.h>
@@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
 	free(t_data.ptr);
 }
 
+struct ufd_thread {
+	uint32_t *page;
+	int i915;
+};
+
+static uint32_t create_page(int i915, void *page)
+{
+	uint32_t handle;
+
+	gem_userptr(i915, page, 4096, 0, 0, &handle);
+	return handle;
+}
+
+static uint32_t create_batch(int i915)
+{
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	uint32_t handle;
+
+	handle = gem_create(i915, 4096);
+	gem_write(i915, handle, 0, &bbe, sizeof(bbe));
+
+	return handle;
+}
+
+static void *ufd_thread(void *arg)
+{
+	struct ufd_thread *t = arg;
+	struct drm_i915_gem_exec_object2 obj[2] = {
+		{ .handle = create_page(t->i915, t->page) },
+		{ .handle = create_batch(t->i915) },
+	};
+	struct drm_i915_gem_execbuffer2 eb = {
+		.buffers_ptr = to_user_pointer(obj),
+		.buffer_count = ARRAY_SIZE(obj),
+	};
+
+	igt_debug("submitting fault\n");
+	gem_execbuf(t->i915, &eb);
+	gem_sync(t->i915, obj[1].handle);
+
+	for (int i = 0; i < ARRAY_SIZE(obj); i++)
+		gem_close(t->i915, obj[i].handle);
+
+	t->i915 = -1;
+	return NULL;
+}
+
+static int userfaultfd(int flags)
+{
+	return syscall(SYS_userfaultfd, flags);
+}
+
+static void test_userfault(int i915)
+{
+	struct uffdio_api api = { .api = UFFD_API };
+	struct uffdio_register reg;
+	struct uffdio_copy copy;
+	struct uffd_msg msg;
+	struct ufd_thread t;
+	pthread_t thread;
+	char poison[4096];
+	int ufd;
+
+	/*
+	 * Register a page with userfaultfd, and wrap that inside a userptr bo.
+	 * When we try to use gup insider userptr_get_pages, it will trigger
+	 * a pagefault that is sent to the userfaultfd for servicing. This
+	 * is arbitrarily slow, as the submission must wait until the fault
+	 * is serviced by the userspace fault handler.
+	 */
+
+	ufd = userfaultfd(0);
+	igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
+	igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
+		      "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
+
+	t.i915 = i915;
+
+	t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
+	igt_assert(t.page != MAP_FAILED);
+
+	memset(&reg, 0, sizeof(reg));
+	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
+	reg.range.start = to_user_pointer(t.page);
+	reg.range.len = 4096;
+	do_ioctl(ufd, UFFDIO_REGISTER, &reg);
+	igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
+
+	igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
+
+	/* Wait for the fault */
+	igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
+	igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
+	igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
+
+	/* Faulting thread remains blocked */
+	igt_assert_eq(t.i915, i915);
+
+	memset(&copy, 0, sizeof(copy));
+	copy.dst = msg.arg.pagefault.address;
+	copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
+	copy.len = 4096;
+	do_ioctl(ufd, UFFDIO_COPY, &copy);
+
+	pthread_join(thread, NULL);
+
+	munmap(t.page, 4096);
+	close(ufd);
+}
+
 uint64_t total_ram;
 uint64_t aperture_size;
 int fd, count;
@@ -1902,6 +2016,9 @@ igt_main_args("c:", NULL, help_str, opt_handler, NULL)
 		igt_subtest("forbidden-operations")
 			test_forbidden_ops(fd);
 
+		igt_subtest("userfault")
+			test_userfault(fd);
+
 		igt_subtest("relocations")
 			test_relocations(fd);
 	}
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Intel-gfx] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-08 20:49   ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-08 20:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

Register a userspace fault handler for a memory region that we also pass
to the GPU via userptr, and make sure the pagefault is properly serviced
before we execute.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
 1 file changed, 118 insertions(+), 1 deletion(-)

diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
index 11d6f4a1c..774a9f92c 100644
--- a/tests/i915/gem_userptr_blits.c
+++ b/tests/i915/gem_userptr_blits.c
@@ -36,6 +36,8 @@
  * The goal is to simply ensure the basics work.
  */
 
+#include <linux/userfaultfd.h>
+
 #include "igt.h"
 #include <stdlib.h>
 #include <stdio.h>
@@ -44,9 +46,11 @@
 #include <inttypes.h>
 #include <errno.h>
 #include <setjmp.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
 #include <sys/stat.h>
+#include <sys/syscall.h>
 #include <sys/time.h>
-#include <sys/mman.h>
 #include <glib.h>
 #include <signal.h>
 #include <pthread.h>
@@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
 	free(t_data.ptr);
 }
 
+struct ufd_thread {
+	uint32_t *page;
+	int i915;
+};
+
+static uint32_t create_page(int i915, void *page)
+{
+	uint32_t handle;
+
+	gem_userptr(i915, page, 4096, 0, 0, &handle);
+	return handle;
+}
+
+static uint32_t create_batch(int i915)
+{
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	uint32_t handle;
+
+	handle = gem_create(i915, 4096);
+	gem_write(i915, handle, 0, &bbe, sizeof(bbe));
+
+	return handle;
+}
+
+static void *ufd_thread(void *arg)
+{
+	struct ufd_thread *t = arg;
+	struct drm_i915_gem_exec_object2 obj[2] = {
+		{ .handle = create_page(t->i915, t->page) },
+		{ .handle = create_batch(t->i915) },
+	};
+	struct drm_i915_gem_execbuffer2 eb = {
+		.buffers_ptr = to_user_pointer(obj),
+		.buffer_count = ARRAY_SIZE(obj),
+	};
+
+	igt_debug("submitting fault\n");
+	gem_execbuf(t->i915, &eb);
+	gem_sync(t->i915, obj[1].handle);
+
+	for (int i = 0; i < ARRAY_SIZE(obj); i++)
+		gem_close(t->i915, obj[i].handle);
+
+	t->i915 = -1;
+	return NULL;
+}
+
+static int userfaultfd(int flags)
+{
+	return syscall(SYS_userfaultfd, flags);
+}
+
+static void test_userfault(int i915)
+{
+	struct uffdio_api api = { .api = UFFD_API };
+	struct uffdio_register reg;
+	struct uffdio_copy copy;
+	struct uffd_msg msg;
+	struct ufd_thread t;
+	pthread_t thread;
+	char poison[4096];
+	int ufd;
+
+	/*
+	 * Register a page with userfaultfd, and wrap that inside a userptr bo.
+	 * When we try to use gup insider userptr_get_pages, it will trigger
+	 * a pagefault that is sent to the userfaultfd for servicing. This
+	 * is arbitrarily slow, as the submission must wait until the fault
+	 * is serviced by the userspace fault handler.
+	 */
+
+	ufd = userfaultfd(0);
+	igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
+	igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
+		      "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
+
+	t.i915 = i915;
+
+	t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
+	igt_assert(t.page != MAP_FAILED);
+
+	memset(&reg, 0, sizeof(reg));
+	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
+	reg.range.start = to_user_pointer(t.page);
+	reg.range.len = 4096;
+	do_ioctl(ufd, UFFDIO_REGISTER, &reg);
+	igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
+
+	igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
+
+	/* Wait for the fault */
+	igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
+	igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
+	igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
+
+	/* Faulting thread remains blocked */
+	igt_assert_eq(t.i915, i915);
+
+	memset(&copy, 0, sizeof(copy));
+	copy.dst = msg.arg.pagefault.address;
+	copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
+	copy.len = 4096;
+	do_ioctl(ufd, UFFDIO_COPY, &copy);
+
+	pthread_join(thread, NULL);
+
+	munmap(t.page, 4096);
+	close(ufd);
+}
+
 uint64_t total_ram;
 uint64_t aperture_size;
 int fd, count;
@@ -1902,6 +2016,9 @@ igt_main_args("c:", NULL, help_str, opt_handler, NULL)
 		igt_subtest("forbidden-operations")
 			test_forbidden_ops(fd);
 
+		igt_subtest("userfault")
+			test_userfault(fd);
+
 		igt_subtest("relocations")
 			test_relocations(fd);
 	}
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-08 20:49   ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-08 20:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev, Tvrtko Ursulin

Register a userspace fault handler for a memory region that we also pass
to the GPU via userptr, and make sure the pagefault is properly serviced
before we execute.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
 1 file changed, 118 insertions(+), 1 deletion(-)

diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
index 11d6f4a1c..774a9f92c 100644
--- a/tests/i915/gem_userptr_blits.c
+++ b/tests/i915/gem_userptr_blits.c
@@ -36,6 +36,8 @@
  * The goal is to simply ensure the basics work.
  */
 
+#include <linux/userfaultfd.h>
+
 #include "igt.h"
 #include <stdlib.h>
 #include <stdio.h>
@@ -44,9 +46,11 @@
 #include <inttypes.h>
 #include <errno.h>
 #include <setjmp.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
 #include <sys/stat.h>
+#include <sys/syscall.h>
 #include <sys/time.h>
-#include <sys/mman.h>
 #include <glib.h>
 #include <signal.h>
 #include <pthread.h>
@@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
 	free(t_data.ptr);
 }
 
+struct ufd_thread {
+	uint32_t *page;
+	int i915;
+};
+
+static uint32_t create_page(int i915, void *page)
+{
+	uint32_t handle;
+
+	gem_userptr(i915, page, 4096, 0, 0, &handle);
+	return handle;
+}
+
+static uint32_t create_batch(int i915)
+{
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	uint32_t handle;
+
+	handle = gem_create(i915, 4096);
+	gem_write(i915, handle, 0, &bbe, sizeof(bbe));
+
+	return handle;
+}
+
+static void *ufd_thread(void *arg)
+{
+	struct ufd_thread *t = arg;
+	struct drm_i915_gem_exec_object2 obj[2] = {
+		{ .handle = create_page(t->i915, t->page) },
+		{ .handle = create_batch(t->i915) },
+	};
+	struct drm_i915_gem_execbuffer2 eb = {
+		.buffers_ptr = to_user_pointer(obj),
+		.buffer_count = ARRAY_SIZE(obj),
+	};
+
+	igt_debug("submitting fault\n");
+	gem_execbuf(t->i915, &eb);
+	gem_sync(t->i915, obj[1].handle);
+
+	for (int i = 0; i < ARRAY_SIZE(obj); i++)
+		gem_close(t->i915, obj[i].handle);
+
+	t->i915 = -1;
+	return NULL;
+}
+
+static int userfaultfd(int flags)
+{
+	return syscall(SYS_userfaultfd, flags);
+}
+
+static void test_userfault(int i915)
+{
+	struct uffdio_api api = { .api = UFFD_API };
+	struct uffdio_register reg;
+	struct uffdio_copy copy;
+	struct uffd_msg msg;
+	struct ufd_thread t;
+	pthread_t thread;
+	char poison[4096];
+	int ufd;
+
+	/*
+	 * Register a page with userfaultfd, and wrap that inside a userptr bo.
+	 * When we try to use gup insider userptr_get_pages, it will trigger
+	 * a pagefault that is sent to the userfaultfd for servicing. This
+	 * is arbitrarily slow, as the submission must wait until the fault
+	 * is serviced by the userspace fault handler.
+	 */
+
+	ufd = userfaultfd(0);
+	igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
+	igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
+		      "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
+
+	t.i915 = i915;
+
+	t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
+	igt_assert(t.page != MAP_FAILED);
+
+	memset(&reg, 0, sizeof(reg));
+	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
+	reg.range.start = to_user_pointer(t.page);
+	reg.range.len = 4096;
+	do_ioctl(ufd, UFFDIO_REGISTER, &reg);
+	igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
+
+	igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
+
+	/* Wait for the fault */
+	igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
+	igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
+	igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
+
+	/* Faulting thread remains blocked */
+	igt_assert_eq(t.i915, i915);
+
+	memset(&copy, 0, sizeof(copy));
+	copy.dst = msg.arg.pagefault.address;
+	copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
+	copy.len = 4096;
+	do_ioctl(ufd, UFFDIO_COPY, &copy);
+
+	pthread_join(thread, NULL);
+
+	munmap(t.page, 4096);
+	close(ufd);
+}
+
 uint64_t total_ram;
 uint64_t aperture_size;
 int fd, count;
@@ -1902,6 +2016,9 @@ igt_main_args("c:", NULL, help_str, opt_handler, NULL)
 		igt_subtest("forbidden-operations")
 			test_forbidden_ops(fd);
 
+		igt_subtest("userfault")
+			test_userfault(fd);
+
 		igt_subtest("relocations")
 			test_relocations(fd);
 	}
-- 
2.24.0

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention
@ 2019-11-08 20:49   ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-08 20:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

One of the hardest priority inversion tasks to both handle and to
simulate in testing is inversion due to resource contention. The
challenge is that a high priority context should never be blocked by a
low priority context, even if both are starving for resources --
ideally, at least for some RT OSes, the higher priority context has
first pick of the meagre resources so that it can be executed with
minimum latency.

userfaultfd allows us to handle a page fault in userspace, and so
arbitrary impose a delay on the fault handler, creating a situation
where a low priority context is blocked waiting for the fault. This
blocked context should not prevent a high priority context from being
executed. While the userfault tries to emulate a slow fault (e.g. from a
failing swap device), it is unfortunately limited to a single object
type: the userptr. Hopefully, we will find other ways to impose other
starvation conditions on global resources.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915/gem_exec_schedule.c | 155 +++++++++++++++++++++++++++++++++
 1 file changed, 155 insertions(+)

diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
index 84581bffe..0af7b4c6d 100644
--- a/tests/i915/gem_exec_schedule.c
+++ b/tests/i915/gem_exec_schedule.c
@@ -23,10 +23,16 @@
 
 #include "config.h"
 
+#include <linux/userfaultfd.h>
+
+#include <pthread.h>
 #include <sys/poll.h>
 #include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
 #include <sched.h>
 #include <signal.h>
+#include <unistd.h>
 
 #include "igt.h"
 #include "igt_rand.h"
@@ -1625,6 +1631,152 @@ static void test_pi_ringfull(int fd, unsigned int engine, unsigned int flags)
 	munmap(result, 4096);
 }
 
+static int userfaultfd(int flags)
+{
+	return syscall(SYS_userfaultfd, flags);
+}
+
+struct ufd_thread {
+	pthread_t thread;
+	uint32_t batch;
+	uint32_t *page;
+	unsigned int engine;
+	int i915;
+};
+
+static uint32_t create_userptr(int i915, void *page)
+{
+	uint32_t handle;
+
+	gem_userptr(i915, page, 4096, 0, 0, &handle);
+	return handle;
+}
+
+static void *ufd_thread(void *arg)
+{
+	struct ufd_thread *t = arg;
+	struct drm_i915_gem_exec_object2 obj[2] = {
+		{ .handle = create_userptr(t->i915, t->page) },
+		{ .handle = t->batch },
+	};
+	struct drm_i915_gem_execbuffer2 eb = {
+		.buffers_ptr = to_user_pointer(obj),
+		.buffer_count = ARRAY_SIZE(obj),
+		.flags = t->engine,
+		.rsvd1 = gem_context_create(t->i915),
+	};
+	gem_context_set_priority(t->i915, eb.rsvd1, MIN_PRIO);
+
+	igt_debug("submitting fault\n");
+	gem_execbuf(t->i915, &eb);
+	gem_sync(t->i915, obj[0].handle);
+	gem_close(t->i915, obj[0].handle);
+
+	gem_context_destroy(t->i915, eb.rsvd1);
+
+	t->i915 = -1;
+	return NULL;
+}
+
+static void test_pi_userfault(int i915, unsigned int engine)
+{
+	struct uffdio_api api = { .api = UFFD_API };
+	struct uffdio_register reg;
+	struct uffdio_copy copy;
+	struct uffd_msg msg;
+	struct ufd_thread t;
+	char poison[4096];
+	int ufd;
+
+	/*
+	 * Resource contention can easily lead to priority inversion problems,
+	 * that we wish to avoid. Here, we simulate one simple form of resource
+	 * starvation by using an arbitrary slow userspace fault handler to cause
+	 * the low priority context to block waiting for its resource. While it
+	 * is blocked, it should not prevent a higher priority context from
+	 * executing.
+	 *
+	 * This is only a very simple scenario, in more general tests we will
+	 * need to simulate contention on the shared resource such that both
+	 * low and high priority contexts are starving and must fight over
+	 * the meagre resources. One step at a time.
+	 */
+
+	ufd = userfaultfd(0);
+	igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
+	igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
+		      "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
+
+	t.i915 = i915;
+	t.engine = engine;
+
+	t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
+	igt_assert(t.page != MAP_FAILED);
+
+	t.batch = gem_create(i915, 4096);
+	memset(poison, 0xff, sizeof(poison));
+	gem_write(i915, t.batch, 0, poison, 4096);
+
+	/* Register our fault handler for t.page */
+	memset(&reg, 0, sizeof(reg));
+	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
+	reg.range.start = to_user_pointer(t.page);
+	reg.range.len = 4096;
+	do_ioctl(ufd, UFFDIO_REGISTER, &reg);
+	igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
+
+	/* Kick off the low priority submission */
+	igt_assert(pthread_create(&t.thread, NULL, ufd_thread, &t) == 0);
+
+	/* Wait until the low priority thread is blocked on a fault */
+	igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
+	igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
+	igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
+
+	/* While the low priority context is blocked; execute a vip */
+	if (1) {
+		const uint32_t bbe = MI_BATCH_BUFFER_END;
+		struct drm_i915_gem_exec_object2 obj = {
+			.handle = t.batch,
+		};
+		struct pollfd pfd;
+		struct drm_i915_gem_execbuffer2 eb = {
+			.buffers_ptr = to_user_pointer(&obj),
+			.buffer_count = 1,
+			.flags = engine | I915_EXEC_FENCE_OUT,
+			.rsvd1 = gem_context_create(i915),
+		};
+		gem_context_set_priority(i915, eb.rsvd1, MAX_PRIO);
+		gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
+		gem_execbuf_wr(i915, &eb);
+
+		memset(&pfd, 0, sizeof(pfd));
+		pfd.fd = eb.rsvd2 >> 32;
+		pfd.events = POLLIN;
+		poll(&pfd, 1, -1);
+		igt_assert_eq(sync_fence_status(pfd.fd), 1);
+		close(pfd.fd);
+
+		gem_context_destroy(i915, eb.rsvd1);
+	}
+
+	/* Confirm the low priority context is still waiting */
+	igt_assert_eq(t.i915, i915);
+
+	/* Service the fault; releasing the low priority context */
+	memset(&copy, 0, sizeof(copy));
+	copy.dst = msg.arg.pagefault.address;
+	copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
+	copy.len = 4096;
+	do_ioctl(ufd, UFFDIO_COPY, &copy);
+
+	pthread_join(t.thread, NULL);
+
+	gem_close(i915, t.batch);
+	munmap(t.page, 4096);
+	close(ufd);
+}
+
 static void measure_semaphore_power(int i915)
 {
 	struct rapl gpu, pkg;
@@ -1864,6 +2016,9 @@ igt_main
 
 				igt_subtest_f("pi-common-%s", e->name)
 					test_pi_ringfull(fd, eb_ring(e), SHARED);
+
+				igt_subtest_f("pi-userfault-%s", e->name)
+					test_pi_userfault(fd, eb_ring(e));
 			}
 		}
 	}
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Intel-gfx] [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention
@ 2019-11-08 20:49   ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-08 20:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

One of the hardest priority inversion tasks to both handle and to
simulate in testing is inversion due to resource contention. The
challenge is that a high priority context should never be blocked by a
low priority context, even if both are starving for resources --
ideally, at least for some RT OSes, the higher priority context has
first pick of the meagre resources so that it can be executed with
minimum latency.

userfaultfd allows us to handle a page fault in userspace, and so
arbitrary impose a delay on the fault handler, creating a situation
where a low priority context is blocked waiting for the fault. This
blocked context should not prevent a high priority context from being
executed. While the userfault tries to emulate a slow fault (e.g. from a
failing swap device), it is unfortunately limited to a single object
type: the userptr. Hopefully, we will find other ways to impose other
starvation conditions on global resources.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915/gem_exec_schedule.c | 155 +++++++++++++++++++++++++++++++++
 1 file changed, 155 insertions(+)

diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
index 84581bffe..0af7b4c6d 100644
--- a/tests/i915/gem_exec_schedule.c
+++ b/tests/i915/gem_exec_schedule.c
@@ -23,10 +23,16 @@
 
 #include "config.h"
 
+#include <linux/userfaultfd.h>
+
+#include <pthread.h>
 #include <sys/poll.h>
 #include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
 #include <sched.h>
 #include <signal.h>
+#include <unistd.h>
 
 #include "igt.h"
 #include "igt_rand.h"
@@ -1625,6 +1631,152 @@ static void test_pi_ringfull(int fd, unsigned int engine, unsigned int flags)
 	munmap(result, 4096);
 }
 
+static int userfaultfd(int flags)
+{
+	return syscall(SYS_userfaultfd, flags);
+}
+
+struct ufd_thread {
+	pthread_t thread;
+	uint32_t batch;
+	uint32_t *page;
+	unsigned int engine;
+	int i915;
+};
+
+static uint32_t create_userptr(int i915, void *page)
+{
+	uint32_t handle;
+
+	gem_userptr(i915, page, 4096, 0, 0, &handle);
+	return handle;
+}
+
+static void *ufd_thread(void *arg)
+{
+	struct ufd_thread *t = arg;
+	struct drm_i915_gem_exec_object2 obj[2] = {
+		{ .handle = create_userptr(t->i915, t->page) },
+		{ .handle = t->batch },
+	};
+	struct drm_i915_gem_execbuffer2 eb = {
+		.buffers_ptr = to_user_pointer(obj),
+		.buffer_count = ARRAY_SIZE(obj),
+		.flags = t->engine,
+		.rsvd1 = gem_context_create(t->i915),
+	};
+	gem_context_set_priority(t->i915, eb.rsvd1, MIN_PRIO);
+
+	igt_debug("submitting fault\n");
+	gem_execbuf(t->i915, &eb);
+	gem_sync(t->i915, obj[0].handle);
+	gem_close(t->i915, obj[0].handle);
+
+	gem_context_destroy(t->i915, eb.rsvd1);
+
+	t->i915 = -1;
+	return NULL;
+}
+
+static void test_pi_userfault(int i915, unsigned int engine)
+{
+	struct uffdio_api api = { .api = UFFD_API };
+	struct uffdio_register reg;
+	struct uffdio_copy copy;
+	struct uffd_msg msg;
+	struct ufd_thread t;
+	char poison[4096];
+	int ufd;
+
+	/*
+	 * Resource contention can easily lead to priority inversion problems,
+	 * that we wish to avoid. Here, we simulate one simple form of resource
+	 * starvation by using an arbitrary slow userspace fault handler to cause
+	 * the low priority context to block waiting for its resource. While it
+	 * is blocked, it should not prevent a higher priority context from
+	 * executing.
+	 *
+	 * This is only a very simple scenario, in more general tests we will
+	 * need to simulate contention on the shared resource such that both
+	 * low and high priority contexts are starving and must fight over
+	 * the meagre resources. One step at a time.
+	 */
+
+	ufd = userfaultfd(0);
+	igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
+	igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
+		      "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
+
+	t.i915 = i915;
+	t.engine = engine;
+
+	t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
+	igt_assert(t.page != MAP_FAILED);
+
+	t.batch = gem_create(i915, 4096);
+	memset(poison, 0xff, sizeof(poison));
+	gem_write(i915, t.batch, 0, poison, 4096);
+
+	/* Register our fault handler for t.page */
+	memset(&reg, 0, sizeof(reg));
+	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
+	reg.range.start = to_user_pointer(t.page);
+	reg.range.len = 4096;
+	do_ioctl(ufd, UFFDIO_REGISTER, &reg);
+	igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
+
+	/* Kick off the low priority submission */
+	igt_assert(pthread_create(&t.thread, NULL, ufd_thread, &t) == 0);
+
+	/* Wait until the low priority thread is blocked on a fault */
+	igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
+	igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
+	igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
+
+	/* While the low priority context is blocked; execute a vip */
+	if (1) {
+		const uint32_t bbe = MI_BATCH_BUFFER_END;
+		struct drm_i915_gem_exec_object2 obj = {
+			.handle = t.batch,
+		};
+		struct pollfd pfd;
+		struct drm_i915_gem_execbuffer2 eb = {
+			.buffers_ptr = to_user_pointer(&obj),
+			.buffer_count = 1,
+			.flags = engine | I915_EXEC_FENCE_OUT,
+			.rsvd1 = gem_context_create(i915),
+		};
+		gem_context_set_priority(i915, eb.rsvd1, MAX_PRIO);
+		gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
+		gem_execbuf_wr(i915, &eb);
+
+		memset(&pfd, 0, sizeof(pfd));
+		pfd.fd = eb.rsvd2 >> 32;
+		pfd.events = POLLIN;
+		poll(&pfd, 1, -1);
+		igt_assert_eq(sync_fence_status(pfd.fd), 1);
+		close(pfd.fd);
+
+		gem_context_destroy(i915, eb.rsvd1);
+	}
+
+	/* Confirm the low priority context is still waiting */
+	igt_assert_eq(t.i915, i915);
+
+	/* Service the fault; releasing the low priority context */
+	memset(&copy, 0, sizeof(copy));
+	copy.dst = msg.arg.pagefault.address;
+	copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
+	copy.len = 4096;
+	do_ioctl(ufd, UFFDIO_COPY, &copy);
+
+	pthread_join(t.thread, NULL);
+
+	gem_close(i915, t.batch);
+	munmap(t.page, 4096);
+	close(ufd);
+}
+
 static void measure_semaphore_power(int i915)
 {
 	struct rapl gpu, pkg;
@@ -1864,6 +2016,9 @@ igt_main
 
 				igt_subtest_f("pi-common-%s", e->name)
 					test_pi_ringfull(fd, eb_ring(e), SHARED);
+
+				igt_subtest_f("pi-userfault-%s", e->name)
+					test_pi_userfault(fd, eb_ring(e));
 			}
 		}
 	}
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [igt-dev] [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention
@ 2019-11-08 20:49   ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-08 20:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev, Tvrtko Ursulin

One of the hardest priority inversion tasks to both handle and to
simulate in testing is inversion due to resource contention. The
challenge is that a high priority context should never be blocked by a
low priority context, even if both are starving for resources --
ideally, at least for some RT OSes, the higher priority context has
first pick of the meagre resources so that it can be executed with
minimum latency.

userfaultfd allows us to handle a page fault in userspace, and so
arbitrary impose a delay on the fault handler, creating a situation
where a low priority context is blocked waiting for the fault. This
blocked context should not prevent a high priority context from being
executed. While the userfault tries to emulate a slow fault (e.g. from a
failing swap device), it is unfortunately limited to a single object
type: the userptr. Hopefully, we will find other ways to impose other
starvation conditions on global resources.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915/gem_exec_schedule.c | 155 +++++++++++++++++++++++++++++++++
 1 file changed, 155 insertions(+)

diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
index 84581bffe..0af7b4c6d 100644
--- a/tests/i915/gem_exec_schedule.c
+++ b/tests/i915/gem_exec_schedule.c
@@ -23,10 +23,16 @@
 
 #include "config.h"
 
+#include <linux/userfaultfd.h>
+
+#include <pthread.h>
 #include <sys/poll.h>
 #include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
 #include <sched.h>
 #include <signal.h>
+#include <unistd.h>
 
 #include "igt.h"
 #include "igt_rand.h"
@@ -1625,6 +1631,152 @@ static void test_pi_ringfull(int fd, unsigned int engine, unsigned int flags)
 	munmap(result, 4096);
 }
 
+static int userfaultfd(int flags)
+{
+	return syscall(SYS_userfaultfd, flags);
+}
+
+struct ufd_thread {
+	pthread_t thread;
+	uint32_t batch;
+	uint32_t *page;
+	unsigned int engine;
+	int i915;
+};
+
+static uint32_t create_userptr(int i915, void *page)
+{
+	uint32_t handle;
+
+	gem_userptr(i915, page, 4096, 0, 0, &handle);
+	return handle;
+}
+
+static void *ufd_thread(void *arg)
+{
+	struct ufd_thread *t = arg;
+	struct drm_i915_gem_exec_object2 obj[2] = {
+		{ .handle = create_userptr(t->i915, t->page) },
+		{ .handle = t->batch },
+	};
+	struct drm_i915_gem_execbuffer2 eb = {
+		.buffers_ptr = to_user_pointer(obj),
+		.buffer_count = ARRAY_SIZE(obj),
+		.flags = t->engine,
+		.rsvd1 = gem_context_create(t->i915),
+	};
+	gem_context_set_priority(t->i915, eb.rsvd1, MIN_PRIO);
+
+	igt_debug("submitting fault\n");
+	gem_execbuf(t->i915, &eb);
+	gem_sync(t->i915, obj[0].handle);
+	gem_close(t->i915, obj[0].handle);
+
+	gem_context_destroy(t->i915, eb.rsvd1);
+
+	t->i915 = -1;
+	return NULL;
+}
+
+static void test_pi_userfault(int i915, unsigned int engine)
+{
+	struct uffdio_api api = { .api = UFFD_API };
+	struct uffdio_register reg;
+	struct uffdio_copy copy;
+	struct uffd_msg msg;
+	struct ufd_thread t;
+	char poison[4096];
+	int ufd;
+
+	/*
+	 * Resource contention can easily lead to priority inversion problems,
+	 * that we wish to avoid. Here, we simulate one simple form of resource
+	 * starvation by using an arbitrary slow userspace fault handler to cause
+	 * the low priority context to block waiting for its resource. While it
+	 * is blocked, it should not prevent a higher priority context from
+	 * executing.
+	 *
+	 * This is only a very simple scenario, in more general tests we will
+	 * need to simulate contention on the shared resource such that both
+	 * low and high priority contexts are starving and must fight over
+	 * the meagre resources. One step at a time.
+	 */
+
+	ufd = userfaultfd(0);
+	igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
+	igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
+		      "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
+
+	t.i915 = i915;
+	t.engine = engine;
+
+	t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
+	igt_assert(t.page != MAP_FAILED);
+
+	t.batch = gem_create(i915, 4096);
+	memset(poison, 0xff, sizeof(poison));
+	gem_write(i915, t.batch, 0, poison, 4096);
+
+	/* Register our fault handler for t.page */
+	memset(&reg, 0, sizeof(reg));
+	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
+	reg.range.start = to_user_pointer(t.page);
+	reg.range.len = 4096;
+	do_ioctl(ufd, UFFDIO_REGISTER, &reg);
+	igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
+
+	/* Kick off the low priority submission */
+	igt_assert(pthread_create(&t.thread, NULL, ufd_thread, &t) == 0);
+
+	/* Wait until the low priority thread is blocked on a fault */
+	igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
+	igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
+	igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
+
+	/* While the low priority context is blocked; execute a vip */
+	if (1) {
+		const uint32_t bbe = MI_BATCH_BUFFER_END;
+		struct drm_i915_gem_exec_object2 obj = {
+			.handle = t.batch,
+		};
+		struct pollfd pfd;
+		struct drm_i915_gem_execbuffer2 eb = {
+			.buffers_ptr = to_user_pointer(&obj),
+			.buffer_count = 1,
+			.flags = engine | I915_EXEC_FENCE_OUT,
+			.rsvd1 = gem_context_create(i915),
+		};
+		gem_context_set_priority(i915, eb.rsvd1, MAX_PRIO);
+		gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
+		gem_execbuf_wr(i915, &eb);
+
+		memset(&pfd, 0, sizeof(pfd));
+		pfd.fd = eb.rsvd2 >> 32;
+		pfd.events = POLLIN;
+		poll(&pfd, 1, -1);
+		igt_assert_eq(sync_fence_status(pfd.fd), 1);
+		close(pfd.fd);
+
+		gem_context_destroy(i915, eb.rsvd1);
+	}
+
+	/* Confirm the low priority context is still waiting */
+	igt_assert_eq(t.i915, i915);
+
+	/* Service the fault; releasing the low priority context */
+	memset(&copy, 0, sizeof(copy));
+	copy.dst = msg.arg.pagefault.address;
+	copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
+	copy.len = 4096;
+	do_ioctl(ufd, UFFDIO_COPY, &copy);
+
+	pthread_join(t.thread, NULL);
+
+	gem_close(i915, t.batch);
+	munmap(t.page, 4096);
+	close(ufd);
+}
+
 static void measure_semaphore_power(int i915)
 {
 	struct rapl gpu, pkg;
@@ -1864,6 +2016,9 @@ igt_main
 
 				igt_subtest_f("pi-common-%s", e->name)
 					test_pi_ringfull(fd, eb_ring(e), SHARED);
+
+				igt_subtest_f("pi-userfault-%s", e->name)
+					test_pi_userfault(fd, eb_ring(e));
 			}
 		}
 	}
-- 
2.24.0

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention
@ 2019-11-08 21:13     ` Daniel Vetter
  0 siblings, 0 replies; 34+ messages in thread
From: Daniel Vetter @ 2019-11-08 21:13 UTC (permalink / raw)
  To: Chris Wilson, Maarten Lankhorst, Peter Zijlstra
  Cc: IGT development, intel-gfx

On Fri, Nov 8, 2019 at 9:49 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> One of the hardest priority inversion tasks to both handle and to
> simulate in testing is inversion due to resource contention. The
> challenge is that a high priority context should never be blocked by a
> low priority context, even if both are starving for resources --
> ideally, at least for some RT OSes, the higher priority context has
> first pick of the meagre resources so that it can be executed with
> minimum latency.
>
> userfaultfd allows us to handle a page fault in userspace, and so
> arbitrary impose a delay on the fault handler, creating a situation
> where a low priority context is blocked waiting for the fault. This
> blocked context should not prevent a high priority context from being
> executed. While the userfault tries to emulate a slow fault (e.g. from a
> failing swap device), it is unfortunately limited to a single object
> type: the userptr. Hopefully, we will find other ways to impose other
> starvation conditions on global resources.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

So rt-ww_mutexes?

I don't think we want/should do that on the first round of rolling out
ww_mutex in i915. And probably should be handled within ww_mutex&cpu
scheduler core code as RT addition (i.e. if you want RT gpu workloads,
then you need to tell both the cpu and gpu scheduler to treat you
accordingly). But I do think it's possible. Rough idea would be to not
just have a global ticket queue, but also take the rt priority into
account. Prio-inversion boosting should already be possible with the
normal rt-mutex code. I think even boosting for rt-ww_mutexes should
be fine:
- if a non-rt task gets boosted because it holds a non-ww_mutex lock,
then it either needs to start out with a new ww_acquire_ctx ticket
(which will already be boosted), or it can't take new ww_mutex locks
for an existing ticket, since that would be outside the non-ww_mutex
and create a locking inversion (ww_acquire_ctx -> non-ww_mutex ->
ww_mutex goes boom). Hence I don't think we need to retroactively
boost the ticket itself if the contended lock that causes the boosting
is a non-ww_mutex.
- if a non-rt task is holding a ww_mutex lock (which an rt task wants
to acquire), then boosting will only need to happen until it drops
that lock. And we can force that by returning EDEADLCK on the next
ww_mutex_lock, hence again we don't need to be able to boost the
existing ticket itself (since the boosted task can be prevented from
acquiring more ww_mutexes of the same class), only the normal mutex
boosting.

If this ad-hoc analysis is correct and we really don't need to boost
existing tickets, then the normal ww_mutex algo should keep working.
Only difference is that all rt-priority tasks will be able to jump the
queue. Simplest way to implement that would be to shift the rt
priority into the high bits of the ticket (and make sure we handle
wrap-around correctly, since that will happen quite regularly with
this approach, between a highest-rt-priority task and a non-rt task).

What I'm not sure about is if a task gets bosted while hodling a
normal mutex, and before releasing that mutex, grabs a ww_acquire_ctx
(with a boosted ticket), and then drops the normal mutex, whether we
can also drop the boosting of the ticket (the ticket would always jump
the queue to its boosted frequence when we grab it, but then it needs
to be fixed to make sure the ww backoff algo works). But that's a
rather unusual locking scheme, and of the 3 ww_mutex classes we have
in the kernel thus far (drm_modeset_lock, dma_resv and the clock tree
graph lock thing) none do something silly like this. So probably can
be ignored.

Adding Maarten and Peter to this, so they can mentally prepare that
maybe we'll go totally crazy :-)

Cheers, Daniel

> ---
>  tests/i915/gem_exec_schedule.c | 155 +++++++++++++++++++++++++++++++++
>  1 file changed, 155 insertions(+)
>
> diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
> index 84581bffe..0af7b4c6d 100644
> --- a/tests/i915/gem_exec_schedule.c
> +++ b/tests/i915/gem_exec_schedule.c
> @@ -23,10 +23,16 @@
>
>  #include "config.h"
>
> +#include <linux/userfaultfd.h>
> +
> +#include <pthread.h>
>  #include <sys/poll.h>
>  #include <sys/ioctl.h>
> +#include <sys/mman.h>
> +#include <sys/syscall.h>
>  #include <sched.h>
>  #include <signal.h>
> +#include <unistd.h>
>
>  #include "igt.h"
>  #include "igt_rand.h"
> @@ -1625,6 +1631,152 @@ static void test_pi_ringfull(int fd, unsigned int engine, unsigned int flags)
>         munmap(result, 4096);
>  }
>
> +static int userfaultfd(int flags)
> +{
> +       return syscall(SYS_userfaultfd, flags);
> +}
> +
> +struct ufd_thread {
> +       pthread_t thread;
> +       uint32_t batch;
> +       uint32_t *page;
> +       unsigned int engine;
> +       int i915;
> +};
> +
> +static uint32_t create_userptr(int i915, void *page)
> +{
> +       uint32_t handle;
> +
> +       gem_userptr(i915, page, 4096, 0, 0, &handle);
> +       return handle;
> +}
> +
> +static void *ufd_thread(void *arg)
> +{
> +       struct ufd_thread *t = arg;
> +       struct drm_i915_gem_exec_object2 obj[2] = {
> +               { .handle = create_userptr(t->i915, t->page) },
> +               { .handle = t->batch },
> +       };
> +       struct drm_i915_gem_execbuffer2 eb = {
> +               .buffers_ptr = to_user_pointer(obj),
> +               .buffer_count = ARRAY_SIZE(obj),
> +               .flags = t->engine,
> +               .rsvd1 = gem_context_create(t->i915),
> +       };
> +       gem_context_set_priority(t->i915, eb.rsvd1, MIN_PRIO);
> +
> +       igt_debug("submitting fault\n");
> +       gem_execbuf(t->i915, &eb);
> +       gem_sync(t->i915, obj[0].handle);
> +       gem_close(t->i915, obj[0].handle);
> +
> +       gem_context_destroy(t->i915, eb.rsvd1);
> +
> +       t->i915 = -1;
> +       return NULL;
> +}
> +
> +static void test_pi_userfault(int i915, unsigned int engine)
> +{
> +       struct uffdio_api api = { .api = UFFD_API };
> +       struct uffdio_register reg;
> +       struct uffdio_copy copy;
> +       struct uffd_msg msg;
> +       struct ufd_thread t;
> +       char poison[4096];
> +       int ufd;
> +
> +       /*
> +        * Resource contention can easily lead to priority inversion problems,
> +        * that we wish to avoid. Here, we simulate one simple form of resource
> +        * starvation by using an arbitrary slow userspace fault handler to cause
> +        * the low priority context to block waiting for its resource. While it
> +        * is blocked, it should not prevent a higher priority context from
> +        * executing.
> +        *
> +        * This is only a very simple scenario, in more general tests we will
> +        * need to simulate contention on the shared resource such that both
> +        * low and high priority contexts are starving and must fight over
> +        * the meagre resources. One step at a time.
> +        */
> +
> +       ufd = userfaultfd(0);
> +       igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
> +       igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
> +                     "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
> +
> +       t.i915 = i915;
> +       t.engine = engine;
> +
> +       t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
> +       igt_assert(t.page != MAP_FAILED);
> +
> +       t.batch = gem_create(i915, 4096);
> +       memset(poison, 0xff, sizeof(poison));
> +       gem_write(i915, t.batch, 0, poison, 4096);
> +
> +       /* Register our fault handler for t.page */
> +       memset(&reg, 0, sizeof(reg));
> +       reg.mode = UFFDIO_REGISTER_MODE_MISSING;
> +       reg.range.start = to_user_pointer(t.page);
> +       reg.range.len = 4096;
> +       do_ioctl(ufd, UFFDIO_REGISTER, &reg);
> +       igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
> +
> +       /* Kick off the low priority submission */
> +       igt_assert(pthread_create(&t.thread, NULL, ufd_thread, &t) == 0);
> +
> +       /* Wait until the low priority thread is blocked on a fault */
> +       igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
> +       igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
> +       igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
> +
> +       /* While the low priority context is blocked; execute a vip */
> +       if (1) {
> +               const uint32_t bbe = MI_BATCH_BUFFER_END;
> +               struct drm_i915_gem_exec_object2 obj = {
> +                       .handle = t.batch,
> +               };
> +               struct pollfd pfd;
> +               struct drm_i915_gem_execbuffer2 eb = {
> +                       .buffers_ptr = to_user_pointer(&obj),
> +                       .buffer_count = 1,
> +                       .flags = engine | I915_EXEC_FENCE_OUT,
> +                       .rsvd1 = gem_context_create(i915),
> +               };
> +               gem_context_set_priority(i915, eb.rsvd1, MAX_PRIO);
> +               gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
> +               gem_execbuf_wr(i915, &eb);
> +
> +               memset(&pfd, 0, sizeof(pfd));
> +               pfd.fd = eb.rsvd2 >> 32;
> +               pfd.events = POLLIN;
> +               poll(&pfd, 1, -1);
> +               igt_assert_eq(sync_fence_status(pfd.fd), 1);
> +               close(pfd.fd);
> +
> +               gem_context_destroy(i915, eb.rsvd1);
> +       }
> +
> +       /* Confirm the low priority context is still waiting */
> +       igt_assert_eq(t.i915, i915);
> +
> +       /* Service the fault; releasing the low priority context */
> +       memset(&copy, 0, sizeof(copy));
> +       copy.dst = msg.arg.pagefault.address;
> +       copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
> +       copy.len = 4096;
> +       do_ioctl(ufd, UFFDIO_COPY, &copy);
> +
> +       pthread_join(t.thread, NULL);
> +
> +       gem_close(i915, t.batch);
> +       munmap(t.page, 4096);
> +       close(ufd);
> +}
> +
>  static void measure_semaphore_power(int i915)
>  {
>         struct rapl gpu, pkg;
> @@ -1864,6 +2016,9 @@ igt_main
>
>                                 igt_subtest_f("pi-common-%s", e->name)
>                                         test_pi_ringfull(fd, eb_ring(e), SHARED);
> +
> +                               igt_subtest_f("pi-userfault-%s", e->name)
> +                                       test_pi_userfault(fd, eb_ring(e));
>                         }
>                 }
>         }
> --
> 2.24.0
>
> _______________________________________________
> igt-dev mailing list
> igt-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/igt-dev



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention
@ 2019-11-08 21:13     ` Daniel Vetter
  0 siblings, 0 replies; 34+ messages in thread
From: Daniel Vetter @ 2019-11-08 21:13 UTC (permalink / raw)
  To: Chris Wilson, Maarten Lankhorst, Peter Zijlstra
  Cc: IGT development, intel-gfx

On Fri, Nov 8, 2019 at 9:49 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> One of the hardest priority inversion tasks to both handle and to
> simulate in testing is inversion due to resource contention. The
> challenge is that a high priority context should never be blocked by a
> low priority context, even if both are starving for resources --
> ideally, at least for some RT OSes, the higher priority context has
> first pick of the meagre resources so that it can be executed with
> minimum latency.
>
> userfaultfd allows us to handle a page fault in userspace, and so
> arbitrary impose a delay on the fault handler, creating a situation
> where a low priority context is blocked waiting for the fault. This
> blocked context should not prevent a high priority context from being
> executed. While the userfault tries to emulate a slow fault (e.g. from a
> failing swap device), it is unfortunately limited to a single object
> type: the userptr. Hopefully, we will find other ways to impose other
> starvation conditions on global resources.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

So rt-ww_mutexes?

I don't think we want/should do that on the first round of rolling out
ww_mutex in i915. And probably should be handled within ww_mutex&cpu
scheduler core code as RT addition (i.e. if you want RT gpu workloads,
then you need to tell both the cpu and gpu scheduler to treat you
accordingly). But I do think it's possible. Rough idea would be to not
just have a global ticket queue, but also take the rt priority into
account. Prio-inversion boosting should already be possible with the
normal rt-mutex code. I think even boosting for rt-ww_mutexes should
be fine:
- if a non-rt task gets boosted because it holds a non-ww_mutex lock,
then it either needs to start out with a new ww_acquire_ctx ticket
(which will already be boosted), or it can't take new ww_mutex locks
for an existing ticket, since that would be outside the non-ww_mutex
and create a locking inversion (ww_acquire_ctx -> non-ww_mutex ->
ww_mutex goes boom). Hence I don't think we need to retroactively
boost the ticket itself if the contended lock that causes the boosting
is a non-ww_mutex.
- if a non-rt task is holding a ww_mutex lock (which an rt task wants
to acquire), then boosting will only need to happen until it drops
that lock. And we can force that by returning EDEADLCK on the next
ww_mutex_lock, hence again we don't need to be able to boost the
existing ticket itself (since the boosted task can be prevented from
acquiring more ww_mutexes of the same class), only the normal mutex
boosting.

If this ad-hoc analysis is correct and we really don't need to boost
existing tickets, then the normal ww_mutex algo should keep working.
Only difference is that all rt-priority tasks will be able to jump the
queue. Simplest way to implement that would be to shift the rt
priority into the high bits of the ticket (and make sure we handle
wrap-around correctly, since that will happen quite regularly with
this approach, between a highest-rt-priority task and a non-rt task).

What I'm not sure about is if a task gets bosted while hodling a
normal mutex, and before releasing that mutex, grabs a ww_acquire_ctx
(with a boosted ticket), and then drops the normal mutex, whether we
can also drop the boosting of the ticket (the ticket would always jump
the queue to its boosted frequence when we grab it, but then it needs
to be fixed to make sure the ww backoff algo works). But that's a
rather unusual locking scheme, and of the 3 ww_mutex classes we have
in the kernel thus far (drm_modeset_lock, dma_resv and the clock tree
graph lock thing) none do something silly like this. So probably can
be ignored.

Adding Maarten and Peter to this, so they can mentally prepare that
maybe we'll go totally crazy :-)

Cheers, Daniel

> ---
>  tests/i915/gem_exec_schedule.c | 155 +++++++++++++++++++++++++++++++++
>  1 file changed, 155 insertions(+)
>
> diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
> index 84581bffe..0af7b4c6d 100644
> --- a/tests/i915/gem_exec_schedule.c
> +++ b/tests/i915/gem_exec_schedule.c
> @@ -23,10 +23,16 @@
>
>  #include "config.h"
>
> +#include <linux/userfaultfd.h>
> +
> +#include <pthread.h>
>  #include <sys/poll.h>
>  #include <sys/ioctl.h>
> +#include <sys/mman.h>
> +#include <sys/syscall.h>
>  #include <sched.h>
>  #include <signal.h>
> +#include <unistd.h>
>
>  #include "igt.h"
>  #include "igt_rand.h"
> @@ -1625,6 +1631,152 @@ static void test_pi_ringfull(int fd, unsigned int engine, unsigned int flags)
>         munmap(result, 4096);
>  }
>
> +static int userfaultfd(int flags)
> +{
> +       return syscall(SYS_userfaultfd, flags);
> +}
> +
> +struct ufd_thread {
> +       pthread_t thread;
> +       uint32_t batch;
> +       uint32_t *page;
> +       unsigned int engine;
> +       int i915;
> +};
> +
> +static uint32_t create_userptr(int i915, void *page)
> +{
> +       uint32_t handle;
> +
> +       gem_userptr(i915, page, 4096, 0, 0, &handle);
> +       return handle;
> +}
> +
> +static void *ufd_thread(void *arg)
> +{
> +       struct ufd_thread *t = arg;
> +       struct drm_i915_gem_exec_object2 obj[2] = {
> +               { .handle = create_userptr(t->i915, t->page) },
> +               { .handle = t->batch },
> +       };
> +       struct drm_i915_gem_execbuffer2 eb = {
> +               .buffers_ptr = to_user_pointer(obj),
> +               .buffer_count = ARRAY_SIZE(obj),
> +               .flags = t->engine,
> +               .rsvd1 = gem_context_create(t->i915),
> +       };
> +       gem_context_set_priority(t->i915, eb.rsvd1, MIN_PRIO);
> +
> +       igt_debug("submitting fault\n");
> +       gem_execbuf(t->i915, &eb);
> +       gem_sync(t->i915, obj[0].handle);
> +       gem_close(t->i915, obj[0].handle);
> +
> +       gem_context_destroy(t->i915, eb.rsvd1);
> +
> +       t->i915 = -1;
> +       return NULL;
> +}
> +
> +static void test_pi_userfault(int i915, unsigned int engine)
> +{
> +       struct uffdio_api api = { .api = UFFD_API };
> +       struct uffdio_register reg;
> +       struct uffdio_copy copy;
> +       struct uffd_msg msg;
> +       struct ufd_thread t;
> +       char poison[4096];
> +       int ufd;
> +
> +       /*
> +        * Resource contention can easily lead to priority inversion problems,
> +        * that we wish to avoid. Here, we simulate one simple form of resource
> +        * starvation by using an arbitrary slow userspace fault handler to cause
> +        * the low priority context to block waiting for its resource. While it
> +        * is blocked, it should not prevent a higher priority context from
> +        * executing.
> +        *
> +        * This is only a very simple scenario, in more general tests we will
> +        * need to simulate contention on the shared resource such that both
> +        * low and high priority contexts are starving and must fight over
> +        * the meagre resources. One step at a time.
> +        */
> +
> +       ufd = userfaultfd(0);
> +       igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
> +       igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
> +                     "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
> +
> +       t.i915 = i915;
> +       t.engine = engine;
> +
> +       t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
> +       igt_assert(t.page != MAP_FAILED);
> +
> +       t.batch = gem_create(i915, 4096);
> +       memset(poison, 0xff, sizeof(poison));
> +       gem_write(i915, t.batch, 0, poison, 4096);
> +
> +       /* Register our fault handler for t.page */
> +       memset(&reg, 0, sizeof(reg));
> +       reg.mode = UFFDIO_REGISTER_MODE_MISSING;
> +       reg.range.start = to_user_pointer(t.page);
> +       reg.range.len = 4096;
> +       do_ioctl(ufd, UFFDIO_REGISTER, &reg);
> +       igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
> +
> +       /* Kick off the low priority submission */
> +       igt_assert(pthread_create(&t.thread, NULL, ufd_thread, &t) == 0);
> +
> +       /* Wait until the low priority thread is blocked on a fault */
> +       igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
> +       igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
> +       igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
> +
> +       /* While the low priority context is blocked; execute a vip */
> +       if (1) {
> +               const uint32_t bbe = MI_BATCH_BUFFER_END;
> +               struct drm_i915_gem_exec_object2 obj = {
> +                       .handle = t.batch,
> +               };
> +               struct pollfd pfd;
> +               struct drm_i915_gem_execbuffer2 eb = {
> +                       .buffers_ptr = to_user_pointer(&obj),
> +                       .buffer_count = 1,
> +                       .flags = engine | I915_EXEC_FENCE_OUT,
> +                       .rsvd1 = gem_context_create(i915),
> +               };
> +               gem_context_set_priority(i915, eb.rsvd1, MAX_PRIO);
> +               gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
> +               gem_execbuf_wr(i915, &eb);
> +
> +               memset(&pfd, 0, sizeof(pfd));
> +               pfd.fd = eb.rsvd2 >> 32;
> +               pfd.events = POLLIN;
> +               poll(&pfd, 1, -1);
> +               igt_assert_eq(sync_fence_status(pfd.fd), 1);
> +               close(pfd.fd);
> +
> +               gem_context_destroy(i915, eb.rsvd1);
> +       }
> +
> +       /* Confirm the low priority context is still waiting */
> +       igt_assert_eq(t.i915, i915);
> +
> +       /* Service the fault; releasing the low priority context */
> +       memset(&copy, 0, sizeof(copy));
> +       copy.dst = msg.arg.pagefault.address;
> +       copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
> +       copy.len = 4096;
> +       do_ioctl(ufd, UFFDIO_COPY, &copy);
> +
> +       pthread_join(t.thread, NULL);
> +
> +       gem_close(i915, t.batch);
> +       munmap(t.page, 4096);
> +       close(ufd);
> +}
> +
>  static void measure_semaphore_power(int i915)
>  {
>         struct rapl gpu, pkg;
> @@ -1864,6 +2016,9 @@ igt_main
>
>                                 igt_subtest_f("pi-common-%s", e->name)
>                                         test_pi_ringfull(fd, eb_ring(e), SHARED);
> +
> +                               igt_subtest_f("pi-userfault-%s", e->name)
> +                                       test_pi_userfault(fd, eb_ring(e));
>                         }
>                 }
>         }
> --
> 2.24.0
>
> _______________________________________________
> igt-dev mailing list
> igt-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/igt-dev



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention
@ 2019-11-08 21:13     ` Daniel Vetter
  0 siblings, 0 replies; 34+ messages in thread
From: Daniel Vetter @ 2019-11-08 21:13 UTC (permalink / raw)
  To: Chris Wilson, Maarten Lankhorst, Peter Zijlstra
  Cc: IGT development, intel-gfx, Tvrtko Ursulin

On Fri, Nov 8, 2019 at 9:49 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> One of the hardest priority inversion tasks to both handle and to
> simulate in testing is inversion due to resource contention. The
> challenge is that a high priority context should never be blocked by a
> low priority context, even if both are starving for resources --
> ideally, at least for some RT OSes, the higher priority context has
> first pick of the meagre resources so that it can be executed with
> minimum latency.
>
> userfaultfd allows us to handle a page fault in userspace, and so
> arbitrary impose a delay on the fault handler, creating a situation
> where a low priority context is blocked waiting for the fault. This
> blocked context should not prevent a high priority context from being
> executed. While the userfault tries to emulate a slow fault (e.g. from a
> failing swap device), it is unfortunately limited to a single object
> type: the userptr. Hopefully, we will find other ways to impose other
> starvation conditions on global resources.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

So rt-ww_mutexes?

I don't think we want/should do that on the first round of rolling out
ww_mutex in i915. And probably should be handled within ww_mutex&cpu
scheduler core code as RT addition (i.e. if you want RT gpu workloads,
then you need to tell both the cpu and gpu scheduler to treat you
accordingly). But I do think it's possible. Rough idea would be to not
just have a global ticket queue, but also take the rt priority into
account. Prio-inversion boosting should already be possible with the
normal rt-mutex code. I think even boosting for rt-ww_mutexes should
be fine:
- if a non-rt task gets boosted because it holds a non-ww_mutex lock,
then it either needs to start out with a new ww_acquire_ctx ticket
(which will already be boosted), or it can't take new ww_mutex locks
for an existing ticket, since that would be outside the non-ww_mutex
and create a locking inversion (ww_acquire_ctx -> non-ww_mutex ->
ww_mutex goes boom). Hence I don't think we need to retroactively
boost the ticket itself if the contended lock that causes the boosting
is a non-ww_mutex.
- if a non-rt task is holding a ww_mutex lock (which an rt task wants
to acquire), then boosting will only need to happen until it drops
that lock. And we can force that by returning EDEADLCK on the next
ww_mutex_lock, hence again we don't need to be able to boost the
existing ticket itself (since the boosted task can be prevented from
acquiring more ww_mutexes of the same class), only the normal mutex
boosting.

If this ad-hoc analysis is correct and we really don't need to boost
existing tickets, then the normal ww_mutex algo should keep working.
Only difference is that all rt-priority tasks will be able to jump the
queue. Simplest way to implement that would be to shift the rt
priority into the high bits of the ticket (and make sure we handle
wrap-around correctly, since that will happen quite regularly with
this approach, between a highest-rt-priority task and a non-rt task).

What I'm not sure about is if a task gets bosted while hodling a
normal mutex, and before releasing that mutex, grabs a ww_acquire_ctx
(with a boosted ticket), and then drops the normal mutex, whether we
can also drop the boosting of the ticket (the ticket would always jump
the queue to its boosted frequence when we grab it, but then it needs
to be fixed to make sure the ww backoff algo works). But that's a
rather unusual locking scheme, and of the 3 ww_mutex classes we have
in the kernel thus far (drm_modeset_lock, dma_resv and the clock tree
graph lock thing) none do something silly like this. So probably can
be ignored.

Adding Maarten and Peter to this, so they can mentally prepare that
maybe we'll go totally crazy :-)

Cheers, Daniel

> ---
>  tests/i915/gem_exec_schedule.c | 155 +++++++++++++++++++++++++++++++++
>  1 file changed, 155 insertions(+)
>
> diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
> index 84581bffe..0af7b4c6d 100644
> --- a/tests/i915/gem_exec_schedule.c
> +++ b/tests/i915/gem_exec_schedule.c
> @@ -23,10 +23,16 @@
>
>  #include "config.h"
>
> +#include <linux/userfaultfd.h>
> +
> +#include <pthread.h>
>  #include <sys/poll.h>
>  #include <sys/ioctl.h>
> +#include <sys/mman.h>
> +#include <sys/syscall.h>
>  #include <sched.h>
>  #include <signal.h>
> +#include <unistd.h>
>
>  #include "igt.h"
>  #include "igt_rand.h"
> @@ -1625,6 +1631,152 @@ static void test_pi_ringfull(int fd, unsigned int engine, unsigned int flags)
>         munmap(result, 4096);
>  }
>
> +static int userfaultfd(int flags)
> +{
> +       return syscall(SYS_userfaultfd, flags);
> +}
> +
> +struct ufd_thread {
> +       pthread_t thread;
> +       uint32_t batch;
> +       uint32_t *page;
> +       unsigned int engine;
> +       int i915;
> +};
> +
> +static uint32_t create_userptr(int i915, void *page)
> +{
> +       uint32_t handle;
> +
> +       gem_userptr(i915, page, 4096, 0, 0, &handle);
> +       return handle;
> +}
> +
> +static void *ufd_thread(void *arg)
> +{
> +       struct ufd_thread *t = arg;
> +       struct drm_i915_gem_exec_object2 obj[2] = {
> +               { .handle = create_userptr(t->i915, t->page) },
> +               { .handle = t->batch },
> +       };
> +       struct drm_i915_gem_execbuffer2 eb = {
> +               .buffers_ptr = to_user_pointer(obj),
> +               .buffer_count = ARRAY_SIZE(obj),
> +               .flags = t->engine,
> +               .rsvd1 = gem_context_create(t->i915),
> +       };
> +       gem_context_set_priority(t->i915, eb.rsvd1, MIN_PRIO);
> +
> +       igt_debug("submitting fault\n");
> +       gem_execbuf(t->i915, &eb);
> +       gem_sync(t->i915, obj[0].handle);
> +       gem_close(t->i915, obj[0].handle);
> +
> +       gem_context_destroy(t->i915, eb.rsvd1);
> +
> +       t->i915 = -1;
> +       return NULL;
> +}
> +
> +static void test_pi_userfault(int i915, unsigned int engine)
> +{
> +       struct uffdio_api api = { .api = UFFD_API };
> +       struct uffdio_register reg;
> +       struct uffdio_copy copy;
> +       struct uffd_msg msg;
> +       struct ufd_thread t;
> +       char poison[4096];
> +       int ufd;
> +
> +       /*
> +        * Resource contention can easily lead to priority inversion problems,
> +        * that we wish to avoid. Here, we simulate one simple form of resource
> +        * starvation by using an arbitrary slow userspace fault handler to cause
> +        * the low priority context to block waiting for its resource. While it
> +        * is blocked, it should not prevent a higher priority context from
> +        * executing.
> +        *
> +        * This is only a very simple scenario, in more general tests we will
> +        * need to simulate contention on the shared resource such that both
> +        * low and high priority contexts are starving and must fight over
> +        * the meagre resources. One step at a time.
> +        */
> +
> +       ufd = userfaultfd(0);
> +       igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
> +       igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
> +                     "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
> +
> +       t.i915 = i915;
> +       t.engine = engine;
> +
> +       t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
> +       igt_assert(t.page != MAP_FAILED);
> +
> +       t.batch = gem_create(i915, 4096);
> +       memset(poison, 0xff, sizeof(poison));
> +       gem_write(i915, t.batch, 0, poison, 4096);
> +
> +       /* Register our fault handler for t.page */
> +       memset(&reg, 0, sizeof(reg));
> +       reg.mode = UFFDIO_REGISTER_MODE_MISSING;
> +       reg.range.start = to_user_pointer(t.page);
> +       reg.range.len = 4096;
> +       do_ioctl(ufd, UFFDIO_REGISTER, &reg);
> +       igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
> +
> +       /* Kick off the low priority submission */
> +       igt_assert(pthread_create(&t.thread, NULL, ufd_thread, &t) == 0);
> +
> +       /* Wait until the low priority thread is blocked on a fault */
> +       igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
> +       igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
> +       igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
> +
> +       /* While the low priority context is blocked; execute a vip */
> +       if (1) {
> +               const uint32_t bbe = MI_BATCH_BUFFER_END;
> +               struct drm_i915_gem_exec_object2 obj = {
> +                       .handle = t.batch,
> +               };
> +               struct pollfd pfd;
> +               struct drm_i915_gem_execbuffer2 eb = {
> +                       .buffers_ptr = to_user_pointer(&obj),
> +                       .buffer_count = 1,
> +                       .flags = engine | I915_EXEC_FENCE_OUT,
> +                       .rsvd1 = gem_context_create(i915),
> +               };
> +               gem_context_set_priority(i915, eb.rsvd1, MAX_PRIO);
> +               gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
> +               gem_execbuf_wr(i915, &eb);
> +
> +               memset(&pfd, 0, sizeof(pfd));
> +               pfd.fd = eb.rsvd2 >> 32;
> +               pfd.events = POLLIN;
> +               poll(&pfd, 1, -1);
> +               igt_assert_eq(sync_fence_status(pfd.fd), 1);
> +               close(pfd.fd);
> +
> +               gem_context_destroy(i915, eb.rsvd1);
> +       }
> +
> +       /* Confirm the low priority context is still waiting */
> +       igt_assert_eq(t.i915, i915);
> +
> +       /* Service the fault; releasing the low priority context */
> +       memset(&copy, 0, sizeof(copy));
> +       copy.dst = msg.arg.pagefault.address;
> +       copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
> +       copy.len = 4096;
> +       do_ioctl(ufd, UFFDIO_COPY, &copy);
> +
> +       pthread_join(t.thread, NULL);
> +
> +       gem_close(i915, t.batch);
> +       munmap(t.page, 4096);
> +       close(ufd);
> +}
> +
>  static void measure_semaphore_power(int i915)
>  {
>         struct rapl gpu, pkg;
> @@ -1864,6 +2016,9 @@ igt_main
>
>                                 igt_subtest_f("pi-common-%s", e->name)
>                                         test_pi_ringfull(fd, eb_ring(e), SHARED);
> +
> +                               igt_subtest_f("pi-userfault-%s", e->name)
> +                                       test_pi_userfault(fd, eb_ring(e));
>                         }
>                 }
>         }
> --
> 2.24.0
>
> _______________________________________________
> igt-dev mailing list
> igt-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/igt-dev



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [igt-dev] ✗ GitLab.Pipeline: failure for series starting with [i-g-t,1/3] i915/gem_exec_schedule: Split pi-ringfull into two tests
  2019-11-08 20:49 ` [Intel-gfx] " Chris Wilson
                   ` (2 preceding siblings ...)
  (?)
@ 2019-11-08 21:14 ` Patchwork
  -1 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2019-11-08 21:14 UTC (permalink / raw)
  To: Chris Wilson; +Cc: igt-dev

== Series Details ==

Series: series starting with [i-g-t,1/3] i915/gem_exec_schedule: Split pi-ringfull into two tests
URL   : https://patchwork.freedesktop.org/series/69219/
State : failure

== Summary ==

ERROR! This series introduces new undocumented tests:

gem_exec_schedule@pi-common-blt
gem_exec_schedule@pi-common-bsd
gem_exec_schedule@pi-common-bsd1
gem_exec_schedule@pi-common-bsd2
gem_exec_schedule@pi-common-render
gem_exec_schedule@pi-common-vebox
gem_exec_schedule@pi-userfault-blt
gem_exec_schedule@pi-userfault-bsd
gem_exec_schedule@pi-userfault-bsd1
gem_exec_schedule@pi-userfault-bsd2
gem_exec_schedule@pi-userfault-render
gem_exec_schedule@pi-userfault-vebox
gem_userptr_blits@userfault

Can you document them as per the requirement in the [CONTRIBUTING.md]?

[Documentation] has more details on how to do this.

Here are few examples:
https://gitlab.freedesktop.org/drm/igt-gpu-tools/commit/0316695d03aa46108296b27f3982ec93200c7a6e
https://gitlab.freedesktop.org/drm/igt-gpu-tools/commit/443cc658e1e6b492ee17bf4f4d891029eb7a205d

Thanks in advance!

[CONTRIBUTING.md]: https://gitlab.freedesktop.org/drm/igt-gpu-tools/blob/master/CONTRIBUTING.md#L19
[Documentation]: https://drm.pages.freedesktop.org/igt-gpu-tools/igt-gpu-tools-Core.html#igt-describe

Other than that, pipeline status: SUCCESS.

see https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/pipelines/77550 for more details

== Logs ==

For more details see: https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/pipelines/77550
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention
@ 2019-11-08 21:22       ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-08 21:22 UTC (permalink / raw)
  To: Daniel Vetter, Maarten Lankhorst, Peter Zijlstra
  Cc: IGT development, intel-gfx

Quoting Daniel Vetter (2019-11-08 21:13:13)
> On Fri, Nov 8, 2019 at 9:49 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > One of the hardest priority inversion tasks to both handle and to
> > simulate in testing is inversion due to resource contention. The
> > challenge is that a high priority context should never be blocked by a
> > low priority context, even if both are starving for resources --
> > ideally, at least for some RT OSes, the higher priority context has
> > first pick of the meagre resources so that it can be executed with
> > minimum latency.
> >
> > userfaultfd allows us to handle a page fault in userspace, and so
> > arbitrary impose a delay on the fault handler, creating a situation
> > where a low priority context is blocked waiting for the fault. This
> > blocked context should not prevent a high priority context from being
> > executed. While the userfault tries to emulate a slow fault (e.g. from a
> > failing swap device), it is unfortunately limited to a single object
> > type: the userptr. Hopefully, we will find other ways to impose other
> > starvation conditions on global resources.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> So rt-ww_mutexes?
> 
> I don't think we want/should do that on the first round of rolling out
> ww_mutex in i915.

It works today. And will continue working across any conversion.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention
@ 2019-11-08 21:22       ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-08 21:22 UTC (permalink / raw)
  To: Daniel Vetter, Maarten Lankhorst, Peter Zijlstra
  Cc: IGT development, intel-gfx

Quoting Daniel Vetter (2019-11-08 21:13:13)
> On Fri, Nov 8, 2019 at 9:49 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > One of the hardest priority inversion tasks to both handle and to
> > simulate in testing is inversion due to resource contention. The
> > challenge is that a high priority context should never be blocked by a
> > low priority context, even if both are starving for resources --
> > ideally, at least for some RT OSes, the higher priority context has
> > first pick of the meagre resources so that it can be executed with
> > minimum latency.
> >
> > userfaultfd allows us to handle a page fault in userspace, and so
> > arbitrary impose a delay on the fault handler, creating a situation
> > where a low priority context is blocked waiting for the fault. This
> > blocked context should not prevent a high priority context from being
> > executed. While the userfault tries to emulate a slow fault (e.g. from a
> > failing swap device), it is unfortunately limited to a single object
> > type: the userptr. Hopefully, we will find other ways to impose other
> > starvation conditions on global resources.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> So rt-ww_mutexes?
> 
> I don't think we want/should do that on the first round of rolling out
> ww_mutex in i915.

It works today. And will continue working across any conversion.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention
@ 2019-11-08 21:22       ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-08 21:22 UTC (permalink / raw)
  To: Daniel Vetter, Maarten Lankhorst, Peter Zijlstra
  Cc: IGT development, intel-gfx, Tvrtko Ursulin

Quoting Daniel Vetter (2019-11-08 21:13:13)
> On Fri, Nov 8, 2019 at 9:49 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > One of the hardest priority inversion tasks to both handle and to
> > simulate in testing is inversion due to resource contention. The
> > challenge is that a high priority context should never be blocked by a
> > low priority context, even if both are starving for resources --
> > ideally, at least for some RT OSes, the higher priority context has
> > first pick of the meagre resources so that it can be executed with
> > minimum latency.
> >
> > userfaultfd allows us to handle a page fault in userspace, and so
> > arbitrary impose a delay on the fault handler, creating a situation
> > where a low priority context is blocked waiting for the fault. This
> > blocked context should not prevent a high priority context from being
> > executed. While the userfault tries to emulate a slow fault (e.g. from a
> > failing swap device), it is unfortunately limited to a single object
> > type: the userptr. Hopefully, we will find other ways to impose other
> > starvation conditions on global resources.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> So rt-ww_mutexes?
> 
> I don't think we want/should do that on the first round of rolling out
> ww_mutex in i915.

It works today. And will continue working across any conversion.
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [igt-dev] ✓ Fi.CI.BAT: success for series starting with [i-g-t,1/3] i915/gem_exec_schedule: Split pi-ringfull into two tests
  2019-11-08 20:49 ` [Intel-gfx] " Chris Wilson
                   ` (3 preceding siblings ...)
  (?)
@ 2019-11-08 21:46 ` Patchwork
  -1 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2019-11-08 21:46 UTC (permalink / raw)
  To: Chris Wilson; +Cc: igt-dev

== Series Details ==

Series: series starting with [i-g-t,1/3] i915/gem_exec_schedule: Split pi-ringfull into two tests
URL   : https://patchwork.freedesktop.org/series/69219/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7299 -> IGTPW_3675
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/index.html

Known issues
------------

  Here are the changes found in IGTPW_3675 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3:
    - fi-cml-s:           [PASS][1] -> [DMESG-WARN][2] ([fdo#111764])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/fi-cml-s/igt@gem_exec_suspend@basic-s3.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/fi-cml-s/igt@gem_exec_suspend@basic-s3.html

  * igt@i915_selftest@live_gem_contexts:
    - fi-bsw-n3050:       [PASS][3] -> [INCOMPLETE][4] ([fdo# 111542])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/fi-bsw-n3050/igt@i915_selftest@live_gem_contexts.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/fi-bsw-n3050/igt@i915_selftest@live_gem_contexts.html

  
#### Possible fixes ####

  * igt@i915_selftest@live_blt:
    - fi-hsw-peppy:       [DMESG-FAIL][5] ([fdo#112147]) -> [PASS][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/fi-hsw-peppy/igt@i915_selftest@live_blt.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/fi-hsw-peppy/igt@i915_selftest@live_blt.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo# 111542]: https://bugs.freedesktop.org/show_bug.cgi?id= 111542
  [fdo#111736]: https://bugs.freedesktop.org/show_bug.cgi?id=111736
  [fdo#111764]: https://bugs.freedesktop.org/show_bug.cgi?id=111764
  [fdo#112147]: https://bugs.freedesktop.org/show_bug.cgi?id=112147


Participating hosts (51 -> 45)
------------------------------

  Missing    (6): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_5268 -> IGTPW_3675

  CI-20190529: 20190529
  CI_DRM_7299: e7de48a8b1161a99f4b8e4483bc1bb85f5d31039 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_3675: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/index.html
  IGT_5268: c94958b8f7caefcda72392417ae6f3a98e36a48b @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools



== Testlist changes ==

+igt@gem_exec_schedule@pi-common-blt
+igt@gem_exec_schedule@pi-common-bsd
+igt@gem_exec_schedule@pi-common-bsd1
+igt@gem_exec_schedule@pi-common-bsd2
+igt@gem_exec_schedule@pi-common-render
+igt@gem_exec_schedule@pi-common-vebox
+igt@gem_exec_schedule@pi-userfault-blt
+igt@gem_exec_schedule@pi-userfault-bsd
+igt@gem_exec_schedule@pi-userfault-bsd1
+igt@gem_exec_schedule@pi-userfault-bsd2
+igt@gem_exec_schedule@pi-userfault-render
+igt@gem_exec_schedule@pi-userfault-vebox
+igt@gem_userptr_blits@userfault

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/index.html
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [igt-dev] ✓ Fi.CI.IGT: success for series starting with [i-g-t,1/3] i915/gem_exec_schedule: Split pi-ringfull into two tests
  2019-11-08 20:49 ` [Intel-gfx] " Chris Wilson
                   ` (4 preceding siblings ...)
  (?)
@ 2019-11-10 18:03 ` Patchwork
  -1 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2019-11-10 18:03 UTC (permalink / raw)
  To: Chris Wilson; +Cc: igt-dev

== Series Details ==

Series: series starting with [i-g-t,1/3] i915/gem_exec_schedule: Split pi-ringfull into two tests
URL   : https://patchwork.freedesktop.org/series/69219/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7299_full -> IGTPW_3675_full
====================================================

Summary
-------

  **WARNING**

  Minor unknown changes coming with IGTPW_3675_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in IGTPW_3675_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_3675_full:

### IGT changes ###

#### Possible regressions ####

  * {igt@gem_exec_schedule@pi-userfault-blt} (NEW):
    - shard-iclb:         NOTRUN -> [SKIP][1] +4 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb6/igt@gem_exec_schedule@pi-userfault-blt.html

  * {igt@gem_exec_schedule@pi-userfault-render} (NEW):
    - shard-tglb:         NOTRUN -> [SKIP][2] +5 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-tglb7/igt@gem_exec_schedule@pi-userfault-render.html

  
#### Warnings ####

  * igt@i915_pm_dc@dc6-psr:
    - shard-tglb:         [SKIP][3] ([fdo#111865]) -> [FAIL][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-tglb6/igt@i915_pm_dc@dc6-psr.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-tglb4/igt@i915_pm_dc@dc6-psr.html

  
New tests
---------

  New tests have been introduced between CI_DRM_7299_full and IGTPW_3675_full:

### New IGT tests (13) ###

  * igt@gem_exec_schedule@pi-common-blt:
    - Statuses : 5 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.08] s

  * igt@gem_exec_schedule@pi-common-bsd:
    - Statuses : 3 pass(s) 4 skip(s)
    - Exec time: [0.0, 0.08] s

  * igt@gem_exec_schedule@pi-common-bsd1:
    - Statuses : 2 pass(s) 5 skip(s)
    - Exec time: [0.0, 0.07] s

  * igt@gem_exec_schedule@pi-common-bsd2:
    - Statuses : 2 pass(s) 5 skip(s)
    - Exec time: [0.0, 0.08] s

  * igt@gem_exec_schedule@pi-common-render:
    - Statuses : 5 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.08] s

  * igt@gem_exec_schedule@pi-common-vebox:
    - Statuses : 5 pass(s) 2 skip(s)
    - Exec time: [0.0, 0.08] s

  * igt@gem_exec_schedule@pi-userfault-blt:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@gem_exec_schedule@pi-userfault-bsd:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@gem_exec_schedule@pi-userfault-bsd1:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@gem_exec_schedule@pi-userfault-bsd2:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@gem_exec_schedule@pi-userfault-render:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@gem_exec_schedule@pi-userfault-vebox:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@gem_userptr_blits@userfault:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  

Known issues
------------

  Here are the changes found in IGTPW_3675_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@rcs0-s3:
    - shard-kbl:          [PASS][5] -> [DMESG-WARN][6] ([fdo#108566]) +2 similar issues
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-kbl2/igt@gem_ctx_isolation@rcs0-s3.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-kbl3/igt@gem_ctx_isolation@rcs0-s3.html

  * igt@gem_ctx_isolation@vcs1-none:
    - shard-iclb:         [PASS][7] -> [SKIP][8] ([fdo#109276] / [fdo#112080]) +4 similar issues
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb1/igt@gem_ctx_isolation@vcs1-none.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb5/igt@gem_ctx_isolation@vcs1-none.html

  * igt@gem_ctx_switch@vcs1-heavy:
    - shard-iclb:         [PASS][9] -> [SKIP][10] ([fdo#112080]) +12 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb4/igt@gem_ctx_switch@vcs1-heavy.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb3/igt@gem_ctx_switch@vcs1-heavy.html

  * igt@gem_exec_balancer@smoke:
    - shard-iclb:         [PASS][11] -> [SKIP][12] ([fdo#110854])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb4/igt@gem_exec_balancer@smoke.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb6/igt@gem_exec_balancer@smoke.html

  * igt@gem_exec_schedule@in-order-bsd:
    - shard-iclb:         [PASS][13] -> [SKIP][14] ([fdo#112146]) +8 similar issues
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb3/igt@gem_exec_schedule@in-order-bsd.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb4/igt@gem_exec_schedule@in-order-bsd.html

  * igt@gem_exec_schedule@out-order-bsd2:
    - shard-iclb:         [PASS][15] -> [SKIP][16] ([fdo#109276]) +11 similar issues
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb1/igt@gem_exec_schedule@out-order-bsd2.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb6/igt@gem_exec_schedule@out-order-bsd2.html

  * igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrashing:
    - shard-kbl:          [PASS][17] -> [FAIL][18] ([fdo#112037])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-kbl6/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrashing.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-kbl7/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrashing.html

  * igt@gem_userptr_blits@map-fixed-invalidate-busy:
    - shard-snb:          [PASS][19] -> [DMESG-WARN][20] ([fdo#111870]) +2 similar issues
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-snb4/igt@gem_userptr_blits@map-fixed-invalidate-busy.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-snb4/igt@gem_userptr_blits@map-fixed-invalidate-busy.html

  * igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy:
    - shard-hsw:          [PASS][21] -> [DMESG-WARN][22] ([fdo#111870]) +2 similar issues
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-hsw5/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-hsw2/igt@gem_userptr_blits@map-fixed-invalidate-overlap-busy.html

  * igt@gem_workarounds@suspend-resume-fd:
    - shard-tglb:         [PASS][23] -> [INCOMPLETE][24] ([fdo#111832] / [fdo#111850]) +1 similar issue
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-tglb8/igt@gem_workarounds@suspend-resume-fd.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-tglb2/igt@gem_workarounds@suspend-resume-fd.html

  * igt@i915_suspend@fence-restore-tiled2untiled:
    - shard-apl:          [PASS][25] -> [DMESG-WARN][26] ([fdo#108566]) +3 similar issues
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-apl3/igt@i915_suspend@fence-restore-tiled2untiled.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-apl1/igt@i915_suspend@fence-restore-tiled2untiled.html

  * igt@kms_color@pipe-a-ctm-blue-to-red:
    - shard-kbl:          [PASS][27] -> [FAIL][28] ([fdo#107201])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-kbl6/igt@kms_color@pipe-a-ctm-blue-to-red.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-kbl3/igt@kms_color@pipe-a-ctm-blue-to-red.html
    - shard-apl:          [PASS][29] -> [FAIL][30] ([fdo#107201])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-apl7/igt@kms_color@pipe-a-ctm-blue-to-red.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-apl4/igt@kms_color@pipe-a-ctm-blue-to-red.html

  * igt@kms_cursor_crc@pipe-a-cursor-256x85-onscreen:
    - shard-kbl:          [PASS][31] -> [FAIL][32] ([fdo#103232])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-kbl6/igt@kms_cursor_crc@pipe-a-cursor-256x85-onscreen.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-kbl3/igt@kms_cursor_crc@pipe-a-cursor-256x85-onscreen.html
    - shard-apl:          [PASS][33] -> [FAIL][34] ([fdo#103232])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-apl1/igt@kms_cursor_crc@pipe-a-cursor-256x85-onscreen.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-apl4/igt@kms_cursor_crc@pipe-a-cursor-256x85-onscreen.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-move:
    - shard-apl:          [PASS][35] -> [FAIL][36] ([fdo#103167])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-apl2/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-move.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-apl4/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-move.html
    - shard-kbl:          [PASS][37] -> [FAIL][38] ([fdo#103167])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-kbl3/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-move.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-kbl3/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-move.html

  * igt@kms_frontbuffer_tracking@fbc-1p-rte:
    - shard-iclb:         [PASS][39] -> [FAIL][40] ([fdo#103167] / [fdo#110378])
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb3/igt@kms_frontbuffer_tracking@fbc-1p-rte.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb6/igt@kms_frontbuffer_tracking@fbc-1p-rte.html

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-shrfb-msflip-blt:
    - shard-glk:          [PASS][41] -> [FAIL][42] ([fdo#103167])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-glk9/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-shrfb-msflip-blt.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-glk6/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-shrfb-msflip-blt.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-blt:
    - shard-iclb:         [PASS][43] -> [FAIL][44] ([fdo#103167]) +3 similar issues
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-blt.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-blt.html

  * igt@kms_psr2_su@frontbuffer:
    - shard-iclb:         [PASS][45] -> [SKIP][46] ([fdo#109642] / [fdo#111068])
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb2/igt@kms_psr2_su@frontbuffer.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb4/igt@kms_psr2_su@frontbuffer.html

  * igt@kms_psr@psr2_cursor_plane_move:
    - shard-iclb:         [PASS][47] -> [SKIP][48] ([fdo#109441]) +1 similar issue
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb2/igt@kms_psr@psr2_cursor_plane_move.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb6/igt@kms_psr@psr2_cursor_plane_move.html

  * igt@kms_setmode@basic:
    - shard-kbl:          [PASS][49] -> [FAIL][50] ([fdo#99912])
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-kbl2/igt@kms_setmode@basic.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-kbl4/igt@kms_setmode@basic.html

  
#### Possible fixes ####

  * igt@gem_ctx_isolation@bcs0-s3:
    - shard-tglb:         [INCOMPLETE][51] ([fdo#111832]) -> [PASS][52] +2 similar issues
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-tglb4/igt@gem_ctx_isolation@bcs0-s3.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-tglb8/igt@gem_ctx_isolation@bcs0-s3.html

  * igt@gem_ctx_isolation@rcs0-s3:
    - shard-apl:          [DMESG-WARN][53] ([fdo#108566]) -> [PASS][54] +1 similar issue
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-apl2/igt@gem_ctx_isolation@rcs0-s3.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-apl4/igt@gem_ctx_isolation@rcs0-s3.html

  * igt@gem_ctx_persistence@vcs1-hostile:
    - shard-iclb:         [SKIP][55] ([fdo#109276] / [fdo#112080]) -> [PASS][56]
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb7/igt@gem_ctx_persistence@vcs1-hostile.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb1/igt@gem_ctx_persistence@vcs1-hostile.html

  * igt@gem_ctx_shared@exec-single-timeline-bsd:
    - shard-iclb:         [SKIP][57] ([fdo#110841]) -> [PASS][58]
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb4/igt@gem_ctx_shared@exec-single-timeline-bsd.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb8/igt@gem_ctx_shared@exec-single-timeline-bsd.html

  * igt@gem_ctx_shared@q-smoketest-vebox:
    - shard-tglb:         [INCOMPLETE][59] ([fdo#111735]) -> [PASS][60]
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-tglb6/igt@gem_ctx_shared@q-smoketest-vebox.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-tglb6/igt@gem_ctx_shared@q-smoketest-vebox.html

  * igt@gem_exec_parallel@vcs1-fds:
    - shard-iclb:         [SKIP][61] ([fdo#112080]) -> [PASS][62] +8 similar issues
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb3/igt@gem_exec_parallel@vcs1-fds.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb4/igt@gem_exec_parallel@vcs1-fds.html

  * igt@gem_exec_schedule@preempt-other-chain-bsd:
    - shard-iclb:         [SKIP][63] ([fdo#112146]) -> [PASS][64] +4 similar issues
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb2/igt@gem_exec_schedule@preempt-other-chain-bsd.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb6/igt@gem_exec_schedule@preempt-other-chain-bsd.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-snb:          [DMESG-WARN][65] ([fdo#111870]) -> [PASS][66] +1 similar issue
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-snb7/igt@gem_userptr_blits@dmabuf-sync.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-snb6/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gem_userptr_blits@dmabuf-unsync:
    - shard-hsw:          [DMESG-WARN][67] ([fdo#110789] / [fdo#111870]) -> [PASS][68]
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-hsw5/igt@gem_userptr_blits@dmabuf-unsync.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-hsw5/igt@gem_userptr_blits@dmabuf-unsync.html

  * igt@gem_userptr_blits@map-fixed-invalidate-busy-gup:
    - shard-hsw:          [DMESG-WARN][69] ([fdo#111870]) -> [PASS][70] +1 similar issue
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-hsw2/igt@gem_userptr_blits@map-fixed-invalidate-busy-gup.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-hsw6/igt@gem_userptr_blits@map-fixed-invalidate-busy-gup.html

  * igt@i915_pm_rc6_residency@rc6-accuracy:
    - shard-snb:          [SKIP][71] ([fdo#109271]) -> [PASS][72]
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-snb7/igt@i915_pm_rc6_residency@rc6-accuracy.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-snb2/igt@i915_pm_rc6_residency@rc6-accuracy.html

  * igt@i915_selftest@live_requests:
    - shard-tglb:         [INCOMPLETE][73] ([fdo#112057]) -> [PASS][74]
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-tglb2/igt@i915_selftest@live_requests.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-tglb8/igt@i915_selftest@live_requests.html

  * igt@kms_cursor_legacy@cursor-vs-flip-atomic:
    - shard-hsw:          [FAIL][75] ([fdo#103355]) -> [PASS][76]
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-hsw6/igt@kms_cursor_legacy@cursor-vs-flip-atomic.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-hsw2/igt@kms_cursor_legacy@cursor-vs-flip-atomic.html

  * igt@kms_cursor_legacy@cursora-vs-flipa-atomic:
    - shard-tglb:         [INCOMPLETE][77] ([fdo#112035 ]) -> [PASS][78]
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-tglb1/igt@kms_cursor_legacy@cursora-vs-flipa-atomic.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-tglb8/igt@kms_cursor_legacy@cursora-vs-flipa-atomic.html

  * igt@kms_cursor_legacy@flip-vs-cursor-busy-crc-legacy:
    - shard-apl:          [DMESG-WARN][79] -> [PASS][80]
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-apl6/igt@kms_cursor_legacy@flip-vs-cursor-busy-crc-legacy.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-apl6/igt@kms_cursor_legacy@flip-vs-cursor-busy-crc-legacy.html

  * igt@kms_fbcon_fbt@fbc-suspend:
    - shard-tglb:         [INCOMPLETE][81] ([fdo#111747] / [fdo#111832] / [fdo#111850]) -> [PASS][82]
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-tglb1/igt@kms_fbcon_fbt@fbc-suspend.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-tglb2/igt@kms_fbcon_fbt@fbc-suspend.html

  * igt@kms_flip@flip-vs-suspend-interruptible:
    - shard-kbl:          [DMESG-WARN][83] ([fdo#108566]) -> [PASS][84] +6 similar issues
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-kbl1/igt@kms_flip@flip-vs-suspend-interruptible.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-kbl2/igt@kms_flip@flip-vs-suspend-interruptible.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-pwrite:
    - shard-tglb:         [FAIL][85] ([fdo#103167]) -> [PASS][86] +2 similar issues
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-tglb1/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-pwrite.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-tglb4/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-pwrite.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-pri-indfb-multidraw:
    - shard-iclb:         [FAIL][87] ([fdo#103167]) -> [PASS][88] +3 similar issues
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb1/igt@kms_frontbuffer_tracking@fbcpsr-1p-pri-indfb-multidraw.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb3/igt@kms_frontbuffer_tracking@fbcpsr-1p-pri-indfb-multidraw.html

  * igt@kms_pipe_crc_basic@hang-read-crc-pipe-d:
    - shard-tglb:         [TIMEOUT][89] ([fdo#112168]) -> [PASS][90]
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-tglb6/igt@kms_pipe_crc_basic@hang-read-crc-pipe-d.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-tglb1/igt@kms_pipe_crc_basic@hang-read-crc-pipe-d.html

  * igt@kms_psr@psr2_no_drrs:
    - shard-iclb:         [SKIP][91] ([fdo#109441]) -> [PASS][92] +2 similar issues
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-iclb6/igt@kms_psr@psr2_no_drrs.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-iclb2/igt@kms_psr@psr2_no_drrs.html

  * igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend:
    - shard-hsw:          [INCOMPLETE][93] ([fdo#103540]) -> [PASS][94]
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-hsw5/igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-hsw6/igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend.html

  * igt@kms_vblank@pipe-b-ts-continuation-dpms-suspend:
    - shard-tglb:         [INCOMPLETE][95] ([fdo#111832] / [fdo#111850]) -> [PASS][96]
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7299/shard-tglb5/igt@kms_vblank@pipe-b-ts-continuation-dpms-suspend.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/shard-tglb2/igt@kms_vblank@pipe-b-ts-continuation-dpms-suspend.html

  * igt@prime_busy@hang-bsd2:
    - sha

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3675/index.html
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-11 16:48     ` Tvrtko Ursulin
  0 siblings, 0 replies; 34+ messages in thread
From: Tvrtko Ursulin @ 2019-11-11 16:48 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev


On 08/11/2019 20:49, Chris Wilson wrote:
> Register a userspace fault handler for a memory region that we also pass
> to the GPU via userptr, and make sure the pagefault is properly serviced
> before we execute.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
>   1 file changed, 118 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
> index 11d6f4a1c..774a9f92c 100644
> --- a/tests/i915/gem_userptr_blits.c
> +++ b/tests/i915/gem_userptr_blits.c
> @@ -36,6 +36,8 @@
>    * The goal is to simply ensure the basics work.
>    */
>   
> +#include <linux/userfaultfd.h>
> +
>   #include "igt.h"
>   #include <stdlib.h>
>   #include <stdio.h>
> @@ -44,9 +46,11 @@
>   #include <inttypes.h>
>   #include <errno.h>
>   #include <setjmp.h>
> +#include <sys/ioctl.h>
> +#include <sys/mman.h>
>   #include <sys/stat.h>
> +#include <sys/syscall.h>
>   #include <sys/time.h>
> -#include <sys/mman.h>
>   #include <glib.h>
>   #include <signal.h>
>   #include <pthread.h>
> @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
>   	free(t_data.ptr);
>   }
>   
> +struct ufd_thread {
> +	uint32_t *page;
> +	int i915;
> +};
> +
> +static uint32_t create_page(int i915, void *page)
> +{
> +	uint32_t handle;
> +
> +	gem_userptr(i915, page, 4096, 0, 0, &handle);
> +	return handle;
> +}
> +
> +static uint32_t create_batch(int i915)
> +{
> +	const uint32_t bbe = MI_BATCH_BUFFER_END;
> +	uint32_t handle;
> +
> +	handle = gem_create(i915, 4096);
> +	gem_write(i915, handle, 0, &bbe, sizeof(bbe));
> +
> +	return handle;
> +}
> +
> +static void *ufd_thread(void *arg)
> +{
> +	struct ufd_thread *t = arg;
> +	struct drm_i915_gem_exec_object2 obj[2] = {
> +		{ .handle = create_page(t->i915, t->page) },
> +		{ .handle = create_batch(t->i915) },
> +	};
> +	struct drm_i915_gem_execbuffer2 eb = {
> +		.buffers_ptr = to_user_pointer(obj),
> +		.buffer_count = ARRAY_SIZE(obj),
> +	};
> +
> +	igt_debug("submitting fault\n");
> +	gem_execbuf(t->i915, &eb);
> +	gem_sync(t->i915, obj[1].handle);
> +
> +	for (int i = 0; i < ARRAY_SIZE(obj); i++)
> +		gem_close(t->i915, obj[i].handle);
> +
> +	t->i915 = -1;
> +	return NULL;
> +}
> +
> +static int userfaultfd(int flags)
> +{
> +	return syscall(SYS_userfaultfd, flags);
> +}
> +
> +static void test_userfault(int i915)
> +{
> +	struct uffdio_api api = { .api = UFFD_API };
> +	struct uffdio_register reg;
> +	struct uffdio_copy copy;
> +	struct uffd_msg msg;
> +	struct ufd_thread t;
> +	pthread_t thread;
> +	char poison[4096];
> +	int ufd;
> +
> +	/*
> +	 * Register a page with userfaultfd, and wrap that inside a userptr bo.
> +	 * When we try to use gup insider userptr_get_pages, it will trigger
> +	 * a pagefault that is sent to the userfaultfd for servicing. This
> +	 * is arbitrarily slow, as the submission must wait until the fault
> +	 * is serviced by the userspace fault handler.
> +	 */
> +
> +	ufd = userfaultfd(0);
> +	igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
> +	igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
> +		      "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
> +
> +	t.i915 = i915;
> +
> +	t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
> +	igt_assert(t.page != MAP_FAILED);
> +
> +	memset(&reg, 0, sizeof(reg));
> +	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
> +	reg.range.start = to_user_pointer(t.page);
> +	reg.range.len = 4096;
> +	do_ioctl(ufd, UFFDIO_REGISTER, &reg);
> +	igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
> +
> +	igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
> +
> +	/* Wait for the fault */
> +	igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
> +	igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
> +	igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
> +
> +	/* Faulting thread remains blocked */
> +	igt_assert_eq(t.i915, i915);

This looks could be a false negative since nothing says the thread is 
not blocked just not got round resetting t->i915.

> +
> +	memset(&copy, 0, sizeof(copy));
> +	copy.dst = msg.arg.pagefault.address;
> +	copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
> +	copy.len = 4096;
> +	do_ioctl(ufd, UFFDIO_COPY, &copy);

What is the point of poison data?

Would it work better to have a hanging batch registered with userfault 
and then replace it with a valid batch here? That would ensure execbuf 
was  blocked until userfault handler finishes otherwise we get a GPU hang.

Regards,

Tvrtko

> +
> +	pthread_join(thread, NULL);
> +
> +	munmap(t.page, 4096);
> +	close(ufd);
> +}
> +
>   uint64_t total_ram;
>   uint64_t aperture_size;
>   int fd, count;
> @@ -1902,6 +2016,9 @@ igt_main_args("c:", NULL, help_str, opt_handler, NULL)
>   		igt_subtest("forbidden-operations")
>   			test_forbidden_ops(fd);
>   
> +		igt_subtest("userfault")
> +			test_userfault(fd);
> +
>   		igt_subtest("relocations")
>   			test_relocations(fd);
>   	}
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-11 16:48     ` Tvrtko Ursulin
  0 siblings, 0 replies; 34+ messages in thread
From: Tvrtko Ursulin @ 2019-11-11 16:48 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev


On 08/11/2019 20:49, Chris Wilson wrote:
> Register a userspace fault handler for a memory region that we also pass
> to the GPU via userptr, and make sure the pagefault is properly serviced
> before we execute.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
>   1 file changed, 118 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
> index 11d6f4a1c..774a9f92c 100644
> --- a/tests/i915/gem_userptr_blits.c
> +++ b/tests/i915/gem_userptr_blits.c
> @@ -36,6 +36,8 @@
>    * The goal is to simply ensure the basics work.
>    */
>   
> +#include <linux/userfaultfd.h>
> +
>   #include "igt.h"
>   #include <stdlib.h>
>   #include <stdio.h>
> @@ -44,9 +46,11 @@
>   #include <inttypes.h>
>   #include <errno.h>
>   #include <setjmp.h>
> +#include <sys/ioctl.h>
> +#include <sys/mman.h>
>   #include <sys/stat.h>
> +#include <sys/syscall.h>
>   #include <sys/time.h>
> -#include <sys/mman.h>
>   #include <glib.h>
>   #include <signal.h>
>   #include <pthread.h>
> @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
>   	free(t_data.ptr);
>   }
>   
> +struct ufd_thread {
> +	uint32_t *page;
> +	int i915;
> +};
> +
> +static uint32_t create_page(int i915, void *page)
> +{
> +	uint32_t handle;
> +
> +	gem_userptr(i915, page, 4096, 0, 0, &handle);
> +	return handle;
> +}
> +
> +static uint32_t create_batch(int i915)
> +{
> +	const uint32_t bbe = MI_BATCH_BUFFER_END;
> +	uint32_t handle;
> +
> +	handle = gem_create(i915, 4096);
> +	gem_write(i915, handle, 0, &bbe, sizeof(bbe));
> +
> +	return handle;
> +}
> +
> +static void *ufd_thread(void *arg)
> +{
> +	struct ufd_thread *t = arg;
> +	struct drm_i915_gem_exec_object2 obj[2] = {
> +		{ .handle = create_page(t->i915, t->page) },
> +		{ .handle = create_batch(t->i915) },
> +	};
> +	struct drm_i915_gem_execbuffer2 eb = {
> +		.buffers_ptr = to_user_pointer(obj),
> +		.buffer_count = ARRAY_SIZE(obj),
> +	};
> +
> +	igt_debug("submitting fault\n");
> +	gem_execbuf(t->i915, &eb);
> +	gem_sync(t->i915, obj[1].handle);
> +
> +	for (int i = 0; i < ARRAY_SIZE(obj); i++)
> +		gem_close(t->i915, obj[i].handle);
> +
> +	t->i915 = -1;
> +	return NULL;
> +}
> +
> +static int userfaultfd(int flags)
> +{
> +	return syscall(SYS_userfaultfd, flags);
> +}
> +
> +static void test_userfault(int i915)
> +{
> +	struct uffdio_api api = { .api = UFFD_API };
> +	struct uffdio_register reg;
> +	struct uffdio_copy copy;
> +	struct uffd_msg msg;
> +	struct ufd_thread t;
> +	pthread_t thread;
> +	char poison[4096];
> +	int ufd;
> +
> +	/*
> +	 * Register a page with userfaultfd, and wrap that inside a userptr bo.
> +	 * When we try to use gup insider userptr_get_pages, it will trigger
> +	 * a pagefault that is sent to the userfaultfd for servicing. This
> +	 * is arbitrarily slow, as the submission must wait until the fault
> +	 * is serviced by the userspace fault handler.
> +	 */
> +
> +	ufd = userfaultfd(0);
> +	igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
> +	igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
> +		      "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
> +
> +	t.i915 = i915;
> +
> +	t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
> +	igt_assert(t.page != MAP_FAILED);
> +
> +	memset(&reg, 0, sizeof(reg));
> +	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
> +	reg.range.start = to_user_pointer(t.page);
> +	reg.range.len = 4096;
> +	do_ioctl(ufd, UFFDIO_REGISTER, &reg);
> +	igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
> +
> +	igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
> +
> +	/* Wait for the fault */
> +	igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
> +	igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
> +	igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
> +
> +	/* Faulting thread remains blocked */
> +	igt_assert_eq(t.i915, i915);

This looks could be a false negative since nothing says the thread is 
not blocked just not got round resetting t->i915.

> +
> +	memset(&copy, 0, sizeof(copy));
> +	copy.dst = msg.arg.pagefault.address;
> +	copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
> +	copy.len = 4096;
> +	do_ioctl(ufd, UFFDIO_COPY, &copy);

What is the point of poison data?

Would it work better to have a hanging batch registered with userfault 
and then replace it with a valid batch here? That would ensure execbuf 
was  blocked until userfault handler finishes otherwise we get a GPU hang.

Regards,

Tvrtko

> +
> +	pthread_join(thread, NULL);
> +
> +	munmap(t.page, 4096);
> +	close(ufd);
> +}
> +
>   uint64_t total_ram;
>   uint64_t aperture_size;
>   int fd, count;
> @@ -1902,6 +2016,9 @@ igt_main_args("c:", NULL, help_str, opt_handler, NULL)
>   		igt_subtest("forbidden-operations")
>   			test_forbidden_ops(fd);
>   
> +		igt_subtest("userfault")
> +			test_userfault(fd);
> +
>   		igt_subtest("relocations")
>   			test_relocations(fd);
>   	}
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-11 16:48     ` Tvrtko Ursulin
  0 siblings, 0 replies; 34+ messages in thread
From: Tvrtko Ursulin @ 2019-11-11 16:48 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev, Tvrtko Ursulin


On 08/11/2019 20:49, Chris Wilson wrote:
> Register a userspace fault handler for a memory region that we also pass
> to the GPU via userptr, and make sure the pagefault is properly serviced
> before we execute.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
>   1 file changed, 118 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
> index 11d6f4a1c..774a9f92c 100644
> --- a/tests/i915/gem_userptr_blits.c
> +++ b/tests/i915/gem_userptr_blits.c
> @@ -36,6 +36,8 @@
>    * The goal is to simply ensure the basics work.
>    */
>   
> +#include <linux/userfaultfd.h>
> +
>   #include "igt.h"
>   #include <stdlib.h>
>   #include <stdio.h>
> @@ -44,9 +46,11 @@
>   #include <inttypes.h>
>   #include <errno.h>
>   #include <setjmp.h>
> +#include <sys/ioctl.h>
> +#include <sys/mman.h>
>   #include <sys/stat.h>
> +#include <sys/syscall.h>
>   #include <sys/time.h>
> -#include <sys/mman.h>
>   #include <glib.h>
>   #include <signal.h>
>   #include <pthread.h>
> @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
>   	free(t_data.ptr);
>   }
>   
> +struct ufd_thread {
> +	uint32_t *page;
> +	int i915;
> +};
> +
> +static uint32_t create_page(int i915, void *page)
> +{
> +	uint32_t handle;
> +
> +	gem_userptr(i915, page, 4096, 0, 0, &handle);
> +	return handle;
> +}
> +
> +static uint32_t create_batch(int i915)
> +{
> +	const uint32_t bbe = MI_BATCH_BUFFER_END;
> +	uint32_t handle;
> +
> +	handle = gem_create(i915, 4096);
> +	gem_write(i915, handle, 0, &bbe, sizeof(bbe));
> +
> +	return handle;
> +}
> +
> +static void *ufd_thread(void *arg)
> +{
> +	struct ufd_thread *t = arg;
> +	struct drm_i915_gem_exec_object2 obj[2] = {
> +		{ .handle = create_page(t->i915, t->page) },
> +		{ .handle = create_batch(t->i915) },
> +	};
> +	struct drm_i915_gem_execbuffer2 eb = {
> +		.buffers_ptr = to_user_pointer(obj),
> +		.buffer_count = ARRAY_SIZE(obj),
> +	};
> +
> +	igt_debug("submitting fault\n");
> +	gem_execbuf(t->i915, &eb);
> +	gem_sync(t->i915, obj[1].handle);
> +
> +	for (int i = 0; i < ARRAY_SIZE(obj); i++)
> +		gem_close(t->i915, obj[i].handle);
> +
> +	t->i915 = -1;
> +	return NULL;
> +}
> +
> +static int userfaultfd(int flags)
> +{
> +	return syscall(SYS_userfaultfd, flags);
> +}
> +
> +static void test_userfault(int i915)
> +{
> +	struct uffdio_api api = { .api = UFFD_API };
> +	struct uffdio_register reg;
> +	struct uffdio_copy copy;
> +	struct uffd_msg msg;
> +	struct ufd_thread t;
> +	pthread_t thread;
> +	char poison[4096];
> +	int ufd;
> +
> +	/*
> +	 * Register a page with userfaultfd, and wrap that inside a userptr bo.
> +	 * When we try to use gup insider userptr_get_pages, it will trigger
> +	 * a pagefault that is sent to the userfaultfd for servicing. This
> +	 * is arbitrarily slow, as the submission must wait until the fault
> +	 * is serviced by the userspace fault handler.
> +	 */
> +
> +	ufd = userfaultfd(0);
> +	igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
> +	igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
> +		      "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
> +
> +	t.i915 = i915;
> +
> +	t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
> +	igt_assert(t.page != MAP_FAILED);
> +
> +	memset(&reg, 0, sizeof(reg));
> +	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
> +	reg.range.start = to_user_pointer(t.page);
> +	reg.range.len = 4096;
> +	do_ioctl(ufd, UFFDIO_REGISTER, &reg);
> +	igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
> +
> +	igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
> +
> +	/* Wait for the fault */
> +	igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
> +	igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
> +	igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
> +
> +	/* Faulting thread remains blocked */
> +	igt_assert_eq(t.i915, i915);

This looks could be a false negative since nothing says the thread is 
not blocked just not got round resetting t->i915.

> +
> +	memset(&copy, 0, sizeof(copy));
> +	copy.dst = msg.arg.pagefault.address;
> +	copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
> +	copy.len = 4096;
> +	do_ioctl(ufd, UFFDIO_COPY, &copy);

What is the point of poison data?

Would it work better to have a hanging batch registered with userfault 
and then replace it with a valid batch here? That would ensure execbuf 
was  blocked until userfault handler finishes otherwise we get a GPU hang.

Regards,

Tvrtko

> +
> +	pthread_join(thread, NULL);
> +
> +	munmap(t.page, 4096);
> +	close(ufd);
> +}
> +
>   uint64_t total_ram;
>   uint64_t aperture_size;
>   int fd, count;
> @@ -1902,6 +2016,9 @@ igt_main_args("c:", NULL, help_str, opt_handler, NULL)
>   		igt_subtest("forbidden-operations")
>   			test_forbidden_ops(fd);
>   
> +		igt_subtest("userfault")
> +			test_userfault(fd);
> +
>   		igt_subtest("relocations")
>   			test_relocations(fd);
>   	}
> 
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-11 16:58       ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-11 16:58 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: igt-dev

Quoting Tvrtko Ursulin (2019-11-11 16:48:14)
> 
> On 08/11/2019 20:49, Chris Wilson wrote:
> > Register a userspace fault handler for a memory region that we also pass
> > to the GPU via userptr, and make sure the pagefault is properly serviced
> > before we execute.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
> >   1 file changed, 118 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
> > index 11d6f4a1c..774a9f92c 100644
> > --- a/tests/i915/gem_userptr_blits.c
> > +++ b/tests/i915/gem_userptr_blits.c
> > @@ -36,6 +36,8 @@
> >    * The goal is to simply ensure the basics work.
> >    */
> >   
> > +#include <linux/userfaultfd.h>
> > +
> >   #include "igt.h"
> >   #include <stdlib.h>
> >   #include <stdio.h>
> > @@ -44,9 +46,11 @@
> >   #include <inttypes.h>
> >   #include <errno.h>
> >   #include <setjmp.h>
> > +#include <sys/ioctl.h>
> > +#include <sys/mman.h>
> >   #include <sys/stat.h>
> > +#include <sys/syscall.h>
> >   #include <sys/time.h>
> > -#include <sys/mman.h>
> >   #include <glib.h>
> >   #include <signal.h>
> >   #include <pthread.h>
> > @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
> >       free(t_data.ptr);
> >   }
> >   
> > +struct ufd_thread {
> > +     uint32_t *page;
> > +     int i915;
> > +};
> > +
> > +static uint32_t create_page(int i915, void *page)
> > +{
> > +     uint32_t handle;
> > +
> > +     gem_userptr(i915, page, 4096, 0, 0, &handle);
> > +     return handle;
> > +}
> > +
> > +static uint32_t create_batch(int i915)
> > +{
> > +     const uint32_t bbe = MI_BATCH_BUFFER_END;
> > +     uint32_t handle;
> > +
> > +     handle = gem_create(i915, 4096);
> > +     gem_write(i915, handle, 0, &bbe, sizeof(bbe));
> > +
> > +     return handle;
> > +}
> > +
> > +static void *ufd_thread(void *arg)
> > +{
> > +     struct ufd_thread *t = arg;
> > +     struct drm_i915_gem_exec_object2 obj[2] = {
> > +             { .handle = create_page(t->i915, t->page) },
> > +             { .handle = create_batch(t->i915) },
> > +     };
> > +     struct drm_i915_gem_execbuffer2 eb = {
> > +             .buffers_ptr = to_user_pointer(obj),
> > +             .buffer_count = ARRAY_SIZE(obj),
> > +     };
> > +
> > +     igt_debug("submitting fault\n");
> > +     gem_execbuf(t->i915, &eb);
> > +     gem_sync(t->i915, obj[1].handle);
> > +
> > +     for (int i = 0; i < ARRAY_SIZE(obj); i++)
> > +             gem_close(t->i915, obj[i].handle);
> > +
> > +     t->i915 = -1;
> > +     return NULL;
> > +}
> > +
> > +static int userfaultfd(int flags)
> > +{
> > +     return syscall(SYS_userfaultfd, flags);
> > +}
> > +
> > +static void test_userfault(int i915)
> > +{
> > +     struct uffdio_api api = { .api = UFFD_API };
> > +     struct uffdio_register reg;
> > +     struct uffdio_copy copy;
> > +     struct uffd_msg msg;
> > +     struct ufd_thread t;
> > +     pthread_t thread;
> > +     char poison[4096];
> > +     int ufd;
> > +
> > +     /*
> > +      * Register a page with userfaultfd, and wrap that inside a userptr bo.
> > +      * When we try to use gup insider userptr_get_pages, it will trigger
> > +      * a pagefault that is sent to the userfaultfd for servicing. This
> > +      * is arbitrarily slow, as the submission must wait until the fault
> > +      * is serviced by the userspace fault handler.
> > +      */
> > +
> > +     ufd = userfaultfd(0);
> > +     igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
> > +     igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
> > +                   "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
> > +
> > +     t.i915 = i915;
> > +
> > +     t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
> > +     igt_assert(t.page != MAP_FAILED);
> > +
> > +     memset(&reg, 0, sizeof(reg));
> > +     reg.mode = UFFDIO_REGISTER_MODE_MISSING;
> > +     reg.range.start = to_user_pointer(t.page);
> > +     reg.range.len = 4096;
> > +     do_ioctl(ufd, UFFDIO_REGISTER, &reg);
> > +     igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
> > +
> > +     igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
> > +
> > +     /* Wait for the fault */
> > +     igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
> > +     igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
> > +     igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
> > +
> > +     /* Faulting thread remains blocked */
> > +     igt_assert_eq(t.i915, i915);
> 
> This looks could be a false negative since nothing says the thread is 
> not blocked just not got round resetting t->i915.

There's a gem_sync() in the thread. Our goal is that the thread is
blocked (either at submit or in the sync) until we service the fault.

> > +     memset(&copy, 0, sizeof(copy));
> > +     copy.dst = msg.arg.pagefault.address;
> > +     copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
> > +     copy.len = 4096;
> > +     do_ioctl(ufd, UFFDIO_COPY, &copy);
> 
> What is the point of poison data?

Just a tell tale.
 
> Would it work better to have a hanging batch registered with userfault 
> and then replace it with a valid batch here? That would ensure execbuf 
> was  blocked until userfault handler finishes otherwise we get a GPU hang.

Strictly we can't use userptr for the batch itself, as aiui that
requires LLC for the CS coherency. One of the following tests uses the
userfault handler as the sync point for replacing a poisoned batch with
a valid one, with the intention of looking for the GPU hang if the fault
was missed. That's the test I started with, my goal here was to focus
on the userptr + userfault as simply as I could; so gem_sync().
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-11 16:58       ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-11 16:58 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: igt-dev

Quoting Tvrtko Ursulin (2019-11-11 16:48:14)
> 
> On 08/11/2019 20:49, Chris Wilson wrote:
> > Register a userspace fault handler for a memory region that we also pass
> > to the GPU via userptr, and make sure the pagefault is properly serviced
> > before we execute.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
> >   1 file changed, 118 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
> > index 11d6f4a1c..774a9f92c 100644
> > --- a/tests/i915/gem_userptr_blits.c
> > +++ b/tests/i915/gem_userptr_blits.c
> > @@ -36,6 +36,8 @@
> >    * The goal is to simply ensure the basics work.
> >    */
> >   
> > +#include <linux/userfaultfd.h>
> > +
> >   #include "igt.h"
> >   #include <stdlib.h>
> >   #include <stdio.h>
> > @@ -44,9 +46,11 @@
> >   #include <inttypes.h>
> >   #include <errno.h>
> >   #include <setjmp.h>
> > +#include <sys/ioctl.h>
> > +#include <sys/mman.h>
> >   #include <sys/stat.h>
> > +#include <sys/syscall.h>
> >   #include <sys/time.h>
> > -#include <sys/mman.h>
> >   #include <glib.h>
> >   #include <signal.h>
> >   #include <pthread.h>
> > @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
> >       free(t_data.ptr);
> >   }
> >   
> > +struct ufd_thread {
> > +     uint32_t *page;
> > +     int i915;
> > +};
> > +
> > +static uint32_t create_page(int i915, void *page)
> > +{
> > +     uint32_t handle;
> > +
> > +     gem_userptr(i915, page, 4096, 0, 0, &handle);
> > +     return handle;
> > +}
> > +
> > +static uint32_t create_batch(int i915)
> > +{
> > +     const uint32_t bbe = MI_BATCH_BUFFER_END;
> > +     uint32_t handle;
> > +
> > +     handle = gem_create(i915, 4096);
> > +     gem_write(i915, handle, 0, &bbe, sizeof(bbe));
> > +
> > +     return handle;
> > +}
> > +
> > +static void *ufd_thread(void *arg)
> > +{
> > +     struct ufd_thread *t = arg;
> > +     struct drm_i915_gem_exec_object2 obj[2] = {
> > +             { .handle = create_page(t->i915, t->page) },
> > +             { .handle = create_batch(t->i915) },
> > +     };
> > +     struct drm_i915_gem_execbuffer2 eb = {
> > +             .buffers_ptr = to_user_pointer(obj),
> > +             .buffer_count = ARRAY_SIZE(obj),
> > +     };
> > +
> > +     igt_debug("submitting fault\n");
> > +     gem_execbuf(t->i915, &eb);
> > +     gem_sync(t->i915, obj[1].handle);
> > +
> > +     for (int i = 0; i < ARRAY_SIZE(obj); i++)
> > +             gem_close(t->i915, obj[i].handle);
> > +
> > +     t->i915 = -1;
> > +     return NULL;
> > +}
> > +
> > +static int userfaultfd(int flags)
> > +{
> > +     return syscall(SYS_userfaultfd, flags);
> > +}
> > +
> > +static void test_userfault(int i915)
> > +{
> > +     struct uffdio_api api = { .api = UFFD_API };
> > +     struct uffdio_register reg;
> > +     struct uffdio_copy copy;
> > +     struct uffd_msg msg;
> > +     struct ufd_thread t;
> > +     pthread_t thread;
> > +     char poison[4096];
> > +     int ufd;
> > +
> > +     /*
> > +      * Register a page with userfaultfd, and wrap that inside a userptr bo.
> > +      * When we try to use gup insider userptr_get_pages, it will trigger
> > +      * a pagefault that is sent to the userfaultfd for servicing. This
> > +      * is arbitrarily slow, as the submission must wait until the fault
> > +      * is serviced by the userspace fault handler.
> > +      */
> > +
> > +     ufd = userfaultfd(0);
> > +     igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
> > +     igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
> > +                   "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
> > +
> > +     t.i915 = i915;
> > +
> > +     t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
> > +     igt_assert(t.page != MAP_FAILED);
> > +
> > +     memset(&reg, 0, sizeof(reg));
> > +     reg.mode = UFFDIO_REGISTER_MODE_MISSING;
> > +     reg.range.start = to_user_pointer(t.page);
> > +     reg.range.len = 4096;
> > +     do_ioctl(ufd, UFFDIO_REGISTER, &reg);
> > +     igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
> > +
> > +     igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
> > +
> > +     /* Wait for the fault */
> > +     igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
> > +     igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
> > +     igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
> > +
> > +     /* Faulting thread remains blocked */
> > +     igt_assert_eq(t.i915, i915);
> 
> This looks could be a false negative since nothing says the thread is 
> not blocked just not got round resetting t->i915.

There's a gem_sync() in the thread. Our goal is that the thread is
blocked (either at submit or in the sync) until we service the fault.

> > +     memset(&copy, 0, sizeof(copy));
> > +     copy.dst = msg.arg.pagefault.address;
> > +     copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
> > +     copy.len = 4096;
> > +     do_ioctl(ufd, UFFDIO_COPY, &copy);
> 
> What is the point of poison data?

Just a tell tale.
 
> Would it work better to have a hanging batch registered with userfault 
> and then replace it with a valid batch here? That would ensure execbuf 
> was  blocked until userfault handler finishes otherwise we get a GPU hang.

Strictly we can't use userptr for the batch itself, as aiui that
requires LLC for the CS coherency. One of the following tests uses the
userfault handler as the sync point for replacing a poisoned batch with
a valid one, with the intention of looking for the GPU hang if the fault
was missed. That's the test I started with, my goal here was to focus
on the userptr + userfault as simply as I could; so gem_sync().
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-11 16:58       ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-11 16:58 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: igt-dev, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2019-11-11 16:48:14)
> 
> On 08/11/2019 20:49, Chris Wilson wrote:
> > Register a userspace fault handler for a memory region that we also pass
> > to the GPU via userptr, and make sure the pagefault is properly serviced
> > before we execute.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
> >   1 file changed, 118 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
> > index 11d6f4a1c..774a9f92c 100644
> > --- a/tests/i915/gem_userptr_blits.c
> > +++ b/tests/i915/gem_userptr_blits.c
> > @@ -36,6 +36,8 @@
> >    * The goal is to simply ensure the basics work.
> >    */
> >   
> > +#include <linux/userfaultfd.h>
> > +
> >   #include "igt.h"
> >   #include <stdlib.h>
> >   #include <stdio.h>
> > @@ -44,9 +46,11 @@
> >   #include <inttypes.h>
> >   #include <errno.h>
> >   #include <setjmp.h>
> > +#include <sys/ioctl.h>
> > +#include <sys/mman.h>
> >   #include <sys/stat.h>
> > +#include <sys/syscall.h>
> >   #include <sys/time.h>
> > -#include <sys/mman.h>
> >   #include <glib.h>
> >   #include <signal.h>
> >   #include <pthread.h>
> > @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
> >       free(t_data.ptr);
> >   }
> >   
> > +struct ufd_thread {
> > +     uint32_t *page;
> > +     int i915;
> > +};
> > +
> > +static uint32_t create_page(int i915, void *page)
> > +{
> > +     uint32_t handle;
> > +
> > +     gem_userptr(i915, page, 4096, 0, 0, &handle);
> > +     return handle;
> > +}
> > +
> > +static uint32_t create_batch(int i915)
> > +{
> > +     const uint32_t bbe = MI_BATCH_BUFFER_END;
> > +     uint32_t handle;
> > +
> > +     handle = gem_create(i915, 4096);
> > +     gem_write(i915, handle, 0, &bbe, sizeof(bbe));
> > +
> > +     return handle;
> > +}
> > +
> > +static void *ufd_thread(void *arg)
> > +{
> > +     struct ufd_thread *t = arg;
> > +     struct drm_i915_gem_exec_object2 obj[2] = {
> > +             { .handle = create_page(t->i915, t->page) },
> > +             { .handle = create_batch(t->i915) },
> > +     };
> > +     struct drm_i915_gem_execbuffer2 eb = {
> > +             .buffers_ptr = to_user_pointer(obj),
> > +             .buffer_count = ARRAY_SIZE(obj),
> > +     };
> > +
> > +     igt_debug("submitting fault\n");
> > +     gem_execbuf(t->i915, &eb);
> > +     gem_sync(t->i915, obj[1].handle);
> > +
> > +     for (int i = 0; i < ARRAY_SIZE(obj); i++)
> > +             gem_close(t->i915, obj[i].handle);
> > +
> > +     t->i915 = -1;
> > +     return NULL;
> > +}
> > +
> > +static int userfaultfd(int flags)
> > +{
> > +     return syscall(SYS_userfaultfd, flags);
> > +}
> > +
> > +static void test_userfault(int i915)
> > +{
> > +     struct uffdio_api api = { .api = UFFD_API };
> > +     struct uffdio_register reg;
> > +     struct uffdio_copy copy;
> > +     struct uffd_msg msg;
> > +     struct ufd_thread t;
> > +     pthread_t thread;
> > +     char poison[4096];
> > +     int ufd;
> > +
> > +     /*
> > +      * Register a page with userfaultfd, and wrap that inside a userptr bo.
> > +      * When we try to use gup insider userptr_get_pages, it will trigger
> > +      * a pagefault that is sent to the userfaultfd for servicing. This
> > +      * is arbitrarily slow, as the submission must wait until the fault
> > +      * is serviced by the userspace fault handler.
> > +      */
> > +
> > +     ufd = userfaultfd(0);
> > +     igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
> > +     igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
> > +                   "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
> > +
> > +     t.i915 = i915;
> > +
> > +     t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
> > +     igt_assert(t.page != MAP_FAILED);
> > +
> > +     memset(&reg, 0, sizeof(reg));
> > +     reg.mode = UFFDIO_REGISTER_MODE_MISSING;
> > +     reg.range.start = to_user_pointer(t.page);
> > +     reg.range.len = 4096;
> > +     do_ioctl(ufd, UFFDIO_REGISTER, &reg);
> > +     igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
> > +
> > +     igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
> > +
> > +     /* Wait for the fault */
> > +     igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
> > +     igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
> > +     igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
> > +
> > +     /* Faulting thread remains blocked */
> > +     igt_assert_eq(t.i915, i915);
> 
> This looks could be a false negative since nothing says the thread is 
> not blocked just not got round resetting t->i915.

There's a gem_sync() in the thread. Our goal is that the thread is
blocked (either at submit or in the sync) until we service the fault.

> > +     memset(&copy, 0, sizeof(copy));
> > +     copy.dst = msg.arg.pagefault.address;
> > +     copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
> > +     copy.len = 4096;
> > +     do_ioctl(ufd, UFFDIO_COPY, &copy);
> 
> What is the point of poison data?

Just a tell tale.
 
> Would it work better to have a hanging batch registered with userfault 
> and then replace it with a valid batch here? That would ensure execbuf 
> was  blocked until userfault handler finishes otherwise we get a GPU hang.

Strictly we can't use userptr for the batch itself, as aiui that
requires LLC for the CS coherency. One of the following tests uses the
userfault handler as the sync point for replacing a poisoned batch with
a valid one, with the intention of looking for the GPU hang if the fault
was missed. That's the test I started with, my goal here was to focus
on the userptr + userfault as simply as I could; so gem_sync().
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-11 17:54         ` Tvrtko Ursulin
  0 siblings, 0 replies; 34+ messages in thread
From: Tvrtko Ursulin @ 2019-11-11 17:54 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev


On 11/11/2019 16:58, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-11 16:48:14)
>>
>> On 08/11/2019 20:49, Chris Wilson wrote:
>>> Register a userspace fault handler for a memory region that we also pass
>>> to the GPU via userptr, and make sure the pagefault is properly serviced
>>> before we execute.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
>>>    1 file changed, 118 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
>>> index 11d6f4a1c..774a9f92c 100644
>>> --- a/tests/i915/gem_userptr_blits.c
>>> +++ b/tests/i915/gem_userptr_blits.c
>>> @@ -36,6 +36,8 @@
>>>     * The goal is to simply ensure the basics work.
>>>     */
>>>    
>>> +#include <linux/userfaultfd.h>
>>> +
>>>    #include "igt.h"
>>>    #include <stdlib.h>
>>>    #include <stdio.h>
>>> @@ -44,9 +46,11 @@
>>>    #include <inttypes.h>
>>>    #include <errno.h>
>>>    #include <setjmp.h>
>>> +#include <sys/ioctl.h>
>>> +#include <sys/mman.h>
>>>    #include <sys/stat.h>
>>> +#include <sys/syscall.h>
>>>    #include <sys/time.h>
>>> -#include <sys/mman.h>
>>>    #include <glib.h>
>>>    #include <signal.h>
>>>    #include <pthread.h>
>>> @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
>>>        free(t_data.ptr);
>>>    }
>>>    
>>> +struct ufd_thread {
>>> +     uint32_t *page;
>>> +     int i915;
>>> +};
>>> +
>>> +static uint32_t create_page(int i915, void *page)
>>> +{
>>> +     uint32_t handle;
>>> +
>>> +     gem_userptr(i915, page, 4096, 0, 0, &handle);
>>> +     return handle;
>>> +}
>>> +
>>> +static uint32_t create_batch(int i915)
>>> +{
>>> +     const uint32_t bbe = MI_BATCH_BUFFER_END;
>>> +     uint32_t handle;
>>> +
>>> +     handle = gem_create(i915, 4096);
>>> +     gem_write(i915, handle, 0, &bbe, sizeof(bbe));
>>> +
>>> +     return handle;
>>> +}
>>> +
>>> +static void *ufd_thread(void *arg)
>>> +{
>>> +     struct ufd_thread *t = arg;
>>> +     struct drm_i915_gem_exec_object2 obj[2] = {
>>> +             { .handle = create_page(t->i915, t->page) },
>>> +             { .handle = create_batch(t->i915) },
>>> +     };
>>> +     struct drm_i915_gem_execbuffer2 eb = {
>>> +             .buffers_ptr = to_user_pointer(obj),
>>> +             .buffer_count = ARRAY_SIZE(obj),
>>> +     };
>>> +
>>> +     igt_debug("submitting fault\n");
>>> +     gem_execbuf(t->i915, &eb);
>>> +     gem_sync(t->i915, obj[1].handle);
>>> +
>>> +     for (int i = 0; i < ARRAY_SIZE(obj); i++)
>>> +             gem_close(t->i915, obj[i].handle);
>>> +
>>> +     t->i915 = -1;
>>> +     return NULL;
>>> +}
>>> +
>>> +static int userfaultfd(int flags)
>>> +{
>>> +     return syscall(SYS_userfaultfd, flags);
>>> +}
>>> +
>>> +static void test_userfault(int i915)
>>> +{
>>> +     struct uffdio_api api = { .api = UFFD_API };
>>> +     struct uffdio_register reg;
>>> +     struct uffdio_copy copy;
>>> +     struct uffd_msg msg;
>>> +     struct ufd_thread t;
>>> +     pthread_t thread;
>>> +     char poison[4096];
>>> +     int ufd;
>>> +
>>> +     /*
>>> +      * Register a page with userfaultfd, and wrap that inside a userptr bo.
>>> +      * When we try to use gup insider userptr_get_pages, it will trigger
>>> +      * a pagefault that is sent to the userfaultfd for servicing. This
>>> +      * is arbitrarily slow, as the submission must wait until the fault
>>> +      * is serviced by the userspace fault handler.
>>> +      */
>>> +
>>> +     ufd = userfaultfd(0);
>>> +     igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
>>> +     igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
>>> +                   "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
>>> +
>>> +     t.i915 = i915;
>>> +
>>> +     t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
>>> +     igt_assert(t.page != MAP_FAILED);
>>> +
>>> +     memset(&reg, 0, sizeof(reg));
>>> +     reg.mode = UFFDIO_REGISTER_MODE_MISSING;
>>> +     reg.range.start = to_user_pointer(t.page);
>>> +     reg.range.len = 4096;
>>> +     do_ioctl(ufd, UFFDIO_REGISTER, &reg);
>>> +     igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
>>> +
>>> +     igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
>>> +
>>> +     /* Wait for the fault */
>>> +     igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
>>> +     igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
>>> +     igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
>>> +
>>> +     /* Faulting thread remains blocked */
>>> +     igt_assert_eq(t.i915, i915);
>>
>> This looks could be a false negative since nothing says the thread is
>> not blocked just not got round resetting t->i915.
> 
> There's a gem_sync() in the thread. Our goal is that the thread is
> blocked (either at submit or in the sync) until we service the fault.

What I meant was it could have passed gem_execbuf and gem_sync, just not 
got to the t->i915 = -1 line yet. Am I being to pedantic? Maybe using 
output fence and passing it back to parent thread would be easier? 
Parent then does igt_assert_eq(poll(fd, some_timeout), 0).
>>> +     memset(&copy, 0, sizeof(copy));
>>> +     copy.dst = msg.arg.pagefault.address;
>>> +     copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
>>> +     copy.len = 4096;
>>> +     do_ioctl(ufd, UFFDIO_COPY, &copy);
>>
>> What is the point of poison data?
> 
> Just a tell tale.
>   
>> Would it work better to have a hanging batch registered with userfault
>> and then replace it with a valid batch here? That would ensure execbuf
>> was  blocked until userfault handler finishes otherwise we get a GPU hang.
> 
> Strictly we can't use userptr for the batch itself, as aiui that
> requires LLC for the CS coherency. One of the following tests uses the
> userfault handler as the sync point for replacing a poisoned batch with
> a valid one, with the intention of looking for the GPU hang if the fault
> was missed. That's the test I started with, my goal here was to focus
> on the userptr + userfault as simply as I could; so gem_sync().

Okay that's fair then.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-11 17:54         ` Tvrtko Ursulin
  0 siblings, 0 replies; 34+ messages in thread
From: Tvrtko Ursulin @ 2019-11-11 17:54 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev


On 11/11/2019 16:58, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-11 16:48:14)
>>
>> On 08/11/2019 20:49, Chris Wilson wrote:
>>> Register a userspace fault handler for a memory region that we also pass
>>> to the GPU via userptr, and make sure the pagefault is properly serviced
>>> before we execute.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
>>>    1 file changed, 118 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
>>> index 11d6f4a1c..774a9f92c 100644
>>> --- a/tests/i915/gem_userptr_blits.c
>>> +++ b/tests/i915/gem_userptr_blits.c
>>> @@ -36,6 +36,8 @@
>>>     * The goal is to simply ensure the basics work.
>>>     */
>>>    
>>> +#include <linux/userfaultfd.h>
>>> +
>>>    #include "igt.h"
>>>    #include <stdlib.h>
>>>    #include <stdio.h>
>>> @@ -44,9 +46,11 @@
>>>    #include <inttypes.h>
>>>    #include <errno.h>
>>>    #include <setjmp.h>
>>> +#include <sys/ioctl.h>
>>> +#include <sys/mman.h>
>>>    #include <sys/stat.h>
>>> +#include <sys/syscall.h>
>>>    #include <sys/time.h>
>>> -#include <sys/mman.h>
>>>    #include <glib.h>
>>>    #include <signal.h>
>>>    #include <pthread.h>
>>> @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
>>>        free(t_data.ptr);
>>>    }
>>>    
>>> +struct ufd_thread {
>>> +     uint32_t *page;
>>> +     int i915;
>>> +};
>>> +
>>> +static uint32_t create_page(int i915, void *page)
>>> +{
>>> +     uint32_t handle;
>>> +
>>> +     gem_userptr(i915, page, 4096, 0, 0, &handle);
>>> +     return handle;
>>> +}
>>> +
>>> +static uint32_t create_batch(int i915)
>>> +{
>>> +     const uint32_t bbe = MI_BATCH_BUFFER_END;
>>> +     uint32_t handle;
>>> +
>>> +     handle = gem_create(i915, 4096);
>>> +     gem_write(i915, handle, 0, &bbe, sizeof(bbe));
>>> +
>>> +     return handle;
>>> +}
>>> +
>>> +static void *ufd_thread(void *arg)
>>> +{
>>> +     struct ufd_thread *t = arg;
>>> +     struct drm_i915_gem_exec_object2 obj[2] = {
>>> +             { .handle = create_page(t->i915, t->page) },
>>> +             { .handle = create_batch(t->i915) },
>>> +     };
>>> +     struct drm_i915_gem_execbuffer2 eb = {
>>> +             .buffers_ptr = to_user_pointer(obj),
>>> +             .buffer_count = ARRAY_SIZE(obj),
>>> +     };
>>> +
>>> +     igt_debug("submitting fault\n");
>>> +     gem_execbuf(t->i915, &eb);
>>> +     gem_sync(t->i915, obj[1].handle);
>>> +
>>> +     for (int i = 0; i < ARRAY_SIZE(obj); i++)
>>> +             gem_close(t->i915, obj[i].handle);
>>> +
>>> +     t->i915 = -1;
>>> +     return NULL;
>>> +}
>>> +
>>> +static int userfaultfd(int flags)
>>> +{
>>> +     return syscall(SYS_userfaultfd, flags);
>>> +}
>>> +
>>> +static void test_userfault(int i915)
>>> +{
>>> +     struct uffdio_api api = { .api = UFFD_API };
>>> +     struct uffdio_register reg;
>>> +     struct uffdio_copy copy;
>>> +     struct uffd_msg msg;
>>> +     struct ufd_thread t;
>>> +     pthread_t thread;
>>> +     char poison[4096];
>>> +     int ufd;
>>> +
>>> +     /*
>>> +      * Register a page with userfaultfd, and wrap that inside a userptr bo.
>>> +      * When we try to use gup insider userptr_get_pages, it will trigger
>>> +      * a pagefault that is sent to the userfaultfd for servicing. This
>>> +      * is arbitrarily slow, as the submission must wait until the fault
>>> +      * is serviced by the userspace fault handler.
>>> +      */
>>> +
>>> +     ufd = userfaultfd(0);
>>> +     igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
>>> +     igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
>>> +                   "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
>>> +
>>> +     t.i915 = i915;
>>> +
>>> +     t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
>>> +     igt_assert(t.page != MAP_FAILED);
>>> +
>>> +     memset(&reg, 0, sizeof(reg));
>>> +     reg.mode = UFFDIO_REGISTER_MODE_MISSING;
>>> +     reg.range.start = to_user_pointer(t.page);
>>> +     reg.range.len = 4096;
>>> +     do_ioctl(ufd, UFFDIO_REGISTER, &reg);
>>> +     igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
>>> +
>>> +     igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
>>> +
>>> +     /* Wait for the fault */
>>> +     igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
>>> +     igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
>>> +     igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
>>> +
>>> +     /* Faulting thread remains blocked */
>>> +     igt_assert_eq(t.i915, i915);
>>
>> This looks could be a false negative since nothing says the thread is
>> not blocked just not got round resetting t->i915.
> 
> There's a gem_sync() in the thread. Our goal is that the thread is
> blocked (either at submit or in the sync) until we service the fault.

What I meant was it could have passed gem_execbuf and gem_sync, just not 
got to the t->i915 = -1 line yet. Am I being to pedantic? Maybe using 
output fence and passing it back to parent thread would be easier? 
Parent then does igt_assert_eq(poll(fd, some_timeout), 0).
>>> +     memset(&copy, 0, sizeof(copy));
>>> +     copy.dst = msg.arg.pagefault.address;
>>> +     copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
>>> +     copy.len = 4096;
>>> +     do_ioctl(ufd, UFFDIO_COPY, &copy);
>>
>> What is the point of poison data?
> 
> Just a tell tale.
>   
>> Would it work better to have a hanging batch registered with userfault
>> and then replace it with a valid batch here? That would ensure execbuf
>> was  blocked until userfault handler finishes otherwise we get a GPU hang.
> 
> Strictly we can't use userptr for the batch itself, as aiui that
> requires LLC for the CS coherency. One of the following tests uses the
> userfault handler as the sync point for replacing a poisoned batch with
> a valid one, with the intention of looking for the GPU hang if the fault
> was missed. That's the test I started with, my goal here was to focus
> on the userptr + userfault as simply as I could; so gem_sync().

Okay that's fair then.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-11 17:54         ` Tvrtko Ursulin
  0 siblings, 0 replies; 34+ messages in thread
From: Tvrtko Ursulin @ 2019-11-11 17:54 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev, Tvrtko Ursulin


On 11/11/2019 16:58, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-11 16:48:14)
>>
>> On 08/11/2019 20:49, Chris Wilson wrote:
>>> Register a userspace fault handler for a memory region that we also pass
>>> to the GPU via userptr, and make sure the pagefault is properly serviced
>>> before we execute.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
>>>    1 file changed, 118 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
>>> index 11d6f4a1c..774a9f92c 100644
>>> --- a/tests/i915/gem_userptr_blits.c
>>> +++ b/tests/i915/gem_userptr_blits.c
>>> @@ -36,6 +36,8 @@
>>>     * The goal is to simply ensure the basics work.
>>>     */
>>>    
>>> +#include <linux/userfaultfd.h>
>>> +
>>>    #include "igt.h"
>>>    #include <stdlib.h>
>>>    #include <stdio.h>
>>> @@ -44,9 +46,11 @@
>>>    #include <inttypes.h>
>>>    #include <errno.h>
>>>    #include <setjmp.h>
>>> +#include <sys/ioctl.h>
>>> +#include <sys/mman.h>
>>>    #include <sys/stat.h>
>>> +#include <sys/syscall.h>
>>>    #include <sys/time.h>
>>> -#include <sys/mman.h>
>>>    #include <glib.h>
>>>    #include <signal.h>
>>>    #include <pthread.h>
>>> @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
>>>        free(t_data.ptr);
>>>    }
>>>    
>>> +struct ufd_thread {
>>> +     uint32_t *page;
>>> +     int i915;
>>> +};
>>> +
>>> +static uint32_t create_page(int i915, void *page)
>>> +{
>>> +     uint32_t handle;
>>> +
>>> +     gem_userptr(i915, page, 4096, 0, 0, &handle);
>>> +     return handle;
>>> +}
>>> +
>>> +static uint32_t create_batch(int i915)
>>> +{
>>> +     const uint32_t bbe = MI_BATCH_BUFFER_END;
>>> +     uint32_t handle;
>>> +
>>> +     handle = gem_create(i915, 4096);
>>> +     gem_write(i915, handle, 0, &bbe, sizeof(bbe));
>>> +
>>> +     return handle;
>>> +}
>>> +
>>> +static void *ufd_thread(void *arg)
>>> +{
>>> +     struct ufd_thread *t = arg;
>>> +     struct drm_i915_gem_exec_object2 obj[2] = {
>>> +             { .handle = create_page(t->i915, t->page) },
>>> +             { .handle = create_batch(t->i915) },
>>> +     };
>>> +     struct drm_i915_gem_execbuffer2 eb = {
>>> +             .buffers_ptr = to_user_pointer(obj),
>>> +             .buffer_count = ARRAY_SIZE(obj),
>>> +     };
>>> +
>>> +     igt_debug("submitting fault\n");
>>> +     gem_execbuf(t->i915, &eb);
>>> +     gem_sync(t->i915, obj[1].handle);
>>> +
>>> +     for (int i = 0; i < ARRAY_SIZE(obj); i++)
>>> +             gem_close(t->i915, obj[i].handle);
>>> +
>>> +     t->i915 = -1;
>>> +     return NULL;
>>> +}
>>> +
>>> +static int userfaultfd(int flags)
>>> +{
>>> +     return syscall(SYS_userfaultfd, flags);
>>> +}
>>> +
>>> +static void test_userfault(int i915)
>>> +{
>>> +     struct uffdio_api api = { .api = UFFD_API };
>>> +     struct uffdio_register reg;
>>> +     struct uffdio_copy copy;
>>> +     struct uffd_msg msg;
>>> +     struct ufd_thread t;
>>> +     pthread_t thread;
>>> +     char poison[4096];
>>> +     int ufd;
>>> +
>>> +     /*
>>> +      * Register a page with userfaultfd, and wrap that inside a userptr bo.
>>> +      * When we try to use gup insider userptr_get_pages, it will trigger
>>> +      * a pagefault that is sent to the userfaultfd for servicing. This
>>> +      * is arbitrarily slow, as the submission must wait until the fault
>>> +      * is serviced by the userspace fault handler.
>>> +      */
>>> +
>>> +     ufd = userfaultfd(0);
>>> +     igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
>>> +     igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
>>> +                   "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
>>> +
>>> +     t.i915 = i915;
>>> +
>>> +     t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
>>> +     igt_assert(t.page != MAP_FAILED);
>>> +
>>> +     memset(&reg, 0, sizeof(reg));
>>> +     reg.mode = UFFDIO_REGISTER_MODE_MISSING;
>>> +     reg.range.start = to_user_pointer(t.page);
>>> +     reg.range.len = 4096;
>>> +     do_ioctl(ufd, UFFDIO_REGISTER, &reg);
>>> +     igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
>>> +
>>> +     igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
>>> +
>>> +     /* Wait for the fault */
>>> +     igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
>>> +     igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
>>> +     igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
>>> +
>>> +     /* Faulting thread remains blocked */
>>> +     igt_assert_eq(t.i915, i915);
>>
>> This looks could be a false negative since nothing says the thread is
>> not blocked just not got round resetting t->i915.
> 
> There's a gem_sync() in the thread. Our goal is that the thread is
> blocked (either at submit or in the sync) until we service the fault.

What I meant was it could have passed gem_execbuf and gem_sync, just not 
got to the t->i915 = -1 line yet. Am I being to pedantic? Maybe using 
output fence and passing it back to parent thread would be easier? 
Parent then does igt_assert_eq(poll(fd, some_timeout), 0).
>>> +     memset(&copy, 0, sizeof(copy));
>>> +     copy.dst = msg.arg.pagefault.address;
>>> +     copy.src = to_user_pointer(memset(poison, 0xc5, sizeof(poison)));
>>> +     copy.len = 4096;
>>> +     do_ioctl(ufd, UFFDIO_COPY, &copy);
>>
>> What is the point of poison data?
> 
> Just a tell tale.
>   
>> Would it work better to have a hanging batch registered with userfault
>> and then replace it with a valid batch here? That would ensure execbuf
>> was  blocked until userfault handler finishes otherwise we get a GPU hang.
> 
> Strictly we can't use userptr for the batch itself, as aiui that
> requires LLC for the CS coherency. One of the following tests uses the
> userfault handler as the sync point for replacing a poisoned batch with
> a valid one, with the intention of looking for the GPU hang if the fault
> was missed. That's the test I started with, my goal here was to focus
> on the userptr + userfault as simply as I could; so gem_sync().

Okay that's fair then.

Regards,

Tvrtko
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-11 18:07           ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-11 18:07 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: igt-dev

Quoting Tvrtko Ursulin (2019-11-11 17:54:27)
> 
> On 11/11/2019 16:58, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-11-11 16:48:14)
> >>
> >> On 08/11/2019 20:49, Chris Wilson wrote:
> >>> Register a userspace fault handler for a memory region that we also pass
> >>> to the GPU via userptr, and make sure the pagefault is properly serviced
> >>> before we execute.
> >>>
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> ---
> >>>    tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
> >>>    1 file changed, 118 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
> >>> index 11d6f4a1c..774a9f92c 100644
> >>> --- a/tests/i915/gem_userptr_blits.c
> >>> +++ b/tests/i915/gem_userptr_blits.c
> >>> @@ -36,6 +36,8 @@
> >>>     * The goal is to simply ensure the basics work.
> >>>     */
> >>>    
> >>> +#include <linux/userfaultfd.h>
> >>> +
> >>>    #include "igt.h"
> >>>    #include <stdlib.h>
> >>>    #include <stdio.h>
> >>> @@ -44,9 +46,11 @@
> >>>    #include <inttypes.h>
> >>>    #include <errno.h>
> >>>    #include <setjmp.h>
> >>> +#include <sys/ioctl.h>
> >>> +#include <sys/mman.h>
> >>>    #include <sys/stat.h>
> >>> +#include <sys/syscall.h>
> >>>    #include <sys/time.h>
> >>> -#include <sys/mman.h>
> >>>    #include <glib.h>
> >>>    #include <signal.h>
> >>>    #include <pthread.h>
> >>> @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
> >>>        free(t_data.ptr);
> >>>    }
> >>>    
> >>> +struct ufd_thread {
> >>> +     uint32_t *page;
> >>> +     int i915;
> >>> +};
> >>> +
> >>> +static uint32_t create_page(int i915, void *page)
> >>> +{
> >>> +     uint32_t handle;
> >>> +
> >>> +     gem_userptr(i915, page, 4096, 0, 0, &handle);
> >>> +     return handle;
> >>> +}
> >>> +
> >>> +static uint32_t create_batch(int i915)
> >>> +{
> >>> +     const uint32_t bbe = MI_BATCH_BUFFER_END;
> >>> +     uint32_t handle;
> >>> +
> >>> +     handle = gem_create(i915, 4096);
> >>> +     gem_write(i915, handle, 0, &bbe, sizeof(bbe));
> >>> +
> >>> +     return handle;
> >>> +}
> >>> +
> >>> +static void *ufd_thread(void *arg)
> >>> +{
> >>> +     struct ufd_thread *t = arg;
> >>> +     struct drm_i915_gem_exec_object2 obj[2] = {
> >>> +             { .handle = create_page(t->i915, t->page) },
> >>> +             { .handle = create_batch(t->i915) },
> >>> +     };
> >>> +     struct drm_i915_gem_execbuffer2 eb = {
> >>> +             .buffers_ptr = to_user_pointer(obj),
> >>> +             .buffer_count = ARRAY_SIZE(obj),
> >>> +     };
> >>> +
> >>> +     igt_debug("submitting fault\n");
> >>> +     gem_execbuf(t->i915, &eb);
> >>> +     gem_sync(t->i915, obj[1].handle);
> >>> +
> >>> +     for (int i = 0; i < ARRAY_SIZE(obj); i++)
> >>> +             gem_close(t->i915, obj[i].handle);
> >>> +
> >>> +     t->i915 = -1;
> >>> +     return NULL;
> >>> +}
> >>> +
> >>> +static int userfaultfd(int flags)
> >>> +{
> >>> +     return syscall(SYS_userfaultfd, flags);
> >>> +}
> >>> +
> >>> +static void test_userfault(int i915)
> >>> +{
> >>> +     struct uffdio_api api = { .api = UFFD_API };
> >>> +     struct uffdio_register reg;
> >>> +     struct uffdio_copy copy;
> >>> +     struct uffd_msg msg;
> >>> +     struct ufd_thread t;
> >>> +     pthread_t thread;
> >>> +     char poison[4096];
> >>> +     int ufd;
> >>> +
> >>> +     /*
> >>> +      * Register a page with userfaultfd, and wrap that inside a userptr bo.
> >>> +      * When we try to use gup insider userptr_get_pages, it will trigger
> >>> +      * a pagefault that is sent to the userfaultfd for servicing. This
> >>> +      * is arbitrarily slow, as the submission must wait until the fault
> >>> +      * is serviced by the userspace fault handler.
> >>> +      */
> >>> +
> >>> +     ufd = userfaultfd(0);
> >>> +     igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
> >>> +     igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
> >>> +                   "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
> >>> +
> >>> +     t.i915 = i915;
> >>> +
> >>> +     t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
> >>> +     igt_assert(t.page != MAP_FAILED);
> >>> +
> >>> +     memset(&reg, 0, sizeof(reg));
> >>> +     reg.mode = UFFDIO_REGISTER_MODE_MISSING;
> >>> +     reg.range.start = to_user_pointer(t.page);
> >>> +     reg.range.len = 4096;
> >>> +     do_ioctl(ufd, UFFDIO_REGISTER, &reg);
> >>> +     igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
> >>> +
> >>> +     igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
> >>> +
> >>> +     /* Wait for the fault */
> >>> +     igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
> >>> +     igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
> >>> +     igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
> >>> +
> >>> +     /* Faulting thread remains blocked */
> >>> +     igt_assert_eq(t.i915, i915);
> >>
> >> This looks could be a false negative since nothing says the thread is
> >> not blocked just not got round resetting t->i915.
> > 
> > There's a gem_sync() in the thread. Our goal is that the thread is
> > blocked (either at submit or in the sync) until we service the fault.
> 
> What I meant was it could have passed gem_execbuf and gem_sync, just not 
> got to the t->i915 = -1 line yet. Am I being to pedantic? Maybe using 
> output fence and passing it back to parent thread would be easier? 
> Parent then does igt_assert_eq(poll(fd, some_timeout), 0).

That could only be if the fault never occurred, in which case we would
still be blocking on the read(). I think that's a reasonable level for
us not to care about -- it's the same as our depending on write()/read()
for synchronising between parent & child elsewhere. No?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-11 18:07           ` Chris Wilson
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2019-11-11 18:07 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: igt-dev

Quoting Tvrtko Ursulin (2019-11-11 17:54:27)
> 
> On 11/11/2019 16:58, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-11-11 16:48:14)
> >>
> >> On 08/11/2019 20:49, Chris Wilson wrote:
> >>> Register a userspace fault handler for a memory region that we also pass
> >>> to the GPU via userptr, and make sure the pagefault is properly serviced
> >>> before we execute.
> >>>
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> ---
> >>>    tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
> >>>    1 file changed, 118 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
> >>> index 11d6f4a1c..774a9f92c 100644
> >>> --- a/tests/i915/gem_userptr_blits.c
> >>> +++ b/tests/i915/gem_userptr_blits.c
> >>> @@ -36,6 +36,8 @@
> >>>     * The goal is to simply ensure the basics work.
> >>>     */
> >>>    
> >>> +#include <linux/userfaultfd.h>
> >>> +
> >>>    #include "igt.h"
> >>>    #include <stdlib.h>
> >>>    #include <stdio.h>
> >>> @@ -44,9 +46,11 @@
> >>>    #include <inttypes.h>
> >>>    #include <errno.h>
> >>>    #include <setjmp.h>
> >>> +#include <sys/ioctl.h>
> >>> +#include <sys/mman.h>
> >>>    #include <sys/stat.h>
> >>> +#include <sys/syscall.h>
> >>>    #include <sys/time.h>
> >>> -#include <sys/mman.h>
> >>>    #include <glib.h>
> >>>    #include <signal.h>
> >>>    #include <pthread.h>
> >>> @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
> >>>        free(t_data.ptr);
> >>>    }
> >>>    
> >>> +struct ufd_thread {
> >>> +     uint32_t *page;
> >>> +     int i915;
> >>> +};
> >>> +
> >>> +static uint32_t create_page(int i915, void *page)
> >>> +{
> >>> +     uint32_t handle;
> >>> +
> >>> +     gem_userptr(i915, page, 4096, 0, 0, &handle);
> >>> +     return handle;
> >>> +}
> >>> +
> >>> +static uint32_t create_batch(int i915)
> >>> +{
> >>> +     const uint32_t bbe = MI_BATCH_BUFFER_END;
> >>> +     uint32_t handle;
> >>> +
> >>> +     handle = gem_create(i915, 4096);
> >>> +     gem_write(i915, handle, 0, &bbe, sizeof(bbe));
> >>> +
> >>> +     return handle;
> >>> +}
> >>> +
> >>> +static void *ufd_thread(void *arg)
> >>> +{
> >>> +     struct ufd_thread *t = arg;
> >>> +     struct drm_i915_gem_exec_object2 obj[2] = {
> >>> +             { .handle = create_page(t->i915, t->page) },
> >>> +             { .handle = create_batch(t->i915) },
> >>> +     };
> >>> +     struct drm_i915_gem_execbuffer2 eb = {
> >>> +             .buffers_ptr = to_user_pointer(obj),
> >>> +             .buffer_count = ARRAY_SIZE(obj),
> >>> +     };
> >>> +
> >>> +     igt_debug("submitting fault\n");
> >>> +     gem_execbuf(t->i915, &eb);
> >>> +     gem_sync(t->i915, obj[1].handle);
> >>> +
> >>> +     for (int i = 0; i < ARRAY_SIZE(obj); i++)
> >>> +             gem_close(t->i915, obj[i].handle);
> >>> +
> >>> +     t->i915 = -1;
> >>> +     return NULL;
> >>> +}
> >>> +
> >>> +static int userfaultfd(int flags)
> >>> +{
> >>> +     return syscall(SYS_userfaultfd, flags);
> >>> +}
> >>> +
> >>> +static void test_userfault(int i915)
> >>> +{
> >>> +     struct uffdio_api api = { .api = UFFD_API };
> >>> +     struct uffdio_register reg;
> >>> +     struct uffdio_copy copy;
> >>> +     struct uffd_msg msg;
> >>> +     struct ufd_thread t;
> >>> +     pthread_t thread;
> >>> +     char poison[4096];
> >>> +     int ufd;
> >>> +
> >>> +     /*
> >>> +      * Register a page with userfaultfd, and wrap that inside a userptr bo.
> >>> +      * When we try to use gup insider userptr_get_pages, it will trigger
> >>> +      * a pagefault that is sent to the userfaultfd for servicing. This
> >>> +      * is arbitrarily slow, as the submission must wait until the fault
> >>> +      * is serviced by the userspace fault handler.
> >>> +      */
> >>> +
> >>> +     ufd = userfaultfd(0);
> >>> +     igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
> >>> +     igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
> >>> +                   "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
> >>> +
> >>> +     t.i915 = i915;
> >>> +
> >>> +     t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
> >>> +     igt_assert(t.page != MAP_FAILED);
> >>> +
> >>> +     memset(&reg, 0, sizeof(reg));
> >>> +     reg.mode = UFFDIO_REGISTER_MODE_MISSING;
> >>> +     reg.range.start = to_user_pointer(t.page);
> >>> +     reg.range.len = 4096;
> >>> +     do_ioctl(ufd, UFFDIO_REGISTER, &reg);
> >>> +     igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
> >>> +
> >>> +     igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
> >>> +
> >>> +     /* Wait for the fault */
> >>> +     igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
> >>> +     igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
> >>> +     igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
> >>> +
> >>> +     /* Faulting thread remains blocked */
> >>> +     igt_assert_eq(t.i915, i915);
> >>
> >> This looks could be a false negative since nothing says the thread is
> >> not blocked just not got round resetting t->i915.
> > 
> > There's a gem_sync() in the thread. Our goal is that the thread is
> > blocked (either at submit or in the sync) until we service the fault.
> 
> What I meant was it could have passed gem_execbuf and gem_sync, just not 
> got to the t->i915 = -1 line yet. Am I being to pedantic? Maybe using 
> output fence and passing it back to parent thread would be easier? 
> Parent then does igt_assert_eq(poll(fd, some_timeout), 0).

That could only be if the fault never occurred, in which case we would
still be blocking on the read(). I think that's a reasonable level for
us not to care about -- it's the same as our depending on write()/read()
for synchronising between parent & child elsewhere. No?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-12  8:26             ` Tvrtko Ursulin
  0 siblings, 0 replies; 34+ messages in thread
From: Tvrtko Ursulin @ 2019-11-12  8:26 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev


On 11/11/2019 18:07, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-11 17:54:27)
>>
>> On 11/11/2019 16:58, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-11-11 16:48:14)
>>>>
>>>> On 08/11/2019 20:49, Chris Wilson wrote:
>>>>> Register a userspace fault handler for a memory region that we also pass
>>>>> to the GPU via userptr, and make sure the pagefault is properly serviced
>>>>> before we execute.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> ---
>>>>>     tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
>>>>>     1 file changed, 118 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
>>>>> index 11d6f4a1c..774a9f92c 100644
>>>>> --- a/tests/i915/gem_userptr_blits.c
>>>>> +++ b/tests/i915/gem_userptr_blits.c
>>>>> @@ -36,6 +36,8 @@
>>>>>      * The goal is to simply ensure the basics work.
>>>>>      */
>>>>>     
>>>>> +#include <linux/userfaultfd.h>
>>>>> +
>>>>>     #include "igt.h"
>>>>>     #include <stdlib.h>
>>>>>     #include <stdio.h>
>>>>> @@ -44,9 +46,11 @@
>>>>>     #include <inttypes.h>
>>>>>     #include <errno.h>
>>>>>     #include <setjmp.h>
>>>>> +#include <sys/ioctl.h>
>>>>> +#include <sys/mman.h>
>>>>>     #include <sys/stat.h>
>>>>> +#include <sys/syscall.h>
>>>>>     #include <sys/time.h>
>>>>> -#include <sys/mman.h>
>>>>>     #include <glib.h>
>>>>>     #include <signal.h>
>>>>>     #include <pthread.h>
>>>>> @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
>>>>>         free(t_data.ptr);
>>>>>     }
>>>>>     
>>>>> +struct ufd_thread {
>>>>> +     uint32_t *page;
>>>>> +     int i915;
>>>>> +};
>>>>> +
>>>>> +static uint32_t create_page(int i915, void *page)
>>>>> +{
>>>>> +     uint32_t handle;
>>>>> +
>>>>> +     gem_userptr(i915, page, 4096, 0, 0, &handle);
>>>>> +     return handle;
>>>>> +}
>>>>> +
>>>>> +static uint32_t create_batch(int i915)
>>>>> +{
>>>>> +     const uint32_t bbe = MI_BATCH_BUFFER_END;
>>>>> +     uint32_t handle;
>>>>> +
>>>>> +     handle = gem_create(i915, 4096);
>>>>> +     gem_write(i915, handle, 0, &bbe, sizeof(bbe));
>>>>> +
>>>>> +     return handle;
>>>>> +}
>>>>> +
>>>>> +static void *ufd_thread(void *arg)
>>>>> +{
>>>>> +     struct ufd_thread *t = arg;
>>>>> +     struct drm_i915_gem_exec_object2 obj[2] = {
>>>>> +             { .handle = create_page(t->i915, t->page) },
>>>>> +             { .handle = create_batch(t->i915) },
>>>>> +     };
>>>>> +     struct drm_i915_gem_execbuffer2 eb = {
>>>>> +             .buffers_ptr = to_user_pointer(obj),
>>>>> +             .buffer_count = ARRAY_SIZE(obj),
>>>>> +     };
>>>>> +
>>>>> +     igt_debug("submitting fault\n");
>>>>> +     gem_execbuf(t->i915, &eb);
>>>>> +     gem_sync(t->i915, obj[1].handle);
>>>>> +
>>>>> +     for (int i = 0; i < ARRAY_SIZE(obj); i++)
>>>>> +             gem_close(t->i915, obj[i].handle);
>>>>> +
>>>>> +     t->i915 = -1;
>>>>> +     return NULL;
>>>>> +}
>>>>> +
>>>>> +static int userfaultfd(int flags)
>>>>> +{
>>>>> +     return syscall(SYS_userfaultfd, flags);
>>>>> +}
>>>>> +
>>>>> +static void test_userfault(int i915)
>>>>> +{
>>>>> +     struct uffdio_api api = { .api = UFFD_API };
>>>>> +     struct uffdio_register reg;
>>>>> +     struct uffdio_copy copy;
>>>>> +     struct uffd_msg msg;
>>>>> +     struct ufd_thread t;
>>>>> +     pthread_t thread;
>>>>> +     char poison[4096];
>>>>> +     int ufd;
>>>>> +
>>>>> +     /*
>>>>> +      * Register a page with userfaultfd, and wrap that inside a userptr bo.
>>>>> +      * When we try to use gup insider userptr_get_pages, it will trigger
>>>>> +      * a pagefault that is sent to the userfaultfd for servicing. This
>>>>> +      * is arbitrarily slow, as the submission must wait until the fault
>>>>> +      * is serviced by the userspace fault handler.
>>>>> +      */
>>>>> +
>>>>> +     ufd = userfaultfd(0);
>>>>> +     igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
>>>>> +     igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
>>>>> +                   "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
>>>>> +
>>>>> +     t.i915 = i915;
>>>>> +
>>>>> +     t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
>>>>> +     igt_assert(t.page != MAP_FAILED);
>>>>> +
>>>>> +     memset(&reg, 0, sizeof(reg));
>>>>> +     reg.mode = UFFDIO_REGISTER_MODE_MISSING;
>>>>> +     reg.range.start = to_user_pointer(t.page);
>>>>> +     reg.range.len = 4096;
>>>>> +     do_ioctl(ufd, UFFDIO_REGISTER, &reg);
>>>>> +     igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
>>>>> +
>>>>> +     igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
>>>>> +
>>>>> +     /* Wait for the fault */
>>>>> +     igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
>>>>> +     igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
>>>>> +     igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
>>>>> +
>>>>> +     /* Faulting thread remains blocked */
>>>>> +     igt_assert_eq(t.i915, i915);
>>>>
>>>> This looks could be a false negative since nothing says the thread is
>>>> not blocked just not got round resetting t->i915.
>>>
>>> There's a gem_sync() in the thread. Our goal is that the thread is
>>> blocked (either at submit or in the sync) until we service the fault.
>>
>> What I meant was it could have passed gem_execbuf and gem_sync, just not
>> got to the t->i915 = -1 line yet. Am I being to pedantic? Maybe using
>> output fence and passing it back to parent thread would be easier?
>> Parent then does igt_assert_eq(poll(fd, some_timeout), 0).
> 
> That could only be if the fault never occurred, in which case we would
> still be blocking on the read(). I think that's a reasonable level for
> us not to care about -- it's the same as our depending on write()/read()
> for synchronising between parent & child elsewhere. No?

Using userfaultfd is a bit more complex so I thought of extra checks. 
But okay, if probably is overly paranoid.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-12  8:26             ` Tvrtko Ursulin
  0 siblings, 0 replies; 34+ messages in thread
From: Tvrtko Ursulin @ 2019-11-12  8:26 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev


On 11/11/2019 18:07, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-11 17:54:27)
>>
>> On 11/11/2019 16:58, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-11-11 16:48:14)
>>>>
>>>> On 08/11/2019 20:49, Chris Wilson wrote:
>>>>> Register a userspace fault handler for a memory region that we also pass
>>>>> to the GPU via userptr, and make sure the pagefault is properly serviced
>>>>> before we execute.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> ---
>>>>>     tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
>>>>>     1 file changed, 118 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
>>>>> index 11d6f4a1c..774a9f92c 100644
>>>>> --- a/tests/i915/gem_userptr_blits.c
>>>>> +++ b/tests/i915/gem_userptr_blits.c
>>>>> @@ -36,6 +36,8 @@
>>>>>      * The goal is to simply ensure the basics work.
>>>>>      */
>>>>>     
>>>>> +#include <linux/userfaultfd.h>
>>>>> +
>>>>>     #include "igt.h"
>>>>>     #include <stdlib.h>
>>>>>     #include <stdio.h>
>>>>> @@ -44,9 +46,11 @@
>>>>>     #include <inttypes.h>
>>>>>     #include <errno.h>
>>>>>     #include <setjmp.h>
>>>>> +#include <sys/ioctl.h>
>>>>> +#include <sys/mman.h>
>>>>>     #include <sys/stat.h>
>>>>> +#include <sys/syscall.h>
>>>>>     #include <sys/time.h>
>>>>> -#include <sys/mman.h>
>>>>>     #include <glib.h>
>>>>>     #include <signal.h>
>>>>>     #include <pthread.h>
>>>>> @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
>>>>>         free(t_data.ptr);
>>>>>     }
>>>>>     
>>>>> +struct ufd_thread {
>>>>> +     uint32_t *page;
>>>>> +     int i915;
>>>>> +};
>>>>> +
>>>>> +static uint32_t create_page(int i915, void *page)
>>>>> +{
>>>>> +     uint32_t handle;
>>>>> +
>>>>> +     gem_userptr(i915, page, 4096, 0, 0, &handle);
>>>>> +     return handle;
>>>>> +}
>>>>> +
>>>>> +static uint32_t create_batch(int i915)
>>>>> +{
>>>>> +     const uint32_t bbe = MI_BATCH_BUFFER_END;
>>>>> +     uint32_t handle;
>>>>> +
>>>>> +     handle = gem_create(i915, 4096);
>>>>> +     gem_write(i915, handle, 0, &bbe, sizeof(bbe));
>>>>> +
>>>>> +     return handle;
>>>>> +}
>>>>> +
>>>>> +static void *ufd_thread(void *arg)
>>>>> +{
>>>>> +     struct ufd_thread *t = arg;
>>>>> +     struct drm_i915_gem_exec_object2 obj[2] = {
>>>>> +             { .handle = create_page(t->i915, t->page) },
>>>>> +             { .handle = create_batch(t->i915) },
>>>>> +     };
>>>>> +     struct drm_i915_gem_execbuffer2 eb = {
>>>>> +             .buffers_ptr = to_user_pointer(obj),
>>>>> +             .buffer_count = ARRAY_SIZE(obj),
>>>>> +     };
>>>>> +
>>>>> +     igt_debug("submitting fault\n");
>>>>> +     gem_execbuf(t->i915, &eb);
>>>>> +     gem_sync(t->i915, obj[1].handle);
>>>>> +
>>>>> +     for (int i = 0; i < ARRAY_SIZE(obj); i++)
>>>>> +             gem_close(t->i915, obj[i].handle);
>>>>> +
>>>>> +     t->i915 = -1;
>>>>> +     return NULL;
>>>>> +}
>>>>> +
>>>>> +static int userfaultfd(int flags)
>>>>> +{
>>>>> +     return syscall(SYS_userfaultfd, flags);
>>>>> +}
>>>>> +
>>>>> +static void test_userfault(int i915)
>>>>> +{
>>>>> +     struct uffdio_api api = { .api = UFFD_API };
>>>>> +     struct uffdio_register reg;
>>>>> +     struct uffdio_copy copy;
>>>>> +     struct uffd_msg msg;
>>>>> +     struct ufd_thread t;
>>>>> +     pthread_t thread;
>>>>> +     char poison[4096];
>>>>> +     int ufd;
>>>>> +
>>>>> +     /*
>>>>> +      * Register a page with userfaultfd, and wrap that inside a userptr bo.
>>>>> +      * When we try to use gup insider userptr_get_pages, it will trigger
>>>>> +      * a pagefault that is sent to the userfaultfd for servicing. This
>>>>> +      * is arbitrarily slow, as the submission must wait until the fault
>>>>> +      * is serviced by the userspace fault handler.
>>>>> +      */
>>>>> +
>>>>> +     ufd = userfaultfd(0);
>>>>> +     igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
>>>>> +     igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
>>>>> +                   "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
>>>>> +
>>>>> +     t.i915 = i915;
>>>>> +
>>>>> +     t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
>>>>> +     igt_assert(t.page != MAP_FAILED);
>>>>> +
>>>>> +     memset(&reg, 0, sizeof(reg));
>>>>> +     reg.mode = UFFDIO_REGISTER_MODE_MISSING;
>>>>> +     reg.range.start = to_user_pointer(t.page);
>>>>> +     reg.range.len = 4096;
>>>>> +     do_ioctl(ufd, UFFDIO_REGISTER, &reg);
>>>>> +     igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
>>>>> +
>>>>> +     igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
>>>>> +
>>>>> +     /* Wait for the fault */
>>>>> +     igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
>>>>> +     igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
>>>>> +     igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
>>>>> +
>>>>> +     /* Faulting thread remains blocked */
>>>>> +     igt_assert_eq(t.i915, i915);
>>>>
>>>> This looks could be a false negative since nothing says the thread is
>>>> not blocked just not got round resetting t->i915.
>>>
>>> There's a gem_sync() in the thread. Our goal is that the thread is
>>> blocked (either at submit or in the sync) until we service the fault.
>>
>> What I meant was it could have passed gem_execbuf and gem_sync, just not
>> got to the t->i915 = -1 line yet. Am I being to pedantic? Maybe using
>> output fence and passing it back to parent thread would be easier?
>> Parent then does igt_assert_eq(poll(fd, some_timeout), 0).
> 
> That could only be if the fault never occurred, in which case we would
> still be blocking on the read(). I think that's a reasonable level for
> us not to care about -- it's the same as our depending on write()/read()
> for synchronising between parent & child elsewhere. No?

Using userfaultfd is a bit more complex so I thought of extra checks. 
But okay, if probably is overly paranoid.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd
@ 2019-11-12  8:26             ` Tvrtko Ursulin
  0 siblings, 0 replies; 34+ messages in thread
From: Tvrtko Ursulin @ 2019-11-12  8:26 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: igt-dev, Tvrtko Ursulin


On 11/11/2019 18:07, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-11 17:54:27)
>>
>> On 11/11/2019 16:58, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-11-11 16:48:14)
>>>>
>>>> On 08/11/2019 20:49, Chris Wilson wrote:
>>>>> Register a userspace fault handler for a memory region that we also pass
>>>>> to the GPU via userptr, and make sure the pagefault is properly serviced
>>>>> before we execute.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> ---
>>>>>     tests/i915/gem_userptr_blits.c | 119 ++++++++++++++++++++++++++++++++-
>>>>>     1 file changed, 118 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
>>>>> index 11d6f4a1c..774a9f92c 100644
>>>>> --- a/tests/i915/gem_userptr_blits.c
>>>>> +++ b/tests/i915/gem_userptr_blits.c
>>>>> @@ -36,6 +36,8 @@
>>>>>      * The goal is to simply ensure the basics work.
>>>>>      */
>>>>>     
>>>>> +#include <linux/userfaultfd.h>
>>>>> +
>>>>>     #include "igt.h"
>>>>>     #include <stdlib.h>
>>>>>     #include <stdio.h>
>>>>> @@ -44,9 +46,11 @@
>>>>>     #include <inttypes.h>
>>>>>     #include <errno.h>
>>>>>     #include <setjmp.h>
>>>>> +#include <sys/ioctl.h>
>>>>> +#include <sys/mman.h>
>>>>>     #include <sys/stat.h>
>>>>> +#include <sys/syscall.h>
>>>>>     #include <sys/time.h>
>>>>> -#include <sys/mman.h>
>>>>>     #include <glib.h>
>>>>>     #include <signal.h>
>>>>>     #include <pthread.h>
>>>>> @@ -1831,6 +1835,116 @@ static void test_invalidate_close_race(int fd, bool overlap)
>>>>>         free(t_data.ptr);
>>>>>     }
>>>>>     
>>>>> +struct ufd_thread {
>>>>> +     uint32_t *page;
>>>>> +     int i915;
>>>>> +};
>>>>> +
>>>>> +static uint32_t create_page(int i915, void *page)
>>>>> +{
>>>>> +     uint32_t handle;
>>>>> +
>>>>> +     gem_userptr(i915, page, 4096, 0, 0, &handle);
>>>>> +     return handle;
>>>>> +}
>>>>> +
>>>>> +static uint32_t create_batch(int i915)
>>>>> +{
>>>>> +     const uint32_t bbe = MI_BATCH_BUFFER_END;
>>>>> +     uint32_t handle;
>>>>> +
>>>>> +     handle = gem_create(i915, 4096);
>>>>> +     gem_write(i915, handle, 0, &bbe, sizeof(bbe));
>>>>> +
>>>>> +     return handle;
>>>>> +}
>>>>> +
>>>>> +static void *ufd_thread(void *arg)
>>>>> +{
>>>>> +     struct ufd_thread *t = arg;
>>>>> +     struct drm_i915_gem_exec_object2 obj[2] = {
>>>>> +             { .handle = create_page(t->i915, t->page) },
>>>>> +             { .handle = create_batch(t->i915) },
>>>>> +     };
>>>>> +     struct drm_i915_gem_execbuffer2 eb = {
>>>>> +             .buffers_ptr = to_user_pointer(obj),
>>>>> +             .buffer_count = ARRAY_SIZE(obj),
>>>>> +     };
>>>>> +
>>>>> +     igt_debug("submitting fault\n");
>>>>> +     gem_execbuf(t->i915, &eb);
>>>>> +     gem_sync(t->i915, obj[1].handle);
>>>>> +
>>>>> +     for (int i = 0; i < ARRAY_SIZE(obj); i++)
>>>>> +             gem_close(t->i915, obj[i].handle);
>>>>> +
>>>>> +     t->i915 = -1;
>>>>> +     return NULL;
>>>>> +}
>>>>> +
>>>>> +static int userfaultfd(int flags)
>>>>> +{
>>>>> +     return syscall(SYS_userfaultfd, flags);
>>>>> +}
>>>>> +
>>>>> +static void test_userfault(int i915)
>>>>> +{
>>>>> +     struct uffdio_api api = { .api = UFFD_API };
>>>>> +     struct uffdio_register reg;
>>>>> +     struct uffdio_copy copy;
>>>>> +     struct uffd_msg msg;
>>>>> +     struct ufd_thread t;
>>>>> +     pthread_t thread;
>>>>> +     char poison[4096];
>>>>> +     int ufd;
>>>>> +
>>>>> +     /*
>>>>> +      * Register a page with userfaultfd, and wrap that inside a userptr bo.
>>>>> +      * When we try to use gup insider userptr_get_pages, it will trigger
>>>>> +      * a pagefault that is sent to the userfaultfd for servicing. This
>>>>> +      * is arbitrarily slow, as the submission must wait until the fault
>>>>> +      * is serviced by the userspace fault handler.
>>>>> +      */
>>>>> +
>>>>> +     ufd = userfaultfd(0);
>>>>> +     igt_require_f(ufd != -1, "kernel support for userfaultfd\n");
>>>>> +     igt_require_f(ioctl(ufd, UFFDIO_API, &api) == 0 && api.api == UFFD_API,
>>>>> +                   "userfaultfd API v%lld:%lld\n", UFFD_API, api.api);
>>>>> +
>>>>> +     t.i915 = i915;
>>>>> +
>>>>> +     t.page = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0);
>>>>> +     igt_assert(t.page != MAP_FAILED);
>>>>> +
>>>>> +     memset(&reg, 0, sizeof(reg));
>>>>> +     reg.mode = UFFDIO_REGISTER_MODE_MISSING;
>>>>> +     reg.range.start = to_user_pointer(t.page);
>>>>> +     reg.range.len = 4096;
>>>>> +     do_ioctl(ufd, UFFDIO_REGISTER, &reg);
>>>>> +     igt_assert(reg.ioctls == UFFD_API_RANGE_IOCTLS);
>>>>> +
>>>>> +     igt_assert(pthread_create(&thread, NULL, ufd_thread, &t) == 0);
>>>>> +
>>>>> +     /* Wait for the fault */
>>>>> +     igt_assert_eq(read(ufd, &msg, sizeof(msg)), sizeof(msg));
>>>>> +     igt_assert_eq(msg.event, UFFD_EVENT_PAGEFAULT);
>>>>> +     igt_assert(from_user_pointer(msg.arg.pagefault.address) == t.page);
>>>>> +
>>>>> +     /* Faulting thread remains blocked */
>>>>> +     igt_assert_eq(t.i915, i915);
>>>>
>>>> This looks could be a false negative since nothing says the thread is
>>>> not blocked just not got round resetting t->i915.
>>>
>>> There's a gem_sync() in the thread. Our goal is that the thread is
>>> blocked (either at submit or in the sync) until we service the fault.
>>
>> What I meant was it could have passed gem_execbuf and gem_sync, just not
>> got to the t->i915 = -1 line yet. Am I being to pedantic? Maybe using
>> output fence and passing it back to parent thread would be easier?
>> Parent then does igt_assert_eq(poll(fd, some_timeout), 0).
> 
> That could only be if the fault never occurred, in which case we would
> still be blocking on the read(). I think that's a reasonable level for
> us not to care about -- it's the same as our depending on write()/read()
> for synchronising between parent & child elsewhere. No?

Using userfaultfd is a bit more complex so I thought of extra checks. 
But okay, if probably is overly paranoid.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention
@ 2019-11-12 17:55         ` Daniel Vetter
  0 siblings, 0 replies; 34+ messages in thread
From: Daniel Vetter @ 2019-11-12 17:55 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Peter Zijlstra, IGT development, intel-gfx

On Fri, Nov 8, 2019 at 10:22 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Daniel Vetter (2019-11-08 21:13:13)
> > On Fri, Nov 8, 2019 at 9:49 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > >
> > > One of the hardest priority inversion tasks to both handle and to
> > > simulate in testing is inversion due to resource contention. The
> > > challenge is that a high priority context should never be blocked by a
> > > low priority context, even if both are starving for resources --
> > > ideally, at least for some RT OSes, the higher priority context has
> > > first pick of the meagre resources so that it can be executed with
> > > minimum latency.
> > >
> > > userfaultfd allows us to handle a page fault in userspace, and so
> > > arbitrary impose a delay on the fault handler, creating a situation
> > > where a low priority context is blocked waiting for the fault. This
> > > blocked context should not prevent a high priority context from being
> > > executed. While the userfault tries to emulate a slow fault (e.g. from a
> > > failing swap device), it is unfortunately limited to a single object
> > > type: the userptr. Hopefully, we will find other ways to impose other
> > > starvation conditions on global resources.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >
> > So rt-ww_mutexes?
> >
> > I don't think we want/should do that on the first round of rolling out
> > ww_mutex in i915.
>
> It works today. And will continue working across any conversion.

Isn't that just an artifact of how we retry userptr page-in? I think
if we'd do this check with other objects, then it'll fall apart ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention
@ 2019-11-12 17:55         ` Daniel Vetter
  0 siblings, 0 replies; 34+ messages in thread
From: Daniel Vetter @ 2019-11-12 17:55 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Peter Zijlstra, IGT development, intel-gfx

On Fri, Nov 8, 2019 at 10:22 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Daniel Vetter (2019-11-08 21:13:13)
> > On Fri, Nov 8, 2019 at 9:49 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > >
> > > One of the hardest priority inversion tasks to both handle and to
> > > simulate in testing is inversion due to resource contention. The
> > > challenge is that a high priority context should never be blocked by a
> > > low priority context, even if both are starving for resources --
> > > ideally, at least for some RT OSes, the higher priority context has
> > > first pick of the meagre resources so that it can be executed with
> > > minimum latency.
> > >
> > > userfaultfd allows us to handle a page fault in userspace, and so
> > > arbitrary impose a delay on the fault handler, creating a situation
> > > where a low priority context is blocked waiting for the fault. This
> > > blocked context should not prevent a high priority context from being
> > > executed. While the userfault tries to emulate a slow fault (e.g. from a
> > > failing swap device), it is unfortunately limited to a single object
> > > type: the userptr. Hopefully, we will find other ways to impose other
> > > starvation conditions on global resources.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >
> > So rt-ww_mutexes?
> >
> > I don't think we want/should do that on the first round of rolling out
> > ww_mutex in i915.
>
> It works today. And will continue working across any conversion.

Isn't that just an artifact of how we retry userptr page-in? I think
if we'd do this check with other objects, then it'll fall apart ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [igt-dev] [Intel-gfx] [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention
@ 2019-11-12 17:55         ` Daniel Vetter
  0 siblings, 0 replies; 34+ messages in thread
From: Daniel Vetter @ 2019-11-12 17:55 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Peter Zijlstra, IGT development, intel-gfx

On Fri, Nov 8, 2019 at 10:22 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Daniel Vetter (2019-11-08 21:13:13)
> > On Fri, Nov 8, 2019 at 9:49 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > >
> > > One of the hardest priority inversion tasks to both handle and to
> > > simulate in testing is inversion due to resource contention. The
> > > challenge is that a high priority context should never be blocked by a
> > > low priority context, even if both are starving for resources --
> > > ideally, at least for some RT OSes, the higher priority context has
> > > first pick of the meagre resources so that it can be executed with
> > > minimum latency.
> > >
> > > userfaultfd allows us to handle a page fault in userspace, and so
> > > arbitrary impose a delay on the fault handler, creating a situation
> > > where a low priority context is blocked waiting for the fault. This
> > > blocked context should not prevent a high priority context from being
> > > executed. While the userfault tries to emulate a slow fault (e.g. from a
> > > failing swap device), it is unfortunately limited to a single object
> > > type: the userptr. Hopefully, we will find other ways to impose other
> > > starvation conditions on global resources.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >
> > So rt-ww_mutexes?
> >
> > I don't think we want/should do that on the first round of rolling out
> > ww_mutex in i915.
>
> It works today. And will continue working across any conversion.

Isn't that just an artifact of how we retry userptr page-in? I think
if we'd do this check with other objects, then it'll fall apart ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2019-11-12 17:55 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-08 20:49 [PATCH i-g-t 1/3] i915/gem_exec_schedule: Split pi-ringfull into two tests Chris Wilson
2019-11-08 20:49 ` [Intel-gfx] " Chris Wilson
2019-11-08 20:49 ` [PATCH i-g-t 2/3] i915/gem_userptr_blits: Exercise userptr + userfaultfd Chris Wilson
2019-11-08 20:49   ` [igt-dev] " Chris Wilson
2019-11-08 20:49   ` [Intel-gfx] " Chris Wilson
2019-11-11 16:48   ` [igt-dev] " Tvrtko Ursulin
2019-11-11 16:48     ` Tvrtko Ursulin
2019-11-11 16:48     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-11 16:58     ` Chris Wilson
2019-11-11 16:58       ` Chris Wilson
2019-11-11 16:58       ` [Intel-gfx] " Chris Wilson
2019-11-11 17:54       ` Tvrtko Ursulin
2019-11-11 17:54         ` Tvrtko Ursulin
2019-11-11 17:54         ` [Intel-gfx] " Tvrtko Ursulin
2019-11-11 18:07         ` Chris Wilson
2019-11-11 18:07           ` [Intel-gfx] " Chris Wilson
2019-11-12  8:26           ` Tvrtko Ursulin
2019-11-12  8:26             ` Tvrtko Ursulin
2019-11-12  8:26             ` [Intel-gfx] " Tvrtko Ursulin
2019-11-08 20:49 ` [PATCH i-g-t 3/3] i915/gem_exec_scheduler: Exercise priority inversion from resource contention Chris Wilson
2019-11-08 20:49   ` [igt-dev] " Chris Wilson
2019-11-08 20:49   ` [Intel-gfx] " Chris Wilson
2019-11-08 21:13   ` [igt-dev] " Daniel Vetter
2019-11-08 21:13     ` Daniel Vetter
2019-11-08 21:13     ` [Intel-gfx] " Daniel Vetter
2019-11-08 21:22     ` Chris Wilson
2019-11-08 21:22       ` Chris Wilson
2019-11-08 21:22       ` [Intel-gfx] " Chris Wilson
2019-11-12 17:55       ` Daniel Vetter
2019-11-12 17:55         ` [igt-dev] [Intel-gfx] " Daniel Vetter
2019-11-12 17:55         ` [Intel-gfx] [igt-dev] " Daniel Vetter
2019-11-08 21:14 ` [igt-dev] ✗ GitLab.Pipeline: failure for series starting with [i-g-t,1/3] i915/gem_exec_schedule: Split pi-ringfull into two tests Patchwork
2019-11-08 21:46 ` [igt-dev] ✓ Fi.CI.BAT: success " Patchwork
2019-11-10 18:03 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.