Intel-GFX Archive on lore.kernel.org
* [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Leave the error propagation in place, but limit the warnings so that
they only show up in CI when the unlikely errors are hit.
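
The effect relies on GEM_WARN_ON compiling down to a silent check in
production builds. A minimal sketch, assuming the i915 debug macros of
this era (see i915_gem.h):

    #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
    #define GEM_WARN_ON(expr) WARN_ON(expr)             /* loud in CI/debug builds */
    #else
    #define GEM_WARN_ON(expr) ({ unlikely(!!(expr)); }) /* silent boolean check */
    #endif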

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 3 +--
 drivers/gpu/drm/i915/gem/i915_gem_phys.c       | 3 +--
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c      | 3 +--
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c    | 2 +-
 4 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index e4fb6c372537..219a36995b96 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1626,8 +1626,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 			err = i915_vma_bind(target->vma,
 					    target->vma->obj->cache_level,
 					    PIN_GLOBAL, NULL);
-			if (drm_WARN_ONCE(&i915->drm, err,
-				      "Unexpected failure to bind target VMA!"))
+			if (err)
 				return err;
 		}
 	}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 4c1c7232b024..12245a47e5fb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -27,8 +27,7 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 	void *dst;
 	int i;
 
-	if (drm_WARN_ON(obj->base.dev,
-			i915_gem_object_needs_bit17_swizzle(obj)))
+	if (GEM_WARN_ON(i915_gem_object_needs_bit17_swizzle(obj)))
 		return -EINVAL;
 
 	/*
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 5d5d7eef3f43..19dd21a95c47 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -148,8 +148,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 		last_pfn = page_to_pfn(page);
 
 		/* Check that the i965g/gm workaround works. */
-		drm_WARN_ON(&i915->drm,
-			    (gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
+		GEM_BUG_ON(gfp & __GFP_DMA32 && last_pfn >= 0x00100000UL);
 	}
 	if (sg) { /* loop terminated early; short sg table */
 		sg_page_sizes |= sg->length;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 8b0708708671..ec9d25680b41 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -235,7 +235,7 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
 	if (flags & I915_USERPTR_UNSYNCHRONIZED)
 		return capable(CAP_SYS_ADMIN) ? 0 : -EPERM;
 
-	if (drm_WARN_ON(obj->base.dev, obj->userptr.mm == NULL))
+	if (GEM_WARN_ON(obj->userptr.mm == NULL))
 		return -EINVAL;
 
 	mn = i915_mmu_notifier_find(obj->userptr.mm);
-- 
2.20.1

* [Intel-gfx] [PATCH 02/22] drm/i915/execlists: Shortcircuit queue_prio() for no internal levels
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

If there are no internal levels and the user priority-shift is zero, we
can help the compiler eliminate some dead code:

Function                                     old     new   delta
start_timeslice                              169     154     -15
__execlists_submission_tasklet              4696    4659     -37
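
The mechanism is plain constant folding: when I915_USER_PRIORITY_SHIFT
is a compile-time 0, the early return is unconditionally taken and the
shift/ffs() arithmetic becomes unreachable. A sketch of the shape the
compiler sees (illustrative only, not the driver's exact code):

    #define I915_USER_PRIORITY_SHIFT 0	/* assumed configuration */

    static int queue_prio_folded(const struct i915_priolist *p)
    {
    	if (!I915_USER_PRIORITY_SHIFT)	/* constant true: branch folds away */
    		return p->priority;

    	/* dead code below, eliminated at compile time */
    	return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used);
    }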

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index de5be57ed6d2..3214a4ecc31a 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -446,6 +446,9 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
 	 * we have to flip the index value to become priority.
 	 */
 	p = to_priolist(rb);
+	if (!I915_USER_PRIORITY_SHIFT)
+		return p->priority;
+
 	return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used);
 }
 
-- 
2.20.1

* [Intel-gfx] [PATCH 03/22] drm/i915: Avoid using rq->engine after free during i915_fence_release
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

For rq->engine to be valid to dereference during i915_fence_release(),
after retiring the fence and releasing its references, we assume that
it can only point to a real engine (which stays intact until the device
is shut down after all fences have been flushed). However, due to a
quirk of preempt-to-busy, we may retire a request that still belongs to
a virtual engine and so eventually free it with rq->engine being
invalid. To avoid dereferencing that invalid engine, we look at the
execution_mask: if it indicates that the request may be executed on
more than one engine, we know it originated on a virtual engine and may
still be on one.
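
In effect the mask doubles as a provenance test. A sketch of the check,
with the mask semantics taken from the description above (illustrative
helper, not part of the patch):

    #include <linux/log2.h>

    /*
     * A request confined to one physical engine has exactly one bit set
     * (e.g. 0b0010), so is_power_of_2() holds and rq->engine is safe to
     * dereference; a virtual request spans engines (e.g. 0b0110) and is not.
     */
    static bool engine_pointer_is_stable(const struct i915_request *rq)
    {
    	return is_power_of_2(rq->execution_mask);
    }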

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1906
Fixes: 43acd6516ca9 ("drm/i915: Keep a per-engine request pool")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 526c1e9acbd5..6e357183bece 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -121,8 +121,29 @@ static void i915_fence_release(struct dma_fence *fence)
 	i915_sw_fence_fini(&rq->submit);
 	i915_sw_fence_fini(&rq->semaphore);
 
-	/* Keep one request on each engine for reserved use under mempressure */
-	if (!cmpxchg(&rq->engine->request_pool, NULL, rq))
+	/*
+	 * Keep one request on each engine for reserved use under mempressure
+	 *
+	 * We do not hold a reference to the engine here and so have to be
+	 * very careful in what rq->engine we poke. The virtual engine is
+	 * referenced via the rq->context and we released that ref during
+	 * i915_request_retire(), ergo we must not dereference a virtual
+	 * engine here. Not that we would want to, as the only consumer of
+	 * the reserved engine->request_pool is the power management parking,
+	 * which must-not-fail, and that is only run on the physical engines.
+	 *
+	 * Since the request must have been executed to have completed,
+	 * we know that it will have been processed by the HW and will
+	 * not be unsubmitted again, so rq->engine and rq->execution_mask
+	 * at this point are stable. rq->execution_mask will be a single
+	 * bit if the last and only engine it could execute on was a
+	 * physical engine; if it's multiple bits then it started on, and
+	 * could still be on, a virtual engine. Thus if the mask is not a
+	 * power-of-two we assume that rq->engine may still be a virtual
+	 * engine and so a dangling invalid pointer that we cannot dereference.
+	 */
+	if (is_power_of_2(rq->execution_mask) &&
+	    !cmpxchg(&rq->engine->request_pool, NULL, rq))
 		return;
 
 	kmem_cache_free(global.slab_requests, rq);
-- 
2.20.1

* [Intel-gfx] [PATCH 04/22] drm/i915: Move saturated workload detection back to the context
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

When we introduced the saturated workload detection to tell us to back
off from semaphore usage [semaphores have a noticeable impact on
contended bus cycles with the CPU for some heavy workloads], we first
introduced it as a per-context tracker. This allows individual contexts
to try and optimise their own usage, but we found that with the local
tracking and the no-semaphore boosting, the first context to disable
semaphores got a massive priority boost and so would starve the rest and
all new contexts (as they started with semaphores enabled and lower
priority). Hence we moved the saturated workload detection to the
engine, and as a consequence had to disable semaphores on virtual
engines.

Now that we do not have semaphore priority boosting, we can move the
tracking back to the context and virtual engines can now utilise the
faster inter-engine synchronisation.
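
A condensed sketch of the resulting gate (the emit_semaphore_wait()
caller is not part of this diff; its shape is assumed from the
already_busywaiting() helper below, and the emitter names here are
hypothetical):

    /* Skip the HW semaphore if this context has previously submitted a
     * semaphore too late on any engine the signaler may run on.
     */
    if (already_busywaiting(to) & READ_ONCE(from->engine)->mask)
    	return await_fence_by_signaling(to, from); /* hypothetical fallback */

    return emit_hw_semaphore_wait(to, from);       /* hypothetical emitter */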

References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context.c       |  1 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |  2 ++
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |  2 --
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  2 --
 drivers/gpu/drm/i915/gt/intel_lrc.c           | 15 ---------------
 drivers/gpu/drm/i915/i915_request.c           |  4 ++--
 6 files changed, 5 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index e4aece20bc80..762a251d553b 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -268,6 +268,7 @@ static int __intel_context_active(struct i915_active *active)
 	if (err)
 		goto err_timeline;
 
+	ce->saturated = 0;
 	return 0;
 
 err_timeline:
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 4954b0df4864..aed26d93c2ca 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -78,6 +78,8 @@ struct intel_context {
 	} lrc;
 	u32 tag; /* cookie passed to HW to track this context on submission */
 
+	intel_engine_mask_t saturated; /* submitting semaphores too late? */
+
 	/* Time on GPU as tracked by the hw. */
 	struct {
 		struct ewma_runtime avg;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index d0a1078ef632..6d7fdba5adef 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -229,8 +229,6 @@ static int __engine_park(struct intel_wakeref *wf)
 	struct intel_engine_cs *engine =
 		container_of(wf, typeof(*engine), wakeref);
 
-	engine->saturated = 0;
-
 	/*
 	 * If one and only one request is completed between pm events,
 	 * we know that we are inside the kernel context and it is
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 2b6cdf47d428..c443b6bb884b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -332,8 +332,6 @@ struct intel_engine_cs {
 
 	struct intel_context *kernel_context; /* pinned */
 
-	intel_engine_mask_t saturated; /* submitting semaphores too late? */
-
 	struct {
 		struct delayed_work work;
 		struct i915_request *systole;
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3214a4ecc31a..42cb0cae2845 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -5651,21 +5651,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 	ve->base.instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
 	ve->base.uabi_instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
 
-	/*
-	 * The decision on whether to submit a request using semaphores
-	 * depends on the saturated state of the engine. We only compute
-	 * this during HW submission of the request, and we need for this
-	 * state to be globally applied to all requests being submitted
-	 * to this engine. Virtual engines encompass more than one physical
-	 * engine and so we cannot accurately tell in advance if one of those
-	 * engines is already saturated and so cannot afford to use a semaphore
-	 * and be pessimized in priority for doing so -- if we are the only
-	 * context using semaphores after all other clients have stopped, we
-	 * will be starved on the saturated system. Such a global switch for
-	 * semaphores is less than ideal, but alas is the current compromise.
-	 */
-	ve->base.saturated = ALL_ENGINES;
-
 	snprintf(ve->base.name, sizeof(ve->base.name), "virtual");
 
 	intel_engine_init_active(&ve->base, ENGINE_VIRTUAL);
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 6e357183bece..8e5511bc0f22 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -488,7 +488,7 @@ bool __i915_request_submit(struct i915_request *request)
 	 */
 	if (request->sched.semaphores &&
 	    i915_sw_fence_signaled(&request->semaphore))
-		engine->saturated |= request->sched.semaphores;
+		request->context->saturated |= request->sched.semaphores;
 
 	engine->emit_fini_breadcrumb(request,
 				     request->ring->vaddr + request->postfix);
@@ -940,7 +940,7 @@ already_busywaiting(struct i915_request *rq)
 	 *
 	 * See the are-we-too-late? check in __i915_request_submit().
 	 */
-	return rq->sched.semaphores | READ_ONCE(rq->engine->saturated);
+	return rq->sched.semaphores | READ_ONCE(rq->context->saturated);
 }
 
 static int
-- 
2.20.1

* [Intel-gfx] [PATCH 05/22] drm/i915/gt: Use virtual_engine during execlists_dequeue
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Rather than going back and forth between the rb_node entry and the
virtual_engine type, store the ve in a local and reuse it. As the
container_of conversion from rb_node to virtual_engine requires a
variable offset, performing that conversion just once shaves off a bit
of code.

v2: Keep a single virtual engine lookup, for typical use.
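
The variable offset arises because the rb_node lives in a per-engine
array inside the virtual engine, so container_of() cannot use a
compile-time constant. Roughly, each conversion expands to (sketch):

    /* rb_entry(rb, typeof(*ve), nodes[engine->id].rb) becomes pointer
     * arithmetic whose offset depends on engine->id at runtime:
     */
    ve = (struct virtual_engine *)
    	((char *)rb - offsetof(struct virtual_engine, nodes[engine->id].rb));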

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 217 +++++++++++++---------------
 1 file changed, 104 insertions(+), 113 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 42cb0cae2845..fe9e31a25a76 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -454,7 +454,7 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
 
 static inline bool need_preempt(const struct intel_engine_cs *engine,
 				const struct i915_request *rq,
-				struct rb_node *rb)
+				struct virtual_engine *ve)
 {
 	int last_prio;
 
@@ -491,9 +491,7 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
 	    rq_prio(list_next_entry(rq, sched.link)) > last_prio)
 		return true;
 
-	if (rb) {
-		struct virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+	if (ve) {
 		bool preempt = false;
 
 		if (engine == ve->siblings[0]) { /* only preempt one sibling */
@@ -1815,6 +1813,35 @@ static bool virtual_matches(const struct virtual_engine *ve,
 	return true;
 }
 
+static struct virtual_engine *
+first_virtual_engine(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists *el = &engine->execlists;
+	struct rb_node *rb = rb_first_cached(&el->virtual);
+
+	while (rb) {
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq = READ_ONCE(ve->request);
+
+		if (!rq) { /* lazily cleanup after another engine handled rq */
+			rb_erase_cached(rb, &el->virtual);
+			RB_CLEAR_NODE(rb);
+			rb = rb_first_cached(&el->virtual);
+			continue;
+		}
+
+		if (!virtual_matches(ve, rq, engine)) {
+			rb = rb_next(rb);
+			continue;
+		}
+
+		return ve;
+	}
+
+	return NULL;
+}
+
 static void virtual_xfer_breadcrumbs(struct virtual_engine *ve)
 {
 	/*
@@ -1899,7 +1926,7 @@ static void defer_active(struct intel_engine_cs *engine)
 static bool
 need_timeslice(const struct intel_engine_cs *engine,
 	       const struct i915_request *rq,
-	       const struct rb_node *rb)
+	       struct virtual_engine *ve)
 {
 	int hint;
 
@@ -1908,9 +1935,7 @@ need_timeslice(const struct intel_engine_cs *engine,
 
 	hint = engine->execlists.queue_priority_hint;
 
-	if (rb) {
-		const struct virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+	if (ve) {
 		const struct intel_engine_cs *inflight =
 			intel_context_inflight(&ve->context);
 
@@ -2060,7 +2085,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_request **port = execlists->pending;
 	struct i915_request ** const last_port = port + execlists->port_mask;
-	struct i915_request * const *active;
+	struct i915_request * const *active = READ_ONCE(execlists->active);
+	struct virtual_engine *ve = first_virtual_engine(engine);
 	struct i915_request *last;
 	struct rb_node *rb;
 	bool submit = false;
@@ -2087,26 +2113,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 * and context switches) submission.
 	 */
 
-	for (rb = rb_first_cached(&execlists->virtual); rb; ) {
-		struct virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
-		struct i915_request *rq = READ_ONCE(ve->request);
-
-		if (!rq) { /* lazily cleanup after another engine handled rq */
-			rb_erase_cached(rb, &execlists->virtual);
-			RB_CLEAR_NODE(rb);
-			rb = rb_first_cached(&execlists->virtual);
-			continue;
-		}
-
-		if (!virtual_matches(ve, rq, engine)) {
-			rb = rb_next(rb);
-			continue;
-		}
-
-		break;
-	}
-
 	/*
 	 * If the queue is higher priority than the last
 	 * request in the currently active context, submit afresh.
@@ -2114,10 +2120,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 * the active context to interject the preemption request,
 	 * i.e. we will retrigger preemption following the ack in case
 	 * of trouble.
-	 */
-	active = READ_ONCE(execlists->active);
-
-	/*
+	 *
 	 * In theory we can skip over completed contexts that have not
 	 * yet been processed by events (as those events are in flight):
 	 *
@@ -2128,9 +2131,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 * find itself trying to jump back into a context it has just
 	 * completed and barf.
 	 */
-
 	if ((last = *active)) {
-		if (need_preempt(engine, last, rb)) {
+		if (need_preempt(engine, last, ve)) {
 			if (i915_request_completed(last)) {
 				tasklet_hi_schedule(&execlists->tasklet);
 				return;
@@ -2161,7 +2163,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			__unwind_incomplete_requests(engine);
 
 			last = NULL;
-		} else if (need_timeslice(engine, last, rb) &&
+		} else if (need_timeslice(engine, last, ve) &&
 			   timeslice_expired(execlists, last)) {
 			if (i915_request_completed(last)) {
 				tasklet_hi_schedule(&execlists->tasklet);
@@ -2215,110 +2217,99 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		}
 	}
 
-	while (rb) { /* XXX virtual is always taking precedence */
-		struct virtual_engine *ve =
-			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+	while (ve) { /* XXX virtual is always taking precedence */
 		struct i915_request *rq;
 
 		spin_lock(&ve->base.active.lock);
 
 		rq = ve->request;
-		if (unlikely(!rq)) { /* lost the race to a sibling */
-			spin_unlock(&ve->base.active.lock);
-			rb_erase_cached(rb, &execlists->virtual);
-			RB_CLEAR_NODE(rb);
-			rb = rb_first_cached(&execlists->virtual);
-			continue;
-		}
+		if (unlikely(!rq)) /* lost the race to a sibling */
+			goto unlock;
 
 		GEM_BUG_ON(rq != ve->request);
 		GEM_BUG_ON(rq->engine != &ve->base);
 		GEM_BUG_ON(rq->context != &ve->context);
 
-		if (rq_prio(rq) >= queue_prio(execlists)) {
-			if (!virtual_matches(ve, rq, engine)) {
-				spin_unlock(&ve->base.active.lock);
-				rb = rb_next(rb);
-				continue;
-			}
+		if (unlikely(rq_prio(rq) < queue_prio(execlists))) {
+			spin_unlock(&ve->base.active.lock);
+			break;
+		}
 
-			if (last && !can_merge_rq(last, rq)) {
-				spin_unlock(&ve->base.active.lock);
-				start_timeslice(engine, rq_prio(rq));
-				return; /* leave this for another sibling */
-			}
+		GEM_BUG_ON(!virtual_matches(ve, rq, engine));
 
-			ENGINE_TRACE(engine,
-				     "virtual rq=%llx:%lld%s, new engine? %s\n",
-				     rq->fence.context,
-				     rq->fence.seqno,
-				     i915_request_completed(rq) ? "!" :
-				     i915_request_started(rq) ? "*" :
-				     "",
-				     yesno(engine != ve->siblings[0]));
-
-			WRITE_ONCE(ve->request, NULL);
-			WRITE_ONCE(ve->base.execlists.queue_priority_hint,
-				   INT_MIN);
-			rb_erase_cached(rb, &execlists->virtual);
-			RB_CLEAR_NODE(rb);
+		if (last && !can_merge_rq(last, rq)) {
+			spin_unlock(&ve->base.active.lock);
+			start_timeslice(engine, rq_prio(rq));
+			return; /* leave this for another sibling */
+		}
 
-			GEM_BUG_ON(!(rq->execution_mask & engine->mask));
-			WRITE_ONCE(rq->engine, engine);
+		ENGINE_TRACE(engine,
+			     "virtual rq=%llx:%lld%s, new engine? %s\n",
+			     rq->fence.context,
+			     rq->fence.seqno,
+			     i915_request_completed(rq) ? "!" :
+			     i915_request_started(rq) ? "*" :
+			     "",
+			     yesno(engine != ve->siblings[0]));
 
-			if (engine != ve->siblings[0]) {
-				u32 *regs = ve->context.lrc_reg_state;
-				unsigned int n;
+		WRITE_ONCE(ve->request, NULL);
+		WRITE_ONCE(ve->base.execlists.queue_priority_hint,
+			   INT_MIN);
 
-				GEM_BUG_ON(READ_ONCE(ve->context.inflight));
+		rb = &ve->nodes[engine->id].rb;
+		rb_erase_cached(rb, &execlists->virtual);
+		RB_CLEAR_NODE(rb);
 
-				if (!intel_engine_has_relative_mmio(engine))
-					virtual_update_register_offsets(regs,
-									engine);
+		GEM_BUG_ON(!(rq->execution_mask & engine->mask));
+		WRITE_ONCE(rq->engine, engine);
 
-				if (!list_empty(&ve->context.signals))
-					virtual_xfer_breadcrumbs(ve);
+		if (engine != ve->siblings[0]) {
+			u32 *regs = ve->context.lrc_reg_state;
+			unsigned int n;
 
-				/*
-				 * Move the bound engine to the top of the list
-				 * for future execution. We then kick this
-				 * tasklet first before checking others, so that
-				 * we preferentially reuse this set of bound
-				 * registers.
-				 */
-				for (n = 1; n < ve->num_siblings; n++) {
-					if (ve->siblings[n] == engine) {
-						swap(ve->siblings[n],
-						     ve->siblings[0]);
-						break;
-					}
-				}
+			GEM_BUG_ON(READ_ONCE(ve->context.inflight));
 
-				GEM_BUG_ON(ve->siblings[0] != engine);
-			}
+			if (!intel_engine_has_relative_mmio(engine))
+				virtual_update_register_offsets(regs,
+								engine);
 
-			if (__i915_request_submit(rq)) {
-				submit = true;
-				last = rq;
-			}
-			i915_request_put(rq);
+			if (!list_empty(&ve->context.signals))
+				virtual_xfer_breadcrumbs(ve);
 
 			/*
-			 * Hmm, we have a bunch of virtual engine requests,
-			 * but the first one was already completed (thanks
-			 * preempt-to-busy!). Keep looking at the veng queue
-			 * until we have no more relevant requests (i.e.
-			 * the normal submit queue has higher priority).
+			 * Move the bound engine to the top of the list for
+			 * future execution. We then kick this tasklet first
+			 * before checking others, so that we preferentially
+			 * reuse this set of bound registers.
 			 */
-			if (!submit) {
-				spin_unlock(&ve->base.active.lock);
-				rb = rb_first_cached(&execlists->virtual);
-				continue;
+			for (n = 1; n < ve->num_siblings; n++) {
+				if (ve->siblings[n] == engine) {
+					swap(ve->siblings[n],
+					     ve->siblings[0]);
+					break;
+				}
 			}
+
+			GEM_BUG_ON(ve->siblings[0] != engine);
 		}
 
+		if (__i915_request_submit(rq)) {
+			submit = true;
+			last = rq;
+		}
+
+		i915_request_put(rq);
+unlock:
 		spin_unlock(&ve->base.active.lock);
-		break;
+
+		/*
+		 * Hmm, we have a bunch of virtual engine requests,
+		 * but the first one was already completed (thanks
+		 * preempt-to-busy!). Keep looking at the veng queue
+		 * until we have no more relevant requests (i.e.
+		 * the normal submit queue has higher priority).
+		 */
+		ve = submit ? NULL : first_virtual_engine(engine);
 	}
 
 	while ((rb = rb_first_cached(&execlists->queue))) {
-- 
2.20.1

* [Intel-gfx] [PATCH 06/22] drm/i915/gt: Decouple inflight virtual engines
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Once a virtual engine has been bound to a sibling, it will remain bound
until we finally schedule out the last active request. We cannot rebind
the context to a new sibling while it is inflight, as the context save
would conflict, hence we wait. As we cannot use any other sibling while
the context is inflight, only kick the bound sibling while it is
inflight, and upon scheduling out kick the rest (so that we can swap
engines on timeslicing if the previously bound engine becomes
oversubscribed).
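
Condensed, the submission tasklet loop now stops as soon as the context
is found inflight on a sibling (a restatement of the hunk below, not
new code):

    for (n = 0; n < ve->num_siblings; n++) {
    	/* ...queue ve->request onto sibling n's virtual rbtree... */
    	if (intel_context_inflight(&ve->context))
    		break;	/* bound: no other sibling can run it anyway */
    }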

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 30 +++++++++++++----------------
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index fe9e31a25a76..7eee35edfb42 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1401,9 +1401,8 @@ execlists_schedule_in(struct i915_request *rq, int idx)
 static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
 {
 	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
-	struct i915_request *next = READ_ONCE(ve->request);
 
-	if (next == rq || (next && next->execution_mask & ~rq->execution_mask))
+	if (READ_ONCE(ve->request))
 		tasklet_hi_schedule(&ve->base.execlists.tasklet);
 }
 
@@ -1824,18 +1823,14 @@ first_virtual_engine(struct intel_engine_cs *engine)
 			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
 		struct i915_request *rq = READ_ONCE(ve->request);
 
-		if (!rq) { /* lazily cleanup after another engine handled rq */
+		/* lazily cleanup after another engine handled rq */
+		if (!rq || !virtual_matches(ve, rq, engine)) {
 			rb_erase_cached(rb, &el->virtual);
 			RB_CLEAR_NODE(rb);
 			rb = rb_first_cached(&el->virtual);
 			continue;
 		}
 
-		if (!virtual_matches(ve, rq, engine)) {
-			rb = rb_next(rb);
-			continue;
-		}
-
 		return ve;
 	}
 
@@ -5471,7 +5466,6 @@ static void virtual_submission_tasklet(unsigned long data)
 	if (unlikely(!mask))
 		return;
 
-	local_irq_disable();
 	for (n = 0; n < ve->num_siblings; n++) {
 		struct intel_engine_cs *sibling = READ_ONCE(ve->siblings[n]);
 		struct ve_node * const node = &ve->nodes[sibling->id];
@@ -5481,20 +5475,19 @@ static void virtual_submission_tasklet(unsigned long data)
 		if (!READ_ONCE(ve->request))
 			break; /* already handled by a sibling's tasklet */
 
+		spin_lock_irq(&sibling->active.lock);
+
 		if (unlikely(!(mask & sibling->mask))) {
 			if (!RB_EMPTY_NODE(&node->rb)) {
-				spin_lock(&sibling->active.lock);
 				rb_erase_cached(&node->rb,
 						&sibling->execlists.virtual);
 				RB_CLEAR_NODE(&node->rb);
-				spin_unlock(&sibling->active.lock);
 			}
-			continue;
-		}
 
-		spin_lock(&sibling->active.lock);
+			goto unlock_engine;
+		}
 
-		if (!RB_EMPTY_NODE(&node->rb)) {
+		if (unlikely(!RB_EMPTY_NODE(&node->rb))) {
 			/*
 			 * Cheat and avoid rebalancing the tree if we can
 			 * reuse this node in situ.
@@ -5534,9 +5527,12 @@ static void virtual_submission_tasklet(unsigned long data)
 		if (first && prio > sibling->execlists.queue_priority_hint)
 			tasklet_hi_schedule(&sibling->execlists.tasklet);
 
-		spin_unlock(&sibling->active.lock);
+unlock_engine:
+		spin_unlock_irq(&sibling->active.lock);
+
+		if (intel_context_inflight(&ve->context))
+			break;
 	}
-	local_irq_enable();
 }
 
 static void virtual_submit_request(struct i915_request *rq)
-- 
2.20.1

* [Intel-gfx] [PATCH 07/22] drm/i915/gt: Resubmit the virtual engine on schedule-out
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Having recognised that we do not change the sibling until we schedule
out, we can then defer the decision to resubmit the virtual engine from
the unwind of the active queue to scheduling out of the virtual context.

By keeping the unwind order intact on the local engine, we can preserve
data dependency ordering while doing a preempt-to-busy pass until we
have determined the new ELSP. This means that if we try to timeslice
between a virtual engine and a data-dependent ordinary request, the pair
will maintain their relative ordering and we will avoid the
resubmission, cancelling the timeslicing until further change.

The dilemma though is that we then may end up in a situation where the
'demotion' of the virtual request to an ordinary request in the engine
queue results in filling the ELSP[] with virtual requests instead of
spreading the load across the engines. To compensate for this, we mark
each virtual request and refuse to resubmit a virtual request in the
secondary ELSP slots, thus forcing subsequent virtual requests to be
scheduled out after timeslicing. By delaying the decision until we
schedule out, we will avoid unnecessary resubmission.
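
The refusal itself is a one-line test during dequeue, restated here for
clarity (see the hunk below):

    /* A virtual request is one that may run on more than one engine: */
    if (rq->execution_mask != engine->mask)
    	goto done;	/* keep it out of the secondary ELSP slots */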

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c    | 99 ++++++++++++++++----------
 drivers/gpu/drm/i915/gt/selftest_lrc.c |  2 +-
 2 files changed, 62 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 7eee35edfb42..f85a2be92dc6 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1117,46 +1117,17 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 
 		__i915_request_unsubmit(rq);
 
-		/*
-		 * Push the request back into the queue for later resubmission.
-		 * If this request is not native to this physical engine (i.e.
-		 * it came from a virtual source), push it back onto the virtual
-		 * engine so that it can be moved across onto another physical
-		 * engine as load dictates.
-		 */
-		if (likely(rq->execution_mask == engine->mask)) {
-			GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
-			if (rq_prio(rq) != prio) {
-				prio = rq_prio(rq);
-				pl = i915_sched_lookup_priolist(engine, prio);
-			}
-			GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
-
-			list_move(&rq->sched.link, pl);
-			set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+		GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
+		if (rq_prio(rq) != prio) {
+			prio = rq_prio(rq);
+			pl = i915_sched_lookup_priolist(engine, prio);
+		}
+		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
 
-			active = rq;
-		} else {
-			struct intel_engine_cs *owner = rq->context->engine;
+		list_move(&rq->sched.link, pl);
+		set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
 
-			/*
-			 * Decouple the virtual breadcrumb before moving it
-			 * back to the virtual engine -- we don't want the
-			 * request to complete in the background and try
-			 * and cancel the breadcrumb on the virtual engine
-			 * (instead of the old engine where it is linked)!
-			 */
-			if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
-				     &rq->fence.flags)) {
-				spin_lock_nested(&rq->lock,
-						 SINGLE_DEPTH_NESTING);
-				i915_request_cancel_breadcrumb(rq);
-				spin_unlock(&rq->lock);
-			}
-			WRITE_ONCE(rq->engine, owner);
-			owner->submit_request(rq);
-			active = NULL;
-		}
+		active = rq;
 	}
 
 	return active;
@@ -1398,12 +1369,41 @@ execlists_schedule_in(struct i915_request *rq, int idx)
 	return i915_request_get(rq);
 }
 
+static void
+resubmit_virtual_request(struct i915_request *rq, struct virtual_engine *ve)
+{
+	/*
+	 * Decouple the virtual breadcrumb before moving it back to the virtual
+	 * engine -- we don't want the request to complete in the background
+	 * and then try and cancel the breadcrumb on the virtual engine
+	 * (instead of the old engine where it is linked)!
+	 */
+	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags)) {
+		spin_lock_nested(&rq->lock, SINGLE_DEPTH_NESTING);
+		i915_request_cancel_breadcrumb(rq);
+		spin_unlock(&rq->lock);
+	}
+
+	WRITE_ONCE(rq->engine, &ve->base);
+	ve->base.submit_request(rq);
+}
+
 static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
 {
 	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
 
 	if (READ_ONCE(ve->request))
 		tasklet_hi_schedule(&ve->base.execlists.tasklet);
+
+	/*
+	 * This engine is now too busy to run this virtual request, so
+	 * see if we can find an alternative engine for it to execute on.
+	 * Once a request has become bonded to this engine, we treat it the
+	 * same as any other native request.
+	 */
+	if (i915_request_in_priority_queue(rq) &&
+	    rq->execution_mask != rq->engine->mask)
+		resubmit_virtual_request(rq, ve);
 }
 
 static inline void
@@ -1649,6 +1649,20 @@ assert_pending_valid(const struct intel_engine_execlists *execlists,
 			return false;
 		}
 
+		/*
+		 * We want virtual requests to only be in the first slot so
+		 * that they are never stuck behind a hog and can be immediately
+		 * transferred onto the next idle engine.
+		 */
+		if (rq->execution_mask != engine->mask &&
+		    port != execlists->pending) {
+			GEM_TRACE_ERR("%s: virtual engine:%llx not in prime position[%zd]\n",
+				      engine->name,
+				      ce->timeline->fence_context,
+				      port - execlists->pending);
+			return false;
+		}
+
 		/* Hold tightly onto the lock to prevent concurrent retires! */
 		if (!spin_trylock_irqsave(&rq->lock, flags))
 			continue;
@@ -2346,6 +2360,15 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				if (i915_request_has_sentinel(last))
 					goto done;
 
+				/*
+				 * We avoid submitting virtual requests into
+				 * the secondary ports so that we can migrate
+				 * the request immediately to another engine
+				 * rather than wait for the primary request.
+				 */
+				if (rq->execution_mask != engine->mask)
+					goto done;
+
 				/*
 				 * If GVT overrides us we only ever submit
 				 * port[0], leaving port[1] empty. Note that we
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index ef38dd52945c..c3d722840e2d 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -4384,7 +4384,7 @@ static int reset_virtual_engine(struct intel_gt *gt,
 	spin_lock_irq(&engine->active.lock);
 	__unwind_incomplete_requests(engine);
 	spin_unlock_irq(&engine->active.lock);
-	GEM_BUG_ON(rq->engine != ve->engine);
+	GEM_BUG_ON(rq->engine != engine);
 
 	/* Reset the engine while keeping our active request on hold */
 	execlists_hold(engine, rq);
-- 
2.20.1

* [Intel-gfx] [PATCH 08/22] drm/i915: Improve execute_cb struct packing
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Reuse the irq_work llist node for attaching the callbacks to the
signal, giving both smaller structs (two fewer pointers!) and simpler
[debug] code:

Function                                     old     new   delta
irq_execute_cb                                35      34      -1
__igt_breadcrumbs_smoketest                 1684    1682      -2
i915_request_retire                         2003    1996      -7
__i915_request_create                       1047    1040      -7
__notify_execute_cb                          135     126      -9
__i915_request_ctor                          188     178     -10
__await_execution.part.constprop             451     440     -11
igt_wait_request                             924     714    -210

One minor artifact is that the order of cb execution is reversed. No
current use cases are affected by that change.
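
The two saved pointers come from dropping the standalone list_head and
chaining through the llist_node already embedded in struct irq_work.
Illustrative layouts only (assuming 64-bit pointers, other members
omitted):

    struct execute_cb_before {
    	struct list_head link;	/* two pointers: 16 bytes */
    	struct irq_work work;	/* already embeds a struct llist_node */
    	struct i915_sw_fence *fence;
    };

    struct execute_cb_after {	/* link removed; work.llnode chains instead */
    	struct irq_work work;
    	struct i915_sw_fence *fence;
    };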

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_request.c | 18 +++++++++---------
 drivers/gpu/drm/i915/i915_request.h |  2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 8e5511bc0f22..6f8b5794b73c 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -42,7 +42,6 @@
 #include "intel_pm.h"
 
 struct execute_cb {
-	struct list_head link;
 	struct irq_work work;
 	struct i915_sw_fence *fence;
 	void (*hook)(struct i915_request *rq, struct dma_fence *signal);
@@ -179,14 +178,14 @@ static void irq_execute_cb_hook(struct irq_work *wrk)
 
 static void __notify_execute_cb(struct i915_request *rq)
 {
-	struct execute_cb *cb;
+	struct execute_cb *cb, *cn;
 
 	lockdep_assert_held(&rq->lock);
 
-	if (list_empty(&rq->execute_cb))
+	if (llist_empty(&rq->execute_cb))
 		return;
 
-	list_for_each_entry(cb, &rq->execute_cb, link)
+	llist_for_each_entry_safe(cb, cn, rq->execute_cb.first, work.llnode)
 		irq_work_queue(&cb->work);
 
 	/*
@@ -199,7 +198,7 @@ static void __notify_execute_cb(struct i915_request *rq)
 	 * preempt-to-idle cycle on the target engine, all the while the
 	 * master execute_cb may refire.
 	 */
-	INIT_LIST_HEAD(&rq->execute_cb);
+	rq->execute_cb.first = NULL;
 }
 
 static inline void
@@ -317,7 +316,7 @@ bool i915_request_retire(struct i915_request *rq)
 		set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
 		__notify_execute_cb(rq);
 	}
-	GEM_BUG_ON(!list_empty(&rq->execute_cb));
+	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
 	spin_unlock_irq(&rq->lock);
 
 	remove_from_client(rq);
@@ -385,7 +384,8 @@ __await_execution(struct i915_request *rq,
 		i915_sw_fence_complete(cb->fence);
 		kmem_cache_free(global.slab_execute_cbs, cb);
 	} else {
-		list_add_tail(&cb->link, &signal->execute_cb);
+		cb->work.llnode.next = signal->execute_cb.first;
+		signal->execute_cb.first = &cb->work.llnode;
 	}
 	spin_unlock_irq(&signal->lock);
 
@@ -694,7 +694,7 @@ static void __i915_request_ctor(void *arg)
 	rq->file_priv = NULL;
 	rq->capture_list = NULL;
 
-	INIT_LIST_HEAD(&rq->execute_cb);
+	init_llist_head(&rq->execute_cb);
 }
 
 struct i915_request *
@@ -784,7 +784,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 	rq->batch = NULL;
 	GEM_BUG_ON(rq->file_priv);
 	GEM_BUG_ON(rq->capture_list);
-	GEM_BUG_ON(!list_empty(&rq->execute_cb));
+	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
 
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 8ec7ee4dbadc..5d4709a3dace 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -214,7 +214,7 @@ struct i915_request {
 			ktime_t emitted;
 		} duration;
 	};
-	struct list_head execute_cb;
+	struct llist_head execute_cb;
 	struct i915_sw_fence semaphore;
 
 	/*
-- 
2.20.1

* [Intel-gfx] [PATCH 09/22] dma-buf: Proxy fence, an unsignaled fence placeholder
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Often we need to create a fence for a future event that has not yet been
associated with a fence. We can store a proxy fence, a placeholder, in
the timeline and replace it later when the real fence is known. Any
listeners that attach to the proxy fence will automatically be signaled
when the real fence completes, and any future listeners will instead
be attached directly to the real fence, avoiding any indirection
overhead.
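
A sketch of the intended flow, mirroring the selftests added below
(get_the_real_fence() is a hypothetical producer):

    struct dma_fence __rcu *slot;
    struct dma_fence *proxy, *real, *old;

    proxy = dma_fence_create_proxy();	/* placeholder, unsignaled */
    RCU_INIT_POINTER(slot, proxy);
    /* ...listeners may attach to the proxy in the meantime... */

    real = get_the_real_fence();	/* hypothetical producer */
    old = dma_fence_replace_proxy(&slot, real); /* listeners carry over */
    dma_fence_put(old);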

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/dma-buf/Makefile             |  13 +-
 drivers/dma-buf/dma-fence-private.h  |  20 +
 drivers/dma-buf/dma-fence-proxy.c    | 248 ++++++++++
 drivers/dma-buf/dma-fence.c          |   4 +-
 drivers/dma-buf/selftests.h          |   1 +
 drivers/dma-buf/st-dma-fence-proxy.c | 699 +++++++++++++++++++++++++++
 include/linux/dma-fence-proxy.h      |  34 ++
 7 files changed, 1015 insertions(+), 4 deletions(-)
 create mode 100644 drivers/dma-buf/dma-fence-private.h
 create mode 100644 drivers/dma-buf/dma-fence-proxy.c
 create mode 100644 drivers/dma-buf/st-dma-fence-proxy.c
 create mode 100644 include/linux/dma-fence-proxy.h

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 995e05f609ff..afaf6dadd9a3 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,6 +1,12 @@
 # SPDX-License-Identifier: GPL-2.0-only
-obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
-	 dma-resv.o seqno-fence.o
+obj-y := \
+	dma-buf.o \
+	dma-fence.o \
+	dma-fence-array.o \
+	dma-fence-chain.o \
+	dma-fence-proxy.o \
+	dma-resv.o \
+	seqno-fence.o
 obj-$(CONFIG_DMABUF_HEAPS)	+= dma-heap.o
 obj-$(CONFIG_DMABUF_HEAPS)	+= heaps/
 obj-$(CONFIG_SYNC_FILE)		+= sync_file.o
@@ -10,6 +16,7 @@ obj-$(CONFIG_UDMABUF)		+= udmabuf.o
 dmabuf_selftests-y := \
 	selftest.o \
 	st-dma-fence.o \
-	st-dma-fence-chain.o
+	st-dma-fence-chain.o \
+	st-dma-fence-proxy.o
 
 obj-$(CONFIG_DMABUF_SELFTESTS)	+= dmabuf_selftests.o
diff --git a/drivers/dma-buf/dma-fence-private.h b/drivers/dma-buf/dma-fence-private.h
new file mode 100644
index 000000000000..6924d28af0fa
--- /dev/null
+++ b/drivers/dma-buf/dma-fence-private.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Fence mechanism for dma-buf and to allow for asynchronous dma access
+ *
+ * Copyright (C) 2012 Canonical Ltd
+ * Copyright (C) 2012 Texas Instruments
+ *
+ * Authors:
+ * Rob Clark <robdclark@gmail.com>
+ * Maarten Lankhorst <maarten.lankhorst@canonical.com>
+ */
+
+#ifndef DMA_FENCE_PRIVATE_H
+#define DMA_FENCE_PRIVATE_H
+
+struct dma_fence;
+
+bool __dma_fence_enable_signaling(struct dma_fence *fence);
+
+#endif /* DMA_FENCE_PRIVATE_H */
diff --git a/drivers/dma-buf/dma-fence-proxy.c b/drivers/dma-buf/dma-fence-proxy.c
new file mode 100644
index 000000000000..f0cd89b966e0
--- /dev/null
+++ b/drivers/dma-buf/dma-fence-proxy.c
@@ -0,0 +1,248 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * dma-fence-proxy: placeholder unsignaled fence
+ *
+ * Copyright (C) 2017-2019 Intel Corporation
+ */
+
+#include <linux/dma-fence.h>
+#include <linux/dma-fence-proxy.h>
+#include <linux/export.h>
+#include <linux/irq_work.h>
+#include <linux/slab.h>
+
+#include "dma-fence-private.h"
+
+struct dma_fence_proxy {
+	struct dma_fence base;
+
+	struct dma_fence *real;
+	struct dma_fence_cb cb;
+	struct irq_work work;
+
+	wait_queue_head_t wq;
+};
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+#define same_lockclass(A, B) ((A)->dep_map.key == (B)->dep_map.key)
+#else
+#define same_lockclass(A, B) 0
+#endif
+
+static const char *proxy_get_driver_name(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+	struct dma_fence *real = READ_ONCE(p->real);
+
+	return real ? real->ops->get_driver_name(real) : "proxy";
+}
+
+static const char *proxy_get_timeline_name(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+	struct dma_fence *real = READ_ONCE(p->real);
+
+	return real ? real->ops->get_timeline_name(real) : "unset";
+}
+
+static void proxy_irq_work(struct irq_work *work)
+{
+	struct dma_fence_proxy *p = container_of(work, typeof(*p), work);
+
+	dma_fence_signal(&p->base);
+	dma_fence_put(&p->base);
+}
+
+static void proxy_callback(struct dma_fence *real, struct dma_fence_cb *cb)
+{
+	struct dma_fence_proxy *p = container_of(cb, typeof(*p), cb);
+
+	if (real->error)
+		dma_fence_set_error(&p->base, real->error);
+
+	/* Lower the height of the proxy chain -> single stack frame */
+	irq_work_queue(&p->work);
+}
+
+static bool proxy_enable_signaling(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+	struct dma_fence *real = READ_ONCE(p->real);
+	bool ret = true;
+
+	if (real) {
+		spin_lock_nested(real->lock,
+				 same_lockclass(&p->wq.lock, real->lock));
+		ret = __dma_fence_enable_signaling(real);
+		spin_unlock(real->lock);
+	}
+
+	return ret;
+}
+
+static void proxy_release(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+
+	dma_fence_put(p->real);
+	dma_fence_free(&p->base);
+}
+
+const struct dma_fence_ops dma_fence_proxy_ops = {
+	.get_driver_name = proxy_get_driver_name,
+	.get_timeline_name = proxy_get_timeline_name,
+	.enable_signaling = proxy_enable_signaling,
+	.wait = dma_fence_default_wait,
+	.release = proxy_release,
+};
+EXPORT_SYMBOL_GPL(dma_fence_proxy_ops);
+
+/**
+ * dma_fence_create_proxy - Create an unset dma-fence
+ *
+ * dma_fence_create_proxy() creates a new dma_fence stub that is initially
+ * unsignaled and may later be replaced with a real fence. Any listeners
+ * to the proxy fence will be signaled when the target fence signals its
+ * completion.
+ */
+struct dma_fence *dma_fence_create_proxy(void)
+{
+	struct dma_fence_proxy *p;
+
+	p = kzalloc(sizeof(*p), GFP_KERNEL);
+	if (!p)
+		return NULL;
+
+	init_waitqueue_head(&p->wq);
+	dma_fence_init(&p->base, &dma_fence_proxy_ops, &p->wq.lock,
+		       dma_fence_context_alloc(1), 0);
+	init_irq_work(&p->work, proxy_irq_work);
+
+	return &p->base;
+}
+EXPORT_SYMBOL(dma_fence_create_proxy);
+
+static void __wake_up_listeners(struct dma_fence_proxy *p)
+{
+	struct wait_queue_entry *wait, *next;
+
+	list_for_each_entry_safe(wait, next, &p->wq.head, entry) {
+		INIT_LIST_HEAD(&wait->entry);
+		wait->func(wait, TASK_NORMAL, 0, p->real);
+	}
+}
+
+static void proxy_assign(struct dma_fence *fence, struct dma_fence *real)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+	unsigned long flags;
+
+	if (WARN_ON(fence == real))
+		return;
+
+	if (WARN_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)))
+		return;
+
+	if (WARN_ON(p->real))
+		return;
+
+	spin_lock_irqsave(&p->wq.lock, flags);
+
+	if (unlikely(!real)) {
+		dma_fence_signal_locked(&p->base);
+		goto unlock;
+	}
+
+	p->real = dma_fence_get(real);
+
+	dma_fence_get(&p->base);
+	spin_lock_nested(real->lock, same_lockclass(&p->wq.lock, real->lock));
+	if (dma_fence_is_signaled_locked(real)) {
+		proxy_callback(real, &p->cb);
+	} else if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+			    &p->base.flags) &&
+		   !__dma_fence_enable_signaling(real)) {
+		proxy_callback(real, &p->cb);
+	} else {
+		p->cb.func = proxy_callback;
+		list_add_tail(&p->cb.node, &real->cb_list);
+	}
+	spin_unlock(real->lock);
+
+unlock:
+	__wake_up_listeners(p);
+	spin_unlock_irqrestore(&p->wq.lock, flags);
+}
+
+/**
+ * dma_fence_replace_proxy - Replace the proxy fence with the real target
+ * @slot: pointer to location of fence to update
+ * @fence: the new fence to store in @slot
+ *
+ * Once the real dma_fence is known, we can replace the proxy fence holder
+ * with a pointer to the real dma fence. Future listeners will attach to
+ * the real fence, avoiding any indirection overhead. Previous listeners
+ * will remain attached to the proxy fence, and be signaled in turn when
+ * the target fence completes.
+ */
+struct dma_fence *
+dma_fence_replace_proxy(struct dma_fence __rcu **slot, struct dma_fence *fence)
+{
+	struct dma_fence *old;
+
+	if (fence)
+		dma_fence_get(fence);
+
+	old = rcu_replace_pointer(*slot, fence, true);
+	if (old && dma_fence_is_proxy(old))
+		proxy_assign(old, fence);
+
+	return old;
+}
+EXPORT_SYMBOL(dma_fence_replace_proxy);
+
+void dma_fence_add_proxy_listener(struct dma_fence *fence,
+				  struct wait_queue_entry *wait)
+{
+	if (dma_fence_is_proxy(fence)) {
+		struct dma_fence_proxy *p =
+			container_of(fence, typeof(*p), base);
+		unsigned long flags;
+
+		spin_lock_irqsave(&p->wq.lock, flags);
+		if (!p->real) {
+			list_add_tail(&wait->entry, &p->wq.head);
+			wait = NULL;
+		}
+		fence = p->real;
+		spin_unlock_irqrestore(&p->wq.lock, flags);
+	}
+
+	if (wait) {
+		INIT_LIST_HEAD(&wait->entry);
+		wait->func(wait, TASK_NORMAL, 0, fence);
+	}
+}
+EXPORT_SYMBOL(dma_fence_add_proxy_listener);
+
+bool dma_fence_remove_proxy_listener(struct dma_fence *fence,
+				     struct wait_queue_entry *wait)
+{
+	bool ret = false;
+
+	if (dma_fence_is_proxy(fence)) {
+		struct dma_fence_proxy *p =
+			container_of(fence, typeof(*p), base);
+		unsigned long flags;
+
+		spin_lock_irqsave(&p->wq.lock, flags);
+		if (!list_empty(&wait->entry)) {
+			list_del_init(&wait->entry);
+			ret = true;
+		}
+		spin_unlock_irqrestore(&p->wq.lock, flags);
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(dma_fence_remove_proxy_listener);
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 90edf2b281b0..5a9ff241e39e 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -19,6 +19,8 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/dma_fence.h>
 
+#include "dma-fence-private.h"
+
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
@@ -273,7 +275,7 @@ void dma_fence_free(struct dma_fence *fence)
 }
 EXPORT_SYMBOL(dma_fence_free);
 
-static bool __dma_fence_enable_signaling(struct dma_fence *fence)
+bool __dma_fence_enable_signaling(struct dma_fence *fence)
 {
 	bool was_set;
 
diff --git a/drivers/dma-buf/selftests.h b/drivers/dma-buf/selftests.h
index 55918ef9adab..616eca70e2d8 100644
--- a/drivers/dma-buf/selftests.h
+++ b/drivers/dma-buf/selftests.h
@@ -12,3 +12,4 @@
 selftest(sanitycheck, __sanitycheck__) /* keep first (igt selfcheck) */
 selftest(dma_fence, dma_fence)
 selftest(dma_fence_chain, dma_fence_chain)
+selftest(dma_fence_proxy, dma_fence_proxy)
diff --git a/drivers/dma-buf/st-dma-fence-proxy.c b/drivers/dma-buf/st-dma-fence-proxy.c
new file mode 100644
index 000000000000..c95811199c16
--- /dev/null
+++ b/drivers/dma-buf/st-dma-fence-proxy.c
@@ -0,0 +1,699 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include <linux/delay.h>
+#include <linux/dma-fence.h>
+#include <linux/dma-fence-proxy.h>
+#include <linux/kernel.h>
+#include <linux/sched/signal.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "selftest.h"
+
+static struct kmem_cache *slab_fences;
+
+static struct mock_fence {
+	struct dma_fence base;
+	spinlock_t lock;
+} *to_mock_fence(struct dma_fence *f) {
+	return container_of(f, struct mock_fence, base);
+}
+
+static const char *mock_name(struct dma_fence *f)
+{
+	return "mock";
+}
+
+static void mock_fence_release(struct dma_fence *f)
+{
+	kmem_cache_free(slab_fences, to_mock_fence(f));
+}
+
+static const struct dma_fence_ops mock_ops = {
+	.get_driver_name = mock_name,
+	.get_timeline_name = mock_name,
+	.release = mock_fence_release,
+};
+
+static struct dma_fence *mock_fence(void)
+{
+	struct mock_fence *f;
+
+	f = kmem_cache_alloc(slab_fences, GFP_KERNEL);
+	if (!f)
+		return NULL;
+
+	spin_lock_init(&f->lock);
+	dma_fence_init(&f->base, &mock_ops, &f->lock, 0, 0);
+
+	return &f->base;
+}
+
+static int sanitycheck(void *arg)
+{
+	struct dma_fence *f;
+
+	f = dma_fence_create_proxy();
+	if (!f)
+		return -ENOMEM;
+
+	dma_fence_signal(f);
+	dma_fence_put(f);
+
+	return 0;
+}
+
+struct fences {
+	struct dma_fence *real;
+	struct dma_fence *proxy;
+	struct dma_fence __rcu *slot;
+};
+
+static int create_fences(struct fences *f, bool attach)
+{
+	f->proxy = dma_fence_create_proxy();
+	if (!f->proxy)
+		return -ENOMEM;
+
+	RCU_INIT_POINTER(f->slot, f->proxy);
+
+	f->real = mock_fence();
+	if (!f->real) {
+		dma_fence_put(f->proxy);
+		return -ENOMEM;
+	}
+
+	if (attach)
+		dma_fence_replace_proxy(&f->slot, f->real);
+
+	return 0;
+}
+
+static void free_fences(struct fences *f)
+{
+	dma_fence_put(dma_fence_replace_proxy(&f->slot, NULL));
+	dma_fence_put(f->real);
+	dma_fence_put(f->proxy);
+}
+
+static int wrap_signaling(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_is_signaled(f.proxy)) {
+		pr_err("Fence unexpectedly signaled on creation\n");
+		goto err_free;
+	}
+
+	if (dma_fence_signal(f.real)) {
+		pr_err("Fence reported being already signaled\n");
+		goto err_free;
+	}
+
+	if (!dma_fence_is_signaled(f.proxy)) {
+		pr_err("Fence not reporting signaled\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_signaling_recurse(void *arg)
+{
+	struct fences f;
+	struct dma_fence *chain;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	chain = dma_fence_create_proxy();
+	if (!chain) {
+		err = -ENOMEM;
+		goto err_free;
+	}
+
+	dma_fence_replace_proxy(&f.slot, chain);
+	dma_fence_put(dma_fence_replace_proxy(&f.slot, f.real));
+	dma_fence_put(chain);
+
+	/* f.real <- chain <- f.proxy */
+
+	if (dma_fence_is_signaled(f.proxy)) {
+		pr_err("Fence unexpectedly signaled on creation\n");
+		goto err_free;
+	}
+
+	if (dma_fence_signal(f.real)) {
+		pr_err("Fence reported being already signaled\n");
+		goto err_free;
+	}
+
+	if (!dma_fence_is_signaled(f.proxy)) {
+		pr_err("Fence not reporting signaled\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+struct simple_cb {
+	struct dma_fence_cb cb;
+	bool seen;
+};
+
+static void simple_callback(struct dma_fence *f, struct dma_fence_cb *cb)
+{
+	smp_store_mb(container_of(cb, struct simple_cb, cb)->seen, true);
+}
+
+static int wrap_add_callback(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_add_callback_recurse(void *arg)
+{
+	struct simple_cb cb = {};
+	struct dma_fence *chain;
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	chain = dma_fence_create_proxy();
+	if (!chain) {
+		err = -ENOMEM;
+		goto err_free;
+	}
+
+	dma_fence_replace_proxy(&f.slot, chain);
+	dma_fence_put(dma_fence_replace_proxy(&f.slot, f.real));
+	dma_fence_put(chain);
+
+	/* f.real <- chain <- f.proxy */
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_late_add_callback(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	dma_fence_signal(f.real);
+
+	if (!dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Added callback, but fence was already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (cb.seen) {
+		pr_err("Callback called after failed attachment!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_early_add_callback(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_replace_proxy(&f.slot, f.real);
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_early_add_callback_late(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	dma_fence_signal(f.real);
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_replace_proxy(&f.slot, f.real);
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_early_add_callback_early(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_replace_proxy(&f.slot, f.real);
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_rm_callback(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	if (!dma_fence_remove_callback(f.proxy, &cb.cb)) {
+		pr_err("Failed to remove callback!\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (cb.seen) {
+		pr_err("Callback still signaled after removal!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_late_rm_callback(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	if (dma_fence_remove_callback(f.proxy, &cb.cb)) {
+		pr_err("Callback removal succeeded after being executed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_status(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_get_status(f.proxy)) {
+		pr_err("Fence unexpectedly has signaled status on creation\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (!dma_fence_get_status(f.proxy)) {
+		pr_err("Fence not reporting signaled status\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_error(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	dma_fence_set_error(f.real, -EIO);
+
+	if (dma_fence_get_status(f.proxy)) {
+		pr_err("Fence unexpectedly has error status before signal\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (dma_fence_get_status(f.proxy) != -EIO) {
+		pr_err("Fence not reporting error status, got %d\n",
+		       dma_fence_get_status(f.proxy));
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_wait(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_wait_timeout(f.proxy, false, 0) != 0) {
+		pr_err("Wait reported complete before being signaled\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+
+	if (dma_fence_wait_timeout(f.proxy, false, 0) == 0) {
+		pr_err("Wait reported incomplete after being signaled\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	dma_fence_signal(f.real);
+	free_fences(&f);
+	return err;
+}
+
+struct wait_timer {
+	struct timer_list timer;
+	struct fences f;
+};
+
+static void wait_timer(struct timer_list *timer)
+{
+	struct wait_timer *wt = from_timer(wt, timer, timer);
+
+	dma_fence_signal(wt->f.real);
+}
+
+static int wrap_wait_timeout(void *arg)
+{
+	struct wait_timer wt;
+	int err = -EINVAL;
+
+	if (create_fences(&wt.f, true))
+		return -ENOMEM;
+
+	timer_setup_on_stack(&wt.timer, wait_timer, 0);
+
+	if (dma_fence_wait_timeout(wt.f.proxy, false, 1) != 0) {
+		pr_err("Wait reported complete before being signaled\n");
+		goto err_free;
+	}
+
+	mod_timer(&wt.timer, jiffies + 1);
+
+	if (dma_fence_wait_timeout(wt.f.proxy, false, 2) != 0) {
+		if (timer_pending(&wt.timer)) {
+			pr_notice("Timer did not fire within the jiffy!\n");
+			err = 0; /* not our fault! */
+		} else {
+			pr_err("Wait reported incomplete after timeout\n");
+		}
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	del_timer_sync(&wt.timer);
+	destroy_timer_on_stack(&wt.timer);
+	dma_fence_signal(wt.f.real);
+	free_fences(&wt.f);
+	return err;
+}
+
+struct proxy_wait {
+	struct wait_queue_entry base;
+	struct dma_fence *fence;
+	bool seen;
+};
+
+static int proxy_wait_cb(struct wait_queue_entry *entry,
+			 unsigned int mode, int flags, void *key)
+{
+	struct proxy_wait *p = container_of(entry, typeof(*p), base);
+
+	p->fence = key;
+	p->seen = true;
+
+	return 0;
+}
+
+static int wrap_listen_early(void *arg)
+{
+	struct proxy_wait wait = { .base.func = proxy_wait_cb };
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	dma_fence_replace_proxy(&f.slot, f.real);
+	dma_fence_add_proxy_listener(f.proxy, &wait.base);
+
+	if (!wait.seen) {
+		pr_err("Proxy listener was not called after replace!\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	if (wait.fence != f.real) {
+		pr_err("Proxy listener was not passed the real fence!\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	dma_fence_signal(f.real);
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_listen_late(void *arg)
+{
+	struct proxy_wait wait = { .base.func = proxy_wait_cb };
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	dma_fence_add_proxy_listener(f.proxy, &wait.base);
+	dma_fence_replace_proxy(&f.slot, f.real);
+
+	if (!wait.seen) {
+		pr_err("Proxy listener was not called on replace!\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	if (wait.fence != f.real) {
+		pr_err("Proxy listener was not passed the real fence!\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	dma_fence_signal(f.real);
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_listen_cancel(void *arg)
+{
+	struct proxy_wait wait = { .base.func = proxy_wait_cb };
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	dma_fence_add_proxy_listener(f.proxy, &wait.base);
+	if (!dma_fence_remove_proxy_listener(f.proxy, &wait.base)) {
+		pr_err("Cancelling listener, already detached?\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+	dma_fence_replace_proxy(&f.slot, f.real);
+
+	if (wait.seen) {
+		pr_err("Proxy listener was called after being removed!\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	if (dma_fence_remove_proxy_listener(f.proxy, &wait.base)) {
+		pr_err("Double listener cancellation!\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	dma_fence_signal(f.real);
+	free_fences(&f);
+	return err;
+}
+
+int dma_fence_proxy(void)
+{
+	static const struct subtest tests[] = {
+		SUBTEST(sanitycheck),
+		SUBTEST(wrap_signaling),
+		SUBTEST(wrap_signaling_recurse),
+		SUBTEST(wrap_add_callback),
+		SUBTEST(wrap_add_callback_recurse),
+		SUBTEST(wrap_late_add_callback),
+		SUBTEST(wrap_early_add_callback),
+		SUBTEST(wrap_early_add_callback_late),
+		SUBTEST(wrap_early_add_callback_early),
+		SUBTEST(wrap_rm_callback),
+		SUBTEST(wrap_late_rm_callback),
+		SUBTEST(wrap_status),
+		SUBTEST(wrap_error),
+		SUBTEST(wrap_wait),
+		SUBTEST(wrap_wait_timeout),
+		SUBTEST(wrap_listen_early),
+		SUBTEST(wrap_listen_late),
+		SUBTEST(wrap_listen_cancel),
+	};
+	int ret;
+
+	slab_fences = KMEM_CACHE(mock_fence,
+				 SLAB_TYPESAFE_BY_RCU |
+				 SLAB_HWCACHE_ALIGN);
+	if (!slab_fences)
+		return -ENOMEM;
+
+	ret = subtests(tests, NULL);
+
+	kmem_cache_destroy(slab_fences);
+
+	return ret;
+}
diff --git a/include/linux/dma-fence-proxy.h b/include/linux/dma-fence-proxy.h
new file mode 100644
index 000000000000..063cde6b42c4
--- /dev/null
+++ b/include/linux/dma-fence-proxy.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * dma-fence-proxy: allows waiting upon unset and future fences
+ *
+ * Copyright (C) 2017 Intel Corporation
+ */
+
+#ifndef __LINUX_DMA_FENCE_PROXY_H
+#define __LINUX_DMA_FENCE_PROXY_H
+
+#include <linux/kernel.h>
+#include <linux/dma-fence.h>
+
+struct wait_queue_entry;
+
+extern const struct dma_fence_ops dma_fence_proxy_ops;
+
+struct dma_fence *dma_fence_create_proxy(void);
+
+static inline bool dma_fence_is_proxy(struct dma_fence *fence)
+{
+	return fence->ops == &dma_fence_proxy_ops;
+}
+
+struct dma_fence *
+dma_fence_replace_proxy(struct dma_fence __rcu **slot,
+			struct dma_fence *fence);
+
+void dma_fence_add_proxy_listener(struct dma_fence *fence,
+				  struct wait_queue_entry *wait);
+bool dma_fence_remove_proxy_listener(struct dma_fence *fence,
+				     struct wait_queue_entry *wait);
+
+#endif /* __LINUX_DMA_FENCE_PROXY_H */
-- 
2.20.1

* [Intel-gfx] [PATCH 10/22] drm/syncobj: Allow use of dma-fence-proxy
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (7 preceding siblings ...)
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 09/22] dma-buf: Proxy fence, an unsignaled fence placeholder Chris Wilson
@ 2020-05-20  7:54 ` Chris Wilson
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 11/22] drm/i915/gem: Teach execbuf how to wait on future syncobj Chris Wilson
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Allow the callers to supply a dma-fence-proxy for asynchronous waiting on
future fences.
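
As an illustrative sketch (locking elided; dma_fence_replace_proxy() is
the primitive this patch wires into drm_syncobj_replace_fence()):

	/* a proxy can stand in for a fence that does not yet exist */
	proxy = dma_fence_create_proxy();
	RCU_INIT_POINTER(syncobj->fence, proxy);

	/* later: publish the real fence; any waiter attached to the
	 * proxy is forwarded onto it */
	old = dma_fence_replace_proxy(&syncobj->fence, real);
	dma_fence_put(old);	/* drop the reference to the old proxy */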

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/drm_syncobj.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 42d46414f767..e141db0e1eb6 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -184,6 +184,7 @@
  */
 
 #include <linux/anon_inodes.h>
+#include <linux/dma-fence-proxy.h>
 #include <linux/file.h>
 #include <linux/fs.h>
 #include <linux/sched/signal.h>
@@ -324,14 +325,9 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 	struct dma_fence *old_fence;
 	struct syncobj_wait_entry *cur, *tmp;
 
-	if (fence)
-		dma_fence_get(fence);
-
 	spin_lock(&syncobj->lock);
 
-	old_fence = rcu_dereference_protected(syncobj->fence,
-					      lockdep_is_held(&syncobj->lock));
-	rcu_assign_pointer(syncobj->fence, fence);
+	old_fence = dma_fence_replace_proxy(&syncobj->fence, fence);
 
 	if (fence != old_fence) {
 		list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node)
-- 
2.20.1

* [Intel-gfx] [PATCH 11/22] drm/i915/gem: Teach execbuf how to wait on future syncobj
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (8 preceding siblings ...)
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 10/22] drm/syncobj: Allow use of dma-fence-proxy Chris Wilson
@ 2020-05-20  7:54 ` Chris Wilson
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 12/22] drm/i915/gem: Allow combining submit-fences with syncobj Chris Wilson
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

If a syncobj has not yet been assigned, treat it as a future fence and
install and wait upon a dma-fence-proxy. The proxy will be replaced by
the real fence later, and that fence will be responsible for signaling
our waiter.
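
The one hazard is a userspace-constructed cycle; a standalone sketch
(hypothetical types) of the check added to i915_scheduler.c below —
before the waiter may wait upon the signaler, we verify that the
signaler does not already, transitively, wait upon the waiter:

	struct node {
		struct node **signalers;	/* nodes this one waits upon */
		unsigned int n_signalers;
		bool visited;
	};

	static bool reaches(struct node *from, const struct node *target)
	{
		unsigned int i;

		if (from == target)
			return true;
		if (from->visited)	/* subgraph already explored */
			return false;
		from->visited = true;

		for (i = 0; i < from->n_signalers; i++) {
			if (reaches(from->signalers[i], target))
				return true;
		}

		return false;
	}

	/* verify_dag(waiter, signaler) is then !reaches(signaler, waiter);
	 * the real code walks iteratively under the scheduler lock. */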

Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4854
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  21 ++-
 drivers/gpu/drm/i915/i915_request.c           | 153 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_scheduler.c         |  41 +++++
 drivers/gpu/drm/i915/i915_scheduler.h         |   3 +
 4 files changed, 216 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 219a36995b96..ce27da907dc3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -5,6 +5,7 @@
  */
 
 #include <linux/intel-iommu.h>
+#include <linux/dma-fence-proxy.h>
 #include <linux/dma-resv.h>
 #include <linux/sync_file.h>
 #include <linux/uaccess.h>
@@ -2523,8 +2524,24 @@ await_fence_array(struct i915_execbuffer *eb,
 			continue;
 
 		fence = drm_syncobj_fence_get(syncobj);
-		if (!fence)
-			return -EINVAL;
+		if (!fence) {
+			struct dma_fence *old;
+
+			fence = dma_fence_create_proxy();
+			if (!fence)
+				return -ENOMEM;
+
+			spin_lock(&syncobj->lock);
+			old = rcu_dereference_protected(syncobj->fence, true);
+			if (unlikely(old)) {
+				dma_fence_put(fence);
+				fence = dma_fence_get(old);
+			} else {
+				rcu_assign_pointer(syncobj->fence,
+						   dma_fence_get(fence));
+			}
+			spin_unlock(&syncobj->lock);
+		}
 
 		err = i915_request_await_dma_fence(eb->request, fence);
 		dma_fence_put(fence);
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 6f8b5794b73c..3f7d9e4bf89f 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -24,6 +24,7 @@
 
 #include <linux/dma-fence-array.h>
 #include <linux/dma-fence-chain.h>
+#include <linux/dma-fence-proxy.h>
 #include <linux/irq_work.h>
 #include <linux/prefetch.h>
 #include <linux/sched.h>
@@ -398,6 +399,7 @@ static bool fatal_error(int error)
 	case 0: /* not an error! */
 	case -EAGAIN: /* innocent victim of a GT reset (__i915_request_reset) */
 	case -ETIMEDOUT: /* waiting for Godot (timer_i915_sw_fence_wake) */
+	case -EDEADLK: /* cyclic fence lockup (await_proxy)  */
 		return false;
 	default:
 		return true;
@@ -1125,6 +1127,155 @@ i915_request_await_external(struct i915_request *rq, struct dma_fence *fence)
 	return err;
 }
 
+struct await_proxy {
+	struct wait_queue_entry base;
+	struct i915_request *request;
+	struct dma_fence *fence;
+	struct timer_list timer;
+	struct work_struct work;
+	int (*attach)(struct await_proxy *ap);
+	void *data;
+};
+
+static void await_proxy_work(struct work_struct *work)
+{
+	struct await_proxy *ap = container_of(work, typeof(*ap), work);
+	struct i915_request *rq = ap->request;
+
+	del_timer_sync(&ap->timer);
+
+	if (ap->fence) {
+		int err = 0;
+
+		/*
+		 * If the fence is external, we impose a 10s timeout.
+		 * However, if the fence is internal, we skip the timeout in
+		 * the belief that all fences are in-order (DAG, no cycles)
+		 * and that we can enforce forward progress by resetting the
+		 * GPU if necessary. A future fence, provided by userspace,
+		 * can trivially generate a cycle in the dependency graph,
+		 * and so cause that entire cycle to become deadlocked, with
+		 * no forward progress ever made and the driver kept
+		 * eternally awake.
+		 */
+		if (dma_fence_is_i915(ap->fence) &&
+		    !i915_sched_node_verify_dag(&rq->sched,
+						&to_request(ap->fence)->sched))
+			err = -EDEADLK;
+
+		if (!err) {
+			mutex_lock(&rq->context->timeline->mutex);
+			err = ap->attach(ap);
+			mutex_unlock(&rq->context->timeline->mutex);
+		}
+
+		/* Don't flag an error for co-dependent scheduling */
+		if (err == -EDEADLK) {
+			struct i915_sched_node *waiter =
+				&to_request(ap->fence)->sched;
+			struct i915_dependency *p;
+
+			list_for_each_entry_lockless(p,
+						     &rq->sched.waiters_list,
+						     wait_link) {
+				if (p->waiter == waiter &&
+				    p->flags & I915_DEPENDENCY_WEAK) {
+					err = 0;
+					break;
+				}
+			}
+		}
+
+		if (err < 0)
+			i915_sw_fence_set_error_once(&rq->submit, err);
+	}
+
+	i915_sw_fence_complete(&rq->submit);
+
+	dma_fence_put(ap->fence);
+	kfree(ap);
+}
+
+static int
+await_proxy_wake(struct wait_queue_entry *entry,
+		 unsigned int mode,
+		 int flags,
+		 void *fence)
+{
+	struct await_proxy *ap = container_of(entry, typeof(*ap), base);
+
+	ap->fence = dma_fence_get(fence);
+	schedule_work(&ap->work);
+
+	return 0;
+}
+
+static void
+await_proxy_timer(struct timer_list *t)
+{
+	struct await_proxy *ap = container_of(t, typeof(*ap), timer);
+
+	if (dma_fence_remove_proxy_listener(ap->base.private, &ap->base)) {
+		struct i915_request *rq = ap->request;
+
+		pr_notice("Asynchronous wait on unset proxy fence by %s:%s:%llx timed out\n",
+			  rq->fence.ops->get_driver_name(&rq->fence),
+			  rq->fence.ops->get_timeline_name(&rq->fence),
+			  rq->fence.seqno);
+		i915_sw_fence_set_error_once(&rq->submit, -ETIMEDOUT);
+
+		schedule_work(&ap->work);
+	}
+}
+
+static int
+__i915_request_await_proxy(struct i915_request *rq,
+			   struct dma_fence *fence,
+			   unsigned long timeout,
+			   int (*attach)(struct await_proxy *ap),
+			   void *data)
+{
+	struct await_proxy *ap;
+
+	ap = kzalloc(sizeof(*ap), I915_FENCE_GFP);
+	if (!ap)
+		return -ENOMEM;
+
+	i915_sw_fence_await(&rq->submit);
+	mark_external(rq);
+
+	ap->base.private = fence;
+	ap->base.func = await_proxy_wake;
+	ap->request = rq;
+	INIT_WORK(&ap->work, await_proxy_work);
+	ap->attach = attach;
+	ap->data = data;
+
+	timer_setup(&ap->timer, await_proxy_timer, 0);
+	if (timeout)
+		mod_timer(&ap->timer, round_jiffies_up(jiffies + timeout));
+
+	dma_fence_add_proxy_listener(fence, &ap->base);
+	return 0;
+}
+
+static int await_proxy(struct await_proxy *ap)
+{
+	return i915_request_await_dma_fence(ap->request, ap->fence);
+}
+
+static int
+i915_request_await_proxy(struct i915_request *rq, struct dma_fence *fence)
+{
+	/*
+	 * Wait until we know the real fence so that we can optimise the
+	 * inter-fence synchronisation.
+	 */
+	return __i915_request_await_proxy(rq, fence,
+					  i915_fence_timeout(rq->i915),
+					  await_proxy, NULL);
+}
+
 int
 i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
 {
@@ -1171,6 +1322,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
 
 		if (dma_fence_is_i915(fence))
 			ret = i915_request_await_request(rq, to_request(fence));
+		else if (dma_fence_is_proxy(fence))
+			ret = i915_request_await_proxy(rq, fence);
 		else
 			ret = i915_request_await_external(rq, fence);
 		if (ret < 0)
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index cbb880b10c65..250832768279 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -469,6 +469,47 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
 	return 0;
 }
 
+bool i915_sched_node_verify_dag(struct i915_sched_node *waiter,
+				struct i915_sched_node *signaler)
+{
+	struct i915_dependency *dep, *p;
+	struct i915_dependency stack;
+	bool result = false;
+	LIST_HEAD(dfs);
+
+	if (list_empty(&waiter->waiters_list))
+		return true;
+
+	spin_lock_irq(&schedule_lock);
+
+	stack.signaler = signaler;
+	list_add(&stack.dfs_link, &dfs);
+
+	list_for_each_entry(dep, &dfs, dfs_link) {
+		struct i915_sched_node *node = dep->signaler;
+
+		if (node_signaled(node))
+			continue;
+
+		list_for_each_entry(p, &node->signalers_list, signal_link) {
+			if (p->signaler == waiter)
+				goto out;
+
+			if (list_empty(&p->dfs_link))
+				list_add_tail(&p->dfs_link, &dfs);
+		}
+	}
+
+	result = true;
+out:
+	list_for_each_entry_safe(dep, p, &dfs, dfs_link)
+		INIT_LIST_HEAD(&dep->dfs_link);
+
+	spin_unlock_irq(&schedule_lock);
+
+	return result;
+}
+
 void i915_sched_node_fini(struct i915_sched_node *node)
 {
 	struct i915_dependency *dep, *tmp;
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 6f0bf00fc569..13432add8929 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -28,6 +28,9 @@
 void i915_sched_node_init(struct i915_sched_node *node);
 void i915_sched_node_reinit(struct i915_sched_node *node);
 
+bool i915_sched_node_verify_dag(struct i915_sched_node *waiter,
+				struct i915_sched_node *signal);
+
 bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
 				      struct i915_sched_node *signal,
 				      struct i915_dependency *dep,
-- 
2.20.1

* [Intel-gfx] [PATCH 12/22] drm/i915/gem: Allow combining submit-fences with syncobj
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (9 preceding siblings ...)
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 11/22] drm/i915/gem: Teach execbuf how to wait on future syncobj Chris Wilson
@ 2020-05-20  7:54 ` Chris Wilson
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 13/22] drm/i915/gt: Declare when we enabled timeslicing Chris Wilson
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

We allow exported sync_file fences to be used as submit fences, but they
are not the only source of user fences. We also accept an array of
syncobj, and as with sync_file these are dma_fences underneath and so
feature the same set of controls. The submit-fence allows a request to
be scheduled at the same time as the signaler, rather than after it
completes, as is the norm. Userspace can combine submit-fences with its
own semaphores for intra-batch scheduling.

Not exposing submit-fences to syncobj was at the time just a matter of
pragmatic expediency.
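
Hypothetical userspace usage of the new flag (buffer setup and error
handling elided): schedule a batch to begin alongside the fence in
`handle', rather than after it completes:

	struct drm_i915_gem_exec_fence fence = {
		.handle = handle,
		.flags = I915_EXEC_FENCE_WAIT | I915_EXEC_FENCE_WAIT_SUBMIT,
	};
	struct drm_i915_gem_execbuffer2 execbuf = {
		.buffers_ptr = (uintptr_t)objects,
		.buffer_count = nobjects,
		.cliprects_ptr = (uintptr_t)&fence,	/* the fence array */
		.num_cliprects = 1,
		.flags = I915_EXEC_FENCE_ARRAY,
	};

	ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf);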

Fixes: a88b6e4cbafd ("drm/i915: Allow specification of parallel execbuf")
Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4854
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 14 +++++++----
 drivers/gpu/drm/i915/i915_request.c           | 25 +++++++++++++++++++
 include/uapi/drm/i915_drm.h                   |  7 +++---
 3 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index ce27da907dc3..91e77348282a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2431,7 +2431,7 @@ static void
 __free_fence_array(struct drm_syncobj **fences, unsigned int n)
 {
 	while (n--)
-		drm_syncobj_put(ptr_mask_bits(fences[n], 2));
+		drm_syncobj_put(ptr_mask_bits(fences[n], 3));
 	kvfree(fences);
 }
 
@@ -2488,7 +2488,7 @@ get_fence_array(struct drm_i915_gem_execbuffer2 *args,
 		BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) &
 			     ~__I915_EXEC_FENCE_UNKNOWN_FLAGS);
 
-		fences[n] = ptr_pack_bits(syncobj, fence.flags, 2);
+		fences[n] = ptr_pack_bits(syncobj, fence.flags, 3);
 	}
 
 	return fences;
@@ -2519,7 +2519,7 @@ await_fence_array(struct i915_execbuffer *eb,
 		struct dma_fence *fence;
 		unsigned int flags;
 
-		syncobj = ptr_unpack_bits(fences[n], &flags, 2);
+		syncobj = ptr_unpack_bits(fences[n], &flags, 3);
 		if (!(flags & I915_EXEC_FENCE_WAIT))
 			continue;
 
@@ -2543,7 +2543,11 @@ await_fence_array(struct i915_execbuffer *eb,
 			spin_unlock(&syncobj->lock);
 		}
 
-		err = i915_request_await_dma_fence(eb->request, fence);
+		if (flags & I915_EXEC_FENCE_WAIT_SUBMIT)
+			err = i915_request_await_execution(eb->request, fence,
+							   eb->engine->bond_execute);
+		else
+			err = i915_request_await_dma_fence(eb->request, fence);
 		dma_fence_put(fence);
 		if (err < 0)
 			return err;
@@ -2564,7 +2568,7 @@ signal_fence_array(struct i915_execbuffer *eb,
 		struct drm_syncobj *syncobj;
 		unsigned int flags;
 
-		syncobj = ptr_unpack_bits(fences[n], &flags, 2);
+		syncobj = ptr_unpack_bits(fences[n], &flags, 3);
 		if (!(flags & I915_EXEC_FENCE_SIGNAL))
 			continue;
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 3f7d9e4bf89f..d71728edca57 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1425,6 +1425,27 @@ __i915_request_await_execution(struct i915_request *to,
 					     &from->fence);
 }
 
+static int execution_proxy(struct await_proxy *ap)
+{
+	return i915_request_await_execution(ap->request, ap->fence, ap->data);
+}
+
+static int
+i915_request_await_proxy_execution(struct i915_request *rq,
+				   struct dma_fence *fence,
+				   void (*hook)(struct i915_request *rq,
+						struct dma_fence *signal))
+{
+	/*
+	 * We have to wait until the real request is known in order to
+	 * be able to hook into its execution, as opposed to waiting for
+	 * its completion.
+	 */
+	return __i915_request_await_proxy(rq, fence,
+					  i915_fence_timeout(rq->i915),
+					  execution_proxy, hook);
+}
+
 int
 i915_request_await_execution(struct i915_request *rq,
 			     struct dma_fence *fence,
@@ -1464,6 +1485,10 @@ i915_request_await_execution(struct i915_request *rq,
 			ret = __i915_request_await_execution(rq,
 							     to_request(fence),
 							     hook);
+		else if (dma_fence_is_proxy(fence))
+			ret = i915_request_await_proxy_execution(rq,
+								 fence,
+								 hook);
 		else
 			ret = i915_request_await_external(rq, fence);
 		if (ret < 0)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 14b67cd6b54b..704dd0e3bc1d 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1040,9 +1040,10 @@ struct drm_i915_gem_exec_fence {
 	 */
 	__u32 handle;
 
-#define I915_EXEC_FENCE_WAIT            (1<<0)
-#define I915_EXEC_FENCE_SIGNAL          (1<<1)
-#define __I915_EXEC_FENCE_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SIGNAL << 1))
+#define I915_EXEC_FENCE_WAIT            (1u << 0)
+#define I915_EXEC_FENCE_SIGNAL          (1u << 1)
+#define I915_EXEC_FENCE_WAIT_SUBMIT     (1u << 2)
+#define __I915_EXEC_FENCE_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_WAIT_SUBMIT << 1))
 	__u32 flags;
 };
 
-- 
2.20.1

* [Intel-gfx] [PATCH 13/22] drm/i915/gt: Declare when we enabled timeslicing
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (10 preceding siblings ...)
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 12/22] drm/i915/gem: Allow combining submit-fences with syncobj Chris Wilson
@ 2020-05-20  7:54 ` Chris Wilson
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 14/22] drm/i915/gt: Use built-in active intel_context reference Chris Wilson
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kenneth Graunke, Chris Wilson

Let userspace know whether it can trust timeslicing by including it as
part of the I915_PARAM_HAS_SCHEDULER::I915_SCHEDULER_CAP_TIMESLICING bits.

v2: Only declare timeslicing if we can safely preempt userspace.
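
A hypothetical userspace probe for the new bit:

	int caps = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_HAS_SCHEDULER,
		.value = &caps,
	};

	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp) == 0 &&
	    (caps & I915_SCHEDULER_CAP_TIMESLICING)) {
		/* the kernel will timeslice between runnable contexts */
	}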

Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing")
Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802
Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4854
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_user.c | 1 +
 include/uapi/drm/i915_drm.h                 | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c
index 848decee9066..8415511f1465 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_user.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c
@@ -98,6 +98,7 @@ static void set_scheduler_caps(struct drm_i915_private *i915)
 		MAP(HAS_PREEMPTION, PREEMPTION),
 		MAP(HAS_SEMAPHORES, SEMAPHORES),
 		MAP(SUPPORTS_STATS, ENGINE_BUSY_STATS),
+		MAP(HAS_TIMESLICES, TIMESLICING),
 #undef MAP
 	};
 	struct intel_engine_cs *engine;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 704dd0e3bc1d..1ee227b5131a 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -523,6 +523,7 @@ typedef struct drm_i915_irq_wait {
 #define   I915_SCHEDULER_CAP_PREEMPTION	(1ul << 2)
 #define   I915_SCHEDULER_CAP_SEMAPHORES	(1ul << 3)
 #define   I915_SCHEDULER_CAP_ENGINE_BUSY_STATS	(1ul << 4)
+#define   I915_SCHEDULER_CAP_TIMESLICING	(1ul << 5)
 
 #define I915_PARAM_HUC_STATUS		 42
 
-- 
2.20.1

* [Intel-gfx] [PATCH 14/22] drm/i915/gt: Use built-in active intel_context reference
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (11 preceding siblings ...)
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 13/22] drm/i915/gt: Declare when we enabled timeslicing Chris Wilson
@ 2020-05-20  7:54 ` Chris Wilson
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 15/22] drm/i915: Drop I915_IDLE_ENGINES_TIMEOUT Chris Wilson
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Since a few rearrangements ago, we have had an explicit reference to the
containing intel_context from inside the active reference, and so we can
drop our own reference dance around releasing the i915_active.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 762a251d553b..20faffbc217e 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -155,15 +155,7 @@ void intel_context_unpin(struct intel_context *ce)
 	CE_TRACE(ce, "unpin\n");
 	ce->ops->unpin(ce);
 
-	/*
-	 * Once released, we may asynchronously drop the active reference.
-	 * As that may be the only reference keeping the context alive,
-	 * take an extra now so that it is not freed before we finish
-	 * dereferencing it.
-	 */
-	intel_context_get(ce);
 	intel_context_active_release(ce);
-	intel_context_put(ce);
 }
 
 static int __context_pin_state(struct i915_vma *vma)
-- 
2.20.1

* [Intel-gfx] [PATCH 15/22] drm/i915: Drop I915_IDLE_ENGINES_TIMEOUT
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (12 preceding siblings ...)
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 14/22] drm/i915/gt: Use built-in active intel_context reference Chris Wilson
@ 2020-05-20  7:54 ` Chris Wilson
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 16/22] drm/i915: Always defer fenced work to the worker Chris Wilson
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

This timeout is only used in one place, to provide a tiny bit of grace
for slow igts to clean up after themselves. If we are a bit stricter and
opt to kill outstanding requests rather than wait, we can speed up igt
by not waiting for 200ms after a hang.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 11 ++++++-----
 drivers/gpu/drm/i915/i915_drv.h     |  2 --
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index bca036ac6621..c0bd26ef4772 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1462,12 +1462,13 @@ gt_drop_caches(struct intel_gt *gt, u64 val)
 {
 	int ret;
 
-	if (val & DROP_RESET_ACTIVE &&
-	    wait_for(intel_engines_are_idle(gt), I915_IDLE_ENGINES_TIMEOUT))
-		intel_gt_set_wedged(gt);
+	if (val & (DROP_RETIRE | DROP_RESET_ACTIVE))
+		intel_gt_wait_for_idle(gt, 1);
 
-	if (val & DROP_RETIRE)
-		intel_gt_retire_requests(gt);
+	if (val & DROP_RESET_ACTIVE && intel_gt_pm_get_if_awake(gt)) {
+		intel_gt_set_wedged(gt);
+		intel_gt_pm_put(gt);
+	}
 
 	if (val & (DROP_IDLE | DROP_ACTIVE)) {
 		ret = intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 10383e01efde..39f3934d0ecf 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -615,8 +615,6 @@ struct i915_gem_mm {
 	u32 shrink_count;
 };
 
-#define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */
-
 unsigned long i915_fence_context_timeout(const struct drm_i915_private *i915,
 					 u64 context);
 
-- 
2.20.1

* [Intel-gfx] [PATCH 16/22] drm/i915: Always defer fenced work to the worker
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (13 preceding siblings ...)
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 15/22] drm/i915: Drop I915_IDLE_ENGINES_TIMEOUT Chris Wilson
@ 2020-05-20  7:54 ` Chris Wilson
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 17/22] drm/i915/gem: Assign context id for async work Chris Wilson
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Currently, if an error is raised, we always call the cleanup locally
[and skip the main work callback]. However, some future users may need
to take a mutex to clean up, and so we cannot immediately execute the
cleanup as we may still be in interrupt context.

With the execute-immediate flag, for most cases this should result in
immediate cleanup of an error.
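
For instance, a future user might pair the work with a release callback
along these lines (hypothetical ops and types), which must not be run
directly from the signaling context:

	struct my_work {
		struct dma_fence_work base;
		struct i915_address_space *vm;
	};

	static void my_release(struct dma_fence_work *f)
	{
		struct my_work *w = container_of(f, typeof(*w), base);

		mutex_lock(&w->vm->mutex);	/* sleeps: illegal in irq context */
		discard_reservation(w);		/* hypothetical cleanup */
		mutex_unlock(&w->vm->mutex);
	}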

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_sw_fence_work.c | 25 +++++++++++------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c b/drivers/gpu/drm/i915/i915_sw_fence_work.c
index a3a81bb8f2c3..29f63ebc24e8 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence_work.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c
@@ -16,11 +16,14 @@ static void fence_complete(struct dma_fence_work *f)
 static void fence_work(struct work_struct *work)
 {
 	struct dma_fence_work *f = container_of(work, typeof(*f), work);
-	int err;
 
-	err = f->ops->work(f);
-	if (err)
-		dma_fence_set_error(&f->dma, err);
+	if (!f->dma.error) {
+		int err;
+
+		err = f->ops->work(f);
+		if (err)
+			dma_fence_set_error(&f->dma, err);
+	}
 
 	fence_complete(f);
 	dma_fence_put(&f->dma);
@@ -36,15 +39,11 @@ fence_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 		if (fence->error)
 			dma_fence_set_error(&f->dma, fence->error);
 
-		if (!f->dma.error) {
-			dma_fence_get(&f->dma);
-			if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
-				fence_work(&f->work);
-			else
-				queue_work(system_unbound_wq, &f->work);
-		} else {
-			fence_complete(f);
-		}
+		dma_fence_get(&f->dma);
+		if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
+			fence_work(&f->work);
+		else
+			queue_work(system_unbound_wq, &f->work);
 		break;
 
 	case FENCE_FREE:
-- 
2.20.1

* [Intel-gfx] [PATCH 17/22] drm/i915/gem: Assign context id for async work
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (14 preceding siblings ...)
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 16/22] drm/i915: Always defer fenced work to the worker Chris Wilson
@ 2020-05-20  7:54 ` Chris Wilson
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 18/22] drm/i915: Export a preallocate variant of i915_active_acquire() Chris Wilson
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Allocate a few dma-fence context ids that we can use to associate with
async work [for the CPU] launched on behalf of this context. For extra
fun, we allow a configurable concurrency width.

A current example would be that we spawn an unbound worker for every
userptr get_pages. In the future, we wish to charge this work to the
context that initiated the async work and to impose concurrency limits
based on the context.
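
A standalone illustration (plain C) of the id cycling implemented below;
the width is rounded down to a power of two so that a mask replaces a
modulo. With 8 ids starting at, say, base 100:

	#include <stdio.h>

	int main(void)
	{
		unsigned long long base = 100; /* as if dma_fence_context_alloc(8) */
		unsigned int cur = 0, mask = 8 - 1;
		int i;

		for (i = 0; i < 10; i++)
			printf("%llu ", base + (cur++ & mask));
		/* prints: 100 101 102 103 104 105 106 107 100 101 */
		return 0;
	}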

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c       | 4 ++++
 drivers/gpu/drm/i915/gem/i915_gem_context.h       | 6 ++++++
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 6 ++++++
 3 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 900ea8b7fc8f..fd7d064a0e46 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -715,6 +715,10 @@ __create_context(struct drm_i915_private *i915)
 	ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_NORMAL);
 	mutex_init(&ctx->mutex);
 
+	ctx->async.width = rounddown_pow_of_two(num_online_cpus());
+	ctx->async.context = dma_fence_context_alloc(ctx->async.width);
+	ctx->async.width--;
+
 	spin_lock_init(&ctx->stale.lock);
 	INIT_LIST_HEAD(&ctx->stale.engines);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index 3702b2fb27ab..e104ff0ae740 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -134,6 +134,12 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
 				       struct drm_file *file);
 
+static inline u64 i915_gem_context_async_id(struct i915_gem_context *ctx)
+{
+	return (ctx->async.context +
+		(atomic_fetch_inc(&ctx->async.cur) & ctx->async.width));
+}
+
 static inline struct i915_gem_context *
 i915_gem_context_get(struct i915_gem_context *ctx)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 28760bd03265..5f5cfa3a3e9b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -85,6 +85,12 @@ struct i915_gem_context {
 
 	struct intel_timeline *timeline;
 
+	struct {
+		u64 context;
+		atomic_t cur;
+		unsigned int width;
+	} async;
+
 	/**
 	 * @vm: unique address space (GTT)
 	 *
-- 
2.20.1

* [Intel-gfx] [PATCH 18/22] drm/i915: Export a preallocate variant of i915_active_acquire()
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (15 preceding siblings ...)
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 17/22] drm/i915/gem: Assign context id for async work Chris Wilson
@ 2020-05-20  7:54 ` Chris Wilson
  2020-05-20  7:55 ` [Intel-gfx] [PATCH 19/22] drm/i915/gem: Separate the ww_mutex walker into its own list Chris Wilson
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2020-05-20  7:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Sometimes we have to be very careful not to allocate underneath a mutex
(or spinlock) and yet still want to track activity. Enter
i915_active_acquire_for_context(). This raises the activity counter on
i915_active prior to use and ensures that the fence-tree contains a slot
for the context.
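
A rough usage sketch (names and locking illustrative, error handling
trimmed): the tree slot is created up front, so publishing the fence
later never allocates:

	/* outside any critical section: may allocate the tree node */
	err = i915_active_acquire_for_context(&obj->active, idx);
	if (err)
		return err;

	spin_lock(&lock);	/* no allocation permitted from here */
	prev = __i915_active_ref(&obj->active, idx, fence);
	spin_unlock(&lock);

	if (!IS_ERR(prev))
		dma_fence_put(prev);	/* NULL-safe */
	i915_active_release(&obj->active);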

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_active.c | 107 ++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_active.h |   5 ++
 2 files changed, 103 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index d960d0be5bd2..71ad0d452680 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -217,11 +217,10 @@ excl_retire(struct dma_fence *fence, struct dma_fence_cb *cb)
 }
 
 static struct i915_active_fence *
-active_instance(struct i915_active *ref, struct intel_timeline *tl)
+active_instance(struct i915_active *ref, u64 idx)
 {
 	struct active_node *node, *prealloc;
 	struct rb_node **p, *parent;
-	u64 idx = tl->fence_context;
 
 	/*
 	 * We track the most recently used timeline to skip a rbtree search
@@ -367,7 +366,7 @@ int i915_active_ref(struct i915_active *ref,
 	if (err)
 		return err;
 
-	active = active_instance(ref, tl);
+	active = active_instance(ref, tl->fence_context);
 	if (!active) {
 		err = -ENOMEM;
 		goto out;
@@ -384,32 +383,104 @@ int i915_active_ref(struct i915_active *ref,
 		atomic_dec(&ref->count);
 	}
 	if (!__i915_active_fence_set(active, fence))
-		atomic_inc(&ref->count);
+		__i915_active_acquire(ref);
 
 out:
 	i915_active_release(ref);
 	return err;
 }
 
-struct dma_fence *
-i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f)
+static struct dma_fence *
+__i915_active_set_fence(struct i915_active *ref,
+			struct i915_active_fence *active,
+			struct dma_fence *fence)
 {
 	struct dma_fence *prev;
 
 	/* We expect the caller to manage the exclusive timeline ordering */
 	GEM_BUG_ON(i915_active_is_idle(ref));
 
+	if (is_barrier(active)) { /* proto-node used by our idle barrier */
+		/*
+		 * This request is on the kernel_context timeline, and so
+		 * we can use it to substitute for the pending idle-barrer
+		 * we can use it to substitute for the pending idle-barrier
+		 */
+		__active_del_barrier(ref, node_from_active(active));
+		RCU_INIT_POINTER(active->fence, NULL);
+		atomic_dec(&ref->count);
+	}
+
 	rcu_read_lock();
-	prev = __i915_active_fence_set(&ref->excl, f);
+	prev = __i915_active_fence_set(active, fence);
 	if (prev)
 		prev = dma_fence_get_rcu(prev);
 	else
-		atomic_inc(&ref->count);
+		__i915_active_acquire(ref);
 	rcu_read_unlock();
 
 	return prev;
 }
 
+static struct i915_active_fence *
+__active_lookup(struct i915_active *ref, u64 idx)
+{
+	struct active_node *node;
+	struct rb_node *p;
+
+	/* Like active_instance() but with no malloc */
+
+	node = READ_ONCE(ref->cache);
+	if (node && node->timeline == idx)
+		return &node->base;
+
+	spin_lock_irq(&ref->tree_lock);
+	GEM_BUG_ON(i915_active_is_idle(ref));
+
+	p = ref->tree.rb_node;
+	while (p) {
+		node = rb_entry(p, struct active_node, node);
+		if (node->timeline == idx) {
+			ref->cache = node;
+			spin_unlock_irq(&ref->tree_lock);
+			return &node->base;
+		}
+
+		if (node->timeline < idx)
+			p = p->rb_right;
+		else
+			p = p->rb_left;
+	}
+
+	spin_unlock_irq(&ref->tree_lock);
+
+	return NULL;
+}
+
+struct dma_fence *
+__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence)
+{
+	struct dma_fence *prev = ERR_PTR(-ENOENT);
+	struct i915_active_fence *active;
+
+	if (!i915_active_acquire_if_busy(ref))
+		return ERR_PTR(-EINVAL);
+
+	active = __active_lookup(ref, idx);
+	if (active)
+		prev = __i915_active_set_fence(ref, active, fence);
+
+	i915_active_release(ref);
+	return prev;
+}
+
+struct dma_fence *
+i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f)
+{
+	/* We expect the caller to manage the exclusive timeline ordering */
+	return __i915_active_set_fence(ref, &ref->excl, f);
+}
+
 bool i915_active_acquire_if_busy(struct i915_active *ref)
 {
 	debug_active_assert(ref);
@@ -443,6 +514,24 @@ int i915_active_acquire(struct i915_active *ref)
 	return err;
 }
 
+int i915_active_acquire_for_context(struct i915_active *ref, u64 idx)
+{
+	struct i915_active_fence *active;
+	int err;
+
+	err = i915_active_acquire(ref);
+	if (err)
+		return err;
+
+	active = active_instance(ref, idx);
+	if (!active) {
+		i915_active_release(ref);
+		return -ENOMEM;
+	}
+
+	return 0; /* return with active ref */
+}
+
 void i915_active_release(struct i915_active *ref)
 {
 	debug_active_assert(ref);
@@ -804,7 +893,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
 			 */
 			RCU_INIT_POINTER(node->base.fence, ERR_PTR(-EAGAIN));
 			node->base.cb.node.prev = (void *)engine;
-			atomic_inc(&ref->count);
+			__i915_active_acquire(ref);
 		}
 		GEM_BUG_ON(rcu_access_pointer(node->base.fence) != ERR_PTR(-EAGAIN));
 
diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h
index cf4058150966..042502abefe5 100644
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -163,6 +163,9 @@ void __i915_active_init(struct i915_active *ref,
 	__i915_active_init(ref, active, retire, &__mkey, &__wkey);	\
 } while (0)
 
+struct dma_fence *
+__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence);
+
 int i915_active_ref(struct i915_active *ref,
 		    struct intel_timeline *tl,
 		    struct dma_fence *fence);
@@ -198,7 +201,9 @@ int i915_request_await_active(struct i915_request *rq,
 #define I915_ACTIVE_AWAIT_BARRIER BIT(2)
 
 int i915_active_acquire(struct i915_active *ref);
+int i915_active_acquire_for_context(struct i915_active *ref, u64 idx);
 bool i915_active_acquire_if_busy(struct i915_active *ref);
+
 void i915_active_release(struct i915_active *ref);
 
 static inline void __i915_active_acquire(struct i915_active *ref)
-- 
2.20.1

* [Intel-gfx] [PATCH 19/22] drm/i915/gem: Separate the ww_mutex walker into its own list
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (16 preceding siblings ...)
  2020-05-20  7:54 ` [Intel-gfx] [PATCH 18/22] drm/i915: Export a preallocate variant of i915_active_acquire() Chris Wilson
@ 2020-05-20  7:55 ` Chris Wilson
  2020-05-20  7:55 ` [Intel-gfx] [PATCH 20/22] drm/i915/gem: Asynchronous GTT unbinding Chris Wilson
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Chris Wilson @ 2020-05-20  7:55 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

In preparation for making eb_vma bigger and heavier to run in parallel,
we need to stop applying an in-place swap() to reorder around ww_mutex
deadlocks. Keep the array intact and reorder the locks using a dedicated
list.
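
The backoff is driven by a new reverse iterator (see i915_utils.h in the
diff); a rough open-coded equivalent, for reference — starting just
before pos, visit every earlier entry in reverse, tolerating the current
entry being requeued:

	pos = list_prev_entry(pos, member);
	while (&pos->member != head) {
		n = list_prev_entry(pos, member); /* fetch before pos moves */
		/* ... e.g. ww_mutex_unlock() and list_move_tail() ... */
		pos = n;
	}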

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 48 +++++++++++--------
 drivers/gpu/drm/i915/i915_utils.h             |  6 +++
 2 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 91e77348282a..653e32f79066 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -36,6 +36,7 @@ struct eb_vma {
 	struct drm_i915_gem_exec_object2 *exec;
 	struct list_head bind_link;
 	struct list_head reloc_link;
+	struct list_head lock_link;
 
 	struct hlist_node node;
 	u32 handle;
@@ -254,6 +255,8 @@ struct i915_execbuffer {
 	/** list of vma that have execobj.relocation_count */
 	struct list_head relocs;
 
+	struct list_head lock;
+
 	/**
 	 * Track the most recently used object for relocations, as we
 	 * frequently have to perform multiple relocations within the same
@@ -399,6 +402,10 @@ static int eb_create(struct i915_execbuffer *eb)
 		eb->lut_size = -eb->buffer_count;
 	}
 
+	INIT_LIST_HEAD(&eb->relocs);
+	INIT_LIST_HEAD(&eb->unbound);
+	INIT_LIST_HEAD(&eb->lock);
+
 	return 0;
 }
 
@@ -604,6 +611,8 @@ eb_add_vma(struct i915_execbuffer *eb,
 		eb_unreserve_vma(ev);
 		list_add_tail(&ev->bind_link, &eb->unbound);
 	}
+
+	list_add_tail(&ev->lock_link, &eb->lock);
 }
 
 static inline int use_cpu_reloc(const struct reloc_cache *cache,
@@ -880,9 +889,6 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
 	unsigned int i;
 	int err = 0;
 
-	INIT_LIST_HEAD(&eb->relocs);
-	INIT_LIST_HEAD(&eb->unbound);
-
 	for (i = 0; i < eb->buffer_count; i++) {
 		struct i915_vma *vma;
 
@@ -1789,38 +1795,39 @@ static int eb_relocate(struct i915_execbuffer *eb)
 
 static int eb_move_to_gpu(struct i915_execbuffer *eb)
 {
-	const unsigned int count = eb->buffer_count;
 	struct ww_acquire_ctx acquire;
-	unsigned int i;
+	struct eb_vma *ev;
 	int err = 0;
 
 	ww_acquire_init(&acquire, &reservation_ww_class);
 
-	for (i = 0; i < count; i++) {
-		struct eb_vma *ev = &eb->vma[i];
+	list_for_each_entry(ev, &eb->lock, lock_link) {
 		struct i915_vma *vma = ev->vma;
 
 		err = ww_mutex_lock_interruptible(&vma->resv->lock, &acquire);
 		if (err == -EDEADLK) {
-			GEM_BUG_ON(i == 0);
-			do {
-				int j = i - 1;
+			struct eb_vma *unlock = ev, *en;
 
-				ww_mutex_unlock(&eb->vma[j].vma->resv->lock);
-
-				swap(eb->vma[i],  eb->vma[j]);
-			} while (--i);
+			list_for_each_entry_safe_continue_reverse(unlock, en, &eb->lock, lock_link) {
+				ww_mutex_unlock(&unlock->vma->resv->lock);
+				list_move_tail(&unlock->lock_link, &eb->lock);
+			}
 
+			GEM_BUG_ON(!list_is_first(&ev->lock_link, &eb->lock));
 			err = ww_mutex_lock_slow_interruptible(&vma->resv->lock,
 							       &acquire);
 		}
-		if (err)
-			break;
+		if (err) {
+			list_for_each_entry_continue_reverse(ev, &eb->lock, lock_link)
+				ww_mutex_unlock(&ev->vma->resv->lock);
+
+			ww_acquire_fini(&acquire);
+			goto err_skip;
+		}
 	}
 	ww_acquire_done(&acquire);
 
-	while (i--) {
-		struct eb_vma *ev = &eb->vma[i];
+	list_for_each_entry(ev, &eb->lock, lock_link) {
 		struct i915_vma *vma = ev->vma;
 		unsigned int flags = ev->flags;
 		struct drm_i915_gem_object *obj = vma->obj;
@@ -2121,9 +2128,10 @@ static int eb_parse(struct i915_execbuffer *eb)
 	if (err)
 		goto err_trampoline;
 
-	eb->vma[eb->buffer_count].vma = i915_vma_get(shadow);
-	eb->vma[eb->buffer_count].flags = __EXEC_OBJECT_HAS_PIN;
 	eb->batch = &eb->vma[eb->buffer_count++];
+	eb->batch->vma = i915_vma_get(shadow);
+	eb->batch->flags = __EXEC_OBJECT_HAS_PIN;
+	list_add_tail(&eb->batch->lock_link, &eb->lock);
 	eb->vma[eb->buffer_count].vma = NULL;
 
 	eb->trampoline = trampoline;
diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
index 03a73d2bd50d..28813806bc19 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -266,6 +266,12 @@ static inline int list_is_last_rcu(const struct list_head *list,
 	return READ_ONCE(list->next) == head;
 }
 
+#define list_for_each_entry_safe_continue_reverse(pos, n, head, member)	\
+	for (pos = list_prev_entry(pos, member),			\
+		n = list_prev_entry(pos, member);			\
+	     &pos->member != (head);					\
+	     pos = n, n = list_prev_entry(n, member))
+
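+/*
+ * Usage sketch (illustrative only; undo() is a placeholder): walk
+ * backwards over the entries preceding @pos, safely even if the
+ * current entry is removed, e.g. to unwind a partially acquired
+ * set of locks:
+ *
+ *	list_for_each_entry_safe_continue_reverse(pos, n, &head, link)
+ *		undo(pos);
+ */
+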
 /*
  * Wait until the work is finally complete, even if it tries to postpone
  * by requeueing itself. Note, that if the worker never cancels itself,
-- 
2.20.1

* [Intel-gfx] [PATCH 20/22] drm/i915/gem: Asynchronous GTT unbinding
From: Chris Wilson @ 2020-05-20  7:55 UTC
  To: intel-gfx; +Cc: Matthew Auld, Chris Wilson

It is reasonably common for userspace (even modern drivers like iris) to
reuse an active address for a new buffer. This would cause the
application to stall under its mutex (originally struct_mutex) until the
old batches were idle and it could synchronously remove the stale PTE.
However, we can queue up a job that waits on the signals for the old
nodes to complete and, upon those signals, removes the old nodes,
replacing them with the new ones for the batch. This is still CPU
driven, but in theory we can do the GTT patching from the GPU. The job
itself has a completion signal, allowing the execbuf to wait upon the
rebinding, and also other observers to coordinate with the common VM
activity.

Letting userspace queue up more work lets it get more done without
blocking other clients. In turn, we take care not to let it queue up
too much concurrent work, creating a small number of queues for each
context to limit the number of concurrent tasks.

The implementation relies on only scheduling one unbind operation per
vma as we use the unbound vma->node location to track the stale PTE.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1402
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
---
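For reference, the rebind job below is built on the driver's
dma_fence_work helper; a rough sketch of its lifecycle (error handling
elided, the "rebind" names are illustrative placeholders):

    static int rebind_work_cb(struct dma_fence_work *base)
    {
            /* runs only once every awaited fence has signaled */
            return 0; /* an error here is reported via the fence */
    }

    static const struct dma_fence_work_ops rebind_ops = {
            .name = "rebind",
            .work = rebind_work_cb,
    };

    dma_fence_work_init(&work->base, &rebind_ops);
    /* order the job after the old nodes' outstanding activity */
    i915_sw_fence_await_dma_fence(&work->base.chain, old_fence, 0,
                                  GFP_NOWAIT | __GFP_NOWARN);
    dma_fence_work_commit(&work->base);
    /* execbuf and other observers may now wait on &work->base.dma */
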
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 809 ++++++++++++++++--
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          |   1 +
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |   3 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c           |   4 +
 drivers/gpu/drm/i915/gt/intel_gtt.h           |   2 +
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |   3 +-
 drivers/gpu/drm/i915/i915_gem.c               |   7 +
 drivers/gpu/drm/i915/i915_gem_gtt.c           |   5 +
 drivers/gpu/drm/i915/i915_vma.c               | 130 +--
 drivers/gpu/drm/i915/i915_vma.h               |   5 +
 10 files changed, 847 insertions(+), 122 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 653e32f79066..8fadf2dfb4bc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -19,6 +19,7 @@
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_buffer_pool.h"
 #include "gt/intel_gt_pm.h"
+#include "gt/intel_gt_requests.h"
 #include "gt/intel_ring.h"
 
 #include "i915_drv.h"
@@ -32,6 +33,9 @@ struct eb_vma {
 	struct i915_vma *vma;
 	unsigned int flags;
 
+	struct drm_mm_node hole;
+	unsigned int bind_flags;
+
 	/** This vma's place in the execbuf reservation list */
 	struct drm_i915_gem_exec_object2 *exec;
 	struct list_head bind_link;
@@ -58,7 +62,8 @@ enum {
 #define __EXEC_OBJECT_HAS_FENCE		BIT(30)
 #define __EXEC_OBJECT_NEEDS_MAP		BIT(29)
 #define __EXEC_OBJECT_NEEDS_BIAS	BIT(28)
-#define __EXEC_OBJECT_INTERNAL_FLAGS	(~0u << 28) /* all of the above */
+#define __EXEC_OBJECT_HAS_PAGES		BIT(27)
+#define __EXEC_OBJECT_INTERNAL_FLAGS	(~0u << 27) /* all of the above */
 
 #define __EXEC_HAS_RELOC	BIT(31)
 #define __EXEC_INTERNAL_FLAGS	(~0u << 31)
@@ -72,11 +77,12 @@ enum {
 	 I915_EXEC_RESOURCE_STREAMER)
 
 /* Catch emission of unexpected errors for CI! */
+#define __EINVAL__ 22
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
 #undef EINVAL
 #define EINVAL ({ \
 	DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
-	22; \
+	__EINVAL__; \
 })
 #endif
 
@@ -317,6 +323,12 @@ static struct eb_vma_array *eb_vma_array_create(unsigned int count)
 	return arr;
 }
 
+static struct eb_vma_array *eb_vma_array_get(struct eb_vma_array *arr)
+{
+	kref_get(&arr->kref);
+	return arr;
+}
+
 static inline void eb_unreserve_vma(struct eb_vma *ev)
 {
 	struct i915_vma *vma = ev->vma;
@@ -327,8 +339,12 @@ static inline void eb_unreserve_vma(struct eb_vma *ev)
 	if (ev->flags & __EXEC_OBJECT_HAS_PIN)
 		__i915_vma_unpin(vma);
 
+	if (ev->flags & __EXEC_OBJECT_HAS_PAGES)
+		i915_vma_put_pages(vma);
+
 	ev->flags &= ~(__EXEC_OBJECT_HAS_PIN |
-		       __EXEC_OBJECT_HAS_FENCE);
+		       __EXEC_OBJECT_HAS_FENCE |
+		       __EXEC_OBJECT_HAS_PAGES);
 }
 
 static void eb_vma_array_destroy(struct kref *kref)
@@ -414,7 +430,7 @@ eb_vma_misplaced(const struct drm_i915_gem_exec_object2 *entry,
 		 const struct i915_vma *vma,
 		 unsigned int flags)
 {
-	if (vma->node.size < entry->pad_to_size)
+	if (vma->node.size < max(vma->size, entry->pad_to_size))
 		return true;
 
 	if (entry->alignment && !IS_ALIGNED(vma->node.start, entry->alignment))
@@ -487,12 +503,18 @@ eb_pin_vma(struct i915_execbuffer *eb,
 		if (entry->flags & EXEC_OBJECT_PINNED)
 			return false;
 
+		/* Concurrent async binds in progress, get in the queue */
+		if (!i915_active_is_idle(&vma->vm->active))
+			return false;
+
 		/* Failing that pick any _free_ space if suitable */
 		if (unlikely(i915_vma_pin(vma,
 					  entry->pad_to_size,
 					  entry->alignment,
 					  eb_pin_flags(entry, ev->flags) |
-					  PIN_USER | PIN_NOEVICT)))
+					  PIN_USER |
+					  PIN_NOEVICT |
+					  PIN_NOSEARCH)))
 			return false;
 	}
 
@@ -632,85 +654,711 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache,
 		obj->cache_level != I915_CACHE_NONE);
 }
 
-static int eb_reserve_vma(const struct i915_execbuffer *eb,
-			  struct eb_vma *ev,
-			  u64 pin_flags)
+struct eb_vm_work {
+	struct dma_fence_work base;
+	struct list_head unbound;
+	struct eb_vma_array *array;
+	struct i915_address_space *vm;
+	struct list_head evict_list;
+	u64 *p_flags;
+	u64 active;
+};
+
+static inline u64 node_end(const struct drm_mm_node *node)
+{
+	return node->start + node->size;
+}
+
+static int set_bind_fence(struct i915_vma *vma, struct eb_vm_work *work)
+{
+	struct dma_fence *prev;
+	int err = 0;
+
+	lockdep_assert_held(&vma->vm->mutex);
+	prev = i915_active_set_exclusive(&vma->active, &work->base.dma);
+	if (unlikely(prev)) {
+		err = i915_sw_fence_await_dma_fence(&work->base.chain, prev, 0,
+						    GFP_NOWAIT | __GFP_NOWARN);
+		dma_fence_put(prev);
+	}
+
+	return err < 0 ? err : 0;
+}
+
+static int await_evict(struct eb_vm_work *work, struct i915_vma *vma)
+{
+	int err;
+
+	if (rcu_access_pointer(vma->active.excl.fence) == &work->base.dma)
+		return 0;
+
+	/* Wait for all other previous activity */
+	err = i915_sw_fence_await_active(&work->base.chain,
+					 &vma->active,
+					 I915_ACTIVE_AWAIT_ACTIVE);
+	/* Then insert along the exclusive vm->mutex timeline */
+	if (err == 0)
+		err = set_bind_fence(vma, work);
+
+	return err;
+}
+
+static int
+evict_for_node(struct eb_vm_work *work,
+	       struct eb_vma *const target,
+	       unsigned int flags)
+{
+	struct i915_address_space *vm = target->vma->vm;
+	const unsigned long color = target->vma->node.color;
+	const u64 start = target->vma->node.start;
+	const u64 end = start + target->vma->node.size;
+	u64 hole_start = start, hole_end = end;
+	struct i915_vma *vma, *next;
+	struct drm_mm_node *node;
+	LIST_HEAD(evict_list);
+	LIST_HEAD(steal_list);
+	int err = 0;
+
+	lockdep_assert_held(&vm->mutex);
+	GEM_BUG_ON(drm_mm_node_allocated(&target->vma->node));
+	GEM_BUG_ON(!IS_ALIGNED(start, I915_GTT_PAGE_SIZE));
+	GEM_BUG_ON(!IS_ALIGNED(end, I915_GTT_PAGE_SIZE));
+
+	if (i915_vm_has_cache_coloring(vm)) {
+		/* Expand search to cover neighbouring guard pages (or lack!) */
+		if (hole_start)
+			hole_start -= I915_GTT_PAGE_SIZE;
+
+		/* Always look at the page afterwards to avoid the end-of-GTT */
+		hole_end += I915_GTT_PAGE_SIZE;
+	}
+	GEM_BUG_ON(hole_start >= hole_end);
+
+	drm_mm_for_each_node_in_range(node, &vm->mm, hole_start, hole_end) {
+		GEM_BUG_ON(node == &target->vma->node);
+
+		/* If we find any non-objects (!vma), we cannot evict them */
+		if (node->color == I915_COLOR_UNEVICTABLE) {
+			err = -ENOSPC;
+			goto err;
+		}
+
+		/*
+		 * If we are using coloring to insert guard pages between
+		 * different cache domains within the address space, we have
+		 * to check whether the objects on either side of our range
+		 * abut and conflict. If they are in conflict, then we evict
+		 * those as well to make room for our guard pages.
+		 */
+		if (i915_vm_has_cache_coloring(vm)) {
+			if (node_end(node) == start && node->color == color)
+				continue;
+
+			if (node->start == end && node->color == color)
+				continue;
+		}
+
+		GEM_BUG_ON(!drm_mm_node_allocated(node));
+		vma = container_of(node, typeof(*vma), node);
+
+		if (i915_vma_is_pinned(vma)) {
+			err = -ENOSPC;
+			goto err;
+		}
+
+		/* If this VMA is already being freed, or idle, steal it! */
+		if (!i915_active_acquire_if_busy(&vma->active)) {
+			list_move(&vma->vm_link, &steal_list);
+			continue;
+		}
+
+		if (flags & PIN_NONBLOCK)
+			err = -EAGAIN;
+		else
+			err = await_evict(work, vma);
+		i915_active_release(&vma->active);
+		if (err)
+			goto err;
+
+		GEM_BUG_ON(!i915_vma_is_active(vma));
+		list_move(&vma->vm_link, &evict_list);
+	}
+
+	list_for_each_entry_safe(vma, next, &steal_list, vm_link) {
+		atomic_and(~I915_VMA_BIND_MASK, &vma->flags);
+		__i915_vma_evict(vma);
+		drm_mm_remove_node(&vma->node);
+		/* No ref held; vma may now be concurrently freed */
+	}
+
+	/* No overlapping nodes to evict, claim the slot for ourselves! */
+	if (list_empty(&evict_list))
+		return drm_mm_reserve_node(&vm->mm, &target->vma->node);
+
+	/*
+	 * Mark this range as reserved.
+	 *
+	 * We have not yet removed the PTEs for the old evicted nodes, so
+	 * must prevent this range from being reused for anything else. The
+	 * PTE will be cleared when the range is idle (during the rebind
+	 * phase in the worker).
+	 */
+	target->hole.color = I915_COLOR_UNEVICTABLE;
+	target->hole.start = start;
+	target->hole.size = end;
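+	/* NB: ->size temporarily holds the hole's end, fixed up below */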
+
+	list_for_each_entry(vma, &evict_list, vm_link) {
+		target->hole.start =
+			min(target->hole.start, vma->node.start);
+		target->hole.size =
+			max(target->hole.size, node_end(&vma->node));
+
+		GEM_BUG_ON(vma->node.mm != &vm->mm);
+		drm_mm_remove_node(&vma->node);
+		atomic_and(~I915_VMA_BIND_MASK, &vma->flags);
+		GEM_BUG_ON(i915_vma_is_pinned(vma));
+	}
+	list_splice(&evict_list, &work->evict_list);
+
+	target->hole.size -= target->hole.start;
+
+	return drm_mm_reserve_node(&vm->mm, &target->hole);
+
+err:
+	list_splice(&evict_list, &vm->bound_list);
+	list_splice(&steal_list, &vm->bound_list);
+	return err;
+}
+
+static int
+evict_in_range(struct eb_vm_work *work,
+	       struct eb_vma * const target,
+	       u64 start, u64 end, u64 align)
+{
+	struct i915_address_space *vm = target->vma->vm;
+	struct i915_vma *active = NULL;
+	struct i915_vma *vma, *next;
+	struct drm_mm_scan scan;
+	LIST_HEAD(evict_list);
+	bool found = false;
+
+	lockdep_assert_held(&vm->mutex);
+
+	drm_mm_scan_init_with_range(&scan, &vm->mm,
+				    target->vma->node.size,
+				    align,
+				    target->vma->node.color,
+				    start, end,
+				    DRM_MM_INSERT_BEST);
+
+	list_for_each_entry_safe(vma, next, &vm->bound_list, vm_link) {
+		if (i915_vma_is_pinned(vma))
+			continue;
+
+		if (vma == active)
+			active = ERR_PTR(-EAGAIN);
+
+		/* Prefer to reuse idle nodes; push all active vma to the end */
+		if (active != ERR_PTR(-EAGAIN) && i915_vma_is_active(vma)) {
+			if (!active)
+				active = vma;
+
+			list_move_tail(&vma->vm_link, &vm->bound_list);
+			continue;
+		}
+
+		list_move(&vma->vm_link, &evict_list);
+		if (drm_mm_scan_add_block(&scan, &vma->node)) {
+			target->vma->node.start =
+				round_up(scan.hit_start, align);
+			found = true;
+			break;
+		}
+	}
+
+	list_for_each_entry(vma, &evict_list, vm_link)
+		drm_mm_scan_remove_block(&scan, &vma->node);
+	list_splice(&evict_list, &vm->bound_list);
+	if (!found)
+		return -ENOSPC;
+
+	return evict_for_node(work, target, 0);
+}
+
+static u64 random_offset(u64 start, u64 end, u64 len, u64 align)
+{
+	u64 range, addr;
+
+	GEM_BUG_ON(range_overflows(start, len, end));
+	GEM_BUG_ON(round_up(start, align) > round_down(end - len, align));
+
+	range = round_down(end - len, align) - round_up(start, align);
+	if (range) {
+		if (sizeof(unsigned long) == sizeof(u64)) {
+			addr = get_random_long();
+		} else {
+			addr = get_random_int();
+			if (range > U32_MAX) {
+				addr <<= 32;
+				addr |= get_random_int();
+			}
+		}
+		div64_u64_rem(addr, range, &addr);
+		start += addr;
+	}
+
+	return round_up(start, align);
+}
+
+static u64 align0(u64 align)
+{
+	return align <= I915_GTT_MIN_ALIGNMENT ? 0 : align;
+}
+
+static struct drm_mm_node *__best_hole(struct drm_mm *mm, u64 size)
+{
+	struct rb_node *rb = mm->holes_size.rb_root.rb_node;
+	struct drm_mm_node *best = NULL;
+
+	do {
+		struct drm_mm_node *node =
+			rb_entry(rb, struct drm_mm_node, rb_hole_size);
+
+		if (size <= node->hole_size) {
+			best = node;
+			rb = rb->rb_right;
+		} else {
+			rb = rb->rb_left;
+		}
+	} while (rb);
+
+	return best;
+}
+
+static int best_hole(struct drm_mm *mm, struct drm_mm_node *node,
+		     u64 start, u64 end, u64 align)
+{
+	struct drm_mm_node *hole;
+	u64 size = node->size;
+
+	do {
+		hole = __best_hole(mm, size);
+		if (!hole)
+			return -ENOSPC;
+
+		node->start = round_up(max(start, drm_mm_hole_node_start(hole)),
+				       align);
+		if (min(drm_mm_hole_node_end(hole), end) >=
+		    node->start + node->size)
+			return drm_mm_reserve_node(mm, node);
+
+		/*
+		 * Too expensive to search for every single hole every time,
+		 * so just look for the next bigger hole, introducing enough
+		 * space for alignments. Finding the smallest hole with ideal
+		 * alignment scales very poorly, so we choose to waste space
+		 * if an alignment is forced. On the other hand, simply
+		 * randomly selecting an offset in 48b space will cause us
+		 * to use the majority of that space and exhaust all memory
+		 * in storing the page directories. Compromise is required.
+		 */
+		size = hole->hole_size + align;
+	} while (1);
+}
+
+static int eb_reserve_vma(struct eb_vm_work *work, struct eb_vma *ev)
 {
 	struct drm_i915_gem_exec_object2 *entry = ev->exec;
+	const unsigned int exec_flags = ev->flags;
 	struct i915_vma *vma = ev->vma;
+	struct i915_address_space *vm = vma->vm;
+	u64 start = 0, end = vm->total;
+	u64 align = entry->alignment ?: I915_GTT_MIN_ALIGNMENT;
+	unsigned int bind_flags;
 	int err;
 
-	if (drm_mm_node_allocated(&vma->node) &&
-	    eb_vma_misplaced(entry, vma, ev->flags)) {
-		err = i915_vma_unbind(vma);
-		if (err)
-			return err;
+	lockdep_assert_held(&vm->mutex);
+
+	bind_flags = PIN_USER;
+	if (exec_flags & EXEC_OBJECT_NEEDS_GTT)
+		bind_flags |= PIN_GLOBAL;
+
+	if (drm_mm_node_allocated(&vma->node))
+		goto pin;
+
+	GEM_BUG_ON(i915_vma_is_pinned(vma));
+	GEM_BUG_ON(i915_vma_is_bound(vma, I915_VMA_BIND_MASK));
+	GEM_BUG_ON(i915_active_fence_isset(&vma->active.excl));
+	GEM_BUG_ON(!vma->size);
+
+	/* Reuse old address (if it doesn't conflict with new requirements) */
+	if (eb_vma_misplaced(entry, vma, exec_flags)) {
+		vma->node.start = entry->offset & PIN_OFFSET_MASK;
+		vma->node.size = max(entry->pad_to_size, vma->size);
+		vma->node.color = 0;
+		if (i915_vm_has_cache_coloring(vm))
+			vma->node.color = vma->obj->cache_level;
 	}
 
-	err = i915_vma_pin(vma,
-			   entry->pad_to_size, entry->alignment,
-			   eb_pin_flags(entry, ev->flags) | pin_flags);
-	if (err)
-		return err;
+	/*
+	 * Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
+	 * limit address to the first 4GBs for unflagged objects.
+	 */
+	if (!(exec_flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS))
+		end = min_t(u64, end, (1ULL << 32) - I915_GTT_PAGE_SIZE);
 
-	if (entry->offset != vma->node.start) {
-		entry->offset = vma->node.start | UPDATE;
-		eb->args->flags |= __EXEC_HAS_RELOC;
+	align = max(align, vma->display_alignment);
+	if (exec_flags & __EXEC_OBJECT_NEEDS_MAP) {
+		vma->node.size = max_t(u64, vma->node.size, vma->fence_size);
+		end = min_t(u64, end, i915_vm_to_ggtt(vm)->mappable_end);
+		align = max_t(u64, align, vma->fence_alignment);
 	}
 
-	if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_FENCE)) {
-		err = i915_vma_pin_fence(vma);
-		if (unlikely(err)) {
-			i915_vma_unpin(vma);
+	if (exec_flags & __EXEC_OBJECT_NEEDS_BIAS)
+		start = BATCH_OFFSET_BIAS;
+
+	GEM_BUG_ON(!vma->node.size);
+	if (vma->node.size > end - start)
+		return -E2BIG;
+
+	/* Try the user's preferred location first (mandatory if soft-pinned) */
+	err = -__EINVAL__;
+	if (vma->node.start >= start &&
+	    IS_ALIGNED(vma->node.start, align) &&
+	    !range_overflows(vma->node.start, vma->node.size, end)) {
+		unsigned int pin_flags;
+
+		if (drm_mm_reserve_node(&vm->mm, &vma->node) == 0)
+			goto pin;
+
+		pin_flags = 0;
+		if (!(exec_flags & EXEC_OBJECT_PINNED))
+			pin_flags = PIN_NONBLOCK;
+
+		err = evict_for_node(work, ev, pin_flags);
+		if (err == 0)
+			goto pin;
+	}
+	if (exec_flags & EXEC_OBJECT_PINNED)
+		return err;
+
+	/* Try the first available free space */
+	if (!best_hole(&vm->mm, &vma->node, start, end, align))
+		goto pin;
+
+	/* Pick a random slot and see if it's available [O(N) worst case] */
+	vma->node.start = random_offset(start, end, vma->node.size, align);
+	if (evict_for_node(work, ev, 0) == 0)
+		goto pin;
+
+	/* Otherwise search all free space [degrades to O(N^2)] */
+	if (drm_mm_insert_node_in_range(&vm->mm, &vma->node,
+					vma->node.size,
+					align0(align),
+					vma->node.color,
+					start, end,
+					DRM_MM_INSERT_BEST) == 0)
+		goto pin;
+
+	/* Pretty busy! Loop over "LRU" and evict oldest in our search range */
+	err = evict_in_range(work, ev, start, end, align);
+	if (unlikely(err))
+		return err;
+
+pin:
+	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
+		err = __i915_vma_pin_fence(vma); /* XXX no waiting */
+		if (unlikely(err))
 			return err;
-		}
 
 		if (vma->fence)
 			ev->flags |= __EXEC_OBJECT_HAS_FENCE;
 	}
 
+	bind_flags &= ~atomic_read(&vma->flags);
+	if (bind_flags) {
+		err = set_bind_fence(vma, work);
+		if (unlikely(err))
+			return err;
+
+		atomic_add(I915_VMA_PAGES_ACTIVE, &vma->pages_count);
+		atomic_or(bind_flags, &vma->flags);
+
+		if (i915_vma_is_ggtt(vma))
+			__i915_vma_set_map_and_fenceable(vma);
+
+		GEM_BUG_ON(!i915_vma_is_active(vma));
+		list_move_tail(&vma->vm_link, &vm->bound_list);
+		ev->bind_flags = bind_flags;
+	}
+	__i915_vma_pin(vma); /* and release */
+
+	GEM_BUG_ON(!bind_flags && !drm_mm_node_allocated(&vma->node));
+	GEM_BUG_ON(!(drm_mm_node_allocated(&vma->node) ^
+		     drm_mm_node_allocated(&ev->hole)));
+
+	if (entry->offset != vma->node.start) {
+		entry->offset = vma->node.start | UPDATE;
+		*work->p_flags |= __EXEC_HAS_RELOC;
+	}
+
 	ev->flags |= __EXEC_OBJECT_HAS_PIN;
 	GEM_BUG_ON(eb_vma_misplaced(entry, vma, ev->flags));
 
 	return 0;
 }
 
-static int eb_reserve(struct i915_execbuffer *eb)
+static int __eb_bind_vma(struct eb_vm_work *work, int err)
 {
-	const unsigned int count = eb->buffer_count;
-	unsigned int pin_flags = PIN_USER | PIN_NONBLOCK;
-	struct list_head last;
+	struct i915_address_space *vm = work->vm;
 	struct eb_vma *ev;
-	unsigned int i, pass;
-	int err = 0;
+
+	GEM_BUG_ON(!intel_gt_pm_is_awake(vm->gt));
 
 	/*
-	 * Attempt to pin all of the buffers into the GTT.
-	 * This is done in 3 phases:
-	 *
-	 * 1a. Unbind all objects that do not match the GTT constraints for
-	 *     the execbuffer (fenceable, mappable, alignment etc).
-	 * 1b. Increment pin count for already bound objects.
-	 * 2.  Bind new objects.
-	 * 3.  Decrement pin count.
-	 *
-	 * This avoid unnecessary unbinding of later objects in order to make
-	 * room for the earlier objects *unless* we need to defragment.
+	 * We have to wait until the stale nodes are completely idle before
+	 * we can remove their PTE and unbind their pages. Hence, after
+	 * claiming their slot in the drm_mm, we defer their removal to
+	 * after the fences are signaled.
 	 */
+	if (!list_empty(&work->evict_list)) {
+		struct i915_vma *vma, *vn;
+
+		mutex_lock(&vm->mutex);
+		list_for_each_entry_safe(vma, vn, &work->evict_list, vm_link) {
+			GEM_BUG_ON(vma->vm != vm);
+			__i915_vma_evict(vma);
+			GEM_BUG_ON(!i915_vma_is_active(vma));
+		}
+		mutex_unlock(&vm->mutex);
+	}
+
+	/*
+	 * Now we know the nodes we require in drm_mm are idle, we can
+	 * replace the PTE in those ranges with our own.
+	 */
+	list_for_each_entry(ev, &work->unbound, bind_link) {
+		struct i915_vma *vma = ev->vma;
+
+		if (!ev->bind_flags)
+			goto put;
 
-	if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex))
-		return -EINTR;
+		GEM_BUG_ON(vma->vm != vm);
+		GEM_BUG_ON(!i915_vma_is_active(vma));
+
+		if (err == 0)
+			err = vma->ops->bind_vma(vma,
+						 vma->obj->cache_level,
+						 ev->bind_flags |
+						 I915_VMA_ALLOC);
+		if (err)
+			atomic_and(~ev->bind_flags, &vma->flags);
+
+		if (drm_mm_node_allocated(&ev->hole)) {
+			mutex_lock(&vm->mutex);
+			GEM_BUG_ON(ev->hole.mm != &vm->mm);
+			GEM_BUG_ON(ev->hole.color != I915_COLOR_UNEVICTABLE);
+			GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
+			drm_mm_remove_node(&ev->hole);
+			if (!err) {
+				drm_mm_reserve_node(&vm->mm, &vma->node);
+				GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
+			} else {
+				list_del_init(&vma->vm_link);
+			}
+			mutex_unlock(&vm->mutex);
+		}
+		ev->bind_flags = 0;
+
+put:
+		GEM_BUG_ON(drm_mm_node_allocated(&ev->hole));
+	}
+	INIT_LIST_HEAD(&work->unbound);
+
+	return err;
+}
+
+static int eb_bind_vma(struct dma_fence_work *base)
+{
+	struct eb_vm_work *work = container_of(base, typeof(*work), base);
+
+	return __eb_bind_vma(work, 0);
+}
+
+static void eb_vma_work_release(struct dma_fence_work *base)
+{
+	struct eb_vm_work *work = container_of(base, typeof(*work), base);
+
+	if (work->active) {
+		if (!list_empty(&work->unbound)) {
+			GEM_BUG_ON(!work->base.dma.error);
+			__eb_bind_vma(work, work->base.dma.error);
+		}
+		i915_active_release(&work->vm->active);
+	}
+
+	eb_vma_array_put(work->array);
+}
+
+static const struct dma_fence_work_ops eb_bind_ops = {
+	.name = "eb_bind",
+	.work = eb_bind_vma,
+	.release = eb_vma_work_release,
+};
+
+static struct eb_vm_work *eb_vm_work(struct i915_execbuffer *eb)
+{
+	struct eb_vm_work *work;
+
+	work = kzalloc(sizeof(*work), GFP_KERNEL);
+	if (!work)
+		return NULL;
+
+	dma_fence_work_init(&work->base, &eb_bind_ops);
+	list_replace_init(&eb->unbound, &work->unbound);
+	work->array = eb_vma_array_get(eb->array);
+	work->p_flags = &eb->args->flags;
+	work->vm = eb->context->vm;
+
+	/* Preallocate our slot in vm->active, outside of vm->mutex */
+	work->active = i915_gem_context_async_id(eb->gem_context);
+	if (i915_active_acquire_for_context(&work->vm->active, work->active)) {
+		work->active = 0;
+		work->base.dma.error = -ENOMEM;
+		dma_fence_work_commit(&work->base);
+		return NULL;
+	}
+
+	INIT_LIST_HEAD(&work->evict_list);
+
+	GEM_BUG_ON(list_empty(&work->unbound));
+	GEM_BUG_ON(!list_empty(&eb->unbound));
+
+	return work;
+}
+
+static int eb_vm_throttle(struct eb_vm_work *work)
+{
+	struct dma_fence *p;
+	int err;
+
+	/* Keep async work queued per context */
+	p = __i915_active_ref(&work->vm->active, work->active, &work->base.dma);
+	if (IS_ERR_OR_NULL(p))
+		return PTR_ERR_OR_ZERO(p);
+
+	err = i915_sw_fence_await_dma_fence(&work->base.chain, p, 0,
+					    GFP_NOWAIT | __GFP_NOWARN);
+	dma_fence_put(p);
+
+	return err < 0 ? err : 0;
+}
+
+static int eb_prepare_vma(struct eb_vma *ev)
+{
+	struct i915_vma *vma = ev->vma;
+	int err;
+
+	ev->hole.flags = 0;
+	ev->bind_flags = 0;
+
+	if (!(ev->flags & __EXEC_OBJECT_HAS_PAGES)) {
+		err = i915_vma_get_pages(vma);
+		if (err)
+			return err;
+
+		ev->flags |= __EXEC_OBJECT_HAS_PAGES;
+	}
+
+	return 0;
+}
+
+static int eb_reserve(struct i915_execbuffer *eb)
+{
+	const unsigned int count = eb->buffer_count;
+	struct i915_address_space *vm = eb->context->vm;
+	struct list_head last;
+	unsigned int i, pass;
+	struct eb_vma *ev;
+	int err = 0;
 
 	pass = 0;
 	do {
+		struct eb_vm_work *work;
+
 		list_for_each_entry(ev, &eb->unbound, bind_link) {
-			err = eb_reserve_vma(eb, ev, pin_flags);
+			err = eb_prepare_vma(ev);
+			switch (err) {
+			case 0:
+				break;
+			case -EAGAIN:
+				goto retry;
+			default:
+				return err;
+			}
+		}
+
+		work = eb_vm_work(eb);
+		if (!work)
+			return -ENOMEM;
+
+		/* No allocations allowed beyond this point */
+		if (mutex_lock_interruptible(&vm->mutex)) {
+			work->base.dma.error = -EINTR;
+			dma_fence_work_commit(&work->base);
+			return -EINTR;
+		}
+
+		err = eb_vm_throttle(work);
+		if (err) {
+			mutex_unlock(&vm->mutex);
+			work->base.dma.error = err;
+			dma_fence_work_commit(&work->base);
+			return err;
+		}
+
+		list_for_each_entry(ev, &work->unbound, bind_link) {
+			struct i915_vma *vma = ev->vma;
+
+			/*
+			 * Check if this node is being evicted or must be.
+			 *
+			 * As we use the single node inside the vma to track
+			 * both the eviction and where to insert the new node,
+			 * we cannot handle migrating the vma inside the worker.
+			 */
+			if (drm_mm_node_allocated(&vma->node)) {
+				if (eb_vma_misplaced(ev->exec, vma, ev->flags)) {
+					err = -ENOSPC;
+					break;
+				}
+			} else {
+				if (i915_vma_is_active(vma)) {
+					err = -ENOSPC;
+					break;
+				}
+			}
+
+			err = i915_active_acquire(&vma->active);
+			if (!err) {
+				err = eb_reserve_vma(work, ev);
+				i915_active_release(&vma->active);
+			}
 			if (err)
 				break;
 		}
-		if (!(err == -ENOSPC || err == -EAGAIN))
-			break;
 
+		mutex_unlock(&vm->mutex);
+
+		dma_fence_get(&work->base.dma);
+		dma_fence_work_commit_imm(&work->base);
+		if (err == -ENOSPC && dma_fence_wait(&work->base.dma, true))
+			err = -EINTR;
+		dma_fence_put(&work->base.dma);
+		if (err != -ENOSPC)
+			return err;
+
+retry:
 		/* Resort *all* the objects into priority order */
 		INIT_LIST_HEAD(&eb->unbound);
 		INIT_LIST_HEAD(&last);
@@ -739,37 +1387,52 @@ static int eb_reserve(struct i915_execbuffer *eb)
 		}
 		list_splice_tail(&last, &eb->unbound);
 
+		if (signal_pending(current))
+			return -EINTR;
+
 		if (err == -EAGAIN) {
-			mutex_unlock(&eb->i915->drm.struct_mutex);
 			flush_workqueue(eb->i915->mm.userptr_wq);
-			mutex_lock(&eb->i915->drm.struct_mutex);
 			continue;
 		}
 
-		switch (pass++) {
-		case 0:
-			break;
+		/* Now safe to wait with no reservations held */
+		list_for_each_entry(ev, &eb->unbound, bind_link) {
+			struct i915_vma *vma = ev->vma;
 
-		case 1:
-			/* Too fragmented, unbind everything and retry */
-			mutex_lock(&eb->context->vm->mutex);
-			err = i915_gem_evict_vm(eb->context->vm);
-			mutex_unlock(&eb->context->vm->mutex);
+			GEM_BUG_ON(ev->flags & __EXEC_OBJECT_HAS_PIN);
+
+			if (drm_mm_node_allocated(&vma->node) &&
+			    eb_vma_misplaced(ev->exec, vma, ev->flags)) {
+				err = i915_vma_unbind(vma);
+				if (err)
+					return err;
+			}
+
+			/* Wait for previous to avoid reusing vma->node */
+			err = i915_vma_wait_for_unbind(vma);
 			if (err)
-				goto unlock;
-			break;
+				return err;
+		}
 
+		switch (pass++) {
 		default:
-			err = -ENOSPC;
-			goto unlock;
-		}
+			return -ENOSPC;
 
-		pin_flags = PIN_USER;
-	} while (1);
+		case 2:
+			if (intel_gt_wait_for_idle(vm->gt,
+						   MAX_SCHEDULE_TIMEOUT))
+				return -EINTR;
 
-unlock:
-	mutex_unlock(&eb->i915->drm.struct_mutex);
-	return err;
+			fallthrough;
+		case 1:
+			if (i915_active_wait(&vm->active))
+				return -EINTR;
+
+			fallthrough;
+		case 0:
+			break;
+		}
+	} while (1);
 }
 
 static unsigned int eb_batch_index(const struct i915_execbuffer *eb)
@@ -1235,6 +1898,8 @@ static void *reloc_vaddr(struct drm_i915_gem_object *obj,
 {
 	void *vaddr;
 
+	GEM_BUG_ON(page >= obj->base.size >> PAGE_SHIFT);
+
 	if (cache->page == page) {
 		vaddr = unmask_page(cache->vaddr);
 	} else {
@@ -1244,6 +1909,7 @@ static void *reloc_vaddr(struct drm_i915_gem_object *obj,
 		if (!vaddr)
 			vaddr = reloc_kmap(obj, cache, page);
 	}
+	GEM_BUG_ON(!vaddr);
 
 	return vaddr;
 }
@@ -1596,6 +2262,8 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 	if (unlikely(!target))
 		return -ENOENT;
 
+	GEM_BUG_ON(!i915_vma_is_pinned(target->vma));
+
 	/* Validate that the target is in a valid r/w GPU domain */
 	if (unlikely(reloc->write_domain & (reloc->write_domain - 1))) {
 		drm_dbg(&i915->drm, "reloc with multiple write domains: "
@@ -1871,7 +2539,6 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 			err = i915_vma_move_to_active(vma, eb->request, flags);
 
 		i915_vma_unlock(vma);
-		eb_unreserve_vma(ev);
 	}
 	ww_acquire_fini(&acquire);
 
@@ -2749,7 +3416,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	 * snb/ivb/vlv conflate the "batch in ppgtt" bit with the "non-secure
 	 * batch" bit. Hence we need to pin secure batches into the global gtt.
 	 * hsw should have this fixed, but bdw mucks it up again. */
-	batch = eb.batch->vma;
+	batch = i915_vma_get(eb.batch->vma);
 	if (eb.batch_flags & I915_DISPATCH_SECURE) {
 		struct i915_vma *vma;
 
@@ -2769,6 +3436,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 			goto err_parse;
 		}
 
+		GEM_BUG_ON(vma->obj != batch->obj);
 		batch = vma;
 	}
 
@@ -2847,6 +3515,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_parse:
 	if (batch->private)
 		intel_gt_buffer_pool_put(batch->private);
+	i915_vma_put(batch);
 err_vma:
 	if (eb.trampoline)
 		i915_vma_unpin(eb.trampoline);
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index f4fec7eb4064..2c5ac598ade2 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -370,6 +370,7 @@ static struct i915_vma *pd_vma_create(struct gen6_ppgtt *ppgtt, int size)
 	atomic_set(&vma->flags, I915_VMA_GGTT);
 	vma->ggtt_view.type = I915_GGTT_VIEW_ROTATED; /* prevent fencing */
 
+	INIT_LIST_HEAD(&vma->vm_link);
 	INIT_LIST_HEAD(&vma->obj_link);
 	INIT_LIST_HEAD(&vma->closed_link);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 66165b10256e..dfc979a24450 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -568,7 +568,8 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
 	if (flags & I915_VMA_LOCAL_BIND) {
 		struct i915_ppgtt *alias = i915_vm_to_ggtt(vma->vm)->alias;
 
-		if (flags & I915_VMA_ALLOC) {
+		if (flags & I915_VMA_ALLOC &&
+		    !test_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma))) {
 			ret = alias->vm.allocate_va_range(&alias->vm,
 							  vma->node.start,
 							  vma->size);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 2a72cce63fd9..82d4f943c346 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -194,6 +194,8 @@ void __i915_vm_close(struct i915_address_space *vm)
 
 void i915_address_space_fini(struct i915_address_space *vm)
 {
+	i915_active_fini(&vm->active);
+
 	spin_lock(&vm->free_pages.lock);
 	if (pagevec_count(&vm->free_pages.pvec))
 		vm_free_pages_release(vm, true);
@@ -246,6 +248,8 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	drm_mm_init(&vm->mm, 0, vm->total);
 	vm->mm.head_node.color = I915_COLOR_UNEVICTABLE;
 
+	i915_active_init(&vm->active, NULL, NULL);
+
 	stash_init(&vm->free_pages);
 
 	INIT_LIST_HEAD(&vm->bound_list);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index d93ebdf3fa0e..773fc76dfa1b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -263,6 +263,8 @@ struct i915_address_space {
 	 */
 	struct list_head bound_list;
 
+	struct i915_active active;
+
 	struct pagestash free_pages;
 
 	/* Global GTT */
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index f86f7e68ce5e..ecdd58f4b993 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -162,7 +162,8 @@ static int ppgtt_bind_vma(struct i915_vma *vma,
 	u32 pte_flags;
 	int err;
 
-	if (flags & I915_VMA_ALLOC) {
+	if (flags & I915_VMA_ALLOC &&
+	    !test_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma))) {
 		err = vma->vm->allocate_va_range(vma->vm,
 						 vma->node.start, vma->size);
 		if (err)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0cbcb9f54e7d..6effa85532c6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -984,6 +984,9 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
 		return vma;
 
 	if (i915_vma_misplaced(vma, size, alignment, flags)) {
+		if (flags & PIN_NOEVICT)
+			return ERR_PTR(-ENOSPC);
+
 		if (flags & PIN_NONBLOCK) {
 			if (i915_vma_is_pinned(vma) || i915_vma_is_active(vma))
 				return ERR_PTR(-ENOSPC);
@@ -998,6 +1001,10 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
 			return ERR_PTR(ret);
 	}
 
+	if (flags & PIN_NONBLOCK &&
+	    i915_active_fence_isset(&vma->active.excl))
+		return ERR_PTR(-EAGAIN);
+
 	ret = i915_vma_pin(vma, size, alignment, flags | PIN_GLOBAL);
 	if (ret)
 		return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index cb43381b0d37..7e1225874b03 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -219,6 +219,8 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 		mode = DRM_MM_INSERT_HIGHEST;
 	if (flags & PIN_MAPPABLE)
 		mode = DRM_MM_INSERT_LOW;
+	if (flags & PIN_NOSEARCH)
+		mode |= DRM_MM_INSERT_ONCE;
 
 	/* We only allocate in PAGE_SIZE/GTT_PAGE_SIZE (4096) chunks,
 	 * so we know that we always have a minimum alignment of 4096.
@@ -236,6 +238,9 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 	if (err != -ENOSPC)
 		return err;
 
+	if (flags & PIN_NOSEARCH)
+		return -ENOSPC;
+
 	if (mode & DRM_MM_INSERT_ONCE) {
 		err = drm_mm_insert_node_in_range(&vm->mm, node,
 						  size, alignment, color,
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index fc14ebf9a0b7..9da9ef99a8bb 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -132,6 +132,7 @@ vma_create(struct drm_i915_gem_object *obj,
 		fs_reclaim_release(GFP_KERNEL);
 	}
 
+	INIT_LIST_HEAD(&vma->vm_link);
 	INIT_LIST_HEAD(&vma->closed_link);
 
 	if (view && view->type != I915_GGTT_VIEW_NORMAL) {
@@ -342,25 +343,37 @@ struct i915_vma_work *i915_vma_work(void)
 	return vw;
 }
 
-int i915_vma_wait_for_bind(struct i915_vma *vma)
+static int
+__i915_vma_wait_excl(struct i915_vma *vma, bool bound, unsigned int flags)
 {
+	struct dma_fence *fence;
 	int err = 0;
 
-	if (rcu_access_pointer(vma->active.excl.fence)) {
-		struct dma_fence *fence;
+	fence = i915_active_fence_get(&vma->active.excl);
+	if (!fence)
+		return 0;
 
-		rcu_read_lock();
-		fence = dma_fence_get_rcu_safe(&vma->active.excl.fence);
-		rcu_read_unlock();
-		if (fence) {
-			err = dma_fence_wait(fence, MAX_SCHEDULE_TIMEOUT);
-			dma_fence_put(fence);
-		}
+	if (drm_mm_node_allocated(&vma->node) == bound) {
+		if (flags & PIN_NOEVICT)
+			err = -EBUSY;
+		else
+			err = dma_fence_wait(fence, true);
 	}
 
+	dma_fence_put(fence);
 	return err;
 }
 
+int i915_vma_wait_for_bind(struct i915_vma *vma)
+{
+	return __i915_vma_wait_excl(vma, true, 0);
+}
+
+int i915_vma_wait_for_unbind(struct i915_vma *vma)
+{
+	return __i915_vma_wait_excl(vma, false, 0);
+}
+
 /**
  * i915_vma_bind - Sets up PTEs for a VMA in its corresponding address space.
  * @vma: VMA to map
@@ -630,8 +643,9 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	u64 start, end;
 	int ret;
 
-	GEM_BUG_ON(i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND));
+	GEM_BUG_ON(i915_vma_is_bound(vma, I915_VMA_BIND_MASK));
 	GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
+	GEM_BUG_ON(i915_active_fence_isset(&vma->active.excl));
 
 	size = max(size, vma->size);
 	alignment = max(alignment, vma->display_alignment);
@@ -727,7 +741,7 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
 	GEM_BUG_ON(!i915_gem_valid_gtt_space(vma, color));
 
-	list_add_tail(&vma->vm_link, &vma->vm->bound_list);
+	list_move_tail(&vma->vm_link, &vma->vm->bound_list);
 
 	return 0;
 }
@@ -735,15 +749,12 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 static void
 i915_vma_detach(struct i915_vma *vma)
 {
-	GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
-	GEM_BUG_ON(i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND));
-
 	/*
 	 * And finally now the object is completely decoupled from this
 	 * vma, we can drop its hold on the backing storage and allow
 	 * it to be reaped by the shrinker.
 	 */
-	list_del(&vma->vm_link);
+	list_del_init(&vma->vm_link);
 }
 
 static bool try_qad_pin(struct i915_vma *vma, unsigned int flags)
@@ -789,7 +800,7 @@ static bool try_qad_pin(struct i915_vma *vma, unsigned int flags)
 	return pinned;
 }
 
-static int vma_get_pages(struct i915_vma *vma)
+int i915_vma_get_pages(struct i915_vma *vma)
 {
 	int err = 0;
 
@@ -836,7 +847,7 @@ static void __vma_put_pages(struct i915_vma *vma, unsigned int count)
 	mutex_unlock(&vma->pages_mutex);
 }
 
-static void vma_put_pages(struct i915_vma *vma)
+void i915_vma_put_pages(struct i915_vma *vma)
 {
 	if (atomic_add_unless(&vma->pages_count, -1, 1))
 		return;
@@ -853,9 +864,13 @@ static void vma_unbind_pages(struct i915_vma *vma)
 	/* The upper portion of pages_count is the number of bindings */
 	count = atomic_read(&vma->pages_count);
 	count >>= I915_VMA_PAGES_BIAS;
-	GEM_BUG_ON(!count);
+	if (count)
+		__vma_put_pages(vma, count | count << I915_VMA_PAGES_BIAS);
+}
 
-	__vma_put_pages(vma, count | count << I915_VMA_PAGES_BIAS);
+static int __wait_for_unbind(struct i915_vma *vma, unsigned int flags)
+{
+	return __i915_vma_wait_excl(vma, false, flags);
 }
 
 int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
@@ -875,10 +890,14 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	if (try_qad_pin(vma, flags & I915_VMA_BIND_MASK))
 		return 0;
 
-	err = vma_get_pages(vma);
+	err = i915_vma_get_pages(vma);
 	if (err)
 		return err;
 
+	err = __wait_for_unbind(vma, flags);
+	if (err)
+		goto err_pages;
+
 	if (flags & vma->vm->bind_async_flags) {
 		work = i915_vma_work();
 		if (!work) {
@@ -940,6 +959,10 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 		goto err_unlock;
 
 	if (!(bound & I915_VMA_BIND_MASK)) {
+		err = __wait_for_unbind(vma, flags);
+		if (err)
+			goto err_active;
+
 		err = i915_vma_insert(vma, size, alignment, flags);
 		if (err)
 			goto err_active;
@@ -959,6 +982,7 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	GEM_BUG_ON(bound + I915_VMA_PAGES_ACTIVE < bound);
 	atomic_add(I915_VMA_PAGES_ACTIVE, &vma->pages_count);
 	list_move_tail(&vma->vm_link, &vma->vm->bound_list);
+	GEM_BUG_ON(!i915_vma_is_active(vma));
 
 	__i915_vma_pin(vma);
 	GEM_BUG_ON(!i915_vma_is_pinned(vma));
@@ -980,7 +1004,7 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	if (wakeref)
 		intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref);
 err_pages:
-	vma_put_pages(vma);
+	i915_vma_put_pages(vma);
 	return err;
 }
 
@@ -1084,6 +1108,7 @@ void i915_vma_release(struct kref *ref)
 		GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
 	}
 	GEM_BUG_ON(i915_vma_is_active(vma));
+	GEM_BUG_ON(!list_empty(&vma->vm_link));
 
 	if (vma->obj) {
 		struct drm_i915_gem_object *obj = vma->obj;
@@ -1142,7 +1167,7 @@ static void __i915_vma_iounmap(struct i915_vma *vma)
 {
 	GEM_BUG_ON(i915_vma_is_pinned(vma));
 
-	if (vma->iomap == NULL)
+	if (!vma->iomap)
 		return;
 
 	io_mapping_unmap(vma->iomap);
@@ -1232,31 +1257,9 @@ int i915_vma_move_to_active(struct i915_vma *vma,
 	return 0;
 }
 
-int __i915_vma_unbind(struct i915_vma *vma)
+void __i915_vma_evict(struct i915_vma *vma)
 {
-	int ret;
-
-	lockdep_assert_held(&vma->vm->mutex);
-
-	if (i915_vma_is_pinned(vma)) {
-		vma_print_allocator(vma, "is pinned");
-		return -EAGAIN;
-	}
-
-	/*
-	 * After confirming that no one else is pinning this vma, wait for
-	 * any laggards who may have crept in during the wait (through
-	 * a residual pin skipping the vm->mutex) to complete.
-	 */
-	ret = i915_vma_sync(vma);
-	if (ret)
-		return ret;
-
-	if (!drm_mm_node_allocated(&vma->node))
-		return 0;
-
 	GEM_BUG_ON(i915_vma_is_pinned(vma));
-	GEM_BUG_ON(i915_vma_is_active(vma));
 
 	if (i915_vma_is_map_and_fenceable(vma)) {
 		/* Force a pagefault for domain tracking on next user access */
@@ -1295,6 +1298,33 @@ int __i915_vma_unbind(struct i915_vma *vma)
 
 	i915_vma_detach(vma);
 	vma_unbind_pages(vma);
+}
+
+int __i915_vma_unbind(struct i915_vma *vma)
+{
+	int ret;
+
+	lockdep_assert_held(&vma->vm->mutex);
+
+	if (!drm_mm_node_allocated(&vma->node))
+		return 0;
+
+	if (i915_vma_is_pinned(vma)) {
+		vma_print_allocator(vma, "is pinned");
+		return -EAGAIN;
+	}
+
+	/*
+	 * After confirming that no one else is pinning this vma, wait for
+	 * any laggards who may have crept in during the wait (through
+	 * a residual pin skipping the vm->mutex) to complete.
+	 */
+	ret = i915_vma_sync(vma);
+	if (ret)
+		return ret;
+
+	GEM_BUG_ON(i915_vma_is_active(vma));
+	__i915_vma_evict(vma);
 
 	drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
 	return 0;
@@ -1306,13 +1336,13 @@ int i915_vma_unbind(struct i915_vma *vma)
 	intel_wakeref_t wakeref = 0;
 	int err;
 
-	if (!drm_mm_node_allocated(&vma->node))
-		return 0;
-
 	/* Optimistic wait before taking the mutex */
 	err = i915_vma_sync(vma);
 	if (err)
-		goto out_rpm;
+		return err;
+
+	if (!drm_mm_node_allocated(&vma->node))
+		return 0;
 
 	if (i915_vma_is_pinned(vma)) {
 		vma_print_allocator(vma, "is pinned");
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 8ad1daabcd58..478e8679f331 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -203,6 +203,7 @@ bool i915_vma_misplaced(const struct i915_vma *vma,
 			u64 size, u64 alignment, u64 flags);
 void __i915_vma_set_map_and_fenceable(struct i915_vma *vma);
 void i915_vma_revoke_mmap(struct i915_vma *vma);
+void __i915_vma_evict(struct i915_vma *vma);
 int __i915_vma_unbind(struct i915_vma *vma);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
 void i915_vma_unlink_ctx(struct i915_vma *vma);
@@ -239,6 +240,9 @@ int __must_check
 i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags);
 int i915_ggtt_pin(struct i915_vma *vma, u32 align, unsigned int flags);
 
+int i915_vma_get_pages(struct i915_vma *vma);
+void i915_vma_put_pages(struct i915_vma *vma);
+
 static inline int i915_vma_pin_count(const struct i915_vma *vma)
 {
 	return atomic_read(&vma->flags) & I915_VMA_PIN_MASK;
@@ -376,6 +380,7 @@ void i915_vma_make_shrinkable(struct i915_vma *vma);
 void i915_vma_make_purgeable(struct i915_vma *vma);
 
 int i915_vma_wait_for_bind(struct i915_vma *vma);
+int i915_vma_wait_for_unbind(struct i915_vma *vma);
 
 static inline int i915_vma_sync(struct i915_vma *vma)
 {
-- 
2.20.1

* [Intel-gfx] [PATCH 21/22] drm/i915/gem: Bind the fence async for execbuf
From: Chris Wilson @ 2020-05-20  7:55 UTC
  To: intel-gfx; +Cc: Chris Wilson

It is illegal to wait on another vma while holding the vm->mutex, as
that easily leads to ABBA deadlocks (we wait on a second vma that waits
on us to release the vm->mutex). So while the vm->mutex exists, move the
waiting outside of the lock into the async binding pipeline.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
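The recurring pattern below is to record the dependency on the bind
job's fence chain instead of blocking; a rough sketch (declarations
and error handling abbreviated):

    struct dma_fence *fence;
    int err = 0;

    /* under vm->mutex: must not dma_fence_wait() here */
    fence = i915_active_fence_get(&reg->active.excl);
    if (fence) {
            /* defer: the worker only runs once 'fence' has signaled */
            err = i915_sw_fence_await_dma_fence(&work->chain, fence, 0,
                                                GFP_NOWAIT | __GFP_NOWARN);
            dma_fence_put(fence);
    }
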
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  41 ++++--
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c  | 137 +++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h  |   5 +
 3 files changed, 166 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 8fadf2dfb4bc..5e3b89aa5beb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -519,13 +519,23 @@ eb_pin_vma(struct i915_execbuffer *eb,
 	}
 
 	if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_FENCE)) {
-		if (unlikely(i915_vma_pin_fence(vma))) {
-			i915_vma_unpin(vma);
-			return false;
-		}
+		struct i915_fence_reg *reg = vma->fence;
 
-		if (vma->fence)
+		/* Avoid waiting to change the fence; defer to async worker */
+		if (reg) {
+			if (READ_ONCE(reg->dirty)) {
+				__i915_vma_unpin(vma);
+				return false;
+			}
+
+			atomic_inc(&reg->pin_count);
 			ev->flags |= __EXEC_OBJECT_HAS_FENCE;
+		} else {
+			if (i915_gem_object_is_tiled(vma->obj)) {
+				__i915_vma_unpin(vma);
+				return false;
+			}
+		}
 	}
 
 	ev->flags |= __EXEC_OBJECT_HAS_PIN;
@@ -1066,15 +1076,6 @@ static int eb_reserve_vma(struct eb_vm_work *work, struct eb_vma *ev)
 		return err;
 
 pin:
-	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
-		err = __i915_vma_pin_fence(vma); /* XXX no waiting */
-		if (unlikely(err))
-			return err;
-
-		if (vma->fence)
-			ev->flags |= __EXEC_OBJECT_HAS_FENCE;
-	}
-
 	bind_flags &= ~atomic_read(&vma->flags);
 	if (bind_flags) {
 		err = set_bind_fence(vma, work);
@@ -1105,6 +1106,15 @@ static int eb_reserve_vma(struct eb_vm_work *work, struct eb_vma *ev)
 	ev->flags |= __EXEC_OBJECT_HAS_PIN;
 	GEM_BUG_ON(eb_vma_misplaced(entry, vma, ev->flags));
 
+	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
+		err = __i915_vma_pin_fence_async(vma, &work->base);
+		if (unlikely(err))
+			return err;
+
+		if (vma->fence)
+			ev->flags |= __EXEC_OBJECT_HAS_FENCE;
+	}
+
 	return 0;
 }
 
@@ -1140,6 +1150,9 @@ static int __eb_bind_vma(struct eb_vm_work *work, int err)
 	list_for_each_entry(ev, &work->unbound, bind_link) {
 		struct i915_vma *vma = ev->vma;
 
+		if (ev->flags & __EXEC_OBJECT_HAS_FENCE)
+			__i915_vma_apply_fence_async(vma);
+
 		if (!ev->bind_flags)
 			goto put;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
index 7fb36b12fe7a..734b6aa61809 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
@@ -21,10 +21,13 @@
  * IN THE SOFTWARE.
  */
 
+#include "i915_active.h"
 #include "i915_drv.h"
 #include "i915_scatterlist.h"
+#include "i915_sw_fence_work.h"
 #include "i915_pvinfo.h"
 #include "i915_vgpu.h"
+#include "i915_vma.h"
 
 /**
  * DOC: fence register handling
@@ -340,19 +343,37 @@ static struct i915_fence_reg *fence_find(struct i915_ggtt *ggtt)
 	return ERR_PTR(-EDEADLK);
 }
 
+static int fence_wait_bind(struct i915_fence_reg *reg)
+{
+	struct dma_fence *fence;
+	int err = 0;
+
+	fence = i915_active_fence_get(&reg->active.excl);
+	if (fence) {
+		err = dma_fence_wait(fence, true);
+		dma_fence_put(fence);
+	}
+
+	return err;
+}
+
 int __i915_vma_pin_fence(struct i915_vma *vma)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vma->vm);
-	struct i915_fence_reg *fence;
+	struct i915_fence_reg *fence = vma->fence;
 	struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL;
 	int err;
 
 	lockdep_assert_held(&vma->vm->mutex);
 
 	/* Just update our place in the LRU if our fence is getting reused. */
-	if (vma->fence) {
-		fence = vma->fence;
+	if (fence) {
 		GEM_BUG_ON(fence->vma != vma);
+
+		err = fence_wait_bind(fence);
+		if (err)
+			return err;
+
 		atomic_inc(&fence->pin_count);
 		if (!fence->dirty) {
 			list_move_tail(&fence->link, &ggtt->fence_list);
@@ -384,6 +405,116 @@ int __i915_vma_pin_fence(struct i915_vma *vma)
 	return err;
 }
 
+static int set_bind_fence(struct i915_fence_reg *fence,
+			  struct dma_fence_work *work)
+{
+	struct dma_fence *prev;
+	int err;
+
+	if (rcu_access_pointer(fence->active.excl.fence) == &work->dma)
+		return 0;
+
+	err = i915_sw_fence_await_active(&work->chain,
+					 &fence->active,
+					 I915_ACTIVE_AWAIT_ACTIVE);
+	if (err)
+		return err;
+
+	if (i915_active_acquire(&fence->active))
+		return -ENOENT;
+
+	prev = i915_active_set_exclusive(&fence->active, &work->dma);
+	if (unlikely(prev)) {
+		err = i915_sw_fence_await_dma_fence(&work->chain, prev, 0,
+						    GFP_NOWAIT | __GFP_NOWARN);
+		dma_fence_put(prev);
+	}
+
+	i915_active_release(&fence->active);
+	return err < 0 ? err : 0;
+}
+
+int __i915_vma_pin_fence_async(struct i915_vma *vma,
+			       struct dma_fence_work *work)
+{
+	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vma->vm);
+	struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL;
+	struct i915_fence_reg *fence = vma->fence;
+	int err;
+
+	lockdep_assert_held(&vma->vm->mutex);
+
+	/* Just update our place in the LRU if our fence is getting reused. */
+	if (fence) {
+		GEM_BUG_ON(fence->vma != vma);
+		GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
+	} else if (set) {
+		if (!i915_vma_is_map_and_fenceable(vma))
+			return -EINVAL;
+
+		fence = fence_find(ggtt);
+		if (IS_ERR(fence))
+			return -ENOSPC;
+
+		GEM_BUG_ON(atomic_read(&fence->pin_count));
+		fence->dirty = true;
+	} else {
+		return 0;
+	}
+
+	atomic_inc(&fence->pin_count);
+	list_move_tail(&fence->link, &ggtt->fence_list);
+	if (!fence->dirty)
+		return 0;
+
+	if (INTEL_GEN(fence_to_i915(fence)) < 4 &&
+	    rcu_access_pointer(vma->active.excl.fence) != &work->dma) {
+		/* implicit 'unfenced' GPU blits */
+		err = i915_sw_fence_await_active(&work->chain,
+						 &vma->active,
+						 I915_ACTIVE_AWAIT_ACTIVE);
+		if (err)
+			goto err_unpin;
+	}
+
+	err = set_bind_fence(fence, work);
+	if (err)
+		goto err_unpin;
+
+	if (set) {
+		fence->start = vma->node.start;
+		fence->size  = vma->fence_size;
+		fence->stride = i915_gem_object_get_stride(vma->obj);
+		fence->tiling = i915_gem_object_get_tiling(vma->obj);
+
+		vma->fence = fence;
+	} else {
+		fence->tiling = 0;
+		vma->fence = NULL;
+	}
+
+	set = xchg(&fence->vma, set);
+	if (set && set != vma) {
+		GEM_BUG_ON(set->fence != fence);
+		WRITE_ONCE(set->fence, NULL);
+		i915_vma_revoke_mmap(set);
+	}
+
+	return 0;
+
+err_unpin:
+	atomic_dec(&fence->pin_count);
+	return err;
+}
+
+void __i915_vma_apply_fence_async(struct i915_vma *vma)
+{
+	struct i915_fence_reg *fence = vma->fence;
+
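+	/* runs in the bind worker, once all awaited fences have signaled */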
+	if (fence->dirty)
+		fence_write(fence);
+}
+
 /**
  * i915_vma_pin_fence - set up fencing for a vma
  * @vma: vma to map through a fence reg
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h
index 9eef679e1311..d306ac14d47e 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h
@@ -30,6 +30,7 @@
 
 #include "i915_active.h"
 
+struct dma_fence_work;
 struct drm_i915_gem_object;
 struct i915_ggtt;
 struct i915_vma;
@@ -70,6 +71,10 @@ void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj,
 void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj,
 					 struct sg_table *pages);
 
+int __i915_vma_pin_fence_async(struct i915_vma *vma,
+			       struct dma_fence_work *work);
+void __i915_vma_apply_fence_async(struct i915_vma *vma);
+
 void intel_ggtt_init_fences(struct i915_ggtt *ggtt);
 void intel_ggtt_fini_fences(struct i915_ggtt *ggtt);
 
-- 
2.20.1

* [Intel-gfx] [PATCH 22/22] drm/i915: Micro-optimise i915_request_completed()
From: Chris Wilson @ 2020-05-20  7:55 UTC
  To: intel-gfx; +Cc: Chris Wilson

This is primarily focused on reducing the number of lockdep checks we
endure in CI, as each i915_request_completed() takes the rcu_read_lock()
and we use i915_request_completed() very, very frequently.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
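The __i915_request_completed() used below is presumably the RCU-less
variant hoisted into i915_request.h, modelled on the __request_completed()
being removed from intel_breadcrumbs.c:

    /* caller ensures the hwsp stays valid, e.g. by holding rcu_read_lock() */
    static inline bool __i915_request_completed(const struct i915_request *rq)
    {
            return i915_seqno_passed(__hwsp_seqno(rq), rq->fence.seqno);
    }
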
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c   | 11 +++-------
 drivers/gpu/drm/i915/gt/intel_lrc.c           | 20 +++++++++----------
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  2 +-
 drivers/gpu/drm/i915/gt/intel_timeline.c      |  4 ++--
 drivers/gpu/drm/i915/i915_request.c           | 10 +++++-----
 drivers/gpu/drm/i915/i915_request.h           |  5 +++++
 6 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index d907d538176e..39335d5ceb5d 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -82,11 +82,6 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
 	spin_unlock_irqrestore(&b->irq_lock, flags);
 }
 
-static inline bool __request_completed(const struct i915_request *rq)
-{
-	return i915_seqno_passed(__hwsp_seqno(rq), rq->fence.seqno);
-}
-
 __maybe_unused static bool
 check_signal_order(struct intel_context *ce, struct i915_request *rq)
 {
@@ -177,7 +172,7 @@ static void signal_irq_work(struct irq_work *work)
 				list_entry(pos, typeof(*rq), signal_link);
 
 			GEM_BUG_ON(!check_signal_order(ce, rq));
-			if (!__request_completed(rq))
+			if (!__i915_request_completed(rq))
 				break;
 
 			/*
@@ -293,7 +288,7 @@ void intel_engine_transfer_stale_breadcrumbs(struct intel_engine_cs *engine,
 		/* Queue for executing the signal callbacks in the irq_work */
 		list_for_each_entry_safe(rq, next, &ce->signals, signal_link) {
 			GEM_BUG_ON(rq->engine != engine);
-			GEM_BUG_ON(!__request_completed(rq));
+			GEM_BUG_ON(!__i915_request_completed(rq));
 
 			__signal_request(rq, &b->signaled_requests);
 		}
@@ -356,7 +351,7 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq)
 		spin_unlock(&b->irq_lock);
 	}
 
-	return !__request_completed(rq);
+	return !__i915_request_completed(rq);
 }
 
 void i915_request_cancel_breadcrumb(struct i915_request *rq)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index f85a2be92dc6..80a4369b4ba9 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -374,7 +374,7 @@ active_request(const struct intel_timeline * const tl, struct i915_request *rq)
 
 	rcu_read_lock();
 	list_for_each_entry_continue_reverse(rq, &tl->requests, link) {
-		if (i915_request_completed(rq))
+		if (__i915_request_completed(rq))
 			break;
 
 		active = rq;
@@ -1112,7 +1112,7 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 	list_for_each_entry_safe_reverse(rq, rn,
 					 &engine->active.requests,
 					 sched.link) {
-		if (i915_request_completed(rq))
+		if (__i915_request_completed(rq))
 			continue; /* XXX */
 
 		__i915_request_unsubmit(rq);
@@ -1265,7 +1265,7 @@ static void reset_active(struct i915_request *rq,
 		     rq->fence.context, rq->fence.seqno);
 
 	/* On resubmission of the active request, payload will be scrubbed */
-	if (i915_request_completed(rq))
+	if (__i915_request_completed(rq))
 		head = rq->tail;
 	else
 		head = active_request(ce->timeline, rq)->head;
@@ -1424,7 +1424,7 @@ __execlists_schedule_out(struct i915_request *rq,
 	 * idle and we want to re-enter powersaving.
 	 */
 	if (list_is_last_rcu(&rq->link, &ce->timeline->requests) &&
-	    i915_request_completed(rq))
+	    __i915_request_completed(rq))
 		intel_engine_add_retire(engine, ce->timeline);
 
 	ccid >>= GEN11_SW_CTX_ID_SHIFT - 32;
@@ -1667,7 +1667,7 @@ assert_pending_valid(const struct intel_engine_execlists *execlists,
 		if (!spin_trylock_irqsave(&rq->lock, flags))
 			continue;
 
-		if (i915_request_completed(rq))
+		if (__i915_request_completed(rq))
 			goto unlock;
 
 		if (i915_active_is_idle(&ce->active) &&
@@ -1780,7 +1780,7 @@ static bool can_merge_rq(const struct i915_request *prev,
 	 * contexts, despite the best efforts of preempt-to-busy to confuse
 	 * us.
 	 */
-	if (i915_request_completed(next))
+	if (__i915_request_completed(next))
 		return true;
 
 	if (unlikely((i915_request_flags(prev) ^ i915_request_flags(next)) &
@@ -2011,7 +2011,7 @@ static unsigned long active_timeslice(const struct intel_engine_cs *engine)
 	const struct intel_engine_execlists *execlists = &engine->execlists;
 	const struct i915_request *rq = *execlists->active;
 
-	if (!rq || i915_request_completed(rq))
+	if (!rq || __i915_request_completed(rq))
 		return 0;
 
 	if (READ_ONCE(execlists->switch_priority_hint) < effective_prio(rq))
@@ -2142,7 +2142,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 */
 	if ((last = *active)) {
 		if (need_preempt(engine, last, ve)) {
-			if (i915_request_completed(last)) {
+			if (__i915_request_completed(last)) {
 				tasklet_hi_schedule(&execlists->tasklet);
 				return;
 			}
@@ -2174,7 +2174,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			last = NULL;
 		} else if (need_timeslice(engine, last, ve) &&
 			   timeslice_expired(execlists, last)) {
-			if (i915_request_completed(last)) {
+			if (__i915_request_completed(last)) {
 				tasklet_hi_schedule(&execlists->tasklet);
 				return;
 			}
@@ -5579,7 +5579,7 @@ static void virtual_submit_request(struct i915_request *rq)
 		i915_request_put(old);
 	}
 
-	if (i915_request_completed(rq)) {
+	if (__i915_request_completed(rq)) {
 		__i915_request_submit(rq);
 
 		ve->base.execlists.queue_priority_hint = INT_MIN;
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index ca7286e58409..d50470051404 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -805,7 +805,7 @@ static void reset_rewind(struct intel_engine_cs *engine, bool stalled)
 	rq = NULL;
 	spin_lock_irqsave(&engine->active.lock, flags);
 	list_for_each_entry(pos, &engine->active.requests, sched.link) {
-		if (!i915_request_completed(pos)) {
+		if (!__i915_request_completed(pos)) {
 			rq = pos;
 			break;
 		}
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 4546284fede1..11529acf704c 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -544,11 +544,11 @@ int intel_timeline_read_hwsp(struct i915_request *from,
 
 	rcu_read_lock();
 	cl = rcu_dereference(from->hwsp_cacheline);
-	if (i915_request_completed(from)) /* confirm cacheline is valid */
+	if (__i915_request_completed(from)) /* confirm cacheline is valid */
 		goto unlock;
 	if (unlikely(!i915_active_acquire_if_busy(&cl->active)))
 		goto unlock; /* seqno wrapped and completed! */
-	if (unlikely(i915_request_completed(from)))
+	if (unlikely(__i915_request_completed(from)))
 		goto release;
 	rcu_read_unlock();
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index d71728edca57..eacfb026ddb6 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -271,7 +271,7 @@ static void remove_from_engine(struct i915_request *rq)
 
 bool i915_request_retire(struct i915_request *rq)
 {
-	if (!i915_request_completed(rq))
+	if (!__i915_request_completed(rq))
 		return false;
 
 	RQ_TRACE(rq, "\n");
@@ -340,7 +340,7 @@ void i915_request_retire_upto(struct i915_request *rq)
 
 	RQ_TRACE(rq, "\n");
 
-	GEM_BUG_ON(!i915_request_completed(rq));
+	GEM_BUG_ON(!__i915_request_completed(rq));
 
 	do {
 		tmp = list_first_entry(&tl->requests, typeof(*tmp), link);
@@ -464,7 +464,7 @@ bool __i915_request_submit(struct i915_request *request)
 	 * dropped upon retiring. (Otherwise if resubmit a *retired*
 	 * request, this would be a horrible use-after-free.)
 	 */
-	if (i915_request_completed(request))
+	if (__i915_request_completed(request))
 		goto xfer;
 
 	if (unlikely(intel_context_is_banned(request->context)))
@@ -772,7 +772,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 	RCU_INIT_POINTER(rq->timeline, tl);
 	RCU_INIT_POINTER(rq->hwsp_cacheline, tl->hwsp_cacheline);
 	rq->hwsp_seqno = tl->hwsp_seqno;
-	GEM_BUG_ON(i915_request_completed(rq));
+	GEM_BUG_ON(__i915_request_completed(rq));
 
 	rq->rcustate = get_state_synchronize_rcu(); /* acts as smp_mb() */
 
@@ -1588,7 +1588,7 @@ __i915_request_add_to_timeline(struct i915_request *rq)
 	 */
 	prev = to_request(__i915_active_fence_set(&timeline->last_request,
 						  &rq->fence));
-	if (prev && !i915_request_completed(prev)) {
+	if (prev && !__i915_request_completed(prev)) {
 		/*
 		 * The requests are supposed to be kept in order. However,
 		 * we need to be wary in case the timeline->last_request
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 5d4709a3dace..6a3bd8bed383 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -417,6 +417,11 @@ static inline u32 __hwsp_seqno(const struct i915_request *rq)
 	return READ_ONCE(*hwsp);
 }
 
+static inline bool __i915_request_completed(const struct i915_request *rq)
+{
+	return i915_seqno_passed(__hwsp_seqno(rq), rq->fence.seqno);
+}
+
 /**
  * hwsp_seqno - the current breadcrumb value in the HW status page
  * @rq: the request, to chase the relevant HW status page
-- 
2.20.1


* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/22] drm/i915/gem: Suppress some random warnings
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (20 preceding siblings ...)
  2020-05-20  7:55 ` [Intel-gfx] [PATCH 22/22] drm/i915: Micro-optimise i915_request_completed() Chris Wilson
@ 2020-05-20  8:37 ` Patchwork
  2020-05-20  8:38 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Patchwork @ 2020-05-20  8:37 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/22] drm/i915/gem: Suppress some random warnings
URL   : https://patchwork.freedesktop.org/series/77459/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
35b35b313bad drm/i915/gem: Suppress some random warnings
-:62: CHECK:COMPARISON_TO_NULL: Comparison to NULL could be written "!obj->userptr.mm"
#62: FILE: drivers/gpu/drm/i915/gem/i915_gem_userptr.c:238:
+	if (GEM_WARN_ON(obj->userptr.mm == NULL))

total: 0 errors, 0 warnings, 1 checks, 35 lines checked
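
As the COMPARISON_TO_NULL check suggests, the kernel-preferred spelling
drops the explicit NULL comparison, i.e.:

	if (GEM_WARN_ON(!obj->userptr.mm))
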
ec289918ff88 drm/i915/execlists: Shortcircuit queue_prio() for no internal levels
ad7a7cab4515 drm/i915: Avoid using rq->engine after free during i915_fence_release
8ea14b342e49 drm/i915: Move saturated workload detection back to the context
-:22: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#22: 
References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")

-:22: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")'
#22: 
References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")

total: 1 errors, 1 warnings, 0 checks, 68 lines checked
c4839c4c4fe9 drm/i915/gt: Use virtual_engine during execlists_dequeue
f63db413341d drm/i915/gt: Decouple inflight virtual engines
0435401407a9 drm/i915/gt: Resubmit the virtual engine on schedule-out
b72331f639dd drm/i915: Improve execute_cb struct packing
cc0b5daa9c3a dma-buf: Proxy fence, an unsignaled fence placeholder
-:45: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#45: 
new file mode 100644

-:380: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without comment
#380: FILE: drivers/dma-buf/st-dma-fence-proxy.c:20:
+	spinlock_t lock;

-:540: WARNING:MEMORY_BARRIER: memory barrier without comment
#540: FILE: drivers/dma-buf/st-dma-fence-proxy.c:180:
+	smp_store_mb(container_of(cb, struct simple_cb, cb)->seen, true);

total: 0 errors, 2 warnings, 1 checks, 1043 lines checked
be1de9ed0950 drm/syncobj: Allow use of dma-fence-proxy
6995a5de6f1d drm/i915/gem: Teach execbuf how to wait on future syncobj
c78b3fceaec2 drm/i915/gem: Allow combining submit-fences with syncobj
44980c2e4806 drm/i915/gt: Declare when we enabled timeslicing
a612c3379605 drm/i915/gt: Use built-in active intel_context reference
1db0d219282e drm/i915: Drop I915_IDLE_ENGINES_TIMEOUT
a166cd2383c3 drm/i915: Always defer fenced work to the worker
4d8c9d1e6bf7 drm/i915/gem: Assign context id for async work
d4f5912b32c2 drm/i915: Export a preallocate variant of i915_active_acquire()
32a908f52387 drm/i915/gem: Separate the ww_mutex walker into its own list
-:92: WARNING:LONG_LINE: line over 100 characters
#92: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1811:
+			list_for_each_entry_safe_continue_reverse(unlock, en, &eb->lock, lock_link) {

-:140: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'pos' - possible side-effects?
#140: FILE: drivers/gpu/drm/i915/i915_utils.h:269:
+#define list_for_each_entry_safe_continue_reverse(pos, n, head, member)	\
+	for (pos = list_prev_entry(pos, member),			\
+		n = list_prev_entry(pos, member);			\
+	     &pos->member != (head);					\
+	     pos = n, n = list_prev_entry(n, member))

-:140: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'n' - possible side-effects?
#140: FILE: drivers/gpu/drm/i915/i915_utils.h:269:
+#define list_for_each_entry_safe_continue_reverse(pos, n, head, member)	\
+	for (pos = list_prev_entry(pos, member),			\
+		n = list_prev_entry(pos, member);			\
+	     &pos->member != (head);					\
+	     pos = n, n = list_prev_entry(n, member))

-:140: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'member' - possible side-effects?
#140: FILE: drivers/gpu/drm/i915/i915_utils.h:269:
+#define list_for_each_entry_safe_continue_reverse(pos, n, head, member)	\
+	for (pos = list_prev_entry(pos, member),			\
+		n = list_prev_entry(pos, member);			\
+	     &pos->member != (head);					\
+	     pos = n, n = list_prev_entry(n, member))

total: 0 errors, 1 warnings, 3 checks, 120 lines checked
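
The MACRO_ARG_REUSE checks are advisory: list iterator macros
conventionally reuse their arguments, and the hazard only bites if a
caller passes an expression with side effects, which is then evaluated
once per expansion. A self-contained illustration of that hazard
(hypothetical code, not from the series):

	#include <stdio.h>

	static int counter;

	static int next(void)
	{
		return ++counter;	/* side effect: advances the counter */
	}

	/* Reuses its argument, like the flagged iterator macro. */
	#define DOUBLE(x) ((x) + (x))

	int main(void)
	{
		/* next() is evaluated twice inside the macro, so this
		 * prints 3 (1 + 2), not the 2 that a real function
		 * double(next()) would return.
		 */
		printf("%d\n", DOUBLE(next()));
		return 0;
	}
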
93d1d01786b9 drm/i915/gem: Asynchronous GTT unbinding
002b1ada3b2c drm/i915/gem: Bind the fence async for execbuf
b91e0003626c drm/i915: Micro-optimise i915_request_completed()


* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [01/22] drm/i915/gem: Suppress some random warnings
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (21 preceding siblings ...)
  2020-05-20  8:37 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/22] drm/i915/gem: Suppress some random warnings Patchwork
@ 2020-05-20  8:38 ` Patchwork
  2020-05-20  9:03 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
  2020-05-20 20:47 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
  24 siblings, 0 replies; 26+ messages in thread
From: Patchwork @ 2020-05-20  8:38 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/22] drm/i915/gem: Suppress some random warnings
URL   : https://patchwork.freedesktop.org/series/77459/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.
+drivers/gpu/drm/i915/intel_wakeref.c:137:19: warning: context imbalance in 'wakeref_auto_timeout' - unexpected unlock
+drivers/gpu/drm/i915/selftests/i915_syncmap.c:80:54: warning: dubious: x | !y
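
A "context imbalance" warning like the first one above usually means
sparse cannot prove a matched lock/unlock pair on every path through
the function. When the asymmetry is intentional, an acquire/release
annotation documents it; a minimal sketch (illustrative only, not the
actual intel_wakeref.c code):

	#include <linux/spinlock.h>

	/* Deliberately returns with the lock dropped; __releases()
	 * tells sparse the imbalance is by design, silencing the
	 * warning.
	 */
	static void example_timeout(spinlock_t *lock)
		__releases(lock)
	{
		spin_unlock(lock);
	}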


* [Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [01/22] drm/i915/gem: Suppress some random warnings
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (22 preceding siblings ...)
  2020-05-20  8:38 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2020-05-20  9:03 ` Patchwork
  2020-05-20 20:47 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
  24 siblings, 0 replies; 26+ messages in thread
From: Patchwork @ 2020-05-20  9:03 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/22] drm/i915/gem: Suppress some random warnings
URL   : https://patchwork.freedesktop.org/series/77459/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8509 -> Patchwork_17724
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/index.html

New tests
---------

  New tests have been introduced between CI_DRM_8509 and Patchwork_17724:

### New IGT tests (1) ###

  * igt@dmabuf@all@dma_fence_proxy:
    - Statuses : 38 pass(s)
    - Exec time: [0.02, 0.10] s

  

Known issues
------------

  Here are the changes found in Patchwork_17724 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live@execlists:
    - fi-skl-lmem:        [PASS][1] -> [INCOMPLETE][2] ([i915#1874])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/fi-skl-lmem/igt@i915_selftest@live@execlists.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/fi-skl-lmem/igt@i915_selftest@live@execlists.html

  
  [i915#1874]: https://gitlab.freedesktop.org/drm/intel/issues/1874


Participating hosts (48 -> 44)
------------------------------

  Additional (1): fi-kbl-7560u 
  Missing    (5): fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_8509 -> Patchwork_17724

  CI-20190529: 20190529
  CI_DRM_8509: ea6a2729d3d286137415319de4161042b0337e87 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5662: e79462659e0f45cd3f4f766f58cb792303c6bf9b @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_17724: b91e0003626c8e9af7d9bc5c6716d6dfb68a02c0 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

b91e0003626c drm/i915: Micro-optimise i915_request_completed()
002b1ada3b2c drm/i915/gem: Bind the fence async for execbuf
93d1d01786b9 drm/i915/gem: Asynchronous GTT unbinding
32a908f52387 drm/i915/gem: Separate the ww_mutex walker into its own list
d4f5912b32c2 drm/i915: Export a preallocate variant of i915_active_acquire()
4d8c9d1e6bf7 drm/i915/gem: Assign context id for async work
a166cd2383c3 drm/i915: Always defer fenced work to the worker
1db0d219282e drm/i915: Drop I915_IDLE_ENGINES_TIMEOUT
a612c3379605 drm/i915/gt: Use built-in active intel_context reference
44980c2e4806 drm/i915/gt: Declare when we enabled timeslicing
c78b3fceaec2 drm/i915/gem: Allow combining submit-fences with syncobj
6995a5de6f1d drm/i915/gem: Teach execbuf how to wait on future syncobj
be1de9ed0950 drm/syncobj: Allow use of dma-fence-proxy
cc0b5daa9c3a dma-buf: Proxy fence, an unsignaled fence placeholder
b72331f639dd drm/i915: Improve execute_cb struct packing
0435401407a9 drm/i915/gt: Resubmit the virtual engine on schedule-out
f63db413341d drm/i915/gt: Decouple inflight virtual engines
c4839c4c4fe9 drm/i915/gt: Use virtual_engine during execlists_dequeue
8ea14b342e49 drm/i915: Move saturated workload detection back to the context
ad7a7cab4515 drm/i915: Avoid using rq->engine after free during i915_fence_release
ec289918ff88 drm/i915/execlists: Shortcircuit queue_prio() for no internal levels
35b35b313bad drm/i915/gem: Suppress some random warnings

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/index.html

* [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [01/22] drm/i915/gem: Suppress some random warnings
  2020-05-20  7:54 [Intel-gfx] [PATCH 01/22] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (23 preceding siblings ...)
  2020-05-20  9:03 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2020-05-20 20:47 ` Patchwork
  24 siblings, 0 replies; 26+ messages in thread
From: Patchwork @ 2020-05-20 20:47 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/22] drm/i915/gem: Suppress some random warnings
URL   : https://patchwork.freedesktop.org/series/77459/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8509_full -> Patchwork_17724_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes introduced with Patchwork_17724_full need to be
  verified manually.

  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_17724_full, please notify your bug team so they
  can document this new failure mode, which will reduce false positives
  in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_17724_full:

### CI changes ###

#### Possible regressions ####

  * boot:
    - shard-skl:          ([PASS][1], [PASS][2], [PASS][3], [PASS][4], [PASS][5], [PASS][6], [PASS][7], [PASS][8], [PASS][9], [PASS][10], [PASS][11], [PASS][12], [PASS][13], [PASS][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], [PASS][19], [PASS][20], [PASS][21], [PASS][22], [PASS][23], [PASS][24]) -> ([PASS][25], [PASS][26], [PASS][27], [PASS][28], [PASS][29], [PASS][30], [PASS][31], [PASS][32], [PASS][33], [PASS][34], [PASS][35], [PASS][36], [PASS][37], [PASS][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42], [FAIL][43], [PASS][44], [PASS][45], [PASS][46], [PASS][47], [PASS][48], [PASS][49])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl9/boot.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl9/boot.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl8/boot.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl8/boot.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl7/boot.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl7/boot.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl6/boot.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl6/boot.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl5/boot.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl5/boot.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl5/boot.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl4/boot.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl4/boot.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl4/boot.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl3/boot.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl3/boot.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl2/boot.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl2/boot.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl2/boot.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl10/boot.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl10/boot.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl1/boot.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl1/boot.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl1/boot.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl10/boot.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl9/boot.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl9/boot.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl9/boot.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl8/boot.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl8/boot.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl8/boot.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl7/boot.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl7/boot.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl6/boot.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl6/boot.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl5/boot.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl5/boot.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl4/boot.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl4/boot.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl3/boot.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl3/boot.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl3/boot.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl2/boot.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl2/boot.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl10/boot.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl10/boot.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl1/boot.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl1/boot.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl1/boot.html

  

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_balancer@semaphore:
    - shard-kbl:          [PASS][50] -> [DMESG-WARN][51]
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-kbl7/igt@gem_exec_balancer@semaphore.html
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-kbl6/igt@gem_exec_balancer@semaphore.html
    - shard-iclb:         [PASS][52] -> [DMESG-WARN][53]
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-iclb6/igt@gem_exec_balancer@semaphore.html
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-iclb4/igt@gem_exec_balancer@semaphore.html

  * igt@gem_linear_blits@interruptible:
    - shard-snb:          [PASS][54] -> [FAIL][55]
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-snb4/igt@gem_linear_blits@interruptible.html
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-snb4/igt@gem_linear_blits@interruptible.html

  * igt@runner@aborted:
    - shard-iclb:         NOTRUN -> [FAIL][56]
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-iclb4/igt@runner@aborted.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * {igt@gem_exec_balancer@sliced}:
    - shard-kbl:          [PASS][57] -> [INCOMPLETE][58]
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-kbl4/igt@gem_exec_balancer@sliced.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-kbl2/igt@gem_exec_balancer@sliced.html
    - shard-tglb:         [PASS][59] -> [INCOMPLETE][60]
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-tglb7/igt@gem_exec_balancer@sliced.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-tglb6/igt@gem_exec_balancer@sliced.html

  * {igt@gem_exec_fence@syncobj-invalid-wait}:
    - shard-snb:          [PASS][61] -> [FAIL][62]
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-snb2/igt@gem_exec_fence@syncobj-invalid-wait.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-snb1/igt@gem_exec_fence@syncobj-invalid-wait.html
    - shard-tglb:         [PASS][63] -> [FAIL][64]
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-tglb6/igt@gem_exec_fence@syncobj-invalid-wait.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-tglb3/igt@gem_exec_fence@syncobj-invalid-wait.html
    - shard-skl:          [PASS][65] -> [FAIL][66]
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl7/igt@gem_exec_fence@syncobj-invalid-wait.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl10/igt@gem_exec_fence@syncobj-invalid-wait.html
    - shard-glk:          [PASS][67] -> [FAIL][68]
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-glk7/igt@gem_exec_fence@syncobj-invalid-wait.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-glk5/igt@gem_exec_fence@syncobj-invalid-wait.html
    - shard-apl:          [PASS][69] -> [FAIL][70]
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-apl2/igt@gem_exec_fence@syncobj-invalid-wait.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-apl1/igt@gem_exec_fence@syncobj-invalid-wait.html
    - shard-kbl:          [PASS][71] -> [FAIL][72]
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-kbl3/igt@gem_exec_fence@syncobj-invalid-wait.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-kbl3/igt@gem_exec_fence@syncobj-invalid-wait.html
    - shard-iclb:         [PASS][73] -> [FAIL][74]
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-iclb8/igt@gem_exec_fence@syncobj-invalid-wait.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-iclb3/igt@gem_exec_fence@syncobj-invalid-wait.html

  
New tests
---------

  New tests have been introduced between CI_DRM_8509_full and Patchwork_17724_full:

### New IGT tests (1) ###

  * igt@dmabuf@all@dma_fence_proxy:
    - Statuses : 7 pass(s)
    - Exec time: [0.03, 0.13] s

  

Known issues
------------

  Here are the changes found in Patchwork_17724_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_workarounds@suspend-resume-context:
    - shard-apl:          [PASS][75] -> [DMESG-WARN][76] ([i915#180]) +1 similar issue
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-apl8/igt@gem_workarounds@suspend-resume-context.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-apl6/igt@gem_workarounds@suspend-resume-context.html

  * igt@gem_workarounds@suspend-resume-fd:
    - shard-kbl:          [PASS][77] -> [DMESG-WARN][78] ([i915#180])
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-kbl7/igt@gem_workarounds@suspend-resume-fd.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-kbl2/igt@gem_workarounds@suspend-resume-fd.html

  * igt@kms_cursor_crc@pipe-a-cursor-128x42-random:
    - shard-skl:          [PASS][79] -> [FAIL][80] ([i915#54])
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl10/igt@kms_cursor_crc@pipe-a-cursor-128x42-random.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl8/igt@kms_cursor_crc@pipe-a-cursor-128x42-random.html

  * igt@kms_cursor_legacy@pipe-b-torture-bo:
    - shard-iclb:         [PASS][81] -> [DMESG-WARN][82] ([i915#128])
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-iclb8/igt@kms_cursor_legacy@pipe-b-torture-bo.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-iclb3/igt@kms_cursor_legacy@pipe-b-torture-bo.html

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          [PASS][83] -> [FAIL][84] ([fdo#108145] / [i915#265]) +2 similar issues
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl1/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl3/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html

  * igt@kms_psr2_su@frontbuffer:
    - shard-iclb:         [PASS][85] -> [SKIP][86] ([fdo#109642] / [fdo#111068])
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-iclb2/igt@kms_psr2_su@frontbuffer.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-iclb4/igt@kms_psr2_su@frontbuffer.html

  * igt@kms_psr@psr2_primary_mmap_cpu:
    - shard-iclb:         [PASS][87] -> [SKIP][88] ([fdo#109441]) +1 similar issue
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-iclb2/igt@kms_psr@psr2_primary_mmap_cpu.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-iclb7/igt@kms_psr@psr2_primary_mmap_cpu.html

  
#### Possible fixes ####

  * {igt@kms_flip@flip-vs-suspend@a-dp1}:
    - shard-apl:          [DMESG-WARN][89] ([i915#180]) -> [PASS][90] +2 similar issues
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-apl6/igt@kms_flip@flip-vs-suspend@a-dp1.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-apl7/igt@kms_flip@flip-vs-suspend@a-dp1.html
    - shard-kbl:          [DMESG-WARN][91] ([i915#180]) -> [PASS][92] +3 similar issues
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-kbl7/igt@kms_flip@flip-vs-suspend@a-dp1.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-kbl6/igt@kms_flip@flip-vs-suspend@a-dp1.html

  * igt@kms_hdr@bpc-switch-dpms:
    - shard-skl:          [FAIL][93] ([i915#1188]) -> [PASS][94] +1 similar issue
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl5/igt@kms_hdr@bpc-switch-dpms.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl4/igt@kms_hdr@bpc-switch-dpms.html

  * igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min:
    - shard-skl:          [FAIL][95] ([fdo#108145] / [i915#265]) -> [PASS][96]
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-skl4/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-skl3/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html

  * igt@kms_psr@psr2_primary_page_flip:
    - shard-iclb:         [SKIP][97] ([fdo#109441]) -> [PASS][98] +1 similar issue
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-iclb5/igt@kms_psr@psr2_primary_page_flip.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-iclb2/igt@kms_psr@psr2_primary_page_flip.html

  * igt@kms_setmode@basic:
    - shard-kbl:          [FAIL][99] ([i915#31]) -> [PASS][100]
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-kbl2/igt@kms_setmode@basic.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-kbl7/igt@kms_setmode@basic.html

  
#### Warnings ####

  * igt@i915_pm_dc@dc3co-vpb-simulation:
    - shard-iclb:         [SKIP][101] ([i915#588]) -> [SKIP][102] ([i915#658])
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-iclb2/igt@i915_pm_dc@dc3co-vpb-simulation.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-iclb4/igt@i915_pm_dc@dc3co-vpb-simulation.html

  * igt@kms_content_protection@atomic-dpms:
    - shard-apl:          [FAIL][103] ([fdo#110321] / [fdo#110336]) -> [TIMEOUT][104] ([i915#1319])
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-apl1/igt@kms_content_protection@atomic-dpms.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-apl3/igt@kms_content_protection@atomic-dpms.html

  * igt@kms_content_protection@legacy:
    - shard-apl:          [TIMEOUT][105] ([i915#1319]) -> [FAIL][106] ([fdo#110321] / [fdo#110336])
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-apl8/igt@kms_content_protection@legacy.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-apl6/igt@kms_content_protection@legacy.html

  * igt@kms_content_protection@lic:
    - shard-apl:          [TIMEOUT][107] ([i915#1319]) -> [DMESG-FAIL][108] ([fdo#110321])
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-apl2/igt@kms_content_protection@lic.html
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-apl8/igt@kms_content_protection@lic.html

  * igt@kms_content_protection@srm:
    - shard-apl:          [FAIL][109] ([fdo#110321]) -> [TIMEOUT][110] ([i915#1319])
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-apl4/igt@kms_content_protection@srm.html
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-apl1/igt@kms_content_protection@srm.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-basic:
    - shard-apl:          [FAIL][111] ([fdo#108145] / [i915#265] / [i915#95]) -> [FAIL][112] ([fdo#108145] / [i915#265])
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-apl8/igt@kms_plane_alpha_blend@pipe-a-alpha-basic.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-apl4/igt@kms_plane_alpha_blend@pipe-a-alpha-basic.html

  * igt@kms_setmode@basic:
    - shard-apl:          [FAIL][113] ([i915#31]) -> [FAIL][114] ([i915#31] / [i915#95])
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8509/shard-apl3/igt@kms_setmode@basic.html
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/shard-apl6/igt@kms_setmode@basic.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109642]: https://bugs.freedesktop.org/show_bug.cgi?id=109642
  [fdo#110321]: https://bugs.freedesktop.org/show_bug.cgi?id=110321
  [fdo#110336]: https://bugs.freedesktop.org/show_bug.cgi?id=110336
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [i915#1188]: https://gitlab.freedesktop.org/drm/intel/issues/1188
  [i915#128]: https://gitlab.freedesktop.org/drm/intel/issues/128
  [i915#1319]: https://gitlab.freedesktop.org/drm/intel/issues/1319
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#198]: https://gitlab.freedesktop.org/drm/intel/issues/198
  [i915#265]: https://gitlab.freedesktop.org/drm/intel/issues/265
  [i915#31]: https://gitlab.freedesktop.org/drm/intel/issues/31
  [i915#54]: https://gitlab.freedesktop.org/drm/intel/issues/54
  [i915#588]: https://gitlab.freedesktop.org/drm/intel/issues/588
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (11 -> 11)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * Linux: CI_DRM_8509 -> Patchwork_17724

  CI-20190529: 20190529
  CI_DRM_8509: ea6a2729d3d286137415319de4161042b0337e87 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5662: e79462659e0f45cd3f4f766f58cb792303c6bf9b @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_17724: b91e0003626c8e9af7d9bc5c6716d6dfb68a02c0 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17724/index.html

