All of lore.kernel.org
 help / color / mirror / Atom feed
* [CI 1/4] drm/i915/execlists: Suppress redundant preemption
@ 2019-03-01 17:08 Chris Wilson
  2019-03-01 17:08 ` [CI 2/4] drm/i915: Keep timeline HWSP allocated until idle across the system Chris Wilson
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Chris Wilson @ 2019-03-01 17:08 UTC (permalink / raw)
  To: intel-gfx

On unwinding the active request we give it a small (limited to internal
priority levels) boost to prevent it from being gazumped a second time.
However, this means that it can be promoted to above the request that
triggered the preemption request, causing a preempt-to-idle cycle for no
change. We can avoid this if we take the boost into account when
checking if the preemption request is valid.

v2: After preemption the active request will be after the preemptee if
they end up with equal priority.

v3: Tvrtko pointed out that this, the existing logic, makes
I915_PRIORITY_WAIT non-preemptible. Document this interesting quirk!

v4: Prove Tvrtko was right about WAIT being non-preemptible and test it.
v5: Except not all priorities were made equal, and the WAIT not preempting
is only if we start off as !NEWCLIENT.

v6: More commentary after coming to an understanding about what I had
forgotten to say.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 44 +++++++++++++++++++++++++++++---
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4f2187aa44e4..3fd0c45a2920 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -164,6 +164,8 @@
 #define WA_TAIL_DWORDS 2
 #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
 
+#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT)
+
 static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 					    struct intel_engine_cs *engine,
 					    struct intel_context *ce);
@@ -190,8 +192,30 @@ static inline int rq_prio(const struct i915_request *rq)
 
 static int effective_prio(const struct i915_request *rq)
 {
+	int prio = rq_prio(rq);
+
+	/*
+	 * On unwinding the active request, we give it a priority bump
+	 * equivalent to a freshly submitted request. This protects it from
+	 * being gazumped again, but it would be preferable if we didn't
+	 * let it be gazumped in the first place!
+	 *
+	 * See __unwind_incomplete_requests()
+	 */
+	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(rq)) {
+		/*
+		 * After preemption, we insert the active request at the
+		 * end of the new priority level. This means that we will be
+		 * _lower_ priority than the preemptee all things equal (and
+		 * so the preemption is valid), so adjust our comparison
+		 * accordingly.
+		 */
+		prio |= ACTIVE_PRIORITY;
+		prio--;
+	}
+
 	/* Restrict mere WAIT boosts from triggering preemption */
-	return rq_prio(rq) | __NO_PREEMPTION;
+	return prio | __NO_PREEMPTION;
 }
 
 static int queue_prio(const struct intel_engine_execlists *execlists)
@@ -359,7 +383,7 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 {
 	struct i915_request *rq, *rn, *active = NULL;
 	struct list_head *uninitialized_var(pl);
-	int prio = I915_PRIORITY_INVALID | I915_PRIORITY_NEWCLIENT;
+	int prio = I915_PRIORITY_INVALID | ACTIVE_PRIORITY;
 
 	lockdep_assert_held(&engine->timeline.lock);
 
@@ -390,9 +414,21 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 	 * The active request is now effectively the start of a new client
 	 * stream, so give it the equivalent small priority bump to prevent
 	 * it being gazumped a second time by another peer.
+	 *
+	 * Note we have to be careful not to apply a priority boost to a request
+	 * still spinning on its semaphores. If the request hasn't started, that
+	 * means it is still waiting for its dependencies to be signaled, and
+	 * if we apply a priority boost to this request, we will boost it past
+	 * its signalers and so break PI.
+	 *
+	 * One consequence of this preemption boost is that we may jump
+	 * over lesser priorities (such as I915_PRIORITY_WAIT), effectively
+	 * making those priorities non-preemptible. They will be moved forward
+	 * in the priority queue, but they will not gain immediate access to
+	 * the GPU.
 	 */
-	if (!(prio & I915_PRIORITY_NEWCLIENT)) {
-		prio |= I915_PRIORITY_NEWCLIENT;
+	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(active)) {
+		prio |= ACTIVE_PRIORITY;
 		active->sched.attr.priority = prio;
 		list_move_tail(&active->sched.link,
 			       i915_sched_lookup_priolist(engine, prio));
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [CI 2/4] drm/i915: Keep timeline HWSP allocated until idle across the system
  2019-03-01 17:08 [CI 1/4] drm/i915/execlists: Suppress redundant preemption Chris Wilson
@ 2019-03-01 17:08 ` Chris Wilson
  2019-03-01 17:09 ` [CI 3/4] drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+ Chris Wilson
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2019-03-01 17:08 UTC (permalink / raw)
  To: intel-gfx

In preparation for enabling HW semaphores, we need to keep in flight
timeline HWSP alive until its use across entire system has completed,
as any other timeline active on the GPU may still refer back to the
already retired timeline. We both have to delay recycling available
cachelines and unpinning old HWSP until the next idle point.

An easy option would be to simply keep all used HWSP until the system as
a whole was idle, i.e. we could release them all at once on parking.
However, on a busy system, we may never see a global idle point,
essentially meaning the resource will be leaked until we are forced to
do a GC pass. We already employ a fine-grained idle detection mechanism
for vma, which we can reuse here so that each cacheline can be freed
immediately after the last request using it is retired.

v3: Keep track of the activity of each cacheline.
v4: cacheline_free() on canceling the seqno tracking
v5: Finally with a testcase to exercise wraparound
v6: Pack cacheline into empty bits of page-aligned vaddr
v7: Use i915_utils to hide the pointer casting around bit manipulation

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c           |  31 +-
 drivers/gpu/drm/i915/i915_request.h           |  11 +
 drivers/gpu/drm/i915/i915_timeline.c          | 293 ++++++++++++++++--
 drivers/gpu/drm/i915/i915_timeline.h          |  11 +-
 .../gpu/drm/i915/selftests/i915_timeline.c    | 113 +++++++
 5 files changed, 420 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 719d1a5ab082..d354967d6ae8 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -325,11 +325,6 @@ void i915_request_retire_upto(struct i915_request *rq)
 	} while (tmp != rq);
 }
 
-static u32 timeline_get_seqno(struct i915_timeline *tl)
-{
-	return tl->seqno += 1 + tl->has_initial_breadcrumb;
-}
-
 static void move_to_timeline(struct i915_request *request,
 			     struct i915_timeline *timeline)
 {
@@ -532,8 +527,10 @@ struct i915_request *
 i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 {
 	struct drm_i915_private *i915 = engine->i915;
-	struct i915_request *rq;
 	struct intel_context *ce;
+	struct i915_timeline *tl;
+	struct i915_request *rq;
+	u32 seqno;
 	int ret;
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
@@ -610,24 +607,27 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 		}
 	}
 
-	rq->rcustate = get_state_synchronize_rcu();
-
 	INIT_LIST_HEAD(&rq->active_list);
+
+	tl = ce->ring->timeline;
+	ret = i915_timeline_get_seqno(tl, rq, &seqno);
+	if (ret)
+		goto err_free;
+
 	rq->i915 = i915;
 	rq->engine = engine;
 	rq->gem_context = ctx;
 	rq->hw_context = ce;
 	rq->ring = ce->ring;
-	rq->timeline = ce->ring->timeline;
+	rq->timeline = tl;
 	GEM_BUG_ON(rq->timeline == &engine->timeline);
-	rq->hwsp_seqno = rq->timeline->hwsp_seqno;
+	rq->hwsp_seqno = tl->hwsp_seqno;
+	rq->hwsp_cacheline = tl->hwsp_cacheline;
+	rq->rcustate = get_state_synchronize_rcu(); /* acts as smp_mb() */
 
 	spin_lock_init(&rq->lock);
-	dma_fence_init(&rq->fence,
-		       &i915_fence_ops,
-		       &rq->lock,
-		       rq->timeline->fence_context,
-		       timeline_get_seqno(rq->timeline));
+	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock,
+		       tl->fence_context, seqno);
 
 	/* We bump the ref for the fence chain */
 	i915_sw_fence_init(&i915_request_get(rq)->submit, submit_notify);
@@ -687,6 +687,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	GEM_BUG_ON(!list_empty(&rq->sched.signalers_list));
 	GEM_BUG_ON(!list_empty(&rq->sched.waiters_list));
 
+err_free:
 	kmem_cache_free(global.slab_requests, rq);
 err_unreserve:
 	mutex_unlock(&ce->ring->timeline->mutex);
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index be3ded6bcf56..09eaad06d2c6 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -38,6 +38,7 @@ struct drm_file;
 struct drm_i915_gem_object;
 struct i915_request;
 struct i915_timeline;
+struct i915_timeline_cacheline;
 
 struct i915_capture_list {
 	struct i915_capture_list *next;
@@ -148,6 +149,16 @@ struct i915_request {
 	 */
 	const u32 *hwsp_seqno;
 
+	/*
+	 * If we need to access the timeline's seqno for this request in
+	 * another request, we need to keep a read reference to this associated
+	 * cacheline, so that we do not free and recycle it before the foreign
+	 * observers have completed. Hence, we keep a pointer to the cacheline
+	 * inside the timeline's HWSP vma, but it is only valid while this
+	 * request has not completed and guarded by the timeline mutex.
+	 */
+	struct i915_timeline_cacheline *hwsp_cacheline;
+
 	/** Position in the ring of the start of the request */
 	u32 head;
 
diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c
index 87a80558da28..8484ba6e51d1 100644
--- a/drivers/gpu/drm/i915/i915_timeline.c
+++ b/drivers/gpu/drm/i915/i915_timeline.c
@@ -6,19 +6,32 @@
 
 #include "i915_drv.h"
 
-#include "i915_timeline.h"
+#include "i915_active.h"
 #include "i915_syncmap.h"
+#include "i915_timeline.h"
+
+#define ptr_set_bit(ptr, bit) ((typeof(ptr))((unsigned long)(ptr) | BIT(bit)))
+#define ptr_test_bit(ptr, bit) ((unsigned long)(ptr) & BIT(bit))
 
 struct i915_timeline_hwsp {
-	struct i915_vma *vma;
+	struct i915_gt_timelines *gt;
 	struct list_head free_link;
+	struct i915_vma *vma;
 	u64 free_bitmap;
 };
 
-static inline struct i915_timeline_hwsp *
-i915_timeline_hwsp(const struct i915_timeline *tl)
+struct i915_timeline_cacheline {
+	struct i915_active active;
+	struct i915_timeline_hwsp *hwsp;
+	void *vaddr;
+#define CACHELINE_BITS 6
+#define CACHELINE_FREE CACHELINE_BITS
+};
+
+static inline struct drm_i915_private *
+hwsp_to_i915(struct i915_timeline_hwsp *hwsp)
 {
-	return tl->hwsp_ggtt->private;
+	return container_of(hwsp->gt, struct drm_i915_private, gt.timelines);
 }
 
 static struct i915_vma *__hwsp_alloc(struct drm_i915_private *i915)
@@ -71,6 +84,7 @@ hwsp_alloc(struct i915_timeline *timeline, unsigned int *cacheline)
 		vma->private = hwsp;
 		hwsp->vma = vma;
 		hwsp->free_bitmap = ~0ull;
+		hwsp->gt = gt;
 
 		spin_lock(&gt->hwsp_lock);
 		list_add(&hwsp->free_link, &gt->hwsp_free_list);
@@ -88,14 +102,9 @@ hwsp_alloc(struct i915_timeline *timeline, unsigned int *cacheline)
 	return hwsp->vma;
 }
 
-static void hwsp_free(struct i915_timeline *timeline)
+static void __idle_hwsp_free(struct i915_timeline_hwsp *hwsp, int cacheline)
 {
-	struct i915_gt_timelines *gt = &timeline->i915->gt.timelines;
-	struct i915_timeline_hwsp *hwsp;
-
-	hwsp = i915_timeline_hwsp(timeline);
-	if (!hwsp) /* leave global HWSP alone! */
-		return;
+	struct i915_gt_timelines *gt = hwsp->gt;
 
 	spin_lock(&gt->hwsp_lock);
 
@@ -103,7 +112,8 @@ static void hwsp_free(struct i915_timeline *timeline)
 	if (!hwsp->free_bitmap)
 		list_add_tail(&hwsp->free_link, &gt->hwsp_free_list);
 
-	hwsp->free_bitmap |= BIT_ULL(timeline->hwsp_offset / CACHELINE_BYTES);
+	GEM_BUG_ON(cacheline >= BITS_PER_TYPE(hwsp->free_bitmap));
+	hwsp->free_bitmap |= BIT_ULL(cacheline);
 
 	/* And if no one is left using it, give the page back to the system */
 	if (hwsp->free_bitmap == ~0ull) {
@@ -115,6 +125,76 @@ static void hwsp_free(struct i915_timeline *timeline)
 	spin_unlock(&gt->hwsp_lock);
 }
 
+static void __idle_cacheline_free(struct i915_timeline_cacheline *cl)
+{
+	GEM_BUG_ON(!i915_active_is_idle(&cl->active));
+
+	i915_gem_object_unpin_map(cl->hwsp->vma->obj);
+	i915_vma_put(cl->hwsp->vma);
+	__idle_hwsp_free(cl->hwsp, ptr_unmask_bits(cl->vaddr, CACHELINE_BITS));
+
+	i915_active_fini(&cl->active);
+	kfree(cl);
+}
+
+static void __cacheline_retire(struct i915_active *active)
+{
+	struct i915_timeline_cacheline *cl =
+		container_of(active, typeof(*cl), active);
+
+	i915_vma_unpin(cl->hwsp->vma);
+	if (ptr_test_bit(cl->vaddr, CACHELINE_FREE))
+		__idle_cacheline_free(cl);
+}
+
+static struct i915_timeline_cacheline *
+cacheline_alloc(struct i915_timeline_hwsp *hwsp, unsigned int cacheline)
+{
+	struct i915_timeline_cacheline *cl;
+	void *vaddr;
+
+	GEM_BUG_ON(cacheline >= BIT(CACHELINE_BITS));
+
+	cl = kmalloc(sizeof(*cl), GFP_KERNEL);
+	if (!cl)
+		return ERR_PTR(-ENOMEM);
+
+	vaddr = i915_gem_object_pin_map(hwsp->vma->obj, I915_MAP_WB);
+	if (IS_ERR(vaddr)) {
+		kfree(cl);
+		return ERR_CAST(vaddr);
+	}
+
+	i915_vma_get(hwsp->vma);
+	cl->hwsp = hwsp;
+	cl->vaddr = page_pack_bits(vaddr, cacheline);
+
+	i915_active_init(hwsp_to_i915(hwsp), &cl->active, __cacheline_retire);
+
+	return cl;
+}
+
+static void cacheline_acquire(struct i915_timeline_cacheline *cl)
+{
+	if (cl && i915_active_acquire(&cl->active))
+		__i915_vma_pin(cl->hwsp->vma);
+}
+
+static void cacheline_release(struct i915_timeline_cacheline *cl)
+{
+	if (cl)
+		i915_active_release(&cl->active);
+}
+
+static void cacheline_free(struct i915_timeline_cacheline *cl)
+{
+	GEM_BUG_ON(ptr_test_bit(cl->vaddr, CACHELINE_FREE));
+	cl->vaddr = ptr_set_bit(cl->vaddr, CACHELINE_FREE);
+
+	if (i915_active_is_idle(&cl->active))
+		__idle_cacheline_free(cl);
+}
+
 int i915_timeline_init(struct drm_i915_private *i915,
 		       struct i915_timeline *timeline,
 		       const char *name,
@@ -136,29 +216,40 @@ int i915_timeline_init(struct drm_i915_private *i915,
 	timeline->name = name;
 	timeline->pin_count = 0;
 	timeline->has_initial_breadcrumb = !hwsp;
+	timeline->hwsp_cacheline = NULL;
 
-	timeline->hwsp_offset = I915_GEM_HWS_SEQNO_ADDR;
 	if (!hwsp) {
+		struct i915_timeline_cacheline *cl;
 		unsigned int cacheline;
 
 		hwsp = hwsp_alloc(timeline, &cacheline);
 		if (IS_ERR(hwsp))
 			return PTR_ERR(hwsp);
 
+		cl = cacheline_alloc(hwsp->private, cacheline);
+		if (IS_ERR(cl)) {
+			__idle_hwsp_free(hwsp->private, cacheline);
+			return PTR_ERR(cl);
+		}
+
+		timeline->hwsp_cacheline = cl;
 		timeline->hwsp_offset = cacheline * CACHELINE_BYTES;
-	}
-	timeline->hwsp_ggtt = i915_vma_get(hwsp);
 
-	vaddr = i915_gem_object_pin_map(hwsp->obj, I915_MAP_WB);
-	if (IS_ERR(vaddr)) {
-		hwsp_free(timeline);
-		i915_vma_put(hwsp);
-		return PTR_ERR(vaddr);
+		vaddr = page_mask_bits(cl->vaddr);
+	} else {
+		timeline->hwsp_offset = I915_GEM_HWS_SEQNO_ADDR;
+
+		vaddr = i915_gem_object_pin_map(hwsp->obj, I915_MAP_WB);
+		if (IS_ERR(vaddr))
+			return PTR_ERR(vaddr);
 	}
 
 	timeline->hwsp_seqno =
 		memset(vaddr + timeline->hwsp_offset, 0, CACHELINE_BYTES);
 
+	timeline->hwsp_ggtt = i915_vma_get(hwsp);
+	GEM_BUG_ON(timeline->hwsp_offset >= hwsp->size);
+
 	timeline->fence_context = dma_fence_context_alloc(1);
 
 	spin_lock_init(&timeline->lock);
@@ -240,9 +331,12 @@ void i915_timeline_fini(struct i915_timeline *timeline)
 	GEM_BUG_ON(i915_active_request_isset(&timeline->barrier));
 
 	i915_syncmap_free(&timeline->sync);
-	hwsp_free(timeline);
 
-	i915_gem_object_unpin_map(timeline->hwsp_ggtt->obj);
+	if (timeline->hwsp_cacheline)
+		cacheline_free(timeline->hwsp_cacheline);
+	else
+		i915_gem_object_unpin_map(timeline->hwsp_ggtt->obj);
+
 	i915_vma_put(timeline->hwsp_ggtt);
 }
 
@@ -285,6 +379,7 @@ int i915_timeline_pin(struct i915_timeline *tl)
 		i915_ggtt_offset(tl->hwsp_ggtt) +
 		offset_in_page(tl->hwsp_offset);
 
+	cacheline_acquire(tl->hwsp_cacheline);
 	timeline_add_to_active(tl);
 
 	return 0;
@@ -294,6 +389,157 @@ int i915_timeline_pin(struct i915_timeline *tl)
 	return err;
 }
 
+static u32 timeline_advance(struct i915_timeline *tl)
+{
+	GEM_BUG_ON(!tl->pin_count);
+	GEM_BUG_ON(tl->seqno & tl->has_initial_breadcrumb);
+
+	return tl->seqno += 1 + tl->has_initial_breadcrumb;
+}
+
+static void timeline_rollback(struct i915_timeline *tl)
+{
+	tl->seqno -= 1 + tl->has_initial_breadcrumb;
+}
+
+static noinline int
+__i915_timeline_get_seqno(struct i915_timeline *tl,
+			  struct i915_request *rq,
+			  u32 *seqno)
+{
+	struct i915_timeline_cacheline *cl;
+	unsigned int cacheline;
+	struct i915_vma *vma;
+	void *vaddr;
+	int err;
+
+	/*
+	 * If there is an outstanding GPU reference to this cacheline,
+	 * such as it being sampled by a HW semaphore on another timeline,
+	 * we cannot wraparound our seqno value (the HW semaphore does
+	 * a strict greater-than-or-equals compare, not i915_seqno_passed).
+	 * So if the cacheline is still busy, we must detach ourselves
+	 * from it and leave it inflight alongside its users.
+	 *
+	 * However, if nobody is watching and we can guarantee that nobody
+	 * will, we could simply reuse the same cacheline.
+	 *
+	 * if (i915_active_request_is_signaled(&tl->last_request) &&
+	 *     i915_active_is_signaled(&tl->hwsp_cacheline->active))
+	 *	return 0;
+	 *
+	 * That seems unlikely for a busy timeline that needed to wrap in
+	 * the first place, so just replace the cacheline.
+	 */
+
+	vma = hwsp_alloc(tl, &cacheline);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err_rollback;
+	}
+
+	err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL | PIN_HIGH);
+	if (err) {
+		__idle_hwsp_free(vma->private, cacheline);
+		goto err_rollback;
+	}
+
+	cl = cacheline_alloc(vma->private, cacheline);
+	if (IS_ERR(cl)) {
+		err = PTR_ERR(cl);
+		__idle_hwsp_free(vma->private, cacheline);
+		goto err_unpin;
+	}
+	GEM_BUG_ON(cl->hwsp->vma != vma);
+
+	/*
+	 * Attach the old cacheline to the current request, so that we only
+	 * free it after the current request is retired, which ensures that
+	 * all writes into the cacheline from previous requests are complete.
+	 */
+	err = i915_active_ref(&tl->hwsp_cacheline->active,
+			      tl->fence_context, rq);
+	if (err)
+		goto err_cacheline;
+
+	cacheline_release(tl->hwsp_cacheline); /* ownership now xfered to rq */
+	cacheline_free(tl->hwsp_cacheline);
+
+	i915_vma_unpin(tl->hwsp_ggtt); /* binding kept alive by old cacheline */
+	i915_vma_put(tl->hwsp_ggtt);
+
+	tl->hwsp_ggtt = i915_vma_get(vma);
+
+	vaddr = page_mask_bits(cl->vaddr);
+	tl->hwsp_offset = cacheline * CACHELINE_BYTES;
+	tl->hwsp_seqno =
+		memset(vaddr + tl->hwsp_offset, 0, CACHELINE_BYTES);
+
+	tl->hwsp_offset += i915_ggtt_offset(vma);
+
+	cacheline_acquire(cl);
+	tl->hwsp_cacheline = cl;
+
+	*seqno = timeline_advance(tl);
+	GEM_BUG_ON(i915_seqno_passed(*tl->hwsp_seqno, *seqno));
+	return 0;
+
+err_cacheline:
+	cacheline_free(cl);
+err_unpin:
+	i915_vma_unpin(vma);
+err_rollback:
+	timeline_rollback(tl);
+	return err;
+}
+
+int i915_timeline_get_seqno(struct i915_timeline *tl,
+			    struct i915_request *rq,
+			    u32 *seqno)
+{
+	*seqno = timeline_advance(tl);
+
+	/* Replace the HWSP on wraparound for HW semaphores */
+	if (unlikely(!*seqno && tl->hwsp_cacheline))
+		return __i915_timeline_get_seqno(tl, rq, seqno);
+
+	return 0;
+}
+
+static int cacheline_ref(struct i915_timeline_cacheline *cl,
+			 struct i915_request *rq)
+{
+	return i915_active_ref(&cl->active, rq->fence.context, rq);
+}
+
+int i915_timeline_read_hwsp(struct i915_request *from,
+			    struct i915_request *to,
+			    u32 *hwsp)
+{
+	struct i915_timeline_cacheline *cl = from->hwsp_cacheline;
+	struct i915_timeline *tl = from->timeline;
+	int err;
+
+	GEM_BUG_ON(to->timeline == tl);
+
+	mutex_lock_nested(&tl->mutex, SINGLE_DEPTH_NESTING);
+	err = i915_request_completed(from);
+	if (!err)
+		err = cacheline_ref(cl, to);
+	if (!err) {
+		if (likely(cl == tl->hwsp_cacheline)) {
+			*hwsp = tl->hwsp_offset;
+		} else { /* across a seqno wrap, recover the original offset */
+			*hwsp = i915_ggtt_offset(cl->hwsp->vma) +
+				ptr_unmask_bits(cl->vaddr, CACHELINE_BITS) *
+				CACHELINE_BYTES;
+		}
+	}
+	mutex_unlock(&tl->mutex);
+
+	return err;
+}
+
 void i915_timeline_unpin(struct i915_timeline *tl)
 {
 	GEM_BUG_ON(!tl->pin_count);
@@ -301,6 +547,7 @@ void i915_timeline_unpin(struct i915_timeline *tl)
 		return;
 
 	timeline_remove_from_active(tl);
+	cacheline_release(tl->hwsp_cacheline);
 
 	/*
 	 * Since this timeline is idle, all bariers upon which we were waiting
diff --git a/drivers/gpu/drm/i915/i915_timeline.h b/drivers/gpu/drm/i915/i915_timeline.h
index 36c3849f7108..60b1dfad93ed 100644
--- a/drivers/gpu/drm/i915/i915_timeline.h
+++ b/drivers/gpu/drm/i915/i915_timeline.h
@@ -34,7 +34,7 @@
 #include "i915_utils.h"
 
 struct i915_vma;
-struct i915_timeline_hwsp;
+struct i915_timeline_cacheline;
 
 struct i915_timeline {
 	u64 fence_context;
@@ -51,6 +51,8 @@ struct i915_timeline {
 	struct i915_vma *hwsp_ggtt;
 	u32 hwsp_offset;
 
+	struct i915_timeline_cacheline *hwsp_cacheline;
+
 	bool has_initial_breadcrumb;
 
 	/**
@@ -162,8 +164,15 @@ static inline bool i915_timeline_sync_is_later(struct i915_timeline *tl,
 }
 
 int i915_timeline_pin(struct i915_timeline *tl);
+int i915_timeline_get_seqno(struct i915_timeline *tl,
+			    struct i915_request *rq,
+			    u32 *seqno);
 void i915_timeline_unpin(struct i915_timeline *tl);
 
+int i915_timeline_read_hwsp(struct i915_request *from,
+			    struct i915_request *until,
+			    u32 *hwsp_offset);
+
 void i915_timelines_init(struct drm_i915_private *i915);
 void i915_timelines_park(struct drm_i915_private *i915);
 void i915_timelines_fini(struct drm_i915_private *i915);
diff --git a/drivers/gpu/drm/i915/selftests/i915_timeline.c b/drivers/gpu/drm/i915/selftests/i915_timeline.c
index 12ea69b1a1e5..844701759ffc 100644
--- a/drivers/gpu/drm/i915/selftests/i915_timeline.c
+++ b/drivers/gpu/drm/i915/selftests/i915_timeline.c
@@ -641,6 +641,118 @@ static int live_hwsp_alternate(void *arg)
 #undef NUM_TIMELINES
 }
 
+static int live_hwsp_wrap(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *engine;
+	struct i915_timeline *tl;
+	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
+	int err = 0;
+
+	/*
+	 * Across a seqno wrap, we need to keep the old cacheline alive for
+	 * foreign GPU references.
+	 */
+
+	mutex_lock(&i915->drm.struct_mutex);
+	wakeref = intel_runtime_pm_get(i915);
+
+	tl = i915_timeline_create(i915, __func__, NULL);
+	if (IS_ERR(tl)) {
+		err = PTR_ERR(tl);
+		goto out_rpm;
+	}
+	if (!tl->has_initial_breadcrumb || !tl->hwsp_cacheline)
+		goto out_free;
+
+	err = i915_timeline_pin(tl);
+	if (err)
+		goto out_free;
+
+	for_each_engine(engine, i915, id) {
+		const u32 *hwsp_seqno[2];
+		struct i915_request *rq;
+		u32 seqno[2];
+
+		if (!intel_engine_can_store_dword(engine))
+			continue;
+
+		rq = i915_request_alloc(engine, i915->kernel_context);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto out;
+		}
+
+		tl->seqno = -4u;
+
+		err = i915_timeline_get_seqno(tl, rq, &seqno[0]);
+		if (err) {
+			i915_request_add(rq);
+			goto out;
+		}
+		pr_debug("seqno[0]:%08x, hwsp_offset:%08x\n",
+			 seqno[0], tl->hwsp_offset);
+
+		err = emit_ggtt_store_dw(rq, tl->hwsp_offset, seqno[0]);
+		if (err) {
+			i915_request_add(rq);
+			goto out;
+		}
+		hwsp_seqno[0] = tl->hwsp_seqno;
+
+		err = i915_timeline_get_seqno(tl, rq, &seqno[1]);
+		if (err) {
+			i915_request_add(rq);
+			goto out;
+		}
+		pr_debug("seqno[1]:%08x, hwsp_offset:%08x\n",
+			 seqno[1], tl->hwsp_offset);
+
+		err = emit_ggtt_store_dw(rq, tl->hwsp_offset, seqno[1]);
+		if (err) {
+			i915_request_add(rq);
+			goto out;
+		}
+		hwsp_seqno[1] = tl->hwsp_seqno;
+
+		/* With wrap should come a new hwsp */
+		GEM_BUG_ON(seqno[1] >= seqno[0]);
+		GEM_BUG_ON(hwsp_seqno[0] == hwsp_seqno[1]);
+
+		i915_request_add(rq);
+
+		if (i915_request_wait(rq, I915_WAIT_LOCKED, HZ / 5) < 0) {
+			pr_err("Wait for timeline writes timed out!\n");
+			err = -EIO;
+			goto out;
+		}
+
+		if (*hwsp_seqno[0] != seqno[0] || *hwsp_seqno[1] != seqno[1]) {
+			pr_err("Bad timeline values: found (%x, %x), expected (%x, %x)\n",
+			       *hwsp_seqno[0], *hwsp_seqno[1],
+			       seqno[0], seqno[1]);
+			err = -EINVAL;
+			goto out;
+		}
+
+		i915_retire_requests(i915); /* recycle HWSP */
+	}
+
+out:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+
+	i915_timeline_unpin(tl);
+out_free:
+	i915_timeline_put(tl);
+out_rpm:
+	intel_runtime_pm_put(i915, wakeref);
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	return err;
+}
+
 static int live_hwsp_recycle(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -723,6 +835,7 @@ int i915_timeline_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_hwsp_recycle),
 		SUBTEST(live_hwsp_engine),
 		SUBTEST(live_hwsp_alternate),
+		SUBTEST(live_hwsp_wrap),
 	};
 
 	return i915_subtests(tests, i915);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [CI 3/4] drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+
  2019-03-01 17:08 [CI 1/4] drm/i915/execlists: Suppress redundant preemption Chris Wilson
  2019-03-01 17:08 ` [CI 2/4] drm/i915: Keep timeline HWSP allocated until idle across the system Chris Wilson
@ 2019-03-01 17:09 ` Chris Wilson
  2019-03-01 17:09 ` [CI 4/4] drm/i915: Prioritise non-busywait semaphore workloads Chris Wilson
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2019-03-01 17:09 UTC (permalink / raw)
  To: intel-gfx

Having introduced per-context seqno, we now have a means to identity
progress across the system without feel of rollback as befell the
global_seqno. That is we can program a MI_SEMAPHORE_WAIT operation in
advance of submission safe in the knowledge that our target seqno and
address is stable.

However, since we are telling the GPU to busy-spin on the target address
until it matches the signaling seqno, we only want to do so when we are
sure that busy-spin will be completed quickly. To achieve this we only
submit the request to HW once the signaler is itself executing (modulo
preemption causing us to wait longer), and we only do so for default and
above priority requests (so that idle priority tasks never themselves
hog the GPU waiting for others).

As might be reasonably expected, HW semaphores excel in inter-engine
synchronisation microbenchmarks (where the 3x reduced latency / increased
throughput more than offset the power cost of spinning on a second ring)
and have significant improvement (can be up to ~10%, most see no change)
for single clients that utilize multiple engines (typically media players
and transcoders), without regressing multiple clients that can saturate
the system or changing the power envelope dramatically.

v3: Drop the older NEQ branch, now we pin the signaler's HWSP anyway.
v4: Tell the world and include it as part of scheduler caps.

Testcase: igt/gem_exec_whisper
Testcase: igt/benchmarks/gem_wsim
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c           |   2 +-
 drivers/gpu/drm/i915/i915_request.c       | 138 +++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_request.h       |  26 +++-
 drivers/gpu/drm/i915/i915_sw_fence.c      |   4 +-
 drivers/gpu/drm/i915/i915_sw_fence.h      |   3 +
 drivers/gpu/drm/i915/intel_engine_cs.c    |   1 +
 drivers/gpu/drm/i915/intel_gpu_commands.h |   9 +-
 drivers/gpu/drm/i915/intel_lrc.c          |   1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h   |   7 ++
 include/uapi/drm/i915_drm.h               |   1 +
 10 files changed, 181 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c6354f6cdbdb..c08abdef5eb6 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -351,7 +351,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		value = min_t(int, INTEL_PPGTT(dev_priv), I915_GEM_PPGTT_FULL);
 		break;
 	case I915_PARAM_HAS_SEMAPHORES:
-		value = 0;
+		value = !!(dev_priv->caps.scheduler & I915_SCHEDULER_CAP_SEMAPHORES);
 		break;
 	case I915_PARAM_HAS_SECURE_BATCHES:
 		value = capable(CAP_SYS_ADMIN);
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index d354967d6ae8..59e30b8c4ee9 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -22,8 +22,9 @@
  *
  */
 
-#include <linux/prefetch.h>
 #include <linux/dma-fence-array.h>
+#include <linux/irq_work.h>
+#include <linux/prefetch.h>
 #include <linux/sched.h>
 #include <linux/sched/clock.h>
 #include <linux/sched/signal.h>
@@ -32,9 +33,16 @@
 #include "i915_active.h"
 #include "i915_reset.h"
 
+struct execute_cb {
+	struct list_head link;
+	struct irq_work work;
+	struct i915_sw_fence *fence;
+};
+
 static struct i915_global_request {
 	struct kmem_cache *slab_requests;
 	struct kmem_cache *slab_dependencies;
+	struct kmem_cache *slab_execute_cbs;
 } global;
 
 static const char *i915_fence_get_driver_name(struct dma_fence *fence)
@@ -325,6 +333,69 @@ void i915_request_retire_upto(struct i915_request *rq)
 	} while (tmp != rq);
 }
 
+static void irq_execute_cb(struct irq_work *wrk)
+{
+	struct execute_cb *cb = container_of(wrk, typeof(*cb), work);
+
+	i915_sw_fence_complete(cb->fence);
+	kmem_cache_free(global.slab_execute_cbs, cb);
+}
+
+static void __notify_execute_cb(struct i915_request *rq)
+{
+	struct execute_cb *cb;
+
+	lockdep_assert_held(&rq->lock);
+
+	if (list_empty(&rq->execute_cb))
+		return;
+
+	list_for_each_entry(cb, &rq->execute_cb, link)
+		irq_work_queue(&cb->work);
+
+	/*
+	 * XXX Rollback on __i915_request_unsubmit()
+	 *
+	 * In the future, perhaps when we have an active time-slicing scheduler,
+	 * it will be interesting to unsubmit parallel execution and remove
+	 * busywaits from the GPU until their master is restarted. This is
+	 * quite hairy, we have to carefully rollback the fence and do a
+	 * preempt-to-idle cycle on the target engine, all the while the
+	 * master execute_cb may refire.
+	 */
+	INIT_LIST_HEAD(&rq->execute_cb);
+}
+
+static int
+i915_request_await_execution(struct i915_request *rq,
+			     struct i915_request *signal,
+			     gfp_t gfp)
+{
+	struct execute_cb *cb;
+
+	if (i915_request_is_active(signal))
+		return 0;
+
+	cb = kmem_cache_alloc(global.slab_execute_cbs, gfp);
+	if (!cb)
+		return -ENOMEM;
+
+	cb->fence = &rq->submit;
+	i915_sw_fence_await(cb->fence);
+	init_irq_work(&cb->work, irq_execute_cb);
+
+	spin_lock_irq(&signal->lock);
+	if (i915_request_is_active(signal)) {
+		i915_sw_fence_complete(cb->fence);
+		kmem_cache_free(global.slab_execute_cbs, cb);
+	} else {
+		list_add_tail(&cb->link, &signal->execute_cb);
+	}
+	spin_unlock_irq(&signal->lock);
+
+	return 0;
+}
+
 static void move_to_timeline(struct i915_request *request,
 			     struct i915_timeline *timeline)
 {
@@ -361,6 +432,8 @@ void __i915_request_submit(struct i915_request *request)
 	    !i915_request_enable_breadcrumb(request))
 		intel_engine_queue_breadcrumbs(engine);
 
+	__notify_execute_cb(request);
+
 	spin_unlock(&request->lock);
 
 	engine->emit_fini_breadcrumb(request,
@@ -608,6 +681,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	}
 
 	INIT_LIST_HEAD(&rq->active_list);
+	INIT_LIST_HEAD(&rq->execute_cb);
 
 	tl = ce->ring->timeline;
 	ret = i915_timeline_get_seqno(tl, rq, &seqno);
@@ -696,6 +770,52 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	return ERR_PTR(ret);
 }
 
+static int
+emit_semaphore_wait(struct i915_request *to,
+		    struct i915_request *from,
+		    gfp_t gfp)
+{
+	u32 hwsp_offset;
+	u32 *cs;
+	int err;
+
+	GEM_BUG_ON(!from->timeline->has_initial_breadcrumb);
+	GEM_BUG_ON(INTEL_GEN(to->i915) < 8);
+
+	/* We need to pin the signaler's HWSP until we are finished reading. */
+	err = i915_timeline_read_hwsp(from, to, &hwsp_offset);
+	if (err)
+		return err;
+
+	/* Only submit our spinner after the signaler is running! */
+	err = i915_request_await_execution(to, from, gfp);
+	if (err)
+		return err;
+
+	cs = intel_ring_begin(to, 4);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	/*
+	 * Using greater-than-or-equal here means we have to worry
+	 * about seqno wraparound. To side step that issue, we swap
+	 * the timeline HWSP upon wrapping, so that everyone listening
+	 * for the old (pre-wrap) values do not see the much smaller
+	 * (post-wrap) values than they were expecting (and so wait
+	 * forever).
+	 */
+	*cs++ = MI_SEMAPHORE_WAIT |
+		MI_SEMAPHORE_GLOBAL_GTT |
+		MI_SEMAPHORE_POLL |
+		MI_SEMAPHORE_SAD_GTE_SDD;
+	*cs++ = from->fence.seqno;
+	*cs++ = hwsp_offset;
+	*cs++ = 0;
+
+	intel_ring_advance(to, cs);
+	return 0;
+}
+
 static int
 i915_request_await_request(struct i915_request *to, struct i915_request *from)
 {
@@ -717,6 +837,9 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
 		ret = i915_sw_fence_await_sw_fence_gfp(&to->submit,
 						       &from->submit,
 						       I915_FENCE_GFP);
+	} else if (intel_engine_has_semaphores(to->engine) &&
+		   to->gem_context->sched.priority >= I915_PRIORITY_NORMAL) {
+		ret = emit_semaphore_wait(to, from, I915_FENCE_GFP);
 	} else {
 		ret = i915_sw_fence_await_dma_fence(&to->submit,
 						    &from->fence, 0,
@@ -1208,14 +1331,23 @@ int __init i915_global_request_init(void)
 	if (!global.slab_requests)
 		return -ENOMEM;
 
+	global.slab_execute_cbs = KMEM_CACHE(execute_cb,
+					     SLAB_HWCACHE_ALIGN |
+					     SLAB_RECLAIM_ACCOUNT |
+					     SLAB_TYPESAFE_BY_RCU);
+	if (!global.slab_execute_cbs)
+		goto err_requests;
+
 	global.slab_dependencies = KMEM_CACHE(i915_dependency,
 					      SLAB_HWCACHE_ALIGN |
 					      SLAB_RECLAIM_ACCOUNT);
 	if (!global.slab_dependencies)
-		goto err_requests;
+		goto err_execute_cbs;
 
 	return 0;
 
+err_execute_cbs:
+	kmem_cache_destroy(global.slab_execute_cbs);
 err_requests:
 	kmem_cache_destroy(global.slab_requests);
 	return -ENOMEM;
@@ -1224,11 +1356,13 @@ int __init i915_global_request_init(void)
 void i915_global_request_shrink(void)
 {
 	kmem_cache_shrink(global.slab_dependencies);
+	kmem_cache_shrink(global.slab_execute_cbs);
 	kmem_cache_shrink(global.slab_requests);
 }
 
 void i915_global_request_exit(void)
 {
 	kmem_cache_destroy(global.slab_dependencies);
+	kmem_cache_destroy(global.slab_execute_cbs);
 	kmem_cache_destroy(global.slab_requests);
 }
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 09eaad06d2c6..da2a273161a9 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -129,6 +129,7 @@ struct i915_request {
 	 */
 	struct i915_sw_fence submit;
 	wait_queue_entry_t submitq;
+	struct list_head execute_cb;
 
 	/*
 	 * A list of everyone we wait upon, and everyone who waits upon us.
@@ -343,10 +344,27 @@ static inline bool __i915_request_has_started(const struct i915_request *rq)
  * i915_request_started - check if the request has begun being executed
  * @rq: the request
  *
- * Returns true if the request has been submitted to hardware, and the hardware
- * has advanced passed the end of the previous request and so should be either
- * currently processing the request (though it may be preempted and so
- * not necessarily the next request to complete) or have completed the request.
+ * If the timeline is not using initial breadcrumbs, a request is
+ * considered started if the previous request on its timeline (i.e.
+ * context) has been signaled.
+ *
+ * If the timeline is using semaphores, it will also be emitting an
+ * "initial breadcrumb" after the semaphores are complete and just before
+ * it being executing the user payload. A request can therefore be active
+ * on the HW and not yet started as it is still busywaiting on its
+ * dependencies (via HW semaphores).
+ *
+ * If the request has started, its dependencies will have been signaled
+ * (either by fences or by semaphores) and it will have begun processing
+ * the user payload.
+ *
+ * However, even if a request has started, it may have been preempted and
+ * so no longer active, or it may have already completed.
+ *
+ * See also i915_request_is_active().
+ *
+ * Returns true if the request has begun executing the user payload, or
+ * has completed:
  */
 static inline bool i915_request_started(const struct i915_request *rq)
 {
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index 7c58b049ecb5..8d1400d378d7 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -192,7 +192,7 @@ static void __i915_sw_fence_complete(struct i915_sw_fence *fence,
 	__i915_sw_fence_notify(fence, FENCE_FREE);
 }
 
-static void i915_sw_fence_complete(struct i915_sw_fence *fence)
+void i915_sw_fence_complete(struct i915_sw_fence *fence)
 {
 	debug_fence_assert(fence);
 
@@ -202,7 +202,7 @@ static void i915_sw_fence_complete(struct i915_sw_fence *fence)
 	__i915_sw_fence_complete(fence, NULL);
 }
 
-static void i915_sw_fence_await(struct i915_sw_fence *fence)
+void i915_sw_fence_await(struct i915_sw_fence *fence)
 {
 	debug_fence_assert(fence);
 	WARN_ON(atomic_inc_return(&fence->pending) <= 1);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h b/drivers/gpu/drm/i915/i915_sw_fence.h
index 0e055ea0179f..6dec9e1d1102 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -79,6 +79,9 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 				    unsigned long timeout,
 				    gfp_t gfp);
 
+void i915_sw_fence_await(struct i915_sw_fence *fence);
+void i915_sw_fence_complete(struct i915_sw_fence *fence);
+
 static inline bool i915_sw_fence_signaled(const struct i915_sw_fence *fence)
 {
 	return atomic_read(&fence->pending) <= 0;
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index df8f88142f1d..4fc7e2ac6278 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -616,6 +616,7 @@ void intel_engines_set_scheduler_caps(struct drm_i915_private *i915)
 	} map[] = {
 #define MAP(x, y) { ilog2(I915_ENGINE_HAS_##x), ilog2(I915_SCHEDULER_CAP_##y) }
 		MAP(PREEMPTION, PREEMPTION),
+		MAP(SEMAPHORES, SEMAPHORES),
 #undef MAP
 	};
 	struct intel_engine_cs *engine;
diff --git a/drivers/gpu/drm/i915/intel_gpu_commands.h b/drivers/gpu/drm/i915/intel_gpu_commands.h
index b96a31bc1080..a34ece53a771 100644
--- a/drivers/gpu/drm/i915/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/intel_gpu_commands.h
@@ -105,8 +105,13 @@
 #define MI_SEMAPHORE_SIGNAL	MI_INSTR(0x1b, 0) /* GEN8+ */
 #define   MI_SEMAPHORE_TARGET(engine)	((engine)<<15)
 #define MI_SEMAPHORE_WAIT	MI_INSTR(0x1c, 2) /* GEN8+ */
-#define   MI_SEMAPHORE_POLL		(1<<15)
-#define   MI_SEMAPHORE_SAD_GTE_SDD	(1<<12)
+#define   MI_SEMAPHORE_POLL		(1 << 15)
+#define   MI_SEMAPHORE_SAD_GT_SDD	(0 << 12)
+#define   MI_SEMAPHORE_SAD_GTE_SDD	(1 << 12)
+#define   MI_SEMAPHORE_SAD_LT_SDD	(2 << 12)
+#define   MI_SEMAPHORE_SAD_LTE_SDD	(3 << 12)
+#define   MI_SEMAPHORE_SAD_EQ_SDD	(4 << 12)
+#define   MI_SEMAPHORE_SAD_NEQ_SDD	(5 << 12)
 #define MI_STORE_DWORD_IMM	MI_INSTR(0x20, 1)
 #define MI_STORE_DWORD_IMM_GEN4	MI_INSTR(0x20, 2)
 #define   MI_MEM_VIRTUAL	(1 << 22) /* 945,g33,965 */
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3fd0c45a2920..6c9479acb433 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2330,6 +2330,7 @@ void intel_execlists_set_default_submission(struct intel_engine_cs *engine)
 	engine->park = NULL;
 	engine->unpark = NULL;
 
+	engine->flags |= I915_ENGINE_HAS_SEMAPHORES;
 	engine->flags |= I915_ENGINE_SUPPORTS_STATS;
 	if (engine->i915->preempt_context)
 		engine->flags |= I915_ENGINE_HAS_PREEMPTION;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index b8ec7e40a59b..2e7264119ec4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -499,6 +499,7 @@ struct intel_engine_cs {
 #define I915_ENGINE_NEEDS_CMD_PARSER BIT(0)
 #define I915_ENGINE_SUPPORTS_STATS   BIT(1)
 #define I915_ENGINE_HAS_PREEMPTION   BIT(2)
+#define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
 	unsigned int flags;
 
 	/*
@@ -576,6 +577,12 @@ intel_engine_has_preemption(const struct intel_engine_cs *engine)
 	return engine->flags & I915_ENGINE_HAS_PREEMPTION;
 }
 
+static inline bool
+intel_engine_has_semaphores(const struct intel_engine_cs *engine)
+{
+	return engine->flags & I915_ENGINE_HAS_SEMAPHORES;
+}
+
 void intel_engines_set_scheduler_caps(struct drm_i915_private *i915);
 
 static inline bool __execlists_need_preempt(int prio, int last)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 8304a7f1ec3f..b10eea3f6d24 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -479,6 +479,7 @@ typedef struct drm_i915_irq_wait {
 #define   I915_SCHEDULER_CAP_ENABLED	(1ul << 0)
 #define   I915_SCHEDULER_CAP_PRIORITY	(1ul << 1)
 #define   I915_SCHEDULER_CAP_PREEMPTION	(1ul << 2)
+#define   I915_SCHEDULER_CAP_SEMAPHORES	(1ul << 3)
 
 #define I915_PARAM_HUC_STATUS		 42
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [CI 4/4] drm/i915: Prioritise non-busywait semaphore workloads
  2019-03-01 17:08 [CI 1/4] drm/i915/execlists: Suppress redundant preemption Chris Wilson
  2019-03-01 17:08 ` [CI 2/4] drm/i915: Keep timeline HWSP allocated until idle across the system Chris Wilson
  2019-03-01 17:09 ` [CI 3/4] drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+ Chris Wilson
@ 2019-03-01 17:09 ` Chris Wilson
  2019-03-01 18:05 ` ✗ Fi.CI.SPARSE: warning for series starting with [CI,1/4] drm/i915/execlists: Suppress redundant preemption Patchwork
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2019-03-01 17:09 UTC (permalink / raw)
  To: intel-gfx

We don't want to busywait on the GPU if we have other work to do. If we
give non-busywaiting workloads higher (initial) priority than workloads
that require a busywait, we will prioritise work that is ready to run
immediately. We then also have to be careful that we don't give earlier
semaphores an accidental boost because later work doesn't wait on other
rings, hence we keep a history of semaphore usage of the dependency chain.

v2: Stop rolling the bits into a chain and just use a flag in case this
request or any of our dependencies use a semaphore. The rolling around
was contagious as Tvrtko was heard to fall off his chair.

Testcase: igt/gem_exec_schedule/semaphore
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c   | 16 ++++++++++++++++
 drivers/gpu/drm/i915/i915_scheduler.c |  6 ++++++
 drivers/gpu/drm/i915/i915_scheduler.h |  9 ++++++---
 drivers/gpu/drm/i915/intel_lrc.c      |  2 +-
 4 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 59e30b8c4ee9..bcf3c1a155e2 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -813,6 +813,7 @@ emit_semaphore_wait(struct i915_request *to,
 	*cs++ = 0;
 
 	intel_ring_advance(to, cs);
+	to->sched.flags |= I915_SCHED_HAS_SEMAPHORE;
 	return 0;
 }
 
@@ -1083,6 +1084,21 @@ void i915_request_add(struct i915_request *request)
 	if (engine->schedule) {
 		struct i915_sched_attr attr = request->gem_context->sched;
 
+		/*
+		 * Boost actual workloads past semaphores!
+		 *
+		 * With semaphores we spin on one engine waiting for another,
+		 * simply to reduce the latency of starting our work when
+		 * the signaler completes. However, if there is any other
+		 * work that we could be doing on this engine instead, that
+		 * is better utilisation and will reduce the overall duration
+		 * of the current work. To avoid PI boosting a semaphore
+		 * far in the distance past over useful work, we keep a history
+		 * of any semaphore use along our dependency chain.
+		 */
+		if (!(request->sched.flags & I915_SCHED_HAS_SEMAPHORE))
+			attr.priority |= I915_PRIORITY_NOSEMAPHORE;
+
 		/*
 		 * Boost priorities to new clients (new request flows).
 		 *
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 50018ad30233..8a64748a7912 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -39,6 +39,7 @@ void i915_sched_node_init(struct i915_sched_node *node)
 	INIT_LIST_HEAD(&node->waiters_list);
 	INIT_LIST_HEAD(&node->link);
 	node->attr.priority = I915_PRIORITY_INVALID;
+	node->flags = 0;
 }
 
 static struct i915_dependency *
@@ -69,6 +70,11 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
 		dep->signaler = signal;
 		dep->flags = flags;
 
+		/* Keep track of whether anyone on this chain has a semaphore */
+		if (signal->flags & I915_SCHED_HAS_SEMAPHORE &&
+		    !node_started(signal))
+			node->flags |=  I915_SCHED_HAS_SEMAPHORE;
+
 		ret = true;
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 7d4a49750d92..6ce450cf63fa 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -24,14 +24,15 @@ enum {
 	I915_PRIORITY_INVALID = INT_MIN
 };
 
-#define I915_USER_PRIORITY_SHIFT 2
+#define I915_USER_PRIORITY_SHIFT 3
 #define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT)
 
 #define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT)
 #define I915_PRIORITY_MASK (I915_PRIORITY_COUNT - 1)
 
-#define I915_PRIORITY_WAIT	((u8)BIT(0))
-#define I915_PRIORITY_NEWCLIENT	((u8)BIT(1))
+#define I915_PRIORITY_WAIT		((u8)BIT(0))
+#define I915_PRIORITY_NEWCLIENT		((u8)BIT(1))
+#define I915_PRIORITY_NOSEMAPHORE	((u8)BIT(2))
 
 #define __NO_PREEMPTION (I915_PRIORITY_WAIT)
 
@@ -74,6 +75,8 @@ struct i915_sched_node {
 	struct list_head waiters_list; /* those after us, they depend upon us */
 	struct list_head link;
 	struct i915_sched_attr attr;
+	unsigned int flags;
+#define I915_SCHED_HAS_SEMAPHORE	BIT(0)
 };
 
 struct i915_dependency {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6c9479acb433..578c8c98c718 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -164,7 +164,7 @@
 #define WA_TAIL_DWORDS 2
 #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
 
-#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT)
+#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT | I915_PRIORITY_NOSEMAPHORE)
 
 static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 					    struct intel_engine_cs *engine,
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* ✗ Fi.CI.SPARSE: warning for series starting with [CI,1/4] drm/i915/execlists: Suppress redundant preemption
  2019-03-01 17:08 [CI 1/4] drm/i915/execlists: Suppress redundant preemption Chris Wilson
                   ` (2 preceding siblings ...)
  2019-03-01 17:09 ` [CI 4/4] drm/i915: Prioritise non-busywait semaphore workloads Chris Wilson
@ 2019-03-01 18:05 ` Patchwork
  2019-03-01 18:23 ` ✓ Fi.CI.BAT: success " Patchwork
  2019-03-02  0:10 ` ✓ Fi.CI.IGT: " Patchwork
  5 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2019-03-01 18:05 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [CI,1/4] drm/i915/execlists: Suppress redundant preemption
URL   : https://patchwork.freedesktop.org/series/57434/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915/execlists: Suppress redundant preemption
Okay!

Commit: drm/i915: Keep timeline HWSP allocated until idle across the system
Okay!

Commit: drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+
-O:drivers/gpu/drm/i915/i915_drv.c:351:25: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_drv.c:351:25: warning: expression using sizeof(void)

Commit: drm/i915: Prioritise non-busywait semaphore workloads
Okay!

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [CI,1/4] drm/i915/execlists: Suppress redundant preemption
  2019-03-01 17:08 [CI 1/4] drm/i915/execlists: Suppress redundant preemption Chris Wilson
                   ` (3 preceding siblings ...)
  2019-03-01 18:05 ` ✗ Fi.CI.SPARSE: warning for series starting with [CI,1/4] drm/i915/execlists: Suppress redundant preemption Patchwork
@ 2019-03-01 18:23 ` Patchwork
  2019-03-02  0:10 ` ✓ Fi.CI.IGT: " Patchwork
  5 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2019-03-01 18:23 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [CI,1/4] drm/i915/execlists: Suppress redundant preemption
URL   : https://patchwork.freedesktop.org/series/57434/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_5679 -> Patchwork_12347
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/57434/revisions/1/mbox/

Known issues
------------

  Here are the changes found in Patchwork_12347 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_basic@readonly-bsd2:
    - fi-pnv-d510:        NOTRUN -> SKIP [fdo#109271] +76

  * igt@kms_busy@basic-flip-b:
    - fi-gdg-551:         PASS -> FAIL [fdo#103182]

  * igt@kms_busy@basic-flip-c:
    - fi-pnv-d510:        NOTRUN -> SKIP [fdo#109271] / [fdo#109278]

  
#### Possible fixes ####

  * igt@i915_pm_rpm@basic-pci-d3-state:
    - fi-hsw-4770:        SKIP [fdo#109271] -> PASS +4

  * igt@kms_busy@basic-flip-a:
    - fi-kbl-7567u:       SKIP [fdo#109271] / [fdo#109278] -> PASS +2

  
  [fdo#103182]: https://bugs.freedesktop.org/show_bug.cgi?id=103182
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278


Participating hosts (42 -> 39)
------------------------------

  Additional (1): fi-pnv-d510 
  Missing    (4): fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-hsw-4200u 


Build changes
-------------

    * Linux: CI_DRM_5679 -> Patchwork_12347

  CI_DRM_5679: e533152feeb49f1dc4e689a37c415a997e5d094b @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4866: 189956af183c245eb237b3be4fa22953ec93bbe0 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12347: 293b3c3d08c2529728aa7aeea139af325d6f945a @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

293b3c3d08c2 drm/i915: Prioritise non-busywait semaphore workloads
44a508029dd9 drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+
0ee004ae9f8d drm/i915: Keep timeline HWSP allocated until idle across the system
81a0b2f4e32f drm/i915/execlists: Suppress redundant preemption

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12347/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* ✓ Fi.CI.IGT: success for series starting with [CI,1/4] drm/i915/execlists: Suppress redundant preemption
  2019-03-01 17:08 [CI 1/4] drm/i915/execlists: Suppress redundant preemption Chris Wilson
                   ` (4 preceding siblings ...)
  2019-03-01 18:23 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2019-03-02  0:10 ` Patchwork
  5 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2019-03-02  0:10 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [CI,1/4] drm/i915/execlists: Suppress redundant preemption
URL   : https://patchwork.freedesktop.org/series/57434/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_5679_full -> Patchwork_12347_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Known issues
------------

  Here are the changes found in Patchwork_12347_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@debugfs_test@read_all_entries_display_off:
    - shard-skl:          PASS -> INCOMPLETE [fdo#104108]

  * igt@gem_ctx_isolation@rcs0-dirty-create:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109281] +3

  * igt@gem_ctx_param@invalid-param-set:
    - shard-skl:          NOTRUN -> FAIL [fdo#109674]

  * igt@gem_exec_parallel@bsd1:
    - shard-skl:          NOTRUN -> SKIP [fdo#109271] +194

  * igt@gem_exec_params@no-bsd:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109283]

  * igt@gem_exec_schedule@preempt-other-chain-bsd2:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109276] +8

  * igt@gem_mocs_settings@mocs-rc6-bsd1:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109276] / [fdo#109287]

  * igt@gem_mocs_settings@mocs-settings-vebox:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109287] +2

  * igt@gem_pwrite@huge-gtt-forwards:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109290]

  * igt@gem_stolen@stolen-pread:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109277] +2

  * igt@i915_missed_irq:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109503]

  * igt@i915_pm_rpm@basic-pci-d3-state:
    - shard-skl:          PASS -> INCOMPLETE [fdo#107807]

  * igt@i915_pm_rpm@gem-execbuf-stress:
    - shard-iclb:         PASS -> INCOMPLETE [fdo#108840]

  * igt@i915_pm_rpm@modeset-lpsp-stress-no-wait:
    - shard-iclb:         PASS -> DMESG-WARN [fdo#107724] / [fdo#108654]

  * igt@i915_pm_rpm@system-suspend-modeset:
    - shard-iclb:         PASS -> DMESG-WARN [fdo#107724] +2

  * igt@kms_busy@extended-modeset-hang-newfb-render-a:
    - shard-skl:          NOTRUN -> DMESG-WARN [fdo#107956] +3

  * igt@kms_busy@extended-modeset-hang-oldfb-render-f:
    - shard-skl:          NOTRUN -> SKIP [fdo#109271] / [fdo#109278] +16
    - shard-iclb:         NOTRUN -> SKIP [fdo#109278] +2

  * igt@kms_chamelium@vga-edid-read:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109284] +3

  * igt@kms_color@pipe-b-legacy-gamma:
    - shard-iclb:         NOTRUN -> FAIL [fdo#104782]

  * igt@kms_cursor_crc@cursor-128x42-onscreen:
    - shard-apl:          PASS -> FAIL [fdo#103232] +1

  * igt@kms_cursor_crc@cursor-512x512-dpms:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109279]

  * igt@kms_cursor_crc@cursor-64x21-onscreen:
    - shard-iclb:         NOTRUN -> FAIL [fdo#103232] +1

  * igt@kms_cursor_crc@cursor-64x64-suspend:
    - shard-apl:          PASS -> FAIL [fdo#103191] / [fdo#103232]

  * igt@kms_flip@2x-flip-vs-expired-vblank-interruptible:
    - shard-glk:          PASS -> FAIL [fdo#105363]

  * igt@kms_flip@2x-nonexisting-fb:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109274] +8

  * igt@kms_flip@flip-vs-expired-vblank:
    - shard-glk:          PASS -> FAIL [fdo#102887] / [fdo#105363]

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-pwrite:
    - shard-apl:          PASS -> FAIL [fdo#103167]

  * igt@kms_frontbuffer_tracking@fbc-stridechange:
    - shard-skl:          NOTRUN -> FAIL [fdo#105683]

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-spr-indfb-draw-mmap-cpu:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109280] +16

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-move:
    - shard-iclb:         PASS -> FAIL [fdo#103167] +4

  * igt@kms_plane@pixel-format-pipe-a-planes-source-clamping:
    - shard-skl:          NOTRUN -> DMESG-WARN [fdo#106885]

  * igt@kms_plane_alpha_blend@pipe-b-alpha-7efc:
    - shard-apl:          NOTRUN -> FAIL [fdo#108145]

  * igt@kms_plane_alpha_blend@pipe-b-alpha-transparant-fb:
    - shard-skl:          NOTRUN -> FAIL [fdo#108145] +4

  * igt@kms_plane_alpha_blend@pipe-c-alpha-7efc:
    - shard-skl:          NOTRUN -> FAIL [fdo#107815] / [fdo#108145] +1

  * igt@kms_plane_multiple@atomic-pipe-b-tiling-x:
    - shard-apl:          PASS -> FAIL [fdo#103166] +2

  * igt@kms_plane_multiple@atomic-pipe-b-tiling-yf:
    - shard-glk:          PASS -> FAIL [fdo#103166]

  * igt@kms_plane_multiple@atomic-pipe-c-tiling-none:
    - shard-iclb:         PASS -> FAIL [fdo#103166] +1

  * igt@kms_plane_multiple@atomic-pipe-c-tiling-x:
    - shard-apl:          NOTRUN -> FAIL [fdo#103166]

  * igt@kms_plane_scaling@pipe-a-scaler-with-pixel-format:
    - shard-iclb:         NOTRUN -> DMESG-WARN [fdo#107724] +1

  * igt@kms_psr@primary_mmap_cpu:
    - shard-apl:          NOTRUN -> SKIP [fdo#109271] +9

  * igt@kms_psr@psr2_dpms:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109441]

  * igt@kms_rotation_crc@multiplane-rotation-cropping-top:
    - shard-kbl:          PASS -> FAIL [fdo#109016]

  * igt@prime_nv_test@i915_nv_sharing:
    - shard-iclb:         NOTRUN -> SKIP [fdo#109291] +2

  
#### Possible fixes ####

  * igt@gem_busy@extended-semaphore-blt:
    - shard-iclb:         SKIP [fdo#109275] -> PASS +2
    - shard-kbl:          SKIP [fdo#109271] -> PASS +5
    - shard-glk:          SKIP [fdo#109271] -> PASS +3

  * igt@gem_busy@extended-semaphore-vebox:
    - shard-apl:          SKIP [fdo#109271] -> PASS +3
    - shard-skl:          SKIP [fdo#109271] -> PASS +2

  * igt@gem_fence_thrash@bo-copy:
    - shard-apl:          INCOMPLETE [fdo#103927] -> PASS +1

  * igt@i915_pm_rpm@dpms-lpsp:
    - shard-iclb:         DMESG-WARN [fdo#107724] -> PASS +6

  * igt@i915_pm_rpm@legacy-planes:
    - shard-iclb:         DMESG-WARN [fdo#108654] -> PASS

  * igt@kms_busy@extended-pageflip-hang-newfb-render-c:
    - shard-glk:          DMESG-WARN [fdo#107956] -> PASS

  * igt@kms_color@pipe-b-degamma:
    - shard-apl:          FAIL [fdo#104782] -> PASS

  * igt@kms_cursor_crc@cursor-128x42-sliding:
    - shard-apl:          FAIL [fdo#103232] -> PASS +1

  * igt@kms_flip@flip-vs-suspend:
    - shard-skl:          INCOMPLETE [fdo#107773] / [fdo#109507] -> PASS

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-blt:
    - shard-apl:          FAIL [fdo#103167] -> PASS

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-pwrite:
    - shard-iclb:         FAIL [fdo#103167] -> PASS +1

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-shrfb-plflip-blt:
    - shard-glk:          FAIL [fdo#103167] -> PASS +1

  * igt@kms_plane@pixel-format-pipe-b-planes-source-clamping:
    - shard-glk:          FAIL [fdo#108948] -> PASS

  * igt@kms_plane_alpha_blend@pipe-c-constant-alpha-max:
    - shard-glk:          FAIL [fdo#108145] -> PASS

  * igt@kms_plane_multiple@atomic-pipe-b-tiling-y:
    - shard-apl:          FAIL [fdo#103166] -> PASS +1
    - shard-iclb:         FAIL [fdo#103166] -> PASS

  * igt@kms_plane_multiple@atomic-pipe-c-tiling-y:
    - shard-glk:          FAIL [fdo#103166] -> PASS

  * igt@kms_rotation_crc@multiplane-rotation:
    - shard-kbl:          FAIL [fdo#109016] -> PASS

  * igt@kms_rotation_crc@multiplane-rotation-cropping-bottom:
    - shard-kbl:          DMESG-FAIL [fdo#105763] -> PASS

  * igt@kms_vblank@pipe-c-ts-continuation-dpms-suspend:
    - shard-iclb:         INCOMPLETE [fdo#107713] -> PASS

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#102887]: https://bugs.freedesktop.org/show_bug.cgi?id=102887
  [fdo#103166]: https://bugs.freedesktop.org/show_bug.cgi?id=103166
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103191]: https://bugs.freedesktop.org/show_bug.cgi?id=103191
  [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#104782]: https://bugs.freedesktop.org/show_bug.cgi?id=104782
  [fdo#105363]: https://bugs.freedesktop.org/show_bug.cgi?id=105363
  [fdo#105683]: https://bugs.freedesktop.org/show_bug.cgi?id=105683
  [fdo#105763]: https://bugs.freedesktop.org/show_bug.cgi?id=105763
  [fdo#106885]: https://bugs.freedesktop.org/show_bug.cgi?id=106885
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#107773]: https://bugs.freedesktop.org/show_bug.cgi?id=107773
  [fdo#107807]: https://bugs.freedesktop.org/show_bug.cgi?id=107807
  [fdo#107815]: https://bugs.freedesktop.org/show_bug.cgi?id=107815
  [fdo#107956]: https://bugs.freedesktop.org/show_bug.cgi?id=107956
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#108654]: https://bugs.freedesktop.org/show_bug.cgi?id=108654
  [fdo#108840]: https://bugs.freedesktop.org/show_bug.cgi?id=108840
  [fdo#108948]: https://bugs.freedesktop.org/show_bug.cgi?id=108948
  [fdo#109016]: https://bugs.freedesktop.org/show_bug.cgi?id=109016
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109274]: https://bugs.freedesktop.org/show_bug.cgi?id=109274
  [fdo#109275]: https://bugs.freedesktop.org/show_bug.cgi?id=109275
  [fdo#109276]: https://bugs.freedesktop.org/show_bug.cgi?id=109276
  [fdo#109277]: https://bugs.freedesktop.org/show_bug.cgi?id=109277
  [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278
  [fdo#109279]: https://bugs.freedesktop.org/show_bug.cgi?id=109279
  [fdo#109280]: https://bugs.freedesktop.org/show_bug.cgi?id=109280
  [fdo#109281]: https://bugs.freedesktop.org/show_bug.cgi?id=109281
  [fdo#109283]: https://bugs.freedesktop.org/show_bug.cgi?id=109283
  [fdo#109284]: https://bugs.freedesktop.org/show_bug.cgi?id=109284
  [fdo#109287]: https://bugs.freedesktop.org/show_bug.cgi?id=109287
  [fdo#109290]: https://bugs.freedesktop.org/show_bug.cgi?id=109290
  [fdo#109291]: https://bugs.freedesktop.org/show_bug.cgi?id=109291
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109503]: https://bugs.freedesktop.org/show_bug.cgi?id=109503
  [fdo#109507]: https://bugs.freedesktop.org/show_bug.cgi?id=109507
  [fdo#109674]: https://bugs.freedesktop.org/show_bug.cgi?id=109674


Participating hosts (7 -> 7)
------------------------------

  No changes in participating hosts


Build changes
-------------

    * Linux: CI_DRM_5679 -> Patchwork_12347

  CI_DRM_5679: e533152feeb49f1dc4e689a37c415a997e5d094b @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4866: 189956af183c245eb237b3be4fa22953ec93bbe0 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12347: 293b3c3d08c2529728aa7aeea139af325d6f945a @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12347/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-03-02  0:10 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-01 17:08 [CI 1/4] drm/i915/execlists: Suppress redundant preemption Chris Wilson
2019-03-01 17:08 ` [CI 2/4] drm/i915: Keep timeline HWSP allocated until idle across the system Chris Wilson
2019-03-01 17:09 ` [CI 3/4] drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+ Chris Wilson
2019-03-01 17:09 ` [CI 4/4] drm/i915: Prioritise non-busywait semaphore workloads Chris Wilson
2019-03-01 18:05 ` ✗ Fi.CI.SPARSE: warning for series starting with [CI,1/4] drm/i915/execlists: Suppress redundant preemption Patchwork
2019-03-01 18:23 ` ✓ Fi.CI.BAT: success " Patchwork
2019-03-02  0:10 ` ✓ Fi.CI.IGT: " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.