[PATCH 1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started
@ 2019-05-15 13:00 Chris Wilson
  2019-05-15 13:00 ` [PATCH 2/5] drm/i915: Truly bump ready tasks ahead of busywaits Chris Wilson
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Chris Wilson @ 2019-05-15 13:00 UTC (permalink / raw)
  To: intel-gfx

Avoid charging us for the presumed busywait if the request was preempted
after successfully using semaphores to reduce inter-engine latency.

v2: Bump the priority to reflect the lack of semaphores now required.

References: ca6e56f654e7 ("drm/i915: Disable semaphore busywaits on saturated systems")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index bed213148cbb..11670774cd25 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -509,6 +509,12 @@ void __i915_request_unsubmit(struct i915_request *request)
 	/* Transfer back from the global per-engine timeline to per-context */
 	move_to_timeline(request, request->timeline);
 
+	/* We've already spun, don't charge on resubmitting. */
+	if (request->sched.semaphores && i915_request_started(request)) {
+		request->sched.attr.priority |= I915_PRIORITY_NOSEMAPHORE;
+		request->sched.semaphores = 0;
+	}
+
 	/*
 	 * We don't need to wake_up any waiters on request->execute, they
 	 * will get woken by any other event or us re-adding this request
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] drm/i915: Truly bump ready tasks ahead of busywaits
  2019-05-15 13:00 [PATCH 1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started Chris Wilson
@ 2019-05-15 13:00 ` Chris Wilson
  2019-05-17 12:35   ` Tvrtko Ursulin
  2019-05-15 13:00 ` [PATCH 3/5] drm/i915: Bump signaler priority on adding a waiter Chris Wilson
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2019-05-15 13:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dmitry Ermilov

In commit b7404c7ecb38 ("drm/i915: Bump ready tasks ahead of
busywaits"), I tried cutting a corner in order to not install a signal
for each of our dependencies, and only listened to requests on which we
were intending to busywait. The compromise that was made was that
instead of then being able to promite the request with a full
NOSEMAPHORE like its non-busywaiting brethren, as we had not ensured we
had cleared the semaphore chain, we settled for only using the NEWCLIENT
boost. With an over saturated system with multiple NEWCLIENTS in flight
at any time, this was found to be an inadequate promotion and left us
with a much poorer scheduling order than prior to using semaphores.

The outcome of this patch, is that all requests have NOSEMAPHORE
priority when they have no dependencies and are ready to run and not
busywait, restoring the pre-semaphore ordering on saturated systems.

We can demonstrate the effect of poor scheduling order by oversaturating
the system using gem_wsim on a system with multiple vcs engines
(i.e running the same workloads across more clients than required for
peak throughput, e.g. media_load_balance_17i7.wsim -c4 -b context):

x v5.1 (normalized)
+ tip
* fix
+------------------------------------------------------------------------+
|                                                                    x   |
|                                                                    x   |
|                                                                    x   |
|                                                                    x   |
|                                                                   %x   |
|                                                                  %%x   |
|                                                                  %%x   |
|                                                                  %%x   |
|                                                                  %%x   |
|                                                                  %%x   |
|                                                                  %%x   |
|                                                                  %%x   |
|                                                                  %%x   |
|                                                                  %%x   |
|                                                                  %%x   |
|                                                                  %#x   |
|                                                                  %#x   |
|                                                                  %#x   |
|                                                                  %#x   |
|                                                                  %#x   |
|         +                                                        %#xx  |
|         +                                                        %#xx  |
|         +                                                       %%#xx  |
|         +                                                       %%#xx  |
|         +                                                       %%#xx  |
|         +                                                       %%#xx  |
|         +                                                       %%##x  |
|         +++                                                     %%##x  |
|         +++                                                     %%##x  |
|         +++                                                     %%##x  |
|        ++++                                                     %%##x  |
|        ++++                                                     %%##x  |
|        ++++                                                     %%##xx |
|        ++++                                                     %###xx |
|        ++++                                                     %###xx |
|        ++++                                                     %###xx |
|        ++++                                                     %###xx |
|        ++++ +                                                   %#O#xx |
|        ++++ +                                                   %#O#xx |
|        ++++++ +                                                 %#O#xx |
|       ++++++++++                                                %OOOxxx|
|       ++++++++++       +                                       %#OOO#xx|
|     + ++++++++++++ ++ +++++    +                        ++    @@OOOO#xx|
|                                                                   |A_| |
||__________M_______A____________________|                               |
|                                                                 |A_|   |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 120       0.99456       1.00628      0.999985     1.0001545  0.0024387139
+ 120      0.873021       1.00037      0.884134    0.90148752   0.039190862
Difference at 99.5% confidence
	-0.098667 +/- 0.0110762
	-9.86517% +/- 1.10745%
	(Student's t, pooled s = 0.0277657)
% 120      0.990207       1.00165     0.9970265    0.99699748     0.0021024
Difference at 99.5% confidence
	-0.003157 +/- 0.000908245
	-0.315651% +/- 0.0908105%
	(Student's t, pooled s = 0.00227678)

Fixes: b7404c7ecb38 ("drm/i915: Bump ready tasks ahead of busywaits")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 31 +++++++++++------------------
 1 file changed, 12 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 11670774cd25..2a1079e472e2 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -575,18 +575,7 @@ semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 
 	switch (state) {
 	case FENCE_COMPLETE:
-		/*
-		 * We only check a small portion of our dependencies
-		 * and so cannot guarantee that there remains no
-		 * semaphore chain across all. Instead of opting
-		 * for the full NOSEMAPHORE boost, we go for the
-		 * smaller (but still preempting) boost of
-		 * NEWCLIENT. This will be enough to boost over
-		 * a busywaiting request (as that cannot be
-		 * NEWCLIENT) without accidentally boosting
-		 * a busywait over real work elsewhere.
-		 */
-		i915_schedule_bump_priority(request, I915_PRIORITY_NEWCLIENT);
+		i915_schedule_bump_priority(request, I915_PRIORITY_NOSEMAPHORE);
 		break;
 
 	case FENCE_FREE:
@@ -852,12 +841,6 @@ emit_semaphore_wait(struct i915_request *to,
 	if (err < 0)
 		return err;
 
-	err = i915_sw_fence_await_dma_fence(&to->semaphore,
-					    &from->fence, 0,
-					    I915_FENCE_GFP);
-	if (err < 0)
-		return err;
-
 	/* Only submit our spinner after the signaler is running! */
 	err = i915_request_await_execution(to, from, gfp);
 	if (err)
@@ -923,8 +906,18 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
 						    &from->fence, 0,
 						    I915_FENCE_GFP);
 	}
+	if (ret < 0)
+		return ret;
+
+	if (to->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN) {
+		ret = i915_sw_fence_await_dma_fence(&to->semaphore,
+						    &from->fence, 0,
+						    I915_FENCE_GFP);
+		if (ret < 0)
+			return ret;
+	}
 
-	return ret < 0 ? ret : 0;
+	return 0;
 }
 
 int
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] drm/i915: Bump signaler priority on adding a waiter
  2019-05-15 13:00 [PATCH 1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started Chris Wilson
  2019-05-15 13:00 ` [PATCH 2/5] drm/i915: Truly bump ready tasks ahead of busywaits Chris Wilson
@ 2019-05-15 13:00 ` Chris Wilson
  2019-05-17 14:53   ` Tvrtko Ursulin
  2019-05-15 13:00 ` [PATCH 4/5] drm/i915: Downgrade NEWCLIENT to non-preemptive Chris Wilson
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2019-05-15 13:00 UTC (permalink / raw)
  To: intel-gfx

The handling of the no-preemption priority level imposes the restriction
that we need to maintain the implied ordering even though preemption is
disabled. Otherwise we may end up with an AB-BA deadlock across multiple
engine due to a real preemption event reordering the no-preemption
WAITs. To resolve this issue we currently promote all requests to WAIT
on unsubmission, however this interferes with the timeslicing
requirement that we do not apply any implicit promotion that will defeat
the round-robin timeslice list. (If we automatically promote the active
request it will go back to the head of the queue and not the tail!)

So we need implicit promotion to prevent reordering around semaphores
where we are not allowed to preempt, and we must avoid implicit
promotion on unsubmission. So instead of at unsubmit, if we apply that
implicit promotion on adding the dependency, we avoid the semaphore
deadlock and we also reduce the gains made by the promotion for user
space waiting. Furthermore, by keeping the earlier dependencies at a
higher level, we reduce the search space for timeslicing without
altering runtime scheduling too badly (no dependencies at all will be
assigned a higher priority for rrul).

v2: Limit the bump to external edges (as originally intended) i.e.
between contexts and out to the user.

Testcase: igt/gem_concurrent_blit
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_lrc.c      | 12 ++++++++----
 drivers/gpu/drm/i915/i915_request.c         |  9 ---------
 drivers/gpu/drm/i915/i915_scheduler.c       | 11 +++++++++++
 drivers/gpu/drm/i915/i915_scheduler_types.h |  3 ++-
 4 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 4b042893dc0e..5b3d8e33f1cf 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -98,12 +98,14 @@ static int live_busywait_preempt(void *arg)
 	ctx_hi = kernel_context(i915);
 	if (!ctx_hi)
 		goto err_unlock;
-	ctx_hi->sched.priority = INT_MAX;
+	ctx_hi->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
 
 	ctx_lo = kernel_context(i915);
 	if (!ctx_lo)
 		goto err_ctx_hi;
-	ctx_lo->sched.priority = INT_MIN;
+	ctx_lo->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
 
 	obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
 	if (IS_ERR(obj)) {
@@ -958,12 +960,14 @@ static int live_preempt_hang(void *arg)
 	ctx_hi = kernel_context(i915);
 	if (!ctx_hi)
 		goto err_spin_lo;
-	ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
+	ctx_hi->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
 
 	ctx_lo = kernel_context(i915);
 	if (!ctx_lo)
 		goto err_ctx_hi;
-	ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
+	ctx_lo->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
 
 	for_each_engine(engine, i915, id) {
 		struct i915_request *rq;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 2a1079e472e2..4899195e58c2 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -489,15 +489,6 @@ void __i915_request_unsubmit(struct i915_request *request)
 	/* We may be recursing from the signal callback of another i915 fence */
 	spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
 
-	/*
-	 * As we do not allow WAIT to preempt inflight requests,
-	 * once we have executed a request, along with triggering
-	 * any execution callbacks, we must preserve its ordering
-	 * within the non-preemptible FIFO.
-	 */
-	BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK); /* only internal */
-	request->sched.attr.priority |= __NO_PREEMPTION;
-
 	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags))
 		i915_request_cancel_breadcrumb(request);
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 5581c5004ff0..d215dcdf0b1e 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -387,6 +387,16 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
 		    !node_started(signal))
 			node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
 
+		/*
+		 * As we do not allow WAIT to preempt inflight requests,
+		 * once we have executed a request, along with triggering
+		 * any execution callbacks, we must preserve its ordering
+		 * within the non-preemptible FIFO.
+		 */
+		BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK);
+		if (flags & I915_DEPENDENCY_EXTERNAL)
+			__bump_priority(signal, __NO_PREEMPTION);
+
 		ret = true;
 	}
 
@@ -405,6 +415,7 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
 		return -ENOMEM;
 
 	if (!__i915_sched_node_add_dependency(node, signal, dep,
+					      I915_DEPENDENCY_EXTERNAL |
 					      I915_DEPENDENCY_ALLOC))
 		i915_dependency_free(dep);
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index 166a457884b2..3e309631bd0b 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -66,7 +66,8 @@ struct i915_dependency {
 	struct list_head wait_link;
 	struct list_head dfs_link;
 	unsigned long flags;
-#define I915_DEPENDENCY_ALLOC BIT(0)
+#define I915_DEPENDENCY_ALLOC		BIT(0)
+#define I915_DEPENDENCY_EXTERNAL	BIT(1)
 };
 
 #endif /* _I915_SCHEDULER_TYPES_H_ */
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] drm/i915: Downgrade NEWCLIENT to non-preemptive
  2019-05-15 13:00 [PATCH 1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started Chris Wilson
  2019-05-15 13:00 ` [PATCH 2/5] drm/i915: Truly bump ready tasks ahead of busywaits Chris Wilson
  2019-05-15 13:00 ` [PATCH 3/5] drm/i915: Bump signaler priority on adding a waiter Chris Wilson
@ 2019-05-15 13:00 ` Chris Wilson
  2019-05-17 12:55   ` Tvrtko Ursulin
  2019-05-15 13:00 ` [PATCH 5/5] drm/i915/execlists: Drop promotion on unsubmit Chris Wilson
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2019-05-15 13:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dmitry Ermilov

Commit 1413b2bc0717 ("drm/i915: Trim NEWCLIENT boosting") had the
intended consequence of not allowing a sequence of work that merely
crossed into a new engine the privilege to be promoted to NEWCLIENT
status. It also had the unintended consequence of actually making
NEWCLIENT effective on heavily oversubscribed transcode machines and
impacting upon their throughput.

If we consider a client packet composed of (rcsA, rcsB, vcs) and 30 of
those clients, using the NEWCLIENT boost that will be scheduled as

	rcsA x 30, (rcsB, vcs) x 30

where as before it would have been

	(rcsA, rcsB, vcs) x 30

That is with NEWCLIENT only boosting the first request of each client,
we would execute all rcsA requests prior to running on the vcs engines;
acruing a lot of dead time as compared to the previous case where the
vcs engine would be started in parallel to processing the second client.

The previous patch has the effect of delaying submission until it is
required by a third party (either the user with an explicit wait, or by
another client/engine). We reduce the NEWCLIENT bump to a mere WAIT,
which has the effect of removing its preemptive grant and reducing it to
the same level as any other user interaction -- that it will not be
promoted above the interengine dependencies, and so preventing NEWCLIENTS
from starving other engines. This a large nerf to the rrul properties of
the current NEWCLIENT, but it still does give prioritised submission to
new requests from light workloads.

References: b16c765122f9 ("drm/i915: Priority boost for new clients")
Fixes: 1413b2bc0717 ("drm/i915: Trim NEWCLIENT boosting") # customer impact
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c        | 2 +-
 drivers/gpu/drm/i915/i915_priolist_types.h | 5 ++---
 drivers/gpu/drm/i915/i915_request.c        | 2 +-
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index e18623def282..b5e82171df8f 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -164,7 +164,7 @@
 #define WA_TAIL_DWORDS 2
 #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
 
-#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT | I915_PRIORITY_NOSEMAPHORE)
+#define ACTIVE_PRIORITY (I915_PRIORITY_NOSEMAPHORE)
 
 static int execlists_context_deferred_alloc(struct intel_context *ce,
 					    struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
index cc44ebd3b553..49709de69875 100644
--- a/drivers/gpu/drm/i915/i915_priolist_types.h
+++ b/drivers/gpu/drm/i915/i915_priolist_types.h
@@ -20,15 +20,14 @@ enum {
 	I915_PRIORITY_INVALID = INT_MIN
 };
 
-#define I915_USER_PRIORITY_SHIFT 3
+#define I915_USER_PRIORITY_SHIFT 2
 #define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT)
 
 #define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT)
 #define I915_PRIORITY_MASK (I915_PRIORITY_COUNT - 1)
 
 #define I915_PRIORITY_WAIT		((u8)BIT(0))
-#define I915_PRIORITY_NEWCLIENT		((u8)BIT(1))
-#define I915_PRIORITY_NOSEMAPHORE	((u8)BIT(2))
+#define I915_PRIORITY_NOSEMAPHORE	((u8)BIT(1))
 
 #define __NO_PREEMPTION (I915_PRIORITY_WAIT)
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 4899195e58c2..2fca0b59578d 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1189,7 +1189,7 @@ struct i915_request *__i915_request_commit(struct i915_request *rq)
 		 * the bulk clients. (FQ_CODEL)
 		 */
 		if (list_empty(&rq->sched.signalers_list))
-			attr.priority |= I915_PRIORITY_NEWCLIENT;
+			attr.priority |= I915_PRIORITY_WAIT;
 
 		engine->schedule(rq, &attr);
 	}
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] drm/i915/execlists: Drop promotion on unsubmit
  2019-05-15 13:00 [PATCH 1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started Chris Wilson
                   ` (2 preceding siblings ...)
  2019-05-15 13:00 ` [PATCH 4/5] drm/i915: Downgrade NEWCLIENT to non-preemptive Chris Wilson
@ 2019-05-15 13:00 ` Chris Wilson
  2019-05-17 14:30   ` Tvrtko Ursulin
  2019-05-15 13:21 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started Patchwork
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2019-05-15 13:00 UTC (permalink / raw)
  To: intel-gfx

With the disappearance of NEWCLIENT, we no longer need to provide the
priority boost on preemption in order to prevent repeated gazumping,
and we can remove the dead code.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 59 +++++------------------------
 1 file changed, 10 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index b5e82171df8f..f263a8374273 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -164,8 +164,6 @@
 #define WA_TAIL_DWORDS 2
 #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
 
-#define ACTIVE_PRIORITY (I915_PRIORITY_NOSEMAPHORE)
-
 static int execlists_context_deferred_alloc(struct intel_context *ce,
 					    struct intel_engine_cs *engine);
 static void execlists_init_reg_state(u32 *reg_state,
@@ -189,23 +187,12 @@ static int effective_prio(const struct i915_request *rq)
 
 	/*
 	 * On unwinding the active request, we give it a priority bump
-	 * equivalent to a freshly submitted request. This protects it from
-	 * being gazumped again, but it would be preferable if we didn't
-	 * let it be gazumped in the first place!
-	 *
-	 * See __unwind_incomplete_requests()
+	 * if it has completed waiting on any semaphore. If we know that
+	 * the request has already started, we can prevent an unwanted
+	 * preempt-to-idle cycle by taking that into account now.
 	 */
-	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(rq)) {
-		/*
-		 * After preemption, we insert the active request at the
-		 * end of the new priority level. This means that we will be
-		 * _lower_ priority than the preemptee all things equal (and
-		 * so the preemption is valid), so adjust our comparison
-		 * accordingly.
-		 */
-		prio |= ACTIVE_PRIORITY;
-		prio--;
-	}
+	if (__i915_request_has_started(rq))
+		prio |= I915_PRIORITY_NOSEMAPHORE;
 
 	/* Restrict mere WAIT boosts from triggering preemption */
 	return prio | __NO_PREEMPTION;
@@ -371,11 +358,11 @@ static void unwind_wa_tail(struct i915_request *rq)
 }
 
 static struct i915_request *
-__unwind_incomplete_requests(struct intel_engine_cs *engine, int boost)
+__unwind_incomplete_requests(struct intel_engine_cs *engine)
 {
 	struct i915_request *rq, *rn, *active = NULL;
 	struct list_head *uninitialized_var(pl);
-	int prio = I915_PRIORITY_INVALID | boost;
+	int prio = I915_PRIORITY_INVALID;
 
 	lockdep_assert_held(&engine->timeline.lock);
 
@@ -402,31 +389,6 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine, int boost)
 		active = rq;
 	}
 
-	/*
-	 * The active request is now effectively the start of a new client
-	 * stream, so give it the equivalent small priority bump to prevent
-	 * it being gazumped a second time by another peer.
-	 *
-	 * Note we have to be careful not to apply a priority boost to a request
-	 * still spinning on its semaphores. If the request hasn't started, that
-	 * means it is still waiting for its dependencies to be signaled, and
-	 * if we apply a priority boost to this request, we will boost it past
-	 * its signalers and so break PI.
-	 *
-	 * One consequence of this preemption boost is that we may jump
-	 * over lesser priorities (such as I915_PRIORITY_WAIT), effectively
-	 * making those priorities non-preemptible. They will be moved forward
-	 * in the priority queue, but they will not gain immediate access to
-	 * the GPU.
-	 */
-	if (~prio & boost && __i915_request_has_started(active)) {
-		prio |= boost;
-		GEM_BUG_ON(active->sched.attr.priority >= prio);
-		active->sched.attr.priority = prio;
-		list_move_tail(&active->sched.link,
-			       i915_sched_lookup_priolist(engine, prio));
-	}
-
 	return active;
 }
 
@@ -436,7 +398,7 @@ execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
 	struct intel_engine_cs *engine =
 		container_of(execlists, typeof(*engine), execlists);
 
-	return __unwind_incomplete_requests(engine, 0);
+	return __unwind_incomplete_requests(engine);
 }
 
 static inline void
@@ -657,8 +619,7 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists)
 	execlists_cancel_port_requests(execlists);
 	__unwind_incomplete_requests(container_of(execlists,
 						  struct intel_engine_cs,
-						  execlists),
-				     ACTIVE_PRIORITY);
+						  execlists));
 }
 
 static void execlists_dequeue(struct intel_engine_cs *engine)
@@ -1911,7 +1872,7 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
 	execlists_cancel_port_requests(execlists);
 
 	/* Push back any incomplete requests for replay after the reset. */
-	rq = __unwind_incomplete_requests(engine, 0);
+	rq = __unwind_incomplete_requests(engine);
 	if (!rq)
 		goto out_replay;
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started
  2019-05-15 13:00 [PATCH 1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started Chris Wilson
                   ` (3 preceding siblings ...)
  2019-05-15 13:00 ` [PATCH 5/5] drm/i915/execlists: Drop promotion on unsubmit Chris Wilson
@ 2019-05-15 13:21 ` Patchwork
  2019-05-15 13:43 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2019-05-15 13:21 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started
URL   : https://patchwork.freedesktop.org/series/60671/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
44f76ee8e2e2 drm/i915: Mark semaphores as complete on unsubmit out if payload was started
-:12: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#12: 
References: ca6e56f654e7 ("drm/i915: Disable semaphore busywaits on saturated systems")

-:12: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit ca6e56f654e7 ("drm/i915: Disable semaphore busywaits on saturated systems")'
#12: 
References: ca6e56f654e7 ("drm/i915: Disable semaphore busywaits on saturated systems")

total: 1 errors, 1 warnings, 0 checks, 12 lines checked
2e6339d4939b drm/i915: Truly bump ready tasks ahead of busywaits
e8b241693441 drm/i915: Bump signaler priority on adding a waiter
29e2d059081e drm/i915: Downgrade NEWCLIENT to non-preemptive
-:37: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit b16c765122f9 ("drm/i915: Priority boost for new clients")'
#37: 
References: b16c765122f9 ("drm/i915: Priority boost for new clients")

total: 1 errors, 0 warnings, 0 checks, 33 lines checked
8737f6b9a62f drm/i915/execlists: Drop promotion on unsubmit

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started
  2019-05-15 13:00 [PATCH 1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started Chris Wilson
                   ` (4 preceding siblings ...)
  2019-05-15 13:21 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started Patchwork
@ 2019-05-15 13:43 ` Patchwork
  2019-05-15 18:40 ` ✗ Fi.CI.IGT: failure " Patchwork
  2019-05-17 12:26 ` [PATCH 1/5] " Tvrtko Ursulin
  7 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2019-05-15 13:43 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started
URL   : https://patchwork.freedesktop.org/series/60671/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6085 -> Patchwork_13020
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/

Known issues
------------

  Here are the changes found in Patchwork_13020 that come from known issues:

### IGT changes ###

#### Possible fixes ####

  * igt@gem_busy@basic-busy-default:
    - fi-icl-u2:          [INCOMPLETE][1] ([fdo#107713]) -> [PASS][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/fi-icl-u2/igt@gem_busy@basic-busy-default.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/fi-icl-u2/igt@gem_busy@basic-busy-default.html

  * igt@i915_selftest@live_contexts:
    - fi-bdw-gvtdvm:      [DMESG-FAIL][3] ([fdo#110235]) -> [PASS][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/fi-bdw-gvtdvm/igt@i915_selftest@live_contexts.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/fi-bdw-gvtdvm/igt@i915_selftest@live_contexts.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#110235]: https://bugs.freedesktop.org/show_bug.cgi?id=110235
  [fdo#110627]: https://bugs.freedesktop.org/show_bug.cgi?id=110627


Participating hosts (52 -> 46)
------------------------------

  Missing    (6): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_6085 -> Patchwork_13020

  CI_DRM_6085: 48d8cf5cc0aadd21924d05ad3e86b08d8e0e1c50 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4988: 2f6303d13e09b2457762540383c7f36cecd02bbf @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_13020: 8737f6b9a62fa2d982870429e1ca43e5bffb273c @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

8737f6b9a62f drm/i915/execlists: Drop promotion on unsubmit
29e2d059081e drm/i915: Downgrade NEWCLIENT to non-preemptive
e8b241693441 drm/i915: Bump signaler priority on adding a waiter
2e6339d4939b drm/i915: Truly bump ready tasks ahead of busywaits
44f76ee8e2e2 drm/i915: Mark semaphores as complete on unsubmit out if payload was started

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* ✗ Fi.CI.IGT: failure for series starting with [1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started
  2019-05-15 13:00 [PATCH 1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started Chris Wilson
                   ` (5 preceding siblings ...)
  2019-05-15 13:43 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2019-05-15 18:40 ` Patchwork
  2019-05-17 12:26 ` [PATCH 1/5] " Tvrtko Ursulin
  7 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2019-05-15 18:40 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started
URL   : https://patchwork.freedesktop.org/series/60671/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6085_full -> Patchwork_13020_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_13020_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_13020_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_13020_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_eio@in-flight-suspend:
    - shard-snb:          [PASS][1] -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-snb1/igt@gem_eio@in-flight-suspend.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-snb5/igt@gem_eio@in-flight-suspend.html

  
Known issues
------------

  Here are the changes found in Patchwork_13020_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_eio@unwedge-stress:
    - shard-snb:          [PASS][3] -> [FAIL][4] ([fdo#109661])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-snb7/igt@gem_eio@unwedge-stress.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-snb1/igt@gem_eio@unwedge-stress.html

  * igt@gem_softpin@noreloc-s3:
    - shard-skl:          [PASS][5] -> [INCOMPLETE][6] ([fdo#104108] / [fdo#107773])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-skl10/igt@gem_softpin@noreloc-s3.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-skl8/igt@gem_softpin@noreloc-s3.html

  * igt@i915_pm_rpm@debugfs-read:
    - shard-skl:          [PASS][7] -> [INCOMPLETE][8] ([fdo#107807])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-skl3/igt@i915_pm_rpm@debugfs-read.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-skl7/igt@i915_pm_rpm@debugfs-read.html

  * igt@i915_pm_rpm@i2c:
    - shard-iclb:         [PASS][9] -> [DMESG-WARN][10] ([fdo#109982])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-iclb2/igt@i915_pm_rpm@i2c.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-iclb2/igt@i915_pm_rpm@i2c.html

  * igt@i915_suspend@fence-restore-tiled2untiled:
    - shard-apl:          [PASS][11] -> [DMESG-WARN][12] ([fdo#108566]) +2 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-apl1/igt@i915_suspend@fence-restore-tiled2untiled.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-apl8/igt@i915_suspend@fence-restore-tiled2untiled.html

  * igt@kms_cursor_legacy@cursor-vs-flip-legacy:
    - shard-hsw:          [PASS][13] -> [FAIL][14] ([fdo#103355])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-hsw8/igt@kms_cursor_legacy@cursor-vs-flip-legacy.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-hsw8/igt@kms_cursor_legacy@cursor-vs-flip-legacy.html

  * igt@kms_flip@flip-vs-suspend:
    - shard-skl:          [PASS][15] -> [INCOMPLETE][16] ([fdo#109507])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-skl6/igt@kms_flip@flip-vs-suspend.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-skl9/igt@kms_flip@flip-vs-suspend.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-blt:
    - shard-iclb:         [PASS][17] -> [FAIL][18] ([fdo#103167]) +4 similar issues
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-iclb5/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-blt.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-iclb5/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-blt.html

  * igt@kms_psr@no_drrs:
    - shard-iclb:         [PASS][19] -> [FAIL][20] ([fdo#108341])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-iclb8/igt@kms_psr@no_drrs.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-iclb1/igt@kms_psr@no_drrs.html

  * igt@kms_psr@psr2_cursor_plane_move:
    - shard-iclb:         [PASS][21] -> [SKIP][22] ([fdo#109441])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-iclb2/igt@kms_psr@psr2_cursor_plane_move.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-iclb3/igt@kms_psr@psr2_cursor_plane_move.html

  * igt@kms_setmode@basic:
    - shard-kbl:          [PASS][23] -> [FAIL][24] ([fdo#99912])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-kbl2/igt@kms_setmode@basic.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-kbl3/igt@kms_setmode@basic.html

  * igt@kms_sysfs_edid_timing:
    - shard-iclb:         [PASS][25] -> [FAIL][26] ([fdo#100047])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-iclb6/igt@kms_sysfs_edid_timing.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-iclb2/igt@kms_sysfs_edid_timing.html

  * igt@perf@oa-exponents:
    - shard-glk:          [PASS][27] -> [FAIL][28] ([fdo#105483])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-glk1/igt@perf@oa-exponents.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-glk3/igt@perf@oa-exponents.html

  
#### Possible fixes ####

  * igt@gem_eio@in-flight-suspend:
    - shard-kbl:          [FAIL][29] ([fdo#110667]) -> [PASS][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-kbl2/igt@gem_eio@in-flight-suspend.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-kbl4/igt@gem_eio@in-flight-suspend.html

  * igt@gem_exec_suspend@basic-s3-devices:
    - shard-glk:          [INCOMPLETE][31] ([fdo#103359] / [k.org#198133]) -> [PASS][32]
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-glk3/igt@gem_exec_suspend@basic-s3-devices.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-glk9/igt@gem_exec_suspend@basic-s3-devices.html

  * igt@i915_suspend@fence-restore-untiled:
    - shard-apl:          [DMESG-WARN][33] ([fdo#108566]) -> [PASS][34] +2 similar issues
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-apl5/igt@i915_suspend@fence-restore-untiled.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-apl1/igt@i915_suspend@fence-restore-untiled.html

  * {igt@kms_cursor_crc@pipe-c-cursor-64x21-offscreen}:
    - shard-skl:          [FAIL][35] ([fdo#103232]) -> [PASS][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-skl4/igt@kms_cursor_crc@pipe-c-cursor-64x21-offscreen.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-skl5/igt@kms_cursor_crc@pipe-c-cursor-64x21-offscreen.html

  * igt@kms_flip@flip-vs-expired-vblank:
    - shard-skl:          [FAIL][37] ([fdo#105363]) -> [PASS][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-skl4/igt@kms_flip@flip-vs-expired-vblank.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-skl10/igt@kms_flip@flip-vs-expired-vblank.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-offscren-pri-shrfb-draw-pwrite:
    - shard-iclb:         [FAIL][39] ([fdo#103167]) -> [PASS][40] +3 similar issues
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-offscren-pri-shrfb-draw-pwrite.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-iclb3/igt@kms_frontbuffer_tracking@fbcpsr-1p-offscren-pri-shrfb-draw-pwrite.html

  * igt@kms_plane_alpha_blend@pipe-a-coverage-7efc:
    - shard-skl:          [FAIL][41] ([fdo#108145]) -> [PASS][42] +1 similar issue
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-skl1/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-skl8/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html

  * igt@kms_psr@psr2_sprite_blt:
    - shard-iclb:         [SKIP][43] ([fdo#109441]) -> [PASS][44] +1 similar issue
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6085/shard-iclb7/igt@kms_psr@psr2_sprite_blt.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/shard-iclb2/igt@kms_psr@psr2_sprite_blt.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#100047]: https://bugs.freedesktop.org/show_bug.cgi?id=100047
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232
  [fdo#103355]: https://bugs.freedesktop.org/show_bug.cgi?id=103355
  [fdo#103359]: https://bugs.freedesktop.org/show_bug.cgi?id=103359
  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#105363]: https://bugs.freedesktop.org/show_bug.cgi?id=105363
  [fdo#105483]: https://bugs.freedesktop.org/show_bug.cgi?id=105483
  [fdo#107773]: https://bugs.freedesktop.org/show_bug.cgi?id=107773
  [fdo#107807]: https://bugs.freedesktop.org/show_bug.cgi?id=107807
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#108341]: https://bugs.freedesktop.org/show_bug.cgi?id=108341
  [fdo#108566]: https://bugs.freedesktop.org/show_bug.cgi?id=108566
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109507]: https://bugs.freedesktop.org/show_bug.cgi?id=109507
  [fdo#109661]: https://bugs.freedesktop.org/show_bug.cgi?id=109661
  [fdo#109982]: https://bugs.freedesktop.org/show_bug.cgi?id=109982
  [fdo#110667]: https://bugs.freedesktop.org/show_bug.cgi?id=110667
  [fdo#99912]: https://bugs.freedesktop.org/show_bug.cgi?id=99912
  [k.org#198133]: https://bugzilla.kernel.org/show_bug.cgi?id=198133


Participating hosts (11 -> 10)
------------------------------

  Missing    (1): pig-snb-2600 


Build changes
-------------

  * Linux: CI_DRM_6085 -> Patchwork_13020

  CI_DRM_6085: 48d8cf5cc0aadd21924d05ad3e86b08d8e0e1c50 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4988: 2f6303d13e09b2457762540383c7f36cecd02bbf @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_13020: 8737f6b9a62fa2d982870429e1ca43e5bffb273c @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13020/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started
  2019-05-15 13:00 [PATCH 1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started Chris Wilson
                   ` (6 preceding siblings ...)
  2019-05-15 18:40 ` ✗ Fi.CI.IGT: failure " Patchwork
@ 2019-05-17 12:26 ` Tvrtko Ursulin
  7 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 12:26 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 15/05/2019 14:00, Chris Wilson wrote:
> Avoid charging us for the presumed busywait if the request was preempted
> after successfully using semaphores to reduce inter-engine latency.
> 
> v2: Bump the priority to reflect the lack of semaphores now required.
> 
> References: ca6e56f654e7 ("drm/i915: Disable semaphore busywaits on saturated systems")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_request.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index bed213148cbb..11670774cd25 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -509,6 +509,12 @@ void __i915_request_unsubmit(struct i915_request *request)
>   	/* Transfer back from the global per-engine timeline to per-context */
>   	move_to_timeline(request, request->timeline);
>   
> +	/* We've already spun, don't charge on resubmitting. */
> +	if (request->sched.semaphores && i915_request_started(request)) {
> +		request->sched.attr.priority |= I915_PRIORITY_NOSEMAPHORE;
> +		request->sched.semaphores = 0;
> +	}
> +
>   	/*
>   	 * We don't need to wake_up any waiters on request->execute, they
>   	 * will get woken by any other event or us re-adding this request
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] drm/i915: Truly bump ready tasks ahead of busywaits
  2019-05-15 13:00 ` [PATCH 2/5] drm/i915: Truly bump ready tasks ahead of busywaits Chris Wilson
@ 2019-05-17 12:35   ` Tvrtko Ursulin
  0 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 12:35 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Dmitry Ermilov


On 15/05/2019 14:00, Chris Wilson wrote:
> In commit b7404c7ecb38 ("drm/i915: Bump ready tasks ahead of
> busywaits"), I tried cutting a corner in order to not install a signal
> for each of our dependencies, and only listened to requests on which we
> were intending to busywait. The compromise that was made was that
> instead of then being able to promite the request with a full

promote

> NOSEMAPHORE like its non-busywaiting brethren, as we had not ensured we
> had cleared the semaphore chain, we settled for only using the NEWCLIENT
> boost. With an over saturated system with multiple NEWCLIENTS in flight
> at any time, this was found to be an inadequate promotion and left us
> with a much poorer scheduling order than prior to using semaphores.
> 
> The outcome of this patch, is that all requests have NOSEMAPHORE
> priority when they have no dependencies and are ready to run and not
> busywait, restoring the pre-semaphore ordering on saturated systems.
> 
> We can demonstrate the effect of poor scheduling order by oversaturating
> the system using gem_wsim on a system with multiple vcs engines
> (i.e running the same workloads across more clients than required for
> peak throughput, e.g. media_load_balance_17i7.wsim -c4 -b context):
> 
> x v5.1 (normalized)
> + tip
> * fix
> +------------------------------------------------------------------------+
> |                                                                    x   |
> |                                                                    x   |
> |                                                                    x   |
> |                                                                    x   |
> |                                                                   %x   |
> |                                                                  %%x   |
> |                                                                  %%x   |
> |                                                                  %%x   |
> |                                                                  %%x   |
> |                                                                  %%x   |
> |                                                                  %%x   |
> |                                                                  %%x   |
> |                                                                  %%x   |
> |                                                                  %%x   |
> |                                                                  %%x   |
> |                                                                  %#x   |
> |                                                                  %#x   |
> |                                                                  %#x   |
> |                                                                  %#x   |
> |                                                                  %#x   |
> |         +                                                        %#xx  |
> |         +                                                        %#xx  |
> |         +                                                       %%#xx  |
> |         +                                                       %%#xx  |
> |         +                                                       %%#xx  |
> |         +                                                       %%#xx  |
> |         +                                                       %%##x  |
> |         +++                                                     %%##x  |
> |         +++                                                     %%##x  |
> |         +++                                                     %%##x  |
> |        ++++                                                     %%##x  |
> |        ++++                                                     %%##x  |
> |        ++++                                                     %%##xx |
> |        ++++                                                     %###xx |
> |        ++++                                                     %###xx |
> |        ++++                                                     %###xx |
> |        ++++                                                     %###xx |
> |        ++++ +                                                   %#O#xx |
> |        ++++ +                                                   %#O#xx |
> |        ++++++ +                                                 %#O#xx |
> |       ++++++++++                                                %OOOxxx|
> |       ++++++++++       +                                       %#OOO#xx|
> |     + ++++++++++++ ++ +++++    +                        ++    @@OOOO#xx|
> |                                                                   |A_| |
> ||__________M_______A____________________|                               |
> |                                                                 |A_|   |
> +------------------------------------------------------------------------+
>      N           Min           Max        Median           Avg        Stddev
> x 120       0.99456       1.00628      0.999985     1.0001545  0.0024387139
> + 120      0.873021       1.00037      0.884134    0.90148752   0.039190862
> Difference at 99.5% confidence
> 	-0.098667 +/- 0.0110762
> 	-9.86517% +/- 1.10745%
> 	(Student's t, pooled s = 0.0277657)
> % 120      0.990207       1.00165     0.9970265    0.99699748     0.0021024
> Difference at 99.5% confidence
> 	-0.003157 +/- 0.000908245
> 	-0.315651% +/- 0.0908105%
> 	(Student's t, pooled s = 0.00227678)
> 
> Fixes: b7404c7ecb38 ("drm/i915: Bump ready tasks ahead of busywaits")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_request.c | 31 +++++++++++------------------
>   1 file changed, 12 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 11670774cd25..2a1079e472e2 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -575,18 +575,7 @@ semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
>   
>   	switch (state) {
>   	case FENCE_COMPLETE:
> -		/*
> -		 * We only check a small portion of our dependencies
> -		 * and so cannot guarantee that there remains no
> -		 * semaphore chain across all. Instead of opting
> -		 * for the full NOSEMAPHORE boost, we go for the
> -		 * smaller (but still preempting) boost of
> -		 * NEWCLIENT. This will be enough to boost over
> -		 * a busywaiting request (as that cannot be
> -		 * NEWCLIENT) without accidentally boosting
> -		 * a busywait over real work elsewhere.
> -		 */
> -		i915_schedule_bump_priority(request, I915_PRIORITY_NEWCLIENT);
> +		i915_schedule_bump_priority(request, I915_PRIORITY_NOSEMAPHORE);
>   		break;
>   
>   	case FENCE_FREE:
> @@ -852,12 +841,6 @@ emit_semaphore_wait(struct i915_request *to,
>   	if (err < 0)
>   		return err;
>   
> -	err = i915_sw_fence_await_dma_fence(&to->semaphore,
> -					    &from->fence, 0,
> -					    I915_FENCE_GFP);
> -	if (err < 0)
> -		return err;
> -
>   	/* Only submit our spinner after the signaler is running! */
>   	err = i915_request_await_execution(to, from, gfp);
>   	if (err)
> @@ -923,8 +906,18 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
>   						    &from->fence, 0,
>   						    I915_FENCE_GFP);
>   	}
> +	if (ret < 0)
> +		return ret;
> +
> +	if (to->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN) {
> +		ret = i915_sw_fence_await_dma_fence(&to->semaphore,
> +						    &from->fence, 0,
> +						    I915_FENCE_GFP);
> +		if (ret < 0)
> +			return ret;
> +	}
>   
> -	return ret < 0 ? ret : 0;
> +	return 0;
>   }
>   
>   int
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] drm/i915: Downgrade NEWCLIENT to non-preemptive
  2019-05-15 13:00 ` [PATCH 4/5] drm/i915: Downgrade NEWCLIENT to non-preemptive Chris Wilson
@ 2019-05-17 12:55   ` Tvrtko Ursulin
  2019-05-17 13:30     ` Chris Wilson
  0 siblings, 1 reply; 15+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 12:55 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Dmitry Ermilov


On 15/05/2019 14:00, Chris Wilson wrote:
> Commit 1413b2bc0717 ("drm/i915: Trim NEWCLIENT boosting") had the
> intended consequence of not allowing a sequence of work that merely
> crossed into a new engine the privilege to be promoted to NEWCLIENT

What do you mean with crossed into a new engine? At first I thought the 
statement implies the engine timeline was used to query for previous 
request, but that's not true.

Regards,

Tvrtko

> status. It also had the unintended consequence of actually making
> NEWCLIENT effective on heavily oversubscribed transcode machines and
> impacting upon their throughput.
> 
> If we consider a client packet composed of (rcsA, rcsB, vcs) and 30 of
> those clients, using the NEWCLIENT boost that will be scheduled as
> 
> 	rcsA x 30, (rcsB, vcs) x 30
> 
> where as before it would have been
> 
> 	(rcsA, rcsB, vcs) x 30
> 
> That is with NEWCLIENT only boosting the first request of each client,
> we would execute all rcsA requests prior to running on the vcs engines;
> acruing a lot of dead time as compared to the previous case where the
> vcs engine would be started in parallel to processing the second client.
> 
> The previous patch has the effect of delaying submission until it is
> required by a third party (either the user with an explicit wait, or by
> another client/engine). We reduce the NEWCLIENT bump to a mere WAIT,
> which has the effect of removing its preemptive grant and reducing it to
> the same level as any other user interaction -- that it will not be
> promoted above the interengine dependencies, and so preventing NEWCLIENTS
> from starving other engines. This a large nerf to the rrul properties of
> the current NEWCLIENT, but it still does give prioritised submission to
> new requests from light workloads.
> 
> References: b16c765122f9 ("drm/i915: Priority boost for new clients")
> Fixes: 1413b2bc0717 ("drm/i915: Trim NEWCLIENT boosting") # customer impact
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_lrc.c        | 2 +-
>   drivers/gpu/drm/i915/i915_priolist_types.h | 5 ++---
>   drivers/gpu/drm/i915/i915_request.c        | 2 +-
>   3 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index e18623def282..b5e82171df8f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -164,7 +164,7 @@
>   #define WA_TAIL_DWORDS 2
>   #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
>   
> -#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT | I915_PRIORITY_NOSEMAPHORE)
> +#define ACTIVE_PRIORITY (I915_PRIORITY_NOSEMAPHORE)
>   
>   static int execlists_context_deferred_alloc(struct intel_context *ce,
>   					    struct intel_engine_cs *engine);
> diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
> index cc44ebd3b553..49709de69875 100644
> --- a/drivers/gpu/drm/i915/i915_priolist_types.h
> +++ b/drivers/gpu/drm/i915/i915_priolist_types.h
> @@ -20,15 +20,14 @@ enum {
>   	I915_PRIORITY_INVALID = INT_MIN
>   };
>   
> -#define I915_USER_PRIORITY_SHIFT 3
> +#define I915_USER_PRIORITY_SHIFT 2
>   #define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT)
>   
>   #define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT)
>   #define I915_PRIORITY_MASK (I915_PRIORITY_COUNT - 1)
>   
>   #define I915_PRIORITY_WAIT		((u8)BIT(0))
> -#define I915_PRIORITY_NEWCLIENT		((u8)BIT(1))
> -#define I915_PRIORITY_NOSEMAPHORE	((u8)BIT(2))
> +#define I915_PRIORITY_NOSEMAPHORE	((u8)BIT(1))
>   
>   #define __NO_PREEMPTION (I915_PRIORITY_WAIT)
>   
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 4899195e58c2..2fca0b59578d 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1189,7 +1189,7 @@ struct i915_request *__i915_request_commit(struct i915_request *rq)
>   		 * the bulk clients. (FQ_CODEL)
>   		 */
>   		if (list_empty(&rq->sched.signalers_list))
> -			attr.priority |= I915_PRIORITY_NEWCLIENT;
> +			attr.priority |= I915_PRIORITY_WAIT;
>   
>   		engine->schedule(rq, &attr);
>   	}
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] drm/i915: Downgrade NEWCLIENT to non-preemptive
  2019-05-17 12:55   ` Tvrtko Ursulin
@ 2019-05-17 13:30     ` Chris Wilson
  2019-05-17 14:29       ` Tvrtko Ursulin
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2019-05-17 13:30 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: Dmitry Ermilov

Quoting Tvrtko Ursulin (2019-05-17 13:55:48)
> 
> On 15/05/2019 14:00, Chris Wilson wrote:
> > Commit 1413b2bc0717 ("drm/i915: Trim NEWCLIENT boosting") had the
> > intended consequence of not allowing a sequence of work that merely
> > crossed into a new engine the privilege to be promoted to NEWCLIENT
> 
> What do you mean with crossed into a new engine? At first I thought the 
> statement implies the engine timeline was used to query for previous 
> request, but that's not true.

Our previous test was if all previous requests in the ring (along the
engine timeline) were complete then give this new request a boost. That
also gave the boost to any dependencies in other contexts and across
engines -- the consideration there was for a display server who was only
blitting from client buffers into the framebuffer to get an early boost
and bump all of its clients in order to be ahead of the next vblank. The
second thought was that was a bit too wide, but now we have evidence
from will-it-scale style oversaturation of transcode work that we should
take into account the workloads across engines and across contexts.

I think these two patches are quite nice in that effect, work is
essentially bottled up until required and so should arrive at the GPU in
batches of related work (but we don't prevent work from being executed
if the GPU is idle). Then combined with the timeslice we will
round-robin between the work required for the external observer to make
progress before doing other work.

It makes a pretty picture in my head and so far looks respectable in the
wsim comparisons (as well as the sample transcode reproducers). The
casualty is the realtime-response under load has lost its preemptive
power, and is relegated to just towards the head of the queue as opposed
to the front. On the other head, it makes WAIT far, far less special.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] drm/i915: Downgrade NEWCLIENT to non-preemptive
  2019-05-17 13:30     ` Chris Wilson
@ 2019-05-17 14:29       ` Tvrtko Ursulin
  0 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 14:29 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Dmitry Ermilov


On 17/05/2019 14:30, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-17 13:55:48)
>>
>> On 15/05/2019 14:00, Chris Wilson wrote:
>>> Commit 1413b2bc0717 ("drm/i915: Trim NEWCLIENT boosting") had the
>>> intended consequence of not allowing a sequence of work that merely
>>> crossed into a new engine the privilege to be promoted to NEWCLIENT
>>
>> What do you mean with crossed into a new engine? At first I thought the
>> statement implies the engine timeline was used to query for previous
>> request, but that's not true.
> 
> Our previous test was if all previous requests in the ring (along the
> engine timeline) were complete then give this new request a boost. That
> also gave the boost to any dependencies in other contexts and across
> engines -- the consideration there was for a display server who was only
> blitting from client buffers into the framebuffer to get an early boost
> and bump all of its clients in order to be ahead of the next vblank. The
> second thought was that was a bit too wide, but now we have evidence
> from will-it-scale style oversaturation of transcode work that we should
> take into account the workloads across engines and across contexts.
> 
> I think these two patches are quite nice in that effect, work is
> essentially bottled up until required and so should arrive at the GPU in
> batches of related work (but we don't prevent work from being executed
> if the GPU is idle). Then combined with the timeslice we will
> round-robin between the work required for the external observer to make
> progress before doing other work.
> 
> It makes a pretty picture in my head and so far looks respectable in the
> wsim comparisons (as well as the sample transcode reproducers). The
> casualty is the realtime-response under load has lost its preemptive
> power, and is relegated to just towards the head of the queue as opposed
> to the front. On the other head, it makes WAIT far, far less special.

Sorry for some reason I was thinking of a single timeline contexts when 
thinking about the commit message. :(  Numbers prove we need it..

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] drm/i915/execlists: Drop promotion on unsubmit
  2019-05-15 13:00 ` [PATCH 5/5] drm/i915/execlists: Drop promotion on unsubmit Chris Wilson
@ 2019-05-17 14:30   ` Tvrtko Ursulin
  0 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 14:30 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 15/05/2019 14:00, Chris Wilson wrote:
> With the disappearance of NEWCLIENT, we no longer need to provide the
> priority boost on preemption in order to prevent repeated gazumping,
> and we can remove the dead code.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_lrc.c | 59 +++++------------------------
>   1 file changed, 10 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index b5e82171df8f..f263a8374273 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -164,8 +164,6 @@
>   #define WA_TAIL_DWORDS 2
>   #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
>   
> -#define ACTIVE_PRIORITY (I915_PRIORITY_NOSEMAPHORE)
> -
>   static int execlists_context_deferred_alloc(struct intel_context *ce,
>   					    struct intel_engine_cs *engine);
>   static void execlists_init_reg_state(u32 *reg_state,
> @@ -189,23 +187,12 @@ static int effective_prio(const struct i915_request *rq)
>   
>   	/*
>   	 * On unwinding the active request, we give it a priority bump
> -	 * equivalent to a freshly submitted request. This protects it from
> -	 * being gazumped again, but it would be preferable if we didn't
> -	 * let it be gazumped in the first place!
> -	 *
> -	 * See __unwind_incomplete_requests()
> +	 * if it has completed waiting on any semaphore. If we know that
> +	 * the request has already started, we can prevent an unwanted
> +	 * preempt-to-idle cycle by taking that into account now.
>   	 */
> -	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(rq)) {
> -		/*
> -		 * After preemption, we insert the active request at the
> -		 * end of the new priority level. This means that we will be
> -		 * _lower_ priority than the preemptee all things equal (and
> -		 * so the preemption is valid), so adjust our comparison
> -		 * accordingly.
> -		 */
> -		prio |= ACTIVE_PRIORITY;
> -		prio--;
> -	}
> +	if (__i915_request_has_started(rq))
> +		prio |= I915_PRIORITY_NOSEMAPHORE;
>   
>   	/* Restrict mere WAIT boosts from triggering preemption */
>   	return prio | __NO_PREEMPTION;
> @@ -371,11 +358,11 @@ static void unwind_wa_tail(struct i915_request *rq)
>   }
>   
>   static struct i915_request *
> -__unwind_incomplete_requests(struct intel_engine_cs *engine, int boost)
> +__unwind_incomplete_requests(struct intel_engine_cs *engine)
>   {
>   	struct i915_request *rq, *rn, *active = NULL;
>   	struct list_head *uninitialized_var(pl);
> -	int prio = I915_PRIORITY_INVALID | boost;
> +	int prio = I915_PRIORITY_INVALID;
>   
>   	lockdep_assert_held(&engine->timeline.lock);
>   
> @@ -402,31 +389,6 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine, int boost)
>   		active = rq;
>   	}
>   
> -	/*
> -	 * The active request is now effectively the start of a new client
> -	 * stream, so give it the equivalent small priority bump to prevent
> -	 * it being gazumped a second time by another peer.
> -	 *
> -	 * Note we have to be careful not to apply a priority boost to a request
> -	 * still spinning on its semaphores. If the request hasn't started, that
> -	 * means it is still waiting for its dependencies to be signaled, and
> -	 * if we apply a priority boost to this request, we will boost it past
> -	 * its signalers and so break PI.
> -	 *
> -	 * One consequence of this preemption boost is that we may jump
> -	 * over lesser priorities (such as I915_PRIORITY_WAIT), effectively
> -	 * making those priorities non-preemptible. They will be moved forward
> -	 * in the priority queue, but they will not gain immediate access to
> -	 * the GPU.
> -	 */
> -	if (~prio & boost && __i915_request_has_started(active)) {
> -		prio |= boost;
> -		GEM_BUG_ON(active->sched.attr.priority >= prio);
> -		active->sched.attr.priority = prio;
> -		list_move_tail(&active->sched.link,
> -			       i915_sched_lookup_priolist(engine, prio));
> -	}
> -
>   	return active;
>   }
>   
> @@ -436,7 +398,7 @@ execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
>   	struct intel_engine_cs *engine =
>   		container_of(execlists, typeof(*engine), execlists);
>   
> -	return __unwind_incomplete_requests(engine, 0);
> +	return __unwind_incomplete_requests(engine);
>   }
>   
>   static inline void
> @@ -657,8 +619,7 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists)
>   	execlists_cancel_port_requests(execlists);
>   	__unwind_incomplete_requests(container_of(execlists,
>   						  struct intel_engine_cs,
> -						  execlists),
> -				     ACTIVE_PRIORITY);
> +						  execlists));
>   }
>   
>   static void execlists_dequeue(struct intel_engine_cs *engine)
> @@ -1911,7 +1872,7 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
>   	execlists_cancel_port_requests(execlists);
>   
>   	/* Push back any incomplete requests for replay after the reset. */
> -	rq = __unwind_incomplete_requests(engine, 0);
> +	rq = __unwind_incomplete_requests(engine);
>   	if (!rq)
>   		goto out_replay;
>   
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] drm/i915: Bump signaler priority on adding a waiter
  2019-05-15 13:00 ` [PATCH 3/5] drm/i915: Bump signaler priority on adding a waiter Chris Wilson
@ 2019-05-17 14:53   ` Tvrtko Ursulin
  0 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 14:53 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 15/05/2019 14:00, Chris Wilson wrote:
> The handling of the no-preemption priority level imposes the restriction
> that we need to maintain the implied ordering even though preemption is
> disabled. Otherwise we may end up with an AB-BA deadlock across multiple
> engine due to a real preemption event reordering the no-preemption
> WAITs. To resolve this issue we currently promote all requests to WAIT
> on unsubmission, however this interferes with the timeslicing
> requirement that we do not apply any implicit promotion that will defeat
> the round-robin timeslice list. (If we automatically promote the active
> request it will go back to the head of the queue and not the tail!)
> 
> So we need implicit promotion to prevent reordering around semaphores
> where we are not allowed to preempt, and we must avoid implicit
> promotion on unsubmission. So instead of at unsubmit, if we apply that
> implicit promotion on adding the dependency, we avoid the semaphore
> deadlock and we also reduce the gains made by the promotion for user
> space waiting. Furthermore, by keeping the earlier dependencies at a
> higher level, we reduce the search space for timeslicing without
> altering runtime scheduling too badly (no dependencies at all will be
> assigned a higher priority for rrul).
> 
> v2: Limit the bump to external edges (as originally intended) i.e.
> between contexts and out to the user.
> 
> Testcase: igt/gem_concurrent_blit
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/selftest_lrc.c      | 12 ++++++++----
>   drivers/gpu/drm/i915/i915_request.c         |  9 ---------
>   drivers/gpu/drm/i915/i915_scheduler.c       | 11 +++++++++++
>   drivers/gpu/drm/i915/i915_scheduler_types.h |  3 ++-
>   4 files changed, 21 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> index 4b042893dc0e..5b3d8e33f1cf 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> @@ -98,12 +98,14 @@ static int live_busywait_preempt(void *arg)
>   	ctx_hi = kernel_context(i915);
>   	if (!ctx_hi)
>   		goto err_unlock;
> -	ctx_hi->sched.priority = INT_MAX;
> +	ctx_hi->sched.priority =
> +		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
>   
>   	ctx_lo = kernel_context(i915);
>   	if (!ctx_lo)
>   		goto err_ctx_hi;
> -	ctx_lo->sched.priority = INT_MIN;
> +	ctx_lo->sched.priority =
> +		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
>   
>   	obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
>   	if (IS_ERR(obj)) {
> @@ -958,12 +960,14 @@ static int live_preempt_hang(void *arg)
>   	ctx_hi = kernel_context(i915);
>   	if (!ctx_hi)
>   		goto err_spin_lo;
> -	ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
> +	ctx_hi->sched.priority =
> +		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
>   
>   	ctx_lo = kernel_context(i915);
>   	if (!ctx_lo)
>   		goto err_ctx_hi;
> -	ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
> +	ctx_lo->sched.priority =
> +		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
>   
>   	for_each_engine(engine, i915, id) {
>   		struct i915_request *rq;
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 2a1079e472e2..4899195e58c2 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -489,15 +489,6 @@ void __i915_request_unsubmit(struct i915_request *request)
>   	/* We may be recursing from the signal callback of another i915 fence */
>   	spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
>   
> -	/*
> -	 * As we do not allow WAIT to preempt inflight requests,
> -	 * once we have executed a request, along with triggering
> -	 * any execution callbacks, we must preserve its ordering
> -	 * within the non-preemptible FIFO.
> -	 */
> -	BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK); /* only internal */
> -	request->sched.attr.priority |= __NO_PREEMPTION;
> -
>   	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags))
>   		i915_request_cancel_breadcrumb(request);
>   
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index 5581c5004ff0..d215dcdf0b1e 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -387,6 +387,16 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
>   		    !node_started(signal))
>   			node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
>   
> +		/*
> +		 * As we do not allow WAIT to preempt inflight requests,
> +		 * once we have executed a request, along with triggering
> +		 * any execution callbacks, we must preserve its ordering
> +		 * within the non-preemptible FIFO.
> +		 */
> +		BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK);
> +		if (flags & I915_DEPENDENCY_EXTERNAL)
> +			__bump_priority(signal, __NO_PREEMPTION);
> +
>   		ret = true;
>   	}
>   
> @@ -405,6 +415,7 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
>   		return -ENOMEM;
>   
>   	if (!__i915_sched_node_add_dependency(node, signal, dep,
> +					      I915_DEPENDENCY_EXTERNAL |
>   					      I915_DEPENDENCY_ALLOC))
>   		i915_dependency_free(dep);
>   
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index 166a457884b2..3e309631bd0b 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -66,7 +66,8 @@ struct i915_dependency {
>   	struct list_head wait_link;
>   	struct list_head dfs_link;
>   	unsigned long flags;
> -#define I915_DEPENDENCY_ALLOC BIT(0)
> +#define I915_DEPENDENCY_ALLOC		BIT(0)
> +#define I915_DEPENDENCY_EXTERNAL	BIT(1)
>   };
>   
>   #endif /* _I915_SCHEDULER_TYPES_H_ */
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2019-05-17 14:53 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-15 13:00 [PATCH 1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started Chris Wilson
2019-05-15 13:00 ` [PATCH 2/5] drm/i915: Truly bump ready tasks ahead of busywaits Chris Wilson
2019-05-17 12:35   ` Tvrtko Ursulin
2019-05-15 13:00 ` [PATCH 3/5] drm/i915: Bump signaler priority on adding a waiter Chris Wilson
2019-05-17 14:53   ` Tvrtko Ursulin
2019-05-15 13:00 ` [PATCH 4/5] drm/i915: Downgrade NEWCLIENT to non-preemptive Chris Wilson
2019-05-17 12:55   ` Tvrtko Ursulin
2019-05-17 13:30     ` Chris Wilson
2019-05-17 14:29       ` Tvrtko Ursulin
2019-05-15 13:00 ` [PATCH 5/5] drm/i915/execlists: Drop promotion on unsubmit Chris Wilson
2019-05-17 14:30   ` Tvrtko Ursulin
2019-05-15 13:21 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/5] drm/i915: Mark semaphores as complete on unsubmit out if payload was started Patchwork
2019-05-15 13:43 ` ✓ Fi.CI.BAT: success " Patchwork
2019-05-15 18:40 ` ✗ Fi.CI.IGT: failure " Patchwork
2019-05-17 12:26 ` [PATCH 1/5] " Tvrtko Ursulin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.