* [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler
@ 2019-05-03 11:52 Chris Wilson
  2019-05-03 11:52 ` [PATCH 02/13] drm/i915: Prefer checking the wakeref itself rather than the counter Chris Wilson
                   ` (18 more replies)
  0 siblings, 19 replies; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx

Inside the signal handler, we expect the requests to be ordered by their
breadcrumb such that no later request may be complete if we find an
earlier one incomplete. Add an assert to check that the next breadcrumb
is not logically before the current one.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index 3cbffd400b1b..a6ffb25f72a2 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -99,6 +99,12 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
 			struct i915_request *rq =
 				list_entry(pos, typeof(*rq), signal_link);
 
+			GEM_BUG_ON(next != &ce->signals &&
+				   i915_seqno_passed(rq->fence.seqno,
+						     list_entry(next,
+								typeof(*rq),
+								signal_link)->fence.seqno));
+
 			if (!__request_completed(rq))
 				break;
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH 02/13] drm/i915: Prefer checking the wakeref itself rather than the counter
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
@ 2019-05-03 11:52 ` Chris Wilson
  2019-05-07 10:48   ` Tvrtko Ursulin
  2019-05-03 11:52 ` [PATCH 03/13] drm/i915: Assert the local engine->wakeref is active Chris Wilson
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx

The counter goes to zero at the start of the parking cycle, but the
wakeref itself is held until the end. Likewise, the counter becomes one
at the end of unparking, but the wakeref is taken first. If we check
the wakeref instead of the counter, intel_wakeref_active() covers the
parking/unparking windows as well, and we do not spuriously declare the
device inactive if we fail to park (i.e. the parking and wakeref drop
are postponed).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_wakeref.c | 20 +++++++++++++++++---
 drivers/gpu/drm/i915/intel_wakeref.h |  2 +-
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
index 1f94bc4ff9e4..91196d9612bb 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.c
+++ b/drivers/gpu/drm/i915/intel_wakeref.c
@@ -7,6 +7,19 @@
 #include "intel_drv.h"
 #include "intel_wakeref.h"
 
+static void rpm_get(struct drm_i915_private *i915, struct intel_wakeref *wf)
+{
+	wf->wakeref = intel_runtime_pm_get(i915);
+}
+
+static void rpm_put(struct drm_i915_private *i915, struct intel_wakeref *wf)
+{
+	intel_wakeref_t wakeref = fetch_and_zero(&wf->wakeref);
+
+	intel_runtime_pm_put(i915, wakeref);
+	GEM_BUG_ON(!wakeref);
+}
+
 int __intel_wakeref_get_first(struct drm_i915_private *i915,
 			      struct intel_wakeref *wf,
 			      int (*fn)(struct intel_wakeref *wf))
@@ -21,11 +34,11 @@ int __intel_wakeref_get_first(struct drm_i915_private *i915,
 	if (!atomic_read(&wf->count)) {
 		int err;
 
-		wf->wakeref = intel_runtime_pm_get(i915);
+		rpm_get(i915, wf);
 
 		err = fn(wf);
 		if (unlikely(err)) {
-			intel_runtime_pm_put(i915, wf->wakeref);
+			rpm_put(i915, wf);
 			mutex_unlock(&wf->mutex);
 			return err;
 		}
@@ -46,7 +59,7 @@ int __intel_wakeref_put_last(struct drm_i915_private *i915,
 
 	err = fn(wf);
 	if (likely(!err))
-		intel_runtime_pm_put(i915, wf->wakeref);
+		rpm_put(i915, wf);
 	else
 		atomic_inc(&wf->count);
 	mutex_unlock(&wf->mutex);
@@ -58,4 +71,5 @@ void __intel_wakeref_init(struct intel_wakeref *wf, struct lock_class_key *key)
 {
 	__mutex_init(&wf->mutex, "wakeref", key);
 	atomic_set(&wf->count, 0);
+	wf->wakeref = 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_wakeref.h b/drivers/gpu/drm/i915/intel_wakeref.h
index a979d638344b..db742291211c 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.h
+++ b/drivers/gpu/drm/i915/intel_wakeref.h
@@ -127,7 +127,7 @@ intel_wakeref_unlock(struct intel_wakeref *wf)
 static inline bool
 intel_wakeref_active(struct intel_wakeref *wf)
 {
-	return atomic_read(&wf->count);
+	return READ_ONCE(wf->wakeref);
 }
 
 #endif /* INTEL_WAKEREF_H */
-- 
2.20.1

* [PATCH 03/13] drm/i915: Assert the local engine->wakeref is active
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
  2019-05-03 11:52 ` [PATCH 02/13] drm/i915: Prefer checking the wakeref itself rather than the counter Chris Wilson
@ 2019-05-03 11:52 ` Chris Wilson
  2019-05-07 10:52   ` Tvrtko Ursulin
  2019-05-03 11:52 ` [PATCH 04/13] drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD Chris Wilson
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx

Due to the asynchronous tasklet and recursive GT wakeref, it may happen
that we submit to the engine (underneath its own wakeref) prior to the
central wakeref being marked as taken. Switch to checking the local
wakeref for greater consistency.

Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 3 +++
 drivers/gpu/drm/i915/gt/intel_lrc.c       | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 5907a9613641..416d7e2e6f8c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1090,6 +1090,9 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 	if (i915_reset_failed(engine->i915))
 		return true;
 
+	if (!intel_wakeref_active(&engine->wakeref))
+		return true;
+
 	/* Waiting to drain ELSP? */
 	if (READ_ONCE(engine->execlists.active)) {
 		struct tasklet_struct *t = &engine->execlists.tasklet;
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 7d69d07490e8..5580b6f1aa0c 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -535,7 +535,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 	 * that all ELSP are drained i.e. we have processed the CSB,
 	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
 	 */
-	GEM_BUG_ON(!engine->i915->gt.awake);
+	GEM_BUG_ON(!intel_wakeref_active(&engine->wakeref));
 
 	/*
 	 * ELSQ note: the submit queue is not cleared after being submitted
@@ -1085,7 +1085,7 @@ static void execlists_submission_tasklet(unsigned long data)
 
 	GEM_TRACE("%s awake?=%d, active=%x\n",
 		  engine->name,
-		  !!engine->i915->gt.awake,
+		  !!intel_wakeref_active(&engine->wakeref),
 		  engine->execlists.active);
 
 	spin_lock_irqsave(&engine->timeline.lock, flags);
-- 
2.20.1

* [PATCH 04/13] drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
  2019-05-03 11:52 ` [PATCH 02/13] drm/i915: Prefer checking the wakeref itself rather than the counter Chris Wilson
  2019-05-03 11:52 ` [PATCH 03/13] drm/i915: Assert the local engine->wakeref is active Chris Wilson
@ 2019-05-03 11:52 ` Chris Wilson
  2019-05-03 11:52 ` [PATCH 05/13] drm/i915: Remove delay for idle_work Chris Wilson
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx

After realising we need to sample RING_START to detect context switches
from preemption events that do not allow for the seqno to advance, we
can also realise that the seqno itself is just a distance along the ring
and so can be replaced by sampling RING_HEAD.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine.h       | 15 ---------
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  5 ++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  3 +-
 drivers/gpu/drm/i915/gt/intel_hangcheck.c    |  8 ++---
 drivers/gpu/drm/i915/gt/intel_lrc.c          | 11 -------
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c   | 32 ++------------------
 drivers/gpu/drm/i915/i915_debugfs.c          | 12 ++------
 7 files changed, 12 insertions(+), 74 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index f5b0f27cecb6..9e2a183a351c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -233,8 +233,6 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
  */
 #define I915_GEM_HWS_PREEMPT		0x32
 #define I915_GEM_HWS_PREEMPT_ADDR	(I915_GEM_HWS_PREEMPT * sizeof(u32))
-#define I915_GEM_HWS_HANGCHECK		0x34
-#define I915_GEM_HWS_HANGCHECK_ADDR	(I915_GEM_HWS_HANGCHECK * sizeof(u32))
 #define I915_GEM_HWS_SEQNO		0x40
 #define I915_GEM_HWS_SEQNO_ADDR		(I915_GEM_HWS_SEQNO * sizeof(u32))
 #define I915_GEM_HWS_SCRATCH		0x80
@@ -566,17 +564,4 @@ static inline bool inject_preempt_hang(struct intel_engine_execlists *execlists)
 
 #endif
 
-static inline u32
-intel_engine_next_hangcheck_seqno(struct intel_engine_cs *engine)
-{
-	return engine->hangcheck.next_seqno =
-		next_pseudo_random32(engine->hangcheck.next_seqno);
-}
-
-static inline u32
-intel_engine_get_hangcheck_seqno(struct intel_engine_cs *engine)
-{
-	return intel_read_status_page(engine, I915_GEM_HWS_HANGCHECK);
-}
-
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 416d7e2e6f8c..4c3753c1b573 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -721,6 +721,7 @@ static int measure_breadcrumb_dw(struct intel_engine_cs *engine)
 		goto out_timeline;
 
 	dw = engine->emit_fini_breadcrumb(&frame->rq, frame->cs) - frame->cs;
+	GEM_BUG_ON(dw & 1); /* RING_TAIL must be qword aligned */
 
 	i915_timeline_unpin(&frame->timeline);
 
@@ -1444,9 +1445,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 		drm_printf(m, "*** WEDGED ***\n");
 
 	drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
-	drm_printf(m, "\tHangcheck %x:%x [%d ms]\n",
-		   engine->hangcheck.last_seqno,
-		   engine->hangcheck.next_seqno,
+	drm_printf(m, "\tHangcheck: %d ms ago\n",
 		   jiffies_to_msecs(jiffies - engine->hangcheck.action_timestamp));
 	drm_printf(m, "\tReset count: %d (global %d)\n",
 		   i915_reset_engine_count(error, engine),
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index c0ab11b12e14..e381c1c73902 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -54,8 +54,7 @@ struct intel_instdone {
 struct intel_engine_hangcheck {
 	u64 acthd;
 	u32 last_ring;
-	u32 last_seqno;
-	u32 next_seqno;
+	u32 last_head;
 	unsigned long action_timestamp;
 	struct intel_instdone instdone;
 };
diff --git a/drivers/gpu/drm/i915/gt/intel_hangcheck.c b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
index 721ab74a382f..3a4d09b80fa0 100644
--- a/drivers/gpu/drm/i915/gt/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
@@ -28,7 +28,7 @@
 struct hangcheck {
 	u64 acthd;
 	u32 ring;
-	u32 seqno;
+	u32 head;
 	enum intel_engine_hangcheck_action action;
 	unsigned long action_timestamp;
 	int deadlock;
@@ -134,16 +134,16 @@ static void hangcheck_load_sample(struct intel_engine_cs *engine,
 				  struct hangcheck *hc)
 {
 	hc->acthd = intel_engine_get_active_head(engine);
-	hc->seqno = intel_engine_get_hangcheck_seqno(engine);
 	hc->ring = ENGINE_READ(engine, RING_START);
+	hc->head = ENGINE_READ(engine, RING_HEAD);
 }
 
 static void hangcheck_store_sample(struct intel_engine_cs *engine,
 				   const struct hangcheck *hc)
 {
 	engine->hangcheck.acthd = hc->acthd;
-	engine->hangcheck.last_seqno = hc->seqno;
 	engine->hangcheck.last_ring = hc->ring;
+	engine->hangcheck.last_head = hc->head;
 }
 
 static enum intel_engine_hangcheck_action
@@ -156,7 +156,7 @@ hangcheck_get_action(struct intel_engine_cs *engine,
 	if (engine->hangcheck.last_ring != hc->ring)
 		return ENGINE_ACTIVE_SEQNO;
 
-	if (engine->hangcheck.last_seqno != hc->seqno)
+	if (engine->hangcheck.last_head != hc->head)
 		return ENGINE_ACTIVE_SEQNO;
 
 	return engine_stuck(engine, hc->acthd);
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 5580b6f1aa0c..90900c7d7058 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2273,12 +2273,6 @@ static u32 *gen8_emit_fini_breadcrumb(struct i915_request *request, u32 *cs)
 				  request->timeline->hwsp_offset,
 				  0);
 
-	cs = gen8_emit_ggtt_write(cs,
-				  intel_engine_next_hangcheck_seqno(request->engine),
-				  I915_GEM_HWS_HANGCHECK_ADDR,
-				  MI_FLUSH_DW_STORE_INDEX);
-
-
 	*cs++ = MI_USER_INTERRUPT;
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
 
@@ -2299,11 +2293,6 @@ static u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs)
 				      PIPE_CONTROL_FLUSH_ENABLE |
 				      PIPE_CONTROL_CS_STALL);
 
-	cs = gen8_emit_ggtt_write_rcs(cs,
-				      intel_engine_next_hangcheck_seqno(request->engine),
-				      I915_GEM_HWS_HANGCHECK_ADDR,
-				      PIPE_CONTROL_STORE_DATA_INDEX);
-
 	*cs++ = MI_USER_INTERRUPT;
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
index 6fbc5ddbc896..f0d60affdba3 100644
--- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
@@ -309,11 +309,6 @@ static u32 *gen6_rcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 	*cs++ = rq->timeline->hwsp_offset | PIPE_CONTROL_GLOBAL_GTT;
 	*cs++ = rq->fence.seqno;
 
-	*cs++ = GFX_OP_PIPE_CONTROL(4);
-	*cs++ = PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_STORE_DATA_INDEX;
-	*cs++ = I915_GEM_HWS_HANGCHECK_ADDR | PIPE_CONTROL_GLOBAL_GTT;
-	*cs++ = intel_engine_next_hangcheck_seqno(rq->engine);
-
 	*cs++ = MI_USER_INTERRUPT;
 	*cs++ = MI_NOOP;
 
@@ -415,13 +410,6 @@ static u32 *gen7_rcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 	*cs++ = rq->timeline->hwsp_offset;
 	*cs++ = rq->fence.seqno;
 
-	*cs++ = GFX_OP_PIPE_CONTROL(4);
-	*cs++ = (PIPE_CONTROL_QW_WRITE |
-		 PIPE_CONTROL_STORE_DATA_INDEX |
-		 PIPE_CONTROL_GLOBAL_GTT_IVB);
-	*cs++ = I915_GEM_HWS_HANGCHECK_ADDR;
-	*cs++ = intel_engine_next_hangcheck_seqno(rq->engine);
-
 	*cs++ = MI_USER_INTERRUPT;
 	*cs++ = MI_NOOP;
 
@@ -440,12 +428,7 @@ static u32 *gen6_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 	*cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
 	*cs++ = rq->fence.seqno;
 
-	*cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
-	*cs++ = I915_GEM_HWS_HANGCHECK_ADDR | MI_FLUSH_DW_USE_GTT;
-	*cs++ = intel_engine_next_hangcheck_seqno(rq->engine);
-
 	*cs++ = MI_USER_INTERRUPT;
-	*cs++ = MI_NOOP;
 
 	rq->tail = intel_ring_offset(rq, cs);
 	assert_ring_tail_valid(rq->ring, rq->tail);
@@ -465,10 +448,6 @@ static u32 *gen7_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 	*cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
 	*cs++ = rq->fence.seqno;
 
-	*cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
-	*cs++ = I915_GEM_HWS_HANGCHECK_ADDR | MI_FLUSH_DW_USE_GTT;
-	*cs++ = intel_engine_next_hangcheck_seqno(rq->engine);
-
 	for (i = 0; i < GEN7_XCS_WA; i++) {
 		*cs++ = MI_STORE_DWORD_INDEX;
 		*cs++ = I915_GEM_HWS_SEQNO_ADDR;
@@ -480,6 +459,7 @@ static u32 *gen7_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 	*cs++ = 0;
 
 	*cs++ = MI_USER_INTERRUPT;
+	*cs++ = MI_NOOP;
 
 	rq->tail = intel_ring_offset(rq, cs);
 	assert_ring_tail_valid(rq->ring, rq->tail);
@@ -928,11 +908,8 @@ static u32 *i9xx_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 	*cs++ = I915_GEM_HWS_SEQNO_ADDR;
 	*cs++ = rq->fence.seqno;
 
-	*cs++ = MI_STORE_DWORD_INDEX;
-	*cs++ = I915_GEM_HWS_HANGCHECK_ADDR;
-	*cs++ = intel_engine_next_hangcheck_seqno(rq->engine);
-
 	*cs++ = MI_USER_INTERRUPT;
+	*cs++ = MI_NOOP;
 
 	rq->tail = intel_ring_offset(rq, cs);
 	assert_ring_tail_valid(rq->ring, rq->tail);
@@ -950,10 +927,6 @@ static u32 *gen5_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 
 	*cs++ = MI_FLUSH;
 
-	*cs++ = MI_STORE_DWORD_INDEX;
-	*cs++ = I915_GEM_HWS_HANGCHECK_ADDR;
-	*cs++ = intel_engine_next_hangcheck_seqno(rq->engine);
-
 	BUILD_BUG_ON(GEN5_WA_STORES < 1);
 	for (i = 0; i < GEN5_WA_STORES; i++) {
 		*cs++ = MI_STORE_DWORD_INDEX;
@@ -962,7 +935,6 @@ static u32 *gen5_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 	}
 
 	*cs++ = MI_USER_INTERRUPT;
-	*cs++ = MI_NOOP;
 
 	rq->tail = intel_ring_offset(rq, cs);
 	assert_ring_tail_valid(rq->ring, rq->tail);
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 14cd83e9ea8b..da4fb9f34d27 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1288,7 +1288,6 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct intel_engine_cs *engine;
 	u64 acthd[I915_NUM_ENGINES];
-	u32 seqno[I915_NUM_ENGINES];
 	struct intel_instdone instdone;
 	intel_wakeref_t wakeref;
 	enum intel_engine_id id;
@@ -1305,10 +1304,8 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 	}
 
 	with_intel_runtime_pm(dev_priv, wakeref) {
-		for_each_engine(engine, dev_priv, id) {
+		for_each_engine(engine, dev_priv, id)
 			acthd[id] = intel_engine_get_active_head(engine);
-			seqno[id] = intel_engine_get_hangcheck_seqno(engine);
-		}
 
 		intel_engine_get_instdone(dev_priv->engine[RCS0], &instdone);
 	}
@@ -1325,11 +1322,8 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 	seq_printf(m, "GT active? %s\n", yesno(dev_priv->gt.awake));
 
 	for_each_engine(engine, dev_priv, id) {
-		seq_printf(m, "%s:\n", engine->name);
-		seq_printf(m, "\tseqno = %x [current %x, last %x], %dms ago\n",
-			   engine->hangcheck.last_seqno,
-			   seqno[id],
-			   engine->hangcheck.next_seqno,
+		seq_printf(m, "%s: %d ms ago\n",
+			   engine->name,
 			   jiffies_to_msecs(jiffies -
 					    engine->hangcheck.action_timestamp));
 
-- 
2.20.1

* [PATCH 05/13] drm/i915: Remove delay for idle_work
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (2 preceding siblings ...)
  2019-05-03 11:52 ` [PATCH 04/13] drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD Chris Wilson
@ 2019-05-03 11:52 ` Chris Wilson
  2019-05-07 10:54   ` Tvrtko Ursulin
  2019-05-03 11:52 ` [PATCH 06/13] drm/i915: Cancel retire_worker on parking Chris Wilson
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx

The original intent for the delay before running the idle_work was to
provide a hysteresis to avoid ping-ponging the device runtime-pm. Since
then we have also pulled in some memory management and general device
management for parking. But with the inversion of the wakeref handling,
GEM is no longer responsible for the wakeref and by the time we call the
idle_work, the device is asleep. It seems appropriate now to drop the
delay and just run the worker immediately to flush the cached GEM state
before sleeping.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c           |  2 +-
 drivers/gpu/drm/i915/i915_drv.h               |  2 +-
 drivers/gpu/drm/i915/i915_gem_pm.c            | 21 +++++++------------
 .../gpu/drm/i915/selftests/i915_gem_object.c  |  2 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  4 ++--
 5 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index da4fb9f34d27..674c8c936057 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3931,8 +3931,8 @@ i915_drop_caches_set(void *data, u64 val)
 	if (val & DROP_IDLE) {
 		do {
 			flush_delayed_work(&i915->gem.retire_work);
-			drain_delayed_work(&i915->gem.idle_work);
 		} while (READ_ONCE(i915->gt.awake));
+		flush_work(&i915->gem.idle_work);
 	}
 
 	if (val & DROP_FREED)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 64fa353a62bb..2bf518fea36e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2031,7 +2031,7 @@ struct drm_i915_private {
 		 * arrive within a small period of time, we fire
 		 * off the idle_work.
 		 */
-		struct delayed_work idle_work;
+		struct work_struct idle_work;
 	} gem;
 
 	/* For i945gm vblank irq vs. C3 workaround */
diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
index 49b0ce594f20..ae91ad7cb31e 100644
--- a/drivers/gpu/drm/i915/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/i915_gem_pm.c
@@ -29,12 +29,12 @@ static void i915_gem_park(struct drm_i915_private *i915)
 static void idle_work_handler(struct work_struct *work)
 {
 	struct drm_i915_private *i915 =
-		container_of(work, typeof(*i915), gem.idle_work.work);
+		container_of(work, typeof(*i915), gem.idle_work);
 
 	mutex_lock(&i915->drm.struct_mutex);
 
 	intel_wakeref_lock(&i915->gt.wakeref);
-	if (!intel_wakeref_active(&i915->gt.wakeref))
+	if (!intel_wakeref_active(&i915->gt.wakeref) && !work_pending(work))
 		i915_gem_park(i915);
 	intel_wakeref_unlock(&i915->gt.wakeref);
 
@@ -74,9 +74,7 @@ static int pm_notifier(struct notifier_block *nb,
 		break;
 
 	case INTEL_GT_PARK:
-		mod_delayed_work(i915->wq,
-				 &i915->gem.idle_work,
-				 msecs_to_jiffies(100));
+		queue_work(i915->wq, &i915->gem.idle_work);
 		break;
 	}
 
@@ -142,16 +140,11 @@ void i915_gem_suspend(struct drm_i915_private *i915)
 	 * Assert that we successfully flushed all the work and
 	 * reset the GPU back to its idle, low power state.
 	 */
-	GEM_BUG_ON(i915->gt.awake);
-	cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
-
 	drain_delayed_work(&i915->gem.retire_work);
+	GEM_BUG_ON(i915->gt.awake);
+	flush_work(&i915->gem.idle_work);
 
-	/*
-	 * As the idle_work is rearming if it detects a race, play safe and
-	 * repeat the flush until it is definitely idle.
-	 */
-	drain_delayed_work(&i915->gem.idle_work);
+	cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
 
 	i915_gem_drain_freed_objects(i915);
 
@@ -242,7 +235,7 @@ void i915_gem_resume(struct drm_i915_private *i915)
 
 void i915_gem_init__pm(struct drm_i915_private *i915)
 {
-	INIT_DELAYED_WORK(&i915->gem.idle_work, idle_work_handler);
+	INIT_WORK(&i915->gem.idle_work, idle_work_handler);
 	INIT_DELAYED_WORK(&i915->gem.retire_work, retire_work_handler);
 
 	i915->gem.pm_notifier.notifier_call = pm_notifier;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index 088b2aa05dcd..b926d1cd165d 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -509,7 +509,7 @@ static void disable_retire_worker(struct drm_i915_private *i915)
 	intel_gt_pm_get(i915);
 
 	cancel_delayed_work_sync(&i915->gem.retire_work);
-	cancel_delayed_work_sync(&i915->gem.idle_work);
+	flush_work(&i915->gem.idle_work);
 }
 
 static void restore_retire_worker(struct drm_i915_private *i915)
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index e4033d0576c4..d919f512042c 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -59,7 +59,7 @@ static void mock_device_release(struct drm_device *dev)
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	drain_delayed_work(&i915->gem.retire_work);
-	drain_delayed_work(&i915->gem.idle_work);
+	flush_work(&i915->gem.idle_work);
 	i915_gem_drain_workqueue(i915);
 
 	mutex_lock(&i915->drm.struct_mutex);
@@ -195,7 +195,7 @@ struct drm_i915_private *mock_gem_device(void)
 	mock_init_contexts(i915);
 
 	INIT_DELAYED_WORK(&i915->gem.retire_work, mock_retire_work_handler);
-	INIT_DELAYED_WORK(&i915->gem.idle_work, mock_idle_work_handler);
+	INIT_WORK(&i915->gem.idle_work, mock_idle_work_handler);
 
 	i915->gt.awake = true;
 
-- 
2.20.1

* [PATCH 06/13] drm/i915: Cancel retire_worker on parking
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (3 preceding siblings ...)
  2019-05-03 11:52 ` [PATCH 05/13] drm/i915: Remove delay for idle_work Chris Wilson
@ 2019-05-03 11:52 ` Chris Wilson
  2019-05-07 10:55   ` Tvrtko Ursulin
  2019-05-03 11:52 ` [PATCH 07/13] drm/i915: Stop spinning for DROP_IDLE (debugfs/i915_drop_caches) Chris Wilson
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx

Replace the racy continuation check within retire_work with a definite
kill-switch on idling. The race was being exposed by gem_concurrent_blit
where the retire_worker would be terminated too early leaving us
spinning in debugfs/i915_drop_caches with nothing flushing the
retirement queue.

That the igt is trying to idle from one child while submitting from
another may also be a contributing factor as to why it runs so slowly...

v2: Use the non-sync version of cancel_delayed_work(), we only need to
stop it from being scheduled as we independently check whether now is
the right time to be parking.

Testcase: igt/gem_concurrent_blit
Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_pm.c             | 18 ++++++++++++------
 .../gpu/drm/i915/selftests/mock_gem_device.c   |  1 -
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
index ae91ad7cb31e..fa9c2ebd966a 100644
--- a/drivers/gpu/drm/i915/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/i915_gem_pm.c
@@ -30,15 +30,23 @@ static void idle_work_handler(struct work_struct *work)
 {
 	struct drm_i915_private *i915 =
 		container_of(work, typeof(*i915), gem.idle_work);
+	bool restart = true;
 
+	cancel_delayed_work(&i915->gem.retire_work);
 	mutex_lock(&i915->drm.struct_mutex);
 
 	intel_wakeref_lock(&i915->gt.wakeref);
-	if (!intel_wakeref_active(&i915->gt.wakeref) && !work_pending(work))
+	if (!intel_wakeref_active(&i915->gt.wakeref) && !work_pending(work)) {
 		i915_gem_park(i915);
+		restart = false;
+	}
 	intel_wakeref_unlock(&i915->gt.wakeref);
 
 	mutex_unlock(&i915->drm.struct_mutex);
+	if (restart)
+		queue_delayed_work(i915->wq,
+				   &i915->gem.retire_work,
+				   round_jiffies_up_relative(HZ));
 }
 
 static void retire_work_handler(struct work_struct *work)
@@ -52,10 +60,9 @@ static void retire_work_handler(struct work_struct *work)
 		mutex_unlock(&i915->drm.struct_mutex);
 	}
 
-	if (intel_wakeref_active(&i915->gt.wakeref))
-		queue_delayed_work(i915->wq,
-				   &i915->gem.retire_work,
-				   round_jiffies_up_relative(HZ));
+	queue_delayed_work(i915->wq,
+			   &i915->gem.retire_work,
+			   round_jiffies_up_relative(HZ));
 }
 
 static int pm_notifier(struct notifier_block *nb,
@@ -140,7 +147,6 @@ void i915_gem_suspend(struct drm_i915_private *i915)
 	 * Assert that we successfully flushed all the work and
 	 * reset the GPU back to its idle, low power state.
 	 */
-	drain_delayed_work(&i915->gem.retire_work);
 	GEM_BUG_ON(i915->gt.awake);
 	flush_work(&i915->gem.idle_work);
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index d919f512042c..9fd02025d382 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -58,7 +58,6 @@ static void mock_device_release(struct drm_device *dev)
 	i915_gem_contexts_lost(i915);
 	mutex_unlock(&i915->drm.struct_mutex);
 
-	drain_delayed_work(&i915->gem.retire_work);
 	flush_work(&i915->gem.idle_work);
 	i915_gem_drain_workqueue(i915);
 
-- 
2.20.1

* [PATCH 07/13] drm/i915: Stop spinning for DROP_IDLE (debugfs/i915_drop_caches)
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (4 preceding siblings ...)
  2019-05-03 11:52 ` [PATCH 06/13] drm/i915: Cancel retire_worker on parking Chris Wilson
@ 2019-05-03 11:52 ` Chris Wilson
  2019-05-03 11:52 ` [PATCH 08/13] drm/i915: Only reschedule the submission tasklet if preemption is possible Chris Wilson
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx

If the user is racing a call to debugfs/i915_drop_caches with ongoing
submission from another thread/process, we may never end up idling the
GPU and be uninterruptibly spinning in debugfs/i915_drop_caches trying
to catch an idle moment.

Just flush the work once; that should be enough to park the system under
correct conditions. Outside of those we either have a driver bug or the
user is racing themselves. Sadly, because the user may be provoking the
unwanted situation we can't put a warn here to attract attention to a
probable bug.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 674c8c936057..ca6e12193470 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3929,9 +3929,7 @@ i915_drop_caches_set(void *data, u64 val)
 	fs_reclaim_release(GFP_KERNEL);
 
 	if (val & DROP_IDLE) {
-		do {
-			flush_delayed_work(&i915->gem.retire_work);
-		} while (READ_ONCE(i915->gt.awake));
+		flush_delayed_work(&i915->gem.retire_work);
 		flush_work(&i915->gem.idle_work);
 	}
 
-- 
2.20.1


* [PATCH 08/13] drm/i915: Only reschedule the submission tasklet if preemption is possible
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (5 preceding siblings ...)
  2019-05-03 11:52 ` [PATCH 07/13] drm/i915: Stop spinning for DROP_IDLE (debugfs/i915_drop_caches) Chris Wilson
@ 2019-05-03 11:52 ` Chris Wilson
  2019-05-07 11:53   ` Tvrtko Ursulin
  2019-05-03 11:52 ` [PATCH 09/13] drm/i915/execlists: Don't apply priority boost for resets Chris Wilson
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx

If we couple the scheduler more tightly with the execlists policy, we
can apply the preemption policy to the question of whether we need to
kick the tasklet at all for this priority bump.

v2: Rephrase it as a core i915 policy and not an execlists foible.
v3: Pull the kick together.
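As a quick sanity check of the policy, the predicate this series centralises
can be modelled as a standalone function (the priority values below are
illustrative stand-ins, not the driver's actual I915_PRIORITY_* definitions):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative priority levels; the driver derives these elsewhere. */
#define TOY_PRIORITY_LOW	(-1)
#define TOY_PRIORITY_NORMAL	0
#define TOY_PRIORITY_HIGH	1

/*
 * Model of the preemption policy: preempt only when the queued priority
 * is strictly above both the running priority and the normal floor, so
 * low never preempts low and equal priorities preserve FIFO ordering.
 */
static bool toy_need_preempt(int prio, int active)
{
	int floor = active > TOY_PRIORITY_NORMAL - 1 ?
		    active : TOY_PRIORITY_NORMAL - 1;

	return prio > floor;
}
```

This mirrors the `prio > max(I915_PRIORITY_NORMAL - 1, active)` comparison in
the patch; only the names and constants are assumptions for the sketch.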

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine.h      | 18 ----------
 drivers/gpu/drm/i915/gt/intel_lrc.c         |  4 +--
 drivers/gpu/drm/i915/gt/selftest_lrc.c      |  7 +++-
 drivers/gpu/drm/i915/i915_request.c         |  2 --
 drivers/gpu/drm/i915/i915_scheduler.c       | 37 ++++++++++++---------
 drivers/gpu/drm/i915/i915_scheduler.h       | 18 ++++++++++
 drivers/gpu/drm/i915/intel_guc_submission.c |  3 +-
 7 files changed, 50 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 9e2a183a351c..9359b3a7ad9c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -106,24 +106,6 @@ hangcheck_action_to_str(const enum intel_engine_hangcheck_action a)
 
 void intel_engines_set_scheduler_caps(struct drm_i915_private *i915);
 
-static inline bool __execlists_need_preempt(int prio, int last)
-{
-	/*
-	 * Allow preemption of low -> normal -> high, but we do
-	 * not allow low priority tasks to preempt other low priority
-	 * tasks under the impression that latency for low priority
-	 * tasks does not matter (as much as background throughput),
-	 * so kiss.
-	 *
-	 * More naturally we would write
-	 *	prio >= max(0, last);
-	 * except that we wish to prevent triggering preemption at the same
-	 * priority level: the task that is running should remain running
-	 * to preserve FIFO ordering of dependencies.
-	 */
-	return prio > max(I915_PRIORITY_NORMAL - 1, last);
-}
-
 static inline void
 execlists_set_active(struct intel_engine_execlists *execlists,
 		     unsigned int bit)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 90900c7d7058..afcdfc440bbd 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -252,8 +252,8 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
 	 * ourselves, ignore the request.
 	 */
 	last_prio = effective_prio(rq);
-	if (!__execlists_need_preempt(engine->execlists.queue_priority_hint,
-				      last_prio))
+	if (!i915_scheduler_need_preempt(engine->execlists.queue_priority_hint,
+					 last_prio))
 		return false;
 
 	/*
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 84538f69185b..4b042893dc0e 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -638,14 +638,19 @@ static struct i915_request *dummy_request(struct intel_engine_cs *engine)
 	GEM_BUG_ON(i915_request_completed(rq));
 
 	i915_sw_fence_init(&rq->submit, dummy_notify);
-	i915_sw_fence_commit(&rq->submit);
+	set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
 
 	return rq;
 }
 
 static void dummy_request_free(struct i915_request *dummy)
 {
+	/* We have to fake the CS interrupt to kick the next request */
+	i915_sw_fence_commit(&dummy->submit);
+
 	i915_request_mark_complete(dummy);
+	dma_fence_signal(&dummy->fence);
+
 	i915_sched_node_fini(&dummy->sched);
 	i915_sw_fence_fini(&dummy->submit);
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index d06c45305b03..8cb3ed5531e3 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1377,9 +1377,7 @@ long i915_request_wait(struct i915_request *rq,
 	if (flags & I915_WAIT_PRIORITY) {
 		if (!i915_request_started(rq) && INTEL_GEN(rq->i915) >= 6)
 			gen6_rps_boost(rq);
-		local_bh_disable(); /* suspend tasklets for reprioritisation */
 		i915_schedule_bump_priority(rq, I915_PRIORITY_WAIT);
-		local_bh_enable(); /* kick tasklets en masse */
 	}
 
 	wait.tsk = current;
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 39bc4f54e272..fadf0cd9c75a 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -261,16 +261,30 @@ sched_lock_engine(const struct i915_sched_node *node,
 	return engine;
 }
 
-static bool inflight(const struct i915_request *rq,
-		     const struct intel_engine_cs *engine)
+static inline int rq_prio(const struct i915_request *rq)
 {
-	const struct i915_request *active;
+	return rq->sched.attr.priority | __NO_PREEMPTION;
+}
+
+static void kick_submission(struct intel_engine_cs *engine, int prio)
+{
+	const struct i915_request *inflight =
+		port_request(engine->execlists.port);
 
-	if (!i915_request_is_active(rq))
-		return false;
+	if (!inflight) /* currently idle, nothing to do */
+		return;
+
+	/*
+	 * If we are already the currently executing context, don't
+	 * bother evaluating if we should preempt ourselves, or if
+	 * we expect nothing to change as a result of running the
+	 * tasklet, i.e. we have not changed the priority queue
+	 * sufficiently to oust the running context.
+	 */
+	if (!i915_scheduler_need_preempt(prio, rq_prio(inflight)))
+		return;
 
-	active = port_request(engine->execlists.port);
-	return active->hw_context == rq->hw_context;
+	tasklet_hi_schedule(&engine->execlists.tasklet);
 }
 
 static void __i915_schedule(struct i915_request *rq,
@@ -396,15 +410,8 @@ static void __i915_schedule(struct i915_request *rq,
 
 		engine->execlists.queue_priority_hint = prio;
 
-		/*
-		 * If we are already the currently executing context, don't
-		 * bother evaluating if we should preempt ourselves.
-		 */
-		if (inflight(node_to_request(node), engine))
-			continue;
-
 		/* Defer (tasklet) submission until after all of our updates. */
-		tasklet_hi_schedule(&engine->execlists.tasklet);
+		kick_submission(engine, prio);
 	}
 
 	spin_unlock(&engine->timeline.lock);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 07d243acf553..7eefccff39bf 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -52,4 +52,22 @@ static inline void i915_priolist_free(struct i915_priolist *p)
 		__i915_priolist_free(p);
 }
 
+static inline bool i915_scheduler_need_preempt(int prio, int active)
+{
+	/*
+	 * Allow preemption of low -> normal -> high, but we do
+	 * not allow low priority tasks to preempt other low priority
+	 * tasks under the impression that latency for low priority
+	 * tasks does not matter (as much as background throughput),
+	 * so kiss.
+	 *
+	 * More naturally we would write
+	 *	prio >= max(0, last);
+	 * except that we wish to prevent triggering preemption at the same
+	 * priority level: the task that is running should remain running
+	 * to preserve FIFO ordering of dependencies.
+	 */
+	return prio > max(I915_PRIORITY_NORMAL - 1, active);
+}
+
 #endif /* _I915_SCHEDULER_H_ */
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 57ed1dd4ae41..380d83a2bfb6 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -747,7 +747,8 @@ static bool __guc_dequeue(struct intel_engine_cs *engine)
 				&engine->i915->guc.preempt_work[engine->id];
 			int prio = execlists->queue_priority_hint;
 
-			if (__execlists_need_preempt(prio, port_prio(port))) {
+			if (i915_scheduler_need_preempt(prio,
+							port_prio(port))) {
 				execlists_set_active(execlists,
 						     EXECLISTS_ACTIVE_PREEMPT);
 				queue_work(engine->i915->guc.preempt_wq,
-- 
2.20.1


* [PATCH 09/13] drm/i915/execlists: Don't apply priority boost for resets
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (6 preceding siblings ...)
  2019-05-03 11:52 ` [PATCH 08/13] drm/i915: Only reschedule the submission tasklet if preemption is possible Chris Wilson
@ 2019-05-03 11:52 ` Chris Wilson
  2019-05-07 12:04   ` Tvrtko Ursulin
  2019-05-03 11:52 ` [PATCH 10/13] drm/i915: Rearrange i915_scheduler.c Chris Wilson
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx

Do not treat reset as a normal preemption event and avoid giving the
guilty request a priority boost for simply being active at the time of
reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index afcdfc440bbd..6419bcaf1ecc 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -371,11 +371,11 @@ static void unwind_wa_tail(struct i915_request *rq)
 }
 
 static struct i915_request *
-__unwind_incomplete_requests(struct intel_engine_cs *engine)
+__unwind_incomplete_requests(struct intel_engine_cs *engine, int boost)
 {
 	struct i915_request *rq, *rn, *active = NULL;
 	struct list_head *uninitialized_var(pl);
-	int prio = I915_PRIORITY_INVALID | ACTIVE_PRIORITY;
+	int prio = I915_PRIORITY_INVALID | boost;
 
 	lockdep_assert_held(&engine->timeline.lock);
 
@@ -419,8 +419,9 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 	 * in the priority queue, but they will not gain immediate access to
 	 * the GPU.
 	 */
-	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(active)) {
-		prio |= ACTIVE_PRIORITY;
+	if (~prio & boost && __i915_request_has_started(active)) {
+		prio |= boost;
+		GEM_BUG_ON(active->sched.attr.priority >= prio);
 		active->sched.attr.priority = prio;
 		list_move_tail(&active->sched.link,
 			       i915_sched_lookup_priolist(engine, prio));
@@ -435,7 +436,7 @@ execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
 	struct intel_engine_cs *engine =
 		container_of(execlists, typeof(*engine), execlists);
 
-	return __unwind_incomplete_requests(engine);
+	return __unwind_incomplete_requests(engine, 0);
 }
 
 static inline void
@@ -656,7 +657,8 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists)
 	execlists_cancel_port_requests(execlists);
 	__unwind_incomplete_requests(container_of(execlists,
 						  struct intel_engine_cs,
-						  execlists));
+						  execlists),
+				     ACTIVE_PRIORITY);
 }
 
 static void execlists_dequeue(struct intel_engine_cs *engine)
@@ -1909,7 +1911,7 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
 	execlists_cancel_port_requests(execlists);
 
 	/* Push back any incomplete requests for replay after the reset. */
-	rq = __unwind_incomplete_requests(engine);
+	rq = __unwind_incomplete_requests(engine, 0);
 	if (!rq)
 		goto out_replay;
 
-- 
2.20.1


* [PATCH 10/13] drm/i915: Rearrange i915_scheduler.c
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (7 preceding siblings ...)
  2019-05-03 11:52 ` [PATCH 09/13] drm/i915/execlists: Don't apply priority boost for resets Chris Wilson
@ 2019-05-03 11:52 ` Chris Wilson
  2019-05-07 12:06   ` Tvrtko Ursulin
  2019-05-03 11:52 ` [PATCH 11/13] drm/i915: Pass i915_sched_node around internally Chris Wilson
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx

To avoid pulling in a forward declaration in the next patch, move the
i915_sched_node handling to after the main DFS of the scheduler.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_scheduler.c | 210 +++++++++++++-------------
 1 file changed, 105 insertions(+), 105 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index fadf0cd9c75a..4a95cf2201a7 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -35,109 +35,6 @@ static inline bool node_signaled(const struct i915_sched_node *node)
 	return i915_request_completed(node_to_request(node));
 }
 
-void i915_sched_node_init(struct i915_sched_node *node)
-{
-	INIT_LIST_HEAD(&node->signalers_list);
-	INIT_LIST_HEAD(&node->waiters_list);
-	INIT_LIST_HEAD(&node->link);
-	node->attr.priority = I915_PRIORITY_INVALID;
-	node->semaphores = 0;
-	node->flags = 0;
-}
-
-static struct i915_dependency *
-i915_dependency_alloc(void)
-{
-	return kmem_cache_alloc(global.slab_dependencies, GFP_KERNEL);
-}
-
-static void
-i915_dependency_free(struct i915_dependency *dep)
-{
-	kmem_cache_free(global.slab_dependencies, dep);
-}
-
-bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
-				      struct i915_sched_node *signal,
-				      struct i915_dependency *dep,
-				      unsigned long flags)
-{
-	bool ret = false;
-
-	spin_lock_irq(&schedule_lock);
-
-	if (!node_signaled(signal)) {
-		INIT_LIST_HEAD(&dep->dfs_link);
-		list_add(&dep->wait_link, &signal->waiters_list);
-		list_add(&dep->signal_link, &node->signalers_list);
-		dep->signaler = signal;
-		dep->flags = flags;
-
-		/* Keep track of whether anyone on this chain has a semaphore */
-		if (signal->flags & I915_SCHED_HAS_SEMAPHORE_CHAIN &&
-		    !node_started(signal))
-			node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
-
-		ret = true;
-	}
-
-	spin_unlock_irq(&schedule_lock);
-
-	return ret;
-}
-
-int i915_sched_node_add_dependency(struct i915_sched_node *node,
-				   struct i915_sched_node *signal)
-{
-	struct i915_dependency *dep;
-
-	dep = i915_dependency_alloc();
-	if (!dep)
-		return -ENOMEM;
-
-	if (!__i915_sched_node_add_dependency(node, signal, dep,
-					      I915_DEPENDENCY_ALLOC))
-		i915_dependency_free(dep);
-
-	return 0;
-}
-
-void i915_sched_node_fini(struct i915_sched_node *node)
-{
-	struct i915_dependency *dep, *tmp;
-
-	GEM_BUG_ON(!list_empty(&node->link));
-
-	spin_lock_irq(&schedule_lock);
-
-	/*
-	 * Everyone we depended upon (the fences we wait to be signaled)
-	 * should retire before us and remove themselves from our list.
-	 * However, retirement is run independently on each timeline and
-	 * so we may be called out-of-order.
-	 */
-	list_for_each_entry_safe(dep, tmp, &node->signalers_list, signal_link) {
-		GEM_BUG_ON(!node_signaled(dep->signaler));
-		GEM_BUG_ON(!list_empty(&dep->dfs_link));
-
-		list_del(&dep->wait_link);
-		if (dep->flags & I915_DEPENDENCY_ALLOC)
-			i915_dependency_free(dep);
-	}
-
-	/* Remove ourselves from everyone who depends upon us */
-	list_for_each_entry_safe(dep, tmp, &node->waiters_list, wait_link) {
-		GEM_BUG_ON(dep->signaler != node);
-		GEM_BUG_ON(!list_empty(&dep->dfs_link));
-
-		list_del(&dep->signal_link);
-		if (dep->flags & I915_DEPENDENCY_ALLOC)
-			i915_dependency_free(dep);
-	}
-
-	spin_unlock_irq(&schedule_lock);
-}
-
 static inline struct i915_priolist *to_priolist(struct rb_node *rb)
 {
 	return rb_entry(rb, struct i915_priolist, node);
@@ -239,6 +136,11 @@ i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio)
 	return &p->requests[idx];
 }
 
+void __i915_priolist_free(struct i915_priolist *p)
+{
+	kmem_cache_free(global.slab_priorities, p);
+}
+
 struct sched_cache {
 	struct list_head *priolist;
 };
@@ -443,9 +345,107 @@ void i915_schedule_bump_priority(struct i915_request *rq, unsigned int bump)
 	spin_unlock_irqrestore(&schedule_lock, flags);
 }
 
-void __i915_priolist_free(struct i915_priolist *p)
+void i915_sched_node_init(struct i915_sched_node *node)
 {
-	kmem_cache_free(global.slab_priorities, p);
+	INIT_LIST_HEAD(&node->signalers_list);
+	INIT_LIST_HEAD(&node->waiters_list);
+	INIT_LIST_HEAD(&node->link);
+	node->attr.priority = I915_PRIORITY_INVALID;
+	node->semaphores = 0;
+	node->flags = 0;
+}
+
+static struct i915_dependency *
+i915_dependency_alloc(void)
+{
+	return kmem_cache_alloc(global.slab_dependencies, GFP_KERNEL);
+}
+
+static void
+i915_dependency_free(struct i915_dependency *dep)
+{
+	kmem_cache_free(global.slab_dependencies, dep);
+}
+
+bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
+				      struct i915_sched_node *signal,
+				      struct i915_dependency *dep,
+				      unsigned long flags)
+{
+	bool ret = false;
+
+	spin_lock_irq(&schedule_lock);
+
+	if (!node_signaled(signal)) {
+		INIT_LIST_HEAD(&dep->dfs_link);
+		list_add(&dep->wait_link, &signal->waiters_list);
+		list_add(&dep->signal_link, &node->signalers_list);
+		dep->signaler = signal;
+		dep->flags = flags;
+
+		/* Keep track of whether anyone on this chain has a semaphore */
+		if (signal->flags & I915_SCHED_HAS_SEMAPHORE_CHAIN &&
+		    !node_started(signal))
+			node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
+
+		ret = true;
+	}
+
+	spin_unlock_irq(&schedule_lock);
+
+	return ret;
+}
+
+int i915_sched_node_add_dependency(struct i915_sched_node *node,
+				   struct i915_sched_node *signal)
+{
+	struct i915_dependency *dep;
+
+	dep = i915_dependency_alloc();
+	if (!dep)
+		return -ENOMEM;
+
+	if (!__i915_sched_node_add_dependency(node, signal, dep,
+					      I915_DEPENDENCY_ALLOC))
+		i915_dependency_free(dep);
+
+	return 0;
+}
+
+void i915_sched_node_fini(struct i915_sched_node *node)
+{
+	struct i915_dependency *dep, *tmp;
+
+	GEM_BUG_ON(!list_empty(&node->link));
+
+	spin_lock_irq(&schedule_lock);
+
+	/*
+	 * Everyone we depended upon (the fences we wait to be signaled)
+	 * should retire before us and remove themselves from our list.
+	 * However, retirement is run independently on each timeline and
+	 * so we may be called out-of-order.
+	 */
+	list_for_each_entry_safe(dep, tmp, &node->signalers_list, signal_link) {
+		GEM_BUG_ON(!node_signaled(dep->signaler));
+		GEM_BUG_ON(!list_empty(&dep->dfs_link));
+
+		list_del(&dep->wait_link);
+		if (dep->flags & I915_DEPENDENCY_ALLOC)
+			i915_dependency_free(dep);
+	}
+
+	/* Remove ourselves from everyone who depends upon us */
+	list_for_each_entry_safe(dep, tmp, &node->waiters_list, wait_link) {
+		GEM_BUG_ON(dep->signaler != node);
+		GEM_BUG_ON(!list_empty(&dep->dfs_link));
+
+		list_del(&dep->signal_link);
+		if (dep->flags & I915_DEPENDENCY_ALLOC)
+			i915_dependency_free(dep);
+	}
+
+	spin_unlock_irq(&schedule_lock);
 }
 
 static void i915_global_scheduler_shrink(void)
-- 
2.20.1


* [PATCH 11/13] drm/i915: Pass i915_sched_node around internally
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (8 preceding siblings ...)
  2019-05-03 11:52 ` [PATCH 10/13] drm/i915: Rearrange i915_scheduler.c Chris Wilson
@ 2019-05-03 11:52 ` Chris Wilson
  2019-05-07 12:12   ` Tvrtko Ursulin
  2019-05-03 11:52 ` [PATCH 12/13] drm/i915: Bump signaler priority on adding a waiter Chris Wilson
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx

To simplify the next patch, update bump_priority and schedule to accept
the internal i915_sched_node directly and not expect a request pointer.

add/remove: 0/0 grow/shrink: 2/1 up/down: 8/-15 (-7)
Function                                     old     new   delta
i915_schedule_bump_priority                  109     113      +4
i915_schedule                                 50      54      +4
__i915_schedule                              922     907     -15

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_scheduler.c | 33 +++++++++++++++------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 4a95cf2201a7..380cb7343a10 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -189,7 +189,7 @@ static void kick_submission(struct intel_engine_cs *engine, int prio)
 	tasklet_hi_schedule(&engine->execlists.tasklet);
 }
 
-static void __i915_schedule(struct i915_request *rq,
+static void __i915_schedule(struct i915_sched_node *rq,
 			    const struct i915_sched_attr *attr)
 {
 	struct intel_engine_cs *engine;
@@ -203,13 +203,13 @@ static void __i915_schedule(struct i915_request *rq,
 	lockdep_assert_held(&schedule_lock);
 	GEM_BUG_ON(prio == I915_PRIORITY_INVALID);
 
-	if (i915_request_completed(rq))
+	if (prio <= READ_ONCE(rq->attr.priority))
 		return;
 
-	if (prio <= READ_ONCE(rq->sched.attr.priority))
+	if (node_signaled(rq))
 		return;
 
-	stack.signaler = &rq->sched;
+	stack.signaler = rq;
 	list_add(&stack.dfs_link, &dfs);
 
 	/*
@@ -260,9 +260,9 @@ static void __i915_schedule(struct i915_request *rq,
 	 * execlists_submit_request()), we can set our own priority and skip
 	 * acquiring the engine locks.
 	 */
-	if (rq->sched.attr.priority == I915_PRIORITY_INVALID) {
-		GEM_BUG_ON(!list_empty(&rq->sched.link));
-		rq->sched.attr = *attr;
+	if (rq->attr.priority == I915_PRIORITY_INVALID) {
+		GEM_BUG_ON(!list_empty(&rq->link));
+		rq->attr = *attr;
 
 		if (stack.dfs_link.next == stack.dfs_link.prev)
 			return;
@@ -271,7 +271,7 @@ static void __i915_schedule(struct i915_request *rq,
 	}
 
 	memset(&cache, 0, sizeof(cache));
-	engine = rq->engine;
+	engine = node_to_request(rq)->engine;
 	spin_lock(&engine->timeline.lock);
 
 	/* Fifo and depth-first replacement ensure our deps execute before us */
@@ -322,13 +322,20 @@ static void __i915_schedule(struct i915_request *rq,
 void i915_schedule(struct i915_request *rq, const struct i915_sched_attr *attr)
 {
 	spin_lock_irq(&schedule_lock);
-	__i915_schedule(rq, attr);
+	__i915_schedule(&rq->sched, attr);
 	spin_unlock_irq(&schedule_lock);
 }
 
+static void __bump_priority(struct i915_sched_node *node, unsigned int bump)
+{
+	struct i915_sched_attr attr = node->attr;
+
+	attr.priority |= bump;
+	__i915_schedule(node, &attr);
+}
+
 void i915_schedule_bump_priority(struct i915_request *rq, unsigned int bump)
 {
-	struct i915_sched_attr attr;
 	unsigned long flags;
 
 	GEM_BUG_ON(bump & ~I915_PRIORITY_MASK);
@@ -337,11 +344,7 @@ void i915_schedule_bump_priority(struct i915_request *rq, unsigned int bump)
 		return;
 
 	spin_lock_irqsave(&schedule_lock, flags);
-
-	attr = rq->sched.attr;
-	attr.priority |= bump;
-	__i915_schedule(rq, &attr);
-
+	__bump_priority(&rq->sched, bump);
 	spin_unlock_irqrestore(&schedule_lock, flags);
 }
 
-- 
2.20.1


* [PATCH 12/13] drm/i915: Bump signaler priority on adding a waiter
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (9 preceding siblings ...)
  2019-05-03 11:52 ` [PATCH 11/13] drm/i915: Pass i915_sched_node around internally Chris Wilson
@ 2019-05-03 11:52 ` Chris Wilson
  2019-05-07 12:46   ` Tvrtko Ursulin
  2019-05-03 11:52 ` [PATCH 13/13] drm/i915: Disable semaphore busywaits on saturated systems Chris Wilson
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx

The handling of the no-preemption priority level imposes the restriction
that we need to maintain the implied ordering even though preemption is
disabled. Otherwise we may end up with an AB-BA deadlock across multiple
engines due to a real preemption event reordering the no-preemption
WAITs. To resolve this issue we currently promote all requests to WAIT
on unsubmission; however, this interferes with the timeslicing
requirement that we do not apply any implicit promotion that will defeat
the round-robin timeslice list. (If we automatically promote the active
request it will go back to the head of the queue and not the tail!)

So we need implicit promotion to prevent reordering around semaphores
where we are not allowed to preempt, and we must avoid implicit
promotion on unsubmission. So, instead of at unsubmit, we apply that
implicit promotion on adding the dependency; this avoids the semaphore
deadlock while also reducing the gains made by the promotion for user
space waiting. Furthermore, by keeping the earlier dependencies at a
higher level, we reduce the search space for timeslicing without
altering runtime scheduling too badly (no dependencies at all will be
assigned a higher priority for rrul).

v2: Limit the bump to external edges (as originally intended) i.e.
between contexts and out to the user.

Testcase: igt/gem_concurrent_blit
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_lrc.c      | 12 ++++++++----
 drivers/gpu/drm/i915/i915_request.c         |  9 ---------
 drivers/gpu/drm/i915/i915_scheduler.c       | 11 +++++++++++
 drivers/gpu/drm/i915/i915_scheduler_types.h |  3 ++-
 4 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 4b042893dc0e..5b3d8e33f1cf 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -98,12 +98,14 @@ static int live_busywait_preempt(void *arg)
 	ctx_hi = kernel_context(i915);
 	if (!ctx_hi)
 		goto err_unlock;
-	ctx_hi->sched.priority = INT_MAX;
+	ctx_hi->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
 
 	ctx_lo = kernel_context(i915);
 	if (!ctx_lo)
 		goto err_ctx_hi;
-	ctx_lo->sched.priority = INT_MIN;
+	ctx_lo->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
 
 	obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
 	if (IS_ERR(obj)) {
@@ -958,12 +960,14 @@ static int live_preempt_hang(void *arg)
 	ctx_hi = kernel_context(i915);
 	if (!ctx_hi)
 		goto err_spin_lo;
-	ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
+	ctx_hi->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
 
 	ctx_lo = kernel_context(i915);
 	if (!ctx_lo)
 		goto err_ctx_hi;
-	ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
+	ctx_lo->sched.priority =
+		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
 
 	for_each_engine(engine, i915, id) {
 		struct i915_request *rq;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 8cb3ed5531e3..065da1bcbb4c 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -468,15 +468,6 @@ void __i915_request_unsubmit(struct i915_request *request)
 	/* We may be recursing from the signal callback of another i915 fence */
 	spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
 
-	/*
-	 * As we do not allow WAIT to preempt inflight requests,
-	 * once we have executed a request, along with triggering
-	 * any execution callbacks, we must preserve its ordering
-	 * within the non-preemptible FIFO.
-	 */
-	BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK); /* only internal */
-	request->sched.attr.priority |= __NO_PREEMPTION;
-
 	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags))
 		i915_request_cancel_breadcrumb(request);
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 380cb7343a10..ff0ca5718f97 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -391,6 +391,16 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
 		    !node_started(signal))
 			node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
 
+		/*
+		 * As we do not allow WAIT to preempt inflight requests,
+		 * once we have executed a request, along with triggering
+		 * any execution callbacks, we must preserve its ordering
+		 * within the non-preemptible FIFO.
+		 */
+		BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK);
+		if (flags & I915_DEPENDENCY_EXTERNAL)
+			__bump_priority(signal, __NO_PREEMPTION);
+
 		ret = true;
 	}
 
@@ -409,6 +419,7 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
 		return -ENOMEM;
 
 	if (!__i915_sched_node_add_dependency(node, signal, dep,
+					      I915_DEPENDENCY_EXTERNAL |
 					      I915_DEPENDENCY_ALLOC))
 		i915_dependency_free(dep);
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index 166a457884b2..3e309631bd0b 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -66,7 +66,8 @@ struct i915_dependency {
 	struct list_head wait_link;
 	struct list_head dfs_link;
 	unsigned long flags;
-#define I915_DEPENDENCY_ALLOC BIT(0)
+#define I915_DEPENDENCY_ALLOC		BIT(0)
+#define I915_DEPENDENCY_EXTERNAL	BIT(1)
 };
 
 #endif /* _I915_SCHEDULER_TYPES_H_ */
-- 
2.20.1


* [PATCH 13/13] drm/i915: Disable semaphore busywaits on saturated systems
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (10 preceding siblings ...)
  2019-05-03 11:52 ` [PATCH 12/13] drm/i915: Bump signaler priority on adding a waiter Chris Wilson
@ 2019-05-03 11:52 ` Chris Wilson
  2019-05-03 12:56 ` ✗ Fi.CI.SPARSE: warning for series starting with [01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Patchwork
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 11:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dmitry Ermilov

Asking the GPU to busywait on a memory address, perhaps not unexpectedly
in hindsight for a shared system, leads to bus contention that affects
CPU programs trying to concurrently access memory. This can manifest as
a drop in transcode throughput on highly over-saturated workloads.

The only clue offered by perf is that the bus-cycles (perf stat -e
bus-cycles) jumped by 50% when semaphores were enabled. This corresponds
with extra CPU active cycles being attributed to intel_idle's mwait.

This patch introduces a heuristic to try to detect when more than one
client is submitting to the GPU, pushing it into an oversaturated state.
As we already keep track of when the semaphores are signaled, we can
inspect their state on submitting the busywait batch; if we planned to
use a semaphore but were too late, we conclude that the GPU is
overloaded and stop using semaphores in future requests. In practice,
this means we optimistically try to use semaphores for the first frame
of a transcode job split over multiple engines, fail if there are
multiple clients active, and continue not to use semaphores for the
subsequent frames in the sequence. Periodically, we try to
optimistically switch semaphores back on whenever the client waits to
catch up with the transcode results.

With 1 client, on Broxton J3455, with the relative fps normalized by %cpu:

x no semaphores
+ drm-tip
* throttle
+------------------------------------------------------------------------+
|                                                    *                   |
|                                                    *+                  |
|                                                    **+                 |
|                                                    **+  x              |
|                                x               *  +**+  x              |
|                                x  x       *    *  +***x xx             |
|                                x  x       *    * *+***x *x             |
|                                x  x*   +  *    * *****x *x x           |
|                         +    x xx+x*   + ***   * ********* x   *       |
|                         +    x xx+x*   * *** +** ********* xx  *       |
|    *   +         ++++*  +    x*x****+*+* ***+*************+x*  *       |
|*+ +** *+ + +* + *++****** *xxx**********x***+*****************+*++    *|
|                                   |__________A_____M_____|             |
|                           |_______________A____M_________|             |
|                                 |____________A___M________|            |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 120       2.60475       3.50941       3.31123     3.2143953    0.21117399
+ 120        2.3826       3.57077       3.25101     3.1414161    0.28146407
Difference at 95.0% confidence
	-0.0729792 +/- 0.0629585
	-2.27039% +/- 1.95864%
	(Student's t, pooled s = 0.248814)
* 120       2.35536       3.66713        3.2849     3.2059917    0.24618565
No difference proven at 95.0% confidence

With 10 clients over-saturating the pipeline:

x no semaphores
+ drm-tip
* patch
+------------------------------------------------------------------------+
|                     ++                                        **       |
|                     ++                                        **       |
|                     ++                                        **       |
|                     ++                                        **       |
|                     ++                                    xx ***       |
|                     ++                                    xx ***       |
|                     ++                                    xxx***       |
|                     ++                                    xxx***       |
|                    +++                                    xxx***       |
|                    +++                                    xx****       |
|                    +++                                    xx****       |
|                    +++                                    xx****       |
|                    +++                                    xx****       |
|                    ++++                                   xx****       |
|                   +++++                                   xx****       |
|                   +++++                                 x x******      |
|                  ++++++                                 xxx*******     |
|                  ++++++                                 xxx*******     |
|                  ++++++                                 xxx*******     |
|                  ++++++                                 xx********     |
|                  ++++++                               xxxx********     |
|                  ++++++                               xxxx********     |
|                ++++++++                             xxxxx*********     |
|+ +  +        + ++++++++                           xxx*xx**********x*  *|
|                                                         |__A__|        |
|                 |__AM__|                                               |
|                                                            |__A_|      |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 120       2.47855        2.8972       2.72376     2.7193402   0.074604933
+ 120       1.17367       1.77459       1.71977     1.6966782   0.085850697
Difference at 95.0% confidence
	-1.02266 +/- 0.0203502
	-37.607% +/- 0.748352%
	(Student's t, pooled s = 0.0804246)
* 120       2.57868       3.00821       2.80142     2.7923878   0.058646477
Difference at 95.0% confidence
	0.0730476 +/- 0.0169791
	2.68622% +/- 0.624383%
	(Student's t, pooled s = 0.0671018)

Indicating that we've recovered the regression from enabling semaphores
on this saturated setup, with a hint towards an overall improvement.

Very similar results, of smaller magnitude, are observed on both
Skylake(gt2) and Kabylake(gt4). This may be due to the reduced impact of
bus-cycles: where we see a 50% hit on Broxton, it is only 10% on the big
core, in this particular test.

One observation to make here is that for a greedy client trying to
maximise its own throughput, using semaphores is the right choice. It is
only from the holistic system-wide view, where the semaphores of one
client impact another and reduce the overall throughput, that we would
choose to disable semaphores.

The most noticeable negative impact this has is on the no-op
microbenchmarks, which are also very notable for having no cpu bus load.
In particular, this increases the runtime and energy consumption of
gem_exec_whisper.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_context.c       |  2 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |  3 ++
 drivers/gpu/drm/i915/i915_request.c           | 40 ++++++++++++++++++-
 3 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 1f1761fc6597..5b31e1e05ddd 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -116,6 +116,7 @@ intel_context_init(struct intel_context *ce,
 	ce->engine = engine;
 	ce->ops = engine->cops;
 	ce->sseu = engine->sseu;
+	ce->saturated = 0;
 
 	INIT_LIST_HEAD(&ce->signal_link);
 	INIT_LIST_HEAD(&ce->signals);
@@ -158,6 +159,7 @@ void intel_context_enter_engine(struct intel_context *ce)
 
 void intel_context_exit_engine(struct intel_context *ce)
 {
+	ce->saturated = 0;
 	intel_engine_pm_put(ce->engine);
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index d5a7dbd0daee..963a312430e6 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -13,6 +13,7 @@
 #include <linux/types.h>
 
 #include "i915_active_types.h"
+#include "intel_engine_types.h"
 #include "intel_sseu.h"
 
 struct i915_gem_context;
@@ -52,6 +53,8 @@ struct intel_context {
 	atomic_t pin_count;
 	struct mutex pin_mutex; /* guards pinning and associated on-gpuing */
 
+	intel_engine_mask_t saturated; /* submitting semaphores too late? */
+
 	/**
 	 * active_tracker: Active tracker for the external rq activity
 	 * on this intel_context object.
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 065da1bcbb4c..f06028733242 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -410,6 +410,26 @@ void __i915_request_submit(struct i915_request *request)
 	if (i915_gem_context_is_banned(request->gem_context))
 		i915_request_skip(request, -EIO);
 
+	/*
+	 * Are we using semaphores when the gpu is already saturated?
+	 *
+	 * Using semaphores incurs a cost in having the GPU poll a
+	 * memory location, busywaiting for it to change. The continual
+	 * memory reads can have a noticeable impact on the rest of the
+	 * system with the extra bus traffic, stalling the cpu as it too
+	 * tries to access memory across the bus (perf stat -e bus-cycles).
+	 *
+	 * If we installed a semaphore on this request and we only submit
+	 * the request after the signaler completed, that indicates the
+	 * system is overloaded and using semaphores at this time only
+	 * increases the amount of work we are doing. If so, we disable
+	 * further use of semaphores until we are idle again, whence we
+	 * optimistically try again.
+	 */
+	if (request->sched.semaphores &&
+	    i915_sw_fence_signaled(&request->semaphore))
+		request->hw_context->saturated |= request->sched.semaphores;
+
 	/* We may be recursing from the signal callback of another i915 fence */
 	spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
 
@@ -776,6 +796,24 @@ i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
 					     I915_FENCE_GFP);
 }
 
+static intel_engine_mask_t
+already_busywaiting(struct i915_request *rq)
+{
+	/*
+	 * Polling a semaphore causes bus traffic, delaying other users of
+	 * both the GPU and CPU. We want to limit the impact on others,
+	 * while taking advantage of early submission to reduce GPU
+	 * latency. Therefore we restrict ourselves to not using more
+	 * than one semaphore from each source, and not using a semaphore
+	 * if we have detected the engine is saturated (i.e. would not be
+	 * submitted early and cause bus traffic reading an already passed
+	 * semaphore).
+	 *
+	 * See the are-we-too-late? check in __i915_request_submit().
+	 */
+	return rq->sched.semaphores | rq->hw_context->saturated;
+}
+
 static int
 emit_semaphore_wait(struct i915_request *to,
 		    struct i915_request *from,
@@ -789,7 +827,7 @@ emit_semaphore_wait(struct i915_request *to,
 	GEM_BUG_ON(INTEL_GEN(to->i915) < 8);
 
 	/* Just emit the first semaphore we see as request space is limited. */
-	if (to->sched.semaphores & from->engine->mask)
+	if (already_busywaiting(to) & from->engine->mask)
 		return i915_sw_fence_await_dma_fence(&to->submit,
 						     &from->fence, 0,
 						     I915_FENCE_GFP);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* ✗ Fi.CI.SPARSE: warning for series starting with [01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (11 preceding siblings ...)
  2019-05-03 11:52 ` [PATCH 13/13] drm/i915: Disable semaphore busywaits on saturated systems Chris Wilson
@ 2019-05-03 12:56 ` Patchwork
  2019-05-03 13:18 ` ✗ Fi.CI.BAT: failure " Patchwork
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2019-05-03 12:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler
URL   : https://patchwork.freedesktop.org/series/60257/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915: Assert breadcrumbs are correctly ordered in the signal handler
Okay!

Commit: drm/i915: Prefer checking the wakeref itself rather than the counter
Okay!

Commit: drm/i915: Assert the local engine->wakeref is active
Okay!

Commit: drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD
Okay!

Commit: drm/i915: Remove delay for idle_work
Okay!

Commit: drm/i915: Cancel retire_worker on parking
Okay!

Commit: drm/i915: Stop spinning for DROP_IDLE (debugfs/i915_drop_caches)
Okay!

Commit: drm/i915: Only reschedule the submission tasklet if preemption is possible
-O:drivers/gpu/drm/i915/gt/intel_engine.h:124:23: warning: expression using sizeof(void)
-O:drivers/gpu/drm/i915/gt/intel_engine.h:124:23: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_scheduler.h:70:23: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_scheduler.h:70:23: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_scheduler.h:70:23: warning: expression using sizeof(void)

Commit: drm/i915/execlists: Don't apply priority boost for resets
Okay!

Commit: drm/i915: Rearrange i915_scheduler.c
Okay!

Commit: drm/i915: Pass i915_sched_node around internally
Okay!

Commit: drm/i915: Bump signaler priority on adding a waiter
Okay!

Commit: drm/i915: Disable semaphore busywaits on saturated systems
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)


^ permalink raw reply	[flat|nested] 36+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (12 preceding siblings ...)
  2019-05-03 12:56 ` ✗ Fi.CI.SPARSE: warning for series starting with [01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Patchwork
@ 2019-05-03 13:18 ` Patchwork
  2019-05-03 13:32 ` [PATCH 01/13] " Tvrtko Ursulin
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2019-05-03 13:18 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler
URL   : https://patchwork.freedesktop.org/series/60257/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6038 -> Patchwork_12957
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_12957 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_12957, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12957/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_12957:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live_hangcheck:
    - fi-apl-guc:         NOTRUN -> [FAIL][1] +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12957/fi-apl-guc/igt@i915_selftest@live_hangcheck.html

  
Known issues
------------

  Here are the changes found in Patchwork_12957 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live_hangcheck:
    - fi-icl-y:           [PASS][2] -> [INCOMPLETE][3] ([fdo#107713] / [fdo#108569] / [fdo#110581])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6038/fi-icl-y/igt@i915_selftest@live_hangcheck.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12957/fi-icl-y/igt@i915_selftest@live_hangcheck.html

  
#### Possible fixes ####

  * igt@i915_selftest@live_contexts:
    - fi-bdw-gvtdvm:      [DMESG-FAIL][4] ([fdo#110235]) -> [PASS][5]
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6038/fi-bdw-gvtdvm/igt@i915_selftest@live_contexts.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12957/fi-bdw-gvtdvm/igt@i915_selftest@live_contexts.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - fi-blb-e6850:       [INCOMPLETE][6] ([fdo#107718] / [fdo#110581]) -> [PASS][7]
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6038/fi-blb-e6850/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12957/fi-blb-e6850/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569
  [fdo#110235]: https://bugs.freedesktop.org/show_bug.cgi?id=110235
  [fdo#110581]: https://bugs.freedesktop.org/show_bug.cgi?id=110581


Participating hosts (52 -> 46)
------------------------------

  Additional (2): fi-cfl-guc fi-apl-guc 
  Missing    (8): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_6038 -> Patchwork_12957

  CI_DRM_6038: e53f0aced0dd2d3f2eb32c98a419af87ce23206a @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4972: f052e49a43cc9704ea5f240df15dd9d3dfed68ab @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12957: cabb75f2920af68475b0cc6abe83fa53bc6a4ed5 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

cabb75f2920a drm/i915: Disable semaphore busywaits on saturated systems
9444a1d263e6 drm/i915: Bump signaler priority on adding a waiter
25385cf05c10 drm/i915: Pass i915_sched_node around internally
fa5c579cebdf drm/i915: Rearrange i915_scheduler.c
9fb5d3a33c11 drm/i915/execlists: Don't apply priority boost for resets
e5672eabe995 drm/i915: Only reschedule the submission tasklet if preemption is possible
45d1066b44cc drm/i915: Stop spinning for DROP_IDLE (debugfs/i915_drop_caches)
92a89fd9909c drm/i915: Cancel retire_worker on parking
73b219cd5dc0 drm/i915: Remove delay for idle_work
5d9d9317cac7 drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD
4aa7e9d6d91f drm/i915: Assert the local engine->wakeref is active
6f682e41691d drm/i915: Prefer checking the wakeref itself rather than the counter
f28a2bf53dc9 drm/i915: Assert breadcrumbs are correctly ordered in the signal handler

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12957/

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (13 preceding siblings ...)
  2019-05-03 13:18 ` ✗ Fi.CI.BAT: failure " Patchwork
@ 2019-05-03 13:32 ` Tvrtko Ursulin
  2019-05-03 13:37   ` Chris Wilson
  2019-05-03 15:22 ` [PATCH v2] " Chris Wilson
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-03 13:32 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/05/2019 12:52, Chris Wilson wrote:
> Inside the signal handler, we expect the requests to be ordered by their
> breadcrumb such that no later request may be complete if we find an
> earlier incomplete. Add an assert to check that the next breadcrumb
> should not be logically before the current.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> index 3cbffd400b1b..a6ffb25f72a2 100644
> --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> @@ -99,6 +99,12 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
>   			struct i915_request *rq =
>   				list_entry(pos, typeof(*rq), signal_link);
>   
> +			GEM_BUG_ON(next != &ce->signals &&
> +				   i915_seqno_passed(rq->fence.seqno,
> +						     list_entry(next,
> +								typeof(*rq),
> +								signal_link)->fence.seqno));

I know it's only GEM_BUG_ON, but why check for this in the irq handler?
You don't trust the insertion, or group deletion? Or just because it is
the smallest amount of code to piggy-back on existing iteration?

Regards,

Tvrtko

> +
>   			if (!__request_completed(rq))
>   				break;
>   
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler
  2019-05-03 13:32 ` [PATCH 01/13] " Tvrtko Ursulin
@ 2019-05-03 13:37   ` Chris Wilson
  2019-05-03 13:49     ` Tvrtko Ursulin
  0 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 13:37 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-05-03 14:32:59)
> 
> On 03/05/2019 12:52, Chris Wilson wrote:
> > Inside the signal handler, we expect the requests to be ordered by their
> > breadcrumb such that no later request may be complete if we find an
> > earlier incomplete. Add an assert to check that the next breadcrumb
> > should not be logically before the current.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 6 ++++++
> >   1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> > index 3cbffd400b1b..a6ffb25f72a2 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> > @@ -99,6 +99,12 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
> >                       struct i915_request *rq =
> >                               list_entry(pos, typeof(*rq), signal_link);
> >   
> > +                     GEM_BUG_ON(next != &ce->signals &&
> > +                                i915_seqno_passed(rq->fence.seqno,
> > +                                                  list_entry(next,
> > +                                                             typeof(*rq),
> > +                                                             signal_link)->fence.seqno));
> 
> I know it's only GEM_BUG_ON, but why check for this in the irq handler?
> You don't trust the insertion, or group deletion? Or just because it is
> the smallest amount of code to piggy-back on existing iteration?

At this point, it's documenting the required sorting of ce->signals. The
vulnerable part is list insertion. Would you like something like

check_signal_order(ce, rq)

and use it after insertion as well?

We can do prev/next checking, just to be sure.
-Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler
  2019-05-03 13:37   ` Chris Wilson
@ 2019-05-03 13:49     ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-03 13:49 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/05/2019 14:37, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-03 14:32:59)
>>
>> On 03/05/2019 12:52, Chris Wilson wrote:
>>> Inside the signal handler, we expect the requests to be ordered by their
>>> breadcrumb such that no later request may be complete if we find an
>>> earlier incomplete. Add an assert to check that the next breadcrumb
>>> should not be logically before the current.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 6 ++++++
>>>    1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
>>> index 3cbffd400b1b..a6ffb25f72a2 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
>>> @@ -99,6 +99,12 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
>>>                        struct i915_request *rq =
>>>                                list_entry(pos, typeof(*rq), signal_link);
>>>    
>>> +                     GEM_BUG_ON(next != &ce->signals &&
>>> +                                i915_seqno_passed(rq->fence.seqno,
>>> +                                                  list_entry(next,
>>> +                                                             typeof(*rq),
>>> +                                                             signal_link)->fence.seqno));
>>
>> I know it's only GEM_BUG_ON, but why check for this in the irq handler?
>> You don't trust the insertion, or group deletion? Or just because it is
>> the smallest amount of code to piggy-back on existing iteration?
> 
> At this point, it's documenting the required sorting of ce->signals. The
> vulnerable part is list insertion. Would you like something like
> 
> check_signal_order(ce, rq)
> 
> and use it after insertion as well?
> 
> We can do prev/next checking, just to be sure.

I don't feel strongly either way. Was just curious why you decided to 
put it where it was.

Helper I suppose is better since it is more self-documenting and it's 
easy to call it from all strategic places.

Regards,

Tvrtko



^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v2] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (14 preceding siblings ...)
  2019-05-03 13:32 ` [PATCH 01/13] " Tvrtko Ursulin
@ 2019-05-03 15:22 ` Chris Wilson
  2019-05-07 10:39   ` Tvrtko Ursulin
  2019-05-03 15:38 ` ✗ Fi.CI.SPARSE: warning for series starting with [v2] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler (rev2) Patchwork
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2019-05-03 15:22 UTC (permalink / raw)
  To: intel-gfx

Inside the signal handler, we expect the requests to be ordered by their
breadcrumb such that no later request may be complete if we find an
earlier incomplete. Add an assert to check that the next breadcrumb
should not be logically before the current.

v2: Move the overhanging line into its own function and reuse it after
doing the insertion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index 3cbffd400b1b..fe455f01aa65 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -80,6 +80,22 @@ static inline bool __request_completed(const struct i915_request *rq)
 	return i915_seqno_passed(__hwsp_seqno(rq), rq->fence.seqno);
 }
 
+__maybe_unused static bool
+check_signal_order(struct intel_context *ce, struct i915_request *rq)
+{
+	if (!list_is_last(&rq->signal_link, &ce->signals) &&
+	    i915_seqno_passed(rq->fence.seqno,
+			      list_next_entry(rq, signal_link)->fence.seqno))
+		return false;
+
+	if (!list_is_first(&rq->signal_link, &ce->signals) &&
+	    i915_seqno_passed(list_prev_entry(rq, signal_link)->fence.seqno,
+			      rq->fence.seqno))
+		return false;
+
+	return true;
+}
+
 void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
 {
 	struct intel_breadcrumbs *b = &engine->breadcrumbs;
@@ -99,6 +115,8 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
 			struct i915_request *rq =
 				list_entry(pos, typeof(*rq), signal_link);
 
+			GEM_BUG_ON(!check_signal_order(ce, rq));
+
 			if (!__request_completed(rq))
 				break;
 
@@ -282,6 +300,7 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq)
 		list_add(&rq->signal_link, pos);
 		if (pos == &ce->signals) /* catch transitions from empty list */
 			list_move_tail(&ce->signal_link, &b->signalers);
+		GEM_BUG_ON(!check_signal_order(ce, rq));
 
 		set_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* ✗ Fi.CI.SPARSE: warning for series starting with [v2] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler (rev2)
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (15 preceding siblings ...)
  2019-05-03 15:22 ` [PATCH v2] " Chris Wilson
@ 2019-05-03 15:38 ` Patchwork
  2019-05-03 15:53 ` ✓ Fi.CI.BAT: success " Patchwork
  2019-05-03 19:22 ` ✗ Fi.CI.IGT: failure " Patchwork
  18 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2019-05-03 15:38 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v2] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler (rev2)
URL   : https://patchwork.freedesktop.org/series/60257/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915: Assert breadcrumbs are correctly ordered in the signal handler
Okay!

Commit: drm/i915: Prefer checking the wakeref itself rather than the counter
Okay!

Commit: drm/i915: Assert the local engine->wakeref is active
Okay!

Commit: drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD
Okay!

Commit: drm/i915: Remove delay for idle_work
Okay!

Commit: drm/i915: Cancel retire_worker on parking
Okay!

Commit: drm/i915: Stop spinning for DROP_IDLE (debugfs/i915_drop_caches)
Okay!

Commit: drm/i915: Only reschedule the submission tasklet if preemption is possible
-O:drivers/gpu/drm/i915/gt/intel_engine.h:124:23: warning: expression using sizeof(void)
-O:drivers/gpu/drm/i915/gt/intel_engine.h:124:23: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_scheduler.h:70:23: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_scheduler.h:70:23: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_scheduler.h:70:23: warning: expression using sizeof(void)

Commit: drm/i915/execlists: Don't apply priority boost for resets
Okay!

Commit: drm/i915: Rearrange i915_scheduler.c
Okay!

Commit: drm/i915: Pass i915_sched_node around internally
Okay!

Commit: drm/i915: Bump signaler priority on adding a waiter
Okay!

Commit: drm/i915: Disable semaphore busywaits on saturated systems
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)


* ✓ Fi.CI.BAT: success for series starting with [v2] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler (rev2)
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (16 preceding siblings ...)
  2019-05-03 15:38 ` ✗ Fi.CI.SPARSE: warning for series starting with [v2] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler (rev2) Patchwork
@ 2019-05-03 15:53 ` Patchwork
  2019-05-03 19:22 ` ✗ Fi.CI.IGT: failure " Patchwork
  18 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2019-05-03 15:53 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v2] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler (rev2)
URL   : https://patchwork.freedesktop.org/series/60257/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6039 -> Patchwork_12960
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/

Known issues
------------

  Here are the changes found in Patchwork_12960 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s4-devices:
    - fi-blb-e6850:       [PASS][1] -> [INCOMPLETE][2] ([fdo#107718] / [fdo#110581])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/fi-blb-e6850/igt@gem_exec_suspend@basic-s4-devices.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/fi-blb-e6850/igt@gem_exec_suspend@basic-s4-devices.html

  
#### Possible fixes ####

  * igt@i915_selftest@live_gtt:
    - fi-icl-y:           [INCOMPLETE][3] ([fdo#107713] / [fdo#110581]) -> [PASS][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/fi-icl-y/igt@i915_selftest@live_gtt.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/fi-icl-y/igt@i915_selftest@live_gtt.html

  * igt@kms_chamelium@common-hpd-after-suspend:
    - {fi-icl-u2}:        [DMESG-WARN][5] ([fdo#102505]) -> [PASS][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/fi-icl-u2/igt@kms_chamelium@common-hpd-after-suspend.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/fi-icl-u2/igt@kms_chamelium@common-hpd-after-suspend.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#102505]: https://bugs.freedesktop.org/show_bug.cgi?id=102505
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#110581]: https://bugs.freedesktop.org/show_bug.cgi?id=110581


Participating hosts (52 -> 46)
------------------------------

  Additional (2): fi-byt-j1900 fi-bsw-kefka 
  Missing    (8): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_6039 -> Patchwork_12960

  CI_DRM_6039: 941848427de77537b5709bd68ca4a4048be5b5d9 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4972: f052e49a43cc9704ea5f240df15dd9d3dfed68ab @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12960: 824b9e2304e344e55b1fea49f624c7b2ab0725bf @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

824b9e2304e3 drm/i915: Disable semaphore busywaits on saturated systems
ed45ebbc606a drm/i915: Bump signaler priority on adding a waiter
85b3c81ac684 drm/i915: Pass i915_sched_node around internally
cbce16bb8ac0 drm/i915: Rearrange i915_scheduler.c
058614b44ff5 drm/i915/execlists: Don't apply priority boost for resets
39d542060c6b drm/i915: Only reschedule the submission tasklet if preemption is possible
b72f62bbbc63 drm/i915: Stop spinning for DROP_IDLE (debugfs/i915_drop_caches)
1f6964af2303 drm/i915: Cancel retire_worker on parking
b9b006f3e868 drm/i915: Remove delay for idle_work
6ea24581934a drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD
d6ebdabd10da drm/i915: Assert the local engine->wakeref is active
bdbff11af05c drm/i915: Prefer checking the wakeref itself rather than the counter
0bd5b1be074d drm/i915: Assert breadcrumbs are correctly ordered in the signal handler

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/

* ✗ Fi.CI.IGT: failure for series starting with [v2] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler (rev2)
  2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
                   ` (17 preceding siblings ...)
  2019-05-03 15:53 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2019-05-03 19:22 ` Patchwork
  18 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2019-05-03 19:22 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v2] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler (rev2)
URL   : https://patchwork.freedesktop.org/series/60257/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6039_full -> Patchwork_12960_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_12960_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_12960_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_12960_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_ppgtt@flink-and-close-vma-leak:
    - shard-hsw:          [PASS][1] -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-hsw5/igt@gem_ppgtt@flink-and-close-vma-leak.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-hsw5/igt@gem_ppgtt@flink-and-close-vma-leak.html
    - shard-iclb:         [PASS][3] -> [FAIL][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-iclb6/igt@gem_ppgtt@flink-and-close-vma-leak.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-iclb6/igt@gem_ppgtt@flink-and-close-vma-leak.html

  
Known issues
------------

  Here are the changes found in Patchwork_12960_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_suspend@debugfs-reader:
    - shard-apl:          [PASS][5] -> [DMESG-WARN][6] ([fdo#108566]) +3 similar issues
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-apl5/igt@i915_suspend@debugfs-reader.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-apl5/igt@i915_suspend@debugfs-reader.html

  * igt@kms_draw_crc@draw-method-rgb565-pwrite-xtiled:
    - shard-snb:          [PASS][7] -> [SKIP][8] ([fdo#109271])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-snb4/igt@kms_draw_crc@draw-method-rgb565-pwrite-xtiled.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-snb7/igt@kms_draw_crc@draw-method-rgb565-pwrite-xtiled.html

  * igt@kms_frontbuffer_tracking@fbc-rgb101010-draw-mmap-cpu:
    - shard-skl:          [PASS][9] -> [FAIL][10] ([fdo#103167] / [fdo#110379])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-skl8/igt@kms_frontbuffer_tracking@fbc-rgb101010-draw-mmap-cpu.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-skl5/igt@kms_frontbuffer_tracking@fbc-rgb101010-draw-mmap-cpu.html

  * igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite:
    - shard-iclb:         [PASS][11] -> [FAIL][12] ([fdo#103167]) +2 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-iclb6/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-iclb5/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite.html

  * igt@kms_plane_scaling@pipe-a-scaler-with-clipping-clamping:
    - shard-glk:          [PASS][13] -> [SKIP][14] ([fdo#109271] / [fdo#109278])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-glk9/igt@kms_plane_scaling@pipe-a-scaler-with-clipping-clamping.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-glk5/igt@kms_plane_scaling@pipe-a-scaler-with-clipping-clamping.html

  * igt@kms_psr@psr2_cursor_plane_onoff:
    - shard-iclb:         [PASS][15] -> [SKIP][16] ([fdo#109441]) +4 similar issues
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-iclb2/igt@kms_psr@psr2_cursor_plane_onoff.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-iclb4/igt@kms_psr@psr2_cursor_plane_onoff.html

  * igt@kms_setmode@basic:
    - shard-kbl:          [PASS][17] -> [FAIL][18] ([fdo#99912])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-kbl1/igt@kms_setmode@basic.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-kbl6/igt@kms_setmode@basic.html

  
#### Possible fixes ####

  * igt@gem_eio@in-flight-suspend:
    - shard-apl:          [DMESG-WARN][19] ([fdo#108566]) -> [PASS][20] +4 similar issues
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-apl1/igt@gem_eio@in-flight-suspend.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-apl4/igt@gem_eio@in-flight-suspend.html

  * igt@kms_cursor_legacy@2x-long-flip-vs-cursor-atomic:
    - shard-glk:          [FAIL][21] ([fdo#104873]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-glk1/igt@kms_cursor_legacy@2x-long-flip-vs-cursor-atomic.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-glk5/igt@kms_cursor_legacy@2x-long-flip-vs-cursor-atomic.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible:
    - shard-skl:          [FAIL][23] ([fdo#105363]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-skl8/igt@kms_flip@flip-vs-expired-vblank-interruptible.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-skl5/igt@kms_flip@flip-vs-expired-vblank-interruptible.html

  * igt@kms_flip@plain-flip-ts-check-interruptible:
    - shard-skl:          [FAIL][25] ([fdo#100368]) -> [PASS][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-skl1/igt@kms_flip@plain-flip-ts-check-interruptible.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-skl5/igt@kms_flip@plain-flip-ts-check-interruptible.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render:
    - shard-iclb:         [FAIL][27] ([fdo#103167]) -> [PASS][28] +2 similar issues
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-iclb6/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-iclb6/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render.html

  * igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min:
    - shard-skl:          [FAIL][29] ([fdo#108145]) -> [PASS][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-skl6/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-skl3/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min.html

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          [FAIL][31] ([fdo#108145] / [fdo#110403]) -> [PASS][32] +1 similar issue
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-skl3/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-skl3/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html

  * igt@kms_plane_lowres@pipe-a-tiling-x:
    - shard-iclb:         [FAIL][33] ([fdo#103166]) -> [PASS][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-iclb1/igt@kms_plane_lowres@pipe-a-tiling-x.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-iclb8/igt@kms_plane_lowres@pipe-a-tiling-x.html

  * igt@kms_plane_scaling@pipe-b-scaler-with-clipping-clamping:
    - shard-glk:          [SKIP][35] ([fdo#109271] / [fdo#109278]) -> [PASS][36] +1 similar issue
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-glk6/igt@kms_plane_scaling@pipe-b-scaler-with-clipping-clamping.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-glk9/igt@kms_plane_scaling@pipe-b-scaler-with-clipping-clamping.html

  * igt@kms_psr@psr2_cursor_mmap_cpu:
    - shard-iclb:         [SKIP][37] ([fdo#109441]) -> [PASS][38] +3 similar issues
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-iclb7/igt@kms_psr@psr2_cursor_mmap_cpu.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-iclb2/igt@kms_psr@psr2_cursor_mmap_cpu.html

  * igt@perf_pmu@rc6:
    - shard-kbl:          [SKIP][39] ([fdo#109271]) -> [PASS][40]
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-kbl5/igt@perf_pmu@rc6.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-kbl6/igt@perf_pmu@rc6.html

  
#### Warnings ####

  * igt@kms_atomic_transition@4x-modeset-transitions-nonblocking:
    - shard-apl:          [SKIP][41] ([fdo#109271] / [fdo#109278]) -> [INCOMPLETE][42] ([fdo#103927] / [fdo#110581])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-apl8/igt@kms_atomic_transition@4x-modeset-transitions-nonblocking.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-apl2/igt@kms_atomic_transition@4x-modeset-transitions-nonblocking.html

  * igt@kms_atomic_transition@6x-modeset-transitions-fencing:
    - shard-snb:          [SKIP][43] ([fdo#109271] / [fdo#109278]) -> [SKIP][44] ([fdo#109271])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-snb4/igt@kms_atomic_transition@6x-modeset-transitions-fencing.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-snb7/igt@kms_atomic_transition@6x-modeset-transitions-fencing.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-gtt:
    - shard-skl:          [FAIL][45] ([fdo#103167]) -> [FAIL][46] ([fdo#108040])
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6039/shard-skl8/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-gtt.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/shard-skl5/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-gtt.html

  
  [fdo#100368]: https://bugs.freedesktop.org/show_bug.cgi?id=100368
  [fdo#103166]: https://bugs.freedesktop.org/show_bug.cgi?id=103166
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#104873]: https://bugs.freedesktop.org/show_bug.cgi?id=104873
  [fdo#105363]: https://bugs.freedesktop.org/show_bug.cgi?id=105363
  [fdo#108040]: https://bugs.freedesktop.org/show_bug.cgi?id=108040
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#108566]: https://bugs.freedesktop.org/show_bug.cgi?id=108566
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#110379]: https://bugs.freedesktop.org/show_bug.cgi?id=110379
  [fdo#110403]: https://bugs.freedesktop.org/show_bug.cgi?id=110403
  [fdo#110581]: https://bugs.freedesktop.org/show_bug.cgi?id=110581
  [fdo#99912]: https://bugs.freedesktop.org/show_bug.cgi?id=99912


Participating hosts (10 -> 10)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * Linux: CI_DRM_6039 -> Patchwork_12960

  CI_DRM_6039: 941848427de77537b5709bd68ca4a4048be5b5d9 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4972: f052e49a43cc9704ea5f240df15dd9d3dfed68ab @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12960: 824b9e2304e344e55b1fea49f624c7b2ab0725bf @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12960/

* Re: [PATCH v2] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler
  2019-05-03 15:22 ` [PATCH v2] " Chris Wilson
@ 2019-05-07 10:39   ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-07 10:39 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/05/2019 16:22, Chris Wilson wrote:
> Inside the signal handler, we expect the requests to be ordered by their
> breadcrumb such that no later request may be complete if we find an
> earlier incomplete. Add an assert to check that the next breadcrumb
> should not be logically before the current.
> 
> v2: Move the overhanging line into its own function and reuse it after
> doing the insertion.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 19 +++++++++++++++++++
>   1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> index 3cbffd400b1b..fe455f01aa65 100644
> --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> @@ -80,6 +80,22 @@ static inline bool __request_completed(const struct i915_request *rq)
>   	return i915_seqno_passed(__hwsp_seqno(rq), rq->fence.seqno);
>   }
>   
> +__maybe_unused static bool
> +check_signal_order(struct intel_context *ce, struct i915_request *rq)
> +{
> +	if (!list_is_last(&rq->signal_link, &ce->signals) &&
> +	    i915_seqno_passed(rq->fence.seqno,
> +			      list_next_entry(rq, signal_link)->fence.seqno))
> +		return false;
> +
> +	if (!list_is_first(&rq->signal_link, &ce->signals) &&
> +	    i915_seqno_passed(list_prev_entry(rq, signal_link)->fence.seqno,
> +			      rq->fence.seqno))
> +		return false;
> +
> +	return true;
> +}
> +
>   void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
>   {
>   	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> @@ -99,6 +115,8 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
>   			struct i915_request *rq =
>   				list_entry(pos, typeof(*rq), signal_link);
>   
> +			GEM_BUG_ON(!check_signal_order(ce, rq));
> +
>   			if (!__request_completed(rq))
>   				break;
>   
> @@ -282,6 +300,7 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq)
>   		list_add(&rq->signal_link, pos);
>   		if (pos == &ce->signals) /* catch transitions from empty list */
>   			list_move_tail(&ce->signal_link, &b->signalers);
> +		GEM_BUG_ON(!check_signal_order(ce, rq));
>   
>   		set_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
>   	}
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 02/13] drm/i915: Prefer checking the wakeref itself rather than the counter
  2019-05-03 11:52 ` [PATCH 02/13] drm/i915: Prefer checking the wakeref itself rather than the counter Chris Wilson
@ 2019-05-07 10:48   ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-07 10:48 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/05/2019 12:52, Chris Wilson wrote:
> The counter goes to zero at the start of the parking cycle, but the
> wakeref itself is held until the end. Likewise, the counter becomes one
> at the end of the unparking, but the wakeref is taken first. If we check
> the wakeref instead of the counter, we include the unpark/unparking time
> as intel_wakeref_is_active(), and do not spuriously declare inactive if
> we fail to park (i.e. the parking and wakeref drop is postponed).
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/intel_wakeref.c | 20 +++++++++++++++++---
>   drivers/gpu/drm/i915/intel_wakeref.h |  2 +-
>   2 files changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
> index 1f94bc4ff9e4..91196d9612bb 100644
> --- a/drivers/gpu/drm/i915/intel_wakeref.c
> +++ b/drivers/gpu/drm/i915/intel_wakeref.c
> @@ -7,6 +7,19 @@
>   #include "intel_drv.h"
>   #include "intel_wakeref.h"
>   
> +static void rpm_get(struct drm_i915_private *i915, struct intel_wakeref *wf)
> +{
> +	wf->wakeref = intel_runtime_pm_get(i915);
> +}
> +
> +static void rpm_put(struct drm_i915_private *i915, struct intel_wakeref *wf)
> +{
> +	intel_wakeref_t wakeref = fetch_and_zero(&wf->wakeref);
> +
> +	intel_runtime_pm_put(i915, wakeref);
> +	GEM_BUG_ON(!wakeref);
> +}
> +
>   int __intel_wakeref_get_first(struct drm_i915_private *i915,
>   			      struct intel_wakeref *wf,
>   			      int (*fn)(struct intel_wakeref *wf))
> @@ -21,11 +34,11 @@ int __intel_wakeref_get_first(struct drm_i915_private *i915,
>   	if (!atomic_read(&wf->count)) {
>   		int err;
>   
> -		wf->wakeref = intel_runtime_pm_get(i915);
> +		rpm_get(i915, wf);
>   
>   		err = fn(wf);
>   		if (unlikely(err)) {
> -			intel_runtime_pm_put(i915, wf->wakeref);
> +			rpm_put(i915, wf);
>   			mutex_unlock(&wf->mutex);
>   			return err;
>   		}
> @@ -46,7 +59,7 @@ int __intel_wakeref_put_last(struct drm_i915_private *i915,
>   
>   	err = fn(wf);
>   	if (likely(!err))
> -		intel_runtime_pm_put(i915, wf->wakeref);
> +		rpm_put(i915, wf);
>   	else
>   		atomic_inc(&wf->count);
>   	mutex_unlock(&wf->mutex);
> @@ -58,4 +71,5 @@ void __intel_wakeref_init(struct intel_wakeref *wf, struct lock_class_key *key)
>   {
>   	__mutex_init(&wf->mutex, "wakeref", key);
>   	atomic_set(&wf->count, 0);
> +	wf->wakeref = 0;
>   }
> diff --git a/drivers/gpu/drm/i915/intel_wakeref.h b/drivers/gpu/drm/i915/intel_wakeref.h
> index a979d638344b..db742291211c 100644
> --- a/drivers/gpu/drm/i915/intel_wakeref.h
> +++ b/drivers/gpu/drm/i915/intel_wakeref.h
> @@ -127,7 +127,7 @@ intel_wakeref_unlock(struct intel_wakeref *wf)
>   static inline bool
>   intel_wakeref_active(struct intel_wakeref *wf)
>   {
> -	return atomic_read(&wf->count);
> +	return READ_ONCE(wf->wakeref);
>   }
>   
>   #endif /* INTEL_WAKEREF_H */
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 03/13] drm/i915: Assert the local engine->wakeref is active
  2019-05-03 11:52 ` [PATCH 03/13] drm/i915: Assert the local engine->wakeref is active Chris Wilson
@ 2019-05-07 10:52   ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-07 10:52 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/05/2019 12:52, Chris Wilson wrote:
> Due to the asynchronous tasklet and recursive GT wakeref, it may happen
> that we submit to the engine (underneath its own wakeref) prior to the
> central wakeref being marked as taken. Switch to checking the local wakeref
> for greater consistency.
> 
> Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 3 +++
>   drivers/gpu/drm/i915/gt/intel_lrc.c       | 4 ++--
>   2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 5907a9613641..416d7e2e6f8c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -1090,6 +1090,9 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
>   	if (i915_reset_failed(engine->i915))
>   		return true;
>   
> +	if (!intel_wakeref_active(&engine->wakeref))
> +		return true;
> +
>   	/* Waiting to drain ELSP? */
>   	if (READ_ONCE(engine->execlists.active)) {
>   		struct tasklet_struct *t = &engine->execlists.tasklet;
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 7d69d07490e8..5580b6f1aa0c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -535,7 +535,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>   	 * that all ELSP are drained i.e. we have processed the CSB,
>   	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
>   	 */
> -	GEM_BUG_ON(!engine->i915->gt.awake);
> +	GEM_BUG_ON(!intel_wakeref_active(&engine->wakeref));
>   
>   	/*
>   	 * ELSQ note: the submit queue is not cleared after being submitted
> @@ -1085,7 +1085,7 @@ static void execlists_submission_tasklet(unsigned long data)
>   
>   	GEM_TRACE("%s awake?=%d, active=%x\n",
>   		  engine->name,
> -		  !!engine->i915->gt.awake,
> +		  !!intel_wakeref_active(&engine->wakeref),
>   		  engine->execlists.active);
>   
>   	spin_lock_irqsave(&engine->timeline.lock, flags);
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 05/13] drm/i915: Remove delay for idle_work
  2019-05-03 11:52 ` [PATCH 05/13] drm/i915: Remove delay for idle_work Chris Wilson
@ 2019-05-07 10:54   ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-07 10:54 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/05/2019 12:52, Chris Wilson wrote:
> The original intent for the delay before running the idle_work was to
> provide a hysteresis to avoid ping-ponging the device runtime-pm. Since
> then we have also pulled in some memory management and general device
> management for parking. But with the inversion of the wakeref handling,
> GEM is no longer responsible for the wakeref and by the time we call the
> idle_work, the device is asleep. It seems appropriate now to drop the
> delay and just run the worker immediately to flush the cached GEM state
> before sleeping.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c           |  2 +-
>   drivers/gpu/drm/i915/i915_drv.h               |  2 +-
>   drivers/gpu/drm/i915/i915_gem_pm.c            | 21 +++++++------------
>   .../gpu/drm/i915/selftests/i915_gem_object.c  |  2 +-
>   .../gpu/drm/i915/selftests/mock_gem_device.c  |  4 ++--
>   5 files changed, 12 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index da4fb9f34d27..674c8c936057 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3931,8 +3931,8 @@ i915_drop_caches_set(void *data, u64 val)
>   	if (val & DROP_IDLE) {
>   		do {
>   			flush_delayed_work(&i915->gem.retire_work);
> -			drain_delayed_work(&i915->gem.idle_work);
>   		} while (READ_ONCE(i915->gt.awake));
> +		flush_work(&i915->gem.idle_work);
>   	}
>   
>   	if (val & DROP_FREED)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 64fa353a62bb..2bf518fea36e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2031,7 +2031,7 @@ struct drm_i915_private {
>   		 * arrive within a small period of time, we fire
>   		 * off the idle_work.
>   		 */
> -		struct delayed_work idle_work;
> +		struct work_struct idle_work;
>   	} gem;
>   
>   	/* For i945gm vblank irq vs. C3 workaround */
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
> index 49b0ce594f20..ae91ad7cb31e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_pm.c
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.c
> @@ -29,12 +29,12 @@ static void i915_gem_park(struct drm_i915_private *i915)
>   static void idle_work_handler(struct work_struct *work)
>   {
>   	struct drm_i915_private *i915 =
> -		container_of(work, typeof(*i915), gem.idle_work.work);
> +		container_of(work, typeof(*i915), gem.idle_work);
>   
>   	mutex_lock(&i915->drm.struct_mutex);
>   
>   	intel_wakeref_lock(&i915->gt.wakeref);
> -	if (!intel_wakeref_active(&i915->gt.wakeref))
> +	if (!intel_wakeref_active(&i915->gt.wakeref) && !work_pending(work))
>   		i915_gem_park(i915);
>   	intel_wakeref_unlock(&i915->gt.wakeref);
>   
> @@ -74,9 +74,7 @@ static int pm_notifier(struct notifier_block *nb,
>   		break;
>   
>   	case INTEL_GT_PARK:
> -		mod_delayed_work(i915->wq,
> -				 &i915->gem.idle_work,
> -				 msecs_to_jiffies(100));
> +		queue_work(i915->wq, &i915->gem.idle_work);
>   		break;
>   	}
>   
> @@ -142,16 +140,11 @@ void i915_gem_suspend(struct drm_i915_private *i915)
>   	 * Assert that we successfully flushed all the work and
>   	 * reset the GPU back to its idle, low power state.
>   	 */
> -	GEM_BUG_ON(i915->gt.awake);
> -	cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
> -
>   	drain_delayed_work(&i915->gem.retire_work);
> +	GEM_BUG_ON(i915->gt.awake);
> +	flush_work(&i915->gem.idle_work);
>   
> -	/*
> -	 * As the idle_work is rearming if it detects a race, play safe and
> -	 * repeat the flush until it is definitely idle.
> -	 */
> -	drain_delayed_work(&i915->gem.idle_work);
> +	cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
>   
>   	i915_gem_drain_freed_objects(i915);
>   
> @@ -242,7 +235,7 @@ void i915_gem_resume(struct drm_i915_private *i915)
>   
>   void i915_gem_init__pm(struct drm_i915_private *i915)
>   {
> -	INIT_DELAYED_WORK(&i915->gem.idle_work, idle_work_handler);
> +	INIT_WORK(&i915->gem.idle_work, idle_work_handler);
>   	INIT_DELAYED_WORK(&i915->gem.retire_work, retire_work_handler);
>   
>   	i915->gem.pm_notifier.notifier_call = pm_notifier;
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> index 088b2aa05dcd..b926d1cd165d 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> @@ -509,7 +509,7 @@ static void disable_retire_worker(struct drm_i915_private *i915)
>   	intel_gt_pm_get(i915);
>   
>   	cancel_delayed_work_sync(&i915->gem.retire_work);
> -	cancel_delayed_work_sync(&i915->gem.idle_work);
> +	flush_work(&i915->gem.idle_work);
>   }
>   
>   static void restore_retire_worker(struct drm_i915_private *i915)
> diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> index e4033d0576c4..d919f512042c 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> @@ -59,7 +59,7 @@ static void mock_device_release(struct drm_device *dev)
>   	mutex_unlock(&i915->drm.struct_mutex);
>   
>   	drain_delayed_work(&i915->gem.retire_work);
> -	drain_delayed_work(&i915->gem.idle_work);
> +	flush_work(&i915->gem.idle_work);
>   	i915_gem_drain_workqueue(i915);
>   
>   	mutex_lock(&i915->drm.struct_mutex);
> @@ -195,7 +195,7 @@ struct drm_i915_private *mock_gem_device(void)
>   	mock_init_contexts(i915);
>   
>   	INIT_DELAYED_WORK(&i915->gem.retire_work, mock_retire_work_handler);
> -	INIT_DELAYED_WORK(&i915->gem.idle_work, mock_idle_work_handler);
> +	INIT_WORK(&i915->gem.idle_work, mock_idle_work_handler);
>   
>   	i915->gt.awake = true;
>   
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 06/13] drm/i915: Cancel retire_worker on parking
  2019-05-03 11:52 ` [PATCH 06/13] drm/i915: Cancel retire_worker on parking Chris Wilson
@ 2019-05-07 10:55   ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-07 10:55 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/05/2019 12:52, Chris Wilson wrote:
> Replace the racy continuation check within retire_work with a definite
> kill-switch on idling. The race was being exposed by gem_concurrent_blit
> where the retire_worker would be terminated too early leaving us
> spinning in debugfs/i915_drop_caches with nothing flushing the
> retirement queue.
> 
> Although the fact that the igt is trying to idle from one child while
> submitting from another may be a contributing factor as to why it runs
> so slowly...
> 
> v2: Use the non-sync version of cancel_delayed_work(), we only need to
> stop it from being scheduled as we independently check whether now is
> the right time to be parking.
> 
> Testcase: igt/gem_concurrent_blit
> Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_pm.c             | 18 ++++++++++++------
>   .../gpu/drm/i915/selftests/mock_gem_device.c   |  1 -
>   2 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
> index ae91ad7cb31e..fa9c2ebd966a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_pm.c
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.c
> @@ -30,15 +30,23 @@ static void idle_work_handler(struct work_struct *work)
>   {
>   	struct drm_i915_private *i915 =
>   		container_of(work, typeof(*i915), gem.idle_work);
> +	bool restart = true;
>   
> +	cancel_delayed_work(&i915->gem.retire_work);
>   	mutex_lock(&i915->drm.struct_mutex);
>   
>   	intel_wakeref_lock(&i915->gt.wakeref);
> -	if (!intel_wakeref_active(&i915->gt.wakeref) && !work_pending(work))
> +	if (!intel_wakeref_active(&i915->gt.wakeref) && !work_pending(work)) {
>   		i915_gem_park(i915);
> +		restart = false;
> +	}
>   	intel_wakeref_unlock(&i915->gt.wakeref);
>   
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	if (restart)
> +		queue_delayed_work(i915->wq,
> +				   &i915->gem.retire_work,
> +				   round_jiffies_up_relative(HZ));
>   }
>   
>   static void retire_work_handler(struct work_struct *work)
> @@ -52,10 +60,9 @@ static void retire_work_handler(struct work_struct *work)
>   		mutex_unlock(&i915->drm.struct_mutex);
>   	}
>   
> -	if (intel_wakeref_active(&i915->gt.wakeref))
> -		queue_delayed_work(i915->wq,
> -				   &i915->gem.retire_work,
> -				   round_jiffies_up_relative(HZ));
> +	queue_delayed_work(i915->wq,
> +			   &i915->gem.retire_work,
> +			   round_jiffies_up_relative(HZ));
>   }
>   
>   static int pm_notifier(struct notifier_block *nb,
> @@ -140,7 +147,6 @@ void i915_gem_suspend(struct drm_i915_private *i915)
>   	 * Assert that we successfully flushed all the work and
>   	 * reset the GPU back to its idle, low power state.
>   	 */
> -	drain_delayed_work(&i915->gem.retire_work);
>   	GEM_BUG_ON(i915->gt.awake);
>   	flush_work(&i915->gem.idle_work);
>   
> diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> index d919f512042c..9fd02025d382 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> @@ -58,7 +58,6 @@ static void mock_device_release(struct drm_device *dev)
>   	i915_gem_contexts_lost(i915);
>   	mutex_unlock(&i915->drm.struct_mutex);
>   
> -	drain_delayed_work(&i915->gem.retire_work);
>   	flush_work(&i915->gem.idle_work);
>   	i915_gem_drain_workqueue(i915);
>   
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 08/13] drm/i915: Only reschedule the submission tasklet if preemption is possible
  2019-05-03 11:52 ` [PATCH 08/13] drm/i915: Only reschedule the submission tasklet if preemption is possible Chris Wilson
@ 2019-05-07 11:53   ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-07 11:53 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/05/2019 12:52, Chris Wilson wrote:
> If we couple the scheduler more tightly with the execlists policy, we
> can apply the preemption policy to the question of whether we need to
> kick the tasklet at all for this priority bump.
> 
> v2: Rephrase it as a core i915 policy and not an execlists foible.
> v3: Pull the kick together.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine.h      | 18 ----------
>   drivers/gpu/drm/i915/gt/intel_lrc.c         |  4 +--
>   drivers/gpu/drm/i915/gt/selftest_lrc.c      |  7 +++-
>   drivers/gpu/drm/i915/i915_request.c         |  2 --
>   drivers/gpu/drm/i915/i915_scheduler.c       | 37 ++++++++++++---------
>   drivers/gpu/drm/i915/i915_scheduler.h       | 18 ++++++++++
>   drivers/gpu/drm/i915/intel_guc_submission.c |  3 +-
>   7 files changed, 50 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> index 9e2a183a351c..9359b3a7ad9c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -106,24 +106,6 @@ hangcheck_action_to_str(const enum intel_engine_hangcheck_action a)
>   
>   void intel_engines_set_scheduler_caps(struct drm_i915_private *i915);
>   
> -static inline bool __execlists_need_preempt(int prio, int last)
> -{
> -	/*
> -	 * Allow preemption of low -> normal -> high, but we do
> -	 * not allow low priority tasks to preempt other low priority
> -	 * tasks under the impression that latency for low priority
> -	 * tasks does not matter (as much as background throughput),
> -	 * so kiss.
> -	 *
> -	 * More naturally we would write
> -	 *	prio >= max(0, last);
> -	 * except that we wish to prevent triggering preemption at the same
> -	 * priority level: the task that is running should remain running
> -	 * to preserve FIFO ordering of dependencies.
> -	 */
> -	return prio > max(I915_PRIORITY_NORMAL - 1, last);
> -}
> -
>   static inline void
>   execlists_set_active(struct intel_engine_execlists *execlists,
>   		     unsigned int bit)
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 90900c7d7058..afcdfc440bbd 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -252,8 +252,8 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
>   	 * ourselves, ignore the request.
>   	 */
>   	last_prio = effective_prio(rq);
> -	if (!__execlists_need_preempt(engine->execlists.queue_priority_hint,
> -				      last_prio))
> +	if (!i915_scheduler_need_preempt(engine->execlists.queue_priority_hint,
> +					 last_prio))
>   		return false;
>   
>   	/*
> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> index 84538f69185b..4b042893dc0e 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> @@ -638,14 +638,19 @@ static struct i915_request *dummy_request(struct intel_engine_cs *engine)
>   	GEM_BUG_ON(i915_request_completed(rq));
>   
>   	i915_sw_fence_init(&rq->submit, dummy_notify);
> -	i915_sw_fence_commit(&rq->submit);
> +	set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
>   
>   	return rq;
>   }
>   
>   static void dummy_request_free(struct i915_request *dummy)
>   {
> +	/* We have to fake the CS interrupt to kick the next request */
> +	i915_sw_fence_commit(&dummy->submit);
> +
>   	i915_request_mark_complete(dummy);
> +	dma_fence_signal(&dummy->fence);
> +
>   	i915_sched_node_fini(&dummy->sched);
>   	i915_sw_fence_fini(&dummy->submit);
>   
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index d06c45305b03..8cb3ed5531e3 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1377,9 +1377,7 @@ long i915_request_wait(struct i915_request *rq,
>   	if (flags & I915_WAIT_PRIORITY) {
>   		if (!i915_request_started(rq) && INTEL_GEN(rq->i915) >= 6)
>   			gen6_rps_boost(rq);
> -		local_bh_disable(); /* suspend tasklets for reprioritisation */
>   		i915_schedule_bump_priority(rq, I915_PRIORITY_WAIT);
> -		local_bh_enable(); /* kick tasklets en masse */
>   	}
>   
>   	wait.tsk = current;
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index 39bc4f54e272..fadf0cd9c75a 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -261,16 +261,30 @@ sched_lock_engine(const struct i915_sched_node *node,
>   	return engine;
>   }
>   
> -static bool inflight(const struct i915_request *rq,
> -		     const struct intel_engine_cs *engine)
> +static inline int rq_prio(const struct i915_request *rq)
>   {
> -	const struct i915_request *active;
> +	return rq->sched.attr.priority | __NO_PREEMPTION;
> +}
> +
> +static void kick_submission(struct intel_engine_cs *engine, int prio)
> +{
> +	const struct i915_request *inflight =
> +		port_request(engine->execlists.port);
>   
> -	if (!i915_request_is_active(rq))
> -		return false;
> +	if (!inflight) /* currently idle, nothing to do */
> +		return;
> +
> +	/*
> +	 * If we are already the currently executing context, don't
> +	 * bother evaluating if we should preempt ourselves, or if
> +	 * we expect nothing to change as a result of running the
> +	 * tasklet, i.e. we have not changed the priority queue
> +	 * sufficiently to oust the running context.
> +	 */
> +	if (!i915_scheduler_need_preempt(prio, rq_prio(inflight)))
> +		return;
>   
> -	active = port_request(engine->execlists.port);
> -	return active->hw_context == rq->hw_context;
> +	tasklet_hi_schedule(&engine->execlists.tasklet);
>   }
>   
>   static void __i915_schedule(struct i915_request *rq,
> @@ -396,15 +410,8 @@ static void __i915_schedule(struct i915_request *rq,
>   
>   		engine->execlists.queue_priority_hint = prio;
>   
> -		/*
> -		 * If we are already the currently executing context, don't
> -		 * bother evaluating if we should preempt ourselves.
> -		 */
> -		if (inflight(node_to_request(node), engine))
> -			continue;
> -
>   		/* Defer (tasklet) submission until after all of our updates. */
> -		tasklet_hi_schedule(&engine->execlists.tasklet);
> +		kick_submission(engine, prio);
>   	}
>   
>   	spin_unlock(&engine->timeline.lock);
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index 07d243acf553..7eefccff39bf 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -52,4 +52,22 @@ static inline void i915_priolist_free(struct i915_priolist *p)
>   		__i915_priolist_free(p);
>   }
>   
> +static inline bool i915_scheduler_need_preempt(int prio, int active)
> +{
> +	/*
> +	 * Allow preemption of low -> normal -> high, but we do
> +	 * not allow low priority tasks to preempt other low priority
> +	 * tasks under the impression that latency for low priority
> +	 * tasks does not matter (as much as background throughput),
> +	 * so kiss.
> +	 *
> +	 * More naturally we would write
> +	 *	prio >= max(0, last);
> +	 * except that we wish to prevent triggering preemption at the same
> +	 * priority level: the task that is running should remain running
> +	 * to preserve FIFO ordering of dependencies.
> +	 */
> +	return prio > max(I915_PRIORITY_NORMAL - 1, active);
> +}
> +
>   #endif /* _I915_SCHEDULER_H_ */
> diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
> index 57ed1dd4ae41..380d83a2bfb6 100644
> --- a/drivers/gpu/drm/i915/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
> @@ -747,7 +747,8 @@ static bool __guc_dequeue(struct intel_engine_cs *engine)
>   				&engine->i915->guc.preempt_work[engine->id];
>   			int prio = execlists->queue_priority_hint;
>   
> -			if (__execlists_need_preempt(prio, port_prio(port))) {
> +			if (i915_scheduler_need_preempt(prio,
> +							port_prio(port))) {
>   				execlists_set_active(execlists,
>   						     EXECLISTS_ACTIVE_PREEMPT);
>   				queue_work(engine->i915->guc.preempt_wq,
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 09/13] drm/i915/execlists: Don't apply priority boost for resets
  2019-05-03 11:52 ` [PATCH 09/13] drm/i915/execlists: Don't apply priority boost for resets Chris Wilson
@ 2019-05-07 12:04   ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-07 12:04 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/05/2019 12:52, Chris Wilson wrote:
> Do not treat reset as a normal preemption event and avoid giving the
> guilty request a priority boost for simply being active at the time of
> reset.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_lrc.c | 16 +++++++++-------
>   1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index afcdfc440bbd..6419bcaf1ecc 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -371,11 +371,11 @@ static void unwind_wa_tail(struct i915_request *rq)
>   }
>   
>   static struct i915_request *
> -__unwind_incomplete_requests(struct intel_engine_cs *engine)
> +__unwind_incomplete_requests(struct intel_engine_cs *engine, int boost)
>   {
>   	struct i915_request *rq, *rn, *active = NULL;
>   	struct list_head *uninitialized_var(pl);
> -	int prio = I915_PRIORITY_INVALID | ACTIVE_PRIORITY;
> +	int prio = I915_PRIORITY_INVALID | boost;
>   
>   	lockdep_assert_held(&engine->timeline.lock);
>   
> @@ -419,8 +419,9 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
>   	 * in the priority queue, but they will not gain immediate access to
>   	 * the GPU.
>   	 */
> -	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(active)) {
> -		prio |= ACTIVE_PRIORITY;
> +	if (~prio & boost && __i915_request_has_started(active)) {
> +		prio |= boost;
> +		GEM_BUG_ON(active->sched.attr.priority >= prio);
>   		active->sched.attr.priority = prio;
>   		list_move_tail(&active->sched.link,
>   			       i915_sched_lookup_priolist(engine, prio));
> @@ -435,7 +436,7 @@ execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
>   	struct intel_engine_cs *engine =
>   		container_of(execlists, typeof(*engine), execlists);
>   
> -	return __unwind_incomplete_requests(engine);
> +	return __unwind_incomplete_requests(engine, 0);
>   }
>   
>   static inline void
> @@ -656,7 +657,8 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists)
>   	execlists_cancel_port_requests(execlists);
>   	__unwind_incomplete_requests(container_of(execlists,
>   						  struct intel_engine_cs,
> -						  execlists));
> +						  execlists),
> +				     ACTIVE_PRIORITY);
>   }
>   
>   static void execlists_dequeue(struct intel_engine_cs *engine)
> @@ -1909,7 +1911,7 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
>   	execlists_cancel_port_requests(execlists);
>   
>   	/* Push back any incomplete requests for replay after the reset. */
> -	rq = __unwind_incomplete_requests(engine);
> +	rq = __unwind_incomplete_requests(engine, 0);
>   	if (!rq)
>   		goto out_replay;
>   
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 10/13] drm/i915: Rearrange i915_scheduler.c
  2019-05-03 11:52 ` [PATCH 10/13] drm/i915: Rearrange i915_scheduler.c Chris Wilson
@ 2019-05-07 12:06   ` Tvrtko Ursulin
  0 siblings, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-07 12:06 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/05/2019 12:52, Chris Wilson wrote:
> To avoid pulling in a forward declaration in the next patch, move the
> i915_sched_node handling to after the main dfs of the scheduler.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_scheduler.c | 210 +++++++++++++-------------
>   1 file changed, 105 insertions(+), 105 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index fadf0cd9c75a..4a95cf2201a7 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -35,109 +35,6 @@ static inline bool node_signaled(const struct i915_sched_node *node)
>   	return i915_request_completed(node_to_request(node));
>   }
>   
> -void i915_sched_node_init(struct i915_sched_node *node)
> -{
> -	INIT_LIST_HEAD(&node->signalers_list);
> -	INIT_LIST_HEAD(&node->waiters_list);
> -	INIT_LIST_HEAD(&node->link);
> -	node->attr.priority = I915_PRIORITY_INVALID;
> -	node->semaphores = 0;
> -	node->flags = 0;
> -}
> -
> -static struct i915_dependency *
> -i915_dependency_alloc(void)
> -{
> -	return kmem_cache_alloc(global.slab_dependencies, GFP_KERNEL);
> -}
> -
> -static void
> -i915_dependency_free(struct i915_dependency *dep)
> -{
> -	kmem_cache_free(global.slab_dependencies, dep);
> -}
> -
> -bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
> -				      struct i915_sched_node *signal,
> -				      struct i915_dependency *dep,
> -				      unsigned long flags)
> -{
> -	bool ret = false;
> -
> -	spin_lock_irq(&schedule_lock);
> -
> -	if (!node_signaled(signal)) {
> -		INIT_LIST_HEAD(&dep->dfs_link);
> -		list_add(&dep->wait_link, &signal->waiters_list);
> -		list_add(&dep->signal_link, &node->signalers_list);
> -		dep->signaler = signal;
> -		dep->flags = flags;
> -
> -		/* Keep track of whether anyone on this chain has a semaphore */
> -		if (signal->flags & I915_SCHED_HAS_SEMAPHORE_CHAIN &&
> -		    !node_started(signal))
> -			node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
> -
> -		ret = true;
> -	}
> -
> -	spin_unlock_irq(&schedule_lock);
> -
> -	return ret;
> -}
> -
> -int i915_sched_node_add_dependency(struct i915_sched_node *node,
> -				   struct i915_sched_node *signal)
> -{
> -	struct i915_dependency *dep;
> -
> -	dep = i915_dependency_alloc();
> -	if (!dep)
> -		return -ENOMEM;
> -
> -	if (!__i915_sched_node_add_dependency(node, signal, dep,
> -					      I915_DEPENDENCY_ALLOC))
> -		i915_dependency_free(dep);
> -
> -	return 0;
> -}
> -
> -void i915_sched_node_fini(struct i915_sched_node *node)
> -{
> -	struct i915_dependency *dep, *tmp;
> -
> -	GEM_BUG_ON(!list_empty(&node->link));
> -
> -	spin_lock_irq(&schedule_lock);
> -
> -	/*
> -	 * Everyone we depended upon (the fences we wait to be signaled)
> -	 * should retire before us and remove themselves from our list.
> -	 * However, retirement is run independently on each timeline and
> -	 * so we may be called out-of-order.
> -	 */
> -	list_for_each_entry_safe(dep, tmp, &node->signalers_list, signal_link) {
> -		GEM_BUG_ON(!node_signaled(dep->signaler));
> -		GEM_BUG_ON(!list_empty(&dep->dfs_link));
> -
> -		list_del(&dep->wait_link);
> -		if (dep->flags & I915_DEPENDENCY_ALLOC)
> -			i915_dependency_free(dep);
> -	}
> -
> -	/* Remove ourselves from everyone who depends upon us */
> -	list_for_each_entry_safe(dep, tmp, &node->waiters_list, wait_link) {
> -		GEM_BUG_ON(dep->signaler != node);
> -		GEM_BUG_ON(!list_empty(&dep->dfs_link));
> -
> -		list_del(&dep->signal_link);
> -		if (dep->flags & I915_DEPENDENCY_ALLOC)
> -			i915_dependency_free(dep);
> -	}
> -
> -	spin_unlock_irq(&schedule_lock);
> -}
> -
>   static inline struct i915_priolist *to_priolist(struct rb_node *rb)
>   {
>   	return rb_entry(rb, struct i915_priolist, node);
> @@ -239,6 +136,11 @@ i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio)
>   	return &p->requests[idx];
>   }
>   
> +void __i915_priolist_free(struct i915_priolist *p)
> +{
> +	kmem_cache_free(global.slab_priorities, p);
> +}
> +
>   struct sched_cache {
>   	struct list_head *priolist;
>   };
> @@ -443,9 +345,107 @@ void i915_schedule_bump_priority(struct i915_request *rq, unsigned int bump)
>   	spin_unlock_irqrestore(&schedule_lock, flags);
>   }
>   
> -void __i915_priolist_free(struct i915_priolist *p)
> +void i915_sched_node_init(struct i915_sched_node *node)
>   {
> -	kmem_cache_free(global.slab_priorities, p);
> +	INIT_LIST_HEAD(&node->signalers_list);
> +	INIT_LIST_HEAD(&node->waiters_list);
> +	INIT_LIST_HEAD(&node->link);
> +	node->attr.priority = I915_PRIORITY_INVALID;
> +	node->semaphores = 0;
> +	node->flags = 0;
> +}
> +
> +static struct i915_dependency *
> +i915_dependency_alloc(void)
> +{
> +	return kmem_cache_alloc(global.slab_dependencies, GFP_KERNEL);
> +}
> +
> +static void
> +i915_dependency_free(struct i915_dependency *dep)
> +{
> +	kmem_cache_free(global.slab_dependencies, dep);
> +}
> +
> +bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
> +				      struct i915_sched_node *signal,
> +				      struct i915_dependency *dep,
> +				      unsigned long flags)
> +{
> +	bool ret = false;
> +
> +	spin_lock_irq(&schedule_lock);
> +
> +	if (!node_signaled(signal)) {
> +		INIT_LIST_HEAD(&dep->dfs_link);
> +		list_add(&dep->wait_link, &signal->waiters_list);
> +		list_add(&dep->signal_link, &node->signalers_list);
> +		dep->signaler = signal;
> +		dep->flags = flags;
> +
> +		/* Keep track of whether anyone on this chain has a semaphore */
> +		if (signal->flags & I915_SCHED_HAS_SEMAPHORE_CHAIN &&
> +		    !node_started(signal))
> +			node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
> +
> +		ret = true;
> +	}
> +
> +	spin_unlock_irq(&schedule_lock);
> +
> +	return ret;
> +}
> +
> +int i915_sched_node_add_dependency(struct i915_sched_node *node,
> +				   struct i915_sched_node *signal)
> +{
> +	struct i915_dependency *dep;
> +
> +	dep = i915_dependency_alloc();
> +	if (!dep)
> +		return -ENOMEM;
> +
> +	if (!__i915_sched_node_add_dependency(node, signal, dep,
> +					      I915_DEPENDENCY_ALLOC))
> +		i915_dependency_free(dep);
> +
> +	return 0;
> +}
> +
> +void i915_sched_node_fini(struct i915_sched_node *node)
> +{
> +	struct i915_dependency *dep, *tmp;
> +
> +	GEM_BUG_ON(!list_empty(&node->link));
> +
> +	spin_lock_irq(&schedule_lock);
> +
> +	/*
> +	 * Everyone we depended upon (the fences we wait to be signaled)
> +	 * should retire before us and remove themselves from our list.
> +	 * However, retirement is run independently on each timeline and
> +	 * so we may be called out-of-order.
> +	 */
> +	list_for_each_entry_safe(dep, tmp, &node->signalers_list, signal_link) {
> +		GEM_BUG_ON(!node_signaled(dep->signaler));
> +		GEM_BUG_ON(!list_empty(&dep->dfs_link));
> +
> +		list_del(&dep->wait_link);
> +		if (dep->flags & I915_DEPENDENCY_ALLOC)
> +			i915_dependency_free(dep);
> +	}
> +
> +	/* Remove ourselves from everyone who depends upon us */
> +	list_for_each_entry_safe(dep, tmp, &node->waiters_list, wait_link) {
> +		GEM_BUG_ON(dep->signaler != node);
> +		GEM_BUG_ON(!list_empty(&dep->dfs_link));
> +
> +		list_del(&dep->signal_link);
> +		if (dep->flags & I915_DEPENDENCY_ALLOC)
> +			i915_dependency_free(dep);
> +	}
> +
> +	spin_unlock_irq(&schedule_lock);
>   }
>   
>   static void i915_global_scheduler_shrink(void)
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 11/13] drm/i915: Pass i915_sched_node around internally
  2019-05-03 11:52 ` [PATCH 11/13] drm/i915: Pass i915_sched_node around internally Chris Wilson
@ 2019-05-07 12:12   ` Tvrtko Ursulin
  2019-05-07 12:26     ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-07 12:12 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/05/2019 12:52, Chris Wilson wrote:
> To simplify the next patch, update bump_priority and schedule to accept
> the internal i915_sched_node directly and not expect a request pointer.
> 
> add/remove: 0/0 grow/shrink: 2/1 up/down: 8/-15 (-7)
> Function                                     old     new   delta
> i915_schedule_bump_priority                  109     113      +4
> i915_schedule                                 50      54      +4
> __i915_schedule                              922     907     -15
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_scheduler.c | 33 +++++++++++++++------------
>   1 file changed, 18 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index 4a95cf2201a7..380cb7343a10 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -189,7 +189,7 @@ static void kick_submission(struct intel_engine_cs *engine, int prio)
>   	tasklet_hi_schedule(&engine->execlists.tasklet);
>   }
>   
> -static void __i915_schedule(struct i915_request *rq,
> +static void __i915_schedule(struct i915_sched_node *rq,

Can you not use rq for sched node, but perhaps node?

>   			    const struct i915_sched_attr *attr)
>   {
>   	struct intel_engine_cs *engine;
> @@ -203,13 +203,13 @@ static void __i915_schedule(struct i915_request *rq,
>   	lockdep_assert_held(&schedule_lock);
>   	GEM_BUG_ON(prio == I915_PRIORITY_INVALID);
>   
> -	if (i915_request_completed(rq))
> +	if (prio <= READ_ONCE(rq->attr.priority))
>   		return;
>   
> -	if (prio <= READ_ONCE(rq->sched.attr.priority))
> +	if (node_signaled(rq))

And refrain from re-ordering the sequence in this patch please.

>   		return;
>   
> -	stack.signaler = &rq->sched;
> +	stack.signaler = rq;
>   	list_add(&stack.dfs_link, &dfs);
>   
>   	/*
> @@ -260,9 +260,9 @@ static void __i915_schedule(struct i915_request *rq,
>   	 * execlists_submit_request()), we can set our own priority and skip
>   	 * acquiring the engine locks.
>   	 */
> -	if (rq->sched.attr.priority == I915_PRIORITY_INVALID) {
> -		GEM_BUG_ON(!list_empty(&rq->sched.link));
> -		rq->sched.attr = *attr;
> +	if (rq->attr.priority == I915_PRIORITY_INVALID) {
> +		GEM_BUG_ON(!list_empty(&rq->link));
> +		rq->attr = *attr;
>   
>   		if (stack.dfs_link.next == stack.dfs_link.prev)
>   			return;
> @@ -271,7 +271,7 @@ static void __i915_schedule(struct i915_request *rq,
>   	}
>   
>   	memset(&cache, 0, sizeof(cache));
> -	engine = rq->engine;
> +	engine = node_to_request(rq)->engine;
>   	spin_lock(&engine->timeline.lock);
>   
>   	/* Fifo and depth-first replacement ensure our deps execute before us */
> @@ -322,13 +322,20 @@ static void __i915_schedule(struct i915_request *rq,
>   void i915_schedule(struct i915_request *rq, const struct i915_sched_attr *attr)
>   {
>   	spin_lock_irq(&schedule_lock);
> -	__i915_schedule(rq, attr);
> +	__i915_schedule(&rq->sched, attr);
>   	spin_unlock_irq(&schedule_lock);
>   }
>   
> +static void __bump_priority(struct i915_sched_node *node, unsigned int bump)
> +{
> +	struct i915_sched_attr attr = node->attr;
> +
> +	attr.priority |= bump;
> +	__i915_schedule(node, &attr);
> +}
> +
>   void i915_schedule_bump_priority(struct i915_request *rq, unsigned int bump)
>   {
> -	struct i915_sched_attr attr;
>   	unsigned long flags;
>   
>   	GEM_BUG_ON(bump & ~I915_PRIORITY_MASK);
> @@ -337,11 +344,7 @@ void i915_schedule_bump_priority(struct i915_request *rq, unsigned int bump)
>   		return;
>   
>   	spin_lock_irqsave(&schedule_lock, flags);
> -
> -	attr = rq->sched.attr;
> -	attr.priority |= bump;
> -	__i915_schedule(rq, &attr);
> -
> +	__bump_priority(&rq->sched, bump);
>   	spin_unlock_irqrestore(&schedule_lock, flags);
>   }
>   
> 

Regards,

Tvrtko

* Re: [PATCH 11/13] drm/i915: Pass i915_sched_node around internally
  2019-05-07 12:12   ` Tvrtko Ursulin
@ 2019-05-07 12:26     ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-05-07 12:26 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-05-07 13:12:05)
> 
> On 03/05/2019 12:52, Chris Wilson wrote:
> > To simplify the next patch, update bump_priority and schedule to accept
> > the internal i915_sched_node directly and not expect a request pointer.
> > 
> > add/remove: 0/0 grow/shrink: 2/1 up/down: 8/-15 (-7)
> > Function                                     old     new   delta
> > i915_schedule_bump_priority                  109     113      +4
> > i915_schedule                                 50      54      +4
> > __i915_schedule                              922     907     -15
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/i915_scheduler.c | 33 +++++++++++++++------------
> >   1 file changed, 18 insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> > index 4a95cf2201a7..380cb7343a10 100644
> > --- a/drivers/gpu/drm/i915/i915_scheduler.c
> > +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> > @@ -189,7 +189,7 @@ static void kick_submission(struct intel_engine_cs *engine, int prio)
> >       tasklet_hi_schedule(&engine->execlists.tasklet);
> >   }
> >   
> > -static void __i915_schedule(struct i915_request *rq,
> > +static void __i915_schedule(struct i915_sched_node *rq,
> 
> Can you not use rq for sched node, but perhaps node?

We use node later on. I kept with rq to try and keep the patch small,
and stick to the current semantics. We could reuse node... That looks
like it is semantically clean.
-Chris

* Re: [PATCH 12/13] drm/i915: Bump signaler priority on adding a waiter
  2019-05-03 11:52 ` [PATCH 12/13] drm/i915: Bump signaler priority on adding a waiter Chris Wilson
@ 2019-05-07 12:46   ` Tvrtko Ursulin
  2019-05-07 13:14     ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-07 12:46 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/05/2019 12:52, Chris Wilson wrote:
> The handling of the no-preemption priority level imposes the restriction
> that we need to maintain the implied ordering even though preemption is
> disabled. Otherwise we may end up with an AB-BA deadlock across multiple
> engine due to a real preemption event reordering the no-preemption
> WAITs. To resolve this issue we currently promote all requests to WAIT
> on unsubmission, however this interferes with the timeslicing
> requirement that we do not apply any implicit promotion that will defeat
> the round-robin timeslice list. (If we automatically promote the active
> request it will go back to the head of the queue and not the tail!)
> 
> So we need implicit promotion to prevent reordering around semaphores
> where we are not allowed to preempt, and we must avoid implicit
> promotion on unsubmission. So instead of at unsubmit, if we apply that
> implicit promotion on adding the dependency, we avoid the semaphore
> deadlock and we also reduce the gains made by the promotion for user
> space waiting. Furthermore, by keeping the earlier dependencies at a
> higher level, we reduce the search space for timeslicing without
> altering runtime scheduling too badly (no dependencies at all will be
> assigned a higher priority for rrul).
> 
> v2: Limit the bump to external edges (as originally intended) i.e.
> between contexts and out to the user.
> 
> Testcase: igt/gem_concurrent_blit
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/selftest_lrc.c      | 12 ++++++++----
>   drivers/gpu/drm/i915/i915_request.c         |  9 ---------
>   drivers/gpu/drm/i915/i915_scheduler.c       | 11 +++++++++++
>   drivers/gpu/drm/i915/i915_scheduler_types.h |  3 ++-
>   4 files changed, 21 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> index 4b042893dc0e..5b3d8e33f1cf 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> @@ -98,12 +98,14 @@ static int live_busywait_preempt(void *arg)
>   	ctx_hi = kernel_context(i915);
>   	if (!ctx_hi)
>   		goto err_unlock;
> -	ctx_hi->sched.priority = INT_MAX;
> +	ctx_hi->sched.priority =
> +		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
>   
>   	ctx_lo = kernel_context(i915);
>   	if (!ctx_lo)
>   		goto err_ctx_hi;
> -	ctx_lo->sched.priority = INT_MIN;
> +	ctx_lo->sched.priority =
> +		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
>   
>   	obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
>   	if (IS_ERR(obj)) {
> @@ -958,12 +960,14 @@ static int live_preempt_hang(void *arg)
>   	ctx_hi = kernel_context(i915);
>   	if (!ctx_hi)
>   		goto err_spin_lo;
> -	ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
> +	ctx_hi->sched.priority =
> +		I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
>   
>   	ctx_lo = kernel_context(i915);
>   	if (!ctx_lo)
>   		goto err_ctx_hi;
> -	ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
> +	ctx_lo->sched.priority =
> +		I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
>   
>   	for_each_engine(engine, i915, id) {
>   		struct i915_request *rq;
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 8cb3ed5531e3..065da1bcbb4c 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -468,15 +468,6 @@ void __i915_request_unsubmit(struct i915_request *request)
>   	/* We may be recursing from the signal callback of another i915 fence */
>   	spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
>   
> -	/*
> -	 * As we do not allow WAIT to preempt inflight requests,
> -	 * once we have executed a request, along with triggering
> -	 * any execution callbacks, we must preserve its ordering
> -	 * within the non-preemptible FIFO.
> -	 */
> -	BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK); /* only internal */
> -	request->sched.attr.priority |= __NO_PREEMPTION;
> -
>   	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags))
>   		i915_request_cancel_breadcrumb(request);
>   
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index 380cb7343a10..ff0ca5718f97 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -391,6 +391,16 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
>   		    !node_started(signal))
>   			node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
>   
> +		/*
> +		 * As we do not allow WAIT to preempt inflight requests,
> +		 * once we have executed a request, along with triggering
> +		 * any execution callbacks, we must preserve its ordering
> +		 * within the non-preemptible FIFO.
> +		 */
> +		BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK);
> +		if (flags & I915_DEPENDENCY_EXTERNAL)
> +			__bump_priority(signal, __NO_PREEMPTION);
> +

I don't really follow how can this be okay from here. It gives wait 
priority to every request which has a dependency now. Which sounds not 
far off from removing the priority bump for waiters altogether. Or 
reversing things and giving requests with no priority a boost.

Regards,

Tvrtko

>   		ret = true;
>   	}
>   
> @@ -409,6 +419,7 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
>   		return -ENOMEM;
>   
>   	if (!__i915_sched_node_add_dependency(node, signal, dep,
> +					      I915_DEPENDENCY_EXTERNAL |
>   					      I915_DEPENDENCY_ALLOC))
>   		i915_dependency_free(dep);
>   
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index 166a457884b2..3e309631bd0b 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -66,7 +66,8 @@ struct i915_dependency {
>   	struct list_head wait_link;
>   	struct list_head dfs_link;
>   	unsigned long flags;
> -#define I915_DEPENDENCY_ALLOC BIT(0)
> +#define I915_DEPENDENCY_ALLOC		BIT(0)
> +#define I915_DEPENDENCY_EXTERNAL	BIT(1)
>   };
>   
>   #endif /* _I915_SCHEDULER_TYPES_H_ */
> 

* Re: [PATCH 12/13] drm/i915: Bump signaler priority on adding a waiter
  2019-05-07 12:46   ` Tvrtko Ursulin
@ 2019-05-07 13:14     ` Chris Wilson
  2019-05-07 14:32       ` Tvrtko Ursulin
  2019-05-07 14:38       ` Chris Wilson
  0 siblings, 2 replies; 36+ messages in thread
From: Chris Wilson @ 2019-05-07 13:14 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-05-07 13:46:45)
> 
> On 03/05/2019 12:52, Chris Wilson wrote:
> > The handling of the no-preemption priority level imposes the restriction
> > that we need to maintain the implied ordering even though preemption is
> > disabled. Otherwise we may end up with an AB-BA deadlock across multiple
> > engines due to a real preemption event reordering the no-preemption
> > WAITs. To resolve this issue we currently promote all requests to WAIT
> > on unsubmission, however this interferes with the timeslicing
> > requirement that we do not apply any implicit promotion that will defeat
> > the round-robin timeslice list. (If we automatically promote the active
> > request it will go back to the head of the queue and not the tail!)
> > 
> > So we need implicit promotion to prevent reordering around semaphores
> > where we are not allowed to preempt, and we must avoid implicit
> > promotion on unsubmission. So instead of at unsubmit, if we apply that
> > implicit promotion on adding the dependency, we avoid the semaphore
> > deadlock and we also reduce the gains made by the promotion for user
> > space waiting. Furthermore, by keeping the earlier dependencies at a
> > higher level, we reduce the search space for timeslicing without
> > altering runtime scheduling too badly (no dependencies at all will be
> > assigned a higher priority for rrul).
> > 
> > v2: Limit the bump to external edges (as originally intended) i.e.
> > between contexts and out to the user.
> > 
> > Testcase: igt/gem_concurrent_blit
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/gt/selftest_lrc.c      | 12 ++++++++----
> >   drivers/gpu/drm/i915/i915_request.c         |  9 ---------
> >   drivers/gpu/drm/i915/i915_scheduler.c       | 11 +++++++++++
> >   drivers/gpu/drm/i915/i915_scheduler_types.h |  3 ++-
> >   4 files changed, 21 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > index 4b042893dc0e..5b3d8e33f1cf 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> > @@ -98,12 +98,14 @@ static int live_busywait_preempt(void *arg)
> >       ctx_hi = kernel_context(i915);
> >       if (!ctx_hi)
> >               goto err_unlock;
> > -     ctx_hi->sched.priority = INT_MAX;
> > +     ctx_hi->sched.priority =
> > +             I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
> >   
> >       ctx_lo = kernel_context(i915);
> >       if (!ctx_lo)
> >               goto err_ctx_hi;
> > -     ctx_lo->sched.priority = INT_MIN;
> > +     ctx_lo->sched.priority =
> > +             I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
> >   
> >       obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
> >       if (IS_ERR(obj)) {
> > @@ -958,12 +960,14 @@ static int live_preempt_hang(void *arg)
> >       ctx_hi = kernel_context(i915);
> >       if (!ctx_hi)
> >               goto err_spin_lo;
> > -     ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
> > +     ctx_hi->sched.priority =
> > +             I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
> >   
> >       ctx_lo = kernel_context(i915);
> >       if (!ctx_lo)
> >               goto err_ctx_hi;
> > -     ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
> > +     ctx_lo->sched.priority =
> > +             I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
> >   
> >       for_each_engine(engine, i915, id) {
> >               struct i915_request *rq;
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 8cb3ed5531e3..065da1bcbb4c 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -468,15 +468,6 @@ void __i915_request_unsubmit(struct i915_request *request)
> >       /* We may be recursing from the signal callback of another i915 fence */
> >       spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
> >   
> > -     /*
> > -      * As we do not allow WAIT to preempt inflight requests,
> > -      * once we have executed a request, along with triggering
> > -      * any execution callbacks, we must preserve its ordering
> > -      * within the non-preemptible FIFO.
> > -      */
> > -     BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK); /* only internal */
> > -     request->sched.attr.priority |= __NO_PREEMPTION;
> > -
> >       if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags))
> >               i915_request_cancel_breadcrumb(request);
> >   
> > diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> > index 380cb7343a10..ff0ca5718f97 100644
> > --- a/drivers/gpu/drm/i915/i915_scheduler.c
> > +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> > @@ -391,6 +391,16 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
> >                   !node_started(signal))
> >                       node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
> >   
> > +             /*
> > +              * As we do not allow WAIT to preempt inflight requests,
> > +              * once we have executed a request, along with triggering
> > +              * any execution callbacks, we must preserve its ordering
> > +              * within the non-preemptible FIFO.
> > +              */
> > +             BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK);
> > +             if (flags & I915_DEPENDENCY_EXTERNAL)
> > +                     __bump_priority(signal, __NO_PREEMPTION);
> > +
> 
> I don't really follow how can this be okay from here. It gives wait 
> priority to every request which has a dependency now. Which sounds not 
> far off from removing the priority bump for waiters altogether. Or 
> reversing things and giving requests with no priority a boost.

It treats even inter-context wait equally, and so reduces the effect of
the user wait from the wait/set-domain ioctl and instead includes all
the waits they supply during execbuf.

It all stems from the way those priorities are ignored for preemption.
Currently we perform the same boost implicitly to all running requests,
which screws up timeslicing, so instead of doing it on the running
request we apply it to the queue and limit it to just the edges that are
susceptible to deadlocks (so in effect we are reducing the number of
requests that have the implicit boost).
-Chris

* Re: [PATCH 12/13] drm/i915: Bump signaler priority on adding a waiter
  2019-05-07 13:14     ` Chris Wilson
@ 2019-05-07 14:32       ` Tvrtko Ursulin
  2019-05-07 14:38       ` Chris Wilson
  1 sibling, 0 replies; 36+ messages in thread
From: Tvrtko Ursulin @ 2019-05-07 14:32 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 07/05/2019 14:14, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-07 13:46:45)
>>
>> On 03/05/2019 12:52, Chris Wilson wrote:
>>> The handling of the no-preemption priority level imposes the restriction
>>> that we need to maintain the implied ordering even though preemption is
>>> disabled. Otherwise we may end up with an AB-BA deadlock across multiple
>>> engines due to a real preemption event reordering the no-preemption
>>> WAITs. To resolve this issue we currently promote all requests to WAIT
>>> on unsubmission, however this interferes with the timeslicing
>>> requirement that we do not apply any implicit promotion that will defeat
>>> the round-robin timeslice list. (If we automatically promote the active
>>> request it will go back to the head of the queue and not the tail!)
>>>
>>> So we need implicit promotion to prevent reordering around semaphores
>>> where we are not allowed to preempt, and we must avoid implicit
>>> promotion on unsubmission. So instead of at unsubmit, if we apply that
>>> implicit promotion on adding the dependency, we avoid the semaphore
>>> deadlock and we also reduce the gains made by the promotion for user
>>> space waiting. Furthermore, by keeping the earlier dependencies at a
>>> higher level, we reduce the search space for timeslicing without
>>> altering runtime scheduling too badly (no dependencies at all will be
>>> assigned a higher priority for rrul).
>>>
>>> v2: Limit the bump to external edges (as originally intended) i.e.
>>> between contexts and out to the user.
>>>
>>> Testcase: igt/gem_concurrent_blit
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/gt/selftest_lrc.c      | 12 ++++++++----
>>>    drivers/gpu/drm/i915/i915_request.c         |  9 ---------
>>>    drivers/gpu/drm/i915/i915_scheduler.c       | 11 +++++++++++
>>>    drivers/gpu/drm/i915/i915_scheduler_types.h |  3 ++-
>>>    4 files changed, 21 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>> index 4b042893dc0e..5b3d8e33f1cf 100644
>>> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
>>> @@ -98,12 +98,14 @@ static int live_busywait_preempt(void *arg)
>>>        ctx_hi = kernel_context(i915);
>>>        if (!ctx_hi)
>>>                goto err_unlock;
>>> -     ctx_hi->sched.priority = INT_MAX;
>>> +     ctx_hi->sched.priority =
>>> +             I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
>>>    
>>>        ctx_lo = kernel_context(i915);
>>>        if (!ctx_lo)
>>>                goto err_ctx_hi;
>>> -     ctx_lo->sched.priority = INT_MIN;
>>> +     ctx_lo->sched.priority =
>>> +             I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
>>>    
>>>        obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
>>>        if (IS_ERR(obj)) {
>>> @@ -958,12 +960,14 @@ static int live_preempt_hang(void *arg)
>>>        ctx_hi = kernel_context(i915);
>>>        if (!ctx_hi)
>>>                goto err_spin_lo;
>>> -     ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
>>> +     ctx_hi->sched.priority =
>>> +             I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
>>>    
>>>        ctx_lo = kernel_context(i915);
>>>        if (!ctx_lo)
>>>                goto err_ctx_hi;
>>> -     ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
>>> +     ctx_lo->sched.priority =
>>> +             I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
>>>    
>>>        for_each_engine(engine, i915, id) {
>>>                struct i915_request *rq;
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 8cb3ed5531e3..065da1bcbb4c 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -468,15 +468,6 @@ void __i915_request_unsubmit(struct i915_request *request)
>>>        /* We may be recursing from the signal callback of another i915 fence */
>>>        spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
>>>    
>>> -     /*
>>> -      * As we do not allow WAIT to preempt inflight requests,
>>> -      * once we have executed a request, along with triggering
>>> -      * any execution callbacks, we must preserve its ordering
>>> -      * within the non-preemptible FIFO.
>>> -      */
>>> -     BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK); /* only internal */
>>> -     request->sched.attr.priority |= __NO_PREEMPTION;
>>> -
>>>        if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags))
>>>                i915_request_cancel_breadcrumb(request);
>>>    
>>> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
>>> index 380cb7343a10..ff0ca5718f97 100644
>>> --- a/drivers/gpu/drm/i915/i915_scheduler.c
>>> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
>>> @@ -391,6 +391,16 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
>>>                    !node_started(signal))
>>>                        node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
>>>    
>>> +             /*
>>> +              * As we do not allow WAIT to preempt inflight requests,
>>> +              * once we have executed a request, along with triggering
>>> +              * any execution callbacks, we must preserve its ordering
>>> +              * within the non-preemptible FIFO.
>>> +              */
>>> +             BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK);
>>> +             if (flags & I915_DEPENDENCY_EXTERNAL)
>>> +                     __bump_priority(signal, __NO_PREEMPTION);
>>> +
>>
>> I don't really follow how can this be okay from here. It gives wait
>> priority to every request which has a dependency now. Which sounds not
>> far off from removing the priority bump for waiters altogether. Or
>> reversing things and giving requests with no priority a boost.
> 
> It treats even inter-context wait equally, and so reduces the effect of
> the user wait from the wait/set-domain ioctl and instead includes all
> the waits they supply during execbuf.
> 
> It all stems from the way those priorities are ignored for preemption.
> Currently we perform the same boost implicitly to all running requests,
> which screws up timeslicing, so instead of doing it on the running
> request we apply it to the queue and limit it to just the edges that are
> susceptible to deadlocks (so in effect we are reducing the number of
> requests that have the implicit boost).

I don't really understand why a priority bump on unsubmit would screw up 
timeslicing. Would a more thorough time-slicing scheduler help there? For 
instance, a prio bump on unsubmit sounds like something schedulers normally 
do to prevent starvation. So fundamentally the concept sounds sane. Perhaps 
the issue is just some implementation details.

Regards,

Tvrtko

* Re: [PATCH 12/13] drm/i915: Bump signaler priority on adding a waiter
  2019-05-07 13:14     ` Chris Wilson
  2019-05-07 14:32       ` Tvrtko Ursulin
@ 2019-05-07 14:38       ` Chris Wilson
  1 sibling, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2019-05-07 14:38 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Chris Wilson (2019-05-07 14:14:01)
> Quoting Tvrtko Ursulin (2019-05-07 13:46:45)
> > 
> > On 03/05/2019 12:52, Chris Wilson wrote:
> > > diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> > > index 380cb7343a10..ff0ca5718f97 100644
> > > --- a/drivers/gpu/drm/i915/i915_scheduler.c
> > > +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> > > @@ -391,6 +391,16 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
> > >                   !node_started(signal))
> > >                       node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
> > >   
> > > +             /*
> > > +              * As we do not allow WAIT to preempt inflight requests,
> > > +              * once we have executed a request, along with triggering
> > > +              * any execution callbacks, we must preserve its ordering
> > > +              * within the non-preemptible FIFO.
> > > +              */
> > > +             BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK);
> > > +             if (flags & I915_DEPENDENCY_EXTERNAL)
> > > +                     __bump_priority(signal, __NO_PREEMPTION);
> > > +
> > 
> > I don't really follow how can this be okay from here. It gives wait 
> > priority to every request which has a dependency now. Which sounds not 
> > far off from removing the priority bump for waiters altogether. Or 
> > reversing things and giving requests with no priority a boost.
> 
> It treats even inter-context wait equally, and so reduces the effect of
> the user wait from the wait/set-domain ioctl and instead includes all
> the waits they supply during execbuf.
> 
> It all stems from the way those priorities are ignored for preemption.
> Currently we perform the same boost implicitly to all running requests,
> which screws up timeslicing, so instead of doing it on the running
> request we apply it to the queue and limit it to just the edges that are
> susceptible to deadlocks (so in effect we are reducing the number of
> requests that have the implicit boost).

So one aspect of this is that it does cause the queue to one engine to
be submitted as an ensemble, slightly out of submission order. Instead of
per-request submission order, you get an entire client's submission to
one engine in a single block as it gets bumped to WAIT, but overall the
workload is still in submission order, and there's no preemption of
running processes.

In early versions (when the semaphore wasn't quite so greedy) this did
result in each client being batched into one block, more or less, with
overlapping clients filling in the bubbles. The positive aspect of that
is that, without inducing bubbles, you get much more stable throughput
results -- the cost would be less fairness in the short term, though
even there we have the NEWCLIENT boost to overcome our inherent
unfairness and coarseness.
-Chris

end of thread, other threads:[~2019-05-07 14:38 UTC | newest]

Thread overview: 36+ messages
2019-05-03 11:52 [PATCH 01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Chris Wilson
2019-05-03 11:52 ` [PATCH 02/13] drm/i915: Prefer checking the wakeref itself rather than the counter Chris Wilson
2019-05-07 10:48   ` Tvrtko Ursulin
2019-05-03 11:52 ` [PATCH 03/13] drm/i915: Assert the local engine->wakeref is active Chris Wilson
2019-05-07 10:52   ` Tvrtko Ursulin
2019-05-03 11:52 ` [PATCH 04/13] drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD Chris Wilson
2019-05-03 11:52 ` [PATCH 05/13] drm/i915: Remove delay for idle_work Chris Wilson
2019-05-07 10:54   ` Tvrtko Ursulin
2019-05-03 11:52 ` [PATCH 06/13] drm/i915: Cancel retire_worker on parking Chris Wilson
2019-05-07 10:55   ` Tvrtko Ursulin
2019-05-03 11:52 ` [PATCH 07/13] drm/i915: Stop spinning for DROP_IDLE (debugfs/i915_drop_caches) Chris Wilson
2019-05-03 11:52 ` [PATCH 08/13] drm/i915: Only reschedule the submission tasklet if preemption is possible Chris Wilson
2019-05-07 11:53   ` Tvrtko Ursulin
2019-05-03 11:52 ` [PATCH 09/13] drm/i915/execlists: Don't apply priority boost for resets Chris Wilson
2019-05-07 12:04   ` Tvrtko Ursulin
2019-05-03 11:52 ` [PATCH 10/13] drm/i915: Rearrange i915_scheduler.c Chris Wilson
2019-05-07 12:06   ` Tvrtko Ursulin
2019-05-03 11:52 ` [PATCH 11/13] drm/i915: Pass i915_sched_node around internally Chris Wilson
2019-05-07 12:12   ` Tvrtko Ursulin
2019-05-07 12:26     ` Chris Wilson
2019-05-03 11:52 ` [PATCH 12/13] drm/i915: Bump signaler priority on adding a waiter Chris Wilson
2019-05-07 12:46   ` Tvrtko Ursulin
2019-05-07 13:14     ` Chris Wilson
2019-05-07 14:32       ` Tvrtko Ursulin
2019-05-07 14:38       ` Chris Wilson
2019-05-03 11:52 ` [PATCH 13/13] drm/i915: Disable semaphore busywaits on saturated systems Chris Wilson
2019-05-03 12:56 ` ✗ Fi.CI.SPARSE: warning for series starting with [01/13] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler Patchwork
2019-05-03 13:18 ` ✗ Fi.CI.BAT: failure " Patchwork
2019-05-03 13:32 ` [PATCH 01/13] " Tvrtko Ursulin
2019-05-03 13:37   ` Chris Wilson
2019-05-03 13:49     ` Tvrtko Ursulin
2019-05-03 15:22 ` [PATCH v2] " Chris Wilson
2019-05-07 10:39   ` Tvrtko Ursulin
2019-05-03 15:38 ` ✗ Fi.CI.SPARSE: warning for series starting with [v2] drm/i915: Assert breadcrumbs are correctly ordered in the signal handler (rev2) Patchwork
2019-05-03 15:53 ` ✓ Fi.CI.BAT: success " Patchwork
2019-05-03 19:22 ` ✗ Fi.CI.IGT: failure " Patchwork
