* [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
@ 2017-10-20  9:59 Chris Wilson
  2017-10-20  9:59 ` [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts Chris Wilson
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: Chris Wilson @ 2017-10-20  9:59 UTC (permalink / raw)
  To: intel-gfx

In the idle worker we drop the prolonged GT wakeref used to cover such
essentials as interrupt delivery. (When a CS interrupt arrives, we also
assert that the GT is awake.) However, it turns out that 10ms is not
long enough to be assured that the last CS interrupt has been delivered,
so bump that to 200ms, and move the entirety of that wait to before we
take the struct_mutex to avoid blocking. As this is now a potentially
long wait, restore the earlier behaviour of bailing out early when a new
request arrives.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 026cb52ece0b..d3a638613857 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3281,8 +3281,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
 {
 	struct drm_i915_private *dev_priv =
 		container_of(work, typeof(*dev_priv), gt.idle_work.work);
-	struct drm_device *dev = &dev_priv->drm;
 	bool rearm_hangcheck;
+	ktime_t end;
 
 	if (!READ_ONCE(dev_priv->gt.awake))
 		return;
@@ -3291,14 +3291,22 @@ i915_gem_idle_work_handler(struct work_struct *work)
 	 * Wait for last execlists context complete, but bail out in case a
 	 * new request is submitted.
 	 */
-	wait_for(intel_engines_are_idle(dev_priv), 10);
-	if (READ_ONCE(dev_priv->gt.active_requests))
-		return;
+	end = ktime_add_ms(ktime_get(), 200);
+	do {
+		if (READ_ONCE(dev_priv->gt.active_requests) ||
+		    work_pending(work))
+			return;
+
+		if (intel_engines_are_idle(dev_priv))
+			break;
+
+		usleep_range(100, 500);
+	} while (ktime_before(ktime_get(), end));
 
 	rearm_hangcheck =
 		cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
 
-	if (!mutex_trylock(&dev->struct_mutex)) {
+	if (!mutex_trylock(&dev_priv->drm.struct_mutex)) {
 		/* Currently busy, come back later */
 		mod_delayed_work(dev_priv->wq,
 				 &dev_priv->gt.idle_work,
@@ -3310,13 +3318,14 @@ i915_gem_idle_work_handler(struct work_struct *work)
 	 * New request retired after this work handler started, extend active
 	 * period until next instance of the work.
 	 */
-	if (work_pending(work))
+	if (dev_priv->gt.active_requests || work_pending(work))
 		goto out_unlock;
 
-	if (dev_priv->gt.active_requests)
-		goto out_unlock;
-
-	if (wait_for(intel_engines_are_idle(dev_priv), 10))
+	/*
+	 * We are committed now to parking the engines, make sure there
+	 * will be no more interrupts arriving later.
+	 */
+	if (!intel_engines_are_idle(dev_priv))
 		DRM_ERROR("Timeout waiting for engines to idle\n");
 
 	intel_engines_mark_idle(dev_priv);
@@ -3330,7 +3339,7 @@ i915_gem_idle_work_handler(struct work_struct *work)
 		gen6_rps_idle(dev_priv);
 	intel_runtime_pm_put(dev_priv);
 out_unlock:
-	mutex_unlock(&dev->struct_mutex);
+	mutex_unlock(&dev_priv->drm.struct_mutex);
 
 out_rearm:
 	if (rearm_hangcheck) {
-- 
2.15.0.rc1


* [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts
  2017-10-20  9:59 [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Chris Wilson
@ 2017-10-20  9:59 ` Chris Wilson
  2017-10-20 11:48   ` [PATCH v2] " Chris Wilson
                     ` (2 more replies)
  2017-10-20 10:23 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Patchwork
                   ` (5 subsequent siblings)
  6 siblings, 3 replies; 21+ messages in thread
From: Chris Wilson @ 2017-10-20  9:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Back in commit a4b2b01523a8 ("drm/i915: Don't mark an execlists
context-switch when idle") we noticed the presence of late
context-switch interrupts. We were able to filter those out by looking
at whether the ELSP remained active, but in commit beecec901790
("drm/i915/execlists: Preemption!") that became problematic as we now
anticipate receiving a context-switch event for preemption while ELSP
may be empty. To restore the spurious interrupt suppression, add a
counter for the expected number of pending context-switches and skip if
we do not need to handle this interrupt to make forward progress.

Fixes: beecec901790 ("drm/i915/execlists: Preemption!")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 11 ++++++-----
 drivers/gpu/drm/i915/i915_irq.c            |  6 ++++--
 drivers/gpu/drm/i915/intel_engine_cs.c     |  5 +++--
 drivers/gpu/drm/i915/intel_lrc.c           | 21 +++++++++++++++++----
 drivers/gpu/drm/i915/intel_ringbuffer.h    |  7 +++++++
 5 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index a2e8114b739d..c22d0a07467c 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -550,11 +550,9 @@ static void port_assign(struct execlist_port *port,
 			struct drm_i915_gem_request *rq)
 {
 	GEM_BUG_ON(rq == port_request(port));
+	GEM_BUG_ON(port_isset(port));
 
-	if (port_isset(port))
-		i915_gem_request_put(port_request(port));
-
-	port_set(port, port_pack(i915_gem_request_get(rq), port_count(port)));
+	port_set(port, i915_gem_request_get(rq));
 	nested_enable_signaling(rq);
 }
 
@@ -586,8 +584,10 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 					goto done;
 				}
 
-				if (submit)
+				if (submit) {
 					port_assign(port, last);
+					execlists->active++;
+				}
 				port++;
 			}
 
@@ -610,6 +610,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 	execlists->first = rb;
 	if (submit) {
 		port_assign(port, last);
+		execlists->active++;
 		i915_guc_submit(engine);
 	}
 	spin_unlock_irq(&engine->timeline->lock);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b1296a55c1e4..f8205841868b 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1388,8 +1388,10 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 	bool tasklet = false;
 
 	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
-		__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
-		tasklet = true;
+		if (READ_ONCE(engine->execlists.active)) {
+			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
+			tasklet = true;
+		}
 	}
 
 	if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) {
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index a47a9c6bea52..8aecbc321786 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1548,8 +1548,8 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
 		return false;
 
-	/* Both ports drained, no more ELSP submission? */
-	if (port_request(&engine->execlists.port[0]))
+	/* Waiting to drain ELSP? */
+	if (READ_ONCE(engine->execlists.active))
 		return false;
 
 	/* ELSP is empty, but there are ready requests? */
@@ -1749,6 +1749,7 @@ void intel_engine_dump(struct intel_engine_cs *engine, struct drm_printer *m)
 					   idx);
 			}
 		}
+		drm_printf(m, "\t\tELSP active %d\n", execlists->active);
 		rcu_read_unlock();
 	} else if (INTEL_GEN(dev_priv) > 6) {
 		drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 7f45dd7dc3e5..8a47b8211c74 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -482,15 +482,23 @@ static bool can_merge_ctx(const struct i915_gem_context *prev,
 	return true;
 }
 
-static void port_assign(struct execlist_port *port,
+static bool port_assign(struct execlist_port *port,
 			struct drm_i915_gem_request *rq)
 {
+	bool was_idle;
+
 	GEM_BUG_ON(rq == port_request(port));
 
-	if (port_isset(port))
+	if (port_isset(port)) {
 		i915_gem_request_put(port_request(port));
+		was_idle = false;
+	} else {
+		was_idle = true;
+	}
 
 	port_set(port, port_pack(i915_gem_request_get(rq), port_count(port)));
+
+	return was_idle;
 }
 
 static void inject_preempt_context(struct intel_engine_cs *engine)
@@ -657,7 +665,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				GEM_BUG_ON(last->ctx == rq->ctx);
 
 				if (submit)
-					port_assign(port, last);
+					execlists->active += port_assign(port, last);
 				port++;
 
 				GEM_BUG_ON(port_isset(port));
@@ -679,7 +687,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 done:
 	execlists->first = rb;
 	if (submit)
-		port_assign(port, last);
+		execlists->active += port_assign(port, last);
 unlock:
 	spin_unlock_irq(&engine->timeline->lock);
 
@@ -696,10 +704,12 @@ execlist_cancel_port_requests(struct intel_engine_execlists *execlists)
 	while (num_ports-- && port_isset(port)) {
 		struct drm_i915_gem_request *rq = port_request(port);
 
+		GEM_BUG_ON(!execlists->active);
 		execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_PREEMPTED);
 		i915_gem_request_put(rq);
 
 		memset(port, 0, sizeof(*port));
+		execlists->active--;
 		port++;
 	}
 }
@@ -863,6 +873,8 @@ static void intel_lrc_irq_handler(unsigned long data)
 
 				GEM_BUG_ON(!execlists->preempt);
 				execlists->preempt = false;
+				execlists->active--;
+				GEM_BUG_ON(execlists->active);
 				continue;
 			}
 
@@ -1461,6 +1473,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
 	clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 	execlists->csb_head = -1;
 	execlists->preempt = false;
+	execlists->active = 0;
 
 	/* After a GPU reset, we may have requests to replay */
 	if (!i915_modparams.enable_guc_submission && execlists->first)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 17186f067408..588f5fe9d2b7 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -245,6 +245,11 @@ struct intel_engine_execlists {
 	 */
 	bool preempt;
 
+	/**
+	 * @active: count of the number of outstanding CSB events
+	 */
+	unsigned int active;
+
 	/**
 	 * @port_mask: number of execlist ports - 1
 	 */
@@ -538,9 +543,11 @@ execlists_port_complete(struct intel_engine_execlists * const execlists,
 	const unsigned int m = execlists->port_mask;
 
 	GEM_BUG_ON(port_index(port, execlists) != 0);
+	GEM_BUG_ON(!execlists->active);
 
 	memmove(port, port + 1, m * sizeof(struct execlist_port));
 	memset(port + m, 0, sizeof(struct execlist_port));
+	execlists->active--;
 }
 
 static inline unsigned int
-- 
2.15.0.rc1


* ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
  2017-10-20  9:59 [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Chris Wilson
  2017-10-20  9:59 ` [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts Chris Wilson
@ 2017-10-20 10:23 ` Patchwork
  2017-10-20 11:38 ` ✓ Fi.CI.IGT: " Patchwork
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2017-10-20 10:23 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
URL   : https://patchwork.freedesktop.org/series/32349/
State : success

== Summary ==

Series 32349v1 series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
https://patchwork.freedesktop.org/api/1.0/series/32349/revisions/1/mbox/

Test chamelium:
        Subgroup dp-crc-fast:
                pass       -> FAIL       (fi-kbl-7500u) fdo#102514
Test gem_exec_reloc:
        Subgroup basic-cpu-gtt-active:
                pass       -> FAIL       (fi-gdg-551) fdo#102582 +2
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (fi-byt-j1900) fdo#101705
Test drv_module_reload:
        Subgroup basic-reload-inject:
                dmesg-warn -> INCOMPLETE (fi-cfl-s) fdo#103206

fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
fdo#102582 https://bugs.freedesktop.org/show_bug.cgi?id=102582
fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705
fdo#103206 https://bugs.freedesktop.org/show_bug.cgi?id=103206

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:437s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:454s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:373s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:534s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:263s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:499s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:497s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:493s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:477s
fi-cfl-s         total:288  pass:253  dwarn:3   dfail:0   fail:0   skip:31 
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:406s
fi-gdg-551       total:289  pass:174  dwarn:1   dfail:0   fail:5   skip:109 time:251s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:582s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:449s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:427s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:431s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:486s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:459s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:0   fail:1   skip:24  time:480s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:565s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:477s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:581s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:546s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:454s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:644s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:514s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:504s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:460s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:562s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:417s

8ddf486a3f3a9cdadfe9f86a879b5092c136e491 drm-tip: 2017y-10m-19d-20h-21m-08s UTC integration manifest
44db6733d92e drm/i915: Filter out spurious execlists context-switch interrupts
c5717b809926 drm/i915: Bump wait-times for the final CS interrupt before parking

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6119/

* ✓ Fi.CI.IGT: success for series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
  2017-10-20  9:59 [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Chris Wilson
  2017-10-20  9:59 ` [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts Chris Wilson
  2017-10-20 10:23 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Patchwork
@ 2017-10-20 11:38 ` Patchwork
  2017-10-20 12:19 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking (rev2) Patchwork
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2017-10-20 11:38 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
URL   : https://patchwork.freedesktop.org/series/32349/
State : success

== Summary ==

Test kms_busy:
        Subgroup extended-modeset-hang-oldfb-with-reset-render-C:
                pass       -> DMESG-WARN (shard-hsw) fdo#102249
Test kms_flip:
        Subgroup plain-flip-fb-recreate:
                pass       -> FAIL       (shard-hsw) fdo#102504
Test kms_setmode:
        Subgroup basic:
                pass       -> FAIL       (shard-hsw) fdo#99912

fdo#102249 https://bugs.freedesktop.org/show_bug.cgi?id=102249
fdo#102504 https://bugs.freedesktop.org/show_bug.cgi?id=102504
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-hsw        total:2540 pass:1426 dwarn:3   dfail:0   fail:10  skip:1101 time:9198s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6119/shards.html

* [PATCH v2] drm/i915: Filter out spurious execlists context-switch interrupts
  2017-10-20  9:59 ` [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts Chris Wilson
@ 2017-10-20 11:48   ` Chris Wilson
  2017-10-20 13:21   ` [PATCH 2/2] " Mika Kuoppala
  2017-10-20 13:59   ` Mika Kuoppala
  2 siblings, 0 replies; 21+ messages in thread
From: Chris Wilson @ 2017-10-20 11:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Back in commit a4b2b01523a8 ("drm/i915: Don't mark an execlists
context-switch when idle") we noticed the presence of late
context-switch interrupts. We were able to filter those out by looking
at whether the ELSP remained active, but in commit beecec901790
("drm/i915/execlists: Preemption!") that became problematic as we now
anticipate receiving a context-switch event for preemption while ELSP
may be empty. To restore the spurious interrupt suppression, add a
counter for the expected number of pending context-switches and skip if
we do not need to handle this interrupt to make forward progress.

v2: Add the missing inc(active) for preempt.

Fixes: beecec901790 ("drm/i915/execlists: Preemption!")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 11 ++++++-----
 drivers/gpu/drm/i915/i915_irq.c            |  6 ++++--
 drivers/gpu/drm/i915/intel_engine_cs.c     |  5 +++--
 drivers/gpu/drm/i915/intel_lrc.c           | 22 ++++++++++++++++++----
 drivers/gpu/drm/i915/intel_ringbuffer.h    |  7 +++++++
 5 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index a2e8114b739d..c22d0a07467c 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -550,11 +550,9 @@ static void port_assign(struct execlist_port *port,
 			struct drm_i915_gem_request *rq)
 {
 	GEM_BUG_ON(rq == port_request(port));
+	GEM_BUG_ON(port_isset(port));
 
-	if (port_isset(port))
-		i915_gem_request_put(port_request(port));
-
-	port_set(port, port_pack(i915_gem_request_get(rq), port_count(port)));
+	port_set(port, i915_gem_request_get(rq));
 	nested_enable_signaling(rq);
 }
 
@@ -586,8 +584,10 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 					goto done;
 				}
 
-				if (submit)
+				if (submit) {
 					port_assign(port, last);
+					execlists->active++;
+				}
 				port++;
 			}
 
@@ -610,6 +610,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 	execlists->first = rb;
 	if (submit) {
 		port_assign(port, last);
+		execlists->active++;
 		i915_guc_submit(engine);
 	}
 	spin_unlock_irq(&engine->timeline->lock);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b1296a55c1e4..f8205841868b 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1388,8 +1388,10 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 	bool tasklet = false;
 
 	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
-		__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
-		tasklet = true;
+		if (READ_ONCE(engine->execlists.active)) {
+			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
+			tasklet = true;
+		}
 	}
 
 	if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) {
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index a47a9c6bea52..8aecbc321786 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1548,8 +1548,8 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
 		return false;
 
-	/* Both ports drained, no more ELSP submission? */
-	if (port_request(&engine->execlists.port[0]))
+	/* Waiting to drain ELSP? */
+	if (READ_ONCE(engine->execlists.active))
 		return false;
 
 	/* ELSP is empty, but there are ready requests? */
@@ -1749,6 +1749,7 @@ void intel_engine_dump(struct intel_engine_cs *engine, struct drm_printer *m)
 					   idx);
 			}
 		}
+		drm_printf(m, "\t\tELSP active %d\n", execlists->active);
 		rcu_read_unlock();
 	} else if (INTEL_GEN(dev_priv) > 6) {
 		drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 7f45dd7dc3e5..c2cdb85db365 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -482,15 +482,23 @@ static bool can_merge_ctx(const struct i915_gem_context *prev,
 	return true;
 }
 
-static void port_assign(struct execlist_port *port,
+static bool port_assign(struct execlist_port *port,
 			struct drm_i915_gem_request *rq)
 {
+	bool was_idle;
+
 	GEM_BUG_ON(rq == port_request(port));
 
-	if (port_isset(port))
+	if (port_isset(port)) {
 		i915_gem_request_put(port_request(port));
+		was_idle = false;
+	} else {
+		was_idle = true;
+	}
 
 	port_set(port, port_pack(i915_gem_request_get(rq), port_count(port)));
+
+	return was_idle;
 }
 
 static void inject_preempt_context(struct intel_engine_cs *engine)
@@ -576,6 +584,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			 */
 			inject_preempt_context(engine);
 			execlists->preempt = true;
+			execlists->active++;
 			goto unlock;
 		} else {
 			/*
@@ -657,7 +666,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 				GEM_BUG_ON(last->ctx == rq->ctx);
 
 				if (submit)
-					port_assign(port, last);
+					execlists->active += port_assign(port, last);
 				port++;
 
 				GEM_BUG_ON(port_isset(port));
@@ -679,7 +688,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 done:
 	execlists->first = rb;
 	if (submit)
-		port_assign(port, last);
+		execlists->active += port_assign(port, last);
 unlock:
 	spin_unlock_irq(&engine->timeline->lock);
 
@@ -696,10 +705,12 @@ execlist_cancel_port_requests(struct intel_engine_execlists *execlists)
 	while (num_ports-- && port_isset(port)) {
 		struct drm_i915_gem_request *rq = port_request(port);
 
+		GEM_BUG_ON(!execlists->active);
 		execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_PREEMPTED);
 		i915_gem_request_put(rq);
 
 		memset(port, 0, sizeof(*port));
+		execlists->active--;
 		port++;
 	}
 }
@@ -863,6 +874,8 @@ static void intel_lrc_irq_handler(unsigned long data)
 
 				GEM_BUG_ON(!execlists->preempt);
 				execlists->preempt = false;
+				execlists->active--;
+				GEM_BUG_ON(execlists->active);
 				continue;
 			}
 
@@ -1461,6 +1474,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
 	clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 	execlists->csb_head = -1;
 	execlists->preempt = false;
+	execlists->active = 0;
 
 	/* After a GPU reset, we may have requests to replay */
 	if (!i915_modparams.enable_guc_submission && execlists->first)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 17186f067408..588f5fe9d2b7 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -245,6 +245,11 @@ struct intel_engine_execlists {
 	 */
 	bool preempt;
 
+	/**
+	 * @active: count of the number of outstanding CSB events
+	 */
+	unsigned int active;
+
 	/**
 	 * @port_mask: number of execlist ports - 1
 	 */
@@ -538,9 +543,11 @@ execlists_port_complete(struct intel_engine_execlists * const execlists,
 	const unsigned int m = execlists->port_mask;
 
 	GEM_BUG_ON(port_index(port, execlists) != 0);
+	GEM_BUG_ON(!execlists->active);
 
 	memmove(port, port + 1, m * sizeof(struct execlist_port));
 	memset(port + m, 0, sizeof(struct execlist_port));
+	execlists->active--;
 }
 
 static inline unsigned int
-- 
2.15.0.rc1


* ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking (rev2)
  2017-10-20  9:59 [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Chris Wilson
                   ` (2 preceding siblings ...)
  2017-10-20 11:38 ` ✓ Fi.CI.IGT: " Patchwork
@ 2017-10-20 12:19 ` Patchwork
  2017-10-20 13:11 ` [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Mika Kuoppala
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2017-10-20 12:19 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking (rev2)
URL   : https://patchwork.freedesktop.org/series/32349/
State : success

== Summary ==

Series 32349v2 series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
https://patchwork.freedesktop.org/api/1.0/series/32349/revisions/2/mbox/

Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b-frame-sequence:
                incomplete -> PASS       (fi-skl-6700hq) fdo#102035
Test prime_vgem:
        Subgroup basic-fence-flip:
                dmesg-warn -> PASS       (fi-kbl-7567u) fdo#103165 +1

fdo#102035 https://bugs.freedesktop.org/show_bug.cgi?id=102035
fdo#103165 https://bugs.freedesktop.org/show_bug.cgi?id=103165

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:456s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:447s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:370s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:544s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:264s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:503s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:500s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:504s
fi-byt-n2820     total:289  pass:249  dwarn:1   dfail:0   fail:0   skip:39  time:485s
fi-cfl-s         total:289  pass:253  dwarn:4   dfail:0   fail:0   skip:32  time:551s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:415s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:251s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:583s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:453s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:430s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:437s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:485s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:461s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:489s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:571s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:478s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:583s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:543s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:458s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:649s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:517s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:504s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:456s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:572s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:428s

75b85b0dce2faee585046d3ce19d76bcb163c3e7 drm-tip: 2017y-10m-20d-10h-07m-16s UTC integration manifest
ba00357f8415 drm/i915: Filter out spurious execlists context-switch interrupts
d38bebdd7c55 drm/i915: Bump wait-times for the final CS interrupt before parking

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6121/

* Re: [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
  2017-10-20  9:59 [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Chris Wilson
                   ` (3 preceding siblings ...)
  2017-10-20 12:19 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking (rev2) Patchwork
@ 2017-10-20 13:11 ` Mika Kuoppala
  2017-10-20 13:19   ` Chris Wilson
  2017-10-20 13:47 ` ✓ Fi.CI.IGT: success for series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking (rev2) Patchwork
  2017-10-23 11:52 ` [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Mika Kuoppala
  6 siblings, 1 reply; 21+ messages in thread
From: Mika Kuoppala @ 2017-10-20 13:11 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> In the idle worker we drop the prolonged GT wakeref used to cover such
> essentials as interrupt delivery. (When a CS interrupt arrives, we also
> assert that the GT is awake.) However, it turns out that 10ms is not
> long enough to be assured that the last CS interrupt has been delivered,
> so bump that to 200ms, and move the entirety of that wait to before we
> take the struct_mutex to avoid blocking. As this is now a potentially
> long wait, restore the earlier behaviour of bailing out early when a new
> request arrives.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 31 ++++++++++++++++++++-----------
>  1 file changed, 20 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 026cb52ece0b..d3a638613857 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3281,8 +3281,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
>  {
>  	struct drm_i915_private *dev_priv =
>  		container_of(work, typeof(*dev_priv), gt.idle_work.work);
> -	struct drm_device *dev = &dev_priv->drm;
>  	bool rearm_hangcheck;
> +	ktime_t end;
>  
>  	if (!READ_ONCE(dev_priv->gt.awake))
>  		return;
> @@ -3291,14 +3291,22 @@ i915_gem_idle_work_handler(struct work_struct *work)
>  	 * Wait for last execlists context complete, but bail out in case a
>  	 * new request is submitted.
>  	 */
> -	wait_for(intel_engines_are_idle(dev_priv), 10);
> -	if (READ_ONCE(dev_priv->gt.active_requests))
> -		return;
> +	end = ktime_add_ms(ktime_get(), 200);
> +	do {
> +		if (READ_ONCE(dev_priv->gt.active_requests) ||
> +		    work_pending(work))
> +			return;
> +
> +		if (intel_engines_are_idle(dev_priv))
> +			break;
> +
> +		usleep_range(100, 500);
> +	} while (ktime_before(ktime_get(), end));
>

Why can't we just wait_for(intel_engines_are_idle(dev_priv), 200)?

-Mika

>  	rearm_hangcheck =
>  		cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
>  
> -	if (!mutex_trylock(&dev->struct_mutex)) {
> +	if (!mutex_trylock(&dev_priv->drm.struct_mutex)) {
>  		/* Currently busy, come back later */
>  		mod_delayed_work(dev_priv->wq,
>  				 &dev_priv->gt.idle_work,
> @@ -3310,13 +3318,14 @@ i915_gem_idle_work_handler(struct work_struct *work)
>  	 * New request retired after this work handler started, extend active
>  	 * period until next instance of the work.
>  	 */
> -	if (work_pending(work))
> +	if (dev_priv->gt.active_requests || work_pending(work))
>  		goto out_unlock;
>  
> -	if (dev_priv->gt.active_requests)
> -		goto out_unlock;
> -
> -	if (wait_for(intel_engines_are_idle(dev_priv), 10))
> +	/*
> +	 * We are committed now to parking the engines, make sure there
> +	 * will be no more interrupts arriving later.
> +	 */
> +	if (!intel_engines_are_idle(dev_priv))
>  		DRM_ERROR("Timeout waiting for engines to idle\n");
>  
>  	intel_engines_mark_idle(dev_priv);
> @@ -3330,7 +3339,7 @@ i915_gem_idle_work_handler(struct work_struct *work)
>  		gen6_rps_idle(dev_priv);
>  	intel_runtime_pm_put(dev_priv);
>  out_unlock:
> -	mutex_unlock(&dev->struct_mutex);
> +	mutex_unlock(&dev_priv->drm.struct_mutex);
>  
>  out_rearm:
>  	if (rearm_hangcheck) {
> -- 
> 2.15.0.rc1

* Re: [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
  2017-10-20 13:11 ` [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Mika Kuoppala
@ 2017-10-20 13:19   ` Chris Wilson
  2017-10-20 13:23     ` Mika Kuoppala
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Wilson @ 2017-10-20 13:19 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-10-20 14:11:53)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > In the idle worker we drop the prolonged GT wakeref used to cover such
> > essentials as interrupt delivery. (When a CS interrupt arrives, we also
> > assert that the GT is awake.) However, it turns out that 10ms is not
> > long enough to be assured that the last CS interrupt has been delivered,
> > so bump that to 200ms, and move the entirety of that wait to before we
> > take the struct_mutex to avoid blocking. As this is now a potentially
> > long wait, restore the earlier behaviour of bailing out early when a new
> > request arrives.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Imre Deak <imre.deak@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 31 ++++++++++++++++++++-----------
> >  1 file changed, 20 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 026cb52ece0b..d3a638613857 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3281,8 +3281,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
> >  {
> >       struct drm_i915_private *dev_priv =
> >               container_of(work, typeof(*dev_priv), gt.idle_work.work);
> > -     struct drm_device *dev = &dev_priv->drm;
> >       bool rearm_hangcheck;
> > +     ktime_t end;
> >  
> >       if (!READ_ONCE(dev_priv->gt.awake))
> >               return;
> > @@ -3291,14 +3291,22 @@ i915_gem_idle_work_handler(struct work_struct *work)
> >        * Wait for last execlists context complete, but bail out in case a
> >        * new request is submitted.
> >        */
> > -     wait_for(intel_engines_are_idle(dev_priv), 10);
> > -     if (READ_ONCE(dev_priv->gt.active_requests))
> > -             return;
> > +     end = ktime_add_ms(ktime_get(), 200);
> > +     do {
> > +             if (READ_ONCE(dev_priv->gt.active_requests) ||
> > +                 work_pending(work))
> > +                     return;
> > +
> > +             if (intel_engines_are_idle(dev_priv))
> > +                     break;
> > +
> > +             usleep_range(100, 500);
> > +     } while (ktime_before(ktime_get(), end));
> >
> 
>> Why can't we just wait_for(intel_engines_are_idle(dev_priv), 200)?

That return: we don't really want to block the ordered wq for 200ms when
we already know we won't make progress. (Whilst we are running, nothing
else that wants to use dev_priv->wq can.)
-Chris
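
For comparison, a minimal sketch of the two shapes under discussion, with
the driver specifics reduced to hypothetical should_bail()/engines_idle()
helpers (the real conditions are the active_requests and work_pending()
checks in the patch):

        /* Blocking form: occupies the ordered wq for the full timeout. */
        wait_for(engines_idle(), 200);

        /* Polling form: recheck the bail-out conditions between short
         * sleeps, so a newly submitted request lets the worker return
         * (and free the wq) immediately.
         */
        ktime_t end = ktime_add_ms(ktime_get(), 200);
        do {
                if (should_bail())
                        return;
                if (engines_idle())
                        break;
                usleep_range(100, 500);
        } while (ktime_before(ktime_get(), end));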

* Re: [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts
  2017-10-20  9:59 ` [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts Chris Wilson
  2017-10-20 11:48   ` [PATCH v2] " Chris Wilson
@ 2017-10-20 13:21   ` Mika Kuoppala
  2017-10-20 13:24     ` Chris Wilson
  2017-10-20 13:59   ` Mika Kuoppala
  2 siblings, 1 reply; 21+ messages in thread
From: Mika Kuoppala @ 2017-10-20 13:21 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Back in commit a4b2b01523a8 ("drm/i915: Don't mark an execlists
> context-switch when idle") we noticed the presence of late
> context-switch interrupts. We were able to filter those out by looking
> at whether the ELSP remained active, but in commit beecec901790
> ("drm/i915/execlists: Preemption!") that became problematic as we now
> anticipate receiving a context-switch event for preemption while ELSP
> may be empty. To restore the spurious interrupt suppression, add a

This confuses me: how can we preempt something that was never submitted?
Could you elaborate?

Thanks,
-Mika

> counter for the expected number of pending context-switches and skip if
> we do not need to handle this interrupt to make forward progress.
>
> Fixes: beecec901790 ("drm/i915/execlists: Preemption!")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michal Winiarski <michal.winiarski@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 11 ++++++-----
>  drivers/gpu/drm/i915/i915_irq.c            |  6 ++++--
>  drivers/gpu/drm/i915/intel_engine_cs.c     |  5 +++--
>  drivers/gpu/drm/i915/intel_lrc.c           | 21 +++++++++++++++++----
>  drivers/gpu/drm/i915/intel_ringbuffer.h    |  7 +++++++
>  5 files changed, 37 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index a2e8114b739d..c22d0a07467c 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -550,11 +550,9 @@ static void port_assign(struct execlist_port *port,
>  			struct drm_i915_gem_request *rq)
>  {
>  	GEM_BUG_ON(rq == port_request(port));
> +	GEM_BUG_ON(port_isset(port));
>  
> -	if (port_isset(port))
> -		i915_gem_request_put(port_request(port));
> -
> -	port_set(port, port_pack(i915_gem_request_get(rq), port_count(port)));
> +	port_set(port, i915_gem_request_get(rq));
>  	nested_enable_signaling(rq);
>  }
>  
> @@ -586,8 +584,10 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
>  					goto done;
>  				}
>  
> -				if (submit)
> +				if (submit) {
>  					port_assign(port, last);
> +					execlists->active++;
> +				}
>  				port++;
>  			}
>  
> @@ -610,6 +610,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
>  	execlists->first = rb;
>  	if (submit) {
>  		port_assign(port, last);
> +		execlists->active++;
>  		i915_guc_submit(engine);
>  	}
>  	spin_unlock_irq(&engine->timeline->lock);
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index b1296a55c1e4..f8205841868b 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1388,8 +1388,10 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
>  	bool tasklet = false;
>  
>  	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
> -		__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> -		tasklet = true;
> +		if (READ_ONCE(engine->execlists.active)) {
> +			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> +			tasklet = true;
> +		}
>  	}
>  
>  	if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) {
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index a47a9c6bea52..8aecbc321786 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1548,8 +1548,8 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
>  	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
>  		return false;
>  
> -	/* Both ports drained, no more ELSP submission? */
> -	if (port_request(&engine->execlists.port[0]))
> +	/* Waiting to drain ELSP? */
> +	if (READ_ONCE(engine->execlists.active))
>  		return false;
>  
>  	/* ELSP is empty, but there are ready requests? */
> @@ -1749,6 +1749,7 @@ void intel_engine_dump(struct intel_engine_cs *engine, struct drm_printer *m)
>  					   idx);
>  			}
>  		}
> +		drm_printf(m, "\t\tELSP active %d\n", execlists->active);
>  		rcu_read_unlock();
>  	} else if (INTEL_GEN(dev_priv) > 6) {
>  		drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 7f45dd7dc3e5..8a47b8211c74 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -482,15 +482,23 @@ static bool can_merge_ctx(const struct i915_gem_context *prev,
>  	return true;
>  }
>  
> -static void port_assign(struct execlist_port *port,
> +static bool port_assign(struct execlist_port *port,
>  			struct drm_i915_gem_request *rq)
>  {
> +	bool was_idle;
> +
>  	GEM_BUG_ON(rq == port_request(port));
>  
> -	if (port_isset(port))
> +	if (port_isset(port)) {
>  		i915_gem_request_put(port_request(port));
> +		was_idle = false;
> +	} else {
> +		was_idle = true;
> +	}
>  
>  	port_set(port, port_pack(i915_gem_request_get(rq), port_count(port)));
> +
> +	return was_idle;
>  }
>  
>  static void inject_preempt_context(struct intel_engine_cs *engine)
> @@ -657,7 +665,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  				GEM_BUG_ON(last->ctx == rq->ctx);
>  
>  				if (submit)
> -					port_assign(port, last);
> +					execlists->active += port_assign(port, last);
>  				port++;
>  
>  				GEM_BUG_ON(port_isset(port));
> @@ -679,7 +687,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  done:
>  	execlists->first = rb;
>  	if (submit)
> -		port_assign(port, last);
> +		execlists->active += port_assign(port, last);
>  unlock:
>  	spin_unlock_irq(&engine->timeline->lock);
>  
> @@ -696,10 +704,12 @@ execlist_cancel_port_requests(struct intel_engine_execlists *execlists)
>  	while (num_ports-- && port_isset(port)) {
>  		struct drm_i915_gem_request *rq = port_request(port);
>  
> +		GEM_BUG_ON(!execlists->active);
>  		execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_PREEMPTED);
>  		i915_gem_request_put(rq);
>  
>  		memset(port, 0, sizeof(*port));
> +		execlists->active--;
>  		port++;
>  	}
>  }
> @@ -863,6 +873,8 @@ static void intel_lrc_irq_handler(unsigned long data)
>  
>  				GEM_BUG_ON(!execlists->preempt);
>  				execlists->preempt = false;
> +				execlists->active--;
> +				GEM_BUG_ON(execlists->active);
>  				continue;
>  			}
>  
> @@ -1461,6 +1473,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
>  	clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>  	execlists->csb_head = -1;
>  	execlists->preempt = false;
> +	execlists->active = 0;
>  
>  	/* After a GPU reset, we may have requests to replay */
>  	if (!i915_modparams.enable_guc_submission && execlists->first)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 17186f067408..588f5fe9d2b7 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -245,6 +245,11 @@ struct intel_engine_execlists {
>  	 */
>  	bool preempt;
>  
> +	/**
> +	 * @active: count of the number of outstanding CSB events
> +	 */
> +	unsigned int active;
> +
>  	/**
>  	 * @port_mask: number of execlist ports - 1
>  	 */
> @@ -538,9 +543,11 @@ execlists_port_complete(struct intel_engine_execlists * const execlists,
>  	const unsigned int m = execlists->port_mask;
>  
>  	GEM_BUG_ON(port_index(port, execlists) != 0);
> +	GEM_BUG_ON(!execlists->active);
>  
>  	memmove(port, port + 1, m * sizeof(struct execlist_port));
>  	memset(port + m, 0, sizeof(struct execlist_port));
> +	execlists->active--;
>  }
>  
>  static inline unsigned int
> -- 
> 2.15.0.rc1
>

* Re: [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
  2017-10-20 13:19   ` Chris Wilson
@ 2017-10-20 13:23     ` Mika Kuoppala
  2017-10-20 13:52       ` Chris Wilson
  0 siblings, 1 reply; 21+ messages in thread
From: Mika Kuoppala @ 2017-10-20 13:23 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2017-10-20 14:11:53)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > In the idle worker we drop the prolonged GT wakeref used to cover such
>> > essentials as interrupt delivery. (When a CS interrupt arrives, we also
>> > assert that the GT is awake.) However, it turns out that 10ms is not
>> > long enough to be assured that the last CS interrupt has been delivered,
>> > so bump that to 200ms, and move the entirety of that wait to before we
>> > take the struct_mutex to avoid blocking. As this is now a potentially
>> > long wait, restore the earlier behaviour of bailing out early when a new
>> > request arrives.
>> >
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> > Cc: Imre Deak <imre.deak@intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/i915_gem.c | 31 ++++++++++++++++++++-----------
>> >  1 file changed, 20 insertions(+), 11 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> > index 026cb52ece0b..d3a638613857 100644
>> > --- a/drivers/gpu/drm/i915/i915_gem.c
>> > +++ b/drivers/gpu/drm/i915/i915_gem.c
>> > @@ -3281,8 +3281,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
>> >  {
>> >       struct drm_i915_private *dev_priv =
>> >               container_of(work, typeof(*dev_priv), gt.idle_work.work);
>> > -     struct drm_device *dev = &dev_priv->drm;
>> >       bool rearm_hangcheck;
>> > +     ktime_t end;
>> >  
>> >       if (!READ_ONCE(dev_priv->gt.awake))
>> >               return;
>> > @@ -3291,14 +3291,22 @@ i915_gem_idle_work_handler(struct work_struct *work)
>> >        * Wait for last execlists context complete, but bail out in case a
>> >        * new request is submitted.
>> >        */
>> > -     wait_for(intel_engines_are_idle(dev_priv), 10);
>> > -     if (READ_ONCE(dev_priv->gt.active_requests))
>> > -             return;
>> > +     end = ktime_add_ms(ktime_get(), 200);
>> > +     do {
>> > +             if (READ_ONCE(dev_priv->gt.active_requests) ||
>> > +                 work_pending(work))
>> > +                     return;
>> > +
>> > +             if (intel_engines_are_idle(dev_priv))
>> > +                     break;
>> > +
>> > +             usleep_range(100, 500);
>> > +     } while (ktime_before(ktime_get(), end));
>> >
>> 
>> Why can't we just wait_for(intel_engines_are_idle(dev_priv), 200)?
>
> That return: we don't really want to block the ordered wq for 200ms when
> we already know we won't make progress. (Whilst we are running, nothing
> else that wants to use dev_priv->wq can.)

Ok, that makes sense, but why don't we have our own workqueue for the
idle worker?
-Mika
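
For reference, a sketch of what Mika's suggestion could look like; the
idle_wq field is made up for illustration, not something this series adds:

        /* Hypothetical: a private ordered workqueue, so the idle worker's
         * polling cannot stall other work queued on dev_priv->wq.
         */
        dev_priv->gt.idle_wq = alloc_ordered_workqueue("i915-gt-idle", 0);
        queue_delayed_work(dev_priv->gt.idle_wq, &dev_priv->gt.idle_work,
                           msecs_to_jiffies(100));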

* Re: [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts
  2017-10-20 13:21   ` [PATCH 2/2] " Mika Kuoppala
@ 2017-10-20 13:24     ` Chris Wilson
  2017-10-20 13:31       ` Mika Kuoppala
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Wilson @ 2017-10-20 13:24 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-10-20 14:21:08)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Back in commit a4b2b01523a8 ("drm/i915: Don't mark an execlists
> > context-switch when idle") we noticed the presence of late
> > context-switch interrupts. We were able to filter those out by looking
> > at whether the ELSP remained active, but in commit beecec901790
> > ("drm/i915/execlists: Preemption!") that became problematic as we now
> > anticipate receiving a context-switch event for preemption while ELSP
> > may be empty. To restore the spurious interrupt suppression, add a
> 
> This confuses me: how can we preempt something that was never submitted?
> Could you elaborate?

The ELSP may become empty after we told the hw to switch to the preempt
context. So the preemption context-switch may appear as a separate
interrupt after !port_isset(execlists.port[0]). The joy of asynchronous
communications.
-Chris
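
A compressed timeline of the race described above, followed by the check
patch 2 adds in gen8_cs_irq_handler() to suppress the now-spurious
interrupt (execlists.active being the new expected-CSB-event counter):

        /*
         * CPU                                  HW
         * write ELSP (preempt context)
         *                                      completes port[0] -> CSB event
         * tasklet consumes event, port[0] now empty
         *                                      switches to preempt -> CSB event
         * second interrupt arrives with !port_isset(port[0])
         */
        if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
                /* Only kick the tasklet while CSB events are outstanding. */
                if (READ_ONCE(engine->execlists.active)) {
                        __set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
                        tasklet = true;
                }
        }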

* Re: [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts
  2017-10-20 13:24     ` Chris Wilson
@ 2017-10-20 13:31       ` Mika Kuoppala
  2017-10-20 13:47         ` Chris Wilson
  0 siblings, 1 reply; 21+ messages in thread
From: Mika Kuoppala @ 2017-10-20 13:31 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2017-10-20 14:21:08)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > Back in commit a4b2b01523a8 ("drm/i915: Don't mark an execlists
>> > context-switch when idle") we noticed the presence of late
>> > context-switch interrupts. We were able to filter those out by looking
>> > at whether the ELSP remained active, but in commit beecec901790
>> > ("drm/i915/execlists: Preemption!") that became problematic as we now
>> > anticipate receiving a context-switch event for preemption while ELSP
>> > may be empty. To restore the spurious interrupt suppression, add a
>> 
>> This confuses me: how can we preempt something that was never submitted?
>> Could you elaborate?
>
> The ELSP may become empty after we told the hw to switch to the preempt
> context. So the preemption context-switch may appear as a separate
> interrupt after !port_isset(execlists.port[0]). The joy of asynchronous
> communications.

Let me check if I got this right: since we don't use port[0] to
switch to the preempt context, we might get 2 interrupts in a
row, the first one clearing port[0], so we can't assume any
idleness from port[0] status alone?

-Mika

* Re: [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts
  2017-10-20 13:31       ` Mika Kuoppala
@ 2017-10-20 13:47         ` Chris Wilson
  0 siblings, 0 replies; 21+ messages in thread
From: Chris Wilson @ 2017-10-20 13:47 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-10-20 14:31:14)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Quoting Mika Kuoppala (2017-10-20 14:21:08)
> >> Chris Wilson <chris@chris-wilson.co.uk> writes:
> >> 
> >> > Back in commit a4b2b01523a8 ("drm/i915: Don't mark an execlists
> >> > context-switch when idle") we noticed the presence of late
> >> > context-switch interrupts. We were able to filter those out by looking
> >> > at whether the ELSP remained active, but in commit beecec901790
> >> > ("drm/i915/execlists: Preemption!") that became problematic as we now
> >> > anticipate receiving a context-switch event for preemption while ELSP
> >> > may be empty. To restore the spurious interrupt suppression, add a
> >> 
> >> This confuses me, how can we preempt something that was never submitted.
> >> Could you elaborate?
> >
> > The ELSP may become empty after we told the hw to switch to the preempt
> > context. So the preemption context-switch may appear as a separate
> > interrupt after !port_isset(execlists.port[0]). The joy of asynchronous
> > communications.
> 
> Let me check if I got this right: since we don't use port[0] to
> switch to the preempt context, we might get 2 interrupts in a
> row, the first one clearing port[0], so we can't assume any
> idleness from port[0] status alone?

Correct.
-Chris
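
To summarise the accounting the series settles on (paraphrased from the
hunks above, not a verbatim extract): the counter rises once per event the
hardware will post and falls once per CSB event consumed, so idleness is
judged by the counter rather than by port[0] alone:

        /* Expect a CSB event: each ELSP submission... */
        execlists->active += port_assign(port, last);
        /* ...and, as of v2, the preempt-context injection as well. */
        execlists->active++;

        /* Consume one per CSB event (port completion or preempt ack). */
        execlists->active--;

        /* intel_engine_is_idle(): only truly idle with nothing outstanding. */
        if (READ_ONCE(engine->execlists.active))
                return false;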

* ✓ Fi.CI.IGT: success for series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking (rev2)
  2017-10-20  9:59 [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Chris Wilson
                   ` (4 preceding siblings ...)
  2017-10-20 13:11 ` [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Mika Kuoppala
@ 2017-10-20 13:47 ` Patchwork
  2017-10-23 11:52 ` [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Mika Kuoppala
  6 siblings, 0 replies; 21+ messages in thread
From: Patchwork @ 2017-10-20 13:47 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking (rev2)
URL   : https://patchwork.freedesktop.org/series/32349/
State : success

== Summary ==

Test kms_plane:
        Subgroup plane-position-hole-pipe-A-planes:
                skip       -> PASS       (shard-hsw)
Test kms_frontbuffer_tracking:
        Subgroup fbc-1p-primscrn-cur-indfb-move:
                skip       -> PASS       (shard-hsw)
        Subgroup fbc-1p-offscren-pri-indfb-draw-mmap-cpu:
                skip       -> PASS       (shard-hsw)
Test kms_busy:
        Subgroup extended-modeset-hang-newfb-with-reset-render-B:
                dmesg-warn -> PASS       (shard-hsw) fdo#103038
        Subgroup extended-modeset-hang-oldfb-with-reset-render-B:
                dmesg-warn -> PASS       (shard-hsw) fdo#102249
Test kms_flip:
        Subgroup blt-wf_vblank-vs-dpms:
                dmesg-warn -> PASS       (shard-hsw)
Test kms_setmode:
        Subgroup basic:
                pass       -> FAIL       (shard-hsw) fdo#99912

fdo#103038 https://bugs.freedesktop.org/show_bug.cgi?id=103038
fdo#102249 https://bugs.freedesktop.org/show_bug.cgi?id=102249
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-hsw        total:2540 pass:1429 dwarn:1   dfail:0   fail:9   skip:1101 time:9212s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6121/shards.html

* Re: [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
  2017-10-20 13:23     ` Mika Kuoppala
@ 2017-10-20 13:52       ` Chris Wilson
  0 siblings, 0 replies; 21+ messages in thread
From: Chris Wilson @ 2017-10-20 13:52 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-10-20 14:23:02)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Quoting Mika Kuoppala (2017-10-20 14:11:53)
> >> Chris Wilson <chris@chris-wilson.co.uk> writes:
> >> 
> >> > In the idle worker we drop the prolonged GT wakeref used to cover such
> >> > essentials as interrupt delivery. (When a CS interrupt arrives, we also
> >> > assert that the GT is awake.) However, it turns out that 10ms is not
> >> > long enough to be assured that the last CS interrupt has been delivered,
> >> > so bump that to 200ms, and move the entirety of that wait to before we
> >> > take the struct_mutex to avoid blocking. As this is now a potentially
> >> > long wait, restore the earlier behaviour of bailing out early when a new
> >> > request arrives.
> >> >
> >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >> > Cc: Imre Deak <imre.deak@intel.com>
> >> > ---
> >> >  drivers/gpu/drm/i915/i915_gem.c | 31 ++++++++++++++++++++-----------
> >> >  1 file changed, 20 insertions(+), 11 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >> > index 026cb52ece0b..d3a638613857 100644
> >> > --- a/drivers/gpu/drm/i915/i915_gem.c
> >> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> >> > @@ -3281,8 +3281,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
> >> >  {
> >> >       struct drm_i915_private *dev_priv =
> >> >               container_of(work, typeof(*dev_priv), gt.idle_work.work);
> >> > -     struct drm_device *dev = &dev_priv->drm;
> >> >       bool rearm_hangcheck;
> >> > +     ktime_t end;
> >> >  
> >> >       if (!READ_ONCE(dev_priv->gt.awake))
> >> >               return;
> >> > @@ -3291,14 +3291,22 @@ i915_gem_idle_work_handler(struct work_struct *work)
> >> >        * Wait for last execlists context complete, but bail out in case a
> >> >        * new request is submitted.
> >> >        */
> >> > -     wait_for(intel_engines_are_idle(dev_priv), 10);
> >> > -     if (READ_ONCE(dev_priv->gt.active_requests))
> >> > -             return;
> >> > +     end = ktime_add_ms(ktime_get(), 200);
> >> > +     do {
> >> > +             if (READ_ONCE(dev_priv->gt.active_requests) ||
> >> > +                 work_pending(work))
> >> > +                     return;
> >> > +
> >> > +             if (intel_engines_are_idle(dev_priv))
> >> > +                     break;
> >> > +
> >> > +             usleep_range(100, 500);
> >> > +     } while (ktime_before(ktime_get(), end));
> >> >
> >> 
> >> Why can't we just wait_for(intel_engines_are_idle(dev_priv), 200)?
> >
> > That return: we don't really want to block the ordered wq for 200ms when
> > we already know we won't make progress. (Whilst we are running, nothing
> > else that wants to use dev_priv->wq can run.)
> 
> Ok, that makes sense, but why don't we have our own workqueue for the
> idle worker?

We have a wq for the low-frequency work that needs struct_mutex. We don't
really need it; it just helps to have a named wq when staring at a stuck
machine. One wq per struct work_struct would seem to be overkill ;)
-Chris
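
A minimal sketch of what such a dedicated queue could look like; the
"i915-idle" name is purely illustrative, and the driver keeps using
dev_priv->wq as described above:

	struct workqueue_struct *idle_wq;

	/*
	 * An ordered workqueue runs one item at a time and, unlike an
	 * anonymous work item on a shared queue, shows up by name when
	 * dumping workqueue state on a stuck machine.
	 */
	idle_wq = alloc_ordered_workqueue("i915-idle", 0);
	if (!idle_wq)
		return -ENOMEM;

	/* The idle worker would then be queued here instead: */
	queue_delayed_work(idle_wq, &dev_priv->gt.idle_work,
			   msecs_to_jiffies(100));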

* Re: [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts
  2017-10-20  9:59 ` [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts Chris Wilson
  2017-10-20 11:48   ` [PATCH v2] " Chris Wilson
  2017-10-20 13:21   ` [PATCH 2/2] " Mika Kuoppala
@ 2017-10-20 13:59   ` Mika Kuoppala
  2017-10-20 14:24     ` Chris Wilson
  2 siblings, 1 reply; 21+ messages in thread
From: Mika Kuoppala @ 2017-10-20 13:59 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Back in commit a4b2b01523a8 ("drm/i915: Don't mark an execlists
> context-switch when idle") we noticed the presence of late
> context-switch interrupts. We were able to filter those out by looking
> at whether the ELSP remained active, but in commit beecec901790
> ("drm/i915/execlists: Preemption!") that became problematic as we now
> anticipate receiving a context-switch event for preemption while ELSP
> may be empty. To restore the spurious interrupt suppression, add a
> counter for the expected number of pending context-switches and skip if
> we do not need to handle this interrupt to make forward progress.
>
> Fixes: beecec901790 ("drm/i915/execlists: Preemption!")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michal Winiarski <michal.winiarski@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 11 ++++++-----
>  drivers/gpu/drm/i915/i915_irq.c            |  6 ++++--
>  drivers/gpu/drm/i915/intel_engine_cs.c     |  5 +++--
>  drivers/gpu/drm/i915/intel_lrc.c           | 21 +++++++++++++++++----
>  drivers/gpu/drm/i915/intel_ringbuffer.h    |  7 +++++++
>  5 files changed, 37 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index a2e8114b739d..c22d0a07467c 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -550,11 +550,9 @@ static void port_assign(struct execlist_port *port,
>  			struct drm_i915_gem_request *rq)
>  {
>  	GEM_BUG_ON(rq == port_request(port));
> +	GEM_BUG_ON(port_isset(port));
>  
> -	if (port_isset(port))
> -		i915_gem_request_put(port_request(port));
> -
> -	port_set(port, port_pack(i915_gem_request_get(rq), port_count(port)));
> +	port_set(port, i915_gem_request_get(rq));

No lite restore with GuC, so we can make this more strict and lean?

>  	nested_enable_signaling(rq);
>  }
>  
> @@ -586,8 +584,10 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
>  					goto done;
>  				}
>  
> -				if (submit)
> +				if (submit) {
>  					port_assign(port, last);
> +					execlists->active++;
> +				}
>  				port++;
>  			}
>  
> @@ -610,6 +610,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
>  	execlists->first = rb;
>  	if (submit) {
>  		port_assign(port, last);
> +		execlists->active++;
>  		i915_guc_submit(engine);
>  	}
>  	spin_unlock_irq(&engine->timeline->lock);
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index b1296a55c1e4..f8205841868b 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1388,8 +1388,10 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
>  	bool tasklet = false;
>  
>  	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
> -		__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> -		tasklet = true;
> +		if (READ_ONCE(engine->execlists.active)) {
> +			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> +			tasklet = true;
> +		}
>  	}
>  
>  	if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) {
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index a47a9c6bea52..8aecbc321786 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1548,8 +1548,8 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
>  	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
>  		return false;
>  
> -	/* Both ports drained, no more ELSP submission? */
> -	if (port_request(&engine->execlists.port[0]))
> +	/* Waiting to drain ELSP? */
> +	if (READ_ONCE(engine->execlists.active))
>  		return false;
>

It would not make sense now to keep the port[0] check, but
could we get more fine-grained debugging info if we kept
it?

-Mika

>  	/* ELSP is empty, but there are ready requests? */
> @@ -1749,6 +1749,7 @@ void intel_engine_dump(struct intel_engine_cs *engine, struct drm_printer *m)
>  					   idx);
>  			}
>  		}
> +		drm_printf(m, "\t\tELSP active %d\n", execlists->active);
>  		rcu_read_unlock();
>  	} else if (INTEL_GEN(dev_priv) > 6) {
>  		drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 7f45dd7dc3e5..8a47b8211c74 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -482,15 +482,23 @@ static bool can_merge_ctx(const struct i915_gem_context *prev,
>  	return true;
>  }
>  
> -static void port_assign(struct execlist_port *port,
> +static bool port_assign(struct execlist_port *port,
>  			struct drm_i915_gem_request *rq)
>  {
> +	bool was_idle;
> +
>  	GEM_BUG_ON(rq == port_request(port));
>  
> -	if (port_isset(port))
> +	if (port_isset(port)) {
>  		i915_gem_request_put(port_request(port));
> +		was_idle = false;
> +	} else {
> +		was_idle = true;
> +	}
>  
>  	port_set(port, port_pack(i915_gem_request_get(rq), port_count(port)));
> +
> +	return was_idle;
>  }
>  
>  static void inject_preempt_context(struct intel_engine_cs *engine)
> @@ -657,7 +665,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  				GEM_BUG_ON(last->ctx == rq->ctx);
>  
>  				if (submit)
> -					port_assign(port, last);
> +					execlists->active += port_assign(port, last);
>  				port++;
>  
>  				GEM_BUG_ON(port_isset(port));
> @@ -679,7 +687,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  done:
>  	execlists->first = rb;
>  	if (submit)
> -		port_assign(port, last);
> +		execlists->active += port_assign(port, last);
>  unlock:
>  	spin_unlock_irq(&engine->timeline->lock);
>  
> @@ -696,10 +704,12 @@ execlist_cancel_port_requests(struct intel_engine_execlists *execlists)
>  	while (num_ports-- && port_isset(port)) {
>  		struct drm_i915_gem_request *rq = port_request(port);
>  
> +		GEM_BUG_ON(!execlists->active);
>  		execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_PREEMPTED);
>  		i915_gem_request_put(rq);
>  
>  		memset(port, 0, sizeof(*port));
> +		execlists->active--;
>  		port++;
>  	}
>  }
> @@ -863,6 +873,8 @@ static void intel_lrc_irq_handler(unsigned long data)
>  
>  				GEM_BUG_ON(!execlists->preempt);
>  				execlists->preempt = false;
> +				execlists->active--;
> +				GEM_BUG_ON(execlists->active);
>  				continue;
>  			}
>  
> @@ -1461,6 +1473,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
>  	clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>  	execlists->csb_head = -1;
>  	execlists->preempt = false;
> +	execlists->active = 0;
>  
>  	/* After a GPU reset, we may have requests to replay */
>  	if (!i915_modparams.enable_guc_submission && execlists->first)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 17186f067408..588f5fe9d2b7 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -245,6 +245,11 @@ struct intel_engine_execlists {
>  	 */
>  	bool preempt;
>  
> +	/**
> +	 * @active: count of the number of outstanding CSB events
> +	 */
> +	unsigned int active;
> +
>  	/**
>  	 * @port_mask: number of execlist ports - 1
>  	 */
> @@ -538,9 +543,11 @@ execlists_port_complete(struct intel_engine_execlists * const execlists,
>  	const unsigned int m = execlists->port_mask;
>  
>  	GEM_BUG_ON(port_index(port, execlists) != 0);
> +	GEM_BUG_ON(!execlists->active);
>  
>  	memmove(port, port + 1, m * sizeof(struct execlist_port));
>  	memset(port + m, 0, sizeof(struct execlist_port));
> +	execlists->active--;
>  }
>  
>  static inline unsigned int
> -- 
> 2.15.0.rc1
>

* Re: [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts
  2017-10-20 13:59   ` Mika Kuoppala
@ 2017-10-20 14:24     ` Chris Wilson
  0 siblings, 0 replies; 21+ messages in thread
From: Chris Wilson @ 2017-10-20 14:24 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-10-20 14:59:44)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Back in commit a4b2b01523a8 ("drm/i915: Don't mark an execlists
> > context-switch when idle") we noticed the presence of late
> > context-switch interrupts. We were able to filter those out by looking
> > at whether the ELSP remained active, but in commit beecec901790
> > ("drm/i915/execlists: Preemption!") that became problematic as we now
> > anticipate receiving a context-switch event for preemption while ELSP
> > may be empty. To restore the spurious interrupt suppression, add a
> > counter for the expected number of pending context-switches and skip if
> > we do not need to handle this interrupt to make forward progress.
> >
> > Fixes: beecec901790 ("drm/i915/execlists: Preemption!")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Michal Winiarski <michal.winiarski@intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_guc_submission.c | 11 ++++++-----
> >  drivers/gpu/drm/i915/i915_irq.c            |  6 ++++--
> >  drivers/gpu/drm/i915/intel_engine_cs.c     |  5 +++--
> >  drivers/gpu/drm/i915/intel_lrc.c           | 21 +++++++++++++++++----
> >  drivers/gpu/drm/i915/intel_ringbuffer.h    |  7 +++++++
> >  5 files changed, 37 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> > index a2e8114b739d..c22d0a07467c 100644
> > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > @@ -550,11 +550,9 @@ static void port_assign(struct execlist_port *port,
> >                       struct drm_i915_gem_request *rq)
> >  {
> >       GEM_BUG_ON(rq == port_request(port));
> > +     GEM_BUG_ON(port_isset(port));
> >  
> > -     if (port_isset(port))
> > -             i915_gem_request_put(port_request(port));
> > -
> > -     port_set(port, port_pack(i915_gem_request_get(rq), port_count(port)));
> > +     port_set(port, i915_gem_request_get(rq));
> 
> No lite restore with GuC, so we can make this more strict and lean?

Yes. We are not coalescing requests into an active slot, so we always
know that we are assigning a new request into a new slot.

> > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> > index a47a9c6bea52..8aecbc321786 100644
> > --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> > @@ -1548,8 +1548,8 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
> >       if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
> >               return false;
> >  
> > -     /* Both ports drained, no more ELSP submission? */
> > -     if (port_request(&engine->execlists.port[0]))
> > +     /* Waiting to drain ELSP? */
> > +     if (READ_ONCE(engine->execlists.active))
> >               return false;
> >
> 
> It would not make sense now to keep the port[0] check, but
> could we get more fine-grained debugging info if we kept
> it?

This function is the antithesis of fine-grained debugging. It is called
in a tight loop, watching for idle, so reporting from here ends up with a
lot of noise. Often I ask exactly what isn't idle... Now that we do a
single upfront wait and then check that all is well, the next step is
moving this to intel_engine_park, where we can be more verbose in the
one-off state check.

The importance of making this change is that gen8_cs_irq_handler() and
the idle worker stay in sync about when to drop the GT wakeref.
-Chris
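
A minimal sketch of that one-off, verbose park-time check, assuming it
sits on the path into intel_engines_mark_idle(); the error text is
illustrative:

	struct intel_engine_cs *engine;
	enum intel_engine_id id;

	/*
	 * We are committed to parking and no longer in a polling loop,
	 * so we can afford to name exactly which engine failed to drain.
	 */
	for_each_engine(engine, dev_priv, id) {
		if (!intel_engine_is_idle(engine))
			DRM_ERROR("%s not idle before parking\n",
				  engine->name);
	}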

* Re: [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
  2017-10-20  9:59 [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Chris Wilson
                   ` (5 preceding siblings ...)
  2017-10-20 13:47 ` ✓ Fi.CI.IGT: success for series starting with [1/2] drm/i915: Bump wait-times for the final CS interrupt before parking (rev2) Patchwork
@ 2017-10-23 11:52 ` Mika Kuoppala
  2017-10-23 12:00   ` Chris Wilson
  6 siblings, 1 reply; 21+ messages in thread
From: Mika Kuoppala @ 2017-10-23 11:52 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> In the idle worker we drop the prolonged GT wakeref used to cover such
> essentials as interrupt delivery. (When a CS interrupt arrives, we also
> assert that the GT is awake.) However, it turns out that 10ms is not
> long enough to be assured that the last CS interrupt has been delivered,
> so bump that to 200ms, and move the entirety of that wait to before we
> take the struct_mutex to avoid blocking. As this is now a potentially
> long wait, restore the earlier behaviour of bailing out early when a new
> request arrives.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 31 ++++++++++++++++++++-----------
>  1 file changed, 20 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 026cb52ece0b..d3a638613857 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3281,8 +3281,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
>  {
>  	struct drm_i915_private *dev_priv =
>  		container_of(work, typeof(*dev_priv), gt.idle_work.work);
> -	struct drm_device *dev = &dev_priv->drm;
>  	bool rearm_hangcheck;
> +	ktime_t end;
>  
>  	if (!READ_ONCE(dev_priv->gt.awake))
>  		return;
> @@ -3291,14 +3291,22 @@ i915_gem_idle_work_handler(struct work_struct *work)
>  	 * Wait for last execlists context complete, but bail out in case a
>  	 * new request is submitted.
>  	 */
> -	wait_for(intel_engines_are_idle(dev_priv), 10);
> -	if (READ_ONCE(dev_priv->gt.active_requests))
> -		return;
> +	end = ktime_add_ms(ktime_get(), 200);
> +	do {
> +		if (READ_ONCE(dev_priv->gt.active_requests) ||
> +		    work_pending(work))
> +			return;
> +
> +		if (intel_engines_are_idle(dev_priv))
> +			break;
> +
> +		usleep_range(100, 500);
> +	} while (ktime_before(ktime_get(), end));
>  
>  	rearm_hangcheck =
>  		cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
>  
> -	if (!mutex_trylock(&dev->struct_mutex)) {
> +	if (!mutex_trylock(&dev_priv->drm.struct_mutex)) {
>  		/* Currently busy, come back later */
>  		mod_delayed_work(dev_priv->wq,
>  				 &dev_priv->gt.idle_work,
> @@ -3310,13 +3318,14 @@ i915_gem_idle_work_handler(struct work_struct *work)
>  	 * New request retired after this work handler started, extend active
>  	 * period until next instance of the work.
>  	 */
> -	if (work_pending(work))
> +	if (dev_priv->gt.active_requests || work_pending(work))
>  		goto out_unlock;
>

There might be some value in introducing a helper for
gt_work_pending, as you could use it both in the early bailout and
here. You would get one superfluous READ_ONCE by having that inside
the helper, but in the idle worker it doesn't matter.

I think it would read better too. But this is in the bikeshed
department.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>


> -	if (dev_priv->gt.active_requests)
> -		goto out_unlock;
> -
> -	if (wait_for(intel_engines_are_idle(dev_priv), 10))
> +	/*
> +	 * We are committed now to parking the engines, make sure there
> +	 * will be no more interrupts arriving later.
> +	 */
> +	if (!intel_engines_are_idle(dev_priv))
>  		DRM_ERROR("Timeout waiting for engines to idle\n");
>  
>  	intel_engines_mark_idle(dev_priv);
> @@ -3330,7 +3339,7 @@ i915_gem_idle_work_handler(struct work_struct *work)
>  		gen6_rps_idle(dev_priv);
>  	intel_runtime_pm_put(dev_priv);
>  out_unlock:
> -	mutex_unlock(&dev->struct_mutex);
> +	mutex_unlock(&dev_priv->drm.struct_mutex);
>  
>  out_rearm:
>  	if (rearm_hangcheck) {
> -- 
> 2.15.0.rc1

* Re: [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking
  2017-10-23 11:52 ` [PATCH 1/2] drm/i915: Bump wait-times for the final CS interrupt before parking Mika Kuoppala
@ 2017-10-23 12:00   ` Chris Wilson
  0 siblings, 0 replies; 21+ messages in thread
From: Chris Wilson @ 2017-10-23 12:00 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2017-10-23 12:52:11)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > In the idle worker we drop the prolonged GT wakeref used to cover such
> > essentials as interrupt delivery. (When a CS interrupt arrives, we also
> > assert that the GT is awake.) However, it turns out that 10ms is not
> > long enough to be assured that the last CS interrupt has been delivered,
> > so bump that to 200ms, and move the entirety of that wait to before we
> > take the struct_mutex to avoid blocking. As this is now a potentially
> > long wait, restore the earlier behaviour of bailing out early when a new
> > request arrives.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Imre Deak <imre.deak@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 31 ++++++++++++++++++++-----------
> >  1 file changed, 20 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 026cb52ece0b..d3a638613857 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3281,8 +3281,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
> >  {
> >       struct drm_i915_private *dev_priv =
> >               container_of(work, typeof(*dev_priv), gt.idle_work.work);
> > -     struct drm_device *dev = &dev_priv->drm;
> >       bool rearm_hangcheck;
> > +     ktime_t end;
> >  
> >       if (!READ_ONCE(dev_priv->gt.awake))
> >               return;
> > @@ -3291,14 +3291,22 @@ i915_gem_idle_work_handler(struct work_struct *work)
> >        * Wait for last execlists context complete, but bail out in case a
> >        * new request is submitted.
> >        */
> > -     wait_for(intel_engines_are_idle(dev_priv), 10);
> > -     if (READ_ONCE(dev_priv->gt.active_requests))
> > -             return;
> > +     end = ktime_add_ms(ktime_get(), 200);
> > +     do {
> > +             if (READ_ONCE(dev_priv->gt.active_requests) ||
> > +                 work_pending(work))
> > +                     return;
> > +
> > +             if (intel_engines_are_idle(dev_priv))
> > +                     break;
> > +
> > +             usleep_range(100, 500);
> > +     } while (ktime_before(ktime_get(), end));
> >  
> >       rearm_hangcheck =
> >               cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
> >  
> > -     if (!mutex_trylock(&dev->struct_mutex)) {
> > +     if (!mutex_trylock(&dev_priv->drm.struct_mutex)) {
> >               /* Currently busy, come back later */
> >               mod_delayed_work(dev_priv->wq,
> >                                &dev_priv->gt.idle_work,
> > @@ -3310,13 +3318,14 @@ i915_gem_idle_work_handler(struct work_struct *work)
> >        * New request retired after this work handler started, extend active
> >        * period until next instance of the work.
> >        */
> > -     if (work_pending(work))
> > +     if (dev_priv->gt.active_requests || work_pending(work))
> >               goto out_unlock;
> >
> 
> There might be some value in introducing a helper for
> gt_work_pending, as you could use it both in the early bailout and
> here. You would get one superfluous READ_ONCE by having that inside
> the helper, but in the idle worker it doesn't matter.
> 
> I think it would read better too. But this is in the bikeshed
> department.

Reading better depends on finding the right name.

new_requests_since_last_retire()?
-Chris
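
A minimal sketch of that helper, simply folding together the two checks
the worker already makes; the follow-up posted on 2017-10-23, quoted
later in the thread, uses this very name:

	static bool
	new_requests_since_last_retire(const struct drm_i915_private *i915)
	{
		return (READ_ONCE(i915->gt.active_requests) ||
			work_pending(&i915->gt.idle_work.work));
	}

With that, the early bailout in the wait loop and the recheck under
struct_mutex collapse to the same call.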

* Re: [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts
  2017-10-23 20:06 ` [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts Chris Wilson
@ 2017-10-23 21:12   ` Chris Wilson
  0 siblings, 0 replies; 21+ messages in thread
From: Chris Wilson @ 2017-10-23 21:12 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Quoting Chris Wilson (2017-10-23 21:06:16)
> Back in commit a4b2b01523a8 ("drm/i915: Don't mark an execlists
> context-switch when idle") we noticed the presence of late
> context-switch interrupts. We were able to filter those out by looking
> at whether the ELSP remained active, but in commit beecec901790
> ("drm/i915/execlists: Preemption!") that became problematic as we now
> anticipate receiving a context-switch event for preemption while ELSP
> may be empty. To restore the spurious interrupt suppression, add a
> counter for the expected number of pending context-switches and skip if
> we do not need to handle this interrupt to make forward progress.

Looking at an example from
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_1299/
the common case is that we still get the interrupt after having already
parsed the whole CSB:

<6>[   22.723238] i915 0000:00:02.0: [drm] vecs0
<6>[   22.723246] i915 0000:00:02.0: [drm] 	current seqno 8, last 8, hangcheck 0 [-277277 ms], inflight 0
<6>[   22.723260] i915 0000:00:02.0: [drm] 	Reset count: 0
<6>[   22.723269] i915 0000:00:02.0: [drm] 	Requests:
<6>[   22.723278] i915 0000:00:02.0: [drm] 	RING_START: 0x007fb000 [0x00000000]
<6>[   22.723289] i915 0000:00:02.0: [drm] 	RING_HEAD:  0x00000278 [0x00000000]
<6>[   22.723300] i915 0000:00:02.0: [drm] 	RING_TAIL:  0x00000278 [0x00000000]
<6>[   22.723311] i915 0000:00:02.0: [drm] 	RING_CTL:   0x00003001 []
<6>[   22.723322] i915 0000:00:02.0: [drm] 	ACTHD:  0x00000000_00000278
<6>[   22.723333] i915 0000:00:02.0: [drm] 	BBADDR: 0x00000000_00000004
<6>[   22.723343] i915 0000:00:02.0: [drm] 	Execlist status: 0x00000301 00000000
<6>[   22.723355] i915 0000:00:02.0: [drm] 	Execlist CSB read 1 [1 cached], write 1 [1 from hws], interrupt posted? no
<6>[   22.723370] i915 0000:00:02.0: [drm] 		ELSP[0] idle
<6>[   22.723378] i915 0000:00:02.0: [drm] 		ELSP[1] idle
<6>[   22.723387] i915 0000:00:02.0: [drm] 		HW active? 0x0
<6>[   22.723402] i915 0000:00:02.0: [drm] 


Those should not lead to hitting BUG_ON(gt.awake), though, as the tasklet
is flushed before we clear gt.awake. Except perhaps if the interrupt
arrives after the tasklet_kill...

Given that we wait for the engines to be idle before parking, we should
be safe enough with

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bb0e85043e01..fa46137d431a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3327,6 +3327,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
        if (new_requests_since_last_retire(dev_priv))
                goto out_unlock;
 
+       synchronize_irq(dev_priv->drm.irq);
+
        /*
         * We are committed now to parking the engines, make sure there
         * will be no more interrupts arriving later.

to flush a pending irq and not worry about a multi-phase park.
(synchronize_irq() does not return until any handler still executing for
that irq line has completed, so it acts as the barrier here.)
-Chris

* [PATCH 2/2] drm/i915: Filter out spurious execlists context-switch interrupts
  2017-10-23 20:06 Chris Wilson
@ 2017-10-23 20:06 ` Chris Wilson
  2017-10-23 21:12   ` Chris Wilson
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Wilson @ 2017-10-23 20:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Back in commit a4b2b01523a8 ("drm/i915: Don't mark an execlists
context-switch when idle") we noticed the presence of late
context-switch interrupts. We were able to filter those out by looking
at whether the ELSP remained active, but in commit beecec901790
("drm/i915/execlists: Preemption!") that became problematic as we now
anticipate receiving a context-switch event for preemption while ELSP
may be empty. To restore the spurious interrupt suppression, add a
counter for the expected number of pending context-switches and skip if
we do not need to handle this interrupt to make forward progress.

v2: Don't forget to switch on for preempt.
v3: Reduce the counter to an on/off boolean tracker. Declare the HW as
active when we first submit, and idle after the final completion event
(with which we confirm the HW says it is idle), and track each source
of activity separately. With a finite number of sources, it should help
us debug which one gets stuck.

Fixes: beecec901790 ("drm/i915/execlists: Preemption!")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c |  3 +++
 drivers/gpu/drm/i915/i915_irq.c            |  6 ++++--
 drivers/gpu/drm/i915/intel_engine_cs.c     |  5 +++--
 drivers/gpu/drm/i915/intel_lrc.c           | 27 ++++++++++++++++++------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 34 ++++++++++++++++++++++++++++--
 5 files changed, 62 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index a2e8114b739d..f84c267728fd 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -610,6 +610,7 @@ static void i915_guc_dequeue(struct intel_engine_cs *engine)
 	execlists->first = rb;
 	if (submit) {
 		port_assign(port, last);
+		execlists_set_active(execlists, EXECLISTS_ACTIVE_USER);
 		i915_guc_submit(engine);
 	}
 	spin_unlock_irq(&engine->timeline->lock);
@@ -633,6 +634,8 @@ static void i915_guc_irq_handler(unsigned long data)
 
 		rq = port_request(&port[0]);
 	}
+	if (!rq)
+		execlists_clear_active(execlists, EXECLISTS_ACTIVE_USER);
 
 	if (!port_isset(last_port))
 		i915_guc_dequeue(engine);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b1296a55c1e4..f8205841868b 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1388,8 +1388,10 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
 	bool tasklet = false;
 
 	if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
-		__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
-		tasklet = true;
+		if (READ_ONCE(engine->execlists.active)) {
+			__set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
+			tasklet = true;
+		}
 	}
 
 	if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) {
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index a47a9c6bea52..ab5bf4e2e28e 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1548,8 +1548,8 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
 		return false;
 
-	/* Both ports drained, no more ELSP submission? */
-	if (port_request(&engine->execlists.port[0]))
+	/* Waiting to drain ELSP? */
+	if (READ_ONCE(engine->execlists.active))
 		return false;
 
 	/* ELSP is empty, but there are ready requests? */
@@ -1749,6 +1749,7 @@ void intel_engine_dump(struct intel_engine_cs *engine, struct drm_printer *m)
 					   idx);
 			}
 		}
+		drm_printf(m, "\t\tHW active? 0x%x\n", execlists->active);
 		rcu_read_unlock();
 	} else if (INTEL_GEN(dev_priv) > 6) {
 		drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 7f45dd7dc3e5..233c992a886b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -575,7 +575,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			 * the state of the GPU is known (idle).
 			 */
 			inject_preempt_context(engine);
-			execlists->preempt = true;
+			execlists_set_active(execlists,
+					     EXECLISTS_ACTIVE_PREEMPT);
 			goto unlock;
 		} else {
 			/*
@@ -683,8 +684,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 unlock:
 	spin_unlock_irq(&engine->timeline->lock);
 
-	if (submit)
+	if (submit) {
+		execlists_set_active(execlists, EXECLISTS_ACTIVE_USER);
 		execlists_submit_ports(engine);
+	}
 }
 
 static void
@@ -696,6 +699,7 @@ execlist_cancel_port_requests(struct intel_engine_execlists *execlists)
 	while (num_ports-- && port_isset(port)) {
 		struct drm_i915_gem_request *rq = port_request(port);
 
+		GEM_BUG_ON(!execlists->active);
 		execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_PREEMPTED);
 		i915_gem_request_put(rq);
 
@@ -861,15 +865,21 @@ static void intel_lrc_irq_handler(unsigned long data)
 				unwind_incomplete_requests(engine);
 				spin_unlock_irq(&engine->timeline->lock);
 
-				GEM_BUG_ON(!execlists->preempt);
-				execlists->preempt = false;
+				GEM_BUG_ON(!execlists_is_active(execlists,
+								EXECLISTS_ACTIVE_PREEMPT));
+				execlists_clear_active(execlists,
+						       EXECLISTS_ACTIVE_PREEMPT);
 				continue;
 			}
 
 			if (status & GEN8_CTX_STATUS_PREEMPTED &&
-			    execlists->preempt)
+			    execlists_is_active(execlists,
+					       EXECLISTS_ACTIVE_PREEMPT))
 				continue;
 
+			GEM_BUG_ON(!execlists_is_active(execlists,
+							EXECLISTS_ACTIVE_USER));
+
 			/* Check the context/desc id for this event matches */
 			GEM_DEBUG_BUG_ON(buf[2 * head + 1] != port->context_id);
 
@@ -892,6 +902,9 @@ static void intel_lrc_irq_handler(unsigned long data)
 			/* After the final element, the hw should be idle */
 			GEM_BUG_ON(port_count(port) == 0 &&
 				   !(status & GEN8_CTX_STATUS_ACTIVE_IDLE));
+			if (port_count(port) == 0)
+				execlists_clear_active(execlists,
+						       EXECLISTS_ACTIVE_USER);
 		}
 
 		if (head != execlists->csb_head) {
@@ -901,7 +914,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 		}
 	}
 
-	if (!execlists->preempt)
+	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
 		execlists_dequeue(engine);
 
 	intel_uncore_forcewake_put(dev_priv, execlists->fw_domains);
@@ -1460,7 +1473,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
 		   GT_CONTEXT_SWITCH_INTERRUPT << engine->irq_shift);
 	clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 	execlists->csb_head = -1;
-	execlists->preempt = false;
+	execlists->active = 0;
 
 	/* After a GPU reset, we may have requests to replay */
 	if (!i915_modparams.enable_guc_submission && execlists->first)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 17186f067408..51bc704bf413 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -241,9 +241,17 @@ struct intel_engine_execlists {
 	} port[EXECLIST_MAX_PORTS];
 
 	/**
-	 * @preempt: are we currently handling a preempting context switch?
+	 * @active: is the HW active? We consider the HW as active after
+	 * submitted any context for execution until we have seen the last
+	 * context completion event. After that, we do not expect any more
+	 * events until we submit, and so can park the HW.
+	 *
+	 * As we have a small number of different sources from which we feed
+	 * the HW, we track the state of each inside a single bitfield.
 	 */
-	bool preempt;
+	unsigned int active;
+#define EXECLISTS_ACTIVE_USER 0
+#define EXECLISTS_ACTIVE_PREEMPT 1
 
 	/**
 	 * @port_mask: number of execlist ports - 1
@@ -525,6 +533,27 @@ struct intel_engine_cs {
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 };
 
+static inline void
+execlists_set_active(struct intel_engine_execlists *execlists,
+		     unsigned int bit)
+{
+	__set_bit(bit, (unsigned long *)&execlists->active);
+}
+
+static inline void
+execlists_clear_active(struct intel_engine_execlists *execlists,
+		       unsigned int bit)
+{
+	__clear_bit(bit, (unsigned long *)&execlists->active);
+}
+
+static inline bool
+execlists_is_active(const struct intel_engine_execlists *execlists,
+		    unsigned int bit)
+{
+	return test_bit(bit, (unsigned long *)&execlists->active);
+}
+
 static inline unsigned int
 execlists_num_ports(const struct intel_engine_execlists * const execlists)
 {
@@ -538,6 +567,7 @@ execlists_port_complete(struct intel_engine_execlists * const execlists,
 	const unsigned int m = execlists->port_mask;
 
 	GEM_BUG_ON(port_index(port, execlists) != 0);
+	GEM_BUG_ON(!execlists_is_active(execlists, EXECLISTS_ACTIVE_USER));
 
 	memmove(port, port + 1, m * sizeof(struct execlist_port));
 	memset(port + m, 0, sizeof(struct execlist_port));
-- 
2.15.0.rc1

