* [CI-ping 1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+
@ 2016-04-09  9:57 Chris Wilson
  2016-04-09  9:57 ` [CI-ping 2/5] drm/i915: Separate out the seqno-barrier from engine->get_seqno Chris Wilson
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Chris Wilson @ 2016-04-09  9:57 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Mika Kuoppala

In order to ensure seqno/irq coherency, we currently read a ring register.
The mmio transaction following the interrupt delays the inspection of
the seqno long enough for the MI_STORE_DWORD_IMM to update the CPU
cache. However, it is only the memory timing that is important for the
purposes of the delay; we neither need nor desire the extra forcewake.
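
A hedged sketch of the coherent read path after this change (this mirrors the
hunk below; the function name is mine, the comments are my reading, and the
macros/helpers are the driver's):

static u32 gen6_coherent_seqno_sketch(struct intel_engine_cs *engine)
{
	struct drm_i915_private *dev_priv = engine->dev->dev_private;

	/* The raw mmio read is the delay we rely on: one uncached bus
	 * round-trip, long enough for the GPU's seqno store to become
	 * visible to the CPU.  POSTING_READ_FW() skips the forcewake
	 * get/put that POSTING_READ() would wrap around the same read. */
	POSTING_READ_FW(RING_ACTHD(engine->mmio_base));

	return intel_read_status_page(engine, I915_GEM_HWS_INDEX);
}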

v3: Update commentary

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2]
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 6b4952031e30..edd5ae41ba60 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1573,10 +1573,19 @@ gen6_ring_get_seqno(struct intel_engine_cs *engine, bool lazy_coherency)
 {
 	/* Workaround to force correct ordering between irq and seqno writes on
 	 * ivb (and maybe also on snb) by reading from a CS register (like
-	 * ACTHD) before reading the status page. */
+	 * ACTHD) before reading the status page.
+	 *
+	 * Note that this effectively stalls the read by the time it takes to
+	 * do a memory transaction, which more or less ensures that the write
+	 * from the GPU has sufficient time to invalidate the CPU cacheline.
+	 * Alternatively we could delay the interrupt from the CS ring to give
+	 * the write time to land, but that would incur a delay after every
+	 * batch i.e. much more frequent than a delay when waiting for the
+	 * interrupt (with the same net latency).
+	 */
 	if (!lazy_coherency) {
 		struct drm_i915_private *dev_priv = engine->dev->dev_private;
-		POSTING_READ(RING_ACTHD(engine->mmio_base));
+		POSTING_READ_FW(RING_ACTHD(engine->mmio_base));
 	}
 
 	return intel_read_status_page(engine, I915_GEM_HWS_INDEX);
-- 
2.8.0.rc3

* [CI-ping 2/5] drm/i915: Separate out the seqno-barrier from engine->get_seqno
  2016-04-09  9:57 [CI-ping 1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Chris Wilson
@ 2016-04-09  9:57 ` Chris Wilson
  2016-04-09  9:57 ` [CI-ping 3/5] drm/i915: Harden detection of missed interrupts Chris Wilson
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2016-04-09  9:57 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

In order to simplify future patches, extract the lazy_coherency
optimisation out of the engine->get_seqno() vfunc into its own callback.

v2: Rename the barrier to engine->irq_seqno_barrier to better reflect
that the barrier is only required after the user interrupt and before
reading the seqno (to ensure that the seqno update lands in time, as we
do not have strict seqno-irq ordering on all platforms).
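
A hedged sketch of the resulting calling convention, as seen from a request
helper (this mirrors the i915_gem_request_completed() hunk below; the function
name and the after_irq flag are illustrative, not new driver API):

static bool request_completed_sketch(struct drm_i915_gem_request *req,
				     bool after_irq)
{
	struct intel_engine_cs *engine = req->engine;

	/* The barrier is only needed on the irq -> seqno path, and only on
	 * engines that provide one; polling callers skip it entirely. */
	if (after_irq && engine->irq_seqno_barrier)
		engine->irq_seqno_barrier(engine);

	return i915_seqno_passed(engine->get_seqno(engine), req->seqno);
}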

Reviewed-by: Dave Gordon <david.s.gordon@intel.com> [#v2]

v3: Comments for hangcheck paranoia. Mika wanted to keep the extra
barrier inside the hangcheck, just in case. I can argue that it doesn't
provide a barrier against anything, but the side-effects of applying the
barrier may prevent a false declaration of a hung GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |  6 +++---
 drivers/gpu/drm/i915/i915_drv.h         | 12 ++++++++----
 drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
 drivers/gpu/drm/i915/i915_irq.c         | 14 ++++++++++++--
 drivers/gpu/drm/i915/i915_trace.h       |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c        | 19 ++++++------------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 34 +++++++++++++++++----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 ++--
 8 files changed, 51 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index ebbf4e400684..919c05ba9932 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -598,7 +598,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data)
 					   engine->name,
 					   i915_gem_request_get_seqno(work->flip_queued_req),
 					   dev_priv->next_seqno,
-					   engine->get_seqno(engine, true),
+					   engine->get_seqno(engine),
 					   i915_gem_request_completed(work->flip_queued_req, true));
 			} else
 				seq_printf(m, "Flip not associated with any ring\n");
@@ -730,7 +730,7 @@ static void i915_ring_seqno_info(struct seq_file *m,
 {
 	if (engine->get_seqno) {
 		seq_printf(m, "Current sequence (%s): %x\n",
-			   engine->name, engine->get_seqno(engine, false));
+			   engine->name, engine->get_seqno(engine));
 	}
 }
 
@@ -1346,8 +1346,8 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 	intel_runtime_pm_get(dev_priv);
 
 	for_each_engine_id(engine, dev_priv, id) {
-		seqno[id] = engine->get_seqno(engine, false);
 		acthd[id] = intel_ring_get_active_head(engine);
+		seqno[id] = engine->get_seqno(engine);
 	}
 
 	i915_get_extra_instdone(dev, instdone);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a93e5dd4fa9a..542401659013 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3017,15 +3017,19 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
 static inline bool i915_gem_request_started(struct drm_i915_gem_request *req,
 					   bool lazy_coherency)
 {
-	u32 seqno = req->engine->get_seqno(req->engine, lazy_coherency);
-	return i915_seqno_passed(seqno, req->previous_seqno);
+	if (!lazy_coherency && req->engine->irq_seqno_barrier)
+		req->engine->irq_seqno_barrier(req->engine);
+	return i915_seqno_passed(req->engine->get_seqno(req->engine),
+				 req->previous_seqno);
 }
 
 static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
 					      bool lazy_coherency)
 {
-	u32 seqno = req->engine->get_seqno(req->engine, lazy_coherency);
-	return i915_seqno_passed(seqno, req->seqno);
+	if (!lazy_coherency && req->engine->irq_seqno_barrier)
+		req->engine->irq_seqno_barrier(req->engine);
+	return i915_seqno_passed(req->engine->get_seqno(req->engine),
+				 req->seqno);
 }
 
 int __must_check i915_gem_get_seqno(struct drm_device *dev, u32 *seqno);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index ce77713a555d..89725c9efc25 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -931,8 +931,8 @@ static void i915_record_ring_state(struct drm_device *dev,
 
 	ering->waiting = waitqueue_active(&engine->irq_queue);
 	ering->instpm = I915_READ(RING_INSTPM(engine->mmio_base));
-	ering->seqno = engine->get_seqno(engine, false);
 	ering->acthd = intel_ring_get_active_head(engine);
+	ering->seqno = engine->get_seqno(engine);
 	ering->last_seqno = engine->last_submitted_seqno;
 	ering->start = I915_READ_START(engine);
 	ering->head = I915_READ_HEAD(engine);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index c30a12ef6daf..3b946e1c7614 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2941,7 +2941,7 @@ static int semaphore_passed(struct intel_engine_cs *engine)
 	if (signaller->hangcheck.deadlock >= I915_NUM_ENGINES)
 		return -1;
 
-	if (i915_seqno_passed(signaller->get_seqno(signaller, false), seqno))
+	if (i915_seqno_passed(signaller->get_seqno(signaller), seqno))
 		return 1;
 
 	/* cursory check for an unkickable deadlock */
@@ -3100,8 +3100,18 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 
 		semaphore_clear_deadlocks(dev_priv);
 
-		seqno = engine->get_seqno(engine, false);
+		/* We don't strictly need an irq-barrier here, as we are not
+		 * serving an interrupt request, be paranoid in case the
+		 * barrier has side-effects (such as preventing a broken
+		 * cacheline snoop) and so be sure that we can see the seqno
+		 * advance. If the seqno should stick, due to a stale
+		 * cacheline, we would erroneously declare the GPU hung.
+		 */
+		if (engine->irq_seqno_barrier)
+			engine->irq_seqno_barrier(engine);
+
 		acthd = intel_ring_get_active_head(engine);
+		seqno = engine->get_seqno(engine);
 
 		if (engine->hangcheck.seqno == seqno) {
 			if (ring_idle(engine, seqno)) {
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index afdd8aefb5b7..dc0def210097 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -562,7 +562,7 @@ TRACE_EVENT(i915_gem_request_notify,
 	    TP_fast_assign(
 			   __entry->dev = engine->dev->primary->index;
 			   __entry->ring = engine->id;
-			   __entry->seqno = engine->get_seqno(engine, false);
+			   __entry->seqno = engine->get_seqno(engine);
 			   ),
 
 	    TP_printk("dev=%u, ring=%u, seqno=%u",
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index a1db6a02cf23..419cef89c2b3 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1852,7 +1852,7 @@ static int gen8_emit_flush_render(struct drm_i915_gem_request *request,
 	return 0;
 }
 
-static u32 gen8_get_seqno(struct intel_engine_cs *engine, bool lazy_coherency)
+static u32 gen8_get_seqno(struct intel_engine_cs *engine)
 {
 	return intel_read_status_page(engine, I915_GEM_HWS_INDEX);
 }
@@ -1862,10 +1862,8 @@ static void gen8_set_seqno(struct intel_engine_cs *engine, u32 seqno)
 	intel_write_status_page(engine, I915_GEM_HWS_INDEX, seqno);
 }
 
-static u32 bxt_a_get_seqno(struct intel_engine_cs *engine,
-			   bool lazy_coherency)
+static void bxt_a_seqno_barrier(struct intel_engine_cs *engine)
 {
-
 	/*
 	 * On BXT A steppings there is a HW coherency issue whereby the
 	 * MI_STORE_DATA_IMM storing the completed request's seqno
@@ -1876,11 +1874,7 @@ static u32 bxt_a_get_seqno(struct intel_engine_cs *engine,
 	 * bxt_a_set_seqno(), where we also do a clflush after the write. So
 	 * this clflush in practice becomes an invalidate operation.
 	 */
-
-	if (!lazy_coherency)
-		intel_flush_status_page(engine, I915_GEM_HWS_INDEX);
-
-	return intel_read_status_page(engine, I915_GEM_HWS_INDEX);
+	intel_flush_status_page(engine, I915_GEM_HWS_INDEX);
 }
 
 static void bxt_a_set_seqno(struct intel_engine_cs *engine, u32 seqno)
@@ -2058,12 +2052,11 @@ logical_ring_default_vfuncs(struct drm_device *dev,
 	engine->irq_get = gen8_logical_ring_get_irq;
 	engine->irq_put = gen8_logical_ring_put_irq;
 	engine->emit_bb_start = gen8_emit_bb_start;
+	engine->get_seqno = gen8_get_seqno;
+	engine->set_seqno = gen8_set_seqno;
 	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
-		engine->get_seqno = bxt_a_get_seqno;
+		engine->irq_seqno_barrier = bxt_a_seqno_barrier;
 		engine->set_seqno = bxt_a_set_seqno;
-	} else {
-		engine->get_seqno = gen8_get_seqno;
-		engine->set_seqno = gen8_set_seqno;
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index edd5ae41ba60..215af7603f6c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1568,8 +1568,8 @@ pc_render_add_request(struct drm_i915_gem_request *req)
 	return 0;
 }
 
-static u32
-gen6_ring_get_seqno(struct intel_engine_cs *engine, bool lazy_coherency)
+static void
+gen6_seqno_barrier(struct intel_engine_cs *engine)
 {
 	/* Workaround to force correct ordering between irq and seqno writes on
 	 * ivb (and maybe also on snb) by reading from a CS register (like
@@ -1583,16 +1583,12 @@ gen6_ring_get_seqno(struct intel_engine_cs *engine, bool lazy_coherency)
 	 * batch i.e. much more frequent than a delay when waiting for the
 	 * interrupt (with the same net latency).
 	 */
-	if (!lazy_coherency) {
-		struct drm_i915_private *dev_priv = engine->dev->dev_private;
-		POSTING_READ_FW(RING_ACTHD(engine->mmio_base));
-	}
-
-	return intel_read_status_page(engine, I915_GEM_HWS_INDEX);
+	struct drm_i915_private *dev_priv = engine->dev->dev_private;
+	POSTING_READ_FW(RING_ACTHD(engine->mmio_base));
 }
 
 static u32
-ring_get_seqno(struct intel_engine_cs *engine, bool lazy_coherency)
+ring_get_seqno(struct intel_engine_cs *engine)
 {
 	return intel_read_status_page(engine, I915_GEM_HWS_INDEX);
 }
@@ -1604,7 +1600,7 @@ ring_set_seqno(struct intel_engine_cs *engine, u32 seqno)
 }
 
 static u32
-pc_render_get_seqno(struct intel_engine_cs *engine, bool lazy_coherency)
+pc_render_get_seqno(struct intel_engine_cs *engine)
 {
 	return engine->scratch.cpu_page[0];
 }
@@ -2828,7 +2824,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		engine->irq_get = gen8_ring_get_irq;
 		engine->irq_put = gen8_ring_put_irq;
 		engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
-		engine->get_seqno = gen6_ring_get_seqno;
+		engine->irq_seqno_barrier = gen6_seqno_barrier;
+		engine->get_seqno = ring_get_seqno;
 		engine->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
 			WARN_ON(!dev_priv->semaphore_obj);
@@ -2845,7 +2842,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		engine->irq_get = gen6_ring_get_irq;
 		engine->irq_put = gen6_ring_put_irq;
 		engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
-		engine->get_seqno = gen6_ring_get_seqno;
+		engine->irq_seqno_barrier = gen6_seqno_barrier;
+		engine->get_seqno = ring_get_seqno;
 		engine->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
 			engine->semaphore.sync_to = gen6_ring_sync;
@@ -2960,7 +2958,8 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 			engine->write_tail = gen6_bsd_ring_write_tail;
 		engine->flush = gen6_bsd_ring_flush;
 		engine->add_request = gen6_add_request;
-		engine->get_seqno = gen6_ring_get_seqno;
+		engine->irq_seqno_barrier = gen6_seqno_barrier;
+		engine->get_seqno = ring_get_seqno;
 		engine->set_seqno = ring_set_seqno;
 		if (INTEL_INFO(dev)->gen >= 8) {
 			engine->irq_enable_mask =
@@ -3033,7 +3032,8 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
 	engine->mmio_base = GEN8_BSD2_RING_BASE;
 	engine->flush = gen6_bsd_ring_flush;
 	engine->add_request = gen6_add_request;
-	engine->get_seqno = gen6_ring_get_seqno;
+	engine->irq_seqno_barrier = gen6_seqno_barrier;
+	engine->get_seqno = ring_get_seqno;
 	engine->set_seqno = ring_set_seqno;
 	engine->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT;
@@ -3064,7 +3064,8 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 	engine->write_tail = ring_write_tail;
 	engine->flush = gen6_ring_flush;
 	engine->add_request = gen6_add_request;
-	engine->get_seqno = gen6_ring_get_seqno;
+	engine->irq_seqno_barrier = gen6_seqno_barrier;
+	engine->get_seqno = ring_get_seqno;
 	engine->set_seqno = ring_set_seqno;
 	if (INTEL_INFO(dev)->gen >= 8) {
 		engine->irq_enable_mask =
@@ -3122,7 +3123,8 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 	engine->write_tail = ring_write_tail;
 	engine->flush = gen6_ring_flush;
 	engine->add_request = gen6_add_request;
-	engine->get_seqno = gen6_ring_get_seqno;
+	engine->irq_seqno_barrier = gen6_seqno_barrier;
+	engine->get_seqno = ring_get_seqno;
 	engine->set_seqno = ring_set_seqno;
 
 	if (INTEL_INFO(dev)->gen >= 8) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 98eadfa79116..3f04906a081f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -193,8 +193,8 @@ struct  intel_engine_cs {
 	 * seen value is good enough. Note that the seqno will always be
 	 * monotonic, even if not coherent.
 	 */
-	u32		(*get_seqno)(struct intel_engine_cs *ring,
-				     bool lazy_coherency);
+	void		(*irq_seqno_barrier)(struct intel_engine_cs *ring);
+	u32		(*get_seqno)(struct intel_engine_cs *ring);
 	void		(*set_seqno)(struct intel_engine_cs *ring,
 				     u32 seqno);
 	int		(*dispatch_execbuffer)(struct drm_i915_gem_request *req,
-- 
2.8.0.rc3

* [CI-ping 3/5] drm/i915: Harden detection of missed interrupts
  2016-04-09  9:57 [CI-ping 1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Chris Wilson
  2016-04-09  9:57 ` [CI-ping 2/5] drm/i915: Separate out the seqno-barrier from engine->get_seqno Chris Wilson
@ 2016-04-09  9:57 ` Chris Wilson
  2016-04-09  9:57 ` [CI-ping 4/5] drm/i915: Use simplest form for flushing the single cacheline in the HWS Chris Wilson
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2016-04-09  9:57 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Only declare a missed interrupt if we find that the GPU is idle with
waiters and a hangcheck interval has passed in which no new user
interrupts have been raised.

v2: Clear the stuck interrupt marker between successful batches
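
A hedged sketch of the resulting detection rule (the real logic is split
between i915_hangcheck_elapsed() and kick_waiters() in the hunks below; the
missed_irq name is illustrative):

	/* Flag a missed interrupt only if, across a whole hangcheck
	 * interval: the seqno has not advanced, the engine is idle, a
	 * waiter is still queued, and no new user interrupts arrived. */
	bool missed_irq =
		engine->hangcheck.seqno == seqno &&
		ring_idle(engine, seqno) &&
		waitqueue_active(&engine->irq_queue) &&
		engine->hangcheck.user_interrupts ==
			READ_ONCE(engine->user_interrupts);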

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     | 11 ++++++----
 drivers/gpu/drm/i915/i915_irq.c         | 38 ++++++++++++++++++++++-----------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
 3 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 919c05ba9932..9640738aabf2 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -728,10 +728,10 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
 static void i915_ring_seqno_info(struct seq_file *m,
 				 struct intel_engine_cs *engine)
 {
-	if (engine->get_seqno) {
-		seq_printf(m, "Current sequence (%s): %x\n",
-			   engine->name, engine->get_seqno(engine));
-	}
+	seq_printf(m, "Current sequence (%s): %x\n",
+		   engine->name, engine->get_seqno(engine));
+	seq_printf(m, "Current user interrupts (%s): %x\n",
+		   engine->name, READ_ONCE(engine->user_interrupts));
 }
 
 static int i915_gem_seqno_info(struct seq_file *m, void *data)
@@ -1367,6 +1367,9 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 			   engine->hangcheck.seqno,
 			   seqno[id],
 			   engine->last_submitted_seqno);
+		seq_printf(m, "\tuser interrupts = %x [current %x]\n",
+			   engine->hangcheck.user_interrupts,
+			   READ_ONCE(engine->user_interrupts));
 		seq_printf(m, "\tACTHD = 0x%08llx [current 0x%08llx]\n",
 			   (long long)engine->hangcheck.acthd,
 			   (long long)acthd[id]);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 3b946e1c7614..679f08c944ef 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1000,6 +1000,7 @@ static void notify_ring(struct intel_engine_cs *engine)
 		return;
 
 	trace_i915_gem_request_notify(engine);
+	engine->user_interrupts++;
 
 	wake_up_all(&engine->irq_queue);
 }
@@ -3054,6 +3055,24 @@ ring_stuck(struct intel_engine_cs *engine, u64 acthd)
 	return HANGCHECK_HUNG;
 }
 
+static unsigned kick_waiters(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *i915 = to_i915(engine->dev);
+	unsigned user_interrupts = READ_ONCE(engine->user_interrupts);
+
+	if (engine->hangcheck.user_interrupts == user_interrupts &&
+	    !test_and_set_bit(engine->id, &i915->gpu_error.missed_irq_rings)) {
+		if (!(i915->gpu_error.test_irq_rings & intel_engine_flag(engine)))
+			DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
+				  engine->name);
+		else
+			DRM_INFO("Fake missed irq on %s\n",
+				 engine->name);
+		wake_up_all(&engine->irq_queue);
+	}
+
+	return user_interrupts;
+}
 /*
  * This is called when the chip hasn't reported back with completed
  * batchbuffers in a long time. We keep track per ring seqno progress and
@@ -3096,6 +3115,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	for_each_engine_id(engine, dev_priv, id) {
 		u64 acthd;
 		u32 seqno;
+		unsigned user_interrupts;
 		bool busy = true;
 
 		semaphore_clear_deadlocks(dev_priv);
@@ -3113,22 +3133,15 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 		acthd = intel_ring_get_active_head(engine);
 		seqno = engine->get_seqno(engine);
 
+		/* Reset stuck interrupts between batch advances */
+		user_interrupts = 0;
+
 		if (engine->hangcheck.seqno == seqno) {
 			if (ring_idle(engine, seqno)) {
 				engine->hangcheck.action = HANGCHECK_IDLE;
-
 				if (waitqueue_active(&engine->irq_queue)) {
-					/* Issue a wake-up to catch stuck h/w. */
-					if (!test_and_set_bit(engine->id, &dev_priv->gpu_error.missed_irq_rings)) {
-						if (!(dev_priv->gpu_error.test_irq_rings & intel_engine_flag(engine)))
-							DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
-								  engine->name);
-						else
-							DRM_INFO("Fake missed irq on %s\n",
-								 engine->name);
-						wake_up_all(&engine->irq_queue);
-					}
 					/* Safeguard against driver failure */
+					user_interrupts = kick_waiters(engine);
 					engine->hangcheck.score += BUSY;
 				} else
 					busy = false;
@@ -3179,7 +3192,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 				engine->hangcheck.score = 0;
 
 			/* Clear head and subunit states on seqno movement */
-			engine->hangcheck.acthd = 0;
+			acthd = 0;
 
 			memset(engine->hangcheck.instdone, 0,
 			       sizeof(engine->hangcheck.instdone));
@@ -3187,6 +3200,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 
 		engine->hangcheck.seqno = seqno;
 		engine->hangcheck.acthd = acthd;
+		engine->hangcheck.user_interrupts = user_interrupts;
 		busy_count += busy;
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 3f04906a081f..29c54cc1ee5c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -87,6 +87,7 @@ enum intel_ring_hangcheck_action {
 struct intel_ring_hangcheck {
 	u64 acthd;
 	u32 seqno;
+	unsigned user_interrupts;
 	int score;
 	enum intel_ring_hangcheck_action action;
 	int deadlock;
@@ -305,6 +306,7 @@ struct  intel_engine_cs {
 	 * inspecting request list.
 	 */
 	u32 last_submitted_seqno;
+	unsigned user_interrupts;
 
 	bool gpu_caches_dirty;
 
-- 
2.8.0.rc3

* [CI-ping 4/5] drm/i915: Use simplest form for flushing the single cacheline in the HWS
  2016-04-09  9:57 [CI-ping 1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Chris Wilson
  2016-04-09  9:57 ` [CI-ping 2/5] drm/i915: Separate out the seqno-barrier from engine->get_seqno Chris Wilson
  2016-04-09  9:57 ` [CI-ping 3/5] drm/i915: Harden detection of missed interrupts Chris Wilson
@ 2016-04-09  9:57 ` Chris Wilson
  2016-04-09  9:57 ` [CI-ping 5/5] drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor Chris Wilson
  2016-04-09 11:02 ` ✗ Fi.CI.BAT: failure for series starting with [CI-ping,1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Patchwork
  4 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2016-04-09  9:57 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Rather than call a function to compute the matching cachelines and
clflush them, just call the clflush *instruction* directly. We also know
that we can use the unpatched plain clflush rather than the clflushopt
alternative.
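
A hedged sketch of what the helper reduces to (this mirrors the hunk below;
the comments on the fences are my reading of why both mb()s are kept):

static inline void
intel_flush_status_page_sketch(struct intel_engine_cs *engine, int reg)
{
	mb();		/* order prior accesses before the flush */
	clflush(&engine->status_page.page_addr[reg]);	/* one dword, one cacheline */
	mb();		/* make the flush globally visible before the re-read */
}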

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 29c54cc1ee5c..9d7b7bf9ed14 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -385,8 +385,9 @@ intel_ring_sync_index(struct intel_engine_cs *engine,
 static inline void
 intel_flush_status_page(struct intel_engine_cs *engine, int reg)
 {
-	drm_clflush_virt_range(&engine->status_page.page_addr[reg],
-			       sizeof(uint32_t));
+	mb();
+	clflush(&engine->status_page.page_addr[reg]);
+	mb();
 }
 
 static inline u32
-- 
2.8.0.rc3

* [CI-ping 5/5] drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor
  2016-04-09  9:57 [CI-ping 1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Chris Wilson
                   ` (2 preceding siblings ...)
  2016-04-09  9:57 ` [CI-ping 4/5] drm/i915: Use simplest form for flushing the single cacheline in the HWS Chris Wilson
@ 2016-04-09  9:57 ` Chris Wilson
  2016-04-09 11:02 ` ✗ Fi.CI.BAT: failure for series starting with [CI-ping,1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Patchwork
  4 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2016-04-09  9:57 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Mika Kuoppala

When reading from the HWS page, we use barrier() to prevent the compiler
optimising away the read from the volatile (may be updated by the GPU)
memory address. This is more suited to READ_ONCE(); make it so.
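
A hedged sketch of the two forms being compared (the function names are
illustrative; barrier() and READ_ONCE() are the stock kernel macros):

static inline u32 hws_read_old(const u32 *page, int reg)
{
	barrier();		/* compiler clobber: discard any cached value */
	return page[reg];	/* then re-load from the GPU-updated page */
}

static inline u32 hws_read_new(const u32 *page, int reg)
{
	return READ_ONCE(page[reg]);	/* a single volatile load of this dword */
}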

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.h | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 9d7b7bf9ed14..78dc46864a10 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -391,12 +391,10 @@ intel_flush_status_page(struct intel_engine_cs *engine, int reg)
 }
 
 static inline u32
-intel_read_status_page(struct intel_engine_cs *engine,
-		       int reg)
+intel_read_status_page(struct intel_engine_cs *engine, int reg)
 {
 	/* Ensure that the compiler doesn't optimize away the load. */
-	barrier();
-	return engine->status_page.page_addr[reg];
+	return READ_ONCE(engine->status_page.page_addr[reg]);
 }
 
 static inline void
-- 
2.8.0.rc3

* ✗ Fi.CI.BAT: failure for series starting with [CI-ping,1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+
  2016-04-09  9:57 [CI-ping 1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Chris Wilson
                   ` (3 preceding siblings ...)
  2016-04-09  9:57 ` [CI-ping 5/5] drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor Chris Wilson
@ 2016-04-09 11:02 ` Patchwork
  4 siblings, 0 replies; 6+ messages in thread
From: Patchwork @ 2016-04-09 11:02 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [CI-ping,1/5] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+
URL   : https://patchwork.freedesktop.org/series/5485/
State : failure

== Summary ==

Series 5485v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/5485/revisions/1/mbox/

Test gem_exec_basic:
        Subgroup basic-bsd:
                dmesg-warn -> PASS       (bsw-nuc-2)
Test gem_exec_suspend:
        Subgroup basic-s3:
                dmesg-warn -> PASS       (bsw-nuc-2)
Test gem_sync:
        Subgroup basic-bsd:
                dmesg-warn -> PASS       (bsw-nuc-2)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-c:
                pass       -> FAIL       (bdw-ultra)
Test pm_rpm:
        Subgroup basic-rte:
                dmesg-warn -> PASS       (bsw-nuc-2)

bdw-nuci7        total:196  pass:184  dwarn:0   dfail:0   fail:0   skip:12 
bdw-ultra        total:196  pass:174  dwarn:0   dfail:0   fail:1   skip:21 
bsw-nuc-2        total:196  pass:159  dwarn:0   dfail:0   fail:0   skip:37 
byt-nuc          total:196  pass:161  dwarn:0   dfail:0   fail:0   skip:35 
hsw-brixbox      total:196  pass:174  dwarn:0   dfail:0   fail:0   skip:22 
ilk-hp8440p      total:196  pass:132  dwarn:0   dfail:0   fail:0   skip:64 
ivb-t430s        total:196  pass:171  dwarn:0   dfail:0   fail:0   skip:25 
skl-i7k-2        total:196  pass:173  dwarn:0   dfail:0   fail:0   skip:23 
snb-dellxps      total:196  pass:162  dwarn:0   dfail:0   fail:0   skip:34 
snb-x220t        total:196  pass:162  dwarn:0   dfail:0   fail:1   skip:33 

Results at /archive/results/CI_IGT_test/Patchwork_1853/

949884a57b51aa158e3ae9afe1f08130cdb7a3ef drm-intel-nightly: 2016y-04m-08d-10h-45m-28s UTC integration manifest
6ba3220c2113d0a221600df3fc45b9e71366163b drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor
4067c51b5ab7dc1ade379ffc889612dd58cd5977 drm/i915: Use simplest form for flushing the single cacheline in the HWS
34bea3e89af1dd18367e1074bdb8b2f83274baaa drm/i915: Harden detection of missed interrupts
880a5353ea3377b4c9f6045e81591440bf6347a3 drm/i915: Separate out the seqno-barrier from engine->get_seqno
49f9ebf02b4818e3fe2088ef80cc690fec73a5a4 drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+
