All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 01/10] drm/i915: Make semaphore updates more precise
@ 2014-06-30 16:53 Rodrigo Vivi
  2014-06-30 16:53 ` [PATCH 02/10] drm/i915: gen specific ring init Rodrigo Vivi
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Rodrigo Vivi @ 2014-06-30 16:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Rodrigo Vivi

From: Ben Widawsky <ben@bwidawsk.net>

With the ring mask we now have an easy way to know the number of rings
in the system, and therefore can accurately predict the number of dwords
to emit for semaphore signalling. This was not possible (easily)
previously.

There should be no functional impact, simply fewer instructions emitted.

While we're here, simply do the round up to 2 instead of the fancier
rounding we did before, which rounding up per mbox, ie 4. This also
allows us to drop the unnecessary MI_NOOP, so not really 4, 3.

v2: Use 3 dwords instead of 4 (Ville)
Do the proper calculation to get the number of dwords to emit (Ville)
Conditionally set .sync_to when semaphores are enabled (Ville)

v3: Rebased on VCS2
Replace hweight_long with hweight32 (Ville)

v4: Pull out the accidentally squashed hunk from the next patch after
rebase (Daniel).

v5: Fix conflict after rebase (Rodrigo)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 27 +++++++++------------------
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 2faef26..5c20536 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -679,23 +679,16 @@ static int gen6_signal(struct intel_engine_cs *signaller,
 	struct drm_device *dev = signaller->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *useless;
-	int i, ret;
+	int i, ret, num_rings;
 
-	/* NB: In order to be able to do semaphore MBOX updates for varying
-	 * number of rings, it's easiest if we round up each individual update
-	 * to a multiple of 2 (since ring updates must always be a multiple of
-	 * 2) even though the actual update only requires 3 dwords.
-	 */
-#define MBOX_UPDATE_DWORDS 4
-	if (i915_semaphore_is_enabled(dev))
-		num_dwords += ((I915_NUM_RINGS-1) * MBOX_UPDATE_DWORDS);
-	else
-		return intel_ring_begin(signaller, num_dwords);
+#define MBOX_UPDATE_DWORDS 3
+	num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
+	num_dwords += round_up((num_rings-1) * MBOX_UPDATE_DWORDS, 2);
+#undef MBOX_UPDATE_DWORDS
 
 	ret = intel_ring_begin(signaller, num_dwords);
 	if (ret)
 		return ret;
-#undef MBOX_UPDATE_DWORDS
 
 	for_each_ring(useless, dev_priv, i) {
 		u32 mbox_reg = signaller->semaphore.mbox.signal[i];
@@ -703,15 +696,13 @@ static int gen6_signal(struct intel_engine_cs *signaller,
 			intel_ring_emit(signaller, MI_LOAD_REGISTER_IMM(1));
 			intel_ring_emit(signaller, mbox_reg);
 			intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
-			intel_ring_emit(signaller, MI_NOOP);
-		} else {
-			intel_ring_emit(signaller, MI_NOOP);
-			intel_ring_emit(signaller, MI_NOOP);
-			intel_ring_emit(signaller, MI_NOOP);
-			intel_ring_emit(signaller, MI_NOOP);
 		}
 	}
 
+	/* If num_dwords was rounded, make sure the tail pointer is correct */
+	if (num_rings % 2 == 0)
+		intel_ring_emit(signaller, MI_NOOP);
+
 	return 0;
 }
 
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 02/10] drm/i915: gen specific ring init
  2014-06-30 16:53 [PATCH 01/10] drm/i915: Make semaphore updates more precise Rodrigo Vivi
@ 2014-06-30 16:53 ` Rodrigo Vivi
  2014-06-30 16:53 ` [PATCH 03/10] drm/i915/bdw: implement semaphore signal Rodrigo Vivi
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Rodrigo Vivi @ 2014-06-30 16:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Rodrigo Vivi, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

Gen8 has already had some differentiation with how it handles rings.
Semaphores bring yet more differences, and now is as good a time as any
to do the split.

Also, since gen8 doesn't actually use semaphores up until this point,
put the proper "NULL" values in for the mbox info.

v2: v1 had a stale commit message

v3: Move everything in the is_semaphore_enabled() check

v4: VCS2 rebase
Remove double assignment of signal in render ring (Ville)

v5: Adding missed VCS2 signal init on gen8+ (Rodrigo)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 227 +++++++++++++++++++++-----------
 1 file changed, 151 insertions(+), 76 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 5c20536..e8be49b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -720,7 +720,11 @@ gen6_add_request(struct intel_engine_cs *ring)
 {
 	int ret;
 
-	ret = ring->semaphore.signal(ring, 4);
+	if (ring->semaphore.signal)
+		ret = ring->semaphore.signal(ring, 4);
+	else
+		ret = intel_ring_begin(ring, 4);
+
 	if (ret)
 		return ret;
 
@@ -1943,40 +1947,59 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 	ring->id = RCS;
 	ring->mmio_base = RENDER_RING_BASE;
 
-	if (INTEL_INFO(dev)->gen >= 6) {
+	if (INTEL_INFO(dev)->gen >= 8) {
+		ring->add_request = gen6_add_request;
+		ring->flush = gen8_render_ring_flush;
+		ring->irq_get = gen8_ring_get_irq;
+		ring->irq_put = gen8_ring_put_irq;
+		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
+		ring->get_seqno = gen6_ring_get_seqno;
+		ring->set_seqno = ring_set_seqno;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.signal = gen6_signal;
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		}
+	} else if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
 		ring->flush = gen7_render_ring_flush;
 		if (INTEL_INFO(dev)->gen == 6)
 			ring->flush = gen6_render_ring_flush;
-		if (INTEL_INFO(dev)->gen >= 8) {
-			ring->flush = gen8_render_ring_flush;
-			ring->irq_get = gen8_ring_get_irq;
-			ring->irq_put = gen8_ring_put_irq;
-		} else {
-			ring->irq_get = gen6_ring_get_irq;
-			ring->irq_put = gen6_ring_put_irq;
-		}
+		ring->irq_get = gen6_ring_get_irq;
+		ring->irq_put = gen6_ring_put_irq;
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
-		ring->semaphore.sync_to = gen6_ring_sync;
-		ring->semaphore.signal = gen6_signal;
-		/*
-		 * The current semaphore is only applied on pre-gen8 platform.
-		 * And there is no VCS2 ring on the pre-gen8 platform. So the
-		 * semaphore between RCS and VCS2 is initialized as INVALID.
-		 * Gen8 will initialize the sema between VCS2 and RCS later.
-		 */
-		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_RV;
-		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_RB;
-		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_RVE;
-		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
-		ring->semaphore.mbox.signal[VCS] = GEN6_VRSYNC;
-		ring->semaphore.mbox.signal[BCS] = GEN6_BRSYNC;
-		ring->semaphore.mbox.signal[VECS] = GEN6_VERSYNC;
-		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.signal = gen6_signal;
+			/*
+			 * The current semaphore is only applied on pre-gen8
+			 * platform.  And there is no VCS2 ring on the pre-gen8
+			 * platform. So the semaphore between RCS and VCS2 is
+			 * initialized as INVALID.  Gen8 will initialize the
+			 * sema between VCS2 and RCS later.
+			 */
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_RV;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_RB;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_RVE;
+			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_VRSYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_BRSYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_VERSYNC;
+			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		}
 	} else if (IS_GEN5(dev)) {
 		ring->add_request = pc_render_add_request;
 		ring->flush = gen4_render_ring_flush;
@@ -2004,6 +2027,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->irq_enable_mask = I915_USER_INTERRUPT;
 	}
 	ring->write_tail = ring_write_tail;
+
 	if (IS_HASWELL(dev))
 		ring->dispatch_execbuffer = hsw_ring_dispatch_execbuffer;
 	else if (IS_GEN8(dev))
@@ -2154,31 +2178,49 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 			ring->irq_put = gen8_ring_put_irq;
 			ring->dispatch_execbuffer =
 				gen8_ring_dispatch_execbuffer;
+			if (i915_semaphore_is_enabled(dev)) {
+				ring->semaphore.sync_to = gen6_ring_sync;
+				ring->semaphore.signal = gen6_signal;
+				/*
+				 * The current semaphore is only applied on
+				 * pre-gen8 platform.  And there is no VCS2 ring
+				 * on the pre-gen8 platform. So the semaphore
+				 * between VCS and VCS2 is initialized as
+				 * INVALID.  Gen8 will initialize the sema
+				 * between VCS2 and VCS later.
+				 */
+				ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
+				ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+				ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+				ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+				ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+			}
 		} else {
 			ring->irq_enable_mask = GT_BSD_USER_INTERRUPT;
 			ring->irq_get = gen6_ring_get_irq;
 			ring->irq_put = gen6_ring_put_irq;
 			ring->dispatch_execbuffer =
 				gen6_ring_dispatch_execbuffer;
+			if (i915_semaphore_is_enabled(dev)) {
+				ring->semaphore.sync_to = gen6_ring_sync;
+				ring->semaphore.signal = gen6_signal;
+				ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
+				ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
+				ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
+				ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
+				ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+				ring->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
+				ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
+				ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+			}
 		}
-		ring->semaphore.sync_to = gen6_ring_sync;
-		ring->semaphore.signal = gen6_signal;
-		/*
-		 * The current semaphore is only applied on pre-gen8 platform.
-		 * And there is no VCS2 ring on the pre-gen8 platform. So the
-		 * semaphore between VCS and VCS2 is initialized as INVALID.
-		 * Gen8 will initialize the sema between VCS2 and VCS later.
-		 */
-		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
-		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
-		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
-		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
-		ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
-		ring->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
-		ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
-		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
 	} else {
 		ring->mmio_base = BSD_RING_BASE;
 		ring->flush = bsd_ring_flush;
@@ -2274,30 +2316,47 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 		ring->irq_get = gen8_ring_get_irq;
 		ring->irq_put = gen8_ring_put_irq;
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.signal = gen6_signal;
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		}
 	} else {
 		ring->irq_enable_mask = GT_BLT_USER_INTERRUPT;
 		ring->irq_get = gen6_ring_get_irq;
 		ring->irq_put = gen6_ring_put_irq;
 		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.signal = gen6_signal;
+			ring->semaphore.sync_to = gen6_ring_sync;
+			/*
+			 * The current semaphore is only applied on pre-gen8
+			 * platform.  And there is no VCS2 ring on the pre-gen8
+			 * platform. So the semaphore between BCS and VCS2 is
+			 * initialized as INVALID.  Gen8 will initialize the
+			 * sema between BCS and VCS2 later.
+			 */
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
+			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
+			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		}
 	}
-	ring->semaphore.sync_to = gen6_ring_sync;
-	ring->semaphore.signal = gen6_signal;
-	/*
-	 * The current semaphore is only applied on pre-gen8 platform. And
-	 * there is no VCS2 ring on the pre-gen8 platform. So the semaphore
-	 * between BCS and VCS2 is initialized as INVALID.
-	 * Gen8 will initialize the sema between BCS and VCS2 later.
-	 */
-	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
-	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
-	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
-	ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
-	ring->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
-	ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
-	ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
-	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
 	ring->init = init_ring_common;
 
 	return intel_init_ring_buffer(dev, ring);
@@ -2324,24 +2383,40 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 		ring->irq_get = gen8_ring_get_irq;
 		ring->irq_put = gen8_ring_put_irq;
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.signal = gen6_signal;
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		}
 	} else {
 		ring->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
 		ring->irq_get = hsw_vebox_get_irq;
 		ring->irq_put = hsw_vebox_put_irq;
 		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.signal = gen6_signal;
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		}
 	}
-	ring->semaphore.sync_to = gen6_ring_sync;
-	ring->semaphore.signal = gen6_signal;
-	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
-	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
-	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
-	ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
-	ring->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
-	ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
-	ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
-	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
 	ring->init = init_ring_common;
 
 	return intel_init_ring_buffer(dev, ring);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 03/10] drm/i915/bdw: implement semaphore signal
  2014-06-30 16:53 [PATCH 01/10] drm/i915: Make semaphore updates more precise Rodrigo Vivi
  2014-06-30 16:53 ` [PATCH 02/10] drm/i915: gen specific ring init Rodrigo Vivi
@ 2014-06-30 16:53 ` Rodrigo Vivi
  2014-06-30 16:53 ` [PATCH 04/10] drm/i915/bdw: implement semaphore wait Rodrigo Vivi
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Rodrigo Vivi @ 2014-06-30 16:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Rodrigo Vivi

From: Ben Widawsky <ben@bwidawsk.net>

Semaphore signalling works similarly to previous GENs with the exception
that the per ring mailboxes no longer exist. Instead you must define
your own space, somewhere in the GTT.

The comments in the code define the layout I've opted for, which should
be fairly future proof. Ie. I tried to define offsets in abstract terms
(NUM_RINGS, seqno size, etc).

NOTE: If one wanted to move this to the HWSP they could. I've decided
one 4k object would be easier to deal with, and provide potential wins
with cache locality, but that's all speculative.

v2: Update the macro to not need the other ring's ring->id (Chris)
Update the comment to use the correct formula (Chris)

v3: Move the macros the ringbuffer.h to prevent churn in next patch
(Ville)

v4: Fixed compilation rebase conflict
commit 1ec9e26ddab06459e89a890431b2de064c5d1056
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Feb 14 14:01:11 2014 +0100

    drm/i915: Consolidate binding parameters into flags

v5: VCS2 rebase
Replace hweight_long with hweight32

v6 (Rodrigo): * Add missed VC2 gen8 ring signal init
   	      * fixing conflicst on rebase
    	      * minor fixes on address table
	      * remove WARN_ON

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |   1 +
 drivers/gpu/drm/i915/i915_reg.h         |   5 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 185 +++++++++++++++++++-------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  78 ++++++++++++--
 4 files changed, 189 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8cea596..61e43fc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1374,6 +1374,7 @@ struct drm_i915_private {
 
 	struct pci_dev *bridge_dev;
 	struct intel_engine_cs ring[I915_NUM_RINGS];
+	struct drm_i915_gem_object *semaphore_obj;
 	uint32_t last_seqno, next_seqno;
 
 	drm_dma_handle_t *status_page_dmah;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 3488567..2130d07 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -240,7 +240,7 @@
 #define   MI_DISPLAY_FLIP_IVB_SPRITE_B (3 << 19)
 #define   MI_DISPLAY_FLIP_IVB_PLANE_C  (4 << 19)
 #define   MI_DISPLAY_FLIP_IVB_SPRITE_C (5 << 19)
-#define MI_SEMAPHORE_MBOX	MI_INSTR(0x16, 1) /* gen6+ */
+#define MI_SEMAPHORE_MBOX	MI_INSTR(0x16, 1) /* gen6, gen7 */
 #define   MI_SEMAPHORE_GLOBAL_GTT    (1<<22)
 #define   MI_SEMAPHORE_UPDATE	    (1<<21)
 #define   MI_SEMAPHORE_COMPARE	    (1<<20)
@@ -266,6 +266,8 @@
 #define   MI_RESTORE_EXT_STATE_EN	(1<<2)
 #define   MI_FORCE_RESTORE		(1<<1)
 #define   MI_RESTORE_INHIBIT		(1<<0)
+#define MI_SEMAPHORE_SIGNAL	MI_INSTR(0x1b, 0) /* GEN8+ */
+#define   MI_SEMAPHORE_TARGET(engine)	((engine)<<15)
 #define MI_STORE_DWORD_IMM	MI_INSTR(0x20, 1)
 #define   MI_MEM_VIRTUAL	(1 << 22) /* 965+ only */
 #define MI_STORE_DWORD_INDEX	MI_INSTR(0x21, 1)
@@ -360,6 +362,7 @@
 #define   PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE		(1<<10) /* GM45+ only */
 #define   PIPE_CONTROL_INDIRECT_STATE_DISABLE		(1<<9)
 #define   PIPE_CONTROL_NOTIFY				(1<<8)
+#define   PIPE_CONTROL_FLUSH_ENABLE			(1<<7) /* gen7+ */
 #define   PIPE_CONTROL_VF_CACHE_INVALIDATE		(1<<4)
 #define   PIPE_CONTROL_CONST_CACHE_INVALIDATE		(1<<3)
 #define   PIPE_CONTROL_STATE_CACHE_INVALIDATE		(1<<2)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e8be49b..a215ab4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -660,6 +660,13 @@ static int init_render_ring(struct intel_engine_cs *ring)
 static void render_ring_cleanup(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (dev_priv->semaphore_obj) {
+		i915_gem_object_ggtt_unpin(dev_priv->semaphore_obj);
+		drm_gem_object_unreference(&dev_priv->semaphore_obj->base);
+		dev_priv->semaphore_obj = NULL;
+	}
 
 	if (ring->scratch.obj == NULL)
 		return;
@@ -673,6 +680,80 @@ static void render_ring_cleanup(struct intel_engine_cs *ring)
 	ring->scratch.obj = NULL;
 }
 
+static int gen8_rcs_signal(struct intel_engine_cs *signaller,
+			   unsigned int num_dwords)
+{
+#define MBOX_UPDATE_DWORDS 8
+	struct drm_device *dev = signaller->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_engine_cs *waiter;
+	int i, ret, num_rings;
+
+	num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
+	num_dwords += (num_rings-1) * MBOX_UPDATE_DWORDS;
+#undef MBOX_UPDATE_DWORDS
+
+	ret = intel_ring_begin(signaller, num_dwords);
+	if (ret)
+		return ret;
+
+	for_each_ring(waiter, dev_priv, i) {
+		u64 gtt_offset = signaller->semaphore.signal_ggtt[i];
+		if (gtt_offset == MI_SEMAPHORE_SYNC_INVALID)
+			continue;
+
+		intel_ring_emit(signaller, GFX_OP_PIPE_CONTROL(6));
+		intel_ring_emit(signaller, PIPE_CONTROL_GLOBAL_GTT_IVB |
+					   PIPE_CONTROL_QW_WRITE |
+					   PIPE_CONTROL_FLUSH_ENABLE);
+		intel_ring_emit(signaller, lower_32_bits(gtt_offset));
+		intel_ring_emit(signaller, upper_32_bits(gtt_offset));
+		intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
+		intel_ring_emit(signaller, 0);
+		intel_ring_emit(signaller, MI_SEMAPHORE_SIGNAL |
+					   MI_SEMAPHORE_TARGET(waiter->id));
+		intel_ring_emit(signaller, 0);
+	}
+
+	return 0;
+}
+
+static int gen8_xcs_signal(struct intel_engine_cs *signaller,
+			   unsigned int num_dwords)
+{
+#define MBOX_UPDATE_DWORDS 6
+	struct drm_device *dev = signaller->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_engine_cs *waiter;
+	int i, ret, num_rings;
+
+	num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
+	num_dwords += (num_rings-1) * MBOX_UPDATE_DWORDS;
+#undef MBOX_UPDATE_DWORDS
+
+	ret = intel_ring_begin(signaller, num_dwords);
+	if (ret)
+		return ret;
+
+	for_each_ring(waiter, dev_priv, i) {
+		u64 gtt_offset = signaller->semaphore.signal_ggtt[i];
+		if (gtt_offset == MI_SEMAPHORE_SYNC_INVALID)
+			continue;
+
+		intel_ring_emit(signaller, (MI_FLUSH_DW + 1) |
+					   MI_FLUSH_DW_OP_STOREDW);
+		intel_ring_emit(signaller, lower_32_bits(gtt_offset) |
+					   MI_FLUSH_DW_USE_GTT);
+		intel_ring_emit(signaller, upper_32_bits(gtt_offset));
+		intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
+		intel_ring_emit(signaller, MI_SEMAPHORE_SIGNAL |
+					   MI_SEMAPHORE_TARGET(waiter->id));
+		intel_ring_emit(signaller, 0);
+	}
+
+	return 0;
+}
+
 static int gen6_signal(struct intel_engine_cs *signaller,
 		       unsigned int num_dwords)
 {
@@ -1942,12 +2023,30 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *ring = &dev_priv->ring[RCS];
+	struct drm_i915_gem_object *obj;
+	int ret;
 
 	ring->name = "render ring";
 	ring->id = RCS;
 	ring->mmio_base = RENDER_RING_BASE;
 
 	if (INTEL_INFO(dev)->gen >= 8) {
+		if (i915_semaphore_is_enabled(dev)) {
+			obj = i915_gem_alloc_object(dev, 4096);
+			if (obj == NULL) {
+				DRM_ERROR("Failed to allocate semaphore bo. Disabling semaphores\n");
+				i915.semaphores = 0;
+			} else {
+				i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+				ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_NONBLOCK);
+				if (ret != 0) {
+					drm_gem_object_unreference(&obj->base);
+					DRM_ERROR("Failed to pin semaphore bo. Disabling semaphores\n");
+					i915.semaphores = 0;
+				} else
+					dev_priv->semaphore_obj = obj;
+			}
+		}
 		ring->add_request = gen6_add_request;
 		ring->flush = gen8_render_ring_flush;
 		ring->irq_get = gen8_ring_get_irq;
@@ -1956,18 +2055,10 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
+			BUG_ON(!dev_priv->semaphore_obj);
 			ring->semaphore.sync_to = gen6_ring_sync;
-			ring->semaphore.signal = gen6_signal;
-			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+			ring->semaphore.signal = gen8_rcs_signal;
+			GEN8_RING_SEMAPHORE_INIT;
 		}
 	} else if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
@@ -2045,9 +2136,6 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 
 	/* Workaround batchbuffer to combat CS tlb bug. */
 	if (HAS_BROKEN_CS_TLB(dev)) {
-		struct drm_i915_gem_object *obj;
-		int ret;
-
 		obj = i915_gem_alloc_object(dev, I830_BATCH_LIMIT);
 		if (obj == NULL) {
 			DRM_ERROR("Failed to allocate batch bo\n");
@@ -2180,25 +2268,8 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 				gen8_ring_dispatch_execbuffer;
 			if (i915_semaphore_is_enabled(dev)) {
 				ring->semaphore.sync_to = gen6_ring_sync;
-				ring->semaphore.signal = gen6_signal;
-				/*
-				 * The current semaphore is only applied on
-				 * pre-gen8 platform.  And there is no VCS2 ring
-				 * on the pre-gen8 platform. So the semaphore
-				 * between VCS and VCS2 is initialized as
-				 * INVALID.  Gen8 will initialize the sema
-				 * between VCS2 and VCS later.
-				 */
-				ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-				ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-				ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-				ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-				ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-				ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
-				ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
-				ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
-				ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
-				ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+				ring->semaphore.signal = gen8_xcs_signal;
+				GEN8_RING_SEMAPHORE_INIT;
 			}
 		} else {
 			ring->irq_enable_mask = GT_BSD_USER_INTERRUPT;
@@ -2273,24 +2344,10 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
 	ring->dispatch_execbuffer =
 			gen8_ring_dispatch_execbuffer;
 	ring->semaphore.sync_to = gen6_ring_sync;
-	ring->semaphore.signal = gen6_signal;
-	/*
-	 * The current semaphore is only applied on the pre-gen8. And there
-	 * is no bsd2 ring on the pre-gen8. So now the semaphore_register
-	 * between VCS2 and other ring is initialized as invalid.
-	 * Gen8 will initialize the sema between VCS2 and other ring later.
-	 */
-	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
-	ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
-	ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
-	ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
-	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
-
+	if (i915_semaphore_is_enabled(dev)) {
+		ring->semaphore.signal = gen8_xcs_signal;
+		GEN8_RING_SEMAPHORE_INIT;
+	}
 	ring->init = init_ring_common;
 
 	return intel_init_ring_buffer(dev, ring);
@@ -2318,17 +2375,8 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
 		if (i915_semaphore_is_enabled(dev)) {
 			ring->semaphore.sync_to = gen6_ring_sync;
-			ring->semaphore.signal = gen6_signal;
-			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+			ring->semaphore.signal = gen8_xcs_signal;
+			GEN8_RING_SEMAPHORE_INIT;
 		}
 	} else {
 		ring->irq_enable_mask = GT_BLT_USER_INTERRUPT;
@@ -2385,17 +2433,8 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
 		if (i915_semaphore_is_enabled(dev)) {
 			ring->semaphore.sync_to = gen6_ring_sync;
-			ring->semaphore.signal = gen6_signal;
-			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+			ring->semaphore.signal = gen8_xcs_signal;
+			GEN8_RING_SEMAPHORE_INIT;
 		}
 	} else {
 		ring->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index e72017b..1132f2e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -40,6 +40,32 @@ struct  intel_hw_status_page {
 #define I915_READ_MODE(ring) I915_READ(RING_MI_MODE((ring)->mmio_base))
 #define I915_WRITE_MODE(ring, val) I915_WRITE(RING_MI_MODE((ring)->mmio_base), val)
 
+/* seqno size is actually only a uint32, but since we plan to use MI_FLUSH_DW to
+ * do the writes, and that must have qw aligned offsets, simply pretend it's 8b.
+ */
+#define i915_semaphore_seqno_size sizeof(uint64_t)
+#define GEN8_SIGNAL_OFFSET(__ring, to)			     \
+	(i915_gem_obj_ggtt_offset(dev_priv->semaphore_obj) + \
+	((__ring)->id * I915_NUM_RINGS * i915_semaphore_seqno_size) +	\
+	(i915_semaphore_seqno_size * (to)))
+
+#define GEN8_WAIT_OFFSET(__ring, from)			     \
+	(i915_gem_obj_ggtt_offset(dev_priv->semaphore_obj) + \
+	((from) * I915_NUM_RINGS * i915_semaphore_seqno_size) + \
+	(i915_semaphore_seqno_size * (__ring)->id))
+
+#define GEN8_RING_SEMAPHORE_INIT do { \
+	if (!dev_priv->semaphore_obj) { \
+		break; \
+	} \
+	ring->semaphore.signal_ggtt[RCS] = GEN8_SIGNAL_OFFSET(ring, RCS); \
+	ring->semaphore.signal_ggtt[VCS] = GEN8_SIGNAL_OFFSET(ring, VCS); \
+	ring->semaphore.signal_ggtt[BCS] = GEN8_SIGNAL_OFFSET(ring, BCS); \
+	ring->semaphore.signal_ggtt[VECS] = GEN8_SIGNAL_OFFSET(ring, VECS); \
+	ring->semaphore.signal_ggtt[VCS2] = GEN8_SIGNAL_OFFSET(ring, VCS2); \
+	ring->semaphore.signal_ggtt[ring->id] = MI_SEMAPHORE_SYNC_INVALID; \
+	} while(0)
+
 enum intel_ring_hangcheck_action {
 	HANGCHECK_IDLE = 0,
 	HANGCHECK_WAIT,
@@ -127,15 +153,55 @@ struct  intel_engine_cs {
 #define I915_DISPATCH_PINNED 0x2
 	void		(*cleanup)(struct intel_engine_cs *ring);
 
+	/* GEN8 signal/wait table - never trust comments!
+	 *	  signal to	signal to    signal to   signal to      signal to
+	 *	    RCS		   VCS          BCS        VECS		 VCS2
+	 *      --------------------------------------------------------------------
+	 *  RCS | NOP (0x00) | VCS (0x08) | BCS (0x10) | VECS (0x18) | VCS2 (0x20) |
+	 *	|-------------------------------------------------------------------
+	 *  VCS | RCS (0x28) | NOP (0x30) | BCS (0x38) | VECS (0x40) | VCS2 (0x48) |
+	 *	|-------------------------------------------------------------------
+	 *  BCS | RCS (0x50) | VCS (0x58) | NOP (0x60) | VECS (0x68) | VCS2 (0x70) |
+	 *	|-------------------------------------------------------------------
+	 * VECS | RCS (0x78) | VCS (0x80) | BCS (0x88) |  NOP (0x90) | VCS2 (0x98) |
+	 *	|-------------------------------------------------------------------
+	 * VCS2 | RCS (0xa0) | VCS (0xa8) | BCS (0xb0) | VECS (0xb8) | NOP  (0xc0) |
+	 *	|-------------------------------------------------------------------
+	 *
+	 * Generalization:
+	 *  f(x, y) := (x->id * NUM_RINGS * seqno_size) + (seqno_size * y->id)
+	 *  ie. transpose of g(x, y)
+	 *
+	 *	 sync from	sync from    sync from    sync from	sync from
+	 *	    RCS		   VCS          BCS        VECS		 VCS2
+	 *      --------------------------------------------------------------------
+	 *  RCS | NOP (0x00) | VCS (0x28) | BCS (0x50) | VECS (0x78) | VCS2 (0xa0) |
+	 *	|-------------------------------------------------------------------
+	 *  VCS | RCS (0x08) | NOP (0x30) | BCS (0x58) | VECS (0x80) | VCS2 (0xa8) |
+	 *	|-------------------------------------------------------------------
+	 *  BCS | RCS (0x10) | VCS (0x38) | NOP (0x60) | VECS (0x88) | VCS2 (0xb0) |
+	 *	|-------------------------------------------------------------------
+	 * VECS | RCS (0x18) | VCS (0x40) | BCS (0x68) |  NOP (0x90) | VCS2 (0xb8) |
+	 *	|-------------------------------------------------------------------
+	 * VCS2 | RCS (0x20) | VCS (0x48) | BCS (0x70) | VECS (0x98) |  NOP (0xc0) |
+	 *	|-------------------------------------------------------------------
+	 *
+	 * Generalization:
+	 *  g(x, y) := (y->id * NUM_RINGS * seqno_size) + (seqno_size * x->id)
+	 *  ie. transpose of f(x, y)
+	 */
 	struct {
 		u32	sync_seqno[I915_NUM_RINGS-1];
 
-		struct {
-			/* our mbox written by others */
-			u32		wait[I915_NUM_RINGS];
-			/* mboxes this ring signals to */
-			u32		signal[I915_NUM_RINGS];
-		} mbox;
+		union {
+			struct {
+				/* our mbox written by others */
+				u32		wait[I915_NUM_RINGS];
+				/* mboxes this ring signals to */
+				u32		signal[I915_NUM_RINGS];
+			} mbox;
+			u64		signal_ggtt[I915_NUM_RINGS];
+		};
 
 		/* AKA wait() */
 		int	(*sync_to)(struct intel_engine_cs *ring,
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 04/10] drm/i915/bdw: implement semaphore wait
  2014-06-30 16:53 [PATCH 01/10] drm/i915: Make semaphore updates more precise Rodrigo Vivi
  2014-06-30 16:53 ` [PATCH 02/10] drm/i915: gen specific ring init Rodrigo Vivi
  2014-06-30 16:53 ` [PATCH 03/10] drm/i915/bdw: implement semaphore signal Rodrigo Vivi
@ 2014-06-30 16:53 ` Rodrigo Vivi
  2014-06-30 16:53 ` [PATCH 05/10] drm/i915: Implement MI decode for gen8 Rodrigo Vivi
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Rodrigo Vivi @ 2014-06-30 16:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Rodrigo Vivi

From: Ben Widawsky <ben@bwidawsk.net>

Semaphore waits use a new instruction, MI_SEMAPHORE_WAIT. The seqno to
wait on is all well defined by the table in the previous patch. There is
nothing else different from previous GEN's semaphore synchronization
code.

v2: Update macros to not require the other ring's ring->id (Chris)

v3: Add missing VCS2 gen8_ring_wait init besides
    s/ring_buffer/engine_cs (Rodrigo)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h         |  3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 35 ++++++++++++++++++++++++++++-----
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 2130d07..9de11de 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -268,6 +268,9 @@
 #define   MI_RESTORE_INHIBIT		(1<<0)
 #define MI_SEMAPHORE_SIGNAL	MI_INSTR(0x1b, 0) /* GEN8+ */
 #define   MI_SEMAPHORE_TARGET(engine)	((engine)<<15)
+#define MI_SEMAPHORE_WAIT	MI_INSTR(0x1c, 2) /* GEN8+ */
+#define   MI_SEMAPHORE_POLL		(1<<15)
+#define   MI_SEMAPHORE_SAD_GTE_SDD	(1<<12)
 #define MI_STORE_DWORD_IMM	MI_INSTR(0x20, 1)
 #define   MI_MEM_VIRTUAL	(1 << 22) /* 965+ only */
 #define MI_STORE_DWORD_INDEX	MI_INSTR(0x21, 1)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index a215ab4..2e0413c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -832,6 +832,31 @@ static inline bool i915_gem_has_seqno_wrapped(struct drm_device *dev,
  * @signaller - ring which has, or will signal
  * @seqno - seqno which the waiter will block on
  */
+
+static int
+gen8_ring_sync(struct intel_engine_cs *waiter,
+	       struct intel_engine_cs *signaller,
+	       u32 seqno)
+{
+	struct drm_i915_private *dev_priv = waiter->dev->dev_private;
+	int ret;
+
+	ret = intel_ring_begin(waiter, 4);
+	if (ret)
+		return ret;
+
+	intel_ring_emit(waiter, MI_SEMAPHORE_WAIT |
+				MI_SEMAPHORE_GLOBAL_GTT |
+				MI_SEMAPHORE_SAD_GTE_SDD);
+	intel_ring_emit(waiter, seqno);
+	intel_ring_emit(waiter,
+			lower_32_bits(GEN8_WAIT_OFFSET(waiter, signaller->id)));
+	intel_ring_emit(waiter,
+			upper_32_bits(GEN8_WAIT_OFFSET(waiter, signaller->id)));
+	intel_ring_advance(waiter);
+	return 0;
+}
+
 static int
 gen6_ring_sync(struct intel_engine_cs *waiter,
 	       struct intel_engine_cs *signaller,
@@ -2056,7 +2081,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
 			BUG_ON(!dev_priv->semaphore_obj);
-			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.sync_to = gen8_ring_sync;
 			ring->semaphore.signal = gen8_rcs_signal;
 			GEN8_RING_SEMAPHORE_INIT;
 		}
@@ -2267,7 +2292,7 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 			ring->dispatch_execbuffer =
 				gen8_ring_dispatch_execbuffer;
 			if (i915_semaphore_is_enabled(dev)) {
-				ring->semaphore.sync_to = gen6_ring_sync;
+				ring->semaphore.sync_to = gen8_ring_sync;
 				ring->semaphore.signal = gen8_xcs_signal;
 				GEN8_RING_SEMAPHORE_INIT;
 			}
@@ -2343,8 +2368,8 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
 	ring->irq_put = gen8_ring_put_irq;
 	ring->dispatch_execbuffer =
 			gen8_ring_dispatch_execbuffer;
-	ring->semaphore.sync_to = gen6_ring_sync;
 	if (i915_semaphore_is_enabled(dev)) {
+		ring->semaphore.sync_to = gen8_ring_sync;
 		ring->semaphore.signal = gen8_xcs_signal;
 		GEN8_RING_SEMAPHORE_INIT;
 	}
@@ -2374,7 +2399,7 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 		ring->irq_put = gen8_ring_put_irq;
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
 		if (i915_semaphore_is_enabled(dev)) {
-			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.sync_to = gen8_ring_sync;
 			ring->semaphore.signal = gen8_xcs_signal;
 			GEN8_RING_SEMAPHORE_INIT;
 		}
@@ -2432,7 +2457,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 		ring->irq_put = gen8_ring_put_irq;
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
 		if (i915_semaphore_is_enabled(dev)) {
-			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.sync_to = gen8_ring_sync;
 			ring->semaphore.signal = gen8_xcs_signal;
 			GEN8_RING_SEMAPHORE_INIT;
 		}
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 05/10] drm/i915: Implement MI decode for gen8
  2014-06-30 16:53 [PATCH 01/10] drm/i915: Make semaphore updates more precise Rodrigo Vivi
                   ` (2 preceding siblings ...)
  2014-06-30 16:53 ` [PATCH 04/10] drm/i915/bdw: implement semaphore wait Rodrigo Vivi
@ 2014-06-30 16:53 ` Rodrigo Vivi
  2014-07-01  0:13   ` Ben Widawsky
  2014-07-07 21:17   ` Daniel Vetter
  2014-06-30 16:53 ` [PATCH 06/10] drm/i915: Extract semaphore error collection Rodrigo Vivi
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 17+ messages in thread
From: Rodrigo Vivi @ 2014-06-30 16:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Rodrigo Vivi

Ipehr just carries Dword 0 and on Gen 8, offsets are located
on Dword 2 and 3 of MI_SEMAPHORE_WAIT.

This implementation was based on Ben's work and on Ville's suggestion for Ben

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c | 42 ++++++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 0217a41..66e2481 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2784,12 +2784,7 @@ static bool
 ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
 {
 	if (INTEL_INFO(dev)->gen >= 8) {
-		/*
-		 * FIXME: gen8 semaphore support - currently we don't emit
-		 * semaphores on bdw anyway, but this needs to be addressed when
-		 * we merge that code.
-		 */
-		return false;
+		return (ipehr >> 23) == 0x1c;
 	} else {
 		ipehr &= ~MI_SEMAPHORE_SYNC_MASK;
 		return ipehr == (MI_SEMAPHORE_MBOX | MI_SEMAPHORE_COMPARE |
@@ -2798,19 +2793,20 @@ ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
 }
 
 static struct intel_engine_cs *
-semaphore_wait_to_signaller_ring(struct intel_engine_cs *ring, u32 ipehr)
+semaphore_wait_to_signaller_ring(struct intel_engine_cs *ring, u32 ipehr, u64 offset)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct intel_engine_cs *signaller;
 	int i;
 
 	if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
-		/*
-		 * FIXME: gen8 semaphore support - currently we don't emit
-		 * semaphores on bdw anyway, but this needs to be addressed when
-		 * we merge that code.
-		 */
-		return NULL;
+		for_each_ring(signaller, dev_priv, i) {
+			if (ring == signaller)
+				continue;
+
+			if (offset == signaller->semaphore.signal_ggtt[ring->id])
+				return signaller;
+		}
 	} else {
 		u32 sync_bits = ipehr & MI_SEMAPHORE_SYNC_MASK;
 
@@ -2823,8 +2819,8 @@ semaphore_wait_to_signaller_ring(struct intel_engine_cs *ring, u32 ipehr)
 		}
 	}
 
-	DRM_ERROR("No signaller ring found for ring %i, ipehr 0x%08x\n",
-		  ring->id, ipehr);
+	DRM_ERROR("No signaller ring found for ring %i, ipehr 0x%08x, offset 0x%0%016llx\n",
+		  ring->id, ipehr, offset);
 
 	return NULL;
 }
@@ -2834,7 +2830,8 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	u32 cmd, ipehr, head;
-	int i;
+	u64 offset = 0;
+	int i, backwards;
 
 	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
 	if (!ipehr_is_semaphore_wait(ring->dev, ipehr))
@@ -2843,13 +2840,15 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
 	/*
 	 * HEAD is likely pointing to the dword after the actual command,
 	 * so scan backwards until we find the MBOX. But limit it to just 3
-	 * dwords. Note that we don't care about ACTHD here since that might
+	 * or 4 dwords depending on the semaphore wait command size.
+	 * Note that we don't care about ACTHD here since that might
 	 * point at at batch, and semaphores are always emitted into the
 	 * ringbuffer itself.
 	 */
 	head = I915_READ_HEAD(ring) & HEAD_ADDR;
+	backwards = (INTEL_INFO(ring->dev)->gen >= 8) ? 5 : 4;
 
-	for (i = 4; i; --i) {
+	for (i = backwards; i; --i) {
 		/*
 		 * Be paranoid and presume the hw has gone off into the wild -
 		 * our ring is smaller than what the hardware (and hence
@@ -2869,7 +2868,12 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
 		return NULL;
 
 	*seqno = ioread32(ring->buffer->virtual_start + head + 4) + 1;
-	return semaphore_wait_to_signaller_ring(ring, ipehr);
+	if (INTEL_INFO(ring->dev)->gen >= 8) {
+		offset = ioread32(ring->buffer->virtual_start + head + 12);
+		offset <<= 32;
+		offset = ioread32(ring->buffer->virtual_start + head + 8);
+	}
+	return semaphore_wait_to_signaller_ring(ring, ipehr, offset);
 }
 
 static int semaphore_passed(struct intel_engine_cs *ring)
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 06/10] drm/i915: Extract semaphore error collection
  2014-06-30 16:53 [PATCH 01/10] drm/i915: Make semaphore updates more precise Rodrigo Vivi
                   ` (3 preceding siblings ...)
  2014-06-30 16:53 ` [PATCH 05/10] drm/i915: Implement MI decode for gen8 Rodrigo Vivi
@ 2014-06-30 16:53 ` Rodrigo Vivi
  2014-06-30 16:53 ` [PATCH 07/10] drm/i915/bdw: collect semaphore error state Rodrigo Vivi
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Rodrigo Vivi @ 2014-06-30 16:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Rodrigo Vivi, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

v2: s/ring_buffer/engine_cs (Rodrigo)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 66cf417..9d42b6a 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -746,6 +746,23 @@ static void i915_gem_record_fences(struct drm_device *dev,
 	}
 }
 
+
+static void gen6_record_semaphore_state(struct drm_i915_private *dev_priv,
+					struct intel_engine_cs *ring,
+					struct drm_i915_error_ring *ering)
+{
+	ering->semaphore_mboxes[0] = I915_READ(RING_SYNC_0(ring->mmio_base));
+	ering->semaphore_mboxes[1] = I915_READ(RING_SYNC_1(ring->mmio_base));
+	ering->semaphore_seqno[0] = ring->semaphore.sync_seqno[0];
+	ering->semaphore_seqno[1] = ring->semaphore.sync_seqno[1];
+
+	if (HAS_VEBOX(dev_priv->dev)) {
+		ering->semaphore_mboxes[2] =
+			I915_READ(RING_SYNC_2(ring->mmio_base));
+		ering->semaphore_seqno[2] = ring->semaphore.sync_seqno[2];
+	}
+}
+
 static void i915_record_ring_state(struct drm_device *dev,
 				   struct intel_engine_cs *ring,
 				   struct drm_i915_error_ring *ering)
@@ -755,18 +772,7 @@ static void i915_record_ring_state(struct drm_device *dev,
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ering->rc_psmi = I915_READ(ring->mmio_base + 0x50);
 		ering->fault_reg = I915_READ(RING_FAULT_REG(ring));
-		ering->semaphore_mboxes[0]
-			= I915_READ(RING_SYNC_0(ring->mmio_base));
-		ering->semaphore_mboxes[1]
-			= I915_READ(RING_SYNC_1(ring->mmio_base));
-		ering->semaphore_seqno[0] = ring->semaphore.sync_seqno[0];
-		ering->semaphore_seqno[1] = ring->semaphore.sync_seqno[1];
-	}
-
-	if (HAS_VEBOX(dev)) {
-		ering->semaphore_mboxes[2] =
-			I915_READ(RING_SYNC_2(ring->mmio_base));
-		ering->semaphore_seqno[2] = ring->semaphore.sync_seqno[2];
+		gen6_record_semaphore_state(dev_priv, ring, ering);
 	}
 
 	if (INTEL_INFO(dev)->gen >= 4) {
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 07/10] drm/i915/bdw: collect semaphore error state
  2014-06-30 16:53 [PATCH 01/10] drm/i915: Make semaphore updates more precise Rodrigo Vivi
                   ` (4 preceding siblings ...)
  2014-06-30 16:53 ` [PATCH 06/10] drm/i915: Extract semaphore error collection Rodrigo Vivi
@ 2014-06-30 16:53 ` Rodrigo Vivi
  2014-07-14 10:04   ` Damien Lespiau
  2014-06-30 16:53 ` [PATCH 08/10] drm/i915: semaphore debugfs Rodrigo Vivi
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Rodrigo Vivi @ 2014-06-30 16:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Rodrigo Vivi

From: Ben Widawsky <ben@bwidawsk.net>

Since the semaphore information is in an object, just dump it, and let
the user parse it later.

NOTE: The page being used for the semaphores are incoherent with the
CPU. No matter what I do, I cannot figure out a way to read anything but
0s. Note that the semaphore waits are indeed working.

v2: Don't print signal, and wait (they should be the same). Instead,
print sync_seqno (Chris)

v3: Free the semaphore error object (Chris)

v4: Fix semaphore offset calculation during error state collection
(Ville)

v5: VCS2 rebase
Make semaphore object error capture coding style consistent (Ville)
Do the proper math for the signal offset (Ville)

v6: Fix small conflicts on rebase and s/ring_buffer/engine_cs (Rodrigo)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h       |  1 +
 drivers/gpu/drm/i915/i915_gpu_error.c | 51 ++++++++++++++++++++++++++++++++---
 2 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 61e43fc..f50ae5d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -324,6 +324,7 @@ struct drm_i915_error_state {
 	u64 fence[I915_MAX_NUM_FENCES];
 	struct intel_overlay_error_state *overlay;
 	struct intel_display_error_state *display;
+	struct drm_i915_error_object *semaphore_obj;
 
 	struct drm_i915_error_ring {
 		bool valid;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 9d42b6a..45b6191 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -327,6 +327,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 	struct drm_device *dev = error_priv->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_error_state *error = error_priv->error;
+	struct drm_i915_error_object *obj;
 	int i, j, offset, elt;
 	int max_hangcheck_score;
 
@@ -395,8 +396,6 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 				    error->pinned_bo_count[0]);
 
 	for (i = 0; i < ARRAY_SIZE(error->ring); i++) {
-		struct drm_i915_error_object *obj;
-
 		obj = error->ring[i].batchbuffer;
 		if (obj) {
 			err_puts(m, dev_priv->ring[i].name);
@@ -459,6 +458,18 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		}
 	}
 
+	if ((obj = error->semaphore_obj)) {
+		err_printf(m, "Semaphore page = 0x%08x\n", obj->gtt_offset);
+		for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
+			err_printf(m, "[%04x] %08x %08x %08x %08x\n",
+				   elt * 4,
+				   obj->pages[0][elt],
+				   obj->pages[0][elt+1],
+				   obj->pages[0][elt+2],
+				   obj->pages[0][elt+3]);
+		}
+	}
+
 	if (error->overlay)
 		intel_overlay_print_error_state(m, error->overlay);
 
@@ -529,6 +540,7 @@ static void i915_error_state_free(struct kref *error_ref)
 		kfree(error->ring[i].requests);
 	}
 
+	i915_error_object_free(error->semaphore_obj);
 	kfree(error->active_bo);
 	kfree(error->overlay);
 	kfree(error->display);
@@ -747,6 +759,33 @@ static void i915_gem_record_fences(struct drm_device *dev,
 }
 
 
+static void gen8_record_semaphore_state(struct drm_i915_private *dev_priv,
+					struct drm_i915_error_state *error,
+					struct intel_engine_cs *ring,
+					struct drm_i915_error_ring *ering)
+{
+	struct intel_engine_cs *useless;
+	int i;
+
+	if (!i915_semaphore_is_enabled(dev_priv->dev))
+		return;
+
+	if (!error->semaphore_obj)
+		error->semaphore_obj =
+			i915_error_object_create(dev_priv,
+						 dev_priv->semaphore_obj,
+						 &dev_priv->gtt.base);
+
+	for_each_ring(useless, dev_priv, i) {
+		u16 signal_offset =
+			(GEN8_SIGNAL_OFFSET(ring, i) & PAGE_MASK) / 4;
+		u32 *tmp = error->semaphore_obj->pages[0];
+
+		ering->semaphore_mboxes[i] = tmp[signal_offset];
+		ering->semaphore_seqno[i] = ring->semaphore.sync_seqno[i];
+	}
+}
+
 static void gen6_record_semaphore_state(struct drm_i915_private *dev_priv,
 					struct intel_engine_cs *ring,
 					struct drm_i915_error_ring *ering)
@@ -764,6 +803,7 @@ static void gen6_record_semaphore_state(struct drm_i915_private *dev_priv,
 }
 
 static void i915_record_ring_state(struct drm_device *dev,
+				   struct drm_i915_error_state *error,
 				   struct intel_engine_cs *ring,
 				   struct drm_i915_error_ring *ering)
 {
@@ -772,7 +812,10 @@ static void i915_record_ring_state(struct drm_device *dev,
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ering->rc_psmi = I915_READ(ring->mmio_base + 0x50);
 		ering->fault_reg = I915_READ(RING_FAULT_REG(ring));
-		gen6_record_semaphore_state(dev_priv, ring, ering);
+		if (INTEL_INFO(dev)->gen >= 8)
+			gen8_record_semaphore_state(dev_priv, error, ring, ering);
+		else
+			gen6_record_semaphore_state(dev_priv, ring, ering);
 	}
 
 	if (INTEL_INFO(dev)->gen >= 4) {
@@ -901,7 +944,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
 
 		error->ring[i].valid = true;
 
-		i915_record_ring_state(dev, ring, &error->ring[i]);
+		i915_record_ring_state(dev, error, ring, &error->ring[i]);
 
 		request = i915_gem_find_active_request(ring);
 		if (request) {
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 08/10] drm/i915: semaphore debugfs
  2014-06-30 16:53 [PATCH 01/10] drm/i915: Make semaphore updates more precise Rodrigo Vivi
                   ` (5 preceding siblings ...)
  2014-06-30 16:53 ` [PATCH 07/10] drm/i915/bdw: collect semaphore error state Rodrigo Vivi
@ 2014-06-30 16:53 ` Rodrigo Vivi
  2014-06-30 16:53 ` [PATCH 09/10] drm/i915/bdw: poll semaphores Rodrigo Vivi
  2014-06-30 16:53 ` [PATCH 10/10] drm/i915: Enable semaphores on BDW Rodrigo Vivi
  8 siblings, 0 replies; 17+ messages in thread
From: Rodrigo Vivi @ 2014-06-30 16:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Rodrigo Vivi

From: Ben Widawsky <ben@bwidawsk.net>

Simple debugfs file to display the current state of semaphores. This is
useful if you want to see the state without hanging the GPU.

NOTE: This patch is optional to the series.

NOTE2: Like the GPU error state collection, the reads are currently
incoherent.

v2 (Rodrigo): * Iterate only on active rings.
   	      * s/ring_buffer/engine_cs.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 71 +++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 6b7b32b..ec24e14 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2386,6 +2386,76 @@ static int i915_display_info(struct seq_file *m, void *unused)
 	return 0;
 }
 
+static int i915_semaphore_status(struct seq_file *m, void *unused)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_engine_cs *ring;
+	int num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
+	int i, j, ret;
+
+	if (!i915_semaphore_is_enabled(dev)) {
+		seq_puts(m, "Semaphores are disabled\n");
+		return 0;
+	}
+
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		return ret;
+
+	if (IS_BROADWELL(dev)) {
+		struct page *page;
+		uint64_t *seqno;
+
+		page = i915_gem_object_get_page(dev_priv->semaphore_obj, 0);
+
+		seqno = (uint64_t *)kmap_atomic(page);
+		for_each_ring(ring, dev_priv, i) {
+			uint64_t offset;
+
+			seq_printf(m, "%s\n", ring->name);
+
+			seq_puts(m, "  Last signal:");
+			for (j = 0; j < num_rings; j++) {
+				offset = i * I915_NUM_RINGS + j;
+				seq_printf(m, "0x%08llx (0x%02llx) ",
+					   seqno[offset], offset * 8);
+			}
+			seq_putc(m, '\n');
+
+			seq_puts(m, "  Last wait:  ");
+			for (j = 0; j < num_rings; j++) {
+				offset = i + (j * I915_NUM_RINGS);
+				seq_printf(m, "0x%08llx (0x%02llx) ",
+					   seqno[offset], offset * 8);
+			}
+			seq_putc(m, '\n');
+
+		}
+		kunmap_atomic(seqno);
+	} else {
+		seq_puts(m, "  Last signal:");
+		for_each_ring(ring, dev_priv, i)
+			for (j = 0; j < num_rings; j++)
+				seq_printf(m, "0x%08x\n",
+					   I915_READ(ring->semaphore.mbox.signal[j]));
+		seq_putc(m, '\n');
+	}
+
+	seq_puts(m, "\nSync seqno:\n");
+	for_each_ring(ring, dev_priv, i) {
+		for (j = 0; j < num_rings; j++) {
+			seq_printf(m, "  0x%08x ", ring->semaphore.sync_seqno[j]);
+		}
+		seq_putc(m, '\n');
+	}
+	seq_putc(m, '\n');
+
+	mutex_unlock(&dev->struct_mutex);
+	return 0;
+}
+
 struct pipe_crc_info {
 	const char *name;
 	struct drm_device *dev;
@@ -3837,6 +3907,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_pc8_status", i915_pc8_status, 0},
 	{"i915_power_domain_info", i915_power_domain_info, 0},
 	{"i915_display_info", i915_display_info, 0},
+	{"i915_semaphore_status", i915_semaphore_status, 0},
 };
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 09/10] drm/i915/bdw: poll semaphores
  2014-06-30 16:53 [PATCH 01/10] drm/i915: Make semaphore updates more precise Rodrigo Vivi
                   ` (6 preceding siblings ...)
  2014-06-30 16:53 ` [PATCH 08/10] drm/i915: semaphore debugfs Rodrigo Vivi
@ 2014-06-30 16:53 ` Rodrigo Vivi
  2014-06-30 16:53 ` [PATCH 10/10] drm/i915: Enable semaphores on BDW Rodrigo Vivi
  8 siblings, 0 replies; 17+ messages in thread
From: Rodrigo Vivi @ 2014-06-30 16:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Rodrigo Vivi, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

As Ville points out, it's possible/probable we don't actually need this.
Potentially, this validates the letter of the spec, and not the spirit.

Ville:
> I discussed this on irc w/ Ben, and I was suggesting we don't need to
> poll. Polling apparently can be used as a workaround for certain
> hardware issues, but it looks like those issues shouldn't affect us,
> for the momemnt at least. So my suggestion was to try w/o polling
> first (since there could be some power cost to polling) and add the
> poll bit if problems arise.

Rodrigo: Spec suggests this as an W/A for GT3. However semaphores didn't
worked in my BDW GT2 on Signal Mode. So pool mode is definitely needed.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 2e0413c..7a6112e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -847,6 +847,7 @@ gen8_ring_sync(struct intel_engine_cs *waiter,
 
 	intel_ring_emit(waiter, MI_SEMAPHORE_WAIT |
 				MI_SEMAPHORE_GLOBAL_GTT |
+				MI_SEMAPHORE_POLL |
 				MI_SEMAPHORE_SAD_GTE_SDD);
 	intel_ring_emit(waiter, seqno);
 	intel_ring_emit(waiter,
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 10/10] drm/i915: Enable semaphores on BDW
  2014-06-30 16:53 [PATCH 01/10] drm/i915: Make semaphore updates more precise Rodrigo Vivi
                   ` (7 preceding siblings ...)
  2014-06-30 16:53 ` [PATCH 09/10] drm/i915/bdw: poll semaphores Rodrigo Vivi
@ 2014-06-30 16:53 ` Rodrigo Vivi
  2014-07-01  0:14   ` Ben Widawsky
  8 siblings, 1 reply; 17+ messages in thread
From: Rodrigo Vivi @ 2014-06-30 16:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6eb45ac..1f84f88 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -477,10 +477,6 @@ bool i915_semaphore_is_enabled(struct drm_device *dev)
 	if (i915.semaphores >= 0)
 		return i915.semaphores;
 
-	/* Until we get further testing... */
-	if (IS_GEN8(dev))
-		return false;
-
 #ifdef CONFIG_INTEL_IOMMU
 	/* Enable semaphores on SNB when IO remapping is off */
 	if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped)
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 05/10] drm/i915: Implement MI decode for gen8
  2014-06-30 16:53 ` [PATCH 05/10] drm/i915: Implement MI decode for gen8 Rodrigo Vivi
@ 2014-07-01  0:13   ` Ben Widawsky
  2014-07-07 21:17   ` Daniel Vetter
  1 sibling, 0 replies; 17+ messages in thread
From: Ben Widawsky @ 2014-07-01  0:13 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Mon, Jun 30, 2014 at 09:53:39AM -0700, Rodrigo Vivi wrote:
> Ipehr just carries Dword 0 and on Gen 8, offsets are located
> on Dword 2 and 3 of MI_SEMAPHORE_WAIT.
> 
> This implementation was based on Ben's work and on Ville's suggestion for Ben
> 
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 10/10] drm/i915: Enable semaphores on BDW
  2014-06-30 16:53 ` [PATCH 10/10] drm/i915: Enable semaphores on BDW Rodrigo Vivi
@ 2014-07-01  0:14   ` Ben Widawsky
  2014-07-07 20:26     ` Daniel Vetter
  0 siblings, 1 reply; 17+ messages in thread
From: Ben Widawsky @ 2014-07-01  0:14 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Mon, Jun 30, 2014 at 09:53:44AM -0700, Rodrigo Vivi wrote:
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] drm/i915: Implement MI decode for gen8
  2014-07-07 21:17   ` Daniel Vetter
@ 2014-07-07 14:29     ` Rodrigo Vivi
  0 siblings, 0 replies; 17+ messages in thread
From: Rodrigo Vivi @ 2014-07-07 14:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky, Rodrigo Vivi

Ipehr just carries Dword 0 and on Gen 8, offsets are located
on Dword 2 and 3 of MI_SEMAPHORE_WAIT.

This implementation was based on Ben's work and on Ville's suggestion for Ben

v2: fix typo. Removing spurious 0% from debug msg "0x%0%0". (Daniel)

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c | 42 ++++++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 0217a41..05264ad 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2784,12 +2784,7 @@ static bool
 ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
 {
 	if (INTEL_INFO(dev)->gen >= 8) {
-		/*
-		 * FIXME: gen8 semaphore support - currently we don't emit
-		 * semaphores on bdw anyway, but this needs to be addressed when
-		 * we merge that code.
-		 */
-		return false;
+		return (ipehr >> 23) == 0x1c;
 	} else {
 		ipehr &= ~MI_SEMAPHORE_SYNC_MASK;
 		return ipehr == (MI_SEMAPHORE_MBOX | MI_SEMAPHORE_COMPARE |
@@ -2798,19 +2793,20 @@ ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
 }
 
 static struct intel_engine_cs *
-semaphore_wait_to_signaller_ring(struct intel_engine_cs *ring, u32 ipehr)
+semaphore_wait_to_signaller_ring(struct intel_engine_cs *ring, u32 ipehr, u64 offset)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct intel_engine_cs *signaller;
 	int i;
 
 	if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
-		/*
-		 * FIXME: gen8 semaphore support - currently we don't emit
-		 * semaphores on bdw anyway, but this needs to be addressed when
-		 * we merge that code.
-		 */
-		return NULL;
+		for_each_ring(signaller, dev_priv, i) {
+			if (ring == signaller)
+				continue;
+
+			if (offset == signaller->semaphore.signal_ggtt[ring->id])
+				return signaller;
+		}
 	} else {
 		u32 sync_bits = ipehr & MI_SEMAPHORE_SYNC_MASK;
 
@@ -2823,8 +2819,8 @@ semaphore_wait_to_signaller_ring(struct intel_engine_cs *ring, u32 ipehr)
 		}
 	}
 
-	DRM_ERROR("No signaller ring found for ring %i, ipehr 0x%08x\n",
-		  ring->id, ipehr);
+	DRM_ERROR("No signaller ring found for ring %i, ipehr 0x%08x, offset 0x%016llx\n",
+		  ring->id, ipehr, offset);
 
 	return NULL;
 }
@@ -2834,7 +2830,8 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	u32 cmd, ipehr, head;
-	int i;
+	u64 offset = 0;
+	int i, backwards;
 
 	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
 	if (!ipehr_is_semaphore_wait(ring->dev, ipehr))
@@ -2843,13 +2840,15 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
 	/*
 	 * HEAD is likely pointing to the dword after the actual command,
 	 * so scan backwards until we find the MBOX. But limit it to just 3
-	 * dwords. Note that we don't care about ACTHD here since that might
+	 * or 4 dwords depending on the semaphore wait command size.
+	 * Note that we don't care about ACTHD here since that might
 	 * point at at batch, and semaphores are always emitted into the
 	 * ringbuffer itself.
 	 */
 	head = I915_READ_HEAD(ring) & HEAD_ADDR;
+	backwards = (INTEL_INFO(ring->dev)->gen >= 8) ? 5 : 4;
 
-	for (i = 4; i; --i) {
+	for (i = backwards; i; --i) {
 		/*
 		 * Be paranoid and presume the hw has gone off into the wild -
 		 * our ring is smaller than what the hardware (and hence
@@ -2869,7 +2868,12 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
 		return NULL;
 
 	*seqno = ioread32(ring->buffer->virtual_start + head + 4) + 1;
-	return semaphore_wait_to_signaller_ring(ring, ipehr);
+	if (INTEL_INFO(ring->dev)->gen >= 8) {
+		offset = ioread32(ring->buffer->virtual_start + head + 12);
+		offset <<= 32;
+		offset = ioread32(ring->buffer->virtual_start + head + 8);
+	}
+	return semaphore_wait_to_signaller_ring(ring, ipehr, offset);
 }
 
 static int semaphore_passed(struct intel_engine_cs *ring)
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 10/10] drm/i915: Enable semaphores on BDW
  2014-07-01  0:14   ` Ben Widawsky
@ 2014-07-07 20:26     ` Daniel Vetter
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel Vetter @ 2014-07-07 20:26 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Rodrigo Vivi

On Mon, Jun 30, 2014 at 05:14:50PM -0700, Ben Widawsky wrote:
> On Mon, Jun 30, 2014 at 09:53:44AM -0700, Rodrigo Vivi wrote:
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

Pulled in entire series. Yay!
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 05/10] drm/i915: Implement MI decode for gen8
  2014-06-30 16:53 ` [PATCH 05/10] drm/i915: Implement MI decode for gen8 Rodrigo Vivi
  2014-07-01  0:13   ` Ben Widawsky
@ 2014-07-07 21:17   ` Daniel Vetter
  2014-07-07 14:29     ` [PATCH] " Rodrigo Vivi
  1 sibling, 1 reply; 17+ messages in thread
From: Daniel Vetter @ 2014-07-07 21:17 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, Ben Widawsky

On Mon, Jun 30, 2014 at 09:53:39AM -0700, Rodrigo Vivi wrote:
> Ipehr just carries Dword 0 and on Gen 8, offsets are located
> on Dword 2 and 3 of MI_SEMAPHORE_WAIT.
> 
> This implementation was based on Ben's work and on Ville's suggestion for Ben
> 
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 42 ++++++++++++++++++++++-------------------
>  1 file changed, 23 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 0217a41..66e2481 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2784,12 +2784,7 @@ static bool
>  ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
>  {
>  	if (INTEL_INFO(dev)->gen >= 8) {
> -		/*
> -		 * FIXME: gen8 semaphore support - currently we don't emit
> -		 * semaphores on bdw anyway, but this needs to be addressed when
> -		 * we merge that code.
> -		 */
> -		return false;
> +		return (ipehr >> 23) == 0x1c;
>  	} else {
>  		ipehr &= ~MI_SEMAPHORE_SYNC_MASK;
>  		return ipehr == (MI_SEMAPHORE_MBOX | MI_SEMAPHORE_COMPARE |
> @@ -2798,19 +2793,20 @@ ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
>  }
>  
>  static struct intel_engine_cs *
> -semaphore_wait_to_signaller_ring(struct intel_engine_cs *ring, u32 ipehr)
> +semaphore_wait_to_signaller_ring(struct intel_engine_cs *ring, u32 ipehr, u64 offset)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	struct intel_engine_cs *signaller;
>  	int i;
>  
>  	if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
> -		/*
> -		 * FIXME: gen8 semaphore support - currently we don't emit
> -		 * semaphores on bdw anyway, but this needs to be addressed when
> -		 * we merge that code.
> -		 */
> -		return NULL;
> +		for_each_ring(signaller, dev_priv, i) {
> +			if (ring == signaller)
> +				continue;
> +
> +			if (offset == signaller->semaphore.signal_ggtt[ring->id])
> +				return signaller;
> +		}
>  	} else {
>  		u32 sync_bits = ipehr & MI_SEMAPHORE_SYNC_MASK;
>  
> @@ -2823,8 +2819,8 @@ semaphore_wait_to_signaller_ring(struct intel_engine_cs *ring, u32 ipehr)
>  		}
>  	}
>  
> -	DRM_ERROR("No signaller ring found for ring %i, ipehr 0x%08x\n",
> -		  ring->id, ipehr);
> +	DRM_ERROR("No signaller ring found for ring %i, ipehr 0x%08x, offset 0x%0%016llx\n",

The format string has a spurious 0% here. I've dropped it.
-Daniel

> +		  ring->id, ipehr, offset);
>  
>  	return NULL;
>  }
> @@ -2834,7 +2830,8 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	u32 cmd, ipehr, head;
> -	int i;
> +	u64 offset = 0;
> +	int i, backwards;
>  
>  	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
>  	if (!ipehr_is_semaphore_wait(ring->dev, ipehr))
> @@ -2843,13 +2840,15 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
>  	/*
>  	 * HEAD is likely pointing to the dword after the actual command,
>  	 * so scan backwards until we find the MBOX. But limit it to just 3
> -	 * dwords. Note that we don't care about ACTHD here since that might
> +	 * or 4 dwords depending on the semaphore wait command size.
> +	 * Note that we don't care about ACTHD here since that might
>  	 * point at at batch, and semaphores are always emitted into the
>  	 * ringbuffer itself.
>  	 */
>  	head = I915_READ_HEAD(ring) & HEAD_ADDR;
> +	backwards = (INTEL_INFO(ring->dev)->gen >= 8) ? 5 : 4;
>  
> -	for (i = 4; i; --i) {
> +	for (i = backwards; i; --i) {
>  		/*
>  		 * Be paranoid and presume the hw has gone off into the wild -
>  		 * our ring is smaller than what the hardware (and hence
> @@ -2869,7 +2868,12 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
>  		return NULL;
>  
>  	*seqno = ioread32(ring->buffer->virtual_start + head + 4) + 1;
> -	return semaphore_wait_to_signaller_ring(ring, ipehr);
> +	if (INTEL_INFO(ring->dev)->gen >= 8) {
> +		offset = ioread32(ring->buffer->virtual_start + head + 12);
> +		offset <<= 32;
> +		offset = ioread32(ring->buffer->virtual_start + head + 8);
> +	}
> +	return semaphore_wait_to_signaller_ring(ring, ipehr, offset);
>  }
>  
>  static int semaphore_passed(struct intel_engine_cs *ring)
> -- 
> 1.9.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 07/10] drm/i915/bdw: collect semaphore error state
  2014-06-30 16:53 ` [PATCH 07/10] drm/i915/bdw: collect semaphore error state Rodrigo Vivi
@ 2014-07-14 10:04   ` Damien Lespiau
  2014-07-14 18:22     ` Ben Widawsky
  0 siblings, 1 reply; 17+ messages in thread
From: Damien Lespiau @ 2014-07-14 10:04 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, Ben Widawsky

On Mon, Jun 30, 2014 at 09:53:41AM -0700, Rodrigo Vivi wrote:
> +	for_each_ring(useless, dev_priv, i) {
> +		u16 signal_offset =
> +			(GEN8_SIGNAL_OFFSET(ring, i) & PAGE_MASK) / 4;
> +		u32 *tmp = error->semaphore_obj->pages[0];
> +
> +		ering->semaphore_mboxes[i] = tmp[signal_offset];
> +		ering->semaphore_seqno[i] = ring->semaphore.sync_seqno[i];
> +	}
> +}

This loops around all the rings, but semaphore_mboxes[] and
semaphore_seqno[] are of size (I915_NUM_RINGS - 1).

-- 
Damien

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 07/10] drm/i915/bdw: collect semaphore error state
  2014-07-14 10:04   ` Damien Lespiau
@ 2014-07-14 18:22     ` Ben Widawsky
  0 siblings, 0 replies; 17+ messages in thread
From: Ben Widawsky @ 2014-07-14 18:22 UTC (permalink / raw)
  To: Damien Lespiau; +Cc: intel-gfx, Rodrigo Vivi

On Mon, Jul 14, 2014 at 11:04:22AM +0100, Damien Lespiau wrote:
> On Mon, Jun 30, 2014 at 09:53:41AM -0700, Rodrigo Vivi wrote:
> > +	for_each_ring(useless, dev_priv, i) {
> > +		u16 signal_offset =
> > +			(GEN8_SIGNAL_OFFSET(ring, i) & PAGE_MASK) / 4;
> > +		u32 *tmp = error->semaphore_obj->pages[0];
> > +
> > +		ering->semaphore_mboxes[i] = tmp[signal_offset];
> > +		ering->semaphore_seqno[i] = ring->semaphore.sync_seqno[i];
> > +	}
> > +}
> 
> This loops around all the rings, but semaphore_mboxes[] and
> semaphore_seqno[] are of size (I915_NUM_RINGS - 1).
> 
> -- 
> Damien

Dan Carpenter has already reported this to us. I was expecting a patch
from Rodrigo.

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2014-07-14 18:22 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-30 16:53 [PATCH 01/10] drm/i915: Make semaphore updates more precise Rodrigo Vivi
2014-06-30 16:53 ` [PATCH 02/10] drm/i915: gen specific ring init Rodrigo Vivi
2014-06-30 16:53 ` [PATCH 03/10] drm/i915/bdw: implement semaphore signal Rodrigo Vivi
2014-06-30 16:53 ` [PATCH 04/10] drm/i915/bdw: implement semaphore wait Rodrigo Vivi
2014-06-30 16:53 ` [PATCH 05/10] drm/i915: Implement MI decode for gen8 Rodrigo Vivi
2014-07-01  0:13   ` Ben Widawsky
2014-07-07 21:17   ` Daniel Vetter
2014-07-07 14:29     ` [PATCH] " Rodrigo Vivi
2014-06-30 16:53 ` [PATCH 06/10] drm/i915: Extract semaphore error collection Rodrigo Vivi
2014-06-30 16:53 ` [PATCH 07/10] drm/i915/bdw: collect semaphore error state Rodrigo Vivi
2014-07-14 10:04   ` Damien Lespiau
2014-07-14 18:22     ` Ben Widawsky
2014-06-30 16:53 ` [PATCH 08/10] drm/i915: semaphore debugfs Rodrigo Vivi
2014-06-30 16:53 ` [PATCH 09/10] drm/i915/bdw: poll semaphores Rodrigo Vivi
2014-06-30 16:53 ` [PATCH 10/10] drm/i915: Enable semaphores on BDW Rodrigo Vivi
2014-07-01  0:14   ` Ben Widawsky
2014-07-07 20:26     ` Daniel Vetter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.