* [PATCH 00/13] [REPOST] BDW Semaphores
@ 2014-04-29 21:52 Ben Widawsky
  2014-04-29 21:52 ` [PATCH 01/13] drm/i915: Move semaphore specific ring members to struct Ben Widawsky
                   ` (13 more replies)
  0 siblings, 14 replies; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

Okay, trying this again after the somewhat painful VCS2 rebase. I think I got
to all of Ville's comments, but I could have missed a few. I apologize if so.

Daniel, even if you don't merge the whole series, the first few would really
help with rebase pain - though now that VCS2 is merged, there's probably not
much other than execlists left to be painful.

The series is completely untested since the last rebase. I also didn't look
really closely to make sure the rebase was correct - I'm just totally short on
time atm. It was tested before that.

Ben Widawsky (13):
  drm/i915: Move semaphore specific ring members to struct
  drm/i915: Virtualize the ringbuffer signal func
  drm/i915: Move ring_begin to signal()
  drm/i915: Make semaphore updates more precise
  drm/i915: gen specific ring init
  drm/i915/bdw: implement semaphore signal
  drm/i915/bdw: implement semaphore wait
  drm/i915: Implement MI decode for gen8
  drm/i915/bdw: poll semaphores
  drm/i915: Extract semaphore error collection
  drm/i915/bdw: collect semaphore error state
  drm/i915: semaphore debugfs
  DONT_MERGE drm/i915: FORCE_RESTORE for gen8 semaphores

 drivers/gpu/drm/i915/i915_debugfs.c     |  70 ++++++
 drivers/gpu/drm/i915/i915_drv.h         |   2 +
 drivers/gpu/drm/i915/i915_gem.c         |  10 +-
 drivers/gpu/drm/i915/i915_gem_context.c |   7 +
 drivers/gpu/drm/i915/i915_gpu_error.c   |  79 +++++--
 drivers/gpu/drm/i915/i915_irq.c         |  14 +-
 drivers/gpu/drm/i915/i915_reg.h         |   8 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 405 ++++++++++++++++++++++----------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  90 ++++++-
 9 files changed, 528 insertions(+), 157 deletions(-)

-- 
1.9.2


* [PATCH 01/13] drm/i915: Move semaphore specific ring members to struct
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-04-29 21:52 ` [PATCH 02/13] drm/i915: Virtualize the ringbuffer signal func Ben Widawsky
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

This will be helpful in abstracting some of the code in preparation for
gen8 semaphores.

v2: Move mbox stuff to a separate struct

v3: Rebased over VCS2 work

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c         |  10 +--
 drivers/gpu/drm/i915/i915_gpu_error.c   |   6 +-
 drivers/gpu/drm/i915/i915_irq.c         |   3 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 124 ++++++++++++++++----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  19 +++--
 5 files changed, 82 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b00a77e..dae51c3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2119,8 +2119,8 @@ i915_gem_init_seqno(struct drm_device *dev, u32 seqno)
 	for_each_ring(ring, dev_priv, i) {
 		intel_ring_init_seqno(ring, seqno);
 
-		for (j = 0; j < ARRAY_SIZE(ring->sync_seqno); j++)
-			ring->sync_seqno[j] = 0;
+		for (j = 0; j < ARRAY_SIZE(ring->semaphore.sync_seqno); j++)
+			ring->semaphore.sync_seqno[j] = 0;
 	}
 
 	return 0;
@@ -2692,7 +2692,7 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj,
 	idx = intel_ring_sync_index(from, to);
 
 	seqno = obj->last_read_seqno;
-	if (seqno <= from->sync_seqno[idx])
+	if (seqno <= from->semaphore.sync_seqno[idx])
 		return 0;
 
 	ret = i915_gem_check_olr(obj->ring, seqno);
@@ -2700,13 +2700,13 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj,
 		return ret;
 
 	trace_i915_gem_ring_sync_to(from, to, seqno);
-	ret = to->sync_to(to, from, seqno);
+	ret = to->semaphore.sync_to(to, from, seqno);
 	if (!ret)
 		/* We use last_read_seqno because sync_to()
 		 * might have just caused seqno wrap under
 		 * the radar.
 		 */
-		from->sync_seqno[idx] = obj->last_read_seqno;
+		from->semaphore.sync_seqno[idx] = obj->last_read_seqno;
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 51e9978..2d81985 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -757,14 +757,14 @@ static void i915_record_ring_state(struct drm_device *dev,
 			= I915_READ(RING_SYNC_0(ring->mmio_base));
 		ering->semaphore_mboxes[1]
 			= I915_READ(RING_SYNC_1(ring->mmio_base));
-		ering->semaphore_seqno[0] = ring->sync_seqno[0];
-		ering->semaphore_seqno[1] = ring->sync_seqno[1];
+		ering->semaphore_seqno[0] = ring->semaphore.sync_seqno[0];
+		ering->semaphore_seqno[1] = ring->semaphore.sync_seqno[1];
 	}
 
 	if (HAS_VEBOX(dev)) {
 		ering->semaphore_mboxes[2] =
 			I915_READ(RING_SYNC_2(ring->mmio_base));
-		ering->semaphore_seqno[2] = ring->sync_seqno[2];
+		ering->semaphore_seqno[2] = ring->semaphore.sync_seqno[2];
 	}
 
 	if (INTEL_INFO(dev)->gen >= 4) {
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 7e0d577..2d76183 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2595,8 +2595,7 @@ semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
 			if(ring == signaller)
 				continue;
 
-			if (sync_bits ==
-			    signaller->semaphore_register[ring->id])
+			if (sync_bits == signaller->semaphore.mbox.wait[ring->id])
 				return signaller;
 		}
 	}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index ab22d70..3076a99 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -706,7 +706,7 @@ gen6_add_request(struct intel_ring_buffer *ring)
 
 	if (i915_semaphore_is_enabled(dev)) {
 		for_each_ring(useless, dev_priv, i) {
-			u32 mbox_reg = ring->signal_mbox[i];
+			u32 mbox_reg = ring->semaphore.mbox.signal[i];
 			if (mbox_reg != GEN6_NOSYNC)
 				update_mboxes(ring, mbox_reg);
 		}
@@ -740,10 +740,11 @@ gen6_ring_sync(struct intel_ring_buffer *waiter,
 	       struct intel_ring_buffer *signaller,
 	       u32 seqno)
 {
-	int ret;
 	u32 dw1 = MI_SEMAPHORE_MBOX |
 		  MI_SEMAPHORE_COMPARE |
 		  MI_SEMAPHORE_REGISTER;
+	u32 wait_mbox = signaller->semaphore.mbox.wait[waiter->id];
+	int ret;
 
 	/* Throughout all of the GEM code, seqno passed implies our current
 	 * seqno is >= the last seqno executed. However for hardware the
@@ -751,8 +752,7 @@ gen6_ring_sync(struct intel_ring_buffer *waiter,
 	 */
 	seqno -= 1;
 
-	WARN_ON(signaller->semaphore_register[waiter->id] ==
-		MI_SEMAPHORE_SYNC_INVALID);
+	WARN_ON(wait_mbox == MI_SEMAPHORE_SYNC_INVALID);
 
 	ret = intel_ring_begin(waiter, 4);
 	if (ret)
@@ -760,9 +760,7 @@ gen6_ring_sync(struct intel_ring_buffer *waiter,
 
 	/* If seqno wrap happened, omit the wait with no-ops */
 	if (likely(!i915_gem_has_seqno_wrapped(waiter->dev, seqno))) {
-		intel_ring_emit(waiter,
-				dw1 |
-				signaller->semaphore_register[waiter->id]);
+		intel_ring_emit(waiter, dw1 | wait_mbox);
 		intel_ring_emit(waiter, seqno);
 		intel_ring_emit(waiter, 0);
 		intel_ring_emit(waiter, MI_NOOP);
@@ -1414,7 +1412,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
 	ring->size = 32 * PAGE_SIZE;
-	memset(ring->sync_seqno, 0, sizeof(ring->sync_seqno));
+	memset(ring->semaphore.sync_seqno, 0, sizeof(ring->semaphore.sync_seqno));
 
 	init_waitqueue_head(&ring->irq_queue);
 
@@ -1921,23 +1919,23 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
-		ring->sync_to = gen6_ring_sync;
+		ring->semaphore.sync_to = gen6_ring_sync;
 		/*
 		 * The current semaphore is only applied on pre-gen8 platform.
 		 * And there is no VCS2 ring on the pre-gen8 platform. So the
 		 * semaphore between RCS and VCS2 is initialized as INVALID.
 		 * Gen8 will initialize the sema between VCS2 and RCS later.
 		 */
-		ring->semaphore_register[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore_register[VCS] = MI_SEMAPHORE_SYNC_RV;
-		ring->semaphore_register[BCS] = MI_SEMAPHORE_SYNC_RB;
-		ring->semaphore_register[VECS] = MI_SEMAPHORE_SYNC_RVE;
-		ring->semaphore_register[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->signal_mbox[RCS] = GEN6_NOSYNC;
-		ring->signal_mbox[VCS] = GEN6_VRSYNC;
-		ring->signal_mbox[BCS] = GEN6_BRSYNC;
-		ring->signal_mbox[VECS] = GEN6_VERSYNC;
-		ring->signal_mbox[VCS2] = GEN6_NOSYNC;
+		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
+		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_RV;
+		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_RB;
+		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_RVE;
+		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+		ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
+		ring->semaphore.mbox.signal[VCS] = GEN6_VRSYNC;
+		ring->semaphore.mbox.signal[BCS] = GEN6_BRSYNC;
+		ring->semaphore.mbox.signal[VECS] = GEN6_VERSYNC;
+		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
 	} else if (IS_GEN5(dev)) {
 		ring->add_request = pc_render_add_request;
 		ring->flush = gen4_render_ring_flush;
@@ -2105,23 +2103,23 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 			ring->dispatch_execbuffer =
 				gen6_ring_dispatch_execbuffer;
 		}
-		ring->sync_to = gen6_ring_sync;
+		ring->semaphore.sync_to = gen6_ring_sync;
 		/*
 		 * The current semaphore is only applied on pre-gen8 platform.
 		 * And there is no VCS2 ring on the pre-gen8 platform. So the
 		 * semaphore between VCS and VCS2 is initialized as INVALID.
 		 * Gen8 will initialize the sema between VCS2 and VCS later.
 		 */
-		ring->semaphore_register[RCS] = MI_SEMAPHORE_SYNC_VR;
-		ring->semaphore_register[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore_register[BCS] = MI_SEMAPHORE_SYNC_VB;
-		ring->semaphore_register[VECS] = MI_SEMAPHORE_SYNC_VVE;
-		ring->semaphore_register[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->signal_mbox[RCS] = GEN6_RVSYNC;
-		ring->signal_mbox[VCS] = GEN6_NOSYNC;
-		ring->signal_mbox[BCS] = GEN6_BVSYNC;
-		ring->signal_mbox[VECS] = GEN6_VEVSYNC;
-		ring->signal_mbox[VCS2] = GEN6_NOSYNC;
+		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
+		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
+		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
+		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+		ring->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
+		ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+		ring->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
+		ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
+		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
 	} else {
 		ring->mmio_base = BSD_RING_BASE;
 		ring->flush = bsd_ring_flush;
@@ -2173,23 +2171,23 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
 	ring->irq_put = gen8_ring_put_irq;
 	ring->dispatch_execbuffer =
 			gen8_ring_dispatch_execbuffer;
-	ring->sync_to = gen6_ring_sync;
+	ring->semaphore.sync_to = gen6_ring_sync;
 	/*
 	 * The current semaphore is only applied on the pre-gen8. And there
 	 * is no bsd2 ring on the pre-gen8. So now the semaphore_register
 	 * between VCS2 and other ring is initialized as invalid.
 	 * Gen8 will initialize the sema between VCS2 and other ring later.
 	 */
-	ring->semaphore_register[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore_register[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore_register[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore_register[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore_register[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->signal_mbox[RCS] = GEN6_NOSYNC;
-	ring->signal_mbox[VCS] = GEN6_NOSYNC;
-	ring->signal_mbox[BCS] = GEN6_NOSYNC;
-	ring->signal_mbox[VECS] = GEN6_NOSYNC;
-	ring->signal_mbox[VCS2] = GEN6_NOSYNC;
+	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
+	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+	ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+	ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+	ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
+	ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+	ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+	ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
 
 	ring->init = init_ring_common;
 
@@ -2222,23 +2220,23 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 		ring->irq_put = gen6_ring_put_irq;
 		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
 	}
-	ring->sync_to = gen6_ring_sync;
+	ring->semaphore.sync_to = gen6_ring_sync;
 	/*
 	 * The current semaphore is only applied on pre-gen8 platform. And
 	 * there is no VCS2 ring on the pre-gen8 platform. So the semaphore
 	 * between BCS and VCS2 is initialized as INVALID.
 	 * Gen8 will initialize the sema between BCS and VCS2 later.
 	 */
-	ring->semaphore_register[RCS] = MI_SEMAPHORE_SYNC_BR;
-	ring->semaphore_register[VCS] = MI_SEMAPHORE_SYNC_BV;
-	ring->semaphore_register[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore_register[VECS] = MI_SEMAPHORE_SYNC_BVE;
-	ring->semaphore_register[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->signal_mbox[RCS] = GEN6_RBSYNC;
-	ring->signal_mbox[VCS] = GEN6_VBSYNC;
-	ring->signal_mbox[BCS] = GEN6_NOSYNC;
-	ring->signal_mbox[VECS] = GEN6_VEBSYNC;
-	ring->signal_mbox[VCS2] = GEN6_NOSYNC;
+	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
+	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
+	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+	ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
+	ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+	ring->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
+	ring->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
+	ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+	ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
+	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
 	ring->init = init_ring_common;
 
 	return intel_init_ring_buffer(dev, ring);
@@ -2271,17 +2269,17 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 		ring->irq_put = hsw_vebox_put_irq;
 		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
 	}
-	ring->sync_to = gen6_ring_sync;
-	ring->semaphore_register[RCS] = MI_SEMAPHORE_SYNC_VER;
-	ring->semaphore_register[VCS] = MI_SEMAPHORE_SYNC_VEV;
-	ring->semaphore_register[BCS] = MI_SEMAPHORE_SYNC_VEB;
-	ring->semaphore_register[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore_register[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->signal_mbox[RCS] = GEN6_RVESYNC;
-	ring->signal_mbox[VCS] = GEN6_VVESYNC;
-	ring->signal_mbox[BCS] = GEN6_BVESYNC;
-	ring->signal_mbox[VECS] = GEN6_NOSYNC;
-	ring->signal_mbox[VCS2] = GEN6_NOSYNC;
+	ring->semaphore.sync_to = gen6_ring_sync;
+	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
+	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
+	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
+	ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+	ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+	ring->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
+	ring->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
+	ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
+	ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
 	ring->init = init_ring_common;
 
 	return intel_init_ring_buffer(dev, ring);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 13e398f..6a44a64 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -90,7 +90,6 @@ struct  intel_ring_buffer {
 	unsigned irq_refcount; /* protected by dev_priv->irq_lock */
 	u32		irq_enable_mask;	/* bitmask to enable ring interrupt */
 	u32		trace_irq_seqno;
-	u32		sync_seqno[I915_NUM_RINGS-1];
 	bool __must_check (*irq_get)(struct intel_ring_buffer *ring);
 	void		(*irq_put)(struct intel_ring_buffer *ring);
 
@@ -118,14 +117,20 @@ struct  intel_ring_buffer {
 #define I915_DISPATCH_SECURE 0x1
 #define I915_DISPATCH_PINNED 0x2
 	void		(*cleanup)(struct intel_ring_buffer *ring);
-	int		(*sync_to)(struct intel_ring_buffer *ring,
+
+	struct {
+		u32	sync_seqno[I915_NUM_RINGS-1];
+		/* AKA wait() */
+		int	(*sync_to)(struct intel_ring_buffer *ring,
 				   struct intel_ring_buffer *to,
 				   u32 seqno);
-
-	/* our mbox written by others */
-	u32		semaphore_register[I915_NUM_RINGS];
-	/* mboxes this ring signals to */
-	u32		signal_mbox[I915_NUM_RINGS];
+		struct {
+			/* our mbox written by others */
+			u32		wait[I915_NUM_RINGS];
+			/* mboxes this ring signals to */
+			u32		signal[I915_NUM_RINGS];
+		} mbox;
+	} semaphore;
 
 	/**
 	 * List of objects currently involved in rendering from the
-- 
1.9.2
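A minimal sketch of the layout this patch ends up with, written as standalone C so it can be compiled outside the kernel. The names `struct ring`, `ring_semaphore_init_invalid()`, `wait_mbox_for()`, `SYNC_INVALID` and `NOSYNC` are hypothetical stand-ins, and the constant values are placeholders rather than the real register encodings; only the nesting (`semaphore.sync_seqno`, `semaphore.mbox.wait[]`, `semaphore.mbox.signal[]`) is taken from the diff above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; the real definitions live in the i915 headers. */
enum ring_id { RCS, VCS, BCS, VECS, VCS2, NUM_RINGS };
#define SYNC_INVALID 0xffffffffu /* placeholder, not the hardware encoding */
#define NOSYNC       0u          /* placeholder, not the hardware encoding */

struct ring {
	enum ring_id id;
	struct {
		uint32_t sync_seqno[NUM_RINGS - 1];
		struct {
			uint32_t wait[NUM_RINGS];   /* our mbox written by others */
			uint32_t signal[NUM_RINGS]; /* mboxes this ring signals to */
		} mbox;
	} semaphore;
};

/* Mark every pairing as unused, like the VCS2 init in the patch does. */
static void ring_semaphore_init_invalid(struct ring *ring, enum ring_id id)
{
	int i;

	memset(ring, 0, sizeof(*ring));
	ring->id = id;
	for (i = 0; i < NUM_RINGS; i++) {
		ring->semaphore.mbox.wait[i] = SYNC_INVALID;
		ring->semaphore.mbox.signal[i] = NOSYNC;
	}
}

/* Same lookup as in gen6_ring_sync(): index the signaller's wait[] by waiter id. */
static uint32_t wait_mbox_for(const struct ring *signaller,
			      const struct ring *waiter)
{
	assert(waiter->id < NUM_RINGS);
	return signaller->semaphore.mbox.wait[waiter->id];
}
```

The point of the nesting is that everything a gen8 implementation would need to swap out sits under one member, so later patches can change the mbox tables without touching the rest of the ring struct.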

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 02/13] drm/i915: Virtualize the ringbuffer signal func
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
  2014-04-29 21:52 ` [PATCH 01/13] drm/i915: Move semaphore specific ring members to struct Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-04-29 21:52 ` [PATCH 03/13] drm/i915: Move ring_begin to signal() Ben Widawsky
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

This abstraction again is in preparation for gen8. Gen8 will bring new
semantics for doing this operation.

While here, make the writes of MI_NOOPs explicit for non-existent rings.
This should have been implicit before.

NOTE: This is going to be removed in a few patches.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 42 ++++++++++++++++++++-------------
 drivers/gpu/drm/i915/intel_ringbuffer.h | 11 +++++----
 2 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3076a99..ea81b54 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -663,20 +663,32 @@ static void render_ring_cleanup(struct intel_ring_buffer *ring)
 	ring->scratch.obj = NULL;
 }
 
-static void
-update_mboxes(struct intel_ring_buffer *ring,
-	      u32 mmio_offset)
+static void gen6_signal(struct intel_ring_buffer *signaller)
 {
+	struct drm_i915_private *dev_priv = signaller->dev->dev_private;
+	struct intel_ring_buffer *useless;
+	int i;
+
 /* NB: In order to be able to do semaphore MBOX updates for varying number
  * of rings, it's easiest if we round up each individual update to a
  * multiple of 2 (since ring updates must always be a multiple of 2)
  * even though the actual update only requires 3 dwords.
  */
 #define MBOX_UPDATE_DWORDS 4
-	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
-	intel_ring_emit(ring, mmio_offset);
-	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
-	intel_ring_emit(ring, MI_NOOP);
+	for_each_ring(useless, dev_priv, i) {
+		u32 mbox_reg = signaller->semaphore.mbox.signal[i];
+		if (mbox_reg != GEN6_NOSYNC) {
+			intel_ring_emit(signaller, MI_LOAD_REGISTER_IMM(1));
+			intel_ring_emit(signaller, mbox_reg);
+			intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
+			intel_ring_emit(signaller, MI_NOOP);
+		} else {
+			intel_ring_emit(signaller, MI_NOOP);
+			intel_ring_emit(signaller, MI_NOOP);
+			intel_ring_emit(signaller, MI_NOOP);
+			intel_ring_emit(signaller, MI_NOOP);
+		}
+	}
 }
 
 /**
@@ -692,9 +704,7 @@ static int
 gen6_add_request(struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = ring->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *useless;
-	int i, ret, num_dwords = 4;
+	int ret, num_dwords = 4;
 
 	if (i915_semaphore_is_enabled(dev))
 		num_dwords += ((I915_NUM_RINGS-1) * MBOX_UPDATE_DWORDS);
@@ -704,13 +714,7 @@ gen6_add_request(struct intel_ring_buffer *ring)
 	if (ret)
 		return ret;
 
-	if (i915_semaphore_is_enabled(dev)) {
-		for_each_ring(useless, dev_priv, i) {
-			u32 mbox_reg = ring->semaphore.mbox.signal[i];
-			if (mbox_reg != GEN6_NOSYNC)
-				update_mboxes(ring, mbox_reg);
-		}
-	}
+	ring->semaphore.signal(ring);
 
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
@@ -1920,6 +1924,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		ring->semaphore.sync_to = gen6_ring_sync;
+		ring->semaphore.signal = gen6_signal;
 		/*
 		 * The current semaphore is only applied on pre-gen8 platform.
 		 * And there is no VCS2 ring on the pre-gen8 platform. So the
@@ -2104,6 +2109,7 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 				gen6_ring_dispatch_execbuffer;
 		}
 		ring->semaphore.sync_to = gen6_ring_sync;
+		ring->semaphore.signal = gen6_signal;
 		/*
 		 * The current semaphore is only applied on pre-gen8 platform.
 		 * And there is no VCS2 ring on the pre-gen8 platform. So the
@@ -2221,6 +2227,7 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
 	}
 	ring->semaphore.sync_to = gen6_ring_sync;
+	ring->semaphore.signal = gen6_signal;
 	/*
 	 * The current semaphore is only applied on pre-gen8 platform. And
 	 * there is no VCS2 ring on the pre-gen8 platform. So the semaphore
@@ -2270,6 +2277,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
 	}
 	ring->semaphore.sync_to = gen6_ring_sync;
+	ring->semaphore.signal = gen6_signal;
 	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
 	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
 	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 6a44a64..830ff26 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -120,16 +120,19 @@ struct  intel_ring_buffer {
 
 	struct {
 		u32	sync_seqno[I915_NUM_RINGS-1];
-		/* AKA wait() */
-		int	(*sync_to)(struct intel_ring_buffer *ring,
-				   struct intel_ring_buffer *to,
-				   u32 seqno);
+
 		struct {
 			/* our mbox written by others */
 			u32		wait[I915_NUM_RINGS];
 			/* mboxes this ring signals to */
 			u32		signal[I915_NUM_RINGS];
 		} mbox;
+
+		/* AKA wait() */
+		int	(*sync_to)(struct intel_ring_buffer *ring,
+				   struct intel_ring_buffer *to,
+				   u32 seqno);
+		void	(*signal)(struct intel_ring_buffer *signaller);
 	} semaphore;
 
 	/**
-- 
1.9.2
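A rough userspace model of what the virtualized signal hook does at this point in the series, assuming a toy command buffer in place of the real ring. `struct ring`, `emit()`, `toy_gen6_signal()` and the `OP_*` values are hypothetical; the real opcodes are `MI_LOAD_REGISTER_IMM(1)` and `MI_NOOP`. The sketch also loops over every slot, whereas the patch uses `for_each_ring()`.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_RINGS          5
#define MBOX_UPDATE_DWORDS 4
#define NOSYNC             0u          /* placeholder for GEN6_NOSYNC */
#define OP_LRI             0x11000001u /* placeholder, not the real opcode */
#define OP_NOOP            0u          /* placeholder for MI_NOOP */

struct ring {
	uint32_t cmds[64];
	unsigned int tail;
	uint32_t seqno;
	uint32_t signal_mbox[NUM_RINGS];
	void (*signal)(struct ring *signaller);
};

static void emit(struct ring *ring, uint32_t dw)
{
	assert(ring->tail < 64);
	ring->cmds[ring->tail++] = dw;
}

/* Every slot costs exactly MBOX_UPDATE_DWORDS, whether it signals or not. */
static void toy_gen6_signal(struct ring *signaller)
{
	int i;

	for (i = 0; i < NUM_RINGS; i++) {
		uint32_t mbox_reg = signaller->signal_mbox[i];

		if (mbox_reg != NOSYNC) {
			emit(signaller, OP_LRI);
			emit(signaller, mbox_reg);
			emit(signaller, signaller->seqno);
			emit(signaller, OP_NOOP);
		} else {
			/* Explicit padding for slots with nothing to signal. */
			emit(signaller, OP_NOOP);
			emit(signaller, OP_NOOP);
			emit(signaller, OP_NOOP);
			emit(signaller, OP_NOOP);
		}
	}
}
```

The caller only ever goes through `ring->signal(ring)`, which is what lets a different generation plug in its own emitter later.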


* [PATCH 03/13] drm/i915: Move ring_begin to signal()
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
  2014-04-29 21:52 ` [PATCH 01/13] drm/i915: Move semaphore specific ring members to struct Ben Widawsky
  2014-04-29 21:52 ` [PATCH 02/13] drm/i915: Virtualize the ringbuffer signal func Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-04-29 21:52 ` [PATCH 04/13] drm/i915: Make semaphore updates more precise Ben Widawsky
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

Add_request has always contained both the semaphore mailbox updates and
the breadcrumb writes. Since the semaphore signal is the one which
actually knows the number of dwords it needs to emit to the ring, we move
the ring_begin to that function. This allows us to remove
the hideously shared #define.

On a related note, gen8 will use a different number of dwords for
semaphores, but not for add request.

v2: Make number of dwords an explicit part of signalling (via function
argument). (Chris)

v3: very slight comment change

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 39 +++++++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 +++-
 2 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index ea81b54..e0c7bf2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -663,18 +663,28 @@ static void render_ring_cleanup(struct intel_ring_buffer *ring)
 	ring->scratch.obj = NULL;
 }
 
-static void gen6_signal(struct intel_ring_buffer *signaller)
+static int gen6_signal(struct intel_ring_buffer *signaller,
+		       unsigned int num_dwords)
 {
-	struct drm_i915_private *dev_priv = signaller->dev->dev_private;
+	struct drm_device *dev = signaller->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *useless;
-	int i;
+	int i, ret;
 
-/* NB: In order to be able to do semaphore MBOX updates for varying number
- * of rings, it's easiest if we round up each individual update to a
- * multiple of 2 (since ring updates must always be a multiple of 2)
- * even though the actual update only requires 3 dwords.
- */
+	/* NB: In order to be able to do semaphore MBOX updates for varying
+	 * number of rings, it's easiest if we round up each individual update
+	 * to a multiple of 2 (since ring updates must always be a multiple of
+	 * 2) even though the actual update only requires 3 dwords.
+	 */
 #define MBOX_UPDATE_DWORDS 4
+	if (i915_semaphore_is_enabled(dev))
+		num_dwords += ((I915_NUM_RINGS-1) * MBOX_UPDATE_DWORDS);
+
+	ret = intel_ring_begin(signaller, num_dwords);
+	if (ret)
+		return ret;
+#undef MBOX_UPDATE_DWORDS
+
 	for_each_ring(useless, dev_priv, i) {
 		u32 mbox_reg = signaller->semaphore.mbox.signal[i];
 		if (mbox_reg != GEN6_NOSYNC) {
@@ -689,6 +699,8 @@ static void gen6_signal(struct intel_ring_buffer *signaller)
 			intel_ring_emit(signaller, MI_NOOP);
 		}
 	}
+
+	return 0;
 }
 
 /**
@@ -703,19 +715,12 @@ static void gen6_signal(struct intel_ring_buffer *signaller)
 static int
 gen6_add_request(struct intel_ring_buffer *ring)
 {
-	struct drm_device *dev = ring->dev;
-	int ret, num_dwords = 4;
-
-	if (i915_semaphore_is_enabled(dev))
-		num_dwords += ((I915_NUM_RINGS-1) * MBOX_UPDATE_DWORDS);
-#undef MBOX_UPDATE_DWORDS
+	int ret;
 
-	ret = intel_ring_begin(ring, num_dwords);
+	ret = ring->semaphore.signal(ring, 4);
 	if (ret)
 		return ret;
 
-	ring->semaphore.signal(ring);
-
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 830ff26..0fdf030 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -132,7 +132,9 @@ struct  intel_ring_buffer {
 		int	(*sync_to)(struct intel_ring_buffer *ring,
 				   struct intel_ring_buffer *to,
 				   u32 seqno);
-		void	(*signal)(struct intel_ring_buffer *signaller);
+		int	(*signal)(struct intel_ring_buffer *signaller,
+				  /* num_dwords needed by caller */
+				  unsigned int num_dwords);
 	} semaphore;
 
 	/**
-- 
1.9.2
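A small sketch of the calling convention after this patch, assuming a toy space counter instead of the real ring. `toy_ring_begin()`, `toy_signal()` and `toy_add_request()` are hypothetical names, and returning `-ENOSPC` is a simplification for the sketch; it is not a claim about how `intel_ring_begin()` behaves when the ring is full.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NUM_RINGS          5
#define MBOX_UPDATE_DWORDS 4

struct ring {
	unsigned int space;     /* dwords still free in the toy ring */
	unsigned int reserved;  /* size of the last successful reservation */
	int semaphores_enabled;
};

/* Toy reservation: fail instead of waiting when there is not enough room. */
static int toy_ring_begin(struct ring *ring, unsigned int num_dwords)
{
	if (num_dwords > ring->space)
		return -ENOSPC;
	ring->space -= num_dwords;
	ring->reserved = num_dwords;
	return 0;
}

/* The signaller adds its own needs on top of what the caller asked for. */
static int toy_signal(struct ring *signaller, unsigned int num_dwords)
{
	if (signaller->semaphores_enabled)
		num_dwords += (NUM_RINGS - 1) * MBOX_UPDATE_DWORDS;

	return toy_ring_begin(signaller, num_dwords);
}

/* The caller states only the 4 dwords it needs for the breadcrumb write. */
static int toy_add_request(struct ring *ring)
{
	return toy_signal(ring, 4);
}
```

With this shape the caller no longer needs to know how many dwords a mailbox update costs, which is why the shared #define can become local to the signal function.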


* [PATCH 04/13] drm/i915: Make semaphore updates more precise
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
                   ` (2 preceding siblings ...)
  2014-04-29 21:52 ` [PATCH 03/13] drm/i915: Move ring_begin to signal() Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-04-30 12:45   ` Daniel Vetter
  2014-04-29 21:52 ` [PATCH 05/13] drm/i915: gen specific ring init Ben Widawsky
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

With the ring mask we now have an easy way to know the number of rings
in the system, and therefore can accurately predict the number of dwords
to emit for semaphore signalling. This was not possible (easily)
previously.

There should be no functional impact, simply fewer instructions emitted.

While we're here, simply round the total up to a multiple of 2 instead of
the fancier rounding we did before, which rounded up per mbox, i.e. to 4.
This also allows us to drop the unnecessary MI_NOOP, so each update is
really 3 dwords, not 4.
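
The arithmetic described above can be checked with a small standalone sketch. `popcount32()` and `round_up2()` are hypothetical userspace stand-ins for the kernel's hweight32() and round_up(x, 2), and `signal_dwords()` and `needs_pad_noop()` are illustrative helpers that do not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

#define MBOX_UPDATE_DWORDS 3

/* Count set bits; stands in for hweight32() on the ring mask. */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Round x up to the next multiple of 2. */
static unsigned int round_up2(unsigned int x)
{
	return (x + 1u) & ~1u;
}

/* Total reservation: caller's dwords plus one update per other ring. */
static unsigned int signal_dwords(uint32_t ring_mask, unsigned int caller_dwords)
{
	unsigned int num_rings = popcount32(ring_mask);

	assert(num_rings >= 1);
	return caller_dwords + round_up2((num_rings - 1) * MBOX_UPDATE_DWORDS);
}

/* (num_rings - 1) * 3 is odd exactly when num_rings is even. */
static int needs_pad_noop(uint32_t ring_mask)
{
	return popcount32(ring_mask) % 2 == 0;
}
```

For example, with four rings the three mailbox updates take 9 dwords, which rounds up to 10, so one trailing MI_NOOP is needed; with three or five rings the update count is already even and no padding is emitted.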

v2: Use 3 dwords instead of 4 (Ville)
Do the proper calculation to get the number of dwords to emit (Ville)
Conditionally set .sync_to when semaphores are enabled (Ville)

v3: Rebased on VCS2
Replace hweight_long with hweight32 (Ville)

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 173 +++++++++++++++++---------------
 1 file changed, 90 insertions(+), 83 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e0c7bf2..7aedc0c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -666,24 +666,19 @@ static void render_ring_cleanup(struct intel_ring_buffer *ring)
 static int gen6_signal(struct intel_ring_buffer *signaller,
 		       unsigned int num_dwords)
 {
+#define MBOX_UPDATE_DWORDS 3
 	struct drm_device *dev = signaller->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *useless;
-	int i, ret;
+	int i, ret, num_rings;
 
-	/* NB: In order to be able to do semaphore MBOX updates for varying
-	 * number of rings, it's easiest if we round up each individual update
-	 * to a multiple of 2 (since ring updates must always be a multiple of
-	 * 2) even though the actual update only requires 3 dwords.
-	 */
-#define MBOX_UPDATE_DWORDS 4
-	if (i915_semaphore_is_enabled(dev))
-		num_dwords += ((I915_NUM_RINGS-1) * MBOX_UPDATE_DWORDS);
+	num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
+	num_dwords += round_up((num_rings-1) * MBOX_UPDATE_DWORDS, 2);
+#undef MBOX_UPDATE_DWORDS
 
 	ret = intel_ring_begin(signaller, num_dwords);
 	if (ret)
 		return ret;
-#undef MBOX_UPDATE_DWORDS
 
 	for_each_ring(useless, dev_priv, i) {
 		u32 mbox_reg = signaller->semaphore.mbox.signal[i];
@@ -691,15 +686,13 @@ static int gen6_signal(struct intel_ring_buffer *signaller,
 			intel_ring_emit(signaller, MI_LOAD_REGISTER_IMM(1));
 			intel_ring_emit(signaller, mbox_reg);
 			intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
-			intel_ring_emit(signaller, MI_NOOP);
-		} else {
-			intel_ring_emit(signaller, MI_NOOP);
-			intel_ring_emit(signaller, MI_NOOP);
-			intel_ring_emit(signaller, MI_NOOP);
-			intel_ring_emit(signaller, MI_NOOP);
 		}
 	}
 
+	/* If num_dwords was rounded, make sure the tail pointer is correct */
+	if (num_rings % 2 == 0)
+		intel_ring_emit(signaller, MI_NOOP);
+
 	return 0;
 }
 
@@ -717,7 +710,11 @@ gen6_add_request(struct intel_ring_buffer *ring)
 {
 	int ret;
 
-	ret = ring->semaphore.signal(ring, 4);
+	if (ring->semaphore.signal)
+		ret = ring->semaphore.signal(ring, 4);
+	else
+		ret = intel_ring_begin(ring, 4);
+
 	if (ret)
 		return ret;
 
@@ -1928,24 +1925,27 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
-		ring->semaphore.sync_to = gen6_ring_sync;
-		ring->semaphore.signal = gen6_signal;
-		/*
-		 * The current semaphore is only applied on pre-gen8 platform.
-		 * And there is no VCS2 ring on the pre-gen8 platform. So the
-		 * semaphore between RCS and VCS2 is initialized as INVALID.
-		 * Gen8 will initialize the sema between VCS2 and RCS later.
-		 */
-		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_RV;
-		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_RB;
-		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_RVE;
-		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
-		ring->semaphore.mbox.signal[VCS] = GEN6_VRSYNC;
-		ring->semaphore.mbox.signal[BCS] = GEN6_BRSYNC;
-		ring->semaphore.mbox.signal[VECS] = GEN6_VERSYNC;
-		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.signal = gen6_signal;
+			/*
+			 * The current semaphore is only applied on pre-gen8
+			 * platform.  And there is no VCS2 ring on the pre-gen8
+			 * platform. So the semaphore between RCS and VCS2 is
+			 * initialized as INVALID.  Gen8 will initialize the
+			 * sema between VCS2 and RCS later.
+			 */
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_RV;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_RB;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_RVE;
+			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_VRSYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_BRSYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_VERSYNC;
+			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		}
 	} else if (IS_GEN5(dev)) {
 		ring->add_request = pc_render_add_request;
 		ring->flush = gen4_render_ring_flush;
@@ -2113,24 +2113,27 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 			ring->dispatch_execbuffer =
 				gen6_ring_dispatch_execbuffer;
 		}
-		ring->semaphore.sync_to = gen6_ring_sync;
-		ring->semaphore.signal = gen6_signal;
-		/*
-		 * The current semaphore is only applied on pre-gen8 platform.
-		 * And there is no VCS2 ring on the pre-gen8 platform. So the
-		 * semaphore between VCS and VCS2 is initialized as INVALID.
-		 * Gen8 will initialize the sema between VCS2 and VCS later.
-		 */
-		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
-		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
-		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
-		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
-		ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
-		ring->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
-		ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
-		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.signal = gen6_signal;
+			/*
+			 * The current semaphore is only applied on pre-gen8
+			 * platform.  And there is no VCS2 ring on the pre-gen8
+			 * platform. So the semaphore between VCS and VCS2 is
+			 * initialized as INVALID.  Gen8 will initialize the
+			 * sema between VCS2 and VCS later.
+			 */
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
+			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
+			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		}
 	} else {
 		ring->mmio_base = BSD_RING_BASE;
 		ring->flush = bsd_ring_flush;
@@ -2231,24 +2234,26 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 		ring->irq_put = gen6_ring_put_irq;
 		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
 	}
-	ring->semaphore.sync_to = gen6_ring_sync;
-	ring->semaphore.signal = gen6_signal;
-	/*
-	 * The current semaphore is only applied on pre-gen8 platform. And
-	 * there is no VCS2 ring on the pre-gen8 platform. So the semaphore
-	 * between BCS and VCS2 is initialized as INVALID.
-	 * Gen8 will initialize the sema between BCS and VCS2 later.
-	 */
-	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
-	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
-	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
-	ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
-	ring->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
-	ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
-	ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
-	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+	if (i915_semaphore_is_enabled(dev)) {
+		ring->semaphore.signal = gen6_signal;
+		ring->semaphore.sync_to = gen6_ring_sync;
+		/*
+		 * The current semaphore is only applied on pre-gen8 platform.
+		 * And there is no VCS2 ring on the pre-gen8 platform. So the
+		 * semaphore between BCS and VCS2 is initialized as INVALID.
+		 * Gen8 will initialize the sema between BCS and VCS2 later.
+		 */
+		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
+		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
+		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
+		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+		ring->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
+		ring->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
+		ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+		ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
+		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+	}
 	ring->init = init_ring_common;
 
 	return intel_init_ring_buffer(dev, ring);
@@ -2281,18 +2286,20 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 		ring->irq_put = hsw_vebox_put_irq;
 		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
 	}
-	ring->semaphore.sync_to = gen6_ring_sync;
-	ring->semaphore.signal = gen6_signal;
-	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
-	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
-	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
-	ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-	ring->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
-	ring->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
-	ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
-	ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
-	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+	if (i915_semaphore_is_enabled(dev)) {
+		ring->semaphore.sync_to = gen6_ring_sync;
+		ring->semaphore.signal = gen6_signal;
+		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
+		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
+		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
+		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+		ring->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
+		ring->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
+		ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
+		ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+	}
 	ring->init = init_ring_common;
 
 	return intel_init_ring_buffer(dev, ring);
-- 
1.9.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 05/13] drm/i915: gen specific ring init
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
                   ` (3 preceding siblings ...)
  2014-04-29 21:52 ` [PATCH 04/13] drm/i915: Make semaphore updates more precise Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-04-29 21:52 ` [PATCH 06/13] drm/i915/bdw: implement semaphore signal Ben Widawsky
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

Gen8 already differs in places from earlier gens in how it handles rings.
Semaphores bring yet more differences, so now is as good a time as any
to split the gen8 ring init out of the gen6/gen7 path.

Also, since gen8 doesn't actually use semaphores up until this point,
put the proper "NULL" values in for the mbox info.

v2: v1 had a stale commit message

v3: Move everything into the i915_semaphore_is_enabled() check

v4: VCS2 rebase
Remove double assignment of signal in render ring (Ville)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 187 +++++++++++++++++++++-----------
 1 file changed, 123 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 7aedc0c..f2bae6f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1909,19 +1909,33 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 	ring->id = RCS;
 	ring->mmio_base = RENDER_RING_BASE;
 
-	if (INTEL_INFO(dev)->gen >= 6) {
+	if (INTEL_INFO(dev)->gen >= 8) {
+		ring->add_request = gen6_add_request;
+		ring->flush = gen8_render_ring_flush;
+		ring->irq_get = gen8_ring_get_irq;
+		ring->irq_put = gen8_ring_put_irq;
+		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
+		ring->get_seqno = gen6_ring_get_seqno;
+		ring->set_seqno = ring_set_seqno;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.signal = gen6_signal;
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+		}
+	} else if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
 		ring->flush = gen7_render_ring_flush;
 		if (INTEL_INFO(dev)->gen == 6)
 			ring->flush = gen6_render_ring_flush;
-		if (INTEL_INFO(dev)->gen >= 8) {
-			ring->flush = gen8_render_ring_flush;
-			ring->irq_get = gen8_ring_get_irq;
-			ring->irq_put = gen8_ring_put_irq;
-		} else {
-			ring->irq_get = gen6_ring_get_irq;
-			ring->irq_put = gen6_ring_put_irq;
-		}
+		ring->irq_get = gen6_ring_get_irq;
+		ring->irq_put = gen6_ring_put_irq;
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
@@ -1973,6 +1987,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->irq_enable_mask = I915_USER_INTERRUPT;
 	}
 	ring->write_tail = ring_write_tail;
+
 	if (IS_HASWELL(dev))
 		ring->dispatch_execbuffer = hsw_ring_dispatch_execbuffer;
 	else if (IS_GEN8(dev))
@@ -2106,33 +2121,48 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 			ring->irq_put = gen8_ring_put_irq;
 			ring->dispatch_execbuffer =
 				gen8_ring_dispatch_execbuffer;
+			if (i915_semaphore_is_enabled(dev)) {
+				ring->semaphore.sync_to = gen6_ring_sync;
+				ring->semaphore.signal = gen6_signal;
+				/*
+				 * The current semaphore is only applied on
+				 * pre-gen8 platform.  And there is no VCS2 ring
+				 * on the pre-gen8 platform. So the semaphore
+				 * between VCS and VCS2 is initialized as
+				 * INVALID.  Gen8 will initialize the sema
+				 * between VCS2 and VCS later.
+				 */
+				ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
+				ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+				ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+				ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+				ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+			}
 		} else {
 			ring->irq_enable_mask = GT_BSD_USER_INTERRUPT;
 			ring->irq_get = gen6_ring_get_irq;
 			ring->irq_put = gen6_ring_put_irq;
 			ring->dispatch_execbuffer =
 				gen6_ring_dispatch_execbuffer;
-		}
-		if (i915_semaphore_is_enabled(dev)) {
-			ring->semaphore.sync_to = gen6_ring_sync;
-			ring->semaphore.signal = gen6_signal;
-			/*
-			 * The current semaphore is only applied on pre-gen8
-			 * platform.  And there is no VCS2 ring on the pre-gen8
-			 * platform. So the semaphore between VCS and VCS2 is
-			 * initialized as INVALID.  Gen8 will initialize the
-			 * sema between VCS2 and VCS later.
-			 */
-			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
-			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
-			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
-			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
-			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
-			ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
-			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+			if (i915_semaphore_is_enabled(dev)) {
+				ring->semaphore.sync_to = gen6_ring_sync;
+				ring->semaphore.signal = gen6_signal;
+				ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
+				ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
+				ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
+				ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+				ring->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
+				ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+				ring->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
+				ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
+				ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+			}
 		}
 	} else {
 		ring->mmio_base = BSD_RING_BASE;
@@ -2228,31 +2258,46 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 		ring->irq_get = gen8_ring_get_irq;
 		ring->irq_put = gen8_ring_put_irq;
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.signal = gen6_signal;
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		}
 	} else {
 		ring->irq_enable_mask = GT_BLT_USER_INTERRUPT;
 		ring->irq_get = gen6_ring_get_irq;
 		ring->irq_put = gen6_ring_put_irq;
 		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
-	}
-	if (i915_semaphore_is_enabled(dev)) {
-		ring->semaphore.signal = gen6_signal;
-		ring->semaphore.sync_to = gen6_ring_sync;
-		/*
-		 * The current semaphore is only applied on pre-gen8 platform.
-		 * And there is no VCS2 ring on the pre-gen8 platform. So the
-		 * semaphore between BCS and VCS2 is initialized as INVALID.
-		 * Gen8 will initialize the sema between BCS and VCS2 later.
-		 */
-		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
-		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
-		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
-		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
-		ring->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
-		ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
-		ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
-		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.signal = gen6_signal;
+			ring->semaphore.sync_to = gen6_ring_sync;
+			/*
+			 * The current semaphore is only applied on pre-gen8
+			 * platform.  And there is no VCS2 ring on the pre-gen8
+			 * platform. So the semaphore between BCS and VCS2 is
+			 * initialized as INVALID.  Gen8 will initialize the
+			 * sema between BCS and VCS2 later.
+			 */
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
+			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
+			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		}
 	}
 	ring->init = init_ring_common;
 
@@ -2280,25 +2325,39 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 		ring->irq_get = gen8_ring_get_irq;
 		ring->irq_put = gen8_ring_put_irq;
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.signal = gen6_signal;
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		}
 	} else {
 		ring->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
 		ring->irq_get = hsw_vebox_get_irq;
 		ring->irq_put = hsw_vebox_put_irq;
 		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
-	}
-	if (i915_semaphore_is_enabled(dev)) {
-		ring->semaphore.sync_to = gen6_ring_sync;
-		ring->semaphore.signal = gen6_signal;
-		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
-		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
-		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
-		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-		ring->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
-		ring->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
-		ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
-		ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
-		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		if (i915_semaphore_is_enabled(dev)) {
+			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.signal = gen6_signal;
+			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
+			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
+			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
+			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
+			ring->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
+			ring->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
+			ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
+			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+		}
 	}
 	ring->init = init_ring_common;
 
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 06/13] drm/i915/bdw: implement semaphore signal
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
                   ` (4 preceding siblings ...)
  2014-04-29 21:52 ` [PATCH 05/13] drm/i915: gen specific ring init Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-04-29 21:52 ` [PATCH 07/13] drm/i915/bdw: implement semaphore wait Ben Widawsky
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

Semaphore signalling works similarly to previous GENs with the exception
that the per ring mailboxes no longer exist. Instead you must define
your own space, somewhere in the GTT.

The comments in the code define the layout I've opted for, which should
be fairly future-proof. I.e., I tried to define offsets in abstract terms
(NUM_RINGS, seqno size, etc).
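
As a rough illustration of what "offsets in abstract terms" can look like,
here is a minimal sketch. It is not the macro from this patch (that lives in
intel_ringbuffer.h). The names SKETCH_NUM_RINGS, SKETCH_SEQNO_SIZE and
sketch_semaphore_offset() are hypothetical, and the 8-byte slot size is an
assumption chosen to match the QW write used by the render ring.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for illustration only. */
#define SKETCH_NUM_RINGS  5u                /* assumed: RCS, VCS, BCS, VECS, VCS2 */
#define SKETCH_SEQNO_SIZE sizeof(uint64_t)  /* assumed slot size (one qword) */

/*
 * Byte offset, relative to the start of the semaphore object, of the slot
 * that ring `signaller` writes and ring `waiter` polls. Each signaller owns
 * a contiguous row of SKETCH_NUM_RINGS slots, so nothing in the formula
 * depends on a hard-coded ring count or seqno width.
 */
static inline uint64_t sketch_semaphore_offset(unsigned int signaller,
					       unsigned int waiter)
{
	assert(signaller < SKETCH_NUM_RINGS && waiter < SKETCH_NUM_RINGS);
	return ((uint64_t)signaller * SKETCH_NUM_RINGS + waiter) *
	       SKETCH_SEQNO_SIZE;
}
```

Under these assumptions the whole table takes 5 * 5 * 8 = 200 bytes, which
fits comfortably in the single 4k object.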

NOTE: If one wanted to move this to the HWSP they could. I've decided
one 4k object would be easier to deal with, and provide potential wins
with cache locality, but that's all speculative.
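
For reference, the ring-space accounting in the two gen8 signal functions
below can be sketched as follows. The helper names are hypothetical, and
sketch_hweight32() merely stands in for the kernel's hweight32(). The
per-waiter sizes come from the diff: 8 dwords on the render ring
(PIPE_CONTROL plus MI_SEMAPHORE_SIGNAL) and 6 dwords on the other rings
(MI_FLUSH_DW plus MI_SEMAPHORE_SIGNAL).

```c
#include <assert.h>
#include <stdint.h>

/* Population count, standing in for the kernel's hweight32(). */
static unsigned int sketch_hweight32(uint32_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)	/* clear the lowest set bit */
		n++;
	return n;
}

/*
 * Extra dwords a signaller must reserve: one fixed-size packet for every
 * ring other than itself. This assumes the signaller's own slot is the
 * only one skipped in the emit loop.
 */
static unsigned int sketch_signal_dwords(uint32_t ring_mask,
					 unsigned int dwords_per_waiter)
{
	unsigned int num_rings = sketch_hweight32(ring_mask);

	assert(num_rings >= 1);
	return (num_rings - 1) * dwords_per_waiter;
}
```

Because both packet sizes are even, the total is always a multiple of two
dwords, so no padding NOOP is needed here, unlike the 3-dword gen6 mailbox
update.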

v2: Update the macro to not need the other ring's ring->id (Chris)
Update the comment to use the correct formula (Chris)

v3: Move the macros to intel_ringbuffer.h to prevent churn in the next patch
(Ville)

v4: Fixed a compilation failure caused by a rebase conflict with:
commit 1ec9e26ddab06459e89a890431b2de064c5d1056
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Feb 14 14:01:11 2014 +0100

    drm/i915: Consolidate binding parameters into flags

v5: VCS2 rebase
Replace hweight_long with hweight32

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         |   1 +
 drivers/gpu/drm/i915/i915_reg.h         |   5 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 166 ++++++++++++++++++++++----------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  78 +++++++++++++--
 4 files changed, 190 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 50dfc3a..44cb744 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1293,6 +1293,7 @@ struct drm_i915_private {
 
 	struct pci_dev *bridge_dev;
 	struct intel_ring_buffer ring[I915_NUM_RINGS];
+	struct drm_i915_gem_object *semaphore_obj;
 	uint32_t last_seqno, next_seqno;
 
 	drm_dma_handle_t *status_page_dmah;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 0eff337..8e6ec03 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -229,7 +229,7 @@
 #define   MI_DISPLAY_FLIP_IVB_SPRITE_B (3 << 19)
 #define   MI_DISPLAY_FLIP_IVB_PLANE_C  (4 << 19)
 #define   MI_DISPLAY_FLIP_IVB_SPRITE_C (5 << 19)
-#define MI_SEMAPHORE_MBOX	MI_INSTR(0x16, 1) /* gen6+ */
+#define MI_SEMAPHORE_MBOX	MI_INSTR(0x16, 1) /* gen6, gen7 */
 #define   MI_SEMAPHORE_GLOBAL_GTT    (1<<22)
 #define   MI_SEMAPHORE_UPDATE	    (1<<21)
 #define   MI_SEMAPHORE_COMPARE	    (1<<20)
@@ -255,6 +255,8 @@
 #define   MI_RESTORE_EXT_STATE_EN	(1<<2)
 #define   MI_FORCE_RESTORE		(1<<1)
 #define   MI_RESTORE_INHIBIT		(1<<0)
+#define MI_SEMAPHORE_SIGNAL	MI_INSTR(0x1b, 0) /* GEN8+ */
+#define   MI_SEMAPHORE_TARGET(engine)	((engine)<<15)
 #define MI_STORE_DWORD_IMM	MI_INSTR(0x20, 1)
 #define   MI_MEM_VIRTUAL	(1 << 22) /* 965+ only */
 #define MI_STORE_DWORD_INDEX	MI_INSTR(0x21, 1)
@@ -349,6 +351,7 @@
 #define   PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE		(1<<10) /* GM45+ only */
 #define   PIPE_CONTROL_INDIRECT_STATE_DISABLE		(1<<9)
 #define   PIPE_CONTROL_NOTIFY				(1<<8)
+#define   PIPE_CONTROL_FLUSH_ENABLE			(1<<7) /* gen7+ */
 #define   PIPE_CONTROL_VF_CACHE_INVALIDATE		(1<<4)
 #define   PIPE_CONTROL_CONST_CACHE_INVALIDATE		(1<<3)
 #define   PIPE_CONTROL_STATE_CACHE_INVALIDATE		(1<<2)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index f2bae6f..03324b2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -650,6 +650,13 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 static void render_ring_cleanup(struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (dev_priv->semaphore_obj) {
+		i915_gem_object_ggtt_unpin(dev_priv->semaphore_obj);
+		drm_gem_object_unreference(&dev_priv->semaphore_obj->base);
+		dev_priv->semaphore_obj = NULL;
+	}
 
 	if (ring->scratch.obj == NULL)
 		return;
@@ -663,6 +670,85 @@ static void render_ring_cleanup(struct intel_ring_buffer *ring)
 	ring->scratch.obj = NULL;
 }
 
+static int gen8_rcs_signal(struct intel_ring_buffer *signaller,
+			   unsigned int num_dwords)
+{
+#define MBOX_UPDATE_DWORDS 8
+	struct drm_device *dev = signaller->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_ring_buffer *waiter;
+	int i, ret, num_rings;
+
+	num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
+	num_dwords += (num_rings-1) * MBOX_UPDATE_DWORDS;
+#undef MBOX_UPDATE_DWORDS
+
+	ret = intel_ring_begin(signaller, num_dwords);
+	if (ret)
+		return ret;
+
+	for_each_ring(waiter, dev_priv, i) {
+		u64 gtt_offset = signaller->semaphore.signal_ggtt[i];
+		if (gtt_offset == MI_SEMAPHORE_SYNC_INVALID)
+			continue;
+
+		intel_ring_emit(signaller, GFX_OP_PIPE_CONTROL(6));
+		intel_ring_emit(signaller, PIPE_CONTROL_GLOBAL_GTT_IVB |
+					   PIPE_CONTROL_QW_WRITE |
+					   PIPE_CONTROL_FLUSH_ENABLE);
+		intel_ring_emit(signaller, lower_32_bits(gtt_offset));
+		intel_ring_emit(signaller, upper_32_bits(gtt_offset));
+		intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
+		intel_ring_emit(signaller, 0);
+		intel_ring_emit(signaller, MI_SEMAPHORE_SIGNAL |
+					   MI_SEMAPHORE_TARGET(waiter->id));
+		intel_ring_emit(signaller, 0);
+	}
+
+	WARN_ON(i != num_rings);
+
+	return 0;
+}
+
+static int gen8_xcs_signal(struct intel_ring_buffer *signaller,
+			   unsigned int num_dwords)
+{
+#define MBOX_UPDATE_DWORDS 6
+	struct drm_device *dev = signaller->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_ring_buffer *waiter;
+	int i, ret, num_rings;
+
+	num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
+	num_dwords = (num_rings-1) * MBOX_UPDATE_DWORDS;
+#undef MBOX_UPDATE_DWORDS
+
+	/* XXX: + 4 for the caller */
+	ret = intel_ring_begin(signaller, num_dwords + 4);
+	if (ret)
+		return ret;
+
+	for_each_ring(waiter, dev_priv, i) {
+		u64 gtt_offset = signaller->semaphore.signal_ggtt[i];
+		if (gtt_offset == MI_SEMAPHORE_SYNC_INVALID)
+			continue;
+
+		intel_ring_emit(signaller, (MI_FLUSH_DW + 1) |
+					   MI_FLUSH_DW_OP_STOREDW);
+		intel_ring_emit(signaller, lower_32_bits(gtt_offset) |
+					   MI_FLUSH_DW_USE_GTT);
+		intel_ring_emit(signaller, upper_32_bits(gtt_offset));
+		intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
+		intel_ring_emit(signaller, MI_SEMAPHORE_SIGNAL |
+					   MI_SEMAPHORE_TARGET(waiter->id));
+		intel_ring_emit(signaller, 0);
+	}
+
+	WARN_ON(i != num_rings);
+
+	return 0;
+}
+
 static int gen6_signal(struct intel_ring_buffer *signaller,
 		       unsigned int num_dwords)
 {
@@ -1904,12 +1990,30 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
+	struct drm_i915_gem_object *obj;
+	int ret;
 
 	ring->name = "render ring";
 	ring->id = RCS;
 	ring->mmio_base = RENDER_RING_BASE;
 
 	if (INTEL_INFO(dev)->gen >= 8) {
+		if (i915_semaphore_is_enabled(dev)) {
+			obj = i915_gem_alloc_object(dev, 4096);
+			if (obj == NULL) {
+				DRM_ERROR("Failed to allocate semaphore bo. Disabling semaphores\n");
+				i915.semaphores = 0;
+			} else {
+				i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+				ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_NONBLOCK);
+				if (ret != 0) {
+					drm_gem_object_unreference(&obj->base);
+					DRM_ERROR("Failed to pin semaphore bo. Disabling semaphores\n");
+					i915.semaphores = 0;
+				} else
+					dev_priv->semaphore_obj = obj;
+			}
+		}
 		ring->add_request = gen6_add_request;
 		ring->flush = gen8_render_ring_flush;
 		ring->irq_get = gen8_ring_get_irq;
@@ -1918,16 +2022,10 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
+			BUG_ON(!dev_priv->semaphore_obj);
 			ring->semaphore.sync_to = gen6_ring_sync;
-			ring->semaphore.signal = gen6_signal;
-			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
+			ring->semaphore.signal = gen8_rcs_signal;
+			GEN8_RING_SEMAPHORE_INIT;
 		}
 	} else if (INTEL_INFO(dev)->gen >= 6) {
 		ring->add_request = gen6_add_request;
@@ -2005,9 +2103,6 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 
 	/* Workaround batchbuffer to combat CS tlb bug. */
 	if (HAS_BROKEN_CS_TLB(dev)) {
-		struct drm_i915_gem_object *obj;
-		int ret;
-
 		obj = i915_gem_alloc_object(dev, I830_BATCH_LIMIT);
 		if (obj == NULL) {
 			DRM_ERROR("Failed to allocate batch bo\n");
@@ -2123,25 +2218,8 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 				gen8_ring_dispatch_execbuffer;
 			if (i915_semaphore_is_enabled(dev)) {
 				ring->semaphore.sync_to = gen6_ring_sync;
-				ring->semaphore.signal = gen6_signal;
-				/*
-				 * The current semaphore is only applied on
-				 * pre-gen8 platform.  And there is no VCS2 ring
-				 * on the pre-gen8 platform. So the semaphore
-				 * between VCS and VCS2 is initialized as
-				 * INVALID.  Gen8 will initialize the sema
-				 * between VCS2 and VCS later.
-				 */
-				ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-				ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-				ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-				ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-				ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-				ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
-				ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
-				ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
-				ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
-				ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+				ring->semaphore.signal = gen8_xcs_signal;
+				GEN8_RING_SEMAPHORE_INIT;
 			}
 		} else {
 			ring->irq_enable_mask = GT_BSD_USER_INTERRUPT;
@@ -2260,17 +2338,8 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
 		if (i915_semaphore_is_enabled(dev)) {
 			ring->semaphore.sync_to = gen6_ring_sync;
-			ring->semaphore.signal = gen6_signal;
-			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+			ring->semaphore.signal = gen8_xcs_signal;
+			GEN8_RING_SEMAPHORE_INIT;
 		}
 	} else {
 		ring->irq_enable_mask = GT_BLT_USER_INTERRUPT;
@@ -2327,17 +2396,8 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
 		if (i915_semaphore_is_enabled(dev)) {
 			ring->semaphore.sync_to = gen6_ring_sync;
-			ring->semaphore.signal = gen6_signal;
-			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
-			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
-			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
+			ring->semaphore.signal = gen8_xcs_signal;
+			GEN8_RING_SEMAPHORE_INIT;
 		}
 	} else {
 		ring->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 0fdf030..d7de09b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -36,6 +36,32 @@ struct  intel_hw_status_page {
 #define I915_READ_MODE(ring) I915_READ(RING_MI_MODE((ring)->mmio_base))
 #define I915_WRITE_MODE(ring, val) I915_WRITE(RING_MI_MODE((ring)->mmio_base), val)
 
+/* seqno size is actually only a uint32, but since we plan to use MI_FLUSH_DW to
+ * do the writes, and that must have qw aligned offsets, simply pretend it's 8b.
+ */
+#define i915_semaphore_seqno_size sizeof(uint64_t)
+#define GEN8_SIGNAL_OFFSET(to) \
+	(i915_gem_obj_ggtt_offset(dev_priv->semaphore_obj) + \
+	(ring->id * I915_NUM_RINGS * i915_semaphore_seqno_size) + \
+	(i915_semaphore_seqno_size * (to)))
+
+#define GEN8_WAIT_OFFSET(from) \
+	(i915_gem_obj_ggtt_offset(dev_priv->semaphore_obj) + \
+	((from) * I915_NUM_RINGS * i915_semaphore_seqno_size) + \
+	(i915_semaphore_seqno_size * ring->id))
+
+#define GEN8_RING_SEMAPHORE_INIT do { \
+	if (!dev_priv->semaphore_obj) { \
+		break; \
+	} \
+	ring->semaphore.signal_ggtt[RCS] = GEN8_SIGNAL_OFFSET(RCS); \
+	ring->semaphore.signal_ggtt[VCS] = GEN8_SIGNAL_OFFSET(VCS); \
+	ring->semaphore.signal_ggtt[BCS] = GEN8_SIGNAL_OFFSET(BCS); \
+	ring->semaphore.signal_ggtt[VECS] = GEN8_SIGNAL_OFFSET(VECS); \
+	ring->semaphore.signal_ggtt[VCS2] = GEN8_SIGNAL_OFFSET(VCS2); \
+	ring->semaphore.signal_ggtt[ring->id] = MI_SEMAPHORE_SYNC_INVALID; \
+	} while(0)
+
 enum intel_ring_hangcheck_action {
 	HANGCHECK_IDLE = 0,
 	HANGCHECK_WAIT,
@@ -118,15 +144,55 @@ struct  intel_ring_buffer {
 #define I915_DISPATCH_PINNED 0x2
 	void		(*cleanup)(struct intel_ring_buffer *ring);
 
+	/* GEN8 signal/wait table - never trust comments!
+	 *	  signal to	signal to    signal to   signal to      signal to
+	 *	    RCS		   VCS          BCS        VECS		 VCS2
+	 *      --------------------------------------------------------------------
+	 *  RCS | NOP (0x00) | VCS (0x08) | BCS (0x10) | VECS (0x18) | VCS2 (0x20) |
+	 *	|-------------------------------------------------------------------
+	 *  VCS | RCS (0x28) | NOP (0x30) | BCS (0x38) | VECS (0x40) | VCS2 (0x48) |
+	 *	|-------------------------------------------------------------------
+	 *  BCS | RCS (0x50) | VCS (0x58) | NOP (0x60) | VECS (0x68) | VCS2 (0x70) |
+	 *	|-------------------------------------------------------------------
+	 * VECS | RCS (0x78) | VCS (0x80) | BCS (0x88) |  NOP (0x90) | VCS2 (0x98) |
+	 *	|-------------------------------------------------------------------
+	 * VCS2 | RCS (0xa0) | VCS (0xa8) | BCS (0xb0) | VECS (0xb8) |  NOP (0xc0) |
+	 *	|-------------------------------------------------------------------
+	 *
+	 * Generalization:
+	 *  f(x, y) := (x->id * NUM_RINGS * seqno_size) + (seqno_size * y->id)
+	 *  ie. transpose of g(x, y)
+	 *
+	 *	 sync from	sync from    sync from    sync from	sync from
+	 *	    RCS		   VCS          BCS        VECS		 VCS2
+	 *      --------------------------------------------------------------------
+	 *  RCS | NOP (0x00) | VCS (0x28) | BCS (0x50) | VECS (0x78) | VCS2 (0xa0) |
+	 *	|-------------------------------------------------------------------
+	 *  VCS | RCS (0x08) | NOP (0x30) | BCS (0x58) | VECS (0x80) | VCS2 (0xa8) |
+	 *	|-------------------------------------------------------------------
+	 *  BCS | RCS (0x10) | VCS (0x38) | NOP (0x60) | VECS (0x88) | VCS2 (0xb0) |
+	 *	|-------------------------------------------------------------------
+	 * VECS | RCS (0x18) | VCS (0x40) | BCS (0x68) |  NOP (0x90) | VCS2 (0xb8) |
+	 *	|-------------------------------------------------------------------
+	 * VCS2 | RCS (0x20) | VCS (0x48) | BCS (0x70) | VECS (0x98) |  NOP (0xc0) |
+	 *	|-------------------------------------------------------------------
+	 *
+	 * Generalization:
+	 *  g(x, y) := (y->id * NUM_RINGS * seqno_size) + (seqno_size * x->id)
+	 *  ie. transpose of f(x, y)
+	 */
 	struct {
 		u32	sync_seqno[I915_NUM_RINGS-1];
 
-		struct {
-			/* our mbox written by others */
-			u32		wait[I915_NUM_RINGS];
-			/* mboxes this ring signals to */
-			u32		signal[I915_NUM_RINGS];
-		} mbox;
+		union {
+			struct {
+				/* our mbox written by others */
+				u32		wait[I915_NUM_RINGS];
+				/* mboxes this ring signals to */
+				u32		signal[I915_NUM_RINGS];
+			} mbox;
+			u64		signal_ggtt[I915_NUM_RINGS];
+		};
 
 		/* AKA wait() */
 		int	(*sync_to)(struct intel_ring_buffer *ring,
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 07/13] drm/i915/bdw: implement semaphore wait
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
                   ` (5 preceding siblings ...)
  2014-04-29 21:52 ` [PATCH 06/13] drm/i915/bdw: implement semaphore signal Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-04-29 21:52 ` [PATCH 08/13] drm/i915: Implement MI decode for gen8 Ben Widawsky
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

Semaphore waits use a new instruction, MI_SEMAPHORE_WAIT. The location
of the seqno to wait on is defined by the table in the previous patch.
Otherwise the code is the same as the semaphore synchronization code of
previous gens.
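
A minimal sketch of the offset arithmetic behind that table, assuming
five rings and 8-byte slots as in the previous patch. signal_offset()
and wait_offset() are illustrative stand-ins for GEN8_SIGNAL_OFFSET and
GEN8_WAIT_OFFSET with the GGTT base left out; they are not part of the
driver. The point is that the slot a waiter reads is the slot the
signaller writes:

```c
#include <assert.h>
#include <stdint.h>

enum ring_id { RCS, VCS, BCS, VECS, VCS2, NUM_RINGS };
#define SEQNO_SIZE sizeof(uint64_t)	/* slots are padded to a qword */

/* Slot that ring `from` writes when it signals ring `to`. */
static uint64_t signal_offset(int from, int to)
{
	return from * NUM_RINGS * SEQNO_SIZE + SEQNO_SIZE * to;
}

/* Slot that ring `waiter` reads when it waits on ring `from`. */
static uint64_t wait_offset(int waiter, int from)
{
	return from * NUM_RINGS * SEQNO_SIZE + SEQNO_SIZE * waiter;
}

/* Returns 1 if every wait slot matches the corresponding signal slot. */
static int slots_are_transposed(void)
{
	for (int s = 0; s < NUM_RINGS; s++)
		for (int w = 0; w < NUM_RINGS; w++)
			if (wait_offset(w, s) != signal_offset(s, w))
				return 0;
	return 1;
}
```

With these definitions, VCS waiting on RCS reads offset 0x08, which is
the slot RCS writes when it signals VCS.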

v2: Update macros to not require the other ring's ring->id (Chris)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_reg.h         |  3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 33 +++++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 ++--
 3 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 8e6ec03..712ae05 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -257,6 +257,9 @@
 #define   MI_RESTORE_INHIBIT		(1<<0)
 #define MI_SEMAPHORE_SIGNAL	MI_INSTR(0x1b, 0) /* GEN8+ */
 #define   MI_SEMAPHORE_TARGET(engine)	((engine)<<15)
+#define MI_SEMAPHORE_WAIT	MI_INSTR(0x1c, 2) /* GEN8+ */
+#define   MI_SEMAPHORE_POLL		(1<<15)
+#define   MI_SEMAPHORE_SAD_GTE_SDD	(1<<12)
 #define MI_STORE_DWORD_IMM	MI_INSTR(0x20, 1)
 #define   MI_MEM_VIRTUAL	(1 << 22) /* 965+ only */
 #define MI_STORE_DWORD_INDEX	MI_INSTR(0x21, 1)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 03324b2..31b1f3c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -827,6 +827,31 @@ static inline bool i915_gem_has_seqno_wrapped(struct drm_device *dev,
  * @signaller - ring which has, or will signal
  * @seqno - seqno which the waiter will block on
  */
+
+static int
+gen8_ring_sync(struct intel_ring_buffer *waiter,
+	       struct intel_ring_buffer *signaller,
+	       u32 seqno)
+{
+	struct drm_i915_private *dev_priv = waiter->dev->dev_private;
+	int ret;
+
+	ret = intel_ring_begin(waiter, 4);
+	if (ret)
+		return ret;
+
+	intel_ring_emit(waiter, MI_SEMAPHORE_WAIT |
+				MI_SEMAPHORE_GLOBAL_GTT |
+				MI_SEMAPHORE_SAD_GTE_SDD);
+	intel_ring_emit(waiter, seqno);
+	intel_ring_emit(waiter,
+			lower_32_bits(GEN8_WAIT_OFFSET(waiter, signaller->id)));
+	intel_ring_emit(waiter,
+			upper_32_bits(GEN8_WAIT_OFFSET(waiter, signaller->id)));
+	intel_ring_advance(waiter);
+	return 0;
+}
+
 static int
 gen6_ring_sync(struct intel_ring_buffer *waiter,
 	       struct intel_ring_buffer *signaller,
@@ -2023,7 +2048,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
 			BUG_ON(!dev_priv->semaphore_obj);
-			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.sync_to = gen8_ring_sync;
 			ring->semaphore.signal = gen8_rcs_signal;
 			GEN8_RING_SEMAPHORE_INIT;
 		}
@@ -2217,7 +2242,7 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 			ring->dispatch_execbuffer =
 				gen8_ring_dispatch_execbuffer;
 			if (i915_semaphore_is_enabled(dev)) {
-				ring->semaphore.sync_to = gen6_ring_sync;
+				ring->semaphore.sync_to = gen8_ring_sync;
 				ring->semaphore.signal = gen8_xcs_signal;
 				GEN8_RING_SEMAPHORE_INIT;
 			}
@@ -2337,7 +2362,7 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 		ring->irq_put = gen8_ring_put_irq;
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
 		if (i915_semaphore_is_enabled(dev)) {
-			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.sync_to = gen8_ring_sync;
 			ring->semaphore.signal = gen8_xcs_signal;
 			GEN8_RING_SEMAPHORE_INIT;
 		}
@@ -2395,7 +2420,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 		ring->irq_put = gen8_ring_put_irq;
 		ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
 		if (i915_semaphore_is_enabled(dev)) {
-			ring->semaphore.sync_to = gen6_ring_sync;
+			ring->semaphore.sync_to = gen8_ring_sync;
 			ring->semaphore.signal = gen8_xcs_signal;
 			GEN8_RING_SEMAPHORE_INIT;
 		}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index d7de09b..a7ff166 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -45,10 +45,10 @@ struct  intel_hw_status_page {
 	(ring->id * I915_NUM_RINGS * i915_semaphore_seqno_size) + \
 	(i915_semaphore_seqno_size * (to)))
 
-#define GEN8_WAIT_OFFSET(from) \
+#define GEN8_WAIT_OFFSET(__ring, from) \
 	(i915_gem_obj_ggtt_offset(dev_priv->semaphore_obj) + \
 	((from) * I915_NUM_RINGS * i915_semaphore_seqno_size) + \
-	(i915_semaphore_seqno_size * ring->id))
+	(i915_semaphore_seqno_size * (__ring)->id))
 
 #define GEN8_RING_SEMAPHORE_INIT do { \
 	if (!dev_priv->semaphore_obj) { \
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 08/13] drm/i915: Implement MI decode for gen8
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
                   ` (6 preceding siblings ...)
  2014-04-29 21:52 ` [PATCH 07/13] drm/i915/bdw: implement semaphore wait Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-04-30 11:21   ` Ville Syrjälä
  2014-04-29 21:52 ` [PATCH 09/13] drm/i915/bdw: poll semaphores Ben Widawsky
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@linux.intel.com>

This is needed to implement ipehr_is_semaphore_wait
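
A small sketch of the decode, under the assumption that MI_INSTR() puts
the opcode at bit 23 as in i915_reg.h. is_gen8_semaphore_wait() is a
hypothetical name used only for this illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror i915_reg.h: opcode at bit 23, flags/length below. */
#define MI_INSTR(opcode, flags)	(((uint32_t)(opcode) << 23) | (flags))
#define MI_SEMAPHORE_SIGNAL	MI_INSTR(0x1b, 0)
#define MI_SEMAPHORE_WAIT	MI_INSTR(0x1c, 2)

/* Hypothetical helper: true if the dword is an MI_SEMAPHORE_WAIT header. */
static bool is_gen8_semaphore_wait(uint32_t ipehr)
{
	/*
	 * Bits 31:29 (command type) are zero for MI commands, so the
	 * shifted value equals the opcode only for MI instructions.
	 * Flag bits below bit 23 are discarded by the shift.
	 */
	return (ipehr >> 23) == 0x1c;
}
```

The check only looks at the header dword, so it cannot tell which
address or seqno the wait refers to.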

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_irq.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2d76183..bfd21c7 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2561,12 +2561,9 @@ static bool
 ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
 {
 	if (INTEL_INFO(dev)->gen >= 8) {
-		/*
-		 * FIXME: gen8 semaphore support - currently we don't emit
-		 * semaphores on bdw anyway, but this needs to be addressed when
-		 * we merge that code.
-		 */
-		return false;
+		/* Broadwell's semaphore wait is 4 dwords. We hope IPEHR is the
+		 * first dword. */
+		return (ipehr >> 23) == 0x1c;
 	} else {
 		ipehr &= ~MI_SEMAPHORE_SYNC_MASK;
 		return ipehr == (MI_SEMAPHORE_MBOX | MI_SEMAPHORE_COMPARE |
@@ -2586,6 +2583,8 @@ semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
 		 * FIXME: gen8 semaphore support - currently we don't emit
 		 * semaphores on bdw anyway, but this needs to be addressed when
 		 * we merge that code.
+		 *
+		 * XXX: Gen8 needs more than just IPEHR.
 		 */
 		return NULL;
 	} else {
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 09/13] drm/i915/bdw: poll semaphores
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
                   ` (7 preceding siblings ...)
  2014-04-29 21:52 ` [PATCH 08/13] drm/i915: Implement MI decode for gen8 Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-04-30 10:53   ` Ville Syrjälä
  2014-04-29 21:52 ` [PATCH 10/13] drm/i915: Extract semaphore error collection Ben Widawsky
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

As Ville points out, it's possible/probable we don't actually need this.
Potentially, this validates the letter of the spec, and not the spirit.

Ville:
> I discussed this on irc w/ Ben, and I was suggesting we don't need to
> poll. Polling apparently can be used as a workaround for certain
> hardware issues, but it looks like those issues shouldn't affect us,
> for the moment at least. So my suggestion was to try w/o polling
> first (since there could be some power cost to polling) and add the
> poll bit if problems arise.
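
For reference, a sketch of the four-dword wait with the poll bit made
optional, which is one way to try both variants. build_semaphore_wait()
and its `poll` parameter are hypothetical and not part of this patch;
the MI_SEMAPHORE_GLOBAL_GTT value is assumed from i915_reg.h:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MI_INSTR(opcode, flags)		(((uint32_t)(opcode) << 23) | (flags))
#define MI_SEMAPHORE_WAIT		MI_INSTR(0x1c, 2)
#define MI_SEMAPHORE_GLOBAL_GTT		(1u << 22)	/* assumed value */
#define MI_SEMAPHORE_POLL		(1u << 15)
#define MI_SEMAPHORE_SAD_GTE_SDD	(1u << 12)

/* Hypothetical helper: fill the four dwords of a gen8 semaphore wait. */
static void build_semaphore_wait(uint32_t cmd[4], uint32_t seqno,
				 uint64_t ggtt_addr, bool poll)
{
	cmd[0] = MI_SEMAPHORE_WAIT | MI_SEMAPHORE_GLOBAL_GTT |
		 MI_SEMAPHORE_SAD_GTE_SDD;
	if (poll)
		cmd[0] |= MI_SEMAPHORE_POLL;
	cmd[1] = seqno;				/* value to compare against */
	cmd[2] = (uint32_t)ggtt_addr;		/* like lower_32_bits() */
	cmd[3] = (uint32_t)(ggtt_addr >> 32);	/* like upper_32_bits() */
}
```

The two variants differ only in bit 15 of the first dword, so switching
between them later is a one-line change.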

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 31b1f3c..e7748ef 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -842,6 +842,7 @@ gen8_ring_sync(struct intel_ring_buffer *waiter,
 
 	intel_ring_emit(waiter, MI_SEMAPHORE_WAIT |
 				MI_SEMAPHORE_GLOBAL_GTT |
+				MI_SEMAPHORE_POLL |
 				MI_SEMAPHORE_SAD_GTE_SDD);
 	intel_ring_emit(waiter, seqno);
 	intel_ring_emit(waiter,
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 10/13] drm/i915: Extract semaphore error collection
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
                   ` (8 preceding siblings ...)
  2014-04-29 21:52 ` [PATCH 09/13] drm/i915/bdw: poll semaphores Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-04-29 21:52 ` [PATCH 11/13] drm/i915/bdw: collect semaphore error state Ben Widawsky
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 2d81985..a7eaab2 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -744,6 +744,23 @@ static void i915_gem_record_fences(struct drm_device *dev,
 	}
 }
 
+
+static void gen6_record_semaphore_state(struct drm_i915_private *dev_priv,
+					struct intel_ring_buffer *ring,
+					struct drm_i915_error_ring *ering)
+{
+	ering->semaphore_mboxes[0] = I915_READ(RING_SYNC_0(ring->mmio_base));
+	ering->semaphore_mboxes[1] = I915_READ(RING_SYNC_1(ring->mmio_base));
+	ering->semaphore_seqno[0] = ring->semaphore.sync_seqno[0];
+	ering->semaphore_seqno[1] = ring->semaphore.sync_seqno[1];
+
+	if (HAS_VEBOX(dev_priv->dev)) {
+		ering->semaphore_mboxes[2] =
+			I915_READ(RING_SYNC_2(ring->mmio_base));
+		ering->semaphore_seqno[2] = ring->semaphore.sync_seqno[2];
+	}
+}
+
 static void i915_record_ring_state(struct drm_device *dev,
 				   struct intel_ring_buffer *ring,
 				   struct drm_i915_error_ring *ering)
@@ -753,18 +770,7 @@ static void i915_record_ring_state(struct drm_device *dev,
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ering->rc_psmi = I915_READ(ring->mmio_base + 0x50);
 		ering->fault_reg = I915_READ(RING_FAULT_REG(ring));
-		ering->semaphore_mboxes[0]
-			= I915_READ(RING_SYNC_0(ring->mmio_base));
-		ering->semaphore_mboxes[1]
-			= I915_READ(RING_SYNC_1(ring->mmio_base));
-		ering->semaphore_seqno[0] = ring->semaphore.sync_seqno[0];
-		ering->semaphore_seqno[1] = ring->semaphore.sync_seqno[1];
-	}
-
-	if (HAS_VEBOX(dev)) {
-		ering->semaphore_mboxes[2] =
-			I915_READ(RING_SYNC_2(ring->mmio_base));
-		ering->semaphore_seqno[2] = ring->semaphore.sync_seqno[2];
+		gen6_record_semaphore_state(dev_priv, ring, ering);
 	}
 
 	if (INTEL_INFO(dev)->gen >= 4) {
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 11/13] drm/i915/bdw: collect semaphore error state
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
                   ` (9 preceding siblings ...)
  2014-04-29 21:52 ` [PATCH 10/13] drm/i915: Extract semaphore error collection Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-04-29 21:52 ` [PATCH 12/13] drm/i915: semaphore debugfs Ben Widawsky
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

Since the semaphore information is in an object, just dump it, and let
the user parse it later.

NOTE: The page being used for the semaphores is incoherent with the
CPU. No matter what I do, I cannot find a way to read anything but
0s. The semaphore waits themselves do work.
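
For anyone parsing the dump later, a sketch of how a slot's dword index
inside the dumped page can be derived. It assumes a 4 KiB page, five
rings and 8-byte slots; signal_dword_index() and SKETCH_PAGE_SIZE are
made-up names for this illustration:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_RINGS		5	/* RCS, VCS, BCS, VECS, VCS2 */
#define SEQNO_SIZE		8	/* each slot is padded to a qword */
#define SKETCH_PAGE_SIZE	4096u	/* assumed 4 KiB page */

/*
 * Hypothetical helper: dword index, within the dumped page, of the slot
 * that ring `from` signals to ring `to`. ggtt_base is the GGTT offset
 * of the semaphore object.
 */
static uint32_t signal_dword_index(uint64_t ggtt_base, int from, int to)
{
	uint64_t addr = ggtt_base +
			(uint64_t)from * NUM_RINGS * SEQNO_SIZE +
			(uint64_t)SEQNO_SIZE * to;

	/* Keep only the offset inside the page, then bytes to dwords. */
	return (uint32_t)(addr & (SKETCH_PAGE_SIZE - 1)) / 4;
}
```

Since only the low dword of each 8-byte slot holds the seqno, the
index returned here always lands on an even dword.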

v2: Don't print signal, and wait (they should be the same). Instead,
print sync_seqno (Chris)

v3: Free the semaphore error object (Chris)

v4: Fix semaphore offset calculation during error state collection
(Ville)

v5: VCS2 rebase
Make semaphore object error capture coding style consistent (Ville)
Do the proper math for the signal offset (Ville)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         |  1 +
 drivers/gpu/drm/i915/i915_gpu_error.c   | 51 ++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/intel_ringbuffer.h | 14 ++++-----
 3 files changed, 55 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 44cb744..237faf3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -328,6 +328,7 @@ struct drm_i915_error_state {
 	u64 fence[I915_MAX_NUM_FENCES];
 	struct intel_overlay_error_state *overlay;
 	struct intel_display_error_state *display;
+	struct drm_i915_error_object *semaphore_obj;
 
 	struct drm_i915_error_ring {
 		bool valid;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index a7eaab2..50d2af8 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -326,6 +326,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 	struct drm_device *dev = error_priv->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_error_state *error = error_priv->error;
+	struct drm_i915_error_object *obj;
 	int i, j, offset, elt;
 	int max_hangcheck_score;
 
@@ -394,8 +395,6 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 				    error->pinned_bo_count[0]);
 
 	for (i = 0; i < ARRAY_SIZE(error->ring); i++) {
-		struct drm_i915_error_object *obj;
-
 		obj = error->ring[i].batchbuffer;
 		if (obj) {
 			err_puts(m, dev_priv->ring[i].name);
@@ -458,6 +457,18 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		}
 	}
 
+	if ((obj = error->semaphore_obj)) {
+		err_printf(m, "Semaphore page = 0x%08x\n", obj->gtt_offset);
+		for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
+			err_printf(m, "[%04x] %08x %08x %08x %08x\n",
+				   elt * 4,
+				   obj->pages[0][elt],
+				   obj->pages[0][elt+1],
+				   obj->pages[0][elt+2],
+				   obj->pages[0][elt+3]);
+		}
+	}
+
 	if (error->overlay)
 		intel_overlay_print_error_state(m, error->overlay);
 
@@ -528,6 +539,7 @@ static void i915_error_state_free(struct kref *error_ref)
 		kfree(error->ring[i].requests);
 	}
 
+	i915_error_object_free(error->semaphore_obj);
 	kfree(error->active_bo);
 	kfree(error->overlay);
 	kfree(error->display);
@@ -745,6 +757,33 @@ static void i915_gem_record_fences(struct drm_device *dev,
 }
 
 
+static void gen8_record_semaphore_state(struct drm_i915_private *dev_priv,
+					struct drm_i915_error_state *error,
+					struct intel_ring_buffer *ring,
+					struct drm_i915_error_ring *ering)
+{
+	struct intel_ring_buffer *useless;
+	int i;
+
+	if (!i915_semaphore_is_enabled(dev_priv->dev))
+		return;
+
+	if (!error->semaphore_obj)
+		error->semaphore_obj =
+			i915_error_object_create(dev_priv,
+						 dev_priv->semaphore_obj,
+						 &dev_priv->gtt.base);
+
+	for_each_ring(useless, dev_priv, i) {
+		u16 signal_offset =
+			(GEN8_SIGNAL_OFFSET(ring, i) & (PAGE_SIZE - 1)) / 4;
+		u32 *tmp = error->semaphore_obj->pages[0];
+
+		ering->semaphore_mboxes[i] = tmp[signal_offset];
+		ering->semaphore_seqno[i] = ring->semaphore.sync_seqno[i];
+	}
+}
+
 static void gen6_record_semaphore_state(struct drm_i915_private *dev_priv,
 					struct intel_ring_buffer *ring,
 					struct drm_i915_error_ring *ering)
@@ -762,6 +801,7 @@ static void gen6_record_semaphore_state(struct drm_i915_private *dev_priv,
 }
 
 static void i915_record_ring_state(struct drm_device *dev,
+				   struct drm_i915_error_state *error,
 				   struct intel_ring_buffer *ring,
 				   struct drm_i915_error_ring *ering)
 {
@@ -770,7 +810,10 @@ static void i915_record_ring_state(struct drm_device *dev,
 	if (INTEL_INFO(dev)->gen >= 6) {
 		ering->rc_psmi = I915_READ(ring->mmio_base + 0x50);
 		ering->fault_reg = I915_READ(RING_FAULT_REG(ring));
-		gen6_record_semaphore_state(dev_priv, ring, ering);
+		if (INTEL_INFO(dev)->gen >= 8)
+			gen8_record_semaphore_state(dev_priv, error, ring, ering);
+		else
+			gen6_record_semaphore_state(dev_priv, ring, ering);
 	}
 
 	if (INTEL_INFO(dev)->gen >= 4) {
@@ -897,7 +940,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
 
 		error->ring[i].valid = true;
 
-		i915_record_ring_state(dev, ring, &error->ring[i]);
+		i915_record_ring_state(dev, error, ring, &error->ring[i]);
 
 		error->ring[i].pid = -1;
 		request = i915_gem_find_active_request(ring);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index a7ff166..20af934 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -40,9 +40,9 @@ struct  intel_hw_status_page {
  * do the writes, and that must have qw aligned offsets, simply pretend it's 8b.
  */
 #define i915_semaphore_seqno_size sizeof(uint64_t)
-#define GEN8_SIGNAL_OFFSET(to) \
+#define GEN8_SIGNAL_OFFSET(__ring, to) \
 	(i915_gem_obj_ggtt_offset(dev_priv->semaphore_obj) + \
-	(ring->id * I915_NUM_RINGS * i915_semaphore_seqno_size) + \
+	((__ring)->id * I915_NUM_RINGS * i915_semaphore_seqno_size) + \
 	(i915_semaphore_seqno_size * (to)))
 
 #define GEN8_WAIT_OFFSET(__ring, from) \
@@ -54,11 +54,11 @@ struct  intel_hw_status_page {
 	if (!dev_priv->semaphore_obj) { \
 		break; \
 	} \
-	ring->semaphore.signal_ggtt[RCS] = GEN8_SIGNAL_OFFSET(RCS); \
-	ring->semaphore.signal_ggtt[VCS] = GEN8_SIGNAL_OFFSET(VCS); \
-	ring->semaphore.signal_ggtt[BCS] = GEN8_SIGNAL_OFFSET(BCS); \
-	ring->semaphore.signal_ggtt[VECS] = GEN8_SIGNAL_OFFSET(VECS); \
-	ring->semaphore.signal_ggtt[VCS2] = GEN8_SIGNAL_OFFSET(VCS2); \
+	ring->semaphore.signal_ggtt[RCS] = GEN8_SIGNAL_OFFSET(ring, RCS); \
+	ring->semaphore.signal_ggtt[VCS] = GEN8_SIGNAL_OFFSET(ring, VCS); \
+	ring->semaphore.signal_ggtt[BCS] = GEN8_SIGNAL_OFFSET(ring, BCS); \
+	ring->semaphore.signal_ggtt[VECS] = GEN8_SIGNAL_OFFSET(ring, VECS); \
+	ring->semaphore.signal_ggtt[VCS2] = GEN8_SIGNAL_OFFSET(ring, VCS2); \
 	ring->semaphore.signal_ggtt[ring->id] = MI_SEMAPHORE_SYNC_INVALID; \
 	} while(0)
 
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 12/13] drm/i915: semaphore debugfs
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
                   ` (10 preceding siblings ...)
  2014-04-29 21:52 ` [PATCH 11/13] drm/i915/bdw: collect semaphore error state Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-05-03  2:23   ` [PATCH 12.1/13] drm/i915: Small semaphore debugfs fixup Ben Widawsky
  2014-04-29 21:52 ` [PATCH 13/13] DONT_MERGE drm/i915: FORCE_RESTORE for gen8 semaphores Ben Widawsky
  2014-04-30 11:35 ` [PATCH 00/13] [REPOST] BDW Semaphores Ville Syrjälä
  13 siblings, 1 reply; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

Simple debugfs file to display the current state of semaphores. This is
useful if you want to see the state without hanging the GPU.

NOTE: This patch is optional to the series.

NOTE2: Like the GPU error state collection, the reads are currently
incoherent.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 70 +++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index cad175c..0e02901 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2332,6 +2332,75 @@ static int i915_display_info(struct seq_file *m, void *unused)
 	return 0;
 }
 
+static int i915_semaphore_status(struct seq_file *m, void *unused)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_ring_buffer *ring;
+	int i, j, ret;
+
+	if (!i915_semaphore_is_enabled(dev)) {
+		seq_puts(m, "Semaphores are disabled\n");
+		return 0;
+	}
+
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		return ret;
+
+	if (IS_BROADWELL(dev)) {
+		struct page *page;
+		uint64_t *seqno;
+
+		page = i915_gem_object_get_page(dev_priv->semaphore_obj, 0);
+
+		seqno = (uint64_t *)kmap_atomic(page);
+		for_each_ring(ring, dev_priv, i) {
+			uint64_t offset;
+
+			seq_printf(m, "%s\n", ring->name);
+
+			seq_puts(m, "  Last signal:");
+			for (j = 0; j < I915_NUM_RINGS; j++) {
+				offset = i * I915_NUM_RINGS + j;
+				seq_printf(m, "0x%08llx (0x%02llx) ",
+					   seqno[offset], offset * 8);
+			}
+			seq_putc(m, '\n');
+
+			seq_puts(m, "  Last wait:  ");
+			for (j = 0; j < I915_NUM_RINGS; j++) {
+				offset = i + (j * I915_NUM_RINGS);
+				seq_printf(m, "0x%08llx (0x%02llx) ",
+					   seqno[offset], offset * 8);
+			}
+			seq_putc(m, '\n');
+
+		}
+		kunmap_atomic(seqno);
+	} else {
+		seq_puts(m, "  Last signal:");
+		for_each_ring(ring, dev_priv, i)
+			for (j = 0; j < I915_NUM_RINGS; j++)
+				seq_printf(m, "0x%08x\n",
+					   I915_READ(ring->semaphore.mbox.signal[j]));
+		seq_putc(m, '\n');
+	}
+
+	seq_puts(m, "\nSync seqno:\n");
+	for_each_ring(ring, dev_priv, i) {
+		for (j = 0; j < I915_NUM_RINGS; j++) {
+			seq_printf(m, "  0x%08x ", ring->semaphore.sync_seqno[j]);
+		}
+		seq_putc(m, '\n');
+	}
+	seq_putc(m, '\n');
+
+	mutex_unlock(&dev->struct_mutex);
+	return 0;
+}
+
 struct pipe_crc_info {
 	const char *name;
 	struct drm_device *dev;
@@ -3778,6 +3847,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_pc8_status", i915_pc8_status, 0},
 	{"i915_power_domain_info", i915_power_domain_info, 0},
 	{"i915_display_info", i915_display_info, 0},
+	{"i915_semaphore_status", i915_semaphore_status, 0},
 };
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
-- 
1.9.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 13/13] DONT_MERGE drm/i915: FORCE_RESTORE for gen8 semaphores
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
                   ` (11 preceding siblings ...)
  2014-04-29 21:52 ` [PATCH 12/13] drm/i915: semaphore debugfs Ben Widawsky
@ 2014-04-29 21:52 ` Ben Widawsky
  2014-04-30  7:13   ` Chris Wilson
  2014-04-30 11:35 ` [PATCH 00/13] [REPOST] BDW Semaphores Ville Syrjälä
  13 siblings, 1 reply; 25+ messages in thread
From: Ben Widawsky @ 2014-04-29 21:52 UTC (permalink / raw)
  To: Intel GFX

This appears to not actually be needed on the current code. Just putting
it on the ML so we can point bug reports at it later.

As pointed out by Ville, the current code is "broken" since we do
FORCE_RESTORE, and RESTORE_INHIBIT on the same dword. Anecdotally, this
seems fine.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index f77b4c1..aa82fb4 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -661,6 +661,13 @@ static int do_switch(struct intel_ring_buffer *ring,
 	if (!to->is_initialized || i915_gem_context_is_default(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
 
+	/* When SW intends to use semaphore signaling between Command streamers,
+	 * it must avoid lite restores in HW by programming "Force Restore" bit
+	 * to ‘1’ in context descriptor during context submission
+	 */
+	if (IS_GEN8(ring->dev) && i915_semaphore_is_enabled(ring->dev))
+		hw_flags |= MI_FORCE_RESTORE;
+
 	ret = mi_set_context(ring, to, hw_flags);
 	if (ret)
 		goto unpin_out;
-- 
1.9.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 13/13] DONT_MERGE drm/i915: FORCE_RESTORE for gen8 semaphores
  2014-04-29 21:52 ` [PATCH 13/13] DONT_MERGE drm/i915: FORCE_RESTORE for gen8 semaphores Ben Widawsky
@ 2014-04-30  7:13   ` Chris Wilson
  2014-04-30 18:44     ` Ben Widawsky
  0 siblings, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2014-04-30  7:13 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Tue, Apr 29, 2014 at 02:52:40PM -0700, Ben Widawsky wrote:
> This appears to not actually be needed on the current code. Just putting
> it on the ML so we can point bug reports at it later.
> 
> As pointed out by Ville, the current code is "broken" since we do
> FORCE_RESTORE, and RESTORE_INHIBIT on the same dword. Anecdotally, this
> seems fine.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index f77b4c1..aa82fb4 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -661,6 +661,13 @@ static int do_switch(struct intel_ring_buffer *ring,
>  	if (!to->is_initialized || i915_gem_context_is_default(to))
>  		hw_flags |= MI_RESTORE_INHIBIT;
>  
> +	/* When SW intends to use semaphore signaling between Command streamers,
> +	 * it must avoid lite restores in HW by programming "Force Restore" bit
> +	 * to ‘1’ in context descriptor during context submission
> +	 */
> +	if (IS_GEN8(ring->dev) && i915_semaphore_is_enabled(ring->dev))
> +		hw_flags |= MI_FORCE_RESTORE;

Is it not an error to set both FORCE and INHIBIT?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 09/13] drm/i915/bdw: poll semaphores
  2014-04-29 21:52 ` [PATCH 09/13] drm/i915/bdw: poll semaphores Ben Widawsky
@ 2014-04-30 10:53   ` Ville Syrjälä
  0 siblings, 0 replies; 25+ messages in thread
From: Ville Syrjälä @ 2014-04-30 10:53 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Tue, Apr 29, 2014 at 02:52:36PM -0700, Ben Widawsky wrote:
> As Ville points out, it's possible/probable we don't actually need this.
> Potentially, this validates the letter of the spec, and not the spirit.
> 
> Ville:
> > I discussed this on irc w/ Ben, and I was suggesting we don't need to
> > poll. Polling apparently can be used as a workaround for certain
> > hardware issues, but it looks like those issues shouldn't affect us,
> > for the moment at least. So my suggestion was to try w/o polling
> > first (since there could be some power cost to polling) and add the
> > poll bit if problems arise.

I had another look at bspec and there seems to be a bit more text about
the signal mode stuff since the last time I looked. But it still looks
like signal mode should be OK for production hardware.

One of the workarounds says that !RCS rings can lose the PPGTT
page directory information when becoming idle while waiting for a
semaphore. The workaround is either reprogramming the PPGTT
after semaphore wait or disabling the idle message. We already
disable the idle message for RCS for some other reason but not
for the other rings. Doing that for all rings seems like the easier
option here. This seems to apply to production hardware.

There's another idle message disable w/a (using another bit which
maybe disables more rc states?) but that seems to affect only
pre-production hardware. The other option here is to use
polling instead.

> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 31b1f3c..e7748ef 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -842,6 +842,7 @@ gen8_ring_sync(struct intel_ring_buffer *waiter,
>  
>  	intel_ring_emit(waiter, MI_SEMAPHORE_WAIT |
>  				MI_SEMAPHORE_GLOBAL_GTT |
> +				MI_SEMAPHORE_POLL |
>  				MI_SEMAPHORE_SAD_GTE_SDD);
>  	intel_ring_emit(waiter, seqno);
>  	intel_ring_emit(waiter,
> -- 
> 1.9.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 08/13] drm/i915: Implement MI decode for gen8
  2014-04-29 21:52 ` [PATCH 08/13] drm/i915: Implement MI decode for gen8 Ben Widawsky
@ 2014-04-30 11:21   ` Ville Syrjälä
  2014-05-07 16:59     ` Ben Widawsky
  0 siblings, 1 reply; 25+ messages in thread
From: Ville Syrjälä @ 2014-04-30 11:21 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Ben Widawsky, Intel GFX

On Tue, Apr 29, 2014 at 02:52:35PM -0700, Ben Widawsky wrote:
> From: Ben Widawsky <benjamin.widawsky@linux.intel.com>
> 
> This is needed to implement ipehr_is_semaphore_wait
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 2d76183..bfd21c7 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2561,12 +2561,9 @@ static bool
>  ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
>  {
>  	if (INTEL_INFO(dev)->gen >= 8) {
> -		/*
> -		 * FIXME: gen8 semaphore support - currently we don't emit
> -		 * semaphores on bdw anyway, but this needs to be addressed when
> -		 * we merge that code.
> -		 */
> -		return false;
> +		/* Broadwell's semaphore wait is 3 dwords. We hope IPEHR is the
> +		 * first dword. */
> +		return (ipehr >> 23) == 0x1c;
>  	} else {
>  		ipehr &= ~MI_SEMAPHORE_SYNC_MASK;
>  		return ipehr == (MI_SEMAPHORE_MBOX | MI_SEMAPHORE_COMPARE |
> @@ -2586,6 +2583,8 @@ semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
>  		 * FIXME: gen8 semaphore support - currently we don't emit
>  		 * semaphores on bdw anyway, but this needs to be addressed when
>  		 * we merge that code.
> +		 *
> +		 * XXX: Gen8 needs more than just IPEHR.
>  		 */

I believe something like this should take care of the remaining gap.

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2446e61..cd1069e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2590,19 +2590,21 @@ ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
 }
 
 static struct intel_ring_buffer *
-semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
+semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring,
+				 u32 ipehr, u64 offset)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct intel_ring_buffer *signaller;
 	int i;
 
 	if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
-		/*
-		 * FIXME: gen8 semaphore support - currently we don't emit
-		 * semaphores on bdw anyway, but this needs to be addressed when
-		 * we merge that code.
-		 */
-		return NULL;
+		for_each_ring(signaller, dev_priv, i) {
+			if (ring == signaller)
+				continue;
+
+			if (offset == signaller->semaphore.signal_gtt[ring->id])
+				return signaller;
+		}
 	} else {
 		u32 sync_bits = ipehr & MI_SEMAPHORE_SYNC_MASK;
 
@@ -2627,6 +2629,7 @@ semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	u32 cmd, ipehr, head;
+	u64 offset = 0;
 	int i;
 
 	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
@@ -2662,7 +2665,12 @@ semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
 		return NULL;
 
 	*seqno = ioread32(ring->virtual_start + head + 4) + 1;
-	return semaphore_wait_to_signaller_ring(ring, ipehr);
+	if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
+		offset = ioread32(ring->virtual_start + head + 12);
+		offset <<= 32;
+		offset |= ioread32(ring->virtual_start + head + 8);
+	}
+	return semaphore_wait_to_signaller_ring(ring, ipehr, offset);
 }
 
 static int semaphore_passed(struct intel_ring_buffer *ring)
-- 
1.8.3.2
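
For readers following the decode logic in the patch above, here is a minimal standalone sketch of the same idea: check the opcode in the first dword, then rebuild the 64-bit semaphore address from the two address dwords and look it up against each other ring's signal slot. The names gen8_decode_semaphore_wait(), find_signaller(), struct decoded_wait and the NUM_RINGS table are hypothetical stand-ins, not driver API, and the dword layout (opcode in the bits above bit 23, seqno at dword 1, low/high address at dwords 2 and 3) is assumed from the offsets used in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the MI opcode sits above bit 23 of the first dword, matching
 * the (ipehr >> 23) == 0x1c test in the patch under review. */
#define MI_OPCODE_SHIFT               23
#define GEN8_MI_SEMAPHORE_WAIT_OPCODE 0x1c

#define NUM_RINGS 5	/* hypothetical: RCS, VCS, BCS, VECS, VCS2 */

struct decoded_wait {
	uint32_t seqno;		/* dword 1: value the waiter compares against */
	uint64_t offset;	/* dwords 2 and 3: GTT address being polled */
};

/* Decode a semaphore wait starting at cmd[0]; returns false if cmd[0]
 * is not a semaphore wait. The raw seqno is returned unmodified. */
static bool gen8_decode_semaphore_wait(const uint32_t *cmd,
				       struct decoded_wait *out)
{
	if ((cmd[0] >> MI_OPCODE_SHIFT) != GEN8_MI_SEMAPHORE_WAIT_OPCODE)
		return false;

	out->seqno = cmd[1];
	/* High dword first, then OR in the low dword. */
	out->offset = ((uint64_t)cmd[3] << 32) | cmd[2];
	return true;
}

/* signal_gtt[signaller][waiter] is the address the signaller writes for
 * that waiter. Returns the signaller index, or -1 if nothing matches. */
static int find_signaller(uint64_t signal_gtt[NUM_RINGS][NUM_RINGS],
			  int waiter, uint64_t offset)
{
	for (int i = 0; i < NUM_RINGS; i++) {
		if (i == waiter)
			continue;	/* a ring never waits on itself */
		if (signal_gtt[i][waiter] == offset)
			return i;
	}
	return -1;
}
```

The lookup is a linear scan because the number of rings is tiny; nothing here depends on how the real driver allocates the semaphore page.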


>  		return NULL;
>  	} else {
> -- 
> 1.9.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 00/13] [REPOST] BDW Semaphores
  2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
                   ` (12 preceding siblings ...)
  2014-04-29 21:52 ` [PATCH 13/13] DONT_MERGE drm/i915: FORCE_RESTORE for gen8 semaphores Ben Widawsky
@ 2014-04-30 11:35 ` Ville Syrjälä
  13 siblings, 0 replies; 25+ messages in thread
From: Ville Syrjälä @ 2014-04-30 11:35 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Tue, Apr 29, 2014 at 02:52:27PM -0700, Ben Widawsky wrote:
> Okay, trying this again after the somewhat painful VCS2 rebase. I think I got
> to all of Ville's comments, but I could have missed a few. I apologize if so.
> 
> Daniel, even if you don't merge the whole series, the first few would really
> help rebase pain - though now that VCS2 is merged, there's probably not much
> other than execlists to be painful
> 
> The series is completely untested since the last rebase. I also didn't look
> really closely to make sure the rebase was correct - I'm just totally short on
> time atm. It was tested before that.

I think it looks all right, but I must admit to my eyes glazing over
when looking at the ring->semaphore... init parts.

I don't really like the GEN8_RING_SEMAPHORE_INIT macro. Not sure why
it's not a simple static function.

But anyway I think you can slap on
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
for patches 1-7 and 10-11 if they didn't have it already.

I left out the poll patch since it shouldn't be needed for production hw
AFAICS (I guess we can merge it if we add a comment there that we should
try to remove it once the early hardware is sufficiently phased out). The
force restore patch doesn't seem applicable for ring buffer mode, and I
didn't really look at the debugfs patch so far.

The missing bits seem to be one or two workarounds and the
semaphore_wait_to_signaller_ring() thing for which I tried to
cobble together something in my reply.

> 
> Ben Widawsky (13):
>   drm/i915: Move semaphore specific ring members to struct
>   drm/i915: Virtualize the ringbuffer signal func
>   drm/i915: Move ring_begin to signal()
>   drm/i915: Make semaphore updates more precise
>   drm/i915: gen specific ring init
>   drm/i915/bdw: implement semaphore signal
>   drm/i915/bdw: implement semaphore wait
>   drm/i915: Implement MI decode for gen8
>   drm/i915/bdw: poll semaphores
>   drm/i915: Extract semaphore error collection
>   drm/i915/bdw: collect semaphore error state
>   drm/i915: semaphore debugfs
>   DONT_MERGE drm/i915: FORCE_RESTORE for gen8 semaphores
> 
>  drivers/gpu/drm/i915/i915_debugfs.c     |  70 ++++++
>  drivers/gpu/drm/i915/i915_drv.h         |   2 +
>  drivers/gpu/drm/i915/i915_gem.c         |  10 +-
>  drivers/gpu/drm/i915/i915_gem_context.c |   7 +
>  drivers/gpu/drm/i915/i915_gpu_error.c   |  79 +++++--
>  drivers/gpu/drm/i915/i915_irq.c         |  14 +-
>  drivers/gpu/drm/i915/i915_reg.h         |   8 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 405 ++++++++++++++++++++++----------
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  90 ++++++-
>  9 files changed, 528 insertions(+), 157 deletions(-)
> 
> -- 
> 1.9.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 04/13] drm/i915: Make semaphore updates more precise
  2014-04-29 21:52 ` [PATCH 04/13] drm/i915: Make semaphore updates more precise Ben Widawsky
@ 2014-04-30 12:45   ` Daniel Vetter
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Vetter @ 2014-04-30 12:45 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Tue, Apr 29, 2014 at 02:52:31PM -0700, Ben Widawsky wrote:
> With the ring mask we now have an easy way to know the number of rings
> in the system, and therefore can accurately predict the number of dwords
> to emit for semaphore signalling. This was not possible (easily)
> previously.
> 
> There should be no functional impact, simply fewer instructions emitted.
> 
> While we're here, simply do the round up to 2 instead of the fancier
> rounding we did before, which rounded up per mbox, ie 4. This also
> allows us to drop the unnecessary MI_NOOP, so not really 4, 3.
> 
> v2: Use 3 dwords instead of 4 (Ville)
> Do the proper calculation to get the number of dwords to emit (Ville)
> Conditionally set .sync_to when semaphores are enabled (Ville)
> 
> v3: Rebased on VCS2
> Replace hweight_long with hweight32 (Ville)
> 
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 173 +++++++++++++++++---------------
>  1 file changed, 90 insertions(+), 83 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index e0c7bf2..7aedc0c 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -666,24 +666,19 @@ static void render_ring_cleanup(struct intel_ring_buffer *ring)
>  static int gen6_signal(struct intel_ring_buffer *signaller,
>  		       unsigned int num_dwords)
>  {
> +#define MBOX_UPDATE_DWORDS 3
>  	struct drm_device *dev = signaller->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_ring_buffer *useless;
> -	int i, ret;
> +	int i, ret, num_rings;
>  
> -	/* NB: In order to be able to do semaphore MBOX updates for varying
> -	 * number of rings, it's easiest if we round up each individual update
> -	 * to a multiple of 2 (since ring updates must always be a multiple of
> -	 * 2) even though the actual update only requires 3 dwords.
> -	 */
> -#define MBOX_UPDATE_DWORDS 4
> -	if (i915_semaphore_is_enabled(dev))
> -		num_dwords += ((I915_NUM_RINGS-1) * MBOX_UPDATE_DWORDS);
> +	num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
> +	num_dwords += round_up((num_rings-1) * MBOX_UPDATE_DWORDS, 2);
> +#undef MBOX_UPDATE_DWORDS
>  
>  	ret = intel_ring_begin(signaller, num_dwords);
>  	if (ret)
>  		return ret;
> -#undef MBOX_UPDATE_DWORDS
>  
>  	for_each_ring(useless, dev_priv, i) {
>  		u32 mbox_reg = signaller->semaphore.mbox.signal[i];
> @@ -691,15 +686,13 @@ static int gen6_signal(struct intel_ring_buffer *signaller,
>  			intel_ring_emit(signaller, MI_LOAD_REGISTER_IMM(1));
>  			intel_ring_emit(signaller, mbox_reg);
>  			intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
> -			intel_ring_emit(signaller, MI_NOOP);
> -		} else {
> -			intel_ring_emit(signaller, MI_NOOP);
> -			intel_ring_emit(signaller, MI_NOOP);
> -			intel_ring_emit(signaller, MI_NOOP);
> -			intel_ring_emit(signaller, MI_NOOP);
>  		}
>  	}
>  
> +	/* If num_dwords was rounded, make sure the tail pointer is correct */
> +	if (num_rings % 2 == 0)
> +		intel_ring_emit(signaller, MI_NOOP);
> +
>  	return 0;
>  }
>  
> @@ -717,7 +710,11 @@ gen6_add_request(struct intel_ring_buffer *ring)
>  {
>  	int ret;
>  
> -	ret = ring->semaphore.signal(ring, 4);
> +	if (ring->semaphore.signal)
> +		ret = ring->semaphore.signal(ring, 4);
> +	else
> +		ret = intel_ring_begin(ring, 4);
> +
>  	if (ret)
>  		return ret;
>  

The hunks below look like a different patch. Accidental squash while
rebasing?

I've merged patches 1-3 of this series already.
-Daniel
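
As an aside on the dword accounting in the gen6_signal() hunk quoted above, the arithmetic can be checked in isolation. The sketch below is standalone C, not driver code: count_rings(), signal_extra_dwords() and signal_padding_noops() are hypothetical helpers that mirror hweight32(), the round_up(..., 2) reservation and the trailing MI_NOOP respectively, assuming every other ring has a valid mailbox.

```c
#include <stdint.h>

#define MBOX_UPDATE_DWORDS 3	/* LRI header + register + seqno */

/* Population count, standing in for the kernel's hweight32(). */
static unsigned int count_rings(uint32_t ring_mask)
{
	unsigned int n = 0;

	for (; ring_mask; ring_mask &= ring_mask - 1)
		n++;
	return n;
}

/* Unpadded dwords: one 3-dword update for every ring except ourselves. */
static unsigned int signal_raw_dwords(uint32_t ring_mask)
{
	unsigned int num_rings = count_rings(ring_mask);

	return num_rings ? (num_rings - 1) * MBOX_UPDATE_DWORDS : 0;
}

/* Dwords to reserve: the raw count rounded up to a multiple of 2. */
static unsigned int signal_extra_dwords(uint32_t ring_mask)
{
	return (signal_raw_dwords(ring_mask) + 1u) & ~1u;
}

/* MI_NOOPs needed to fill the gap between reserved and emitted dwords. */
static unsigned int signal_padding_noops(uint32_t ring_mask)
{
	return signal_extra_dwords(ring_mask) - signal_raw_dwords(ring_mask);
}
```

With four rings this gives 9 raw dwords, 10 reserved and one padding MI_NOOP, which agrees with the num_rings % 2 == 0 check in the patch: (num_rings - 1) * 3 is odd exactly when num_rings is even.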

> @@ -1928,24 +1925,27 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
>  		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
>  		ring->get_seqno = gen6_ring_get_seqno;
>  		ring->set_seqno = ring_set_seqno;
> -		ring->semaphore.sync_to = gen6_ring_sync;
> -		ring->semaphore.signal = gen6_signal;
> -		/*
> -		 * The current semaphore is only applied on pre-gen8 platform.
> -		 * And there is no VCS2 ring on the pre-gen8 platform. So the
> -		 * semaphore between RCS and VCS2 is initialized as INVALID.
> -		 * Gen8 will initialize the sema between VCS2 and RCS later.
> -		 */
> -		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
> -		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_RV;
> -		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_RB;
> -		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_RVE;
> -		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> -		ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
> -		ring->semaphore.mbox.signal[VCS] = GEN6_VRSYNC;
> -		ring->semaphore.mbox.signal[BCS] = GEN6_BRSYNC;
> -		ring->semaphore.mbox.signal[VECS] = GEN6_VERSYNC;
> -		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +		if (i915_semaphore_is_enabled(dev)) {
> +			ring->semaphore.sync_to = gen6_ring_sync;
> +			ring->semaphore.signal = gen6_signal;
> +			/*
> +			 * The current semaphore is only applied on pre-gen8
> +			 * platform.  And there is no VCS2 ring on the pre-gen8
> +			 * platform. So the semaphore between RCS and VCS2 is
> +			 * initialized as INVALID.  Gen8 will initialize the
> +			 * sema between VCS2 and RCS later.
> +			 */
> +			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
> +			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_RV;
> +			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_RB;
> +			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_RVE;
> +			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> +			ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
> +			ring->semaphore.mbox.signal[VCS] = GEN6_VRSYNC;
> +			ring->semaphore.mbox.signal[BCS] = GEN6_BRSYNC;
> +			ring->semaphore.mbox.signal[VECS] = GEN6_VERSYNC;
> +			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +		}
>  	} else if (IS_GEN5(dev)) {
>  		ring->add_request = pc_render_add_request;
>  		ring->flush = gen4_render_ring_flush;
> @@ -2113,24 +2113,27 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
>  			ring->dispatch_execbuffer =
>  				gen6_ring_dispatch_execbuffer;
>  		}
> -		ring->semaphore.sync_to = gen6_ring_sync;
> -		ring->semaphore.signal = gen6_signal;
> -		/*
> -		 * The current semaphore is only applied on pre-gen8 platform.
> -		 * And there is no VCS2 ring on the pre-gen8 platform. So the
> -		 * semaphore between VCS and VCS2 is initialized as INVALID.
> -		 * Gen8 will initialize the sema between VCS2 and VCS later.
> -		 */
> -		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
> -		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
> -		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
> -		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
> -		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> -		ring->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
> -		ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
> -		ring->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
> -		ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
> -		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +		if (i915_semaphore_is_enabled(dev)) {
> +			ring->semaphore.sync_to = gen6_ring_sync;
> +			ring->semaphore.signal = gen6_signal;
> +			/*
> +			 * The current semaphore is only applied on pre-gen8
> +			 * platform.  And there is no VCS2 ring on the pre-gen8
> +			 * platform. So the semaphore between VCS and VCS2 is
> +			 * initialized as INVALID.  Gen8 will initialize the
> +			 * sema between VCS2 and VCS later.
> +			 */
> +			ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
> +			ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
> +			ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
> +			ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
> +			ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> +			ring->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
> +			ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
> +			ring->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
> +			ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
> +			ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +		}
>  	} else {
>  		ring->mmio_base = BSD_RING_BASE;
>  		ring->flush = bsd_ring_flush;
> @@ -2231,24 +2234,26 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
>  		ring->irq_put = gen6_ring_put_irq;
>  		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
>  	}
> -	ring->semaphore.sync_to = gen6_ring_sync;
> -	ring->semaphore.signal = gen6_signal;
> -	/*
> -	 * The current semaphore is only applied on pre-gen8 platform. And
> -	 * there is no VCS2 ring on the pre-gen8 platform. So the semaphore
> -	 * between BCS and VCS2 is initialized as INVALID.
> -	 * Gen8 will initialize the sema between BCS and VCS2 later.
> -	 */
> -	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
> -	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
> -	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
> -	ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
> -	ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> -	ring->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
> -	ring->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
> -	ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
> -	ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
> -	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +	if (i915_semaphore_is_enabled(dev)) {
> +		ring->semaphore.signal = gen6_signal;
> +		ring->semaphore.sync_to = gen6_ring_sync;
> +		/*
> +		 * The current semaphore is only applied on pre-gen8 platform.
> +		 * And there is no VCS2 ring on the pre-gen8 platform. So the
> +		 * semaphore between BCS and VCS2 is initialized as INVALID.
> +		 * Gen8 will initialize the sema between BCS and VCS2 later.
> +		 */
> +		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
> +		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
> +		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
> +		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
> +		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> +		ring->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
> +		ring->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
> +		ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
> +		ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
> +		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +	}
>  	ring->init = init_ring_common;
>  
>  	return intel_init_ring_buffer(dev, ring);
> @@ -2281,18 +2286,20 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
>  		ring->irq_put = hsw_vebox_put_irq;
>  		ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
>  	}
> -	ring->semaphore.sync_to = gen6_ring_sync;
> -	ring->semaphore.signal = gen6_signal;
> -	ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
> -	ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
> -	ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
> -	ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
> -	ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> -	ring->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
> -	ring->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
> -	ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
> -	ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
> -	ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +	if (i915_semaphore_is_enabled(dev)) {
> +		ring->semaphore.sync_to = gen6_ring_sync;
> +		ring->semaphore.signal = gen6_signal;
> +		ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
> +		ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
> +		ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
> +		ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
> +		ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> +		ring->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
> +		ring->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
> +		ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
> +		ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
> +		ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> +	}
>  	ring->init = init_ring_common;
>  
>  	return intel_init_ring_buffer(dev, ring);
> -- 
> 1.9.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 13/13] DONT_MERGE drm/i915: FORCE_RESTORE for gen8 semaphores
  2014-04-30  7:13   ` Chris Wilson
@ 2014-04-30 18:44     ` Ben Widawsky
  2014-04-30 19:03       ` Chris Wilson
  0 siblings, 1 reply; 25+ messages in thread
From: Ben Widawsky @ 2014-04-30 18:44 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Wed, Apr 30, 2014 at 08:13:25AM +0100, Chris Wilson wrote:
> On Tue, Apr 29, 2014 at 02:52:40PM -0700, Ben Widawsky wrote:
> > This appears to not actually be needed on the current code. Just putting
> > it on the ML so we can point bug reports at it later.
> > 
> > As pointed out by Ville, the current code is "broken" since we do
> > FORCE_RESTORE, and RESTORE_INHIBIT on the same dword. Anecdotally, this
> > seems fine.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_context.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index f77b4c1..aa82fb4 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -661,6 +661,13 @@ static int do_switch(struct intel_ring_buffer *ring,
> >  	if (!to->is_initialized || i915_gem_context_is_default(to))
> >  		hw_flags |= MI_RESTORE_INHIBIT;
> >  
> > +	/* When SW intends to use semaphore signaling between Command streamers,
> > +	 * it must avoid lite restores in HW by programming "Force Restore" bit
> > +	 * to ‘1’ in context descriptor during context submission
> > +	 */
> > +	if (IS_GEN8(ring->dev) && i915_semaphore_is_enabled(ring->dev))
> > +		hw_flags |= MI_FORCE_RESTORE;
> 
> Is it not an error to set both FORCE and INHIBIT?
> -Chris

Read the commit message.


-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 13/13] DONT_MERGE drm/i915: FORCE_RESTORE for gen8 semaphores
  2014-04-30 18:44     ` Ben Widawsky
@ 2014-04-30 19:03       ` Chris Wilson
  2014-04-30 19:27         ` Ben Widawsky
  0 siblings, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2014-04-30 19:03 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Wed, Apr 30, 2014 at 11:44:47AM -0700, Ben Widawsky wrote:
> On Wed, Apr 30, 2014 at 08:13:25AM +0100, Chris Wilson wrote:
> > On Tue, Apr 29, 2014 at 02:52:40PM -0700, Ben Widawsky wrote:
> > > This appears to not actually be needed on the current code. Just putting
> > > it on the ML so we can point bug reports at it later.
> > > 
> > > As pointed out by Ville, the current code is "broken" since we do
> > > FORCE_RESTORE, and RESTORE_INHIBIT on the same dword. Anecdotally, this
> > > seems fine.
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem_context.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > > index f77b4c1..aa82fb4 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > > @@ -661,6 +661,13 @@ static int do_switch(struct intel_ring_buffer *ring,
> > >  	if (!to->is_initialized || i915_gem_context_is_default(to))
> > >  		hw_flags |= MI_RESTORE_INHIBIT;
> > >  
> > > +	/* When SW intends to use semaphore signaling between Command streamers,
> > > +	 * it must avoid lite restores in HW by programming "Force Restore" bit
> > > +	 * to ‘1’ in context descriptor during context submission
> > > +	 */
> > > +	if (IS_GEN8(ring->dev) && i915_semaphore_is_enabled(ring->dev))
> > > +		hw_flags |= MI_FORCE_RESTORE;
> > 
> > Is it not an error to set both FORCE and INHIBIT?
> > -Chris
> 
> Read the commit message.

And for once I was reading the code. Wouldn't it be worth clearing
MI_RESTORE_INHIBIT anyway to silence the critics?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 13/13] DONT_MERGE drm/i915: FORCE_RESTORE for gen8 semaphores
  2014-04-30 19:03       ` Chris Wilson
@ 2014-04-30 19:27         ` Ben Widawsky
  0 siblings, 0 replies; 25+ messages in thread
From: Ben Widawsky @ 2014-04-30 19:27 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Wed, Apr 30, 2014 at 08:03:27PM +0100, Chris Wilson wrote:
> On Wed, Apr 30, 2014 at 11:44:47AM -0700, Ben Widawsky wrote:
> > On Wed, Apr 30, 2014 at 08:13:25AM +0100, Chris Wilson wrote:
> > > On Tue, Apr 29, 2014 at 02:52:40PM -0700, Ben Widawsky wrote:
> > > > This appears to not actually be needed on the current code. Just putting
> > > > it on the ML so we can point bug reports at it later.
> > > > 
> > > > As pointed out by Ville, the current code is "broken" since we do
> > > > FORCE_RESTORE, and RESTORE_INHIBIT on the same dword. Anecdotally, this
> > > > seems fine.
> > > > 
> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_gem_context.c | 7 +++++++
> > > >  1 file changed, 7 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > > > index f77b4c1..aa82fb4 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > > > @@ -661,6 +661,13 @@ static int do_switch(struct intel_ring_buffer *ring,
> > > >  	if (!to->is_initialized || i915_gem_context_is_default(to))
> > > >  		hw_flags |= MI_RESTORE_INHIBIT;
> > > >  
> > > > +	/* When SW intends to use semaphore signaling between Command streamers,
> > > > +	 * it must avoid lite restores in HW by programming "Force Restore" bit
> > > > +	 * to ‘1’ in context descriptor during context submission
> > > > +	 */
> > > > +	if (IS_GEN8(ring->dev) && i915_semaphore_is_enabled(ring->dev))
> > > > +		hw_flags |= MI_FORCE_RESTORE;
> > > 
> > > Is it not an error to set both FORCE and INHIBIT?
> > > -Chris
> > 
> > Read the commit message.
> 
> And for once I was reading the code. Wouldn't it be worth clearing
> MI_RESTORE_INHIBIT anyway to silence the critics?
> -Chris
> 

If we had interest in merging the patch, I can certainly fix it.
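
For reference, the change Chris is asking about could be sketched roughly as below. This is a standalone illustration, not a proposed patch: switch_hw_flags() is a hypothetical helper, the two bit positions are assumptions that should be checked against i915_reg.h, and whether dropping RESTORE_INHIBIT is actually safe for a context that has never been initialized is not settled anywhere in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions in the MI_SET_CONTEXT flags dword. */
#define MI_RESTORE_INHIBIT (1u << 0)
#define MI_FORCE_RESTORE   (1u << 1)

/* Build the flags so that FORCE_RESTORE and RESTORE_INHIBIT are never
 * set together: when gen8 semaphores are in use, FORCE_RESTORE wins. */
static uint32_t switch_hw_flags(bool ctx_needs_inhibit, bool gen8_semaphores)
{
	uint32_t hw_flags = 0;

	if (ctx_needs_inhibit)
		hw_flags |= MI_RESTORE_INHIBIT;

	if (gen8_semaphores) {
		hw_flags &= ~MI_RESTORE_INHIBIT;
		hw_flags |= MI_FORCE_RESTORE;
	}

	return hw_flags;
}
```

The only point of the sketch is the ordering: clear the conflicting bit before setting the new one, so the two can never end up in the same dword.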

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 12.1/13] drm/i915: Small semaphore debugfs fixup
  2014-04-29 21:52 ` [PATCH 12/13] drm/i915: semaphore debugfs Ben Widawsky
@ 2014-05-03  2:23   ` Ben Widawsky
  0 siblings, 0 replies; 25+ messages in thread
From: Ben Widawsky @ 2014-05-03  2:23 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

Each ring only has ring-1 sync seqnos. It is a bug to try to print
extra.

This should be squashed into drm/i915: semaphore debugfs. I don't have
an easy way at the moment to do the rebase and resend, but that is what
should be done.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 0c4159d..7cd3c7f 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2443,7 +2443,7 @@ static int i915_semaphore_status(struct seq_file *m, void *unused)
 
 	seq_puts(m, "\nSync seqno:\n");
 	for_each_ring(ring, dev_priv, i) {
-		for (j = 0; j < I915_NUM_RINGS; j++) {
+		for (j = 0; j < I915_NUM_RINGS - 1; j++) {
 			seq_printf(m, "  0x%08x ", ring->semaphore.sync_seqno[j]);
 		}
 		seq_putc(m, '\n');
-- 
1.9.2
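
The bound being fixed can be illustrated in isolation. This is a sketch under stated assumptions: the struct, the helper, and the ring count of 5 are stand-ins rather than the driver's definitions, and the sketch assumes the array is sized to I915_NUM_RINGS - 1 as the commit message implies. The point is only that the loop limit has to match the array length.

```c
#include <stdint.h>
#include <stdio.h>

#define SKETCH_NUM_RINGS 5 /* assumed ring count, for illustration */

/* Stand-in for the per-ring semaphore state: one slot per *other* ring. */
struct sketch_ring {
	uint32_t sync_seqno[SKETCH_NUM_RINGS - 1];
};

/*
 * Format one ring's sync seqnos into buf, using the same bound as the
 * fixed debugfs loop. Returns the number of characters written, or -1
 * if buf is too small.
 */
static int sketch_format_sync_seqnos(const struct sketch_ring *ring,
				     char *buf, size_t len)
{
	size_t used = 0;

	for (int j = 0; j < SKETCH_NUM_RINGS - 1; j++) {
		int n = snprintf(buf + used, len - used, "  0x%08x ",
				 (unsigned int)ring->sync_seqno[j]);

		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

Iterating up to SKETCH_NUM_RINGS instead would read one element past the end of sync_seqno.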


* Re: [PATCH 08/13] drm/i915: Implement MI decode for gen8
  2014-04-30 11:21   ` Ville Syrjälä
@ 2014-05-07 16:59     ` Ben Widawsky
  2014-05-07 17:09       ` Ville Syrjälä
  0 siblings, 1 reply; 25+ messages in thread
From: Ben Widawsky @ 2014-05-07 16:59 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Ben Widawsky, Intel GFX, Ben Widawsky

On Wed, Apr 30, 2014 at 02:21:15PM +0300, Ville Syrjälä wrote:
> On Tue, Apr 29, 2014 at 02:52:35PM -0700, Ben Widawsky wrote:
> > From: Ben Widawsky <benjamin.widawsky@linux.intel.com>
> > 
> > This is needed to implement ipehr_is_semaphore_wait
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_irq.c | 11 +++++------
> >  1 file changed, 5 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 2d76183..bfd21c7 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -2561,12 +2561,9 @@ static bool
> >  ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
> >  {
> >  	if (INTEL_INFO(dev)->gen >= 8) {
> > -		/*
> > -		 * FIXME: gen8 semaphore support - currently we don't emit
> > -		 * semaphores on bdw anyway, but this needs to be addressed when
> > -		 * we merge that code.
> > -		 */
> > -		return false;
> > +		/* Broadwell's semaphore wait is 3 dwords. We hope IPEHR is the
> > +		 * first dword. */
> > +		return (ipehr >> 23) == 0x1c;
> >  	} else {
> >  		ipehr &= ~MI_SEMAPHORE_SYNC_MASK;
> >  		return ipehr == (MI_SEMAPHORE_MBOX | MI_SEMAPHORE_COMPARE |
> > @@ -2586,6 +2583,8 @@ semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
> >  		 * FIXME: gen8 semaphore support - currently we don't emit
> >  		 * semaphores on bdw anyway, but this needs to be addressed when
> >  		 * we merge that code.
> > +		 *
> > +		 * XXX: Gen8 needs more than just IPEHR.
> >  		 */
> 
> I believe something like this should take care of the remaining gap.
> 

Thanks for the help. At this point I just want to get the damn thing
merged, and want to avoid any extra risk. Would you mind resending this
as a distinct patch (authored by you), after we get the rest merged?
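
For reference, the opcode test in the quoted hunk above can be sketched on its own. The macro and function names are made up for this illustration. It assumes the usual MI command layout, with the opcode in bits 28:23 and the command type in bits 31:29 (zero for MI commands), so a single shift by 23 checks both fields at once.

```c
#include <stdbool.h>
#include <stdint.h>

/* Names are illustrative; the values match the check in the hunk above. */
#define SKETCH_MI_OPCODE_SHIFT        23
#define SKETCH_GEN8_SEMA_WAIT_OPCODE  0x1c

/*
 * True if the first dword of a command looks like a gen8 semaphore
 * wait. Low bits (flags, length) are ignored; any nonzero bit in
 * 31:29 makes the comparison fail.
 */
static bool sketch_gen8_is_semaphore_wait(uint32_t ipehr)
{
	return (ipehr >> SKETCH_MI_OPCODE_SHIFT) == SKETCH_GEN8_SEMA_WAIT_OPCODE;
}
```

As the patch comment notes, this only works if IPEHR really holds the first dword of the command.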

> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 2446e61..cd1069e 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2590,19 +2590,21 @@ ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
>  }
>  
>  static struct intel_ring_buffer *
> -semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
> +semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring,
> +				 u32 ipehr, u64 offset)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	struct intel_ring_buffer *signaller;
>  	int i;
>  
>  	if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
> -		/*
> -		 * FIXME: gen8 semaphore support - currently we don't emit
> -		 * semaphores on bdw anyway, but this needs to be addressed when
> -		 * we merge that code.
> -		 */
> -		return NULL;
> +		for_each_ring(signaller, dev_priv, i) {
> +			if (ring == signaller)
> +				continue;
> +
> +			if (offset == signaller->semaphore.signal_gtt[ring->id])
> +				return signaller;
> +		}
>  	} else {
>  		u32 sync_bits = ipehr & MI_SEMAPHORE_SYNC_MASK;
>  
> @@ -2627,6 +2629,7 @@ semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
>  {
>  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>  	u32 cmd, ipehr, head;
> +	u64 offset = 0;
>  	int i;
>  
>  	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
> @@ -2662,7 +2665,12 @@ semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
>  		return NULL;
>  
>  	*seqno = ioread32(ring->virtual_start + head + 4) + 1;
> -	return semaphore_wait_to_signaller_ring(ring, ipehr);
> +	if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
> +		offset = ioread32(ring->virtual_start + head + 12);
> +		offset <<= 32;
> +		offset |= ioread32(ring->virtual_start + head + 8);
> +	}
> +	return semaphore_wait_to_signaller_ring(ring, ipehr, offset);
>  }
>  
>  static int semaphore_passed(struct intel_ring_buffer *ring)
> -- 
> 1.8.3.2
> 
> 
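
The lookup in the patch quoted above can be sketched outside the driver as follows. Everything here is a simplified stand-in: the struct, the ring count, and the function names are assumptions made for the example. The dword layout (seqno at +4, low address at +8, high address at +12) is taken from the ioread32() offsets in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_RINGS 5 /* assumed ring count, for illustration */

/* Stand-in for a ring: its id and the GTT address it signals per waiter. */
struct sketch_ring {
	int id;
	uint64_t signal_gtt[SKETCH_NUM_RINGS];
};

/* cmd[0] = header, cmd[1] = seqno, cmd[2] = address low, cmd[3] = address high */
static uint64_t sketch_wait_offset(const uint32_t *cmd)
{
	return ((uint64_t)cmd[3] << 32) | cmd[2];
}

/*
 * Find the ring whose signal slot for 'waiter' sits at 'offset'.
 * Returns NULL when no other ring signals to that address.
 */
static const struct sketch_ring *
sketch_find_signaller(const struct sketch_ring *rings, size_t nrings,
		      const struct sketch_ring *waiter, uint64_t offset)
{
	for (size_t i = 0; i < nrings; i++) {
		if (&rings[i] == waiter)
			continue;
		if (rings[i].signal_gtt[waiter->id] == offset)
			return &rings[i];
	}
	return NULL;
}
```

The cast before the shift matters in the sketch because cmd[3] is 32 bits wide. The patch avoids the issue by assigning into a u64 before shifting.
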
> >  		return NULL;
> >  	} else {
> > -- 
> > 1.9.2
> > 
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Ben Widawsky, Intel Open Source Technology Center


* Re: [PATCH 08/13] drm/i915: Implement MI decode for gen8
  2014-05-07 16:59     ` Ben Widawsky
@ 2014-05-07 17:09       ` Ville Syrjälä
  0 siblings, 0 replies; 25+ messages in thread
From: Ville Syrjälä @ 2014-05-07 17:09 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Ben Widawsky, Intel GFX, Ben Widawsky

On Wed, May 07, 2014 at 09:59:14AM -0700, Ben Widawsky wrote:
> On Wed, Apr 30, 2014 at 02:21:15PM +0300, Ville Syrjälä wrote:
> > On Tue, Apr 29, 2014 at 02:52:35PM -0700, Ben Widawsky wrote:
> > > From: Ben Widawsky <benjamin.widawsky@linux.intel.com>
> > > 
> > > This is needed to implement ipehr_is_semaphore_wait
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_irq.c | 11 +++++------
> > >  1 file changed, 5 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > index 2d76183..bfd21c7 100644
> > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > @@ -2561,12 +2561,9 @@ static bool
> > >  ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
> > >  {
> > >  	if (INTEL_INFO(dev)->gen >= 8) {
> > > -		/*
> > > -		 * FIXME: gen8 semaphore support - currently we don't emit
> > > -		 * semaphores on bdw anyway, but this needs to be addressed when
> > > -		 * we merge that code.
> > > -		 */
> > > -		return false;
> > > +		/* Broadwell's semaphore wait is 3 dwords. We hope IPEHR is the
> > > +		 * first dword. */
> > > +		return (ipehr >> 23) == 0x1c;
> > >  	} else {
> > >  		ipehr &= ~MI_SEMAPHORE_SYNC_MASK;
> > >  		return ipehr == (MI_SEMAPHORE_MBOX | MI_SEMAPHORE_COMPARE |
> > > @@ -2586,6 +2583,8 @@ semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
> > >  		 * FIXME: gen8 semaphore support - currently we don't emit
> > >  		 * semaphores on bdw anyway, but this needs to be addressed when
> > >  		 * we merge that code.
> > > +		 *
> > > +		 * XXX: Gen8 needs more than just IPEHR.
> > >  		 */
> > 
> > I believe something like this should take care of the remaining gap.
> > 
> 
> Thanks for the help. At this point I just want to get the damn thing
> merged, and want to avoid any extra risk. Would you mind resending this
> as a distinct patch (authored by you), after we get the rest merged?

Can do.

> 
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 2446e61..cd1069e 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -2590,19 +2590,21 @@ ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
> >  }
> >  
> >  static struct intel_ring_buffer *
> > -semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring, u32 ipehr)
> > +semaphore_wait_to_signaller_ring(struct intel_ring_buffer *ring,
> > +				 u32 ipehr, u64 offset)
> >  {
> >  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >  	struct intel_ring_buffer *signaller;
> >  	int i;
> >  
> >  	if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
> > -		/*
> > -		 * FIXME: gen8 semaphore support - currently we don't emit
> > -		 * semaphores on bdw anyway, but this needs to be addressed when
> > -		 * we merge that code.
> > -		 */
> > -		return NULL;
> > +		for_each_ring(signaller, dev_priv, i) {
> > +			if (ring == signaller)
> > +				continue;
> > +
> > +			if (offset == signaller->semaphore.signal_gtt[ring->id])
> > +				return signaller;
> > +		}
> >  	} else {
> >  		u32 sync_bits = ipehr & MI_SEMAPHORE_SYNC_MASK;
> >  
> > @@ -2627,6 +2629,7 @@ semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
> >  {
> >  	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >  	u32 cmd, ipehr, head;
> > +	u64 offset = 0;
> >  	int i;
> >  
> >  	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
> > @@ -2662,7 +2665,12 @@ semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno)
> >  		return NULL;
> >  
> >  	*seqno = ioread32(ring->virtual_start + head + 4) + 1;
> > -	return semaphore_wait_to_signaller_ring(ring, ipehr);
> > +	if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
> > +		offset = ioread32(ring->virtual_start + head + 12);
> > +		offset <<= 32;
> > +		offset |= ioread32(ring->virtual_start + head + 8);
> > +	}
> > +	return semaphore_wait_to_signaller_ring(ring, ipehr, offset);
> >  }
> >  
> >  static int semaphore_passed(struct intel_ring_buffer *ring)
> > -- 
> > 1.8.3.2
> > 
> > 
> > >  		return NULL;
> > >  	} else {
> > > -- 
> > > 1.9.2
> > > 
> > 
> > -- 
> > Ville Syrjälä
> > Intel OTC
> 
> -- 
> Ben Widawsky, Intel Open Source Technology Center

-- 
Ville Syrjälä
Intel OTC


Thread overview: 25+ messages
2014-04-29 21:52 [PATCH 00/13] [REPOST] BDW Semaphores Ben Widawsky
2014-04-29 21:52 ` [PATCH 01/13] drm/i915: Move semaphore specific ring members to struct Ben Widawsky
2014-04-29 21:52 ` [PATCH 02/13] drm/i915: Virtualize the ringbuffer signal func Ben Widawsky
2014-04-29 21:52 ` [PATCH 03/13] drm/i915: Move ring_begin to signal() Ben Widawsky
2014-04-29 21:52 ` [PATCH 04/13] drm/i915: Make semaphore updates more precise Ben Widawsky
2014-04-30 12:45   ` Daniel Vetter
2014-04-29 21:52 ` [PATCH 05/13] drm/i915: gen specific ring init Ben Widawsky
2014-04-29 21:52 ` [PATCH 06/13] drm/i915/bdw: implement semaphore signal Ben Widawsky
2014-04-29 21:52 ` [PATCH 07/13] drm/i915/bdw: implement semaphore wait Ben Widawsky
2014-04-29 21:52 ` [PATCH 08/13] drm/i915: Implement MI decode for gen8 Ben Widawsky
2014-04-30 11:21   ` Ville Syrjälä
2014-05-07 16:59     ` Ben Widawsky
2014-05-07 17:09       ` Ville Syrjälä
2014-04-29 21:52 ` [PATCH 09/13] drm/i915/bdw: poll semaphores Ben Widawsky
2014-04-30 10:53   ` Ville Syrjälä
2014-04-29 21:52 ` [PATCH 10/13] drm/i915: Extract semaphore error collection Ben Widawsky
2014-04-29 21:52 ` [PATCH 11/13] drm/i915/bdw: collect semaphore error state Ben Widawsky
2014-04-29 21:52 ` [PATCH 12/13] drm/i915: semaphore debugfs Ben Widawsky
2014-05-03  2:23   ` [PATCH 12.1/13] drm/i915: Small semaphore debugfs fixup Ben Widawsky
2014-04-29 21:52 ` [PATCH 13/13] DONT_MERGE drm/i915: FORCE_RESTORE for gen8 semaphores Ben Widawsky
2014-04-30  7:13   ` Chris Wilson
2014-04-30 18:44     ` Ben Widawsky
2014-04-30 19:03       ` Chris Wilson
2014-04-30 19:27         ` Ben Widawsky
2014-04-30 11:35 ` [PATCH 00/13] [REPOST] BDW Semaphores Ville Syrjälä
