All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915 : Removed the unconditional cross engine/ring update of MBOX registers
@ 2014-06-10  7:18 sourab.gupta
  2014-06-10  7:24 ` Chris Wilson
  0 siblings, 1 reply; 4+ messages in thread
From: sourab.gupta @ 2014-06-10  7:18 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

Removed the unconditional cross engine/ring update of MBOX registers.
The MBox update will done only when needed when the actual inter ring
dependency has been ascertained. Although this late sync could affect
the Media performance slightly but it shall improve the residency time
of individual power wells in C6 state.

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 53 ++++++++++++++++-----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 +--
 2 files changed, 28 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 279488a..c684b47 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -672,43 +672,35 @@ static void render_ring_cleanup(struct intel_engine_cs *ring)
 }
 
 static int gen6_signal(struct intel_engine_cs *signaller,
-		       unsigned int num_dwords)
+			struct intel_engine_cs *waiter,
+			u32 seqno)
 {
-	struct drm_device *dev = signaller->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_engine_cs *useless;
-	int i, ret;
+	u32 mbox_reg = signaller->semaphore.mbox.signal[waiter->id];
+	int ret;
 
 	/* NB: In order to be able to do semaphore MBOX updates for varying
 	 * number of rings, it's easiest if we round up each individual update
 	 * to a multiple of 2 (since ring updates must always be a multiple of
 	 * 2) even though the actual update only requires 3 dwords.
 	 */
-#define MBOX_UPDATE_DWORDS 4
-	if (i915_semaphore_is_enabled(dev))
-		num_dwords += ((I915_NUM_RINGS-1) * MBOX_UPDATE_DWORDS);
-	else
-		return intel_ring_begin(signaller, num_dwords);
 
-	ret = intel_ring_begin(signaller, num_dwords);
+#define MBOX_UPDATE_DWORDS 4
+	ret = intel_ring_begin(signaller, MBOX_UPDATE_DWORDS);
+#undef MBOX_UPDATE_DWORDS
 	if (ret)
 		return ret;
-#undef MBOX_UPDATE_DWORDS
-
-	for_each_ring(useless, dev_priv, i) {
-		u32 mbox_reg = signaller->semaphore.mbox.signal[i];
-		if (mbox_reg != GEN6_NOSYNC) {
-			intel_ring_emit(signaller, MI_LOAD_REGISTER_IMM(1));
-			intel_ring_emit(signaller, mbox_reg);
-			intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
-			intel_ring_emit(signaller, MI_NOOP);
-		} else {
-			intel_ring_emit(signaller, MI_NOOP);
-			intel_ring_emit(signaller, MI_NOOP);
-			intel_ring_emit(signaller, MI_NOOP);
-			intel_ring_emit(signaller, MI_NOOP);
-		}
+	if (mbox_reg != GEN6_NOSYNC) {
+		intel_ring_emit(signaller, MI_LOAD_REGISTER_IMM(1));
+		intel_ring_emit(signaller, mbox_reg);
+		intel_ring_emit(signaller, seqno);
+		intel_ring_emit(signaller, MI_NOOP);
+	} else {
+		intel_ring_emit(signaller, MI_NOOP);
+		intel_ring_emit(signaller, MI_NOOP);
+		intel_ring_emit(signaller, MI_NOOP);
+		intel_ring_emit(signaller, MI_NOOP);
 	}
+	__intel_ring_advance(signaller);
 
 	return 0;
 }
@@ -727,7 +719,7 @@ gen6_add_request(struct intel_engine_cs *ring)
 {
 	int ret;
 
-	ret = ring->semaphore.signal(ring, 4);
+	ret = intel_ring_begin(ring, 4);
 	if (ret)
 		return ret;
 
@@ -779,6 +771,13 @@ gen6_ring_sync(struct intel_engine_cs *waiter,
 
 	/* If seqno wrap happened, omit the wait with no-ops */
 	if (likely(!i915_gem_has_seqno_wrapped(waiter->dev, seqno))) {
+		/* Add the Mbox update command in the Signaller ring,
+		 * this a point where the actual inter ring dependency has
+		 * been ascertained.
+		 */
+		ret = signaller->semaphore.signal(signaller, waiter, seqno+1);
+		if (ret)
+			return ret;
 		intel_ring_emit(waiter, dw1 | wait_mbox);
 		intel_ring_emit(waiter, seqno);
 		intel_ring_emit(waiter, 0);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 910c83c..216c341 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -142,8 +142,8 @@ struct  intel_engine_cs {
 				   struct intel_engine_cs *to,
 				   u32 seqno);
 		int	(*signal)(struct intel_engine_cs *signaller,
-				  /* num_dwords needed by caller */
-				  unsigned int num_dwords);
+				  struct intel_engine_cs *waiter,
+				  u32 seqno);
 	} semaphore;
 
 	/**
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] drm/i915 : Removed the unconditional cross engine/ring update of MBOX registers
  2014-06-10  7:18 [PATCH] drm/i915 : Removed the unconditional cross engine/ring update of MBOX registers sourab.gupta
@ 2014-06-10  7:24 ` Chris Wilson
  2014-06-10 14:44   ` Gupta, Sourab
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Wilson @ 2014-06-10  7:24 UTC (permalink / raw)
  To: sourab.gupta; +Cc: intel-gfx, Akash Goel

On Tue, Jun 10, 2014 at 12:48:44PM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Removed the unconditional cross engine/ring update of MBOX registers.
> The MBox update will done only when needed when the actual inter ring
> dependency has been ascertained. Although this late sync could affect
> the Media performance slightly but it shall improve the residency time
> of individual power wells in C6 state.

NAK. Did you even consider the deadlocks above and beyond the issues with
latency? Maybe suggest that the hardware guys consider a reordering write
FIFO next time like elsewhere on the chip.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] drm/i915 : Removed the unconditional cross engine/ring update of MBOX registers
  2014-06-10  7:24 ` Chris Wilson
@ 2014-06-10 14:44   ` Gupta, Sourab
  2014-07-06 13:08     ` Gupta, Sourab
  0 siblings, 1 reply; 4+ messages in thread
From: Gupta, Sourab @ 2014-06-10 14:44 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Goel, Akash

On Tue, 2014-06-10 at 07:24 +0000, Chris Wilson wrote:
> On Tue, Jun 10, 2014 at 12:48:44PM +0530, sourab.gupta@intel.com wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > Removed the unconditional cross engine/ring update of MBOX registers.
> > The MBox update will done only when needed when the actual inter ring
> > dependency has been ascertained. Although this late sync could affect
> > the Media performance slightly but it shall improve the residency time
> > of individual power wells in C6 state.
> 
> NAK. Did you even consider the deadlocks above and beyond the issues with
> latency? Maybe suggest that the hardware guys consider a reordering write
> FIFO next time like elsewhere on the chip.
> -Chris
> 
Hi Chris,
We had thought about the deadlock scenarios between two rings but it
didn't seem plausible. Below scenario was considered:
Let us say Ring1 has to sync for obj1 which is being processed on Ring2.
So it inserts Wait command on Ring1 and corresponding Signal command on
Ring2.
Now, Ring1 will be deadlocked only in the case when Ring2 is waiting for
some obj2 to be processed on Ring1. That would mean that Ring2 would
have inserted a corresponding signal command on Ring1. Now, this signal
command for obj2 on Ring1 has to be has to be there before the wait
command for obj1 on Ring1 (because wait/signal command pair is inserted
together). So, Ring2 should go ahead to process the Signal command for
obj1.

Are there any missing points in this scenario we considered? Or, some
other scenario for deadlock?

Thanks,
Sourab

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] drm/i915 : Removed the unconditional cross engine/ring update of MBOX registers
  2014-06-10 14:44   ` Gupta, Sourab
@ 2014-07-06 13:08     ` Gupta, Sourab
  0 siblings, 0 replies; 4+ messages in thread
From: Gupta, Sourab @ 2014-07-06 13:08 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Goel, Akash

On Tue, 2014-06-10 at 20:15 +0530, sourab gupta wrote:
> On Tue, 2014-06-10 at 07:24 +0000, Chris Wilson wrote:
> > On Tue, Jun 10, 2014 at 12:48:44PM +0530, sourab.gupta@intel.com wrote:
> > > From: Akash Goel <akash.goel@intel.com>
> > > 
> > > Removed the unconditional cross engine/ring update of MBOX registers.
> > > The MBox update will done only when needed when the actual inter ring
> > > dependency has been ascertained. Although this late sync could affect
> > > the Media performance slightly but it shall improve the residency time
> > > of individual power wells in C6 state.
> > 
> > NAK. Did you even consider the deadlocks above and beyond the issues with
> > latency? Maybe suggest that the hardware guys consider a reordering write
> > FIFO next time like elsewhere on the chip.
> > -Chris
> > 
> Hi Chris,
> We had thought about the deadlock scenarios between two rings but it
> didn't seem plausible. Below scenario was considered:
> Let us say Ring1 has to sync for obj1 which is being processed on Ring2.
> So it inserts Wait command on Ring1 and corresponding Signal command on
> Ring2.
> Now, Ring1 will be deadlocked only in the case when Ring2 is waiting for
> some obj2 to be processed on Ring1. That would mean that Ring2 would
> have inserted a corresponding signal command on Ring1. Now, this signal
> command for obj2 on Ring1 has to be has to be there before the wait
> command for obj1 on Ring1 (because wait/signal command pair is inserted
> together). So, Ring2 should go ahead to process the Signal command for
> obj1.
> 
> Are there any missing points in this scenario we considered? Or, some
> other scenario for deadlock?
> 
> Thanks,
> Sourab
> 
Hi Chris,

We're sorry to bother you but could you please point out to flaws here.
It would help us in understanding the missing scenarios for  deadlock
arising in this implementation w.r.t. the earlier one.

Thanks,
Sourab

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-07-06 13:08 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-10  7:18 [PATCH] drm/i915 : Removed the unconditional cross engine/ring update of MBOX registers sourab.gupta
2014-06-10  7:24 ` Chris Wilson
2014-06-10 14:44   ` Gupta, Sourab
2014-07-06 13:08     ` Gupta, Sourab

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.